
Taxonomy

A taxonomy is a hierarchical structure of attributes against which your data is tagged.

The taxonomy gives your data a structure that flows from the root node down to the leaf nodes. For example, ‘color’ is an attribute, while the actual colors ‘red’, ‘blue’, etc. are possible values for the color attribute.
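As an illustration of the root-to-leaf structure, here is a minimal Python sketch. The `Node` class and `leaf_paths` helper are hypothetical names invented for this example; they are not part of the tool and only model the attribute/value relationship described above.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """One taxonomy node: an attribute (e.g. 'color') or a value (e.g. 'red')."""
    name: str
    children: List["Node"] = field(default_factory=list)


def leaf_paths(node: Node, prefix: Tuple[str, ...] = ()) -> List[Tuple[str, ...]]:
    """Return every path from this node down to a leaf."""
    path = prefix + (node.name,)
    if not node.children:
        return [path]
    paths: List[Tuple[str, ...]] = []
    for child in node.children:
        paths.extend(leaf_paths(child, path))
    return paths


# 'color' is an attribute; 'red' and 'blue' are its possible values.
color = Node("color", [Node("red"), Node("blue")])
root = Node("root", [color])

for path in leaf_paths(root):
    print(" > ".join(path))
```

Each printed line is one route from the root to a leaf value, which is the path a datapoint follows when it is tagged.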

Depending on your choice of segment, a preset taxonomy is recommended as the default. The default taxonomy is bootstrapped with preset ML models, so you do not have to start organizing from scratch. You can give feedback on the predictions of the preset models to improve their accuracy on your data.

There are multiple ways to modify the default taxonomy to suit your needs:

  • Add, delete, or edit values on existing attributes, or create entirely new attributes with values under them. Deleting an attribute or value also deletes the entire hierarchy below it.

  • In this video, we look at how to replace the Outerwear category with a Coats and Jackets category to suit your organization's needs.

  • Delete the entire taxonomy and start from scratch.
  • Import a taxonomy from an existing project - you can duplicate the taxonomy from another project, provided it uses the same segment as the current project.

  • Upload a taxonomy as a CSV file - the supported CSV format is available for download.
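The cascading effect of a delete can be sketched in Python as follows. This assumes the taxonomy is modeled as a nested dictionary, and `delete_node` is a hypothetical helper written for this illustration, not a function provided by the tool.

```python
def delete_node(tree: dict, name: str) -> bool:
    """Remove the first node called `name` together with everything below it.

    Returns True if a node was removed, False if no such node exists.
    """
    if name in tree:
        del tree[name]  # the whole subtree goes with the node
        return True
    return any(delete_node(subtree, name) for subtree in tree.values())


# Assumed example data: a 'category' attribute with nested values.
taxonomy = {
    "category": {
        "Outerwear": {"Coats": {}, "Jackets": {}},
        "Tops": {},
    }
}

delete_node(taxonomy, "Outerwear")
print(taxonomy)  # 'Coats' and 'Jackets' are gone along with 'Outerwear'
```

Because the values under a deleted node are removed too, it is worth checking what sits below an attribute or value before deleting it.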

Note:

Editing the attributes in the default taxonomy can result in having to organize your data from scratch (instead of starting off from a semi-organized state).

We recommend keeping the number of classes per attribute below 10, which helps train models that perform better. You can always break the taxonomy into multiple levels - for example, you could have ‘Bottoms’ as one of the classes for category, and break it into Pants, Jeans and Leggings at the next level, say ‘Bottoms Type’.
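A quick way to check a taxonomy against this recommendation is sketched below in Python. The mapping of attribute names to value lists, the sample class names, and the `oversized_attributes` helper are all assumptions made for this example; the threshold of 10 comes from the recommendation above.

```python
from typing import Dict, List

MAX_CLASSES = 10  # recommendation above: keep classes per attribute below 10


def oversized_attributes(taxonomy: Dict[str, List[str]], limit: int = MAX_CLASSES) -> Dict[str, int]:
    """Return attributes whose class count is not below the limit."""
    return {attr: len(values) for attr, values in taxonomy.items() if len(values) >= limit}


# A flat 'category' attribute with too many classes (sample names are invented).
flat = {
    "category": ["Tops", "Dresses", "Pants", "Jeans", "Leggings",
                 "Coats", "Jackets", "Skirts", "Shorts", "Suits"],
}

# The same data split into levels: 'Bottoms' is refined by 'Bottoms Type'.
levelled = {
    "category": ["Tops", "Dresses", "Bottoms", "Coats", "Jackets",
                 "Skirts", "Shorts", "Suits"],
    "Bottoms Type": ["Pants", "Jeans", "Leggings"],
}

print(oversized_attributes(flat))
print(oversized_attributes(levelled))
```

Moving Pants, Jeans and Leggings under a ‘Bottoms Type’ level brings every attribute under the limit without dropping any class.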

Uploading pre-labeled data

If you have data that’s been pre-labeled or pre-tagged, you can import the tags and map them to the taxonomy in the tool. For example, if your datapoints have been tagged for color, you can upload those tags as metadata and map them to the color attribute in your taxonomy. This enables you to provide feedback on the pre-tagged data, or to skip organizing for color altogether and focus on other attributes.

Note:

When you map an attribute to a column in the CSV, values from the CSV are added to the existing values in the taxonomy. ML models are trained on the pre-labeled data, so that what they learn can be applied to unlabeled data feeds.
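The effect of mapping a column can be pictured with the Python sketch below. It is only an approximation under stated assumptions: the column names (`image_id`, `colour`), the `merge_csv_values` helper, and the case-insensitive de-duplication are all invented for this example, and the tool's actual matching rules and CSV format may differ.

```python
import csv
import io
from typing import List


def merge_csv_values(existing: List[str], csv_text: str, column: str) -> List[str]:
    """Add values from one CSV column to an attribute's existing values.

    Assumption: values are compared case-insensitively and blanks are skipped.
    """
    merged = list(existing)
    seen = {value.lower() for value in merged}
    for row in csv.DictReader(io.StringIO(csv_text)):
        value = (row.get(column) or "").strip()
        if value and value.lower() not in seen:
            merged.append(value)
            seen.add(value.lower())
    return merged


# Hypothetical pre-labeled metadata: one row per datapoint.
metadata = "image_id,colour\n1,Red\n2,green\n3,\n4,green\n"

# Map the 'colour' column onto a 'color' attribute that already has two values.
color_values = merge_csv_values(["red", "blue"], metadata, column="colour")
print(color_values)
```

Here the existing values are kept and only the unseen value from the CSV is appended.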