Overview

Once you have framed your problem, you can start preparing the data. Our software accepts several data types: images, text (emails, Tweets, SMS, ...), and PDFs. Each data type needs to be prepared in a slightly different way. Here is how:

Labeled or unlabeled?

You can upload both labeled and unlabeled data. All training data needs a label eventually, but you can add labels after the upload from within the software or Slack.

Quantity of training data

For all data types: the more, the better!

Images:

At least 20 examples per class. In many cases, you can get reasonable results with 100+ examples.

PDFs:

At least 100 examples per class.

Text:

At least 100 examples per class.
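
If your data is already labeled, you can check it against these minimums before uploading. The following is a minimal sketch, not part of our software: the function name and the `MIN_EXAMPLES` table are made up for illustration, and the thresholds are simply the numbers listed above.

```python
from collections import Counter

# Minimum examples per class, taken from the numbers in this article.
MIN_EXAMPLES = {"images": 20, "pdfs": 100, "text": 100}


def classes_below_minimum(labels, data_type):
    """Return {label: count} for every class that has too few examples.

    `labels` holds one entry per example; None or "" marks an unlabeled item.
    """
    minimum = MIN_EXAMPLES[data_type]
    # Unlabeled items are skipped because they do not count toward any class yet.
    counts = Counter(label for label in labels if label)
    return {label: n for label, n in counts.items() if n < minimum}


# Example: 120 "spam" texts, 40 "ham" texts, 10 texts without a label.
labels = ["spam"] * 120 + ["ham"] * 40 + [None] * 10
print(classes_below_minimum(labels, "text"))
```

Note that a class with no labeled examples at all does not show up in the result, so compare the output against the list of classes you expect.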

Format & upload

Images:

Upload the image files directly (drag & drop), or upload a CSV file with a single column containing all image URLs.

PDFs:

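Same as for images: upload the files directly (drag & drop), or upload a CSV file with a single column containing all URLs.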
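
If your files are hosted online, the single-column CSV can be generated with a few lines of Python. This is only a sketch under assumptions: the function name is hypothetical, the example URLs are placeholders, and this article does not say whether a header row is expected, so the sketch writes none unless you pass one.

```python
import csv
import io


def urls_to_csv(urls, header=None):
    """Build the content of a one-column CSV with one URL per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    if header:
        # Assumption: only add a header row if the upload expects one.
        writer.writerow([header])
    for url in urls:
        # Strip stray whitespace that often sneaks in when copying links.
        writer.writerow([url.strip()])
    return buffer.getvalue()


urls = ["https://example.com/cat_001.jpg", "https://example.com/dog_001.jpg"]
with open("urls.csv", "w", newline="", encoding="utf-8") as f:
    f.write(urls_to_csv(urls))
```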

Text:

A CSV file with two columns, "text" and "label".
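
Texts often contain commas, quotes, or line breaks, which can break a hand-written CSV. Here is a minimal sketch that lets Python's `csv` module handle the quoting. The column names "text" and "label" come from this article; the function name, the sample rows, and the UTF-8 encoding are assumptions for illustration.

```python
import csv
import io


def rows_to_csv(rows):
    """Build CSV content with the columns "text" and "label".

    `rows` is an iterable of (text, label) pairs.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["text", "label"], lineterminator="\n")
    writer.writeheader()
    for text, label in rows:
        # The csv module quotes fields containing commas, quotes, or newlines.
        writer.writerow({"text": text, "label": label})
    return buffer.getvalue()


rows = [
    ("Hello, when will my order arrive?", "shipping"),
    ('The app says "payment failed" again.', "billing"),
]
with open("training.csv", "w", newline="", encoding="utf-8") as f:
    f.write(rows_to_csv(rows))
```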

Questions?

Please feel free to reach out to us; we are happy to support you! We are constantly working to make the process more self-explanatory, and your feedback is important to us!
