Understanding model performance
After the AI block has been trained, we automatically calculate a performance score from 0 to 100 for you. The higher the score, the better the model is performing. The score combines accuracy with how balanced the training dataset is – read more about it here.
On the first attempt, many customers experience what we sometimes call an "AI reality check": the model isn't as smart as we hoped yet. When that happens, there are three ways to improve it:
Upload more training data
Improve the labeling of the data
Balance out the training dataset
The AI technology we apply is called transfer learning: instead of building a neural network from scratch, it starts from a pretrained model and therefore needs comparatively little training data. Even with that modern approach, some data is hard to differentiate, even to the human eye, which means that the model requires more data to learn from. Unfortunately, we cannot say upfront how much "more" is, but adding training data typically improves performance.
If the model keeps performing poorly, the data may not be different enough across the various labels. In that case, reconsider the labeling choice and define more meaningful groups.
Finally, it is essential that each label contains a fair number of data points. The number of data points under a label should be large enough in both absolute and relative terms. Imagine that one training dataset contains 30 dogs and 100 cats, while a second one contains 30 dogs and 2,000 cats. Even though one may argue that 30 is a large enough number of dogs in the first dataset, this no longer holds in the second, where dogs make up only about 1.5% of the 2,030 data points.
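To make the dogs-and-cats comparison concrete, here is a minimal Python sketch that computes each label's share of a dataset, plus the accuracy a model would reach by always predicting the most common label. The function names label_shares and majority_baseline_accuracy are hypothetical and chosen for illustration only; this is not how our performance score is calculated, just one way to see why accuracy alone can look good on an imbalanced dataset.

```python
from collections import Counter


def label_shares(labels):
    """Return the fraction of data points that fall under each label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


def majority_baseline_accuracy(labels):
    """Accuracy of a trivial model that always predicts the most common label."""
    counts = Counter(labels)
    return max(counts.values()) / sum(counts.values())


# The two example datasets from the text above (illustrative data only).
first = ["dog"] * 30 + ["cat"] * 100
second = ["dog"] * 30 + ["cat"] * 2000

for name, labels in [("first", first), ("second", second)]:
    shares = {label: round(share, 3) for label, share in label_shares(labels).items()}
    baseline = round(majority_baseline_accuracy(labels), 3)
    print(name, shares, "majority baseline:", baseline)
```

In the second dataset, a model that answers "cat" every time would already be right for roughly 98.5% of the data points while never recognizing a single dog, which is why balance matters alongside accuracy.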
What about better algorithms?
We invest a lot of effort into constantly updating and improving our technology. A major part of this is keeping a close eye on the machine learning space, and whenever new models come out, we test them against our current ones.
Therefore, you can trust that the latest findings in research around any given data type will be available to you within a matter of days.
If you are still struggling with performance, please do not hesitate to contact us. We are happy to help you get the most out of your model!