Tips on improving your machine learning model
AI Blocks in Levity receive a Performance Score, also known as the training score. In a nutshell, this number estimates how well your AI model is likely to perform when classifying real data, and boils down to a combination of accuracy and how balanced the training dataset is (read more about it here).
The higher the number (up to 100), the more likely the model is to get its predictions correct.
On the first attempt, many customers experience what we sometimes call the "AI reality check": the model isn't yet as smart as they hoped. There are a few reasons why this might happen, and some simple steps you can take to improve your model.
You aren't using enough training data
The AI technology we apply is called transfer learning. It adapts an existing, pre-trained model instead of building neural networks from scratch, so it needs comparatively little training data. Despite this, some data is hard to differentiate, even to the human eye, which means that the model needs more data to learn from.
Using only the minimum amount of training data is one potential cause of a low training score. Adding more data and clicking Retrain AI Block gives your model more to learn from, and may improve its score.
Your data quality could be improved
Depending on your model's use, you might be able to improve your score by using better-vetted training data. For example, if you are using Levity for visual quality inspection and checking for defects in images of products going through your production line, using clearer photographs with a higher resolution may improve performance.
You aren't using balanced data
An imbalanced dataset will negatively affect your training score. For example, suppose you are training an AI model to categorize inbox content and email attachments. If you use 1000 pieces of training data to illustrate resume attachments but only 50 for invoices, the dataset is imbalanced.
Ideally, you want similar amounts of data for each of the labels you are training the model on.
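Before uploading data, you can check balance yourself by counting the examples per label. The snippet below is a minimal sketch, not part of Levity: the function names are hypothetical, and the ratio of largest to smallest label count is just one simple way to express imbalance, not how the Performance Score measures it.

```python
from collections import Counter


def label_counts(labels):
    """Count how many training examples carry each label."""
    return Counter(labels)


def imbalance_ratio(labels):
    """Largest label count divided by the smallest (1.0 means perfectly balanced)."""
    counts = label_counts(labels)
    if not counts:
        raise ValueError("no labels given")
    return max(counts.values()) / min(counts.values())


# The example from the text: 1000 resumes versus 50 invoices.
training_labels = ["resume"] * 1000 + ["invoice"] * 50

for label, count in label_counts(training_labels).items():
    print(label, count)

print(imbalance_ratio(training_labels))  # 20.0
```

A ratio close to 1 means the labels are represented about equally. Here there are twenty resumes for every invoice, so adding more invoice examples would be the first thing to try.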
You need to simplify your model
If you are unable to increase the volume of training data, another approach is to reduce the number of outcomes the model is looking for. A human doesn't need much information to tell a sports car from a minibus, but if you ask them to tell apart various models of similar-looking sports cars, they need far more reference material. AI is similar: as the number of labels increases and the differences between them shrink, the volume of training data needed per label grows.
Consider whether your model would still benefit you if it detected less granular categories, and simplify your labels accordingly.
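If your labels live in a spreadsheet or export, merging fine-grained labels into broader ones can be sketched as a simple lookup. This is an illustration only: the label names and the `simplify_labels` helper are made up for the example and are not Levity features.

```python
# Hypothetical mapping from fine-grained labels to broader ones.
COARSE_LABEL = {
    "sports_car_model_a": "sports_car",
    "sports_car_model_b": "sports_car",
    "sports_car_model_c": "sports_car",
    "minibus": "minibus",
}


def simplify_labels(labels, mapping):
    """Replace each label with its broader label; labels not in the mapping are kept."""
    return [mapping.get(label, label) for label in labels]


fine = ["sports_car_model_a", "minibus", "sports_car_model_c", "sports_car_model_b"]
coarse = simplify_labels(fine, COARSE_LABEL)

print(coarse)  # ['sports_car', 'minibus', 'sports_car', 'sports_car']
print(len(set(fine)), "->", len(set(coarse)))  # 4 -> 2
```

The same data now covers two labels instead of four, so each label has more examples to learn from.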
What does your Performance Score say?
Your training score is calculated on a subset of the data that is excluded from training. This data is automatically set aside so the model can test itself during the process, and the results of that test produce the Performance Score. The score may shift by a few points up or down when you retrain, even if you add more data. This is perfectly normal, and explained here.
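The idea of setting data aside can be sketched in a few lines. This is a simplified illustration under stated assumptions, not Levity's implementation: the 20% held-out fraction and the function names are hypothetical, and the score here is plain accuracy, whereas the Performance Score also takes dataset balance into account.

```python
import random


def holdout_split(examples, test_fraction=0.2, seed=None):
    """Shuffle the examples and set aside a fraction the model never trains on."""
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def accuracy_score(predicted, actual):
    """Percentage of held-out examples predicted correctly (0 to 100)."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100 * correct / len(actual)


examples = list(range(100))
train, test = holdout_split(examples, test_fraction=0.2, seed=1)
print(len(train), len(test))  # 80 20

# Three of four held-out predictions match the true labels.
print(accuracy_score(["a", "b", "a", "a"], ["a", "b", "b", "a"]))  # 75.0
```

Because the held-out examples are chosen by shuffling, a different shuffle tests the model on different examples. That is one reason a score can move by a few points between training runs.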
The Performance Score is provided to give you a good understanding of how well your model might perform, but the most accurate measure of your Block's performance will come from testing it in a real scenario.
Our blog features more insights into using and improving your machine learning models. Click here for more information.
What about better algorithms?
We invest a lot of effort into constantly updating and improving our technology. A major part of this is keeping a close eye on the machine learning space, and whenever new models come out, we test them against our current ones and adopt whatever technology will give Levity users the best quality predictions.
If you are still struggling with performance, please reach out to our support team. We are happy to take a look at your data and suggest areas for improvement in your workflow.