Training a new computer vision model usually involves many steps: gathering data, cleaning data, annotating data, possibly augmenting data, and exporting data in the right format. All of this takes time.
Note: If you are new to computer vision models and model training, read our introductory article.
This quickstart guide is intended to get you up and running and training models right away, so we have simplified the process to 4 steps:
Set up your machine
Download the dataset
Train your model
Use your model in a project
We’ll provide you with a ready-made dataset, in the correct format, so you can focus on training. We’ll guide you through training a license-plate detection model, using a subset of the same dataset we used to train the license plate model in our catalog. And finally, we’ll also walk you through setting up a project, publishing your new model, and testing it out in an application. We’ll finish with a summary and some next steps. Let’s get started!
Step 1: Set Up Your Machine
There are a few setup steps we need to complete before you can train your model. If your machine is already configured, feel free to skip to the next step.
Set Up the CLI for Model Training
First, install the latest version of the CLI. You can find the latest installers on this page. Select the appropriate tab for your OS and follow the instructions.
If you’re on Windows, skip down to the Windows step.
Docker Configuration for Mac and Linux
First, ensure you have installed Docker, as described above.
Next, allocate memory to Docker by opening Docker Desktop, selecting Preferences, and selecting Resources. Training a model is a very compute-intensive process, so we recommend giving Docker access to most of your memory and all but one of your CPUs.
Next, ensure the following path is entered in the File Sharing section under Resources:
/path/to/export/directory
To do this, use the add button at the bottom of the File Sharing page and add the above path manually. Then select Apply & Restart.
Docker is now all configured!
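To confirm that Docker actually picked up the new limits, you can ask the Docker engine what it sees. The following is a minimal sketch, not part of the alwaysAI tooling: it assumes the `docker` command is on your path and the engine is running, and `bytes_to_gib` is a helper name introduced here purely for illustration.

```shell
# Convert a byte count to whole GiB (integer division, rounds down).
bytes_to_gib() {
    echo $(( $1 / 1073741824 ))
}

# Only query Docker if the engine is reachable.
if docker info >/dev/null 2>&1; then
    cpus=$(docker info --format '{{.NCPU}}')
    mem_bytes=$(docker info --format '{{.MemTotal}}')
    echo "Docker sees $cpus CPUs and about $(bytes_to_gib "$mem_bytes") GiB of memory"
else
    echo "Docker engine is not reachable; start Docker Desktop and try again"
fi
```

If the reported values do not match what you set under Resources, check that you selected Apply & Restart.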
Docker Configuration for Windows
Windows users can set the advanced configuration using a GUI in Hyper-V mode or in a .wslconfig file when using WSL2, which is the recommended route. You can set up a .wslconfig file as described on this page. Notice the section just below the previous link, which describes the recommended settings for your VM.
Note: Make sure that your .wslconfig file is in the root directory of your user folder: C:\Users\yourUserName\.wslconfig. Also ensure that the file doesn't include a BOM header (an advanced text editor lets you select an encoding without BOM when you save). Finally, make sure the .wslconfig file does not have a suffix, such as '.txt', as this will prevent the system from using it. To make sure your new settings are being used, shut down WSL2 with the command wsl --shutdown from your regular Windows Terminal, then restart it by typing wsl. Once you are in WSL2, you can check your available resources with the command free.
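If you are unsure whether your editor saved a BOM, you can check from inside WSL2. This is a rough sketch rather than an official check: it assumes your C: drive is mounted at the default /mnt/c location, yourUserName is a placeholder for your Windows user name, and `has_utf8_bom` is a helper name made up for this example.

```shell
# Succeed (exit status 0) if the file starts with the UTF-8 byte order mark EF BB BF.
has_utf8_bom() {
    first_bytes=$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$first_bytes" = "efbbbf" ]
}

# Placeholder path: replace yourUserName with your Windows user name.
config="/mnt/c/Users/yourUserName/.wslconfig"

if [ -f "$config" ]; then
    if has_utf8_bom "$config"; then
        echo "BOM found: re-save $config without a BOM"
    else
        echo "No BOM found in $config"
    fi
else
    echo "Not found: $config (check the location and that the name has no .txt suffix)"
fi
```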
Logging into alwaysAI
Ensure you are logged in using aai user show. If you aren’t logged in, you can do so using aai user login.
Step 2: Download the Dataset
Get the dataset here. Move it someplace convenient and note the path.
Note: Make sure you’re logged in! You can test this with aai user show. Use aai user login to log in if you’re not.
Step 3: Train Your Model
The command you’ll use to start training is
$ aai dataset train dataset_sample_584.zip --numEpochs 15 --batchSize 4 --name <modelname>
Running the above command should result in something similar to the following:
This training took approximately 30 minutes using a CPU. You will see loss information printed out for each step, so you can tell how far along in the training process you are at any given time.
Note: For more details on interpreting this output, see our documentation on training output.
Step 4: Use Your Model in a Project
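To relate the per-step loss lines to overall progress, it helps to estimate how many steps the run will take. The sketch below assumes that all 584 images are used for training and that each step processes one batch; if the tool holds some images out for validation, the real numbers will be somewhat lower.

```shell
# Values taken from the training command above.
images=584
batch_size=4
epochs=15

# Round up so that a final partial batch still counts as a step.
steps_per_epoch=$(( (images + batch_size - 1) / batch_size ))
total_steps=$(( steps_per_epoch * epochs ))

echo "steps per epoch: $steps_per_epoch"
echo "total steps: $total_steps"
```

Under these assumptions, the run has 146 steps per epoch and 2190 steps in total.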
Once training has finished, the first thing to do is publish the model to your personal models, using the following command:
$ aai model publish <username/modelname>
Now the model has been added to your personal account. For more information on how to use models, see this page.
You can also use the model locally by navigating to an existing project directory and using
aai app models add <username/modelname> --local-version <version>
This model can be used in any object-detection based project. For more information on working with projects, see here.
Finally, run aai app install and aai app start to run the example app and test the new model output!
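The local workflow in this step can be collected into a small helper. This is only a sketch built from the commands shown in this guide: `run` and `use_local_model` are names invented for this example, and the model name, version, and project directory are placeholders you must supply. Setting DRY_RUN=1 prints the commands instead of running them.

```shell
# Print the command instead of running it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Add a locally trained model to an existing project, then install and start the app.
# Arguments: <username/modelname> <version> <project directory>
use_local_model() {
    model=$1
    version=$2
    project_dir=$3
    run cd "$project_dir" &&
    run aai app models add "$model" --local-version "$version" &&
    run aai app install &&
    run aai app start
}
```

For example, `DRY_RUN=1 use_local_model "<username/modelname>" "<version>" /path/to/project` lists the four commands without touching your project, which is a quick way to check the arguments before running them for real.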
The dataset used in this tutorial has 584 images. The rule of thumb is that at least 20 epochs are needed in training a model. Even at 15 epochs, about three quarters of the way to that minimum, the model is already able to detect a fair number of vehicles, and some license plates! Some of the bounding boxes, especially for license plates, are not perfectly centered, but we’re off to a good start in only about 30 minutes.
This guide is meant to be an introduction to the model training tool. At this point you have a couple of different options: you can continue training the model from this guide, or you can train a new model on your own dataset.
If you would like to improve this model, the first step is simply to train it for more epochs. Navigate back to the training directory and enter
$ aai dataset train resized_dataset_sample_584.zip --numEpochs 5 --batchSize 4 --name <modelname> --continue-from-version <version>
To update the version of a model already used in a project, navigate back to your app directory and run
$ aai app models update
Train a New Model
Generating a dataset and annotating it are where you will spend most of your time when creating your own models. If you don’t have a dataset yet, follow these steps to generate one:
Check out our documentation on data capture guidelines,
Optionally, use our data generation starter application to generate your dataset, and
Read our guide on data annotation using our annotation CLI tool.
You can make your own dataset from scratch, or you can add images and annotations to the dataset we provided. You might do this to improve the performance of the license plate detection model on certain vehicles or plates, or in certain environments.
Repeat the training process you followed in the first stage of this guide using your newly generated dataset.