Glossary of Terms

Annotation

(Verb). The process of labeling data by marking which areas of an image contain the relevant object(s).

(Noun). The actual files that contain the information regarding the areas of interest for a particular image. Annotations are sometimes referred to as the ground truth and they are used in supervised learning; the model repeatedly compares predictions against annotations in order to improve.
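
Annotation file formats vary by tool and dataset. As a rough sketch (the field names follow the Pascal VOC convention, which also uses the ‘truncated’ and ‘difficult’ flags defined later in this glossary; the file name and coordinates are hypothetical), a single annotated object might be represented in Python as:

```python
# One annotated image with a single object, using Pascal VOC-style fields.
# All file names and coordinate values here are hypothetical.
annotation = {
    "filename": "image_001.jpg",   # image the annotation belongs to
    "width": 640,                  # image dimensions in pixels
    "height": 480,
    "objects": [
        {
            "name": "dog",         # class label of the object
            "truncated": 0,        # 0 = fully visible, 1 = partially cut off
            "difficult": 0,        # 0 = easily recognized, 1 = hard to recognize
            "bndbox": {            # bounding box marking the area of interest
                "xmin": 48, "ymin": 240, "xmax": 195, "ymax": 371,
            },
        }
    ],
}
```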

Augmentation

The process of altering images to create new images that are sufficiently different from the originals. Augmentation can include blurring, cropping, brightening, darkening, rotating, and more. Augmentation is used to increase the size of a dataset.
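
As an illustration, the sketch below builds a small augmentation pipeline with torchvision (assuming a recent torchvision release; other libraries such as Albumentations work similarly, and the image here is a blank stand-in for a real photo):

```python
from PIL import Image
from torchvision import transforms

# A hypothetical augmentation pipeline: each call produces a new image
# that differs slightly from the original.
augment = transforms.Compose([
    transforms.RandomRotation(degrees=15),          # rotate
    transforms.ColorJitter(brightness=0.3),         # brighten / darken
    transforms.GaussianBlur(kernel_size=3),         # blur
    transforms.RandomResizedCrop(size=(224, 224)),  # crop and resize
])

image = Image.new("RGB", (640, 480))                   # stand-in for a real photo
augmented_copies = [augment(image) for _ in range(5)]  # 5 new training images
```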

Batch

The number of images trained on in a step.

Difficult

In annotation, ‘difficult’ is set to 1 when the object is not easily recognized; otherwise it is set to 0.

Epoch

One complete pass through the training dataset, i.e. training on each image one time.

Loss

A quantification of how different the model’s prediction is from the ground truth.
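
Loss functions differ by task (e.g. cross-entropy for classification, box-overlap losses for detection). As a minimal sketch with hypothetical values, mean squared error simply averages the squared differences between the model's predictions and the ground truth:

```python
def mean_squared_error(predictions, ground_truth):
    """One common loss: the average squared difference between
    the model's predictions and the ground truth (annotations)."""
    errors = [(p - t) ** 2 for p, t in zip(predictions, ground_truth)]
    return sum(errors) / len(errors)

# The closer the predictions are to the ground truth, the smaller the loss.
print(mean_squared_error([0.9, 0.2], [1.0, 0.0]))  # 0.025
```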

Learning Rate

How much the model's weights are adjusted each time they are updated; a higher learning rate means larger changes to the weights per update.
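
As a minimal sketch of plain gradient descent (the numbers are hypothetical), the learning rate scales the size of each weight update:

```python
def update_weight(weight, gradient, learning_rate):
    """Plain gradient descent: the learning rate scales how large a step
    is taken against the gradient of the loss at each update."""
    return weight - learning_rate * gradient

# The same gradient produces a smaller change with a smaller learning rate.
print(update_weight(0.50, 2.0, learning_rate=0.1))   # 0.30
print(update_weight(0.50, 2.0, learning_rate=0.01))  # 0.48
```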

Overfitting

When the model performs well on the training dataset, but poorly on new test data.

Step

Training on ‘batch’ number of images. For instance, if the batch size is 16, then in one step the model trains on 16 images.
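
A quick arithmetic sketch with hypothetical numbers ties batch, step, and epoch together (assuming the dataset size divides evenly by the batch size):

```python
dataset_size = 800   # hypothetical number of training images
batch_size = 16      # images the model trains on per step

steps_per_epoch = dataset_size // batch_size
print(steps_per_epoch)  # 50 -- after 50 steps every image has been seen once,
                        # which completes one epoch
```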

Train, Validation, Test Split

There are three components to training a neural network: the actual training, the tuning of hyperparameters such as the learning rate, and the final testing of the model. To accomplish this, the original dataset is typically split into a training and a test dataset, usually with an 80/20 split, respectively. The training dataset is then further split into a training and a validation dataset. As the model trains, it compares its predictions to the annotations on the training data and adjusts its weights accordingly. The validation data is held out from this process and is used to evaluate the model while tuning the hyperparameters. Only once training and tuning are finished is the model evaluated on the test data, which gives an unbiased measure of how well it performs on unseen data.
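
As a minimal sketch of such a split using scikit-learn's train_test_split (the dataset contents and exact percentages are illustrative):

```python
from sklearn.model_selection import train_test_split

# Hypothetical dataset: 100 image file names and their labels.
images = [f"image_{i:03d}.jpg" for i in range(100)]
labels = ["dog"] * 100

# First split off 20% of the data as a held-out test set (an 80/20 split).
train_images, test_images, train_labels, test_labels = train_test_split(
    images, labels, test_size=0.2, random_state=42)

# Then carve a validation set out of the remaining training data,
# here 20% of the training portion.
train_images, val_images, train_labels, val_labels = train_test_split(
    train_images, train_labels, test_size=0.2, random_state=42)

print(len(train_images), len(val_images), len(test_images))  # 64 16 20
```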

Truncated

When annotating, this describes whether the object being annotated is completely visible. If the object is fully visible (i.e. not truncated), this value is set to 0; otherwise it is set to 1. Typically, if 20% or more of the object is obscured, it should be marked as truncated.

Underfitting

The model performs poorly on the training data as well as on the validation and test data.

Weight

Weights are a way of quantifying how important a given input is for a neural network and how much it contributes to the output.
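
As a minimal sketch with hypothetical numbers, a single neuron computes a weighted sum of its inputs, so inputs with larger weights contribute more to the output:

```python
def neuron_output(inputs, weights, bias):
    """A single neuron: each input is scaled by its weight, so larger
    weights mean that input contributes more to the output."""
    return sum(x * w for x, w in zip(inputs, weights)) + bias

# The second input dominates the output because of its larger weight.
print(neuron_output(inputs=[1.0, 1.0], weights=[0.1, 0.9], bias=0.0))  # 1.0
```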