Training
Machine learning is one method of developing AI, in which an algorithm learns by looking at real-world data. In Decthings, this process is called training.
To start a training session, go to the "Train" tab on the model page, and click "Start training". You will then be presented with a few options, which will be explained in this article.
States
A state is a piece of binary information stored within each model. The state contains data that the model chooses to store - for a neural network, this would be the weights and biases of each neuron in the network.
When a training session completes, a new state is created and added to the model. This state contains the updated information, which for a neural network would be the optimized weights and biases. When the model is evaluated, a state is loaded first, which allows the model to use the improvements from training.
Base state
When starting a training session, you can select which state to use as the base state. Before the training session starts, the base state is loaded into the model, which means that the model will train using that particular state. For example, if you set the base state to a state which was previously trained, the model will continue training where the previous training session left off, without having to start over from scratch.
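The relationship between states and base states can be illustrated with a minimal sketch. It assumes a toy "model" whose state is just a list of floats packed into bytes; the function names `serialize_state`, `deserialize_state` and `train` are hypothetical and are not part of the Decthings API.

```python
import struct

def serialize_state(weights):
    """Pack a list of floats into bytes: a 4-byte count followed by the values."""
    return struct.pack(f"<I{len(weights)}d", len(weights), *weights)

def deserialize_state(blob):
    """Inverse of serialize_state: read the count, then that many doubles."""
    (count,) = struct.unpack_from("<I", blob, 0)
    return list(struct.unpack_from(f"<{count}d", blob, 4))

def train(base_state, steps):
    """Load the base state, move each weight halfway toward 1.0 per step,
    and return a new state. The base state itself is left unchanged."""
    weights = deserialize_state(base_state)
    for _ in range(steps):
        weights = [w + 0.5 * (1.0 - w) for w in weights]
    return serialize_state(weights)

initial_state = serialize_state([0.0, 0.0])
first_state = train(initial_state, steps=1)   # trained from scratch
second_state = train(first_state, steps=1)    # continues where first_state left off
```

Training twice with the previous result as the base state ends up at the same weights as a single two-step session from the initial state, which is the sense in which training "continues where it left off".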
Let's take a more detailed look at the training process.
Prerequisites: Create a model
Before training, you need a model. You have a few options:
- Re-train a pre-made model on your own dataset
The fastest way to get going. For some common applications there are pre-made models that you can train on your own dataset. For example, if you have a dataset of birds you can make an image classifier which can classify the species. Recommended guide: Re-train a pre-made image classifier.
- Using the visual editor
You can create your own neural network by using the visual editor. Recommended guide: Create an image classifier using the visual editor.
- By writing code
You can also create your own model by writing code. Recommended guide: Create an image classifier in code.
Configuration options and input data
Note: For models created using the visual editor, all input and output parameters are automatically configured, so you can skip this step.
The type of input data that the training session expects can be configured in the model settings. You can also allow your model to receive configuration options when training, such as the number of epochs to train for.
For example, consider an image classifier. Click the settings gearwheel icon on the "Configure" tab on the model page, click "Edit Input/Outputs" and go to the "Train" tab. Remove all existing parameters to start from scratch.
Add a parameter and give it the name "input". Set the data type to dictionary and the shape to "-1". We use a dictionary because each data point contains one image and one label. Give the first entry the name "data", the shape "" (empty shape) and the data type "Image". Add another entry to the dictionary, and give it the name "label", the shape "" and the data type "Text". The input data is now configured.
Next, add a configuration option for the number of epochs. Add a new parameter by clicking the plus icon. Name the parameter "Epochs", set the shape to "1" and the data type to "Unsigned int 32". When a user now starts a training session, they will be prompted to add input data and to configure the number of epochs.
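To make the layout of these parameters concrete, here is a rough sketch of what one set of training inputs could look like as plain Python values. It is an illustration only: it assumes that shape "-1" means a variable-length list of data points and shape "1" means a single value, and the helper `check_data_point` and the sample values are hypothetical. It does not show how Decthings represents data internally.

```python
def check_data_point(point):
    """Verify that a data point has exactly the entries "data" and "label"."""
    if set(point) != {"data", "label"}:
        raise ValueError(f"unexpected entries: {sorted(point)}")
    if not isinstance(point["data"], bytes):
        raise TypeError('"data" should hold encoded image bytes')
    if not isinstance(point["label"], str):
        raise TypeError('"label" should be text')
    return True

# "input": a list of dictionaries, one per data point (assumed meaning of shape "-1").
# The image bytes are placeholders, not real image files.
train_input = [
    {"data": b"<image bytes 1>", "label": "sparrow"},
    {"data": b"<image bytes 2>", "label": "robin"},
]

# "Epochs": a single unsigned integer (assumed meaning of shape "1").
epochs = 3

all_valid = all(check_data_point(point) for point in train_input)
```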
More information about input and output data can be found here.
Start the training session
Go to the "Train" tab on the model page and click "Start training". Select which state to use as the base state, then enter a descriptive name that will be given to the new state if the training session is successful. Next, configure the launcher to use. You can use either a temporary or a persistent launcher. For more information, see launchers. Now specify a maximum duration for your training session. If the session is still running after this amount of time, it will be cancelled by Decthings.
Finally, provide the input data to use as well as the information for all configuration parameters.
Click "Train" and your training session will soon be started.
Visual editor configuration parameters
For models created using the visual editor, the input parameters will be "data", "epochs", "validation_ratio" and "validation_frequency".
The "data" parameter depends on your model structure, but will in general be of type dictionary, where each input node corresponds to one key in the dictionary. The "epochs" parameter specifies the number of times to go through the entire input dataset. In most cases we want to go through the dataset many times to achieve better accuracy, at the cost of a longer training time.
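The effect of the number of epochs can be seen in a small, self-contained sketch. It assumes a one-weight linear model trained with plain stochastic gradient descent on a toy dataset where y = 2x; this is not the code the visual editor generates, and the names `train_epochs` and `mean_squared_error` are made up for the example.

```python
def train_epochs(data, epochs, learning_rate=0.05):
    """Fit y = weight * x by going through the whole dataset `epochs` times."""
    weight = 0.0
    for _ in range(epochs):
        for x, y in data:
            error = weight * x - y
            weight -= learning_rate * error * x  # gradient step for squared error
    return weight

def mean_squared_error(weight, data):
    """Average squared prediction error over the dataset."""
    return sum((weight * x - y) ** 2 for x, y in data) / len(data)

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

weight_after_1 = train_epochs(data, epochs=1)
weight_after_10 = train_epochs(data, epochs=10)

loss_after_1 = mean_squared_error(weight_after_1, data)
loss_after_10 = mean_squared_error(weight_after_10, data)
```

With this data, ten epochs bring the weight much closer to 2 than a single epoch does, but the training loop also runs ten times as long.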
Validation
Validation is often a good idea in order to get an estimate of how well our model is performing. Validation means that we take a portion of our training dataset, and separate it so that we do not use it for training. Instead, at regular intervals during training, we evaluate our model against this validation data, and see how well it is performing. Because the validation data was not used for training, this will give a more realistic estimate of the accuracy than by measuring the accuracy on the training data.
For models created using the visual editor, validation is built-in. The "validation_ratio" parameter specifies how much data to use for validation. For example, if set to 0.2, 20% of the input data will not be used for training, only for validation. The "validation_frequency" parameter specifies how often to run validation. For example, if set to 2, validation will be performed twice per epoch.
If you create your own model in code, you will have to manually separate the input data into multiple chunks and manually run the validation. An example of how to do this can be found in the guide Create an image classifier in code.
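As a rough sketch of what manual validation involves, the following splits a dataset according to a validation ratio and works out when validation would run within an epoch. The function names are hypothetical, and the even spacing of validation runs is an assumption; the visual editor may schedule them differently.

```python
import random

def split_for_validation(data, validation_ratio, seed=0):
    """Shuffle a copy of the data and hold out a fraction for validation.
    Returns (training_data, validation_data)."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    validation_count = round(len(shuffled) * validation_ratio)
    return shuffled[validation_count:], shuffled[:validation_count]

def validation_steps(batches_per_epoch, validation_frequency):
    """1-based batch indices after which validation runs within one epoch,
    spaced evenly (an assumption made for this sketch)."""
    return [
        round(batches_per_epoch * (i + 1) / validation_frequency)
        for i in range(validation_frequency)
    ]

samples = list(range(10))
training_data, validation_data = split_for_validation(samples, validation_ratio=0.2)
steps = validation_steps(batches_per_epoch=10, validation_frequency=2)
```

Here a ratio of 0.2 holds out 2 of the 10 samples, and a frequency of 2 runs validation after batches 5 and 10 of each epoch. The held-out samples never appear in the training portion.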
Metrics and progress
A training session takes time to complete. How long it takes depends on the model code. While running, the model code can choose to output progress and metrics information. When training a model created with the visual editor, the code for reporting progress and metrics has already been added and you do not need to do anything to activate it.
The progress is a single number between 0 and 100. When reported by the model, it is sent to your browser and displayed as a loading indicator, so that you can see approximately how much time is left.
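One simple way for model code to derive such a number is from the share of training steps completed so far. This is a minimal sketch with a hypothetical helper name; how the value is actually sent to Decthings is not shown here.

```python
def progress_percent(completed_steps, total_steps):
    """Return training progress as a number between 0 and 100."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    fraction = completed_steps / total_steps
    fraction = min(max(fraction, 0.0), 1.0)  # clamp to the valid range
    return 100.0 * fraction

# Example: 3 epochs of 20 batches each, currently 15 batches in.
current_progress = progress_percent(completed_steps=15, total_steps=3 * 20)
```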
Metrics allow the training session to output more complex information. Metrics can contain any information, such as numbers, images or text. This information can be viewed while the training is running, or after it has completed. For example, a neural network could report a metric called "loss" containing the calculated loss of a batch or epoch. The model should preferably send these metrics at frequent intervals, so that the user can see in real time how the model performance changes. When creating a model, make sure to output metrics that are of value to the person who started the training session, for example loss, accuracy or average batch duration.
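The reporting pattern described above might look roughly like the sketch below. The `report_metric` callback is a stand-in invented for this example, not the Decthings API, and the loss values are placeholder numbers rather than results from a real model.

```python
def train_with_metrics(batch_losses_per_epoch, report_metric):
    """Report the loss after every batch and the mean loss after every epoch.
    `report_metric(name, value)` is a hypothetical callback."""
    for batch_losses in batch_losses_per_epoch:
        for loss in batch_losses:
            report_metric("loss", loss)
        report_metric("epoch_loss", sum(batch_losses) / len(batch_losses))

# Collect the reported metrics in a list instead of sending them anywhere.
collected = []
train_with_metrics(
    [[0.5, 0.25], [0.125, 0.125]],  # placeholder per-batch losses for two epochs
    lambda name, value: collected.append((name, value)),
)
epoch_losses = [value for name, value in collected if name == "epoch_loss"]
```

Reporting per batch as well as per epoch gives the user frequent updates while still providing a smoother summary value.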
Below is an example where we trained an image classifier created with the visual editor.




As you can see, validation accuracy was in this case lower than training accuracy. This is often the case, because the model adapts better to the training data than to data it has not seen before.
System information
Decthings automatically collects system information, including CPU usage, memory usage and disk usage. You can view graphs of how these values change over time and use them to decide whether to upgrade or downgrade your launcher specification, for example by adding more memory.
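As an illustration of that kind of decision, the sketch below looks at peak memory usage over a session. The thresholds of 90% and 30% are arbitrary assumptions chosen for the example, not Decthings recommendations, and the function name is hypothetical.

```python
def suggest_memory_change(memory_used_mb, memory_total_mb, high=0.9, low=0.3):
    """Suggest "upgrade", "downgrade" or "keep" based on peak memory usage.
    `high` and `low` are assumed thresholds, expressed as fractions of the total."""
    if not memory_used_mb:
        raise ValueError("at least one usage sample is required")
    peak_fraction = max(memory_used_mb) / memory_total_mb
    if peak_fraction >= high:
        return "upgrade"
    if peak_fraction <= low:
        return "downgrade"
    return "keep"

# Placeholder usage samples (in MB) for a launcher with 1000 MB of memory.
suggestion = suggest_memory_change([400, 900, 950], memory_total_mb=1000)
```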
Writing the train function in code
For details about how to create your model in code, see write the model code.