Example
The typical workflow of training and using neural networks, regardless of the library used, goes like this:
Training Data
- Getting the training data: the X variable is the input, and the Y variable is the output. The simplest thing to do is to learn a logic gate, where X is a vector of two numbers and Y is a vector of one number. Typically, the input and output values are floats, so to work with words you might associate each word with a different neuron. You could also use characters directly, which requires fewer neurons than keeping a whole dictionary.
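As a concrete illustration, here is what the training data for one logic gate, XOR, could look like. The array shapes are an assumption for this sketch: each row of X is a vector of two numbers and each row of Y a vector of one number, as described above.

```python
import numpy as np

# Hypothetical training data for the XOR logic gate.
# Each row of X is one input example; the matching row of Y is its output.
X = np.array([[0.0, 0.0],
              [0.0, 1.0],
              [1.0, 0.0],
              [1.0, 1.0]])
Y = np.array([[0.0],
              [1.0],
              [1.0],
              [0.0]])

print(X.shape, Y.shape)  # (4, 2) (1 output per example)
```

Note that the values are floats, per the convention mentioned above, even though a logic gate only ever uses 0 and 1.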
Architecture
- Defining the neural network's architecture: this is done by specifying how the neurons are linked together and with which algorithm the connections between them are trained and adjusted. As an example, text is often processed with recurrent neural networks, which receive a new input at each timestep and whose neurons keep a reference to their earlier value in time for effective computation. Commonly, neurons are organized in layers, generally stacked one over the other from the inputs to the outputs. The way neurons are connected from one layer to the next varies a lot. Some computer vision architectures use deep neural networks (with many specialized layers stacked).
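A minimal sketch of the "stacked layers" idea, assuming a tiny 2-3-1 feedforward architecture with sigmoid activations (the layer sizes and activation choice are illustrative, not prescribed by the text):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Two layers stacked from inputs to outputs:
# W1, b1 connect the 2 inputs to 3 hidden neurons;
# W2, b2 connect the hidden layer to the single output neuron.
W1 = rng.normal(size=(2, 3))
b1 = np.zeros(3)
W2 = rng.normal(size=(3, 1))
b2 = np.zeros(1)

def forward(x):
    h = sigmoid(x @ W1 + b1)     # hidden layer activations
    return sigmoid(h @ W2 + b2)  # output layer activation

x = np.array([[0.0, 1.0]])
print(forward(x).shape)  # (1, 1): one output per input example
```

The weights here are random and untrained; a training algorithm (such as backpropagation) would adjust W1, b1, W2, b2 to fit the data.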
Evaluation
- Next, the neural network is typically evaluated on data it has not been directly trained on. This consists of presenting the X part of the data to the neural network, then comparing the Y it predicts to the real Y. Many metrics exist to assess the quality of the learning performed.
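Two of those metrics can be sketched in a few lines. The predicted values below are made up purely for illustration; in practice they would come from the network's output on held-out data.

```python
import numpy as np

# Real Y for four held-out examples, and hypothetical predictions for them.
y_true = np.array([0.0, 1.0, 1.0, 0.0])
y_pred = np.array([0.1, 0.9, 0.7, 0.4])

# Mean squared error: average squared gap between prediction and truth.
mse = np.mean((y_true - y_pred) ** 2)

# Accuracy: threshold the predictions at 0.5, then count matches.
accuracy = np.mean((y_pred > 0.5) == y_true)

print(round(mse, 4), accuracy)  # 0.0675 1.0
```

Which metric is appropriate depends on the task: accuracy suits classification-style outputs like a logic gate, while mean squared error also applies to continuous outputs.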
Improvement
- It is common to fiddle with the architecture of the neural network again to improve its performance. The neural network must be neither too intelligent nor too dumb, because both cases cause problems. In the first case, the neural network might be too large for the data, memorizing it perfectly, and it might fail to generalize to new unseen examples (overfitting). In the second case, if the neural network is too dumb (small), it will fail to learn at all (underfitting).
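The too-small versus too-large trade-off can be seen on held-out data. For brevity this sketch uses polynomial regression rather than a neural network, but the idea transfers directly: a model that is too small (degree 1) cannot fit the data, while the right-sized model (degree 2) generalizes well. All the numbers here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Noisy samples of a quadratic: the "right" model capacity is degree 2.
x = np.linspace(-1, 1, 20)
y = x ** 2 + rng.normal(scale=0.05, size=x.shape)

# Held-out validation points the model never sees during fitting.
x_val = np.linspace(-0.95, 0.95, 10)
y_val = x_val ** 2

def val_error(degree):
    """Fit a polynomial of the given degree, return its validation MSE."""
    coeffs = np.polyfit(x, y, degree)
    return np.mean((np.polyval(coeffs, x_val) - y_val) ** 2)

for d in (1, 2, 15):
    print(d, val_error(d))  # degree 1 (too dumb) does much worse than 2
```

The too-dumb model (degree 1) fails on both training and validation data; an oversized model (degree 15) can match the noisy training points closely yet still do worse than degree 2 on the held-out points.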
Real-World Use
- Using it on new data to predict an output. Neural networks are quite useful here: automatic text translation or answering a textual question are good examples. One of the techniques used to improve the neural network at this stage is online learning, meaning that if the network can get constructive feedback on its outputs, it is still possible to continue the learning process. As an example, this might be the case for Google Translate, which asks users for feedback on translations.
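The online-learning idea above can be sketched with a single logistic neuron that keeps updating its weights each time feedback arrives on one of its predictions. The feedback stream, learning rate, and single-neuron model are all assumptions made for this sketch.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = np.zeros(2)  # weights of a single logistic neuron
b = 0.0
lr = 0.5         # learning rate (illustrative value)

# A stream of (input, correct answer) pairs: each pair is one piece of
# constructive feedback arriving while the model is already deployed.
stream = [([0.0, 0.0], 0.0), ([1.0, 1.0], 1.0)] * 200

for x, target in stream:
    x = np.asarray(x)
    pred = sigmoid(w @ x + b)
    error = pred - target  # the feedback tells us the true answer
    w -= lr * error * x    # one gradient step per feedback item
    b -= lr * error

print(sigmoid(w @ np.array([1.0, 1.0]) + b))  # close to 1 after the stream
```

Unlike the batch training described earlier, no fixed training set is needed: the model improves incrementally, one feedback item at a time, which is what makes this workable after deployment.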