The core data structure of Keras is a model, a way to organize layers. The main type of model is the Sequential model, a linear stack of layers. For more complex architectures, you should use the Keras functional API.
Here's the Sequential model:
from keras.models import Sequential
model = Sequential()
Stacking layers is as easy as .add():
from keras.layers import Dense, Activation
model.add(Dense(output_dim=64, input_dim=100))
model.add(Activation("relu"))
model.add(Dense(output_dim=10))
model.add(Activation("softmax"))
Once your model looks good, configure its learning process with .compile():
model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
If you need to, you can further configure your optimizer. A core principle of Keras is to make things reasonably simple, while allowing the user to be fully in control when they need to (the ultimate control being the easy extensibility of the source code).
from keras.optimizers import SGD
model.compile(loss='categorical_crossentropy', optimizer=SGD(lr=0.01, momentum=0.9, nesterov=True))
You can now iterate on your training data in batches:
model.fit(X_train, Y_train, nb_epoch=5, batch_size=32)
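If you just want to smoke-test this setup, you can generate dummy data matching the shapes above (100-dimensional inputs, one-hot targets over 10 classes). The array names and the to_categorical helper call are illustrative only, and the helper's exact signature varies across Keras versions:
import numpy as np
from keras.utils.np_utils import to_categorical
# 1000 random samples with 100 features each, plus random integer labels in [0, 10)
X_train = np.random.random((1000, 100))
Y_train = to_categorical(np.random.randint(10, size=(1000,)), 10)
With these arrays in place, the model.fit() call above runs as-is.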
Alternatively, you can feed batches to your model manually:
model.train_on_batch(X_batch, Y_batch)
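For example, a plain Python loop can slice the batches out of NumPy arrays itself. This is only a sketch; the epoch count, batch size, and array names are assumptions rather than anything train_on_batch requires:
batch_size = 32
nb_batches = len(X_train) // batch_size
for epoch in range(5):
    for i in range(nb_batches):
        # Slice out one batch of inputs and the matching targets
        X_batch = X_train[i * batch_size:(i + 1) * batch_size]
        Y_batch = Y_train[i * batch_size:(i + 1) * batch_size]
        loss = model.train_on_batch(X_batch, Y_batch)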
Evaluate your performance in one line:
loss_and_metrics = model.evaluate(X_test, Y_test, batch_size=32)
Or generate predictions on new data:
classes = model.predict_classes(X_test, batch_size=32)
proba = model.predict_proba(X_test, batch_size=32)
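For the more complex architectures mentioned at the top, the functional API builds a model by calling layers on tensors. A minimal sketch, assuming a Keras version where Model takes input/output keyword arguments (later releases renamed them to inputs/outputs):
from keras.layers import Input, Dense
from keras.models import Model
# A layer instance is callable on a tensor and returns a tensor
inputs = Input(shape=(100,))
x = Dense(64, activation='relu')(inputs)
predictions = Dense(10, activation='softmax')(x)
# The Model ties the input tensor to the output tensor
model = Model(input=inputs, output=predictions)
model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])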
Building a question answering system, an image classification model, a Neural Turing Machine, a word2vec embedder or any other model is just as fast. The ideas behind deep learning are simple, so why should their implementation be painful?
In the examples folder of the repository, you will find more advanced models: question-answering with memory networks, text generation with stacked LSTMs, and so on.