How To Build Multi-Layer Perceptron Neural Network Models with Keras

The Keras Python library for deep learning focuses on the creation of models as a sequence of layers.

In this post you will discover the simple components that you can use to create neural networks and simple deep learning models using Keras from TensorFlow.

Let’s get started.

May 2016: First version
Update Mar/2017: Updated example for Keras 2.0.2, TensorFlow 1.0.1 and Theano 0.9.0.
Update Jun/2022: Updated code to TensorFlow 2.x. Update external links.

Neural Network Models in Keras

The focus of the Keras library is a model.

The simplest model is defined in the Sequential class, which is a linear stack of layers.

You can create a Sequential model and define all of the layers in the constructor, for example:

from tensorflow.keras.models import Sequential
model = Sequential(…)

A more useful idiom is to create a Sequential model and add your layers in the order of the computation you wish to perform, for example:

from tensorflow.keras.models import Sequential
model = Sequential()
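
The two idioms above can be sketched side by side. This is a minimal illustration, not the only way to write it: the layer sizes (8 inputs, 16 hidden units, 1 output), the activations, and the use of an Input object to declare the input shape are all assumptions chosen for the example.

```python
from tensorflow.keras import Input
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

# Idiom 1: pass the full list of layers to the constructor.
model_a = Sequential([
    Input(shape=(8,)),
    Dense(16, activation="relu"),
    Dense(1, activation="sigmoid"),
])

# Idiom 2: start with an empty model and add layers in computation order.
model_b = Sequential()
model_b.add(Input(shape=(8,)))
model_b.add(Dense(16, activation="relu"))
model_b.add(Dense(1, activation="sigmoid"))

# Both models have the same two Dense layers and the same parameter count.
print(model_a.count_params(), model_b.count_params())
```

Both styles produce the same architecture; adding layers one at a time tends to be easier to read and to modify as a model grows.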

Model Inputs

The first layer in your model must specify the shape of the input.

This is the number of input attributes and is defined by the input_dim argument. This argument expects an integer.

For example, you can define a Dense layer with 16 units that expects 8 inputs as follows:

Dense(16, input_dim=8)
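
To see what the input shape controls, the sketch below builds a one-layer model and inspects its weights. It assumes a recent TensorFlow 2.x release and declares the 8 inputs with an Input object, which is an alternative to the input_dim argument shown above.

```python
from tensorflow.keras import Input
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

model = Sequential()
model.add(Input(shape=(8,)))  # 8 input attributes per sample
model.add(Dense(16))          # 16 units, each connected to all 8 inputs

# The layer's weight matrix has one row per input and one column per unit.
kernel, bias = model.layers[0].get_weights()
print(kernel.shape)  # (8, 16)
print(bias.shape)    # (16,)
```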

Model Layers

Layers of different types have a few properties in common, specifically their method of weight initialization and their activation functions.

Weight Initialization

The type of initialization used for a layer is specified in the kernel_initializer argument (called init in Keras 1.x).

Some common types of layer initialization include:

'random_uniform': Weights are initialized to small uniformly random values between -0.05 and 0.05.
'random_normal': Weights are initialized to small Gaussian random values (zero mean and standard deviation of 0.05).
'zeros': All weights are set to zero values.

You can see a full list of initialization techniques supported on the Usage of initializations page.
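
As a rough sketch of how an initializer is attached to a layer, the example below assumes the TensorFlow 2.x argument name kernel_initializer. The layer sizes and the seed are arbitrary values picked for illustration.

```python
import numpy as np
from tensorflow.keras import Input
from tensorflow.keras.initializers import RandomUniform
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

model = Sequential([
    Input(shape=(8,)),
    # Initializer chosen by name, using its default settings.
    Dense(16, kernel_initializer="zeros"),
    # Initializer object, configured explicitly.
    Dense(4, kernel_initializer=RandomUniform(minval=-0.05, maxval=0.05, seed=1)),
])

zeros_kernel = model.layers[0].get_weights()[0]
uniform_kernel = model.layers[1].get_weights()[0]

print(np.all(zeros_kernel == 0.0))              # every weight is zero
print(np.all(np.abs(uniform_kernel) <= 0.05))   # every weight lies in the range
```

Passing a name is convenient when the defaults are fine; an initializer object is the way to change the range or fix a seed.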

Activation Function

Keras supports a range of standard neuron activation functions, such as softmax, rectified linear (relu), tanh and sigmoid.

You typically specify the type of activation function used by a layer in the activation argument, which takes a string value.

You can see a full list of activation functions supported by Keras on the Usage of activations page.

Interestingly, you can also create an Activation object and add it directly to your model after your layer to apply that activation to the output of the Layer.
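
The two ways of applying an activation can be compared directly. In this sketch the layer sizes and the random input are assumptions for illustration; the weights are copied from one model to the other so that both compute the same function.

```python
import numpy as np
from tensorflow.keras import Input
from tensorflow.keras.layers import Activation, Dense
from tensorflow.keras.models import Sequential

# Style 1: activation given as a string argument on the layer.
inline = Sequential([Input(shape=(8,)), Dense(16, activation="relu")])

# Style 2: a linear Dense layer followed by a separate Activation layer.
separate = Sequential([Input(shape=(8,)), Dense(16), Activation("relu")])

# Copy the weights so that both models hold identical parameters.
separate.layers[0].set_weights(inline.layers[0].get_weights())

x = np.random.rand(4, 8).astype("float32")
out_inline = inline.predict(x, verbose=0)
out_separate = separate.predict(x, verbose=0)

same = np.allclose(out_inline, out_separate, atol=1e-6)
print(same)
```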

Layer Types

There are a large number of core Layer types for standard neural networks.

Some common and useful layer types you can choose from are:

Dense: Fully connected layer and the most common type of layer used on multi-layer perceptron models.
Dropout: Apply dropout to the model, setting a fraction of inputs to zero in an effort to reduce overfitting.
Concatenate: Combine the outputs from multiple layers as input to a single layer.

You can learn about the full list of core Keras layers on the Core Layers page.
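
The three layer types listed above can be combined in one small model. Because Concatenate takes several inputs, this sketch uses the Keras functional API (Input and Model) rather than Sequential, which is an assumption beyond what this post covers. The branch sizes and the dropout rate are arbitrary.

```python
from tensorflow.keras import Input, Model
from tensorflow.keras.layers import Concatenate, Dense, Dropout

inputs = Input(shape=(8,))

# Two parallel Dense branches reading the same input.
branch_a = Dense(16, activation="relu")(inputs)
branch_b = Dense(4, activation="tanh")(inputs)

# Concatenate joins the branch outputs along the feature axis: 16 + 4 = 20.
merged = Concatenate()([branch_a, branch_b])

# Dropout zeroes 20% of its inputs, and only while training.
dropped = Dropout(0.2)(merged)
outputs = Dense(1, activation="sigmoid")(dropped)

model = Model(inputs=inputs, outputs=outputs)
print(merged.shape[-1])  # 20
```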

Model Compilation

Once you have defined your model, it needs to be compiled.

This creates the structures TensorFlow uses to execute your model efficiently during training. Specifically, TensorFlow converts your model into a graph so that training can be carried out efficiently.

You compile your model using the compile() function, which accepts three important arguments:

Model optimizer.
Loss function.
Metrics.

model.compile(optimizer=…, loss=…, metrics=…)

1. Model Optimizers

The optimizer is the search technique used to update weights in your model.

You can create an optimizer object and pass it to the compile function via the optimizer argument. This allows you to configure the optimization procedure with its own arguments, such as learning rate. For example:

from tensorflow.keras.optimizers import SGD
sgd = SGD(…)

You can also use the default parameters of the optimizer by passing its name as a string to the optimizer argument. For example:

model.compile(optimizer='sgd')
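
A minimal sketch of both styles follows. The learning rate and momentum values, the loss, and the helper function build_model are illustrative assumptions, not recommendations.

```python
from tensorflow.keras import Input
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import SGD


def build_model():
    # Hypothetical helper: a tiny model used only for this illustration.
    return Sequential([Input(shape=(8,)), Dense(1, activation="sigmoid")])


# Style 1: a configured optimizer object (values are illustrative).
configured = build_model()
configured.compile(
    optimizer=SGD(learning_rate=0.01, momentum=0.9),
    loss="binary_crossentropy",
)

# Style 2: the optimizer's name, which uses its default parameters.
default = build_model()
default.compile(optimizer="sgd", loss="binary_crossentropy")

print(type(configured.optimizer).__name__, type(default.optimizer).__name__)
```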


Some popular gradient descent optimizers you might like to choose from include:

SGD: stochastic gradient descent, with support for momentum.
RMSprop: adaptive learning rate optimization method proposed by Geoff Hinton.
Adam: Adaptive Moment Estimation (Adam) that also uses adaptive learning rates.

You can learn about all of the optimizers supported by Keras on the Usage of optimizers page.

You can learn more about different gradient descent methods in the Gradient descent optimization algorithms section of Sebastian Ruder’s post An overview of gradient descent optimization algorithms.

2. Model Loss Functions

The loss function, also called the objective function, is the evaluation of the model used by the optimizer to navigate the weight space.

You specify the loss function by passing its name to the loss argument of the compile function. Some common examples include:

'mse': for mean squared error.
'binary_crossentropy': for binary logarithmic loss (logloss).
'categorical_crossentropy': for multi-class logarithmic loss (logloss).

You can learn more about the loss functions supported by Keras on the Losses page.
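
To get a feel for what these named losses compute, the sketch below evaluates two of them on a tiny hand-made example using the corresponding loss classes from tensorflow.keras.losses. The two targets and predictions are made up for illustration.

```python
import numpy as np
from tensorflow.keras.losses import BinaryCrossentropy, MeanSquaredError

# Two samples: true labels 0 and 1, both predicted as 0.5.
y_true = np.array([[0.0], [1.0]], dtype="float32")
y_pred = np.array([[0.5], [0.5]], dtype="float32")

# Mean squared error: (0.5^2 + 0.5^2) / 2 = 0.25
mse = float(MeanSquaredError()(y_true, y_pred))

# Binary log loss: -log(0.5) for each sample, about 0.693
bce = float(BinaryCrossentropy()(y_true, y_pred))

print(round(mse, 4), round(bce, 4))
```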

3. Model Metrics

Metrics are evaluated by the model during training.

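Accuracy is the metric you will use most often on classification problems. Keras supports other metrics as well, and you pass the ones you want as a list of names to the metrics argument, for example metrics=['accuracy'].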
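
The sketch below shows where a requested metric ends up after training. The model, the random placeholder data, and the training settings are assumptions chosen only to keep the example small.

```python
import numpy as np
from tensorflow.keras import Input
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

model = Sequential([Input(shape=(8,)), Dense(1, activation="sigmoid")])
model.compile(optimizer="sgd", loss="binary_crossentropy", metrics=["accuracy"])

# Random placeholder data: 20 samples, 8 attributes, binary labels.
rng = np.random.default_rng(0)
X = rng.random((20, 8)).astype("float32")
y = rng.integers(0, 2, size=(20, 1)).astype("float32")

history = model.fit(X, y, epochs=2, batch_size=10, verbose=0)

# Each requested metric is recorded once per epoch, alongside the loss.
print(sorted(history.history.keys()))
```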

Model Training

The model is trained on NumPy arrays using the fit() function, for example:

model.fit(X, y, epochs=…, batch_size=…)

Training specifies both the number of epochs to train for and the batch size.

Epochs (epochs) is the number of times that the model is exposed to the training dataset.
Batch Size (batch_size) is the number of training instances shown to the model before a weight update is performed.

The fit function also allows for some basic evaluation of the model during training. You can set the validation_split value to hold back a fraction of the training dataset for validation, to be evaluated each epoch, or provide a validation_data tuple of (X, y) to evaluate on.

Fitting the model returns a history object with details and metrics calculated for the model each epoch. This can be used for graphing model performance.
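
Putting the training options together, here is a small sketch that fits a model with a validation split and reads the per-epoch values from the history object. The synthetic dataset, the model, and the epoch and batch settings are all assumptions for illustration.

```python
import numpy as np
from tensorflow.keras import Input
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

# Synthetic data: 40 samples, 8 attributes, label is 1 when the row sum exceeds 4.
rng = np.random.default_rng(1)
X = rng.random((40, 8)).astype("float32")
y = (X.sum(axis=1, keepdims=True) > 4.0).astype("float32")

model = Sequential([
    Input(shape=(8,)),
    Dense(16, activation="relu"),
    Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Hold back the last 25% of the rows for validation after each epoch.
history = model.fit(X, y, epochs=3, batch_size=8, validation_split=0.25, verbose=0)

losses = history.history["loss"]
val_losses = history.history["val_loss"]
for epoch, (loss, val_loss) in enumerate(zip(losses, val_losses), start=1):
    print(f"epoch {epoch}: loss={loss:.4f} val_loss={val_loss:.4f}")
```

The lists in history.history have one entry per epoch, which makes them straightforward to plot.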

Model Prediction

Once you have trained your model, you can use it to make predictions on test data or new data.

There are a number of different output types you can calculate from your trained model, each calculated using a different function call on your model object. For example:

model.evaluate(): To calculate the loss values for input data.
model.predict(): To generate network output for input data.

Older versions of Keras also offered model.predict_classes() and model.predict_proba() on Sequential models. Both were removed in TensorFlow 2.6.

For example, on a classification problem you will use the predict() function to make predictions for test data or new data instances, then take the argmax of each output row (for a softmax output layer) to get the class labels.
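
As a sketch of making class predictions, the example below assumes a recent TensorFlow 2.x release and derives class labels from the output of predict() with NumPy's argmax. The 3-class model is untrained and the input data is random, so only the shapes are meaningful here.

```python
import numpy as np
from tensorflow.keras import Input
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

# Untrained 3-class model, used only to show the output shapes.
model = Sequential([Input(shape=(8,)), Dense(3, activation="softmax")])
model.compile(optimizer="sgd", loss="categorical_crossentropy", metrics=["accuracy"])

X_new = np.random.rand(5, 8).astype("float32")

# One row of class probabilities per sample; each row sums to 1.
probabilities = model.predict(X_new, verbose=0)

# The predicted class is the index of the largest probability in each row.
classes = np.argmax(probabilities, axis=1)

print(probabilities.shape, classes.shape)  # (5, 3) (5,)
```

For a binary classifier with a single sigmoid output, thresholding the output of predict() at 0.5 plays the same role as argmax does here.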

Summarize the Model

Once you are happy with your model you can finalize it.

You may wish to output a summary of your model. You can display one by calling the summary function, for example:

model.summary()

You can also retrieve a summary of the model configuration using the get_config() function, for example:

model.get_config()
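
Both calls can be tried on a small model. In this sketch the architecture is an arbitrary example; the exact layout of the printed table and the keys of the configuration dictionary can vary between Keras versions, so only the stable parts are inspected.

```python
from tensorflow.keras import Input
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

model = Sequential([
    Input(shape=(8,)),
    Dense(16, activation="relu"),
    Dense(1, activation="sigmoid"),
])

# Print a table of layers, output shapes and parameter counts.
model.summary()

# get_config() returns a plain dictionary describing the architecture.
config = model.get_config()
layer_types = [layer["class_name"] for layer in config["layers"]]
print(layer_types)
```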


Finally, you can create an image of your model structure directly. For example:

# requires the pydot package and Graphviz to be installed
from tensorflow.keras.utils import plot_model
plot_model(model, to_file='model.png')


You can learn more about how to create simple neural network and deep learning models in Keras using the following resources:

Getting started with the Keras Sequential model.
About Keras models.
The Sequential model API.


In this post you discovered the Keras API that you can use to create artificial neural networks and deep learning models.

Specifically, you learned about the life-cycle of a Keras model, including:

Constructing a model.
Creating and adding layers including weight initialization and activation.
Compiling models including optimization method, loss function and metrics.
Fitting models including epochs and batch size.
Model predictions.
Summarizing the model.

If you have any questions about Keras for Deep Learning or this post, ask in the comments and I will do my best to answer them.

The post How To Build Multi-Layer Perceptron Neural Network Models with Keras appeared first on Machine Learning Mastery.
