The Dense class from Keras is an implementation of the simplest neural network building block: the fully connected layer. But using it can be a little confusing because the Keras API adds a bunch of configurable functionality. This post will explain the layer in two sections (feel free to skip ahead): the math of fully connected layers, and the Keras API.

### Fully connected layers

At its core, a fully connected layer is a dot product between a data matrix and a weights matrix:

```
y = XW
```

The data will have an input shape of `(batch_size, n_features)` and the weight matrix will have shape `(n_features, n_units)`. This equation is important, because it shows you that the output shape will be `(batch_size, n_units)`.

Additionally, you typically want to add a bias vector of length `n_units`:

```
y = XW + b
```
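Here's a minimal NumPy sketch of that operation (the dimensions are arbitrary, chosen just for illustration):

```
import numpy as np

# Hypothetical dimensions, chosen only for this example
batch_size, n_features, n_units = 8, 4, 2

X = np.random.rand(batch_size, n_features)  # data matrix
W = np.random.rand(n_features, n_units)     # weight matrix
b = np.zeros(n_units)                       # bias vector

# The fully connected layer's linear transformation
y = X @ W + b
print(y.shape)  # (8, 2)
```

The output shape `(batch_size, n_units)` follows directly from the matrix multiplication, regardless of the values involved.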

When `n_units` is 1, this simplifies to linear regression and `b` becomes the y-intercept of the equation.
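To make that concrete, here's a tiny NumPy sketch with `n_units = 1` and made-up numbers, where `W` plays the role of the slope and `b` the y-intercept:

```
import numpy as np

X = np.array([[1.0], [2.0], [3.0]])  # one feature, three samples
W = np.array([[2.0]])                # slope (n_features=1, n_units=1)
b = np.array([0.5])                  # y-intercept

y = X @ W + b  # the familiar y = mx + b, one prediction per sample
print(y.ravel())  # [2.5 4.5 6.5]
```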

So far, this is a linear transformation. But often, you want to pass the output through an activation function to make it non-linear. For hidden layers in a neural network, this is what gives the model its expressive power: without non-linearities, a stack of dense layers collapses into a single linear transformation. For the final layer, activation functions are often used to give the output desirable characteristics, like being bounded between 0 and 1 (so it can be interpreted as a probability).

If we let the activation be an arbitrary function, `f`, then the full operation can be written:

```
y = f(XW + b)
```
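As a sketch in plain NumPy, with a hand-rolled sigmoid standing in for `f` (the shapes and values are made up):

```
import numpy as np

def sigmoid(z):
    # squashes any real value into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

X = np.ones((2, 3))
W = np.zeros((3, 2))
b = np.zeros(2)

# full fully-connected operation: y = f(XW + b)
y = sigmoid(X @ W + b)
print(y)  # all 0.5, since the linear output is all zeros
```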

During training, `W` and `b` will both be learned in order to best fit the data, `(X, y)`.

### API

Initializing a dense layer in Keras is easy:

```
import tensorflow as tf

linear = tf.keras.layers.Dense(
    2,
    use_bias=False,
    kernel_initializer='ones',
)
```

In the layer above, the output dimension is 2 (referred to as `n_units` in the previous section), there is no bias term, and the weights (called a kernel in Keras) are initialized to ones. The only required argument is the first positional one: the number of output units.

Once defined, you can add a dense layer to a model. But you can also apply it as a forward pass to a tensor to try it out:

```
x = tf.ones((1, 4))
linear(x)
# Expected result
# <tf.Tensor: shape=(1, 2), dtype=float32, numpy=array([[4., 4.]], dtype=float32)>
```

Notice the input dimension (`n_features`) is set to 4 in the `x` tensor and inferred automatically by the dense layer. This allows you to re-use the same code for multiple input shapes if you write your model correctly. Additionally, the output is a vector of fours because the kernel weights are all ones, so each output is the sum of the four input ones.
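As a sketch of the same forward pass with the bias enabled and an activation added (the initializers here are chosen so the output is easy to predict by hand):

```
import tensorflow as tf

dense = tf.keras.layers.Dense(
    2,
    activation='relu',
    kernel_initializer='ones',
    bias_initializer='zeros',  # this is the default
)
x = tf.ones((1, 4))
y = dense(x)
print(y)  # shape (1, 2); each entry is relu(4 + 0) = 4
```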

A few other points about the Dense API:

- Bias is optional but the default is to add it (`use_bias=True`).
- Activation is optional. The default is linear (no activation), but you can add it by specifying the string identifier of the activation (e.g. `activation='relu'`) or the actual activation function (e.g. `activation=tf.keras.activations.relu`). Check out Keras activations for more information.
- If input has >2 dimensions, you can think of Keras as flattening all but the last dimension, doing the original operation and then reshaping all but the last dimension back. So for example a `(2, 3, 4)` tensor run through a dense layer with `10` units will result in a `(2, 3, 10)` output tensor.
- Initialization can be customized for weights and bias separately, but the defaults are reasonable. Check out Keras initializers for more information.
- Regularization can be applied to weights and bias separately. Check out Keras regularizers for more information.
- Constraints can be applied to weights and bias separately. Check out Keras constraints for more information.
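The >2-dimensional behavior described above is easy to verify yourself:

```
import tensorflow as tf

# Dense applies to the last axis; leading dimensions pass through unchanged
dense = tf.keras.layers.Dense(10)
x = tf.ones((2, 3, 4))
y = dense(x)
print(y.shape)  # (2, 3, 10)
```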

By the way, if you need something a little more custom, check out my post on custom Keras layers.