SimpleNN: a simple neural network framework

This project builds a simple neural network framework based on Python and NumPy. It has three components:

Layers

It provides three kinds of layers, which are trained with the backpropagation algorithm.

Input Layer: feeds data to the neural network;
Output Layer: generates predictions; a softmax output layer is provided;
Hidden Layer: lies between the Input Layer and the Output Layer;
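
As a rough illustration of what a hidden layer does during backpropagation, the sketch below implements the forward and backward pass of one fully connected layer with a tanh activation in plain NumPy. The class name `DenseSketch` and its methods are hypothetical and are not the framework's actual `nn_layer` API.

```python
import numpy as np


class DenseSketch(object):
    """Hypothetical fully connected layer with tanh; not SimpleNN's real class."""

    def __init__(self, n_in, n_out, seed=0):
        rng = np.random.RandomState(seed)
        # small random weights, zero biases
        self.W = rng.randn(n_out, n_in) * np.sqrt(1.0 / n_in)
        self.b = np.zeros(n_out)

    def forward(self, x):
        # cache the input and output for the backward pass
        self.x = x
        self.out = np.tanh(self.W.dot(x) + self.b)
        return self.out

    def backward(self, grad_out, lr):
        # chain rule through tanh: d tanh(z)/dz = 1 - tanh(z)^2
        delta = grad_out * (1.0 - self.out ** 2)
        # gradient handed to the previous layer (uses W before the update)
        grad_in = self.W.T.dot(delta)
        # plain SGD step on weights and biases
        self.W -= lr * np.outer(delta, self.x)
        self.b -= lr * delta
        return grad_in


layer = DenseSketch(n_in=4, n_out=3)
x = np.array([0.5, -1.0, 0.25, 2.0])
y = layer.forward(x)
grad_x = layer.backward(np.ones(3), lr=0.0)  # lr=0 leaves the weights unchanged
```

Stacking several such layers and calling `backward` in reverse order, each one consuming the `grad_in` returned by the layer after it, is the essence of backpropagation.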

Activations

Several activation functions are provided, and can be plugged into the Hidden Layers.
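
To make the idea of a pluggable activation concrete, here is a minimal sketch of tanh and ReLU together with the derivatives that backpropagation needs. The function names are hypothetical; the framework's own `activation.tanhFunc` and `activation.reluFunc` may be organized differently.

```python
import numpy as np


def tanh_forward(z):
    return np.tanh(z)


def tanh_grad(z):
    # derivative of tanh(z) is 1 - tanh(z)^2
    return 1.0 - np.tanh(z) ** 2


def relu_forward(z):
    return np.maximum(z, 0.0)


def relu_grad(z):
    # 1 where the unit is active (z > 0), 0 elsewhere
    return (z > 0).astype(float)


z = np.array([-2.0, 0.0, 3.0])
relu_out = relu_forward(z)
relu_slope = relu_grad(z)
```

A hidden layer only needs such a forward/derivative pair, which is why different activations can be swapped in without changing the layer code.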

Simple Framework

With this framework, different neural networks can be built from the Input/Output/Hidden Layers and the activation functions.

A concrete example

With this framework, a simple 5-layer neural network is built. The main.py file demonstrates how to use the framework to build the network and how to train it on the MNIST dataset. (A CNN version can be found in another project.)

Achieve 98% correctness on MNIST

This section describes how to use the framework to build a neural network and how to train it on the MNIST dataset to reach 98% accuracy.

Even though the network is simple and the number is not impressive at all, it is still interesting to practice the common optimization techniques used to train a neural network.

define the neural network structure

This neural network has three hidden layers and a softmax output layer. The first two hidden layers use the tanh activation function, and the third hidden layer uses the ReLU activation function. The derivative of ReLU is 1 for positive inputs (and 0 for negative inputs), so active units pass the error from the output back to the previous layers without shrinking it.

def construct_nn(l2=0.0):
    img_input = nn_layer.InputLayer("mnist_input", 784)
    output_layer = nn_layer.SoftmaxOutputLayer("mnist_output", 10)

    # 1. set input and output layers
    nn = simple_nn.NNetwork()
    nn.set_input(img_input)
    nn.set_output(output_layer)

    # 2. add some hidden layers
    h1 = nn_layer.HiddenLayer("h1", 256, activation.tanhFunc)
    h1.set_lambda2(l2)
    nn.add_hidden_layer(h1)

    h2 = nn_layer.HiddenLayer("h2", 64, activation.tanhFunc)
    h2.set_lambda2(l2)
    nn.add_hidden_layer(h2)

    h3 = nn_layer.HiddenLayer("h3", 10, activation.reluFunc)
    h3.set_lambda2(l2)
    nn.add_hidden_layer(h3)

    # 3. complete nn construction
    nn.connect_layers()
    print(nn.get_detail())
    return nn
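
The `l2` argument is forwarded to every hidden layer through `set_lambda2`. How the framework applies it internally is not shown here; assuming it is a standard L2 weight-decay coefficient, a single SGD update would look roughly like the sketch below. The function `sgd_step_with_l2` is hypothetical.

```python
import numpy as np


def sgd_step_with_l2(W, grad_W, lr, lambda2):
    """One SGD update where the cost includes 0.5 * lambda2 * sum(W ** 2)."""
    # the penalty adds lambda2 * W to the data gradient
    return W - lr * (grad_W + lambda2 * W)


W = np.array([[1.0, -2.0], [0.5, 0.0]])
grad_W = np.zeros_like(W)
# with a zero data gradient the weights simply shrink toward zero
W_new = sgd_step_with_l2(W, grad_W, lr=0.1, lambda2=0.01)
```

With `lambda2=0.0`, as in the run reported here, the update reduces to plain SGD.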

With this neural network (l2=0.0) and simple SGD (no mini-batch), 10 epochs achieved 98.11% accuracy on the test dataset.

[train] accuracy=0.9972, avg_cost=0.0148
[test] accuracy=0.9811, avg_cost=0.0656
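
For readers who want to reproduce numbers of this kind, the sketch below shows one common way to compute accuracy and average cost for a softmax output. It assumes that avg_cost is the mean cross-entropy per sample, which the log lines do not state; all function names are hypothetical.

```python
import numpy as np


def softmax(z):
    # subtract the max for numerical stability
    e = np.exp(z - np.max(z))
    return e / e.sum()


def cross_entropy(probs, onehot):
    # small epsilon guards against log(0)
    return -np.sum(onehot * np.log(probs + 1e-12))


def evaluate(scores, onehots):
    """Return (accuracy, average cost) over rows of raw scores and one-hot labels."""
    correct, total_cost = 0, 0.0
    for z, y in zip(scores, onehots):
        p = softmax(z)
        total_cost += cross_entropy(p, y)
        correct += int(np.argmax(p) == np.argmax(y))
    n = len(scores)
    return correct / float(n), total_cost / n


# two samples, three classes: the first is classified correctly, the second is not
scores = np.array([[2.0, 0.1, -1.0], [0.0, 0.5, 3.0]])
onehots = np.array([[1, 0, 0], [0, 1, 0]])
acc, avg_cost = evaluate(scores, onehots)
```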

data normalization

This was the most important adjustment after I finished the code. Before normalizing the data, I could hardly reach 70% accuracy (the cost was around 0.70) on the MNIST dataset. After a very simple normalization was applied to the input data, 95% accuracy was reached within the first epoch.

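import numpy as np


def normalize_img(imgs):
    # scale pixel values from [0, 255] to [0, 1]
    result = imgs.astype(float)
    result /= 255.0
    # subtract each image's own mean (one image per row)
    avg = np.average(result, axis=1).reshape((-1, 1))
    result -= avg
    return result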
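
A small worked example may help to see what normalize_img does. The sketch repeats the function so that it runs on its own, and applies it to two made-up four-pixel "images" (real MNIST rows have 784 pixels).

```python
import numpy as np


def normalize_img(imgs):
    # same steps as above: scale to [0, 1], then subtract each image's mean
    result = imgs.astype(float)
    result /= 255.0
    avg = np.average(result, axis=1).reshape((-1, 1))
    result -= avg
    return result


# two fake images, one per row
imgs = np.array([[0, 0, 255, 255],
                 [51, 51, 51, 51]], dtype=np.uint8)
out = normalize_img(imgs)
# row 0 becomes [-0.5, -0.5, 0.5, 0.5];
# row 1 is constant, so it becomes (approximately) all zeros
```

After normalization every image has zero mean and values within [-1, 1], regardless of how bright the original image was.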

data shuffling

Because naive per-sample SGD is used, it is also important to shuffle the data during training, so that the network does not see the samples in the same fixed order on every pass.

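# assumed import: the original snippet does not show where shuffle comes from
from random import shuffle


def train_it(nn, train_data, lr):
    labels = train_data[0]
    imgs = train_data[1]

    # shuffle the sample indices; list() is needed on Python 3,
    # where a range object cannot be shuffled in place
    alist = list(range(labels.shape[0]))
    shuffle(alist)

    # one SGD update per sample, in shuffled order
    for i in alist:
        label = labels[i, :]
        img = imgs[i, :]
        nn.train(img, label, lr)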
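
An alternative way to get a fresh order for every epoch is `np.random.permutation`. The sketch below is only an illustration: `CountingNet` is a hypothetical stand-in that records which labels it was shown, and `train_epoch` is not part of the framework.

```python
import numpy as np


class CountingNet(object):
    """Hypothetical stand-in for the network; it only records the visit order."""

    def __init__(self):
        self.seen = []

    def train(self, img, label, lr):
        self.seen.append(int(np.argmax(label)))


def train_epoch(nn, labels, imgs, lr, rng):
    # draw a fresh random visiting order for this epoch
    order = rng.permutation(labels.shape[0])
    for i in order:
        nn.train(imgs[i, :], labels[i, :], lr)


rng = np.random.RandomState(42)
labels = np.eye(5)            # five one-hot labels, classes 0..4
imgs = np.zeros((5, 784))     # placeholder images
net = CountingNet()
for epoch in range(3):
    train_epoch(net, labels, imgs, lr=0.01, rng=rng)
```

Every sample is still visited exactly once per epoch; only the order changes.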

Run it

prerequisites

* Python 2.7+
* NumPy

run it

# 1. get code
git clone https://github.com/beekbin/SimpleNN.git
cd SimpleNN

# 2. download the mnist data
cd data
sh get.sh
cd ..

# 3. run it
python main.py

TODO

1. Add more kinds of layers

Only fully connected layers are supported in the current implementation.

Convolutional + Pooling Layers

This is implemented in another project.

Embedding Layer

This is important for NLP problems.

Recurrent Layers

Such as vanilla RNN, LSTM, and GRU.

2. Improve generalization

Dropout
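
Dropout is not implemented in the framework yet. As a possible starting point, the sketch below shows the widely used "inverted dropout" formulation in NumPy; the function names and the `keep_prob` parameter are hypothetical and not part of the current code.

```python
import numpy as np


def dropout_forward(x, keep_prob, rng, training=True):
    """Inverted dropout: returns (output, mask)."""
    if not training:
        # at test time the layer is the identity
        return x, np.ones_like(x)
    # keep each unit with probability keep_prob and rescale the survivors
    mask = (rng.rand(*x.shape) < keep_prob) / keep_prob
    return x * mask, mask


def dropout_backward(grad_out, mask):
    # gradients flow only through the units that were kept
    return grad_out * mask


rng = np.random.RandomState(0)
x = np.ones(1000)
out, mask = dropout_forward(x, keep_prob=0.8, rng=rng)
grad = dropout_backward(np.ones(1000), mask)
```

Because the kept activations are scaled by `1 / keep_prob` during training, no extra scaling is needed at test time.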

3. Accelerate training process

Adaptive learning rate schedulers

There is a wonderful review of the popular learning rate schedulers; it would be nice to implement some of them.
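
As one example of what such a method could look like, the sketch below implements AdaGrad, a well-known adaptive scheme that gives each parameter its own shrinking step size. The class `AdaGradSketch` is hypothetical and not part of the framework.

```python
import numpy as np


class AdaGradSketch(object):
    """Hypothetical per-parameter adaptive step size (AdaGrad)."""

    def __init__(self, lr=0.1, eps=1e-8):
        self.lr = lr
        self.eps = eps
        self.cache = None

    def step(self, W, grad):
        if self.cache is None:
            self.cache = np.zeros_like(W)
        # accumulate squared gradients; parameters with large past
        # gradients receive smaller steps
        self.cache += grad ** 2
        return W - self.lr * grad / (np.sqrt(self.cache) + self.eps)


# minimize f(w) = sum(w ** 2), whose gradient is 2 * w
opt = AdaGradSketch(lr=0.5)
w = np.array([3.0, -4.0])
for _ in range(200):
    w = opt.step(w, 2.0 * w)
```

In the framework, an object like this would replace the fixed `lr` that is currently passed to `nn.train`.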

BatchNorm

4. More flexible layers: allow multiple inputs

In the current implementation, a layer can have only one input layer. However, many modern deep learning networks require multiple inputs, such as ResNet/HighwayNet/DenseCNN. In one of my projects, we found that densely connected LSTM layers are also powerful.
