1h 53m 16s logged

Devlog 2 — Training on MNIST
Today I finally got NeuroLab training on real data. I loaded 10,000 handwritten digit images from the MNIST dataset and split them into 8,000 for training and 2,000 for testing.
I wrote a load_mnist function that reads the CSV file, splits the labels from the pixels, and normalizes the pixel values by dividing by 255.0 so they are between 0 and 1. I also wrote a one_hot function that converts digit labels like 5 into arrays like [0,0,0,0,0,1,0,0,0,0] so the network can compare its 10 outputs to the correct answer.
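Here is a minimal sketch of what those two helpers might look like. It is not necessarily NeuroLab's actual code: the CSV layout (label in the first column, pixel values in the remaining columns), the skip_header parameter, and the split_train_test helper are all assumptions made for illustration.

```python
import numpy as np

def load_mnist(path, skip_header=0):
    """Load digits from a CSV file.

    Assumed layout: column 0 is the digit label, the remaining columns
    are pixel values in the range 0..255. Set skip_header=1 if the file
    starts with a header row.
    """
    data = np.atleast_2d(np.loadtxt(path, delimiter=",", skiprows=skip_header))
    labels = data[:, 0].astype(int)   # split the labels from the pixels
    pixels = data[:, 1:] / 255.0      # normalize pixels to the range 0..1
    return pixels, labels

def one_hot(labels, num_classes=10):
    """Turn digit labels like 5 into rows like [0,0,0,0,0,1,0,0,0,0]."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((len(labels), num_classes))
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded

def split_train_test(pixels, labels, train_size=8000):
    """Hypothetical helper: first train_size rows train, the rest test."""
    train = (pixels[:train_size], labels[:train_size])
    test = (pixels[train_size:], labels[train_size:])
    return train, test
```

With a 10,000-row file, split_train_test(pixels, labels) would give the 8,000 / 2,000 split described above. Slicing in file order keeps the sketch simple; shuffling the rows first is a common alternative.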
The first training run was a disaster: the loss started at over 1000, so the network's outputs were nowhere near the targets. The weight initialization turned out to be the problem. I switched to He initialization, which scales the random weights by np.sqrt(2.0 / input_size), and the starting loss dropped all the way to 0.14.
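As a rough sketch of that fix, assuming the weights are drawn from a standard normal distribution and stored as an (input_size, output_size) matrix; the he_init name and the rng argument are illustrative, not taken from NeuroLab:

```python
import numpy as np

def he_init(input_size, output_size, rng=None):
    """He initialization: standard-normal draws scaled by sqrt(2 / input_size).

    The scaling keeps the spread of each layer's pre-activations roughly
    constant no matter how many inputs the layer has.
    """
    rng = np.random.default_rng() if rng is None else rng
    weights = rng.standard_normal((input_size, output_size))
    return weights * np.sqrt(2.0 / input_size)

# Example: a first layer taking 784 pixels into 128 hidden units.
W1 = he_init(784, 128, rng=np.random.default_rng(0))
```

Without the scaling, each hidden unit sums 784 weighted inputs with unit-variance weights, so its pre-activation can be very large from the first step, which is one plausible reason for a huge starting loss.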
After 100 training cycles, the loss got down to 0.03!