Devlog by @chezburger

@chezburger on CS50P Final Project: NeuroLab · about 2 months ago

3h 59m 33s logged

Today I started building NeuroLab, a neural network completely from scratch in Python using only NumPy. No TensorFlow, no PyTorch, just pure math and matrices. This is also my CS50P final project and my main Stardance project for the whole summer!
What I built today: I started with the core building blocks. First was sigmoid, which squashes any number into the range between 0 and 1. Then ReLU, which returns 0 for negative inputs and the input itself for positive ones. I also wrote relu_derivative for backpropagation, init_weights, which fills the weight matrices with random NumPy arrays, and mse_loss, which measures how wrong the network is by averaging the squared differences between predictions and targets.
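To make those building blocks concrete, here is a minimal sketch of what they could look like. This is not the actual NeuroLab code: the function signatures, the `rng` parameter, and the 0.1 weight scale are assumptions made for illustration.

```python
import numpy as np

def sigmoid(x):
    # Squash any real value into the open interval (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    # 0 for negative inputs, the input itself for positive ones.
    return np.maximum(0, x)

def relu_derivative(x):
    # (x > 0) is a boolean array; .astype(float) turns it into 0.0 / 1.0,
    # so the function works on whole arrays, not just single numbers.
    return (x > 0).astype(float)

def init_weights(n_in, n_out, rng):
    # Small random weights; the 0.1 scale is an assumed choice.
    return rng.standard_normal((n_in, n_out)) * 0.1

def mse_loss(y_pred, y_true):
    # Average of the squared differences between prediction and target.
    return np.mean((y_pred - y_true) ** 2)
```

For example, `mse_loss(np.array([1.0, 2.0]), np.array([1.0, 4.0]))` averages the squared errors 0 and 4, giving 2.0.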
Then I built the actual NeuralNetwork class with init to set up weights and biases for each layer, forward_pass to run data through the network, forward_cache, which does the same thing but saves intermediate values so backprop can use them, and backpropagation, which adjusts the weights based on how wrong the prediction was.
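A rough sketch of how a class with those four methods might fit together is below. It is not the real NeuroLab implementation: the ReLU hidden layers, the linear output layer, the `layer_sizes`, `seed`, and `lr` parameters, and the 0.5 weight scale are all assumptions. The update subtracts the gradient (descent, not ascent) and divides by the number of samples `m`.

```python
import numpy as np

class NeuralNetwork:
    def __init__(self, layer_sizes, seed=0):
        # One weight matrix and one bias row per pair of adjacent layers.
        rng = np.random.default_rng(seed)
        pairs = list(zip(layer_sizes[:-1], layer_sizes[1:]))
        self.weights = [rng.standard_normal((n_in, n_out)) * 0.5
                        for n_in, n_out in pairs]
        self.biases = [np.zeros((1, n_out)) for _, n_out in pairs]

    def forward_cache(self, X):
        # Same computation as forward_pass, but keeps every layer's
        # pre-activation (z) and activation (a) for backpropagation.
        activations, pre_activations = [X], []
        a = X
        last = len(self.weights) - 1
        for i, (W, b) in enumerate(zip(self.weights, self.biases)):
            z = a @ W + b
            # Assumed design: ReLU on hidden layers, linear output layer.
            a = z if i == last else np.maximum(0, z)
            pre_activations.append(z)
            activations.append(a)
        return activations, pre_activations

    def forward_pass(self, X):
        return self.forward_cache(X)[0][-1]

    def backpropagation(self, X, y, lr=0.05):
        m = X.shape[0]  # number of samples in the batch
        activations, pre_activations = self.forward_cache(X)
        # Squared-error gradient at the output: prediction minus target.
        delta = 2.0 * (activations[-1] - y)
        for i in reversed(range(len(self.weights))):
            grad_W = activations[i].T @ delta / m
            grad_b = delta.sum(axis=0, keepdims=True) / m
            if i > 0:
                # Push the error back through the ReLU of the layer below,
                # using the weights as they were before this update.
                mask = (pre_activations[i - 1] > 0).astype(float)
                delta = (delta @ self.weights[i].T) * mask
            # Subtracting the gradient moves downhill on the loss.
            self.weights[i] -= lr * grad_W
            self.biases[i] -= lr * grad_b
```

Calling `backpropagation` repeatedly on a small dataset should make the mean squared error shrink over time; if it grows instead, the sign of `delta` or of the update is the first thing to check.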
Problems I ran into: relu_derivative was crashing on arrays, so I fixed it by calling .astype on the NumPy boolean array that the comparison produces. My forward_cache loop was also zipping the wrong lists. The worst bug was the gradient direction being completely flipped: the network was doing gradient ascent instead of descent, so the loss kept getting bigger instead of smaller. Fixed it by switching the subtraction order in the loss function. Gradients were also exploding with large batches, so I divided all gradients by m, the number of samples in the batch. Here is some of the code I wrote (it might be hard to see):