Softmax Activation and Digits
I modified the API so that cost and activation functions can easily be changed.
After switching to softmax activation at the output layer and changing the cost function to categorical cross-entropy, the model classified the digits with 85.59% accuracy after 10,000 epochs. I’m surprised it works this well despite the small network and no regularization or tuning at all.
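For reference, here is a minimal sketch of the two pieces involved, written in plain Python rather than taken from the network's actual code. The function names and the example scores are made up for illustration.

```python
import math

def softmax(logits):
    """Turn a list of raw output scores into probabilities that sum to 1."""
    # Subtracting the max leaves the result unchanged but avoids overflow in exp.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_cross_entropy(probs, target_index, eps=1e-12):
    """Loss for a single sample whose true class is target_index."""
    # eps guards against log(0) if a probability underflows to zero.
    return -math.log(probs[target_index] + eps)

# Hypothetical output-layer scores for a 3-class problem.
probs = softmax([2.0, 1.0, 0.1])
loss = categorical_cross_entropy(probs, target_index=0)
```

The loss only looks at the probability assigned to the correct class, so it shrinks toward zero as that probability approaches 1.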
While interesting, the math is getting really complicated now. Differentiating the softmax function in particular is still a mystery to me, and I had to look up its derivative online.
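For the record, this is the standard result as I understand it, with a short derivation via the quotient rule. The symbols are my own notation: z is the vector of output-layer inputs, s is the softmax output, and δ is the Kronecker delta.

```latex
s_i = \frac{e^{z_i}}{\sum_k e^{z_k}}

% Case i = j: the numerator also depends on z_j
\frac{\partial s_i}{\partial z_i}
  = \frac{e^{z_i}\sum_k e^{z_k} - e^{z_i} e^{z_i}}{\left(\sum_k e^{z_k}\right)^2}
  = s_i (1 - s_i)

% Case i \neq j: only the denominator depends on z_j
\frac{\partial s_i}{\partial z_j}
  = \frac{-\,e^{z_i} e^{z_j}}{\left(\sum_k e^{z_k}\right)^2}
  = -\,s_i s_j

% Both cases combined
\frac{\partial s_i}{\partial z_j} = s_i \left(\delta_{ij} - s_j\right)
```

Unlike sigmoid, every output depends on every input, so the derivative is a full matrix (a Jacobian) rather than one number per neuron.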
I want to spend some more time understanding how the cost and the activation function “influence” each other before implementing an API to more easily train the network.
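One concrete case of that interplay is well known: when softmax feeds into categorical cross-entropy, the gradient of the loss with respect to the output-layer inputs collapses to "predicted probabilities minus one-hot target". The sketch below checks this numerically with finite differences. It is a standalone illustration with made-up names and values, not code from the network.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability; the result is unchanged.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def loss(logits, one_hot):
    # Categorical cross-entropy applied to the softmax output.
    probs = softmax(logits)
    return -sum(y * math.log(p) for y, p in zip(one_hot, probs))

def numerical_gradient(logits, one_hot, h=1e-6):
    # Central differences: nudge each input up and down and compare the loss.
    grad = []
    for j in range(len(logits)):
        up = list(logits)
        up[j] += h
        down = list(logits)
        down[j] -= h
        grad.append((loss(up, one_hot) - loss(down, one_hot)) / (2 * h))
    return grad

# Hypothetical scores and target for a 3-class problem.
logits = [2.0, 1.0, 0.1]
one_hot = [1.0, 0.0, 0.0]

# Claimed closed form: gradient = softmax(z) - y.
analytic = [p - y for p, y in zip(softmax(logits), one_hot)]
numeric = numerical_gradient(logits, one_hot)
max_diff = max(abs(a - n) for a, n in zip(analytic, numeric))
```

If the closed form holds, `max_diff` should be tiny, which would mean the messy softmax Jacobian never has to be built explicitly for this particular pairing.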