AI Boostcamp Day 6
What I studied today:
History of Deep Learning
- AlexNet (2012): the first convincing demonstration of deep neural networks; it won the ImageNet (ILSVRC 2012) image classification challenge, outperforming conventional statistical machine learning methods for the first time.
- DQN (2013): the advent of deep reinforcement learning (DeepMind's Deep Q-Network), the line of work that later led to AlphaGo.
- Encoder / Decoder (2014): the sequence-to-sequence Encoder-Decoder structure showed that a variable-length input sequence can be mapped to a variable-length output sequence, e.g. for machine translation.
- Adam Optimizer (2014): the rule-of-thumb optimizer for deep learning; gives decent performance most of the time without heavy tuning.
- GAN (2014): generative adversarial training of a generator network against a discriminator; neural networks can now generate images.
- Residual Networks (2015): skip connections made much deeper neural networks trainable; a partial solution to the vanishing gradient problem.
- Transformer (2017): "Attention Is All You Need"; replaced recurrence with the self-attention mechanism and became the dominant sequence-modeling architecture.
- BERT (2018): a large Transformer pre-trained with masked language modeling on unlabeled text, then fine-tuned for downstream NLP tasks.
- Self-Supervised Learning: using unlabeled data in the training process by deriving the supervision signal from the data itself.
Multi-layer Perceptron
- Loss Functions:
- Regression Task > MSE
- Why MSE?: minimizing MSE is equivalent to MLE under a linear model with Gaussian noise (see the derivation sketch after this list).
- Classification Task > CE
- Equation: $\text{CE}(y, \hat{y}) = -\sum_{d=1}^{D} y_d \log \hat{y}_d$
- Explanation: $y_d$ is an element of a one-hot encoded label vector and $\hat{y}_d$ is the predicted probability for class $d$. The higher the uncertainty (i.e. the lower the predicted probability), the larger the absolute value of $\log \hat{y}_d$. Simply put, the loss falls if the model can predict the correct label with a higher probability (see the code sketch after this list).
- Probabilistic Task > MLE
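A short derivation sketch of the "Why MSE?" point above, assuming a regression model $f_\theta(x)$ whose targets are corrupted by i.i.d. Gaussian noise with fixed variance $\sigma^2$ (the symbols $f_\theta$ and $\sigma$ are my own notation, not from the lecture):

```latex
% Gaussian likelihood of one target given the model's prediction
p(y_i \mid x_i, \theta) = \frac{1}{\sqrt{2\pi\sigma^2}}
    \exp\!\left(-\frac{(y_i - f_\theta(x_i))^2}{2\sigma^2}\right)

% Log-likelihood over N i.i.d. samples (terms without \theta collapse into a constant)
\log \prod_{i=1}^{N} p(y_i \mid x_i, \theta)
    = -\frac{1}{2\sigma^2} \sum_{i=1}^{N} \big(y_i - f_\theta(x_i)\big)^2 + \mathrm{const}

% Maximizing the log-likelihood in \theta is therefore the same as minimizing MSE
\hat{\theta}_{\mathrm{MLE}}
    = \arg\max_\theta \log p(y \mid x, \theta)
    = \arg\min_\theta \frac{1}{N} \sum_{i=1}^{N} \big(y_i - f_\theta(x_i)\big)^2
```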
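A minimal NumPy sketch of the two losses above; the function names and example numbers are my own, not from the lecture. It illustrates the point in the Explanation: the cross-entropy loss falls as the predicted probability of the correct (one-hot) label rises.

```python
import numpy as np

def mse(y_true, y_pred):
    # Mean squared error for a regression task
    return np.mean((y_true - y_pred) ** 2)

def cross_entropy(y_onehot, y_prob, eps=1e-12):
    # CE = -sum_d y_d * log(y_hat_d), averaged over the batch;
    # y_onehot: one-hot label matrix, y_prob: predicted class probabilities
    return -np.mean(np.sum(y_onehot * np.log(y_prob + eps), axis=1))

# Regression: predictions close to the targets -> small MSE
print(mse(np.array([1.0, 2.0, 3.0]), np.array([1.1, 1.9, 3.2])))  # ~0.02

# Classification: the correct class is index 0
y = np.array([[1.0, 0.0, 0.0]])
confident = np.array([[0.9, 0.05, 0.05]])  # high probability on the correct label
uncertain = np.array([[0.4, 0.30, 0.30]])  # low probability on the correct label
print(cross_entropy(y, confident))  # ~0.105 (lower loss)
print(cross_entropy(y, uncertain))  # ~0.916 (higher loss)
```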
Reference:
- Britz, Denny. Deep Learning’s Most Important Ideas - A Brief Historical Review. 2020-07-29