Hyperparameters 1 | Machine Learning Guide Podcast

MLG 027 Hyperparameters 1

Jan 27, 2018

Hyperparameters part 1: network architecture

Show Notes

Hypers future & meta-learning
- We're always removing hypers: DL already removed manual feature engineering
Model selection
- Unsupervised? K-means Clustering => DL
- Linear? Linear regression, logistic regression
- Simple? Naive Bayes, Decision Tree (Random Forest, Gradient Boosting)
- Little data? Boosting
- Lots of data, complex situation? Deep learning
Network
- Layer arch
  - Vision? CNN
  - Time? LSTM
  - Other? MLP
  - Trading LSTM => CNN decision
- Layer size design (funnel, etc)
  - Face pics
  - From BTC episode
  - Don't know? Layers=1, Neurons=mean(inputs, output) https://stats.stackexchange.com/a/1097/107199
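The layer-size rules of thumb above can be sketched in a few lines. This is a minimal sketch, not a recommendation from the episode beyond the heuristic itself: the function names `default_hidden_layout` and `funnel_layout` are hypothetical, the rounding choices are assumptions, and the linear taper is just one possible reading of "funnel".

```python
from typing import List


def default_hidden_layout(n_inputs: int, n_outputs: int) -> List[int]:
    """'Don't know?' heuristic: one hidden layer, width = mean(inputs, outputs)."""
    # Integer mean rounded up (an assumption), never smaller than one neuron.
    return [max(1, (n_inputs + n_outputs + 1) // 2)]


def funnel_layout(n_inputs: int, n_outputs: int, n_hidden: int) -> List[int]:
    """Funnel: hidden widths shrink linearly from the input size toward the output size."""
    # Assumed interpretation of "funnel": equal-sized steps between input and output width.
    step = (n_outputs - n_inputs) / (n_hidden + 1)
    return [max(1, round(n_inputs + step * i)) for i in range(1, n_hidden + 1)]


# Example: 784 input features, 10 output classes.
print(default_hidden_layout(784, 10))
# Example: two hidden layers tapering from 100 inputs to 10 outputs.
print(funnel_layout(100, 10, 2))
```

Either result is only a starting point; the widths are still hypers to tune against validation performance.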
Activations / nonlinearity https://towardsdatascience.com/activation-functions-neural-networks-1cbd9f8d91d6
- Output
  - Sigmoid = predicts a probability (binary output)
  - Softmax = multi-class
  - Nothing = regression
- ReLU family (Leaky ReLU, ELU, SELU, ...) = mitigates vanishing gradient (gradient is constant for positive inputs), cheap to compute, usually the better choice
- Tanh = classification between two classes; output is zero-centered (mean 0), which is the important property
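To make the activation notes concrete, here is a minimal sketch in plain Python (standard library only). The function names are illustrative, and the 10-layer comparison is a simplified assumption: it multiplies the best-case per-layer activation gradient and ignores weights entirely.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    """Squash to (0, 1); numerically stable for large negative x."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def softmax(logits: List[float]) -> List[float]:
    """Multi-class probabilities that sum to 1 (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def relu(x: float) -> float:
    return max(0.0, x)


def leaky_relu(x: float, alpha: float = 0.01) -> float:
    # A small slope for negative inputs keeps some gradient flowing.
    return x if x > 0 else alpha * x


def sigmoid_grad(x: float) -> float:
    s = sigmoid(x)
    return s * (1.0 - s)  # peaks at 0.25 when x == 0


def relu_grad(x: float) -> float:
    return 1.0 if x > 0 else 0.0  # constant for positive inputs


# Best-case activation gradient chained through 10 layers (weights ignored).
print("sigmoid:", sigmoid_grad(0.0) ** 10)
print("relu:   ", relu_grad(1.0) ** 10)
# Tanh is zero-centered: symmetric inputs give symmetric outputs.
print("tanh:   ", math.tanh(-1.0), math.tanh(1.0))
```

The sigmoid product shrinks toward zero as depth grows while the ReLU product stays at 1 for active units, which is the vanishing-gradient point in the list above.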
