MLG 019 Natural Language Processing 2
Jul 10, 2017

Classical natural language processing (NLP) progressed from rule-based linguistic approaches to shallow machine learning and, eventually, to deep learning as the state of the art. Despite the prevalence of deep learning in modern NLP, understanding traditional methods like naive Bayes and hidden Markov models offers foundational insight and historical context, and is especially useful when dealing with smaller data sets or limited compute resources.


Resources
Stanford CS224N: NLP with Deep Learning
Speech and Language Processing (3rd Ed. Draft) by Jurafsky & Martin
Hugging Face NLP Course


Show Notes

Classical NLP Techniques:

  • Origins and Phases in NLP History: Initially reliant on hardcoded linguistic rules, NLP's evolution significantly pivoted with the introduction of machine learning, particularly shallow learning algorithms, leading eventually to deep learning, which is the current standard.

  • Importance of Classical Methods: Knowing traditional methods is still valuable, providing a historical context and foundation for understanding NLP tasks. Traditional methods can be advantageous with small datasets or limited compute power.

  • Edit Distance and Stemming:

    • Levenshtein Distance: Used for spelling corrections by measuring the minimal edits needed to transform one string into another.
    • Stemming: Simplifying a word to its base form. The Porter Stemmer is a common algorithm used.
  • Language Models:

    • Estimate how legitimate (probable) a sentence is by calculating the joint probability of its word sequence.
    • Higher-order n-grams increase a language model's accuracy at the expense of compute and training data.
  • Naive Bayes for Classification:

    • Ideal for tasks like spam detection, document classification, and sentiment analysis.
    • Relies on a 'bag of words' model, simplifying documents down to word frequency counts and disregarding sequence dependence.
  • Part of Speech Tagging and Named Entity Recognition:

    • Methods: Maximum entropy models, hidden Markov models.
    • Challenges: Feature engineering for parts of speech, complexity in named entity recognition.
  • Generative vs. Discriminative Models:

    • Generative Models: Estimate the joint probability distribution; useful with less data.
    • Discriminative Models: Focus on decision boundaries between classes.
  • Topic Modeling with LDA:

    • Latent Dirichlet Allocation (LDA) helps identify topics within large sets of documents by clustering words into topics, allowing for mixed membership of topics across documents.
  • Search and Similarity Measures:

    • Use TF-IDF to transform documents into vectors that weight terms highly when they are frequent in a document but rare across the corpus.
    • Employ cosine similarity for measuring semantic similarity between document vectors.

Transcript
[00:01:05] This is episode 19, Natural Language Processing part two. In this episode, we're gonna jump into the classical natural language processing algorithms that are used for the various tasks discussed in the previous episode. Remember that natural language processing has something of a three-phase history. [00:01:27] The first part being hardcoded rules that come from the world of linguistics. Remember, another word for NLP is computational linguistics. So NLP originated with a bunch of hardcoded rules, dictionaries, mappings, and the like. And eventually machine learning was introduced to the field, which revolutionized so much of the work that was being done. [00:01:47] These were shallow learning algorithms. These are the classical, traditional approaches to natural language processing. But the state of the art in NLP is deep learning, just far and away. It is widely agreed that the state of the art in NLP is deep learning. So that's sort of phase three of the NLP history, and that's what we're gonna cover later with recurrent neural networks, [00:02:07] RNNs being sort of the god algorithm of deep learning NLP. But it is still very useful, I think, to understand shallow learning, traditional, classical approaches to natural language processing, if nothing more than as a historical context or a foundation of the types of tasks and problems you're dealing with. [00:02:26] This is a concept I keep repeating over and over: it's useful to understand the shallow learning approaches, even though very likely what you're gonna be doing in the field, whatever it is you're gonna be doing in machine learning, is probably gonna involve deep learning today. [00:02:41] It's very useful to understand the traditional approaches. I'm gonna start a mantra: know thy shallows. Know thy shallows. And in NLP it tends to be especially important to know thy shallows, as with the rest of machine learning. One reason that you might need shallow algorithms is that you have a smaller training data set, so you have less data to work with. [00:03:03] Deep learning is data hungry, very, very data hungry. Or you have less compute power. So for example, something like naive Bayes requires substantially less memory and computational resources compared to deep learning. But if there's no other reason to know the classical approach to NLP than this: you will be asked questions involving traditional NLP in interviews. [00:03:27] Mark my words, even if the position you're applying for involves recurrent neural networks on TensorFlow, the interviewer is likely to be a senior machine learning engineer who's been in the space for quite a long time, and he wants to see your understanding of the field at large. He's gonna ask you things about topic modeling, sentiment analysis, classification, and he is gonna want to hear maybe two sides of the coin: the traditional, faster approach [00:03:51] and the state-of-the-art, more powerful approach. So if you get into the space of NLP, very likely you're going to be using deep learning algorithms as state-of-the-art solutions to NLP problems. But still, it behooves you to know the algorithms I'm gonna be talking about in this episode. So let's dive right in. [00:04:08] I'm gonna start from the bottom and I'm gonna work my way up.
I'm going to try to stagger small bits, like tokenization, with bigger bits, like sentiment analysis, and you'll see why by the end of this episode, because certain problems are actually very easy to solve, such as classification, and certain other problems are very difficult to solve and require multiple tools in tandem. [00:04:31] But let's start at the bottom. Let's start with edit distance. Remember that you could use the edit distance from one word to another word to help you with things like spelling corrections, spell check. So the words car and cat are one letter of edit distance apart. The difference between those words is one letter. [00:04:51] So you can work with letter additions, letter subtractions, letter substitutions, et cetera. And a common algorithm here is called the Levenshtein distance, Levenshtein edit distance. And, like I said, you would use this for things like spelling correction. Next, we have word morphology: stemming and lemmatization. [00:05:10] Remember that when you're maybe comparing one document to another, or if you're doing a search query, you might have a word that's present in both documents, but in different forms. So organize, organization, organizer: these all stem from the root word organize. A lemma is the pure root word, so organize, the word itself, [00:05:31] whereas a stem is sort of the least-common-denominator chunk of letters between all those words. So really, when you're doing stemming, all you're doing is chopping off the end. So organizing, organizer, organization: those all boil down to organiz, without the E at the end, O-R-G-A-N-I-Z. The reason being, up until that point is the least common denominator of all those words, so you just chop off the end. [00:05:58] Stemming is quicker and easier than lemmatization, and so stemming tends to be used a little bit more commonly in the workplace. A common stemming algorithm is called the Porter stemmer. Now, a lot of these things you're just going to be getting for free with whatever language toolkit you use, such as the Natural Language Toolkit in Python, NLTK. [00:06:18] So you don't need to know necessarily how these algorithms work under the hood. Okay? Remember we talked about tokens in the last episode. You have a document. A document is like a webpage or an email, any collection of words. Each word is called a token. A token can be some other things as well, like punctuation, smiley faces, and the sort. [00:06:37] And the act of tokenization is the act of cutting this document up into all the words, into a list of strings in Python. You can imagine it probably works with regular expressions, maybe splitting on non-word boundaries. I'm sure it's a little bit more sophisticated than that under the hood, but this is another one of those things where you'll just use what you're given by NLTK. [00:06:58] There's some sort of tokenize method that you can just call. So once you've split your document into tokens, we're gonna look at these tokens, and now we're gonna actually start doing things with them. We're gonna call these tokens something else. We're gonna call them grams. Just traditionally, in language models and NLP, we work with things called n-grams, n being the number of grams. [00:07:23] So a token is a gram, for the most part. If you're splitting your document into one-grams, single grams, those are called unigrams, and then between splitting your document into unigrams and splitting it into tokens, there's pretty much no difference. Word, word, word, word, word. Hello, how are you doing? Five tokens, five unigrams.
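All of those small pieces come essentially for free from NLTK. A minimal sketch, assuming the library is installed and its punkt tokenizer data has been downloaded:

    import nltk
    from nltk.stem import PorterStemmer

    # nltk.download('punkt')  # one-time download of the tokenizer models

    # Levenshtein edit distance: "car" -> "cat" is one substitution away
    print(nltk.edit_distance("car", "cat"))  # 1

    # Stemming: chop related word forms down to a shared stem
    stemmer = PorterStemmer()
    print([stemmer.stem(w) for w in ["organize", "organizer", "organization"]])
    # all three typically reduce to the same stem (the Porter stemmer is fairly aggressive)

    # Tokenization: cut a document into a list of token strings
    print(nltk.word_tokenize("Hello, how are you doing?"))
    # ['Hello', ',', 'how', 'are', 'you', 'doing', '?']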
You can also split your document into bigrams. [00:07:47] That is, two-grams, n being two in n-gram: bigrams. So: hello how, how are, are you, you doing. That's four bigrams. Now notice there's overlap. It's like a Venn diagram. Each bigram overlaps with the next bigram. Now, why would you do this? Why would you split into bigrams or even trigrams? And n can be any number. [00:08:13] Well, as you'll see in this next part on language modeling, the number of grams can increase the accuracy of your language models at the expense of compute power. We'll talk about that in a bit. Let's talk about language models now. This is the first fundamental, core aspect of NLP from a machine learning perspective, [00:08:33] really from a statistical perspective. A language model is a machine learning model for NLP tasks, but it's actually more specific than that. It means something very specific. A language model means working with the probability of documents. We're working with the probability of sentences. You're basically asking how legit a sentence is. [00:08:54] So a language model will tell you that "I took my dog to the park" has a higher probability, is more legitimate a sentence, than "I took my dog to the egg." Certain sentences are more probable than other sentences. Now, how do we build up these language models? Well, various models in NLP are considered language models. [00:09:15] So language models is a category of machine learning models within NLP. The most raw form of a language model, the most vanilla language model, is basically just joint probability, just statistical joint probability. And remember, joint probability is multiplying probabilities together. That's all joint probability is. [00:09:37] So probability one times probability two times probability three, and so on. So how will we use n-grams with joint probability to produce a language model? We will take the words left to right, one by one, and look at the probability of one word following the other word. So when I say "hello, how are you doing," [00:10:00] I'll say "hello how." We look at "how" following "hello." How probable is it that "how" follows "hello"? Well, we look at every time we see "hello" in our millions of documents in the database and all of the words that follow "hello," and look at what proportion "hello how" is over all of those other things. So that's the probability of "how" following "hello." [00:10:27] Let's say it's 5%. So it's 5% probable. "Hello how" is legit, check mark. Now we move on to "how are." We do the same thing: of all the times we see "how" in our documents, how often is it followed by "are" compared to all the other things that follow "how"? That is some proportion of times seen as such; [00:10:53] it's a probability. Let's say that's 10%, and so on down through the sentence. We have these probabilities, and now we just multiply them all together. That's a joint probability, and that's a language model that tells you how probable that sentence is. Now, that was using grams one word at a time. We're checking the probability of word A followed by word [00:11:17] B. If we were to use bigrams, things would be a little bit different. We would say "hello how" followed by "how are," and the probabilities would seem a little bit more sane. So, for example, if we're looking at "are," any number of words can follow that. There are so many words in our corpus of documents that follow the word "are" that determining the probability that "you" comes after "are" is a little bit of a crapshoot.
[00:11:47] Sure, "you" appears after "are" some amount of the time, some small probability, but not to a degree that makes it helpful for us in determining how probable that situation is by comparison to other situations. If we were to use bigrams instead, we would be looking at "how are": what comes next? What is the probability that "are you" comes after [00:12:15] "how are"? Well, "how are," I mean, when I say that, you can just fill in the blank. I'm gonna say "you." How are things, how are your dogs? Who knows. But the space of things that can be said after "how are" is substantially narrowed down, and indeed "are you" is probably the most probable next utterance. So working with bigrams makes things more accurate. [00:12:38] Your language models become more accurate, but they also become computationally more expensive, and you have to store more data, which is one of the big problems. So, for example, if you're working with unigrams, how many grams do you need to store in some database? Well, it's just gonna be the English dictionary for the most part. [00:12:56] That's like 171,000 words. Well, if you're working with bigrams, you're gonna need to store and work with every possible two-word combo you've ever seen in your entire corpus, and that could be just enormous. So while working with bigrams for your language models increases accuracy, it decreases physical performance, like machine performance. [00:13:23] So it's always a trade-off in machine learning, and that tends to always be the trade-off: accuracy versus compute. And the higher the gram count, it tends to be, the more accurate your language model. I should say also that the higher the gram count in some n-gram model, the more training data you need to train on as well, because you're less likely to have seen any given common three-word sequence. [00:13:49] So data and compute versus accuracy. Okay. A language model. So joint probability is the vanilla, quintessential, basic language model, and a language model is basically telling you how legit a sentence is. And you do that by piping in any number of unigrams or bigrams or trigrams, however you wanna do it. [00:14:08] What would you use a language model for? Well, right outta the get-go, you can imagine that you would use this for something like auto-correction on mobile phones, as well as predictive text. So those have a lot in common, auto-correction and predictive text. So you're typing away, chatting with a friend, and you're building up a language model, a probability of what you've been saying up to this point. [00:14:33] Your iPhone could then basically look at any number of words and see what's the best word that might come next that will give you the highest probability over this language model. Sort those by probability, take three. So it'll suggest the three most probable words that come next, given the things you've said thus far, all multiplied together in a joint probability. [00:14:57] You can also pair it with edit distance, which I just talked about, in order to provide spelling corrections. So for example, if you're doing some Google search query and it sees that you've spelled one word wrong in the entire query, what it might do is catch that misspelling, because it doesn't belong in the dictionary of words it's ever seen before, [00:15:17] look at a bunch of other words with minimal edit distance, and of those, find the one which gives you the highest sentence probability in a language model.
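As a toy illustration of that joint-probability idea, here's a tiny bigram language model in plain Python. The three-sentence corpus is invented for illustration, and there's no smoothing, so any unseen bigram zeroes out the whole sentence; a real model would need far more data and a smoothing scheme. The same kind of score is what you would use to rank spelling-correction candidates found by edit distance.

    from collections import Counter

    corpus = [
        "hello how are you doing",
        "hello how are things",
        "how are your dogs doing",
    ]

    # Count how often each word and each two-word sequence appears
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        words = sentence.split()
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))

    def sentence_probability(sentence):
        """Joint probability: product of P(word | previous word) over the sentence."""
        words = sentence.split()
        prob = 1.0
        for prev, cur in zip(words, words[1:]):
            prob *= (bigrams[(prev, cur)] / unigrams[prev]) if unigrams[prev] else 0.0
        return prob

    print(sentence_probability("hello how are you doing"))  # non-zero, relatively likely
    print(sentence_probability("hello how are you egg"))    # 0.0, "you egg" was never seen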
[00:15:45] So you can use language models for spelling correction, auto-correction on phones, predictive text, et cetera. Well, those are the obvious things that come to mind, but language models really underpin most of the tasks in traditional NLP. If it's not obvious when we get into these algorithms, well, it's very likely deep down in there somewhere. So let's talk about a common language model, one you've already seen before, called naive Bayes. Naive Bayes' [00:16:08] primary use case in NLP is document classification. So now we're at our first goal of NLP: classification. And classification can come in many forms. It can be binary classification: in the case of spam detection from a prior episode, we used naive Bayes to detect if the current email we're looking at is spam or not. That's binary classification, yes or no. Multi-class classification: is this document about politics, about sports, about pets? So any number of classes, and we can still use naive Bayes for that purpose. [00:16:37] A type of multi-class classification of documents is called sentiment analysis. Sentiment analysis is determining the emotion that goes along with this document. Maybe it's a review for a product or a movie, or discussion of stocks, stuff like that. Sentiment analysis algorithms can be used to determine: is this customer happy? [00:17:00] Is this customer sad? Mad? And sentiment analysis has been used in industry for buying decisions, whether to buy stock in some product. Now, sentiment analysis is nothing more than multi-class classification, but with a little bit of extra nuance. When it comes to things like user reviews, users tend to use a little bit more nuanced language, especially with things like sarcasm. [00:17:26] So where naive Bayes is the premier traditional classification model, it works also for sentiment analysis, but you may want to consider using something a little bit more robust for sentiment analysis. But for now, let's talk about naive Bayes. Remember, naive Bayes is what's called a bag-of-words model, and I'll talk a little bit more about what a bag-of-words model is later. [00:17:51] I'll talk a little bit about it now. The idea is that you take your document, you tokenize it into all of its tokens, and those are now your grams. You can decide to go at the naive Bayes approach with bigrams or unigrams; traditionally, we use unigrams. And you just throw them in the model. It's like you're cutting up a piece of paper into a bunch of words, and then you're just dumping those words into a bag. [00:18:17] That's why it's called a bag of words. You're not doing anything sophisticated. You're not working your way through a model left to right, maintaining grammar, or even sequential probabilities. It's not a sequence model. No. You just take all the words and you just throw 'em all in a bag, and you're basically kind of looking for overlap. [00:18:36] Does this email have certain trigger words in that bag? Does it have "free"? Does it have "phone"? Does it have "act"? Does it have "now"? Yes? If it has all those things, it's very likely to be spam. It doesn't matter in what order they appeared. It doesn't matter what grammatical path we paved through the sentence. All that matters is that those words were present. [00:18:56] So that's what a bag of words is. Now, remember, the meaning of the word naive in naive Bayes is exactly that: we remove the dependence of words on each other.
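A minimal sketch of that bag-of-words naive Bayes classifier using scikit-learn. The four training emails and their spam labels are invented purely for illustration; note that the vectorizer throws away word order entirely, exactly as described.

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB

    # Tiny invented training set: 1 = spam, 0 = not spam
    emails = [
        "free phone act now limited offer",
        "act now to claim your free prize phone",
        "lunch with john added to the calendar",
        "notes from today's project meeting attached",
    ]
    labels = [1, 1, 0, 0]

    # Bag of words: each email becomes a sparse vector of word counts, order ignored
    vectorizer = CountVectorizer()
    X = vectorizer.fit_transform(emails)

    model = MultinomialNB()
    model.fit(X, labels)

    test = vectorizer.transform(["claim your free phone now"])
    print(model.predict(test))        # [1] -> flagged as spam
    print(model.predict_proba(test))  # class probabilities for [not spam, spam]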
So if we said "not pleasant" was in the review, the word "not" followed by "pleasant," unfortunately, naive Bayes would not pick that up. There is a dependence between words there, and the naive part of naive Bayes introduces what's called an independence assumption: [00:19:27] the assumption that no features depend on each other, which is never true of text. Every word depends on prior words in very strict grammatical ways. So naive Bayes certainly has many limitations. It's not a very sophisticated model, but it tends to be just fine for so many things. So it turns out for text classification, [00:19:50] whether binary or multi-class, it works just fine. And the Bayes part of it, what makes it Bayesian, is: remember that Bayesian statistics is sort of, you have the question asked a different way. You have the question asked the opposite way than you want it, and you need to flip it. So you use Bayes' rule to flip the question the way you want it. [00:20:13] So the question we're asking is: given a bunch of these words, given this document of tokens, what's the probability that this document is spam? Well, the way that works best in the computer is actually if we phrase it like this: given that the document is spam, what's the probability that we see this word and this word and this word and this word? [00:20:38] So the computer wants the question asked in reverse, and then we use Bayes' rule to flip it back to the way we want it. There are other classification algorithms that you can use, things that we've seen already, such as support vector machines and logistic regression. Now, not quite logistic regression. [00:20:57] Logistic regression is actually a binary classification algorithm. Logistic regression will tell you either zero or one, yes or no, spam or not spam, for example, but you wouldn't be able to use it for multi-class classification. Is this document about cats? Is it about baseball? Is it about politics? Is the person happy, sad, or angry? [00:21:18] But there is a form of logistic regression called multi-class logistic regression. So there's an augmented version of logistic regression for multi-class, but it turns out in NLP, for some reason, I don't know why yet, practitioners prefer a slight variation of multi-class logistic regression called the maximum entropy model, [00:21:43] MaxEnt, M-A-X-E-N-T. And as far as I know, you really don't need to understand the difference between multi-class logistic regression and maximum entropy models. From everybody I've talked to, they say it's pretty much the same thing. There's a little twist in the algorithm in there, but if you understand multi-class logistic regression, you get the concept. [00:22:05] So you can use maximum entropy models for document classification as well as naive Bayes. But naive Bayes, it seems to me, is the most popular algorithm for document classification. Okay, let's move up: information extraction from documents or sentences. We're trying to figure out the syntactic structure of sentences, parts of speech. [00:22:28] So for example, word, word, word: hello, how are you doing? We're trying to figure out each individual word's part-of-speech tag. That's gonna be noun, adverb, adjective, verb, those kinds of things. Another task is named entity recognition, which is very useful for digital assistants and chatbots. Take named entity recognition, for example.
[00:22:49] You would say something like, "Siri, add lunch with John to my calendar on May 15th," and a named entity recognition algorithm would pull out all of the chunks that are important so that we could call some method on a server. So the name is intent and the entity is add-to-calendar. The name is when and the entity is May 15th. [00:23:11] The name is whom and the entity is John. So this is named entity recognition: extracting bits of information that we need from the sentence or the document. So those are two common tasks, part-of-speech tagging and named entity recognition. Two totally separate tasks. The reason I clump them together here is that the algorithms that you'll use for these two tasks are the same algorithms. [00:23:35] So let's talk about part-of-speech tagging. Something you could use is the maximum entropy model, MaxEnt, a.k.a. multi-class logistic regression with a haircut. So that's a classification model. What would you do? Well, you would take your word, and maybe you would look at some word to the right and some word to the left, and maybe you'd look at the end of your word. [00:23:56] So for example, words that end in -ly tend to be adverbs. And you'd look at the beginning of your word, and you'd look at: does your word start with a capital letter, but it's not at the beginning of a sentence? Well, that's likely to be a proper noun. So you'll engineer various features of the word. [00:24:14] This is called feature engineering, and this is something that deep learning famously does away with, feature engineering, but in the case of traditional NLP, you need to perform feature engineering on your words if you're going to use what's called a discriminative model; we'll get to that in a bit. So you pick apart your word in various ways. [00:24:34] Now, you certainly will not use a naive Bayes classifier to classify the part-of-speech tag for your word, okay? If you're trying to determine whether the word mountain is a noun, verb, or adjective, you don't want to cut it up into all of its letters and throw them in a bag; that is not useful at all. It helps with document classification, [00:24:56] but letter sequence in a word is substantially more important than word sequence in a document. So you cannot use naive Bayes to classify the part-of-speech tag for a word, but you can still kind of cut the word up in various ways. So, like I said, maybe the first three letters of a word and the last four letters of a word, various snippets of words, [00:25:18] tend to indicate various morphology in the English language. And then you can throw all these features into your classic MaxEnt model and classify whether the thing is a verb, noun, or adjective. And of course, you would train this model on hundreds of thousands of words you've seen before, and maybe even their context in documents: [00:25:41] one word to the left, one word to the right. Certain parts of speech tend to follow and precede certain other parts of speech. So you can use MaxEnt to classify the part of speech of a word.
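Both of those tagging tasks come essentially for free with NLTK, whose built-in taggers are trained statistical models of roughly the kind described here (its classic named-entity chunker is MaxEnt-based; the data package is even named maxent_ne_chunker). A minimal sketch, assuming the listed data packages have been downloaded:

    import nltk

    # One-time downloads (package names may vary slightly across NLTK versions):
    # nltk.download('punkt'); nltk.download('averaged_perceptron_tagger')
    # nltk.download('maxent_ne_chunker'); nltk.download('words')

    tokens = nltk.word_tokenize("Add lunch with John to my calendar on May 15th")

    # Part-of-speech tagging: each token gets a tag like NNP (proper noun) or VB (verb)
    tagged = nltk.pos_tag(tokens)
    print(tagged)  # e.g. [('Add', 'VB'), ('lunch', 'NN'), ..., ('John', 'NNP'), ...]

    # Named entity recognition: chunks the tagged tokens into entities
    print(nltk.ne_chunk(tagged))  # 'John' should come out inside a PERSON chunk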
Another thing you can use, and this is a little bit of new territory (you pretty much know what MaxEnt is because you pretty much know what logistic regression is), [00:26:01] but this is a little bit new: it's called a hidden Markov model, and I'm gonna spend some time on this. This is an important algorithm. You know what a Markov chain is from a prior episode. Remember, a Markov chain, M-A-R-K-O-V, is a sequence of steps. It's a sequence model. It's a sequence of steps with probabilistic transitions between states. [00:26:24] So the example I used in that episode is that you might build a Markov chain for a restaurant. Customers arrive at a restaurant: that's state one. You might draw a little circle on a piece of paper with S1 inside of it. From there, customers are waiting for a table, so they transition into the waiting state with a hundred percent probability. [00:26:46] They go right to the waiting state. In the waiting state, S2, a circle around S2, they can either continue waiting if there aren't any tables available, so it can loop back on itself. Let's say that we've observed, over the course of this restaurant, that that's a probability of 30%: 30% keep waiting, and 70% transition to a new state, S3, with a circle around it. [00:27:10] So you draw a line with an arrow to a circle around S3, being seated at a table, and so on and so forth. So Markov chains are sequences of steps with probabilistic transitions to various next steps, including looping back on the current step. Now, you can already start to see this must be very applicable in language modeling. [00:27:30] It sounds very much like a language model already. You can imagine that words could be represented in a Markov chain. Any word could be followed by any other word in the dictionary with some probability, based on the number of times we've seen that word followed by word X in our corpus. So you can transition your way through a Markov chain to sort of generate a sentence. [00:27:53] It's very similar to the vanilla language model of joint distributions, but what we want to use it for now is finding part-of-speech tags. So we're looking at a sentence, "The boy climbed up the mountain," and we want to tag these things as determiner, noun, past-tense verb, and so on. Well, a Markov chain doesn't help us here. [00:28:17] It tells us the probability of transitioning our way through that sentence, as such making it a language model. We might be able to generate that sentence with a Markov chain, but it doesn't tell us the tags corresponding to each word. We want something else. We don't want what we're looking at; we want something else. [00:28:35] Enter the hidden Markov model, HMM. It is a Markov model with a twist. It is a Markov model where you work with not what you're looking at, but some byproduct of that, what's called a hidden state. You can imagine you have your circles with S1 and an arrow pointing to S2 with a solid-line circle, and right behind each state would be a dotted-line circle, in gray, S1, and then a dotted-line circle behind state two, [00:29:08] in gray, S2. Some hidden state is a byproduct of the real state, the observed state. So the example they use in academia is: if you wanted to determine the weather for the last 365 days, but you don't have access to the weather, you could look at what people are wearing throughout the year. Every day people may be wearing raincoats or shorts [00:29:34] or big jackets. Well, that would tell you whether it's raining or it's sunny or it's cold. So you would infer the weather based on what people are wearing. So you're basically building a hidden Markov chain behind your main Markov chain, a byproduct Markov chain. It's like you're going through the Markov chain and it's making little poops. [00:29:58] Those little poops are the hidden states. Each state makes little poops, and it's those little poops you want.
So what you might do is observe the Markov chain of attire on people throughout the last 365 days. And now what you've built for yourself is an actual Markov chain. You've built the probability that if somebody's wearing a raincoat, they will wear a raincoat the next day. [00:30:21] Okay? So because rainy days tend to happen in sequence, maybe you'll get three in a row, there will be some non-trivial probability looping back to itself on the wears-raincoat state, but it'll also have a probability of transitioning to the wears-shorts state, and then in that state it has some probability of looping back to itself, and so on. [00:30:44] So over the course of observing what people are wearing throughout the last 365 days, you've built a Markov chain, a chain of probabilistic transitions, and from some inside knowledge, tapping your nose, you know that that Markov chain corresponds to a secret Markov chain, a hidden Markov chain of the actual weather patterns. [00:31:03] And now you can use that to make predictions of weather. If you're in a rainy-day state, you can predict the probability that the next state will be sunny or rainy or cold. So that's what we do: we build a Markov chain of words, and we train it on words and their corresponding parts of speech for instances that we've seen before. [00:31:28] And then we run the hidden Markov model on something we want to classify the parts of speech in, the sentence, and it poops out little parts of speech for every word as it goes through the chain. So that's a very useful model, the hidden Markov model. You'll see it not only in NLP; you'll see it in various walks of machine learning. [00:31:45] It's a variation of the Markov chain. So the Markov chain has little spinoffs; this is the hidden Markov model. I've talked about the NPC bad guy, the dark lord, in that little showdown between the barbarian and the dark lord, where what the dark lord would use in order to decide what action to take against the good guy would be what's called a Markov decision process. [00:32:08] So it's sort of deciding what action to take based on what is most probably a good action to take. So he may observe the Markov chain, the vanilla Markov chain, representing the player's actions. If the player is on this platform, he's very likely to jump down and do this thing or the other, and then use that in his Markov decision process of deciding what actions to perform. [00:32:33] So a Markov decision process is a spinoff of a vanilla Markov chain, and then a hidden Markov model is another variation of a vanilla Markov chain. So there you have it. You'd use a hidden Markov model to generate parts of speech from the sentence, word by word. Another thing you could use hidden Markov models for is named entity recognition. [00:32:54] So you work your way through this Markov chain of words in a sentence, and out of it come little poops of named entities, like intent is add-to-calendar, person is John, when is May 15th, and so on. You can see that it's a very similar task to part-of-speech tagging. The big difference would be that in NER, you're not generating some output for every word. [00:33:19] You're only generating output for the important words, so certain words will be skipped. You can use a hidden Markov model for NER, and you can also use a maximum entropy model for NER as well. So for every word, you can collect some number of features about that word. What does the word end in? What does it begin with? [00:33:38] What word follows it?
What word comes before it? Does it start with a capital letter? That would be a high indicator that this is a person named entity or an organization named entity that you're looking for, and so on. And then you'd use that, train it on some training data, and then make predictions about which tag each word gets in some new document. [00:33:59] Now, I just used two algorithms side by side for the same type of task in NLP, namely MaxEnt, the maximum entropy model, and the hidden Markov model. And they seem very different. They seem like very different algorithms, at least to me personally. When I first started with machine learning, there was a handful of algorithms that I got. [00:34:19] I got what they did: linear regression, logistic regression, even support vector machines, which are very similar to logistic regression in some capacity, neural networks. And of course, MaxEnt is a spinoff of logistic regression in some way, so it's basically one of these things. But then I was introduced to these probabilistic models that I'm describing in this episode, especially things like naive Bayes, hidden Markov models, and that vanilla language model of joint probability. [00:34:47] It's like algorithms that have a P in them for probability, versus algorithms that take a bunch of features and it's some weird mathematical equation. Indeed, these are two separate classes of algorithms. One is called discriminative models, and the other is generative models. For the longest time, I thought generative models meant models that generate something, like a chatbot might generate sequences of words. [00:35:15] You can use generative models for that purpose, but that's not what generative in generative models means. But I think I've got a handle on the difference between these types of models. So discriminative models are your things that take in some features and spit out an output. They give you a classification or something like this: [00:35:34] linear regression, logistic regression, maximum entropy, neural networks, support vector machines. So the way these are visualized is that you have a bunch of X's over here and a bunch of O's over there, and you draw a line in the sand, you scratch a line down the middle. This is your decision boundary, the decision boundary of logistic regression. [00:35:56] Or in the case of linear regression, you're drawing a line through your data, a line that passes through sort of the center of your data points. In the case of support vector machines, it's pretty much like logistic regression with a fat line, so your decision boundary separating your classes of X's on the left and O's on the right is as fat as possible: [00:36:13] a large-margin classifier. And even neural networks are like drawing lines between things, but in a hierarchical way. So I think of neural networks as like Zorro. He cuts up a painting into various parts, and then he dives into one part and he cuts it up into another. So he is drawing a bunch of lines, discriminating things on the left from things on the right. [00:36:33] So discriminative models, and that makes sense. But then this other class of stuff is called generative models, and these are these probabilistic models: naive Bayes, hidden Markov models, Markov chains, and so on. Now, the way you visualize this on paper is: where a discriminative model is drawing a line between this thing and the other thing, [00:36:56] generative models are actually drawing a probability distribution around the X's and a probability distribution around the O's.
So it's actually drawing a circle around the X's and a circle around the O's. So you have two separate probability distributions, and very likely they're Gaussian, right? So it's gonna look like a bell curve in 2D, but not always Gaussian. [00:37:20] Sometimes you can generate some other sort of probability distribution. And so what a generative model does is you're generating a probability distribution. You're taking your Y's, the output, your classifications, in your training data, and you're trying to figure out what the probability distribution looks like for all those Y's. [00:37:42] It's kind of the question asked in reverse, which is why we have naive Bayes, the Bayesian part of naive Bayes being sort of like the reverse. We're answering: what are the X's given our Y's, or at least the distribution of the X's, what sort of probability distribution does this look like? Whereas discriminative models are the opposite: [00:38:04] what are the Y's? What's the class of this thing, given its features, the X's? So, discriminative versus generative models. You'll see this in machine learning outside of NLP, but it is a very important distinction to make in NLP, because it tends to be that generative models are very prevalent here in this world, and they have pros and cons, and they have use cases over each other. [00:38:27] So the generative model of a Markov chain, or a hidden Markov model in particular, you see, lends itself very well to language constructs. It's very elegant. It makes sense to sequence your way through words in a probabilistic fashion and, if you so need, spit out some little poops that may be your parts of speech or your named entities as you go along. [00:38:50] So a hidden Markov model sort of fits the scenario of information extraction from your sentence more so, I think, than a maximum entropy model. It feels a little hacky using a MaxEnt to kind of take some stuff from the word on the left and take some stuff on the right and figure out if it starts with a capital letter and all that stuff. [00:39:09] It just doesn't fit as elegantly. But also, besides elegance, generative and discriminative models have pros and cons for various use cases in NLP. For example, it tends to be that generative models work okay with very little training data by comparison to discriminative models. So for example, naive Bayes can start classifying spam versus not-spam emails really fast, before you even have very much data, compared to something like logistic regression, support vector machines, maximum entropy, and the like. [00:39:45] That's one pro of generative models over discriminative models in NLP. Well, what about discriminative over generative? Well, in the case of naive Bayes, for example, there's a little problem it has where it double-counts words, because it throws away the dependence of features; it introduces this independence assumption, [00:40:06] the naive part of naive Bayes. Well, it throws away the dependence of words on each other. So for example, the word Hong Kong is split into two words through tokenization and is double-counted: Hong is counted, and Kong is counted, and that skews the results of certain things like document similarity or classification, et cetera. [00:40:26] Whereas something like a maximum entropy model does not double-count Hong Kong; it doesn't do away with the dependence. So, pros and cons of generative versus discriminative models in NLP. I mean, you'll probably just end up using whatever model is common for that particular use case, and I don't know that you need to know the pros and cons of discriminative versus generative models.
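To make one of those generative sequence models concrete, here's a hedged sketch of training a hidden Markov model part-of-speech tagger with NLTK's built-in HMM trainer, using its small treebank sample (assumes that corpus has been downloaded; the tags shown in the comment are indicative, not guaranteed):

    import nltk
    from nltk.tag import hmm

    # nltk.download('treebank')  # small sample of tagged Wall Street Journal sentences

    # Supervised training data: lists of (word, part-of-speech tag) pairs per sentence
    train_sents = nltk.corpus.treebank.tagged_sents()[:3000]

    trainer = hmm.HiddenMarkovModelTrainer()
    tagger = trainer.train_supervised(train_sents)

    print(tagger.tag("The boy climbed up the mountain".split()))
    # e.g. [('The', 'DT'), ('boy', 'NN'), ('climbed', 'VBD'), ('up', 'IN'), ('the', 'DT'), ('mountain', 'NN')]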
[00:40:48] Let's talk about one more generative model called LDA, latent Dirichlet allocation. Something you use latent Dirichlet allocation for is what's called topic modeling. Now, I actually don't know how LDA works under the hood; I know what it does, and so that's what I'm gonna describe to you. Latent Dirichlet allocation is used primarily for topic modeling. [00:41:12] It's basically tagging documents. So let's say you have a million documents, and a bunch of documents are about cats and dogs and fish and aquariums and pet food and all this stuff, and some other documents are about bat and ball and base and stadium and stands, and some other documents are about JavaScript and Python and React and React Native and Postgres. [00:41:35] So there tend to be categories of documents. The dogs, cats, wolf, bark, fish type documents would be categorized as pets, and the bat, ball, home run stuff would probably be baseball, and the React, JavaScript, Python stuff would be programming. What LDA does is it finds these topics, it finds these clusters, so this is very similar to clustering that we've seen before with k-means. [00:42:02] And in fact, LDA is an unsupervised model, so that's very similar indeed. The difference is that k-means will find disjoint categories of things, disjoint clusters that don't have overlap, whereas LDA will find that maybe you have a document that's like 30% about pets and 20% about baseball and such. [00:42:24] So LDA will auto-tag your documents. That's what LDA is for. So for example, if you've ever written a question on Quora or Stack Overflow, Stack Overflow in particular, and you hit submit and you forgot to add tags to your question, well, Stack Overflow will automatically add a bunch of tags to your question: [00:42:42] machine learning, Python, TensorFlow. It'll just pick up on words in your document. And so LDA generates sort of topics, or clusters of words that seem to co-occur between documents, very useful for auto-tagging. You can also use it for document similarity. So for example, you can use LDA to determine if a document you're looking at is similar to some other document in the database. Quora, the question-answer website, had a competition one time to figure out, when you're typing in a question, has that question been asked before, so that you're not going to submit a duplicate. [00:43:18] Well, you could use LDA to figure out which topic the document you're typing up falls under, grab a whole bunch of other documents under that topic, and then use something from there to sort of find the level of overlap between the documents, something like TF-IDF, which we'll discuss in a bit. So LDA is for topic modeling: finding topics amongst your documents, groups of words that they have in common between each other. [00:43:44] Similar algorithms, as far as I understand, are called latent semantic indexing and latent semantic analysis. So if you see LSI or LSA or LDA, they all have a lot to do with each other, but LDA is the common one that's used in application in NLP for topic modeling, auto-tagging your documents.
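A minimal topic-modeling sketch using scikit-learn's LatentDirichletAllocation on a handful of invented documents; with this little data the topics are only suggestive, but it shows the mixed-membership output described above:

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation

    docs = [
        "dog cat fish aquarium pet food",
        "cat food dog bark wolf pets",
        "bat ball base stadium home run",
        "stadium stands baseball bat pitcher",
        "javascript python react postgres code",
        "python code react native javascript",
    ]

    vec = CountVectorizer()
    X = vec.fit_transform(docs)

    lda = LatentDirichletAllocation(n_components=3, random_state=0)
    doc_topics = lda.fit_transform(X)  # per-document topic mixture (mixed membership)

    # Show the top words for each discovered topic
    words = vec.get_feature_names_out()
    for i, topic in enumerate(lda.components_):
        print(f"topic {i}:", [words[j] for j in topic.argsort()[-4:]])

    print(doc_topics[0].round(2))  # e.g. mostly one topic, small weights on the others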
All right, let's move on to search: search engines, searching for documents, document similarity, relevance, all that stuff. The domain of Google, as well as the domain of a very popular open source project called Elasticsearch. Elasticsearch is probably the most popular open source search indexing tool that you can use for your project, your web app, whatever. [00:44:33] If you just wanted to add quick searching capabilities to some content-based web app that you have, you could use Google's drop-in site search feature, or if you're using a content management system like WordPress or Drupal, they provide search functionality out of the box. But if you're building something a little bit more robust, a common tool people use is Elasticsearch, and the underpinnings of Elasticsearch [00:44:57] are gonna be primarily predicated on this algorithm we're gonna be talking about called term frequency-inverse document frequency. We will lead up to that in a bit, but let's go back to bag of words. Let's talk again about the bag-of-words models. Naive Bayes, for example, was a bag-of-words model. I said it's basically like cutting up your document and throwing it into a bag and seeing if there's some overlap with maybe a search query or a classification. [00:45:25] You're just basically looking for trigger words. But let me talk a little bit more technically about what a bag-of-words model is. A bag-of-words model takes your document, cuts it up into tokens, unigrams or bigrams, however you wanna do it, and it generates a vector. So it takes your document and it turns it into a vector. [00:45:43] The vector's size, its width, is going to be the number of terms in the dictionary. Let's say you're using the English dictionary: that's around 171,000 words, so 171,000 columns by one, one vector for every document, 171,000 columns. And those are the features: is the word present in the document, zero or one? [00:46:09] Okay. So the column all the way on the left will be the first word in the dictionary, starting with A. The column all the way on the right will be the last word in the dictionary, starting with Z. And if the word is present in this document, in this email, in this webpage, then we put a one in that column, and if it is not present, we put a zero. [00:46:29] This is called a sparse vector. Sparse, because there's a sparsity of ones. There are very few ones; the rest are mostly zeros. This is by comparison to something you'll see in a future episode called a dense vector, which is basically like a squished-down version of that, where you're not necessarily using ones and zeros. [00:46:48] You're probably boiling that down into real numbers by way of something like principal component analysis. Remember how PCA, principal component analysis, from a prior episode is all about boiling your large vectors down into smaller vectors. One vector per document, many documents in the database, so 171,000 columns and however many documents you have as the [00:47:11] number of rows. And so this structure is used in naive Bayes for NLP. Naive Bayes can be used outside of NLP, but when it's used in NLP, it uses this bag-of-words structure. That's why in NLP naive Bayes is called a bag-of-words model. Now, if I wanted to build a search engine, I would probably use a bag-of-words model as well. [00:47:34] If I'm looking for cat food and I Google "cat food," what it will do is it will look through all these documents for documents where the cat column is one and the food column is one. Any documents that don't have both of those columns set to one are discarded, and the rest are returned to me. Very good. [00:47:54] Well, how do I proceed from here? Maybe I want to sort these documents in order of relevance to the user.
How would I do that with my current approach? It doesn't seem like there's any easy way. So let's introduce a new idea. Instead of storing a one or a zero in the column for a word for every document, let's store a tally, the number of times that word appears in the document. [00:48:19] So instead of storing a one for cat, let's store the number of times cat shows up in this current webpage. Very cool. So now if I look for cat food, what we could do is sort the documents retrieved by the number of occurrences of those words in the document. More cats and foods in the document means more relevant to my search query. [00:48:41] This is still a bag of words, by the way. The first approach, with a zero or a one, is called a binary vector, and I don't know what you'd call this vector, but it's still a bag of words. Well, let's think if there are any issues with this approach. Well, I can think of one right away: people can abuse this system by way of search engine optimization. [00:49:01] Okay. If Google had this system implemented as their primary search engine, people could add to the end of any webpage any keywords they want to show up for as highly relevant for search queries. So they could add cat, cat, cat, cat, food, food, food, a million times, white font on white background, font size zero. So you can game it. [00:49:21] But let's think of an even bigger issue, and that is that certain words tend to just show up in documents more often than others. This is especially prevalent with things called stop words: the, a, and, is, and so on. If somebody were to Google "what is the highest quality cat food," and this algorithm was implemented, then Google might return the documents which have "the" the most. Ding, ding, ding, [00:49:50] I found relevant documents! You're gonna love this one, it says "the" all over the place. You were looking for "the" and "quality" and "cat"? Well, this one has "the" like crazy. You're gonna love this one. So we could do a pre-processing step of removing stop words by way of NLTK, which has some method for that, but that doesn't get us all the way there either. [00:50:09] There are still certain words in our search query that we don't want to throw away. We want them there. They're indicative of something, they're useful for the query, but they are still more common than other words. We want our weird words, our rare words, to be sort of the most influential in a search query. [00:50:30] So, for example, if I were to Google "how do I vectorize a document," well, "how," I wanna keep that, I do care about the word "how," but because it's such a common word, I don't want it to be scored super high. I want the word "vectorize" to be scored really high and "document" to be just one under that. So that's when we come up with this concept called term frequency-inverse document frequency, TF-IDF. [00:50:58] It's a mouthful, and all it means is very simple. All it means is that our scoring mechanism for these vectors is inverted, it's the opposite, so that rare words go to the top and common words go to the bottom. As for stop words, you don't even have to throw away stop words; they're automatically handled by this TF-IDF algorithm for you. [00:51:23] They're just shoved to the bottom of the scoring mechanism, and very rare words are put at the top of the scoring mechanism. So again, we're still in bag-of-words land.
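Here's a quick look at how those inverse-document-frequency scores behave, using scikit-learn's TfidfVectorizer on a few invented documents; the common word gets a low weight and the rare words get high weights, with no manual stop-word removal:

    from sklearn.feature_extraction.text import TfidfVectorizer

    docs = [
        "the cat food the the best",
        "the dog likes the park",
        "how do i vectorize a document",
    ]

    vec = TfidfVectorizer()
    vec.fit(docs)

    # idf_ is the inverse-document-frequency part of the score:
    # "the" (appears in most documents) scores low, "vectorize" (rare) scores high
    for word, idf in sorted(zip(vec.get_feature_names_out(), vec.idf_), key=lambda p: p[1]):
        print(f"{word:12s} {idf:.2f}")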
We're still talking about models that use the bag-of-words approach, but we went from the very basic implementation of a binary bag of words, which may be used in naive Bayes, for example, to a tally approach of bag of words, which could be used in some other models, [00:51:50] to a much more robust bag-of-words approach called TF-IDF, term frequency-inverse document frequency. When your bag of words is key to the success of your model, and you want rare words up top and junk words on the bottom, you'll use TF-IDF. So how would you use TF-IDF for a search algorithm? Well, you'd store your documents as bags of words in vector format using the TF-IDF approach. Now, [00:52:19] when I do a Google query for best quality cat food, here's what's gonna happen. We're gonna use some vector math. Now, like I said, with a bag of words you're not working with real words, you're working with vectors. Machine learning algorithms need numbers. They need vectors, they need matrices, tensors. Machine learning algorithms work on tensors, so that's why we turn this thing into a vector. [00:52:40] Well, if I'm looking for high quality cat food, I can use some sort of logical-AND algorithm, you can imagine, right? Finding all documents where all the keywords of my query are present. Okay? So I can use some sort of logical-AND mechanism on my search query represented as a vector and the list of documents represented as vectors, because the keywords would align in the columns. [00:53:08] Okay, so now I have all the documents that have the words I'm looking for. From there, I want to find documents that best match my query in some way. Now I'm going to introduce you to the topic of similarity, vector space similarity. And this is another very important concept in machine learning proper, even outside of NLP, [00:53:32] so you're gonna wanna listen carefully to this one. Similarity between vectors is just linear algebra. Imagine points in Euclidean space, points in space, stars in a galaxy, dots all over the place in 3D. Those are your vectors; those are your documents. Or, as we'll get to with word2vec and the deep learning stuff, [00:53:52] those may be your words. Whatever it is you're storing in vector format, that's what it looks like: a bunch of dots all over the place in space. So here we have dots in space being your documents. We've thrown away 80% of the dots by using this logical AND to figure out at least which documents have the words I'm looking for. [00:54:13] Now, from there, I want to find the best match of all those dots, of all those vectors, of all those stars in a galaxy that remain after I've thrown away the documents that don't have any sort of match. Which ones are the best? There are a handful of similarity metrics that you can use for vectors. The first one that you might consider for the situation is called Jaccard similarity. [00:54:37] Jaccard similarity is, in essence, what amount of overlap is there between these vectors? How many words in common do they have, for example? That is a noble similarity metric to try here; it's not the common one that's used in practice. Let's try another one: Euclidean distance. Euclidean distance is how physically close two dots are to each other. [00:55:05] So if I'm looking for best quality cat food, that's my search query, and some document here has a lot of overlap with that question, would I be looking at how physically close my query is in space to the relevant document? The answer is no, and you'll see why in a bit.
So one more similarity metric is called cosine similarity. [00:55:28] Cosine similarity is like the angle between dots, the angle between vectors. I think of cosine similarity as: do they have the same vibe? Are these things semantically similar? So here's my analogy. Let's say we have Boston, San Francisco, and Santa Cruz, and we're talking about people in those cities. Now, if you don't know anything about those cities, Boston and San Francisco tend to be high-tech, sort of young-professional cities, in my opinion, [00:55:58] and Santa Cruz is this very laid back kind of surfer town. Okay. Euclidean distance would be how physically close things are to each other. Boston is very Euclidean-far from San Francisco, but San Francisco is very Euclidean-close to Santa Cruz. So Santa Cruz and San Francisco are very physically close in space to each other, compared to either of them to Boston. [00:56:25] Boston is on the other end of America. That's Euclidean distance. Jaccard similarity is a little bit different. It's, what is the overlap, essentially, between these cities? So maybe you would think of that as which residents have homes in multiple of these cities. To what extent does somebody have a home in San Francisco and a home in Boston, versus a home in San Francisco and a home in Santa Cruz? [00:56:50] Okay, so that would be Jaccard distance, and then cosine similarity would be how similar they are in character. Okay? So I would say that Boston and San Francisco are very similar in character, in personality. The type of person who would like Boston would like San Francisco and vice versa, by comparison to San Francisco and Santa Cruz. [00:57:14] I would say that they're very different in character. That's cosine similarity, the angle between dots. So obviously it's the cosine similarity we want here. But let me give you one more reason for using the cosine similarity between our query and the document we're looking for. And that reason is that even if two documents had the same vibe, the same semantic meaning, well, if one talked about what you are asking about in your query a lot, versus another document which talked about it a little, those distances in physical space would be drastically different, even though semantically they're not different. [00:57:56] Okay? So if I'm looking for high quality cat food and something is like rambling on about "high quality cat food is the best high quality cat food you could find for cats that's high quality and its quality is really high," versus another one that said "high quality cat food," well, they're both semantically what you're looking for, but one has like rocketed into space way to the top right, in the same direction as the other one that's closer to you down here. [00:58:23] So they're semantically similar, their cosine similarity is the same, because the angle between your query and those documents is the same in each case, but their physical distance is very large, and so that hurts your situation. That's why you don't use Euclidean distance here.
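A minimal TF-IDF plus cosine-similarity search sketch with scikit-learn, on invented documents; note that the short relevant document and the long, repetitive one end up with similar cosine scores against the query, which is exactly the point made above:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    documents = [
        "high quality cat food for picky cats",
        "the best stadium hot dogs in baseball",
        "high quality cat food is the best cat food, quality cat food for quality cats",
    ]

    vec = TfidfVectorizer()
    doc_vectors = vec.fit_transform(documents)           # sparse TF-IDF vectors, one per document
    query_vector = vec.transform(["best quality cat food"])

    # Cosine similarity compares angles, so document length matters far less than direction
    scores = cosine_similarity(query_vector, doc_vectors)[0]
    for score, doc in sorted(zip(scores, documents), reverse=True):
        print(f"{score:.2f}  {doc[:45]}")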
You use TF-IDF to find the similarity between your query and relevant documents based on word overlap, weighted inversely by term frequency: it penalizes really common words and gives higher scores to very rare words, so that search queries matching rarer words will find more relevant documents. [00:59:17] And from there you'll use cosine similarity to find the closest match among the matched documents. Now, TF-IDF and cosine similarity are vastly useful all over NLP. I talked in the prior section about using LDA, latent Dirichlet allocation, for topic modeling in order to find similar documents, documents similar to the one that you're looking at right now. [00:59:41] Well, that's not really the best way to find a similar document. The best way to find a similar document is using TF-IDF in combination with cosine similarity: what level of overlap of rare words does this document have with some other documents? And then from there, what is the angle between my document as a vector and those documents as vectors? Sort by smallest angle, [01:00:04] and now you have similar documents. Okay, I am only halfway through this episode, so I'm gonna stop here and just break it into two episodes. I'm sorry guys, this went on a lot longer than I thought it would. It's my bedtime. We talked about very small bits like edit distance using the Levenshtein algorithm, and stemming and lemmatization using the Porter stemmer. [01:00:27] Breaking your document into multiple tokens, tokenizing your documents into what's called unigrams, or you could break it into bigrams. You'll use these grams, these n-grams, in your language models. A language model is a generative model, a generative probabilistic machine learning approach to NLP. [01:00:49] One common task in NLP is classification and sentiment analysis. You can use any of a handful of algorithms there, such as support vector machines and MaxEnt, the maximum entropy model, which is very similar to multi-class logistic regression. But the most common algorithm used for classification and sentiment analysis in classical NLP, not state-of-the-art [01:01:11] NLP, is naive Bayes. Naive Bayes is a generative model. There is also something called information extraction, such as part of speech tagging, figuring out which word is a verb, which word is a noun, et cetera, and named entity recognition, figuring out who's who in a sentence. Models that you would use for that are maximum entropy models. [01:01:33] There's also another one called conditional random fields, which I didn't talk about in this episode. Importantly, there's a model called a hidden Markov model, which is a variation of a Markov chain. I distinguished between generative and discriminative models. Discriminative models give you the probability of y given x. Technically and visually, [01:01:53] it's like drawing a line separating the X's from the O's, the things on the left from the things on the right. Generative models give you the probability distribution of x given y, the inverse. Technically and visually, it's like drawing circles around your clouds of dots; you're drawing probability distributions around your classes. [01:02:17] Common discriminative models are support vector machines, maximum entropy models, logistic regression, neural networks, and so on. Common generative models are hidden Markov models, naive Bayes, and latent Dirichlet allocation. Latent Dirichlet allocation, LDA, is used for topic modeling, keyword extraction, figuring out which tags go with your documents.
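Since LDA comes up again here, a minimal topic modeling sketch, assuming scikit-learn's LatentDirichletAllocation (gensim is another common choice); the toy corpus and topic count are made up. The key output is each document's mixture over topics, which is what gives you the mixed membership mentioned in the show notes.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Hypothetical tiny corpus, so the resulting "topics" are only illustrative.
docs = [
    "dogs and cats make wonderful pets for families",
    "the team won the game in the final seconds",
    "my cat watches every game our local team plays",   # a mix of pets and sports
]

# LDA works on raw word counts rather than TF-IDF weights, hence CountVectorizer.
counts = CountVectorizer(stop_words="english").fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)

# Each row is one document's mixture over the 2 topics (e.g. partly one topic,
# partly the other), rather than a single hard cluster assignment.
print(lda.transform(counts))
```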
[01:02:42] Common algorithms similar to LDA are LSI, latent semantic indexing, and LSA, latent semantic analysis. LDA is very similar to k-means unsupervised clustering, which we talked about in a prior episode, the big difference being that LDA allows for overlap, like 30% about pets, 20% about sports, et cetera, whereas k-means clusters into disjoint classes. [01:03:08] Then we talked about search and document relevance by way of a one-two punch: TF-IDF, term frequency inverse document frequency, a bag of words model, and cosine similarity. You use the TF-IDF approach to find documents that match your query, or documents that match other documents, based on the overlap of rarer words. [01:03:37] And then you use cosine similarity to find how similar documents are to each other, or queries are to documents. You use that for document similarity, relevance, and search engines. In the next episode, I'm going to finish up traditional NLP algorithms. I'll talk about parsing, syntax tree parsing, things like context-free grammars, and then I'll talk about the really big problems to solve in NLP using all the tools I've given you so far. [01:04:11] So all the stuff that we talked about with part of speech tagging, named entity recognition, hidden Markov models, these are tools that you'll use for the big problems such as question answering, automatic summarization of documents, and machine translation. We'll get into all that in the next episode. [01:04:31] The resources are the same as the last episode, namely Speech and Language Processing, the textbook. They also put out a YouTube series, a 24-hour video series that is basically the detailed version of these three episodes I've been doing on classical NLP. So if you want to dive into the details, you can convert those YouTube videos to audio and listen to those. [01:05:03] And the NLTK book: NLTK being a popular Python library for traditional NLP algorithms, the authors of the library wrote a book that is an excellent introduction to NLP in general, just using NLTK as a vehicle. As always, those are available at ocdevel.com, at ocdevel.com/podcasts/machine-learning. [01:05:33] I'm also starting a mailing list you can sign up for there, where I'll send you an email when a new episode comes out, and I'll send you the resources in the email so you don't have to go to the website. Any other news or announcements I'll do on the mailing list. See you guys for NLP part three next time.