Welcome to this Map of Content.

Notes

  • 0. Glossary (NLP) - Natural Language Processing (NLP): NLP is a field of linguistics and machine learning focused …
  • Keyword Extraction & Topic Modelling - 1. Keyword Extraction 1. TF 2. TF-IDF 3. RAKE ([rake-keyword](https://github.com/u-prashant/RAKE
  • 3. Word Vectors - Knowledge-based representation (e.g. WordNet) - Might miss nuance (e.g. “proficient” is listed…
  • 5. Tokenization - Character Tokenization: The simplest tokenization scheme is to feed each character individuall…
  • 2. Language Modeling & N-grams - In general, language modeling is the task of predicting what word comes next. - Statistical LM (N-…
  • 4. RNNs & CNNs for Text Classification - Improvements over n-gram - No sparsity problem - Model size is not Remainin…
  • LM Benchmarks - For Coding: - HumanEval: Python coding tasks (higher % = better) - MBPP: Python programm…
  • Large Language Models - Characteristics of LLMs: - Scale: They contain millions, billions, or even hundreds of billi…
  • 1. What is NLP? - NLP is a field of linguistics and machine learning focused on understanding everything related to hu…
  • KV Cache - A performance optimization technique used in Large Language Models (LLMs) to speed up text generation by storing the Key and Value vectors of previous tokens.

To Research / Inbox