Introduces the fundamentals of natural language processing, from tokenization and embeddings to transformer architectures and language models. Uses the development of GPT-2 as a case study to illustrate how unsupervised learning on large text corpora can produce surprisingly capable language models.