LLM From Scratch — Video Lessons Index

This blog is a treasure trove for the curious mind—an open classroom where modern artificial intelligence is unpacked, explained, and built piece by piece. It contains not only the complete text of Build a Large Language Model from Scratch but also an accompanying video series in which the author walks through every chapter step by step, explaining the code and concepts in real time.

Build a Large Language Model from Scratch offers a practical, hands-on exploration of how today’s AI systems like GPT actually work—layer by layer, tensor by tensor. It leads you from raw text and tokenization through attention mechanisms, training loops, and text generation, grounding every idea in both mathematics and executable code. Instead of treating large language models as mysterious black boxes, this work illuminates their inner workings with clarity and rigor. By the end, you’ll not only understand how an LLM thinks—you’ll have built one yourself.

Read the book — Build a Large Language Model From Scratch (PDF)

Below is a carefully organized, lesson-by-lesson index of the LLM From Scratch video series. Each link opens the corresponding MP4 directly so you can follow along from anywhere.

Tip: The videos are grouped by chapter (Unit). Begin with Unit 1 for setup and foundations, explore tokenization in Unit 2, dive into attention in Unit 3, and finish strong with training and text generation in Unit 5.


Unit 1 — Setup & Foundations

  1. Python Environment Setup
    Prepare your workstation and Python toolchain so all later notebooks and scripts run without fuss.
    ▶︎ U01M01 Python Environment Setup Video.mp4

  2. Foundations to Build a Large Language Model (From Scratch)
    Big-picture tour of what an LLM is, the building blocks you’ll implement, and how the pieces fit.
    ▶︎ U01M02 Foundations to Build a Large Language Model (From Scratch).mp4


Unit 2 — Tokenization & Data Pipeline

  1. Prerequisites to Chapter 2
    Short preface on goals and required background for the tokenization chapter.
    ▶︎ U02M01 Prerequisites to Chapter 2.mp4

  2. Tokenizing Text
    Turn raw text into tokens—the basic symbols your model understands.
    ▶︎ U02M02 Tokenizing text.mp4

  3. Converting Tokens into Token IDs
    Map tokens to integer IDs, the numeric form used by embeddings and models.
    ▶︎ U02M03 Converting tokens into token IDs.mp4

  4. Adding Special Context Tokens
    Insert markers like BOS/EOS and separators to give structure and intent to sequences.
    ▶︎ U02M04 Adding special context tokens.mp4

  5. Byte Pair Encoding (BPE)
    Learn subword tokenization to balance vocabulary size and coverage.
    ▶︎ U02M05 Byte pair encoding.mp4

  6. Data Sampling with a Sliding Window
    Build training sequences efficiently by sliding across long texts.
    ▶︎ U02M06 Data sampling with a sliding window.mp4

  7. Creating Token Embeddings
    Convert token IDs into dense vectors that capture meaning.
    ▶︎ U02M07 Creating token embeddings.mp4

  8. Encoding Word Positions
    Add positional information so the model knows where words occur.
    ▶︎ U02M08 Encoding word positions.mp4
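
The Unit 2 lessons above form one pipeline: split text into tokens, assign each token an integer ID, then slide a window across the ID sequence to produce input/target training pairs. As a rough preview of what you will build in the videos, here is a minimal pure-Python sketch (the function names, the regex, and the `<|endoftext|>` marker are illustrative conventions, not the book's exact code, which uses the tiktoken BPE tokenizer and PyTorch data loaders):

```python
import re

def tokenize(text):
    # Split on whitespace and punctuation, keeping punctuation as tokens.
    tokens = re.split(r'([,.:;?_!"()\']|--|\s)', text)
    return [t for t in tokens if t.strip()]

def build_vocab(tokens):
    # Sorted unique tokens, plus a special end-of-text marker at the end.
    words = sorted(set(tokens)) + ["<|endoftext|>"]
    return {tok: i for i, tok in enumerate(words)}

def sliding_windows(ids, context_size, stride):
    # Pair each input chunk with the same chunk shifted one position right,
    # so the model learns to predict the next token at every position.
    pairs = []
    for start in range(0, len(ids) - context_size, stride):
        x = ids[start : start + context_size]
        y = ids[start + 1 : start + context_size + 1]
        pairs.append((x, y))
    return pairs

text = "the quick brown fox jumps over the lazy dog."
tokens = tokenize(text)
vocab = build_vocab(tokens)
ids = [vocab[t] for t in tokens]
pairs = sliding_windows(ids, context_size=4, stride=1)
```

The embedding and positional-encoding lessons then replace each integer ID with a dense learned vector plus a position vector, which is where PyTorch's `nn.Embedding` enters the picture.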


Unit 3 — Attention Basics

  1. Prerequisites to Chapter 3
    What to expect before you implement attention mechanisms.
    ▶︎ U03M01 Prerequisites to Chapter 3.mp4

  2. A Simple Self-Attention Mechanism (No Trainable Weights) — Part 1
    Build intuition for how tokens attend to one another without jumping into full transformer math.
    ▶︎ U03M02 A simple self-attention mechanism without trainable weights Part 1.mp4
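
The idea in that lesson is small enough to sketch in a few lines: with no trainable weights, an attention score is just the dot product between two token embeddings, softmax turns each row of scores into weights, and each output ("context vector") is the weighted average of all embeddings. A minimal pure-Python version (the example vectors are made up; the book's code uses PyTorch tensors):

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    exps = [math.exp(s - max(scores)) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def simple_self_attention(embeddings):
    contexts = []
    for query in embeddings:
        scores = [dot(query, key) for key in embeddings]   # attention scores
        weights = softmax(scores)                          # attention weights
        # Context vector: weighted sum of all token embeddings.
        context = [
            sum(w * vec[d] for w, vec in zip(weights, embeddings))
            for d in range(len(query))
        ]
        contexts.append(context)
    return contexts

embeddings = [[0.43, 0.15, 0.89],
              [0.55, 0.87, 0.66],
              [0.57, 0.85, 0.64]]
contexts = simple_self_attention(embeddings)
```

Later lessons add the trainable query, key, and value projection matrices that turn this intuition into real transformer attention.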


Unit 5 — Training & Text Generation

  1. Prerequisites to Chapter 5
    Scope, datasets, and what “training loop” really means here.
    ▶︎ U05M01 Prerequisites to Chapter 5.mp4

  2. Using GPT to Generate Text
    Wire up a generation function and produce your first outputs.
    ▶︎ U05M02 Using GPT to generate text.mp4

  3. Text Generation Loss: Cross-Entropy & Perplexity
    Measure how well the model predicts the next token, and interpret the metrics.
    ▶︎ U05M03 Calculating the text generation loss: cross entropy and perplexity.mp4

  4. Training & Validation Losses
    Track learning progress and catch overfitting early.
    ▶︎ U05M04 Calculating the training and validation set losses.mp4

  5. Training an LLM
    Put the pieces together: optimizer, batches, checkpoints, and sanity checks.
    ▶︎ U05M05 Training an LLM.mp4

  6. Decoding Strategies to Control Randomness
    Greedy, sampling, and friends—trade off diversity vs. determinism.
    ▶︎ U05M06 Decoding strategies to control randomness.mp4

  7. Temperature Scaling
    Tune output randomness with a single, powerful knob.
    ▶︎ U05M07 Temperature scaling.mp4

  8. Top-k Sampling
    Clip the candidate pool to the k most likely tokens for cleaner generations.
    ▶︎ U05M08 Top-k sampling.mp4

  9. Modifying the Text Generation Function
    Extend your generator to support new strategies and constraints.
    ▶︎ U05M09 Modifying the text generation function.mp4

  10. Loading & Saving Model Weights in PyTorch
    Serialize models cleanly; resume training or deploy for inference.
    ▶︎ U05M10 Loading and saving model weights in PyTorch.mp4

  11. Loading Pretrained Weights from OpenAI
    Plug in existing weights to compare, validate, or bootstrap experiments.
    ▶︎ U05M11 Loading pretrained weights from OpenAI.mp4
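
The loss lesson in this unit boils down to one computation: cross-entropy is the average negative log-probability the model assigns to each correct next token, and perplexity is simply `exp(loss)`. A hand-rolled sketch to make the arithmetic visible (the book uses `torch.nn.functional.cross_entropy`; the logits below are made up):

```python
import math

def softmax(logits):
    exps = [math.exp(l - max(logits)) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits_per_step, target_ids):
    # Average negative log-probability of each correct next token.
    losses = []
    for logits, target in zip(logits_per_step, target_ids):
        probs = softmax(logits)
        losses.append(-math.log(probs[target]))
    return sum(losses) / len(losses)

# Two positions over a 4-token vocabulary; the targets are token IDs 2 and 0.
logits_per_step = [[0.1, 0.2, 3.0, -1.0],
                   [2.5, 0.0, 0.3, 0.1]]
loss = cross_entropy(logits_per_step, target_ids=[2, 0])
perplexity = math.exp(loss)  # roughly, the model's effective branching factor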