Foundations

Reference explainers for the concepts that underpin modern ML systems. Each entry covers the math, the implementation, and the intuition — built to be bookmarked, not scrolled past.

Unlike blog posts, these are maintained over time as understanding deepens. Check the 'last updated' date on each.

Transformer Internals

2 parts · ~70 min read

A ground-up walkthrough of the transformer architecture — from the math of attention through positional encoding to a complete forward pass, with tested PyTorch implementations at every step.

  1. Part 1: Attention Is All You Need to Implement (~40 min)
  2. Part 2: Positional Encoding: Teaching Transformers to Count (~30 min)
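As a taste of what Part 1 builds toward, here is a minimal sketch of scaled dot-product attention in PyTorch. The function name, tensor shapes, and batch sizes are illustrative assumptions, not the series' actual implementation:

```python
import math
import torch

def scaled_dot_product_attention(q, k, v):
    # q, k, v: (batch, seq_len, d_k); shapes here are an illustrative assumption
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)  # (batch, seq, seq)
    weights = torch.softmax(scores, dim=-1)            # each row sums to 1
    return weights @ v                                 # (batch, seq, d_k)

q = k = v = torch.randn(1, 4, 8)
out = scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([1, 4, 8])
```

The `1 / sqrt(d_k)` scaling keeps the dot products from growing with dimension, which would otherwise push the softmax into near-one-hot saturation; Part 1 derives this step by step.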