
The hardest thing about AI systems isn’t the model. It’s everything around it: the data pipeline, the embedding store, the retrieval stage, the prompt assembly, the model server, the eval harness, the observability layer, the cost ceiling. Teams that ship an AI product routinely spend something like 80% of their engineering effort on the parts that have nothing to do with model.generate(...).

This roadmap orders the published AI notes so that 80% becomes concrete. It’s deliberately small today (three articles); the rest of the vault is still being written. As more notes land, this page is updated first.

Read in order

Foundations

  1. ML System Design Fundamentals — The system-design vocabulary applied to ML: training vs. serving topology, offline vs. online features, the feedback loop trap, when batch beats real-time, and the four production failure modes that don’t exist in textbook ML.

Retrieval-Augmented Generation

  1. RAG Architecture — Embeddings, vector databases, chunking strategies, retrieval ranking, prompt assembly, evaluation. The honest treatment of where RAG works, where it silently fails, and the chunking + reranking choices that determine whether your assistant is useful or merely fluent.

Serving and operations

  1. Model Serving — Batching, KV cache, speculative decoding, paged attention, autoscaling, A/B testing, shadow deployment, cost-per-token economics. The layer between “we have a model” and “we have a product.”

What’s coming next

The vault has work in progress on Embeddings, Chunking strategies, Vector Databases (FAISS, pgvector, Pinecone, Weaviate), HNSW & ANN search, Model evaluation, and LLM Observability. As each one publishes, it will be folded into this roadmap in dependency order: embeddings before vector DBs, vector DBs before RAG, RAG before serving.

Where to go next

  • Foundations — the Foundations Roadmap covers the networking, databases, caching, and distributed-systems primitives AI infrastructure sits on top of.
  • System design interviews — the System Design Interviews Roadmap is useful when the AI problem is framed as a product/system design exercise rather than an ML-only question.

This is a growing roadmap. The order above respects the prerequisites between notes that already exist; new notes will slot in at the right depth as they ship.
