Local RAG and embeddings
A complete local RAG pipeline in 30 lines: nomic-embed-text for embeddings, Chroma for the vector DB, Llama 3.2 for the chat model. Why local RAG often beats cloud RAG for personal knowledge bases.
Blog
Posts on AI engineering, LLM systems, and software development.
A complete local RAG pipeline in 30 lines: nomic-embed-text for embeddings, Chroma for the vector DB, Llama 3.2 for the chat model. Why local RAG often beats cloud RAG for personal knowledge bases.
RAG is the pattern of fetching relevant text from a search system and putting it in the LLM's context window before asking your question. Not magic, not fine-tuning — just better prompts.