Your first local LLM, end to end
Install Ollama, pull Llama 3.2 3B, chat, hit the OpenAI-compatible API, and troubleshoot the five things that go wrong on first install. By the end of this post you have a working local LLM.
Blog
Posts on AI engineering, LLM systems, and software development.
Install Ollama, pull Llama 3.2 3B, chat, hit the OpenAI-compatible API, and troubleshoot the five things that go wrong on first install. By the end of this post you have a working local LLM.
What macOS, Linux, and Windows each need to run a local LLM in 2026. Native Windows now works smoothly; WSL2 for Linux power users; Mac is the smoothest path; Linux gives you the most knobs.
Install the OpenAI SDK for Python and Node, configure your API key, and verify with a one-line chat.completions call.
Install the Anthropic SDK for Python and Node, configure your API key, and verify with a one-line messages.create call to Claude.
Install LM Studio on macOS, Linux, and Windows. The fastest GUI for running local LLMs — no terminal needed. Includes the local server for OpenAI-compatible API access.
Build llama.cpp from source with Metal or CUDA acceleration. Run a GGUF model with llama-cli. The closest thing to bare-metal local inference.
Install Ollama on macOS, Linux, and Windows. Pull your first model, run it locally, and verify with ollama list. The fastest path to a local LLM.
Install Docker Desktop on macOS and Windows, Docker Engine on Linux. Verify with docker run hello-world and learn the licensing and resource gotchas.
Skip system Python. Install uv, then use uv to manage Python versions and per-project virtual environments. Verify with uv python list.
Install Node.js and npm via a version manager (nvm, fnm, or Volta) so you can switch versions per project. Verify with node -v and npm -v.
Install Git on macOS, Linux, and Windows. Configure your name and email so commits are attributed correctly. Verify the install in one command.
Install Homebrew, the de-facto package manager on macOS. The one-line installer, PATH setup for Apple Silicon, and how to verify it works.