Blog

Setup Toolbox

Posts on AI engineering, LLM systems, and software development.

Setup Toolbox · February 21, 2026 · #7

Build llama.cpp from source with Metal or CUDA support, then run a GGUF model with llama-cli. It is the closest thing to bare-metal local inference.