What it takes to run a model on your machine
Why VRAM is the hard ceiling on local LLMs, what quantization actually does to a model file, and the practical hardware ladder from 8GB laptops to 192GB workstations.
Blog
Posts on AI engineering, LLM systems, and software development.
Why VRAM is the hard ceiling on local LLMs, what quantization actually does to a model file, and the practical hardware ladder from 8GB laptops to 192GB workstations.