Loss

ML

Definition

A number that measures how wrong the model's prediction is compared to the truth. Training is the process of adjusting the weights so this number goes down.
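As a rough illustration, here is a minimal sketch of that idea in plain Python. It assumes a toy one-weight model (prediction = weight * x), mean squared error as the loss, and plain gradient descent as the training rule. The names `mse_loss` and `train`, the data, and the learning rate are all made up for this example; real models use many weights and often a different loss.

```python
def mse_loss(predictions, targets):
    """Mean squared error: the average squared gap between prediction and truth."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def train(xs, ys, weight=0.0, learning_rate=0.01, steps=50):
    """Nudge a single weight with gradient descent so the loss goes down."""
    history = []
    for _ in range(steps):
        predictions = [weight * x for x in xs]
        history.append(mse_loss(predictions, ys))
        # Slope of the loss with respect to the weight.
        gradient = sum(
            2 * (p - y) * x for p, y, x in zip(predictions, ys, xs)
        ) / len(xs)
        # Step against the slope: this is the "changing weights" part.
        weight -= learning_rate * gradient
    return weight, history


# Toy data where the true relationship is y = 2 * x (an assumption for the demo).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

weight, history = train(xs, ys)
print(f"first loss: {history[0]:.4f}")
print(f"last loss:  {history[-1]:.6f}")
print(f"learned weight: {weight:.4f}")
```

The loss starts high because a weight of 0 predicts 0 for everything, then shrinks on each step as the weight moves toward 2.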

Posts that use this term

  • Fine-tuning a model locally

    When fine-tuning is the right answer (rarely) and how to do it on consumer hardware: LoRA, QLoRA, MLX-LM, Unsloth. A worked example fine-tuning Llama 3.2 3B on a 16GB Mac.

  • Quantization, distillation, pruning: making models fit

    Three ways to shrink an LLM. Quantization (Q2-Q8 with K-quants in GGUF), distillation (teacher to student), pruning. Why Q4_K_M is the community default and what each lever costs.

  • The local-LLM vocabulary

    Parameters, B, dense vs MoE, base vs instruct, tokens, context window, chat template, GGUF, quantization suffixes. After this post you can read any HuggingFace model card.

  • What it takes to run a model on your machine

    Why VRAM is the hard ceiling on local LLMs, what quantization actually does to a model file, and the practical hardware ladder from 8GB laptops to 192GB workstations.

  • How a model learns: training and inference

    Training is the expensive one-time event where a model's numbers get tuned. Inference is the cheap repeated use afterwards. The gap in cost is enormous, and it shapes the whole industry.

  • Install llama.cpp

    Build llama.cpp from source with Metal or CUDA acceleration. Run a GGUF model with llama-cli. The closest thing to bare-metal local inference.