February 24, 2026 · 3 min read

AI, in plain words

What "AI" actually means, where the term came from, and why every product calls itself AI now. Sets up where machine learning and deep learning fit underneath.

My dad pointed at the washing machine and asked if its "AI" was the same thing as ChatGPT. Fair question. The honest answer is no, not even close, but I couldn't blame him for asking. The same two letters are stamped on washing machines, on the autocomplete in my code editor, and on trillion-dollar research labs. When a word means that many things, it has stopped meaning much. So let's strip it back.

This is post 1 of 8 in the Foundations series. We start at the word and work down: AI, then machine learning, then deep learning, then the stuff you actually touch every day.

It started as a grant pitch nobody could deliver

The word got coined in 1955, in a funding proposal. A mathematician named John McCarthy wanted money for a summer workshop, the Dartmouth Summer Research Project on Artificial Intelligence, which ran in 1956. The pitch was gloriously naive. Get a few sharp people in a room for one summer and work out how to make machines reason, use language, and solve problems on their own.

They didn't crack it. Nobody has, seventy years on. But the name stuck.

And for the next six decades, "AI" mostly meant research that didn't quite work. Chess programs that eventually got good. Handwriting readers that worked on a clear day. Funding droughts so regular they earned their own name, the AI winters. The word carried a faint smell of overpromising, and mostly it had earned it.

Then one launch rewired the word

ChatGPT went live on November 30, 2022. It reached an estimated 100 million users in about two months, faster than any consumer product before it. After that, "AI" in normal conversation quietly settled into one meaning: a thing that reads or writes text and answers you like a person would.

That's not the textbook definition. The textbook still says "machines that perform tasks requiring human intelligence." But that's not what someone means when they say they "asked AI to plan a trip." They mean an LLM. Probably ChatGPT, Claude, or Gemini.

That gap is where the trouble lives. "AI-powered" on a landing page might be a 70-billion-parameter language model. It might also be a ten-line if statement somebody decided to call smart. Same sticker, very different box.
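To make the "ten-line if statement" end of that range concrete, here is a rough sketch of what a hypothetical "AI-powered" wash-cycle picker could look like. The function name, inputs, and thresholds are all invented for illustration; none of it comes from a real appliance.

```python
def pick_wash_cycle(load_kg: float, soil_level: int) -> str:
    """Hypothetical 'smart' cycle picker. Nothing here is learned from data."""
    # Every threshold below was typed in by a person, which makes this a rule.
    if load_kg <= 0:
        return "off"
    if soil_level >= 3:
        return "heavy"
    if load_kg < 2.0:
        return "quick"
    return "normal"


print(pick_wash_cycle(load_kg=1.5, soil_level=1))  # quick
print(pick_wash_cycle(load_kg=5.0, soil_level=3))  # heavy
```

Nothing about this is bad engineering. It just isn't a model, and it will never do anything its author didn't write down.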

So why is everything suddenly AI?

Money, mostly. We're in the post-ChatGPT gold rush, and right now the word itself gets rewarded. Investors like it, customers click it. So your calendar app "uses AI" to suggest meeting times. Your toothbrush "has AI" to grade your brushing. Most of these aren't language models. A good few aren't even machine learning.

The other reason is that the word has no real edges. There has never been an agreed line between "rule-based software" and "AI." The spam filters of 2005 used statistical learning and we called them software. The same code in 2026 gets called AI and nobody blinks. The label just drifts with the mood of the room.

So here's the rule of thumb I use. When a product says "AI," read it as marketing and move on. When you actually want to know what's inside, ask one question. Is it a language model, a smaller machine-learning model, or a hand-written rule? The answer tells you what it can and can't do.
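The difference between a hand-written rule and a small machine-learning model is easier to see side by side. The sketch below is a toy, assuming four made-up training messages and a naive-Bayes-style word score; the names `rule_is_spam`, `train`, and `model_is_spam` are invented for this example and real spam filters are far more involved.

```python
import math
from collections import Counter

# Toy labelled data, invented for illustration: (message, is_spam).
TRAINING = [
    ("win a free prize now", True),
    ("free money claim your prize", True),
    ("lunch at noon tomorrow", False),
    ("notes from the team meeting", False),
]


def rule_is_spam(text: str) -> bool:
    """Hand-written rule: a person picked the keyword."""
    return "free" in text.lower().split()


def train(examples):
    """Learn one score per word from labelled examples (the 'parameters')."""
    spam, ham = Counter(), Counter()
    for text, is_spam in examples:
        (spam if is_spam else ham).update(text.lower().split())
    vocab = set(spam) | set(ham)
    spam_total = sum(spam.values()) + len(vocab)
    ham_total = sum(ham.values()) + len(vocab)
    # Log-odds with add-one smoothing: positive means "seen more in spam".
    return {
        w: math.log((spam[w] + 1) / spam_total) - math.log((ham[w] + 1) / ham_total)
        for w in vocab
    }


def model_is_spam(weights, text: str) -> bool:
    """Learned model: sum the learned word scores; unknown words count as 0."""
    return sum(weights.get(w, 0.0) for w in text.lower().split()) > 0


weights = train(TRAINING)
message = "claim your prize"
print(rule_is_spam(message))            # False: the rule only knows "free"
print(model_is_spam(weights, message))  # True: the learned scores flag it
```

The rule can only catch what its author thought of. The model catches "claim your prize" because those words showed up in the spam examples, and it would change its mind if the examples changed.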

One picture that clears most of it up

Here's the bit that made it click for me. AI is the umbrella. Sit machine learning under it, and deep learning under that. Every "AI" thing you actually use lives at one of those nested layers, never the vague top-level "AI" floating on its own.

[Figure: AI as the outer umbrella, with machine learning inside it and deep learning inside that]
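The same nesting can be written down as plain sets. This is only a sketch: the example entries are illustrative, and putting "hand-written rule" in the outer AI layer but outside machine learning is an assumption that follows the broad textbook definition.

```python
# Each layer is a set of illustrative examples; inner layers sit inside outer ones.
DEEP_LEARNING = {"large language model", "image recognition network"}
MACHINE_LEARNING = DEEP_LEARNING | {"spam filter", "recommendation system"}
AI = MACHINE_LEARNING | {"hand-written rule"}  # assumption: rules count as AI


def innermost_layer(thing: str) -> str:
    """Return the most specific layer that contains the given example."""
    if thing in DEEP_LEARNING:
        return "deep learning"
    if thing in MACHINE_LEARNING:
        return "machine learning"
    if thing in AI:
        return "AI (rules, no learning)"
    return "not AI"


# The nesting itself: each layer is a strict subset of the one around it.
print(DEEP_LEARNING < MACHINE_LEARNING < AI)    # True
print(innermost_layer("large language model"))  # deep learning
print(innermost_layer("spam filter"))           # machine learning
print(innermost_layer("hand-written rule"))     # AI (rules, no learning)
```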

That nesting is exactly where post 2 picks up: what each layer really means, and why people swap the three words around even when they shouldn't.

So next time something brags that it's AI-powered, you've got the only follow-up that matters. Which layer? The washing machine, for the record, is a rule. A very confident rule.

Tags: AI, AI Foundations, Beginners, History, Machine Learning

From the dictionary

Terms used in this post

Quick reference for the 8 terms you met above. Each one comes from the AI dictionary.

Artificial Intelligence (AI): Umbrella term for software that performs tasks usually associated with human reasoning: language, perception, decision-making. Coined in the 1955 funding proposal for the Dartmouth Summer Research Project, which ran in 1956. In everyday 2026 use, "AI" almost always means a large language model like ChatGPT, Claude, or Gemini, even though the textbook definition is much broader. Example: When a product page says "AI-powered", it could mean a 70-billion-parameter LLM or a hand-written if-statement. The label moves with the times.
ChatGPT (AI): OpenAI's consumer chat product, launched November 30, 2022. The first LLM product to reach mass adoption, with an estimated 100 million users in two months. The product most people mean when they say "AI" today.
Claude (AI): Anthropic's family of LLMs (Opus, Sonnet, Haiku) and consumer chat product at claude.ai. Used in this blog's tooling for drafting and dictionary work; also powers Claude Code, the CLI agent. Example: This blog's create-post skill drafts inline using Claude.
Deep Learning (DL): A subset of machine learning that uses neural networks with many layers ("deep" stacks). Powers image recognition, speech, and the LLMs behind ChatGPT/Claude/Gemini. Needs much more data and compute than classical ML, but scales further. Example: Every modern LLM is a deep-learning model, typically a transformer with billions of parameters trained on internet-scale text.
Gemini (AI): Google's family of LLMs and the consumer chat product at gemini.google.com. Tightly integrated with Google's search index and Workspace apps. Example: Gemini is Google's answer to ChatGPT, with native access to Search.
Large Language Model (AI): A deep-learning model trained on huge volumes of text to predict the next token given the previous ones. Scaling next-token prediction to billions of parameters yields the chat-like behaviour of ChatGPT, Claude, and Gemini. Capabilities are bounded by training data and the context window. Example: Claude is an LLM. It reads your message as tokens and generates a response one token at a time.
Machine Learning (ML): A subset of AI where the system learns patterns from data instead of following hand-written rules. The output is a model: a set of learned numbers that maps inputs to outputs. Spam filters, recommendation systems, and credit-risk scorers are classical ML. Example: Gmail's spam filter learns which emails you mark as junk and updates its model. That's machine learning, not a rule someone wrote.
Parameters (ML): The individual learned numbers inside a model. "7B parameters" means 7 billion of them. More parameters generally means more capacity, more memory needed, and slower inference.
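The "more memory needed" part of the Parameters entry is simple arithmetic. A back-of-the-envelope sketch, assuming each parameter is stored as a 16-bit number (2 bytes) and counting only the weights themselves; the function name is made up for this example, and real deployments vary with the number format and runtime overhead.

```python
def weights_gigabytes(parameters: float, bytes_per_parameter: int = 2) -> float:
    """Rough size of a model's weights in GB (1 GB = 1e9 bytes).

    Assumes 2 bytes per parameter (16-bit storage) unless told otherwise.
    """
    return parameters * bytes_per_parameter / 1e9


print(weights_gigabytes(7e9))   # 14.0  -> a "7B" model
print(weights_gigabytes(70e9))  # 140.0 -> a "70B" model
```

Under that assumption, a 70-billion-parameter model needs about ten times the memory of a 7-billion-parameter one before it has answered a single question.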

