When someone says “AI” in 2026, they almost always mean one thing: large language models. Not robotics. Not the Terminator. Not whatever IBM was selling in 2018. They mean software that reads text, predicts what comes next, and generates a response that sounds like a person wrote it.
90% of the AI conversation right now is about this one technology.
We had to learn all of this from scratch when we started Ironworks. Nobody handed us a cheat sheet, so we wrote one.
What an LLM actually does
Think of an LLM like the world’s most well-read intern. It’s consumed billions of pages of text — books, websites, code, legal filings, manuals, Reddit threads — and built an internal model of how language works. When you give it a prompt (“write me a follow-up email to this prospect”), it predicts, word by word (technically token by token, but close enough), what the most useful response would be based on everything it’s read.
It doesn’t “think” the way you do. No opinions, no memory between conversations unless you set that up. It’s pattern matching at an insane scale. But the output is good enough that for a huge number of business tasks — drafting emails, summarizing documents, answering customer questions, writing reports — it gets you 80% of the way there in about 5% of the time.
You’re buying back hours.
Who makes these things
Four companies matter. Everyone else is building on top of them.
OpenAI / ChatGPT — The one everyone’s heard of. Flagship model is GPT-5.2. ChatGPT is the consumer product, and it’s probably what your employees are already using whether you know it or not. Strongest general-purpose model with the biggest user base. If you’ve only tried one AI tool, it was this.
Anthropic / Claude — Full disclosure, Ironworks runs on Claude. Anthropic built it with a focus on safety and accuracy for business use. Current models are Claude Opus 4.6, Sonnet 4.6, and Haiku 4.5. Sonnet is arguably the best coding model out right now. Claude Code, their developer tool, hit $1 billion in annual revenue within six months of launch.
Google / Gemini — Gemini 3 powers both the Gemini chatbot (750+ million monthly users) and the AI Overviews in Google Search results (2+ billion monthly users). If you’ve Googled anything recently, you’ve already used an LLM. Where Gemini wins is distribution. It’s baked into Gmail, Docs, Sheets, all of Google Workspace, so adoption happens whether people mean to adopt or not.
Meta / Llama — Llama models are open-source. Anyone can download, modify, and run them on their own hardware. That matters if you’ve got data privacy requirements or want full control. Most SMBs won’t touch Llama directly, but a lot of the AI tools you’re already paying for are running on it somewhere in the stack.
The glossary you’ll actually need
We’re skipping anything you don’t need yet.
Prompt — The instruction you type in. “Summarize this contract” is a prompt. “Write a job posting for a senior accountant at a 40-person firm in Richmond” is a better one. Garbage in, garbage out.
Prompt engineering — Writing good prompts. Sounds silly, matters a lot. The difference between “help me with marketing” and “write three subject lines for a re-engagement email to dormant HVAC maintenance customers, casual tone, under 50 characters” is the difference between useless output and something you’d actually send.
Tokens — How LLMs measure text. One token is roughly three-quarters of a word. You pay per token, on both input and output, and models have limits on how many they can handle. A 10-page contract costs more to process than a two-sentence question.
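That back-of-the-envelope math is easy to script. A minimal sketch using the common rule of thumb of roughly 4 characters per token — the prices here are invented placeholders, since real per-token rates vary by model and change often:

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate: ~4 characters per token for English text."""
    return max(1, len(text) // 4)

def estimate_cost(input_text: str, expected_output_tokens: int,
                  price_in_per_1k: float, price_out_per_1k: float) -> float:
    """Estimate the cost of one request, priced per 1,000 tokens.
    You pay for what you send in AND what the model sends back."""
    input_tokens = estimate_tokens(input_text)
    return (input_tokens / 1000) * price_in_per_1k \
         + (expected_output_tokens / 1000) * price_out_per_1k

# A 10-page contract (~25,000 characters) vs. a two-sentence question
contract = "x" * 25_000
question = "Is the termination clause standard?"

print(estimate_tokens(contract))   # → 6250
print(estimate_tokens(question))   # → 8
```

The point isn’t precision — official tokenizers exist for that — it’s that cost scales with how much text you feed in, which is why dumping whole folders into a prompt adds up.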
Context window — The maximum amount of text a model can “see” at once. GPT-5.2’s context window is 400,000 tokens, which works out to a few hundred pages. Two years ago these windows were tiny. Now most business use cases fit comfortably.
Hallucination — When the model confidently makes something up. It’ll cite a court case that doesn’t exist, invent a statistic, or fabricate a quote. This is the single biggest risk for business use. Always verify anything an LLM produces that involves facts, numbers, or legal claims. The models are getting better here. It hasn’t been solved.
Agents — Software that uses an LLM to take actions, not just generate text. Instead of “write me an email,” an agent actually sends the email, checks for a response, and follows up. This is where the industry is heading. Still early, moving fast.
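The “takes actions” idea boils down to a loop: the model picks the next step, the software executes it, and the result feeds back in. A toy sketch of that loop — `decide_next_step` is a hand-written stand-in for what would really be a model API call, and the action names are invented for illustration:

```python
def decide_next_step(state: dict) -> str:
    """Stand-in for the LLM: picks the next action from the current state.
    A real agent would send the state to a model API and parse its reply."""
    if not state["email_sent"]:
        return "send_email"
    if state["days_waited"] < 3:
        return "wait"
    if not state["followed_up"]:
        return "follow_up"
    return "done"

def run_agent() -> list:
    """The agent loop: ask for an action, execute it, update state, repeat."""
    state = {"email_sent": False, "days_waited": 0, "followed_up": False}
    log = []
    while (action := decide_next_step(state)) != "done":
        log.append(action)
        if action == "send_email":
            state["email_sent"] = True    # real version: call your email API
        elif action == "wait":
            state["days_waited"] += 1     # real version: check for a reply
        elif action == "follow_up":
            state["followed_up"] = True   # real version: send the follow-up
    return log

print(run_agent())
# → ['send_email', 'wait', 'wait', 'wait', 'follow_up']
```

Swap the stub for a real model and the comments for real API calls, and that’s the skeleton most agent products are built on: the LLM decides, the surrounding software acts.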
Fine-tuning — Taking a general-purpose model and training it further on your specific data. A law firm would fine-tune on past case files so the model understands their domain better. Expensive, usually overkill for SMBs. Good prompts and well-organized reference documents get most businesses further for less money.
RAG (Retrieval-Augmented Generation) — Instead of fine-tuning, you give the model access to a database of your documents at query time. It searches your files first, then generates an answer grounded in your actual data. Lower cost than fine-tuning, easier to update, and you keep control of your information. For most small businesses, this is the approach that makes sense.
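At its simplest, RAG is two steps: find your most relevant documents, then paste them into the prompt. A bare-bones sketch using word overlap as the “search” step — production systems use vector embeddings instead, and the final prompt here would go to a real model API:

```python
import re

def score(question: str, doc: str) -> int:
    """Crude relevance: count question words that appear in the document."""
    q_words = set(re.findall(r"\w+", question.lower()))
    d_words = set(re.findall(r"\w+", doc.lower()))
    return len(q_words & d_words)

def retrieve(question: str, docs: list, k: int = 2) -> list:
    """Return the k most relevant documents for this question."""
    return sorted(docs, key=lambda d: score(question, d), reverse=True)[:k]

def build_prompt(question: str, docs: list) -> str:
    """Ground the model's answer in your own files, not its training data."""
    context = "\n".join(retrieve(question, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

docs = [
    "Refund policy: customers may request a refund within 30 days.",
    "Office hours are Monday to Friday, 9am to 5pm.",
    "Shipping: orders ship within 2 business days.",
]
print(build_prompt("What is the refund policy?", docs))
```

Because the answers come from documents you control, updating the system means updating the files — no retraining, which is exactly why this beats fine-tuning for most small businesses.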
Where this leaves you
You don’t need to understand transformer architecture or attention mechanisms. You need to know these tools exist, what they cost, where they’re unreliable, and which tasks they handle well enough today.
96% of American businesses plan to adopt AI. Fewer than 25% have done it. That gap is almost entirely an execution problem. The technology works. The products exist. The pricing is accessible. Most companies are stuck on “okay, but what do I actually do with it on Monday morning?”