TECH & HUMAN//2026-03-17//8 min

Modern AI for dummies

TL;DR// AI-optimized summary

5 levels of AI available today: 1) Classic chat (ChatGPT, Claude) - answers questions. 1.5) Generative AI - creates images, music, and video from a description. 2) Modern web interface - chat connected to your documents and tools. 3) Coding agent - builds software directly on your computer. 4) AI agent - autonomously plans and executes tasks without supervision. Most people only know level 1 - like saying 'car' means a 1985 Yugo when Teslas are driving around.

Disclaimer: This isn't an encyclopedia entry. It's a simplified overview for people who don't live and breathe AI every day but want to understand what's actually happening. I'm deliberately simplifying some things so they make sense without a technical background.


Most people think "artificial intelligence" is ChatGPT.

Type a question, get an answer. Done. AI.

It's like saying "car" means a 1985 Yugo. Technically, it's a car. But if you take that comparison onto the highway in 2026, you'll find that what's driving around you is a completely different world. The gap between what most people imagine when they hear "AI" and what AI can actually do today is just as wide. And almost nobody talks about it, because most people writing about AI are still driving the Yugo.

This is a guide to what's available to everyone today. From chat to agent, with examples.


1. Classic AI chat - a smart head without hands

ChatGPT, Claude, Gemini. You know the drill. Open a website, type a question, get an answer.

The model learned from enormous amounts of text and responds from what it knows. Over time, capabilities expanded:

  • Text: answering questions, writing emails, analyzing documents
  • Images: send a photo, it describes or analyzes it
  • Web search: finds current information from the internet

And that's it.

A chat can't DO anything. It can't create a file for you. Can't open a spreadsheet. Can't run a program. Can't send an email on your behalf. It only sees what you paste into the window, and only responds within that window.
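To make "a smart head without hands" concrete, here's a minimal Python sketch. The `chat_model` function is a made-up stand-in, not any real API; the point is only the shape of the thing: text goes in, text comes out, and nothing else happens.

```python
# A chat model is, at its core, a function from text to text.
# "chat_model" is a stand-in stub here, not a real API.

def chat_model(prompt: str) -> str:
    # In reality this is a large neural network; here, a placeholder.
    return f"Here is an answer about: {prompt}"

# Everything the model "knows" about you has to be inside the prompt itself.
history = []

def ask(question: str) -> str:
    history.append(f"User: {question}")
    reply = chat_model("\n".join(history))
    history.append(f"Assistant: {reply}")
    return reply

answer = ask("Write me an email")
# The model returned a string. It never touched a file, opened a program,
# or sent anything anywhere. A smart head without hands.
```

Notice that even the "conversation" is an illusion: the whole history gets pasted back into the prompt on every turn. Close the window and it's gone.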

It's a smart head without hands.

Most people are here. And most articles about AI describe this and pretend it's the whole story. It's not.


1.5 Generative AI - when the model creates, not just answers

A special category that sits alongside chat. Models that don't just answer in text but create entirely new content:

  • Images: Midjourney, DALL-E, Flux. Describe what you want, the model draws it. From realistic photos to artistic illustrations.
  • Music: Suno, Udio. Write song lyrics and a style, the model composes and sings a complete track. With vocals, instruments, production.
  • Video: Sora, Veo, Runway. Generates video from a text description or photo. Still short clips for now, but quality improves every month.
  • Voice: ElevenLabs, Cartesia. Clones a voice from a few seconds of audio, or creates an entirely new one.

This is the part of AI that fascinates (or terrifies) most people. But the important thing to understand is that generative models are still just tools. You tell them what to do, they do it. They don't come up with anything on their own, don't launch anything, don't connect to anything. A smart brush, not a painter.


2. Modern web interface - chat that connects to your tools

Claude, ChatGPT, and Gemini have all added features over the past year that make them more than just chatbots.

Claude has Artifacts - mid-conversation, it renders interactive web pages, charts, documents, or mini-apps. And Projects, where you upload documents and instructions for ongoing work. Since January 2026, also MCP Apps - direct connections to Slack, Asana, Figma, Google Drive, and dozens of other tools. No more copying data back and forth. Claude sees your projects directly.

ChatGPT has Canvas - an editable document alongside the chat where you can collaboratively write and edit text or code. Plus Code Interpreter, which runs Python code and analyzes data. And GPTs - custom versions of ChatGPT trained for specific tasks.

Gemini has its own Canvas for prototyping apps directly in chat, Gems connected to Google Docs, and Deep Research, which generates a research report with citations from a single question.

All three platforms are moving in the same direction: from chatbot to workspace. But the rule still holds: everything happens in the browser. The model has no access to your computer. Can't see your files. Can't send emails for you. Doesn't know what you did yesterday. Every conversation starts from zero.

Smarter and more connected, but still locked inside a browser window.


3. Coding agent - a programmer in the terminal

This is the first major leap. And this is where most people drop off, because this stuff is mostly discussed in tech communities.

Claude Code, Codex, Gemini CLI. These tools run directly on your computer. Not in a browser - in the terminal. And they have access to your files.

What that means in practice:

  • Actually builds things. Writes code, creates files, builds websites, installs libraries.
  • Works with massive data. Analyzes a spreadsheet with 100,000 rows, not just "look at the first 20."
  • Sees the whole project. Not just one piece of text you pasted - the entire structure, all files, change history.
  • Runs and tests. Writes code, runs it, sees the error, fixes it, runs again.
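That last bullet, the write-run-fix loop, is the core trick, and it fits in a few lines of Python. This is a simplified sketch: `fake_model` is a stub that stands in for the LLM (its first draft deliberately contains a bug), but the run-and-retry loop around it is real and runnable.

```python
import os
import subprocess
import sys
import tempfile

def run_python(code: str):
    # Write the draft to a temp file, execute it, and capture any traceback.
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    result = subprocess.run([sys.executable, path], capture_output=True, text=True)
    os.unlink(path)
    return result.returncode, result.stderr

def fake_model(task, error=None):
    # Stand-in for the LLM: the first draft has a bug, the retry fixes it.
    if error is None:
        return "print(1 / 0)"      # buggy first attempt
    return "print('done')"         # corrected after reading the traceback

# The loop: write -> run -> read the error -> fix -> run again.
code = fake_model("print something")
for _ in range(3):
    returncode, stderr = run_python(code)
    if returncode == 0:
        break                      # the program ran cleanly
    code = fake_model("print something", error=stderr)
```

A real coding agent does exactly this, just with an actual model in place of `fake_model` and your whole project in place of one temp file.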

And the leap from chat is enormous. I personally used Claude Code to build an app that's on the App Store. Several more in TestFlight for personal use. Websites, internal tools for my company, automations. I'm not a programmer. Never have been. But a coding agent doesn't need you to know how to code. It needs you to know how to describe what you want.

But: you have to start it yourself. Tell it what to do yourself. And when it's done, it waits. No memory between sessions, no internet access, no context about your life.

It's a brilliant programmer you hire for one-off jobs. Does the work and leaves.


4. AI agent - a colleague that never sleeps

And here's the chasm.

An agent is a coding agent that got memory, tools, and initiative. Runs 24/7, remembers what it worked on last week, and doesn't wait for you to ask.

And there are several types:

Personal assistant - connected to your email, calendar, messengers, smart home. Sends you a morning briefing, monitors meetings during the day, reminds you what you missed in the evening. This isn't Siri understanding three commands. It's a colleague who knows your projects, knows who you work with, and when it sees a problem, solves it before you even notice.

Research agent - give it a topic and it autonomously conducts complete research. AutoResearch is an open-source agent where you say "research X" and it finds papers, reads them, compares them, and writes a research report with citations. Not like Google search where you get 10 links. The agent reads 200 papers and synthesizes conclusions.

Security agent - last week, an AI agent from CodeWall autonomously breached McKinsey's internal platform in 2 hours. It chose the target itself, found vulnerabilities itself, exploited them itself. Gained access to 46 million conversations and hundreds of thousands of internal documents. McKinsey patched it within 24 hours of disclosure. The same technology protects and attacks.

Trading agent - trades on the stock market, Polymarket, monitors crypto wallets. Watches markets 24/7 and reacts faster than any human.

Coding agent on steroids - when an agent needs to build something, it calls a coding agent from level 3 and tells it what to create. An agent that launches other agents.

And by the way, an agent doesn't just call APIs. It can control your entire computer like a human. Sees the screen, moves the mouse, types on the keyboard, opens programs. The only thing holding it back today is system permissions. On Mac, that popup: "Do you want to allow access?" Give it permission, and it has the same access as you sitting at the keyboard.
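All the agent types above share one skeleton: wake up, check the world, decide, act, remember. Here's a toy Python sketch of that cycle. Everything in it (the memory file, the tool stubs, the "briefing" rule) is a made-up illustration, not a real agent framework.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical persistent memory: survives between runs, unlike a chat session.
MEMORY = Path(tempfile.gettempdir()) / "agent_memory_demo.json"
if MEMORY.exists():
    MEMORY.unlink()  # start fresh for this demo

# Tool stubs standing in for real integrations (email, calendar, ...).
TOOLS = {
    "send_email": lambda to, body: f"email to {to}: {body}",
    "read_calendar": lambda: ["9:00 standup", "14:00 review"],
}

def agent_tick():
    # One cycle of an always-on agent: check the world, decide, act, remember.
    memory = json.loads(MEMORY.read_text()) if MEMORY.exists() else []
    events = TOOLS["read_calendar"]()
    if events and "briefing sent" not in memory:
        TOOLS["send_email"]("you", "Today: " + ", ".join(events))
        memory.append("briefing sent")
    MEMORY.write_text(json.dumps(memory))
    return memory

first = agent_tick()   # sends the morning briefing
second = agent_tick()  # remembers it already did, so it stays quiet
```

The difference from every level before: nobody typed a prompt. The loop runs on its own, and the memory file is what separates "a colleague that never sleeps" from a chat window that forgets you the moment you close it.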


Why this matters

When someone says "AI is just a statistical model that completes words," they're talking about level 1. And they're right. About level 1.

When someone says "AI helps me write emails," they're talking about level 2. Fine. A Yugo drives too.

But at level 4, an agent autonomously hacked McKinsey in two hours. Another agent conducts complete academic research overnight. Another trades stocks while you sleep. And a personal assistant sends you a summary of everything that happened each morning.

NVIDIA last week announced NemoClaw - a security layer on top of OpenClaw, an open-source platform for AI agents. In their announcement, they called OpenClaw "the operating system for personal AI." When NVIDIA - the company that supplies the hardware underneath most of the AI on the planet - says agents are an operating system, it's not marketing. It's a roadmap.


One sentence for each level

  • Chat = you ask, it answers
  • Generative AI = you say what to create, it creates it (image, music, video)
  • Web interface = you ask, it answers with interactive results, connected to your tools
  • Coding agent = you say what to build, it builds it on your computer
  • AI agent = a collaborator that knows what you're doing, thinks on its own, and acts on its own

And if you're wondering where most people fall on this spectrum, they're at level 1 debating whether AI is dangerous.


What's missing here

Everything above is stuff anyone can get today. Free or for a few bucks a month. Download, install, use.

What we haven't talked about at all are systems that regular people don't have access to. Proprietary models trained by governments for intelligence purposes. BlackRock's models analyzing global financial markets in real time. Weather prediction models that are now more accurate than traditional physics simulations. Drug discovery models that screen candidates in weeks instead of the years a team of scientists would need. Military systems for autonomous decision-making.

All of this exists, works, and is an order of magnitude beyond what I've described here. And we're still only in March 2026.


This entire article started in Telegram. I told my agent what I wanted. It wrote the draft, I edited it, it pushed to GitHub, and that's how it appeared here on the web. That's exactly the point.