AI Terminology
One goal of this course is for you to be able to follow conversations about AI—in the news, on podcasts, or in technical discussions. Throughout the semester, we'll introduce key terms and concepts. This glossary will help you track them.
# How AI Works
- Token: The basic unit AI processes—roughly a word or word piece. "Tokenization" splits text into these units (see the short code sketch after this list).
- Context window: How much text the AI can "see" at once (e.g., 128K tokens). Limits how much you can include in a conversation.
- Parameters / weights: The numbers inside a neural network that determine its behavior. GPT-4 is estimated to have hundreds of billions of parameters; exact counts for frontier models are rarely disclosed.
- Training vs inference: Training is when the model learns from data; inference is when it generates responses. Training is enormously expensive; inference is comparatively cheap per request, though it adds up at scale.
- Compute: Computational resources (GPUs, TPUs, training runs)—the currency of AI development. More compute generally means better models.
- Transformer / attention: The architecture behind modern AI. "Attention" lets the model focus on relevant parts of input.
- Embeddings: How AI represents words/concepts as lists of numbers, where similar meanings are close together in "vector space."
- World model: An AI's internal representation of how things work—what causes what, how objects behave, etc.
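
To make a few of these terms concrete, here is a minimal Python sketch of tokenization, the context window, and embeddings. It assumes the `tiktoken` and `numpy` packages are installed; the context-window size and the embedding vectors are made-up illustrative numbers, not output from a real model.

```python
import numpy as np
import tiktoken  # OpenAI's open-source tokenizer library

# Tokenization: split text into the integer IDs the model actually processes.
enc = tiktoken.get_encoding("cl100k_base")
text = "Large language models predict the next token."
tokens = enc.encode(text)
print(tokens)              # a list of integer token IDs
print(enc.decode(tokens))  # decoding round-trips back to the original text

# Context window: the model can only attend to a fixed number of tokens at once.
CONTEXT_WINDOW = 128_000   # illustrative limit ("128K tokens")
print(f"{len(tokens)} of {CONTEXT_WINDOW} tokens used")

# Embeddings: words/concepts as vectors, where similar meanings sit close
# together. These 3-D vectors are invented for illustration, not real model output.
embeddings = {
    "dog":       np.array([0.90, 0.10, 0.00]),
    "puppy":     np.array([0.85, 0.15, 0.05]),
    "economics": np.array([0.05, 0.20, 0.95]),
}

def cosine_similarity(a, b):
    """Standard similarity measure in 'vector space': 1.0 means same direction."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity(embeddings["dog"], embeddings["puppy"]))      # high (~1.0)
print(cosine_similarity(embeddings["dog"], embeddings["economics"]))  # low
```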
# How AI Learns
- Next-token prediction: The core task LLMs learn—given text so far, predict the next word. Surprisingly powerful (a toy example follows this list).
- Autoregressive: Generating text one token at a time, feeding each output back as input for the next prediction.
- Pre-training: Learning patterns from massive amounts of text (the expensive part).
- Fine-tuning: Additional training on specific data to specialize behavior.
- RLHF (Reinforcement Learning from Human Feedback): Training AI to be more helpful by having humans rate or rank its responses and nudging the model toward the preferred ones.
- Constitutional AI: Anthropic's approach—training AI using written principles rather than just human ratings.
- Synthetic data: Training data generated by AI itself—increasingly used to improve models.
- Scaling laws: The empirical finding that model performance improves predictably as data, compute, and parameters increase.
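
Next-token prediction and autoregressive generation can be seen in a toy "model" that simply counts which word follows which in a tiny corpus. This is a deliberately crude stand-in for pre-training (a real transformer learns far richer patterns), but the generation loop has the same shape.

```python
import random
from collections import defaultdict, Counter

# "Pre-training": learn statistics from text. Here the "model" is just counts
# of which word follows which (a bigram model), standing in for the far more
# powerful patterns a real LLM learns from massive text corpora.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()
counts = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    counts[current_word][next_word] += 1

def predict_next(word):
    """Next-token prediction: sample the next word given the current one."""
    options = counts[word]
    words, weights = list(options), list(options.values())
    return random.choices(words, weights=weights, k=1)[0]

# Autoregressive generation: each predicted token is fed back in as input
# for the next prediction, one token at a time.
token = "the"
output = [token]
for _ in range(6):
    token = predict_next(token)
    output.append(token)
print(" ".join(output))   # e.g. "the cat sat on the rug ."
```

A real LLM replaces the bigram counts with a transformer holding billions of parameters, but it is trained on the same basic objective: predict the next token.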
# AI Capabilities
- In-context learning: The model "learning" from examples in your prompt without any weight updates—one of the most surprising discoveries about large models.
- Few-shot / zero-shot: Few-shot means giving examples in the prompt; zero-shot means no examples.
- Emergent capabilities: Abilities that appear suddenly as models get larger (e.g., arithmetic, reasoning).
- Chain of thought: Getting AI to "think step by step," which often improves reasoning (see the prompt examples after this list).
- Hallucination: When AI generates plausible-sounding but false information.
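
The prompting terms above are easiest to see in actual prompts. The sketch below only builds the prompt strings (the reviews and the math problem are invented examples); sending them to a model is left to the API example later in this glossary.

```python
# Zero-shot: ask directly, with no examples in the prompt.
zero_shot = (
    "Classify the sentiment of this review as positive or negative:\n"
    "'The battery died after two days.'"
)

# Few-shot / in-context learning: include worked examples in the prompt.
# The model picks up the pattern from context alone, with no weight updates.
few_shot = """Classify the sentiment of each review as positive or negative.

Review: "I loved every minute of it."
Sentiment: positive

Review: "Total waste of money."
Sentiment: negative

Review: "The battery died after two days."
Sentiment:"""

# Chain of thought: ask the model to reason step by step before answering,
# which often improves performance on multi-step problems.
chain_of_thought = (
    "A store sells pens in packs of 12. If I need 40 pens, how many packs "
    "should I buy? Think step by step, then state the final answer."
)

print(zero_shot, few_shot, chain_of_thought, sep="\n\n---\n\n")
```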
# The Model Landscape
- Foundation model: A large pre-trained model that can be adapted for many tasks (GPT-4, Claude, Gemini).
- Frontier model: The most capable models from leading labs (OpenAI, Anthropic, Google).
- Open-weight model: A model whose weights are publicly available to download and run yourself (Llama, Mistral); see the short example after this list.
- Multimodal: AI that handles multiple types of input/output (text, images, audio, video).
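
To make "open-weight" concrete, the sketch below downloads and runs an open-weight model locally with the Hugging Face `transformers` library (assuming `transformers` and `torch` are installed). GPT-2 is used as a small, freely downloadable example; a Llama or Mistral checkpoint would load the same way if you have access to it.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Open-weight model: the weights are public, so you can download and run
# them yourself. GPT-2 is used here only because it is small and ungated;
# Llama or Mistral checkpoints load the same way via from_pretrained().
model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("Artificial intelligence is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```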
# Safety and Alignment
- Alignment: Making AI systems behave according to human values and intentions.
- Misuse vs misalignment: Important distinction—misuse is bad actors using capable AI; misalignment is AI itself pursuing wrong goals.
- Jailbreaking: Tricks to make AI bypass its safety guidelines.
- Red teaming: Deliberately testing AI systems for vulnerabilities and harmful outputs.
- Interpretability: Understanding what's happening inside neural networks (largely unsolved).
- Mechanistic interpretability: The specific research field trying to reverse-engineer neural network circuits to understand how they work.
# Using AI
- Prompt engineering: The practice of crafting inputs to get better outputs from AI—though some argue good writing matters more than special techniques.
- System prompt: Hidden instructions that shape an AI's behavior before you interact with it.
- API (Application Programming Interface): How software talks to AI services programmatically.
- Temperature: Controls randomness in AI outputs (low = predictable, high = creative).
- RAG (Retrieval-Augmented Generation): Giving AI access to relevant external documents at query time to improve accuracy (see the API sketch after this list).
- Agents / agentic AI: AI systems that can take actions, use tools, and work autonomously toward goals.
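
Several of these terms show up together in a single API call. The sketch below uses the OpenAI Python SDK as one example (it assumes the `openai` package is installed and an `OPENAI_API_KEY` is set; the model name and the documents are placeholders). Other providers' APIs follow the same pattern of a system prompt, user messages, and a temperature setting.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# RAG in miniature: "retrieve" relevant documents (here by naive keyword
# overlap over a tiny in-memory list) and paste them into the prompt so the
# model can ground its answer in them. The documents are placeholders.
documents = [
    "Office hours are Tuesdays 2-4pm in Room 301.",
    "The final project proposal is due in week 10.",
]
question = "When are office hours?"
retrieved = [
    doc for doc in documents
    if set(doc.lower().split()) & set(question.lower().split())
]
context = "\n".join(retrieved)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name; use whatever model you have access to
    temperature=0.2,      # low temperature -> more predictable, less varied output
    messages=[
        # System prompt: instructions that shape behavior before the user's turn.
        {"role": "system", "content": (
            "You are a concise course assistant. "
            "Answer only from the provided context."
        )},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ],
)
print(response.choices[0].message.content)
```

In a production RAG system the keyword match would typically be replaced by embedding-based retrieval over a vector database, but the flow is the same: retrieve first, then generate.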
# The Big Picture
- AGI (Artificial General Intelligence): Hypothetical AI that matches or exceeds human capability across all domains.
- Capabilities research: Work on making AI more powerful.
- Alignment research / AI safety: Work on making AI safe and beneficial.