All Posts
Managing State for Streaming AI Responses
LLM responses arrive as chunks, not all at once. Handle loading, streaming, completion, and errors without breaking the user experience.
Designing UI for Streaming AI Responses
AI systems stream responses with variable length and timing. Here's how to design interfaces that show progress immediately and handle uncertainty gracefully.
Prompt Caching: Design for Reuse
Structure prompts to get the most out of Anthropic's prompt caching, which can cut costs by up to 90% and latency by up to 85% for repeated context.
LLM Evals: Testing AI Outputs Systematically
How to test LLM outputs with code-based grading, human evaluation, and LLM-as-judge. When to use each method and why statistical rigor matters.
Designing Error Messages for LLMs
Error messages consume context and affect LLM decision-making. Structure errors as data, use reference IDs for details, and return actionable recovery paths.
Understanding MCP Resources
Resources represent data or files that an MCP client can read. A case study of the SQLite MCP server shows how resources and tools work together.
Tool Output Design for Context Efficiency
How to design tool responses that preserve context space for what matters. Filter early, return minimal data, and structure outputs for LLM consumption.
Give AI Agents the Map First
AI agents work better when they see the full structure upfront, then make targeted requests. How to use progressive disclosure for efficient context management.
Context Escape Velocity
How to recognize when your conversation has grown too large to be effective, and what to do about it.
Anatomy of a Context Window
Understanding what fills an LLM's context window and how it affects model behavior.
Getting Your Next.js Site Indexed on Google
Set up Google Analytics, verify your domain with DNS, and get your Next.js site appearing in search results.
Deploying Next.js to Vercel with Git Integration
Connect your GitHub repository to Vercel for automatic deployments every time you push code.
What is a Token?
Definition and explanation of tokens in large language models.
Model Context Protocol: Connecting AI to Your Tools
MCP provides a standardized way for AI models to interact with tools, from Figma to your calendar to custom workflows you build yourself.
Prompting Techniques That Actually Work
Five prompting techniques that improve LLM outputs: few-shot learning, chain-of-thought reasoning, XML structure, output constraints, and prompt chaining.
How Prompt Priming Shapes LLM Responses
Your prompt's opening sets the context for the entire response.
How LLMs Think and Respond
LLMs generate text one token at a time. Understanding how they convert text to vectors, use attention to weigh context, and predict probabilities explains their behavior.
Debugging LLMs: Understanding Attention, Tokens, and Context
When models fail or behave unexpectedly, you need to understand why. Practical debugging techniques for tokenization, attention patterns, and context limits.
Progressive Disclosure in Agent Skills
The architectural pattern that makes Agent Skills scalable: load only what's needed, when it's needed.
Building Agent Skills: A Practical Guide
Anthropic's Agent Skills let you equip Claude with specialized capabilities through reusable skill packages. Here's how to build them.