Lesson 9: Basic Agent
The simplest possible agent: System Prompt + User Input → LLM → Output. Each message is independent—no memory between turns.
- Agno: What it is and why we're using it over alternatives.
- The Agent Loop: System prompt + user input → LLM → response.
- Stateless Design: Why this agent "forgets" everything between turns.
- The Missing Context: What ChatGPT does that this agent doesn't.
- Phoenix Tracing: Observing exactly what gets sent to the LLM.
What is Agno?
Agno is a lightweight Python framework for building AI agents. It handles the boilerplate—LLM calls, tool execution, memory storage—so you can focus on agent logic.
Why Agno?
Agno sits in a sweet spot: thin enough to understand, complete enough to build real systems.
The abstraction is minimal. When you call an agent, you can trace exactly what messages go to the LLM, what comes back, and how tools get invoked. No magic, no hidden prompts, no framework doing things you can't inspect.
Persistence is built in. Pass a PostgreSQL connection string and Agno handles session storage, conversation history, and long-term memory. No need to wire up your own database layer or fight with ORMs.
Tools work two ways: Python functions you define inline, or MCP servers for external integrations. Same agent can use both. The LLM decides when to call what.
Multi-agent coordination is native. Define specialist agents, put them on a team, let a leader delegate. This pattern scales from simple handoffs to complex workflows.
Observability comes free. Agno emits OpenTelemetry traces, so Phoenix (or any OTEL-compatible backend) captures every LLM call, tool execution, and agent interaction automatically.
Alternatives (Brief)
- LangChain: Feature-rich but heavy abstraction. Good for prototypes, harder to debug in production.
- LlamaIndex: Excellent for RAG, less focused on agent orchestration.
- Pydantic AI: Type-safe, great DX, newer ecosystem.
- CrewAI/AutoGen: Multi-agent focused, opinionated workflows.
- Raw API calls: Maximum control, maximum boilerplate.
We use Agno for that balance: transparent enough to understand, complete enough to build real systems. When something breaks, you can trace exactly what happened.
The Stateless Agent
Here's the complete code for a basic agent:
"""
Lesson 9: Basic Agent
The simplest agent loop: System Prompt + User Input → LLM → Output.
Each message is independent - no memory between turns. This is a stateless
agent that processes each request in isolation.
Run: uv run 09-basic-agent.py
Try: "Write a haiku about coding"
Observe in Phoenix (http://localhost:6006):
- Single LLM span per request
- System prompt + user input in messages
- No prior context injected
Reset: uv run tools/reset_data.py
"""
import os
from dotenv import load_dotenv
from phoenix.otel import register
from agno.agent import Agent
from agno.models.openai import OpenAIChat
load_dotenv()
register(project_name="09-basic-agent", auto_instrument=True, batch=True, verbose=True)
agent = Agent(
    name="Helpful Assistant",
    model=OpenAIChat(id=os.getenv("OPENAI_MODEL_ID")),
    instructions="You are a helpful assistant. Be concise.",
    markdown=True,
)
agent.cli_app(stream=True)
Breaking It Down
Environment setup:
load_dotenv()
Loads variables from your .env file into the environment. The OpenAI client automatically reads OPENAI_API_KEY from the environment—you don't pass it explicitly. We read OPENAI_MODEL_ID manually to configure which model to use.
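A minimal `.env` for this lesson might look like this (the values are placeholders, not real credentials):

```shell
# .env — loaded by load_dotenv(); never commit this file
OPENAI_API_KEY=sk-your-key-here
OPENAI_MODEL_ID=gpt-4o-mini
```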
Phoenix tracing:
register(project_name="09-basic-agent", auto_instrument=True, batch=True, verbose=True)
Sends all LLM calls to Phoenix. Every request becomes a trace you can inspect.
Agent definition:
agent = Agent(
    name="Helpful Assistant",
    model=OpenAIChat(id=os.getenv("OPENAI_MODEL_ID")),
    instructions="You are a helpful assistant. Be concise.",
    markdown=True,
)
- name: Label for logging/tracing
- model: Which LLM to use (reads model ID from env)
- instructions: System prompt sent with every request
- markdown: Format responses as markdown
Interactive loop:
agent.cli_app(stream=True)
Starts a REPL. Type messages, see responses stream back.
Run It
uv run 09-basic-agent.py
Try these prompts:
> Write a haiku about coding
> What did I just ask you?
> Who are you?
Notice something? The agent can't answer "What did I just ask you?" It has no idea. Each turn is isolated.
What's Missing: The ChatGPT Illusion
When you use ChatGPT, it feels like a continuous conversation. Ask a question, get an answer, ask a follow-up—it remembers. But that's not magic. Here's what's actually happening:
ChatGPT (Stateful UI)
Turn 1: You send "Write a haiku about coding"
→ OpenAI receives: [system prompt, "Write a haiku about coding"]
→ Response: haiku
Turn 2: You send "Make it funnier"
→ OpenAI receives: [system prompt, "Write a haiku about coding", haiku, "Make it funnier"]
→ Response: funnier haiku
The ChatGPT interface stores your conversation and replays the entire history with each request. The LLM itself is stateless—it just processes whatever messages arrive.
Our Agent (Stateless)
Turn 1: You send "Write a haiku about coding"
→ OpenAI receives: [system prompt, "Write a haiku about coding"]
→ Response: haiku
Turn 2: You send "Make it funnier"
→ OpenAI receives: [system prompt, "Make it funnier"]
→ Response: "Make what funnier?"
No history. Each turn starts fresh. The agent genuinely doesn't know what "it" refers to.
Why This Matters
An LLM is fundamentally a function. You pass input, it produces output based on what it learned during training plus whatever you just sent it. Nothing more.
The function is non-deterministic—same input might produce slightly different output each time—but it's still just a function. It has no memory between calls. If you don't send the conversation history, the LLM doesn't have it. There's no hidden state, no background process remembering things.
This stateless design is actually useful in many cases: one-shot tasks like translation, parallel processing where each item is independent, or situations where you explicitly don't want conversation history stored.
The next lessons add state back in controlled ways: session history (Lesson 10), long-term memory (Lesson 11). But it's important to understand the baseline first—without explicit history management, each call stands alone.
Observe in Phoenix
Open http://localhost:6006 after running the agent.
You'll see:
- Project: 09-basic-agent
- Traces: One per user input
- Spans: Single LLM call per trace
Click a trace to inspect:
- Input messages: System prompt + user message (that's it—no history)
- Output: The LLM response
- Latency: How long the call took
- Tokens: Input/output token counts
This is the baseline. As we add history, memory, and tools, Phoenix will show the added complexity—more spans, longer message arrays, tool calls.
Key Concepts
| Concept | This Lesson |
|---|---|
| Agent | Wrapper around an LLM with configuration |
| System prompt | Instructions sent with every request |
| Stateless | No memory between turns |
| Trace | Complete record of one agent interaction |
What's Next
This agent forgets everything. In Lesson 10, we'll add session history—the agent will remember what you said within a conversation, just like ChatGPT appears to.