promptdojo_

Reading the response — content blocks, stop_reason, usage — step 1 of 9

Three things every response carries

Lesson 01 wired the messages pattern — how you talk TO the model. This lesson is the other half: how you read what comes BACK.

Every modern LLM API response carries three things you'll touch on every call. The names differ slightly between Anthropic and OpenAI, but the shape is the same:

| Anthropic Messages API | OpenAI Responses API | What it is |
| --- | --- | --- |
| response.content[] (list of blocks) | response.output[] (list of blocks) | What the model said |
| response.stop_reason | response.status (+ response.output[].finish_reason per block) | Why it stopped |
| response.usage.input_tokens / .output_tokens | response.usage.input_tokens / .output_tokens | What it cost |

You'll iterate content to get text and tool calls, branch on stop_reason to drive your loop, and sum usage to track cost. This lesson locks in the access patterns.
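All three access patterns fit in a few lines. A sketch follows, with a SimpleNamespace mock standing in for a real SDK response (a real one comes back from client.messages.create); the field names are the Anthropic ones from the table above.

```python
from types import SimpleNamespace

# Mock standing in for a real Anthropic Messages API response.
response = SimpleNamespace(
    content=[SimpleNamespace(type="text", text="Hello!")],
    stop_reason="end_turn",
    usage=SimpleNamespace(input_tokens=12, output_tokens=5),
)

# 1. Iterate content for what the model said
text = "".join(b.text for b in response.content if b.type == "text")

# 2. Branch on stop_reason to drive your loop
done = response.stop_reason == "end_turn"

# 3. Sum usage to track cost
total_tokens = response.usage.input_tokens + response.usage.output_tokens

print(text, done, total_tokens)  # → Hello! True 17
```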

Content is a LIST, even when there's one block

The single most-shipped beginner bug:

# WRONG — assumes content is a string
print(response.content)              # → [TextBlock(text='...', type='text')] (the list repr, not your text)

# WRONG — assumes the first block is always there and always text
print(response.content[0].text)      # → fails when content[0] is a tool_use block

The right way is to iterate, branch on .type, and handle each block kind:

for block in response.content:
    if block.type == "text":
        print(block.text)
    elif block.type == "tool_use":
        print(f"calling {block.name} with {block.input}")

You'll see text, tool_use, and (when extended thinking is on) thinking blocks. Always branch on type; never index blindly.
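That branching is worth wrapping in a small helper. The sketch below uses SimpleNamespace stand-ins for SDK blocks; the collect_blocks name and the get_weather tool are made up for illustration.

```python
from types import SimpleNamespace

def collect_blocks(content):
    """Split a content list into joined text and a list of tool calls.
    Unhandled block types (e.g. thinking) fall through safely."""
    text_parts, tool_uses = [], []
    for block in content:
        if block.type == "text":
            text_parts.append(block.text)
        elif block.type == "tool_use":
            tool_uses.append({"name": block.name, "input": block.input})
    return "".join(text_parts), tool_uses

# Mixed content: a thinking block, a tool call, then text.
content = [
    SimpleNamespace(type="thinking", thinking="..."),
    SimpleNamespace(type="tool_use", name="get_weather", input={"city": "Oslo"}),
    SimpleNamespace(type="text", text="Checking the weather."),
]
text, tools = collect_blocks(content)
print(text)              # → Checking the weather.
print(tools[0]["name"])  # → get_weather
```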

stop_reason is the loop driver

Five values you'll see, ordered by frequency:

| Value | What it means |
| --- | --- |
| end_turn | Model is done. Print the text and exit your loop. |
| tool_use | Model wants to run a tool. Run it, send back a tool_result, call again. |
| max_tokens | The response was cut off. Bump max_tokens or split the task. |
| stop_sequence | A custom stop string fired (rare). |
| pause_turn | A hosted tool's turn ran out of time (Anthropic's hosted tools only). |

Two stop reasons cover 99% of agent loops: end_turn and tool_use. The others are exception paths. Chapter 16 builds the full agent loop; this lesson teaches you to read the field.
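The loop shape that falls out of those two values can be sketched as follows. The fake_model stub stands in for a real API call (it returns tool_use once, then end_turn); run_loop and the other names are illustrative, not a real SDK API.

```python
def make_fake_model():
    """Stub for a real API call: tool_use on the first call, end_turn after."""
    calls = {"n": 0}
    def fake_model(messages):
        calls["n"] += 1
        if calls["n"] == 1:
            return {"stop_reason": "tool_use",
                    "tool_name": "lookup", "tool_input": {"q": "x"}}
        return {"stop_reason": "end_turn", "text": "done"}
    return fake_model

def run_loop(model):
    messages = []
    while True:
        response = model(messages)
        if response["stop_reason"] == "end_turn":
            return response["text"]                 # happy path: done, exit
        elif response["stop_reason"] == "tool_use":
            result = f"ran {response['tool_name']}"  # run the real tool here
            messages.append({"role": "user", "content": result})
        elif response["stop_reason"] == "max_tokens":
            raise RuntimeError("truncated: bump max_tokens or split the task")
        else:
            raise RuntimeError(f"unexpected: {response['stop_reason']}")

print(run_loop(make_fake_model()))  # → done
```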

usage is your bill

response.usage.input_tokens          # what you sent (prompt + history)
response.usage.output_tokens         # what came back
response.usage.cache_read_input_tokens     # cached read (0.1× price)
response.usage.cache_creation_input_tokens # cache write (1.25× or 2× price)

The cache fields are zero unless you opted in to prompt caching (chapter 23 covers that). Input and output are non-zero on every call. Sum them across an agent loop and you have the cost.
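The summing can live in a small tracker you feed after every call. A sketch, assuming dict-shaped usage payloads; the per-million-token prices are placeholder assumptions, not real rates.

```python
class UsageTracker:
    """Accumulate token usage across an agent loop."""

    def __init__(self, input_price=3.0, output_price=15.0):
        # Prices in dollars per million tokens (illustrative placeholders).
        self.input_tokens = 0
        self.output_tokens = 0
        self.input_price = input_price
        self.output_price = output_price

    def add(self, usage):
        """Feed one response's usage after each call."""
        self.input_tokens += usage["input_tokens"]
        self.output_tokens += usage["output_tokens"]

    def cost(self):
        """Running cost in dollars so far."""
        return (self.input_tokens * self.input_price
                + self.output_tokens * self.output_price) / 1_000_000

tracker = UsageTracker()
tracker.add({"input_tokens": 1200, "output_tokens": 300})
tracker.add({"input_tokens": 1500, "output_tokens": 250})
print(tracker.input_tokens, tracker.output_tokens)  # → 2700 550
```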

The bug worth naming: agents that don't track usage find out about the bill at the end of the month, not in real time.

What you'll build

A parse_response(response) that takes a response dict (or real SDK object) and returns a structured {"text": ..., "tool_uses": [...], "stop_reason": ..., "tokens": ...} shape. Then a fix for the most common bug: hardcoding content[0].text even when the first block is a tool_use.

read, then continue.