Inside a message: blocks of content.
A plain text message is just a string. But content can also be a list of typed blocks: text, images, tool calls, tool results. The block, not the message, is the real unit of an LLM conversation, and it is where Anthropic's "content blocks" and OpenAI's "parts" differ most.
// content is a string…
{ "role": "user", "content": "Hello, Claude" }

// …or a list of typed blocks
{
  "role": "user",
  "content": [
    { "type": "text", "text": "Describe this:" },
    { "type": "image",
      "source": { "type": "base64", "media_type": "image/png", "data": "iVBORw0KGgo…" } },
    { "type": "text", "text": "…long doc…",
      "cache_control": { "type": "ephemeral" } }
  ]
}
// content is a string…
{ "role": "user", "content": "Hello, GPT" }

// …or a list of typed parts
{
  "role": "user",
  "content": [
    { "type": "text", "text": "Describe this:" },
    { "type": "image_url",
      "image_url": { "url": "data:image/png;base64,iVBORw0…" } }
  ]
}

// the "developer" role = newer name
// for "system" (high-priority rules)
{ "role": "developer", "content": "…" }
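The two shapes above differ mainly in how an image is wrapped, so a small translation layer can be sketched. The following is a minimal illustration, not an official mapping: the function names `anthropic_block_to_openai_part` and `convert_content` are hypothetical, only the text and base64 image blocks shown above are handled, and dropping `cache_control` is an assumption made because the OpenAI-style parts shown here carry no such field.

```python
from typing import Any, Union

Block = dict[str, Any]


def anthropic_block_to_openai_part(block: Block) -> Block:
    """Translate one Anthropic-style block into an OpenAI-style part.

    Hypothetical helper: covers only the two block types shown above.
    """
    kind = block.get("type")
    if kind == "text":
        # Assumption: cache_control is simply dropped, since the
        # OpenAI-style part in the example has no equivalent field.
        return {"type": "text", "text": block["text"]}
    if kind == "image":
        source = block["source"]
        if source.get("type") != "base64":
            raise ValueError(f"unsupported image source: {source.get('type')!r}")
        # Pack media type and base64 payload into a data: URL,
        # matching the image_url shape in the example.
        url = f"data:{source['media_type']};base64,{source['data']}"
        return {"type": "image_url", "image_url": {"url": url}}
    raise ValueError(f"unsupported block type: {kind!r}")


def convert_content(content: Union[str, list[Block]]) -> Union[str, list[Block]]:
    """Convert message content; plain strings pass through unchanged."""
    if isinstance(content, str):
        return content
    return [anthropic_block_to_openai_part(b) for b in content]
```

Tool calls and tool results are deliberately left out; they would need their own mapping, which the examples above do not show.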
A message is a stack of blocks.
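One way to make the stack-of-blocks view concrete is to normalize every message to a list before processing it. This is a small sketch under stated assumptions: `as_blocks` and `text_of` are hypothetical helper names, and treating a bare string as a single text block is a convention chosen here to unify the two shapes of `content` shown earlier.

```python
from typing import Any, Union

Block = dict[str, Any]
Content = Union[str, list[Block]]


def as_blocks(content: Content) -> list[Block]:
    """Return content as a list of typed blocks.

    Assumption: a bare string is treated as one text block.
    """
    if isinstance(content, str):
        return [{"type": "text", "text": content}]
    return list(content)


def text_of(message: dict[str, Any]) -> str:
    """Join the text blocks of a message, skipping non-text blocks."""
    blocks = as_blocks(message["content"])
    return "\n".join(b["text"] for b in blocks if b.get("type") == "text")
```

With this in place, code that walks a conversation can handle one shape only, regardless of whether a message arrived as a string or as a list.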