Claude 3.7 Thinking and Tool Use Dev Guide

Overview

Claude 3.7 Sonnet is the first Claude model to offer step-by-step reasoning, which Anthropic has termed "extended thinking". With Claude 3.7 Sonnet, this feature is optional: you can choose between standard responses and extended thinking for advanced reasoning tasks.

Thinking Blocks Structure

Thinking blocks represent Claude 3.7 Sonnet's internal thought process. When thinking is enabled, Claude will show its reasoning through thinking content blocks in the response.

Response Format

{
    'stream': {
        'contentBlockDelta': {
            'delta': {
                'text': 'string',
                'toolUse': {
                    'input': 'string'
                },
                'reasoningContent': {
                    'text': 'string',
                    'signature': 'string'
                }
            },
            'contentBlockIndex': 123
        }
    }
}
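
As a sketch, here is how a streaming response with thinking enabled might be consumed via boto3's Converse API; the model ID and region are placeholders, and the thinking configuration is passed through additionalModelRequestFields:

import boto3

client = boto3.client("bedrock-runtime", region_name="us-west-2")

response = client.converse_stream(
    modelId="us.anthropic.claude-3-7-sonnet-20250219-v1:0",  # placeholder model ID
    messages=[{"role": "user", "content": [{"text": "Why can't we see the far side of the Moon?"}]}],
    additionalModelRequestFields={"thinking": {"type": "enabled", "budget_tokens": 4096}},
)

reasoning_text = ""
answer_text = ""
for event in response["stream"]:
    delta = event.get("contentBlockDelta", {}).get("delta", {})
    if "reasoningContent" in delta:
        # Thinking deltas stream incremental reasoning text (and eventually a signature)
        reasoning_text += delta["reasoningContent"].get("text", "")
    elif "text" in delta:
        answer_text += delta["text"]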

Request Example

{
    "content": [
        {
            "reasoningContent": {
                "reasoningText": {
                    "text": "This is an astronomy question about why we can't see the far side of the Moon from Earth. \nI will use the search_wikipedia tool to find information about the Moon's rotation and tidal locking.",
                    "signature": "eyJhbGciOiJFUzI1NiIsImtpZCI6ImtleS0xMjM0In0.eyJoYXNoIjoiYWJjMTIzIiwiaWF0IjoxNjE0NTM0NTY3fQ...."
                }
            }
        }
    ]
}

Tool Use with Thinking

When using thinking with tool use, the conversation follows this pattern:

  1. First assistant turn: Initial user message → Assistant responds with thinking blocks followed by tool use requests
  2. Tool result turn: User message with tool results → Assistant responds with either more tool calls or just text (no thinking blocks in this response)

The complete flow typically follows these steps (a code sketch follows the list):

  1. User sends initial message
  2. Assistant responds with thinking blocks and tool requests
  3. Send tool results back as User message
  4. Assistant responds with either more tool calls or just text (no thinking blocks)
  5. If more tools are requested, repeat steps 3-4 until conversation is complete
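
A minimal sketch of this loop with boto3's Converse API; the search_wikipedia tool, the run_tool dispatcher, and the model ID are illustrative placeholders:

import boto3

client = boto3.client("bedrock-runtime")
MODEL_ID = "us.anthropic.claude-3-7-sonnet-20250219-v1:0"  # placeholder

tool_config = {
    "tools": [{
        "toolSpec": {
            "name": "search_wikipedia",  # hypothetical tool
            "description": "Search Wikipedia and return matching article extracts.",
            "inputSchema": {"json": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            }},
        }
    }]
}

messages = [{"role": "user", "content": [{"text": "Why can't we see the far side of the Moon from Earth?"}]}]

while True:
    response = client.converse(
        modelId=MODEL_ID,
        messages=messages,
        toolConfig=tool_config,
        additionalModelRequestFields={"thinking": {"type": "enabled", "budget_tokens": 4096}},
    )
    # Append the assistant message unmodified so reasoningContent blocks
    # (including signatures) are preserved for the next turn.
    messages.append(response["output"]["message"])
    if response["stopReason"] != "tool_use":
        break  # final text answer; the conversation is complete
    # Run each requested tool and send the results back as a user turn.
    tool_results = []
    for block in response["output"]["message"]["content"]:
        if "toolUse" in block:
            tool_use = block["toolUse"]
            result = run_tool(tool_use["name"], tool_use["input"])  # run_tool is a hypothetical dispatcher
            tool_results.append({"toolResult": {
                "toolUseId": tool_use["toolUseId"],
                "content": [{"json": result}],
            }})
    messages.append({"role": "user", "content": tool_results})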

If thinking is enabled but a final assistant message doesn't start with a thinking block (preceding the last set of tool_use and tool_result blocks), you may see:

validationException - The model returned the following errors: messages.1.content.0.type: Expected `thinking` or `redacted_thinking`, but found `tool_use`. When `thinking` is enabled, a final `assistant` message must start with a thinking block (preceeding the lastmost set of `tool_use` and `tool_result` blocks).

If a thinking block does not contain the complete thinking text, you will receive the following validation error; to resolve it, include the full accumulated thinking content in the reasoningText parameter:

validationException - The model returned the following errors: messages.1.content.0: When providing `thinking` or `redacted_thinking` blocks, the blocks must match the parameters during the original request.

Preserving Thinking Blocks

When passing thinking and redacted_thinking blocks back to the API in a multi-turn conversation, you must provide the complete, unmodified block for:

  • Reasoning continuity: Thinking blocks capture Claude's step-by-step reasoning that led to tool requests
  • Context maintenance: While tool results appear as user messages in the API structure, they're part of a continuous reasoning flow

Implementation Considerations

Thinking Budget

  • Minimum budget_tokens is 1,024 tokens
  • Anthropic recommends at least 4,000 tokens for comprehensive reasoning
  • budget_tokens is a target, not a strict limit - actual usage may vary
  • Expect potentially longer response times due to additional processing
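
For example, on the Converse API the thinking budget is set through additionalModelRequestFields; a sketch (model ID is a placeholder):

response = client.converse(
    modelId="us.anthropic.claude-3-7-sonnet-20250219-v1:0",  # placeholder
    messages=messages,
    additionalModelRequestFields={
        "thinking": {
            "type": "enabled",
            "budget_tokens": 4096,  # minimum 1,024; 4,000+ recommended for comprehensive reasoning
        }
    },
)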

Compatibility

  • Thinking isn't compatible with temperature, top_p, or top_k modifications
  • Thinking isn't compatible with forced tool use
  • You cannot pre-fill responses when thinking is enabled

Context Window and Token Usage

  • Thinking tokens count towards the context window and are billed as output tokens
  • Thinking tokens count towards your tokens-per-minute (TPM) service quota
  • In multi-turn conversations:
    • Thinking blocks from previous turns are stripped and do not count towards the context window
    • Exception: thinking blocks from the last turn, if it is an assistant turn
    • Only thinking blocks actually shown to the model are billed
  • Always send thinking blocks back with your requests - the system will validate and use them as needed
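
To see what you are billed for, you can inspect the usage block on a Converse response; a minimal sketch using the Converse API's usage field names:

usage = response["usage"]
# outputTokens includes any thinking tokens generated for this turn
print(f"input: {usage['inputTokens']}, output: {usage['outputTokens']}, total: {usage['totalTokens']}")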

Implementation Details for Signature Handling

  • Thinking blocks contain a signature field - a cryptographic token verifying the thinking block was generated by Claude

  • In streaming deltas, the signature arrives alongside the text on the reasoningContent delta (see the Response Format above). When sending accumulated thinking back in a request, however, the signature must be nested inside reasoningText - the reasoningContent object itself accepts only the reasoningText or redactedContent keys. The correct request structure is:

    {
        "reasoningContent": {
            "reasoningText": {
                "text": "thinking content here",
                "signature": "signature_value_here"
            }
        }
    }

  • Placing the signature directly on the reasoningContent object (or setting both union keys) produces errors like:

    Invalid number of parameters set for tagged union structure messages[1].content[0].reasoningContent. Can only set one of the following keys: reasoningText redactedContent.

    Unknown parameter in messages[1].content[0].reasoningContent: "signature" must be one of: reasoningText redactedContent
    
  • When accumulating thinking content from streaming responses (see the sketch after this list):

    • Preserve the signature from the last thinking block that contains one
    • Only update the signature variable when a non-empty signature is received
    • Include this signature when sending accumulated thinking back to the model
  • Occasionally Claude's internal reasoning will be flagged by automated safety systems. When this occurs, the thinking block is encrypted and returned as a redacted_thinking block (surfaced via the redactedContent key in the Converse API).
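
A sketch of that accumulation logic against the streaming deltas shown earlier, building the request-side block with the signature nested inside reasoningText:

reasoning_text, signature = "", ""
for event in response["stream"]:
    delta = event.get("contentBlockDelta", {}).get("delta", {})
    rc = delta.get("reasoningContent", {})
    reasoning_text += rc.get("text", "")
    if rc.get("signature"):
        signature = rc["signature"]  # only overwrite on a non-empty signature

# The complete accumulated block, ready to send back to the model.
thinking_block = {
    "reasoningContent": {
        "reasoningText": {
            "text": reasoning_text,
            "signature": signature,
        }
    }
}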
