Conversations
Build stateful agents with persistent conversations
Understanding multi-turn conversations
Multi-turn conversations in Scrapybara enable your agents to maintain context and state across multiple interactions. The Act SDK provides a structured way to manage these conversations through its message architecture.
Message architecture
The Act SDK uses a structured message system with three primary message types and five different part types. Understanding these components is crucial for building sophisticated multi-turn agents.
Message types
Message part types
Each message type contains various “parts” that serve different purposes:
Building multi-turn conversations
Instead of providing a single prompt
, you can pass a complete message history using the messages
parameter. This allows you to maintain the full conversation context.
The Act SDK returns a messages
field in the response that contains the complete conversation history. You can reuse this directly in your next act
call.
Python
TypeScript
Including screenshots in messages
Screenshots are a powerful way to provide visual context to your agent. You can include them in user messages using the ImagePart
type.
Python
TypeScript
Working with tools and reasoning
The Act SDK captures both tool calls and agent reasoning in its message architecture. Here’s how you can access and work with this information:
Examining tool calls and results
Python
TypeScript
Accessing agent reasoning
Python
TypeScript
Best practices for multi-turn conversations
-
Maintain message history: Always use the returned
messages
from each call to maintain conversation context. -
Clear instructions: Provide clear, specific instructions in each new user message.
-
Handle context length: For very long conversations, consider summarizing or truncating older messages to avoid exceeding model context limits.
-
Include visual context: Use screenshots when appropriate to provide additional context to the agent.
-
Monitor token usage: Track token usage through the
usage
field to prevent exceeding quotas or limits. -
Process message parts: Parse and handle different message parts appropriately based on their type.
Simple multi-turn example
Here’s an interactive Read-Eval-Print Loop (REPL) implementation that allows you to have ongoing conversations with your agent: