Multi-turn conversations in Scrapybara enable your agents to maintain context and state across multiple interactions. The Act SDK provides a structured way to manage these conversations through its message architecture.
The Act SDK uses a structured message system with three primary message types and five different part types. Understanding these components is crucial for building sophisticated multi-turn agents.
Each message type contains various “parts” that serve different purposes:
Instead of providing a single prompt, you can pass a complete message history using the messages parameter. This allows you to maintain the full conversation context.
The Act SDK returns a messages field in the response that contains the complete conversation history. You can reuse this directly in your next act call.
Screenshots are a powerful way to provide visual context to your agent. You can include them in user messages using the ImagePart type.
The Act SDK captures both tool calls and agent reasoning in its message architecture. Here’s how you can access and work with this information:
Maintain message history: Always use the returned messages from each call to maintain conversation context.
Clear instructions: Provide clear, specific instructions in each new user message.
Handle context length: For very long conversations, consider summarizing or truncating older messages to avoid exceeding model context limits.
Include visual context: Use screenshots when appropriate to provide additional context to the agent.
Monitor token usage: Track token usage through the usage field to prevent exceeding quotas or limits.
Process message parts: Parse and handle different message parts appropriately based on their type.
Here’s an interactive Read-Eval-Print Loop (REPL) implementation that allows you to have ongoing conversations with your agent: