Episodes

An episode is a complete training interaction between your model and a simulated scenario. Episodes are the primary unit of training for multi-turn agents.

Episode Lifecycle

Creating an Episode

To create an episode, POST to /api/v1/episodes with a collection and scenario ID. See the Quickstart for full code examples.
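As a minimal sketch of the request described above, the following builds the POST (without sending it) using only the Python standard library. The host `https://api.example.com`, the bearer-token header, and the body field names `collection` and `scenario_id` are assumptions for illustration; the Quickstart remains the authoritative reference for the request format.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # assumption: placeholder host


def build_create_episode_request(collection: str, scenario_id: str,
                                 api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST to /api/v1/episodes."""
    # Assumption: the field names "collection" and "scenario_id" are hypothetical.
    body = json.dumps({"collection": collection, "scenario_id": scenario_id})
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/episodes",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumption: bearer auth
        },
    )


# Hypothetical collection and scenario identifiers.
req = build_create_episode_request("workers_comp", "scn_001", "YOUR_API_KEY")
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it; building the request separately keeps the payload easy to inspect and test.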

Response

{
  "episode_id": "ep_abc123",
  "initial_observation": {
    "role": "assistant",
    "content": "Hello, I'm calling about my workers' compensation claim..."
  },
  "tools": [
    {
      "id": "medical_assessment",
      "type": "artifact",
      "name": "Medical Assessment",
      "description": "Create a medical assessment document",
      "mode": "json_schema",
      "json_schema": {
        "type": "object",
        "properties": {}
      }
    },
    {
      "id": "claim_status",
      "type": "decision",
      "name": "Claim Status Decision",
      "mode": "options",
      "options": ["approved", "denied", "pending_info"]
    }
  ]
}
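The `tools` array can be indexed by `id` so that later turns can look up each tool's constraints. The sketch below works on a trimmed copy of the response above; the helper names `index_tools` and `is_valid_decision` are hypothetical and not part of the API.

```python
import json

# Trimmed copy of the sample response shown above.
response_text = """
{
  "episode_id": "ep_abc123",
  "tools": [
    {"id": "medical_assessment", "type": "artifact", "mode": "json_schema",
     "json_schema": {"type": "object", "properties": {}}},
    {"id": "claim_status", "type": "decision", "mode": "options",
     "options": ["approved", "denied", "pending_info"]}
  ]
}
"""


def index_tools(response: dict) -> dict:
    """Map each tool's id to its full definition."""
    return {tool["id"]: tool for tool in response.get("tools", [])}


def is_valid_decision(tools_by_id: dict, tool_id: str, value: str) -> bool:
    """Check a value against the allowed options of an options-mode tool."""
    tool = tools_by_id.get(tool_id)
    if tool is None or tool.get("mode") != "options":
        return False
    return value in tool["options"]


tools_by_id = index_tools(json.loads(response_text))
print(sorted(tools_by_id))
print(is_valid_decision(tools_by_id, "claim_status", "approved"))
print(is_valid_decision(tools_by_id, "claim_status", "escalated"))
```

Validating a decision locally before submitting it avoids spending a turn on a value the scenario does not accept.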

Submitting Turns

Each turn represents one exchange in the conversation. POST to /api/v1/episodes/{episode_id}/turns with your messages. See the Quickstart for full code examples. To include tool calls, attach them to the assistant message. See Tool Calls for details.

Response

{
  "turn": 3,
  "observation": {
    "role": "assistant",
    "content": "It's a sharp pain in my lower back, especially when I bend..."
  },
  "reward": 0.75,
  "scores": {
    "D1_InformationGathering": 0.8,
    "D2_GoalProgress": 0.7
  },
  "episode_complete": false
}

The scores field is optional and provides a breakdown of individual scoring dimensions. Dimension names vary by scenario.
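Because `scores` may be absent and its keys differ between scenarios, client code should avoid hard-coding dimension names. A minimal sketch of a tolerant reader follows; the function name `summarize_turn` and the shape of its return value are illustrative choices, not part of the API.

```python
def summarize_turn(response: dict) -> dict:
    """Extract the training signal from one turn response."""
    # "scores" is optional, so fall back to an empty breakdown.
    scores = response.get("scores") or {}
    # Find the lowest-scoring dimension without assuming any dimension names.
    weakest = min(scores, key=scores.get) if scores else None
    return {
        "turn": response["turn"],
        "reward": response["reward"],
        "weakest_dimension": weakest,
        "done": response["episode_complete"],
    }


# The sample response shown above, as a Python dict.
turn_response = {
    "turn": 3,
    "reward": 0.75,
    "scores": {"D1_InformationGathering": 0.8, "D2_GoalProgress": 0.7},
    "episode_complete": False,
}
print(summarize_turn(turn_response))
```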

Episode Termination

Episodes end for one of these reasons:

Reason	Description
`completed`	Model completed the scenario successfully
`error`	Episode terminated due to an error
`max_turns_reached`	Episode reached the scenario’s turn limit
`client_terminated`	Client explicitly requested episode termination

Episode State

Each episode tracks internal state including:

Which objectives have been discovered
Which goals have been achieved
The scenario’s scoring criteria

Internal state is never exposed in API responses. This prevents models from learning to game the evaluation.

Limits

Limit	Value
Max turns per episode	50
Max message length	10,000 characters
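Both limits can be checked on the client before a turn is submitted. The sketch below assumes that "characters" corresponds to Python's `len()` of a string (Unicode code points); the API may count differently. The function name `check_limits` is hypothetical.

```python
MAX_TURNS_PER_EPISODE = 50     # from the limits table
MAX_MESSAGE_LENGTH = 10_000    # characters, from the limits table


def check_limits(turns_submitted: int, content: str) -> list:
    """Return a list of limit violations for the next turn (empty if none)."""
    problems = []
    if turns_submitted >= MAX_TURNS_PER_EPISODE:
        problems.append(f"turn limit of {MAX_TURNS_PER_EPISODE} already reached")
    # Assumption: message length is measured with len(), i.e. code points.
    if len(content) > MAX_MESSAGE_LENGTH:
        problems.append(
            f"message is {len(content)} characters; limit is {MAX_MESSAGE_LENGTH}"
        )
    return problems


print(check_limits(12, "Where exactly does it hurt?"))
print(check_limits(50, "x" * 10_001))
```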

Best Practices

One episode per training sample

Create a fresh episode for each training trajectory. Don’t reuse episode IDs.

Track conversation history locally

The API returns only one message per turn: the simulated persona’s response. Your client is responsible for maintaining the full conversation history and for passing all messages with each turn submission.
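One way to keep that history is a small client-side container that starts from `initial_observation` and grows by one entry for each message sent or received. In this sketch the class name `ConversationHistory`, the request body key `messages`, and the placeholder role in `MODEL_ROLE` are all assumptions; check the Quickstart for the exact message format.

```python
import copy

# Assumption: placeholder role for your model's messages; this page does not
# specify it, so confirm the correct value in the Quickstart.
MODEL_ROLE = "user"


class ConversationHistory:
    """Client-side record of every message exchanged in one episode."""

    def __init__(self, initial_observation: dict):
        self._messages = [copy.deepcopy(initial_observation)]

    def append(self, message: dict) -> None:
        """Record a message from either side of the conversation."""
        self._messages.append(copy.deepcopy(message))

    def turn_payload(self) -> dict:
        """Return a request body containing the full history so far."""
        # Assumption: the body key "messages" is hypothetical.
        return {"messages": copy.deepcopy(self._messages)}


history = ConversationHistory(
    {"role": "assistant",
     "content": "Hello, I'm calling about my workers' compensation claim..."}
)
history.append({"role": MODEL_ROLE, "content": "Can you describe the pain?"})
payload = history.turn_payload()  # body for POST /api/v1/episodes/{episode_id}/turns

# After the response arrives, store the returned observation as well.
history.append(
    {"role": "assistant",
     "content": "It's a sharp pain in my lower back, especially when I bend..."}
)
print(len(payload["messages"]), len(history.turn_payload()["messages"]))
```

Returning deep copies keeps a payload that has already been built from changing when later messages are appended.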

Handle rate limits gracefully

Check the X-RateLimit-Remaining header on each response, and retry rate-limited requests with exponential backoff.
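A minimal sketch of exponential backoff with jitter follows. It assumes that rate-limited responses use HTTP status 429, which this page does not confirm, and that `send` is a hypothetical callable you supply that performs one request and returns `(status_code, headers, body)`.

```python
import random
import time


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Exponential delay with full jitter: uniform in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))


def call_with_backoff(send, max_attempts: int = 5, sleep=time.sleep):
    """Call send() until it is not rate limited or attempts run out."""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:  # assumption: 429 signals rate limiting
            remaining = headers.get("X-RateLimit-Remaining")
            if remaining is not None and int(remaining) == 0:
                # Budget exhausted: pause briefly before the caller's next request.
                sleep(backoff_delay(0))
            return status, body
        # Rate limited: wait longer after each failed attempt, then retry.
        sleep(backoff_delay(attempt))
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```

The `sleep` parameter is injectable so the retry logic can be exercised in tests without real delays.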
