Episodes

An episode is a complete training interaction between your model and a simulated scenario. Episodes are the primary unit of training for multi-turn agents.

Episode Lifecycle

Creating an Episode

To create an episode, POST to /api/v1/episodes with a collection and scenario ID. See the Quickstart for full code examples.

Response

{
  "episode_id": "ep_abc123",
  "initial_observation": {
    "role": "assistant",
    "content": "Hello, I'm calling about my workers' compensation claim..."
  },
  "tools": [
    {
      "id": "medical_assessment",
      "type": "artifact",
      "name": "Medical Assessment",
      "description": "Create a medical assessment document",
      "mode": "json_schema",
      "json_schema": {
        "type": "object",
        "properties": {}
      }
    },
    {
      "id": "claim_status",
      "type": "decision",
      "name": "Claim Status Decision",
      "mode": "options",
      "options": ["approved", "denied", "pending_info"]
    }
  ]
}
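
A minimal sketch of building the create request and reading the response shown above, using only the Python standard library. The host in `BASE_URL`, the bearer-token header, and the body field names `collection` and `scenario_id` are assumptions for illustration; the Quickstart remains the authoritative source for the real request shape.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # assumption: placeholder host


def build_create_episode_request(api_key: str, collection: str, scenario_id: str) -> urllib.request.Request:
    """Build (but do not send) a POST request to /api/v1/episodes."""
    # Assumed body field names; check the Quickstart for the real ones.
    body = json.dumps({"collection": collection, "scenario_id": scenario_id})
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/episodes",
        data=body.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumption: bearer auth
            "Content-Type": "application/json",
        },
        method="POST",
    )


def summarize_episode(payload: dict) -> dict:
    """Pull out the fields that appear in the sample response."""
    tools = payload.get("tools", [])
    return {
        "episode_id": payload["episode_id"],
        "first_message": payload["initial_observation"]["content"],
        # Artifact tools are listed by id.
        "artifact_tools": [t["id"] for t in tools if t["type"] == "artifact"],
        # Decision tools map their id to the allowed options.
        "decision_options": {t["id"]: t["options"] for t in tools if t["type"] == "decision"},
    }
```

To send the request, pass the result to `urllib.request.urlopen` and decode the body with `json.load` before handing it to `summarize_episode`.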

Submitting Turns

Each turn represents one exchange in the conversation. POST to /api/v1/episodes/{episode_id}/turns with your messages. See the Quickstart for full code examples. To include tool calls, attach them to the assistant message. See Tool Calls for details.

Response

{
  "turn": 3,
  "observation": {
    "role": "assistant",
    "content": "It's a sharp pain in my lower back, especially when I bend..."
  },
  "reward": 0.75,
  "scores": {
    "D1_InformationGathering": 0.8,
    "D2_GoalProgress": 0.7
  },
  "episode_complete": false
}
The scores field is optional and provides a breakdown of individual scoring dimensions. Dimension names vary by scenario.
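
Because the client keeps the conversation history, a small tracker object can help. The sketch below is one possible shape, not the official client: the class name `EpisodeTracker` is hypothetical, and the `messages` key in the request body is an assumption. The caller supplies each message dict so that no role values are assumed here.

```python
import json
from dataclasses import dataclass, field


@dataclass
class EpisodeTracker:
    """Client-side record of one episode's messages and rewards."""
    episode_id: str
    messages: list = field(default_factory=list)
    rewards: list = field(default_factory=list)
    complete: bool = False

    def turn_path(self) -> str:
        """Path of the turn-submission endpoint for this episode."""
        return f"/api/v1/episodes/{self.episode_id}/turns"

    def build_turn_body(self, message: dict) -> bytes:
        """Append the model's message, then serialize the full history."""
        self.messages.append(message)
        # Assumption: the request body carries the history under "messages".
        return json.dumps({"messages": self.messages}).encode("utf-8")

    def record_response(self, response: dict) -> dict:
        """Store the persona's reply and the reward; return per-dimension scores."""
        self.messages.append(response["observation"])
        self.rewards.append(response["reward"])
        self.complete = response["episode_complete"]
        # "scores" is optional, so fall back to an empty breakdown.
        return response.get("scores", {})
```

A training loop would call `build_turn_body`, POST the bytes to `turn_path()`, pass the decoded JSON to `record_response`, and stop once `complete` is true.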

Episode Termination

Episodes end for one of these reasons:
Reason              Description
completed           Model completed the scenario successfully
error               Episode terminated due to an error
max_turns_reached   Reached scenario’s turn limit
client_terminated   Client explicitly requested episode termination
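
The four reason strings can be modeled as an enum so that an unexpected value fails loudly. This is a sketch under assumptions: this section does not show which response field carries the reason, so the helpers take the raw string, and the names `TerminationReason`, `parse_reason`, and `finished_by_model` are hypothetical.

```python
from enum import Enum


class TerminationReason(str, Enum):
    """Termination reasons documented for episodes."""
    COMPLETED = "completed"
    ERROR = "error"
    MAX_TURNS_REACHED = "max_turns_reached"
    CLIENT_TERMINATED = "client_terminated"


def parse_reason(raw: str) -> TerminationReason:
    """Convert an API string to the enum; raises ValueError for unknown values."""
    return TerminationReason(raw)


def finished_by_model(reason: TerminationReason) -> bool:
    """True only when the model completed the scenario."""
    return reason is TerminationReason.COMPLETED
```

How to treat trajectories that ended for the other three reasons (keep, discard, or down-weight) is a training-pipeline decision that this sketch leaves to the caller.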

Episode State

Each episode tracks internal state including:
  • Which objectives have been discovered
  • Which goals have been achieved
  • The scenario’s scoring criteria
Internal state is never exposed in API responses. This prevents models from learning to game the evaluation.

Limits

Limit                   Value
Max turns per episode   50
Max message length      10,000 characters
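
Checking these limits on the client before submitting a turn avoids a wasted request. The sketch below assumes the length limit applies to each message's content and that characters are counted the way Python's `len` counts them; the server may count differently. The function name `check_limits` is hypothetical.

```python
MAX_TURNS_PER_EPISODE = 50
MAX_MESSAGE_CHARS = 10_000


def check_limits(turns_submitted: int, content: str) -> list:
    """Return a list of problems; an empty list means the next turn is within limits."""
    problems = []
    if turns_submitted >= MAX_TURNS_PER_EPISODE:
        problems.append(
            f"episode already has {turns_submitted} turns (limit {MAX_TURNS_PER_EPISODE})"
        )
    if len(content) > MAX_MESSAGE_CHARS:
        problems.append(
            f"message is {len(content)} characters (limit {MAX_MESSAGE_CHARS})"
        )
    return problems
```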

Best Practices

  • Create a fresh episode for each training trajectory. Don’t reuse episode IDs.
  • The API returns only one message per turn: the simulated persona’s response. Your client is responsible for maintaining the full conversation history and passing all messages with each turn submission.
  • Check X-RateLimit-Remaining headers and implement exponential backoff.
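
The rate-limit advice can be sketched as follows. The helper names are hypothetical, and how a rate-limited response is recognized (for example by status code) is not specified in this section, so the caller passes that check in as `is_rate_limited`. The jitter added to each delay is a common refinement, not something the API requires.

```python
import random
import time


def backoff_delays(max_retries: int = 5, base: float = 0.5, cap: float = 30.0):
    """Yield delays in seconds that double each attempt, capped at `cap`."""
    for attempt in range(max_retries):
        yield min(cap, base * (2 ** attempt))


def should_pause(headers: dict) -> bool:
    """True when X-RateLimit-Remaining reports no requests left."""
    # Header names are case-insensitive, so normalize before the lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    remaining = lowered.get("x-ratelimit-remaining")
    return remaining is not None and int(remaining) <= 0


def call_with_backoff(send, is_rate_limited, sleep=time.sleep, **kwargs):
    """Call send() again after growing delays while the result is rate limited."""
    result = send()
    for delay in backoff_delays(**kwargs):
        if not is_rate_limited(result):
            return result
        sleep(delay + random.uniform(0, delay / 2))  # jitter spreads out retries
        result = send()
    return result  # may still be rate limited once retries are exhausted
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting.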