Tool Calls

Many scenarios include tools that models can invoke. Tools represent real-world actions like creating documents or making decisions.

Tool Types

Artifacts

Artifacts are documents or structured outputs the model creates:
{
  "type": "artifact",
  "name": "medical_assessment",
  "input": {
    "condition": "lumbar_strain",
    "severity": "moderate",
    "work_restrictions": ["no_heavy_lifting", "seated_work_only"],
    "follow_up_date": "2024-02-15"
  }
}
Examples: medical assessments, case notes, reports, recommendations

Decisions

Decisions are choices from predefined options:
{
  "type": "decision",
  "name": "claim_status",
  "input": {
    "value": "approved"
  }
}
Examples: approve/deny decisions, priority levels, category assignments

Tool Schemas

When you create an episode, the response lists the tools available for that scenario, along with the schema or options each one accepts:
{
  "tools": [
    {
      "id": "medical_assessment",
      "type": "artifact",
      "name": "Medical Assessment",
      "description": "Create a medical assessment for the claimant",
      "mode": "json_schema",
      "json_schema": {
        "type": "object",
        "properties": {
          "condition": { "type": "string" },
          "severity": { "enum": ["mild", "moderate", "severe"] },
          "work_restrictions": {
            "type": "array",
            "items": { "type": "string" }
          }
        },
        "required": ["condition", "severity"]
      }
    },
    {
      "id": "claim_status",
      "type": "decision",
      "name": "Claim Status",
      "mode": "options",
      "options": ["approved", "denied", "pending_info"]
    }
  ]
}
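As a rough illustration of reading this response, the sketch below indexes the tools by `id` and prints what each one accepts. The helper names (`index_tools`, `summarize`) and the `EPISODE_RESPONSE` constant are hypothetical and not part of the API; the data is a trimmed copy of the example above.

```python
import json

# Trimmed copy of the example response above, used here as stand-in data.
EPISODE_RESPONSE = json.loads("""
{
  "tools": [
    {
      "id": "medical_assessment",
      "type": "artifact",
      "mode": "json_schema",
      "json_schema": {
        "type": "object",
        "properties": {
          "condition": {"type": "string"},
          "severity": {"enum": ["mild", "moderate", "severe"]}
        },
        "required": ["condition", "severity"]
      }
    },
    {
      "id": "claim_status",
      "type": "decision",
      "mode": "options",
      "options": ["approved", "denied", "pending_info"]
    }
  ]
}
""")


def index_tools(response):
    """Map each tool's id to its full definition (hypothetical helper)."""
    return {tool["id"]: tool for tool in response.get("tools", [])}


def summarize(tool):
    """Describe in one line what input a tool accepts (hypothetical helper)."""
    if tool["mode"] == "options":
        return f"{tool['id']} ({tool['type']}): one of {', '.join(tool['options'])}"
    required = tool["json_schema"].get("required", [])
    return f"{tool['id']} ({tool['type']}): requires {', '.join(required)}"


tools = index_tools(EPISODE_RESPONSE)
for tool in tools.values():
    print(summarize(tool))
```

Keying the lookup on `id` is a choice made for this sketch, based on the examples here using that value as the `name` of a tool call.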

Submitting Tool Calls

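Tool calls are attached to the assistant message that invokes them. In the examples here, a call's "name" matches the tool's "id" from the schema (for example "medical_assessment"), not its display "name":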
{
  "messages": [
    {
      "role": "assistant",
      "content": "Based on our conversation, I'm creating an assessment.",
      "tool_calls": [
        {
          "type": "artifact",
          "name": "medical_assessment",
          "input": {
            "condition": "lumbar_strain",
            "severity": "moderate"
          }
        }
      ]
    }
  ]
}
See the Quickstart for full code examples.
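For orientation, here is a minimal sketch of assembling that payload in Python. The helpers `make_tool_call` and `assistant_message` are hypothetical names introduced for this example; the sketch stops at serializing the JSON body, because the submission endpoint and client are not described in this section.

```python
import json


def make_tool_call(tool_type, tool_id, tool_input):
    """Build one tool call entry in the shape shown above."""
    return {"type": tool_type, "name": tool_id, "input": tool_input}


def assistant_message(content, tool_calls=()):
    """Build an assistant message, attaching tool calls only when present."""
    message = {"role": "assistant", "content": content}
    if tool_calls:
        message["tool_calls"] = list(tool_calls)
    return message


payload = {
    "messages": [
        assistant_message(
            "Based on our conversation, I'm creating an assessment.",
            [
                make_tool_call(
                    "artifact",
                    "medical_assessment",
                    {"condition": "lumbar_strain", "severity": "moderate"},
                )
            ],
        )
    ]
}

# Serialize for whichever HTTP client you use to submit the message.
print(json.dumps(payload, indent=2))
```

A decision call would follow the same shape, for example `make_tool_call("decision", "claim_status", {"value": "approved"})`.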

Tool Call Timing

Make tool calls only after the conversation has gathered enough information to support them. A decision submitted too early, before the relevant facts are known, can result in a low reward.
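One way to guard against premature decisions is to track which facts have been gathered and attach the decision only once none are missing. The sketch below assumes you maintain that checklist yourself; `REQUIRED_FACTS`, `missing_facts`, and `next_assistant_message` are hypothetical names, and the real criteria depend on the scenario.

```python
# Hypothetical checklist of facts to collect before deciding a claim.
REQUIRED_FACTS = ("condition", "severity", "work_restrictions")


def missing_facts(gathered, required=REQUIRED_FACTS):
    """Return the required facts that are still absent or empty."""
    return [name for name in required if not gathered.get(name)]


def next_assistant_message(gathered, decision_value):
    """Ask for missing facts first; attach the decision only when ready."""
    missing = missing_facts(gathered)
    if missing:
        return {
            "role": "assistant",
            "content": "I still need to confirm: " + ", ".join(missing) + ".",
        }
    return {
        "role": "assistant",
        "content": "I have what I need to decide the claim.",
        "tool_calls": [
            {
                "type": "decision",
                "name": "claim_status",
                "input": {"value": decision_value},
            }
        ],
    }


# Only the condition is known so far, so no decision is attached yet.
print(next_assistant_message({"condition": "lumbar_strain"}, "approved"))
```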

Tool Call Grading

Tool calls are graded as part of the reward calculation. The grading considers:
  • Whether the tool call was appropriate for the scenario
  • Whether the inputs match what the scenario expected
  • The timing of the tool call relative to information gathered
Tool inputs are not strictly validated against their schemas at submission time, so a malformed call is not necessarily rejected. Instead, incorrect or inappropriate tool calls can lower your reward score.
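Because submission does not strictly enforce the schema, it can help to check a call locally before sending it. The following is a minimal sketch that covers only what the example schemas use (matching `id` and `type`, `options` membership, `required` fields, and `enum` values). `check_tool_call` and the sample tool definitions are hypothetical, and a full JSON Schema validator such as the `jsonschema` package would cover far more of the specification.

```python
def check_tool_call(tool, call):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if call.get("name") != tool["id"]:
        problems.append(f"name {call.get('name')!r} does not match tool id {tool['id']!r}")
    if call.get("type") != tool["type"]:
        problems.append(f"type {call.get('type')!r} does not match tool type {tool['type']!r}")

    data = call.get("input", {})
    if tool["mode"] == "options":
        # Decisions carry their choice under "value" in the examples above.
        if data.get("value") not in tool["options"]:
            problems.append(f"value {data.get('value')!r} is not one of {tool['options']}")
    elif tool["mode"] == "json_schema":
        schema = tool["json_schema"]
        for field in schema.get("required", []):
            if field not in data:
                problems.append(f"missing required field {field!r}")
        for field, spec in schema.get("properties", {}).items():
            if field in data and "enum" in spec and data[field] not in spec["enum"]:
                problems.append(f"{field} {data[field]!r} is not one of {spec['enum']}")
    return problems


# Sample tool definitions trimmed from the schema example in this section.
MEDICAL_ASSESSMENT = {
    "id": "medical_assessment",
    "type": "artifact",
    "mode": "json_schema",
    "json_schema": {
        "type": "object",
        "properties": {
            "condition": {"type": "string"},
            "severity": {"enum": ["mild", "moderate", "severe"]},
        },
        "required": ["condition", "severity"],
    },
}
CLAIM_STATUS = {
    "id": "claim_status",
    "type": "decision",
    "mode": "options",
    "options": ["approved", "denied", "pending_info"],
}

bad_call = {"type": "artifact", "name": "medical_assessment", "input": {"severity": "extreme"}}
for problem in check_tool_call(MEDICAL_ASSESSMENT, bad_call):
    print(problem)
```

A check like this catches only structural mistakes. Whether a call is appropriate for the scenario, and whether it is well timed, is still decided by grading.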