Labs API

Labs is a remote reward server for reinforcement learning training. You send conversations, we return reward scores.

What You Get

Reward Scores

Send a conversation, get a calibrated reward (0-1) for RL training.

Multi-Turn Episodes

Run interactive training sessions with turn-by-turn rewards.

Batch Evaluation

Score up to 100 conversations in a single request.

Pairwise Comparison

Compare two responses for preference learning (DPO).

Available Domains

Our scenario library covers:
  • Case Management — Insurance claims, workers’ compensation, return-to-work
  • Healthcare — Patient consultations, symptom investigation
  • Finance — Risk assessment, compliance reviews
Browse available scenarios in the Labs Portal or via the Catalog API.

How Scoring Works

Each scenario has calibrated scoring criteria. When you submit a conversation:
  1. We evaluate against scenario-specific objectives
  2. You receive a reward score (0 = poor, 1 = excellent)
  3. Multi-turn episodes also return per-turn rewards for dense training signal
Scoring criteria are never exposed - this prevents models from gaming the evaluation.

Why Labs?

Labs provides automated rewards without human labelers. Instead of paying annotators to score every conversation, you get calibrated rewards from our evaluation engine. This means:
  • Unlimited scale — No labeler bottleneck
  • Consistent scoring — Deterministic, calibrated rewards
  • Lower cost — Compute instead of human labor

Get Started

Quickstart

Create your first episode and collect rewards in 5 minutes.