RPCS-1

Agent tuning examples

RPCS-1 is most useful when an agent's failures look behavioral rather than purely factual: oscillation, overload, premature commitment, excessive retries, or frozen decision-making. These examples show when to call recommend_agent_configuration.

The diagnostic question is fit, not fault: is the deployed agent matched to the task, communication format, timing, environment, and stakes it actually faces?

Customer support copilot tuning assessment

Problem: A deployed support copilot works in demos but gives inconsistent guidance when refunds, billing disputes, policy ambiguity, and live queue pressure collide.

Example request:

Tune a customer support copilot that assists human agents with refunds, billing disputes, escalation decisions, and policy exceptions. The environment is dynamic, somewhat predictable, and high stakes. Context relevance is medium, and the copilot should be cautious before committing.

Assessment inputs:

  • Target platform: anthropic
  • Entropy: dynamic
  • Predictability: somewhat_predictable
  • Stakes: high
  • Context relevance: medium
  • Commitment style: cautious
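
The assessment inputs above can be written down as a small structured record before they are handed to a tool call. The following is a minimal sketch only: the class name AssessmentInputs and the snake_case field names are assumptions made for illustration, not the published schema of recommend_agent_configuration. The values are copied from the list above.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class AssessmentInputs:
    # Hypothetical field names; the real tool schema may differ.
    target_platform: str
    entropy: str
    predictability: str
    stakes: str
    context_relevance: str
    commitment_style: str


# Values taken from the support copilot example above.
support_copilot = AssessmentInputs(
    target_platform="anthropic",
    entropy="dynamic",
    predictability="somewhat_predictable",
    stakes="high",
    context_relevance="medium",
    commitment_style="cautious",
)

# Serialize to JSON so the record can be inspected or passed along.
print(json.dumps(asdict(support_copilot), indent=2))
```

Keeping the six inputs in one immutable record makes it easy to compare one example on this page against another.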

Coding agent in a changing repository

Problem: A coding agent repeatedly changes direction, retries too aggressively, or commits before it has enough repository context.

Example request:

Tune a coding agent that inspects a changing repository, edits files, runs tests, and opens pull requests. Mistakes carry medium stakes, and relevant context is long-lived.

Assessment inputs:

  • Target platform: openai
  • Entropy: moderate
  • Predictability: somewhat_predictable
  • Stakes: medium
  • Context relevance: long
  • Commitment style: balanced

High-stakes customer support agent

Problem: A support agent gives inconsistent answers or acts too quickly on refunds, disputes, and policy exceptions.

Example request:

Tune a customer support agent handling refunds, billing disputes, and policy exceptions in a dynamic environment with high stakes.

Assessment inputs:

  • Target platform: anthropic
  • Entropy: dynamic
  • Predictability: somewhat_predictable
  • Stakes: high
  • Context relevance: medium
  • Commitment style: cautious

Research agent with conflicting evidence

Problem: A research agent overreacts to new sources, loses earlier evidence, or presents uncertain conclusions too confidently.

Example request:

Tune a research agent that synthesizes conflicting technical sources into a cautious recommendation while retaining long-context evidence.

Assessment inputs:

  • Target platform: generic
  • Entropy: stable
  • Predictability: highly_predictable
  • Stakes: medium
  • Context relevance: long
  • Commitment style: cautious

Use through MCP

Connect https://rpcs1.dev/mcp as a Streamable HTTP server, then ask your agent to tune or diagnose another agent. The server is public, read-only, deterministic, and requires no API key.

Use recommend_agent_configuration to tune my agent.

Task: triage production incidents and propose remediation
Environment: dynamic, somewhat predictable, high stakes
Context relevance: long
Commitment style: cautious
Target platform: anthropic
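
When an MCP client acts on a prompt like the one above, it invokes the tool with a JSON-RPC 2.0 request whose method is tools/call. The sketch below builds such a request body without sending it. The helper build_tool_call and every key inside "arguments" (task, entropy, and so on) are assumptions for illustration; the server's own tool listing is the authority on the actual argument names.

```python
import json


def build_tool_call(request_id: int, arguments: dict) -> dict:
    """Wrap tool arguments in an MCP tools/call JSON-RPC 2.0 request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "recommend_agent_configuration",
            "arguments": arguments,
        },
    }


# Hypothetical argument names; values come from the prompt above.
incident_triage = {
    "task": "triage production incidents and propose remediation",
    "entropy": "dynamic",
    "predictability": "somewhat_predictable",
    "stakes": "high",
    "context_relevance": "long",
    "commitment_style": "cautious",
    "target_platform": "anthropic",
}

payload = build_tool_call(1, incident_triage)

# Print the request body; no network call is made here.
print(json.dumps(payload, indent=2))
```

In practice an MCP client library handles session setup and transport, so a payload like this is mainly useful for checking what your agent is about to send.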