Scheduled Evaluations

Use the Schedules page to automate evaluations, either as a one-time future run or as a recurring check.

Validate first, then automate

Schedule a run only after you have completed it manually through Runs and confirmed that the provider, dataset, and config all work. A schedule repeats an existing workflow, so a misconfigured run will fail on every execution; it is not a substitute for verifying the workflow first.

One-time schedule

Set up a run for a specific future date and time.

  1. Open Schedules
  2. Choose One time
  3. Enter a schedule name
  4. Pick a provider, dataset, and model name
  5. Set the date/time and timezone
  6. Optionally add JSON run config
  7. Save
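
Steps 5 and 6 are the easiest to get wrong, so it can help to check them before saving. The sketch below is illustrative only: the helper names and the config keys (temperature, max_samples) are hypothetical and are not the Schedules API. It confirms that the run config text parses as a JSON object and shows how a chosen local time and UTC offset map to a single UTC instant.

```python
import json
from datetime import datetime, timedelta, timezone


def parse_run_config(text: str) -> dict:
    """Parse optional run config text; empty input means no overrides."""
    if not text.strip():
        return {}
    config = json.loads(text)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(config, dict):
        raise ValueError("run config must be a JSON object")
    return config


def to_utc(local_time: datetime, utc_offset_hours: float) -> datetime:
    """Interpret a naive local time at a fixed UTC offset and convert it to UTC."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return local_time.replace(tzinfo=tz).astimezone(timezone.utc)


# Hypothetical config keys, for illustration only.
config = parse_run_config('{"temperature": 0, "max_samples": 50}')

# 09:00 local time at UTC-4 is 13:00 UTC.
run_at = to_utc(datetime(2025, 7, 1, 9, 0), utc_offset_hours=-4)
print(config, run_at.isoformat())
```

A fixed offset keeps the sketch dependency-free. A named zone (for example via Python's zoneinfo module) would also account for daylight saving changes.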

When to use: "Rerun this benchmark tomorrow morning after the deployment goes live."

Recurring schedule

Set up a run that repeats on a cadence.

  1. Open Schedules
  2. Choose Recurring
  3. Pick cadence: hourly, daily, or weekly
  4. Set interval and execution time
  5. For weekly: select which days
  6. Review the RRULE preview
  7. Save

If the visual builder is too limiting, switch to manual RRULE input and paste the exact rule.
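
For manual input, it helps to see how the builder's choices might translate into a rule. The following is a minimal sketch that assembles an RFC 5545 style recurrence rule from a cadence, interval, time, and days. The function name build_rrule is hypothetical, and whether Schedules expects an "RRULE:" prefix or accepts every rule part is an assumption to check against the RRULE preview.

```python
VALID_DAYS = ("MO", "TU", "WE", "TH", "FR", "SA", "SU")
FREQ = {"hourly": "HOURLY", "daily": "DAILY", "weekly": "WEEKLY"}


def build_rrule(cadence, interval=1, hour=0, minute=0, days=()):
    """Build an RFC 5545 style rule string (hypothetical helper, not a product API)."""
    if cadence not in FREQ:
        raise ValueError(f"unsupported cadence: {cadence!r}")
    if interval < 1:
        raise ValueError("interval must be at least 1")

    parts = [f"FREQ={FREQ[cadence]}", f"INTERVAL={interval}"]

    if cadence == "weekly":
        unknown = [d for d in days if d not in VALID_DAYS]
        if not days or unknown:
            raise ValueError("weekly rules need one or more of " + ",".join(VALID_DAYS))
        # Emit the selected days in calendar order regardless of input order.
        parts.append("BYDAY=" + ",".join(d for d in VALID_DAYS if d in days))

    if cadence != "hourly":
        parts.append(f"BYHOUR={hour}")  # hourly rules fire every hour, so no BYHOUR
    parts.append(f"BYMINUTE={minute}")
    return ";".join(parts)


print(build_rrule("weekly", days=["MO"], hour=9))  # every Monday at 09:00
print(build_rrule("daily", hour=2))                # nightly at 02:00
print(build_rrule("hourly", interval=6))           # every 6 hours, on the hour
```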

When to use: "Check this provider every Monday morning" or "Run a nightly regression suite."

Managing schedules

From the schedule list you can:

  • Rename, pause, resume, or delete schedules
  • Update the model name, config, or timing
  • See the last run time and next planned run time
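
To sanity-check the next planned run time shown in the list, the expected times can be worked out independently. This is a rough sketch for a simple weekly rule with an interval of 1. It uses naive datetimes (no timezone handling), and next_weekly_runs is a hypothetical helper, not how Schedules computes its own values.

```python
from datetime import datetime, timedelta


def next_weekly_runs(after, weekdays, hour, minute=0, count=3):
    """Return the next `count` run times strictly after `after`.

    `weekdays` uses datetime.weekday() numbering: Monday=0 ... Sunday=6.
    """
    if not weekdays:
        raise ValueError("at least one weekday is required")
    runs = []
    # Start from the run slot on the same calendar day as `after`.
    candidate = after.replace(hour=hour, minute=minute, second=0, microsecond=0)
    while len(runs) < count:
        if candidate.weekday() in weekdays and candidate > after:
            runs.append(candidate)
        candidate += timedelta(days=1)
    return runs


# 1 January 2025 is a Wednesday, so the next Monday 09:00 run is 6 January.
for run in next_weekly_runs(datetime(2025, 1, 1, 12, 0), {0}, hour=9):
    print(run.isoformat())
```

If the computed times differ from what the schedule list shows, the usual suspects are the timezone setting or an interval greater than 1.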

Tips

  • Name schedules descriptively. The name appears in run history, so "GPT-4o Daily Regression" is more useful than "Schedule 1."
  • Reuse saved provider models. The schedule form suggests them automatically, which prevents typos in the model name.
  • Start with small datasets when validating a new recurring rule.