Scheduled Evaluations
Use the Schedules page to automate evaluations, either as a one-time future run or as a recurring check.
Validate first, then automate
Schedule a run only after you have completed it manually through Runs and confirmed that the provider, dataset, and config all work. A schedule repeats an existing workflow, including any misconfiguration in it, so it is not a substitute for verifying the run first.
One-time schedule
Set up a run for a specific future date and time.
- Open Schedules
- Choose One time
- Enter a schedule name
- Pick a provider, dataset, and model name
- Set the date/time and timezone
- Optionally add JSON run config
- Save
When to use: "Rerun this benchmark tomorrow morning after the deployment goes live."
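Two details in the steps above are easy to get wrong: the timezone attached to the run time, and the syntax of the optional JSON run config. The following is a minimal Python sketch for checking both before you save. It assumes the run config is a JSON object; the keys `temperature` and `max_samples` are hypothetical placeholders, not documented settings, and the helper names are made up for illustration.

```python
import json
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def to_utc(local_time: str, tz_name: str) -> datetime:
    """Interpret an ISO wall-clock time in the named timezone and convert it to UTC."""
    naive = datetime.fromisoformat(local_time)
    return naive.replace(tzinfo=ZoneInfo(tz_name)).astimezone(timezone.utc)


def check_run_config(text: str) -> dict:
    """Parse run-config text; json.loads raises JSONDecodeError on invalid JSON."""
    config = json.loads(text)
    # Assumption: the schedule form expects a JSON object, not a list or scalar.
    if not isinstance(config, dict):
        raise ValueError("run config should be a JSON object")
    return config


# 09:00 in New York on a winter date is 14:00 UTC.
print(to_utc("2025-01-15T09:00", "America/New_York").isoformat())

# Hypothetical keys, shown only to illustrate the syntax check.
print(check_run_config('{"temperature": 0, "max_samples": 50}'))
```

Seeing the UTC equivalent makes it easier to confirm that the run fires after the deployment you are waiting for, not before it.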
Recurring schedule
Set up a run that repeats on a cadence.
- Open Schedules
- Choose Recurring
- Pick cadence: hourly, daily, or weekly
- Set interval and execution time
- For weekly: select which days
- Review the RRULE preview
- Save
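If the visual builder is too limiting, switch to manual RRULE input and paste the exact rule. RRULE is the recurrence rule format from the iCalendar standard (RFC 5545), for example FREQ=WEEKLY;BYDAY=MO.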
When to use: "Check this provider every Monday morning" or "Run a nightly regression suite."
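For manual RRULE input, it can help to assemble the rule from the same choices the visual builder offers. The sketch below builds an RFC 5545 rule string for hourly, daily, or weekly cadences. The function name `build_rrule` is hypothetical, and it is an assumption that the Schedules page accepts the BYHOUR and BYMINUTE parts and a rule without an `RRULE:` prefix, so compare the result with the RRULE preview before saving.

```python
VALID_FREQS = ("HOURLY", "DAILY", "WEEKLY")
VALID_DAYS = ("MO", "TU", "WE", "TH", "FR", "SA", "SU")


def build_rrule(freq, interval=1, days=(), hour=None, minute=0):
    """Build an RFC 5545 recurrence rule string from cadence settings."""
    freq = freq.upper()
    if freq not in VALID_FREQS:
        raise ValueError(f"unsupported cadence: {freq}")
    if interval < 1:
        raise ValueError("interval must be at least 1")

    parts = [f"FREQ={freq}"]
    if interval > 1:
        # INTERVAL=1 is the RFC default, so it is omitted.
        parts.append(f"INTERVAL={interval}")

    if days:
        if freq != "WEEKLY":
            raise ValueError("days apply only to weekly schedules in this sketch")
        codes = [d.upper() for d in days]
        bad = [d for d in codes if d not in VALID_DAYS]
        if bad:
            raise ValueError(f"unknown day codes: {bad}")
        parts.append("BYDAY=" + ",".join(codes))

    if hour is not None:
        if not (0 <= hour <= 23 and 0 <= minute <= 59):
            raise ValueError("hour must be 0-23 and minute 0-59")
        parts.append(f"BYHOUR={hour}")
        parts.append(f"BYMINUTE={minute}")

    return ";".join(parts)


print(build_rrule("weekly", days=["MO"], hour=9))  # every Monday at 09:00
print(build_rrule("daily", hour=2, minute=30))     # nightly at 02:30
print(build_rrule("hourly", interval=6))           # every 6 hours
```

The rule itself carries no timezone, so the times in BYHOUR and BYMINUTE are interpreted relative to whatever start time and timezone the schedule uses.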
Managing schedules
From the schedule list you can:
- Rename, pause, resume, or delete schedules
- Update the model name, config, or timing
- See the last run time and next planned run time
Tips
- Name schedules descriptively. The name appears in run history, so "GPT-4o Daily Regression" is more useful than "Schedule 1."
- Reuse saved provider models — they auto-suggest in the schedule form and prevent typos.
- Start with small datasets when validating a new recurring rule.
