LightEval Configuration Builder

Last updated: 2026-03-13

This page explains the LightEvalConfigService, its field mapping rules, and how generated configuration is reused by backend execution and export flows. The service lives in backend/src/eval_752/services/lighteval_config.py.

Main Use Cases

  • Celery runner: generate a LightEval config.yaml from the selected provider, model, and run config before execution
  • Export and replay: package the generated configuration inside .eval752.zip
  • CLI and CI: allow developers or CI pipelines to reuse the exact generated config with LightEval CLI or Python APIs

Data Sources

  • Provider base data (id, name, type, base_url): providers table
  • Provider rate limits and capabilities: provider JSON fields
  • Run metadata (run_id, dataset_id, dataset_name): run record plus linked dataset
  • Run configuration (config): run JSON config
  • Decrypted API key: provider key selection at runtime

Example config.yaml

metadata:
  provider_id: "prov-1"
  provider_name: "Demo Provider"
  provider_type: "openai"
  model_name: "openai/gpt-4o-mini"
  run_id: "run-123"
  dataset_id: "ds-42"
  dataset_name: "MMLU"
model_parameters:
  - model_name: "openai/gpt-4o-mini"
    provider_type: "openai"
    base_url: "https://api.example.com/v1"
    api_key: "sk-***"
    concurrent_requests: 6
    timeout: 120
    max_retries: 5
generation_parameters:
  temperature: 0.7
  max_new_tokens: 512
execution:
  endpoint: "litellm"
  use_chat_template: true

Mapping Rules

metadata

  • describes provider, model, run, and dataset identity
  • model_name always includes the LiteLLM provider prefix
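The prefixing rule can be sketched as follows. The helper name is hypothetical; the real logic lives inside LightEvalConfigService:

```python
def with_provider_prefix(model_name: str, provider_type: str) -> str:
    """Ensure the model name carries the LiteLLM provider prefix.

    Hypothetical helper illustrating the mapping rule: a bare model
    name gains the provider prefix, an already-prefixed name is kept.
    """
    prefix = provider_type + "/"
    if model_name.startswith(prefix):
        return model_name
    return prefix + model_name

# "gpt-4o-mini" with provider_type "openai" becomes "openai/gpt-4o-mini"
```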

model_parameters

  • required fields include model_name, provider_type, and base_url
  • by default the API key is written directly into api_key
  • if include_api_key=False, the config uses api_key_env instead
  • concurrency, timeout, and retry values are derived from provider metadata and run config
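The rules above can be sketched as a single mapping function. This is illustrative only: `provider` is assumed to be a plain dict here, the default values and the `EVAL752_API_KEY` environment-variable name are assumptions, and the real service reads the same fields from the provider entity and run config:

```python
def build_model_entry(provider: dict, api_key: str,
                      include_api_key: bool = True) -> dict:
    """Hypothetical sketch of the model_parameters mapping rules."""
    entry = {
        "model_name": provider["model_name"],
        "provider_type": provider["type"],
        "base_url": provider["base_url"],
        # Derived from provider metadata, with assumed fallback defaults.
        "concurrent_requests": provider.get("concurrent_requests", 4),
        "timeout": provider.get("timeout", 120),
        "max_retries": provider.get("max_retries", 5),
    }
    if include_api_key:
        entry["api_key"] = api_key  # secret written directly into the config
    else:
        # Reference an environment variable instead of embedding the secret.
        entry["api_key_env"] = "EVAL752_API_KEY"  # hypothetical variable name
    return entry
```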

generation_parameters

Supported fields include:

  • temperature
  • top_p
  • top_k
  • max_new_tokens
  • stop and stop_sequences
  • presence_penalty
  • frequency_penalty
  • repetition_penalty
  • seed

The service prefers nested generation or sampling sections in the run config, then falls back to top-level keys.
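The lookup order can be sketched like this, assuming the run config is a plain dict and the nested section names are generation and sampling as described above:

```python
GENERATION_KEYS = (
    "temperature", "top_p", "top_k", "max_new_tokens",
    "stop", "stop_sequences", "presence_penalty",
    "frequency_penalty", "repetition_penalty", "seed",
)

def extract_generation_params(run_config: dict) -> dict:
    """Hypothetical sketch: prefer nested sections, then top-level keys."""
    params = {}
    for key in GENERATION_KEYS:
        for section in ("generation", "sampling"):
            nested = run_config.get(section) or {}
            if key in nested:
                params[key] = nested[key]
                break
        else:
            # No nested section supplied the key; fall back to top level.
            if key in run_config:
                params[key] = run_config[key]
    return params
```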

execution

  • endpoint is fixed to litellm
  • use_chat_template defaults to true

Python API

from eval_752.services.lighteval_config import LightEvalConfigService

service = LightEvalConfigService()
config = service.build(
    provider=provider,
    model_name="gpt-4o-mini",
    api_key=decrypted_key,
    run_config=run.config,
    run_id=run.id,
    dataset_id=run.dataset_id,
    dataset_name="MMLU",
)

config_dict = config.to_dict()
yaml_text = config.to_yaml()

If PyYAML is not installed, to_yaml() raises an error. This is acceptable as long as the calling path deliberately falls back to JSON serialization via to_dict() instead.
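A caller-side fallback can be sketched as follows. The exact exception raised by to_yaml() is an assumption (ImportError is used here); the wrapper itself is hypothetical, not part of the service:

```python
import json

def serialize_config(config) -> str:
    """Serialize a built config to YAML, falling back to JSON.

    Hypothetical wrapper around the config object's to_yaml()/to_dict()
    pair; assumes to_yaml() raises ImportError when PyYAML is missing.
    """
    try:
        return config.to_yaml()
    except ImportError:
        # PyYAML unavailable: emit the same data as pretty-printed JSON.
        return json.dumps(config.to_dict(), indent=2)
```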

Integration Guidance

  1. Generate config before LightEval execution and keep it alongside the run artifacts when helpful.
  2. Include the generated config in .eval752.zip exports for offline replay.
  3. Inject decrypted provider keys at runtime and avoid leaving config files with secrets on disk longer than necessary.
  4. Keep the field mapping aligned with run execution and provider normalization logic.
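Points 1 and 3 can be combined into a small pattern: write the generated config to a temporary file for the duration of the run, then remove it so secrets do not linger on disk. This is a sketch, not part of the service; `execute` stands in for whatever invokes LightEval with the config path:

```python
import os
import tempfile

def run_with_config(yaml_text: str, execute) -> None:
    """Write the config to a temp file, run, then clean up.

    Illustrative only: `execute` is a placeholder callable that
    receives the path to the generated config file.
    """
    fd, path = tempfile.mkstemp(suffix=".yaml")
    try:
        with os.fdopen(fd, "w") as fh:
            fh.write(yaml_text)
        execute(path)
    finally:
        os.remove(path)  # avoid leaving a config with secrets on disk
```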