Research Kit

This folder provides the paper-aligned live-style Docker workflow for comparative evaluation of Rilot routing policies.

Components

  • docker-compose.live.yml: starts Rilot, ten high-consumption zone simulators, and Prometheus.
  • config.live.json: 10-zone live-style config with per-zone share cap (max_request_share_percent).
  • prometheus.yml: scrape config for /metrics.
  • scripts/run_live_experiment.sh: primary comparative runner (10 zones + local dynamic carbon API) that writes to result_live/.
  • scripts/run_experiment.sh: compatibility wrapper that delegates to run_live_experiment.sh.
  • scripts/run_comparative_evaluation.py: request-level and summary report generator.
  • scripts/carbon-signal-api.js: local ElectricityMap-compatible dynamic API for reproducible live-style runs.
  • carbon-traces/us-grid-sample.csv: sample trace format.
  • carbon-traces/electricitymap-latest-sample.json: ElectricityMap-style local fixture.

Quickstart

./scripts/run_live_experiment.sh

Outputs are written to ./result_live/comparative-live by default. Both runner scripts (run_live_experiment.sh and the run_experiment.sh wrapper) generate charts.html automatically in that folder.

Generated output includes:

  • per-request latency CSV
  • per-mode Prometheus snapshots
  • summary CSV/JSON/Markdown tables for paper-ready comparison
  • baseline-relative trade-off metrics (exposure/CO2e savings, latency delta, error rate, CPU sample delta, memory sample delta)
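
To make the baseline-relative metrics concrete, here is a minimal sketch of how a savings percentage and a latency delta could be derived from two per-mode aggregates. The function names, dictionary keys, and numbers are hypothetical and chosen for illustration only; the report generator may compute these values differently.

```python
def relative_savings_percent(baseline: float, candidate: float) -> float:
    """Percent reduction of candidate relative to baseline (positive = candidate is lower)."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (baseline - candidate) / baseline * 100.0


def delta_vs_baseline(baseline: float, candidate: float) -> float:
    """Signed difference candidate - baseline (positive = candidate is higher)."""
    return candidate - baseline


# Hypothetical aggregates, not measured results.
baseline_mode = {"mean_carbon_g_per_kwh": 400.0, "mean_latency_ms": 50.0}
carbon_mode = {"mean_carbon_g_per_kwh": 380.0, "mean_latency_ms": 53.0}

savings = relative_savings_percent(
    baseline_mode["mean_carbon_g_per_kwh"], carbon_mode["mean_carbon_g_per_kwh"]
)
latency_delta = delta_vs_baseline(
    baseline_mode["mean_latency_ms"], carbon_mode["mean_latency_ms"]
)
print(f"exposure savings: {savings:.1f}%, latency delta: {latency_delta:+.1f} ms")
```

Reporting the savings and the latency delta side by side keeps the trade-off visible: a carbon gain is only meaningful next to what it cost in response time.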

requests.csv now includes explainability fields:

  • request_region and selected_region
  • cross_region_reroute (true/false)
  • selected_carbon_intensity_g_per_kwh
  • zone_filter_reasons (per-zone eligibility/constraint reason, e.g. added-latency>50, share-cap, eligible)
  • carbon_saved_vs_worst_g_per_kwh
  • decision_reason
  • decision_reason_brief
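
As a rough illustration of how these fields can be consumed, the sketch below tallies reroutes and averages carbon_saved_vs_worst_g_per_kwh. It assumes requests.csv has a header row with the column names listed above and that cross_region_reroute is serialized as the text true/false. The inline sample rows and the region names east/west are made up for the example.

```python
import csv
import io
from collections import Counter


def summarize_requests(rows):
    """Count reroutes and average the per-request carbon saving."""
    total = 0
    reroutes = 0
    pairs = Counter()
    saved_sum = 0.0
    for row in rows:
        total += 1
        if row["cross_region_reroute"].strip().lower() == "true":
            reroutes += 1
            pairs[(row["request_region"], row["selected_region"])] += 1
        saved_sum += float(row["carbon_saved_vs_worst_g_per_kwh"])
    return {
        "total": total,
        "cross_region_reroutes": reroutes,
        "reroute_pairs": dict(pairs),
        "mean_carbon_saved_vs_worst_g_per_kwh": saved_sum / total if total else 0.0,
    }


# Made-up sample rows standing in for a real requests.csv.
SAMPLE = """request_region,selected_region,cross_region_reroute,carbon_saved_vs_worst_g_per_kwh
east,east,false,120.0
east,west,true,300.0
west,west,false,90.0
"""

summary = summarize_requests(csv.DictReader(io.StringIO(SAMPLE)))
print(summary)
```

For a real run, the same function can be fed csv.DictReader over an open handle to result_live/comparative-live/requests.csv.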

The comparative summary now also reports reroute counts per mode:

  • cross_region_reroutes
  • east_to_west_reroutes
  • west_to_east_reroutes

Resource-overhead fields are also included per scenario:

  • cpu_percent_sample, cpu_sample_method, cpu_delta_percent_vs_baseline
  • memory_mb_sample, memory_delta_mb_vs_baseline

Run a longer-duration case study (same setup, larger workload):

REQUESTS_PER_REGION=1000 ./scripts/run_live_experiment.sh

Run the 10-zone live-style study with high-consumption workloads:

./scripts/run_live_experiment.sh

This uses:

  • docker-compose.live.yml
  • config.live.json as base, rewritten into config.live.dynamic.json for run-time overrides
  • local dynamic ElectricityMap-compatible API (scripts/carbon-signal-api.js)
  • route "/heavy?burn_ms=40" by default
  • no carbon cache (carbon.cache_ttl_seconds=0)
  • random carbon intensities in [100, 700] g/kWh by default (adjustable via CARBON_API_MIN_G and CARBON_API_MAX_G)
  • output directory result_live/comparative-live (stable path)
  • cross-region RTT emulation enabled by default (RILOT_EMULATE_CROSS_REGION_RTT=true)
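
The rewrite of config.live.json into config.live.dynamic.json can be pictured as applying dotted-key overrides to a copy of the base config. The sketch below shows that idea only. The helper names and the sample base config are hypothetical, and the runner script may implement the rewrite differently.

```python
import copy
import json


def set_dotted(config: dict, dotted_key: str, value) -> None:
    """Set config["a"]["b"] = value for dotted_key "a.b", creating dicts as needed."""
    node = config
    *parents, leaf = dotted_key.split(".")
    for part in parents:
        node = node.setdefault(part, {})
    node[leaf] = value


def apply_overrides(base: dict, overrides: dict) -> dict:
    """Return a deep copy of base with every dotted-key override applied."""
    result = copy.deepcopy(base)
    for key, value in overrides.items():
        set_dotted(result, key, value)
    return result


# Hypothetical base config; the real config.live.json has more fields.
base_config = {"carbon": {"cache_ttl_seconds": 60, "provider": "placeholder"}}

# Disable the carbon cache, matching carbon.cache_ttl_seconds=0 above.
dynamic_config = apply_overrides(base_config, {"carbon.cache_ttl_seconds": 0})
print(json.dumps(dynamic_config, indent=2))
```

Working on a deep copy leaves the base file untouched, so repeated runs always start from the same baseline.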

Useful overrides:

REQUESTS_PER_REGION=500 ./scripts/run_live_experiment.sh
RILOT_EMULATE_CROSS_REGION_RTT=false ./scripts/run_live_experiment.sh
CARBON_API_MIN_G=150 CARBON_API_MAX_G=650 ./scripts/run_live_experiment.sh
# Live ElectricityMap mode (requires API key)
CARBON_PROVIDER_OVERRIDE=electricitymap \
ELECTRICITYMAP_API_KEY_OVERRIDE=<your_api_key> \
REQUESTS_PER_REGION=1000 \
./scripts/run_live_experiment.sh

Enable or disable the provider-timeout robustness scenario:

ENABLE_FAILURE_SCENARIO=1 ./scripts/run_live_experiment.sh
ENABLE_FAILURE_SCENARIO=0 ./scripts/run_live_experiment.sh

Run weight sensitivity analysis:

python3 ./scripts/run_weight_sensitivity.py

Generate an interactive chart dashboard from the latest comparative run:

node ./scripts/charts.js

Optional:

# Use a specific run folder
node ./scripts/charts.js --input-dir ./result_live/comparative-live

# Use the live results base directory
node ./scripts/charts.js --results-base ./result_live

This writes charts.html into the selected comparative result folder.

Interpreting results

  • Carbon-aware modes can reduce carbon-intensity exposure while keeping latency stable. When regional carbon values are close, the gain is often modest (for example, ~1-2%).
  • latency_first typically minimizes response time at the cost of higher carbon exposure, which is why multi-objective modes (balanced, carbon_first) are included.
  • If CPU columns are 0.0, host CPU sampling was not captured for that run; avoid making compute-overhead claims from that dataset.
  • If memory columns are empty, memory sampling was not captured for that run.
  • To increase signal separation, run longer workloads and/or use traces with wider regional carbon spread (high-carbon vs low-carbon regions).
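
Before drawing overhead conclusions, it can help to check the summary for missing samples automatically. The sketch below flags rows whose cpu_percent_sample is 0.0 or empty and rows whose memory_mb_sample is empty. It assumes summary.csv has a header row, and the scenario column name and the sample rows are assumptions made for the example.

```python
import csv
import io


def sampling_warnings(rows):
    """Return one warning per scenario that lacks a CPU or memory sample."""
    warnings = []
    for row in rows:
        name = row.get("scenario", "?")  # column name is an assumption
        cpu = (row.get("cpu_percent_sample") or "").strip()
        mem = (row.get("memory_mb_sample") or "").strip()
        if cpu == "" or float(cpu) == 0.0:
            warnings.append(f"{name}: no CPU sample")
        if mem == "":
            warnings.append(f"{name}: no memory sample")
    return warnings


# Made-up rows standing in for a real summary.csv.
SAMPLE = """scenario,cpu_percent_sample,memory_mb_sample
carbon_first,0.0,
balanced,12.5,48.0
"""

warnings = sampling_warnings(csv.DictReader(io.StringIO(SAMPLE)))
for line in warnings:
    print(line)
```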

Submission Reproduction Bundle

Run these commands and include the generated folders in your supplementary package:

cd research-kit
ENABLE_FAILURE_SCENARIO=1 REQUESTS_PER_REGION=1000 ./scripts/run_live_experiment.sh

Expected outputs:

  • result_live/comparative-live/summary.{md,csv,json}
  • result_live/comparative-live/requests.csv
  • result_live/comparative-live/metrics-*.prom
  • result_live/comparative-live/charts.html

Failure/operational evidence is captured by the carbon_first_provider_timeout scenario in summary.*. Use this row to demonstrate timeout/fallback behavior and service stability under degraded carbon-signal conditions.

Default comparative scenario order in summary.*:

  1. carbon_first
  2. balanced
  3. latency_first
  4. carbon_first_provider_timeout (when ENABLE_FAILURE_SCENARIO=1)
  5. explicit_cross_region_to_green (when fixture has a clear greener region)
  6. baseline_no_carbon_strict_local
  7. baseline_no_carbon_latency_first
  8. baseline_no_carbon_balanced

Fairness/user-impact evidence is captured in reroute columns:

  • cross_region_reroutes
  • east_to_west_reroutes
  • west_to_east_reroutes

Use these with latency/error metrics to report trade-offs and justify policy guardrails.

Fairness/locality tuning knobs (in config.live.json policy):

  • Reduce w_carbon and increase w_latency for user-facing routes.
  • Set tighter constraints.max_added_latency_ms and constraints.p95_latency_budget_ms.
  • Set constraints.max_request_share_percent to cap per-zone request concentration (for example 20).
  • Use route_class=strict-local for critical locality-sensitive routes.
  • Limit migration scope via constraints.zone_allowlist and zone tags.
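
To illustrate how the latency and share-cap constraints interact, the sketch below classifies candidate zones using reason strings in the style of zone_filter_reasons (added-latency>50, share-cap, eligible). It is not Rilot's actual filter: the Zone type, the filter_reason function, the order of checks, and the sample numbers are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    name: str
    added_latency_ms: float  # extra latency compared with the local zone
    requests_served: int     # requests already routed to this zone


def filter_reason(zone, total_requests, max_added_latency_ms, max_request_share_percent):
    """Return a reason string: latency check first, then the share cap."""
    if zone.added_latency_ms > max_added_latency_ms:
        return f"added-latency>{max_added_latency_ms:g}"
    share = 100.0 * zone.requests_served / total_requests if total_requests else 0.0
    if share >= max_request_share_percent:
        return "share-cap"
    return "eligible"


# Hypothetical zones; 100 requests have been routed so far.
zones = [
    Zone("zone-a", added_latency_ms=80.0, requests_served=5),
    Zone("zone-b", added_latency_ms=10.0, requests_served=30),
    Zone("zone-c", added_latency_ms=10.0, requests_served=5),
]

reasons = {
    z.name: filter_reason(z, total_requests=100,
                          max_added_latency_ms=50, max_request_share_percent=20)
    for z in zones
}
print(reasons)
```

In this toy setup, tightening max_added_latency_ms removes distant zones from consideration, while lowering max_request_share_percent stops any single zone from absorbing most of the traffic.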

Related docs

  • docs/research-toolkit.md
  • docs/runtime-behavior.md
  • docs/config-reference.md