Research Kit

This folder provides the paper-aligned live-style Docker workflow for comparative evaluation of Rilot routing policies.

Components

docker-compose.live.yml: starts Rilot, ten high-consumption zone simulators, and Prometheus.
config.live.json: 10-zone live-style config with per-zone share cap (max_request_share_percent).
prometheus.yml: scrape config for /metrics.
scripts/run_live_experiment.sh: primary comparative runner (10 zones + local dynamic carbon API) that writes to result_live/.
scripts/run_experiment.sh: compatibility wrapper that delegates to run_live_experiment.sh.
scripts/run_comparative_evaluation.py: request-level and summary report generator.
scripts/carbon-signal-api.js: local ElectricityMap-compatible dynamic API for reproducible live-style runs.
carbon-traces/us-grid-sample.csv: sample trace format.
carbon-traces/electricitymap-latest-sample.json: ElectricityMap-style local fixture.

Quickstart

./scripts/run_live_experiment.sh

Outputs are written to ./result_live/comparative-live by default. Both runner scripts (run_live_experiment.sh and the run_experiment.sh wrapper) generate charts.html automatically in that folder.

Generated output includes:

per-request latency CSV
per-mode Prometheus snapshots
summary CSV/JSON/Markdown tables for paper-ready comparison
baseline-relative trade-off metrics (exposure/CO2e savings, latency delta, error rate, CPU sample delta, memory sample delta)
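
As a rough illustration of what "baseline-relative" means here, the sketch below computes a percentage saving and an absolute delta from two scenario values. The function names and formulas are assumptions for illustration; the exact definitions used by run_comparative_evaluation.py may differ.

```python
from typing import Optional


def relative_saving_percent(baseline: float, candidate: float) -> Optional[float]:
    """Percentage by which `candidate` is lower than `baseline`.

    Positive means the candidate scenario saved exposure relative to the
    baseline. Returns None when the baseline is zero, because the ratio is
    undefined in that case.
    """
    if baseline == 0:
        return None
    return (baseline - candidate) * 100.0 / baseline


def absolute_delta(baseline: float, candidate: float) -> float:
    """Signed difference (candidate minus baseline), e.g. a latency delta in ms."""
    return candidate - baseline


if __name__ == "__main__":
    # Hypothetical numbers, not taken from any real run.
    print(relative_saving_percent(400.0, 392.0))  # exposure saving in percent
    print(absolute_delta(120.0, 126.5))           # latency delta in ms
```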

requests.csv now includes explainability fields:

request_region and selected_region
cross_region_reroute (true/false)
selected_carbon_intensity_g_per_kwh
zone_filter_reasons (per-zone eligibility/constraint reason, e.g. added-latency>50, share-cap, eligible)
carbon_saved_vs_worst_g_per_kwh
decision_reason
decision_reason_brief
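
The fields above can be aggregated with a few lines of Python. This is a minimal sketch that assumes requests.csv has a header row with the column names listed here and that cross_region_reroute is written as the literal strings true/false; the SAMPLE rows are made up for illustration.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows using the column names documented above.
SAMPLE = """request_region,selected_region,cross_region_reroute,selected_carbon_intensity_g_per_kwh
east,east,false,400
east,west,true,200
west,west,false,300
"""


def summarize_reroutes(csv_file):
    """Count cross-region reroutes and average the selected carbon intensity."""
    total = 0
    reroutes = 0
    intensity_sum = 0.0
    pairs = Counter()  # (request_region, selected_region) -> reroute count
    for row in csv.DictReader(csv_file):
        total += 1
        intensity_sum += float(row["selected_carbon_intensity_g_per_kwh"])
        if row["cross_region_reroute"].strip().lower() == "true":
            reroutes += 1
            pairs[(row["request_region"], row["selected_region"])] += 1
    return {
        "requests": total,
        "cross_region_reroutes": reroutes,
        "mean_selected_intensity": intensity_sum / total if total else 0.0,
        "reroute_pairs": dict(pairs),
    }


if __name__ == "__main__":
    print(summarize_reroutes(io.StringIO(SAMPLE)))
```

To run it against a real result folder, pass an open file instead of the sample, for example open("result_live/comparative-live/requests.csv", newline="").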

The comparative summary now also reports reroute counts per mode:

cross_region_reroutes
east_to_west_reroutes
west_to_east_reroutes

Resource-overhead fields are also included per scenario:

cpu_percent_sample, cpu_sample_method, cpu_delta_percent_vs_baseline
memory_mb_sample, memory_delta_mb_vs_baseline

Run a longer-duration case study (same setup, larger workload):

REQUESTS_PER_REGION=1000 ./scripts/run_live_experiment.sh

Run the 10-zone live-style study with high-consumption workloads (the default configuration):

./scripts/run_live_experiment.sh

This uses:

docker-compose.live.yml
config.live.json as base, rewritten into config.live.dynamic.json for run-time overrides
local dynamic ElectricityMap-compatible API (scripts/carbon-signal-api.js)
route "/heavy?burn_ms=40" by default
no carbon cache (carbon.cache_ttl_seconds=0)
random carbon intensities in [100, 700] by default
output directory result_live/comparative-live (stable path)
cross-region RTT emulation enabled by default (RILOT_EMULATE_CROSS_REGION_RTT=true)

Useful overrides:

REQUESTS_PER_REGION=500 ./scripts/run_live_experiment.sh
RILOT_EMULATE_CROSS_REGION_RTT=false ./scripts/run_live_experiment.sh
CARBON_API_MIN_G=150 CARBON_API_MAX_G=650 ./scripts/run_live_experiment.sh

# Live ElectricityMap mode (requires API key)
CARBON_PROVIDER_OVERRIDE=electricitymap \
ELECTRICITYMAP_API_KEY_OVERRIDE=<your_api_key> \
REQUESTS_PER_REGION=1000 \
./scripts/run_live_experiment.sh
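
When several override combinations need to be run back to back, the same environment variables can be set from a small Python driver. This is a sketch only: build_env and run_experiment are hypothetical helper names, and the script path assumes the driver is started from the research-kit folder.

```python
import os
import subprocess


def build_env(overrides, base=None):
    """Return a copy of the base environment with the overrides applied.

    Values are converted to strings because environment variables
    must be strings.
    """
    env = dict(os.environ if base is None else base)
    env.update({key: str(value) for key, value in overrides.items()})
    return env


def run_experiment(overrides, script="./scripts/run_live_experiment.sh"):
    """Run the live experiment script once with the given overrides."""
    return subprocess.run([script], env=build_env(overrides), check=True)


if __name__ == "__main__":
    # Only prints the override values; call run_experiment(...) to launch a run.
    env = build_env({"REQUESTS_PER_REGION": 500, "CARBON_API_MIN_G": 150})
    print(env["REQUESTS_PER_REGION"], env["CARBON_API_MIN_G"])
```

Because the runner writes to a stable output path (result_live/comparative-live), copy or rename that folder between runs if each result set should be kept.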

Enable/disable timeout robustness scenario:

ENABLE_FAILURE_SCENARIO=1 ./scripts/run_live_experiment.sh
ENABLE_FAILURE_SCENARIO=0 ./scripts/run_live_experiment.sh

Run weight sensitivity analysis:

python3 ./scripts/run_weight_sensitivity.py

Generate an interactive chart dashboard from the latest comparative run:

node ./scripts/charts.js

Optional:

# Use a specific run folder
node ./scripts/charts.js --input-dir ./result_live/comparative-live

# Use live result base
node ./scripts/charts.js --results-base ./result_live

This writes charts.html into the selected comparative result folder.

Interpreting results

Carbon-aware modes can reduce carbon-intensity exposure while keeping latency stable; in many runs the gain is modest (for example, ~1-2%) when regional carbon values are close.
latency_first typically minimizes response time at the cost of higher carbon exposure, which is why multi-objective modes (balanced, carbon_first) are included.
If CPU columns are 0.0, host CPU sampling was not captured for that run; avoid making compute-overhead claims from that dataset.
If memory columns are empty, memory sampling was not captured for that run.
To increase signal separation, run longer workloads and/or use traces with wider regional carbon spread (high-carbon vs low-carbon regions).

Submission Reproduction Bundle

Run these commands and include the generated folders in your supplementary package:

cd research-kit
ENABLE_FAILURE_SCENARIO=1 REQUESTS_PER_REGION=1000 ./scripts/run_live_experiment.sh

Expected outputs:

result_live/comparative-live/summary.{md,csv,json}
result_live/comparative-live/requests.csv
result_live/comparative-live/metrics-*.prom
result_live/comparative-live/charts.html
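
Before packaging, it can help to confirm that every expected file is present. The check below is a minimal sketch; missing_outputs is a hypothetical helper, and the patterns simply mirror the list above.

```python
from pathlib import Path

# File patterns taken from the expected-outputs list above.
EXPECTED_PATTERNS = [
    "summary.md",
    "summary.csv",
    "summary.json",
    "requests.csv",
    "metrics-*.prom",
    "charts.html",
]


def missing_outputs(result_dir):
    """Return the expected patterns that match no file in result_dir."""
    root = Path(result_dir)
    return [pattern for pattern in EXPECTED_PATTERNS if not any(root.glob(pattern))]


if __name__ == "__main__":
    missing = missing_outputs("result_live/comparative-live")
    if missing:
        print("missing:", ", ".join(missing))
    else:
        print("all expected outputs present")
```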

Failure/operational evidence is captured by scenario carbon_first_provider_timeout in summary.*. Use this row to demonstrate timeout/fallback behavior and service stability under degraded carbon-signal conditions.

Default comparative scenario order in summary.*:

carbon_first
balanced
latency_first
carbon_first_provider_timeout (when ENABLE_FAILURE_SCENARIO=1)
explicit_cross_region_to_green (when fixture has a clear greener region)
baseline_no_carbon_strict_local
baseline_no_carbon_latency_first
baseline_no_carbon_balanced

Fairness/user-impact evidence is captured in reroute columns:

cross_region_reroutes
east_to_west_reroutes
west_to_east_reroutes

Use these with latency/error metrics to report trade-offs and justify policy guardrails.

Fairness/locality tuning knobs (in config.live.json policy):

Reduce w_carbon and increase w_latency for user-facing routes.
Set tighter constraints.max_added_latency_ms and constraints.p95_latency_budget_ms.
Set constraints.max_request_share_percent to cap per-zone request concentration (for example 20).
Use route_class=strict-local for critical locality-sensitive routes.
Limit migration scope via constraints.zone_allowlist and zone tags.
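
To see how these knobs interact, the sketch below filters zones by the two constraints and then picks the lowest weighted score. It is an illustrative model only, not Rilot's actual scoring code: the Zone fields, the normalization, and the pick_zone function are assumptions; only the knob names (w_carbon, w_latency, max_added_latency_ms, max_request_share_percent) come from the list above.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    name: str
    carbon_g_per_kwh: float
    added_latency_ms: float
    request_share_percent: float  # share of requests already sent to this zone


def pick_zone(zones, w_carbon, w_latency, max_added_latency_ms, max_request_share_percent):
    """Pick the eligible zone with the lowest weighted, normalized score."""
    # Constraint filtering: drop zones over the latency or share cap.
    eligible = [
        z for z in zones
        if z.added_latency_ms <= max_added_latency_ms
        and z.request_share_percent < max_request_share_percent
    ]
    if not eligible:
        return None
    # Normalize each objective to [0, 1] over the eligible zones.
    max_carbon = max(z.carbon_g_per_kwh for z in eligible) or 1.0
    max_latency = max(z.added_latency_ms for z in eligible) or 1.0

    def score(z):
        return (w_carbon * z.carbon_g_per_kwh / max_carbon
                + w_latency * z.added_latency_ms / max_latency)

    return min(eligible, key=score)


if __name__ == "__main__":
    # Hypothetical zones, not taken from config.live.json.
    zones = [
        Zone("local", 500.0, 0.0, 10.0),
        Zone("green", 200.0, 40.0, 10.0),
        Zone("far", 100.0, 80.0, 10.0),  # filtered out: added latency > 50 ms
    ]
    print(pick_zone(zones, 0.8, 0.2, 50.0, 20.0).name)  # carbon-heavy weights
    print(pick_zone(zones, 0.2, 0.8, 50.0, 20.0).name)  # latency-heavy weights
```

With these made-up zones, the carbon-heavy weights favor the greener remote zone and the latency-heavy weights keep traffic local, which is the trade-off the tuning knobs are meant to control.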

Related docs

docs/research-toolkit.md
docs/runtime-behavior.md
docs/config-reference.md
