[DND-172] [BE] perf: optimize opik-python-backend Dockerfile#7012
Open
GuySaar8 wants to merge 2 commits into
Apply the DND-171 sandbox-executor optimizations to the python-backend image, scoped to this server's constraints:
- Bytecode-compile the venv deps (compileall -o 2 -b, PYTHONNODEBUGRANGES=1), delete .py/.pyi/__pycache__, keep *.dist-info for importlib.metadata.
- Strip pip/setuptools/wheel/ensurepip from the venv.
- Consolidate runtime ENV (LITELLM_MODE=PRODUCTION, PYTHONNODEBUGRANGES=1, PYTHONDONTWRITEBYTECODE=1); add syntax header + stage banners.

Diverged from DND-171 where the runtime differs: strip scope is the venv site-packages only. src/ keeps its .py because optimizer_runner.py is exec'd as a subprocess script and config.py is read via Flask from_pyfile. Kept root user + Alpine + tini + dockerd entrypoint since the server needs the Docker socket to spawn sandbox-executor containers.

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
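The bytecode-only step described in this commit message can be sketched in plain Python. This is a minimal illustration, not the PR's Dockerfile: the module name bytecode_only_demo and the helper strip_to_bytecode are hypothetical, and it assumes that `compileall -o 2 -b` corresponds to `compile_dir(..., optimize=2, legacy=True)` run by the same interpreter that later imports the result.

```python
import compileall
import importlib
import pathlib
import sys
import tempfile


def strip_to_bytecode(root: pathlib.Path) -> bool:
    """Compile every .py under root to a sibling .pyc, then delete the sources."""
    # legacy=True mirrors `compileall -b`: it writes pkg/mod.pyc next to the
    # source instead of pkg/__pycache__/..., so imports still resolve once
    # the .py files are gone.
    ok = bool(compileall.compile_dir(str(root), optimize=2, legacy=True, quiet=1))
    if not ok:
        raise RuntimeError("compilation failed; refusing to delete sources")
    for pattern in ("*.py", "*.pyi"):
        for path in list(root.rglob(pattern)):
            path.unlink()
    return ok


with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    # Hypothetical stand-in for a dependency inside the venv.
    (root / "bytecode_only_demo.py").write_text(
        'def answer():\n    """Dropped at optimize=2."""\n    return 42\n'
    )
    compiled_ok = strip_to_bytecode(root)
    remaining = sorted(p.name for p in root.iterdir())

    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        module = importlib.import_module("bytecode_only_demo")
        result = module.answer()
        docstring = module.answer.__doc__
    finally:
        sys.path.remove(tmp)

print(compiled_ok, remaining, result, docstring)
```

The import succeeds from the .pyc alone, and the function's docstring is gone because optimization level 2 strips docstrings. Anything that reads source text at runtime (frame inspection, exec of a .py file) would not survive this treatment, which is why the strip is limited to site-packages.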
Contributor
Python Backend Tests Results: 196 tests, 193 ✅, 3m 0s ⏱️. Results for commit 027799d. ♻️ This comment has been updated with latest results.
…istances

test_demo_spans_all_shifted_to_latest_trace_date asserted shifted_span_start.date() == shifted_latest_trace_end.date(). The shift moves the latest trace end to now(), so a span originally earlier in the day shifts to a fixed delta before now, which lands on the previous calendar day whenever the run happens just after UTC midnight. That fails the .date() equality even though the spans are <4h apart.

Assert that the preserved gap (shifted_latest_trace_end - shifted_span_start) stays within one day instead of comparing calendar dates. The maximum gap in the real demo data is ~3h46m, so the 1-day bound holds with margin and is immune to the rollover.

Verified passing locally inside the same post-midnight UTC window that failed CI attempts 1 and 2.

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
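The midnight rollover described in this commit message can be reproduced with a small sketch. The timestamps and the helper shift_to_now are hypothetical, chosen only so that the gap matches the ~3h46m figure quoted above; the real test and demo data live in the repository.

```python
from datetime import datetime, timedelta, timezone


def shift_to_now(span_start, latest_trace_end, now):
    """Shift both timestamps by the same delta so the latest trace ends at `now`."""
    delta = now - latest_trace_end
    return span_start + delta, latest_trace_end + delta


# Hypothetical original demo timestamps, 3h46m apart on the same day.
original_span_start = datetime(2026, 1, 5, 10, 0, tzinfo=timezone.utc)
original_latest_trace_end = datetime(2026, 1, 5, 13, 46, tzinfo=timezone.utc)

# Hypothetical test run shortly after UTC midnight.
now = datetime(2026, 6, 10, 0, 30, tzinfo=timezone.utc)

shifted_span_start, shifted_latest_trace_end = shift_to_now(
    original_span_start, original_latest_trace_end, now
)

# Old assertion: compares calendar dates, so it breaks across midnight.
same_calendar_day = shifted_span_start.date() == shifted_latest_trace_end.date()

# New assertion: compares the preserved gap, which the shift never changes.
gap = shifted_latest_trace_end - shifted_span_start
gap_within_one_day = timedelta(0) <= gap <= timedelta(days=1)

print(same_calendar_day, gap, gap_within_one_day)
```

Because the shift adds the same delta to both timestamps, the gap is invariant under any choice of `now`, while the calendar-date comparison depends on where `now` falls relative to midnight.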
obezpalko
approved these changes
Jun 10, 2026
andrescrz
requested changes
Jun 10, 2026
Member
Hi @GuySaar8
Before you move forward, can you please explain the goal of this PR?
This service already runs quite optimally, and I don't see any relevant information here about what we're trying to accomplish.
In addition, can you provide some benchmarking results?
Finally, this could be a sensitive change. Have you regression-tested the main functionality of this service?
- Python Online Evaluations.
  - With docker executor.
  - With process executor.
- Demo data generation.
- Optimisations execution.
Details
Applies the same Dockerfile optimizations as #7010 (DND-171, sandbox-executor) to the opik-python-backend image: faster builds, smaller bytecode-only dependency layer, and safer gating, scoped to this server's runtime constraints.
- Bytecode-compile the venv deps (compileall -o 2 -b, PYTHONNODEBUGRANGES=1), delete *.py / *.pyi / __pycache__, and strip pip / setuptools / wheel / ensurepip. *.dist-info is intentionally kept: opik / litellm / tiktoken read their own version via importlib.metadata at import time.
- Consolidate runtime ENV: LITELLM_MODE=PRODUCTION (avoids litellm's load_dotenv() frame inspection, which breaks under a .pyc-only layout), PYTHONNODEBUGRANGES=1, PYTHONDONTWRITEBYTECODE=1. Added # syntax=docker/dockerfile:1 and stage banners.
- Commit 027799d273: test_time_shift_distances::test_demo_spans_all_shifted_to_latest_trace_date asserted shifted_span_start.date() == shifted_latest_trace_end.date(). Since the shift moves the latest trace to now(), an earlier-in-day span lands a fixed delta before now, i.e. on the previous calendar day whenever the run is just after UTC midnight, failing the .date() equality despite the spans being <4h apart. The test now asserts the preserved gap (shifted_latest_trace_end - shifted_span_start ≤ 1 day) instead of comparing calendar dates, making it immune to the rollover. Unrelated to the Dockerfile but folded in here because it was blocking this PR's CI.
Divergence from DND-171
This image is a long-running Flask/gunicorn server (Alpine + Docker CLI), not a one-shot scorer, so two parts of the sandbox pattern do not carry over:
- src/ keeps its .py: optimizer_runner.py is executed as a subprocess script and config.py is read via Flask from_pyfile, so deleting app source would break optimizer jobs.
- Root user kept, no selftest.sh gate. The server needs the Docker socket to spawn sandbox-executor containers, so it can't drop to the unprivileged USER 1001 the sandbox uses; the scoring selftest.sh build-gate is sandbox-specific and N/A here.
Change checklist
Issues
AI-WATERMARK
AI-WATERMARK: yes
Testing
Built and verified locally with docker buildx (colima, arm64):
- docker buildx build --load apps/opik-python-backend: builds green.
- Imports under the .pyc-only stripped venv succeed: flask, gunicorn, docker, litellm, opik, tiktoken, redis, rq, pydantic, opentelemetry + opik.evaluation.metrics.BaseMetric → DEPS_OK. This is the main risk in the bytecode-only layout (litellm under .pyc-only), confirmed working.
- File counts: .py=0, .pyc=10453, .pyi=0, __pycache__=0, .dist-info=129 kept; pip/setuptools stripped; src/*.py=35 kept; optimizer_runner.py present.
- test_time_shift_distances (both tests) run locally inside the same post-midnight UTC window that failed CI earlier → 2 passed; the full backend-tests run on the fix commit is green.
Documentation
No documentation changes — internal build optimization only.