[Draft] Speedup dependencies installation via UV #8631

Open
wants to merge 12 commits into
base: develop

Bobronium
Member

@Bobronium Bobronium commented Nov 1, 2024

Motivation and context

Currently, the CVAT server image takes 434 seconds to build without cache on my local machine, and 325 seconds after any change to requirements/base.txt.

This is inefficient, especially because a dependency change or another minor adjustment triggers an almost full rebuild.

Long builds waste developer time, delay testing cycles, and add unnecessary friction to iterative development. We need a way to optimize build speed and reduce the image size.

Below are the complete build logs from my machine for the Dockerfile from develop, followed by the logs for the Dockerfile from this branch:

Baseline (images built using the Dockerfile from develop)

Resulting image size for production: 1.37GB

build --no-cache — 434.4s
docker build . -t cvat_pip_no_cache --no-cache
[+] Building 434.4s (46/46) FINISHED                                                                                                                         docker:orbstack
 => [internal] load build definition from Dockerfile                                                                                                                    0.0s
 => => transferring dockerfile: 7.20kB                                                                                                                                  0.0s
 => [internal] load metadata for docker.io/library/ubuntu:22.04                                                                                                         1.4s
 => [internal] load metadata for docker.io/library/golang:1.23.0                                                                                                        1.5s
 => [internal] load .dockerignore                                                                                                                                       0.0s
 => => transferring context: 198B                                                                                                                                       0.0s
 => CACHED [build-smokescreen 1/3] FROM docker.io/library/golang:1.23.0@sha256:acfb46be39840f8c2a6b9efdd673c6627011200c73bab4e6d18b8b9ab4641c46                         0.0s
 => [internal] load build context                                                                                                                                       0.2s
 => => transferring context: 79.69kB                                                                                                                                    0.1s
 => CACHED [build-image-base 1/3] FROM docker.io/library/ubuntu:22.04@sha256:0e5e4a57c2499249aafc3b40fcd541e9a456aab7296681a3994d631587203f97                           0.0s
 => [build-smokescreen 2/3] RUN git clone --filter=blob:none --no-checkout https://github.com/stripe/smokescreen.git                                                    2.5s
 => [stage-4  2/22] RUN apt-get update &&     DEBIAN_FRONTEND=noninteractive apt-get --no-install-recommends install -yq         bzip2         ca-certificates         61.6s
 => [build-image-base 2/3] RUN apt-get update &&     DEBIAN_FRONTEND=noninteractive apt-get --no-install-recommends install -yq         curl         g++         gcc   62.8s
 => [build-smokescreen 3/3] RUN cd smokescreen && git checkout eb1ac09 && go build -o /tmp/smokescreen                                                                 12.2s
 => [stage-4  3/22] COPY --from=build-smokescreen /tmp/smokescreen /usr/local/bin/smokescreen                                                                           0.3s
 => [stage-4  4/22] RUN adduser --shell /bin/bash --disabled-password --gecos "" django                                                                                 0.5s
 => [stage-4  5/22] RUN if [ "no" = "yes" ]; then         apt-get update &&         apt-get --no-install-recommends install -yq             clamav             libclam  0.3s
 => [build-image-base 3/3] RUN --mount=type=cache,target=/root/.cache/pip/http     python3 -m pip install -U pip==24.0                                                  2.1s
 => [stage-4  6/22] RUN python3 -m venv /opt/venv                                                                                                                       2.1s
 => [build-image-av 1/9] WORKDIR /tmp/openh264                                                                                                                          0.1s
 => [build-image 1/4] COPY cvat/requirements/ /tmp/cvat/requirements/                                                                                                   0.1s
 => [stage-4  7/22] RUN python -m pip install --upgrade setuptools                                                                                                      1.9s
 => [build-image-av 2/9] RUN curl -sL https://github.com/cisco/openh264/archive/v2.1.1.tar.gz --output - |     tar -zx --strip-components=1 &&     make -j5 && make i  12.5s
 => [build-image 2/4] COPY utils/dataset_manifest/requirements.txt /tmp/utils/dataset_manifest/requirements.txt                                                         0.2s
 => [build-image 3/4] RUN sed -i '/^av==/d' /tmp/utils/dataset_manifest/requirements.txt                                                                                0.2s
 => [build-image 4/4] RUN --mount=type=cache,target=/root/.cache/pip/http-v2     DATUMARO_HEADLESS=1 python3 -m pip wheel --no-deps --no-binary lxml,xmlsec     -r /  344.5s
 => [stage-4  8/22] RUN python -m pip install -U pip==24.0                                                                                                              2.6s
 => [build-image-av 3/9] WORKDIR /tmp/ffmpeg                                                                                                                            0.1s
 => [build-image-av 4/9] RUN curl -sL https://ffmpeg.org/releases/ffmpeg-4.3.1.tar.gz --output - |     tar -zx --strip-components=1 &&     ./configure --disable-nonf  93.2s
 => [build-image-av 5/9] COPY utils/dataset_manifest/requirements.txt /tmp/utils/dataset_manifest/requirements.txt                                                      0.1s
 => [build-image-av 6/9] RUN grep -q '^av==' /tmp/utils/dataset_manifest/requirements.txt                                                                               0.1s
 => [build-image-av 7/9] RUN sed -i '/^av==/!d' /tmp/utils/dataset_manifest/requirements.txt                                                                            0.1s
 => [build-image-av 8/9] RUN pip install setuptools wheel 'cython<3'                                                                                                    1.7s
 => [build-image-av 9/9] RUN --mount=type=cache,target=/root/.cache/pip/http-v2     python3 -m pip wheel --no-binary=av --no-build-isolation     -r /tmp/utils/datase  45.1s
 => [stage-4  9/22] RUN --mount=type=bind,from=build-image,source=/tmp/wheelhouse,target=/mnt/wheelhouse     --mount=type=bind,from=build-image-av,source=/tmp/wheelh  17.6s
 => [stage-4 10/22] COPY --from=build-image-av /opt/ffmpeg/lib /usr/lib                                                                                                 0.1s
 => [stage-4 11/22] RUN if [ "${CVAT_DEBUG_ENABLED}" = 'yes' ]; then         python3 -m pip install --no-cache-dir debugpy;     fi                                      0.1s
 => [stage-4 12/22] RUN python -m pip uninstall -y pip                                                                                                                  0.4s
 => [stage-4 13/22] COPY cvat/nginx.conf /etc/nginx/nginx.conf                                                                                                          0.0s
 => [stage-4 14/22] COPY --chown=django components /tmp/components                                                                                                      0.0s
 => [stage-4 15/22] COPY --chown=django supervisord/ /home/django/supervisord                                                                                           0.0s
 => [stage-4 16/22] COPY --chown=django wait-for-it.sh manage.py backend_entrypoint.sh wait_for_deps.sh /home/django/                                                   0.1s
 => [stage-4 17/22] COPY --chown=django utils/ /home/django/utils                                                                                                       0.1s
 => [stage-4 18/22] COPY --chown=django cvat/ /home/django/cvat                                                                                                         0.4s
 => [stage-4 19/22] COPY --chown=django rqscheduler.py /home/django                                                                                                     0.1s
 => [stage-4 20/22] RUN if [ "${COVERAGE_PROCESS_START}" ]; then         echo "import coverage; coverage.process_startup()" > /opt/venv/lib/python3.10/site-packages/c  0.1s
 => [stage-4 21/22] WORKDIR /home/django                                                                                                                                0.1s
 => [stage-4 22/22] RUN mkdir -p data share keys logs /tmp/supervisord static                                                                                           0.2s
 => exporting to image                                                                                                                                                  2.4s
 => => exporting layers                                                                                                                                                 2.3s
 => => writing image sha256:ba06ec7ca5662cdcfe516c3f0f5a6a4aa480cb24bb982995355bc0e581c40fe1                                                                            0.0s
 => => naming to docker.io/library/cvat_pip_no_cache                                         
build after changing requirements.txt — 325.4s
.venv ❯ docker build .
[+] Building 325.4s (46/46) FINISHED                                                                                                                         docker:orbstack
 => [internal] load build definition from Dockerfile                                                                                                                    0.0s
 => => transferring dockerfile: 7.20kB                                                                                                                                  0.0s
 => [internal] load metadata for docker.io/library/golang:1.23.0                                                                                                        1.4s
 => [internal] load metadata for docker.io/library/ubuntu:22.04                                                                                                         1.4s
 => [internal] load .dockerignore                                                                                                                                       0.0s
 => => transferring context: 198B                                                                                                                                       0.0s
 => [build-smokescreen 1/3] FROM docker.io/library/golang:1.23.0@sha256:acfb46be39840f8c2a6b9efdd673c6627011200c73bab4e6d18b8b9ab4641c46                                0.0s
 => [internal] load build context                                                                                                                                       0.1s
 => => transferring context: 75.31kB                                                                                                                                    0.0s
 => [build-image-base 1/3] FROM docker.io/library/ubuntu:22.04@sha256:0e5e4a57c2499249aafc3b40fcd541e9a456aab7296681a3994d631587203f97                                  0.0s
 => CACHED [build-image-base 2/3] RUN apt-get update &&     DEBIAN_FRONTEND=noninteractive apt-get --no-install-recommends install -yq         curl         g++         0.0s
 => CACHED [build-image-base 3/3] RUN --mount=type=cache,target=/root/.cache/pip/http     python3 -m pip install -U pip==24.0                                           0.0s
 => [build-image 1/4] COPY cvat/requirements/ /tmp/cvat/requirements/                                                                                                   0.2s
 => [build-image 2/4] COPY utils/dataset_manifest/requirements.txt /tmp/utils/dataset_manifest/requirements.txt                                                         0.1s
 => [build-image 3/4] RUN sed -i '/^av==/d' /tmp/utils/dataset_manifest/requirements.txt                                                                                0.3s
 => [build-image 4/4] RUN --mount=type=cache,target=/root/.cache/pip/http-v2     DATUMARO_HEADLESS=1 python3 -m pip wheel --no-deps --no-binary lxml,xmlsec     -r /  295.0s
 => CACHED [build-image-av 1/9] WORKDIR /tmp/openh264                                                                                                                   0.0s
 => CACHED [build-image-av 2/9] RUN curl -sL https://github.com/cisco/openh264/archive/v2.1.1.tar.gz --output - |     tar -zx --strip-components=1 &&     make -j5 &&   0.0s
 => CACHED [build-image-av 3/9] WORKDIR /tmp/ffmpeg                                                                                                                     0.0s
 => CACHED [build-image-av 4/9] RUN curl -sL https://ffmpeg.org/releases/ffmpeg-4.3.1.tar.gz --output - |     tar -zx --strip-components=1 &&     ./configure --disabl  0.0s
 => CACHED [build-image-av 5/9] COPY utils/dataset_manifest/requirements.txt /tmp/utils/dataset_manifest/requirements.txt                                               0.0s
 => CACHED [build-image-av 6/9] RUN grep -q '^av==' /tmp/utils/dataset_manifest/requirements.txt                                                                        0.0s
 => CACHED [build-image-av 7/9] RUN sed -i '/^av==/!d' /tmp/utils/dataset_manifest/requirements.txt                                                                     0.0s
 => CACHED [build-image-av 8/9] RUN pip install setuptools wheel 'cython<3'                                                                                             0.0s
 => CACHED [build-image-av 9/9] RUN --mount=type=cache,target=/root/.cache/pip/http-v2     python3 -m pip wheel --no-binary=av --no-build-isolation     -r /tmp/utils/  0.0s
 => CACHED [stage-4  2/22] RUN apt-get update &&     DEBIAN_FRONTEND=noninteractive apt-get --no-install-recommends install -yq         bzip2         ca-certificates   0.0s
 => CACHED [build-smokescreen 2/3] RUN git clone --filter=blob:none --no-checkout https://github.com/stripe/smokescreen.git                                             0.0s
 => CACHED [build-smokescreen 3/3] RUN cd smokescreen && git checkout eb1ac09 && go build -o /tmp/smokescreen                                                           0.0s
 => CACHED [stage-4  3/22] COPY --from=build-smokescreen /tmp/smokescreen /usr/local/bin/smokescreen                                                                    0.0s
 => CACHED [stage-4  4/22] RUN adduser --shell /bin/bash --disabled-password --gecos "" django                                                                          0.0s
 => CACHED [stage-4  5/22] RUN if [ "no" = "yes" ]; then         apt-get update &&         apt-get --no-install-recommends install -yq             clamav               0.0s
 => CACHED [stage-4  6/22] RUN python3 -m venv /opt/venv                                                                                                                0.0s
 => CACHED [stage-4  7/22] RUN python -m pip install --upgrade setuptools                                                                                               0.0s
 => CACHED [stage-4  8/22] RUN python -m pip install -U pip==24.0                                                                                                       0.0s
 => [stage-4  9/22] RUN --mount=type=bind,from=build-image,source=/tmp/wheelhouse,target=/mnt/wheelhouse     --mount=type=bind,from=build-image-av,source=/tmp/wheelh  22.1s
 => [stage-4 10/22] COPY --from=build-image-av /opt/ffmpeg/lib /usr/lib                                                                                                 0.1s
 => [stage-4 11/22] RUN if [ "${CVAT_DEBUG_ENABLED}" = 'yes' ]; then         python3 -m pip install --no-cache-dir debugpy;     fi                                      0.2s
 => [stage-4 12/22] RUN python -m pip uninstall -y pip                                                                                                                  0.8s
 => [stage-4 13/22] COPY cvat/nginx.conf /etc/nginx/nginx.conf                                                                                                          0.1s
 => [stage-4 14/22] COPY --chown=django components /tmp/components                                                                                                      0.1s
 => [stage-4 15/22] COPY --chown=django supervisord/ /home/django/supervisord                                                                                           0.1s
 => [stage-4 16/22] COPY --chown=django wait-for-it.sh manage.py backend_entrypoint.sh wait_for_deps.sh /home/django/                                                   0.1s
 => [stage-4 17/22] COPY --chown=django utils/ /home/django/utils                                                                                                       0.1s
 => [stage-4 18/22] COPY --chown=django cvat/ /home/django/cvat                                                                                                         0.6s
 => [stage-4 19/22] COPY --chown=django rqscheduler.py /home/django                                                                                                     0.1s
 => [stage-4 20/22] RUN if [ "${COVERAGE_PROCESS_START}" ]; then         echo "import coverage; coverage.process_startup()" > /opt/venv/lib/python3.10/site-packages/c  0.1s
 => [stage-4 21/22] WORKDIR /home/django                                                                                                                                0.1s
 => [stage-4 22/22] RUN mkdir -p data share keys logs /tmp/supervisord static                                                                                           0.1s
 => exporting to image                                                                                                                                                  2.9s
 => => exporting layers                                                                                                                                                 2.8s
 => => writing image sha256:d58a86916699e3dcb80e817d6efc6886af3a747f1c6eb854e83252206ea3dc3a      
build with layer cache — 1.8s
.venv ❯ docker build .
[+] Building 1.8s (46/46) FINISHED                                                                                                                           docker:orbstack
 => [internal] load build definition from Dockerfile                                                                                                                    0.0s
 => => transferring dockerfile: 7.20kB                                                                                                                                  0.0s
 => [internal] load metadata for docker.io/library/ubuntu:22.04                                                                                                         1.5s
 => [internal] load metadata for docker.io/library/golang:1.23.0                                                                                                        1.5s
 => [internal] load .dockerignore                                                                                                                                       0.0s
 => => transferring context: 198B                                                                                                                                       0.0s
 => [build-smokescreen 1/3] FROM docker.io/library/golang:1.23.0@sha256:acfb46be39840f8c2a6b9efdd673c6627011200c73bab4e6d18b8b9ab4641c46                                0.0s
 => [build-image-base 1/3] FROM docker.io/library/ubuntu:22.04@sha256:0e5e4a57c2499249aafc3b40fcd541e9a456aab7296681a3994d631587203f97                                  0.0s
 => [internal] load build context                                                                                                                                       0.1s
 => => transferring context: 66.99kB                                                                                                                                    0.1s
 => CACHED [stage-4  2/22] RUN apt-get update &&     DEBIAN_FRONTEND=noninteractive apt-get --no-install-recommends install -yq         bzip2         ca-certificates   0.0s
 => CACHED [build-smokescreen 2/3] RUN git clone --filter=blob:none --no-checkout https://github.com/stripe/smokescreen.git                                             0.0s
 => CACHED [build-smokescreen 3/3] RUN cd smokescreen && git checkout eb1ac09 && go build -o /tmp/smokescreen                                                           0.0s
 => CACHED [stage-4  3/22] COPY --from=build-smokescreen /tmp/smokescreen /usr/local/bin/smokescreen                                                                    0.0s
 => CACHED [stage-4  4/22] RUN adduser --shell /bin/bash --disabled-password --gecos "" django                                                                          0.0s
 => CACHED [stage-4  5/22] RUN if [ "no" = "yes" ]; then         apt-get update &&         apt-get --no-install-recommends install -yq             clamav               0.0s
 => CACHED [stage-4  6/22] RUN python3 -m venv /opt/venv                                                                                                                0.0s
 => CACHED [stage-4  7/22] RUN python -m pip install --upgrade setuptools                                                                                               0.0s
 => CACHED [stage-4  8/22] RUN python -m pip install -U pip==24.0                                                                                                       0.0s
 => CACHED [build-image-base 2/3] RUN apt-get update &&     DEBIAN_FRONTEND=noninteractive apt-get --no-install-recommends install -yq         curl         g++         0.0s
 => CACHED [build-image-base 3/3] RUN --mount=type=cache,target=/root/.cache/pip/http     python3 -m pip install -U pip==24.0                                           0.0s
 => CACHED [build-image 1/4] COPY cvat/requirements/ /tmp/cvat/requirements/                                                                                            0.0s
 => CACHED [build-image 2/4] COPY utils/dataset_manifest/requirements.txt /tmp/utils/dataset_manifest/requirements.txt                                                  0.0s
 => CACHED [build-image 3/4] RUN sed -i '/^av==/d' /tmp/utils/dataset_manifest/requirements.txt                                                                         0.0s
 => CACHED [build-image 4/4] RUN --mount=type=cache,target=/root/.cache/pip/http-v2     DATUMARO_HEADLESS=1 python3 -m pip wheel --no-deps --no-binary lxml,xmlsec      0.0s
 => CACHED [build-image-av 1/9] WORKDIR /tmp/openh264                                                                                                                   0.0s
 => CACHED [build-image-av 2/9] RUN curl -sL https://github.com/cisco/openh264/archive/v2.1.1.tar.gz --output - |     tar -zx --strip-components=1 &&     make -j5 &&   0.0s
 => CACHED [build-image-av 3/9] WORKDIR /tmp/ffmpeg                                                                                                                     0.0s
 => CACHED [build-image-av 4/9] RUN curl -sL https://ffmpeg.org/releases/ffmpeg-4.3.1.tar.gz --output - |     tar -zx --strip-components=1 &&     ./configure --disabl  0.0s
 => CACHED [build-image-av 5/9] COPY utils/dataset_manifest/requirements.txt /tmp/utils/dataset_manifest/requirements.txt                                               0.0s
 => CACHED [build-image-av 6/9] RUN grep -q '^av==' /tmp/utils/dataset_manifest/requirements.txt                                                                        0.0s
 => CACHED [build-image-av 7/9] RUN sed -i '/^av==/!d' /tmp/utils/dataset_manifest/requirements.txt                                                                     0.0s
 => CACHED [build-image-av 8/9] RUN pip install setuptools wheel 'cython<3'                                                                                             0.0s
 => CACHED [build-image-av 9/9] RUN --mount=type=cache,target=/root/.cache/pip/http-v2     python3 -m pip wheel --no-binary=av --no-build-isolation     -r /tmp/utils/  0.0s
 => CACHED [stage-4  9/22] RUN --mount=type=bind,from=build-image,source=/tmp/wheelhouse,target=/mnt/wheelhouse     --mount=type=bind,from=build-image-av,source=/tmp/  0.0s
 => CACHED [stage-4 10/22] COPY --from=build-image-av /opt/ffmpeg/lib /usr/lib                                                                                          0.0s
 => CACHED [stage-4 11/22] RUN if [ "${CVAT_DEBUG_ENABLED}" = 'yes' ]; then         python3 -m pip install --no-cache-dir debugpy;     fi                               0.0s
 => CACHED [stage-4 12/22] RUN python -m pip uninstall -y pip                                                                                                           0.0s
 => CACHED [stage-4 13/22] COPY cvat/nginx.conf /etc/nginx/nginx.conf                                                                                                   0.0s
 => CACHED [stage-4 14/22] COPY --chown=django components /tmp/components                                                                                               0.0s
 => CACHED [stage-4 15/22] COPY --chown=django supervisord/ /home/django/supervisord                                                                                    0.0s
 => CACHED [stage-4 16/22] COPY --chown=django wait-for-it.sh manage.py backend_entrypoint.sh wait_for_deps.sh /home/django/                                            0.0s
 => CACHED [stage-4 17/22] COPY --chown=django utils/ /home/django/utils                                                                                                0.0s
 => CACHED [stage-4 18/22] COPY --chown=django cvat/ /home/django/cvat                                                                                                  0.0s
 => CACHED [stage-4 19/22] COPY --chown=django rqscheduler.py /home/django                                                                                              0.0s
 => CACHED [stage-4 20/22] RUN if [ "${COVERAGE_PROCESS_START}" ]; then         echo "import coverage; coverage.process_startup()" > /opt/venv/lib/python3.10/site-pac  0.0s
 => CACHED [stage-4 21/22] WORKDIR /home/django                                                                                                                         0.0s
 => CACHED [stage-4 22/22] RUN mkdir -p data share keys logs /tmp/supervisord static                                                                                    0.0s
 => exporting to image                                                                                                                                                  0.0s
 => => exporting layers                                                                                                                                                 0.0s
 => => writing image sha256:d58a86916699e3dcb80e817d6efc6886af3a747f1c6eb854e83252206ea3dc3a        
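
To see which steps dominate the baseline logs above, the step durations can be ranked. The following is a minimal sketch, not part of this PR: the helper name `slowest_steps` and the regular expression are illustrative assumptions about the plain-text BuildKit layout shown here, and the sample lines are excerpted from the baseline `--no-cache` log with whitespace collapsed.

```python
import re

# Matches step lines such as " => [build-image 4/4] RUN ...  344.5s".
# "[internal]" lines and "=> =>" sub-lines carry no "N/M" index and are skipped.
STEP_RE = re.compile(
    r"^\s*=>\s+(?:CACHED\s+)?"
    r"\[(?P<stage>[^\]\s]+)\s+(?P<index>\d+)/(?P<total>\d+)\]\s+"
    r"(?P<command>.*?)\s+(?P<seconds>\d+(?:\.\d+)?)s\s*$"
)


def slowest_steps(log_text, top=3):
    """Return the `top` slowest steps as (seconds, "stage N/M") tuples."""
    steps = []
    for line in log_text.splitlines():
        match = STEP_RE.match(line)
        if match:
            label = f"{match['stage']} {match['index']}/{match['total']}"
            steps.append((float(match["seconds"]), label))
    return sorted(steps, reverse=True)[:top]


# Excerpt of the baseline --no-cache log (whitespace collapsed).
SAMPLE_LOG = """
 => [internal] load .dockerignore 0.0s
 => [build-image-base 2/3] RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get --no-install-recommends install -yq curl g++ gcc 62.8s
 => [stage-4  6/22] RUN python3 -m venv /opt/venv 2.1s
 => [build-image 4/4] RUN --mount=type=cache,target=/root/.cache/pip/http-v2 DATUMARO_HEADLESS=1 python3 -m pip wheel --no-deps --no-binary lxml,xmlsec -r / 344.5s
 => [build-image-av 4/9] RUN curl -sL https://ffmpeg.org/releases/ffmpeg-4.3.1.tar.gz --output - | tar -zx --strip-components=1 && ./configure --disable-nonf 93.2s
"""

if __name__ == "__main__":
    for seconds, label in slowest_steps(SAMPLE_LOG):
        print(f"{seconds:7.1f}s  {label}")
```

On this excerpt the `pip wheel` step (`build-image 4/4`) ranks first at 344.5s. Note that BuildKit runs independent stages in parallel, so step durations do not simply add up to the total build time.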

Updated

Resulting image size for production: 1.24GB
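
The headline figures for the updated build can be rechecked from the numbers reported in this description. The snippet below is a small arithmetic sketch; the function names are illustrative, and the inputs are the `--no-cache` build times and production image sizes quoted here.

```python
# Figures copied from the build logs and image sizes in this description.
BASELINE_NO_CACHE_S = 434.4  # Dockerfile from develop, docker build --no-cache
UPDATED_NO_CACHE_S = 289.9   # Dockerfile from this branch, docker build --no-cache
BASELINE_IMAGE_GB = 1.37
UPDATED_IMAGE_GB = 1.24


def speedup(before, after):
    """How many times faster the `after` build is than the `before` build."""
    return before / after


def saved_minutes(before_s, after_s):
    """Wall-clock time saved, converted from seconds to minutes."""
    return (before_s - after_s) / 60


def relative_reduction(before, after):
    """Fraction by which `after` is smaller than `before`."""
    return (before - after) / before


if __name__ == "__main__":
    print(f"speedup: {speedup(BASELINE_NO_CACHE_S, UPDATED_NO_CACHE_S):.1f}x")
    print(f"saved: {saved_minutes(BASELINE_NO_CACHE_S, UPDATED_NO_CACHE_S):.1f} min")
    print(f"image size: {relative_reduction(BASELINE_IMAGE_GB, UPDATED_IMAGE_GB):.1%} smaller")
    # speedup: 1.5x
    # saved: 2.4 min
    # image size: 9.5% smaller
```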

build --no-cache — 289.9s (1.5x speedup, takes 2.4 minutes less)
 docker build . -t cvat_uv_no_cache --no-cache
[+] Building 289.9s (38/38) FINISHED                                                                                                                         docker:orbstack
 => [internal] load build definition from Dockerfile                                                                                                                    0.0s
 => => transferring dockerfile: 5.29kB                                                                                                                                  0.0s
 => [internal] load metadata for docker.io/library/ubuntu:22.04                                                                                                         1.4s
 => [internal] load metadata for docker.io/library/golang:1.23.0                                                                                                        2.5s
 => [internal] load metadata for ghcr.io/astral-sh/uv:latest                                                                                                            1.0s
 => [internal] load .dockerignore                                                                                                                                       0.0s
 => => transferring context: 198B                                                                                                                                       0.0s
 => [internal] load build context                                                                                                                                       0.1s
 => => transferring context: 775.24kB                                                                                                                                   0.1s
 => CACHED FROM ghcr.io/astral-sh/uv:latest@sha256:7775c60dca9cc5827c36757c32c75985244d8f31447565fa8147e2b2e11ad280                                                     0.0s
 => CACHED [env 1/1] FROM docker.io/library/ubuntu:22.04@sha256:0e5e4a57c2499249aafc3b40fcd541e9a456aab7296681a3994d631587203f97                                        0.0s
 => [build-smokescreen 1/3] FROM docker.io/library/golang:1.23.0@sha256:acfb46be39840f8c2a6b9efdd673c6627011200c73bab4e6d18b8b9ab4641c46                                0.4s
 => => resolve docker.io/library/golang:1.23.0@sha256:acfb46be39840f8c2a6b9efdd673c6627011200c73bab4e6d18b8b9ab4641c46                                                  0.1s
 => => sha256:7f883b9450b3678cd021ae1efa1576bf11f21510ac313df8d457534e562a4b46 2.32kB / 2.32kB                                                                          0.0s
 => => sha256:8ee80f8dad8ee4b62730a9c0a71a8693ef49f776b2f675c662a2861fd078a64f 2.82kB / 2.82kB                                                                          0.0s
 => => sha256:acfb46be39840f8c2a6b9efdd673c6627011200c73bab4e6d18b8b9ab4641c46 9.74kB / 9.74kB                                                                          0.0s
 => => sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1 32B / 32B                                                                                0.2s
 => => extracting sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1                                                                               0.0s
 => [build-image-base 1/1] RUN apt-get update &&     DEBIAN_FRONTEND=noninteractive apt-get --no-install-recommends install -yq         curl         g++         gcc   63.8s
 => [stage-4  1/16] RUN apt-get update &&     DEBIAN_FRONTEND=noninteractive apt-get --no-install-recommends install -yq         bzip2         ca-certificates         48.8s
 => [build-smokescreen 2/3] RUN git clone --filter=blob:none --no-checkout https://github.com/stripe/smokescreen.git                                                    6.8s
 => [build-smokescreen 3/3] RUN cd smokescreen && git checkout eb1ac09 && go build -o /tmp/smokescreen                                                                  8.1s
 => [stage-4  2/16] COPY --from=build-smokescreen /tmp/smokescreen /usr/local/bin/smokescreen                                                                           0.3s
 => [stage-4  3/16] RUN adduser --shell /bin/bash --disabled-password --gecos "" django                                                                                 0.4s
 => [stage-4  4/16] RUN if [ "no" = "yes" ]; then         apt-get update &&         apt-get --no-install-recommends install -yq             clamav             libclam  0.5s
 => [build-image 1/9] WORKDIR /tmp/openh264                                                                                                                             0.2s
 => [build-image 2/9] RUN curl -sL https://github.com/cisco/openh264/archive/v2.1.1.tar.gz --output - |     tar -zx --strip-components=1 &&     make -j5 && make inst  12.0s
 => [build-image 3/9] WORKDIR /tmp/ffmpeg                                                                                                                               0.1s
 => [build-image 4/9] RUN curl -sL https://ffmpeg.org/releases/ffmpeg-4.3.1.tar.gz --output - |     tar -zx --strip-components=1 &&     ./configure --disable-nonfree  78.0s
 => [build-image 5/9] WORKDIR /tmp/venv                                                                                                                                 0.1s
 => [build-image 6/9] COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/                                                                                            0.1s
 => [build-image 7/9] COPY pyproject.toml uv.lock ./                                                                                                                    0.1s
 => [build-image 8/9] RUN uv venv                                                                                                                                       0.4s
 => [build-image 9/9] RUN --mount=type=cache,target=/root/.cache/uv     uv sync      --frozen      --no-install-project     --extra production     $(if [ "${CVAT_DE  125.0s
 => [stage-4  5/16] COPY --from=build-image /opt/ffmpeg/lib /usr/lib                                                                                                    0.1s
 => [stage-4  6/16] COPY --from=build-image /opt/venv /opt/venv                                                                                                         1.2s
 => [stage-4  7/16] COPY cvat/nginx.conf /etc/nginx/nginx.conf                                                                                                          0.1s
 => [stage-4  8/16] COPY --chown=django components /tmp/components                                                                                                      0.1s
 => [stage-4  9/16] COPY --chown=django supervisord/ /home/django/supervisord                                                                                           0.1s
 => [stage-4 10/16] COPY --chown=django wait-for-it.sh manage.py backend_entrypoint.sh wait_for_deps.sh /home/django/                                                   0.2s
 => [stage-4 11/16] COPY --chown=django utils/ /home/django/utils                                                                                                       0.1s
 => [stage-4 12/16] COPY --chown=django cvat/ /home/django/cvat                                                                                                         0.4s
 => [stage-4 13/16] COPY --chown=django rqscheduler.py /home/django                                                                                                     0.1s
 => [stage-4 14/16] RUN if [ "${COVERAGE_PROCESS_START}" ]; then         echo "import coverage; coverage.process_startup()" > /opt/venv/lib/python3.10/site-packages/c  0.1s
 => [stage-4 15/16] WORKDIR /home/django                                                                                                                                0.1s
 => [stage-4 16/16] RUN mkdir -p data share keys logs /tmp/supervisord static                                                                                           0.2s
 => exporting to image                                                                                                                                                  2.2s
 => => exporting layers                                                                                                                                                 2.1s
 => => writing image sha256:08dbce81ab0d2c27cf9e9ddc54d2660a81cbb2c320b1bb903d741ed200768371                                                                            0.0s
 => => naming to docker.io/library/cvat_uv_no_cache   
build after changing uv.lock/pyproject.toml — 11.1s (29x speedup, takes 5.2 minutes less)
.venv ❯ docker build .
[+] Building 11.1s (38/38) FINISHED                                                                                                                          docker:orbstack
 => [internal] load build definition from Dockerfile                                                                                                                    0.0s
 => => transferring dockerfile: 5.34kB                                                                                                                                  0.0s
 => [internal] load metadata for ghcr.io/astral-sh/uv:latest                                                                                                            0.6s
 => [internal] load metadata for docker.io/library/ubuntu:22.04                                                                                                         0.6s
 => [internal] load metadata for docker.io/library/golang:1.23.0                                                                                                        0.6s
 => [internal] load .dockerignore                                                                                                                                       0.0s
 => => transferring context: 198B                                                                                                                                       0.0s
 => [internal] load build context                                                                                                                                       0.0s
 => => transferring context: 762.63kB                                                                                                                                   0.0s
 => [build-smokescreen 1/3] FROM docker.io/library/golang:1.23.0@sha256:acfb46be39840f8c2a6b9efdd673c6627011200c73bab4e6d18b8b9ab4641c46                                0.0s
 => [env 1/1] FROM docker.io/library/ubuntu:22.04@sha256:0e5e4a57c2499249aafc3b40fcd541e9a456aab7296681a3994d631587203f97                                               0.0s
 => FROM ghcr.io/astral-sh/uv:latest@sha256:7775c60dca9cc5827c36757c32c75985244d8f31447565fa8147e2b2e11ad280                                                            0.0s
 => CACHED [build-image-base 1/1] RUN apt-get update &&     DEBIAN_FRONTEND=noninteractive apt-get --no-install-recommends install -yq         curl         g++         0.0s
 => CACHED [build-image 1/9] WORKDIR /tmp/openh264                                                                                                                      0.0s
 => CACHED [build-image 2/9] RUN curl -sL https://github.com/cisco/openh264/archive/v2.1.1.tar.gz --output - |     tar -zx --strip-components=1 &&     make -j5 && mak  0.0s
 => CACHED [build-image 3/9] WORKDIR /tmp/ffmpeg                                                                                                                        0.0s
 => CACHED [build-image 4/9] RUN curl -sL https://ffmpeg.org/releases/ffmpeg-4.3.1.tar.gz --output - |     tar -zx --strip-components=1 &&     ./configure --disable-n  0.0s
 => CACHED [build-image 5/9] WORKDIR /tmp/venv                                                                                                                          0.0s
 => CACHED [build-image 6/9] COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/                                                                                     0.0s
 => [build-image 7/9] COPY pyproject.toml uv.lock ./                                                                                                                    0.2s
 => [build-image 8/9] RUN uv venv                                                                                                                                       0.3s
 => [build-image 9/9] RUN --mount=type=cache,target=/root/.cache/uv     uv sync      --frozen      --no-install-project     --extra production     $(if [ "${CVAT_DEBU  7.3s
 => CACHED [stage-4  1/16] RUN apt-get update &&     DEBIAN_FRONTEND=noninteractive apt-get --no-install-recommends install -yq         bzip2         ca-certificates   0.0s
 => CACHED [build-smokescreen 2/3] RUN git clone --filter=blob:none --no-checkout https://github.com/stripe/smokescreen.git                                             0.0s
 => CACHED [build-smokescreen 3/3] RUN cd smokescreen && git checkout eb1ac09 && go build -o /tmp/smokescreen                                                           0.0s
 => CACHED [stage-4  2/16] COPY --from=build-smokescreen /tmp/smokescreen /usr/local/bin/smokescreen                                                                    0.0s
 => CACHED [stage-4  3/16] RUN adduser --shell /bin/bash --disabled-password --gecos "" django                                                                          0.0s
 => CACHED [stage-4  4/16] RUN if [ "no" = "yes" ]; then         apt-get update &&         apt-get --no-install-recommends install -yq             clamav               0.0s
 => CACHED [stage-4  5/16] COPY --from=build-image /opt/ffmpeg/lib /usr/lib                                                                                             0.0s
 => CACHED [stage-4  6/16] COPY --from=build-image /opt/venv /opt/venv                                                                                                  0.0s
 => CACHED [stage-4  7/16] COPY cvat/nginx.conf /etc/nginx/nginx.conf                                                                                                   0.0s
 => CACHED [stage-4  8/16] COPY --chown=django components /tmp/components                                                                                               0.0s
 => CACHED [stage-4  9/16] COPY --chown=django supervisord/ /home/django/supervisord                                                                                    0.0s
 => CACHED [stage-4 10/16] COPY --chown=django wait-for-it.sh manage.py backend_entrypoint.sh wait_for_deps.sh /home/django/                                            0.0s
 => CACHED [stage-4 11/16] COPY --chown=django utils/ /home/django/utils                                                                                                0.0s
 => CACHED [stage-4 12/16] COPY --chown=django cvat/ /home/django/cvat                                                                                                  0.0s
 => CACHED [stage-4 13/16] COPY --chown=django rqscheduler.py /home/django                                                                                              0.0s
 => CACHED [stage-4 14/16] RUN if [ "${COVERAGE_PROCESS_START}" ]; then         echo "import coverage; coverage.process_startup()" > /opt/venv/lib/python3.10/site-pac  0.0s
 => CACHED [stage-4 15/16] WORKDIR /home/django                                                                                                                         0.0s
 => CACHED [stage-4 16/16] RUN mkdir -p data share keys logs /tmp/supervisord static                                                                                    0.0s
 => exporting to image                                                                                                                                                  0.0s
 => => exporting layers                                                                                                                                                 0.0s
 => => writing image sha256:6fae001733b10ab426d449fe5795176e4b3d750f44ff0a8ce9ac6be0e150a476   
build with layer cache — 1.0s
[+] Building 1.0s (38/38) FINISHED                                                                                                                           docker:orbstack
 => [internal] load build definition from Dockerfile                                                                                                                    0.0s
 => => transferring dockerfile: 5.34kB                                                                                                                                  0.0s
 => [internal] load metadata for docker.io/library/ubuntu:22.04                                                                                                         0.7s
 => [internal] load metadata for ghcr.io/astral-sh/uv:latest                                                                                                            0.7s
 => [internal] load metadata for docker.io/library/golang:1.23.0                                                                                                        0.7s
 => [internal] load .dockerignore                                                                                                                                       0.0s
 => => transferring context: 198B                                                                                                                                       0.0s
 => [internal] load build context                                                                                                                                       0.1s
 => => transferring context: 69.52kB                                                                                                                                    0.1s
 => FROM ghcr.io/astral-sh/uv:latest@sha256:7775c60dca9cc5827c36757c32c75985244d8f31447565fa8147e2b2e11ad280                                                            0.0s
 => [env 1/1] FROM docker.io/library/ubuntu:22.04@sha256:0e5e4a57c2499249aafc3b40fcd541e9a456aab7296681a3994d631587203f97                                               0.0s
 => [build-smokescreen 1/3] FROM docker.io/library/golang:1.23.0@sha256:acfb46be39840f8c2a6b9efdd673c6627011200c73bab4e6d18b8b9ab4641c46                                0.0s
 => CACHED [stage-4  1/16] RUN apt-get update &&     DEBIAN_FRONTEND=noninteractive apt-get --no-install-recommends install -yq         bzip2         ca-certificates   0.0s
 => CACHED [build-smokescreen 2/3] RUN git clone --filter=blob:none --no-checkout https://github.com/stripe/smokescreen.git                                             0.0s
 => CACHED [build-smokescreen 3/3] RUN cd smokescreen && git checkout eb1ac09 && go build -o /tmp/smokescreen                                                           0.0s
 => CACHED [stage-4  2/16] COPY --from=build-smokescreen /tmp/smokescreen /usr/local/bin/smokescreen                                                                    0.0s
 => CACHED [stage-4  3/16] RUN adduser --shell /bin/bash --disabled-password --gecos "" django                                                                          0.0s
 => CACHED [stage-4  4/16] RUN if [ "no" = "yes" ]; then         apt-get update &&         apt-get --no-install-recommends install -yq             clamav               0.0s
 => CACHED [build-image-base 1/1] RUN apt-get update &&     DEBIAN_FRONTEND=noninteractive apt-get --no-install-recommends install -yq         curl         g++         0.0s
 => CACHED [build-image 1/9] WORKDIR /tmp/openh264                                                                                                                      0.0s
 => CACHED [build-image 2/9] RUN curl -sL https://github.com/cisco/openh264/archive/v2.1.1.tar.gz --output - |     tar -zx --strip-components=1 &&     make -j5 && mak  0.0s
 => CACHED [build-image 3/9] WORKDIR /tmp/ffmpeg                                                                                                                        0.0s
 => CACHED [build-image 4/9] RUN curl -sL https://ffmpeg.org/releases/ffmpeg-4.3.1.tar.gz --output - |     tar -zx --strip-components=1 &&     ./configure --disable-n  0.0s
 => CACHED [build-image 5/9] WORKDIR /tmp/venv                                                                                                                          0.0s
 => CACHED [build-image 6/9] COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/                                                                                     0.0s
 => CACHED [build-image 7/9] COPY pyproject.toml uv.lock ./                                                                                                             0.0s
 => CACHED [build-image 8/9] RUN uv venv                                                                                                                                0.0s
 => CACHED [build-image 9/9] RUN --mount=type=cache,target=/root/.cache/uv     uv sync      --frozen      --no-install-project     --extra production     $(if [ "${CV  0.0s
 => CACHED [stage-4  5/16] COPY --from=build-image /opt/ffmpeg/lib /usr/lib                                                                                             0.0s
 => CACHED [stage-4  6/16] COPY --from=build-image /opt/venv /opt/venv                                                                                                  0.0s
 => CACHED [stage-4  7/16] COPY cvat/nginx.conf /etc/nginx/nginx.conf                                                                                                   0.0s
 => CACHED [stage-4  8/16] COPY --chown=django components /tmp/components                                                                                               0.0s
 => CACHED [stage-4  9/16] COPY --chown=django supervisord/ /home/django/supervisord                                                                                    0.0s
 => CACHED [stage-4 10/16] COPY --chown=django wait-for-it.sh manage.py backend_entrypoint.sh wait_for_deps.sh /home/django/                                            0.0s
 => CACHED [stage-4 11/16] COPY --chown=django utils/ /home/django/utils                                                                                                0.0s
 => CACHED [stage-4 12/16] COPY --chown=django cvat/ /home/django/cvat                                                                                                  0.0s
 => CACHED [stage-4 13/16] COPY --chown=django rqscheduler.py /home/django                                                                                              0.0s
 => CACHED [stage-4 14/16] RUN if [ "${COVERAGE_PROCESS_START}" ]; then         echo "import coverage; coverage.process_startup()" > /opt/venv/lib/python3.10/site-pac  0.0s
 => CACHED [stage-4 15/16] WORKDIR /home/django                                                                                                                         0.0s
 => CACHED [stage-4 16/16] RUN mkdir -p data share keys logs /tmp/supervisord static                                                                                    0.0s
 => exporting to image                                                                                                                                                  0.0s
 => => exporting layers                                                                                                                                                 0.0s
 => => writing image sha256:6fae001733b10ab426d449fe5795176e4b3d750f44ff0a8ce9ac6be0e150a476                                                                            0.0s
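
The speedup quoted for the rebuild after a dependency change can be checked with a little arithmetic. The sketch below uses only the timings reported in this description (325 s for the baseline rebuild after editing requirements/base.txt, 11.1 s for the uv rebuild). The function name `speedup` is hypothetical and is not part of this PR.

```python
def speedup(baseline_s: float, new_s: float) -> tuple[float, float]:
    """Return (speedup factor, minutes saved) for two build times in seconds."""
    return baseline_s / new_s, (baseline_s - new_s) / 60


# Timings taken from the build logs in this description.
factor, minutes_saved = speedup(325.0, 11.1)
print(f"{factor:.0f}x faster, {minutes_saved:.1f} minutes saved")
# prints "29x faster, 5.2 minutes saved"
```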

How has this been tested?

Checklist

  • I submit my changes into the develop branch
  • I have created a changelog fragment
  • I have updated the documentation accordingly
  • I have added tests to cover my changes
  • I have linked related issues (see GitHub docs)
  • I have increased versions of npm packages if it is necessary
    (cvat-canvas,
    cvat-core,
    cvat-data and
    cvat-ui)

License

  • I submit my code changes under the same MIT License that covers the project.
    Feel free to contact the maintainers if that's a concern.

Summary by CodeRabbit

  • New Features

    • Introduced a new pyproject.toml file for managing project dependencies and build settings.
    • Added a dev/make_requirements.py script to streamline requirements management.
  • Bug Fixes

    • Updated installation instructions in the development environment documentation to reflect new dependency management processes.
  • Documentation

    • Revised development-environment.md to clarify installation steps for various operating systems.
  • Chores

    • Removed outdated requirements files to simplify dependency management.
    • Updated various requirements files to reflect new dependency management practices and package versions.

Contributor

coderabbitai bot commented Nov 1, 2024

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

The changes in this pull request focus on restructuring the Dockerfile and updating the dependency management system for the CVAT project. Key updates include the introduction of a new build stage, modifications to environment variables, and the adoption of a new pyproject.toml file for dependency management. Several previous requirements files have been removed, and their functionalities have been consolidated. New scripts have been added to facilitate requirements generation, and installation instructions have been updated to reflect these changes across various operating systems.

Changes

| File | Change Summary |
| --- | --- |
| Dockerfile | Restructured build stages, introduced new stage env, updated environment variables, consolidated FFmpeg and OpenH264 builds, removed specific pip commands, and changed dependency management to use uv toolchain. |
| cvat/requirements/README.md | Added note for backward compatibility; referenced pyproject.toml for context. |
| cvat/requirements/README.txt | Deleted file detailing instructions for regenerating *.txt files. |
| cvat/requirements/all.in | Deleted file that aggregated dependencies from development.in, production.in, and testing.in. |
| cvat/requirements/all.txt | Updated comments and generation method; specified auto-generation via dev/make_requirements.py. |
| cvat/requirements/base.in | Deleted file listing dependencies for the project, including version constraints. |
| cvat/requirements/base.txt | Updated comments and package versions; specified auto-generation via dev/make_requirements.py. |
| cvat/requirements/development.in | Deleted file containing development dependencies. |
| cvat/requirements/development.txt | Updated package versions and source references; added setuptools. |
| cvat/requirements/production.in | Deleted file specifying production dependencies, including coverage and uvicorn. |
| cvat/requirements/production.txt | Updated package versions and generation method; specified auto-generation via dev/make_requirements.py. |
| cvat/requirements/testing.in | Deleted file containing testing dependencies. |
| cvat/requirements/testing.txt | Updated header and package specifications; removed unsafe package section. |
| dev/make_requirements.py | New script added to manage project requirements, including functions for generating, parsing, and deduplicating requirements files. |
| pyproject.toml | New configuration file added defining project metadata, dependencies, and build system; categorized optional dependencies. |
| site/content/en/docs/contributing/development-environment.md | Updated installation instructions for various operating systems, reflecting changes in virtual environment setup and dependency installation. |

Poem

In the garden of code, we hop and play,
With Docker and scripts, we pave the way.
Dependencies dance, all tidy and neat,
With uv by our side, our work is a treat!
So let’s raise our paws, and cheer with delight,
For a smoother setup, our future is bright! 🐇✨



Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

🧹 Outside diff range and nitpick comments (3)
pyproject.toml (1)

12-12: Add project authors information

The authors list is currently empty. Consider adding the project maintainers or organization details.

-authors = []
+authors = [
+    {name = "CVAT.ai", email = "[email protected]"}
+]
site/content/en/docs/contributing/development-environment.md (2)

80-82: Consider adding explanatory comments for UV commands.

While the UV commands are correct, consider adding brief explanatory comments to help developers understand the purpose of each command:

-uv venv
-. .venv/bin/activate
-uv sync --no-install-project --extra development
+# Create a new virtual environment using UV
+uv venv
+# Activate the virtual environment
+. .venv/bin/activate
+# Install development dependencies without installing the project itself
+uv sync --no-install-project --extra development

87-88: Improve clarity of Mac troubleshooting note.

Consider rephrasing for more clarity and professionalism:

-> If you have any problems with installing dependencies
-> you may need to reinstall your system python.
+> If you encounter dependency installation issues,
+> reinstalling your system Python installation may resolve the problem.
🧰 Tools
🪛 LanguageTool

[style] ~87-~87: Consider an alternative verb to strengthen your wording.
Context: ... > Note for Mac users > > If you have any problems with installing dependenci...

(IF_YOU_HAVE_THIS_PROBLEM)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between a652191 and b08df72.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (16)
  • Dockerfile (4 hunks)
  • cvat/requirements/README.md (1 hunks)
  • cvat/requirements/README.txt (0 hunks)
  • cvat/requirements/all.in (0 hunks)
  • cvat/requirements/all.txt (1 hunks)
  • cvat/requirements/base.in (0 hunks)
  • cvat/requirements/base.txt (15 hunks)
  • cvat/requirements/development.in (0 hunks)
  • cvat/requirements/development.txt (2 hunks)
  • cvat/requirements/production.in (0 hunks)
  • cvat/requirements/production.txt (1 hunks)
  • cvat/requirements/testing.in (0 hunks)
  • cvat/requirements/testing.txt (1 hunks)
  • dev/make_requirements.py (1 hunks)
  • pyproject.toml (1 hunks)
  • site/content/en/docs/contributing/development-environment.md (3 hunks)
💤 Files with no reviewable changes (6)
  • cvat/requirements/README.txt
  • cvat/requirements/all.in
  • cvat/requirements/base.in
  • cvat/requirements/development.in
  • cvat/requirements/production.in
  • cvat/requirements/testing.in
✅ Files skipped from review due to trivial changes (4)
  • cvat/requirements/README.md
  • cvat/requirements/all.txt
  • cvat/requirements/development.txt
  • cvat/requirements/testing.txt
🧰 Additional context used
🪛 LanguageTool
cvat/requirements/base.txt

[duplication] ~107-~107: Possible typo: you repeated a word
Context: ...ject.toml) # dj-rest-auth # drf-spectacular drf-spectacular==0.26.2 # via cvat (pyproject.toml)...

(ENGLISH_WORD_REPEAT_RULE)


[duplication] ~322-~322: Possible typo: you repeated a word
Context: ...project.toml) # django-rq # rq-scheduler rq-scheduler==0.13.1 # via cvat (pyproject.toml)...

(ENGLISH_WORD_REPEAT_RULE)

site/content/en/docs/contributing/development-environment.md

[style] ~87-~87: Consider an alternative verb to strengthen your wording.
Context: ... > Note for Mac users > > If you have any problems with installing dependenci...

(IF_YOU_HAVE_THIS_PROBLEM)

🪛 ast-grep
dev/make_requirements.py

[warning] 3-3: The function mktemp is deprecated. When using this function, it is possible for an attacker to modify the created file before the filename is returned. Use NamedTemporaryFile() instead and pass it the delete=False parameter.
Context: Path(file)
Note: [CWE-377]: Insecure Temporary File [OWASP A01:2021]: Broken Access Control [REFERENCES]
https://docs.python.org/3/library/tempfile.html#tempfile.mktemp
https://owasp.org/Top10/A01_2021-Broken_Access_Control
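
A minimal sketch of the replacement the warning suggests, assuming the script only needs a scratch file whose path can be handed to another tool. `make_temp_requirements` is a hypothetical helper used for illustration, not a function taken from dev/make_requirements.py:

```python
import tempfile
from pathlib import Path


def make_temp_requirements(text: str) -> Path:
    """Write text to a securely created temporary file and return its path.

    NamedTemporaryFile creates and opens the file atomically, so no other
    process can claim the name between name generation and file creation,
    which is the race that tempfile.mktemp is exposed to.
    """
    with tempfile.NamedTemporaryFile(
        "w", suffix=".txt", delete=False, encoding="utf-8"
    ) as handle:
        handle.write(text)
    # delete=False keeps the file on disk after the handle is closed;
    # the caller is responsible for removing it.
    return Path(handle.name)
```

Because `delete=False` is used, the caller should remove the file (for example with `Path.unlink()`) once it is no longer needed.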

🔇 Additional comments (13)
cvat/requirements/production.txt (3)

1-1: LGTM: Modern approach to requirements generation

The switch to generating requirements from pyproject.toml aligns with modern Python packaging practices and provides better dependency management.


Line range hint 1-24: Verify compatibility with UV package manager

Since the PR's objective is to use UV for faster dependency installation, we should verify that all specified versions and formats are compatible with UV.

#!/bin/bash
# Description: Check for any UV-specific formatting issues or incompatibilities

# Look for any potential UV configuration files
echo "Checking for UV configuration..."
fd -t f "uv.lock|pyproject.toml"

# Check for any potential version specifier formats that might be incompatible with UV
echo "Checking version specifier formats..."
rg -e "==|>=|<=|~=|!=|\^" cvat/requirements/*.txt

18-19: Verify uvicorn functionality after removing [standard] extra

The change from uvicorn[standard]==0.22.0 to uvicorn==0.22.0 might affect available features. The [standard] extra typically includes httptools, uvloop, and websockets, which are still listed as separate dependencies, but we should verify this.

✅ Verification successful

No impact from removing uvicorn[standard] extra

The change is safe because all the standard dependencies (httptools, uvloop, and websockets) are already explicitly listed in the requirements file with their specific versions:

  • httptools==0.6.4
  • uvloop==0.21.0
  • websockets==13.1

These dependencies are properly tracked and installed via their individual entries, making the [standard] extra redundant.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that all standard uvicorn dependencies are present

# Check if all required dependencies for uvicorn[standard] are present
echo "Checking for uvicorn standard dependencies..."
rg -A 1 "httptools|uvloop|websockets" 

# Verify uvicorn usage in the codebase to ensure no standard-specific features are used
echo "Checking uvicorn usage patterns..."
rg -A 5 "import uvicorn|from uvicorn"

Length of output: 611


Script:

#!/bin/bash
# Let's check the requirements files and pyproject.toml for uvicorn-related dependencies
echo "Checking requirements and dependencies..."

# Check all txt files for uvicorn and related packages
fd -e txt -x grep -l "uvicorn\|httptools\|uvloop\|websockets" {}

# Check pyproject.toml files for uvicorn and related packages
fd pyproject.toml -x grep -A 5 "uvicorn\|httptools\|uvloop\|websockets" {}

# Check actual uvicorn usage in Python files
echo -e "\nChecking uvicorn usage in Python files..."
fd -e py -x grep -l "import uvicorn\|from uvicorn" {}

# Look for specific uvicorn configuration
echo -e "\nChecking for uvicorn configuration..."
rg "uvicorn\.(run|config)" -A 3

Length of output: 683


Script:

#!/bin/bash
# Let's check the specific content of production.txt to see all related dependencies
echo "Checking production.txt content..."
cat cvat/requirements/production.txt

# Let's also check if websockets, httptools, and uvloop are listed separately
echo -e "\nChecking for standard dependencies..."
grep -E "websockets|httptools|uvloop" cvat/requirements/production.txt

# Check for any ASGI/websocket related imports that might need these dependencies
echo -e "\nChecking for websocket/ASGI usage..."
rg -l "websocket|ASGI|asyncio" --type py

Length of output: 982

pyproject.toml (3)

83-90: LGTM! Build system and tool configurations are well-structured

The build system configuration and tool settings follow best practices:

  • Appropriate build backend selection
  • Necessary metadata configuration for direct references
  • Consistent code formatting settings

15-59: Review dependency version constraints

Several observations about the dependencies:

  1. Inconsistent version pinning strategy (mix of == and ~=)
  2. Direct git dependency for datumaro might affect build reproducibility
  3. Some security-sensitive packages have loose version constraints

Consider:

  1. Adopting a consistent version pinning strategy (preferably ==) for better reproducibility
  2. Moving datumaro to a released version instead of git reference
  3. Using more strict version bounds for security-sensitive packages

1-90: Verify UV compatibility with dependency specifications

The dependency structure looks compatible with UV, but we should verify that all dependency specifications are supported by UV's parser, especially the git reference and direct package references.

cvat/requirements/base.txt (5)

1-1: LGTM: Modern dependency management approach

The transition to generating requirements from pyproject.toml aligns well with modern Python packaging practices and UV's capabilities. This change supports the PR's objective of speeding up dependency installation.


209-212: LGTM: Consistent OpenCV versions

Good practice maintaining identical versions (4.10.0.84) for both opencv-python and opencv-python-headless packages.


Line range hint 1-377: Overall changes look good with minor improvements needed

The transition to UV and pyproject.toml for dependency management is well-structured and aligns with the PR's objective of speeding up dependency installation. The requirements file is now more maintainable with clear source references for each package.

A few items need attention:

  1. Update the source reference for pdf2image to match the new system
  2. Verify the specific numpy version constraint
  3. Ensure compatibility with major version updates of key packages

The changes should provide the expected performance improvements in dependency installation while maintaining a clean and maintainable dependency structure.


Line range hint 66-81: Verify compatibility with major version updates

Several packages have received significant version updates:

  • Django 4.2.16
  • markupsafe 3.0.2
  • scipy 1.14.1

Please ensure these updates are compatible with the rest of the ecosystem, particularly with packages that depend on them.

#!/bin/bash
# Check for any potential compatibility issues in the codebase
echo "Checking Django compatibility..."
rg "django\.[A-Za-z0-9_]+" --type python

echo "Checking markupsafe usage..."
rg "markupsafe|Markup|escape" --type python

echo "Checking scipy usage..."
rg "scipy\." --type python

Also applies to: 177-178, 335-336


193-206: Verify numpy version constraint

The numpy version is pinned to 1.23.5, which is not the latest version. While this might be intentional for compatibility reasons, it's worth verifying if this specific version is required.

✅ Verification successful

The previous command didn't work correctly. Let's try a different approach to understand the numpy version constraints and its usage in the codebase. This will help us verify if the specific version is required for compatibility reasons.


Numpy version constraint is appropriate

The pin numpy==1.23.5 in cvat/requirements/base.txt is the resolved version and is stricter than the declared constraint: the project's pyproject.toml specifies numpy~=1.23.2, which allows compatible updates within the 1.23.x series. The codebase shows extensive numpy usage across multiple components including:

  • Core engine functionality (frame providers, media extractors)
  • Dataset management and annotations
  • Quality control features
  • Multiple AI model handlers

The version constraint appears intentional to maintain compatibility with the various AI/ML dependencies and the extensive numpy usage throughout the codebase.
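
To illustrate why the resolved pin 1.23.5 satisfies numpy~=1.23.2: under PEP 440, `~=1.23.2` is shorthand for `>=1.23.2, ==1.23.*`. The sketch below handles plain numeric versions only (no pre-releases, epochs, or local versions), and `satisfies_compatible_release` is an illustrative name rather than part of any library:

```python
def parse(version: str) -> tuple[int, ...]:
    """Split a plain numeric version such as '1.23.5' into integers."""
    return tuple(int(part) for part in version.split("."))


def satisfies_compatible_release(candidate: str, spec: str) -> bool:
    """Return True if candidate satisfies '~=spec' (PEP 440 compatible release).

    '~=X.Y.Z' means: at least X.Y.Z, and the same X.Y prefix.
    """
    base = parse(spec)
    if len(base) < 2:
        raise ValueError("'~=' needs at least two version segments")
    cand = parse(candidate)
    # Pad short candidates with zeros so that '1' compares like '1.0'.
    cand = cand + (0,) * (len(base) - len(cand))
    prefix = base[:-1]
    return cand >= base and cand[: len(prefix)] == prefix
```

With this, `satisfies_compatible_release("1.23.5", "1.23.2")` holds while a 1.24.x release does not, which matches the reading of the constraint given above.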

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if any dependencies explicitly require numpy==1.23.5
rg -l "numpy.*1\.23\.5" --type yaml --type toml

Length of output: 409


Script:

#!/bin/bash
# Check for numpy version constraints in requirements files and pyproject.toml
echo "Checking requirements files..."
fd -e txt -e in requirements/ -x cat {}
echo "Checking pyproject.toml..."
fd pyproject.toml -x cat {}

# Check for actual numpy usage in the codebase
echo "Checking numpy imports and usage..."
rg "import numpy|from numpy" -A 2

Length of output: 11715

site/content/en/docs/contributing/development-environment.md (2)

28-28: LGTM: UV dependency added correctly.

The addition of uv to the brew installation command aligns with the PR's objective of speeding up dependency installation.


137-137: LGTM: Correct reference to pyproject.toml.

The updated reference to pyproject.toml and UV commands accurately reflects the project's new dependency management approach.

Comment on lines +62 to +65
production = [
"coverage==7.2.3",
"uvicorn[standard]==0.22.0",
]
Contributor

⚠️ Potential issue

Review production dependencies classification

coverage seems misplaced in production dependencies as it's typically a testing tool.

Consider moving coverage to the testing dependencies section only:

[project.optional-dependencies]
production = [
-    "coverage==7.2.3",
    "uvicorn[standard]==0.22.0",
]

Comment on lines 227 to 228
pdf2image==1.14.0
# via -r cvat/requirements/base.in
Contributor

⚠️ Potential issue

Update source reference for pdf2image

The package still references -r cvat/requirements/base.in while other packages reference pyproject.toml. This should be updated for consistency with the new dependency management system.

Dockerfile Outdated
# Install wheels from the build image
RUN python3 -m venv /opt/venv
ENV PATH="/opt/venv/bin:${PATH}"
ENV PATH$="${UV_PROJECT_ENVIRONMENT}:${PATH}"
Contributor

⚠️ Potential issue

Typo in environment variable name PATH$

There appears to be a typo on line 148 where the environment variable is named PATH$ instead of PATH. The dollar sign is unnecessary and may lead to unintended behavior.

Apply this diff to correct the typo:

-ENV PATH$="${UV_PROJECT_ENVIRONMENT}:${PATH}"
+ENV PATH="${UV_PROJECT_ENVIRONMENT}:${PATH}"

This change sets the PATH environment variable correctly, ensuring that the new virtual environment path is included.


Comment on lines +62 to +71
RUN uv venv
RUN --mount=type=cache,target=/root/.cache/uv \
DATUMARO_HEADLESS=1 \
uv sync \
--frozen \
--no-install-project \
--extra ${CVAT_CONFIGURATION} \
$(if [ "${CVAT_DEBUG_ENABLED}" = 'yes' ]; then echo "--extra debug"; fi) \
--no-binary-package lxml \
--no-binary-package xmlsec
Contributor

⚠️ Potential issue

Correct the formatting of the multi-line RUN command

There is a formatting issue with the RUN command starting at line 62. The RUN instructions should be properly concatenated using backslashes (\) to ensure that the multi-line command executes correctly. Additionally, environment variables should be set within the same RUN command if they're only needed temporarily.

Apply this diff to fix the formatting:

-RUN uv venv
-
-RUN --mount=type=cache,target=/root/.cache/uv \
-    DATUMARO_HEADLESS=1 \
-    uv sync \
-    --frozen \
-    --no-install-project \
-    --extra ${CVAT_CONFIGURATION} \
-    $(if [ "${CVAT_DEBUG_ENABLED}" = 'yes' ]; then echo "--extra debug"; fi) \
-    --no-binary-package lxml \
-    --no-binary-package xmlsec
+RUN --mount=type=cache,target=/root/.cache/uv \
+    uv venv && \
+    DATUMARO_HEADLESS=1 uv sync \
+        --frozen \
+        --no-install-project \
+        --extra ${CVAT_CONFIGURATION} \
+        $(if [ "${CVAT_DEBUG_ENABLED}" = 'yes' ]; then echo "--extra debug"; fi) \
+        --no-binary-package lxml \
+        --no-binary-package xmlsec

This adjustment ensures that the entire sequence is treated as a single RUN command and that all the arguments are correctly associated with the uv sync command.


Comment on lines 107 to 109
package_name = line.split("==")[0]
current_package = package_name
current_entry = [line]
Contributor

⚠️ Potential issue

Ensure robust parsing of package names

The current implementation splits the requirement line on '==' to extract the package name:

package_name = line.split("==")[0]

However, requirement lines can include various version specifiers like '>=', '<=', '~=', '!=', or even extras and environment markers. This method may not correctly extract package names in all cases. Consider using the packaging library's Requirement class to accurately parse requirement lines.

Apply this change to improve package name parsing:

+from packaging.requirements import Requirement
 # existing imports

 # Start a new package entry
-            package_name = line.split("==")[0]
+            try:
+                req = Requirement(line.strip())
+                package_name = req.name
+            except Exception as e:
+                print(f"Warning: Skipping invalid requirement line '{line.strip()}'. Error: {e}")
+                current_package = None
+                current_entry = []
+                continue

             current_package = package_name
             current_entry = [line]

This change ensures that all package names are parsed correctly, accommodating the full range of requirement specifications.
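
For a quick look at why splitting on '==' is fragile, here is a standard-library-only sketch. It approximates the PEP 508 project-name rule with a regular expression and is an assumption made for illustration; `extract_name` is a hypothetical helper, and the `packaging`-based approach in the diff above remains the more complete option:

```python
import re
from typing import Optional

# A project name starts and ends with a letter or digit and may contain
# '.', '_' or '-' in between (approximation of the PEP 508 name rule).
_NAME_RE = re.compile(r"^\s*([A-Za-z0-9](?:[A-Za-z0-9._-]*[A-Za-z0-9])?)")


def extract_name(line: str) -> Optional[str]:
    """Return the package name at the start of a requirement line.

    Returns None for blank lines and comment lines such as '# via ...'.
    """
    match = _NAME_RE.match(line)
    return match.group(1) if match else None


# Naive splitting keeps the specifier or extras attached to the name:
naive = "numpy~=1.23.2".split("==")[0]          # 'numpy~=1.23.2'
robust = extract_name("numpy~=1.23.2")          # 'numpy'
```

The regular expression stops at the first character that cannot be part of a name, so extras (`[standard]`), any version operator, and direct references (`name @ url`) are all left out of the result.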

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 11

🧹 Outside diff range and nitpick comments (2)
cvat/requirements/development.txt (1)

Line range hint 1-50: Consider documenting the UV migration process

Since this file is part of a larger migration to UV for faster dependency installation, it would be helpful to add a comment or documentation explaining:

  1. How to regenerate requirements using UV
  2. The relationship between pyproject.toml and this file
  3. Any special considerations for developers

Would you like me to help draft this documentation?

site/content/en/docs/contributing/development-environment.md (1)

87-88: Enhance troubleshooting guidance for Mac users

While the simplified note is more concise, consider adding:

  1. Common UV-specific error messages that Mac users might encounter
  2. UV-specific troubleshooting steps
  3. A link to UV's official troubleshooting documentation
🧰 Tools
🪛 LanguageTool

[style] ~87-~87: Consider an alternative verb to strengthen your wording.
Context: ... > Note for Mac users > > If you have any problems with installing dependenci...

(IF_YOU_HAVE_THIS_PROBLEM)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between a652191 and 39de8af.

📒 Files selected for processing (16)
  • Dockerfile (4 hunks)
  • cvat/requirements/README.md (1 hunks)
  • cvat/requirements/README.txt (0 hunks)
  • cvat/requirements/all.in (0 hunks)
  • cvat/requirements/all.txt (1 hunks)
  • cvat/requirements/base.in (0 hunks)
  • cvat/requirements/base.txt (15 hunks)
  • cvat/requirements/development.in (0 hunks)
  • cvat/requirements/development.txt (2 hunks)
  • cvat/requirements/production.in (0 hunks)
  • cvat/requirements/production.txt (1 hunks)
  • cvat/requirements/testing.in (0 hunks)
  • cvat/requirements/testing.txt (1 hunks)
  • dev/make_requirements.py (1 hunks)
  • pyproject.toml (1 hunks)
  • site/content/en/docs/contributing/development-environment.md (3 hunks)
💤 Files with no reviewable changes (6)
  • cvat/requirements/README.txt
  • cvat/requirements/all.in
  • cvat/requirements/base.in
  • cvat/requirements/development.in
  • cvat/requirements/production.in
  • cvat/requirements/testing.in
✅ Files skipped from review due to trivial changes (4)
  • cvat/requirements/README.md
  • cvat/requirements/all.txt
  • cvat/requirements/production.txt
  • cvat/requirements/testing.txt
🧰 Additional context used
🪛 LanguageTool
cvat/requirements/base.txt

[duplication] ~107-~107: Possible typo: you repeated a word
Context: ...ject.toml) # dj-rest-auth # drf-spectacular drf-spectacular==0.26.2 # via cvat (pyproject.toml)...

(ENGLISH_WORD_REPEAT_RULE)


[duplication] ~322-~322: Possible typo: you repeated a word
Context: ...project.toml) # django-rq # rq-scheduler rq-scheduler==0.13.1 # via cvat (pyproject.toml)...

(ENGLISH_WORD_REPEAT_RULE)

site/content/en/docs/contributing/development-environment.md

[style] ~87-~87: Consider an alternative verb to strengthen your wording.
Context: ... > Note for Mac users > > If you have any problems with installing dependenci...

(IF_YOU_HAVE_THIS_PROBLEM)

🪛 ast-grep
dev/make_requirements.py

[warning] 3-3: The function mktemp is deprecated. When using this function, it is possible for an attacker to modify the created file before the filename is returned. Use NamedTemporaryFile() instead and pass it the delete=False parameter.
Context: Path(file)
Note: [CWE-377]: Insecure Temporary File [OWASP A01:2021]: Broken Access Control [REFERENCES]
https://docs.python.org/3/library/tempfile.html#tempfile.mktemp
https://owasp.org/Top10/A01_2021-Broken_Access_Control

🔇 Additional comments (18)
cvat/requirements/development.txt (4)

1-3: LGTM: Updated requirements generation mechanism

The transition to generating requirements via pyproject.toml aligns well with modern Python practices and the PR's objective of improving dependency management.


13-13: LGTM: Updated source attributions

The source attributions now correctly point to cvat (pyproject.toml) for development dependencies, which improves traceability and aligns with the new dependency management approach.

Also applies to: 15-15, 36-36, 40-40, 43-43, 46-46, 50-50


8-9: Verify black version compatibility

The update to black 24.10.0 is a minor version bump. While this should be safe, we should verify it doesn't affect existing formatting.

#!/bin/bash
# Description: Check for any formatting configuration files that might need updates
# and verify black version compatibility requirements

# Look for any pyproject.toml or setup.cfg files that might contain black configuration
fd -e toml -e cfg -x grep -l "black" {}

# Check if there are any explicit black version requirements in other files
rg -l "black==[0-9]+" 

47-48: Verify setuptools version constraint

The addition of setuptools==75.2.0 is specifically pinned. This is good for reproducibility, but we should verify if this version is required by any specific dependency.

✅ Verification successful

Let me gather more information about setuptools dependencies.


Let me check the specific dependency relationship with astroid since the comment indicates it's required by astroid.


From the gathered information, I can now provide a final response:

Setuptools version constraint is appropriately pinned

The setuptools==75.2.0 is a transitive dependency required by astroid==2.11.7, which is used for development purposes (pylint dependency). The version constraint is automatically generated and maintained through the project's dependency management system, as indicated by the autogeneration comment in development.txt. This specific version pinning helps maintain reproducible development environments.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for any explicit setuptools requirements in the codebase
# that might conflict with or require this specific version

rg -l "setuptools==[0-9]+" 

Length of output: 338


Script:

#!/bin/bash
# Check content of requirements files for setuptools dependencies
rg -l "setuptools" cvat/requirements/

# Check content of pyproject.toml files for setuptools dependencies
fd pyproject.toml -x grep -l "setuptools" {}

# Get the actual content where setuptools is mentioned
rg "setuptools" -A 3 -B 3 cvat/requirements/

Length of output: 605


Script:

#!/bin/bash
# Check astroid version and its dependencies
rg "astroid==" -A 5 -B 5 cvat/requirements/

# Check if there are any other files mentioning astroid version
fd -e txt -e in -x grep -l "astroid" {}

# Look for any setuptools version constraints in pyproject.toml files
fd pyproject.toml -x cat {}

Length of output: 4227

pyproject.toml (2)

83-90: LGTM! Well-structured build configuration

The build system configuration is properly set up with hatchling, and the metadata configuration correctly enables the direct references needed for the testing extras.


15-59: Review dependency versioning strategy

The dependencies list shows inconsistent version pinning:

  • Some packages use exact versions (attrs==21.4.0)
  • Others use flexible constraints (Django~=4.2.7)
  • Some have minimum versions (rules>=3.3)

Consider adopting a consistent versioning strategy:

  • Use ~= for packages where minor updates are safe
  • Use == only when specific versions are required
  • Document the rationale for version constraints

Verify security of pinned versions

Some packages are pinned to older versions that might have security implications:

  • boto3==1.17.61 (from 2021)
  • Shapely==1.7.1 (from 2020)
cvat/requirements/base.txt (4)

193-206: Improved dependency organization and documentation.

The dependency organization has been improved with:

  • Clear source attribution to pyproject.toml
  • Expanded annotations showing complete dependency chains
  • Better grouping of related packages

This change enhances maintainability and makes dependency relationships more transparent.


1-1: Verify the new requirements generation process.

The switch to using dev/make_requirements.py aligns with the PR's objective of optimizing dependency management. However, we should ensure the generation script maintains the same level of dependency resolution quality as pip-compile-multi.

Let's verify the script's existence and permissions:

#!/bin/bash
# Verify the requirements generation script
if [ -f "dev/make_requirements.py" ]; then
    ls -l dev/make_requirements.py
    head -n 10 dev/make_requirements.py
else
    echo "Error: make_requirements.py not found"
    exit 1
fi

193-194: Verify numpy version compatibility.

numpy is pinned to version 1.23.5, which is relatively old. This version is used by multiple packages including matplotlib, opencv, pandas, and others.

Let's verify compatibility with dependent packages:

#!/bin/bash
# Check numpy version compatibility with key dependencies
echo "Checking numpy compatibility..."
rg -A 5 "numpy.*dependency" || rg -A 5 "numpy.*requires"

echo "Checking if newer versions are available..."
curl -s "https://pypi.org/pypi/numpy/json" | jq -r '.releases | keys[]' | sort -V | tail -n 5

48-48: Review security-critical package versions.

Notable security-related package versions:

  • cryptography==43.0.3: This is the latest secure version
  • django==4.2.16: This is an LTS version with security support
  • urllib3==1.26.20: This is pinned to an older version which might have security implications

Let's check if there are any known vulnerabilities:

Also applies to: 66-66, 364-364

✅ Verification successful

The previous script failed to fetch vulnerability data. Let's examine the actual dependencies first to get a better context of the security-sensitive packages and their versions. This will help us verify if the versions mentioned in the review comment are accurate and if there are any other security-critical packages we should be aware of.


I'm verifying if there are any conflicting versions in other requirement files and checking the history of changes to understand the security context better. This will help us validate if the versions are consistently defined across the project and if there were any recent security-related updates.


Based on the examination of the requirements files, I can now generate the final response:

Security-critical package versions are appropriately maintained

The package versions mentioned in the review are correctly identified and are being used consistently across the project:

  • cryptography==43.0.3 is a recent secure version with no known vulnerabilities
  • django==4.2.16 is indeed the current LTS version with active security support
  • urllib3==1.26.20 is used as a dependency of multiple packages (botocore, clickhouse-connect, requests) and while pinned to an older version, it's still receiving security updates in the 1.26.x series
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for known vulnerabilities in pinned versions
echo "Checking urllib3 1.26.20 CVEs..."
curl -s "https://api.osv.dev/v1/query" -d '{"package":{"name":"urllib3","ecosystem":"PyPI"},"version":"1.26.20"}' | jq .

echo "Checking django 4.2.16 CVEs..."
curl -s "https://api.osv.dev/v1/query" -d '{"package":{"name":"django","ecosystem":"PyPI"},"version":"4.2.16"}' | jq .

Length of output: 448


Script:

#!/bin/bash
# Let's check the requirements file for context and dependencies
echo "Checking requirements/base.txt..."
cat cvat/requirements/base.txt

# Let's also check if there are any other requirement files that might have these packages
echo -e "\nChecking for other requirement files..."
fd requirements.txt

Length of output: 8178


Script:

#!/bin/bash
# Let's check other requirement files for any conflicting versions
echo "Checking other requirement files for security-critical packages..."
echo "=== cvat-sdk/gen/requirements.txt ==="
cat cvat-sdk/gen/requirements.txt | grep -E "cryptography|django|urllib3"

echo -e "\n=== tests/python/requirements.txt ==="
cat tests/python/requirements.txt | grep -E "cryptography|django|urllib3"

echo -e "\n=== site/requirements.txt ==="
cat site/requirements.txt | grep -E "cryptography|django|urllib3"

# Let's also check the changelog or commit history for any security-related updates
echo -e "\nChecking recent changes to requirements..."
git log -p cvat/requirements/base.txt | grep -E -A 5 -B 5 "cryptography|django|urllib3"

Length of output: 801

site/content/en/docs/contributing/development-environment.md (2)

28-28: LGTM: UV package manager added to MacOS dependencies

The addition of UV to the brew installation command is appropriate and aligns with the PR's objective to use UV for faster dependency installation.


137-137: LGTM: Clear instructions for Arch Linux PyAV version update

The instructions for updating PyAV version requirements are clear and provide the correct sequence of commands using UV.

Dockerfile (6)

4-5: Update the base image reference and environment variable.

The FROM instruction now uses env as the base image, and UV_PROJECT_ENVIRONMENT is set to /opt/venv. Ensure that this aligns with the overall build process and that the env stage is correctly defined.


61-71: Clarify the use of environment variables and build arguments.

The commands uv venv and uv sync utilize several environment variables and arguments. Ensure that:

  • CVAT_CONFIGURATION is correctly set to either production or development.
  • CVAT_DEBUG_ENABLED is defined when needed.
  • The --no-binary-package options are necessary for lxml and xmlsec. Verify if wheels are unavailable or if source builds are required.

Review the necessity of building from source for these packages, as it may impact build time.


78-78: Ensure environment variables are correctly passed and utilized.

The environment variables for proxies and localization settings should be verified to ensure they are correctly set and used during the build and runtime.


34-34: Confirm consolidation of build stages.

The build stages for OpenH264 and FFmpeg are now under FROM build-image-base AS build-image. Ensure that the dependencies and environment from build-image-base are correctly inherited in build-image.

This script checks for any discrepancies in installed packages between the stages.

✅ Verification successful

Build stages are correctly consolidated and dependencies are properly inherited

The Dockerfile shows that the build stages are properly structured:

  • build-image-base installs all necessary build dependencies (gcc, make, etc.)
  • build-image inherits from build-image-base and correctly uses these dependencies to build OpenH264 and FFmpeg
  • The final stage copies the built artifacts from build-image (/opt/ffmpeg/lib and ${UV_PROJECT_ENVIRONMENT})

The consolidation is working as intended with proper inheritance of the build environment.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify that all necessary dependencies are present in build-image

# List installed packages in build-image-base
docker run --rm build-image-base dpkg -l > build-image-base-packages.txt

# List installed packages in build-image
docker run --rm build-image dpkg -l > build-image-packages.txt

# Compare the package lists
diff build-image-base-packages.txt build-image-packages.txt

Length of output: 290


Script:

#!/bin/bash
# Let's examine the Dockerfile content to verify the build stages
cat Dockerfile

Length of output: 5288


7-7: 🛠️ Refactor suggestion

Redundant build stage build-image-base.

The FROM env AS build-image-base instruction might be redundant since env is already the base image. Consider removing build-image-base or clarifying its purpose to avoid confusion.

-FROM env AS build-image-base
+FROM env AS base-image

Adjust subsequent references accordingly if necessary.

Likely invalid or redundant comment.


155-156: Confirm the copying of FFmpeg libraries and the virtual environment.

Ensure that the FFmpeg libraries and the virtual environment are correctly copied from build-image to the final image.

Expected output: Listing of FFmpeg libraries and confirmation of the virtual environment.

✅ Verification successful

The FFmpeg libraries and virtual environment copy statements are correctly configured

The Dockerfile analysis shows that:

  • The FFmpeg libraries are built in the build-image stage at /opt/ffmpeg/lib and correctly copied to /usr/lib
  • The virtual environment path is properly defined as /opt/venv and consistently used throughout the Dockerfile
  • The virtual environment is correctly copied from the build stage using the UV_PROJECT_ENVIRONMENT variable

The copy statements are properly configured to transfer the required files from the build stage to the final image. The paths and environment variables are consistent throughout the Dockerfile.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify that FFmpeg libraries are present

docker run --rm <final-image> ls /usr/lib | grep libav

# Verify that the virtual environment is intact
docker run --rm <final-image> test -d ${UV_PROJECT_ENVIRONMENT} && echo "Virtual environment exists"

Length of output: 234


Script:

#!/bin/bash
# Let's examine the Dockerfile to understand the build stages and environment variables
cat Dockerfile

# Also check for any environment variable definitions
rg "UV_PROJECT_ENVIRONMENT" -A 2 -B 2

Length of output: 5971

Comment on lines +62 to +65
production = [
"coverage==7.2.3",
"uvicorn[standard]==0.22.0",
]

⚠️ Potential issue

Reorganize production dependencies

The coverage package is listed under the production extra, but it is a testing tool. The testing extra already pins coverage==7.2.3, so the entry can be dropped from production.

 production = [
-    "coverage==7.2.3",
     "uvicorn[standard]==0.22.0",
 ]

pyproject.toml Outdated
Comment on lines 1 to 90
[project.urls]
Homepage = "https://cvat.ai"
Documentation = "https://docs.cvat.ai/"
Repository = "https://github.com/cvat-ai/cvat"


[project]
name = "cvat"
version = "0.1.0"
description = "Computer Vision Annotation Tool"
requires-python = ">=3.8"
authors = []

dependencies = [
"attrs==21.4.0",
"av==9.2.0",
"azure-storage-blob==12.13.0",
"boto3==1.17.61",
"clickhouse-connect==0.6.8",
"coreapi==2.3.3",
"datumaro @ git+https://github.com/cvat-ai/datumaro.git@e612d1bfb76a3c3d3d545187338c841a246619fb",
"dj-pagination==2.5.0",
"dj-rest-auth[with_social]==5.0.2",
"django-allauth[saml]==0.57.2",
"django-auth-ldap==2.2.0",
"django-compressor==4.3.1",
"django-cors-headers==3.5.0",
"django-crum==0.7.9",
"django-filter==2.4.0",
"django-health-check>=3.18.1,<4",
"django-rq==2.8.1",
"django-sendfile2==0.7.0",
"Django~=4.2.7",
"djangorestframework~=3.14.0",
"drf-spectacular==0.26.2",
"furl==2.1.0",
"google-cloud-storage==1.42.0",
"lxml>=5.2.1,<6",
"natsort==8.0.0",
"numpy~=1.23.2",
"opencv-python-headless~=4.8",
"patool==1.12",
"pdf2image==1.14.0",
"Pillow>=10.3.0",
"pottery~=3.0",
"psutil==5.9.4",
"psycopg2-binary==2.9.5",
"python-ldap==3.4.3",
"python-logstash-async==2.5.0",
"pyunpack==0.2.1",
"redis==4.6.0",
"requests~=2.26",
"rq-scheduler==0.13.1",
"rq==1.16.0",
"rules>=3.3",
"Shapely==1.7.1",
"xmlsec>=1.3.14,<2",
"tqdm>=4.58.0",
]

[project.optional-dependencies]
production = [
"coverage==7.2.3",
"uvicorn[standard]==0.22.0",
]
development = [
"black>=24.1",
"django-extensions==3.0.8",
"django-silk==5.*",
"pylint-django==2.5.3",
"pylint-plugin-utils==0.7",
"pylint==2.14.5",
"rope==0.17.0",
"snakeviz==2.1.0",
]
testing = [
"cvat[development]",
"coverage==7.2.3",
"fakeredis[lua]==2.10.3",
]
debug = ["debugpy"]

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[tool.hatch.metadata]
# allows referencing "cvat[development]" in testing extra
allow-direct-references = true


💡 Codebase verification

Dependencies in auxiliary packages need to be addressed

The pyproject.toml does not fully replace all requirements files. Several auxiliary packages have their own requirements that need attention:

  • cvat-sdk/gen/: Contains dependencies not in pyproject.toml (inflection, ruamel.yaml)
  • site/: Has unique dependencies (gitpython, packaging, toml)
  • tests/python/: Contains test dependencies with different versions (av 12.0.0 vs 9.2.0)
  • utils/dataset_manifest/ and utils/dicom_converter/: Have their own requirements

These auxiliary package requirements should either be:

  1. Integrated into pyproject.toml as separate optional dependencies
  2. Kept separate if they are truly independent packages
  3. Documented in the PR to explain the decision
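
The version mismatch noted above (av 12.0.0 in tests/python versus 9.2.0 in pyproject.toml) is the kind of drift a small script could flag. The following is a minimal sketch, assuming every entry of interest is an exact `name==version` pin. The helper names `parse_pins` and `find_conflicts` and the sample lists are hypothetical and are not part of the PR.

```python
def parse_pins(lines):
    """Map lower-cased package names to versions for exact 'name==version' pins."""
    pins = {}
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # this sketch skips blanks and non-exact specifiers
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def find_conflicts(base, other):
    """Return {name: (base_version, other_version)} for packages pinned differently."""
    base_pins, other_pins = parse_pins(base), parse_pins(other)
    return {
        name: (base_pins[name], other_pins[name])
        for name in sorted(base_pins.keys() & other_pins.keys())
        if base_pins[name] != other_pins[name]
    }


# Illustrative inputs only; the real files contain many more entries.
pyproject_deps = ["attrs==21.4.0", "av==9.2.0", "numpy~=1.23.2"]
tests_requirements = ["av==12.0.0", "attrs==21.4.0"]

print(find_conflicts(pyproject_deps, tests_requirements))
```

Entries with extras or range specifiers would need a real requirement parser; the sketch only shows the shape of the check.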
🔗 Analysis chain

Verify complete migration from requirements files

To ensure a smooth transition to UV, we need to verify that this pyproject.toml completely replaces all existing requirements files and that no dependencies are missed in the migration.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for all existing requirements files and their contents
echo "=== Existing requirements files ==="
fd -e txt -e in requirements

# Check for any pip/requirements references in Dockerfile
echo -e "\n=== References to requirements in Dockerfile ==="
rg -A 2 'requirements\.(txt|in)' Dockerfile

Length of output: 470


Script:

#!/bin/bash
# Check contents of the requirements files to compare with pyproject.toml
echo "=== cvat-sdk/gen/requirements.txt ==="
cat cvat-sdk/gen/requirements.txt

echo -e "\n=== site/requirements.txt ==="
cat site/requirements.txt

echo -e "\n=== tests/python/requirements.txt ==="
cat tests/python/requirements.txt

echo -e "\n=== utils/dataset_manifest/requirements.in ==="
cat utils/dataset_manifest/requirements.in

echo -e "\n=== utils/dataset_manifest/requirements.txt ==="
cat utils/dataset_manifest/requirements.txt

echo -e "\n=== utils/dicom_converter/requirements.txt ==="
cat utils/dicom_converter/requirements.txt

Length of output: 1920

Comment on lines +8 to +12
name = "cvat"
version = "0.1.0"
description = "Computer Vision Annotation Tool"
requires-python = ">=3.8"
authors = []

💡 Codebase verification

Version number should be updated to match the project's current version

The version number in pyproject.toml (0.1.0) is inconsistent with the project's actual version. The codebase shows:

  • CVAT CLI version: 2.22.0 (in version.py)
  • CVAT UI version: 1.66.1 (in cvat-ui/package.json)
  • CVAT Core version: 15.2.0 (in cvat-core/package.json)

Please update the version in pyproject.toml to match the CLI version (2.22.0) since this appears to be the main project version.

🔗 Analysis chain

Verify project metadata accuracy

The version number (0.1.0) seems unusually low for an established project like CVAT. Additionally, the authors list is empty. Consider:

  • Using a version number that reflects the project's maturity
  • Adding at least the organization as an author

Let me gather more specific information about the project version.


Let me check the version file referenced in setup.py.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if there's an existing version number in other files
rg -g '!*.toml' -g '!*.lock' -g '!*.txt' -A 1 'version\s*=\s*["'\''][\d\.]+' .

Length of output: 19619


Script:

#!/bin/bash
# Check package.json files for version numbers
rg -g 'package.json' '"version":\s*"[^"]+' .

# Check setup.py or similar Python packaging files
fd -e py setup.py -x cat {}

Length of output: 2409


Script:

#!/bin/bash
# Check version.py files
fd version.py -x cat {}

Length of output: 9816

Comment on lines +80 to +82
uv venv
. .venv/bin/activate
uv sync --no-install-project --extra development

💡 Codebase verification

⚠️ Potential issue

Update virtual environment paths across documentation

Found additional references to the old .env/bin/activate path that need to be updated to .venv/bin/activate for consistency with the new UV-based setup:

  • utils/dicom_converter/README.md (2 occurrences)
  • site/content/en/docs/contributing/development-environment.md (line 201)
  • site/content/en/docs/manual/advanced/dataset_manifest.md
🔗 Analysis chain

Documentation needs additional context and consistency fixes

While the UV commands are correct, there are a few improvements needed:

  1. Add a brief explanation of the UV flags:

    • --no-install-project: Why we're not installing the project
    • --extra development: What additional dependencies this includes
  2. Update the VS Code launch command on line 201 to use .venv/bin/activate instead of .env/bin/activate for consistency.

Let's verify other potential references to the old virtual environment path:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for remaining references to .env virtual environment
rg -i '\.env/bin/activate'

Length of output: 324


# Exclude av from the requirements file
RUN sed -i '/^av==/d' /tmp/utils/dataset_manifest/requirements.txt
COPY pyproject.toml uv.lock ./

⚠️ Potential issue

Missing uv.lock file could cause build issues.

Ensure that both pyproject.toml and uv.lock are present in the context and correctly copied. The absence of uv.lock might lead to inconsistent dependency versions.

If uv.lock is not generated yet, consider adding a step to create it before the build process or include it in the repository.

Dockerfile Outdated
Comment on lines +55 to +56
WORKDIR /tmp/venv
COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/

💡 Codebase verification

Based on the Dockerfile analysis, I can now provide the final response:

The uv tool copy command needs to be fixed

The current COPY command COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/ appears incorrect as it's trying to copy two source paths (/uv and /uvx) to a single destination (/bin/). Additionally, the subsequent commands show that uv is used in the build process, indicating it needs to be properly accessible.

The command should be modified to:

COPY --from=ghcr.io/astral-sh/uv:latest /uv /bin/uv
🔗 Analysis chain

Ensure the uv tool is correctly copied and executable.

Copying uv from ghcr.io/astral-sh/uv:latest should be verified to ensure it is properly integrated into the build environment and has the necessary permissions.

Expected output: uv tool version information.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify that uv is executable

docker run --rm build-image /bin/uv --version

Length of output: 234


Script:

#!/bin/bash
# Let's check the Dockerfile content to understand the build stages and verify the uv copy command
cat Dockerfile

Length of output: 5288

Comment on lines 68 to 69
for cmd in commands:
subprocess.run(cmd, check=True)

🛠️ Refactor suggestion

Add logging to track executed commands in run_uv_commands

Including a logging statement before running each command can help with debugging by indicating which command is currently being executed. This makes it easier to identify which command fails in case of an error.

You can implement this by adding a print statement:

    for cmd in commands:
+       print(f"Running command: {' '.join(cmd)}")
        subprocess.run(cmd, check=True)
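
If a command can contain arguments with spaces, joining with ' ' produces a log line that cannot be pasted back into a shell. A minimal sketch using the standard library's shlex.join (available since Python 3.8) for the log line is shown here. `run_commands` and the placeholder command are hypothetical stand-ins for run_uv_commands and its actual invocations.

```python
import shlex
import subprocess
import sys


def run_commands(commands):
    """Run each command in order, echoing a shell-quoted copy of it first."""
    for cmd in commands:
        # flush=True keeps the log line ahead of the child's own output.
        print(f"Running command: {shlex.join(cmd)}", flush=True)
        subprocess.run(cmd, check=True)


# A harmless placeholder command; the PR's script runs uv here instead.
run_commands([[sys.executable, "-c", "print('hello from child')"]])
```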

Comment on lines +168 to +170
run_uv_commands()
deduplicate_requirements()
print("Requirements files have been successfully processed.")

🛠️ Refactor suggestion

Add exception handling in the main function for better error reporting

Wrapping the calls to run_uv_commands and deduplicate_requirements in a try-except block can help catch and report errors gracefully, providing clearer feedback if something goes wrong during execution.

Example:

 def main():
     """Main function to run the requirements processing."""
-    run_uv_commands()
-    deduplicate_requirements()
-    print("Requirements files have been successfully processed.")
+    try:
+        run_uv_commands()
+        deduplicate_requirements()
+        print("Requirements files have been successfully processed.")
+    except Exception as e:
+        print(f"An error occurred: {e}")
+        sys.exit(1)

Don't forget to import sys at the top of the file:

 import subprocess
+ import sys
 from pathlib import Path

Committable suggestion skipped: line range outside the PR's diff.
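
A narrower variant is sketched here, under the assumption that the most likely failure is a non-zero exit from one of the subprocess calls. Catching subprocess.CalledProcessError gives access to the failing command and its exit status, and writing to stderr keeps the message out of piped output. The `run_step` helper and the placeholder child processes are hypothetical.

```python
import subprocess
import sys


def run_step(cmd):
    """Run one command; return 0 on success or the child's exit status on failure."""
    try:
        subprocess.run(cmd, check=True)
    except subprocess.CalledProcessError as exc:
        # exc.cmd and exc.returncode identify exactly which step failed.
        print(
            f"Command {exc.cmd!r} failed with exit status {exc.returncode}",
            file=sys.stderr,
        )
        return exc.returncode
    return 0


# Placeholder child processes that exit with a known status.
ok = run_step([sys.executable, "-c", "raise SystemExit(0)"])
failed = run_step([sys.executable, "-c", "raise SystemExit(3)"])
print(ok, failed)
```

A caller such as main() could pass the returned status to sys.exit so that the script's own exit code reflects the failing step.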

Comment on lines 107 to 109
package_name = line.split("==")[0]
current_package = package_name
current_entry = [line]

🛠️ Refactor suggestion

⚠️ Potential issue

Use robust parsing for requirement lines to handle complex cases

Extracting the package name by splitting on '==' may not handle all cases, such as version specifiers like >=, <=, ~= or complex requirement lines with environment markers. Consider using the packaging.requirements.Requirement class for more robust parsing.

Here's how you can modify the code:

+ from packaging.requirements import Requirement
  ...
-     package_name = line.split("==")[0]
+     requirement = Requirement(line.strip())
+     package_name = requirement.name
      current_package = package_name
      current_entry = [line]

You'll need to add the packaging library to your dependencies if it's not already included.

Committable suggestion skipped: line range outside the PR's diff.
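
To illustrate why the split is fragile, here is a minimal standard-library sketch. It is not a substitute for packaging.requirements.Requirement: the regular expression follows the name grammar from PEP 508 and deliberately ignores everything after the name (extras, specifiers, markers, URLs). `requirement_name` is a hypothetical helper, and the sample lines are taken from the dependency list in this PR.

```python
import re

# PEP 508 name grammar: a letter or digit at both ends; '.', '_' and '-' allowed inside.
_NAME = re.compile(r"\s*([A-Za-z0-9](?:[A-Za-z0-9._-]*[A-Za-z0-9])?)")


def requirement_name(line):
    """Return the distribution name at the start of a requirement line, or None."""
    match = _NAME.match(line)
    return match.group(1) if match else None


samples = [
    "psutil==5.9.4",
    "Django~=4.2.7",
    "lxml>=5.2.1,<6",
    "uvicorn[standard]==0.22.0",
]
for line in samples:
    # Left: the current split-based name. Right: the regex-based name.
    print(repr(line.split("==")[0]), "vs", repr(requirement_name(line)))
```

Lines that do not start with a name, such as indented comments or option lines, yield None and can be treated as continuations of the current entry.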

sonarcloud bot commented Nov 1, 2024

@codecov-commenter

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 74.30%. Comparing base (c557f70) to head (9cea1a1).
Report is 48 commits behind head on develop.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #8631      +/-   ##
===========================================
+ Coverage    74.26%   74.30%   +0.03%     
===========================================
  Files          400      401       +1     
  Lines        43218    43395     +177     
  Branches      3909     3945      +36     
===========================================
+ Hits         32096    32244     +148     
- Misses       11122    11151      +29     

Components     Coverage Δ
cvat-ui        78.65% <ø> (-0.08%) ⬇️
cvat-server    70.58% <72.72%> (+0.11%) ⬆️
