Add support for Python 3.12 #276
@jameslamb @pentschev It looks like there are segfaults (from libucx?) in all the Python 3.12 tests (on multiple OSes, CUDA versions, amd64/arm64, both pip and conda). I don't have a good explanation for this. Can we investigate further? Segfault logs
Worth noting that last night, I restarted only the failing jobs here (which means only the Python 3.12 ones). It's possible that some change happened in some dependency, between when those other jobs passed and these were re-run, which would cause these segfaults on even more jobs. To check that, I just restarted all CI here.

(edited): Yes, it looks like all Python 3.12 jobs, and only Python 3.12 jobs, are failing deterministically.
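A rough way to tell a segfault apart from an ordinary test failure when re-running a job locally is to look at how the child process exited. The sketch below is only an illustration and is not part of the CI scripts in this repo: the helper name `run_and_classify` is hypothetical, and it assumes a POSIX system, where `subprocess` reports death-by-signal as a negative return code. It also sets `PYTHONFAULTHANDLER=1` so that a Python child process dumps its Python-level traceback before dying.

```python
import os
import signal
import subprocess
import sys


def run_and_classify(cmd, enable_faulthandler=True):
    """Run `cmd`; return the fatal signal's name, or None if no signal killed it.

    Hypothetical helper for illustration. POSIX only: a negative return code
    from subprocess means the child was terminated by that signal number.
    """
    env = dict(os.environ)
    if enable_faulthandler:
        # Ask a Python child to print a traceback on SIGSEGV and similar.
        env["PYTHONFAULTHANDLER"] = "1"
    proc = subprocess.run(cmd, env=env, capture_output=True, text=True)
    if proc.returncode < 0:
        # Surface whatever the child printed before it died.
        sys.stderr.write(proc.stderr)
        return signal.Signals(-proc.returncode).name
    return None
```

For example, passing the `python -m pytest ...` command for a failing test as a list of arguments would return `"SIGSEGV"` if the interpreter crashed, and `None` if pytest exited on its own (even with failing tests).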
Ok, think I found the problem! Or at least, one problem. Narrowed this down to the failing test case.

Code to do that:

On an x86_64 machine with CUDA 12.2, this reproduces the segfault:

    docker run \
      --rm \
      --gpus 1 \
      --env RAPIDS_BUILD_TYPE="pull-request" \
      --env RAPIDS_REF_NAME=pull-request/276 \
      --env RAPIDS_REPOSITORY=rapidsai/ucxx \
      --env RAPIDS_SHA=7b9f38614547684cd15efbb9c5bdff209ced45a3 \
      -v $(pwd):/opt/work \
      -w /opt/work \
      -it rapidsai/ci-conda:cuda12.5.1-ubuntu22.04-py3.12 \
      bash

    . /opt/conda/etc/profile.d/conda.sh

    rapids-dependency-file-generator \
      --output conda \
      --file-key test_python \
      --matrix "cuda=${RAPIDS_CUDA_VERSION%.*};arch=$(arch);py=${RAPIDS_PY_VERSION}" | tee env.yaml

    rapids-mamba-retry env create --yes -f env.yaml -n test
    conda activate test

    CPP_CHANNEL=$(rapids-download-conda-from-s3 cpp)

    rapids-mamba-retry install \
      --channel "${CPP_CHANNEL}" \
      libucxx ucxx

    UCXPY_PROGRESS_MODE=thread \
    UCXPY_ENABLE_PYTHON_FUTURE=1 \
    python -m pytest \
      -vs \
      'python/ucxx/ucxx/_lib_async/tests/test_custom_send_recv.py[<lambda>0]'

Ran that with

    UCXPY_PROGRESS_MODE=thread \
    UCXPY_ENABLE_PYTHON_FUTURE=1 \
    gdb --args \
      python -m pytest \
      -vs \
      'python/ucxx/ucxx/_lib_async/tests/test_custom_send_recv.py[<lambda>0]'

    # gdb commands:
    #
    # $ run
    # $ bt

and was able to get this more informative trace:
It looks to me like the signature of

Relevant code in ucxx/cpp/python/src/future.cpp, lines 143 to 150 in 18ada55

The signature for that call in Python 3.11 (code link):

    static PyObject *
    future_set_result(FutureObj *fut, PyObject *res)

Python 3.12 (code link):

    static PyObject *
    future_set_result(asyncio_state *state, FutureObj *fut, PyObject *res)

Looks like that happened around https://github.com/python/cpython/pull/99122/files#diff-6bd9e39980b88a721d902bcd915bbb3f24762f7f253430c45e52c42a2c5afd01R578

I think
Thanks @bdice for reporting and @jameslamb for investigating. It looks like you're right, I'll work on a proper fix for this today.
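For context on why the `future_set_result` signature change matters: that function is a private CPython internal, so its argument list can change between minor releases without notice. A minimal sketch of the same idea (completing an `asyncio` future from a separate progress thread) using only public, documented API is shown below. It is illustrative only and is not claimed to be the fix adopted in this PR; the names `notify_from_thread` and `main` are hypothetical.

```python
import asyncio
import threading


def notify_from_thread(loop, future, value):
    """Complete `future` from a non-event-loop thread via public asyncio API."""

    def _set():
        # Runs on the event loop thread; guard against cancelled/finished futures.
        if not future.done():
            future.set_result(value)

    # call_soon_threadsafe is the documented way to schedule work on a loop
    # from another thread.
    loop.call_soon_threadsafe(_set)


async def main():
    loop = asyncio.get_running_loop()
    future = loop.create_future()
    # Stand-in for a progress thread that finishes a transfer.
    worker = threading.Thread(
        target=notify_from_thread, args=(loop, future, "done")
    )
    worker.start()
    result = await future
    worker.join()
    return result
```

Running `asyncio.run(main())` returns the value passed in by the worker thread. The trade-off is an extra hop through the event loop, in exchange for not depending on interpreter internals.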
Thanks again @jameslamb for investigating, it looks like
All tests passing now, thanks @jameslamb !
Ahhhh awesome!!! Thanks so much for the fix and the thorough explanation @pentschev, much appreciated!
/merge
Follow-up to #1380.
Now that both `cudf` (rapidsai/cudf#16745) and `ucxx` (rapidsai/ucxx#276) have Python 3.12 wheels available, it should be possible to test `dask-cuda` against Python 3.12 in CI. This proposes that.
Authors:
- James Lamb (https://github.com/jameslamb)
Approvers:
- Bradley Dice (https://github.com/bdice)
URL: #1382
Description
Contributes to rapidsai/build-planning#40
This PR adds support for Python 3.12.
Notes for Reviewers
This is part of ongoing work to add Python 3.12 support across RAPIDS.
It temporarily introduces a build/test matrix including Python 3.12, from rapidsai/shared-workflows#213.
A follow-up PR will revert back to pointing at the `branch-24.10` branch of `shared-workflows` once all RAPIDS repos have added Python 3.12 support.
This will fail until all dependencies have been updated to Python 3.12
CI here is expected to fail until all of this project's upstream dependencies support Python 3.12.
This can be merged whenever all CI jobs are passing.