@fatelei fatelei commented Dec 11, 2025

bytearray: prevent UAF in search-like methods by exporting self buffer

Fix a heap use-after-free in the bytearray search helpers, which captured the
raw buffer pointer before normalizing the “sub” argument. A crafted __index__
or buffer provider could clear or resize the same bytearray during argument
conversion, invalidating the saved pointer and leading to a UAF.

Change:
• For bytearray methods find/rfind/index/rindex/count/startswith/endswith/
__contains__/split/rsplit, export a temporary Py_buffer on self and pass
view.buf/view.len to the _Py_bytes_* helpers, then release it. While the
export is live, resizing or clearing raises BufferError, which prevents
stale pointer dereferences.
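
The export rule this change relies on can be observed from Python. A minimal sketch, assuming a memoryview is an acceptable stand-in for the temporary Py_buffer; the helper name resize_blocked_while_exported is made up for illustration and is not part of the patch:

```python
def resize_blocked_while_exported(data: bytes) -> bool:
    """Return True if clearing a non-empty bytearray fails while an export is live."""
    target = bytearray(data)
    view = memoryview(target)   # stands in for the temporary Py_buffer export
    try:
        target.clear()          # resize attempt while the export is live
    except BufferError:
        blocked = True
    else:
        blocked = False
    finally:
        view.release()          # counterpart of PyBuffer_Release
    target.clear()              # allowed again once no export remains
    return blocked and len(target) == 0


print(resize_blocked_while_exported(b"abc"))  # prints True
```

The same mechanism is what turns the stale-pointer case into a BufferError: the mutation is refused for as long as the export exists.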

Tests:
• Add re-entrancy tests to Lib/test/test_bytes.py that verify BufferError is
raised when __index__ clears the target during find/count/index/rfind/rindex.
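
The shape of such a re-entrancy check can be modelled in pure Python. This is a sketch only, not the code added to Lib/test/test_bytes.py: find_with_export and ClearingIndex are hypothetical names, and a memoryview stands in for the Py_buffer export so the sketch behaves the same with or without the fix.

```python
import operator


class ClearingIndex:
    """Hypothetical object whose __index__ tries to clear a bytearray."""

    def __init__(self, target, value):
        self.target = target
        self.value = value

    def __index__(self):
        self.target.clear()  # re-entrant mutation during argument conversion
        return self.value


def find_with_export(target, sub):
    """Model of the fixed method: export first, convert `sub`, then search."""
    with memoryview(target) as view:   # export stays live for the whole call
        byte = operator.index(sub)     # may run arbitrary Python code
        return view.tobytes().find(bytes([byte]))


target = bytearray(b"hello")
try:
    find_with_export(target, ClearingIndex(target, ord("l")))
except BufferError:
    outcome = "BufferError"
else:
    outcome = "no error"

print(outcome, bytes(target))  # prints: BufferError b'hello'
```

Because the clear is refused, the target keeps its contents and the search never runs on freed memory.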

This mirrors existing protection used in bytearray.join and removes the
re-entrancy hazard without changing public APIs.

@fatelei fatelei changed the title gh-142495: bytearray: prevent UAF in search-like methods by exporting self buffer gh-142558: bytearray: prevent UAF in search-like methods by exporting self buffer Dec 11, 2025
@picnixz picnixz changed the title gh-142558: bytearray: prevent UAF in search-like methods by exporting self buffer gh-142560: bytearray: prevent UAF in search-like methods by exporting self buffer Dec 11, 2025

aisk commented Dec 11, 2025

Hi, please add a news entry for this change via blurb or blurb-it: https://devguide.python.org/contrib/core-team/committing/#how-to-add-a-news-entry


aisk commented Dec 11, 2025

Hi, according to the devguide, force-pushing should be avoided.


aisk commented Dec 12, 2025

A lot of the methods changed in this PR share the same pattern, such as bytearray_find_impl, bytearray_count_impl, bytearray_index_impl, and so on. This introduces a lot of duplicated code.

Can we add a wrapper function to reduce this boilerplate?


fatelei commented Dec 12, 2025

> A lot of the methods changed in this PR share the same pattern, such as bytearray_find_impl, bytearray_count_impl, bytearray_index_impl, and so on. This introduces a lot of duplicated code.
>
> Can we add a wrapper function to reduce this boilerplate?

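/* Signature of a search helper that operates on a raw buffer. */
typedef PyObject* (*_ba_bytes_op)(const char *buf, Py_ssize_t len,
                                  PyObject *sub, Py_ssize_t start,
                                  Py_ssize_t end);

/* Run `op` on self's data while holding a buffer export, so that self
   cannot be resized or cleared during the call. */
static PyObject *
_bytearray_with_buffer(PyByteArrayObject *self, PyObject *sub,
                       Py_ssize_t start, Py_ssize_t end, _ba_bytes_op op)
{
    Py_buffer view;
    PyObject *res;
    if (PyObject_GetBuffer((PyObject *)self, &view, PyBUF_SIMPLE) != 0) {
        return NULL;
    }
    res = op((const char *)view.buf, view.len, sub, start, end);
    /* Release the export on both the success and the error path. */
    PyBuffer_Release(&view);
    return res;
}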

@vstinner vstinner left a comment
