Skip to content

Conversation

@youknowone
Copy link
Member

@youknowone youknowone commented Oct 16, 2025

Summary by CodeRabbit

  • New Features

    • Code objects can now be instantiated directly via a public constructor.
    • Bytecode is validated and decoded when creating or replacing code objects, raising errors on invalid input.
  • Improvements

    • Stronger validation of local variables and symbol tables to ensure consistency.
    • Function objects now serialize access to their code, improving safety when reading or updating code.
  • Bug Fixes

    • Replacing code now re-parses and validates bytecode to prevent inconsistent state.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 16, 2025

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The pull request title "Fix PyCode constructor/replace" directly addresses the primary changes in the changeset. The main objective is to implement a full constructor pathway for PyCode and rework the replace flow, which is exactly what the title describes. While the changeset includes supporting modifications such as wrapping PyFunction.code in PyMutex for thread safety and updating related call sites in the JIT module, these changes are secondary infrastructure changes necessary to support the primary constructor/replace implementation. The title is clear, specific, and concise enough that a teammate scanning commit history would immediately understand the primary focus of the changeset.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
vm/src/builtins/function.rs (3)

46-54: GC traverse is incomplete after adding PyMutex fields

New PyMutex-wrapped fields (code, name, qualname, type_params, annotations, module, doc) are not traversed. This breaks cycle GC and can cause leaks or missed edges.

Include them in Traverse.

 unsafe impl Traverse for PyFunction {
     fn traverse(&self, tracer_fn: &mut TraverseFn<'_>) {
         self.globals.traverse(tracer_fn);
         if let Some(closure) = self.closure.as_ref() {
             closure.as_untyped().traverse(tracer_fn);
         }
-        self.defaults_and_kwdefaults.traverse(tracer_fn);
+        self.defaults_and_kwdefaults.traverse(tracer_fn);
+        self.code.traverse(tracer_fn);
+        self.name.traverse(tracer_fn);
+        self.qualname.traverse(tracer_fn);
+        self.type_params.traverse(tracer_fn);
+        self.annotations.traverse(tracer_fn);
+        self.module.traverse(tracer_fn);
+        self.doc.traverse(tracer_fn);
     }
 }

99-103: Borrowing from a temporary lock guard – compile error

let code = &*self.code.lock(); takes a reference tied to a temporary guard that’s dropped at the end of the statement. This will not compile.

Clone the PyRef instead.

-        let code = &*self.code.lock();
+        let code = self.code.lock().clone();

609-611: Missing lock on code in JIT compile path – won’t compile

After wrapping code in PyMutex, zelf.code.code is invalid. Lock or clone first.

-                rustpython_jit::compile(&zelf.code.code, &arg_types, ret_type)
+                let code = zelf.code.lock().clone();
+                rustpython_jit::compile(&code.code, &arg_types, ret_type)
🧹 Nitpick comments (4)
vm/src/builtins/function.rs (2)

403-408: Use the cloned code for consistency (avoid TOCTOU and extra lock)

You already cloned code above. Re-locking for flags can observe a different code object and adds overhead. Use code.flags.

-        let locals = if self
-            .code
-            .lock()
-            .flags
-            .contains(bytecode::CodeFlags::NEW_LOCALS)
-        {
+        let locals = if code.flags.contains(bytecode::CodeFlags::NEW_LOCALS) {

463-466: Invalidate JIT when __code__ changes

Changing __code__ should drop any cached compiled code; otherwise, stale JIT may run with mismatched code/flags.

     #[pygetset(setter)]
     fn set___code__(&self, code: PyRef<PyCode>) {
         *self.code.lock() = code;
+        #[cfg(feature = "jit")]
+        {
+            // If available, clear cached compiled code. Adjust to actual API of OnceCell.
+            let _ = self.jitted_code.take();
+        }
     }

Please verify crate::common::lock::OnceCell exposes take() (or an equivalent method). If not, we may need a small wrapper or to recreate the OnceCell via interior mutability.

vm/src/builtins/code.rs (2)

608-618: Unsafe co_code byte view: consider safe encode path

Current approach assumes CodeUnit is exactly 2 bytes and tightly packed. If that invariant ever changes, this breaks.

Optionally, re-encode via iterator for safety.

-        // SAFETY: CodeUnit is #[repr(C)] with size 2, so we can safely transmute to bytes
-        let bytes = unsafe {
-            std::slice::from_raw_parts(
-                self.code.instructions.as_ptr() as *const u8,
-                self.code.instructions.len() * 2,
-            )
-        };
-        vm.ctx.new_bytes(bytes.to_vec())
+        // Safer: rebuild from op/arg pairs (avoids layout assumptions)
+        let mut out = Vec::with_capacity(self.code.instructions.len() * 2);
+        for cu in self.code.instructions.iter() {
+            out.push(u8::from(cu.op));
+            out.push(cu.arg.0);
+        }
+        vm.ctx.new_bytes(out)

1037-1055: parse_bytecode(): minor diagnostics improvement (optional)

Consider returning a PyErr with precise position/opcode for easier debugging rather than None.

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3b48dcc and 0cb5f20.

⛔ Files ignored due to path filters (1)
  • Lib/test/test_code.py is excluded by !Lib/**
📒 Files selected for processing (2)
  • vm/src/builtins/code.rs (8 hunks)
  • vm/src/builtins/function.rs (5 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Format Rust code with the default rustfmt style (run cargo fmt)
Run clippy and fix any warnings or lints introduced by your changes
Follow Rust best practices for error handling and memory management

Files:

  • vm/src/builtins/code.rs
  • vm/src/builtins/function.rs
{vm,stdlib}/**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use RustPython macros (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • vm/src/builtins/code.rs
  • vm/src/builtins/function.rs
🧬 Code graph analysis (2)
vm/src/builtins/code.rs (5)
vm/src/builtins/function.rs (7)
  • py_new (671-712)
  • py_new (771-779)
  • py_new (898-902)
  • obj (522-522)
  • new (58-91)
  • new (783-785)
  • new (907-911)
vm/src/frame.rs (4)
  • cls (1418-1419)
  • cls (1466-1467)
  • new (137-171)
  • arg (2457-2458)
compiler/core/src/bytecode.rs (4)
  • names (1254-1258)
  • new (401-403)
  • new (833-835)
  • try_from (814-820)
compiler/core/src/marshal.rs (6)
  • len (198-205)
  • len (228-230)
  • len (235-237)
  • len (241-246)
  • instructions (188-195)
  • try_from (79-111)
vm/src/vm/context.rs (1)
  • new_code (627-630)
vm/src/builtins/function.rs (2)
vm/src/frame.rs (3)
  • new (137-171)
  • locals (188-224)
  • locals (2160-2164)
vm/src/stdlib/io.rs (3)
  • lock (1419-1423)
  • lock (2354-2358)
  • lock (3304-3311)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Run tests under miri
  • GitHub Check: Run snippets and cpython tests (windows-latest)
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Check Rust code with rustfmt and clippy
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Run rust tests (windows-latest)
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Ensure compilation on various targets
🔇 Additional comments (6)
vm/src/builtins/code.rs (6)

396-421: Constructor string tuple validation LGTM

Good, strict checking that names/varnames/cellvars/freevars are all strings and interning them.


446-454: Validate nlocals vs varnames – good guard

Ensures internal invariant holds. LGTM.


455-462: Bytecode validation path is solid

Even-length check and opcode validation via Instruction::try_from. Clear error message.


925-935: replace(): validate/coerce co_code via same parser – good

Reuses parse_bytecode; consistent with constructor. LGTM.


953-962: replace(): co_nlocals validation guard LGTM

Good invariant check aligning with constructor semantics.


977-1004: replace(): fields rebuild is coherent

Rebuilds CodeObject preserving locations and cell2arg. Nice.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
vm/src/builtins/code.rs (2)

546-553: co_cellvars uses non-existent field; should read from self.code.cellvars (compile error)

Accessing self.cellvars will not compile; PyCode has only the code field. Use self.code.cellvars.

Apply:

-    pub fn co_cellvars(&self, vm: &VirtualMachine) -> PyTupleRef {
-        let cellvars = self
-            .cellvars
-            .iter()
-            .map(|name| name.to_pyobject(vm))
-            .collect();
-        vm.ctx.new_tuple(cellvars)
-    }
+    pub fn co_cellvars(&self, vm: &VirtualMachine) -> PyTupleRef {
+        let cellvars = self
+            .code
+            .cellvars
+            .iter()
+            .map(|name| name.to_pyobject(vm))
+            .collect();
+        vm.ctx.new_tuple(cellvars)
+    }

895-903: Don’t unwrap user inputs in replace(); validate and raise TypeError on bad items

as_interned_str(vm).unwrap() will panic if elements aren’t strings. Convert with type checks and return a Python TypeError matching constructor behavior.

Apply:

-        let names = match co_names {
-            OptionalArg::Present(names) => names,
+        let names = match co_names {
+            OptionalArg::Present(names) => names,
             OptionalArg::Missing => self
                 .code
                 .names
                 .deref()
                 .iter()
                 .map(|name| name.to_pyobject(vm))
                 .collect(),
         };
@@
-        let varnames = match co_varnames {
-            OptionalArg::Present(varnames) => varnames,
-            OptionalArg::Missing => self.code.varnames.iter().map(|s| s.to_object()).collect(),
-        };
+        let varnames = match co_varnames {
+            OptionalArg::Present(varnames) => varnames,
+            OptionalArg::Missing => self.code.varnames.iter().map(|s| s.to_object()).collect(),
+        };
@@
-        let cellvars = match co_cellvars {
-            OptionalArg::Present(cellvars) => cellvars
-                .into_iter()
-                .map(|o| o.as_interned_str(vm).unwrap())
-                .collect(),
-            OptionalArg::Missing => self.code.cellvars.clone(),
-        };
+        let cellvars: Box<[&'static PyStrInterned]> = match co_cellvars {
+            OptionalArg::Present(cellvars) => cellvars
+                .into_iter()
+                .map(|o| {
+                    o.as_interned_str(vm).ok_or_else(|| {
+                        vm.new_type_error("co_cellvars must be a sequence of strings".to_owned())
+                    })
+                })
+                .collect::<PyResult<Vec<_>>>()?
+                .into_boxed_slice(),
+            OptionalArg::Missing => self.code.cellvars.clone(),
+        };
@@
-        let freevars = match co_freevars {
-            OptionalArg::Present(freevars) => freevars
-                .into_iter()
-                .map(|o| o.as_interned_str(vm).unwrap())
-                .collect(),
-            OptionalArg::Missing => self.code.freevars.clone(),
-        };
+        let freevars: Box<[&'static PyStrInterned]> = match co_freevars {
+            OptionalArg::Present(freevars) => freevars
+                .into_iter()
+                .map(|o| {
+                    o.as_interned_str(vm).ok_or_else(|| {
+                        vm.new_type_error("co_freevars must be a sequence of strings".to_owned())
+                    })
+                })
+                .collect::<PyResult<Vec<_>>>()?
+                .into_boxed_slice(),
+            OptionalArg::Missing => self.code.freevars.clone(),
+        };

Then, just before building new_code, convert names/varnames with validation:

-            names: names
-                .into_iter()
-                .map(|o| o.as_interned_str(vm).unwrap())
-                .collect(),
-            varnames: varnames
-                .into_iter()
-                .map(|o| o.as_interned_str(vm).unwrap())
-                .collect(),
+            names: {
+                let v: Vec<_> = names.into_iter().map(|o| {
+                    o.as_interned_str(vm).ok_or_else(|| {
+                        vm.new_type_error("co_names must be a sequence of strings".to_owned())
+                    })
+                }).collect::<PyResult<_>>()?;
+                v.into_boxed_slice()
+            },
+            varnames: {
+                let v: Vec<_> = varnames.into_iter().map(|o| {
+                    o.as_interned_str(vm).ok_or_else(|| {
+                        vm.new_type_error("co_varnames must be a sequence of strings".to_owned())
+                    })
+                }).collect::<PyResult<_>>()?;
+                v.into_boxed_slice()
+            },

Also applies to: 910-913, 937-951

🧹 Nitpick comments (2)
vm/src/builtins/function.rs (2)

99-105: Avoid holding the code lock across the entire argument fill

let code = &*self.code.lock(); keeps the lock for the whole function scope. Clone the PyRef<PyCode> and release the lock early.

Apply:

-        let code = &*self.code.lock();
+        let code = self.code.lock().clone(); // PyRef<PyCode>
+        let code = &*code; // drop lock; use local clone

Keeps critical section small and reduces contention.


401-409: Reuse the cloned code instead of re-locking for NEW_LOCALS check

You already cloned code above; use code.flags here to avoid an extra lock and potential inconsistency.

Apply:

-        let locals = if self
-            .code
-            .lock()
-            .flags
-            .contains(bytecode::CodeFlags::NEW_LOCALS)
-        {
+        let locals = if code.flags.contains(bytecode::CodeFlags::NEW_LOCALS) {
📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0cb5f20 and 8c3a4af.

⛔ Files ignored due to path filters (1)
  • Lib/test/test_code.py is excluded by !Lib/**
📒 Files selected for processing (3)
  • vm/src/builtins/code.rs (8 hunks)
  • vm/src/builtins/function.rs (6 hunks)
  • vm/src/builtins/function/jit.rs (6 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Format Rust code with the default rustfmt style (run cargo fmt)
Run clippy and fix any warnings or lints introduced by your changes
Follow Rust best practices for error handling and memory management

Files:

  • vm/src/builtins/function/jit.rs
  • vm/src/builtins/function.rs
  • vm/src/builtins/code.rs
{vm,stdlib}/**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use RustPython macros (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • vm/src/builtins/function/jit.rs
  • vm/src/builtins/function.rs
  • vm/src/builtins/code.rs
🧬 Code graph analysis (3)
vm/src/builtins/function/jit.rs (1)
compiler/core/src/bytecode.rs (1)
  • arg_names (1149-1177)
vm/src/builtins/function.rs (1)
vm/src/frame.rs (3)
  • new (137-171)
  • locals (188-224)
  • locals (2160-2164)
vm/src/builtins/code.rs (2)
compiler/core/src/bytecode.rs (3)
  • new (401-403)
  • new (833-835)
  • try_from (814-820)
compiler/core/src/marshal.rs (2)
  • instructions (188-195)
  • try_from (79-111)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Run snippets and cpython tests (windows-latest)
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Run rust tests (windows-latest)
  • GitHub Check: Run tests under miri
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Ensure compilation on various targets
  • GitHub Check: Check Rust code with rustfmt and clippy
🔇 Additional comments (5)
vm/src/builtins/code.rs (2)

1037-1055: parse_bytecode: solid validation and conversion path

Even-length check and opcode validation via Instruction::try_from is correct. Good addition.


610-616: Unsafe layout assumption in co_code; verify CodeUnit #[repr(C)] or use a safe encoder

Transmuting CodeUnit to bytes relies on #[repr(C)] and size 2. Please verify, or switch to a safe conversion.

Verification script to confirm CodeUnit layout attribute:

Safe alternative (no unsafe):

     pub fn co_code(&self, vm: &VirtualMachine) -> crate::builtins::PyBytesRef {
-        // SAFETY: CodeUnit is #[repr(C)] with size 2, so we can safely transmute to bytes
-        let bytes = unsafe {
-            std::slice::from_raw_parts(
-                self.code.instructions.as_ptr() as *const u8,
-                self.code.instructions.len() * 2,
-            )
-        };
-        vm.ctx.new_bytes(bytes.to_vec())
+        let mut bytes = Vec::with_capacity(self.code.instructions.len() * 2);
+        for cu in self.code.instructions.iter() {
+            bytes.push(u8::from(cu.op));
+            bytes.push(cu.arg.0);
+        }
+        vm.ctx.new_bytes(bytes)
     }

As per coding guidelines (prefer safe idioms; clippy-friendly).

vm/src/builtins/function/jit.rs (2)

68-79: Thread-safe access to code and early varargs/varkeywords rejection look good

Locking the code object and rejecting varargs/varkeywords up front simplifies downstream logic.


161-167: Lock scoping and arg indexing improvements are appropriate

  • Using locals arg_count/posonlyarg_count avoids repeated lock reads.
  • kw-only arg index = kw_idx + arg_count is correct.
  • Explicit drop(code) before returning is a nice touch.

Also applies to: 185-186, 209-214, 221-223

vm/src/builtins/function.rs (1)

609-611: JIT compile path: correct switch to locked code object

Passing &code.code from a locked code ref is correct and consistent with new synchronization.

Comment on lines +507 to +508
cell2arg: None, // TODO: reuse `fn cell2arg`
constants,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

cell2arg left as None breaks closures capturing arguments

When creating code via the constructor, closures will not get argument cells populated (Frame.fill_locals_from_args relies on cell2arg). Compute this mapping.

Apply:

-            cell2arg: None, // TODO: reuse `fn cell2arg`
+            // Map each cellvar to its argument index or -1 if not an arg
+            cell2arg: Some(
+                cellvars
+                    .iter()
+                    .map(|cv| {
+                        varnames
+                            .iter()
+                            .position(|vn| vn.as_str() == cv.as_str())
+                            .map(|i| i as i32)
+                            .unwrap_or(-1)
+                    })
+                    .collect::<Vec<_>>()
+                    .into_boxed_slice(),
+            ),

Also recompute cell2arg similarly in replace() whenever varnames or cellvars change. I can provide that diff too if desired.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
cell2arg: None, // TODO: reuse `fn cell2arg`
constants,
// Map each cellvar to its argument index or -1 if not an arg
cell2arg: Some(
cellvars
.iter()
.map(|cv| {
varnames
.iter()
.position(|vn| vn.as_str() == cv.as_str())
.map(|i| i as i32)
.unwrap_or(-1)
})
.collect::<Vec<_>>()
.into_boxed_slice(),
),
constants,
🤖 Prompt for AI Agents
In vm/src/builtins/code.rs around lines 507-508, the constructor sets cell2arg:
None which leaves closures without argument cells populated; compute and set the
cell2arg mapping instead of None by mapping each cellvar name to its
corresponding argument index (search varnames for the cellvar and use the arg
index if found), and ensure the same recomputation is performed in replace()
whenever varnames or cellvars are changed so Frame.fill_locals_from_args can
rely on a valid cell2arg mapping.

@youknowone youknowone force-pushed the code branch 2 times, most recently from e13d468 to fca521c Compare October 21, 2025 04:57
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
vm/src/builtins/code.rs (1)

894-903: Don’t unwrap on user data; validate names/varnames/cellvars/freevars are strings

replace() converts provided vectors using as_interned_str(...).unwrap(). If a non-str slips in, this panics. Return a TypeError instead, mirroring the constructor’s validation.

Apply:

-        let names = match co_names {
-            OptionalArg::Present(names) => names,
+        let names = match co_names {
+            OptionalArg::Present(names) => {
+                let mut out = Vec::with_capacity(names.len());
+                for o in names {
+                    let Some(s) = o.as_interned_str(vm) else {
+                        return Err(vm.new_type_error("co_names must be a sequence of strings".to_owned()));
+                    };
+                    out.push(s);
+                }
+                out
+            }
             OptionalArg::Missing => self
                 .code
                 .names
                 .deref()
                 .iter()
-                .map(|name| name.to_pyobject(vm))
+                .map(|name| name.to_pyobject(vm))
                 .collect(),
         };
@@
-        let varnames = match co_varnames {
-            OptionalArg::Present(varnames) => varnames,
+        let varnames = match co_varnames {
+            OptionalArg::Present(varnames) => {
+                let mut out = Vec::with_capacity(varnames.len());
+                for o in varnames {
+                    let Some(s) = o.as_interned_str(vm) else {
+                        return Err(vm.new_type_error("co_varnames must be a sequence of strings".to_owned()));
+                    };
+                    out.push(s);
+                }
+                out
+            }
             OptionalArg::Missing => self.code.varnames.iter().map(|s| s.to_object()).collect(),
         };
@@
-        let cellvars = match co_cellvars {
-            OptionalArg::Present(cellvars) => cellvars
-                .into_iter()
-                .map(|o| o.as_interned_str(vm).unwrap())
-                .collect(),
+        let cellvars = match co_cellvars {
+            OptionalArg::Present(cellvars) => {
+                let mut out = Vec::with_capacity(cellvars.len());
+                for o in cellvars {
+                    let Some(s) = o.as_interned_str(vm) else {
+                        return Err(vm.new_type_error("co_cellvars must be a sequence of strings".to_owned()));
+                    };
+                    out.push(s);
+                }
+                out
+            }
             OptionalArg::Missing => self.code.cellvars.clone(),
         };
@@
-        let freevars = match co_freevars {
-            OptionalArg::Present(freevars) => freevars
-                .into_iter()
-                .map(|o| o.as_interned_str(vm).unwrap())
-                .collect(),
+        let freevars = match co_freevars {
+            OptionalArg::Present(freevars) => {
+                let mut out = Vec::with_capacity(freevars.len());
+                for o in freevars {
+                    let Some(s) = o.as_interned_str(vm) else {
+                        return Err(vm.new_type_error("co_freevars must be a sequence of strings".to_owned()));
+                    };
+                    out.push(s);
+                }
+                out
+            }
             OptionalArg::Missing => self.code.freevars.clone(),
         };

As per coding guidelines.

Also applies to: 910-913, 937-951

vm/src/builtins/function.rs (1)

46-54: GC traverse misses several fields after PyMutex refactor

The manual Traverse impl doesn’t traverse code, builtins, or the PyMutex-wrapped fields (name, qualname, type_params, annotations, module, doc). This can break cycle detection. Add them.

Apply:

 unsafe impl Traverse for PyFunction {
     fn traverse(&self, tracer_fn: &mut TraverseFn<'_>) {
         self.globals.traverse(tracer_fn);
+        self.builtins.traverse(tracer_fn);
+        self.code.traverse(tracer_fn);
         if let Some(closure) = self.closure.as_ref() {
             closure.as_untyped().traverse(tracer_fn);
         }
-        self.defaults_and_kwdefaults.traverse(tracer_fn);
+        self.defaults_and_kwdefaults.traverse(tracer_fn);
+        self.name.traverse(tracer_fn);
+        self.qualname.traverse(tracer_fn);
+        self.type_params.traverse(tracer_fn);
+        self.annotations.traverse(tracer_fn);
+        self.module.traverse(tracer_fn);
+        self.doc.traverse(tracer_fn);
     }
 }

As per coding guidelines.

♻️ Duplicate comments (2)
vm/src/builtins/code.rs (2)

507-515: Compute cell2arg in constructor; closures capturing args currently broken

cell2arg is left as None, so Frame.fill_locals_from_args won’t populate cells for arguments captured by closures. Compute the mapping from each cellvar to its argument index (or -1). This is a correctness bug affecting closures created from this constructor.

Apply:

-            cell2arg: None, // TODO: reuse `fn cell2arg`
+            // Map each cellvar to its argument index or -1 if not an argument
+            cell2arg: Some(
+                cellvars
+                    .iter()
+                    .map(|cv| {
+                        varnames
+                            .iter()
+                            .position(|vn| vn.as_str() == cv.as_str())
+                            .map(|i| i as i32)
+                            .unwrap_or(-1)
+                    })
+                    .collect::<Vec<_>>()
+                    .into_boxed_slice(),
+            ),

977-1006: Locations must match instruction count; rebuild when co_code changes

locations is cloned even if instructions length changes. This breaks invariants and downstream consumers. Recompute locations to instructions.len() using first_line_number when co_code is replaced.

Apply:

-        let new_code = CodeObject {
+        // Ensure locations length matches instructions length
+        let locations = if instructions.len() == self.code.instructions.len() {
+            self.code.locations.clone()
+        } else {
+            let row = first_line_number.unwrap_or(OneIndexed::MIN);
+            vec![
+                rustpython_compiler_core::SourceLocation {
+                    // NOTE: confirm field names below in core crate
+                    row,
+                    column: OneIndexed::from_zero_indexed(0),
+                };
+                instructions.len()
+            ]
+            .into_boxed_slice()
+        };
+
+        let new_code = CodeObject {
             flags: CodeFlags::from_bits_truncate(flags),
@@
-            instructions,
-            // FIXME: invalid locations. Actually locations is a duplication of linetable.
-            // It can be removed once we move every other code to use linetable only.
-            locations: self.code.locations.clone(),
+            instructions,
+            locations,
🧹 Nitpick comments (2)
vm/src/builtins/function.rs (2)

99-160: Optional: shrink code lock scope in fill_locals_from_args

The code lock is held across argument parsing. Copy small scalars (counts/flags) then drop the lock to reduce contention. Behavior unchanged.

Sketch:

  • Lock, read arg_count, kwonlyarg_count, posonlyarg_count, flags, varnames ref (clone Arc/Ptr if cheap), then release.
  • Proceed with locals filling.

Also applies to: 196-205, 272-303


348-363: Optional: validate closure length in set_function_attribute

When setting CLOSURE via MakeFunctionFlags, consider validating tuple length equals code.freevars.len() to match new checks and fail fast.

Example:

-            self.closure = Some(closure_tuple.try_into_typed::<PyCell>(vm)?);
+            let typed = closure_tuple.try_into_typed::<PyCell>(vm)?;
+            if typed.len() != self.code.lock().freevars.len() {
+                return Err(vm.new_value_error("closure size mismatch with freevars"));
+            }
+            self.closure = Some(typed);
📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e13d468 and fca521c.

⛔ Files ignored due to path filters (2)
  • Lib/test/test_code.py is excluded by !Lib/**
  • Lib/test/test_funcattrs.py is excluded by !Lib/**
📒 Files selected for processing (3)
  • vm/src/builtins/code.rs (8 hunks)
  • vm/src/builtins/function.rs (6 hunks)
  • vm/src/builtins/function/jit.rs (6 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • vm/src/builtins/function/jit.rs
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Format Rust code with the default rustfmt style (run cargo fmt)
Run clippy and fix any warnings or lints introduced by your changes
Follow Rust best practices for error handling and memory management

Files:

  • vm/src/builtins/code.rs
  • vm/src/builtins/function.rs
{vm,stdlib}/**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use RustPython macros (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • vm/src/builtins/code.rs
  • vm/src/builtins/function.rs
🧬 Code graph analysis (2)
vm/src/builtins/code.rs (4)
vm/src/builtins/function.rs (7)
  • py_new (672-713)
  • py_new (772-780)
  • py_new (899-903)
  • obj (522-522)
  • new (58-91)
  • new (784-786)
  • new (908-912)
compiler/core/src/bytecode.rs (3)
  • new (401-403)
  • new (833-835)
  • try_from (814-820)
compiler/core/src/marshal.rs (6)
  • len (198-205)
  • len (228-230)
  • len (235-237)
  • len (241-246)
  • instructions (188-195)
  • try_from (79-111)
vm/src/vm/context.rs (1)
  • new_code (627-630)
vm/src/builtins/function.rs (2)
vm/src/frame.rs (3)
  • new (137-171)
  • locals (188-224)
  • locals (2164-2168)
jit/src/lib.rs (1)
  • compile (114-131)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Run tests under miri
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Check Rust code with rustfmt and clippy
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Run snippets and cpython tests (windows-latest)
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Run rust tests (windows-latest)
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Ensure compilation on various targets
🔇 Additional comments (1)
vm/src/builtins/code.rs (1)

481-488: Verify SourceLocation field names align with core crate

This uses fields line and character_offset; earlier usages in this codebase typically used row and column. Please confirm rustpython_compiler_core::SourceLocation field names to avoid compile errors, and adjust here and in replace() accordingly.

If needed:

-                    line: row,
-                    character_offset: OneIndexed::from_zero_indexed(0),
+                    row,
+                    column: OneIndexed::from_zero_indexed(0),

Comment on lines +1001 to +1005
cellvars,
freevars,
cell2arg: self.code.cell2arg.clone(),
linetable,
exceptiontable,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Recompute cell2arg in replace() when varnames/cellvars change

Keeping old cell2arg after changing varnames/cellvars yields incorrect closure cells for arguments. Recompute when either input is provided; otherwise reuse existing.

Apply:

-            cellvars,
-            freevars,
-            cell2arg: self.code.cell2arg.clone(),
+            cellvars,
+            freevars,
+            cell2arg: if matches!(co_varnames, OptionalArg::Present(_))
+                || matches!(co_cellvars, OptionalArg::Present(_))
+            {
+                Some(
+                    cellvars
+                        .iter()
+                        .map(|cv| {
+                            varnames
+                                .iter()
+                                .position(|vn| vn.as_str() == cv.as_str())
+                                .map(|i| i as i32)
+                                .unwrap_or(-1)
+                        })
+                        .collect::<Vec<_>>()
+                        .into_boxed_slice(),
+                )
+            } else {
+                self.code.cell2arg.clone()
+            },
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
cellvars,
freevars,
cell2arg: self.code.cell2arg.clone(),
linetable,
exceptiontable,
cellvars,
freevars,
cell2arg: if matches!(co_varnames, OptionalArg::Present(_))
|| matches!(co_cellvars, OptionalArg::Present(_))
{
Some(
cellvars
.iter()
.map(|cv| {
varnames
.iter()
.position(|vn| vn.as_str() == cv.as_str())
.map(|i| i as i32)
.unwrap_or(-1)
})
.collect::<Vec<_>>()
.into_boxed_slice(),
)
} else {
self.code.cell2arg.clone()
},
linetable,
exceptiontable,
🤖 Prompt for AI Agents
In vm/src/builtins/code.rs around lines 1001 to 1005, the replace() call
currently reuses self.code.cell2arg.clone() unconditionally; change it so
cell2arg is recomputed whenever varnames or cellvars are provided (i.e., if
varnames.is_some() || cellvars.is_some()), otherwise keep the existing clone.
Recompute by invoking the existing utility that builds the mapping (e.g., call
the function that computes cell2arg from varnames and cellvars with the new
values), and pass that result into the new code object; if neither input
changed, fall back to self.code.cell2arg.clone().

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (3)
vm/src/builtins/code.rs (3)

507-508: cell2arg left as None breaks closures capturing arguments.

This issue has been previously identified. Setting cell2arg to None prevents closures from properly capturing function arguments. The cell2arg mapping must be computed to map each cellvar to its corresponding argument index.

Based on past review comments, the fix is to compute the mapping:

-            cell2arg: None, // TODO: reuse `fn cell2arg`
+            cell2arg: Some(
+                cellvars
+                    .iter()
+                    .map(|cv| {
+                        varnames
+                            .iter()
+                            .position(|vn| vn.as_str() == cv.as_str())
+                            .map(|i| i as i32)
+                            .unwrap_or(-1)
+                    })
+                    .collect::<Vec<_>>()
+                    .into_boxed_slice(),
+            ),

1001-1005: cell2arg not recomputed when varnames/cellvars change.

This issue was previously identified. When varnames or cellvars are modified via replace(), the cell2arg mapping must be recomputed to maintain correctness. Currently, it always clones the old mapping.

Based on past review comments, the fix is:

             cellvars,
             freevars,
-            cell2arg: self.code.cell2arg.clone(),
+            cell2arg: if matches!(co_varnames, OptionalArg::Present(_))
+                || matches!(co_cellvars, OptionalArg::Present(_))
+            {
+                Some(
+                    cellvars
+                        .iter()
+                        .map(|cv| {
+                            varnames
+                                .iter()
+                                .position(|vn| vn.as_str() == cv.as_str())
+                                .map(|i| i as i32)
+                                .unwrap_or(-1)
+                        })
+                        .collect::<Vec<_>>()
+                        .into_boxed_slice(),
+                )
+            } else {
+                self.code.cell2arg.clone()
+            },
             linetable,
             exceptiontable,

989-991: Locations remain inconsistent when instructions change.

The FIXME comment acknowledges that locations is kept from the old code object even when instructions may have changed. While a past review comment about this was marked as addressed in commit fca521c, the current code still has the issue. When co_code is replaced with a different number of instructions, the locations vector length won't match instructions.len().

The fix should ensure locations length matches instructions length:

+        // Ensure locations length matches instructions length
+        let locations = if instructions.len() == self.code.instructions.len() {
+            self.code.locations.clone()
+        } else {
+            let row = first_line_number.unwrap_or(OneIndexed::MIN);
+            vec![
+                rustpython_compiler_core::SourceLocation {
+                    line: row,
+                    character_offset: OneIndexed::from_zero_indexed(0),
+                };
+                instructions.len()
+            ]
+            .into_boxed_slice()
+        };
+
         let new_code = CodeObject {
             flags: CodeFlags::from_bits_truncate(flags),
             posonlyarg_count,
             arg_count,
             kwonlyarg_count,
             source_path: source_path.as_object().as_interned_str(vm).unwrap(),
             first_line_number,
             obj_name: obj_name.as_object().as_interned_str(vm).unwrap(),
             qualname: qualname.as_object().as_interned_str(vm).unwrap(),
 
             max_stackdepth,
             instructions,
-            // FIXME: invalid locations. Actually locations is a duplication of linetable.
-            // It can be removed once we move every other code to use linetable only.
-            locations: self.code.locations.clone(),
+            locations,
🧹 Nitpick comments (2)
vm/src/builtins/function/jit.rs (2)

67-107: Consider reducing the lock scope for better concurrency.

The code lock is held for the entire duration of annotation processing and iteration. Consider extracting only the necessary metadata (arg_names, flags) into local variables, then releasing the lock before the annotation processing loop. This would improve concurrency when multiple threads JIT functions simultaneously.

Apply this pattern:

 pub fn get_jit_arg_types(func: &Py<PyFunction>, vm: &VirtualMachine) -> PyResult<Vec<JitType>> {
-    let code = func.code.lock();
-    let arg_names = code.arg_names();
+    let (arg_names, has_varargs_or_varkeywords) = {
+        let code = func.code.lock();
+        let arg_names = code.arg_names();
+        let flags = code.flags.intersects(CodeFlags::HAS_VARARGS | CodeFlags::HAS_VARKEYWORDS);
+        (arg_names.to_owned(), flags)
+    };
 
-    if code
-        .flags
-        .intersects(CodeFlags::HAS_VARARGS | CodeFlags::HAS_VARKEYWORDS)
-    {
+    if has_varargs_or_varkeywords {
         return Err(new_jit_error(
             "Can't jit functions with variable number of arguments".to_owned(),
             vm,
         ));
     }

Note: You'd need to own the arg_names data or adjust the extraction strategy based on the Arguments struct definition.


161-221: Consider reducing the lock scope for better performance.

The code lock is held for approximately 60 lines while processing arguments, defaults, and keyword defaults. This lock duration could impact concurrent JIT compilation attempts. Consider extracting only the necessary metadata upfront, then releasing the lock before the argument processing logic.

Additionally, the explicit drop(code) at line 221 is unnecessary since the lock would be automatically released at function exit. While harmless, it adds clutter.

-    drop(code);
-
     jit_args.into_args().ok_or(ArgsError::NotAllArgsPassed)
📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fca521c and f2e19bd.

⛔ Files ignored due to path filters (2)
  • Lib/test/test_code.py is excluded by !Lib/**
  • Lib/test/test_funcattrs.py is excluded by !Lib/**
📒 Files selected for processing (3)
  • vm/src/builtins/code.rs (8 hunks)
  • vm/src/builtins/function.rs (6 hunks)
  • vm/src/builtins/function/jit.rs (6 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Format Rust code with the default rustfmt style (run cargo fmt)
Run clippy and fix any warnings or lints introduced by your changes
Follow Rust best practices for error handling and memory management

Files:

  • vm/src/builtins/code.rs
  • vm/src/builtins/function/jit.rs
  • vm/src/builtins/function.rs
{vm,stdlib}/**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use RustPython macros (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • vm/src/builtins/code.rs
  • vm/src/builtins/function/jit.rs
  • vm/src/builtins/function.rs
🧬 Code graph analysis (3)
vm/src/builtins/code.rs (3)
compiler/core/src/bytecode.rs (3)
  • new (401-403)
  • new (833-835)
  • try_from (814-820)
compiler/core/src/marshal.rs (2)
  • instructions (188-195)
  • try_from (79-111)
vm/src/vm/context.rs (1)
  • new_code (627-630)
vm/src/builtins/function/jit.rs (1)
compiler/core/src/bytecode.rs (1)
  • arg_names (1149-1177)
vm/src/builtins/function.rs (2)
vm/src/frame.rs (3)
  • new (137-171)
  • locals (188-224)
  • locals (2164-2168)
jit/src/lib.rs (1)
  • compile (114-131)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Run tests under miri
  • GitHub Check: Run snippets and cpython tests (windows-latest)
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Check Rust code with rustfmt and clippy
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Run rust tests (windows-latest)
  • GitHub Check: Ensure compilation on various targets
🔇 Additional comments (12)
vm/src/builtins/function.rs (4)

31-31: LGTM!

The change to wrap code in a PyMutex is correctly implemented in both the field declaration and initialization. This enables safe concurrent access to the code object.

Also applies to: 76-76


99-99: LGTM!

The lock is correctly acquired for the entire duration of fill_locals_from_args, which is appropriate since the function needs a consistent view of the code object throughout argument processing.


379-432: LGTM!

The lock-and-clone pattern at line 401 is correct: it acquires the lock, clones the PyRef<PyCode> for the frame, then releases the lock. This ensures the frame has a stable code reference while allowing other threads to access or modify the function's code.


605-615: LGTM!

The code correctly locks the code object during JIT compilation to ensure a consistent view while compiling.

vm/src/builtins/code.rs (8)

5-5: LGTM!

The new imports (PyBytesRef, CodeUnit, OptionalArg, Constructor) are all necessary for the constructor implementation and bytecode parsing functionality.

Also applies to: 9-9, 13-14


370-391: LGTM!

The PyCodeNewArgs struct correctly defines all necessary parameters for constructing a code object. Field types are appropriate for their purposes.


398-453: LGTM!

The string interning for names, varnames, cellvars, and freevars is correctly implemented. The validation that nlocals matches varnames.len() is an important consistency check.


455-488: LGTM!

Bytecode parsing and validation using parse_bytecode is appropriate, with proper error handling. The locations vector creation correctly initializes each instruction's location based on firstlineno.


523-523: LGTM!

Adding Constructor to the pyclass declaration correctly enables the public constructor pathway via py_new.


839-924: LGTM!

The destructuring of ReplaceArgs and the pattern of using OptionalArg to either use the provided value or fall back to the existing value is clean and consistent throughout.


925-935: LGTM!

Re-parsing bytecode when co_code is provided ensures validation of the new bytecode and proper conversion to CodeUnit instructions.


1039-1057: LGTM!

The parse_bytecode helper correctly validates bytecode by checking for even length and validating each opcode via Instruction::try_from. Returning None on invalid input provides clear error handling. The implementation matches the pattern used in the compiler's marshal code.

Comment on lines +458 to 467
#[pygetset(setter)]
fn set___code__(&self, code: PyRef<PyCode>) {
*self.code.lock() = code;
// TODO: jit support
// #[cfg(feature = "jit")]
// {
// // If available, clear cached compiled code.
// let _ = self.jitted_code.take();
// }
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

JIT cache invalidation missing in code setter.

When __code__ is replaced, the cached JIT compilation (jitted_code) must be invalidated, otherwise the function will continue executing the old compiled code. The TODO comment acknowledges this, but it's a correctness issue that should be addressed.

Apply this diff:

     #[pygetset(setter)]
     fn set___code__(&self, code: PyRef<PyCode>) {
         *self.code.lock() = code;
-        // TODO: jit support
-        // #[cfg(feature = "jit")]
-        // {
-        //     // If available, clear cached compiled code.
-        //     let _ = self.jitted_code.take();
-        // }
+        #[cfg(feature = "jit")]
+        {
+            // Clear cached compiled code when code object changes
+            let _ = self.jitted_code.take();
+        }
     }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
#[pygetset(setter)]
fn set___code__(&self, code: PyRef<PyCode>) {
*self.code.lock() = code;
// TODO: jit support
// #[cfg(feature = "jit")]
// {
// // If available, clear cached compiled code.
// let _ = self.jitted_code.take();
// }
}
#[pygetset(setter)]
fn set___code__(&self, code: PyRef<PyCode>) {
*self.code.lock() = code;
#[cfg(feature = "jit")]
{
// Clear cached compiled code when code object changes
let _ = self.jitted_code.take();
}
}
🤖 Prompt for AI Agents
In vm/src/builtins/function.rs around lines 458 to 467, the __code__ setter
updates the code but does not invalidate the JIT cache; add a cfg-gated
invalidation so when setting *self.code.lock() = code the cached compiled code
is cleared under #[cfg(feature = "jit")] by taking/setting self.jitted_code to
None (e.g., self.jitted_code.take()) to ensure the function won't keep running
stale compiled code, keeping the invalidation inside the same setter and
preserving the existing TODO comment removal.

@youknowone
Copy link
Member Author

@ShaharNaveh could you review the changes?


/// Validates and parses bytecode bytes into CodeUnit instructions.
/// Returns None if bytecode is invalid (odd length or contains invalid opcodes).
fn parse_bytecode(bytecode_bytes: &[u8]) -> Option<Box<[CodeUnit]>> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be more appropriate to return

Result<Box<[CodeUnit]>, MarshalError>

Copy link
Member Author

@youknowone youknowone Oct 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the concept marshal already existing in PyCode level?
I thought marshal can serialize/deserialize code but it is not necessarily coupled to PyCode itself.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, marshal is in compiler and PyCode is in vm. so that feels weird but making sense

Copy link
Collaborator

@ShaharNaveh ShaharNaveh Oct 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe marshal is not the appropriate error, I initially thought of marshal because it has MarshalError::InvalidBytecode

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that this function might belong in compiler-core anyways

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed it to marshal error and added a comment that it is not a kind of necessary decision


let mut instructions = Vec::with_capacity(bytecode_bytes.len() / 2);

for chunk in bytecode_bytes.chunks_exact(2) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have this logic in multiple places. I think a future enhancement for having a newtype for

pub struct CodeUnits(Box<[CodeUnit]>);

which implements, TryFrom<&[u8]> and returns a MarshalError.
but again, this is for a future PR.

Copy link
Collaborator

@ShaharNaveh ShaharNaveh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (3)
vm/src/builtins/code.rs (3)

507-507: cell2arg left as None breaks closures capturing arguments.

This issue has already been identified in previous reviews. When creating code via the constructor, closures will not get argument cells populated correctly since Frame.fill_locals_from_args relies on cell2arg. The mapping should be computed by matching each cellvar to its corresponding argument index in varnames.


986-988: Locations can become inconsistent when co_code changes.

The FIXME comment correctly identifies that keeping old locations while changing instruction count breaks invariants. This issue was previously identified and the code should recompute locations to match instructions.len() when the instruction count changes.


1000-1000: cell2arg not recomputed when varnames/cellvars change.

This issue has already been identified in previous reviews. When co_varnames or co_cellvars are provided in replace(), the cell2arg mapping should be recomputed to reflect the new variable-to-cell mappings, otherwise closures will have incorrect argument cells.

🧹 Nitpick comments (1)
vm/src/builtins/code.rs (1)

1036-1047: LGTM! Pragmatic bytecode validation helper.

The helper function correctly validates bytecode length (must be even for 2-byte instructions) and delegates to parse_instructions_from_bytes. The comment acknowledges that using MarshalError is temporary, which is reasonable for this refactoring stage.

For future consideration: You may want to introduce a dedicated error type for bytecode validation in the VM layer to avoid coupling with marshal errors, but the current approach is acceptable.

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f2e19bd and 2166a05.

📒 Files selected for processing (2)
  • compiler/core/src/marshal.rs (2 hunks)
  • vm/src/builtins/code.rs (8 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Format Rust code with the default rustfmt style (run cargo fmt)
Run clippy and fix any warnings or lints introduced by your changes
Follow Rust best practices for error handling and memory management

Files:

  • compiler/core/src/marshal.rs
  • vm/src/builtins/code.rs
{vm,stdlib}/**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use RustPython macros (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • vm/src/builtins/code.rs
🧬 Code graph analysis (2)
compiler/core/src/marshal.rs (1)
compiler/core/src/bytecode.rs (1)
  • try_from (814-820)
vm/src/builtins/code.rs (2)
compiler/core/src/marshal.rs (5)
  • parse_instructions_from_bytes (170-179)
  • len (204-211)
  • len (234-236)
  • len (241-243)
  • len (247-252)
vm/src/vm/context.rs (1)
  • new_code (627-630)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Run rust tests (windows-latest)
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Check Rust code with rustfmt and clippy
  • GitHub Check: Ensure compilation on various targets
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Run tests under miri
  • GitHub Check: Run snippets and cpython tests (windows-latest)
🔇 Additional comments (11)
compiler/core/src/marshal.rs (2)

168-179: LGTM! Well-structured bytecode parsing helper.

The function correctly parses bytecode bytes into CodeUnit instructions with proper error handling. The use of chunks_exact(2) ensures only complete 2-byte instructions are processed, and the Instruction::try_from provides validation.


201-201: LGTM! Clean refactoring.

The refactoring successfully extracts bytecode parsing into a reusable function while maintaining the same error handling behavior.

vm/src/builtins/code.rs (9)

5-22: LGTM! Necessary imports for the new constructor.

The new imports support the Constructor implementation and bytecode parsing functionality. All imports are used appropriately in the code.


373-394: LGTM! Well-structured constructor arguments.

The PyCodeNewArgs struct properly captures all necessary parameters for code object construction with appropriate Rust types matching the Python code object structure.


401-456: LGTM! Thorough validation of constructor arguments.

The constructor properly validates and converts all tuple arguments to interned strings, and correctly validates that nlocals matches varnames.len(). Error handling is appropriate with clear error messages.


460-461: LGTM! Proper bytecode validation.

The constructor correctly validates bytecode before creating the code object, providing clear error messages for invalid bytecode.


523-523: LGTM! Proper exposure of constructor.

The #[pyclass] attribute is correctly updated to include Constructor, enabling the py_new path for code object creation.


556-558: LGTM! Correct derivation of nlocals.

The getter now correctly derives co_nlocals from varnames.len(), which aligns with the validation enforced in both the constructor and replace method.


839-923: LGTM! Clear and maintainable refactoring.

The destructuring and pattern matching for optional arguments makes the replace logic much more readable and maintainable. The validation logic is preserved correctly.


925-932: LGTM! Consistent bytecode validation.

The replace method correctly validates bytecode when co_code is provided, using the same parse_bytecode helper as the constructor for consistency.


950-959: LGTM! Proper nlocals validation in replace.

The validation ensures consistency between co_nlocals and co_varnames length, preventing invalid code object states.

@youknowone youknowone merged commit f22aed2 into RustPython:main Oct 22, 2025
12 checks passed
@youknowone youknowone deleted the code branch October 22, 2025 12:09
@youknowone
Copy link
Member Author

@ShaharNaveh Thank you for your review 😄

@coderabbitai coderabbitai bot mentioned this pull request Nov 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants