Skip to content

Conversation

@arihant2math
Copy link
Collaborator

@arihant2math arihant2math commented Mar 10, 2025

Removes dependency on winreg and tries to make it easier to add new functionality. Also adds tests.

Summary by CodeRabbit

  • New Features

    • Expanded Windows Registry support: create/open/delete/enumerate keys; query, set and save values; flush/load; reflection controls; environment-string expansion; richer handle semantics and Python-friendly value behavior.
  • Refactor

    • Reworked registry implementation for safer handle management, deterministic cleanup and clearer error mapping.
    • Removed a Windows-specific dependency to reduce platform-targeted footprint.

✏️ Tip: You can customize this high-level summary in your review settings.

@arihant2math arihant2math force-pushed the winreg branch 3 times, most recently from dbb8283 to 2631331 Compare March 15, 2025 04:02
@arihant2math arihant2math force-pushed the winreg branch 4 times, most recently from ea7f15a to db9774d Compare March 23, 2025 17:56
@arihant2math arihant2math force-pushed the winreg branch 2 times, most recently from 367f887 to ca2cc19 Compare March 30, 2025 22:10
@arihant2math arihant2math marked this pull request as ready for review March 30, 2025 22:18
@arihant2math
Copy link
Collaborator Author

DeleteKey always returns error code 5. I'm sure this is terrible, but I can't seem to fix it.

@arihant2math arihant2math force-pushed the winreg branch 3 times, most recently from 8a47552 to d927764 Compare April 1, 2025 14:44
@arihant2math arihant2math force-pushed the winreg branch 2 times, most recently from 181ab0b to fbf0d39 Compare April 21, 2025 19:19
This was referenced Apr 23, 2025
@arihant2math arihant2math force-pushed the winreg branch 3 times, most recently from 3713038 to 71afd0a Compare April 28, 2025 00:03
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 4, 2025

Walkthrough

Removed an external Windows-only dependency and implemented an internal winreg stdlib module: a PyHkey type with atomic handle management and Drop-based cleanup, many PyO3-exposed registry APIs (Create/Open/Delete/Query/Set/Enum/Flush/Load/Save/Reflection/Expand), conversions and error mapping to OSError.

Changes

Cohort / File(s) Summary
Dependency removal
crates/vm/Cargo.toml
Removed the Windows-specific dependency winreg = "0.55" from [target.'cfg(windows)'.dependencies].
Winreg stdlib implementation
crates/vm/src/stdlib/winreg.rs
Added a new #[pymodule] winreg reimplementation: introduces pub struct PyHkey (atomic HKEY handle + Drop), many #[pyfunction] bindings (ConnectRegistry, CreateKey/CreateKeyEx, OpenKey/OpenKeyEx, DeleteKey/DeleteKeyEx, EnumValue, EnumKey, FlushKey, LoadKey, SaveKey, SetValue/SetValueEx, QueryValue/QueryValueEx, QueryInfoKey, EnableReflectionKey, DisableReflectionKey, ExpandEnvironmentStrings, etc.), conversion helpers (py2reg, reg_to_py, UTF-16/binary handling), constants/HKEY exposures, error mapping to OSError, numeric/hash/context-manager support, and associated helper types/impls.

Sequence Diagram(s)

sequenceDiagram
    participant PyVM as Python VM
    participant Winreg as winreg module (PyHkey)
    participant WinAPI as Windows API

    Note over PyVM,Winreg: Querying a registry value (QueryValueEx)
    PyVM->>Winreg: QueryValueEx(hkey, name)
    Winreg->>WinAPI: RegQueryValueExW(hkey.handle, name_wide, ...)
    WinAPI-->>Winreg: (type, raw bytes) or ERROR
    Winreg->>Winreg: reg_to_py converts raw -> PyObject (REG_SZ, DWORD, QWORD, MULTI_SZ, BINARY, EXPAND_SZ)
    Winreg-->>PyVM: Return Python object or raise OSError
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Review Drop and AtomicHKEY usage for race conditions and safe closing.
  • Verify correctness of py2reg / reg_to_py conversions across REG_* types and boundary cases.
  • Check Windows error mapping to Python OSError and message fidelity.
  • Validate #[pyfunction] signatures, context-manager behavior, and constant/HKEY initialization.

Poem

🐰 I dug through bytes and widened each string,
I wrapped a handle that humbly would cling,
I tied keys to functions and closed them with care,
Converted, expanded, and handled each pair,
A rabbit, some code, and a registry flair. 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 78.43% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main changes: a rewrite of the winreg module and a version bump to 3.13.2, which aligns with the significant refactoring shown in the raw summary.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 11

🧹 Nitpick comments (2)
crates/vm/src/stdlib/winreg.rs (2)

175-184: __exit__ duplicates Close logic—consider reusing.

This has the same two-lock issue as Close and duplicates the error message. Consider calling the existing Close method.

     #[pymethod(magic)]
     fn exit(zelf: PyRef<Self>, _args: FuncArgs, vm: &VirtualMachine) -> PyResult<()> {
-        let res = unsafe { Registry::RegCloseKey(*zelf.hkey.write()) };
-        *zelf.hkey.write() = std::ptr::null_mut();
-        if res == 0 {
-            Ok(())
-        } else {
-            Err(vm.new_os_error("msg TODO".to_string()))
-        }
+        zelf.Close(vm)
     }

160-168: Remove or implement commented-out code.

Several blocks of commented-out code exist (AsHKEY, EnumKey). Consider either implementing these functions or removing the dead code to improve maintainability. Note that the commented EnumKey has a syntax error (std::mem::sizeof should be std::mem::size_of::<T>()).

Also applies to: 396-418

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2b90e82 and 1a17fab.

⛔ Files ignored due to path filters (2)
  • Cargo.lock is excluded by !**/*.lock
  • Lib/test/test_winreg.py is excluded by !Lib/**
📒 Files selected for processing (2)
  • crates/vm/Cargo.toml (0 hunks)
  • crates/vm/src/stdlib/winreg.rs (2 hunks)
💤 Files with no reviewable changes (1)
  • crates/vm/Cargo.toml
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • crates/vm/src/stdlib/winreg.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Run tests under miri
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Check Rust code with clippy
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Run snippets and cpython tests (windows-latest)
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Ensure compilation on various targets
  • GitHub Check: Run rust tests (windows-latest)
🔇 Additional comments (1)
crates/vm/src/stdlib/winreg.rs (1)

115-117: Unsafe Send/Sync implementations require careful justification.

The TODO: Fix comment suggests this is a known concern. Windows registry handles (HKEY) are not inherently thread-safe, and the PyRwLock only synchronizes access to the pointer value, not the underlying Windows operations on the handle. Consider documenting the safety invariants or implementing proper synchronization at the API call level.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (10)
crates/vm/src/stdlib/winreg.rs (10)

141-150: Duplicate: Double write lock acquisition and placeholder error message.

This issue was already flagged in a previous review. The two separate hkey.write() calls create a race window, and the error message is still a placeholder.


152-158: Duplicate: Detach does not prevent Drop from closing the handle.

This issue was already flagged in a previous review.


378-394: Duplicate: Incorrect parameter order in RegDeleteKeyExW call.

This issue was already flagged in a previous review. The reserved and access parameters are swapped.


460-505: Duplicate: Registry type is discarded, causing redundant API call.

This issue was already flagged in a previous review.


572-591: Duplicate: Null pointer passed as output parameter causes undefined behavior.

This issue was already flagged in a previous review.


629-649: Duplicate: QueryValue queries size but never retrieves or returns the actual value.

This issue was already flagged in a previous review.


673-689: Duplicate: Buffer length not set, causing empty slice.

This issue was already flagged in a previous review. The retBuf variable name also violates Rust naming conventions (ret_buf).


729-745: Duplicate: Encoding mismatch - UTF-8 string passed to UTF-16 API.

This issue was already flagged in a previous review.


899-899: Duplicate: Original error from py2reg is discarded.

This issue was already flagged in a previous review.


914-931: Duplicate: Non-null-terminated string passed to ANSI API, should use Wide variant.

This issue was already flagged in a previous review.

🧹 Nitpick comments (4)
crates/vm/src/stdlib/winreg.rs (4)

115-117: Document the safety invariants for Send and Sync implementations.

The unsafe impl Send and unsafe impl Sync require documenting why they are safe. While Windows registry handles can technically be used across threads, the current implementation has race conditions (e.g., Close acquires the write lock twice). Consider adding a safety comment explaining the invariants:

-    // TODO: Fix
-    unsafe impl Send for PyHKEYObject {}
-    unsafe impl Sync for PyHKEYObject {}
+    // SAFETY: Registry handles (HKEY) can be used from any thread.
+    // Thread-safety is ensured by the PyRwLock wrapper which synchronizes access.
+    unsafe impl Send for PyHKEYObject {}
+    unsafe impl Sync for PyHKEYObject {}

175-184: __exit__ duplicates Close logic with the same issues.

The __exit__ method has the same double write lock acquisition problem and placeholder error message as Close. Consider reusing Close:

     #[pymethod]
     fn __exit__(zelf: PyRef<Self>, _args: FuncArgs, vm: &VirtualMachine) -> PyResult<()> {
-        let res = unsafe { Registry::RegCloseKey(*zelf.hkey.write()) };
-        *zelf.hkey.write() = std::ptr::null_mut();
-        if res == 0 {
-            Ok(())
-        } else {
-            Err(vm.new_os_error("msg TODO".to_string()))
-        }
+        zelf.Close(vm)
     }

262-295: Consider reducing code duplication in ConnectRegistry.

The two branches differ only in the first parameter. This can be simplified:

 fn ConnectRegistry(
     computer_name: Option<String>,
     key: PyRef<PyHKEYObject>,
     vm: &VirtualMachine,
 ) -> PyResult<PyHKEYObject> {
+    let wide_computer_name = computer_name.map(|n| to_utf16(n));
     let mut ret_key = std::ptr::null_mut();
-    if let Some(computer_name) = computer_name {
-        let wide_computer_name = to_utf16(computer_name);
-        let res = unsafe {
-            Registry::RegConnectRegistryW(
-                wide_computer_name.as_ptr(),
-                *key.hkey.read(),
-                &mut ret_key,
-            )
-        };
-        // ...
-    } else {
-        let res = unsafe {
-            Registry::RegConnectRegistryW(std::ptr::null_mut(), *key.hkey.read(), &mut ret_key)
-        };
-        // ...
-    }
+    let res = unsafe {
+        Registry::RegConnectRegistryW(
+            wide_computer_name.as_ref().map_or(std::ptr::null(), |v| v.as_ptr()),
+            *key.hkey.read(),
+            &mut ret_key,
+        )
+    };
+    if res == 0 {
+        Ok(PyHKEYObject::new(ret_key))
+    } else {
+        Err(vm.new_os_error(format!("error code: {res}")))
+    }
 }

396-418: Consider removing or completing the commented-out EnumKey function.

The commented-out EnumKey implementation has a syntax error (std::mem::sizeof should be std::mem::size_of) and is incomplete. Either complete it or remove the commented code to reduce noise:

-    // #[pyfunction]
-    // fn EnumKey(key: PyRef<PyHKEYObject>, index: i32, vm: &VirtualMachine) -> PyResult<String> {
-    //     let mut tmpbuf = [0u16; 257];
-    //     let mut len = std::mem::sizeof(tmpbuf.len())/std::mem::sizeof(tmpbuf[0]);
     ...
-    // }
📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1a17fab and 4e4c61a.

⛔ Files ignored due to path filters (2)
  • Cargo.lock is excluded by !**/*.lock
  • Lib/test/test_winreg.py is excluded by !Lib/**
📒 Files selected for processing (2)
  • crates/vm/Cargo.toml (0 hunks)
  • crates/vm/src/stdlib/winreg.rs (2 hunks)
💤 Files with no reviewable changes (1)
  • crates/vm/Cargo.toml
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • crates/vm/src/stdlib/winreg.rs
🧠 Learnings (1)
📚 Learning: 2025-06-27T14:47:28.810Z
Learnt from: moreal
Repo: RustPython/RustPython PR: 5847
File: vm/src/stdlib/stat.rs:547-567
Timestamp: 2025-06-27T14:47:28.810Z
Learning: In RustPython's stat module implementation, platform-specific constants like SF_SUPPORTED and SF_SYNTHETIC should be conditionally declared only for the platforms where they're available (e.g., macOS), following CPython's approach of optional declaration using #ifdef checks rather than providing fallback values for other platforms.

Applied to files:

  • crates/vm/src/stdlib/winreg.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Run tests under miri
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Run snippets and cpython tests (windows-latest)
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Run rust tests (windows-latest)
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Check Rust code with clippy
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Ensure compilation on various targets
🔇 Additional comments (2)
crates/vm/src/stdlib/winreg.rs (2)

29-31: LGTM!

The to_utf16 helper function correctly converts an OsStr to a null-terminated UTF-16 vector, which is appropriate for Windows API calls.


593-627: LGTM with minor style note.

The implementation correctly retrieves registry key info and returns a tuple. The 0 as _ on line 604 works but std::ptr::null_mut() would be more idiomatic for the reserved parameter.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (3)
crates/vm/src/stdlib/winreg.rs (3)

113-146: Avoid closing predefined HKEY handles and make Close a no‑op for them

PyHKEYObject always calls RegCloseKey in Drop for any non‑null handle, and the predefined keys (HKEY_CLASSES_ROOT, HKEY_CURRENT_USER, etc.) are created via PyHKEYObject::new(...) with those special pseudo‑handles. On Windows, calling RegCloseKey on predefined keys is documented as unsupported/undefined, and CPython explicitly treats them as non‑owned handles and makes CloseKey a no‑op for them. This can lead to undefined behavior or odd failures if the VM ever drops these objects.

Suggest tracking ownership on the wrapper and skipping close for predefined keys:

-    #[pyattr]
-    #[pyclass(name)]
-    #[derive(Debug, PyPayload)]
-    pub struct PyHKEYObject {
-        hkey: AtomicHKEY,
-    }
+    #[pyattr]
+    #[pyclass(name)]
+    #[derive(Debug, PyPayload)]
+    pub struct PyHKEYObject {
+        hkey: AtomicHKEY,
+        // true if this object owns the underlying handle and must close it
+        owns_handle: bool,
+    }

@@
-    impl PyHKEYObject {
-        pub fn new(hkey: *mut std::ffi::c_void) -> Self {
-            Self {
-                hkey: AtomicHKEY::new(hkey),
-            }
-        }
+    impl PyHKEYObject {
+        pub fn new(hkey: *mut std::ffi::c_void) -> Self {
+            Self {
+                hkey: AtomicHKEY::new(hkey),
+                owns_handle: true,
+            }
+        }
+
+        pub fn new_predefined(hkey: *mut std::ffi::c_void) -> Self {
+            Self {
+                hkey: AtomicHKEY::new(hkey),
+                owns_handle: false,
+            }
+        }
@@
-    #[pyattr(once)]
-    fn HKEY_CLASSES_ROOT(vm: &VirtualMachine) -> PyRef<PyHKEYObject> {
-        PyHKEYObject::new(Registry::HKEY_CLASSES_ROOT).into_ref(&vm.ctx)
-    }
+    #[pyattr(once)]
+    fn HKEY_CLASSES_ROOT(vm: &VirtualMachine) -> PyRef<PyHKEYObject> {
+        PyHKEYObject::new_predefined(Registry::HKEY_CLASSES_ROOT).into_ref(&vm.ctx)
+    }
@@
-    #[pyattr(once)]
-    fn HKEY_CURRENT_USER(vm: &VirtualMachine) -> PyRef<PyHKEYObject> {
-        PyHKEYObject::new(Registry::HKEY_CURRENT_USER).into_ref(&vm.ctx)
-    }
+    #[pyattr(once)]
+    fn HKEY_CURRENT_USER(vm: &VirtualMachine) -> PyRef<PyHKEYObject> {
+        PyHKEYObject::new_predefined(Registry::HKEY_CURRENT_USER).into_ref(&vm.ctx)
+    }
@@
-    #[pyattr(once)]
-    fn HKEY_LOCAL_MACHINE(vm: &VirtualMachine) -> PyRef<PyHKEYObject> {
-        PyHKEYObject::new(Registry::HKEY_LOCAL_MACHINE).into_ref(&vm.ctx)
-    }
+    #[pyattr(once)]
+    fn HKEY_LOCAL_MACHINE(vm: &VirtualMachine) -> PyRef<PyHKEYObject> {
+        PyHKEYObject::new_predefined(Registry::HKEY_LOCAL_MACHINE).into_ref(&vm.ctx)
+    }
@@
-    #[pyattr(once)]
-    fn HKEY_USERS(vm: &VirtualMachine) -> PyRef<PyHKEYObject> {
-        PyHKEYObject::new(Registry::HKEY_USERS).into_ref(&vm.ctx)
-    }
+    #[pyattr(once)]
+    fn HKEY_USERS(vm: &VirtualMachine) -> PyRef<PyHKEYObject> {
+        PyHKEYObject::new_predefined(Registry::HKEY_USERS).into_ref(&vm.ctx)
+    }
@@
-    #[pyattr(once)]
-    fn HKEY_PERFORMANCE_DATA(vm: &VirtualMachine) -> PyRef<PyHKEYObject> {
-        PyHKEYObject::new(Registry::HKEY_PERFORMANCE_DATA).into_ref(&vm.ctx)
-    }
+    #[pyattr(once)]
+    fn HKEY_PERFORMANCE_DATA(vm: &VirtualMachine) -> PyRef<PyHKEYObject> {
+        PyHKEYObject::new_predefined(Registry::HKEY_PERFORMANCE_DATA).into_ref(&vm.ctx)
+    }
@@
-    #[pyattr(once)]
-    fn HKEY_CURRENT_CONFIG(vm: &VirtualMachine) -> PyRef<PyHKEYObject> {
-        PyHKEYObject::new(Registry::HKEY_CURRENT_CONFIG).into_ref(&vm.ctx)
-    }
+    #[pyattr(once)]
+    fn HKEY_CURRENT_CONFIG(vm: &VirtualMachine) -> PyRef<PyHKEYObject> {
+        PyHKEYObject::new_predefined(Registry::HKEY_CURRENT_CONFIG).into_ref(&vm.ctx)
+    }
@@
-    #[pyattr(once)]
-    fn HKEY_DYN_DATA(vm: &VirtualMachine) -> PyRef<PyHKEYObject> {
-        PyHKEYObject::new(Registry::HKEY_DYN_DATA).into_ref(&vm.ctx)
-    }
+    #[pyattr(once)]
+    fn HKEY_DYN_DATA(vm: &VirtualMachine) -> PyRef<PyHKEYObject> {
+        PyHKEYObject::new_predefined(Registry::HKEY_DYN_DATA).into_ref(&vm.ctx)
+    }
@@
     #[pymethod]
     fn Close(&self, vm: &VirtualMachine) -> PyResult<()> {
-        // Atomically swap the handle with null and get the old value
-        let old_hkey = self.hkey.swap(std::ptr::null_mut());
+        // Predefined HKEYs are not owned and must not be closed – make Close a no‑op.
+        if !self.owns_handle {
+            return Ok(());
+        }
+        // Atomically swap the handle with null and get the old value
+        let old_hkey = self.hkey.swap(std::ptr::null_mut());
@@
     impl Drop for PyHKEYObject {
         fn drop(&mut self) {
             let hkey = self.hkey.swap(std::ptr::null_mut());
-            if !hkey.is_null() {
+            if !hkey.is_null() && self.owns_handle {
                 unsafe { Registry::RegCloseKey(hkey) };
             }
         }
     }

This keeps ownership semantics correct for opened/created keys, avoids closing pseudo‑handles, and keeps Detach behavior unchanged (it already nulls the handle).

Also applies to: 148-223


671-760: Fix unsafe &[u8] → &[u16] casts for UTF‑16 strings (potential UB)

Both QueryValue and reg_to_py reinterpret a Vec<u8> buffer as &[u16] via from_raw_parts(buffer.as_ptr() as *const u16, ...). The original allocation is for u8, whose alignment is 1; creating a &[u16] from that pointer violates Rust’s alignment requirements and is undefined behavior even if it “works” on x86.

You can safely reconstruct UTF‑16 code units with chunks_exact(2) and u16::from_le_bytes (Win32 registry data is little‑endian):

@@ fn QueryValue(key: HKEYArg, sub_key: Option<String>, vm: &VirtualMachine) -> PyResult<String> {
-            // Convert UTF-16 to String
-            let u16_slice = unsafe {
-                std::slice::from_raw_parts(buffer.as_ptr() as *const u16, size as usize / 2)
-            };
-            let len = u16_slice
-                .iter()
-                .position(|&c| c == 0)
-                .unwrap_or(u16_slice.len());
-            break String::from_utf16(&u16_slice[..len])
+            // Convert UTF‑16 (little‑endian) bytes to u16 safely
+            let u16_vec: Vec<u16> = buffer[..size as usize]
+                .chunks_exact(2)
+                .map(|chunk| u16::from_le_bytes([chunk[0], chunk[1]]))
+                .collect();
+            let len = u16_vec
+                .iter()
+                .position(|&c| c == 0)
+                .unwrap_or(u16_vec.len());
+            break String::from_utf16(&u16_vec[..len])
                 .map_err(|e| vm.new_value_error(format!("UTF16 error: {e}")));
@@ fn reg_to_py(vm: &VirtualMachine, ret_data: &[u8], typ: u32) -> PyResult {
-            REG_SZ | REG_EXPAND_SZ => {
-                // Treat the data as a UTF-16 string.
-                let u16_count = ret_data.len() / 2;
-                let u16_slice = unsafe {
-                    std::slice::from_raw_parts(ret_data.as_ptr() as *const u16, u16_count)
-                };
-                // Only use characters up to the first NUL.
-                let len = u16_slice
-                    .iter()
-                    .position(|&c| c == 0)
-                    .unwrap_or(u16_slice.len());
-                let s = String::from_utf16(&u16_slice[..len])
+            REG_SZ | REG_EXPAND_SZ => {
+                // Treat the data as a UTF-16 (little-endian) string.
+                let u16_vec: Vec<u16> = ret_data
+                    .chunks_exact(2)
+                    .map(|chunk| u16::from_le_bytes([chunk[0], chunk[1]]))
+                    .collect();
+                // Only use characters up to the first NUL.
+                let len = u16_vec
+                    .iter()
+                    .position(|&c| c == 0)
+                    .unwrap_or(u16_vec.len());
+                let s = String::from_utf16(&u16_vec[..len])
@@
-            REG_MULTI_SZ => {
-                if ret_data.is_empty() {
-                    Ok(vm.ctx.new_list(vec![]).into())
-                } else {
-                    let u16_count = ret_data.len() / 2;
-                    let u16_slice = unsafe {
-                        std::slice::from_raw_parts(ret_data.as_ptr() as *const u16, u16_count)
-                    };
-
-                    // Remove trailing null if present (like CPython's countStrings)
-                    let len = if u16_count > 0 && u16_slice[u16_count - 1] == 0 {
-                        u16_count - 1
-                    } else {
-                        u16_count
-                    };
-
-                    let mut strings: Vec<PyObjectRef> = Vec::new();
-                    let mut start = 0;
-                    for i in 0..len {
-                        if u16_slice[i] == 0 {
-                            let s = String::from_utf16(&u16_slice[start..i])
+            REG_MULTI_SZ => {
+                if ret_data.is_empty() {
+                    Ok(vm.ctx.new_list(vec![]).into())
+                } else {
+                    let u16_vec: Vec<u16> = ret_data
+                        .chunks_exact(2)
+                        .map(|chunk| u16::from_le_bytes([chunk[0], chunk[1]]))
+                        .collect();
+
+                    // Remove trailing null if present (like CPython's countStrings)
+                    let len = if !u16_vec.is_empty() && u16_vec[u16_vec.len() - 1] == 0 {
+                        u16_vec.len() - 1
+                    } else {
+                        u16_vec.len()
+                    };
+
+                    let mut strings: Vec<PyObjectRef> = Vec::new();
+                    let mut start = 0;
+                    for i in 0..len {
+                        if u16_vec[i] == 0 {
+                            let s = String::from_utf16(&u16_vec[start..i])
                                 .map_err(|e| vm.new_value_error(format!("UTF16 error: {e}")))?;
                             strings.push(vm.ctx.new_str(s).into());
                             start = i + 1;
                         }
                     }
                     // Handle last string if not null-terminated
                     if start < len {
-                        let s = String::from_utf16(&u16_slice[start..len])
+                        let s = String::from_utf16(&u16_vec[start..len])
                             .map_err(|e| vm.new_value_error(format!("UTF16 error: {e}")))?;
                         strings.push(vm.ctx.new_str(s).into());
                     }

This removes UB while preserving behavior for all supported string types.

Also applies to: 919-987


770-832: Honor the actual size returned by RegQueryValueExW before passing data to reg_to_py

QueryValueEx resizes the buffer on ERROR_MORE_DATA, but after the final successful call it ignores ret_size and passes the entire ret_buf slice to reg_to_py. For REG_BINARY or other non‑string types this can expose extra trailing zero bytes that are not part of the value, diverging from CPython’s behavior.

You already track ret_size inside the loop; make it persist and truncate the buffer before conversion:

-        let mut ret_buf = vec![0u8; buf_size as usize];
-        let mut typ = 0;
-
-        // Loop to handle ERROR_MORE_DATA
-        loop {
-            let mut ret_size = buf_size;
+        let mut ret_buf = vec![0u8; buf_size as usize];
+        let mut typ = 0;
+        let mut ret_size: u32 = buf_size;
+
+        // Loop to handle ERROR_MORE_DATA
+        loop {
+            ret_size = buf_size;
             let res = unsafe {
                 Registry::RegQueryValueExW(
                     hkey,
                     wide_name.as_ptr(),
                     std::ptr::null_mut(),
                     &mut typ,
                     ret_buf.as_mut_ptr(),
                     &mut ret_size,
                 )
             };
@@
-        }
-
-        let obj = reg_to_py(vm, ret_buf.as_slice(), typ)?;
+        }
+
+        // Only pass the bytes actually returned by the API.
+        ret_buf.truncate(ret_size as usize);
+        let obj = reg_to_py(vm, ret_buf.as_slice(), typ)?;

This keeps integer and string cases working as today, while making binary/unknown types match the real registry value length.

🧹 Nitpick comments (2)
crates/vm/src/stdlib/winreg.rs (2)

392-401: DeleteKey behavior (ERROR 5) and more precise error mapping

The RegDeleteKeyW call itself looks correct; ERROR_ACCESS_DENIED (5) is what Windows returns when the parent key lacks delete rights (e.g. opened with KEY_READ instead of KEY_WRITE/KEY_ALL_ACCESS). That likely explains the “always returns 5” behavior you mentioned: the tests probably open the parent with read‑only access.

Two improvements would help here:

  1. Make the error more informative (and consistent with other helpers) by using os_error_from_windows_code:
-        if res == 0 {
-            Ok(())
-        } else {
-            Err(vm.new_os_error(format!("error code: {res}")))
-        }
+        if res == 0 {
+            Ok(())
+        } else {
+            Err(os_error_from_windows_code(vm, res as i32, "RegDeleteKey"))
+        }
  1. Consider documenting (or asserting in tests) that the key argument must be opened with appropriate delete/write access (mirroring CPython’s DeleteKey docs), rather than treating ERROR 5 here as an implementation bug.

1170-1186: Handle long expansions and returned size in ExpandEnvironmentStrings

ExpandEnvironmentStrings currently uses a fixed 1024‑wide buffer and then scans for the first NUL, ignoring the value returned in r. If the expansion is longer than 1023 characters, the API will indicate the required size in r, but you never reallocate and retry, so large expansions will be truncated or fail unexpectedly.

You can honor the returned size and re‑call once if needed:

     fn ExpandEnvironmentStrings(i: String, vm: &VirtualMachine) -> PyResult<String> {
         let wide_input = i.to_wide_with_nul();
-        let mut out = vec![0u16; 1024];
-        let r = unsafe {
+        let mut out: Vec<u16> = vec![0u16; 1024];
+        let mut r = unsafe {
             windows_sys::Win32::System::Environment::ExpandEnvironmentStringsW(
                 wide_input.as_ptr(),
                 out.as_mut_ptr(),
                 out.len() as u32,
             )
         };
         if r == 0 {
             return Err(vm.new_os_error("ExpandEnvironmentStringsW failed".to_string()));
         }
-        let len = out.iter().position(|&c| c == 0).unwrap_or(out.len());
-        String::from_utf16(&out[..len]).map_err(|e| vm.new_value_error(format!("UTF16 error: {e}")))
+        // If the buffer was too small, r is the required size including the null terminator.
+        if (r as usize) > out.len() {
+            out.resize(r as usize, 0);
+            r = unsafe {
+                windows_sys::Win32::System::Environment::ExpandEnvironmentStringsW(
+                    wide_input.as_ptr(),
+                    out.as_mut_ptr(),
+                    out.len() as u32,
+                )
+            };
+            if r == 0 {
+                return Err(vm.new_os_error("ExpandEnvironmentStringsW failed".to_string()));
+            }
+        }
+        // r includes the terminating NUL.
+        let len = (r as usize).saturating_sub(1);
+        String::from_utf16(&out[..len])
+            .map_err(|e| vm.new_value_error(format!("UTF16 error: {e}")))
     }

This matches the documented contract of ExpandEnvironmentStringsW and avoids silent truncation for long environment expansions.

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4e4c61a and 7c8c043.

⛔ Files ignored due to path filters (2)
  • Cargo.lock is excluded by !**/*.lock
  • Lib/test/test_winreg.py is excluded by !Lib/**
📒 Files selected for processing (2)
  • crates/vm/Cargo.toml (0 hunks)
  • crates/vm/src/stdlib/winreg.rs (2 hunks)
💤 Files with no reviewable changes (1)
  • crates/vm/Cargo.toml
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • crates/vm/src/stdlib/winreg.rs
🧠 Learnings (1)
📚 Learning: 2025-06-27T14:47:28.810Z
Learnt from: moreal
Repo: RustPython/RustPython PR: 5847
File: vm/src/stdlib/stat.rs:547-567
Timestamp: 2025-06-27T14:47:28.810Z
Learning: In RustPython's stat module implementation, platform-specific constants like SF_SUPPORTED and SF_SYNTHETIC should be conditionally declared only for the platforms where they're available (e.g., macOS), following CPython's approach of optional declaration using #ifdef checks rather than providing fallback values for other platforms.

Applied to files:

  • crates/vm/src/stdlib/winreg.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Run tests under miri
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Run rust tests (windows-latest)
  • GitHub Check: Ensure compilation on various targets
  • GitHub Check: Check Rust code with clippy
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Run snippets and cpython tests (windows-latest)

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
crates/vm/src/stdlib/winreg.rs (3)

31-43: Consider making UTF‑16 conversion resilient to misalignment

bytes_as_wide_slice uses align_to::<u16>() and then debug‑asserts that the prefix/suffix are empty. This is usually fine for registry buffers, but it relies on the backing allocation being 2‑byte aligned in practice rather than by Rust’s type system. If that assumption ever fails, you’ll silently drop leading/trailing bytes in release builds.

To make this more robust while keeping zero‑copy in the common case, you could:

  • Check if !prefix.is_empty() || !suffix.is_empty() and, in that case, fall back to allocating a Vec<u16> by chunking the bytes manually; keep the current path when both are empty.
  • Alternatively, skip align_to entirely in reg_to_py and always construct Vec<u16> via chunks_exact(2); it’s slightly more allocation but clearly correct and avoids relying on allocator alignment.

The current implementation is safe (no UB) but a bit brittle around that assumption.

Also applies to: 933-996


1097-1145: SetValueEx correctly wires py2reg output into RegSetValueExW

  • Distinguishes between Some(v) and None from py2reg to either:
    • Set the given data blob, or
    • Write a zero‑length value (for types where that is meaningful).
  • Uses value_name.to_wide_with_nul() for the value name and passes the exact len from py2reg.
  • Propagates any error from py2reg directly.

One small stylistic improvement: you could avoid duplicating the RegSetValueExW call by factoring the common parts and just varying (ptr, len), but that’s purely optional.


1148-1177: Reflection key helpers are thin and correct

  • DisableReflectionKey, EnableReflectionKey, and QueryReflectionKey wrap the respective APIs and map non‑zero results to OSError.
  • QueryReflectionKey interprets a non‑zero BOOL as true.

Behaviorally this matches the Win32 and CPython semantics for key reflection. (learn.microsoft.com)

On very old or non‑server Windows versions, these APIs may return ERROR_CALL_NOT_IMPLEMENTED; if you want perfect CPython parity you could consider turning that into NotImplementedError, but that’s an optional polish.

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7c8c043 and 83c5146.

⛔ Files ignored due to path filters (2)
  • Lib/test/test_importlib/resources/test_files.py is excluded by !Lib/**
  • Lib/test/test_winreg.py is excluded by !Lib/**
📒 Files selected for processing (1)
  • crates/vm/src/stdlib/winreg.rs (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • crates/vm/src/stdlib/winreg.rs
🧠 Learnings (1)
📚 Learning: 2025-06-27T14:47:28.810Z
Learnt from: moreal
Repo: RustPython/RustPython PR: 5847
File: vm/src/stdlib/stat.rs:547-567
Timestamp: 2025-06-27T14:47:28.810Z
Learning: In RustPython's stat module implementation, platform-specific constants like SF_SUPPORTED and SF_SYNTHETIC should be conditionally declared only for the platforms where they're available (e.g., macOS), following CPython's approach of optional declaration using #ifdef checks rather than providing fallback values for other platforms.

Applied to files:

  • crates/vm/src/stdlib/winreg.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Run rust tests (windows-latest)
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Ensure compilation on various targets
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Run snippets and cpython tests (windows-latest)
  • GitHub Check: Check Rust code with clippy
  • GitHub Check: Run tests under miri
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Run snippets and cpython tests (macos-latest)
🔇 Additional comments (19)
crates/vm/src/stdlib/winreg.rs (19)

6-8: Module wrapper delegation is correct

make_module simply forwards to the inner winreg::make_module(vm), which matches the usual pattern for #[pymodule]-based stdlib modules in this codebase. No changes needed.


45-59: Error helper and QueryValue behavior look consistent with CPython

  • os_error_from_windows_code correctly maps common Win32 codes (FILE_NOT_FOUND, ACCESS_DENIED) to more specific Python exception types and otherwise falls back to OSError, with a [WinError {code}]-style message. This aligns well with how CPython’s winreg surfaces OS errors. (learn.microsoft.com)
  • QueryValue:
    • Rejects HKEY_PERFORMANCE_DATA up front with an invalid-handle error, mirroring CPython’s special handling. (documentation.help)
    • Opens a child key when sub_key is non-empty, reads the unnamed (default) value via RegQueryValueExW, and correctly loops on ERROR_MORE_DATA by resizing the buffer and retrying.
    • Returns an empty string when RegQueryValueExW reports ERROR_FILE_NOT_FOUND, matching CPython’s documented behavior for a missing default value. (documentation.help)
    • Ensures the type is REG_SZ and then converts the UTF‑16 buffer to String, trimming at the first NUL.

Overall, this helper + QueryValue implementation looks good and behaviourally compatible with CPython.

Also applies to: 685-772


61-74: HKEYArg conversion is straightforward and useful

Allowing both PyHKEYObject and raw integer handles via TryFromObject is a nice compatibility affordance with CPython’s winreg, and the implementation is clear and side‑effect‑free.

No issues from a correctness or ergonomics standpoint.


174-190: PyHKEYObject lifecycle and context‑manager behavior look good

  • handle, __bool__, and __int__ simply expose the underlying handle and null-ness; they’re cheap and match CPython expectations for truthiness and integer conversion of HKEYs. (documentation.help)
  • __enter__/__exit__ delegate to Close, giving idiomatic context‑manager semantics.
  • The use of AtomicCell plus swap in Close/Detach/Drop correctly ensures:
    • Handles are closed exactly once.
    • Explicit Close and Detach prevent Drop from double‑closing by nulling out the stored handle before Drop runs.

Once the HKEY type/null sentinel issues from the previous comment are addressed, the lifecycle logic itself is sound.

Also applies to: 219-227, 230-237


268-307: Numeric protocol integration for PyHKEYObject is consistent

AsNumber is wired so that:

  • All arithmetic/bitwise operations fail with a uniform TypeError (“bad operand type”), which matches the non‑numeric nature of handles.
  • boolean delegates to __bool__ and int delegates to __int__, which is exactly what Python code using the numeric protocol would expect from bool(hkey) / int(hkey).

This is a clean and minimal numeric interface for a handle type.


309-341: ConnectRegistry, CreateKey, and CloseKey wrappers look correct

  • ConnectRegistry:

    • Handles both computer_name=None (local) and Some(String) (remote).
    • Uses to_wide_with_nul() for the remote machine name and passes NULL when None, per Win32 docs. (learn.microsoft.com)
    • Returns a fresh PyHKEYObject on success and maps non‑zero status to OSError.
  • CreateKey:

    • Calls RegCreateKeyW with the parent handle and UTF‑16 subkey name and wraps the returned handle.
    • Aligns with CPython’s winreg.CreateKey behavior. (docs.python.org)
  • CloseKey simply delegates to PyHKEYObject.Close, which centralizes close logic.

No functional issues here once the underlying HKEY representation is fixed as noted earlier.

Also applies to: 343-359, 401-405


406-415: Delete operations are wired correctly; ERROR_ACCESS_DENIED is usually expected for non‑empty keys

  • DeleteKey wraps RegDeleteKeyW with a UTF‑16 subkey name. Per Win32 docs, this function fails with ERROR_ACCESS_DENIED (5) if the subkey has subkeys; it cannot delete a key tree. (learn.microsoft.com)
  • DeleteKeyEx correctly calls RegDeleteKeyExW with (hKey, lpSubKey, samDesired, Reserved) now that the access/reserved order is fixed.
  • DeleteValue passes NULL for the value name when value=None, which the API treats as the unnamed/default value. (learn.microsoft.com)

If you are consistently seeing error code 5 from DeleteKey, double‑check that:

  • The target subkey has no subkeys.
  • You are pointing at the correct WOW64 view (for DeleteKeyEx) via access.
  • You’re not trying to delete HKEY_PERFORMANCE_DATA or other special views.

From the Rust side, these wrappers look correct; the remaining 5s are likely due to the documented Win32 constraints rather than an FFI bug.

Also applies to: 417-431, 433-461


463-490: EnumKey implementation matches Win32 and CPython expectations

  • Uses a 257‑character UTF‑16 buffer per the documented “255 chars + NUL, but 256 + no NUL is observed” behavior.
  • Calls RegEnumKeyExW and returns OSError on non‑zero status, matching CPython’s approach to index‑out‑of‑range and other errors. (documentation.help)
  • Converts only the returned character count to UTF‑16 string, with proper error mapping for invalid data.

No issues spotted here.


492-581: EnumValue logic and buffer‑growth handling look solid

  • Uses RegQueryInfoKeyW to determine maximum value name and data lengths, then allocates buffers with room for terminators.
  • Properly handles ERROR_MORE_DATA from RegEnumValueW by doubling the buffers and retrying, updating the “current size” inputs each iteration.
  • Uses the actual current_data_size when slicing the data buffer and then delegates to reg_to_py for type‑aware conversion.
  • Returns the expected (name, data, type) tuple.

Once the underlying HKEY type/“null handle” issues are corrected as noted earlier, this function is well‑structured and close to CPython’s algorithm.


585-592: FlushKey, LoadKey, and SaveKey are thin but correct wrappers

Each of these functions:

  • Calls the corresponding wide Win32 API (RegFlushKey, RegLoadKeyW, RegSaveKeyW) with the raw handle and UTF‑16 file/key names.
  • Treats any non‑zero return as an OSError with the raw code.

This is appropriate for RustPython; any additional privilege/SE_BACKUP/SE_RESTORE checks would normally be the caller’s responsibility in Python space.

Also applies to: 595-610, 848-859


612-647: OpenKey/OpenKeyEx look correct; good use of helper error mapping

  • OpenKeyArgs mirrors CPython’s signature (key, sub_key, reserved, access), with KEY_READ as the default access mask. (learn.microsoft.com)
  • OpenKey performs a single RegOpenKeyExW call with a UTF‑16 subkey name and wraps the returned HKEY in PyHKEYObject.
  • Errors are passed through os_error_from_windows_code, ensuring appropriate subclasses (e.g. FileNotFoundError, PermissionError) are raised.

One small consistency tweak: now that PyHKEYObject::new exists, you could use Ok(PyHKEYObject::new(res)) here for symmetry with other constructors.


649-683: QueryInfoKey implementation is straightforward and matches docs

  • Calls RegQueryInfoKeyW to obtain the subkey count, value count, and last write time.
  • Combines FILETIME.dwHighDateTime and dwLowDateTime into a single 64‑bit timestamp and returns (subkey_count, value_count, last_write_time).

This matches CPython’s winreg.QueryInfoKey contract. (documentation.help)


873-931: SetValue key handling and UTF‑16 encoding look correct

Beyond the small new_type_error issue:

  • Rejects HKEY_PERFORMANCE_DATA up front with ERROR_INVALID_HANDLE, which matches CPython’s special‑casing. (documentation.help)
  • Creates the subkey with RegCreateKeyExW (when sub_key is non‑empty) using KEY_SET_VALUE, then sets the unnamed value with RegSetValueExW + UTF‑16LE encoded string including terminator.
  • Closes the child key if it was created.

The use of to_wide_with_nul and (wide_value.len() * 2) byte count is correct for registry REG_SZ values.


933-962: reg_to_py covers the main registry types correctly

  • REG_DWORD / REG_QWORD: read up to 4/8 bytes with from_ne_bytes, returning 0 when data is too short, which mirrors CPython’s forgiving behavior.
  • REG_SZ / REG_EXPAND_SZ: convert via bytes_as_wide_slice, trim at the first NUL, and map UTF‑16 decoding errors to ValueError.
  • REG_MULTI_SZ: handle empty data as [], strip the trailing NUL if present, split on embedded NULs, and return a list of decoded strings.
  • Other / unknown types, including REG_BINARY, return a bytes object when data is non‑empty and None when empty.

Once the minor alignment concerns in bytes_as_wide_slice are addressed (see earlier comment), this conversion function looks comprehensive and faithful to CPython’s winreg behavior. (documentation.help)

Also applies to: 964-996


1008-1095: py2reg integer and string conversions are robust and CPython‑like

Highlights:

  • REG_DWORD / REG_QWORD:

    • Accept None as zero, matching CPython. (documentation.help)
    • Require a PyInt, rejecting non‑integers with TypeError.
    • Use Sign::Minus and to_u32/to_u64 to detect negative or too‑large values and raise OverflowError instead of panicking.
  • REG_SZ / REG_EXPAND_SZ:

    • Treat None as an empty string (a single UTF‑16 NUL), again matching CPython behavior.
    • Accept only str values and convert via UTF‑16LE, including the trailing NUL.
  • REG_MULTI_SZ:

    • Treat None as an empty list (double NUL terminator).
    • Enforce a list of strings and correctly concatenate UTF‑16LE strings with NUL separators and a final double‑NUL.
  • For REG_BINARY/others:

    • None maps to “no data” (so SetValueEx passes length 0 and null pointer).
    • Only accepts bytes for non‑None values, with a clear TypeError otherwise.

This is a solid, well‑guarded implementation.


1179-1210: ExpandEnvironmentStrings implementation looks correct and robust

  • Uses the wide ExpandEnvironmentStringsW API, first with lpDst = NULL/size = 0 to obtain the required size, then with an appropriately sized Vec<u16> buffer. (learn.microsoft.com)
  • Checks for 0 return as failure from both calls.
  • Trims at the first NUL and converts UTF‑16 to String, mapping decode errors to ValueError.

This is a clean, CPython‑like implementation with correct buffer sizing and error handling.


361-371: Out‑parameter and HKEY typing in CreateKeyEx are incorrect

CreateKeyEx has low‑level issues with RegCreateKeyExW's output parameter handling:

  • RegCreateKeyExW's last "phkResult" parameter expects PHKEY (*mut HKEY), but the code currently uses *mut std::ffi::c_void instead of the proper Registry::HKEY type. With windows_sys, HKEY is isize, so the output parameter should be a local Registry::HKEY, passed as &mut hkey to the API and then stored in PyHKEYObject.

  • This mirrors the bug fixed earlier in OpenKey: using a null *mut *mut c_void as the out‑pointer causes undefined behavior if the API writes through it.

Replace the *mut std::ffi::c_void declaration with let mut res: Registry::HKEY = 0; and use PyHKEYObject::new(res) for consistency with OpenKey.


861-871: SetValue error creation may have signature mismatch with new_type_error

The code passes a string literal to vm.new_type_error():

if typ != Registry::REG_SZ {
    return Err(vm.new_type_error("type must be winreg.REG_SZ"));
}

If new_type_error expects a String parameter (as suggested by other calls in this file using .to_string()), apply:

-    if typ != Registry::REG_SZ {
-        return Err(vm.new_type_error("type must be winreg.REG_SZ"));
-    }
+    if typ != Registry::REG_SZ {
+        return Err(vm.new_type_error("type must be winreg.REG_SZ".to_string()));
+    }

Verify the actual signature of vm.new_type_error() and confirm whether string literal is accepted or if .to_string() is required for consistency with the rest of the file.


95-113: Handle representation, predefined keys, and __str__ need tightening

There are a few intertwined issues around PyHKEYObject and HKEY constants:

  1. HKEY type vs raw pointers

    In windows_sys, HKEY is defined as isize, not *mut c_void. The code appears to treat HKEYs as raw pointers in multiple locations, which will cause type mismatches or rely on unintended coercions. Refactor to treat HKEYs as the integer handle type that windows_sys exposes.

  2. Predefined HKEY constants being closed

    Predefined constants (HKEY_CLASSES_ROOT, HKEY_CURRENT_USER, etc.) must not be closed via RegCloseKey as Windows documentation specifies the handle must have been opened by RegCreateKeyEx, RegOpenKeyEx, RegConnectRegistry, etc. Add a flag (e.g. is_predefined: bool) or separate constructor for predefined keys and skip RegCloseKey in Drop/Close.

  3. __str__ formatting as a pointer

    __str__ currently formats with {:p} which does not type-check with HKEY = isize. Use hex integer representation instead: format!("<PyHKEY:0x{:x}>", self.hkey.load() as usize).

@youknowone youknowone force-pushed the winreg branch 5 times, most recently from fa5552a to cb7fd60 Compare December 5, 2025 08:39
Copy link
Member

@youknowone youknowone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
crates/vm/src/stdlib/winreg.rs (1)

120-153: Critical: Predefined HKEY constants must not be closed.

The predefined registry keys (HKEY_CLASSES_ROOT, HKEY_CURRENT_USER, etc.) are pseudo-handles that should never be passed to RegCloseKey. According to Windows documentation, calling RegCloseKey on these handles can cause undefined behavior or corruption on certain Windows versions.

The current implementation wraps these predefined keys in PyHkey instances, which means:

  1. If a user calls .Close() on them, they will attempt to close the predefined handle
  2. When the PyHkey is dropped, the Drop impl will also call RegCloseKey

You need to distinguish predefined keys from opened keys. Options:

Option 1: Add an is_predefined flag

 #[pyclass(name = "HKEYType")]
 #[derive(Debug, PyPayload)]
 struct PyHkey {
     hkey: AtomicHKEY,
+    is_predefined: bool,
 }

 impl PyHkey {
-    fn new(hkey: Registry::HKEY) -> Self {
+    fn new(hkey: Registry::HKEY) -> Self {
+        Self::new_with_flag(hkey, false)
+    }
+
+    fn new_predefined(hkey: Registry::HKEY) -> Self {
+        Self::new_with_flag(hkey, true)
+    }
+
+    fn new_with_flag(hkey: Registry::HKEY, is_predefined: bool) -> Self {
         Self {
             hkey: AtomicHKEY::new(hkey),
+            is_predefined,
         }
     }
 }

 impl Drop for PyHkey {
     fn drop(&mut self) {
+        if self.is_predefined {
+            return;
+        }
         let hkey = self.hkey.swap(std::ptr::null_mut());
         if !hkey.is_null() {
             unsafe { Registry::RegCloseKey(hkey) };
         }
     }
 }

 #[pyclass(with(AsNumber, Hashable))]
 impl PyHkey {
     #[pymethod]
     fn Close(&self, vm: &VirtualMachine) -> PyResult<()> {
+        if self.is_predefined {
+            return Ok(()); // Silently succeed for predefined keys
+        }
         // ... existing close logic
     }

     #[pymethod]
     fn Detach(&self) -> PyResult<usize> {
+        if self.is_predefined {
+            return Err(vm.new_value_error("Cannot detach predefined registry keys".to_string()));
+        }
         // ... existing detach logic
     }
 }

Then update the constant definitions:

 #[pyattr(once)]
 fn HKEY_CLASSES_ROOT(vm: &VirtualMachine) -> PyRef<PyHkey> {
-    PyHkey::new(Registry::HKEY_CLASSES_ROOT).into_ref(&vm.ctx)
+    PyHkey::new_predefined(Registry::HKEY_CLASSES_ROOT).into_ref(&vm.ctx)
 }

Based on past review comments, this issue was supposedly addressed in commits c2ca9a7 to 7c8c043, but the current code still shows the same vulnerability.

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 83c5146 and d8dde12.

⛔ Files ignored due to path filters (1)
  • Lib/test/test_winreg.py is excluded by !Lib/**
📒 Files selected for processing (1)
  • crates/vm/src/stdlib/winreg.rs (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • crates/vm/src/stdlib/winreg.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Run tests under miri
  • GitHub Check: Run snippets and cpython tests (windows-latest)
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Check Rust code with clippy
  • GitHub Check: Run rust tests (windows-latest)
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Ensure compilation on various targets
🔇 Additional comments (22)
crates/vm/src/stdlib/winreg.rs (22)

1-27: LGTM! Clean module structure and appropriate imports.

The use of AtomicCell from crossbeam_utils for lock-free thread-safe HKEY management is a solid choice for this use case.


44-58: Good error mapping implementation.

The centralized error mapping function properly categorizes Windows errors into appropriate Python exception types.


60-73: LGTM! Flexible handle argument type.

The HKEYArg wrapper properly handles both PyHkey objects and raw integer handles, providing good ergonomics for the API.


185-239: Well-implemented PyHkey methods with proper atomic operations.

The atomic operations in Close() and Detach() correctly prevent race conditions. The context manager protocol (__enter__/__exit__) is properly implemented.


259-298: LGTM! Proper AsNumber implementation.

The number protocol correctly implements only the operations that make sense for HKEY (boolean and int conversions) while rejecting arithmetic operations.


300-332: LGTM! Proper implementation of ConnectRegistry.

The function correctly handles both local (None) and remote (Some) registry connections with proper wide string conversion.


334-386: LGTM! Proper key creation functions.

Both CreateKey and CreateKeyEx correctly wrap the Windows Registry API with appropriate error handling and wide string conversion.


388-444: LGTM! Deletion operations properly implemented.

The delete functions correctly handle optional parameters and pass arguments in the correct order to the Windows API.


446-473: LGTM! Proper key enumeration with good documentation.

The comment explaining the 257-character buffer quirk is helpful and shows attention to Windows API edge cases.


475-565: LGTM! Comprehensive EnumValue implementation.

The function properly queries required buffer sizes, handles ERROR_MORE_DATA by doubling buffers, and correctly returns the value tuple. The initial size query ensures buffers start at a reasonable size.


567-593: LGTM! FlushKey and LoadKey properly implemented.

Both functions correctly wrap their respective Windows Registry APIs with proper error handling.


595-630: LGTM! OpenKey properly implemented with correct output parameter handling.

The function correctly passes a mutable reference to a local variable as the output parameter, avoiding the null pointer issue from previous versions.


632-666: LGTM! QueryInfoKey properly retrieves registry key information.

The function correctly calls the Windows API and converts the FILETIME to a Python-compatible integer.


668-763: LGTM! QueryValue properly handles default registry values.

The function correctly handles various edge cases including:

  • Rejecting HKEY_PERFORMANCE_DATA
  • Opening subkeys when needed
  • Returning empty string for missing default values
  • Type validation

Note: This function uses bytes_as_wide_slice which has the alignment concern flagged earlier.


765-829: LGTM! QueryValueEx properly handles buffer sizing and ERROR_MORE_DATA.

The function ensures buf_size is at least 256 before entering the retry loop, preventing the infinite loop scenario. The two-phase approach (size query followed by data retrieval) is correct.


831-842: LGTM! SaveKey properly implemented.

The function correctly wraps RegSaveKeyW with proper error handling.


844-914: LGTM! SetValue properly creates subkeys and sets default values.

The function correctly:

  • Validates the type is REG_SZ
  • Rejects HKEY_PERFORMANCE_DATA
  • Creates subkeys when needed
  • Converts strings to UTF-16 with proper byte counting
  • Cleans up resources

916-988: LGTM! Comprehensive registry data to Python conversion.

The function handles all major registry data types correctly:

  • Integer types (DWORD, QWORD) with defensive handling for insufficient data
  • String types (REG_SZ, REG_EXPAND_SZ, REG_MULTI_SZ) with proper null-termination handling
  • Binary data as fallback

Note: Uses bytes_as_wide_slice which has the alignment concern flagged earlier.


990-1077: Excellent! Comprehensive Python to registry data conversion.

The function properly handles:

  • Integer types with overflow and negative value checks
  • String types with UTF-16 encoding
  • Multi-string with proper null termination
  • None values with appropriate defaults
  • Type validation with clear error messages

The use of Sign::Minus check before to_u32()/to_u64() prevents panics on negative integers.


1079-1128: LGTM! SetValueEx properly handles value conversion and setting.

The function correctly handles both Some and None results from py2reg, passing appropriate pointers and lengths to the Windows API.


1130-1159: LGTM! Registry reflection functions properly implemented.

All three reflection key functions correctly wrap their respective Windows APIs with proper error handling.


1161-1192: LGTM! ExpandEnvironmentStrings uses correct two-phase approach.

The function properly:

  1. Queries required buffer size
  2. Allocates exact buffer
  3. Performs expansion
  4. Converts result to String

This matches the recommended Windows API usage pattern.

Comment on lines +31 to +42
/// Convert byte slice to UTF-16 slice (zero-copy when aligned)
fn bytes_as_wide_slice(bytes: &[u8]) -> &[u16] {
// SAFETY: Windows Registry API returns properly aligned UTF-16 data.
// align_to handles any edge cases safely by returning empty prefix/suffix
// if alignment doesn't match.
let (prefix, u16_slice, suffix) = unsafe { bytes.align_to::<u16>() };
debug_assert!(
prefix.is_empty() && suffix.is_empty(),
"Registry data should be u16-aligned"
);
u16_slice
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Potential data loss if alignment assumption is violated.

The function uses debug_assert! to check that prefix and suffix are empty, but then returns u16_slice regardless. In release builds where assertions are disabled, if the data is not properly aligned, the function will silently return a slice that excludes the unaligned prefix/suffix bytes, causing data loss.

Consider either:

  1. Returning a Result and propagating alignment errors, or
  2. Falling back to a manual conversion if alignment fails:
 fn bytes_as_wide_slice(bytes: &[u8]) -> &[u16] {
-    // SAFETY: Windows Registry API returns properly aligned UTF-16 data.
-    // align_to handles any edge cases safely by returning empty prefix/suffix
-    // if alignment doesn't match.
     let (prefix, u16_slice, suffix) = unsafe { bytes.align_to::<u16>() };
-    debug_assert!(
-        prefix.is_empty() && suffix.is_empty(),
-        "Registry data should be u16-aligned"
-    );
+    if !prefix.is_empty() || !suffix.is_empty() {
+        panic!("Registry data is not u16-aligned - this should never happen");
+    }
     u16_slice
 }

Or use a more defensive approach that handles misalignment.

@youknowone youknowone merged commit 6f786b5 into RustPython:main Dec 5, 2025
13 checks passed
@arihant2math arihant2math deleted the winreg branch December 7, 2025 04:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants