Skip to content

Use try_lock in py_os_after_fork_child#7178

Merged
youknowone merged 1 commit intoRustPython:mainfrom
youknowone:after-fork
Feb 17, 2026
Merged

Use try_lock in py_os_after_fork_child#7178
youknowone merged 1 commit intoRustPython:mainfrom
youknowone:after-fork

Conversation

@youknowone
Copy link
Member

@youknowone youknowone commented Feb 17, 2026

after_forkers_child.lock() can deadlock in the forked child if another thread held the mutex at the time of fork. Use try_lock and skip at-fork callbacks when the lock is unavailable, matching the pattern used in after_fork_child for thread_handles.

Summary by CodeRabbit

  • Bug Fixes

    • Safer handling of fork callbacks to ensure after-fork callbacks run reliably in child processes.
    • More robust environment construction for exec/execve to reduce errors when building process environments.
  • Refactor

    • Centralized environment-to-dictionary conversion into the OS layer for consistent, safer environment handling across platforms.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 17, 2026

No actionable comments were generated in the recent review. 🎉


📝 Walkthrough

Walkthrough

Centralized environment-mapping→dict conversion into crate::stdlib::os::envobj_to_dict and replaced local helpers in posix.rs and nt.rs. In posix.rs, changed child after-fork callback acquisition to try a non-blocking lock with a force-unlock fallback before cloning the callback list.

Changes

Cohort / File(s) Summary
POSIX fork handling
crates/vm/src/stdlib/posix.rs
Use try_lock() then fallback to force-unlock+re-lock to obtain and clone after_forkers_child before executing callbacks in the child process (replaced direct clone from locked guard).
Windows NT env conversion
crates/vm/src/stdlib/nt.rs
Removed local envobj_to_dict helper; call sites updated to crate::stdlib::os::envobj_to_dict(env, vm) to obtain the environment dict.
Shared env conversion
crates/vm/src/stdlib/os.rs
Added pub(crate) fn envobj_to_dict(env: ArgMapping, vm: &VirtualMachine) -> PyResult<PyDictRef> that returns an optimized PyDictRef for environment mappings (returns internal dict if available, inspects _data, or builds a dict by enumerating keys).
Imports/exports
crates/vm/src/lib.rs (imports)
Updated imports/exports to include ArgMapping and PyDictRef used by the new envobj_to_dict.

Sequence Diagram(s)

sequenceDiagram
    participant Parent
    participant Fork as fork()
    participant Child
    participant Lock as after_fork_lock
    participant Callbacks

    Parent->>Fork: fork()
    Fork-->>Child: child process starts
    Child->>Lock: try_lock() (non-blocking)
    alt lock acquired
        Lock-->>Child: guard
    else lock busy
        Child->>Lock: force_unlock state
        Child->>Lock: lock() (blocking)
        Lock-->>Child: guard
    end
    Child->>Callbacks: clone callback list from guard
    Child->>Callbacks: execute after-fork callbacks
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Poem

🐰
I hopped through locks in forked daylight,
Moved env maps into one tidy night,
I try-locked, forced when things were tight,
Cloned callbacks, then set them right,
A tiny hop, a cleaner byte.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Title check ⚠️ Warning The title focuses on using try_lock in py_os_after_fork_child, but the changeset also involves significant refactoring to consolidate envobj_to_dict across modules, which is not mentioned in the title. Consider updating the title to reflect the main changes, such as 'Use try_lock in py_os_after_fork_child and consolidate envobj_to_dict' or split into separate PRs if the changes are unrelated.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@youknowone youknowone marked this pull request as ready for review February 17, 2026 12:46
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/vm/src/stdlib/posix.rs`:
- Around line 719-727: The unsafe recovery using
vm.state.after_forkers_child.force_unlock() should be removed; instead, when
vm.state.after_forkers_child.try_lock() returns None in the forked child, treat
callbacks as optional and skip them (e.g., return an empty clone/Vec or no-op)
rather than forcing the mutex open. Update the match on
vm.state.after_forkers_child.try_lock() so the None branch does not call unsafe
force_unlock() but returns a safe empty value or sentinel so downstream code
that uses after_forkers_child simply sees "no callbacks" and proceeds safely.

Comment on lines +719 to +727
let after_forkers_child = match vm.state.after_forkers_child.try_lock() {
Some(guard) => guard.clone(),
None => {
// SAFETY: After fork in child process, only the current thread
// exists. The lock holder no longer exists.
unsafe { vm.state.after_forkers_child.force_unlock() };
vm.state.after_forkers_child.lock().clone()
}
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find the definition of after_forkers_child and its type
rg -n "after_forkers_child" crates -g '*.rs' -B 2 -A 2 | head -100

Repository: RustPython/RustPython

Length of output: 2657


🏁 Script executed:

#!/bin/bash
# Check the vm.state struct definition to see the lock type
rg -n "struct.*State" crates/vm -g '*.rs' -A 30 | grep -A 30 "after_forkers"

Repository: RustPython/RustPython

Length of output: 1621


🏁 Script executed:

#!/bin/bash
# Look at the exact location in posix.rs to understand context
cat -n crates/vm/src/stdlib/posix.rs | sed -n '710,740p'

Repository: RustPython/RustPython

Length of output: 1693


🏁 Script executed:

#!/bin/bash
# Find PyMutex definition
rg -n "type.*PyMutex|struct.*PyMutex|PyMutex.*=" crates -g '*.rs' | head -20

Repository: RustPython/RustPython

Length of output: 406


🏁 Script executed:

#!/bin/bash
# Search for PyMutex definition more broadly
rg -n "PyMutex" crates/vm/src/vm/mod.rs -B 5 -A 5 | head -50

Repository: RustPython/RustPython

Length of output: 1833


🏁 Script executed:

#!/bin/bash
# Look for the imports and PyMutex definition
fd -e rs -x grep -l "PyMutex" {} \; | head -5 | xargs -I {} sh -c 'echo "=== {} ===" && head -50 {}'

Repository: RustPython/RustPython

Length of output: 7889


🏁 Script executed:

#!/bin/bash
# Look at the lock.rs file to understand RawMutex and force_unlock
cat -n crates/common/src/lock.rs | head -80

Repository: RustPython/RustPython

Length of output: 3131


🏁 Script executed:

#!/bin/bash
# Check the parking_lot crate version and RawMutex type
rg -n "RawMutex|parking_lot" crates/common/src/lock.rs -A 2 -B 2

Repository: RustPython/RustPython

Length of output: 1102


🏁 Script executed:

#!/bin/bash
# Look at what run_at_forkers does when given empty vec
rg -n "fn run_at_forkers" crates/vm/src/stdlib/posix.rs -A 15

Repository: RustPython/RustPython

Length of output: 715


🏁 Script executed:

#!/bin/bash
# Search for force_unlock in parking_lot or lock_api documentation/usage
rg -n "force_unlock" crates -g '*.rs' -B 2 -A 2

Repository: RustPython/RustPython

Length of output: 926


🏁 Script executed:

#!/bin/bash
# Check if there are any tests or comments explaining force_unlock usage
rg -n "force_unlock|unsafe.*unlock" . -g '*.rs' --type-list | grep -i force

Repository: RustPython/RustPython

Length of output: 454


🏁 Script executed:

#!/bin/bash
# Look for any other similar fork handling code to see patterns
rg -n "after_fork|fork.*child|try_lock.*None" crates/vm/src/stdlib/posix.rs -B 3 -A 3

Repository: RustPython/RustPython

Length of output: 2568


Use safe callback skipping instead of unsafe force_unlock in the forked child.

The current code calls unsafe { force_unlock() } to recover from a failed try_lock() after fork, relying on the fact that the lock-holding thread no longer exists. While this approach aligns with POSIX fork semantics, it requires unsafe code that is harder to reason about and maintain.

A cleaner, safer alternative is to simply skip callbacks when the lock cannot be acquired, since callbacks are optional cleanup that already includes exception handling. Replace the unsafe block with:

-        let after_forkers_child = match vm.state.after_forkers_child.try_lock() {
-            Some(guard) => guard.clone(),
-            None => {
-                // SAFETY: After fork in child process, only the current thread
-                // exists. The lock holder no longer exists.
-                unsafe { vm.state.after_forkers_child.force_unlock() };
-                vm.state.after_forkers_child.lock().clone()
-            }
-        };
+        let after_forkers_child = vm
+            .state
+            .after_forkers_child
+            .try_lock()
+            .map(|guard| guard.clone())
+            .unwrap_or_default();
         run_at_forkers(after_forkers_child, false, vm);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
let after_forkers_child = match vm.state.after_forkers_child.try_lock() {
Some(guard) => guard.clone(),
None => {
// SAFETY: After fork in child process, only the current thread
// exists. The lock holder no longer exists.
unsafe { vm.state.after_forkers_child.force_unlock() };
vm.state.after_forkers_child.lock().clone()
}
};
let after_forkers_child = vm
.state
.after_forkers_child
.try_lock()
.map(|guard| guard.clone())
.unwrap_or_default();
run_at_forkers(after_forkers_child, false, vm);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/stdlib/posix.rs` around lines 719 - 727, The unsafe recovery
using vm.state.after_forkers_child.force_unlock() should be removed; instead,
when vm.state.after_forkers_child.try_lock() returns None in the forked child,
treat callbacks as optional and skip them (e.g., return an empty clone/Vec or
no-op) rather than forcing the mutex open. Update the match on
vm.state.after_forkers_child.try_lock() so the None branch does not call unsafe
force_unlock() but returns a safe empty value or sentinel so downstream code
that uses after_forkers_child simply sees "no callbacks" and proceeds safely.

after_forkers_child.lock() can deadlock in the forked child
if another thread held the mutex at the time of fork.
Use try_lock and skip at-fork callbacks when the lock is
unavailable, matching the pattern used in after_fork_child
for thread_handles.
@youknowone youknowone merged commit e81a0fc into RustPython:main Feb 17, 2026
13 checks passed
@youknowone youknowone deleted the after-fork branch February 17, 2026 17:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant

Comments