Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
117 changes: 117 additions & 0 deletions .github/workflows/lib-deps-check.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,117 @@
name: Lib Dependencies Check

on:
pull_request_target:
types: [opened, synchronize, reopened]
paths:
- 'Lib/**'

concurrency:
group: lib-deps-${{ github.event.pull_request.number }}
cancel-in-progress: true

jobs:
check_deps:
permissions:
pull-requests: write
runs-on: ubuntu-latest
timeout-minutes: 10
steps:
- name: Checkout base branch
uses: actions/[email protected]
with:
# Use base branch for scripts (security: don't run PR code with elevated permissions)
ref: ${{ github.event.pull_request.base.ref }}
fetch-depth: 0

- name: Fetch PR head
run: |
git fetch origin ${{ github.event.pull_request.head.sha }}

- name: Checkout CPython
run: |
git clone --depth 1 --branch v3.14.2 https://github.com/python/cpython.git cpython

- name: Get changed Lib files
id: changed-files
run: |
# Get the list of changed files under Lib/
changed=$(git diff --name-only ${{ github.event.pull_request.base.sha }} ${{ github.event.pull_request.head.sha }} -- 'Lib/*.py' 'Lib/**/*.py' | head -50)
echo "Changed files:"
echo "$changed"

# Extract unique module names (top-level only, skip test/)
modules=""
for file in $changed; do
# Skip test files
if [[ "$file" == Lib/test/* ]]; then
continue
fi
# Extract module name: Lib/foo.py -> foo, Lib/foo/__init__.py -> foo
module=$(echo "$file" | sed -E 's|^Lib/||; s|/__init__\.py$||; s|\.py$||; s|/.*||')
if [[ -n "$module" && ! " $modules " =~ " $module " ]]; then
modules="$modules $module"
fi
done

modules=$(echo "$modules" | xargs) # trim whitespace
echo "Detected modules: $modules"
echo "modules=$modules" >> $GITHUB_OUTPUT

Comment on lines +35 to +60
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Fix duplicate-module detection (substring match can skip modules).

Line 52 uses a regex substring check against a concatenated string, which can treat os as already present when posix is in the list. This can silently drop modules from the deps run.

🔧 Proposed fix (exact-match dedup)
-          modules=""
+          modules=""
+          declare -A seen=()
@@
-            if [[ -n "$module" && ! " $modules " =~ " $module " ]]; then
-              modules="$modules $module"
-            fi
+            if [[ -n "$module" && -z "${seen[$module]:-}" ]]; then
+              seen["$module"]=1
+              modules="$modules $module"
+            fi
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- name: Get changed Lib files
id: changed-files
run: |
# Get the list of changed files under Lib/
changed=$(git diff --name-only ${{ github.event.pull_request.base.sha }} ${{ github.event.pull_request.head.sha }} -- 'Lib/*.py' 'Lib/**/*.py' | head -50)
echo "Changed files:"
echo "$changed"
# Extract unique module names (top-level only, skip test/)
modules=""
for file in $changed; do
# Skip test files
if [[ "$file" == Lib/test/* ]]; then
continue
fi
# Extract module name: Lib/foo.py -> foo, Lib/foo/__init__.py -> foo
module=$(echo "$file" | sed -E 's|^Lib/||; s|/__init__\.py$||; s|\.py$||; s|/.*||')
if [[ -n "$module" && ! " $modules " =~ " $module " ]]; then
modules="$modules $module"
fi
done
modules=$(echo "$modules" | xargs) # trim whitespace
echo "Detected modules: $modules"
echo "modules=$modules" >> $GITHUB_OUTPUT
- name: Get changed Lib files
id: changed-files
run: |
# Get the list of changed files under Lib/
changed=$(git diff --name-only ${{ github.event.pull_request.base.sha }} ${{ github.event.pull_request.head.sha }} -- 'Lib/*.py' 'Lib/**/*.py' | head -50)
echo "Changed files:"
echo "$changed"
# Extract unique module names (top-level only, skip test/)
modules=""
declare -A seen=()
for file in $changed; do
# Skip test files
if [[ "$file" == Lib/test/* ]]; then
continue
fi
# Extract module name: Lib/foo.py -> foo, Lib/foo/__init__.py -> foo
module=$(echo "$file" | sed -E 's|^Lib/||; s|/__init__\.py$||; s|\.py$||; s|/.*||')
if [[ -n "$module" && -z "${seen[$module]:-}" ]]; then
seen["$module"]=1
modules="$modules $module"
fi
done
modules=$(echo "$modules" | xargs) # trim whitespace
echo "Detected modules: $modules"
echo "modules=$modules" >> $GITHUB_OUTPUT
🤖 Prompt for AI Agents
In @.github/workflows/lib-deps-check.yaml around lines 35 - 60, The
duplicate-module check uses a regex substring test in the condition 'if [[ -n
"$module" && ! " $modules " =~ " $module " ]]; then' which misdetects modules
like "os" when "posix" exists; replace this with exact-match deduplication by
tracking seen modules with an associative array (e.g., declare -A seen) inside
the same loop (for file in $changed) and only append when the module key is not
present (check via [[ -z "${seen[$module]}" ]]); after appending set
seen[$module]=1 and keep producing the trimmed $modules and GITHUB_OUTPUT as
before.

- name: Setup Python
if: steps.changed-files.outputs.modules != ''
uses: actions/[email protected]
with:
python-version: "3.12"

- name: Run deps check
if: steps.changed-files.outputs.modules != ''
id: deps-check
run: |
# Run deps for all modules at once
python scripts/update_lib deps ${{ steps.changed-files.outputs.modules }} --depth 2 > /tmp/deps_output.txt 2>&1 || true

# Read output for GitHub Actions
echo "deps_output<<EOF" >> $GITHUB_OUTPUT
cat /tmp/deps_output.txt >> $GITHUB_OUTPUT
echo "EOF" >> $GITHUB_OUTPUT

# Check if there's any meaningful output
if [ -s /tmp/deps_output.txt ]; then
echo "has_output=true" >> $GITHUB_OUTPUT
else
echo "has_output=false" >> $GITHUB_OUTPUT
fi

Comment on lines +67 to +85
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Guard against command injection from PR-controlled module names.

Line 72 expands untrusted module names directly into a shell command in a pull_request_target workflow. A crafted filename could inject shell tokens. Pipe modules through xargs and include -- to avoid option injection.

🔒 Proposed fix (no shell eval of module names)
-          python scripts/update_lib deps ${{ steps.changed-files.outputs.modules }} --depth 2 > /tmp/deps_output.txt 2>&1 || true
+          printf '%s\n' ${{ steps.changed-files.outputs.modules }} | \
+            xargs -r python scripts/update_lib deps --depth 2 -- > /tmp/deps_output.txt 2>&1 || true
🤖 Prompt for AI Agents
In @.github/workflows/lib-deps-check.yaml around lines 67 - 85, The workflow
step "Run deps check" currently expands untrusted module names directly in the
shell via the command starting with "python scripts/update_lib deps ${{
steps.changed-files.outputs.modules }} --depth 2", which allows command
injection; change this to stop interpolating the raw variable and instead stream
the modules through a safe tool (e.g., use printf or echo to produce one module
per line and pipe into xargs) so each module is passed as a positional argument
and include a "--" marker before module names to prevent option injection,
invoking "python scripts/update_lib deps -- <module> --depth 2" for each module;
keep the rest of the step (writing /tmp/deps_output.txt and populating
GITHUB_OUTPUT and has_output) unchanged.

- name: Post comment
if: steps.deps-check.outputs.has_output == 'true'
uses: marocchino/sticky-pull-request-comment@v2
with:
header: lib-deps-check
number: ${{ github.event.pull_request.number }}
message: |
## 📦 Library Dependencies

The following Lib/ modules were modified. Here are their dependencies:

<details>
<summary>Click to expand dependency information</summary>

```
${{ steps.deps-check.outputs.deps_output }}
```

</details>

**Legend:**
- `[+]` path exists, `[-]` path missing
- `[x]` up-to-date, `[ ]` outdated
- `native:` Rust/C extension modules

- name: Remove comment if no Lib changes
if: steps.changed-files.outputs.modules == ''
uses: marocchino/sticky-pull-request-comment@v2
with:
header: lib-deps-check
number: ${{ github.event.pull_request.number }}
delete: true
10 changes: 10 additions & 0 deletions scripts/update_lib/__main__.py
Original file line number Diff line number Diff line change
Expand Up @@ -49,6 +49,11 @@ def main(argv: list[str] | None = None) -> int:
help="Copy library file/directory from CPython (delete existing first)",
add_help=False,
)
subparsers.add_parser(
"deps",
help="Show dependency information for a module",
add_help=False,
)

args, remaining = parser.parse_known_args(argv)

Expand Down Expand Up @@ -77,6 +82,11 @@ def main(argv: list[str] | None = None) -> int:

return auto_mark_main(remaining)

if args.command == "deps":
from update_lib.show_deps import main as show_deps_main

return show_deps_main(remaining)

return 0


Expand Down
10 changes: 4 additions & 6 deletions scripts/update_lib/auto_mark.py
Original file line number Diff line number Diff line change
Expand Up @@ -455,15 +455,13 @@ def extract_test_methods(contents: str) -> set[tuple[str, str]]:
Returns:
Set of (class_name, method_name) tuples
"""
import ast
from update_lib.io_utils import safe_parse_ast
from update_lib.patch_spec import iter_tests

try:
tree = ast.parse(contents)
except SyntaxError:
tree = safe_parse_ast(contents)
if tree is None:
return set()

from update_lib.patch_spec import iter_tests

return {(cls_node.name, fn_node.name) for cls_node, fn_node in iter_tests(tree)}


Expand Down
19 changes: 8 additions & 11 deletions scripts/update_lib/copy_lib.py
Original file line number Diff line number Diff line change
Expand Up @@ -64,17 +64,14 @@ def copy_lib(

# Extract module name and cpython prefix from path
path_str = str(src_path).replace("\\", "/")
if "/Lib/" in path_str:
cpython_prefix, after_lib = path_str.split("/Lib/", 1)
# Get module name (first component, without .py)
name = after_lib.split("/")[0]
if name.endswith(".py"):
name = name[:-3]
else:
# Fallback: just copy the single file
lib_path = parse_lib_path(src_path)
_copy_single(src_path, lib_path, verbose)
return
if "/Lib/" not in path_str:
raise ValueError(f"Path must contain '/Lib/' (got: {src_path})")

cpython_prefix, after_lib = path_str.split("/Lib/", 1)
# Get module name (first component, without .py)
name = after_lib.split("/")[0]
if name.endswith(".py"):
name = name[:-3]

# Get all paths to copy from DEPENDENCIES table
all_src_paths = get_lib_paths(name, cpython_prefix)
Expand Down
Loading
Loading