Skip to content

Conversation

@ShaharNaveh
Copy link
Collaborator

@ShaharNaveh ShaharNaveh commented Oct 19, 2025

Took @coolreader18 PR #6177 (he did 99% of the work) and updated ruff to 0.14.1 and patched our compile function as ruff moved duplicated params to be a semantic error, since 0.11.7 (astral-sh/ruff#17131)

Closes #6170
Closes #6177

Summary by CodeRabbit

  • Bug Fixes

    • Detect and report duplicate parameters in function signatures.
  • Chores

    • Updated ruff-related dependencies.
    • Improved UTF‑8-aware source-location reporting for more accurate errors and tracebacks.
    • Migrated string interpolation to a new interpolated-string representation for more consistent f-string behavior.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 19, 2025

Walkthrough

This PR updates ruff-related dependencies and adapts the codebase to ruff's revised AST and source-location shapes: adding node_index to many AST nodes, renaming row/columnline/character_offset, switching to PositionEncoding::Utf8, and replacing FString types with InterpolatedString types (plus initial t-string scaffolding).

Changes

Cohort / File(s) Summary
Dependency update
Cargo.toml
Bumped ruff-related git tags from 0.11.00.14.1.
Codegen — compile & IR
compiler/codegen/src/compile.rs, compiler/codegen/src/ir.rs, compiler/codegen/src/lib.rs
Added duplicate-parameter validation hook; switched location encoding to PositionEncoding::Utf8; adapted source-location field access (rowline, columncharacter_offset); migrated FString → InterpolatedString representations; added TString name mapping/scaffold.
Codegen — symboltable & unparse
compiler/codegen/src/symboltable.rs, compiler/codegen/src/unparse.rs
Pattern matches updated to include new node_index fields; unparse now consumes ruff::InterpolatedStringElement and InterpolatedStringFormatSpec; added TString unparsing scaffold.
Compiler core & marshal
compiler/core/src/lib.rs, compiler/core/src/marshal.rs, compiler/src/lib.rs
Re-exported PositionEncoding; serialized/deserialized SourceLocation fields renamed to line/character_offset; CompileError::location made const; parse errors use Utf8 encoding.
Bytecode / disassembly
compiler/core/src/bytecode.rs
Adjusted formatting digit calculations and use of line for disassembly output.
VM frame, traceback & faulthandler
vm/src/frame.rs, vm/src/builtins/frame.rs, stdlib/src/faulthandler.rs
Use line/character_offset for frame/traceback/faulthandler; avoid unnecessary SourceLocation clones (rely on Copy semantics).
VM AST glue — many files
vm/src/stdlib/ast.rs, vm/src/stdlib/ast/* (expression.rs, statement.rs, parameter.rs, pattern.rs, string.rs, module.rs, constant.rs, argument.rs, other.rs, type_parameters.rs, elif_else_clause.rs, exception.rs)
Added and threaded node_index across ~30+ AST node types; updated (de)serialization/conversion paths; replaced FString flows with InterpolatedString flows; inserted placeholders for TString where applicable.
String interpolation & lexer errors
vm/src/stdlib/ast/string.rs, src/shell.rs, vm/src/vm/vm_new.rs
Reworked f-string handling to InterpolatedStringElement variants (including node_index); switched lexical error variant to InterpolatedStringErrorType::UnterminatedTripleQuotedString.
WASM & JS error mapping
wasm/lib/src/convert.rs
Map compile error locations to JS using line and character_offset fields.

Sequence Diagram(s)

sequenceDiagram
    participant Parser as Ruff Parser (new AST)
    participant Compiler as Compiler / Codegen
    participant VM as VM / Runtime

    Note over Parser,Compiler: AST shape changes: node_index, InterpolatedString, SourceLocation(line+char_off)
    Parser->>Compiler: provide AST nodes (with node_index, ranges)
    Compiler->>Compiler: validate duplicate params
    Compiler->>Compiler: convert/interpolate strings (detect await inside interpolations)
    Compiler->>Compiler: compute source locations using PositionEncoding::Utf8
    Compiler->>VM: emit bytecode + linetable entries (line, character_offset)
    VM->>VM: build tracebacks and frame info using .line/.character_offset
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

Suggested reviewers

  • arihant2math

Poem

🐰 I hopped through nodes with tiny feet,

Gave every node an index neat,
Strings now sparkle, braces sing,
Row became line in UTF‑8 spring,
A rabbit cheers—code leaps to meet! 🎉

Pre-merge checks and finishing touches

✅ Passed checks (5 passed)
Check name Status Explanation
Title Check ✅ Passed The title "Update ruff 0.14.1" accurately describes the primary objective of this pull request, which is to upgrade the ruff dependency from version 0.11.0 to 0.14.1. While the changeset includes extensive refactoring related to AST changes (node_index fields), string interpolation types, source location field renames, and duplicated-parameter validation, these are consequences of the version upgrade rather than independent changes. The title is clear, specific, and sufficiently conveys the main change to a developer scanning the repository history.
Linked Issues Check ✅ Passed The PR successfully addresses the coding requirements from both linked issues. Issue #6170 and #6177 both target ruff version upgrades (toward >=0.13.1), and this PR achieves that by updating to ruff 0.14.1. The PR handles all the major AST-related changes including: addition of node_index fields across numerous AST node types (ExprNamed, ExprDict, ExprAwait, StmtFunctionDef, etc.), refactoring string interpolation from FString types to InterpolatedStringElements, adding preliminary t-string support, implementing source location field renames (row → line, column → character_offset), using PositionEncoding::Utf8, exporting PositionEncoding publicly, and adding compile-time validation for duplicate parameters. The changes comprehensively address the objectives outlined in the linked issues.
Out of Scope Changes Check ✅ Passed All changes in this pull request are directly related to the ruff 0.14.1 upgrade and its associated API changes. The modifications span Cargo.toml (dependency versions), compiler modules (AST handling and codegen), VM and stdlib components (AST field propagation), and WASM utilities (error location mapping)—all addressing the structural changes introduced by the ruff upgrade. There are no unrelated refactorings, unrelated feature additions, or changes outside the scope of adapting the codebase to the new ruff version and its AST structure modifications.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (10)
vm/src/stdlib/ast/type_parameters.rs (1)

75-94: TypeVar: default_value is dropped in ast_to_object (serialization asymmetry).

ast_from_object reads "default_value", but ast_to_object doesn’t write it. This loses the default type parameter when round‑tripping.

Apply this diff to serialize the default:

     fn ast_to_object(self, _vm: &VirtualMachine, source_file: &SourceFile) -> PyObjectRef {
         let Self {
-            node_index: _,
-            name,
-            bound,
-            range: _range,
-            default: _,
+            node_index: _,
+            name,
+            bound,
+            range: _range,
+            default,
         } = self;
         let node = NodeAst
             .into_ref_with_type(_vm, pyast::NodeTypeParamTypeVar::static_type().to_owned())
             .unwrap();
         let dict = node.as_object().dict().unwrap();
         dict.set_item("name", name.ast_to_object(_vm, source_file), _vm)
             .unwrap();
         dict.set_item("bound", bound.ast_to_object(_vm, source_file), _vm)
             .unwrap();
+        dict.set_item(
+            "default_value",
+            default.ast_to_object(_vm, source_file),
+            _vm,
+        )
+        .unwrap();
         node_add_location(&dict, _range, _vm, source_file);
         node.into()
     }
vm/src/stdlib/ast/string.rs (2)

20-33: Use-after-move: fstring_element consumed at len() then used again.

fstring_element.into_iter().len() moves fstring_element, then it’s borrowed again in the closure — this won’t compile. Also O(n^2) due to re-creating the iterator each time.

Apply this safe pattern that avoids the move and keeps behavior:

-fn ruff_fstring_element_into_iter(
-    mut fstring_element: ruff::InterpolatedStringElements,
-) -> impl Iterator<Item = ruff::InterpolatedStringElement> + 'static {
-    let default =
-        ruff::InterpolatedStringElement::Literal(ruff::InterpolatedStringLiteralElement {
-            node_index: Default::default(),
-            range: Default::default(),
-            value: Default::default(),
-        });
-    (0..fstring_element.into_iter().len()).map(move |i| {
-        let fstring_element = &mut fstring_element;
-        let tmp = fstring_element.into_iter().nth(i).unwrap();
-        std::mem::replace(tmp, default.clone())
-    })
-}
+fn ruff_fstring_element_into_iter(
+    mut fstring_element: ruff::InterpolatedStringElements,
+) -> impl Iterator<Item = ruff::InterpolatedStringElement> + 'static {
+    let default =
+        ruff::InterpolatedStringElement::Literal(ruff::InterpolatedStringLiteralElement {
+            node_index: Default::default(),
+            range: Default::default(),
+            value: Default::default(),
+        });
+    let len = fstring_element.len(); // avoid moving the container
+    (0..len).map(move |i| {
+        // create a fresh &mut iterator each time; consider optimizing later
+        let tmp = fstring_element.iter_mut().nth(i).expect("in-bounds");
+        std::mem::replace(tmp, default.clone())
+    })
+}

Optionally, we can later switch to a linear-time drain-style iterator if InterpolatedStringElements exposes a mutable slice.


164-197: ExprFString loses source range (set to Default).

Top-level ExprFString { range: Default::default(), .. } drops location info. Should be the JoinedStr.range for accurate tracebacks/mappings.

Apply this minimal fix:

 impl JoinedStr {
     pub(super) fn into_expr(self) -> ruff::Expr {
         let Self { range, values } = self;
         ruff::Expr::FString(ruff::ExprFString {
             node_index: Default::default(),
-            range: Default::default(),
+            range,
             value: match values.len() {
                 // unchanged…
vm/src/stdlib/ast/other.rs (1)

109-133: WithItem misses location propagation in ast_to_object

_range is captured but never applied; Alias does call node_add_location. Add node_add_location to keep locations consistent.

Apply:

         let dict = node.as_object().dict().unwrap();
@@
         dict.set_item(
             "optional_vars",
             optional_vars.ast_to_object(vm, source_file),
             vm,
         )
         .unwrap();
+        node_add_location(&dict, _range, vm, source_file);
         node.into()
vm/src/stdlib/ast/parameter.rs (1)

352-361: Fix potential usize underflow when computing default_argument_count

If defaults.len() > posonlyargs.len() + args.len(), len_a + len_b - len_c underflows (debug panic / release wrap), leading to incorrect .skip() behavior. Use saturating subtraction and keep intent explicit.

Apply:

-    let default_argument_count = posonlyargs.len() + args.len() - defaults.len();
+    let total = posonlyargs.len() + args.len();
+    let default_argument_count = total.saturating_sub(defaults.len());
vm/src/stdlib/ast/pattern.rs (3)

314-333: Runtime panic risk: split_pattern_match_class is todo! but invoked

PatternMatchClass::ast_to_object calls split_pattern_match_class, which is todo!(). Any conversion of match-class patterns to objects will panic.

  • Implement split_pattern_match_class (extract positional patterns, keyword attrs, keyword patterns).
  • If not ready, gate the call and return a structured Python-side error instead of panicking, or temporarily unreachable!() under a feature flag to avoid accidental invocation.

Also applies to: 534-543


357-372: Runtime panic risk: merge_pattern_match_class is todo! but invoked

PatternMatchClass::ast_from_object constructs via merge_pattern_match_class, which is todo!(). Deserializing match-class patterns from Python objects will panic.

  • Implement merge_pattern_match_class to combine (patterns, kwd_attrs, kwd_patterns) into Vec<ruff::Pattern> and Vec<ruff::PatternKeyword>.
  • Until implemented, fail gracefully with a descriptive error rather than panicking.

Also applies to: 544-551


164-183: Unimplemented ruff::Singleton Node conversions break match-singleton paths

PatternMatchSingleton delegates to ruff::Singleton Node conversion, but ruff::Singleton has todo!() in both ast_to_object and ast_from_object. Any (de)serialization of MatchSingleton will panic.

I can draft the minimal ruff::Singleton conversions for {None, True, False} to unblock this.

Also applies to: 184-199, 201-213

compiler/codegen/src/symboltable.rs (1)

796-827: Reorder type parameter scope entry and consolidate .type_params registration for PEP 695 compliance.

The concerns in your review are verified:

  1. Scope visibility confirmed: enter_type_param_block at lines 796-827 (ClassDef), 1419-1426 (FunctionDef), and 663-686 (TypeAlias) is called before entering the parent scope. When enter_type_param_block checks in_class = self.tables.last().is_some_and(|t| t.typ == CompilerScope::Class), it returns false for class definitions because the class scope is entered afterward, leaving can_see_class_scope = false. This misaligns with PEP 695 semantics where type parameters should be visible to class members.

  2. Double registration confirmed: .type_params is registered twice—once in enter_type_param_block (line 680 with SymbolUsage::Assigned) and again in scan_type_params (line 1421 with SymbolUsage::TypeParam).

Required fixes:

  • Reorder: enter parent scope (Class/Function) first, then enter type parameter block, or pass scope context to enter_type_param_block
  • Remove one .type_params registration to eliminate redundancy
vm/src/stdlib/ast/expression.rs (1)

1271-1292: Add node_add_location() call for Comprehension objects to maintain consistency.

The node_add_location() function is called in every other impl_to_object throughout the AST module (70+ instances), but is missing from the Comprehension implementation at lines 1271-1292. The _range parameter is available but unused, indicating this was an oversight. While Python's standard AST spec does not include location attributes on comprehension objects, this RustPython implementation systematically adds location info to all node types. The proposed fix is correct:

        let dict = node.as_object().dict().unwrap();
        dict.set_item("target", target.ast_to_object(vm, source_file), vm)
            .unwrap();
        dict.set_item("iter", iter.ast_to_object(vm, source_file), vm)
            .unwrap();
        dict.set_item("ifs", ifs.ast_to_object(vm, source_file), vm)
            .unwrap();
        dict.set_item("is_async", is_async.ast_to_object(vm, source_file), vm)
            .unwrap();
+       node_add_location(&dict, _range, vm, source_file);
        node.into()
🧹 Nitpick comments (9)
vm/src/stdlib/ast/elif_else_clause.rs (1)

10-15: Consistent handling of node_index in elif/else clauses.

Ignoring incoming indices where appropriate and defaulting new ones is fine for internal indexing.

If node_index is intended for stable mapping across transforms, consider threading through (not defaulting) when lifting nested If into ElifElseClause.

Also applies to: 63-66, 75-76, 84-85, 92-93

vm/src/stdlib/ast/string.rs (1)

4-17: Quadratic iteration over fstring_value (perf nit).

Repeated iter_mut().nth(i) is O(n^2). If this path is hot, consider a linear iterator that replaces in place without re-scanning each time.

Example direction (subject to API availability):

  • Precompute len, then iterate once with a mutable index and use direct indexing or a draining iterator.
  • Or, if FStringValue supports move-out iteration, prefer into_iter() to avoid replacements.
vm/src/stdlib/ast/argument.rs (1)

160-165: Compute a meaningful range when merging class-def args

range: Default::default() drops location data. Build the covered range from args/keywords for consistency with merge_function_call_arguments.

Apply:

-    Some(Box::new(ruff::Arguments {
-        node_index: Default::default(),
-        range: Default::default(), // TODO
-        args,
-        keywords,
-    }))
+    // Derive range from any present args/keywords
+    let range = args
+        .iter()
+        .map(|a| a.range())
+        .chain(keywords.iter().map(|k| k.range()))
+        .reduce(|acc, next| acc.cover(next))
+        .unwrap_or_default();
+
+    Some(Box::new(ruff::Arguments {
+        node_index: Default::default(),
+        range,
+        args,
+        keywords,
+    }))
compiler/codegen/src/ir.rs (2)

185-191: Location replication matches instruction expansion

Using repeat_n(info.location, arg.instr_size()) keeps locations aligned with extras + base opcode. LGTM. Consider a brief assert in debug to prevent drift if encoding changes.


424-427: Update comment to match renamed fields and zero-based column

The comment still mentions “row/column (OneIndexed)”. Now it’s line (OneIndexed) and character_offset with to_zero_indexed(). Suggest updating the comment to avoid confusion.

vm/src/stdlib/ast/other.rs (1)

140-151: WithItem range set to Default in ast_from_object

Use range_from_object to preserve locations like Alias and ExceptHandler.

Apply:

-        Ok(Self {
-            node_index: Default::default(),
+        Ok(Self {
+            node_index: Default::default(),
             context_expr: Node::ast_from_object(
                 vm,
                 source_file,
                 get_node_field(vm, &object, "context_expr", "withitem")?,
             )?,
             optional_vars: get_node_field_opt(vm, &object, "optional_vars")?
                 .map(|obj| Node::ast_from_object(vm, source_file, obj))
                 .transpose()?,
-            range: Default::default(),
+            range: range_from_object(vm, source_file, object, "withitem")?,
         })
vm/src/stdlib/ast/parameter.rs (1)

406-413: Guard against mismatched keyword/default lengths

zip(kw_only_args.keywords, defaults.defaults) truncates to the shorter length. If lengths diverge, defaults will be silently dropped or missing. Consider asserting equality in debug or returning an error.

     std::iter::zip(kw_only_args.keywords, defaults.defaults)
         .map(|(parameter, default)| ruff::ParameterWithDefault {
             node_index: Default::default(),
             parameter,
             default,
             range: Default::default(),
         })
         .collect()
+    // debug-only sanity:
+    // debug_assert_eq!(kw_only_args_len, kw_defaults_len, "kwonlyargs/defaults length mismatch");
vm/src/stdlib/ast/statement.rs (2)

963-969: Remove duplicate _cls fetch in StmtTry::ast_from_object.

let _cls = _object.class(); is called twice; the second is redundant.

-        let _cls = _object.class();
-        let is_star = _cls.is(pyast::NodeStmtTryStar::static_type());
-        let _cls = _object.class();
+        let _cls = _object.class();
+        let is_star = _cls.is(pyast::NodeStmtTryStar::static_type());
         debug_assert!(
             _cls.is(pyast::NodeStmtTry::static_type())
                 || _cls.is(pyast::NodeStmtTryStar::static_type())
         );

171-221: Unify identifier conversion style for FunctionDef vs ClassDef.

FunctionDef::ast_to_object writes name via vm.ctx.new_str(name.as_str()), while ClassDef::ast_to_object calls name.ast_to_object(...). Consider standardizing on one approach for consistency and to avoid surprises if either path gains behavior.

If Identifier implements Node, prefer using it consistently. Otherwise, prefer direct new_str(...) for both.

Also applies to: 268-307

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3b48dcc and 0a48741.

⛔ Files ignored due to path filters (2)
  • Cargo.lock is excluded by !**/*.lock
  • Lib/test/test_fstring.py is excluded by !Lib/**
📒 Files selected for processing (29)
  • Cargo.toml (1 hunks)
  • compiler/codegen/src/compile.rs (19 hunks)
  • compiler/codegen/src/ir.rs (2 hunks)
  • compiler/codegen/src/lib.rs (1 hunks)
  • compiler/codegen/src/symboltable.rs (16 hunks)
  • compiler/codegen/src/unparse.rs (24 hunks)
  • compiler/core/src/bytecode.rs (1 hunks)
  • compiler/core/src/lib.rs (1 hunks)
  • compiler/core/src/marshal.rs (2 hunks)
  • compiler/src/lib.rs (3 hunks)
  • src/shell.rs (2 hunks)
  • stdlib/src/faulthandler.rs (1 hunks)
  • vm/src/builtins/frame.rs (1 hunks)
  • vm/src/frame.rs (2 hunks)
  • vm/src/stdlib/ast.rs (4 hunks)
  • vm/src/stdlib/ast/argument.rs (4 hunks)
  • vm/src/stdlib/ast/constant.rs (10 hunks)
  • vm/src/stdlib/ast/elif_else_clause.rs (4 hunks)
  • vm/src/stdlib/ast/exception.rs (2 hunks)
  • vm/src/stdlib/ast/expression.rs (51 hunks)
  • vm/src/stdlib/ast/module.rs (4 hunks)
  • vm/src/stdlib/ast/other.rs (4 hunks)
  • vm/src/stdlib/ast/parameter.rs (8 hunks)
  • vm/src/stdlib/ast/pattern.rs (18 hunks)
  • vm/src/stdlib/ast/statement.rs (46 hunks)
  • vm/src/stdlib/ast/string.rs (10 hunks)
  • vm/src/stdlib/ast/type_parameters.rs (7 hunks)
  • vm/src/vm/vm_new.rs (2 hunks)
  • wasm/lib/src/convert.rs (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Format Rust code with the default rustfmt style (run cargo fmt)
Run clippy and fix any warnings or lints introduced by your changes
Follow Rust best practices for error handling and memory management

Files:

  • vm/src/builtins/frame.rs
  • compiler/codegen/src/lib.rs
  • compiler/core/src/lib.rs
  • compiler/codegen/src/ir.rs
  • wasm/lib/src/convert.rs
  • compiler/core/src/marshal.rs
  • vm/src/stdlib/ast/module.rs
  • compiler/core/src/bytecode.rs
  • stdlib/src/faulthandler.rs
  • vm/src/vm/vm_new.rs
  • vm/src/stdlib/ast/exception.rs
  • vm/src/frame.rs
  • src/shell.rs
  • vm/src/stdlib/ast/constant.rs
  • compiler/codegen/src/unparse.rs
  • vm/src/stdlib/ast/other.rs
  • vm/src/stdlib/ast/type_parameters.rs
  • compiler/src/lib.rs
  • vm/src/stdlib/ast/elif_else_clause.rs
  • vm/src/stdlib/ast/pattern.rs
  • vm/src/stdlib/ast/string.rs
  • vm/src/stdlib/ast.rs
  • vm/src/stdlib/ast/expression.rs
  • vm/src/stdlib/ast/statement.rs
  • vm/src/stdlib/ast/parameter.rs
  • vm/src/stdlib/ast/argument.rs
  • compiler/codegen/src/symboltable.rs
  • compiler/codegen/src/compile.rs
{vm,stdlib}/**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use RustPython macros (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • vm/src/builtins/frame.rs
  • vm/src/stdlib/ast/module.rs
  • stdlib/src/faulthandler.rs
  • vm/src/vm/vm_new.rs
  • vm/src/stdlib/ast/exception.rs
  • vm/src/frame.rs
  • vm/src/stdlib/ast/constant.rs
  • vm/src/stdlib/ast/other.rs
  • vm/src/stdlib/ast/type_parameters.rs
  • vm/src/stdlib/ast/elif_else_clause.rs
  • vm/src/stdlib/ast/pattern.rs
  • vm/src/stdlib/ast/string.rs
  • vm/src/stdlib/ast.rs
  • vm/src/stdlib/ast/expression.rs
  • vm/src/stdlib/ast/statement.rs
  • vm/src/stdlib/ast/parameter.rs
  • vm/src/stdlib/ast/argument.rs
🧬 Code graph analysis (5)
stdlib/src/faulthandler.rs (1)
vm/src/frame.rs (1)
  • current_location (173-175)
vm/src/stdlib/ast/other.rs (1)
vm/src/stdlib/ast/expression.rs (14)
  • ast_to_object (11-53)
  • ast_to_object (148-165)
  • ast_to_object (191-208)
  • ast_to_object (234-254)
  • ast_to_object (285-302)
  • ast_to_object (327-344)
  • ast_to_object (370-390)
  • ast_to_object (421-445)
  • ast_to_object (477-491)
  • ast_to_object (511-528)
  • ast_to_object (554-571)
  • ast_to_object (597-617)
  • ast_to_object (648-666)
  • ast_to_object (694-708)
vm/src/stdlib/ast/pattern.rs (2)
vm/src/stdlib/ast/expression.rs (12)
  • ast_from_object (55-143)
  • ast_from_object (167-186)
  • ast_from_object (210-229)
  • ast_from_object (256-280)
  • ast_from_object (303-322)
  • ast_from_object (346-365)
  • ast_from_object (392-416)
  • ast_from_object (447-472)
  • ast_from_object (492-506)
  • ast_from_object (530-549)
  • ast_from_object (573-592)
  • ast_from_object (619-643)
vm/src/stdlib/ast.rs (2)
  • get_node_field (56-59)
  • range_from_object (163-186)
vm/src/stdlib/ast/string.rs (1)
vm/src/stdlib/ast/constant.rs (1)
  • new_str (13-23)
vm/src/stdlib/ast/statement.rs (3)
vm/src/stdlib/ast/exception.rs (2)
  • ast_to_object (6-10)
  • ast_to_object (36-59)
vm/src/stdlib/ast/expression.rs (14)
  • ast_to_object (11-53)
  • ast_to_object (148-165)
  • ast_to_object (191-208)
  • ast_to_object (234-254)
  • ast_to_object (285-302)
  • ast_to_object (327-344)
  • ast_to_object (370-390)
  • ast_to_object (421-445)
  • ast_to_object (477-491)
  • ast_to_object (511-528)
  • ast_to_object (554-571)
  • ast_to_object (597-617)
  • ast_to_object (648-666)
  • ast_to_object (694-708)
vm/src/stdlib/ast.rs (1)
  • range_from_object (163-186)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Run tests under miri
  • GitHub Check: Run snippets and cpython tests (windows-latest)
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Check Rust code with rustfmt and clippy
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Ensure compilation on various targets
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Run rust tests (windows-latest)
🔇 Additional comments (39)
Cargo.toml (1)

161-164: Ruff dependency updates verified and consistent.

The tag 0.14.1 exists in the Ruff repository, and all four ruff crates in the workspace are correctly pinned to this version. Member crates properly reference these dependencies via { workspace = true }, with no conflicting versions detected elsewhere in the workspace. The updates are correctly coordinated.

vm/src/vm/vm_new.rs (2)

320-334: LGTM! Correct update to InterpolatedStringErrorType.

The migration from FStringErrorType to InterpolatedStringErrorType correctly reflects ruff's API changes. The error handling logic for unterminated triple-quoted strings remains intact.


412-418: LGTM! Correct field rename from row to line.

The SourceLocation field rename from row to line improves semantic clarity while maintaining the same functionality for extracting error statement lines.

src/shell.rs (2)

3-6: LGTM! Import correctly updated.

The import of InterpolatedStringErrorType (replacing the previous FStringErrorType) correctly reflects ruff's API changes.


52-58: LGTM! Correct type usage update.

The pattern match correctly uses InterpolatedStringErrorType::UnterminatedTripleQuotedString, maintaining the REPL's input continuation behavior for unterminated triple-quoted interpolated strings.

vm/src/builtins/frame.rs (1)

62-64: Alignment with new SourceLocation fields looks good.

The getter now uses line; consistent with vm/src/frame.rs. No issues here.

compiler/core/src/marshal.rs (1)

201-203: Location encoding is consistent (1-based line, 0-based column).

Read uses OneIndexed::new for line and from_zero_indexed for character_offset; write uses line.get() and character_offset.to_zero_indexed(). Good.

Consider adding a quick round‑trip test for a few edge positions (line 1/col 0; last line/col end) to prevent regressions.

Also applies to: 659-661

vm/src/frame.rs (1)

391-400: Traceback construction updated to use loc.line — LGTM.

Mapping to PyTraceback now passes OneIndexed line; consistent with new fields.

Double-check that frame.lasti() here matches the expected “last executed” index (preincrement happens earlier). If consumers assume CPython semantics, add a quick assertion in tests.

stdlib/src/faulthandler.rs (1)

11-15: Uses .line in faulthandler output — OK.

Matches new SourceLocation fields. Safe once current_location handles lasti==0.

vm/src/stdlib/ast/module.rs (1)

69-73: node_index propagation is consistent.

Destructuring ignores node_index on to_object and initializes it with Default on from_object. Matches the broader AST changes.

Also applies to: 99-101, 152-156, 173-174

vm/src/stdlib/ast.rs (1)

26-27: UTF‑8 position encoding + field rename wired correctly.

  • PySourceLocation now maps row/column -> line/character_offset.
  • Offsets computed with PositionEncoding::Utf8.
  • Interactive mode wrapped into ModModule with node_index default. All good.

Add/adjust a parser test asserting lineno/col_offset round‑trip for multibyte UTF‑8 to lock in the encoding change.

Also applies to: 94-98, 196-201, 202-205, 276-280

wasm/lib/src/convert.rs (1)

254-255: The review comment is based on an incorrect premise—no action required.

rustpython uses Python's convention where line numbers are 1-based and character offsets are 0-based. The code at lines 254-255 and 259-260 is already correct:

  • err.location().unwrap().line.get() provides 1-based row (standard for editors)
  • err.location().unwrap().character_offset.get() provides 0-based column (standard for tooling and JavaScript error conventions)

The review incorrectly claimed character_offset.get() is 1-based. It is actually 0-based. The to_zero_indexed() method does not exist and is not needed—the values are already in the correct indexing base for JavaScript consumers.

vm/src/stdlib/ast/type_parameters.rs (1)

18-22: LGTM on node_index default initialization.

Constructor now threads node_index consistently. No issues.

vm/src/stdlib/ast/string.rs (1)

89-107: Format-spec conversion looks correct with node_index threading.

Destructure/mapping to JoinedStr preserves range and elements; node_index intentionally ignored. LGTM.

Please confirm InterpolatedStringFormatSpec.elements ordering matches CPython fmt parsing expectations in downstream consumers.

compiler/codegen/src/lib.rs (1)

59-60: Add TString name mapping — good coverage.

This keeps error messages coherent for t-strings. No issues.

compiler/core/src/lib.rs (1)

11-13: Re-export PositionEncoding — consistent with location model.

Public surface now matches downstream imports. LGTM.

vm/src/stdlib/ast/argument.rs (3)

59-64: node_index defaulting is fine here

Setting node_index: Default::default() for newly constructed ruff::Arguments is acceptable given we don't rely on stable IDs downstream.


70-75: Ignoring node_index on split is appropriate

Discarding node_index when splitting preserves behavior and avoids leaking internal IDs to wrappers.


109-114: Consistent destructuring

Ignoring node_index on class-def split matches the function-call split. LGTM.

compiler/src/lib.rs (3)

46-55: Explicit UTF‑8 position encoding is correct

Using PositionEncoding::Utf8 for parse errors aligns python_location and linetable expectations. LGTM.


65-71: python_location stays 1‑based; consistent with user expectations

Returning (line.get(), character_offset.get()) keeps 1‑based reporting. LGTM.

Double-check any callers that expect zero-based columns; most human-facing errors expect 1-based, while linetable uses zero-based. Confirm no mismatch.


58-63: Code is correct; SourceLocation implements Copy

The review comment's concern is invalid. SourceLocation from ruff_source_file derives Copy, Clone, Eq, PartialEq, Ord, PartialOrd, Hash, Debug, making Option<SourceLocation> also Copy-compatible. The const fn location(&self) -> Option<SourceLocation> implementation at lines 58–63 is sound and requires no changes.

vm/src/stdlib/ast/other.rs (2)

59-63: Decorator node_index wiring: OK

Setting node_index: Default::default() and deriving range from expression is consistent. LGTM.


69-85: Alias ast_to_object ignores node_index as intended

Location is propagated via node_add_location; looks good.

vm/src/stdlib/ast/exception.rs (2)

37-43: ExceptHandlerExceptHandler: ignoring node_index on to_object is fine

Matches pattern across AST nodes. LGTM.


66-81: ExceptHandlerExceptHandler: node_index defaulting and location handling look good

Location is preserved via range_from_object and node_add_location. LGTM.

vm/src/stdlib/ast/parameter.rs (1)

8-15: node_index plumbing looks consistent

Ignoring on serialization and defaulting on deserialization aligns with the rest of the AST conversions.

Also applies to: 84-96, 107-116, 141-154, 162-171, 184-195, 335-350

vm/src/stdlib/ast/pattern.rs (1)

219-234: node_index defaulting across pattern nodes looks good

Consistent defaulting and ignoring in destructuring; location handling unchanged.

Also applies to: 241-250, 257-278, 286-302, 309-333, 341-372, 430-458, 463-498, 503-531

vm/src/stdlib/ast/constant.rs (1)

250-258: node_index propagation through constants: OK

Defaulting node_index on construction and ignoring on object round-trips is consistent and low-risk.

Also applies to: 260-270, 272-282, 284-288, 289-304, 305-328, 330-333, 335-340, 341-345

compiler/codegen/src/unparse.rs (1)

536-543: InterpolatedStringElement migration: looks solid

Updated signatures and traversal match the new AST shape.

Also applies to: 588-606

compiler/codegen/src/compile.rs (4)

150-162: Duplicate parameter check added — good, but verify coverage of vararg/kwarg

Logic is correct and fast. Ensure the iterator over Parameters includes:

  • posonly + args,
  • kwonly,
  • vararg,
  • kwarg.

Add tests that exercise duplicates across categories (e.g., def f(a, *a): ..., def f(*, a, a=1): ..., lambda a, a: ...) to confirm detection triggers everywhere.


1517-1529: Placement of duplicate-parameter validation is right

Running validation before emitting function/lambda code prevents partial codegen.

Also applies to: 4678-4683


5570-5591: Await detection in interpolated strings covers nested specs

Recursive walk through InterpolatedStringElement and nested format_spec looks correct.


254-269: Utf8-based SourceLocation use is consistent

Switched to PositionEncoding::Utf8 across error/reporting helpers; good.

Also applies to: 544-554, 5361-5369

vm/src/stdlib/ast/expression.rs (2)

686-688: parenthesized: true hard-coded; verify correctness.

Both ExprGenerator and ExprTuple set parenthesized: true unconditionally. This may misrepresent tuples without parens and certain generator contexts.

Please confirm against Ruff 0.14.1 AST semantics. If fidelity matters, derive this flag during (de)serialization rather than forcing true. I can help wire it from the parser tokens if exposed.

Also applies to: 1179-1181


149-154: node_index threading looks correct.

Destructuring with node_index: _ and defaulting it in ast_from_object is consistent and low-risk.

Also applies to: 173-174, 236-242, 262-263, 286-292, 309-310, 329-334, 352-353, 372-377, 398-399, 422-427, 468-469, 478-483, 498-499

vm/src/stdlib/ast/statement.rs (1)

354-365: node_index propagation and location setting are consistent.

Pattern matches introduce node_index and keep ranges via node_add_location(...). Looks good.

Also applies to: 385-397, 418-437, 465-489, 520-540, 570-597, 631-662, 709-729, 757-777, 788-814, 848-865, 889-906, 1001-1017, 1039-1053, 1072-1092, 1119-1132, 1151-1165, 1184-1198, 1217-1228, 1243-1253, 1269-1280

compiler/codegen/src/symboltable.rs (2)

22-23: UTF‑8 position encoding adoption is correct.

The switch to PositionEncoding::Utf8 on source_location(...) aligns locations with the new encoding model.

Also applies to: 1051-1053, 1607-1609


1271-1279: F‑string scanning updates look correct and comprehensive.

Using elements().filter_map(|x| x.as_interpolation()) and scanning nested format specs via interpolations() is aligned with the new representation.

Also applies to: 1310-1318, 1169-1176, 1186-1193, 1195-1213, 1241-1269

Comment on lines +4905 to +4907
Expr::TString(_) => {
return Err(self.error(CodegenErrorType::NotImplementedYet));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

TString currently unimplemented in compiler

Compiler rejects Expr::TString with NotImplementedYet while unparser prints a placeholder. Align behavior (either both reject or both support) to avoid surprises.

  • For now, consider a targeted SyntaxError with a message like “t-strings not yet supported” to improve UX.
🤖 Prompt for AI Agents
In compiler/codegen/src/compile.rs around lines 4905–4907, the match arm for
Expr::TString returns a generic NotImplementedYet error while the unparser
prints a placeholder; change this to return a targeted SyntaxError with a clear
message like "t-strings not yet supported" (include the original span/location
info when constructing the error so diagnostics point to the t-string site).
Replace the NotImplementedYet Err with a call that creates/returns a SyntaxError
carrying that message and the expression span so the compiler reports a
user-friendly syntax-level message.

Comment on lines +1281 to +1289
return Err(SymbolTableError {
error: "not yet implemented".into(),
location: Some(
self.source_file
.to_source_code()
.source_location(tstring.range.start(), PositionEncoding::Utf8),
),
});
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

T-string not implemented here: align behavior with VM conversion.

Returning a SymbolTableError with location is good. Ensure vm/src/stdlib/ast/expression.rs doesn’t panic on TString (it does today). The two layers should agree: either both error gracefully or both support t-strings.

Link the expression-layer change to this one (see my other comment) and add a test asserting a clean error is raised with a correct span.


I need to verify the review comment by examining the codebase. Let me investigate the t-string handling across the layers mentioned.


🏁 Script executed:

#!/bin/bash
# Search for TString handling in VM expression layer
echo "=== Searching for TString in vm/src/stdlib/ast/expression.rs ==="
rg -n "TString" vm/src/stdlib/ast/expression.rs -C 3

echo ""
echo "=== Searching for panic or unwrap near TString in VM layer ==="
rg -n "TString" vm/src/stdlib/ast/ -C 2 | head -50

Length of output: 1253


🏁 Script executed:

#!/bin/bash
# Check if vm/src/stdlib/ast/expression.rs exists and examine t-string handling
echo "=== Checking for t-string related code in compiler codegen ==="
rg -n "TString" compiler/codegen/src/ -C 2 | head -40

echo ""
echo "=== Search for error handling patterns in symboltable.rs ==="
rg -n "not yet implemented" compiler/codegen/src/symboltable.rs -B 5 -A 5

Length of output: 2657


🏁 Script executed:

#!/bin/bash
# Look for tests related to t-string error handling
echo "=== Searching for t-string tests ==="
fd -e rs | xargs rg -l "TString.*test|test.*TString" | head -20

echo ""
echo "=== Search for SymbolTableError tests ==="
fd -e rs | xargs rg -l "SymbolTableError" | grep test

Length of output: 79


Let me search for test files with different patterns and verify the full scope of the inconsistency:


🏁 Script executed:

#!/bin/bash
# Look for test files with different patterns
echo "=== Finding test directories ==="
fd -type d "test" | head -10

echo ""
echo "=== Search for integration or unit tests mentioning TString or SymbolTableError ==="
fd -name "*.rs" -path "*/test*" | xargs rg -l "TString|SymbolTableError" 2>/dev/null | head -10

echo ""
echo "=== Check for unimplemented macro usage related to TString ==="
rg -n "unimplemented|todo" vm/src/stdlib/ast/expression.rs -B 2 -A 1

Length of output: 1541


🏁 Script executed:

#!/bin/bash
# Verify the full context of TString handling in compile.rs
echo "=== TString handling in compile.rs (lines around 4905) ==="
rg -n "TString" compiler/codegen/src/compile.rs -B 3 -A 3

echo ""
echo "=== Check if unimplemented!() is the panic mechanism in VM ==="
rg -n "fn.*to_object" vm/src/stdlib/ast/expression.rs -A 2 | head -20

Length of output: 1584


Fix VM layer inconsistency: TString handling must error gracefully, not panic.

The review comment is accurate. VM layer panics via unimplemented!() on TString (vm/src/stdlib/ast/expression.rs:39), while codegen layer gracefully returns SymbolTableError with location (compiler/codegen/src/symboltable.rs:1281). Additionally, codegen compile.rs also uses graceful error handling with CodegenErrorType::NotImplementedYet (line 4906).

Critical actions required:

  • Replace VM's unimplemented!() with graceful error handling matching the codegen layer
  • Add test coverage verifying the error is raised with correct span
  • Link related expression-layer changes in commit/PR
🤖 Prompt for AI Agents
In vm/src/stdlib/ast/expression.rs around line 39 and
compiler/codegen/src/symboltable.rs lines 1281–1289, replace the panic-producing
unimplemented!() for TString with graceful error construction/return that
mirrors the codegen layer (use the same NotImplemented/NotImplementedYet error
variant and include the source location/span), update the VM function signature
to return a Result if needed and propagate the error instead of panicking, add a
unit/integration test that invokes the TString path and asserts the returned
error type and that the error contains the correct source span/location, and
mention/link this expression-layer change to the related compiler/codegen
commit/PR in your PR description so reviewers can see the coordinated fixes.

Comment on lines +368 to 371
Expr::TString(_) => self.p("t\"\"")?,
Expr::StringLiteral(ruff::ExprStringLiteral { value, .. }) => {
if value.is_unicode() {
self.p("u")?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Inconsistent TString support between unparse and compile

Unparser emits t"", but compile.rs returns NotImplementedYet for Expr::TString. This can break round-trips and tests.

  • Either gate TString in unparse behind the same feature/flag used in compile, or make unparse emit a clearer placeholder only when compile can also handle it.

Do we plan to land end-to-end t-string support in this PR or a follow-up?



---

🏁 Script executed:

```shell
#!/bin/bash
# Search for TString handling in compile.rs to verify the NotImplementedYet claim
rg -n "TString" compiler/codegen/src/compile.rs -A 3 -B 3

Length of output: 658


🏁 Script executed:

#!/bin/bash
# Get fuller context around unparse.rs lines 368-371
sed -n '360,380p' compiler/codegen/src/unparse.rs

Length of output: 895


🏁 Script executed:

#!/bin/bash
# Search for all TString references in unparse.rs to understand the pattern
rg -n "TString" compiler/codegen/src/unparse.rs -B 2 -A 5

Length of output: 419


🏁 Script executed:

#!/bin/bash
# Check for feature gates or conditional compilation around TString
rg -n "cfg|feature" compiler/codegen/src/unparse.rs | head -20

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Search for any TODO or NotImplementedYet comments related to TString
rg -n "NotImplementedYet|TODO.*[Tt]String" compiler/codegen/src/ -A 2 -B 2

Length of output: 1369



---

🏁 Script executed:

```shell
#!/bin/bash
# Get context around the interpolated_string_contains_await TString handling
sed -n '5550,5570p' compiler/codegen/src/compile.rs

Length of output: 766


🏁 Script executed:

#!/bin/bash
# Check if there are other places where TString is matched and NOT returning NotImplementedYet
rg -n "Expr::TString" compiler/codegen/src/compile.rs -B 1 -A 2

Length of output: 332


🏁 Script executed:

#!/bin/bash
# Search for any comments around TString implementation status
rg -n "TString|t-string|t string" compiler/codegen/src/ -i -B 2 -A 2 | head -50

Length of output: 2870


🏁 Script executed:

#!/bin/bash
# Check git history or comments for TString to understand if this is new work
git log --oneline -n 30 -- compiler/codegen/src/unparse.rs compiler/codegen/src/compile.rs 2>/dev/null | head -20

Length of output: 911


Clarify TString implementation status and gate inconsistent behavior

The inconsistency is confirmed:

  • compile.rs line 4905 rejects Expr::TString with NotImplementedYet
  • symboltable.rs line 1280 also blocks TString
  • unparse.rs line 368 emits t"" without guards
  • Yet compile.rs line 5557 partially handles TString in contains_await

This creates an asymmetry: unparsed output (t"") cannot be recompiled. Either gate TString emission in unparse to match the NotImplementedYet state in compile, or complete TString support across all code paths. Clarify whether end-to-end TString support is in scope for this PR or deferred to follow-up work.

🤖 Prompt for AI Agents
In compiler/codegen/src/unparse.rs around lines 368 to 371, unparse emits a
TString as t"" even though compile.rs (around line 4905) and symboltable.rs
(around line 1280) reject TString via NotImplementedYet and other code paths
only partially handle it; update unparse to match the current compiler state by
gating TString emission: detect Expr::TString and either return a
NotImplementedYet error/result (or emit a clear placeholder comment) instead of
producing t"" so emission and compilation are symmetric; alternatively, if you
intend full support in this PR, implement consistent handling across compile.rs,
symboltable.rs, and contains_await so unparse output can be recompiled—pick one
approach and apply the corresponding change consistently.

Comment on lines +1199 to 1207
let line_digits = (3).max(self.locations.last().unwrap().line.digits().get());
let offset_digits = (4).max(1 + self.instructions.len().ilog10() as usize);
let mut last_line = OneIndexed::MAX;
let mut arg_state = OpArgState::default();
for (offset, &instruction) in self.instructions.iter().enumerate() {
let (instruction, arg) = arg_state.get(instruction);
// optional line number
let line = self.locations[offset].row;
let line = self.locations[offset].line;
if line != last_line {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Avoid panics in disassembly width calc; keep correct widths.

Two edge cases can panic:

  • self.instructions.len().ilog10() when len == 0.
  • self.locations.last().unwrap() when locations is empty.

Also, offset width should reflect max offset (len-1).

Apply a safe, minimal change:

-        let line_digits = (3).max(self.locations.last().unwrap().line.digits().get());
-        let offset_digits = (4).max(1 + self.instructions.len().ilog10() as usize);
+        let line_digits = (3).max(
+            self.locations
+                .last()
+                .map(|loc| loc.line.digits().get())
+                .unwrap_or(1),
+        );
+        let instrs = self.instructions.len();
+        let max_offset = instrs.saturating_sub(1);
+        let offset_digits = (4).max(
+            1 + if max_offset == 0 { 0 } else { max_offset.ilog10() as usize }
+        );

Change to line accessor is correct; keep it.

🤖 Prompt for AI Agents
In compiler/core/src/bytecode.rs around lines 1199–1207, avoid panics when
computing column widths: replace the direct unwrap of self.locations.last() with
a safe fallback (e.g., use self.locations.last().map(|l|
l.line.digits().get()).unwrap_or(1) and then take max with 3), and avoid calling
ilog10 on zero by computing offset_digits from the greatest offset (if
instructions.len() == 0 then use 1, otherwise compute 1 + (instructions.len() -
1).ilog10() as usize) and then take max with 4; keep the existing line variable
usage unchanged.

Comment on lines 173 to 175
pub fn current_location(&self) -> SourceLocation {
self.code.locations[self.lasti() as usize - 1].clone()
self.code.locations[self.lasti() as usize - 1]
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Guard against lasti==0 to avoid underflow/panic.

current_location indexes lasti()-1; when lasti==0 (pre-execution), this will underflow.

Apply:

-    pub fn current_location(&self) -> SourceLocation {
-        self.code.locations[self.lasti() as usize - 1]
-    }
+    pub fn current_location(&self) -> SourceLocation {
+        let idx = (self.lasti()).saturating_sub(1) as usize;
+        self.code.locations[idx]
+    }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
pub fn current_location(&self) -> SourceLocation {
self.code.locations[self.lasti() as usize - 1].clone()
self.code.locations[self.lasti() as usize - 1]
}
pub fn current_location(&self) -> SourceLocation {
let idx = (self.lasti()).saturating_sub(1) as usize;
self.code.locations[idx]
}
🤖 Prompt for AI Agents
In vm/src/frame.rs around lines 173–175, current_location() subtracts 1 from
lasti() and will underflow when lasti()==0; guard by checking if self.lasti() ==
0 and return a safe fallback (for example SourceLocation::default() or a clone
of self.code.locations[0] if you prefer the first location), otherwise compute
index = (self.lasti() as usize) - 1 and return self.code.locations[index]; also
handle the case where self.code.locations is empty to avoid panics.

Comment on lines +39 to 41
Self::TString(_) => unimplemented!(),
Self::BytesLiteral(cons) => constant::bytes_literal_to_object(vm, source_file, cons),
Self::BooleanLiteral(cons) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Don't panic on TString in Expr::ast_to_object; handle or degrade gracefully.

unimplemented!() will abort the VM if a t-string is encountered. Prefer converting to a Python AST node or returning a well-formed error consistently with the rest of the pipeline.

Apply this change here and add a handler in string.rs:

-            Self::TString(_) => unimplemented!(),
+            Self::TString(cons) => string::tstring_to_object(vm, source_file, cons),

And add a minimal adapter (outside this hunk) to keep behavior aligned with FString until full support lands:

// vm/src/stdlib/ast/string.rs
pub(crate) fn tstring_to_object(
    vm: &VirtualMachine,
    source_file: &SourceFile,
    cons: ruff::ExprTString,
) -> PyObjectRef {
    // Temporary: mirror f-string lowering for interpolations.
    // TODO: Replace with true t-string semantics.
    fstring_to_object(vm, source_file, cons.into_fstring_like())
}

If into_fstring_like() doesn't exist, implement a best-effort translation or raise a Python SyntaxError object and return a sentinel node, but avoid a panic.

🤖 Prompt for AI Agents
In vm/src/stdlib/ast/expression.rs around lines 39 to 41, replace the
panic-producing Self::TString(_) => unimplemented!() with a non-panicking
handler that either converts the t-string into a Python AST node or returns a
well-formed PyError consistent with other branches; call a new helper
vm/src/stdlib/ast/string.rs::tstring_to_object(vm, source_file, cons) instead of
unimplemented!. Add the tstring_to_object adapter in vm/src/stdlib/ast/string.rs
that mirrors f-string lowering for now (calling fstring_to_object on a
best-effort into_fstring_like translation), and if no into_fstring_like exists
implement a best-effort transformation or construct and return a Python
SyntaxError object (not panic) so the VM degrades gracefully until full t-string
support is implemented.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
compiler/codegen/src/compile.rs (1)

4903-4908: Return a targeted SyntaxError for TString, not NotImplementedYet.

Users should get a clear syntax error like “t-strings not yet supported” with the expression location, instead of a generic NotImplemented. Replace the Err accordingly.

-            Expr::TString(_) => {
-                return Err(self.error(CodegenErrorType::NotImplementedYet));
-            }
+            Expr::TString(_) => {
+                return Err(self.error(CodegenErrorType::SyntaxError(
+                    "t-strings not yet supported".to_owned(),
+                )));
+            }

This aligns compiler behavior with user expectations and prior review guidance.

🧹 Nitpick comments (2)
compiler/codegen/src/compile.rs (2)

3600-3633: Duplicate mapping-key guard is good; tiny polish.

Using HashSet on UnparseExpr for literal keys prevents duplicate-key patterns. Optional: pre-allocate seen with keys.len() for small perf wins and move self.compile_expression(key)? behind validation to fail fast.


5529-5567: Optional: check dict values in contains_await for completeness.

contains_await for Expr::Dict only inspects keys; consider also checking values to keep symmetry, even if current call-sites already handle dictcomp specially.

-            Expr::Dict(ExprDict { items, .. }) => items
-                .iter()
-                .flat_map(|item| &item.key)
-                .any(Self::contains_await),
+            Expr::Dict(ExprDict { items, .. }) => items.iter().any(|item| {
+                item.key.as_deref().is_some_and(Self::contains_await)
+                    || Self::contains_await(&item.value)
+            }),
```<!-- review_comment_end -->

<!-- file_end -->

</blockquote></details>

</blockquote></details>

<details>
<summary>📜 Review details</summary>

**Configuration used**: Path: .coderabbit.yml

**Review profile**: CHILL

**Plan**: Pro

<details>
<summary>📥 Commits</summary>

Reviewing files that changed from the base of the PR and between 0a487413c1bad7ab03811b149fef2ed1348bcc45 and fb2dc3b5bfa4d6c8df765f1e1f0e3af77b003451.

</details>

<details>
<summary>📒 Files selected for processing (1)</summary>

* `compiler/codegen/src/compile.rs` (19 hunks)

</details>

<details>
<summary>🧰 Additional context used</summary>

<details>
<summary>📓 Path-based instructions (1)</summary>

<details>
<summary>**/*.rs</summary>


**📄 CodeRabbit inference engine (.github/copilot-instructions.md)**

> `**/*.rs`: Format Rust code with the default rustfmt style (run `cargo fmt`)
> Run clippy and fix any warnings or lints introduced by your changes
> Follow Rust best practices for error handling and memory management

Files:
- `compiler/codegen/src/compile.rs`

</details>

</details>

</details>

<details>
<summary>⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)</summary>

* GitHub Check: Run snippets and cpython tests (windows-latest)
* GitHub Check: Run snippets and cpython tests (macos-latest)
* GitHub Check: Run rust tests (ubuntu-latest)
* GitHub Check: Run rust tests (windows-latest)
* GitHub Check: Run rust tests (macos-latest)

</details>

<details>
<summary>🔇 Additional comments (4)</summary><blockquote>

<details>
<summary>compiler/codegen/src/compile.rs (4)</summary><blockquote>

`1517-1529`: **Call-site integration for duplicate-parameter checks looks right.**

Validating before compiling the function body is the correct timing. After fixing the validator, this should emit a SyntaxError early with proper location via `self.error(...)`. 


Add a small compile-time test that asserts a duplicate-parameters function raises a SyntaxError with the function span (not a generic internal error).<!-- review_comment_end -->

---

`4679-4682`: **Good: Lambda parameters also validated.**

Catching duplicates in lambdas keeps behavior consistent with functions.
<!-- review_comment_end -->

---

`253-269`: **Source location updates to PositionEncoding::Utf8 LGTM.**

Switching to `PositionEncoding::Utf8` and using `line`/`character_offset` is consistent and improves diagnostics. 
<!-- review_comment_end -->



Also applies to: 548-552, 5361-5369

---

`5570-5591`: **Nice: await detection in interpolated strings (f/t-strings).**

The recursive walk over `InterpolatedStringElement` and nested format specs is correct and covers tricky cases. Consider adding a mirror test for `TString` like the existing `test_fstring_contains_await`. 

<!-- review_comment_end -->

</blockquote></details>

</blockquote></details>

</details>

<!-- This is an auto-generated comment by CodeRabbit for review status -->

Comment on lines +150 to +162
fn validate_duplicate_params(params: &Parameters) -> Result<(), CodegenErrorType> {
let mut seen_params = HashSet::new();
for param in params {
let param_name = param.name().as_str();
if !seen_params.insert(param_name) {
return Err(CodegenErrorType::SyntaxError(format!(
r#"Duplicate parameter "{param_name}""#
)));
}
}

Ok(())
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Duplicate-parameter validation is incomplete and likely incorrect; iterate explicit param groups.

  • Parameters typically doesn’t implement IntoIterator, and items don’t have name(); this likely won’t compile.
  • Logic also misses vararg, kwarg, and kwonlyargs names; duplicates across groups won’t be caught.

Fix by iterating all parameter kinds explicitly:

-fn validate_duplicate_params(params: &Parameters) -> Result<(), CodegenErrorType> {
-    let mut seen_params = HashSet::new();
-    for param in params {
-        let param_name = param.name().as_str();
-        if !seen_params.insert(param_name) {
-            return Err(CodegenErrorType::SyntaxError(format!(
-                r#"Duplicate parameter "{param_name}""#
-            )));
-        }
-    }
-    Ok(())
-}
+fn validate_duplicate_params(params: &Parameters) -> Result<(), CodegenErrorType> {
+    let mut seen: HashSet<String> = HashSet::new();
+
+    let mut check = |ident: &Identifier| {
+        let name = ident.as_str();
+        if !seen.insert(name.to_owned()) {
+            Err(CodegenErrorType::SyntaxError(format!(
+                r#"Duplicate parameter "{name}""#
+            )))
+        } else {
+            Ok(())
+        }
+    };
+
+    // posonly + args
+    for a in &params.posonlyargs {
+        check(&a.parameter.name)?
+    }
+    for a in &params.args {
+        check(&a.parameter.name)?
+    }
+    // *vararg
+    if let Some(vararg) = params.vararg.as_deref() {
+        check(&vararg.name)?
+    }
+    // kwonly
+    for a in &params.kwonlyargs {
+        check(&a.parameter.name)?
+    }
+    // **kwarg
+    if let Some(kwarg) = params.kwarg.as_deref() {
+        check(&kwarg.name)?
+    }
+    Ok(())
+}

Please run clippy to ensure there are no lifetime or borrow checker complaints and add unit tests that cover duplicates across groups (e.g., def f(a, /, a): ..., def f(*a, a): ..., def f(a, *, a): ...).


I'll verify the review comment by examining the codebase structure and the Parameters type. Let me generate verification scripts.


---

🏁 Script executed:

```shell
#!/bin/bash
# Find imports related to Parameters in the file
echo "=== Checking imports in compile.rs ==="
head -100 compiler/codegen/src/compile.rs | grep -E "use|from" | head -20

Length of output: 341


🏁 Script executed:

#!/bin/bash
# Search for Parameters struct definition
echo "=== Searching for Parameters struct definition ==="
rg -n "struct Parameters" --type rs

Length of output: 77


🏁 Script executed:

#!/bin/bash
# Search for how Parameters is used in the codebase
echo "=== Checking Parameters usage patterns ==="
rg -n "for.*in.*params" compiler/codegen/src/ --type rs -A 2 -B 2 | head -40

Length of output: 69


🏁 Script executed:

#!/bin/bash
# Look for existing parameter validation or iteration
echo "=== Searching for validate or check parameter patterns ==="
rg -n "validate.*param|check.*param|posonlyargs|kwonlyargs" compiler/codegen/src/ --type rs -C 2 | head -50

Length of output: 85


🏁 Script executed:

#!/bin/bash
# Check the exact definition and fields of Parameters
echo "=== Searching for Parameters definition with field structure ==="
rg -A 15 "pub struct Parameters" --type rs

Length of output: 91



---

🏁 Script executed:

```shell
#!/bin/bash
# Check lines 150-162 directly
echo "=== Viewing lines 150-162 of compile.rs ==="
sed -n '150,162p' compiler/codegen/src/compile.rs

Length of output: 453


🏁 Script executed:

#!/bin/bash
# Search for Parameters usage without type filter
echo "=== Searching for Parameters in compile.rs ==="
grep -n "Parameters" compiler/codegen/src/compile.rs | head -20

Length of output: 557


🏁 Script executed:

#!/bin/bash
# Search for posonlyargs, kwonlyargs to see if they exist in the codebase
echo "=== Searching for parameter field patterns ==="
grep -rn "posonlyargs\|kwonlyargs\|vararg" compiler/codegen/src/ | head -30

Length of output: 2396


🏁 Script executed:

#!/bin/bash
# List Python AST related files to understand the structure
echo "=== Checking for Parameters-related code patterns ==="
grep -n "fn validate_duplicate_params\|\.args\|\.posonlyargs" compiler/codegen/src/compile.rs

Length of output: 916


🏁 Script executed:

#!/bin/bash
# Check Cargo.toml to understand dependencies
echo "=== Checking ruff_python_ast dependency ==="
grep -A 5 -B 2 "ruff_python_ast\|ruff-python-ast" Cargo.toml

Length of output: 415



---

🏁 Script executed:

```shell
#!/bin/bash
# Look at how parameter items are accessed to understand their structure
echo "=== Checking parameter item access patterns ==="
grep -n "\.parameter\." compiler/codegen/src/compile.rs | head -20

Length of output: 47


🏁 Script executed:

#!/bin/bash
# Look for how vararg and kwarg are accessed
echo "=== Checking vararg/kwarg access patterns ==="
grep -B 2 -A 2 "parameters\.vararg\|parameters\.kwarg" compiler/codegen/src/compile.rs | head -40

Length of output: 660


🏁 Script executed:

#!/bin/bash
# Check symboltable.rs which seems to handle parameters
echo "=== Checking symboltable.rs parameter handling ==="
grep -B 2 -A 4 "scan_parameters\|\.vararg\|\.kwarg" compiler/codegen/src/symboltable.rs | head -50

Length of output: 1112


🏁 Script executed:

#!/bin/bash
# Search for type definitions or trait implementations for parameter types
echo "=== Searching for parameter item type and name access ==="
grep -n "\.name()\|\.name\s*=" compiler/codegen/src/compile.rs | head -20

Length of output: 817



---

🏁 Script executed:

```shell
#!/bin/bash
# Check if validate_duplicate_params is called
echo "=== Searching for calls to validate_duplicate_params ==="
grep -n "validate_duplicate_params" compiler/codegen/src/compile.rs

Length of output: 316


🏁 Script executed:

#!/bin/bash
# Try to locate any compilation or test activity involving this function
echo "=== Searching across codebase for validate_duplicate_params ==="
grep -rn "validate_duplicate_params" compiler/

Length of output: 419


Code will not compile—Parameters doesn't implement IntoIterator; duplicate validation is incomplete.

The function at lines 150-162 has two blocking issues:

  1. Iterator failure: for param in params will not compile; Parameters does not implement IntoIterator. The codebase accesses parameter fields explicitly (.posonlyargs, .args, .kwonlyargs, .vararg, .kwarg).

  2. Logic gap: Missing duplicate detection across parameter groups. Only iterating the first group would miss duplicates like def f(a, /, a): or def f(*a, a):.

The function is actively called at lines 1518 and 4679. Fix by iterating all parameter field groups explicitly, following the codebase pattern:

fn validate_duplicate_params(params: &Parameters) -> Result<(), CodegenErrorType> {
    let mut seen: HashSet<String> = HashSet::new();

    let mut check = |ident: &Identifier| {
        let name = ident.as_str();
        if !seen.insert(name.to_owned()) {
            Err(CodegenErrorType::SyntaxError(format!(
                r#"Duplicate parameter "{name}""#
            )))
        } else {
            Ok(())
        }
    };

    for arg in &params.posonlyargs {
        check(&arg.parameter.name)?
    }
    for arg in &params.args {
        check(&arg.parameter.name)?
    }
    if let Some(vararg) = params.vararg.as_deref() {
        check(&vararg.name)?
    }
    for arg in &params.kwonlyargs {
        check(&arg.parameter.name)?
    }
    if let Some(kwarg) = params.kwarg.as_deref() {
        check(&kwarg.name)?
    }
    Ok(())
}

Run cargo fmt and cargo clippy to validate the fix, then add tests covering cross-group duplicates.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
fn validate_duplicate_params(params: &Parameters) -> Result<(), CodegenErrorType> {
let mut seen_params = HashSet::new();
for param in params {
let param_name = param.name().as_str();
if !seen_params.insert(param_name) {
return Err(CodegenErrorType::SyntaxError(format!(
r#"Duplicate parameter "{param_name}""#
)));
}
}
Ok(())
}
fn validate_duplicate_params(params: &Parameters) -> Result<(), CodegenErrorType> {
let mut seen: HashSet<String> = HashSet::new();
let mut check = |ident: &Identifier| {
let name = ident.as_str();
if !seen.insert(name.to_owned()) {
Err(CodegenErrorType::SyntaxError(format!(
r#"Duplicate parameter "{name}""#
)))
} else {
Ok(())
}
};
for arg in &params.posonlyargs {
check(&arg.parameter.name)?
}
for arg in &params.args {
check(&arg.parameter.name)?
}
if let Some(vararg) = params.vararg.as_deref() {
check(&vararg.name)?
}
for arg in &params.kwonlyargs {
check(&arg.parameter.name)?
}
if let Some(kwarg) = params.kwarg.as_deref() {
check(&kwarg.name)?
}
Ok(())
}
🤖 Prompt for AI Agents
In compiler/codegen/src/compile.rs around lines 150 to 162, the
validate_duplicate_params function currently attempts to iterate `params`
directly (which doesn't implement IntoIterator) and doesn't check every
parameter group, so it won't compile and can miss duplicates across groups;
update the function to explicitly iterate the parameter collections used
elsewhere in the codebase (e.g., params.posonlyargs, params.args,
params.kwonlyargs) and also check the optional params.params.vararg and
params.params.kwarg (if present) by extracting their names, inserting each name
into a HashSet and returning the same CodegenErrorType::SyntaxError when insert
fails for a duplicate; after making the change, run cargo fmt and cargo clippy
and add unit tests that assert detection of duplicates across groups (e.g.,
posonly vs args, vararg vs normal arg) to cover the previously missed cases.

@youknowone youknowone merged commit fcf1969 into RustPython:main Oct 20, 2025
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants