Skip to content

Conversation

@youknowone
Copy link
Member

@youknowone youknowone commented Dec 16, 2025

There are currently some code paths that use __new__ and __init__ directly instead of going through the Constructor/Initializer. I plan to change these so that they raise an error. Before doing that, we’ll need to remove any code that actually relies on this behavior.

close #6448

Summary by CodeRabbit

  • Refactor
    • Centralized and reworked object and exception initialization to a trait-based initializer/constructor system.
    • Many built-ins and exceptions now use the new initialization pathway (no visible behavior change; construction semantics preserved).
    • Some previous public init implementations were removed and replaced by the new initializer-based flow (e.g., Cursor), preserving external behavior.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 16, 2025

Walkthrough

The PR refactors construction/initialization wiring: macros now emit and respect with(Constructor/Initializer) items, and many types move from inline init/pyslot handlers to explicit Constructor/Initializer trait implementations; derive macro gains trait-aware code generation and validation.

Changes

Cohort / File(s) Summary
Macro generation logic
crates/derive-impl/src/pyclass.rs
Add is_trait: bool to ImplContext; track with_items in impl_pyexception_impl; conditionally omit or generate slot_new/slot_init wrappers based on with(Constructor) / with(Initializer); emit with(#(#with_items),*) in generated pyclass attribute; validate/disallow __new__/__init__ as pymethods in non-trait impls.
Exception types — Initializer/Constructor migration
crates/vm/src/exceptions.rs, crates/vm/src/exception_group.rs
Replace inline slot_init/slot-based constructors with #[pyexception(with(Initializer))] (and with(Constructor) where applicable) and add impl Initializer / impl Constructor blocks implementing slot_init/slot_new and unreachable init/py_new placeholders; move some methods (e.g., __reduce__) to normal PyMethods.
Builtins / base object changes
crates/vm/src/builtins/object.rs
Add Initializer to PyBaseObject pyclass; implement Initializer with slot_init (noop) and unreachable init; remove prior no-op __init__; rename/refactor __class__ setter to set___class__ with #[pygetset(setter)]; update imports to include PyRef and Initializer.
Stdlib types — sqlite & ast
crates/stdlib/src/sqlite.rs, crates/vm/src/stdlib/ast/python.rs
Cursor pyclass gains Initializer; remove public __init__ and add impl Initializer for Cursor performing inner initialization. NodeAst moves to #[pyclass(with(Constructor, Initializer), ...)], adds impl Initializer with slot_init for field binding, removes __init__, and makes slot_new delegate to slot_init.
Small import update
crates/stdlib/src/ssl/error.rs
Expand imports from types::Constructor to types::{Constructor, Initializer}.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Attention areas:
    • crates/derive-impl/src/pyclass.rs — macro generation correctness, with_items handling, emitted tokens.
    • Exception conversions — verify argument parsing, preservation of prior semantics (esp. Unicode/OSError families).
    • PyBaseObject setter renaming and its effect on getset machinery.
    • sqlite Cursor and NodeAst initializer behaviors to ensure no regressions in construction paths.

Possibly related PRs

Suggested reviewers

  • ShaharNaveh

Poem

🐰 I hopped from init into traits so neat,

With Constructor and Initializer, constructors meet.
The macro learned to mind each with(...) call,
Exceptions lined up tidy, one and all.
A happy hop — build passes, carrots for all! 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title precisely captures the main objective: disallowing direct calls to new and init methods, which is the core change across multiple files in this PR.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@youknowone youknowone force-pushed the disallow-new-init branch 3 times, most recently from 103206d to 88f591d Compare December 21, 2025 15:13
@youknowone youknowone marked this pull request as ready for review December 21, 2025 15:46
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/vm/src/exceptions.rs (1)

2026-2116: PyIncompleteInputError::slot_init discards SyntaxError location args

PySyntaxError::slot_init uses the second positional argument (a location tuple) to populate filename, lineno, offset, text, end_lineno, and end_offset, which are then consumed by write_syntaxerror and PySyntaxError.__str__.

PyIncompleteInputError is a subclass of PySyntaxError, but its current slot_init only sets:

zelf.set_attr("name", vm.ctx.new_str("SyntaxError"), vm)?;

and ignores the original FuncArgs, meaning _IncompleteInputError instances won’t get their location fields initialized even if they are constructed with the standard (msg, location_tuple) arguments. That likely breaks formatting and tracebacks for incomplete‑input errors.

To preserve SyntaxError semantics while still forcing the name, you can delegate to PySyntaxError::slot_init after setting name, e.g.:

Proposed fix delegating to `PySyntaxError::slot_init`
-    impl Initializer for PyIncompleteInputError {
-        type Args = FuncArgs;
-
-        fn slot_init(zelf: PyObjectRef, _args: FuncArgs, vm: &VirtualMachine) -> PyResult<()> {
-            zelf.set_attr("name", vm.ctx.new_str("SyntaxError"), vm)?;
-            Ok(())
-        }
+    impl Initializer for PyIncompleteInputError {
+        type Args = FuncArgs;
+
+        fn slot_init(zelf: PyObjectRef, args: FuncArgs, vm: &VirtualMachine) -> PyResult<()> {
+            // Force the public name while still populating all SyntaxError fields.
+            zelf.set_attr("name", vm.ctx.new_str("SyntaxError"), vm)?;
+            PySyntaxError::slot_init(zelf, args, vm)
+        }
@@
-        fn init(_zelf: PyRef<Self>, _args: Self::Args, _vm: &VirtualMachine) -> PyResult<()> {
-            unreachable!("slot_init is defined")
-        }
+        fn init(_zelf: PyRef<Self>, _args: Self::Args, _vm: &VirtualMachine) -> PyResult<()> {
+            unreachable!("slot_init is defined")
+        }
     }

This keeps _IncompleteInputError fully aligned with SyntaxError’s initialization while still branding it as a separate type.

Also applies to: 2127-2141

🧹 Nitpick comments (4)
crates/derive-impl/src/pyclass.rs (1)

736-744: Consider using iterator methods for more idiomatic code.

The with_contains helper manually iterates and compares. This can be simplified using iterator methods.

🔎 Suggested refactor
-    let with_contains = |with_items: &[Ident], s: &str| {
-        // Check if Constructor is in the list
-        for ident in with_items {
-            if ident.to_string().as_str() == s {
-                return true;
-            }
-        }
-        false
-    };
+    let with_contains = |with_items: &[Ident], s: &str| {
+        with_items.iter().any(|ident| ident == s)
+    };
crates/vm/src/stdlib/ast/python.rs (1)

5-10: NodeAst: custom slot_new + Initializer look good; please confirm slot_init isn’t run twice

The new Initializer impl and slot_init body for NodeAst are clean and mirror the existing _fields-based initializer pattern from io.rs, which is a nice move away from calling __init__ directly.

One thing to double‑check: NodeAst is now declared with with(Constructor, Initializer) and slot_new manually calls Self::slot_init(zelf.clone(), args, vm). If the type-call path for pyclasses with Initializer also invokes the tp_init/slot_init hook after slot_new, this could result in initialization happening twice per construction. If the runtime only calls slot_new for this type (and not tp_init), a short comment here clarifying that assumption would help future readers.

Optionally, since this slot_init is identical to the one used elsewhere for _fields-driven AST-like types, consider a shared helper to avoid the two implementations drifting over time.

Also applies to: 16-18, 24-27, 41-45, 52-88

crates/stdlib/src/sqlite.rs (1)

833-885: Connection/Cursor Initializer wiring looks sound; consider CPython parity for direct Cursor instantiation

The new Initializer for Connection correctly handles re‑initialization: it flips initialized to false, resets factories, drops any existing db, opens a new one, and only then repopulates the connection state and marks it initialized again. That matches the comment’s intent of leaving a failed reinit in an uninitialized/ProgrammingError state.

For Cursor, the combination of:

  • py_new returning new_uninitialized(connection, vm) (inner = None), and
  • Initializer::init lazily creating a default CursorInner if missing,

covers both subclassed cursors (via the type call path) and the internal Connection.cursor() construction path (which uses Cursor::new and never needs Initializer). That looks consistent with how check_cursor_state and Base Cursor.__init__ not called. are used.

One behavior worth explicitly confirming is the niche case of sqlite3.Cursor(conn) from Python: Initializer::Args = FuncArgs and init ignores its arguments, so argument count/type for that call is effectively unchecked and row_factory won’t be wired from the connection the way Connection.cursor() does. If CPython parity there matters, you might want to parse (PyRef<Connection>,) in init and/or error on invalid calls; otherwise, the current, simpler behavior is fine.

Also applies to: 902-937, 939-944, 1543-1572, 1574-1601, 1922-1932, 1934-1953

crates/vm/src/exception_group.rs (1)

42-47: PyBaseExceptionGroup slot_new looks correct; consider rejecting unexpected keyword args

The new Constructor::slot_new for PyBaseExceptionGroup does a nice job of mirroring CPython’s rules: strict two‑positional‑arg arity, type checks for the message and exceptions sequence, per‑item BaseException validation, and the ExceptionGroup vs BaseExceptionGroup vs subclass selection with appropriate nesting restrictions.

One behavioral edge case: args.kwargs is currently ignored, so something like BaseExceptionGroup("msg", excs, foo=1) would succeed here even though CPython treats unexpected keyword arguments to __new__ as a TypeError. If you want to match CPython more closely, you could add an early if !args.kwargs.is_empty() check returning a TypeError (and keep slot_init as the documented no‑op).

Also applies to: 53-63, 244-370

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8b3bf55 and 0d791f9.

📒 Files selected for processing (7)
  • crates/derive-impl/src/pyclass.rs (7 hunks)
  • crates/stdlib/src/sqlite.rs (2 hunks)
  • crates/stdlib/src/ssl/error.rs (1 hunks)
  • crates/vm/src/builtins/object.rs (4 hunks)
  • crates/vm/src/exception_group.rs (2 hunks)
  • crates/vm/src/exceptions.rs (14 hunks)
  • crates/vm/src/stdlib/ast/python.rs (3 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • crates/stdlib/src/ssl/error.rs
  • crates/derive-impl/src/pyclass.rs
  • crates/vm/src/exception_group.rs
  • crates/vm/src/stdlib/ast/python.rs
  • crates/stdlib/src/sqlite.rs
  • crates/vm/src/builtins/object.rs
  • crates/vm/src/exceptions.rs
🧠 Learnings (3)
📓 Common learnings
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T12:17:28.606Z
Learning: Applies to **/*.py : In most cases, Python code should not be edited; bug fixes should be made through Rust code modifications only
📚 Learning: 2025-11-29T12:17:28.606Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T12:17:28.606Z
Learning: Applies to **/*.rs : Use the macro system (`pyclass`, `pymodule`, `pyfunction`, etc.) when implementing Python functionality in Rust

Applied to files:

  • crates/derive-impl/src/pyclass.rs
📚 Learning: 2025-12-09T08:46:58.660Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6358
File: crates/vm/src/exception_group.rs:173-185
Timestamp: 2025-12-09T08:46:58.660Z
Learning: In crates/vm/src/exception_group.rs, the derive() method intentionally always creates a BaseExceptionGroup instance rather than preserving the original exception class type. This is a deliberate design decision that differs from CPython's behavior.

Applied to files:

  • crates/vm/src/exception_group.rs
  • crates/vm/src/exceptions.rs
🧬 Code graph analysis (4)
crates/vm/src/exception_group.rs (1)
crates/vm/src/exceptions.rs (7)
  • slot_new (710-717)
  • slot_new (1645-1666)
  • args (568-570)
  • args (574-574)
  • class (45-47)
  • new (547-555)
  • new (1061-1063)
crates/vm/src/stdlib/ast/python.rs (2)
crates/vm/src/builtins/object.rs (7)
  • slot_init (121-123)
  • class (26-28)
  • class (399-401)
  • class (403-405)
  • value (470-470)
  • init (125-127)
  • init (557-559)
crates/vm/src/types/slot.rs (1)
  • slot_init (947-972)
crates/vm/src/builtins/object.rs (2)
crates/vm/src/stdlib/io.rs (2)
  • slot_init (1465-1468)
  • vm (3524-3526)
crates/vm/src/types/slot.rs (1)
  • slot_init (947-972)
crates/vm/src/exceptions.rs (3)
crates/vm/src/builtins/object.rs (6)
  • slot_init (121-123)
  • init (125-127)
  • init (557-559)
  • class (26-28)
  • class (399-401)
  • class (403-405)
crates/vm/src/exception_group.rs (5)
  • slot_init (354-365)
  • zelf (212-213)
  • init (367-369)
  • obj (195-195)
  • obj (384-385)
crates/vm/src/types/slot.rs (1)
  • slot_init (947-972)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Run snippets and cpython tests (windows-latest)
🔇 Additional comments (7)
crates/stdlib/src/ssl/error.rs (1)

10-10: The Initializer import is correctly used by the #[pyexception] macro for generating exception trait implementations. Compilation succeeds without clippy warnings, confirming the import is necessary.

crates/derive-impl/src/pyclass.rs (4)

66-66: LGTM: is_trait field addition.

The new is_trait field correctly distinguishes between trait and impl contexts, enabling conditional validation downstream.


236-239: LGTM: Trait context initialization.

Correctly sets is_trait = true for trait implementations while defaulting other fields.


906-921: LGTM: Validation logic correctly enforces the new restrictions.

The validation properly:

  • Only applies to non-trait impl blocks (line 908)
  • Provides clear, actionable error messages directing users to the correct approach
  • Aligns with the PR objective to disallow direct __new__ and __init__ method definitions

817-824: LGTM: Correct generation of pyclass attribute with trait information.

The generated #[pyclass] attribute correctly includes all with_items, ensuring that Constructor/Initializer traits are properly registered when auto-generated.

crates/vm/src/builtins/object.rs (1)

5-10: PyBaseObject Initializer + __class__ setter refactor LGTM

The no‑op Initializer for PyBaseObject is consistent with object.__init__, and wiring it via with(Constructor, Initializer) integrates cleanly with the new initialization model. The reworked __class__ getter/setter using PyObjectRef and the mutable/module checks preserves the previous semantics while generalizing the setter shape; no issues spotted here.

Also applies to: 118-128, 250-252, 459-462, 464-498

crates/vm/src/exceptions.rs (1)

1374-1387: Initializer implementations for StopIteration/AttributeError/OSError/Unicode look consistent*

The new Initializer/slot_init implementations for:

  • PyStopIteration (value set from the first positional arg or None),
  • PyAttributeError (name/obj taken from kwargs, defaulting to None),
  • PyOSError (field population + args truncation wired through slot_init and its getters/setters),
  • PyUnicodeDecodeError, PyUnicodeEncodeError, and PyUnicodeTranslateError (typed Args tuples, attribute population, and detailed __str__),

all integrate cleanly with the existing PyBaseException Constructor/Initializer model and match the expected data layout for these exceptions. I don’t see correctness issues in these blocks; the move away from direct __init__ calls to trait‑based initialization looks solid here.

Also applies to: 1423-1446, 1670-1731, 2178-2225, 2227-2328

Comment on lines +1463 to +1478
#[pyexception(with(Initializer))]
impl PyImportError {
#[pyslot]
#[pymethod(name = "__init__")]
pub(crate) fn slot_init(
zelf: PyObjectRef,
args: ::rustpython_vm::function::FuncArgs,
vm: &::rustpython_vm::VirtualMachine,
) -> ::rustpython_vm::PyResult<()> {
#[pymethod]
fn __reduce__(exc: PyBaseExceptionRef, vm: &VirtualMachine) -> PyTupleRef {
let obj = exc.as_object().to_owned();
let mut result: Vec<PyObjectRef> = vec![
obj.class().to_owned().into(),
vm.new_tuple((exc.get_arg(0).unwrap(),)).into(),
];

if let Some(dict) = obj.dict().filter(|x| !x.is_empty()) {
result.push(dict.into());
}

result.into_pytuple(vm)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

PyImportError::__reduce__ can panic when no args are present

In PyImportError::__reduce__, this line:

vm.new_tuple((exc.get_arg(0).unwrap(),)).into()

assumes there is always at least one positional argument. If a user constructs ImportError() (no args) and then calls exc.__reduce__() (directly or via pickle), get_arg(0) returns None and unwrap() will panic, bringing down the VM instead of raising a Python exception.

slot_init’s keyword handling and name/path population via the instance dict otherwise look good.

A safer pattern is to fall back to the existing BaseException.__reduce__ shape when there is no first argument (or use an empty string), e.g.:

Proposed fix to avoid panic in `__reduce__`
-        fn __reduce__(exc: PyBaseExceptionRef, vm: &VirtualMachine) -> PyTupleRef {
-            let obj = exc.as_object().to_owned();
-            let mut result: Vec<PyObjectRef> = vec![
-                obj.class().to_owned().into(),
-                vm.new_tuple((exc.get_arg(0).unwrap(),)).into(),
-            ];
+        fn __reduce__(exc: PyBaseExceptionRef, vm: &VirtualMachine) -> PyTupleRef {
+            let obj = exc.as_object().to_owned();
+            let args = exc.args();
+            let msg = if args.is_empty() {
+                // Fall back to the BaseException behavior when no message is present.
+                vm.ctx.empty_tuple.clone().into()
+            } else {
+                args[0].clone()
+            };
+            let mut result: Vec<PyObjectRef> = vec![
+                obj.class().to_owned().into(),
+                vm.new_tuple((msg,)).into(),
+            ];
@@
             if let Some(dict) = obj.dict().filter(|x| !x.is_empty()) {
                 result.push(dict.into());
             }
 
             result.into_pytuple(vm)
         }

This avoids a Rust panic and keeps __reduce__ well‑defined even for unusual ImportError instances.

Also applies to: 1484-1505

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/derive-impl/src/pyclass.rs (1)

718-739: Replace expect() with proper error handling.

Line 724 uses expect() which will panic if with() contains non-ident items. In proc-macro code, panics produce poor error messages for users. Use ok_or_else() with a proper syn::Error instead.

🔎 Proposed fix
                 for meta in nested {
-                    with_items.push(meta.get_ident().expect("with() has non-ident item").clone());
+                    let ident = meta.get_ident().ok_or_else(|| {
+                        syn::Error::new_spanned(meta, "#[pyclass(with(...))] arguments must be identifiers")
+                    })?;
+                    with_items.push(ident.clone());
                 }
♻️ Duplicate comments (1)
crates/vm/src/exceptions.rs (1)

1465-1478: PyImportError::__reduce__ can panic when no args are present.

The call exc.get_arg(0).unwrap() will panic if ImportError() is constructed without arguments and then __reduce__ is called (e.g., via pickle). This should handle the empty args case gracefully.

Proposed fix to avoid panic
         #[pymethod]
         fn __reduce__(exc: PyBaseExceptionRef, vm: &VirtualMachine) -> PyTupleRef {
             let obj = exc.as_object().to_owned();
+            let args = exc.args();
+            if args.is_empty() {
+                // Fall back to BaseException behavior when no args
+                if let Some(dict) = obj.dict().filter(|x| !x.is_empty()) {
+                    return vm.new_tuple((obj.class().to_owned(), vm.ctx.empty_tuple.clone(), dict));
+                } else {
+                    return vm.new_tuple((obj.class().to_owned(), vm.ctx.empty_tuple.clone()));
+                }
+            }
             let mut result: Vec<PyObjectRef> = vec![
                 obj.class().to_owned().into(),
-                vm.new_tuple((exc.get_arg(0).unwrap(),)).into(),
+                vm.new_tuple((args[0].clone(),)).into(),
             ];

             if let Some(dict) = obj.dict().filter(|x| !x.is_empty()) {
                 result.push(dict.into());
             }

             result.into_pytuple(vm)
         }
🧹 Nitpick comments (1)
crates/stdlib/src/sqlite.rs (1)

1937-1937: Optional: Consider documenting why _connection parameter is unused.

The _connection parameter is intentionally unused (note the _ prefix), since the Connection is already available via zelf.connection (set by the Constructor in new_uninitialized). While this is valid, a brief comment explaining this design choice would improve code clarity.

🔎 Suggested documentation improvement
-    fn init(zelf: PyRef<Self>, _connection: Self::Args, _vm: &VirtualMachine) -> PyResult<()> {
+    fn init(zelf: PyRef<Self>, _connection: Self::Args, _vm: &VirtualMachine) -> PyResult<()> {
+        // Note: connection is already stored in zelf.connection via Constructor::py_new
         let mut guard = zelf.inner.lock();
📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0d791f9 and 78c23c5.

📒 Files selected for processing (5)
  • crates/derive-impl/src/pyclass.rs
  • crates/stdlib/src/sqlite.rs
  • crates/vm/src/exception_group.rs
  • crates/vm/src/exceptions.rs
  • crates/vm/src/stdlib/ast/python.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • crates/vm/src/stdlib/ast/python.rs
  • crates/derive-impl/src/pyclass.rs
  • crates/stdlib/src/sqlite.rs
  • crates/vm/src/exception_group.rs
  • crates/vm/src/exceptions.rs
🧠 Learnings (4)
📓 Common learnings
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T12:17:28.606Z
Learning: Applies to **/*.py : In most cases, Python code should not be edited; bug fixes should be made through Rust code modifications only
📚 Learning: 2025-11-29T12:17:28.606Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T12:17:28.606Z
Learning: Applies to **/*.rs : Use the macro system (`pyclass`, `pymodule`, `pyfunction`, etc.) when implementing Python functionality in Rust

Applied to files:

  • crates/derive-impl/src/pyclass.rs
📚 Learning: 2025-11-29T12:17:28.606Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T12:17:28.606Z
Learning: Applies to **/*.py : In most cases, Python code should not be edited; bug fixes should be made through Rust code modifications only

Applied to files:

  • crates/derive-impl/src/pyclass.rs
📚 Learning: 2025-12-09T08:46:58.660Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6358
File: crates/vm/src/exception_group.rs:173-185
Timestamp: 2025-12-09T08:46:58.660Z
Learning: In crates/vm/src/exception_group.rs, the derive() method intentionally always creates a BaseExceptionGroup instance rather than preserving the original exception class type. This is a deliberate design decision that differs from CPython's behavior.

Applied to files:

  • crates/vm/src/exception_group.rs
  • crates/vm/src/exceptions.rs
🧬 Code graph analysis (3)
crates/derive-impl/src/pyclass.rs (2)
crates/derive-impl/src/util.rs (2)
  • syn (202-202)
  • with (334-346)
crates/derive-impl/src/lib.rs (1)
  • pyclass (41-47)
crates/stdlib/src/sqlite.rs (4)
crates/vm/src/stdlib/io.rs (2)
  • flags (3247-3247)
  • zelf (3415-3415)
crates/vm/src/stdlib/ast/python.rs (1)
  • init (86-88)
crates/vm/src/builtins/getset.rs (1)
  • zelf (110-110)
crates/vm/src/builtins/tuple.rs (1)
  • zelf (284-288)
crates/vm/src/exceptions.rs (3)
crates/vm/src/exception_group.rs (5)
  • slot_init (356-367)
  • zelf (212-213)
  • init (369-371)
  • obj (195-195)
  • obj (386-387)
crates/vm/src/builtins/object.rs (6)
  • slot_init (121-123)
  • init (125-127)
  • init (557-559)
  • class (26-28)
  • class (399-401)
  • class (403-405)
crates/vm/src/types/slot.rs (2)
  • slot_init (947-972)
  • repr (1126-1129)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Run rust tests (windows-latest)
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Ensure compilation on various targets
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Run snippets and cpython tests (windows-latest)
  • GitHub Check: Run tests under miri
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Check Rust code with clippy
🔇 Additional comments (22)
crates/vm/src/exception_group.rs (3)

46-46: LGTM! Trait imports and wiring annotation look correct.

The addition of Constructor and Initializer imports and the #[pyexception(with(Constructor, Initializer))] annotation properly wire the trait-based construction pattern for PyBaseExceptionGroup.

Also applies to: 53-53


244-351: Constructor implementation is comprehensive and follows CPython semantics.

The slot_new implementation correctly:

  • Validates exactly 2 positional arguments
  • Ensures message is a string
  • Rejects sets/frozensets/None as exceptions argument
  • Validates non-empty sequence of BaseException instances
  • Handles auto-conversion to ExceptionGroup when all exceptions are Exception subclasses
  • Properly handles user-defined subclasses

The py_new returning unimplemented!("use slot_new") follows the established pattern for types that need access to the full FuncArgs in construction.


353-372: Initializer implementation is correct as a no-op.

The comment clearly explains why slot_init is a no-op: __new__ already sets up the correct args (message, exceptions_tuple), and this design allows subclasses to pass extra arguments without __init__ complaining about argument count. This matches CPython's behavior.

crates/vm/src/exceptions.rs (9)

1374-1387: LGTM! PyStopIteration Initializer implementation is correct.

The slot_init properly sets the value attribute from the first positional argument (or None if not provided), and init is correctly marked as unreachable.


1423-1446: LGTM! PyAttributeError Initializer implementation is correct.

The slot_init properly extracts name and obj from keyword arguments and sets them as attributes, matching CPython behavior.


1481-1505: LGTM! PyImportError Initializer implementation is correct.

The slot_init properly:

  • Extracts and removes name and path from kwargs
  • Validates no unexpected keyword arguments remain
  • Sets the attributes in the instance dict
  • Delegates to PyBaseException::slot_init for base initialization

1669-1729: LGTM! PyOSError Initializer implementation handles complex initialization correctly.

The slot_init properly:

  • Sets errno/strerror when 2-5 args are provided
  • Sets filename when 3+ args
  • Handles Windows-specific winerror conversion (when 4+ args)
  • Sets filename2 when exactly 5 args
  • Truncates args to 2 when filename is present (CPython compatibility)
  • Delegates to PyBaseException::slot_init

The unsafe swaps are appropriately commented and occur during initialization when fields are guaranteed to be None.


2076-2116: LGTM! PySyntaxError Initializer implementation is correct.

The slot_init properly:

  • Sets print_file_and_line to None
  • Parses the location tuple (second argument) when present and sets filename, lineno, offset, text, end_lineno, end_offset attributes
  • Delegates to PyBaseException::slot_init

2127-2141: LGTM! PyIncompleteInputError Initializer properly delegates to parent.

The slot_init sets the name attribute and correctly delegates to PySyntaxError::slot_init for the rest of the initialization.


2207-2225: LGTM! PyUnicodeDecodeError Initializer implementation is correct.

The slot_init binds the expected 5 arguments (encoding, object, start, end, reason) and sets them as attributes. The object is correctly converted to bytes before storing.


2262-2279: LGTM! PyUnicodeEncodeError Initializer implementation is correct.

The slot_init binds the expected 5 arguments (encoding, object, start, end, reason) and sets them as attributes appropriately.


2312-2328: LGTM! PyUnicodeTranslateError Initializer implementation is correct.

The slot_init binds the expected 4 arguments (object, start, end, reason) and sets them as attributes. Note this type has one fewer argument than the encode/decode variants since it doesn't need an encoding.

crates/derive-impl/src/pyclass.rs (6)

66-66: LGTM!

The is_trait field addition to ImplContext is clean and necessary for differentiating trait vs. impl block behavior, which aligns with the PR objectives.


236-239: LGTM!

Correctly initializes the context with is_trait: true when processing trait blocks, enabling trait-aware code generation downstream.


748-772: Previous type qualification issue resolved.

The missing type qualifications flagged in a previous review (lines 766-767) have been fixed. VirtualMachine and PyResult are now fully qualified as ::rustpython_vm::VirtualMachine and ::rustpython_vm::PyResult<Self>.

The conditional generation logic is correct: when Constructor is not in with(), a default implementation is generated and Constructor is added to with_items.


778-803: Previous type qualification issue resolved.

The missing type qualifications flagged in a previous review (lines 797-798) have been fixed. VirtualMachine and PyResult<()> are now fully qualified as ::rustpython_vm::VirtualMachine and ::rustpython_vm::PyResult<()>.

The conditional generation for Initializer mirrors the Constructor logic and is correct.


812-819: LGTM!

The generated code structure correctly assembles the with() attribute and outputs the trait implementations. The use of #(#with_items),* properly expands the accumulated items.


901-916: LGTM - Correctly implements PR objectives.

This validation correctly disallows __new__ and __init__ as #[pymethod] in impl blocks (non-trait contexts), while allowing them in trait definitions. The error messages are clear and guide users to use #[pyclass(with(Constructor))] and #[pyclass(with(Initializer))] instead.

This aligns with the PR's goal of preventing direct calls to __new__ and __init__.

crates/stdlib/src/sqlite.rs (1)

1543-1543: LGTM: Initializer trait added to Cursor's pyclass declaration.

The addition of Initializer to the with(...) clause correctly enables trait-based initialization for the Cursor type, consistent with the PR's objective to move away from direct __init__ calls.

crates/vm/src/stdlib/ast/python.rs (3)

6-6: LGTM! Import additions support the new Initializer pattern.

The added imports are necessary for the trait-based refactoring.

Also applies to: 9-9


16-16: LGTM! pyclass attribute updated to enable trait-based pattern.

Adding with(Constructor, Initializer) correctly wires the new trait implementations.


53-89: LGTM! Initializer implementation with proper validation and error handling.

The slot_init logic correctly:

  • Retrieves and validates _fields
  • Checks positional argument counts
  • Handles keyword arguments with duplicate detection
  • Interns string keys for efficiency

The unreachable! in init signals that slot_init is the designated entry point for this trait implementation.

Comment on lines +1934 to +1936
impl Initializer for Cursor {
type Args = PyRef<Connection>;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for the Cursor implementation in sqlite.rs and examine Constructor/Initializer patterns
cd crates/stdlib/src || exit 1

# Look at lines around 1923 and 1934-1936 for Constructor and Initializer
echo "=== Constructor for Cursor (around line 1923) ==="
sed -n '1920,1930p' sqlite.rs

echo ""
echo "=== Initializer for Cursor (around line 1934-1936) ==="
sed -n '1933,1950p' sqlite.rs

Repository: RustPython/RustPython

Length of output: 1095


🏁 Script executed:

#!/bin/bash
# Check Connection class pattern
cd crates/stdlib/src || exit 1

echo "=== Connection class Constructor (around line 853-937) ==="
grep -n "impl Constructor for Connection" sqlite.rs -A 10

echo ""
echo "=== Connection class Initializer ==="
grep -n "impl Initializer for Connection" sqlite.rs -A 10

echo ""
echo "=== Search for other Constructor/Initializer patterns in stdlib ==="
find crates/stdlib/src -name "*.rs" -exec grep -l "impl Constructor\|impl Initializer" {} \; | head -5

Repository: RustPython/RustPython

Length of output: 1345


🏁 Script executed:

#!/bin/bash
# Get correct path and search for ConnectArgs definition and other patterns
find . -name "sqlite.rs" | head -1

echo ""
echo "=== Search for ConnectArgs definition ==="
rg "struct ConnectArgs|type ConnectArgs" crates/stdlib/src/sqlite.rs

echo ""
echo "=== Search for other Constructor/Initializer in sqlite.rs ==="
rg "impl Constructor|impl Initializer" crates/stdlib/src/sqlite.rs -B 1 -A 5

echo ""
echo "=== Search for Constructor/Initializer patterns across stdlib ==="
rg "impl Constructor|impl Initializer" . -t rs --max-count=20

Repository: RustPython/RustPython

Length of output: 1448


🏁 Script executed:

#!/bin/bash
# Search for Constructor/Initializer patterns across the codebase
echo "=== Constructor/Initializer patterns in codebase ==="
rg "impl Constructor|impl Initializer" --max-count=30 -A 3

echo ""
echo "=== Row class implementation ==="
rg "impl Constructor for Row" crates/stdlib/src/sqlite.rs -A 10

echo ""
echo "=== Check if Row has Initializer ==="
rg "impl Initializer for Row" crates/stdlib/src/sqlite.rs -A 5

Repository: RustPython/RustPython

Length of output: 39811


🏁 Script executed:

#!/bin/bash
# Check how Constructor and Initializer are invoked in the framework
echo "=== Look for how Constructor and Initializer are called ==="
rg "impl.*Constructor\|impl.*Initializer" crates/vm/src/object/ -A 20 | head -100

echo ""
echo "=== Check if there's framework code handling argument conversion ==="
rg "fn py_new|fn init" crates/vm/src/object/ -B 2 -A 5 | head -80

Repository: RustPython/RustPython

Length of output: 688


🏁 Script executed:

#!/bin/bash
# Search for Constructor and Initializer trait definitions
echo "=== Constructor trait definition ==="
rg "trait Constructor" crates/vm/src/ -A 10

echo ""
echo "=== Initializer trait definition ==="
rg "trait Initializer" crates/vm/src/ -A 10

echo ""
echo "=== Look for code that calls Constructor::py_new and Initializer::init ==="
rg "Constructor::py_new|Initializer::init" crates/vm/src/ -B 2 -A 2 | head -60

Repository: RustPython/RustPython

Length of output: 2361


🏁 Script executed:

#!/bin/bash
# Look at the full slot_init implementation
echo "=== Full Initializer::slot_init implementation ==="
rg "fn slot_init" crates/vm/src/types/slot.rs -A 25

echo ""
echo "=== Check FromArgs trait ==="
rg "trait FromArgs" crates/vm/src/ -A 5

Repository: RustPython/RustPython

Length of output: 1640


🏁 Script executed:

#!/bin/bash
# Verify if Row has Initializer by doing a more thorough search
echo "=== Search for Row impl blocks in sqlite.rs ==="
rg "impl.*for Row" crates/stdlib/src/sqlite.rs -A 3

echo ""
echo "=== Check the line numbers for Row to understand its structure ==="
rg -n "^struct Row" crates/stdlib/src/sqlite.rs

echo ""
echo "=== Look for all impl blocks after Row struct ==="
rg -n "impl.*Row" crates/stdlib/src/sqlite.rs

Repository: RustPython/RustPython

Length of output: 1480


🏁 Script executed:

#!/bin/bash
# Check if there's automatic FromArgs handling for tuples vs non-tuples
echo "=== Search for FromArgs implementation for tuples ==="
rg "impl.*FromArgs.*for.*\(" crates/vm/src/function/ -A 5 | head -80

echo ""
echo "=== Check if there are FromArgs implementations handling conversion ==="
rg "impl.*FromArgs" crates/vm/src/function/argument.rs -A 8 | head -120

Repository: RustPython/RustPython

Length of output: 3127


🏁 Script executed:

#!/bin/bash
# Look at the actual tuple FromArgs implementation more carefully
echo "=== Full tuple_from_py_func_args macro ==="
rg "tuple_from_py_func_args" crates/vm/src/function/argument.rs -A 50 | head -150

echo ""
echo "=== Test if single-element tuple behaves same as non-tuple ==="
echo "Looking for how args.bind works..."
rg "fn bind\|pub fn bind" crates/vm/src/ -B 2 -A 8 | head -80

Repository: RustPython/RustPython

Length of output: 1422


🏁 Script executed:

#!/bin/bash
# Verify the actual framework behavior - let's check if there are any tests for Cursor
echo "=== Check for Cursor tests to understand expected behavior ==="
find . -name "*.rs" -path "*/tests/*" -o -name "*test*" | xargs grep -l "Cursor\|sqlite" 2>/dev/null | head -5

echo ""
echo "=== Check if there's documentation about how Constructor and Initializer Args should relate ==="
rg "Constructor.*Args|Initializer.*Args" crates/vm/src/types/slot.rs -B 3 -A 3

echo ""
echo "=== Final confirmation: check the exact parameter handling in init function ==="
sed -n '1936,1945p' crates/stdlib/src/sqlite.rs

Repository: RustPython/RustPython

Length of output: 723


Argument type inconsistency between Constructor and Initializer.

The Initializer::Args is PyRef<Connection> (unwrapped), but Constructor::Args is (PyRef<Connection>,) (single-element tuple). This differs from the Connection class pattern (lines 853-937), where both Constructor and Initializer use the same ConnectArgs type.

While the RustPython framework's FromArgs trait can handle both forms, this inconsistency reduces maintainability and contradicts the established pattern throughout the codebase. Align argument types for consistency.

🤖 Prompt for AI Agents
In crates/stdlib/src/sqlite.rs around lines 1934-1936, the Initializer::Args
type is declared as PyRef<Connection> while the corresponding Constructor::Args
is (PyRef<Connection>,); change Initializer::Args to the single-element tuple
type (PyRef<Connection>,) so both Constructor and Initializer use the identical
argument shape, and update any pattern matching or parameter destructuring in
the initializer implementation to unpack the tuple accordingly.

Comment on lines +42 to +43
// FIXME: This is probably incorrect. Please check if init should be called outside of __new__
Self::slot_init(zelf.clone(), args, vm)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# Read the file to understand the Constructor impl and FIXME
head -100 crates/vm/src/stdlib/ast/python.rs | tail -70

Repository: RustPython/RustPython

Length of output: 2518


🏁 Script executed:

#!/bin/bash
# Check the lines around 42-43 in the file
sed -n '20,55p' crates/vm/src/stdlib/ast/python.rs

Repository: RustPython/RustPython

Length of output: 1212


🏁 Script executed:

#!/bin/bash
# Search for other slot_new implementations in the codebase to understand the pattern
rg -l "slot_new" --type rs

Repository: RustPython/RustPython

Length of output: 91


🏁 Script executed:

#!/bin/bash
# Find Constructor implementations that call slot_init within slot_new
rg -A 15 "impl Constructor" --type rs | rg -A 10 "slot_new"

Repository: RustPython/RustPython

Length of output: 91


🏁 Script executed:

#!/bin/bash
# Search for other Constructor impls that might call slot_init
rg "impl Constructor" -A 20 crates/

Repository: RustPython/RustPython

Length of output: 50378


🏁 Script executed:

#!/bin/bash
# Search for slot_init calls across the codebase
rg "slot_init" crates/ | head -30

Repository: RustPython/RustPython

Length of output: 2976


🌐 Web query:

CPython AST node constructor tp_new implementation

💡 Result:

Which CPython version do you want the tp_new implementation for?


🌐 Web query:

CPython ast module BinOp constructor implementation source

💡 Result:

Which CPython version do you want the BinOp constructor source for? (e.g. 3.10, 3.11, main)


Resolve the FIXME: slot_init should not be called inside slot_new.

CPython has moved initialization logic from tp_new to tp_init, which follows standard Python semantics where __new__ creates and returns an object, and then the framework calls __init__ on the returned object. Calling initialization inside slot_new violates this pattern.

The comment acknowledges that AST node constructors parse arguments and assign them as attributes, but this should happen in slot_init (the __init__ equivalent), not inside slot_new. Either:

  1. Remove the slot_init call from slot_new and let the framework invoke it, or
  2. If the current behavior is intentional for compatibility reasons, remove the FIXME and document why it's necessary.
🤖 Prompt for AI Agents
In crates/vm/src/stdlib/ast/python.rs around lines 42-43, remove the call to
Self::slot_init(zelf.clone(), args, vm) from slot_new so that slot_new only
creates/returns the object and all argument parsing and attribute assignment
happens in slot_init; if compatibility requires calling init inside new, instead
remove the FIXME and add a brief comment documenting why this deviation from
CPython semantics is required, and update/verify any tests that relied on the
current behavior to ensure initialization happens exactly once via slot_init.

@youknowone youknowone merged commit fdd2ac3 into RustPython:main Dec 22, 2025
13 checks passed
@youknowone youknowone deleted the disallow-new-init branch December 22, 2025 12:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Replace manual fn __init__ to Initializer

1 participant