Skip to content

Conversation

@ShaharNaveh
Copy link
Collaborator

@ShaharNaveh ShaharNaveh commented Jan 10, 2026

There are no logic changes in this PR.
In a future PR I plan on moving the oparg enums to their own file as well

Summary by CodeRabbit

  • Refactor
    • Reorganized bytecode instruction definitions into a dedicated module for improved code maintainability.
    • Renamed instruction variant from CopyItem to Copy for consistency.
    • Enhanced instruction handling and display logic in the virtual machine.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 10, 2026

📝 Walkthrough

Walkthrough

The PR refactors the bytecode compilation system by extracting the Instruction enum from bytecode.rs into a dedicated module file (bytecode/instruction.rs), while simultaneously renaming the CopyItem opcode variant to Copy across the codebase. The public API for Instruction is preserved through re-export.

Changes

Cohort / File(s) Summary
Instruction enum extraction
crates/compiler-core/src/bytecode.rs, crates/compiler-core/src/bytecode/instruction.rs
Moved comprehensive Instruction enum and all associated implementations (~791 lines) from bytecode.rs into a new module file instruction.rs; preserved public API via pub use re-export in parent module. The new instruction.rs includes full enum definition with 1-byte opcode alignment, extensive pattern matching for stack effects, conversions (From<Instruction> for u8, TryFrom<u8> for Instruction), and formatting/disassembly logic.
Opcode variant rename (CopyItem → Copy)
crates/codegen/src/compile.rs, crates/vm/src/frame.rs
Updated all emission and execution sites to use Instruction::Copy instead of Instruction::CopyItem; variant behavior and argument structure remain functionally equivalent.
Opcode metadata generation
scripts/generate_opcode_metadata.py
Updated source path to read from bytecode/instruction.rs and removed special-case handling for "CopyItem" in cpython_name generation logic.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

Suggested reviewers

  • youknowone

Poem

🐰 We've shuffled the instructions neat,
From bytecode.rs to instruction's beat,
CopyItem → Copy, a rename so spry,
The enum's extracted, reaching up high!
Module-ized magic, refactored with care. ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and concisely describes the main change: moving the Instruction enum to its own file, which aligns with the primary refactoring objective.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

}
}

use self::Instruction::*;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Due to the removal of this line I've decided to rename CopyItem -> Copy

@ShaharNaveh ShaharNaveh marked this pull request as ready for review January 10, 2026 14:33
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
crates/codegen/src/compile.rs (1)

1384-1394: Rename to Instruction::Copy is safe and complete; update stale documentation in frame.rs.

The rename from CopyItem to Instruction::Copy is consistently applied across all 35+ call sites in compile.rs. The variant correctly maps to CPython's COPY opcode (opcode 61) and the stack effect is properly calculated. However, stale references remain in crates/vm/src/frame.rs at lines 766, 767, 772, and 775—these are comments and error messages that still mention CopyItem and should be updated to use Copy for consistency:

  • Lines 766–767: Comments explaining Copy semantics
  • Line 772: Error message "CopyItem ERROR: ..."
  • Line 775: Panic message "CopyItem: stack underflow"

The underlying code at line 765 is correct and handles the instruction properly.

crates/vm/src/frame.rs (1)

765-780: Stale comments reference old instruction name.

The comments on lines 766-768 and 772 still reference CopyItem, but the instruction has been renamed to Copy. Update the comments for consistency.

📝 Suggested fix
             bytecode::Instruction::Copy { index } => {
-                // CopyItem { index: 1 } copies TOS
-                // CopyItem { index: 2 } copies second from top
+                // Copy { index: 1 } copies TOS
+                // Copy { index: 2 } copies second from top
                 // This is 1-indexed to match CPython
                 let idx = index.get(arg) as usize;
                 let stack_len = self.state.stack.len();
                 if stack_len < idx {
-                    eprintln!("CopyItem ERROR: stack_len={}, idx={}", stack_len, idx);
+                    eprintln!("Copy ERROR: stack_len={}, idx={}", stack_len, idx);
                     eprintln!("  code: {}", self.code.obj_name);
                     eprintln!("  lasti: {}", self.lasti());
-                    panic!("CopyItem: stack underflow");
+                    panic!("Copy: stack underflow");
                 }
🧹 Nitpick comments (2)
scripts/generate_opcode_metadata.py (1)

13-15: Good: metadata now tracks the moved Instruction enum, and Copy will map to COPY.
This aligns the generator with the new source location and removes the need for the old CopyItem special-case.

To guard against the moved file being formatted with multi-line struct variants, I’d harden extract_enum_body() to stop at the enum’s closing brace at column 0, not the first \n} encountered.

Proposed hardening for enum extraction
 def extract_enum_body(contents: str) -> str:
-    res = re.search(r"pub enum Instruction \{(.+?)\n\}", contents, re.DOTALL)
+    res = re.search(
+        r"pub enum Instruction \{(.*?)^}\s*$",
+        contents,
+        re.DOTALL | re.MULTILINE,
+    )
     if not res:
         raise ValueError("Could not find Instruction enum")

Verification (also do a clean rebuild since bytecode layout/opcodes are involved):
Based on learnings, consider a full clean build after regenerating metadata.

#!/bin/bash
set -euo pipefail

python scripts/generate_opcode_metadata.py

# Ensure COPY exists and COPY_ITEM doesn't.
rg -n "^ *'COPY':" Lib/_opcode_metadata.py
rg -n "COPY_ITEM" Lib/_opcode_metadata.py && exit 1 || true

# Clean build suggestion from repo learnings:
# rm -r target/debug/build/rustpython-* && find . -name '*.pyc' -print0 | xargs -0 rm -f

Also applies to: 24-27

crates/compiler-core/src/bytecode/instruction.rs (1)

1-2: Consider using core::mem for consistency with no_std compatibility.

Line 1 uses alloc::fmt suggesting no_std support, but line 2 uses std::mem. For consistency and to maintain no_std compatibility, consider using core::mem instead.

♻️ Suggested fix
 use alloc::fmt;
-use std::mem;
+use core::mem;
📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6ff7b3e and 25c3348.

📒 Files selected for processing (5)
  • crates/codegen/src/compile.rs
  • crates/compiler-core/src/bytecode.rs
  • crates/compiler-core/src/bytecode/instruction.rs
  • crates/vm/src/frame.rs
  • scripts/generate_opcode_metadata.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • crates/vm/src/frame.rs
  • crates/codegen/src/compile.rs
  • crates/compiler-core/src/bytecode.rs
  • crates/compiler-core/src/bytecode/instruction.rs
**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.py: In most cases, Python code should not be edited; bug fixes should be made through Rust code modifications only
Follow PEP 8 style for custom Python code
Use ruff for linting Python code

Files:

  • scripts/generate_opcode_metadata.py
🧠 Learnings (1)
📚 Learning: 2025-12-27T14:03:49.034Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-27T14:03:49.034Z
Learning: When modifying bytecode instructions, perform a full clean build by running `rm -r target/debug/build/rustpython-* && find . | grep -E '\.pyc$' | xargs rm -r`

Applied to files:

  • scripts/generate_opcode_metadata.py
🧬 Code graph analysis (1)
crates/compiler-core/src/bytecode/instruction.rs (1)
crates/compiler-core/src/bytecode.rs (5)
  • decode_load_attr_arg (99-103)
  • marker (540-542)
  • argc (1126-1133)
  • get (455-461)
  • get (558-560)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Run rust tests (windows-2025)
  • GitHub Check: Run snippets and cpython tests (windows-2025)
  • GitHub Check: Ensure compilation on various targets
  • GitHub Check: Check Rust code with clippy
🔇 Additional comments (4)
crates/compiler-core/src/bytecode.rs (1)

17-19: LGTM!

The module extraction and re-export pattern correctly preserves the public API surface. Users can continue importing Instruction from bytecode without changes.

crates/compiler-core/src/bytecode/instruction.rs (3)

298-359: LGTM!

The TryFrom<u8> implementation correctly validates opcode ranges, handling the non-contiguous layout (gaps at 125-127, Resume at 149, pseudo-ops at 252-255). The explicit custom_ops list ensures only valid RustPython-specific opcodes are accepted.


419-619: LGTM!

The stack_effect implementation comprehensively covers all instruction variants with correct stack delta calculations. The placeholder opcodes returning 0 is appropriate since they're not emitted by the compiler.


621-821: LGTM!

The display formatting implementation is well-structured using a macro-based approach for consistency. The LoadAttr special handling for decoding the method flag (lines 747-760) correctly uses decode_load_attr_arg from the parent module.

Copy link
Member

@youknowone youknowone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ShaharNaveh I feel like Arg is a part of Instruction. If it needs to be splited, putting in instruction.rs makes more sense to me than putting it on its own sub module.

@youknowone youknowone merged commit 440b8de into RustPython:main Jan 11, 2026
13 checks passed
@ShaharNaveh
Copy link
Collaborator Author

@ShaharNaveh I feel like Arg is a part of Instruction. If it needs to be splited, putting in instruction.rs makes more sense to me than putting it on its own sub module.

No worries, I'll append it in a future PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants