Preserve str subclasses in ascii() for ASCII-only __repr__ results#7455
Preserve str subclasses in ascii() for ASCII-only __repr__ results#7455youknowone merged 4 commits intomainfrom
str subclasses in ascii() for ASCII-only __repr__ results#7455Conversation
Co-authored-by: youknowone <69878+youknowone@users.noreply.github.com>
str subclasses in ascii() for ASCII-only __repr__ results
📦 Library DependenciesThe following Lib/ modules were modified. Here are their dependencies: [x] test: cpython/Lib/test/test_str.py (TODO: 15) dependencies: dependent tests: (no tests depend on str) Legend:
|
| assert ascii(chr(0x9999)) == "'\\u9999'" | ||
| assert ascii(chr(0x0A)) == "'\\n'" | ||
|
|
||
| class MyStr(str): |
There was a problem hiding this comment.
[auto-format] reported by reviewdog 🐶
| class MyStr(str): | |
| class MyStr(str): |
| class MyStr(str): | ||
| pass | ||
|
|
||
| class Foo: |
There was a problem hiding this comment.
[auto-format] reported by reviewdog 🐶
| class Foo: | |
| class Foo: |
| def __repr__(self): | ||
| return MyStr("hello") | ||
|
|
||
| class Bar: |
There was a problem hiding this comment.
[auto-format] reported by reviewdog 🐶
| class Bar: | |
| class Bar: |
| def __repr__(self): | ||
| return MyStr("héllo") | ||
|
|
||
| assert type(ascii(Foo())) is MyStr |
There was a problem hiding this comment.
[auto-format] reported by reviewdog 🐶
| assert type(ascii(Foo())) is MyStr | |
| assert type(ascii(Foo())) is MyStr |
|
@copilot Fix CI failures |
Co-authored-by: youknowone <69878+youknowone@users.noreply.github.com>
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
📝 WalkthroughWalkthroughThis PR modifies the Changes
Sequence DiagramsequenceDiagram
actor User
participant ascii as ascii() Function
participant repr as repr() Protocol
participant check as ASCII Check
participant return_direct as Return Repr<br/>(subclass preserved)
participant escape as Escape Non-ASCII
participant return_new as Return New PyStr
User->>ascii: ascii(obj)
ascii->>repr: Get __repr__(obj)
repr-->>ascii: repr_result (PyRef<PyStr>)
ascii->>check: Is repr already ASCII?
alt Already ASCII
check->>return_direct: Yes
return_direct-->>ascii: repr_result unchanged
ascii-->>User: PyRef<PyStr> (subclass preserved)
else Non-ASCII
check->>escape: No, escape non-ASCII chars
escape->>return_new: Create new plain PyStr
return_new-->>ascii: new PyStr
ascii-->>User: PyRef<PyStr> (plain str)
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
crates/vm/src/stdlib/builtins.rs (1)
67-74: Code duplication withPyObject::ascii()inobject.rs.The implementation here is identical to
PyObject::ascii()at lines 373-379 incrates/vm/src/protocol/object.rs. Consider having one delegate to the other to avoid maintaining the same logic in two places.♻️ Option: Delegate to PyObject::ascii()
#[pyfunction] pub fn ascii(obj: PyObjectRef, vm: &VirtualMachine) -> PyResult<PyStrRef> { - let repr = obj.repr(vm)?; - if repr.as_wtf8().is_ascii() { - Ok(repr) - } else { - Ok(vm.ctx.new_str(to_ascii(repr.as_wtf8()))) - } + obj.ascii(vm) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/stdlib/builtins.rs` around lines 67 - 74, Replace the duplicated logic in builtins::ascii (the function pub fn ascii in builtins.rs) by delegating to the existing PyObject::ascii() implementation on the object (call the object's ascii method with the same vm) and return its PyResult<PyStrRef>; remove the manual repr/as_wtf8/is_ascii/to_ascii flow so the builtin just forwards to obj.ascii(vm) (adjust imports/signature if needed to match the method call).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@crates/vm/src/stdlib/builtins.rs`:
- Around line 67-74: Replace the duplicated logic in builtins::ascii (the
function pub fn ascii in builtins.rs) by delegating to the existing
PyObject::ascii() implementation on the object (call the object's ascii method
with the same vm) and return its PyResult<PyStrRef>; remove the manual
repr/as_wtf8/is_ascii/to_ascii flow so the builtin just forwards to
obj.ascii(vm) (adjust imports/signature if needed to match the method call).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: 5fb0f8c6-cca6-48d9-a5fc-c96d61bd3787
⛔ Files ignored due to path filters (1)
Lib/test/test_str.pyis excluded by!Lib/**
📒 Files selected for processing (6)
crates/vm/src/cformat.rscrates/vm/src/format.rscrates/vm/src/frame.rscrates/vm/src/protocol/object.rscrates/vm/src/stdlib/builtins.rsextra_tests/snippets/builtin_ascii.py
|
@copilot check review #7455 (review) |
Co-authored-by: youknowone <69878+youknowone@users.noreply.github.com>
Done in 77767fc. |
ascii()was always materializing a new plainstr, which discardedstrsubclass types even when__repr__had already returned an ASCII-only string. This diverged from CPython, which returns the original repr object unchanged in that case.Behavior change
builtins.ascii()andPyObject::ascii()to short-circuit whenrepr(obj)is already ASCII.strsubclass identity.str.Call site updates
%a/ format-conversion consumers to work with a returnedPyStrobject instead of anAsciiString.Regression coverage
test_asciicase inLib/test/test_str.py.strsubclass preserves the subclassstrOriginal prompt
ascii()discards str subclass type when repr result is already ASCII #7451📱 Kick off Copilot coding agent tasks wherever you are with GitHub Mobile, available on iOS and Android.
Summary by CodeRabbit
Bug Fixes
ascii()builtin function to properly handle cases where__repr__returns a string subclass, preserving the return type when the representation is already in ASCII format.Tests
ascii()behavior with custom string subclasses.