Skip to content

Conversation

@moreal
Copy link
Contributor

@moreal moreal commented Nov 6, 2021

This pull request may close #356.

This pull request implements object.__sizeof__ fully, except for setting all types' ob_size. It introduces PyVarObject.ob_size field and tp_itemsize slot. Also it provides a way to setup itemsize slot with pyclass macro.

ob_size field is used to calculate variable object's size with itemsize (ob_size * itemsize + basicsize).

Because this pull request modified the core part like (PyInner and PyObjectRef) and I felt the next work can be more bigger, I cut this part and request a review for this pull request.


Related codes:


The next work will be to call set_ob_size.


#[pymethod(magic)]
fn sizeof(zelf: PyRef<Self>) -> usize {
zelf.dict().sizeof()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dict has data but dict_keys is just a view of dict. So it would be not the same

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this could just be size_of::<Self> and could be implemented directly on DictView.

@moreal moreal marked this pull request as draft November 6, 2021 09:17
@moreal moreal changed the title Implement dict_keys.__sizeof__ Implement dict_keys.__sizeof__, dict_items.__sizeof__, dict_values.__sizeof__ Nov 6, 2021
@youknowone
Copy link
Member

@moreal is there anything more work for this PR?

@moreal
Copy link
Contributor Author

moreal commented Nov 10, 2021

When I was working on this pull request, I realized there is no __sizeof__ method for dict_keys and dict_values, dict_items but it is from object.__sizeof__. So the correct solution is to implement object.__sizeof__ with additional slots.

So I'll implement object.__sizeof__ with additional slots.

P.S. The reason I didn't any action for this pull request, is just I forgot. Thanks!

@moreal moreal changed the title Implement dict_keys.__sizeof__, dict_items.__sizeof__, dict_values.__sizeof__ Implement object.__sizeof__ Nov 14, 2021
@moreal moreal marked this pull request as ready for review November 20, 2021 14:01
Comment on lines 314 to 317
// TODO: CPython implementation has `PyVarObject.ob_size` field for each objects.
// Its default value seems 0 but it is changed by dynamically.
let object_size = 0;
size = object_size * isize;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About these comments, I couldn't find an API to get PyObject's size yet. 😞

You can see the original implementation at https://github.com/python/cpython/blob/6d430ef5ab62158a200b94dff31b89524a9576bb/Objects/typeobject.c#L5481-L5494.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PyTypeObject PyBaseObject_Type = {
    PyVarObject_HEAD_INIT(&PyType_Type, 0)
    "object",        
...

I think that second argument in PyVarObject_HEAD_INIT is ob_size and that PyVarObject_HEAD_INIT assigned to ob_base in each types

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! It was helpful to analyze this issue again. 🙏🏻

@moreal moreal marked this pull request as draft November 20, 2021 16:13
@rng-dynamics
Copy link
Contributor

rng-dynamics commented Mar 30, 2022

Regarding tp_basicsize, tp_itemsize, and ob_size in CPython: tp_itemsize and ob_size are for variable-size objects only. At https://docs.python.org/3/extending/newtypes_tutorial.html#defining-new-types we can see the intended use of these fields.

AFAIU RustPython does not try to replicate this interface exactly. For example, in RustPython __sizeof__ is already implemented for list, bytearray, ... without tp_itemsize or ob_size. So, I am wondering: can't we just return std::mem::size_of::<PyBaseObject>() + std::mem::size_of::<PyObject>() in object.__sizeof__?

If you want, I can make a separate pull request, since this issue is already inactive since more than four months.

@moreal
Copy link
Contributor Author

moreal commented May 20, 2022

I stopped to work on this pull request since I met the situation I couldn't find the way to get RustPython object's memory size.
As I understood, if it doesn't need to be same between RustPython's __sizeof__ and CPython's __sizeof__, #3728 will be merged and this pull request should be closed.

If not, I'll continue to look this pull request on this weekend.

@youknowone
Copy link
Member

I merged #3728. If this PR has more enhancement, please keep go ahead.

@moreal
Copy link
Contributor Author

moreal commented May 21, 2022

This comment is a note to record the progress to solve this issue.


At current commit, when we run the below code, it seems like unnatural.

# RustPython
>>>>> object.__sizeof__(1223456789812391231291231231231212312312312312312312321321321321312312321123123123199129391239219394923912949213021949302194942130123949203912430392402139210492139123012940219394923942395943856228368385)
32

# CPython
>>> object.__sizeof__(1223456789812391231291231231231212312312312312312312321321321321312312321123123123199129391239219394923912949213021949302194942130123949203912430392402139210492139123012940219394923942395943856228368385)
116

When I build the CPython with debugging log in local environment and I ran it, the object.__sizeof__ doesn't call int.__sizeof__. So I decided to focus on the int type.

The object.__sizeof__ requires:

  • tp_basicsize: int uses offsetof(PyLongObject, ob_digit)
  • tp_itemsize: int uses sizeof(digit)
  • ob_size: int initializes the size at _PyLong_New -> _PyObject_InitVar -> Py_SET_SIZE -> _Py_SET_SIZE -> ob->ob_size = size; after PyObject_Malloc manually.

bytes also does it.

Then It will be the solution to introduce ob_size field and to provide method to initialize it. Where should the field be added? Now, I guess it'll be PyInner 🤔

@moreal moreal force-pushed the dict-keys-sizeof branch 3 times, most recently from 6e15bd6 to 86d4999 Compare May 22, 2022 06:33
@moreal moreal marked this pull request as ready for review May 22, 2022 07:22
@moreal moreal requested review from DimitrisJim and youknowone May 22, 2022 08:08
@youknowone
Copy link
Member

Does int.__sizeof__ correctly work? If that correct, How about adding slot_sizeof and SizeOf trait?
Then object.__sizeof__ will call correct variants of each types

/// >>> int('0b100', base=0)
/// 4
#[pyclass(module = false, name = "int")]
#[pyclass(module = false, name = "int", itemsize = 4)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish we have a better way than adding manual itemsize to type definition.

fn sizeof(zelf: PyObjectRef) -> usize {
zelf.class().slots.basicsize
let mut res: usize = 0;
let isize: usize = zelf.class().slots.itemsize;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let isize: usize = zelf.class().slots.itemsize;
let itemsize: usize = zelf.class().slots.itemsize;

to avoid reusing primitive type name isize


impl<T> PyInner<T> {
#[inline(always)]
pub fn ob_size(&self) -> usize {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In RustPython, we don't use the prefix ob_.

Suggested change
pub fn ob_size(&self) -> usize {
pub fn size(&self) -> usize {

}

#[inline(always)]
pub fn set_ob_size(&self, size: usize) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where is this being called?

@moreal
Copy link
Contributor Author

moreal commented Jun 12, 2022

Does int.__sizeof__ correctly work? If that correct, How about adding slot_sizeof and SizeOf trait? Then object.__sizeof__ will call correct variants of each types

I agree about this suggestion because it doesn't need to implement same struct model with CPython. I'll implement the idea again by referring to other traits. If I implement it, some methods or fields will be removed or replaced. (e.g. set_ob_size).

@moreal moreal marked this pull request as draft June 17, 2022 17:40
@moreal moreal force-pushed the dict-keys-sizeof branch 2 times, most recently from b05a34d to 73e90a0 Compare June 17, 2022 17:42
@moreal moreal force-pushed the dict-keys-sizeof branch from 73e90a0 to 35655e0 Compare June 17, 2022 17:54
@moreal
Copy link
Contributor Author

moreal commented Jun 18, 2022

I wanted RustPython's implementations return same value with CPython implementation.
But I realized object.__sizeof__ should return different value with other types like str. (object.__sizeof__(_) != str.__sizeof__(_)). Of course, it isn't clean to set ob_size manually.

I'll close this pull request but it may be because of my passion and my code skill.

@moreal moreal closed this Jun 18, 2022
@youknowone
Copy link
Member

oh.. I thought this is going to the right way. what was not fitting?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

object.__sizeof__

5 participants