This article describes how integer objects are managed internally by Python 2 (CPython). The code shown comes from the Python 2.6 source.

An integer object in Python is represented internally by the structure PyIntObject. Its value is an attribute of type long.

typedef struct {
    PyObject_HEAD
    long ob_ival;
} PyIntObject;

To avoid a memory allocation each time a new integer object is needed, Python allocates a block of free integer objects in advance.

Python allocates integer objects (PyIntObject structures) in blocks described by the following structure. Once a block is initialized, its integer objects are ready to be used whenever a new integer value is needed by a Python script. The structure is called “PyIntBlock” and is defined as:

struct _intblock {
    struct _intblock *next;
    PyIntObject objects[N_INTOBJECTS];
};
typedef struct _intblock PyIntBlock;

When a block of integer objects is allocated by Python, the objects have no value assigned to them yet. We call them free integer objects. When your program needs a new integer value, that value is assigned to the next free object. Setting the value of a free integer object requires no memory allocation, so it is fast.

The integer objects inside the block are linked together back to front using their internal pointer called ob_type. As noted in the source code, this is an abuse of this internal pointer so do not pay too much attention to the name.

Each block of integers contains the number of integer objects which can fit in a block of 1K bytes, about 40 PyIntObject objects on my 64-bit machine. When all the integer objects inside a block are used, a new block is allocated with a new list of integer objects available.
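
As a rough check of that figure, the arithmetic can be reproduced with ctypes. This is only a sketch under assumptions: BLOCK_SIZE = 1000 and BHEAD_SIZE = 8 are the constants as I recall them from intobject.c in Python 2.x, and PyIntObjectLayout is a hypothetical stand-in that mirrors PyObject_HEAD (a reference count and a type pointer) followed by the long value in a non-debug build.

```python
import ctypes

# Assumed values of the constants in intobject.c (Python 2.x).
BLOCK_SIZE = 1000   # 1K less typical malloc overhead
BHEAD_SIZE = 8      # room for the block's "next" pointer


class PyIntObjectLayout(ctypes.Structure):
    """Hypothetical mirror of PyIntObject in a non-debug build."""
    _fields_ = [
        ("ob_refcnt", ctypes.c_ssize_t),  # reference count (PyObject_HEAD)
        ("ob_type", ctypes.c_void_p),     # type pointer (PyObject_HEAD)
        ("ob_ival", ctypes.c_long),       # the integer value
    ]


def n_intobjects():
    """Number of PyIntObject slots that fit in one block."""
    return (BLOCK_SIZE - BHEAD_SIZE) // ctypes.sizeof(PyIntObjectLayout)


if __name__ == "__main__":
    # Prints the object size in bytes and the number of objects per block.
    print(ctypes.sizeof(PyIntObjectLayout), n_intobjects())
```

On a typical 64-bit Linux build the structure is 24 bytes, so (1000 - 8) // 24 gives 41 objects per block, which matches the "about 40" above.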

A singly-linked list is used to keep track of the integer blocks allocated. It is called “block_list” internally.

Python integer object internals

A specific structure is used to refer to small integers and share them so access is fast. It is an array of 262 pointers to integer objects. Those integer objects are allocated during initialization in a block of integer objects like the ones we saw above. The small integers range from -5 to 256. Many Python programs spend a lot of time using integers in that range, so this is a smart decision.

#define NSMALLPOSINTS           257
#define NSMALLNEGINTS           5
static PyIntObject *small_ints[NSMALLNEGINTS + NSMALLPOSINTS];

The integer object representing the integer -5 is at offset 0 inside the small integers array. The integer object representing -4 is at offset 1 …
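
The mapping from a value to its offset in the array is just an addition. Here is a small sketch of it; small_int_offset is a hypothetical helper name, and only the two constants come from the C source shown above.

```python
NSMALLPOSINTS = 257  # covers 0 .. 256
NSMALLNEGINTS = 5    # covers -5 .. -1


def small_int_offset(value):
    """Return the small_ints index for value, or None if it is not cached."""
    if -NSMALLNEGINTS <= value < NSMALLPOSINTS:
        return value + NSMALLNEGINTS
    return None


if __name__ == "__main__":
    print(small_int_offset(-5))   # 0
    print(small_int_offset(-4))   # 1
    print(small_int_offset(256))  # 261
    print(small_int_offset(257))  # None
```

The last valid offset is 261, which is consistent with an array of 262 pointers.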

What happens when an integer is defined in a Python script like this one?

>>> a=1
>>> a
1

When you execute the first line, the function PyInt_FromLong is called and its logic is the following:

if the integer value is in the range [-5, 256]:
    return the integer object pointed to by the small integers array at the offset (value + 5)
else:
    if no free integer object is available:
        allocate a new block of integer objects
    set the value of the next free integer object in the current block of integers
    return the integer object

With our example, the integer object for 1 is pointed to by the small integers array at offset 1 + 5 = 6. A pointer to this integer object is returned and the variable “a” points to that integer object.


Let’s take a look at a different example:

>>> a=300
>>> a
300

300 is not in the range of the small integers array so the next free integer object’s value is set to 300.
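
To make the two cases concrete, here is a toy Python model of the PyInt_FromLong logic. It is a sketch under assumptions, not the real implementation: IntObject and IntAllocator are hypothetical names, the free list is a plain Python list rather than a chain of ob_type pointers, the small integers are created directly instead of living inside regular blocks, and N_INTOBJECTS = 41 is an assumed, platform-dependent value.

```python
NSMALLPOSINTS = 257
NSMALLNEGINTS = 5
N_INTOBJECTS = 41  # assumed number of objects per block


class IntObject(object):
    """Stand-in for PyIntObject."""
    def __init__(self, value=None):
        self.ob_ival = value


class IntAllocator(object):
    """Toy model of the small integers array and the free list."""

    def __init__(self):
        self.blocks_allocated = 0
        self.free_list = []  # free objects of the current block
        # Simplification: one preallocated object per small value.
        self.small_ints = [
            IntObject(v) for v in range(-NSMALLNEGINTS, NSMALLPOSINTS)
        ]

    def from_long(self, value):
        # Small integers are shared: always return the cached object.
        if -NSMALLNEGINTS <= value < NSMALLPOSINTS:
            return self.small_ints[value + NSMALLNEGINTS]
        # Otherwise take the next free object, allocating a block if needed.
        if not self.free_list:
            self.free_list = [IntObject() for _ in range(N_INTOBJECTS)]
            self.blocks_allocated += 1
        obj = self.free_list.pop()
        obj.ob_ival = value
        return obj


if __name__ == "__main__":
    ints = IntAllocator()
    print(ints.from_long(1) is ints.from_long(1))      # True
    print(ints.from_long(300) is ints.from_long(300))  # False
    print(ints.blocks_allocated)                       # 1
```

In this model, asking for 1 twice returns the same shared object, while asking for 300 twice consumes two different free objects from the same block.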


If you take a look at the file intobject.c in the Python 2.6 source code, you will see a long list of functions taking care of operations like addition, multiplication, conversion… The comparison function looks like this:

static int
int_compare(PyIntObject *v, PyIntObject *w)
{
    register long i = v->ob_ival;
    register long j = w->ob_ival;
    return (i < j) ? -1 : (i > j) ? 1 : 0;
}

The value of an integer object is stored in its ob_ival field, which is of type long. Each value is copied into a local variable declared with the register keyword, which asks the compiler to keep it in a CPU register (the compiler is free to ignore the hint), and the comparison is done between those two locals. -1 is returned if the value of the integer object pointed to by v is less than the value of the one pointed to by w, 1 is returned in the opposite case, and 0 is returned if they are equal.
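
For readers less familiar with the chained ternary operator in C, the same three-way comparison can be written in Python as follows. This is only an illustration operating on plain Python integers; the function name is reused from the C code for readability.

```python
def int_compare(i, j):
    """Three-way comparison: -1 if i < j, 1 if i > j, 0 if they are equal."""
    return -1 if i < j else (1 if i > j else 0)


if __name__ == "__main__":
    print(int_compare(1, 300))    # -1
    print(int_compare(300, 1))    # 1
    print(int_compare(300, 300))  # 0
```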

That’s it for now. I hope you enjoyed the article. Please write a comment if you have any feedback.

Last modified: October 9, 2020


Comments

@Dustin That line is interpreted as 1 code object. If you were to do that over multiple lines, then your answer would be different because python would parse multiple lines into multiple code objects. Basically, when python creates a code object from a chunk of syntax, it will scan for all constants and make only 1 object to represent a given constant.

Raunak Sabhani 

Nicely written article. Gave good understanding of the internal implementation

Feodor Kichatov 

1.
>>> a = 256
>>> b = 256
>>> a is b
True

2.
>>> a = 257
>>> b = 257
>>> a is b
False

3.
>>> a, b = 257, 257
>>> a is b
True

4.
>>> a = 257; b = 257
>>> a is b
False

2. and 4. are same. but 3. is interesting =)

If you do python -c "a = 1000000; b = 1000000; print a is b; print 1000000 is 1000000" it prints True twice, which seems to imply that source code literal integers are the same. Not sure exactly how it works, but I would hope that literals aren’t recreated unnecessarily.

So looking at the PyInt_FromLong logic, I don’t see any way for the virtual machine to reuse integer objects. For example, if I refer to the literal number “300” multiple times in my Python script, will the virtual machine re-create a “300” integer object every time, or will it reuse the first 300 object I created? 300 is not within the span of the “small integer” array.

    Author
    Laurent Luce 

    @Ricky: In the case of 300, it will use a free integer object from the linked-list each time.

Hi, nice article. Also, I was linked to this article with a search trying to find out how Python makes up the ability to represent huge integers.

“The small integers range is from -5 to 257”. Not 257 but 256. Zero is counted as a positive integer.

Thanks for this and your other posts on Python data structure implementation. I’m teaching an intro CS course this term, using Python, and decided to spend a lecture on “opening up the machine” for my students. These articles are invaluable for my lecture prep.

Alexi Zaviruha 

Great! Thanks!

Andrea Bisognin 

Interesting post! I find it very fascinating to look at how my favourite language manages data structures.
