Hi all
Firstly, thanks for reading this. It's kind of hard to articulate what I'm doing, and I hope I've written it clearly.
I tried playing around with longs (larger than 1<<31) and ints in PyPy3 to see how performance is affected. If I'm not mistaken, numbers larger than 1<<31, or numbers originating from such numbers (e.g. the result of dividing a large number), are stored as longs in the underlying C struct, while smaller numbers are stored as int32.
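As a quick way to probe this, PyPy exposes an implementation-specific __pypy__ module whose strategy() function reports how a list's elements are currently stored. This is a minimal sketch, assuming that PyPy-only API (the exact strategy names may vary between versions):

# Sketch: ask PyPy how it stores list elements (PyPy-only; will not run on CPython).
import __pypy__
small = [1 << 20] * 10   # fits in a machine word
large = [1 << 80] * 10   # forced to arbitrary-precision longs
print(__pypy__.strategy(small))  # expected: an unboxed integer strategy
print(__pypy__.strategy(large))  # expected: a generic object strategy

If the two lists report different strategies, that would line up with the memory differences below.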
For a control experiment, I tried storing large numbers:
# Just storing longs as a control experiment. Source:
x = 1 << 59
y = []
for _ in range(10**7):
    y.append(x)
Output:
===== Used: 873 ms, 113092 KB
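For anyone wanting to reproduce the "Used: ... ms, ... KB" line outside the environment I ran this in, here is a minimal sketch using only the standard library (I'm assuming it measures wall time and peak RSS; note that resource is POSIX-only, and ru_maxrss is in KB on Linux but bytes on macOS):

import time
import resource

start = time.time()
x = 1 << 59
y = []
for _ in range(10**7):
    y.append(x)
elapsed_ms = (time.time() - start) * 1000
peak_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss  # KB on Linux
print("===== Used: %.0f ms, %d KB" % (elapsed_ms, peak_kb))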
I then proceeded to convert the long to an int. Note that I first did a division (x>>1). This was intentional, as it slowed the code down significantly; without this step the code runs much faster (~300 ms). Note how the memory used has dropped, as (I believe) the numbers are now stored as int32.
# Long to int directly, after dividing the long by 2. Source:
x = 1 << 59
x = x >> 1
print(id(x))
y = []
for _ in range(10**7):
    y.append(int(1.0 * (x >> 31)))
Output: 4611686018427387905
===== Used: 1669 ms, 76632 KB
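For reference, the value actually being appended is tiny: x>>31 is (1<<58)>>31 == 1<<27, which fits comfortably in 32 bits, consistent with the memory drop if my int32 theory is right. A quick check:

x = (1 << 59) >> 1        # same x as above, i.e. 1 << 58
v = int(1.0 * (x >> 31))  # the value appended in the loop
print(v)                  # 134217728, i.e. 1 << 27
print(v < (1 << 31))      # True: fits in an int32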
Finally, I attempted the same thing, but this time I assigned x to temp, deleted x, and reassigned temp back to x. The code ran much faster (~300 ms), but I have no idea why. I printed the id of x before and after to verify that the object (and hence its properties) didn't change.
# Long to int, after dividing the long by 2, but deleting and reassigning the long. Source:
x = 1 << 59
x = x >> 1
print(id(x))
temp = x
del x
x = temp
print(id(x))  # id does not change, i.e. same object, so its properties shouldn't change
y = []
for _ in range(10**7):
    y.append(int(1.0 * (x >> 31)))
Output:
4611686018427387905
4611686018427387905
===== Used: 358 ms, 76796 KB
I have no idea what's happening here, but my suspicion is that deleting and reassigning x affected the logic of PyPy3's JIT compiler, causing it to take a different series of steps when running the code.
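One way to test this suspicion would be to rerun all three versions with the JIT turned off and see whether the gap between ~1669 ms and ~358 ms disappears. A sketch, assuming PyPy's implementation-specific pypyjit module, whose set_param("off") is documented to disable the JIT:

# Sketch: disable the JIT before the hot loop (PyPy-only).
try:
    import pypyjit
    pypyjit.set_param("off")
except ImportError:
    pass  # not running on PyPy, nothing to disable

x = (1 << 59) >> 1
y = []
for _ in range(10**7):
    y.append(int(1.0 * (x >> 31)))

If the suspicion is right, comparing the JIT logs of the two versions (I believe running under PYPYLOG=jit-log-opt:logfile dumps the compiled traces) should also show them compiling different loops.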