YMSeah's blog

By YMSeah, history, 12 months ago, In English

I ran into a runtime error when trying to solve 1514D - Cut and Stick with Mo's Algorithm. Part of the problem requires one to find the counts of an element within a range if it's greater than or equal to about half the counts of elements in the range. One way to solve it is to use Mo's Algorithm to keep track of the most common element in range and its counts.

I sort the queries in the standard way where the left boundaries are grouped together in blocks of about sqrt of the number of elements. Within a "left block", the right queries are sorted in increasing order if the "left block" is an even block, otherwise it is sorted in decreasing order (so that the right points zigzag left-right-left-right through alternating blocks).

I have found that using the following within the sort criterion gives a runtime error (235446182):

if (block1 == block2) {
    // to ensure that the right portion sweeps in alternate directions between adjacent blocks
    if (block1% 2 == 0 && r < other.r) {
        return true;
    }
    if (block1 % 2 == 1 && r >= other.r) {
        return true;
    }
}

and the following does not (235448205).

if (block1 == block2) {
    // to ensure that the right portion sweeps in alternate directions between adjacent blocks
    if (block1% 2 == 0 && r < other.r) {
        return true;
    }
    // if (block1 % 2 == 1 && r >= other.r) { // This gives run time error
    if (block1 % 2 == 1 && r > other.r) {
        return true;
    }
}

I also see that I am not the only person who faced this problem. Here are nawa's codes and his fix using a similar change: runtime error 129082770 vs accepted 129084946.

I am not a native C++ user and I cannot figure out what went wrong. I will be grateful if anyone can shed some light on what may have caused this error.

Full text and comments »

  • Vote: I like it
  • +1
  • Vote: I do not like it

By YMSeah, history, 3 years ago, In English

I think it's a nice coincidence that I was briefly sandwiched between Monogon and SecondThread on Kattis, who are among the top 10 contributors here on Codeforces.

Screenshots (their names on Kattis is their real-life names)

The fact that I immediately recognised their names is testament to the amount of content they have contributed on Codeforces (and also YouTube for SecondThread including his video on How to Upvote Monogon).

While CP is largely an individual mind sport, it is great to have a community of passionate people teaching the sport. I've personally found editorials and especially YouTube videos useful in building my skills. So here's to the active contributors to the sport. Thanks for the value you guys are adding!

Full text and comments »

  • Vote: I like it
  • -1
  • Vote: I do not like it

By YMSeah, history, 3 years ago, In English

I attempted problem B from the recent #737 (Div 2) contest and had a runtime error. I prevented that error by adding a useless line of code "j == -1":

RE submission: 125632647, AC submission: 125632637

For some context, my code performs a binary search (using a method similar to binary lifting) to find j in the increasing array b where b[j] == a[i]. Since b is the sorted copy of a, a[i] always exists in b so j != -1.

Is this a problem associated with PyPy3's JIT compiler, or am I missing something?

Full text and comments »

  • Vote: I like it
  • 0
  • Vote: I do not like it

By YMSeah, history, 4 years ago, In English

I have been doing competitive programming for some months now. While attempting problems above 1800 in difficulty, I noticed that some of them required the use of ordered sets/multisets that support lookup, addition and deletion of elements in O(log(n)) time. Their solutions usually involve the use of std::set or std::multiset in C++. However, I code primarily in Python and there are no built-in packages available on Codeforces. Normally outside of Codeforces I just import sortedcontainers.sortedlist which I use like an Order Statistic tree. Here I discuss some strategies which I have used to overcome these issues. I hope that it helps others who are stuck so that they do not give up on their favourite languages. For C++ users, this may not be as useful, but you can still see how alternative data structures may be used to solve problems.

The example I use here is CSES — Concert Tickets (for those with no CSES account, see here). This question requires the deletion of entries, and searching for the next largest available entry in sorted array arr smaller than a certain value x.

Method 1: Segment Trees

A segment tree is a very versatile data structure that work for cumulative searches for binary functions with associative properties. Go read up on it if you don't know what it is.

For Concert Tickets, we need to find the largest non-deleted value closest to x. To do this, initialize a maximum segment tree with its values corresponding to its index value. i.e. the segmentTreeArray=[0,1,2,3,...n-1], and queries between [l,r] will give the largest value within the range [l,r]. Everytime we delete a value, we set the index corresponding to that value to -inf (or -1e9, if you prefer). This way, the index will not "interfere" with future query ranges that includes it. To find a value, first perform a binary search on the original array arr to find where the search should end such that arr[i]<x. To find the next largest available index, query for the largest from 0 to i.

If you want to search for next smallest to the right, use a minimum segment tree, set value to inf (1e9) on delete, and search from i to n-1 instead.

Complexity: O(log(n)) for search, insert, or delete. Adding elements is only possible if you know beforehand what are the elements you can add (and thereby allocate a "space" in your segment tree).

What the code might look like:

# class SegmentTree...
# def getMaxSegTree(arr) ...
def main():
    
    n,m=readIntArr()
    
    actual=readIntArr()
    actual.sort()
    custMax=readIntArr()
    
    availIndexes=getMaxSegTree(list(range(n)))
    #if available, mark node as 1. Else, mark node as -inf
    ans=[]
    for cm in custMax:
        b=n
        i=-1
        while b>0: # binary search
            while i+b<n and actual[i+b]<=cm:
                i+=b
            b//=2
        if i==-1: # no smaller value
            ans.append(-1)
        else:
            idx=availIndexes.query(0,i)
            if idx==-inf: # no available smaller value
                ans.append(-1)
            else:
                ans.append(actual[idx])
                availIndexes.updateTreeNode(idx,-inf) # delete the node
    multiLineArrayPrint(ans)
        
    return

Note: If you want to query element counts (i.e how many elements between 0 and i has not been deleted), use a sum Segment Tree and set index to 1 if element is present, and 0 if element is deleted. Then query(0,i) gives you the number of elements present between [0,i].

Method 2: Linked List with Path Compression (like DSU)

Note: From my (rather shallow) experience, this only works with item deletion. Maintain an array parent in parallel to arr of same size. Originally parent looks like [0,1,2,...n-1]. Every time an item is deleted, replace parent[deleted_i] with deleted_i-1. When searching for a value, perform a binary search (like in segment tree example) for the index i such that arr[i]<x. Then, run a search for the parent of arr[i] with while(parent[i]!=i)i=parent[i]. Remember to perform a path compression after that, as this will make future searches faster.

Note that if the search queries are for next larger item (item on right that isn't deleted), then replace parent[deleted_i] with deleted_i+1 on deletion.

Complexity: O(1) for deletion. O(log(n)) for search, because of the binary search. I don't know the complexity of finding-of-parent-with-path-compression (see https://mirror.codeforces.com/blog/entry/2381). This linked-list with path compression runs quickly. For my submission for this question, it runs close to 2x faster than segment tree.

What the code might look like:

# no special classes needed :). only needs the parent array.
def main():
    
    n,m=readIntArr()
    h=readIntArr() # n prices
    parent=[i for i in range(n)] #price parent
    t=readIntArr() # m customers
    h.sort()
    
    ans=[]
    for x in t:
        b=n
        i=-1
        while b>0: # binary search
            while i+b<n and h[i+b]<=x:
                i+=b
            b//=2
        if i==-1:
            ans.append(-1)
        else:
            eventualParent=i
            while eventualParent!=-1 and parent[eventualParent]!=eventualParent:
                eventualParent=parent[eventualParent]
            if eventualParent==-1: # no smaller item found
                ans.append(-1)
                eventualParent+=1
            else:
                ans.append(h[eventualParent])
            eventualParent-=1 # eventualParent is deleted. to update its parent to eventualParent-1 later
            # path compression like DSU
            while i!=eventualParent:
                temp=parent[i]
                parent[i]=eventualParent
                i=temp
    multiLineArrayPrint(ans)
    
    return

Method 3: Actually using a self-implemented B-tree or Order Statistic Tree

This is straightforward. Just copy-paste already written code. For me, I copy-pasted the import library of sortedcontainers.sortedlist which is 800+ lines of code (eg. 111203644) after removing empty lines. I dislike this method because 1) it makes my code super long, 2) slows down my IDE which must parse all that code, 3) tends to have the largest overhead (usually slower than using segment tree). However, it doesn't require innovative thinking on how to manipulate a segment tree/linked list etc. to perform equivalent functions. Time complexity: O(log(n)) for search, insert and delete.

What the code might look like: (looks elegant, but imagine I have 800+ lines of code for the OrderedList data structure. By the way, the OrderedList is my implementation of the Order Statistic Tree)

EDIT: rishabnahar2025 has a good suggestion for an Order Statistic Tree template for Python users: here. This code is only 200+ lines long and runs faster than the OrderedList that I used previously. As pointed out by xuanji in the comments, the documentation for the original sortedcontainers.SortedList implementation (for Python) mentioned that its operations run in n^(1/3) to n^(1/2) with small overheads. I should point out that I have used PyRival's SortedList to solve many problems requiring nlogn time complexity.

# class OrderedList ... (800+ lines)
def main():
    
    n,m=readIntArr()
    h=readIntArr() # n ticket prices
    t=readIntArr() # m customers
    
    h.sort()
    ol=OrderedList(h)
    
    ans=[]
    for x in t:
        idx=ol.bisect_left(x)
        if idx==len(ol) or ol[idx]>x:
            idx-=1
        if idx==-1:
            ans.append(-1) #cant take anything
        else:
            ans.append(ol.pop(idx))
    multiLineArrayPrint(ans)
    
    return

Another Example

If you'd like to see these concepts used in another question, you may look at my solutions for 1642E - Anonymity Is Important where a part of each of them requires marking the earliest visited times for items in indexes 1 to n, and each time a range is given (ranges may overlap).

Method 1: Direct update in using Lazy-Propagated Segment Trees : 151336538 (Note that I subsequently transferred the final result to a sparse table for faster range querying of the final result. This is out of scope for thisdiscussion so ignore that code)

Method 2: Linked list with path compression : 151410511

Method 3: Using some sort of Order Statistic Tree (PyRival's SortedList) : 151340346

Alright that's it. Thanks for reading this far, and I hope that this has been helpful or informative. If there're mistakes in my post please let me know.

Full text and comments »

  • Vote: I like it
  • +36
  • Vote: I do not like it

By YMSeah, history, 4 years ago, In English

Hi all

Firstly thanks for reading this. It's kinda hard to articulate what I'm doing and I hope that I've written it clearly.

I tried playing around with long (larger than 1<<31) and ints in PyPy3 to see how performance is affected. If I'm not mistaken, numbers larger than 1<<31, or numbers originating from such numbers (eg. divided from a large number) are stored as longs in the underlying C struct. Smaller numbers are stored as int32.

For a control experiment, I tried storing large numbers:

//Just storing Longs as control experiment. Source:

x=1<<59

y=[]
for _ in range(10**7):
    y.append(x)

Output:

===== Used: 873 ms, 113092 KB ______

I then proceeded to try converting the Long to int. Note that I did a division initially (x>>1). This was intentional, as it slowed down the code significantly. The code would run much faster (~300ms) without this step. Note how the memory used has reduced as (I believe) the numbers are now stored as int32.

//Long to int directly, with dividing long by 2. Source:

x=1<<59
x=x>>1

print(id(x))

y=[]
for _ in range(10**7):
    y.append(int(1.0*(x>>31)))

Output: 4611686018427387905

===== Used: 1669 ms, 76632 KB


Finally, I attempted the same, but I assigned x to temp, deleted x, and reassigned temp to x. The code ran much faster (~300ms). But I have no idea why. I printed out the ids of x before and after to verify that the object (and hence its properties) didn't change.

//Long to int, with dividing long by 2. But deleting the long and reassigning the long. Source:

x=1<<59
x=x>>1

print(id(x))
temp=x
del x
x=temp
print(id(x)) #id does not change. i.e. same object, so properties shouldn't change

y=[]
for _ in range(10**7):
    y.append(int(1.0*(x>>31)))

Output: 4611686018427387905 4611686018427387905

===== Used: 358 ms, 76796 KB


I have no idea what's happening here, but my suspicion is that PyPy3's JIT compiler's logic was affected by the deletion and reassignment of x, and took a different series of steps when running the code.

Full text and comments »

  • Vote: I like it
  • +13
  • Vote: I do not like it