illuminati_13's blog

By illuminati_13, history, 4 months ago, In English
import requests
from bs4 import BeautifulSoup

url = "https://mirror.codeforces.com/problemset/page/11?tags=binary+search"
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')

questions = soup.find_all('div', class_='problem-statement')
question_texts = [q.find('div', class_='title').text for q in questions]

total_questions = len(question_texts)
first_five_words = [q.split()[:5] for q in question_texts[:5]]
last_five_words = [q.split()[:5] for q in question_texts[-5:]]

print("Total questions:", total_questions)
print("First 5 words of the first 5 questions:", first_five_words)
print("First 5 words of the last 5 questions:", last_five_words)

I know this code doesn't work because it relies on web scraping. Can anyone tell me how to use the Codeforces API to do the same thing, that is, fetch all the problems tagged binary search and store them in a Python list? I am doing this for a project in which I have to train a Transformer model on programming problem statements.

»
4 months ago, # |

Auto comment: topic has been updated by illuminati_13 (previous revision, new revision, compare).

»
4 months ago, # |
Rev. 4

Here is corrected, working code that I wrote without the Codeforces API. Hope it helps.

import requests
from bs4 import BeautifulSoup


def getproblems(url):
    """Return the problem names linked from one problemset page."""
    problems = []
    response = requests.get(url)
    soup = BeautifulSoup(response.content, 'html.parser')
    # Each table row links to the problem twice: once by ID, once by name.
    problem_links = soup.find_all('a', href=lambda href: href and "/problemset/problem" in href)
    for link in problem_links:
        name = link.get_text().strip()
        # The href ends in .../<contest number>/<problem letter>.
        parts = link['href'].split("/")
        number = parts[-2]
        problem_code = parts[-1]
        # Skip the link whose text is only the problem ID; keep the name.
        if name != number + problem_code:
            problems.append(name)
    return problems


url = "https://mirror.codeforces.com/problemset/page/11?tags=binary+search"
questions = getproblems(url)

total_questions = len(questions)
# Split on whitespace so the slices take words, not characters.
first_five_words = [q.split()[:5] for q in questions[:5]]
last_five_words = [q.split()[:5] for q in questions[-5:]]

print("Total questions:", total_questions)
print("First 5 words of the first 5 questions:", first_five_words)
print("First 5 words of the last 5 questions:", last_five_words)
print(questions)
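
For the API route the original post asks about, here is a minimal sketch using the `problemset.problems` method with its `tags` parameter. The helper names (`build_url`, `extract_names`, `fetch_problem_names`) and the 30-second timeout are my own choices for illustration, not part of the API. Note that, as far as I know, this method returns only metadata (name, contest ID, index, tags, rating where available), not the statement text, so the statements themselves would still have to come from the problem pages.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://codeforces.com/api/problemset.problems"


def build_url(tag):
    """Build the request URL for one tag (hypothetical helper)."""
    return API_URL + "?" + urllib.parse.urlencode({"tags": tag})


def extract_names(payload):
    """Pull the problem names out of a decoded API response."""
    if payload.get("status") != "OK":
        # Failed responses carry the reason in the "comment" field.
        raise RuntimeError(payload.get("comment", "Codeforces API request failed"))
    return [problem["name"] for problem in payload["result"]["problems"]]


def fetch_problem_names(tag, timeout=30):
    """Download every problem carrying the tag and return the names as a list."""
    with urllib.request.urlopen(build_url(tag), timeout=timeout) as response:
        payload = json.load(response)
    return extract_names(payload)


if __name__ == "__main__":
    questions = fetch_problem_names("binary search")
    print("Total questions:", len(questions))
    print(questions[:5])
```

Unlike the scraped page, the API returns all matching problems in one response, so there is no need to loop over the page numbers. Each entry in `payload["result"]["problems"]` also has `contestId` and `index`, which is what you would need to locate the matching problem page later.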