bicsi's blog

By bicsi, history, 5 years ago, In English

Hello!

After seeing the recent hype around debugging utilities, code snippets and the like, and since I had 30 minutes of free time this afternoon, I decided to write a small Python script that pre-compiles multiple files into one big file for Codeforces submission.

The tool is here: https://pastebin.com/c4dS4Pik (Python 3 required)

Basically the tool does these steps:

  • remove the #include <bla> lines from the code
  • run the GCC preprocessor on the code
  • add the #include <bla> lines back
  • run some cleanup routine (optional)
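
The steps above can be sketched roughly as follows. This is only a minimal illustration, not the script from the pastebin link: the function names are made up for this example, it assumes `g++` is on your PATH, and it only pulls `#include <...>` lines out of the top-level file (system includes nested inside your own headers, or wrapped in `#ifdef` blocks, would need extra handling).

```python
import re
import subprocess

# Matches angle-bracket includes such as: #include <vector>
SYSTEM_INCLUDE = re.compile(r'^\s*#\s*include\s*<[^>]+>')


def split_system_includes(source):
    """Step 1: pull the #include <...> lines out of the source text.

    Returns (list of unique system include lines, remaining source).
    """
    includes, rest = [], []
    for line in source.splitlines():
        if SYSTEM_INCLUDE.match(line):
            stripped = line.strip()
            if stripped not in includes:
                includes.append(stripped)
        else:
            rest.append(line)
    return includes, "\n".join(rest) + "\n"


def run_preprocessor(source, extra_args=()):
    """Step 2: let g++ expand macros and #include "..." files.

    -E stops after preprocessing, -P drops line markers, and
    "-x c++ -" reads C++ source from stdin. Quoted includes are
    resolved relative to the current working directory.
    """
    cmd = ["g++", "-E", "-P", "-x", "c++", "-", *extra_args]
    result = subprocess.run(cmd, input=source, capture_output=True,
                            text=True, check=True)
    return result.stdout


def bundle(source, extra_args=()):
    """Steps 1 to 3: strip system includes, preprocess, add them back."""
    includes, rest = split_system_includes(source)
    expanded = run_preprocessor(rest, extra_args)
    return "\n".join(includes) + "\n" + expanded
```

Something like `bundle(open("sol.cpp").read(), ["-DDEBUG"])` would then correspond to passing `-DDEBUG` on the command line; the optional cleanup step is left out of this sketch.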

To use this script, copy it somewhere (preferably into a directory on your PATH) and run it using: /path/to/scr.py file.cpp [ARGS]. The ARGS are similar to the ones you would give to your g++ command.

For example, let's say we have three files:

FILE: debug.hpp
FILE: dsu.hpp
FILE: sol.cpp

Output of /path/to/scr.py sol.cpp:

OUTPUT

Output of /path/to/scr.py sol.cpp -DDEBUG:

OUTPUT

The main idea is that it merges all your #include "bla" files (not the #include <bla> ones, though), and expands all the #defines and other preprocessor directives.
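
To make the "merging" part concrete, here is a small pure-Python sketch that pastes `#include "..."` files in place and leaves `#include <...>` lines alone. It is not how the tool works (the tool relies on the GCC preprocessor, which also handles macros and conditionals); the function name is hypothetical, and the sketch assumes every local header should be pasted at most once, as if it had `#pragma once`.

```python
import re
from pathlib import Path

# Matches quoted includes such as: #include "dsu.hpp"
QUOTED_INCLUDE = re.compile(r'^\s*#\s*include\s*"([^"]+)"')


def inline_quoted_includes(path, seen=None):
    """Return the text of `path` with each #include "..." line replaced
    by the (recursively inlined) contents of the named file.

    Angle-bracket includes are kept as they are. A file that was
    already pasted is skipped (assumed include-once behaviour).
    """
    seen = set() if seen is None else seen
    path = Path(path).resolve()
    if path in seen:
        return ""
    seen.add(path)

    out = []
    for line in path.read_text().splitlines():
        if line.strip() == "#pragma once":
            continue  # meaningless once everything is in one file
        match = QUOTED_INCLUDE.match(line)
        if match:
            # Resolve the header relative to the including file.
            out.append(inline_quoted_includes(path.parent / match.group(1), seen))
        else:
            out.append(line)
    return "\n".join(out)
```

Calling `inline_quoted_includes("sol.cpp")` would return one merged source string with the macros still unexpanded, which is the part the preprocessor adds on top.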

Let me know what you think! The tool should work fine on UNIX-based systems (Mac OS X, Linux). I would be happy if someone could test/port this tool for Windows. I think one cool thing about this tool is that it expands all the #defines, so most of the output code looks more standardized (C++-ish).

I would personally probably not use this tool too much, as my general approach is to copy-paste implementations and tweak them to the specific problem, but I know a lot of people prefer to use them as black boxes, so this might be useful.

»
5 years ago, # |
Rev. 2   Vote: I like it 0 Vote: I do not like it

Auto comment: topic has been updated by bicsi.

Script now shouldn't remove comments.

»
5 years ago, # |
  Vote: I like it +18 Vote: I do not like it

Nice script. But I can top it with my setup.

A few weeks ago I made a similar solution. One disadvantage of your script is that you need to decide manually which files to include, which is about the same workload as copying the file itself.

I only use one header file, which I include every time (it is part of my template). This single header contains multiple classes and functions that I don't need to modify, like a 3D point class or my modular arithmetic class. Like you, I find that the actually important algorithms/data structures need to be modified most of the time, so I also just copy them directly into the file.

Then, before submitting, I run the program Caide C++ inliner on my file. It does basically the same thing as your script: it merges the solution file with the header. The advantage of that program is that it is intelligent enough to determine which parts of the included header are used and which aren't. So if the problem has nothing to do with geometry, the 3D point class will not be included, even though it is in the header.

I haven't used it much yet, but so far I really enjoy it. Since the complete file is included while I write the code, I get autocompletion for every function and class in the header, and when submitting, only the relevant parts end up in the submission.

  • »
    »
    5 years ago, # ^ |
      Vote: I like it +8 Vote: I do not like it

    Oh, and the tool doesn't actually preprocess the file. So the #define lines are still there.

  • »
    »
    5 years ago, # ^ |
      Vote: I like it +16 Vote: I do not like it

    Interesting tool! However, I'm noticing some clang configurations in there, which means that it's probably using Clang to parse the source file into an AST and do some dead code elimination. I did a similar thing for cleaning up submission files as part of my master's thesis, and what I found out is that a lot of source code submitted to Codeforces isn't parsed successfully by Clang (mainly because people use GNU GCC extensions like pb_ds in their templates), so the tool might not be 100% compatible. Correct me if I'm wrong.

    Also, I feel like replacing #defines is actually a good thing :). I would very much prefer looking at for (int i = 0; i < n; ++i) instead of FORN(i, n) or similar. This can, of course, be disabled by tweaking the script.

    • »
      »
      »
      5 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      Yes, you're right about the functionality.

      However I just tested it, and the tool works perfectly fine with pb_ds.

      Yeah, for those #defines it can be useful. I haven't even thought about them, since I don't use them (snippet support in the editor is enough). I didn't like the large line in your debugger output, but on the other hand that code isn't submitted, so who cares.

      • »
        »
        »
        »
        5 years ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        Interesting that it works. Maybe the AST is still valid to some extent, even with the missing dependencies. The large line in my output doesn’t look too nice, but as you said, it’s not the submitted version. The reason I put the output from running the script with -DDEBUG is to show that you could use extra arguments and how the macros are expanded (however, running the inliner with the DEBUG flag isn’t a realistic scenario).

        I guess there are advantages here and there, like portability (if it’s worth much) and simplicity. It’s also probably faster, if this matters.

        I’m going to think a bit about whether you could opt to ignore ‘unused files’ in an easier way than parsing the C++ file, but including library files one by one might not be that bad for many people (I guess it depends on the granularity and modularity of your library).

  • »
    »
    5 years ago, # ^ |
      Vote: I like it +18 Vote: I do not like it

    This seems similar to CHelper (for Java), which inlines everything into one file and only keeps symbols that are used.

  • »
    »
    5 years ago, # ^ |
      Vote: I like it +8 Vote: I do not like it

    I've also been using caide-inliner for a while. Since compiling the binary was a painful experience, I made a wrapper in Docker (caide-docker) that uses the released binary.

    • »
      »
      »
      5 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      If you use the release binary, what exactly is the use for Docker? That binary should run on a Linux machine the exact same way as inside the container. Or are you using Windows or macOS?

»
5 years ago, # |
Rev. 2   Vote: I like it +8 Vote: I do not like it

I use the following bash script for the same purpose:

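#!/bin/bash
# Usage: script NAME [extra compiler args]; preprocesses NAME.cpp into NAME.generated.cpp.
# Uses bash because ${@:2} (all arguments from the second one on) is not POSIX sh.
clang++ "${@:2}" -P -E -D CP_GEN "$1.cpp" > "$1.generated.cpp" 2> /dev/null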
  • »
    »
    5 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    How does that work with #includes from the standard library?

    FYI, mine does the same thing, but avoids processing the standard includes.

    • »
      »
      »
      5 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      To avoid inlining the standard library, I include it like this:

      #ifdef CP_GEN
      #define INCLUDE #include
      INCLUDE <bits/stdc++.h>
      #else
      #include <bits/stdc++.h>
      #endif
      
»
5 years ago, # |
  Vote: I like it +10 Vote: I do not like it

Cool script. If you want to keep track of modifications, it is probably better to post it as a GitHub gist or as a git repo.

I wonder whether this is possible using only the C preprocessor (maybe there is some GCC or Clang option that does this).

  • »
    »
    5 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    I’m not sure what you mean. It is using only the GCC preprocessor (with little added functionality on top).

»
5 years ago, # |
  Vote: I like it +8 Vote: I do not like it

You can also use command_fs to avoid running this manually.