sslotin's blog

By sslotin, 3 years ago, in English

Hi everyone! I'm writing a book on performance engineering, and a few days ago, I finished a draft of one of its crown jewels: the SIMD programming chapter.

The main published findings:

  • You can compute array sums and other reductions such as the minimum 2x faster than std::accumulate or an auto-vectorized loop.
  • You can sometimes copy and assign memory 2x faster than memcpy and memset respectively.
  • You can search for an element in an array 10x faster than std::find or a hand-written loop.
  • You can count occurrences of a value in an array 1.5x faster than std::count or a hand-written loop.
  • You can calculate population counts of large vectors ~2x faster than with the intrinsic.
  • You can filter arrays 6-7x faster, which translates to e.g. a 6-7x faster quicksort (to be published).
  • You can calculate the index of the minimum element ~10x faster than std::min_element or a hand-written loop.
  • UPD: you can calculate the prefix sum of an array ~2.5x faster than std::partial_sum or a hand-written loop.

All speedup numbers are architecture-specific and may be different (usually larger) on CodeForces servers.
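
To give a flavor of the first item in the list, here is a minimal sketch of a sum reduction that keeps two independent vector accumulators, so that consecutive additions do not have to wait on each other. This is an illustration only, not the code from the chapter: it assumes GCC or Clang vector extensions, the names `v8si` and `simd_sum` are made up for this example, and whether it beats std::accumulate on a given machine has to be measured.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Eight 32-bit ints packed into one 256-bit vector (GCC/Clang vector extension).
// Without AVX2 enabled, the compiler lowers this to narrower instructions.
typedef int v8si __attribute__((vector_size(32)));

// Hypothetical helper: sums the array using two independent vector accumulators.
// Assumes the total fits in an int (no overflow handling).
int simd_sum(const std::vector<int> &a) {
    const int *p = a.data();
    const std::size_t n = a.size();

    v8si acc0 = {0, 0, 0, 0, 0, 0, 0, 0};
    v8si acc1 = {0, 0, 0, 0, 0, 0, 0, 0};

    std::size_t i = 0;
    // Main loop: 16 elements per iteration, 8 into each accumulator.
    for (; i + 16 <= n; i += 16) {
        v8si x0, x1;
        std::memcpy(&x0, p + i, sizeof x0);      // unaligned load of 8 ints
        std::memcpy(&x1, p + i + 8, sizeof x1);  // next 8 ints
        acc0 += x0;                              // element-wise addition
        acc1 += x1;
    }

    // Horizontal step: merge the accumulators and add up their 8 lanes.
    v8si acc = acc0 + acc1;
    int s = 0;
    for (int k = 0; k < 8; k++)
        s += acc[k];

    // Scalar tail for the remaining (fewer than 16) elements.
    for (; i < n; i++)
        s += p[i];

    return s;
}
```

Calling `simd_sum(v)` on a `std::vector<int>` should return the same value as `std::accumulate(v.begin(), v.end(), 0)`. To target 256-bit registers directly, compile with something like `-O2 -mavx2` (or `-march=native`).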

Enjoy — and as always, I'm happy to hear any comments and suggestions.

  • +503

»
3 years ago, # |
  +28

Super cool.

»
3 years ago, # |
  +23

Fascinating. If you have a library of optimized common algorithms, can you share it?

  • »
    »
    3 years ago, # ^ |
    Rev. 2   +25

    I don't have a library, but I have a (somewhat disorganized) repo with complete implementations of all the algorithms described in the book. Eventually turning it into a proper C/C++ library is definitely worthwhile, and it would also make these algorithms more likely to be merged into the mainstream STL implementations.

»
2 years ago, # |
  +44

An update: I will be speaking at a small online conference called Performance Summit tomorrow. I'm still figuring out the best way to create a YouTube live stream of my session, but once I do, the link will be in the "streams" section (you can also join via Zoom to ask questions).

The talk is called "The Art of SIMD Programming", and it covers pretty much everything in this chapter and a bit more. Come by if you prefer talks over books.

  • »
    »
    2 years ago, # ^ |
    Rev. 2   +22

    Just noticed that the stream was removed from the streams section, and I can't seem to find any Zoom link either. Is there a way to watch the stream somehow?

    • »
      »
      »
      2 years ago, # ^ |
      Rev. 2   +17

      Yeah, sorry, I screwed up with the YT stream — it turns out you can't properly run Zoom webinars and OBS at the same time on Linux. But the organizers promised to give me the recording right after the conference finishes, so I will post it here in a few hours.