Vector vs Array

Another post recycled from my earlier notes. I really don’t have the motivation to improve it further 🦥. Vector vs Array Initialization The Vector is the preferred choice for data storage in modern C++. It is internally implemented on top of the Array. However, the performance gap between the two is indeed obvious. A Vector can be initialized via std::vector<T> vec(size). Meanwhile, an Array is initialized by T* arr = new T[size]...
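A minimal sketch of the two initializations named in the excerpt, assuming T is float. The helper names make_vector and make_array are hypothetical and not from the post. One language-level difference worth knowing: std::vector<T> vec(size) value-initializes its elements (zeros for float), while new T[size] leaves them default-initialized, so floats are indeterminate until written. Whether that accounts for the gap the post measures is an assumption here, not something the excerpt states.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical helper: std::vector<float>(size) value-initializes,
// so every element is written as 0.0f during construction.
std::vector<float> make_vector(std::size_t size) {
    return std::vector<float>(size);
}

// Hypothetical helper: new float[size] default-initializes, so the
// elements are indeterminate until the caller writes them. The
// unique_ptr is only here so the sketch does not leak; the post uses
// a raw T*.
std::unique_ptr<float[]> make_array(std::size_t size) {
    return std::unique_ptr<float[]>(new float[size]);
}
```

Reading an element of the array before writing to it is undefined behavior, so a caller has to fill it first.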

May 1, 2023 · 460 words · Yac

Gather with SIMD

Writing SIMD code that works across different platforms can be a challenging task. The following log illustrates how a seemingly simple operation in C++ can quickly escalate into a significant problem. Let’s look at the code below, where the elements of x are accessed through the indices specified by idx. Normal code: std::vector<float> x = /* some data */; std::vector<int> idx = /* index */; for (auto i : idx) { auto data = x[i]; } Gather with Intel In AVX512, Gather is a specific intrinsic function to transfer data from a data array to a target vec, according to an index vec....
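For reference, the scalar loop in the excerpt can be written as a small function that collects the gathered values. This is only a sketch of the operation a SIMD gather performs, out[k] = x[idx[k]]. The name gather_scalar and the bounds assertion are assumptions added here, not code from the post.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the post): scalar gather.
// For every position k, copies x[idx[k]] into out[k].
std::vector<float> gather_scalar(const std::vector<float>& x,
                                 const std::vector<int>& idx) {
    std::vector<float> out;
    out.reserve(idx.size());
    for (int i : idx) {
        // Assumption: every index must be valid for x.
        assert(i >= 0 && static_cast<std::size_t>(i) < x.size());
        out.push_back(x[static_cast<std::size_t>(i)]);
    }
    return out;
}
```

A vectorized version does the same work on several indices at once, which is where the platform-specific intrinsics come in.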

April 27, 2023 · 1014 words · Yac

SIMD is Pain

Writing code with SIMD for vectorization is painful. It deserves a blog series to record all sorts of pains I have encountered and (partially) overcome. Indeed, once the pain of coding and debugging is over, the program is lightning fast. Nonetheless, I am here to complain, not to praise. Let me state why writing SIMD code is causing me emotional damage: a single line of normal C++ code can easily inflate into a dozen lines of code....

April 25, 2023 · 477 words · Yac

Parallel Algorithms from Libraries

The content of this post is extracted from my previous random notes. I am too lazy to update and organize it 🦥. C++17 new feature – parallel algorithms The parallel algorithms and execution policies were introduced in C++17. Unfortunately, according to CppReference, only GCC and Intel support these features. Clang still leaves them unimplemented. A blog about it. The parallel library brought by C++17 requires the use of Intel’s oneTBB for multithreading....

April 25, 2023 · 382 words · Yac