Another post recycled from my earlier notes. I really don’t have motivation to improve it further 🦥.
Vector vs Array
Initilization
The Vector is the preferred choice for data storage in mordern C++. It is internally implemented based on the Array. However, the performance gap between the two is indeed obvious.
The Vector can be initialized via std::vector<T> vec(size). Meanwhile, an Array is initialized by T* arr = new T[size]
allocate data field of size 2^63 (initializing to default value Zero), the time difference:
| Data Field | time (1^-6s) |
|---|---|
Vector | 14.18 |
Array | 3.67 |
!! smart pointers (e.g., unique_ptr) are as slow as vector.
creating multi-dimensional arrays/vectors, the array of arrays still performs as the fastest one:
- Array of Array:
T** AoA = new T [size] - Array of Vector:
vector<T>* AoV = new vector<T> [size] - Vector of Vector:
vector<vector<T>> = VoV(size)Additionally, Static Array of Vector:vector<T> SAoV [size]. We ignore the static array of (static) array because they are not often used for large volume of data.
Given the 1st dimension to be 2^10 and the 2nd dimension to be 2^31, the time difference between those structures are:
| Structure | time (1^-6s) |
|---|---|
AoA | 0.005417 |
SAoA | 0.008417 |
AoV | 0.010792 |
VoV | 2.92248 |
Filling
C++ also provides function to fill the array/vector with the same elements, by the function std::fill. Additionally, vector can also be initialized with the same value. Below is the performance variation over different approaches.
| method | time (1^-6s) |
|---|---|
fill arr | 0.003667 |
fill vec | 0.003667 |
init vec | 0.007917 |
fill_n arr | 0.01225 |
omp parallel for | 8.55099 |
omp parallel for simd | 2.32663 |
using std::fill achieves the best performance for both array and vector. The initialization of vector with a specific value (i.e., init vec) performs as the second fastest method, which is mainly decelerated by vector’s inherent overhead. std::fill_n is (somehow?) substantially slower than the popular std::fill. Do NOT rely on the omp for simple data assignment, as its performance is TOO poor!!!
Usage of C-style Array
employ delete/delete[] when new/new[] are used. The number of delete and new shall be matched.
An object is created by new. When using delete, the destructor of the pointed-to object is called before de-allocation.
Similarly, an array of objects are allocated using new[]. The destructors of all created objects are called when calling the array version delete[].
Thus, if an array of vector is created using new[], e.g., AoV in the prior section, one just need one delete[] to release all resource. There is NO need to delete/erase/clear any vector objects before deleting the array.
As for the vector, which requires no new nor delete, the destructor of a std::vector automatically calls the destructors of the contained objects. Much more convenient (at the cost of speed), as programmers carry no burden of counting the new and delete pairs.