I think it is confusing that there is no parallel version of std::accumulate
in the C++ standard. It appears to me that it would be trivial to implement it in parallel, for instance, based on OpenMP or SIMD instructions. Does anybody have a good explanation for why the standard committee chose to introduce a parallel version of std::reduce
but not std::accumulate
? Is it because of the different iterator types?
5
1 Answer
They introduced reduce
because accumulate
cannot be parallelized as it was written.
The C++ standard defines the expected behavior of the various functions. accumulate
is defined as follows:
Computes its result by initializing the accumulator
acc
with the initial valueinit
and then modifies it withacc = std::move(acc) + *i
oracc = binary_op(std::move(acc), *i)
for every
iteratori
in the range[first, last)
in order.
Emphasis added. Those two words make the algorithm unparallelizable. And because those words are part of the standard, users of accumulate
are allowed to rely on the "in order" guarantee. They can make operator+
or binary_op
non-commutative operations, ones that rely on the ordering of the operation. So the committee couldn’t just take those words away without potentially breaking a bunch of code.
And they didn’t want to have a parallel version of accumulate
that had fundamentally different behavior from the non-parallel version. Therefore, reduce
was introduced without the "in order" guarantee (and with other language that helped make it parallelizable).
What you want is
std::reduce
. See en.cppreference.com/w/cpp/algorithm/reduce9 hours ago
Hard question to ask here. Very often the answer to "Why is there no X in the Standard Library?" is an unsatisfying, "Because no one has written a proposal with a compelling reason for X's inclusion."
9 hours ago
Maybe dup of stackoverflow.com/questions/71093773/…
8 hours ago
Could a reason be that not all platforms can perform in parallel?
7 hours ago
@ThomasMatthews Nope. The reason (as described in Nicol Bolas' answer below) is that parallelism requires that order of operations doesn't matter (e.g. the order that that elements of the range are accessed, and binary operation applied), and
std::accumulate()
requires a fixed order. If the binary operation is commutative, parallelisation doesn't change the final result, only the way it is reached. If an implementation doesn't support parallelisation, all parallelisation policies will typically have no net effect, and the algorithms will run sequentially.7 hours ago