std::experimental::parallel::reduce
From cppreference.com
                    
                                        
                    < cpp | experimental
                    
                                                            
                    |   Defined in header  
<experimental/numeric>
  | 
||
|   template<class InputIt> 
typename std::iterator_traits<InputIt>::value_type reduce(  | 
(1) | |
|   template<class ExecutionPolicy, class InputIterator> 
typename std::iterator_traits<InputIt>::value_type reduce(  | 
(2) | |
|   template<class InputIt, class T> 
T reduce(InputIt first, InputIt last, T init);  | 
(3) | |
|   template<class ExecutionPolicy, class InputIt, class T> 
T reduce(ExecutionPolicy&& exec, InputIt first, InputIt last, T init);  | 
(4) | |
|   template<class InputIt, class T, class BinaryOp> 
T reduce(InputIt first, InputIt last, T init, BinaryOp binary_op);  | 
(5) | |
|   template<class ExecutionPolicy, class InputIt, class T, class BinaryOp> 
T reduce(ExecutionPolicy&& exec,  | 
(6) | |
1) same as reduce(first, last, typename std::iterator_traits<InputIt>::value_type{})
3) same as reduce(first, last, init, std::plus<>())
5) Reduces the range [first; last), possibly permuted and aggregated in unspecified manner, along with the initial value 
init over binary_op.
2,4,6) Same as (1,3,5), but executed according to 
exec
The behavior is non-deterministic if binary_op is not associative or not commutative.
The behavior is undefined if binary_op modifies any element or invalidates any iterator in [first; last).
Contents | 
[edit] Parameters
| first, last | - | the range of elements to apply the algorithm to | 
| init | - | the initial value of the generalized sum | 
| policy | - | the execution policy | 
| binary_op | - |   binary Callable that will be applied in unspecified order to the result of dereferencing the input iterators, the results of other binary_op and init.
 | 
| Type requirements | ||
 -
InputIt must meet the requirements of InputIterator.
 | 
||
[edit] Return value
Generalized sum of init and *first, *(first+1), ... *(last-1) over binary_op,
where generalized sum GSUM(op, a
1, ..., a
N) is defined as follows: 
-  if N=1, a
1 -  if N > 1, op(GSUM(op, b
1, ..., b
K), GSUM(op, b
M, ..., b
N)) where 
- 
-  b
1, ..., b
N may be any permutation of a1, ..., aN and - 1 < K+1 = M ≤ N
 
 -  b
 
in other words, the elements of the range may be grouped and rearranged in arbitrary order
[edit] Complexity
O(last - first) applications of binary_op.
[edit] Exceptions
- If execution of a function invoked as part of the algorithm throws an exception,
 
- 
-  if 
policyisparallel_vector_execution_policy, std::terminate is called -  if 
policyissequential_execution_policyorparallel_execution_policy, the algorithm exits with an exception_list containing all uncaught exceptions. If there was only one uncaught exception, the algorithm may rethrow it without wrapping inexception_list. It is unspecified how much work the algorithm will perform before returning after the first exception was encountered. -  if 
policyis some other type, the behavior is implementation-defined 
 -  if 
 
-  If the algorithm fails to allocate memory (either for itself or to construct an 
exception_listwhen handling a user exception), std::bad_alloc is thrown. 
[edit] Notes
If the range is empty, init is returned, unmodified
-  If 
policyis an instance ofsequential_execution_policy, all operations are performed in the calling thread. -  If 
policyis an instance ofparallel_execution_policy, operations may be performed in unspecified number of threads, indeterminately sequenced with each other -  If 
policyis an instance ofparallel_vector_execution_policy, execution may be both parallelized and vectorized: function body boundaries are not respected and user code may be overlapped and combined in arbitrary manner (in particular, this implies that a user-provided Callable must not acquire a mutex to access a shared resource) 
[edit] Example
reduce is the out-of-order version of std::accumulate:
Run this code
#include <iostream> #include <chrono> #include <vector> #include <numeric> #include <experimental/execution_policy> #include <experimental/numeric> int main() { std::vector<double> v(10'000'007, 0.5); { auto t1 = std::chrono::high_resolution_clock::now(); double result = std::accumulate(v.begin(), v.end(), 0.0); auto t2 = std::chrono::high_resolution_clock::now(); std::chrono::duration<double, std::milli> ms = t2 - t1; std::cout << std::fixed << "std::accumulate result " << result << " took " << ms.count() << " ms\n"; } { auto t1 = std::chrono::high_resolution_clock::now(); double result = std::experimental::parallel::reduce( std::experimental::parallel::par, v.begin(), v.end()); auto t2 = std::chrono::high_resolution_clock::now(); std::chrono::duration<double, std::milli> ms = t2 - t1; std::cout << "parallel::reduce result " << result << " took " << ms.count() << " ms\n"; } }
Possible output:
std::accumulate result 5000003.50000 took 12.7365 ms parallel::reduce result 5000003.50000 took 5.06423 ms
[edit] See also
|    sums up a range of elements   (function template)  | 
|
|    applies a function to a range of elements   (function template)  | 
|
|    (parallelism TS) 
 | 
   applies a functor, then reduces out of order   (function template)  |