Functions

template<typename F>
void Grappa::call_on_all_cores(F work)
    Call message (work that cannot block) on all cores, block until ack received from all.

template<typename F>
void Grappa::on_all_cores(F work)
    Spawn a private task on each core, block until all complete.

template<typename T, T(*ReduceOp)(const T &, const T &)>
T Grappa::allreduce(T myval)
    Called from SPMD context, reduces values from all cores calling allreduce and returns reduced values to everyone.

template<typename T, T(*ReduceOp)(const T &, const T &)>
void Grappa::allreduce_inplace(T *array, size_t nelem = 1)
    Called from SPMD context.

template<typename T, T(*ReduceOp)(const T &, const T &)>
T Grappa::reduce(const T *global_ptr)
    Called from a single task (usually user_main), reduces values from all cores onto the calling node.

template<typename T, T(*ReduceOp)(const T &, const T &)>
T Grappa::reduce(GlobalAddress<T> localizable)
    Reduce over a symmetrically allocated object.

template<typename T, typename P, T(*ReduceOp)(const T &, const T &), T(*Accessor)(GlobalAddress<P>)>
T Grappa::reduce(GlobalAddress<P> localizable)
    Reduce over a member of a symmetrically allocated object.

template<typename F = nullptr_t>
auto Grappa::sum_all_cores(F func) -> decltype(func())
    Custom reduction from all cores.

T Grappa::allreduce(T myval)
Called from SPMD context, reduces values from all cores calling allreduce
and returns reduced values to everyone.
Blocks until the reduction is complete, so it also serves as a global barrier.
Example:
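As a hedged illustration of the semantics described above, the following is a single-process model rather than Grappa code. The names `add_op` and `model_allreduce` are hypothetical, and the per-core values are represented by a plain vector (one entry per core) under the assumption that every core contributes exactly one value.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Binary operator with the shape the ReduceOp template parameter expects:
// T(*)(const T&, const T&). The name add_op is hypothetical.
template <typename T>
T add_op(const T& a, const T& b) { return a + b; }

// Single-process model of allreduce (not Grappa's implementation).
// per_core[i] stands in for the `myval` passed by core i. Every core gets
// the same reduced value back, so the result has one entry per core.
template <typename T, T (*ReduceOp)(const T&, const T&)>
std::vector<T> model_allreduce(const std::vector<T>& per_core) {
  assert(!per_core.empty());
  T total = per_core[0];
  for (std::size_t i = 1; i < per_core.size(); ++i) {
    total = ReduceOp(total, per_core[i]);
  }
  // Each simulated core observes the identical total.
  return std::vector<T>(per_core.size(), total);
}
```

In this model, calling `model_allreduce<int, add_op<int>>` on four cores holding 1, 2, 3 and 4 hands the value 10 back to every core.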
Definition at line 261 of file Collective.hpp.
void Grappa::allreduce_inplace(T *array, size_t nelem = 1)
Called from SPMD context.
Performs an in-place allreduce over an array of nelem elements. Each element of the array is overwritten with the reduced value (the total from all cores) for that position.
Definition at line 278 of file Collective.hpp.
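The element-wise behaviour of allreduce_inplace can be sketched with a single-process model. This is not Grappa code: `add_op` and `model_allreduce_inplace` are hypothetical names, and each core's `array` argument is modelled as one inner vector, assuming all cores pass arrays of the same length.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reduction operator matching T(*)(const T&, const T&).
template <typename T>
T add_op(const T& a, const T& b) { return a + b; }

// Single-process model (not Grappa's implementation).
// per_core[c] stands in for the `array` argument on core c.
template <typename T, T (*ReduceOp)(const T&, const T&)>
void model_allreduce_inplace(std::vector<std::vector<T>>& per_core) {
  assert(!per_core.empty());
  const std::size_t nelem = per_core[0].size();
  std::vector<T> total = per_core[0];
  for (std::size_t c = 1; c < per_core.size(); ++c) {
    assert(per_core[c].size() == nelem);
    // Combine position i across cores, independently of other positions.
    for (std::size_t i = 0; i < nelem; ++i) {
      total[i] = ReduceOp(total[i], per_core[c][i]);
    }
  }
  // Overwrite every core's local array with the reduced array.
  for (auto& local : per_core) {
    local = total;
  }
}
```

The reduction is applied per position, so element 0 on every core ends up holding the combination of all cores' element 0, and so on.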
void Grappa::call_on_all_cores(F work)
Call a message (work that cannot block) on all cores and block until an acknowledgement is received from all of them.
Like Grappa::on_all_cores() but does not spawn tasks on each core. Can safely be called concurrently with others.
Definition at line 84 of file Collective.hpp.
void Grappa::on_all_cores(F work)
Spawn a private task on each core, block until all complete.
To be used for any SPMD-style work (e.g. initializing globals). Also used as a primitive in Grappa system code wherever work must run on every core.
Example:
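The SPMD pattern described above (the same work run once per core, with the caller resuming only after all of it has finished) can be modelled in a single process. The sketch below is not Grappa code and runs the work sequentially instead of spawning tasks. `model_current_core`, `model_on_all_cores` and `model_init_globals` are hypothetical names, and the vector `global_x` stands in for a global variable that has one copy per core.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model state: index of the simulated core that is running.
static std::size_t model_current_core = 0;

// Sequential model of "run work once per core, return when all are done".
// Real tasks would run concurrently; this loop only captures the contract
// that every core has executed `work` before the caller continues.
template <typename F>
void model_on_all_cores(std::size_t ncores, F work) {
  for (std::size_t c = 0; c < ncores; ++c) {
    model_current_core = c;
    work();
  }
  model_current_core = 0;
}

// Example use: initialise a per-core "global" to a core-dependent value.
std::vector<int> model_init_globals(std::size_t ncores) {
  std::vector<int> global_x(ncores, -1);  // one slot per simulated core
  model_on_all_cores(ncores, [&global_x] {
    global_x[model_current_core] = static_cast<int>(model_current_core) * 10;
  });
  return global_x;
}
```

Because the model returns only after the loop has visited every core, the caller can rely on all per-core slots being initialised.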
Definition at line 113 of file Collective.hpp.
T Grappa::reduce(const T *global_ptr)
Called from a single task (usually user_main), reduces values from all cores onto the calling node.
Blocks until reduction is complete. Safe to use any number of these concurrently.
Example:
Definition at line 296 of file Collective.hpp.
T Grappa::reduce(GlobalAddress<T> localizable)
Reduce over a symmetrically allocated object.
Blocks until reduction is complete. Safe to use any number of these concurrently.
Example:
Definition at line 332 of file Collective.hpp.
T Grappa::reduce(GlobalAddress<P> localizable)
Reduce over a member of a symmetrically allocated object.
The Accessor function is used to pull out the member. Blocks until reduction is complete. Safe to use any number of these concurrently.
Example:
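To illustrate the role of the Accessor, here is a single-process model rather than Grappa code. `Stats`, `get_hits`, `add_op` and `model_reduce_member` are hypothetical names. As a simplification, the accessor takes a `const P&` instead of a `GlobalAddress<P>`, and the symmetric object is modelled as a vector with one copy per core.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical object with one copy on every core.
struct Stats {
  long hits;
  long misses;
};

// Hypothetical reduction operator matching T(*)(const T&, const T&).
template <typename T>
T add_op(const T& a, const T& b) { return a + b; }

// Hypothetical accessor: pulls the member to reduce out of one copy.
long get_hits(const Stats& s) { return s.hits; }

// Single-process model (not Grappa's implementation). copies[i] stands in
// for core i's copy of the symmetric object.
template <typename T, typename P,
          T (*ReduceOp)(const T&, const T&),
          T (*Accessor)(const P&)>
T model_reduce_member(const std::vector<P>& copies) {
  assert(!copies.empty());
  T total = Accessor(copies[0]);
  for (std::size_t i = 1; i < copies.size(); ++i) {
    total = ReduceOp(total, Accessor(copies[i]));
  }
  return total;
}
```

With three copies holding 3, 4 and 5 hits, `model_reduce_member<long, Stats, add_op<long>, get_hits>` returns 12 to the single caller. The `misses` member is never touched.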
Definition at line 373 of file Collective.hpp.
auto Grappa::sum_all_cores(F func) -> decltype(func())
Custom reduction from all cores.
Takes a lambda to run on each core, returns the sum of all the results to the caller. This is often easier than using the "custom Accessor" version of reduce, and also works on symmetric addresses.
Basically, reduce() could be implemented as:
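The relationship can be sketched in a single-process model. This is an illustration under stated assumptions, not Grappa's source. `model_current_core`, `model_sum_all_cores` and `model_reduce_sum` are hypothetical names, the core count is passed explicitly, and only the additive case of reduce is covered.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model state: index of the simulated core that is running.
static std::size_t model_current_core = 0;

// Model of a summing collective: run the nullary `func` once per core and
// add up the results for the caller. The return type follows the
// documented signature shape, decltype(func()).
template <typename F>
auto model_sum_all_cores(std::size_t ncores, F func) -> decltype(func()) {
  assert(ncores > 0);
  model_current_core = 0;
  auto total = func();
  for (std::size_t c = 1; c < ncores; ++c) {
    model_current_core = c;
    total = total + func();
  }
  model_current_core = 0;
  return total;
}

// An additive reduce expressed through the summing helper: the lambda
// simply returns the current core's copy of the value.
template <typename T>
T model_reduce_sum(const std::vector<T>& per_core) {
  return model_sum_all_cores(per_core.size(), [&per_core] {
    return per_core[model_current_core];
  });
}
```

The lambda form avoids writing a separate accessor function: any expression that can be evaluated on a core, including a member access, can be returned and summed.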
Definition at line 411 of file Collective.hpp.