Functions
template<typename F >
void	Grappa::call_on_all_cores (F work)
	Call message (work that cannot block) on all cores, block until ack received from all.

template<typename F >
void	Grappa::on_all_cores (F work)
	Spawn a private task on each core, block until all complete.

template<typename T , T(*)(const T &, const T &) ReduceOp>
T	Grappa::allreduce (T myval)
	Called from SPMD context, reduces values from all cores calling `allreduce` and returns reduced values to everyone.

template<typename T , T(*)(const T &, const T &) ReduceOp>
void	Grappa::allreduce_inplace (T *array, size_t nelem=1)
	Called from SPMD context.

template<typename T , T(*)(const T &, const T &) ReduceOp>
T	Grappa::reduce (const T *global_ptr)
	Called from a single task (usually user_main), reduces values from all cores onto the calling node.

template<typename T , T(*)(const T &, const T &) ReduceOp>
T	Grappa::reduce (GlobalAddress< T > localizable)
	Reduce over a symmetrically allocated object.

template<typename T , typename P , T(*)(const T &, const T &) ReduceOp, T(*)(GlobalAddress< P >) Accessor>
T	Grappa::reduce (GlobalAddress< P > localizable)
	Reduce over a member of a symmetrically allocated object.

template<typename F = nullptr_t>
auto	Grappa::sum_all_cores (F func) -> decltype(func())
	Custom reduction from all cores.

Detailed Description

Function Documentation

template<typename T , T(*)(const T &, const T &) ReduceOp>

T Grappa::allreduce ( T myval )

Called from SPMD context, reduces values from all cores calling allreduce and returns reduced values to everyone.

Blocks until the reduction is complete, so it also suffices as a global barrier.

Warning: Only one allreduce with a given type/op combination may be in use at a time, because it uses a function-private static variable.
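The restriction follows from how C++ instantiates function templates: every distinct (type, op) combination gets its own copy of a function-local static, while two uses of the same combination share a single copy. The standalone sketch below illustrates only that language rule. It does not use Grappa and does not reproduce Grappa's actual reduction state; `scratch_slot`, `shares_state`, `add` and `mul` are hypothetical names introduced for illustration.

```cpp
#include <cassert>

// Hypothetical operators with the signature the collectives expect.
int add(const int& a, const int& b) { return a + b; }
int mul(const int& a, const int& b) { return a * b; }

// One function-local static exists per template instantiation,
// that is, per (T, ReduceOp) combination.
template <typename T, T (*ReduceOp)(const T&, const T&)>
T& scratch_slot() {
  static T slot{};
  return slot;
}

// True when two uses would land on the same static storage.
template <typename T, T (*OpA)(const T&, const T&), T (*OpB)(const T&, const T&)>
bool shares_state() {
  return &scratch_slot<T, OpA>() == &scratch_slot<T, OpB>();
}

// Same combination collides; a different operator gets separate storage.
inline void demo_static_sharing() {
  assert((shares_state<int, add, add>()));
  assert((!shares_state<int, add, mul>()));
}
```

Under this model, two overlapping reductions with the same type and operator would write to the same storage, which is the situation the warning rules out.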

Example:

Grappa::on_all_cores([]{
  int value = foo();
  int total = Grappa::allreduce<int,collective_add>(value);
});
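To make the semantics concrete, here is a single-process model of what allreduce computes, written without Grappa. It assumes the values are combined by folding `ReduceOp` over the per-core inputs and that every core receives the same result; `model_allreduce` and `add_op` are hypothetical names, and the sketch says nothing about how Grappa communicates or orders the reduction.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an operator such as collective_add.
int add_op(const int& a, const int& b) { return a + b; }

// per_core[i] models the value core i passes to allreduce.
// The returned vector holds the value each core would get back.
template <typename T, T (*ReduceOp)(const T&, const T&)>
std::vector<T> model_allreduce(const std::vector<T>& per_core) {
  assert(!per_core.empty());  // at least one participating core
  T total = per_core.front();
  for (std::size_t i = 1; i < per_core.size(); ++i) {
    total = ReduceOp(total, per_core[i]);
  }
  // Every core receives the same reduced value.
  return std::vector<T>(per_core.size(), total);
}
```

For inputs 1, 2, 3, 4 on four modeled cores, each core ends up with 10.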

Definition at line 261 of file Collective.hpp.

template<typename T , T(*)(const T &, const T &) ReduceOp>

void Grappa::allreduce_inplace	(	T *	array,
		size_t	nelem = 1
	)

Called from SPMD context.

Performs an in-place allreduce over an array of nelem elements. Every element of the array is overwritten with the reduced value from all cores.

Warning: Only one allreduce_inplace with a given type/op combination may be in use at a time, because it uses a function-private static variable.
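The following single-process sketch models the in-place variant without Grappa. It assumes the reduction is element-wise, so position i of every core's array ends up holding the reduction of position i across all cores; `model_allreduce_inplace` and `sum_op` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an operator such as collective_add.
int sum_op(const int& a, const int& b) { return a + b; }

// arrays[c] models the array that core c passes in.
// After the call, the first nelem elements of every array hold the
// element-wise reduction across all modeled cores.
template <typename T, T (*ReduceOp)(const T&, const T&)>
void model_allreduce_inplace(std::vector<std::vector<T>>& arrays,
                             std::size_t nelem = 1) {
  assert(!arrays.empty());
  for (const auto& a : arrays) {
    assert(a.size() >= nelem);  // every core supplies nelem elements
  }
  for (std::size_t i = 0; i < nelem; ++i) {
    T total = arrays[0][i];
    for (std::size_t c = 1; c < arrays.size(); ++c) {
      total = ReduceOp(total, arrays[c][i]);
    }
    for (auto& a : arrays) {
      a[i] = total;  // overwrite in place on every core
    }
  }
}
```

With three modeled cores holding {1, 2}, {10, 20} and {100, 200}, every array becomes {111, 222}.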

Definition at line 278 of file Collective.hpp.

template<typename F >

void Grappa::call_on_all_cores ( F work )

Call message (work that cannot block) on all cores, block until ack received from all.

Like Grappa::on_all_cores() but does not spawn tasks on each core. Can safely be called concurrently with others.

Definition at line 84 of file Collective.hpp.

template<typename F >

void Grappa::on_all_cores ( F work )

Spawn a private task on each core, block until all complete.

To be used for any SPMD-style work (e.g. initializing globals). Also used as a primitive in Grappa system code wherever work must run on all cores.

Example:

int x[Grappa::cores()];
GlobalAddress<int> x_base = make_global(x);
Grappa::on_all_cores([x_base]{
  Grappa::delegate::write(x_base+Grappa::mycore(), 1);
});

Definition at line 113 of file Collective.hpp.

template<typename T , T(*)(const T &, const T &) ReduceOp>

T Grappa::reduce ( const T * global_ptr )

Called from a single task (usually user_main), reduces values from all cores onto the calling node.

Blocks until reduction is complete. Safe to use any number of these concurrently.

Example:

static int x;
void user_main() {
  on_all_cores([]{ x = foo(); });
  int total = reduce<int,collective_add>(&x);
}

Definition at line 296 of file Collective.hpp.

template<typename T , T(*)(const T &, const T &) ReduceOp>

T Grappa::reduce ( GlobalAddress< T > localizable )

Reduce over a symmetrically allocated object.

Blocks until reduction is complete. Safe to use any number of these concurrently.

Example:

void user_main() {
  auto x = Grappa::symmetric_global_alloc<BlockAlignedInt>();
  // x must be captured; assumes BlockAlignedInt is assignable from int.
  on_all_cores([x]{ *x.localize() = foo(); });
  int total = reduce<int,collective_add>(x);
}

Definition at line 332 of file Collective.hpp.

template<typename T , typename P , T(*)(const T &, const T &) ReduceOp, T(*)(GlobalAddress< P >) Accessor>

T Grappa::reduce ( GlobalAddress< P > localizable )

Reduce over a member of a symmetrically allocated object.

The Accessor function is used to pull out the member. Blocks until reduction is complete. Safe to use any number of these concurrently.

Example:

struct BlockAlignedObj {
  int x;
} GRAPPA_BLOCK_ALIGNED;
int getX(GlobalAddress<BlockAlignedObj> o) {
  return o->x;
}
void user_main() {
  auto x = Grappa::symmetric_global_alloc<BlockAlignedObj>();
  on_all_cores([x]{ x->x = foo(); });
  int total = reduce<int,BlockAlignedObj,collective_add,&getX>(x);
}
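The Accessor pattern can be modeled on one machine as follows. This is a sketch under stated assumptions rather than Grappa's implementation: a plain `const P*` stands in for `GlobalAddress<P>`, a vector stands in for the per-core copies of the symmetric object, and `Obj`, `get_x`, `add_int` and `model_reduce_member` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical object with more than one member; only x is reduced.
struct Obj {
  int x;
  double weight;
};

int add_int(const int& a, const int& b) { return a + b; }

// Accessor: pulls the member of interest out of one core's copy.
int get_x(const Obj* o) { return o->x; }

// per_core[c] models core c's copy of the symmetric object.
template <typename T, typename P,
          T (*ReduceOp)(const T&, const T&),
          T (*Accessor)(const P*)>
T model_reduce_member(const std::vector<P>& per_core) {
  assert(!per_core.empty());
  T total = Accessor(&per_core[0]);
  for (std::size_t c = 1; c < per_core.size(); ++c) {
    total = ReduceOp(total, Accessor(&per_core[c]));
  }
  return total;
}
```

Passing the accessor as a template argument keeps the reduction itself unaware of the object's layout; only `get_x` needs to know which member is being combined.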

Definition at line 373 of file Collective.hpp.

template<typename F = nullptr_t>

auto Grappa::sum_all_cores ( F func ) -> decltype(func())

Custom reduction from all cores.

Takes a lambda to run on each core, returns the sum of all the results to the caller. This is often easier than using the "custom Accessor" version of reduce, and also works on symmetric addresses.

For a sum, reduce() could be expressed in terms of sum_all_cores():

int global_x;
// (in main task)
int total = sum_all_cores([]{ return global_x; });
// is equivalent to:
int total2 = reduce<int,collective_add>(&global_x);
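A single-process model of this behavior, written without Grappa, is sketched below. One deliberate difference is an assumption of the model: the callable here takes the modeled core index as an argument, whereas the lambda passed to sum_all_cores takes no arguments and reads core-local state. `model_sum_all_cores` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>

// Runs func once per modeled core and returns the sum of the results.
// The return type is deduced from the callable, mirroring the
// "-> decltype(func())" form in the signature above.
template <typename F>
auto model_sum_all_cores(std::size_t ncores, F func)
    -> decltype(func(std::size_t{0})) {
  assert(ncores > 0);  // at least one core contributes
  auto total = func(std::size_t{0});
  for (std::size_t c = 1; c < ncores; ++c) {
    total = total + func(c);
  }
  return total;
}
```

Because the result type comes from the lambda, the same helper works for any type that supports `+`, without naming a type or operator as template arguments.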

Definition at line 411 of file Collective.hpp.
