Distributed execution

unified_map.multivariate.distributed.dask(function, argument_list)

Apply a multivariate function to a list of arguments in a distributed fashion.

Uses Dask’s delayed() function to build a task graph, then computes the results with compute() over an established cluster connection.
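The shape of this pattern — build one task per argument tuple, then gather all results — can be sketched locally with the standard library. `distributed_map_sketch` below is a hypothetical stand-in, not the library’s implementation: submitting a task per tuple plays the role of delayed(), and collecting the futures plays the role of compute() on a cluster.

```python
from concurrent.futures import ThreadPoolExecutor

def distributed_map_sketch(function, argument_list):
    # Submit one task per argument tuple (analogous to building
    # delayed() tasks), then gather results in submission order
    # (analogous to compute() collecting from the cluster).
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(function, *args) for args in argument_list]
        return [future.result() for future in futures]

def add(x, y, z):
    return x + y + z

distributed_map_sketch(add, [(1, 2, 3), (10, 20, 30)])  # → [6, 60]
```

Iterating over the futures in submission order is what keeps the output list aligned with the input list, matching the behavior documented here.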

Parameters:
  • function – A callable object that accepts more than one argument
  • argument_list – An iterable object of input argument collections
Returns:

List of output results

Raises:

ConnectionError – If no connection (“client”) to a Dask scheduler was established.

Example

>>> def add(x, y, z):
...     return x + y + z
...
>>> dask(add, [(1, 2, 3), (10, 20, 30)])
[6, 60]

Note

Requires an established connection to a scheduler; see Cluster setup and Dask cluster setup.


unified_map.multivariate.distributed.spark(function, argument_list)

Apply a multivariate function to a list of arguments in a distributed fashion.

Uses the map() and collect() functions provided by an Apache Spark resilient distributed dataset (RDD).
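Spark’s map() passes each RDD element to the function whole, so a multivariate function must be wrapped to star-unpack its argument tuple before collect() gathers the results on the driver. The snippet below is a hypothetical local sketch of that calling convention using the built-in map(), not Spark itself:

```python
def spark_style_sketch(function, argument_list):
    # map() hands each element (an argument tuple) to the wrapper whole;
    # the wrapper star-unpacks it into the function's parameters.
    # list(...) plays the role of collect() gathering results.
    return list(map(lambda args: function(*args), argument_list))

def add(x, y, z):
    return x + y + z

spark_style_sketch(add, [(1, 2, 3), (10, 20, 30)])  # → [6, 60]
```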

Parameters:
  • function – A callable object that accepts more than one argument
  • argument_list – An iterable object of input argument collections
Returns:

List of output results

Raises:

ConnectionError – If no connection (“context”) to a Spark scheduler (“master”) was established.

Example

>>> def add(x, y, z):
...     return x + y + z
...
>>> spark(add, [(1, 2, 3), (10, 20, 30)])
[6, 60]

Note

Requires an established connection to a scheduler; see Cluster setup and Spark cluster setup.
