Distributed execution
unified_map.multivariate.distributed.dask(function, argument_list)

Apply a multivariate function to a list of arguments in a distributed fashion.

Uses Dask's delayed() function to build a task graph and its compute() function with a cluster connection to calculate the results (a sketch of this pattern follows the entry below).

Parameters:
- function – A callable object that accepts more than one argument
- argument_list – An iterable object of input argument collections

Returns:
List of output results

Raises:
ConnectionError – If no connection to a Dask scheduler was established.

Example

>>> def add(x, y, z):
...     return x+y+z
...
>>> dask(add, [(1, 2, 3), (10, 20, 30)])
[6, 60]
Note
Requires that a connection to a scheduler has been established; see Cluster setup and Dask cluster setup.
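The snippet below is a minimal sketch of the delayed()/compute() pattern described above, not the unified_map implementation itself. The scheduler address, the dask.distributed Client connection, and the helper name dask_map are illustrative assumptions.

import dask
from dask.distributed import Client

def dask_map(function, argument_list):
    # Connect to a running Dask scheduler (address is an assumed example).
    client = Client("tcp://127.0.0.1:8786")
    # Build the task graph: one delayed call per argument collection.
    tasks = [dask.delayed(function)(*args) for args in argument_list]
    # Execute the graph on the cluster and gather the results.
    results = dask.compute(*tasks)
    return list(results)

def add(x, y, z):
    return x + y + z

dask_map(add, [(1, 2, 3), (10, 20, 30)])   # -> [6, 60]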
unified_map.multivariate.distributed.spark(function, argument_list)

Apply a multivariate function to a list of arguments in a distributed fashion.

Uses Apache Spark's map() and collect() functions provided by a resilient distributed dataset (RDD) (a sketch of this pattern follows the entry below).

Parameters:
- function – A callable object that accepts more than one argument
- argument_list – An iterable object of input argument collections

Returns:
List of output results

Raises:
ConnectionError – If no connection ("context") to a Spark scheduler ("master") was established.

Example

>>> def add(x, y, z):
...     return x+y+z
...
>>> spark(add, [(1, 2, 3), (10, 20, 30)])
[6, 60]
Note
Requires that a connection to a scheduler has been established; see Cluster setup and Spark cluster setup.
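The snippet below is a minimal sketch of the RDD map()/collect() pattern described above, not the unified_map implementation itself. The master URL, the application name, and the helper name spark_map are illustrative assumptions.

from pyspark import SparkContext

def spark_map(function, argument_list):
    # Create a connection ("context") to a Spark master (local master is an assumed example).
    sc = SparkContext(master="local[*]", appName="spark_map_sketch")
    try:
        # Distribute the argument collections as an RDD.
        rdd = sc.parallelize(list(argument_list))
        # Apply the function to each argument collection and gather the results.
        results = rdd.map(lambda args: function(*args)).collect()
    finally:
        sc.stop()
    return results

def add(x, y, z):
    return x + y + z

spark_map(add, [(1, 2, 3), (10, 20, 30)])   # -> [6, 60]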