Distributed execution¶
-
unified_map.univariate.distributed.
dask
(function, argument_list)¶ Apply a univariate function to a list of arguments in a distributed fashion.
Uses Dask’s delayed() function to build a task graph and compute() function with a cluster connection to calculate results.
Parameters: - function – A callable object that accepts one argument
- argument_list – An iterable object of input arguments
Returns: List of output results
Raises: ConnectionError
– If no connection to a Dask scheduler was established.Example
>>> def square(x): ... return x**2 ... >>> dask(square, [1, 2, 3, 4, 5]) [1, 4, 9, 16, 25]
Note
Requires that a connection to a scheduler has been established, see Cluster setup and Dask cluster setup.
References
-
unified_map.univariate.distributed.
spark
(function, argument_list)¶ Apply a univariate function to a list of arguments in a distributed fashion.
Uses Apache Spark’s map() and collect() functions provided by a resilient distributed dataset (RDD).
Parameters: - function – A callable object that accepts one argument
- argument_list – An iterable object of input arguments
Returns: List of output results
Raises: ConnectionError
– If no connection (“context”) to a Spark scheduler (“master”) was established.Example
>>> def square(x): ... return x**2 ... >>> spark(square, [1, 2, 3, 4, 5]) [1, 4, 9, 16, 25]
Note
Requires that a connection to a scheduler has been established, see Cluster setup and Spark cluster setup.
References