kgw
kgw - Knowledge Graph Workflows
A Python package for downloading, converting, and analyzing a selection of knowledge graphs.
Currently, five projects from the domain of biomedicine are covered. In the future, more projects from the same or other domains may be added. Contributions are welcome and encouraged!
Subpackages
Functions
- run(workflow, num_workers=None, verbose=True): Execute all tasks in the provided workflow according to their dependencies.
Package Contents
- kgw.run(workflow, num_workers=None, verbose=True)
Execute all tasks in the provided workflow according to their dependencies.
This function uses the package Luigi [1] to build a dependency graph of all tasks defined in the workflow and execute them in parallel with multiple worker processes.
- Parameters:
workflow (Project, or list of Project) – Specification of a workflow in the form of one or more project objects. A project object provides several methods that can be called to add specific tasks. For example, calling the method to_csv() stores a task in the object that represents the conversion of the project’s knowledge graph into CSV format. The workflow engine automatically detects these tasks, inspects their dependencies, and schedules all necessary steps in the correct order.
num_workers (int, optional, default=4*cpu_count) – The number of worker processes used to run tasks in parallel. If not specified, it defaults to 4 times the number of CPU cores available on the machine.
verbose (bool, optional, default=True) – If True, a log of tasks and a summary of results are written to stdout. If False, no text is printed.
- Returns:
success (bool) – True if all tasks were successfully scheduled and executed, otherwise False. A failed run can be resumed from an intermediate state without re-running previously completed tasks.
- Raises:
TypeError – Raised if workflow is not a project object or a list of such objects, if num_workers is not an integer, or if verbose is not a boolean.
ValueError – Raised if workflow is an empty list.
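The argument contract described above can be illustrated with a minimal, self-contained sketch. It does not import kgw and is not the package's implementation: the Project class here is a hypothetical stand-in, check_run_arguments is an invented helper name, and the order of the checks as well as the rejection of bool values for num_workers are assumptions of this sketch. It only mirrors the documented behavior: a single project or a list is accepted, an empty list raises ValueError, wrong types raise TypeError, and num_workers defaults to 4 times the CPU count.

```python
import os


class Project:
    """Hypothetical stand-in for a kgw project object (not the real class)."""

    def __init__(self):
        self.tasks = []

    def to_csv(self):
        # Only records the task; nothing is executed at this point.
        self.tasks.append("to_csv")


def check_run_arguments(workflow, num_workers=None, verbose=True):
    """Validate arguments as the documentation describes (illustrative helper).

    Returns the list of projects and the effective number of workers.
    """
    # Accept a single project or a list of projects.
    projects = workflow if isinstance(workflow, list) else [workflow]
    if not projects:
        raise ValueError("workflow must not be an empty list")
    if not all(isinstance(p, Project) for p in projects):
        raise TypeError("workflow must be a Project or a list of Project objects")

    if num_workers is None:
        # Documented default: 4 times the number of CPU cores.
        num_workers = 4 * (os.cpu_count() or 1)
    elif isinstance(num_workers, bool) or not isinstance(num_workers, int):
        # bool is a subclass of int in Python; rejecting it here is an
        # assumption of this sketch.
        raise TypeError("num_workers must be an integer")

    if not isinstance(verbose, bool):
        raise TypeError("verbose must be a boolean")
    return projects, num_workers


# Usage: register a task on a project, then validate the call arguments.
project = Project()
project.to_csv()
projects, workers = check_run_arguments(project)
```

With the real package, the corresponding call would be kgw.run(project), followed by a check of the returned boolean. Since a failed run can be resumed, calling kgw.run again with the same workflow after a False result should skip the tasks that already completed.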
References