kgw - Knowledge Graph Workflows
A Python package for downloading, converting, and analyzing a selection of knowledge graphs.
Currently, five projects from the domain of biomedicine are covered. In the future, more projects from the same or other domains may be added. Contributions are welcome and encouraged!
Subpackages
Functions
run(workflow, num_workers=None, verbose=True) – Execute all tasks in the provided workflow according to their dependencies.
Package Contents
- kgw.run(workflow, num_workers=None, verbose=True)
Execute all tasks in the provided workflow according to their dependencies.
This function uses the package Luigi [1] to build a dependency graph of all tasks defined in the workflow and execute them in parallel with multiple worker processes.
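To illustrate the idea of dependency-ordered, parallel execution, here is a minimal sketch that uses only the Python standard library. It is not how kgw or Luigi is implemented: the task names and the helper run_tasks are hypothetical, and threads stand in for the worker processes mentioned above.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter

# Hypothetical task names; each key depends on the tasks in its value set.
dependencies = {
    "download": set(),
    "convert_to_sqlite": {"download"},
    "export_csv": {"convert_to_sqlite"},
    "export_json": {"convert_to_sqlite"},
}


def run_tasks(dependencies, num_workers=2):
    """Run tasks in dependency order and return them as completion batches."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    batches = []
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        while sorter.is_active():
            # Tasks whose dependencies have all finished can run in parallel.
            ready = sorted(sorter.get_ready())
            # Stand-in for real work: each "task" simply returns its own name.
            finished = list(pool.map(lambda name: name, ready))
            sorter.done(*finished)
            batches.append(finished)
    return batches


batches = run_tasks(dependencies)
for batch in batches:
    print(batch)
```

Each printed batch contains tasks that do not depend on one another, so the two export tasks end up in the same batch and could run concurrently.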
- Parameters:
  - workflow (Project, or list of Project) – Specification of a workflow in the form of one or more project objects. A project object provides several methods that can be called to add specific tasks. For example, calling the method to_csv() will store a task in the object that represents the conversion of the project’s knowledge graph into CSV format. The workflow engine automatically detects these tasks, inspects their dependencies, and schedules all necessary steps in the correct order.
  - num_workers (int, optional, default=4*cpu_count) – The number of worker processes to run tasks in parallel. If not specified, it defaults to 4 times the number of CPU cores available on the machine.
  - verbose (bool, optional, default=True) – If True, a log of tasks and a summary of results are written to stdout. If False, no text is printed.
- Returns:
  - success (bool) – Returns True if all tasks were successfully scheduled and executed, otherwise False. A failed run can be resumed from an intermediate state without re-running previously completed tasks.
- Raises:
  - TypeError – Raised if workflow is not a project object or list of such objects, or if num_workers is not an integer, or if verbose is not a boolean.
  - ValueError – Raised if workflow is an empty list.
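The argument rules described above can be sketched as a small validation helper. This is an illustration under stated assumptions, not the package's actual code: the Project class below is a hypothetical stand-in for a kgw project object, check_run_arguments is an invented name, and rejecting bool values for num_workers is an assumption (in Python, bool is a subclass of int).

```python
import os


class Project:
    """Hypothetical stand-in for a kgw project object."""


def check_run_arguments(workflow, num_workers=None, verbose=True):
    """Validate the arguments as documented and return normalized values."""
    # Accept a single project or a list of projects.
    if isinstance(workflow, Project):
        workflow = [workflow]
    elif not (
        isinstance(workflow, list)
        and all(isinstance(p, Project) for p in workflow)
    ):
        raise TypeError("workflow must be a project object or a list of them")
    if not workflow:
        raise ValueError("workflow must not be an empty list")

    # Default: 4 times the number of CPU cores (cpu_count may return None).
    if num_workers is None:
        num_workers = 4 * (os.cpu_count() or 1)
    elif isinstance(num_workers, bool) or not isinstance(num_workers, int):
        # Assumption: booleans are rejected even though bool subclasses int.
        raise TypeError("num_workers must be an integer")

    if not isinstance(verbose, bool):
        raise TypeError("verbose must be a boolean")
    return workflow, num_workers, verbose


projects, workers, verbose = check_run_arguments(Project())
print(len(projects), workers, verbose)
```

Normalizing a single project into a one-element list keeps the rest of such a helper free of special cases.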
References