Human interactome

This Jupyter notebook provides an example of using the Python package gravis. The .ipynb file can be found here.

It visualizes protein-protein interactions (PPi) from two datasets: the Human Reference Interactome (HuRI), and HI-union, which combines HuRI with other systematic screening efforts at CCSB.

References

  • Center for Cancer Systems Biology (CCSB)

    • The Human Reference Protein Interactome Mapping Project

      • Download

        • HuRI.tsv with 52569 interactions (Ensembl gene IDs)

        • HI-union.tsv with 64006 interactions

        • Preprint paper

          • “The dataset, versioned HI-III-19 (Human Interactome obtained from screening Space III, published in 2019), contains 52,569 verified PPIs involving 8,275 proteins (Supplementary Table 6). Given its systematic nature, completeness and scale, we consider HI-III-19 to be the first draft of the Human Reference Interactome (HuRI).”

          • “Combining HuRI with all previously published systematic screening efforts at CCSB yields 64,006 binary PPIs involving 9,094 proteins (HI-union)”

  • EMBL-EBI

    • Ensembl: Ensembl is a genome browser for vertebrate genomes

      • About: In order to improve consistency between the data provided by different genome browsers, Ensembl has entered into an agreement with UCSC and NCBI with regard to sequence identifiers

  • UniProt

    • About: The Universal Protein Resource (UniProt) is a comprehensive resource for protein sequence and annotation data.

[1]:
import csv
import os

import gravis as gv
import networkx as nx

Load protein-protein interaction (PPi) data

The data is given as a table of (source, target) pairs, which is a simple edge list. Note that HI-union-minimal.tsv is a reduced version of HI-union.tsv that contains only the first two columns.

[2]:
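def load_csv_data(filepath, delimiter=','):
    # newline='' lets the csv module handle line endings itself, as its documentation recommends
    with open(filepath, newline='') as csv_file:
        csv_reader = csv.reader(csv_file, delimiter=delimiter)
        data = list(csv_reader)
    return data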


filepath = os.path.join('data', 'HuRI.tsv')
data_huri = load_csv_data(filepath, delimiter='\t')

filepath = os.path.join('data', 'HI-union-minimal.tsv')
data_hi_union = load_csv_data(filepath, delimiter='\t')
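
A reduced two-column file like the one mentioned above could be produced along the following lines. This is only a sketch: the helper name `keep_first_two_columns` and the extra column in the sample are hypothetical, since the actual column layout of HI-union.tsv is not shown in this notebook.

```python
import csv
import io

# Hypothetical sample standing in for a wider TSV file. The third column
# is invented for illustration; the real layout of HI-union.tsv may differ.
sample = io.StringIO(
    "P61244-1\tP01106-1\textra_a\n"
    "O60341-1\tP01106-1\textra_b\n"
)


def keep_first_two_columns(in_file, out_file, delimiter='\t'):
    """Copy only the (source, target) columns of a delimited file."""
    reader = csv.reader(in_file, delimiter=delimiter)
    writer = csv.writer(out_file, delimiter=delimiter, lineterminator='\n')
    for row in reader:
        writer.writerow(row[:2])


reduced = io.StringIO()
keep_first_two_columns(sample, reduced)
print(reduced.getvalue())
```

With real files, the two `io.StringIO` objects would be replaced by files opened with `newline=''`.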

Create PPi network as NetworkX graph

[3]:
def construct_graph(data, name):
    graph = nx.Graph()
    for source, target in data:
        graph.add_edge(source, target)

    uniprot_template = (
        'Degree: {degree}<br>'
        'Uniprot: <a href="https://www.uniprot.org/uniprot/{id}" target="_blank">{id}</a>')
    ensembl_template = (
        'Degree: {degree}<br>'
        'Ensembl: <a href="https://www.ensembl.org/Homo_sapiens/Gene/Summary?g={id}" target="_blank">{id}</a><br>'
        'NCBI Gene: <a href="https://www.ncbi.nlm.nih.gov/gene/?term={id}" target="_blank">{id}</a>')

    for node_id in graph.nodes:
        node = graph.nodes[node_id]
        template = ensembl_template if node_id.lower().startswith('ens') else uniprot_template
        node['hover'] = template.format(id=node_id, degree=graph.degree[node_id])
        node['click'] = '$hover'
    print('Protein-protein interaction network "{}"'.format(name))
    print('- Number of nodes:', len(graph.nodes))
    print('- Number of edges:', len(graph.edges))
    print()
    return graph


graph_huri = construct_graph(data_huri, 'HuRI')
graph_hi_union = construct_graph(data_hi_union, 'HI-union')
Protein-protein interaction network "HuRI"
- Number of nodes: 8272
- Number of edges: 52548

Protein-protein interaction network "HI-union"
- Number of nodes: 9573
- Number of edges: 65330
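
The edge counts printed above need not match the number of rows in the input files. An `nx.Graph` stores at most one edge per unordered node pair, so repeated or reversed rows collapse into a single edge. The sketch below counts unique unordered pairs in a made-up edge list using only the standard library; whether this accounts for any difference in the HuRI or HI-union files is not verified here.

```python
def count_unique_undirected_pairs(rows):
    """Count distinct unordered (source, target) pairs in an edge list."""
    return len({frozenset(row) for row in rows})


# Made-up rows for illustration only
rows = [
    ('A', 'B'),
    ('B', 'A'),  # same pair, reversed
    ('A', 'B'),  # exact duplicate
    ('A', 'C'),
    ('C', 'C'),  # self-interaction, counted as one pair
]
print(len(rows), 'rows ->', count_unique_undirected_pairs(rows), 'unique pairs')
```

Applied to `data_huri` or `data_hi_union`, the same function would give a number that can be compared with the edge count reported by `construct_graph`.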

Plot filtered versions of the large graph

Filter 1: Egocentric network (=neighborhood of a chosen node)

Focus on an actor (“ego”) and show all edges to its direct neighbors (“alters”) and between them.

Chosen here is the MYC gene, with ENSG00000136997 as the Ensembl identifier of the gene and P01106 as the UniProt identifier of the protein encoded by the gene.
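
The idea of an egocentric network can be illustrated on a toy graph before applying it to the real data. The sketch below uses `nx.ego_graph` with made-up node names and shows how the radius controls which nodes and edges are kept.

```python
import networkx as nx

# Toy graph with made-up node names: a triangle around the ego plus a tail
toy = nx.Graph([('ego', 'a'), ('ego', 'b'), ('a', 'b'), ('b', 'c'), ('c', 'd')])

# radius=1 keeps the ego, its direct neighbors and the edges among them
one_hop = nx.ego_graph(toy, 'ego', radius=1)

# radius=2 additionally keeps the neighbors of the neighbors
two_hop = nx.ego_graph(toy, 'ego', radius=2)

print(sorted(one_hop.nodes), one_hop.number_of_edges())
print(sorted(two_hop.nodes), two_hop.number_of_edges())
```

Node `d` is three hops away from the ego, so it does not appear in either subgraph.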

[4]:
def list_edges_containing_a_node(data, beginning_node_id):
    for source, target in data:
        for node_id in [source, target]:
            if node_id.startswith(beginning_node_id):
                print(' ', source, target)

gene_id = 'ENSG00000136997'
print('Edges containing the gene id "{}" in HuRI database'.format(gene_id))
list_edges_containing_a_node(data_huri, gene_id)

print()

protein_id = 'P01106'
print('Edges containing the protein id "{}" in HI-union database'.format(protein_id))
list_edges_containing_a_node(data_hi_union, protein_id)
Edges containing the gene id "ENSG00000136997" in HuRI database
  ENSG00000004487 ENSG00000136997
  ENSG00000125952 ENSG00000136997

Edges containing the protein id "P01106" in HI-union database
  P61244-1 P01106-1
  P61244-1 P01106-1
  P61244-1 P01106-1
  O60341-1 P01106-1
  P01106-1 P61244-1
  P01106-1 P61244-1
  O60341-1 P01106-1
  O60341-1 P01106-1
  P61244-1 P01106-1
[5]:
def create_ego_graph(graph, ego_node_id, radius=1):
    ego_graph = nx.ego_graph(graph, ego_node_id, radius=radius)
    ego_node = ego_graph.nodes[ego_node_id]
    ego_node['color'] = 'red'
    ego_node['label_color'] = 'red'
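    # Hop distance from the ego to every node within the radius, computed in a single search
    distances = nx.single_source_shortest_path_length(graph, ego_node_id, cutoff=radius)
    pos_counter = {i: 0 for i in range(radius+1)}
    for node_id in ego_graph.nodes:
        node = ego_graph.nodes[node_id]
        distance = distances[node_id]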
        node['x'] = pos_counter[distance] * 40 - 1000
        node['y'] = distance * 120 - 150
        node['size'] = 10 + graph.degree[node_id] / 10
        pos_counter[distance] += 1
        if distance == 1:
            node['color'] = 'blue'
        elif distance == 2:
            node['color'] = 'green'
    print('Egocentric graph')
    print('- Number of nodes:', len(ego_graph.nodes))
    print('- Number of edges:', len(ego_graph.edges))
    return ego_graph

1) HuRI data

[6]:
# Examples of proteins in HuRI data (Ensembl gene identifiers)
#  Myc: ENSG00000136997
#  Max: ENSG00000125952
ego_graph = create_ego_graph(graph_huri, 'ENSG00000136997', radius=2)

gv.d3(ego_graph, zoom_factor=0.33, graph_height=250, node_label_rotation=35)
Egocentric graph
- Number of nodes: 53
- Number of edges: 119

2) HI-union data

[7]:
# Examples of proteins in HI-union data (UniProt identifiers)
#  Myc: 'P01106-1'
#  Max: 'P61244-1'
ego_graph = create_ego_graph(graph_hi_union, 'P01106-1', radius=2)

gv.d3(ego_graph, zoom_factor=0.33, graph_height=250, node_label_rotation=35)
Egocentric graph
- Number of nodes: 60
- Number of edges: 159

Filter 2: Only well-connected nodes with degree >= n

Show only proteins that have interactions with at least n other proteins.

[8]:
def create_high_degree_graph(graph, n):
    filtered_graph = graph.copy()

    # Step 1: remove nodes whose degree in the original graph is below n
    to_remove = [node for node, degree in graph.degree() if degree < n]
    filtered_graph.remove_nodes_from(to_remove)

    # Step 2: remove nodes that were left without any edge after step 1
    to_remove = [node for node, degree in filtered_graph.degree() if degree < 1]
    filtered_graph.remove_nodes_from(to_remove)

    print('Filtered graph containing only nodes of degree >= {}'.format(n))
    print('- Number of nodes:', len(filtered_graph.nodes))
    print('- Number of edges:', len(filtered_graph.edges))
    return filtered_graph
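
The effect of the two filtering steps can be seen on a small made-up graph. The sketch below is a compact variant of the same idea, not the function above: the helper name `keep_high_degree` is hypothetical, and it uses `Graph.subgraph` and `nx.isolates` instead of two explicit removal passes.

```python
import networkx as nx


def keep_high_degree(graph, n):
    """Keep nodes with degree >= n, then drop nodes left without edges."""
    keep = [node for node, degree in graph.degree() if degree >= n]
    filtered = graph.subgraph(keep).copy()
    filtered.remove_nodes_from(list(nx.isolates(filtered)))
    return filtered


# Made-up graph: three hubs with four leaves each; only hubs A and B are linked
toy = nx.Graph()
for hub, leaves in [('A', 'abcd'), ('B', 'efgh'), ('C', 'ijkl')]:
    toy.add_edges_from((hub, leaf) for leaf in leaves)
toy.add_edge('A', 'B')

result = keep_high_degree(toy, n=4)
print(sorted(result.nodes), result.number_of_edges())
```

Hub `C` has degree 4 and passes the degree threshold, but all of its neighbors are removed, so it ends up isolated and is dropped as well. This is why the filtered graphs below contain only high-degree proteins that interact with at least one other high-degree protein.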

1) HuRI data

[9]:
graph = create_high_degree_graph(graph_huri, n=150)

gv.d3(graph, node_label_size_factor=0.5)
Filtered graph containing only nodes of degree >= 150
- Number of nodes: 44
- Number of edges: 168

2) HI-union data

[10]:
graph = create_high_degree_graph(graph_hi_union, n=175)

gv.d3(graph, node_label_size_factor=0.5)
Filtered graph containing only nodes of degree >= 175
- Number of nodes: 56
- Number of edges: 333