Neighboring country graph

This Jupyter notebook provides an example of using the Python package gravis. The .ipynb file can be found here.

It uses geographical data extracted from a Wikipedia page to represent countries and their neighbors as an undirected graph, where each country is a node and each edge is a neighborhood relation. Additionally, data from Gapminder is used to show national indicators and statistics such as GDP, life expectancy, or child mortality as node properties (e.g. size, color, shape).

[1]:
import json
import os

import gravis as gv
import networkx as nx
import numpy as np
import pandas as pd

Data fetching

Load countries and their neighboring countries

The data was fetched beforehand from a Wikipedia page by another notebook in the data directory.

[2]:
filepath = os.path.join('data', 'neighboring_countries.json')
with open(filepath) as f:
    country_data = json.load(f)
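
The graph-building loop further down iterates over `country_data.items()` and then over each value, so the JSON file presumably holds an object that maps each country name to a list of neighbor names. A minimal sketch of that assumed structure, using a made-up three-country excerpt rather than the real file:

```python
import json

# Hypothetical miniature stand-in for neighboring_countries.json
toy_json = '{"Portugal": ["Spain"], "Spain": ["Portugal", "France"], "France": ["Spain"]}'
toy_country_data = json.loads(toy_json)

# Each key is a country, each value the list of its neighbors
for country, neighbors in toy_country_data.items():
    print(country, "->", neighbors)
```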

Load national indicators and statistics

The data was downloaded previously by hand from the Gapminder website.

[3]:
filepath = os.path.join('data', 'country_statistics_gapminder.csv')
df = pd.read_csv(filepath)
df = df.fillna(0)

indicator_name = list(df.columns)
indicator_data = df.values.T.tolist()
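
To make the shape of `indicator_name` and `indicator_data` concrete, here is a small sketch on an invented two-row table (the column names and numbers are illustrative, not Gapminder data). Transposing turns the row-oriented table into one list per column, so the first list holds the country names.

```python
import pandas as pd

# Hypothetical table; the values are invented for illustration
toy_df = pd.DataFrame({
    "country": ["A", "B"],
    "Life expectancy": [70.5, None],
})
toy_df = toy_df.fillna(0)  # missing values become 0, as in the cell above

toy_names = list(toy_df.columns)
toy_data = toy_df.values.T.tolist()  # one list per column

print(toy_names)
print(toy_data)
```

One consequence of `fillna(0)` is that a missing value can no longer be told apart from a genuine zero, which matters when reading node sizes later on.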

Map country names from dataset 2 onto those in dataset 1

[4]:
gapminder_to_wikipeda_name = {
    "Cote d'Ivoire": "Côte d'Ivoire",
    "Congo, Dem. Rep.": "Democratic Republic of the Congo",
    "Congo, Rep.": "Republic of the Congo",
    "Gambia": "The Gambia",
    "Holy See": "Vatican City",
    "Kyrgyz Republic": "Kyrgyzstan",
    "Lao": "Laos",
    "Micronesia, Fed. Sts.": "Federated States of Micronesia",
    "Sao Tome and Principe": "São Tomé and Príncipe",
    "Slovak Republic": "Slovakia",
    "St. Kitts and Nevis": "Saint Kitts and Nevis",
    "St. Lucia": "Saint Lucia",
    "St. Vincent and the Grenadines": "Saint Vincent and the Grenadines",
    "Swaziland": "Eswatini (Swaziland)",
    "Timor-Leste": "East Timor",
}
# Comments on others:
#  Kyrgyzstan: "Kyrgyz Republic" is the official name according to wiki
#  Macedonia: Seems to be a part of Greece according to wiki, not a country itself but a region
#  Abkhazia: self-declared sovereign state in the South Caucasus, Georgian–Abkhazian conflict
#  Adélie Land: a claimed territory on the continent of Antarctica, France
#  Akrotiri and Dhekelia: a British Overseas Territory on the island of Cyprus
#  American Samoa: an unincorporated territory of the United States, southeast of Samoa
#  Anguilla: a British overseas territory in the Caribbean
#  Guam is the largest region of Micronesia

indicator_data[0] = [gapminder_to_wikipeda_name.get(c, c) for c in indicator_data[0]]
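
The renaming relies on `dict.get(key, default)`: names with an entry in the mapping are translated, all others pass through unchanged. The sketch below shows this on a made-up excerpt and also lists names that still lack a counterpart; `graph_nodes` is a hypothetical stand-in for the node names of the country graph.

```python
# Hypothetical excerpt of the mapping and of the Gapminder name column
name_map = {"Lao": "Laos", "Slovak Republic": "Slovakia"}
gapminder_names = ["Lao", "Germany", "Slovak Republic"]

# Mapped names are replaced, unmapped names are kept as they are
wikipedia_names = [name_map.get(name, name) for name in gapminder_names]
print(wikipedia_names)  # ['Laos', 'Germany', 'Slovakia']

# Names that still have no matching node (graph_nodes is an assumed example set)
graph_nodes = {"Laos", "Germany"}
unmatched = sorted(set(wikipedia_names) - graph_nodes)
print(unmatched)  # ['Slovakia']
```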

Create a graph

[5]:
graph = nx.Graph()
known_edges = set()
for source, targets in country_data.items():
    for target in targets:
        if (target, source) not in known_edges:
            known_edges.add((source, target))
            graph.add_edge(source, target)
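
As a side note, an undirected `nx.Graph` stores each edge only once, so adding the reverse pair again has no effect. The sketch below illustrates this on a made-up three-country dictionary without a separate set of known edges.

```python
import networkx as nx

# Hypothetical toy data in the same shape as country_data
toy_country_data = {"Portugal": ["Spain"], "Spain": ["Portugal", "France"], "France": ["Spain"]}

toy_graph = nx.Graph()
for source, targets in toy_country_data.items():
    for target in targets:
        # Adding ("Spain", "Portugal") after ("Portugal", "Spain") changes nothing
        toy_graph.add_edge(source, target)

print(toy_graph.number_of_nodes(), toy_graph.number_of_edges())  # 3 2
```

Note that a country with an empty neighbor list never reaches `add_edge` and therefore does not become a node unless it is added explicitly with `add_node`.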

Calculate graph properties

[6]:
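def detect_communities(graph, num_communities):
    # Girvan-Newman yields successively finer partitions. The first one has one
    # more community than the graph has connected components, and every further
    # step adds one community.
    community_generator = nx.algorithms.community.girvan_newman(graph)
    # Take at least one step so that the result is always defined
    for _ in range(max(1, num_communities - 2)):
        communities = next(community_generator)
    return communities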


def assign_node_color_by_community(graph, communities, colors=None):
    if colors is None:
        colors = ["blue", "orange", "green", "red", "darkviolet",
                  "brown", "pink", "gray", "yellowgreen", "lightblue", 'cyan']
    for community_number, community in enumerate(communities):
        for member in community:
            graph.nodes[member]["color"] = colors[community_number % len(colors)]
    return graph


def assign_node_position_by_community(graph, communities):
    x_shift = -1000
    y_shift = -600
    for community_number, community in enumerate(communities):
        # Relies on the "degree" attribute set by assign_node_size_by_degree
        sorted_community_members = sorted(community, key=lambda name: graph.nodes[name]["degree"])
        for member_number, member in enumerate(sorted_community_members):
            graph.nodes[member]["x"] = x_shift + member_number * 50
            graph.nodes[member]["y"] = y_shift + community_number * 100
    return graph


def assign_node_size_by_degree(graph):
    for node_id in graph.nodes:
        graph.nodes[node_id]["degree"] = 5 + graph.degree[node_id]
    return graph


def assign_edge_size_by_centrality(graph):
    edge_centralities = nx.algorithms.centrality.edge_betweenness_centrality(graph)
    #edge_centralities = nx.algorithms.centrality.edge_current_flow_betweenness_centrality(graph)
    for edge_id, centrality_value in edge_centralities.items():
        graph.edges[edge_id]["centrality"] = 0.2 + centrality_value * 40
    return graph


communities = detect_communities(graph, 12)
graph = assign_node_size_by_degree(graph)
graph = assign_edge_size_by_centrality(graph)
graph = assign_node_color_by_community(graph, communities)
graph = assign_node_position_by_community(graph, communities)
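
To illustrate the two NetworkX algorithms used above, the following sketch applies them to a small toy graph (a barbell graph, not the country graph). It shows that each Girvan-Newman step produces one more community and that the single bridge edge has the highest edge betweenness centrality.

```python
import itertools
import networkx as nx

# Toy graph: two 4-cliques joined by a single bridge edge between nodes 3 and 4
toy_graph = nx.barbell_graph(4, 0)

# Each partition yielded by girvan_newman has one more community than the previous one
partitions = nx.algorithms.community.girvan_newman(toy_graph)
first_three = list(itertools.islice(partitions, 3))
sizes = [len(p) for p in first_three]
print(sizes)  # [2, 3, 4]

# The first split separates the two cliques, because the bridge is removed first
first_split = sorted(sorted(c) for c in first_three[0])
print(first_split)  # [[0, 1, 2, 3], [4, 5, 6, 7]]

# The bridge carries every shortest path between the cliques,
# so it has the highest edge betweenness centrality
centrality = nx.algorithms.centrality.edge_betweenness_centrality(toy_graph)
bridge = max(centrality, key=centrality.get)
print(sorted(bridge))  # [3, 4]
```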

Add node properties from Gapminder data

[7]:
countries = indicator_data[0]

for i, indicator in enumerate(indicator_name):
    if i == 0:
        continue
    for country, number in zip(countries, indicator_data[i]):
        # 'Macedonia, FYR' has no matching node in the Wikipedia-based graph
        if country == 'Macedonia, FYR':
            continue
        graph.nodes[country][indicator] = number
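
Accessing `graph.nodes[country]` raises a `KeyError` when the country is not a node of the graph. A more defensive variant, sketched here on an invented toy graph with made-up values, filters the data to known nodes first and then attaches it with `nx.set_node_attributes`.

```python
import networkx as nx

# Hypothetical toy graph and indicator values (numbers are invented)
toy_graph = nx.Graph([("Laos", "Vietnam")])
toy_values = {"Laos": 68.0, "Vietnam": 74.0, "Atlantis": 1.0}

# Keep only countries that actually exist as nodes
known = {country: value for country, value in toy_values.items() if country in toy_graph}
nx.set_node_attributes(toy_graph, known, name="Life expectancy")

print(toy_graph.nodes["Laos"])
```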

Plot country graph

Notes on how you can interact with the plot:

  • The node positions are fixed initially (the y value is determined by the community, the x value by the node degree). Individual nodes can be released by dragging them. All nodes can be released in the Nodes menu with the Release fixed nodes button.

  • The node sizes are initially determined by the degree of each node, i.e. the number of edges it has. The Data selection menu has a drop-down menu called Node size, where for example Life expectancy can be chosen so that the node size reflects the life expectancy of each country. Furthermore, the sizes can be normalized: rather than using the raw value, each size is rescaled to lie between a chosen minimum and maximum for better visual appearance.
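
The normalization described above can be pictured as a linear rescaling onto a size range. The sketch below is only an illustration of that idea under the assumption of plain min-max scaling; it is not taken from gravis, whose actual implementation may differ. The helper `rescale`, the minimum of 10, and the input values are hypothetical; the maximum of 50 matches the value passed in the plotting cell.

```python
def rescale(values, new_min, new_max):
    """Linearly map values onto the interval [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # all values equal: avoid division by zero
        return [(new_min + new_max) / 2 for _ in values]
    return [new_min + (v - lo) / (hi - lo) * (new_max - new_min) for v in values]


raw = [0.0, 50.0, 100.0]  # invented indicator values
sizes = rescale(raw, 10, 50)
print(sizes)  # [10.0, 30.0, 50.0]
```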

[8]:
node_size_data_sources = [
    'degree',
    'Median age [Years] 2020',
    'Babies per woman 2018',
    'Child mortality [0-5 year olds dying per 1000 born] 2018',
]

for source in node_size_data_sources:
    print()
    print(source)
    fig = gv.d3(
        graph, zoom_factor=0.33, use_centering_force=False,
        node_hover_neighborhood=True, node_label_rotation=15,
        node_size_data_source=source, use_node_size_normalization=True, node_size_normalization_max=50,
        edge_size_data_source='centrality')
    fig.display(inline=True)

degree
[Interactive gravis figure: node size determined by degree]

Median age [Years] 2020
[Interactive gravis figure: node size determined by median age]

Babies per woman 2018
[Interactive gravis figure: node size determined by babies per woman]

Child mortality [0-5 year olds dying per 1000 born] 2018
[Interactive gravis figure: node size determined by child mortality]