Neighboring country graph¶
This Jupyter notebook provides an example of using the Python package gravis. The .ipynb file can be found here.
It uses geographical data extracted from a Wikipedia page to represent countries and their neighbors as an undirected graph, where each country is a node and each edge is a neighborhood relation. Additionally, data from Gapminder is used to show national indicators and statistics such as GDP, life expectancy, or child mortality as node properties (e.g. size, color, shape).
References¶
Wikipedia: List of countries and territories by land and maritime borders
Gapminder: Bulk data = all indicators displayed in Gapminder World
[1]:
import json
import os
import gravis as gv
import networkx as nx
import numpy as np
import pandas as pd
Data fetching¶
Load countries and their neighboring countries¶
The data was fetched beforehand from a Wikipedia page by another notebook in the data directory.
[2]:
filepath = os.path.join('data', 'neighboring_countries.json')
with open(filepath) as f:
    country_data = json.load(f)
Load national indicators and statistics¶
The data was downloaded previously by hand from the Gapminder website.
[3]:
filepath = os.path.join('data', 'country_statistics_gapminder.csv')
df = pd.read_csv(filepath)
df = df.fillna(0)
indicator_name = list(df.columns)
indicator_data = df.values.T.tolist()
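To make the reshaping step concrete, the following minimal sketch applies the same calls to a tiny, made-up table. The column names and values are invented for illustration and are not Gapminder data. It shows that missing values become 0 and that each entry of the resulting list holds one column, so the first entry contains the country names.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature table standing in for the Gapminder CSV
toy_df = pd.DataFrame({
    "Country": ["A", "B"],
    "Some indicator": [1.5, np.nan],
})
toy_df = toy_df.fillna(0)  # the missing value for "B" becomes 0

toy_names = list(toy_df.columns)
toy_data = toy_df.values.T.tolist()  # one inner list per column

print(toy_names)
print(toy_data)
```

Note that after `fillna(0)` a missing measurement can no longer be told apart from a true zero, which may matter when indicators are later mapped to node sizes.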
Map country names from dataset 2 onto those in dataset 1¶
[4]:
gapminder_to_wikipeda_name = {
    "Cote d'Ivoire": "Côte d'Ivoire",
    "Congo, Dem. Rep.": "Democratic Republic of the Congo",
    "Congo, Rep.": "Republic of the Congo",
    "Gambia": "The Gambia",
    "Holy See": "Vatican City",
    "Kyrgyz Republic": "Kyrgyzstan",
    "Lao": "Laos",
    "Micronesia, Fed. Sts.": "Federated States of Micronesia",
    "Sao Tome and Principe": "São Tomé and Príncipe",
    "Slovak Republic": "Slovakia",
    "St. Kitts and Nevis": "Saint Kitts and Nevis",
    "St. Lucia": "Saint Lucia",
    "St. Vincent and the Grenadines": "Saint Vincent and the Grenadines",
    "Swaziland": "Eswatini (Swaziland)",
    "Timor-Leste": "East Timor",
}
# Comments on others:
# Kyrgyzstan: "Kyrgyz Republic" is the official name according to wiki
# Macedonia: Seems to be a part of Greece according to wiki, not a country itself but a region
# Abkhazia: self-declared sovereign state in the South Caucasus, Georgian–Abkhazian conflict
# Adélie Land: a claimed territory on the continent of Antarctica, France
# Akrotiri and Dhekelia: a British Overseas Territory on the island of Cyprus
# American Samoa: an unincorporated territory of the United States, southeast of Samoa
# Anguilla: a British overseas territory in the Caribbean
# Guam is the largest region of Micronesia
indicator_data[0] = [gapminder_to_wikipeda_name.get(c, c) for c in indicator_data[0]]
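The renaming relies on `dict.get(key, default)` with the name itself as the default, so names without an entry in the table pass through unchanged. A minimal sketch with a hypothetical two-entry excerpt of such a table:

```python
# Hypothetical two-entry excerpt of a renaming table
name_map = {"Lao": "Laos", "Slovak Republic": "Slovakia"}
names = ["Lao", "France", "Slovak Republic"]

# Names found in the table are replaced, all others are kept as they are
renamed = [name_map.get(name, name) for name in names]
print(renamed)
```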
Create a graph¶
[5]:
graph = nx.Graph()
known_edges = set()
for source, targets in country_data.items():
    for target in targets:
        if (target, source) not in known_edges:
            known_edges.add((source, target))
            graph.add_edge(source, target)
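The effect of the `known_edges` check can be seen on a small example. The sketch below uses made-up adjacency data in which every border is listed from both sides, as is presumably the case for the Wikipedia data, and records each undirected pair only once.

```python
# Hypothetical symmetric adjacency data: each border is listed from both sides
toy_country_data = {
    "A": ["B", "C"],
    "B": ["A"],
    "C": ["A"],
}

known_edges = set()
for source, targets in toy_country_data.items():
    for target in targets:
        # Skip the pair if its mirror image was already recorded
        if (target, source) not in known_edges:
            known_edges.add((source, target))

print(sorted(known_edges))
```

Since an undirected `nx.Graph` stores each edge only once anyway, the explicit check mainly documents the intent and avoids redundant `add_edge` calls.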
Calculate graph properties¶
[6]:
def detect_communities(graph, num_communities):
    # Each step of the Girvan-Newman generator yields a partition with
    # one more community than the previous one.
    community_generator = nx.algorithms.community.girvan_newman(graph)
    for i in range(num_communities-2):
        communities = next(community_generator)
    return communities


def assign_node_color_by_community(graph, communities, colors=None):
    if colors is None:
        colors = ["blue", "orange", "green", "red", "darkviolet",
                  "brown", "pink", "gray", "yellowgreen", "lightblue", "cyan"]
    for community_number, community in enumerate(communities):
        for member in community:
            # Cycle through the palette if there are more communities than colors
            graph.nodes[member]["color"] = colors[community_number % len(colors)]
    return graph


def assign_node_position_by_community(graph, communities):
    # Requires the "degree" attribute set by assign_node_size_by_degree
    x_shift = -1000
    y_shift = -600
    for community_number, community in enumerate(communities):
        sorted_community_members = sorted(list(community), key=lambda name: graph.nodes[name]["degree"])
        for member_number, member in enumerate(sorted_community_members):
            # One row per community, members ordered by degree along the x axis
            graph.nodes[member]["x"] = x_shift + member_number * 50
            graph.nodes[member]["y"] = y_shift + community_number * 100
    return graph


def assign_node_size_by_degree(graph):
    for node_id in graph.nodes:
        # Offset of 5 keeps nodes with few neighbors visible
        graph.nodes[node_id]["degree"] = 5 + graph.degree[node_id]
    return graph


def assign_edge_size_by_centrality(graph):
    edge_centralities = nx.algorithms.centrality.edge_betweenness_centrality(graph)
    #edge_centralities = nx.algorithms.centrality.edge_current_flow_betweenness_centrality(graph)
    for edge_id, centrality_value in edge_centralities.items():
        graph.edges[edge_id]["centrality"] = 0.2 + centrality_value * 40
    return graph


communities = detect_communities(graph, 12)
graph = assign_node_size_by_degree(graph)
graph = assign_edge_size_by_centrality(graph)
graph = assign_node_color_by_community(graph, communities)
graph = assign_node_position_by_community(graph, communities)
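How the Girvan-Newman generator refines its partition step by step can be illustrated on a small graph. The sketch below is not part of the original analysis; it uses a toy graph made of two fully connected groups joined by a single bridge edge, for which the first split is easy to predict.

```python
import networkx as nx

# Two complete graphs with 4 nodes each, joined by a single bridge edge
toy_graph = nx.barbell_graph(4, 0)

partitions = nx.algorithms.community.girvan_newman(toy_graph)
first = next(partitions)   # first split of the connected graph
second = next(partitions)  # next, finer split

print(len(first), sorted(sorted(c) for c in first))
print(len(second))
```

For a connected graph, the k-th partition therefore contains k + 1 communities. If the graph already consists of several connected components, the count starts from that number instead, so the final number of communities also depends on how many disconnected groups of countries the data contains.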
Add node properties from Gapminder data¶
[7]:
countries = indicator_data[0]
for i, indicator in enumerate(indicator_name):
    if i == 0:
        continue
    for country, number in zip(countries, indicator_data[i]):
        if country in ['Macedonia, FYR']:
            continue
        graph.nodes[country][indicator] = number
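The cell above skips one country by name, because `graph.nodes[country]` raises a `KeyError` for any name that has no node in the graph. As a possible alternative, assuming it is acceptable to silently drop such entries, a membership test can handle unmatched names in general. The graph, values, and attribute name below are hypothetical.

```python
import networkx as nx

toy_graph = nx.Graph([("A", "B")])
# Hypothetical indicator values; "C" has no node in the graph
toy_values = {"A": 1.0, "B": 2.0, "C": 3.0}

skipped = []
for country, number in toy_values.items():
    if country not in toy_graph:
        # Remember unmatched names instead of raising a KeyError
        skipped.append(country)
        continue
    toy_graph.nodes[country]["Some indicator"] = number

print(skipped)
```

Collecting the skipped names makes it easy to check afterwards whether further entries are needed in the name mapping.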
Plot country graph¶
Notes on how you can interact with the plot:
The node positions are fixed in the beginning (the y value is determined by the community, the x value by the node degree). Individual nodes can be released by dragging them. All nodes can be released in the "Nodes" menu with the "Release fixed nodes" button.
The node sizes are initially determined by the degree of each node, i.e. the number of edges it has. The "Data selection" menu has a drop-down menu called "Node size" where, for example, "Life expectancy" can be chosen, so that the node size reflects the life expectancy of each country. Furthermore, the sizes can be normalized, so that instead of the raw value, a value rescaled to lie between a certain minimum and maximum size is used for better visual appearance.
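The idea behind size normalization can be sketched as a linear min-max rescaling. This is only an illustration of the concept under that assumption; the helper `normalize_sizes` is hypothetical and the exact formula used inside gravis may differ.

```python
def normalize_sizes(values, min_size, max_size):
    """Linearly rescale values so that they span [min_size, max_size]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are equal: fall back to the midpoint of the range
        return [(min_size + max_size) / 2 for _ in values]
    scale = (max_size - min_size) / (hi - lo)
    return [min_size + (v - lo) * scale for v in values]


# Made-up raw values, rescaled to node sizes between 5 and 50
sizes = normalize_sizes([0, 40, 80], min_size=5, max_size=50)
print(sizes)
```

With such a rescaling, indicators on very different scales (for example a median age in years and a mortality rate per 1000) lead to node sizes in the same visual range.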
[8]:
node_size_data_sources = [
    'degree',
    'Median age [Years] 2020',
    'Babies per woman 2018',
    'Child mortality [0-5 year olds dying per 1000 born] 2018',
]
for source in node_size_data_sources:
    print()
    print(source)
    fig = gv.d3(
        graph, zoom_factor=0.33, use_centering_force=False,
        node_hover_neighborhood=True, node_label_rotation=15,
        node_size_data_source=source, use_node_size_normalization=True, node_size_normalization_max=50,
        edge_size_data_source='centrality')
    fig.display(inline=True)
degree
Median age [Years] 2020
Babies per woman 2018
Child mortality [0-5 year olds dying per 1000 born] 2018