{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# SNAP\n", "\n", "This Jupyter notebook provides an example of using the Python packages [gravis](https://pypi.org/project/gravis) and [SNAP](https://snap.stanford.edu) (=Stanford Network Analysis Platform). The .ipynb file can be found [here](https://github.com/robert-haas/gravis/tree/master/examples).\n", "\n", "## References\n", "\n", "- [SNAP website](https://snap.stanford.edu)\n", " - [Snap.py](https://snap.stanford.edu/snappy/index.html)\n", " - [Quick Introduction](https://snap.stanford.edu/snappy/index.html#introduction)\n", " - [Documentation](https://snap.stanford.edu/snappy/doc/index.html)\n", " - [Tutorial](https://snap.stanford.edu/snappy/doc/tutorial/index-tut.html)\n", " - [Reference manual](https://snap.stanford.edu/snappy/doc/reference/index-ref.html)\n", " - Datasets\n", " - [Stanford Large Network Dataset Collection](https://snap.stanford.edu/data/index.html)\n", " - [Stanford Biomedical Network Dataset Collection](https://snap.stanford.edu/biodata/index.html)\n", " - [Web and Blog datasets](https://snap.stanford.edu/data/other.html)\n", " - [Additional network dataset resources](https://snap.stanford.edu/data/links.html)\n", " - [Tutorials](https://cs.stanford.edu/people/jure/teaching.html) (Teaching)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Installation\n", "\n", "- With [pip](https://pypi.org/project/snap-stanford/): `pip install snap-stanford`\n", "- With [conda](https://anaconda.org/search?q=snap-stanford): Not available except for MacOS" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Import" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "\n", "import snap\n", "\n", "import gravis as gv" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Quick start\n", "\n", "### Example 1" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Create a random graph\n", "g = snap.GenForestFire(250, 0.35, 0.35)\n", "\n", "# Plot it\n", "gv.d3(g, zoom_factor=0.2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Graph construction\n", "\n", "### 1) Manual graph construction\n", "\n", "- Tutorial: [Graph creation](https://snap.stanford.edu/snappy/doc/tutorial/tutorial.html#graph-creation)\n", "- API reference: [Graph and network classes](https://snap.stanford.edu/snappy/doc/reference/graphs.html#graph-and-network-classes)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### 1.a) TUNGraph\n", "\n", "undirected, with self-loop (max 1 per node), without parallel edges, without attributes\n", "\n", "- [TUNGraph](https://snap.stanford.edu/snappy/doc/reference/graphs.html#tungraph)\n", " - AddNode(NId)\n", " - AddNode(NodeI)\n", " - AddEdge(SrcNId, DstNId)\n", " - AddEdge(EdgeI)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ug = snap.TUNGraph.New()\n", "\n", "\n", "# Node with automatic id (starts from 0 or highest user-defined id)\n", "ug.AddNode(-1) # argument: -1 for auto\n", "\n", "# Node with user-defined id\n", "ug.AddNode(5) # argument: int\n", "ug.AddNode(7)\n", "\n", "\n", "# Nodes\n", "# ~ Not supported (only via node iterator from other graph) ~\n", "\n", "\n", "# Edge (nodes need to exist already)\n", "ug.AddEdge(0, 5)\n", "ug.AddEdge(5, 7)\n", "ug.AddEdge(7, 0)\n", "ug.AddEdge(5, 5)\n", "ug.AddEdge(7, 7)\n", "\n", "\n", "# Edges\n", "# ~ Not supported (only via edge iterator from other graph) ~\n", "\n", "\n", "gv.d3(ug, graph_height=200)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### 1.b) TNGraph\n", "\n", "directed, with self-loop (max 1), without parallel edges, without attributes\n", "\n", "- [TNGraph](https://snap.stanford.edu/snappy/doc/reference/graphs.html#tngraph)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "dg = snap.TNGraph.New()\n", "\n", "\n", "# Node with automatic id\n", "dg.AddNode(-1) # argument: -1 for auto\n", "\n", "# Node with user-defined id\n", "dg.AddNode(5) # argument: int\n", "dg.AddNode(7)\n", "\n", "\n", "# Nodes\n", "# ~ Not supported (only via node iterator from other graph) ~\n", "\n", "\n", "# Edge (nodes need to exist already)\n", "dg.AddEdge(0, 5)\n", "dg.AddEdge(5, 0)\n", "dg.AddEdge(5, 7)\n", "dg.AddEdge(7, 0)\n", "dg.AddEdge(5, 5)\n", "dg.AddEdge(7, 7)\n", "\n", "\n", "# Edges\n", "# ~ Not supported (only via edge iterator from other graph) ~\n", "\n", "\n", "gv.d3(dg, graph_height=200)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### 1.c) TUndirNet (similar to TUNGraph)\n", "\n", "directed, with self-loop (max 1 per node), without parallel edges, without attributes\n", "\n", "- [TUndirNet](https://snap.stanford.edu/snappy/doc/reference/graphs.html#tundirnet)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "umg = snap.TUndirNet.New()\n", "\n", "\n", "# Node with automatic id\n", "umg.AddNode(-1) # argument: -1 for auto\n", "\n", "# Node with user-defined id\n", "umg.AddNode(5) # argument: int\n", "umg.AddNode(7)\n", "\n", "\n", "# Nodes\n", "# ~ Not supported (only via node iterator from other graph) ~\n", "\n", "\n", "# Edge (nodes need to exist already)\n", "umg.AddEdge(0, 5)\n", "umg.AddEdge(5, 0)\n", "umg.AddEdge(5, 7)\n", "umg.AddEdge(7, 0)\n", "umg.AddEdge(5, 5)\n", "umg.AddEdge(7, 7)\n", "\n", "\n", "# Edges\n", "# ~ Not supported (only via edge iterator from other graph) ~\n", "\n", "\n", "gv.d3(umg, graph_height=200)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### 1.d) TDirNet (similar to TNGraph)\n", "\n", "undirected, with self-loop (max 1 per node), without parallel edges, without attributes\n", "\n", "- [TDirNet](https://snap.stanford.edu/snappy/doc/reference/graphs.html#tdirnet)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dmg = snap.TDirNet.New()\n", "\n", "\n", "# Node with automatic id\n", "dmg.AddNode(-1) # argument: -1 for auto\n", "\n", "# Node with user-defined id\n", "dmg.AddNode(5) # argument: int\n", "dmg.AddNode(7)\n", "\n", "\n", "# Nodes\n", "# ~ Not supported (only via node iterator from other graph) ~\n", "\n", "\n", "# Edge (nodes need to exist already)\n", "dmg.AddEdge(0, 5)\n", "dmg.AddEdge(5, 0)\n", "dmg.AddEdge(5, 7)\n", "dmg.AddEdge(7, 0)\n", "dmg.AddEdge(5, 5)\n", "dmg.AddEdge(7, 7)\n", "\n", "\n", "# Edges\n", "# ~ Not supported (only via edge iterator from other graph) ~\n", "\n", "\n", "gv.d3(dmg, graph_height=200)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### 1.e) TNEANet\n", "\n", "undirected, with self-loops, with parallel edges, with attributes (no graph attributes, node and edge attributes after explicit declaration)\n", "\n", "- [TNEANet](https://snap.stanford.edu/snappy/doc/reference/graphs.html#tneanet)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dmg = snap.TNEANet.New()\n", "\n", "\n", "# Node with automatic id\n", "dmg.AddNode(-1) # argument: -1 for auto\n", "\n", "# Node with user-defined id\n", "dmg.AddNode(1) # argument: int\n", "dmg.AddNode(2)\n", "dmg.AddNode()\n", "dmg.AddNode(4)\n", "dmg.AddNode()\n", "dmg.AddNode(6)\n", "dmg.AddNode(7)\n", "\n", "\n", "# Nodes\n", "# ~ Not supported (only via node iterator from other graph) ~\n", "\n", "\n", "# Edge (nodes need to exist already)\n", "dmg.AddEdge(0, 1)\n", "dmg.AddEdge(1, 2)\n", "dmg.AddEdge(2, 3)\n", "dmg.AddEdge(3, 4)\n", "dmg.AddEdge(4, 5)\n", "dmg.AddEdge(5, 6)\n", "dmg.AddEdge(6, 7)\n", "dmg.AddEdge(7, 0)\n", "dmg.AddEdge(0, 0)\n", "dmg.AddEdge(0, 0)\n", "dmg.AddEdge(0, 1)\n", "dmg.AddEdge(0, 1)\n", "\n", "\n", "# Edges\n", "# ~ Not supported (only via edge iterator from other graph) ~\n", "\n", "\n", "gv.d3(dmg, graph_height=200)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Assign attributes to a created graph\n", "\n", "- API reference:\n", " - [TNEANet class](https://snap.stanford.edu/snappy/doc/reference/graphs.html#tneanet): the only data structure in SNAP which can have node and edge attributes\n", " - [Sparse attributes](https://snap.stanford.edu/snappy/doc/reference/attr.html#sparse-attributes)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "g = snap.TNEANet.New()\n", "for node in [0, 1, 2, 3, 4, 5, 6, 7]:\n", " g.AddNode(node)\n", "for source, target in [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (7, 0)]:\n", " g.AddEdge(source, target)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### 1) Graph attributes" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# ~ Not supported ~" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### 2) Node attributes" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Define node attributes\n", "g.AddIntAttrN('size') # int\n", "g.AddFltAttrN('opacity') # float\n", "g.AddStrAttrN('color') # str\n", "g.AddStrAttrN('shape') # str\n", "\n", "# Nodes\n", "num_nodes = len(list(g.Nodes()))\n", "for i in range(num_nodes):\n", " g.AddIntAttrDatN(i, 5 + i*5, 'size')\n", " g.AddStrAttrDatN(i, 'lightblue', 'color')\n", "\n", "# Node\n", "g.AddStrAttrDatN(3, 'darkred', 'color')\n", "g.AddStrAttrDatN(3, 'hexagon', 'shape')\n", "g.AddIntAttrDatN(3, 40, 'size')\n", "g.AddFltAttrDatN(3, 0.3, 'opacity')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### 3) Edge attributes" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Define edge attributes\n", "g.AddIntAttrE('size') # int\n", "# g.AddFltAttrE('opacity') # float\n", "g.AddStrAttrE('color') # str\n", "\n", "# Edges\n", "num_edges = len(list(g.Edges()))\n", "for i in range(num_edges):\n", " g.AddIntAttrDatE(i, 1 + i, 'size')\n", " g.AddStrAttrDatE(i, 'lightgreen', 'color')\n", "\n", "# Edge\n", "g.AddIntAttrDatE(3, 1, 'size')\n", "# g.AddFltAttrDatE(3, 1.0, 'opacity')\n", "g.AddStrAttrDatE(3, 'red', 'color')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "gv.d3(g, graph_height=200)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2) Algorithmic graph construction" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "num_nodes = 20\n", "num_edges = 10" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "g = snap.GenForestFire(num_nodes, 0.35, 0.35)\n", "\n", "# A random Erdos-Renyi directed graph\n", "g = snap.GenRndGnm(snap.PNGraph, num_nodes, num_edges)\n", "\n", "# A Preferential Attachment graph with out-degree 3\n", "g = snap.GenPrefAttach(num_nodes, 3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 3) Graph loading from an internal collection" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# TODO" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 4) Graph import and export\n", "\n", "#### Import\n", "\n", "- [Tutorial: Saving and Loading Graphs](https://snap.stanford.edu/snappy/doc/tutorial/tutorial.html#saving-and-loading-graphs)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "filepath = os.path.join('data', 'snap_graph.net')\n", "g = snap.LoadPajek(snap.PNGraph, filepath)\n", "\n", "# TODO" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Export\n", "\n", "- [Tutorial: Saving and Loading Graphs](https://snap.stanford.edu/snappy/doc/tutorial/tutorial.html#saving-and-loading-graphs)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "filepath = os.path.join('data', 'snap_graph.net')\n", "snap.SavePajek(g, filepath)\n", "\n", "# Other methods\n", "# snap.SaveEdgeList(g, 'snap_graph_edge_list.txt')\n", "# TODO" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Basic graph inspection\n", "\n", "### 1) Graph and its properties" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print('Type:', type(g))\n", "print('Directed:', )\n", "print('Number of nodes:', g.GetNodes())\n", "print('Number of edges:', g.GetEdges())\n", "print('Well-formed:', g.IsOk())\n", "print('Approximate graph diameter:', snap.GetBfsFullDiam(g, 10))\n", "print('Number of triads:', snap.GetTriads(g))\n", "print('Clustering coefficient', snap.GetClustCf(g))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2) Nodes and their properties" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#g.AddIntAttrN(\"NValInt\", 0)\n", "for node in g.Nodes():\n", " print('Class:', type(node))\n", " print(\"Id:\", node.GetId())\n", " print(\"In degree:\", node.GetInDeg())\n", " print(\"Out degree:\", node.GetOutDeg())\n", " break\n", " #g.AddIntAttrDatN(node_id, 42, \"NValInt\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 3) Edges and their properties" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for edge in g.Edges():\n", " print('Class:', type(edge))\n", " print(\"Id:\", edge.GetId())\n", " print(\"Source node id:\", edge.GetSrcNId())\n", " print(\"Target node id:\", edge.GetDstNId())\n", " break" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Calculating graph measures and metrics\n", "\n", "### 1) Quantitative measures" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "snap.GetClustCf(g)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2) Structure inference\n", "\n", "#### Community detection\n", "\n", "- [Docs: Community Detection](https://snap.stanford.edu/snappy/doc/reference/community.html)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "g = snap.GenRndGnm(snap.PUNGraph, 50, 100)\n", "\n", "CmtyV = snap.TCnComV()\n", "modularity = snap.CommunityGirvanNewman(g, CmtyV)\n", "\n", "CmtyV = snap.TCnComV()\n", "modularity = snap.CommunityCNM(g, CmtyV)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Other kinds of groups" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "g = snap.GenRndGnm(snap.PUNGraph, 50, 100)\n", "\n", "snap.GetTriads(g)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Graph visualization" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# TODO" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.12" } }, "nbformat": 4, "nbformat_minor": 2 }