[python] Drawing network charts with NetworkX
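A network chart (also called a graph) shows the interconnections between a set of entities. Each entity is represented by a node (or vertex), and the connections between nodes are represented by links (or edges). The theory and implementation of networks is a broad field of research, and a whole website could be devoted to it. For example, a network can be directed or undirected, weighted or unweighted, and there are many different input formats. To guide you through the field, I suggest working through the following examples in the order given. As for tooling, I rely mainly on the NetworkX library (version 2.4). Note, however, that Graph Tool is also worth considering, especially for large, high-dimensional networks. This chapter covers: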
- Basic network from a pandas data frame
- Custom NetworkX graph appearance
- Network layout possibilities
- Directed or undirected network
- Map a color to network nodes
- Map a color to the edges of a network
- Background color of a network chart
- Network from a correlation matrix
pip install networkx==2.4
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"
1. Basic network from a pandas data frame
# libraries
import pandas as pd
import numpy as np
import networkx as nx
import matplotlib.pyplot as plt
# Build a dataframe with 4 connections
df = pd.DataFrame({ 'from':['A', 'B', 'C','A'], 'to':['D', 'A', 'E','C']})
# Build your graph
# The default layout is randomly initialized, so the drawing may look different on each run
G=nx.from_pandas_edgelist(df, 'from', 'to')
# Plot it
nx.draw(G, with_labels=True)
  | from | to |
0 | A | D |
1 | B | A |
2 | C | E |
3 | A | C |
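Because the default layout changes between runs, it can help to inspect the graph object itself and to fix the layout seed when a reproducible picture is needed. The sketch below assumes the same four-row data frame as above; the seed value 42 is an arbitrary choice, not something prescribed by NetworkX.

```python
import numpy as np
import pandas as pd
import networkx as nx

# Same four connections as in the example above
df = pd.DataFrame({'from': ['A', 'B', 'C', 'A'], 'to': ['D', 'A', 'E', 'C']})
G = nx.from_pandas_edgelist(df, 'from', 'to')

# Inspect what was built: the distinct nodes and the number of undirected edges
nodes = sorted(G.nodes())
n_edges = G.number_of_edges()
print(nodes, n_edges)

# Passing the same seed makes spring_layout return the same positions each time
pos_a = nx.spring_layout(G, seed=42)
pos_b = nx.spring_layout(G, seed=42)
same_layout = all(np.allclose(pos_a[n], pos_b[n]) for n in G.nodes())
print(same_layout)

# To draw with these fixed positions: nx.draw(G, pos=pos_a, with_labels=True)
```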
2. Custom NetworkX graph appearance
- Nodes
- Labels
- Edges
- All together
## Nodes
# libraries
import pandas as pd
import numpy as np
import networkx as nx
import matplotlib.pyplot as plt
# Build a dataframe with your connections
df = pd.DataFrame({ 'from':['A', 'B', 'C','A'], 'to':['D', 'A', 'E','C']})
# Build your graph
G=nx.from_pandas_edgelist(df, 'from', 'to')
# Graph with custom nodes:
# with_labels: whether to show labels; node_size: node size; node_color: node color; node_shape: node shape; alpha: transparency; linewidths: width of the node border
nx.draw(G, with_labels=True, node_size=1500, node_color="skyblue", node_shape="s", alpha=0.5, linewidths=10)
  | from | to |
0 | A | D |
1 | B | A |
2 | C | E |
3 | A | C |
## Labels
# libraries
import pandas as pd
import numpy as np
import networkx as nx
import matplotlib.pyplot as plt
# Build a dataframe with your connections
df = pd.DataFrame({ 'from':['A', 'B', 'C','A'], 'to':['D', 'A', 'E','C']})
# Build your graph
G=nx.from_pandas_edgelist(df, 'from', 'to')
# Customize the labels:
# font_size: label font size; font_color: label font color; font_weight: label font weight
nx.draw(G, with_labels=True, node_size=1500, font_size=25, font_color="yellow", font_weight="bold")
  | from | to |
0 | A | D |
1 | B | A |
2 | C | E |
3 | A | C |
## Edges
# libraries
import pandas as pd
import numpy as np
import networkx as nx
import matplotlib.pyplot as plt
# Build a dataframe with your connections
df = pd.DataFrame({ 'from':['A', 'B', 'C','A'], 'to':['D', 'A', 'E','C']})
# Build your graph
G=nx.from_pandas_edgelist(df, 'from', 'to')
# Chart with custom edges:
# width: edge line width; edge_color: edge color; style: edge line style
nx.draw(G, with_labels=True, width=10, edge_color="skyblue", style="solid")
  | from | to |
0 | A | D |
1 | B | A |
2 | C | E |
3 | A | C |
## All together
# libraries
import pandas as pd
import numpy as np
import networkx as nx
import matplotlib.pyplot as plt
# Build a dataframe with your connections
df = pd.DataFrame({ 'from':['A', 'B', 'C','A'], 'to':['D', 'A', 'E','C']})
# Build your graph
G=nx.from_pandas_edgelist(df, 'from', 'to')
# All together we can do something fancy
nx.draw(G, with_labels=True, node_size=1500, node_color="skyblue", node_shape="o", alpha=0.5, linewidths=4, font_size=25, font_color="grey", font_weight="bold", width=2, edge_color="grey")
  | from | to |
0 | A | D |
1 | B | A |
2 | C | E |
3 | A | C |
3. Network layout possibilities
# libraries
import pandas as pd
import numpy as np
import networkx as nx
import matplotlib.pyplot as plt
# Build a dataframe with your connections
df = pd.DataFrame({ 'from':['A', 'B', 'C','A','E','F','E','G','G','D','F'], 'to':['D', 'A', 'E','C','A','F','G','D','B','G','C']})
# Build your graph
G=nx.from_pandas_edgelist(df, 'from', 'to')
   | from | to |
0 | A | D |
1 | B | A |
2 | C | E |
3 | A | C |
4 | E | A |
5 | F | F |
6 | E | G |
7 | G | D |
8 | G | B |
9 | D | G |
10 | F | C |
# Fruchterman Reingold: force-directed layout
nx.draw(G, with_labels=True, node_size=1500, node_color="skyblue", pos=nx.fruchterman_reingold_layout(G))
plt.title("fruchterman_reingold")
Text(0.5, 1.0, 'fruchterman_reingold')
# Circular layout
nx.draw(G, with_labels=True, node_size=1500, node_color="skyblue", pos=nx.circular_layout(G))
plt.title("circular")
Text(0.5, 1.0, 'circular')
# Random layout
nx.draw(G, with_labels=True, node_size=1500, node_color="skyblue", pos=nx.random_layout(G))
plt.title("random")
Text(0.5, 1.0, 'random')
# Spectral layout
nx.draw(G, with_labels=True, node_size=1500, node_color="skyblue", pos=nx.spectral_layout(G))
plt.title("spectral")
Text(0.5, 1.0, 'spectral')
# Spring layout (in NetworkX this also uses the Fruchterman-Reingold force-directed algorithm)
nx.draw(G, with_labels=True, node_size=1500, node_color="skyblue", pos=nx.spring_layout(G))
plt.title("spring")
Text(0.5, 1.0, 'spring')
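Each layout function returns a plain dictionary that maps every node to an (x, y) coordinate, and nx.draw simply receives that dictionary through pos. A minimal sketch, assuming the same edge list as above, that computes two layouts and inspects them (the `layouts` dictionary and the seed are illustrative choices):

```python
import pandas as pd
import networkx as nx

df = pd.DataFrame({'from': ['A', 'B', 'C', 'A', 'E', 'F', 'E', 'G', 'G', 'D', 'F'],
                   'to':   ['D', 'A', 'E', 'C', 'A', 'F', 'G', 'D', 'B', 'G', 'C']})
G = nx.from_pandas_edgelist(df, 'from', 'to')

# circular_layout is deterministic; random_layout accepts a seed for reproducibility
layouts = {
    'circular': nx.circular_layout(G),
    'random': nx.random_layout(G, seed=1),
}

for name, pos in layouts.items():
    # pos maps node -> array([x, y]); round the coordinates for readability
    print(name, {node: tuple(round(float(c), 2) for c in xy) for node, xy in pos.items()})

# Computing pos once lets several drawing calls share the same positions, e.g.
# nx.draw(G, pos=layouts['circular'], with_labels=True)
```

Keeping the position dictionary in a variable is also useful when the same network is drawn several times with different colors, since the nodes then stay in place.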
4. Directed or undirected network
# libraries
import pandas as pd
import numpy as np
import networkx as nx
import matplotlib.pyplot as plt
# Build a dataframe with your connections
# This time a pair can appear twice, once in each direction!
df = pd.DataFrame({ 'from':['D', 'A', 'B', 'C','A'], 'to':['A', 'D', 'A', 'E','C']})
# Build your graph. Note that we use the DiGraph class to create the graph!
# create_using=nx.DiGraph() creates a directed graph; the default is an undirected graph
G=nx.from_pandas_edgelist(df, 'from', 'to', create_using=nx.DiGraph())
# Draw the directed graph
nx.draw(G, with_labels=True, node_size=1500, alpha=0.3, arrows=True)
  | from | to |
0 | D | A |
1 | A | D |
2 | B | A |
3 | C | E |
4 | A | C |
# Build a dataframe with your connections
# This time a pair can appear twice, once in each direction!
df = pd.DataFrame({ 'from':['D', 'A', 'B', 'C','A'], 'to':['A', 'D', 'A', 'E','C']})
# Build your graph. Note that we use the Graph class to create the graph!
G=nx.from_pandas_edgelist(df, 'from', 'to', create_using=nx.Graph())
# Draw the graph (no arrowheads appear for an undirected graph, even with arrows=True)
nx.draw(G, with_labels=True, node_size=1500, alpha=0.3, arrows=True)
plt.title("UN-Directed")
  | from | to |
0 | D | A |
1 | A | D |
2 | B | A |
3 | C | E |
4 | A | C |
Text(0.5, 1.0, 'UN-Directed')
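The difference between the two graph types can also be checked without drawing anything. The sketch below, assuming the same five-row data frame, counts the edges in each graph; the variable names `directed` and `undirected` are introduced here for illustration.

```python
import pandas as pd
import networkx as nx

df = pd.DataFrame({'from': ['D', 'A', 'B', 'C', 'A'], 'to': ['A', 'D', 'A', 'E', 'C']})
directed = nx.from_pandas_edgelist(df, 'from', 'to', create_using=nx.DiGraph())
undirected = nx.from_pandas_edgelist(df, 'from', 'to', create_using=nx.Graph())

# D->A and A->D are two distinct edges in the directed graph
print(directed.number_of_edges())    # 5
# ... but they collapse into the single edge A-D in the undirected graph
print(undirected.number_of_edges())  # 4

# Direction matters only in the DiGraph
print(directed.has_edge('B', 'A'), directed.has_edge('A', 'B'))      # True False
print(undirected.has_edge('B', 'A'), undirected.has_edge('A', 'B'))  # True True
```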
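5. Map a color to network nodes
- The feature you want to map is a numerical value. In that case we use a continuous color scale. In the left-hand chart, A is darker than C, which is darker than B.
- The feature is categorical. In the right-hand chart, A and B belong to the same group, D and E are grouped together, and C is alone in its group. Here we use a categorical color scale.
- Continuous color scale
- Categorical color scale
## Continuous color scale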
# libraries
import pandas as pd
import numpy as np
import networkx as nx
import matplotlib.pyplot as plt
# Build a dataframe with your connections
df = pd.DataFrame({ 'from':['A', 'B', 'C','A'], 'to':['D', 'A', 'E','C']})
# And a data frame with characteristics for your nodes
carac = pd.DataFrame({ 'ID':['A', 'B', 'C','D','E'], 'myvalue':['123','25','76','12','34'] })
# Show the value assigned to each node
carac
# Build your graph
G =nx.from_pandas_edgelist(df, 'from', 'to', create_using=nx.Graph() )
# NetworkX stores the nodes in the following order:
G.nodes()
# Thus, we cannot pass the 'myvalue' column to NetworkX directly; we need to rearrange it first!
# Here is the tricky part: carac must be reordered so that each node gets the right color
# Index carac by ID, then reorder its rows to follow the node order
carac = carac.set_index('ID')
carac =carac.reindex(G.nodes())
# Plot it, providing a continuous color scale with cmap:
# node_color sets the colors and must be numeric (floats or ints) for cmap to apply; cmap is the colormap
nx.draw(G, with_labels=True, node_color=np.array(carac['myvalue'].values,dtype='float32'), cmap=plt.cm.Blues)
  | ID | myvalue |
0 | A | 123 |
1 | B | 25 |
2 | C | 76 |
3 | D | 12 |
4 | E | 34 |
NodeView(('A', 'D', 'B', 'C', 'E'))
ID | myvalue |
A | 123 |
D | 12 |
B | 25 |
C | 76 |
E | 34 |
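The reordering step can also be written as an explicit lookup, which makes it easier to see that each color value follows its node. This is only a sketch of an alternative, assuming the same `df` and `carac` data as above; `value_by_id` and `node_values` are names introduced here.

```python
import pandas as pd
import networkx as nx

df = pd.DataFrame({'from': ['A', 'B', 'C', 'A'], 'to': ['D', 'A', 'E', 'C']})
carac = pd.DataFrame({'ID': ['A', 'B', 'C', 'D', 'E'],
                      'myvalue': ['123', '25', '76', '12', '34']})
G = nx.from_pandas_edgelist(df, 'from', 'to')

# Index the values by node ID and convert the strings to floats
value_by_id = carac.set_index('ID')['myvalue'].astype(float)

# Build the list of values in exactly the order NetworkX iterates over the nodes
node_values = [value_by_id[node] for node in G.nodes()]
print(list(G.nodes()), node_values)

# node_values is aligned with G.nodes() and can be passed as node_color, e.g.
# (with matplotlib.pyplot imported as plt)
# nx.draw(G, with_labels=True, node_color=node_values, cmap=plt.cm.Blues)
```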
## Categorical color scale
# libraries
import pandas as pd
import numpy as np
import networkx as nx
import matplotlib.pyplot as plt
# Build a dataframe with your connections
df = pd.DataFrame({ 'from':['A', 'B', 'C','A'], 'to':['D', 'A', 'E','C']})
# And a data frame with characteristics for your nodes
carac = pd.DataFrame({ 'ID':['A', 'B', 'C','D','E'], 'myvalue':['group1','group1','group2','group3','group3'] })
# Build your graph
G=nx.from_pandas_edgelist(df, 'from', 'to', create_using=nx.Graph() )
# NetworkX stores the nodes in the following order:
G.nodes()
# Thus, we cannot pass the 'myvalue' column to NetworkX directly; we need to rearrange it first!
# Here is the tricky part: carac must be reordered so that each node gets the right color
carac= carac.set_index('ID')
# Reorder the rows to follow the node order
carac=carac.reindex(G.nodes())
# And I need to transform my categorical column into numerical codes: group1->0, group2->1...
carac['myvalue']=pd.Categorical(carac['myvalue'])
carac['myvalue'].cat.codes
# Customize the nodes:
nx.draw(G, with_labels=True, node_color=carac['myvalue'].cat.codes, cmap=plt.cm.Set1, node_size=1500)
NodeView(('A', 'D', 'B', 'C', 'E'))
A 0
D 2
B 0
C 1
E 2
dtype: int8
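The integer codes shown above come from pandas' categorical type, which numbers the distinct labels in sorted order starting from 0. A small standalone sketch of that conversion, using the group labels in node order (the `groups` series is constructed here purely for illustration):

```python
import pandas as pd

# Group labels in the node order A, D, B, C, E
groups = pd.Series(['group1', 'group3', 'group1', 'group2', 'group3'],
                   index=['A', 'D', 'B', 'C', 'E'])

cat = pd.Categorical(groups)
categories = list(cat.categories)
codes = cat.codes.tolist()

print(categories)  # ['group1', 'group2', 'group3']
print(codes)       # [0, 2, 0, 1, 2]
```

Nodes that share a code receive the same color from a qualitative colormap such as Set1.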
6. Map a color to the edges of a network
- Numerical
- Categorical
## Numerical
# libraries
import pandas as pd
import numpy as np
import networkx as nx
import matplotlib.pyplot as plt
# Build a dataframe with your connections
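# value: the numerical value attached to each link
df = pd.DataFrame({ 'from':['A', 'B', 'C','A'], 'to':['D', 'A', 'E','C'], 'value':[1, 10, 5, 5]})
# Build your graph
G=nx.from_pandas_edgelist(df, 'from', 'to', create_using=nx.Graph() )
# Customize the edges:
# edge_color sets the edge colors and edge_cmap maps the values to colors
# (the values are matched to edges by position, so this relies on the rows being in the same order as G.edges())
nx.draw(G, with_labels=True, node_color='skyblue', node_size=1500, edge_color=df['value'], width=10.0, edge_cmap=plt.cm.Blues)
  | from | to | value |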
0 | A | D | 1 |
1 | B | A | 10 |
2 | C | E | 5 |
3 | A | C | 5 |
## Categorical
# libraries
import pandas as pd
import numpy as np
import networkx as nx
import matplotlib.pyplot as plt
# Build a dataframe with your connections
# value: the type of each link
df = pd.DataFrame({ 'from':['A', 'B', 'C','A'], 'to':['D', 'A', 'E','C'], 'value':['typeA', 'typeA', 'typeB', 'typeB']})
# And I need to transform my categorical column into numerical codes: typeA->0, typeB->1...
df['value']=pd.Categorical(df['value'])
df['value'].cat.codes
# Build your graph
G=nx.from_pandas_edgelist(df, 'from', 'to', create_using=nx.Graph() )
# Customize the edges:
nx.draw(G, with_labels=True, node_color='skyblue', node_size=1500, edge_color=df['value'].cat.codes, width=10.0, edge_cmap=plt.cm.Set2)
  | from | to | value |
0 | A | D | typeA |
1 | B | A | typeA |
2 | C | E | typeB |
3 | A | C | typeB |
0 0
1 0
2 1
3 1
dtype: int8
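Both edge examples pass a data frame column as edge_color, which is matched to the edges by position and therefore works only while the rows happen to be in the same order as G.edges(). A more defensive sketch, assuming the numerical data frame from this section, stores the value on each edge through the edge_attr parameter and reads it back in NetworkX's own edge order (`edge_values` is a name introduced here):

```python
import pandas as pd
import networkx as nx

df = pd.DataFrame({'from': ['A', 'B', 'C', 'A'], 'to': ['D', 'A', 'E', 'C'],
                   'value': [1, 10, 5, 5]})

# edge_attr copies the 'value' column onto each edge as an attribute
G = nx.from_pandas_edgelist(df, 'from', 'to', edge_attr='value')

# Read the values back in the order NetworkX iterates over the edges
edge_values = [G[u][v]['value'] for u, v in G.edges()]
for (u, v), val in zip(G.edges(), edge_values):
    print(u, v, val)

# edge_values can then be passed as edge_color, e.g.
# (with matplotlib.pyplot imported as plt)
# nx.draw(G, with_labels=True, edge_color=edge_values, edge_cmap=plt.cm.Blues, width=10.0)
```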
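7. Background color of a network chart
You can change the background color of your network chart with fig.set_facecolor(). Note that if you want to keep the background color when saving to png, you need to pass facecolor=fig.get_facecolor() to plt.savefig().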
# libraries
import pandas as pd
import numpy as np
import networkx as nx
import matplotlib.pyplot as plt
# Build a dataframe with your connections
df = pd.DataFrame({ 'from':['A', 'B', 'C','A'], 'to':['D', 'A', 'E','C'] })
# Build your graph
G=nx.from_pandas_edgelist(df, 'from', 'to', create_using=nx.Graph() )
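# Customize the nodes:
fig = plt.figure()
nx.draw(G, with_labels=True, node_color='skyblue', node_size=1500, edge_color='white')
# Set the background color (after nx.draw, which sets the figure face color to white)
fig.set_facecolor("#00000F")  # example dark color, chosen here as an assumption; any Matplotlib color works
# If you want to save the figure to png:
# pass facecolor=fig.get_facecolor() when saving, otherwise the background will be white
# plt.savefig('yourname.png', facecolor=fig.get_facecolor(),dpi=300)
  | from | to |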
0 | A | D |
1 | B | A |
2 | C | E |
3 | A | C |
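For completeness, here is a runnable sketch of the whole sequence, including the save step. The color `#00000F` and the file name `network_dark.png` are hypothetical choices made for this example, and the file is written to the system temporary directory.

```python
import os
import tempfile

import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt

df = pd.DataFrame({'from': ['A', 'B', 'C', 'A'], 'to': ['D', 'A', 'E', 'C']})
G = nx.from_pandas_edgelist(df, 'from', 'to')

fig = plt.figure()
nx.draw(G, with_labels=True, node_color='skyblue', node_size=1500, edge_color='white')

# Set the background after drawing; the color is an arbitrary dark example
fig.set_facecolor('#00000F')

# Pass the face color explicitly so the saved file keeps the background
out_path = os.path.join(tempfile.gettempdir(), 'network_dark.png')
fig.savefig(out_path, facecolor=fig.get_facecolor(), dpi=100)
print(out_path)
```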
8. Network from a correlation matrix
# libraries
import pandas as pd
import numpy as np
import networkx as nx
import matplotlib.pyplot as plt
# I build a data set: 10 variables (columns A to J) with 10 observations each
# Two base series (values recovered from columns A and F of the output below); B-E are noisy copies of ind1 and G-J are noisy copies of ind5
ind1=[5,10,3,4,8,10,12,1,9,4]
ind5=[1,1,13,4,18,5,2,11,3,8]
df = pd.DataFrame({ 'A':ind1, 'B':ind1 + np.random.randint(10, size=(10)) , 'C':ind1 + np.random.randint(10, size=(10)) , 'D':ind1 + np.random.randint(5, size=(10)) , 'E':ind1 + np.random.randint(5, size=(10)), 'F':ind5, 'G':ind5 + np.random.randint(5, size=(10)) , 'H':ind5 + np.random.randint(5, size=(10)), 'I':ind5 + np.random.randint(5, size=(10)), 'J':ind5 + np.random.randint(5, size=(10))})
# Calculate the correlation between the variables. No transpose is needed here, because corr() computes the pairwise correlations between columns.
corr = df.corr()
# Transform it into a links data frame (3 columns only) by flattening the correlation matrix:
links = corr.stack().reset_index()
# Set the column names
links.columns = ['var1', 'var2','value']
# Keep only correlations over a threshold and remove self-correlations (cor(A,A)=1)
# i.e. drop rows where both variables are the same or the correlation is 0.8 or lower
links_filtered=links.loc[ (links['value'] > 0.8) & (links['var1'] != links['var2']) ]
# Build your graph
G=nx.from_pandas_edgelist(links_filtered, 'var1', 'var2')
# Plot the network:
nx.draw(G, with_labels=True, node_color='orange', node_size=500, edge_color='black', linewidths=5, font_size=15)
  | A | B | C | D | E | F | G | H | I | J |
0 | 5 | 13 | 6 | 8 | 8 | 1 | 4 | 2 | 1 | 3 |
1 | 10 | 19 | 10 | 14 | 12 | 1 | 3 | 4 | 4 | 5 |
2 | 3 | 9 | 3 | 5 | 3 | 13 | 17 | 13 | 14 | 15 |
3 | 4 | 6 | 5 | 4 | 5 | 4 | 7 | 4 | 7 | 4 |
4 | 8 | 13 | 9 | 12 | 10 | 18 | 19 | 19 | 20 | 19 |
5 | 10 | 13 | 13 | 11 | 11 | 5 | 8 | 9 | 7 | 9 |
6 | 12 | 16 | 14 | 15 | 13 | 2 | 2 | 3 | 6 | 3 |
7 | 1 | 7 | 6 | 3 | 4 | 11 | 14 | 14 | 14 | 11 |
8 | 9 | 11 | 9 | 9 | 9 | 3 | 4 | 5 | 7 | 5 |
9 | 4 | 9 | 4 | 7 | 4 | 8 | 8 | 8 | 10 | 8 |
  | A | B | C | D | E | F | G | H | I | J |
A | 1.000000 | 0.816480 | 0.901905 | 0.936634 | 0.949857 | -0.409401 | -0.505922 | -0.327200 | -0.325622 | -0.276172 |
B | 0.816480 | 1.000000 | 0.706978 | 0.928908 | 0.876425 | -0.380840 | -0.440560 | -0.291830 | -0.369119 | -0.214817 |
C | 0.901905 | 0.706978 | 1.000000 | 0.830659 | 0.926892 | -0.343944 | -0.416735 | -0.200915 | -0.245105 | -0.230368 |
D | 0.936634 | 0.928908 | 0.830659 | 1.000000 | 0.939070 | -0.282163 | -0.397256 | -0.212778 | -0.229146 | -0.151093 |
E | 0.949857 | 0.876425 | 0.926892 | 0.939070 | 1.000000 | -0.412766 | -0.488815 | -0.301198 | -0.346611 | -0.278961 |
F | -0.409401 | -0.380840 | -0.343944 | -0.282163 | -0.412766 | 1.000000 | 0.972397 | 0.968543 | 0.975579 | 0.965554 |
G | -0.505922 | -0.440560 | -0.416735 | -0.397256 | -0.488815 | 0.972397 | 1.000000 | 0.952668 | 0.923379 | 0.957782 |
H | -0.327200 | -0.291830 | -0.200915 | -0.212778 | -0.301198 | 0.968543 | 0.952668 | 1.000000 | 0.956089 | 0.973569 |
I | -0.325622 | -0.369119 | -0.245105 | -0.229146 | -0.346611 | 0.975579 | 0.923379 | 0.956089 | 1.000000 | 0.927947 |
J | -0.276172 | -0.214817 | -0.230368 | -0.151093 | -0.278961 | 0.965554 | 0.957782 | 0.973569 | 0.927947 | 1.000000 |
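The stack-and-filter step is easier to follow on a tiny matrix. The sketch below uses a hand-written 3x3 correlation matrix (hypothetical values, for illustration only) and shows that every strong pair appears twice in the links table but becomes a single edge in the undirected graph.

```python
import pandas as pd
import networkx as nx

# A tiny hand-written correlation matrix (hypothetical values)
corr = pd.DataFrame([[1.0, 0.9, 0.1],
                     [0.9, 1.0, 0.2],
                     [0.1, 0.2, 1.0]],
                    index=['A', 'B', 'C'], columns=['A', 'B', 'C'])

# stack() turns the 3x3 matrix into 9 (var1, var2, value) rows
links = corr.stack().reset_index()
links.columns = ['var1', 'var2', 'value']

# Keep strong correlations and drop the diagonal
links_filtered = links.loc[(links['value'] > 0.8) & (links['var1'] != links['var2'])]
print(links_filtered)  # contains both (A, B) and (B, A)

# An undirected graph stores the symmetric pair as a single edge
G = nx.from_pandas_edgelist(links_filtered, 'var1', 'var2')
print(G.number_of_edges())  # 1
```

Note that a variable with no correlation above the threshold, such as C here, does not appear in the graph at all, because nodes are created only from the remaining rows.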
