Visualizing Distribution Data with Seaborn: Histograms and Density Plots

Histograms and Density Plots

Histograms and density plots are the usual tools for visualizing how data is distributed.

distplot

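Plots the distribution of a single variable, combining a histogram with a density curve. Note that seaborn 0.11 deprecated distplot in favor of histplot and displot; the signature and docstring here describe the older API.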

sns.distplot(
    a,
    bins=None,
    hist=True,
    kde=True,
    rug=False,
    fit=None,
    hist_kws=None,
    kde_kws=None,
    rug_kws=None,
    fit_kws=None,
    color=None,
    vertical=False,
    norm_hist=False,
    axlabel=None,
    label=None,
    ax=None,
)

Docstring:
Flexibly plot a univariate distribution of observations.

This function combines the matplotlib ``hist`` function (with automatic
calculation of a good default bin size) with the seaborn :func:`kdeplot`
and :func:`rugplot` functions. It can also fit ``scipy.stats``
distributions and plot the estimated PDF over the data.

Parameters
----------
a : Series, 1d-array, or list.
    Observed data. If this is a Series object with a ``name`` attribute,
    the name will be used to label the data axis.
bins : argument for matplotlib hist(), or None, optional
    Specification of hist bins, or None to use Freedman-Diaconis rule.
hist : bool, optional
    Whether to plot a (normed) histogram.
kde : bool, optional
    Whether to plot a gaussian kernel density estimate.
rug : bool, optional
    Whether to draw a rugplot on the support axis.
fit : random variable object, optional
    An object with `fit` method, returning a tuple that can be passed to a
    `pdf` method as positional arguments following a grid of values to
    evaluate the pdf on.
{hist, kde, rug, fit}_kws : dictionaries, optional
    Keyword arguments for underlying plotting functions.
color : matplotlib color, optional
    Color to plot everything but the fitted curve in.
vertical : bool, optional
    If True, observed values are on y-axis.
norm_hist : bool, optional
    If True, the histogram height shows a density rather than a count.
    This is implied if a KDE or fitted density is plotted.
axlabel : string, False, or None, optional
    Name for the support axis label. If None, will try to get it
    from a.name; if False, do not set a label.
label : string, optional
    Legend label for the relevant component of the plot.
ax : matplotlib axis, optional
    If provided, plot on this axis.

Returns
-------
ax : matplotlib Axes
    Returns the Axes object with the plot for further tweaking.

See Also
--------
kdeplot : Show a univariate or bivariate distribution with a kernel
          density estimate.
rugplot : Draw small vertical lines to show each observation in a
          distribution.
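The docstring says that bins=None falls back to the Freedman-Diaconis rule. As a rough sketch of what that rule computes, the bin width is 2 * IQR / n^(1/3), and the bin count follows from dividing the data range by that width. The code below uses only the Python standard library and is not seaborn's own implementation; the helper name `freedman_diaconis_bins` and the sqrt(n) fallback for a zero IQR are assumptions made for illustration.

```python
import math
import random
import statistics


def freedman_diaconis_bins(values):
    """Return (bin_width, bin_count) from the Freedman-Diaconis rule."""
    n = len(values)
    # Quartiles with linear interpolation between order statistics.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    width = 2 * iqr / n ** (1 / 3)
    if width == 0:
        # Degenerate data (IQR of zero): assumed fallback to sqrt(n) bins.
        return 0.0, max(1, math.ceil(math.sqrt(n)))
    count = math.ceil((max(values) - min(values)) / width)
    return width, max(1, count)


rng = random.Random(123)
sample = [rng.gauss(0, 1) for _ in range(100)]
width, count = freedman_diaconis_bins(sample)
print(f"bin width = {width:.3f}, bins = {count}")
```

Because the width shrinks with the cube root of n, larger samples get more, narrower bins, while a wide interquartile range pushes the other way.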

kdeplot

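Plots a kernel density estimate for one or two variables. The signature here is the pre-0.11 API; later seaborn releases renamed shade to fill and replaced bw with bw_method and bw_adjust.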

sns.kdeplot(
    data,
    data2=None,
    shade=False,
    vertical=False,
    kernel='gau',
    bw='scott',
    gridsize=100,
    cut=3,
    clip=None,
    legend=True,
    cumulative=False,
    shade_lowest=True,
    cbar=False,
    cbar_ax=None,
    cbar_kws=None,
    ax=None,
    **kwargs,
)

Docstring:
Fit and plot a univariate or bivariate kernel density estimate.

Parameters
----------
data : 1d array-like
    Input data.
data2: 1d array-like, optional
    Second input data. If present, a bivariate KDE will be estimated.
shade : bool, optional
    If True, shade in the area under the KDE curve (or draw with filled
    contours when data is bivariate).
vertical : bool, optional
    If True, density is on x-axis.
kernel : {'gau' | 'cos' | 'biw' | 'epa' | 'tri' | 'triw' }, optional
    Code for shape of kernel to fit with. Bivariate KDE can only use
    gaussian kernel.
bw : {'scott' | 'silverman' | scalar | pair of scalars }, optional
    Name of reference method to determine kernel size, scalar factor,
    or scalar for each dimension of the bivariate plot. Note that the
    underlying computational libraries have different interpretations
    for this parameter: ``statsmodels`` uses it directly, but ``scipy``
    treats it as a scaling factor for the standard deviation of the
    data.
gridsize : int, optional
    Number of discrete points in the evaluation grid.
cut : scalar, optional
    Draw the estimate to cut * bw from the extreme data points.
clip : pair of scalars, or pair of pair of scalars, optional
    Lower and upper bounds for datapoints used to fit KDE. Can provide
    a pair of (low, high) bounds for bivariate plots.
legend : bool, optional
    If True, add a legend or label the axes when possible.
cumulative : bool, optional
    If True, draw the cumulative distribution estimated by the kde.
shade_lowest : bool, optional
    If True, shade the lowest contour of a bivariate KDE plot. Not
    relevant when drawing a univariate plot or when ``shade=False``.
    Setting this to ``False`` can be useful when you want multiple
    densities on the same Axes.
cbar : bool, optional
    If True and drawing a bivariate KDE plot, add a colorbar.
cbar_ax : matplotlib axes, optional
    Existing axes to draw the colorbar onto, otherwise space is taken
    from the main axes.
cbar_kws : dict, optional
    Keyword arguments for ``fig.colorbar()``.
ax : matplotlib axes, optional
    Axes to plot on, otherwise uses current axes.
kwargs : key, value pairings
    Other keyword arguments are passed to ``plt.plot()`` or
    ``plt.contour{f}`` depending on whether a univariate or bivariate
    plot is being drawn.

Returns
-------
ax : matplotlib Axes
    Axes with plot.

See Also
--------
distplot: Flexibly plot a univariate distribution of observations.
jointplot: Plot a joint dataset with bivariate and marginal distributions.
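Several of the parameters above (bw, gridsize, cut, cumulative) are easier to follow next to the arithmetic they control. The following is a minimal standard-library sketch of a univariate Gaussian KDE, not seaborn's code: the function names are hypothetical, and the bandwidth uses the scipy-style form of Scott's rule (sample standard deviation times n^(-1/5)), which is an assumption here since statsmodels defines the rule slightly differently.

```python
import math
import random
import statistics


def scott_bandwidth(values):
    """Scott's rule for 1D data (scipy-style): sample std * n ** (-1/5)."""
    return statistics.stdev(values) * len(values) ** (-1 / 5)


def gaussian_kde(values, bw, gridsize=100, cut=3):
    """Evaluate a Gaussian KDE on an evenly spaced grid.

    The grid extends cut * bw beyond the smallest and largest observation.
    """
    lo = min(values) - cut * bw
    hi = max(values) + cut * bw
    step = (hi - lo) / (gridsize - 1)
    grid = [lo + i * step for i in range(gridsize)]
    norm = 1 / (len(values) * bw * math.sqrt(2 * math.pi))
    density = [
        norm * sum(math.exp(-0.5 * ((x - v) / bw) ** 2) for v in values)
        for x in grid
    ]
    return grid, density


def cumulative(grid, density):
    """Integrate the density along the grid with the trapezoidal rule."""
    total, cdf = 0.0, [0.0]
    for i in range(1, len(grid)):
        total += 0.5 * (density[i] + density[i - 1]) * (grid[i] - grid[i - 1])
        cdf.append(total)
    return cdf


rng = random.Random(123)
sample = [rng.gauss(0, 1) for _ in range(100)]
bw = scott_bandwidth(sample)
grid, density = gaussian_kde(sample, bw)
cdf = cumulative(grid, density)
print(f"bandwidth = {bw:.3f}, grid from {grid[0]:.2f} to {grid[-1]:.2f}")
print(f"total probability on the grid = {cdf[-1]:.4f}")
```

With cut=3 the grid reaches three bandwidths past the extreme observations, so nearly all of the probability mass lies on it and the cumulative curve ends very close to 1.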

rugplot

Draws each data point as a tick on an axis to show where the observations fall; usually combined with distplot and kdeplot.

sns.rugplot(a, height=0.05, axis='x', ax=None, **kwargs)

Docstring:
Plot datapoints in an array as sticks on an axis.

Parameters
----------
a : vector
    1D array of observations.
height : scalar, optional
    Height of ticks as proportion of the axis.
axis : {'x' | 'y'}, optional
    Axis to draw rugplot on.
ax : matplotlib axes, optional
    Axes to draw plot into; otherwise grabs current axes.
kwargs : key, value pairings
    Other keyword arguments are passed to ``LineCollection``.

Returns
-------
ax : matplotlib axes
    The Axes object with the plot on it.

Visualizing One-Dimensional Data

distplot()

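# Histogram with distplot()
# Parameters: bins -> number of bins, hist -> whether to draw the histogram bars,
# kde -> whether to draw the density curve, norm_hist -> whether bar heights show density instead of counts,
# rug -> whether to mark each observation on the axis, vertical -> whether to put the observed values on the y-axis,
# label -> legend label, axlabel -> label for the data axis
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

rs = np.random.RandomState(123)             # fix the random seed
datas = pd.Series(rs.randn(100))            # Series of 100 random values
sns.distplot(a=datas, bins=10, hist=True, kde=False, norm_hist=False,
             rug=True, vertical=False, color='r', label='distplot', axlabel='x')
plt.legend()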

# kde=True adds the density curve (the histogram is then drawn as a density as well)
sns.distplot(a=datas, bins=10, hist=True, kde=True, norm_hist=False,
             rug=True, vertical=False, color='r', label='distplot', axlabel='x')
plt.legend()

# norm_hist=True draws the histogram as a density; hist and kde both default to True,
# so the bars and the density curve are drawn together
sns.distplot(a=datas, bins=10, norm_hist=True,
             rug=True, vertical=False, color='r', label='distplot', axlabel='x')
plt.legend()

# rug=False hides the rug ticks; vertical=False keeps the observed values on the x-axis
sns.distplot(a=datas, bins=10, norm_hist=True,
             rug=False, vertical=False, color='r', label='distplot', axlabel='x')
plt.legend()
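The difference between a count histogram and a density histogram (norm_hist) comes down to one division: each bar height becomes count / (n * bin_width), so the bars cover a total area of 1 and share a scale with the density curve. The sketch below shows this with a hand-rolled, equal-width histogram in plain Python; the `histogram` helper is hypothetical, assumes the data is not constant, and is not how matplotlib bins data internally.

```python
import random


def histogram(values, bins, density=False):
    """Return (edges, heights) for equal-width bins spanning the data range."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value is placed in the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    if density:
        n = len(values)
        # Scale so that the bar areas sum to 1.
        heights = [c / (n * width) for c in counts]
    else:
        heights = counts
    return edges, heights


rng = random.Random(123)
sample = [rng.gauss(0, 1) for _ in range(100)]
edges, counts = histogram(sample, bins=10)
_, heights = histogram(sample, bins=10, density=True)
width = edges[1] - edges[0]
print("sum of counts:", sum(counts))
print("area under density bars:", round(sum(h * width for h in heights), 6))
```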

# Styling the individual components
sns.distplot(datas, rug=True,
             # rug_kws sets the color of the rug ticks
             rug_kws={'color':'y'},
             # kde_kws sets the color, line width, label, and line style of the density curve
             kde_kws={'color':'r', 'lw':1, 'label':'KDE', 'linestyle':'--'},
             # hist_kws sets the bar style, line width, opacity, and color
             # histtype is one of 'bar', 'barstacked', 'step', 'stepfilled'
             hist_kws={'histtype':'step', 'linewidth':1, 'alpha':1, 'color':'k'})

kdeplot()

# Density plot with kdeplot()
# shade -> fill the area under the curve
sns.kdeplot(datas, shade=True, color='r', vertical=False)
# bw -> kernel bandwidth; smaller values follow the data more closely
# (the values now match the legend labels)
sns.kdeplot(datas, bw=0.2, label='bw:0.2',
            linestyle='-', linewidth=1.2, alpha=0.5)
sns.kdeplot(datas, bw=2, label='bw:2',
            linestyle='-', linewidth=1.2, alpha=0.5)
# rugplot() marks each observation on the axis
sns.rugplot(datas, height=0.1, color='k', alpha=0.5)
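How strongly the bandwidth shapes the curve can be checked numerically without plotting. This sketch evaluates a plain-Python Gaussian KDE at a small and a large bandwidth (0.2 and 2.0, chosen purely for illustration) and reports the peak height and the number of local maxima of each curve. The helper names `kde` and `count_peaks` are hypothetical, and the sample is generated with the standard library rather than NumPy, so it is not the same data as in the plots.

```python
import math
import random


def kde(x, values, bw):
    """Gaussian kernel density estimate at x with bandwidth bw."""
    norm = 1 / (len(values) * bw * math.sqrt(2 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - v) / bw) ** 2) for v in values)


def count_peaks(curve):
    """Number of strict local maxima in a sampled curve."""
    return sum(
        1
        for i in range(1, len(curve) - 1)
        if curve[i - 1] < curve[i] > curve[i + 1]
    )


rng = random.Random(123)
values = [rng.gauss(0, 1) for _ in range(100)]
grid = [-4 + 8 * i / 199 for i in range(200)]

narrow = [kde(x, values, 0.2) for x in grid]
wide = [kde(x, values, 2.0) for x in grid]
for name, curve in (("bw=0.2", narrow), ("bw=2.0", wide)):
    print(f"{name}: peak height {max(curve):.3f}, local maxima {count_peaks(curve)}")
```

A small bandwidth tracks individual clusters of observations and tends to produce a taller, bumpier curve; a large bandwidth spreads each observation widely and flattens the estimate.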

Visualizing Two-Dimensional Data

kdeplot()

# Bivariate density plot
rs = np.random.RandomState(12345)
df = pd.DataFrame(rs.randn(100,2),
                  columns=['A','B'])
sns.kdeplot(df['A'],df['B'],
            cbar = True,             # show the colorbar
            shade = True,            # fill the contours
            cmap = 'Reds',           # colormap
            shade_lowest = False,    # leave the lowest (outermost) contour unshaded
            n_levels = 10)           # number of contour levels (more levels give a smoother gradient)
# Rug plots along the x and y axes
sns.rugplot(df['A'], color='y', axis='x', alpha=0.5)
sns.rugplot(df['B'], color='k', axis='y', alpha=0.5)
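For the bivariate case, the quantity being contoured is a density surface over the (A, B) plane. A minimal sketch, assuming a product of two Gaussian kernels with hand-picked bandwidths and evenly spaced contour levels: the helper names and the bandwidth of 0.5 are illustrative assumptions, and matplotlib chooses its own "nice" level values rather than the simple spacing used here.

```python
import math
import random


def bivariate_kde(xs, ys, bw_x, bw_y, x, y):
    """Density at (x, y) from a product of two Gaussian kernels."""
    norm = 1 / (len(xs) * 2 * math.pi * bw_x * bw_y)
    return norm * sum(
        math.exp(-0.5 * (((x - xi) / bw_x) ** 2 + ((y - yi) / bw_y) ** 2))
        for xi, yi in zip(xs, ys)
    )


def contour_levels(density_grid, n_levels):
    """Evenly spaced levels between the smallest and largest grid density."""
    flat = [d for row in density_grid for d in row]
    lo, hi = min(flat), max(flat)
    step = (hi - lo) / (n_levels - 1)
    return [lo + i * step for i in range(n_levels)]


rng = random.Random(12345)
xs = [rng.gauss(0, 1) for _ in range(100)]
ys = [rng.gauss(0, 1) for _ in range(100)]
axis = [-3 + 6 * i / 19 for i in range(20)]
density_grid = [
    [bivariate_kde(xs, ys, 0.5, 0.5, x, y) for x in axis] for y in axis
]
levels = contour_levels(density_grid, n_levels=10)
# Leaving the lowest band unshaded corresponds to starting at the second level.
shaded_levels = levels[1:]
print(len(levels), "levels,", len(shaded_levels), "shaded bands start above the lowest")
```

More levels slice the same surface into thinner bands, which is why a large n_levels gives a smoother color gradient. When turning off the lowest band, the flag has to be the boolean False: a non-empty string such as 'False' is truthy in Python.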

sns.kdeplot(df['A'],df['B'],
            cbar = True,
            shade = False,            # contour lines only, no fill
            cmap = 'Reds',
            shade_lowest = False,     # has no effect when shade=False
            n_levels = 10)
# Rug plots along the x and y axes
sns.rugplot(df['A'], color='y', axis='x', alpha=0.5)
sns.rugplot(df['B'], color='k', axis='y', alpha=0.5)

sns.kdeplot(df['A'],df['B'],
            cbar = True,
            shade = True,
            cmap = 'Reds',
#             shade_lowest = False,  # whether to shade the lowest contour; only relevant with shade=True
            n_levels = 10)           # number of contour levels (more levels give a smoother gradient)
# Rug plots along the x and y axes
sns.rugplot(df['A'], color='y', axis='x', alpha=0.5)
sns.rugplot(df['B'], color='k', axis='y', alpha=0.5)

sns.kdeplot(df['A'],df['B'],
            cbar = True,
            shade = True,
            cmap = 'Reds',
#             shade_lowest = False,  # whether to shade the lowest contour; only relevant with shade=True
            n_levels = 100)           # number of contour levels (more levels give a smoother gradient at the edges)
# Rug plots along the x and y axes
sns.rugplot(df['A'], color='y', axis='x', alpha=0.5)
sns.rugplot(df['B'], color='k', axis='y', alpha=0.5)

# Multiple density plots on the same axes
# Create two DataFrames centered at (2, 2) and (-2, -2)
rs1 = np.random.RandomState(12)
rs2 = np.random.RandomState(21)
df1 = pd.DataFrame(rs1.randn(100,2)+2, columns=['A','B'])
df2 = pd.DataFrame(rs2.randn(100,2)-2, columns=['A','B'])
# Draw the density plots; shade_lowest=False keeps one plot's background from covering the other
sns.kdeplot(df1['A'], df1['B'], cmap='Greens',
            shade=True, shade_lowest=False)
sns.kdeplot(df2['A'], df2['B'], cmap='Blues',
            shade=True, shade_lowest=False)
