pandas的corsstab

pandas.crosstab(index, columns, values=None, rownames=None, colnames=None, aggfunc=None, margins=False, dropna=True, normalize=False)

index : array-like, Series, or list of arrays/Series

Values to group by in the rows

columns : array-like, Series, or list of arrays/Series

Values to group by in the columns

values : array-like, optional

Array of values to aggregate according to the factors. Requires aggfunc be specified.

aggfunc : function, optional

If specified, requires values be specified as well

rownames : sequence, default None

If passed, must match number of row arrays passed

colnames : sequence, default None

If passed, must match number of column arrays passed

margins : boolean, default False

Add row/column margins (subtotals)

dropna : boolean, default True

Do not include columns whose entries are all NaN

normalize : boolean, {‘all’, ‘index’, ‘columns’}, or {0,1}, default False

Normalize by dividing all values by the sum of values.

If passed ‘all’ or True, will normalize over all values.

If passed ‘index’ will normalize over each row.

If passed ‘columns’ will normalize over each column.

If margins is True, will also normalize margin values.

New in version 0.18.1.

In [1]:

import numpy as np

a = np.array(["foo", "foo", "foo", "foo", "bar", "bar","bar", "bar", "foo", "foo", "foo"], dtype=object)

a

In [2]:

b = np.array(["one", "one", "one", "two", "one", "one", "one", "two", "two", "two", "one"], dtype=object)

b

In [3]:

pd.crosstab(a,b)

Out[3]:

col_0	one	two
row_0
bar	3	1
foo	4	3

In [4]:

 pd.crosstab(a, b, rownames=['a'], colnames=['b'])

Out[4]:

b	one	two
a
bar	3	1
foo	4	3

In [5]

c = np.array(["dull", "dull", "shiny", "dull", "dull", "shiny","shiny", "dull", "shiny", "shiny", "shiny"],

               dtype=object)

c

In [6]:

import pandas as pd

pd.crosstab(a, [b, c], rownames=['a'], colnames=['b', 'c'])

Out[6]:

b	one		two
c	dull	shiny	dull	shiny
a
bar	1	2	1	0
foo	2	2	1	2

In [7]:

foo1 = pd.Categorical(['a', 'b'], categories=['a', 'b', 'c'])

bar1= pd.Categorical(['d', 'e'], categories=['d', 'e', 'f'])

pd.crosstab(foo1, bar1,dropna='true')

# 'c' and 'f' are not represented in the data,

# and will not be shown in the output because

# dropna is True by default. Set 'dropna=False'

# to preserve categories with no data

Out[7]:

col_0	d	e	f
row_0
a	1	0	0
b	0	1	0
c	0	0	0

pandas的corsstab的更多相关文章

pandas基础-Python3
未完 for examples: example 1: # Code based on Python 3.x # _*_ coding: utf-8 _*_ # __Author: "LEM ...
10 Minutes to pandas
摘要一.创建对象二.查看数据三.选择和设置四.缺失值处理五.相关操作六.聚合七.重排(Reshaping) 八.时间序列九.Categorical类型十.画图十一 ...
利用Python进行数据分析(15) pandas基础: 字符串操作
字符串对象方法 split()方法拆分字符串: strip()方法去掉空白符和换行符: split()结合strip()使用: "+"符号可以将多个字符串连接起来: join( ...
利用Python进行数据分析(10) pandas基础: 处理缺失数据
数据不完整在数据分析的过程中很常见. pandas使用浮点值NaN表示浮点和非浮点数组里的缺失数据. pandas使用isnull()和notnull()函数来判断缺失情况. 对于缺失数据一般处理 ...
利用Python进行数据分析(12) pandas基础: 数据合并
pandas 提供了三种主要方法可以对数据进行合并: pandas.merge()方法:数据库风格的合并: pandas.concat()方法:轴向连接,即沿着一条轴将多个对象堆叠到一起: 实例方法c ...
利用Python进行数据分析(9) pandas基础: 汇总统计和计算
pandas 对象拥有一些常用的数学和统计方法. 例如,sum() 方法,进行列小计: sum() 方法传入 axis=1 指定为横向汇总,即行小计: idxmax() 获取最大值对应的索 ...
利用Python进行数据分析(8) pandas基础: Series和DataFrame的基本操作
一.reindex() 方法:重新索引针对 Series 重新索引指的是根据index参数重新进行排序. 如果传入的索引值在数据里不存在,则不会报错,而是添加缺失值的新行. 不想用缺失值,可以用 ...
利用Python进行数据分析(7) pandas基础: Series和DataFrame的简单介绍
一.pandas 是什么 pandas 是基于 NumPy 的一个 Python 数据分析包,主要目的是为了数据分析.它提供了大量高级的数据结构和对数据处理的方法. pandas 有两个主要的数据结构 ...
pandas.DataFrame对行和列求和及添加新行和列
导入模块: from pandas import DataFrame import pandas as pd import numpy as np 生成DataFrame数据 df = DataFra ...

随机推荐

微信小程序---交互反馈
1.学习大纲: 2.showToast(): wx.showToast({ title: '成功', icon: 'success', duration: }) 3.hieToast(): wx.sh ...
Mysql事务特性
事务概念事务可由一条sql或者一组sql组成.事务是访问并更新数据库中各种数据项的一个程序执行单元. 事务会把数据库从一种一致状态转换为另一种一致状态.在数据提交工作时,可以确保要么所有修改都已经保 ...
【HANA系列】SAP ECLIPSE中创建ABAP项目失败原因解析
公众号:SAP Technical 本文作者:matinal 原文出处:http://www.cnblogs.com/SAPmatinal/ 原文链接:[HANA系列]SAP ECLIPSE中创建AB ...
MTU，MRU，MSS
MTU是以太网数据链路层概念,默认是1500,当在PPPOE环境的时候,是1492和1480,两者有何区别,暂不清楚 MRU是PPP链路数据链路层的概念,都是最大传输单元的意思 MSS是最大报文段长度 ...
《Python编程从0到1》笔记3——欧几里得算法
本节以欧几里得算法(这是人类历史上最早记载的算法)为示例,向读者展示注释.文档字符串(docstring).变量.循环.递归.缩进以及函数定义等Python语法要素. 欧几里得算法:“在数学中, ...
java反射机制学习笔记
内容引用自:https://www.cnblogs.com/wkrbky/p/6201098.html https://www.cnblogs.com/xumBlog/p/8882489.html,本 ...
JSR303 校验扩展（分组、按顺序校验）
1.在spring MVC 项目中使用JSR303 校验数据合法性,一般情况下使用方法为 (1)在接受数据的实体使用注解标添加校验规则 package com.hzsj.wechatdto; impo ...
Connection is read-only. Queries leading to data modification are not allowed 错误原因
因为我再spring 中使用了AOP进行事务管理,有如下配置 <tx:advice id="txAdvice" transaction-manager="trans ...
Spring（十）--Advisor顾问
Spring之Advisor顾问 1. 创建新的xml文件 advisor.xml  <bean i ...
pandas中的数据结构-DataFrame
pandas中的数据结构-DataFrame DataFrame是什么? 表格型的数据结构 DataFrame 是一个表格型的数据类型,每列值类型可以不同 DataFrame 既有行索引.也有列索引 ...

pandas的corsstab

pandas的corsstab的更多相关文章

随机推荐

热门专题