[译]pandas中的iloc loc的区别?
loc 从特定的
gets rows (or columns) with particular labels from the index.
iloc gets rows (or columns) at particular positions in the index (so it only takes integers).
ix usually tries to behave like loc but falls back to behaving like iloc if a label is not present in the index.
It's important to note some subtleties that can make ix slightly tricky to use:
if the index is of integer type, ix will only use label-based indexing and not fall back to position-based indexing. If the label is not in the index, an error is raised.
if the index does not contain only integers, then given an integer, ix will immediately use position-based indexing rather than label-based indexing. If however ix is given another type (e.g. a string), it can use label-based indexing.
To illustrate the differences between the three methods, consider the following Series:
>>> s = pd.Series(np.nan, index=[49,48,47,46,45, 1, 2, 3, 4, 5])
>>> s
49 NaN
48 NaN
47 NaN
46 NaN
45 NaN
1 NaN
2 NaN
3 NaN
4 NaN
5 NaN
We'll look at slicing with the integer value 3.
In this case, s.iloc[:3] returns us the first 3 rows (since it treats 3 as a position) and s.loc[:3] returns us the first 8 rows (since it treats 3 as a label):
>>> s.iloc[:3] # slice the first three rows
49 NaN
48 NaN
47 NaN
>>> s.loc[:3] # slice up to and including label 3
49 NaN
48 NaN
47 NaN
46 NaN
45 NaN
1 NaN
2 NaN
3 NaN
>>> s.ix[:3] # the integer is in the index so s.ix[:3] works like loc
49 NaN
48 NaN
47 NaN
46 NaN
45 NaN
1 NaN
2 NaN
3 NaN
Notice s.ix[:3] returns the same Series as s.loc[:3] since it looks for the label first rather than working on the position (and the index for s is of integer type).
What if we try with an integer label that isn't in the index (say 6)?
Here s.iloc[:6] returns the first 6 rows of the Series as expected. However, s.loc[:6] raises a KeyError since 6 is not in the index.
>>> s.iloc[:6]
49 NaN
48 NaN
47 NaN
46 NaN
45 NaN
1 NaN
>>> s.loc[:6]
KeyError: 6
>>> s.ix[:6]
KeyError: 6
As per the subtleties noted above, s.ix[:6] now raises a KeyError because it tries to work like loc but can't find a 6 in the index. Because our index is of integer type ix doesn't fall back to behaving like iloc.
If, however, our index was of mixed type, given an integer ix would behave like iloc immediately instead of raising a KeyError:
>>> s2 = pd.Series(np.nan, index=['a','b','c','d','e', 1, 2, 3, 4, 5])
>>> s2.index.is_mixed() # index is mix of different types
True
>>> s2.ix[:6] # now behaves like iloc given integer
a NaN
b NaN
c NaN
d NaN
e NaN
1 NaN
Keep in mind that ix can still accept non-integers and behave like loc:
>>> s2.ix[:'c'] # behaves like loc given non-integer
a NaN
b NaN
c NaN
As general advice, if you're only indexing using labels, or only indexing using integer positions, stick with loc or iloc to avoid unexpected results - try not use ix.
Combining position-based and label-based indexing
Sometimes given a DataFrame, you will want to mix label and positional indexing methods for the rows and columns.
For example, consider the following DataFrame. How best to slice the rows up to and including 'c' and take the first four columns?
>>> df = pd.DataFrame(np.nan,
index=list('abcde'),
columns=['x','y','z', 8, 9])
>>> df
x y z 8 9
a NaN NaN NaN NaN NaN
b NaN NaN NaN NaN NaN
c NaN NaN NaN NaN NaN
d NaN NaN NaN NaN NaN
e NaN NaN NaN NaN NaN
In earlier versions of pandas (before 0.20.0) ix lets you do this quite neatly - we can slice the rows by label and the columns by position (note that for the columns, ix will default to position-based slicing since 4 is not a column name):
>>> df.ix[:'c', :4]
x y z 8
a NaN NaN NaN NaN
b NaN NaN NaN NaN
c NaN NaN NaN NaN
In later versions of pandas, we can achieve this result using iloc and the help of another method:
>>> df.iloc[:df.index.get_loc('c') + 1, :4]
x y z 8
a NaN NaN NaN NaN
b NaN NaN NaN NaN
c NaN NaN NaN NaN
get_loc() is an index method meaning "get the position of the label in this index". Note that since slicing with iloc is exclusive of its endpoint, we must add 1 to this value if we want row 'c' as well.
There are further examples in pandas' documentation here.
[译]pandas中的iloc loc的区别?的更多相关文章
- python pandas(ix & iloc &loc)
python pandas(ix & iloc &loc) loc——通过行标签索引行数据 iloc——通过行号索引行数据 ix——通过行标签或者行号索引行数据(基于loc和iloc ...
- [译] Pandas中根据列的值选取多行数据
# 选取等于某些值的行记录 用 == df.loc[df['column_name'] == some_value] # 选取某列是否是某一类型的数值 用 isin df.loc[df['column ...
- Pandas中Series与Dataframe的区别
1. Series Series通俗来讲就是一维数组,索引(index)为每个元素的下标,值(value)为下标对应的值 例如: arr = ['Tom', 'Nancy', 'Jack', 'Ton ...
- Pandas中merge和join的区别
可以说merge包含了join的操作,merge支持通过列或索引连表,而join只支持通过索引连表,只是简化了merge的索引连表的参数 示例 定义一个left的DataFrame left=pd.D ...
- pandas中loc-iloc-ix的使用
转自:https://www.jianshu.com/p/d6a9845a0a34 Pandas中loc,iloc,ix的使用 使用 iloc 从DataFrame中筛选数据 iloc 是基于“位置” ...
- pandas中df.ix, df.loc, df.iloc 的使用场景以及区别
pandas中df.ix, df.loc, df.iloc 的使用场景以及区别: https://stackoverflow.com/questions/31593201/pandas-iloc-vs ...
- pandas中DataFrame的ix,loc,iloc索引方式的异同
pandas中DataFrame的ix,loc,iloc索引方式的异同 1.loc: 按照标签索引,范围包括start和end 2.iloc: 在位置上进行索引,不包括end 3.ix: 先在inde ...
- pandas-03 DataFrame()中的iloc和loc用法
pandas-03 DataFrame()中的iloc和loc用法 简单的说: iloc,即index locate 用index索引进行定位,所以参数是整型,如:df.iloc[10:20, 3:5 ...
- Pandas中Series和DataFrame的索引
在对Series对象和DataFrame对象进行索引的时候要明确这么一个概念:是使用下标进行索引,还是使用关键字进行索引.比如list进行索引的时候使用的是下标,而dict索引的时候使用的是关键字. ...
随机推荐
- UVA 1606 Amphiphilic Carbon Molecules 两亲性分子 (极角排序或叉积,扫描法)
任意线可以贪心移动到两点上.直接枚举O(n^3),会TLE. 所以采取扫描法,选基准点,然后根据极角或者两两做叉积比较进行排排序,然后扫一遍就好了.旋转的时候在O(1)时间推出下一种情况,总复杂度为O ...
- Kafka 完全分布式集群环境搭建
思路: 先在主机s1上安装配置,然后远程复制到其它两台主机s2.s3上, 并分别修改配置文件server.properties中的broker.id属性. 1. 搭建前准备 示例共三台主机,主机IP映 ...
- Centos7.3 安装devstack stein版本
1. 系统准备 # 关闭防火墙 systemctl stop firewalld systemctl disable firewalld # 关闭selinux setenforce 0 sed -i ...
- 利用ss5服务搭建代理服务器
利用ss5服务搭建代理服务器 1. 下载ss5-3.8.9-8.tar.gz ###官网下载http://ss5.sourceforge.net/ 2. 安装ss5 yum -y install gc ...
- ssh整合思想 Spring分模块开发 crud参数传递 解决HTTP Status 500 - Write operations are not allowed in read-only mode (FlushMode.MANUAL): Turn your Session into FlushMode.COMMIT/AUTO or(增加事务)
在Spring核心配置文件中没有增加事务方法,导致以上问题 Action类UserAction package com.swift.action; import com.opensymphony.xw ...
- 在线聊天项目1.4版 使用Gson方法解析Json字符串以便重构request和response的各种请求和响应 解决聊天不畅问题 Gson包下载地址
在线聊天项目结构图: 多用户登陆效果图: 多用户聊天效果图: 数据库效果图: 重新构建了Server类,使用了Gson方法,通过解析Json字符串,增加Info类,简化判断过程. Server类代码如 ...
- Codevs1082 线段树练习 3
题目描述 Description 给你N个数,有两种操作: 1:给区间[a,b]的所有数增加X 2:询问区间[a,b]的数的和. 输入描述 Input Description 第一行一个正整数n,接下 ...
- 洛谷 P5015 标题统计
第一道题很简单,标签:字符串.模拟. 只需要一个判断去除空格就对了: if(a[i]!=' ' && a[i]!='\n') v++; code: #include<iostre ...
- PHP计算今天、昨天、本周、本月、上月开始时间和结束时间
PHP计算今天.昨天.本周.本月.上月开始时间和结束时间 $today = date('Y-m-d H:i:s',mktime(0,0,0,date('m'),date('d'),date('Y')) ...
- Java中将字符串与unicode的相互转换工具类
unicode编码规则 unicode码对每一个字符用4位16进制数表示.具体规则是:将一个字符(char)的高8位与低8位分别取出,转化为16进制数,如果转化的16进制数的长度不足2位,则在其后补0 ...