Learning pandas, part 6: combining data with concat

import pandas as pd

import numpy as np

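'''

When working with several datasets in pandas, you often need to combine them; concat is one of the basic ways to do that.

concat also takes a number of parameters that let you shape the combined result the way you want.

'''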

# axis (direction of concatenation)

# axis=0 is the default, so concat stacks vertically (row-wise) when axis is not given.

# Define the datasets

df1 = pd.DataFrame(np.ones((3,4))*0, columns=['a','b','c','d'])

df2 = pd.DataFrame(np.ones((3,4))*1, columns=['a','b','c','d'])

df3 = pd.DataFrame(np.ones((3,4))*2, columns=['a','b','c','d'])

# Concatenate vertically

res = pd.concat([df1, df2, df3], axis=0)  # vertical stack

# Print the result

print(res)

#     a    b    c    d

# 0  0.0  0.0  0.0  0.0

# 1  0.0  0.0  0.0  0.0

# 2  0.0  0.0  0.0  0.0

# 0  1.0  1.0  1.0  1.0

# 1  1.0  1.0  1.0  1.0

# 2  1.0  1.0  1.0  1.0

# 0  2.0  2.0  2.0  2.0

# 1  2.0  2.0  2.0  2.0

# 2  2.0  2.0  2.0  2.0

# Note that the resulting index is 0, 1, 2, 0, 1, 2, 0, 1, 2. The next example shows how to reset it.

# ignore_index (reset the index)

# Same data as the previous example, now with ignore_index=True

res = pd.concat([df1, df2, df3], axis=0, ignore_index=True)

# Print the result

print(res)

#     a    b    c    d

# 0  0.0  0.0  0.0  0.0

# 1  0.0  0.0  0.0  0.0

# 2  0.0  0.0  0.0  0.0

# 3  1.0  1.0  1.0  1.0

# 4  1.0  1.0  1.0  1.0

# 5  1.0  1.0  1.0  1.0

# 6  2.0  2.0  2.0  2.0

# 7  2.0  2.0  2.0  2.0

# 8  2.0  2.0  2.0  2.0

# The resulting index is now 0, 1, 2, 3, 4, 5, 6, 7, 8
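
To see why the repeated index matters, the following sketch (an illustration, not part of the original tutorial) looks up the label 0 before and after resetting the index. The names `stacked`, `renumbered`, `rows_with_label_0` and `row_0` are made up for this example.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.ones((3, 4)) * 0, columns=['a', 'b', 'c', 'd'])
df2 = pd.DataFrame(np.ones((3, 4)) * 1, columns=['a', 'b', 'c', 'd'])
df3 = pd.DataFrame(np.ones((3, 4)) * 2, columns=['a', 'b', 'c', 'd'])

# Without ignore_index the labels 0, 1, 2 repeat once per input frame
stacked = pd.concat([df1, df2, df3], axis=0)
rows_with_label_0 = stacked.loc[0]  # a DataFrame: one row from each input frame

# With ignore_index=True every label is unique
renumbered = pd.concat([df1, df2, df3], axis=0, ignore_index=True)
row_0 = renumbered.loc[0]  # a single row, returned as a Series

print(rows_with_label_0.shape)
print(stacked.index.is_unique, renumbered.index.is_unique)
```

A label-based lookup on the stacked frame returns several rows at once, which is usually the reason to pass ignore_index=True when the original labels carry no meaning.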

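'''

join (how columns are matched)

join='outer' is the default, so concat uses it when join is not given.

With axis=0, rows are stacked and columns are matched by name: columns present in both frames are stacked on top of each other, columns present in only one frame are kept as their own columns, and positions with no value are filled with NaN.

'''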

# Define the datasets

df1 = pd.DataFrame(np.ones((3,4))*0, columns=['a','b','c','d'], index=[1,2,3])

df2 = pd.DataFrame(np.ones((3,4))*1, columns=['b','c','d','e'], index=[2,3,4])

# Vertical "outer" concatenation of df1 and df2

res = pd.concat([df1, df2], axis=0, join='outer')

print(res)

#     a    b    c    d    e

# 1  0.0  0.0  0.0  0.0  NaN

# 2  0.0  0.0  0.0  0.0  NaN

# 3  0.0  0.0  0.0  0.0  NaN

# 2  NaN  1.0  1.0  1.0  1.0

# 3  NaN  1.0  1.0  1.0  1.0

# 4  NaN  1.0  1.0  1.0  1.0

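# join='inner' works the same way, except that only the columns shared by all frames are kept; the others are dropped.

# Continuing from the previous example

# Vertical "inner" concatenation of df1 and df2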

res = pd.concat([df1, df2], axis=0, join='inner')

# Print the result

print(res)

#     b    c    d

# 1  0.0  0.0  0.0

# 2  0.0  0.0  0.0

# 3  0.0  0.0  0.0

# 2  1.0  1.0  1.0

# 3  1.0  1.0  1.0

# 4  1.0  1.0  1.0

# Reset the index and print the result

res = pd.concat([df1, df2], axis=0, join='inner', ignore_index=True)

print(res)

#     b    c    d

# 0  0.0  0.0  0.0

# 1  0.0  0.0  0.0

# 2  0.0  0.0  0.0

# 3  1.0  1.0  1.0

# 4  1.0  1.0  1.0

# 5  1.0  1.0  1.0
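
The relationship between the two join modes can be checked directly. The sketch below is an illustration under one assumption: the input frames contain no missing values of their own, so the only NaN in the outer result comes from unmatched columns. The names `outer`, `inner`, `shared` and `outer_without_nan_columns` are made up for this example.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.ones((3, 4)) * 0, columns=['a', 'b', 'c', 'd'], index=[1, 2, 3])
df2 = pd.DataFrame(np.ones((3, 4)) * 1, columns=['b', 'c', 'd', 'e'], index=[2, 3, 4])

outer = pd.concat([df1, df2], axis=0, join='outer')
inner = pd.concat([df1, df2], axis=0, join='inner')

# Columns that appear in both frames
shared = df1.columns.intersection(df2.columns)

# Drop every column of the outer result that contains a NaN
outer_without_nan_columns = outer.dropna(axis=1)

print(list(shared))
print(list(inner.columns))
print(inner.equals(outer_without_nan_columns))
```

If the inputs already held NaN in a shared column, dropna(axis=1) would remove that column too, so the equivalence is specific to clean data like this.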

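# join_axes (align on a given axis)

# Note: join_axes was deprecated in pandas 0.25 and removed in pandas 1.0.
# The call below uses reindex instead, which gives the same result as
# pd.concat([df1, df2], axis=1, join_axes=[df1.index]) did on older versions.

# Define the datasets

df1 = pd.DataFrame(np.ones((3,4))*0, columns=['a','b','c','d'], index=[1,2,3])

df2 = pd.DataFrame(np.ones((3,4))*1, columns=['b','c','d','e'], index=[2,3,4])

# Concatenate horizontally, keeping only the row labels in df1.index

res = pd.concat([df1, df2.reindex(df1.index)], axis=1)  # rows follow df1's index

# Print the result

print(res)

# Row 1 exists only in df1, so df2's columns are NaN there; row 4 exists only in df2 and is dropped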

#     a    b    c    d    b    c    d    e

# 1  0.0  0.0  0.0  0.0  NaN  NaN  NaN  NaN

# 2  0.0  0.0  0.0  0.0  1.0  1.0  1.0  1.0

# 3  0.0  0.0  0.0  0.0  1.0  1.0  1.0  1.0

# Without aligning to df1.index, every row label from both frames is kept

res = pd.concat([df1, df2], axis=1)

print(res)

#     a    b    c    d    b    c    d    e

# 1  0.0  0.0  0.0  0.0  NaN  NaN  NaN  NaN

# 2  0.0  0.0  0.0  0.0  1.0  1.0  1.0  1.0

# 3  0.0  0.0  0.0  0.0  1.0  1.0  1.0  1.0

# 4  NaN  NaN  NaN  NaN  1.0  1.0  1.0  1.0
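
The join parameter also applies when axis=1, where it decides which row labels survive. The sketch below is an illustration, not part of the original tutorial: it compares the default outer join, an inner join, and alignment to df1.index (what join_axes=[df1.index] requested on older pandas, written here with reindex). The names `outer_rows`, `inner_rows` and `df1_rows` are made up for this example.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.ones((3, 4)) * 0, columns=['a', 'b', 'c', 'd'], index=[1, 2, 3])
df2 = pd.DataFrame(np.ones((3, 4)) * 1, columns=['b', 'c', 'd', 'e'], index=[2, 3, 4])

# outer (default): union of the row labels
outer_rows = pd.concat([df1, df2], axis=1, join='outer')

# inner: only the row labels present in both frames
inner_rows = pd.concat([df1, df2], axis=1, join='inner')

# aligned to df1: exactly df1's row labels
df1_rows = pd.concat([df1, df2.reindex(df1.index)], axis=1)

print(list(outer_rows.index))
print(list(inner_rows.index))
print(list(df1_rows.index))
```

Only the inner variant is free of NaN here, because it keeps just the rows that both frames can fill.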

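# append (adding rows): appending means stacking vertically; adding columns horizontally widens the data instead, so it is not an append.

# append only stacks vertically; it has no horizontal mode.

# Note: DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0.
# The call below uses the equivalent pd.concat form; the original append call is kept in a comment.

# Define the datasets

df1 = pd.DataFrame(np.ones((3,4))*0, columns=['a','b','c','d'])

df2 = pd.DataFrame(np.ones((3,4))*1, columns=['a','b','c','d'])

df3 = pd.DataFrame(np.ones((3,4))*1, columns=['a','b','c','d'])

s1 = pd.Series([1,2,3,4], index=['a','b','c','d'])

# Stack df2 below df1, reset the index, and print the result

# originally: res = df1.append(df2, ignore_index=True)
res = pd.concat([df1, df2], ignore_index=True)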

print(res)

#     a    b    c    d

# 0  0.0  0.0  0.0  0.0

# 1  0.0  0.0  0.0  0.0

# 2  0.0  0.0  0.0  0.0

# 3  1.0  1.0  1.0  1.0

# 4  1.0  1.0  1.0  1.0

# 5  1.0  1.0  1.0  1.0

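# Stack several frames: put df2 and df3 below df1, reset the index, and print the result

# originally: res = df1.append([df2, df3], ignore_index=True), which needs pandas < 2.0
res = pd.concat([df1, df2, df3], ignore_index=True)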

print(res)

#     a    b    c    d

# 0  0.0  0.0  0.0  0.0

# 1  0.0  0.0  0.0  0.0

# 2  0.0  0.0  0.0  0.0

# 3  1.0  1.0  1.0  1.0

# 4  1.0  1.0  1.0  1.0

# 5  1.0  1.0  1.0  1.0

# 6  1.0  1.0  1.0  1.0

# 7  1.0  1.0  1.0  1.0

# 8  1.0  1.0  1.0  1.0

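# Add a Series as a new row: put s1 below df1, reset the index, and print the result

# originally: res = df1.append(s1, ignore_index=True), which needs pandas < 2.0
# s1.to_frame().T turns the Series into a one-row DataFrame with columns a, b, c, d
res = pd.concat([df1, s1.to_frame().T], ignore_index=True)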

print(res)

#     a    b    c    d

# 0  0.0  0.0  0.0  0.0

# 1  0.0  0.0  0.0  0.0

# 2  0.0  0.0  0.0  0.0

# 3  1.0  2.0  3.0  4.0

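Summary: concat is a basic way to combine data, and it has several parameters you can adjust.

axis=0 is the default, so concat stacks vertically unless told otherwise.

ignore_index=True discards the original index and generates a new sequential one.

join='outer' is the default: with axis=0, columns are matched by name, each column name appears once in the result, and missing positions are filled with NaN.

join='inner' keeps only the shared columns, so no NaN is introduced. For inputs with no missing values of their own, it is equivalent to an outer join followed by dropping every column that contains NaN.

join_axes was a concat parameter in older pandas (removed in 1.0): with axis=1 (adding columns side by side), join_axes=[df1.index] aligned the result to df1's index. On current versions, reindexing df2 to df1.index before concatenating gives the same result.

For example, if df1 has index 1, 2, 3 and df2 has index 2, 3, 4, row 4 from df2 is dropped and df2's columns are NaN in row 1.

append adds rows, that is, a vertical stack. DataFrame.append was removed in pandas 2.0, where pd.concat does the same job.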

Source: https://morvanzhou.github.io/tutorials/data-manipulation/np-pd/3-6-pd-concat/
