Pandas 1: The DataFrame tabular data type

# -*- encoding:utf-8 -*-

# Copyright (c) 2015 Shiye Inc.

# All rights reserved.

#

# Author: ldq <liangduanqi@shiyejinrong.com>

# Date: 2019/2/12 10:07

import numpy as np

import pandas as pd

dates = pd.date_range("2019-01-01", periods=5)

'''

DatetimeIndex(['2019-01-01', '2019-01-02', '2019-01-03', '2019-01-04',

               '2019-01-05'],

              dtype='datetime64[ns]', freq='D')

'''

df = pd.DataFrame(np.random.randn(5, 4), index=dates,

                  columns=["a", "b", "c", "d"])

'''

                   a         b         c         d

2019-01-01 -0.406321 -0.518128 -0.151546  1.438366

2019-01-02 -0.738235  0.400646  1.337277  1.393154

2019-01-03  1.646115 -0.073540  0.644506  0.987226

2019-01-04 -1.270745 -1.333457 -1.571356 -0.051486

2019-01-05 -0.075171  2.424032 -0.274433  1.205959

'''

df1 = pd.DataFrame(np.arange(12).reshape(3, 4))

'''

   0  1   2   3

0  0  1   2   3

1  4  5   6   7

2  8  9  10  11

'''

data2 = {

    "a": 1,

    "b": pd.Timestamp("2019-01-01"),

    "c": pd.Series(1, index=range(4), dtype=np.float64),

    "d": np.array([3] * 4, dtype=np.int32),

    "e": pd.Categorical(["test", "train", "test", "train"]),

    "f": "foo",

    "g": pd.date_range("2002-02-05", periods=4),

}

df2 = pd.DataFrame(data2)

'''

   a          b    c  d      e    f          g

0  1 2019-01-01  1.0  3   test  foo 2002-02-05

1  1 2019-01-01  1.0  3  train  foo 2002-02-06

2  1 2019-01-01  1.0  3   test  foo 2002-02-07

3  1 2019-01-01  1.0  3  train  foo 2002-02-08

'''
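The dictionary above mixes scalars, a Series, a NumPy array, and a Categorical, so each column ends up with its own dtype. The sketch below rebuilds a smaller frame (the names `data` and `frame` are illustrative and not part of the original script) and prints the per-column dtypes. Scalar values such as `1` and `"foo"` are broadcast to the length of the index.

```python
import numpy as np
import pandas as pd

# Illustrative subset of the columns used above.
data = {
    "a": 1,                                                   # scalar, broadcast to every row
    "c": pd.Series(1, index=range(4), dtype=np.float64),      # Series supplies the row index
    "d": np.array([3] * 4, dtype=np.int32),                   # array keeps its int32 dtype
    "e": pd.Categorical(["test", "train", "test", "train"]),  # categorical column
    "f": "foo",                                               # scalar string, broadcast
}
frame = pd.DataFrame(data)

# One dtype per column; the exact dtype shown for "f" depends on the pandas version.
print(frame.dtypes)
```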

columns1 = df2.columns

'''

All columns

Index(['a', 'b', 'c', 'd', 'e', 'f', 'g'], dtype='object')

'''

index1 = df2.index

'''

RangeIndex(start=0, stop=4, step=1)

'''

values1 = df2.values

'''

[[1 Timestamp('2019-01-01 00:00:00') 1.0 3 'test' 'foo'

  Timestamp('2002-02-05 00:00:00')]

 [1 Timestamp('2019-01-01 00:00:00') 1.0 3 'train' 'foo'

  Timestamp('2002-02-06 00:00:00')]

 [1 Timestamp('2019-01-01 00:00:00') 1.0 3 'test' 'foo'

  Timestamp('2002-02-07 00:00:00')]

 [1 Timestamp('2019-01-01 00:00:00') 1.0 3 'train' 'foo'

  Timestamp('2002-02-08 00:00:00')]]

'''
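The `.values` output above is an object array because the columns have different types. The pandas documentation recommends `DataFrame.to_numpy()` as the preferred way to get the underlying array. A minimal sketch with two small illustrative frames (`mixed` and `numeric` are made-up names) shows how the column types determine the resulting array dtype.

```python
import numpy as np
import pandas as pd

# Mixed column types fall back to a generic object array.
mixed = pd.DataFrame({"x": [1, 2], "y": ["u", "v"]})
# Purely numeric columns are upcast to a common numeric dtype.
numeric = pd.DataFrame({"x": [1, 2], "y": [0.5, 1.5]})

mixed_arr = mixed.to_numpy()
numeric_arr = numeric.to_numpy()

print(mixed_arr.dtype)
print(numeric_arr.dtype)
```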

describe1 = df2.describe()

'''

Basic summary statistics (numeric columns only)

         a    c    d

count  4.0  4.0  4.0

mean   1.0  1.0  3.0

std    0.0  0.0  0.0

min    1.0  1.0  3.0

25%    1.0  1.0  3.0

50%    1.0  1.0  3.0

75%    1.0  1.0  3.0

max    1.0  1.0  3.0

'''

transpose1 = df2.T

'''

Transpose the data (swap rows and columns)

                     0         ...                             3

a                    1         ...                             1

b  2019-01-01 00:00:00         ...           2019-01-01 00:00:00

c                    1         ...                             1

d                    3         ...                             3

e                 test         ...                         train

f                  foo         ...                           foo

g  2002-02-05 00:00:00         ...           2002-02-08 00:00:00

[7 rows x 4 columns]

'''

df2_sort_index = df2.sort_index(axis=0, ascending=False)

'''

Sort by index (axis=0 sorts the row index; axis=1 sorts the column labels)

   a          b    c  d      e    f          g

3  1 2019-01-01  1.0  3  train  foo 2002-02-08

2  1 2019-01-01  1.0  3   test  foo 2002-02-07

1  1 2019-01-01  1.0  3  train  foo 2002-02-06

0  1 2019-01-01  1.0  3   test  foo 2002-02-05

'''
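The call above only sorts the row index (`axis=0`). As a sketch of the other direction, the example below uses a small illustrative frame (not the `df2` from the script) and sorts once by row labels and once by column labels with `axis=1`.

```python
import pandas as pd

# Illustrative frame with unsorted column labels and a custom row index.
frame = pd.DataFrame({"b": [1, 2], "a": [3, 4], "c": [5, 6]}, index=[10, 20])

by_rows = frame.sort_index(axis=0, ascending=False)  # reorder rows by index label
by_cols = frame.sort_index(axis=1, ascending=False)  # reorder columns by column label

print(by_rows)
print(by_cols)
```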

df2_sort_values = df2.sort_values(by='g', ascending=False)

'''

Sort by the values of a column

   a          b    c  d      e    f          g

3  1 2019-01-01  1.0  3  train  foo 2002-02-08

2  1 2019-01-01  1.0  3   test  foo 2002-02-07

1  1 2019-01-01  1.0  3  train  foo 2002-02-06

0  1 2019-01-01  1.0  3   test  foo 2002-02-05

'''
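`sort_values` also accepts a list of columns, with one sort direction per column. The sketch below assumes a two-column frame modelled on the `e` and `g` columns above (the name `frame` is illustrative) and sorts by `e` ascending, then by `g` descending within each group.

```python
import pandas as pd

frame = pd.DataFrame({
    "e": ["test", "train", "test", "train"],
    "g": pd.date_range("2002-02-05", periods=4),
})

# Primary key "e" ascending, secondary key "g" descending.
ordered = frame.sort_values(by=["e", "g"], ascending=[True, False])

print(ordered)
```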
