Pandas Basics
Loading data
Function: pandas.read_csv
>>> import pandas
>>> food_info = pandas.read_csv("food_info.csv")
>>> print(food_info.dtypes)
NDB_No int64
Shrt_Desc object
Water_(g) float64
Energ_Kcal int64
Protein_(g) float64
Lipid_Tot_(g) float64
Ash_(g) float64
Carbohydrt_(g) float64
Fiber_TD_(g) float64
Sugar_Tot_(g) float64
Calcium_(mg) float64
Iron_(mg) float64
Magnesium_(mg) float64
Phosphorus_(mg) float64
Potassium_(mg) float64
Sodium_(mg) float64
Zinc_(mg) float64
Copper_(mg) float64
Manganese_(mg) float64
Selenium_(mcg) float64
Vit_C_(mg) float64
Thiamin_(mg) float64
Riboflavin_(mg) float64
Niacin_(mg) float64
Vit_B6_(mg) float64
Vit_B12_(mcg) float64
Vit_A_IU float64
Vit_A_RAE float64
Vit_E_(mg) float64
Vit_D_mcg float64
Vit_D_IU float64
Vit_K_(mcg) float64
FA_Sat_(g) float64
FA_Mono_(g) float64
FA_Poly_(g) float64
Cholestrl_(mg) float64
dtype: object
>>> print(type(food_info))
<class 'pandas.core.frame.DataFrame'>
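The same loading step can be tried without the real file. The following is a minimal sketch that assumes a made-up three-row sample in place of food_info.csv; the names csv_text and sample are illustrative and not part of the original session.

```python
import io

import pandas as pd

# Hypothetical three-row sample standing in for food_info.csv.
csv_text = """NDB_No,Shrt_Desc,Water_(g),Energ_Kcal
1001,BUTTER WITH SALT,15.87,717
1002,BUTTER WHIPPED WITH SALT,15.87,717
1003,BUTTER OIL ANHYDROUS,0.24,876
"""

# read_csv accepts a file path or any file-like object.
sample = pd.read_csv(io.StringIO(csv_text))

# One inferred dtype per column: whole numbers become int64 and decimals
# become float64. Text columns show as object in the output above; the
# exact name for text columns may differ between pandas versions.
print(sample.dtypes)
print(type(sample))
```

The dtype of each column is inferred from its values, which is why a column that contains missing values is usually reported as float64 even when the remaining values are whole numbers.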
Getting the head and tail of the data
Head: head
>>> food_info.head(1)
Tail: tail
>>> food_info.tail(1)
shape
>>> food_info.shape
(8618, 36)
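As a small sketch of how head, tail, and shape behave, assuming a hypothetical five-row frame (the frame df and its values are illustrative only):

```python
import pandas as pd

# Hypothetical five-row frame; the values are illustrative only.
df = pd.DataFrame({
    "NDB_No": [1001, 1002, 1003, 1004, 1005],
    "Energ_Kcal": [717, 717, 876, 353, 371],
})

first_two = df.head(2)     # first 2 rows
last_one = df.tail(1)      # last row only
default_head = df.head()   # with no argument, at most 5 rows

# shape is an attribute (no parentheses): (number of rows, number of columns)
n_rows, n_cols = df.shape

print(first_two)
print(last_one)
print(n_rows, n_cols)
```

Both head and tail return a new DataFrame and leave the original untouched.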
Selecting data
Selecting a single row
>>> print(food_info.loc[0])
NDB_No 1001
Shrt_Desc BUTTER WITH SALT
Water_(g) 15.87
Energ_Kcal 717
Protein_(g) 0.85
Lipid_Tot_(g) 81.11
Ash_(g) 2.11
Carbohydrt_(g) 0.06
Fiber_TD_(g) 0
Sugar_Tot_(g) 0.06
Calcium_(mg) 24
Iron_(mg) 0.02
Magnesium_(mg) 2
Phosphorus_(mg) 24
Potassium_(mg) 24
Sodium_(mg) 643
Zinc_(mg) 0.09
Copper_(mg) 0
Manganese_(mg) 0
Selenium_(mcg) 1
Vit_C_(mg) 0
Thiamin_(mg) 0.005
Riboflavin_(mg) 0.034
Niacin_(mg) 0.042
Vit_B6_(mg) 0.003
Vit_B12_(mcg) 0.17
Vit_A_IU 2499
Vit_A_RAE 684
Vit_E_(mg) 2.32
Vit_D_mcg 1.5
Vit_D_IU 60
Vit_K_(mcg) 7
FA_Sat_(g) 51.368
FA_Mono_(g) 21.021
FA_Poly_(g) 3.043
Cholestrl_(mg) 215
Name: 0, dtype: object
Selecting a range of rows (loc slices by label, so both endpoints are included)
>>> print(food_info.loc[1:2])
NDB_No Shrt_Desc Water_(g) Energ_Kcal Protein_(g) \
1 1002 BUTTER WHIPPED WITH SALT 15.87 717 0.85
2 1003 BUTTER OIL ANHYDROUS 0.24 876 0.28
Lipid_Tot_(g) Ash_(g) Carbohydrt_(g) Fiber_TD_(g) Sugar_Tot_(g) \
1 81.11 2.11 0.06 0.0 0.06
2 99.48 0.00 0.00 0.0 0.00
... Vit_A_IU Vit_A_RAE Vit_E_(mg) Vit_D_mcg Vit_D_IU \
1 ... 2499.0 684.0 2.32 1.5 60.0
2 ... 3069.0 840.0 2.80 1.8 73.0
Vit_K_(mcg) FA_Sat_(g) FA_Mono_(g) FA_Poly_(g) Cholestrl_(mg)
1 7.0 50.489 23.426 3.012 219.0
2 8.6 61.924 28.732 3.694 256.0
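The difference between label-based and position-based row selection can be sketched as follows, assuming a hypothetical four-row frame with the default integer index. The iloc accessor does not appear in the session above and is shown only for contrast.

```python
import pandas as pd

# Hypothetical four-row frame with the default 0..3 index.
df = pd.DataFrame({
    "NDB_No": [1001, 1002, 1003, 1004],
    "Shrt_Desc": [
        "BUTTER WITH SALT",
        "BUTTER WHIPPED WITH SALT",
        "BUTTER OIL ANHYDROUS",
        "CHEESE BLUE",
    ],
})

row = df.loc[0]              # single label: returns a Series for that row
by_label = df.loc[1:2]       # label slice: both ends included, 2 rows
by_position = df.iloc[1:2]   # positional slice: end excluded, 1 row
picked = df.loc[[0, 3]]      # list of labels: exactly those rows

print(by_label)
print(by_position)
```

With the default index, labels and positions coincide, so the two accessors only start to differ visibly once the rows are reordered or filtered.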
Selecting a column
>>> print(food_info["NDB_No"])
0 1001
1 1002
2 1003
3 1004
4 1005
5 1006
6 1007
7 1008
8 1009
9 1010
10 1011
11 1012
12 1013
13 1014
14 1015
15 1016
16 1017
17 1018
18 1019
19 1020
20 1021
21 1022
22 1023
23 1024
24 1025
25 1026
26 1027
27 1028
28 1029
29 1030
...
8588 43544
8589 43546
8590 43550
8591 43566
8592 43570
8593 43572
8594 43585
8595 43589
8596 43595
8597 43597
8598 43598
8599 44005
8600 44018
8601 44048
8602 44055
8603 44061
8604 44074
8605 44110
8606 44158
8607 44203
8608 44258
8609 44259
8610 44260
8611 48052
8612 80200
8613 83110
8614 90240
8615 90480
8616 90560
8617 93600
Name: NDB_No, Length: 8618, dtype: int64
Selecting multiple columns
>>> print(food_info[["NDB_No","Shrt_Desc"]])
NDB_No Shrt_Desc
0 1001 BUTTER WITH SALT
1 1002 BUTTER WHIPPED WITH SALT
2 1003 BUTTER OIL ANHYDROUS
3 1004 CHEESE BLUE
4 1005 CHEESE BRICK
5 1006 CHEESE BRIE
6 1007 CHEESE CAMEMBERT
7 1008 CHEESE CARAWAY
8 1009 CHEESE CHEDDAR
9 1010 CHEESE CHESHIRE
10 1011 CHEESE COLBY
11 1012 CHEESE COTTAGE CRMD LRG OR SML CURD
12 1013 CHEESE COTTAGE CRMD W/FRUIT
13 1014 CHEESE COTTAGE NONFAT UNCRMD DRY LRG OR SML CURD
14 1015 CHEESE COTTAGE LOWFAT 2% MILKFAT
15 1016 CHEESE COTTAGE LOWFAT 1% MILKFAT
16 1017 CHEESE CREAM
17 1018 CHEESE EDAM
18 1019 CHEESE FETA
19 1020 CHEESE FONTINA
20 1021 CHEESE GJETOST
21 1022 CHEESE GOUDA
22 1023 CHEESE GRUYERE
23 1024 CHEESE LIMBURGER
24 1025 CHEESE MONTEREY
25 1026 CHEESE MOZZARELLA WHL MILK
26 1027 CHEESE MOZZARELLA WHL MILK LO MOIST
27 1028 CHEESE MOZZARELLA PART SKIM MILK
28 1029 CHEESE MOZZARELLA LO MOIST PART-SKIM
29 1030 CHEESE MUENSTER
... ... ...
8588 43544 BABYFOOD CRL RICE W/ PEARS & APPL DRY INST
8589 43546 BABYFOOD BANANA NO TAPIOCA STR
8590 43550 BABYFOOD BANANA APPL DSSRT STR
8591 43566 SNACKS TORTILLA CHIPS LT (BAKED W/ LESS OIL)
8592 43570 CEREALS RTE POST HONEY BUNCHES OF OATS HONEY RSTD
8593 43572 POPCORN MICROWAVE LOFAT&NA
8594 43585 BABYFOOD FRUIT SUPREME DSSRT
8595 43589 CHEESE SWISS LOW FAT
8596 43595 BREAKFAST BAR CORN FLAKE CRUST W/FRUIT
8597 43597 CHEESE MOZZARELLA LO NA
8598 43598 MAYONNAISE DRSNG NO CHOL
8599 44005 OIL CORN PEANUT AND OLIVE
8600 44018 SWEETENERS TABLETOP FRUCTOSE LIQ
8601 44048 CHEESE FOOD IMITATION
8602 44055 CELERY FLAKES DRIED
8603 44061 PUDDINGS CHOC FLAVOR LO CAL INST DRY MIX
8604 44074 BABYFOOD GRAPE JUC NO SUGAR CND
8605 44110 JELLIES RED SUGAR HOME PRESERVED
8606 44158 PIE FILLINGS BLUEBERRY CND
8607 44203 COCKTAIL MIX NON-ALCOHOLIC CONCD FRZ
8608 44258 PUDDINGS CHOC FLAVOR LO CAL REG DRY MIX
8609 44259 PUDDINGS ALL FLAVORS XCPT CHOC LO CAL REG DRY MIX
8610 44260 PUDDINGS ALL FLAVORS XCPT CHOC LO CAL INST DRY...
8611 48052 VITAL WHEAT GLUTEN
8612 80200 FROG LEGS RAW
8613 83110 MACKEREL SALTED
8614 90240 SCALLOP (BAY&SEA) CKD STMD
8615 90480 SYRUP CANE
8616 90560 SNAIL RAW
8617 93600 TURTLE GREEN RAW
[8618 rows x 2 columns]
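A short sketch of what each form of column selection returns, assuming a hypothetical three-row frame: a single column name gives a Series, while a list of names gives a DataFrame, even when the list has only one entry.

```python
import pandas as pd

# Hypothetical three-row frame; values are illustrative only.
df = pd.DataFrame({
    "NDB_No": [1001, 1002, 1003],
    "Shrt_Desc": ["BUTTER WITH SALT", "BUTTER WHIPPED WITH SALT",
                  "BUTTER OIL ANHYDROUS"],
    "Energ_Kcal": [717, 717, 876],
})

one = df["NDB_No"]                  # string key: a Series
two = df[["NDB_No", "Shrt_Desc"]]   # list of keys: a DataFrame
one_col_frame = df[["NDB_No"]]      # one-element list: still a DataFrame

print(type(one).__name__)
print(type(two).__name__)
print(one_col_frame.shape)
```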
Getting all column names
>>> food_info.columns.tolist()
['NDB_No', 'Shrt_Desc', 'Water_(g)', 'Energ_Kcal', 'Protein_(g)', 'Lipid_Tot_(g)', 'Ash_(g)', 'Carbohydrt_(g)', 'Fiber_TD_(g)', 'Sugar_Tot_(g)', 'Calcium_(mg)', 'Iron_(mg)', 'Magnesium_(mg)', 'Phosphorus_(mg)', 'Potassium_(mg)', 'Sodium_(mg)', 'Zinc_(mg)', 'Copper_(mg)', 'Manganese_(mg)', 'Selenium_(mcg)', 'Vit_C_(mg)', 'Thiamin_(mg)', 'Riboflavin_(mg)', 'Niacin_(mg)', 'Vit_B6_(mg)', 'Vit_B12_(mcg)', 'Vit_A_IU', 'Vit_A_RAE', 'Vit_E_(mg)', 'Vit_D_mcg', 'Vit_D_IU', 'Vit_K_(mcg)', 'FA_Sat_(g)', 'FA_Mono_(g)', 'FA_Poly_(g)', 'Cholestrl_(mg)']
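One possible use of the column-name list is to pick columns by their unit suffix. The sketch below assumes a hypothetical one-row frame and selects the columns measured in grams; the names gram_cols and gram_df are illustrative.

```python
import pandas as pd

# Hypothetical one-row frame using a few of the column names shown above.
df = pd.DataFrame({
    "NDB_No": [1001],
    "Water_(g)": [15.87],
    "Protein_(g)": [0.85],
    "Calcium_(mg)": [24.0],
})

names = df.columns.tolist()   # plain Python list of column names

# Keep only the names that end with the gram unit suffix.
gram_cols = [name for name in names if name.endswith("(g)")]
gram_df = df[gram_cols]       # a list of names selects a DataFrame

print(gram_cols)
print(gram_df)
```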
Sorting
Ascending
inplace=True sorts the current object in place; to get a new, sorted object back instead, set it to False (the default).
>>> food_info.sort_values("Water_(g)", inplace=True)
>>> food_info["Water_(g)"]
760 0.00
8599 0.00
654 0.00
631 0.00
630 0.00
629 0.00
611 0.00
610 0.00
655 0.00
673 0.00
663 0.00
671 0.00
670 0.00
669 0.00
633 0.00
668 0.00
700 0.00
665 0.00
664 0.00
662 0.00
656 0.00
661 0.00
660 0.00
659 0.00
658 0.00
657 0.00
699 0.00
737 0.00
8122 0.00
667 0.00
...
4270 99.80
4411 99.85
4408 99.89
4357 99.90
4239 99.90
4356 99.90
4369 99.90
4347 99.90
4205 99.90
4203 99.93
4204 99.95
4208 99.95
4213 99.95
4374 99.96
4407 99.97
4379 99.97
4373 99.97
4404 99.98
4372 99.98
4377 100.00
4378 100.00
4348 100.00
4209 100.00
4376 100.00
6150 NaN
6067 NaN
6113 NaN
1983 NaN
7776 NaN
6095 NaN
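The effect of the inplace flag can be sketched on a hypothetical four-row frame with one missing value. The names sorted_df, in_place, and result are illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing value.
df = pd.DataFrame({"Water_(g)": [15.87, np.nan, 0.24, 99.9]})

# Default (inplace=False): a new sorted frame is returned, df is unchanged.
sorted_df = df.sort_values("Water_(g)")

# inplace=True: the frame itself is reordered and the call returns None.
in_place = df.copy()
result = in_place.sort_values("Water_(g)", inplace=True)

print(sorted_df.index.tolist())   # original labels travel with their rows
print(df.index.tolist())          # df keeps its original order
print(result)
```

Rows keep their original index labels after sorting, which is why the index column in the output above is no longer in numeric order, and the NaN rows are placed last.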
Descending
>>> food_info.sort_values("Water_(g)", inplace=True, ascending=False)
>>> food_info["Water_(g)"]
4376 100.00
4209 100.00
4348 100.00
4378 100.00
4377 100.00
4372 99.98
4404 99.98
4407 99.97
4379 99.97
4373 99.97
4374 99.96
4213 99.95
4208 99.95
4204 99.95
4203 99.93
4356 99.90
4357 99.90
4239 99.90
4205 99.90
4369 99.90
4347 99.90
4408 99.89
4411 99.85
4270 99.80
4252 99.80
4392 99.80
4260 99.80
4409 99.79
4255 99.74
4397 99.70
...
739 0.00
790 0.00
638 0.00
689 0.00
688 0.00
687 0.00
686 0.00
685 0.00
666 0.00
632 0.00
653 0.00
639 0.00
696 0.00
8455 0.00
791 0.00
675 0.00
8180 0.00
704 0.00
705 0.00
706 0.00
707 0.00
738 0.00
6417 0.00
760 0.00
6150 NaN
6067 NaN
6113 NaN
1983 NaN
7776 NaN
6095 NaN
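In the descending output the NaN rows still come last. A minimal sketch of that behavior, assuming a hypothetical four-row frame, together with two options that are not used in the session above: na_position to move missing values to the front, and reset_index to renumber the rows after sorting.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing value.
df = pd.DataFrame({"Water_(g)": [15.87, np.nan, 0.24, 99.9]})

# Descending order; missing values are still placed last by default.
desc = df.sort_values("Water_(g)", ascending=False)

# na_position="first" moves missing values to the top instead.
nan_first = df.sort_values("Water_(g)", ascending=False, na_position="first")

# After sorting, loc[0] still refers to the row labelled 0, not the top row.
# reset_index(drop=True) renumbers the rows 0..n-1 in their new order.
renumbered = desc.reset_index(drop=True)

print(desc.index.tolist())
print(nan_first.index.tolist())
print(renumbered.loc[0, "Water_(g)"])
```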