吴裕雄--天生自然 python数据分析:健康指标聚集分析(健康分析)

# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load in import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv) # Input data files are available in the "../input/" directory.
# For example, running this (by clicking run or pressing Shift+Enter) will list the files in the input directory
df=pd.read_csv('F:\\kaggleDataSet\\Key_indicator_districtwise\\Key_indicator_districtwise.csv')
df.head()

x=df['AA_Sample_Units_Total']
y=df['AA_Sample_Units_Rural']
z=df['AA_Population_Urban']
import matplotlib.pyplot as plt
import seaborn as sns
plt.title('State_District_Name vs AA_Sample_Units_Total ')
plt.xlabel('State_District_Name')
plt.ylabel('AA_Sample_Units_Total')
plt.scatter(x,y)

plt.hist(x)
plt.title('AA_Sample_Units_Total vs Frequency')
plt.xlabel('AA_Sample_Units_Total')
plt.ylabel('Frequency')

plt.hist(y)
plt.title('AA_Sample_Units_Rural vs frequency')
plt.xlabel('AA_Sample_Units_Rural')
plt.ylabel('Frequency')

plt.hist(z)
plt.title('AA_Population_Urban vs Frequency')
plt.xlabel('AA_Population_Urban')
plt.ylabel('Frequency')

q=df['AA_Ever_Married_Women_Aged_15_49_Years_Total']
q
w=q.sort_values()
w

plt.boxplot(w)

plt.boxplot(y)

import matplotlib.pyplot as plt
import numpy as np
from sklearn import datasets, linear_model, metrics # load the boston dataset
boston = datasets.load_boston(return_X_y=False) # defining feature matrix(X) and response vector(y)
X = boston.data
y = boston.target # splitting X and y into training and testing sets
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4,
random_state=1) # create linear regression object
reg = linear_model.LinearRegression() # train the model using the training sets
reg.fit(X_train, y_train) # regression coefficients
print('Coefficients: \n', reg.coef_) # variance score: 1 means perfect prediction
print('Variance score: {}'.format(reg.score(X_test, y_test))) # plot for residual error ## setting plot style
plt.style.use('fivethirtyeight') ## plotting residual errors in training data
plt.scatter(reg.predict(X_train), reg.predict(X_train) - y_train,
color = "green", s = 10, label = 'Train data') ## plotting residual errors in test data
plt.scatter(reg.predict(X_test), reg.predict(X_test) - y_test,
color = "blue", s = 10, label = 'Test data') ## plotting line for zero residual error
plt.hlines(y = 0, xmin = 0, xmax = 50, linewidth = 2) ## plotting legend
plt.legend(loc = 'upper right') ## plot title
plt.title("Residual errors") ## function to show plot
plt.show()

吴裕雄--天生自然 python数据分析:健康指标聚集分析(健康分析)的更多相关文章
- 吴裕雄--天生自然 PYTHON数据分析:基于Keras的CNN分析太空深处寻找系外行星数据
#We import libraries for linear algebra, graphs, and evaluation of results import numpy as np import ...
- 吴裕雄--天生自然 PYTHON数据分析:人类发展报告——HDI, GDI,健康,全球人口数据数据分析
import pandas as pd # Data analysis import numpy as np #Data analysis import seaborn as sns # Data v ...
- 吴裕雄--天生自然 PYTHON数据分析:糖尿病视网膜病变数据分析(完整版)
# This Python 3 environment comes with many helpful analytics libraries installed # It is defined by ...
- 吴裕雄--天生自然 PYTHON数据分析:所有美国股票和etf的历史日价格和成交量分析
# This Python 3 environment comes with many helpful analytics libraries installed # It is defined by ...
- 吴裕雄--天生自然 python数据分析:葡萄酒分析
# import pandas import pandas as pd # creating a DataFrame pd.DataFrame({'Yes': [50, 31], 'No': [101 ...
- 吴裕雄--天生自然 python数据分析:医疗费数据分析
import numpy as np import pandas as pd import os import matplotlib.pyplot as pl import seaborn as sn ...
- 吴裕雄--天生自然 python数据分析:基于Keras使用CNN神经网络处理手写数据集
import pandas as pd import numpy as np import matplotlib.pyplot as plt import matplotlib.image as mp ...
- 吴裕雄--天生自然 PYTHON数据分析:钦奈水资源管理分析
df = pd.read_csv("F:\\kaggleDataSet\\chennai-water\\chennai_reservoir_levels.csv") df[&quo ...
- 吴裕雄--天生自然 PYTHON数据分析:医疗数据分析
import numpy as np # linear algebra import pandas as pd # data processing, CSV file I/O (e.g. pd.rea ...
随机推荐
- 关于Java自动拆箱装箱中的缓存问题
package cn.zhang.test; /** * 测试自动装箱拆箱 * 自动装箱:基本类型自动转为包装类对象 * 自动拆箱:包装类对象自动转化为基本数据类型 * * * /*缓存问题*/ /* ...
- 有几个水洼(DFS)
#include <iostream> #include<cstdio> using namespace std; #define maxn 105 char field[ma ...
- Python 学习笔记:Python 操作 SQL Server 数据库
最近要将数据写到数据库里,学习了一下如何用 Python 来操作 SQL Server 数据库. 一.连接数据库: 首先,我们要连接 SQL Server 数据库,需要安装 pymssql 这个第三方 ...
- java去掉数字后面的0
有些财务业务场景是需要把数字多余的0去掉的. 可以这么写 private String getRealData(BigDecimal num) { if (num == null) { return ...
- SaltSack 中Job管理
一.简介 Jid: job id的格式为%Y%m%d%H%M%S%f master在下发指令消息时,会附带上产生的jid,minion在接收到指令开始执行时,会在本地的cachedir(默认是/var ...
- 一篇文章带你了解axios网络交互-Vue
来源:滁州SEO 1 **什么是axios呢?**了解,并去使用它,对于axios发送请求的两种方式有何了解,以及涉及axios跨域问题如何解决. 对于axios网络交互,去使用axios的同时,首先 ...
- Gene family|
6.1引言 随着测序技术的提高,能被测序的物种趋近于复杂(因为越高等的生物基因组大且复杂(1.本身基因结构复杂2.复杂程度与种属关系并不相关)),所以基因家族(Gene family)的数目可能能够更 ...
- 关于前端JS的总结
简介 JavaScript是一种计算机编程语言,可以像等其他编程语言那样定义变量,执行循环等.主要执行在浏览器上,为HTML页面提供动态效果,而且JavaScript是一种脚本语言,它的代码是解释执行 ...
- shell_clean_log
apache日志每天进行轮转: vim /usr/local/apache2/conf/extar/httpd-vhosts.conf...ErrorLog "| /usr/local/ap ...
- mysql truncate 的问题
问题是微信群里一伙计提的 `mysql truncate 空表都需要3 4秒,要优化解决` 一开始觉得这莫名其妙,因为作这种操作的都是后台运维,不是实时的对外服务,运维又不差这3秒 其反应trunca ...