Predicting with Multivariate Linear Regression in Python
1. Linear regression with two input features
The test data file is ex1data2.txt:
(47 comma-separated data rows, each holding the two input features and a price; the numeric values did not survive transcription.)
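For reference, with $m$ training examples, a design matrix $X$ (the feature columns plus a trailing column of ones for the intercept term), and a parameter vector $\theta$, the cost function and the batch gradient-descent update that the code below implements are

$$J(\theta) = \frac{1}{2m}\,(X\theta - y)^{T}(X\theta - y), \qquad \theta := \theta - \frac{\alpha}{m}\,X^{T}(X\theta - y).$$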
The Python code is as follows:
#-*- coding: UTF-8 -*-
import random
import numpy as np
import matplotlib.pyplot as plt

# Load the data
def load_exdata(filename):
    data = []
    with open(filename, 'r') as f:
        for line in f.readlines():
            line = line.split(',')
            # Use int or another type to match the data being read
            current = [int(item) for item in line]
            data.append(current)
    return data

data = load_exdata('ex1data2.txt')
data = np.array(data, np.int64)  # use int or another type to match the data

# Feature scaling
def featureNormalize(X):
    X_norm = X
    mu = np.zeros((1, X.shape[1]))
    sigma = np.zeros((1, X.shape[1]))
    for i in range(X.shape[1]):
        mu[0, i] = np.mean(X[:, i])    # mean of column i
        sigma[0, i] = np.std(X[:, i])  # standard deviation of column i
    X_norm = (X - mu) / sigma
    return X_norm, mu, sigma

# Compute the cost
def computeCost(X, y, theta):
    m = y.shape[0]
    # Equivalent form: J = np.sum((X.dot(theta) - y)**2) / (2*m)
    C = X.dot(theta) - y
    J2 = (C.T.dot(C)) / (2 * m)
    return J2

# Gradient descent
def gradientDescent(X, y, theta, alpha, num_iters):
    m = y.shape[0]
    # Record the cost at every iteration
    J_history = np.zeros((num_iters, 1))
    for iter in range(num_iters):
        # Differentiating J gives the update: theta := theta - alpha/m * X^T (X*theta - y)
        theta = theta - (alpha / m) * (X.T.dot(X.dot(theta) - y))
        J_history[iter] = computeCost(X, y, theta)
    return J_history, theta

iterations = 10000  # number of iterations (assumed value; the original was lost)
alpha = 0.01        # learning rate

x = data[:, (0, 1)].reshape((-1, 2))  # the two feature columns
y = data[:, 2].reshape((-1, 1))       # the target column
m = y.shape[0]
x, mu, sigma = featureNormalize(x)
X = np.hstack([x, np.ones((x.shape[0], 1))])  # append the intercept column

theta = np.zeros((3, 1))  # two features plus the intercept
j = computeCost(X, y, theta)
J_history, theta = gradientDescent(X, y, theta, alpha, iterations)
print('Theta found by gradient descent', theta)

def predict(data):
    testx = np.array(data)
    testx = (testx - mu) / sigma
    testx = np.hstack([testx, np.ones((testx.shape[0], 1))])
    price = testx.dot(theta)
    print('price is %d' % (price))

predict([1650, 3])  # two-feature example input (the original values were lost)
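matplotlib is imported above but never used. As a quick sketch (assuming the script's J_history and iterations are still in scope), plotting the recorded cost lets you confirm that the chosen learning rate makes gradient descent converge:

# Visualize how the cost falls over the iterations of gradient descent.
plt.plot(np.arange(iterations), J_history.ravel())
plt.xlabel('iteration')
plt.ylabel('cost J')
plt.show()

If the curve rises or oscillates instead of decreasing smoothly, lower alpha; if it flattens very late, more iterations or a larger alpha may be needed.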
2. Multivariate linear regression, using three input features as an example
Input data: testdata.txt. The first column of each row is just the sample's sequence number and is not used as an input feature.
- ,230.1,37.8,69.2,22.1
- ,44.5,39.3,45.1,10.4
- ,17.2,45.9,69.3,9.3
- ,151.5,41.3,58.5,18.5
- ,180.8,10.8,58.4,12.9
- ,8.7,48.9,,7.2
- ,57.5,32.8,23.5,11.8
- ,120.2,19.6,11.6,13.2
- ,8.6,2.1,,4.8
- ,199.8,2.6,21.2,10.6
- ,66.1,5.8,24.2,8.6
- ,214.7,,,17.4
- ,23.8,35.1,65.9,9.2
- ,97.5,7.6,7.2,9.7
- ,204.1,32.9,,
- ,195.4,47.7,52.9,22.4
- ,67.8,36.6,,12.5
- ,281.4,39.6,55.8,24.4
- ,69.2,20.5,18.3,11.3
- ,147.3,23.9,19.1,14.6
- ,218.4,27.7,53.4,
- ,237.4,5.1,23.5,12.5
- ,13.2,15.9,49.6,5.6
- ,228.3,16.9,26.2,15.5
- ,62.3,12.6,18.3,9.7
- ,262.9,3.5,19.5,
- ,142.9,29.3,12.6,
- ,240.1,16.7,22.9,15.9
- ,248.8,27.1,22.9,18.9
- ,70.6,,40.8,10.5
- ,292.9,28.3,43.2,21.4
- ,112.9,17.4,38.6,11.9
- ,97.2,1.5,,9.6
- ,265.6,,0.3,17.4
- ,95.7,1.4,7.4,9.5
- ,290.7,4.1,8.5,12.8
- ,266.9,43.8,,25.4
- ,74.7,49.4,45.7,14.7
- ,43.1,26.7,35.1,10.1
- ,,37.7,,21.5
- ,202.5,22.3,31.6,16.6
- ,,33.4,38.7,17.1
- ,293.6,27.7,1.8,20.7
- ,206.9,8.4,26.4,12.9
- ,25.1,25.7,43.3,8.5
- ,175.1,22.5,31.5,14.9
- ,89.7,9.9,35.7,10.6
- ,239.9,41.5,18.5,23.2
- ,227.2,15.8,49.9,14.8
- ,66.9,11.7,36.8,9.7
- ,199.8,3.1,34.6,11.4
- ,100.4,9.6,3.6,10.7
- ,216.4,41.7,39.6,22.6
- ,182.6,46.2,58.7,21.2
- ,262.7,28.8,15.9,20.2
- ,198.9,49.4,,23.7
- ,7.3,28.1,41.4,5.5
- ,136.2,19.2,16.6,13.2
- ,210.8,49.6,37.7,23.8
- ,210.7,29.5,9.3,18.4
- ,53.5,,21.4,8.1
- ,261.3,42.7,54.7,24.2
- ,239.3,15.5,27.3,15.7
- ,102.7,29.6,8.4,
- ,131.1,42.8,28.9,
- ,,9.3,0.9,9.3
- ,31.5,24.6,2.2,9.5
- ,139.3,14.5,10.2,13.4
- ,237.4,27.5,,18.9
- ,216.8,43.9,27.2,22.3
- ,199.1,30.6,38.7,18.3
- ,109.8,14.3,31.7,12.4
- ,26.8,,19.3,8.8
- ,129.4,5.7,31.3,
- ,213.4,24.6,13.1,
- ,16.9,43.7,89.4,8.7
- ,27.5,1.6,20.7,6.9
- ,120.5,28.5,14.2,14.2
- ,5.4,29.9,9.4,5.3
- ,,7.7,23.1,
- ,76.4,26.7,22.3,11.8
- ,239.8,4.1,36.9,12.3
- ,75.3,20.3,32.5,11.3
- ,68.4,44.5,35.6,13.6
- ,213.5,,33.8,21.7
- ,193.2,18.4,65.7,15.2
- ,76.3,27.5,,
- ,110.7,40.6,63.2,
- ,88.3,25.5,73.4,12.9
- ,109.8,47.8,51.4,16.7
- ,134.3,4.9,9.3,11.2
- ,28.6,1.5,,7.3
- ,217.7,33.5,,19.4
- ,250.9,36.5,72.3,22.2
- ,107.4,,10.9,11.5
- ,163.3,31.6,52.9,16.9
- ,197.6,3.5,5.9,11.7
- ,184.9,,,15.5
- ,289.7,42.3,51.2,25.4
- ,135.2,41.7,45.9,17.2
- ,222.4,4.3,49.8,11.7
- ,296.4,36.3,100.9,23.8
- ,280.2,10.1,21.4,14.8
- ,187.9,17.2,17.9,14.7
- ,238.2,34.3,5.3,20.7
- ,137.9,46.4,,19.2
- ,,,29.7,7.2
- ,90.4,0.3,23.2,8.7
- ,13.1,0.4,25.6,5.3
- ,255.4,26.9,5.5,19.8
- ,225.8,8.2,56.5,13.4
- ,241.7,,23.2,21.8
- ,175.7,15.4,2.4,14.1
- ,209.6,20.6,10.7,15.9
- ,78.2,46.8,34.5,14.6
- ,75.1,,52.7,12.6
- ,139.2,14.3,25.6,12.2
- ,76.4,0.8,14.8,9.4
- ,125.7,36.9,79.2,15.9
- ,19.4,,22.3,6.6
- ,141.3,26.8,46.2,15.5
- ,18.8,21.7,50.4,
- ,,2.4,15.6,11.6
- ,123.1,34.6,12.4,15.2
- ,229.5,32.3,74.2,19.7
- ,87.2,11.8,25.9,10.6
- ,7.8,38.9,50.6,6.6
- ,80.2,,9.2,8.8
- ,220.3,,3.2,24.7
- ,59.6,,43.1,9.7
- ,0.7,39.6,8.7,1.6
- ,265.2,2.9,,12.7
- ,8.4,27.2,2.1,5.7
- ,219.8,33.5,45.1,19.6
- ,36.9,38.6,65.6,10.8
- ,48.3,,8.5,11.6
- ,25.6,,9.3,9.5
- ,273.7,28.9,59.7,20.8
- ,,25.9,20.5,9.6
- ,184.9,43.9,1.7,20.7
- ,73.4,,12.9,10.9
- ,193.7,35.4,75.6,19.2
- ,220.5,33.2,37.9,20.1
- ,104.6,5.7,34.4,10.4
- ,96.2,14.8,38.9,11.4
- ,140.3,1.9,,10.3
- ,240.1,7.3,8.7,13.2
- ,243.2,,44.3,25.4
- ,,40.3,11.9,10.9
- ,44.7,25.8,20.6,10.1
- ,280.7,13.9,,16.1
- ,,8.4,48.7,11.6
- ,197.6,23.3,14.2,16.6
- ,171.3,39.7,37.7,
- ,187.8,21.1,9.5,15.6
- ,4.1,11.6,5.7,3.2
- ,93.9,43.5,50.5,15.3
- ,149.8,1.3,24.3,10.1
- ,11.7,36.9,45.2,7.3
- ,131.7,18.4,34.6,12.9
- ,172.5,18.1,30.7,14.4
- ,85.7,35.8,49.3,13.3
- ,188.4,18.1,25.6,14.9
- ,163.5,36.8,7.4,
- ,117.2,14.7,5.4,11.9
- ,234.5,3.4,84.8,11.9
- ,17.9,37.6,21.6,
- ,206.8,5.2,19.4,12.2
- ,215.4,23.6,57.6,17.1
- ,284.3,10.6,6.4,
- ,,11.6,18.4,8.4
- ,164.5,20.9,47.4,14.5
- ,19.6,20.1,,7.6
- ,168.4,7.1,12.8,11.7
- ,222.4,3.4,13.1,11.5
- ,276.9,48.9,41.8,
- ,248.4,30.2,20.3,20.2
- ,170.2,7.8,35.2,11.7
- ,276.7,2.3,23.7,11.8
- ,165.6,,17.6,12.6
- ,156.6,2.6,8.3,10.5
- ,218.5,5.4,27.4,12.2
- ,56.2,5.7,29.7,8.7
- ,287.6,,71.8,26.2
- ,253.8,21.3,,17.6
- ,,45.1,19.6,22.6
- ,139.5,2.1,26.6,10.3
- ,191.1,28.7,18.2,17.3
- ,,13.9,3.7,15.9
- ,18.7,12.1,23.4,6.7
- ,39.5,41.1,5.8,10.8
- ,75.5,10.8,,9.9
- ,17.2,4.1,31.6,5.9
- ,166.8,,3.6,19.6
- ,149.7,35.6,,17.3
- ,38.2,3.7,13.8,7.6
- ,94.2,4.9,8.1,9.7
- ,,9.3,6.4,12.8
- ,283.6,,66.2,25.5
- ,232.1,8.6,8.7,13.4
The Python code:
#-*- coding: UTF-8 -*-
import random
import numpy as np
import matplotlib.pyplot as plt

# Load the data
def load_exdata(filename):
    data = []
    with open(filename, 'r') as f:
        for line in f.readlines():
            line = line.split(',')
            current = [float(item) for item in line]
            data.append(current)
    return data

data = load_exdata('testdata.txt')
data = np.array(data, np.float64)  # the data is floating point

# Feature scaling
def featureNormalize(X):
    X_norm = X
    mu = np.zeros((1, X.shape[1]))
    sigma = np.zeros((1, X.shape[1]))
    for i in range(X.shape[1]):
        mu[0, i] = np.mean(X[:, i])    # mean of column i
        sigma[0, i] = np.std(X[:, i])  # standard deviation of column i
    X_norm = (X - mu) / sigma
    return X_norm, mu, sigma

# Compute the cost
def computeCost(X, y, theta):
    m = y.shape[0]
    # Equivalent form: J = np.sum((X.dot(theta) - y)**2) / (2*m)
    C = X.dot(theta) - y
    J2 = (C.T.dot(C)) / (2 * m)
    return J2

# Gradient descent
def gradientDescent(X, y, theta, alpha, num_iters):
    m = y.shape[0]
    # Record the cost at every iteration
    J_history = np.zeros((num_iters, 1))
    for iter in range(num_iters):
        # Differentiating J gives the update: theta := theta - alpha/m * X^T (X*theta - y)
        theta = theta - (alpha / m) * (X.T.dot(X.dot(theta) - y))
        J_history[iter] = computeCost(X, y, theta)
    return J_history, theta

iterations = 10000  # number of iterations (assumed value; the original was lost)
alpha = 0.01        # learning rate

# Input features: columns 1, 2 and 3 of each row (column 0 is the row index),
# reshaped into an m x 3 matrix
x = data[:, (1, 2, 3)].reshape((-1, 3))
# Target: the fourth column of each row
y = data[:, 4].reshape((-1, 1))
m = y.shape[0]
x, mu, sigma = featureNormalize(x)
X = np.hstack([x, np.ones((x.shape[0], 1))])  # append the intercept column

theta = np.zeros((4, 1))  # three features plus the intercept, so theta has four components
j = computeCost(X, y, theta)
J_history, theta = gradientDescent(X, y, theta, alpha, iterations)
print('Theta found by gradient descent', theta)

def predict(data):
    testx = np.array(data)
    testx = (testx - mu) / sigma
    testx = np.hstack([testx, np.ones((testx.shape[0], 1))])
    price = testx.dot(theta)
    print('predict value is %f' % (price))

predict([151.5, 41.3, 58.5])  # the input is 3-dimensional
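As a sanity check (not part of the original script), the same fit can be reproduced with scikit-learn's LinearRegression, which solves the least-squares problem in closed form. This sketch assumes scikit-learn is installed and reuses the normalized feature matrix x and target y from the script above:

# Cross-check the gradient-descent result against scikit-learn's exact solution.
from sklearn.linear_model import LinearRegression

reg = LinearRegression()  # fits an intercept by default
reg.fit(x, y)             # x is the normalized feature matrix from above
print('sklearn coefficients:', reg.coef_)    # should approach theta[0:3].T once converged
print('sklearn intercept:', reg.intercept_)  # should approach theta[3]

If gradient descent has converged, the two sets of parameters agree to several decimal places; a large gap usually means too few iterations or a poorly chosen learning rate.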