tensorflow knn 预测房价注意有 Min-Max Scaling

示例数据：

0.00632  18.00   2.310  0  0.5380  6.5750  65.20  4.0900   1  296.0  15.30 396.90   4.98  24.00
 0.02731   0.00   7.070  0  0.4690  6.4210  78.90  4.9671   2  242.0  17.80 396.90   9.14  21.60
 0.02729   0.00   7.070  0  0.4690  7.1850  61.10  4.9671   2  242.0  17.80 392.83   4.03  34.70
 0.03237   0.00   2.180  0  0.4580  6.9980  45.80  6.0622   3  222.0  18.70 394.63   2.94  33.40
 0.06905   0.00   2.180  0  0.4580  7.1470  54.20  6.0622   3  222.0  18.70 396.90   5.33  36.20
 0.02985   0.00   2.180  0  0.4580  6.4300  58.70  6.0622   3  222.0  18.70 394.12   5.21  28.70
 0.08829  12.50   7.870  0  0.5240  6.0120  66.60  5.5605   5  311.0  15.20 395.60  12.43  22.90
 0.14455  12.50   7.870  0  0.5240  6.1720  96.10  5.9505   5  311.0  15.20 396.90  19.15  27.10
 0.21124  12.50   7.870  0  0.5240  5.6310 100.00  6.0821   5  311.0  15.20 386.63  29.93  16.50
 0.17004  12.50   7.870  0  0.5240  6.0040  85.90  6.5921   5  311.0  15.20 386.71  17.10  18.90
 0.22489  12.50   7.870  0  0.5240  6.3770  94.30  6.3467   5  311.0  15.20 392.52  20.45  15.00
 0.11747  12.50   7.870  0  0.5240  6.0090  82.90  6.2267   5  311.0  15.20 396.90  13.27  18.90
 0.09378  12.50   7.870  0  0.5240  5.8890  39.00  5.4509   5  311.0  15.20 390.50  15.71  21.70
 0.62976   0.00   8.140  0  0.5380  5.9490  61.80  4.7075   4  307.0  21.00 396.90   8.26  20.40
 0.63796   0.00   8.140  0  0.5380  6.0960  84.50  4.4619   4  307.0  21.00 380.02  10.26  18.20
 0.62739   0.00   8.140  0  0.5380  5.8340  56.50  4.4986   4  307.0  21.00 395.62   8.47  19.90
 1.05393   0.00   8.140  0  0.5380  5.9350  29.30  4.4986   4  307.0  21.00 386.85   6.58  23.10

代码：最大值与最小值之差：ptp()

# k-Nearest Neighbor
#----------------------------------
#
# This function illustrates how to use
# k-nearest neighbors in tensorflow
#
# We will use the 1970s Boston housing dataset
# which is available through the UCI
# ML data repository.
#
# Data:
#----------x-values-----------
# CRIM   : per capita crime rate by town
# ZN     : prop. of res. land zones
# INDUS  : prop. of non-retail business acres
# CHAS   : Charles river dummy variable
# NOX    : nitrix oxides concentration / 10 M
# RM     : Avg. # of rooms per building
# AGE    : prop. of buildings built prior to 1940
# DIS    : Weighted distances to employment centers
# RAD    : Index of radian highway access
# TAX    : Full tax rate value per $10k
# PTRATIO: Pupil/Teacher ratio by town
# B      : 1000*(Bk-0.63)^2, Bk=prop. of blacks
# LSTAT  : % lower status of pop
#------------y-value-----------
# MEDV   : Median Value of homes in $1,000's
 
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
import requests
from tensorflow.python.framework import ops
ops.reset_default_graph()
 
# Create graph
sess = tf.Session()
 
# Load the data
housing_url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/housing/housing.data'
housing_header = ['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT', 'MEDV']
cols_used = ['CRIM', 'INDUS', 'NOX', 'RM', 'AGE', 'DIS', 'TAX', 'PTRATIO', 'B', 'LSTAT']
num_features = len(cols_used)
housing_file = requests.get(housing_url)
housing_data = [[float(x) for x in y.split(' ') if len(x)>=1] for y in housing_file.text.split('\n') if len(y)>=1]
 
y_vals = np.transpose([np.array([y[13] for y in housing_data])])
x_vals = np.array([[x for i,x in enumerate(y) if housing_header[i] in cols_used] for y in housing_data])
 
## Min-Max Scaling
x_vals = (x_vals - x_vals.min(0)) / x_vals.ptp(0)
 
# Split the data into train and test sets
np.random.seed(13)  #make results reproducible
train_indices = np.random.choice(len(x_vals), round(len(x_vals)*0.8), replace=False)
test_indices = np.array(list(set(range(len(x_vals))) - set(train_indices)))
x_vals_train = x_vals[train_indices]
x_vals_test = x_vals[test_indices]
y_vals_train = y_vals[train_indices]
y_vals_test = y_vals[test_indices]
 
# Declare k-value and batch size
k = 4
batch_size=len(x_vals_test)
 
# Placeholders
x_data_train = tf.placeholder(shape=[None, num_features], dtype=tf.float32)
x_data_test = tf.placeholder(shape=[None, num_features], dtype=tf.float32)
y_target_train = tf.placeholder(shape=[None, 1], dtype=tf.float32)
y_target_test = tf.placeholder(shape=[None, 1], dtype=tf.float32)
 
# Declare distance metric
# L1
distance = tf.reduce_sum(tf.abs(tf.subtract(x_data_train, tf.expand_dims(x_data_test,1))), axis=2)
 
# L2
#distance = tf.sqrt(tf.reduce_sum(tf.square(tf.subtract(x_data_train, tf.expand_dims(x_data_test,1))), reduction_indices=1))
 
# Predict: Get min distance index (Nearest neighbor)
#prediction = tf.arg_min(distance, 0)
top_k_xvals, top_k_indices = tf.nn.top_k(tf.negative(distance), k=k)
x_sums = tf.expand_dims(tf.reduce_sum(top_k_xvals, 1),1)
x_sums_repeated = tf.matmul(x_sums,tf.ones([1, k], tf.float32))
x_val_weights = tf.expand_dims(tf.div(top_k_xvals,x_sums_repeated), 1)
 
top_k_yvals = tf.gather(y_target_train, top_k_indices)
prediction = tf.squeeze(tf.matmul(x_val_weights,top_k_yvals), axis=[1])
 
# Calculate MSE
mse = tf.div(tf.reduce_sum(tf.square(tf.subtract(prediction, y_target_test))), batch_size)
 
# Calculate how many loops over training data
num_loops = int(np.ceil(len(x_vals_test)/batch_size))
 
for i in range(num_loops):
    min_index = i*batch_size
    max_index = min((i+1)*batch_size,len(x_vals_train))
    x_batch = x_vals_test[min_index:max_index]
    y_batch = y_vals_test[min_index:max_index]
    predictions = sess.run(prediction, feed_dict={x_data_train: x_vals_train, x_data_test: x_batch,
                                         y_target_train: y_vals_train, y_target_test: y_batch})
    batch_mse = sess.run(mse, feed_dict={x_data_train: x_vals_train, x_data_test: x_batch,
                                         y_target_train: y_vals_train, y_target_test: y_batch})
 
    print('Batch #' + str(i+1) + ' MSE: ' + str(np.round(batch_mse,3)))
 
# Plot prediction and actual distribution
bins = np.linspace(5, 50, 45)
 
plt.hist(predictions, bins, alpha=0.5, label='Prediction')
plt.hist(y_batch, bins, alpha=0.5, label='Actual')
plt.title('Histogram of Predicted and Actual Values')
plt.xlabel('Med Home Value in $1,000s')
plt.ylabel('Frequency')
plt.legend(loc='upper right')
plt.show()

tensorflow knn 预测房价注意有 Min-Max Scaling的更多相关文章

Tensorflow 线性回归预测房价实例
在本节中将通过一个预测房屋价格的实例来讲解利用线性回归预测房屋价格,以及在tensorflow中如何实现 Tensorflow 线性回归预测房价实例 1.1. 准备工作 1.2. 归一化数据 1.3. ...
在一定[min,max]区间，生成n个不重复的随机数的封装函数
引:生成一个[min,max]区间的一个随机数,随机数生成相关问题参考→链接 var ran=parseInt(Math.random()*(max-min+1)+min); //生成一个[min,m ...
LINQ to SQL Count/Sum/Min/Max/Avg Join
public class Linq { MXSICEDataContext Db = new MXSICEDataContext(); // LINQ to SQL // Count/Sum/Min/ ...
2.10 用最少次数寻找数组中的最大值和最小值[find min max of array]
[本文链接] http://www.cnblogs.com/hellogiser/p/find-min-max-of-array.html [题目] 对于一个由N个整数组成的数组,需要比较多少次才能把 ...
LINQ Count/Sum/Min/Max/Avg
参考:http://www.cnblogs.com/peida/archive/2008/08/11/1263384.html Count/Sum/Min/Max/Avg用于统计数据,比如统计一些数据 ...
【转载】：【C++跨平台系列】解决STL的max()与numeric_limits::max()和VC6 min/max 宏冲突问题
http://www.cnblogs.com/cvbnm/articles/1947743.html 多年以前,Microsoft 幹了一件比 #define N 3 還要蠢的蠢事,那就是在 < ...
LINQ to SQL 语句(3) 之 Count/Sum/Min/Max/Avg
LINQ to SQL 语句(3) 之 Count/Sum/Min/Max/Avg [1] Count/Sum 讲解 [2] Min 讲解 [3] Max 讲解 [4] Average 和 Agg ...
[转]LINQ语句之Select/Distinct和Count/Sum/Min/Max/Avg
在讲述了LINQ,顺便说了一下Where操作,这篇开始我们继续说LINQ语句,目的让大家从语句的角度了解LINQ,LINQ包括LINQ to Objects.LINQ to DataSets.LINQ ...
动态规划——min/max的单调性优化总结
一般形式: $max\{min(ax+by+c,dF(x)+eG(y)+f)\},其中F(x)和G(y)是单调函数.$ 或 $min\{max(ax+by+c,dF(x)+eG(y)+f)\},其中F ...

随机推荐

AWK 思维导图
完整的AWK思维导图文章来源:刘俊涛的博客地址:http://www.cnblogs.com/lovebing
Shell脚本之：case
case ... esac 与其他语言中的 switch ... case 语句类似,是一种多分枝选择结构. case语句的语法 case 值 in 模式1) command1 command2 co ...
数字精确运算BigDecimal经常用法
import java.math.BigDecimal; public class Arith { /** * 因为Java的简单类型不可以精确的对浮点数进行运算,这个工具类提供精 * 确的浮 ...
Jquery.ajax报parseerror Invalid JSON错误的原因和解决方法:不能解析
(默认: 自动判断 (xml 或 html)) 请求失败时调用时间.参数有以下三个:XMLHttpRequest 对象.错误信息.(可选)捕获的错误对象.如果发生了错误,错误信息(第二个参数)除了得到 ...
Chrome 插件 CrxMouse 去除后门优化版
说明 CrxMouse 是一款挺不错的 Chrome 插件.仅仅是据说这个插件会在后台偷偷的上传用户的浏览数据,无论上传的内容是不是涉及隐私数据,总让人认为不放心,可是因为插件本身功能还是挺好用的,所 ...
js学习总结--DOM2兼容处理重复问题
在解决this问题之后,只需要在每次往自定义属性和事件池当中添加事件的时候进行以下判断就好了,具体代码如下: /* bind:处理DOM2级事件绑定的兼容性问题(绑定方法) @parameter: c ...
scrapy之Logging使用
#coding:utf-8 __author__ = 'similarface' ###################### ##Logging的使用 ###################### ...
javax.persistence.PersistenceException: org.hibernate.PersistentObjectException: detached entity passed to persist:
javax.persistence.PersistenceException: org.hibernate.PersistentObjectException: detached entity pas ...
PowerBuilder -- Tab控件
在tab中关闭窗口 Close(tab_1.getparent()) 调整tab中的控件的tab oder 鼠标右键tabpage_1,选择 Tab Order菜单.
java多线程那些事之中的一个
1. Callable 接口获取线程运行状态(get.get(long timeout)),取消线程(cancel(boolean mayinterruptifrunning)).isCance ...

tensorflow knn 预测房价 注意有 Min-Max Scaling

tensorflow knn 预测房价 注意有 Min-Max Scaling的更多相关文章

随机推荐

热门专题

tensorflow knn 预测房价注意有 Min-Max Scaling

tensorflow knn 预测房价注意有 Min-Max Scaling的更多相关文章