sklearn: Standardizing Labels with LabelEncoder

Python machine learning: breast cancer cell mining (video course recorded by the author)

https://study.163.com/course/introduction.htm?courseId=1005269003&utm_campaign=commission&utm_source=cp-400000000398149&utm_medium=share

sklearn.preprocessing.LabelEncoder(): standardizes labels by encoding them as integers from 0 to n_classes-1
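The script in this post covers feature scaling rather than LabelEncoder itself, so here is a minimal sketch of the encoder. The label values are made up for illustration only.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical class labels, chosen only for illustration
labels = ['benign', 'malignant', 'benign', 'malignant', 'benign']

encoder = LabelEncoder()
# fit_transform sorts the distinct labels and numbers them from 0
encoded = encoder.fit_transform(labels)

print('classes:', list(encoder.classes_))
print('encoded:', list(encoded))

# inverse_transform maps the integers back to the original labels
decoded = encoder.inverse_transform(encoded)
print('decoded:', list(decoded))
```

LabelEncoder is meant for target labels; the integers carry no ordering beyond the sorted order of the class names.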

StandardScaler == each feature scaled to mean=0 and variance=1

MinMaxScaler == each feature scaled into the 0 to 1 range

Normalizer == each feature vector (row) scaled to Euclidean length=1
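The first two definitions can be checked by hand. A minimal NumPy sketch, assuming the population standard deviation (NumPy's default, which scikit-learn's scalers also use) and reusing the matrix from the script in this post:

```python
import numpy as np

# Same matrix as in the script in this post
data = np.array([[2.2, 5.9, -1.8], [5.4, -3.2, -5.1], [-1.9, 4.2, 3.2]])

# Standardization: subtract the column mean, divide by the column standard deviation
standardized = (data - data.mean(axis=0)) / data.std(axis=0)

# Min-max scaling: map each column's minimum to 0 and its maximum to 1
col_min = data.min(axis=0)
col_max = data.max(axis=0)
minmax = (data - col_min) / (col_max - col_min)

print('standardized:', standardized)
print('min-max scaled:', minmax)
```

Both operations work column by column (axis=0), that is, per feature.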

Normalization

Brings the values of each feature vector onto a common scale.

L1 (least absolute deviations): the absolute values in each row sum to 1; it is less sensitive to outliers

L2 (least squares): the squares in each row sum to 1; it takes outliers into account during training
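The two norms can also be computed directly. A minimal NumPy sketch of row-wise L1 and L2 normalization, again using the matrix from the script in this post:

```python
import numpy as np

# Same matrix as in the script in this post
data = np.array([[2.2, 5.9, -1.8], [5.4, -3.2, -5.1], [-1.9, 4.2, 3.2]])

# L1: divide each row by the sum of its absolute values
l1_norms = np.abs(data).sum(axis=1, keepdims=True)
data_l1 = data / l1_norms

# L2: divide each row by its Euclidean length
l2_norms = np.sqrt((data ** 2).sum(axis=1, keepdims=True))
data_l2 = data / l2_norms

print('row sums of |x| after L1:', np.abs(data_l1).sum(axis=1))
print('row sums of x^2 after L2:', (data_l2 ** 2).sum(axis=1))
```

Unlike the scalers, normalization works row by row (axis=1), so each sample is rescaled independently of the others.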

# -*- coding: utf-8 -*-

"""

Created on Sat Apr 14 09:09:41 2018

@author:Toby

StandardScaler == each feature scaled to mean=0 and variance=1

MinMaxScaler == each feature scaled into the 0 to 1 range

Normalizer == each feature vector (row) scaled to Euclidean length=1

Normalization

Brings the values of each feature vector onto a common scale.

L1 (least absolute deviations): the absolute values in each row sum to 1; it is less sensitive to outliers

L2 (least squares): the squares in each row sum to 1; it takes outliers into account during training

"""

from sklearn import preprocessing

import numpy as np

data=np.array([[2.2,5.9,-1.8],[5.4,-3.2,-5.1],[-1.9,4.2,3.2]])

#binarization: values above the threshold become 1, the rest become 0

bindata=preprocessing.Binarizer(threshold=1.5).transform(data)

print('Binarized data:',bindata)

#mean removal

print('Mean(before)=',data.mean(axis=0))

print('standard deviation(before)=',data.std(axis=0))

#features with a mean=0 and variance=1

scaled_data=preprocessing.scale(data)

print('Mean(after)=',scaled_data.mean(axis=0))

print('standard deviation(after)=',scaled_data.std(axis=0))

print('scaled_data:',scaled_data)

'''

scaled_data: [[ 0.10040991  0.91127074 -0.16607709]

 [ 1.171449   -1.39221918 -1.1332319 ]

 [-1.27185891  0.48094844  1.29930899]]

'''

#features in a 0 to 1 range

minmax_scaler=preprocessing.MinMaxScaler(feature_range=(0,1))

data_minmax=minmax_scaler.fit_transform(data)

print('MinMaxScaler applied on the data:',data_minmax)

'''

MinMaxScaler applied on the data: [[ 0.56164384  1.          0.39759036]

 [ 1.          0.          0.        ]

 [ 0.          0.81318681  1.        ]]

'''

#normalization: scale each row to unit L1 or L2 norm

data_l1=preprocessing.normalize(data,norm='l1')

data_l2=preprocessing.normalize(data,norm='l2')

print('l1-normalized data:',data_l1)

'''

[[ 0.22222222  0.5959596  -0.18181818]

 [ 0.39416058 -0.23357664 -0.37226277]

 [-0.20430108  0.4516129   0.34408602]]

'''

print('l2-normalized data:',data_l2)

'''

[[ 0.3359268   0.90089461 -0.2748492 ]

 [ 0.6676851  -0.39566524 -0.63059148]

 [-0.33858465  0.74845029  0.57024784]]

'''
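preprocessing.scale standardizes in a single call but does not keep the column means and deviations. When the same scaling has to be applied to data seen later, the StandardScaler class mentioned in the notes can be fitted once and reused. A minimal sketch; new_sample is a made-up row, not part of the original script.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Same matrix as in the script above
data = np.array([[2.2, 5.9, -1.8], [5.4, -3.2, -5.1], [-1.9, 4.2, 3.2]])

scaler = StandardScaler()
scaled = scaler.fit_transform(data)      # learns column means/deviations, then scales

# Hypothetical new row, scaled with the statistics learned from data
new_sample = np.array([[3.0, 1.0, 0.0]])
new_scaled = scaler.transform(new_sample)

# inverse_transform undoes the scaling
restored = scaler.inverse_transform(scaled)

print('learned means:', scaler.mean_)
print('new sample scaled:', new_scaled)
print('restored equals data:', np.allclose(restored, data))
```

Fitting on the training data only and then calling transform on later data keeps both on the same scale.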

https://study.163.com/provider/400000000398149/index.htm?share=2&shareId=400000000398149 (Follow the author's homepage for Python video tutorials and a large collection of free Python articles)

QQ:231469242
