What train_test_split in sklearn.model_selection does

The train_test_split function splits a dataset into training data and test data.

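train_test_split is commonly used in model validation workflows such as cross-validation. It randomly selects train_data and test_data from the samples in a given proportion, and is called in the following form: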

X_train, X_test, y_train, y_test = train_test_split(train_data, train_target, test_size=0.4, random_state=0)

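Parameters:
train_data: the feature set of the samples to be split
train_target: the labels (targets) of the samples to be split
test_size: the proportion of samples assigned to the test set (a float between 0 and 1); if an integer is given, it is the absolute number of test samples
random_state: the seed for the random number generator.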
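Random seed: in effect, an identifier for one particular sequence of random numbers. It guarantees that you get the same random numbers when an experiment has to be repeated. For example, if you pass 1 every time and keep the other arguments unchanged, you get the same split every time. The same holds for any fixed integer, including 0. Only when random_state is left unset (the default, None) does the split differ from run to run.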
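The behaviour of random_state described above can be checked with a small sketch. The toy arrays X and y and the variable names are illustrative assumptions, not part of the original post.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data (illustrative): 10 samples with 2 features each.
X = np.arange(20).reshape((10, 2))
y = list(range(10))

# A fixed integer seed, 0 included, gives the same split on every call.
_, _, _, y_test_a = train_test_split(X, y, test_size=0.3, random_state=0)
_, _, _, y_test_b = train_test_split(X, y, test_size=0.3, random_state=0)
same_with_seed_zero = (y_test_a == y_test_b)
print(same_with_seed_zero)

# With random_state left at its default (None), the shuffle is not seeded,
# so repeated calls will usually return different splits.
_, _, _, y_test_c = train_test_split(X, y, test_size=0.3)
_, _, _, y_test_d = train_test_split(X, y, test_size=0.3)
print(y_test_c, y_test_d)
```

The first print shows True. The last two lists will usually differ, although they can coincide by chance on such a small dataset.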

    >>> import numpy as np
    >>> from sklearn.model_selection import train_test_split
    >>> X, y = np.arange(10).reshape((5, 2)), range(5)
    >>> X
    array([[0, 1],
           [2, 3],
           [4, 5],
           [6, 7],
           [8, 9]])
    >>> list(y)
    [0, 1, 2, 3, 4]
    >>> X_train, X_test, y_train, y_test = train_test_split(
    ...     X, y, test_size=0.33, random_state=42)
    ...
    >>> X_train
    array([[4, 5],
           [0, 1],
           [6, 7]])
    >>> y_train
    [2, 0, 3]
    >>> X_test
    array([[2, 3],
           [8, 9]])
    >>> y_test
    [1, 4]
    >>> train_test_split(y, shuffle=False)
    [[0, 1, 2], [3, 4]]
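As a complement to the example above, here is a minimal sketch of how test_size behaves when given as a float and as an integer. The 10-sample toy data and the variable names are assumptions made for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data (illustrative): 10 samples with 2 features each.
X = np.arange(20).reshape((10, 2))
y = list(range(10))

# Float test_size: fraction of the samples that goes to the test set.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.4, random_state=0)
print(len(y_tr), len(y_te))  # 6 4

# Integer test_size: absolute number of test samples.
X_tr2, X_te2, y_tr2, y_te2 = train_test_split(X, y, test_size=3, random_state=0)
print(len(y_tr2), len(y_te2))  # 7 3
```

When the fraction does not divide the sample count evenly, the test set size is rounded up, which is why test_size=0.33 on 5 samples yields 2 test samples in the example above.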
