Comparing randomized search and grid search for hyperparameter estimation

Compare randomized search and grid search for optimizing hyperparameters of a random forest. All parameters that influence the learning are searched simultaneously (except for the number of estimators, which poses a time / quality tradeoff).

The randomized search and the grid search explore exactly the same space of parameters. The result in parameter settings is quite similar, while the run time for randomized search is drastically lower.

The performance is slightly worse for the randomized search, though this is most likely a noise effect and would not carry over to a held-out test set.

Note that in practice, one would not search over this many different parameters simultaneously using grid search, but pick only the ones deemed most important.

Python source code: randomized_search.py

print(__doc__)

import numpy as np

from time import time
from operator import itemgetter
from scipy.stats import randint as sp_randint from sklearn.grid_search import GridSearchCV, RandomizedSearchCV
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier # get some data
iris = load_digits()
X, y = iris.data, iris.target # build a classifier
clf = RandomForestClassifier(n_estimators=20) # Utility function to report best scores
def report(grid_scores, n_top=3):
top_scores = sorted(grid_scores, key=itemgetter(1), reverse=True)[:n_top]
for i, score in enumerate(top_scores):
print("Model with rank: {0}".format(i + 1))
print("Mean validation score: {0:.3f} (std: {1:.3f})".format(
score.mean_validation_score,
np.std(score.cv_validation_scores)))
print("Parameters: {0}".format(score.parameters))
print("") # specify parameters and distributions to sample from
param_dist = {"max_depth": [3, None],
"max_features": sp_randint(1, 11),
"min_samples_split": sp_randint(1, 11),
"min_samples_leaf": sp_randint(1, 11),
"bootstrap": [True, False],
"criterion": ["gini", "entropy"]} # run randomized search
n_iter_search = 20
random_search = RandomizedSearchCV(clf, param_distributions=param_dist,
n_iter=n_iter_search) start = time()
random_search.fit(X, y)
print("RandomizedSearchCV took %.2f seconds for %d candidates"
" parameter settings." % ((time() - start), n_iter_search))
report(random_search.grid_scores_) # use a full grid over all parameters
param_grid = {"max_depth": [3, None],
"max_features": [1, 3, 10],
"min_samples_split": [1, 3, 10],
"min_samples_leaf": [1, 3, 10],
"bootstrap": [True, False],
"criterion": ["gini", "entropy"]} # run grid search
grid_search = GridSearchCV(clf, param_grid=param_grid)
start = time()
grid_search.fit(X, y) print("GridSearchCV took %.2f seconds for %d candidate parameter settings."
% (time() - start, len(grid_search.grid_scores_)))
report(grid_search.grid_scores_)

Comparing randomized search and grid search for hyperparameter estimation的更多相关文章

  1. 3.2. Grid Search: Searching for estimator parameters

    3.2. Grid Search: Searching for estimator parameters Parameters that are not directly learnt within ...

  2. scikit-learn:3.2. Grid Search: Searching for estimator parameters

    參考:http://scikit-learn.org/stable/modules/grid_search.html GridSearchCV通过(蛮力)搜索參数空间(參数的全部可能组合).寻找最好的 ...

  3. Grid search in the tidyverse

    @drsimonj here to share a tidyverse method of grid search for optimizing a model's hyperparameters. ...

  4. How to Grid Search Hyperparameters for Deep Learning Models in Python With Keras

    Hyperparameter optimization is a big part of deep learning. The reason is that neural networks are n ...

  5. Grid Search学习

    转自:https://www.cnblogs.com/ysugyl/p/8711205.html Grid Search:一种调参手段:穷举搜索:在所有候选的参数选择中,通过循环遍历,尝试每一种可能性 ...

  6. grid search 超参数寻优

    http://scikit-learn.org/stable/modules/grid_search.html 1. 超参数寻优方法 gridsearchCV 和  RandomizedSearchC ...

  7. [转载]Grid Search

    [转载]Grid Search 初学机器学习,之前的模型都是手动调参的,效果一般.同学和我说他用了一个叫grid search的方法.可以实现自动调参,顿时感觉非常高级.吃饭的时候想调参的话最差不过也 ...

  8. 【起航计划 032】2015 起航计划 Android APIDemo的魔鬼步伐 31 App->Search->Invoke Search 搜索功能 Search Dialog SearchView SearchRecentSuggestions

    Search (搜索)是Android平台的一个核心功能之一,用户可以在手机搜索在线的或是本地的信息.Android平台为所有需要提供搜索或是查询功能的应用提 供了一个统一的Search Framew ...

  9. grid search

    sklearn.metrics.make_scorer(score_func, greater_is_better=True, needs_proba=False, needs_threshold=F ...

随机推荐

  1. delphi tidhttp 超时的解决方案

    现在delphi都发布到xe10.1了,tidhttp还有缺陷,那就是超时设置在没有网络或者连不上服务器的时候是无效的,不管你设置为多少都要10-20秒.connectTimeout和readTime ...

  2. linux系统启动

    在本文中,我们按电源按钮简要叙述,以便能够登录到系统,在此期间,系统和计算机硬件是如何一起工作.既作为自己整理知识的摘要,有可能linux0绍,高手请略过. 一般来说linux的启动能够分成三个阶段: ...

  3. android 02 登录

    activity_main.xml <RelativeLayout xmlns:android="http://schemas.android.com/apk/res/android& ...

  4. 升级 node 版本

    npm install -g n n stablen v0.10.26 n 0.10.26

  5. spring下载dist.zip

    http://repo.springsource.org/libs-release-local/org/springframework/spring/ 选择对应版本下载即可

  6. PHP 5.6启动失败failed to open configuration file '/usr/local/php/etc/php-fpm.conf'

    PHP编译安装完毕,启动失败,提示 [-Jun- ::] ERROR: failed to open configuration ) [-Jun- ::] ERROR: failed to load ...

  7. <legend>标签

    健康信息身高: 体重: 如果表单周围没有边框,说明您的浏览器太老了. <!DOCTYPE HTML> <html> <body> <form> < ...

  8. ASP.NET Boilerplate 工作单元

    从上往下说起,框架使用castle拦截器,拦截实现了IApplication.IRepository接口的所有方法,和使用了UnitOfWork 特性的方法,代码如下 internal class U ...

  9. jni使用

    版权声明:本文为博主原创文章,未经博主允许不得转载.   目录(?)[-] 简介 详解 JNI 元素 JNI函数实战 AndroidmkApplicationmk Androidmk Applicat ...

  10. Kettle 实现mysql数据库不同表之间数据同步——实验过程

    下面是试验的主要步骤: 在上一篇文章中LZ已经介绍了,实验的环境和实验目的. 在本篇文章中主要介绍侧重于对Kettle ETL的相应使用方法, 在这里LZ需要说明一下,LZ成为了避免涉及索引和表连接等 ...