LightGBM's sklearn and native interfaces: detailed parameter reference and tuning tips

class lightgbm.LGBMClassifier(boosting_type='gbdt', num_leaves=31, max_depth=-1, learning_rate=0.1, n_estimators=10, max_bin=255, subsample_for_bin=200000, objective=None, min_split_gain=0.0, min_child_weight=0.001, min_child_samples=20, subsample=1.0, subsample_freq=1, colsample_bytree=1.0, reg_alpha=0.0, reg_lambda=0.0, random_state=None, n_jobs=-1, silent=True, **kwargs)

boosting_type	(string, optional (default="gbdt"))	"gbdt": Gradient Boosting Decision Tree; "dart": Dropouts meet Multiple Additive Regression Trees; "goss": Gradient-based One-Side Sampling; "rf": Random Forest
num_leaves	(int, optional (default=31))	Maximum number of leaves per base learner	Should satisfy num_leaves <= 2^max_depth
max_depth	(int, optional (default=-1))	Maximum depth of each base learner; -1 means no limit	When the model overfits, lower max_depth first
learning_rate	(float, optional (default=0.1))	Boosting learning rate
n_estimators	(int, optional (default=10))	Number of base learners
max_bin	(int, optional (default=255))	Maximum number of bins that feature values are bucketed into; presumably the k of the histogram algorithm
subsample_for_bin	(int, optional (default=200000))	Number of samples used to construct the bins
objective	(string, callable or None, optional (default=None))	Default: 'regression' for LGBMRegressor, 'binary' or 'multiclass' for LGBMClassifier, 'lambdarank' for LGBMRanker
min_split_gain	(float, optional (default=0.))	Minimum loss reduction required to make a further split on a leaf node
min_child_weight	(float, optional (default=1e-3))	Minimum sum of instance weight (hessian) needed in a child (leaf)
min_child_samples	(int, optional (default=20))	Minimum number of samples in a leaf
subsample	(float, optional (default=1.))	Fraction of the training data sampled during training
subsample_freq	(int, optional (default=1))	Frequency of subsampling; <=0 disables it
colsample_bytree	(float, optional (default=1.))	Subsample ratio of columns when constructing each tree
reg_alpha	(float, optional (default=0.))	L1 regularization term on weights
reg_lambda	(float, optional (default=0.))	L2 regularization term on weights
random_state	(int or None, optional (default=None))	Random number seed
silent	(bool, optional (default=True))	Whether to print messages while running boosting
n_jobs	(int, optional (default=-1))	Number of parallel threads
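
The constructor parameters above can be collected in a plain dictionary and sanity-checked before training. The following is a minimal sketch rather than anything from the original post: `leaves_fit_depth` and `build_classifier` are hypothetical helper names, and the parameter values are illustrative only. It encodes the rule of thumb from the table that num_leaves should not exceed 2^max_depth.

```python
def leaves_fit_depth(num_leaves, max_depth):
    """Return True if num_leaves respects the num_leaves <= 2**max_depth rule."""
    if max_depth <= 0:  # -1 (or any non-positive value) means no depth limit
        return True
    return num_leaves <= 2 ** max_depth


# Illustrative values only, chosen to be more conservative than the defaults.
params = {
    "boosting_type": "gbdt",
    "num_leaves": 15,
    "max_depth": 4,
    "learning_rate": 0.05,
    "n_estimators": 200,
    "min_child_samples": 20,
    "subsample": 0.8,
    "subsample_freq": 1,
    "colsample_bytree": 0.8,
    "reg_alpha": 0.0,
    "reg_lambda": 1.0,
    "random_state": 42,
}


def build_classifier(params):
    """Create an LGBMClassifier after checking the tree-shape rule."""
    # Imported here so the helper above also works without LightGBM installed.
    import lightgbm as lgb

    if not leaves_fit_depth(params["num_leaves"], params["max_depth"]):
        raise ValueError("num_leaves exceeds 2**max_depth")
    return lgb.LGBMClassifier(**params)
```

Calling `build_classifier(params)` returns an unfitted estimator; lowering `max_depth` without also lowering `num_leaves` makes the check fail early instead of silently training an inconsistent configuration.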

######################################################################################################

The table below lists the parameters that can be adjusted for each of three goals: faster speed, better accuracy, and dealing with over-fitting:

###########################################################################################

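Class attributes:

n_features_	int	Number of features
classes_	array of shape = [n_classes]	Array of class labels (classification only)
n_classes_	int	Number of classes (classification only)
best_score_	dict or None	Best score of the fitted model
best_iteration_	int or None	Best iteration of the fitted model, if early_stopping_rounds was specified
objective_	string or callable	The concrete objective used while fitting the model
booster_	Booster	The underlying Booster of this model
evals_result_	dict or None	Evaluation results, if early_stopping_rounds was specified
feature_importances_	array of shape = [n_features]	Feature importances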
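
As an illustration of how these attributes might be read after fitting, here is a small sketch. `rank_features` and `summarize` are hypothetical helper names, `model` is assumed to be an already fitted LGBMClassifier, and `feature_names` is assumed to be a list supplied by the caller in column order.

```python
def rank_features(feature_names, importances, top_k=5):
    """Pair names with importances and return the top_k, largest first."""
    pairs = sorted(zip(feature_names, importances), key=lambda p: p[1], reverse=True)
    return pairs[:top_k]


def summarize(model, feature_names):
    """Collect a few fitted attributes of an LGBMClassifier into a dict."""
    return {
        "n_features": model.n_features_,
        "classes": list(model.classes_),
        "best_iteration": model.best_iteration_,  # None without early stopping
        "top_features": rank_features(feature_names, model.feature_importances_),
    }
```

`rank_features` works on any two equal-length sequences, so it can be tried without a trained model.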

###########################################################################################

Class methods:

fit(X, y, sample_weight=None, init_score=None, eval_set=None, eval_names=None, eval_sample_weight=None, eval_init_score=None, eval_metric='logloss', early_stopping_rounds=None, verbose=True, feature_name='auto', categorical_feature='auto', callbacks=None)

X	array-like or sparse matrix of shape = [n_samples, n_features]	Feature matrix
y	array-like of shape = [n_samples]	The target values (class labels in classification, real numbers in regression)
sample_weight	array-like of shape = [n_samples] or None, optional (default=None)	Sample weights; can be built with np.where
init_score	array-like of shape = [n_samples] or None, optional (default=None)	Init score of training data
group	array-like of shape = [n_samples] or None, optional (default=None)	Group data of training data
eval_set	list or None, optional (default=None)	A list of (X, y) tuple pairs to use as validation sets for early stopping
eval_names	list of strings or None, optional (default=None)	Names of eval_set
eval_sample_weight	list of arrays or None, optional (default=None)	Weights of eval data
eval_init_score	list of arrays or None, optional (default=None)	Init score of eval data
eval_group	list of arrays or None, optional (default=None)	Group data of eval data
eval_metric	string, list of strings, callable or None, optional (default="logloss")	"mae", "mse", ...
early_stopping_rounds	int or None, optional (default=None)	Stop iterating once the validation score has not improved for this many rounds
verbose	bool, optional (default=True)
feature_name	list of strings or 'auto', optional (default="auto")	If 'auto' and data is a pandas DataFrame, the column names are used
categorical_feature	list of strings or int, or 'auto', optional (default="auto")	If 'auto' and data is a pandas DataFrame, pandas categorical columns are used
callbacks	list of callback functions or None, optional (default=None)
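
The following sketch shows one way the fit arguments above might be combined: per-sample weights, a validation set, and early stopping. It is not taken from the original post. `make_sample_weight` and `fit_with_validation` are hypothetical names, the weight of 5.0 and the hyperparameters are arbitrary, and early stopping is requested through the `lgb.early_stopping` callback (passed via `callbacks`) because newer LightGBM releases no longer accept `early_stopping_rounds` in `fit()`.

```python
def make_sample_weight(y, pos_weight=5.0):
    """Give positive samples (label 1) a larger weight than the rest.

    For a NumPy array this matches np.where(y == 1, pos_weight, 1.0).
    """
    return [pos_weight if label == 1 else 1.0 for label in y]


def fit_with_validation(X_train, y_train, X_valid, y_valid, stopping_rounds=20):
    """Fit a binary classifier and stop once the validation loss stalls."""
    import lightgbm as lgb  # imported lazily; requires LightGBM to be installed

    model = lgb.LGBMClassifier(n_estimators=500, learning_rate=0.05)
    model.fit(
        X_train,
        y_train,
        sample_weight=make_sample_weight(y_train),
        eval_set=[(X_valid, y_valid)],
        eval_names=["valid"],
        eval_metric="binary_logloss",
        # Callback form of early stopping (see the note above the block).
        callbacks=[lgb.early_stopping(stopping_rounds)],
    )
    return model
```

After fitting this way, `best_iteration_` holds the round at which the validation loss was lowest.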





###############################################################################################

predict_proba(X, raw_score=False, num_iteration=0)

X	array-like or sparse matrix of shape = [n_samples, n_features]	Input features matrix
raw_score	bool, optional (default=False)	Whether to predict raw scores
num_iteration	int, optional (default=0)	Limit number of iterations in the prediction; defaults to 0 (use all trees).
Returns	predicted_probability	The predicted probability for each class for each sample.
Return type	array-like of shape = [n_samples, n_classes]
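
Since predict_proba returns one column per class, turning probabilities into labels is a matter of picking a column per row. The sketch below is illustrative only: `labels_from_proba` and `binary_labels` are hypothetical helpers, `proba` stands for the output of `model.predict_proba(X)`, and `classes` for `model.classes_`.

```python
def labels_from_proba(proba, classes):
    """Pick, for every row, the class whose probability is largest."""
    labels = []
    for row in proba:
        best = max(range(len(row)), key=lambda j: row[j])
        labels.append(classes[best])
    return labels


def binary_labels(proba, classes, threshold=0.5):
    """Binary case with a custom cut-off on the positive-class column."""
    return [classes[1] if row[1] >= threshold else classes[0] for row in proba]
```

A lower `threshold` in `binary_labels` trades precision for recall, which can be useful when the positive class is rare.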

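Parameters for handling class imbalance:

1. A simple approach is to set is_unbalance=True or to set scale_pos_weight; only one of the two may be used. With is_unbalance=True, LightGBM reweights the two classes automatically by the ratio of their sample counts, so that the minority class counts for more. This parameter only applies to binary classification.

2. Use a custom evaluation function: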

https://cloud.tencent.com/developer/article/1357671
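
Both options can be sketched in a few lines. The helper names below (`pos_weight_from_labels`, `imbalance_kwargs`, `binary_error_rate`) are hypothetical. Using negatives / positives for scale_pos_weight is a common heuristic rather than something the original post prescribes. The custom metric assumes a binary task with a built-in objective, so that `y_pred` holds positive-class probabilities.

```python
def pos_weight_from_labels(y):
    """Return (#negative / #positive), a common choice for scale_pos_weight."""
    n_pos = sum(1 for label in y if label == 1)
    n_neg = sum(1 for label in y if label != 1)
    if n_pos == 0:
        raise ValueError("no positive samples in y")
    return n_neg / n_pos


def imbalance_kwargs(y, automatic=False):
    """Build the extra keyword arguments; only one of the two options is set."""
    if automatic:
        return {"is_unbalance": True}
    return {"scale_pos_weight": pos_weight_from_labels(y)}


def binary_error_rate(y_true, y_pred):
    """Custom metric: share of samples misclassified at a 0.5 cut-off.

    Returns the (name, value, is_higher_better) triple that the sklearn
    interface expects from a callable eval_metric.
    """
    wrong = sum(1 for t, p in zip(y_true, y_pred) if (p >= 0.5) != (t == 1))
    return "error_rate", wrong / len(y_true), False
```

The keyword arguments would be forwarded through the constructor's `**kwargs`, for example `LGBMClassifier(objective="binary", **imbalance_kwargs(y_train))`, and the metric would be passed as `eval_metric=binary_error_rate` in `fit()`.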

Summary of how LightGBM works:

http://www.cnblogs.com/gczr/p/9024730.html

Translations of the paper: https://blog.csdn.net/u010242233/article/details/79769950, https://zhuanlan.zhihu.com/p/42939089

How categorical features are handled: https://blog.csdn.net/anshuai_aw1/article/details/83275299

Comparison of CatBoost, LightGBM and XGBoost:

https://blog.csdn.net/LrS62520kV/article/details/79620615
