


void exit_with_help()
{
    printf(
    "Usage: train [options] training_set_file [model_file]\n"
    "-s type : set type of solver (default 1)\n"
    " for multi-class classification\n"
    " 0 -- L2-regularized logistic regression (primal)\n"
    " 1 -- L2-regularized L2-loss support vector classification (dual)\n"
    " 2 -- L2-regularized L2-loss support vector classification (primal)\n"
    " 3 -- L2-regularized L1-loss support vector classification (dual)\n"
    " 4 -- support vector classification by Crammer and Singer\n"
    " 5 -- L1-regularized L2-loss support vector classification\n"
    " 6 -- L1-regularized logistic regression\n"
    " 7 -- L2-regularized logistic regression (dual)\n"
    " for regression\n"
    " 11 -- L2-regularized L2-loss support vector regression (primal)\n"
    " 12 -- L2-regularized L2-loss support vector regression (dual)\n"
    " 13 -- L2-regularized L1-loss support vector regression (dual)\n"
    "-c cost : set the parameter C (default 1)\n"
    "-p epsilon : set the epsilon in loss function of SVR (default 0.1)\n"
    "-e epsilon : set tolerance of termination criterion\n"
    " -s 0 and 2\n"
    " |f'(w)|_2 <= eps*min(pos,neg)/l*|f'(w0)|_2,\n"
    " where f is the primal function and pos/neg are # of\n"
    " positive/negative data (default 0.01)\n"
    " -s 11\n"
    " |f'(w)|_2 <= eps*|f'(w0)|_2 (default 0.001)\n"
    " -s 1, 3, 4, and 7\n"
    " Dual maximal violation <= eps; similar to libsvm (default 0.1)\n"
    " -s 5 and 6\n"
    " |f'(w)|_1 <= eps*min(pos,neg)/l*|f'(w0)|_1,\n"
    " where f is the primal function (default 0.01)\n"
    " -s 12 and 13\n"
    " |f'(alpha)|_1 <= eps |f'(alpha0)|,\n"
    " where f is the dual function (default 0.1)\n"
    "-B bias : if bias >= 0, instance x becomes [x; bias]; if < 0, no bias term added (default -1)\n"
    "-wi weight: weights adjust the parameter C of different classes (see README for details)\n"
    "-v n: n-fold cross validation mode\n"
    "-q : quiet mode (no outputs)\n"
    );
    exit(1);
}

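The library is implemented mainly in linear.cpp: the train() function trains on the data and produces the corresponding model, and the predict() function predicts the label of unseen input. For detailed usage, see the README file in the package. The training method I used most in my project is coordinate descent, so its main principle and application are described below.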

L2-regularized L1- and L2-loss Support Vector Classification (dual)

The optimization model for L2-regularized L1-loss support vector classification (dual):
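For reference, this dual problem is usually written as follows. The notation is an assumption here: training pairs $(x_i, y_i)$ with $y_i \in \{-1, +1\}$ for $i = 1, \dots, l$, and $e$ the all-ones vector.

```latex
\min_{\alpha}\; f(\alpha) = \frac{1}{2}\,\alpha^{T} Q\,\alpha - e^{T}\alpha
\quad \text{subject to} \quad 0 \le \alpha_i \le C,\; i = 1,\dots,l,
\qquad Q_{ij} = y_i y_j\, x_i^{T} x_j .
```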

The optimization model for L2-regularized L2-loss support vector classification (dual):
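In the same assumed notation ($Q_{ij} = y_i y_j x_i^{T} x_j$, $e$ the all-ones vector), the L2-loss dual is usually stated with an extra diagonal term and no upper bound on $\alpha_i$:

```latex
\min_{\alpha}\; f(\alpha) = \frac{1}{2}\,\alpha^{T} \bar{Q}\,\alpha - e^{T}\alpha
\quad \text{subject to} \quad \alpha_i \ge 0,\; i = 1,\dots,l,
\qquad \bar{Q} = Q + D,\quad D_{ii} = \frac{1}{2C}.
```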


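The solution procedure discussed below is for L1-loss SVC. L1-loss and L2-loss SVC generalize about equally well, while L2-loss is generally somewhat faster to train.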



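Here α is called the learning rate (distinct from the dual variables $\alpha_i$ of the SVC formulation); it determines how large each descent step is. Assuming there is only one training sample, take the partial derivative of J(θ) and substitute it into the expression above: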




Taking the derivative with respect to $d$ gives:
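A worked version of this step, as a sketch under assumed notation: $f(\alpha) = \frac{1}{2}\alpha^{T} Q \alpha - e^{T}\alpha$ is the L1-loss dual objective with $Q_{ij} = y_i y_j x_i^{T} x_j$, $e_i$ is the $i$-th unit vector, and only the coordinate $\alpha_i$ is changed by an amount $d$.

```latex
\begin{aligned}
f(\alpha + d\,e_i) &= \tfrac{1}{2}\,Q_{ii}\,d^{2} + \nabla_i f(\alpha)\,d + \text{const}, \\
\frac{\partial f(\alpha + d\,e_i)}{\partial d} &= Q_{ii}\,d + \nabla_i f(\alpha) = 0
\;\Longrightarrow\; d = -\frac{\nabla_i f(\alpha)}{Q_{ii}}, \\
\alpha_i &\leftarrow \min\!\Big(\max\!\Big(\alpha_i - \frac{\nabla_i f(\alpha)}{Q_{ii}},\; 0\Big),\; C\Big).
\end{aligned}
```

The last line adds the clipping needed to keep $0 \le \alpha_i \le C$. It has the same shape as the statement `alpha[i] = min(max(alpha[i] - G/QD[i], 0.0), C)` in the code excerpt, with `G` in the role of $\nabla_i f(\alpha)$ and `QD[i]` in the role of $Q_{ii}$.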

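Convergence is reached when $d = 0$, that is, when the (projected) gradient with respect to $\alpha_i$ vanishes and $\alpha_i$ has reached its optimal value. As mentioned in the earlier introduction to SVM principles, $w = \sum_i y_i \alpha_i x_i$; substituting this into the gradient gives $\nabla_i f(\alpha) = y_i w^{T} x_i - 1$. Whenever $\alpha_i$ is updated, $w$ has to be updated as well: $w \leftarrow w + (\alpha_i - \bar{\alpha}_i)\, y_i x_i$, where $\alpha_i$ is the value after the update and $\bar{\alpha}_i$ the value before it. The difference $d = \alpha_i - \bar{\alpha}_i$ is obtained by differentiating the one-variable function $f(\alpha + d\,e_i)$ above with respect to $d$, which gives $d = -\nabla_i f(\alpha)/Q_{ii}$ before clipping to $[0, C]$. This may read as a bit disorganized, so the code below lists the core of the procedure to make the flow clearer.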


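             if(fabs(PG) > 1.0e-12)
             {
                 double alpha_old = alpha[i];
                 // Newton step on coordinate i, clipped to the box [0, C]
                 alpha[i] = min(max(alpha[i] - G/QD[i], 0.0), C);
                 d = (alpha[i] - alpha_old)*yi;
                 xi = prob->x[i];
                 // keep w = sum_i y_i*alpha_i*x_i in sync (feature indices are 1-based, -1 ends the list)
                 while (xi->index != -1)
                 {
                     w[xi->index-1] += d*xi->value;
                     xi++;
                 }
             }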
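To see the update in context, here is a minimal self-contained sketch of dual coordinate descent for L1-loss SVC. It is not LIBLINEAR's implementation: the function name `train_l1_svc_dual`, the dense data layout, the fixed cyclic order (no random permutation, no shrinking), and the stopping threshold are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of LIBLINEAR): dual coordinate descent for
// L2-regularized L1-loss SVC on dense data, labels y[i] in {-1, +1}.
std::vector<double> train_l1_svc_dual(const std::vector<std::vector<double>>& X,
                                      const std::vector<int>& y,
                                      double C, int max_iter)
{
    assert(X.size() == y.size());
    const std::size_t l = X.size();
    const std::size_t n = l ? X[0].size() : 0;
    std::vector<double> alpha(l, 0.0), w(n, 0.0), QD(l, 0.0);

    // QD[i] = x_i^T x_i, the diagonal of Q (L1 loss adds no extra term).
    for (std::size_t i = 0; i < l; ++i)
        for (std::size_t j = 0; j < n; ++j)
            QD[i] += X[i][j] * X[i][j];

    for (int iter = 0; iter < max_iter; ++iter) {
        double max_violation = 0.0;
        for (std::size_t i = 0; i < l; ++i) {
            if (QD[i] == 0.0) continue;  // all-zero instance: skip

            // Gradient of the dual objective: G = y_i * w^T x_i - 1.
            double G = 0.0;
            for (std::size_t j = 0; j < n; ++j) G += w[j] * X[i][j];
            G = y[i] * G - 1.0;

            // Projected gradient: ignore directions that leave [0, C].
            double PG = G;
            if (alpha[i] == 0.0)    PG = std::min(G, 0.0);
            else if (alpha[i] == C) PG = std::max(G, 0.0);
            max_violation = std::max(max_violation, std::fabs(PG));

            if (std::fabs(PG) > 1.0e-12) {
                const double alpha_old = alpha[i];
                alpha[i] = std::min(std::max(alpha[i] - G / QD[i], 0.0), C);
                const double d = (alpha[i] - alpha_old) * y[i];
                for (std::size_t j = 0; j < n; ++j) w[j] += d * X[i][j];
            }
        }
        if (max_violation < 1.0e-6) break;  // assumed stopping rule
    }
    return w;
}
```

On the one-dimensional toy set x = 2 (label +1) and x = -2 (label -1) with C = 1, a single update of the first dual variable already yields w = 0.5, which places both points exactly on the margin.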



double predict_values(const struct model *model_, const struct feature_node *x, double *dec_values)
{
    int idx;
    int n;
    if(model_->bias>=0)
        n=model_->nr_feature+1;
    else
        n=model_->nr_feature;
    double *w=model_->w;
    int nr_class=model_->nr_class;
    int i;
    int nr_w;
    // two-class models (except Crammer-Singer) store a single weight vector
    if(nr_class==2 && model_->param.solver_type != MCSVM_CS)
        nr_w = 1;
    else
        nr_w = nr_class;
    const feature_node *lx=x;
    for(i=0;i<nr_w;i++)
        dec_values[i] = 0;
    for(; (idx=lx->index)!=-1; lx++)
    {
        // the dimension of testing data may exceed that of training
        if(idx<=n)
            for(i=0;i<nr_w;i++)
                dec_values[i] += w[(idx-1)*nr_w+i]*lx->value;
    }
    if(nr_class==2)
    {
        if(model_->param.solver_type == L2R_L2LOSS_SVR ||
           model_->param.solver_type == L2R_L1LOSS_SVR_DUAL ||
           model_->param.solver_type == L2R_L2LOSS_SVR_DUAL)
            return dec_values[0];
        else
            return (dec_values[0]>0)?model_->label[0]:model_->label[1];
    }
    else
    {
        int dec_max_idx = 0;
        for(i=1;i<nr_class;i++)
        {
            if(dec_values[i] > dec_values[dec_max_idx])
                dec_max_idx = i;
        }
        return model_->label[dec_max_idx];
    }
}
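The two-class path of the prediction routine boils down to a sparse dot product followed by a sign test. The sketch below isolates that idea. `FeatureNode`, `decision_value`, and `predict_label` are hypothetical stand-ins written for this example, not LIBLINEAR's own types or functions.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a sparse feature entry: 1-based feature index,
// and index == -1 marks the end of the instance.
struct FeatureNode {
    int index;
    double value;
};

// Decision value w^T x for a two-class model stored as one weight vector.
double decision_value(const std::vector<double>& w, const FeatureNode* x)
{
    assert(x != nullptr);
    const int n = static_cast<int>(w.size());
    double dec = 0.0;
    for (; x->index != -1; ++x) {
        // Test data may contain features never seen in training; they have
        // no weight, so they are skipped.
        if (x->index >= 1 && x->index <= n)
            dec += w[x->index - 1] * x->value;
    }
    return dec;
}

// Map the sign of the decision value to one of the two labels.
int predict_label(const std::vector<double>& w, const FeatureNode* x,
                  int positive_label, int negative_label)
{
    return decision_value(w, x) > 0 ? positive_label : negative_label;
}
```

For example, with weights {0.5, -1.0} and an instance holding features 1:2.0, 2:0.5, and an out-of-range 5:100.0, the out-of-range feature is ignored and the decision value is 0.5*2.0 - 1.0*0.5 = 0.5, so the positive label is returned.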

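To tune the penalty parameter C, you can use the grid.py tool that ships with LIBSVM. It uses cross validation to pick the parameter value with the highest prediction accuracy. When two parameters are tuned together (c and g of the RBF kernel), it can use gnuplot to draw a contour plot, which gives an intuitive picture of the whole search. You can also modify grid.py to tune whichever parameters you need; it is a very handy little tool.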

Usage: grid.py [grid_options] [svm_options] dataset

grid_options :
-log2c {begin,end,step | "null"} : set the range of c (default -5,15,2)
begin,end,step -- c_range = 2^{begin,...,begin+k*step,...,end}
"null" -- do not grid with c
-log2g {begin,end,step | "null"} : set the range of g (default 3,-15,-2)
begin,end,step -- g_range = 2^{begin,...,begin+k*step,...,end}
"null" -- do not grid with g
-v n : n-fold cross validation (default 5)
-svmtrain pathname : set svm executable path and name
-gnuplot {pathname | "null"} :
pathname -- set gnuplot executable path and name
"null" -- do not plot
-out {pathname | "null"} : (default dataset.out)
pathname -- set output file path and name
"null" -- do not output file
-png pathname : set graphic output file path and name (default dataset.png)
-resume [pathname] : resume the grid task using an existing output file (default pathname is dataset.out)
This is experimental. Try this option only if some parameters have been checked for the SAME data.
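The begin,end,step notation describes a grid of exponents of 2. As a rough illustration of how such a range expands into candidate values, here is a small sketch. `log2_range` is a hypothetical helper written for this example, not code taken from grid.py.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: expands "begin,end,step" into the values
// 2^begin, 2^(begin+step), ..., stopping before the exponent passes `end`.
// A negative step walks downward, as in a range like 3,-15,-2.
std::vector<double> log2_range(double begin, double end, double step)
{
    assert(step != 0.0);
    std::vector<double> values;
    for (int k = 0; ; ++k) {
        const double exponent = begin + k * step;
        // Stop once the exponent has moved past `end` in the step direction.
        if ((step > 0 && exponent > end) || (step < 0 && exponent < end))
            break;
        values.push_back(std::exp2(exponent));
    }
    return values;
}
```

With -5,15,2 this produces 11 candidates from 2^-5 = 0.03125 up to 2^15 = 32768; each candidate would then be scored by n-fold cross validation and the best one kept.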


