A Concise pycaffe Reference

by ChrisZZ, imzhuo@foxmail.com

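2018-01-18 19:00:56

Notes

The Python interface of caffe has no official documentation. With pytorch you can look up the usage of any function, whereas with pycaffe you have to read the source code yourself. So I wrote this very rough document by hand to get by.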

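1. It is compiled by hand mainly from `__init__.py` and `_caffe.cpp` under caffe_root/python/caffe.

2. It also draws on caffe_root/python/caffe/proto/caffe_pb2.py (this file is generated from caffe.proto and is very large).

3. And on the unit test files under caffe_root/python/caffe/test.

4. The other files, such as io.py, net_spec.py, pycaffe.py, draw.py, classifier.py and detector.py, are either important, representative basic wrappers around (and uses of) the many functions in caffe_pb2.py, or (possibly) commonly used utilities.

5. Older versions (e.g. caffe-fast-rcnn) and newer versions (e.g. caffe-BVLC) of pycaffe differ in what they support, and the newer ones support more. This document only lists the functions found in the caffe-fast-rcnn code; for functions in newer versions of pycaffe, check `_caffe.cpp` yourself and compare.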

As it turns out, you can simply read the documentation generated from the code and its docstrings:

# In an IPython session

import caffe

help(caffe.Net)

help(caffe.proto.caffe_pb2)

The entries below follow this template:

`className` class

className.funcName(param1, param2)

@param1 type, purpose

@param2 type, purpose

@description what the function does

@return type

`caffe.Net` class

__init__(prototxt, phase)

@param prototxt: string. Path to the prototxt file.

@param phase: `caffe.TRAIN` or `caffe.TEST`

@description: Initializes a network from the prototxt. Equivalent to the C++:

	  shared_ptr<Net<Dtype> > net(new Net<Dtype>(param_file,

      static_cast<Phase>(phase)));

@return a `Net` object (the network instance)

@example `net = caffe.Net('test.prototxt', caffe.TEST)`

__init__(prototxt, caffemodel, phase)

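@param prototxt: string. Path to the prototxt file.

@param caffemodel: string. Path to the caffemodel file.

@param phase: `caffe.TRAIN` or `caffe.TEST`

@description: Initializes a network from the prototxt, then copies the weights of same-named layers from the caffemodel (they become the weights of those layers in this network). Equivalent to the C++: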

	  shared_ptr<Net<Dtype> > net(new Net<Dtype>(param_file,

    	  static_cast<Phase>(phase)));

	  net->CopyTrainedLayersFrom(pretrained_param_file);

@return a `Net` object (the network instance)

@example `net = caffe.Net('test.prototxt', 'resnet50.caffemodel', caffe.TEST)`

_forward(start, end)

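@param start: int. Index of the layer to start from.

@param end: int. Index of the layer to stop at.

@description: Runs the forward pass. Calls the C++ `Dtype Net<Dtype>::ForwardFromTo(int start, int end)`

@return: the loss value

@example: this is a private method, so calling it directly is not recommended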

_backward(start, end)

@param start: int

@param end: int

@description: Runs the backward pass. Calls the C++ `void Net<Dtype>::BackwardFromTo(int start, int end)`

@return: no return value

@example: this is a private method, so calling it directly is not recommended

reshape()

@param: takes no arguments

@description: Runs reshape on every layer of the network. Calls the C++ `void Net<Dtype>::Reshape()`

@return: no return value

copy_from(caffemodel)

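@param caffemodel: string. Path to the (pretrained) caffemodel.

@description: Reads the given caffemodel (a binary protobuf file) and replaces the parameters of the current network's layers with those of the same-named layers in the file. Calls the C++ `void Net<Dtype>::CopyTrainedLayersFrom(const string trained_filename)`

@return: none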

@example

	pretrained_caffemodel = 'abc.caffemodel'

	net = caffe.Net(prototxt, caffe.TEST)

	net.copy_from(pretrained_caffemodel)

	(Source: py-faster-rcnn)

copy_from(net)

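@param net: (another) Net object

@description: Reads the parameters of same-named layers from the given Net and uses them as replacements. What is actually called is `void Net<Dtype>::CopyTrainedLayersFrom(const NetParameter& param)`

@return none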

@example

	net = caffe.Net(prototxt, caffe.TEST)

	resnet = caffe.Net(prototxt_resnet, caffemodel, caffe.TEST)

	net.copy_from(resnet)

share_with(net)

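@param net: Net. The network whose same-named layers will share parameters with the current network.

@description: Very similar to copy_from. The difference is that after sharing, the data of same-named layers in the two Net objects is shared: there is only one copy in memory, so modifying one modifies the other as well. Calls the C++ `void Net<Dtype>::ShareTrainedLayersWith(const Net* other)`

@return none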
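
The difference between `copy_from` and `share_with` boils down to copy versus alias. The snippet below is a toy analogy in plain Python, not pycaffe code: the dicts `src`, `copied` and `shared` are hypothetical stand-ins for per-layer parameter storage keyed by layer name.

```python
import copy

# Hypothetical stand-in for a source net's parameters, keyed by layer name.
src = {"conv1": [0.1, 0.2], "fc": [0.5]}

# Two target nets that have only the "conv1" layer in common with src.
copied = {"conv1": [0.0, 0.0], "extra": [9.0]}
shared = {"conv1": [0.0, 0.0], "extra": [9.0]}

# copy_from-like: same-named layers receive an independent copy of the values.
for name in copied:
    if name in src:
        copied[name] = copy.deepcopy(src[name])

# share_with-like: same-named layers refer to the very same storage.
for name in shared:
    if name in src:
        shared[name] = src[name]

# Modify the source afterwards and compare.
src["conv1"][0] = 42.0

print(copied["conv1"])  # not affected by the later change
print(shared["conv1"])  # reflects the change, because the storage is shared
```

Layers without a same-named counterpart (`extra` here) are left untouched in both cases.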

_blob_loss_weights

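Private attribute.

Returns the weight of each blob in the loss function, indexed by blob_id.

_bottom_ids(i)

@param i: layer index

@return the list of ids of the bottom blobs of layer i, `bottom_id_vecs_[i]`

_top_ids(i)

@param i: layer index

@return the list of ids of the top blobs of layer i, `top_id_vecs_[i]`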

_blobs

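Private attribute.

Returns all blobs of the network as a list, in blob-id order. pycaffe.py pairs this list with `_blob_names` to build the public `net.blobs` dict.

layers

Public attribute.

Returns the network's layers.

_blob_names

Private attribute.

Returns the names of the blobs.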
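
In pycaffe.py, the public `net.blobs` mapping is built by zipping the private blob list and the blob names together. A minimal sketch of that idea follows; the placeholder strings stand in for real Blob objects, and the variable names are only illustrative.

```python
from collections import OrderedDict

# Hypothetical stand-ins: in pycaffe these values come from the C++ side.
blob_names = ["data", "conv1", "prob"]                   # like net._blob_names
blob_list = ["<Blob data>", "<Blob conv1>", "<Blob prob>"]  # like net._blobs

# Pair each name with the blob at the same index, preserving network order.
blobs = OrderedDict(zip(blob_names, blob_list))

print(list(blobs.keys()))
print(blobs["conv1"])
```

This is why `net.blobs['data']` works by name while the private attributes are indexed by position.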

_layer_names

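Private attribute.

Returns the names of the layers.

_inputs

Private attribute.

Returns `net_input_blob_indices_`, i.e. the blob indices of the network inputs.

_outputs

Private attribute.

Returns `net_output_blob_indices_`, i.e. the blob indices of the network outputs.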

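_set_input_arrays(data, labels)

Private function.

Judging from `_caffe.cpp`, it sets the input data and label arrays of the network's in-memory data layer (`MemoryDataLayer`). I have not used it myself.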

save(pth)

@param pth: string. Path to save to.

@description: Saves the current network object to a file on disk. Calls the C++ `void Net_Save(const Net<Dtype>& net, string filename)`

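`caffe.Blob` class

At first I thought this class was not exposed in pycaffe and was therefore of no use.

That is not the case.

The tops and bottoms of a layer are all Blob instances, so you need to know the attributes and functions these blobs offer.

shape

Public attribute.

Returns the size of each dimension of the current blob (i.e. a tensor).

Typically: N, C, H, W.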

num

Public attribute.

Deprecated legacy shape accessor num: use shape(0) instead (`shape[0]` in Python).

channels

Public attribute.

Deprecated legacy shape accessor channels: use shape(1) instead (`shape[1]` in Python).

height

Public attribute.

Deprecated legacy shape accessor height: use shape(2) instead (`shape[2]` in Python).

width

Public attribute.

Deprecated legacy shape accessor width: use shape(3) instead (`shape[3]` in Python).

count

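Public attribute.

Returns the total number of elements in the current blob, i.e. the product of all entries of `shape` (not the number of dimensions, which is `len(shape)`).

For example, for a blob of shape (N, C, H, W) it returns N*C*H*W.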
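
In Caffe's C++ `Blob`, `count()` is the product of all dimensions, which is easy to confuse with the number of axes. A small plain-Python sketch contrasting the two, using a made-up shape:

```python
from functools import reduce
from operator import mul

# Hypothetical blob shape in (N, C, H, W) order.
shape = (2, 3, 4, 5)

num_axes = len(shape)          # number of dimensions
count = reduce(mul, shape, 1)  # total number of elements, like Blob::count()

print(num_axes)
print(count)
```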

reshape(shape_args)

Public function.

Reshapes the blob.

Example:

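	import numpy as np

	# assumes `import caffe` and an existing `net` with a 'data' input blob

	im = np.array(caffe.io.load_image('catGray.jpg', color=False)).squeeze()

	im_input = im[np.newaxis, np.newaxis, :, :]  # (H, W) -> (1, 1, H, W)

	net.blobs['data'].reshape(*im_input.shape)

	net.blobs['data'].data[...] = im_input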

data

Public attribute.

Backed by `Blob<Dtype>::mutable_cpu_data`.

diff

Public attribute.

Backed by `Blob<Dtype>::mutable_cpu_diff`.

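`caffe.Layer` class

blobs

Public attribute.

Returns the layer's blobs (its learnable parameters).

setup()

reshape()

phase

Public attribute.

The layer's phase.

type

Public attribute.

The layer's type string, e.g. `Convolution`.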

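`caffe.Solver` class

This class is not exposed for direct use; it is the base class that SGDSolver and the other solver subclasses inherit from. The methods and attributes below are therefore also available on the subclasses.

Note

The caffe.Solver class has no constructor!

But the subclasses of caffe.Solver (e.g. caffe.SGDSolver) do have one!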

net

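Public attribute.

The net object owned by the Solver.

test_nets

Public attribute.

All test networks owned by the Solver (hence a list).

iter

Public attribute.

The current iteration count of the Solver object.

solve(resume_file=None)

Public function.

Runs the Solver to completion, i.e. executes all iterations.

If `resume_file` is given, training resumes from that file (instead of starting from iteration 0).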

step(iters)

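Public function.

@param iters: number of iterations to run

@description: Runs iters iterations, starting from the Solver's current iteration (self.iter). While iterating it may print the smoothed loss (smoothed_loss), for example:

	Iteration 20, loss = 3.66

For details see the function `void Solver<Dtype>::UpdateSmoothedLoss(Dtype loss, int start_iter, int average_loss)` in solver.cpp.

@return: none

@example: solver.step(1)  # run a single iteration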
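
The smoothed loss printed by `step` is a moving average over the last `average_loss` iterations. Below is a plain-Python sketch of that idea, modelled on `UpdateSmoothedLoss`; the function and argument names are illustrative, and the exact C++ may differ between Caffe versions.

```python
def update_smoothed_loss(losses, smoothed, loss, it, start_iter, average_loss):
    """Return the new smoothed loss; `losses` (the window) is updated in place."""
    if len(losses) < average_loss:
        # Window not full yet: running mean over the losses seen so far.
        losses.append(loss)
        size = len(losses)
        smoothed = (smoothed * (size - 1) + loss) / size
    else:
        # Window full: overwrite the oldest entry (ring buffer) and adjust the mean.
        idx = (it - start_iter) % average_loss
        smoothed += (loss - losses[idx]) / average_loss
        losses[idx] = loss
    return smoothed


losses, smoothed = [], 0.0
for it, loss in enumerate([4.0, 2.0, 3.0, 1.0]):
    smoothed = update_smoothed_loss(losses, smoothed, loss, it, 0, average_loss=2)
    print(it, smoothed)
```

With `average_loss = 1` (the default in SolverParameter) the smoothed loss is simply the latest loss.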

restore(state_file)

Public function.

@param state_file: string. Path to the snapshot state file.

@description: Reads (restores) the solver state from the given snapshot file state_file.

snapshot()

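Public function.

@description: Writes the current state of the solver's network to snapshot files (the solver_state file; as the naming rule below suggests, a .caffemodel file is involved as well).

Note: the snapshot file name is built by the following rule (C++):

	`param_.snapshot_prefix() + "_iter_" + caffe::format_int(iter_) + ".caffemodel";`

The snapshot prefix can be inspected from Python (and perhaps even modified?). A freshly created SolverParameter is empty, so parse the solver prototxt first:

	from caffe.proto import caffe_pb2

	from google.protobuf import text_format

	solver_param = caffe_pb2.SolverParameter()

	with open('solver.prototxt') as f:  # placeholder path to your solver file

	    text_format.Merge(f.read(), solver_param)

	print(solver_param.snapshot_prefix)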
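
The naming rule quoted above can be mimicked in plain Python. This is only a sketch: `snapshot_filename` is a hypothetical helper, the prefix is made up, and the `.solverstate` extension for the companion state file is an assumption based on common Caffe output.

```python
def snapshot_filename(snapshot_prefix, it, extension=".caffemodel"):
    # Mirrors: snapshot_prefix + "_iter_" + format_int(iter) + extension
    return snapshot_prefix + "_iter_" + str(it) + extension


print(snapshot_filename("models/resnet50", 10000))
print(snapshot_filename("models/resnet50", 10000, ".solverstate"))
```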

Note 2 (TODO):

Before using caffe.SGDSolver.restore(), caffe.Net.copy_from(), caffe.SGDSolver.copy_from(), caffe.SGDSolver.step() and similar functions, please read https://github.com/BVLC/caffe/issues/3336

`caffe.SGDSolver` class

Subclass of caffe.Solver.

__init__(filename)

@param filename: string. Path to the prototxt file that describes the SGDSolver.

@description: Creates an SGDSolver from the given prototxt file.

@example

	solver_prototxt = 'resnet-50-test.prototxt'

	self.solver = caffe.SGDSolver(solver_prototxt)

`caffe.NesterovSolver` class

Subclass of caffe.Solver.

__init__(filename)

Same as the SGDSolver constructor.

`caffe.AdaGradSolver` class

Subclass of caffe.Solver.

__init__(filename)

Same as the SGDSolver constructor.

`caffe.RMSPropSolver` class

Subclass of caffe.Solver.

__init__(filename)

Same as the SGDSolver constructor.

`caffe.AdaDeltaSolver` class

Subclass of caffe.Solver.

__init__(filename)

Same as the SGDSolver constructor.

`caffe.AdamSolver` class

Subclass of caffe.Solver.

__init__(filename)

Same as the SGDSolver constructor.

`caffe.get_solver(filename)`

Public function.

@param filename: string. Path to the solver file.

@description: Creates a solver from the given solver file.

`caffe.set_mode_cpu()`

`caffe.set_mode_gpu()`

`caffe.set_device(gpu_id)`

@param gpu_id: int

Sets the (GPU) device id. Ids start at 0: the first GPU is 0, the second GPU is 1.

`caffe.set_random_seed()`

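Sets the random seed.

Benefit: it reduces randomness as far as possible when repeating an experiment. (Only the seed is made identical, however; there is no guarantee that the random numbers are consumed in the same order, so the final inference results can still differ between runs, just with smaller fluctuations.)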
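
The caveat about consumption order can be illustrated with Python's own `random` module. This is only an analogy (Caffe uses its own random number generator): with the same seed the sequence is identical, but one extra draw somewhere shifts every value that follows.

```python
import random


def draw(seed, n):
    rng = random.Random(seed)  # independent generator with a fixed seed
    return [rng.random() for _ in range(n)]


# Same seed, same draws in the same order: identical sequences.
a = draw(123, 4)
b = draw(123, 4)
print(a == b)

# Same seed, but one extra number is consumed first (e.g. by another component).
rng = random.Random(123)
rng.random()  # the extra draw shifts everything after it
c = [rng.random() for _ in range(4)]
print(a == c)
print(a[1:] == c[:3])  # the underlying sequence is the same, just offset by one
```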

`caffe.layer_type_list()`

Returns the type names of the available Layers.

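`caffe.proto.caffe_pb2` package

This package contains a great many classes, and many of them (perhaps all?) are generated from caffe.proto. Writing them up one by one is painful and could never be finished.

Roughly, they fall into:

LayerNameParameter types, e.g. SliceParameter

Other classes:

In any case, they all appear to be things declared with `message` in caffe.proto, for example:

caffe.proto.caffe_pb2.SolverParameter class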

message SolverParameter {

  //////////////////////////////////////////////////////////////////////////////

  // Specifying the train and test networks

  //

  // Exactly one train net must be specified using one of the following fields:

  //     train_net_param, train_net, net_param, net

  // One or more test nets may be specified using any of the following fields:

  //     test_net_param, test_net, net_param, net

  // If more than one test net field is specified (e.g., both net and

  // test_net are specified), they will be evaluated in the field order given

  // above: (1) test_net_param, (2) test_net, (3) net_param/net.

  // A test_iter must be specified for each test_net.

  // A test_level and/or a test_stage may also be specified for each test_net.

  //////////////////////////////////////////////////////////////////////////////

  // Proto filename for the train net, possibly combined with one or more

  // test nets.

  optional string net = 24;

  // Inline train net param, possibly combined with one or more test nets.

  optional NetParameter net_param = 25;

  optional string train_net = 1; // Proto filename for the train net.

  repeated string test_net = 2; // Proto filenames for the test nets.

  optional NetParameter train_net_param = 21; // Inline train net params.

  repeated NetParameter test_net_param = 22; // Inline test net params.

  // The states for the train/test nets. Must be unspecified or

  // specified once per net.

  //

  // By default, all states will have solver = true;

  // train_state will have phase = TRAIN,

  // and all test_state's will have phase = TEST.

  // Other defaults are set according to the NetState defaults.

  optional NetState train_state = 26;

  repeated NetState test_state = 27;

  // The number of iterations for each test net.

  repeated int32 test_iter = 3;

  // The number of iterations between two testing phases.

  optional int32 test_interval = 4 [default = 0];

  optional bool test_compute_loss = 19 [default = false];

  // If true, run an initial test pass before the first iteration,

  // ensuring memory availability and printing the starting value of the loss.

  optional bool test_initialization = 32 [default = true];

  optional float base_lr = 5; // The base learning rate

  // the number of iterations between displaying info. If display = 0, no info

  // will be displayed.

  optional int32 display = 6;

  // Display the loss averaged over the last average_loss iterations

  optional int32 average_loss = 33 [default = 1];

  optional int32 max_iter = 7; // the maximum number of iterations

  // accumulate gradients over `iter_size` x `batch_size` instances

  optional int32 iter_size = 36 [default = 1];

  // The learning rate decay policy. The currently implemented learning rate

  // policies are as follows:

  //    - fixed: always return base_lr.

  //    - step: return base_lr * gamma ^ (floor(iter / step))

  //    - exp: return base_lr * gamma ^ iter

  //    - inv: return base_lr * (1 + gamma * iter) ^ (- power)

  //    - multistep: similar to step but it allows non uniform steps defined by

  //      stepvalue

  //    - poly: the effective learning rate follows a polynomial decay, to be

  //      zero by the max_iter. return base_lr (1 - iter/max_iter) ^ (power)

  //    - sigmoid: the effective learning rate follows a sigmod decay

  //      return base_lr ( 1/(1 + exp(-gamma * (iter - stepsize))))

  //

  // where base_lr, max_iter, gamma, step, stepvalue and power are defined

  // in the solver parameter protocol buffer, and iter is the current iteration.

  optional string lr_policy = 8;

  optional float gamma = 9; // The parameter to compute the learning rate.

  optional float power = 10; // The parameter to compute the learning rate.

  optional float momentum = 11; // The momentum value.

  optional float weight_decay = 12; // The weight decay.

  // regularization types supported: L1 and L2

  // controlled by weight_decay

  optional string regularization_type = 29 [default = "L2"];

  // the stepsize for learning rate policy "step"

  optional int32 stepsize = 13;

  // the stepsize for learning rate policy "multistep"

  repeated int32 stepvalue = 34;

  // Set clip_gradients to >= 0 to clip parameter gradients to that L2 norm,

  // whenever their actual L2 norm is larger.

  optional float clip_gradients = 35 [default = -1];

  optional int32 snapshot = 14 [default = 0]; // The snapshot interval

  optional string snapshot_prefix = 15; // The prefix for the snapshot.

  // whether to snapshot diff in the results or not. Snapshotting diff will help

  // debugging but the final protocol buffer size will be much larger.

  optional bool snapshot_diff = 16 [default = false];

  enum SnapshotFormat {

    HDF5 = 0;

    BINARYPROTO = 1;

  }

  optional SnapshotFormat snapshot_format = 37 [default = BINARYPROTO];

  // the mode solver will use: 0 for CPU and 1 for GPU. Use GPU in default.

  enum SolverMode {

    CPU = 0;

    GPU = 1;

  }

  optional SolverMode solver_mode = 17 [default = GPU];

  // the device_id will that be used in GPU mode. Use device_id = 0 in default.

  optional int32 device_id = 18 [default = 0];

  // If non-negative, the seed with which the Solver will initialize the Caffe

  // random number generator -- useful for reproducible results. Otherwise,

  // (and by default) initialize using a seed derived from the system clock.

  optional int64 random_seed = 20 [default = -1];

  // type of the solver

  optional string type = 40 [default = "SGD"];

  // numerical stability for RMSProp, AdaGrad and AdaDelta and Adam

  optional float delta = 31 [default = 1e-8];

  // parameters for the Adam solver

  optional float momentum2 = 39 [default = 0.999];

  // RMSProp decay value

  // MeanSquare(t) = rms_decay*MeanSquare(t-1) + (1-rms_decay)*SquareGradient(t)

  optional float rms_decay = 38;

  // If true, print information about the state of the net that may help with

  // debugging learning problems.

  optional bool debug_info = 23 [default = false];

  // If false, don't save a snapshot after training finishes.

  optional bool snapshot_after_train = 28 [default = true];

  // DEPRECATED: old solver enum types, use string instead

  enum SolverType {

    SGD = 0;

    NESTEROV = 1;

    ADAGRAD = 2;

    RMSPROP = 3;

    ADADELTA = 4;

    ADAM = 5;

  }

  // DEPRECATED: use type instead of solver_type

  optional SolverType solver_type = 30 [default = SGD];

}
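
The learning-rate policies described in the `lr_policy` comment of the message above can be written out as a small function. This is a sketch that follows the formulas in that comment literally, not Caffe's actual implementation; `learning_rate` is a hypothetical helper, its default argument values are arbitrary, and the `multistep` policy is left out.

```python
import math


def learning_rate(policy, it, base_lr, gamma=0.1, power=1.0,
                  stepsize=1, max_iter=1):
    """Learning rate at iteration `it`, per the formulas in the proto comment."""
    if policy == "fixed":
        return base_lr
    if policy == "step":
        return base_lr * gamma ** (it // stepsize)
    if policy == "exp":
        return base_lr * gamma ** it
    if policy == "inv":
        return base_lr * (1 + gamma * it) ** (-power)
    if policy == "poly":
        return base_lr * (1 - it / max_iter) ** power
    if policy == "sigmoid":
        return base_lr * (1 / (1 + math.exp(-gamma * (it - stepsize))))
    raise ValueError("unsupported lr_policy: %s" % policy)


# Two drops of a factor 0.1 have happened by iteration 25 with stepsize 10.
print(learning_rate("step", 25, base_lr=0.01, gamma=0.1, stepsize=10))
# Halfway through a linear (power = 1) polynomial decay.
print(learning_rate("poly", 50, base_lr=0.01, power=1.0, max_iter=100))
```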

caffe.proto.caffe_pb2.SolverState class

// A message that stores the solver snapshots

message SolverState {

  optional int32 iter = 1; // The current iteration

  optional string learned_net = 2; // The file that stores the learned net.

  repeated BlobProto history = 3; // The history for sgd solvers

  optional int32 current_step = 4 [default = 0]; // The current step for learning rate

}

All in all, I came away impressed by how powerful, and how daunting, this Google package called protobuf is.