毫无疑问,faster-rcnn是目标检测领域的一个里程碑式的算法。本文主要是本人阅读python版本的faster-rcnn代码的一个记录,算法的具体原理本文也会有介绍,但是为了对该算法有一个整体性的理解以及更好地理解本文,还需事先阅读faster-rcnn的论文并参考网上的一些说明性的博客(如一文读懂Faster RCNN)。官方的py-faster-rcnn代码库已经不再维护了,我使用的是经过少许修改后的代码(主要是numpy版本不兼容导致的一些错误),可以参考这里

faster-rcnn有2种训练方式,一是两阶段法,二是端到端的方法,本文主要讲述端到端的方法,并以训练代码的运行顺序进行阅读。

一、数据准备

程序首先从faster-rcnn/tools/train_net.py运行,程序如下:

 if __name__ == '__main__':
args = parse_args() print('Called with args:')
print(args) if args.cfg_file is not None:
cfg_from_file(args.cfg_file)
if args.set_cfgs is not None:
cfg_from_list(args.set_cfgs) cfg.GPU_ID = args.gpu_id print('Using config:')
pprint.pprint(cfg) if not args.randomize:
# fix the random seeds (numpy and caffe) for reproducibility
np.random.seed(cfg.RNG_SEED)
caffe.set_random_seed(cfg.RNG_SEED) # set up caffe
caffe.set_mode_gpu()
caffe.set_device(args.gpu_id) imdb, roidb = combined_roidb(args.imdb_name)
print '{:d} roidb entries'.format(len(roidb)) output_dir = get_output_dir(imdb)
print 'Output will be saved to `{:s}`'.format(output_dir) train_net(args.solver, roidb, output_dir,
pretrained_model=args.pretrained_model,
max_iters=args.max_iters)

该部分cfg_from_file(args.cfg_file)调用faster-rcnn/lib/fast_rcnn/config.py中的cfg_from_file方法,从faster-rcnn/experiments/cfgs/faster_rcnn_end2end.yml文件中加载一些端到端训练时用到的参数配置,这句话会修改config.py中一些参数的值,下面是faster_rcnn_end2end.yml中的内容:

 EXP_DIR: faster_rcnn_end2end
TRAIN:
HAS_RPN: True
IMS_PER_BATCH: 1
BBOX_NORMALIZE_TARGETS_PRECOMPUTED: True
RPN_POSITIVE_OVERLAP: 0.7
RPN_BATCHSIZE: 256
PROPOSAL_METHOD: gt
BG_THRESH_LO: 0.0
TEST:
HAS_RPN: True

train_net第109行imdb, roidb = combined_roidb(args.imdb_name)是数据准备的核心部分。返回的imdb是类pascal_voc的一个实例,后面只用到了其的一些路径,作用不大。roidb则包含了训练网络所需要的所有信息。下面看一下它的产生过程:

 def combined_roidb(imdb_names):
def get_roidb(imdb_name):
imdb = get_imdb(imdb_name)
print 'Loaded dataset `{:s}` for training'.format(imdb.name)
imdb.set_proposal_method(cfg.TRAIN.PROPOSAL_METHOD)
print 'Set proposal method: {:s}'.format(cfg.TRAIN.PROPOSAL_METHOD)
roidb = get_training_roidb(imdb)
return roidb roidbs = [get_roidb(s) for s in imdb_names.split('+')]
roidb = roidbs[0]
if len(roidbs) > 1:
for r in roidbs[1:]:
roidb.extend(r)
imdb = datasets.imdb.imdb(imdb_names)
else:
imdb = get_imdb(imdb_names)
return imdb, roidb

下面逐一分析combined_roidb函数中的每步操作。

1.1、get_imdb

首先由imdb = get_imdb(imdb_name)调用faster-rcnn/lib/datasets/factory.py中的get_imdb方法,返回了一个faster-rcnn/lib/datasets/pascal_voc.py中的pascal_voc类的实例。我输入函数get_imdb的参数是'voc_2007_trainval',与其对应的初始化pascal_voc类的参数为image_set='trainval',year='2007'。在这个pascal_voc实例中,数据集的路径由以下方式获取:

     def __init__(self, image_set, year, devkit_path=None):
imdb.__init__(self, 'voc_' + year + '_' + image_set)
self._year = year
self._image_set = image_set
self._devkit_path = self._get_default_path() if devkit_path is None \
else devkit_path
self._data_path = os.path.join(self._devkit_path, 'VOC' + self._year) def _get_default_path(self):
"""
Return the default path where PASCAL VOC is expected to be installed.
"""
return os.path.join(cfg.DATA_DIR, 'VOCdevkit' + self._year)

至于cfg.DATA_DIR,由faster-rcnn/lib/fast_rcnn/config.py文件的如下内容确定:

 # Root directory of project
__C.ROOT_DIR = osp.abspath(osp.join(osp.dirname(__file__), '..', '..')) # Data directory
__C.DATA_DIR = osp.abspath(osp.join(__C.ROOT_DIR, 'data'))

因此,由给出的以上参数确定的数据集的路径为self._data_path=$CODE_DIR/faster-rcnn/data/VOCdevkit2007/VOC2007。

1.2、imdb.set_proposal_method

其次,imdb.set_proposal_method(cfg.TRAIN.PROPOSAL_METHOD)会调用faster-rcnn/lib/datasets/imdb.py中类imdb中的set_proposal_method方法(因为pascal_voc继承自imdb),进而使self.roidb_handler为类pascal_voc中的gt_roidb方法(因为参数method='gt')。

这步操作非常重要,因为函数gt_roidb就是读取pascal_voc数据集,并返回所有图片信息的函数,代码如下:

     def gt_roidb(self):
"""
Return the database of ground-truth regions of interest. This function loads/saves from/to a cache file to speed up future calls.
"""
cache_file = os.path.join(self.cache_path, self.name + '_gt_roidb.pkl')
if os.path.exists(cache_file):
with open(cache_file, 'rb') as fid:
roidb = cPickle.load(fid)
print '{} gt roidb loaded from {}'.format(self.name, cache_file)
return roidb gt_roidb = [self._load_pascal_annotation(index)
for index in self.image_index]
with open(cache_file, 'wb') as fid:
cPickle.dump(gt_roidb, fid, cPickle.HIGHEST_PROTOCOL)
print 'wrote gt roidb to {}'.format(cache_file) return gt_roidb

在函数gt_roidb中,首先判断有没有cache_file(它会在第一次读取数据集标注文件之后将所有字典形式的标注信息写进一个文件中,我创建数据类用的imdb_name='voc_2007_trainval',因此对应的文件名为faster-rcnn/data/voc_2007_trainval_gt_roidb.pkl),若存在,则直接从中读取标注信息,若不存在,则通过调用_load_pascal_annotation将pascal_voc数据集中每张图片的标注信息读取读取到一个字典中,具体代码如下:

     def _load_pascal_annotation(self, index):
"""
Load image and bounding boxes info from XML file in the PASCAL VOC
format.
"""
filename = os.path.join(self._data_path, 'Annotations', index + '.xml')
tree = ET.parse(filename)
objs = tree.findall('object')
if not self.config['use_diff']:
# Exclude the samples labeled as difficult
non_diff_objs = [
obj for obj in objs if int(obj.find('difficult').text) == 0]
# if len(non_diff_objs) != len(objs):
# print 'Removed {} difficult objects'.format(
# len(objs) - len(non_diff_objs))
objs = non_diff_objs
num_objs = len(objs) boxes = np.zeros((num_objs, 4), dtype=np.uint16)
gt_classes = np.zeros((num_objs), dtype=np.int32)
overlaps = np.zeros((num_objs, self.num_classes), dtype=np.float32)
# "Seg" area for pascal is just the box area
seg_areas = np.zeros((num_objs), dtype=np.float32) # Load object bounding boxes into a data frame.
for ix, obj in enumerate(objs):
bbox = obj.find('bndbox')
# Make pixel indexes 0-based
x1 = float(bbox.find('xmin').text) - 1
y1 = float(bbox.find('ymin').text) - 1
x2 = float(bbox.find('xmax').text) - 1
y2 = float(bbox.find('ymax').text) - 1
cls = self._class_to_ind[obj.find('name').text.lower().strip()]
boxes[ix, :] = [x1, y1, x2, y2]
gt_classes[ix] = cls
overlaps[ix, cls] = 1.0
seg_areas[ix] = (x2 - x1 + 1) * (y2 - y1 + 1) overlaps = scipy.sparse.csr_matrix(overlaps) return {'boxes' : boxes,
'gt_classes': gt_classes,
'gt_overlaps' : overlaps,
'flipped' : False,
'seg_areas' : seg_areas}

值得一提的是,字典中,overlaps指的是该张图片中,每个物体与其它ground true之间的重叠比例,不过从代码来看,默认一张图片中所有的物体(ground true)之间是没有重叠的,因而overlaps的shape为(num_objs, self.num_classes),它的每一行(第一个轴上)只有一个元素是1.0,其它的元素都是0。这种默认方式虽然与实际标注情况不符,但对后面的操作并没有影响。

1.3、get_training_roidb

roidb = get_training_roidb(imdb)会调用faster-rcnn/lib/fast_rcnn/train.py中的get_training_roidb函数

 def get_training_roidb(imdb):
"""Returns a roidb (Region of Interest database) for use in training."""
if cfg.TRAIN.USE_FLIPPED:
print 'Appending horizontally-flipped training examples...'
imdb.append_flipped_images()
print 'done' print 'Preparing training data...'
rdl_roidb.prepare_roidb(imdb)
print 'done' return imdb.roidb

会进行2步操作。

1.3.1、imdb.append_flipped_images()

     def append_flipped_images(self):
num_images = self.num_images
widths = self._get_widths()
for i in xrange(num_images):
boxes = self.roidb[i]['boxes'].copy()
oldx1 = boxes[:, 0].copy()
oldx2 = boxes[:, 2].copy()
boxes[:, 0] = widths[i] - oldx2 - 1
boxes[:, 2] = widths[i] - oldx1 - 1
assert (boxes[:, 2] >= boxes[:, 0]).all()
entry = {'boxes' : boxes,
'gt_overlaps' : self.roidb[i]['gt_overlaps'],
'gt_classes' : self.roidb[i]['gt_classes'],
'flipped' : True}
self.roidb.append(entry)
self._image_index = self._image_index * 2

此句调用faster-rcnn/lib/datasets/imdb.py中类imdb的append_flipped_images方法,其作用是将数据集中的每张图的所有bounding box标签进行水平翻转,然后将图片信息字典中的'flipped'置为True,并将这一新的字典添加进原始的roidb list中,这样图片信息列表的长度就变为了原来的2倍。最后将数据集实例中的_image_index成员(所有图片名的list)复制了一份,长度也变为了原来的2倍。值得关注的是self.roidb是类imdb的一个属性(由Python内置的@property装饰器修饰)。属性和方法的不同之处在于调用方法需要加(),如某方法名为methodname,调用方式为methodname(),而调用属性不需要加(),self.roidb的构造过程如以下代码所示。另外,装饰器@methodname.setter可以把一个方法变成可以赋值的属性,“=”右侧的表达式作为传入方法的实参,如以下代码中的@roidb_handler.setter。

     @property
def roidb_handler(self):
return self._roidb_handler @roidb_handler.setter
def roidb_handler(self, val):
self._roidb_handler = val def set_proposal_method(self, method):
method = eval('self.' + method + '_roidb')
self.roidb_handler = method @property
def roidb(self):
# A roidb is a list of dictionaries, each with the following keys:
# boxes
# gt_overlaps
# gt_classes
# flipped
if self._roidb is not None:
return self._roidb
self._roidb = self.roidb_handler()
return self._roidb

1.3.2、rdl_roidb.prepare_roidb(imdb)

 def prepare_roidb(imdb):
"""Enrich the imdb's roidb by adding some derived quantities that
are useful for training. This function precomputes the maximum
overlap, taken over ground-truth boxes, between each ROI and
each ground-truth box. The class with maximum overlap is also
recorded.
"""
sizes = [PIL.Image.open(imdb.image_path_at(i)).size
for i in xrange(imdb.num_images)]
roidb = imdb.roidb
for i in xrange(len(imdb.image_index)):
roidb[i]['image'] = imdb.image_path_at(i)
roidb[i]['width'] = sizes[i][0]
roidb[i]['height'] = sizes[i][1]
# need gt_overlaps as a dense array for argmax
gt_overlaps = roidb[i]['gt_overlaps'].toarray()
# max overlap with gt over classes (columns)
max_overlaps = gt_overlaps.max(axis=1)
# gt class that had the max overlap
max_classes = gt_overlaps.argmax(axis=1)
roidb[i]['max_classes'] = max_classes
roidb[i]['max_overlaps'] = max_overlaps
# sanity checks
# max overlap of 0 => class should be zero (background)
zero_inds = np.where(max_overlaps == 0)[0]
assert all(max_classes[zero_inds] == 0)
# max overlap > 0 => class should not be zero (must be a fg class)
nonzero_inds = np.where(max_overlaps > 0)[0]
assert all(max_classes[nonzero_inds] != 0)

此句调用faster-rcnn/lib/roi_data_layer/roidb.py中的prepare_roidb函数,其作用是在图片信息字典中加入5个键值。分别是'image'(图片的全路径),'width'(图片的宽度),'height'(图片的高度),'max_classes','max_overlaps'。

至此roidb的构造过程便结束了,下面总结一下:最终得到的roidb是一个包含数据集中所有图片(以及它的水平翻转)信息的list,每张图的信息(保存在一个字典中)对应着list中的一个元素。每张图片的信息结构如下:

 {
'boxes' : boxes, # picture's bounding box: xmin, ymin, xmax, ymax(pixel indexes 0-based),
# shape: (num_objs, 4), dtype=np.uint16
'gt_classes': gt_classes, # gt class label(background is 0), shape: (num_objs,), dtype=np.int32
'gt_overlaps' : overlaps, # each obj's max overlap with one of gt, shape: (num_objs, self.num_classes), dtype=np.float32
'flipped' : False,
'seg_areas' : seg_areas, # area for each obj in one picture, shape: (num_objs,), dtype=np.float32
'image' : image_full_path,
'width' : image_width,
'height' : image_height,
'max_classes' : max_classes, # equal to gt_classes, shape: (num_objs,), dtype=np.int64
'max_overlaps' : max_overlaps, # all elements are 1.0, shape: (num_objs,), dtype=np.float32
}

1.4、get_output_dir

train_net.py第112行output_dir = get_output_dir(imdb)调用faster-rcnn/lib/fast_rcnn/config.py中的get_output_dir函数

 def get_output_dir(imdb, net=None):
"""Return the directory where experimental artifacts are placed.
If the directory does not exist, it is created. A canonical path is built using the name from an imdb and a network
(if not None).
"""
outdir = osp.abspath(osp.join(__C.ROOT_DIR, 'output', __C.EXP_DIR, imdb.name))
if net is not None:
outdir = osp.join(outdir, net.name)
if not os.path.exists(outdir):
os.makedirs(outdir)
return outdir

函数中的__C.EXP_DIR在faster_rcnn_end2end.yml中的配置为faster_rcnn_end2end,因此最终outdir=$CODE_DIR/faster-rcnn/output/faster_rcnn_end2end/voc_2007_trainval

1.5、train_net

使用以上得到的roidb,output_dir等作为参数,训练网络。调用faster-rcnn/lib/fast_rcnn/train.py中的train_net函数

 def train_net(solver_prototxt, roidb, output_dir,
pretrained_model=None, max_iters=40000):
"""Train a Fast R-CNN network.""" roidb = filter_roidb(roidb)
sw = SolverWrapper(solver_prototxt, roidb, output_dir,
pretrained_model=pretrained_model) print 'Solving...'
model_paths = sw.train_model(max_iters)
print 'done solving'
return model_paths

1.5.1、filter_roidb

roidb = filter_roidb(roidb)调用filter_roidb函数对上述得到的roidb再按照一定的要求作进一步的过滤:

 __C.TRAIN.FG_THRESH = 0.5
__C.TRAIN.BG_THRESH_HI = 0.5
__C.TRAIN.BG_THRESH_LO = 0.1 def filter_roidb(roidb):
"""Remove roidb entries that have no usable RoIs.""" def is_valid(entry):
# Valid images have:
# (1) At least one foreground RoI OR
# (2) At least one background RoI
overlaps = entry['max_overlaps']
# find boxes with sufficient overlap
fg_inds = np.where(overlaps >= cfg.TRAIN.FG_THRESH)[0]
# Select background RoIs as those within [BG_THRESH_LO, BG_THRESH_HI)
bg_inds = np.where((overlaps < cfg.TRAIN.BG_THRESH_HI) &
(overlaps >= cfg.TRAIN.BG_THRESH_LO))[0]
# image is only valid if such boxes exist
valid = len(fg_inds) > 0 or len(bg_inds) > 0
return valid num = len(roidb)
filtered_roidb = [entry for entry in roidb if is_valid(entry)]
num_after = len(filtered_roidb)
print 'Filtered {} roidb entries: {} -> {}'.format(num - num_after,
num, num_after)
return filtered_roidb

一般的标注信息都能满足上述2个要求。

1.5.2、SolverWrapper

在该类的初始化函数中,主要有以下操作:

函数中各配置参数的值如下:

cfg.TRAIN.HAS_RPN=True

cfg.TRAIN.BBOX_REG=True

cfg.TRAIN.BBOX_NORMALIZE_TARGETS=True

cfg.TRAIN.BBOX_NORMALIZE_TARGETS_PRECOMPUTED=True

     def __init__(self, solver_prototxt, roidb, output_dir,
pretrained_model=None):
"""Initialize the SolverWrapper."""
self.output_dir = output_dir if (cfg.TRAIN.HAS_RPN and cfg.TRAIN.BBOX_REG and
cfg.TRAIN.BBOX_NORMALIZE_TARGETS):
# RPN can only use precomputed normalization because there are no
# fixed statistics to compute a priori
assert cfg.TRAIN.BBOX_NORMALIZE_TARGETS_PRECOMPUTED if cfg.TRAIN.BBOX_REG:
print 'Computing bounding-box regression targets...'
self.bbox_means, self.bbox_stds = \
rdl_roidb.add_bbox_regression_targets(roidb)
print 'done' self.solver = caffe.SGDSolver(solver_prototxt)
if pretrained_model is not None:
print ('Loading pretrained model '
'weights from {:s}').format(pretrained_model)
self.solver.net.copy_from(pretrained_model) self.solver_param = caffe_pb2.SolverParameter()
with open(solver_prototxt, 'rt') as f:
pb2_text_format.Merge(f.read(), self.solver_param) self.solver.net.layers[0].set_roidb(roidb)

1.5.2.1、add_bbox_regression_targets

函数中self.bbox_means, self.bbox_stds = rdl_roidb.add_bbox_regression_targets(roidb)调用faster-rcnn/lib/roi_data_layer/roidb.py中的add_bbox_regression_targets函数

函数中各配置参数的值如下:

cfg.TRAIN.BBOX_NORMALIZE_TARGETS_PRECOMPUTED=True

cfg.TRAIN.BBOX_NORMALIZE_TARGETS=True

 def add_bbox_regression_targets(roidb):
"""Add information needed to train bounding-box regressors."""
assert len(roidb) > 0
assert 'max_classes' in roidb[0], 'Did you call prepare_roidb first?' num_images = len(roidb)
# Infer number of classes from the number of columns in gt_overlaps
num_classes = roidb[0]['gt_overlaps'].shape[1]
for im_i in xrange(num_images):
rois = roidb[im_i]['boxes']
max_overlaps = roidb[im_i]['max_overlaps']
max_classes = roidb[im_i]['max_classes']
roidb[im_i]['bbox_targets'] = \
_compute_targets(rois, max_overlaps, max_classes) if cfg.TRAIN.BBOX_NORMALIZE_TARGETS_PRECOMPUTED:
# Use fixed / precomputed "means" and "stds" instead of empirical values
means = np.tile(
np.array(cfg.TRAIN.BBOX_NORMALIZE_MEANS), (num_classes, 1))
stds = np.tile(
np.array(cfg.TRAIN.BBOX_NORMALIZE_STDS), (num_classes, 1))
else:
# Compute values needed for means and stds
# var(x) = E(x^2) - E(x)^2
class_counts = np.zeros((num_classes, 1)) + cfg.EPS
sums = np.zeros((num_classes, 4))
squared_sums = np.zeros((num_classes, 4))
for im_i in xrange(num_images):
targets = roidb[im_i]['bbox_targets']
for cls in xrange(1, num_classes):
cls_inds = np.where(targets[:, 0] == cls)[0]
if cls_inds.size > 0:
class_counts[cls] += cls_inds.size
sums[cls, :] += targets[cls_inds, 1:].sum(axis=0)
squared_sums[cls, :] += \
(targets[cls_inds, 1:] ** 2).sum(axis=0) means = sums / class_counts
stds = np.sqrt(squared_sums / class_counts - means ** 2) print 'bbox target means:'
print means
print means[1:, :].mean(axis=0) # ignore bg class
print 'bbox target stdevs:'
print stds
print stds[1:, :].mean(axis=0) # ignore bg class # Normalize targets
if cfg.TRAIN.BBOX_NORMALIZE_TARGETS:
print "Normalizing targets"
for im_i in xrange(num_images):
targets = roidb[im_i]['bbox_targets']
for cls in xrange(1, num_classes):
cls_inds = np.where(targets[:, 0] == cls)[0]
roidb[im_i]['bbox_targets'][cls_inds, 1:] -= means[cls, :]
roidb[im_i]['bbox_targets'][cls_inds, 1:] /= stds[cls, :]
else:
print "NOT normalizing targets" # These values will be needed for making predictions
# (the predicts will need to be unnormalized and uncentered)
return means.ravel(), stds.ravel()

add_bbox_regression_targets首先计算所有边界框的回归目标(注意不是边界框的坐标),然后使用事先设定的均值和方差将回归目标标准化:

 __C.TRAIN.BBOX_NORMALIZE_MEANS = (0.0, 0.0, 0.0, 0.0)
__C.TRAIN.BBOX_NORMALIZE_STDS = (0.1, 0.1, 0.2, 0.2)

因为从gt到gt的回归目标都为0,因此标准化之后仍然为0,我认为这一步有点多余。其中使用到的函数有_compute_targetsbbox_transform

1.5.2.2、set_roidb

在SolverWrapper的初始化函数中,接下来是构造一个caffe中的solver对象、加载与训练模型的参数。最后使用self.solver.net.layers[0].set_roidb(roidb)将上述的roidb传入网络的第一层,即input-data层中。set_roidb的具体代码如下:

函数中各配置参数的值如下:

cfg.TRAIN.USE_PREFETCH=False

cfg.TRAIN.ASPECT_GROUPING=True

     def set_roidb(self, roidb):
"""Set the roidb to be used by this layer during training."""
self._roidb = roidb
self._shuffle_roidb_inds()
if cfg.TRAIN.USE_PREFETCH:
self._blob_queue = Queue(10)
self._prefetch_process = BlobFetcher(self._blob_queue,
self._roidb,
self._num_classes)
self._prefetch_process.start()
# Terminate the child process when the parent exists
def cleanup():
print 'Terminating BlobFetcher'
self._prefetch_process.terminate()
self._prefetch_process.join()
import atexit
atexit.register(cleanup) def _shuffle_roidb_inds(self):
"""Randomly permute the training roidb."""
if cfg.TRAIN.ASPECT_GROUPING:
widths = np.array([r['width'] for r in self._roidb])
heights = np.array([r['height'] for r in self._roidb])
horz = (widths >= heights)
vert = np.logical_not(horz)
horz_inds = np.where(horz)[0]
vert_inds = np.where(vert)[0]
inds = np.hstack((
np.random.permutation(horz_inds),
np.random.permutation(vert_inds)))
inds = np.reshape(inds, (-1, 2))
row_perm = np.random.permutation(np.arange(inds.shape[0]))
inds = np.reshape(inds[row_perm, :], (-1,))
self._perm = inds
else:
self._perm = np.random.permutation(np.arange(len(self._roidb)))
self._cur = 0

其中,使用到的函数有set_roidb_shuffle_roidb_inds。至此,faster-rcnn的数据准备阶段完成。

faster-rcnn代码阅读1的更多相关文章

  1. Faster R-CNN代码例子

    主要参考文章:1,从编程实现角度学习Faster R-CNN(附极简实现) 经常是做到一半发现收敛情况不理想,然后又回去看看这篇文章的细节. 另外两篇: 2,Faster R-CNN学习总结      ...

  2. Faster RCNN代码理解(Python)

    转自http://www.infocool.net/kb/Python/201611/209696.html#原文地址 第一步,准备 从train_faster_rcnn_alt_opt.py入: 初 ...

  3. Faster rcnn代码理解(4)

    上一篇我们说完了AnchorTargetLayer层,然后我将Faster rcnn中的其他层看了,这里把ROIPoolingLayer层说一下: 我先说一下它的实现原理:RPN生成的roi区域大小是 ...

  4. Faster rcnn代码理解(2)

    接着上篇的博客,咱们继续看一下Faster RCNN的代码- 上次大致讲完了Faster rcnn在训练时是如何获取imdb和roidb文件的,主要都在train_rpn()的get_roidb()函 ...

  5. Faster rcnn代码理解(1)

    这段时间看了不少论文,回头看看,感觉还是有必要将Faster rcnn的源码理解一下,毕竟后来很多方法都和它有相近之处,同时理解该框架也有助于以后自己修改和编写自己的框架.好的开始吧- 这里我们跟着F ...

  6. Faster R-CNN论文阅读摘要

    论文链接: https://arxiv.org/pdf/1506.01497.pdf 代码下载: https://github.com/ShaoqingRen/faster_rcnn (MATLAB) ...

  7. Faster rcnn代码理解(3)

    紧接着之前的博客,我们继续来看faster rcnn中的AnchorTargetLayer层: 该层定义在lib>rpn>中,见该层定义: 首先说一下这一层的目的是输出在特征图上所有点的a ...

  8. Faster RCNN代码解析

    1.faster_rcnn_end2end训练 1.1训练入口及配置 def train(): cfg.GPU_ID = 0 cfg_file = "../experiments/cfgs/ ...

  9. tensorflow faster rcnn 代码分析一 demo.py

    os.environ["CUDA_VISIBLE_DEVICES"]=2 # 设置使用的GPU tfconfig=tf.ConfigProto(allow_soft_placeme ...

  10. 对faster rcnn代码讲解的很好的一个

    http://www.cnblogs.com/houkai/p/6824455.html http://blog.csdn.net/u014696921/article/details/6032142 ...

随机推荐

  1. poj3926

    dp+优化 很明显可以用单调队列优化. 记录下自己犯的sb错误:  数组开小,sum没搞清... #include<cstdio> #include<cstring> usin ...

  2. [Apple开发者帐户帮助]二、管理你的团队(5)转移帐户持有人角色

    组织团队的帐户持有人可以将帐户持有人角色转移给团队中的其他人.如果您是个人注册并需要将您的会员资格转移给其他人,请与我们联系. 所需角色:帐户持有人. 转移Apple Developer Progra ...

  3. 【Python】循环语句

    while循环 当条件成立时,循环体的内容可以一直执行,但是避免死循环,需要有一个跳出循环的条件才行. for循环 遍历任何序列(列表和字符串)中的每一个元素 >>> a = [&q ...

  4. CAS配置(3)之restful-api接入接口

    第一步,cas服务端对api接口支持 在cas-server-webapp下 pom.xml添加如下依赖 <dependency> <groupId>org.jasig.cas ...

  5. IP地址库

    吐槽 前两天一个线上的IP地址库除了点幺蛾子,一查代码,发现用的库早就不更新了,遂决定换库,有几个方案: 纯真数据库 IPIP数据库 GeoIP 纯真数据库是大码农的福音,免费,但是精度一般:IPIP ...

  6. CentOS7 搭建Kafka(三)工具篇

    CentOS7 搭建Kafka(三)工具篇 做为一名懒人,自然不喜欢敲那些命令,一个是容易出错,另外一个是懒得记,能有个工具就最好了,一查还挺多,我们用个最主流的Kafka Manager Kafka ...

  7. .net core2.0 中使用log4net

    一.nuget安装log4net 二.添加log4net.config配置文件 <?xml version="1.0" encoding="utf-8"? ...

  8. Jquery IE8兼容性

    环境: jsp+jquery-1.11.1.min.js 问题描述: 使用$("#article标签id名").append(“xxxxxxxxx") ,chrome.f ...

  9. Tomcat 程序无问题的情况下页面打开变慢的原因

    看看这写日志的频率就知道我有多闲了.. 前言: 其实关于tomcat,遇到过很多关于“慢”的问题,比如启动慢,比如页面打开慢, 以前太忙也太懒,不愿意花时间分析原因,现在终于肯静下来找原因 环境是ec ...

  10. windows phone控件

    常用控件: 包括: Button控件.CheckBox控件.HyperlinkButton控件.Iamege控件.ListBox控件.PasswordBox控件.ProgressBar控件.Radio ...