Minibatch wrapping in the PyTorch implementation of Faster R-CNN

Faster R-CNN actually resizes every input image: the feature map is extracted from the resized image, and a certain number of RoIs are then generated from it.

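I want to start by removing this resize step, so that every image is processed at its original resolution. That means finding out exactly where the image gets resized.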

A direct search does the job: grep 'resize' ./lib/ -r

./lib/crnn/utils.py: v.data.resize_(data.size()).copy_(data)
./lib/model/config.py:# Option to set if max-pooling is appended after crop_and_resize.
./lib/model/config.py:# if true, the region will be resized to a square of 2xPOOLING_SIZE,
./lib/model/config.py:# resized to a square of POOLING_SIZE
./lib/model/test.py: im = cv2.resize(im_orig, None, None, fx=im_scale, fy=im_scale,
./lib/nets/network.py:from scipy.misc import imresize
./lib/nets/network.py: image = imresize(image[0], self._im_info[:2] / self._im_info[2])
./lib/utils/blob.py: im = cv2.resize(im, None, None, fx=im_scale, fy=im_scale,

During training, the one that gets called should be ./lib/utils/blob.py.

That file contains two functions:

 def im_list_to_blob(ims):
   """Convert a list of images into a network input.

   Assumes images are already prepared (means subtracted, BGR order, ...).
   """
   max_shape = np.array([im.shape for im in ims]).max(axis=0)
   num_images = len(ims)
   blob = np.zeros((num_images, max_shape[0], max_shape[1], 3),
                   dtype=np.float32)
   for i in range(num_images):
     im = ims[i]
     blob[i, 0:im.shape[0], 0:im.shape[1], :] = im
   return blob

 def prep_im_for_blob(im, pixel_means, target_size, max_size):
   """Mean subtract and scale an image for use in a blob."""
   im = im.astype(np.float32, copy=False)
   im -= pixel_means
   im_shape = im.shape
   im_size_min = np.min(im_shape[0:2])
   im_size_max = np.max(im_shape[0:2])
   im_scale = float(target_size) / float(im_size_min)
   # Prevent the biggest axis from being more than MAX_SIZE
   if np.round(im_scale * im_size_max) > max_size:
     im_scale = float(max_size) / float(im_size_max)
   im = cv2.resize(im, None, None, fx=im_scale, fy=im_scale,
                   interpolation=cv2.INTER_LINEAR)
   return im, im_scale
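To make the scaling rule in prep_im_for_blob() concrete, here is a small sketch that reproduces only the scale computation, without OpenCV. The helper name compute_im_scale and the values target_size=600 and max_size=1000 are assumptions chosen for illustration; the real values come from cfg.TRAIN.SCALES and cfg.TRAIN.MAX_SIZE.

```python
def compute_im_scale(height, width, target_size, max_size):
    """Return the resize factor that the rule in prep_im_for_blob would produce."""
    im_size_min = min(height, width)
    im_size_max = max(height, width)
    # Scale so that the shorter side becomes target_size
    im_scale = float(target_size) / float(im_size_min)
    # Cap the scale so that the longer side does not exceed max_size
    if round(im_scale * im_size_max) > max_size:
        im_scale = float(max_size) / float(im_size_max)
    return im_scale


# 375x500 image: shorter side 375 -> 600, longer side 500 -> 800, within the cap
scale_a = compute_im_scale(375, 500, target_size=600, max_size=1000)
# 300x900 image: a factor of 2.0 would give an 1800-pixel long side, so the cap applies
scale_b = compute_im_scale(300, 900, target_size=600, max_size=1000)
print(scale_a, scale_b)
```

The first image is scaled by the shorter side, while the second is limited by the longer side, so the shorter side ends up smaller than target_size.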

Both of these functions are called from ./lib/roi_data_layer/minibatch.py.

That file also defines two functions; get_minibatch() calls the helper _get_image_blob().

 def get_minibatch(roidb, num_classes):
   """Given a roidb, construct a minibatch sampled from it."""
   num_images = len(roidb)
   # Sample random scales to use for each image in this batch
   random_scale_inds = npr.randint(0, high=len(cfg.TRAIN.SCALES),
                   size=num_images)
   assert(cfg.TRAIN.BATCH_SIZE % num_images == 0), \
     'num_images ({}) must divide BATCH_SIZE ({})'. \
     format(num_images, cfg.TRAIN.BATCH_SIZE)

   # Get the input image blob, formatted for caffe
   im_blob, im_scales = _get_image_blob(roidb, random_scale_inds)

   blobs = {'data': im_blob}

   assert len(im_scales) == 1, "Single batch only"
   assert len(roidb) == 1, "Single batch only"

   # gt boxes: (x1, y1, x2, y2, cls)
   if cfg.TRAIN.USE_ALL_GT:
     # Include all ground truth boxes
     gt_inds = np.where(roidb[0]['gt_classes'] != 0)[0]
   else:
     # For the COCO ground truth boxes, exclude the ones that are ''iscrowd''
     # NOTE: `&` binds tighter than `!=`, so the line below parses as
     # gt_classes != (0 & ...), which reduces to gt_classes != 0.
     gt_inds = np.where(roidb[0]['gt_classes'] != 0 & np.all(roidb[0]['gt_overlaps'].toarray() > -1.0, axis=1))[0]
   gt_boxes = np.empty((len(gt_inds), 5), dtype=np.float32)
   gt_boxes[:, 0:4] = roidb[0]['boxes'][gt_inds, :] * im_scales[0]
   gt_boxes[:, 4] = roidb[0]['gt_classes'][gt_inds]
   blobs['gt_boxes'] = gt_boxes
   blobs['im_info'] = np.array(
     [im_blob.shape[1], im_blob.shape[2], im_scales[0]],
     dtype=np.float32)

   return blobs
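One detail in the else branch of get_minibatch() is worth a closer look: the expression gt_classes != 0 & np.all(...) has no parentheses around the comparison. The sketch below uses plain Python values in place of the numpy arrays to show how the expression is parsed; numpy arrays follow the same operator precedence. The sample labels and flags are made up for illustration.

```python
# Hypothetical per-object values: class labels and a "not crowd" flag
gt_classes = [3, 0, 7, 5]
not_crowd = [True, True, False, True]

# As written: c != 0 & ok parses as c != (0 & ok), and 0 & ok is always 0
as_written = [c != 0 & ok for c, ok in zip(gt_classes, not_crowd)]

# With explicit parentheses the crowd flag is actually applied
parenthesized = [(c != 0) & ok for c, ok in zip(gt_classes, not_crowd)]

print(as_written)
print(parenthesized)
```

In the unparenthesized form the third object is kept even though its flag is False, so if excluding crowd boxes matters, wrapping the comparison in parentheses looks like the safer form.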

 def _get_image_blob(roidb, scale_inds):
   """Builds an input blob from the images in the roidb at the specified
   scales.
   """
   num_images = len(roidb)
   processed_ims = []
   im_scales = []
   for i in range(num_images):
     im = cv2.imread(roidb[i]['image'])
     if roidb[i]['flipped']:
       im = im[:, ::-1, :]
     target_size = cfg.TRAIN.SCALES[scale_inds[i]]
     im, im_scale = prep_im_for_blob(im, cfg.PIXEL_MEANS, target_size,
                     cfg.TRAIN.MAX_SIZE)
     im_scales.append(im_scale)
     processed_ims.append(im)

   # Create a blob to hold the input images
   blob = im_list_to_blob(processed_ims)

   return blob, im_scales
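_get_image_blob() ends by handing the processed images to im_list_to_blob(), which zero-pads them into a single array. The sketch below repeats that padding logic on two tiny made-up images so the effect is visible; the image sizes are assumptions for illustration only.

```python
import numpy as np


def im_list_to_blob(ims):
    """Stack images into one (N, H_max, W_max, 3) array, zero-padding smaller ones."""
    max_shape = np.array([im.shape for im in ims]).max(axis=0)
    blob = np.zeros((len(ims), max_shape[0], max_shape[1], 3), dtype=np.float32)
    for i, im in enumerate(ims):
        # Each image is copied into the top-left corner; the rest stays zero
        blob[i, 0:im.shape[0], 0:im.shape[1], :] = im
    return blob


# Two hypothetical images of different sizes, filled with ones
ims = [np.ones((2, 4, 3), dtype=np.float32),
       np.ones((3, 2, 3), dtype=np.float32)]
blob = im_list_to_blob(ims)
print(blob.shape)
```

Because get_minibatch() asserts that the batch holds exactly one image, this padding has nothing to do in the training path traced here: the blob is just the single image with a leading batch dimension.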

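get_minibatch() is in turn called by _get_next_minibatch(), a method that is itself called from the forward() method of the RoIDataLayer class in ./lib/roi_data_layer/layer.py.

At this point, since the RoIDataLayer class is used by the Network class, the whole chain is finally connected.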
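With the chain traced, one possible way to drop the training-time resize is to swap prep_im_for_blob() for a variant that only subtracts the mean and reports a scale of 1.0. The following is a minimal sketch under that assumption, not the repository's code: the function name prep_im_for_blob_no_resize is hypothetical, and the pixel means are placeholder values standing in for cfg.PIXEL_MEANS.

```python
import numpy as np


def prep_im_for_blob_no_resize(im, pixel_means):
    """Mean-subtract an image but keep its original resolution (hypothetical variant)."""
    im = im.astype(np.float32, copy=True)
    im -= pixel_means
    # A scale of 1.0 keeps gt_boxes and im_info consistent with the original size
    return im, 1.0


# Hypothetical 4x6 BGR image and placeholder per-channel means
im = np.full((4, 6, 3), 128, dtype=np.uint8)
pixel_means = np.array([[[102.9801, 115.9465, 122.7717]]])
out, scale = prep_im_for_blob_no_resize(im, pixel_means)
print(out.shape, scale)
```

Returning 1.0 matters because get_minibatch() multiplies the ground-truth boxes by the returned scale and stores it in im_info. The grep output also lists a resize in ./lib/model/test.py, so the test path would presumably need a matching change.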

The Faster R-CNN code really is bloated: again and again it defines a whole pile of functions for things a single function could handle. I give up!
