The prior_box layer

https://www.jianshu.com/p/5195165bbd06

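1. step_w and step_h play the same role as feat_stride in Faster R-CNN: they map each point on the feature map back to the original image. It also follows that min_size and max_size are specified directly in original-image pixels.

2. Take MobileNet-SSD as an example: https://github.com/chuanqi305/MobileNet-SSD/blob/master/train.prototxt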

layer {
  name: "conv11_mbox_priorbox"
  type: "PriorBox"
  bottom: "conv11"
  bottom: "data"
  top: "conv11_mbox_priorbox"
  prior_box_param {
    min_size: 60.0
    aspect_ratio: 2.0
    flip: true
    clip: false
    variance: 0.1
    variance: 0.1
    variance: 0.2
    variance: 0.2
    offset: 0.5
  }
}

layer {
  name: "conv13_mbox_priorbox"
  type: "PriorBox"
  bottom: "conv13"
  bottom: "data"
  top: "conv13_mbox_priorbox"
  prior_box_param {
    min_size: 105.0
    max_size: 150.0
    aspect_ratio: 2.0
    aspect_ratio: 3.0
    flip: true
    clip: false
    variance: 0.1
    variance: 0.1
    variance: 0.2
    variance: 0.2
    offset: 0.5
  }
}

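Only conv11 has 3 anchors per location; the other 5 layers have 6. conv11 sets min_size but no max_size and lists a single aspect_ratio, while the other 5 layers set both sizes and list two aspect ratios. With flip: true each listed ratio yields two boxes, so conv11 has 1 + 1*2 = 3 anchors and the other 5 layers have 1 + 1 + 2*2 = 6.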

In prior_box_layer.cpp, aspect_ratios_ stores the aspect ratios taken from this layer's param. If flip is true, each aspect ratio in the param is stored as two values: the ratio itself and its reciprocal.

  aspect_ratios_.clear();
  aspect_ratios_.push_back(1.);
  flip_ = prior_box_param.flip();
  for (int i = 0; i < prior_box_param.aspect_ratio_size(); ++i) {
    float ar = prior_box_param.aspect_ratio(i);
    bool already_exist = false;
    for (int j = 0; j < aspect_ratios_.size(); ++j) {  // check for duplicates
      if (fabs(ar - aspect_ratios_[j]) < 1e-6) {
        already_exist = true;
        break;
      }
    }
    if (!already_exist) {
      // if flip is true, store the aspect ratio and its reciprocal;
      // otherwise store only the aspect ratio itself
      aspect_ratios_.push_back(ar);
      if (flip_) {
        aspect_ratios_.push_back(1./ar);
      }
    }
  }
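The anchor counts quoted earlier (3 for conv11, 6 for the other layers) can be checked with a small standalone sketch. It is not Caffe code: `ExpandAspectRatios` and `NumPriorsPerLocation` are hypothetical helpers that repeat the dedupe-and-flip logic above and then assume the count is (number of stored ratios) * (number of min_size entries) + (number of max_size entries).

```cpp
#include <cmath>
#include <vector>

// Hypothetical helper (not part of Caffe): builds the stored ratio list the
// same way as the snippet above. Ratio 1 is always present, duplicates are
// skipped, and flip adds the reciprocal of every listed ratio.
std::vector<float> ExpandAspectRatios(const std::vector<float>& listed, bool flip) {
  std::vector<float> ratios;
  ratios.push_back(1.0f);
  for (float ar : listed) {
    bool already_exist = false;
    for (float existing : ratios) {
      if (std::fabs(ar - existing) < 1e-6f) {
        already_exist = true;
        break;
      }
    }
    if (!already_exist) {
      ratios.push_back(ar);
      if (flip) {
        ratios.push_back(1.0f / ar);
      }
    }
  }
  return ratios;
}

// Hypothetical helper: anchors per feature-map location, assuming one box per
// (stored ratio, min_size) pair plus one extra square per max_size.
int NumPriorsPerLocation(const std::vector<float>& listed, bool flip,
                         int num_min_sizes, int num_max_sizes) {
  const int num_ratios = static_cast<int>(ExpandAspectRatios(listed, flip).size());
  return num_ratios * num_min_sizes + num_max_sizes;
}
```

With the conv11 settings (`{2.0}`, flip, one min_size, no max_size) this gives 3; with the conv13 settings (`{2.0, 3.0}`, flip, one min_size, one max_size) it gives 6.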

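For each point, first compute the square anchor whose side length is min_size. Then, if max_size is set, compute the square whose side length is sqrt(min_size_ * max_size_). Finally, go through every aspect ratio in aspect_ratios_ (skipping 1) and compute box_width = min_size_ * sqrt(ar) and box_height = min_size_ / sqrt(ar). Because each ratio in the prototxt param is stored together with its reciprocal when flip is true, one listed ratio produces two anchors.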

  for (int h = 0; h < layer_height; ++h) {
    for (int w = 0; w < layer_width; ++w) {
      float center_x = (w + offset_) * step_w;
      float center_y = (h + offset_) * step_h;
      float box_width, box_height;
      for (int s = 0; s < min_sizes_.size(); ++s) {
        int min_size_ = min_sizes_[s];
        // first prior: aspect_ratio = 1, size = min_size
        box_width = box_height = min_size_;
        // xmin
        top_data[idx++] = (center_x - box_width / 2.) / img_width;
        // ymin
        top_data[idx++] = (center_y - box_height / 2.) / img_height;
        // xmax
        top_data[idx++] = (center_x + box_width / 2.) / img_width;
        // ymax
        top_data[idx++] = (center_y + box_height / 2.) / img_height;

        if (max_sizes_.size() > 0) {
          CHECK_EQ(min_sizes_.size(), max_sizes_.size());
          int max_size_ = max_sizes_[s];
          // second prior: aspect_ratio = 1, size = sqrt(min_size * max_size)
          box_width = box_height = sqrt(min_size_ * max_size_);
          // xmin
          top_data[idx++] = (center_x - box_width / 2.) / img_width;
          // ymin
          top_data[idx++] = (center_y - box_height / 2.) / img_height;
          // xmax
          top_data[idx++] = (center_x + box_width / 2.) / img_width;
          // ymax
          top_data[idx++] = (center_y + box_height / 2.) / img_height;
        }

        // rest of priors
        for (int r = 0; r < aspect_ratios_.size(); ++r) {
          float ar = aspect_ratios_[r];
          if (fabs(ar - 1.) < 1e-6) {
            continue;
          }
          box_width = min_size_ * sqrt(ar);
          box_height = min_size_ / sqrt(ar);
          // xmin
          top_data[idx++] = (center_x - box_width / 2.) / img_width;
          // ymin
          top_data[idx++] = (center_y - box_height / 2.) / img_height;
          // xmax
          top_data[idx++] = (center_x + box_width / 2.) / img_width;
          // ymax
          top_data[idx++] = (center_y + box_height / 2.) / img_height;
        }
      }
    }
  }
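To see the loop on concrete numbers, here is a minimal sketch that produces the anchors of a single feature-map cell. `NormBox` and `PriorsForCell` are hypothetical names, the function handles only one min_size, and a non-positive `max_size` stands for "max_size not set"; none of this is the Caffe API.

```cpp
#include <cmath>
#include <vector>

// Corner-form box with coordinates normalized by the image size.
struct NormBox { float xmin, ymin, xmax, ymax; };

// Hypothetical helper mirroring the loop above for one cell (h, w).
// aspect_ratios is the already expanded list (it may contain 1, which is skipped).
std::vector<NormBox> PriorsForCell(int h, int w, float step_h, float step_w,
                                   float offset, float min_size, float max_size,
                                   const std::vector<float>& aspect_ratios,
                                   float img_width, float img_height) {
  // Map the cell back to original-image pixels.
  const float center_x = (w + offset) * step_w;
  const float center_y = (h + offset) * step_h;
  std::vector<NormBox> boxes;
  auto add_box = [&](float box_width, float box_height) {
    boxes.push_back({(center_x - box_width / 2.0f) / img_width,
                     (center_y - box_height / 2.0f) / img_height,
                     (center_x + box_width / 2.0f) / img_width,
                     (center_y + box_height / 2.0f) / img_height});
  };
  // First prior: square with side min_size.
  add_box(min_size, min_size);
  // Second prior (only if max_size is set): square with side sqrt(min * max).
  if (max_size > 0.0f) {
    const float side = std::sqrt(min_size * max_size);
    add_box(side, side);
  }
  // Remaining priors: one per stored aspect ratio other than 1.
  for (float ar : aspect_ratios) {
    if (std::fabs(ar - 1.0f) < 1e-6f) {
      continue;
    }
    add_box(min_size * std::sqrt(ar), min_size / std::sqrt(ar));
  }
  return boxes;
}
```

For example, calling it with min_size 60, no max_size and ratios `{1, 2, 0.5}` returns 3 boxes, matching the conv11 count. Since clip is false in the prototxt, boxes near the border may have coordinates outside [0, 1].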

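3. Reshape shows that the output shape is (1, 2, layer_width * layer_height * num_priors_ * 4). The last dimension is the number of feature-map points times the number of anchors per point times the 4 coordinates of each anchor. For example, the first 4 values in the blob are the coordinates of the min_size square anchor at the first feature-map pixel, and the next 4 values are the coordinates of the same pixel's anchor derived from max_size (when max_size is set).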

void PriorBoxLayer<Dtype>::Reshape(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) {
  const int layer_width = bottom[0]->width();
  const int layer_height = bottom[0]->height();
  vector<int> top_shape(3, 1);
  // Since all images in a batch has same height and width, we only need to
  // generate one set of priors which can be shared across all images.
  top_shape[0] = 1;
  // 2 channels. First channel stores the mean of each prior coordinate.
  // Second channel stores the variance of each prior coordinate.
  top_shape[1] = 2;
  top_shape[2] = layer_width * layer_height * num_priors_ * 4;
  CHECK_GT(top_shape[2], 0);
  top[0]->Reshape(top_shape);
}
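The layout of the first channel can be written down as an index formula. The sketch below uses two hypothetical helpers (not Caffe functions) and assumes the anchors are written in the order cell row, cell column, anchor, coordinate, as in the generation loop.

```cpp
// Hypothetical helper: position, inside channel 0 of the flattened output, of
// coordinate `coord` (0 = xmin, 1 = ymin, 2 = xmax, 3 = ymax) of anchor `p`
// at feature-map cell (h, w).
int PriorCoordIndex(int h, int w, int p, int coord,
                    int layer_width, int num_priors) {
  return ((h * layer_width + w) * num_priors + p) * 4 + coord;
}

// Hypothetical helper: length of one channel, i.e. the third entry of the
// output shape computed in Reshape.
int ChannelLength(int layer_width, int layer_height, int num_priors) {
  return layer_width * layer_height * num_priors * 4;
}
```

As an illustration, for a 19 x 19 feature map with 3 anchors per cell (sizes chosen only for the example), one channel holds 19 * 19 * 3 * 4 = 4332 values, anchor 0 of the first cell occupies indices 0 to 3, and anchor 1 of the same cell starts at index 4.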

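Note that the output has 2 channels. The first channel stores the actual 4 coordinates of every anchor; the second channel stores the variances. variance_ is initialized in LayerSetUp with 4 values taken from the prototxt param, one for each of the 4 coordinates. Every anchor gets these same 4 variance values, so the second channel is simply those 4 values repeated once per anchor.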

  top_data += top[0]->offset(0, 1);
  if (variance_.size() == 1) {
    caffe_set<Dtype>(dim, Dtype(variance_[0]), top_data);
  } else {
    int count = 0;
    for (int h = 0; h < layer_height; ++h) {
      for (int w = 0; w < layer_width; ++w) {
        for (int i = 0; i < num_priors_; ++i) {
          for (int j = 0; j < 4; ++j) {
            top_data[count] = variance_[j];
            ++count;
          }
        }
      }
    }
  }
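The repetition pattern of the second channel is easy to reproduce in isolation. `BuildVarianceChannel` below is a hypothetical helper, and it assumes the variance list has either 1 entry (broadcast to every slot) or 4 entries (one per coordinate), matching the two branches above.

```cpp
#include <vector>

// Hypothetical helper: builds the variance channel for num_cells feature-map
// cells with num_priors anchors each. Assumes variance.size() is 1 or 4.
std::vector<float> BuildVarianceChannel(int num_cells, int num_priors,
                                        const std::vector<float>& variance) {
  std::vector<float> channel;
  channel.reserve(static_cast<std::size_t>(num_cells) * num_priors * 4);
  for (int i = 0; i < num_cells * num_priors; ++i) {
    for (int j = 0; j < 4; ++j) {
      // A single variance is used for all coordinates; otherwise one per coordinate.
      channel.push_back(variance.size() == 1 ? variance[0] : variance[j]);
    }
  }
  return channel;
}
```

With the prototxt values 0.1, 0.1, 0.2, 0.2, the channel reads 0.1, 0.1, 0.2, 0.2, 0.1, 0.1, 0.2, 0.2, and so on, one group of 4 per anchor.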

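4. http://www.360doc.com/content/17/0810/10/10408243_678091430.shtml

Both snippets below come from the DecodeBBox function in bbox_util.cpp. The prior_variance output by the prior_box layer is just a coefficient that multiplies the bounding box regression value. In Faster R-CNN the bounding box regression is added directly onto the anchor coordinates; SSD can multiply the regression by a coefficient first. DecodeBBox can also decode the Faster R-CNN way, without this scaling, and the choice is controlled by a parameter. The second snippet additionally multiplies each offset by the width or height of the prior box.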

    else {
      // variance is encoded in bbox, we need to scale the offset accordingly.
      decode_bbox->set_xmin(
          prior_bbox.xmin() + prior_variance[0] * bbox.xmin());
      decode_bbox->set_ymin(
          prior_bbox.ymin() + prior_variance[1] * bbox.ymin());
      decode_bbox->set_xmax(
          prior_bbox.xmax() + prior_variance[2] * bbox.xmax());
      decode_bbox->set_ymax(
          prior_bbox.ymax() + prior_variance[3] * bbox.ymax());
    }

    else {
      // variance is encoded in bbox, we need to scale the offset accordingly.
      decode_bbox->set_xmin(
          prior_bbox.xmin() + prior_variance[0] * bbox.xmin() * prior_width);
      decode_bbox->set_ymin(
          prior_bbox.ymin() + prior_variance[1] * bbox.ymin() * prior_height);
      decode_bbox->set_xmax(
          prior_bbox.xmax() + prior_variance[2] * bbox.xmax() * prior_width);
      decode_bbox->set_ymax(
          prior_bbox.ymax() + prior_variance[3] * bbox.ymax() * prior_height);
    }

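5. https://zhuanlan.zhihu.com/p/33544892 explains how the min_size of each layer's priors is determined.

For the later feature maps, the prior box scale increases linearly according to the formula given in that article, but the scale ratios are first multiplied by 100. The step is then $\lfloor \frac{\lfloor s_{max}\times 100\rfloor - \lfloor s_{min}\times 100\rfloor}{m-1}\rfloor=17$, so the $s_k$ of these feature maps are $20, 37, 54, 71, 88$. Dividing these ratios by 100 and multiplying by the image size gives scales of $60, 111, 162, 213, 264$ for these feature maps; this calculation follows the SSD Caffe source code. Putting everything together, the prior box scales of the feature maps are $30, 60, 111, 162, 213, 264$.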
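The arithmetic can be replayed in a few lines. `PriorScales` is a hypothetical helper, and the inputs used in the example (ratios of 20 and 90 percent, 5 feature maps, a 300-pixel image) are assumptions inferred from the numbers quoted above rather than values stated in this post.

```cpp
#include <vector>

// Hypothetical helper: scales of the m later feature maps, given the minimum
// and maximum scale as integer percentages of the image size.
std::vector<int> PriorScales(int min_ratio, int max_ratio, int m, int img_size) {
  // Integer division floors the step for positive values: (90 - 20) / 4 = 17.
  const int step = (max_ratio - min_ratio) / (m - 1);
  std::vector<int> scales;
  for (int k = 0; k < m; ++k) {
    const int ratio = min_ratio + k * step;       // 20, 37, 54, 71, 88
    scales.push_back(ratio * img_size / 100);     // back to pixels
  }
  return scales;
}
```

Under those assumptions `PriorScales(20, 90, 5, 300)` yields 60, 111, 162, 213, 264; the scale of 30 for the first feature map is set separately, as listed in the text.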
