DeepVO: Towards End-to-End Visual Odometry with Deep Recurrent Convolutional Neural Networks

1、Introduction

Using deep learning (DL) to solve the visual odometry (VO) problem: End-to-End VO with RCNN

2、Network structure

a、CNN based Feature Extraction

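　　The paper uses the KITTI dataset.

　　The CNN part has 9 convolutional layers. Every convolutional layer except Conv6 is followed by one ReLU layer, giving 17 layers in total (9 convolutions + 8 ReLUs).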
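The layer count can be checked with a minimal sketch. The layer names below are assumptions made for illustration (the notes only give the number of convolutions and the Conv6 exception), and `build_layer_list` is a hypothetical helper, not code from the paper.

```python
# Assumed layer names; the notes only state "9 conv layers, no ReLU after Conv6".
conv_names = ["Conv1", "Conv2", "Conv3", "Conv3_1", "Conv4",
              "Conv4_1", "Conv5", "Conv5_1", "Conv6"]


def build_layer_list(names, no_relu=("Conv6",)):
    """Return the ordered layer list: each conv is followed by a ReLU unless excluded."""
    layers = []
    for name in names:
        layers.append(name)
        if name not in no_relu:
            layers.append("ReLU_" + name)
    return layers


layers = build_layer_list(conv_names)
print(len(layers))  # 9 convolutions + 8 ReLUs
```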

b、RNN based Sequential Modelling

　　RNN is different from CNN in that it maintains memory of its hidden states over time and has feedback loops among them, which enables its current hidden state to be a function of the previous ones.

　　Given a convolutional feature x_k at time k, an RNN updates at time step k by

　　h_k = H(W_{xh} x_k + W_{hh} h_{k-1} + b_h)
　　y_k = W_{hy} h_k + b_y

　　h_k and y_k are the hidden state and output at time k respectively.

　　W terms denote corresponding weight matrices.

　　b terms denote bias vectors.

　　H is an element-wise nonlinear activation function.
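A minimal sketch of one such RNN update in plain Python, assuming the standard form h_k = H(W_xh x_k + W_hh h_{k-1} + b_h), y_k = W_hy h_k + b_y with H = tanh. The helper names (`matvec`, `vadd`, `rnn_step`) and the toy sizes are illustrative assumptions, not taken from the paper.

```python
import math


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def vadd(*vectors):
    """Element-wise sum of equally sized vectors."""
    return [sum(vals) for vals in zip(*vectors)]


def rnn_step(x_k, h_prev, W_xh, W_hh, b_h, W_hy, b_y, H=math.tanh):
    """One vanilla RNN update: returns (h_k, y_k)."""
    pre = vadd(matvec(W_xh, x_k), matvec(W_hh, h_prev), b_h)
    h_k = [H(a) for a in pre]            # H applied element-wise
    y_k = vadd(matvec(W_hy, h_k), b_y)   # linear read-out
    return h_k, y_k


# Toy sizes (assumed): 2-dim feature, 2-dim hidden state, 1-dim output.
W_xh = [[0.5, 0.0], [0.0, 0.5]]
W_hh = [[0.1, 0.0], [0.0, 0.1]]
b_h = [0.0, 0.0]
W_hy = [[1.0, 1.0]]
b_y = [0.0]

h = [0.0, 0.0]
for x in ([1.0, 0.0], [0.0, 1.0]):
    # The hidden state carries information from the previous step forward.
    h, y = rnn_step(x, h, W_xh, W_hh, b_h, W_hy, b_y)
```

After the second step, the first component of `h` is non-zero only because of the first input, which is the "memory" the paragraph above describes.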

　　LSTM

Folded and unfolded LSTMs and internal structure of its unit.
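The symbol definitions in this part fit the standard LSTM update. Assuming that form (with x_k the input, h_{k-1} the previous hidden state and c_{k-1} the previous memory cell), the update at time step k can be written as:

```latex
\begin{aligned}
i_k &= \sigma(W_{xi} x_k + W_{hi} h_{k-1} + b_i) \\
f_k &= \sigma(W_{xf} x_k + W_{hf} h_{k-1} + b_f) \\
g_k &= \tanh(W_{xg} x_k + W_{hg} h_{k-1} + b_g) \\
c_k &= f_k \odot c_{k-1} + i_k \odot g_k \\
o_k &= \sigma(W_{xo} x_k + W_{ho} h_{k-1} + b_o) \\
h_k &= o_k \odot \tanh(c_k)
\end{aligned}
```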

　　⊙ is the element-wise product of two vectors.

　　σ is sigmoid non-linearity.

　　tanh is hyperbolic tangent non-linearity.

　　W terms denote corresponding weight matrices.

　　b terms denote bias vectors.

　　i_k, f_k, g_k, c_k and o_k are the input gate, forget gate, input modulation gate, memory cell and output gate, respectively.

　　Each of the LSTM layers has 1000 hidden states.
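A minimal sketch of one LSTM step in plain Python, assuming the standard gate equations (sigmoid for the input, forget and output gates, tanh for the input modulation gate). The function names, the `params` layout and the 1-dimensional toy sizes are assumptions for illustration; the network in the notes uses 1000 hidden states per layer.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def affine(W_x, x, W_h, h, b):
    """Compute W_x x + W_h h + b for list-of-rows matrices."""
    return [sum(wx * xi for wx, xi in zip(rx, x)) +
            sum(wh * hi for wh, hi in zip(rh, h)) + bi
            for rx, rh, bi in zip(W_x, W_h, b)]


def lstm_step(x_k, h_prev, c_prev, params):
    """One LSTM update; params maps a gate name to (W_x, W_h, b)."""
    def gate(name, act):
        W_x, W_h, b = params[name]
        return [act(a) for a in affine(W_x, x_k, W_h, h_prev, b)]

    i_k = gate("i", sigmoid)     # input gate
    f_k = gate("f", sigmoid)     # forget gate
    g_k = gate("g", math.tanh)   # input modulation gate
    o_k = gate("o", sigmoid)     # output gate
    # Memory cell: keep part of the old cell, add part of the new candidate.
    c_k = [f * c + i * g for f, c, i, g in zip(f_k, c_prev, i_k, g_k)]
    h_k = [o * math.tanh(c) for o, c in zip(o_k, c_k)]
    return h_k, c_k


# Toy parameters (assumed): 1-dim input and hidden state, all weights and biases zero,
# so every sigmoid gate evaluates to 0.5 and the modulation gate to 0.
params = {name: ([[0.0]], [[0.0]], [0.0]) for name in "ifgo"}
h_k, c_k = lstm_step([1.0], [0.0], [2.0], params)
```

With these toy parameters the forget gate halves the previous memory cell, so `c_k` is `[1.0]` and `h_k` is `[0.5 * tanh(1.0)]`.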

3、Loss Function and Optimisation

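　　The network models the conditional probability of the poses Y_t = (y₁, . . . , y_t) given a sequence of monocular RGB images X_t = (x₁, . . . , x_t) up to time t:

　　p(Y_t | X_t) = p(y₁, . . . , y_t | x₁, . . . , x_t)

　　Optimal parameters:

　　θ* = argmax_θ p(Y_t | X_t; θ)

　　The hyperparameters θ of the DNNs are learned by minimising the mean squared error of all positions and orientations:

　　θ* = argmin_θ (1/N) Σ_{i=1}^{N} Σ_{k=1}^{t} ( ‖p̂_k − p_k‖₂² + κ ‖φ̂_k − φ_k‖₂² )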

　　(p_k, φ_k) is the ground truth pose.

　　(p̂_k, φ̂_k) is the estimated pose.

　　κ (100 in the experiments) is a scale factor to balance the weights of positions and orientations.

　　N is the number of samples.

　　The orientation φ is represented by Euler angles rather than a quaternion, since a quaternion is subject to an extra unit-norm constraint which hinders the optimisation problem of DL.
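A minimal sketch of this loss in plain Python, assuming it is the mean over N samples of Σ_k ( ‖p̂_k − p_k‖² + κ ‖φ̂_k − φ_k‖² ) with κ = 100. The function name `pose_loss`, the nested-list data layout and the toy values are assumptions for illustration.

```python
def pose_loss(pred, truth, kappa=100.0):
    """Mean over samples of the summed squared position error
    plus kappa times the squared orientation (Euler angle) error.

    pred / truth: list of samples; each sample is a list of (p, phi) pairs,
    where p is a 3-element position and phi a 3-element Euler-angle vector.
    """
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    total = 0.0
    for sample_pred, sample_true in zip(pred, truth):
        for (p_hat, phi_hat), (p, phi) in zip(sample_pred, sample_true):
            total += sq_dist(p_hat, p) + kappa * sq_dist(phi_hat, phi)
    return total / len(pred)  # divide by N, the number of samples


# Toy data (assumed): one sample with two time steps.
truth = [[((0.0, 0.0, 0.0), (0.0, 0.0, 0.0)),
          ((1.0, 0.0, 0.0), (0.0, 0.0, 0.0))]]
pred = [[((0.1, 0.0, 0.0), (0.0, 0.0, 0.0)),
         ((1.0, 0.0, 0.0), (0.01, 0.0, 0.0))]]

loss = pose_loss(pred, truth)
```

In this toy case a 0.1 position error and a 0.01 orientation error contribute equally (about 0.01 each), which illustrates how κ = 100 rescales the much smaller angular errors.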
