Deep Reinforcement Learning with Iterative Shift for Visual Tracking

2019-07-30 14:55:31

Paper: http://openaccess.thecvf.com/content_ECCV_2018/papers/Liangliang_Ren_Deep_Reinforcement_Learning_ECCV_2018_paper.pdf

Code: not found yet.

Paper List of Tracking with Deep Reinforcement Learning: https://github.com/wangxiao5791509/Tracking-with-Deep-Reinforcement-Learning

1. Background and Motivation:

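The contributions of this paper are:

1). An Actor-Critic network is proposed to predict the motion parameters of the object and to select actions according to the tracking status; different actions receive different rewards depending on how they affect the tracking result.

2). Tracking is treated as an iterative shift problem rather than a CNN classification problem, which makes localization more efficient and more accurate.

3). The method achieves good results on the OTB and TC128 datasets.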

2. Approach:

The proposed method consists of three modules: 1). the actor network; 2). the prediction network; 3). the critic network.

2.1 Iterative Shift for Visual Tracking

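This paper treats tracking as an iterative shift problem. Given the current frame and the previous tracking result, the prediction network iteratively shifts the candidate box until it locks onto the target. Meanwhile, the actor network makes a decision based on the tracking status: whether to update the model and the prediction network, or even to restart the tracking process.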
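To make one "shift" step concrete, here is a minimal sketch of applying a predicted relative offset to a candidate box. The parameterization is an assumption borrowed from common bounding-box regression (offsets scaled by box size, log-scale size changes); the notes above do not confirm which form the paper uses, and `apply_shift` is a hypothetical helper name.

```python
import math
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x, y, w, h)


def apply_shift(box: Box, delta: Box) -> Box:
    """Apply a relative shift (dx, dy, dw, dh) to a box.

    Assumed parameterization (not confirmed by the notes):
    dx and dy are offsets in units of the box width and height,
    dw and dh are log-scale changes of width and height.
    """
    x, y, w, h = box
    dx, dy, dw, dh = delta
    return (x + dx * w, y + dy * h, w * math.exp(dw), h * math.exp(dh))


# Move the box half a width to the right and half a height up,
# keeping its size unchanged.
shifted = apply_shift((10.0, 10.0, 4.0, 2.0), (0.5, -0.5, 0.0, 0.0))
```

Because the offsets are relative to the current box size, repeating small shifts behaves the same way for small and large targets.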

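Formally, given the tracking result of the previous frame $l_{t-1} = \{x_{t-1}, y_{t-1}, w_{t-1}, h_{t-1}\}$ and its feature $f_{t-1}^*$, we first use that location to get a rough position in the current frame, crop the corresponding feature $f_t$, and then apply the prediction network: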

where the output of the prediction network is:

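In addition, the tracking status can also affect the final result, i.e., the prediction network needs to be updated at the right time. To make decisions jointly from the target's motion status and the tracker's status, we use the actor network to generate actions according to a multinomial distribution: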

where $a_k \in A = \{\text{continue}, \text{stop \& update}, \text{stop \& ignore}, \text{restart}\}$.
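Drawing an action from such a distribution can be sketched as follows. This is only an illustration: the probabilities would come from the actor network, which is not modeled here, and `sample_action` is a hypothetical helper.

```python
import random
from typing import Optional, Sequence

ACTIONS = ("continue", "stop & update", "stop & ignore", "restart")


def sample_action(probs: Sequence[float],
                  rng: Optional[random.Random] = None) -> str:
    """Sample one action name from a categorical distribution over ACTIONS."""
    if len(probs) != len(ACTIONS):
        raise ValueError("need exactly one probability per action")
    if abs(sum(probs) - 1.0) > 1e-6:
        raise ValueError("probabilities must sum to 1")
    source = rng if rng is not None else random
    return source.choices(ACTIONS, weights=probs, k=1)[0]


# A distribution that puts all of its mass on one action always returns it.
action = sample_action([0.0, 0.0, 1.0, 0.0])
```

Sampling (rather than always taking the most likely action) is what lets the policy explore during training; at test time one could instead pick the action with the highest probability.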

The continue action means the model is not updated and shifting goes on; the shift itself is given by the prediction network.

The stop & update action means shifting stops and the model is updated, i.e.:

The stop & ignore action means shifting stops without updating the model: the target has been found, but the tracker cannot tell whether an update is needed.

The restart action restarts the tracking process for the current frame: the iteration is restarted by re-sampling a random set of candidate patches $L_t$ around $l_{t-1}^*$ in $I_t$ and selecting the patch with the highest Q-value.
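The four actions described above can be put together into a per-frame loop. The sketch below is a simplified reading of these notes, not the authors' implementation: every callable (`predict_shift`, `choose_action`, `update_model`, `sample_candidates`, `q_value`) is a hypothetical stand-in for a network or sampling routine, and `max_iters` is an assumed safety cap.

```python
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]  # (x, y, w, h)


def track_frame(
    prev_box: Box,
    predict_shift: Callable[[Box], Box],          # returns the shifted box
    choose_action: Callable[[Box, int], str],     # actor decision at step k
    update_model: Callable[[Box], None],          # model update hook
    sample_candidates: Callable[[Box], List[Box]],
    q_value: Callable[[Box], float],              # critic score of a candidate
    max_iters: int = 20,
) -> Box:
    """Iteratively shift the box in one frame until the actor says stop."""
    box = prev_box
    for k in range(max_iters):
        action = choose_action(box, k)
        if action == "continue":
            # Keep shifting; no model update.
            box = predict_shift(box)
        elif action == "stop & update":
            # Target found and judged reliable: update, then finish.
            update_model(box)
            return box
        elif action == "stop & ignore":
            # Target found, but skip the update.
            return box
        elif action == "restart":
            # Re-sample around the previous result, keep the best-scored patch.
            box = max(sample_candidates(prev_box), key=q_value)
        else:
            raise ValueError(f"unknown action: {action}")
    return box
```

Passing the networks in as plain callables keeps the control flow testable with simple stubs, independent of any deep learning framework.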

DRL-IS with Actor-Critic:

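We use the Actor-Critic (AC) algorithm to train the three networks jointly. First, the authors define the rewards according to tracking performance: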

For the continue action, the reward is based on:

For the stop & update and stop & ignore actions, the reward is determined by the IoU between the final prediction and the ground truth:

For the restart action, a positive reward is given when the IoU between the final prediction and the ground truth is below 0.4.
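Since these rewards are all driven by IoU, a small sketch of the computation may help. The IoU function is standard; the box convention (top-left corner plus size), the reward magnitudes of +1 and -1, and the negative reward when the IoU is not below the threshold are assumptions made only for illustration. The 0.4 threshold is the one stated above.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # assumed: (x, y) top-left, then w, h


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def restart_reward(pred: Box, gt: Box, threshold: float = 0.4) -> float:
    """Positive reward when restarting was justified (prediction is poor).

    The +1 / -1 values are illustrative assumptions.
    """
    return 1.0 if iou(pred, gt) < threshold else -1.0


# Two 2x2 boxes offset by one unit overlap in a 1x2 strip: IoU = 2 / 6.
overlap = iou((0.0, 0.0, 2.0, 2.0), (1.0, 0.0, 2.0, 2.0))
```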

Next, we compute the Q-value of each action.

For the continue action, the Q-value is:

For the other three actions, it is computed as follows:

Finally, the two functions are optimized as follows:

where $s'$ is the next state and $a'$ is the selected optimal action; the action-value function and the value function are computed as follows:
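As a rough illustration of how $s'$ and $a'$ enter such an update, the sketch below uses the textbook one-step temporal-difference target, in which the best action value in the next state is bootstrapped. This is a generic form assumed for illustration, not necessarily the exact equations of the paper; `q_target`, `td_error` and the discount `gamma` are hypothetical names and values.

```python
from typing import Dict


def q_target(reward: float, next_q_values: Dict[str, float],
             gamma: float = 0.9, terminal: bool = False) -> float:
    """One-step target: r + gamma * max over a' of Q(s', a').

    For a terminal step (for example after a stop action) there is no
    next state, so the target is just the reward.
    """
    if terminal or not next_q_values:
        return reward
    return reward + gamma * max(next_q_values.values())


def td_error(q_sa: float, target: float) -> float:
    """Difference the critic is trained to shrink (for example, squared)."""
    return target - q_sa


# Reward 1.0, best next action worth 2.0, discount 0.5: target = 2.0.
target = q_target(1.0, {"continue": 0.5, "restart": 2.0}, gamma=0.5)
```

In an actor-critic setup the critic is fit to such targets, while the actor is pushed toward actions whose value exceeds the current estimate.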

The overall algorithm is as follows:
