First, a depth spatial-temporal descriptor is developed to extract the local regions of interest in the depth image. Then the intensity spatial-temporal descriptor and the depth spatial-temporal descriptor are combined and fed into a linear coding framework to get an effective feature vector, which can be used for action classification. Finally, extensive experiments are conducted on a publicly available RGB-D action recognition dataset and the proposed method shows promising results.

This is the main novelty: A linear coding framework is developed to fuse the intensity spatial-temporal descriptor and the depth spatial-temporal descriptor to form a robust feature vector. In addition, we further exploit the temporal intrinsics of the video sequence and design a new pooling technique to improve the description performance.

Feature extraction

STIPs (space-time interest points) extend SIFT (Scale-Invariant Feature Transform) to 3-dimensional space and use one of Harris3D, Cuboid, or Hessian as the detector.

http://www.di.ens.fr/~laptev/download.html

The patches are split with overlap.

This effectively serves as preprocessing of the depth map.

So the STIPs features in the RGB images capture more of the detailed characteristics of the subjects themselves, while in the depth images they capture more of the shape of the subjects.

Coding approaches

vector quantization (VQ)

One disadvantage of VQ is that it introduces significant quantization errors, since only one element of the codebook is selected to represent each descriptor. To remedy this, one usually has to use a nonlinear SVM as the classifier, which tries to compensate for the quantization errors. However, with nonlinear kernels the SVM incurs a high training cost in both computation and storage. Given these drawbacks, locality-constrained linear coding (LLC), a more accurate and efficient coding approach [9], is adopted to replace VQ in this paper.
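To make the difference concrete, here is a minimal sketch of the two coding schemes in NumPy. It follows the commonly used approximated LLC solution (code each descriptor over its k nearest codewords with a sum-to-one constraint); the function names, `k`, and `beta` are my own assumptions for illustration, not the paper's settings.

```python
import numpy as np

def vq_code(x, codebook):
    """Vector quantization: a one-hot code selecting the nearest codeword."""
    d2 = np.sum((codebook - x) ** 2, axis=1)
    code = np.zeros(len(codebook))
    code[np.argmin(d2)] = 1.0
    return code

def llc_code(x, codebook, k=5, beta=1e-4):
    """Approximated LLC: reconstruct x from its k nearest codewords.

    k and beta are illustrative values (assumptions), not taken from the paper.
    """
    k = min(k, len(codebook))
    d2 = np.sum((codebook - x) ** 2, axis=1)
    idx = np.argsort(d2)[:k]                 # indices of the k nearest codewords
    z = codebook[idx] - x                    # shift the neighbours so x is the origin
    C = z @ z.T                              # local covariance, shape (k, k)
    # Regularise so the system stays solvable even when C is singular.
    C += (beta * np.trace(C) + 1e-12) * np.eye(k)
    w = np.linalg.solve(C, np.ones(k))
    w /= w.sum()                             # enforce the sum-to-one constraint
    code = np.zeros(len(codebook))
    code[idx] = w
    return code

# Toy example: a descriptor lying halfway between two codewords.
codebook = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
x = np.array([1.0, 0.0])
err_vq = np.linalg.norm(x - vq_code(x, codebook) @ codebook)
err_llc = np.linalg.norm(x - llc_code(x, codebook, k=2) @ codebook)
print("VQ reconstruction error:", err_vq)
print("LLC reconstruction error:", err_llc)
```

In this toy case VQ has to snap the descriptor onto a single codeword, while LLC spreads the weight over both neighbours and reconstructs it much more closely. That smaller quantization error is what lets LLC get by with a linear SVM.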

Pooling strategy

As with the VQ coding approach, the LLC coding coefficients c_i need to be combined (pooled) into a single global representation of the sample for classification.
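As a rough sketch of what pooling does here: max pooling followed by L2 normalization is the usual choice with LLC codes. The second function pools separately over temporal segments of the video. It is only a hypothetical illustration of how temporal order could be kept, and is not the pooling scheme proposed in the paper, whose details these notes do not record. All names and parameters are assumptions.

```python
import numpy as np

def l2_normalize(v):
    """Scale v to unit L2 norm (left unchanged if it is all zeros)."""
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

def max_pool(codes):
    """Pool an (N, M) array of codes into a single (M,) vector."""
    return l2_normalize(codes.max(axis=0))

def temporal_max_pool(codes, frames, n_frames, n_segments=2):
    """Hypothetical variant: max-pool within equal-length temporal segments.

    frames[i] is the frame index of the interest point behind codes[i].
    The per-segment vectors are concatenated, giving length n_segments * M.
    """
    frames = np.asarray(frames)
    seg = np.minimum(frames * n_segments // n_frames, n_segments - 1)
    parts = []
    for s in range(n_segments):
        mask = seg == s
        if mask.any():
            parts.append(codes[mask].max(axis=0))
        else:
            parts.append(np.zeros(codes.shape[1]))  # no points in this segment
    return l2_normalize(np.concatenate(parts))

# Two codes over a 3-word codebook, from frames 0 and 9 of a 10-frame clip.
codes = np.array([[0.2, 0.8, 0.0],
                  [0.6, 0.0, 0.4]])
print(max_pool(codes))
print(temporal_max_pool(codes, frames=[0, 9], n_frames=10))
```

Plain max pooling discards when each interest point occurred. The segment-wise version keeps a coarse notion of temporal order at the cost of a longer feature vector.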

Dataset

RGBD-HuDaAct [1] video database

Each video sample consists of synchronized and calibrated RGB-D frame sequences, with each frame containing an RGB image and a depth image. The RGB and depth images in each frame have been calibrated with a standard stereo-calibration method available in OpenCV, so that points with the same coordinates in the RGB and depth images correspond to each other.

A concise paper that pointed me in the right direction.
