First, a depth spatial-temporal descriptor is developed to extract the interested local regions in depth image. Then the intensity spatial-temporal descriptor and the depth spatial-temporal descriptor are combined and feeded into a linear coding framework to get an effective feature vector, which can be used for action classification. Finally, extensive experiments are conducted on a publicly available RGB-D action recognition dataset and the proposed method shows promising results.

创新点就这个了:A linear coding framework is developed to fuse the intensity spatial-temporal descriptor and the depth spatial-temporal descriptor to form robust feature vector. In addition, we further exploit the temporal intrinsics of the video sequence and design a new pooling technology to improve the description performance.

Feature extraction

STIPs is an extension of SIFT (Scale-Invariant-Feature-Transform) in 3-dimensional space and uses one of Harris3D, Cuboid or Hessian as the detector.

http://www.di.ens.fr/~laptev/download.html

patch的分割有重叠~~

算是对depth map的预处理了 ~~

So the STIPs features in the RGB images disclose more detail characters of the subjects themselves while in the depth images they extract more characters of the shape of the subjects.

Coding approaches

vector quantization (VQ)

One disadvantage of the VQ is that it introduces significant quantization errors since only one element of the codebook is selected to represent the descriptor. To remedy this, one usually has to design a nonlinear SVM as the classifier which tries to compensate the quantization errors. However, using nonlinear kernels, the SVM has to pay a high training cost, including computation and storage. Considering the above defects, localityconstrained linear coding (LLC) –a more accurate and efficient coding approach[9]is adopted to replace VQ in this paper

Pooling strategy

Similar to the VQ coding approach, the LLC coding coefficients ci are expected to be combined into a global representation of the sample for classification.

DataSet

RGBD-HuDaAct[1]video database

The video sample consists of synchronized and calibrated RGB-D frame sequences, which contains in each frame a RGB image and a depth image, respectively. The RGB and depth images in each frame have been calibrated with a standard stereocalibration method available in OpenCV so that the points with the same coordinate in RGB and depth images are corresponded.

一片简洁的paper ,给我指明了方向 ~~

RGB-D action recognition using linear coding的更多相关文章

  1. Multi-View Region Adaptive Multi-temporal DMM and RGB Action Recognition

    论文标题:Multi-View Region Adaptive Multi-temporal DMM and RGB Action Recognition 来源/作者机构情况: 解决问题/主要思想贡献 ...

  2. 201904:Action recognition based on 2D skeletons extracted from RGB videos

    论文标题:Action recognition based on 2D skeletons extracted from RGB videos 发表时间:02 April 2019 解决问题/主要思想 ...

  3. 行为识别(action recognition)相关资料

    转自:http://blog.csdn.net/kezunhai/article/details/50176209 ================华丽分割线=================这部分来 ...

  4. 论文列表 for Action recognition

    要读的论文: https://www.cnblogs.com/hizhaolei/p/10565405.html 骨架动作识别论文汇总 https://blog.csdn.net/bianxuewei ...

  5. 【ML】Two-Stream Convolutional Networks for Action Recognition in Videos

    Two-Stream Convolutional Networks for Action Recognition in Videos & Towards Good Practices for ...

  6. 论文笔记 | A Closer Look at Spatiotemporal Convolutions for Action Recognition

    ( 这篇博文为原创,如需转载本文请email我: leizhao.mail@qq.com, 并注明来源链接,THX!) 本文主要分享了一篇来自CVPR 2018的论文,A Closer Look at ...

  7. Skeleton-Based Action Recognition with Directed Graph Neural Network

    Skeleton-Based Action Recognition with Directed Graph Neural Network 摘要 因为骨架信息可以鲁棒地适应动态环境和复杂的背景,所以经常 ...

  8. Two-Stream Adaptive Graph Convolutional Network for Skeleton-Based Action Recognition

    Two-Stream Adaptive Graph Convolutional Network for Skeleton-Based Action Recognition 摘要 基于骨架的动作识别因为 ...

  9. Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition (ST-GCN)

    Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition 摘要 动态人体骨架模型带有进行动 ...

随机推荐

  1. 6、json支持

    package main import ( "encoding/json" "fmt") // Json 支持 type Response1 struct{ P ...

  2. [JSOI2008]火星人 hash+splay

    题目描述: 现在,火星人定义了一个函数 LCQ(x, y)LCQ(x,y),表示:该字符串中第 xx 个字符开始的字串,与该字符串中第 yy 个字符开始的字串,两个字串的公共前缀的长度.比方说,LCQ ...

  3. [NOI2008]志愿者招募 网络流 建模

    题目描述申奥成功后,布布经过不懈努力,终于成为奥组委下属公司人力资源部门的主管.布布刚上任就遇到了一个难题:为即将启动的奥运新项目招募一批短期志愿者.经过估算,这个项目需要N 天才能完成,其中第i 天 ...

  4. notepad++调用python3中文乱码

    使用notepad++,配置好快捷键调用python3,一切就绪,仿佛就差代码了,结果一使用, 中文乱码,一直没有好的解决办法. 最后只能在代码中增加一行重写向输出解决,示例如下: #!/usr/bi ...

  5. numpy基础篇-简单入门教程3

    np import numpy as np np.__version__ print(np.__version__) # 1.15.2 numpy.arange(start, stop, step, ...

  6. 移动端开发ios和安卓兼容问题

    移动端开发ios和安卓兼容问题 最近做移动端混合开的时候遇到一些安卓和iOS的兼容性问题,兼容想问题不仅在浏览器存在也在APP开发当中也会经常遇到这样的情况. 最近看了一下内容很不错的移动端开发相关的 ...

  7. 【ICM Technex 2018 and Codeforces Round #463 (Div. 1 + Div. 2, combined) B】Recursive Queries

    [链接] 我是链接,点我呀:) [题意] 在这里输入题意 [题解] 写个记忆化搜索. 接近O(n)的复杂度吧 [代码] #include <bits/stdc++.h> using nam ...

  8. Android Studio的Signature Versions选择,分别是什么意思

    转自原文 Android Studio的Signature Versions选择,分别是什么意思 打包一个文件的签名版本, 选V1打包出来的app是jar的(一般这种就是当做第三方导入项目来用的), ...

  9. Codeforces Round #105 (Div. 2) 148C Terse princess(脑洞)

    C. Terse princess time limit per test 1 second memory limit per test 256 megabytes input standard in ...

  10. Linux安全应用之防垃圾邮件服务器的构建

    Linux安全应用之防垃圾邮件服务器的构建 一.垃圾邮件产生的原因 垃圾邮件(SPAM) 也称作UCE(Unsoticited Commercial Email.未经许可的商业电子邮件)或UBE(Un ...