http://rogerioferis.com/VisualRecognitionAndSearch2014/Resources.html

Source Code

Non-exhaustive list of state-of-the-art implementations related to visual recognition and search. There is no warranty for the source code links below – use them at your own risk!

Feature Detection and Description

General Libraries:

Fast Keypoint Detectors for Real-time Applications:

  • FAST – High-speed corner detector implementation for a wide variety of platforms
  • AGAST – Even faster than the FAST corner detector. A multi-scale version of this method is used for the BRISK descriptor (ECCV
    2010).

Binary Descriptors for Real-Time Applications:

  • BRIEF – C++ code for a fast and accurate interest point descriptor (not invariant to rotations and scale) (ECCV 2010)
  • ORB – OpenCV implementation of the Oriented-Brief (ORB) descriptor (invariant to rotations,
    but not scale)
  • BRISK – Efficient Binary descriptor invariant to rotations and scale. It includes a Matlab mex interface. (ICCV 2011)
  • FREAK – Faster than BRISK (invariant to rotations and scale) (CVPR 2012)

SIFT and SURF Implementations:

Other Local Feature Detectors and Descriptors:

  • VGG Affine Covariant features – Oxford code for various affine covariant feature detectors and descriptors.
  • LIOP descriptor – Source code for the Local Intensity order Pattern (LIOP) descriptor (ICCV 2011).
  • Local Symmetry Features – Source code for matching of local symmetry features under large variations in lighting, age, and
    rendering style (CVPR 2012).

Global Image Descriptors:

  • GIST – Matlab code for the GIST descriptor
  • CENTRIST – Global visual descriptor for scene categorization and object detection (PAMI 2011)

Feature Coding and Pooling

  • VGG Feature Encoding Toolkit – Source code for various state-of-the-art feature encoding methods – including
    Standard hard encoding, Kernel codebook encoding, Locality-constrained linear encoding, and Fisher kernel encoding.
  • Spatial Pyramid Matching – Source code for feature pooling based on spatial pyramid matching (widely used for image classification)

Convolutional Nets and Deep Learning

  • Caffe – Fast C++ implementation of deep convolutional networks (GPU / CPU / ImageNet 2013 demonstration).
  • id=software:overfeat:start" style="color:rgb(165,88,88)">OverFeat – C++ library for integrated classification and localization of objects.

  • EBLearn – C++ Library for Energy-Based Learning. It includes several demos and step-by-step instructions to train classifiers based on
    convolutional neural networks.
  • Torch7 – Provides a matlab-like environment for state-of-the-art machine learning algorithms, including a fast implementation of convolutional neural
    networks.
  • Deep Learning - Various links for deep learning software.

Facial Feature Detection and Tracking

  • IntraFace – Very accurate detection and tracking of facial features (C++/Matlab API).

Part-Based Models

Attributes and Semantic Features

Large-Scale Learning

  • Additive Kernels – Source code for fast additive kernel SVM classifiers (PAMI 2013).
  • LIBLINEAR – Library for large-scale linear SVM classification.
  • VLFeat – Implementation for Pegasos SVM and Homogeneous Kernel map.

Fast Indexing and Image Retrieval

  • FLANN – Library for performing fast approximate nearest neighbor.
  • Kernelized LSH – Source code for Kernelized Locality-Sensitive Hashing (ICCV 2009).
  • ITQ Binary codes – Code for generation of small binary codes using Iterative Quantization and other baselines such as Locality-Sensitive-Hashing
    (CVPR 2011).
  • INRIA Image Retrieval – Efficient code for state-of-the-art large-scale image retrieval (CVPR 2011).

Object Detection

3D Recognition

Action Recognition


Datasets

Attributes

  • Animals with Attributes – 30,475 images of 50 animals classes with 6 pre-extracted feature representations for each image.
  • aYahoo and aPascal – Attribute annotations for images collected from Yahoo and Pascal VOC 2008.
  • FaceTracer – 15,000 faces annotated with 10 attributes and fiducial points.
  • PubFig – 58,797 face images of 200 people with 73 attribute classifier outputs.
  • LFW – 13,233 face images of 5,749 people with 73 attribute classifier outputs.
  • Human Attributes – 8,000 people with annotated attributes. Check also this link for
    another dataset of human attributes.
  • SUN Attribute Database – Large-scale scene attribute database with a taxonomy of 102 attributes.
  • ImageNet Attributes – Variety of attribute labels for the ImageNet dataset.
  • Relative attributes – Data for OSR and a subset of PubFig datasets. Check also this link for
    the WhittleSearch data.
  • Attribute Discovery Dataset – Images of shopping categories associated with textual descriptions.

Fine-grained Visual Categorization

Face Detection

  • FDDB – UMass face detection dataset and benchmark (5,000+ faces)
  • CMU/MIT – Classical face detection dataset.

Face Recognition

  • Face Recognition Homepage – Large collection of face recognition datasets.
  • LFW – UMass unconstrained face recognition dataset (13,000+ face images).
  • NIST Face Homepage – includes face recognition grand challenge (FRGC), vendor tests (FRVT) and others.
  • CMU Multi-PIE – contains more than 750,000 images of 337 people, with 15 different views and 19 lighting conditions.
  • FERET – Classical face recognition dataset.
  • Deng Cai’s face dataset in Matlab Format – Easy to use if you want play with simple face datasets including Yale,
    ORL, PIE, and Extended Yale B.
  • SCFace – Low-resolution face dataset captured from surveillance cameras.

Handwritten Digits

  • MNIST – large dataset containing a training set of 60,000 examples, and a test set of 10,000 examples.

Pedestrian Detection

Generic Object Recognition

  • ImageNet – Currently the largest visual recognition dataset in terms of number of categories and images.
  • Tiny Images – 80 million 32x32 low resolution images.
  • Pascal VOC – One of the most influential visual recognition datasets.
  • Caltech 101 / Caltech
    256
     – Popular image datasets containing 101 and 256 object categories, respectively.
  • MIT LabelMe – Online annotation tool for building computer vision databases.

Scene Recognition

Feature Detection and Description

  • VGG Affine Dataset – Widely used dataset for measuring performance of feature detection and description. CheckVLBenchmarksfor
    an evaluation framework.

Action Recognition

RGBD Recognition

state-of-the-art implementations related to visual recognition and search的更多相关文章

  1. Image Processing and Analysis_8_Edge Detection:Edge and line oriented contour detection State of the art ——2011

    此主要讨论图像处理与分析.虽然计算机视觉部分的有些内容比如特 征提取等也可以归结到图像分析中来,但鉴于它们与计算机视觉的紧密联系,以 及它们的出处,没有把它们纳入到图像处理与分析中来.同样,这里面也有 ...

  2. Convolutional Neural Networks for Visual Recognition

    http://cs231n.github.io/   里面有很多相当好的文章 http://cs231n.github.io/convolutional-networks/ Table of Cont ...

  3. 大规模视觉识别挑战赛ILSVRC2015各团队结果和方法 Large Scale Visual Recognition Challenge 2015

    Large Scale Visual Recognition Challenge 2015 (ILSVRC2015) Legend: Yellow background = winner in thi ...

  4. 论文笔记之: Bilinear CNN Models for Fine-grained Visual Recognition

    Bilinear CNN Models for Fine-grained Visual Recognition CVPR 2015 本文提出了一种双线性模型( bilinear models),一种识 ...

  5. CNN for Visual Recognition (01)

    CS231n: Convolutional Neural Networks for Visual Recognitionhttp://vision.stanford.edu/teaching/cs23 ...

  6. 【论文阅读】Deep Mixture of Diverse Experts for Large-Scale Visual Recognition

    导读: 本文为论文<Deep Mixture of Diverse Experts for Large-Scale Visual Recognition>的阅读总结.目的是做大规模图像分类 ...

  7. 目标检测--Spatial pyramid pooling in deep convolutional networks for visual recognition(PAMI, 2015)

    Spatial pyramid pooling in deep convolutional networks for visual recognition 作者: Kaiming He, Xiangy ...

  8. A Theoretical Analysis of Feature Pooling in Visual Recognition

    这篇是10年ICML的论文,但是它是从原理上来分析池化的原因,因为池化的好坏的确会影响到结果,比如有除了最大池化和均值池化,还有随机池化等等,在eccv14中海油在顶层加个空间金字塔池化的方法.可谓多 ...

  9. Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition

    Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition Kaiming He, Xiangyu Zh ...

随机推荐

  1. WPF界面设计技巧(9)—使用UI自动化布局

    原文:WPF界面设计技巧(9)-使用UI自动化布局 最近一直没时间更新这系列文章,因为我一直在埋头编写我的第一个WPF应用程序:MailMail 今天开始编写附属的加密/解密工具,对UI自动化布局有些 ...

  2. 局部敏感哈希-Locality Sensitive Hashing

    局部敏感哈希 转载请注明http://blog.csdn.net/stdcoutzyx/article/details/44456679 在检索技术中,索引一直须要研究的核心技术.当下,索引技术主要分 ...

  3. java中线程机制

    java中线程机制,一开始我们都用的单线程.现在接触到多线程了. 多线性首先要解决的问题是:创建线程,怎么创建线程的问题: 1.线程的创建: 四种常用的实现方法 1.继承Thread. Thread是 ...

  4. Java equals 和 hashcode 方法

    问题 面试时经常会问起字符串比较相关的问题, 总结一下,大体是如下几个: 1.字符串比较时用的什么方法,内部实现如何? 2.hashcode的作用,以及重写equal方法,为什么要重写hashcode ...

  5. Android实现隐藏状态栏和标题栏

    隐藏标题栏需要使用预定义样式:android:theme=”@android:style/Theme.NoTitleBar”. 隐藏状态栏:android:theme=”@android:style/ ...

  6. IT痴汉的工作现状13-吓唬电话

    那是一个普通的周末上午,稍微阴沉的天,使得暑气消退了好多.刚吃过早饭,我懒懒的浏览着CSDN论坛上有趣的问题和答案. 突然电话响起.是一个陌生的号码.我像往常一样接起电话,""您好 ...

  7. C++使用函数模板

    函数模板: 函数模板是蓝图或处方功能,编译器使用其发电功能系列中的新成员. 第一次使用时,新的功能是创建.从功能模板生成的函数的实例称为模板或模板的实例.函数模板的开始是keywordtemplate ...

  8. 怎么样MyEclipse配置Tomcat?

    1.下载tomcat免安装版.tomcat路径不包含空格 http://download.csdn.net/detail/u014112584/7549191 2.windows -preferenc ...

  9. [ios仿系列]仿支付宝手势解码

    呀~.这么快就转到ios阵营了???.android还有那么多坑呢???为此我也仅仅能啃着馒头留下屈辱的眼泪了. . 本次因为开发公司产品的android版,继而ios版也负责一部分.当中一部分就是手 ...

  10. 建立Hibernate二级Cache

    建立Hibernate二级Cache它需要两个步骤:首先,一定要使用什么样的数据并发策略,然后配置缓存过期时间,并设置Cache提供器. 有4种内置的Hibernate数据并发冲突策略,代表数据库隔离 ...