【Paper Reading】Deep Supervised Hashing for fast Image Retrieval
What has been done:
This paper proposes a novel Deep Supervised Hashing (DSH) method to learn compact similarity-preserving binary codes for large-scale image data.
Data sets:
CIFAR-10: 60,000 32×32 images belonging to 10 mutually exclusive categories (6,000 images per category)
NUS-WIDE: 269,648 images from Flickr, warped (resized) to 64×64
Content-based image retrieval: return images that are visually or semantically similar to the query.
Traditional method: compute the distance between the query image and every database image in the feature space.
Problem: prohibitive time and memory cost on large databases.
Solution: hashing methods (map images to compact binary codes that approximately preserve the data structure of the original space)
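A minimal sketch (not from the paper) of why binary codes make retrieval cheap: comparing two codes is a Hamming distance, which on packed integers reduces to XOR plus a popcount.

```python
def hamming(a: int, b: int) -> int:
    """Hamming distance between two binary codes packed into integers."""
    return bin(a ^ b).count("1")

# Two 8-bit codes differing in exactly two positions.
q  = 0b10110010
db = 0b10100011
print(hamming(q, db))  # -> 2
```

Scanning a database of such codes is a linear pass of XOR/popcount operations, orders of magnitude cheaper than distances on high-dimensional real-valued features.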
Problem: performance depends on the features used; such methods are more suitable for visual similarity search than for semantic similarity search.
Solution: CNNs. The successful application of CNNs to various tasks implies that features learned by CNNs can capture the underlying semantic structure of images in spite of significant appearance variations.
Related works:
Locality-Sensitive Hashing (LSH): uses random projections to produce hash bits
cons: requires long codes to achieve satisfactory performance (large memory footprint)
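An illustrative sketch of the random-projection idea behind LSH (not the paper's method): each bit is the sign of the input's projection onto a random hyperplane, so nearby points tend to share many bits.

```python
# LSH sketch: one bit per random hyperplane; signs of projections form the code.
import numpy as np

rng = np.random.default_rng(0)
n_bits, dim = 16, 128
W = rng.standard_normal((dim, n_bits))    # random hyperplanes (data-independent)

def lsh_code(x: np.ndarray) -> np.ndarray:
    """Binarize by the sign of each random projection."""
    return (x @ W > 0).astype(np.uint8)

x = rng.standard_normal(dim)
code = lsh_code(x)                         # 16-bit code for one descriptor
```

Because the projections are data-independent, each bit carries limited information, which is why LSH typically needs long codes to reach acceptable precision.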
data-dependent hashing methods: unsupervised vs supervised
unsupervised methods: make use of only unlabelled training data to learn hash functions
- Spectral Hashing (SH): minimizes the weighted Hamming distance of image pairs
- Iterative Quantization (ITQ): minimizes the quantization error on projected image descriptors so as to alleviate information loss
supervised methods: take advantage of label information and thus can preserve semantic similarity
- CCA-ITQ: an extension of iterative quantization
- Predictable Discriminative Binary Codes: seeks hyperplanes that separate categories with a large margin as hash functions
- Minimal Loss Hashing (MLH): optimizes an upper bound of a hinge-like loss to learn the hash functions
Problem: the above methods use linear projections as hash functions and can only deal with linearly separable data.
Solution: Supervised Hashing with Kernels (KSH) and Binary Reconstructive Embedding (BRE).
Deep Hashing: exploits a non-linear deep network to produce binary codes.
Problem: most hashing methods relax the binary codes to real values during optimization and then quantize the model outputs to produce binary codes. However, there is no guarantee that the optimal real-valued codes remain optimal after quantization.
Solution: Discrete Graph Hashing (DGH) and Supervised Discrete Hashing (SDH) were proposed to directly optimize the binary codes.
Problem: they use hand-crafted features and cannot capture the semantic information.
Solution: CNN-based hashing methods
Our goal: similar images should be encoded to similar binary codes and the binary codes should be computed efficiently.
Loss function:
Relaxation:
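A minimal NumPy sketch of the paper's pairwise loss with the relaxation regularizer, as I understand it: for a pair of network outputs, similar pairs (y = 0) are pulled together, dissimilar pairs (y = 1) are pushed apart up to a margin m, and an L1 term pulls real-valued outputs toward the binary targets {-1, +1}. The margin m and weight alpha below are illustrative values, not the paper's tuned settings.

```python
# Sketch of the relaxed pairwise loss; y = 0 for similar, y = 1 for dissimilar.
import numpy as np

def dsh_loss(b1, b2, y, m=4.0, alpha=0.01):
    d = np.sum((b1 - b2) ** 2)                         # squared Euclidean distance
    contrastive = 0.5 * (1 - y) * d + 0.5 * y * max(m - d, 0.0)
    # Relaxation regularizer: push each output coordinate toward -1 or +1.
    reg = np.abs(np.abs(b1) - 1).sum() + np.abs(np.abs(b2) - 1).sum()
    return contrastive + alpha * reg

b1 = np.array([1.0, -1.0, 1.0])
b2 = np.array([1.0, -1.0, 1.0])
print(dsh_loss(b1, b2, y=0))   # identical similar pair, already binary -> 0.0
```

The regularizer is what makes the final sign-thresholding step nearly lossless: outputs that already sit near ±1 change little when quantized.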
Implementation details:
Network structure:

3 × convolutional layers:
3 × pooling layers:
2 × fully-connected layers:
Training methodology:
- generate image pairs online by exploiting all image pairs in each mini-batch; this alleviates the need to store the whole pairwise similarity matrix, making the method scalable to large-scale datasets.
- Fine-tune vs Train from scratch
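The online pair-generation step above can be sketched as follows (an illustrative helper, not the paper's code): every pair within a mini-batch is enumerated and labeled from the class labels, so a batch of n images yields n(n-1)/2 training pairs with no precomputed similarity matrix.

```python
# Form all image pairs online inside a mini-batch from their class labels.
from itertools import combinations

def minibatch_pairs(labels):
    """Yield (i, j, y) for every index pair; y = 0 similar, y = 1 dissimilar."""
    for i, j in combinations(range(len(labels)), 2):
        yield i, j, 0 if labels[i] == labels[j] else 1

batch_labels = [3, 3, 7, 7, 1]
pairs = list(minibatch_pairs(batch_labels))
print(len(pairs))  # -> 10 pairs from a batch of 5 (5 * 4 / 2)
```

Storage stays O(batch²) per step instead of O(N²) over the whole dataset.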
Experiment:
CIFAR-10
GIST descriptors for conventional hashing methods
NUS-WIDE
225-D normalized block-wise color moment features
Evaluation Metrics
mAP: mean Average Precision
precision-recall curves(48-bit)
mean precision within Hamming radius 2 for different code lengths
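A small sketch of how mAP is computed over ranked retrieval results (illustrative, not the paper's evaluation code): average precision rewards placing relevant items early in the ranking, and mAP averages this over all queries.

```python
# mAP sketch: relevance flags are given in ranked order for each query.
def average_precision(relevant):
    """relevant: list of 0/1 flags in ranked order for one query."""
    hits, score = 0, 0.0
    for rank, rel in enumerate(relevant, start=1):
        if rel:
            hits += 1
            score += hits / rank        # precision at each relevant rank
    return score / hits if hits else 0.0

def mean_ap(all_queries):
    """Mean of per-query average precisions."""
    return sum(average_precision(r) for r in all_queries) / len(all_queries)

# Relevant items at ranks 1 and 3 -> AP = (1/1 + 2/3) / 2
print(round(average_precision([1, 0, 1, 0]), 4))  # -> 0.8333
```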
Network ensembles?
Comparison with state-of-the-art method
CNNH: trains the model to fit pre-computed discriminative binary codes; binary code generation and network learning are isolated
CLBHC: trains the model with a binary-like hidden layer whose activations serve as features for classification; encoding dissimilar images to similar binary codes is not penalized
DNNH: uses triplet-based constraints to describe more complex semantic relations, but training its network becomes more difficult due to the sigmoid non-linearity and the parameterized piece-wise threshold function used in the output layer
Combine binary code generation with network learning
Comparison of Encoding Time
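Why encoding is fast here: at test time, a single forward pass produces real-valued outputs and binarization is just a sign threshold. The toy "network" below is a stand-in matrix, purely illustrative of that final quantization step.

```python
# Test-time encoding sketch: forward pass, then sign-threshold to {-1, +1}.
import numpy as np

rng = np.random.default_rng(1)
W = rng.standard_normal((128, 12))              # stand-in for the learned network

def encode(x):
    """Quantize network outputs to a 12-bit code in {-1, +1}."""
    return np.where(x @ W >= 0, 1, -1).astype(np.int8)

x = rng.standard_normal(128)                     # stand-in image descriptor
code = encode(x)
```

The encoding cost is therefore dominated by the forward pass itself; the quantization adds essentially nothing, unlike methods with parameterized thresholding in the output layer.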