【Paper Reading】Deep Supervised Hashing for fast Image Retrieval
What has been done:
This paper proposes a novel Deep Supervised Hashing (DSH) method to learn compact similarity-preserving binary codes for large-scale image data.
Data sets:
CIFAR-10: 60,000 32×32 images belonging to 10 mutually exclusive categories (6,000 images per category)
NUS-WIDE: 269,648 images from Flickr, warped (resized) to 64×64
Content-based image retrieval: return images that are visually or semantically similar to the query.
Traditional method: compute the distance between the query image and every database image in the feature space.
Problem: prohibitive time and memory cost on large databases.
Solution: hashing methods (map images to compact binary codes that approximately preserve the data structure of the original space)
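A minimal sketch (not from the paper) of why binary codes make retrieval cheap: comparing two codes is a Hamming distance, which on packed integers reduces to XOR plus a popcount.

```python
def hamming(a: int, b: int) -> int:
    """Hamming distance between two binary codes packed into integers."""
    return bin(a ^ b).count("1")

# Two 8-bit codes differing in exactly two positions.
q  = 0b10110010
db = 0b10100011
print(hamming(q, db))  # -> 2
```

Scanning a database of such codes is a linear pass of XOR/popcount operations, orders of magnitude cheaper than distances on high-dimensional real-valued features.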
Problem: performance depends on the features used; such methods are more suitable for visual similarity search than for semantic similarity search.
Solution: CNNs. The successful application of CNNs to various tasks implies that features learned by CNNs can capture the underlying semantic structure of images in spite of significant appearance variations.
Related works:
Locality-Sensitive Hashing (LSH): uses random projections to produce hash bits
cons: requires long codes to achieve satisfactory performance (large memory footprint)
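An illustrative sketch of the random-projection idea behind LSH (not the paper's method): each bit is the sign of the input's projection onto a random hyperplane, so nearby points tend to share many bits.

```python
# LSH sketch: one bit per random hyperplane; signs of projections form the code.
import numpy as np

rng = np.random.default_rng(0)
n_bits, dim = 16, 128
W = rng.standard_normal((dim, n_bits))    # random hyperplanes (data-independent)

def lsh_code(x: np.ndarray) -> np.ndarray:
    """Binarize by the sign of each random projection."""
    return (x @ W > 0).astype(np.uint8)

x = rng.standard_normal(dim)
code = lsh_code(x)                         # 16-bit code for one descriptor
```

Because the projections are data-independent, each bit carries limited information, which is why LSH typically needs long codes to reach acceptable precision.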
data-dependent hashing methods: unsupervised vs supervised
unsupervised methods: make use of only unlabelled training data to learn hash functions
- Spectral Hashing (SH): minimizes the weighted Hamming distance of image pairs
- Iterative Quantization (ITQ): minimizes the quantization error on projected image descriptors so as to alleviate information loss
supervised methods: take advantage of label information and thus can preserve semantic similarity
- CCA-ITQ: an extension of iterative quantization
- Predictable Discriminative Binary Codes: seeks hyperplanes that separate categories with a large margin as hash functions
- Minimal Loss Hashing (MLH): optimizes an upper bound of a hinge-like loss to learn the hash functions
Problem: the above methods use linear projections as hash functions and can only deal with linearly separable data.
Solution: Supervised Hashing with Kernels (KSH) and Binary Reconstructive Embedding (BRE).
Deep Hashing: exploits a non-linear deep network to produce binary codes.
Problem: most hashing methods relax the binary codes to real values during optimization and then quantize the model outputs to produce binary codes. However, there is no guarantee that the optimal real-valued codes remain optimal after quantization.
Solution: Discrete Graph Hashing (DGH) and Supervised Discrete Hashing (SDH) were proposed to directly optimize the binary codes.
Problem: they use hand-crafted features and cannot capture the semantic information.
Solution: CNN-based hashing methods
Our goal: similar images should be encoded to similar binary codes and the binary codes should be computed efficiently.
Loss function:
Relaxation:
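A minimal NumPy sketch of the paper's pairwise loss with the relaxation regularizer, as I understand it: for a pair of network outputs, similar pairs (y = 0) are pulled together, dissimilar pairs (y = 1) are pushed apart up to a margin m, and an L1 term pulls real-valued outputs toward the binary targets {-1, +1}. The margin m and weight alpha below are illustrative values, not the paper's tuned settings.

```python
# Sketch of the relaxed pairwise loss; y = 0 for similar, y = 1 for dissimilar.
import numpy as np

def dsh_loss(b1, b2, y, m=4.0, alpha=0.01):
    d = np.sum((b1 - b2) ** 2)                         # squared Euclidean distance
    contrastive = 0.5 * (1 - y) * d + 0.5 * y * max(m - d, 0.0)
    # Relaxation regularizer: push each output coordinate toward -1 or +1.
    reg = np.abs(np.abs(b1) - 1).sum() + np.abs(np.abs(b2) - 1).sum()
    return contrastive + alpha * reg

b1 = np.array([1.0, -1.0, 1.0])
b2 = np.array([1.0, -1.0, 1.0])
print(dsh_loss(b1, b2, y=0))   # identical similar pair, already binary -> 0.0
```

The regularizer is what makes the final sign-thresholding step nearly lossless: outputs that already sit near ±1 change little when quantized.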
Implementation details:
Network structure:

3 × convolutional layers:
3 × pooling layers:
2 × fully-connected layers:
Training methodology:
- generate image pairs online by exploiting all image pairs in each mini-batch; this alleviates the need to store the whole pairwise similarity matrix, making the method scalable to large-scale datasets.
- Fine-tune vs Train from scratch
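The online pair-generation step above can be sketched as follows (an illustrative helper, not the paper's code): every pair within a mini-batch is enumerated and labeled from the class labels, so a batch of n images yields n(n-1)/2 training pairs with no precomputed similarity matrix.

```python
# Form all image pairs online inside a mini-batch from their class labels.
from itertools import combinations

def minibatch_pairs(labels):
    """Yield (i, j, y) for every index pair; y = 0 similar, y = 1 dissimilar."""
    for i, j in combinations(range(len(labels)), 2):
        yield i, j, 0 if labels[i] == labels[j] else 1

batch_labels = [3, 3, 7, 7, 1]
pairs = list(minibatch_pairs(batch_labels))
print(len(pairs))  # -> 10 pairs from a batch of 5 (5 * 4 / 2)
```

Storage stays O(batch²) per step instead of O(N²) over the whole dataset.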
Experiment:
CIFAR-10
GIST descriptors for conventional hashing methods
NUS-WIDE
225-D normalized block-wise color moment features
Evaluation Metrics
mAP: mean Average Precision
precision-recall curves(48-bit)
mean precision within Hamming radius 2 for different code lengths
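A small sketch of how mAP is computed over ranked retrieval results (illustrative, not the paper's evaluation code): average precision rewards placing relevant items early in the ranking, and mAP averages this over all queries.

```python
# mAP sketch: relevance flags are given in ranked order for each query.
def average_precision(relevant):
    """relevant: list of 0/1 flags in ranked order for one query."""
    hits, score = 0, 0.0
    for rank, rel in enumerate(relevant, start=1):
        if rel:
            hits += 1
            score += hits / rank        # precision at each relevant rank
    return score / hits if hits else 0.0

def mean_ap(all_queries):
    """Mean of per-query average precisions."""
    return sum(average_precision(r) for r in all_queries) / len(all_queries)

# Relevant items at ranks 1 and 3 -> AP = (1/1 + 2/3) / 2
print(round(average_precision([1, 0, 1, 0]), 4))  # -> 0.8333
```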
Network ensembles?
Comparison with state-of-the-art method
CNNH: trains the model to fit pre-computed discriminative binary codes; binary code generation and network learning are isolated
CLBHC: trains the model with a binary-like hidden layer whose activations serve as features for classification; encoding dissimilar images to similar binary codes is not penalized
DNNH: uses triplet-based constraints to describe more complex semantic relations, but training its network becomes more difficult due to the sigmoid non-linearity and the parameterized piece-wise threshold function used in the output layer
Combine binary code generation with network learning
Comparison of Encoding Time
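Why encoding is fast here: at test time, a single forward pass produces real-valued outputs and binarization is just a sign threshold. The toy "network" below is a stand-in matrix, purely illustrative of that final quantization step.

```python
# Test-time encoding sketch: forward pass, then sign-threshold to {-1, +1}.
import numpy as np

rng = np.random.default_rng(1)
W = rng.standard_normal((128, 12))              # stand-in for the learned network

def encode(x):
    """Quantize network outputs to a 12-bit code in {-1, +1}."""
    return np.where(x @ W >= 0, 1, -1).astype(np.int8)

x = rng.standard_normal(128)                     # stand-in image descriptor
code = encode(x)
```

The encoding cost is therefore dominated by the forward pass itself; the quantization adds essentially nothing, unlike methods with parameterized thresholding in the output layer.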