

This algorithm is called a bag-of-features approach for image category classification. It is used to recognize whether a small image shows a dog, a cat, a train, a boat, and so on.


% Location of the compressed data set
url = 'http://www.vision.caltech.edu/Image_Datasets/Caltech101/101_ObjectCategories.tar.gz';
% Store the output in a temporary folder
outputFolder = fullfile(tempdir, 'caltech101'); % define output folder
if ~exist(outputFolder, 'dir') % download only once
    disp('Downloading 126MB Caltech101 data set...');
    untar(url, outputFolder);
end


rootFolder = fullfile(outputFolder, '101_ObjectCategories');

imgSets = [ imageSet(fullfile(rootFolder, 'airplanes')), ...
imageSet(fullfile(rootFolder, 'ferry')), ...
imageSet(fullfile(rootFolder, 'laptop')) ];

imageSet: Use the imageSet class to manage the data. Because imageSet operates on image file locations and does not load all the images into memory, it is safe to use on large image collections.


{ imgSets.Description } % display all labels on one line
[imgSets.Count] % show the corresponding count of images



minSetCount = min([imgSets.Count]); % determine the smallest number of images in a category

% Use partition method to trim the set.
imgSets = partition(imgSets, minSetCount, 'randomize');


% Use 30% of the images in each set for training and the remaining 70% for validation
[trainingSets, validationSets] = partition(imgSets, 0.3, 'randomize');
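
To make the 30/70 split concrete, here is a minimal sketch of a randomized split for a single category, written in base MATLAB. The image count of 67 is a hypothetical value chosen for illustration, and the rounding rule is an assumption; the real partition method may handle fractions differently.

```matlab
% Toy stand-in for partition(imgSets, 0.3, 'randomize') on one category.
% numImages = 67 is a hypothetical count, not read from the data set.
rng(0);                                 % fix the seed so the split is repeatable
numImages = 67;
shuffled  = randperm(numImages);        % random ordering of the image indices
numTrain  = round(0.3 * numImages);     % assumed rounding rule
trainIdx  = shuffled(1:numTrain);       % roughly 30% for training
validIdx  = shuffled(numTrain+1:end);   % the remainder for validation
fprintf('%d training, %d validation\n', numel(trainIdx), numel(validIdx));
```

With these numbers the sketch prints "20 training, 47 validation", and no image index appears in both groups.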

Building the thing we train: Create a Visual Vocabulary and Train an Image Category Classifier

Bag of words is a technique adapted to computer vision from the world of natural language processing. Since images do not actually contain discrete words, we first construct a "vocabulary" of SURF features representative of each image category.

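So this algorithm is based on natural language processing? What on earth... More precisely, it borrows an idea from natural language processing and applies it to images.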


A single call to bagOfFeatures builds this vocabulary. The function:

  1. extracts SURF features from all images in all image categories

  2. constructs the visual vocabulary by reducing the number of features through quantization of feature space using K-means clustering

bag = bagOfFeatures(trainingSets);


Creating Bag-Of-Features from 3 image sets.
* Image set 1: airplanes.
* Image set 2: ferry.
* Image set 3: laptop.

* Extracting SURF features using the Grid selection method.
** The GridStep is [8 8] and the BlockWidth is [32 64 96 128].

* Extracting features from 20 images in image set 1...done. Extracted 86144 features.
* Extracting features from 20 images in image set 2...done. Extracted 70072 features.
* Extracting features from 20 images in image set 3...done. Extracted 97888 features.

* Keeping 80 percent of the strongest features from each image set.

* Balancing the number of features across all image sets to improve clustering.
** Image set 2 has the least number of strongest features: 56058.
** Using the strongest 56058 features from each of the other image sets.

* Using K-Means clustering to create a 500 word visual vocabulary.
* Number of features          : 168174
* Number of clusters (K)      : 500

* Clustering...done.

* Finished creating Bag-Of-Features
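
The quantization step can be pictured with a toy version of K-means. The sketch below is not what bagOfFeatures runs internally. It is a plain Lloyd iteration in base MATLAB (R2016b or newer, for implicit expansion) on made-up 2-D points, whereas real SURF descriptors have 64 dimensions and the vocabulary above has 500 words. The names `centres`, `X` and `words`, and the deterministic seeding, are assumptions made for illustration.

```matlab
% Toy "descriptors": 150 points scattered around three centres (hypothetical data).
rng(1);
centres = [0 0; 5 5; 0 5];
X = repelem(centres, 50, 1) + 0.3 * randn(150, 2);

K = 3;                           % vocabulary size for this toy case
words = X([1 51 101], :);        % seed with one point from each group (sketch only)

for iter = 1:20
    % Squared distance from every descriptor to every visual word (N-by-K).
    d2 = zeros(size(X, 1), K);
    for k = 1:K
        d2(:, k) = sum((X - words(k, :)).^2, 2);
    end
    [~, assign] = min(d2, [], 2);            % nearest word for each descriptor
    for k = 1:K
        if any(assign == k)
            words(k, :) = mean(X(assign == k, :), 1);   % move word to its cluster mean
        end
    end
end
disp(words)                      % each row should lie close to one of the centres
```

After the loop, each row of `words` is one "visual word": the mean of the descriptors assigned to it.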





Additionally, the bagOfFeatures object provides an encode method for counting the visual word occurrences in an image. It produces a histogram that becomes a new and reduced representation of the image.

This histogram forms a basis for training a classifier and for the actual image classification. In essence, it encodes an image into a feature vector.
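
As a rough illustration of what such an encoding does, the sketch below assigns each descriptor to its nearest visual word and counts the assignments. The vocabulary and descriptors are hypothetical 2-D values, and the L2 normalization at the end is an assumption about how the histogram is scaled. The real encode method works on SURF descriptors and may differ in detail.

```matlab
% Hypothetical 3-word vocabulary and 6 descriptors extracted from one image.
words = [0 0; 5 5; 0 5];
desc  = [0.1 -0.2; 4.8 5.1; 0.2 0.1; 5.3 4.9; -0.1 4.7; 0.0 0.3];

K  = size(words, 1);
d2 = zeros(size(desc, 1), K);
for k = 1:K
    d2(:, k) = sum((desc - words(k, :)).^2, 2);   % squared distance to word k
end
[~, wordIdx] = min(d2, [], 2);                    % index of the nearest word

counts = accumarray(wordIdx, 1, [K 1])';          % occurrences of each word
featureVector = counts / norm(counts);            % assumed L2 normalization
disp(counts)
```

Here three descriptors fall on word 1, two on word 2 and one on word 3, so `counts` is `[3 2 1]`. However many descriptors an image produces, its feature vector always has length K.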

img = read(imgSets(1), 1); % first image of the first category
featureVector = encode(bag, img);
% Plot the histogram of visual word occurrences
figure
bar(featureVector)
title('Visual word occurrences')
xlabel('Visual word index')
ylabel('Frequency of occurrence')


Encoded training images from each category are fed into a classifier training process invoked by the trainImageCategoryClassifier function. Note that this function relies on the multiclass linear SVM classifier from the Statistics and Machine Learning Toolbox™.

categoryClassifier = trainImageCategoryClassifier(trainingSets, bag);


Training an image category classifier for 3 categories.
* Category 1: airplanes
* Category 2: ferry
* Category 3: laptop

* Encoding features for category 1...done.
* Encoding features for category 2...done.
* Encoding features for category 3...done.

* Finished training the category classifier. Use evaluate to test the classifier on a test set.


% Sanity check: evaluate on the training set first
confMatrix = evaluate(categoryClassifier, trainingSets);


Evaluating image category classifier for 3 categories.
-------------------------------------------------------

* Category 1: airplanes
* Category 2: ferry
* Category 3: laptop

* Evaluating images from category 1...done.
* Evaluating images from category 2...done.
* Evaluating images from category 3...done.

* Finished evaluating all the test sets.

* The confusion matrix for this test set is:

             PREDICTED
KNOWN      | airplanes   ferry   laptop
airplanes  | 0.95        0.05    0.00
ferry      | 0.00        1.00    0.00
laptop     | 0.00        0.00    1.00

* Average Accuracy is 0.98.


% Now evaluate on the validation set, which was not used for training
confMatrix = evaluate(categoryClassifier, validationSets);

% Compute average accuracy
mean(diag(confMatrix));


Evaluating image category classifier for 3 categories.
-------------------------------------------------------

* Category 1: airplanes
* Category 2: ferry
* Category 3: laptop

* Evaluating images from category 1...done.
* Evaluating images from category 2...done.
* Evaluating images from category 3...done.

* Finished evaluating all the test sets.

* The confusion matrix for this test set is:

             PREDICTED
KNOWN      | airplanes   ferry   laptop
airplanes  | 0.85        0.13    0.02
ferry      | 0.02        0.94    0.04
laptop     | 0.04        0.02    0.94

* Average Accuracy is 0.91.
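
The reported average accuracy can be reproduced by hand from the confusion matrix. The sketch below copies the validation matrix from the log above into a plain array. It assumes, as the numbers suggest, that the average accuracy is the mean of the diagonal, i.e. of the per-class accuracies.

```matlab
% Validation confusion matrix copied from the log (rows: known, columns: predicted).
confMatrix = [0.85 0.13 0.02;
              0.02 0.94 0.04;
              0.04 0.02 0.94];

perClass    = diag(confMatrix);      % fraction of each class labeled correctly
avgAccuracy = mean(perClass);        % (0.85 + 0.94 + 0.94) / 3
fprintf('Average accuracy: %.2f\n', avgAccuracy);   % prints 0.91
```

The same calculation on the training-set matrix gives (0.95 + 1.00 + 1.00) / 3, which rounds to the 0.98 shown earlier. The gap between 0.98 and 0.91 is the usual difference between performance on seen and unseen images.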

Trained and validated, now off to battle: Try the Newly Trained Classifier on Test Images

img = imread(fullfile(rootFolder, 'airplanes', 'image_0690.jpg'));
[labelIdx, scores] = predict(categoryClassifier, img);
% Display the string label
categoryClassifier.Labels(labelIdx)




Official MATLAB documentation: bagOfFeatures class

