Original article: http://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html

Author: Francois Chollet

  • Following the official article's walkthrough there are a few pitfalls; the goal here is to understand the code in full detail and the concrete usage of the Keras API.
  • Many people have translated this article, but some translations leave out the concrete implementation details.
  • The Keras author also published companion notebooks for his book: Companion Jupyter notebooks for the book "Deep Learning with Python".
  • In my own run, Experiment 3 did not converge to 0.94+ accuracy; the implementation in the book above can serve as a reference.
  • The article contains three experiments:
      1. The first experiment trains a custom network — three convolutional layers plus two fully connected layers — on the dataset and validates its accuracy (a minimal sketch follows this list).
      2. The second experiment trains on top of VGG16: to adapt it to the custom dataset, the fully connected layers of VGG16 are removed (the author calls this "feature extraction"), a custom fully connected classifier is added on top, and the resulting model is trained and validated.
      3. The third experiment, "fine-tuning", starts from the model and weights of Experiment 2, retrains the last convolutional block of VGG16 together with the custom fully connected layers, and validates the network's accuracy.
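  • The first experiment's code is not reproduced in this post. Below is a minimal sketch of such a three-conv / two-dense network for reference — the layer widths are my own illustrative choices, not necessarily the article's exact values, and 150×150 RGB inputs are assumed as in the scripts below:

    from keras.models import Sequential
    from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

    # three convolutional blocks followed by two fully connected layers
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(150, 150, 3)))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Conv2D(32, (3, 3), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Conv2D(64, (3, 3), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Flatten())
    model.add(Dense(64, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(1, activation='sigmoid'))  # binary output: cat (0) vs. dog (1)

    model.compile(loss='binary_crossentropy',
                  optimizer='rmsprop',
                  metrics=['accuracy'])

    Training then uses the same ImageDataGenerator pipeline as the scripts below.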
  • Code for Experiment 2:
    '''This script goes along the blog post
    "Building powerful image classification models using very little data"
    from blog.keras.io.
    It uses data that can be downloaded at:
    https://www.kaggle.com/c/dogs-vs-cats/data
    In our setup, we:
    - created a data/ folder
    - created train/ and validation/ subfolders inside data/
    - created cats/ and dogs/ subfolders inside train/ and validation/
    - put the cat pictures index 0-999 in data/train/cats
    - put the cat pictures index 1000-1400 in data/validation/cats
    - put the dogs pictures index 12500-13499 in data/train/dogs
    - put the dog pictures index 13500-13900 in data/validation/dogs
    So that we have 1000 training examples for each class, and 400 validation examples for each class.
    In summary, this is our directory structure:
    ```
    data/
        train/
            dogs/
                dog001.jpg
                dog002.jpg
                ...
            cats/
                cat001.jpg
                cat002.jpg
                ...
        validation/
            dogs/
                dog001.jpg
                dog002.jpg
                ...
            cats/
                cat001.jpg
                cat002.jpg
                ...
    ```
    '''
    import numpy as np
    from keras.preprocessing.image import ImageDataGenerator
    from keras.models import Sequential
    from keras.layers import Dropout, Flatten, Dense
    from keras import applications

    # dimensions of our images.
    img_width, img_height = 150, 150

    top_model_weights_path = 'bottleneck_fc_model.h5'

    data_root = 'M:/dataset/dog_cat/'
    train_data_dir = data_root + 'data/train'
    validation_data_dir = data_root + 'data/validation'
    nb_train_samples = 2000
    nb_validation_samples = 800
    epochs = 50
    batch_size = 16


    def save_bottleneck_features():
        datagen = ImageDataGenerator(rescale=1. / 255)

        # build the VGG16 network without its fully connected top
        model = applications.VGG16(include_top=False, weights='imagenet')

        generator = datagen.flow_from_directory(
            train_data_dir,
            target_size=(img_width, img_height),
            batch_size=batch_size,
            class_mode=None,
            shuffle=False)
        # nb_train_samples (2000) must be divisible by batch_size, otherwise
        # the trailing samples are silently dropped by the integer division
        bottleneck_features_train = model.predict_generator(
            generator, nb_train_samples // batch_size)
        np.save('bottleneck_features_train.npy',
                bottleneck_features_train)

        generator = datagen.flow_from_directory(
            validation_data_dir,
            target_size=(img_width, img_height),
            batch_size=batch_size,
            class_mode=None,
            shuffle=False)
        bottleneck_features_validation = model.predict_generator(
            generator, nb_validation_samples // batch_size)
        np.save('bottleneck_features_validation.npy',
                bottleneck_features_validation)


    def train_top_model():
        train_data = np.load('bottleneck_features_train.npy')
        # with shuffle=False, flow_from_directory walks the classes in
        # alphabetical order: all cats (label 0) first, then all dogs (label 1)
        train_labels = np.array(
            [0] * int(nb_train_samples / 2) + [1] * int(nb_train_samples / 2))

        validation_data = np.load('bottleneck_features_validation.npy')
        validation_labels = np.array(
            [0] * int(nb_validation_samples / 2) + [1] * int(nb_validation_samples / 2))

        model = Sequential()
        model.add(Flatten(input_shape=train_data.shape[1:]))
        model.add(Dense(256, activation='relu'))
        model.add(Dropout(0.5))
        model.add(Dense(1, activation='sigmoid'))

        model.compile(optimizer='rmsprop',
                      loss='binary_crossentropy', metrics=['accuracy'])

        model.fit(train_data, train_labels,
                  epochs=epochs,
                  batch_size=batch_size,
                  validation_data=(validation_data, validation_labels))
        model.save_weights(top_model_weights_path)


    save_bottleneck_features()  # only needs to run once; comment out afterwards
    train_top_model()
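  • Note the hand-built label vectors above: they assume that, with shuffle=False, the generator yields all cats before all dogs. A minimal sketch to verify this ordering via the generator's class_indices, assuming the same directory layout as above:

    from keras.preprocessing.image import ImageDataGenerator

    gen = ImageDataGenerator(rescale=1. / 255).flow_from_directory(
        'M:/dataset/dog_cat/data/train',
        target_size=(150, 150),
        class_mode=None,
        shuffle=False)
    print(gen.class_indices)  # {'cats': 0, 'dogs': 1} -- alphabetical by folder name
    print(gen.filenames[:3])  # all cat files are listed before any dog file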
  • Code for Experiment 3; I added some extra API usage (callbacks, TensorBoard, model plotting) that is handy for future reference:
    '''Fine-tuning script for the blog post
    "Building powerful image classification models using very little data"
    from blog.keras.io. It uses the Kaggle dogs-vs-cats data
    (https://www.kaggle.com/c/dogs-vs-cats/data) arranged in the same
    data/train + data/validation directory structure as the script above.
    '''

    # thanks: bug fixes from http://blog.csdn.net/aggresss/article/details/78588135

    from keras import applications
    from keras.preprocessing.image import ImageDataGenerator
    from keras import optimizers
    from keras.models import Sequential, Model
    from keras.layers import Dropout, Flatten, Dense
    from keras.regularizers import l2

    # path to the model weights files.
    weights_path = '../keras/examples/vgg16_weights.h5'  # not used below; kept from the original script
    top_model_weights_path = 'bottleneck_fc_model.h5'
    # dimensions of our images.
    img_width, img_height = 150, 150

    data_root = 'M:/dataset/dog_cat/'
    train_data_dir = data_root + 'data/train'
    validation_data_dir = data_root + 'data/validation'

    nb_train_samples = 2000
    nb_validation_samples = 800
    epochs = 50
    batch_size = 16

    # build the VGG16 network; input_shape must be given explicitly here so
    # that the Flatten layer below knows its input size
    base_model = applications.VGG16(weights='imagenet', include_top=False,
                                    input_shape=(img_width, img_height, 3))
    print('Model loaded.')

    # build a classifier model to put on top of the convolutional model
    top_model = Sequential()
    top_model.add(Flatten(input_shape=base_model.output_shape[1:]))
    top_model.add(Dense(256, activation='relu', kernel_regularizer=l2(0.001)))
    top_model.add(Dropout(0.8))
    top_model.add(Dense(1, activation='sigmoid'))

    # note that it is necessary to start with a fully-trained
    # classifier, including the top classifier,
    # in order to successfully do fine-tuning
    top_model.load_weights(top_model_weights_path)

    # add the model on top of the convolutional base; the blog's original
    # model.add(top_model) only works on a Sequential VGG16 (bug here), so
    # the two parts are wired together with the functional API instead
    model = Model(inputs=base_model.input, outputs=top_model(base_model.output))

    # set the first 15 layers (up to the last conv block) to non-trainable
    # (weights will not be updated); the blog's original value of 25 is a bug
    # for applications.VGG16, which has only 19 layers without the top
    for layer in model.layers[:15]:
        layer.trainable = False

    # compile the model with a SGD/momentum optimizer
    # and a very slow learning rate.
    model.compile(loss='binary_crossentropy',
                  optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
                  metrics=['accuracy'])

    # prepare data augmentation configuration
    train_datagen = ImageDataGenerator(
        rescale=1. / 255,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True)

    test_datagen = ImageDataGenerator(rescale=1. / 255)

    train_generator = train_datagen.flow_from_directory(
        train_data_dir,
        target_size=(img_height, img_width),
        batch_size=batch_size,
        class_mode='binary')

    validation_generator = test_datagen.flow_from_directory(
        validation_data_dir,
        target_size=(img_height, img_width),
        batch_size=batch_size,
        class_mode='binary')

    model.summary()  # prints a summary representation of the model
    # let's visualize layer names and layer indices to see how many layers
    # we should freeze:
    for i, layer in enumerate(base_model.layers):
        print(i, layer.name)

    from keras.utils import plot_model
    plot_model(model, to_file='model.png')

    from keras.callbacks import History, ModelCheckpoint
    import keras
    history = History()
    model_checkpoint = ModelCheckpoint('temp_model.hdf5', monitor='loss',
                                       save_best_only=True)
    # log to ./log and write the weights as images into TensorBoard;
    # histogram_freq stays 0 because per-epoch activation histograms require
    # non-generator validation data in this Keras version
    tb_cb = keras.callbacks.TensorBoard(log_dir='log', write_images=True,
                                        histogram_freq=0)
    callbacks = [
        history,
        model_checkpoint,
        tb_cb
    ]

    # fine-tune the model
    history = model.fit_generator(
        train_generator,
        steps_per_epoch=nb_train_samples // batch_size,
        epochs=epochs,
        callbacks=callbacks,
        validation_data=validation_generator,
        validation_steps=nb_validation_samples // batch_size,
        verbose=1)

    model.save('fine_tune_model.h5')
    model.save_weights('fine_tune_model_weight')
    print(history.history)

    # summarize history for accuracy
    from matplotlib import pyplot as plt
    plt.plot(history.history['acc'])
    plt.plot(history.history['val_acc'])
    plt.title('model accuracy')
    plt.ylabel('accuracy')
    plt.xlabel('epoch')
    plt.legend(['train', 'test'], loc='upper left')
    plt.show()
    # summarize history for loss
    plt.plot(history.history['loss'])
    plt.plot(history.history['val_loss'])
    plt.title('model loss')
    plt.ylabel('loss')
    plt.xlabel('epoch')
    plt.legend(['train', 'test'], loc='upper left')
    plt.show()

    # dump the per-epoch training accuracy to a text file
    import numpy as np
    accy = history.history['acc']
    np_accy = np.array(accy)
    np.savetxt('save_acc.txt', np_accy)
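  • After training, the saved model can be reloaded to classify a single image. A minimal sketch — the preprocessing must match training (resize to 150×150, rescale by 1/255), and the image path is a hypothetical placeholder:

    import numpy as np
    from keras.models import load_model
    from keras.preprocessing import image

    model = load_model('fine_tune_model.h5')

    # hypothetical test image path
    img = image.load_img('M:/dataset/dog_cat/test/1.jpg', target_size=(150, 150))
    x = image.img_to_array(img) / 255.  # same rescaling as the training generators
    x = np.expand_dims(x, axis=0)       # batch of one: shape (1, 150, 150, 3)

    prob = model.predict(x)[0][0]       # sigmoid output: probability of class 1 ('dogs')
    print('dog' if prob > 0.5 else 'cat', prob)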
  • Result (console output):
    Model loaded.
    Found 2000 images belonging to 2 classes.
    Found 800 images belonging to 2 classes.
    _________________________________________________________________
    Layer (type)                 Output Shape              Param #
    =================================================================
    input_1 (InputLayer)         (None, 150, 150, 3)       0
    _________________________________________________________________
    block1_conv1 (Conv2D)        (None, 150, 150, 64)      1792
    _________________________________________________________________
    block1_conv2 (Conv2D)        (None, 150, 150, 64)      36928
    _________________________________________________________________
    block1_pool (MaxPooling2D)   (None, 75, 75, 64)        0
    _________________________________________________________________
    block2_conv1 (Conv2D)        (None, 75, 75, 128)       73856
    _________________________________________________________________
    block2_conv2 (Conv2D)        (None, 75, 75, 128)       147584
    _________________________________________________________________
    block2_pool (MaxPooling2D)   (None, 37, 37, 128)       0
    _________________________________________________________________
    block3_conv1 (Conv2D)        (None, 37, 37, 256)       295168
    _________________________________________________________________
    block3_conv2 (Conv2D)        (None, 37, 37, 256)       590080
    _________________________________________________________________
    block3_conv3 (Conv2D)        (None, 37, 37, 256)       590080
    _________________________________________________________________
    block3_pool (MaxPooling2D)   (None, 18, 18, 256)       0
    _________________________________________________________________
    block4_conv1 (Conv2D)        (None, 18, 18, 512)       1180160
    _________________________________________________________________
    block4_conv2 (Conv2D)        (None, 18, 18, 512)       2359808
    _________________________________________________________________
    block4_conv3 (Conv2D)        (None, 18, 18, 512)       2359808
    _________________________________________________________________
    block4_pool (MaxPooling2D)   (None, 9, 9, 512)         0
    _________________________________________________________________
    block5_conv1 (Conv2D)        (None, 9, 9, 512)         2359808
    _________________________________________________________________
    block5_conv2 (Conv2D)        (None, 9, 9, 512)         2359808
    _________________________________________________________________
    block5_conv3 (Conv2D)        (None, 9, 9, 512)         2359808
    _________________________________________________________________
    block5_pool (MaxPooling2D)   (None, 4, 4, 512)         0
    _________________________________________________________________
    sequential_1 (Sequential)    (None, 1)                 2097665
    =================================================================
    Total params: 16,812,353
    Trainable params: 9,177,089
    Non-trainable params: 7,635,264
    _________________________________________________________________
    0 input_1
    1 block1_conv1
    2 block1_conv2
    3 block1_pool
    4 block2_conv1
    5 block2_conv2
    6 block2_pool
    7 block3_conv1
    8 block3_conv2
    9 block3_conv3
    10 block3_pool
    11 block4_conv1
    12 block4_conv2
    13 block4_conv3
    14 block4_pool
    15 block5_conv1
    16 block5_conv2
    17 block5_conv3
    18 block5_pool
    Backend TkAgg is interactive backend. Turning interactive mode on.
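  • To double-check the final validation accuracy outside of the training loop, the saved model can be scored against the validation set directly; a minimal sketch, assuming the paths and sizes used in the scripts above:

    from keras.models import load_model
    from keras.preprocessing.image import ImageDataGenerator

    model = load_model('fine_tune_model.h5')
    datagen = ImageDataGenerator(rescale=1. / 255)
    validation_generator = datagen.flow_from_directory(
        'M:/dataset/dog_cat/data/validation',
        target_size=(150, 150),
        batch_size=16,
        class_mode='binary',
        shuffle=False)

    # returns [loss, accuracy] in the order given to compile(metrics=...)
    score = model.evaluate_generator(validation_generator, steps=800 // 16)
    print('val loss: %.4f, val acc: %.4f' % (score[0], score[1]))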
