https://github.com/tensorflow/models/blob/master/research/slim/datasets/preprocess_imagenet_validation_data.py (adapted version)

#!/usr/bin/env python

# Copyright 2016 Google Inc. All Rights Reserved.

#

# Licensed under the Apache License, Version 2.0 (the "License");

# you may not use this file except in compliance with the License.

# You may obtain a copy of the License at

#

#     http://www.apache.org/licenses/LICENSE-2.0

#

# Unless required by applicable law or agreed to in writing, software

# distributed under the License is distributed on an "AS IS" BASIS,

# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.

# See the License for the specific language governing permissions and

# limitations under the License.

# ==============================================================================

r"""Process the ImageNet Challenge bounding boxes for TensorFlow model training.

Associate the ImageNet 2012 Challenge validation data set with labels.

The raw ImageNet validation data set is expected to reside in JPEG files

located in the following directory structure.

 data_dir/ILSVRC2012_val_00000001.JPEG

 data_dir/ILSVRC2012_val_00000002.JPEG

 ...

 data_dir/ILSVRC2012_val_00050000.JPEG

This script moves the files into a directory structure like such:

 data_dir/n01440764/ILSVRC2012_val_00000293.JPEG

 data_dir/n01440764/ILSVRC2012_val_00000543.JPEG

 ...

where 'n01440764' is the unique synset label associated with

these images.

This directory reorganization requires a mapping from validation image

number (i.e. suffix of the original file) to the associated label. This

is provided in the ImageNet development kit via a Matlab file.

In order to make life easier and divorce ourselves from Matlab, we instead

supply a custom text file that provides this mapping for us.

Sample usage:

  ./preprocess_imagenet_validation_data.py ILSVRC2012_img_val \

  imagenet_2012_validation_synset_labels.txt

"""

from __future__ import absolute_import

from __future__ import division

from __future__ import print_function

import os

import sys

from six.moves import xrange  # pylint: disable=redefined-builtin

if __name__ == '__main__':

  if len(sys.argv) < 3:  # sys.argv holds the script's own name followed by the arguments passed to it.

    print('Invalid usage\n'

          'usage: preprocess_imagenet_validation_data.py '

          '<validation data dir> <validation labels file>')

    sys.exit(-1)  # sys.exit() raises SystemExit, which ends the script with a non-zero exit status.

  data_dir = sys.argv[1]

  validation_labels_file = sys.argv[2]

  # Read in the 50000 synsets associated with the validation data set.

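  # imagenet_2012_validation_synset_labels.txt holds 50000 lines of labels (with repeats); line N is the label of validation image N.

  labels = [l.strip() for l in open(validation_labels_file).readlines()]  # strip() removes leading and trailing characters (whitespace, including newlines, by default).

  unique_labels = set(labels)  # set() builds an unordered collection of unique elements, which drops the duplicate labels.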

  # Make all sub-directories in the validation data dir.

  for label in unique_labels:

    labeled_data_dir = os.path.join(data_dir, label)

    if not os.path.exists(labeled_data_dir):

      os.makedirs(labeled_data_dir)

  # Move all of the images to the appropriate sub-directory.

  for i in xrange(len(labels)):  # xrange() takes the same arguments as range() but produces values lazily instead of building a list (Python 3's range() behaves this way).

    basename = 'ILSVRC2012_val_000%.5d.JPEG' % (i + 1)

    original_filename = os.path.join(data_dir, basename)

    if not os.path.exists(original_filename):

      #print('Failed to find: ' % original_filename)

      continue

      #sys.exit(-1)

    new_filename = os.path.join(data_dir, labels[i], basename)

    os.rename(original_filename, new_filename)
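The steps of the script above can be condensed into a small Python 3 function. This is only a sketch, not the upstream code: the function name organize_validation_images and its return value are assumptions made for illustration, and it creates each label directory on demand instead of creating all of them up front.

```python
import os


def organize_validation_images(data_dir, labels_file):
    """Move each validation JPEG into a sub-directory named after its synset.

    Returns the number of files moved. Files that are not found (for example
    because an earlier run already moved them) are skipped.
    """
    # Line i (0-based) of the labels file is the synset of image number i + 1.
    with open(labels_file) as f:
        labels = [line.strip() for line in f]

    moved = 0
    for i, label in enumerate(labels):
        # Same names as 'ILSVRC2012_val_000%.5d.JPEG' for the 50000 images.
        basename = 'ILSVRC2012_val_%08d.JPEG' % (i + 1)
        source = os.path.join(data_dir, basename)
        if not os.path.exists(source):
            continue
        target_dir = os.path.join(data_dir, label)
        os.makedirs(target_dir, exist_ok=True)  # no error if it already exists
        os.rename(source, os.path.join(target_dir, basename))
        moved += 1
    return moved
```

Because missing files are skipped, calling the function a second time on the same directory moves nothing and returns 0.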

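Enabling the code on line 82 of the script, print('Failed to find: ' % original_filename), raises the error below, because the format string contains no %s placeholder for original_filename to fill: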

TypeError: not all arguments converted during string formatting
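A minimal reproduction of this error, together with one possible fix, might look like the following. The file name assigned to original_filename is an illustrative value, not taken from an actual run.

```python
original_filename = 'ILSVRC2012_val_00000001.JPEG'  # illustrative value

# The format string has no placeholder, so the % operator is left with an
# argument it cannot place and raises TypeError.
try:
    message = 'Failed to find: ' % original_filename
except TypeError as err:
    message = 'TypeError: %s' % err

# With a %s placeholder the argument is consumed and formatting succeeds.
fixed = 'Failed to find: %s' % original_filename

print(message)
print(fixed)
```

With the placeholder in place, the print call can be re-enabled in the script without triggering the exception.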

The following error also came up during the run:

Organizing the validation data into sub-directories.
Traceback (most recent call last):
  File "F:/datasets/preprocess_imagenet_validation_data.py", line 86, in <module>
    os.rename(original_filename, new_filename)
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'F:/ILSVRC2012_img_val/ILSVRC2012_val_00032304.JPEG' -> 'F:/ILSVRC2012_img_val/n02109961\\ILSVRC2012_val_00032304.JPEG'

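My guess was that too many files cannot be renamed in one go, although WinError 32 itself means that another process had the file open at that moment. Either way, I re-ran

./download_and_convert_imagenet.sh /f/ILSVRC2012_img_val_varified

and preprocess_imagenet_validation_data.py carried on renaming the remaining files, because images that were already moved no longer exist at their original path and are skipped by the continue branch.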
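Instead of re-running the whole script by hand, the rename itself could be retried a few times when Windows reports that the file is in use. The helper below is a hypothetical sketch: rename_with_retry, attempts, and delay_seconds are names and defaults chosen here for illustration and are not part of the original script.

```python
import os
import time


def rename_with_retry(source, target, attempts=5, delay_seconds=0.5):
    """Rename source to target, retrying when the OS raises PermissionError.

    Returns True once the rename succeeds. The last PermissionError is
    re-raised if every attempt fails.
    """
    for attempt in range(attempts):
        try:
            os.rename(source, target)
            return True
        except PermissionError:
            if attempt == attempts - 1:
                raise
            # Give the other process (for example a virus scanner or an
            # indexer) a moment to release the file before trying again.
            time.sleep(delay_seconds)
    return False
```

In the script, the call os.rename(original_filename, new_filename) would then become rename_with_retry(original_filename, new_filename). Whether a short wait is enough depends on which process is holding the file, so this may reduce the need for manual re-runs without removing it.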
