https://github.com/tensorflow/models/blob/master/research/slim/datasets/preprocess_imagenet_validation_data.py 改编版
#!/usr/bin/env python
# Copyright 2016 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
r"""Process the ImageNet Challenge bounding boxes for TensorFlow model training. Associate the ImageNet 2012 Challenge validation data set with labels. The raw ImageNet validation data set is expected to reside in JPEG files
located in the following directory structure. data_dir/ILSVRC2012_val_00000001.JPEG
data_dir/ILSVRC2012_val_00000002.JPEG
...
data_dir/ILSVRC2012_val_00050000.JPEG This script moves the files into a directory structure like such:
data_dir/n01440764/ILSVRC2012_val_00000293.JPEG
data_dir/n01440764/ILSVRC2012_val_00000543.JPEG
...
where 'n01440764' is the unique synset label associated with
these images. This directory reorganization requires a mapping from validation image
number (i.e. suffix of the original file) to the associated label. This
is provided in the ImageNet development kit via a Matlab file. In order to make life easier and divorce ourselves from Matlab, we instead
supply a custom text file that provides this mapping for us. Sample usage:
./preprocess_imagenet_validation_data.py ILSVRC2012_img_val \
imagenet_2012_validation_synset_labels.txt
""" from __future__ import absolute_import
from __future__ import division
from __future__ import print_function import os
import sys from six.moves import xrange # pylint: disable=redefined-builtin if __name__ == '__main__':
if len(sys.argv) < 3: # sys.argv返回脚本本身的名字及给定脚本的参数.
print('Invalid usage\n'
'usage: preprocess_imagenet_validation_data.py '
'<validation data dir> <validation labels file>')
sys.exit(-1) # System.exit(-1)是指所有程序(方法,类等)停止,系统停止运行。
data_dir = sys.argv[1]
validation_labels_file = sys.argv[2] # Read in the 50000 synsets associated with the validation data set.
# imagenet_2012_validation_synset_labels.txt 这个文件中有50000行类别,有重复,与50000图片是一一对应的
labels = [l.strip() for l in open(validation_labels_file).readlines()] # strip() 方法用于移除字符串头尾指定的字符(默认为空格或换行符)。
unique_labels = set(labels) # set() 函数创建一个无序不重复元素集,可进行关系测试,删除重复数据,还可以计算交集、差集、并集等。 # Make all sub-directories in the validation data dir.
for label in unique_labels:
labeled_data_dir = os.path.join(data_dir, label)
if not os.path.exists(labeled_data_dir):
os.makedirs(labeled_data_dir) # Move all of the image to the appropriate sub-directory.
for i in xrange(len(labels)): # xrange() 函数用法与 range 完全相同,所不同的是生成的不是一个数组,而是一个生成器。
basename = 'ILSVRC2012_val_000%.5d.JPEG' % (i + 1)
original_filename = os.path.join(data_dir, basename)
if not os.path.exists(original_filename):
#print('Failed to find: ' % original_filename)
continue
#sys.exit(-1)
new_filename = os.path.join(data_dir, labels[i], basename)
os.rename(original_filename, new_filename)
82行的代码一加进去,就出错:
TypeError: not all arguments converted during string formatting
过程中还出现了以下错误:
Organizing the validation data into sub-directories.
Traceback (most recent call last):
File "F:/datasets/preprocess_imagenet_validation_data.py", line 86, in <module>
os.rename(original_filename, new_filename)
PermissionError: [WinError 32] ▒▒һ▒▒▒▒▒▒▒▒▒▒ʹ▒ô▒▒ļ▒▒▒▒▒▒▒▒▒▒▒▒ʡ▒: 'F:/ILSVRC2012_img_val/ILSVRC2012_val_00032304.JPEG' -> 'F:/ILSVRC2012_img_val/n02109961\\ILSVRC2012_val_00032304.JPEG'
可能是不能够一次性重命名太多文件,反正我重新运行了
./download_and_convert_imagenet.sh /f/ILSVRC2012_img_val_varified
preprocess_imagenet_validation_data.py这个程序可以继续重命名文件。
https://github.com/tensorflow/models/blob/master/research/slim/datasets/preprocess_imagenet_validation_data.py 改编版的更多相关文章
- https://github.com/chenghuige/tensorflow-exp/blob/master/examples/sparse-tensor-classification/
https://github.com/chenghuige/tensorflow-exp/blob/master/examples/sparse-tensor-classification/ ...
- 结对项目https://github.com/bxoing1994/test/blob/master/源代码
所选项目名称:文本替换 结对人:曲承玉 github地址 :https://github.com/bxoing1994/test/blob/master/源代码 结对人github地址:ht ...
- https://github.com/python/cpython/blob/master/Doc/library/contextlib.rst 被同一个线程多次获取的同步基元组件
# -*- coding: utf-8 -*- import time from threading import Lock, RLock from datetime import datetime ...
- https://github.com/golang/crypto/blob/master/bcrypt/bcrypt.go
https://github.com/golang/crypto/blob/master/bcrypt/bcrypt.go
- https://github.com/PyMySQL/PyMySQL/blob/master/pymysql/connections.py
# Python implementation of the MySQL client-server protocol # http://dev.mysql.com/doc/internals/en/ ...
- 用swoole实现mysql的连接池--摘自https://github.com/153734009/doc/blob/master/php/mysql_pool.php
<?php $serv = new swoole_server("0.0.0.0", 9508); $serv->set(['worker_num'=>1 ...
- GC 的认识(转) https://github.com/qcrao/Go-Questions/blob/master/GC/GC.md#1-什么是-gc有什么作用
1. 什么是 GC,有什么作用? GC,全称 Garbage Collection,即垃圾回收,是一种自动内存管理的机制. 当程序向操作系统申请的内存不再需要时,垃圾回收主动将其回收并供其他代码进行内 ...
- tensorflow models flags 初步使用
参考官方仓库:https://github.com/tensorflow/models/tree/master/official/utils/flags 测试Demo代码如下: from absl i ...
- Ubuntu18.04下安装、测试tensorflow/models Tensorflow Object Detection API 笔记
参考:https://www.jianshu.com/p/1ed2d9ce6a88 安装 安装conda+tensorflow库 下载protoc linux x64版,https://github. ...
随机推荐
- (9) MySQL主主复制架构使用方法
一. 回忆主从复制的一些缺点 上节说到主从复制的一些问题 我们再来回忆一下 主从复制,增加了一个数据库副本,从数据库和主数据库的数据最终会是一致的 之所以说是最终一致,因为mysql复制是异步的,正常 ...
- 【分享】Web前端开发第三方插件大全
收集整理了一些Web前端开发比较成熟的第三方插件,分享给大家. ******************************************************************** ...
- SpringBoot系统列 3 - 多线程数据处理(ThreadPoolTaskExecutor、DruidDataSource)
在上篇文章的基础上进行改造: package com.hello.util; import org.slf4j.Logger; import org.slf4j.LoggerFactory; impo ...
- OpenGL normalMap
参考zwqxin的博客 http://www.zwqxin.com/ shader 来自zwqxin,稍作修改 <-vertex-> attribute vec3 v_Pos; attr ...
- 【静默】Oracle各类响应文件何在?
[静默]Oracle各类响应文件何在? --root用户下执行: find -name *.rsp / 1.创建数据库的响应文件:$ORACLE_HOME/assistants/dbca/dbca. ...
- python列表的切片操作允许索引超出范围
其余的不说,列表切片操作允许索引超出范围:
- k8s(5)-拓展服务
在之前我们创建了一个部署,然后通过服务公开它.部署只创建了一个Pod来运行我们的应用程序.当流量增加时,我们需要扩展应用程序以满足用户需求. 通过更改部署中的副本数来完成扩展. 1. 拓展部署 这里将 ...
- oracle多个单引号的处理
Oracle多个单引号的处理 在ORACLE中,单引号有两个作用,一是字符串是由单引号引用,二是转义.单引号的使用是就近配对,即就近原则.而在单引号充当转义角色时相对不好理解. 下面转载 1.从第二个 ...
- nw.js---创建一个hello word的方法
一.如果用nw.js 来开发桌面应用 首先到Nw.js中文网下载软件: https://nwjs.org.cn/download.html 下载下来进行解压就可以了,绿色的免安装的,整个目录结果是这样 ...
- java项目(学习和研究)
java项目就是研究,不断的对项目进行迭代,把产品做的越来越好,就是research. 自己想着做一个java项目把,可以类似牛客网,想好自己的预期产品,在设计的过程中可以不断改进和扩展,在做这个项目 ...