[转载]pytorch自定义数据集

为什么要定义Datasets:

PyTorch提供了一个工具函数torch.utils.data.DataLoader。通过这个类，我们在准备mini-batch的时候可以多线程并行处理，这样可以加快准备数据的速度。Datasets就是构建这个类的实例的参数之一。

如何自定义Datasets

下面是一个自定义Datasets的框架：

class CustomDataset(data.Dataset):#需要继承data.Dataset

    def __init__(self):

        # TODO

        # 1. Initialize file path or list of file names.

        pass

    def __getitem__(self, index):

        # TODO

        # 1. Read one data from file (e.g. using numpy.fromfile, PIL.Image.open).

        # 2. Preprocess the data (e.g. torchvision.Transform).

        # 3. Return a data pair (e.g. image and label).

        #这里需要注意的是，第一步：read one data，是一个data

        pass

    def __len__(self):

        # You should change 0 to the total size of your dataset.

        return 0

下面看一下官方MNIST的例子（代码被缩减，只留下了重要的部分）：

class MNIST(data.Dataset):

    def __init__(self, root, train=True, transform=None, target_transform=None, download=False):

        self.root = root

        self.transform = transform

        self.target_transform = target_transform

        self.train = train  # training set or test set

        if download:

            self.download()

        if not self._check_exists():

            raise RuntimeError('Dataset not found.' +

                               ' You can use download=True to download it')

        if self.train:

            self.train_data, self.train_labels = torch.load(

                os.path.join(root, self.processed_folder, self.training_file))

        else:

            self.test_data, self.test_labels = torch.load(os.path.join(root, self.processed_folder, self.test_file))

    def __getitem__(self, index):

        if self.train:

            img, target = self.train_data[index], self.train_labels[index]

        else:

            img, target = self.test_data[index], self.test_labels[index]

        # doing this so that it is consistent with all other datasets

        # to return a PIL Image

        img = Image.fromarray(img.numpy(), mode='L')

        if self.transform is not None:

            img = self.transform(img)

        if self.target_transform is not None:

            target = self.target_transform(target)

        return img, target

    def __len__(self):

        if self.train:

            return 60000

        else:

            return 10000

[转载]pytorch自定义数据集的更多相关文章

PyTorch 自定义数据集
准备数据准备 COCO128 数据集,其是 COCO train2017 前 128 个数据.按 YOLOv5 组织的目录: $ tree ~/datasets/coco128 -L 2 /home ...
Pytorch划分数据集的方法
之前用过sklearn提供的划分数据集的函数,觉得超级方便.但是在使用TensorFlow和Pytorch的时候一直找不到类似的功能,之前搜索的关键字都是"pytorch split dat ...
pytorch加载语音类自定义数据集
pytorch对一下常用的公开数据集有很方便的API接口,但是当我们需要使用自己的数据集训练神经网络时,就需要自定义数据集,在pytorch中,提供了一些类,方便我们定义自己的数据集合 torch.u ...
MMDetection 快速开始，训练自定义数据集
本文将快速引导使用 MMDetection ,记录了实践中需注意的一些问题. 环境准备基础环境 Nvidia 显卡的主机 Ubuntu 18.04 系统安装,可见制作 USB 启动盘,及系统安装 ...
Scaled-YOLOv4 快速开始，训练自定义数据集
代码: https://github.com/ikuokuo/start-scaled-yolov4 Scaled-YOLOv4 代码: https://github.com/WongKinYiu/S ...
torch_13_自定义数据集实战
1.将图片的路径和标签写入csv文件并实现读取 # 创建一个文件,包含image,存放方式:label pokemeon\\mew\\0001.jpg,0 def load_csv(self,file ...
[转载]PyTorch上的contiguous
[转载]PyTorch上的contiguous 来源:https://zhuanlan.zhihu.com/p/64551412 这篇文章写的非常好,我这里就不复制粘贴了,有兴趣的同学可以去看原文,我 ...
[转载]PyTorch中permute的用法
[转载]PyTorch中permute的用法来源:https://blog.csdn.net/york1996/article/details/81876886 permute(dims) 将ten ...
[转载]Pytorch详解NLLLoss和CrossEntropyLoss
[转载]Pytorch详解NLLLoss和CrossEntropyLoss 来源:https://blog.csdn.net/qq_22210253/article/details/85229988 ...

随机推荐

xcode中用pods管理第三方库<转>
安装pods :http://www.cnblogs.com/wangluochong/p/5567082.html 史上最详细的CocoaPods安装教程 --------------------- ...
java中用正则表达式判断中文字符串中是否含有英文或者数字
public static boolean includingNUM(String str)throws Exception{ Pattern p = Pattern.compile(" ...
Python_pip_01_pip的相关操作
>Python中的pip是什么?能够做些什么? pip是Python中的一个进行包管理的东西,能够下载包.安装包.卸载包......一些列操作 >怎么查看pip的相关信息在控制台输入: ...
static、静态变量、静态方法
1 静态:static 1.1 用法是一个修饰符:用于修饰成员(成员变量和成员函数) 1.2 好处当成员变量被静态static修饰后,就多了一种调用方式,除了可以被对象调用外,还可以直接被类名调用 ...
关于 block的一些浅识
block的定义:“带自动变量的匿名函数” (一)写法: ^ void (int iAge){ NSLog(@"%d", iAge);}; 和C函数写法区别在于: 1) :以插入符 ...
linux ftp、sftp、telnet服务开通、更改Orale最大连接数
1 ftp服务开通 1.1 检测vsftpd是否安装及启动先用service vsftpd status 来查看ftp是否开启.也可以使用ps -ef | grep ftp 来查看本地是否含有包含f ...
Bootstrap 组件之 List group
简介 List group 指列表.当然,Bootstrap 列表不局限于由 <ul> 和 <li> 标签构成的. Bootstrap 中一共三种列表的构成方式,这里有一个例 ...
CodeForces 173B Chamber of Secrets (二分图+BFS)
题意:给定上一个n*m的矩阵,你从(1,1)这个位置发出水平向的光,碰到#可以选择四个方向同时发光,或者直接穿过去, 问你用最少的#使得光能够到达 (n,m)并且方向水平向右. 析:很明显的一个最短路 ...
Java基础-集合框架的学习大纲
1.List 和 Set 的区别 2.HashSet 是如何保证不重复的 3.HashMap 是线程安全的吗,为什么不是线程安全的(最好画图说明多线程环境下不安全)? 4.HashMap 的扩容过程 ...
用Apache James 3.3.0 搭建个人邮箱服务器
准备域名比如域名为example.net,则邮箱格式为test@example.net.在自己的域名管理界面,添加一条A记录(mail.example.net xxx.xxx.xxx.xxx),指 ...

[转载]pytorch自定义数据集

为什么要定义Datasets:

如何自定义Datasets

[转载]pytorch自定义数据集的更多相关文章

随机推荐

热门专题