Pytorch 报错总结
目前在学习pytorch,自己写了一些例子,在这里记录下来一些报错及总结
1. RuntimeError: Expected object of type torch.FloatTensor but found type torch.cuda.FloatTensor for argument #2 'weight'
详细报错信息
Traceback (most recent call last):
File "dogvscat-resnet.py", line , in <module>
outputs = net(inputs)
File "/home/lzx/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line , in __call__
result = self.forward(*input, **kwargs)
File "/home/lzx/anaconda3/envs/pytorch/lib/python3.6/site-packages/torchvision-0.2.1-py3.6.egg/torchvision/models/resnet.py", li
ne , in forward
File "/home/lzx/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line , in __call__
result = self.forward(*input, **kwargs)
File "/home/lzx/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/conv.py", line , in forward
self.padding, self.dilation, self.groups)
RuntimeError: Expected object of type torch.FloatTensor but found type torch.cuda.FloatTensor for argument # 'weight'
参考:https://github.com/wohlert/semi-supervised-pytorch/issues/7
这个报错其实比较隐蔽,用Google搜索的第一页都没什么参考价值,只有上面的这个链接里提醒了我,
在GPU上进行训练时,需要把模型和数据都加上.cuda(),如
model.cuda()
但是对于数据,这个.cuda()并非是inplace操作,就是说不单单是在变量名后面加上.cuda()就可以了
还必须显示的赋值回去,即:
data.cuda()是不行的,而
data = data.cuda()才是可以的。
这样的显示声明的细节非常重要。
示例代码:用LeNet做猫狗的二分类,自己写的代码
请重点关注以下行的写法:46 47 57 58 96 97
import os
from PIL import Image
import numpy as np
import torch
from torchvision import transforms as T
from torchvision.datasets import ImageFolder
from torch.utils.data import DataLoader
import torch.nn as nn
import torch.nn.functional as F
from torch import optim
from torch.utils import data
import torchvision as tv
from torchvision.transforms import ToPILImage
show = ToPILImage() # 可以把Tensor转成Image,方便可视化 transform = T.Compose([
T.Resize(32), # 缩放图片(Image),保持长宽比不变,最短边为224像素
T.CenterCrop(32), # 从图片中间切出224*224的图片
T.ToTensor(), # 将图片(Image)转成Tensor,归一化至[0, 1]
T.Normalize(mean=[.5, .5, .5], std=[.5, .5, .5]) # 标准化至[-1, 1],规定均值和标准差
]) class Net(nn.Module):
def __init__(self):
super(Net, self).__init__()
self.conv1 = nn.Conv2d(3, 6, 5)
self.conv2 = nn.Conv2d(6, 16, 5)
self.fc1 = nn.Linear(16*5*5, 120)
self.fc2 = nn.Linear(120, 84)
self.fc3 = nn.Linear(84, 2) def forward(self, x):
x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
x = F.max_pool2d(F.relu(self.conv2(x)), 2)
x = x.view(x.size()[0], -1)
x = F.relu(self.fc1(x))
x = F.relu(self.fc2(x))
x = self.fc3(x)
return x net = Net()
if torch.cuda.is_available():
print("Using GPU")
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
net.to(device) def test():
correct = 0 # 预测正确的图片数
total = 0 # 总共的图片数
# 由于测试的时候不需要求导,可以暂时关闭autograd,提高速度,节约内存
with torch.no_grad():
for data in testloader:
images, labels = data
images = images.to(device)
labels = labels.to(device)
outputs = net(images)
_, predicted = torch.max(outputs, 1)
total += labels.size(0)
correct += (predicted == labels).sum() print('Accuracy in the test dataset: %.1f %%' % (100 * correct / total)) train_dataset = ImageFolder('/home/lzx/datasets/dogcat/sub-train/', transform=transform)
test_dataset = ImageFolder('/home/lzx/datasets/dogcat/sub-test/', transform=transform)
# dataset = DogCat('/home/lzx/datasets/dogcat/sub-train/', transforms=transform)
# train_dataset = ImageFolder('/Users/lizhixuan/PycharmProjects/pytorch_learning/Chapter5/sub-train/', transform=transform)
# test_dataset = ImageFolder('/Users/lizhixuan/PycharmProjects/pytorch_learning/Chapter5/sub-test/', transform=transform) trainloader = torch.utils.data.DataLoader(
train_dataset,
batch_size=512,
shuffle=True,
num_workers=4)
testloader = torch.utils.data.DataLoader(
test_dataset,
batch_size=512,
shuffle=False,
num_workers=4)
classes = ('cat', 'dog') criterion = nn.CrossEntropyLoss() # 交叉熵损失函数
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9) print("Starting to train")
torch.set_num_threads(8)
for epoch in range(1000): running_loss = 0.0
for i, data in enumerate(trainloader, 0): # 输入数据
inputs, labels = data
inputs = inputs.to(device)
labels = labels.to(device) # 梯度清零
optimizer.zero_grad() # forward + backward
outputs = net(inputs)
loss = criterion(outputs, labels)
# print("outputs %s labels %s" % (outputs, labels))
loss.backward() # 更新参数
optimizer.step() # 打印log信息
# loss 是一个scalar,需要使用loss.item()来获取数值,不能使用loss[0]
running_loss += loss.item()
print_gap = 10
if i % print_gap == (print_gap-1): # 每1000个batch打印一下训练状态
print('[%d, %5d] loss: %.3f' \
% (epoch+1, i+1, running_loss / print_gap))
running_loss = 0.0
test()
print('Finished Training')
这样一来,就完全明白了如何把代码放在GPU上运行了,哈哈
Pytorch 报错总结的更多相关文章
- pytorch报错:ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1,512,1,1])
1.pytorch报错:ValueError: Expected more than 1 value per channel when training, got input size torch.S ...
- Pytorch报错:cuda runtime error (59) : device-side assert triggered at /pytorch/aten/src/THC/generic/THCTensorMath.cu:26
Pytorch报错:cuda runtime error (59) : device-side assert triggered at /pytorch/aten/src/THC/generic/TH ...
- Anaconda 安装 pytorch报错解决方法
一.安装Pytorch: # -c 指定用pytorch镜像源下载软件conda install pytorch torchvision cpuonly -c pytorch 报错: 二.配置: ch ...
- windows安装Pytorch报错:from torch._C import * ImportError: DLL load failed: 找不到指定的模块”解决方案
问题描述 python环境下安装cpu版本pytorch,安装成功,但是导入出错. 报错如下 解决方法 参考博客,大家解决方法大概有:升级numpy.添加.dll文件到环境变量,均没有成功.本地pyt ...
- 【pytorch报错解决】expected input to have 3 channels, but got 1 channels instead
遇到的问题 数据是png图像的时候,如果用PIL读取图像,获得的是单通道的,不是多通道的.虽然使用opencv读取图片可以获得三通道图像数据,如下: def __getitem__(self, idx ...
- Pytorch报错记录
1.BrokenPipeError 执行以下命令时: a,b = iter(train_loader).next() 报错:BrokenPipeError: [Errno 32] Broken pip ...
- 服务器 PyTorch 报错 重装 PyTorch
两个代码,pix2pix + CycleGan , wgan-gp 都是 pytorch 写的, 在服务器端运行,均存在下列问题,故判定是 pytorch 的安装问题. Traceback (mos ...
- pytorch报错:AttributeError: 'module' object has no attribute '_rebuild_tensor_v2'
转载自: https://blog.csdn.net/qq_24305433/article/details/80844548 由于训练模型时使用的是新版本的pytorch,而加载时使用的是旧版本的p ...
- pytorch 加载mnist数据集报错not gzip file
利用pytorch加载mnist数据集的代码如下 import torchvision import torchvision.transforms as transforms from torch.u ...
随机推荐
- CentOS 7系统上制作Clonezilla(再生龙)启动U盘并克隆双系统
笔记本安装的是双系统:Win7 64位,CentOS 7 64位. 政采就是个巨大的坑,笔记本标配的是5400转的机械硬盘,开机时间常常要一至两分钟,软件运行起来时各种数据的读写也非常慢,忍无可忍,决 ...
- Ehlib 学习
似乎是为了垂直滚动条 SumList.Active := True; SumList.VirtualRecords := True; TDBGridEh 设计时 It is useful to use ...
- express3/4引入socket.io
app.js var express = require('express'); var path = require('path'); var session = require('express- ...
- mybatis源码解析之Configuration加载(二)
概述 上一篇我们讲了configuation.xml中几个标签的解析,例如<properties>,<typeAlises>,<settings>等,今天我们来介绍 ...
- java生成二维码扫码网页自动登录功能
找了很多资料,七七八八都试了一遍,最终写出来了这个功能. 菜鸟一枚,此文只为做笔记. 简单的一个生成二维码,通过网页确认登录,实现二维码页面跳转到主页面. 有三个servlet: CodeServle ...
- Python之字典方法
def clear(self): # 清除所有内容 """ D.clear() -> None. Remove all items from D. "&q ...
- Linux下数据库的启动和关闭
[oracle@*** ~]$ su - oracle --切换oracle用户[oracle@*** ~]$ sqlplus /nologSQL>connect /as sysdbaSQL&g ...
- selenium 分布式 [WinError 10061] 由于目标计算机积极拒绝
selenium grid分布式,老是出现[WinError 10061] 由于目标计算机积极拒绝的问题 网上查了一圈,出现积极拒绝大概是代理问题, 捣鼓了一圈,还是不行 想到fiddler自动侦听了 ...
- 修改hbuilder背景颜色为护眼模式
复制以下代码,保存为.tmTheme文件导入HBuilder <?xml version="1.0" encoding="UTF-8"?> < ...
- Ceph集群更换public_network网络
1.确保ceph集群是连通状态 这里,可以先把机器配置为以前的x.x.x.x的网络,确保ceph集群是可以通的.这里可以执行下面的命令查看是否连通,显示HEALTH_OK则表示连通 2.获取monma ...