Pytorch 分割模型构建和训练【直播】2019 年县域农业大脑AI挑战赛---(四)模型构建和网络训练

对于分割网络，如果当成一个黑箱就是：输入一个3x1024x1024 输出4x1024x1024。

我没有使用二分类，直接使用了四分类。

分类网络使用了SegNet，没有加载预训练模型，参数也是默认初始化。为了加快训练，1024输入进网络后直接通过

pooling缩小到256的尺寸，等到输出层，直接使用bilinear放大4倍，相当于直接在256的尺寸上训练。

import os

import urllib

import torch

import torch.nn as nn

import torch.nn.functional as F

#import torch.utils.model_zoo as model_zoo

from torchvision import models

#https://raw.githubusercontent.com/delta-onera/delta_tb/master/deltatb/networks/net_segnet_bn_relu.py

class SegNet_BN_ReLU(nn.Module):

    # Unet network

    @staticmethod

    def weight_init(m):

        if isinstance(m, nn.Linear):

            torch.nn.init.kaiming_normal(m.weight.data)

    def __init__(self, in_channels, out_channels):

        super(SegNet_BN_ReLU, self).__init__()

        self.in_channels = in_channels

        self.out_channels = out_channels

        self.pool = nn.MaxPool2d(2, return_indices=True)

        self.unpool = nn.MaxUnpool2d(2)

        self.conv1_1 = nn.Conv2d(in_channels, 64, 3, padding=1)

        self.conv1_1_bn = nn.BatchNorm2d(64)

        self.conv1_2 = nn.Conv2d(64, 64, 3, padding=1)

        self.conv1_2_bn = nn.BatchNorm2d(64)

        self.conv2_1 = nn.Conv2d(64, 128, 3, padding=1)

        self.conv2_1_bn = nn.BatchNorm2d(128)

        self.conv2_2 = nn.Conv2d(128, 128, 3, padding=1)

        self.conv2_2_bn = nn.BatchNorm2d(128)

        self.conv3_1 = nn.Conv2d(128, 256, 3, padding=1)

        self.conv3_1_bn = nn.BatchNorm2d(256)

        self.conv3_2 = nn.Conv2d(256, 256, 3, padding=1)

        self.conv3_2_bn = nn.BatchNorm2d(256)

        self.conv3_3 = nn.Conv2d(256, 256, 3, padding=1)

        self.conv3_3_bn = nn.BatchNorm2d(256)

        self.conv4_1 = nn.Conv2d(256, 512, 3, padding=1)

        self.conv4_1_bn = nn.BatchNorm2d(512)

        self.conv4_2 = nn.Conv2d(512, 512, 3, padding=1)

        self.conv4_2_bn = nn.BatchNorm2d(512)

        self.conv4_3 = nn.Conv2d(512, 512, 3, padding=1)

        self.conv4_3_bn = nn.BatchNorm2d(512)

        self.conv5_1 = nn.Conv2d(512, 512, 3, padding=1)

        self.conv5_1_bn = nn.BatchNorm2d(512)

        self.conv5_2 = nn.Conv2d(512, 512, 3, padding=1)

        self.conv5_2_bn = nn.BatchNorm2d(512)

        self.conv5_3 = nn.Conv2d(512, 512, 3, padding=1)

        self.conv5_3_bn = nn.BatchNorm2d(512)

        self.conv5_3_D = nn.Conv2d(512, 512, 3, padding=1)

        self.conv5_3_D_bn = nn.BatchNorm2d(512)

        self.conv5_2_D = nn.Conv2d(512, 512, 3, padding=1)

        self.conv5_2_D_bn = nn.BatchNorm2d(512)

        self.conv5_1_D = nn.Conv2d(512, 512, 3, padding=1)

        self.conv5_1_D_bn = nn.BatchNorm2d(512)

        self.conv4_3_D = nn.Conv2d(512, 512, 3, padding=1)

        self.conv4_3_D_bn = nn.BatchNorm2d(512)

        self.conv4_2_D = nn.Conv2d(512, 512, 3, padding=1)

        self.conv4_2_D_bn = nn.BatchNorm2d(512)

        self.conv4_1_D = nn.Conv2d(512, 256, 3, padding=1)

        self.conv4_1_D_bn = nn.BatchNorm2d(256)

        self.conv3_3_D = nn.Conv2d(256, 256, 3, padding=1)

        self.conv3_3_D_bn = nn.BatchNorm2d(256)

        self.conv3_2_D = nn.Conv2d(256, 256, 3, padding=1)

        self.conv3_2_D_bn = nn.BatchNorm2d(256)

        self.conv3_1_D = nn.Conv2d(256, 128, 3, padding=1)

        self.conv3_1_D_bn = nn.BatchNorm2d(128)

        self.conv2_2_D = nn.Conv2d(128, 128, 3, padding=1)

        self.conv2_2_D_bn = nn.BatchNorm2d(128)

        self.conv2_1_D = nn.Conv2d(128, 64, 3, padding=1)

        self.conv2_1_D_bn = nn.BatchNorm2d(64)

        self.conv1_2_D = nn.Conv2d(64, 64, 3, padding=1)

        self.conv1_2_D_bn = nn.BatchNorm2d(64)

        self.conv1_1_D = nn.Conv2d(64, out_channels, 3, padding=1)

        self.apply(self.weight_init)

    def forward(self, x):

        # Encoder block 1

        x =F.avg_pool2d(x,4)

        #print(x.shape)

        x = self.conv1_1_bn(F.relu(self.conv1_1(x)))

        x1 = self.conv1_2_bn(F.relu(self.conv1_2(x)))

        size1 = x.size()

        x, mask1 = self.pool(x1)

        # Encoder block 2

        x = self.conv2_1_bn(F.relu(self.conv2_1(x)))

        #x = self.drop2_1(x)

        x2 = self.conv2_2_bn(F.relu(self.conv2_2(x)))

        size2 = x.size()

        x, mask2 = self.pool(x2)

        # Encoder block 3

        x = self.conv3_1_bn(F.relu(self.conv3_1(x)))

        x = self.conv3_2_bn(F.relu(self.conv3_2(x)))

        x3 = self.conv3_3_bn(F.relu(self.conv3_3(x)))

        size3 = x.size()

        x, mask3 = self.pool(x3)

        # Encoder block 4

        x = self.conv4_1_bn(F.relu(self.conv4_1(x)))

        x = self.conv4_2_bn(F.relu(self.conv4_2(x)))

        x4 = self.conv4_3_bn(F.relu(self.conv4_3(x)))

        size4 = x.size()

        x, mask4 = self.pool(x4)

        # Encoder block 5

        x = self.conv5_1_bn(F.relu(self.conv5_1(x)))

        x = self.conv5_2_bn(F.relu(self.conv5_2(x)))

        x = self.conv5_3_bn(F.relu(self.conv5_3(x)))

        size5 = x.size()

        x, mask5 = self.pool(x)

        # Decoder block 5

        x = self.unpool(x, mask5, output_size = size5)

        x = self.conv5_3_D_bn(F.relu(self.conv5_3_D(x)))

        x = self.conv5_2_D_bn(F.relu(self.conv5_2_D(x)))

        x = self.conv5_1_D_bn(F.relu(self.conv5_1_D(x)))

        # Decoder block 4

        x = self.unpool(x, mask4, output_size = size4)

        x = self.conv4_3_D_bn(F.relu(self.conv4_3_D(x)))

        x = self.conv4_2_D_bn(F.relu(self.conv4_2_D(x)))

        x = self.conv4_1_D_bn(F.relu(self.conv4_1_D(x)))

        # Decoder block 3

        x = self.unpool(x, mask3, output_size = size3)

        x = self.conv3_3_D_bn(F.relu(self.conv3_3_D(x)))

        x = self.conv3_2_D_bn(F.relu(self.conv3_2_D(x)))

        x = self.conv3_1_D_bn(F.relu(self.conv3_1_D(x)))

        # Decoder block 2

        x = self.unpool(x, mask2, output_size = size2)

        x = self.conv2_2_D_bn(F.relu(self.conv2_2_D(x)))

        x = self.conv2_1_D_bn(F.relu(self.conv2_1_D(x)))

        # Decoder block 1

        x = self.unpool(x, mask1, output_size = size1)

        x = self.conv1_2_D_bn(F.relu(self.conv1_2_D(x)))

        x = self.conv1_1_D(x)

        #print(x.shape)

        return F.interpolate(x,mode='bilinear',scale_factor=4)  

    def load_pretrained_weights(self):

        #vgg16_weights = model_zoo.load_url("https://download.pytorch.org/models/vgg16_bn-6c64b313.pth")

        vgg16_weights=models.vgg16_bn(True).state_dict()

        count_vgg = 0

        count_this = 0

        vggkeys = list(vgg16_weights.keys())

        thiskeys  = list(self.state_dict().keys())

        corresp_map = []

        while(True):

            vggkey = vggkeys[count_vgg]

            thiskey = thiskeys[count_this]

            if "classifier" in vggkey:

                break

            while vggkey.split(".")[-1] not in thiskey:

                count_this += 1

                thiskey = thiskeys[count_this]

            corresp_map.append([vggkey, thiskey])

            count_vgg+=1

            count_this += 1

        mapped_weights = self.state_dict()

        for k_vgg, k_segnet in corresp_map:

            if (self.in_channels != 3) and "features" in k_vgg and "conv1_1." not in k_segnet:

                mapped_weights[k_segnet] = vgg16_weights[k_vgg]

            elif (self.in_channels == 3) and "features" in k_vgg:

                mapped_weights[k_segnet] = vgg16_weights[k_vgg]

        try:

            self.load_state_dict(mapped_weights)

            print("Loaded VGG-16 weights in Segnet !")

        except:

            print("Error VGG-16 weights in Segnet !")

            raise

    def load_from_filename(self, model_path):

        """Load weights from filename."""

        th = torch.load(model_path)  # load the weigths

        self.load_state_dict(th)

def segnet_bn_relu(in_channels, out_channels, pretrained=False, **kwargs):

    """Constructs a ResNet-34 model.

    Args:

        pretrained (bool): If True, returns a model pre-trained on ImageNet

    """

    model = SegNet_BN_ReLU(in_channels, out_channels)

    if pretrained:

        model.load_pretrained_weights()

    return model

if __name__=='__main__':

    net=segnet_bn_relu(3,4,False)

    print(net)

    x=torch.rand((1,3,1024,1024))

    print(net.forward(x).shape)

　　训练网络的代码：

import argparse

import torch

import torch.nn as nn

import torch.nn.functional as F

import torch.optim as optim

from farmdataset import FarmDataset

from segnet import segnet_bn_relu as Unet

import time

from PIL import Image

def train(args, model, device, train_loader, optimizer, epoch):

    model.train()

    for batch_idx, (data, target) in enumerate(train_loader):

        data, target = data.to(device), target.to(device)

        #print(target.shape)

        optimizer.zero_grad()

        output = model(data)

        #print('output size',output.size(),output)

        output = F.log_softmax(output, dim=1)

        loss=nn.NLLLoss2d(weight=torch.Tensor([0.1,0.5,0.5,0.2]).to('cuda'))(output,target)

        loss.backward()

        optimizer.step()

        #time.sleep(0.6)#make gpu sleep

        if batch_idx % args.log_interval == 0:

            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(

                epoch, batch_idx * len(data), len(train_loader.dataset),

                100. * batch_idx / len(train_loader), loss.item()))

    if epoch%2==0:

        imgd=output.detach()[0,:,:,:].cpu()

        img=torch.argmax(imgd,0).byte().numpy()

        imgx=Image.fromarray(img).convert('L')

        imgxx=Image.fromarray(target.detach()[0,:,:].cpu().byte().numpy()*255).convert('L')

        imgx.save("./tmp/predict{}.bmp".format(epoch))

        imgxx.save('./tmp/real{}.bmp'.format(epoch))

def test(args, model, device, testdataset,issave=False):

    model.eval()

    test_loss = 0

    correct = 0

    evalid=[i+7 for i in range(0,2100,15)]

    maxbatch=len(evalid)

    with torch.no_grad():

        for idx in evalid:

            data, target=testdataset[idx]

            data, target = data.unsqueeze(0).to(device), target.unsqueeze(0).to(device)

            #print(target.shape)

            target=target[:,:1472,:1472]

            output = model(data[:,:,:1472,:1472])

            output = F.log_softmax(output, dim=1)

            loss=nn.NLLLoss2d().to('cuda')(output,target)

            test_loss+=loss

            r=torch.argmax(output[0],0).byte()

            tg=target.byte().squeeze(0)

            tmp=0

            count=0

            for i in range(1,4):

                mp=r==i

                tr=tg==i

                tp=mp*tr==1

                t=(mp+tr-tp).sum().item()

                if t==0:

                    continue

                else:

                    tmp+=tp.sum().item()/t

                    count+=1

            if count>0:

                correct+=tmp/count

            if issave:

                Image.fromarray(r.cpu().numpy()).save('predict.png')

                Image.fromarray(tg.cpu().numpy()).save('target.png')

                input()

    print('Test Loss is {:.6f}, mean precision is: {:.4f}%'.format(test_loss/maxbatch,correct))

def main():

    # Training settings

    parser = argparse.ArgumentParser(description='Scratch segmentation Example')

    parser.add_argument('--batch-size', type=int, default=8, metavar='N',

                        help='input batch size for training (default: 64)')

    parser.add_argument('--test-batch-size', type=int, default=8, metavar='N',

                        help='input batch size for testing (default: 1000)')

    parser.add_argument('--epochs', type=int, default=30, metavar='N',

                        help='number of epochs to train (default: 10)')

    parser.add_argument('--lr', type=float, default=0.001, metavar='LR',

                        help='learning rate (default: 0.01)')

    parser.add_argument('--momentum', type=float, default=0.5, metavar='M',

                        help='SGD momentum (default: 0.5)')

    parser.add_argument('--no-cuda', action='store_true', default=False,

                        help='disables CUDA training')

    parser.add_argument('--seed', type=int, default=1, metavar='S',

                        help='random seed (default: 1)')

    parser.add_argument('--log-interval', type=int, default=10, metavar='N',

                        help='how many batches to wait before logging training status')

    args = parser.parse_args()

    use_cuda = not args.no_cuda and torch.cuda.is_available()

    torch.manual_seed(args.seed)

    device = torch.device("cuda" if use_cuda else "cpu")

    print('my device is :',device)

    kwargs = {'num_workers': 0, 'pin_memory': True} if use_cuda else {}

    train_loader = torch.utils.data.DataLoader(     FarmDataset(istrain=True),batch_size=args.batch_size, shuffle=True,drop_last=True, **kwargs)

    startepoch=0

    model =torch.load('./tmp/model{}'.format(startepoch))  if startepoch else Unet(3,4).to(device)

    args.epochs=50

    optimizer = optim.SGD(model.parameters(), lr=args.lr, momentum=args.momentum)

    for epoch in range(startepoch, args.epochs + 1):

        train(args, model, device, train_loader, optimizer, epoch)

        if epoch %3==0:

            print(epoch)

            test(args, model, device, FarmDataset(istrain=True,isaug=False),issave=False)

            torch.save(model,'./tmp/model{}'.format(epoch))

if __name__ == '__main__':

    main()

　　训练代码每隔三轮，评测一次训练精度，测试数据仍然使用训练数据，只是抽样了。可以根据该精度选择使用什么时刻的网络作为预测节点。

训练精度可以达到0.4，但是这时候貌似过学习了。提交结果并不好。

到现在感觉，只要改进一点，分数就会高一点。如果要继续提高成绩，感觉可以从以下几个方面改进：

样本的不均衡

损失函数

模型结构的设计可以参考PSP，UNET，deeplab，或者GAN的pix2pix。

总之，感觉只要进行一点改进，功夫就不会白费。

整个从数据切割，数据集准备，数据增强，预测结果保存，深度分割网络和网络训练，全部代码到此分享完毕，

做完这些你的结果就能到0.2以上。也是折腾了好几天才到现在，希望这能成为一个基线，看到更精彩的模型思路。

（完）

Pytorch 分割模型构建和训练【直播】2019 年县域农业大脑AI挑战赛---(四)模型构建和网络训练的更多相关文章

Pytorch 加载保存模型【直播】2019 年县域农业大脑AI挑战赛---(三)保存结果
在模型训练结束,结束后,通常是一个分割模型,输入 1024x1024 输出 4x1024x1024. 一种方法就是将整个图切块,然后每张预测,但是有个不好处就是可能在边界处断续. 由于这种切块再预测很 ...
Pytorch dataset自定义【直播】2019 年县域农业大脑AI挑战赛---数据准备(二)，Dataset定义
在我的torchvision库里介绍的博文(https://www.cnblogs.com/yjphhw/p/9773333.html)里说了对pytorch的dataset的定义方式. 本文相当于实 ...
Pytorch【直播】2019 年县域农业大脑AI挑战赛---初级准备(一)切图
比赛地址:https://tianchi.aliyun.com/competition/entrance/231717/introduction 这次比赛给的图非常大5万x5万,在训练之前必须要进行数 ...
综合5项百度大脑AI技术，快速构建智能交通方案
一.整体方案:思路:整合百度AI功能,通过百度AI解决.优化在公交运行过程中遇到的运营.管理.安全等方面的问题.具体如下: 安全方面:通过驾驶员检测+语音合成,对驾驶员状态进行实时检测,跟踪,告警. ...
深度学习原理与框架-猫狗图像识别-卷积神经网络(代码) 1.cv2.resize(图片压缩) 2..get_shape()[1:4].num_elements(获得最后三维度之和) 3.saver.save(训练参数的保存) 4.tf.train.import_meta_graph(加载模型结构) 5.saver.restore(训练参数载入)
1.cv2.resize(image, (image_size, image_size), 0, 0, cv2.INTER_LINEAR) 参数说明:image表示输入图片,image_size表示变 ...
CVPR目标检测与实例分割算法解析：FCOS（2019），Mask R-CNN（2019），PolarMask（2020）
CVPR目标检测与实例分割算法解析:FCOS(2019),Mask R-CNN(2019),PolarMask(2020)1. 目标检测:FCOS(CVPR 2019)目标检测算法FCOS(FCOS: ...
图像分割实验：FCN数据集制作，网络模型定义，网络训练（提供数据集和模型文件，以供参考）
论文:<Fully Convolutional Networks for Semantic Segmentation> 代码:FCN的Caffe 实现数据集:PascalVOC 一数据 ...
MINIST深度学习识别：python全连接神经网络和pytorch LeNet CNN网络训练实现及比较（三）
版权声明:本文为博主原创文章,欢迎转载,并请注明出处.联系方式:460356155@qq.com 在前两篇文章MINIST深度学习识别:python全连接神经网络和pytorch LeNet CNN网 ...
Pytorch半精度浮点型网络训练问题
用Pytorch1.0进行半精度浮点型网络训练需要注意下问题: 1.网络要在GPU上跑,模型和输入样本数据都要cuda().half() 2.模型参数转换为half型,不必索引到每层,直接model. ...

随机推荐

主机与虚拟机连接，主机能ping通虚拟机虚拟机ping不通主机问题
事件描述: 从物理主机ping虚拟机时,能正常返回信息.反之,从虚机ping物理主机时返回信息:Destination Host unreachable. 解决方法: 首先,是因为默认创建的虚拟机 ...
.NET中的字符串(2):你真的了解.NET中的String吗？
概述 String在任何语言中,都有它的特殊性,在.NET中也是如此.它属于基本数据类型,也是基本数据类型中唯一的引用类型.字符串可以声明为常量,但是它却放在了堆中.希望通过本文能够使大家对.NET中 ...
19市赛树状数组第k个小的糖果
int find_kth(int k) { , cnt = , i; ; i >= ; i--)/ { ans += ( << i); if (ans >= maxn|| cn ...
【译】高级T-SQL进阶系列（二）【下篇】：使用 APPLY操作符
[译注:此文为翻译,由于本人水平所限,疏漏在所难免,欢迎探讨指正] 原文链接:传送门. 使用OUTER APPLY 操作符 OUTER APPLY操作符工作起来和CROSS APPLY比较类似.唯一的 ...
maven搭建ssm 完整过程
https://blog.csdn.net/qq_28008917/article/details/79755935
Nginx常用模块及作用
Nginx模块详解 nginx模块分为两种,官方和第三方,我们通过命令 nginx -V 查看 nginx已经安装的模块! [root@localhost ~]# nginx -V nginx ver ...
PIP安装模块下载慢或者无法下载
pip下载不动,模块安装失败由于pip默认的下载源在国外,下载的人也多,难免有时会抽风,下载慢还能熬一熬,有时候就直接安装失败了. 安装gevent模块的的时候,进度条慢悠悠的到达100%,以为马上 ...
UIControl的API
@property(nonatomic, readonly) UIControlState state; 控件的状态,指定为位掩码值. @property(nonatomic, getter=isEn ...
Java 中 VO、PO、DTO、BO、POJO、DAO 之间的区别与解释
转载:https://www.cnblogs.com/hunmeng/p/11298680.html VO value object:值对象通常用于业务层之间的数据传递,由new创建,由GC回收. ...
PageObject
import org.openqa.selenium.WebDriver; import org.openqa.selenium.ie.InternetExplorerDriver; import o ...

Pytorch 分割模型构建和训练【直播】2019 年县域农业大脑AI挑战赛---(四)模型构建和网络训练

Pytorch 分割模型构建和训练【直播】2019 年县域农业大脑AI挑战赛---(四)模型构建和网络训练的更多相关文章

随机推荐

热门专题