SSD Code Notes + EfficientNet Backbone Practice

The SSD code is fully working now. Then, using EfficientNet, whose recent accuracy and speed numbers are both excellent, as the backbone, I designed my own TinySSD network. I haven't tuned the hyperparameters, so the network doesn't converge very well yet; I'll tune it later and put it to real use.

torch.clamp

torch.clamp(input, min, max, out=None) → Tensor

Equivalent to clip: every element of input is clamped into the range [min, max].

Example:

    >>> a = torch.randn(4)
    >>> a
    tensor([-1.7120,  0.1734, -0.0478, -0.0922])
    >>> torch.clamp(a, min=-0.5, max=0.5)
    tensor([-0.5000,  0.1734, -0.0478, -0.0922])

Computing IoU

Intersection over union. First compute areas: given two corner points, the area is just width times height.

The intersection is the rectangle bounded by whichever top-left corner is farther from the origin and whichever bottom-right corner is nearer to it.

    import torch

    def area_of(left_top, right_bottom):  # (num_boxes,2), (num_boxes,2)
        # Clamp at 0: if two boxes don't overlap, the raw difference is
        # negative, and clamping turns that into an area of 0.
        hw = torch.clamp(right_bottom - left_top, min=0.0)  # (num_boxes,2)
        return hw[..., 0] * hw[..., 1]

    def iou_of(boxes0, boxes1, eps=1e-5):  # (N,4) and (1,4) or (N,4)
        # Note that boxes0 and boxes1 may have different sizes; the smaller one
        # is broadcast to the larger before the max/min operations.
        overlap_left_top = torch.max(boxes0[..., :2], boxes1[..., :2])      # farther top-left corner
        overlap_right_bottom = torch.min(boxes0[..., 2:], boxes1[..., 2:])  # nearer bottom-right corner
        area0 = area_of(boxes0[..., :2], boxes0[..., 2:])  # area from top-left and bottom-right corners
        area1 = area_of(boxes1[..., :2], boxes1[..., 2:])  # area of the predicted boxes
        overlap_area = area_of(overlap_left_top, overlap_right_bottom)  # intersection area
        return overlap_area / (area0 + area1 - overlap_area + eps)
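
A quick check of the broadcasting behavior (a minimal sketch; the corner-form coordinates are made up for illustration):

    import torch

    # boxes1 has one box, so it broadcasts against both rows of boxes0.
    boxes0 = torch.tensor([[0.0, 0.0, 2.0, 2.0],
                           [1.0, 1.0, 3.0, 3.0]])
    boxes1 = torch.tensor([[1.0, 1.0, 2.0, 2.0]])
    print(iou_of(boxes0, boxes1))
    # tensor([0.2500, 0.2500]) -- each 1x1 overlap over a union of 4 + 1 - 1 = 4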

Generating prior boxes

The coordinate-generation trick here was a real takeaway: itertools.product builds the (i, j) mapping. Every point on the feature map gets +0.5 so it sits at the cell center, then is divided by scale. scale is generally not equal to the feature-map size (it depends on the network design; here they happen to match): it is how many sliding-window cells fit across the original image. Anchors at the other scales are appended afterwards.

    from itertools import product
    from math import sqrt

    import torch
    from torch import nn

    # Some comments come from the original code; the other English ones I added.
    class PriorBox(nn.Module):
        def __init__(self):
            super(PriorBox, self).__init__()
            self.image_size = 512
            self.feature_maps = [16, 8, 4, 2, 1]
            self.min_sizes = [30, 60, 111, 162, 213]
            self.max_sizes = [60, 111, 162, 213, 512]
            self.strides = [32, 64, 128, 256, 512]
            self.aspect_ratios = [[2], [2, 3], [2, 3], [2], [2]]
            self.clip = True

        def forward(self):
            """Generate SSD Prior Boxes.

            It returns the center, height and width of the priors. The values
            are relative to the image size.

            Returns:
                priors (num_priors, 4): The prior boxes represented as
                [[center_x, center_y, w, h]]. All the values are relative to
                the image size.
            """
            priors = []
            for k, f in enumerate(self.feature_maps):  # every size of feature map
                scale = self.image_size / self.strides[k]  # how many cells (not anchors) per row in the raw image
                # 512 / 32 = 16
                for i, j in product(range(f), repeat=2):  # (x, y) generator over the feature map
                    # unit center x, y
                    cx = (j + 0.5) / scale  # treat cells as blocks with (x, y) at the center
                    cy = (i + 0.5) / scale  # e.g. (15,15) -> (15.5,15.5) -> (15.5/16, 15.5/16), the center of the cell

                    # small sized square box
                    size = self.min_sizes[k]
                    h = w = size / self.image_size
                    priors.append([cx, cy, w, h])

                    # big sized square box
                    size = sqrt(self.min_sizes[k] * self.max_sizes[k])  # geometric mean of min_size and max_size
                    h = w = size / self.image_size
                    priors.append([cx, cy, w, h])

                    # change h/w ratio of the small sized box:
                    # w * ratio with h / ratio, and w / ratio with h * ratio
                    size = self.min_sizes[k]
                    h = w = size / self.image_size
                    for ratio in self.aspect_ratios[k]:
                        ratio = sqrt(ratio)
                        priors.append([cx, cy, w * ratio, h / ratio])
                        priors.append([cx, cy, w / ratio, h * ratio])

            priors = torch.Tensor(priors)
            if self.clip:
                priors.clamp_(max=1, min=0)
            return priors
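
As a sanity check on the count (a quick sketch): each location gets 2 square boxes plus 2 per aspect ratio, so the per-location counts are [4, 6, 6, 4, 4] and the grand total is 1524 priors.

    # Priors per location: 2 squares + 2 per aspect ratio -> [4, 6, 6, 4, 4].
    feature_maps = [16, 8, 4, 2, 1]
    aspect_ratios = [[2], [2, 3], [2, 3], [2], [2]]
    per_location = [2 + 2 * len(r) for r in aspect_ratios]
    print(sum(f * f * n for f, n in zip(feature_maps, per_location)))  # 1524

    priors = PriorBox()()  # nn.Module.__call__ -> forward(), which takes no inputs
    print(priors.shape)    # torch.Size([1524, 4]), center form, relative coords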

Assigning prior boxes

This makes good use of broadcasting: compute all pairwise IoUs, then take, for each target, the prior it overlaps most, and, for each prior, the target it overlaps most, and finally filter by a threshold. A toy run follows the code.

    def assign_priors(gt_boxes, gt_labels, corner_form_priors, iou_threshold):
        """Assign ground truth boxes and targets to priors.

        Args:
            gt_boxes (num_targets, 4): ground truth boxes.
            gt_labels (num_targets): labels of targets.
            corner_form_priors (num_priors, 4): corner form priors.

        Returns:
            boxes (num_priors, 4): real values for priors.
            labels (num_priors): labels for priors.
        """
        # size: num_priors x num_targets
        ious = iou_of(gt_boxes.unsqueeze(0), corner_form_priors.unsqueeze(1))
        # size: num_priors
        best_target_per_prior, best_target_per_prior_index = ious.max(1)  # for each prior: its highest IoU and that target's index
        # size: num_targets
        best_prior_per_target, best_prior_per_target_index = ious.max(0)  # for each target: its highest IoU over all priors and that prior's index
        for target_index, prior_index in enumerate(best_prior_per_target_index):
            best_target_per_prior_index[prior_index] = target_index  # force each target's best prior to point back at that target
        # 2.0 is used to make sure every target has a prior assigned
        best_target_per_prior.index_fill_(0, best_prior_per_target_index, 2)  # dim=0, value=2: a target's best prior is treated as IoU 2
        # size: num_priors
        labels = gt_labels[best_target_per_prior_index]  # num_priors; assign by highest IoU first
        labels[best_target_per_prior < iou_threshold] = 0  # background id: a prior's best IoU may still be tiny, so filter those out too
        boxes = gt_boxes[best_target_per_prior_index]  # hand over the matched box directly
        return boxes, labels
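
A toy run of the assignment logic (a sketch with made-up boxes): the first prior matches target 0 exactly, the second overlaps target 1 with IoU around 0.56, and the third overlaps nothing, so it falls back to background.

    import torch

    gt_boxes = torch.tensor([[0.0, 0.0, 0.4, 0.4],
                             [0.5, 0.5, 0.9, 0.9]])
    gt_labels = torch.tensor([1, 2])
    priors = torch.tensor([[0.0, 0.0, 0.4, 0.4],   # IoU 1.0 with target 0
                           [0.5, 0.5, 0.8, 0.8],   # IoU ~0.56 with target 1
                           [0.0, 0.6, 0.2, 0.9]])  # near-zero IoU with both
    boxes, labels = assign_priors(gt_boxes, gt_labels, priors, iou_threshold=0.5)
    print(labels)  # tensor([1, 2, 0]) -- the last prior is below the threshold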

hard_negative_mining

Produces a mask that decides which losses count and which don't. Since negatives vastly outnumber positives, this is one way to keep them in check.

    import math

    def hard_negative_mining(loss, labels, neg_pos_ratio):
        """
        It is used to suppress the presence of a large number of negative predictions.
        It works on image level, not batch level.
        For any example/image, it keeps all the positive predictions and
        cuts the number of negative predictions to make sure the ratio
        between the negative examples and positive examples is no more than
        the given ratio for an image.

        Args:
            loss (N, num_priors): the loss for each example.
            labels (N, num_priors): the labels.
            neg_pos_ratio: the ratio between the negative examples and positive examples.
        """
        pos_mask = labels > 0
        num_pos = pos_mask.long().sum(dim=1, keepdim=True)
        num_neg = num_pos * neg_pos_ratio
        loss[pos_mask] = -math.inf  # positives get -inf so they sort last and never count as hard negatives
        _, indexes = loss.sort(dim=1, descending=True)
        _, orders = indexes.sort(dim=1)  # double argsort: orders[i] is the descending-loss rank of prior i
        neg_mask = orders < num_neg  # keep only the num_neg highest-loss negatives
        return pos_mask | neg_mask
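
The double argsort is the subtle part: sorting the sort indices once more yields each prior's rank by loss, so orders < num_neg keeps exactly the num_neg hardest negatives. A sketch (note the function writes -inf into loss in place, hence the clone):

    import torch

    # One image, eight priors, two positives; with neg_pos_ratio=2 we keep
    # the 2 positives plus the 4 negatives with the highest loss.
    loss = torch.tensor([[0.9, 0.1, 0.8, 0.2, 0.7, 0.3, 0.6, 0.4]])
    labels = torch.tensor([[1, 0, 0, 0, 2, 0, 0, 0]])
    mask = hard_negative_mining(loss.clone(), labels, neg_pos_ratio=2)
    print(mask.long())  # tensor([[1, 0, 1, 0, 1, 1, 1, 1]])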

Loss Function

Smooth L1 loss for the bounding boxes, cross-entropy loss for classification.

    class MultiBoxLoss(nn.Module):
        def __init__(self, neg_pos_ratio):
            """Implement SSD MultiBox Loss.

            Basically, MultiBox loss combines classification loss
            and Smooth L1 regression loss.
            """
            super(MultiBoxLoss, self).__init__()
            self.neg_pos_ratio = neg_pos_ratio

        def forward(self, confidence, predicted_locations, labels, gt_locations):
            """Compute classification loss and smooth l1 loss.

            Args:
                confidence (batch_size, num_priors, num_classes): class predictions.
                predicted_locations (batch_size, num_priors, 4): predicted locations.
                labels (batch_size, num_priors): real labels of all the priors.
                gt_locations (batch_size, num_priors, 4): real boxes corresponding all the priors.
            """
            num_classes = confidence.size(2)
            with torch.no_grad():
                # derived from cross_entropy = sum(log(p)); use the background
                # log-probability as the mining loss
                loss = -F.log_softmax(confidence, dim=2)[:, :, 0]
                mask = box_utils.hard_negative_mining(loss, labels, self.neg_pos_ratio)

            confidence = confidence[mask, :]
            classification_loss = F.cross_entropy(confidence.view(-1, num_classes), labels[mask], reduction='sum')

            pos_mask = labels > 0
            predicted_locations = predicted_locations[pos_mask, :].view(-1, 4)
            gt_locations = gt_locations[pos_mask, :].view(-1, 4)
            smooth_l1_loss = F.smooth_l1_loss(predicted_locations, gt_locations, reduction='sum')
            num_pos = gt_locations.size(0)
            return smooth_l1_loss / num_pos, classification_loss / num_pos
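
A shape check with random tensors (a sketch; it assumes the hard_negative_mining function above is reachable as box_utils.hard_negative_mining, as the class expects):

    import torch

    # 2 images, 1524 priors, 3 classes -- the shapes TinySSD below produces.
    criterion = MultiBoxLoss(neg_pos_ratio=3)
    confidence = torch.randn(2, 1524, 3)
    predicted_locations = torch.randn(2, 1524, 4)
    labels = torch.randint(0, 3, (2, 1524))
    gt_locations = torch.randn(2, 1524, 4)
    reg_loss, cls_loss = criterion(confidence, predicted_locations, labels, gt_locations)
    print(reg_loss.item(), cls_loss.item())  # two scalars, both divided by num_pos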

Model

From what I've seen, EfficientNet, the product of neural architecture search, is optimal in both accuracy and speed, so I took it directly as the backbone. I haven't tuned it properly yet: I just take its output and stack extra convolutional layers on top as the five-level feature pyramid. The receptive field is probably too large, so convergence isn't great, but it does run; I'll get the hyperparameters right later.

Training results with EfficientNet as the backbone:

EfficientNet Model

    import torch
    from torch import nn
    from torch.nn import functional as F

    from .utils import (
        relu_fn,
        round_filters,
        round_repeats,
        drop_connect,
        Conv2dSamePadding,
        get_model_params,
        efficientnet_params,
        load_pretrained_weights,
    )


    class MBConvBlock(nn.Module):
        """
        Mobile Inverted Residual Bottleneck Block

        Args:
            block_args (namedtuple): BlockArgs, see above
            global_params (namedtuple): GlobalParams, see above

        Attributes:
            has_se (bool): Whether the block contains a Squeeze and Excitation layer.
        """

        def __init__(self, block_args, global_params):
            super().__init__()
            self._block_args = block_args
            self._bn_mom = 1 - global_params.batch_norm_momentum
            self._bn_eps = global_params.batch_norm_epsilon
            self.has_se = (self._block_args.se_ratio is not None) and (0 < self._block_args.se_ratio <= 1)
            self.id_skip = block_args.id_skip  # skip connection and drop connect

            # Expansion phase
            inp = self._block_args.input_filters  # number of input channels
            oup = self._block_args.input_filters * self._block_args.expand_ratio  # number of output channels
            if self._block_args.expand_ratio != 1:
                self._expand_conv = Conv2dSamePadding(in_channels=inp, out_channels=oup, kernel_size=1, bias=False)
                self._bn0 = nn.BatchNorm2d(num_features=oup, momentum=self._bn_mom, eps=self._bn_eps)

            # Depthwise convolution phase
            k = self._block_args.kernel_size
            s = self._block_args.stride
            self._depthwise_conv = Conv2dSamePadding(
                in_channels=oup, out_channels=oup, groups=oup,  # groups makes it depthwise
                kernel_size=k, stride=s, bias=False)
            self._bn1 = nn.BatchNorm2d(num_features=oup, momentum=self._bn_mom, eps=self._bn_eps)

            # Squeeze and Excitation layer, if desired
            if self.has_se:
                num_squeezed_channels = max(1, int(self._block_args.input_filters * self._block_args.se_ratio))
                self._se_reduce = Conv2dSamePadding(in_channels=oup, out_channels=num_squeezed_channels, kernel_size=1)
                self._se_expand = Conv2dSamePadding(in_channels=num_squeezed_channels, out_channels=oup, kernel_size=1)

            # Output phase
            final_oup = self._block_args.output_filters
            self._project_conv = Conv2dSamePadding(in_channels=oup, out_channels=final_oup, kernel_size=1, bias=False)
            self._bn2 = nn.BatchNorm2d(num_features=final_oup, momentum=self._bn_mom, eps=self._bn_eps)

        def forward(self, inputs, drop_connect_rate=None):
            """
            :param inputs: input tensor
            :param drop_connect_rate: drop connect rate (float, between 0 and 1)
            :return: output of block
            """
            # Expansion and Depthwise Convolution
            x = inputs
            if self._block_args.expand_ratio != 1:
                x = relu_fn(self._bn0(self._expand_conv(inputs)))
            x = relu_fn(self._bn1(self._depthwise_conv(x)))

            # Squeeze and Excitation
            if self.has_se:
                x_squeezed = F.adaptive_avg_pool2d(x, 1)
                x_squeezed = self._se_expand(relu_fn(self._se_reduce(x_squeezed)))
                x = torch.sigmoid(x_squeezed) * x

            x = self._bn2(self._project_conv(x))

            # Skip connection and drop connect
            input_filters, output_filters = self._block_args.input_filters, self._block_args.output_filters
            if self.id_skip and self._block_args.stride == 1 and input_filters == output_filters:
                if drop_connect_rate:
                    x = drop_connect(x, p=drop_connect_rate, training=self.training)
                x = x + inputs  # skip connection
            return x


    class EfficientNet(nn.Module):
        """
        An EfficientNet model. Most easily loaded with the .from_name or .from_pretrained methods

        Args:
            blocks_args (list): A list of BlockArgs to construct blocks
            global_params (namedtuple): A set of GlobalParams shared between blocks

        Example:
            model = EfficientNet.from_pretrained('efficientnet-b0')
        """

        def __init__(self, blocks_args=None, global_params=None):
            super().__init__()
            assert isinstance(blocks_args, list), 'blocks_args should be a list'
            assert len(blocks_args) > 0, 'block args must be greater than 0'
            self._global_params = global_params
            self._blocks_args = blocks_args

            # Batch norm parameters
            bn_mom = 1 - self._global_params.batch_norm_momentum
            bn_eps = self._global_params.batch_norm_epsilon

            # Stem
            in_channels = 3  # rgb
            out_channels = round_filters(32, self._global_params)  # number of output channels
            self._conv_stem = Conv2dSamePadding(in_channels, out_channels, kernel_size=3, stride=2, bias=False)
            self._bn0 = nn.BatchNorm2d(num_features=out_channels, momentum=bn_mom, eps=bn_eps)

            # Build blocks
            self._blocks = nn.ModuleList([])
            for block_args in self._blocks_args:
                # Update block input and output filters based on depth multiplier.
                block_args = block_args._replace(
                    input_filters=round_filters(block_args.input_filters, self._global_params),
                    output_filters=round_filters(block_args.output_filters, self._global_params),
                    num_repeat=round_repeats(block_args.num_repeat, self._global_params)
                )

                # The first block needs to take care of stride and filter size increase.
                self._blocks.append(MBConvBlock(block_args, self._global_params))
                if block_args.num_repeat > 1:
                    block_args = block_args._replace(input_filters=block_args.output_filters, stride=1)
                for _ in range(block_args.num_repeat - 1):
                    self._blocks.append(MBConvBlock(block_args, self._global_params))

            # Head
            in_channels = block_args.output_filters  # output of final block
            out_channels = round_filters(1280, self._global_params)
            self._conv_head = Conv2dSamePadding(in_channels, out_channels, kernel_size=1, bias=False)
            self._bn1 = nn.BatchNorm2d(num_features=out_channels, momentum=bn_mom, eps=bn_eps)

            # Final linear layer
            self._dropout = self._global_params.dropout_rate
            self._fc = nn.Linear(out_channels, self._global_params.num_classes)

        def extract_features(self, inputs):
            """ Returns output of the final convolution layer """
            # Stem
            x = relu_fn(self._bn0(self._conv_stem(inputs)))

            # Blocks
            for idx, block in enumerate(self._blocks):
                drop_connect_rate = self._global_params.drop_connect_rate
                if drop_connect_rate:
                    drop_connect_rate *= float(idx) / len(self._blocks)
                x = block(x)  # , drop_connect_rate) # see https://github.com/tensorflow/tpu/issues/381
            return x

        def forward(self, inputs):
            """ Calls extract_features to extract features, applies final linear layer, and returns logits. """
            # Convolution layers
            x = self.extract_features(inputs)

            # Head
            x = relu_fn(self._bn1(self._conv_head(x)))
            x = F.adaptive_avg_pool2d(x, 1).squeeze(-1).squeeze(-1)
            if self._dropout:
                x = F.dropout(x, p=self._dropout, training=self.training)
            x = self._fc(x)
            return x

        @classmethod
        def from_name(cls, model_name, override_params=None):
            cls._check_model_name_is_valid(model_name)
            blocks_args, global_params = get_model_params(model_name, override_params)
            return EfficientNet(blocks_args, global_params)

        @classmethod
        def from_pretrained(cls, model_name):
            model = EfficientNet.from_name(model_name)
            load_pretrained_weights(model, model_name)
            return model

        @classmethod
        def get_image_size(cls, model_name):
            cls._check_model_name_is_valid(model_name)
            _, _, res, _ = efficientnet_params(model_name)
            return res

        @classmethod
        def _check_model_name_is_valid(cls, model_name, also_need_pretrained_weights=False):
            """ Validates model name. Note that pretrained weights are only available for
            the first four models (efficientnet-b{i} for i in 0,1,2,3) at the moment. """
            num_models = 4 if also_need_pretrained_weights else 8
            valid_models = ['efficientnet_b' + str(i) for i in range(num_models)]
            if model_name.replace('-', '_') not in valid_models:
                raise ValueError('model_name should be one of: ' + ', '.join(valid_models))
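
For detection only extract_features matters, which is how TinySSD below consumes the backbone. A usage sketch (the shapes assume efficientnet-b0 with a 512×512 input: the final block output is a 16×16 map with 320 channels at stride 32, matching the 320-channel prediction heads below):

    import torch

    model = EfficientNet.from_name('efficientnet-b0')
    x = torch.randn(1, 3, 512, 512)
    features = model.extract_features(x)
    print(features.shape)  # torch.Size([1, 320, 16, 16])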

My TinySSD Model

    '''
    @Description: This is Aoru Xue's demo, which is only for reference
    @version:
    @Author: Aoru Xue
    @Date: 2019-06-14 00:42:10
    @LastEditors: Aoru Xue
    @LastEditTime: 2019-09-02 17:04:26
    '''
    import torch
    from torch import nn
    import torch.nn.functional as F
    from efficientnet_pytorch import EfficientNet
    from torchsummary import summary

    from prior_box import PriorBox
    from box_utils import *


    class TinySSD(nn.Module):
        def __init__(self, training=True):
            super(TinySSD, self).__init__()
            self.basenet = EfficientNet.from_name('efficientnet-b0')
            self.training = training  # nn.Module also tracks this; train()/eval() will toggle it
            for idx, num_anchors in enumerate([4, 6, 6, 4, 4]):
                setattr(self, "predict_bbox_{}".format(idx + 1), nn.Conv2d(
                    320, num_anchors * 4, kernel_size=3, padding=1
                ))
                setattr(self, "predict_class_{}".format(idx + 1), nn.Conv2d(  # 3 here is 2 classes + 1 background
                    320, 3 * num_anchors, kernel_size=3, padding=1
                ))
            self.priors = None
            for idx, k in enumerate([[320, 320], [320, 320], [320, 320]]):
                setattr(self, "feature_{}".format(idx + 2), nn.Sequential(
                    nn.Conv2d(k[0], k[1], kernel_size=3, padding=1),
                    nn.BatchNorm2d(k[1]),
                    nn.ReLU(),
                    nn.Conv2d(k[1], k[1], kernel_size=3, padding=1),
                    nn.BatchNorm2d(k[1]),
                    nn.ReLU(),
                    nn.MaxPool2d(2)
                ))

        def forward(self, x):
            x = self.basenet.extract_features(x)
            feature_1 = x
            feature_2 = self.feature_2(x)
            feature_3 = self.feature_3(feature_2)
            feature_4 = self.feature_4(feature_3)
            feature_5 = F.max_pool2d(feature_4, kernel_size=2)
            '''
            Location head outputs:
            (2, 4*4, 16, 16)
            (2, 4*6, 8, 8)
            (2, 4*6, 4, 4)
            (2, 4*4, 2, 2)
            (2, 4*4, 1, 1)
            -> per anchor center, 4 consecutive values stand for x y w h
            '''
            confidences = []
            locations = []
            locations.append(self.predict_bbox_1(feature_1).permute(0, 2, 3, 1).contiguous())
            locations.append(self.predict_bbox_2(feature_2).permute(0, 2, 3, 1).contiguous())
            locations.append(self.predict_bbox_3(feature_3).permute(0, 2, 3, 1).contiguous())
            locations.append(self.predict_bbox_4(feature_4).permute(0, 2, 3, 1).contiguous())
            locations.append(self.predict_bbox_5(feature_5).permute(0, 2, 3, 1).contiguous())
            locations = torch.cat([o.view(o.size(0), -1) for o in locations], 1)  # (batch_size, total_anchor_num*4)
            locations = locations.view(locations.size(0), -1, 4)  # (batch_size, total_anchor_num, 4)
            confidences.append(self.predict_class_1(feature_1).permute(0, 2, 3, 1).contiguous())
            confidences.append(self.predict_class_2(feature_2).permute(0, 2, 3, 1).contiguous())
            confidences.append(self.predict_class_3(feature_3).permute(0, 2, 3, 1).contiguous())
            confidences.append(self.predict_class_4(feature_4).permute(0, 2, 3, 1).contiguous())
            confidences.append(self.predict_class_5(feature_5).permute(0, 2, 3, 1).contiguous())
            confidences = torch.cat([o.view(o.size(0), -1) for o in confidences], 1)  # (batch_size, total_anchor_num*3)
            confidences = confidences.view(confidences.size(0), -1, 3)  # (batch_size, total_anchor_num, 3)
            if not self.training:
                if self.priors is None:
                    self.priors = PriorBox()()
                    self.priors = self.priors.cuda()
                boxes = convert_locations_to_boxes(
                    locations, self.priors, 0.1, 0.2
                )
                confidences = F.softmax(confidences, dim=2)
                return confidences, boxes
            else:
                return confidences, locations  # (2, 1524, 3), (2, 1524, 4)


    if __name__ == "__main__":
        net = TinySSD()
        net.cuda()
        summary(net, (3, 512, 512), device="cuda")

Dataset

    '''
    @Description: This is Aoru Xue's demo, which is only for reference
    @version:
    @Author: Aoru Xue
    @Date: 2019-06-15 12:48:09
    @LastEditors: Aoru Xue
    @LastEditTime: 2019-09-13 10:43:34
    '''
    import glob
    import xml.etree.ElementTree as ET

    import cv2 as cv
    import numpy as np
    import torch
    from torch.utils.data import Dataset
    from PIL import Image

    from prior_box import PriorBox
    from box_utils import *


    class Mydataset(Dataset):
        def __init__(self, img_path="./dataset", transform=None, center_variance=0.1, size_variance=0.2):
            self.center_variance = center_variance
            self.size_variance = size_variance
            self.img_paths = glob.glob(img_path + "/images/*.jpg")
            self.labels = [label.replace(".jpg", ".xml").replace("images", "labels") for label in self.img_paths]
            self.class_names = ("__background__", "basketball", "volleyball")
            prior = PriorBox()
            self.center_form_priors = prior()  # center form
            self.imgW, self.imgH = 512, 512
            self.corner_form_priors = center_form_to_corner_form(self.center_form_priors)
            self.transform = transform

        def __len__(self):
            return len(self.img_paths)

        def __getitem__(self, idx):
            img = Image.open(self.img_paths[idx]).convert("RGB")
            gt_bboxes, gt_classes = self._get_annotation(idx)
            if self.transform:
                img = self.transform(img)
            gt_bboxes, gt_classes = assign_priors(gt_bboxes, gt_classes, self.corner_form_priors, 0.5)  # corner form
            gt_bboxes = corner_form_to_center_form(gt_bboxes)  # (1524, 4) center form
            # Regress offsets relative to the priors rather than the boxes directly;
            # this is effectively a normalization and makes the targets easier to fit.
            locations = convert_boxes_to_locations(gt_bboxes, self.center_form_priors,
                                                   self.center_variance, self.size_variance)
            return [img, locations, gt_classes]

        def _get_annotation(self, idx):
            annotation_file = self.labels[idx]
            objects = ET.parse(annotation_file).findall("object")
            boxes = []
            labels = []
            for obj in objects:
                class_name = obj.find('name').text.lower().strip()
                bbox = obj.find('bndbox')
                # VOC dataset format follows Matlab, in which indexes start from 1
                x1 = float(bbox.find('xmin').text) - 1
                y1 = float(bbox.find('ymin').text) - 1
                x2 = float(bbox.find('xmax').text) - 1
                y2 = float(bbox.find('ymax').text) - 1
                boxes.append([x1 / self.imgW, y1 / self.imgH, x2 / self.imgW, y2 / self.imgH])
                labels.append(self.class_names.index(class_name))
            return (torch.tensor(boxes, dtype=torch.float),
                    torch.tensor(labels, dtype=torch.long))


    if __name__ == '__main__':
        dataset = Mydataset()
        img, gt_loc, gt_labels = dataset[0]
        cv_img = np.array(img)
        cv_img = cv.cvtColor(cv_img, cv.COLOR_RGB2BGR)
        idx = gt_labels > 0
        # Decode the encoded targets back into center-form boxes to visualize them.
        loc = convert_locations_to_boxes(gt_loc, dataset.center_form_priors, 0.1, 0.2)
        loc = loc[idx]
        for i in range(loc.size(0)):
            cx, cy, w, h = loc[i, :]
            cx = cx.item() * 512.
            cy = cy.item() * 512.
            w = w.item() * 512.
            h = h.item() * 512.
            cv.rectangle(cv_img, (int(cx - w / 2), int(cy - h / 2)),
                         (int(cx + w / 2), int(cy + h / 2)), (255, 0, 0), 2)
        cv.imshow("cv", cv_img)
        cv.waitKey(0)
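
convert_boxes_to_locations and its inverse aren't shown in this post; for reference, here is a minimal sketch of the standard SSD encoding they implement (my reconstruction under the center_variance = 0.1, size_variance = 0.2 convention, not necessarily the exact box_utils code):

    import torch

    def convert_boxes_to_locations(center_form_boxes, center_form_priors,
                                   center_variance, size_variance):
        # t_xy = ((b_xy - p_xy) / p_wh) / center_variance
        # t_wh = log(b_wh / p_wh) / size_variance
        return torch.cat([
            (center_form_boxes[..., :2] - center_form_priors[..., :2])
            / center_form_priors[..., 2:] / center_variance,
            torch.log(center_form_boxes[..., 2:] / center_form_priors[..., 2:])
            / size_variance,
        ], dim=-1)

    def convert_locations_to_boxes(locations, center_form_priors,
                                   center_variance, size_variance):
        # Exact inverse of the encoding above.
        return torch.cat([
            locations[..., :2] * center_variance
            * center_form_priors[..., 2:] + center_form_priors[..., :2],
            torch.exp(locations[..., 2:] * size_variance)
            * center_form_priors[..., 2:],
        ], dim=-1)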

Training

    '''
    @Description: This is Aoru Xue's demo, which is only for reference
    @version:
    @Author: Aoru Xue
    @Date: 2019-06-15 12:56:39
    @LastEditors: Aoru Xue
    @LastEditTime: 2019-09-10 20:46:54
    '''
    import torch
    import torch.optim as optim
    from torch.utils.data import DataLoader
    from torchvision import transforms
    from tqdm import tqdm

    from TinySSD import TinySSD
    from dataset import Mydataset
    from multibox_loss import MultiBoxLoss


    def train(dataloader, net, loss_fn, optimizer, epochs=200):
        for epoch in range(epochs):
            running_loss_bbox = 0.
            running_loss_class = 0.
            for img, gt_bbox, gt_class in tqdm(dataloader):
                img = img.cuda()
                gt_bbox = gt_bbox.cuda()
                gt_class = gt_class.cuda()
                optimizer.zero_grad()
                pred_class, pred_locations = net(img)
                regression_loss, classification_loss = loss_fn(pred_class, pred_locations, gt_class, gt_bbox)
                loss = regression_loss + classification_loss
                loss.backward()
                running_loss_bbox += regression_loss.item()
                running_loss_class += classification_loss.item()
                optimizer.step()
            print("*" * 20)
            print("average bbox loss: {:.8f}; average class loss: {:.8f}".format(
                running_loss_bbox / len(dataloader), running_loss_class / len(dataloader)))
            if epoch % 5 == 0:
                torch.save(net.state_dict(), "./ckpt/{}.pkl".format(epoch))


    if __name__ == "__main__":
        net = TinySSD()
        net.cuda()
        loss_fn = MultiBoxLoss(3.)
        transform = transforms.Compose([
            transforms.Resize((512, 512)),
            transforms.ToTensor(),
            transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
        ])
        # TODO: swap the plain resize above for the full SSD augmentation pipeline
        # (PhotometricDistort, Expand, RandomSampleCrop, RandomMirror, ...).
        optm = optim.Adam(net.parameters(), lr=1e-3)
        dtset = Mydataset(img_path="./dataset", transform=transform)
        dataloader = DataLoader(dtset, batch_size=8, shuffle=True)
        train(dataloader, net, loss_fn, optm)
