
 

This time I implemented MobileNet V1 myself. I could clearly confirm that it trains faster than ResNet and VGGNet.

 

1. Setup

I used the same setup as in the previous implementations.

 

import torch.nn as nn
import torch
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader
import torch.optim as optim
import time
import numpy as np

import random
import torch.backends.cudnn as cudnn

# fix every source of randomness for reproducibility
seed = 2022
torch.manual_seed(seed)
torch.cuda.manual_seed(seed)
torch.cuda.manual_seed_all(seed)
np.random.seed(seed)
cudnn.benchmark = False      # disable the cuDNN autotuner so runs are repeatable
cudnn.deterministic = True
random.seed(seed)
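
The data pipeline itself is not repeated in this post. As a rough sketch, a CIFAR-10 loader consistent with the imports above might look like the following; the 224x224 resize, the normalization constants, and the batch size are my assumptions, not necessarily the settings of the earlier posts.

transform = transforms.Compose([
    transforms.Resize(224),  # assumed: upscale CIFAR-10 so the stride-2 layers have room to downsample
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
testset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform)
trainloader = DataLoader(trainset, batch_size=64, shuffle=True, num_workers=2)
testloader = DataLoader(testset, batch_size=64, shuffle=False, num_workers=2)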

 

2. Depthwise Separable Conv

 

In PyTorch, nn.Conv2d takes a groups parameter, and the official documentation even kindly explains how to use it for a depthwise convolution. I wrote the code below with reference to the following link.

The pointwise half of the separable convolution is just a 1x1 convolution, so I handled it with a 1x1 conv.

For reference, this was implemented with torch version 1.10.1.

https://pytorch.org/docs/1.10.1/generated/torch.nn.Conv2d.html?highlight=depthwise 
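
To see concretely why this decomposition is cheap, here is a minimal sketch (the channel sizes are arbitrary) comparing the parameter count of one standard 3x3 convolution with that of a depthwise 3x3 (groups=in_channels) followed by a pointwise 1x1:

cin, cout = 64, 128
standard = nn.Conv2d(cin, cout, kernel_size=3, padding=1)
depthwise = nn.Conv2d(cin, cin, kernel_size=3, padding=1, groups=cin)  # one 3x3 filter per input channel
pointwise = nn.Conv2d(cin, cout, kernel_size=1)                        # 1x1 conv mixes the channels

count = lambda m: sum(p.numel() for p in m.parameters())
print(count(standard))                      # 73856
print(count(depthwise) + count(pointwise))  # 640 + 8320 = 8960, roughly 8x fewer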

 


 

 

class DWConv(nn.Module):
    """Depthwise separable convolution: a 3x3 depthwise conv followed by a 1x1 pointwise conv."""
    def __init__(self, in_channels, multiplier, stride, padding = 1):
        super(DWConv, self).__init__()
        self.stride = stride
        self.padding = padding
        self.in_channels = in_channels
        self.multiplier = multiplier  # depth multiplier: output channels per input channel
        self.out_channels = self.in_channels * self.multiplier
        self.DW = nn.Sequential(
            # depthwise 3x3: groups=in_channels gives every input channel its own filter
            nn.Conv2d(in_channels=self.in_channels,
                      out_channels=self.out_channels,
                      groups=self.in_channels,
                      kernel_size = 3,
                      stride = self.stride,
                      padding = self.padding),
            nn.BatchNorm2d(self.out_channels),
            nn.ReLU(),
            # pointwise 1x1: mixes information across channels
            nn.Conv2d(in_channels = self.out_channels,
                      out_channels=self.out_channels,
                      kernel_size=1,
                      stride=1),
            nn.BatchNorm2d(self.out_channels),
            nn.ReLU()
        )

    def forward(self, x):
        x = self.DW(x)
        return x
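
A quick shape check of the block (input size chosen arbitrarily): with a depth multiplier of 1 the channel count is preserved, and stride 2 halves the spatial resolution.

x = torch.randn(4, 32, 112, 112)
block = DWConv(in_channels=32, multiplier=1, stride=2)
print(block(x).shape)  # torch.Size([4, 32, 56, 56])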

 

3. Conv 3x3

 

The full MobileNet v1 architecture also uses a plain 3x3 convolution, so I declared that as a class as well.

class conv3x3(nn.Module):
    """Standard 3x3 convolution followed by BatchNorm and ReLU."""
    def __init__(self, in_channels, out_channels, stride, padding):
        super(conv3x3, self).__init__()
        self.in_channels = in_channels
        self.out_channels = out_channels
        self.net = nn.Sequential(
            nn.Conv2d(
                in_channels=self.in_channels,
                out_channels=self.out_channels,
                stride = stride,
                padding = padding,
                kernel_size=3
            ),
            nn.BatchNorm2d(self.out_channels),
            nn.ReLU()
        )

    def forward(self, x):
        x = self.net(x)
        return x

 

 

4. MobileNet v1

 

It is not the cleanest code, but I tried to implement the architecture just as it is presented in the paper.

 

class MobileNetV1(nn.Module):
    def __init__(self, in_channels, num_classes):
        super(MobileNetV1, self).__init__()
        self.features = nn.Sequential(
            conv3x3(in_channels, 32, 2, 1),
            DWConv(32, 1, 1),
            conv3x3(32, 64, 1, 1),
            DWConv(64, 1, 2),
            conv3x3(64, 128, 1, 1),
            DWConv(128, 1, 2),
            conv3x3(128, 256, 1, 1),
            DWConv(256, 1, 1),
            conv3x3(256, 256, 1, 1),
            DWConv(256, 1, 2),
            conv3x3(256, 512, 1, 1),
            DWConv(512, 1, 1),
            conv3x3(512, 512, 1, 1),
            DWConv(512, 1, 1),
            conv3x3(512, 512, 1, 1),
            DWConv(512, 1, 1),
            conv3x3(512, 512, 1, 1),
            DWConv(512, 1, 1),
            conv3x3(512, 512, 1, 1),
            DWConv(512, 1, 1),
            conv3x3(512, 512, 1, 1),
            DWConv(512, 1, 2),
            conv3x3(512, 1024, 1, 1),
            DWConv(1024, 1, 2),
            conv3x3(1024, 1024, 1, 1)
        )
        self.avgpool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(in_features=1024, out_features=num_classes)
        # Table 1 of the paper ends with a softmax layer; when training with
        # nn.CrossEntropyLoss, pass the pre-softmax logits to the loss instead
        self.softmax = nn.Softmax(dim = 1)

    def forward(self, x):
        x = self.features(x)
        x = self.avgpool(x)
        x = torch.flatten(x, 1)  # (N, 1024, 1, 1) -> (N, 1024); unlike squeeze(), keeps the batch dim when N == 1
        x = self.fc(x)
        x = self.softmax(x)
        return x
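
As a smoke test of the full model, assuming 3-channel 224x224 inputs and 10 classes as for CIFAR-10:

model = MobileNetV1(in_channels=3, num_classes=10)
x = torch.randn(4, 3, 224, 224)
print(model(x).shape)                              # torch.Size([4, 10])
print(sum(p.numel() for p in model.parameters()))  # total number of parameters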

 

When I then tested it on CIFAR-10, the accuracy was 0.553, which is not very high; still, considering that I trained for only 20 epochs, and given how fast the training was, I could see the value of the architecture first-hand.
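
For reference, an accuracy number like the one above can be computed with a loop such as this minimal sketch, assuming the model and the testloader from the sketches above; since the argmax of the softmax output equals the argmax of the logits, the final softmax does not affect accuracy.

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = model.to(device).eval()

correct, total = 0, 0
with torch.no_grad():
    for images, labels in testloader:
        images, labels = images.to(device), labels.to(device)
        preds = model(images).argmax(dim=1)   # predicted class per sample
        correct += (preds == labels).sum().item()
        total += labels.size(0)
print(correct / total)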

 

The full code can be found at the following GitHub link.

https://github.com/kkt4828/reviewpaper/blob/68b179ff6550744ea262306908be50eb386dd3cc/MobileNet/MobileNetV1.py

 


Reference paper - Howard et al., "MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications", https://arxiv.org/abs/1704.04861

 
