
PyTorch: fine-tune a pre-trained model

See the tutorial.

Important notes:

  1. torchvision provides several pre-trained models together with their trained parameters.
    1. AlexNet, DenseNet, Inception, ResNet, and VGG are available; see the torchvision.models documentation.
  2. Passing pretrained=True loads the pre-trained parameters.
  3. The pre-trained models fall into two types:
    1. (feature, classifier) models (AlexNet, DenseNet, VGG)
      1. Such a model has two sub-networks, feature and classifier.
      2. Each sub-network is implemented as a torch.nn.Sequential object.
    2. (..., fc) models (Inception, ResNet)
      1. Such a model has several components plus a final fully-connected layer, accessible via net.fc (see the sketch after this list).
  4. To customize the final fully-connected layer(s):
    1. define a pre-trained model with pretrained=True
    2. make the parameters that should stay fixed untrainable (set requires_grad = False)
    3. replace the final fully-connected layer(s)
    4. give the optimizer only the trainable parameters
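
For the (..., fc) type, the same recipe targets net.fc instead of net.classifier. Below is a minimal sketch, assuming resnet18 and num_classes=8 purely for illustration:

[code lang='python']
import torch.nn as nn
import torchvision

# illustrative value, not part of the full script below
num_classes = 8

# load a (..., fc) model and freeze every pre-trained parameter
net = torchvision.models.resnet18(pretrained=True)
for param in net.parameters():
    param.requires_grad = False

# swap in a new FC layer; freshly created modules are trainable by default
net.fc = nn.Linear(in_features=net.fc.in_features,
                   out_features=num_classes)
[/code]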

[code lang='python']
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision

if __name__ == '__main__':
    B = 4            # batch size
    C = 3            # number of input channels
    H = 224          # input height
    W = 224          # input width
    num_classes = 8
    x = torch.rand(B, C, H, W)       # dummy input batch
    y = torch.rand(B, num_classes)   # dummy targets

    # load the pre-trained model
    net = torchvision.models.vgg16_bn(pretrained=True)

    print('#layers in vgg16.features:', len(net.features))
    print(net.features)
    print('#layers in vgg16.classifier:', len(net.classifier))
    print(net.classifier)

    # fix all pre-trained feature parameters
    for param in net.features.parameters():
        param.requires_grad = False

    # replace the last FC layer
    net.classifier[-1] = nn.Linear(in_features=net.classifier[-1].in_features,
                                   out_features=num_classes)
    for n, module in enumerate(net.classifier):
        print(n, module)

    # training settings: give the optimizer only the classifier parameters
    criterion = nn.MSELoss()
    optimizer = optim.SGD(net.classifier.parameters(),
                          lr=0.001,
                          momentum=0.9)

    for epoch in range(100):
        optimizer.zero_grad()

        y_pred = net(x)
        loss = criterion(y_pred, y)
        loss.backward()
        optimizer.step()

        print('%d: %f' % (epoch, loss.item()))

    # test on fresh random targets
    net.eval()
    y = torch.rand(B, num_classes)
    y_pred = net(x)
    print(y)
    print(y_pred)
[/code]
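
Note that handing net.classifier.parameters() to the optimizer works here only because the frozen parameters all live in net.features. A more general pattern, sketched below under the same VGG16 setup (not part of the original script), filters net.parameters() by requires_grad, which also covers the (..., fc) models where the trainable layer is not grouped in a Sequential:

[code lang='python']
import torch.nn as nn
import torch.optim as optim
import torchvision

# rebuild the fine-tuning setup: frozen features, new 8-class FC layer
net = torchvision.models.vgg16_bn(pretrained=True)
for param in net.features.parameters():
    param.requires_grad = False
net.classifier[-1] = nn.Linear(net.classifier[-1].in_features, 8)

# keep only the parameters that still require gradients
trainable_params = [p for p in net.parameters() if p.requires_grad]
optimizer = optim.SGD(trainable_params, lr=0.001, momentum=0.9)
print('#trainable parameter tensors:', len(trainable_params))
[/code]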

Before FC layer customization:

[code lang='bash']
0 Linear(in_features=25088, out_features=4096, bias=True)
1 ReLU(inplace)
2 Dropout(p=0.5)
3 Linear(in_features=4096, out_features=4096, bias=True)
4 ReLU(inplace)
5 Dropout(p=0.5)
6 Linear(in_features=4096, out_features=1000, bias=True)
[/code]

After FC layer customization:

[code lang='bash']
0 Linear(in_features=25088, out_features=4096, bias=True)
1 ReLU(inplace)
2 Dropout(p=0.5)
3 Linear(in_features=4096, out_features=4096, bias=True)
4 ReLU(inplace)
5 Dropout(p=0.5)
6 Linear(in_features=4096, out_features=8, bias=True)
[/code]