See the tutorial.
Important note:
- torchvision provides several pre-trained models with their trained parameters.
- AlexNet, DenseNet, Inception, ResNet, VGG are available, see here.
- Passing pretrained=True to a model constructor loads the pre-trained parameters; newer torchvision releases use the weights argument instead.
- The pre-trained models fall into two types:
  - (features, classifier) model (AlexNet, DenseNet, VGG)
    - This model has two sub-networks, accessible by net.features and net.classifier.
    - net.features is a torch.nn.Sequential object. net.classifier is also a torch.nn.Sequential in AlexNet and VGG, but a single torch.nn.Linear in DenseNet.
  - (..., fc) model (Inception, ResNet)
    - This model has several components and a final fully-connected layer, accessible by net.fc.
- (features, classifier) model (AlexNet, DenseNet, VGG)
  - To customize the final fully connected layer(s):
    - define a pre-trained model with pretrained=True
    - make all parameters that should stay fixed untrainable (requires_grad = False)
    - replace the final fully connected layer(s)
    - give the optimizer only the trainable parameters
[code lang='python']
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision

if __name__ == '__main__':
    B = 4    # batch size
    C = 3    # input channels
    H = 224  # input height
    W = 224  # input width
    num_classes = 8

    # random inputs and targets, only to exercise the pipeline
    x = torch.rand(B, C, H, W)
    y = torch.rand(B, num_classes)

    # load the pre-trained model
    # (newer torchvision releases replace pretrained=True with the weights argument)
    net = torchvision.models.vgg16_bn(pretrained=True)
    print('#layers in vgg16.features:', len(net.features))
    print(net.features)
    print('#layers in vgg16.classifier:', len(net.classifier))
    print(net.classifier)

    # freeze the pre-trained feature extractor
    for param in net.features.parameters():
        param.requires_grad = False

    # replace the last FC layer so that it outputs num_classes values
    net.classifier[-1] = nn.Linear(in_features=net.classifier[-1].in_features,
                                   out_features=num_classes)
    for n, module in enumerate(net.classifier):
        print(n, module)

    # training settings: the optimizer only receives the classifier parameters
    criterion = nn.MSELoss()
    optimizer = optim.SGD(net.classifier.parameters(),
                          lr=0.001,
                          momentum=0.9)

    for epoch in range(100):
        optimizer.zero_grad()
        y_pred = net(x)
        loss = criterion(y_pred, y)
        loss.backward()
        optimizer.step()
        print('%d: %f' % (epoch, loss.item()))

    # test: compare the predictions with the targets used for training
    net.eval()
    with torch.no_grad():
        y_pred = net(x)
    print(y)
    print(y_pred)
[/code]
Before FC layer customization:
[code lang='bash']
0 Linear(in_features=25088, out_features=4096, bias=True)
1 ReLU(inplace)
2 Dropout(p=0.5)
3 Linear(in_features=4096, out_features=4096, bias=True)
4 ReLU(inplace)
5 Dropout(p=0.5)
6 Linear(in_features=4096, out_features=1000, bias=True)
[/code]
After FC layer customization:
[code lang='bash']
0 Linear(in_features=25088, out_features=4096, bias=True)
1 ReLU(inplace)
2 Dropout(p=0.5)
3 Linear(in_features=4096, out_features=4096, bias=True)
4 ReLU(inplace)
5 Dropout(p=0.5)
6 Linear(in_features=4096, out_features=8, bias=True)
[/code]