
OpenMMLab Computer Vision Framework Open-Source Study (3): Image Classification in Practice


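Preface: This post focuses on the hands-on side of image classification. It applies the MMClassification toolkit in code and closes with a fruit classification walkthrough. Most of the environment and code configuration is omitted here; for those details, see the previous post, "OpenMMLab Computer Vision Framework Open-Source Study (2): Image Classification".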


1. Install OpenMMLab v2.0

Step 1. Install MMCV

mim install "mmcv>=2.0.0rc0"

Step 2. Install MMClassification and MMDetection

mim install "mmcls>=1.0.0rc0" "mmdet>=3.0.0rc0"
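After the two install steps it can help to confirm what actually ended up in the environment. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is made up for this example, and the distribution names are simply the ones from the commands above.

```python
from importlib import metadata

# Distribution names taken from the install commands above.
PACKAGES = ("mmcv", "mmcls", "mmdet")


def installed_versions(packages=PACKAGES):
    """Return {name: version string, or None if the package is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```

A `None` entry means the corresponding `mim install` step did not take effect in the active Python environment.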

Config template walkthrough:


  
model = dict(
    type='ImageClassifier',  # classifier type
    backbone=dict(
        type='ResNet',  # backbone type
        depth=50,  # backbone depth; ResNet usually offers 18, 34, 50, 101 and 152
        num_stages=4,  # number of backbone stages; the feature maps they produce feed the head
        out_indices=(3, ),  # indices of the stages whose feature maps are output; the farther from the input image, the larger the index
        frozen_stages=-1,  # stages to freeze when fine-tuning (no backpropagation through them during training). With num_stages=4 the backbone consists of a stem and 4 stages: -1 freezes nothing, 0 freezes the stem, 1 freezes the stem and stage 1, 4 freezes the whole backbone
        style='pytorch'),  # backbone style: 'pytorch' puts the stride-2 layer in the 3x3 convolution, 'caffe' puts it in the 1x1 convolution
    neck=dict(type='GlobalAveragePooling'),  # neck type
    head=dict(
        type='LinearClsHead',  # linear classification head
        num_classes=1000,  # number of output classes; must match the number of classes in the dataset
        in_channels=2048,  # number of input channels; must match the output channels of the neck
        loss=dict(type='CrossEntropyLoss', loss_weight=1.0),  # loss function configuration
        topk=(1, 5),  # evaluation metric: top-k accuracy, here top-1 and top-5
    ))
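Two values in this template depend on each other: `in_channels` of the head has to equal the channel count of the stage selected by `out_indices`, and that count depends on `depth`. The sketch below encodes the standard ResNet widths (BasicBlock for 18/34, Bottleneck with 4x expansion for 50/101/152) and the `frozen_stages` semantics described in the comments above. The helper names `stage_channels` and `frozen_parts` are hypothetical and are not part of MMClassification.

```python
# Base widths of the four ResNet stages.
# BasicBlock (ResNet-18/34) keeps the base width; Bottleneck (50/101/152) expands it 4x.
BASE_WIDTHS = (64, 128, 256, 512)
EXPANSION = {18: 1, 34: 1, 50: 4, 101: 4, 152: 4}


def stage_channels(depth, stage_index):
    """Channels of the feature map produced by stage `stage_index` (0-based)."""
    return BASE_WIDTHS[stage_index] * EXPANSION[depth]


def frozen_parts(frozen_stages, num_stages=4):
    """List the backbone parts frozen for a given `frozen_stages` value."""
    if frozen_stages < 0:
        return []
    parts = ["stem"]
    parts += [f"stage{i}" for i in range(1, min(frozen_stages, num_stages) + 1)]
    return parts


print(stage_channels(50, 3))  # channel count matching in_channels in the head above
print(stage_channels(18, 3))
print(frozen_parts(1))
```

Under these assumptions, switching the template to `depth=18` would also require changing `in_channels` from 2048 to 512.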

2. Image classification in PyTorch

This task trains on FashionMNIST. The complete code follows:


  
# https://pytorch.org/tutorials/beginner/basics/quickstart_tutorial.html
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor

# Training
## Construct Dataset and Dataloader
training_data = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor(),
)
test_data = datasets.FashionMNIST(
    root="data",
    train=False,
    download=True,
    transform=ToTensor(),
)
batch_size = 64
train_dataloader = DataLoader(training_data, batch_size=batch_size)
test_dataloader = DataLoader(test_data, batch_size=batch_size)

## Define model
class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28 * 28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10)
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

device = "cuda" if torch.cuda.is_available() else "cpu"
model = NeuralNetwork().to(device)

## Define loss function and Optimizer
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

## Inner loop for training
def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    model.train()
    for batch, (X, y) in enumerate(dataloader):
        X, y = X.to(device), y.to(device)
        # Compute prediction error
        pred = model(X)
        loss = loss_fn(pred, y)
        # Backpropagation
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        # Output Logs
        if batch % 100 == 0:
            loss, current = loss.item(), batch * len(X)
            print(f"loss: {loss:>7f} [{current:>5d}/{size:>5d}]")

## Inner loop for test
def test(dataloader, model, loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    model.eval()
    test_loss, correct = 0, 0
    with torch.no_grad():
        for X, y in dataloader:
            X, y = X.to(device), y.to(device)
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            correct += (pred.argmax(1) == y).type(torch.float).sum().item()
    test_loss /= num_batches
    correct /= size
    print(f"Test Error: \n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \n")

## Launch training / test loops
epochs = 5
for t in range(epochs):
    print(f"Epoch {t+1}\n-------------------------------")
    train(train_dataloader, model, loss_fn, optimizer)
    test(test_dataloader, model, loss_fn)
print("Done!")

## Saving Models
torch.save(model.state_dict(), "model.pth")
print("Saved PyTorch Model State to model.pth")

# Deployment
## Loading Models
model = NeuralNetwork()
model.load_state_dict(torch.load("model.pth"))

# Predict new images
classes = [
    "T-shirt/top",
    "Trouser",
    "Pullover",
    "Dress",
    "Coat",
    "Sandal",
    "Shirt",
    "Sneaker",
    "Bag",
    "Ankle boot",
]
model.eval()
x, y = test_data[0][0], test_data[0][1]
with torch.no_grad():
    pred = model(x)
    predicted, actual = classes[pred[0].argmax(0)], classes[y]
    print(f'Predicted: "{predicted}", Actual: "{actual}"')
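To make the loss and the accuracy figure in the script above less of a black box, here is a small dependency-free sketch of what they compute. `nn.CrossEntropyLoss` takes raw logits and, with its default settings, averages the negative log-softmax of the target class over the batch; the accuracy in `test()` is the fraction of rows whose argmax matches the label. The function names and the toy numbers are illustrative only.

```python
import math


def cross_entropy(logits, targets):
    """Mean over the batch of -log softmax(logits)[target]."""
    total = 0.0
    for row, target in zip(logits, targets):
        # Subtract the row maximum before exponentiating for numerical stability.
        peak = max(row)
        log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in row))
        total += log_sum_exp - row[target]
    return total / len(targets)


def accuracy(logits, targets):
    """Fraction of rows whose largest logit sits at the target index (top-1)."""
    hits = sum(1 for row, t in zip(logits, targets) if row.index(max(row)) == t)
    return hits / len(targets)


batch_logits = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]
batch_targets = [0, 2]
print(cross_entropy(batch_logits, batch_targets))
print(accuracy(batch_logits, batch_targets))
```

Two equal logits give a loss of ln 2, which is a handy sanity check for an untrained two-class model.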

3. Inference with a pretrained model provided by MMClassification:

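Set up the environment (the code in this section appears to target the MMClassification 0.x API, hence mmcv-full rather than the 2.0 release candidates installed earlier):

pip install openmim mmengine
mim install mmcv-full mmcls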

Inference using high-level API


  
from mmcls.apis import init_model, inference_model

model = init_model('mobilenet-v2_8xb32_in1k.py',
                   'mobilenet_v2_batch256_imagenet_20200708-3b2dc3af.pth',
                   device='cuda:0')
# Output: load checkpoint from local path: mobilenet_v2_batch256_imagenet_20200708-3b2dc3af.pth

result = inference_model(model, 'banana.png')
result
# Output: {'pred_label': 954, 'pred_score': 0.9999284744262695, 'pred_class': 'banana'}

from mmcls.apis import show_result_pyplot
show_result_pyplot(model, 'banana.png', result)
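The result is a plain dict with three keys. As a rough illustration of how such a dict relates to the model's per-class scores, the sketch below builds one from a score list and a list of class names. It mirrors only the shape of the output shown above; `summarize_prediction` and the toy data are made up for this example and are not the library's implementation.

```python
def summarize_prediction(scores, class_names):
    """Build a result dict with the same three keys as the output shown above."""
    best = max(range(len(scores)), key=scores.__getitem__)
    return {
        "pred_label": best,                  # index of the highest score
        "pred_score": float(scores[best]),   # the highest score itself
        "pred_class": class_names[best],     # human-readable name for that index
    }


toy_names = ["apple", "banana", "cherry"]
toy_scores = [0.01, 0.97, 0.02]
print(summarize_prediction(toy_scores, toy_names))
```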

PyTorch codes under the hood

Let's write some raw PyTorch code to do the same thing.

This is the actual code that the high-level APIs wrap.

Construct an ImageClassifier

Note: the current implementation only accepts configs (dicts) for the backbone, neck and classification head, not Python objects.

from mmcls.models import ImageClassifier

classifier = ImageClassifier(
    backbone=dict(type='MobileNetV2', widen_factor=1.0),
    neck=dict(type='GlobalAveragePooling'),
    head=dict(
        type='LinearClsHead',
        num_classes=1000,
        in_channels=1280)
)
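The reason the components are passed as dicts is that the framework looks up each `type` string in a registry and instantiates the matching class with the remaining keys. The following is a deliberately simplified sketch of that idea, not MMCV's actual `Registry`; the names `Registry`, `HEADS` and `ToyLinearHead` here are all hypothetical stand-ins.

```python
class Registry:
    """Maps a string name to a class so that configs can refer to it by `type`."""

    def __init__(self):
        self._classes = {}

    def register(self, cls):
        self._classes[cls.__name__] = cls
        return cls  # returning the class lets this be used as a decorator

    def build(self, cfg):
        args = dict(cfg)  # copy so the caller's config is not mutated
        cls = self._classes[args.pop("type")]
        return cls(**args)


HEADS = Registry()  # hypothetical registry, for illustration only


@HEADS.register
class ToyLinearHead:
    def __init__(self, num_classes, in_channels):
        self.num_classes = num_classes
        self.in_channels = in_channels


head_cfg = dict(type="ToyLinearHead", num_classes=1000, in_channels=1280)
head = HEADS.build(head_cfg)
print(type(head).__name__, head.num_classes, head.in_channels)
```

Keeping components as configs means a whole model can be described in a text file and swapped by editing a single `type` string.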

Load trained parameters

import torch

ckpt = torch.load('mobilenet_v2_batch256_imagenet_20200708-3b2dc3af.pth')
classifier.load_state_dict(ckpt['state_dict'])

Construct data preprocessing pipeline

Important: a model works only if the image preprocessing pipeline is correct.

from mmcls.datasets.pipelines import Compose

test_pipeline = Compose([
    dict(type='LoadImageFromFile'),
    dict(type='Resize', size=(256, -1), backend='pillow'),
    dict(type='CenterCrop', crop_size=224),
    dict(
        type='Normalize',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        to_rgb=True),
    dict(type='ImageToTensor', keys=['img']),
    dict(type='Collect', keys=['img'])
])
data = dict(img_info=dict(filename='banana.png'), img_prefix=None)
data = test_pipeline(data)
data
{'img_metas': DataContainer({'filename': 'banana.png', 'ori_filename': 'banana.png', 'ori_shape': (403, 393, 3), 'img_shape': (224, 224, 3), 'img_norm_cfg': {'mean': array([123.675, 116.28 , 103.53 ], dtype=float32), 'std': array([58.395, 57.12 , 57.375], dtype=float32), 'to_rgb': True}}),
 'img': tensor([[[ 0.3309,  0.2967,  0.3138,  ...,  2.0263,  2.0092,  1.9920],
          [ 0.3481,  0.3309,  0.2282,  ...,  2.0263,  2.0092,  1.9920],
          [ 0.2796,  0.2967,  0.2967,  ...,  1.9920,  2.0263,  1.9749],
          ...,
          [ 0.1939,  0.1768,  0.2282,  ...,  0.3994,  0.3309,  0.3823],
          [ 0.1426,  0.1254,  0.2111,  ...,  0.5878,  0.5364,  0.5536],
          [-0.0116, -0.0801,  0.1597,  ...,  0.5707,  0.5536,  0.5364]],
 
         [[ 0.3803,  0.3803,  0.3803,  ...,  2.1660,  2.1485,  2.1134],
          [ 0.4153,  0.4153,  0.3102,  ...,  2.1835,  2.1310,  2.1134],
          [ 0.3452,  0.3803,  0.3803,  ...,  2.1134,  2.1485,  2.1134],
          ...,
          [ 0.2752,  0.2577,  0.3102,  ...,  0.5028,  0.4328,  0.4328],
          [ 0.2227,  0.1877,  0.3102,  ...,  0.6604,  0.6254,  0.5728],
          [ 0.0301, -0.0049,  0.2402,  ...,  0.6604,  0.6254,  0.5728]],
 
         [[ 0.5485,  0.5485,  0.5485,  ...,  2.3437,  2.3263,  2.2914],
          [ 0.5834,  0.5834,  0.4788,  ...,  2.3611,  2.3088,  2.2914],
          [ 0.5136,  0.5485,  0.5485,  ...,  2.3088,  2.3437,  2.3088],
          ...,
          [ 0.4091,  0.3916,  0.4439,  ...,  0.5834,  0.5136,  0.5311],
          [ 0.3568,  0.3045,  0.4265,  ...,  0.7576,  0.7228,  0.7054],
          [ 0.1651,  0.1128,  0.3742,  ...,  0.7576,  0.7402,  0.7054]]])}
 

Equivalent in torchvision

from PIL import Image
from torchvision.transforms import Compose, Resize, CenterCrop, Normalize, ToTensor

tv_transform = Compose([Resize(256), 
                        CenterCrop(224), 
                        ToTensor(),
                        Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
                        ])

image = Image.open('banana.png').convert('RGB')
tv_data = tv_transform(image)
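The two pipelines use different-looking normalization constants because they work on different pixel scales: the MMClassification `Normalize` step receives values in 0-255, while torchvision's `ToTensor` first scales them to [0, 1]. The short check below shows that the constants are the same up to a factor of 255, so the normalization step gives the same result either way. It covers only the normalization, not possible interpolation differences in resizing; the helper names are illustrative.

```python
import math

MEAN_255 = [123.675, 116.28, 103.53]  # used by the MMClassification pipeline
STD_255 = [58.395, 57.12, 57.375]
MEAN_01 = [0.485, 0.456, 0.406]       # used by the torchvision pipeline
STD_01 = [0.229, 0.224, 0.225]


def normalize_255(pixel, channel):
    """Normalize a raw 0-255 value directly."""
    return (pixel - MEAN_255[channel]) / STD_255[channel]


def normalize_01(pixel, channel):
    """Scale to [0, 1] first (as ToTensor does), then normalize."""
    return (pixel / 255.0 - MEAN_01[channel]) / STD_01[channel]


for channel in range(3):
    assert math.isclose(MEAN_01[channel] * 255, MEAN_255[channel])
    assert math.isclose(STD_01[channel] * 255, STD_255[channel])
    assert math.isclose(normalize_255(200, channel), normalize_01(200, channel))
print("both pipelines normalize identically")
```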

Forward through the model

## IMPORTANT: set the classifier to eval mode
classifier.eval()

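# Either preprocessed input works; the second assignment overrides the first.
imgs = data['img'].unsqueeze(0)  # from the MMClassification pipeline
imgs = tv_data.unsqueeze(0)      # from the torchvision pipeline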

with torch.no_grad():
    # class probabilities
    prob = classifier.forward_test(imgs)[0]
    # features
    feat = classifier.extract_feat(imgs, stage='neck')[0]
    
print(len(prob))
print(prob.argmax().item())
print(feat.shape)
# Output:
# 1000
# 954
# torch.Size([1, 1280])
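The `[1, 1280]` feature shape comes from the `GlobalAveragePooling` neck, which collapses each channel's spatial map into a single number. A dependency-free sketch of that operation on one image is shown below; the nested-list layout `[C][H][W]` and the toy values are assumptions for illustration, with 3 channels standing in for the 1280 of MobileNetV2.

```python
def global_average_pool(feature_map):
    """Average each channel of a [C][H][W] nested list down to one number."""
    pooled = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


# Toy feature map: 3 channels of 2x2.
toy = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [0.0, 8.0]],
    [[5.0, 5.0], [5.0, 5.0]],
]
print(global_average_pool(toy))  # [2.5, 2.0, 5.0]
```

Because the spatial dimensions are averaged away, the head's `in_channels` depends only on the backbone's channel count, not on the input resolution.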

4. End-to-end fruit classification with MMClassification:

Dataset download:

GitHub - TommyZihao/MMClassification_Tutorials: Jupyter notebook tutorials for MMClassification. https://github.com/TommyZihao/MMClassification_Tutorials

Code framework:


  
# Abbreviated excerpts that trace the training call chain; not runnable as-is.

# Training script entry point
def main():
    model = build_classifier(cfg.model)
    model.init_weights()
    datasets = [build_dataset(cfg.data.train)]
    train_model(
        model,
        datasets,
        cfg,
        distributed=distributed,
        validate=(not args.no_validate),
        timestamp=timestamp,
        device=cfg.device,
        meta=meta)


# mmcls/apis/train.py
def train_model(model,
                dataset,
                cfg):
    data_loaders = [build_dataloader(ds, **train_loader_cfg) for ds in dataset]
    optimizer = build_optimizer(model, cfg.optimizer)
    runner = build_runner(
        cfg.runner,
        default_args=dict(
            model=model,
            optimizer=optimizer))
    runner.register_training_hooks(
        cfg.lr_config,
        optimizer_config,
        cfg.checkpoint_config,
        cfg.log_config,
        cfg.get('momentum_config', None),
        custom_hooks_config=cfg.get('custom_hooks', None))
    runner.run(data_loaders, cfg.workflow)


# mmcv/runner/epoch_based_runner.py
class EpochBasedRunner(BaseRunner):
    def run_iter(self, data_batch: Any, train_mode: bool, **kwargs) -> None:
        if train_mode:
            outputs = self.model.train_step(data_batch, self.optimizer, **kwargs)
        else:
            outputs = self.model.val_step(data_batch, self.optimizer, **kwargs)
        self.outputs = outputs

    def train(self, data_loader, **kwargs):
        self.model.train()
        self.data_loader = data_loader
        for i, data_batch in enumerate(self.data_loader):
            self.run_iter(data_batch, train_mode=True, **kwargs)
            self.call_hook('after_train_iter')


# mmcls/models/classifiers/base.py
class BaseClassifier(BaseModule, metaclass=ABCMeta):
    def forward(self, img, return_loss=True, **kwargs):
        """Calls either forward_train or forward_test depending on whether
        return_loss=True.

        Note this setting will change the expected inputs. When
        `return_loss=True`, img and img_meta are single-nested (i.e. Tensor and
        List[dict]), and when `return_loss=False`, img and img_meta should be
        double nested (i.e. List[Tensor], List[List[dict]]), with the outer
        list indicating test time augmentations.
        """
        if return_loss:
            return self.forward_train(img, **kwargs)
        else:
            return self.forward_test(img, **kwargs)

    def train_step(self, data, optimizer=None, **kwargs):
        losses = self(**data)
        loss, log_vars = self._parse_losses(losses)
        outputs = dict(
            loss=loss, log_vars=log_vars, num_samples=len(data['img'].data))
        return outputs


# mmcls/models/classifiers/image.py
class ImageClassifier(BaseClassifier):
    def __init__(self,
                 backbone,
                 neck=None,
                 head=None,
                 pretrained=None,
                 train_cfg=None,
                 init_cfg=None):
        super(ImageClassifier, self).__init__(init_cfg)
        if pretrained is not None:
            self.init_cfg = dict(type='Pretrained', checkpoint=pretrained)
        self.backbone = build_backbone(backbone)
        if neck is not None:
            self.neck = build_neck(neck)
        if head is not None:
            self.head = build_head(head)

    def extract_feat(self, img):
        x = self.backbone(img)
        if self.with_neck:
            x = self.neck(x)
        return x

    def forward_train(self, img, gt_label, **kwargs):
        x = self.extract_feat(img)
        losses = dict()
        loss = self.head.forward_train(x, gt_label)
        losses.update(loss)
        return losses


# mmcv/runner/hooks/optimizer.py
class OptimizerHook(Hook):
    def after_train_iter(self, runner):
        runner.optimizer.zero_grad()
        runner.outputs['loss'].backward()
        runner.optimizer.step()
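The excerpts above split one training iteration across three objects: the runner calls the model's `train_step`, stores the outputs, and then fires the `after_train_iter` hook, where the optimizer hook performs the update. The toy sketch below reproduces that control flow in plain Python on a one-weight regression problem. All class names (`ToyModel`, `ToyOptimizerHook`, `ToyRunner`) are invented for this illustration, and the gradient is computed by hand rather than by autograd.

```python
class ToyModel:
    """Fits y = w * x with a single weight; stands in for a classifier."""

    def __init__(self):
        self.w = 0.0
        self.grad = 0.0

    def train_step(self, batch):
        x, y = batch
        error = self.w * x - y
        # Return the loss plus the hand-derived gradient a backward pass would give.
        return {"loss": error ** 2, "dloss_dw": 2 * error * x}


class ToyOptimizerHook:
    """Plays the role of OptimizerHook: update parameters after each iteration."""

    def __init__(self, lr):
        self.lr = lr

    def after_train_iter(self, runner):
        runner.model.grad = 0.0                           # like zero_grad()
        runner.model.grad += runner.outputs["dloss_dw"]   # like backward()
        runner.model.w -= self.lr * runner.model.grad     # like step()


class ToyRunner:
    def __init__(self, model, hooks):
        self.model = model
        self.hooks = hooks
        self.outputs = None

    def call_hook(self, name):
        for hook in self.hooks:
            getattr(hook, name)(self)

    def train(self, data_loader):
        for batch in data_loader:
            self.outputs = self.model.train_step(batch)
            self.call_hook("after_train_iter")


runner = ToyRunner(ToyModel(), [ToyOptimizerHook(lr=0.1)])
data = [(1.0, 3.0), (2.0, 6.0)] * 20  # samples of y = 3x
runner.train(data)
print(round(runner.model.w, 3))
```

The point of the hook indirection is that the runner never needs to know how parameters are updated, so logging, checkpointing and learning-rate scheduling can be attached the same way.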

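Summary: This post focused on the hands-on side of image classification, applying the MMClassification toolkit in code and getting familiar with how its framework is used, as groundwork for handling classification problems in other scenarios later on.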

Reference: GitHub - wangruohui/sjtu-openmmlab-tutorial


Reposted from: https://blog.csdn.net/qq_36816848/article/details/128884766