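Column link:
https://blog.csdn.net/qq_39707285/article/details/124005405
This column summarizes key topics in deep learning. It starts from the major dataset competitions and walks through the winning algorithms of each year, and it also covers core topics such as loss functions, optimizers, classic algorithms, and optimization strategies such as Bag of Freebies (BoF).
Column link:
https://blog.csdn.net/qq_39707285/category_11814303.html
This column covers RNN, LSTM, Attention, and Transformer, together with their code implementations.
Column link:
https://blog.csdn.net/qq_39707285/category_12009356.html
This column covers the YOLO family in detail: the official YOLOv1, YOLOv2, YOLOv3, YOLOv4, Scaled-YOLOv4, and YOLOv7, as well as YOLOv5, Meituan's YOLOv6, PaddlePaddle's PP-YOLO and PP-YOLOv2, and YOLOR, YOLOX, and YOLOS.
Column link:
https://blog.csdn.net/qq_39707285/category_12184436.html
This column covers Visual Transformers in detail, including a range of algorithms applied to classification, detection, and segmentation.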
1. Transformers for Classification
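Table 1. Top-1 accuracy of Visual Transformers on the ImageNet-1k, CIFAR-10, and CIFAR-100 datasets. "1k only" means the model is trained on ImageNet-1k only; "21k pre-train" means the model is pre-trained on ImageNet-21k and then fine-tuned on ImageNet-1k; "Distill" means the DeiT distillation training scheme is applied.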
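| Method | Type | Epochs | Batch Size | #Params (M) | FLOPs (G) | Training Scheme | Train Size | Test Size | ImageNet-1k Top-1 (1k only) | ImageNet-1k Top-1 (21k pre-train / DistillΥ) | CIFAR-10 Top-1 | CIFAR-100 Top-1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ViT-B/16↑ | OVT | 300 | 4096 | 86 | 743 | ViT | 224 | 384 | 77.9 | 83.97 | 98.1 | 87.1 |
| ViT-L/16↑ | OVT | 300 | 4096 | 307 | 5172 | ViT | 224 | 384 | 76.5 | 85.15 | 97.9 | 86.4 |
| VT-ResNet18 | TEC | 90 | 256 | 11.7 | 1.569 | - | 224 | 224 | 76.8 | - | - | - |
| VT-ResNet34 | TEC | 90 | 256 | 19.2 | 3.236 | - | 224 | 224 | 79.9 | - | - | - |
| VT-ResNet50 | TEC | 90 | 256 | 21.4 | 3.412 | - | 224 | 224 | 80.6 | - | - | - |
| VT-ResNet101 | TEC | 90 | 256 | 41.5 | 7.129 | - | 224 | 224 | 82.3 | - | - | - |
| BoTNet-S1-59-T2 | TEC | 350 | 4096 | 33.5 | 7.3 | - | 224 | 224 | 81.7 | - | - | - |
| BoTNet-S1-110-T4 | TEC | 350 | 4096 | 54.7 | 10.9 | - | 224 | 224 | 82.8 | - | - | - |
| BoTNet-S1-128-T5↑ | TEC | 350 | 4096 | 75.1 | 19.3 | - | 224 | 256 | 83.5 | - | - | - |
| DeiT-Ti | CET | 300 | 1024 | 5.7 | 1.3 | DeiT | 224 | 224 | 72.2 | 74.5Υ | - | - |
| DeiT-S | CET | 300 | 1024 | 22.1 | 4.6 | DeiT | 224 | 224 | 79.8 | 81.2Υ | - | - |
| DeiT-B | CET | 300 | 1024 | 86.6 | 17.6 | DeiT | 224 | 224 | 81.8 | 83.4Υ | 99.1 | 90.8 |
| DeiT-B↑ | CET | 300 | 1024 | 86.6 | 52.8 | DeiT | 224 | 384 | 83.1 | 84.5Υ | 99.2 | 91.4 |
| ConViT-Ti | CET | 300 | 512 | 6 | 1 | DeiT | 224 | 224 | 73.1 | - | - | - |
| ConViT-S | CET | 300 | 512 | 27 | 5.4 | DeiT | 224 | 224 | 81.3 | - | - | - |
| ConViT-B | CET | 300 | 512 | 86 | 17 | DeiT | 224 | 224 | 82.4 | - | - | - |
| LocalViT-T | CET | 300 | 1024 | 5.9 | 1.3 | DeiT | 224 | 224 | 74.8 | - | - | - |
| LocalViT-S | CET | 300 | 1024 | 22.4 | 4.6 | DeiT | 224 | 224 | 80.8 | - | - | - |
| CeiT-T | CET | 300 | 1024 | 6.4 | 1.2 | DeiT | 224 | 224 | 76.4 | - | 98.5 | 88.4 |
| CeiT-S | CET | 300 | 1024 | 24.2 | 4.5 | DeiT | 224 | 224 | 82 | - | 99 | 90.8 |
| CeiT-T↑ | CET | 300 | 1024 | 6.4 | 3.6 | DeiT | 224 | 384 | 78.8 | - | 98.5 | 88 |
| CeiT-S↑ | CET | 300 | 1024 | 24.2 | 12.9 | DeiT | 224 | 384 | 83.3 | - | 99.1 | 90.8 |
| ResT-Small | CET | 300 | 2048 | 13.66 | 1.9 | DeiT | 224 | 224 | 79.6 | - | - | - |
| ResT-Base | CET | 300 | 2048 | 30.28 | 4.3 | DeiT | 224 | 224 | 81.6 | - | - | - |
| ResT-Large | CET | 300 | 2048 | 51.63 | 7.9 | DeiT | 224 | 224 | 83.6 | - | - | - |
| ViTC-1GF | CET | 400 | 2048 | 4.6 | 1.1 | DeiT, PVT | 224 | 224 | 75.3 | - | - | - |
| ViTC-4GF | CET | 400 | 2048 | 17.8 | 4 | DeiT, PVT | 224 | 224 | 81.4 | 81.2 | - | - |
| ViTC-18GF | CET | 400 | 1024 | 81.6 | 17.7 | DeiT, PVT | 224 | 224 | 83 | 84.9 | - | - |
| ViTC-36GF | CET | 400 | 512 | 167.8 | 35 | DeiT, PVT | 224 | 224 | 84.2 | 85.8 | - | - |
| CoAtNet-0 | CET | 300/90 | 4096 | 25 | 4.2 | - | 224 | 224 | 81.6 | - | - | - |
| CoAtNet-1 | CET | 300/90 | 4096 | 42 | 8.4 | - | 224 | 224 | 83.3 | - | - | - |
| CoAtNet-2 | CET | 300/90 | 4096 | 75 | 15.7 | - | 224 | 224 | 84.1 | 87.1 | - | - |
| CoAtNet-3 | CET | 300/90 | 4096 | 168 | 34.7 | - | 224 | 224 | 84.5 | 87.6 | - | - |
| CoAtNet-4-E150↑ | CET | 300/90 | 4096 | 275 | 189.5 | - | 224 | 384 | - | 88.4 | - | - |
| TNT-S | TET | 300 | 1024 | 23.8 | 5.2 | DeiT | 224 | 224 | 81.3 | - | - | - |
| TNT-B | TET | 300 | 1024 | 65.6 | 14.1 | DeiT | 224 | 224 | 82.8 | - | - | - |
| TNT-S↑ | TET | 300 | 1024 | 23.8 | - | DeiT | 224 | 384 | 83.1 | - | 98.7 | 90.1 |
| TNT-B↑ | TET | 300 | 1024 | 65.6 | - | DeiT | 224 | 384 | 83.9 | - | 99.1 | 91.1 |
| Swin-T | TET | 300/60 | 1024/4096 | 29 | 4.5 | DeiT | 224 | 224 | 81.3 | - | - | - |
| Swin-S | TET | 300/60 | 1024/4096 | 50 | 8.7 | DeiT | 224 | 224 | 83 | - | - | - |
| Swin-B | TET | 300/60 | 1024/4096 | 88 | 15.4 | DeiT | 224 | 224 | 83.3 | 85.2 | - | - |
| Swin-B↑ | TET | 300/60 | 1024/4096 | 88 | 47 | DeiT | 224 | 384 | 84.2 | 86.0 | - | - |
| Swin-L↑ | TET | 300/60 | 1024/4096 | 197 | 103.9 | DeiT | 224 | 384 | - | 86.4 | - | - |
| VOLO-D1 | TET | 300 | 1024 | 27 | 6.8 | LV-ViT | 224 | 224 | 84.2 | - | - | - |
| VOLO-D2 | TET | 300 | 1024 | 59 | 14.1 | LV-ViT | 224 | 224 | 85.2 | - | - | - |
| VOLO-D3 | TET | 300 | 1024 | 86 | 20.6 | LV-ViT | 224 | 224 | 85.4 | - | - | - |
| VOLO-D4 | TET | 300 | 1024 | 193 | 43.8 | LV-ViT | 224 | 224 | 85.7 | - | - | - |
| VOLO-D5 | TET | 300 | 1024 | 296 | 69 | LV-ViT | 224 | 224 | 86.1 | - | - | - |
| VOLO-D3↑ | TET | 300 | 1024 | 86 | 67.9 | LV-ViT | 224 | 448 | 86.3 | - | - | - |
| VOLO-D4↑ | TET | 300 | 1024 | 193 | 197 | LV-ViT | 224 | 448 | 86.8 | - | - | - |
| VOLO-D5↑ | TET | 300 | 1024 | 296 | 304 | LV-ViT | 224 | 448 | 87 | - | - | - |
| T2T-ViT-14 | TET | 310 | 1024 | 21.5 | 5.2 | - | 224 | 224 | 81.5 | - | 97.5 | 88.4 |
| T2T-ViT-19 | TET | 310 | 1024 | 39.2 | 8.9 | - | 224 | 224 | 81.9 | - | 98.3 | 89 |
| PVT-Tiny | HT | 300 | 128 | 13.2 | 1.9 | DeiT | 224 | 224 | 75.1 | - | - | - |
| PVT-Small | HT | 300 | 128 | 24.5 | 3.8 | DeiT | 224 | 224 | 79.8 | - | - | - |
| PVT-Medium | HT | 300 | 128 | 44.1 | 6.7 | DeiT | 224 | 224 | 81.2 | - | - | - |
| PVT-Large | HT | 300 | 128 | 61.4 | 9.8 | DeiT | 224 | 224 | 81.7 | - | - | - |
| PiT-Ti | HT | 300 | 1024 | 4.9 | 0.7 | DeiT | 224 | 224 | 73 | 74.6Υ | - | - |
| PiT-XS | HT | 300 | 1024 | 10.6 | 1.4 | DeiT | 224 | 224 | 78.1 | 79.1Υ | - | - |
| PiT-S | HT | 300 | 1024 | 23.5 | 2.9 | DeiT | 224 | 224 | 80.9 | 81.9Υ | - | - |
| PiT-B | HT | 300 | 1024 | 73.8 | 12.5 | DeiT | 224 | 224 | 82 | 84Υ | - | - |
| CvT-13 | HT | 300 | 2048 | 20 | 4.5 | ViT, BiT | 224 | 224 | 81.6 | - | - | - |
| CvT-21 | HT | 300 | 2048 | 32 | 7.1 | ViT, BiT | 224 | 224 | 82.5 | - | - | - |
| CvT-13↑ | HT | 300 | 2048 | 20 | 16.3 | ViT, BiT | 224 | 384 | 83 | 83.3 | - | - |
| CvT-21↑ | HT | 300 | 2048 | 32 | 24.9 | ViT, BiT | 224 | 384 | 83.3 | 84.9 | - | - |
| CvT-W24↑ | HT | 300 | 2048 | 277 | 193.2 | ViT, BiT | 224 | 384 | - | 87.7 | - | - |
| DeepViT-S | DT | 300 | 256 | 27 | 6.2 | DeiT, ResNest | 224 | 224 | 82.3 | - | - | - |
| DeepViT-L | DT | 300 | 256 | 55 | 12.5 | DeiT, ResNest | 224 | 224 | 83.1 | - | - | - |
| CaiT-XS-24 | DT | 400 | 1024 | 26.6 | 5.4 | DeiT | 224 | 224 | 81.8 | 82.0Υ | - | - |
| CaiT-S-24 | DT | 400 | 1024 | 46.9 | 9.4 | DeiT | 224 | 224 | 82.7 | 83.5Υ | - | - |
| CaiT-S-36 | DT | 400 | 1024 | 68.2 | 13.9 | DeiT | 224 | 224 | 83.3 | 84Υ | 99.2 | 92.2 |
| CaiT-M-24 | DT | 400 | 1024 | 185.9 | 36 | DeiT | 224 | 224 | 83.4 | 84.7Υ | - | - |
| CaiT-M-36 | DT | 400 | 1024 | 270.9 | 53.7 | DeiT | 224 | 224 | 83.8 | 85.1Υ | 99.3 | 93.3 |
| DiversePatch-S12 | DT | 400 | 1024 | 22 | - | DeiT | 224 | 224 | 81.2 | - | - | - |
| DiversePatch-S24 | DT | 400 | 1024 | 44 | - | DeiT | 224 | 224 | 82.2 | - | - | - |
| DiversePatch-B12 | DT | 400 | 1024 | 86 | - | DeiT | 224 | 224 | 82.9 | - | - | - |
| DiversePatch-B24 | DT | 400 | 1024 | 172 | - | DeiT | 224 | 224 | 83.3 | - | - | - |
| DiversePatch-B12↑ | DT | 400 | 1024 | 86 | - | DeiT | 224 | 384 | 84.2 | - | - | - |
| Refined-ViT-S | DT | 300 | 256 512 | 25 | 7.2 | DeiT | 224 | 224 | 83.6 | - | - | - |
| Refined-ViT-M | DT | 300 | 256 512 | 55 | 13.5 | DeiT | 224 | 224 | 84.6 | - | - | - |
| Refined-ViT-L | DT | 300 | 256 512 | 81 | 19.1 | DeiT | 224 | 224 | 84.9 | - | - | - |
| Refined-ViT-M↑ | DT | 300 | 256 512 | 55 | 49.2 | DeiT | 224 | 384 | 85.6 | - | - | - |
| Refined-ViT-L↑ | DT | 300 | 256 512 | 81 | 69.1 | DeiT | 224 | 384 | 85.7 | - | - | - |
| CrossViT-9 | M | 300 | 4096 | 8.6 | 1.8 | DeiT | 224 | 224 | 73.9 | - | - | - |
| CrossViT-15 | M | 300 | 4096 | 27.4 | 5.8 | DeiT | 224 | 224 | 81.5 | - | 99 | 90.77 |
| CrossViT-18 | M | 300 | 4096 | 43.3 | 9.03 | DeiT | 224 | 224 | 82.5 | - | 99.11 | 91.36 |
| CrossViT-18* | M | 300 | 4096 | 44.3 | 9.5 | DeiT | 224 | 224 | 82.8 | - | - | - |
| CrossViT-15*↑ | M | 300 | 4096 | 28.5 | 21.4 | DeiT | 224 | 384 | 83.5 | - | - | - |
| CrossViT-18*↑ | M | 300 | 4096 | 44.6 | 32.4 | DeiT | 224 | 384 | 83.9 | - | - | - |
| LV-ViT-S | DAT | 300 | 1024 | 26 | 6.6 | LV-ViT | 224 | 224 | 83.3 | - | - | - |
| LV-ViT-M | DAT | 300 | 1024 | 56 | 16 | LV-ViT | 224 | 224 | 84 | - | - | - |
| LV-ViT-L | DAT | 300 | 1024 | 150 | 59 | LV-ViT | 288 | 288 | 85.3 | - | - | - |
| LV-ViT-M↑ | DAT | 300 | 1024 | 56 | 42.2 | LV-ViT | 224 | 384 | 85.4 | - | - | - |
| LV-ViT-L↑ | DAT | 300 | 1024 | 150 | 157.2 | LV-ViT | 288 | 448 | 85.9 | - | - | - |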
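
One way to read Table 1 is to fix a compute budget and ask which model reaches the highest "1k only" accuracy within it. The following is a minimal sketch of that lookup, not part of the original survey: the `Row` type, the `best_under_budget` helper, and the 5 GFLOPs budget are illustrative assumptions, and only a handful of rows (all at 224 test resolution) are copied from the table.

```python
from typing import NamedTuple, Optional, Sequence


class Row(NamedTuple):
    """One Table 1 entry (hypothetical container, for illustration only)."""
    method: str
    params_m: float   # parameters, in millions
    flops_g: float    # FLOPs, in G
    top1_1k: float    # ImageNet-1k top-1 accuracy, "1k only" column


# A few rows copied from Table 1 (train and test size 224).
ROWS = [
    Row("DeiT-S", 22.1, 4.6, 79.8),
    Row("CeiT-S", 24.2, 4.5, 82.0),
    Row("Swin-T", 29.0, 4.5, 81.3),
    Row("T2T-ViT-14", 21.5, 5.2, 81.5),
    Row("PVT-Small", 24.5, 3.8, 79.8),
    Row("CvT-13", 20.0, 4.5, 81.6),
]


def best_under_budget(rows: Sequence[Row], max_flops_g: float) -> Optional[Row]:
    """Return the row with the highest top-1 accuracy whose FLOPs fit the budget."""
    eligible = [r for r in rows if r.flops_g <= max_flops_g]
    if not eligible:
        return None
    return max(eligible, key=lambda r: r.top1_1k)


if __name__ == "__main__":
    best = best_under_budget(ROWS, max_flops_g=5.0)
    if best is not None:
        print(f"{best.method}: {best.top1_1k} top-1 at {best.flops_g} GFLOPs")
```

Such a comparison is only indicative, since the rows differ in epochs, batch size, and training scheme, as the other columns of the table show.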
2. Transformers for Object Detection
3. Transformers for Segmentation
Reposted from: https://blog.csdn.net/qq_39707285/article/details/128914081