Splitting the dataset
split_train_val.py
import os
import random
import argparse

parser = argparse.ArgumentParser()
# Directory containing the label files; change it to wherever your labels are stored
parser.add_argument('--label_path', default='/data/test/txt', type=str, help='input label path')
# Output directory for the split lists; choose any location you like
parser.add_argument('--save_path', default='imageSets/Path', type=str, help='output txt label path')
opt = parser.parse_args()

train_val_percent = 0.95  # Fraction used for training plus validation; the rest becomes the test set
train_percent = 8 / 9  # Fraction of the train+val portion used for training; adjust as needed
label_file_path = opt.label_path
txt_save_path = opt.save_path
total_label = os.listdir(label_file_path)
if not os.path.exists(txt_save_path):
    os.makedirs(txt_save_path)

num = len(total_label)
list_index = range(num)
tv = int(num * train_val_percent)
tr = int(tv * train_percent)
trainval = random.sample(list_index, tv)
train = random.sample(trainval, tr)

file_trainval = open(txt_save_path + '/trainval.txt', 'w')
file_test = open(txt_save_path + '/test.txt', 'w')
file_train = open(txt_save_path + '/train.txt', 'w')
file_val = open(txt_save_path + '/val.txt', 'w')

for i in list_index:
    # Drop the last four characters (the extension, e.g. ".txt") and keep the bare file name
    name = total_label[i][:-4] + '\n'
    if i in trainval:
        file_trainval.write(name)
        if i in train:
            file_train.write(name)
        else:
            file_val.write(name)
    else:
        file_test.write(name)

file_trainval.close()
file_train.close()
file_val.close()
file_test.close()
Run the following command in a terminal:
python3 split_train_val.py
(Tip: if the command fails, replace python3 with python.)
Example output file: train.txt (one file name per line, without the extension)
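To see how the two ratios in split_train_val.py translate into file counts, the arithmetic can be reproduced on its own. This is a minimal sketch: the function name `split_sizes` and the dataset size of 1000 are made up for illustration, while the two ratios are the defaults from the script.

```python
def split_sizes(num_files, train_val_percent=0.95, train_percent=8 / 9):
    """Return (train, val, test) counts using the same arithmetic as split_train_val.py."""
    tv = int(num_files * train_val_percent)  # files in train + val
    tr = int(tv * train_percent)             # files in train only
    return tr, tv - tr, num_files - tv


if __name__ == "__main__":
    # 1000 is a hypothetical dataset size chosen only for illustration
    train, val, test = split_sizes(1000)
    print(train, val, test)
```

With 1000 label files this gives 844 training, 106 validation, and 50 test names, so the defaults correspond to roughly an 84/11/5 split.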
Generating the dataset path txt files (XML to txt)
# -*- coding: utf-8 -*-
import xml.etree.ElementTree as ET
import os
from os import getcwd

sets = ['train', 'val', 'test']
classes = ["test"]  # Change to your own classes
abs_path = os.getcwd()
print(abs_path)


def convert(size, box):
    # size is (width, height); box is (xmin, xmax, ymin, ymax) in pixels.
    # Returns the normalized (x_center, y_center, width, height).
    dw = 1. / (size[0])
    dh = 1. / (size[1])
    x = (box[0] + box[1]) / 2.0 - 1
    y = (box[2] + box[3]) / 2.0 - 1
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return x, y, w, h


def convert_annotation(image_ID):
    in_file = open('D:/dataSet/marking/%s.xml' % image_ID, encoding='UTF-8')
    out_file = open('D:/dataSet/labels/%s.txt' % image_ID, 'w')
    tree = ET.parse(in_file)
    root = tree.getroot()
    size = root.find('size')
    w = int(size.find('width').text)
    h = int(size.find('height').text)
    for obj in root.iter('object'):
        difficult = obj.find('difficult').text
        # difficult = obj.find('Difficult').text
        cls = obj.find('name').text
        if cls not in classes or int(difficult) == 1:
            continue
        cls_id = classes.index(cls)
        xmlbox = obj.find('bndbox')
        b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text), float(xmlbox.find('ymin').text),
             float(xmlbox.find('ymax').text))
        b1, b2, b3, b4 = b
        # Clamp boxes that extend past the image border
        if b2 > w:
            b2 = w
        if b4 > h:
            b4 = h
        b = (b1, b2, b3, b4)
        bb = convert((w, h), b)
        out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')
    in_file.close()
    out_file.close()


wd = getcwd()
for image_set in sets:
    # Reads the lists written by split_train_val.py; this directory must match
    # that script's --save_path (its default there is imageSets/Path)
    image_ids = open('imageSets/Main/%s.txt' % image_set).read().strip().split()

    if not os.path.exists('imageSets/dataSet_path/'):
        os.makedirs('imageSets/dataSet_path/')

    list_file = open('imageSets/dataSet_path/%s.txt' % image_set, 'w')
    # The list-file path above is relative and does not need to be changed
    for image_id in image_ids:
        # Path of each image; adjust the directory to where your images are stored
        list_file.write('/data/basketball/images/%s.jpg\n' % image_id)
        # convert_annotation(image_id)  # To convert XML annotations to txt, remove the leading # on this line
    list_file.close()
(Tip: in YOLOv5 the folder of txt annotation files should be named labels and sit in the same directory as the images folder.)
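The box conversion in the script can be checked by hand on a single annotation. The sketch below repeats the arithmetic of `convert` under a different, hypothetical name (`voc_to_yolo`); the 640x480 image size and the box coordinates are invented values, and formatting the numbers to six decimals is my own choice rather than what the script does.

```python
def voc_to_yolo(size, box):
    """Same arithmetic as convert(): pixel corners -> normalized centre and size.

    size is (width, height); box is (xmin, xmax, ymin, ymax).
    """
    dw = 1.0 / size[0]
    dh = 1.0 / size[1]
    x_center = ((box[0] + box[1]) / 2.0 - 1) * dw
    y_center = ((box[2] + box[3]) / 2.0 - 1) * dh
    width = (box[1] - box[0]) * dw
    height = (box[3] - box[2]) * dh
    return x_center, y_center, width, height


if __name__ == "__main__":
    # Hypothetical 640x480 image with one box of class index 0
    values = voc_to_yolo((640, 480), (100, 300, 120, 360))
    print("0 " + " ".join(f"{v:.6f}" for v in values))
```

Each line of a label file therefore holds a class index followed by four numbers between 0 and 1, so the annotation stays valid if the image is resized.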
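Writing the configuration files
data
In the yolov5/data directory, create a new file xxxx.yaml (name it as you like, for example test.yaml), using the following template: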
train: /test/imageSets/dataSet_path/train.txt
val: /test/ly01/imageSets/dataSet_path/val.txt
# test: /test/imageSets/dataSet_path/test.txt

# number of classes
nc: 1

# class names
names: ["test"]  # your own classes
(Tip: note that we only tell YOLOv5 where the images are; it looks for the txt annotation files in the labels folder that sits alongside images.)
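The convention in the tip can be written out as a small path transformation. This is only a sketch of the rule as described here, not code taken from YOLOv5, and its loader may differ in detail; the function name `image_to_label_path` and the file name `0001.jpg` are hypothetical.

```python
def image_to_label_path(image_path):
    """Swap the last '/images/' directory for '/labels/' and the extension for '.txt'."""
    images_dir = "/images/"
    labels_dir = "/labels/"
    # rsplit(..., 1) replaces only the last occurrence of '/images/'
    swapped = labels_dir.join(image_path.rsplit(images_dir, 1))
    # Drop the image extension and append '.txt'
    return swapped.rsplit(".", 1)[0] + ".txt"


if __name__ == "__main__":
    # '0001.jpg' is a made-up file name under the image directory used in the script
    print(image_to_label_path("/data/basketball/images/0001.jpg"))
```

This is why the generated train.txt and val.txt lists only need image paths: the label location follows from each image path.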
cfg
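In the yolov5/models directory, create a new file xxxx.yaml (name it as you like, for example yolov5t.yaml). Its contents depend on the model you want to train. For example, to train a model like yolov5s.pt, copy the contents of yolov5s.yaml into the new file and then set nc (the number of annotated classes) to the same value as in the data file above.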
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license

# Parameters
nc: 1  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple
anchors:
  - [10, 13, 16, 30, 33, 23]  # P3/8
  - [30, 61, 62, 45, 59, 119]  # P4/16
  - [116, 90, 156, 198, 373, 326]  # P5/32

# YOLOv5 v6.0 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 6, C3, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
   [-1, 3, C3, [1024]],
   [-1, 1, SPPF, [1024, 5]],  # 9
  ]

# YOLOv5 v6.0 head
head:
  [[-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3, [512, False]],  # 13

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3, [256, False]],  # 17 (P3/8-small)

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 14], 1, Concat, [1]],  # cat head P4
   [-1, 3, C3, [512, False]],  # 20 (P4/16-medium)

   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 10], 1, Concat, [1]],  # cat head P5
   [-1, 3, C3, [1024, False]],  # 23 (P5/32-large)

   [[17, 20, 23], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
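The two multipliers at the top of the file scale the repeat counts and channel widths listed in the backbone and head. The sketch below illustrates the effect under two assumptions about how the model parser applies them: repeat counts above 1 are multiplied by depth_multiple and rounded (never below 1), and channel counts are multiplied by width_multiple and rounded up to a multiple of 8. The function names are hypothetical.

```python
import math


def scaled_repeats(n, depth_multiple):
    """Scale a block's repeat count; a count of 1 is left unchanged (assumed rule)."""
    return max(round(n * depth_multiple), 1) if n > 1 else n


def scaled_channels(c, width_multiple, divisor=8):
    """Scale a channel count and round it up to a multiple of `divisor` (assumed rule)."""
    return math.ceil(c * width_multiple / divisor) * divisor


if __name__ == "__main__":
    # C3 repeat counts and Conv output channels taken from the backbone above
    print([scaled_repeats(n, 0.33) for n in (3, 6, 9, 3)])
    print([scaled_channels(c, 0.50) for c in (64, 128, 256, 512, 1024)])
```

Under these assumptions the backbone's C3 blocks repeat 1, 2, 3 and 1 times and the channel widths are halved, which is how one layer list can describe models of different sizes.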
Training the model
With the preparation done, training can begin. We train with train.py in the yolov5 directory. It accepts many parameters; the most important are:
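weights: path to the weights file
cfg: the configuration file describing the model structure, i.e. the yolov5t.yaml created earlier
data: the file listing the training and test data, i.e. the test.yaml created earlier
epochs: number of epochs, i.e. how many full passes over the dataset training makes
batch-size: number of images per batch, i.e. how many images are "seen" together in one training step
img-size: width and height of the input images
device: which device to train on (GPU or CPU)
workers: number of data-loading workers
resume: resume training from the most recently saved model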
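There are other parameters as well; the help text of each one explains what it does, and every parameter has a default value.
In a terminal, change into the yolov5 project folder and adapt the training command to your needs, for example: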
python3 train.py --weights runs/train/exp2/weights/last.pt --cfg models/yolov5t.yaml --data data/test.yaml --epochs 200 --batch-size 8 --img 1280 --device 1 --workers 2
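If you launch training repeatedly with different settings, it can help to keep the options in one place and generate the command from them. This is a minimal sketch: the helper `build_train_command` is hypothetical, and the option values are the ones from the example command.

```python
import shlex


def build_train_command(options, python="python3"):
    """Turn a dict of train.py options into an argument list."""
    cmd = [python, "train.py"]
    for flag, value in options.items():
        cmd += ["--" + flag, str(value)]
    return cmd


# Option values copied from the example command in the text
options = {
    "weights": "runs/train/exp2/weights/last.pt",
    "cfg": "models/yolov5t.yaml",
    "data": "data/test.yaml",
    "epochs": 200,
    "batch-size": 8,
    "img": 1280,
    "device": 1,
    "workers": 2,
}

if __name__ == "__main__":
    # shlex.join (Python 3.8+) renders the list as a shell-safe command line
    print(shlex.join(build_train_command(options)))
```

The returned list can also be passed directly to `subprocess.run(cmd, check=True)` from inside the yolov5 project folder, which avoids shell quoting issues.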
Visualizing training
In a terminal, change into the yolov5 project folder and run:
tensorboard --logdir=runs
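Errors
If an error is reported, the most likely cause is a font that could not be found; download it and place it in the project root directory.
Once TensorBoard has started, open the URL it prints in a browser.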
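Running on a remote server
Create an SSH tunnel between your own computer and the remote server, the same way as for Jupyter Notebook. In a terminal, run: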
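# Template
ssh -N -f -L localhost:port1:localhost:port2 username@IP

# Explanation
# port1: any free port on your own computer
# port2: the port the service was started on on the Ubuntu server (the one you noted when starting the notebook; for TensorBoard, the port it reports at startup)
# username: your user name on the Ubuntu server
# IP: the Ubuntu server's IP address, public or private
# Example: ssh -N -f -L localhost:8888:localhost:8888 hhh@198.162.1.1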
Then open localhost:port1 in a browser on your own computer.
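Picking "any free port" for port1 can be left to the operating system. The sketch below asks for an unused local port by binding to port 0 and then fills in the ssh template; both helper names are hypothetical, and 6006 is used for port2 on the assumption that TensorBoard is running on its default port.

```python
import socket


def find_free_port():
    """Ask the OS for an unused local TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("localhost", 0))
        return s.getsockname()[1]


def tunnel_command(port1, port2, username, ip):
    """Fill in the ssh template from the text."""
    return f"ssh -N -f -L localhost:{port1}:localhost:{port2} {username}@{ip}"


if __name__ == "__main__":
    # User name and IP are the placeholder values from the example in the text;
    # 6006 assumes TensorBoard's default port on the server
    print(tunnel_command(find_free_port(), 6006, "hhh", "198.162.1.1"))
```

The port is released before ssh starts, so another program could in principle claim it in between; for occasional manual use that is rarely a problem.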
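Detecting objects (checking the results)
Put the images or videos you want to run detection on in the yolov5/data/images directory, then change into the project root in a terminal and run: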
python3 detect.py --weights runs/train/exp/weights/best.pt
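Adjust the path after --weights to match your own setup.
Reference: "YOLOv5训练自己的数据集(超详细完整版)" (Training YOLOv5 on your own dataset: a detailed, complete guide), 深度学习菜鸟的博客, CSDN博客
Reposted from: https://blog.csdn.net/weixin_52470573/article/details/127707526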