YOLOv5 In-Depth Tutorial, Part 1: A Line-by-Line Walkthrough of the Network Structure

Source: CSDN Blog | 2022-12-16 11:07:29

Written by | Fengwen, BBuf

The code covered in this tutorial lives at:



          https://github.com/Oneflow-Inc/one-yolov5

This tutorial applies equally to Ultralytics/YOLOv5, because One-YOLOv5 merely swaps in a different runtime backend: its computation logic and code are unchanged from Ultralytics/YOLOv5. Stars are welcome. For details, see: "A Faster YOLOv5 Is Here, with a Complete Chinese Tutorial Series".

          1

Introduction

The overall architecture of YOLOv5 is identical across the different model sizes (n, s, m, l, x); each size simply uses a different depth and width in every submodule, controlled by the depth_multiple and width_multiple parameters in the yaml file.

Note also that besides the n, s, m, l, x versions, the official repo provides n6, s6, m6, l6, x6 variants. The latter target larger input resolutions such as 1280x1280, and their structure differs slightly: the former downsample only to 32x and use 3 prediction feature maps, while the latter downsample to 64x and use 4 prediction feature maps.

This chapter takes YOLOv5s as the example and walks through the source code, from the configuration file models/yolov5s.yaml (https://github.com/Oneflow-Inc/one-yolov5/blob/main/models/yolov5s.yaml) to models/yolo.py (https://github.com/Oneflow-Inc/one-yolov5/blob/main/models/yolo.py).

          2

Contents of the yolov5s.yaml file

nc: 80  # number of classes in the dataset
depth_multiple: 0.33  # model depth multiple (scales the number of layers)
width_multiple: 0.50  # layer channel multiple (scales the number of channels)
# How to understand depth_multiple and width_multiple? They determine the depth (layer count)
# and width (channel count) of the whole model; exactly how they are applied is explained
# together with the backbone code below.

anchors:  # the anchor sizes used on each feature map
  # 9 anchors in total; P denotes the feature-map level, e.g. P3/8 is the level-3 feature map, downsampled to 1/8
  - [10,13, 16,30, 33,23]  # P3/8, i.e. the 3 anchors [10,13], [16,30], [33,23]
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# YOLOv5s v6.0 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 6, C3, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
   [-1, 3, C3, [1024]],
   [-1, 1, SPPF, [1024, 5]],  # 9
  ]

# YOLOv5s v6.0 head
head:
  [[-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, "nearest"]],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3, [512, False]],  # 13
   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, "nearest"]],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3, [256, False]],  # 17 (P3/8-small)
   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 14], 1, Concat, [1]],  # cat head P4
   [-1, 3, C3, [512, False]],  # 20 (P4/16-medium)
   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 10], 1, Concat, [1]],  # cat head P5
   [-1, 3, C3, [1024, False]],  # 23 (P5/32-large)
   [[17, 20, 23], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]

          3

Understanding anchors

YOLOv5 initializes 9 anchors, split across the three feature maps; every grid cell of each feature map predicts with three anchors. The assignment rules are:

The larger feature maps come earlier in the network: they have a smaller downsampling ratio relative to the original image and a smaller receptive field, so they are better suited to predicting relatively small objects (small targets) and are assigned the smaller anchors.

The smaller feature maps come later in the network: they have a larger downsampling ratio and a larger receptive field, so they can predict relatively large objects (large targets) and are assigned the larger anchors.

In other words: detect large objects on the small feature maps, medium objects on the medium feature maps, and small objects on the large feature maps, as the sketch below makes concrete.
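Here is a minimal Python sketch of that assignment, with the anchor values taken from the yaml above; the division by the stride mirrors the m.anchors /= m.stride.view(-1, 1, 1) step shown later in Model.__init__:

# Anchor sizes from yolov5s.yaml, in pixels of the original image
anchors = [
    [(10, 13), (16, 30), (33, 23)],       # P3/8  -> small objects
    [(30, 61), (62, 45), (59, 119)],      # P4/16 -> medium objects
    [(116, 90), (156, 198), (373, 326)],  # P5/32 -> large objects
]
strides = [8, 16, 32]  # downsampling ratio of each prediction feature map

for level, (anc, s) in enumerate(zip(anchors, strides), start=3):
    # convert pixel-space anchors into grid units on this feature map,
    # mirroring `m.anchors /= m.stride.view(-1, 1, 1)` in Model.__init__
    grid_units = [(w / s, h / s) for w, h in anc]
    print(f"P{level}/{s}: pixels={anc} -> grid units={grid_units}")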

          4

Understanding backbone & head

The [from, number, module, args] parameters

The four parameters mean the following:

The first parameter, from: which layer the input comes from; -1 means the previous layer, and [-1, 6] means the input comes from both the previous layer and layer 6.

The second parameter, number: how many identical modules to stack; 9 means 9 copies of the module.

The third parameter, module: the module's name; these modules are defined in common.py.

The fourth parameter, args: the initialization arguments of the class, parsed and passed into the module.

The following takes the first module, Conv, as an example to introduce the modules in common.py.

The Conv module is defined as follows:

class Conv(nn.Module):
    # Standard convolution
    def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True):  # ch_in, ch_out, kernel, stride, padding, groups
        """
        @param c1: input channels
        @param c2: output channels
        @param k : kernel size
        @param s : stride
        @param p : padding width of the feature map
        @param g : groups; must divide the input channels so they can be split evenly
        """
        super().__init__()
        # https://oneflow.readthedocs.io/en/master/generated/oneflow.nn.Conv2d.html?highlight=Conv
        self.conv = nn.Conv2d(c1, c2, k, s, autopad(k, p), groups=g, bias=False)
        self.bn = nn.BatchNorm2d(c2)
        self.act = nn.SiLU() if act is True else (act if isinstance(act, nn.Module) else nn.Identity())

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

    def forward_fuse(self, x):
        return self.act(self.conv(x))
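Conv relies on the autopad helper to compute "same" padding when p is None. For reference, autopad in common.py is essentially the following; check the repo for the authoritative version:

def autopad(k, p=None):  # kernel, padding
    # pad to "same": with stride 1, the output spatial size equals the input size
    if p is None:
        p = k // 2 if isinstance(k, int) else [x // 2 for x in k]  # auto-pad
    return p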

For instance, with width_multiple set to 0.5 above, the first [64, 6, 2, 2] is parsed into [3, 64*0.5=32, 6, 2, 2], where the leading 3 is the input channel count (the input image has 3 channels) and 32 is the output channel count.
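Here is a quick sketch of that channel arithmetic; the make_divisible below is a stand-in for the real helper in utils/general.py, which rounds a channel count up to a multiple of 8 (more on this in the parse_model section):

import math

def make_divisible(x, divisor=8):
    # stand-in for utils.general.make_divisible: round x up to the nearest multiple of divisor
    return math.ceil(x / divisor) * divisor

width_multiple = 0.50  # from yolov5s.yaml
for c2 in [64, 128, 256, 512, 1024]:  # output channels listed in the backbone args
    print(c2, "->", make_divisible(c2 * width_multiple, 8))
# 64 -> 32, 128 -> 64, 256 -> 128, 512 -> 256, 1024 -> 512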

Details on how the network size is adjusted

Line 256 of yolo.py (https://github.com/Oneflow-Inc/one-yolov5/blob/main/models/yolo.py) reads nc, depth_multiple, and the other parameters from the yaml file:

anchors, nc, gd, gw = d["anchors"], d["nc"], d["depth_multiple"], d["width_multiple"]

The effect of "width_multiple" was already covered in the discussion of the args parameter above, so what does "depth_multiple" do?

Line 257 of yolo.py (https://github.com/Oneflow-Inc/one-yolov5/blob/main/models/yolo.py) gives the concrete definition:

n = n_ = max(round(n * gd), 1) if n > 1 else n  # depth gain; treat this line as formula (1) for now

Here gd is the value of depth_multiple, and n is the second parameter of each entry in the backbone list.

From formula (1) it is easy to see that gd scales n, and thereby the structural size of the network. For example:
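Plugging the backbone's C3 repeat counts into formula (1) for the depth factors of the s, m, and l models (0.33, 0.67, and 1.0 respectively):

def depth_gain(n, gd):
    # formula (1): scale the repeat count by the depth factor, with a floor of 1
    return max(round(n * gd), 1) if n > 1 else n

for gd, name in [(0.33, "s"), (0.67, "m"), (1.0, "l")]:
    # n values of the C3 blocks in the yolov5s backbone: 3, 6, 9, 3
    print(name, [depth_gain(n, gd) for n in (3, 6, 9, 3)])
# s [1, 2, 3, 1]
# m [2, 4, 6, 2]
# l [3, 6, 9, 3]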

Further down the network, the number of modules in each stage and the number and size of convolution kernels change accordingly. Compared with YOLOv5s, YOLOv5l has several times as many trainable parameters and is much deeper and wider, which gives YOLOv5l considerably better accuracy and hence higher detection precision at inference time, at the cost of slower inference.

YOLOv5 therefore offers a range of choices: if you are after inference speed, pick one of the smaller models such as YOLOv5s or YOLOv5m; if you want higher accuracy and inference speed matters less, pick one of the two larger models.

See the figure below:

Figure: model complexity comparison of the YOLOv5 variants

          5

Network structure preview

Below is a simplified diagram of the overall network structure, drawn from yolov5s.yaml (https://github.com/Oneflow-Inc/one-yolov5/blob/main/models/yolov5s.yaml).

Figure: simplified overall structure of the YOLOv5s network

A detailed network structure diagram:

          https://oneflow-static.oss-cn-beijing.aliyuncs.com/one-yolo/imgs/yolov5s.onnx.png

It was produced by exporting the model to ONNX with export.py and rendering the graph with https://netron.app/ (model export will be covered in a later article of this tutorial).

The parameters to the right of each module component give the shape of its feature map; for example, the input to the first layer (Conv) is an image of shape [3, 640, 640]. These shapes can be derived by feeding a fixed-size image through the network using the model parameters in yolov5s.yaml (https://github.com/Oneflow-Inc/one-yolov5/blob/main/models/yolov5s.yaml), and they can also be printed from models/yolo.py (https://github.com/Oneflow-Inc/one-yolov5/blob/main/models/yolo.py); the full data is in Table 2.1 of the appendix.

          6

Understanding yolo.py

File: https://github.com/Oneflow-Inc/one-yolov5/blob/main/models/yolo.py

The file consists of three main parts: the Detect class, the Model class, and the parse_model function.

You can run the script with python models/yolo.py --cfg yolov5s.yaml to inspect its output.

          7

Understanding the parse_model function

def parse_model(d, ch):  # model_dict, input_channels(3)
    """Used by the Model class below.
    Parses the model file (a dict) and builds the network structure.
    What this function essentially does: update each layer's args and compute c2 (the layer's output channels) =>
                                         build the layer from those args =>
                                         collect layers + save
    @param d: model dict {dict:7}, the 6 entries of yolov5s.yaml (https://github.com/Oneflow-Inc/one-yolov5/blob/main/models/yolov5s.yaml) plus ch
    @param ch: records the output channels of every layer; starts as ch=[3] and is reset later
    @return nn.Sequential(*layers): the network, layer by layer
    @return sorted(save): the sorted indices of all layers whose "from" is not -1, e.g. [4, 6, 10, 14, 17, 20, 23]
    """
    LOGGER.info(f"\n{'':>3}{'from':>18}{'n':>3}{'params':>10}  {'module':<40}{'arguments':<30}")
    # read anchors and the parameters (nc, depth_multiple, width_multiple) from the dict d
    anchors, nc, gd, gw = d["anchors"], d["nc"], d["depth_multiple"], d["width_multiple"]
    # na: number of anchors per prediction head = 3
    na = (len(anchors[0]) // 2) if isinstance(anchors, list) else anchors  # number of anchors
    no = na * (nc + 5)  # number of outputs = anchors * (classes + 5), the output channels of each prediction head

    # build the network
    # layers: stores every built layer
    # save: records the indices of all layers whose "from" is not -1
    # c2: the output channels of the current layer
    layers, save, c2 = [], [], ch[-1]  # layers, savelist, ch out
    # enumerate pairs each entry of an iterable with its index, for use in the for loop
    for i, (f, n, m, args) in enumerate(d["backbone"] + d["head"]):  # from, number, module, args
        m = eval(m) if isinstance(m, str) else m  # eval strings
        for j, a in enumerate(args):
            # args is a list; evaluate any string entries it contains
            with contextlib.suppress(NameError):
                args[j] = eval(a) if isinstance(a, str) else a  # eval strings

        # multiply the repeat count by the depth factor to get the layer depth, with a floor of 1
        n = n_ = max(round(n * gd), 1) if n > 1 else n  # depth gain
        # if the module m is one of the module types defined in this project, process it
        if m in (Conv, GhostConv, Bottleneck, GhostBottleneck, SPP, SPPF, DWConv, MixConv2d, Focus, CrossConv,
                 BottleneckCSP, C3, C3TR, C3SPP, C3Ghost, nn.ConvTranspose2d, DWConvTranspose2d, C3x):
            # c1: input channels, c2: output channels
            c1, c2 = ch[f], args[0]
            # if this layer is not the output layer, multiply its channels by the width factor;
            # in other words, the width factor applies to every layer except the last
            if c2 != no:  # if not output
                # make_divisible rounds c2 * gw up to a multiple of 8, which generally improves parallelism and inference performance
                c2 = make_divisible(c2 * gw, 8)
            # store the results back into args: these are the module's final constructor arguments
            args = [c1, c2, *args[1:]]
            # per-module argument handling; see each class's __init__ for what the arguments mean
            if m in [BottleneckCSP, C3, C3TR, C3Ghost, C3x]:
                # repeat the inner blocks n times, where n is the depth computed above
                args.insert(2, n)  # number of repeats
                n = 1
        elif m is nn.BatchNorm2d:
            args = [ch[f]]
        elif m is Concat:
            c2 = sum(ch[x] for x in f)
        elif m is Detect:
            args.append([ch[x] for x in f])
            if isinstance(args[1], int):  # number of anchors
                args[1] = [list(range(args[1] * 2))] * len(f)
        elif m is Contract:
            c2 = ch[f] * args[0] ** 2
        elif m is Expand:
            c2 = ch[f] // args[0] ** 2
        else:
            c2 = ch[f]
        # build the module: repeat it n times (the depth computed above) with its arguments
        m_ = nn.Sequential(*(m(*args) for _ in range(n))) if n > 1 else m(*args)  # module
        # module type name, e.g. models.common.Conv, models.common.C3, models.common.SPPF;
        # replace() strips "__main__." (not actually needed in this project)
        t = str(m)[8:-2].replace("__main__.", "")
        np = sum(x.numel() for x in m_.parameters())  # number params
        m_.i, m_.f, m_.type, m_.np = i, f, t, np  # attach index, "from" index, type, number params
        LOGGER.info(f"{i:>3}{str(f):>18}{n_:>3}{np:10.0f}  {t:<40}{str(args):<30}")  # print
        # every x that is not -1 is recorded in save: that layer's feature map must be kept.
        # Here x % i is equivalent to x; for example at the last layer:
        #   f = [17, 20, 23], i = 24
        #   [x % i for x in ([f] if isinstance(f, int) else f) if x != -1]  ->  [17, 20, 23]
        # writing x % i presumably handles -1 % i == i - 1 (e.g. f = [-1] would yield [i - 1])
        save.extend(x % i for x in ([f] if isinstance(f, int) else f) if x != -1)  # append to savelist
        layers.append(m_)
        if i == 0:  # on the first iteration, reset ch (the old ch was needed to build the first module, so reset afterwards)
            ch = []
        ch.append(c2)
    # wrap all layers in nn.Sequential and sort the saved feature-map indices
    return nn.Sequential(*layers), sorted(save)
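To see parse_model in action, a sketch along these lines (run from the repo root; the import path is an assumption) builds the network from the yaml dict and prints the savelist:

from copy import deepcopy
import yaml
from models.yolo import parse_model

with open("models/yolov5s.yaml", encoding="ascii", errors="ignore") as f:
    d = yaml.safe_load(f)

model, save = parse_model(deepcopy(d), ch=[3])  # ch=[3]: RGB input
print(len(model))  # 25 modules (layers 0-24)
print(save)        # [4, 6, 10, 14, 17, 20, 23]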

          8

Understanding the Model class

class Model(nn.Module):
    # YOLOv5 model
    def __init__(self, cfg="yolov5s.yaml", ch=3, nc=None, anchors=None):  # model, input channels, number of classes
        super().__init__()
        # if cfg is already a dict, use it directly; otherwise load the file at path cfg into self.yaml
        if isinstance(cfg, dict):
            self.yaml = cfg  # model dict
        else:  # is *.yaml, load the yaml module
            import yaml  # for flow hub
            self.yaml_file = Path(cfg).name
            with open(cfg, encoding="ascii", errors="ignore") as f:
                self.yaml = yaml.safe_load(f)  # model dict loaded from the yaml file

        # Define model
        # ch: input channels. If self.yaml has a "ch" key, use its value; otherwise fall back to the ch argument
        ch = self.yaml["ch"] = self.yaml.get("ch", ch)  # input channels
        # if the nc argument differs from the yaml's nc, override the yaml value
        if nc and nc != self.yaml["nc"]:
            LOGGER.info(f"Overriding model.yaml nc={self.yaml['nc']} with nc={nc}")
            self.yaml["nc"] = nc  # override yaml value
        if anchors:  # anchors: the prior-box configuration
            LOGGER.info(f"Overriding model.yaml anchors with anchors={anchors}")
            self.yaml["anchors"] = round(anchors)  # override yaml value
        # build the model and the list of feature maps to save
        self.model, self.save = parse_model(deepcopy(self.yaml), ch=[ch])  # model, savelist
        self.names = [str(i) for i in range(self.yaml["nc"])]  # default names: [0, 1, 2, ...]
        # self.inplace=True by default, saves memory
        self.inplace = self.yaml.get("inplace", True)

        # Build strides, anchors: determine the strides and the anchors assigned to each stride
        m = self.model[-1]  # Detect()
        if isinstance(m, Detect):  # check that the model's last layer is a Detect module
            s = 256  # 2x min stride
            m.inplace = self.inplace
            # compute the downsampling ratio of the three feature maps: [8, 16, 32]
            m.stride = flow.tensor([s / x.shape[-2] for x in self.forward(flow.zeros(1, ch, s, s))])  # forward
            # check that the anchor order matches the stride order (anchors should go from small to large); reorder if needed
            check_anchor_order(m)  # must be in pixel-space (not grid-space)
            # rescale the anchors to feature-map coordinates: the raw anchors are in original-image pixels,
            # but convolutions and pooling shrink the feature maps
            m.anchors /= m.stride.view(-1, 1, 1)
            self.stride = m.stride
            self._initialize_biases()  # only run once, initialize the biases

        # Init weights, biases
        # initialize_weights from oneflow_utils.py initializes the model weights
        initialize_weights(self)
        self.info()  # print model info
        LOGGER.info("")

    # dispatch the forward pass
    def forward(self, x, augment=False, profile=False, visualize=False):
        if augment:  # whether to use Test Time Augmentation (TTA) at inference
            return self._forward_augment(x)  # augmented inference, None
        return self._forward_once(x, profile, visualize)  # single-scale inference, train

    # forward pass with test-time augmentation
    def _forward_augment(self, x):
        img_size = x.shape[-2:]  # height, width
        s = [1, 0.83, 0.67]  # scales
        f = [None, 3, None]  # flips (2-ud, 3-lr)
        y = []  # outputs
        for si, fi in zip(s, f):
            xi = scale_img(x.flip(fi) if fi else x, si, gs=int(self.stride.max()))
            yi = self._forward_once(xi)[0]  # forward
            # cv2.imwrite(f"img_{si}.jpg", 255 * xi[0].cpu().numpy().transpose((1, 2, 0))[:, :, ::-1])  # save
            yi = self._descale_pred(yi, fi, si, img_size)
            y.append(yi)
        y = self._clip_augmented(y)  # clip augmented tails
        return flow.cat(y, 1), None  # augmented inference, train

    # the actual forward pass
    def _forward_once(self, x, profile=False, visualize=False):
        """
        @param x: input image
        @param profile: True to run some performance profiling
        @param visualize: True to run some feature visualization
        """
        # y: stores the output of every layer with self.save=True, because the later feature-fusion steps need those feature maps
        y, dt = [], []  # outputs
        # run every layer: m.i=index, m.f=from, m.type=class name, m.np=number of params
        for m in self.model:
            # m.f tells which layer's output feeds this layer; -1 means the previous layer
            if m.f != -1:  # if not from previous layer
                x = y[m.f] if isinstance(m.f, int) else [x if j == -1 else y[j] for j in m.f]  # from earlier layers
            if profile:
                self._profile_one_layer(m, x, dt)
            x = m(x)  # run
            y.append(x if m.i in self.save else None)  # save output
            if visualize:
                feature_visualization(x, m.type, m.i, save_dir=visualize)
        return x

    # rescale the predictions back to the original image size (inverse operation)
    def _descale_pred(self, p, flips, scale, img_size):
        # de-scale predictions following augmented inference (inverse operation)
        """Used by _forward_augment above, i.e. during Test Time Augmentation (TTA):
        maps the predictions back to the original image size
        @param p: predictions
        @param flips: flip code (2-ud, 3-lr)
        @param scale: image scale
        @param img_size: original image size
        """
        if self.inplace:
            p[..., :4] /= scale  # de-scale
            if flips == 2:
                p[..., 1] = img_size[0] - p[..., 1]  # de-flip ud
            elif flips == 3:
                p[..., 0] = img_size[1] - p[..., 0]  # de-flip lr
        else:
            x, y, wh = p[..., 0:1] / scale, p[..., 1:2] / scale, p[..., 2:4] / scale  # de-scale
            if flips == 2:
                y = img_size[0] - y  # de-flip ud
            elif flips == 3:
                x = img_size[1] - x  # de-flip lr
            p = flow.cat((x, y, wh, p[..., 4:]), -1)
        return p

    # clips the tails of the augmented images during TTA; another form of augmentation, used only at TTA test time
    def _clip_augmented(self, y):
        # Clip YOLOv5 augmented inference tails
        nl = self.model[-1].nl  # number of detection layers (P3-P5)
        g = sum(4 ** x for x in range(nl))  # grid points
        e = 1  # exclude layer count
        i = (y[0].shape[1] // g) * sum(4 ** x for x in range(e))  # indices
        y[0] = y[0][:, :-i]  # large
        i = (y[-1].shape[1] // g) * sum(4 ** (nl - 1 - x) for x in range(e))  # indices
        y[-1] = y[-1][:, i:]  # small
        return y

    # log per-layer forward-pass timing
    def _profile_one_layer(self, m, x, dt):
        c = isinstance(m, Detect)  # is final layer, copy input as inplace fix
        o = thop.profile(m, inputs=(x.copy() if c else x,), verbose=False)[0] / 1E9 * 2 if thop else 0  # FLOPs
        t = time_sync()
        for _ in range(10):
            m(x.copy() if c else x)
        dt.append((time_sync() - t) * 100)
        if m == self.model[0]:
            LOGGER.info(f"{'time (ms)':>10s} {'GFLOPs':>10s} {'params':>10s}  module")
        LOGGER.info(f"{dt[-1]:10.2f} {o:10.2f} {m.np:10.0f}  {m.type}")
        if c:
            LOGGER.info(f"{sum(dt):10.2f} {'-':>10s} {'-':>10s}  Total")

    # initialize biases into Detect(), cf is class frequency
    def _initialize_biases(self, cf=None):
        # https://arxiv.org/abs/1708.02002 section 3.3
        # cf = flow.bincount(flow.tensor(np.concatenate(dataset.labels, 0)[:, 0]).long(), minlength=nc) + 1.
        m = self.model[-1]  # Detect() module
        for mi, s in zip(m.m, m.stride):  # from
            b = mi.bias.view(m.na, -1).detach()  # conv.bias(255) to (3,85)
            b[:, 4] += math.log(8 / (640 / s) ** 2)  # obj (8 objects per 640 image)
            b[:, 5:] += math.log(0.6 / (m.nc - 0.999999)) if cf is None else flow.log(cf / cf.sum())  # cls
            mi.bias = flow.nn.Parameter(b.view(-1), requires_grad=True)

    # print the bias info of the final Detect module's conv layers (any layer's biases could be printed the same way)
    def _print_biases(self):
        m = self.model[-1]  # Detect() module
        for mi in m.m:  # from
            b = mi.bias.detach().view(m.na, -1).T  # conv.bias(255) to (3,85)
            LOGGER.info(
                ("%6g Conv2d.bias:" + "%10.3g" * 6) % (mi.weight.shape[1], *b[:5].mean(1).tolist(), b[5:].mean()))

    def _print_weights(self):
        # print the weights of the Bottleneck layers (any layer's weights could be printed the same way)
        for m in self.model.modules():
            if type(m) is Bottleneck:
                LOGGER.info("%10.3g" % (m.w.detach().sigmoid() * 2))  # shortcut weights

    # fuse() merges the Conv and BatchNorm layers to speed up model inference
    def fuse(self):  # fuse model Conv2d() + BatchNorm2d() layers
        """Used in detect.py and val.py.
        Calls fuse_conv_and_bn from oneflow_utils.py and the forward_fuse method of Conv in common.py
        """
        LOGGER.info("Fusing layers... ")
        for m in self.model.modules():
            # if the layer is a Conv with a bn attribute, fuse the conv and bn to accelerate inference
            if isinstance(m, (Conv, DWConv)) and hasattr(m, "bn"):
                m.conv = fuse_conv_and_bn(m.conv, m.bn)  # update conv
                delattr(m, "bn")  # remove batchnorm
                m.forward = m.forward_fuse  # update forward (inference only, so no backward pass to worry about)
        self.info()  # print the model info after conv+bn fusion
        return self

    # print model structure info; called at the end of __init__ above
    def info(self, verbose=False, img_size=640):  # print model information
        model_info(self, verbose, img_size)

    def _apply(self, fn):
        # Apply to(), cpu(), cuda(), half() to model tensors that are not parameters or registered buffers
        self = super()._apply(fn)
        m = self.model[-1]  # Detect()
        if isinstance(m, Detect):
            m.stride = fn(m.stride)
            m.grid = list(map(fn, m.grid))
            if isinstance(m.anchor_grid, list):
                m.anchor_grid = list(map(fn, m.anchor_grid))
        return self
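A minimal usage sketch, again assuming the one-yolov5 repo environment:

import oneflow as flow
from models.yolo import Model

model = Model(cfg="models/yolov5s.yaml", ch=3, nc=80)
print(model.stride)  # tensor([ 8., 16., 32.])

model.eval()
with flow.no_grad():
    pred, feats = model(flow.zeros(1, 3, 640, 640))  # inference returns (decoded preds, raw feature maps)
print(pred.shape)  # (1, 25200, 85): 3 * (80*80 + 40*40 + 20*20) anchors x (5 + 80) outputs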

          9

Understanding the Detect class

class Detect(nn.Module):
    """
    The Detect module builds the Detect layer: it maps the input feature maps, via a convolution
    and the decoding formulas, into the shape we need, in preparation for the loss computation
    or the NMS post-processing that follows.
    """
    stride = None  # strides computed during build
    onnx_dynamic = False  # ONNX export parameter
    export = False  # export mode

    def __init__(self, nc=80, anchors=(), ch=(), inplace=True):  # detection layer
        super().__init__()
        # nc: number of classes
        self.nc = nc  # number of classes
        # no: number of outputs per anchor
        self.no = nc + 5  # number of outputs per anchor
        # nl: number of prediction layers, 3 here
        self.nl = len(anchors)  # number of detection layers
        # na: number of anchors, 3 here
        self.na = len(anchors[0]) // 2  # number of anchors
        # grid: the grid coordinate system, top-left cell at (1,1), bottom-right at (input.w/stride, input.h/stride)
        self.grid = [flow.zeros(1)] * self.nl  # init grid
        self.anchor_grid = [flow.zeros(1)] * self.nl  # init anchor grid
        # register the anchors as a buffer named "anchors"
        self.register_buffer("anchors", flow.tensor(anchors).float().view(self.nl, -1, 2))  # shape(nl,na,2)
        # 1x1 convs projecting each input to self.no * self.na channels, acting like a fully connected layer
        self.m = nn.ModuleList(nn.Conv2d(x, self.no * self.na, 1) for x in ch)  # output conv
        self.inplace = inplace  # use inplace ops (e.g. slice assignment)

    def forward(self, x):
        z = []  # inference output
        for i in range(self.nl):
            x[i] = self.m[i](x[i])  # conv
            bs, _, ny, nx = x[i].shape  # x(bs,255,20,20) to x(bs,3,20,20,85)
            x[i] = x[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous()
            if not self.training:  # inference
                if self.onnx_dynamic or self.grid[i].shape[2:4] != x[i].shape[2:4]:
                    # at inference the relative coordinates must be mapped into the absolute grid coordinate system
                    self.grid[i], self.anchor_grid[i] = self._make_grid(nx, ny, i)
                y = x[i].sigmoid()
                if self.inplace:
                    y[..., 0:2] = (y[..., 0:2] * 2 + self.grid[i]) * self.stride[i]  # xy
                    y[..., 2:4] = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i]  # wh
                else:  # for YOLOv5 on AWS Inferentia https://github.com/ultralytics/yolov5/pull/2953
                    xy, wh, conf = y.split((2, 2, self.nc + 1), 4)  # y.tensor_split((2, 4, 5), 4)
                    xy = (xy * 2 + self.grid[i]) * self.stride[i]  # xy
                    wh = (wh * 2) ** 2 * self.anchor_grid[i]  # wh
                    y = flow.cat((xy, wh, conf), 4)
                z.append(y.view(bs, -1, self.no))
        return x if self.training else (flow.cat(z, 1),) if self.export else (flow.cat(z, 1), x)

    # map relative coordinates into the absolute grid coordinate system
    def _make_grid(self, nx=20, ny=20, i=0):
        d = self.anchors[i].device
        t = self.anchors[i].dtype
        shape = 1, self.na, ny, nx, 2  # grid shape
        y, x = flow.arange(ny, device=d, dtype=t), flow.arange(nx, device=d, dtype=t)
        yv, xv = flow.meshgrid(y, x, indexing="ij")
        grid = flow.stack((xv, yv), 2).expand(shape) - 0.5  # add grid offset, i.e. y = 2.0 * x - 0.5
        anchor_grid = (self.anchors[i] * self.stride[i]).view((1, self.na, 1, 1, 2)).expand(shape)
        return grid, anchor_grid
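To make the decoding formulas in forward concrete, here is a hand-worked sketch for a single prediction; the raw outputs tx, ty, tw, th below are made-up values purely for illustration:

import math

sigmoid = lambda v: 1 / (1 + math.exp(-v))

# one raw prediction on the P3/8 map, at grid cell (10, 20)
tx, ty, tw, th = 0.2, -0.1, 0.5, 0.3   # raw network outputs (illustrative values)
stride = 8
anchor_w, anchor_h = 10, 13            # first P3 anchor; anchor_grid is back in input-image pixels

# the grid stored by _make_grid already includes the -0.5 offset
gx, gy = 10 - 0.5, 20 - 0.5
x = (sigmoid(tx) * 2 + gx) * stride    # y[..., 0:2] = (y[..., 0:2] * 2 + grid) * stride
y = (sigmoid(ty) * 2 + gy) * stride
w = (sigmoid(tw) * 2) ** 2 * anchor_w  # y[..., 2:4] = (y[..., 2:4] * 2) ** 2 * anchor_grid
h = (sigmoid(th) * 2) ** 2 * anchor_h
print(x, y, w, h)  # box center-x, center-y, width, height in input-image pixels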

          10

Appendix

Table 2.1: layer-by-layer parse of yolov5s.yaml

          (https://github.com/Oneflow-Inc/one-yolov5/blob/main/models/yolov5s.yaml)

layer | from | module | arguments | input | output
0 | -1 | Conv | [3, 32, 6, 2, 2] | [3, 640, 640] | [32, 320, 320]
1 | -1 | Conv | [32, 64, 3, 2] | [32, 320, 320] | [64, 160, 160]
2 | -1 | C3 | [64, 64, 1] | [64, 160, 160] | [64, 160, 160]
3 | -1 | Conv | [64, 128, 3, 2] | [64, 160, 160] | [128, 80, 80]
4 | -1 | C3 | [128, 128, 2] | [128, 80, 80] | [128, 80, 80]
5 | -1 | Conv | [128, 256, 3, 2] | [128, 80, 80] | [256, 40, 40]
6 | -1 | C3 | [256, 256, 3] | [256, 40, 40] | [256, 40, 40]
7 | -1 | Conv | [256, 512, 3, 2] | [256, 40, 40] | [512, 20, 20]
8 | -1 | C3 | [512, 512, 1] | [512, 20, 20] | [512, 20, 20]
9 | -1 | SPPF | [512, 512, 5] | [512, 20, 20] | [512, 20, 20]
10 | -1 | Conv | [512, 256, 1, 1] | [512, 20, 20] | [256, 20, 20]
11 | -1 | Upsample | [None, 2, "nearest"] | [256, 20, 20] | [256, 40, 40]
12 | [-1, 6] | Concat | [1] | [1, 256, 40, 40], [1, 256, 40, 40] | [512, 40, 40]
13 | -1 | C3 | [512, 256, 1, False] | [512, 40, 40] | [256, 40, 40]
14 | -1 | Conv | [256, 128, 1, 1] | [256, 40, 40] | [128, 40, 40]
15 | -1 | Upsample | [None, 2, "nearest"] | [128, 40, 40] | [128, 80, 80]
16 | [-1, 4] | Concat | [1] | [1, 128, 80, 80], [1, 128, 80, 80] | [256, 80, 80]
17 | -1 | C3 | [256, 128, 1, False] | [256, 80, 80] | [128, 80, 80]
18 | -1 | Conv | [128, 128, 3, 2] | [128, 80, 80] | [128, 40, 40]
19 | [-1, 14] | Concat | [1] | [1, 128, 40, 40], [1, 128, 40, 40] | [256, 40, 40]
20 | -1 | C3 | [256, 256, 1, False] | [256, 40, 40] | [256, 40, 40]
21 | -1 | Conv | [256, 256, 3, 2] | [256, 40, 40] | [256, 20, 20]
22 | [-1, 10] | Concat | [1] | [1, 256, 20, 20], [1, 256, 20, 20] | [512, 20, 20]
23 | -1 | C3 | [512, 512, 1, False] | [512, 20, 20] | [512, 20, 20]
24 | [17, 20, 23] | Detect | [80, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [128, 256, 512]] | [1, 128, 80, 80], [1, 256, 40, 40], [1, 512, 20, 20] | [1, 3, 80, 80, 85], [1, 3, 40, 40, 85], [1, 3, 20, 20, 85]



