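Preface
The previous post covered loading and preprocessing the computation graph, which took quite a bit of effort...
In this post we'll study one of the core parts of PPQ together: splitting and dispatching the computation graph. As mentioned in the first post, PPQ divides the graph into three kinds of regions: quantizable, non-quantizable, and disputed. The goal of graph splitting is to carve out those three regions. To support multiple platforms, PPQ also provides several dispatching strategies, which we'll go through one by one.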


Dispatcher types
PPQ has six dispatcher types, registered in the DISPATCHER_TABLE dictionary:
  
DISPATCHER_TABLE = {
    "conservative": ConservativeDispatcher,
    "pplnn": PPLNNDispatcher,
    "aggresive": AggresiveDispatcher,
    "pointwise": PointDispatcher,
    "allin": AllinDispatcher,
    'perseus': Perseus
}
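A name-to-class table like this is typically consumed by looking up a user-supplied string. Here is a minimal sketch of that pattern. The stand-in classes and the `get_dispatcher` helper are assumptions made for illustration; they are not PPQ's code.

```python
class ConservativeDispatcher:  # stand-in class, not PPQ's implementation
    pass


class PPLNNDispatcher:  # stand-in class, not PPQ's implementation
    pass


# A reduced toy table with the same shape as DISPATCHER_TABLE.
TOY_DISPATCHER_TABLE = {
    "conservative": ConservativeDispatcher,
    "pplnn": PPLNNDispatcher,
}


def get_dispatcher(name: str):
    """Hypothetical helper: find a dispatcher class by name and instantiate it."""
    key = name.lower()
    if key not in TOY_DISPATCHER_TABLE:
        raise ValueError(
            f"unknown dispatcher {name!r}; choose from {sorted(TOY_DISPATCHER_TABLE)}"
        )
    return TOY_DISPATCHER_TABLE[key]()


dispatcher = get_dispatcher("Conservative")
print(type(dispatcher).__name__)
```

Failing loudly on an unknown name matters here because the table keys are plain strings and easy to mistype (note that the real key is spelled "aggresive").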
     
    
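Let's start with the conservative dispatcher as our example.
Splitting the computation graph
Three dispatch platforms
First, recall opset_matching(), the method from the previous post for searching the graph for the operators we need. We'll now use it to find the operators we're after.
We need to find three classes of operators and dispatch them to different platforms: quant_platform, SOI_platform and fp32_platform.
- quant_platform: all quantizable parts of the graph are dispatched to this platform.
- SOI_platform: shape- or index-related operations are dispatched to this platform.
- fp32_platform: some operations receive results from both quant_platform and SOI_platform; these are dispatched to fp32_platform.
A small aside: the comment here appears to be wrong: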

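Why can't Shape or Index operators be quantized?

In the graph, the output of a Shape operator is passed as a parameter to a Reshape operator. If we insert any quantization operation on this path, the input to Reshape is changed.
For example, suppose the Shape operator outputs [1, 3, 224, 224]. int8 quantization can represent only 256 values, and symmetric quantization has only 128 of them on the non-negative side. If we pick scale = 2, the values after quantization and dequantization (with ties rounded to even) become [0, 4, 224, 224], and the downstream logic can no longer run correctly.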
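The arithmetic behind that example can be checked with a few lines of plain Python. This is a minimal sketch of symmetric int8 quantize-then-dequantize; the function name `fake_quantize` and its parameters are chosen here for illustration and are not PPQ's API.

```python
def fake_quantize(values, scale, qmin=-128, qmax=127):
    """Quantize each value to an int8 grid, then map it back (illustrative helper)."""
    out = []
    for v in values:
        q = round(v / scale)           # Python's round() sends ties to the nearest even integer
        q = max(qmin, min(qmax, q))    # clamp to the int8 range
        out.append(q * scale)          # dequantize
    return out


shape = [1, 3, 224, 224]
print(fake_quantize(shape, scale=2))   # [0, 4, 224, 224]
```

The batch dimension 1 collapses to 0 and the channel dimension 3 becomes 4, so a Reshape that consumes this tensor would receive an invalid target shape.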
Splitting out the quantizable operations
opset_matching() works its magic again and collects every quantizable computing operator into quant_operations.
  
    quant_operations = search_engine.opset_matching(
        sp_expr = lambda x: x.is_computing_op,
        rp_expr = value_tracing_pattern,
        ep_expr = lambda x: (x.type not in quant_types) or x.is_boundary,
        direction = 'down')
    quant_operations.filter(lambda x: x.type not in quant_types)
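One detail is easy to misread: judging by the source comment "remove computing operations and quant operations from matching" that accompanies a similar call, `filter` here appears to drop the elements that satisfy the predicate, which is the opposite of Python's built-in `filter`. The sketch below illustrates that assumed behavior with a hypothetical `ToyOperationSet`; it is not PPQ's class.

```python
class ToyOperationSet(set):
    """Hypothetical stand-in for the set returned by opset_matching."""

    def filter(self, condition):
        # Assumed semantics: remove, in place, every element for which
        # `condition` is true, and keep the rest.
        for item in [x for x in self if condition(x)]:
            self.remove(item)
        return self


quant_types = {"Conv", "Relu", "Add"}   # illustrative subset

# Each toy op is a (name, type) pair.
matched = ToyOperationSet(
    [("conv1", "Conv"), ("relu1", "Relu"), ("softmax1", "Softmax")]
)
matched.filter(lambda op: op[1] not in quant_types)
print(sorted(matched))
```

After the call, only operators whose type is in `quant_types` remain, which is consistent with the end points of the search (non-quantizable ops) being stripped from the result.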
     
    
Shape or Index operations
First, search downward from the computing operators, using Shape, TopK and NonMaxSuppression operators (or a boundary) as end points:
  
    computing_extensions = search_engine.opset_matching(
        sp_expr = lambda x: x.is_computing_op,
        rp_expr = value_tracing_pattern,
        ep_expr = lambda x: x.type in {'Shape', 'TopK', 'NonMaxSuppression'} or x.is_boundary,
        direction = 'down')
     
    
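In some particular cases, however, a single matching pass cannot handle everything. To cover all shape-related operations, a reverse matching is required.
That is how the comment puts it, but I don't fully understand it yet; I'll come back to it later with a concrete example. (The names generators, recivers and SOI_operations are defined outside this excerpt.)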
  
    # we assume all 'Shape', 'NonMaxSuppression', 'ConstantOfShape', 'Topk' operations are SOI generators.
    shape_forward_matching = search_engine.opset_matching(
        sp_expr = lambda x: x in generators and x.type not in {'Constant'},
        rp_expr = value_tracing_pattern,
        ep_expr = lambda x: (x in recivers or
                             x in quant_operations or
                             x.is_boundary or
                             x.is_computing_op),
        direction = 'down')

    # remove computing operations and quant operations from matching
    shape_forward_matching.filter(lambda x: x.is_computing_op or x in quant_operations)

    # update matchings, ready for further searching.
    SOI_operations.update(shape_forward_matching)

    while True:
        # there are some particular cases where a single matching can not handle.
        # to cover all shape-related operations, a reverse matching is required.
        shape_backward_matching = search_engine.opset_matching(
            sp_expr = lambda x: x in SOI_operations and x.type != 'Shape',
            rp_expr = reverse_tracing_pattern,
            ep_expr = lambda x: (x in SOI_operations or
                                 x in quant_operations or
                                 x.is_boundary or
                                 x.is_computing_op),
            direction = 'up')

        # remove computing operations and quant operations from matching
        shape_backward_matching.filter(lambda x: x.is_computing_op or x in quant_operations)

        if all([(op in SOI_operations) for op in shape_backward_matching]): break

        # update matchings
        SOI_operations.update(shape_backward_matching)
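The `while True` loop is a fixed-point iteration: keep searching upward from the current SOI set, and stop once a pass finds nothing new. The toy below sketches that idea on a dictionary-based graph. It is a simplification under stated assumptions: each pass looks only one step upstream, `producers` and `blocked` are names invented here, and `blocked` plays the role of the computing and quantized ops that the real code filters out.

```python
def expand_upward(seed, producers, blocked):
    """Grow `seed` with upstream ops until a pass adds nothing new."""
    soi = set(seed)
    while True:
        found = set()
        for op in soi:
            for upstream in producers.get(op, []):
                if upstream not in blocked:      # skip computing / quantized ops
                    found.add(upstream)
        if found <= soi:                         # every match is already known: stop
            break
        soi |= found                             # update matchings and search again
    return soi


# Toy graph: each key lists the ops that feed it.
producers = {
    "reshape": ["conv", "concat"],
    "concat": ["gather", "const_dim"],
    "gather": ["shape"],
    "shape": ["conv"],
}
blocked = {"conv"}

print(sorted(expand_upward({"reshape"}, producers, blocked)))
```

Starting from `reshape` alone, the set grows by one layer per pass until `shape` is reached; `conv` never enters the set because it is blocked. This is one way to see why a single pass may not be enough: ops that only become visible after an earlier update need another round.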
     
    
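fp32 (non-quantized) operations
Everything that remains is the non-quantized region.
Assembling the partitioned graph
Having found everything we need to split out with opset_matching, we now assemble a dictionary and return it. Done!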
  
    # generate dispatching table.
    dispatching_table = {}
    for operation in graph.operations.values():
        if operation in SOI_operations and operation not in computing_extensions:
            dispatching_table[operation.name] = SOI_platform
        elif operation in quant_operations:
            dispatching_table[operation.name] = quant_platform
        else:
            dispatching_table[operation.name] = fp32_platform
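The order of the branches encodes a priority: SOI membership is checked first (unless the op is also in computing_extensions), then quantizable, and fp32 is the fallback. A small self-contained sketch with made-up op names and string labels in place of the real platform values shows the effect; `build_table` is a name chosen here, not a PPQ function.

```python
SOI, QUANT, FP32 = "SOI", "QUANT", "FP32"   # placeholder platform labels


def build_table(op_names, soi_ops, quant_ops, computing_extensions):
    """Assign one platform per op, using the same branch order as the snippet."""
    table = {}
    for name in op_names:
        if name in soi_ops and name not in computing_extensions:
            table[name] = SOI
        elif name in quant_ops:
            table[name] = QUANT
        else:
            table[name] = FP32
    return table


table = build_table(
    op_names=["conv", "shape", "gather", "reshape", "softmax"],
    soi_ops={"shape", "gather"},
    quant_ops={"conv", "gather"},        # "gather" is deliberately in both sets
    computing_extensions={"shape"},
)
print(table)
```

In this toy, `gather` lands on the SOI platform even though it is also in the quantizable set, while `shape` falls through to fp32 because it belongs to computing_extensions.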
     
    
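Because SOI matching ran in both the forward and backward directions, the table needs one more correction pass. As the code comments put it, Shape, TopK and NonMaxSuppression operators, as well as linear activations, are moved to the same platform as their input: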
  
    for operation in graph.operations.values():
        # move Topk, Shape, NonMaxSuppression to the platform same as their input.
        if operation.type in {'Shape', 'TopK', 'NonMaxSuppression'}:
            source_op = operation.inputs[0].source_op
            if source_op is not None:
                dispatching_table[operation.name] = dispatching_table[source_op.name]
            else: dispatching_table[operation.name] = fp32_platform

        # move activations to the platform same as their input.
        if operation.is_linear_activation:
            source_op = operation.inputs[0].source_op
            if source_op is not None:
                dispatching_table[operation.name] = dispatching_table[source_op.name]
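To make the "same platform as their input" rule concrete, here is a toy version of that pass for the Shape/TopK/NonMaxSuppression branch only. The dictionaries `op_types` and `first_input_of`, and the function name, are assumptions introduced for this sketch; in the real code the producer comes from `operation.inputs[0].source_op`.

```python
def inherit_from_input(table, op_types, first_input_of, default):
    """Move Shape/TopK/NonMaxSuppression ops to their first input's platform."""
    fixed = dict(table)
    for name, op_type in op_types.items():
        if op_type in {"Shape", "TopK", "NonMaxSuppression"}:
            source = first_input_of.get(name)     # None means a graph input feeds the op
            if source is not None:
                fixed[name] = fixed[source]
            else:
                fixed[name] = default
    return fixed


op_types = {"conv": "Conv", "shape": "Shape", "topk": "TopK"}
first_input_of = {"shape": "conv", "topk": None}
table = {"conv": "QUANT", "shape": "SOI", "topk": "SOI"}

print(inherit_from_input(table, op_types, first_input_of, default="FP32"))
```

Here `shape` follows `conv` onto the quantized platform, and `topk`, which has no producing op, falls back to the default.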
     
    
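PPL NN graph-splitting strategy
The conservative strategy above illustrated the overall splitting process. Since the general flow is the same, the rest focuses on where the other strategies differ.
PPL NN's quantization strategy treats the span from one Conv to the next Conv as the quantizable region, unlike the conservative strategy, which starts from computing ops: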
  
    quant_operations = search_engine.opset_matching(
        sp_expr = lambda x: x.type == 'Conv',
        rp_expr = lambda x, y: value_tracing_pattern(x, y) and y.type in quant_types,
        ep_expr = lambda x: x.type == 'Conv',
        direction = 'down')
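To get a feel for what a Conv-to-Conv search selects, here is a toy path search over a tiny linear graph. It is a sketch under assumptions, not PPQ's search engine: it assumes a path counts only if it actually reaches an end op, it ignores `value_tracing_pattern`, and `Op`, `toy_opset_matching`, `ops` and `edges` are all names invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    name: str
    type: str


def toy_opset_matching(ops, edges, sp_expr, rp_expr, ep_expr):
    """Collect every op on a downward path: start op -> relay ops -> end op."""
    matched = set()

    def walk(op, path):
        path = path + [op]
        if len(path) > 1 and ep_expr(op):
            matched.update(path)              # a complete start-to-end path
            return
        for next_name in edges.get(op.name, []):
            nxt = ops[next_name]
            if ep_expr(nxt) or rp_expr(op, nxt):
                walk(nxt, path)

    for op in ops.values():
        if sp_expr(op):
            walk(op, [])
    return matched


ops = {o.name: o for o in [
    Op("conv1", "Conv"), Op("relu1", "Relu"), Op("conv2", "Conv"),
    Op("softmax", "Softmax"), Op("conv3", "Conv"),
]}
edges = {"conv1": ["relu1"], "relu1": ["conv2"],
         "conv2": ["softmax"], "softmax": ["conv3"]}
quant_types = {"Conv", "Relu"}                # illustrative subset

region = toy_opset_matching(
    ops, edges,
    sp_expr=lambda x: x.type == "Conv",
    rp_expr=lambda x, y: y.type in quant_types,
    ep_expr=lambda x: x.type == "Conv",
)
print(sorted(op.name for op in region))
```

In this toy, conv1, relu1 and conv2 form one region, while conv3 is left out because the Softmax between conv2 and conv3 is not a quantizable relay.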
     
    
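I'll look at the other strategies when I need them, so I won't go into detail here.
Afterword
There are still quite a few things I haven't figured out. For example, are the different platforms a software-level classification or a hardware-level one? Is the classification meant to speed up computation, or to split the graph so that later quantization dispatch is easier? I'll explore these questions in later posts.
Reposted from: https://blog.csdn.net/qq_41895747/article/details/128876385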
 
					