S600 Model Performance Benchmark

  • Test development board: S600

  • Model source: models under the samples/ucp_tutorial/dnn/ai_benchmark path in the OE package

  • Runtime environment: Linux

Model Accuracy

| MODEL NAME | INPUT SIZE | ACCURACY | DATASET |
|---|---|---|---|
| ResNet50 | 1x3x224x224 | Top1: 0.7704(FLOAT)/0.7661(INT8) | ImageNet |
| GoogleNet | 1x3x224x224 | Top1: 0.7018(FLOAT)/0.6995(INT8) | ImageNet |
| EfficientNet_Lite1 | 1x240x240x3 | Top1: 0.7652(FLOAT)/0.7602(INT8) | ImageNet |
| EfficientNet_Lite2 | 1x260x260x3 | Top1: 0.7734(FLOAT)/0.7696(INT8) | ImageNet |
| EfficientNet_Lite3 | 1x280x280x3 | Top1: 0.7917(FLOAT)/0.7885(INT8) | ImageNet |
| EfficientNet_Lite4 | 1x300x300x3 | Top1: 0.8063(FLOAT)/0.8041(INT8) | ImageNet |
| Vargconvnet | 1x3x224x224 | Top1: 0.7793(FLOAT)/0.7765(INT8) | ImageNet |
| Efficientnasnet_m | 1x3x300x300 | Top1: 0.7935(FLOAT)/0.7927(INT8) | ImageNet |
| Efficientnasnet_s | 1x3x280x280 | Top1: 0.7441(FLOAT)/0.7512(INT8) | ImageNet |
| ResNet18 | 1x3x224x224 | Top1: 0.7170(FLOAT)/0.7159(INT8) | ImageNet |
| YOLOv2_Darknet19 | 1x3x608x608 | [IoU=0.50:0.95]= 0.2760(FLOAT)/0.2707(INT8) | COCO |
| YOLOv3_Darknet53 | 1x3x416x416 | [IoU=0.50:0.95]= 0.3370(FLOAT)/0.3370(INT8) | COCO |
| YOLOv5x_v2.0 | 1x3x672x672 | [IoU=0.50:0.95]= 0.4810(FLOAT)/0.4670(INT8) | COCO |
| Centernet_resnet101 | 1x3x512x512 | [IoU=0.50:0.95]= 0.3420(FLOAT)/0.3270(INT8) | COCO |
| YOLOv3_VargDarknet | 1x3x416x416 | [IoU=0.50:0.95]= 0.3280(FLOAT)/0.3260(INT8) | COCO |
| Deeplabv3plus_efficientnetb0 | 1x3x1024x2048 | mIoU: 0.7630(FLOAT)/0.7569(INT8) | Cityscapes |
| Fastscnn_efficientnetb0 | 1x3x1024x2048 | mIoU: 0.6997(FLOAT)/0.6909(INT8) | Cityscapes |
| Deeplabv3plus_efficientnetm1 | 1x3x1024x2048 | mIoU: 0.7794(FLOAT)/0.7756(INT8) | Cityscapes |
| Deeplabv3plus_efficientnetm2 | 1x3x1024x2048 | mIoU: 0.7882(FLOAT)/0.7854(INT8) | Cityscapes |
| Bev_gkt_mixvargenet_multitask | image: 6x3x512x960; points(0-8): 6x64x64x2 | NDS: 0.2810(FLOAT)/0.2788(INT8); MeanIOU: 0.4852(FLOAT)/0.4839(INT8); mAP: 0.1990(FLOAT)/0.1992(INT8) | Nuscenes |
| Bev_ipm_4d_efficientnetb0_multitask | image: 6x3x512x960; points: 6x128x128x2; prev_feat: 1x164x28x128; prev_point: 1x128x128x2 | NDS: 0.3722(FLOAT)/0.3723(INT8); MeanIOU: 0.5287(FLOAT)/0.5389(INT8); mAP: 0.2201(FLOAT)/0.2217(INT8) | Nuscenes |
| Bev_ipm_efficientnetb0_multitask | image: 6x3x512x960; points: 6x128x128x2 | NDS: 0.3055(FLOAT)/0.3041(INT8); MeanIOU: 0.5145(FLOAT)/0.5103(INT8); mAP: 0.2170(FLOAT)/0.2166(INT8) | Nuscenes |
| Bev_lss_efficientnetb0_multitask | image: 6x3x256x704; points(0&1): 10x128x128x2 | NDS: 0.3006(FLOAT)/0.3008(INT8); MeanIOU: 0.5180(FLOAT)/0.5172(INT8); mAP: 0.2061(FLOAT)/0.2043(INT8) | Nuscenes |
| Detr3d_efficientnetb3 | coords(0-3): 6x4x256x2; image: 6x3x512x1408; masks: 1x4x256x24 | NDS: 0.3304(FLOAT)/0.3306(INT8); mAP: 0.2753(FLOAT)/0.2742(INT8) | Nuscenes |
| Petr_efficientnetb3 | image: 6x3x512x1408; pos_embed: 1x96x44x256 | NDS: 0.3765(FLOAT)/0.3748(INT8); mAP: 0.3038(FLOAT)/0.2942(INT8) | Nuscenes |
| Bevformer_tiny_resnet50_detection | img: 6x3x480x800; prev_bev: 1x2500x256; prev_bev_ref: 1x50x50x2; queries_rebatch_grid: 6x20x32x2; restore_bev_grid: 1x100x50x2; reference_points_rebatch: 6x640x4x2; bev_pillar_counts: 1x2500x1 | NDS: 0.3713(FLOAT)/0.3700(INT8); mAP: 0.2673(FLOAT)/0.2644(INT8) | Nuscenes |
| Flashocc_henet_lss_occ3d_nuscenes | img: 6x3x512x960; points: 10x128x128x2; points_depth: 10x128x128x2 | mIoU: 0.3675(FLOAT)/0.3685(INT8) | Nuscenes |
| Horizon_swin_transformer | 1x3x224x224 | Top1: 0.8024(FLOAT)/0.7982(INT8) | ImageNet |
| Vargnetv2 | 1x3x224x224 | Top1: 0.7342(FLOAT)/0.7332(INT8) | ImageNet |
| Vit_small | 1x3x224x224 | Top1: 0.7950(FLOAT)/0.7924(INT8) | ImageNet |
| Centerpoint_pointpillar | points: 300000x5; voxel_feature: 1x5x20x40000; coors: 40000x4 | NDS: 0.5832(FLOAT)/0.5820(INT8); mAP: 0.4804(FLOAT)/0.4783(INT8) | Nuscenes |
| Detr_efficientnetb3 | 1x3x800x1333 | [IoU=0.50:0.95]= 0.3720(FLOAT)/0.3584(INT8) | MS COCO |
| Detr_resnet50 | 1x3x800x1333 | [IoU=0.50:0.95]= 0.3569(FLOAT)/0.3168(INT8) | MS COCO |
| FCOS3D_efficientnetb0 | 1x3x512x896 | NDS: 0.3061(FLOAT)/0.3022(INT8); mAP: 0.2133(FLOAT)/0.2067(INT8) | Nuscenes |
| Fcos_efficientnetb0 | 1x3x512x512 | [IoU=0.50:0.95]= 0.3626(FLOAT)/0.3562(INT8) | MS COCO |
| Ganet_mixvargenet | 1x3x320x800 | F1Score: 0.7949(FLOAT)/0.7884(INT8) | CuLane |
| Deformable_detr_resnet50 | 1x3x800x1333 | [IoU=0.50:0.95]= 0.4413(FLOAT)/0.4499(INT8) | MS COCO |
| Stereonetplus_mixvargenet | 2x3x544x960 | EPE: 1.1270(FLOAT)/1.1336(INT8) | SceneFlow |
| Centerpoint_mixvargnet_multitask | points: 300000x5; voxel_feature: 1x5x20x40000; coors: 40000x4 | NDS: 0.5809(FLOAT)/0.5766(INT8); MeanIOU: 0.9128(FLOAT)/0.9126(INT8); mAP: 0.4727(FLOAT)/0.4649(INT8) | Nuscenes |
| Unet_mobilenetv1 | 1x3x1024x2048 | mIoU: 0.6801(FLOAT)/0.6764(INT8) | Cityscapes |
| Densetnt_vectornet | goals_2d: 30x1x2048x2; goals_2d_mask: 30x1x2048x1; instance_mask: 30x1x96x1; lane_feat: 30x9x64x11; traj_feat: 30x19x32x9 | minFDA: 1.2975(FLOAT)/1.2989(INT8) | Argoverse 1 |
| Maptroe_henet_tinym_bevformer | img: 6x3x480x800; osm_mask: 1x1x50x100; queries_rebatch_grid: 6x20x100x2; restore_bev_grid: 1x100x100x2; reference_points_rebatch: 6x2000x4x2; bev_pillar_counts: 1x5000x1 | mAP: 0.6633(FLOAT)/0.6569(INT8) | Nuscenes |
| Qcnet_oe | valid_mask: 1x30x10; valid_mask_a2a: 1x10x30x30; agent_type: 1x30x1; x_a_cur: 1x1x30x1,1x1x30x1,1x1x30x1,1x1x30x1; r_pl2a_cur: 1x1x30x80,1x1x30x80,1x1x30x80; r_t_cur: 1x1x30x6,1x1x30x6,1x1x30x6,1x1x30x6; r_a2a_cur: 1x1x30x30,1x1x30x30,1x1x30x30; x_a_mid_emb: 1x30x2x128; x_a: 1x30x6x128; pl_type,is_intersection: 1x80; r_pl2pl: 1x1x80x80,1x1x80x80,1x1x80x80; r_pt2pl: 1x1x80x50,1x1x80x50,1x1x80x50; mask_pl2pl: 1x80x80; magnitude,pt_type,side,mask: 1x80x50; mask_a2m: 1x30x30; mask_dst: 1x30x1; type_pl2pl: 1x80x80 | hitrate: 0.8026(FLOAT)/0.7979(INT8) | Argoverse 2 |
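
The FLOAT/INT8 pairs in the accuracy table above can be compared programmatically. The following is a minimal sketch, not part of the OE package: it parses a few metric strings copied from the table and reports the INT8 minus FLOAT difference. The names `ROWS`, `parse_pair`, and `quantization_delta` are hypothetical helpers introduced only for this illustration.

```python
import re

# A few cells copied verbatim from the accuracy table (model -> metric string).
ROWS = {
    "ResNet50": "0.7704(FLOAT)/0.7661(INT8)",
    "Efficientnasnet_s": "0.7441(FLOAT)/0.7512(INT8)",
    "Detr_resnet50": "0.3569(FLOAT)/0.3168(INT8)",
}

# Matches the "<float>(FLOAT)/<int8>(INT8)" layout used in the table.
PATTERN = re.compile(r"([0-9.]+)\(FLOAT\)/([0-9.]+)\(INT8\)")


def parse_pair(cell):
    """Return (float_score, int8_score) parsed from one table cell."""
    match = PATTERN.fullmatch(cell.strip())
    if match is None:
        raise ValueError(f"unexpected cell format: {cell!r}")
    return float(match.group(1)), float(match.group(2))


def quantization_delta(cell):
    """INT8 score minus FLOAT score, rounded to the table's 4 decimals."""
    float_score, int8_score = parse_pair(cell)
    return round(int8_score - float_score, 4)


for model, cell in ROWS.items():
    print(f"{model}: {quantization_delta(cell):+.4f}")
```

A negative difference means the INT8 model scores lower than the FLOAT model. For metrics where lower is better (for example EPE), the sign should be read the other way round.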

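Model Performance

Note

Before running the performance evaluation, set the number of threads in the thread pool used for CPU operator inference to 12 through the following environment variable: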

export HB_UCP_ENABLE_CPU_BACKEND_CORE_NUM=12
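
When the benchmark is launched from a script rather than an interactive shell, the same variable can be set for a single child process only. The sketch below is an illustration under assumptions: `run_with_cpu_threads` is a hypothetical helper, and the `demo` command is a stand-in that merely echoes the variable back; it would need to be replaced with the actual benchmark invocation.

```python
import os
import subprocess
import sys

ENV_NAME = "HB_UCP_ENABLE_CPU_BACKEND_CORE_NUM"


def run_with_cpu_threads(command, threads=12):
    """Run `command` with the CPU-operator thread count set only for that process."""
    env = dict(os.environ)          # copy, so the parent environment is untouched
    env[ENV_NAME] = str(threads)    # environment values must be strings
    return subprocess.run(
        command, env=env, capture_output=True, text=True, check=True
    )


# Stand-in command (hypothetical): prints the variable as seen by the child.
demo = [sys.executable, "-c", f"import os; print(os.environ['{ENV_NAME}'])"]
result = run_with_cpu_threads(demo)
print(result.stdout.strip())
```

Passing the variable through `env` keeps the setting scoped to one run, which avoids leaking it into other processes started from the same shell.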

| MODEL NAME | INPUT SIZE | Latency(ms) | FPS | FPS Configuration |
|---|---|---|---|---|
| ResNet50 | 1x3x224x224 | 0.587 | 6519.27 | thread_num:10 |
| GoogleNet | 1x3x224x224 | 0.379 | 18480.92 | thread_num:12 |
| EfficientNet_Lite1 | 1x240x240x3 | 0.382 | 17998.63 | thread_num:12 |
| EfficientNet_Lite2 | 1x260x260x3 | 0.447 | 14056.08 | thread_num:10 |
| EfficientNet_Lite3 | 1x280x280x3 | 0.512 | 11572.69 | thread_num:10 |
| EfficientNet_Lite4 | 1x300x300x3 | 0.656 | 8327.86 | thread_num:10 |
| Vargconvnet | 1x3x224x224 | 0.523 | 12423.31 | thread_num:12 |
| Efficientnasnet_m | 1x3x300x300 | 0.576 | 9260.61 | thread_num:10 |
| Efficientnasnet_s | 1x3x280x280 | 0.376 | 19348.62 | thread_num:12 |
| ResNet18 | 1x3x224x224 | 0.384 | 14316.09 | thread_num:12 |
| YOLOv2_Darknet19 | 1x3x608x608 | 2.229 | 1987.00 | thread_num:8 |
| YOLOv3_Darknet53 | 1x3x416x416 | 2.453 | 1752.26 | thread_num:8 |
| YOLOv5x_v2.0 | 1x3x672x672 | 8.202 | 494.50 | thread_num:8 |
| Centernet_resnet101 | 1x3x512x512 | 2.446 | 1786.98 | thread_num:8 |
| YOLOv3_VargDarknet | 1x3x416x416 | 1.628 | 2673.33 | thread_num:8 |
| Deeplabv3plus_efficientnetb0 | 1x3x1024x2048 | 2.952 | 1468.62 | thread_num:8 |
| Fastscnn_efficientnetb0 | 1x3x1024x2048 | 1.741 | 2647.19 | thread_num:8 |
| Deeplabv3plus_efficientnetm1 | 1x3x1024x2048 | 5.126 | 818.14 | thread_num:8 |
| Deeplabv3plus_efficientnetm2 | 1x3x1024x2048 | 7.330 | 561.92 | thread_num:8 |
| Bev_gkt_mixvargenet_multitask | image: 6x3x512x960; points(0-8): 6x64x64x2 | 6.911 | 605.09 | thread_num:8 |
| Bev_ipm_4d_efficientnetb0_multitask | image: 6x3x512x960; points: 6x128x128x2; prev_feat: 1x164x28x128; prev_point: 1x128x128x2 | 4.615 | 940.42 | thread_num:8 |
| Bev_ipm_efficientnetb0_multitask | image: 6x3x512x960; points: 6x128x128x2 | 4.499 | 967.65 | thread_num:8 |
| Bev_lss_efficientnetb0_multitask | image: 6x3x256x704; points(0&1): 10x128x128x2 | 3.075 | 1454.60 | thread_num:8 |
| Detr3d_efficientnetb3 | coords(0-3): 6x4x256x2; image: 6x3x512x1408; masks: 1x4x256x24 | 15.168 | 258.88 | thread_num:8 |
| Petr_efficientnetb3 | image: 6x3x512x1408; pos_embed: 1x96x44x256 | 21.444 | 187.70 | thread_num:8 |
| Bevformer_tiny_resnet50_detection | img: 6x3x480x800; prev_bev: 1x2500x256; prev_bev_ref: 1x50x50x2; queries_rebatch_grid: 6x20x32x2; restore_bev_grid: 1x100x50x2; reference_points_rebatch: 6x640x4x2; bev_pillar_counts: 1x2500x1 | 14.164 | 276.44 | thread_num:8 |
| Flashocc_henet_lss_occ3d_nuscenes | img: 6x3x512x960; points: 10x128x128x2; points_depth: 10x128x128x2 | 5.519 | 742.29 | thread_num:8 |
| Horizon_swin_transformer | 1x3x224x224 | 1.694 | 2556.99 | thread_num:8 |
| Vargnetv2 | 1x3x224x224 | 0.332 | 19838.28 | thread_num:12 |
| Vit_small | 1x3x224x224 | 1.092 | 4245.19 | thread_num:8 |
| Centerpoint_pointpillar | points: 300000x5; voxel_feature: 1x5x20x40000; coors: 40000x4 | 7.530 | 1067.16 | thread_num:16 |
| Detr_efficientnetb3 | 1x3x800x1333 | 8.499 | 390.55 | thread_num:8 |
| Detr_resnet50 | 1x3x800x1333 | 10.795 | 317.51 | thread_num:8 |
| FCOS3D_efficientnetb0 | 1x3x512x896 | 1.475 | 3567.10 | thread_num:10 |
| Fcos_efficientnetb0 | 1x3x512x512 | 0.887 | 7227.14 | thread_num:10 |
| Ganet_mixvargenet | 1x3x320x800 | 0.550 | 12439.15 | thread_num:12 |
| Deformable_detr_resnet50 | 1x3x800x1333 | 77.201 | 29.28 | thread_num:8 |
| Stereonetplus_mixvargenet | 2x3x544x960 | 2.022 | 2147.39 | thread_num:8 |
| Centerpoint_mixvargnet_multitask | points: 300000x5; voxel_feature: 1x5x20x40000; coors: 40000x4 | 6.263 | 1638.08 | thread_num:18 |
| Unet_mobilenetv1 | 1x3x1024x2048 | 0.881 | 6401.99 | thread_num:10 |
| Densetnt_vectornet | goals_2d: 30x1x2048x2; goals_2d_mask: 30x1x2048x1; instance_mask: 30x1x96x1; lane_feat: 30x9x64x11; traj_feat: 30x19x32x9 | 3.560 | 1108.38 | thread_num:12 |
| Maptroe_henet_tinym_bevformer | img: 6x3x480x800; osm_mask: 1x1x50x100; queries_rebatch_grid: 6x20x100x2; restore_bev_grid: 1x100x100x2; reference_points_rebatch: 6x2000x4x2; bev_pillar_counts: 1x5000x1 | 6.439 | 625.40 | thread_num:8 |
| Qcnet_oe | valid_mask: 1x30x10; valid_mask_a2a: 1x10x30x30; agent_type: 1x30x1; x_a_cur: 1x1x30x1,1x1x30x1,1x1x30x1,1x1x30x1; r_pl2a_cur: 1x1x30x80,1x1x30x80,1x1x30x80; r_t_cur: 1x1x30x6,1x1x30x6,1x1x30x6,1x1x30x6; r_a2a_cur: 1x1x30x30,1x1x30x30,1x1x30x30; x_a_mid_emb: 1x30x2x128; x_a: 1x30x6x128; pl_type,is_intersection: 1x80; r_pl2pl: 1x1x80x80,1x1x80x80,1x1x80x80; r_pt2pl: 1x1x80x50,1x1x80x50,1x1x80x50; mask_pl2pl: 1x80x80; magnitude,pt_type,side,mask: 1x80x50; mask_a2m: 1x30x30; mask_dst: 1x30x1; type_pl2pl: 1x80x80 | 2.611 | 1723.45 | thread_num:8 |
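
The Latency(ms) and FPS columns above are not simple reciprocals of each other, because FPS is reported under a multi-threaded configuration. The sketch below makes that relationship concrete for a few rows. It assumes, without confirmation from the source, that the latency figure is a single-frame measurement and that FPS is the aggregate throughput at the listed `thread_num`. The helper names are hypothetical.

```python
# Rows copied from the performance table: (latency in ms, FPS, thread_num).
ROWS = {
    "ResNet50": (0.587, 6519.27, 10),
    "YOLOv5x_v2.0": (8.202, 494.50, 8),
    "Deformable_detr_resnet50": (77.201, 29.28, 8),
}


def single_stream_fps(latency_ms):
    """Frames per second if frames were processed strictly one after another."""
    return 1000.0 / latency_ms


def throughput_gain(latency_ms, fps):
    """Ratio of the reported FPS to the single-stream estimate."""
    return fps / single_stream_fps(latency_ms)


for model, (latency_ms, fps, threads) in ROWS.items():
    estimate = single_stream_fps(latency_ms)
    gain = throughput_gain(latency_ms, fps)
    print(
        f"{model}: ~{estimate:.1f} FPS single-stream, "
        f"reported {fps:.2f} FPS at thread_num:{threads} (x{gain:.2f})"
    )
```

Under these assumptions a ratio above 1 indicates how much throughput the multi-threaded run gains over back-to-back single-frame inference.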