CPU Tasks
LoadModel 939.886 milliseconds
RunComplete 48.7479 seconds
Run 48.6305 seconds
Callbacks 10.5582 milliseconds, 37 calls, 285.357 microseconds average
Spectrogram 280.966 milliseconds, 42 calls, 6.68965 milliseconds average
Sample 65.5797 milliseconds, 511 calls, 128.336 microseconds average
Encode 19.0653 seconds, 10 calls, 1.90653 seconds average
Decode 29.5369 seconds, 10 calls, 2.95369 seconds average
DecodeStep 29.4709 seconds, 511 calls, 57.6731 milliseconds average
GPU Tasks
LoadModel 586.498 milliseconds
Run 48.4589 seconds
Encode 19.2258 seconds, 10 calls, 1.92258 seconds average
EncodeLayer 15.7109 seconds, 240 calls, 65.4622 milliseconds average
Decode 29.233 seconds, 10 calls, 2.9233 seconds average
DecodeStep 29.233 seconds, 511 calls, 57.2074 milliseconds average
DecodeLayer 27.6558 seconds, 12264 calls, 2.25504 milliseconds average
Compute Shaders
mulMatTiled 19.1596 seconds, 5290 calls, 3.62186 milliseconds average
mulMatByRowTiled 12.681 seconds, 144789 calls, 87.5829 microseconds average
fmaRepeat1 3.10945 seconds, 37793 calls, 82.2758 microseconds average
copyTranspose 2.83737 seconds, 25008 calls, 113.458 microseconds average
softMaxFixed 1.6294 seconds, 12504 calls, 130.31 microseconds average
addRepeatScale 1.54396 seconds, 24528 calls, 62.9467 microseconds average
addRepeatEx 1.06992 seconds, 37272 calls, 28.7056 microseconds average
normFixed 1.06753 seconds, 37793 calls, 28.2467 microseconds average
copyConvert 994.495 milliseconds, 25488 calls, 39.0182 microseconds average
convolutionMain2Fixed 954.715 milliseconds, 10 calls, 95.4715 milliseconds average
softMax 742.126 milliseconds, 12264 calls, 60.5126 microseconds average
addRepeatGelu 506.056 milliseconds, 12524 calls, 40.4069 microseconds average
softMaxLong 491.226 milliseconds, 511 calls, 961.303 microseconds average
scaleInPlace 438.956 milliseconds, 12504 calls, 35.1053 microseconds average
addRepeat 403.997 milliseconds, 12984 calls, 31.1149 microseconds average
diagMaskInf 366.713 milliseconds, 12264 calls, 29.9016 microseconds average
convolutionMain 276.364 milliseconds, 10 calls, 27.6364 milliseconds average
convolutionPrep1 44.9126 milliseconds, 20 calls, 2.24563 milliseconds average
convolutionPrep2 20.0013 milliseconds, 20 calls, 1.00006 milliseconds average
addRows 7.2369 milliseconds, 511 calls, 14.1622 microseconds average
add 2.453 milliseconds, 10 calls, 245.3 microseconds average
Memory Usage
Model 877.966 KB RAM, 1.42785 GB VRAM
Context 91.0716 MB RAM, 785.219 MB VRAM
Total 91.929 MB RAM, 2.19467 GB VRAM