1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
|
CPU Tasks
LoadModel 2.88964 seconds
RunComplete 140.747 seconds
Run 140.661 seconds
Callbacks 20.302 milliseconds, 44 calls, 461.409 microseconds average
Spectrogram 468.419 milliseconds, 41 calls, 11.4249 milliseconds average
Sample 139.558 milliseconds, 527 calls, 264.815 microseconds average
Encode 87.5396 seconds, 9 calls, 9.72662 seconds average
Decode 53.0971 seconds, 9 calls, 5.89968 seconds average
DecodeStep 52.9566 seconds, 527 calls, 100.487 milliseconds average
GPU Tasks
LoadModel 1.86694 seconds
Run 140.175 seconds
Encode 88.7441 seconds, 9 calls, 9.86046 seconds average
EncodeLayer 75.809 seconds, 288 calls, 263.226 milliseconds average
Decode 51.4306 seconds, 9 calls, 5.71451 seconds average
DecodeStep 51.43 seconds, 527 calls, 97.5901 milliseconds average
DecodeLayer 48.1822 seconds, 16864 calls, 2.85711 milliseconds average
Compute Shaders
mulMatTiledEx 69.1011 seconds, 2880 calls, 23.9934 milliseconds average
mulMatTiled 21.009 seconds, 3465 calls, 6.06321 milliseconds average
mulMatByRowTiled 20.0965 seconds, 166278 calls, 120.861 microseconds average
mulMatByRowTiledEx 9.61326 seconds, 33152 calls, 289.975 microseconds average
softMaxFixed 3.7631 seconds, 17152 calls, 219.397 microseconds average
norm 2.23806 seconds, 51704 calls, 43.2859 microseconds average
convolutionMain2Fixed 2.12825 seconds, 9 calls, 236.472 milliseconds average
matReshapePanels 2.0333 seconds, 1737 calls, 1.17058 milliseconds average
addRepeatGelu 1.5491 seconds, 17170 calls, 90.2211 microseconds average
scaleInPlace 1.32928 seconds, 17152 calls, 77.5001 microseconds average
copyConvert 1.23135 seconds, 34880 calls, 35.3026 microseconds average
fmaRepeat1 1.10337 seconds, 51704 calls, 21.3401 microseconds average
addRepeatEx 1.00095 seconds, 51168 calls, 19.562 microseconds average
copyTranspose 846.807 milliseconds, 34304 calls, 24.6854 microseconds average
addRepeat 704.028 milliseconds, 17728 calls, 39.7128 microseconds average
softMaxLong 608.58 milliseconds, 527 calls, 1.1548 milliseconds average
convolutionMain 522.249 milliseconds, 9 calls, 58.0277 milliseconds average
addRepeatScale 500.937 milliseconds, 33728 calls, 14.8523 microseconds average
softMax 236.054 milliseconds, 16864 calls, 13.9975 microseconds average
diagMaskInf 171.964 milliseconds, 16864 calls, 10.1971 microseconds average
convolutionPrep1 60.7331 milliseconds, 18 calls, 3.37406 milliseconds average
convolutionPrep2 33.441 milliseconds, 18 calls, 1.85783 milliseconds average
add 12.0883 milliseconds, 9 calls, 1.34314 milliseconds average
addRows 1.9724 milliseconds, 527 calls, 3.74269 microseconds average
Memory Usage
Model 892.591 KB RAM, 2.8815 GB VRAM
Context 92.2612 MB RAM, 1.19934 GB VRAM
Total 93.1329 MB RAM, 4.08084 GB VRAM
|