1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
|
CPU Tasks
LoadModel 950.578 milliseconds
RunComplete 27.5329 seconds
Run 27.434 seconds
Callbacks 10.6484 milliseconds, 44 calls, 242.009 microseconds average
Spectrogram 199.106 milliseconds, 41 calls, 4.85624 milliseconds average
Sample 58.7404 milliseconds, 527 calls, 111.462 microseconds average
Encode 11.3813 seconds, 9 calls, 1.26459 seconds average
Decode 16.0418 seconds, 9 calls, 1.78242 seconds average
DecodeStep 15.9829 seconds, 527 calls, 30.3281 milliseconds average
GPU Tasks
LoadModel 805.211 milliseconds
Run 27.338 seconds
Encode 11.3967 seconds, 9 calls, 1.2663 seconds average
EncodeLayer 9.78685 seconds, 288 calls, 33.9821 milliseconds average
Decode 15.9412 seconds, 9 calls, 1.77125 seconds average
DecodeStep 15.9412 seconds, 527 calls, 30.249 milliseconds average
DecodeLayer 15.0511 seconds, 16864 calls, 892.499 microseconds average
Compute Shaders
mulMatTiled 12.0503 seconds, 6345 calls, 1.89919 milliseconds average
mulMatByRowTiled 9.45404 seconds, 199430 calls, 47.4053 microseconds average
norm 1.32432 seconds, 51704 calls, 25.6135 microseconds average
fmaRepeat1 583.884 milliseconds, 51704 calls, 11.2928 microseconds average
addRepeatEx 536.551 milliseconds, 51168 calls, 10.4861 microseconds average
softMaxFixed 534.105 milliseconds, 17152 calls, 31.1395 microseconds average
copyConvert 500.4 milliseconds, 34880 calls, 14.3463 microseconds average
copyTranspose 377.38 milliseconds, 34304 calls, 11.001 microseconds average
addRepeatScale 315.294 milliseconds, 33728 calls, 9.34814 microseconds average
addRepeatGelu 283.978 milliseconds, 17170 calls, 16.5392 microseconds average
softMaxLong 245.57 milliseconds, 527 calls, 465.976 microseconds average
scaleInPlace 226.545 milliseconds, 17152 calls, 13.2081 microseconds average
softMax 212.206 milliseconds, 16864 calls, 12.5834 microseconds average
addRepeat 209.397 milliseconds, 17728 calls, 11.8117 microseconds average
convolutionMain2Fixed 184.615 milliseconds, 9 calls, 20.5128 milliseconds average
diagMaskInf 107.423 milliseconds, 16864 calls, 6.36998 microseconds average
convolutionMain 74.7954 milliseconds, 9 calls, 8.3106 milliseconds average
convolutionPrep1 20.9316 milliseconds, 18 calls, 1.16287 milliseconds average
convolutionPrep2 3.8103 milliseconds, 18 calls, 211.683 microseconds average
addRows 3.7939 milliseconds, 527 calls, 7.19905 microseconds average
add 1.0895 milliseconds, 9 calls, 121.056 microseconds average
Memory Usage
Model 892.591 KB RAM, 2.8815 GB VRAM
Context 92.2612 MB RAM, 1.14026 GB VRAM
Total 93.1329 MB RAM, 4.02176 GB VRAM
|