CPU Tasks
LoadModel 7.95251 seconds
RunComplete 109.423 seconds
Run 109.351 seconds
Callbacks 12.7226 milliseconds, 44 calls, 289.15 microseconds average
Spectrogram 270.286 milliseconds, 41 calls, 6.59235 milliseconds average
Sample 69.0965 milliseconds, 527 calls, 131.113 microseconds average
Encode 35.943 seconds, 9 calls, 3.99366 seconds average
Decode 73.3946 seconds, 9 calls, 8.15496 seconds average
DecodeStep 73.3251 seconds, 527 calls, 139.137 milliseconds average
GPU Tasks
LoadModel 7.55659 seconds
Run 109.16 seconds
Encode 36.3141 seconds, 9 calls, 4.0349 seconds average
EncodeLayer 29.8405 seconds, 288 calls, 103.613 milliseconds average
Decode 72.8459 seconds, 9 calls, 8.09398 seconds average
DecodeStep 72.8458 seconds, 527 calls, 138.227 milliseconds average
DecodeLayer 69.0153 seconds, 16864 calls, 4.09247 milliseconds average
Compute Shaders
mulMatTiled 36.8159 seconds, 6345 calls, 5.80234 milliseconds average
mulMatByRowTiled 28.0431 seconds, 199430 calls, 140.616 microseconds average
copyTranspose 8.11917 seconds, 34304 calls, 236.683 microseconds average
fmaRepeat1 7.85961 seconds, 51704 calls, 152.012 microseconds average
addRepeatScale 4.11915 seconds, 33728 calls, 122.129 microseconds average
softMaxFixed 3.22072 seconds, 17152 calls, 187.775 microseconds average
copyConvert 2.8333 seconds, 34880 calls, 81.2298 microseconds average
addRepeatEx 2.78075 seconds, 51168 calls, 54.3455 microseconds average
norm 2.76591 seconds, 51704 calls, 53.495 microseconds average
addRepeatGelu 2.35162 seconds, 17170 calls, 136.961 microseconds average
softMaxLong 2.24788 seconds, 527 calls, 4.26543 milliseconds average
softMax 2.21477 seconds, 16864 calls, 131.331 microseconds average
convolutionMain2Fixed 1.38064 seconds, 9 calls, 153.405 milliseconds average
addRepeat 1.30665 seconds, 17728 calls, 73.7057 microseconds average
scaleInPlace 1.10329 seconds, 17152 calls, 64.3245 microseconds average
diagMaskInf 937.457 milliseconds, 16864 calls, 55.5892 microseconds average
convolutionMain 374.967 milliseconds, 9 calls, 41.663 milliseconds average
convolutionPrep1 119.171 milliseconds, 18 calls, 6.62059 milliseconds average
convolutionPrep2 27.8894 milliseconds, 18 calls, 1.54941 milliseconds average
addRows 5.2536 milliseconds, 527 calls, 9.96888 microseconds average
add 2.8285 milliseconds, 9 calls, 314.278 microseconds average
Memory Usage
Model 892.591 KB RAM, 2.8815 GB VRAM
Context 92.2612 MB RAM, 1.14026 GB VRAM
Total 93.1329 MB RAM, 4.02176 GB VRAM