1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
|
CPU Tasks
LoadModel 841.605 milliseconds
RunComplete 62.1145 seconds
Run 62.0268 seconds
Callbacks 10.184 milliseconds, 37 calls, 275.243 microseconds average
Spectrogram 200.241 milliseconds, 42 calls, 4.76764 milliseconds average
Sample 63.0473 milliseconds, 511 calls, 123.38 microseconds average
Encode 37.2409 seconds, 10 calls, 3.72409 seconds average
Decode 24.7715 seconds, 10 calls, 2.47715 seconds average
DecodeStep 24.7082 seconds, 511 calls, 48.3526 milliseconds average
GPU Tasks
LoadModel 410.579 milliseconds
Run 61.8044 seconds
Encode 37.8702 seconds, 10 calls, 3.78702 seconds average
EncodeLayer 32.1896 seconds, 240 calls, 134.123 milliseconds average
Decode 23.9262 seconds, 10 calls, 2.39262 seconds average
DecodeStep 23.9233 seconds, 511 calls, 46.8167 milliseconds average
DecodeLayer 21.6888 seconds, 12264 calls, 1.76849 milliseconds average
Compute Shaders
mulMatTiledEx 27.5216 seconds, 2400 calls, 11.4673 milliseconds average
mulMatTiled 10.2385 seconds, 2890 calls, 3.54273 milliseconds average
mulMatByRowTiled 9.38114 seconds, 120741 calls, 77.6964 microseconds average
mulMatByRowTiledEx 4.19991 seconds, 24048 calls, 174.647 microseconds average
softMaxFixed 1.95105 seconds, 12504 calls, 156.034 microseconds average
convolutionMain2Fixed 1.31354 seconds, 10 calls, 131.354 milliseconds average
matReshapePanels 1.04699 seconds, 1450 calls, 722.064 microseconds average
addRepeatGelu 777.683 milliseconds, 12524 calls, 62.0954 microseconds average
scaleInPlace 750.056 milliseconds, 12504 calls, 59.9853 microseconds average
copyConvert 701.517 milliseconds, 25488 calls, 27.5234 microseconds average
normFixed 697.931 milliseconds, 37793 calls, 18.4672 microseconds average
fmaRepeat1 529.007 milliseconds, 37793 calls, 13.9975 microseconds average
addRepeatEx 511.269 milliseconds, 37272 calls, 13.7172 microseconds average
copyTranspose 459.017 milliseconds, 25008 calls, 18.3548 microseconds average
softMaxLong 382.205 milliseconds, 511 calls, 747.955 microseconds average
convolutionMain 328.996 milliseconds, 10 calls, 32.8996 milliseconds average
addRepeat 305.31 milliseconds, 12984 calls, 23.5144 microseconds average
addRepeatScale 261.749 milliseconds, 24528 calls, 10.6715 microseconds average
softMax 148.894 milliseconds, 12264 calls, 12.1407 microseconds average
diagMaskInf 104.681 milliseconds, 12264 calls, 8.53562 microseconds average
convolutionPrep2 45.8033 milliseconds, 20 calls, 2.29017 milliseconds average
convolutionPrep1 32.3779 milliseconds, 20 calls, 1.61889 milliseconds average
add 7.3228 milliseconds, 10 calls, 732.28 microseconds average
addRows 1.6948 milliseconds, 511 calls, 3.31663 microseconds average
Memory Usage
Model 877.966 KB RAM, 1.42785 GB VRAM
Context 91.0716 MB RAM, 833.407 MB VRAM
Total 91.929 MB RAM, 2.24172 GB VRAM
|