1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
|
CPU Tasks
LoadModel 593.861 milliseconds
RunComplete 1.13909 seconds
Run 1.06578 seconds
Callbacks 279.4 microseconds, 4 calls, 69.85 microseconds average
Spectrogram 12.0744 milliseconds, 3 calls, 4.0248 milliseconds average
Sample 3.0016 milliseconds, 28 calls, 107.2 microseconds average
Encode 614.44 milliseconds
Decode 451.048 milliseconds
DecodeStep 448.036 milliseconds, 28 calls, 16.0013 milliseconds average
GPU Tasks
LoadModel 452.128 milliseconds
Run 1.0518 seconds
Encode 599.964 milliseconds
EncodeLayer 506.67 milliseconds, 24 calls, 21.1113 milliseconds average
Decode 451.832 milliseconds
DecodeStep 451.828 milliseconds, 28 calls, 16.1367 milliseconds average
DecodeLayer 412.142 milliseconds, 672 calls, 613.307 microseconds average
Compute Shaders
mulMatTiled 562.478 milliseconds, 529 calls, 1.06329 milliseconds average
mulMatByRowTiled 256.062 milliseconds, 7803 calls, 32.8158 microseconds average
softMaxFixed 27.1687 milliseconds, 696 calls, 39.0355 microseconds average
normFixed 24.1828 milliseconds, 2093 calls, 11.5541 microseconds average
fmaRepeat1 23.3089 milliseconds, 2093 calls, 11.1366 microseconds average
addRepeatEx 22.3395 milliseconds, 2064 calls, 10.8234 microseconds average
softMaxLong 19.7192 milliseconds, 28 calls, 704.257 microseconds average
copyConvert 19.301 milliseconds, 1440 calls, 13.4035 microseconds average
copyTranspose 15.3011 milliseconds, 1392 calls, 10.9922 microseconds average
addRepeatScale 13.6043 milliseconds, 1344 calls, 10.1222 microseconds average
convolutionMain2Fixed 12.1242 milliseconds
addRepeatGelu 11.6172 milliseconds, 698 calls, 16.6436 microseconds average
addRepeat 11.5331 milliseconds, 744 calls, 15.5015 microseconds average
scaleInPlace 9.5743 milliseconds, 696 calls, 13.7562 microseconds average
convolutionMain 7.0349 milliseconds
softMax 5.8329 milliseconds, 672 calls, 8.67991 microseconds average
diagMaskInf 4.5297 milliseconds, 672 calls, 6.74062 microseconds average
convolutionPrep1 1.5258 milliseconds, 2 calls, 762.9 microseconds average
convolutionPrep2 383 microseconds, 2 calls, 191.5 microseconds average
addRows 194.6 microseconds, 28 calls, 6.95 microseconds average
add 95.2 microseconds
Memory Usage
Model 877.966 KB RAM, 1.42785 GB VRAM
Context 1.9831 MB RAM, 723.782 MB VRAM
Total 2.84049 MB RAM, 2.13467 GB VRAM
|