author Konstantin <const@const.me> 2023-01-24 18:29:47 +0100
committer Konstantin <const@const.me> 2023-01-24 18:29:47 +0100
commit 993cfa55dcae112822a2e9681056e49697d71338
tree 184746cafd2a650d8a00288cfeb9c7758dc6603a /SampleClips/jfk-medium-vega7.txt
parent 3cdb0f3cc721a0aee7330f4167e3904bdd5b4e18
Performance results for version 1.5
Diffstat (limited to 'SampleClips/jfk-medium-vega7.txt')
-rw-r--r-- SampleClips/jfk-medium-vega7.txt 83
1 file changed, 42 insertions, 41 deletions
diff --git a/SampleClips/jfk-medium-vega7.txt b/SampleClips/jfk-medium-vega7.txt
index 0be45d3..7753030 100644
--- a/SampleClips/jfk-medium-vega7.txt
+++ b/SampleClips/jfk-medium-vega7.txt
@@ -1,46 +1,47 @@
CPU Tasks
-LoadModel 1.44983 seconds
-RunComplete 9.9723 seconds
-Run 9.8953 seconds
-Callbacks 876.5 microseconds, 4 calls, 219.125 microseconds average
-Spectrogram 100.602 milliseconds, 3 calls, 33.5339 milliseconds average
-Sample 8.2281 milliseconds, 28 calls, 293.861 microseconds average
-Encode 8.28685 seconds
-Decode 1.60728 seconds
-DecodeStep 1.599 seconds, 28 calls, 57.1073 milliseconds average
+LoadModel 1.79203 seconds
+RunComplete 8.79853 seconds
+Run 8.71884 seconds
+Callbacks 626.8 microseconds, 4 calls, 156.7 microseconds average
+Spectrogram 17.3373 milliseconds, 3 calls, 5.7791 milliseconds average
+Sample 5.449 milliseconds, 28 calls, 194.607 microseconds average
+Encode 7.29966 seconds
+Decode 1.41824 seconds
+DecodeStep 1.41276 seconds, 28 calls, 50.4557 milliseconds average
GPU Tasks
-LoadModel 751.497 milliseconds
-Run 9.73531 seconds
-Encode 8.28303 seconds
-EncodeLayer 7.19651 seconds, 24 calls, 299.855 milliseconds average
-Decode 1.45228 seconds
-DecodeStep 1.45225 seconds, 28 calls, 51.866 milliseconds average
-DecodeLayer 1.31372 seconds, 672 calls, 1.95494 milliseconds average
+LoadModel 930.123 milliseconds
+Run 8.64946 seconds
+Encode 7.34021 seconds
+EncodeLayer 6.40759 seconds, 24 calls, 266.983 milliseconds average
+Decode 1.30925 seconds
+DecodeStep 1.30389 seconds, 28 calls, 46.5676 milliseconds average
+DecodeLayer 1.19422 seconds, 672 calls, 1.77711 milliseconds average
Compute Shaders
-mulMatTiledEx 5.73474 seconds, 240 calls, 23.8947 milliseconds average
-mulMatTiled 1.59442 seconds, 289 calls, 5.51703 milliseconds average
-mulMatByRowTiled 708.039 milliseconds, 6507 calls, 108.812 microseconds average
-mulMatByRowTiledEx 292.797 milliseconds, 1296 calls, 225.923 microseconds average
-convolutionMain2Fixed 267.762 milliseconds
-softMaxFixed 252.702 milliseconds, 696 calls, 363.078 microseconds average
-addRepeat 122.774 milliseconds, 2808 calls, 43.7229 microseconds average
-matReshapePanels 116.085 milliseconds, 145 calls, 800.583 microseconds average
-convolutionMain 100.111 milliseconds
-addRepeatGelu 78.6895 milliseconds, 698 calls, 112.736 microseconds average
-normFixed 64.6521 milliseconds, 2093 calls, 30.8897 microseconds average
-scaleInPlace 64.0629 milliseconds, 696 calls, 92.0444 microseconds average
-copyConvert 62.7305 milliseconds, 1440 calls, 43.5628 microseconds average
-softMax 50.9006 milliseconds, 700 calls, 72.7151 microseconds average
-fmaRepeat1 49.6347 milliseconds, 2093 calls, 23.7146 microseconds average
-copyTranspose 44.2248 milliseconds, 1392 calls, 31.7707 microseconds average
-addInPlace 44.1766 milliseconds, 1392 calls, 31.7361 microseconds average
-addRepeatScale 31.3737 milliseconds, 1344 calls, 23.3435 microseconds average
-add 19.0564 milliseconds, 673 calls, 28.3156 microseconds average
-convolutionPrep1 8.494 milliseconds, 2 calls, 4.247 milliseconds average
-diagMaskInf 6.9839 milliseconds, 672 calls, 10.3927 microseconds average
-convolutionPrep2 6.0876 milliseconds, 2 calls, 3.0438 milliseconds average
-addRows 72 microseconds, 28 calls, 2.57143 microseconds average
+mulMatTiledEx 4.91773 seconds, 240 calls, 20.4906 milliseconds average
+mulMatTiled 1.47531 seconds, 289 calls, 5.10489 milliseconds average
+mulMatByRowTiled 627.1 milliseconds, 6507 calls, 96.3731 microseconds average
+softMaxFixed 268.285 milliseconds, 696 calls, 385.467 microseconds average
+convolutionMain2Fixed 266.261 milliseconds
+mulMatByRowTiledEx 241.609 milliseconds, 1296 calls, 186.427 microseconds average
+matReshapePanels 156.683 milliseconds, 145 calls, 1.08057 milliseconds average
+convolutionMain 102.091 milliseconds
+copyConvert 77.6113 milliseconds, 1440 calls, 53.8967 microseconds average
+addRepeatGelu 71.5118 milliseconds, 698 calls, 102.452 microseconds average
+copyTranspose 63.3929 milliseconds, 1392 calls, 45.5409 microseconds average
+normFixed 60.9615 milliseconds, 2093 calls, 29.1264 microseconds average
+scaleInPlace 59.9341 milliseconds, 696 calls, 86.1122 microseconds average
+fmaRepeat1 56.3539 milliseconds, 2093 calls, 26.9249 microseconds average
+addRepeatEx 51.8785 milliseconds, 2064 calls, 25.1349 microseconds average
+addRepeat 48.1192 milliseconds, 744 calls, 64.6763 microseconds average
+softMaxLong 28.3411 milliseconds, 28 calls, 1.01218 milliseconds average
+addRepeatScale 21.3646 milliseconds, 1344 calls, 15.8963 microseconds average
+softMax 10.198 milliseconds, 672 calls, 15.1756 microseconds average
+convolutionPrep1 9.1072 milliseconds, 2 calls, 4.5536 milliseconds average
+diagMaskInf 8.3764 milliseconds, 672 calls, 12.4649 microseconds average
+convolutionPrep2 7.4623 milliseconds, 2 calls, 3.73115 milliseconds average
+add 2.3886 milliseconds
+addRows 97.6 microseconds, 28 calls, 3.48571 microseconds average
Memory Usage
Model 877.966 KB RAM, 1.42785 GB VRAM
-Context 1.9836 MB RAM, 771.354 MB VRAM
-Total 2.84099 MB RAM, 2.18113 GB VRAM
+Context 1.9831 MB RAM, 771.235 MB VRAM
+Total 2.84049 MB RAM, 2.18101 GB VRAM