Diffstat (limited to 'SampleClips')
-rw-r--r-- SampleClips/Readme.txt 18
-rw-r--r-- SampleClips/columbia-large-1650.txt 43
-rw-r--r-- SampleClips/columbia-medium-1650.txt 43
-rw-r--r-- SampleClips/jfk-large-1650.txt 43
-rw-r--r-- SampleClips/jfk-medium-1650.txt 43
5 files changed, 184 insertions, 6 deletions
diff --git a/SampleClips/Readme.txt b/SampleClips/Readme.txt
index 69432f4..0945728 100644
--- a/SampleClips/Readme.txt
+++ b/SampleClips/Readme.txt
@@ -3,15 +3,21 @@
jfk.wav is from whisper.cpp repository.
columbia.wma is from Wikipedia: https://upload.wikimedia.org/wikipedia/commons/1/1f/George_W_Bush_Columbia_FINAL.ogg
-I had to re-encoded the audio from Ogg Vorbis into Windows Media Audio, because Media Foundation is unable to decode Vorbis.
+I re-encoded the audio from Ogg Vorbis into Windows Media Audio, because Media Foundation is unable to decode Vorbis.
-The rest of the text files in this folder are the outputs of the in-app performance profiler, when the app was transcribing these two audio clips on two different computers.
+The rest of the text files in this folder are the outputs of the in-app performance profiler, when the app was transcribing these two audio clips on three different computers.
-The files names containing `1080ti` are from my desktop, which has nVidia GeForce 1080Ti GPU.
+The “1080ti” files are from my desktop, which has nVidia GeForce 1080Ti GPU.
-The files names containing `vega7` are from my laptop, the GPU is integrated into AMD Ryzen 5 5600U processor.
+The “vega7” files are from my laptop, the GPU is integrated into AMD Ryzen 5 5600U processor.
The laptop model is HP ProBook 445 G8. While running the tests, the laptop was on battery power.
-The files names with `medium` in the middle were made with `ggml-medium.bin` Whisper model.
+The “1650” files are from another desktop with nVidia GeForce 1650.
-The files with `large` were made with `ggml-large.bin` model.
\ No newline at end of file
+The file names with “medium” in the middle were made with “ggml-medium.bin” Whisper model, with “large” were made with “ggml-large.bin” model.
+
+In theory, 1080ti delivers 10.6 TFlops FP32 and 484.4 GB/second VRAM bandwidth.
+
+That variant of 1650 delivers 2.6 TFlops FP32, and 128.1 GB/second VRAM bandwidth.
+
+The AMD APU in that laptop delivers 1.6 TFlops FP32, and 51.2 GB/second memory bandwidth.
\ No newline at end of file
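The theoretical figures added to the Readme above can be compared with a little arithmetic. The following is a minimal sketch, not part of the commit: it takes the three TFlops and bandwidth pairs as quoted and derives each GPU's compute-to-bandwidth ratio and its peak figures relative to the 1650. The dictionary keys reuse the file-name labels from the Readme; the function names are hypothetical.

```python
# Theoretical peak figures quoted in the Readme: (TFlops FP32, GB/second).
GPUS = {
    "1080ti": (10.6, 484.4),
    "1650": (2.6, 128.1),
    "vega7": (1.6, 51.2),
}


def flops_per_byte(tflops, gb_per_second):
    """Peak FP32 operations available per byte of memory traffic."""
    return (tflops * 1e12) / (gb_per_second * 1e9)


def relative_to(baseline, name):
    """Return (compute ratio, bandwidth ratio) of `name` versus `baseline`."""
    base_tflops, base_bw = GPUS[baseline]
    tflops, bw = GPUS[name]
    return tflops / base_tflops, bw / base_bw


for name, (tflops, bw) in GPUS.items():
    compute_ratio, bw_ratio = relative_to("1650", name)
    print(
        f"{name}: {flops_per_byte(tflops, bw):.2f} FLOPs/byte, "
        f"{compute_ratio:.2f}x compute and {bw_ratio:.2f}x bandwidth vs 1650"
    )
```

These are ratios of theoretical peaks only; the measured timings in the profiler outputs depend on many other factors.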
diff --git a/SampleClips/columbia-large-1650.txt b/SampleClips/columbia-large-1650.txt
new file mode 100644
index 0000000..ffc9e5c
--- /dev/null
+++ b/SampleClips/columbia-large-1650.txt
@@ -0,0 +1,43 @@
+ CPU Tasks
+LoadModel 1.39046 seconds
+RunComplete 98.7705 seconds
+Run 98.6893 seconds
+Callbacks 10.9446 milliseconds, 44 calls, 248.741 microseconds average
+Spectrogram 1.10864 seconds, 41 calls, 27.04 milliseconds average
+Sample 62.5537 milliseconds, 527 calls, 118.698 microseconds average
+Encode 60.6321 seconds, 9 calls, 6.7369 seconds average
+Decode 38.0118 seconds, 9 calls, 4.22353 seconds average
+DecodeStep 37.949 seconds, 527 calls, 72.0095 milliseconds average
+ GPU Tasks
+LoadModel 1.19991 seconds
+Run 98.4248 seconds
+Encode 61.0298 seconds, 9 calls, 6.78109 seconds average
+EncodeLayer 51.7844 seconds, 288 calls, 179.807 milliseconds average
+Decode 37.395 seconds, 9 calls, 4.155 seconds average
+DecodeStep 37.3947 seconds, 527 calls, 70.9577 milliseconds average
+DecodeLayer 34.8821 seconds, 16864 calls, 2.06843 milliseconds average
+ Compute Shaders
+mulMatTiled 65.2919 seconds, 6345 calls, 10.2903 milliseconds average
+mulMatByRowTiled 22.3701 seconds, 199430 calls, 112.17 microseconds average
+convolutionMain2Fixed 1.37801 seconds, 9 calls, 153.113 milliseconds average
+softMaxFixed 1.32519 seconds, 17152 calls, 77.2618 microseconds average
+addRepeat 1.0237 seconds, 68896 calls, 14.8586 microseconds average
+copyTranspose 974.149 milliseconds, 34304 calls, 28.3975 microseconds average
+norm 971.572 milliseconds, 51704 calls, 18.791 microseconds average
+softMax 956.611 milliseconds, 17391 calls, 55.0061 microseconds average
+copyConvert 899.362 milliseconds, 34880 calls, 25.7845 microseconds average
+fmaRepeat1 675.729 milliseconds, 51704 calls, 13.0692 microseconds average
+addRepeatGelu 531.623 milliseconds, 17170 calls, 30.9623 microseconds average
+addInPlace 461.61 milliseconds, 34304 calls, 13.4564 microseconds average
+scaleInPlace 394.457 milliseconds, 17152 calls, 22.9978 microseconds average
+convolutionMain 331.124 milliseconds, 9 calls, 36.7915 milliseconds average
+addRepeatScale 329.854 milliseconds, 33728 calls, 9.77983 microseconds average
+add 203.376 milliseconds, 16873 calls, 12.0534 microseconds average
+diagMaskInf 107.127 milliseconds, 16864 calls, 6.3524 microseconds average
+convolutionPrep1 58.8876 milliseconds, 18 calls, 3.27153 milliseconds average
+convolutionPrep2 9.1367 milliseconds, 18 calls, 507.594 microseconds average
+addRows 3.6551 milliseconds, 527 calls, 6.93567 microseconds average
+ Memory Usage
+Model 892.591 KB RAM, 2.8815 GB VRAM
+Context 92.2616 MB RAM, 1.20719 GB VRAM
+Total 93.1333 MB RAM, 4.08869 GB VRAM
diff --git a/SampleClips/columbia-medium-1650.txt b/SampleClips/columbia-medium-1650.txt
new file mode 100644
index 0000000..10d6984
--- /dev/null
+++ b/SampleClips/columbia-medium-1650.txt
@@ -0,0 +1,43 @@
+ CPU Tasks
+LoadModel 818.374 milliseconds
+RunComplete 55.336 seconds
+Run 55.238 seconds
+Callbacks 8.3113 milliseconds, 37 calls, 224.63 microseconds average
+Spectrogram 1.11163 seconds, 42 calls, 26.4674 milliseconds average
+Sample 59.2017 milliseconds, 511 calls, 115.855 microseconds average
+Encode 33.7839 seconds, 10 calls, 3.37839 seconds average
+Decode 21.4456 seconds, 10 calls, 2.14456 seconds average
+DecodeStep 21.3862 seconds, 511 calls, 41.8517 milliseconds average
+ GPU Tasks
+LoadModel 626.222 milliseconds
+Run 55.0407 seconds
+Encode 34.044 seconds, 10 calls, 3.4044 seconds average
+EncodeLayer 28.8064 seconds, 240 calls, 120.027 milliseconds average
+Decode 20.9967 seconds, 10 calls, 2.09967 seconds average
+DecodeStep 20.9967 seconds, 511 calls, 41.0894 milliseconds average
+DecodeLayer 19.0732 seconds, 12264 calls, 1.55522 milliseconds average
+ Compute Shaders
+mulMatTiled 36.347 seconds, 5290 calls, 6.87089 milliseconds average
+mulMatByRowTiled 12.1268 seconds, 144789 calls, 83.7549 microseconds average
+convolutionMain2Fixed 956.94 milliseconds, 10 calls, 95.694 milliseconds average
+softMaxFixed 878.266 milliseconds, 12504 calls, 70.2388 microseconds average
+softMax 708.091 milliseconds, 12775 calls, 55.4279 microseconds average
+addRepeat 648.271 milliseconds, 50256 calls, 12.8994 microseconds average
+copyConvert 532.099 milliseconds, 25488 calls, 20.8764 microseconds average
+copyTranspose 467.681 milliseconds, 25008 calls, 18.7013 microseconds average
+normFixed 393.9 milliseconds, 37793 calls, 10.4226 microseconds average
+addRepeatGelu 354.445 milliseconds, 12524 calls, 28.3013 microseconds average
+fmaRepeat1 348.257 milliseconds, 37793 calls, 9.21484 microseconds average
+addInPlace 308.862 milliseconds, 25008 calls, 12.3505 microseconds average
+convolutionMain 278.894 milliseconds, 10 calls, 27.8894 milliseconds average
+addRepeatScale 199.387 milliseconds, 24528 calls, 8.12898 microseconds average
+scaleInPlace 197.51 milliseconds, 12504 calls, 15.7958 microseconds average
+add 134.664 milliseconds, 12274 calls, 10.9715 microseconds average
+diagMaskInf 57.9927 milliseconds, 12264 calls, 4.72869 microseconds average
+convolutionPrep1 41.0155 milliseconds, 20 calls, 2.05077 milliseconds average
+convolutionPrep2 8.0689 milliseconds, 20 calls, 403.445 microseconds average
+addRows 3.1188 milliseconds, 511 calls, 6.10333 microseconds average
+ Memory Usage
+Model 877.966 KB RAM, 1.42785 GB VRAM
+Context 91.0719 MB RAM, 841.634 MB VRAM
+Total 91.9293 MB RAM, 2.24976 GB VRAM
diff --git a/SampleClips/jfk-large-1650.txt b/SampleClips/jfk-large-1650.txt
new file mode 100644
index 0000000..9c6e4b8
--- /dev/null
+++ b/SampleClips/jfk-large-1650.txt
@@ -0,0 +1,43 @@
+ CPU Tasks
+LoadModel 1.4018 seconds
+RunComplete 8.71063 seconds
+Run 8.64303 seconds
+Callbacks 251.9 microseconds, 4 calls, 62.975 microseconds average
+Spectrogram 62.1203 milliseconds, 3 calls, 20.7068 milliseconds average
+Sample 3.5493 milliseconds, 27 calls, 131.456 microseconds average
+Encode 6.90879 seconds
+Decode 1.73396 seconds
+DecodeStep 1.73039 seconds, 27 calls, 64.0887 milliseconds average
+ GPU Tasks
+LoadModel 1.20907 seconds
+Run 8.4523 seconds
+Encode 6.83046 seconds
+EncodeLayer 5.71692 seconds, 32 calls, 178.654 milliseconds average
+Decode 1.62184 seconds
+DecodeStep 1.62184 seconds, 27 calls, 60.068 milliseconds average
+DecodeLayer 1.51049 seconds, 864 calls, 1.74825 milliseconds average
+ Compute Shaders
+mulMatTiled 6.39268 seconds, 705 calls, 9.06763 milliseconds average
+mulMatByRowTiled 1.09505 seconds, 10010 calls, 109.395 microseconds average
+convolutionMain2Fixed 155.164 milliseconds
+convolutionMain 123.525 milliseconds
+softMaxFixed 120.173 milliseconds, 896 calls, 134.122 microseconds average
+norm 84.1752 milliseconds, 2684 calls, 31.3618 microseconds average
+copyConvert 78.0956 milliseconds, 1856 calls, 42.0774 microseconds average
+addRepeat 63.3192 milliseconds, 3616 calls, 17.5108 microseconds average
+fmaRepeat1 56.6908 milliseconds, 2684 calls, 21.1218 microseconds average
+softMax 54.6717 milliseconds, 891 calls, 61.3599 microseconds average
+addInPlace 39.7892 milliseconds, 1792 calls, 22.2038 microseconds average
+copyTranspose 38.8897 milliseconds, 1792 calls, 21.7018 microseconds average
+addRepeatGelu 34.762 milliseconds, 898 calls, 38.7105 microseconds average
+add 33.3001 milliseconds, 865 calls, 38.4972 microseconds average
+scaleInPlace 24.343 milliseconds, 896 calls, 27.1685 microseconds average
+addRepeatScale 18.8872 milliseconds, 1728 calls, 10.9301 microseconds average
+convolutionPrep1 7.8052 milliseconds, 2 calls, 3.9026 milliseconds average
+diagMaskInf 4.1647 milliseconds, 864 calls, 4.82025 microseconds average
+convolutionPrep2 1.209 milliseconds, 2 calls, 604.5 microseconds average
+addRows 183.6 microseconds, 27 calls, 6.8 microseconds average
+ Memory Usage
+Model 892.591 KB RAM, 2.8815 GB VRAM
+Context 1.98413 MB RAM, 1.07361 GB VRAM
+Total 2.8558 MB RAM, 3.95511 GB VRAM
diff --git a/SampleClips/jfk-medium-1650.txt b/SampleClips/jfk-medium-1650.txt
new file mode 100644
index 0000000..b072607
--- /dev/null
+++ b/SampleClips/jfk-medium-1650.txt
@@ -0,0 +1,43 @@
+ CPU Tasks
+LoadModel 818.309 milliseconds
+RunComplete 4.59853 seconds
+Run 4.51124 seconds
+Callbacks 259.1 microseconds, 4 calls, 64.775 microseconds average
+Spectrogram 62.0087 milliseconds, 3 calls, 20.6696 milliseconds average
+Sample 3.3139 milliseconds, 28 calls, 118.354 microseconds average
+Encode 3.54162 seconds
+Decode 969.342 milliseconds
+DecodeStep 966.005 milliseconds, 28 calls, 34.5002 milliseconds average
+ GPU Tasks
+LoadModel 623.002 milliseconds
+Run 4.38954 seconds
+Encode 3.46286 seconds
+EncodeLayer 2.86548 seconds, 24 calls, 119.395 milliseconds average
+Decode 926.677 milliseconds
+DecodeStep 926.674 milliseconds, 28 calls, 33.0955 milliseconds average
+DecodeLayer 843.963 milliseconds, 672 calls, 1.2559 milliseconds average
+ Compute Shaders
+mulMatTiled 3.19154 seconds, 529 calls, 6.03316 milliseconds average
+mulMatByRowTiled 628.359 milliseconds, 7803 calls, 80.5278 microseconds average
+convolutionMain2Fixed 98.3757 milliseconds
+convolutionMain 95.2955 milliseconds
+softMaxFixed 73.4031 milliseconds, 696 calls, 105.464 microseconds average
+addRepeat 58.0541 milliseconds, 2808 calls, 20.6745 microseconds average
+copyConvert 42.8539 milliseconds, 1440 calls, 29.7597 microseconds average
+softMax 37.7754 milliseconds, 700 calls, 53.9649 microseconds average
+normFixed 25.4389 milliseconds, 2093 calls, 12.1543 microseconds average
+fmaRepeat1 24.6287 milliseconds, 2093 calls, 11.7672 microseconds average
+addRepeatGelu 24.2553 milliseconds, 698 calls, 34.7497 microseconds average
+copyTranspose 24.2415 milliseconds, 1392 calls, 17.4149 microseconds average
+addInPlace 20.4598 milliseconds, 1392 calls, 14.6981 microseconds average
+scaleInPlace 12.8947 milliseconds, 696 calls, 18.5269 microseconds average
+addRepeatScale 10.8749 milliseconds, 1344 calls, 8.09144 microseconds average
+add 7.3752 milliseconds, 673 calls, 10.9587 microseconds average
+convolutionPrep1 6.0929 milliseconds, 2 calls, 3.04645 milliseconds average
+diagMaskInf 3.2818 milliseconds, 672 calls, 4.88363 microseconds average
+convolutionPrep2 1.2268 milliseconds, 2 calls, 613.4 microseconds average
+addRows 165.9 microseconds, 28 calls, 5.925 microseconds average
+ Memory Usage
+Model 877.966 KB RAM, 1.42785 GB VRAM
+Context 1.98347 MB RAM, 723.729 MB VRAM
+Total 2.84085 MB RAM, 2.13462 GB VRAM
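The profiler outputs above share a simple line layout: a task name, a total time with a unit, and optionally a call count and an average. The sketch below is one possible way to read such lines into seconds for comparison across files. It is an illustration only; the regular expression and the `parse_line` helper are hypothetical and not part of the application, and lines without a call count are reported with `None` rather than a guessed count.

```python
import re

# Conversion factors for the time units that appear in the profiler output.
UNIT_SECONDS = {"seconds": 1.0, "milliseconds": 1e-3, "microseconds": 1e-6}

# Matches e.g. "+mulMatTiled 3.19154 seconds, 529 calls, 6.03316 milliseconds average"
# and also lines without a call count, e.g. "+Encode 3.46286 seconds".
LINE_RE = re.compile(
    r"^\+?\s*(?P<name>\w+)\s+(?P<value>[\d.]+)\s+"
    r"(?P<unit>seconds|milliseconds|microseconds)"
    r"(?:,\s*(?P<calls>\d+)\s+calls)?"
)


def parse_line(line):
    """Return (name, total_seconds, calls_or_None), or None for non-timing lines."""
    m = LINE_RE.match(line)
    if m is None:
        return None  # section headers and memory-usage lines end up here
    total = float(m["value"]) * UNIT_SECONDS[m["unit"]]
    calls = int(m["calls"]) if m["calls"] else None
    return m["name"], total, calls


# Three lines copied from jfk-medium-1650.txt above.
sample = [
    "+Run 4.38954 seconds",
    "+mulMatTiled 3.19154 seconds, 529 calls, 6.03316 milliseconds average",
    "+mulMatByRowTiled 628.359 milliseconds, 7803 calls, 80.5278 microseconds average",
]
times = {name: total for name, total, _ in map(parse_line, sample)}
matmul_share = (times["mulMatTiled"] + times["mulMatByRowTiled"]) / times["Run"]
print(f"matrix multiplication shaders: {matmul_share:.1%} of GPU Run time")
```

For the three sample lines, the two matrix multiplication shaders come to roughly 87% of the GPU `Run` time. This assumes the shader totals are measured within that `Run` interval, which the logs themselves do not state.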