| author | Konstantin <const@const.me> | 2023-01-16 23:28:54 +0100 |
|---|---|---|
| committer | Konstantin <const@const.me> | 2023-01-16 23:28:54 +0100 |
| commit | 5f11a8275d70dfd27b5998ab91b4ffb7b15ca418 | |
| tree | 3135ce4ad2d3e3234453156c83fb899873ea9bea | |
| parent | 15aea5bccc5d03c15edc8acbdd1dacd6f2382caf | |
Sample audio clips, with performance stats
| Mode | File | Lines added |
|---|---|---|
| -rw-r--r-- | SampleClips/Readme.txt | 17 |
| -rw-r--r-- | SampleClips/columbia-large-1080ti.txt | 43 |
| -rw-r--r-- | SampleClips/columbia-large-vega7.txt | 46 |
| -rw-r--r-- | SampleClips/columbia-medium-1080ti.txt | 43 |
| -rw-r--r-- | SampleClips/columbia-medium-vega7.txt | 46 |
| -rw-r--r-- | SampleClips/columbia.wma | bin, 0 -> 3191097 bytes |
| -rw-r--r-- | SampleClips/jfk-large-1080ti.txt | 43 |
| -rw-r--r-- | SampleClips/jfk-large-vega7.txt | 46 |
| -rw-r--r-- | SampleClips/jfk-medium-1080ti.txt | 43 |
| -rw-r--r-- | SampleClips/jfk-medium-vega7.txt | 46 |
| -rw-r--r-- | SampleClips/jfk.wav | bin, 0 -> 352078 bytes |
11 files changed, 373 insertions, 0 deletions
diff --git a/SampleClips/Readme.txt b/SampleClips/Readme.txt
new file mode 100644
index 0000000..69432f4
--- /dev/null
+++ b/SampleClips/Readme.txt
@@ -0,0 +1,17 @@
+This folder contains 2 sample speech audio clips, `jfk.wav` and `columbia.wma`
+
+jfk.wav is from whisper.cpp repository.
+
+columbia.wma is from Wikipedia: https://upload.wikimedia.org/wikipedia/commons/1/1f/George_W_Bush_Columbia_FINAL.ogg
+I had to re-encoded the audio from Ogg Vorbis into Windows Media Audio, because Media Foundation is unable to decode Vorbis.
+
+The rest of the text files in this folder are the outputs of the in-app performance profiler, when the app was transcribing these two audio clips on two different computers.
+
+The files names containing `1080ti` are from my desktop, which has nVidia GeForce 1080Ti GPU.
+
+The files names containing `vega7` are from my laptop, the GPU is integrated into AMD Ryzen 5 5600U processor.
+The laptop model is HP ProBook 445 G8. While running the tests, the laptop was on battery power.
+
+The files names with `medium` in the middle were made with `ggml-medium.bin` Whisper model.
+
+The files with `large` were made with `ggml-large.bin` model.
\ No newline at end of file
diff --git a/SampleClips/columbia-large-1080ti.txt b/SampleClips/columbia-large-1080ti.txt
new file mode 100644
index 0000000..a15ff4e
--- /dev/null
+++ b/SampleClips/columbia-large-1080ti.txt
@@ -0,0 +1,43 @@
+ CPU Tasks
+LoadModel 6.69478 seconds
+RunComplete 33.7046 seconds
+Run 33.637 seconds
+Callbacks 12.7347 milliseconds, 44 calls, 289.425 microseconds average
+Spectrogram 679.962 milliseconds, 41 calls, 16.5844 milliseconds average
+Sample 64.9643 milliseconds, 527 calls, 123.272 microseconds average
+Encode 13.5814 seconds, 9 calls, 1.50905 seconds average
+Decode 20.0426 seconds, 9 calls, 2.22696 seconds average
+DecodeStep 19.9774 seconds, 527 calls, 37.9077 milliseconds average
+ GPU Tasks
+LoadModel 6.50695 seconds
+Run 33.4847 seconds
+Encode 13.6283 seconds, 9 calls, 1.51426 seconds average
+EncodeLayer 11.6754 seconds, 288 calls, 40.5397 milliseconds average
+Decode 19.8563 seconds, 9 calls, 2.20626 seconds average
+DecodeStep 19.8559 seconds, 527 calls, 37.6773 milliseconds average
+DecodeLayer 18.5337 seconds, 16864 calls, 1.09901 milliseconds average
+ Compute Shaders
+mulMatTiled 14.6726 seconds, 6345 calls, 2.31247 milliseconds average
+mulMatByRowTiled 11.8939 seconds, 199430 calls, 59.6393 microseconds average
+norm 1.3396 seconds, 51704 calls, 25.909 microseconds average
+softMax 858.923 milliseconds, 17391 calls, 49.3889 microseconds average
+addRepeat 792.962 milliseconds, 68896 calls, 11.5096 microseconds average
+fmaRepeat1 567.753 milliseconds, 51704 calls, 10.9808 microseconds average
+copyConvert 541.081 milliseconds, 34880 calls, 15.5126 microseconds average
+softMaxFixed 523.378 milliseconds, 17152 calls, 30.5141 microseconds average
+copyTranspose 422.677 milliseconds, 34304 calls, 12.3215 microseconds average
+addRepeatScale 329.963 milliseconds, 33728 calls, 9.78305 microseconds average
+addInPlace 306.328 milliseconds, 34304 calls, 8.92981 microseconds average
+addRepeatGelu 290.074 milliseconds, 17170 calls, 16.8942 microseconds average
+scaleInPlace 237.756 milliseconds, 17152 calls, 13.8617 microseconds average
+add 196.816 milliseconds, 16873 calls, 11.6645 microseconds average
+convolutionMain2Fixed 187.457 milliseconds, 9 calls, 20.8285 milliseconds average
+diagMaskInf 103.247 milliseconds, 16864 calls, 6.12231 microseconds average
+convolutionMain 75.5589 milliseconds, 9 calls, 8.39543 milliseconds average
+convolutionPrep1 21.4927 milliseconds, 18 calls, 1.19404 milliseconds average
+addRows 9.2908 milliseconds, 527 calls, 17.6296 microseconds average
+convolutionPrep2 5.0944 milliseconds, 18 calls, 283.022 microseconds average
+ Memory Usage
+Model 892.591 KB RAM, 2.8815 GB VRAM
+Context 92.2616 MB RAM, 1.20719 GB VRAM
+Total 93.1333 MB RAM, 4.08869 GB VRAM
diff --git a/SampleClips/columbia-large-vega7.txt b/SampleClips/columbia-large-vega7.txt
new file mode 100644
index 0000000..836654e
--- /dev/null
+++ b/SampleClips/columbia-large-vega7.txt
@@ -0,0 +1,46 @@
+ CPU Tasks
+LoadModel 3.44286 seconds
+RunComplete 174.677 seconds
+Run 174.601 seconds
+Callbacks 22.604 milliseconds, 44 calls, 513.727 microseconds average
+Spectrogram 1.65973 seconds, 41 calls, 40.4812 milliseconds average
+Sample 148.233 milliseconds, 527 calls, 281.276 microseconds average
+Encode 110.192 seconds, 9 calls, 12.2436 seconds average
+Decode 64.3834 seconds, 9 calls, 7.15371 seconds average
+DecodeStep 64.2344 seconds, 527 calls, 121.887 milliseconds average
+ GPU Tasks
+LoadModel 2.20374 seconds
+Run 173.895 seconds
+Encode 111.531 seconds, 9 calls, 12.3923 seconds average
+EncodeLayer 96.2295 seconds, 288 calls, 334.13 milliseconds average
+Decode 62.3642 seconds, 9 calls, 6.92936 seconds average
+DecodeStep 62.3636 seconds, 527 calls, 118.337 milliseconds average
+DecodeLayer 58.6225 seconds, 16864 calls, 3.47619 milliseconds average
+ Compute Shaders
+mulMatTiledEx 89.3411 seconds, 2880 calls, 31.0212 milliseconds average
+mulMatTiled 25.4265 seconds, 3465 calls, 7.33809 milliseconds average
+mulMatByRowTiled 22.2805 seconds, 166278 calls, 133.995 microseconds average
+mulMatByRowTiledEx 13.8414 seconds, 33152 calls, 417.514 microseconds average
+softMaxFixed 3.90482 seconds, 17152 calls, 227.66 microseconds average
+addRepeatGelu 2.52778 seconds, 17170 calls, 147.221 microseconds average
+norm 2.10933 seconds, 51704 calls, 40.7962 microseconds average
+convolutionMain2Fixed 2.06899 seconds, 9 calls, 229.888 milliseconds average
+matReshapePanels 1.99444 seconds, 1737 calls, 1.14821 milliseconds average
+addRepeat 1.84752 seconds, 68896 calls, 26.816 microseconds average
+fmaRepeat1 1.28479 seconds, 51704 calls, 24.849 microseconds average
+copyConvert 1.23617 seconds, 34880 calls, 35.4406 microseconds average
+softMax 1.11773 seconds, 17391 calls, 64.2704 microseconds average
+scaleInPlace 848.371 milliseconds, 17152 calls, 49.4619 microseconds average
+copyTranspose 796.781 milliseconds, 34304 calls, 23.227 microseconds average
+addInPlace 733.523 milliseconds, 34304 calls, 21.383 microseconds average
+addRepeatScale 727.214 milliseconds, 33728 calls, 21.5611 microseconds average
+convolutionMain 535.149 milliseconds, 9 calls, 59.461 milliseconds average
+add 525.766 milliseconds, 16873 calls, 31.1602 microseconds average
+diagMaskInf 361.151 milliseconds, 16864 calls, 21.4155 microseconds average
+convolutionPrep1 58.0177 milliseconds, 18 calls, 3.22321 milliseconds average
+convolutionPrep2 30.1294 milliseconds, 18 calls, 1.67386 milliseconds average
+addRows 1.8544 milliseconds, 527 calls, 3.51879 microseconds average
+ Memory Usage
+Model 892.591 KB RAM, 2.8815 GB VRAM
+Context 92.2617 MB RAM, 1.27432 GB VRAM
+Total 93.1334 MB RAM, 4.15582 GB VRAM
diff --git a/SampleClips/columbia-medium-1080ti.txt b/SampleClips/columbia-medium-1080ti.txt
new file mode 100644
index 0000000..4a6402a
--- /dev/null
+++ b/SampleClips/columbia-medium-1080ti.txt
@@ -0,0 +1,43 @@
+ CPU Tasks
+LoadModel 766.119 milliseconds
+RunComplete 19.7043 seconds
+Run 19.5957 seconds
+Callbacks 9.6164 milliseconds, 37 calls, 259.903 microseconds average
+Spectrogram 720.672 milliseconds, 42 calls, 17.1589 milliseconds average
+Sample 64.2796 milliseconds, 511 calls, 125.792 microseconds average
+Encode 7.79098 seconds, 10 calls, 779.098 milliseconds average
+Decode 11.7948 seconds, 10 calls, 1.17948 seconds average
+DecodeStep 11.7302 seconds, 511 calls, 22.9555 milliseconds average
+ GPU Tasks
+LoadModel 611.184 milliseconds
+Run 19.4034 seconds
+Encode 7.70488 seconds, 10 calls, 770.488 milliseconds average
+EncodeLayer 6.5897 seconds, 240 calls, 27.4571 milliseconds average
+Decode 11.6985 seconds, 10 calls, 1.16985 seconds average
+DecodeStep 11.6985 seconds, 511 calls, 22.8933 milliseconds average
+DecodeLayer 10.6646 seconds, 12264 calls, 869.587 microseconds average
+ Compute Shaders
+mulMatTiled 8.16985 seconds, 5290 calls, 1.5444 milliseconds average
+mulMatByRowTiled 6.60967 seconds, 144789 calls, 45.6503 microseconds average
+softMax 797.261 milliseconds, 12775 calls, 62.4079 microseconds average
+addRepeat 571.485 milliseconds, 50256 calls, 11.3715 microseconds average
+fmaRepeat1 416.121 milliseconds, 37793 calls, 11.0105 microseconds average
+normFixed 411.604 milliseconds, 37793 calls, 10.891 microseconds average
+softMaxFixed 383.004 milliseconds, 12504 calls, 30.6305 microseconds average
+copyConvert 373.59 milliseconds, 25488 calls, 14.6575 microseconds average
+copyTranspose 337.831 milliseconds, 25008 calls, 13.5089 microseconds average
+addRepeatScale 227.901 milliseconds, 24528 calls, 9.29146 microseconds average
+addInPlace 226.48 milliseconds, 25008 calls, 9.05631 microseconds average
+addRepeatGelu 215.091 milliseconds, 12524 calls, 17.1743 microseconds average
+scaleInPlace 164.065 milliseconds, 12504 calls, 13.121 microseconds average
+add 139.896 milliseconds, 12274 calls, 11.3978 microseconds average
+convolutionMain2Fixed 129.329 milliseconds, 10 calls, 12.9329 milliseconds average
+diagMaskInf 75.8229 milliseconds, 12264 calls, 6.18256 microseconds average
+convolutionMain 70.7461 milliseconds, 10 calls, 7.07461 milliseconds average
+convolutionPrep1 16.0788 milliseconds, 20 calls, 803.94 microseconds average
+convolutionPrep2 5.4456 milliseconds, 20 calls, 272.28 microseconds average
+addRows 4.1574 milliseconds, 511 calls, 8.13581 microseconds average
+ Memory Usage
+Model 877.966 KB RAM, 1.42785 GB VRAM
+Context 91.0719 MB RAM, 841.634 MB VRAM
+Total 91.9293 MB RAM, 2.24976 GB VRAM
diff --git a/SampleClips/columbia-medium-vega7.txt b/SampleClips/columbia-medium-vega7.txt
new file mode 100644
index 0000000..06b3ad3
--- /dev/null
+++ b/SampleClips/columbia-medium-vega7.txt
@@ -0,0 +1,46 @@
+ CPU Tasks
+LoadModel 1.63669 seconds
+RunComplete 97.4095 seconds
+Run 97.3338 seconds
+Callbacks 18.5655 milliseconds, 37 calls, 501.77 microseconds average
+Spectrogram 1.4999 seconds, 42 calls, 35.7119 milliseconds average
+Sample 135.736 milliseconds, 511 calls, 265.628 microseconds average
+Encode 61.2992 seconds, 10 calls, 6.12992 seconds average
+Decode 36.0131 seconds, 10 calls, 3.60131 seconds average
+DecodeStep 35.8768 seconds, 511 calls, 70.2089 milliseconds average
+ GPU Tasks
+LoadModel 875.606 milliseconds
+Run 96.9497 seconds
+Encode 62.3057 seconds, 10 calls, 6.23057 seconds average
+EncodeLayer 53.632 seconds, 240 calls, 223.467 milliseconds average
+Decode 34.644 seconds, 10 calls, 3.4644 seconds average
+DecodeStep 34.6434 seconds, 511 calls, 67.7954 milliseconds average
+DecodeLayer 31.2704 seconds, 12264 calls, 2.54977 milliseconds average
+ Compute Shaders
+mulMatTiledEx 46.2214 seconds, 2400 calls, 19.2589 milliseconds average
+mulMatTiled 17.3476 seconds, 2890 calls, 6.00262 milliseconds average
+mulMatByRowTiled 13.9489 seconds, 120741 calls, 115.527 microseconds average
+mulMatByRowTiledEx 5.45206 seconds, 24048 calls, 226.716 microseconds average
+softMaxFixed 2.49323 seconds, 12504 calls, 199.395 microseconds average
+convolutionMain2Fixed 1.51065 seconds, 10 calls, 151.065 milliseconds average
+matReshapePanels 1.26582 seconds, 1450 calls, 872.982 microseconds average
+addRepeat 1.21062 seconds, 50256 calls, 24.0891 microseconds average
+softMax 986.762 milliseconds, 12775 calls, 77.2417 microseconds average
+addRepeatGelu 937.447 milliseconds, 12524 calls, 74.852 microseconds average
+copyConvert 787.692 milliseconds, 25488 calls, 30.9044 microseconds average
+fmaRepeat1 769.494 milliseconds, 37793 calls, 20.3608 microseconds average
+normFixed 741.028 milliseconds, 37793 calls, 19.6076 microseconds average
+addRepeatScale 600.233 milliseconds, 24528 calls, 24.4714 microseconds average
+addInPlace 548.734 milliseconds, 25008 calls, 21.9423 microseconds average
+scaleInPlace 489.186 milliseconds, 12504 calls, 39.1224 microseconds average
+convolutionMain 469.994 milliseconds, 10 calls, 46.9994 milliseconds average
+copyTranspose 452.957 milliseconds, 25008 calls, 18.1125 microseconds average
+add 296.072 milliseconds, 12274 calls, 24.1219 microseconds average
+diagMaskInf 194.708 milliseconds, 12264 calls, 15.8764 microseconds average
+convolutionPrep2 43.5675 milliseconds, 20 calls, 2.17837 milliseconds average
+convolutionPrep1 40.4517 milliseconds, 20 calls, 2.02258 milliseconds average
+addRows 1.6846 milliseconds, 511 calls, 3.29667 microseconds average
+ Memory Usage
+Model 877.966 KB RAM, 1.42785 GB VRAM
+Context 91.0721 MB RAM, 893.634 MB VRAM
+Total 91.9295 MB RAM, 2.30054 GB VRAM
diff --git a/SampleClips/columbia.wma b/SampleClips/columbia.wma
Binary files differ
new file mode 100644
index 0000000..7196199
--- /dev/null
+++ b/SampleClips/columbia.wma
diff --git a/SampleClips/jfk-large-1080ti.txt b/SampleClips/jfk-large-1080ti.txt
new file mode 100644
index 0000000..6b963cb
--- /dev/null
+++ b/SampleClips/jfk-large-1080ti.txt
@@ -0,0 +1,43 @@
+ CPU Tasks
+LoadModel 1.31643 seconds
+RunComplete 2.62992 seconds
+Run 2.55991 seconds
+Callbacks 268.8 microseconds, 4 calls, 67.2 microseconds average
+Spectrogram 41.7164 milliseconds, 3 calls, 13.9055 milliseconds average
+Sample 3.7334 milliseconds, 27 calls, 138.274 microseconds average
+Encode 1.59685 seconds
+Decode 962.766 milliseconds
+DecodeStep 959.004 milliseconds, 27 calls, 35.5187 milliseconds average
+ GPU Tasks
+LoadModel 1.16929 seconds
+Run 2.50813 seconds
+Encode 1.54197 seconds
+EncodeLayer 1.31304 seconds, 32 calls, 41.0324 milliseconds average
+Decode 966.163 milliseconds
+DecodeStep 966.159 milliseconds, 27 calls, 35.7837 milliseconds average
+DecodeLayer 902.348 milliseconds, 864 calls, 1.04438 milliseconds average
+ Compute Shaders
+mulMatTiled 1.48565 seconds, 705 calls, 2.10731 milliseconds average
+mulMatByRowTiled 597.295 milliseconds, 10010 calls, 59.6698 microseconds average
+norm 73.0336 milliseconds, 2684 calls, 27.2107 microseconds average
+addRepeat 53.7049 milliseconds, 3616 calls, 14.852 microseconds average
+softMaxFixed 42.5443 milliseconds, 896 calls, 47.4825 microseconds average
+softMax 42.278 milliseconds, 891 calls, 47.4501 microseconds average
+fmaRepeat1 32.9186 milliseconds, 2684 calls, 12.2648 microseconds average
+copyConvert 30.5182 milliseconds, 1856 calls, 16.443 microseconds average
+copyTranspose 23.707 milliseconds, 1792 calls, 13.2294 microseconds average
+convolutionMain2Fixed 20.2435 milliseconds
+addInPlace 20.0419 milliseconds, 1792 calls, 11.1841 microseconds average
+addRepeatGelu 19.4727 milliseconds, 898 calls, 21.6845 microseconds average
+addRepeatScale 17.2226 milliseconds, 1728 calls, 9.96678 microseconds average
+scaleInPlace 15.5358 milliseconds, 896 calls, 17.3391 microseconds average
+add 11.4178 milliseconds, 865 calls, 13.1998 microseconds average
+convolutionMain 8.7583 milliseconds
+diagMaskInf 5.3196 milliseconds, 864 calls, 6.15694 microseconds average
+convolutionPrep1 2.3276 milliseconds, 2 calls, 1.1638 milliseconds average
+convolutionPrep2 572.4 microseconds, 2 calls, 286.2 microseconds average
+addRows 207.9 microseconds, 27 calls, 7.7 microseconds average
+ Memory Usage
+Model 892.591 KB RAM, 2.8815 GB VRAM
+Context 1.98413 MB RAM, 1.07361 GB VRAM
+Total 2.8558 MB RAM, 3.95511 GB VRAM
diff --git a/SampleClips/jfk-large-vega7.txt b/SampleClips/jfk-large-vega7.txt
new file mode 100644
index 0000000..712c626
--- /dev/null
+++ b/SampleClips/jfk-large-vega7.txt
@@ -0,0 +1,46 @@
+ CPU Tasks
+LoadModel 2.48295 seconds
+RunComplete 19.41 seconds
+Run 19.325 seconds
+Callbacks 938.5 microseconds, 4 calls, 234.625 microseconds average
+Spectrogram 101.776 milliseconds, 3 calls, 33.9253 milliseconds average
+Sample 7.4609 milliseconds, 27 calls, 276.33 microseconds average
+Encode 16.6219 seconds
+Decode 2.7018 seconds
+DecodeStep 2.69429 seconds, 27 calls, 99.7886 milliseconds average
+ GPU Tasks
+LoadModel 1.59925 seconds
+Run 19.1489 seconds
+Encode 16.6535 seconds
+EncodeLayer 14.7506 seconds, 32 calls, 460.957 milliseconds average
+Decode 2.4954 seconds
+DecodeStep 2.49537 seconds, 27 calls, 92.4212 milliseconds average
+DecodeLayer 2.33892 seconds, 864 calls, 2.70708 milliseconds average
+ Compute Shaders
+mulMatTiledEx 10.8399 seconds, 320 calls, 33.8745 milliseconds average
+mulMatTiled 2.40696 seconds, 385 calls, 6.25184 milliseconds average
+norm 1.1565 seconds, 2684 calls, 430.885 microseconds average
+addRepeatGelu 1.1138 seconds, 898 calls, 1.24031 milliseconds average
+mulMatByRowTiled 1.08614 seconds, 8346 calls, 130.14 microseconds average
+mulMatByRowTiledEx 692.772 milliseconds, 1664 calls, 416.329 microseconds average
+softMaxFixed 416.688 milliseconds, 896 calls, 465.053 microseconds average
+convolutionMain2Fixed 415.361 milliseconds
+matReshapePanels 179.52 milliseconds, 193 calls, 930.155 microseconds average
+addRepeat 171.572 milliseconds, 3616 calls, 47.4479 microseconds average
+convolutionMain 126.679 milliseconds
+copyConvert 95.046 milliseconds, 1856 calls, 51.2101 microseconds average
+fmaRepeat1 74.6558 milliseconds, 2684 calls, 27.8151 microseconds average
+copyTranspose 73.589 milliseconds, 1792 calls, 41.0653 microseconds average
+addInPlace 67.0819 milliseconds, 1792 calls, 37.4341 microseconds average
+scaleInPlace 66.1625 milliseconds, 896 calls, 73.8421 microseconds average
+softMax 65.629 milliseconds, 891 calls, 73.6577 microseconds average
+addRepeatScale 29.3899 milliseconds, 1728 calls, 17.008 microseconds average
+add 25.2651 milliseconds, 865 calls, 29.2082 microseconds average
+convolutionPrep1 13.325 milliseconds, 2 calls, 6.6625 milliseconds average
+diagMaskInf 11.2047 milliseconds, 864 calls, 12.9684 microseconds average
+convolutionPrep2 5.9717 milliseconds, 2 calls, 2.98585 milliseconds average
+addRows 93.7 microseconds, 27 calls, 3.47037 microseconds average
+ Memory Usage
+Model 892.591 KB RAM, 2.8815 GB VRAM
+Context 1.98427 MB RAM, 1.13175 GB VRAM
+Total 2.85594 MB RAM, 4.01325 GB VRAM
diff --git a/SampleClips/jfk-medium-1080ti.txt b/SampleClips/jfk-medium-1080ti.txt
new file mode 100644
index 0000000..f76376d
--- /dev/null
+++ b/SampleClips/jfk-medium-1080ti.txt
@@ -0,0 +1,43 @@
+ CPU Tasks
+LoadModel 751.527 milliseconds
+RunComplete 1.46731 seconds
+Run 1.39689 seconds
+Callbacks 319.7 microseconds, 4 calls, 79.925 microseconds average
+Spectrogram 40.711 milliseconds, 3 calls, 13.5703 milliseconds average
+Sample 3.6208 milliseconds, 28 calls, 129.314 microseconds average
+Encode 803.503 milliseconds
+Decode 593.049 milliseconds
+DecodeStep 589.41 milliseconds, 28 calls, 21.0504 milliseconds average
+ GPU Tasks
+LoadModel 597.603 milliseconds
+Run 1.34198 seconds
+Encode 754.654 milliseconds
+EncodeLayer 645.794 milliseconds, 24 calls, 26.9081 milliseconds average
+Decode 587.324 milliseconds
+DecodeStep 587.321 milliseconds, 28 calls, 20.9758 milliseconds average
+DecodeLayer 543.185 milliseconds, 672 calls, 808.311 microseconds average
+ Compute Shaders
+mulMatTiled 723.882 milliseconds, 529 calls, 1.3684 milliseconds average
+mulMatByRowTiled 346.71 milliseconds, 7803 calls, 44.4329 microseconds average
+softMax 39.6096 milliseconds, 700 calls, 56.5851 microseconds average
+addRepeat 31.2462 milliseconds, 2808 calls, 11.1276 microseconds average
+softMaxFixed 27.2224 milliseconds, 696 calls, 39.1126 microseconds average
+normFixed 24.8054 milliseconds, 2093 calls, 11.8516 microseconds average
+fmaRepeat1 24.7513 milliseconds, 2093 calls, 11.8258 microseconds average
+copyConvert 19.778 milliseconds, 1440 calls, 13.7347 microseconds average
+copyTranspose 18.6921 milliseconds, 1392 calls, 13.4282 microseconds average
+addRepeatScale 13.4873 milliseconds, 1344 calls, 10.0352 microseconds average
+addInPlace 13.0325 milliseconds, 1392 calls, 9.36243 microseconds average
+convolutionMain2Fixed 12.33 milliseconds
+addRepeatGelu 12.1985 milliseconds, 698 calls, 17.4764 microseconds average
+scaleInPlace 10.3726 milliseconds, 696 calls, 14.9032 microseconds average
+add 8.0935 milliseconds, 673 calls, 12.026 microseconds average
+convolutionMain 6.6079 milliseconds
+diagMaskInf 3.9483 milliseconds, 672 calls, 5.87545 microseconds average
+convolutionPrep1 1.5073 milliseconds, 2 calls, 753.65 microseconds average
+convolutionPrep2 540.7 microseconds, 2 calls, 270.35 microseconds average
+addRows 204.8 microseconds, 28 calls, 7.31429 microseconds average
+ Memory Usage
+Model 877.966 KB RAM, 1.42785 GB VRAM
+Context 1.98347 MB RAM, 723.729 MB VRAM
+Total 2.84085 MB RAM, 2.13462 GB VRAM
diff --git a/SampleClips/jfk-medium-vega7.txt b/SampleClips/jfk-medium-vega7.txt
new file mode 100644
index 0000000..0be45d3
--- /dev/null
+++ b/SampleClips/jfk-medium-vega7.txt
@@ -0,0 +1,46 @@
+ CPU Tasks
+LoadModel 1.44983 seconds
+RunComplete 9.9723 seconds
+Run 9.8953 seconds
+Callbacks 876.5 microseconds, 4 calls, 219.125 microseconds average
+Spectrogram 100.602 milliseconds, 3 calls, 33.5339 milliseconds average
+Sample 8.2281 milliseconds, 28 calls, 293.861 microseconds average
+Encode 8.28685 seconds
+Decode 1.60728 seconds
+DecodeStep 1.599 seconds, 28 calls, 57.1073 milliseconds average
+ GPU Tasks
+LoadModel 751.497 milliseconds
+Run 9.73531 seconds
+Encode 8.28303 seconds
+EncodeLayer 7.19651 seconds, 24 calls, 299.855 milliseconds average
+Decode 1.45228 seconds
+DecodeStep 1.45225 seconds, 28 calls, 51.866 milliseconds average
+DecodeLayer 1.31372 seconds, 672 calls, 1.95494 milliseconds average
+ Compute Shaders
+mulMatTiledEx 5.73474 seconds, 240 calls, 23.8947 milliseconds average
+mulMatTiled 1.59442 seconds, 289 calls, 5.51703 milliseconds average
+mulMatByRowTiled 708.039 milliseconds, 6507 calls, 108.812 microseconds average
+mulMatByRowTiledEx 292.797 milliseconds, 1296 calls, 225.923 microseconds average
+convolutionMain2Fixed 267.762 milliseconds
+softMaxFixed 252.702 milliseconds, 696 calls, 363.078 microseconds average
+addRepeat 122.774 milliseconds, 2808 calls, 43.7229 microseconds average
+matReshapePanels 116.085 milliseconds, 145 calls, 800.583 microseconds average
+convolutionMain 100.111 milliseconds
+addRepeatGelu 78.6895 milliseconds, 698 calls, 112.736 microseconds average
+normFixed 64.6521 milliseconds, 2093 calls, 30.8897 microseconds average
+scaleInPlace 64.0629 milliseconds, 696 calls, 92.0444 microseconds average
+copyConvert 62.7305 milliseconds, 1440 calls, 43.5628 microseconds average
+softMax 50.9006 milliseconds, 700 calls, 72.7151 microseconds average
+fmaRepeat1 49.6347 milliseconds, 2093 calls, 23.7146 microseconds average
+copyTranspose 44.2248 milliseconds, 1392 calls, 31.7707 microseconds average
+addInPlace 44.1766 milliseconds, 1392 calls, 31.7361 microseconds average
+addRepeatScale 31.3737 milliseconds, 1344 calls, 23.3435 microseconds average
+add 19.0564 milliseconds, 673 calls, 28.3156 microseconds average
+convolutionPrep1 8.494 milliseconds, 2 calls, 4.247 milliseconds average
+diagMaskInf 6.9839 milliseconds, 672 calls, 10.3927 microseconds average
+convolutionPrep2 6.0876 milliseconds, 2 calls, 3.0438 milliseconds average
+addRows 72 microseconds, 28 calls, 2.57143 microseconds average
+ Memory Usage
+Model 877.966 KB RAM, 1.42785 GB VRAM
+Context 1.9836 MB RAM, 771.354 MB VRAM
+Total 2.84099 MB RAM, 2.18113 GB VRAM
diff --git a/SampleClips/jfk.wav b/SampleClips/jfk.wav
Binary files differ
new file mode 100644
index 0000000..3184d37
--- /dev/null
+++ b/SampleClips/jfk.wav
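
The profiler files added in this commit report times in mixed units (seconds, milliseconds, microseconds), which makes runs awkward to compare by eye. The following is a minimal Python sketch, not part of the commit, that normalizes one profiler line to seconds. The names `parse_line`, `LINE` and `UNITS` are hypothetical, and the line grammar is an assumption inferred only from the sample outputs above.

```python
import re

# Multipliers for converting the profiler's time units to seconds.
UNITS = {"seconds": 1.0, "milliseconds": 1e-3, "microseconds": 1e-6}

# Assumed grammar, inferred from the sample files:
#   "Encode 13.5814 seconds, 9 calls, 1.50905 seconds average"
#   "LoadModel 6.69478 seconds"            (short form, no call count)
# An optional leading "+" is accepted so diff lines can be fed in directly.
LINE = re.compile(
    r"^\+?\s*(?P<name>\w+)\s+(?P<total>[\d.]+) "
    r"(?P<unit>seconds|milliseconds|microseconds)"
    r"(?:, (?P<calls>\d+) calls, [\d.]+ \w+ average)?\s*$"
)


def parse_line(line):
    """Return (name, total_seconds, calls), or None for non-timing lines."""
    m = LINE.match(line)
    if m is None:
        return None  # section headers and "Memory Usage" rows end up here
    total = float(m["total"]) * UNITS[m["unit"]]
    # Lines without a call count are treated as a single call (assumption).
    calls = int(m["calls"]) if m["calls"] else 1
    return m["name"], total, calls
```

For example, feeding the two `Run` lines of the `columbia-large` files through `parse_line` and dividing the totals (174.601 s on the Vega 7 laptop, 33.637 s on the 1080Ti desktop) gives a ratio of roughly 5.2.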
