author Konstantin <const@const.me> 2023-01-23 14:38:12 +0100
committer Konstantin <const@const.me> 2023-01-23 14:38:12 +0100
commit 27dfc3428a7016e2d05dd67b6d8b88c0b982baa9
tree f969d54ebfb266ecf61285a039295a1da37200a0 /ComputeShaders/softMaxLong.hlsl
parent 01aba39f15a03ed96e034ffc3b6ee9ec12294b0d
Performance improvement, `softMax` shader
Diffstat (limited to 'ComputeShaders/softMaxLong.hlsl')
-rw-r--r-- ComputeShaders/softMaxLong.hlsl 6
1 files changed, 6 insertions, 0 deletions
diff --git a/ComputeShaders/softMaxLong.hlsl b/ComputeShaders/softMaxLong.hlsl
new file mode 100644
index 0000000..1f2c2be
--- /dev/null
+++ b/ComputeShaders/softMaxLong.hlsl
@@ -0,0 +1,6 @@
+// This version is for the "dec.probs" shader tag
+// The input tensor has a size [ 51865, 3 ], a very long tensor with just 3 rows.
+// Although the shader only runs on 3 GPU cores, a large thread count helps substantially: this shader is about 50% faster.
+#define THREADS 1024
+
+#include "softMax.hlsl"
\ No newline at end of file
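
The comment in the diff argues that a wide thread group pays off when each row is very long. The following is a minimal CPU-side sketch of that work split, written in C++ rather than HLSL so it can be run anywhere. It is not the contents of `softMax.hlsl`: the names `softMaxRow` and `elementsPerThread` are hypothetical, and the three-pass structure (max, exponentiate and sum, normalize) is an assumed, common way to write a numerically stable softmax. Only `THREADS` and the row length 51865 come from the diff. With 1024 threads per group, each emulated thread visits at most 51 elements of a row.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Mirrors the THREADS macro from the diff; everything else here is illustrative.
constexpr std::size_t THREADS = 1024;

// Upper bound on the elements one emulated thread visits when a row is
// split across `threads` threads with a stride of `threads`.
constexpr std::size_t elementsPerThread(std::size_t rowLength, std::size_t threads)
{
    return (rowLength + threads - 1) / threads;
}

// In-place softmax over one row, structured the way a single thread group
// might process it: each "thread" t handles elements t, t + THREADS, ...
void softMaxRow(std::vector<float>& row)
{
    const std::size_t n = row.size();
    if (n == 0)
        return;

    // Pass 1: per-thread maximum over a strided slice, then a final reduction.
    std::vector<float> partialMax(THREADS, -std::numeric_limits<float>::infinity());
    for (std::size_t t = 0; t < THREADS; t++)
        for (std::size_t i = t; i < n; i += THREADS)
            partialMax[t] = std::max(partialMax[t], row[i]);
    const float rowMax = *std::max_element(partialMax.begin(), partialMax.end());

    // Pass 2: exponentiate relative to the maximum and accumulate per-thread sums.
    std::vector<float> partialSum(THREADS, 0.0f);
    for (std::size_t t = 0; t < THREADS; t++)
        for (std::size_t i = t; i < n; i += THREADS)
        {
            row[i] = std::exp(row[i] - rowMax);
            partialSum[t] += row[i];
        }
    float total = 0.0f;
    for (float s : partialSum)
        total += s;

    // Pass 3: normalize so the row sums to one.
    const float scale = 1.0f / total;
    for (std::size_t i = 0; i < n; i++)
        row[i] *= scale;
}
```

On a GPU the per-thread partial results would typically be combined through group shared memory with barriers between passes; the plain loops over `partialMax` and `partialSum` stand in for that step here.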