| author | Konstantin <const@const.me> | 2023-01-23 14:38:12 +0100 |
|---|---|---|
| committer | Konstantin <const@const.me> | 2023-01-23 14:38:12 +0100 |
| commit | 27dfc3428a7016e2d05dd67b6d8b88c0b982baa9 (patch) | |
| tree | f969d54ebfb266ecf61285a039295a1da37200a0 /ComputeShaders/softMaxLong.hlsl | |
| parent | 01aba39f15a03ed96e034ffc3b6ee9ec12294b0d (diff) | |
Performance improvement, `softMax` shader
Diffstat (limited to 'ComputeShaders/softMaxLong.hlsl')
| -rw-r--r-- | ComputeShaders/softMaxLong.hlsl | 6 |
1 files changed, 6 insertions, 0 deletions
diff --git a/ComputeShaders/softMaxLong.hlsl b/ComputeShaders/softMaxLong.hlsl
new file mode 100644
index 0000000..1f2c2be
--- /dev/null
+++ b/ComputeShaders/softMaxLong.hlsl
@@ -0,0 +1,6 @@
+// This version is for the "dec.probs" shader tag
+// The input tensor has a size [ 51865, 3 ], a very long tensor with just 3 rows.
+// Despite the shader only runs on 3 GPU cores, large count of threads helps substantially, this shader is about 50% faster.
+#define THREADS 1024
+
+#include "softMax.hlsl"
\ No newline at end of file
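The comment in the new file says a large thread count helps even though only 3 thread groups are busy. One way to see why is a CPU-side model of a single row, written here in Python as a minimal sketch. It assumes one thread group per row and a strided per-thread loop followed by a tree reduction, which is a common layout for this kind of shader. The contents of `softMax.hlsl` are not shown in this commit, so the function names `strided_reduce`, `softmax_row` and `serial_steps_per_thread` are hypothetical and are not taken from the repository.

```python
import math

THREADS = 1024  # mirrors the #define in the diff; everything else is a hypothetical model


def strided_reduce(values, threads, combine, identity):
    """Emulate one thread group reducing a row.

    Thread `tid` folds elements tid, tid + threads, tid + 2 * threads, ...
    The per-thread partial results are then merged with a tree reduction,
    as a group-shared memory reduction would do. `threads` must be a power of two.
    """
    partials = [identity] * threads
    for tid in range(threads):
        acc = identity
        for i in range(tid, len(values), threads):
            acc = combine(acc, values[i])
        partials[tid] = acc

    stride = threads // 2
    while stride > 0:
        for tid in range(stride):
            partials[tid] = combine(partials[tid], partials[tid + stride])
        stride //= 2
    return partials[0]


def softmax_row(row, threads=THREADS):
    """Numerically stable softmax of one row using two group-wide reductions."""
    row_max = strided_reduce(row, threads, max, -math.inf)
    exps = [math.exp(x - row_max) for x in row]
    total = strided_reduce(exps, threads, lambda a, b: a + b, 0.0)
    return [e / total for e in exps]


def serial_steps_per_thread(row_length, threads):
    """Number of elements each thread visits sequentially in the strided loop."""
    return math.ceil(row_length / threads)
```

In this model, `serial_steps_per_thread(51865, 1024)` is 51, so each thread walks only a few dozen elements of the 51865-element row before the tree reduction takes over. A smaller group would leave each thread with a proportionally longer sequential loop. The model does not try to reproduce the measured speedup of about 50%, which depends on the hardware.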
