Allow entry points with missing numthreads on CPU targets (#8678)

Several tests have compute entry points without a `[numthreads(x,y,z)]` decoration. Currently, none of these tests can run on the CPU target, because they crash the compiler. The SPIR-V emitter handles the same case by falling back to a workgroup size of (1,1,1): https://github.com/shader-slang/slang/blob/1e0908bd7107dfbdac912b693c3ab9bd6e1dc8b3/source/slang/slang-ir-spirv-legalize.cpp#L1635-L1643

To match this behaviour, this PR makes `emitCalcGroupExtents()` emit (1,1,1) when the entry point has no `numthreads` decoration, instead of hitting `SLANG_UNEXPECTED`.

This PR is both a question and a suggestion: I'm not sure the approach here is reasonable. Personally, I would prefer to add `[numthreads(1,1,1)]` explicitly to all such tests, but I don't know whether omitting `numthreads` is actually legal and supported, so the implementation here is conservative.

I ran across these while going through the tests for the upcoming LLVM target. They were the final blockers to getting all autodiff and language-features tests passing (not counting tests that use features such as wave intrinsics and barriers).
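For illustration only, the fallback can be sketched outside the compiler as below. This is a minimal standalone model, not the Slang IR API: `EntryPointInfo` and `calcGroupExtents` are hypothetical names that stand in for the entry point's optional decoration and for `emitCalcGroupExtents()`.

```cpp
#include <array>
#include <cstdint>

// Hypothetical stand-in for a compute entry point: it may or may not
// carry a [numthreads(x,y,z)] decoration.
struct EntryPointInfo
{
    bool hasNumThreads;                     // true if the decoration is present
    std::array<std::int64_t, 3> numThreads; // only meaningful when hasNumThreads
};

// Returns the declared group extents, or (1,1,1) when the decoration is
// absent, mirroring the fallback described in this PR.
std::array<std::int64_t, 3> calcGroupExtents(const EntryPointInfo& entryPoint)
{
    constexpr int kAxisCount = 3;
    std::array<std::int64_t, kAxisCount> extents{};

    if (entryPoint.hasNumThreads)
    {
        // Copy the per-axis size taken from the decoration.
        for (int axis = 0; axis < kAxisCount; axis++)
            extents[axis] = entryPoint.numThreads[axis];
    }
    else
    {
        // No decoration: fall back to a single-thread group on every axis.
        for (int axis = 0; axis < kAxisCount; axis++)
            extents[axis] = 1;
    }
    return extents;
}
```

The point of the sketch is that both branches fill the same array and share a single return, so a missing decoration yields a valid extent vector rather than an error.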
author: Julius Ikkala <julius.ikkala@gmail.com> 2025-10-11 02:01:51 +0300
committer: GitHub <noreply@github.com> 2025-10-10 23:01:51 +0000
commit: c99addbf2e8a0210b97dad2827045dad95765d08 (patch)
tree: efb5b88febc2285362acf11ffea28146951e0b2d /source/slang
parent: 462ea4e66569efa978e4057ea2d041c69d4a729b (diff)
1 files changed, 8 insertions, 11 deletions
diff --git a/source/slang/slang-ir-legalize-varying-params.cpp b/source/slang/slang-ir-legalize-varying-params.cpp
index 5f6c7a34e..4c96dc0c7 100644
--- a/source/slang/slang-ir-legalize-varying-params.cpp
+++ b/source/slang/slang-ir-legalize-varying-params.cpp
@@ -185,11 +185,10 @@ void assign(IRBuilder& builder, LegalizedVaryingVal const& dest, IRInst* src)
 
 IRInst* emitCalcGroupExtents(IRBuilder& builder, IRFunc* entryPoint, IRVectorType* type)
 {
+    static const int kAxisCount = 3;
+    IRInst* groupExtentAlongAxis[kAxisCount] = {};
     if (auto numThreadsDecor = entryPoint->findDecoration<IRNumThreadsDecoration>())
     {
-        static const int kAxisCount = 3;
-        IRInst* groupExtentAlongAxis[kAxisCount] = {};
-
         for (int axis = 0; axis < kAxisCount; axis++)
         {
             auto litValue = as<IRIntLit>(numThreadsDecor->getOperand(axis));
@@ -199,16 +198,14 @@ IRInst* emitCalcGroupExtents(IRBuilder& builder, IRFunc* entryPoint, IRVectorTyp
             groupExtentAlongAxis[axis] =
                 builder.getIntValue(type->getElementType(), litValue->getValue());
         }
-
-        return builder.emitMakeVector(type, kAxisCount, groupExtentAlongAxis);
+    }
+    else
+    {
+        for (int axis = 0; axis < kAxisCount; axis++)
+            groupExtentAlongAxis[axis] = builder.getIntValue(type->getElementType(), 1);
     }
 
-    // TODO: We may want to implement a backup option here,
-    // in case we ever want to support compute shaders with
-    // dynamic/flexible group size on targets that allow it.
-    //
-    SLANG_UNEXPECTED("Expected '[numthreads(...)]' attribute on compute entry point.");
-    UNREACHABLE_RETURN(nullptr);
+    return builder.emitMakeVector(type, kAxisCount, groupExtentAlongAxis);
 }
 
 // There are some cases of system-value inputs that can be derived