From c99addbf2e8a0210b97dad2827045dad95765d08 Mon Sep 17 00:00:00 2001
From: Julius Ikkala
Date: Sat, 11 Oct 2025 02:01:51 +0300
Subject: Allow entry points with missing numthreads on CPU targets (#8678)

Several tests have compute entry points without a `[numthreads(x,y,z)]`
decoration. Currently, none of these tests run on the CPU target, as they
crash the compiler.

I took a look at the SPIR-V emitter, which falls back to a workgroup size of
(1,1,1):
https://github.com/shader-slang/slang/blob/1e0908bd7107dfbdac912b693c3ab9bd6e1dc8b3/source/slang/slang-ir-spirv-legalize.cpp#L1635-L1643

To match this behaviour, this PR implements a fallback solution that makes
`emitCalcGroupExtents()` emit (1,1,1).

This PR is both a question and a suggestion; I'm not sure the approach here is
at all reasonable. Personally, I'd just like to explicitly add
`[numthreads(1,1,1)]` to all such tests, but I don't know if it's actually
legal and supported to not have a `numthreads`. So the implementation here is
a bit conservative.

I ran across these when I went through tests for the upcoming LLVM target.
These were the final blockers to get all autodiff and language-features tests
passing (not counting the ones using things like wave intrinsics and barriers
etc.)
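
The fallback described above can be sketched in isolation. This is a minimal, standalone illustration in plain C++, not Slang's IR API: `GroupExtents` and `calcGroupExtents` are hypothetical names, and `std::optional` is an assumed stand-in for looking up the `[numthreads(...)]` decoration on an entry point.

```cpp
#include <array>
#include <optional>

// Hypothetical stand-in for a workgroup size along the x, y and z axes.
using GroupExtents = std::array<int, 3>;

// Returns the declared workgroup size, or (1,1,1) when the entry point
// carries no [numthreads(...)] decoration (modelled here as std::nullopt).
GroupExtents calcGroupExtents(const std::optional<GroupExtents>& numThreads)
{
    constexpr int kAxisCount = 3;
    GroupExtents extents{};

    if (numThreads)
    {
        // Decoration present: copy the declared extent for each axis.
        for (int axis = 0; axis < kAxisCount; axis++)
            extents[axis] = (*numThreads)[axis];
    }
    else
    {
        // Decoration missing: fall back to a single thread per group.
        for (int axis = 0; axis < kAxisCount; axis++)
            extents[axis] = 1;
    }

    return extents;
}
```

As in the patch, the result array is declared before the branch so that both paths fill it and share a single return.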
---
 source/slang/slang-ir-legalize-varying-params.cpp | 19 ++++++++-----------
 1 file changed, 8 insertions(+), 11 deletions(-)

(limited to 'source/slang')

diff --git a/source/slang/slang-ir-legalize-varying-params.cpp b/source/slang/slang-ir-legalize-varying-params.cpp
index 5f6c7a34e..4c96dc0c7 100644
--- a/source/slang/slang-ir-legalize-varying-params.cpp
+++ b/source/slang/slang-ir-legalize-varying-params.cpp
@@ -185,11 +185,10 @@ void assign(IRBuilder& builder, LegalizedVaryingVal const& dest, IRInst* src)
 
 IRInst* emitCalcGroupExtents(IRBuilder& builder, IRFunc* entryPoint, IRVectorType* type)
 {
+    static const int kAxisCount = 3;
+    IRInst* groupExtentAlongAxis[kAxisCount] = {};
     if (auto numThreadsDecor = entryPoint->findDecoration<IRNumThreadsDecoration>())
     {
-        static const int kAxisCount = 3;
-        IRInst* groupExtentAlongAxis[kAxisCount] = {};
-
         for (int axis = 0; axis < kAxisCount; axis++)
         {
             auto litValue = as<IRIntLit>(numThreadsDecor->getOperand(axis));
@@ -199,16 +198,14 @@ IRInst* emitCalcGroupExtents(IRBuilder& builder, IRFunc* entryPoint, IRVectorTyp
             groupExtentAlongAxis[axis] =
                 builder.getIntValue(type->getElementType(), litValue->getValue());
         }
-
-        return builder.emitMakeVector(type, kAxisCount, groupExtentAlongAxis);
+    }
+    else
+    {
+        for (int axis = 0; axis < kAxisCount; axis++)
+            groupExtentAlongAxis[axis] = builder.getIntValue(type->getElementType(), 1);
     }
 
-    // TODO: We may want to implement a backup option here,
-    // in case we ever want to support compute shaders with
-    // dynamic/flexible group size on targets that allow it.
-    //
-    SLANG_UNEXPECTED("Expected '[numthreads(...)]' attribute on compute entry point.");
-    UNREACHABLE_RETURN(nullptr);
+    return builder.emitMakeVector(type, kAxisCount, groupExtentAlongAxis);
 }
 
 // There are some cases of system-value inputs that can be derived
--
cgit v1.2.3