| author | Julius Ikkala <julius.ikkala@gmail.com> | 2025-10-11 02:01:51 +0300 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2025-10-10 23:01:51 +0000 |
| commit | c99addbf2e8a0210b97dad2827045dad95765d08 | |
| tree | efb5b88febc2285362acf11ffea28146951e0b2d /source/slang | |
| parent | 462ea4e66569efa978e4057ea2d041c69d4a729b | |
Allow entry points with missing numthreads on CPU targets (#8678)
Several tests have compute entry points without a `[numthreads(x,y,z)]`
decoration. Currently, none of these tests run on the CPU target, as
they crash the compiler. I took a look at the SPIR-V emitter, which
falls back to a workgroup size of (1,1,1):
https://github.com/shader-slang/slang/blob/1e0908bd7107dfbdac912b693c3ab9bd6e1dc8b3/source/slang/slang-ir-spirv-legalize.cpp#L1635-L1643
To match this behaviour, this PR implements a fallback solution that
makes `emitCalcGroupExtents()` emit (1,1,1).
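
As a rough, standalone illustration of that fallback (not the compiler code itself), the sketch below models the decoration as a `std::optional`. The names `GroupExtents` and `calcGroupExtents` are hypothetical and exist only for this example; the real function works on Slang IR types, as shown in the diff.

```cpp
#include <array>
#include <cassert>
#include <optional>

// Hypothetical stand-in for the three operands of a [numthreads(x,y,z)] decoration.
using GroupExtents = std::array<int, 3>;

// Returns the declared group extents, or (1,1,1) when the entry point
// carries no numthreads decoration (modelled here as an empty optional).
GroupExtents calcGroupExtents(const std::optional<GroupExtents>& numThreads)
{
    constexpr int kAxisCount = 3;
    GroupExtents extents{};
    if (numThreads)
    {
        for (int axis = 0; axis < kAxisCount; axis++)
        {
            // Sketch-only sanity check: a declared extent must be positive.
            assert((*numThreads)[axis] > 0);
            extents[axis] = (*numThreads)[axis];
        }
    }
    else
    {
        // Fallback: behave as if [numthreads(1,1,1)] had been written.
        for (int axis = 0; axis < kAxisCount; axis++)
            extents[axis] = 1;
    }
    return extents;
}
```

Under these assumptions, calling `calcGroupExtents(std::nullopt)` gives a (1,1,1) extent, while passing an explicit value returns it unchanged.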
This PR is both a question and a suggestion; I'm not sure the approach
here is at all reasonable. Personally, I'd just like to explicitly add
`[numthreads(1,1,1)]` to all such tests, but I don't know if it's
actually legal and supported to not have a `numthreads`. So the
implementation here is a bit conservative.
I ran across these when I went through tests for the upcoming LLVM
target. These were the final blockers to getting all autodiff and
language-features tests passing (not counting the ones that use things
like wave intrinsics and barriers).
Diffstat (limited to 'source/slang')
| -rw-r--r-- | source/slang/slang-ir-legalize-varying-params.cpp | 19 |
1 files changed, 8 insertions, 11 deletions
diff --git a/source/slang/slang-ir-legalize-varying-params.cpp b/source/slang/slang-ir-legalize-varying-params.cpp
index 5f6c7a34e..4c96dc0c7 100644
--- a/source/slang/slang-ir-legalize-varying-params.cpp
+++ b/source/slang/slang-ir-legalize-varying-params.cpp
@@ -185,11 +185,10 @@ void assign(IRBuilder& builder, LegalizedVaryingVal const& dest, IRInst* src)
 
 IRInst* emitCalcGroupExtents(IRBuilder& builder, IRFunc* entryPoint, IRVectorType* type)
 {
+    static const int kAxisCount = 3;
+    IRInst* groupExtentAlongAxis[kAxisCount] = {};
     if (auto numThreadsDecor = entryPoint->findDecoration<IRNumThreadsDecoration>())
     {
-        static const int kAxisCount = 3;
-        IRInst* groupExtentAlongAxis[kAxisCount] = {};
-
         for (int axis = 0; axis < kAxisCount; axis++)
         {
             auto litValue = as<IRIntLit>(numThreadsDecor->getOperand(axis));
@@ -199,16 +198,14 @@ IRInst* emitCalcGroupExtents(IRBuilder& builder, IRFunc* entryPoint, IRVectorTyp
             groupExtentAlongAxis[axis] =
                 builder.getIntValue(type->getElementType(), litValue->getValue());
         }
-
-        return builder.emitMakeVector(type, kAxisCount, groupExtentAlongAxis);
+    }
+    else
+    {
+        for (int axis = 0; axis < kAxisCount; axis++)
+            groupExtentAlongAxis[axis] = builder.getIntValue(type->getElementType(), 1);
     }
 
-    // TODO: We may want to implement a backup option here,
-    // in case we ever want to support compute shaders with
-    // dynamic/flexible group size on targets that allow it.
-    //
-    SLANG_UNEXPECTED("Expected '[numthreads(...)]' attribute on compute entry point.");
-    UNREACHABLE_RETURN(nullptr);
+    return builder.emitMakeVector(type, kAxisCount, groupExtentAlongAxis);
 }
 
 // There are some cases of system-value inputs that can be derived
