summaryrefslogtreecommitdiff
path: root/source/slang/slang-ir-defunctionalization.cpp
AgeCommit message (Collapse)Author
2025-09-30Enhance buffer load specialization pass to specialize past field extracts. ↵Yong He
(#8547) This allows us to specialize functions whose argument is a sub element of a constant buffer, instead of being only applicable to entire buffer element. Closes #8421. This change also implements a proper heuristic to determine when to specialize the calls and defer the buffer loads. This PR addresses a pathological case exposed in `slangpy\slangpy\benchmarks\test_benchmark_tensor.py`, which used to take 27ms to finish, and now takes 1.25ms. For example, given: ``` struct Bottom { float bigArray[1024]; [mutating] void setVal(int index, float value) { bigArray[index] = value; } } struct Root { Bottom top[2]; [mutating] void setTopVal(int x, int y, float value) { top[x].setVal(y, value); } } RWStructuredBuffer<Root> sb; [shader("compute")] [numthreads(1, 1, 1)] void compute_main(uint3 tid: SV_DispatchThreadID) { sb[0].setTopVal(1, 2, 100.0f); } ``` We are now able to specialize the call to `setTopVal` into: ``` void compute_main(uint3 tid: SV_DispatchThreadID) { setTopVal_specialized(0, 1, 2, 100.0f); } void setTopVal_specialized(int sbIdx, int x, int y, float value) { Bottom_setVal_specialized(sbIdx, x, y, value); } void Bottom_setVal_specialized(int sbIdx, int x, int y, float value) { sb[sbIdx].top[x].bigArray[y] = value; } ``` And get rid of all unnecessary loads. Achieving this requires a combination of function call specialization and buffer-load-defer pass. The buffer-load-defer pass has been completely rewritten to be more correct and avoid introducing redundant loads. This PR also adds tests to make sure pointers, bindless handles, and loads from structured buffer or constant buffers works as expected.
2024-10-29formatEllie Hermaszewska
* format * Minor test fixes * enable checking cpp format in ci
2023-05-23Add API for querying total compile time. (#2898)Yong He
* Add API for querying total compile time. * Optimize. * Remove redundant simplifyIR calls. * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-05-11MVP for higher order functions (#2849)Ellie Hermaszewska
* MVP for higher order functions * Add shader subgroup partitioned glsl intrinsics * Implement parsing and checking for tuple types Currently there is no way to do anything useful with them from the source language however * neaten * Correct precedence of function type parsing * neaten * higher order function tests * function types of any arity * Inference for higher order functions * Add second test for unsynchronized params * regenerate vs projects * dx11 -> dx12 for saturated cooperations tests * Disable saturated cooperation tests on vulkan They fail on release builds in CI, not essential for the higher order function work however * remove saturated-cooperation tests * Remove unnecessary assert and clarify control flow in AddDeclRefOverloadCandidates * Add Tuple type name mangling * Use functype keyword to introduce function types * Add more inference tests for hof --------- Co-authored-by: Yong He <yonghe@outlook.com>