| Commit message | Author | Age |
|
On the shader-host-callable target, test `gh-4874.slang` generates IR
that contains global constants referencing global params. These need to
get inlined into functions, as otherwise
`introduceExplicitGlobalContext()` will fail with "no outer func at use
site for global", making the test crash the compiler.
|
This PR implements `Access.Immutable` to allow pointers to immutable
data.
The new type `ImmutablePtr<T>` is defined as an alias of `Ptr<T,
Access.Immutable>`.
By forming an immutable pointer, the programmer is conveying to the
compiler that the data at the pointer address will never change during
the execution of the current program. Therefore loads from immutable
pointers can be deduplicated by the compiler, and will translate to
`__ldg` when generating code for CUDA.
The SPIRV backend is not changed in this PR, since the current SPIRV
spec makes it very difficult to specify loads from an immutable address
without generating tons of wrappers and boilerplate type declarations.
We would like to see the spec evolve a bit around its support of
`NonWritable` physical storage pointers or immutable loads before we
attempt to express such immutability in SPIRV. For now we simply emit
ordinary pointers and loads when generating SPIRV.
---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
When Slang forms new SPIR-V code without the debug info, the List
container needs to reserve memory before items are added to it.
This improves the given repro test time from 56 minutes to 6 minutes.
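The idea of reserving up front can be sketched in C++ as follows. This is only an illustration: `std::vector` stands in for Slang's `List` container, and `copyWithoutDebugInfo` and `isDebugWord` are hypothetical names that do not come from the Slang code base.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical predicate: treats odd words as "debug" words purely for illustration.
static bool isDebugWord(std::uint32_t word) { return (word & 1u) != 0; }

// Copies every non-debug word into a new buffer.
// Reserving up front avoids repeated reallocation while appending.
std::vector<std::uint32_t> copyWithoutDebugInfo(const std::vector<std::uint32_t>& input)
{
    std::vector<std::uint32_t> output;
    output.reserve(input.size()); // upper bound: the output is never larger than the input
    for (std::uint32_t word : input)
    {
        if (!isDebugWord(word))
            output.push_back(word);
    }
    return output;
}
```

The input size is used as the reservation because it is a cheap upper bound on the number of words that survive the filter.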
|
Fix a regression in a Metal test.
In the `lowerBufferElementTypeToStorageType` pass, we want to defer not
only an argument that is `CastStorageToLogical` to the callee, but also
apply the same deferral logic to `CastStorageToLogicalDeref`.
This is because `CastStorageToLogicalDeref` will appear as an argument if
`lowerBufferElementTypeToStorageType` is run before we apply the
`in->borrow` transformation pass, which is the case for Metal parameter
block legalization.
|
happen. (#8653)
This allows us to further clean up unnecessary copies in the target code
we generate.
Part of the effort for #8652.
|
Closes #7606.
When Slang compiles for a bindful target, we run the resource type
legalization pass to hoist resource-typed struct fields out of the
struct type, define them as global parameters, and pass them around
via dedicated function parameters.
When we compile for a bindless target, we don't run this pass.
However, Metal is a hybrid bindful and bindless target. We need to run
type legalization for constant buffers, but skip type legalization
for parameter blocks.
The previous attempt to support this behavior was to hack the type
legalization pass to return `LegalVal::simple` when it sees a
`ParameterBlock<T>`. However, whenever the code accesses
`parameterBlock.someNestedField`, the type of the nested field may get a
`LegalType::tuple`, and we then run into inconsistent scenarios
where we have a `LegalVal::simple` as the operand val, but the
legalization logic expects that val to be a `LegalType::tuple`.
This breaks a lot of assumptions and invariants in the type legalization
pass, resulting in unstable/fragile behavior.
To solve this problem systematically, this change generalizes the
existing legalize buffer element type pass to translate
`ParameterBlock<Texture2D>` (and similar cases) to
`ParameterBlock<Texture2D.Handle>`, so that such a parameter block will
always be legalized to `LegalType::simple` during type legalization,
and we will never run into any inconsistent cases. This allowed us to
get rid of the hacky logic in the type legalization pass that tried to
work around the inconsistencies.
|
(#8547)
This allows us to specialize functions whose argument is a sub-element
of a constant buffer, instead of being applicable only to an entire
buffer element. Closes #8421.
This change also implements a proper heuristic to determine when to
specialize the calls and defer the buffer loads.
This PR addresses a pathological case exposed in
`slangpy\slangpy\benchmarks\test_benchmark_tensor.py`, which used to
take 27ms to finish, and now takes 1.25ms.
For example, given:
```
struct Bottom
{
    float bigArray[1024];
    [mutating]
    void setVal(int index, float value) { bigArray[index] = value; }
}
struct Root
{
    Bottom top[2];
    [mutating]
    void setTopVal(int x, int y, float value)
    {
        top[x].setVal(y, value);
    }
}
RWStructuredBuffer<Root> sb;
[shader("compute")]
[numthreads(1, 1, 1)]
void compute_main(uint3 tid: SV_DispatchThreadID)
{
    sb[0].setTopVal(1, 2, 100.0f);
}
```
We are now able to specialize the call to `setTopVal` into:
```
void compute_main(uint3 tid: SV_DispatchThreadID)
{
    setTopVal_specialized(0, 1, 2, 100.0f);
}
void setTopVal_specialized(int sbIdx, int x, int y, float value)
{
    Bottom_setVal_specialized(sbIdx, x, y, value);
}
void Bottom_setVal_specialized(int sbIdx, int x, int y, float value)
{
    sb[sbIdx].top[x].bigArray[y] = value;
}
```
And get rid of all unnecessary loads. Achieving this requires a
combination of function call specialization and the buffer-load-defer pass.
The buffer-load-defer pass has been completely rewritten to be more
correct and avoid introducing redundant loads.
This PR also adds tests to make sure pointers, bindless handles, and
loads from structured buffers or constant buffers work as expected.
|
packing/unpacking. (#8526)
Part of the effort to improve the performance of generated SPIRV code.
The existing lower-buffer-element-type pass works by loading the entire
buffer element content from memory and translating it to the logical type
stored in a local variable at the earliest reference of a buffer handle.
This means that it can generate inefficient code that reads more than
necessary.
Consider this example:
```
struct BigStruct { bool values[1024]; }
ConstantBuffer<BigStruct> cb;
void test(BigStruct v)
{
    if (v.values[0]) { printf("ok"); }
}
[numthreads(1,1,1)]
void computeMain()
{
    test(cb);
}
```
In IR, the `computeMain` function before the lower-buffer-element-type
pass looks something like the following:
```
func test:
    %v = param : BigStruct
    %barr = fieldExtract(%v, "values")
    %element = elementExtract(%barr, 0)
    ... // uses %element
func computeMain:
    %v = load(cb)
    call %test %v
```
The existing lower-buffer-element-type pass will rewrite the bool array
in `BigStruct` into an `int` array so it is legal in SPIRV. However, it
does so by inserting the translation on the first `load` of the constant
buffer:
```
struct BigStruct_std430 {
    int values[1024];
}
var cb : ConstantBuffer<BigStruct_std430>;
func computeMain:
    %tmpVar : var<BigStruct>
    call %unpackStorage(%tmpVar, cb)
    %v : BigStruct = load %tmpVar
    call %test %v
```
This means that the entire array will be loaded and translated to int,
before calling `test`, which only uses one element. It turns out that
the downstream compiler isn't always able to optimize out this
inefficient translation/copy.
This PR completely rewrites the way buffer-element-type lowering is
handled to avoid producing this inefficient code. It works in two parts:
first we turn on the `transformParamsToConstRef` pass for SPIRV target
as well, so we will translate the `test` function to take the `v`
parameter as `constref`. The second part is a redesigned
buffer-element-type pass that defers the storage-type to logical-type
translation until a value is actually used by a `load` instruction.
In this example, after `transformParamsToConstRef`, the IR is:
```
func test:
    %v = param : ConstRef<BigStruct>
    %barr = fieldAddr(%v, "values")
    %elementPtr = elementAddr(%barr, 0)
    %element = load(%elementPtr)
    ... // uses %element
func computeMain:
    call %test %cb
```
The new `buffer-element-type-lowering` pass will take this IR,
insert the translation at the latest possible time across the entire
call graph, and translate the IR into:
```
func test:
    %v = param : ConstRef<BigStruct_std430>
    %barr = fieldAddr(%v, "values")
    %elementPtr : ptr<int> = elementAddr(%barr, 0)
    %element_int = load(%elementPtr)
    %element = cast(%element_int) : %bool
    ... // uses %element
func computeMain:
    call %test %cb
```
In this new IR, there is no longer a load and conversion of the entire
array.
See the new comment in `slang-ir-lower-buffer-element-type.cpp` for more
details on how the pass works.
This PR also addresses many other issues surfaced by turning on the
`transformParamsToConstRef` pass on the SPIRV backend.
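The cost difference between eager and deferred conversion can be sketched in plain C++. This is an analogy only, not the compiler's implementation: the struct names, `readFirstEager`, `readFirstDeferred`, and the conversion counters are all hypothetical and exist just to make the amount of work visible.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kCount = 1024;

// Storage layout: bools are stored as 32-bit integers.
struct BigStructStorage { std::array<std::int32_t, kCount> values; };
// Logical layout seen by user code.
struct BigStructLogical { std::array<bool, kCount> values; };

// Eager strategy: convert the whole element before any use.
// `conversions` counts how many elements were converted.
bool readFirstEager(const BigStructStorage& storage, std::size_t& conversions)
{
    BigStructLogical unpacked{};
    for (std::size_t i = 0; i < kCount; ++i)
    {
        unpacked.values[i] = storage.values[i] != 0;
        ++conversions;
    }
    return unpacked.values[0];
}

// Deferred strategy: convert only the element that is actually loaded.
bool readFirstDeferred(const BigStructStorage& storage, std::size_t& conversions)
{
    ++conversions;
    return storage.values[0] != 0;
}
```

Both functions return the same value, but the eager version converts all 1024 elements while the deferred version converts one.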
---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
Fix CUDA global variable initialization with constructor calls
Resolves a CUDA compilation failure where global variables with struct
constructor initialization generated illegal runtime initialization of a
`__device__` variable.
**Problem:**
```cuda
// Generated invalid CUDA code:
__device__ static const Stuff_0 gStuff_0 = Stuff_x24init_0(args...);
// Error: "dynamic initialization is not supported for a __device__ variable"
```
Root Cause Discovered:
Debugging showed that the `moveGlobalVarInitializationToEntryPoints`
pass only handled `kIROp_GlobalVar` instructions, but global constants
with constructor calls appeared as `kIROp_Call` instructions at module
scope.
Solution:
1. IR Pipeline Fix: Extended `moveGlobalVarInitializationToEntryPoints`
   to detect and transform module-level constructor calls into proper
   global variables with entry-point initialization
2. Field Access Fix: Enhanced `kIROp_FieldExtract` logic to emit correct
   `->` syntax for pointer types and address-of operations
3. Constructor Emission: Added CUDA-specific handling for constructor
   calls
Architecture:
- Transforms `let %gStuff = call %Constructor(...)` into kernel context
  initialization
- Moves runtime initialization from global scope to entry-point
  execution
- Follows CUDA best practices for global state management
Files:
- source/slang/slang-ir-explicit-global-init.cpp: Extended IR
  transformation pass
- source/slang/slang-emit-c-like.cpp: Enhanced field access and foldable
  value logic
- source/slang/slang-emit-cuda.cpp: Added CUDA-specific field extraction
  handling
Result:
```cuda
// Now generates proper CUDA code:
struct KernelContext_0 { Stuff_0 gStuff_1; };
// Runtime initialization in entry point:
kernelContext_1.gStuff_1 = constructor_call();
```
Fixes: tests/compute/type-legalize-global-with-init.slang
|
The `SpirvInstructionHelper::loadBlob()` method could segfault when
calling `m_headerWords.addRange()` if the SPIR-V blob contained
insufficient data for the required 5-word header.
To reproduce, run
```
./build/Debug/bin/slangc.exe tests/modules/environment.slang -o tests/modules/environment.slang-module -target spirv -separate-debug-info
(0): error 57004: output SPIR-V contains no exported symbols. Please make sure to specify at least one entrypoint.
Segmentation fault
```
The error is expected, but the `Segmentation fault` is not.
This PR adds the check to ensure the SPIR-V blob has at least
`SPV_INDEX_INSTRUCTION_START * sizeof(SpvWord)` bytes (20 bytes minimum)
before attempting to process the header words.
Related to: https://github.com/shader-slang/slang/issues/7547
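A size check of this kind might look like the following C++ sketch. It is not the actual `SpirvInstructionHelper` code: `loadHeaderWords` and `kHeaderWordCount` are hypothetical names, and `std::vector` stands in for Slang's own container.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Number of 32-bit words in a SPIR-V module header
// (magic, version, generator, bound, schema).
constexpr std::size_t kHeaderWordCount = 5;

// Hypothetical helper: copies the header words out of a blob.
// Returns false instead of reading out of bounds when the blob is too small.
bool loadHeaderWords(const void* data, std::size_t sizeInBytes, std::vector<std::uint32_t>& outHeader)
{
    if (data == nullptr || sizeInBytes < kHeaderWordCount * sizeof(std::uint32_t))
        return false;

    outHeader.resize(kHeaderWordCount);
    std::memcpy(outHeader.data(), data, kHeaderWordCount * sizeof(std::uint32_t));
    return true;
}
```

Callers can then report the existing diagnostic and exit cleanly when the function returns false.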
|
…function-parameter.slang #8315
Root Cause:
CUDA compilation crashed with `assert failure:
!seenFinalUnsizedArrayField` because unsized arrays like
`RWStructuredBuffer<uint> globalBuffer[]` were not the final field in
generated parameter structs, violating the layout constraint in
slang-ir-layout.cpp.
Fix:
Extended `collectGlobalUniformParameters` to automatically reorder
struct fields for CUDA targets - regular fields first, unsized arrays
last. Other targets preserve original order.
Impact:
- Enables CUDA support for nonuniform resource indexing as function
parameters
- Zero impact on existing GLSL/HLSL/SPIRV targets
- Automatic handling - no manual parameter reordering required
Files: slang-emit.cpp, slang-ir-collect-global-uniforms.cpp/.h, test
file
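The reordering described under "Fix" can be illustrated with a small C++ sketch. This is an assumption about the general technique (a stable partition), not the code in `collectGlobalUniformParameters`; `FieldInfo` and `moveUnsizedArraysLast` are hypothetical names.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical, simplified description of a struct field.
struct FieldInfo
{
    std::string name;
    bool isUnsizedArray;
};

// Moves unsized-array fields to the end while keeping the relative
// order of all other fields (and of the unsized arrays themselves).
void moveUnsizedArraysLast(std::vector<FieldInfo>& fields)
{
    std::stable_partition(
        fields.begin(),
        fields.end(),
        [](const FieldInfo& field) { return !field.isUnsizedArray; });
}
```

A stable partition is used here so that regular fields keep their declaration order, which keeps the generated layout predictable.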
---------
Co-authored-by: slangbot <ellieh+slangbot@nvidia.com>
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
|
closes https://github.com/shader-slang/slang/issues/3313
|
Fixes #7574
Changes:
* Add an initial (fairly simple) optimization pass which is able to
eliminate redundant copies.
* Our existing optimizer passes already remove redundant load/store very
robustly, so this pass focuses on other cases of copy elimination.
* The primary approach is to turn every `in T` function parameter, where
`T` is trivial to copy, into a `__constref T`. We then (depending on the
scenario) manually insert a variable+load if pass-by-reference is not
possible; otherwise we pass by `constref`.
* Added optimizations to eliminate redundant code which causes
`constref` to fail to compile
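As a loose C++ analogy for why passing by constant reference removes copies (this is not Slang code, and `BigValue`, `firstByValue`, and `firstByConstRef` are hypothetical names), the sketch below counts how often a large value is copied under each calling convention:

```cpp
#include <array>

// A large value type that counts how many times it is copy-constructed.
struct BigValue
{
    std::array<float, 256> data{};
    static int copyCount;

    BigValue() = default;
    BigValue(const BigValue& other) : data(other.data) { ++copyCount; }
    BigValue& operator=(const BigValue&) = default;
};
int BigValue::copyCount = 0;

// Pass by value: the argument is copied into the parameter.
float firstByValue(BigValue v) { return v.data[0]; }

// Pass by const reference: the callee reads the caller's object directly.
float firstByConstRef(const BigValue& v) { return v.data[0]; }
```

Calling `firstByValue` with an existing object performs one copy, while `firstByConstRef` performs none.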
---------
Co-authored-by: Harsh Aggarwal <haaggarwal@nvidia.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: slangbot <ellieh+slangbot@nvidia.com>
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
* Add emit cases for WGSL and GLSL
* Fix compilation warnings
Modify short cutting test to reflect change in emit logic
Lower matrix for metal as well
Add emit matrix logic for metal
Fix compiler warning
Brace initializer for lowered matrices
Fix compiler warnings
* Tests for metal
* Fix mult, any, and determinant
* Fix matrix-matrix multiplication
* Fix mat mul to be element-wise
* Fix compiler warning
* Move makeMatrix to legalization
* Move unary and binary arithmetic operator lowering to legalization
* Remove emit logic and move final comparison operators to legalization
* Handle vector/matrix negation for WGSL
* Restore older SPIR-V emit logic
* Address PR comments
* Revert to zero minus for negation
* format code
---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
Closes https://github.com/shader-slang/slang/issues/3386
|
* Add tests for expected behaviour
* Allow matrix types in logical or/and
* Legalize int/bool matrix types and construction with makeMatrix
* Legalize uint matrices and operations
* Limit testing to only SPIRV
* Better tests for int and bool
* Add test for uint
* Remove GLSL tests
* Remove old test for diagnosing int matrices
* Emit SPIRV directly in tests
* format code
* Address PR comments
* Improve testing
* Address PR comments
* format code
* Add tests for matrix intrinsic operations
* Move matrix lowering to dedicated legalization pass
* Fix compiler warning
* Remove signal again
* Reorder matrix and vector legalization
* Fix formatting
* Add shift and comparison tests
---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
* Initial plan
* Implement out-of-bounds array access checking
Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com>
* Add tests and format code for array bounds checking
Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com>
* Move checkForOutOfBoundAccess to separate file and refactor using InstPassBase
Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com>
* Optimize out-of-bounds checker to use single IR traversal
Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com>
* Fix DiagnosticSink forward declaration from struct to class
Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com>
* Update 0-array-1 test to use runtime indices to avoid bounds checking diagnostic
Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com>
* Use SV_DispatchThreadID for truly runtime array access in 0-array-1 test
Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com>
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com>
|
* emit var and hoist out OpSelect if Composite
* cleanup comment
* address review
check for version in spv context
use phi node instead of using var
move inst's using a list (not in-place modification)
* format code
---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
* Defer immutable buffer loads when emitting spirv.
* Fix.
* Fix.
* Fix.
* Fix tests.
* Fix test.
|
* Add fkYAML submodule
* Generate slang-ir-inst-defs.h from slang-ir-inst-defs.yaml
* generate ir-inst-defs.h
* neaten things
* neaten inst def parser
* add rapidyaml submodule
* remove fkyaml
* remove fkyaml submodule
* remove use of ir-inst-defs.h
* format and warnings
* fix wasm build
* tidy
* remove rapidyaml
* Extend fiddle to allow custom splices in more places
* Use lua to describe ir insts
* fix
* neaten
* neaten
* neaten
* spelling
* neaten
* comment out assert
* merge
|
* Implement -fp-denorm-mode slangc arg
* Split fp-denorm-mode into 3 args for fp16/32/64
* Remove redundant option categories
* Use emitInst for multiple of the same OpExecutionMode
* Fix formatting
* Remove -denorm any
* Re-add option categories
* emitinst for ftz
* Use enums for type text
* Remove extra categories again
* Add tests for denorm mode
* Move denorm mode to post linking
* format code (#8)
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* regenerate command line reference (#9)
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* Clean up tests
* Fix option text
* format code (#10)
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* Add tests for "any" mode
* Return "any" enum if option not set
* Simplify emission logic
* Add support for generic entrypoints
* Move denorm modes to end of CompilerOptionName enum
* format code (#11)
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
* Move new enum members to before CountOf
* Add not checks to tests, fix generic test, add functionality tests
* Rename denorm to fpDenormal
* Clean up functional test
* Rename denorm test dir
* Fix formatting, regenerate cmdline ref
* Fold simple tests into functional tests, add more dxil checks
* Remove no-op DX tests, make tests more consistent
* Disable VK functionality tests that will fail on the CI configs
* Fix formatting
* Add comments to disabled tests explaining why
---------
Co-authored-by: slangbot <ellieh+slangbot@nvidia.com>
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
This PR replaces enable/disable style C function calls with C++ RAII style code.
In a debug build, when an assertion fails between the enable and disable functions, an exception is thrown and the disable function is never called. RAII style code stays correct in the presence of exceptions.
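A minimal C++ sketch of the RAII pattern is shown below. The names `g_featureEnabled`, `ScopedFeatureEnable`, and `runGuardedWorkThatThrows` are hypothetical and only stand in for whatever the real enable/disable calls toggle.

```cpp
#include <stdexcept>

// Hypothetical global switch standing in for whatever the enable/disable calls toggle.
static bool g_featureEnabled = false;

// RAII guard: enables in the constructor, disables in the destructor.
// The destructor runs during stack unwinding, so the switch is restored
// even when an exception escapes the guarded scope.
class ScopedFeatureEnable
{
public:
    ScopedFeatureEnable() { g_featureEnabled = true; }
    ~ScopedFeatureEnable() { g_featureEnabled = false; }

    ScopedFeatureEnable(const ScopedFeatureEnable&) = delete;
    ScopedFeatureEnable& operator=(const ScopedFeatureEnable&) = delete;
};

// Simulates work that fails (e.g. an assertion that throws in a debug build).
// Returns whether the feature was enabled while the guarded scope was active.
bool runGuardedWorkThatThrows()
{
    bool wasEnabledInside = false;
    try
    {
        ScopedFeatureEnable guard;
        wasEnabledInside = g_featureEnabled;
        throw std::runtime_error("simulated assertion failure");
    }
    catch (const std::runtime_error&)
    {
        // By the time we get here the guard has already been destroyed.
    }
    return wasEnabledInside;
}
```

With a manual enable/disable pair, the throw would skip the disable call and leave the switch on.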
|
* Add command line option for separate debug info
Add command line arg -separate-debug-info which, if provided, produces
both a .spv and a .dbg.spv file. The .dbg.spv file contains full debug
info and the .spv file has all debug info stripped out.
Also add a DebugBuildIdentifier instruction to store a unique hash in
both the output files, so they can be more easily matched together.
A matching API is provided to allow using the Slang API to retrieve a
base and debug SPIRV as well as the debug build identifier string.
|
* Add legalization for 0-sized arrays.
* Allow 0-sized arrays in the front-end.
* More tests.
* Add `Conditional<T, hasValue>` type to core module.
* Update toc.
* Fix wording.
* Update test.
|
* Make interface types non c-style.
* Make Optional<T> work with autodiff and existential types.
* Fix.
* patch behind slang 2026.
* Fix warnings.
* cleanup.
* Fix tests.
* Fix.
* Fix com interface lowering.
* Add comment to test.
* regenerate command line reference
* Add test for passing `none` to autodiff function.
* Fix recording of `getDynamicObjectRTTIBytes`.
* Fix nested Optional types.
---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
* Implement throw statement
It already existed in the IR, so only parsing, checking and lowering were
missing.
* Initial catch implementation
Likely very broken.
* Error out when catch() isn't last in scope
* Prevent accessing variables from scope preceding catch
As those may actually not be available at that point.
* Add IError and use it in Result type lowering
* Add diagnostic tests
* Allow caught throws in non-throw functions
* Fix catch propagating between functions & SPIR-V merge issue
* Add test for non-trivial error types
* Fix MSVC build
* Fix invalid value type from Result lowering
* Also lower error handling in templates
* Lower result types only after specialization
* Attempt to disambiguate error enums by witness table
* Revert matching by witness, types should be distinct too
* Don't assert valueField when getting Result's error value
It may not exist if the function returns void, but getting the error
value is still legitimate.
* Update tests for new error numbers & get rid of expected.txt
* Change catch lowering to resemble breaking a loop
... To make SPIR-V happy.
* Fix dead catch blocks and invalid cached dominator tree
* More SPIR-V adjustment
* Lower catch as two nested loops
* Add defer interaction test and revert broken defer changes
* Fix enum type when throwing literals
* Cleanup and bikeshedding
* Document error handling mechanism
* Fix table of contents
* Use boolean tag in Result<T, E>
* Use anyValue storage for Result<T,E>
* Remove IError
* Fix formatting
* Eradicate success values from docs and tests
* Use parseModernParamDecl for catch parameter
* Implement do-catch syntax
* Implement catch-all
* Fix formatting
* Fix marshalling native calls that throw
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
* Add noreturn attribute to IgnoreHit
* Revert "Add noreturn attribute to IgnoreHit"
This reverts commit 3cf2354dada71b9a8713b08f3a2e261de4aabfa4.
* Fix: Preserve inout param modifications with OptiX IgnoreHit()
Issue #6326 identified that in OptiX, when using IgnoreHit() (which
maps to the "noreturn" optixIgnoreIntersection() intrinsic), any
modifications made to 'inout' parameters within the shader would be
lost. This was due to IgnoreHit() preventing the execution of the
copy-back operation from the temporary variable (used to implement
'inout' semantics) to the original parameter.
This commit introduces a new IR pass, 'undoParameterCopy', specifically
for CUDA/OptiX targets to address this. The pass operates as follows:
1. Identifies temporary IR variables created for 'inout' parameters,
which are now decorated with 'TempCallArgVarDecoration'.
2. Maps these temporary variables back to their original parameter
storage (e.g., the OptiX payload pointer).
3. Replaces all uses of the temporary variable directly with the
original parameter pointer.
4. Removes the temporary variable declaration and its initializing store
(which copied from the original parameter to the temporary).
By transforming the IR to operate directly on the original parameter
storage before any potential call to IgnoreHit(), this fix ensures
that all modifications are preserved, correctly resolving issue #6326.
The pass is integrated into the compilation flow for relevant targets.
* Refactor(IR): Optimize GetOptiXRayPayloadPtr for better DCE/CSE
To allow for more effective dead code elimination (DCE) and
common subexpression elimination (CSE) of `getOptiXRayPayloadPtr`
instructions, this commit:
- Marks `kIROp_GetOptiXRayPayloadPtr` as side-effect-free within
`IRInst::mightHaveSideEffects` (in `slang-ir.cpp`).
- Flags `GetOptiXRayPayloadPtr` as `HOISTABLE` in its definition
within `slang-ir-inst-defs.h`.
This addresses scenarios where multiple, potentially redundant,
calls to `getOptiXRayPayloadPtr` might appear in the IR,
allowing optimizers to produce cleaner and potentially more
efficient code for OptiX targets. This change supports efforts
to refine IR handling for ray-tracing shader stages.
* Remove debugging code
* Refactor UndoParameterCopyVisitor for improved performance
- Optimized IR traversal by combining multiple passes into a single scan
- Removed unnecessary dictionary, immediately replace uses when a temp var is found
- Reduced duplicate code paths by checking for both temp vars and redundant stores in one loop
- Better handling of the 'changed' flag to ensure DCE only runs when needed
- Results in fewer instruction traversals and improved efficiency for large functions
* Add Test
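The `inout` problem fixed by the `undoParameterCopy` pass can be modelled in plain C++. This is an analogy, not generated OptiX code: throwing an exception stands in for the "noreturn" intrinsic, and `ignoreHit`, `shadeWithTemporary`, `shadeInPlace`, and `runAndGetPayload` are hypothetical names.

```cpp
// Stand-in for a "noreturn" intrinsic such as IgnoreHit(): control never
// comes back to the caller, which is modelled here by throwing.
struct HitIgnored {};
[[noreturn]] void ignoreHit() { throw HitIgnored{}; }

// Copy-in/copy-out lowering of an `inout` parameter: the update is made
// on a temporary and copied back only if the function returns normally.
void shadeWithTemporary(int& payload)
{
    int temp = payload; // copy in
    temp += 1;
    ignoreHit();        // never returns, so the copy-back below is skipped
    payload = temp;     // copy out (never reached)
}

// After the rewrite: operate directly on the original storage.
void shadeInPlace(int& payload)
{
    payload += 1;
    ignoreHit();
}

// Runs a shader-like function and swallows the simulated non-return.
template <typename Fn>
int runAndGetPayload(Fn fn)
{
    int payload = 0;
    try
    {
        fn(payload);
    }
    catch (const HitIgnored&)
    {
    }
    return payload;
}
```

The temporary-based version loses the increment, while the in-place version keeps it, which mirrors the behavior described for issue #6326.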
|
* Error out on invalid vector sizes
* Remove unnecessary include
* Fix incorrect assert
* Add test
|
* Add IREnumType to distinguish enums from ints and each other
* Add issue example as test
* format code
* Add expected test output
* Fix peephole optimization hanging
No idea why this PR triggered this, but there seems to have been a clear bug
here anyway, so may just as well fix it now.
* Move enum lowering later
* Add linkage decoration to enum type
* Use filecheck-buffer instead of expected.txt
* Fix comment
* Make enum casts actually use IR enum casts
They were all BuiltinCasts by accident
* Lower enum type before VM
* Deal with rate-qualified types in enum cast
* Allow any value marshalling for enum types
* Handle new enum instructions in a couple more switches
* Fix formatting
---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
* Add Slang Byte Code generation and interpreter.
* Fix compile issues.
* format code
* More compile fix.
* Fix clang issue.
* Fix more clang issues.
* Another clang fix.
* Fix clang issues.
* Fix another clang issue.
* Fix wasm build.
* Update building.md
* Fix test-server.
* Fix compile error.
* Fix bug.
---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
* void field rework
* move void cleanup pass earlier
|
(#6696)
* Initial loop analysis pass
* More changes for a single-pass implication propagation
* Update slang-ir-autodiff-loop-analysis.cpp
* Cleanup + new system for loop analysis
* Fixup bugs in loop analysis
* Remove some relation types to simplify the analysis. Add test
* Remove unused
* Address comments
* Fix issue with continue loops
* Update reverse-loop-exit-value-inference-1.slang
* Update reverse-continue-loop.slang
|
* Reapply "Eliminate empty struct on metal target (#6603)" (#6711)
This reverts commit bc9dc6557fc0cc3a4c0c2ff27e636940e361cf5d.
* Remove argument in make_struct call corresponding to void field
This is a follow-up of #6543, where we leave the VoidType field as is in
the make_struct call during the legalization pass.
So during the cleaning_void IR pass, when we remove "VoidType" from a
struct, we also have to clean up the argument corresponding to the
"VoidType" field.
|
This reverts commit b3deec2001ea34e20e9a6af8ddf5cf3866cafac0.
|
* Eliminate empty struct on metal target
Closes #6573.
We previously disabled the type legalization for ParameterBlock on
Metal, but Metal doesn't allow an empty struct in the argument buffer
that is mapped from ParameterBlock, so we need legalizeEmptyTypes
on the Metal target.
* update test
* update function name
|
* initial wip
* more WIP
* preserve old lower behavior
* remove unnecessary includes
* add test
* add no target case in test
* fix broken test
---------
Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
|
- Add the diagnostic messages, and code to emit them
- Add some tests
This helps to address issue #6183.
|
* initial wip
* more wip
* add test
* add unexpected for invalid target
* fixups and improve error message
* fixups and improve error message
* remove incorrect comment
---------
Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
|
* Update SPIRV-Tools and fix new validation errors.
* Implement pointers for glsl target.
* Reworked packStorage/unpackStorage code gen to operate on pointers rather than values.
|
The new option "SkipDownstreamLinking" will defer final downstream IR
linking to the user application. This option only has an effect if
there are modules that were precompiled to the target IR using
precompileForTarget().
Until now, the default behavior for SPIR-V was to use deferred linking, and
the default behavior for DXIL was to use immediate/internal linking in Slang.
This change only affects the SPIR-V behavior, such that both deferred and
non-deferred linking are supported based on the new option.
To support the non-deferred option, Slang will internally call into
SPIRV-Tools-link to reconstitute a complete SPIR-V shader program when
necessary (due to modules having been precompiled to target IR).
Otherwise, if SkipDownstreamLinking is enabled, the shader returned by
e.g. getTargetCode() or getEntryPointCode() may have import linkage to
the SPIR-V embedded in the constituent modules.
Closes #4994
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
* support WaveGetLane* for WGSL and Metal
* update test and glsl support
* address review comments and fix metal test
* add missing pragma guard
* update test
* Revert "update test"
This reverts commit f2b97e91c29de154190710580c343bd0764aedbb.
* update failing glsl metal test and added new test
* make hlsl and glsl outputs similar
* update test
* disable tests for Metal and cleanup
* comment fix
* add expected failures
* correct expected failures list
* remove expected failure
* add tests to expected failure
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
Improve performance when compiling small shaders.
Avoid copying witness table entries that are not getting used during linking.
Avoid copying auto-diff related decorations and derivative functions during linking if the user modules don't use autodiff.
Cache operator overload resolution results on the global session, so each new Session doesn't need to repeat overload resolution from scratch.
|
* Use and() and or() functions for logical-AND and OR
With this commit, Slang will emit function calls to `and()` and `or()`
for logical-AND and logical-OR when the operands are non-scalar and
the target profile is SM6.0 or above. This change is required from
SM6.0 onward.
For WGSL, there are no operator overloads of `&&` and `||` when the
operands are non-scalar. Unlike HLSL, WGSL also doesn't have `and()` or
`or()`. As an alternative, we can use `select()`.
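How a component-wise select can express logical AND and OR is sketched below in C++. This only illustrates the identity; it is not the code Slang emits. `BoolVec`, `selectPerComponent`, `splat`, `logicalAnd`, and `logicalOr` are hypothetical names, and the argument order of `selectPerComponent` is a choice made for this sketch (WGSL's built-in `select` takes the condition last).

```cpp
#include <array>
#include <cstddef>

template <std::size_t N>
using BoolVec = std::array<bool, N>;

// Component-wise select: picks onTrue[i] where cond[i] is true, otherwise onFalse[i].
template <std::size_t N>
BoolVec<N> selectPerComponent(const BoolVec<N>& cond, const BoolVec<N>& onTrue, const BoolVec<N>& onFalse)
{
    BoolVec<N> result{};
    for (std::size_t i = 0; i < N; ++i)
        result[i] = cond[i] ? onTrue[i] : onFalse[i];
    return result;
}

// Builds a vector whose components all hold the same value.
template <std::size_t N>
BoolVec<N> splat(bool value)
{
    BoolVec<N> result{};
    result.fill(value);
    return result;
}

// a && b  ==  select(a, b, false)
template <std::size_t N>
BoolVec<N> logicalAnd(const BoolVec<N>& a, const BoolVec<N>& b)
{
    return selectPerComponent(a, b, splat<N>(false));
}

// a || b  ==  select(a, true, b)
template <std::size_t N>
BoolVec<N> logicalOr(const BoolVec<N>& a, const BoolVec<N>& b)
{
    return selectPerComponent(a, splat<N>(true), b);
}
```

Both operands are always evaluated here, so there is no short-circuiting.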
|
* Support stage_switch.
* Update proposal status.
* Fix gl_InstanceID.
* Fix.
|
* initial work
* more work
* more work on glsl intrinsics
* add subgroup broadcast for glsl
* wip add wgsl extension tracking
* enable tests, enable extensions and added some todos
* format and warning fixes
* fix wgsl extension tracker
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
* Base compiler options for targets on target-specific compiler options
Before this change, the target compiler options were based on the linkage-wide compiler
options, which were later inherited from again (basically a no-op).
With this change, the target-specific compiler options are added first, and then
the linkage-wide compiler options are inherited from.
* Remove debug instructions if target-specific setting is NONE
This helps to address #6092.
* Make sure the linkage debug info level is sufficient for each target
This closes #6092.
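The ordering described in the first item (target-specific options first, then inheritance from the linkage) can be sketched in C++ as follows. This is a simplified model under stated assumptions: options are plain string pairs, and `OptionSet`, `inheritFrom`, and `makeTargetOptions` are hypothetical names rather than Slang's actual option classes.

```cpp
#include <map>
#include <string>

using OptionSet = std::map<std::string, std::string>;

// Hypothetical helper: copies options from `parent` only where `child`
// has no value of its own, so settings already in `child` win.
void inheritFrom(OptionSet& child, const OptionSet& parent)
{
    for (const auto& entry : parent)
        child.insert(entry); // insert() leaves existing keys untouched
}

// Builds the effective options for one target: start from the
// target-specific options, then inherit the rest from the linkage.
OptionSet makeTargetOptions(const OptionSet& targetSpecific, const OptionSet& linkageWide)
{
    OptionSet result = targetSpecific;
    inheritFrom(result, linkageWide);
    return result;
}
```

In this model a target that sets its debug info level to none keeps that setting even when the linkage asks for more.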
|
* Support cooperative vector without Vulkan-header update
Adds Slang support for cooperative vectors.
This commit doesn't include the Vulkan-header update.
|