| Commit message (Collapse) | Author | Age |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes: https://github.com/shader-slang/slang/issues/7634
Duplicate of PR https://github.com/shader-slang/slang/pull/8052
Primary Changes:
* Added `storeCoherent` and `loadCoherent` for coherent load/store via
pointers. This is backed by `IRMemoryScopeAttr` which is an `IRAttr`
attached to `IRLoad` and `IRStore`
* Logic in `source\slang\slang-emit-spirv.cpp` for load/store emitting
has been reworked to be less messy and more maintainable
* Add to `hlsl.meta.slang` coop vector and coop matrix coherent
load/store operations
Secondary Changes:
* Added a missing load/store test for coop matrix:
`tests\cooperative-matrix\load-store-pointer.slang`
---------
Co-authored-by: ArielG-NV <aglasroth@nvidia.com>
Co-authored-by: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com>
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
Co-authored-by: Nathan V. Morrical <natemorrical@gmail.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
E.g. in
[generic-extension-2.slang](https://github.com/shader-slang/slang/blob/master/tests/language-feature/extensions/generic-extension-2.slang),
incorrect DebugFunctions are generated for `getFirstOuter`:
```
let %33 : Void = DebugFunction("getFirstOuter", 18 : UInt, 3 : UInt, %26, Func(Int, 0 : Int))
```
This happens because specialization passes are leaving a `%IFoo` in the
function type, instead of replacing with a concrete type:
```
let %34 : Void = DebugFunction("getFirstOuter", 18 : UInt, 3 : UInt, %26, Func(Int, %IFoo))
```
and later, `cleanUpInterfaceTypes()` just replaces all interfaces with
the literal zero. So now we have a parameter type which isn't actually a
type at all, but an IntLit instead.
I'm not sure if the approach I picked is good, though. Some other
options that crossed my mind were:
* Make `fixUpFuncType` also update related DebugFunctions
- But is there a reason why DebugFunctions separately carry a function
type in the first place?
* Make `cleanUpInterfaceTypes` less aggressive or at least replace types
with a type instead of a value
- But this will still make the debug info incorrect :(
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Closes #7606.
When Slang compile for a bindful target, we will run the resource type
legalization pass to hoist resource typed struct fields outside of the
struct type and define them as global parameters and passing them around
via dedicated function parameters.
When we compile for a bindless target, we don't run this pass.
However, Metal is a hybrid bindful and bindless target. We need to run
type legalization for the constant buffer, but skip type legalization
for parameter block.
The previous attempt to support this behavior is to hack the type
legalization pass to return `LegalVal::simple` when it sees a
`ParameterBlock<T>`. However, whenever the code is accessing
`parameterBlock.someNestedField`, the type of the nested field may get a
`LegalType::tuple`, and now we will run into inconsistent scenarios
where we have a `LegalVal::simple` on the operand val, and but the
legalization logic is expecting that val to be a `LegalType::tuple`.
This breaks a lot of assumptions and invariants in the type legalization
pass, resulting unstable/fragile behavior.
To systematically solve this problem, this change generalizes the
existing legalize buffer element type pass to translate
`ParameterBlock<Texture2D>` (and similar cases) to
`ParameterBlock<Texture2D.Handle>`. So that such parameter block will
always be legalized to `LegalType:::simple` during type legalization,
and we will never run into any inconsistent cases. This allowed us to
get rid of the hacky logic in the type legalization pass to try to
workaround the inconsistencies.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
packing/unpacking. (#8526)
Part of the effort to improve the performance of generated SPIRV code.
The existing lower-buffer-element-type pass works by loading the entire
buffer element content from memory, and translate it to logical type
stored in a local variable at the earliest reference of a buffer handle.
This means that is can generate inefficient code that reads more than
necessary.
Consider this example:
```
struct BigStruct { bool values[1024]; }
ConstantBuffer<BigStruct> cb;
void test(BigStruct v)
{
if (v.values[0]) { printf("ok"); }
}
[numthreads(1,1,1)]
void computeMain()
{
test(cb);
}
```
In IR, the `computeMain` function before lower-buffer-element-type pass
is something like following:
```
func test:
%v = param : BigStruct
%barr = fieldExtract(%v, "values")
%element = elementExtract(%barr, 0)
... // uses %element
func computeMain:
%v = load(cb)
call %test %v
```
The existing lower-buffer-element-type pass will rewrite the bool array
in `BigStruct` into `int` array so it is legal in SPIRV. However, it
does so by inserting the translation on the first `load` of the constant
buffer:
```
struct BigStruct_std430 {
int values[1024];
}
var cb : ConstantBuffer<BigStruct_std430>;
func computeMain:
%tmpVar : var<BigStruct>
call %unpackStorage(%tmpVar, cb)
%v : BigStruct = load %tmpVar
call %test %v
```
This means that the entire array will be loaded and translated to int,
before calling `test`, which only uses one element. It turns out that
the downstream compiler isn't always able to optimize out this
inefficient translation/copy.
This PR completely rewrites the way buffer-element-type lowering is
handled to avoid producing this inefficient code. It works in two parts:
first we turn on the `transformParamsToConstRef` pass for SPIRV target
as well, so we will translate the `test` function to take the `v`
parameter as `constref`. The second part is a redesigned
buffer-element-type pass that defers the storage-type to logical-type
translation until a value is actually used by a `load` instruction.
In this example, after `transformParamsToConstRef`, the IR is:
```
func test:
%v = param : ConstRef<BigStruct>
%barr = fieldAddr(%v, "values")
%elementPtr = elementAddr(%barr, 0)
%element = load(%elementPtr)
... // uses %element
func computeMain:
call %test %cb
```
The new `buffer-element-type-lowering` pass will take this IR, and
insert translation at latest possible time across the entire call graph,
and translate the IR into:
```
func test:
%v = param : ConstRef<BigStruct_std430>
%barr = fieldAddr(%v, "values")
%elementPtr : ptr<int> = elementAddr(%barr, 0)
%element_int = load(%elementPtr)
%element = cast(%element_int) : %bool
... // uses %element
func computeMain:
call %test %cb
```
In this new IR, there is no longer a load and conversion of the entire
array.
See new comment in `slang-ir-lower-buffer-element-type.cpp` for more
details of how the pass works.
This PR also address many other issues surfaced by turning on
`transformParamsToConstRef` pass on SPIRV backend.
---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
| |
|
| |
closes https://github.com/shader-slang/slang/issues/3313
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Resolves #7628
Resolves: #8197
Primary Goals:
1. Add `Access` to pointer
2. AddressSpace::GroupShared support for pointers (SPIR-V)
3. Add `__getAddress()` to replace `&`
* `&` is not updated to `require(cpu)` since slangpy uses `&`. This
means we must: (1) merge PR; (2) replace `&` with `__getAddress()`; (3)
add `require(cpu)` to `&`
Changes:
* Added to `Ptr` the `Access` generic argument & logic (for
`Access::Read`).
* Moved the generic argument `AddressSpace` from `Ptr` to the end of the
type.
* Added pointer casting support between any `Ptr` as long as the
`AddressSpace` is the same
* Disallow globallycoherent T* and coherent T*
* Disallow const T*, T const*, and const T*
* Fixed .natvis display of `ConstantValue` `ValOperandNode`
* Support generic resolution of type-casted integers
* Added `VariablePointer` emitting for spirv + other minor logic needed
for groupshared pointers
Breaking Changes:
* Anyone using the `AddressSpace` of `Ptr` will now have to account for
the `Access` argument
* we disallow various syntax paired with `Ptr` and `T*`
---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Closes #8112. ~~The issue asks for a "C layout", but in this PR I use
the term "CPU layout" because this naming was pre-existing in the
codebase as `kCPULayoutRulesImpl_`. The primary purpose of this layout
is to match CPU-side struct definitions with the shader side. I'm open
to better naming suggestions, though.~~
Edit: switched back to using `CDataLayout` & `-fvk-use-c-layout`, as the
CPU target depends on the object layout rules of existing CPU layout
rules, but they're incompatible with actual shaders. So a new
`kCLayoutRulesImpl_` was needed anyway.
---------
Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
debugging (#7820)
* Initial plan
* Add SLANG_DEBUG_IR_BREAK environment variable support
Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com>
* Apply code formatting to SLANG_DEBUG_IR_BREAK implementation
Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com>
* Improve stack trace debugging with -rdynamic flag and backtrace_symbols
Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com>
* Address PR feedback: use PlatformUtil::getEnvironmentVariable, remove -rdynamic flag, and delete fallback branch
Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com>
* Address PR feedback: simplify env var parsing, move backtrace to PlatformUtil, use #if for SLANG_LINUX_FAMILY
Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com>
* Address PR feedback: remove unneeded include, make backtrace() more generic by removing uid parameter
Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com>
* Fix and clone source tracking.
* Add python script to dump traces.
* Update instructions.
* Batch calls to addr2line
* Cleanup claude instructions.
* update claude action.
* Remove duplicated build instructions from claude.yml workflow
Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com>
* fix build error.
* Fix build errors
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com>
Co-authored-by: Yong He <yonghe@outlook.com>
Co-authored-by: Gangzheng Tong <tonggangzheng@gmail.com>
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This change expands the IR struct definition generation logic in slang-ir.h.lua
to code generate the getters for the operands of an IR.
To facilitate the above, the schema for the IR definitions in slang-ir-insts.lua is updated
to allow for explicit specification of the operands of an IR, with Fiddle code generating
the getters for them.
slang-ir.h is updated to remove the hardcoded getters of the IRs since they are now generated by Fiddle.
Fixes part of #7185
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* option to use riff as serialization backend
* option to use riff as serialization backend
* perf
* shuffle code
* perf improvements to deserialization
* formatting
* remove bit_cast
* correct IR verification
* neaten serialized format
* fix peek module info
* formatting
* remove temporary profiling code
* cleanup
* fix wasm build
* more explicit sizes
* deserialize via fossil on 32 bit wasm
* Make serialized modules Int size agnostic
* reorder stable names to allow range based check for 64 bit constants
* format
* review comments
* fix build
* fix
* c++17 compat slang-common.h
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* stable names
* tests, options and ci for stable names
* Add back compat design document
* fix warnings
* formatting
* comment
* neaten
* regenerate command line reference
* consolidate ci scripts
* faster ci
* remove libreadline
* Move new function to end of interface
---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* bottleneck ir module reading and writing
* compute/simple working
* more complex tests working
* neaten
* factor out SourceLoc serialization
* document changes
* Appease clang
* Correct name serialization
* remove unnecessary code
* neaten
* neaten
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Add fkYAML submodule
* Generate slang-ir-inst-defs.h from slang-ir-inst-defs.yaml
* generate ir-inst-defs.h
* neaten things
* neaten inst def parser
* add rapidyaml submodule
* remove fkyaml
* remove fkyaml submodule
* remove use of ir-inst-defs.h
* format and warnings
* fix wasm build
* tidy
* remove rapidyaml
* Extend fiddle to allow custom splices in more places
* Use lua to describe ir insts
* fix
* neaten
* neaten
* neaten
* spelling
* neaten
* comment comment out assert
* merge
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
| |
This commit implements two new types and related Load/Store functions in CoopMat.
tensor_addrressing.TensorLayout
tensor_addressing.TensorView
CoopMat.Load(..., TensorLayout)
CoopMat.Load(..., TensorLayout, TensorView)
CoopMat.Store(..., TensorLayout)
CoopMat.Store(..., TensorLayout, TensorView)
CoopMat.Load(..., TensorLayout, TensorView)
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Close #6859
Goal of this PR
We want to support an array whose size can be specialization constant for shared/global variable e.g.
layout (constant_id = 0) const uint BLOCK_SIZE = 64;
shared float buf_a[(BLOCK_SIZE + 5) * 4];
Overview of the solution:
During IndexExpr check, we will loose the restriction to allow SpecConst passing, but the size parameter will not be a constant value because it cannot be folded into a constant, so we will make it follow the same logic as generic parameter value, and the size will be represented by FuncCallIntVal/PolynomialIntVal/DeclRefIntVal.
During IR lowering, we will detect whether there is spec constant in the IntVal, and wrap the IRInst with a SpecConstRateType, and propagate the type though the lowering logic, such that the IntVal representing the array size will have SpecConstRateType.
During spirv emit stage, if we detect that a IRInst has SpecConstRateType, we will emit it as SpecConstantOp.
We have to implement new logic to emit OpSpecConstantOp, the existing emit logic doesn't support emitting OpSpecConstantOp, especially this op can embed arithmetic operation at global scope, where we can only emit arithmetic instruct at local. But there are only few instructs we need to support.
Overview of the solution:
This PR doesn't support generic, and we will create a separate PR to extend that, tracked in #6840.
|
| |
|
|
|
|
|
| |
* Fix unsigned to signed casts for SPIRV
* Add test
* Fix ICE crash
|
| |
|
|
|
|
|
|
|
| |
* Correct incorrect enum usage on metal
* Update C++ standard to C++20
Closes https://github.com/shader-slang/slang/issues/6945
* use bit_cast
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Add IREnumType to distinguish enums from ints and each other
* Add issue example as test
* format code
* Add expected test output
* Fix peephole optimization hanging
No idea why this PR triggered this, but there seems to have been a clear bug
here anyway, so may just as well fix it now.
* Move enum lowering later
* Add linkage decoration to enum type
* Use filecheck-buffer instead of expected.txt
* Fix comment
* Make enum casts actually use IR enum casts
They were all BuiltinCasts by accident
* Lower enum type before VM
* Deal with rate-qualified types in enum cast
* Allow any value marshalling for enum types
* Handle new enum instructions in a couple more switches
* Fix formatting
---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* initial wip for spirv
* working tiled example
* clean up store and load
* minor fixes
* fix loadAny name
* add initial tests, including broken/unimplemented intrinsics
* fix subscript
* run tests at 16x16, remove not supported arithmetic tests
* minor fixups on implementation
* rename CoopMatMatrixUse
* Update tests to pass validation layers locally
* Add mat-mul-add test and minor fixes
* Add more tests
* Remove dead code
* Add coopMatLoad function and tests, enforce constexpr for matrix layout
* Use getVectorOrCoopMatrixElementType in place of getVectorElementType
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
# Make `IRWitnessTable` Hoistable
## Intention of the PR
This commit makes `IRWitnessTable` Hoistable so that we can avoid duplicated `IRWitnessTable`.
## Problems
This commit tries to address the following issues arise after turning `IRWitnessTable` into Hoistable:
1. A Hoistable instance is immutable.
2. When tries to create a duplicated child, you will get a previously created instance of `IRWitnessTable`, instead of a new one.
3. We don't actually want to hoist `IRWitnessTable`.
4. There can be only one instance of Hoistable and it cannot appear as childs multiple times.
5. Different import/export mangled names were used for the same Witness-table when its type is "enum" interface.
## Implementation
### Solution for "1. A Hoistable instance is immutable."
`IRWitnessTable::setConcreteType()` is removed, because when an `IRInst` is Hoistable, it is treated as immutable. Any `IRInst::setXXX()` methods don't work anymore.
There were two places calling `setConcreteType()` and their logic had to change little bit.
`DeclLoweringVisitor::visitInheritanceDecl()` in `source/slang/slang-lower-to-ir.cpp` was calling `setConcreteType()`. It had a little strange logic around `lowerType()`. The `IRWitnessTable` was added with `context->setGlobalValue()` first and its `concreteType` was changed later. This commit works around in a way that it sets the parent of `IRWitnessTable` temporarily and reset it with the correct `IRWitnessTable`. Without this logic, it went into an infinite recursion.
`AutoDiffPass::fillDifferentialTypeImplementation()` in `source/slang/slang-ir-autodiff.cpp` was calling `setConcreteType()`. It was changing the concreteType of `innerResult.diffWitness`. This commit creates a new `IRWitnessTable` and copies its `IRWitnessTableEntry`.
### Solution for "2. When tries to create a duplicated child, you will get a previously created instance of IRWitnessTable, instead of a new one"
After a call to `IRBuilder::createWitnessTable()`, this commit checks if the returned `IRWitnessTable` is a brand new or not. If it is not a new one, we have to avoid adding the decorations and children.
This commit decides when to add decorations and children based on whether `IRWitnessTable` has any of decorations or children already. It doesn't seem like a proper way to check. But when I tried, it was difficult to find a bottleneck point where the decorations and children are added to `IRWitnessTable` first time. Note that we are not trying to find when `IRWitnessTable` is created for the first time; we need to find if the decorations and children were added once.
It might be fine to have duplicated `IRWitnessTableEntry` in most of the cases, but I noticed that it fails an assertion check when `shouldDeepCloneWitnessTable()` returns false in `cloneWitnessTableImpl()`.
### Solution for "3. We don't actually want to hoist IRWitnessTable."
The reason why this commit makes `IRWitnessTable` is to prevent the duplicated instances of `IRInst`. But we don't really want to "Hoist" them.
When an `IRWitnessTable` gets Hoisted out, it causes unexpected problems and the specialization process fails due to the missing `IRWitnessTable` in the input.
This commit prevent from hoisting `IRWitnessTable` in `_replaceInstUsesWith()`. The way this is implemented feel little hack but we discussed on Slack and decided to go with this. One of the proper approaches could be to add a new flag in `IROpFlags` and have a new one like `kIROpFlag_Deduplicate`, which is different from just `kIROpFlag_Hoistable`.
### Solution for "4. There can be only one instance of Hoistable and it cannot appear as childs multiple times."
When `IRWitnessTable` is Hoistable, there can be only a unique set of instances. And we cannot have an instance as a duplicated childs. It is because `IRInst` has only one set of `IRInst* next` and `IRInst* prev`.
Before this commit, an instance of `IRGeneral` could have duplicated instances of `IRWitnessTable`. As an example, `IInteger` interface inherits two other interfaces, `IArithmetic` and `ILogical`. And they both inherits from `IComparable`.
```
interface IInteger : IArithmetic, ILogical {}
interface IArithmetic : IComparable {}
interface ILogical : IComparable
```
When we specialize it in `specializeGenericImpl()`, an `IRBlock` gets the following list of children:
- IRWitnessTable for IComparable,
- IRWitnessTable for IArithmetic,
- IRWitnessTable for IComparable,
- IRWitnessTable for ILogical,
For the cloning during the specialize, "IRWitnessTable for `IComparable`" must be cloned before the cloning of "IRWitnessTable for `IArithmetic`". Because "IRWitnessTable for `IArithmetic`" refers "IRWitnessTable for `IComparable`" as its `IRWitnessTableEntry`. The order they appear in the `IRBlock` as children decides which instances will be cloned first. And "IRWitnessTable for `IComparable`" must appear before "IRWitnessTable for `IArithmetic`".
Note that "IRWitnessTable for `IComparable`" appears twice, The first one was added for "IRWitnessTable for `IArithmetic`". And the second one is added for "IRWitnessTable for `ILogical`".
With this commit "IRWitnessTable for `IComparable`" can appear as a child only once in `IRBlock`. So it causes an error if it gets the following list:
- IRWitnessTable for IArithmetic,
- IRWitnessTable for IComparable,
- IRWitnessTable for ILogical,
In order to resolve the problem, "IRWitnessTable for `IComparable`" must appear before both "IRWitnessTable for `IArithmetic`" and "IRWitnessTable for `ILogical`" as following:
- IRWitnessTable for IComparable,
- IRWitnessTable for IArithmetic,
- IRWitnessTable for ILogical,
To address the problem, the instances of `IRWitnessTable` is always added to the end of the children list. If it is already added to the list, we don't move. This works out because the AST tree is built based on the dependencies.
### Solution for "5. Different import/export mangled names were used for the same Witness-table when its type is "enum" interface."
This issue was found while testing with Falcor tests where it uses Conformance-type feature of Slang.
We are using different import and export mangled names for a same Witness-table when the witness-table is for "Enum" interface.
The way we simplify the implementation of "Enum" causes a problem when it comes to generate export/import for the witness-table. And the exact repro step is still unclear.
There were two suggested solutions for the problem and this PR adopted the first option for now. Maybe we want to improve it with the second option later.
option 1, when we produce mangled names for those witness-table, we can use a mangled name with the underlying "int" type instead of the name of the enum type. In this way, all witness-tables for enum types whose underlying type is same will get the same mangled name. It will allow us to deduplicate the witness-table during the linking.
option 2, we can preserve type info for enum type when generating IR. We can still erase all other uses of the type info of enum types for now. But when we generate the witness-table, instead of filling the conforming type operand to IntType, we fill it as EnumType(IntType) where EnumType is a new global IROp code to represent all enum types (like InterfaceType/StructType). This way the operands for the two witness-tables will be different.
"option 1" is more quick and dirty and "option 2" is more proper way to address it.
I should go with "option 1" and improve it with "option 2" approach later.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Add -dump-module command to slangc
The new -dump-module command to slangc will load and disassemble a slang module, similar to what would be seen by the -dump-ir command, except that -dump-ir tells slangc to print IR as it performs some compilation command. That is, -dump-ir requires
some larger compilation task.
-dump-module on the otherhand requires no additional goal and will simply load a module and print its IR to stdout independently from other compilation steps.
Its intended purpose is to inspect .slang-module files on disk.
It can also be used on .slang files which will be parsed and lowered
if slang does not find an associated ".slang-module" version of the
module on disk.
The compilation API is extended with a new IModule::disassemble()
method which retrieves the string representation of the dumped IR.
Closes #6599
* format code
* Use FileStream not FILE
* format code
---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
|
| |
|
|
|
|
|
| |
* Update SPIRV-Tools and fix new validation errors.
* Implement pointers for glsl target.
* Reworked packStorage/unpackStorage code gen to operate on pointers rather than values.
|
| |
|
|
|
|
|
| |
Improve performance when compiling small shaders.
Avoid copying witness table entries that are not getting used during linking.
Avoid copying auto-diff related decorations and derivative functions during linking, if the user modules doesn't use autodiff.
Cache operator overload resolution results on global session, so each new Session doesn't need to repetitively run through overload resolution from scratch.
|
| |
|
|
|
|
|
| |
* Support cooperative vector without Vulkan-header update
Adding a Slang support for cooperative vector.
But this commit doesn't have Vulkan-header update.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Initial implementation of `ResourcePtr<T>`.
* Update docs
* Fix build error.
* Add more discussion.
* Update documentation.
* Update TOC.
* Fix.
* Fix.
* Add test case for custom `getResourceFromBindlessHandle`.
* Add namehint to generated descriptor heap param.
* Fix.
* Fix.
* format code
* Rename to `DescriptorHandle`, and add `T.Handle` alias.
* Fix compiler error.
* Fix.
* Fix build.
* Renames.
* Fix documentation.
* Documentation fix.
---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
| |
|
| |
Co-authored-by: Yong He <yonghe@outlook.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Add datalayout for constant buffers.
* Fixes.
* Fix test.
* Fix glsl codegen.
* Update spirv-specific doc.
* Fix test.
* Fix binding in the presense of specialization constants.
* address comments.
* Add a test for constant buffer layout.
|
| |
|
|
|
|
|
|
|
| |
* Move switch statement bodies to their own lines
* format
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Add support for write-only textures.
* Fix capabilities.
* Fix implementation.
* Fix.
* format code
---------
Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
|
| |
|
|
|
|
|
| |
* format
* Minor test fixes
* enable checking cpp format in ci
|
| |
|
|
|
| |
(#5415)
This commit changes the word "stdlib" or "standard library" to "core module" in the source code.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Initial Atomic<T> type implementation.
* Update design doc.
* Fix.
* Add test.
* Fixes and add tests.
* Fix WGSL.
* Fix glsl.
* Fix metal.
* experiemnt with github metal.
* experiment github metal 2
* github metal experiment 3
* experiment with github metal 4.
* experiment with metal 5.
* experiment 7.
* metal experiment 8.
* Fix metal tests.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* initial diff-ref-type interface
* Initial support for `IDifferentiablePtrType`
* Fix unused vars
* More tests + fix switch case fallthrough.
* Update slang-ir-autodiff.cpp
* Update diff-ptr-type-loop.slang
* Add optimization to allow more complex pair types
* Update slang-ir-autodiff-primal-hoist.cpp
* Update diff-ptr-type-loop.slang
* Update slang-ir-autodiff-primal-hoist.cpp
* More fixes to address reviews
* Update slang-check-expr.cpp
* Optimizations + rename `differentiableRefInterfaceType` -> `differentiablePtrInterfaceType`
* Move pair logic to ir-builder, unify the type dictionaries.
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Support mixture of precompiled and non-precompiled modules
This changes the implementation of precompile DXIL modules to
accept combinations of modules with precompiled DXIL, ones without,
and ones with a mixture of precompiled DXIL and Slang IR.
During precompilation, module IR is analyzed to find public functions
which appear to be capable of being compiled as HLSL, and those
functions are given a HLSLExport decoration, ensuring they are emitted
as HLSL and preserved in the precompiled DXIL blob. The IR for those
functions is then tagged with a new decoration AvailableInDXIL, which
marks that their implementation is present in the embedded DXIL blob.
The DXIL blob is attached to the IR as before, inside a EmbeddedDXIL
BlobLit instruction.
The logic that determines whether or not functions should be
precompiled to DXIL is a placeholder at this point, returning true
always. A subsequent change will add selection criteria.
During module linking, the full module IR is available, as well
as the optional EmbeddedDXIL blob. The IR for functions implemented
by the blob are tagged with AvailableInDXIL in the module IR.
After linking the IR for all modules to program level IR, the IR for
the functions marked AvailableInDXIL are deleted from the linked IR,
prior to emitting HLSL and compiling linking the result.
This change also changes the point of time when the module IR is
checked for EmbeddedDXIL blobs. Instead of happening at load time
as before, it happens during immediately before final linking, meaning
that the blob does not need to be independently stored with the module
separate from the IR as was done previously.
Work on #4792
* Clean up debug prints
* Call isSimpleHLSLDataType stub
* Address feedback on precompiled dxil support
Allow for IR filtering both before and after linking.
Only mark AvailableInDXIL those functions which pass
both filtering stages. Functions are corrlated using
mangled function names.
Rather than delete functions entirely when linking with
libraries that include precompiled DXIL, instead convert
the IR function definitions to declarations by gutting
them, removing child blocks.
* Use artifact metadata and name list instead of linkedir hack
* Use String instead of UnownedStringSlice
* Update tests
* Renaming
* Minor edits
* Don't fully remove functions post-link
* Unexport before collecting metadata
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Metal: mesh shading skeleton
* Metal: fixing mesh payload
* Metal: improving mesh shader indices output
* Metal: Implementing conditional mesh output set
* Metal: Trying to not break other backends
* Metal: trying to fix mesh output set
* Metal: Fixing MeshOutputSet usages
* Metal: Fixing vertex and primitive semantics
* Metal: Fixing code style
* Metal: Fixed hlsl indices set
* Fixed HLSL mesh output set disappearing and GLSL mesh output crashing
* Metal: Adjusting task test matching
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Implement `-fvk-use-dx-layout`
Fixes: #4126
Changes:
* Added fvk-use-dx-layout
* Modified `HLSLConstantBufferLayoutRulesImpl` for correctness (ex: Array is always 16 byte aligned)
* Added kFXCShaderResourceLayoutRulesFamilyImpl and kFXCConstantBufferLayoutRulesFamilyImpl to handle fvk-use-dx-layout
* Added `ConstantBufferLayoutRules` to manage constant buffer rules
* Added `alignCompositeElementOfNonAggregate`/`alignCompositeElementOfAggregate` to handle forced alignment of composites for ConstantBuffers
* `StructuredBuffer` rules are mostly equal to `scalar` layout, not much was needed to be changed to support this behavior.
* seperate legacy constant buffer and how Slang does constant-buffer normally
* undo an addition
* remove accidental test
* Address review and fix
Address review and remove GLSL support since GLSL requires a seperate legalization (need to linearlize structs like with `legalizeMetalIR` to assign explicit offsets)
* comments
* remove aggregate and non-aggregate logic
We don't need this distinction for the logic
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
| |
* Tuple swizzling and element access.
* Update proposal status.
* Cleanup.
* Fix merrge error.
* Address review.
|
| |
|
|
|
|
|
|
|
| |
* Variadic Generics Part 2: IR lowering and specialization.
* Update design doc status.
* Update design doc.
* Resolve review comments.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Add embedded precompiled binary IR ops
Add IR operations to embed precompiled DXIL or SPIR-V blobs
into IR. Adds a BlobLit literal that is mostly identical to
StringLit except for its inability to be displayed, e.g.
in dumped IR. In the future, the blob might be dumped as
hexadecimal, but for now it is summarized as "<binary blob>".
* EmbeddedDXIL and SPIR-V options
The options, '-embed-dxil' and '-embed-spirv' in slangc, will
cause a target dxil or spirv to be compiled and stored in the
translation unit IR when written to a slang-module. Subsequent
changes actually implement the options.
* Per-translation unit DXIL precompilation
When -embed-dxil is specified, perform a precompilation to DXIL of
each TU, linked only with stdlib. Embed the resulting DXIL for
the TU in a IR op. Being part of IR, the precompiled DXIL can be
serialized to disk in a slang-module.
Upon loading slang-modules, the new IR op will be searched for and
the precompiled DXIL blob is saved with the loaded Module. During
linking, if all the Modules have precompiled blobs they will be
sent to the downstream compile commands as libraries instead of
source, skipping the downstream compilation, using DXC only for
linking.
Fixes Issue #4580
* Remove placeholder embedded SPIRV support
Code was added only to sketch out how other precompiled bins
will be supported.
* Remove the rest of the SPIRV placeholder support
* Fix warnings, test error on non-windows
* Remove lib_6_6 hack, add dxil_lib capability
* Allocate blob value from irmodule memarena
* Add null check after memarena allocation
* Restore the request->e2erequest code path for generatewholeprogram
* Update capability handling, move EmbedDXIL enum to end to preserve abi
* Remove lib_6_6 hack
* Move ICompileRequest functions to end
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Overhaul IR lowering of pointer types.
* Propagate address space in IRBuilder.
* Fixup.
* Fix.
* Fix.
* Change how Ptr type is printed to text.
* Fix.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Add ResourceArray intrinsic type
* Move aliased parameter generation to GLSL legalization
* Add DynamicResourceEntry type for proxying layout of GenericResourceArray
* Reimplement as DynamicResource
* Add reflection test
* Don't reuse alias cache between different parameters
* Add dynamic cast extensions for buffer types
* Minor format fix
* Fix VarDecl diagnostics after finding non-appliable initializer candidates
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
| |
|
|
|
|
|
|
|
|
|
| |
* Specialize address space during spirv legalization.
* Fix.
* Fix building doc.
* Fix cmake.
* Update assert.
|
| | |
|
| |
|
|
| |
code (#4250)
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Add options to speedup compilation.
* Fix.
* Plumb options to DCE pass.
* Revert debug change.
* Fix regressions.
* More optimizations.
* more cleanup and fixes.
* remove comment.
* Fixes.
* Another fix.
* Fix errors.
* Fix errors.
* Add comments.
|
| |
|
|
|
| |
Adds a member dump() to IRInst that can writes the immediate value or IR
inst value to stdout to help with debugging
|
| |
|
|
|
| |
* Metal: rewrite global variables as explicit context.
* Small tweaks.
|