| Age | Commit message | Author |
|
* Fix macos CI.
* Fix.
* Fix.
* Fix.
* Fix clang warnings.
* Fix more warnings.
|
|
Add the "override" keyword to member functions wherever it is needed.
The compiler warning was visible in the CI build but not in the local
Visual Studio build.
|
|
Resolves #3980
Based on operator precedence, Slang may omit parentheses when they
are not needed. DXC prints warnings for such cases, and some applications
may treat the warnings as errors.
This commit emits parentheses to avoid the DXC warning even when they
are not needed.
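A minimal sketch of the idea, assuming a toy expression tree rather than Slang's actual emitter: `Expr`, `leaf`, `bin`, and `emit` are hypothetical names introduced only for illustration. With `defensive` set, a nested binary expression whose operator differs from its parent is wrapped in parentheses even when precedence alone would not require it.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical toy expression node; this is not Slang's IR.
struct Expr {
    std::string text;                 // operator symbol, or the leaf text
    int prec;                         // binding strength; higher binds tighter
    std::shared_ptr<Expr> lhs, rhs;   // both null for a leaf
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr leaf(std::string name) {
    return std::make_shared<Expr>(Expr{std::move(name), 100, nullptr, nullptr});
}

ExprPtr bin(std::string op, int prec, ExprPtr l, ExprPtr r) {
    return std::make_shared<Expr>(Expr{std::move(op), prec, std::move(l), std::move(r)});
}

// Print an expression. Parentheses are always added where precedence requires
// them; with `defensive` they are also added around any nested binary
// expression that uses a different operator than its parent.
std::string emit(const Expr& e, bool defensive,
                 const Expr* parent = nullptr, bool isRight = false) {
    if (!e.lhs || !e.rhs) return e.text;
    std::string s = emit(*e.lhs, defensive, &e, false) + " " + e.text + " " +
                    emit(*e.rhs, defensive, &e, true);
    bool required = parent && (e.prec < parent->prec ||
                               (isRight && e.prec == parent->prec));
    bool extra = defensive && parent && e.text != parent->text;
    return (required || extra) ? "(" + s + ")" : s;
}
```

For the tree `a || (b && c)`, the minimal mode prints `a || b && c`, while the defensive mode prints `a || (b && c)`.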
|
|
Fix the issue (#3999).
When a function is defined as both extern and export, don't
report an error; the 'export' function can be used to overload the 'extern'
function.
|
|
* Switch to direct-to-spirv backend as default.
* Fix slang-test.
* Fix.
* Fix.
|
|
* Fix a bug in fwd-diff for cross product
* Also add a test for the reverse-mode AD
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
|
|
* ForceInline ByteAddressBuffer operations in stdlib
* fixup
|
|
* bit_cast & reinterpret warning if src->dst type not equally sized.
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
|
|
* Add metal downstream compiler + metallib target.
* Add more comments.
* Add missing override.
|
|
|
|
Resolves issue #3935
Slang had to fold the generic arguments after specialization.
|
|
Co-authored-by: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com>
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
`-ignore-capabilities` flag allows ignoring capability incompatibilities/discontinuity errors/warnings. We still process capabilities (needed for stdlib).
Added to capability tests to ensure everything is working as intended. More will be added in the full stdlib capabilities implementation.
|
|
Fixes #387676
* ForceInline SampleLevel to allow decorations to apply
* explicitly add all the SPIRVAsmOperand Insts to the non-differentiable list, since they might get inadvertently processed when these functions are inlined into the main shader
* Support NonUniformResourceIndex for SPIR-V target
Fixes #3876
* add a new IR instruction for NonUniformResourceIndex
* slang ir emitter for nonuniform resource index
* update the hlsl meta slang
* Add test cases for NonUniformResourceIndex access for buffers and textures, with/without cast, nested access etc.
* add default c-like emitter for nonuniformresourceinfo
* added hlsl emitter
* added glsl emitter
* requisites for spirv enabling
- new decorator for nonuniformresourceindex
- emitter for nonuniformresourceindex signature change
* add hasResourceType checker
* add rwStructBuffType in resourcetype checker
* add a case for nonuniformres in emitDecorations
* DO NOT COMMIT: This change adds special handling for RWStructBuf within the isResourceType function, if it is a pointer to this resource, return true to make it work with nonuniformres test
* spirv emitter for decorations - update the emitLocalInst to perform decorations at the end
* added main spirv emitter code
* slang emit spirv bugfix
* hacky way of supporting Call Inst
* move code to cleanup nonuniform inst into helper function
* remove stale code from test
* add spirv decoration for nonuniform
* update test to remove global variables
* update coherent-2 test
* update comment for special handling
* update the spirv legalize to handle nested nonuniforms
improved logic that handles call ops, rwstructbuf, nested nonuniforms
etc.
* update nonuniform-array-of-tex test
* missed removing nonuniform inst causing duplicate decorations
* add glsl and hlsl variants of nonuniform tests
* repurpose the hasResource function into something specific for nonuniform inst decoration helper
* clean up comments and code around spirv-legalization to emit nonuniform inst by recursively looking into the inst
* use the helper canDecorateNonUniformInst to convert `nonUniformResourceInfo` inst to decoration
* converted compute/unbounded-array-of-array cross compile test into a simple check test
* update contains Resource helper function to be more generic
* clean up the case for opcall handling with nonuniform resource inst
* update ptr to struct buffer check to be more explicit and rename the function to check for ptr to resource type
* update comments and fix the test for coherent
* fix typos
* update logic on spirv legalize to delete dead instructions - for some reason this doesn't automatically happen
* add comments to declarations
* add NonuniformResourceIndex to the non-differential inst list
|
|
* Metal: rewrite global variables as explicit context.
* Small tweaks.
|
|
Fixes #3969
NonUniformResourceInfo instruction is applied as a Decoration on the backing resource. With the following shader, this is applied to the Function Call.
res.rgb *= g_bindless_Texture2D [NonUniformResourceIndex (val.x)].SampleLevel(g_Sampler, v, 0.0).rgb;
as shown below:
{145371} let %1826 : Int = nonUniformResourceIndex(%1789)
{177146} let %1828 : Vec(Float, 4 : Int) = call %SampleLevel(%1826, %sampler, %1827, 0 : Float)
This patch ForceInlines SampleLevel intrinsic function call so that the Decoration is correctly applied on the resource.
|
|
|
|
* Support combined texture sampler when targeting HLSL.
* Fix glsl intrinsics.
* Update source/slang/slang-ir-lower-combined-texture-sampler.cpp
Co-authored-by: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com>
* Update source/slang/slang-ir-lower-combined-texture-sampler.cpp
Co-authored-by: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com>
* Update source/slang/slang-ir-lower-combined-texture-sampler.cpp
Co-authored-by: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com>
* Fix.
* Enhance test.
* Remove unused field.
* Fix indentation
---------
Co-authored-by: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com>
|
|
|
|
|
|
This change forcibly inlines the InterlockedAdd functions when using byteAddress buffer.
The IR generated when using nonUniformResourceInst on RWByteAddressBuffer:
buffer[NonUniformResourceIndex(uint(0))].InterlockedAdd(0, 1);
follows the sequence of a call into an index lookup that is wrapped by a nonuniformResourceIndex:
%ld = nonUniformResourceIndex(0)
Call RWStructBufferInterlockedAdd(%ld, 0, 1)
This prevents NonUniformResource decoration of the buffer because it is wrapped by the function call to
InterlockedAdd, that further expands to:
%gep = getElement(%buffer, 0)
SpirvAsmInst(..., rwStructuredBufferGEP(%gep, 0), ...)
By Force-Inlining the atomic functions, the buffer / resource is made visible to the nonUniformResourceIndex inst,
allowing the decoration.
Identified while debugging tests/spirv/coherent-2.slang
|
|
|
|
* Init expressions for struct members
The following commit handles init expressions of structs.
The general implementation follows C++ init expression rules for classes & inherited classes.
The logic was implemented after type resolution (`SemanticsDeclAttributesVisitor`):
1. Create a default constructor if missing.
2. Check all member variables (`this` and `super`) for an init expression; continue to *3* if one is found.
3. For each constructor, insert each member variable's init expression at the beginning of the constructor. This follows how C++ constructs objects.
Some important notes about implementation:
* We must handle inheritance. `findLevelsOfInheritance` was created to process the inheritance information.
* If a user manually sets the overload ranks of constructor expressions, we have no way to assume new default constructor overload ranks.
* address feedback
- moved all scope-bound variables into if statement initializers
- added indent
- changed logic for overloadRank to be centered around positive numbers rather than negative
* Inheritance fixes universally & for struct field init
1. reimplemented struct field logic
2. implemented inheritance through calling a "super->init()" inside a constructor for each "this".
3. implemented support for multi level inheritance (4+) and accessing members without a crash.
* add a way to ignore Forward declared constructors.
* a test and fix for a Falcor failure
the following case was not handled: creating a default Ctor due to a non L-Value struct field. Having an empty Ctor causes a warning.
* remove texture/sampler from test since it will break glsl
* get inheritance info using existing lookup logic
modified Facet lookups to store relative depth rather than an arbitrary ::Self or ::Direct for inheritance (which was wrong, since depth 2 is not Direct but was considered a Direct inheritance)
* cleanup unused
* cleanup unused functions and whitespace
* fix compile warning
* clean up, reorder, addressed language server fail
changed logic to safeguard bad code --> no longer breaks language server if code is incomplete.
remove the "semi-ordering" logic because it caused a crash (this code does nothing functionally; it just seemed nice to add if it was '0 cost').
Remove rank setting for constructors, in place use an addition to the overload system: "this" expressions have calling priority over "super" expressions.
* undo all inheritance depth checks & code added to the inheritance checking algorithm
Reorder default ctor creation and auto-generation of constructor body.
* Handle same struct types during overload resolution
Changed overload resolution logic to properly handle same struct types; added test to check for multi-param same type function overload.
* remove unused ast object
Used unused object in an incorrect way. This caused the compiler to not flag a warning.
* extension support for default constructors
specialization is not supported with default constructors yet.
* fix bugs
Fix bug in override/overload logic with type comparisons.
used wrong type for ctor list construction
Specialization has not been added yet
* disallow default ctor inside extension
* adjust comment, add new tests
* add explicit types to invoke, use faster default ctor lookup.
* adjust syntax & naming as recommended
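Since the commit above states that struct field init follows the C++ rules, a short C++ sketch of that ordering may help. The types `Base` and `Derived` and the helpers `initLog` and `record` are hypothetical and exist only to record the order of events: base members and the base constructor run first, then the derived member init expressions, then the derived constructor body.

```cpp
#include <string>
#include <vector>

// Records the order in which initialization steps run.
std::vector<std::string>& initLog() {
    static std::vector<std::string> log;
    return log;
}

int record(const char* what, int value) {
    initLog().push_back(what);
    return value;
}

struct Base {
    int a = record("Base::a", 1);          // member init expression
    Base() { initLog().push_back("Base()"); }
};

struct Derived : Base {
    int b = record("Derived::b", 2);       // runs before every Derived ctor body
    Derived() { initLog().push_back("Derived()"); }
    explicit Derived(int v) {
        initLog().push_back("Derived(int)");
        b = v;                             // the body may overwrite the initialized value
    }
};
```

Constructing `Derived d;` records `Base::a`, `Base()`, `Derived::b`, `Derived()` in that order, which mirrors the "insert member init expressions at the beginning of each constructor" rule described above.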
|
|
|
|
Resolves #3951
This adds a few atomic functions for SM6.6.
The spec can be found from here:
https://microsoft.github.io/DirectX-Specs/d3d/HLSL_SM_6_6_Int64_and_Float_Atomics.html
The new functions are:
void InterlockedAdd(inout XXX dest, in int64_t value, out int64_t original_value);
void InterlockedAdd(inout XXX dest, in uint64_t value, out uint64_t original_value);
void InterlockedAnd(inout XXX dest, in uint64_t value, out uint64_t original_value);
void InterlockedOr(inout XXX dest, in uint64_t value, out uint64_t original_value);
void InterlockedXor(inout XXX dest, in uint64_t value, out uint64_t original_value);
void InterlockedMin(inout XXX dest, in int64_t value, out int64_t original_value);
void InterlockedMin(inout XXX dest, in uint64_t value, out uint64_t original_value);
void InterlockedMax(inout XXX dest, in int64_t value, out int64_t original_value);
void InterlockedMax(inout XXX dest, in uint64_t value, out uint64_t original_value);
void InterlockedExchange(inout XXX dest, in float value, out float original_value);
void InterlockedExchange(inout XXX dest, in int64_t value, out int64_t original_value);
void InterlockedExchange(inout XXX dest, in uint64_t value, out uint64_t original_value);
void InterlockedCompareStore(inout XXX dest, in int64_t compare_value, in int64_t value);
void InterlockedCompareStore(inout XXX dest, in uint64_t compare_value, in uint64_t value);
void InterlockedCompareStoreFloatBitwise(inout XXX dest, in float compare_value, in float value);
void InterlockedCompareExchange(inout XXX dest, in int64_t compare_value, in int64_t value, out int64_t original_value);
void InterlockedCompareExchange(inout XXX dest, in uint64_t compare_value, in uint64_t value, out uint64_t original_value);
void InterlockedCompareExchangeFloatBitwise(inout XXX dest, in float compare_value, in float value, out float original_value);
void RWByteAddressBuffer::InterlockedAnd64(in uint dest_offset, in uint64_t value, out uint64_t original_value);
void RWByteAddressBuffer::InterlockedOr64(in uint dest_offset, in uint64_t value, out uint64_t original_value);
void RWByteAddressBuffer::InterlockedXor64(in uint dest_offset, in uint64_t value, out uint64_t original_value);
void RWByteAddressBuffer::InterlockedMin64(in uint dest_offset, in int64_t value, out int64_t original_value);
void RWByteAddressBuffer::InterlockedMin64(in uint dest_offset, in uint64_t value, out uint64_t original_value);
void RWByteAddressBuffer::InterlockedMax64(in uint dest_offset, in int64_t value, out int64_t original_value);
void RWByteAddressBuffer::InterlockedMax64(in uint dest_offset, in uint64_t value, out uint64_t original_value);
void RWByteAddressBuffer::InterlockedExchangeFloat(in uint dest_offset, in float value, out float original_value);
void RWByteAddressBuffer::InterlockedExchange64(in uint dest_offset, in int64_t value, out int64_t original_value);
void RWByteAddressBuffer::InterlockedExchange64(in uint dest_offset, in uint64_t value, out uint64_t original_value);
void RWByteAddressBuffer::InterlockedCompareStore64(in uint dest_offset, in int64_t compare_value, in int64_t value);
void RWByteAddressBuffer::InterlockedCompareStore64(in uint dest_offset, in uint64_t compare_value, in uint64_t value);
void RWByteAddressBuffer::InterlockedCompareStoreFloatBitwise(in uint dest_offset, in float compare_value, in float value);
void RWByteAddressBuffer::InterlockedCompareExchangeFloatBitwise(in uint dest_offset, in float compare_value, in float value, out float original_value);
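As a rough CPU-side analogy only (this is not HLSL and not how Slang implements these intrinsics), the `original_value` out-parameter behaves like the return value of a C++ `std::atomic` read-modify-write operation. The helper names below are hypothetical.

```cpp
#include <atomic>
#include <cstdint>

// Analogy for InterlockedAdd(dest, value, original_value) on a 64-bit value:
// returns what `dest` held before the addition.
std::uint64_t interlockedAdd64(std::atomic<std::uint64_t>& dest, std::uint64_t value) {
    return dest.fetch_add(value);
}

// Analogy for InterlockedCompareExchange(dest, compare_value, value, original_value):
// stores `value` only if `dest` equals `compareValue`, and always returns the
// value `dest` held before the call.
std::uint64_t interlockedCompareExchange64(std::atomic<std::uint64_t>& dest,
                                           std::uint64_t compareValue,
                                           std::uint64_t value) {
    std::uint64_t original = compareValue;
    // On failure, compare_exchange_strong writes the current value into `original`.
    dest.compare_exchange_strong(original, value);
    return original;
}
```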
|
|
|
|
|
|
Fix the issue that 'spGetDependencyFilePath' reports "unknown"
when the source code comes from a string. We only reported a valid file path
when the source code came from a file, so we change that to report a valid
file name even when the source code comes from a string.
|
|
|
|
|
|
|
|
* Fix the variable scope issue (#3838)
In the IR optimization pass, we turn all loops into do-while form.
In the do-while form, the loop body block dominates the
blocks after the loop break block. This assumption is fine for SPIRV and
IR code; however, it is incorrect for all the other language targets (e.g.
c/c++/cuda/glsl/hlsl) because the instructions defined in the loop body
are invisible from outside the loop. Therefore, when translating to
other textual languages, there could be issues with variable scope.
To fix this issue, we first detect the instructions that are defined
inside the loop block, then check if these instructions are used after
the break block. If so, we duplicate these instructions right before
their users so that those instructions are available outside the loop.
* Update the slang vcxproj file for the newly added source files
* Minor fix
- Update the method to get the block of an instruction
- Avoid querying the hash map twice by using the "add" method directly.
* Reduce complexity
When searching for loop region blocks, we don't actually need to traverse the
instructions. Instead, we only have to check each block to see if it's
in a loop region, and hash such blocks for later processing.
So we can remove one level of loop.
In the second pass, we can use that hash to filter out the blocks that
are not in the loop region, and only process the instructions inside the
loop region.
Add description for the new fix-up pass declared in
slang-ir-variable-scope-correction.h.
* Categorize the unstorable and storable instructions
1. When checking the loop regions, there could be multi-level nested
loops, so we should use a list to store the loopHeaders.
2. Categorize the instructions as storable or non-storable, because
we only have to duplicate the non-storable instructions. Note that a pointer
type instruction also belongs to the non-storable class because we cannot
store a pointer in a local variable.
* Fix some test failures
* Fix test failures
* Recursively process the operands
Besides processing the out-of-scope instruction, we also have to process
all the operands of this instruction. Therefore, we have to make the
processing logic recursive until all the involved instructions are
accessible.
* Change how to check storable type
* Add target checking for CPP/CUDA
When deciding whether the type is storable, add target checking for CPP/CUDA
as they can store any type.
Clean up the code to remove the debug log prints.
* Address feedback
Address some feedback.
Change the depth-first traversal to a breadth-first traversal when
processing an instruction and its operands.
* Minor fix for the variable names
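The scope problem described in the first item of this commit can be illustrated in plain C++. This is a hand-written sketch of the shape of the fix, not output from the compiler pass, and `sumUntil` is a hypothetical function: a value defined inside a do-while body is not visible after the loop in a textual target, so its defining computation is duplicated right before the later use.

```cpp
// `t` is defined inside the loop body. In the IR the body dominates the code
// after the loop, but in C++ the name is out of scope once the loop ends.
int sumUntil(int limit, int step) {
    int i = 0;
    do {
        int t = i * step;        // defined inside the loop body
        if (t >= limit) break;
        i++;
    } while (true);
    // return t;                 // would not compile: `t` is out of scope here
    int t = i * step;            // duplicate the definition next to its user
    return t;
}
```

The duplication is valid here because the operands `i` and `step` are themselves still in scope after the loop; the commit's recursive operand processing covers the case where they are not.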
|
|
* Properly compile `gl_WorkgroupSize`.
* Update source/slang/slang-ir-translate-glsl-global-var.cpp
Co-authored-by: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com>
---------
Co-authored-by: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com>
|
|
|
|
|
|
attribute. (#3914)
* Allow COM based API to discover and check entrypoints without [shader] attribute.
* Undo changes.
* More comments.
|
|
|
|
|
|
Fixes issue #3671
* The __init constructors are not expected to return a value like other member
functions, but must construct a new value and return the struct type or none.
* This patch enables this behavior in the IR lowering without complaining about
illegal situations where the user returns an invalid type or none at all.
Translate ordinary struct `return ...;` to `this = ...; return this;`
Translate NonCopyableType struct `return ...;` to `return this;`
* This patch also fixes the issue with type checking when __init()
returns void, which mismatches the base type of the struct/class
Translate ordinary struct `return;` to `return this;`
Translate NonCopyableType struct `return;` to `return;`
* Add end-to-end test and compile only tests to check the above behavior.
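The four translation rules listed above can be written down as a small table. The sketch below encodes them as a string-to-string helper purely for illustration; `lowerInitReturn` is a hypothetical name, and the real change operates on the IR during lowering rather than on source text.

```cpp
#include <string>

// Returns the lowered form of a `return` inside __init.
// `valueExpr` is empty for a bare `return;`.
std::string lowerInitReturn(bool isNonCopyable, const std::string& valueExpr) {
    const bool hasValue = !valueExpr.empty();
    if (isNonCopyable) {
        // NonCopyableType: never assign to `this`.
        return hasValue ? "return this;" : "return;";
    }
    // Ordinary struct: assign the value (if any) to `this`, then return it.
    return hasValue ? "this = " + valueExpr + "; return this;" : "return this;";
}
```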
|
|
PrimitiveID (#3895)
Fixes bug 3872
|
|
* Legalization of non-struct when expects struct.
`__forceVarIntoStructTemporarily()` solves the issue of passing "non-struct types" into a parameter that only accepts "struct types".
The intrinsic solves the issue by checking its parameter:
If the parameter is a "struct type"
* Return a reference to the parameter
else
* A "struct type" temporary variable is made and the "non-struct type" parameter is copied to a member of this struct. This struct is then returned by `__forceVarIntoStructTemporarily()`. Optionally, if the use location of this call is an argument which can have side effects (out, inout, ref, etc.), the temporary struct variable is copied back into the original "non-struct type" parameter.
Testing code has "addComplexity" functions to avoid optimizations through forcing side effects so we can predict the code output.
* Address review comments
- ForceInline ray functions
- fix testing
- adjust how we replace operands in scenarios to avoid unexpected side effects of replacing operands without any explicit checks
* Adjust nv test slightly and remove .glsl file
* Remove implicit LOD sampling & test additions
- Implicit LOD sampling is not allowed in a raygen. Implicit LOD sampling requires depth (from a fragment shader) to sample. Raygen does not have the depth, so this function was replaced.
- Changed other tests for correctness/clarity
* Test if Falcor breaks through use of ForceInline
* Add back force inline
may need to look at how Falcor wrote its slang shaders. This will be done if ForceInline causes issues, since ForceInline should not affect code gen in a meaningful way.
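The struct/non-struct branching of `__forceVarIntoStructTemporarily()` can be approximated in C++ with overloads selected on `std::is_class`. This is only an analogy under stated assumptions: `Boxed`, `callWithStructArg`, `Payload`, and `AddOne` are hypothetical names, and the copy-back is done unconditionally here, whereas the commit describes it as applying only to side-effecting uses (out, inout, ref).

```cpp
#include <type_traits>

// Temporary struct used to carry a non-struct value.
template <typename T>
struct Boxed { T value; };

// Struct argument: pass a reference to the original through unchanged.
template <typename T, typename Fn>
typename std::enable_if<std::is_class<T>::value>::type
callWithStructArg(T& arg, Fn fn) {
    fn(arg);
}

// Non-struct argument: copy into a temporary struct, call, then copy back.
template <typename T, typename Fn>
typename std::enable_if<!std::is_class<T>::value>::type
callWithStructArg(T& arg, Fn fn) {
    Boxed<T> tmp{arg};
    fn(tmp);
    arg = tmp.value;   // write the (possibly modified) value back
}

// Example struct and callee that only knows how to work with a `.value` member.
struct Payload { int value; };

struct AddOne {
    template <typename S>
    void operator()(S& s) const { s.value += 1; }
};
```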
|
|
* Fix assertions due to malformed switch statements
Fixes the issue #2955
* Checks for multiple case statements with same values
* Checks for multiple default cases
* Constant-folds case exprs into an Integer value
* fix the comments, and updated error code
* one-line comment on diagnostic code
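A minimal sketch of the two checks named above, assuming the case expressions have already been constant-folded to integers. `CaseLabel` and `checkSwitchLabels` are hypothetical names and the message strings are made up for illustration; they are not Slang's diagnostics.

```cpp
#include <set>
#include <string>
#include <vector>

// One label of a switch statement after constant folding.
struct CaseLabel {
    bool isDefault;     // true for `default:`
    long long value;    // folded case value; ignored for `default:`
};

// Reports duplicate case values and repeated default labels.
std::vector<std::string> checkSwitchLabels(const std::vector<CaseLabel>& labels) {
    std::vector<std::string> errors;
    std::set<long long> seen;
    bool sawDefault = false;
    for (const CaseLabel& label : labels) {
        if (label.isDefault) {
            if (sawDefault) errors.push_back("multiple default cases");
            sawDefault = true;
        } else if (!seen.insert(label.value).second) {
            // insert() reports false when the value was already present.
            errors.push_back("duplicate case value " + std::to_string(label.value));
        }
    }
    return errors;
}
```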
|
|
|
|
* Update glsl intrinsic for `GroupMemoryBarrierWithGroupSync`.
* Add spirv tests for `GroupMemoryBarrierWithGroupSync`.
|
|
(#3881)
* Refactor memory qualifier decorators to be a bit-flag set.
replace GloballyCoherent, ReadOnly, WriteOnly, Volatile, and Restrict memory modifiers and decorations with a bit flag set to more efficiently manage memory qualifiers.
added `restrict` modifier to test to ensure the code works when dropping a `restrict` memory qualifier
* Refine tests & add SSBO memory qualifier support
add CHECK's to tests to ensure memory qualifiers emit as intended
added tests and changed code to ensure memory qualifiers work on SSBO objects (SPIR-V & GLSL)
* add memory qualifiers & fixes.
Add to StructuredBuffer & ByteAddressBuffer `ReadOnly`/NonWritable qualifier.
* Memory qualifiers must be decorated on a variable inst. Due to this the qualifier is added after `lowerStructuredBufferType`
Fixed an error where ReadOnly->NonReadable & WriteOnly->NonWritable
* Adjusted tests accordingly
Added back the removed `globallycoherent` memory qualifier emitting code in hlsl-emit (was incorrectly removed).
undo hlsl.meta changes
cleanup
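A minimal sketch of the bit-flag idea from the refactor above, assuming one bit per qualifier. The enum values and helper names are hypothetical and do not reflect Slang's actual declarations.

```cpp
#include <cstdint>

// One bit per memory qualifier, so a set of qualifiers fits in one integer.
enum MemoryQualifier : std::uint32_t {
    kCoherent  = 1u << 0,
    kReadOnly  = 1u << 1,
    kWriteOnly = 1u << 2,
    kVolatile  = 1u << 3,
    kRestrict  = 1u << 4,
};

using MemoryQualifierSet = std::uint32_t;

inline MemoryQualifierSet addQualifier(MemoryQualifierSet set, MemoryQualifier q) {
    return set | q;
}

inline MemoryQualifierSet dropQualifier(MemoryQualifierSet set, MemoryQualifier q) {
    return set & ~static_cast<std::uint32_t>(q);
}

inline bool hasQualifier(MemoryQualifierSet set, MemoryQualifier q) {
    return (set & q) != 0;
}
```

Dropping a qualifier such as `restrict` then becomes a single mask operation that leaves the other qualifiers in the set untouched.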
|
|
The following PR implements 8.14-8.19 of the [OpenGL-GLSL specification](https://registry.khronos.org/OpenGL/specs/gl/GLSLangSpec.4.60.pdf).
Fully implements all functions and built-in types; resolves https://github.com/shader-slang/slang/issues/3692 for GLSL & SPIR-V targets.
_Notes:_
Testing Tools:
* Fragment shaders cannot test computational results. Only OpCodes are checked for proper emitting.
Implementation Notes:
* SubpassInput requires an unknown image format.
* SubpassInput is disjoint from TextureType: __SubpassImpl (.slang) & SubpassInputType (Compiler) to reduce code generation required.
* SubpassInput required an additional input layout modifier, input_attachment_index, this was added as a new parameter binding attribute. Since the following qualifiers can overlap with different resources (`layout(input_attachment_index = 0, binding = 0, set = 0)`) input_attachment_index is checked for overlapping resource bindings separately from other qualifiers with `LayoutResourceKind::InputAttachmentIndex`.
* `GLSLInputAttachmentIndexLayoutModifier` was added to enforce function parameters only accepting `in` decorated variables.
* `in` decorated variables needed their emitting modified to allow directly emitting the variable into function calls when used as a parameter. Normally Slang has a "global variable" shadow a "global parameter" through a copy. This does not work here and is solved using `GlobalVariableShadowingGlobalParameterDecoration` to build a relationship of "global variable" to "global parameter"; we then resolve this relationship and replace "global variable" uses later in compilation.
* The `AtomicCounterMemory` memory constraint requires `OpCapability AtomicStorage`, and `AtomicStorage` is invalid for Vulkan targets. glslang outputs `AtomicCounterMemory` as a memory constraint for `barrier`, `memoryBarrier`, and `groupMemoryBarrier`. This compiles as valid SPIR-V for Vulkan since `OpCapability AtomicStorage` is not declared. This behavior of glslang is undefined as per [3.31.Capability of the SPIR-V specification](https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#_capability). We will omit `AtomicCounterMemory` from our barrier calls.
|
|
DepthReplacing. (#3885)
* Fix the erroneous logic of determining whether or not to emit DepthReplacing.
Closes #3884.
* Fix.
* More cleanup.
|
|
* Allow enum values to be used as generic arguments.
* Fix constant folding.
|