| Age | Commit message (Collapse) | Author |
|
* format
* Minor test fixes
* enable checking cpp format in ci
|
|
auto-diff results (#5394)
* Various AD enhancements
* Fix issue with pt-loop test
* Update pt-loop.slang
* More fixes for perf. Final minimal context test now passes.
* Fix issue with loop-elimination pass not running after dce
* Try fix wgpu test by removing select operator
* Disable wgpu
* Delete out.wgsl
* Remove comments
* Update slang-ir-util.cpp
* Fix header relative paths for slang-embed
* Disbale wgpu for a few other tests
* Better way of determining which params to ignore for side-effects
* Update slang-ir-dce.cpp
* Fix issue with circular reference from previous AD pass being left behind for the next AD pass
* Update slang-ir-dce.cpp
|
|
Co-authored-by: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com>
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
* Support mixture of precompiled and non-precompiled modules
This changes the implementation of precompile DXIL modules to
accept combinations of modules with precompiled DXIL, ones without,
and ones with a mixture of precompiled DXIL and Slang IR.
During precompilation, module IR is analyzed to find public functions
which appear to be capable of being compiled as HLSL, and those
functions are given a HLSLExport decoration, ensuring they are emitted
as HLSL and preserved in the precompiled DXIL blob. The IR for those
functions is then tagged with a new decoration AvailableInDXIL, which
marks that their implementation is present in the embedded DXIL blob.
The DXIL blob is attached to the IR as before, inside a EmbeddedDXIL
BlobLit instruction.
The logic that determines whether or not functions should be
precompiled to DXIL is a placeholder at this point, returning true
always. A subsequent change will add selection criteria.
During module linking, the full module IR is available, as well
as the optional EmbeddedDXIL blob. The IR for functions implemented
by the blob are tagged with AvailableInDXIL in the module IR.
After linking the IR for all modules to program level IR, the IR for
the functions marked AvailableInDXIL are deleted from the linked IR,
prior to emitting HLSL and compiling linking the result.
This change also changes the point of time when the module IR is
checked for EmbeddedDXIL blobs. Instead of happening at load time
as before, it happens during immediately before final linking, meaning
that the blob does not need to be independently stored with the module
separate from the IR as was done previously.
Work on #4792
* Clean up debug prints
* Call isSimpleHLSLDataType stub
* Address feedback on precompiled dxil support
Allow for IR filtering both before and after linking.
Only mark AvailableInDXIL those functions which pass
both filtering stages. Functions are corrlated using
mangled function names.
Rather than delete functions entirely when linking with
libraries that include precompiled DXIL, instead convert
the IR function definitions to declarations by gutting
them, removing child blocks.
* Use artifact metadata and name list instead of linkedir hack
* Use String instead of UnownedStringSlice
* Update tests
* Renaming
* Minor edits
* Don't fully remove functions post-link
* Unexport before collecting metadata
|
|
|
|
* Fix SPIRV emit for small-integer texture types.
* Disable -emit-spirv-via-glsl test.
|
|
* Fix the issue in emitFloatCast
In emitFloatCast function, we only considered the input type
is float scalar or float vector, so if the input type is a float
matrix type, it will crash.
We should also handle the float matrix type.
Also, we add some diagnose info to point out the source location
where there is error happened, so in the future it's easier to tell
us what happens.
* Add a unit test
* Disable the test for metal
Metal doesn't support 'double'.
"
metal 32023.35: /tmp/unknown-YgHAsJ.metal(15): error : 'double' is not supported in Metal
matrix<double,int(3),int(4)> b_0 = matrix<double,int(3),int(4)> (a_0);
"
|
|
* Add additional `ImageSubscript` features:
1. Added ImageSubscript support for Metal & a test case
* Merge GLSL/SPIRV/Metal `ImageSubscript` legalization pass
2. Added multisample support to glsl/spirv/metal for when using ImageSubscript
* Added in this PR since the overhaul of the code merges together GLSL/SPIRV/Metal implementation
3. Fixed minor metal texture `Load`/`Read` bugs
* [HLSL methods of access do not support subscript accessor for texture cube array](https://learn.microsoft.com/en-us/windows/win32/direct3dhlsl/texturecubearray)
* removed swizzling of uint/int/float
* other odd bugs which were causing compile errors
note: Compute tests do not work due to what seems to be the GFX backend (causes crash without error report). The tests are disabled.
* disable LOD with texture 1d
seems that LOD for 1d textures need to be a compile time constant as per an error metal throws
* syntax error in hlsl.meta
* static_assert alone with intrinsic_asm error
provides cleaner errors
Note: `static_assert` seems to be unstable and not be fully respected (still require `intrinsic_asm` to avoid a stdlib compile error)
* change comment to `// lod is not supported for 1D texture
* add `static_assert` in related code gen paths
* address review
* address review
* add asserts as per review comment, NOTE: unclear if these should be release 'asserts' as well
|
|
|
|
* Add options to speedup compilation.
* Fix.
* Plumb options to DCE pass.
* Revert debug change.
* Fix regressions.
* More optimizations.
* more cleanup and fixes.
* remove comment.
* Fixes.
* Another fix.
* Fix errors.
* Fix errors.
* Add comments.
|
|
* Add host shared library target.
* Attempt fix.
* Fix warnings.
* try fix.
* Fix test.
* Fix.
|
|
* Fix compile failures when using debug symbol.
* Various fixes.
* Fix intrinsic.
* Fix test.
|
|
* SPIRV: Fix performance issue when handling large arrays.
* Add test for packing.
* Fix clang.
|
|
* Support derivative functions in compute & capabilities adjustments
fixes #4000
PR implements derivative functions in compute shaders properly so we have the functionality for SPIR-V & GLSL. Tests reflect fragment and compute paths.
PR also adjusts capabilities to correct wrong SPRI-V target capabilities for when using textures.
Remarks:
1. __requireComputeDerivative(); is a intrinsic_op and not modifier since inlining will destroy the modifier.
2. Derivative mode is tied to an entry point decoration `[DerivativeGroupQuad]`/`[DerivativeGroupLinear]` or GLSL syntax ``derivative_group_linearNV`. Default is to set the mode to `[DerivativeGroupQuad]`
* remove -emit-spirv-directly
* fixes
1. fix minor issue fwidth change where I returned the wrong type
2. fix issue where glslang{glsl->spirv} is wrong, so we don't run that test and just run the glsl test & direct spir-v test for intrinsic-texture.slang
* adjust as per review and refine code
1. add test to ensure multi-diverging-in-logic entry points work -- 2 functions which may cause computeDerivatives + 1 that uses, 1 that does not.
2. naming
3. use entry point ref graph for c-like-targets
4. reordered some code to util's and removed `static linline` since that was just for ease of coding on my end (should not have been pushed).
* Grammer
* split up source file + issolate GLSL emit path change.
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
* Switch to direct-to-spirv backend as default.
* Fix slang-test.
* Fix.
* Fix.
|
|
|
|
The following PR implements 8.14-8.19 of the [OpenGL-GLSL specification](https://registry.khronos.org/OpenGL/specs/gl/GLSLangSpec.4.60.pdf).
Fully implements all functions and built-in type's, resolves https://github.com/shader-slang/slang/issues/3692 for GLSL & SPRI-V targets.
_Notes:_
Testing Tools:
* Fragment shaders cannot test computational results. Only OpCodes are checked for proper emitting.
Implementation Notes:
* SubpassInput requires an unknown image format.
* SubpassInput is disjoint from TextureType: __SubpassImpl (.slang) & SubpassInputType (Compiler) to reduce code generation required.
* SubpassInput required an additional input layout modifier, input_attachment_index, this was added as a new parameter binding attribute. Since the following qualifiers can overlap with different resources (`layout(input_attachment_index = 0, binding = 0, set = 0)`) input_attachment_index is checked for overlapping resource bindings separately from other qualifiers with `LayoutResourceKind::InputAttachmentIndex`.
* `GLSLInputAttachmentIndexLayoutModifier` was added to enforce function parameters only accepting `in` decorated variables.
* `in` decorated variables needed to have emitting modified to allow directly emitting the variable into function calls if used as a parameter, normally Slang has a "global variable" shadow as a "global parameter" through a copy. This does not work and is solved using `GlobalVariableShadowingGlobalParameterDecoration` to build a relationship of "global variable" to "global parameter", we then resolve this relationship and replace "global variable" uses later in compile.
* `AtomicCounterMemory` memory-constraint requires `OpCapability AtomicStorage`, `AtomicStorage` is invalid for Vulkan targets. glslang outputs for `barrier`, `memoryBarrier`, and `groupMemoryBarrier` `AtomicCounterMemory` as a memory constraint. This compiles as valid SPIR-V for Vulkan since `OpCapability AtomicStorage` is not declared. This behavior of glslang is undefined as per [3.31.Capability of the SPIR-V specification](https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#_capability). We will omit `AtomicCounterMemory` from our barrier calls.
|
|
* Fix #3780.
* Fixers #3781.
* Add test for #3781.
* Diagnose error on unsupported builtin intrinsic types.
* Add check for recursion.
* Fix.
* Fix.
* Fix recursion detection.
* Fix.
* Fix.
* Fix recursion logic.
* More fix.
|
|
(#3675)
The following PR implements raytracing extensions (GLSL_EXT_ray_tracing, GLSL_EXT_ray_query, GLSL_NV_shader_invocation_reorder & GLSL_NV_ray_tracing_motion_blur); for GLSL & SPIR-V targets. Fully implements all functions, built-in variables, & syntax; resolves #3560 for GLSL & SPIR-V Targets.
notes of worth:
* __rayPayloadFromLocation, __rayAttributeFromLocation, and __rayCallableFromLocation, were added as SPIR-V Intrinsics to refer to location's of raytracing objects in SPIR-V for when using GLSL syntax.
|
|
|
|
extension(s); resolves #3587 for GLSL & SPIR-V targets (#3755)
The following commit implements atomic operations & types associated with OpenGL 4.6, GL_EXT_vulkan_glsl_relaxed, GLSL_EXT_shader_atomic_float, GLSL_EXT_shader_atomic_float2, for GLSL & SPIR-V targets.
Fully implements all functions, and built-in type's, resolves https://github.com/shader-slang/slang/issues/3560 for GLSL & SPRI-V targets.
[Atomic extensions for GLSL can be found here](https://github.com/KhronosGroup/GLSL/tree/main)
Notes of worth:
* atomic_uint is well defined in GLSL->OpenGL, although was removed in GLSL->VK unless a compiler extension is supported (GL_EXT_vulkan_glsl_relaxed). This support entails transforming all atomic_uint operations and references into a storage buffer. SPIR-V has AtomicCounter+AtomicStorage (atomic_uint parallel) but does not implement these capabilities for SPIR-V->VK in any scenario. Due to the case we transform atomic_uint ourselves (GLSL_Syntax->Slang_IR) to accommodate transforming atomic_uint into valid syntax.
* GLSL_EXT_shader_atomic_float2 (all float16_t & some float/double operations) support is minimal and worth watching out for if enabling the tests.
|
|
* Fix crash when generating debug info for geometry shaders.
* Fix.
* Fix source language field in DebugCompilationUnit.
* Fix.
* Emit DebugEntryPoint inst.
* Add trivial test.
* Cleanup.
* More cleanup.
|
|
* [SPIRV] Add NonSemanticDebugInfo for step-through debugging.
* Fix.
* Fix.
|
|
* Support pointers in SPIRV.
* Fix test.
* Enhance test.
* Fix test.
* Cleanup.
|
|
parent. Previously special case was added to handle IRDecoration similarly. Replace this with a common method getBlock that traverses the parent chain till it gets to the Block (#3486)
Fixes bug #3432
|
|
* Add `-fspv-reflect` support.
Closes #3462.
* Fix.
* Fix.
* Remove use of `SPV_GOOGLE_hlsl_functionality1`.
* Fix spirv validation error.
* Fix test.
* Update typename hints.
* Update commandline options doc.
* Remove superfluous empty lines.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Fix crash when writing to `no_diff` out parameter.
* Fix.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* SPIRV compiler performance fixes.
* Cleanup.
* update project files
* Cleanup debug code.
* Make redundancy removal non-recursive.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* More direct-SPIRV fixes.
* Fix array-reg-to-mem.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Support `constref` parameters passing.
* Fix.
* Fix.
* Add test and diagnostic on mix use of __constref and no_diff.
* check for [constref] on differentiable member method.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Various SPIRV fixes.
- Geometry shader support (WIP).
- Fix texture get dimension and load.
- Fold global GetElement(MakeArray/MakeVector) insts.
- Call spvopt to inline all functions.
- Translate OpImageSubscript.
- Emit struct member names and global variable names.
- Fix lowering of OpBitNot -> OpNot, instead of OpBitReverse.
* Fix test.
* Fix geometry shader.
* Fix geometry shader emit.
* Add atomic Image access test.
* Fix tests.
* don't fail if spirv-opt fails.
* Update comments.
* Fix test.
* Cleanups.
* indentation
---------
Co-authored-by: Yong He <yhe@nvidia.com>
Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
|
|
* Make dynamic cast transparent through `IRAttributedType`.
* Add [CUDAXxx] variant of attributes.
* Support marshaling of vector types.
* Wrap cuda kernels in `extern "C"` block.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Fix issue with trivial loop detection
* Fix issue with unreachable blocks in break elimination
Add logic to avoid eliminating loops with multi-level breaks.
* Incorporate feedback
- Use a boolean for multi-level break check
- Use dominator trees for region check instead of exhaustive enumeration
- Fix potential issue with enumerating parent break blocks.
* fix
|
|
* Misc. SPIRV Fixes, Part 2.
* Fix up.
* Fix.
* Add system smenatic values.
* 16 bit int and floats, matrix/vector reshape, bool ops.
* Fix.
* Fix.
* Allow push constant entry point params.
* entrypoint params.
* swizzleSet and swizzledStore.
* packoffset.
* string hash.
* Fix.
* Matrix arithmetics.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Support per field matrix layout
* Fix warnings.
* Fix.
* Fix tests.
* Fix spiv gen.
* Fix.
* More test fixes.
* Fix.
* Run only GPU tests on self-hosted servers.
* Remove -use-glsl-matrix-layout-modifier.
* Fix.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Fail on an unhandled spv operand
* Diagnose on emitting a function with no definition or intrinsic
* clearer error message
* Add assert
* Add assert
* remove unused assert
* Disagnostic on snippet parsing failure
* Mention unimplemented instruction in error message
* mention unhandled local instruction for spirv in error message
* Allow specifying dump options in dumpIRToString
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
* Use scratchData on `IRInst` to replace HashSets.
* Update test results.
* Initialize scratchData.
* Update autodiff documentation.
* Use enum instead of bool.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Do not fail when emitting GLSL using unorm/snorm textures
Ignored in glslang https://github.com/KhronosGroup/glslang/blob/main/glslang/HLSL/hlslGrammar.cpp\#L1476
* Add test for unorm modifier on glsl
|
|
* Fix DCE on mutable calls in a loop.
* More accurate in-loop test.
* code review fixes.
* Fix.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Generate faster derivative for div by const operations.
* Increase `kMaxIterationsToAttempt` to 256.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Various fixes for autodiff and slangpy.
* Fix cuda code gen for `select`.
* Fix getBuildTagString().
* Fix.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
|
|
* Various dxc/fxc compatibility fixes.
* Cleanup.
* Fix test cases.
* Fix comments.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* SSA Register Allocation improvements.
* Fix.
* Rename `Use`->`UseOrPseudoUse`.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* WIP lowerCamel Dictionary.
* WIP more lowerCamel fixes for Dictionary.
* Add/Remove/Clear
* GetValue/Contains
* Fix tabs in dictionary.
Count -> getCount
* Fix fields with caps.
* Key -> key
Value -> value
Use m_ for members where appropriate.
Use lowerCamel in linked list.
* Some small fixes/improvements to Dictionary.
* Kick CI.
* Small tidy on String.
* Append -> append
* ToString -> toString
ProduceString -> produceString
* Small fixes.
* StringToXXX -> stringToXXX
* Fix typo introduced by Append -> append.
* Made intToAscii do reversal at the end.
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* WIP lowerCamel Dictionary.
* WIP more lowerCamel fixes for Dictionary.
* Add/Remove/Clear
* GetValue/Contains
* Fix tabs in dictionary.
Count -> getCount
* Fix fields with caps.
* Key -> key
Value -> value
Use m_ for members where appropriate.
Use lowerCamel in linked list.
* Some small fixes/improvements to Dictionary.
* Kick CI.
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|