| Age | Commit message | Author |
|
* Support derivative functions in compute & capabilities adjustments
fixes #4000
PR implements derivative functions in compute shaders properly so we have the functionality for SPIR-V & GLSL. Tests reflect fragment and compute paths.
PR also adjusts capabilities to correct wrong SPIR-V target capabilities when using textures.
Remarks:
1. __requireComputeDerivative(); is an intrinsic_op and not a modifier, since inlining would destroy the modifier.
2. Derivative mode is tied to an entry point decoration `[DerivativeGroupQuad]`/`[DerivativeGroupLinear]` or the GLSL syntax `derivative_group_linearNV`. The default mode is `[DerivativeGroupQuad]`.
* remove -emit-spirv-directly
* fixes
1. fix minor issue fwidth change where I returned the wrong type
2. fix issue where glslang{glsl->spirv} is wrong, so we don't run that test and just run the glsl test & direct spir-v test for intrinsic-texture.slang
* adjust as per review and refine code
1. add test to ensure multi-diverging-in-logic entry points work -- 2 functions which may cause computeDerivatives + 1 that uses, 1 that does not.
2. naming
3. use entry point ref graph for c-like-targets
4. reordered some code into utils and removed `static inline` since that was just for ease of coding on my end (should not have been pushed).
* Grammar
* split up source file + isolate GLSL emit path change.
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
* Switch to direct-to-spirv backend as default.
* Fix slang-test.
* Fix.
* Fix.
|
|
|
|
The following PR implements 8.14-8.19 of the [OpenGL-GLSL specification](https://registry.khronos.org/OpenGL/specs/gl/GLSLangSpec.4.60.pdf).
Fully implements all functions and built-in types; resolves https://github.com/shader-slang/slang/issues/3692 for GLSL & SPIR-V targets.
_Notes:_
Testing Tools:
* Fragment shaders cannot test computational results. Only OpCodes are checked for proper emitting.
Implementation Notes:
* SubpassInput requires an unknown image format.
* SubpassInput is disjoint from TextureType: __SubpassImpl (.slang) & SubpassInputType (Compiler) to reduce code generation required.
* SubpassInput required an additional input layout modifier, input_attachment_index, this was added as a new parameter binding attribute. Since the following qualifiers can overlap with different resources (`layout(input_attachment_index = 0, binding = 0, set = 0)`) input_attachment_index is checked for overlapping resource bindings separately from other qualifiers with `LayoutResourceKind::InputAttachmentIndex`.
* `GLSLInputAttachmentIndexLayoutModifier` was added to enforce function parameters only accepting `in` decorated variables.
* Emission of `in`-decorated variables was modified so that the variable is emitted directly into function calls when it is used as an argument. Normally Slang makes a "global variable" shadow a "global parameter" through a copy, which does not work here. This is solved using `GlobalVariableShadowingGlobalParameterDecoration` to record the relationship between "global variable" and "global parameter"; the relationship is resolved later in compilation, where uses of the "global variable" are replaced.
* The `AtomicCounterMemory` memory constraint requires `OpCapability AtomicStorage`, and `AtomicStorage` is invalid for Vulkan targets. glslang emits `AtomicCounterMemory` as a memory constraint for `barrier`, `memoryBarrier`, and `groupMemoryBarrier`. This compiles as valid SPIR-V for Vulkan only because `OpCapability AtomicStorage` is not declared. This glslang behavior is undefined as per [3.31.Capability of the SPIR-V specification](https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#_capability). We will omit `AtomicCounterMemory` from our barrier calls.
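
As a rough illustration of the last note, the sketch below clears the `AtomicCounterMemory` bit from a barrier's memory-semantics mask. The bit values are those of the SPIR-V Memory Semantics mask; the helper name `stripAtomicCounterMemory` and the sample mask are assumptions made for illustration and are not taken from the Slang source.

```cpp
#include <cstdint>

// Bit values from the SPIR-V "Memory Semantics" mask.
enum MemorySemantics : std::uint32_t {
    AcquireRelease      = 0x8,
    UniformMemory       = 0x40,
    WorkgroupMemory     = 0x100,
    AtomicCounterMemory = 0x400,
    ImageMemory         = 0x800,
};

// Hypothetical helper: drop AtomicCounterMemory so the barrier never
// implies the AtomicStorage capability, which Vulkan does not accept.
constexpr std::uint32_t stripAtomicCounterMemory(std::uint32_t semantics) {
    return semantics & ~static_cast<std::uint32_t>(AtomicCounterMemory);
}

// Example mask that includes AtomicCounterMemory (0xD48), used only
// as sample input for the helper above.
constexpr std::uint32_t sampleBarrierSemantics() {
    return AcquireRelease | UniformMemory | WorkgroupMemory |
           AtomicCounterMemory | ImageMemory;
}
```

With these definitions, `stripAtomicCounterMemory(sampleBarrierSemantics())` leaves every other semantics bit untouched.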
|
|
* Fix #3780.
* Fix #3781.
* Add test for #3781.
* Diagnose error on unsupported builtin intrinsic types.
* Add check for recursion.
* Fix.
* Fix.
* Fix recursion detection.
* Fix.
* Fix.
* Fix recursion logic.
* More fix.
|
|
(#3675)
The following PR implements raytracing extensions (GLSL_EXT_ray_tracing, GLSL_EXT_ray_query, GLSL_NV_shader_invocation_reorder & GLSL_NV_ray_tracing_motion_blur); for GLSL & SPIR-V targets. Fully implements all functions, built-in variables, & syntax; resolves #3560 for GLSL & SPIR-V Targets.
notes of worth:
* __rayPayloadFromLocation, __rayAttributeFromLocation, and __rayCallableFromLocation were added as SPIR-V intrinsics to refer to the locations of raytracing objects in SPIR-V when using GLSL syntax.
|
|
|
|
extension(s); resolves #3587 for GLSL & SPIR-V targets (#3755)
The following commit implements atomic operations & types associated with OpenGL 4.6, GL_EXT_vulkan_glsl_relaxed, GLSL_EXT_shader_atomic_float, GLSL_EXT_shader_atomic_float2, for GLSL & SPIR-V targets.
Fully implements all functions and built-in types; resolves https://github.com/shader-slang/slang/issues/3560 for GLSL & SPIR-V targets.
[Atomic extensions for GLSL can be found here](https://github.com/KhronosGroup/GLSL/tree/main)
Notes of worth:
* atomic_uint is well defined for GLSL->OpenGL, but was removed for GLSL->VK unless a compiler extension (GL_EXT_vulkan_glsl_relaxed) is supported. This support entails transforming all atomic_uint operations and references into a storage buffer. SPIR-V has AtomicCounter+AtomicStorage (the atomic_uint parallel), but these capabilities are not implemented for SPIR-V->VK in any scenario. Because of this, we transform atomic_uint ourselves (GLSL_Syntax->Slang_IR) into valid syntax.
* GLSL_EXT_shader_atomic_float2 (all float16_t & some float/double operations) support is minimal and worth watching out for if enabling the tests.
|
|
* Fix crash when generating debug info for geometry shaders.
* Fix.
* Fix source language field in DebugCompilationUnit.
* Fix.
* Emit DebugEntryPoint inst.
* Add trivial test.
* Cleanup.
* More cleanup.
|
|
* [SPIRV] Add NonSemanticDebugInfo for step-through debugging.
* Fix.
* Fix.
|
|
* Support pointers in SPIRV.
* Fix test.
* Enhance test.
* Fix test.
* Cleanup.
|
|
parent. Previously a special case was added to handle IRDecoration similarly. Replace this with a common method getBlock that traverses the parent chain until it reaches the Block (#3486)
Fixes bug #3432
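
A minimal sketch of the parent-chain walk described above, using a stand-in `Node` type rather than Slang's real IR classes. The names `Node`, `Kind`, `getBlockSketch`, and `exampleDecorationResolvesToBlock` are hypothetical, and starting the walk at the node itself is an assumption.

```cpp
// Stand-in for an IR node; Slang's actual IR types are not reproduced here.
enum class Kind { Func, Block, Inst, Decoration };

struct Node {
    Kind  kind;
    Node* parent;
};

// Walk up the parent chain until a Block is found. No special case is
// needed for decorations: they are nodes whose parent is an instruction.
Node* getBlockSketch(Node* node) {
    for (Node* cur = node; cur != nullptr; cur = cur->parent) {
        if (cur->kind == Kind::Block)
            return cur;
    }
    return nullptr; // no enclosing block, e.g. a global-scope node
}

// Small usage example: func -> block -> inst -> decoration.
bool exampleDecorationResolvesToBlock() {
    Node func{Kind::Func, nullptr};
    Node block{Kind::Block, &func};
    Node inst{Kind::Inst, &block};
    Node decoration{Kind::Decoration, &inst};
    return getBlockSketch(&decoration) == &block &&
           getBlockSketch(&inst) == &block &&
           getBlockSketch(&func) == nullptr;
}
```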
|
|
* Add `-fspv-reflect` support.
Closes #3462.
* Fix.
* Fix.
* Remove use of `SPV_GOOGLE_hlsl_functionality1`.
* Fix spirv validation error.
* Fix test.
* Update typename hints.
* Update commandline options doc.
* Remove superfluous empty lines.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Fix crash when writing to `no_diff` out parameter.
* Fix.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* SPIRV compiler performance fixes.
* Cleanup.
* update project files
* Cleanup debug code.
* Make redundancy removal non-recursive.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* More direct-SPIRV fixes.
* Fix array-reg-to-mem.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Support `constref` parameters passing.
* Fix.
* Fix.
* Add test and diagnostic on mixed use of __constref and no_diff.
* check for [constref] on differentiable member method.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Various SPIRV fixes.
- Geometry shader support (WIP).
- Fix texture get dimension and load.
- Fold global GetElement(MakeArray/MakeVector) insts.
- Call spvopt to inline all functions.
- Translate OpImageSubscript.
- Emit struct member names and global variable names.
- Fix lowering of OpBitNot -> OpNot, instead of OpBitReverse.
* Fix test.
* Fix geometry shader.
* Fix geometry shader emit.
* Add atomic Image access test.
* Fix tests.
* don't fail if spirv-opt fails.
* Update comments.
* Fix test.
* Cleanups.
* indentation
---------
Co-authored-by: Yong He <yhe@nvidia.com>
Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
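
To make the `OpBitNot` item above concrete: bitwise complement (what SPIR-V `OpNot` computes) and bit-order reversal (what `OpBitReverse` computes) are different operations. The following sketch mirrors both on 32-bit integers; the function names are illustrative only and do not come from the Slang source.

```cpp
#include <cstdint>

// Bitwise complement: every bit is flipped (the semantics of OpNot).
constexpr std::uint32_t bitNot(std::uint32_t x) {
    return ~x;
}

// Bit reversal: bit i moves to bit 31 - i (the semantics of OpBitReverse).
inline std::uint32_t bitReverse(std::uint32_t x) {
    std::uint32_t result = 0;
    for (int i = 0; i < 32; ++i) {
        result = (result << 1) | (x & 1u); // shift in the lowest remaining bit
        x >>= 1;
    }
    return result;
}
```

For the input `1`, `bitNot` yields `0xFFFFFFFE` while `bitReverse` yields `0x80000000`, so lowering one as the other silently produces wrong values.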
|
|
* Make dynamic cast transparent through `IRAttributedType`.
* Add [CUDAXxx] variant of attributes.
* Support marshaling of vector types.
* Wrap cuda kernels in `extern "C"` block.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Fix issue with trivial loop detection
* Fix issue with unreachable blocks in break elimination
Add logic to avoid eliminating loops with multi-level breaks.
* Incorporate feedback
- Use a boolean for multi-level break check
- Use dominator trees for region check instead of exhaustive enumeration
- Fix potential issue with enumerating parent break blocks.
* fix
|
|
* Misc. SPIRV Fixes, Part 2.
* Fix up.
* Fix.
* Add system semantic values.
* 16 bit int and floats, matrix/vector reshape, bool ops.
* Fix.
* Fix.
* Allow push constant entry point params.
* entrypoint params.
* swizzleSet and swizzledStore.
* packoffset.
* string hash.
* Fix.
* Matrix arithmetics.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Support per field matrix layout
* Fix warnings.
* Fix.
* Fix tests.
* Fix spirv gen.
* Fix.
* More test fixes.
* Fix.
* Run only GPU tests on self-hosted servers.
* Remove -use-glsl-matrix-layout-modifier.
* Fix.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Fail on an unhandled spv operand
* Diagnose on emitting a function with no definition or intrinsic
* clearer error message
* Add assert
* Add assert
* remove unused assert
* Diagnostic on snippet parsing failure
* Mention unimplemented instruction in error message
* mention unhandled local instruction for spirv in error message
* Allow specifying dump options in dumpIRToString
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
* Use scratchData on `IRInst` to replace HashSets.
* Update test results.
* Initialize scratchData.
* Update autodiff documentation.
* Use enum instead of bool.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Do not fail when emitting GLSL using unorm/snorm textures
Ignored in glslang https://github.com/KhronosGroup/glslang/blob/main/glslang/HLSL/hlslGrammar.cpp#L1476
* Add test for unorm modifier on glsl
|
|
* Fix DCE on mutable calls in a loop.
* More accurate in-loop test.
* code review fixes.
* Fix.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Generate faster derivative for div by const operations.
* Increase `kMaxIterationsToAttempt` to 256.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
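
The commit does not spell out the optimization, so the following is only a guess at the underlying idea, written with a hypothetical `Dual` type: when the divisor is a constant its differential is zero, and the quotient rule collapses to a single division of the differential.

```cpp
// Hypothetical forward-mode pair: a value and its differential.
struct Dual {
    double value;
    double diff;
};

// General quotient rule: d(a / b) = (da * b - a * db) / b^2.
inline Dual divGeneral(Dual a, Dual b) {
    return { a.value / b.value,
             (a.diff * b.value - a.value * b.diff) / (b.value * b.value) };
}

// Divisor is a constant c, so dc = 0 and d(a / c) = da / c.
inline Dual divByConst(Dual a, double c) {
    return { a.value / c, a.diff / c };
}
```

Both functions agree whenever `b.diff` is zero; the constant version simply skips the multiplications and the squared denominator.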
|
|
* Various fixes for autodiff and slangpy.
* Fix cuda code gen for `select`.
* Fix getBuildTagString().
* Fix.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
|
|
* Various dxc/fxc compatibility fixes.
* Cleanup.
* Fix test cases.
* Fix comments.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* SSA Register Allocation improvements.
* Fix.
* Rename `Use`->`UseOrPseudoUse`.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* WIP lowerCamel Dictionary.
* WIP more lowerCamel fixes for Dictionary.
* Add/Remove/Clear
* GetValue/Contains
* Fix tabs in dictionary.
Count -> getCount
* Fix fields with caps.
* Key -> key
Value -> value
Use m_ for members where appropriate.
Use lowerCamel in linked list.
* Some small fixes/improvements to Dictionary.
* Kick CI.
* Small tidy on String.
* Append -> append
* ToString -> toString
ProduceString -> produceString
* Small fixes.
* StringToXXX -> stringToXXX
* Fix typo introduced by Append -> append.
* Made intToAscii do reversal at the end.
---------
Co-authored-by: Yong He <yonghe@outlook.com>
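
For the `intToAscii` item above, one common way to "do reversal at the end" is to emit digits least-significant first and reverse the buffer once. The sketch below shows that pattern with `std::string`; it is not the Slang implementation, and the name `intToAsciiSketch` is made up for illustration.

```cpp
#include <algorithm>
#include <string>

std::string intToAsciiSketch(int value) {
    // Work on the unsigned magnitude so the most negative int is handled.
    unsigned int magnitude = value < 0
        ? 0u - static_cast<unsigned int>(value)
        : static_cast<unsigned int>(value);

    std::string out;
    do {
        // Digits come out least-significant first.
        out.push_back(static_cast<char>('0' + magnitude % 10));
        magnitude /= 10;
    } while (magnitude != 0);

    if (value < 0)
        out.push_back('-'); // the sign ends up in front after the reversal

    std::reverse(out.begin(), out.end());
    return out;
}
```

Reversing once at the end avoids having to know the digit count before writing.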
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* WIP lowerCamel Dictionary.
* WIP more lowerCamel fixes for Dictionary.
* Add/Remove/Clear
* GetValue/Contains
* Fix tabs in dictionary.
Count -> getCount
* Fix fields with caps.
* Key -> key
Value -> value
Use m_ for members where appropriate.
Use lowerCamel in linked list.
* Some small fixes/improvements to Dictionary.
* Kick CI.
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Fix Phi simplification bug.
* Fix up.
* Fix.
* Fix.
* Fix.
* Fix.
* Fix.
* Fix test.
* Fix test.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
parameters. (#2700)
|
|
* Properly implement differential witness of intermediate context type.
* Modify test to include a loop.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Support high order diff pattern: `bwd_diff(fwd_diff(f))`.
* Fix.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Detect and deduplicate read-only resource access.
* Fix tests.
* Fix tests.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* More control flow and Phi param simplifications.
* Fix.
* Fix gcc error.
* Fix.
* More IR cleanup.
* Fix bug in phi param dce + ifelse simplify.
* Propagate and DCE side-effect-free functions.
* Enhance CFG simplification to remove loops with no side effects.
* Fix.
* Fixes.
* Fix tests. Add [__AlwaysFoldIntoUseSite] for rayPayloadLocation.
* More cleanup.
* Fixes.
* Fix.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Fix differentiable type registration
* Fix use of non-differentiable return value in a differentiable func.
* Fix use of primal inst that does not dominate the diff block.
* Fix primal inst hoisting, and add missing type legalization logic.
* Make `detach` defined on all differentiable T.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
|
|
* Overhaul global inst deduplication and cpp/cuda backend.
* Update IR documentation.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|