slang.git - Making it easier to work with shaders

Age	Commit message (Collapse)	Author
2023-08-14	Support per field matrix layout (#3101)	Yong He
	* Support per field matrix layout * Fix warnings. * Fix. * Fix tests. * Fix spiv gen. * Fix. * More test fixes. * Fix. * Run only GPU tests on self-hosted servers. * Remove -use-glsl-matrix-layout-modifier. * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-08-10	Add support for ConstBufferPointer on Vulkan. (#3089)	Yong He
	* Add support for `ConstBufferPointer` on Vulkan. * Add spv compilation test. * Fix. * Fix code review issues. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-08-09	Support implciit casted swizzled lvalue. (#3077)	Yong He
	* Support implciit casted swizzled lvalue. * Fix warnings. * Fix. * fix comment. * Prefer mangled linkage name for global params. * Update tests. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-08-04	Redesign `DeclRef` and systematic `Val` deduplication (#3049)	Yong He
	* Redesign DeclRef + Deduplicate Val. * Update project files * Fix warning. * Fix. * Fix. * Remove `Val::_equalsImplOverride`. * Rmove `Val::_getHashCodeOverride`. * Remove `semanticVisitor` param from `resolve`. * Cleanups. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-08-01	Fix literals needing cast (#3039)	jsmall-nvidia
	* Cast integer literals. * Fix expected output. * For CUDA, search global instructions to see what types are used. Improve lookup for fp16 header in CUDA. * Fix issue with f16tof32 * Small improvement around finding used base types.
2023-07-21	Better handling of bindings with multiple resource kind "aliases" for GLSL ↵	jsmall-nvidia
	emit (#3009) * A more way robust way to handle resource consumption might use multiple `kind`s on GLSL emit. * Improve method naming and some comments. * Small consistency fix.
2023-07-12	Pool inst worklists and hashsets to avoid rehashing. (#2982)	Yong He
	Co-authored-by: Yong He <yhe@nvidia.com>
2023-06-27	Pointer layout support (#2930)	jsmall-nvidia
	* WIP looking at reflection with pointers. * Added GetPointerLayout. * Initial test via reflection with layout of ptr type. * WIP handles ptrs to types that have layout that hasn't been completed. * Move tests to ptr. * WIP try to take into account lowering correctly between AggTypeDecl and Type, but doesn't quite work. * WIP a different path to handling recursive lowering problem with Ptr. * Fix issues with reflection output. * Small tidy. * Fix for infinite recursion issue. * Lower IRPointerTypeLayout * Working with generics. Has a hack to work around Layout around Ptr in IR. The reflection around the generic - the name isn't much use, it should probably have the generic parameters, but that would require getName to do something more sophisticated. * Fix issue around calling finishOuterGenerics to early. * Remove feature/ptr test. * Fix type legalization being an infinite loop with Ptr self referencing. * Disable the pointer self reference test because produces an infintie loop on emit. * Fixed comment based on review. * Fix for issue with emit and pointers causing infinite recursion.
2023-06-22	[branch] and [flatten] support (#2928)	jsmall-nvidia
	* #include an absolute path didn't work - because paths were taken to always be relative. * Small fixes and improvements around reflection tool. * Make PrettyWriter printing a class. * Add HLSL output support for [flatten] and [branch] * Handle [branch] on switch.
2023-06-13	Fixes for Shader Execution Reordering on VK (#2929)	Theresa Foley
	* Fixes for Shader Execution Reordering on VK There are some mismatches between the way that hit objects are handled between the current NVAPI/HLSL and proposed GLSL extensions for shader execution reordering. These mismatches create complications for generating valid GLSL/SPIR-V code from input Slang. Many of the problems that apply to `HitObject` also apply to the existing `RayQuery<>` type used for "inline" ray tracing. In the case of `RayQuery<>` we have that for both HLSL and GLSL/SPIR-V: * A `RayQuery` (or `rayQueryEXT`) is an opaque handle to underlying mutable storage * The storage that backs a `RayQuery` is allocated as part of the "defualt constructor" for a local variable declared with type `RayQuery`. * The `RayQuery` API provides numerous operations that mutate the storage referred to by the opaque handle. The key difference between HLSL and GLSL/SPIR-V for the case of a `RayQuery` amounts to: * In HLSL, local variables of type `RayQuery` can be assigned to, and assignment has by-reference semantics. It is possible to create multiple aliased handles to the same underlying storage. * In GLSL/SPIR-V, local variables of type `rayQueryEXT` cannot be assigned to, returned from functions, etc. It is impossible to create multiple aliased handles to the same underlying storage. The case for `HitObject`s is signicantly more messy, because: * In NVAPI/HLSL a `HitObject` is effectively a "value type" in that it only exposes constructors, and there is no way to mutate the state of a `HitObject` other than by assignment to a variable of that type. It makes no semantic difference whether a `HitObject` directly stores the value(s), or if it is a handle, since there is no way to introduce aliasing of mutable state. Assignment of `HitObject`s semantically creates a copy. * In GLSL/SPIR-V, a `hitObjectNV` is, like a `rayQueryEXT`, a handle to underlying mutable state. These handles cannot be assigned, returned from functions, etc. There is no way to make a copy of a hit object. This change includes several changes to how both `RayQuery<>` and `HitObject` are implemented, with the intention of getting more cases to work correctly when compiling for GLSL/SPIR-V, and to set up a more clear mental model for the semantics we want to give to these types in Slang, and how those semantics can/should map to our targets. An overview of important changes: * Marked a few operations on `RayQuery` as `[mutating]` that realistically should have already been that way. * Marked the `HitObject` type as being non-copyable (an attribute we do not currently enforce), and marked the various GLSL operations that construct a hit object as having an `out` parameter of the `HitObject` type (even if they are nominally specified in GLSL as not writing to the correspondign parameter). * Added a distinct IR opcode (`allocateOpaqueHandle`) to represent the implicit allocation that happens when declaring a variable of type `HitObject` or `RayQuery`, and made the "implicit constructor" for those types map to the new op. This operation took a lot of tweaking to get emitting in a reasonable way, and I'm still not 100% sure that all of the emission-related logic for it is strictly required (or correct). * Added new IR instructions for `HitObject` and `RayQuery` types, and made the stdlib types map to those IR instructions. * Treat `HitObject` and `RayQuery` as resource types for the purpose of our existing pass that specializes calls to functions that have outputs of resource type * Added a new test case that includes a function that returns a `HitObject` as its result. * Many test cases saw slight changes in their output (especially around the relative ordering of declarations of `HitObject`s and `RayQuery`s with other instructions) * Remove debugging logic
2023-05-09	Various fixes for autodiff and slangpy. (#2876)	Yong He
	* Various fixes for autodiff and slangpy. * Fix cuda code gen for `select`. * Fix getBuildTagString(). * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-05-02	Various dxc/fxc compatibility fixes. (#2863)	Yong He
	* Various dxc/fxc compatibility fixes. * Cleanup. * Fix test cases. * Fix comments. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-04-26	Fix most of the disabled warnings on gcc/clang (#2839)	Ellie Hermaszewska

2023-04-26	For C-like targets, emit resource declarations before other globals (#2843)	Sai Praveen Bangaru
	* For C-like targets, emit resource declarations before other globals * Remove unused tests
2023-04-25	StringBuilder to lowerCamel (#2840)	jsmall-nvidia
	* #include an absolute path didn't work - because paths were taken to always be relative. * WIP lowerCamel Dictionary. * WIP more lowerCamel fixes for Dictionary. * Add/Remove/Clear * GetValue/Contains * Fix tabs in dictionary. Count -> getCount * Fix fields with caps. * Key -> key Value -> value Use m_ for members where appropriate. Use lowerCamel in linked list. * Some small fixes/improvements to Dictionary. * Kick CI. * Small tidy on String. * Append -> append * ToString -> toString ProduceString -> produceString * Small fixes. * StringToXXX -> stringToXXX * Fix typo introduced by Append -> append. * Made intToAscii do reversal at the end. --------- Co-authored-by: Yong He <yonghe@outlook.com>
2023-04-25	Dictionary using lowerCamel (#2835)	jsmall-nvidia
	* #include an absolute path didn't work - because paths were taken to always be relative. * WIP lowerCamel Dictionary. * WIP more lowerCamel fixes for Dictionary. * Add/Remove/Clear * GetValue/Contains * Fix tabs in dictionary. Count -> getCount * Fix fields with caps. * Key -> key Value -> value Use m_ for members where appropriate. Use lowerCamel in linked list. * Some small fixes/improvements to Dictionary. * Kick CI.
2023-03-30	More builtin library support in torch backend. (#2760)	Yong He
	Co-authored-by: Yong He <yhe@nvidia.com>
2023-03-28	Add slangpy doc, fix cuda prelude. (#2748)	Yong He
	* Add slangpy doc, fix cuda prelude. * more bug fix. * fix. * fix. * More fix. * fix. * f * fix prelude. * update prelude. * update doc * Update prelude. * add zeros_like * update doc. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-03-28	Small fixes and cleanups on CUDA/CPP codegen. (#2746)	Yong He
	* Small fixes and cleanups on CUDA/CPP codegen. * Disable `legalizeEmptyTypes` for now. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-03-26	Add PyTorch C++ binding generation. (#2734)	Yong He
	* Add PyTorch C++ binding generation. * fix --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-03-23	Fix scope fixing for address insts. (#2724)	Yong He
	Co-authored-by: Yong He <yhe@nvidia.com>
2023-02-27	Detect and deduplicate read-only resource access. (#2680)	Yong He
	* Detect and deduplicate read-only resource access. * Fix tests. * Fix tests. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-02-24	More control flow simplifications. (#2673)	Yong He
	* More control flow and Phi param simplifications. * Fix. * Fix gcc error. * Fix. * More IR cleanup. * Fix bug in phi param dce + ifelse simplify. * Propagate and DCE side-effect-free functions. * Enhance CFG simplifcation to remove loops with no side effects. * Fix. * Fixes. * Fix tests. Add [__AlwaysFoldIntoUseSite] for rayPayloadLocation. * More cleanup. * Fixes. * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-02-07	Arithmetic simplifications and more IR clean up logic. (#2632)	Yong He

2023-01-25	Unify UpdateField and UpdateElement with access chain. (#2611)	Yong He
	* Unify UpdateField and UpdateElement with access chain. * Fix warnings. Co-authored-by: Yong He <yhe@nvidia.com>
2023-01-24	Reimplement address elimination. (#2605)	Yong He
	* Reimplement address elimination pass. * Fix error. * Update test references. Co-authored-by: Yong He <yhe@nvidia.com>
2022-12-14	Fix code generation for matrix reshape. (#2568)	Yong He
	Co-authored-by: Yong He <yhe@nvidia.com>
2022-12-08	Auto-diff for matrix operations. (#2559)	Yong He
	Co-authored-by: Yong He <yhe@nvidia.com>
2022-12-07	Rename IR opcodes to unify style. (#2556)	Yong He
	Co-authored-by: Yong He <yhe@nvidia.com>
2022-12-07	Remove `construct` IR op. (#2555)	Yong He
	Co-authored-by: Yong He <yhe@nvidia.com>
2022-12-07	Lower-to-ir no longer produce `Construct` inst. (#2553)	Yong He
	Co-authored-by: Yong He <yhe@nvidia.com>
2022-12-02	Inline functions with string param/return for GPU targets (#2544)	jsmall-nvidia
	* #include an absolute path didn't work - because paths were taken to always be relative. * WIP inlining of functions that take or return string related types on GPU targets. * Small fixes. * Added a test. * Add checking for any getStringHash insts are valid. * Support getStringHash on CUDA. * Tweak diagnostic.
2022-11-16	Mesh shader support (#2464)	Ellie Hermaszewska
	* Add gdb generated files to .gitignore * Switch to c++17 TODO: Ellie update coding style doc * WIP mesh shaders * Add MeshOutputType and mesh output decorations * Lift array type layout creation out of _createTypeLayout in preparation for sharing it elsewhere * Initial pass at GLSL legalization for mesh shaders * Create output types for builtin mesh outputs This should be rendered as an out paramter block * Handle writes to member fields in mesh shader output * Per primitive output from mesh shaders * Add mesh shader tests * Redeclare mesh output builtins * Remove unused instruction * Emit explicit mesh output max max size * Add unimplemented warning for array members in mesh output * Implement mesh output splitting for GLSL in terms of getSubscriptVal * Allow HLSL syntax for mesh output modifiers * Improve error messages for mesh output * Add test for HLSL style mesh output syntax * Emit explicit mesh output indices max size * HLSL generation support for mesh shaders * Better errors for mesh shader misuse * Neaten comments * Regenerate vs2019 project files * Fix build on vs2019 * Retreat on c++17 Will make the change in a separate PR * slang-glslang binary dep 11.10.0 -> 11.12.0-32 * Fixes for msvc compiler * Update msvc project
2022-11-15	Shader Execution Reordering for VK (#2491)	jsmall-nvidia
	* #include an absolute path didn't work - because paths were taken to always be relative. * Fixes around MakeMiss. * Add preliminary support for HitObject::MakeHit. * Make Nop. * Add HitObject::TraceRay. * HitObject::Invoke for VK. * Remove line numbers from SER GLSL output. * Add support for HitObjectAttributes Add support for GLSL HitObject.GetAttributes<T>() Simplified code around getting locations. * Be more explicit about requiring GL_EXT_ray_tracing in SER. * Split out LocationTracker from CLikeEmitter. * Small doc improvements. * Add motion ray support. * Use inlining to get correct GLSL behavior around hitObjectNV. * Add assignment HitObject test. * Add a HitObject array test. Shows doesn't work correctly for VK/GLSL. * Add call to `hitObjectGetAttributesNV` before getting attributes.
2022-11-02	Shader Execution Reordering (via NVAPI) (#2484)	jsmall-nvidia
	* #include an absolute path didn't work - because paths were taken to always be relative. * Preliminary SER NVAPI support. * Set the DXC compiler version. Fix typo in premake5.lua * Improve DXC version detection. Enable HLSL2021 on late enough version of DXC. * Fix typo. * Fix launch. * Test via DXIL output. * Update dxc-error output.
2022-09-15	Run simple compute kernel in gfx-smoke test. (#2400)	Yong He

2022-09-15	Language feature: pointer sized int types. (#2401)	Yong He
	* Language feature: pointer sized int types. * Fix. * small change to test. * Fix stdlib. * Fix. * Fix. * Add typedef for `size_t` in stdlib. * Fix test. * Add `intptr_t::size` constant. Co-authored-by: Yong He <yhe@nvidia.com>
2022-08-24	Compiler time evaluation of all int and bool operators. (#2376)	Yong He
	* Compiler time evaluation of all int and bool operators. * Fix linux compile error. * Fix. Co-authored-by: Yong He <yhe@nvidia.com>
2022-08-20	Call `gfx` in slang program. (#2370)	Yong He

2022-08-04	Implicit pointer dereference when using member operator. (#2348)	Yong He
	* Implicit pointer dereference when using member operator. * Add expected test result * Fix lookup. Co-authored-by: Yong He <yhe@nvidia.com>
2022-08-03	Basic pointer usages. (#2342)	Yong He

2022-07-25	Allow `class` to implement COM interface, [DLLExport] (#2338)	Yong He
	* Allow `class` to implement COM interface, [DLLExport] * Fix [COM] usage in tests and examples with UUIDs. Co-authored-by: Yong He <yhe@nvidia.com>
2022-07-12	Support `class` types. (#2321)	Yong He
	* Support `class` types. * Ignore class-keyword test * Fix codereview comments and warnings. Co-authored-by: Yong He <yhe@nvidia.com>
2022-06-29	Native call marshalling for ComPtr parameters and return values. (#2305)	Yong He
	Co-authored-by: Yong He <yhe@nvidia.com>
2022-06-21	Lower throwing COM interface method. (#2282)	Yong He
	* Lower throwing COM interface method. * Fix. * Fix warnings. Co-authored-by: Yong He <yhe@nvidia.com>
2022-06-02	COM interfaces with host callable (#2258)	jsmall-nvidia
	* #include an absolute path didn't work - because paths were taken to always be relative. * Use TerminatedUnownedStringSlice for literals in output C++. * Remove Escape/Unescape functions used in slang-token-reader.cpp Add target type of 'host-cpp' etc to map to the target types. * Fix some corner cases around string encoding. * Added unit test for string escaping. Fixed some assorted escaping bugs. * Updated test output. * Added decode test. * Stop using hex output, to get around 'greedy' aspect. Use octal instead. * Added HostHostCallable Small changes to use ArtifactDesc/Info instead of large switches. * Fix C++ emit to handle arbitrary function export. * Add options handling for callable without an output being specified. * Can compile with COM interface. Added example using com interface. * Use the IR Ptr type instead of hack in C++ emit for interfaces. * Fix issue with outputting the COM call when ptr is used. * Fix crash issue on compilation failure.
2022-06-01	Clean up void returns. (#2260)	Yong He
	* Clean up `IRReturnVoid`. * Update gitignore. Co-authored-by: Yong He <yhe@nvidia.com>
2022-05-17	Refactor prelude emit (#2236)	jsmall-nvidia
	* #include an absolute path didn't work - because paths were taken to always be relative. * Refactor how prelude output works in emit. * Small improvement to emit output. * Move around comment on target specific language directives based on review. Co-authored-by: Theresa Foley <10618364+tangent-vector@users.noreply.github.com>
2022-05-10	Initial support for COM interface in host code. (#2230)	Yong He
	Co-authored-by: Yong He <yhe@nvidia.com> Co-authored-by: Theresa Foley <10618364+tangent-vector@users.noreply.github.com>
2022-05-10	Use IR pass to eliminate phi nodes (#2226)	Theresa Foley
	* Use IR pass to eliminate phi nodes "Phi nodes" are one of the key contrivances that makes SSA (Static Single Assignment) form work. Because SSA is so great for compiler IRs, we kind of need to deal with phi nodes, but they also get in the way because they don't have a direct analog in most lower-level machine ISAs or execution models, nor in most of the high-level languages a transpiler wants to emit. As a result a compiler like ours needs to be able to eliminate the phi nodes from a program as part of generating output code. (For any clever people noting that SPIR-V supports phi nodes directly: yes, it does. It doesn't need to and it probably shouldn't. Anybody involved in the decision-making knows my reasoning, and anybody else should feel free to ask me if they want the lecture. Anyway...) The basic idea of elimiating phi nodes is simple enough. We replace each phi node with a temporary variable. Uses of the phi use values loaded from the temporary. The operation of the phi itself (assigning a value based on the branch taken) amounts to an assignment into the temporary. Previously, the Slang compiler dealt with phi nodes very late in the process of generating code: in the middle of emitting strings of source code in a high-level language like HLSL or GLSL. Doing the work that late in compilation has two big drawbacks: 1. Our ability to emit clean and/or optimal code is limited because we may not be able to make certain changes to the IR, or because we cannot make use of additional information like a dominator tree that might be available at other points in compilation. 2. Any other IR passes that relate to temporary variables won't be able to see the variables that we generate for phi nodes. This could raise issues with correctness (e.g., if we want to compute live-range information for all temporary variables), or performance (we have no way to run additional IR optimization passes after phis are eliminated). This change addresses these problems by making the elimination of phi nodes an explicit IR pass. Additional optimizations can easily be run after this pass (although we'd need to be careful not to run passes that could end up introducing new phis). The pass makes use of the information available to it to try to produce code that will emit to "clean" HLSL/GLSL. The core of the pass is in `slang-ir-eliminate-phis.cpp`, and is heavily commented, so I won't describe the approach in detail here. There are two related issues that came up, though: First, it turned out that our emit logic for local variables (`IRVar` instructions) wasn't using the function we'd defined named `emitVar()`. One worrying consequence of that oversight was that the `precise` modifier would impact generated HLSL/GLSL for variables that turned into SSA values (including phi nodes), but not for local variables that had not been SSA'd (or that had been SSA'd and then de-SSA'd). This change also fixes that bug; it is unclear how widespread the impact of the original issue might be. Second, generating explicit IR temporaries for phi nodes exposed a pre-existing bug in the `slang-ir-restructure-scoping` pass. That pass basically detects cases where we have an instruction `I` with a use `U` such that the use follows the rules of SSA form ("def dominates use," meaning `I` dominations `U`), but does not follow the more restrictive scoping rules of high-level-language output (where a value computed "inside" a loop is not automatically visible to code outside the loop just because it dominates that code). That pass did not correctly account for the case where `I` was a temporary variable. It seems that case could not arise before now because we didn't have any passes that would move `var`, `load`, or `store` operations out of the basic block they started in. The fix for that pass was relatively simple, and will make the whole thing more robust in case we add more aggressive optimizations later. * fixup: expected test output