path: root/source/slang/slang-emit.cpp
Commit message Author Age
...
* Minimum binary arithmetic reverse autodiff working. (#2514) Edward Liu 2022-11-14
  * Initial plumbing of backward autodiff in the frontend.
  * More plumbing.
  * Initial reverse autodiff working.
  * Bug fixes.
  * Misc.
  * Remove redundant code.
  * More clean up.
  * Misc.
  * Rebase and add backward diff test.
  * Disable test.
  * Clean up.
  * Minor fix.
  Co-authored-by: Yong He <yhe@nvidia.com>
* Fix inlining pass. (#2506) Yong He 2022-11-10
  * Fix inlining pass.
  * Add more checks against corner cases.
  * Revise comments.
  * Fixes.
  * Fix premake script.
  * Fixes.
  Co-authored-by: Yong He <yhe@nvidia.com>
* Higher order differentiation. (#2487) Yong He 2022-11-04
  Co-authored-by: Yong He <yhe@nvidia.com>
* More renaming in jvp pass. (#2475) Yong He 2022-10-27
  Co-authored-by: Yong He <yhe@nvidia.com>
* Legalize array return types. (#2463) Yong He 2022-10-26
* Rework differentiation of member access through `[DerivativeMember(DiffType.field)]` (#2460) Yong He 2022-10-24
  * wip: remove auto-diff for member access, add diff through property accessors.
  * Fix getter-setter test.
  * Fix getter-setter-multi test.
  * Fix nested-jvp test.
  * Use [DerivativeMember] attribute to differentiate through member access.
  * Clean up.
  * More cleanup.
  Co-authored-by: Yong He <yhe@nvidia.com>
* Modified the new type system to support generic differentiable types … (#2413) Sai Praveen Bangaru 2022-10-20
  * Modified the new type system to support generic differentiable types and added support for differentiating overloaded functions.
  * Changed a few asserts to release asserts to avoid unreferenced variable errors
  * Fixed a naming issue with TypeWitnessBreadcumb::Flavor::Decl
  * Added logic to avoid tracking differentiable types if the module does not use auto-diff or define differentiable types.
  * Moved the auto-diff passes to after the specialization step, added a more complex generics test
  * Added a generics stress test and fixed AST-side logic. IR side needs some more work
  * Added differential getter and setter logic, fixed multiple issues with DifferentiableTypeDictionary, added support for loops and conditions
  * Changed differential getters to use pointer types, added getter type checking
  * Fixed some bugs related to diff type registration and differential getters
  * Removed some superfluous code
  * Removed some more unused code.
  * Fixed an issue with witness substitution
  * Minor fix
  Co-authored-by: Yong He <yonghe@outlook.com>
* Allow multi-level breaks to break out of `switch` statements. (#2451) Yong He 2022-10-13
  * Allow multi-level breaks to break out of `switch` statements.
  * Rename loop->region.
  * Add `[ForceInline]` attribute.
  Co-authored-by: Yong He <yhe@nvidia.com>
* Small IR cleanups. (#2441) Yong He 2022-10-11
* Support multi-level break + single-return conversion + general inline. (#2436) Yong He 2022-10-10
  * Support multi-level break.
  * Single return.
  * Add test for inlining `void` return-type functions.
  Co-authored-by: Yong He <yhe@nvidia.com>
* Assorted Artifact improvements (#2374) jsmall-nvidia 2022-08-24
  * #include an absolute path didn't work - because paths were taken to always be relative.
  * WIP replacing DownstreamCompileResult.
  * First attempt at replacing DownstreamCompileResult with IArtifact and associated types.
  * Small renaming around CharSlice.
  * ICastable -> ISlangCastable. Added IClonable. Fix issue with cloning in ArtifactDiagnostics.
  * Only add the blob if one is defined in DXC.
  * Guard adding blob representation.
  * Make cloneInterface available across code base. Set enums backing type for ArtifactDiagnostic.
  * Added ::create for ArtifactDiagnostics.
  * Use SemanticVersion for DownstreamCompilerDesc. Set sizes for enum types.
  * Deprecate old incompatible CompileOptions. Change SemanticVersion to use 32 bits for the patch.
  * Split out CastableUtil.
  * Change IDownstreamCompiler to use canConvert and convert to use artifact types.
  * Fix typos.
  * Fix typo bug. Allow trafficking in PTX assembly/binaries
  * struct DownstreamCompilerBaseUtil -> struct DownstreamCompilerUtilBase
  * Add other riff types.
  * Small fix around artifact kind.
  * Make using slices instead of strings explicit on atomic ref counted types (not complete). Added IArtifactList. Use IArtifactList to hold the 'associated' files. Use IUnknown for scoping for atomic ref counting. Small naming improvements.
  * Make artifact not use String in construction (so it owns contents).
  * Calculate compile products as artifacts.
  * Small improvements around ArtifactDesc.
  * Use ICastableList for list of artifacts and remove IArtifactList.
* Make Optional<PointerType> lower to PointerType instead of a struct. (#2373) Yong He 2022-08-22
* Replace DownstreamCompileResult with Artifact (#2369) jsmall-nvidia 2022-08-22
  * #include an absolute path didn't work - because paths were taken to always be relative.
  * WIP replacing DownstreamCompileResult.
  * First attempt at replacing DownstreamCompileResult with IArtifact and associated types.
  * Small renaming around CharSlice.
  * ICastable -> ISlangCastable. Added IClonable. Fix issue with cloning in ArtifactDiagnostics.
  * Only add the blob if one is defined in DXC.
  * Guard adding blob representation.
  * Make cloneInterface available across code base. Set enums backing type for ArtifactDiagnostic.
  * Added ::create for ArtifactDiagnostics.
* Call `gfx` in slang program. (#2370) Yong He 2022-08-20
* IDownstreamCompiler interface (#2361) jsmall-nvidia 2022-08-16
  * WIP with hierarchical enums.
  * Some small fixes and improvements around artifact desc related types.
  * Improvements around hierarchical enum.
  * Fixes to get Artifact types refactor to be able to execute tests.
  * Attempt to better categorize PTX.
  * Work around for potentially unused function warning.
  * Typo fix.
  * Simplify Artifact header.
  * Small improvements around Artifact kind/payload/style.
  * Added IDestroyable/ICastable
  * Add IArtifactList.
  * First impl of IArtifactUtil.
  * Use the ICastable interface for IArtifactRepresentation.
  * Added IArtifactRepresentation & IArtifactAssociated.
  * Add SLANG_OVERRIDE to avoid gcc/clang warning.
  * Fix calling convention issue on win32.
  * Fix missing SLANG_OVERRIDE.
  * First attempt at file abstraction around Artifact.
  * Added creation of lock file.
  * Move functionality for determining file paths to the IArtifactUtil. Add casting to ICastable.
  * Added some casting/finding mechanisms.
  * Simplify IArtifact interface, and use Items for file reps.
  * Fix problem with libraries on DXIL.
  * Split out ArtifactRepresentation.
  * Move ArtifactDesc functionality to ArtifactDescUtil. ArtifactInfoUtil becomes ArtifactDescUtil.
  * Split implementations from the interfaces for Artifact.
  * Use TypeTextUtil for target name outputting.
  * Add artifact impls.
  * Add ICastableList
  * Added UnknownCastableAdapter
  * Make ISlangSharedLibrary derive from ICastable, and remain backwards compatible with slang-llvm.
  * Refactor Representation on Artifact.
  * Make our ISlangBlobs also derive from ICastable. Make ISlangBlob atomic ref counted.
  * Split out CastableList and related types, and placed in core.
  * Small fixes around IArtifact. Improve IArtifact docs. First impl of getChildren for IArtifact.
  * Documentation improvements for Artifact related types.
  * Fix typo.
  * Special case adding a ICastableList to a LazyCastableList.
  * Small simplification of LazyCastableList, by adding State member.
  * Removed the ILockFile interface because IFileArtifactRepresentation can be used.
  * Implement DiagnosticsArtifactRepresentation.
  * Added PostEmitMetadataArtifactRepresentation
  * Add searching by predicate. Added handling of accessing Artifact as ISharedLibrary
  * Fix typo.
  * Add find to IArtifactList. Fix some missing SLANG_NO_THROW.
  * Small improvements around ArtifactDesc types.
  * Another small change around ArtifactKind.
  * Some more shuffling of ArtifactDesc.
  * Make IArtifact castable. Remove IArtifactList. Made IArtifactContainer derive from IArtifact. Made ModuleLibrary atomic ref counted/given IModuleLibrary interface.
  * Must call _requireChildren before any children access.
  * Fix missing SLANG_MCALL on castAs.
  * Fix missing SLANG_OVERRIDE.
  * Added IArtifactHandler
  * Use ICastable for basis of scope/lookup.
  * WIP first attempt to remove CompileResult.
  * Fix support for downstream compiler shared library adapter.
  * Fix issues found when replacing CompileResult.
  * Fix typo.
  * Fix getting items from 'significant' member of an Artifact.
  * Split out ArtifactUtil & ArtifactHandler.
  * Work around for problem on Visual Studio.
  * Improve searching.
  * Add missing files.
  * Split out Artifact associated types. Don't produce a container by default - use associated for 'metadata'.
  * Remove no longer used ArtifactPayload type.
  * Generalized converting representations. Small improvements to artifacts.
  * Fix intermediate dumping issue.
  * Removed #if 0 out CompileResult. Remove DownstreamCompileResult maybeDumpIntermediate.
  * Pull out functionality for dumping artifact output into ArtifactOutputUtil. Fixed a bug in naming files based on ArtifactDesc.
  * std::atomic issue.
  * Pull out types from DownstreamCompile to simplify moving to an interface.
  * Fix typo.
  * Use IDownstreamCompiler interface. Split out DownstreamCompilerUtil and DownstreamCompilerSet.
  * Update projects.
  * Fix missing SLANG_MCALL.
  * Fix calling convention of IDownstreamCompiler impls.
  * Split out binary work arounds into a dep1.cpp/h
  * Small reorganising around DownstreamCompilerInfo.
  * Remove Desc library functionality to DownstreamCompilerUtil.
  * Expand IDiagnostics interface. Rename associated impls with Impl suffix.
  * Fix outputting as text bug. Some small improvements.
  * Add fix around prefix for dumping. Improved how handling for extensions work from ArtifactDesc.
  * Dump assembly if available.
  * Simplify some of Dep1 definitions.
* Move metadata/diagnostics to associated types (#2358) jsmall-nvidia 2022-08-16
  * #include an absolute path didn't work - because paths were taken to always be relative.
  * WIP with hierarchical enums.
  * Some small fixes and improvements around artifact desc related types.
  * Improvements around hierarchical enum.
  * Fixes to get Artifact types refactor to be able to execute tests.
  * Attempt to better categorize PTX.
  * Work around for potentially unused function warning.
  * Typo fix.
  * Simplify Artifact header.
  * Small improvements around Artifact kind/payload/style.
  * Added IDestroyable/ICastable
  * Add IArtifactList.
  * First impl of IArtifactUtil.
  * Use the ICastable interface for IArtifactRepresentation.
  * Added IArtifactRepresentation & IArtifactAssociated.
  * Add SLANG_OVERRIDE to avoid gcc/clang warning.
  * Fix calling convention issue on win32.
  * Fix missing SLANG_OVERRIDE.
  * First attempt at file abstraction around Artifact.
  * Added creation of lock file.
  * Move functionality for determining file paths to the IArtifactUtil. Add casting to ICastable.
  * Added some casting/finding mechanisms.
  * Simplify IArtifact interface, and use Items for file reps.
  * Fix problem with libraries on DXIL.
  * Split out ArtifactRepresentation.
  * Move ArtifactDesc functionality to ArtifactDescUtil. ArtifactInfoUtil becomes ArtifactDescUtil.
  * Split implementations from the interfaces for Artifact.
  * Use TypeTextUtil for target name outputting.
  * Add artifact impls.
  * Add ICastableList
  * Added UnknownCastableAdapter
  * Make ISlangSharedLibrary derive from ICastable, and remain backwards compatible with slang-llvm.
  * Refactor Representation on Artifact.
  * Make our ISlangBlobs also derive from ICastable. Make ISlangBlob atomic ref counted.
  * Split out CastableList and related types, and placed in core.
  * Small fixes around IArtifact. Improve IArtifact docs. First impl of getChildren for IArtifact.
  * Documentation improvements for Artifact related types.
  * Fix typo.
  * Special case adding a ICastableList to a LazyCastableList.
  * Small simplification of LazyCastableList, by adding State member.
  * Removed the ILockFile interface because IFileArtifactRepresentation can be used.
  * Implement DiagnosticsArtifactRepresentation.
  * Added PostEmitMetadataArtifactRepresentation
  * Add searching by predicate. Added handling of accessing Artifact as ISharedLibrary
  * Fix typo.
  * Add find to IArtifactList. Fix some missing SLANG_NO_THROW.
  * Small improvements around ArtifactDesc types.
  * Another small change around ArtifactKind.
  * Some more shuffling of ArtifactDesc.
  * Make IArtifact castable. Remove IArtifactList. Made IArtifactContainer derive from IArtifact. Made ModuleLibrary atomic ref counted/given IModuleLibrary interface.
  * Must call _requireChildren before any children access.
  * Fix missing SLANG_MCALL on castAs.
  * Fix missing SLANG_OVERRIDE.
  * Added IArtifactHandler
  * Use ICastable for basis of scope/lookup.
  * WIP first attempt to remove CompileResult.
  * Fix support for downstream compiler shared library adapter.
  * Fix issues found when replacing CompileResult.
  * Fix typo.
  * Fix getting items from 'significant' member of an Artifact.
  * Split out ArtifactUtil & ArtifactHandler.
  * Work around for problem on Visual Studio.
  * Improve searching.
  * Add missing files.
  * Split out Artifact associated types. Don't produce a container by default - use associated for 'metadata'.
  * Remove no longer used ArtifactPayload type.
  * Generalized converting representations. Small improvements to artifacts.
  * Fix intermediate dumping issue.
  * Removed #if 0 out CompileResult. Remove DownstreamCompileResult maybeDumpIntermediate.
  * Pull out functionality for dumping artifact output into ArtifactOutputUtil. Fixed a bug in naming files based on ArtifactDesc.
  * std::atomic issue.
  * Fix outputting as text bug. Some small improvements.
  * Add fix around prefix for dumping. Improved how handling for extensions work from ArtifactDesc.
  * Dump assembly if available.
* Remove CompileResult to use IArtifact (#2357) jsmall-nvidia 2022-08-16
  * #include an absolute path didn't work - because paths were taken to always be relative.
  * WIP with hierarchical enums.
  * Some small fixes and improvements around artifact desc related types.
  * Improvements around hierarchical enum.
  * Fixes to get Artifact types refactor to be able to execute tests.
  * Attempt to better categorize PTX.
  * Work around for potentially unused function warning.
  * Typo fix.
  * Simplify Artifact header.
  * Small improvements around Artifact kind/payload/style.
  * Added IDestroyable/ICastable
  * Add IArtifactList.
  * First impl of IArtifactUtil.
  * Use the ICastable interface for IArtifactRepresentation.
  * Added IArtifactRepresentation & IArtifactAssociated.
  * Add SLANG_OVERRIDE to avoid gcc/clang warning.
  * Fix calling convention issue on win32.
  * Fix missing SLANG_OVERRIDE.
  * First attempt at file abstraction around Artifact.
  * Added creation of lock file.
  * Move functionality for determining file paths to the IArtifactUtil. Add casting to ICastable.
  * Added some casting/finding mechanisms.
  * Simplify IArtifact interface, and use Items for file reps.
  * Fix problem with libraries on DXIL.
  * Split out ArtifactRepresentation.
  * Move ArtifactDesc functionality to ArtifactDescUtil. ArtifactInfoUtil becomes ArtifactDescUtil.
  * Split implementations from the interfaces for Artifact.
  * Use TypeTextUtil for target name outputting.
  * Add artifact impls.
  * Add ICastableList
  * Added UnknownCastableAdapter
  * Make ISlangSharedLibrary derive from ICastable, and remain backwards compatible with slang-llvm.
  * Refactor Representation on Artifact.
  * Make our ISlangBlobs also derive from ICastable. Make ISlangBlob atomic ref counted.
  * Split out CastableList and related types, and placed in core.
  * Small fixes around IArtifact. Improve IArtifact docs. First impl of getChildren for IArtifact.
  * Documentation improvements for Artifact related types.
  * Fix typo.
  * Special case adding a ICastableList to a LazyCastableList.
  * Small simplification of LazyCastableList, by adding State member.
  * Removed the ILockFile interface because IFileArtifactRepresentation can be used.
  * Implement DiagnosticsArtifactRepresentation.
  * Added PostEmitMetadataArtifactRepresentation
  * Add searching by predicate. Added handling of accessing Artifact as ISharedLibrary
  * Fix typo.
  * Add find to IArtifactList. Fix some missing SLANG_NO_THROW.
  * Small improvements around ArtifactDesc types.
  * Another small change around ArtifactKind.
  * Some more shuffling of ArtifactDesc.
  * Make IArtifact castable. Remove IArtifactList. Made IArtifactContainer derive from IArtifact. Made ModuleLibrary atomic ref counted/given IModuleLibrary interface.
  * Must call _requireChildren before any children access.
  * Fix missing SLANG_MCALL on castAs.
  * Fix missing SLANG_OVERRIDE.
  * Added IArtifactHandler
  * Use ICastable for basis of scope/lookup.
  * WIP first attempt to remove CompileResult.
  * Fix support for downstream compiler shared library adapter.
  * Fix issues found when replacing CompileResult.
  * Fix typo.
  * Fix getting items from 'significant' member of an Artifact.
  * Split out ArtifactUtil & ArtifactHandler.
  * Work around for problem on Visual Studio.
  * Improve searching.
  * Add missing files.
* Add `none` literal that is convertible to `Optional`. (#2356) Yong He 2022-08-10
  * Add `none` literal that is convertible to `Optional`.
  * Fix cpu code gen.
  * Include vk and cpu test for is-as operator test.
  * Inline comparison operators.
  Co-authored-by: Yong He <yhe@nvidia.com>
* `is` and `as` operator and `Optional<T>`. (#2355) Yong He 2022-08-10
  * `is` and `as` operator and `Optional<T>`.
  * Fix.
  Co-authored-by: Yong He <yhe@nvidia.com>
* Artifact split interface and implementation (#2349) jsmall-nvidia 2022-08-09
  * #include an absolute path didn't work - because paths were taken to always be relative.
  * WIP with hierarchical enums.
  * Some small fixes and improvements around artifact desc related types.
  * Improvements around hierarchical enum.
  * Fixes to get Artifact types refactor to be able to execute tests.
  * Attempt to better categorize PTX.
  * Work around for potentially unused function warning.
  * Typo fix.
  * Simplify Artifact header.
  * Small improvements around Artifact kind/payload/style.
  * Added IDestroyable/ICastable
  * Add IArtifactList.
  * First impl of IArtifactUtil.
  * Use the ICastable interface for IArtifactRepresentation.
  * Added IArtifactRepresentation & IArtifactAssociated.
  * Add SLANG_OVERRIDE to avoid gcc/clang warning.
  * Fix calling convention issue on win32.
  * Fix missing SLANG_OVERRIDE.
  * First attempt at file abstraction around Artifact.
  * Added creation of lock file.
  * Move functionality for determining file paths to the IArtifactUtil. Add casting to ICastable.
  * Added some casting/finding mechanisms.
  * Simplify IArtifact interface, and use Items for file reps.
  * Fix problem with libraries on DXIL.
  * Split out ArtifactRepresentation.
  * Move ArtifactDesc functionality to ArtifactDescUtil. ArtifactInfoUtil becomes ArtifactDescUtil.
  * Split implementations from the interfaces for Artifact.
  * Use TypeTextUtil for target name outputting.
  * Add artifact impls.
* Allow `class` to implement COM interface, [DLLExport] (#2338) Yong He 2022-07-25
  * Allow `class` to implement COM interface, [DLLExport]
  * Fix [COM] usage in tests and examples with UUIDs.
  Co-authored-by: Yong He <yhe@nvidia.com>
* Allow dynamic dispatch to handle nested interface-typed fields. (#2336) Yong He 2022-07-21
* Preserve specialization cache in IR for specialization pass. (#2293) Yong He 2022-06-23
  * Preserve specialization cache in IR for specialization pass.
  * Fix compile error.
  * Fix.
  * Fix.
  * Fix test case.
  * Fix.
  Co-authored-by: Yong He <yhe@nvidia.com>
* Lower throwing COM interface method. (#2282) Yong He 2022-06-21
  * Lower throwing COM interface method.
  * Fix.
  * Fix warnings.
  Co-authored-by: Yong He <yhe@nvidia.com>
* COM interfaces with host callable (#2258) jsmall-nvidia 2022-06-02
  * #include an absolute path didn't work - because paths were taken to always be relative.
  * Use TerminatedUnownedStringSlice for literals in output C++.
  * Remove Escape/Unescape functions used in slang-token-reader.cpp. Add target type of 'host-cpp' etc to map to the target types.
  * Fix some corner cases around string encoding.
  * Added unit test for string escaping. Fixed some assorted escaping bugs.
  * Updated test output.
  * Added decode test.
  * Stop using hex output, to get around 'greedy' aspect. Use octal instead.
  * Added HostHostCallable. Small changes to use ArtifactDesc/Info instead of large switches.
  * Fix C++ emit to handle arbitrary function export.
  * Add options handling for callable without an output being specified.
  * Can compile with COM interface. Added example using COM interface.
  * Use the IR Ptr type instead of hack in C++ emit for interfaces.
  * Fix issue with outputting the COM call when ptr is used.
  * Fix crash issue on compilation failure.
* New language feature: basic error handling. (#2253) Yong He 2022-06-01
  * New language feature: basic error handling.
  * Fix.
  * Fix `tryCall` encoding according to code review.
  Co-authored-by: Yong He <yhe@nvidia.com>
* Remove LivenessLocation (#2248) jsmall-nvidia 2022-05-26
  * #include an absolute path didn't work - because paths were taken to always be relative.
  * Remove the need for LivenessLocation.
  * Use LivenessMode.
  * Fix some comments.
  Co-authored-by: Yong He <yonghe@outlook.com>
* Support for querying which parameters are used in emitted code (#2239) Alexey Panteleev 2022-05-18
  See https://github.com/shader-slang/slang/issues/2213
* Refactor prelude emit (#2236) jsmall-nvidia 2022-05-17
  * #include an absolute path didn't work - because paths were taken to always be relative.
  * Refactor how prelude output works in emit.
  * Small improvement to emit output.
  * Move around comment on target specific language directives based on review.
  Co-authored-by: Theresa Foley <10618364+tangent-vector@users.noreply.github.com>
* Liveness tracking with phis (#2233) jsmall-nvidia 2022-05-17
  * #include an absolute path didn't work - because paths were taken to always be relative.
  * Refactor Liveness pass, such that locations can be found independently of setting up ranges.
  * Refactor around different stages of liveness span analysis.
  * WIP Take into account PHI temporaries in liveness tracking.
  * WIP First pass of PHI liveness refactor.
  * Add BlockIndex.
  * WIP Refactor phi liveness around inst runs.
  * More improvements around liveness tracking.
  * Bug fixes. Special handling to not add multiple ends, at starts of blocks and after accesses.
  * Fix test output.
  * Use IRInsertLoc to track insertion point.
  * Liveness markers don't have side effects.
  * Fix typo in liveness test.
  * Small improvements around setting SuccessorResult.
  * Fix memory issue around reallocation and RAIIStackArray. Update test output.
  * Update test output for liveness.slang.
  * Fix typo in SuccessorResult blockIndex.
  * Small tidy up.
  * Handle the root start block, correctly scoping the run.
  * Split BlockInfo into 'Root' and 'Function'. Store successors as BlockIndices.
  * Tidy up around liveness tracking.
  * Add head/tail support to ArrayViews. Use Count where appropriate. Use head/tail in liveness impl.
* Initial support for COM interface in host code. (#2230) Yong He 2022-05-10
  Co-authored-by: Yong He <yhe@nvidia.com>
  Co-authored-by: Theresa Foley <10618364+tangent-vector@users.noreply.github.com>
* Use IR pass to eliminate phi nodes (#2226) Theresa Foley 2022-05-10
  * Use IR pass to eliminate phi nodes

  "Phi nodes" are one of the key contrivances that makes SSA (Static Single Assignment) form work. Because SSA is so great for compiler IRs, we kind of need to deal with phi nodes, but they also get in the way because they don't have a direct analog in most lower-level machine ISAs or execution models, nor in most of the high-level languages a transpiler wants to emit. As a result a compiler like ours needs to be able to eliminate the phi nodes from a program as part of generating output code.

  (For any clever people noting that SPIR-V supports phi nodes directly: yes, it does. It doesn't need to and it probably *shouldn't*. Anybody involved in the decision-making knows my reasoning, and anybody else should feel free to ask me if they want the lecture. Anyway...)

  The basic idea of eliminating phi nodes is simple enough. We replace each phi node with a temporary variable. Uses of the phi use values loaded from the temporary. The operation of the phi itself (assigning a value based on the branch taken) amounts to an assignment into the temporary.

  Previously, the Slang compiler dealt with phi nodes very late in the process of generating code: in the middle of emitting strings of source code in a high-level language like HLSL or GLSL. Doing the work that late in compilation has two big drawbacks:

  1. Our ability to emit clean and/or optimal code is limited because we may not be able to make certain changes to the IR, or because we cannot make use of additional information like a dominator tree that might be available at other points in compilation.

  2. Any other IR passes that relate to temporary variables won't be able to see the variables that we generate for phi nodes. This could raise issues with correctness (e.g., if we want to compute live-range information for *all* temporary variables), or performance (we have no way to run additional IR optimization passes after phis are eliminated).

  This change addresses these problems by making the elimination of phi nodes an explicit IR pass. Additional optimizations can easily be run after this pass (although we'd need to be careful not to run passes that could end up introducing new phis). The pass makes use of the information available to it to try to produce code that will emit to "clean" HLSL/GLSL.

  The core of the pass is in `slang-ir-eliminate-phis.cpp`, and is heavily commented, so I won't describe the approach in detail here. There are two related issues that came up, though:

  First, it turned out that our emit logic for local variables (`IRVar` instructions) wasn't using the function we'd defined named `emitVar()`. One worrying consequence of that oversight was that the `precise` modifier would impact generated HLSL/GLSL for variables that turned into SSA values (including phi nodes), but *not* for local variables that had not been SSA'd (or that had been SSA'd and then de-SSA'd). This change also fixes that bug; it is unclear how widespread the impact of the original issue might be.

  Second, generating explicit IR temporaries for phi nodes exposed a pre-existing bug in the `slang-ir-restructure-scoping` pass. That pass basically detects cases where we have an instruction `I` with a use `U` such that the use follows the rules of SSA form ("def dominates use," meaning `I` dominates `U`), but does not follow the more restrictive scoping rules of high-level-language output (where a value computed "inside" a loop is not automatically visible to code outside the loop just because it dominates that code). That pass did not correctly account for the case where `I` was a temporary variable. It seems that case could not arise before now because we didn't have any passes that would move `var`, `load`, or `store` operations out of the basic block they started in. The fix for that pass was relatively simple, and will make the whole thing more robust in case we add more aggressive optimizations later.

  * fixup: expected test output
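
  The temporary-variable scheme that the phi-elimination commit message describes (one temporary per phi, a store on each incoming edge, a load where the phi used to be) can be illustrated on a toy control-flow graph. The following is a minimal Python sketch and not the code in `slang-ir-eliminate-phis.cpp`: the `Phi` and `Block` classes, the string-based instructions, and the `eliminate_phis` function are hypothetical names invented for this illustration, and the sketch makes no attempt at the "clean" output the real pass aims for.

```python
from dataclasses import dataclass, field


@dataclass
class Phi:
    dest: str       # SSA name defined by the phi
    incoming: dict  # predecessor block name -> SSA value name


@dataclass
class Block:
    name: str
    phis: list = field(default_factory=list)
    # Instructions are plain strings; the last one is assumed to be the terminator.
    insts: list = field(default_factory=list)


def eliminate_phis(blocks):
    """Replace each phi with a temporary variable (toy model, hypothetical API).

    Assumes blocks[0] is the entry block and every block ends in a terminator.
    """
    by_name = {b.name: b for b in blocks}
    entry = blocks[0]
    for block in blocks:
        loads = []
        for phi in block.phis:
            tmp = f"tmp_{phi.dest}"
            # Declare the temporary once, at the top of the entry block.
            entry.insts.insert(0, f"var {tmp}")
            # The phi's own operation becomes a store in each predecessor,
            # placed just before that predecessor's terminator.
            for pred_name, value in phi.incoming.items():
                pred = by_name[pred_name]
                pred.insts.insert(len(pred.insts) - 1, f"store {tmp}, {value}")
            # Uses of the phi now see a value loaded from the temporary.
            loads.append(f"{phi.dest} = load {tmp}")
        block.insts[:0] = loads
        block.phis = []
    return blocks


# A diamond: entry branches to `then` or `else`, which both jump to `merge`.
blocks = eliminate_phis([
    Block("entry", insts=["br_if c, then, else"]),
    Block("then", insts=["a = 1", "br merge"]),
    Block("else", insts=["b = 2", "br merge"]),
    Block("merge", phis=[Phi("x", {"then": "a", "else": "b"})], insts=["ret x"]),
])

for b in blocks:
    print(b.name, b.insts)
```

  Because the result is ordinary `var`/`load`/`store` instructions, any later pass that reasons about temporaries would see these variables too, which is the second drawback the commit message says a late, emit-time approach cannot address.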
* Output SPIR-V lifetimes (#2221) jsmall-nvidia 2022-05-05
  * #include an absolute path didn't work - because paths were taken to always be relative.
  * WIP tracking liveness.
  * Skeleton around adding liveness instructions.
  * Calling into liveness tracking logic. Adds live start to var insts.
  * Liveness macros have initial output.
  * Looking at different initialization scenarios.
  * Some discussion around liveness.
  * WIP for working out liveness end.
  * WIP Updated liveness using use lists.
  * Is now adding liveness information
  * Some small fixes.
  * WIP around liveness.
  * Seems to output liveness correctly for current scenario.
  * Tidy up liveness code.
  * Update comments around liveness to current status.
  * Small fixes to liveness test.
  * Add support for call in liveness analysis.
  * Improve liveness example with array access.
  * Small updates to comments.
  * Disable liveness test because of inconsistencies with output on CI system.
  * First pass support for GLSL SPIR-V liveness support.
  * Add the SPIRVOpDecoration.
  * Fix signature for OpLivenessStop.
  * Simplified by having a Kind type.
  * Fix some issues brought up in PR.
  * Rename liveness instructions.
  * Merge with var-lifetime. Small improvements.
  * Improvements to the documentation/naming in GLSL liveness pass. Add comment around possible improvements to the liveness pass.
* Preliminary Liveness tracking (#2218) jsmall-nvidia 2022-05-05
  * #include an absolute path didn't work - because paths were taken to always be relative.
  * WIP tracking liveness.
  * Skeleton around adding liveness instructions.
  * Calling into liveness tracking logic. Adds live start to var insts.
  * Liveness macros have initial output.
  * Looking at different initialization scenarios.
  * Some discussion around liveness.
  * WIP for working out liveness end.
  * WIP Updated liveness using use lists.
  * Is now adding liveness information
  * Some small fixes.
  * WIP around liveness.
  * Seems to output liveness correctly for current scenario.
  * Tidy up liveness code.
  * Update comments around liveness to current status.
  * Small fixes to liveness test.
  * Add support for call in liveness analysis.
  * Improve liveness example with array access.
  * Small updates to comments.
  * Disable liveness test because of inconsistencies with output on CI system.
  * Fix some issues brought up in PR.
  * Rename liveness instructions.
* Support `[DllImport]` (#2181) Yong He 2022-04-12
  * Support `[DllImport]`
  * Fix.
  * Fix.
  * Fix array type emit in cpp.
  * Fix.
  * Fix.
  * Fix
  Co-authored-by: Yong He <yhe@nvidia.com>
* Refactor: eliminate BackEndCompileRequest (#2178)Theresa Foley2022-04-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | An earlier refactoring pass over the compiler codebase split the type that had been called `CompileRequest` into three distinct pieces: * `FrontEndCompileRequest` which was supposed to own state and options related to running the compiler front end and producing IR + reflection (e.g., what translation units and source files/strings are included). * `BackEndCompileRequest` which was supposed to own state and options related to running the compiler back end to translate the IR for a `ComponentType` (program) into output code. (Note that the `BackEndCompileRequest` was conceived of as orthogonal to the `TargetRequest`s, which store per-target and target-specific options.) * `EndToEndCompileRequest` which was an umbrella object that owns separate front-end and back-end requests, plus any state that is only relevant when doing a true end-to-end compile (such as the kinds of compiles initiated with `slangc`). As originally conceived, the only state that this type was supposed to own was stuff related to "pass-through" compilation, as well as state related to writing of generated code to output files. That refactoring work was very useful at the time, because it allowed us to "scrub" the back end compilation steps to remove all dependencies on front-end and AST state (this was important for our goals of enabling linking and codegen from serialized Slang IR). At this point, however, it is clear that the hierarchy that was built up serves very little purpose: * The `BackEndCompileRequest` type is only used in two places: * As part of an `EndToEndCompileRequest`, where the settings on the `BackEndCompileRequest` can be configured, but only through the `EndToEndCompileRequest` * As part of on-demand code generation through the `IComponentType` APIs. 
In this case, the settings stored on the `BackEndCompileRequest` are not accessible to the application at all, and will always use their default values, so that instantiating a "request" object doesn't really make any sense. * The `FrontEndCompileRequest` type has a similar situation: * Front-end compilation as part of an `EndToEndCompileRequest` supports user configuration of `FrontEndCompileRequest` settings, but only through the `EndToEndCompileRequest` * Front-end compilation triggered by an `import` or a `loadModule()` call does not support user configuration of settings at all. It will always derive all relevant settings from those on the session ("linkage"). In addition, subsequent changes have been made to the compiler that show a bit of a "code smell" and/or forward-looking worries for this decomposition: * In some cases we've had to add the same setting to multiple types in the breakdown (front-end, back-end, end-to-end, linkage, target, etc.) which makes it harder for us to validate that all the possible mixtures of state work correctly. * Related to the above, in some cases we have manual logic that copies state from one of the objects in the breakdown to another, in order to ensure that the user's intention is actually followed. * As a forward-looking concern, it seems that developers have sometimes added new configuration options and state to places that don't really make sense according to the rationale of the original decomposition (e.g., we probably don't want to have a lot of state that is only available via end-to-end requests, given that the API structure is meant to push users *away* from end-to-end compiles). As a result of all of the above, I've been planning a large refactor with the following big-picture goals: * Eliminate `BackEndCompileRequest` * Move all relevant state/options from the back-end request to the end-to-end request, since that is the only place they could be set anyway. 
* Introduce a transient "context" type to be used for the duration of code generation that serves the main functions that back-end requests really served in the codebase * Make `EndToEndCompileRequest` be a subclass of `FrontEndCompileRequest` * Consider adding a transient "context" type for front-end compiles that can be used in `import`-like cases rather than needing a full front-end request object. If this works, then eliminate `FrontEndCompileRequest` and be back to a world with just a single `CompileRequest` type * Move *all* compiler configuration options to a distinct type (named something like `CompilerConfig` or `CompilerOptions` or whatever) which stores settings as key-value pairs, and has a notion of "inheritance" such that one configuration can extend or build on top of another. Make all the relevant types use this catch-all structure instead of redundantly storing flags in many places. This change deals with the first of those bullets: removal of `BackEndCompileRequest`. The addition of the `CodeGenContext` type is perhaps an unnecessary additional step, but making that change helps clean up a bunch of the code related to per-target code generation, so I think it is the right choice. Co-authored-by: Yong He <yonghe@outlook.com>
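The last goal in the message above (one options type that stores settings as key-value pairs and lets one configuration build on another) can be illustrated with a small C++ sketch. This is only an illustration under stated assumptions: `OptionSet`, its methods, and the option keys are hypothetical names invented here, not types or settings from the Slang codebase.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <optional>
#include <string>
#include <utility>

// Hypothetical sketch: `OptionSet` is an invented name, not a Slang type.
// Settings are stored as key-value pairs, and a set may "inherit" from a
// parent set that supplies values for keys it does not define itself.
class OptionSet
{
public:
    explicit OptionSet(std::shared_ptr<const OptionSet> parent = nullptr)
        : m_parent(std::move(parent))
    {
    }

    // Set (or override) a value in this set only; the parent is untouched.
    void set(const std::string& key, const std::string& value)
    {
        m_values[key] = value;
    }

    // Look the key up locally first, then walk up the parent chain.
    std::optional<std::string> find(const std::string& key) const
    {
        for (const OptionSet* s = this; s != nullptr; s = s->m_parent.get())
        {
            auto it = s->m_values.find(key);
            if (it != s->m_values.end())
                return it->second;
        }
        return std::nullopt;
    }

private:
    std::shared_ptr<const OptionSet> m_parent;
    std::map<std::string, std::string> m_values;
};
```

In this sketch a per-target set constructed with a session-level set as its parent would see the session's values by default and override only the keys it sets itself, which avoids copying the same flag between several objects.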
* Allow slangc to generate exe from .slang file. (#2170)Yong He2022-03-28
|
* Improved SCCP, inlining and resource specialization passes, legalize ↵Yong He2022-02-25
| | | | `ImageSubscript` for GLSL (#2146)
* Add target option to force `scalar` layout for storage buffers. (#2135)Yong He2022-02-17
| | | Co-authored-by: Yong He <yhe@nvidia.com>
* Various gfx fixes. (#2132)Yong He2022-02-16
| | | | | | | | | | | | | | | * Various gfx fixes. * Fix test case. * Fix crash. * Trigger build * Trigger build 2 * Fix vulkan unit tests. Co-authored-by: Yong He <yhe@nvidia.com>
* Output of IR ids as command line option (#2043)jsmall-nvidia2021-12-07
| | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * WIP control of dump options. * Removed SourceManager for IRDumpOptions * Arm aarch64 debug connection timeout - as CI timed out.
* `reinterpret` and 16-bit value packing. (#1933)Yong He2021-09-09
| | | | | | | | | * `reinterpret` and 16-bit value packing. * Update `half-texture` cross-compile test reference result. * Revert inadvertent reformatting of slang-ir-inst-defs.h Co-authored-by: Yong He <yhe@nvidia.com>
* Further implementation of SPIRV direct emit. (#1920)Yong He2021-08-12
| | | | | | | | | | | | | | * Further implementation of SPIRV direct emit. This change implements: - Struct, Vector, Matrix and Unsized Array types. - Basic arithmetic opcodes, vector construct, swizzle etc. - getElementPtr, getElement, fieldAddress, extractField. - SPIRV target intrinsics with SPIRV asm code in stdlib. - RWStructuredBuffer and StructuredBuffer. - Pointer storage class propagation. - Control flow. * Fix.
* Fix a few issues around opaque types as outputs (#1918)Theresa Foley2021-08-11
| | | | | | | | | | | | | | | | | | | | | | | | | * Fix a few issues around opaque types as outputs Slang and HLSL support opaque types (textures, buffers, samplers, etc.) as members of `struct`s, mutable local variables, function results, and `out`/`inout` parameters. GLSL and SPIR-V do not. In order to translate Slang code over to GLSL/SPIR-V we use a variety of passes that seek to eliminate all of the above use cases and produce code that only uses opaque types in the limited ways that GLSL/SPIR-V allow. This change relates to the passes that deal with function results and `out`/`inout` parameters. There are two basic changes here: 1. The `specializeResourceOutputs` pass was only dealing with resource (texture/buffer) types. This change updates it to process sampler types as well. 2. The sequencing of the passes made it possible that an opaque-typed local variable might be left around after `specializeResourceOutputs`, which would mean the code is still invalid for GLSL/SPIR-V. This change adds an additional SSA-formation pass which would eliminate any opaque-type local variables whose lifetimes were made simple enough by the optimizations. Together these changes fix a problem-case user shader that was failing to compile for Vulkan. * Update slang-emit.cpp Fix typo 'reuslt' * Update slang-emit.cpp Comment change to re-trigger CI build. Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>
* Enable reading OptiX SBT records via uniform parameters on ray tracing entry ↵Nathan V. Morrical2021-08-10
| | | | | | | points (#1917) * optix SBT record data can now be accessed using uniform parameters on ray tracing entry points * Update slang-emit.cpp
* Work to mitigate SPIR-V bloat (#1914)Theresa Foley2021-07-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Work to mitigate SPIR-V bloat SPIR-V is not an especially compact format, but some patterns in how Slang generates code and then runs it through `spirv-opt` lead to many redundant field-by-field copy operations being emitted. This change attempts to address some of the resulting bloat from the Slang side of things. Note: experimentation shows that the bloat is less pronounced when running either *no* SPIR-V optimizations or *full* SPIR-V optimizations, so it is also likely that the bloat should be addressed by changing which `spirv-opt` passes the Slang compiler runs in default (`-O1`) builds. Such changes should come as a distinct pull request. This change primarily does two things: First, the code generation strategy for passing arguments to `out` and `inout` parameters has been changed. In the past, the compiler would *always* copy the argument value into a temporary, then pass the address of the temporary, and then write back the value after the call. The new code generation strategy attempts to identify when an argument value already has a simple address in memory and passes that address directly when possible. This eliminates many copy operations that occur before/after calls to functions with `out`/`inout` parameters. Second, we introduce an IR optimization pass that detects call sites where the entire contents of a buffer (usually a constant buffer) is being passed to a callee function, such that many bytes are loaded and then passed even if only very few are used in the callee. The pass moves the load operations from the caller to a specialized version of the callee where possible (e.g., when the constant buffer in question is a global shader parameter). Doing this eliminates another major category of copies. 
Notes: * The IR lowering logic is complicated by the fact that several kinds of l-values (values that are usable as the destination of assignment, or for `out`/`inout` arguments) are not actually addressable. An easy example is a non-contiguous swizzle like `v.xwz` on a `float4`, where the value occupies 12 bytes, but not 12 consecutive bytes with a single address. There are many more corner cases like that and the IR lowering pass carries a lot of complexity to deal with them. A more systematic overhaul is due some time soon. * The IR representation of `out` and `inout` parameters deserves some careful scrutiny when making these kinds of changes. The official semantics of `inout` in HLSL has been "copy in copy out" (and `out` is just "copy out") which is observably different from any solution that passes in the address of an l-value directly. By making this change we are saying that Slang's semantics are not precisely those of legacy HLSL, and that our semantics for `inout` parameters are closer to those of `inout` in Swift or of a mutable borrow in Rust. In the Swift case the implementation can freely pass the underlying storage of an l-value or the address of a temporary, and valid programs may not observe the difference. It is thus illegal to observe the value in a storage location while a mutation to that location is "in flight." All of this is way more detailed and technical than 99% of Slang users will ever care about, but importantly it gives us semantic cover to eliminate these copies in the IR, and also to emit output C++ code that implements `out` and `inout` as by-reference parameter passing. * There was an existing generic pass for specializing functions based on call sites that uses a "template method" style of pattern to customize its behavior. 
That pass needed to be generalized to handle this use case because it had previously operated on the assumption that the "desire" to specialize a callee function must be driven by the parameter declarations of that function, and not on the argument values passed in. The code has been slightly refactored to allow the policy for specialization to consider both parameters and arguments. * Unsurprisingly, a bunch of the GLSL (and thus SPIR-V) generated has changed with this work, so several baseline `.slang.glsl` files needed to be updated. * This change is incomplete in that it does not address broader cases of buffer loads, including both partial loads from constant buffers (just loading one field, but a field that uses a "large" structure type), and loads from multi-element buffers (a load from a structured buffer where the element type is "large"). The main question in each of those cases is how to define how "large" a structure needs to be before we decide to try and sink loads into callee functions like this. In the worst case, sinking loads in this way may actually create *more* memory traffic (because the same values get loaded in multiple callee functions). * fixup: run premake * fixup: typo
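The observable difference between "copy in copy out" and by-reference passing described in the notes above can be shown with a small C++ sketch. It is a hand-written illustration, not compiler output: `g_value`, `callee`, `callByReference`, and `callCopyInCopyOut` are hypothetical names, and the callee deliberately reads the caller's storage while the mutation is "in flight", which is exactly the kind of program the relaxed semantics treat as invalid.

```cpp
#include <cassert>

// Hypothetical caller-side storage that is passed as an `inout` argument.
static int g_value = 10;

// The callee mutates its parameter and then observes the caller's storage
// directly, while the mutation is still "in flight".
int callee(int& x)
{
    x += 1;
    return g_value;
}

// By-reference lowering: the address of the l-value is passed directly,
// so the callee sees its own write when it reads g_value.
int callByReference()
{
    return callee(g_value);
}

// Copy-in/copy-out lowering: the argument is copied into a temporary,
// the temporary is passed, and the result is written back after the call.
int callCopyInCopyOut()
{
    int tmp = g_value;       // copy in
    int observed = callee(tmp);
    g_value = tmp;           // copy out
    return observed;
}
```

Both lowerings leave `g_value` with the same final value; only the value observed during the call differs. A program that never inspects the storage mid-call cannot tell the two apart, which is what allows the copies to be removed.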
* Various Fixes to gfx, reflection and emit. (#1867)Yong He2021-06-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Various Fixes to gfx, reflection and emit. - Fix GLSL emit to properly output `*bitsTo*` functions for `IRBitCast` insts. - Add line directive mode setting for `ISession`. - Extend `TypeLayout::getElementStride` to handle `VectorType` case. - Fix the D3D12 implementation of `IDevice::readBufferResource` to copy only the requested bytes out. - Fix `render-test` to use the `ISession` from `gfx` instead of creating its own `ISession` to make sure `gfx` and `render-test` agree on WitnessTable and RTTI IDs. - Extend `render-test` to support filling vector and matrix values in the new `set x = ...` TEST_INPUT syntax. - Add a `dynamic-dispatch-15` test case to make sure packing / unpacking works correctly across all targets, and to make sure render-test's RTTI/WitnessTable ID filling logic is correct for non-trivial cases. * Remove default-major test * Fix cyclic reference in `ExtendedTypeLayout`. * Move `lineDirectiveMode` setting to `TargetDesc`. Add `structureSize` to `TargetDesc` and `SessionDesc` for future binary compatibility. * Cleanup. Co-authored-by: Yong He <yhe@nvidia.com>
* Glslang refactor bugfix (#1863)jsmall-nvidia2021-05-28
| | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * Fix issue with SLANG_ENABLE_GLSLANG_SUPPORT * Update expected output from glslang-error.glsl * Fix bug in glsl disassembly. * Make ExtensionTracker available even if source is not emitted. * Only explicitly set extension tracker based on capability bits, if we are in pass through. * Small simplification of invoke sourceEmit.
* Fix a bug in struct inheritance (#1861)T. Foley2021-05-27
| | | | | | | | | | | | | | | | | | | During lowering from AST to IR, the Slang compiler translates code that uses `struct` inheritance: ```hlsl struct Base { int a; } struct Derived : Base {} ``` into code where the inheritance relationship is "witnessed" by a simple field: ```hlsl struct Base { int a; } struct Derived { Base __anonymous_field__; } ``` The underlying bug here is that the `__anonymous_field__` that the compiler generated during IR lowering was not being given any linkage decorations (no mangled name). As a result, if multiple separately-compiled modules all access that field they could disagree on its identity as an IR instruction. This could lead to output code being generated where the declaration of `__anonymous_field__` uses one IR instruction, but accesses use another. This change includes a fix for the issue, and a test that serves as a reproducer for the original problem.
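The lowering described in the message above has a direct C++ analogue, sketched here for readers more familiar with C++ than with Slang IR. This is an assumption-laden illustration rather than what the compiler emits: `DerivedSource`, `DerivedLowered`, and `anonymousField` are hypothetical names (the last stands in for `__anonymous_field__`, which is avoided because double-underscore identifiers are reserved in C++).

```cpp
#include <cassert>

struct Base
{
    int a;
};

// What the user writes: Derived inherits from Base.
struct DerivedSource : Base
{
};

// What the lowering produces, conceptually: the inheritance relationship
// is "witnessed" by an ordinary field holding the base-type value.
struct DerivedLowered
{
    Base anonymousField;
};

// An access to an inherited member in the source form...
int readSource(const DerivedSource& d)
{
    return d.a;
}

// ...becomes an access through the generated field after lowering.
int readLowered(const DerivedLowered& d)
{
    return d.anonymousField.a;
}
```

Every access to an inherited member goes through the generated field, so every module that touches it has to agree on that field's identity. That agreement is what the missing linkage decoration broke.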
* [gfx] Support StructuredBuffer<IInterface>. (#1851)Yong He2021-05-21
| | | Co-authored-by: T. Foley <tfoleyNV@users.noreply.github.com>