path: root/tests/experimental
Commit message    Author    Age
* Support per field matrix layout (#3101)    Yong He    2023-08-14
    * Support per field matrix layout
    * Fix warnings.
    * Fix.
    * Fix tests.
    * Fix spiv gen.
    * Fix.
    * More test fixes.
    * Fix.
    * Run only GPU tests on self-hosted servers.
    * Remove -use-glsl-matrix-layout-modifier.
    * Fix.
    ---------
    Co-authored-by: Yong He <yhe@nvidia.com>
* Disable code motion for expensive insts (call & div) (#3042)    Sai Praveen Bangaru    2023-08-03
    * Disable code motion for expensive insts (call & div)
      The current redundancy removal pass does not consider control-flow within loops and as a result can sometimes move dynamic dispatch code outside their switch blocks, if they are nested in a single-iter-loop.
    * Update liveness.slang.expected
* Use scratchData on `IRInst` to replace HashSets. (#2978)    Yong He    2023-07-12
    * Use scratchData on `IRInst` to replace HashSets.
    * Update test results.
    * Initialize scratchData.
    * Update autodiff documentation.
    * Use enum instead of bool.
    ---------
    Co-authored-by: Yong He <yhe@nvidia.com>
* Be lenient on same-size unsigned->signed conversion. (#2913)    Yong He    2023-06-01
    * Be lenient on same-size unsigned->signed conversion.
    * Fix tests.
    * Use 250.
    * wip
    * Fix.
    * Fix tests.
    * Fix.
    ---------
    Co-authored-by: Yong He <yhe@nvidia.com>
* Fix function side-effectness prop logic. (#2875)    Yong He    2023-05-09
* SSA Register Allocation improvements. (#2857)    Yong He    2023-04-28
    * SSA Register Allocation improvements.
    * Fix.
    * Rename `Use`->`UseOrPseudoUse`.
    ---------
    Co-authored-by: Yong He <yhe@nvidia.com>
* For C-like targets, emit resource declarations before other globals (#2843)    Sai Praveen Bangaru    2023-04-26
    * For C-like targets, emit resource declarations before other globals
    * Remove unused tests
* Fix optimization pass not converging. (#2725)    Yong He    2023-03-23
    * Fix optimization pass not converging.
    * Fix.
    * Fix tests.
    ---------
    Co-authored-by: Yong He <yhe@nvidia.com>
* More fixes for reverse-mode on complicated loops (#2675)    Sai Praveen Bangaru    2023-02-27
    * Multiple fixes to get various loop tests to pass.
    * Create reverse-nested-loop.slang
    * Fix for variables becoming inaccessible during cfg normalization
    * Removed comments and moved break-branch-normalization to eliminateMultiLevelBreaks
    * Fix.
    * Override liveness tests
* More control flow simplifications. (#2673)    Yong He    2023-02-24
    * More control flow and Phi param simplifications.
    * Fix.
    * Fix gcc error.
    * Fix.
    * More IR cleanup.
    * Fix bug in phi param dce + ifelse simplify.
    * Propagate and DCE side-effect-free functions.
    * Enhance CFG simplification to remove loops with no side effects.
    * Fix.
    * Fixes.
    * Fix tests. Add [__AlwaysFoldIntoUseSite] for rayPayloadLocation.
    * More cleanup.
    * Fixes.
    * Fix.
    ---------
    Co-authored-by: Yong He <yhe@nvidia.com>
* Arithmetic simplifications and more IR clean up logic. (#2632)    Yong He    2023-02-07
* Register allocation during phi elimination. (#2613)    Yong He    2023-01-27
    * Register allocation during phi elimination.
    * Enhance the test case.
    * Cleanup line breaks in test case.
    * remove unnecessary line break changes.
    * More cleanups.
    ---------
    Co-authored-by: Yong He <yhe@nvidia.com>
* Full address insts elimination for backward autodiff. (#2604)    Yong He    2023-01-23
    Co-authored-by: Yong He <yhe@nvidia.com>
* Fixes for crash when inlining at global scope (#2593)    Theresa Foley    2023-01-14
    * Fixes for crash when inlining at global scope

    Recent changes to the way inlining is implemented in the Slang compiler have broken certain scenarios involving `static const` declarations.

    The basic problem is that the initial-value expression for a `static const` gets lowered into IR code at the global scope of a module, and if that code includes `call`s to stdlib operations marked `forceInlineEarly`, then we end up trying to apply inlining to code at module scope. The current inlining operation assumes that all `call`s are in basic blocks, and that the correct way to do inlining involves splitting those blocks.

    This change adds logic to detect when the callee at a call site to be inlined consists of a single basic block ending in a `return`, and in that case it invokes specialized inlining logic that doesn't split basic blocks and doesn't need to care if the original `call` is in a basic block. Thus we are able to inline calls to single-basic-block `forceInlineEarly` functions called as part of the initialization for global-scope `static const` variables.

    This logic does *not* solve the problem of calls to multi-block `forceInlineEarly` functions from the global scope. Such calls cannot really be inlined.

    A secondary problem that arises when inlining such calls is that the callee might include local temporaries (`var` instructions) that are read and written (`load`s and `store`s), and none of those instructions should be allowed at the global scope.

    In the case of the functions being inlined here, the `load`/`store` operations are superfluous, and should be cleaned up by our SSA pass. The only reason that they seem to *not* be getting cleaned up in the case that has been triggering crashes is that the callee is a generic. The current logic for the SSA pass was skipping the bodies of generic functions, so they would not be cleaned up.

    This change enables the SSA pass to apply to the bodies of generic functions, and also ensures that SSA cleanups are applied *before* any `forceInlineEarly` functions get inlined.

    * fixup: liveness test outputs
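The single-basic-block inlining path described in the message for #2593 can be sketched roughly as follows. This is a minimal illustration on a hypothetical toy IR, not Slang's implementation: `Inst`, `Callee`, `InlineResult`, `isSingleBlockReturn`, and `inlineSingleBlockCall` are all invented names. The point it shows is that when the callee is one block ending in `return`, the body can be cloned with parameters replaced by arguments, and the call replaced by the returned value, without splitting any block at the call site.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical toy IR; none of these types exist in Slang.
struct Inst {
    std::string op;                     // e.g. "add", "mul", "return"
    std::string result;                 // value defined (empty for "return")
    std::vector<std::string> operands;  // values used
};

struct Callee {
    std::vector<std::string> params;
    std::vector<std::vector<Inst>> blocks;  // each block is a list of insts
};

struct InlineResult {
    std::vector<Inst> insts;  // cloned body, to be placed before the call
    std::string value;        // value that replaces uses of the call
};

// The special case: exactly one block, terminated by a `return`.
bool isSingleBlockReturn(const Callee& c) {
    return c.blocks.size() == 1 && !c.blocks[0].empty() &&
           c.blocks[0].back().op == "return";
}

// Clone the callee body with parameters mapped to arguments. No block is
// split, so the call site does not have to live inside a basic block.
bool inlineSingleBlockCall(const Callee& callee,
                           const std::vector<std::string>& args,
                           const std::string& prefix,
                           InlineResult& out) {
    if (!isSingleBlockReturn(callee) || args.size() != callee.params.size())
        return false;  // multi-block callees are not handled by this path

    std::map<std::string, std::string> env;  // callee name -> caller name
    for (std::size_t i = 0; i < args.size(); ++i)
        env[callee.params[i]] = args[i];

    auto remap = [&env](const std::string& name) -> std::string {
        auto it = env.find(name);
        return it == env.end() ? name : it->second;
    };

    for (const Inst& inst : callee.blocks[0]) {
        if (inst.op == "return") {
            out.value = inst.operands.empty() ? "" : remap(inst.operands[0]);
            break;
        }
        Inst clone{inst.op, prefix + inst.result, {}};
        for (const std::string& operand : inst.operands)
            clone.operands.push_back(remap(operand));
        env[inst.result] = clone.result;  // later uses see the cloned value
        out.insts.push_back(clone);
    }
    return true;
}
```

A caller would insert `out.insts` where the `call` was and rewrite uses of the call to `out.value`; a callee with more than one block makes the function return `false`, mirroring the limitation noted in the message.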
* Small IR cleanups. (#2441)    Yong He    2022-10-11
* Support multi-level break + single-return conversion + general inline. (#2436)    Yong He    2022-10-10
    * Support multi-level break.
    * Single return.
    * Add test for inlining `void` return-type functions.
    Co-authored-by: Yong He <yhe@nvidia.com>
* Warning on lossy implicit casts. (#2367)    Yong He    2022-08-17
    * Warning on bool to float conversion.
    * Fix test cases.
    * Improve.
    * LanguageServer: don't show constant value for non constant variables.
    * Fix tests.
    * Fix warnings in tests.
    Co-authored-by: Yong He <yhe@nvidia.com>
* Liveness fixes and improvements (#2270)    jsmall-nvidia    2022-06-13
    * #include an absolute path didn't work - because paths were taken to always be relative.
    * Use TerminatedUnownedStringSlice for literals in output C++.
    * Remove Escape/Unescape functions used in slang-token-reader.cpp. Add target type of 'host-cpp' etc to map to the target types.
    * Fix some corner cases around string encoding.
    * Added unit test for string escaping. Fixed some assorted escaping bugs.
    * Updated test output.
    * Added decode test.
    * Stop using hex output, to get around 'greedy' aspect. Use octal instead.
    * Added HostHostCallable. Small changes to use ArtifactDesc/Info instead of large switches.
    * Fix C++ emit to handle arbitrary function export.
    * Add options handling for callable without an output being specified.
    * Can compile with COM interface. Added example using com interface.
    * Use the IR Ptr type instead of hack in C++ emit for interfaces.
    * Fix issue with outputting the COM call when ptr is used.
    * Fix crash issue on compilation failure.
    * Add support for __global.
    * Added `ActualGlobalRate`. Added special handling around globals and COM interfaces. Tested out in cpu-com-example.
    * Fix typo in NodeBase.
    * Support for accessing globals by name working.
    * Bounds checking for C++. Improved bounds checks for CUDA.
    * Check that actual global initialization is working.
    * Fix typo.
    * Refactor the com replacement such that it doesn't need a cache or do anything special with GlobalVar.
    * Fix typo in CUDA prelude.
    * Remove context. Only create replacement if needed.
    * Split out COM host-callable into a unit-test.
    * host-callable com testing on C++ and llvm.
    * Comment around the COM ptr replacement.
    * WIP Zero bound test.
    * Disable com test on vs 32 bit. Fix C++ prelude.
    * Disable 32 bit targets testing com host-callable.
    * For now disable zero index test.
    * Enable bounds checking for CPU/CUDA.
    * Small fixes. Disable CUDA zero index bound fix.
    * Add test result for bound check.
    * Work around for index wrapping issue.
    * Added Fixed array test.
    * Only enable prelude asserts via SLANG_PRELUDE_ENABLE_ASSERT (unless defined by the user)
    * Small fix around instCount.
    * Improve liveness loop handling and tests.
    * Improve liveness comment.
    * More conservative loop handling.
    * Make liveness deterministic to make testing work.
    * Added 'span tidy'. Added some more tests.
    * Simplify span simplification, because could collapse inappropriate spans.
    * Updated liveness with simple loop tracking.
    * Update test results.
    * Small tidy up.
    * Update comments in liveness tests.
    * Improve liveness comments.
    * Loop handling without needing LoopInfo tracking.
    * Improve liveness comments.
    * Small fix around removing uninteresting spans. Improve naming.
* Special handling around return and liveness (#2234)    jsmall-nvidia    2022-05-17
    * #include an absolute path didn't work - because paths were taken to always be relative.
    * Refactor Liveness pass, such that locations can be found independently of setting up ranges.
    * Refactor around different stages of liveness span analysis.
    * WIP Take into account PHI temporaries in liveness tracking.
    * WIP First pass of PHI liveness refactor.
    * Add BlockIndex.
    * WIP Refactor phi liveness around inst runs.
    * More improvements around liveness tracking.
    * Bug fixes. Special handling to not add multiple ends, at starts of blocks and after accesses.
    * Fix test output.
    * Use IRInsertLoc to track insertion point.
    * Liveness markers don't have side effects.
    * Fix typo in liveness test.
    * Small improvements around setting SuccessorResult.
    * Fix memory issue around reallocation and RAIIStackArray. Update test output.
    * Update test output for liveness.slang.
    * Fix typo in SuccessorResult blockIndex.
    * Small tidy up.
    * Handle the root start block, correctly scoping the run.
    * Split BlockInfo into 'Root' and 'Function'. Store successors as BlockIndices.
    * Tidy up around liveness tracking.
    * Add head/tail support to ArrayViews. Use Count where appropriate. Use head/tail in liveness impl.
    * Special handling if return is effectively a live variable.
    * Update test output for improved return handling.
    * Refactor handling of return accesses. Fix issue around liveness starts.
    * Disable release warning for unused method.
    * Some small improvements around liveness pass.
* Liveness tracking with phis (#2233)    jsmall-nvidia    2022-05-17
    * #include an absolute path didn't work - because paths were taken to always be relative.
    * Refactor Liveness pass, such that locations can be found independently of setting up ranges.
    * Refactor around different stages of liveness span analysis.
    * WIP Take into account PHI temporaries in liveness tracking.
    * WIP First pass of PHI liveness refactor.
    * Add BlockIndex.
    * WIP Refactor phi liveness around inst runs.
    * More improvements around liveness tracking.
    * Bug fixes. Special handling to not add multiple ends, at starts of blocks and after accesses.
    * Fix test output.
    * Use IRInsertLoc to track insertion point.
    * Liveness markers don't have side effects.
    * Fix typo in liveness test.
    * Small improvements around setting SuccessorResult.
    * Fix memory issue around reallocation and RAIIStackArray. Update test output.
    * Update test output for liveness.slang.
    * Fix typo in SuccessorResult blockIndex.
    * Small tidy up.
    * Handle the root start block, correctly scoping the run.
    * Split BlockInfo into 'Root' and 'Function'. Store successors as BlockIndices.
    * Tidy up around liveness tracking.
    * Add head/tail support to ArrayViews. Use Count where appropriate. Use head/tail in liveness impl.
* Add support for `spirv_literal` (#2227)    jsmall-nvidia    2022-05-10
    * #include an absolute path didn't work - because paths were taken to always be relative.
    * Add SPIRVLiteralType, to mark types that have spirv_literal in function parameter output.
    * Update test result.
    Co-authored-by: Theresa Foley <10618364+tangent-vector@users.noreply.github.com>
* Use IR pass to eliminate phi nodes (#2226)    Theresa Foley    2022-05-10
    * Use IR pass to eliminate phi nodes

    "Phi nodes" are one of the key contrivances that make SSA (Static Single Assignment) form work. Because SSA is so great for compiler IRs, we kind of need to deal with phi nodes, but they also get in the way because they don't have a direct analog in most lower-level machine ISAs or execution models, nor in most of the high-level languages a transpiler wants to emit. As a result a compiler like ours needs to be able to eliminate the phi nodes from a program as part of generating output code.

    (For any clever people noting that SPIR-V supports phi nodes directly: yes, it does. It doesn't need to and it probably *shouldn't*. Anybody involved in the decision-making knows my reasoning, and anybody else should feel free to ask me if they want the lecture. Anyway...)

    The basic idea of eliminating phi nodes is simple enough. We replace each phi node with a temporary variable. Uses of the phi use values loaded from the temporary. The operation of the phi itself (assigning a value based on the branch taken) amounts to an assignment into the temporary.

    Previously, the Slang compiler dealt with phi nodes very late in the process of generating code: in the middle of emitting strings of source code in a high-level language like HLSL or GLSL. Doing the work that late in compilation has two big drawbacks:

    1. Our ability to emit clean and/or optimal code is limited because we may not be able to make certain changes to the IR, or because we cannot make use of additional information like a dominator tree that might be available at other points in compilation.

    2. Any other IR passes that relate to temporary variables won't be able to see the variables that we generate for phi nodes. This could raise issues with correctness (e.g., if we want to compute live-range information for *all* temporary variables), or performance (we have no way to run additional IR optimization passes after phis are eliminated).

    This change addresses these problems by making the elimination of phi nodes an explicit IR pass. Additional optimizations can easily be run after this pass (although we'd need to be careful not to run passes that could end up introducing new phis). The pass makes use of the information available to it to try to produce code that will emit to "clean" HLSL/GLSL.

    The core of the pass is in `slang-ir-eliminate-phis.cpp`, and is heavily commented, so I won't describe the approach in detail here. There are two related issues that came up, though:

    First, it turned out that our emit logic for local variables (`IRVar` instructions) wasn't using the function we'd defined named `emitVar()`. One worrying consequence of that oversight was that the `precise` modifier would impact generated HLSL/GLSL for variables that turned into SSA values (including phi nodes), but *not* for local variables that had not been SSA'd (or that had been SSA'd and then de-SSA'd). This change also fixes that bug; it is unclear how widespread the impact of the original issue might be.

    Second, generating explicit IR temporaries for phi nodes exposed a pre-existing bug in the `slang-ir-restructure-scoping` pass. That pass basically detects cases where we have an instruction `I` with a use `U` such that the use follows the rules of SSA form ("def dominates use," meaning `I` dominates `U`), but does not follow the more restrictive scoping rules of high-level-language output (where a value computed "inside" a loop is not automatically visible to code outside the loop just because it dominates that code). That pass did not correctly account for the case where `I` was a temporary variable. It seems that case could not arise before now because we didn't have any passes that would move `var`, `load`, or `store` operations out of the basic block they started in. The fix for that pass was relatively simple, and will make the whole thing more robust in case we add more aggressive optimizations later.

    * fixup: expected test output
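The basic idea stated in the message for #2226 (each phi becomes a temporary, uses load from it, and each incoming branch assigns into it) can be sketched on a toy IR as below. This is only an illustration under stated assumptions: the `Branch`, `Block`, and `Function` types and the `eliminatePhis` function are hypothetical, phis are modelled as block parameters, instructions are plain strings, and nothing here reflects how `slang-ir-eliminate-phis.cpp` is actually written.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy IR; none of these types exist in Slang.
struct Branch {
    int target;                       // index of the destination block
    std::vector<std::string> args;    // values passed to the target's phis
    std::vector<std::string> stores;  // assignments executed on this edge
};

struct Block {
    std::vector<std::string> phis;    // phi nodes, modelled as block parameters
    std::vector<std::string> body;    // straight-line instructions, as text
    std::vector<Branch> branches;     // outgoing edges
};

struct Function {
    std::vector<std::string> temps;   // temporaries declared at function scope
    std::vector<Block> blocks;
};

void eliminatePhis(Function& f) {
    // Step 1: the phi's "operation" becomes an assignment on each incoming
    // edge, writing the branch argument into the phi's temporary.
    for (Block& pred : f.blocks) {
        for (Branch& br : pred.branches) {
            const Block& target = f.blocks[br.target];
            for (std::size_t i = 0;
                 i < br.args.size() && i < target.phis.size(); ++i) {
                br.stores.push_back("store tmp_" + target.phis[i] + ", " +
                                    br.args[i]);
            }
            br.args.clear();
        }
    }
    // Step 2: each phi is replaced by a temporary, and the block that owned
    // the phi starts by loading the value back out of that temporary.
    for (Block& b : f.blocks) {
        std::vector<std::string> loads;
        for (const std::string& phi : b.phis) {
            f.temps.push_back("tmp_" + phi);
            loads.push_back(phi + " = load tmp_" + phi);
        }
        b.body.insert(b.body.begin(), loads.begin(), loads.end());
        b.phis.clear();
    }
}
```

For a merge block with one phi `x` fed by `a` from one predecessor and `b` from another, the sketch leaves a single temporary `tmp_x`, one `store` on each incoming edge, and a `load` at the top of the merge block. Because the temporaries are ordinary variables, later passes such as liveness analysis can see them, which is the benefit the message describes.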
* Liveness pass fixes and improvements (#2225)    jsmall-nvidia    2022-05-09
    * #include an absolute path didn't work - because paths were taken to always be relative.
    * Fix for loops within dominator tree. Fix for functions that have no body.
    * Use a count array. Update some comments.
    * Special case handling of the root block, for searching for last access.
    * Enable liveness test with glsl output.
    Co-authored-by: Theresa Foley <10618364+tangent-vector@users.noreply.github.com>
* Preliminary Liveness tracking (#2218)    jsmall-nvidia    2022-05-05
    * #include an absolute path didn't work - because paths were taken to always be relative.
    * WIP tracking liveness.
    * Skeleton around adding liveness instructions.
    * Calling into liveness tracking logic. Adds live start to var insts.
    * Liveness macros have initial output.
    * Looking at different initialization scenarios.
    * Some discussion around liveness.
    * WIP for working out liveness end.
    * WIP Updated liveness using use lists.
    * Is now adding liveness information
    * Some small fixes.
    * WIP around liveness.
    * Seems to output liveness correctly for current scenario.
    * Tidy up liveness code.
    * Update comments around liveness to current status.
    * Small fixes to liveness test.
    * Add support for call in liveness analysis.
    * Improve liveness example with array access.
    * Small updates to comments.
    * Disable liveness test because of inconsistencies with output on CI system.
    * Fix some issues brought up in PR.
    * Rename liveness instructions.