| Commit message (Collapse) | Author | Age |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Support per field matrix layout
* Fix warnings.
* Fix.
* Fix tests.
* Fix spiv gen.
* Fix.
* More test fixes.
* Fix.
* Run only GPU tests on self-hosted servers.
* Remove -use-glsl-matrix-layout-modifier.
* Fix.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
|
|
|
|
| |
* Disable code motion for expensive insts (call & div)
The current redundancy removal pass does not consider control-flow within loops and as a result can sometimes move dynamic dispatch code outside their switch blocks, if they are nested in a single-iter-loop.
* Update liveness.slang.expected
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Use scratchData on `IRInst` to replace HashSets.
* Update test results.
* Initialize scratchData.
* Update autodiff documentation.
* Use enum instead of bool.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Be lenient on same-size unsigend->signed conversion.
* Fix tests.
* Use 250.
* wip
* Fix.
* Fix tests.
* Fix.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
| |
* SSA Register Allocation improvements.
* Fix.
* Rename `Use`->`UseOrPseudoUse`.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
|
|
| |
* For C-like targets, emit resource declarations before other globals
* Remove unused tests
|
| |
|
|
|
|
|
|
|
|
|
| |
* Fix optimization pass not converging.
* Fix.
* Fix tests.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
* Multiple fixes to get various loop tests to pass.
* Create reverse-nested-loop.slang
* Fix for variables becoming inaccessible during cfg normalization
* Removed comments and moved break-branch-normalization to eliminateMultiLevelBreaks
* Fix.
* Override liveness tests
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* More control flow and Phi param simplifications.
* Fix.
* Fix gcc error.
* Fix.
* More IR cleanup.
* Fix bug in phi param dce + ifelse simplify.
* Propagate and DCE side-effect-free functions.
* Enhance CFG simplifcation to remove loops with no side effects.
* Fix.
* Fixes.
* Fix tests. Add [__AlwaysFoldIntoUseSite] for rayPayloadLocation.
* More cleanup.
* Fixes.
* Fix.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Register allocation during phi elimination.
* Enhance the test case.
* Cleanup line breaks in test case.
* remove unncessary line break changes.
* More cleanups.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
| |
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Fixes for crash when inlining at global scope
Recent changes to the way inlining is implemented in the Slang compiler
have broken certain scenarios involving `static const` declarations.
The basic problem is that the initial-value expression for a `static const`
gets lowered into IR code at the global scope of a module, and if
that code includes `call`s to stdlib operations marked `forceInlineEarly`,
then we end up trying to apply inlining to code at module scope.
The current inlining operation assumes that all `call`s are in basic
blocks, and that the correct way to do inlining involves splitting
those blocks.
This change adds logic to detect when the callee at a call site to
be inlined consists of a single basic block ending in a `return`,
and in that case it invokes specialized inlining logic that doesn't
split basic blocks and doesn't need to care if the original `call`
is in a basic block.
Thus we are able to inline calls to single-basic-block `forceInlineEarly`
functions called as part of the initialization for global-scope
`static const` variables.
This logic does *not* solve the problem of calls to multi-block
`forceInlineEarly` functions from the global scope. Such calls cannot
really be inlined.
A secondary problem that arises when inlining such calls is that the
callee might include local temporaries (`var` instructions) that are
read and written (`load`s and `store`s), and none of those instructions
should be allowed at the global scope.
In the case of the functions being inlined here, the `load`/`store`
operations are superfluous, and should be cleaned up by our SSA pass.
The only reason that they seem to *not* be getting cleaned up in the
case that was been triggering crashes is that the callee is a generic.
The current logic for the SSA pass was skipping the bodies of generic
functions, so they would not be cleaned up. This change enables the SSA
pass to apply to the bodies of generic functions, and also ensures that
SSA cleanups are applied *before* any `forceInlineEarly` functions get
inlined.
* fixup: liveness test outputs
|
| | |
|
| |
|
|
|
|
|
|
|
| |
* Support multi-level break.
* Single return.
* Add test for inlining `void` return-type functions.
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Warning on bool to float conversion.
* Fix test cases.
* Improve.
* LanguageServer: don't show constant value for non constant variables.
* Fix tests.
* Fix warnings in tests.
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* #include an absolute path didn't work - because paths were taken to always be relative.
* Use TerminatedUnownedStringSlice for literals in output C++.
* Remove Escape/Unescape functions used in slang-token-reader.cpp
Add target type of 'host-cpp' etc to map to the target types.
* Fix some corner cases around string encoding.
* Added unit test for string escaping.
Fixed some assorted escaping bugs.
* Updated test output.
* Added decode test.
* Stop using hex output, to get around 'greedy' aspect. Use octal instead.
* Added HostHostCallable
Small changes to use ArtifactDesc/Info instead of large switches.
* Fix C++ emit to handle arbitrary function export.
* Add options handling for callable without an output being specified.
* Can compile with COM interface. Added example using com interface.
* Use the IR Ptr type instead of hack in C++ emit for interfaces.
* Fix issue with outputting the COM call when ptr is used.
* Fix crash issue on compilation failure.
* Add support for __global.
* Added `ActualGlobalRate`
Added special handling around globals and COM interfaces.
Tested out in cpu-com-example.
* Fix typo in NodeBase.
* Support for accessing globals by name working.
* Bounds checking for C++
Improved bounds checks for CUDA.
* Check that actual global initialization is working.
* Fix typo.
* Refactor the com replacement such that it doesn't need a cache or do anything special with GlobalVar.
* Fix typo in CUDA prelude.
* Remove context.
Only create replacement if needed.
* Split out COM host-callable into a unit-test.
* host-callable com testing on C++and llvm.
* Comment around the COM ptr replacement.
* WIP Zero bound test.
* Disable com test on vs 32 bit.
Fix C++ prelude
* Disable 32 bit targets testing com host-callable.
* For now disable zero index test.
* Enable bounds checking for CPU/CUDA.
* Small fixes.
Disable CUDA zero index bound fix.
* Add test result for bound check.
* Work around for index wrapping issue.
* Added Fixed array test.
* Only enable prelude asserts via SLANG_PRELUDE_ENABLE_ASSERT (unless defined by the user)
* Small fix around instCount.
* Improve liveness loop handing and tests.
* Improve liveness comment.
* More conservative loop handling.
* Make liveness deterministic to make testing work.
* Added 'span tidy'
Added some more tests.
* Simplify span simplification, because could collapse inappropriate spans.
* Updated liveness with simple loop tracking.
* Update test results.
* Small tidy up.
* Update comments in liveness tests.
* Improve liveness comments.
* Loop handling without needing LoopInfo tracking.
* Improve liveness comments.
* Small fix around removing uninteresting spans.
Improve naming.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* #include an absolute path didn't work - because paths were taken to always be relative.
* Refactor Liveness pass, such that locations can be found independently of setting up ranges.
* Refactor around different stages of liveness span analysis.
* WIP Take into account PHI temporaries in liveness tracking.
* WIP First pass of PHI liveness refactor.
* Add BlockIndex.
* WIP Refactor phi liveness around inst runs.
* More improvements around liveness tracking.
* Bug fixes.
Special handling to not add multiple ends, at starts of blocks and after accesses.
* Fix test output.
* Use IRInsertLoc to track insertion point.
* Liveness markers don't have side effects.
* Fix typo in liveness test.
* Small improvements around setting SuccessorResult.
* Fix memory issue around reallocation and RAIIStackArray.
Update test output.
* Update test output for liveness.slang.
* Fix typo in SuccessorResult blockIndex.
* Small tidy up.
* Handle the root start block, correctly scoping the run.
* Split BlockInfo into 'Root' and 'Function'.
Store successors as BlockIndices.
* Tidy up around liveness tracking.
* Add head/tail support to ArrayViews.
Use Count where appropriate.
Use head/tail in liveness impl.
* Special handling if return is effectively a live variable.
* Update test output for improved return handling.
* Refactor how handling of return accesses.
Fix issue around liveness starts.
* Disable release warning for unused method.
* Some small improvements around liveness pass.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* #include an absolute path didn't work - because paths were taken to always be relative.
* Refactor Liveness pass, such that locations can be found independently of setting up ranges.
* Refactor around different stages of liveness span analysis.
* WIP Take into account PHI temporaries in liveness tracking.
* WIP First pass of PHI liveness refactor.
* Add BlockIndex.
* WIP Refactor phi liveness around inst runs.
* More improvements around liveness tracking.
* Bug fixes.
Special handling to not add multiple ends, at starts of blocks and after accesses.
* Fix test output.
* Use IRInsertLoc to track insertion point.
* Liveness markers don't have side effects.
* Fix typo in liveness test.
* Small improvements around setting SuccessorResult.
* Fix memory issue around reallocation and RAIIStackArray.
Update test output.
* Update test output for liveness.slang.
* Fix typo in SuccessorResult blockIndex.
* Small tidy up.
* Handle the root start block, correctly scoping the run.
* Split BlockInfo into 'Root' and 'Function'.
Store successors as BlockIndices.
* Tidy up around liveness tracking.
* Add head/tail support to ArrayViews.
Use Count where appropriate.
Use head/tail in liveness impl.
|
| |
|
|
|
|
|
|
|
| |
* #include an absolute path didn't work - because paths were taken to always be relative.
* Add SPIRVLiteralType, to mark types that have spirv_literal in function parameter output.
* Update test result.
Co-authored-by: Theresa Foley <10618364+tangent-vector@users.noreply.github.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Use IR pass to eliminate phi nodes
"Phi nodes" are one of the key contrivances that makes SSA (Static
Single Assignment) form work. Because SSA is so great for compiler
IRs, we kind of need to deal with phi nodes, but they also get in
the way because they don't have a direct analog in most lower-level
machine ISAs or execution models, nor in most of the high-level
languages a transpiler wants to emit. As a result a compiler like
ours needs to be able to eliminate the phi nodes from a program as
part of generating output code.
(For any clever people noting that SPIR-V supports phi nodes
directly: yes, it does. It doesn't need to and it probably *shouldn't*.
Anybody involved in the decision-making knows my reasoning, and
anybody else should feel free to ask me if they want the lecture.
Anyway...)
The basic idea of elimiating phi nodes is simple enough. We replace
each phi node with a temporary variable. Uses of the phi use values
loaded from the temporary. The operation of the phi itself
(assigning a value based on the branch taken) amounts to an assignment
into the temporary.
Previously, the Slang compiler dealt with phi nodes very late in
the process of generating code: in the middle of emitting strings
of source code in a high-level language like HLSL or GLSL. Doing the
work that late in compilation has two big drawbacks:
1. Our ability to emit clean and/or optimal code is limited because
we may not be able to make certain changes to the IR, or because we
cannot make use of additional information like a dominator tree that
might be available at other points in compilation.
2. Any other IR passes that relate to temporary variables won't be
able to see the variables that we generate for phi nodes. This could
raise issues with correctness (e.g., if we want to compute live-range
information for *all* temporary variables), or performance (we have no
way to run additional IR optimization passes after phis are eliminated).
This change addresses these problems by making the elimination of
phi nodes an explicit IR pass. Additional optimizations can easily be
run after this pass (although we'd need to be careful not to run
passes that could end up introducing new phis). The pass makes use
of the information available to it to try to produce code that will
emit to "clean" HLSL/GLSL.
The core of the pass is in `slang-ir-eliminate-phis.cpp`, and is
heavily commented, so I won't describe the approach in detail here.
There are two related issues that came up, though:
First, it turned out that our emit logic for local variables (`IRVar`
instructions) wasn't using the function we'd defined named `emitVar()`.
One worrying consequence of that oversight was that the `precise`
modifier would impact generated HLSL/GLSL for variables that turned
into SSA values (including phi nodes), but *not* for local variables
that had not been SSA'd (or that had been SSA'd and then de-SSA'd).
This change also fixes that bug; it is unclear how widespread the
impact of the original issue might be.
Second, generating explicit IR temporaries for phi nodes exposed a
pre-existing bug in the `slang-ir-restructure-scoping` pass. That pass
basically detects cases where we have an instruction `I` with a use
`U` such that the use follows the rules of SSA form ("def dominates
use," meaning `I` dominations `U`), but does not follow the more
restrictive scoping rules of high-level-language output (where a value
computed "inside" a loop is not automatically visible to code outside
the loop just because it dominates that code). That pass did not
correctly account for the case where `I` was a temporary variable.
It seems that case could not arise before now because we didn't have
any passes that would move `var`, `load`, or `store` operations out
of the basic block they started in. The fix for that pass was relatively
simple, and will make the whole thing more robust in case we add more
aggressive optimizations later.
* fixup: expected test output
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* #include an absolute path didn't work - because paths were taken to always be relative.
* Fix for loops within dominator tree.
Fix for functions that have no body.
* Use a count array.
Update some comments.
* Special case handling of the root block, for searching for last access.
* Enable liveness test with glsl output.
Co-authored-by: Theresa Foley <10618364+tangent-vector@users.noreply.github.com>
|
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* WIP tracking liveness.
* Skeleton around adding liveness instructions.
* Calling into liveness tracking logic.
Adds live start to var insts.
* Liveness macros have initial output.
* Looking at different initialization scenarios.
* Some discussion around liveness.
* WIP for working out liveness end.
* WIP Updated liveness using use lists.
* Is now adding liveness information
* Some small fixes.
* WIP around liveness.
* Seems to output liveness correctly for current scenario.
* Tidy up liveness code.
* Update comment arounds liveness to current status.
* Small fixes to liveness test.
* Add support for call in liveness analysis.
* Improve liveness example with array access.
* Small updates to comments.
* Disable liveness test because inconsistencies with output on CI system.
* Fix some issues brought up in PR.
* Rename liveness instructions.
|