| Age | Commit message (Collapse) | Author |
|
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Language server: Inlay hints.
* Signature help for base exprs that is not a declref.
* Fix checking of jvp operator.
* Fix.
* Add clang-format based auto formatting.
* Fix clang error.
* Fix clang-format discovery logic.
* Fine tune auto formatting and completion experience.
* Update macos workflow.
* Fixes to configurations.
* Fix parser recovery to trigger completion for index exprs.
* Typo fix.
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Expose internals of dce and use it to implement call graph walk.
* Specialize calls in generic functions.
* Fix clang error.
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
arithmetic. Also added type-checking during the semantic stage. (#2303)
* Added JVPTranscriber to handle differentiation of load, store, var, param and return instructions, as well as conversion of data and function types
* Changed class names to be more in line with convention. Added correct type checking for __jvp() and verified that simple calls with only loads and stores are processed correctly
* Added logic to differentiate basic arithmetic and literals inside IRConstruct and fixed the way parameters are differentiated
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Work around windows issue with temporary file clash.
* Handle the temporary file path actually creates a file.
* Fix typo.
* Fix typo in linux for temporary file.
* Add unit test for io. Tests generateTemporary operation.
|
|
framework for passes to process them. (#2297)
* Added a decorator to mark functions for forward-mode differentiation
* Fill out support for calls to non-decl values
The existing compiler logic has a few places (semantic checking plus AST-to-IR lowering) where it assumes that function calls (`InvokeExpr`) are only ever made to expressions that resolve to a specific `Decl` (`DeclRefExpr`). This assumption allows semantic checking and lowering code to inspect things like the parameter list of an actual declaration, rather than just the type signature of the callee, and that infrastructure is used to support various features (e.g., default argument values on parameters).
The AST and IR representations themselves have no matching requirement, and the places where the more general case of call expressions would need to be supported were relatively clear in the code. This change attempts to add suitable logic into each of those places.
Note that this change does *not* surface any valid way to form input code that would cause these new code paths to be executed, so it is entirely possible that there are bugs in the logic as written here. The primary goal of this change is simply to get a sketch of the correct code checked in so that we have something to build on once we have language features that will require this support.
* fixup: warnings-as-errors
* Added parser logic for '__jvp(<fn-name>)' operator
* Fixed issue with missing overload candidate item and added basic parsing test for the __jvp syntax
* Added a blank JVP Auto-diff pass and a pass that replaces 'JVPDerivativeOf' calls with the differentiated function
* Added a couple comments
* Added parameter handling for the JVP pass
Co-authored-by: Theresa Foley <tfoley@nvidia.com>
|
|
* Perserve specialization cache in IR for specialization pass.
* Fix compile error.
* Fix.
* Fix.
* Fix test case.
* Fix.
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
|
|
|
|
* Lower throwing COM interface method.
* Fix.
* Fix warnings.
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Add cpu executable compile test
* Fix.
* Fix permission on linux
* retrigger build
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
|
|
* Language Server: Document Symbol outline.
* Fix highlighting of extension decls.
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Fix typo and improve parser recovery.
* Add search path configuration.
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Use TerminatedUnownedStringSlice for literals in output C++.
* Remove Escape/Unescape functions used in slang-token-reader.cpp
Add target type of 'host-cpp' etc to map to the target types.
* Fix some corner cases around string encoding.
* Added unit test for string escaping.
Fixed some assorted escaping bugs.
* Updated test output.
* Added decode test.
* Stop using hex output, to get around 'greedy' aspect. Use octal instead.
* Added HostHostCallable
Small changes to use ArtifactDesc/Info instead of large switches.
* Fix C++ emit to handle arbitrary function export.
* Add options handling for callable without an output being specified.
* Can compile with COM interface. Added example using com interface.
* Use the IR Ptr type instead of hack in C++ emit for interfaces.
* Fix issue with outputting the COM call when ptr is used.
* Fix crash issue on compilation failure.
* Add support for __global.
* Added `ActualGlobalRate`
Added special handling around globals and COM interfaces.
Tested out in cpu-com-example.
* Fix typo in NodeBase.
* Support for accessing globals by name working.
* Bounds checking for C++
Improved bounds checks for CUDA.
* Check that actual global initialization is working.
* Fix typo.
* Refactor the com replacement such that it doesn't need a cache or do anything special with GlobalVar.
* Fix typo in CUDA prelude.
* Remove context.
Only create replacement if needed.
* Split out COM host-callable into a unit-test.
* host-callable com testing on C++and llvm.
* Comment around the COM ptr replacement.
* WIP Zero bound test.
* Disable com test on vs 32 bit.
Fix C++ prelude
* Disable 32 bit targets testing com host-callable.
* For now disable zero index test.
* Enable bounds checking for CPU/CUDA.
* Small fixes.
Disable CUDA zero index bound fix.
* Add test result for bound check.
* Work around for index wrapping issue.
* Added Fixed array test.
* Only enable prelude asserts via SLANG_PRELUDE_ENABLE_ASSERT (unless defined by the user)
* Small fix around instCount.
* Improve liveness loop handing and tests.
* Improve liveness comment.
* More conservative loop handling.
* Make liveness deterministic to make testing work.
* Added 'span tidy'
Added some more tests.
* Simplify span simplification, because could collapse inappropriate spans.
* Updated liveness with simple loop tracking.
* Update test results.
* Small tidy up.
* Update comments in liveness tests.
* Improve liveness comments.
* Loop handling without needing LoopInfo tracking.
* Improve liveness comments.
* Small fix around removing uninteresting spans.
Improve naming.
* Store current loop information in Loop structure on the stack.
* Add processing to statically determine which loop a block belongs to.
* Small improvement around leaving a loop.
* Fix release build warning.
* Small improvement to const correctness around Loop.
* Add stores to liveness run information, to allow for more sophisticated loop analysis.
|
|
* Language Server improvements.
- Improve parser robustness around `attribute_syntax`.
- Exclude instance members in a static query.
- Coloring accessors
- Improved signature help cursor range check.
* Add expected test result.
* Language server: support configuring predefined macros.
* Fix constructor highlighting.
* Improving performance by supporting incremental text change notifications.
* Fix UTF16 positions and highlighting of constructor calls.
* Add completion suggestions for HLSL semantics.
* Fix tests.
* Fix: don't skip static variables in a static query.
* Include literal init expr value in hover text.
* Fix scenarios where completion failed to trigger.
* Fixing language server protocol field initializations.
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Use TerminatedUnownedStringSlice for literals in output C++.
* Remove Escape/Unescape functions used in slang-token-reader.cpp
Add target type of 'host-cpp' etc to map to the target types.
* Fix some corner cases around string encoding.
* Added unit test for string escaping.
Fixed some assorted escaping bugs.
* Updated test output.
* Added decode test.
* Stop using hex output, to get around 'greedy' aspect. Use octal instead.
* Added HostHostCallable
Small changes to use ArtifactDesc/Info instead of large switches.
* Fix C++ emit to handle arbitrary function export.
* Add options handling for callable without an output being specified.
* Can compile with COM interface. Added example using com interface.
* Use the IR Ptr type instead of hack in C++ emit for interfaces.
* Fix issue with outputting the COM call when ptr is used.
* Fix crash issue on compilation failure.
* Add support for __global.
* Added `ActualGlobalRate`
Added special handling around globals and COM interfaces.
Tested out in cpu-com-example.
* Fix typo in NodeBase.
* Support for accessing globals by name working.
* Bounds checking for C++
Improved bounds checks for CUDA.
* Check that actual global initialization is working.
* Fix typo.
* Refactor the com replacement such that it doesn't need a cache or do anything special with GlobalVar.
* Fix typo in CUDA prelude.
* Remove context.
Only create replacement if needed.
* Split out COM host-callable into a unit-test.
* host-callable com testing on C++and llvm.
* Comment around the COM ptr replacement.
* WIP Zero bound test.
* Disable com test on vs 32 bit.
Fix C++ prelude
* Disable 32 bit targets testing com host-callable.
* For now disable zero index test.
* Enable bounds checking for CPU/CUDA.
* Small fixes.
Disable CUDA zero index bound fix.
* Add test result for bound check.
* Work around for index wrapping issue.
* Added Fixed array test.
* Only enable prelude asserts via SLANG_PRELUDE_ENABLE_ASSERT (unless defined by the user)
* Small fix around instCount.
* Improve liveness loop handing and tests.
* Improve liveness comment.
* More conservative loop handling.
* Make liveness deterministic to make testing work.
* Added 'span tidy'
Added some more tests.
* Simplify span simplification, because could collapse inappropriate spans.
* Updated liveness with simple loop tracking.
* Update test results.
* Small tidy up.
* Update comments in liveness tests.
* Improve liveness comments.
* Loop handling without needing LoopInfo tracking.
* Improve liveness comments.
* Small fix around removing uninteresting spans.
Improve naming.
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Set default values for all language server protocol types.
Remove = {}; which causes warning/error on older compilers.
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Fix warning/error on older compiler initializing hover.
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Use TerminatedUnownedStringSlice for literals in output C++.
* Remove Escape/Unescape functions used in slang-token-reader.cpp
Add target type of 'host-cpp' etc to map to the target types.
* Fix some corner cases around string encoding.
* Added unit test for string escaping.
Fixed some assorted escaping bugs.
* Updated test output.
* Added decode test.
* Stop using hex output, to get around 'greedy' aspect. Use octal instead.
* Added HostHostCallable
Small changes to use ArtifactDesc/Info instead of large switches.
* Fix C++ emit to handle arbitrary function export.
* Add options handling for callable without an output being specified.
* Can compile with COM interface. Added example using com interface.
* Use the IR Ptr type instead of hack in C++ emit for interfaces.
* Fix issue with outputting the COM call when ptr is used.
* Fix crash issue on compilation failure.
* Add support for __global.
* Added `ActualGlobalRate`
Added special handling around globals and COM interfaces.
Tested out in cpu-com-example.
* Fix typo in NodeBase.
* Support for accessing globals by name working.
* Bounds checking for C++
Improved bounds checks for CUDA.
* Check that actual global initialization is working.
* Fix typo.
* Refactor the com replacement such that it doesn't need a cache or do anything special with GlobalVar.
* Fix typo in CUDA prelude.
* Remove context.
Only create replacement if needed.
* Split out COM host-callable into a unit-test.
* host-callable com testing on C++and llvm.
* Comment around the COM ptr replacement.
* WIP Zero bound test.
* Disable com test on vs 32 bit.
Fix C++ prelude
* Disable 32 bit targets testing com host-callable.
* For now disable zero index test.
* Enable bounds checking for CPU/CUDA.
* Small fixes.
Disable CUDA zero index bound fix.
* Add test result for bound check.
* Work around for index wrapping issue.
* Added Fixed array test.
* Only enable prelude asserts via SLANG_PRELUDE_ENABLE_ASSERT (unless defined by the user)
|
|
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Use TerminatedUnownedStringSlice for literals in output C++.
* Remove Escape/Unescape functions used in slang-token-reader.cpp
Add target type of 'host-cpp' etc to map to the target types.
* Fix some corner cases around string encoding.
* Added unit test for string escaping.
Fixed some assorted escaping bugs.
* Updated test output.
* Added decode test.
* Stop using hex output, to get around 'greedy' aspect. Use octal instead.
* Added HostHostCallable
Small changes to use ArtifactDesc/Info instead of large switches.
* Fix C++ emit to handle arbitrary function export.
* Add options handling for callable without an output being specified.
* Can compile with COM interface. Added example using com interface.
* Use the IR Ptr type instead of hack in C++ emit for interfaces.
* Fix issue with outputting the COM call when ptr is used.
* Fix crash issue on compilation failure.
* Add support for __global.
* Added `ActualGlobalRate`
Added special handling around globals and COM interfaces.
Tested out in cpu-com-example.
* Fix typo in NodeBase.
* Support for accessing globals by name working.
* Check that actual global initialization is working.
* Refactor the com replacement such that it doesn't need a cache or do anything special with GlobalVar.
* Remove context.
Only create replacement if needed.
* Split out COM host-callable into a unit-test.
* host-callable com testing on C++and llvm.
* Comment around the COM ptr replacement.
* Disable com test on vs 32 bit.
Fix C++ prelude
* Disable 32 bit targets testing com host-callable.
* Use JSON parsing to locate VS version.
* Need platform detection in C++prelude.
* Fix com host callable test for LLVM.
* WIP improments finding downstream compiler version.
* Work around for not being able to include "targetConditionals.h"
* Matching semantic versioning support.
* DownstreamMatchVersion -> DownstreamCompilerMatchVersion
Small improvements.
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Use TerminatedUnownedStringSlice for literals in output C++.
* Remove Escape/Unescape functions used in slang-token-reader.cpp
Add target type of 'host-cpp' etc to map to the target types.
* Fix some corner cases around string encoding.
* Added unit test for string escaping.
Fixed some assorted escaping bugs.
* Updated test output.
* Added decode test.
* Stop using hex output, to get around 'greedy' aspect. Use octal instead.
* Added HostHostCallable
Small changes to use ArtifactDesc/Info instead of large switches.
* Fix C++ emit to handle arbitrary function export.
* Add options handling for callable without an output being specified.
* Can compile with COM interface. Added example using com interface.
* Use the IR Ptr type instead of hack in C++ emit for interfaces.
* Fix issue with outputting the COM call when ptr is used.
* Fix crash issue on compilation failure.
* Add support for __global.
* Added `ActualGlobalRate`
Added special handling around globals and COM interfaces.
Tested out in cpu-com-example.
* Fix typo in NodeBase.
* Support for accessing globals by name working.
* Check that actual global initialization is working.
* Refactor the com replacement such that it doesn't need a cache or do anything special with GlobalVar.
* Remove context.
Only create replacement if needed.
* Split out COM host-callable into a unit-test.
* host-callable com testing on C++and llvm.
* Comment around the COM ptr replacement.
* Disable com test on vs 32 bit.
Fix C++ prelude
* Disable 32 bit targets testing com host-callable.
* Use JSON parsing to locate VS version.
* Need platform detection in C++prelude.
* Fix com host callable test for LLVM.
* Work around for not being able to include "targetConditionals.h"
|
|
* Code review fixes for language server.
* Fix clang error.
* update solution file
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Major language server features.
* Include slangd in binary release.
* Fix compiler issues.
* Fix compiler error.
* Completion resolve.
* Various improvements.
* Update diagnostic test expected output.
* Bug fix for source locations.
* Adjust diagnostic update frequency.
* Update github actions to store artifacts.
* Fix infinite parser loop.
* Fix parser recovery.
* Fix parser recovery.
* Update test.
* Fix test.
* Disable IR gen for language server.
* Allow commit characters in auto completion.
* Fix lookup for invoke exprs.
* More parser robustness fixes.
* update solution file
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Use TerminatedUnownedStringSlice for literals in output C++.
* Remove Escape/Unescape functions used in slang-token-reader.cpp
Add target type of 'host-cpp' etc to map to the target types.
* Fix some corner cases around string encoding.
* Added unit test for string escaping.
Fixed some assorted escaping bugs.
* Updated test output.
* Added decode test.
* Stop using hex output, to get around 'greedy' aspect. Use octal instead.
* Added HostHostCallable
Small changes to use ArtifactDesc/Info instead of large switches.
* Fix C++ emit to handle arbitrary function export.
* Add options handling for callable without an output being specified.
* Can compile with COM interface. Added example using com interface.
* Use the IR Ptr type instead of hack in C++ emit for interfaces.
* Fix issue with outputting the COM call when ptr is used.
* Fix crash issue on compilation failure.
|
|
* Clean up `IRReturnVoid`.
* Update gitignore.
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Added ability to compile slang without stdlib source. It's not requried if stdlib is available if embedded, or is a binary on the file system.
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
* New language feature: basic error handling.
* Fix.
* Fix `tryCall` encoding according to code review.
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Work around MacOS compilation issue with embed stlib
- The enable-stdlib-generator project is created with
'kind = StaticLib' to allow the build to work, even though
the project doesn't actually create a library.
- Unlike some other platforms, MacOs "ar" emits an error if no
object files are listed to be added to an archive. This causes
enable-stdlib-generator to fail on MacOS.
- Changing the project's kind to "SharedLib" works around the issue.
Other values for kind do not seem to work around the issue.
- Add an optional flag to generatorProject to indicate that
kind = "SharedLibrary" should be used, rather than "StaticLibrary"
- Enable embed stdlib in github_macos_build.sh
* Allow Strings to be used with std::ostream.
Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>
|
|
- The enable-stdlib-generator project is created with
'kind = StaticLib' to allow the build to work, even though
the project doesn't actually create a library.
- Unlike some other platforms, MacOs "ar" emits an error if no
object files are listed to be added to an archive. This causes
enable-stdlib-generator to fail on MacOS.
- Changing the project's kind to "SharedLib" works around the issue.
Other values for kind do not seem to work around the issue.
- Add an optional flag to generatorProject to indicate that
kind = "SharedLibrary" should be used, rather than "StaticLibrary"
- Add MacOS fix for SharedLibraryUtils::getSharedLibraryFileName().
- Enable embed stdlib in github_macos_build.sh
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Use TerminatedUnownedStringSlice for literals in output C++.
* Remove Escape/Unescape functions used in slang-token-reader.cpp
Add target type of 'host-cpp' etc to map to the target types.
* Fix some corner cases around string encoding.
* Added unit test for string escaping.
Fixed some assorted escaping bugs.
* Updated test output.
* Added decode test.
* Stop using hex output, to get around 'greedy' aspect. Use octal instead.
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Remove the need for LivenessLocation.
* Use LivenessMode.
* Fix some comments.
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
The problematic case is when an `interface` has a `[mutating]` method:
interface ICounter
{
[mutating] void increment();
}
and code tries to invoke that method on a value of existential type:
ICounter c = ...;
c.increment();
We know that the existential value `c` is conceptually a tuple of:
* A concrete type `X`
* A witness that `X : ICounter`
* A value `v` of type `X`
We simply want to invoke `increment()` on the `v` part, using the `X : ICounter` witness table.
The catch that the compiler faces is that the variable `c` is mutable, so we need to be careful that we "snapshot" its value (the tuple `X, X:ICounter, v`) at a single point.
The snapshotting behavior is important when invoking a method that involves `This` or associated types in its signature, so we cannot get rid of it.
The snapshotting we do relies on the idea of a `LetExpr` AST node, which cannot be written in the input syntax.
A `LetExpr` introduces a variable binding (with an initial-value expression) and then evaluates a body expression in the context of that binding.
For a call site like `c.increment()` the front-end makes an intermediate copy of `c` and then "opens" that immutable value to get at the elements of the tuple `X`, `X : ICounter`, `v`.
The resulting AST after checking looks something like:
ICounter c = ...;
(let tmp = c in extractExistentialValue(tmp)).increment();
In that form it is more clear why the attempt to call `increment()` fails:
1. The binding `tmp` sure looks immutable
2. There is no logic in the compiler to make `extractExistentialValue(x)` be an l-value if `x` is
3. There is seemingly no logic to write back from `tmp` to `c` when the operation completes
Let us walk through those problems in order.
Item (1) turns out to be a bit of a non-issue.
Despite the way that I've written out `let` expressions above, the logic in `moveTemp()` in the compiler actually introduces a *mutable* binding.
Item (2) can be fixed for the purposes of semantic checking by modifying `openExistential()`.
Simplistically, we make the overall expression be an l-value if the operand is.
Item (3) is handled at the level of AST->IR lowering. Each kind of expression that can form an l-value needs to have a way to represent the "location" of that l-value in the `LoweredValInfo` type.
This change adds a case to handle the `extractExistentialVal` operation, by tracking both the extract value (of concrete type) and the underlying l-value (of existential type).
Where all of this comes crashing against reality a bit is that the scoping I've drawn for the `let` expressions above kind of doesn't work once we look at types.
The basic problem is that the *type* of the `(let tmp = c in ...)` expression is the concrete type `X` that was extracted from the existential.
That type can conceptually be written as `ExtractExistentialType(tmp)` which, notably, references `tmp`.
That means that we end up with AST expression nodes that reference the variable `tmp` *outside* of its scope.
Furthermore, those references to `tmp` can end up being lowered to IR *before* we have lowered the `let ...` expression itself.
Fixing the scoping issue turns out to be a major undertaking.
The first (and more obvious) issue is needing to address the scoping problem.
The solution I implemented includes a bit of refactoring to make all the `SemanticsVisitor` types better able to pass around the contextual scope-dependent state that might be needed during semantic checking, but really only adds a single piece of state.
The semantic-checking state used for checking expressions is bottlenecked so that there will (or at least *should*) always be an explicit representation of a "scope" that surrounds a complete expression (as opposed to a sub-expression).
When a `LetExpr` needs to be introduced, it is added to a pending list on the active scope, rather than being added locally.
Once the complete expression is checked, the resulting expression is wrapped up in the pending `LetExpr`s so that their scope is as broad as possible.
Technically this solution doesn't cover all cases. For example:
interface ICell { associatedtype Content; Content getContent(); }
...
ICell cell = ...;
let content = cell.getContent();
In this case the type of `content` refers to the binding introduced by a `LetExpr` in the initial-value expression.
I am leaving such issues as a piece of future work, in the hopes that we can get at least a partial fix for the problem in place.
A future fix probably nees to extend the scoping even wider (e.g., by unwrapping the `LetExpr`s from the initial-value expression and turning them into distinct temporaries).
The second piece of the fix is that we need a way for the modified value of the extracted existential to be "written back" to the original location.
Well...
We are actually being a little slippery here, based on some logic in the compiler codebase that I guess Just Works.
When AST->IR lowering encounters a `LetExpr` that binds an l-value to a name, it actually ends up binding that name more or less as a *reference* to that l-value.
At this point the `let`-ness of `LetExpr` is very much in doubt: the binding can be mutable, and it can even be an *alias* of some location?!?
In any case, the result is that the AST->IR codegen logic implicitly handles the "write-back" because the `let`-bound temporary is actually an alias for the original location.
A more complete future fix might need to introduce a distinct case in `LoweredValInfo` to handle the case of copy of a mutable temporary.
|
|
See https://github.com/shader-slang/slang/issues/2213
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Add extension required by SPIRVOpDecoration into part of emit (could be a prior pass).
* Add [[vk::spirv_instruction]] attribute
* Add documentation for [[vk::spirv_instruction].
* Update 08-attributes.md
* Update 08-attributes.md
|
|
* Added support for disabling specific warnings or turning them into errors.
* Added API entry points for adding diagnostic severity overrides and manipulating some sink flags.
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Refactor how prelude output works in emit.
* Small improvement to emit output.
* Move around comment on target specific language directives based on review.
Co-authored-by: Theresa Foley <10618364+tangent-vector@users.noreply.github.com>
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Refactor Liveness pass, such that locations can be found independently of setting up ranges.
* Refactor around different stages of liveness span analysis.
* WIP Take into account PHI temporaries in liveness tracking.
* WIP First pass of PHI liveness refactor.
* Add BlockIndex.
* WIP Refactor phi liveness around inst runs.
* More improvements around liveness tracking.
* Bug fixes.
Special handling to not add multiple ends, at starts of blocks and after accesses.
* Fix test output.
* Use IRInsertLoc to track insertion point.
* Liveness markers don't have side effects.
* Fix typo in liveness test.
* Small improvements around setting SuccessorResult.
* Fix memory issue around reallocation and RAIIStackArray.
Update test output.
* Update test output for liveness.slang.
* Fix typo in SuccessorResult blockIndex.
* Small tidy up.
* Handle the root start block, correctly scoping the run.
* Split BlockInfo into 'Root' and 'Function'.
Store successors as BlockIndices.
* Tidy up around liveness tracking.
* Add head/tail support to ArrayViews.
Use Count where appropriate.
Use head/tail in liveness impl.
* Special handling if return is effectively a live variable.
* Update test output for improved return handling.
* Refactor how handling of return accesses.
Fix issue around liveness starts.
* Disable release warning for unused method.
* Some small improvements around liveness pass.
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Refactor Liveness pass, such that locations can be found independently of setting up ranges.
* Refactor around different stages of liveness span analysis.
* WIP Take into account PHI temporaries in liveness tracking.
* WIP First pass of PHI liveness refactor.
* Add BlockIndex.
* WIP Refactor phi liveness around inst runs.
* More improvements around liveness tracking.
* Bug fixes.
Special handling to not add multiple ends, at starts of blocks and after accesses.
* Fix test output.
* Use IRInsertLoc to track insertion point.
* Liveness markers don't have side effects.
* Fix typo in liveness test.
* Small improvements around setting SuccessorResult.
* Fix memory issue around reallocation and RAIIStackArray.
Update test output.
* Update test output for liveness.slang.
* Fix typo in SuccessorResult blockIndex.
* Small tidy up.
* Handle the root start block, correctly scoping the run.
* Split BlockInfo into 'Root' and 'Function'.
Store successors as BlockIndices.
* Tidy up around liveness tracking.
* Add head/tail support to ArrayViews.
Use Count where appropriate.
Use head/tail in liveness impl.
|
|
is called with invalid arguments, such as unsupported profile. (#2235)
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
Co-authored-by: Theresa Foley <10618364+tangent-vector@users.noreply.github.com>
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Update SPIR-V headers/opt.
Update glslang.
* Set the SPIR-V emit version.
* Use the merged hash from shader-slang/glslang
* Improve comment around spirv version for emitting spir-v directly.
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Add SPIRVLiteralType, to mark types that have spirv_literal in function parameter output.
* Update test result.
Co-authored-by: Theresa Foley <10618364+tangent-vector@users.noreply.github.com>
|
|
* Use IR pass to eliminate phi nodes
"Phi nodes" are one of the key contrivances that makes SSA (Static
Single Assignment) form work. Because SSA is so great for compiler
IRs, we kind of need to deal with phi nodes, but they also get in
the way because they don't have a direct analog in most lower-level
machine ISAs or execution models, nor in most of the high-level
languages a transpiler wants to emit. As a result a compiler like
ours needs to be able to eliminate the phi nodes from a program as
part of generating output code.
(For any clever people noting that SPIR-V supports phi nodes
directly: yes, it does. It doesn't need to and it probably *shouldn't*.
Anybody involved in the decision-making knows my reasoning, and
anybody else should feel free to ask me if they want the lecture.
Anyway...)
The basic idea of elimiating phi nodes is simple enough. We replace
each phi node with a temporary variable. Uses of the phi use values
loaded from the temporary. The operation of the phi itself
(assigning a value based on the branch taken) amounts to an assignment
into the temporary.
Previously, the Slang compiler dealt with phi nodes very late in
the process of generating code: in the middle of emitting strings
of source code in a high-level language like HLSL or GLSL. Doing the
work that late in compilation has two big drawbacks:
1. Our ability to emit clean and/or optimal code is limited because
we may not be able to make certain changes to the IR, or because we
cannot make use of additional information like a dominator tree that
might be available at other points in compilation.
2. Any other IR passes that relate to temporary variables won't be
able to see the variables that we generate for phi nodes. This could
raise issues with correctness (e.g., if we want to compute live-range
information for *all* temporary variables), or performance (we have no
way to run additional IR optimization passes after phis are eliminated).
This change addresses these problems by making the elimination of
phi nodes an explicit IR pass. Additional optimizations can easily be
run after this pass (although we'd need to be careful not to run
passes that could end up introducing new phis). The pass makes use
of the information available to it to try to produce code that will
emit to "clean" HLSL/GLSL.
The core of the pass is in `slang-ir-eliminate-phis.cpp`, and is
heavily commented, so I won't describe the approach in detail here.
There are two related issues that came up, though:
First, it turned out that our emit logic for local variables (`IRVar`
instructions) wasn't using the function we'd defined named `emitVar()`.
One worrying consequence of that oversight was that the `precise`
modifier would impact generated HLSL/GLSL for variables that turned
into SSA values (including phi nodes), but *not* for local variables
that had not been SSA'd (or that had been SSA'd and then de-SSA'd).
This change also fixes that bug; it is unclear how widespread the
impact of the original issue might be.
Second, generating explicit IR temporaries for phi nodes exposed a
pre-existing bug in the `slang-ir-restructure-scoping` pass. That pass
basically detects cases where we have an instruction `I` with a use
`U` such that the use follows the rules of SSA form ("def dominates
use," meaning `I` dominations `U`), but does not follow the more
restrictive scoping rules of high-level-language output (where a value
computed "inside" a loop is not automatically visible to code outside
the loop just because it dominates that code). That pass did not
correctly account for the case where `I` was a temporary variable.
It seems that case could not arise before now because we didn't have
any passes that would move `var`, `load`, or `store` operations out
of the basic block they started in. The fix for that pass was relatively
simple, and will make the whole thing more robust in case we add more
aggressive optimizations later.
* fixup: expected test output
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Fix for loops within dominator tree.
Fix for functions that have no body.
* Use a count array.
Update some comments.
* Special case handling of the root block, for searching for last access.
* Enable liveness test with glsl output.
Co-authored-by: Theresa Foley <10618364+tangent-vector@users.noreply.github.com>
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Allow rate modifier on parameter.
* Add test.
* Disable test for now as breaks on source comparison because around nvAPI.
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Add support for HLSL `export`.
* Test for using `export` keyword.
|
|
* Various vulkan/glsl fixes.
* Fix.
* Fix.
* Canonicalize type constraints for name mangling.
Co-authored-by: Yong He <yhe@nvidia.com>
Co-authored-by: Theresa Foley <10618364+tangent-vector@users.noreply.github.com>
|