| Commit message (Collapse) | Author | Age |
| |
|
|
|
|
|
| |
* Remove use of `G0` and `__target_intrinsic` in stdlib.
* Fix.
* Fix calling intrinsic in global scope.
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
* Add host shared library target.
* Attempt fix.
* Fix warnings.
* try fix.
* Fix test.
* Fix.
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Refactor compiler option representation.
* Fix binary compatibility.
* Add a test for specifying compiler options at link time.
* Fix binary compatibility.
* Fix binary compatibility.
* Fix backward compatibility on matrix layout.
* Fix.
* Fix.
* Fix.
* Fix gfx.
* Fix gfx.
* Fix dynamic dispatch.
* Polish.
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
* Add per-buffer data layout control.
Fixes #3534.
* Fixes.
* Robustness.
* Update test.
* Fix.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Unify Texture types in stdlib into 1 generic type.
* Fixes.
* Fix.
* Fixes.
* Fix reflection.
* Fix binding reflection.
* Add gather intrinsics.
* Fix gather intrinsics.
* Fix texture type toText.
* Fix intrinsic.
* fix cuda intrinsic.
* Fix project files.
* cleanup.
* Fix.
* Fix.
* Fix sampler feedback test.
* Fix getDimension intrinsics.
* Fix spirv sample image intrinsics.
* Fix test.
* Fix GLSL intrinsic.
* Cleanup.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
|
|
|
|
|
|
| |
* Add `requirePrelude()` intrinsic function.
* Fix.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Support `constref` parameters passing.
* Fix.
* Fix.
* Add test and diagnostic on mix use of __constref and no_diff.
* check for [constref] on differentiable member method.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
| |
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Add `target_switch` and `__intrinsic_asm` statement.
* Cleanup.
* WaveGetActiveMask, WaveGetActiveMask, WaveCountBits.
* WaveIsFirstLane.
* More wave intrinsics.
* wave intrinsics.
* merge fix.
* Fix.
* Fix.
* Update test.
* update test.
* Fix.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
| |
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Support per field matrix layout
* Fix warnings.
* Fix.
* Fix tests.
* Fix spiv gen.
* Fix.
* More test fixes.
* Fix.
* Run only GPU tests on self-hosted servers.
* Remove -use-glsl-matrix-layout-modifier.
* Fix.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
| |
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* WIP handling LValue coercion via LValueImplicitCast
* Need to have the ptr type for the cast.
* Casting conversion working on C++.
* Make the LValue casts record if in or in/out as we can produce better code if we know the difference.
* WIP LValueCast pass
* Fix tests so we don't fail because downstream compilers detect use of uninitialized variable.
* Do conversions through through tmp for l-value scenarios that can't work other ways.
* Fix a typo.
* Change diagnostic implicit-cast-lvalue for a type that still exhibits the issue.
* Add matrix test.
* Added a bit more clarity around LValue casting choices.
* Small comment improvements.
Improvements based on comments on PR.
* Use findOuterGeneric.
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
* Various fixes for autodiff and slangpy.
* Fix cuda code gen for `select`.
* Fix getBuildTagString().
* Fix.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* #include an absolute path didn't work - because paths were taken to always be relative.
* WIP lowerCamel Dictionary.
* WIP more lowerCamel fixes for Dictionary.
* Add/Remove/Clear
* GetValue/Contains
* Fix tabs in dictionary.
Count -> getCount
* Fix fields with caps.
* Key -> key
Value -> value
Use m_ for members where appropriate.
Use lowerCamel in linked list.
* Some small fixes/improvements to Dictionary.
* Kick CI.
* Small tidy on String.
* Append -> append
* ToString -> toString
ProduceString -> produceString
* Small fixes.
* StringToXXX -> stringToXXX
* Fix typo introduced by Append -> append.
* Made intToAscii do reversal at the end.
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* #include an absolute path didn't work - because paths were taken to always be relative.
* WIP lowerCamel Dictionary.
* WIP more lowerCamel fixes for Dictionary.
* Add/Remove/Clear
* GetValue/Contains
* Fix tabs in dictionary.
Count -> getCount
* Fix fields with caps.
* Key -> key
Value -> value
Use m_ for members where appropriate.
Use lowerCamel in linked list.
* Some small fixes/improvements to Dictionary.
* Kick CI.
|
| |
|
| |
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
|
|
|
|
|
|
| |
* Emit simpler vector element access code
* Fix.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
|
|
|
|
|
|
|
|
| |
* Translate all composed types into tuple types in pyBind.
* Delete temp file.
* Fix get tuple element code emit logic.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
|
|
|
|
|
|
| |
* Add PyTorch C++ binding generation.
* fix
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* #include an absolute path didn't work - because paths were taken to always be relative.
* WIP source map.
* Split out handling of RttiTypeFuncs to a map type.
* Make RttiTypeFuncsMap hold default impls.
* Slightly more sophisticated RttiTypeFuncsMap
* Source map decoding.
* Fix tabs.
* Fix asserts due to negative values.
* Use less obscure mechanisms in SourceMap.
* Source map decoding.
Simplifying SourceMap usage.
* First attempt at ouputting a source map as part of emit.
* Added support for -source-map option. SourceMap is added to the artifact.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* More control flow and Phi param simplifications.
* Fix.
* Fix gcc error.
* Fix.
* More IR cleanup.
* Fix bug in phi param dce + ifelse simplify.
* Propagate and DCE side-effect-free functions.
* Enhance CFG simplifcation to remove loops with no side effects.
* Fix.
* Fixes.
* Fix tests. Add [__AlwaysFoldIntoUseSite] for rayPayloadLocation.
* More cleanup.
* Fixes.
* Fix.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
|
|
|
|
|
|
| |
* Overhaul global inst deduplication and cpp/cuda backend.
* Update IR documentation.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
| |
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
| |
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
|
|
|
|
| |
* Rework differential conformance dictionary checking.
* Revert space changes.
Co-authored-by: Yong He <yhe@nvidia.com>
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Language feature: pointer sized int types.
* Fix.
* small change to test.
* Fix stdlib.
* Fix.
* Fix.
* Add typedef for `size_t` in stdlib.
* Fix test.
* Add `intptr_t::size` constant.
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* #include an absolute path didn't work - because paths were taken to always be relative.
* WIP replacing DownstreamCompileResult.
* First attempt at replacing DownstreamCompileResult with IArtifact and associated types.
* Small renaming around CharSlice.
* ICastable -> ISlangCastable
Added IClonable
Fix issue with cloning in ArtifactDiagnostics.
* Only add the blob if one is defined in DXC.
* Guard adding blob representation.
* Make cloneInterface available across code base.
Set enums backing type for ArtifactDiagnostic.
* Added ::create for ArtifactDiagnostics.
* Use SemanticVersion for DownstreamCompilerDesc.
Set sizes for enum types.
* Depreciate old incompatible CompileOptions.
Change SemanticVersion use 32 bits for the patch.
* Split out CastableUtil.
* Change IDownstreamCompiler to use canConvert and convert to use artifact types.
* Fix typos.
* Fix typo bug.
Allow trafficing in PTX assembly/binaries
* struct DownstreamCompilerBaseUtil -> struct DownstreamCompilerUtilBase
* Add other riff types.
* Small fix around artifact kind.
* Make using slices instead of strings explicit on atomic ref counted types. (not complete).
Added IArtifactList.
Use IArtifactList to hold the 'associated' files.
Use IUnknown for scoping for atomic ref counting.
Small naming improvements.
* Make artifact not use String in construction (so it owns contents).
* Calculate compile products as artifacts.
* Small improvements around ArtifactDesc.
* Use ICastableList for list of artifacts and remove IArtifactList.
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* #include an absolute path didn't work - because paths were taken to always be relative.
* WIP with hierarchical enums.
* Some small fixes and improvements around artifact desc related types.
* Improvements around hierarchical enum.
* Fixes to get Artifact types refactor to be able to execute tests.
* Attempt to better categorize PTX.
* Work around for potentially unused function warning.
* Typo fix.
* Simplify Artifact header.
* Small improvements around Artifact kind/payload/style.
* Added IDestroyable/ICastable
* Add IArtifactList.
* First impl of IArtifactUtil.
* Use the ICastable interface for IArtifactRepresentation.
* Added IArtifactRepresentation & IArtifactAssociated.
* Add SLANG_OVERRIDE to avoid gcc/clang warning.
* Fix calling convention issue on win32.
* Fix missing SLANG_OVERRIDE.
* First attempt at file abstraction around Artifact.
* Added creation of lock file.
* Move functionality for determining file paths to the IArtifactUtil.
Add casting to ICastable.
* Added some casting/finding mechanisms.
* Simplify IArtifact interface, and use Items for file reps.
* Fix problem with libraries on DXIL.
* Split out ArtifactRepresentation.
* Move ArtifactDesc functionality to ArtifactDescUtil. ArtifactInfoUtil becomes ArtifactDescUtil.
* Split implementations from the interfaces for Artifact.
* Use TypeTextUtil for target name outputting.
* Add artifact impls.
|
| |
|
|
|
|
|
| |
* Allow `class` to implement COM interface, [DLLExport]
* Fix [COM] usage in tests and examples with UUIDs.
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
|
|
|
|
|
|
| |
* Support `class` types.
* Ignore class-keyword test
* Fix codereview comments and warnings.
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
| |
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
|
|
|
|
|
|
|
|
| |
* Add cpu executable compile test
* Fix.
* Fix permission on linux
* retrigger build
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* #include an absolute path didn't work - because paths were taken to always be relative.
* Use TerminatedUnownedStringSlice for literals in output C++.
* Remove Escape/Unescape functions used in slang-token-reader.cpp
Add target type of 'host-cpp' etc to map to the target types.
* Fix some corner cases around string encoding.
* Added unit test for string escaping.
Fixed some assorted escaping bugs.
* Updated test output.
* Added decode test.
* Stop using hex output, to get around 'greedy' aspect. Use octal instead.
* Added HostHostCallable
Small changes to use ArtifactDesc/Info instead of large switches.
* Fix C++ emit to handle arbitrary function export.
* Add options handling for callable without an output being specified.
* Can compile with COM interface. Added example using com interface.
* Use the IR Ptr type instead of hack in C++ emit for interfaces.
* Fix issue with outputting the COM call when ptr is used.
* Fix crash issue on compilation failure.
* Add support for __global.
* Added `ActualGlobalRate`
Added special handling around globals and COM interfaces.
Tested out in cpu-com-example.
* Fix typo in NodeBase.
* Support for accessing globals by name working.
* Check that actual global initialization is working.
* Refactor the com replacement such that it doesn't need a cache or do anything special with GlobalVar.
* Remove context.
Only create replacement if needed.
* Split out COM host-callable into a unit-test.
* host-callable com testing on C++and llvm.
* Comment around the COM ptr replacement.
* Disable com test on vs 32 bit.
Fix C++ prelude
* Disable 32 bit targets testing com host-callable.
* Use JSON parsing to locate VS version.
* Need platform detection in C++prelude.
* Fix com host callable test for LLVM.
* Work around for not being able to include "targetConditionals.h"
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* #include an absolute path didn't work - because paths were taken to always be relative.
* Use TerminatedUnownedStringSlice for literals in output C++.
* Remove Escape/Unescape functions used in slang-token-reader.cpp
Add target type of 'host-cpp' etc to map to the target types.
* Fix some corner cases around string encoding.
* Added unit test for string escaping.
Fixed some assorted escaping bugs.
* Updated test output.
* Added decode test.
* Stop using hex output, to get around 'greedy' aspect. Use octal instead.
* Added HostHostCallable
Small changes to use ArtifactDesc/Info instead of large switches.
* Fix C++ emit to handle arbitrary function export.
* Add options handling for callable without an output being specified.
* Can compile with COM interface. Added example using com interface.
* Use the IR Ptr type instead of hack in C++ emit for interfaces.
* Fix issue with outputting the COM call when ptr is used.
* Fix crash issue on compilation failure.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* #include an absolute path didn't work - because paths were taken to always be relative.
* Use TerminatedUnownedStringSlice for literals in output C++.
* Remove Escape/Unescape functions used in slang-token-reader.cpp
Add target type of 'host-cpp' etc to map to the target types.
* Fix some corner cases around string encoding.
* Added unit test for string escaping.
Fixed some assorted escaping bugs.
* Updated test output.
* Added decode test.
* Stop using hex output, to get around 'greedy' aspect. Use octal instead.
|
| |
|
|
|
|
|
|
|
|
|
| |
* #include an absolute path didn't work - because paths were taken to always be relative.
* Refactor how prelude output works in emit.
* Small improvement to emit output.
* Move around comment on target specific language directives based on review.
Co-authored-by: Theresa Foley <10618364+tangent-vector@users.noreply.github.com>
|
| |
|
|
| |
Co-authored-by: Yong He <yhe@nvidia.com>
Co-authored-by: Theresa Foley <10618364+tangent-vector@users.noreply.github.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Use IR pass to eliminate phi nodes
"Phi nodes" are one of the key contrivances that makes SSA (Static
Single Assignment) form work. Because SSA is so great for compiler
IRs, we kind of need to deal with phi nodes, but they also get in
the way because they don't have a direct analog in most lower-level
machine ISAs or execution models, nor in most of the high-level
languages a transpiler wants to emit. As a result a compiler like
ours needs to be able to eliminate the phi nodes from a program as
part of generating output code.
(For any clever people noting that SPIR-V supports phi nodes
directly: yes, it does. It doesn't need to and it probably *shouldn't*.
Anybody involved in the decision-making knows my reasoning, and
anybody else should feel free to ask me if they want the lecture.
Anyway...)
The basic idea of elimiating phi nodes is simple enough. We replace
each phi node with a temporary variable. Uses of the phi use values
loaded from the temporary. The operation of the phi itself
(assigning a value based on the branch taken) amounts to an assignment
into the temporary.
Previously, the Slang compiler dealt with phi nodes very late in
the process of generating code: in the middle of emitting strings
of source code in a high-level language like HLSL or GLSL. Doing the
work that late in compilation has two big drawbacks:
1. Our ability to emit clean and/or optimal code is limited because
we may not be able to make certain changes to the IR, or because we
cannot make use of additional information like a dominator tree that
might be available at other points in compilation.
2. Any other IR passes that relate to temporary variables won't be
able to see the variables that we generate for phi nodes. This could
raise issues with correctness (e.g., if we want to compute live-range
information for *all* temporary variables), or performance (we have no
way to run additional IR optimization passes after phis are eliminated).
This change addresses these problems by making the elimination of
phi nodes an explicit IR pass. Additional optimizations can easily be
run after this pass (although we'd need to be careful not to run
passes that could end up introducing new phis). The pass makes use
of the information available to it to try to produce code that will
emit to "clean" HLSL/GLSL.
The core of the pass is in `slang-ir-eliminate-phis.cpp`, and is
heavily commented, so I won't describe the approach in detail here.
There are two related issues that came up, though:
First, it turned out that our emit logic for local variables (`IRVar`
instructions) wasn't using the function we'd defined named `emitVar()`.
One worrying consequence of that oversight was that the `precise`
modifier would impact generated HLSL/GLSL for variables that turned
into SSA values (including phi nodes), but *not* for local variables
that had not been SSA'd (or that had been SSA'd and then de-SSA'd).
This change also fixes that bug; it is unclear how widespread the
impact of the original issue might be.
Second, generating explicit IR temporaries for phi nodes exposed a
pre-existing bug in the `slang-ir-restructure-scoping` pass. That pass
basically detects cases where we have an instruction `I` with a use
`U` such that the use follows the rules of SSA form ("def dominates
use," meaning `I` dominations `U`), but does not follow the more
restrictive scoping rules of high-level-language output (where a value
computed "inside" a loop is not automatically visible to code outside
the loop just because it dominates that code). That pass did not
correctly account for the case where `I` was a temporary variable.
It seems that case could not arise before now because we didn't have
any passes that would move `var`, `load`, or `store` operations out
of the basic block they started in. The fix for that pass was relatively
simple, and will make the whole thing more robust in case we add more
aggressive optimizations later.
* fixup: expected test output
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Support `[DllImport]`
* Fix.
* Fix.
* Fix array type emit in cpp.
* Fix.
* Fix.
* Fix
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
An earlier refactoring pass over the compiler codebase split the
type that had been called `CompileRequest` into three distinct
pieces:
* `FrontEndCompileRequest` which was supposed to own state and
options related to running the compiler front end and producing
IR + reflection (e.g., what translation units and source
files/strings are included).
* `BackEndCompileRequest` which was supposed to own state and options
related to running the compiler back end to translate the IR
for a `ComponentType` (program) into output code. (Note that the
`BackEndCompileRequest` was conceived of as orthogonal to the
`TargetRequest`s, which store per-target and target-specific
options.)
* `EndToEndCompileRequest` which was an umbrella object that owns
separate front-end and back-end requests, plus any state that is
only relevant when doing a true end-to-end compile (such as the
kinds of compiles initiated with `slangc`). As originally conceived,
the only state that this type was supposed to own was stuff related
to "pass-through" compilation, as well as state related to writing
of generated code to output files.
That refactoring work was very useful at the time, because it allowed
us to "scrub" the back end compilation steps to remove all
dependencies on front-end and AST state (this was important for our
goals of enabling linking and codegen from serialized Slang IR).
At this point, however, it is clear that the hierarchy that was built
up serves very little purpose:
* The `BackEndCompileRequest` type is only used in two places:
* As part of an `EndToEndCompileRequest`, where the settings on
the `BackEndCompileRequest` can be configured, but only through
the `EndToEndCompileRequest`
* As part of on-demand code generation through the `IComponentType`
APIs. In this case, the settings stored on the
`BackEndCompileRequest` are not accessible to the application
at all, and will always use their default values, so that
instantiating a "request" object doesn't really make any sense.
* The `FrontEndCompileRequest` type has a similar situation:
* Front-end compilation as part of an `EndToEndCompileRequest`
supports user configuration of `FrontEndCompileRequest` settings,
but only through the `EndToEndCompileRequest`
* Front-end compilation triggered by an `import` or a `loadModule()`
call does not support user configuration of settings at all. It
will always derive all relevant settings from thsoe on the
session ("linkage").
In addition, subsequent changes have been made to the compiler that
show a bit of a "code smell" and/or forward-looking worries for this
decomposition:
* In some cases we've had to add the same setting to multiple types
in the breakdown (front-end, back-end, end-to-end, linkage, target,
etc.) which makes it harder for us to validate that all the possible
mixtures of state work correctly.
* Related to the above, in some cases we have manual logic that copies
state from one of the objects in the breakdown to another, in order
to ensure that the user's intention is actually followed.
* As a forward-looking concern, it seems that developers have sometimes
added new configuration options and state to places that don't really
make sense according to the rationale of the original decomposition
(e.g., we probably don't want to have a lot of state that is only
available via end-to-end requests, given that the API structure is
meant to push users *away* from end-to-end compiles).
As a result of all of the above, I've been planning a large refactor
with the following big-picture goals:
* Eliminate `BackEndCompileRequest`
* Move all relevant state/options from the back-end request to
the end-to-end request, since that is the only place they could
be set anyway.
* Introduce a transient "context" type to be used for the duration
of code generation that serves the main functions that back-end
requests really served in the codebase
* Make `EndToEndCompileRequest` be a subclass of
`FrontEndCompileRequest`
* Consider addding a transient "context" type for front-end
compiles that can be used in `import`-like cases rather than
needing a full front-end request object. If this works, then
eliminate `FrontEndCompileRequest` and be back to world with
just a single `CompileRequest` type
* Move *all* compiler configuration options to a distinct type (named
something like `CompilerConfig` or `CompilerOptions` or whatever)
which stores setting as key-value pairs, and has a notion of
"inheritance" such that one configuration can extend or build on top
of another. Make all the relevant types use this catch-all structure
instead of redundantly storing flags in many places.
This change deals with the first of those bullets: removeal of
`BackEndCompileRequest`. The addition of the `CodeGenContext` type is
perhaps an unncessary additional step, but making that change helps
clean up a bunch of the code related to per-target code generation,
so I think it is the right choice.
Co-authored-by: Yong He <yonghe@outlook.com>
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Fixed naming conflicts in heterogeneous-hello-world
Added 3 new modifiers (`__unmangled`, `__exportDirectly`, `__externLib`)
`__unmangled` causes mangleName() to return the normal name of the decl.
`__exportDirectly` changes parent decl name concatenation behavior to use
"::" instead of "." (for Name Hint) and emits the name hint when it exists,
otherwise it emits the mangled name.
`__externLib` stops Slang from emitting the corresponding struct.
Also made necessary changes to heterogeneous-hello-world so that this new
functionality is shown off.
* Undo unintentional formatting changes
Co-authored-by: Yong He <yonghe@outlook.com>
|
| |
|
| |
It seems that we were missing a `template<>` prefix when emitting explicit specializations for the `Matrix<...>` type in the CUDA prelude. This might not have caused errors in some versions of the CUDA compiler, but it at least seems to fail on the 11.5 SDK, which I am using to test locally.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Bring heterogeneous-hello-world back up to date.
* Reintroduced heterogeneous-hello-world into the premake
* No longer uses compiled bytecode for entry point, instead a loadModule
call is hardocoded with the slang file name.
* Entry point is, similarly, hardcoded for now.
* Added a bypass to slang-legalize-types for an unneeded GPUForeach check
* Run premake and change to relative path
* Removed experimental and added README
* Add prebuild command to premake for heterogeneous example
* Pass in entry point as parameter (also remove shader bytecode)
* Pass in module name as parameter
* Squashed commit of the following:
commit 5b13b57fe600724344c556fe4309a5d6bb3d39ab
Author: Kai Yao <kyao@nvidia.com>
Date: Thu Oct 7 23:38:50 2021 -0700
Return diagnostics data when encountering module load error by exception (#1966)
commit 112e1515c30fa972ff56f91514b70946153c718c
Author: jsmall-nvidia <jsmall@nvidia.com>
Date: Thu Oct 7 16:12:29 2021 -0400
Disable test crashing CI (#1965)
* #include an absolute path didn't work - because paths were taken to always be relative.
* Disable test that appears to be crashing.
commit da32069a0c1c8c723d7ef45100049a8f0dd5d9c4
Author: Kai Yao <kyao@nvidia.com>
Date: Mon Oct 4 13:58:51 2021 -0700
Modified barrier API to accept multiple resources per call (#1959)
Co-authored-by: Yong He <yonghe@outlook.com>
commit 97bb82ebcdf8f1391b9d93b5a8d7b1dfc4e88e52
Author: jsmall-nvidia <jsmall@nvidia.com>
Date: Mon Oct 4 14:15:51 2021 -0400
Removing exceptions from core/compiler-core (#1953)
* #include an absolute path didn't work - because paths were taken to always be relative.
* Refactor Stream. Working on all tests.
* Split out CharEncode.
* Make method names lower camel.
m_prefix in Writer/Reader
* Tidy up around CharEncode interface.
* Small improvements around encode/decode.
* Better use of types.
* Remove readLine from TextReader.
* Remove exceptions from Stream/Text handling.
* Fix some typos.
* Fix tabbing.
* Fix missing override.
* Remove remaining exception throw/catch via using signal mechanism.
* Remove exceptions that are not used anymore.
* Document the Stream interface.
* Remove index for decoding 'get byte' function.
* Fix CharReader -> ByteReader.
commit b3dfe383c6d31ff3dbd76dcfb32de8d536382f3e
Author: lucy96chen <47800040+lucy96chen@users.noreply.github.com>
Date: Mon Oct 4 09:46:33 2021 -0700
Get native handles for TextureResource and BufferResource (#1960)
* Added getNativeHandle() to TextureResource and BufferResource; Implemented getNativeHandle() in Vulkan and D3D12; Added new unit test files for the aforementioned implementation
* Added missing getNativeHandle() implementations to renderer-shared.cpp and CUDA
* Finished new getNativeHandle() unit tests for ITextureResource and IBufferResource; Modified ICommandQueue and ICommandBuffer unit tests to call QueryInterface to convert to IUnknown then back and compare resulting pointers for equality
* Unit tests updated and pass locally
* Cast m_buffer.m_buffer and m_image to uint64_t
commit 35bca4cc432613af3926da3bed217a6baa9cbd26
Author: lucy96chen <47800040+lucy96chen@users.noreply.github.com>
Date: Fri Oct 1 13:08:25 2021 -0700
Add getNativeHandle() to ICommandQueue and ICommandBuffer (#1952)
* Added support for getting command buffer and command queue handles to ICommandBuffer and ICommandQueue; D3D12Device, VkDevice, and DebugDevice modifieid to implement this new functionality; immediate-renderer-base.cpp also modified to implement the new functions
* Removed excess boilerplate
* Changed readRef() to get() in D3D12 getNativeHandle() implementation for ICommandBuffer and ICommandQueue
* Added unit tests for new getNativeHandle() implementations, unfinished
* Queue test added; Minor cleanup changes
* getBufferHandleTestImpl() now closes the command buffer before returning
* Added getNativeHandle() implementations to CUDADevice
* Added comment clarifying that the Vulkan check is checking for a null handle, which is defined to be 0
commit 6c6200f547c7387598743b23bb3c8f0d375d9494
Author: Kai Yao <kyao@nvidia.com>
Date: Thu Sep 30 20:25:34 2021 -0700
VK Resource Barrier (#1955)
* Resource barrier API and VK implementation
* Stub implementations
* Handle VK Acceleration Structure flag
* Add a couple more cases to pipeline barrier stages
commit 627fc976bac5c2381dbace9c7925cb6a68b8de12
Author: Yong He <yonghe@outlook.com>
Date: Thu Sep 30 19:48:47 2021 -0700
Fix aarch64 build on github (#1957)
Co-authored-by: Yong He <yhe@nvidia.com>
commit 122d701513e116856bd59c999221ce36a373d7db
Author: Yong He <yonghe@outlook.com>
Date: Thu Sep 30 17:51:56 2021 -0700
Fix GitHub release (#1956)
* Fix aarch64 release build config.
* Fix for WinAarch64 build.
* Update premake for embed-std-lib build on aarch64.
* `platform` fix for aarach64 build.
* Try revert back to use absolute output path for slang-stdlib-generated.h
* Fix
* fix
Co-authored-by: Yong He <yhe@nvidia.com>
commit aa8f7b899b7b562b3d3c6e25c3da41569505e70c
Author: Chad Engler <englercj@live.com>
Date: Wed Sep 29 13:02:47 2021 -0700
Fix ARM64 detection for MSVC (#1951)
commit 6736b0c1c5fa3e89bc561eb7965a1a0d17af3466
Author: Yong He <yonghe@outlook.com>
Date: Wed Sep 29 11:29:46 2021 -0700
Add ISession::loadModuleFromSource. (#1950)
Co-authored-by: Yong He <yhe@nvidia.com>
commit d8e452412e14a6a8ba137f2adcae13b398e5cecb
Author: Yong He <yonghe@outlook.com>
Date: Tue Sep 28 15:03:03 2021 -0700
Fix AbortCompilationException leaking through loadModule API. (#1949)
* Fix AbortCompilationException leaking through loadModule API.
* Update.
* Fix.
Co-authored-by: Yong He <yhe@nvidia.com>
commit cdf1b2c007fefdca128584d2a9f63dec3d350e16
Author: Yong He <yonghe@outlook.com>
Date: Tue Sep 28 11:54:24 2021 -0700
Improvements to the unit test framework. (#1948)
commit af788b62e18bbd55cd748ad60400a74cf1bc93ee
Author: lucy96chen <47800040+lucy96chen@users.noreply.github.com>
Date: Fri Sep 24 16:53:41 2021 -0700
Add existing device handle support unit test (#1946)
commit bec8e6aec85b6e3f875c58bdd59eb15613978358
Author: Yong He <yonghe@outlook.com>
Date: Fri Sep 24 11:33:44 2021 -0700
Move existing unit tests to a standalone dll. (#1945)
commit f2a3c933bc11a498c622fa18694c84beca8ca031
Author: lucy96chen <47800040+lucy96chen@users.noreply.github.com>
Date: Thu Sep 23 12:19:49 2021 -0700
Add method to retrieve native handles (#1944)
* Added a getNativeHandle() method that retrieves the natively created handles; Modified RendererBase, VKDevice, D3D12Device, and DebugDevice to implement this new method
* Moved ExistingDeviceHandles out of Desc directly inside IDevice and renamed to NativeHandles; Modified calls accessing the struct accordingly in RendererBase, DebugDevice, VKDevice, and D3D12Device
* Minor cleanup changes (renames, etc.)
commit b9b398d038b524f15a86ff27cd6888d54e8754e0
Author: Yong He <yonghe@outlook.com>
Date: Wed Sep 22 10:06:59 2021 -0700
Add gfx unit testing framework. (#1943)
* Add gfx unit testing framework.
* Fix compilation error.
* Reset gfxDebugCallback after render_test.
* Pass enabledApi flags through.
* Fix for code review suggestions.
Co-authored-by: Yong He <yhe@nvidia.com>
commit 6e9cee69b3588ddae09b08b9f580f59ad899983f
Author: lucy96chen <47800040+lucy96chen@users.noreply.github.com>
Date: Tue Sep 21 18:46:32 2021 -0700
Support for existing device/instance handles in Vulkan (#1942)
commit b1f04c8544c650de3947955ca68f679535d249aa
Author: lucy96chen <47800040+lucy96chen@users.noreply.github.com>
Date: Wed Sep 15 20:22:45 2021 -0700
Allow D3D12Device to use an existing device handle (#1940)
* Added a new field for an existing device handle to IDevice::Desc; Modified D3D12Device::initialize to set the device stored in desc if it already exists instead of creating a new one
* Turned existingDeviceHandle into a struct containing an array of two elements; Updated D3D12Device::initialize to match changes to existingDeviceHandle; Updated comments
* Fixed style error for ExistingDeviceHandles struct
commit 2f7b9f5ae8be21c6c1d75ae9caefbc7b3f8986a9
Author: Pablo Delgado <private@pablode.com>
Date: Thu Sep 16 01:17:57 2021 +0200
Fix incorrect WIN32 macros and missing Windows.h inclusion (#1939)
* Replace WIN32 preprocessor macros with _WIN32
* Add missing Windows.h include for InterlockedIncrement
commit 11d43642008905ac69a3832eb8a9b2ae7b785f86
Author: Yong He <yonghe@outlook.com>
Date: Tue Sep 14 11:36:44 2021 -0700
Avoid upcasting to f32 in 16bit float-uint bit cast. (#1938)
Co-authored-by: Yong He <yhe@nvidia.com>
commit 502aa3812a82cf0d091cff0c67804e4ee448ac78
Author: David Siher <32305650+dsiher@users.noreply.github.com>
Date: Tue Sep 14 12:59:55 2021 -0400
Bring heterogeneous-hello-world back up to date. (#1935)
* Bring heterogeneous-hello-world back up to date.
* Reintroduced heterogeneous-hello-world into the premake
* No longer uses compiled bytecode for entry point, instead a loadModule
call is hardocoded with the slang file name.
* Entry point is, similarly, hardcoded for now.
* Added a bypass to slang-legalize-types for an unneeded GPUForeach check
* Run premake and change to relative path
* Removed experimental and added README
Co-authored-by: Yong He <yonghe@outlook.com>
* Revert "Squashed commit of the following:"
This reverts commit 4f665858d65f7c332c616ef6db9fdafa1c5e0b9f.
* Run premake
* Remove prebuild command (only works on Windows?)
* Rerun premake
* Fix heterogeneous prebuild command
* Remove linux specific prebuild command
* Fix prebuild command (again)
* Change target from dxbc to hlsl to see if that fixes linux issues
* Use Path::getFileNameWithoutExt
* Change string-literal.slang.expected to have extra filename in decoration
Co-authored-by: Yong He <yonghe@outlook.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Bring heterogeneous-hello-world back up to date.
* Reintroduced heterogeneous-hello-world into the premake
* No longer uses compiled bytecode for entry point, instead a loadModule
call is hardocoded with the slang file name.
* Entry point is, similarly, hardcoded for now.
* Added a bypass to slang-legalize-types for an unneeded GPUForeach check
* Run premake and change to relative path
* Removed experimental and added README
Co-authored-by: Yong He <yonghe@outlook.com>
|