summaryrefslogtreecommitdiffstats
path: root/source/slang/slang-emit-cpp.cpp
Commit message (Collapse)AuthorAge
* Remove use of `G0` and `__target_intrinsic` in stdlib. (#4170)Yong He2024-05-14
| | | | | | | * Remove use of `G0` and `__target_intrinsic` in stdlib. * Fix. * Fix calling intrinsic in global scope.
* Add host shared library target. (#4098)Yong He2024-05-03
| | | | | | | | | | | | | * Add host shared library target. * Attempt fix. * Fix warnings. * try fix. * Fix test. * Fix.
* Support emitting generic target_intrinsic type. (#3745)Yong He2024-03-12
|
* Refactor compiler option representations. (#3598)Yong He2024-02-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | * Refactor compiler option representation. * Fix binary compatibility. * Add a test for specifying compiler options at link time. * Fix binary compatibility. * Fix binary compatibility. * Fix backward compatibility on matrix layout. * Fix. * Fix. * Fix. * Fix gfx. * Fix gfx. * Fix dynamic dispatch. * Polish.
* Add per-buffer data layout control. (#3551)Yong He2024-02-05
| | | | | | | | | | | | | * Add per-buffer data layout control. Fixes #3534. * Fixes. * Robustness. * Update test. * Fix.
* Unify stdlib `Texture` types into one generic type. (#3327)Yong He2023-11-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Unify Texture types in stdlib into 1 generic type. * Fixes. * Fix. * Fixes. * Fix reflection. * Fix binding reflection. * Add gather intrinsics. * Fix gather intrinsics. * Fix texture type toText. * Fix intrinsic. * fix cuda intrinsic. * Fix project files. * cleanup. * Fix. * Fix. * Fix sampler feedback test. * Fix getDimension intrinsics. * Fix spirv sample image intrinsics. * Fix test. * Fix GLSL intrinsic. * Cleanup. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Add `requirePrelude()` intrinsic function. (#3250)Yong He2023-09-29
| | | | | | | | | * Add `requirePrelude()` intrinsic function. * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Support `constref` parameters passing. (#3249)Yong He2023-09-28
| | | | | | | | | | | | | | | * Support `constref` parameters passing. * Fix. * Fix. * Add test and diagnostic on mix use of __constref and no_diff. * check for [constref] on differentiable member method. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Avoid make copies of __ref parameters when doing autodiff. (#3242)Yong He2023-09-27
| | | Co-authored-by: Yong He <yhe@nvidia.com>
* Add `target_switch` and `intrinsic_asm` statement. (#3154)Yong He2023-08-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Add `target_switch` and `__intrinsic_asm` statement. * Cleanup. * WaveGetActiveMask, WaveGetActiveMask, WaveCountBits. * WaveIsFirstLane. * More wave intrinsics. * wave intrinsics. * merge fix. * Fix. * Fix. * Update test. * update test. * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Lower all ByteAddressBuffer uses for SPIRV. (#3143)Yong He2023-08-23
| | | Co-authored-by: Yong He <yhe@nvidia.com>
* Support per field matrix layout (#3101)Yong He2023-08-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | * Support per field matrix layout * Fix warnings. * Fix. * Fix tests. * Fix spiv gen. * Fix. * More test fixes. * Fix. * Run only GPU tests on self-hosted servers. * Remove -use-glsl-matrix-layout-modifier. * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Fix native string emit for CUDA/Cpp backend. (#2980)Yong He2023-07-12
| | | Co-authored-by: Yong He <yhe@nvidia.com>
* Fix for operator assignment issue (#2951)jsmall-nvidia2023-06-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | * WIP handling LValue coercion via LValueImplicitCast * Need to have the ptr type for the cast. * Casting conversion working on C++. * Make the LValue casts record if in or in/out as we can produce better code if we know the difference. * WIP LValueCast pass * Fix tests so we don't fail because downstream compilers detect use of uninitialized variable. * Do conversions through through tmp for l-value scenarios that can't work other ways. * Fix a typo. * Change diagnostic implicit-cast-lvalue for a type that still exhibits the issue. * Add matrix test. * Added a bit more clarity around LValue casting choices. * Small comment improvements. Improvements based on comments on PR. * Use findOuterGeneric.
* Various fixes for autodiff and slangpy. (#2876)Yong He2023-05-09
| | | | | | | | | | | | | * Various fixes for autodiff and slangpy. * Fix cuda code gen for `select`. * Fix getBuildTagString(). * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Fix most of the disabled warnings on gcc/clang (#2839)Ellie Hermaszewska2023-04-26
|
* StringBuilder to lowerCamel (#2840)jsmall-nvidia2023-04-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * WIP lowerCamel Dictionary. * WIP more lowerCamel fixes for Dictionary. * Add/Remove/Clear * GetValue/Contains * Fix tabs in dictionary. Count -> getCount * Fix fields with caps. * Key -> key Value -> value Use m_ for members where appropriate. Use lowerCamel in linked list. * Some small fixes/improvements to Dictionary. * Kick CI. * Small tidy on String. * Append -> append * ToString -> toString ProduceString -> produceString * Small fixes. * StringToXXX -> stringToXXX * Fix typo introduced by Append -> append. * Made intToAscii do reversal at the end. --------- Co-authored-by: Yong He <yonghe@outlook.com>
* Dictionary using lowerCamel (#2835)jsmall-nvidia2023-04-25
| | | | | | | | | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * WIP lowerCamel Dictionary. * WIP more lowerCamel fixes for Dictionary. * Add/Remove/Clear * GetValue/Contains * Fix tabs in dictionary. Count -> getCount * Fix fields with caps. * Key -> key Value -> value Use m_ for members where appropriate. Use lowerCamel in linked list. * Some small fixes/improvements to Dictionary. * Kick CI.
* Fix missing `f` suffix for float lits in CUDA backend. (#2791)Yong He2023-04-11
| | | Co-authored-by: Yong He <yhe@nvidia.com>
* Emit simpler vector element access code. (#2770)Yong He2023-04-03
| | | | | | | | | * Emit simpler vector element access code * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Translate all composed types into tuple types in pyBind. (#2744)Yong He2023-03-27
| | | | | | | | | | | * Translate all composed types into tuple types in pyBind. * Delete temp file. * Fix get tuple element code emit logic. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Add PyTorch C++ binding generation. (#2734)Yong He2023-03-26
| | | | | | | | | * Add PyTorch C++ binding generation. * fix --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Support for producing SourceMap on emit (#2707)jsmall-nvidia2023-03-17
| | | | | | | | | | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * WIP source map. * Split out handling of RttiTypeFuncs to a map type. * Make RttiTypeFuncsMap hold default impls. * Slightly more sophisticated RttiTypeFuncsMap * Source map decoding. * Fix tabs. * Fix asserts due to negative values. * Use less obscure mechanisms in SourceMap. * Source map decoding. Simplifying SourceMap usage. * First attempt at ouputting a source map as part of emit. * Added support for -source-map option. SourceMap is added to the artifact.
* More control flow simplifications. (#2673)Yong He2023-02-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * More control flow and Phi param simplifications. * Fix. * Fix gcc error. * Fix. * More IR cleanup. * Fix bug in phi param dce + ifelse simplify. * Propagate and DCE side-effect-free functions. * Enhance CFG simplifcation to remove loops with no side effects. * Fix. * Fixes. * Fix tests. Add [__AlwaysFoldIntoUseSite] for rayPayloadLocation. * More cleanup. * Fixes. * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Overhaul global inst deduplication and cpp/cuda backend. (#2654)Yong He2023-02-16
| | | | | | | | | * Overhaul global inst deduplication and cpp/cuda backend. * Update IR documentation. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Fix code generation for matrix reshape. (#2568)Yong He2022-12-14
| | | Co-authored-by: Yong He <yhe@nvidia.com>
* Rename IR opcodes to unify style. (#2556)Yong He2022-12-07
| | | Co-authored-by: Yong He <yhe@nvidia.com>
* Rework differential conformance dictionary checking. (#2483)Yong He2022-11-02
| | | | | | | * Rework differential conformance dictionary checking. * Revert space changes. Co-authored-by: Yong He <yhe@nvidia.com>
* Run simple compute kernel in gfx-smoke test. (#2400)Yong He2022-09-15
|
* Language feature: pointer sized int types. (#2401)Yong He2022-09-15
| | | | | | | | | | | | | | | | | | | | | * Language feature: pointer sized int types. * Fix. * small change to test. * Fix stdlib. * Fix. * Fix. * Add typedef for `size_t` in stdlib. * Fix test. * Add `intptr_t::size` constant. Co-authored-by: Yong He <yhe@nvidia.com>
* Assorted Artifact improvements (#2374)jsmall-nvidia2022-08-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * WIP replacing DownstreamCompileResult. * First attempt at replacing DownstreamCompileResult with IArtifact and associated types. * Small renaming around CharSlice. * ICastable -> ISlangCastable Added IClonable Fix issue with cloning in ArtifactDiagnostics. * Only add the blob if one is defined in DXC. * Guard adding blob representation. * Make cloneInterface available across code base. Set enums backing type for ArtifactDiagnostic. * Added ::create for ArtifactDiagnostics. * Use SemanticVersion for DownstreamCompilerDesc. Set sizes for enum types. * Depreciate old incompatible CompileOptions. Change SemanticVersion use 32 bits for the patch. * Split out CastableUtil. * Change IDownstreamCompiler to use canConvert and convert to use artifact types. * Fix typos. * Fix typo bug. Allow trafficing in PTX assembly/binaries * struct DownstreamCompilerBaseUtil -> struct DownstreamCompilerUtilBase * Add other riff types. * Small fix around artifact kind. * Make using slices instead of strings explicit on atomic ref counted types. (not complete). Added IArtifactList. Use IArtifactList to hold the 'associated' files. Use IUnknown for scoping for atomic ref counting. Small naming improvements. * Make artifact not use String in construction (so it owns contents). * Calculate compile products as artifacts. * Small improvements around ArtifactDesc. * Use ICastableList for list of artifacts and remove IArtifactList.
* Call `gfx` in slang program. (#2370)Yong He2022-08-20
|
* Artifact split interface and implementation (#2349)jsmall-nvidia2022-08-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * WIP with hierarchical enums. * Some small fixes and improvements around artifact desc related types. * Improvements around hierarchical enum. * Fixes to get Artifact types refactor to be able to execute tests. * Attempt to better categorize PTX. * Work around for potentially unused function warning. * Typo fix. * Simplify Artifact header. * Small improvements around Artifact kind/payload/style. * Added IDestroyable/ICastable * Add IArtifactList. * First impl of IArtifactUtil. * Use the ICastable interface for IArtifactRepresentation. * Added IArtifactRepresentation & IArtifactAssociated. * Add SLANG_OVERRIDE to avoid gcc/clang warning. * Fix calling convention issue on win32. * Fix missing SLANG_OVERRIDE. * First attempt at file abstraction around Artifact. * Added creation of lock file. * Move functionality for determining file paths to the IArtifactUtil. Add casting to ICastable. * Added some casting/finding mechanisms. * Simplify IArtifact interface, and use Items for file reps. * Fix problem with libraries on DXIL. * Split out ArtifactRepresentation. * Move ArtifactDesc functionality to ArtifactDescUtil. ArtifactInfoUtil becomes ArtifactDescUtil. * Split implementations from the interfaces for Artifact. * Use TypeTextUtil for target name outputting. * Add artifact impls.
* Allow `class` to implement COM interface, [DLLExport] (#2338)Yong He2022-07-25
| | | | | | | * Allow `class` to implement COM interface, [DLLExport] * Fix [COM] usage in tests and examples with UUIDs. Co-authored-by: Yong He <yhe@nvidia.com>
* Support `class` types. (#2321)Yong He2022-07-12
| | | | | | | | | * Support `class` types. * Ignore class-keyword test * Fix codereview comments and warnings. Co-authored-by: Yong He <yhe@nvidia.com>
* Native call marshalling for ComPtr parameters and return values. (#2305)Yong He2022-06-29
| | | Co-authored-by: Yong He <yhe@nvidia.com>
* Add CPU executable compile test (#2278)Yong He2022-06-21
| | | | | | | | | | | * Add cpu executable compile test * Fix. * Fix permission on linux * retrigger build Co-authored-by: Yong He <yhe@nvidia.com>
* Actual global support (#2262)jsmall-nvidia2022-06-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * Use TerminatedUnownedStringSlice for literals in output C++. * Remove Escape/Unescape functions used in slang-token-reader.cpp Add target type of 'host-cpp' etc to map to the target types. * Fix some corner cases around string encoding. * Added unit test for string escaping. Fixed some assorted escaping bugs. * Updated test output. * Added decode test. * Stop using hex output, to get around 'greedy' aspect. Use octal instead. * Added HostHostCallable Small changes to use ArtifactDesc/Info instead of large switches. * Fix C++ emit to handle arbitrary function export. * Add options handling for callable without an output being specified. * Can compile with COM interface. Added example using com interface. * Use the IR Ptr type instead of hack in C++ emit for interfaces. * Fix issue with outputting the COM call when ptr is used. * Fix crash issue on compilation failure. * Add support for __global. * Added `ActualGlobalRate` Added special handling around globals and COM interfaces. Tested out in cpu-com-example. * Fix typo in NodeBase. * Support for accessing globals by name working. * Check that actual global initialization is working. * Refactor the com replacement such that it doesn't need a cache or do anything special with GlobalVar. * Remove context. Only create replacement if needed. * Split out COM host-callable into a unit-test. * host-callable com testing on C++and llvm. * Comment around the COM ptr replacement. * Disable com test on vs 32 bit. Fix C++ prelude * Disable 32 bit targets testing com host-callable. * Use JSON parsing to locate VS version. * Need platform detection in C++prelude. * Fix com host callable test for LLVM. * Work around for not being able to include "targetConditionals.h"
* COM interfaces with host callable (#2258)jsmall-nvidia2022-06-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * Use TerminatedUnownedStringSlice for literals in output C++. * Remove Escape/Unescape functions used in slang-token-reader.cpp Add target type of 'host-cpp' etc to map to the target types. * Fix some corner cases around string encoding. * Added unit test for string escaping. Fixed some assorted escaping bugs. * Updated test output. * Added decode test. * Stop using hex output, to get around 'greedy' aspect. Use octal instead. * Added HostHostCallable Small changes to use ArtifactDesc/Info instead of large switches. * Fix C++ emit to handle arbitrary function export. * Add options handling for callable without an output being specified. * Can compile with COM interface. Added example using com interface. * Use the IR Ptr type instead of hack in C++ emit for interfaces. * Fix issue with outputting the COM call when ptr is used. * Fix crash issue on compilation failure.
* Added NativeStringType (#2252)jsmall-nvidia2022-05-27
| | | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * Use TerminatedUnownedStringSlice for literals in output C++. * Remove Escape/Unescape functions used in slang-token-reader.cpp Add target type of 'host-cpp' etc to map to the target types. * Fix some corner cases around string encoding. * Added unit test for string escaping. Fixed some assorted escaping bugs. * Updated test output. * Added decode test. * Stop using hex output, to get around 'greedy' aspect. Use octal instead.
* Refactor prelude emit (#2236)jsmall-nvidia2022-05-17
| | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * Refactor how prelude output works in emit. * Small improvement to emit output. * Move around comment on target specific language directives based on review. Co-authored-by: Theresa Foley <10618364+tangent-vector@users.noreply.github.com>
* Initial support for COM interface in host code. (#2230)Yong He2022-05-10
| | | | Co-authored-by: Yong He <yhe@nvidia.com> Co-authored-by: Theresa Foley <10618364+tangent-vector@users.noreply.github.com>
* Use IR pass to eliminate phi nodes (#2226)Theresa Foley2022-05-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Use IR pass to eliminate phi nodes "Phi nodes" are one of the key contrivances that makes SSA (Static Single Assignment) form work. Because SSA is so great for compiler IRs, we kind of need to deal with phi nodes, but they also get in the way because they don't have a direct analog in most lower-level machine ISAs or execution models, nor in most of the high-level languages a transpiler wants to emit. As a result a compiler like ours needs to be able to eliminate the phi nodes from a program as part of generating output code. (For any clever people noting that SPIR-V supports phi nodes directly: yes, it does. It doesn't need to and it probably *shouldn't*. Anybody involved in the decision-making knows my reasoning, and anybody else should feel free to ask me if they want the lecture. Anyway...) The basic idea of elimiating phi nodes is simple enough. We replace each phi node with a temporary variable. Uses of the phi use values loaded from the temporary. The operation of the phi itself (assigning a value based on the branch taken) amounts to an assignment into the temporary. Previously, the Slang compiler dealt with phi nodes very late in the process of generating code: in the middle of emitting strings of source code in a high-level language like HLSL or GLSL. Doing the work that late in compilation has two big drawbacks: 1. Our ability to emit clean and/or optimal code is limited because we may not be able to make certain changes to the IR, or because we cannot make use of additional information like a dominator tree that might be available at other points in compilation. 2. Any other IR passes that relate to temporary variables won't be able to see the variables that we generate for phi nodes. This could raise issues with correctness (e.g., if we want to compute live-range information for *all* temporary variables), or performance (we have no way to run additional IR optimization passes after phis are eliminated). This change addresses these problems by making the elimination of phi nodes an explicit IR pass. Additional optimizations can easily be run after this pass (although we'd need to be careful not to run passes that could end up introducing new phis). The pass makes use of the information available to it to try to produce code that will emit to "clean" HLSL/GLSL. The core of the pass is in `slang-ir-eliminate-phis.cpp`, and is heavily commented, so I won't describe the approach in detail here. There are two related issues that came up, though: First, it turned out that our emit logic for local variables (`IRVar` instructions) wasn't using the function we'd defined named `emitVar()`. One worrying consequence of that oversight was that the `precise` modifier would impact generated HLSL/GLSL for variables that turned into SSA values (including phi nodes), but *not* for local variables that had not been SSA'd (or that had been SSA'd and then de-SSA'd). This change also fixes that bug; it is unclear how widespread the impact of the original issue might be. Second, generating explicit IR temporaries for phi nodes exposed a pre-existing bug in the `slang-ir-restructure-scoping` pass. That pass basically detects cases where we have an instruction `I` with a use `U` such that the use follows the rules of SSA form ("def dominates use," meaning `I` dominations `U`), but does not follow the more restrictive scoping rules of high-level-language output (where a value computed "inside" a loop is not automatically visible to code outside the loop just because it dominates that code). That pass did not correctly account for the case where `I` was a temporary variable. It seems that case could not arise before now because we didn't have any passes that would move `var`, `load`, or `store` operations out of the basic block they started in. The fix for that pass was relatively simple, and will make the whole thing more robust in case we add more aggressive optimizations later. * fixup: expected test output
* Support `[DllImport]` (#2181)Yong He2022-04-12
| | | | | | | | | | | | | | | | | * Support `[DllImport]` * Fix. * Fix. * Fix array type emit in cpp. * Fix. * Fix. * Fix Co-authored-by: Yong He <yhe@nvidia.com>
* Refactor: eliminate BackEndCompileRequest (#2178)Theresa Foley2022-04-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | An earlier refactoring pass over the compiler codebase split the type that had been called `CompileRequest` into three distinct pieces: * `FrontEndCompileRequest` which was supposed to own state and options related to running the compiler front end and producing IR + reflection (e.g., what translation units and source files/strings are included). * `BackEndCompileRequest` which was supposed to own state and options related to running the compiler back end to translate the IR for a `ComponentType` (program) into output code. (Note that the `BackEndCompileRequest` was conceived of as orthogonal to the `TargetRequest`s, which store per-target and target-specific options.) * `EndToEndCompileRequest` which was an umbrella object that owns separate front-end and back-end requests, plus any state that is only relevant when doing a true end-to-end compile (such as the kinds of compiles initiated with `slangc`). As originally conceived, the only state that this type was supposed to own was stuff related to "pass-through" compilation, as well as state related to writing of generated code to output files. That refactoring work was very useful at the time, because it allowed us to "scrub" the back end compilation steps to remove all dependencies on front-end and AST state (this was important for our goals of enabling linking and codegen from serialized Slang IR). At this point, however, it is clear that the hierarchy that was built up serves very little purpose: * The `BackEndCompileRequest` type is only used in two places: * As part of an `EndToEndCompileRequest`, where the settings on the `BackEndCompileRequest` can be configured, but only through the `EndToEndCompileRequest` * As part of on-demand code generation through the `IComponentType` APIs. In this case, the settings stored on the `BackEndCompileRequest` are not accessible to the application at all, and will always use their default values, so that instantiating a "request" object doesn't really make any sense. * The `FrontEndCompileRequest` type has a similar situation: * Front-end compilation as part of an `EndToEndCompileRequest` supports user configuration of `FrontEndCompileRequest` settings, but only through the `EndToEndCompileRequest` * Front-end compilation triggered by an `import` or a `loadModule()` call does not support user configuration of settings at all. It will always derive all relevant settings from thsoe on the session ("linkage"). In addition, subsequent changes have been made to the compiler that show a bit of a "code smell" and/or forward-looking worries for this decomposition: * In some cases we've had to add the same setting to multiple types in the breakdown (front-end, back-end, end-to-end, linkage, target, etc.) which makes it harder for us to validate that all the possible mixtures of state work correctly. * Related to the above, in some cases we have manual logic that copies state from one of the objects in the breakdown to another, in order to ensure that the user's intention is actually followed. * As a forward-looking concern, it seems that developers have sometimes added new configuration options and state to places that don't really make sense according to the rationale of the original decomposition (e.g., we probably don't want to have a lot of state that is only available via end-to-end requests, given that the API structure is meant to push users *away* from end-to-end compiles). As a result of all of the above, I've been planning a large refactor with the following big-picture goals: * Eliminate `BackEndCompileRequest` * Move all relevant state/options from the back-end request to the end-to-end request, since that is the only place they could be set anyway. * Introduce a transient "context" type to be used for the duration of code generation that serves the main functions that back-end requests really served in the codebase * Make `EndToEndCompileRequest` be a subclass of `FrontEndCompileRequest` * Consider addding a transient "context" type for front-end compiles that can be used in `import`-like cases rather than needing a full front-end request object. If this works, then eliminate `FrontEndCompileRequest` and be back to world with just a single `CompileRequest` type * Move *all* compiler configuration options to a distinct type (named something like `CompilerConfig` or `CompilerOptions` or whatever) which stores setting as key-value pairs, and has a notion of "inheritance" such that one configuration can extend or build on top of another. Make all the relevant types use this catch-all structure instead of redundantly storing flags in many places. This change deals with the first of those bullets: removeal of `BackEndCompileRequest`. The addition of the `CodeGenContext` type is perhaps an unncessary additional step, but making that change helps clean up a bunch of the code related to per-target code generation, so I think it is the right choice. Co-authored-by: Yong He <yonghe@outlook.com>
* Allow slangc to generate exe from .slang file. (#2170)Yong He2022-03-28
|
* Fixed naming conflicts in heterogeneous-hello-world (#2114)David Siher2022-02-03
| | | | | | | | | | | | | | | | | | | | * Fixed naming conflicts in heterogeneous-hello-world Added 3 new modifiers (`__unmangled`, `__exportDirectly`, `__externLib`) `__unmangled` causes mangleName() to return the normal name of the decl. `__exportDirectly` changes parent decl name concatenation behavior to use "::" instead of "." (for Name Hint) and emits the name hint when it exists, otherwise it emits the mangled name. `__externLib` stops Slang from emitting the corresponding struct. Also made necessary changes to heterogeneous-hello-world so that this new functionality is shown off. * Undo unintentional formatting changes Co-authored-by: Yong He <yonghe@outlook.com>
* Fix a bug in CUDA source emit (#2010)Theresa Foley2021-11-11
| | | It seems that we were missing a `template<>` prefix when emitting explicit specializations for the `Matrix<...>` type in the CUDA prelude. This might not have caused errors in some versions of the CUDA compiler, but it at least seems to fail on the 11.5 SDK, which I am using to test locally.
* Generalize heterogenous code emit (#1968)David Siher2021-10-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Bring heterogeneous-hello-world back up to date. * Reintroduced heterogeneous-hello-world into the premake * No longer uses compiled bytecode for entry point, instead a loadModule call is hardocoded with the slang file name. * Entry point is, similarly, hardcoded for now. * Added a bypass to slang-legalize-types for an unneeded GPUForeach check * Run premake and change to relative path * Removed experimental and added README * Add prebuild command to premake for heterogeneous example * Pass in entry point as parameter (also remove shader bytecode) * Pass in module name as parameter * Squashed commit of the following: commit 5b13b57fe600724344c556fe4309a5d6bb3d39ab Author: Kai Yao <kyao@nvidia.com> Date: Thu Oct 7 23:38:50 2021 -0700 Return diagnostics data when encountering module load error by exception (#1966) commit 112e1515c30fa972ff56f91514b70946153c718c Author: jsmall-nvidia <jsmall@nvidia.com> Date: Thu Oct 7 16:12:29 2021 -0400 Disable test crashing CI (#1965) * #include an absolute path didn't work - because paths were taken to always be relative. * Disable test that appears to be crashing. commit da32069a0c1c8c723d7ef45100049a8f0dd5d9c4 Author: Kai Yao <kyao@nvidia.com> Date: Mon Oct 4 13:58:51 2021 -0700 Modified barrier API to accept multiple resources per call (#1959) Co-authored-by: Yong He <yonghe@outlook.com> commit 97bb82ebcdf8f1391b9d93b5a8d7b1dfc4e88e52 Author: jsmall-nvidia <jsmall@nvidia.com> Date: Mon Oct 4 14:15:51 2021 -0400 Removing exceptions from core/compiler-core (#1953) * #include an absolute path didn't work - because paths were taken to always be relative. * Refactor Stream. Working on all tests. * Split out CharEncode. * Make method names lower camel. m_prefix in Writer/Reader * Tidy up around CharEncode interface. * Small improvements around encode/decode. * Better use of types. * Remove readLine from TextReader. * Remove exceptions from Stream/Text handling. * Fix some typos. * Fix tabbing. * Fix missing override. * Remove remaining exception throw/catch via using signal mechanism. * Remove exceptions that are not used anymore. * Document the Stream interface. * Remove index for decoding 'get byte' function. * Fix CharReader -> ByteReader. commit b3dfe383c6d31ff3dbd76dcfb32de8d536382f3e Author: lucy96chen <47800040+lucy96chen@users.noreply.github.com> Date: Mon Oct 4 09:46:33 2021 -0700 Get native handles for TextureResource and BufferResource (#1960) * Added getNativeHandle() to TextureResource and BufferResource; Implemented getNativeHandle() in Vulkan and D3D12; Added new unit test files for the aforementioned implementation * Added missing getNativeHandle() implementations to renderer-shared.cpp and CUDA * Finished new getNativeHandle() unit tests for ITextureResource and IBufferResource; Modified ICommandQueue and ICommandBuffer unit tests to call QueryInterface to convert to IUnknown then back and compare resulting pointers for equality * Unit tests updated and pass locally * Cast m_buffer.m_buffer and m_image to uint64_t commit 35bca4cc432613af3926da3bed217a6baa9cbd26 Author: lucy96chen <47800040+lucy96chen@users.noreply.github.com> Date: Fri Oct 1 13:08:25 2021 -0700 Add getNativeHandle() to ICommandQueue and ICommandBuffer (#1952) * Added support for getting command buffer and command queue handles to ICommandBuffer and ICommandQueue; D3D12Device, VkDevice, and DebugDevice modifieid to implement this new functionality; immediate-renderer-base.cpp also modified to implement the new functions * Removed excess boilerplate * Changed readRef() to get() in D3D12 getNativeHandle() implementation for ICommandBuffer and ICommandQueue * Added unit tests for new getNativeHandle() implementations, unfinished * Queue test added; Minor cleanup changes * getBufferHandleTestImpl() now closes the command buffer before returning * Added getNativeHandle() implementations to CUDADevice * Added comment clarifying that the Vulkan check is checking for a null handle, which is defined to be 0 commit 6c6200f547c7387598743b23bb3c8f0d375d9494 Author: Kai Yao <kyao@nvidia.com> Date: Thu Sep 30 20:25:34 2021 -0700 VK Resource Barrier (#1955) * Resource barrier API and VK implementation * Stub implementations * Handle VK Acceleration Structure flag * Add a couple more cases to pipeline barrier stages commit 627fc976bac5c2381dbace9c7925cb6a68b8de12 Author: Yong He <yonghe@outlook.com> Date: Thu Sep 30 19:48:47 2021 -0700 Fix aarch64 build on github (#1957) Co-authored-by: Yong He <yhe@nvidia.com> commit 122d701513e116856bd59c999221ce36a373d7db Author: Yong He <yonghe@outlook.com> Date: Thu Sep 30 17:51:56 2021 -0700 Fix GitHub release (#1956) * Fix aarch64 release build config. * Fix for WinAarch64 build. * Update premake for embed-std-lib build on aarch64. * `platform` fix for aarach64 build. * Try revert back to use absolute output path for slang-stdlib-generated.h * Fix * fix Co-authored-by: Yong He <yhe@nvidia.com> commit aa8f7b899b7b562b3d3c6e25c3da41569505e70c Author: Chad Engler <englercj@live.com> Date: Wed Sep 29 13:02:47 2021 -0700 Fix ARM64 detection for MSVC (#1951) commit 6736b0c1c5fa3e89bc561eb7965a1a0d17af3466 Author: Yong He <yonghe@outlook.com> Date: Wed Sep 29 11:29:46 2021 -0700 Add ISession::loadModuleFromSource. (#1950) Co-authored-by: Yong He <yhe@nvidia.com> commit d8e452412e14a6a8ba137f2adcae13b398e5cecb Author: Yong He <yonghe@outlook.com> Date: Tue Sep 28 15:03:03 2021 -0700 Fix AbortCompilationException leaking through loadModule API. (#1949) * Fix AbortCompilationException leaking through loadModule API. * Update. * Fix. Co-authored-by: Yong He <yhe@nvidia.com> commit cdf1b2c007fefdca128584d2a9f63dec3d350e16 Author: Yong He <yonghe@outlook.com> Date: Tue Sep 28 11:54:24 2021 -0700 Improvements to the unit test framework. (#1948) commit af788b62e18bbd55cd748ad60400a74cf1bc93ee Author: lucy96chen <47800040+lucy96chen@users.noreply.github.com> Date: Fri Sep 24 16:53:41 2021 -0700 Add existing device handle support unit test (#1946) commit bec8e6aec85b6e3f875c58bdd59eb15613978358 Author: Yong He <yonghe@outlook.com> Date: Fri Sep 24 11:33:44 2021 -0700 Move existing unit tests to a standalone dll. (#1945) commit f2a3c933bc11a498c622fa18694c84beca8ca031 Author: lucy96chen <47800040+lucy96chen@users.noreply.github.com> Date: Thu Sep 23 12:19:49 2021 -0700 Add method to retrieve native handles (#1944) * Added a getNativeHandle() method that retrieves the natively created handles; Modified RendererBase, VKDevice, D3D12Device, and DebugDevice to implement this new method * Moved ExistingDeviceHandles out of Desc directly inside IDevice and renamed to NativeHandles; Modified calls accessing the struct accordingly in RendererBase, DebugDevice, VKDevice, and D3D12Device * Minor cleanup changes (renames, etc.) commit b9b398d038b524f15a86ff27cd6888d54e8754e0 Author: Yong He <yonghe@outlook.com> Date: Wed Sep 22 10:06:59 2021 -0700 Add gfx unit testing framework. (#1943) * Add gfx unit testing framework. * Fix compilation error. * Reset gfxDebugCallback after render_test. * Pass enabledApi flags through. * Fix for code review suggestions. Co-authored-by: Yong He <yhe@nvidia.com> commit 6e9cee69b3588ddae09b08b9f580f59ad899983f Author: lucy96chen <47800040+lucy96chen@users.noreply.github.com> Date: Tue Sep 21 18:46:32 2021 -0700 Support for existing device/instance handles in Vulkan (#1942) commit b1f04c8544c650de3947955ca68f679535d249aa Author: lucy96chen <47800040+lucy96chen@users.noreply.github.com> Date: Wed Sep 15 20:22:45 2021 -0700 Allow D3D12Device to use an existing device handle (#1940) * Added a new field for an existing device handle to IDevice::Desc; Modified D3D12Device::initialize to set the device stored in desc if it already exists instead of creating a new one * Turned existingDeviceHandle into a struct containing an array of two elements; Updated D3D12Device::initialize to match changes to existingDeviceHandle; Updated comments * Fixed style error for ExistingDeviceHandles struct commit 2f7b9f5ae8be21c6c1d75ae9caefbc7b3f8986a9 Author: Pablo Delgado <private@pablode.com> Date: Thu Sep 16 01:17:57 2021 +0200 Fix incorrect WIN32 macros and missing Windows.h inclusion (#1939) * Replace WIN32 preprocessor macros with _WIN32 * Add missing Windows.h include for InterlockedIncrement commit 11d43642008905ac69a3832eb8a9b2ae7b785f86 Author: Yong He <yonghe@outlook.com> Date: Tue Sep 14 11:36:44 2021 -0700 Avoid upcasting to f32 in 16bit float-uint bit cast. (#1938) Co-authored-by: Yong He <yhe@nvidia.com> commit 502aa3812a82cf0d091cff0c67804e4ee448ac78 Author: David Siher <32305650+dsiher@users.noreply.github.com> Date: Tue Sep 14 12:59:55 2021 -0400 Bring heterogeneous-hello-world back up to date. (#1935) * Bring heterogeneous-hello-world back up to date. * Reintroduced heterogeneous-hello-world into the premake * No longer uses compiled bytecode for entry point, instead a loadModule call is hardocoded with the slang file name. * Entry point is, similarly, hardcoded for now. * Added a bypass to slang-legalize-types for an unneeded GPUForeach check * Run premake and change to relative path * Removed experimental and added README Co-authored-by: Yong He <yonghe@outlook.com> * Revert "Squashed commit of the following:" This reverts commit 4f665858d65f7c332c616ef6db9fdafa1c5e0b9f. * Run premake * Remove prebuild command (only works on Windows?) * Rerun premake * Fix heterogeneous prebuild command * Remove linux specific prebuild command * Fix prebuild command (again) * Change target from dxbc to hlsl to see if that fixes linux issues * Use Path::getFileNameWithoutExt * Change string-literal.slang.expected to have extra filename in decoration Co-authored-by: Yong He <yonghe@outlook.com>
* Bring heterogeneous-hello-world back up to date. (#1935)David Siher2021-09-14
| | | | | | | | | | | | | | | | | | * Bring heterogeneous-hello-world back up to date. * Reintroduced heterogeneous-hello-world into the premake * No longer uses compiled bytecode for entry point, instead a loadModule call is hardocoded with the slang file name. * Entry point is, similarly, hardcoded for now. * Added a bypass to slang-legalize-types for an unneeded GPUForeach check * Run premake and change to relative path * Removed experimental and added README Co-authored-by: Yong He <yonghe@outlook.com>