| Commit message (Collapse) | Author | Age |
| |
* Prefixing source files in source/slang with slang-
* Prefix source in source/slang with slang- prefix.
* Rename core source files with slang- prefix.
* Update project files.
* Fix problems from automatic merge.
|
* A few changes required for application adoption of interface-type parameters
There are a few small changes here that are all related in that they arose from trying to integrate support for specialization via global interface-type shader parameters into a real application.
Allow querying the "pending" layout via reflection API
------------------------------------------------------
The naming here isn't ideal, and could probably use a round of "bikeshedding" to arrive at something better, but the basic idea is that when you have a type like:
```
struct MyStuff
{
int a;
IFoo foo;
int b;
}
```
the fields `a` and `b` get allocated space directly in the "primary" layout for `MyStuff` (at offsets 0 and 4, with `sizeof(MyStuff) == 8`), but the `foo` field can't be allocated space until we know what concrete type will get plugged in there.
If we have a concrete type in mind:
```
struct Bar : IFoo { int bar; }
```
then we can know how much space the `foo` field will take up, but we still can't allocate it space directly in `MyStuff`, because we already decided that `sizeof(MyStuff) == 8`.
Now imagine we place some `MyStuff` values into constant buffers:
```
cbuffer X {
MyStuff x;
}
cbuffer Y {
MyStuff y;
float4 z;
}
```
In each case we know that we want to place the `MyStuff::foo` field at the end of the containing constant buffer so that it doesn't disrupt the layout of the existing fields. But that means that the offset of `MyStuff::foo` relative to the start of the `MyStuff` isn't fixed, because of unrelated fields like `z` that need to get in between.
In our layout code, we handle this by having a notion of a "pending" layout. Once we know how `MyStuff::foo` will be specialized, we can compute both a "primary" and a "pending" layout for `MyStuff`, which basically treats it as if it were two distinct types:
```
struct MyStuff_Primary
{
int a;
int b;
}
struct MyStuff_Pending
{
Bar foo;
}
```
Layout for an aggregate type like the `X` or `Y` constant buffer then proceeds by computing an aggregate primary layout and an aggregate pending layout, and then finally a constant buffer or parameter block "flushes" all or part of the pending data by appending it to the primary data to get the final layout.
What all this means is that a type like `MyStuff` will have two different layouts (a default one for the primary data and a "pending" one for any specialized interface-type fields), and a variable like `Y::y` will also have two variable layouts that specify offsets (one set of offsets for its primary part, and one set of offsets for its pending part).
In order to handle interface-type fields with these layout rules, an application needs a way to query the "pending" part of a type or variable layout, which luckily gives it back just another type/variable layout. The API change here is minimal, although actually exploiting the new API correctly in application code could prove challenging.
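As a rough illustration of the primary/pending split described above, here is a minimal C++ sketch. It is not the compiler's layout code: `FieldLayout`, `TypeLayout`, `layoutAggregate`, and `pendingFieldOffset` are hypothetical names, and the sketch ignores alignment and constant-buffer packing rules, so the byte offsets are only indicative.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model: each field contributes some bytes to the "primary"
// layout of its parent and some bytes to the "pending" layout.
struct FieldLayout {
    std::string name;
    std::size_t primarySize; // bytes allocated directly in the parent
    std::size_t pendingSize; // bytes deferred until a container flushes them
};

struct TypeLayout {
    std::size_t primarySize = 0;
    std::size_t pendingSize = 0;
};

// Aggregate layout: primary and pending sizes accumulate independently.
// (Alignment and packing are deliberately ignored in this sketch.)
TypeLayout layoutAggregate(const std::vector<FieldLayout>& fields) {
    TypeLayout result;
    for (const FieldLayout& field : fields) {
        result.primarySize += field.primarySize;
        result.pendingSize += field.pendingSize;
    }
    return result;
}

// A constant buffer "flushes" pending data by appending it after all of the
// primary data, so a pending offset is rebased by the container's primary size.
std::size_t pendingFieldOffset(const TypeLayout& container, std::size_t offsetInPending) {
    assert(offsetInPending < container.pendingSize);
    return container.primarySize + offsetInPending;
}
```

Under these assumptions, `MyStuff` has 8 primary bytes and 4 pending bytes (for `Bar foo`). In `cbuffer X` the flushed `foo` lands at byte 8, while in `cbuffer Y` it lands after `z` at byte 24, which is why the offset of `foo` relative to the start of `MyStuff` cannot be fixed in advance.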
Allow creating of explicitly specialized types
----------------------------------------------
This feature isn't actually implemented all the way through the compiler (I just needed enough to make the API calls go through), but I've added support for specializing a type that has interface-type fields through the reflection API. This maps to an `ExistentialSpecializedType` in the AST, and I'm lowering it to the IR as a `BindExistentialsType`, although that isn't 100% correct for the future.
This feature will require a future PR to actually flesh out the implementation work, but I'll wait until that is the sticking point on the application side before I do that.
Introduce a tiny `Hasher` abstraction
-------------------------------------
While implementing all the boilerplate for a new `Type` subclass (we really need to reduce that work...), I got fed up with how we do hash-code computation and introduced a small utility `Hasher` type that is intended to wrap up the idiom of combining hashes. For now this isn't a major change, but in the future I'd like to expand on the design a bit to clean up some of the warts around how we handle hashing:
* The `Hasher` implementation can and should switch from maintaining a single `HashCode` as its state to something that contains a more complete state (larger than the hash code) and just hashes new bytes into that state as it goes. This should make it possible to implement a `Hasher` for more serious hash functions, whether MD5, CityHash, or whatever we decide is a good default.
* Things that are hashable shouldn't have a `getHashCode()` method, but instead should have something like a `hashInto(Hasher&)` method. This change would have the dual benefits that (1) a composite type can easily hash all the fields that contribute to its identity into the hasher with minimal fuss/boilerplate, and (2) the hashes for composite types will be of higher quality because they can exploit all the bits of the hasher's state to combine the fields, instead of restricting each sub-field to just the bits in a hash code.
We should be able to incrementally improve the quality of our design there over future changes, but for now it probably isn't a critical priority.
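To make the `hashInto(Hasher&)` idea concrete, here is a minimal sketch of what such a design could look like. This is not the `Hasher` added in this change: the method names `hashBytes`, `hashValue`, `hashObject`, and `getResult`, the `Point` type, and the choice of 64-bit FNV-1a as the byte-mixing function are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical byte-oriented hasher; the state is updated one byte at a time
// using the 64-bit FNV-1a recurrence.
class Hasher {
public:
    void hashBytes(const void* data, std::size_t size) {
        assert(data != nullptr || size == 0);
        const unsigned char* bytes = static_cast<const unsigned char*>(data);
        for (std::size_t i = 0; i < size; ++i) {
            m_state ^= bytes[i];
            m_state *= 0x100000001b3ull; // FNV-1a 64-bit prime
        }
    }

    // Hash the raw bytes of a plain-data value.
    template <typename T>
    void hashValue(const T& value) {
        static_assert(std::is_trivially_copyable<T>::value, "hashValue expects plain data");
        hashBytes(&value, sizeof(T));
    }

    // Let a composite type feed its own fields into this hasher.
    template <typename T>
    void hashObject(const T& object) { object.hashInto(*this); }

    std::uint64_t getResult() const { return m_state; }

private:
    std::uint64_t m_state = 0xcbf29ce484222325ull; // FNV-1a 64-bit offset basis
};

// Example composite type: it lists the fields that define its identity.
struct Point {
    int x;
    int y;
    void hashInto(Hasher& hasher) const {
        hasher.hashValue(x);
        hasher.hashValue(y);
    }
};
```

With this shape, a composite type never produces an intermediate hash code of its own; every field is mixed into the shared state, which is the quality benefit described in point (2) above.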
Fixes for legalization of existential types
-------------------------------------------
There were some missing cases in the handling of type legalization, such that a global interface-type shader parameter that got specialized to a type that contains *only* resource-type fields would cause a crash in the legalization step.
I added a test for this case, and then made `ir-legalize-types.cpp` account for it (the code to handle it is a bit of a kludge, and shows that the `declareVars()` routine there is getting to a level of complexity that is worrying).
* fixup: review feedback
|
* List made members m_
Tweaked types to closer match conventions.
* Use asserts for checking conditions on List.
Other small improvements.
* List<T>.Count() -> getSize()
* List<T>
Add -> add
First -> getFirst
Last -> getLast
RemoveLast -> removeLast
ReleaseBuffer -> detachBuffer
GetArrayView -> getArrayView
* List<T>::
AddRange -> addRange
Capacity -> getCapacity
Insert -> insert
InsertRange -> insertRange
AddRange -> addRange
RemoveRange -> removeRange
RemoveAt -> removeAt
Remove -> remove
Reverse -> reverse
FastRemove -> fastRemove
FastRemoveAt -> fastRemoveAt
Clear -> clear
* List<T>
FreeBuffer -> _deallocateBuffer
Free -> clearAndDeallocate
SwapWith -> swapWith
* List<T>
SetSize -> setSize
Reserve -> reserve
GrowToSize -> growToSize
* UnsafeShrinkToSize -> unsafeShrinkToSize
Compress -> compress
FindLast -> findLastIndex
FindLast -> findLastIndex
Simplify Contains
* List<T>
Removed m_allocator (wasn't used)
Swap -> swapElements
Sort -> sort
Contains -> contains
ForEach -> forEach
QuickSort -> quickSort
InsertionSort -> insertionSort
BinarySearch -> binarySearch
Max -> calcMax
Min -> calcMin
* Initializer::Initialize -> initialize
List<T>::
Allocate -> _allocate
Init -> _init
IndexOf -> indexOf
* * Put #include <assert.h> in common.h, and remove unneeded inclusions
* Small refactor of ArrayView - remove stride as not used
* getSize -> getCount
setSize -> setCount
unsafeShrinkToSize->unsafeShrinkToCount
growToSize -> growToCount
m_size -> m_count
* Some tidy up around Allocator.
* Use Index type on List.
* Refactor of IntSet.
First tentative look at using Index.
* Made Index an Int
Did preliminary fixes.
Made String use Index.
* Partial refactor of String.
* String::Buffer -> getBuffer
ToWString -> toWString
* Small improvements to String.
String::
Buffer() -> getBuffer()
Equals() -> equals
* Try to use Index where appropriate.
* Fix warnings on windows x86 builds.
|
* Add better control over image formats for GLSL/SPIR-V targets
Currently Slang emits GLSL code assuming all R/W images need to have explicit formats, and thus we try to infer a format from the element type of the image.
E.g., given a `RWTexture2D<half4>` we might infer that a qualifier of `layout(rgba16f)` should be used.
This strategy has two notable shortcomings:
* Sometimes the user will want a format that doesn't match an existing HLSL type. E.g., if they want the equivalent of `layout(r11f_g11f_b10f)`, then what should they put in their `RWTexture2D<...>` to make the inference do what they need?
* Sometimes the user knows that they don't need to specify a format *at all*, because using the `GL_EXT_shader_image_load_formatted` extension, they can still perform non-atomic load/store on images with no format specified in the SPIR-V.
This change adds two features directed at these challenges.
First, we add an explicit `[format(...)]` attribute that can be used to specify an explicit image format, including ones that don't match any HLSL type.
An example of using this new attribute is:
```hlsl
[format("r11f_g11f_b10f")]
RWTexture2D<float3> myImage;
```
For simplicity in initial bring-up, the new formats all use the same naming as formats in GLSL (this should make it easy for a programmer who knows what they expect to get in the GLSL output). We can change the naming convention for formats at a later time, so long as we keep these existing names in as a compatibility feature.
Note that this is *not* given a `vk::` prefix since the attribute should signal the programmer's intent to provide an image with that format on *all* targets (although only some targets might act on that information).
Also note that the attribute takes a string (`[format("rgba8")]`) instead of a bare identifier (`[format(rgba8)]`) because this is consistent with the existing convention for attributes in HLSL.
When `[format(...)]` is left off, the default compiler behavior will still be to infer a format, but this behavior can be overridden for a single image using an explicit format of `"unknown"`:
```hlsl
[format("unknown")]
RWTexture2D<float4> mysteryMachine;
```
The second new feature is that if a user knows they are coding for a GPU that supports the `"unknown"` format in all non-atomic cases, then they can opt into making that the default for images without an explicit `[format(...)]`, using the new `-default-image-format-unknown` command-line option for `slangc`.
The new test case included with this change confirms that we correctly see the explicit formats in the output GLSL and *no* formats for images without explicit `[format(...)]` when using the new command-line option. The test stresses images declared at global scope, in parameter blocks, and in entry-point parameter lists, to try and make sure that all the relevant IR passes in the compiler preserve the format information.
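The precedence between the explicit attribute, the command-line option, and inference can be summarized with a small sketch. This is not the compiler's implementation: `chooseImageFormat` and `formatQualifier` are hypothetical helpers, and formats are modeled as plain strings using the GLSL names.

```cpp
#include <cassert>
#include <string>

// Hypothetical decision function. `explicitFormat` is null when the image has
// no [format(...)] attribute; `inferredFormat` is what element-type inference
// would produce (e.g. "rgba16f" for RWTexture2D<half4>).
std::string chooseImageFormat(const std::string* explicitFormat,
                              const std::string& inferredFormat,
                              bool defaultImageFormatUnknown) {
    assert(!inferredFormat.empty());
    if (explicitFormat != nullptr) {
        return *explicitFormat; // explicit attribute wins, including "unknown"
    }
    if (defaultImageFormatUnknown) {
        return "unknown"; // -default-image-format-unknown was passed
    }
    return inferredFormat; // default behavior: infer from the element type
}

// An "unknown" format means no layout qualifier is emitted in the GLSL.
std::string formatQualifier(const std::string& format) {
    return format == "unknown" ? std::string() : "layout(" + format + ")";
}
```

In this model an image with `[format("r11f_g11f_b10f")]` always gets that format, an unannotated image gets the inferred format by default, and the same unannotated image gets no qualifier once the option is enabled.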
* fixup: missing file
|
* Split front- and back-ends
This change is a major refactor of several of the types that provide the behind-the-scenes implementation of the public C API.
The goal of this refactor is primarily to allow for future API services that let the user operate both the front- and back-ends of the compiler in a more complex fashion.
For example, a user should be able to compile a bunch of source code into modules, look up types, functions, etc. in those modules, specialize generic types/functions to the types they've looked up, and then finally request target code to be generated for specialized entry points.
The back-end code generation they trigger should re-use the front-end compilation work (parsing, semantic checking, IR generation) that was already performed.
The most visible change is that `CompileRequest` has been split up into several smaller types that take responsibility for parts of what it did:
* The `Linkage` type owns the storage for `import`ed modules, as well as the `TargetRequest`s that represent code-generation targets. The intention is that an application could use a single `Linkage` for the duration of its runtime (so long as it was okay with the memory usage), so that each `import`ed module only gets loaded once. For now, this type needs to manage the search paths, file system, and source manager, because of its responsibility for loading files.
* A `FrontEndCompileRequest` owns the stuff related to parsing, semantic checking, and initial IR generation. This most notably includes the `TranslationUnitRequest`s and the `FrontEndEntryPointRequest`s (which used to be just `EntryPointRequest`s). Its main job is to produce AST and IR modules for each translation unit, and to find and validate the entry points. The front-end request does *not* interact with generic arguments for global or entry-point generic parameters.
* The main output of both `import` operations and front-end translation units is the `Module` type, which is just a simple container for both the AST module (to service the reflection/layout APIs, and also for semantic checking of code that `import`s the module) and the IR module (for linking and code generation). This type captures the commonalities between the old `LoadedModule` (which is now just an alias for `Module`) and `TranslationUnitRequest` (which now owns a `Module`).
* The secondary output of front-end compilation is a `Program`, which comprises a list of referenced `Module`s and validated `EntryPoint`s that will be used together. Layout and code generation both need a `Program` to tell them what modules and entry points will be used together (we don't want to just code-gen everything that has ever been loaded into the linkage). The `Program`s created by the front-end do not include generic arguments, so they may provide incomplete layout information and/or be unsuitable for code generation.
* A `BackEndCompileRequest` owns stuff related to turning a `Program` into output kernels for the targets of a `Linkage`. Most of the data it owns beyond the `Program` to be compiled is minor, so this is a good candidate for demotion from a heap-allocated object to just a `struct` of options that gets passed around.
* The `CompileRequestBase` type is an attempt to wrap up the common functionality of both front-end and back-end compile requests. Most of it is just exposing the availability of a linkage and `DiagnosticSink`, so this type is a good candidate for subsequent removal. The main interesting thing it has is the flags related to dumping and validation of IR, so there is probably a good refactoring still to be made around deciding how options should be handled going forward.
* Behind the scenes, the `Program` type is set up to handle some level of on-line compilation and layout work. The `Program` knows the `Linkage` it belongs to, and allows for a `TargetProgram` to be looked up based on a specific `TargetRequest`. A `TargetProgram` then allows layout information and compiled kernel code to be asked for on-demand, in order to support eventual "live" compilation scenarios.
* The `EndToEndCompileRequest` type is a composition/coordination type that replaces the old `CompileRequest` in a way that uses the services of the various other types. It owns a few pieces of state that only make sense in the context of an end-to-end compile (e.g., there is really no way to "pass through" code when the front- and back-ends are run separately) or a command-line compile (everything to do with specifying output paths for files is really just for the benefit of `slangc`, and might even be moved there over time).
* One important detail is that the `EndToEndCompileRequest` owns all of the string-based generic arguments for both global and entry-point generic parameters. The logic in `check.cpp` for dealing with those arguments has been heavily refactored to separate out the parsing steps that are specific to end-to-end compilation with string-based type arguments, and the semantic checking steps that result in a specialized `Program` (which can be exposed through new APIs that aren't tied to end-to-end compilation).
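The "each `import`ed module only gets loaded once" property of `Linkage` described in the list above amounts to a cache keyed by module name. The following sketch shows only that idea; the member names (`findOrImportModule`, `getLoadCount`, `m_loadedModules`) are hypothetical, and the real type also manages search paths, the file system, and the source manager.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Stand-in for the real `Module` (which holds both AST and IR).
struct Module {
    std::string name;
};

// Hypothetical, heavily simplified `Linkage`: it owns loaded modules for as
// long as it lives, so repeated imports of the same name reuse earlier work.
class Linkage {
public:
    std::shared_ptr<Module> findOrImportModule(const std::string& name) {
        assert(!name.empty());
        auto found = m_loadedModules.find(name);
        if (found != m_loadedModules.end()) {
            return found->second; // already loaded: no front-end work repeated
        }
        // Placeholder for parsing, checking, and IR generation of the module.
        auto module = std::make_shared<Module>();
        module->name = name;
        ++m_loadCount;
        m_loadedModules[name] = module;
        return module;
    }

    int getLoadCount() const { return m_loadCount; }

private:
    std::map<std::string, std::shared_ptr<Module>> m_loadedModules;
    int m_loadCount = 0;
};
```

An application that keeps one such object alive for its whole runtime pays the loading cost for a given module once, at the price of the memory held by the cache.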
It is perhaps not surprising that this change had a lot of consequences, so I'll briefly run over some of the main categories of changes required:
* I changed the way that global generic arguments are passed via API (use `spSetGlobalGenericArgs` instead of the generic arguments for `spAddEntryPointEx`, which are not just for entry-point generics), which has been a change that we've needed for a long time. This is technically a breaking API change, although we should have very few client applications that care about it.
* A bunch of places that used to take "big" objects like `CompileRequest` now just take the sub-pieces they care about (e.g., a function might have only needed a `Linkage` and a `DiagnosticSink`). This makes many subroutines or "context" struct types more generally useful, at the cost of taking more parameters.
* In a few cases the conceptually clean separation of the layers breaks down (often for edge-case or compatibility features), and so we may pass along additional objects that are allowed to be null, but are used when present. A big example of this is how the back-end code generation routines accept an `EndToEndCompileRequest` that is optional, and only used to check whether "pass through" compilation is needed. We should probably look into cleaning this kind of logic up over time so that we don't need to violate the apparent separation of phases of compilation.
* In cases where separation of layers was being broken for the sake of GLSL features, I went ahead and ripped them out, since all of that should be dead code anyway.
* In many cases I increased the encapsulation of data in the core types to help track down use sites and make sure they are following invariants better.
* In cases where code was doing, e.g., `context->shared->compileRequest->session->getThing()` I have tried to introduce convenience routines so that the usage site is just `context->getThing()` to improve encapsulation and allow changes to be made more easily going forward.
* The `noteInternalErrorLoc` functionality was moved off of the compile request and into `DiagnosticSink`, since that is the one type you can rely on having around when you want to note an internal error. We may consider going forward if (and how) it should reset the counter used for noting locations on internal errors.
* A few APIs now take `DiagnosticSink*` arguments where they didn't before, and as a result some public APIs need to create `DiagnosticSink`s to pass in, before going ahead and ignoring the messages. In the future there should be variations of these APIs that accept an `ISlangBlob**` parameter for the output.
* fixup: missing include for compilers with accurate template checking (non-VS)
* fixup: review feedback
|
* Re-enable warnings around null this.
* Remove testing for nullptr in Substitution::Equals tests
* Fix ref counting problem in vulkan render.
* * Remove SLANG_ASSERT(this) in methods
* Place asserts conservatively at method call sites where appropriate.
|
* Allow entry points to have explicit generic parameters
Prior to this change, the Slang implementation required users to use global `type_param` declarations in order to specialize a full shader. For example:
```hlsl
type_param L : ILight;
ParameterBlock<L> gLight;
[shader("fragment")]
float4 fs(...)
{ ... gLight.doSomething() ... }
```
With this change we can rewrite code like the above using explicit generics, plus the ability to have `uniform` entry-point parameters:
```hlsl
[shader("fragment")]
float4 fs<L : ILight>(
uniform ParameterBlock<L> light,
...)
{ ... light.doSomething() ... }
```
Having this support in place should make it possible for us to eliminate global generic type parameters and the complications they cause (both at a conceptual and implementation level).
The most central and visible piece of the change is that `EntryPointRequest` now holds a `DeclRef<FuncDecl>` instead of just `RefPtr<FuncDecl>`, which allows it to refer to a specialization of a generic function.
Various places in the code that refer to the `EntryPointRequest::decl` member now use a `getFuncDecl()` or `getFuncDeclRef()` method as appropriate (see `compiler.h`).
In order to fill in the new data, the `findAndValidateEntryPoint` function has been greatly overhauled.
The changes to its operation include:
* The by-name lookup step for the entry point function has been adapted to accept either a function or a generic function.
* The generic argument strings provided by API or command line are no longer parsed all the way to `Type`s, but instead just to `Expr`s in the first pass.
* There are now two cases for checking the global generic arguments against their matching parameters. The first case is the new one, where we plug the generic argument `Expr`s into the explicit generic parameters of an entry point (that case re-uses existing semantic checking logic). The second case is the pre-existing code for dealing with global generic type arguments.
The `lower-to-ir.cpp` logic for handling entry points then had to be extended. Making it deal with a full `DeclRef` instead of just a `Decl` was the easy part (just call `emitDeclRef` instead of `ensureDecl`).
The more interesting bits were:
* We need to carefully add the `IREntryPointDecoration` to the nested function and not the generic in the case where we have a generic entry point. There is a handy `getResolvedInstForDecorations` that can extract the return value for an IR generic so that we can decorate the right thing.
* We need to make sure that in the case where we emit a `specialize` instruction (which normally wouldn't get a linkage decoration), we attach an `[export(...)]` decoration to it with the mangled name of the decl-ref, so that it can be found during the linking step.
The IR linking step is then slightly more complicated because the mangled entry point name could either refer directly to an `IRFunc` or to a `specialize` instruction for a generic entry point. The logic was refactored to first clone the entry point symbol without concern for which case it is (the old code was specific to functions), and then *if* the result is a `specialize` instruction, we attempt to run generic specialization on-demand.
That on-demand specialization is a bit of a kludge, but it deals with the fact that all the downstream passes only expect to see an `IRFunc`. A future cleanup might try to split out that specialization step into its own pass, which ends up being a limited form of the specialization pass.
Since I was already having to touch a lot of the code around IR linking, I went ahead and refactored the signature of the operations. I eliminated the need for the caller to create, pass in, and then destroy an `IRSpecializationState` (really an IR *linking* state), and replaced it with a structure local to the pass (that data structure was a remnant of an older approach in the compiler), and then also renamed the main operation to `linkIR` to reflect what it is doing in our conceptual flow.
Smaller changes made along the way include:
* Refactored `visitGenericAppExpr` to create a subroutine `checkGenericAppWithCheckedArgs` so that it can be used by the entry-point validation logic described above.
* Refactored the declarations around the IR passes in `emitEntryPoint()` (`emit.cpp`), to show that things are more self-contained than they used to be (e.g., that the `TypeLegalizationContext` is now only needed by one pass).
* Refactored the generic specialization code so that there is a stand-alone free function that can perform specialization on a `specialize` instruction without all the other context being required. This is only to support the limited specialization that needs to be done as part of linking.
* Updated the `global-type-param.slang` test to actually test entry-point generic parameters. In a later pass we can/should rework all the tests/examples for global type parameters over to use explicit entry-point generic parameters (at which point we should rename the tests as well). For now I am leaving things with just one test case, with the expectation that bugs will be found and ironed out as we expand to more tests.
* fixup
* Fixup: don't leave entry-point decorations on stuff we don't want to keep
The IR `[entryPoint]` decoration is effectively a "keep this alive" decoration, which means that attaching it to something we don't intend to keep around can lead to Bad Things.
The approach to generic entry points was attaching `[entryPoint]` to the underlying `IRFunc` because that seemed to make sense, but that meant that the `specialize` instruction at global scope could instantiate that generic and then keep it alive, even if the resulting function wouldn't be valid according to the language rules.
As a quick fix, I'm attaching `[entryPoint]` to the `specialize` instruction instead in such cases, and then re-attaching it to the result of explicit specialization during linking.
* Port most of remaining test and rename global type parameters
This change ports as many as possible of the existing tests for global type parameters over to use entry-point generic parameters instead. For the most part this is a mechanical change.
A few test cases remain using global generic parameters, as does the `model-viewer` example application.
The reason for this is that the shaders have either or both the following features:
* A vertex and fragment shader that can/should agree on their parameters
* A type declaration (e.g., a `struct`) that is dependent on one of the generic type parameters
In these cases, it would really only make sense to switch to explicit parameters once we support shader entry points nested inside of a `struct` type, so that we can use an outer generic `struct` as a mechanism to scope the entry points and other type-dependent declarations.
Since global-scope type parameters need to persist for at least a bit longer, I went ahead and renamed all the use sites over to use `type_param` for consistency.
|
* Use 'is' over 'as' where appropriate.
* dynamic_cast -> dynamicCast
* Replace 'dynamicCast' with 'as' where has no change in behavior/ambiguity.
* Replace dynamicCast with as where doesn't change behavior/non ambiguous.
|
* Replace dynamicCast with as where does not change behavior (ie not Type derived).
Use free function where scoping is clear.
* Replace uses of dynamicCast with as when there is no difference in behavior.
* Remove the IsXXXX methods from Type.
* Don't have separate smart pointer to store canonicalType on Type.
* Simplify Slang.FilteredMemberRefList.Adjust, such does the cast directly.
* Use free as where appropriate.
* Use free function version of casts where appropriate.
* Fix text in casting.md
* Fix typos in decl-refs.md
* Remove the uses of free function as on RefDecl.
Add 'canAs' to RefDecl as a way to test if a cast is possible.
Moved 'as' into RefDeclBase.
* Use 'is' to test for as cast on smart pointers.
Fix small scope issue.
* * Cache stringType and enumTypeType on the Session
* Make DeclRefType::Create return a RefPtr
* Make casting of result use the *method* .as (cos using free function would mean objects being wrongly destroyed)
* Make results from createInstance ref'd to avoid possible leaks.
* Fix typo in template parameter for is on RefPtr.
|
* Made dynamicCast a free function.
* Replace As with as or dynamicCast depending on if it is a type.
* Fix problem with using non smart pointer cast.
* Removed legacy asXXXX methods.
* Remove As from Type.
* Removed As from Qual type -> made coercable into Type*, such that can just use free 'as'.
* Remove left over QualType::As() impl.
* Remove As from SyntaxNodeBase.
* Made as for instructions implemented by dynamicCast.
* Replace As on DeclRef. Use the global as<> to do the cast.
* Add const safe versions of dynamicCast and as for IRInst
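For readers unfamiliar with the pattern, the free-function casts named in this commit could look roughly like the sketch below. It is an assumption-laden illustration rather than the code from this change: the `Node`, `Decl`, `FuncDecl`, and `Expr` classes are stand-ins, `as` is shown simply forwarding to `dynamicCast`, and the sketch does not model any difference in behavior between `as` and `dynamicCast` for `Type`-derived classes.

```cpp
#include <cassert>

// Stand-in class hierarchy (hypothetical; not the real AST or IR classes).
struct Node { virtual ~Node() = default; };
struct Decl : Node {};
struct FuncDecl : Decl {};
struct Expr : Node {};

// Free-function cast: returns null when `node` is not a T.
template <typename T>
T* dynamicCast(Node* node) { return dynamic_cast<T*>(node); }

// Const-safe overload: a const input yields a const result.
template <typename T>
const T* dynamicCast(const Node* node) { return dynamic_cast<const T*>(node); }

// `as` implemented in terms of `dynamicCast`, in both flavors.
template <typename T>
T* as(Node* node) { return dynamicCast<T>(node); }

template <typename T>
const T* as(const Node* node) { return dynamicCast<T>(node); }

// `is` answers the yes/no question without exposing the pointer.
template <typename T>
bool is(const Node* node) { return dynamicCast<T>(node) != nullptr; }
```

Call sites then read `as<FuncDecl>(node)` instead of `node->As<FuncDecl>()`, and a free function can be handed a null pointer without invoking a method on a null `this`.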
|
* Initial support for dynamic dispatch using "tagged union" types
Suppose a user declares some generic shader code, like the following:
```hlsl
interface IFrobnicator { ... }
type_param T : IFrobnicator;
ParameterBlock<T> gFrobnicator;
...
gFrobnicator.frobnicate(value);
```
and then they have some concrete implementations of the required interface:
```hlsl
struct A : IFrobnicator { ... }
struct B : IFrobnicator { ... }
```
The current Slang compiler allows them to generate distinct compiled kernels for the case of `T=A` and the case of `T=B`. This means that the decision of which implementation to use must be made at or before the time when a shader gets bound in the application.
This change adds a new ability where the Slang compiler can generate code to handle the case where `T` might be *either* `A` or `B`, and which case it is will be determined dynamically at runtime. This means a single compiled kernel can handle both cases, and the decision about which code path to run can be made any time before the shader executes.
This new option is supported by defining a *tagged union* type. Via the API, the user specifies that `T` should be specialized to `__TaggedUnion(A,B)` (the double underscore indicates that this is an experimental and unsupported feature at present). We refer to the types `A` and `B` here as the "case" types of the tagged union. Conceptually, the compiler synthesizes a type something like:
```hlsl
struct TU { union { A a; B b; } payload; uint tag; }
```
The user can then allocate a constant buffer to hold their tagged union type, and when they pick a concrete type to use (say `B`), they fill in the first `sizeof(B)` bytes of their buffer with data describing a `B` instance, and then set the `tag` field to the appropriate 0-based index of the case type they chose (in this case the `B` case gets the tag value `1`).
Actually implementing tagged unions takes a few main steps:
* Type parsing was extended to special-case `__TaggedUnion` as a contextual keyword. This is really only intended to be used when parsing types from the API or command-line, and Bad Things are likely to happen if a user ever puts it directly in their code. Eventually construction of tagged unions should be an API feature and not part of the language syntax.
* Semantic checking was extended to recognize that a tagged union like `__TaggedUnion(A,B)` should support an interface like `IFrobnicator` whenever all of the case types support it, as long as the interface is "safe" for use with tagged unions (which means it doesn't use a few of the advanced language features like associated types).
* The IR was extended with instructions to represent tagged union types and to extract their tag and the payload for the different cases as needed.
* IR generation was extended to synthesize implementations of interface methods for any interface that a tagged union needs to support. Right now the implementation is simplistic and only handles simple method requirements, which it does by emitting a `switch` instruction to pick between the different cases.
* A new IR pass was introduced to "desugar" any tagged union types used in the code. The downstream HLSL and GLSL compilers don't support `union`s, so we have to instead emit a tagged union as a "bag of bits" and implement loading the data for particular cases from it manually.
* Final code emit mostly Just Works after the above steps, but we had to introduce an explicit IR instruction for bit-casting to handle the output of the desugaring pass.
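The desugaring and method-synthesis steps above can be pictured with the following C++ sketch. It is only a model of the idea, not the compiler's output: the `frobnicate` requirement, the one-word payload, and the `bitCast` helper (standing in for the explicit bit-cast IR instruction) are all assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical case types, each providing the same interface requirement.
struct A { float scale; float frobnicate(float x) const { return x * scale; } };
struct B { std::int32_t offset; float frobnicate(float x) const { return x + float(offset); } };

// Desugared tagged union: raw 32-bit words for the payload, plus a tag.
struct TaggedUnionBits
{
    std::uint32_t payload[1];
    std::uint32_t tag; // 0 selects A, 1 selects B
};

// Stand-in for the bit-cast instruction: reinterpret one word as a T.
template <typename T>
T bitCast(std::uint32_t bits)
{
    static_assert(sizeof(T) == sizeof(std::uint32_t), "sketch only handles 32-bit fields");
    T result;
    std::memcpy(&result, &bits, sizeof(T));
    return result;
}

// Synthesized method: switch on the tag, rebuild the case value from the
// payload bits, and forward to that case's implementation.
float frobnicate(const TaggedUnionBits& u, float x)
{
    switch (u.tag)
    {
    case 0: return A{ bitCast<float>(u.payload[0]) }.frobnicate(x);
    case 1: return B{ bitCast<std::int32_t>(u.payload[0]) }.frobnicate(x);
    default: assert(false && "invalid tag"); return 0.0f;
    }
}
```

The `switch` mirrors the simple dispatch strategy described for IR generation; the per-field `bitCast` calls mirror the manual loads the desugaring pass has to emit because the downstream compilers lack `union`.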
There are a bunch of gaps and caveats in this implementation, but that seems reasonable for something that is an experimental feature. The various `TODO` comments and assertion failures in unimplemented cases are intended, so that this work can be checked in even if it isn't feature-complete.
* fixup: missing files
* fixup: typos
|
Fixes #775
It was reported (in #775) that Slang doesn't handle initializer-list syntax when initializing matrix variables. When starting on a fix for that it became apparent that the time was right to fix two broad issues in the compiler's current handling of `{}`-enclosed initializer lists.
The first issue was that the front-end checking of initializer lists wasn't handling the C-style behavior where an initializer list can either contain nested `{}`-enclosed lists for sub-arrays/-structures, or directly contain "leaf" values for initializing those aggregates. For example, the following two variable declarations ought to be equivalent:
```hlsl
int4 a[] = { {1, 2, 3, 4}, {5, 6, 7, 8} };
int4 b[] = { 1, 2, 3, 4, 5, 6, 7, 8 };
```
Getting this distinction right is important because we want to support initializing a matrix either from a list of vectors for its rows, or a list of scalars for its elements (in row-major order).
The front-end semantic checking logic for initializer lists was revamped so that it conceptually tries to "read" an expression of a desired type from the initializer list, and decides at each step whether to consume a single expression by coercing it to the desired type, or to recursively read multiple sub-values to construct the type as an aggregate. The logic for deciding between direct vs aggregate initialization could potentially use some tweaking, but luckily it should always handle the case where users introduce explicit `{}`-enclosed sub-lists to make their intention clear, so that existing Slang code should continue to work as before.
The second issue was that initializers without the expected number of elements weren't implemented in code generation, so they would lead to internal compiler errors. This change revamps the codegen logic for initializer lists so that it can synthesize default values for fields/elements that were left out during initialization. This includes an attempt to support default initialization of `struct` fields based on explicitly written initialization expressions.
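The "read a value of the desired type" strategy, together with default values for missing elements, can be modeled with a small C++ sketch. The `Type` and `Init` structures and all function names here are hypothetical, the only leaf type is `int`, and the direct-versus-aggregate decision is reduced to "is the next item a nested list?", which is simpler than the real checking logic.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy type: a scalar, or an aggregate of `count` elements of type `*element`.
struct Type
{
    const Type* element = nullptr; // null for scalars
    std::size_t count = 0;
};

// Toy initializer item: a leaf value, or a `{}`-enclosed list of items.
struct Init
{
    bool isList = false;
    int value = 0;
    std::vector<Init> items;
};

Init makeLeaf(int v) { Init i; i.value = v; return i; }
Init makeList(std::vector<Init> items) { Init i; i.isList = true; i.items = std::move(items); return i; }

void readValue(const Type& type, const std::vector<Init>& items, std::size_t& pos, std::vector<int>& out);

// Read every element of an aggregate, in order, from items[pos...].
void readAggregate(const Type& type, const std::vector<Init>& items, std::size_t& pos, std::vector<int>& out)
{
    for (std::size_t i = 0; i < type.count; ++i)
        readValue(*type.element, items, pos, out);
}

// Read one value of `type`, appending its flattened scalars to `out`.
void readValue(const Type& type, const std::vector<Init>& items, std::size_t& pos, std::vector<int>& out)
{
    if (pos >= items.size())
    {
        // Initializers ran out: synthesize default (zero) values.
        if (!type.element) out.push_back(0);
        else readAggregate(type, items, pos, out);
        return;
    }
    const Init& next = items[pos];
    if (!type.element)
    {
        // Scalar: consume a single leaf expression.
        assert(!next.isList);
        out.push_back(next.value);
        ++pos;
    }
    else if (next.isList)
    {
        // Explicit nested list: read the aggregate from the sub-list.
        std::size_t subPos = 0;
        readAggregate(type, next.items, subPos, out);
        ++pos;
    }
    else
    {
        // Leaf values given directly: keep reading from the enclosing list.
        readAggregate(type, items, pos, out);
    }
}

// Flatten a top-level initializer list for an aggregate type.
std::vector<int> flatten(const Type& type, const Init& init)
{
    assert(init.isList);
    std::vector<int> out;
    std::size_t pos = 0;
    readAggregate(type, init.items, pos, out);
    return out;
}
```

With a two-element array of four-component vectors as the target type, the nested form and the flat form of the initializer both flatten to the same eight values, and a shorter list is padded with zeros.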
|
* First step toward supporting use of interfaces as existential types
Traditional generics involve universal quantification. E.g., a declaration like:
```
void drive<T : IVehicle>(T vehicle);
```
indicates that *for all* types `T` that implement the `IVehicle` interface, the `drive()` function is available.
In contrast, when directly using an interface type like:
```
IVehicle v = ...;
v.doSomething();
```
we only know that there *exists* some concrete type (we could call it `E`) such that `v` refers to a value of type `E`, and `E` implements the `IVehicle` interface. In order to perform an operation like `v.doSomething()` we need to "open" the existential value so that we can look at the concrete type and how it implements the `IVehicle.doSomething` requirement.
This change adds a very explicit representation of existentials to Slang's IR. An operation like `e = makeExistential(v, w)` creates a value of some existential type (interfaces being our only existential types for now), by wrapping a concrete value `v` (the type of `v` can be seen as an implicit operand) and a witness table `w` showing that the type of `v` implements the requirements of the chosen interface type.
In turn, opening of an existential is handled with operations `extractExistential{Value|Type|WitnessTable}` which pull the corresponding piece of information out of a value of existential type (which somewhere in the code had to have been created with `makeExistential`).
The change includes a trivial simplification pass that can detect cases where an `extractExistential*` operation is applied directly to a `makeExistential` operation, so that there is only one possible result that could be extracted. This allows for simplification of existential types used in trivial ways for local variables (this is mostly so I can check in a functional test, rather than to actually support useful code involving interfaces right now).
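As a rough illustration of that simplification, here is a C++ sketch over a made-up miniature IR. The `Op` enumeration, the `Inst` structure, and `simplifyExtract` are hypothetical and are not the compiler's actual classes; the type-extraction case is left out because the concrete type is only an implicit operand of `makeExistential`.

```cpp
#include <cassert>
#include <vector>

// Opcodes for a toy IR; only the existential-related ones matter here.
enum class Op
{
    Value,
    WitnessTable,
    MakeExistential,               // operands: [wrapped value, witness table]
    ExtractExistentialValue,       // operands: [existential]
    ExtractExistentialWitnessTable // operands: [existential]
};

struct Inst
{
    Op op;
    std::vector<const Inst*> operands;
};

// If `inst` extracts from a value built directly by makeExistential, return
// the operand it would extract; otherwise return `inst` unchanged.
const Inst* simplifyExtract(const Inst* inst)
{
    if (inst->op != Op::ExtractExistentialValue &&
        inst->op != Op::ExtractExistentialWitnessTable)
        return inst;

    assert(inst->operands.size() == 1);
    const Inst* source = inst->operands[0];
    if (source->op != Op::MakeExistential)
        return inst; // the wrapped value is not statically known

    assert(source->operands.size() == 2);
    return inst->op == Op::ExtractExistentialValue
        ? source->operands[0]
        : source->operands[1];
}
```

An extract applied to anything other than a `makeExistential` (a function parameter, say) is left alone, which is why the test that passes an existential into a function needs a more aggressive approach.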
The logic in the semantic checking phase of the compiler is comparatively more complex.
When we are about to perform member lookup given an expression like `obj.member` we will first check if `obj` has an existential type, and if it does we will construct a suitable local context in which we extract the value, type, and witness table from the existential (these all become explicit AST expression nodes), and then use the extracted value as the base of the lookup operation.
The nature of existential values is that two different values with the same existential (interface) type could wrap concrete values with different types, so that we need to carefully refer only to the extracted type/value/witness-table of specific *values*. We handle this right now by conceptually moving the existential-type value into a local variable (by introducing a `LetExpr` that amounts to `let v = <init> in <body>`) and then require that the extract expressions must refer to the (immutable) variable declaration from which they are extracting a value.
(Eventually we should expand this so that when using an immutable local variable of existential type we just use that variable as-is rather than introduce a new temporary)
A simple test case is included that uses an interface type in an almost trivial way for a local variable; this test can be run and produces the expected results.
A more complex test case that passes an existential into a function is included, but left disabled because a more aggressive simplification approach is required to generate working code from it.
* Add missing file for expected test output
* Fixups for merge from top-of-tree
|
* Specialize away resource-type function parameters
Work on #397.
Introduction
------------
Suppose a user writes a function that takes a resource type as a parameter:
```hlsl
float4 getThing(RWStructuredBuffer<float4> buffer, int index)
{
return buffer[index];
}
```
This function creates challenges when generating code for GLSL-based targets, because a global shader parameter of type `RWStructuredBuffer`:
```hlsl
RWStructuredBuffer<float4> gBuffer;
```
translates to a global GLSL `buffer` declaration:
```glsl
buffer _S0
{
float4 _data[];
} gBuffer;
```
There is no equivalent to that `buffer` declaration that can be used in function parameter position, and it is illegal in GLSL to pass `gBuffer` into a function.
(Aside: yes, we could in principle translate a function parameter like `RWStructuredBuffer<float4> buffer` to `float4 buffer[]`, but that will not in turn generalize to arrays of structured buffers; it is a dead-end strategy)
The solution employed by many shader compilers is to "inline everything" to eliminate the need for parameters of resource types, and then rely on dataflow optimization to eliminate locals of resource types. This strategy can of course lead to an increase in code size, and it also means that call stacks are lost when doing step-through debugging. Another serious issue is that an "early `return`" from a function can turn into the equivalent of a multi-level `break` when inlined, and not all of our targets support multi-level `break`.
The solution implemented in this change works around some, but not all, of the problems with full inlining.
The approach here generates specialized versions of a function like `getThing`, adapted to the actual arguments provided at different call sites.
Thus if we have code like:
```hlsl
RWStructuredBuffer<float4> gA;
RWStructuredBuffer<float4> gB[10];
...
getThing(gA, x);
getThing(gA, y);
getThing(gB[someVal], z);
```
we will generate two specializations of `getThing`: one specialized for the `buffer` parameter being `gA` and the other for `gB`:
```hlsl
float4 getThing_gA(int index) { return gA[index]; }
float4 getThing_gB(int _val, int index) { return gB[_val][index]; }
```
and the call sites will change to match:
```hlsl
getThing_gA(x);
getThing_gA(y);
getThing_gB(someVal, z);
```
Note how in the case where the argument being passed in was obtained by indexing into an array of resources, the callee is specialized to the identity of the global shader parameter (`gB`), and now accepts a new parameter to indicate the array index into it.
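The call-site rewriting can be sketched in C++ as follows. This is a string-level model only: `ResourceArg`, `SpecializedCall`, and `specializeCall` are hypothetical names, specializations are keyed purely on the callee and global parameter name, and the actual pass works on IR instructions rather than text.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// A resource argument at a call site: a global shader parameter,
// optionally indexed (as in `gB[someVal]`).
struct ResourceArg
{
    std::string global;    // e.g. "gA" or "gB"
    bool isIndexed;        // true for `gB[someVal]`
    std::string indexExpr; // e.g. "someVal"; ignored when not indexed
};

struct SpecializedCall
{
    std::string callee;
    std::vector<std::string> args;
};

// Rewrite one call: pick (or create on demand) the specialization for the
// global being passed, and replace the resource argument with its index,
// if it has one.
SpecializedCall specializeCall(
    const std::string& callee,
    const ResourceArg& resource,
    const std::vector<std::string>& otherArgs,
    std::set<std::string>& generated)
{
    assert(!resource.global.empty());
    SpecializedCall call;
    call.callee = callee + "_" + resource.global;
    generated.insert(call.callee); // reused by later call sites with the same global
    if (resource.isIndexed)
        call.args.push_back(resource.indexExpr);
    call.args.insert(call.args.end(), otherArgs.begin(), otherArgs.end());
    return call;
}
```

Running the three `getThing` calls from the example through this sketch yields two entries in `generated`, and the `gB[someVal]` call becomes `getThing_gB(someVal, z)`.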
While this description motivates the change based on GLSL output, the same basic issue can arise for other targets.
For example, while current HLSL has added the `ConstantBuffer<T>` type, it is not supported on older targets, and it turns out that even dxc does not allow functions to have `ConstantBuffer<T>` parameters.
Longer-term, we will likely need to do even more aggressive specialization both in order to generate SPIR-V output directly, and also to deal with functions that have return values or `out` parameters of resource types.
Implementation
--------------
The meat of the change is in `ir-specialize-resources.{h,cpp}`, where we have a pass that looks at all call sites (`IRCall` instructions) in the program, and attempts to replace them with calls to specialized functions, where the specializations are generated on-demand.
The code in this pass is heavily commented, so hopefully it serves to explain itself all right.
After specialization is complete, we may still have functions like the original `getThing` that will produce invalid code when emitted as GLSL, so we need a way to make sure they don't appear in the output.
To date we've had some very ad hoc approaches for ignoring IR constructs that we don't want to affect emitted code, but this change goes ahead and adds a more real dead code elimination (DCE) pass in `ir-dce.{h,cpp}`.
This pass follows a straightforward approach of tagging instructions that are "live" and then propagating liveness through the whole program, before making a single pass to delete anything that isn't live.
When I first added the DCE pass it eliminated *everything* because there were no "roots" for liveness.
I solved this for now by adding a new decoration, `IREntryPointDecoration`, to mark shader entry points in the IR which should always be live (as should anything they depend on).
A secondary problem that arose was that for GLSL ray tracing shaders it is possible for the incoming/outgoing payload or attributes parameters to be unused, but eliminating them as dead would change the signature of a shader and potentially break the rules for how ray tracing programs communicate.
I added a very simple `IRDependsOnDecoration` that allows one IR instruction to keep another alive *as if* it used it, without actually using it.
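The mark phase of such a pass can be sketched in C++ like this. The `Inst` structure is a hypothetical stand-in: `isEntryPoint` models `IREntryPointDecoration`, `dependsOn` models `IRDependsOnDecoration`, and instructions refer to each other by index rather than by pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Inst
{
    std::vector<std::size_t> operands;  // instructions this one actually uses
    std::vector<std::size_t> dependsOn; // kept alive "as if" used
    bool isEntryPoint = false;          // liveness root
};

// Mark everything reachable from the entry points; anything left unmarked
// can be deleted in a single sweep afterwards.
std::vector<bool> computeLiveness(const std::vector<Inst>& insts)
{
    std::vector<bool> live(insts.size(), false);
    std::vector<std::size_t> workList;

    auto markLive = [&](std::size_t index)
    {
        assert(index < insts.size());
        if (!live[index])
        {
            live[index] = true;
            workList.push_back(index);
        }
    };

    // Roots: without these, nothing is live and everything gets eliminated.
    for (std::size_t i = 0; i < insts.size(); ++i)
        if (insts[i].isEntryPoint)
            markLive(i);

    // Propagate liveness through real uses and through "depends on" edges.
    while (!workList.empty())
    {
        std::size_t index = workList.back();
        workList.pop_back();
        for (std::size_t op : insts[index].operands) markLive(op);
        for (std::size_t dep : insts[index].dependsOn) markLive(dep);
    }
    return live;
}
```

With no instruction flagged as an entry point the result is all `false`, which reproduces the "eliminated *everything*" behavior described above.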
There's also a fixup in the IR dumping logic where I was forgetting to store anything in the mapping from instruction to their names, so that the name of an instruction was getting incremented each time it was referenced.
Testing
-------
There are three different tests added as part of this change:
* The `compute/func-resource-param` test covers the basic `RWStructuredBuffer` case above, which we expect to work fine for D3D11/12, but fail for Vulkan without specialization.
* The `cross-compile/func-resource-param-array` test covers the case where we don't just have one resource, but an array of them. This is not an end-to-end compute test primarily because our `render-test` application doesn't yet handle arrays of resources correctly in its binding logic.
* The `compute/func-cbuffer-param` test covers the case of a function with a `ConstantBuffer<T>` parameter, which requires specialization to become valid for any of our targets.
* fixup: warnings/errors from other compilers
* fixup: typos and cleanup
* fixup: typos
|
* First pass at having an interface to write text to that can be replaced.
Simplified and made more rigorous the interface used to write formatted strings.
* Added AppContext to simplify setting up and passing around of streams.
* Added more simplified way to get the std error/out from AppContext.
* Work in progress using dll for tools to speed up testing.
* First pass at ISlangWriter interface.
* Added support for writing VaArgs.
Added NullWriter.
* Use ISlangWriter for output.
* Use ISlangWriter for output - replacing OutputCallback.
Make IRDump go to ISlangWriter
* SlangWriterTargetType -> SlangWriterChannel
Improvements around AppContext
* Shared library working with slang-reflection-test.
* Dll testing working for render-test.
* Include va_list definition from header.
* Fix errors from clang.
* Fix typo for linux.
* Added -usexes option
* Fix typo.
* Fix arguments problem on linux.
* Fix typo for linux.
* Add windows tool shared library projects.
* Fix warning from x86 win build.
Fix signed warning from slang-test/main.cpp
* First attempt at getting premake to work on travis, and run tests.
* Try moving build out into script.
* Invoke bash scripts so they don't have to be executable.
* Drive configuration/tests from env parameters set by travis
* Try using source to run travis tests.
* Remove the build.linux directory - but doing so will overwrite Makefile.
* Made -fno-delete-null-pointer-checks gcc only.
* Try to fix warning from -fno-delete-null-pointer-checks
* Turn off warnings for unknown switches.
* Try to make premake choose the correct tooling.
* Disabled missing braces warning.
* Disable -Wundefined-var-template on clang.
* -Wunused-function disabled for clang.
* Fix typo due to SlangBool.
* Remove this nullptr tests.
* "-Wno-unused-private-field" for clang.
* Added "-Wno-undefined-bool-conversion"
* Add DominatorList::end fix.
* Split scripts into travis_build.sh travis_test.sh
* Fix gcc/clang template pre-declaration issue around QualType.
* Fix premake to build such that pthread correctly links with slang-glslang
|
The main change here is to fill out the `BaseType` enumeration so that it covers the full range of 8/16/32/64-bit signed and unsigned integers, as well as 16/32/64-bit floating-point numbers, and then propagate that completion through various places in the code.
More details:
* The current `half`, `float`, `double`, `int`, and `uint` types are still the default names for their types, so things like `float16_t` and `int32_t` were added as `typedef`s.
* We still need to generate the full gamut of vector/matrix `typedef`s for the new types, so that things like `float16_t4x3` will work (yes, I know that is ugly as sin, but that's the HLSL syntax...).
* A few pieces of dead code from earlier in the compiler's life got removed, since I did a find-in-files for `BaseType::` and tried to either update or delete every site.
* A few call sites that were enumerating integer base types in an ad-hoc fashion were changed to use a single `isIntegerBaseType()` function that I added in `check.cpp`
* When compiling with dxc for shader model 6.2 and up, we enable the compiler's support for native 16-bit types via a flag.
* The public API enumeration for reflection of scalar types added cases for 8- and 16-bit integers (it already exposed the other cases we need)
* The lexer was updated to be extremely liberal in what kinds of suffixes it allows on literals. I also removed the logic that was treating, e.g., `0f` as a floating-point literal (it doesn't seem to be the right behavior). That would now be an integer literal with an invalid suffix.
* The logic in the parser that applies types to literals was updated to handle a few more cases: `LL` and `ULL` for 64-bit integers, and `H` for 16-bit floats.
* The mangling logic needed to be updated to handle the new cases, and I consolidated the handling of those types in their front-end and IR forms.
* Removed the explicit `BasicExpressionType::ToString` logic, since all basic types are `DeclRefType`s in the front end, and we can just print them out as such.
* As a bit of a gross hack, fudged the conversion costs so that `int` to `int64_t` conversion is a bit more costly. The problem there is that given an operation like `int(0) + uint(0)`, the best applicable candidates ended up being `+(uint,uint)` and `+(int64_t,int64_t)` because the cost of a single `int`-to-`uint` conversion was the same as the sum of the cost of an `int`-to-`int64_t` and a `uint`-to-`int64_t`. A better long-term fix here is to completely change our overload resolution strategy, but that is obviously way too big to squeeze into this change.
* Type layout computation was updated to handle all the new types and give them their natural size/alignment. Note that this does *not* work for down-level HLSL where `half` is treated as a synonym for `float`. It also doesn't deal with the fact that many of these types aren't actually allowed in constant buffers for certain shader models. A future change should work to add error messages for unsupported stuff during type layout (or just make the types themselves require support for certain capabilities)
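The conversion-cost tie mentioned in the list above can be worked through with a small C++ sketch. The numeric costs are invented for illustration (only their relationship matters), and `CostTable`, `setCost`, `conversionCost`, and `pickOverload` are hypothetical names rather than anything in the compiler.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

using Cost = int;
using CostTable = std::map<std::pair<std::string, std::string>, Cost>;

void setCost(CostTable& table, const std::string& from, const std::string& to, Cost cost)
{
    table[std::make_pair(from, to)] = cost;
}

// Cost of implicitly converting `from` to `to`; identity conversions are free.
Cost conversionCost(const CostTable& table, const std::string& from, const std::string& to)
{
    if (from == to) return 0;
    auto it = table.find(std::make_pair(from, to));
    assert(it != table.end());
    return it->second;
}

// Compare the two candidates from the example, `+(uint,uint)` and
// `+(int64_t,int64_t)`, by summing the per-argument conversion costs.
std::string pickOverload(const CostTable& table, const std::string& lhs, const std::string& rhs)
{
    Cost viaUint = conversionCost(table, lhs, "uint") + conversionCost(table, rhs, "uint");
    Cost viaInt64 = conversionCost(table, lhs, "int64_t") + conversionCost(table, rhs, "int64_t");
    if (viaUint == viaInt64) return "ambiguous";
    return viaUint < viaInt64 ? "+(uint,uint)" : "+(int64_t,int64_t)";
}
```

If `int`-to-`uint` costs 200 and both conversions to `int64_t` cost 100, then `int(0) + uint(0)` scores 200 either way and the call is ambiguous; raising `int`-to-`int64_t` slightly (to 150 in this sketch) makes `+(uint,uint)` win.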
|
Fixes #627
The front-end has support for `RasterizerOrderedBuffer` and `RasterizerOrderedTexture*`, but left out support for:
* `RasterizerOrderedByteAddressBuffer`
* `RasterizerOrderedStructuredBuffer`
[Nitpick: these types are all amazingly annoying to type. It is easy to want to write `RasterOrdered` instead of the bulkier `RasterizerOrdered`, and almost everybody does in casual speech. There's already the issue of wanting to type `StructureBuffer` (a buffer of structures) instead of `StructuredBuffer` (a buffer that is... structured?). Then you have `ByteAddressBuffer` which is just adding to the confusion because it is nominally a "byte addressable" buffer (so that `ByteAddressedBuffer` would actually make sense), but then actually *isn't* byte addressable in practice.]
There were a few `TODO` comments related to this already, and this change was mostly a matter of doing a find-in-files for `RWByteAddressBuffer` and `RWStructuredBuffer` and adding matching `RasterizerOrdered` cases.
The test I added just checks that these types make it through the front-end, and doesn't do any actual confirmation that they work as intended. It is worth noting that the handling of ordering in GLSL/VK is different from in HLSL ("pixel shader interlock" instead of "rasterizer ordered views"), so coming up with a cross-compilation story would need to be a later step.
|
Slang `enum` declarations will always be scoped, e.g.:
```hlsl
enum Color
{
Red,
Green = 2,
Blue,
}
Color c = Color.Red; // Not just `Red`
```
A user can write `enum class` as a placebo for now (to ease sharing of headers with C++).
Slang does not currently support the `::` operator for static member lookup, so it must be `Color.Green` and not `Color::Green`. Support for `::` as an alternate syntax could be added later if there is strong user demand.
An `enum` type can have a declared "tag type" using syntax like C++ `enum class`:
```hlsl
enum MyThings : uint
{
First = 0,
// ...
}
```
The `enum` cases will store their values using that type. An `enum` that doesn't declare a tag type will use the type `int` by default.
Enum cases are assigned values just like in C/C++: cases can have explicit values, but otherwise default to one more than the previous case, or zero for the first case.
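The value-assignment rule is small enough to state as code. The following C++ sketch uses a hypothetical `EnumCase` record and `assignEnumValues` function; it only models the rule and says nothing about how the compiler stores cases.

```cpp
#include <cassert>
#include <vector>

// One enum case as written by the user: either `Name` or `Name = value`.
struct EnumCase
{
    bool hasExplicitValue;
    int explicitValue; // ignored when hasExplicitValue is false
};

// Assign a tag value to every case: an explicit value if given, otherwise
// one more than the previous case, starting from zero.
std::vector<int> assignEnumValues(const std::vector<EnumCase>& cases)
{
    std::vector<int> values;
    int next = 0;
    for (const EnumCase& c : cases)
    {
        int value = c.hasExplicitValue ? c.explicitValue : next;
        values.push_back(value);
        next = value + 1;
    }
    return values;
}
```

For the `Color` example above (`Red`, `Green = 2`, `Blue`) this produces the values 0, 2, and 3.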
All `enum` types will automatically conform to a standard-library `interface` called `__EnumType`, which is used so that basic operators like equality testing can be defined generically for all `enum` types.
This change only adds one operator at first (the `==` comparison), but others should be added later.
An `enum` case needs to be explicitly converted to an integer where needed (e.g., `int(Color.Red)`).
This is implemented by having the main integer types (`int` and `uint`) support built-in initializers that can work for *any* `enum` type (or rather, anything conforming to `__EnumType`).
Eventually these will be restricted so that an `enum` type can only be converted to its associated tag type.
IR code generation completely eliminates `enum` types and their cases.
The `enum` type will be replaced with its tag type, and the cases will be replaced with the tag values.
Currently this could leave some mess in the IR where cast operations are applied between values that actually have the same type.
|
Fixes #581
This change adds a new parameter passing mode `__ref` to exist alongside `in`, `out`, and `inout`.
The `__ref` modifier indicates true by-reference parameter passing (whereas `inout` is copy-in-copy-out).
This is not intended to be something that users interact with directly, but rather a low-level feature that lets us provide a correct signature for the `Interlocked*()` operations in the standard library.
Most of the support for passing what are logically addresses around already exists in the IR, so the majority of the work here is just in introducing the new type `Ref<T>` and then using it appropriately when lowering `__ref` parameters/arguments to the IR.
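The observable difference between the two modes shows up when arguments alias. The C++ sketch below imitates both behaviors by hand; `incrementBoth` and the two caller functions are hypothetical, and the write-back order in the copy-in-copy-out version is an assumption.

```cpp
#include <cassert>

// Callee with two by-reference parameters.
void incrementBoth(int& a, int& b)
{
    a += 1;
    b += 1;
}

// True by-reference passing (the `__ref` behavior): both parameters alias
// the caller's variable, so both increments land on it.
int callByReference()
{
    int x = 0;
    incrementBoth(x, x);
    return x;
}

// Copy-in-copy-out (the `inout` behavior): each argument is copied into a
// temporary before the call and copied back afterwards.
int callCopyInCopyOut()
{
    int x = 0;
    int tmpA = x;
    int tmpB = x;
    incrementBoth(tmpA, tmpB);
    x = tmpA;
    x = tmpB;
    return x;
}
```

Here `callByReference()` returns 2 while `callCopyInCopyOut()` returns 1. An atomic operation needs the first behavior, since it must act on the caller's actual storage rather than on a temporary copy.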
|
Work on #499
Two big fixes here:
* The logic for checking constraints on `out` arguments wasn't actually triggering because it relied on function parameters being given an `OutType` if they are marked `out`, but the code wasn't actually doing that. Fixing the computation of types for functions resolved that issue.
* Next, I added a specific diagnostic to follow up the "expected an l-value" error to let the user know that their argument was implicitly converted, and that is why it doesn't count as an l-value in Slang's rules.
I've added a test case to ensure that we retain this diagnostic until we can do a true fix for the issue.
The right long-term fix is to have an AST representation of all the implicit casts involved (e.g., in both directions for an `inout` parameter), and then have the IR generate explicit code for the conversions in each direction (the `LoweredVal` representation can handle this sort of thing).
|
* Introduce an IR-level type system
Up to this point, the Slang IR has used the front-end type system to represent types in the IR.
As a result (but ultimately more importantly) the IR representation of generics and specialization has used AST-level concepts embedded in the IR.
For example, to express the specialization of `vector<T,N>` to a concrete type `float` for `T`, we needed an IR operation that could represent the specialization, with operands that somehow represented the type argument `float`.
The whole thing was very complicated.
The big idea of this change is to introduce a new representation in which types in the IR are just ordinary instructions, so that using them as operands makes sense. The hierarchy of IR types closely mirrors the AST-side hierarchy for now, and that will probably be something we should maintain going forward.
In order to make these changes work, though, I also had to do major overhauls of things like the way substitutions are performed, how we check interface conformances, the way lookup through interface types is done, etc. etc. This is a big change, and unfortunately any attempt to summarize it in the commit message wouldn't do it justice.
* Fix 64-bit build warning
* Fix up some clang warnings/errors
|
* Fix decl-ref printing to handle NULL pointers
If the underlying decl, or its name is NULL, then use an empty string for the declaration name.
This issue was found when debugging, but could bite non-debug cases too, if we ever try to print something like a generic type constraint, which has no name.
* Unify all generic parameters, even if some mismatch
Fixes #449
The front end tries to infer the right generic arguments to use at a call site using a sloppily implemented "unification" approach. The basic idea is that if you pass a `vector<float,3>` into a function that operates on a `vector<T,N>` where `T` and `N` are generic parameters, then the unification will try to unify `vector<float,3>` with `vector<T,N>` which will lead to it recursively unifying `float` with `T` and `3` with `N`, at which point we have viable values to substitute in for those parameters.
Where the existing approach is maybe not quite right is in how it handles obvious unification failures. So if we ask the code to unify, say, `float` with `uint`, it will bail out immediately because those can't be unified. This sounds right superficially, but in some cases we might be calling a function that takes a `vector<float,N>` and passing a `vector<uint,3>` and we'd like to at least get far enough along with unification to see that `N` should be `3` so that the front end can maybe decide to call the function anyway, with some amount of implicit conversion.
Over time I've had to modify a lot of the "unification" logic so that it doesn't treat the obvious failures as a hard stop, and instead just returns the failure as a boolean status, but keeps on trying to unify things even after such a failure. When doing unification as part of inference for generic arguments, there will usually be subsequent steps (e.g., type conversions for function arguments) that will catch the type errors that arise.
This specific change is to make it so that when unifying the substitutions for a generic decl-ref, we try to unify all the pair-wise arguments, and don't bail out on the first mismatch (so that the `float`-vs-`uint` failure above doesn't lead to us skipping the `3` and `N` pairing).
The one case we need to watch out for in all of this is when unification is used to check if an `extension` declaration (which might be generic) is actually applicable to a concrete type. In that case we obviously don't want an extension for `vector<float,N>` to apply to `vector<uint,3>`, so it is important that the extension case check the return status from the unification logic (*or* in the future, it could just confirm that the substituted type is equivalent to the original as a post-process...).
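A minimal C++ sketch of the "keep going after a mismatch" behavior is given below. It flattens generic arguments to strings and uses hypothetical names (`Arg`, `Bindings`, `unifyArg`, `unifyArgs`); the real logic works on types and values in the AST.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// One argument of the parameter-side type: either a generic parameter
// (such as `N`) or a concrete argument (such as `float`).
struct Arg
{
    std::string name;
    bool isGenericParam;
};

using Bindings = std::map<std::string, std::string>;

// Unify one parameter-side argument against one concrete argument.
bool unifyArg(const Arg& param, const std::string& concrete, Bindings& bindings)
{
    if (param.isGenericParam)
    {
        auto it = bindings.find(param.name);
        if (it == bindings.end())
        {
            bindings[param.name] = concrete;
            return true;
        }
        return it->second == concrete;
    }
    return param.name == concrete;
}

// Unify all pair-wise arguments. A mismatch is remembered in the returned
// status, but later pairs are still processed so their bindings are found.
bool unifyArgs(const std::vector<Arg>& params,
               const std::vector<std::string>& concretes,
               Bindings& bindings)
{
    assert(params.size() == concretes.size());
    bool ok = true;
    for (std::size_t i = 0; i < params.size(); ++i)
    {
        if (!unifyArg(params[i], concretes[i], bindings))
            ok = false;
    }
    return ok;
}
```

Unifying `vector<float,N>` against `vector<uint,3>` in this model returns `false` but still records `N = 3`. Call-site inference can use the binding and let later conversion checks report any errors, while the `extension` applicability check must look at the returned status.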
I've added a test case that reproduces the original failure that surfaced the bug.
* fixup: add expected test output
|
* Typo
* Add [shader(...)] and clean up some literal handling
* Add support for validating the `[shader(...)]` attribute, by checking that its argument is a string literal that names a known shader stage.
* Split the `ConstantExpr` class into distinct subclasses rooted at `LiteralExpr`, so we have `BoolLiteralExpr`, `IntegerLiteralExpr`, `FloatingPointLiteralExpr`, and `StringLiteralExpr`
* Add a `String` type to the stdlib, to be used as the type of a string literal.
This change allows code using `[shader(...)]` to be accepted by the front-end again, but it does nothing about emitting it in final HLSL.
* Allow entry points to be specified via [shader(...)]
Before this change, the compiler would track a list of `EntryPointRequest` objects, based on what the user specified via API and/or command-line options. Each entry point request would get matched up with an AST `FuncDecl` as part of semantic checking, and then the back end steps (layout, codegen, etc.) would work from that information.
This change makes the compiler modal, in that it can *either* continue to use an explicit list of entry point requests (this is the mode when the list is non-empty), or it can rely on user-supplied attributes on entry point functions to drive codegen (this is the mode when the list is empty).
User-specified `[shader(...)]` attributes are processed at the same place where the association from `EntryPointRequest`s to `FuncDecl`s would otherwise be made, and basically does the same thing in the opposite direction: looks for `FuncDecl`s with the appropriate attribute and synthesizes an `EntryPointRequest` for them.
Subsequent processing should ideally not know where a given `EntryPointRequest` came from, and should handle both methods of specifying the entry points equivalently.
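The two modes can be summarized with a short C++ sketch. The structures below reuse the names `FuncDecl` and `EntryPointRequest` from the description, but their fields and the `resolveEntryPoints` function are hypothetical simplifications.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified stand-ins: a function declaration with an optional
// `[shader("...")]` stage, and a request to compile an entry point.
struct FuncDecl
{
    std::string name;
    std::string shaderStage; // empty when the function has no attribute
};

struct EntryPointRequest
{
    std::string name;
    std::string stage;
};

// If the user supplied explicit requests, use exactly those. Otherwise,
// synthesize one request per function carrying a shader attribute.
std::vector<EntryPointRequest> resolveEntryPoints(
    const std::vector<EntryPointRequest>& explicitRequests,
    const std::vector<FuncDecl>& funcs)
{
    if (!explicitRequests.empty())
        return explicitRequests;

    std::vector<EntryPointRequest> result;
    for (const FuncDecl& func : funcs)
    {
        if (!func.shaderStage.empty())
            result.push_back(EntryPointRequest{func.name, func.shaderStage});
    }
    return result;
}
```

Either way, later stages receive a plain list of requests and do not need to know which mode produced it.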
One design choice that might not make immediate sense is that we do *not* process a function as an entry point (applying further validation, etc.) just because it has a `[shader(...)]` modifier, unless we are in the appropriate mode (which in this case is the mode where the user didn't specify their own entry points via API or command line). This is to handle cases where the user wants to explicitly compile only one entry point, so that they (1) don't want us to spend time validating code they don't care about, (2) don't want to get output they don't expect, and (3) might actually be presenting us with code that violates the language rules due to a combination of `#define`s in effect (e.g., they might have a `[shader("vertex")]` function that transitively executes a `discard` because of how the preprocessor was configured, but they don't care because they are compiling a fragment entry point). This decision might be something we revisit over time.
As part of this work, I had to add some logic to pick a "profile version" to use for a combination of a target and stage (because when you specify `[shader("vertex")]` the compiler can't tell if you want `vs_5_0`, `vs_5_1`, etc.). This isn't really complete right now, because something like `-target dxbc` *also* doesn't determine a profile, so there is a bit of a kludge at present. We need to figure out a good long-term plan here, which might involve keeping target format, feature level/version, and pipeline stage as truly orthogonal concepts, rather than conflating them. That would involve more work in the API and command-line layers to de-compose things when the user specifies, e.g., `vs_5_1`, but might make downstream logic easier to manage.
* Emit [shader(...)] attribute on entry point for SM 6.1 and later
This should help ensure that the output from Slang can be compiled with dxc `lib_*` profiles.
* Fix warning
|
The code in `DeclRefType::Create` was treating all of the point/line/triangle output stream types as `HLSLPointStreamType`, which meant we always output GLSL geometry shaders with `layout(points) out;`.
|
Pull BaseType, TextureFlavor and SamplerStateFlavor enums and helper functions into a shared file "type-system-shared.h".
|
* Initial work on validating "constexpr"-ness in IR
The underlying issue here is that certain operations in the target shading languages constrain their operands to be compile-time constants. A notable example is the optional texel offset parameter to the `Texture2D.Sample` operation.
When calling these operations in GLSL, the user is required to pass a "constant expression," and any variables in that expression must therefore be marked with the `const` qualifier (and themselves be initialized with constant expressions). Any GLSL output we generate must of course respect these rules.
When calling these operations in HLSL, the user is not so constrained. Instead, they can pass an arbitrary expression, which may involve ordinary variables with no particular markup, and then the compiler is responsible for determining if the actual value after simplification works out to be a constant. In some cases, the requirement that a value be constant might actually trigger things like loop unrolling. Also, it is okay to use a function parameter to determine such a constant expression, as long as the argument turns out to be a constant at all call sites.
The way we have decided to tackle these challenges in Slang is that we propagate a notion of `constexpr`-ness through the IR. This is currently being tackled in `ir-constexpr.cpp` with a combination of forward and backward iterative dataflow:
* When the operands to an instruction are all `constexpr`, and the opcode is one we believe can be constant-folded, then we infer that the instruction *can* be evaluated as `constexpr`
* When an instruction is required to be `constexpr`, then we infer that all of its operands are also required to be `constexpr`.
If this process ever infers that a function parameter is required to be `constexpr`, then we might have to continue propagation at all the call sites to that function.
If after all the propagation is done, there are any cases where an instruction is *required* to be `constexpr`, but it *can't* be `constexpr` (we weren't able to infer `constexpr`-ness for its operands), then we issue an error.
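The forward/backward propagation described above can be sketched roughly as follows. This is a minimal illustration and not the code in `ir-constexpr.cpp`: the `Inst` struct, its fields, and `checkConstExpr` are all hypothetical names, operands are plain indices into a flat list, and call-site propagation and control flow are ignored.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified instruction: operands are indices into the same list.
struct Inst
{
    std::vector<std::size_t> operands;
    bool foldable = false; // opcode we believe can be constant-folded (a literal is foldable with no operands)
    bool canBe    = false; // inferred: this value *can* be evaluated as constexpr
    bool required = false; // declared or inferred: this value *must* be constexpr
};

// Runs both propagation rules to a fixed point, then returns the indices of
// instructions that are required to be constexpr but cannot be.
std::vector<std::size_t> checkConstExpr(std::vector<Inst>& insts)
{
    bool changed = true;
    while (changed)
    {
        changed = false;

        // Forward rule: all operands constexpr + foldable opcode => can be constexpr.
        for (Inst& inst : insts)
        {
            if (inst.canBe || !inst.foldable) continue;
            bool allOperandsCanBe = true;
            for (std::size_t op : inst.operands)
                allOperandsCanBe = allOperandsCanBe && insts[op].canBe;
            if (allOperandsCanBe) { inst.canBe = true; changed = true; }
        }

        // Backward rule: a required instruction makes all of its operands required.
        for (Inst& inst : insts)
        {
            if (!inst.required) continue;
            for (std::size_t op : inst.operands)
                if (!insts[op].required) { insts[op].required = true; changed = true; }
        }
    }

    // Anything required but not inferable is an error.
    std::vector<std::size_t> errors;
    for (std::size_t i = 0; i < insts.size(); ++i)
        if (insts[i].required && !insts[i].canBe) errors.push_back(i);
    return errors;
}
```

In this sketch an instruction that reads a runtime value is simply marked as not foldable, so a requirement that reaches it through back-propagation shows up in the returned error list.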
This implementation encodes the idea of `constexpr`-ness in the IR as part of the type system, using a simplified notion of rates. This change adds a `RateQualifiedType` that can represent `@R T`, and then introduces a `ConstExprRate` that can be used for `R`. Many accessors for the type information on IR nodes were updated to distinguish when one wants the "full" type of an IR value (which might include rate information) vs. just the "data" type.
A `constexpr` qualifier was added in the front-end, and is being used to decorate the texel offset parameter for `Texture2D.Sample`. Lowering from AST to IR looks for this qualifier and infers when a function parameter must be typed as `@ConstExpr T` instead of just `T`.
There are lots of limitations and gotchas in the implementation so far:
* The `@ConstExpr` rate is the only one added in this change, but it seems clear that the conceptual `ThreadGroup` rate that was added to represent `groupshared` should probably get folded into the representation.
* I'm not 100% pleased with how many places in the IR I have to special-case for rate-qualified types. At the same time, pulling out rate as a distinct field on `IRValue` would probably require that we pay attention to rate everywhere.
* I've added a test case to show that we can issue errors when users fail to provide a constant expression for the texel offset, but the actual error message isn't great because it doesn't indicate *why* a constant expression was required. Realistically the "initial IR" should contain a few more decorations we can use to relate error conditions back to the original code (even if this is in a side-band structure).
* I've added a test case that is supposed to show that we can back-propagate `constexpr`-ness to local variables, and I've manually confirmed that it works for Vulkan/SPIR-V output, but the level of Vulkan support in `render_test` today means I can't enable the test for check-in.
* While I'm attempting to propagate `@ConstExpr` information from callees to callers, I haven't implemented any logic to specialize callee functions based on values at call sites.
* In a similar vein, there is no handling of control-flow dependence in the current code. If we infer that a phi (block parameter) needs to be `@ConstExpr`, then it isn't actually enough to require that the inputs to the phi (arguments from predecessor blocks) are all `@ConstExpr` because we also need any control-flow decisions that pick which incoming edge we take to be `@ConstExpr` as well.
* As a practical matter, implicit propagation of `@ConstExpr` from a function body to a function parameter should only be allowed for functions that are "local" to a module. Any function that might be accessed from outside of a module should really have had its `@ConstExpr` parameter marked manually, and our pass should validate that they follow their own rules. Right now we have no kind of visibility (`public` vs `private`) system, so I'm kind of ignoring this issue.
While that is a lot of gaps, this is also just enough code to get the Falcor MultiPassPostProcess example working, so I'm inclined to get it checked in.
* Fixup: missing expected output for test
* Fixup: disable test that relies on [unroll] for now
|
| |
|
|
|
| |
1. reorder destruction order of several key classes to avoid using deleted IR objects when destroying Types
2. remove Session::canonicalTypes and make each Type own a RefPtr to the canonicalType, to allow types to be destroyed along with each IRModule it belongs to.
|
| |
|
|
|
|
|
|
|
|
|
| |
The basic problem here is that when unlinking an `IRUse` from the linked list of uses, there were several cases where I was failing to set the `prevLink` field of the next node to match the `prevLink` field of the node being removed. That doesn't show up when walking the linked list of uses forward, but it breaks it whenever you have subsequent unlinking operations.
This change fixes the bugs of that kind I could find, and also adds a debug validation method to try to avoid breaking it again. I also made more access to `IRUse` go through accessor methods rather than using fields directly, to try to avoid this kind of error. I stopped short of making anything `private`, because I tend to find that it creates more hassles than it avoids.
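For illustration, here is a minimal sketch of the kind of list involved, assuming each use stores a `nextUse` pointer plus a `prevLink` that holds the address of whichever pointer currently points at it. Only `prevLink` is named in the description above; the `Use` struct and the `linkUse`, `unlinkUse`, and `validateUses` helpers are hypothetical and are not the actual `IRUse` API.

```cpp
#include <cassert>

// Hypothetical stand-in for a use-list node.
struct Use
{
    Use*  nextUse  = nullptr; // next use in the list
    Use** prevLink = nullptr; // address of the pointer that points at this use
};

// Push a use onto the front of the list rooted at `head`.
void linkUse(Use*& head, Use* use)
{
    use->nextUse  = head;
    use->prevLink = &head;
    if (head) head->prevLink = &use->nextUse;
    head = use;
}

// Remove a use from whatever list it is on.
void unlinkUse(Use* use)
{
    if (!use->prevLink) return;          // not on any list
    *use->prevLink = use->nextUse;       // predecessor now skips this node
    if (use->nextUse)
        use->nextUse->prevLink = use->prevLink; // the step that is easy to forget
    use->nextUse  = nullptr;
    use->prevLink = nullptr;
}

// Debug validation: every node's prevLink must be the pointer we followed to reach it.
bool validateUses(Use*& head)
{
    Use** link = &head;
    for (Use* u = head; u; u = u->nextUse)
    {
        if (u->prevLink != link) return false;
        link = &u->nextUse;
    }
    return true;
}
```

Without the `prevLink` update on the next node, a forward walk still looks correct, but the next `unlinkUse` on that node writes through a stale pointer.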
A few other fixes along the way:
- Made the `List<T>` type default-initialize elements when you resize it. I hadn't realized we weren't doing that.
- Add a standalone `dumpIR(IRGlobalValue*)` to help when debugging issues.
|
| |
|
| |
* Re-define deprecated compile flags
By including these flags in the header file, with a value of zero, we can allow some existing code to compile even after the major changes to the implementation.
* The `SLANG_COMPILE_FLAG_NO_CHECKING` option will effectively be ignored, since checking is always enabled.
* The `SLANG_COMPILE_FLAG_SPLIT_MIXED_TYPES` option will now act as if it is always enabled (and indeed some of the code has been relying on this flag being set always).
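As a rough sketch of why a zero value keeps old call sites compiling: OR-ing a zero-valued flag into a flag word is a no-op. The `CompileFlags` typedef, the `EXAMPLE_FLAG_STILL_SUPPORTED` name, and `legacyFlags` are hypothetical and only there for illustration; just the two deprecated flag names and their zero value come from the description above.

```cpp
#include <cstdint>

// Hypothetical flag word type.
typedef std::uint32_t CompileFlags;

enum : CompileFlags
{
    EXAMPLE_FLAG_STILL_SUPPORTED = 1u << 0, // hypothetical live flag

    // Deprecated flags: still declared so old code compiles, but they set no bits.
    SLANG_COMPILE_FLAG_NO_CHECKING       = 0,
    SLANG_COMPILE_FLAG_SPLIT_MIXED_TYPES = 0,
};

// Flags as an existing application might have assembled them.
CompileFlags legacyFlags()
{
    return EXAMPLE_FLAG_STILL_SUPPORTED
         | SLANG_COMPILE_FLAG_NO_CHECKING
         | SLANG_COMPILE_FLAG_SPLIT_MIXED_TYPES;
}
```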
* Make subscript operators writable for writable textures
This even had a `TODO` comment saying that we needed to fix it, and now I'm seeing semantic checking failures because we didn't define these and so we find assignment to non l-values.
* Fix definitions of any() and all() intrinsics
These should always return a scalar `bool` value, but they were being defined wrong in two ways:
1. They were using their generic type parameter `T` in the return type
2. They were returning a vector in the vector case, and a matrix in the matrix case.
This change just alters the return type to be `bool` in all cases.
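The intended signatures can be illustrated in C++ terms as below. This is only a sketch of the semantics (a component counts as true when it is non-zero), not the stdlib declarations themselves; the `sketch` namespace and the `Vec` and `Mat` types are hypothetical stand-ins for the built-in vector and matrix types.

```cpp
#include <cstddef>

namespace sketch
{
    // Hypothetical stand-ins for vector<T,N> and matrix<T,R,C>.
    template<typename T, std::size_t N>
    struct Vec { T elems[N]; };

    template<typename T, std::size_t R, std::size_t C>
    struct Mat { Vec<T, C> rows[R]; };

    // The result is a scalar bool no matter what the element type T is.
    template<typename T, std::size_t N>
    bool any(const Vec<T, N>& v)
    {
        for (std::size_t i = 0; i < N; ++i)
            if (v.elems[i] != T(0)) return true;
        return false;
    }

    template<typename T, std::size_t N>
    bool all(const Vec<T, N>& v)
    {
        for (std::size_t i = 0; i < N; ++i)
            if (v.elems[i] == T(0)) return false;
        return true;
    }

    // The matrix overloads also collapse to a single bool, not a matrix of bools.
    template<typename T, std::size_t R, std::size_t C>
    bool any(const Mat<T, R, C>& m)
    {
        for (std::size_t r = 0; r < R; ++r)
            if (any(m.rows[r])) return true;
        return false;
    }

    template<typename T, std::size_t R, std::size_t C>
    bool all(const Mat<T, R, C>& m)
    {
        for (std::size_t r = 0; r < R; ++r)
            if (!all(m.rows[r])) return false;
        return true;
    }
}
```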
* Fix bug in SSA construction
When eliminating a trivial phi node, it is possible that the phi is still recorded as the "latest" value for a local variable in its block.
When later code queries that value from the block (which can happen whenever another block looks up a variable in its predecessors), it would get the old phi and not the replacement value.
I simply added a loop that checks if the value we look up is a phi that got replaced, and then continues with the replacement value (which might itself be a phi...). A more advanced solution might try to get clever and have the map itself hold `IRUse` values so that we can replace them seamlessly.
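The lookup loop described above might look roughly like this. It is a sketch under assumptions: `Value`, `Block`, the `latest` map, and `lookupLatest` are hypothetical names, and the real SSA construction code keeps more state than a string-keyed map.

```cpp
#include <map>
#include <string>

// Hypothetical IR value: a phi that was eliminated records what replaced it.
struct Value
{
    bool   isPhi       = false;
    Value* replacement = nullptr; // set when a trivial phi is eliminated
};

// Hypothetical block: remembers the latest value of each local variable.
struct Block
{
    std::map<std::string, Value*> latest;
};

// Look up a variable, skipping over any phis that have since been replaced.
Value* lookupLatest(const Block& block, const std::string& var)
{
    auto it = block.latest.find(var);
    if (it == block.latest.end()) return nullptr;

    Value* value = it->second;
    // The replacement might itself be a phi that got replaced, so keep following.
    while (value && value->isPhi && value->replacement)
        value = value->replacement;
    return value;
}
```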
* Simplify IR control flow representation
This change gets rid of various special-case operations for conditional and unconditional branches, and instead requires emit logic to recognize when a direct branch is targeting a `break` or `continue` label.
The new approach here isn't perfect, but it seems better than what we had before, because it can actually work in the presence of control-flow optimizations (including our current critical-edge-splitting step).
* Load from groupshared isn't groupshared
When loading from a `groupshared` variable, the resulting temporary shouldn't have the `groupshared` qualifier on it.
This might eventually need to generalize to a better understanding of storage modifiers in the IR, but I don't really want to deal with that right now.
* Don't emit references to typedefs in output code
Now that we are using the IR for all codegen, we shouldn't be dealing with surface-level things like `typedef` declarations in the output code; just use the type that was being referred to in the first place.
* Fix floating-point literal printing for IR
The IR was calling `emit()` instead of `Emit()` (we really need to normalize our convention here), and was implicitly invoking a default constructor on `String` that takes a `double` (that constructor should really be marked `explicit`), and which doesn't meet our requirements for printing floating-point values.
* Fix error when importing module that doesn't parse
We already added a case to bail out if semantic checking fails, but neglected to add a case if there is an error during parsing of a module to be imported.
Note: this logic doesn't correctly register the module as being loaded (but still in error), so users could see multiple error messages if there are multiple `import`s for the same module.
* Improve error message for overload resolution failure
- Drop debugging info from the candidate printing
- Add cases to print `double` and `half` types properly
* Fixup: switch loopTest to ifElse in expected IR output
|
| |
|
|
|
|
|
|
| |
Previously, all legalizations of a generic type would use the name of the original decl for the "ordinary" part of things, and this would lead to collisions because the names didn't include the mangled generic arguments.
This is now fixed by storing the mangled name of the original inside of `struct` declarations created for legalization, and using those names instead.
Also adds support for `getElementPtr` instructions when doing IR type legalization.
Also tries to make a `DeclRefType` convert to a string using the underlying `DeclRef`. This doesn't help because `DeclRef::toString` doesn't actually include generic arguments either.
|
| |
|
|
| |
`lookup_witness_table` instruction. (#376)
|
| |
|
|
|
|
| |
1. allow spReflection_FindTypeByName to accept arbitrary type expression string
2. allow const int generic value to be used as expression value, and as array size
3. various bug fixes in witness table specialization / function cloning during specializeIRForEntryPoint to avoid creating duplicate global values, not copying the right definition of a function from the other module, not cloning witness tables that are required by specializeGenerics etc.
|
| |
|
|
|
| |
fixes #373
fixes bug that misses current translation unit's scope when resolving entry-point global type argument expression.
|
| |
|
|
|
| |
1. prevent cyclic lookups when an interface inherits transitively from itself.
2. in `createGlobalGenericParamSubstitution`, create a default substitution for the base type declref before using it to lookup the witness table.
|
| | |
|
| |
|
|
| |
fixes #362
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This commit changes the type of `DeclRefBase::substitutions` from `RefPtr<Substitutions>` to `SubstitutionSet`, which is a new type defined as follows:
```
struct SubstitutionSet
{
    RefPtr<GenericSubstitution> genericSubstitutions;
    RefPtr<ThisTypeSubstitution> thisTypeSubstitution;
    RefPtr<GlobalGenericParamSubstitution> globalGenParamSubstitutions;
};
```
This change gets rid of most helper functions for retrieving the substitution of a certain type, as well as the surgery operations that insert a `ThisTypeSubstitution` or `GlobalGenericParamSubstitution` at the top or bottom of the substitution chain. It also simplifies type comparison when a certain kind of substitution should not be considered part of the type definition.
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
| |
* fix #353
* move validateEntryPoint to after all entry points have been checked
* bug fix: DeclRefType::SubstituteImpl should change ioDiff
* bug fix: generic resource usage should have count of 1 instead of 0.
* update test case
|
| | |
|
| |\ |
|
| | |
| |
| |
| |
| |
| | |
fixes #341
When a typedef definition is used to satisfy an associated type, we must also substitute the resulting typedef type using parent substitution, in the case that the typedef is a generic application.
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
fixes #339
`NamedExpressionType::CreateCanonicalType()` may return a deleted pointer. The original implementation is as follows:
```
Type* NamedExpressionType::CreateCanonicalType()
{
    return GetType(declRef)->GetCanonicalType();
}
```
If `GetType()` returns a newly constructed Type (this happens when the `typedef` is defined inside a generic parent, which triggers a non-trivial substitution), the temporary type will be deleted when the function returns. The fix is to store the temporary type as a field of NamedExpressionType (`innerType`).
A related fix (though not the true cause of issue #339) is to have `Type::GetCanonicalType()` also hold a `RefPtr` to the constructed canonical type when the canonical type is not `this`. This guards against a returned canonical type being assigned to a `RefPtr` that then becomes its sole owner and deletes the canonical type when that `RefPtr` is destroyed.
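The lifetime problem and the `innerType` fix can be sketched as follows. This uses `std::shared_ptr` as a stand-in for Slang's `RefPtr`, and `Type`, `makeSubstitutedType`, and `NamedType` are simplified hypothetical stand-ins rather than the real classes.

```cpp
#include <memory>

// Simplified stand-in for a type that is already canonical.
struct Type
{
    int id = 0;
    Type* getCanonicalType() { return this; }
};

// Stand-in for GetType(declRef) in the case where substitution builds a fresh object.
std::shared_ptr<Type> makeSubstitutedType()
{
    auto type = std::make_shared<Type>();
    type->id = 42;
    return type;
}

struct NamedType
{
    // Keeps the substituted type alive for as long as the named type exists.
    std::shared_ptr<Type> innerType;

    Type* createCanonicalType()
    {
        // Buggy form: `return makeSubstitutedType()->getCanonicalType();`
        // The temporary owner dies at the end of that statement, so the
        // returned raw pointer would dangle.
        innerType = makeSubstitutedType();
        return innerType->getCanonicalType();
    }
};
```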
|
| |/
|
| |
fixes #325
This commit includes the following changes:
1. Including a default DeclaredSubtypeWitness argument when creating a default GenericSubstitution for a DeclRefType, so that the witness argument can be successfully replaced with an actual witness table after specialization. (check.cpp)
2. Not emitting the full mangled name for struct field members. The declref of a member access instruction does not include the necessary generic substitutions for its parent generic parameters, so the mangled names at the declaration site and the use site do not match. Instead we just emit the original name for struct fields. (emit.cpp)
3. Allow IRWitnessTable to represent a generic witness table for generic structs. Adds necessary fields to IRWitnessTable for generic specialization. For now, the user field of the IRUse is not used and is nullptr. (ir-inst.h)
4. Make IRProxyVal use an IRUse instead of an IRValue*, so that an IRValue referenced by IRProxyVal (as a substitution argument) can be managed by the def-use chain for easy replacement. This is used for specializing witness tables. (ir.cpp, ir.h)
5. Add a `String dumpIRFunc(IRFunc*)` function for debugging.
6. Add name mangling for generic / specialized witness tables (mangle.cpp)
7. improved natvis file for inspecting witness tables.
8. Add specialization of witness tables:
1) `findWitnessTable` will simply return the specialized IRInst for a generic witness table.
2) make `cloneSubstitutionArg` call `cloneValue` to clone the argument instead of calling `context->maybeCloneValue`, so we can make use of the cloned-value lookup mechanism to directly return the specialized witness table (which is created when we process the `specialize` instruction on the generic witness table, before we process the decl ref).
3) bug fix: the argument in ir.cpp:3338 should be `newArg` instead of `arg`.
4) add a `specializeWitnessTable` function to specialize a generic witness table. It clones the witness table, and recursively calls `getSpecializedFunc` for the witness table entries.
5) make the `specializeGenerics` function also handle the case when an operand of the `specialize` instruction is a witness table. We will call `specializeWitnessTable` here and replace the `specialize` instruction with the specialized witness table. The replacement mechanism based on the IR def-use chain works here because we have already made IRProxyVal a part of the def-use chain.
9. Add two more test cases for nested generics with constraints. (generic-list.slang and generic-struct-with-constraint.slang)
|
| |
|
| |
* Change stdlib `saturate` to explicitly specialize `clamp`
This exposes issue #329, and so gives us an easy way to see if transitive subtype witnesses have been implemented correctly.
* Fixup: invoke correct `clamp` overloads
When switching the `clamp` calls in the stdlib definition of `saturate` I made two big mistakes:
1. I was passing in `<T>` in all cases, instead of, e.g., `<vector<T,N>>` in the vector case
2. Of course, the overloads don't actually take `<vector<T,N>>` for the vector case, because `vector<T,N>` is not a `__BuiltinArithmeticType` (`T` is), so instead it should be `clamp<T,N>(...)`.
The issue behind (2) is that we don't support "conditional conformances," which would be a way to say that when `T : __BuiltinArithmeticType` then `vector<T,N> : __BuiltinArithmeticType`. That would be a great long-term wish-list feature, but not something I can see us adding in a hurry.
Anyway the fix here is the simple one: change the vector/matrix call sites to invoke the correct overload in each case.
* Add a notion of transitive subtype witnesses
There are two pieces here:
1. Add the `TransitiveSubtypeWitness` class. This is a witness that `A : C` that works by storing nested subtype witnesses that show that `A : B` and `B : C` for some intermediate type `B`. All the basic `Val` operations are easy enough to define on this.
- The one gotcha case is whether we can ever simplify away a `TransitiveSubtypeWitness` as part of substitution. That is, if we end up substituting so that both `A` and `B` end up as the same type, then we really just need the `B : C` sub-part. Stuff like that is left as future work.
2. Make the logic in `check.cpp` that constructs subtype witnesses based on found inheritance and constraint declarations able to build up transitive chains. Most of the required infrastructure was already there (the search process maintains a trail of "breadcrumbs" that represent all the steps getting from `A : B` to `B : C` to `C : D` ...).
This change does *not* deal with the required changes in the IR to take advantage of transitive witnesses.
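To make the shape of a transitive witness concrete, here is a small sketch that folds a trail of breadcrumbs (`A : B`, `B : C`, `C : D`) into one nested witness for `A : D`. Everything in it is an assumption for illustration: the `Witness` struct, its field names, `declared`, and `buildChain` are hypothetical, and types are represented by plain strings rather than real type objects.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical witness that `sub : sup` holds.
struct Witness
{
    std::string sub;
    std::string sup;
    // For a transitive witness A : C, these hold the proofs of A : B and B : C.
    std::shared_ptr<Witness> subToMid;
    std::shared_ptr<Witness> midToSup;

    bool isTransitive() const { return subToMid && midToSup; }
};

// A witness that comes directly from an inheritance or constraint declaration.
std::shared_ptr<Witness> declared(const std::string& sub, const std::string& sup)
{
    auto witness = std::make_shared<Witness>();
    witness->sub = sub;
    witness->sup = sup;
    return witness;
}

// Fold a trail of steps into a single witness; returns null if the trail
// is empty or two adjacent steps do not share their intermediate type.
std::shared_ptr<Witness> buildChain(const std::vector<std::shared_ptr<Witness>>& steps)
{
    if (steps.empty()) return nullptr;
    std::shared_ptr<Witness> result = steps[0];
    for (std::size_t i = 1; i < steps.size(); ++i)
    {
        if (result->sup != steps[i]->sub) return nullptr;
        auto transitive = std::make_shared<Witness>();
        transitive->sub      = result->sub;
        transitive->sup      = steps[i]->sup;
        transitive->subToMid = result;
        transitive->midToSup = steps[i];
        result = transitive;
    }
    return result;
}
```

The sketch nests to the left, so `A : D` is built from `(A : C)` and `(C : D)`; whether the real implementation nests the same way is not stated above.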
|
| |
|
| |
* Make AST and IR share type legalization code
A previous change already made it so that the AST-to-AST lowering/legalization pass could work together with IR-based lowering of `import`ed code, but that change didn't take into account the case where a function written in the AST needed to call an IR function and pass in a type that required legalization.
Both the IR-based and AST-based passes had their own approaches to type legalization, that mostly agreed on the desired output, but they ended up creating their own representations for legalized types which would mean that for a function call the caller and callee might end up legalizing the parameter list to use different types.
This change tries to fix this issue (and adds a new test case that relies on the fix) by massively overhauling the AST-based legalization pass so that it uses the same type legalization code as the IR. The shared code has been moved out into `legalize-types.{h,cpp}`.
Notes:
- I eliminated the `FilteredTupleType` type, since it was starting to cause code duplication in a lot of places. Instead, type legalization just creates new `struct` types to represent the result of filtering.
- One big consequence of this is that the `LegalType::pair` case needs to remember for each field in the original type which field (if any) in the new `struct` type it maps to
- A big source of complexity (and probably bugs) in this code is trying to figure out how to parent these new `struct` definitions effectively. A good follow-on change would be something that outputs declarations on-demand during the AST emit logic (as we do for the IR), just to avoid some of this song and dance.
- The old AST type legalization had a notion of both a "tuple" type and a "varying tuple" type. The "tuple" case was quite complex, and combined behavior currently handled by `LegalType::pair` (for splitting into ordinary and special sides) and `LegalType::tuple` (for holding multiple distinct elements to represent the fields of an aggregate). The "varying tuple" case was closer to `LegalType::tuple`, so I tried to just re-use the existing logic for that too. The one place this potentially gets messy is in `reifyTuple()`.
- The messiest bit of handling the "varying tuple" concept (which is used for GLSL shader inputs/outputs since they have to be scalarized) is that when passing them as function arguments we need to reify the tuple back into a structured value. Because the `LegalExpr` hierarchy doesn't have type information, but constructing a value of the "original" type requires such information, things get a little messy.
- I did *not* try to deal with any of the logic related to handling system inputs/outputs for cross-compilation purposes. Of course, the long-term goal is that any actual cross-compilation is handled via the IR, but this change can't afford to break the AST-based path just yet. As a result, there is still quite a bit of complexity in the handling of assignment, to deal with cases where "fixups" are required.
* fixup: bad code in macro, not caught by Visual Studio compiler
* fixup: more stuff missed by VS compiler
* fixup: VS continues to miss stuff in UNREACHABLE_RETURN
|
| |
|
| |
Add a new function:
`substituteSubstitutions(Substitutions * substHead, Substitutions subst, int * ioDiff)`
This function substitutes the type arguments referenced in a linked list of substitutions headed at `substHead` using the substitutions specified by `subst`. If the linked list `substHead` does not contain `GlobalGenericParamSubstitution` entries, they will be added to the bottom (outermost end) of the linked list.
Note that this function should only be called when `substHead` is known to be the head of a substitution linked list, because the existence of `GlobalGenericParamSubstitution` entries is detected on the assumption that the list starts at `substHead`. If a substitution that is not the head of its list is passed in, duplicate `GlobalGenericParamSubstitution`s could be appended to the linked list.
This means that this function should *not* be called in places like `GenericSubstitution::SubstituteImpl()` for its outer substitutions, because `outer` is obviously not the head of the linked list. Instead, use this function to substitute the substitution lists of `DeclRef` etc., where the head of the linked list is known as a member of that class, rather than calling `declRef.substitutions->SubstituteImpl()`.
With this function, IRSpecContext::maybeCloneType() is simplified down to `originalType->Substitute(subst)`
Updates `DeclRefBase::SubstituteImpl` and `DeclRefType::SubstituteImpl` to call `substituteSubstitutions` instead of making direct `substitutions->SubstituteImpl` call.
Provides an actual implementation of `GlobalGenericParamSubstitution::SubstituteImpl` instead of just returning `this`, to deal with potential situations where a true substitution is needed.
|
| | |
|
| |
|
|
|
|
| |
1. simplify RoundUpToAlignment()
2. add a new render-compute test case to cover the situation where the entry-point interface (parameter/return types of an entry-point function) is dependent on the global generic type.
3. initial fixes to get this test case to compile (but is not producing correct HLSL output yet)
|
| |
|
| |
These were already being handled a little bit, by lowering an `out T` or `inout T` function parameter in the AST to a function parameter with type `T*` in the IR, and then emitting explicit loads/stores.
The HLSL emit logic, however, couldn't tell the difference between an `out` parameter, an `inout`, or a true pointer (if we ever needed to support them...).
The intention (not fully implemented) was that we'd use a hierarchy of types rooted at `PtrTypeBase`:
- `PtrTypeBase`
- `Ptr`: "real" pointers in the C/C++ sense
- `OutTypeBase`: pointers used to represent by-reference parameter passing
- `OutType`: IR level type for an `out` parameter
- `InOutType`: IR level type for an `inout` or `in out` parameter
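The hierarchy above could be modeled in C++ along these lines. This is a sketch, not the actual Slang declarations: the class bodies are empty, `PtrType` is assumed to be the class behind the `Ptr` entry, and the helper functions `parameterQualifier`, `isByReferenceParam`, and `isPointerLike` are hypothetical names showing how emit logic and other passes might tell the cases apart.

```cpp
#include <string>

// Empty stand-ins for the type classes; only the inheritance shape matters here.
struct Type         { virtual ~Type() = default; };
struct PtrTypeBase  : Type {};
struct PtrType      : PtrTypeBase {};  // "real" pointers in the C/C++ sense
struct OutTypeBase  : PtrTypeBase {};  // by-reference parameter passing
struct OutType      : OutTypeBase {};  // `out` parameter
struct InOutType    : OutTypeBase {};  // `inout` / `in out` parameter

// What an HLSL emitter might print in front of a parameter of this type.
std::string parameterQualifier(const Type* type)
{
    if (dynamic_cast<const InOutType*>(type)) return "inout ";
    if (dynamic_cast<const OutType*>(type))   return "out ";
    return "";
}

// Checks that only care about by-reference parameters test for OutTypeBase.
bool isByReferenceParam(const Type* type)
{
    return dynamic_cast<const OutTypeBase*>(type) != nullptr;
}

// Checks that apply to anything loadable/storable test for PtrTypeBase.
bool isPointerLike(const Type* type)
{
    return dynamic_cast<const PtrTypeBase*>(type) != nullptr;
}
```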
Actually implementing this involved:
- Adding a bit more flexibility to the `Session::getPtrType` logic to allow for creating any of the concrete types above
- Making the `lower-to-ir` logic create the right type for function parameters (instead of just using `PtrType`)
- Making the HLSL emit logic check for the `OutType` and `InOutType` cases rather than just `PtrType`
- Changing a bunch of small places in the code so that they use `PtrTypeBase` instead of `PtrType` when they should handle any of the above cases, and also make a few places check for `OutTypeBase` instead of `PtrType` or `PtrTypeBase`, when they are really trying to capture by-reference parameters
- Add a test case that uses all of the different cases we care about (without these fixes, this test case generates errors from fxc because of variables being used before being initialized, because parameters get declared `out` that should be `inout`).
A minor point here is that we are playing a bit fast and loose right now because the IR does not actually enforce any type checks. From the standpoint of the front end, `Ptr<T>`, `Out<T>`, and `InOut<T>` are all unrelated types (each is just a `struct` declared in `core.meta.slang`), but this doesn't really matter because none of these are types our current users are explicitly using.
In the IR it makes perfect sense to allow `Out<T>` or `InOut<T>` as the operand of a `load` or `store` instruction (and ditto for `getFieldAddr`, etc.) - these instructions just apply to any `PtrTypeBase`. The place where this potentially gets tricky is whether an `Out<T>` can be used where a `Ptr<T>` is expected, or vice versa (e.g., can I just pass my local variable's pointer directly to an `Out<T>` function parameter?).
I'm going to ignore these issues for now, since the code currently works for our test case.
|