<feed xmlns='http://www.w3.org/2005/Atom'>
<title>slang.git/source/slang/ir-insts.h, branch master</title>
<subtitle>Making it easier to work with shaders</subtitle>
<id>https://git.yummers.dev/slang.git/atom?h=master</id>
<link rel='self' href='https://git.yummers.dev/slang.git/atom?h=master'/>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/'/>
<updated>2019-05-31T21:20:37+00:00</updated>
<entry>
<title>Use slang- prefix on slang compiler and core source (#973)</title>
<updated>2019-05-31T21:20:37+00:00</updated>
<author>
<name>jsmall-nvidia</name>
<email>jsmall@nvidia.com</email>
</author>
<published>2019-05-31T21:20:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=6cbc3929a54d37bd23cb5efa8e3320ba02f78b2f'/>
<id>urn:sha1:6cbc3929a54d37bd23cb5efa8e3320ba02f78b2f</id>
<content type='text'>
* Prefixing source files in source/slang with slang-

* Prefix source in source/slang with slang- prefix.

* Rename core source files with slang- prefix.

* Update project files.

* Fix problems from automatic merge.
</content>
</entry>
<entry>
<title>Hotfix/improve glsl semantic conversion review (#968)</title>
<updated>2019-05-22T16:00:20+00:00</updated>
<author>
<name>jsmall-nvidia</name>
<email>jsmall@nvidia.com</email>
</author>
<published>2019-05-22T16:00:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=3247174cdb00836435794e3f07daad70bc92b66f'/>
<id>urn:sha1:3247174cdb00836435794e3f07daad70bc92b66f</id>
<content type='text'>
* Small changes based on review

* Remove the explicit 'nominal' tests
* Made isValueEqual and isEqual on on IRConstant take a pointer
* Small improvements to comments, and clarity of using 'nominal'

* Simplify comparison by just using isTypeOperandEqual as basis for isTypeEqual

* Use cross compile to test half-texture.slang on glsl

* Don't need half-texture.slang.expected

* Fix handling of nominal comparison based on review, ensuring that for nominal insts, they can only be compared by pointer.
</content>
</entry>
<entry>
<title>Allow interface types to be used inside of structs (#966)</title>
<updated>2019-05-21T15:52:37+00:00</updated>
<author>
<name>Tim Foley</name>
<email>tfoleyNV@users.noreply.github.com</email>
</author>
<published>2019-05-21T15:52:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=7ffe6f03976c98ba0233483d0182fc0ad600fd7a'/>
<id>urn:sha1:7ffe6f03976c98ba0233483d0182fc0ad600fd7a</id>
<content type='text'>
Previously, interface types were allowed to be used directly as function parameters, local variables, and global shader parameters.
Using an interface type as a field of a `struct` type or a `cbuffer` declaration was not implemented.
This change adds that support, and fixes several unrelated issues that caused problems in doing so.

* The most important work here was adding a case for `IRStructType` to `maybeSpecializeBindExistentialsType` that creates a specialized variant of a `struct` type on-demand based on specialization operands. This logic loops over the fields of the original struct, and creates new fields by binding the existentials/interfaces in the type of each field. Caching is used to ensure that the same `struct` type specialized to the same operands should yield the same result.

* To allow subsequent specialization to occur when a `struct` with interface-type fields is used, it was also necessary to specialize field-address and field-extract instructions in cases where the value that the field is being extracted from is a `wrapExistential`.

* Similarly, we neede to make sure that the logic for specializing called functions based on the concrete types for interfaces in the argument list would also take into account `struct` types with existential-type fields inside of them.

* Doing the above changes revealed some serious flaws in how the `ir-specialize.cpp` logic was tracking which instructions still needed to be processed. It had previously been assuming that it could assume any relevant instructions were on its work list, and when the work list went empty it could exit. This runs into two problems: (1) sometimes we create new instructions when specializing, and it may be impossible to ensure that all the new instructions (e.g., those created by utility routines in other files) get added to the work list, and (2) sometimes the instruction(s) that need to be re-visited when we specialize something aren't its direct users, but instead somethign that transitively depends on the instruction.

  These issues were fixed by two changes to the pass: (1) we now maintain a list of known "clean" instructions instead of implicitly using the work-list as a list of "dirty" instructions (so that implicitly any new instruction is dirty), and periodically iterating over all instructions to add the non-clean ones to the work list for processing, and (2) when an instruction is specialized/replaced we mark everything that transitively depends on it "dirty" (by removing it from the "clean" list).

* Added some logic to "fix up" the type of an IR function after changes that might modify its parameter list. Failing to have this logic meant that certain types were still live (because they were referenced by a function type) that couldn't actually be emitted as legal HLSL/GLSL.

* Added some special cases to IR instruction creation for `wrapExistential` and `BindExistentialsType` so that they act as no-ops when there are no "slots" providing specialization information. This helps avoid some special cases when specializing structure fields (since some fields specialization and others don't, so in general there are zero or more operands specific to each field).

* Added a test case that uses an interface type in a `cbuffer`, as well as an interface type in a `struct` passed as an entry-point `uniform` parameter.

* Fixed up some parts of the `.natvis` files to reflect naming changes from a previous PR and thus restore some of the useful Visual Studio debugging experience for Slang.</content>
</entry>
<entry>
<title>String/List closer to conventions, and use Index type (#959)</title>
<updated>2019-04-29T21:03:46+00:00</updated>
<author>
<name>jsmall-nvidia</name>
<email>jsmall@nvidia.com</email>
</author>
<published>2019-04-29T21:03:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=4880789e3003441732cca4471091563f36531635'/>
<id>urn:sha1:4880789e3003441732cca4471091563f36531635</id>
<content type='text'>
* List made members m_
Tweaked types to closer match conventions.

* Use asserts for checking conditions on List.
Other small improvements.

* List&lt;T&gt;.Count() -&gt; getSize()

* List&lt;T&gt;
Add -&gt; add
First -&gt; getFirst
Last -&gt; getLast
RemoveLast -&gt; removeLast
ReleaseBuffer -&gt; detachBuffer
GetArrayView -&gt; getArrayView

* List&lt;T&gt;::
AddRange -&gt; addRange
Capacity -&gt; getCapacity
Insert -&gt; insert
InsertRange -&gt; insertRange
AddRange -&gt; addRange
RemoveRange -&gt; removeRange
RemoveAt -&gt; removeAt
Remove -&gt; remove
Reverse -&gt; reverse
FastRemove -&gt; fastRemove
FastRemoveAt -&gt; fastRemoveAt
Clear -&gt; clear

* List&lt;T&gt;
FreeBuffer -&gt; _deallocateBuffer
Free -&gt; clearAndDeallocate
SwapWith -&gt; swapWith

* List&lt;T&gt;
SetSize -&gt; setSize
Reserve -&gt; reserve
GrowToSize growToSize

* UnsafeShrinkToSize -&gt; unsafeShrinkToSize
Compress -&gt; compress
FindLast -&gt; findLastIndex
FindLast -&gt; findLastIndex
Simplify Contains

* List&lt;T&gt;
Removed m_allocator (wasn't used)
Swap -&gt; swapElements
Sort -&gt; sort
Contains -&gt; contains
ForEach -&gt; forEach
QuickSort -&gt; quickSort
InsertionSort -&gt; insertionSort
BinarySearch -&gt; binarySearch

Max -&gt; calcMax
Min -&gt; calcMin

* Initializer::Initialize -&gt; initialize
List&lt;T&gt;::
Allocate -&gt; _allocate
Init -&gt; _init
IndexOf -&gt; indexOf

* * Put #include &lt;assert.h&gt; in common.h, and remove unneeded inclusions
* Small refactor of ArrayView - remove stride as not used

* getSize -&gt; getCount
setSize -&gt; setCount
unsafeShrinkToSize-&gt;unsafeShrinkToCount
growToSize -&gt; growToCount
m_size -&gt; m_count

* Some tidy up around Allocator.

* Use Index type on List.

* Refactor of IntSet.
First tentative look at using Index.

* Made Index an Int
Did preliminary fixes.
Made String use Index.

* Partial refactor of String.

* String::Buffer -&gt; getBuffer
ToWString -&gt; toWString

* Small improvements to String.
String::
Buffer() -&gt; getBuffer()
Equals() -&gt; equals

* Try to use Index where appropriate.

* Fix warnings on windows x86 builds.
</content>
</entry>
<entry>
<title>Initial support for the `precise` keyword (#958)</title>
<updated>2019-04-29T16:31:25+00:00</updated>
<author>
<name>Tim Foley</name>
<email>tfoleyNV@users.noreply.github.com</email>
</author>
<published>2019-04-29T16:31:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=ded340beb4b5197b559626acc39920abb2d39e77'/>
<id>urn:sha1:ded340beb4b5197b559626acc39920abb2d39e77</id>
<content type='text'>
Fixes #858

The `precise` keyword exists in both HLSL and GLSL and when applied to a variable declaration is supposed to indicate that all computations that contribute to the value of that variable should not be altered based on "fast-math" optimizations. The main examples are that separate multiply and add operations should not be turned into fused multiply-add (fma) operations, and that operations cannot ignore the possibility of infinity or not-a-number values (e.g., by assuming that `x * 0.0f` is always `0.0f`).

(Aside: it is possible that my understanding of what the semantics of `precise` are in HLSL and GLSL is imperfect so that either the GLSL variant isn't sufficient to provide the semantics of the HLSL keyword, or that the definition of "all computations that contribute" to a value isn't actually correct. We may need to revise this implementation based on subsequent learnings.)

The basic idea here is to turn the AST `precise` keyword into a `[precise]` decoration in the IR and then emit that as a `precise` keyword again in the output.

The main catch is that whereas most of our existing IR decorations apply to things like global shader parameters or `struct` members that usually stick around for the duration of compilation, `[precise]` will get slapped on local variables that will often get optimized away by our SSA pass. There are two ways a variable can get eliminated/replaced during the SSA pass:

1. A use of the variable can be replaced with an ordinary instruction that computes its value.
2. A use of the variable can be replaced with a reference to a "phi node" that will take on the appropriate value based on control flow.

These two cases already had logic to copy a "name hint" decoration from the variable over to an instruction that will replace it, and I simply extended them to also propagate over a `[precise]` decoration.

The test case added with this change intentionally constructs a case where `[precise]` needs to be propagated over to an SSA "phi node" in order to generate correct output code.

The other gotcha is that we can emit variable declarations in various places in `emit.cpp`, and all of these needed to handle `[precise]`. Not only do we have actually local variables (`IRVar`), but we also have SSA phi nodes (`IRParam`), and then there are cases where an intermediate computation (an ordinary instruction) should be `[precise]` and thus we need to emit it as a temporary (not folding it into its use sites) and make sure that the temporary itself gets the `precise` keyword.

I have manually confirmed that in the output SPIR-V, this change results in the `NoContraction` SPIR-V decoration being added to the relevant operations, and the output DXBC contains a multiply and an add in place of a multiply-add. The output DXIL does not show any obvious changes due to `precise`, although the exact order and operands of the math instructions emitted does differ when `precise` is added/removed. In all cases the output is equivalent to hand-written HLSL/GLSL with a `precise`-qualified local variable.</content>
</entry>
<entry>
<title>Add better control over image formats for GLSL/SPIR-V targets (#939)</title>
<updated>2019-04-08T18:09:03+00:00</updated>
<author>
<name>Tim Foley</name>
<email>tfoleyNV@users.noreply.github.com</email>
</author>
<published>2019-04-08T18:09:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=dc54f1dd1b694b087816857a791e9d37dc25de6d'/>
<id>urn:sha1:dc54f1dd1b694b087816857a791e9d37dc25de6d</id>
<content type='text'>
* Add better control over image formats for GLSL/SPIR-V targets

Currently Slang emits GLSL code assuming all R/W images need to have explicit formats, and thus we try to infer a format from the element type of the image.
E.g., given a `RWTexture2D&lt;half4&gt;` we might infer that a qualifier of `layout(rgba16f)` should be used.

This strategy has two notable shortcomings:

* Sometimes the user will want a format that doesn't match an existing HLSL type. E.g., if they want the equivalent of `layout(r11f_g11f_b10f)`, then what should they put in their `RWTexture2D&lt;...&gt;` to make the inference do what they need?

* Sometimes the user knows that they don't need to specify a format *at all*, because using the `GL_EXT_shader_image_load_formatted` extension, they can still perform non-atomic load/store on images with no format specified in the SPIR-V.

This change adds two features directed at these challenges.

First, we add an explicit `[format(...)]` attribute that can be used to specify an explicit image format, including ones that don't match any HLSL type.
An example of using this new attribute is:

```hlsl
[format("r11f_g11f_b10f")]
RWTexture2D&lt;float3&gt; myImage;
```

For simplicity in initial bring-up, the new formats all use the same naming as formats in GLSL (this should make it easy for a programmer who knows what they expect to get in the GLSL output). We can change the naming convention for formats at a later time, so long as we keep these existing names in as a compatibility feature.

Note that this is *not* given a `vk::` prefix since the attribute should signal the programmer's intent to provide an image with that format on *all* targets (although only some targets might act on that information).

Also note that the attribute takes a string (`[format("rgba8")`) instead of a bare identifier (`[format(rgba8)]`) because this is consistent with the existing convention for attributes in HLSL.

When `[format(...)]` is left off, the default compiler behavior will still be to infer a format, but this behavior can be overidden for a single image using an explicit format of `"unknown"`:

```hlsl
[format("unknown")]
RWTexture2D&lt;float4&gt; mysteryMachine;
```

The second new feature is that if a user knows they are coding for a GPU that supports the `"unknown"` format in all non-atomic cases, then they can opt into making that the default for images without an explicit `[format(...)]`, using the new `-default-image-format-unknown` command-line option for `slangc`.

The new test case included with this change confirms that we correctly see the explicit formats in the output GLSL and *no* formats for images without explicit `[format(...)]` when using the new command-line option. The test stresses images declared at global scope, in parameter blocks, and in entry-point parameter lists, to try and make sure that all the relevant IR passes in the compiler preserve the format information.

* fixup: missing file
</content>
</entry>
<entry>
<title>Improve support for interfaces as shader parameters (#886)</title>
<updated>2019-03-09T00:24:02+00:00</updated>
<author>
<name>Tim Foley</name>
<email>tfoleyNV@users.noreply.github.com</email>
</author>
<published>2019-03-09T00:24:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=4f94dd46a2d885e570814dd14a5e46f8e0814802'/>
<id>urn:sha1:4f94dd46a2d885e570814dd14a5e46f8e0814802</id>
<content type='text'>
* Improve support for interfaces as shader parameters

This change adds two main things over the existing support:

1. It is now possible to plug in concrete types that actually contain (uniform/ordinary) fields for the existential type parameters introduced by interface-type shader parameters. The `interface-shader-param2.slang` test shows that this works.

2. There is a limited amount of support for doing correct layout computation and generating output code that matches that layout, so that interface and ordinary-type fields can be interleaved to a limited extent. The `interface-shader-param3.slang` test confirms this behavior.

There are several moving pieces in the change.

* When it comes to terminology, we try to draw a more clear distinction between existial type parameters/arguments and existential/object value parametes/arguments. A simple way to look at it is that an `IFoo[3]` shader parameter introduces a single existential type parameter (so that a concrete type argument like `SomeThing` can be plugged in for the `IFoo`) but introduces three existential object/value parameters (to represent the concrete values for the array elements).

* At the IR level, we support a few new operations. A `BindExistentialsType` can take a type that is not itself an interface/existential type but which depends on interfaces/existentials (e.g., `ConstantBuffer&lt;IFoo&gt;`) and plug in the concrete types to be used for its existential type slots.

* Then a `wrapExistentials` instruction can take a type with all the existentials plugged in (possibly by `BindExistentialsType`) and wrap it into a value of the existential-using type (e.g., turn `ConstantBuffer&lt;SomeThing&gt;` into a `ConstantBuffer&lt;IFoo&gt;`).

* The IR passes for doing generic/existential specialization have been updated to be able to desugar uses of these new operations just enough so that a `ConstantBuffer&lt;IFoo&gt;` can be used.

* When we specialize an IR parameter of an interface type like `IFoo` based on a concrete type `SomeThing`, we turn the parameter into an `ExistentialBox&lt;SomeThing&gt;` to reflect the fact that we are conceptually referring to `SomeThing` indirectly (it shouldn't be factored into the layout of its surrounding type).

* Parameter binding was updated so that it passes along the bound existential type arguments in a `Program` or `EntryPoint` to type layout, so that we can take them into account. The type layout code needs to do a little work to pass the appropriate range of arguments along to sub-fields when computing layout for aggregate types.

* Type layout was updated to have a notion of "pending" items, which represent the concrete types of data that are logically being referenced by existential value slots. The basic idea is that these values aren't included in the layout of a type by default, but then they get "flushed" to come after all the non-existential-related data in a constant buffer, parameter block, etc.

* The logic for computing a parameter group (`ConstantBuffer` or `ParameterBlock`) layout was updated to always "flush" the pending items on the element type of the group, so that the resource usage of specialized existential slots would be taken into account.

* The type legalization pass has been adapted so that we can derive two different passes from it. One does resource-type legalization (which is all that the original pass did). The new pass uses the same basic machinery to legalize `ExistentialBox&lt;T&gt;` types by moving them out of their containing type(s), and then turning them into ordinary variables/parameters of type `T`.

Big things missing from this change include:

- Nothing is making sure that "pending" items at the global or entry-point level will get proper registers/bindings allocated to them. For the uniform case, all that matters in the current compiler is that we declare them in the right order in the output HLSL/GLSL, but for resources to be supported we will need to compute this layout information and start associating it with the existential/interface-type fields.

- Nothing is being done to support `BindExistentials&lt;S, ...&gt;` where `S` is a `struct` type that might have existential-type fields (or nested fields...). Eventually we need to desugar a type like this into a fresh `struct` type that has the same field keys as `S`, but with fields replaced by suitable `BindExistentials` as needed. (The hard part of this would seem to be computing which slots go to which fields). As a practial matter, this missing feature means that interface-type members of `cbuffer` declarations won't work.

The current tests carefully avoid both of these problems. They don't declare any buffer/texture fields in the concrete types, and they don't make use of `cbuffer` declarations or `ConstantBuffer`s over structure types with interface-type fields.

* fixup: add override to methods

* fixup: typos
</content>
</entry>
<entry>
<title>First steps toward supporting interface-type parameters on shaders (#852)</title>
<updated>2019-02-19T19:46:05+00:00</updated>
<author>
<name>Tim Foley</name>
<email>tfoleyNV@users.noreply.github.com</email>
</author>
<published>2019-02-19T19:46:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=32135c5bdfb4d387f8227742a2d2fd555898aca8'/>
<id>urn:sha1:32135c5bdfb4d387f8227742a2d2fd555898aca8</id>
<content type='text'>
* First steps toward supporting interface-type parameters on shaders

What's New
----------

From the perspective of a user, the main thing this change adds is the ability to declare top-level shader parameters (either at global scope, or in an entry-point parameter list) with interface types. For example, the following becomes possible:

```hlsl
// Define an interface to modify values
interface IModifier { float4 modify(float4 val); }

// Define some concrete implementations
struct Doubler : IModifier
{
    float4 modify(float4 val) { return val + val; }
}
struct Squarer : IModifier { ... }

// Define a global shader parameter of interface type
IModifier gGlobalModifier;

// Define an entry point with an interface-type `uniform` parameter
void myShader(
    unifrom IModifier entryPointModifier,
            float4    inColor  : COLOR,
        out float4    outColor : SV_Target)
{
    // Use the interface-type parameters to compute things
    float4 color = inColor;
    color = gGlobalModifier.modify(color);
    color = entryPointModifier.modify(color);
    outColor = color;
}
```

The user can specialize that shader by specifying the concrete types to use for global and entry-point parameters of interface types (e.g., plugging in `Doubler` for `gGlobalModifier` and `Squarer` for `entryPointModifier`).

The "plugging in" process is done in terms of a concept of both global and local "existential slots" which are a new `LayoutResourceKind` that represents the holes where concrete types need to be plugged in for existential/interface types.

In simple cases like the above, each interface-type parameter will yield a single existential slot in either the global or entry-point parameter layout. Users can query the start slot and number of slots for each shader parameter, just like they would for any other resource that a parameter can consume. Before generating specialized code, the user plugs in the name of the concrete type they would like to use for each slot using `spSetTypeNameForGlobalExistentialSlot` and/or `spSetTypeNameForEntryPointExistentialSlot`.

There are some major limitations to the implementation in this first change:

* Parameters must be of interface type (e.g., `IFoo`) and not an array (`IFoo[3]`), or buffer (`ConstantBuffer&lt;IFoo&gt;`) over an interface type. Similarly, `struct` types with interface-type fields still don't work.

* The work on interface-type function parameters still doesn't include support for `out` or `inout` parameters, nor for functions that return interface types (that isn't technically related to this change, but affects its usefullness).

* No work is being done to correctly lay out shader parameters once the concrete types for existential slots are known, so that this change really only works when the concrete type that gets plugged in is empty.

These limitations are severe enough that this feature isn't really usable as implemented in this change, and this merely represents a stepping stone toward a more complete implementation.

Implementation
--------------

The API side of thing largely mirrors what was already done to support passing strings for the type names to use for global/entry-point generic arguments, so there should be no major surprises there.

The logic in `check.cpp` computes the list of existential slots when creating unspecialized `Program`s and `EntryPoint`s (this is logically the "front end" of the compiler), and then checks the supplied argument types against what is expected in each slot when creating specialized `Program`s and `EntryPoint`s. This again mirrors how generic arguments are handled.

Type layout was extended to compute the number of existential slots that a type consumes, and will thus automatically assign ranges of slots to top-level and entry-point shader parameters in the same way it already allocates `register`s and `binding`s. The big missing feature is the ability to specialize a layout to account for the concrete types plugged into the existential-type slots.

IR generation for specialized programs and entry points was slightly extended so that it attaches information about the concrete types plugged into the existential slots, and the witness tables that show how they conform to the interface for that slot. The linking step needed some small tweaks to make sure that information gets copied over to the target-specific program when we start code generation.

The meat of the IR-level work is in `ir-bind-existentials.cpp`, which takes the information that was placed in the IR module by the generation/linking steps and uses it to rewrite shader parameters. For example, if there is a shader parameter `p` of type `IModifier`, and the corresponding existential slot has the type `Doubler` in it, we will rewrite the parameter to have type `Doubler`, and rewrite any uses of `p` to instead use `makeExistential(p, /*witness that Doubler conforms to IModifier*/)`.

Once the replacement is done on the parameters, the existing work for specializing existential-based code when the input type(s) are known kicks in and does the rest.

Testing
-------

A single compute test is added to validate that this feature works. It is narrowly tailored to not require any of the features not supported by the initial implementation (e.g., all of the concrete types used have no members).

The test case *does* include use of an associated type through one of these existential-type parameters, which has exposed a subtle bug in how "opening" of existential values is implemented in the front-end. Rather than fix the underlying problem, I cleaned up the code in the front-end to special case when the existential value being opened is a variable bound with `let`, to directly use a reference to that variable rather than introduce a temporary. Similarly, in the IR generation step, I added an optimization to make variables declared with `let` skip introducing an IR-level variable and just use the SSA value of their initializer directly instead.

* fixup: missing files

* fixup: incorrect type for unreachable return

* fixup: actually comment ir-bind-existentials.cpp
</content>
</entry>
<entry>
<title>Split front- and back-ends (#846)</title>
<updated>2019-02-15T17:08:19+00:00</updated>
<author>
<name>Tim Foley</name>
<email>tfoleyNV@users.noreply.github.com</email>
</author>
<published>2019-02-15T17:08:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=a3fd4e2bc40cfc77db953b14744c30e7a18e7c1d'/>
<id>urn:sha1:a3fd4e2bc40cfc77db953b14744c30e7a18e7c1d</id>
<content type='text'>
* Split front- and back-ends

This change is a major refactor of several of the types that provide the behind-the-scenes implementation of the public C API.
The goal of this refactor is primarily to allow for future API services that let the user operate both the front- and back-ends of the compiler in a more complex fashion.
For example, as user should be able to compile a bunch of source code into modules, look up types, functions, etc. in those modules, specialize generic types/functions to the types they've looked up, and then finally request target code to be gernerated for specialized entry points.
The back-end code generation they trigger should re-use the front-end compilation work (parsing, semantic checking, IR generation) that was already performed.

The most visible change is that `CompileRequest` has been split up into several smaller types that take responsibility for parts of what it did:

* The `Linkage` type owns the storage for `import`ed modules, and well as the `TargetRequest`s that represent code-generation targets. The intention is that an application could use a single `Linkage` for the duration of its runtime (so long as it was okay with the memory usage), so that each `import`ed module only gets loaded once. For now, this type needs to manage the search paths, file system, and source manager, because of its responsibility for loading files.

* A `FrontEndCompileRequest` owns the stuff related to parsing, semantic checking, and initial IR generation. This most notably includes the `TranslationUnitRequest`s and the `FrontEndEntryPointRequest`s (which used to be just `EntryPointRequest`s). It's main job is to produce AST and IR modules for each translation unit, and to find and validate the entry points. The front-end request does *not* interact with generic arguments for global or entry-point generic parameters.

* The main output of both `import` operations and front-end translation units is the `Module` type, which is just a simple container for both the AST module (to service the reflection/layout APIs, and also for semantic checking of code that `import`s the module) and the IR module (for linking and code generation). This type captures the commonalities between the old `LoadedModule` (which is now just an alias for `Module`) and `TranslationUnitRequest` (which now owns a `Module`).

* The secondary output of front-end compilation is a `Program`, which comprises a list of referenced `Module`s and validated `EntryPoint`s that will be used together. Layout and code generation both need a `Program` to tell them what modules and entry points will be used together (we don't want to just code-gen everythin that has ever been loaded into the linakge). The `Program`s created by the front-end do not include generic arguments, so they may provide incomplete layout information and/or be unsuitable for code generation.

* A `BackEndCompileRequest` owns stuff related to turning a `Program` into output kernels for the targets of a `Linkage`. Most of the data it owns beyond the `Program` to be compiled is minor, so this is a good candidate for demotion from a heap-allocated object to just a `struct` of options that gets passed around.

* The `CompileRequestBase` type is an attempt to wrap up the common functionality of both front-end and back-end compile requests. Most of it is just exposing the availability of a linkage and `DiagnosticSink`, so this type is a good candidate for subsequent removal. The main interesting thing it has is the flags related to dumping and validation of IR, so there is probably a good refactoring still to be made around deciding how options should be handled going forward.

* Behind the scenes, the `Program` type is set up to handle some level of on-line compilation and layout work. The `Program` knows the `Linkage` it belongs to, and allows for a `TargetProgram` to be looked up based on a specific `TargetRequest`. A `TargetProgram` then allows layout information and compiled kernel code to be asked for on-demand, in order to support eventual "live" compilation scenarios.

* The `EndToEndCompileRequest` type is a composition/coordination type that replaces the old `CompileRequest` in a way that uses the services of the various other types. It owns a few pieces of state that only make sense in the context of an end-to-end compile (e.g., there is really no way to "pass through" code when the front- and back-ends are run separately) or a command-line compile (everything to do with specifying output paths for files is really just for the benefit of `slangc`, and might even be moved there over time).

* One important detail is that the `EndToEndCompilRequest` owns all of the string-based generic arguments for both global and entry-point generic parameters. The logic in `check.cpp` for dealing with those arguments has been heavily refactored to separate out the parsings steps that are specific to end-to-end compilation with string-based type arguments, and the semantic checking  steps that result in a specialized `Program` (which can be exposed through new APIs that aren't tied to end-to-end compilation).

It is perhaps not surprising that this change had a lot of consequences, so I'll briefly run over some of the main categories of changes required:

* I changed the way that global generic arguments are passed via API (use `spSetGlobalGenericArgs` instead of the generic arguments for `spAddEntryPointEx`, which are not just for entry-point generics), which has been a change that we've needed for a long time. This is technically a breaking API change, although we should have very few client applications that care about it.

* A bunch of places that used to take "big" objects like `CompileRequest` now just take the sub-pieces they care about (e.g., a function might have only needed a `Linkage` and a `DiagnosticSink`). This makes many subroutines or "context" struct types more generally useful, at the cost of taking more parameters.

* In a few cases the conceptually clean separation of the layers breaks down (often for edge-case or compatibility features), and so we may pass along additional objects that are allowed to be null, but are used when present. A big example of this is how the back-end code generation routines accept an `EndToEndCompileRequest` that is optional, and only used to check whether "pass through" compilation is needed. We should probably look into cleaning this kind of logic up over time so that we don't need to violate the apparent separation of phases of compilation.

* In cases where separation of layers was being broken for the sake of GLSL features, I went ahead and ripped them out, since all of that should be dead code anyway.

* In many cases I increased the encapsulation of data in the core types to help track down use sites and make sure they are following invariants better.

* In cases where code was doing, e.g., `context-&gt;shared-&gt;compileRequest-&gt;session-&gt;getThing()` I have tried to introduce convenience routines so that the usage site is just `context-&gt;getThing()` to improve encapsulation and allow changes to be made more easily going forward.

* The `noteInternalErrorLoc` functionality was moved off of the compile request and into `DiagnosticSink`, since that is the one type you can rely on having around when you want to note an internal error. We may consider going forward if (and how) it should reset the counter used for noting locations on internal errors.

* A few APIs now take `DiagnosticSink*` arguments where they didn't before, and as a result some public APIs need to create `DiagnosticSink`s to pass in, before going ahead and ignoring the messages. In the future there should be variations of these APIs that accept an `ISlangBlob**` parameter for the output.

* fixup: missing include for compilers with accurate template checking (non-VS)

* fixup: review feedback
</content>
</entry>
<entry>
<title>Initial support for uniform parameters on entry points (#815)</title>
<updated>2019-01-31T19:33:57+00:00</updated>
<author>
<name>Tim Foley</name>
<email>tfoleyNV@users.noreply.github.com</email>
</author>
<published>2019-01-31T19:33:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=f20c64c348393602ed2a9c873386345cc4b493e8'/>
<id>urn:sha1:f20c64c348393602ed2a9c873386345cc4b493e8</id>
<content type='text'>
* Initial support for uniform parameters on entry points

The basic feature this work adds is the ability to define a shader entry point like:

```hlsl
[shader("fragment")]
float4 main(
    uniform Texture2D    t,
    uniform SamplerState s,
            float2       uv : UV)
{
    return t.Sample(s,uv);
}
```

In this example, the `uniform` keyword is used to mark that the given entry point parameters are *not* varying input/output flowing through the pipeline, but rather uniform shader parameters that should function as if the shader was declared more like:

```hlsl
Texture2D    t,
SamplerState s,

[shader("fragment")]
float4 main(
    float2 uv : UV)
{
    return t.Sample(s,uv);
}
```

Allowing `uniform` parameters on entry points makes it easier to define multiple entry points in one file without accidentally polluting the global scope with shader parameters that only certain entry points care about.
This feature is also more or less a prerequisite for allowing generic type parameters directly on entry point functions, since the main use case for those type parameters is for determining what goes in various `ConstantBuffer`s or `ParameterBlock`s.

There are two main pieces to the implementation.
First, we need to be able to compute appropriate layout information for entry points that include `uniform` parameters.
Second, we need to transform the entry point function to move any `uniform` parameters to be ordinary global-scope shader parameters, to make sure that all other back-end passes don't need to worry about this special case.

The latter piece of the implementation is, relatively speaking, simpler.
The pass in `ir-entry-point-uniforms.{h,cpp}` converts entry point parameters that are determined to be uniform (using the already-computed layout information) into fields of a `struct` type and then declares a global shader parameter based on that `struct` type (and applies already-computed layout information to that parameter).
After that, the remaining IR passes (notably including type legalization) will handle things just as for any other global shader parameter.

The changes to the layout step are more significant, but most of the changes are just cleanups and fixes to enable the feature.
The two major changes that enable entry-point `uniform` parameters are:

* In `collectEntryPointParameters` we now dispatch out to a new `computeEntryPointParameterTypeLayout` function, which decided whether to compute the type layout for a `uniform` parameter, or for a varying parameter (what used to be the default behavior handled by `processEntryPointParameterDecl`).

* The main `generateParameterBindings` routine was extended so that it allocates registers/bindings to the resources required by each entry point (using `completeBindingsForParameter`) after it has allocated registers/binding to all of the global-scope parameters (this addition is mirrored in `specializeProgramLayout`).

The effect of these changes is that the `uniform` parameters of any entry points specified in a compile request will be laid out after the global-scope parameters, in the order the entry points were specified in the compile request.

A bunch of smaller changes were made around parameter layout that are worth enumerating so that the diffs make some sense:

* The `EntryPointLayout` type was changed so that instead of trying to *be* a `StructTypeLayout`, it instead *owns* one, in the same fashion as `ProgramLayout`. This commonality was factored into a base class `ScopeLayout`, and a bunch of edits followed from that change.

* Because `uniform` parameters are moved out of the entry point parameter list early in the IR transformations, the logic in `ir-glsl-legalize.cpp` that tried to look up parameter layout information by index would no longer work if the entry point parameter list had been altered. Instead, that logic now looks for the decorations directly on the parameters.

* The `UsedRange` type in `parameter-binding.cpp` was tracking the existing parameter associated with a range using a `ParameterInfo*` (which accounts for the possibility of multiple `VarDecl`s mapping to the same logical shader parameter), when just using a `VarLayout*` is sufficient for all current use cases. The overhead of allocating a `ParameterInfo` seems like overkill for entry-point parameters, where there can't possibly be multiple declarations of the "same" parameter, so avoiding these overheads was a focus when trying to deduplicate code between the global and entry-point parameter cases.

* A bunch of parameter binding logic that was specific to GLSL input has been deleted completely. There was no way to even execute this code in the compiler today, and there is pretty much zero chance of us needing (or wanting) to deal with GLSL input in the future. This includes custom `UsedRangeSet`s specific to each translation unit, which were only needed for global-scope `in` and `out` varying declarations in GLSL.

* A bunch of functions with `EntryPointParameter` in their names were renamed to use `EntryPointVaryingParameter` to help distinguish that they only apply to the varying case, while entry point `uniform` parameters are handled elsewhere.

* The `completeBindingsForParameter` function was re-worked into something that can be used for both global-scope shader parameters (where we have a `ParameterInfo` and possibly explicit bindings) and entry-point parameters (where we expect to have neither). This helps unify the (fairly subtle) logic for how we allocate and assign bindings for resources, constant buffers, parameter blocks, etc.

* A small change was made so that the entry-point stage is attached directly to top-level parameters of the entry point, and *not* recursively to every field along the way. This could be a breaking change for some applications, but it makes more logical sense (to me); we'll have to check if this affects Falcor. This change produces different output for several of the reflection tests, but the changes are consistent with no longer attaching stage information to sub-fields of varying `struct`-type parameters.

* Because there is a bunch of repeated logic in `parameter-binding.cpp` that has to do with computing a `struct` layout for ordinary/uniform data, I tried to factor that into a single `ScopeLayoutBuilder` type, which handles computing the offsets for any parameters with ordinary data, and then also handles wrapping up the layout in a constant buffer layout if there was any ordinary data at the end.

* A similar convenience routine `maybeAllocateConstantBufferBinding` was added because I noticed multiple places in `parameter-binding.cpp` that were trying to allocate a constant buffer binding for global uniforms, and they were wildly inconsistent (and in most cases used logic that would only work for D3D).

* The main `generateParameterBindings` routine is significantly shortened by using all of these utilities that were introduced. I tried to comment the places that changed to explain the overall flow correctly.

* The `specializeProgramLayout` routine (used to take a `ProgramLayout` from `generateParameterBindings` and specialize it based on knowledge of global generic arguments) had basically been rewritten with more explicit commenting/rationale for what happens in each step. It makes use of the same shared utilities as `generateParameterBindings` and `collectEntryPointParameters`.

In terms of testing:

* I added a test case to specifically test the new behavior, and in particular I made sure to include a mix of both global and entry-point parameters and also to have entry-point parameters of both ordinary and resource/object types.

* I tweaked an existing test for global type parameters to use an entry-point `uniform` parameter instead of a global one, in an effort to migrate it toward being able to use an explicitly generic entry point.

* fixups from merge
</content>
</entry>
</feed>
