<feed xmlns='http://www.w3.org/2005/Atom'>
<title>slang.git/source/slang/legalize-types.cpp, branch master</title>
<subtitle>Making it easier to work with shaders</subtitle>
<id>https://git.yummers.dev/slang.git/atom?h=master</id>
<link rel='self' href='https://git.yummers.dev/slang.git/atom?h=master'/>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/'/>
<updated>2019-05-31T21:20:37+00:00</updated>
<entry>
<title>Use slang- prefix on slang compiler and core source (#973)</title>
<updated>2019-05-31T21:20:37+00:00</updated>
<author>
<name>jsmall-nvidia</name>
<email>jsmall@nvidia.com</email>
</author>
<published>2019-05-31T21:20:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=6cbc3929a54d37bd23cb5efa8e3320ba02f78b2f'/>
<id>urn:sha1:6cbc3929a54d37bd23cb5efa8e3320ba02f78b2f</id>
<content type='text'>
* Prefixing source files in source/slang with slang-

* Prefix source in source/slang with slang- prefix.

* Rename core source files with slang- prefix.

* Update project files.

* Fix problems from automatic merge.
</content>
</entry>
<entry>
<title>String/List closer to conventions, and use Index type (#959)</title>
<updated>2019-04-29T21:03:46+00:00</updated>
<author>
<name>jsmall-nvidia</name>
<email>jsmall@nvidia.com</email>
</author>
<published>2019-04-29T21:03:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=4880789e3003441732cca4471091563f36531635'/>
<id>urn:sha1:4880789e3003441732cca4471091563f36531635</id>
<content type='text'>
* List made members m_
Tweaked types to closer match conventions.

* Use asserts for checking conditions on List.
Other small improvements.

* List&lt;T&gt;.Count() -&gt; getSize()

* List&lt;T&gt;
Add -&gt; add
First -&gt; getFirst
Last -&gt; getLast
RemoveLast -&gt; removeLast
ReleaseBuffer -&gt; detachBuffer
GetArrayView -&gt; getArrayView

* List&lt;T&gt;::
AddRange -&gt; addRange
Capacity -&gt; getCapacity
Insert -&gt; insert
InsertRange -&gt; insertRange
AddRange -&gt; addRange
RemoveRange -&gt; removeRange
RemoveAt -&gt; removeAt
Remove -&gt; remove
Reverse -&gt; reverse
FastRemove -&gt; fastRemove
FastRemoveAt -&gt; fastRemoveAt
Clear -&gt; clear

* List&lt;T&gt;
FreeBuffer -&gt; _deallocateBuffer
Free -&gt; clearAndDeallocate
SwapWith -&gt; swapWith

* List&lt;T&gt;
SetSize -&gt; setSize
Reserve -&gt; reserve
GrowToSize growToSize

* UnsafeShrinkToSize -&gt; unsafeShrinkToSize
Compress -&gt; compress
FindLast -&gt; findLastIndex
FindLast -&gt; findLastIndex
Simplify Contains

* List&lt;T&gt;
Removed m_allocator (wasn't used)
Swap -&gt; swapElements
Sort -&gt; sort
Contains -&gt; contains
ForEach -&gt; forEach
QuickSort -&gt; quickSort
InsertionSort -&gt; insertionSort
BinarySearch -&gt; binarySearch

Max -&gt; calcMax
Min -&gt; calcMin

* Initializer::Initialize -&gt; initialize
List&lt;T&gt;::
Allocate -&gt; _allocate
Init -&gt; _init
IndexOf -&gt; indexOf

* * Put #include &lt;assert.h&gt; in common.h, and remove unneeded inclusions
* Small refactor of ArrayView - remove stride as not used

* getSize -&gt; getCount
setSize -&gt; setCount
unsafeShrinkToSize-&gt;unsafeShrinkToCount
growToSize -&gt; growToCount
m_size -&gt; m_count

* Some tidy up around Allocator.

* Use Index type on List.

* Refactor of IntSet.
First tentative look at using Index.

* Made Index an Int
Did preliminary fixes.
Made String use Index.

* Partial refactor of String.

* String::Buffer -&gt; getBuffer
ToWString -&gt; toWString

* Small improvements to String.
String::
Buffer() -&gt; getBuffer()
Equals() -&gt; equals

* Try to use Index where appropriate.

* Fix warnings on windows x86 builds.
</content>
</entry>
<entry>
<title>Allow plugging in types with resources for interface parameters (#913)</title>
<updated>2019-03-26T19:49:02+00:00</updated>
<author>
<name>Tim Foley</name>
<email>tfoleyNV@users.noreply.github.com</email>
</author>
<published>2019-03-26T19:49:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=88859d413e53e4228ae3b832d8bbd2711ccce84a'/>
<id>urn:sha1:88859d413e53e4228ae3b832d8bbd2711ccce84a</id>
<content type='text'>
* Allow plugging in types with resources for interface parameters

The key feature enabled by this change is that you can take a shader declared with interface-type parameters:

```hlsl
ConstantBuffer&lt;ILight&gt; gLight;
float4 myShader(IMaterial material, ...)
{ ... }
```

and specialize its interface-type parameters to concrete type that can contain resources like textures, samplers, etc.

The hard part of doing this layout is that we need to support signatures that include a mix of interface and non-interface types. Imagine this contrived example:

```hlsl
float4 myShader(
    Texture2D diffuseMap,
    ILight        light,
    Texture2D specularMap)
{ ... }
```

We end up wanting `diffuseMap` to get `register(t0)` and `specularMap` to get `register(t1)`, so that they have the same location no matter what we plug in for `light`.
But if we plug in a concrete type for `light` that needs a texture register, we need to allocate it *somewhere*.
We handle this by having the `TypeLayout` for `light` come back with a "primary" type layout that doesn't have any texture registers, but with a "pending" type layout that includes the texture register requirements of whatever concrete type we plug in.

This split between "primary" and "pending" layout then needs to work its way up the hierarchy, so that an aggregate `struct` type with a mix of interface and non-interface fields (recursively), needs to compute an aggregate "primary type layout" and an aggregate "pending type layout," and then each field needs to be able to compute its offset in the primary/pending layout of the aggregate.

A large chunk of the work in this PR is then just implementing the split between primary and pending data, and ensuring that layouts are computed appropriately.

The next catch is that when a "parameter group" (either a parameter block or constant buffer) contains one or more values of interface type, then we can allow the parameter group to "mask" some of the resource usage of the concrete types we plug in, but others "bleed through."

For example, if we have:

```hlsl
struct MyStuff { float3 color; ILight light; }
ConstantBuffer&lt;MyStuff&gt; myStuff;

struct SpotLight { float3 position; Texture2D shadowMap; }
``

If we plug in the `SpotLight` type for `myStuff.light`, then the `float3` data for the light can be "masked" by the fact that we have a constant buffer (we can just allocate the `float3` `position` right after `color`), but the `Texture2D` needed for `shadowMap` needs to "bleed through" and become "pending" data for the `myStuff` shader parameter.

Adding support for that detail more or less required a full rewrite of the logic for allocating parameter group type layouts.

The next detail is that when we go to legalize a declaration like the `myStuff` buffer, we will end up with something like:

```hlsl
struct MyStuff_stripped { float3 color; }
struct Wrapped
{
    MyStuff_stripped primary;
    SpotLight pending;
}
ConstantBuffer&lt;Wrapped&gt; myStuff;
```

This "wrapped" version of the buffer type more accurately reflects the layout we need/want for the uniform/ordinary data, but in order to further legalize it and pull out the resource-type fields like `shadowMap` we need to have accurate layout information, and the problem is that layout information for the original buffer can't apply to this new "wrapped" buffer.

The last major piece of this change is logic that runs during existential type legalization to compute new layouts for "wrapped" buffers like these that embeds correct offset/binding/register information for any resources nested inside them. A key challenge in that code is that existential legalization needs to erase any "pending" data from the program entirely, so that offset information that used to be relatie to the "pending" part of a surrounding type now needs to be relative to the primary part.

The work here may not be 100% complete for all scenarios, but it does well enough on the new and existing tests that I want to checkpoint it. Note that a few other tests have had their output changed, but in all cases I've reviewed the diffs and determined that the change in observable behavior is consistent with what we intened Slang's behavior to be.

Note that there is still one major piece of support for interface-type parameters that is missing here, and which might force us to revisit some of the decisions in this code: we don't properly support user-defined `struct` types with interface-type fields.

* fixup: typos
</content>
</entry>
<entry>
<title>Improve support for interfaces as shader parameters (#886)</title>
<updated>2019-03-09T00:24:02+00:00</updated>
<author>
<name>Tim Foley</name>
<email>tfoleyNV@users.noreply.github.com</email>
</author>
<published>2019-03-09T00:24:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=4f94dd46a2d885e570814dd14a5e46f8e0814802'/>
<id>urn:sha1:4f94dd46a2d885e570814dd14a5e46f8e0814802</id>
<content type='text'>
* Improve support for interfaces as shader parameters

This change adds two main things over the existing support:

1. It is now possible to plug in concrete types that actually contain (uniform/ordinary) fields for the existential type parameters introduced by interface-type shader parameters. The `interface-shader-param2.slang` test shows that this works.

2. There is a limited amount of support for doing correct layout computation and generating output code that matches that layout, so that interface and ordinary-type fields can be interleaved to a limited extent. The `interface-shader-param3.slang` test confirms this behavior.

There are several moving pieces in the change.

* When it comes to terminology, we try to draw a more clear distinction between existial type parameters/arguments and existential/object value parametes/arguments. A simple way to look at it is that an `IFoo[3]` shader parameter introduces a single existential type parameter (so that a concrete type argument like `SomeThing` can be plugged in for the `IFoo`) but introduces three existential object/value parameters (to represent the concrete values for the array elements).

* At the IR level, we support a few new operations. A `BindExistentialsType` can take a type that is not itself an interface/existential type but which depends on interfaces/existentials (e.g., `ConstantBuffer&lt;IFoo&gt;`) and plug in the concrete types to be used for its existential type slots.

* Then a `wrapExistentials` instruction can take a type with all the existentials plugged in (possibly by `BindExistentialsType`) and wrap it into a value of the existential-using type (e.g., turn `ConstantBuffer&lt;SomeThing&gt;` into a `ConstantBuffer&lt;IFoo&gt;`).

* The IR passes for doing generic/existential specialization have been updated to be able to desugar uses of these new operations just enough so that a `ConstantBuffer&lt;IFoo&gt;` can be used.

* When we specialize an IR parameter of an interface type like `IFoo` based on a concrete type `SomeThing`, we turn the parameter into an `ExistentialBox&lt;SomeThing&gt;` to reflect the fact that we are conceptually referring to `SomeThing` indirectly (it shouldn't be factored into the layout of its surrounding type).

* Parameter binding was updated so that it passes along the bound existential type arguments in a `Program` or `EntryPoint` to type layout, so that we can take them into account. The type layout code needs to do a little work to pass the appropriate range of arguments along to sub-fields when computing layout for aggregate types.

* Type layout was updated to have a notion of "pending" items, which represent the concrete types of data that are logically being referenced by existential value slots. The basic idea is that these values aren't included in the layout of a type by default, but then they get "flushed" to come after all the non-existential-related data in a constant buffer, parameter block, etc.

* The logic for computing a parameter group (`ConstantBuffer` or `ParameterBlock`) layout was updated to always "flush" the pending items on the element type of the group, so that the resource usage of specialized existential slots would be taken into account.

* The type legalization pass has been adapted so that we can derive two different passes from it. One does resource-type legalization (which is all that the original pass did). The new pass uses the same basic machinery to legalize `ExistentialBox&lt;T&gt;` types by moving them out of their containing type(s), and then turning them into ordinary variables/parameters of type `T`.

Big things missing from this change include:

- Nothing is making sure that "pending" items at the global or entry-point level will get proper registers/bindings allocated to them. For the uniform case, all that matters in the current compiler is that we declare them in the right order in the output HLSL/GLSL, but for resources to be supported we will need to compute this layout information and start associating it with the existential/interface-type fields.

- Nothing is being done to support `BindExistentials&lt;S, ...&gt;` where `S` is a `struct` type that might have existential-type fields (or nested fields...). Eventually we need to desugar a type like this into a fresh `struct` type that has the same field keys as `S`, but with fields replaced by suitable `BindExistentials` as needed. (The hard part of this would seem to be computing which slots go to which fields). As a practial matter, this missing feature means that interface-type members of `cbuffer` declarations won't work.

The current tests carefully avoid both of these problems. They don't declare any buffer/texture fields in the concrete types, and they don't make use of `cbuffer` declarations or `ConstantBuffer`s over structure types with interface-type fields.

* fixup: add override to methods

* fixup: typos
</content>
</entry>
<entry>
<title>Feature/casting tidyup (#822)</title>
<updated>2019-02-04T17:11:18+00:00</updated>
<author>
<name>jsmall-nvidia</name>
<email>jsmall@nvidia.com</email>
</author>
<published>2019-02-04T17:11:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=0d206996cd68b9f08ae1b4d9da6f16293984302c'/>
<id>urn:sha1:0d206996cd68b9f08ae1b4d9da6f16293984302c</id>
<content type='text'>
* Use 'is' over 'as' where appropriate.

* dynamic_cast -&gt; dynamicCast

* Replace 'dynamicCast' with 'as' where has no change in behavior/ambiguity.

* Replace dynamicCast with as where doesn't change behavior/non ambiguous.
</content>
</entry>
<entry>
<title>Fix type legalization to correctly handle empty struct fields.</title>
<updated>2019-01-28T18:50:05+00:00</updated>
<author>
<name>Yong He</name>
<email>yonghe@outlook.com</email>
</author>
<published>2019-01-27T08:08:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=9812963c52eb6312546f600804bb642a38dcfe64'/>
<id>urn:sha1:9812963c52eb6312546f600804bb642a38dcfe64</id>
<content type='text'>
</content>
</entry>
<entry>
<title>Move mangled name out of IRGlobalValue (#752)</title>
<updated>2018-12-13T20:13:58+00:00</updated>
<author>
<name>Tim Foley</name>
<email>tfoleyNV@users.noreply.github.com</email>
</author>
<published>2018-12-13T20:13:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=822ed708364b257b7d2f61ecb8a51a4c96f7edaa'/>
<id>urn:sha1:822ed708364b257b7d2f61ecb8a51a4c96f7edaa</id>
<content type='text'>
* Move mangled name out of IRGlobalValue

Previously the `IRGlobalValue` type was used as a root for all IR instructions that can have "linkage," in the sense that a definition in one module can satisfy a use in another module.
The mangled symbol name was stored in state directly on each `IRGlobalValue`, which created some complications, and also forced IR instructions that wanted to support linkage to wedge into the hierarchy at that specific point.

This change moves the mangled name out into a decoration: either an `IRImportDecoration` or an `IRExportDecoration`, both of which inherit from `IRLinkageDecoration` which exposes the mangled name.
This change has a few benefits:

* We can now have any kind of instruction be exported/imported, without having to inherit from `IRGlobalValue`. This could potentially let `IRStructType` and `IRWitnessTable` be simplified to just have operand lists instead of dummy chldren as they do today.

* We can now easily have "global values" like functions that explicitly *don't* get linkage, instead of using a null or empty mangled name as a marker.

* We can use the exact opcode on a linkage decoration to distinguish imports from exports, which could be used to more accurately resolve symbols during the linking step.

Other than adding the decorations and making sure that AST-&gt;IR lowering adds them, the main changes here are around any code that used `IRGlobalValue`. Variables and parameters of type `IRGlobalValue*` were changed to `IRInst*` easily, so the main challenge was around code that *casts* to `IRGlobalValue*.

In cases where a cast to `IRGlobalValue` also performed a test for the mangled name being non-null/non-empty, we simply switched the code to check for the presence of an `IRLinkageDecoration`, since that is the new way of indicating a value with linakge.

Most of the serious complications arose in `ir.cpp` around the "linking"/target-specialization and generic specialization steps.

The "linking" logic was checking for `IRGlobalValue` to opt into some more complicated cloning logic, and just checking for a linkage decoration here wasn't sufficient since the front-end *does* produce global values without linkage in some cases (e.g., for a function-`static` variable we produce a global variable without linkage). This logic was updated to just check for the cases that used to amount to `IRGlobalValue`s directly by opcode. It might be simpler in the short term to have kept `IRGlobalValue` around to make the existing casts Just Work, but I'm confident that this logic could actually be rewritten for much greater clarity and simplicity and that is the better way forward.

The generic specialization logic was using some really messy code to generate a new mangled name to represent the specialized symbol, and then searching for an existing match for that name.
The original idea there was that an IR module could include "pre-specialized" versions of certain generics to speed up back-end compilation by eliminating the need to specialize in some cases, but this feature has never been implemented so the overhead here is just a waste.
Instead, I moved generic specialization to use a simpler dictionary to map the operands to a `specialize` instruction over to the resulting specialized value.
This allows for some simplifications in the name mangling logic, because it no longer needs to figure out how to produce mangled names from IR instructions representing types/values.

As part of this change I also overhauled the IR emit logic to produce cleaner output by default, borrowing some of the ideas from the logic in `emit.cpp`. IR values are now automatically given names based on their "name hint" decoration, if any, to make the code easier to follow, and I also made it so that types and literals get collapsed into their use sites in a new "simplified" IR dump mode (which is currently the default, with no way to opt into the other mode without tweaking the code). The resulting IR dumps are much nicer to look at, but as a result the one test that involves IR dumping (`ir/string-literal`) doesn't really test what it used to.

One weird issue that came up during testing is that the `transitive-interface` test had previously been producing output that made no sense (that is, the expected output file wasn't really sensible), and somehow these changes were altering its behavior. Changing the test to use `int` values instead of `float` was enough to make the output be what I'd expect, and hand inspection of generating DXBC has me convinced we were compiling the `float` case correctly too. There appears to be some issue around tests with floating-point outputs that we should investigate.

* fixup: C++ declaration order
</content>
</entry>
<entry>
<title>Decorations are instructions (#748)</title>
<updated>2018-12-11T23:17:55+00:00</updated>
<author>
<name>Tim Foley</name>
<email>tfoleyNV@users.noreply.github.com</email>
</author>
<published>2018-12-11T23:17:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=62d3e387774255be4d507cca045ac97dabac9970'/>
<id>urn:sha1:62d3e387774255be4d507cca045ac97dabac9970</id>
<content type='text'>
* Make a test case use IR serialization

* Make all IR instructions usable as parents

This makes it so that every `IRInst` has the list of children that used to be on `IRParentInst` and eliminates `IRParentInst`.
Most places in the code were only checking against `IRParentInst` so that they could know whether there were child instructions to iterate over.

This change bloats the size of every instruction by two pointers, but we hope to be able to eliminate that overhead with a better encoding later.

* Change IR decorations to be instructions.

The main change here is that `IRDecoration` now inherits from `IRInst`, and `IRInst` now has a single linked list that holds both decorations *and* children.
At each point where code used to loop over `getChildren()` on an `IRInst`, I checked whether it made sense to leave the operation as processing just the children, or if it should process both decorations and children.

The thorniest bit was making sure the logic for inserting an instruction into a parent is correct. For the most part, once IR code is built all insertions are explicitly before/after another instruction, so the ordering can't get messed up. The sticking point is any code that does an explicit `insertAtStart` or `insertAtEnd`, but I surveyed those to make sure they are correct in context, and I also made all insertions bottleneck through one routine that does a better job of asserting the preconditions than what was there before. We may still want a "smart" insertion function at some point so that if somebody does `someDecoration-&gt;insertAtEnd(someInst)` the decoration intelligently goes to the end of the decoration list, and not the entire decorations-and-children list.

All of the existing decoration types were refactored to provide accessors for their operands, rather than directly exposing fields. In most cases the operands are required to be `IRConstant` nodes of fixed types. Not all of these types need to be kept around in the new approach, but they were left in so that as much existing code as possible can be kept working.
The `IRBuilder` was extended with factory functions to make the various decoration types and attach them.

All the fields in concrete decorations that were using `StringRepresentation` or `Name` pointers are now using IR-level string operands which provide their value as an `UnownedStringSlice`, so logic that was working with those decoration values needed to be updated here and there. I also needed to add the logic to clone string-literal values to the IR cloning pass, since they are now being used in almost every piece of code.

A new type of constant IR instruction for literal pointers was added, to handle the cases where an IR decoration needs an operand that is a raw AST-level pointer. These are even being serialized, although we obviously should not rely on them to round-trip through serialization in the future. Ideally, a follow-on change should add a cleanup pass where we remove any decorations from a module that shouldn't be allowed in the serialized code.

The biggest overall cleanup is in the serialization logic, where a lot of code just disappears because it can process the raw "decorations and children" list as the logical children of an IR instruction. The only special cases left are literals (which seem like they will always need special-casing) and global values (because they have a mangled name, which we plan to move into a decoration).

One other example of a simplification made possible by this change: the `IRNotePatchConstantFunc` instruction was implemented as an instruction only because it couldn't be encoded as a decoration at the time (it needed to have an operand that referenced an IR function).

The IR dumping logic was also updated (which meant a change to the `ir/string-literal` test) to try to make it print out all decorations a bit more systematically now that they are encoded like other instructions. The formatting isn't quite perfect, but it is good enough to be able to read what is going on.

I didn't include updates to the validation logic to ensure that decorations are being added in ways that follow the invariants, but that would be a nice thing to add next.

* fixup: 64-bit issues

* fixup: forward declaration issues
</content>
</entry>
<entry>
<title>Fixes related to handling of empty types (#600)</title>
<updated>2018-06-13T18:18:44+00:00</updated>
<author>
<name>Tim Foley</name>
<email>tfoleyNV@users.noreply.github.com</email>
</author>
<published>2018-06-13T18:18:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=860b0d604377d3635f6fa994993371399327129f'/>
<id>urn:sha1:860b0d604377d3635f6fa994993371399327129f</id>
<content type='text'>
PR #577 tries to eliminate empty `struct` types by replacing them with a `LegalType::tuple` with zero elements, but this seems to run into problems in some cases, where we end up trying to match up `::none` values with empty `::tuple`s.

An alternative way to handle this is to never create empty `LegalType::tuple`s (and the same for `LegalVal::tuple`), and instead create `LegalType::none` and `LegalVal::none`. PR #577 avoided this because there were various cases in the legalization logic that didn't robustly handle `LegalType::Flavor::none`.

This PR thus includes two main changes:

1. Construct a `::none` type when we have an empty `struct` type.

2. Survery all places that handle the `::tuple` case and extend them to handle the `::none` case if it was missing.

This fixes an issue filed in Falcor's internal GitLab as number 424.</content>
</entry>
<entry>
<title>Fixes 574. Eliminate empty structs during type legalization (#577)</title>
<updated>2018-05-25T14:01:34+00:00</updated>
<author>
<name>Yong He</name>
<email>yonghe@outlook.com</email>
</author>
<published>2018-05-25T14:01:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=ace9a8dc7e4353b1cf8e846abe2b8dc53ecdbc59'/>
<id>urn:sha1:ace9a8dc7e4353b1cf8e846abe2b8dc53ecdbc59</id>
<content type='text'>
</content>
</entry>
</feed>
