<feed xmlns='http://www.w3.org/2005/Atom'>
<title>slang.git/source/slang/ir-validate.cpp, branch master</title>
<subtitle>Making it easier to work with shaders</subtitle>
<id>https://git.yummers.dev/slang.git/atom?h=master</id>
<link rel='self' href='https://git.yummers.dev/slang.git/atom?h=master'/>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/'/>
<updated>2019-05-31T21:20:37+00:00</updated>
<entry>
<title>Use slang- prefix on slang compiler and core source (#973)</title>
<updated>2019-05-31T21:20:37+00:00</updated>
<author>
<name>jsmall-nvidia</name>
<email>jsmall@nvidia.com</email>
</author>
<published>2019-05-31T21:20:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=6cbc3929a54d37bd23cb5efa8e3320ba02f78b2f'/>
<id>urn:sha1:6cbc3929a54d37bd23cb5efa8e3320ba02f78b2f</id>
<content type='text'>
* Prefixing source files in source/slang with slang-

* Prefix source in source/slang with slang- prefix.

* Rename core source files with slang- prefix.

* Update project files.

* Fix problems from automatic merge.
</content>
</entry>
<entry>
<title>Split front- and back-ends (#846)</title>
<updated>2019-02-15T17:08:19+00:00</updated>
<author>
<name>Tim Foley</name>
<email>tfoleyNV@users.noreply.github.com</email>
</author>
<published>2019-02-15T17:08:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=a3fd4e2bc40cfc77db953b14744c30e7a18e7c1d'/>
<id>urn:sha1:a3fd4e2bc40cfc77db953b14744c30e7a18e7c1d</id>
<content type='text'>
* Split front- and back-ends

This change is a major refactor of several of the types that provide the behind-the-scenes implementation of the public C API.
The goal of this refactor is primarily to allow for future API services that let the user operate both the front- and back-ends of the compiler in a more complex fashion.
For example, a user should be able to compile a bunch of source code into modules, look up types, functions, etc. in those modules, specialize generic types/functions to the types they've looked up, and then finally request target code to be generated for specialized entry points.
The back-end code generation they trigger should re-use the front-end compilation work (parsing, semantic checking, IR generation) that was already performed.

The most visible change is that `CompileRequest` has been split up into several smaller types that take responsibility for parts of what it did:

* The `Linkage` type owns the storage for `import`ed modules, as well as the `TargetRequest`s that represent code-generation targets. The intention is that an application could use a single `Linkage` for the duration of its runtime (so long as it was okay with the memory usage), so that each `import`ed module only gets loaded once. For now, this type needs to manage the search paths, file system, and source manager, because of its responsibility for loading files.

* A `FrontEndCompileRequest` owns the stuff related to parsing, semantic checking, and initial IR generation. This most notably includes the `TranslationUnitRequest`s and the `FrontEndEntryPointRequest`s (which used to be just `EntryPointRequest`s). Its main job is to produce AST and IR modules for each translation unit, and to find and validate the entry points. The front-end request does *not* interact with generic arguments for global or entry-point generic parameters.

* The main output of both `import` operations and front-end translation units is the `Module` type, which is just a simple container for both the AST module (to service the reflection/layout APIs, and also for semantic checking of code that `import`s the module) and the IR module (for linking and code generation). This type captures the commonalities between the old `LoadedModule` (which is now just an alias for `Module`) and `TranslationUnitRequest` (which now owns a `Module`).

* The secondary output of front-end compilation is a `Program`, which comprises a list of referenced `Module`s and validated `EntryPoint`s that will be used together. Layout and code generation both need a `Program` to tell them what modules and entry points will be used together (we don't want to just code-gen everything that has ever been loaded into the linkage). The `Program`s created by the front-end do not include generic arguments, so they may provide incomplete layout information and/or be unsuitable for code generation.

* A `BackEndCompileRequest` owns stuff related to turning a `Program` into output kernels for the targets of a `Linkage`. Most of the data it owns beyond the `Program` to be compiled is minor, so this is a good candidate for demotion from a heap-allocated object to just a `struct` of options that gets passed around.

* The `CompileRequestBase` type is an attempt to wrap up the common functionality of both front-end and back-end compile requests. Most of it is just exposing the availability of a linkage and `DiagnosticSink`, so this type is a good candidate for subsequent removal. The main interesting thing it has is the flags related to dumping and validation of IR, so there is probably a good refactoring still to be made around deciding how options should be handled going forward.

* Behind the scenes, the `Program` type is set up to handle some level of on-line compilation and layout work. The `Program` knows the `Linkage` it belongs to, and allows for a `TargetProgram` to be looked up based on a specific `TargetRequest`. A `TargetProgram` then allows layout information and compiled kernel code to be asked for on-demand, in order to support eventual "live" compilation scenarios.

* The `EndToEndCompileRequest` type is a composition/coordination type that replaces the old `CompileRequest` in a way that uses the services of the various other types. It owns a few pieces of state that only make sense in the context of an end-to-end compile (e.g., there is really no way to "pass through" code when the front- and back-ends are run separately) or a command-line compile (everything to do with specifying output paths for files is really just for the benefit of `slangc`, and might even be moved there over time).

* One important detail is that the `EndToEndCompileRequest` owns all of the string-based generic arguments for both global and entry-point generic parameters. The logic in `check.cpp` for dealing with those arguments has been heavily refactored to separate out the parsing steps that are specific to end-to-end compilation with string-based type arguments, and the semantic checking steps that result in a specialized `Program` (which can be exposed through new APIs that aren't tied to end-to-end compilation).

It is perhaps not surprising that this change had a lot of consequences, so I'll briefly run over some of the main categories of changes required:

* I changed the way that global generic arguments are passed via API (use `spSetGlobalGenericArgs` instead of the generic arguments for `spAddEntryPointEx`, which are now just for entry-point generics), which has been a change that we've needed for a long time. This is technically a breaking API change, although we should have very few client applications that care about it.

* A bunch of places that used to take "big" objects like `CompileRequest` now just take the sub-pieces they care about (e.g., a function might have only needed a `Linkage` and a `DiagnosticSink`). This makes many subroutines or "context" struct types more generally useful, at the cost of taking more parameters.

* In a few cases the conceptually clean separation of the layers breaks down (often for edge-case or compatibility features), and so we may pass along additional objects that are allowed to be null, but are used when present. A big example of this is how the back-end code generation routines accept an `EndToEndCompileRequest` that is optional, and only used to check whether "pass through" compilation is needed. We should probably look into cleaning this kind of logic up over time so that we don't need to violate the apparent separation of phases of compilation.

* In cases where separation of layers was being broken for the sake of GLSL features, I went ahead and ripped them out, since all of that should be dead code anyway.

* In many cases I increased the encapsulation of data in the core types to help track down use sites and make sure they are following invariants better.

* In cases where code was doing, e.g., `context-&gt;shared-&gt;compileRequest-&gt;session-&gt;getThing()` I have tried to introduce convenience routines so that the usage site is just `context-&gt;getThing()` to improve encapsulation and allow changes to be made more easily going forward.

* The `noteInternalErrorLoc` functionality was moved off of the compile request and into `DiagnosticSink`, since that is the one type you can rely on having around when you want to note an internal error. Going forward, we may want to consider whether (and how) it should reset the counter used for noting locations on internal errors.

* A few APIs now take `DiagnosticSink*` arguments where they didn't before, and as a result some public APIs need to create `DiagnosticSink`s to pass in, before going ahead and ignoring the messages. In the future there should be variations of these APIs that accept an `ISlangBlob**` parameter for the output.

* fixup: missing include for compilers with accurate template checking (non-VS)

* fixup: review feedback
</content>
</entry>
<entry>
<title>Decorations are instructions (#748)</title>
<updated>2018-12-11T23:17:55+00:00</updated>
<author>
<name>Tim Foley</name>
<email>tfoleyNV@users.noreply.github.com</email>
</author>
<published>2018-12-11T23:17:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=62d3e387774255be4d507cca045ac97dabac9970'/>
<id>urn:sha1:62d3e387774255be4d507cca045ac97dabac9970</id>
<content type='text'>
* Make a test case use IR serialization

* Make all IR instructions usable as parents

This makes it so that every `IRInst` has the list of children that used to be on `IRParentInst` and eliminates `IRParentInst`.
Most places in the code were only checking against `IRParentInst` so that they could know whether there were child instructions to iterate over.

This change bloats the size of every instruction by two pointers, but we hope to be able to eliminate that overhead with a better encoding later.

* Change IR decorations to be instructions.

The main change here is that `IRDecoration` now inherits from `IRInst`, and `IRInst` now has a single linked list that holds both decorations *and* children.
At each point where code used to loop over `getChildren()` on an `IRInst`, I checked whether it made sense to leave the operation as processing just the children, or if it should process both decorations and children.

The thorniest bit was making sure the logic for inserting an instruction into a parent is correct. For the most part, once IR code is built all insertions are explicitly before/after another instruction, so the ordering can't get messed up. The sticking point is any code that does an explicit `insertAtStart` or `insertAtEnd`, but I surveyed those to make sure they are correct in context, and I also made all insertions bottleneck through one routine that does a better job of asserting the preconditions than what was there before. We may still want a "smart" insertion function at some point so that if somebody does `someDecoration-&gt;insertAtEnd(someInst)` the decoration intelligently goes to the end of the decoration list, and not the entire decorations-and-children list.
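
As a rough illustration of the "smart" insertion idea (a Python sketch with made-up names, not the actual Slang classes, and assuming decorations always precede ordinary children in the single list):

```python
# Hypothetical model: one list per instruction holding decorations, then children.
class Inst:
    def __init__(self, op, is_decoration=False):
        self.op = op
        self.is_decoration = is_decoration
        self.items = []  # single list: decorations first, then children

    def decorations(self):
        return [item for item in self.items if item.is_decoration]

    def children(self):
        return [item for item in self.items if not item.is_decoration]

    def add(self, inst):
        if inst.is_decoration:
            # "Smart" insertion: end of the decoration run, not end of the whole list.
            self.items.insert(len(self.decorations()), inst)
        else:
            self.items.append(inst)

func = Inst("func")
func.add(Inst("block"))
func.add(Inst("nameHint", is_decoration=True))
ops = [item.op for item in func.items]
```

Here the decoration ends up ahead of the block even though it was added second, so code that walks only `children()` keeps seeing the same order.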

All of the existing decoration types were refactored to provide accessors for their operands, rather than directly exposing fields. In most cases the operands are required to be `IRConstant` nodes of fixed types. Not all of these types need to be kept around in the new approach, but they were left in so that as much existing code as possible can be kept working.
The `IRBuilder` was extended with factory functions to make the various decoration types and attach them.

All the fields in concrete decorations that were using `StringRepresentation` or `Name` pointers are now using IR-level string operands which provide their value as an `UnownedStringSlice`, so logic that was working with those decoration values needed to be updated here and there. I also needed to add the logic to clone string-literal values to the IR cloning pass, since they are now being used in almost every piece of code.

A new type of constant IR instruction for literal pointers was added, to handle the cases where an IR decoration needs an operand that is a raw AST-level pointer. These are even being serialized, although we obviously should not rely on them to round-trip through serialization in the future. Ideally, a follow-on change should add a cleanup pass where we remove any decorations from a module that shouldn't be allowed in the serialized code.

The biggest overall cleanup is in the serialization logic, where a lot of code just disappears because it can process the raw "decorations and children" list as the logical children of an IR instruction. The only special cases left are literals (which seem like they will always need special-casing) and global values (because they have a mangled name, which we plan to move into a decoration).

One other example of a simplification made possible by this change: the `IRNotePatchConstantFunc` instruction was implemented as an instruction only because it couldn't be encoded as a decoration at the time (it needed to have an operand that referenced an IR function).

The IR dumping logic was also updated (which meant a change to the `ir/string-literal` test) to try to make it print out all decorations a bit more systematically now that they are encoded like other instructions. The formatting isn't quite perfect, but it is good enough to be able to read what is going on.

I didn't include updates to the validation logic to ensure that decorations are being added in ways that follow the invariants, but that would be a nice thing to add next.

* fixup: 64-bit issues

* fixup: forward declaration issues
</content>
</entry>
<entry>
<title>Fix emit logic when "terminators" occur in the middle of a block (#540)</title>
<updated>2018-05-02T13:45:35+00:00</updated>
<author>
<name>Tim Foley</name>
<email>tfoleyNV@users.noreply.github.com</email>
</author>
<published>2018-05-02T13:45:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=d3c1c8b5a80d7ae72678ae209b5c0a7a7053ae2a'/>
<id>urn:sha1:d3c1c8b5a80d7ae72678ae209b5c0a7a7053ae2a</id>
<content type='text'>
Fixes #527

There were a few problem cases for the IR emit logic. The most obvious, which came up in #527, is that a function body with multiple `return` statements would generate invalid code:

```hlsl
int foo()
{
    return 1;
    int x = 2;
    return x;
}
```

In that case the IR for `foo` would have a single block that has two `return` instructions, which is invalid.

Another case that seems to be arising more often, but that had less obvious consequences, was when one arm of an `if` statement ends in a `return`:

```hlsl
if(a)
{
    return b;
}
else
{
    int c = 0;
}
int d = 0;
```

In that case, the `return` instruction for `return b` would be followed by a branch to the end of the `if` (the `int d = 0;` line), because that would be the normal control flow without the early `return`.

The fix implemented here is to have the IR lowering logic be a bit more careful on two fronts:

1. When emitting a branch, check if the block we are emitting into has already been terminated, and if so just don't emit the branch (since we are logically at an unreachable point in the CFG).

2. Whenever we are about to emit code for a (non-empty) statement, ensure that the current block being built is unterminated. If the current block is terminated, then start a new one.
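
The two rules can be sketched roughly as follows (a Python miniature with hypothetical names, not the actual lowering code; instructions are modelled as plain tuples):

```python
# Hypothetical miniature of the lowering rules; names are illustrative only.
TERMINATORS = {"return", "branch"}

class Block:
    def __init__(self, name):
        self.name = name
        self.insts = []

    def is_terminated(self):
        return bool(self.insts) and self.insts[-1][0] in TERMINATORS

class Builder:
    def __init__(self):
        self.blocks = [Block("b0")]
        self.warnings = []

    @property
    def current(self):
        return self.blocks[-1]

    def emit_branch(self, target):
        # Rule 1: never add a branch after an existing terminator.
        if self.current.is_terminated():
            return
        self.current.insts.append(("branch", target))

    def emit_statement(self, op, arg=None):
        # Rule 2: a statement after a terminator starts a fresh block.
        if self.current.is_terminated():
            self.warnings.append("unreachable code")
            self.blocks.append(Block("b%d" % len(self.blocks)))
        self.current.insts.append((op, arg))

builder = Builder()
builder.emit_statement("return", 1)    # return 1;
builder.emit_statement("var", "x")     # int x = 2;
builder.emit_statement("return", "x")  # return x;
builder.emit_branch("after")           # dropped: block already terminated
```

With the `foo()` example from above, this yields two blocks with one `return` each, plus a single unreachable-code warning.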

Case (2) will only matter when there is unreachable code (e.g., in the function `foo()`, the declaration of `x` and the second `return` can never be reached), so I added a warning in that case, and included a test case that triggers the new warning (with a function like `foo()` above).</content>
</entry>
<entry>
<title>Improve SSA promotion for arrays and structs (#521)</title>
<updated>2018-04-23T17:37:56+00:00</updated>
<author>
<name>Tim Foley</name>
<email>tfoleyNV@users.noreply.github.com</email>
</author>
<published>2018-04-23T17:37:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=9a7849d893ebb755a607befff6b3429830421112'/>
<id>urn:sha1:9a7849d893ebb755a607befff6b3429830421112</id>
<content type='text'>
* Improve SSA promotion for arrays and structs

Fixes #518

The existing SSA pass would only handle `load(v)` and `store(v,...)`
where `v` is the variable instruction, and would bail out if `v` was
used as an operand in any other fashion.

The new pass adds support for `load(ac)` where `ac` is an "access chain"
with a grammar like:

    ac :: v
        | getElementPtr(ac, ...)
        | getFieldAddress(ac, ...)

What this means in practical terms is that we can promote a local
variable of array or structure type to an SSA temporary even if there
are loads of individual elements/fields, as long as any *assignment* to
the variable assigns the whole thing.
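
A rough sketch of that check (Python, with a made-up tuple encoding of IR
values; the real pass works on `IRInst` objects and differs in detail):

```python
# Hypothetical tuple encoding of IR values: ("var", name),
# ("getElementPtr", base, index), ("getFieldAddress", base, field).
def access_chain_root(value):
    """Return the variable at the root of an access chain, or None."""
    op = value[0]
    if op == "var":
        return value
    if op in ("getElementPtr", "getFieldAddress"):
        # Recursive case: keep walking the chain toward its base.
        return access_chain_root(value[1])
    return None

def is_promotable(var, uses):
    """uses: list of (opcode, address_operand) pairs that mention var."""
    for opcode, address in uses:
        if opcode == "load" and access_chain_root(address) == var:
            continue  # loads may go through any access chain
        if opcode == "store" and address == var:
            continue  # stores must assign the whole variable
        return False
    return True

v = ("var", "a")
elem = ("getElementPtr", v, 0)
field = ("getFieldAddress", elem, "x")
```

Under these assumptions, `is_promotable(v, [("load", field), ("store", v)])`
holds, while a store through `elem` makes the variable non-promotable.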

I've added a test case to confirm that this change fixes passing of
arrays as function parameters for Vulkan.

* Fixup: disable test on Vulkan because render-test isn't ready

This is a fix for Vulkan, but I don't think our testing setup is ready
for it.

* Fixup: error in unreachable return case, caught by clang

* Fixups based on testing

These are fixes found when testing the original changes against the user code that originated the bug report.

* `emit.cpp`: Make sure to handle array-of-texture types when deciding whether to declare a temporary as a local variable in GLSL output

* `ir-legalize-types.cpp`: Make a note of a source of validation failures that we need to clean up sooner or later (just not in scope for this bug fix change).

* `ir-ssa.cpp`:
  * When checking if something is an access chain with a promotable var at the end, make sure the recursive case recurses into the "access chain" logic instead of the leaf case
  * Add some assertions to guard the assumption that any access chain we apply has been scheduled for removal
  * Correctly emit an element *extract* instead of getting an element *address* when promoting an element access into an array being promoted
  * Eliminate a wrapper routine that was setting up an `IRBuilder` and use the one from the block being processed in the SSA pass (since it was set up for stuff just like this)

* `ir-validate.cpp`:
  * Add a hack to avoid validation failures when running IR validation on the stdlib code. This case triggers for an initializer (`__init`) declaration inside an interface, since the logical "return type" is the interface type itself, which has no representation at the IR level and thus yields a null result type in a `FuncType` instruction.
</content>
</entry>
<entry>
<title>Introduce an IR-level type system (#481)</title>
<updated>2018-04-11T23:18:29+00:00</updated>
<author>
<name>Tim Foley</name>
<email>tfoleyNV@users.noreply.github.com</email>
</author>
<published>2018-04-11T23:18:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=baf194e7456ba4568dcf11249896af35b3ce18cc'/>
<id>urn:sha1:baf194e7456ba4568dcf11249896af35b3ce18cc</id>
<content type='text'>
* Introduce an IR-level type system

Up to this point, the Slang IR has used the front-end type system to represent types in the IR.
As a result (but ultimately more importantly) the IR representation of generics and specialization has used AST-level concepts embedded in the IR.
For example, to express the specialization of `vector&lt;T,N&gt;` to a concrete type `float` for `T`, we needed an IR operation that could represent the specialization, with operands that somehow represented the type argument `float`.
The whole thing was very complicated.

The big idea of this change is to introduce a new representation in which types in the IR are just ordinary instructions, so that using them as operands makes sense. The hierarchy of IR types closely mirrors the AST-side hierarchy for now, and that will probably be something we should maintain going forward.

In order to make these changes work, though, I also had to do major overhauls of things like the way substitutions are performed, how we check interface conformances, the way lookup through interface types is done, etc. etc. This is a big change, and unfortunately any attempt to summarize it in the commit message wouldn't do it justice.

* Fix 64-bit build warning

* Fix up some clang warnings/errors
</content>
</entry>
<entry>
<title>IR: next phase of "everything is an instruction" (#433)</title>
<updated>2018-03-03T15:16:08+00:00</updated>
<author>
<name>Tim Foley</name>
<email>tfoleyNV@users.noreply.github.com</email>
</author>
<published>2018-03-03T15:16:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=1fef9b4abfce5ace686a6acc772c605503f825fd'/>
<id>urn:sha1:1fef9b4abfce5ace686a6acc772c605503f825fd</id>
<content type='text'>
The main practical change here is that things that used to be `IRValue`s, like literals, are now being expressed as instructions in the global scope.

In order to validate that things are actually being handled correctly, this change introduces an explicit "validation" pass that can be run on the IR to check for different invariants (although it doesn't check many of the important ones right now). I've left the validation pass turned off by default, but with a command-line flag to enable it. We may want to make it be on by default in debug builds, just to keep us honest. The main invariant for the moment is that when one IR instruction is used as an operand to another, it had better come from the same IR module.
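
The same-module invariant amounts to something like the following (a Python sketch with hypothetical stand-in types, not the actual validation code):

```python
# Hypothetical stand-ins for IR objects; not Slang's real data structures.
class Inst:
    def __init__(self, module, op, operands=()):
        self.module = module
        self.op = op
        self.operands = list(operands)

def validate_operands_in_module(module_name, insts):
    """Return a list of error strings; an empty list means the check passed."""
    errors = []
    for inst in insts:
        for operand in inst.operands:
            if operand.module != module_name:
                errors.append("%s uses %s from module %s"
                              % (inst.op, operand.op, operand.module))
    return errors

lit = Inst("A", "literal")
foreign = Inst("B", "witnessTable")
good = Inst("A", "add", [lit, lit])
bad = Inst("A", "specialize", [foreign])
errors = validate_operands_in_module("A", [lit, good, bad])
```

In this toy setup only `bad` is reported, because its operand lives in module `B`.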

Some of the existing passes were violating this rule, in particular when it came to cloning of witness tables related to global generic parameter substitution. Those features can in theory be handled better now by allowing `specialize` instructions at other scopes, but I didn't want to over-complicate this change, so I made just enough fixes to ensure that these steps always clone witness tables they get from the "symbols" on an IR specialization context. In order for this to work when recursively specializing, I had to ensure that the logic for generic specialization had a notion of a "parent" specialization context that it would fall back to in order to perform cloning when necessary.

This change keeps the logic that was caching and re-using the instructions for literal values within a module, but adds some logic that isn't really being tested right now for picking the right parent instruction to insert a constant instruction into. This logic doesn't trigger right now because all of the cases we are using it on have zero operands (and so they always get "hoisted" to the global scope), but eventually for things like types we want to be able to support instructions with operands (e.g., `vector&lt;float, 4&gt;`) and handle the case where some of those operands come from different scopes (e.g., when nested inside a generic).

The final change here is mostly cosmetic: the `IRBuilder` is now more abstract about where insertion occurs: it tracks a single `IRParentInst` to insert into, and then an optional `IRInst` to insert before. In the common case, that parent is an `IRBlock`, but it could conceivably also be the global scope, or a witness table, etc. Use sites where we used to change those fields directly now use distinct methods `setInsertInto(parent)` and `setInsertBefore(inst)` which capture the two cases we care about. Accessors are also defined to extract the current block (if the current parent is a block), and the current "function" (global value with code, if the current parent is a global value with code, or a block inside one).
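
The two insertion modes behave roughly like this (a Python sketch; the class layout and the `emit` helper are made up for illustration, and only the `setInsertInto`/`setInsertBefore` split mirrors the description above):

```python
# Hypothetical model of a builder that tracks a parent plus an optional
# "insert before" position.
class Inst:
    def __init__(self, op):
        self.op = op
        self.parent = None
        self.children = []

class Builder:
    def __init__(self):
        self.parent = None  # instruction to insert into
        self.before = None  # optional sibling to insert before

    def set_insert_into(self, parent):
        # Append at the end of the given parent.
        self.parent, self.before = parent, None

    def set_insert_before(self, inst):
        # Insert into inst's parent, directly ahead of inst.
        self.parent, self.before = inst.parent, inst

    def emit(self, op):
        inst = Inst(op)
        inst.parent = self.parent
        siblings = self.parent.children
        if self.before is None:
            siblings.append(inst)
        else:
            siblings.insert(siblings.index(self.before), inst)
        return inst

block = Inst("block")
builder = Builder()
builder.set_insert_into(block)
ret = builder.emit("return")
builder.set_insert_before(ret)
builder.emit("add")
order = [child.op for child in block.children]
```

Here `order` ends up as `add` followed by `return`, since the second instruction was placed before the terminator rather than appended.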

With this work in place, it should be possible for a follow-on change to start putting `specialize` instructions at the global scope and thus clean up some of the on-the-fly specialization work. This work should also help with some of the requirements around a distinct IR-level type system and more explicit generics.</content>
</entry>
</feed>
