diff options
| -rw-r--r-- | docs/design/README.md | 21 | ||||
| -rw-r--r-- | docs/design/capabilities.md | 271 | ||||
| -rw-r--r-- | docs/design/coding-conventions.md | 426 | ||||
| -rw-r--r-- | docs/design/decl-refs.md | 166 | ||||
| -rw-r--r-- | docs/design/existential-types.md | 252 | ||||
| -rw-r--r-- | docs/design/ir.md | 195 | ||||
| -rw-r--r-- | docs/design/overview.md | 259 |
7 files changed, 1590 insertions, 0 deletions
diff --git a/docs/design/README.md b/docs/design/README.md new file mode 100644 index 000000000..e56ec155b --- /dev/null +++ b/docs/design/README.md @@ -0,0 +1,21 @@ +Slang Design and Implementation Notes +===================================== + +This directory contains documents that are primarily intended for developers working on the Slang implementation. +They are not indended to be helpful to Slang users. + +These documents can only be trusted to reflect the state of the codebase or the plans of their authors at the time they were written. Changes to the implementation are not expected to always come with matching changes to these documents, so some amount of drift is to be expected. + +Developers interested in contributing to Slang might want to start with the [Overview](overview.md) document, which describes the overall compilation pipeline that Slang uses and the purpose of the various steps (both implemented and planned). + +The [Coding Conventions](coding-conventions.md) document describes the conventions that should be followed in all code added to the Slang project. + +The [Interfaces](interfaces.md) document describes the high-level design plan for Slang's interfaces and generics features. + +The [Declaration References](decl-refs.md) document is intended to help out developers who are mystified by the heavily used `DeclRef` type in the compiler implementation. + +The [Intermediate Representation (IR)](ir.md) document describes the design of Slang's internal IR. + +The [Existential Types](existential-types.md) document goes into some detail about what "existential types" are in the context of the Slang language, and explains how we may go about supporting them. + +The [Capabilities](capabilities.md) document explains the proposed model for how Slang will support general notions of profile- or capability-based overloading/dispatch. diff --git a/docs/design/capabilities.md b/docs/design/capabilities.md new file mode 100644 index 000000000..fe582bd99 --- /dev/null +++ b/docs/design/capabilities.md @@ -0,0 +1,271 @@ +Capabilities +============ + +Slang aims to be a portable language for shader programming, which introduces two complementary problems: + +1. We need a way to indicate that certain constructs (types, functions, etc.) are only allowed on certain targets, so that a user gets a meaningful error if they try to do something that won't work on one or more of the APIs or platforms they want to target. Similarly, the user expects to get an error if they call a fragment-shader-specific function inside of, say, compute shader code, or vice versa. + +2. If the same feature can be implemented across multiple platforms, but the best (or only) implementation path differs across platforms, then we need a way to express the platform specific code and pick the right implementation per-target. + +Item (2) is traditionally handled with preprocessor techniques (e.g., `#ifdef`ing the body of a function based on target platform), but that of course requires that the user invoke the Slang front end once for each target platform, and target-specific coding in a library will then "infect" code that uses that library, forcing them to invoke the front-end once per target as well. + +We are especially sensitive to this problem in the compiler itself, because we have to author and maintain the Slang standard library, which needs to (1) expose the capabilities of many platforms and (2) work across all those platforms. It would be very unfortunate if we had to build different copies of our standard library per-target. + +The intention in Slang is to solve both of these problems with a system of *capabilities*. + +What is a capability? +--------------------- + +For our purposes a capability is a discrete feature that a compilation target either does or does not support. +We could imagine defining a capability for the presence of texture sampling operations with implicit gradients; this capability would be supported when generating fragment shader kernel code, but not when generating code for other stages. + +Let's imagine a language syntax that the standard library could use to define some *atomic* capabilities: + +``` +capability implicit_gradient_texture_fetches; +``` +We can then imagine using attributes to indicate that a function requires a certain capability: + +``` +struct Texture2D +{ + ... + + // Implicit-graident sampling operation. + [availableFor(implicit_gradient_texture_fetches)] + float4 Sample(SamplerState s, float2 uv); +} +``` + +(Note that the `[availableFor(...)]` syntax is just a straw-man to write up examples, and a better name would be desirable if/when we implement this stuff.) + +Given those declarations, we could then check when compiling code if the user is trying to call `Texture2D.Sample` in code compiled for a target that *doesn't* support implicit-gradient texture fetches, and issue an appropriate error. +The details on how to sequence this all in the compiler will be covered later. + +Derived Capabilities +-------------------- + +Once we can define atomic capabilities, the next step is to be able to define *derived* capabilities. +Let's imagine that we extend our `capability` syntax so that we can define a new capability that automatically implies one or more other capabilities: + +``` +capability fragment : implicit_gradient_texture_fetches; +``` + +Here we've said that whenever the `fragment` capability is available, we can safely assume that the `implicit_gradient_texture_fetches` capability is available (but not vice versa). + +Given even a rudientary tool like that, we can start to build up capabilities that relate closely to the "profiles" in things like D3D: + +``` +capability d3d; +capability sm_5_0 : d3d; +capability sm_5_1 : sm_5_0; +capability sm_6_0 : sm_5_1; +... + +capability d3d11 : d3d, sm_5_0; +capability d3d12 : d3d, sm_6_0; + +capability khronos; +capability glsl_400 : khronos; +capability glsl_410 : glsl_400; +... + +capability vulkan : khronos, glsl_450; +capability opengl : khronos; +``` + +Here we are saying that `sm_5_1` supports everything `sm_5_0` supports, and potentially more. We are saying that `d3d12` supports `sm_6_0` but maybe not, e.g., `sm_6_3`. +We are expressing that fact that having a `glsl_*` capability means you are on some Khronos API target, but that it doesn't specify which one. +(The extact details of these declarations obviously aren't the point; getting a good hierarchy of capabilites will take time) + +Capability Composition +---------------------- + +Sometimes we'll want to give a distinct name to a specific combination of capabilties, but not say that it supports anything new: + +``` +capability ps_5_1 = sm_5_1 & fragment; +``` + +Here we are saying that the `ps_5_1` capability is *equivalent* to the combination of `sm_5_1` and `fragment` (that is, if you support both `sm_5_1` and `fragment` then you support `ps_5_1` and vice versa). + +Compositions should be allowed in `[availableFor(...)]` attributes (e.g., `[availableFor(vulkan & glsl_450)]`), but pre-defined compositions should be favored when possible. + +When composing things with `&` it is safe for the compiler to filter out redundancies based on what it knows so that, e.g., `ps_5_0 & fragment` resolves to just `ps_5_0`. + +Once we have an `&` operator for capabilities, it is easy to see that "derived" capabilities are really syntax sugar, so that a derived capability like: + +``` +capability A : B, C +``` + +could have been written instead as : + +``` +capability A_atomic +capability A = A_atomic & B & C +``` + +Where the `A_atomic` capability guarantees that `A` implies `B` and `C` but not vice versa. + +It is also useful to think of an `|` operator on capabilities. +In particular if a function has multiple `[availableFor(...)]` attributes: + +``` +[availableFor(vulkan & fragment)] +[availableFor(d3d12 & fragment)] +void myFunc(); +``` + +This function should be equivalent to one with just a single `[availableFor((vulkan & fragment) | (d3d12 & fragment))]` which is equivalent to `[availableFor((vulkan | d3d12) & fragment)]`. +Simplification should generally push toward "disjunctive normal form," though, rather than puruse simplifications like that. +Note that we do *not* include negation, so that capabilities are not general Boolean expressions. + +Validation +---------- + +For a given function definition `F`, the front end will scan its body and see what it calls, and compose the capabilities required by the called functions using `&` (simplifying along the way). Call the resulting capability (in disjunctive normal form) `R`. + +If `F` doesn't have an `[availableFor(...)]` attribute, then we can derive its *effective* `[availableFor(...)]` capability as `R` (this probably needs to be expressed as an iterative dataflow problem over the call graph, to handle cycles). + +If `F` *does* have one or more `[availabelFor(...)]` clauses that amount to a declared capability `C` (again in disjunctive normal form), then we can check that `C` implies `R` and error out if it is not the case. +A reasonable implementation would track which calls introduced with requirements, and be able to explain *why* `C` does not capture the stated requirements. + +For a shader entry point, we should check it as if it had an `[availableFor(...)]` that is the OR of all the specified target profiles (e.g., `sm_5_0 | glsl_450 | ...`) ANDed with the specified stage (e.g., `fragment`). +Any error here should be reported to the user. +If an entry point has an explicit `[availableFor(...)]` then we should AND that onto the profile computed above, so that the user can restrict certain entry points to certain profiles. + +In order to support separate compilation, the functions that are exported from a module should probably either have explicit availability attributes, or else they will be compiled against a kind of "default capability" used for the whole module. +Downstream code that consumes such a module would see declarations with explicit capabilities only. +Picking an appropriate "default capability" to use when compiling modules is an important challenge; it would in practice define the "min spec" to use when compiling. + +Capability Overriding +--------------------- + +It should be possible to define multiple versions of a function, having different `[availableFor(...)]` attributes: + +``` +[availableFor(vulkan)] void myFunc() { ... } + +[availableFor(d3d12)] void myFunc() { ... } +``` + +For front-end checking, these should be treated as if they were a single definition of `myFunc` with a ORed capability (e.g., `vulkan | d3d12`). +Overload resoultion will pick the "best" candidate at a call site based *only* on the signatures of the function (note that this differs greatly from how profile-specific function overloading works in Cg). + +The front-end will then generate initial IR code for each definition of `myFunc`. +Each of the IR functions will have the *same* mangled name, but different bodies, and each will have appropriate IR decorations to indicate the capabilities it requires. + +The choice of which definition to use is then put off until IR linking for a particular target. +At that point we can look at all the IR functions matching a given mangled name, filter them according to the capabilities of the target, and then select the "best" one. + +In general a definition `A` of an IR symbol is better than another definition `B` if the capabilities on `A` imply those on `B` but not versa. +(In practice this probably needs to be "the capabilities on `A` intersected with those of the target," and similarly for `B`) + +This approach allows us to defer profile-based choices of functions to very late in the process. The one big "gotcha" to be aware of is when functions are overloaded based on pipeline stage, where we would then have to be careful when generating DXIL or SPIR-V modules with multiple entry points (as a single function `f` might need to be specialized twice if it calls a stage-overloaded function `g`). + +Capabilities in Other Places +---------------------------- + +So far I've talked about capabilities on functions, but they should also be allowed on other declarations including: + +- Types, to indicate that code using that type needs the given capability +- Interface conformances, to indicate that a type only conforms to the interface when the capabilities are available +- Struct fields, to indicate that the field is only present in the type when the capabilities are present +- Extension declarations, to indicate that everything in them requires the specified capabilities + +We should also provide a way to specify that a `register` or other layout modifier is only applicable for specific targets/stages. Such a capability nominally exists in HLSL today, but it would be much more useful if it could be applied to specify target-API-specific bindings. + +Only functions should support overloading based on capability. in all other cases there can only be one definition of an entity, and capabilities just decide when it is available. + +API Extensions as Capabilities +------------------------------ + +One clear use case for capabilities is to represent optional extensions, including cases where a feature is "built-in" in D3D but requires an extension in Vulkan: + +``` +capability KHR_secret_sauce : vulkan; + +[available_for(sm_7_0)] // always available for D3D Shader Model 7.0 +[available_for(KHR_secret_sauce)] // Need the "secret sauce" extension for Vulkan +void improveShadows(); +``` + +When generating code for Vulkan, we should be able to tell the user that the `improveShadows()` function requires the given extension. The user should be able to expression compositions of capabilities in their `-profile` option (and similarly for the API): + +``` +slangc code.slang -profile vulkan+KHR_secret_sauce +``` +(Note that for the command line, it is beneficial to use `+` instead of `&` to avoid conflicts with shell interpreters) + +And important question is whether the compiler should automatically infer required extensions without them being specified, so that it produces SPIR-V that requires extensions the user didn't ask for. +The argument against such inference is that users should opt in to non-standard capabilities they are using, but it would be unfortunate if this in turn requires verbose command lines when invoking the compiler. +It should be possible to indicate the capabilities that a module or entry point should be compiled to use without command-line complications. + +(A related challenge is when a capability can be provided by two different extensions: how should the compiler select the "right" one to use?) + +Disjoint Capabilities +--------------------- + +Certain compositions of capabilities make no sense. If a user declared a function as needing `vulkan & d3d12` they should probably get an error message. + +Knowing that certain capabilities are disjoint can also help improve the overall user experience. +If a function requires `(vulkan & extensionA) | (d3d12 & featureb)` and we know we are compiling for `vulkan` we should be able to give the user a pointed error message saying they need to ask for `extensionA`, because adding `featureB` isn't going to do any good. + +As a first-pass model we could have a notion of `abstract` capabilities that are used to model the root of hierarcies of disjoint capabilities: + +``` +abstract capability api; + +abstract capability d3d : api; +capability d3d11 : d3d; +capability d3d12 : d3d; + +abstract capability khronos : api; +capability vulkan : khronos; +capability opengl : khronos; +``` + +As a straw man: we could have a rule that to decide if non-abstract capabilities `A` and `B` are disjoint, we look for their common ancestor in the tree of capabilities. +If the common ancestor is abstract, they are disjoint, and if not they not disjoint. +We'd also know that if the user tries to compile for a profile that includes an abstract capability but *not* some concrete capability derived from it, then that is an error (we can't generate code for just `d3d`). + +The above is an over-simplification because we don't have a *tree* of capabilities, but a full *graph*, so we'd need an approach that works for the full case. + +Interaction with Generics/Interfaces +------------------------------------ + +It should be possible for an interface requirement to have a capability requirement attached to it. +This would mean that users of the interface can only use the method/type/whatever when the capability is present (just like for any other function): + +``` +interface ITexture +{ + float4 sampleLevel(float2 uv, float lod); + + [availableFor(fragment)] + float4 sample(float2 uv); // can only call this from fragment code +} +``` +When implementing an interface, any capability constraints we put on a member that satisfies an interface requirement would need to guarantee that either: + +- the capabilities on our method are implied by those on the requirement (we don't require more), or + +- the capabilities on the method are implied by those on the type itself, or its conformance to the interface (you can't use the conformance without the capabilities), or + +- the capabilities are already implied by those the whole module is being compiled for + +In each case, you need to be sure that `YourType` can't be passed as a generic argument to some function that uses just the `ITexture` interface above and have them call a method on your type from a profile that doesn't have the required capabilities. + +Interaction with Heterogeneity +------------------------------ + +If Slang eventually supports generating CPU code as well as shaders, it should use capabilities to handle the CPU/GPU split similar to how they can be used to separate out vertex- and fragment-shader functionality. +Something like a `cpu` profile that works as a catch-all for typical host CPU capabilities would be nice, and could be used as a convenient way to mark "host" functions in a file that is otherwise compiled for a "default profile" that assumes GPU capabilities. + +Conclusion +---------- + +Overall, the hope is that in many cases developers will be able to use capability-based partitioning and overloading of APIs to build code that only has to pass through the Slang front-end once, but that can then go through back-end code generation for each target. +In cases where this can't be achieved, the way that capability-based overloading is built into the Slang ir design means that we should be able to merge multiple target-specific definitions into one IR module, so that a library can employ target-specific specializations while still presenting a single API to consumers. diff --git a/docs/design/coding-conventions.md b/docs/design/coding-conventions.md new file mode 100644 index 000000000..cdb9b2d91 --- /dev/null +++ b/docs/design/coding-conventions.md @@ -0,0 +1,426 @@ +Slang Project Coding Conventions +================================ + +Principles +---------- + +This document attempts to establish conventions to be used in the Slang codebase. +We have two goals for this convention. + +The first goal is to make the code look relatively consistent so that it is easy to navigate and understand for contributors. +Having varying styles across different modules, files, functions, or lines of code makes the overall design and intention of the codebase harder to follow. + +The second goal is to minimize the scope complexity of diffs when multiple maintainers work together on the codebase. +In the absence of an enforced style, developers tend to "clean up" code they encoutner to match their personal preferences, and in so doing create additional diffs that increase the chances of merge conflicts and pain down the line. + +Because the Slang codebase has passed through many hands and evolved without a pre-existing convention, these two goals can come into conflict. +We encourage developers to err on the side of leaving well enough alone (favoring the second goal). +Don't rewrite or refactor code to match these conventions unless you were already going to have to touch all of those lines of code anyway. + +Note that external code that is incorporated into the project is excluded from all of these conventions. + +Languages +--------- + +### C++ + +Most code in the Slang project is implemented in C++. +We currently assume support for some C++11 idioms, but have explicitly avoided adding dependencies on later versions. + +As a general rule, be skeptical of "modern C++" ideas unless they are clearly better to simpler alternatives. +We are not quite in the realm of "Orthodox C++", but some of the same guidelines apply: + +* Don't use exceptions for non-fatal errors (and even then support a build flag to opt out of exceptions) +* Don't the built-in C++ RTTI system (home-grown is okay) +* Don't use the C++ variants of C headers (e.g., `<cstdio>` instead of `<stdio.h>`) +* Don't use the STL containers +* Don't use iostreams + +The compiler implementation does not follow some of these guidelines at present; that should not be taken as an excuse to further the proliferation of stuff like `dynamic_cast`. +Do as we say, not as we do. + +Some relatively recent C++ features that are okay to use: + +* Rvalue references for "move semantics," but only if you are implementing performance-critical containers or other code where this really matters. + +* `auto` on local variables, if the expected type is clear in context + +* Lambdas are allowed, but think carefully about whether just declaring a subroutine would also work. + +* Using `>>` to close multiple levels of templates, instead of `> >` (but did you really need all those templates?) + +* `nullptr` + +* `enum class` + +* Range-based `for` loops + +* `override` + +* Default member initializers in `class`/`struct` bodies + +Templates are suitable in cases where they improve clarity and type safety. +As a general rule, it is best when templated code is kept minimal, and forwards to a non-templated function that does the real work, to avoid code bloat. + +Any use of template metaprogramming would need to prove itself exceptionally useful to pay for the increase in cognitive complexity. +We don't want to be in the business of maintaining "clever" code. + +As a general rule, `const` should be used sparingly and only with things that are logically "value types." +If you find yourself having to `const`-qualify a lot of member function in type that you expect to be used as a heap-allocated object, then something has probably gone wrong. + +As a general rule, default to making the implementation of a type `public`, and only encapsulate state or operations with `private` when you find that there are complex semantics or invariants that can't be provided without a heavier hand. + +### Slang + +The Slang project codebase also includes `.slang` files implementing the Slang standard library, as well as various test cases and examples. +The conventions described here are thus the "official" recommendations for how users should format Slang code. + +To the extent possible, we will try to apply the same basic conventions to both C++ and Slang. +In places where we decide that the two languages merit different rules, we will point it out. + +Files and Includes +------------------ + +### File Names + +All files and directories that are added to codebase should have names that contain only ASCII lower-case letters, digits, dots (`.`) and dashes (`-`). +Operating systems still vary greatly in their handling of case sensitivity for file names, and non-ASCII code points are handled with even less consistency; sticking to a restricted subset of ASCII helps avoids some messy interactions between case-insensitive file systems and case-sensitive source-control systems like Git. +As with all these conventions, files from external projects are exempted from these restrictions. + +### Naming of Source and Header Files + +In general the C++ codebase should be organized around logical features/modules/subsystem, each of which has a single `.h` file and zero or more `.cpp` files to implement it. + +If there is a single `.cpp` file, its name should match the header: e.g., `parser.h` and `parser.cpp`. + +If there is more than one `.cpp` file, their names should start with the header name: e.g., `parser.h` and `parser-decls.cpp` and `parser-exprs.cpp`. +If there are declarations that need to be shared by the `.cpp` files, but shouldn't appear in the public interface, then can go in a `*-impl.h` header (e.g., `parser-impl.h`). + +Use best judgement when deciding what counts as a "feature." One class per file is almost always overkill, but the codebase currently leans too far in the other direction, with some oversized source files. + +### Headers + +Every header file should have an include guard. +Within the implementation we can use `#pragma once`, but exported API headers (`slang.h`) should use traditional `#ifdef` style guards (and they should be consumable as both C and C++). + +A header should include or forward-declare everything it needs in order to compile. +It is *not* up to the programmer who `#include`s a header to sort out the dependencies. + +Avoid umbrella or "catch-all" headers. + +### Source Files + +Every source file should start by including the header for its feature/module, before any other includes (this helps ensure that the header correctly includes its dependencies). + +Functions that are only needed within that one source file can be marked `static`, but we should avoid using the same name for functions in different files (in order to support lumped/unified builds). + +### Includes + +In general, includes should be grouped as follows: + +* First, the correspodning feature/module header, if we are in a source file +* Next, any `<>`-enlosed includes for system/OS headers +* Next, any `""`-enclosed includes for external/third-part code that is stored in the project repository +* Finally, any includes for other features in the project + +Within each group, includes should be sorted alphabetically. +If this breaks because of ordering issues for system/OS/third-party headers (e.g., `<Windows.h>` must be included before `<GL/GL.h>`), then ideally those includes should be mediated by a Slang-project-internal header that features can include. + +Namespaces +---------- + +Favor fewer namespaces when possible. +Small programs may not need any. + +All library code that a Slang user might link against should go in the `Slang` namespace for now, to avoid any possibility of clashes in a static linking scenario. +The public C API is obviously an exception to this. + + +Indenting, Spacing, and Braces +------------------------------ + +Indent by four spaces. Don't use tabs except in files that require them (e.g., `Makefile`s). + +Matching opening/closing curly braces should be indented to the same column, with an exception for empty braces (`{}`). +An opening curly brace should always be on a new line. + +```c++ +void someFunc(int a) +{ + doThings(a); +} + +void emptyFunc() +{} + +struct Helper +{ + void help() + { + emptyFunc(); + } +} +``` + +Extremely short "accessor" style functions can violate these rules if the whole definition fits on a line. + +```c++ +struct Vec1 +{ + float x; + + float getX() { return x; } +} +``` + +There is no hard limit on line length, but if you are going past 80-100 columns quite often, maybe think about breaking lines. + +When you decide to break lines for a paramter or argument list, always break after the opening `(`, and put each argument/parameter on its own line: + +```c++ +float bigFunc( + int a, + float b, + void* c) +{ + ... +} + +float gVar = bigFunc( + 0, + 1.0f, + data); +``` + +You can vertically align succesive lines of code to emphasize common structure, but this is not required: + +```c++ +case A_AND_B: doA(); doB(); break; +case JUST_A: doA(); break; +case JUST_B: doB(); break; +``` + +Don't put space between a control-flow keyword and a following `(`. +Examples of how to format the major C++ control-flow constructs follow: + +```c++ +void example() +{ + for(int ii = 0; ii < N; ++ii) + { + if(ii == 0) + { + } + else + { + } + } + + int x = 0; + while(x < 100) + { + x++; + } + + int mode = 0; + do + { + switch(mode) + { + case 0: + x /= 2; + /* fallthrough */ + case 1: + x--; + mode = 3; + + case 2: + default: + x++; + mode--; + break; + } + } while(x > 0); +} +``` + +Don't put space between a unary operator and its operand. +Always put space between an assignment (including compound assignment) operator and its operands. +For other binary operators, use your best judgement. + +If you have to line break a binary expression, put the line break after the operator for an assignment, and before the operator before all others. + +Intializer lists for constructors get some special-case formatting rules. +If everything fits on one line, then great: + +```c++ +SomeClass::SomeClass() + : a(0), b(1), c(2) +{} +``` + +Otherwise, put the line break *before* the comma: + +```c++ +SomeClass::SomeClass() + : a(0) + , b(1) + , c(2) +{} +``` + +When working with `static const` arrays, put the outer `{}` for the array on their own line, and then put each element on its own line and aim for vertical alignment. + +```c++ +struct Mapping +{ + char const* key; + char const* val; +} kMapping[] = +{ + { "a", "aardvark" }, + { "b", "bat" }, +}; +``` + +A trailing comma should always be used for array initializer lists like this, as well as for `enum` tags. + +When writing multi-line macros, the backslash escapes should align vertically. +For consistentcy, a trailing `\` can be used, followed by a comment to mark the end of the definition: + +```c++ +#define FOREACH_ANIMAL(X) \ + X(Aardvark) \ + X(Bat) \ + /* end */ +``` + +Naming +------ + +### Casing + +Types should in general use `UpperCamelCase`. This includes `struct`s, `class`es, `enum`s and `typedef`s. + +Values should in general use `lowerCamelCase`. This includes functions, methods, local variables, global variables, parameters, fields, etc. + +Macros should in general use `SCREAMING_SNAKE_CASE`. +It is important to prefix all macros (e.g., with `SLANG_`) to avoid collisions, since `namespace`s don't affect macros). + +In names using camel case, acronyms and initialisms should appear eniterly in either upper or lower case (e.g., `D3DThing d3dThing`) and not be capitalized as if they were ordinary words (e.g., `D3dThing d3dThing`). +Note that this also applies to uses of "ID" as an abbreviation for "identifier" (e.g., use `nodeID` instead of `nodeId`). + +### Prefixes + +Prefixes based on types (e.g., `p` for pointers) should never be used. + +Global variables should have a `g` prefix, e.g. `gCounter`. +Non-`const` `static` class members can have an `s` prefix if that suits your fancy. +Of course, both of these should be avoided, so this shouldn't come up often. + +Constant data (in the sense of `static const`) should have a `k` prefix. + +In contexts where "information hiding" is relevant/important, such as when a type has both `public` and `private` members, or just has certain operations/fields that are considered "implementation details" that most clients should not be using, and `_` prefix on function and field members is allowed (but not required). + +In function parameter lists, an `in`, `out`, or `io` prefix can be added to a parameter name to indicate whether a pointer/reference/buffer is intended to be used for input, output, or both input and output. +For example: + +```c++ +void copyData(void* outBuffer, void const* inBuffer, size_t size); + +Result lookupThing(Key k, Thing& outThing); + +void maybeAppendExtraNames(std::vector<Name>& ioNames); +``` + +Public C APIs will prefix all symbol names while following the casing convention (e.g. `SlangModule`, `slangLoadModule`, etc.). + +### Enums + +C-style `enum` should use the following convention: + +```c++ +enum Color +{ + Color_Red, + Color_Green, + Color_Blue, + + ColorCount, +}; +``` + +When using `enum class`, drop the type name as prefix, but retain the `UpperCamelCase` tag names: + +```c++ +enum class Color +{ + Red, + Green, + Blue, + + Count, +}; +``` + +When defining a set of flags, separate the type definition from the `enum`: + +```c++ +typedef unsigned int Axes; +enum +{ + Axes_None = 0, + + Axis_X = 1 << 0, + Axis_Y = 1 << 1, + Axis_Z = 1 << 2, + + Axes_All = kAxis_X | kAxis_Y | kAxis_Z, +}; +``` + +Note that the type name reflects the plural case, while the cases that represent individual bits are named with a singular prefix. + +In public APIs, all `enum`s should use the style of separating the type defintion from the `enum`, and all cases should use `SCREAMING_SNAKE_CASE`: + +```c++ +typedef unsigned int SlangAxes; +enum +{ + SLANG_AXES_NONE = 0, + + SLANG_AXIS_X = 1 << 0, + SLANG_AXIS_Y = 1 << 1, + SLANG_AXIS_Z = 1 << 2, + + SLANG_AXES_ALL = SLANG_AXIS_X | SLANG_AXIS_Y | SLANG_AXIS_Z, +}; +``` + +### General + +Names should default to the English language and US spellings, to match the dominant conventiosn of contemporary open-source projects. + +Function names should either be named with action verbs (`get`, `set`, `create`, `emit`, `parse`, etc.) or read as questions (`isEnabled`, `shouldEmit`, etc.). + +Whenever possible, compiler concepts should be named using the most widely-understood term available: e.g., we use `Token` over `Lexeme`, and `Lexer` over `Scanner` simply because they appear to be the more common names. + +Avoid abbreviations and initialisms unless they are already widely established across the codebase; a longer name may be cumbersome to write in the moment, but the code will probably be read many more times than it is written, so clarity should be preferred. +An important exception to this is common compiler concepts or techqniues which may have laboriously long names: e.g., Static Single Assignment (SSA), Sparse Conditional Copy Propagation (SCCP), etc. + +One gotcha particular to compiler front-ends is that almost every synonym for "type" has some kind of established technical meaning; most notably the term "kind" has a precise meaning that is relevant in our domain. +It is common practice in C and C++ to define tagged union types with a selector field called a "type" or "kind," which does not usually match this technical definition. +If a developer wants to avoid confusion, they are encouraged to use the term "flavor" instead of "type" or "kind" since this term (while a bit silly) is less commonly used in the literature. + +Comments and Documentation +-------------------------- + +You probably know the drill: comments are good, but an out-of-date comment can be worse than no comment at all. +Try to write comments that explain the "why" of your code more than the "what." +When implementing a textbook algorithm or technique, it may help to imagine giving the reviewer of your code a brief tutorial on the topic. + +In cases where comments would benefit from formatting, use Markdown syntax. +We do not currently have a setup for extracting documentation from comments, but if we add one we will ensure that it works with Markdown. + +When writing comments, please be aware that your words could be read by many people, from a variety of cultures and backgrounds. +Default to a plain-spoken and professional tone and avoid using slang, idiom, profanity, etc. diff --git a/docs/design/decl-refs.md b/docs/design/decl-refs.md new file mode 100644 index 000000000..1644f6f34 --- /dev/null +++ b/docs/design/decl-refs.md @@ -0,0 +1,166 @@ +Understanding Declaration References +==================================== + +This document is intended as a reference for developers working on the Slang compiler implementation. + +As you work on the code, you'll probably notice a lot of places where we use the `DeclRef<T>` type: + +* Expressions like `VarExpr` and `MemberExpr` are subclasses of `DeclRefExpr`, which holds a `DeclRef<Decl>`. + +* The most common subclass of `Type` is `DeclRefType`, which holds a `DeclRef<Decl>` for the type declaration. + +* Named types (references to `typedef`s) hold a `DeclRef<TypedefDecl>` + +* The name lookup process relies a lot on `DeclRef<ContainerDecl>` + +So what in the world is a `DeclRef`? + +The short answer is that a `DeclRef` packages up two things: + +1. A pointer to a `Decl` in the parsed program AST + +2. A set of "substitutions" to be applied to that decl + +Why do we need `DeclRef`s? +-------------------------- + +In a compiler for a simple language, we might represent a reference to a declaration as simply a pointer to the AST node for the declaration, or some kind of handle/ID that references that AST node. +A reprsentation like that will work in simple cases, for example: + +```hlsl +struct Cell { int value }; + +Cell a = { 3 }; +int b = a.value + 4; +``` + +In this case, the expression node for `a.value` can directly reference the declaration of the field `Cell::value`, and from that we can conclude that the type of the field (and hence the expression) is `int. + +In contrast, things get more complicated as soon as we have a language with generics: + +```hlsl +struct Cell<T> { T value; }; + +// ... + +Cell<int> a = { 3 }; +int b = a.value + 4; +``` + +In this case, if we try to have the expression `a.value` only reference `Cell::value`, then the best we can do is conclude that the field has type `T`. + +In order to correctly type the `a.value` expression, we need enough additional context to know that it references `Cell<int>::value`, and from that to be able to conclude that a reference to `T` in that context is equivalent to `int`. + +We can represent that information as a substitution which maps `T` to `int`: + +``` +[ Cell::T => int ] +``` + +Then we can encode a reference to `Cell<int>::value` as a reference to the single declaration `Cell::value` with such a substitution applied: + +``` +Cell::value [Cell::T => int] +``` + +If we then want to query the type of this field, we can first look up the type stored on the AST (which will be a reference to `Cell::T`) and apply the substitutions from our field reference to get: + +``` +Cell::T [Cell::T => int] +``` + +Of course, we can then simplify the reference by applying the substutions, to get: + +``` +int +``` + +How is this implemented? +------------------------ + +At the highest level, a `DeclRef` consists of a pointer to a declaration (a `Decl*`) plus a single-linked list of `Substution`s. +These substitutions fill in the missing information for any declarations on the ancestor chain for the declaration. + +Each ancestor of a declaration can introduce an expected substitution along the chain: + +* Most declarations don't introduce any substitutions: e.g., when referencing a non-generic `struct` we don't need any addition information. + +* A surrounding generic declaration requires a `GenericSubstitution` which specifies the type argument to be plugged in for each type parameter of the declaration. + +* A surrounding `interface` declaration usually requires a `ThisTypeSubstitution` that identifies the specific type on which an interface member has been looked up. + +All of the expected substitutions should be in place in the gerneral case, even when we might not have additional information. E.g., within a generic declaration like this: + +```hlsl +struct Cell<T> +{ + void a(); + void b() { a(); } +} +``` + +The reference to `a` in the body of `b` will be represented as a declaration reference to `Cell::a` with a substitution that maps `[Cell::T => Cell::T]`. This might seem superfluous, but it makes it clear that we are "applying" the generic to arguments (even if they are in some sense placeholder arguments), and not trying to refer to an unspecialized generic. + +There are a few places in the compiler where we might currently bend these rules, but experience has shown that failing to include appropriate substitutions is more often than not a source of bugs. + +What in the world is a "this type" substitution? +------------------------------------------------ + +When using interface-constrained generics, we need a way to invoke methods of the interface on instances of a generic parameter type. +For example, consider this code: + +```hlsl +interface IVehicle +{ + associatedtype Driver; + Driver getDriver(); +} + +void ticketDriver<V : IVehicle>(V vehicle) +{ + V.Driver driver = vehicle.getDriver(); + sentTicketTo(driver); +} +``` + +In the expression `vehicle.getDriver`, we are referencing the declaration of `IVehicle::getDriver`, and so a naive reading tells us that the return type of the call is `IVehicle.Driver`, but that is an associated type and not a concrete type. It is clear in context that the expression `vehicle.getDriver()` should result in a `V.Driver`. + +The way the compiler encodes that is that we treat the expression `v.getDriver` as first "up-casting" the value `v` (of type `V`) to the interface `IVehicle`. We know this is valid because of the generic constraint `V : IVehicle`. The result of the up-cast operation is an expression with a type that references `IVehicle`, but with a substitution to track the fact that the underlying implementation type is `V`. This amounts to something like: + +``` +IVehicle [IVehicle.This => V] +``` + +where `IVehicle.This` is a way to refer to "the concrete type that is implementing `IVehicle`". + +Looking up the `getDriver` method on this up-cast expression yields a reference to: + +``` +IVehicle::getDriver [IVehicle.This => V] +``` + +And extracting the return type of that method gives us a reference to the type: + +``` +IVehicle::Driver [IVehicle.This => V] +``` + +which turns out to be exactly what the front end produces when it evaluates the type reference `V.Driver`. + +As this example shows, a "this type" substitution allows us to refer to interface members while retaining knowledge of the specific type on which those members were looked up, so that we can compute correct references to things like associated types. + +What does any of this mean for me? +---------------------------------- + +When working in the Slang compiler code, try to be aware of whether you should be working with a plain `Decl*` or a full `DeclRef`. +There are many queries like "what is the return type of this function?" that typically only make sense if you are applying them to a `DeclRef`. + +The `syntax.h` file defines helpers for most of the existing declaration AST nodes for querying properties that should represent substitutions (the type of a variable, the return type of a function, etc.). +If you are writing code that is working with a `DeclRef`, try to use these accessors and avoid being tempted to extract the bare declaration and start querying it. + +Some things like `Modifier`s aren't (currently) affected by substitutions, so it can make sense to query them on a bare declaration instead of a `DeclRef. + +Conclusion +---------- + +Working with `DeclRef`s can be a bit obtuse at first, but they are the most elegant solution we've found to the problems that arise when dealing with generics and interfaces in the compiler front-end. Hopefully this document gives you enough context to see why they are important, and hints at how their representation in the compiler helps us implement some cases that would be tricky otherwise. diff --git a/docs/design/existential-types.md b/docs/design/existential-types.md new file mode 100644 index 000000000..e5bae0b7d --- /dev/null +++ b/docs/design/existential-types.md @@ -0,0 +1,252 @@ +Existential Types +================= + +This document attempts to provide some background on "existential types" as they pertain to the design and implementation of Slang. +The features described here are *not* reflected in the current implementation, so this is mostly a sketch of where we can go with the language and compiler. + +Background: Generics and Universal Quantification +------------------------------------------------- + +Currently Slang supports using interfaces as generic constraints. Let's use a contrived example: + +```hlsl +interface IImage { float4 getValue(float2 uv); } + +float4 offsetImage<T : IImage>(T image, float2 uv) +{ + float2 offset = ...; + return image.getValue(uv + offset) +} +``` + +Generics like this are a form of "universal quantification" in the terminology of type theory. +This makes sense, because *for all* types `T` that satisfy the constraints, `offsetImage` provides an implementation of its functionality. + +When we think of translating `offsetImage` to code, we might at first only think about how we can specialize it once we have a particular type `T` in mind. +However, we can also imagine trying to generate one body of code that can implement `offsetImage` for *any* type `T`, given some kind of runtime representation of types. +For example, we might generate C++ code like: + +```c++ +struct IImageWitnessTable { float4 (*getValue)(void* obj, float2 uv); }; + +float4 offsetImage(Type* T, IImageWitnessTable* W, void* image, float2 uv) +{ + float2 offset = ...; + return W->getvalue(image, uv + offset); +} +``` + +This translation takes the generic parameters and turns them into ordinary runtime parameters: the type `T` becomes a pointer to a run-time type representation, while the constraint that `T : IImage` becomes a "witness table" of function pointers that, we assume, implements the `IImage` interface for `T`. So, the syntax of generics is *not* tied to static specialization, and can admit a purely runtime implementation as well. + +Readers who are familiar with how languages like C++ are implemented might see the "witness table" above and realize that it is kind of like a virtual function table, just being passed alongside the object, rather than stored in its first word. + +Using Interfaces Like Types +--------------------------- + +It is natural for a user to want to write code like the following: + +```hlsl +float4 modulateImage(IImage image, float2 uv) +{ + float4 factor = ...; + return factor * image.getValue(uv); +} +``` + +Unlike `offsetImage`, `modulateImage` is trying to use the `IImage` interface as a *type* and not just a constraint. + +This code appears to be asking for a dynamic implementation rather than specialization (we'll get back to that...) and so we should be able to implement it similarly to our translation of `offsetImage` to C++. +Something like the following makes a lot of sense: + +```c++ +struct IImage { Type* T; IImageWitnessTable* W; void* obj; }; + +float4 modulateImage(IImage image, float2 uv) +{ + float4 factor = ...; + return factor * image.W->getvalue(image.obj, uv); +} +``` + +Similar to the earlier example, there is a one-to-one mapping of the parameters of the Slang function the user wrote to the parameters of the generated C++ function. +To make this work, we had to bundle up the information that used to be separate parameters to the generic as a single value of type `IImage`. + +Existential Types +----------------- + +It turns out that when we use `IImage` as a type, it is what we'd call an *existential* type. +That is because if I give you a value `img` of type `IImage` in our C++ model, then you know that *there exists* some type `img.T`, a witness table `img.W` proving the type implements `IImage`, and a value `img.obj` of that type. + +Existential types are the bread and butter of object-oriented programming. +If I give you an `ID3D11Texture2D*` you don't know what its concrete type is, and you just trust me that some concrete type *exists* and that it implements the interface. +A C++ class or COM component can implement an existential type, with the constraint that the interfaces that a given type can support is limited by the way that virtual function tables are intrusively included inside the memory of the object, rather than externalized. +Many modern languages (e.g., Go) support adapting existing types to new interfaces, so that a "pointer" of interface type is actually a fat pointer: one for the object, and one for the interface dispatch table. +Our examples so far have assumed that the type `T` needs to be passed around separately from the witness table `W`, but that isn't strictly required in some implementations. + +In type theory, the most important operation you can do with an existential type is to "open" it, which means to have a limited scope in which you can refer to the constinuent pieces of a "bundled up" value of a type like `IImage`. +We could imagine "opening" an existential as something like: + +``` +void doSomethingCool<T : IImage>(T val); + +void myFunc(IImage img) +{ + open img as obj:T in + { + // In this scope we know that `T` is a type conforming to `IImage`, + // and `obj` is a value of type `T`. + // + doSomethingCool<T>(obj); + } +} +``` + +Self-Conformance +---------------- + +The above code with `doSomethingCool` and `myFunc` invites a much simpler solution: + +``` +void doSomethingCool<T : IImage>(T val); + +void myFunc(IImage img) +{ + doSomethingCool(img); +} +``` + +This seems like an appealing thing for a language to support, but there are some subtle reasons why this isn't possible to support in general. +If we think about what `doSomethingCool(img)` is asking for, it seems to be trying to invoke the function `doSomethingCool<IImage>`. +That function only accepts type parameters that implement the `IImage` interface, so we have to ask ourselves: + +Does the (existential) type `IImage` implement the `IImage` interface? + +Knowing the implementation strategy outline above, we can re-phrase this question to: can we construct a witness table that implements the `IImage` interface for values of type `IImage`? + +For simple interfaces this is sometimes possible, but in the general case there are other desirable language features that get in the way: + +* When an interface has associated types, there is no type that can be chosen as the associated type for the interface's existential type. The "obvious" approach of using the constraints on the associatd type can lead to unsound logic when interface methods take associated types as parameters. + +* When an interface uses the "this type" (e.g., an `IComparable` interface with a `compareTo(ThisType other)` method), it isn't correct to simplify the this type to the interface type (just because you have to `IComarable` values doesn't mean you can compare them - they have to be of the same concrete type!) + +* If we allow for `static` method on interfaces, then what implementation would we use for these methods on the interface existential type? + +Encoding Existentials in the IR +------------------------------- + +Existentials are encoded in the Slang IR quite simply. We have an operation `makeExistential(T, obj, W)` that takes a type `T`, a value `obj` that must have type `T`, and a witness table `W` that shows how `T` conforms to some interface `I`. The result of the `makeExistential` operation is then a value of the type `I`. + +Rather than include an IR operation to "open" an existential, we can instead just provide accessors for the pieces of information in an existential: one to extract the type field, one to extract the value, and one to extract the witness table. These would idomatically be used like: + +``` +let e : ISomeInterface = /* some existential */ +let T : Type = extractExistentialType(e); +let W : WitnessTbale = extractExistentialWitnessTable(e); +let obj : T = extractExistentialValue(e); +``` + +Note how the operation to extract `obj` gets its result type from the previously-executed extraction of the type. + +Simplifying Code Using Existentials +----------------------------------- + +It might seem like IR code generated using existentials can only be implemented using dynamic dispatch. +However, within a local scope it is clear that we can simplify expressions whenever `makeExistential` and `extractExistential*` operations are paired. +For example: + +``` +let e : ISomeInterface = makeExistential(A, a, X); +... +let B = extractExistentialType(e); +let b : B = extractExistentialValue(e); +let Y = extractExistentialWitnessTable(e); +``` + +It should be clear in context that we can replace `B` with `A`, `b` with `a`, and `Y` with `X`, after which all of the `extract*` operations and the `makeExistential` operation are dead and can be eliminated. + +This kind of simplification works within a single function, as long as there is no conditional logic involving existentials. +We require further transformation passes to allow specialization in more general cases: + +* Copy propagation, redundancy elimination and other dataflow optimizations are needed to simplify use of existentials within functions +* Type legalization passes, including some amount of scalarization, are needed to "expose" existential-type fields that are otherwise buried in a type +* Function specialization, is needed so that a function with existential parameters is specialized based on the actual types used at call sites + +Transformations just like these are already required when working with resource types (textures/samplers) on targets that don't support first-class computation on resources, so it is possible to share some of the same logic. +Similarly, any effort we put into validation (to ensure that code is written in a way that *can* be simplified) can hopefully be shared between existentials and reources. + +Compositions +------------ + +So far I've only talked about existential types based on a single interface, but if you look at the encoding as a tuple `(obj, T, W)` there is no real reason that can't be generalized to hold multiple witness tables: `(obj, T, W0, ... WN)`. Interface compositions could be expressed at the language level using the `&` operator on interface (or existential) types. + +The IR encoding doesn't need to change much to support compositions: we just need to allow multiple witness tables on `makeExistential` and have an index operand on `extractExistentialWitnessTable` to get at the right one. + +The hardest part of supporting composition of interfaces is actually in how to linearize the set of interfaces in a way that is stable, so that changing a function from using `IA & IB` to `IB & IA` doesn't change the order in which witness tables get packed into an existential value. + +Why are we passing along the type? +---------------------------------- + +I'm glossing over something pretty significant here, which is why anybody would pass around the type as part of the existential value, when none of our examples so far have made us of it. +This sort of thing isn't very important for languages where interface polymorphism is limited to heap-allocated "reference" types (or values that have been "boxed" into reference types), because the dynamic type of an object can almost always be read out of the object itself. + +When dealing with a value type, though, we have to deal with things like making *copies*: + +``` +interface IWritable { [mutating] void write(int val); } + +stuct Cell : IWritable { int data; void write(int val) { data = val; } } + +T copyAndClobber<T : IWritable>(T obj) +{ + T copy = obj; + obj.write(9999); + return copy; +} + +void test() +{ + Cell cell = { 0 }; + Cell result = copyAndClobber(cell); + // what is in `result.data`? +} +``` + +If we call `copyAndClober` on a `Cell` value, then does the line `obj.write` overwrite the data in the explicit `copy` that was made? +It seems clear that a user would expect `copy` to be unaffected in the case where `T` is a value type. + +How does that get implemented in our runtime version of things? Let's imagine some C++ translation: + +``` +void copyAndClobber(Type* T, IWriteableWitnessTable* W, void* obj, void* _returnVal) +{ + void* copy = alloca(T->sizeInBytes); + T->copyConstruct(copy, obj); + + W->write(obj, 9999); + T->moveConstruct(_returnVal, copy); +} +``` + +Because this function returns a value of type `T` and we don't know how big that is, let's assume the caller is passing in a pointer to the storage where we should write the result. +Now, in order to have a local `copy` of the `obj` value that was passed in, we need to allocate some scratch storage, and only the type `T` can know how many bytes we need. +Furthermore, when copying `obj` into that storage, or subsequently copying the `copy` variable into the function result, we need the copy/move semantics of type `T` to be provided by somebody. + +This is the reason for passing through the type `T` as part of an existential value. + +If we only wanted to deal with reference types, this would all be greatly simplified, because the `sizeInBytes` and the copy/move semantics would be fixed: everything is a single pointer. + +All of the same issues arise if we making copies of existential values: + +``` +IWritable copyAndClobberExistential(IWritable obj) +{ + IWritable copy = obj; + obj.write(9999); + return copy; +} +``` + +If we want to stay consistent and say that `copy` is an actual copy of `obj` when the underlying type is a value rather than a reference type, then we need the copy/move operations for `IWritable` to handle invoking the copy/move operations of the underlying encapsulated type. + +Aside: it should be clear from these examples that implementing generics and existential types with dynamic dispatch has a lot of complexity when we have to deal with value types (because copying requires memory allocation). +It is likely that a first implementation of dynamic dispatch support for Slang would restrict it to reference types (and would thus add a `class` keyword for defining reference types). diff --git a/docs/design/ir.md b/docs/design/ir.md new file mode 100644 index 000000000..086d58b91 --- /dev/null +++ b/docs/design/ir.md @@ -0,0 +1,195 @@ +The Design of Slang's Intermediate Representation (IR) +====================================================== + +This document details some of the important design choices for Slang's IR. + +Goals and Non-Goals +------------------- + +The IR needs to balance many goals which can sometimes come into conflict. +We will start by enumerating these goals (and related non-goals) explicitly so that we can better motivate specific design choices. + +* Obviously it must be simple to lower any source code in Slang code to the IR. It is however a non-goal for the lowering process to be lossless; we do not need to recover source-level program structure from the IR. + +* The IR must be amenable to standard dataflow analyses and optimizations. It should be possible to read a paper on a compiler algorithm or technique and apply it to our IR in a straightforward manner, and with the expected asymptotic efficiency. + +* As a particular case of analysis and optimization, it should be possible to validate flow-dependent properties of an input function/program (e.g., whether an `[unroll]` loop is actually unrollable) using the IR, and emit meaningful error messages that reference the AST-level names/locations of constructs involved in an error. + +* It should be posible to compile modules to the IR separately and then "link" them in a way that depends only on IR-level (not AST-level) constructs. We want to allow changing implementation details of a module without forcing a re-compile of IR code using that module (what counts as "implementation details") is negotiable. + +* There should be a way to serialize IR modules in a round-trip fashion preserving all of the structure. As a long-term goal, the serialized format should provide stability across compiler versions (working more as an IL than an IR) + +* The IR must be able to encode "generic" (type-parameterized) constructs explicitly, and to express transformations from generic to specialized (or dynamic-dispatch) code in the IR. In particular, it must be possible for a module to make use of generic defined in another (separately-compiled) module, with validation performed before linking, and specialization performed after. + +* The IR must be able to express code that is close to the level of abstraction of shader intermediate languages (ILs) like SPIR-V and DXIL, so that we can minimize the amount of work required (and the number of issues that can arise) when translating the IR to these targets. This can involve lowering and legalization passes to match the constraints of those ILs, but it should not require too much work to be done outside of the IR. + +* It should be possible to translate code in the IR back into high-level-language code, including things like structured control-flow constructs. + +* Whenever possible, invariants required by the IR should be built into its structure so that they are easier to maintain. + +* We should strive to make the IR encoding, both in memory and when serialized, as compact as is practically possible. + +Inspirations +------------ + +The IR design we currently use takes inspiration from three main sources: + +* The LLVM project provides the basic inspiration for the approach to SSA, such as using a typed IR, the decision to use the same object to represent an instruction and the SSA value it produces, and the push to have an extremely simple `replaceAllUsesWith` primitive. It is easy to forget that it is possible to design a compiler with different design decisions; the LLVM ones just happen to both be well-motivated and well-known. + +* The Swift IL (SIL) provides the inspiration for our approach for encoding SSA "phi nodes" (blocks with arguments), and also informs some of how we have approached encoding generics and related features like existential types. + +* The SPIR-V IL provides the inspiration for the choice to uniformly represent types as instructions, for how to encode "join points" for structured control flow, and for the concept of "decorations" for encoding additional metadata on instructions. + + +Key Design Decisions +-------------------- + +### Everything is an Instruction + +The Slang IR strives for an extremely high degree of uniformity, so almost every concept in the IR is ultimately just an instruction: + +* Ordinary add/sub/mul/etc. operations are instructions, as are function calls, branches, function parameters, etc. + +* Basic blocks in functions, as well as functions themselves are "parent instructions" that can have other instructions as children + +* Constant values (e.g., even `true` and `false`) are instructions + +* Types are instructions too, and can have operands (e.g., a vector type is the `VectorType` instruction applied to operands for the element type and count) + +* Generics are encoded entirely using ordinary instructions: a generic is encoded like a function that just happens to do computation at the type level + +* It isn't true right now, but eventually decorations will also be instructions, so that they can have operands like any other instruction + +* An overall IR module is itself an instruction so that there is a single tree that owns everything + +This uniformity greatly simplifies the task of supporting generics, and also means that operations that need to work over all instructions, such as cloning and serialization, can work with a single uniform representation and avoid special-casing particular opcodes. + +The decision to use an extremely uniform deisgn, even going as far to treat types as "ordinary" instructions, is similar to SPIR-V, although we do not enforce many of the constraints SPIR-V does on how type and value instructions can be mixed. + +### Instructions Have a Uniform Structure + +Every instruction has: + +* An opcode +* A type (the top-level module is the only place where this can be null) +* Zero or more operands +* Zero or more decorations +* Zero or more children + +Instructions are not allowed to have any semantically-relevant information that is not in the above list. +The only exception to this rule is instructions that represent literal constants, which store additional data to represent their value. + +The in-memory encoding places a few more restrictions on top of this so that, e.g., currently an instruction can either have operands of children, but not both. + +Because everything that could be used as an operand is also an instruction, the operands of an instruction are stored in a highly uniform way as a contiguous array of `IRUse` values (even the type is continguous with this array, so that it can be treated as an additional operand when required). +The `IRUse` type maintains explicit links for use-def information, currently in a slightly bloated fashion (there are well-known techniques for reducing the size of this information). + +### A Class Hierarchy Mirrored in Opcodes + +There is a logical "class hierarchy" for instructions, and we support (but do not mandate) declaring a C++ `struct` type to expose an instruction or group of instructions. +These `struct` types can be helpful to encode the fact that the program knows an instruction must/should have a particular type (e.g., having a function parameter of type `IRFunction*` prevents users from accidentally passing in an arbitrary `IRInst*` without checking that it is a function first), and can also provide convenience accessors for operands/children. + +Do make "dynamic cast" operations on this class hierarchy efficient, we arrange for the instruction opcodes for the in-memory IR to guarantee that all the descendents of a particular "base class" will occupy a contiguous range of opcodes. Checking that an instruction is in that range is then a constant-time operation that only looks at its opcode field. + +There are some subtleties to how the opcodes are ordered to deal with the fact that some opcodes have a kind of "multiple inheritance" thing going on, but that is a design wart that we should probably remove over time, rather than something we are proud of. + +### A Simpler Encoding of SSA + +The traditional encoding of SSA form involves placing "phi" instructions at the start of blocks that represent control-flow join points where a variable will take on different values depend on the incoming edge that is taken. +There are of course benefits to sticking with tradition, but phi instructions also have a few downsides: + +- The operands to phi instructions are the one case where the "def dominates use" constraint of SSA appears to be violated. I say "appears" because officially the action of a phi occurs on the incoming edge (not in the target block) and that edge will of course be dominated by the predecessor block. It still creates a special case that programmers need to be careful about. This also complicates serialization in that there is no order in which the blocks/instructions of a function can be emitted that guarantees that every instruction always precedes all of its uses in the stream. + +- All of the phi instructions at the start of the block must effectively operate in parallel, so that they all "read" from the correct operand before "writing" to the target variable. Like the above special case, this is only a problem for a phi related to a loop back-edge. It is of course possible to always remember the special interpretation of phi instructions (that they don't actually execute sequentially like every other instruction in a block), but its another special case. + +- The order of operands to a phi instruction needs to be related back to the predecessor blocks, so that one can determine which value is to be used for which incoming edge. Any transformation that modifies the CFG of a function needs to be careful to rewrite phi instructions to match the order in which predecessors are list, or else the compiler must maintain a side data structure that remembers the mapping (and update it instead). + +- Directly interpreting/executing code in an SSA IR with phi instructions is made more difficult because when branching to a block we need to immediately execute any phi instructions based on the block from which we just came. The above issues around phis needing to be executed in parallel, and needing to track how phi operands relate to predecessor blocks also add complexity to an interpreter. + +Slang ditches traditional phi functions in favor of an alternative that matches the Swift IL (SIL). +The idea doesn't really start in Swift, but rather in the existing observation that SSA form IR and a continuation-passing style (CPS) IR are semantically equivalent; one can encode SSA blocks as continuation functions, where the arguments of the continuation stand in for the phi instructions, and a branch to the block becomes just a call. + +Like Swift, we do not use an explicit CPS representation, but instead find a middle ground of a traditional SSA IR where instead of phi instructions basic blocks have parameters. +The first N instructions in a Slang basic block are its parameters, each of which is an `IRParam` instruction. + +A block that would have had N phi instrutions now has N parameters, but the parameters do not have operands. +Instead, a branch instruction that targets that block will have N *arguments* to match the parameters, representing the values to be assigned to the parameters when this control-flow edge is taken. + +This encoding is equivalent in what it represents to traditional phi instructions, but nicely solves the problems outlines above: + +- The phi operands in the successor block are now arguments in the *predecessor* block, so that the "def dominates use" property can be enforced without any special cases. + +- The "assignment" of the argument values to parameters is now encoded with a single instruction, so that the simultaneity of all the assignments is more clear. We still need to be careful when leaving SSA form to obey those semantics, but there are no tricky issues when looking at the IR itself. + +- There is no special work required to track which phi operands come from which predecessor block, since the operands are attached to the terminator instruction of the predecessor block itself. There is no need to update phi instructions after a CFG change that might affect the predecessor list of a block. The trade-off is that any change the *number* of parameters of a block now requires changes to the terminator of each predecessor, but that is a less common change (isolated to passes that can introduce or eliminate block parameters/phis). + +- It it much more clear how to give an operational semantics to a "branch with arguments" instead of phi instructions: compute the target block, copy the argumenst to temporary storage (because of the simultaneity requirement), and then copy the temporaries over the parameters of the target block. + +The main caveat of this representation is that it requires branch instructions to have room for arguments to the target block. For an ordinary unconditional branch this is pretty easy: we just put a variable number of arguments after the operand for the target block. For branch instructions like a two-way conditional, we might need to encode two argument lists - one for each target block - and an N-way `switch` branch only gets more complicated. + +The Slang IR avoids the problem of needing to store arguments on every branch instruction by banning *critical edges* in IR functions that are using SSA phis/parameters. A critical edge is any edge from a block with multiple successors (meaning it ends in a conditional branch) to one with multiple predecessors (meaning it is a "join point" in the CFG). +Phi instructions/parameters are only ever needed at join points, and so block arguments are only needed on branches to a join point. +By ruling out conditional branches that target join points, we avoid the need to encode arguments on conditional branch instructions. + +This constraint could be lifted at some point, but it is important to note that there are no programs that cannot be represented as a CFG without critical edges. + +### A Simple Encoding of the CFG + +A traditional SSA IR represents a function as a bunch of basic blocks of instructions, where each block ends in a *terminator* instruction. +Terminators are instructions that can branch to another block, and are only allowed at the end of a block. +The potential targets of a terminator determine the *successors* of the block where it appears, and contribute to the *predecessors* of any target block. +The succesor-to-predecessor edges form a graph over the basic blocks called the control-flow graph (CFG). + +A simple representation of a function would store the CFG explicitly as a graph data structure, but in that case the data structure would need to be updated whenever a change is made to the terminator instruction of a branch in a way that might change the successor/predecessor relationship. + +The Slang IR avoids this maintenance problem by noting an important property. +If block `P`, with terminator `t`, is a predecessor of `S`, then `t` must have an operand that references `S`. +In turn, that means that the list of uses of `S` must include `t`. + +We can thus scan through the list of predecessors or successors of a block with a reasonably simple algorithm: + +* To find the successors of `P`, find its terminator `t`, identify the operands of `t` that represent successor blocks, and iterate over them. This is O(N) in the number of outgoing CFG edges. + +* To find the predecessors of `S`, scan through its uses and identify users that are terminator instructions. For each such user if this use is at an operand position that represents a successor, then include the block containing the terminator in the output. This is O(N) in the number of *uses* of a block, but we expect that to be on the same order as the number of predecessors in practice. + +Each of these actually iterates over the outgoing/incoming CFG *edges* of a block (which might contain duplicates if one block jumps to another in, e.g, multiple cases of a `switch`). +Sometimes you actually want the edges, or don't care about repeats, but in the case where you want to avoid duplicates the user needs to build a set to deduplicate the lists. + +The clear benefit of this approach is that the predecessor/successor lists arise naturally from the existing encoding of control-flow instructions. It creates a bit of subtle logic when walking the predecessor/successor lists, but that code only needs to be revisited if we make changes to the terminator instructions that have successors. + +### Explicit Encoding of Control-Flow Join Points + +In order to allow reconstruction of high-level-language source code from a lower-level CFG, we need to encode something about the expected "join point" for a structured branch. +This is the logical place where control flow is said to "reconverge" after a branch, e.g.: + +```hlsl +if(someCondition) // join point is "D" +{ + A; +} +else +{ + B; + if(C) return; +} +D; +``` + +Note that (unlike what some programming models would say) a join point is *not* necessarily a postdominator of the conditional branch. In the example above the block with `D` does not postdominate the block with `someCondition` nor the one with `B`. It is even possible to construct cases where the high-level join point of a control-flow construct is unreachable (e.g., the block after an infinite loop). + +The Slang IR encodes structured control flow by making the join point be an explicit operand of a structured conditional branch operation. Note that a join-point operand is *not* used when computing the successor list of a block, since it does not represent a control-flow edge. +This is slightly different from SPIR-V where joint points ("merge points" in SPIR-V) are encoded using a metadata instruction that precedes a branch. Keeping the information on the instruction itself avoids cases where we move one but not the other of the instructions, or where we might accidentally insert code between the metadata instruction and the terminator it modifies. +In the future we might consider using a decoration to represent join points. + +When using a loop instruction, the join point is also the `break` label. The SPIR-V `OpLoopMerge` includes not only the join point (`break` target) but also a `continue` target. We do not currently represent structured information for `continue` blocks. +The reason for this is that while we could keep structured information about `continue` blocks, we might not be able to leverage it when generating high-level code, because the syntactic form of a `for` loop (the only construct in C-like languages where `continue` can go somewhere other than the top of the loop body) only allows an *expression* for the continue clause and not a general *statement*, but we cannot guarantee that after optimization the code in an IR-level "continue clause" would constitute a single expression. +The approach we use today means that the code in "continue clause" might end up being emitted more than once in final code; this is deemed acceptable because it is what `fxc` already does. + +When it comes time to re-form higher-level structured control flow from Slang IR, we use the structuring information in the IR to form single-entry "regions" of code that map to existing high-level control-flow constructs (things like `if` statements, loops, `break` or `continue` statements, etc.). +The current approach we use requires the structuring information to be maintained by all IR transformations, and also currently relies on some invariants about what optimizations are allowed to do (e.g., we had better not introduce multi-level `break`s into the IR). + +In the future, it would be good to investigate adapting the "Relooper" algorithm used in Emscripten so that we can reover valid structured control flow from an arbitrary CFG; for now we put off that work. +If we had a more powerful restructuring algorithm at hand, we could start to support things like multi-level `break`, and also ensure that `continue` clauses don't lead to code duplication any more. + + + diff --git a/docs/design/overview.md b/docs/design/overview.md new file mode 100644 index 000000000..7c2b16cf3 --- /dev/null +++ b/docs/design/overview.md @@ -0,0 +1,259 @@ +An overview of the Slang Compiler +================================= + +This document will attempt to walk through the overall flow of the Slang compiler, as an aid to developers who are trying to get familiar with the codebase and its design. +More emphasis will be given to places where the compiler design is nontraditional, or might surprise newcomers; things that are straightforward won't get much detail. + +High-Level Concepts +------------------- + +Compilation is always performed in the context of a *compile request*, which bundles together the options, input files, and request for code generation. +Inside the code, there is a type `CompileRequest` to represent this. + +The user specifies some number of *translation units* (represented in the code as a `TranslationUnitRequest`) which comprise some number of *sources* (files or strings). +HLSL follows the traditional C model where a "translaiton unit" is more or less synonymous with a source file, so when compiling HLSL code the command-line `slangc` will treat each source file as its own translation unit. +For Slang code, the command-line tool will by default put all source files into a single translation unit (so that they represent a shared namespace0). + +The user can also specify some number of *entry points* in each translation unit (`EntryPointRequest`), which combines the name of a function to compile with the pipeline stage to compile for. + +In a single compile request, we can generate code for zero or more *targets* (represented with `TargetRequest`) a target defines both the format for output code (e.g., DXIL or SPIR-V) and a *profile* that specifies the capability level to assume (e.g., "Shader Model 5.1"). + +It might not be immediately clear why we have such fine-grained concepts as this, but it ends up being quite important to decide which pieces of the compiler are allowed to depend on which pieces of information (e.g., whether or not a phase of compilation gets to depend on the chosen target). + +The "Front End" +--------------- + +The job of the Slang front-end is to turn textual source code into a combination of code in our custom intermediate represnetation (IR) plus layout and binding information for shader parameters. + +### Lexing + +The first step in the compiler (after a source file has been loaded into memory) is to *lex* it. +The `Lexer` type is implement in `lexer.{h,cpp}` and produces `Token`s that represent the contents of the file on-demand as requested by the next phase of compilation. + +Each token stores a `TokenCode` that indicates the kind of token, the raw text of the token, and the location in the source code where it is located. +Source locations use a somewhat clever encoding to avoid being bloated (they are a single integer rather than separate file, line, and column fields). + +We don't make any attempt in the lexer to extract the actual value of integer and floating-point literals; we just store the raw text. +We also don't try to distinguish keywords from identifiers; keywords show up as ordinary identifier tokens. + +Much of the complexity (and inefficiency) in the current lexer is derived from the need to support C-isms like backspace line continuation, and special case rules like allowing `<>` to delimit a file name string after a `#include`. + +### Preprocessing + +The preprocessor (`Preprocessor`) in `preprocessor.{h,cpp}` deals with `#include` constructs, macro expansions, etc. +It pulls tokens from the lexer as needed (making sure to set flags to control the lexer behavior when required) and uses a limited lookahead to decide what to do with each token. + +The preprocessor maintains a stack of input streams, with the original source file at the bottom, and pushes entries for `#include`d files, macros to expand etc. + +Macro definitions store a sequence of already-lexed tokens, and expansion simply "replays" these tokens. +Expansion keeps a notion of an "environment" for looking up identifiers and mapping them to macro definitions. +Calling through to a function-style macro creates a fresh environment that maps the macro parameter names to pseudo-macros for the arguments. + +We still tokenize code in inactive preprocessor conditionals, but don't evaluate preprocessor directives inside inactive blocks (except those that may change the active/inactive state). +Preprocessor directives are each handled as a callback on the preprocessor state and are looked up by name; adding a new directive (if we ever had a reason to) is a fairly simple task. + +One important detail of the preprocessor is that it runs over a full source file at once and produces a flat array of `Token`s, so that there is no direct interaction between the parser and preprocessor. + +### Parsing + +The parser (`Parser` in `parser.{h,cpp}`) is mostly a straightforward recursive-descent parser. +Because the input is already tokenized before we start, we can use arbitrary lookahead, although we seldom look ahead more than one token. + +Traditionally, parsing of C-like languages requires context-sensitive parsing techniques to distinguish types from values, and deal with stuff like the C++ "most vexing parse." +Slang instead uses heuristic approaches: for example, when we encouter an `<` after an identifier, we first try parsing a generic argument list with a closing `>` and then look at the next token to determine if this looks like a generic application (in which case we continue from there) or not (in which case we backtrack). + +There are still some cases where we use lookup in the current environment to see if something is a type or a value, but officially we strive to support out-of-order declarations like most modern languages. +In order to achieve that goal we will eventually move to a model where we parse the bodies of declarations and functions in a later pass, after we have resolved names in the global scope. + +One important choice in the parser is that we strive to avoid hard-coding keywords as much as possible. +We already track an environment for C-like parsing, and we simply extend that so that we also look up declaration and statement keywords in the environment. +This means that most of the language "keywords" in Slang aren't keywords at all, and instead are just identifiers that happen to be bound to syntax in the default environment. +Syntax declarations are associated with a callback that is invoked to parse the construct they name. + +The design of treating syntax as ordinary declarations has a long-term motivation (we'd like to support a flexible macro system) but it also has short-term practical benefits. +It is easy for us to add new modifier keywords to the language without touching the lexer or parser (just adding them to the standard library), and we also don't have to worry about any of Slang's extended construct (e.g., `import`) breaking existing HLSL code that just happens to use one of those new keywords as a local variable name. + +What the parser produces is an abstract syntax tree (AST). +The AST currently uses a strongly-typed C++ class hierarchy with a "visitor" API generated via some ugly macro magic. +Dynamic casting using C++ RTTI is used in many places to check the class of an AST node; we aren't happy with this but also haven't had time to implement a better/faster solution. + +In the parsed AST, both types and expressions use the same representation (because in an expression like `A(B)` it is possible that `A` will resolve to a type, or to a function, and we don't know which yet). + +One slightly odd design choice in the parser is that it attaching lexical scoping information to the syntax nodes for identifiers, and any other AST node that need access to the scope/environment where it was defined. This is a choice we will probably change at some point, but it is deeply ingrained right now. + +### Semantic Checking + +The semantic checking step (`check.{h,cpp}`) is, not surprisingly, the most complicated and messiest bit of the compiler today. +The basic premise is simple: recursively walk the entire AST and apply semantic checking to each construct. + +Semantic checking applies to one translation unit at a time. +It has access to the list of entry points for the translation unit (so it can validate them), but it *not* allowed to depend on the compilation target(s) the user might have selected. + +Semantic checking of an expression or type term can yield the same AST node, with type information added, or it can return newly constructed AST needs (e.g., when an implicit cast needs to be inserted). +Unchecked identifiers or member references are always resolved to have a pointer to the exact declaration node they are referencing. + +Types are represented with a distinct class hierarchy from AST nodes, which is also used for a general notion of compile-time values which can be used to instantiate generic types/functions/etc. +An expression that ends up referring to a type will have a `TypeType` as its type, which will hold the actual type that the expression represents. + +The most complicated thing about semantic checking is that we strive to support out-of-order declarations, which means we may need to check a function declaration later in the file before checking a function body early in the file. +In turn, that function declaration might depend on a reference to a nested type declared somewhere else, etc. +We currently solve this issue by doing some amount of on-demand checking; when we have a reference to a function declaration and we need to know its type, we will first check if the function has been through semantic checking yet, and if not we will go ahead and recurisvely type check that function before we proceed. + +This kind of unfounded recursion can lead to real problems (especially when the user might write code with circular dependencies), so we have made some attempts to more strictly "phase" the semantic checking, but those efforts have not yet been done systematically. + +When code involved generics and/or interfaces, the semantic checking phase is responsible for ensuring that when a type claims to implement an interface it provides all of the requirements of that interface, and it records the mapping from requirements to their implementations for later use. Similarly, the body of a generic is checked to make sure it uses type parameters in ways that are consistent with their constraints, and the AST is amended to make it explicit when an interface requirement is being employed. + +### Lowering and Mandatory Optimizations + +The lowering step (`lower-to-ir.{h,cpp}`) is responsible for converting semantically valid ASTs into an intermediate representation that is more suitable for specialization, optimization, and code generaton. +The main thing that happens at this step is that a lot of the "sugar" in a high-level language gets baked out. For example: + +- A "member function" in a type will turn into an ordinary function that takes an initial `this` parameter +- A `struct` type nested in another `struct` will turn into an ordinary top-level `struct` +- Compound expressions will turn into sequences of instructions that bake the order of evaluation +- High-level control-flow statements will get resolved to a control-flow graph (CFG) of basic blocks + +The lowering step is done once for each translation unit, and like semantic checking it does *not* depend on any particular compilation target. +During this step we attach "mangled" names to any imported or exported symbols, so that each function overload, etc. has a unique name. + +After IR code has been generated for a translation unit (now called a "module") we next perform a set of "mandatory" optimizations, including SSA promotion and simple copy propagation and elmination of dead control-flow paths. +These optimizations are not primarily motivated by a desire to speed up code, but rather to ensure that certain "obvious" simplifications have been performed before the next step of validation. + +After the IR has been "optimized" we perform certain validation/checking tasks that would have been difficult or impossible to perform on the AST. +For example, we can validate that control flow never reached the end of a non-`void` function, and issue an error otherwise. +There are other validation tasks that can/should be performed at this step, although not all of them are currently implemented: + +- We should check that any `[unroll]` loops can actually be unrolled, by ensuring tha their termination conditions can be resolved to a compile-time constant (even if we don't know the constant yet) + +- We should check that any resource types are being used in ways that can be statically resolved (e.g., that the code never conditionally computes a resource to reference), since this is a requirement for all our current targets + +- We should check that the operands to any operation that requires a compile-time constant (e.g., the texel offset argument to certain `Sample()` calls) are passed values that end up being compile-time cosntants + +The goal is to eliminate any possible sources of failure in low-level code generation, without needing to have a global view of all the code in a program. +Any error conditions we have to push off until later starts to limit the value of our separate compilation support. + +### Parameter Binding and Type Layout + +The next phase of parameter binding (`parameter-binding.{h,cpp}`) is independednt of IR generation, and proceeds based on the AST that came out of semantic checking. +Parameter binding is the task of figuring out what locations/bindings/offsets should be given to all shader parameters referenced by the user's code. + +Parameter binding is done once for each target (because, e.g., Vulkan may bind parameters differently than D3D12), and it is done for the whole compile request (all translation units) rather than one at a time. +This is because when users compile something like HLSL vertex and fragment shaders in distinct translation units, they will often share the "same" parameter via a header, and we need to ensure that it gets just one locaton. + +At a high level, parameter binding starts by computing the *type layout* of each shader parameter. +A type layout describes the amount of registers/bindings/bytes/etc. that a type consumes, and also encodes the information needed to compute offsets/registers for individual `struct` fields or array elements. + +Once we know how much space each parameter consumes, we then inspect an explicit binding information (e.g., `register` modifiers) that are relevant for the target, and build a data structure to record what binding ranges are already consumed. +Finally, we go through any parameters without explicit binding information and assign them the next available range of the appropriate size (in a first-fit fashion). + +The parameter binding/layout information is what the Slang reflection API exposes. It is layered directly over the Slang AST so that it accurately reflects the program as the user wrote it, and not the result of lowering that program to our IR. + +This document describes parameter binding as a "front end" activity, but in practice it is something that could be done in the front-end, the back-end or both. +When shader code involves generic type parameters, complete layout information cannot be generated until the values of these parameters are fully known, and in practice that might not happen until the back end. + +### Serialization + +It is not yet fully implemented, but our intention is that the last thing the front-end does is to serialize the following information: + +- A stripped-down version of the checked AST for each translation unit including declarations/types, but not function bodies + +- The IR code for each translation unit + +- The binding/layout information for each target + +The above information is enough to type-check a subsequent module that `import`s code compile in the front-end, to link against its IR code, or to load and reflect type and binding information. + + +The "Back End" +-------------- + +The Slang back end logically starts with the user specifying: + +- An IR module, plus any necessary modules to link in and provide its dependencies + +- An entry point in that module, plus arguments for any generic parameters that entry point needs + +- A compilation target (e.g., SPIR-V for Vulkan) + +- Parameter binding/layout information for that module and entry point, computed for the chosen target + +We eventually want to support compiling multiple entry points in one pass of the back end, but for now it assumes a single entry point at a time + +### Linking and Target Specialization + +The first step we perform is to copy the chosen entry point and anything it depends on, recursively into a "fresh" IR module. +We make a copy of things so that any optimization/transformation passes we do for one target don't alter the code the front-end produced in ways that affect other targets. + +While copying IR code into the fresh module, we have cases where there might be multiple definitions of the same function or other entity. +In those cases, we apply "target specialization" to pick the definition that is the best for the chosen target. +This step is where we can select between, say, a built-in definition of the `saturate` function for D3D targets, vs. a hand-written one in the Slang standard library to use for GLSL-based targets. + +### API Legalization + +If we are targetting a GLSL-based platform, we need to translate "varying" shader entry point parameters into global variables used for cross-stage data passing. +We also need to translate any "system value" semantics into uses of the special built-in `gl_*` variables. + +We currently handle this kind of API-specific legalization quite early in the process, performing it right after linking. + +### Generic Specialization + +Once the concrete values for generic parameters are know we can set about specializing code to the known types. +We do this by cloning a function/type/whatever and substituting in the concrete arguments for the parameters. +This process can be continued as specializing one function may reveal opportunities to specialize others. + +During this step we also specialize away lookup of interface requirements through their witness tables, once generic witness-table parameters have been replaced with concrete witness tables. + +At the end of specialization, we should have code that makes no use of user-defined generics or interfaces. + +### Type Legalization + +While HLSL and Slang allow a single `struct` type to contain both "ordinary" data like a `float3` and "resources" like a `Texture2D`, the rules for GLSL and SPIR-V are more restrictive. +Ther are some additional wrinkles that arise for such "mixed" types, so we prefer to always "legalize" the types in the users code by replacing an aggregate type like: + +```hlsl +struct Material { float4 baseColor; Texture2D detailMap; }; +Material gMaterial; +``` + +with separate declarations for ordinary and resource fields: + +```hlsl +struct Material { float4 baseColor; } + +Material gMaterial; +Texture2D gMaterial_detailMap; +``` + +Changing the "shape" of a type like this (so that a single variable becomes more than one) needs to be done consistently across all declarations/functions in the program (hence why we do it after specialization, so that all concrete types are known). + +### Other Optimizations + +We dont' currently apply many other optimizations on the IR code in the back-end, under the assumption that the lower-level compilers below Slang will do some of the "heavy lifting." + +That said, there are certain optimizations that Slang must do eventually, for semantic completeness. One of the most important examples of these is implementing the sematncis of the `[unroll]` attribute, since we can't always rely on downstream compilers to have a capable unrolling implementation. + +We expect that over time it will be valuable for Slang to support a wider array of optimization passes, as long as they are ones that are considered "safe" to do above the driver interface, because they won't interfere with downstream optimization opportunities. + +### Emission + +Once we have transformed the IR code into something that should be legal for the chosen target, we emit high-level source code in either HLSL or GLSL. + +The emit logic is mostly just a scan over the IR code to emit a high-level declaration for each item: an IR structure type becomes a `struct` declaration, and IR function becomes a function definition, etc. + +In order to make the generated code a bit more readable, the Slang compiler currently does *not* emit declarations using their mangled names and instead tries to emit everything using a name based on how it was originally declared. + +To improve the readability of function bodies, the emit logic tries to find consecutive sequences of IR instructions that it can emit as a single high-level language expression. This reduces the number of temporaries in the output code, but we need to be careful about inserting parentheses to respect operator precedence, and also to not accidentally change the order of evaluation of code. + +When emitting a function body, we need to get from the low-level control flow graph (CFG) to high-level structured control-flow statements like `if`s and loops. We currently do this on a per-function basis during code emission, using an ad hoc algorithm based on control-flow structured information we stored in the IR. +A future version of the compiler might implement something more complete like the "Relooper" algorithm used by Emscripten. + +### Downstream Compiler Execution + +Once we have source code, we can invoke downstream compilers like fxc, dxc, and glslang to generate binary code (and optionally to disassemble that code for console output). + +The Slang compiler also supports a "pass through" mode where it skips most of the steps outlined so far and just passes text along to these downstream compilers directly. This is primarily intended as a debugging aid for developers working on Slang, since it lets you use the same command-line arguments to invoke both Slang compilation and compilation with these other compilers. + +Conclusion +---------- + +Hopefully this whirlwind introduction to the flow of the Slang compiler gives some idea of how the project fits together, and makes it easier to dive into the code and start being productive. |
