Diffstat (limited to 'docs/design')
| -rw-r--r-- | docs/design/autodiff.md | 6 |
| -rw-r--r-- | docs/design/autodiff/basics.md | 6 |
| -rw-r--r-- | docs/design/autodiff/decorators.md | 4 |
| -rw-r--r-- | docs/design/autodiff/ir-overview.md | 18 |
| -rw-r--r-- | docs/design/autodiff/types.md | 4 |
| -rw-r--r-- | docs/design/capabilities.md | 14 |
| -rw-r--r-- | docs/design/casting.md | 6 |
| -rw-r--r-- | docs/design/coding-conventions.md | 2 |
| -rw-r--r-- | docs/design/decl-refs.md | 2 |
| -rw-r--r-- | docs/design/existential-types.md | 2 |
| -rw-r--r-- | docs/design/interfaces.md | 22 |
| -rw-r--r-- | docs/design/ir.md | 10 |
| -rw-r--r-- | docs/design/overview.md | 26 |
| -rw-r--r-- | docs/design/semantic-checking.md | 16 |
| -rw-r--r-- | docs/design/serialization.md | 4 |
| -rw-r--r-- | docs/design/stdlib-intrinsics.md | 2 |
16 files changed, 70 insertions, 74 deletions
diff --git a/docs/design/autodiff.md b/docs/design/autodiff.md index 29d7c82c7..8bf26baa9 100644 --- a/docs/design/autodiff.md +++ b/docs/design/autodiff.md @@ -201,7 +201,7 @@ DP<float> f_SSA_Proped(DP<float> dpa, DP<float> dpb) } // Note here that we have to 'store' all the intermediaries - // _t1, _t2, _q4, _t3, _q5, _t3_d, _t4 and _q1. This is fundementally + // _t1, _t2, _q4, _t3, _q5, _t3_d, _t4 and _q1. This is fundamentally // the tradeoff between fwd_mode and rev_mode if (_b1) @@ -288,7 +288,7 @@ void f_SSA_Rev(inout DP<float> dpa, inout DP<float> dpb, float dout) } // Note here that we have to 'store' all the intermediaries - // _t1, _t2, _q4, _t3, _q5, _t3_d, _t4 and _q1. This is fundementally + // _t1, _t2, _q4, _t3, _q5, _t3_d, _t4 and _q1. This is fundamentally // the tradeoff between fwd_mode and rev_mode if (_b1) @@ -330,4 +330,4 @@ void f_SSA_Rev(inout DP<float> dpa, inout DP<float> dpb, float dout) } } -```
\ No newline at end of file +``` diff --git a/docs/design/autodiff/basics.md b/docs/design/autodiff/basics.md index 189260aff..43ed164ad 100644 --- a/docs/design/autodiff/basics.md +++ b/docs/design/autodiff/basics.md @@ -4,7 +4,7 @@ This documentation is intended for Slang contributors and is written from a comp ## What is Automatic Differentiation? -Before diving into the design of the automatic differentiation (for brevity, we will call it 'auto-diff') passes, it is important to understand the end goal of what auto-diff tries to acheive. +Before diving into the design of the automatic differentiation (for brevity, we will call it 'auto-diff') passes, it is important to understand the end goal of what auto-diff tries to achieve. The over-arching goal of Slang's auto-diff is to enable the user to compute derivatives of a given shader program or function's output w.r.t its input parameters. This critical compiler feature enables users to quickly use their shaders with gradient-based parameter optimization algorithms, which forms the backbone of modern machine learning systems. It enables users to train and deploy graphics systems that contain ML primitives (like multi-layer perceptron's or MLPs) or use their shader programs as differentiable primitives within larger ML pipelines. @@ -60,7 +60,7 @@ DifferentialPair<float> fwd_f(DifferentialPair<float> dpx) } ``` -Note that `(2 * x)` is the multiplier corresponding to $Df(x)$. We refer to $x$ and $f(x)$ as "*primal*" values and the pertubations $dx$ and $Df(x)\cdot dx$ as "*differential*" values. The reason for this separation is that the "*differential*" output values are always linear w.r.t their "*differential*" inputs. +Note that `(2 * x)` is the multiplier corresponding to $Df(x)$. We refer to $x$ and $f(x)$ as "*primal*" values and the perturbations $dx$ and $Df(x)\cdot dx$ as "*differential*" values. 
The reason for this separation is that the "*differential*" output values are always linear w.r.t their "*differential*" inputs. As the name implies, `DifferentialPair<T>` is a special pair type used by Slang to hold values and their corresponding differentials. @@ -256,7 +256,7 @@ void rev_f(inout DifferentialPair<float> dpx, inout DifferentialPair<float> dpy, Note that `rev_f` accepts derivatives w.r.t the output value as the input, and returns derivatives w.r.t inputs as its output (through `inout` parameters). `rev_f` still needs the primal values `x` and `y` to compute the derivatives, so those are still passed in as an input through the primal part of the differential pair. -Also note that the reverse-mode derivative function does not have to compute the primal result value (its return is void). The reason for this is a matter of convenience: reverse-mode derivatives are often invoked after all the primal fuctions, and there is typically no need for these values. We go into more detail on this topic in the checkpointing chapter. +Also note that the reverse-mode derivative function does not have to compute the primal result value (its return is void). The reason for this is a matter of convenience: reverse-mode derivatives are often invoked after all the primal functions, and there is typically no need for these values. We go into more detail on this topic in the checkpointing chapter. 
The reverse mode function can be used to compute both `dOutput/dx` and `dOutput/dy` with a single invocation (unlike the forward-mode case where we had to invoke `fwd_f` once for each input) diff --git a/docs/design/autodiff/decorators.md b/docs/design/autodiff/decorators.md index 626f8bc4c..27bf0e3d0 100644 --- a/docs/design/autodiff/decorators.md +++ b/docs/design/autodiff/decorators.md @@ -45,7 +45,7 @@ interface IFoo_after_checking_and_lowering ### `[TreatAsDifferentiable]` In large codebases where some interfaces may have several possible implementations, it may not be reasonable to have to mark all possible implementations with `[Differentiable]`, especially if certain implementations use hacks or workarounds that need additional consideration before they can be marked `[Differentiable]` -In such cases, we provide the `[TreatAsDifferentiable]` decoration (AST node: `TreatAsDifferentiableAttribute`, IR: `OpTreatAsDifferentiableDecoration`), which instructs the auto-diff passes to construct an 'empty' function that returns a 0 (or 0-equivalent) for the derivative values. This allows the signature of a `[TreatAsDifferentiable]` function to match a `[Differentiable]` requirment without actually having to produce a derivative. +In such cases, we provide the `[TreatAsDifferentiable]` decoration (AST node: `TreatAsDifferentiableAttribute`, IR: `OpTreatAsDifferentiableDecoration`), which instructs the auto-diff passes to construct an 'empty' function that returns a 0 (or 0-equivalent) for the derivative values. This allows the signature of a `[TreatAsDifferentiable]` function to match a `[Differentiable]` requirement without actually having to produce a derivative. ## Custom derivative decorators In many cases, it is desirable to manually specify the derivative code for a method rather than let the auto-diff pass synthesize it from the method body. 
This is usually desirable if: @@ -68,7 +68,7 @@ In some cases, we face the opposite problem that inspired custom derivatives. Th This frequently occurs with hardware intrinsic operations that are lowered into special op-codes that map to hardware units, such as texture sampling & interpolation operations. However, these operations do have reference 'software' implementations which can be used to produce the derivative. -To allow user code to use the fast hardward intrinsics for the primal pass, but use synthesized derivatives for the derivative pass, we provide decorators `[PrimalSubstitute(ref-fn)]` and `[PrimalSubstituteOf(orig-fn)]` (AST Node: `PrimalSubstituteAttribute`/`PrimalSubstituteOfAttribute`, IR: `OpPrimalSubstituteDecoration`), that can be used to provide a reference implementation for the auto-diff pass. +To allow user code to use the fast hardware intrinsics for the primal pass, but use synthesized derivatives for the derivative pass, we provide decorators `[PrimalSubstitute(ref-fn)]` and `[PrimalSubstituteOf(orig-fn)]` (AST Node: `PrimalSubstituteAttribute`/`PrimalSubstituteOfAttribute`, IR: `OpPrimalSubstituteDecoration`), that can be used to provide a reference implementation for the auto-diff pass. Example: ```C diff --git a/docs/design/autodiff/ir-overview.md b/docs/design/autodiff/ir-overview.md index a6b3ec207..83391e27f 100644 --- a/docs/design/autodiff/ir-overview.md +++ b/docs/design/autodiff/ir-overview.md @@ -17,7 +17,7 @@ At this step, there are 2 other variants that can appear `IRBackwardDifferentiat 4. This process from (1.) is run in a loop. This is because we can have nested differentiation requests such as `IRForwardDifferentiate(IRBackwardDifferentiate(a : IRFuncType))`. The inner request is processed in the first pass, and the outer request gets processed in the next pass. ## Auto-Diff Passes for `IRForwardDifferentiate` -For forward-mode derivatives, we only require a single pass implemented wholly in `ForwardDiffTranscriber`. 
This implementes the linearization algorithm, which roughly follows this logic: +For forward-mode derivatives, we only require a single pass implemented wholly in `ForwardDiffTranscriber`. This implements the linearization algorithm, which roughly follows this logic: 1. Create a clone of the original function 2. Perform pre-autodiff transformations, the most @@ -357,7 +357,7 @@ The unzipping pass uses the decorations from the linearization step to figure ou The separation process uses the following high-level logic: 1. Create two clones of all the blocks in the provided function (one for primal insts, one for differential insts), and hold a mapping between each original (mixed) block to each primal and differential block. The return statement of the current final block is **removed**. 2. Process each instruction of each block: instructions marked as **primal** are moved to the corresponding **primal block**, instructions marked **differential** are moved to the corresponding **differential block**. -3. Instructions marked **mixed** need op-specific handling, and so are dispatched to the appropriate splitting function. For instance, block parameters that are holding differential-pair values are split into parameters for holding primal and differential values (the exception is function parameters, which are not affected). Simlarly, `IRVar`s, `IRTerminatorInst`s (control-flow) and `IRCall`s are all split into multiple insts. +3. Instructions marked **mixed** need op-specific handling, and so are dispatched to the appropriate splitting function. For instance, block parameters that are holding differential-pair values are split into parameters for holding primal and differential values (the exception is function parameters, which are not affected). Similarly, `IRVar`s, `IRTerminatorInst`s (control-flow) and `IRCall`s are all split into multiple insts. 4. 
Except for `IRReturn`, all other control-flow insts are effectively duplicated so that the control-flow between the primal blocks and differential blocks both follow the original blocks' control-flow. The main difference is that PHI arguments are split (primal blocks carry primal values in their PHI arguments, and differential blocks carry diff values) between the two. Note that condition values (i.e. booleans) are used by both the primal and differential control-flow insts. However, since booleans are always primal values, they are always defined in the primal blocks. @@ -522,7 +522,7 @@ We synthesize a CFG that satisfies this property through the following steps: %da_rev = OpAdd %da_rev_1 %da_rev_2 : %float ``` - Derivative accumulation is acheived through two ways: + Derivative accumulation is achieved through two ways: **Within** a block, we keep a list all the reverse derivative insts for each inst and only **materialize** the total derivative when it is required as an operand. This is the most efficient way to do this, because we can apply certain optimizations for composite types (derivative of an array element, vector element, struct field, etc..). @@ -756,12 +756,12 @@ After AD passes, this results in the following code: { /*...*/ } ``` -4. Construct the reverse control-flow (`reveseCFGRegion()`) by going through the reference forward-mode blocks, and cloning the control-flow onto the reverse-mode blocks, but in reverse. This is acheived by running `reverseCFGRegion()` recursively on each sub-region, where a *region* is defined as a set of blocks with a single entry block and a single exit block. This definition of a region only works because we normalized the CFG into this form. +4. Construct the reverse control-flow (`reveseCFGRegion()`) by going through the reference forward-mode blocks, and cloning the control-flow onto the reverse-mode blocks, but in reverse. 
This is achieved by running `reverseCFGRegion()` recursively on each sub-region, where a *region* is defined as a set of blocks with a single entry block and a single exit block. This definition of a region only works because we normalized the CFG into this form. The reversal logic follows these general rules: 1. **Unconditional Branch**: For an unconditional branch from `A->B` we simply have to map the reverse version of B with that of A. i.e. `rev[B] -> rev[A]` 2. **If-Else**: For an if-else of the form `A->[true = T->...->T_last->M, false = F->...->F_last->M]`, we construct `rev[M]->[true = rev[T_last]->...->rev[T_last]->rev[A], false = rev[F_last]->...->rev[F]->rev[A]]`. That is, we reverse each sub-region, and start from the merge block and end at the split block. - Note that we need to identify `T_last` and `F_last` i.e. the last two blocks in the true and false regions. We make the last block in the region an additional return value of `reverseCFGRegion()`, so that when reversing the true and false sub-regions, we also get the relevent last block as an additional output. Also note that additional empty blocks may be inserted to carry derivatives of the phi arguments, but this does not alter the control-flow. + Note that we need to identify `T_last` and `F_last` i.e. the last two blocks in the true and false regions. We make the last block in the region an additional return value of `reverseCFGRegion()`, so that when reversing the true and false sub-regions, we also get the relevant last block as an additional output. Also note that additional empty blocks may be inserted to carry derivatives of the phi arguments, but this does not alter the control-flow. 3. **Switch-case**: Proceeds in exactly the same way as `if-else` reversal, but with multiple cases instead of just 2. 4. **Loop**: After normalization, all (non-trivial) loops are of the form: `A->C->[true = T->...->T_last->C, false=B->...->M]`. 
We reverse this loop into `rev[M]->...rev[B]->rev[C]->[true=rev[T_last]->...->rev[T]->rev[C], false=rev[A]]`. The actual reversal logic also handles some corner cases by inserting additional blank blocks to avoid situations where regions may share the same merge block. @@ -975,12 +975,12 @@ When storing values this way, we must consider that instructions within loops ca **Indexed Region Processing:** In order to be able to allocate the right array and use the right indices, we need information about which blocks are part of which loop (and loops can be nested, so blocks can be part of multiple loops). To do this, we run a pre-processing step that maps all blocks to all relevant loop regions, the corresponding index variables and the inferred iteration limits (maximum times a loop can run). Note that if an instruction appears in a nested block, we create a multi-dimensional array and use multiple indices. - **Loop State Variables:** Certain variables cannot be classified as recompute. Major examples are loop state variables which are defined as variables that are read from and written to within the loop. In practice, they appear as phi-variables on the first loop block after SSA simplification. Their uses _must_ be classifed as 'store', because recomputing them requires duplicating the primal loop within the differential loop. This is because the differential loop runs backwards so the state of a primal variable at loop index $N$ cannot be recomputed when the loop is running backwards ($N+1 \to N \to N-1$), and involves running the primal loop up to $N$ times within the current iteration of the differential loop. In terms of complexity, this turns an $O(N)$ loop into an $O(N^2)$ loop, and so we disallow this. + **Loop State Variables:** Certain variables cannot be classified as recompute. Major examples are loop state variables which are defined as variables that are read from and written to within the loop. 
In practice, they appear as phi-variables on the first loop block after SSA simplification. Their uses _must_ be classified as 'store', because recomputing them requires duplicating the primal loop within the differential loop. This is because the differential loop runs backwards so the state of a primal variable at loop index $N$ cannot be recomputed when the loop is running backwards ($N+1 \to N \to N-1$), and involves running the primal loop up to $N$ times within the current iteration of the differential loop. In terms of complexity, this turns an $O(N)$ loop into an $O(N^2)$ loop, and so we disallow this. It is possible that the resulting $O(N^2)$ loop may end up being faster in practice due to reduced memory requirements, but we currently lack the infrastructure to robustly allow such loop duplication while keeping the user informed of the potentially drastic complexity issues. 3. **Process 'Recompute' insts:** Insert a copy of the primal instruction into a corresponding 'recomputation' block that is inserted into the differential control-flow so that it dominates the use-site. - **Insertion of Recompute Blocks:** In order to accomodate recomputation, we first preprocess the function, by going through each **breakable (i.e. loop) region** in the differential blocks, looking up the corresponding **primal region** and cloning all the primal blocks into the beginning of the differential region. Note that this cloning process does not actually clone the instructions within each block, only the control-flow (i.e. terminator) insts. This way, there is a 1:1 mapping between the primal blocks and the newly created **recompute blocks**, This way, if we decide to 'recompute' an instruction, we can simply clone it into the corresponding recompute block, and we have a guarantee that the definition and use-site are within the same loop scope, and that the definition comes before the use. 
+ **Insertion of Recompute Blocks:** In order to accommodate recomputation, we first preprocess the function, by going through each **breakable (i.e. loop) region** in the differential blocks, looking up the corresponding **primal region** and cloning all the primal blocks into the beginning of the differential region. Note that this cloning process does not actually clone the instructions within each block, only the control-flow (i.e. terminator) insts. This way, there is a 1:1 mapping between the primal blocks and the newly created **recompute blocks**, This way, if we decide to 'recompute' an instruction, we can simply clone it into the corresponding recompute block, and we have a guarantee that the definition and use-site are within the same loop scope, and that the definition comes before the use. **Legalizing Accesses from Branches:** Our per-loop-region recompute blocks ensure that the recomputed inst is always within the same region as its uses, but it can still be out-of-scope if it is defined within a branch (i.e. if-else). We therefore still run a light-weight hoisting pass that detects these uses, inserts an `IRVar` at the immediate dominator of the def and use, and inserts loads and stores accordingly. Since they occur within the same loop region, there is no need to worry about arrays/indices (unlike the 'store' case). @@ -1363,7 +1363,7 @@ struct f_Intermediates }; -// After extraction: primal context funtion +// After extraction: primal context function float s_primal_ctx_f(float x, out f_Intermediates ctx) { // @@ -1459,4 +1459,4 @@ void outer_rev(DifferentialPair<float> dpx, float d_output) dpx = _dpx; } -```
\ No newline at end of file +``` diff --git a/docs/design/autodiff/types.md b/docs/design/autodiff/types.md index 2655b5c25..3860f0dfb 100644 --- a/docs/design/autodiff/types.md +++ b/docs/design/autodiff/types.md @@ -74,9 +74,9 @@ T.Differential dmul<S:__BuiltinRealType>(S s, T.Differential a) 5. During auto-diff, the compiler can sometimes synthesize new aggregate types. The most common case is the intermediate context type (`kIROp_BackwardDerivativeIntermediateContextType`), which is lowered into a standard struct once the auto-diff pass is complete. It is important to synthesize the `IDifferentiable` conformance for such types since they may be further differentiated (through higher-order differentiation). This implementation is contained in `fillDifferentialTypeImplementationForStruct(...)` and is roughly analogous to the AST-side synthesis. ### Differentiable Type Dictionaries -During auto-diff, the IR passes frequently need to perform lookups to check if an `IRType` is differentiable, and retreive references to the corresponding `IDifferentiable` methods. These lookups also need to work on generic parameters (that are defined inside generic containers), and existential types that are interface-typed parameters. +During auto-diff, the IR passes frequently need to perform lookups to check if an `IRType` is differentiable, and retrieve references to the corresponding `IDifferentiable` methods. These lookups also need to work on generic parameters (that are defined inside generic containers), and existential types that are interface-typed parameters. -To accomodate this range of different type systems, Slang uses a type dictionary system that associates a dictionary of relevant types with each function. This works in the following way: +To accommodate this range of different type systems, Slang uses a type dictionary system that associates a dictionary of relevant types with each function. This works in the following way: 1. 
When `CheckTerm()` is called on an expression within a function that is marked differentiable (`[Differentiable]`), we check if the resolved type conforms to `IDifferentiable`. If so, we add this type to the dictionary along with the witness to its differentiability. The dictionary is currently located on `DifferentiableAttribute` that corresponds to the `[Differentiable]` modifier. 2. When lowering to IR, we create a `DifferentiableTypeDictionaryDecoration` which holds the IR versions of all the types in the dictionary as well as a reference to their `IDifferentiable` witness tables. diff --git a/docs/design/capabilities.md b/docs/design/capabilities.md index a4f4fa396..b4bd4c099 100644 --- a/docs/design/capabilities.md +++ b/docs/design/capabilities.md @@ -31,7 +31,7 @@ struct Texture2D { ... - // Implicit-graident sampling operation. + // Implicit-gradient sampling operation. [availableFor(implicit_gradient_texture_fetches)] float4 Sample(SamplerState s, float2 uv); } @@ -54,7 +54,7 @@ capability fragment : implicit_gradient_texture_fetches; Here we've said that whenever the `fragment` capability is available, we can safely assume that the `implicit_gradient_texture_fetches` capability is available (but not vice versa). -Given even a rudientary tool like that, we can start to build up capabilities that relate closely to the "profiles" in things like D3D: +Given even a rudimentary tool like that, we can start to build up capabilities that relate closely to the "profiles" in things like D3D: ``` capability d3d; @@ -77,12 +77,12 @@ capability opengl : khronos; Here we are saying that `sm_5_1` supports everything `sm_5_0` supports, and potentially more. We are saying that `d3d12` supports `sm_6_0` but maybe not, e.g., `sm_6_3`. We are expressing that fact that having a `glsl_*` capability means you are on some Khronos API target, but that it doesn't specify which one. 
-(The extact details of these declarations obviously aren't the point; getting a good hierarchy of capabilites will take time.) +(The exact details of these declarations obviously aren't the point; getting a good hierarchy of capabilities will take time.) Capability Composition ---------------------- -Sometimes we'll want to give a distinct name to a specific combination of capabilties, but not say that it supports anything new: +Sometimes we'll want to give a distinct name to a specific combination of capabilities, but not say that it supports anything new: ``` capability ps_5_1 = sm_5_1 & fragment; @@ -129,7 +129,7 @@ For a given function definition `F`, the front end will scan its body and see wh If `F` doesn't have an `[availableFor(...)]` attribute, then we can derive its *effective* `[availableFor(...)]` capability as `R` (this probably needs to be expressed as an iterative dataflow problem over the call graph, to handle cycles). -If `F` *does* have one or more `[availabelFor(...)]` clauses that amount to a declared capability `C` (again in disjunctive normal form), then we can check that `C` implies `R` and error out if it is not the case. +If `F` *does* have one or more `[availableFor(...)]` clauses that amount to a declared capability `C` (again in disjunctive normal form), then we can check that `C` implies `R` and error out if it is not the case. A reasonable implementation would track which calls introduced which requirements, and be able to explain *why* `C` does not capture the stated requirements. For a shader entry point, we should check it as if it had an `[availableFor(...)]` that is the OR of all the specified target profiles (e.g., `sm_5_0 | glsl_450 | ...`) ANDed with the specified stage (e.g., `fragment`). 
@@ -152,7 +152,7 @@ It should be possible to define multiple versions of a function, having differen ``` For front-end checking, these should be treated as if they were a single definition of `myFunc` with an ORed capability (e.g., `vulkan | d3d12`). -Overload resoultion will pick the "best" candidate at a call site based *only* on the signatures of the function (note that this differs greatly from how profile-specific function overloading works in Cg). +Overload resolution will pick the "best" candidate at a call site based *only* on the signatures of the function (note that this differs greatly from how profile-specific function overloading works in Cg). The front-end will then generate initial IR code for each definition of `myFunc`. Each of the IR functions will have the *same* mangled name, but different bodies, and each will have appropriate IR decorations to indicate the capabilities it requires. @@ -213,7 +213,7 @@ Certain compositions of capabilities make no sense. If a user declared a functio Knowing that certain capabilities are disjoint can also help improve the overall user experience. If a function requires `(vulkan & extensionA) | (d3d12 & featureb)` and we know we are compiling for `vulkan` we should be able to give the user a pointed error message saying they need to ask for `extensionA`, because adding `featureB` isn't going to do any good. 
-As a first-pass model we could have a notion of `abstract` capabilities that are used to model the root of hierarcies of disjoint capabilities: +As a first-pass model we could have a notion of `abstract` capabilities that are used to model the root of hierarchies of disjoint capabilities: ``` abstract capability api; diff --git a/docs/design/casting.md b/docs/design/casting.md index 80c1f149f..6eafea1ac 100644 --- a/docs/design/casting.md +++ b/docs/design/casting.md @@ -146,9 +146,5 @@ The following code shows the change in behavior of 'as' is based on the source * SLANG_ASSERT(as<NamedExpression>(exprType) == nullptr); // dynamicCast is always the same object returned, so must match - SLANG_ASSERT(dynamcCast<NamedExpression(exprType) == exprType); + SLANG_ASSERT(dynamicCast<NamedExpression>(exprType) == exprType); ``` - - - -
\ No newline at end of file diff --git a/docs/design/coding-conventions.md b/docs/design/coding-conventions.md index 4223bee93..bc540783a 100644 --- a/docs/design/coding-conventions.md +++ b/docs/design/coding-conventions.md @@ -237,7 +237,7 @@ enum Note that the type name reflects the plural case, while the cases that represent individual bits are named with a singular prefix. -In public APIs, all `enum`s should use the style of separating the type defintion from the `enum`, and all cases should use `SCREAMING_SNAKE_CASE`: +In public APIs, all `enum`s should use the style of separating the type definition from the `enum`, and all cases should use `SCREAMING_SNAKE_CASE`: ```c++ typedef unsigned int SlangAxes; diff --git a/docs/design/decl-refs.md b/docs/design/decl-refs.md index 34b74a6f4..5c1958694 100644 --- a/docs/design/decl-refs.md +++ b/docs/design/decl-refs.md @@ -25,7 +25,7 @@ Why do we need `DeclRef`s? -------------------------- In a compiler for a simple language, we might represent a reference to a declaration as simply a pointer to the AST node for the declaration, or some kind of handle/ID that references that AST node. 
-A reprsentation like that will work in simple cases, for example: +A representation like that will work in simple cases, for example: ```hlsl struct Cell { int value }; diff --git a/docs/design/existential-types.md b/docs/design/existential-types.md index 06e2613e3..0f3469051 100644 --- a/docs/design/existential-types.md +++ b/docs/design/existential-types.md @@ -194,7 +194,7 @@ When dealing with a value type, though, we have to deal with things like making ``` interface IWritable { [mutating] void write(int val); } -stuct Cell : IWritable { int data; void write(int val) { data = val; } } +struct Cell : IWritable { int data; void write(int val) { data = val; } } T copyAndClobber<T : IWritable>(T obj) { diff --git a/docs/design/interfaces.md b/docs/design/interfaces.md index c0e284f59..b0c484327 100644 --- a/docs/design/interfaces.md +++ b/docs/design/interfaces.md @@ -13,7 +13,7 @@ Introduction The basic problem here is not unique to shader programming: you want to write code that accomplished one task, while abstracting over how to accomplish another task. As an example, we might want to write code to integrate incident radiance over a list of lights, while not concerning ourself with how to evaluate a reflectance function at each of those lights. -If we were doing this task on a CPU, and performance wasn't critical, we could probably handle this with higher-order functions or an equivalent mechansim like function pointers: +If we were doing this task on a CPU, and performance wasn't critical, we could probably handle this with higher-order functions or an equivalent mechanism like function pointers: float4 integrateLighting( Light[] lights, @@ -39,7 +39,7 @@ Depending on the scenario, we might be able to generate statically specialized c } Current shading languages support neither higher-order functions nor templates/generics, so neither of these options is viable. 
-Instead practicioners typically use preprocessor techniques to either stich together the final code, or to substitute in different function/type definitions to make a definition like `integrateLighting` reusable. +Instead practitioners typically use preprocessor techniques to either stich together the final code, or to substitute in different function/type definitions to make a definition like `integrateLighting` reusable. These ad hoc approaches actually work well in practice; we aren't proposing to replace them *just* to make code abstractly "cleaner." Rather, we've found that the ad hoc approaches end up interacting poorly with the resource binding model in modern APIs, so that *something* less ad hoc is required to achieve our performance goals. @@ -48,7 +48,7 @@ At that point, we might as well ensure that the mechanism we introduce is also a Overview -------- -The baisc idea for our approach is as follows: +The basic idea for our approach is as follows: - Start with the general *semantics* of a generic-based ("template") approach @@ -63,7 +63,7 @@ Interfaces ---------- An **interface** in Slang is akin to a `protocol` in Swift or a `trait` in Rust. -The choice of the `interface` keyword is to hilight the overlap with the conceptually similar construct that appeared in Cg, and then later in HLSL. +The choice of the `interface` keyword is to highlight the overlap with the conceptually similar construct that appeared in Cg, and then later in HLSL. ### Declaring an interface @@ -263,7 +263,7 @@ Then what should `BRDFParams` be? The two-parameter or six-parameter case? An **associated type** is a concept that solves exactly this problem. We don't care *what* the concrete type of `BRDFParams` is, so long as *every* implementation of `Material` has one. -The exact `BRDFParams` type can be different for each implementation of `Material`; the type is *assocaited* with a particular implementation. 
+The exact `BRDFParams` type can be different for each implementation of `Material`; the type is *associated* with a particular implementation. We will crib our syntax for this entirely from Swift, where it is verbose but explicit: @@ -276,7 +276,7 @@ We will crib our syntax for this entirely from Swift, where it is verbose but ex float3 evaluateBRDF(BRDFParams param, float3 wi, float3 wo); } -In this example we've added an assocaited type requirement so that every implementation of `Material` must supply a type named `BRDFParams` as a member. +In this example we've added an associated type requirement so that every implementation of `Material` must supply a type named `BRDFParams` as a member. We've also added a requirement that is a function to evaluate the BRDF given its parameters and incoming/outgoing directions. Using this declaration one can now define a generic function that works on any material: @@ -300,7 +300,7 @@ Some quick notes: - The use of `associatedtype` (for associated types) and `typealias` (for `typedef`-like definitions) as distinct keywords in Swift was well motivated by their experience (they used to use `typealias` for both). I would avoid having the two cases be syntactically identical. -- Swift has a pretty involved inference system where a type doesn't actually need to explicitly provide a type member with the chosen name. Instead, if you have a required method that takes or returns the assocaited type, then the compiler can infer what the type is by looking at the signature of the methods that meet other requirements. This is a complex and magical feature, and we shouldn't try to duplicate it. +- Swift has a pretty involved inference system where a type doesn't actually need to explicitly provide a type member with the chosen name. Instead, if you have a required method that takes or returns the associated type, then the compiler can infer what the type is by looking at the signature of the methods that meet other requirements. 
This is a complex and magical feature, and we shouldn't try to duplicate it. - Both Rust and Swift call this an "associated type." They are related to "virtual types" in things like Scala (which are in turn related to virtual classes in beta/gbeta). There are similar ideas that arise in Haskell-like languages with type classes (IIRC, the term "functional dependencies" is relevant). @@ -308,7 +308,7 @@ Some quick notes: I want to point out a few alternatives to the `Material` design above, just to show that associated types seem to be an elegant solution compared to the alternatives. -First, note that we could break `Material` into two interfaces, so long as we are allowed to place type constraints on assocaited types: +First, note that we could break `Material` into two interfaces, so long as we are allowed to place type constraints on associated types: interface BRDF { @@ -412,7 +412,7 @@ can in principle be desugared into: } with particular loss in what can be expressed. -The same desugaring appraoch should apply to global-scope functions that want to return an existential type (just with a global `typealias` instead of an `associatedtype`). +The same desugaring approach should apply to global-scope functions that want to return an existential type (just with a global `typealias` instead of an `associatedtype`). It might be inconvenient for the user to have to explicitly write the type-level expression that yields the result type (consider cases where C++ template metaprogrammers would use `auto` as a result type), but there is really no added power. @@ -434,12 +434,12 @@ The intent seems to be clear if we instead write: We could consider the latter to be sugar for the former, and allow users to write in familiar syntax akin to what ws already supported in Cg. 
-We'd have to be careful with such sugar, though, because there is a real and menaingful difference between saying: +We'd have to be careful with such sugar, though, because there is a real and meaningful difference between saying: - "`material` has type `Material` which is an interface type" - "`material` has type `M` where `M` implements `Material`" -In particular, if we start to work with assocaited types: +In particular, if we start to work with associated types: let b = material.evaluatePattern(...); diff --git a/docs/design/ir.md b/docs/design/ir.md index c7f4ffeb2..ba156c2f6 100644 --- a/docs/design/ir.md +++ b/docs/design/ir.md @@ -15,7 +15,7 @@ We will start by enumerating these goals (and related non-goals) explicitly so t * As a particular case of analysis and optimization, it should be possible to validate flow-dependent properties of an input function/program (e.g., whether an `[unroll]` loop is actually unrollable) using the IR, and emit meaningful error messages that reference the AST-level names/locations of constructs involved in an error. -* It should be posible to compile modules to the IR separately and then "link" them in a way that depends only on IR-level (not AST-level) constructs. We want to allow changing implementation details of a module without forcing a re-compile of IR code using that module (what counts as "implementation details") is negotiable. +* It should be possible to compile modules to the IR separately and then "link" them in a way that depends only on IR-level (not AST-level) constructs. We want to allow changing implementation details of a module without forcing a re-compile of IR code using that module (what counts as "implementation details") is negotiable. * There should be a way to serialize IR modules in a round-trip fashion preserving all of the structure. 
As a long-term goal, the serialized format should provide stability across compiler versions (working more as an IL than an IR) @@ -81,7 +81,7 @@ The only exception to this rule is instructions that represent literal constants The in-memory encoding places a few more restrictions on top of this so that, e.g., currently an instruction can either have operands of children, but not both. -Because everything that could be used as an operand is also an instruction, the operands of an instruction are stored in a highly uniform way as a contiguous array of `IRUse` values (even the type is continguous with this array, so that it can be treated as an additional operand when required). +Because everything that could be used as an operand is also an instruction, the operands of an instruction are stored in a highly uniform way as a contiguous array of `IRUse` values (even the type is contiguous with this array, so that it can be treated as an additional operand when required). The `IRUse` type maintains explicit links for use-def information, currently in a slightly bloated fashion (there are well-known techniques for reducing the size of this information). ### A Class Hierarchy Mirrored in Opcodes @@ -112,7 +112,7 @@ The idea doesn't really start in Swift, but rather in the existing observation t Like Swift, we do not use an explicit CPS representation, but instead find a middle ground of a traditional SSA IR where instead of phi instructions basic blocks have parameters. The first N instructions in a Slang basic block are its parameters, each of which is an `IRParam` instruction. -A block that would have had N phi instrutions now has N parameters, but the parameters do not have operands. +A block that would have had N phi instructions now has N parameters, but the parameters do not have operands. 
Instead, a branch instruction that targets that block will have N *arguments* to match the parameters, representing the values to be assigned to the parameters when this control-flow edge is taken. This encoding is equivalent in what it represents to traditional phi instructions, but nicely solves the problems outlined above: @@ -123,7 +123,7 @@ This encoding is equivalent in what it represents to traditional phi instruction - There is no special work required to track which phi operands come from which predecessor block, since the operands are attached to the terminator instruction of the predecessor block itself. There is no need to update phi instructions after a CFG change that might affect the predecessor list of a block. The trade-off is that any change in the *number* of parameters of a block now requires changes to the terminator of each predecessor, but that is a less common change (isolated to passes that can introduce or eliminate block parameters/phis). -- It it much more clear how to give an operational semantics to a "branch with arguments" instead of phi instructions: compute the target block, copy the argumenst to temporary storage (because of the simultaneity requirement), and then copy the temporaries over the parameters of the target block. +- It it much more clear how to give an operational semantics to a "branch with arguments" instead of phi instructions: compute the target block, copy the arguments to temporary storage (because of the simultaneity requirement), and then copy the temporaries over the parameters of the target block. The main caveat of this representation is that it requires branch instructions to have room for arguments to the target block. For an ordinary unconditional branch this is pretty easy: we just put a variable number of arguments after the operand for the target block. 
For branch instructions like a two-way conditional, we might need to encode two argument lists - one for each target block - and an N-way `switch` branch only gets more complicated. @@ -138,7 +138,7 @@ This constraint could be lifted at some point, but it is important to note that A traditional SSA IR represents a function as a bunch of basic blocks of instructions, where each block ends in a *terminator* instruction. Terminators are instructions that can branch to another block, and are only allowed at the end of a block. The potential targets of a terminator determine the *successors* of the block where it appears, and contribute to the *predecessors* of any target block. -The succesor-to-predecessor edges form a graph over the basic blocks called the control-flow graph (CFG). +The successor-to-predecessor edges form a graph over the basic blocks called the control-flow graph (CFG). A simple representation of a function would store the CFG explicitly as a graph data structure, but in that case the data structure would need to be updated whenever a change is made to the terminator instruction of a branch in a way that might change the successor/predecessor relationship. diff --git a/docs/design/overview.md b/docs/design/overview.md index c81853f1a..24c316038 100644 --- a/docs/design/overview.md +++ b/docs/design/overview.md @@ -11,7 +11,7 @@ Compilation is always performed in the context of a *compile request*, which bun Inside the code, there is a type `CompileRequest` to represent this. The user specifies some number of *translation units* (represented in the code as a `TranslationUnitRequest`) which comprise some number of *sources* (files or strings). -HLSL follows the traditional C model where a "translaiton unit" is more or less synonymous with a source file, so when compiling HLSL code the command-line `slangc` will treat each source file as its own translation unit. 
+HLSL follows the traditional C model where a "translation unit" is more or less synonymous with a source file, so when compiling HLSL code the command-line `slangc` will treat each source file as its own translation unit. For Slang code, the command-line tool will by default put all source files into a single translation unit (so that they represent a shared namespace0). The user can also specify some number of *entry points* in each translation unit (`EntryPointRequest`), which combines the name of a function to compile with the pipeline stage to compile for. @@ -23,7 +23,7 @@ It might not be immediately clear why we have such fine-grained concepts as this The "Front End" --------------- -The job of the Slang front-end is to turn textual source code into a combination of code in our custom intermediate represnetation (IR) plus layout and binding information for shader parameters. +The job of the Slang front-end is to turn textual source code into a combination of code in our custom intermediate representation (IR) plus layout and binding information for shader parameters. ### Lexing @@ -60,7 +60,7 @@ The parser (`Parser` in `parser.{h,cpp}`) is mostly a straightforward recursive- Because the input is already tokenized before we start, we can use arbitrary lookahead, although we seldom look ahead more than one token. Traditionally, parsing of C-like languages requires context-sensitive parsing techniques to distinguish types from values, and deal with stuff like the C++ "most vexing parse." -Slang instead uses heuristic approaches: for example, when we encouter an `<` after an identifier, we first try parsing a generic argument list with a closing `>` and then look at the next token to determine if this looks like a generic application (in which case we continue from there) or not (in which case we backtrack). 
+Slang instead uses heuristic approaches: for example, when we encounter an `<` after an identifier, we first try parsing a generic argument list with a closing `>` and then look at the next token to determine if this looks like a generic application (in which case we continue from there) or not (in which case we backtrack). There are still some cases where we use lookup in the current environment to see if something is a type or a value, but officially we strive to support out-of-order declarations like most modern languages. In order to achieve that goal we will eventually move to a model where we parse the bodies of declarations and functions in a later pass, after we have resolved names in the global scope. @@ -97,7 +97,7 @@ An expression that ends up referring to a type will have a `TypeType` as its typ The most complicated thing about semantic checking is that we strive to support out-of-order declarations, which means we may need to check a function declaration later in the file before checking a function body early in the file. In turn, that function declaration might depend on a reference to a nested type declared somewhere else, etc. -We currently solve this issue by doing some amount of on-demand checking; when we have a reference to a function declaration and we need to know its type, we will first check if the function has been through semantic checking yet, and if not we will go ahead and recurisvely type check that function before we proceed. +We currently solve this issue by doing some amount of on-demand checking; when we have a reference to a function declaration and we need to know its type, we will first check if the function has been through semantic checking yet, and if not we will go ahead and recursively type check that function before we proceed. 
This kind of unfounded recursion can lead to real problems (especially when the user might write code with circular dependencies), so we have made some attempts to more strictly "phase" the semantic checking, but those efforts have not yet been done systematically. @@ -105,7 +105,7 @@ When code involved generics and/or interfaces, the semantic checking phase is re ### Lowering and Mandatory Optimizations -The lowering step (`lower-to-ir.{h,cpp}`) is responsible for converting semantically valid ASTs into an intermediate representation that is more suitable for specialization, optimization, and code generaton. +The lowering step (`lower-to-ir.{h,cpp}`) is responsible for converting semantically valid ASTs into an intermediate representation that is more suitable for specialization, optimization, and code generation. The main thing that happens at this step is that a lot of the "sugar" in a high-level language gets baked out. For example: - A "member function" in a type will turn into an ordinary function that takes an initial `this` parameter @@ -116,29 +116,29 @@ The main thing that happens at this step is that a lot of the "sugar" in a high- The lowering step is done once for each translation unit, and like semantic checking it does *not* depend on any particular compilation target. During this step we attach "mangled" names to any imported or exported symbols, so that each function overload, etc. has a unique name. -After IR code has been generated for a translation unit (now called a "module") we next perform a set of "mandatory" optimizations, including SSA promotion and simple copy propagation and elmination of dead control-flow paths. +After IR code has been generated for a translation unit (now called a "module") we next perform a set of "mandatory" optimizations, including SSA promotion and simple copy propagation and elimination of dead control-flow paths. 
These optimizations are not primarily motivated by a desire to speed up code, but rather to ensure that certain "obvious" simplifications have been performed before the next step of validation. After the IR has been "optimized" we perform certain validation/checking tasks that would have been difficult or impossible to perform on the AST. For example, we can validate that control flow never reached the end of a non-`void` function, and issue an error otherwise. There are other validation tasks that can/should be performed at this step, although not all of them are currently implemented: -- We should check that any `[unroll]` loops can actually be unrolled, by ensuring tha their termination conditions can be resolved to a compile-time constant (even if we don't know the constant yet) +- We should check that any `[unroll]` loops can actually be unrolled, by ensuring that their termination conditions can be resolved to a compile-time constant (even if we don't know the constant yet) - We should check that any resource types are being used in ways that can be statically resolved (e.g., that the code never conditionally computes a resource to reference), since this is a requirement for all our current targets -- We should check that the operands to any operation that requires a compile-time constant (e.g., the texel offset argument to certain `Sample()` calls) are passed values that end up being compile-time cosntants +- We should check that the operands to any operation that requires a compile-time constant (e.g., the texel offset argument to certain `Sample()` calls) are passed values that end up being compile-time constants The goal is to eliminate any possible sources of failure in low-level code generation, without needing to have a global view of all the code in a program. Any error conditions we have to push off until later starts to limit the value of our separate compilation support. 
### Parameter Binding and Type Layout -The next phase of parameter binding (`parameter-binding.{h,cpp}`) is independednt of IR generation, and proceeds based on the AST that came out of semantic checking. +The next phase of parameter binding (`parameter-binding.{h,cpp}`) is independent of IR generation, and proceeds based on the AST that came out of semantic checking. Parameter binding is the task of figuring out what locations/bindings/offsets should be given to all shader parameters referenced by the user's code. Parameter binding is done once for each target (because, e.g., Vulkan may bind parameters differently than D3D12), and it is done for the whole compile request (all translation units) rather than one at a time. -This is because when users compile something like HLSL vertex and fragment shaders in distinct translation units, they will often share the "same" parameter via a header, and we need to ensure that it gets just one locaton. +This is because when users compile something like HLSL vertex and fragment shaders in distinct translation units, they will often share the "same" parameter via a header, and we need to ensure that it gets just one location. At a high level, parameter binding starts by computing the *type layout* of each shader parameter. A type layout describes the amount of registers/bindings/bytes/etc. that a type consumes, and also encodes the information needed to compute offsets/registers for individual `struct` fields or array elements. @@ -190,7 +190,7 @@ This step is where we can select between, say, a built-in definition of the `sat ### API Legalization -If we are targetting a GLSL-based platform, we need to translate "varying" shader entry point parameters into global variables used for cross-stage data passing. +If we are targeting a GLSL-based platform, we need to translate "varying" shader entry point parameters into global variables used for cross-stage data passing. 
We also need to translate any "system value" semantics into uses of the special built-in `gl_*` variables. We currently handle this kind of API-specific legalization quite early in the process, performing it right after linking. @@ -208,7 +208,7 @@ At the end of specialization, we should have code that makes no use of user-defi ### Type Legalization While HLSL and Slang allow a single `struct` type to contain both "ordinary" data like a `float3` and "resources" like a `Texture2D`, the rules for GLSL and SPIR-V are more restrictive. -Ther are some additional wrinkles that arise for such "mixed" types, so we prefer to always "legalize" the types in the users code by replacing an aggregate type like: +There are some additional wrinkles that arise for such "mixed" types, so we prefer to always "legalize" the types in the users code by replacing an aggregate type like: ```hlsl struct Material { float4 baseColor; Texture2D detailMap; }; @@ -230,7 +230,7 @@ Changing the "shape" of a type like this (so that a single variable becomes more We dont' currently apply many other optimizations on the IR code in the back-end, under the assumption that the lower-level compilers below Slang will do some of the "heavy lifting." -That said, there are certain optimizations that Slang must do eventually, for semantic completeness. One of the most important examples of these is implementing the sematncis of the `[unroll]` attribute, since we can't always rely on downstream compilers to have a capable unrolling implementation. +That said, there are certain optimizations that Slang must do eventually, for semantic completeness. One of the most important examples of these is implementing the semantics of the `[unroll]` attribute, since we can't always rely on downstream compilers to have a capable unrolling implementation. 
We expect that over time it will be valuable for Slang to support a wider array of optimization passes, as long as they are ones that are considered "safe" to do above the driver interface, because they won't interfere with downstream optimization opportunities. diff --git a/docs/design/semantic-checking.md b/docs/design/semantic-checking.md index 0617aa0ab..10ddd5142 100644 --- a/docs/design/semantic-checking.md +++ b/docs/design/semantic-checking.md @@ -22,7 +22,7 @@ Checking Terms ### Some Terminology for Terms -We use the word "term" to refer genericaly to something that can be evaluated to produce a result, but where we do not yet know if the result will be a type or a value. For example, `Texture2D` might be a term that results in a type, while `main` might be a term that results in a value (of function type), but both start out as a `NameExpr` in the AST. Thus the AST uses the class hierarchy under `Expr` to represent terms, whether they evaluate to values or types. +We use the word "term" to refer generically to something that can be evaluated to produce a result, but where we do not yet know if the result will be a type or a value. For example, `Texture2D` might be a term that results in a type, while `main` might be a term that results in a value (of function type), but both start out as a `NameExpr` in the AST. Thus the AST uses the class hierarchy under `Expr` to represent terms, whether they evaluate to values or types. There is also the `Type` hierarchy, but it is important to understand that `Type` represents types as their logical immutable selves, while `Expr`s that evaluate to types are *type expressions* which can be concretely pointed to in the user's code. Type expressions have source locations, because they represent something the user wrote in their code, while `Type`s don't have singular locations by default. 
@@ -67,7 +67,7 @@ If we can't reasonably form an expression to return *at all* then we will return These classes are designed to make sure that subsequent code won't crash on them (since we have non-null objects), but to help avoid cascading errors. Some semantic checking logic will detect `ErrorType`s on sub-expressions and skip its own checking logic (e.g., this happens for function overload resolution), producing an `ErrorType` further up. In other cases, expressions with `ErrorType` can be silently consumed. -For example, an errorneous expression is implicitly convertible to *any* type, which means that assignment of an error expression to a local variable will always succeed, regardless of variable's type. +For example, an erroneous expression is implicitly convertible to *any* type, which means that assignment of an error expression to a local variable will always succeed, regardless of variable's type. ### Overload Resolution @@ -139,14 +139,14 @@ Checking of declarations is the most complicated and involved part of semantic c Simple approaches to semantic checking of declarations fall into two camps: -1. One can define a total ordering on declarations (usually textual order in the source file) and only allow dependecies to follow that order, so that checking can follow the same order. This is the style of C/C++, which is inherited from the legacy of traditional single-pass compilers. +1. One can define a total ordering on declarations (usually textual order in the source file) and only allow dependencies to follow that order, so that checking can follow the same order. This is the style of C/C++, which is inherited from the legacy of traditional single-pass compilers. 2. One can define a total ordering on *phases* of semantic checking, so that every declaration in the file is checked at phase N before any is checked at phase N+1. 
E.g., the types of all variables and functions must be determined before any expressions that use those variables/functions can be checked. This is the style of, e.g., Java and C#, which put a premium on defining context-free languages that don't dictate order of declaration. Slang tries to bridge these two worlds: it has inherited features from HLSL that were inspired by C/C++, while it also strives to support out-of-order declarations like Java/C#. -Unsurprisngly, this leads to unique challenges. +Unsurprisingly, this leads to unique challenges. -Supporting out-of-order declarations meeans that there is no simple total order on declarations (we can have mutually recursive function or type declarations), and supporting generics with value parameters means there is no simple total order on phases. +Supporting out-of-order declarations means that there is no simple total order on declarations (we can have mutually recursive function or type declarations), and supporting generics with value parameters means there is no simple total order on phases. For that last part observe that: * Resolving an overloaded function call requires knowing the types of the parameters for candidate functions. @@ -191,13 +191,13 @@ As a programmer contributing to the semantic checking infrastructure, the declar Name Lookup ----------- -Lookup is the processing of resolving the contextual meaning of names either in a lexical scope (e.g., the user wrote `foo` in a function body - what does it refer to?) or in the scope of scome type (e.g., the user wrote `obj.foo` for some value `obj` of type `T` - what does it refer to?). +Lookup is the processing of resolving the contextual meaning of names either in a lexical scope (e.g., the user wrote `foo` in a function body - what does it refer to?) or in the scope of some type (e.g., the user wrote `obj.foo` for some value `obj` of type `T` - what does it refer to?). Lookup can be tied to semantic analysis quite deeply. 
In order to know what a member reference like `obj.foo` refers to, we not only need to know the type of `obj`, but we may also need to know what interfaces that type conforms to (e.g., it might be a type parameter `T` with a constraint `T : IFoo`). In order to support lookup in the presence of our declaration-checking strategy described above, the lookup logic may be passed a `SemanticsVisitor` that it can use to `ensureDecl()` declarations before it relies on their properties. -However, lookup also currently gets used during parsing, and in those cases it may need ot be applied without access to the semantics-checking infrastructure (since we currently separate parsing and semantic analysis). +However, lookup also currently gets used during parsing, and in those cases it may need to be applied without access to the semantics-checking infrastructure (since we currently separate parsing and semantic analysis). In those cases a null `SemanticsVisitor` is passed in, and the lookup process will avoid using lookup approaches that rely on derived semantic information. This is fine in practice because the main thing that gets looked up during parsing are names of `SyntaxDecl`s (which are all global) and also global type/function/variable names. @@ -210,7 +210,7 @@ Just like a C/C++ parser, the Slang parser sometimes needs to disambiguate wheth Ideally the way forward is some combination of the following two strategies: -* We should strive to make parsing at the "global scope" fully context-insensitive (e.g., by using similar lookahead heuristics to C#). We are already close to this goal today, but will need to be careful that we do not introduce regressions compared to the old parser (perhaps a "compatiblity" mode for legacy HLSL code is needed?) +* We should strive to make parsing at the "global scope" fully context-insensitive (e.g., by using similar lookahead heuristics to C#). 
We are already close to this goal today, but will need to be careful that we do not introduce regressions compared to the old parser (perhaps a "compatibility" mode for legacy HLSL code is needed?) * We should delay the parsing of nested scopes (both function and type bodies bracketed with `{}`) until later steps of the compiler. Ideally, parsing of function bodies can be done in a context-sensitive manner that interleaves with semantic checking, closer to the traditional C/C++ model (since we don't care about out-of-order declarations in function bodies). diff --git a/docs/design/serialization.md b/docs/design/serialization.md index c05c60ad8..008fd6da6 100644 --- a/docs/design/serialization.md +++ b/docs/design/serialization.md @@ -24,7 +24,7 @@ We could imagine a mechanism that saved off each instance, by writing off the ad * If we try to read back on a different machine, with a different pointer size, the object layout will be incompatible * If we try to read back on the same machine where the source is compiled by a different compiler, the object layout might be incompatible (say bool or enum are different size) -* Endianess might be different +* Endianness might be different * Knowing where all the pointers are and what they point to and therefore what to serialize is far from simple. * The alignment of types might be different across different processors and different compilers @@ -304,7 +304,7 @@ Riff Container [Riff](https://en.wikipedia.org/wiki/Resource_Interchange_File_Format) is used as a mechanism to store binary sections. The format allows for a hierarchy of `chunks` that hold binary data. How the data is interpreted depends on the [FOURCC](https://en.wikipedia.org/wiki/FourCC) associated with each chunk. -As previously touched on there are multiple different mechanisms used for serialization. IR serialization, generalized serialization, SourceLoc serialization - there are also other uses, such as serializing of entry point information. 
Riff is used to combine all of these incompatible binary parts togther such that they can be stored together. +As previously touched on there are multiple different mechanisms used for serialization. IR serialization, generalized serialization, SourceLoc serialization - there are also other uses, such as serializing of entry point information. Riff is used to combine all of these incompatible binary parts together such that they can be stored together. The handling of these riff containers is held within the `SerialContainerUtil` class. diff --git a/docs/design/stdlib-intrinsics.md b/docs/design/stdlib-intrinsics.md index 6e86f4c3f..a9369138d 100644 --- a/docs/design/stdlib-intrinsics.md +++ b/docs/design/stdlib-intrinsics.md @@ -21,7 +21,7 @@ Looking at these files will demonstrate the features in use. Most of the intrinsics and attributes have names that indicate that they are not for normal use. This is typically via a `__` prefix. -The `.meta.slang` files look largely like Slang source files, but their contents can also be generated programatically with C++ code. A section of code can drop into `C++` code if it is proceeded by `${{{{`. The C++ section is closed with a closing `}}}}`. This mechanism is typically used to generate different versions of a similar code sequence. Values from the C++ code can be accessed via the `$()`, where the contents of the brackets specifies something that can be calculated from within the C++ code. +The `.meta.slang` files look largely like Slang source files, but their contents can also be generated programmatically with C++ code. A section of code can drop into `C++` code if it is proceeded by `${{{{`. The C++ section is closed with a closing `}}}}`. This mechanism is typically used to generate different versions of a similar code sequence. Values from the C++ code can be accessed via the `$()`, where the contents of the brackets specifies something that can be calculated from within the C++ code. 
As an example, to produce an an array with values 0 to 9 we could write...
