summaryrefslogtreecommitdiff
path: root/source/slang/slang-check-expr.cpp
AgeCommit message (Collapse)Author
2021-03-10A bunch of overlapping semantic-checking fixes (#1743)Tim Foley
This change originally started with the simple goal of allowing generic functions with default argument values on their parameters to work: ``` void someFunction<T>(T value, int optional = 0); ``` The core problem there was that the compiler code was (correctly) anticipate the case where the default argument value for a parameter depends on a generic parameter, such as: ``` interface IDefaultable { static This getDefault(); } void anotherFunction<T : IDefaultable>(T first, T second = T.getDefault()); ``` Supporting this latter case requires some kind of ability to apply subsitutions to an `Expr`, but our compiler logic simply errored out in that case. The first major fix that went into this change was to add a new `SubstExpr<T>` type that behaves a lot like `DeclRef<T>` in that it stores a `T*` plus a set of substititions that need to be applied to it. In addition, it was found that even if `anotherFunction<ConcreteType>(...)` might work, when generic argument inference was used for just `anotherFunction(...)` would fail because it includes a strict match on the number of arguments/parameters in the call expression. The next problem that arose was that the test I'd created used an interace with an `__init` requirement, and it appeared that our code generation didn't work for that case: ``` interface IStuff { __init(int val); } void f<T : IStuff>(T x = T(0)); ``` In this case, the `T(0)` initialization would get compiled to `(ConcreteType) 0` in the output rather than calling the function generated for the `__init` inside `ConcreteType`. The basic problem there was a bit of crufty old logic we have in place to work around the large number of `__init` declarations in the stdlib that don't have proper `__intrinsic_op` modifiers on them. We really need to fix the underlying problem there, but I worked around it by having the IR lowering pass only do its workaround magic on stdlib declarations. The next problem down this line was that my test had two different `__init` declarations in the concrete type and the logic for checking interface conformance was picking the wrong one to satisfying an interface requirement despite it being obviously wrong (not even the right number of parameter). This last problem led me down the rabbit-hole of trying to actually get our semantic checking for interface requirements right. There were a few pieces to that work: * Actually checking that the parameter and result types for two callables match is the simple part. If that was all that would be required we would have implement this logic a long time ago. * Next we have to deal with functions that make use of the `This` type, associated types, etc. We have to know that when the interface uses `This`, we want to treat that as equivalent to `ConcreteType`, and similarly for associated types. Getting that working is mostly a matter of setting up a this-type subsitution for the interface member being checked. * Finally, when comparing generic declarations like `IBase::doThing<T>` and `Derived::doThing<U>` we need to deal with the way that `T` and `U` represent the "same" logical type parameter, but are distinct `Decl`s. This is handled by specializing the base declaration to the parameters of the derived one (e.g., forming `IBase::doThing<U>` using the `U` from `Derived::doThing`). The result seems to be passing our tests, but there are still a few gotchas lurking, I'm sure.
2021-03-01Doc improvements (#1729)jsmall-nvidia
* #include an absolute path didn't work - because paths were taken to always be relative. * Split out AST 'printing'. * Replace listener with List<Section> * Section -> Part. * Kind -> Type Flags -> Kind for ASTPrinter::Part * Improve comments around ASTPrinter. * toString -> toText on Val derived types. toText appends to a StringBuilder. * Added toSlice free function. Added operator<< for Val derived types. Use << where appropriate in doing toText. * More work at mark down output. * Fill in sourceloc for enum case. Add more sophisticated location determination for EnumCase. Refactored documentation output into DocMarkdownWriter. * Improvements for sig output.
2021-02-05Initial implementation of interface conjunctions (#1691)Tim Foley
The basic feature here is the ability to use the `&` operator to produce the conjunction/intersection of two interfaces. That is, you can have interfaces: interface IFirst { int getFirst(); } interface ISecond { int getSecoond(); } and if you need a generic function where the type parameter `T` must conform to *both* of these interfaces, you express that by constraining the parameter to the intersection of the interfaces: void someFunction<T : IFirst & ISecond>(T value) { ... } Without this feature, the main alternative an application would have is to define an intermediate interface, like: interface IBoth : IFirst, ISecond {} Forcing users to deal with an intermediate interface creates more work for type authors (they need to remember to inherit from the right combined interface(s)), or for `extension` authors (when you add `ISecond` to a type that used to just support `IFirst`, you had better also add `IBoth`). In the worst case, a family of N related "leaf" interfaces would give rise to an exponential number of intermediate interfaces to represnt the possible combinations. A conjunction like `IFirst & ISecond` is officially its own type, and can be used to declare a type alias: typealias IBoth = IFirst & ISecond; This change only includes the first pass of work on this feature, so there are several caveats to be aware of: * Using a conjunction as part of an inheritance clause is not yet supported (e.g., `struct X : IFirst & ISecond`). This is true even if the conjunction was introduced by an intermediate `typealias` * The `&` syntax introduced here is only parsed in places where only a type (not an expression) is possible. This means you cannot do things like cast to a conjunction with `(IFirst & ISecond)(someValue)`. * This work *should* apply to conjunctions of more than two interfaces (like `IA & IB & IC`) but that has not yet been tested * In the long run it may be sensible to allow conjunctions that use concrete types, but we really ought to have the semantic checking logic rule that out for now. * During testing, I encountered compiler crashes when trying to use this feature together with `property` declarations. Further investigation and debugging is called for. * The handling of conjunction types is currently incomplete, in that there are many equivalences the compiler does not yet understand. For example, it is clear that `IA & IB` is equivalent to `IB & IA`, but the compiler currently does not understand this and will treat them as different types. A deeper implementation approach is called for. * Conjunctions are currently only supported for generic type parameter constraints, when performing full specialization. Use of conjunctions for existential-type value parameters or with dynamic dispatch is not yet supported.
2020-11-19Fix constant folding in attributes (#1610)Yong He
* Fix constant folding in attributes * remove unnecessary change * remove unnecessary change * remove unnecessary change * Fixed circular checking issue. * cleanup * more cleanup * minimize diff * minimize diff * minimize diff
2020-08-27Clean up the way that lookup "through" a base type is encoded (#1519)Tim Foley
* Clean up the way that lookup "through" a base type is encoded In order to undestand this change, it is important to undestand how lookup through base interfaces works prior to this change. In order to understand *that* it helps to be reminded of how inheritance relationships get encoded in the AST. Suppose the user writes: struct Base { int val; } struct Derived : Base { ... } ... Derived d = ...; int v = d.val; The question is how an expression like `d.val` gets semantically checked, and how it is encoded into the IR after semantic checking. You might assume it gets checked and encoded so that we end up with: int v = ((Base) d).val; and that seems like it should Just Work... so of course that isn't what Slang has been doing. Instead, we relied on the fact that the inheritance relationship `Derived : Base` is represented as an `InheritanceDecl` member of the `Derived` type, and we ended up checking the code into something like: int v = d.<anonymous>.val; where `<anonymous>` stands in for the name of the `InheritanceDecl` that represents inheritance from `Base`. This design choice makes a limited amount of sense when you consider how inheritance would typically be lowered to a C-like output language: // struct Derived : Base { ... } // => struct Derived { Base base; ... } The problem with that encoding is that it really doesn't make sense for almost any other scenario. In particular, if you have a generic type parameter `T` that was constrianed with `T : ISomething`, then the constraint isn't even technically a *member* of the type parameter `T`, so expressing thing as a member reference in the AST is completely incorrect. Unfortunately, by the time it was clear that we needed something better, a bunch of implementation work was done based on the existing representation. This change tries to clean things up so that lookup of a super-type member through a value of a sub-type does the obvious thing: cast the value to the super-type and then look up the member (as in `((Base) d).val`). The core of the change is that in lookup, instead of creating `Constraint` breadcrumbs whenever we are looking up in a super-type (with a reference to the `TypeConstraintDecl` being used) we instead use `SuperType` breadcrumbs (with a reference to a `SubtypeWitness`). Then when we create the expression from a `LookupResultItem`, we translate any `SuperType` breadcrumbs into `CastToSuperTypeExpr`s (an expression type that already existed). This change also adds support for lookup through the `This` type in the context of an interface, and in order for that to work we need a new kind of subtype witness to represent the knowledge that a `This` type is a subtype of the enclosing interface. Making that work forces us to change the representation of `TransitiveSubtypeWitness` so that it takes a pair of subtype witnesses (and not one subtype witness plus one `TypeConstraintDecl`). For the most part this is a small change, but it raises the possibility that some pieces of the code aren't going to be robust against all possible shapes of subtype witnesses. The IR lowering logic has relied on the weird `d.<anonymous>` representation in order to ensure that when looking up interface members we weren't always casting to the interface type (which would create a `makeExistential` instruction), and then calling using that. Basically, the IR lowering would ignore the `d.<anonymous>` part and just emit `d`, but we can't do that for `((Base) d)` or `((IThing) d)` because whehter or not we should actually perform the cast depends on context. For now we solve that problem by adding specific logic to ignore up-casts to interface types when they appear in member expressions or method calls. A more robust solution might be needed down the line, but this seems to work in practice. All of this work is cleanup that I found was needed in order to make `extension`s of `interface` types workable. * fixup: disable an incorrect test
2020-08-21Another fix for overriding property decls (#1509)Tim Foley
* Another fix for overriding property decls The central problem we keep running into with `property` decls in `interface`s comes down to two choices: 1. When a member lookup `obj.someName` or a simple lookup for `someName` produces an overloaded result, we make no attempt to resolve the overloading right away, and instead postpone disambiguation until the point where that expression gets *used*, in case the context where it gets used can help in disambiguation (a notable case being when there is a call expression `obj.someName(...)` or `someName(...)`). 2. When looking up members in a the scope of a type (either for `obj.someName` or `someName` in the context of a method), we include all results from base types in the set of overloads returned, even in cases where the type has a direct member that "overrides" the inherited one. The combination of these factors means that when a `struct` type implements a `property` to satisfy a requirement of an inherited `interface`, then references to `obj.someProp` end up being ambiguous between the property in the concrete `struct` type and the property it inherits through the `interface`. There is no quick fix possible for issue (2). It might seem that we could just skip over members inherited through `interface`s when doing lookup in a type, but that solution wouldn't apply to inheritance from another `struct` type, or any future scenario where we support default implementations of methods in interfaces. The simple idea of saying that a derived-type member named `M` hides all inherited members named `M` is possible, but would lead to a bad user interface when a type wants to support both a core "bottleneck" method and a bunch of convenience overloads with the same name. That leaves us with issue (1), and trying to find a reasonable fix for it. The common case is that any expression `e` eventually gets used in a context where it will be be subject to disambiguation: * If we form a call expression `e(...)`, then the overload resolution logic will (obviously) work to disambiguate which `e` was meant. * If `e` is used as an argument to another call (`f(... e ...)` or `... + e`), then `e` will be coerced to the expected parameter type for its argument position, and that coercion will disambiguate it (this is the bit that was fixed in #1501) * If `e` is used in another context where a type is expected/known, it will also be coerced: `if(e)`, `int v = e`, etc. The problem case that is left behind is any scenario where `e` is not subject to one of the above resolution cases, which mostly amounts to cases where an expression is never coerced to a single fixed type. There are a few important cases where this occurs today: * When the expression is used as the left-hand side of an assignement (`e = ...`). * When an expression is used to initialize a variable with an implicit type (`let v = e`). * When inferring generic arguments from the value arguments at a call site (`f(e)` where `f` is defined as `f<T>(T v)`) The key connecting thread in each of these cases is that the front-end needs to determine the type of `e` to make progress. Our semantic checking logic already has functions that try to draw a distinction between the two cases: * The `CheckTerm()` operation is supposed to be used when we expect that we will eventually coerce or otherwise diambiguate the term, and also in cases where we don't yet know if a term should name a type or a value * The `CheckExpr()` operation is supposed to be used when we do not expect that we will apply coercion/disambiguation to a term, and need to have assurances that it has been coerced into a non-overloaded expression with a reasonable type The simple part of the fix made here is to make `CheckExpr()` actually do part of what it is suppsoed to (attempt to disambiguate overloaded terms), and then audit all the call sites to `CheckExpr()` to make sure they are actually ones that intend to opt into that logic. The messier part of the fix is dealing with generic argument inference, because we need to extract the type of the disambiguated expression for the purposes of inference, but we don't want to disturb the actual argument list at a call site (because type coercion of the arguments is supposed to handle the disambiguation). This part is done with a bit of special-casing in the overload-resolution context, by adding a method that gets the type or an argument after disambiguation (when possible). * fixup Co-authored-by: Yong He <yonghe@outlook.com>
2020-08-21Allow calling a generic function with an existential value (dynamic ↵Yong He
dispatch) (#1508) * Allow calling a generic function with an existential value (dynamic dispatch). * Fixes per review comments. * Clean up implementation by having `openExistential` return `ExtractExistentialType` instead of a DeclRef to the interface with a `ThisTypeSubstitution`. * More cleanups Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com> Co-authored-by: Yong He <yhe@nvidia.com>
2020-08-17Attempt to fix lookup for members that "override" (#1501)Tim Foley
Our current lookup process always finds *all* members of a type, which can include both an inherited member (e.g., from an `interface`) and one that logically overrides/implements it. If something downstream doesn't filter this result down and favor the derived member, then an ambiguity error will result. To date, this has mostly been a non-issue because we haven't emphasized inheritance, and the main case we did support (`struct` types implemented `interface` methods) gets disambiguated as part of overload resolution for function calls. Recent changes to support `property` declarations to `interface`s add the possibility for ambiguity between a "base" and "derived" declaration that can't rely on overload resolution for disambiguation. The approach in this PR is to add disambiguation logic to the other main place where the results of lookup get used. If a lookup result is being assigned to a variable, passed to a function, or otherwise used in a case where a value of a specific type is needed, it will be "coerced" to the desired type. This change makes it so that the first step in the coercion logic is to try to disambiguate the expression that is being coerced. In order to ensure that an overloaded expression can be detected and resolved even when just checking if coercion is possible, I needed to update the `canCoerce*()` functions to also take the expression that is being tested for coercibility, and not just its type. There is only one case (that I saw) where coercion checks were being made without an expression value available, and that case didn't actually need/want to handle overloading. In order to test the fixes here, I added logic to the `property`-in-`interface` test to make sure that the critical cases work as expected (references to a derived member using "dot syntax" and "implicit `this`" syntax). Alternatives Considered ----------------------- The first attempt at this fix took a simpler approach: I added the disambiguation logic as a post-process on member lookup. That is, given `obj.foo` I would take the `LookupResult` for `foo` and immediately try to filter it to include only the most-derived members. This approach has the major benefit of catching even more use cases of values (and thus helping to ensure that we don't spend forever chasing down more of these ambiguity errors), but it also has two critical problems: 1. If we only trigger disambiguation when looking up `obj.foo`, then we can't do anything to help when `foo` is looked up as an ordinary identifier, but is actually equivalent to `this.foo`. A full fix would require doing this disambiguation on *every* name lookup, which leads to the second issue: 2. It is important that for a method call like `obj.m(...)` we do *not* disambiguate when looking up `obj.m`, and instead let the overload resolution for the call resolve things. That choice is what makes it possible to call an inherited `m` declaration even when there is a derived `m` with a different signature. Issue (1) is covered by the test case that was added here, but we should probably have a test case for (2) to make sure we don't break that use case. Caveats ------- An important case that we don't solve in this PR is when the result of a lookup is captured in a variable without an explicit type: let f = obj.foo; That case also needs disambiguation, and should be addressed in a later change. A secondary issue is that our approach to prioritizing declarations during lookup is still quite naive. We really need a way for lookup to attach information about nesting of scopes to results (to be clear that results from inner scopes should be preferred over those from outer scopes), as well as have a robust mechanism for comparing the priority of members based on the inheritance graph of a type. This change doesn't do anything to make the situation better or worse.
2020-08-13Support property declarations in interfaces (#1494)Tim Foley
There are two main features in this change. First, we allow for `interface`s to declare `property` requirements, which can be satisfied by matching `property` declarations in a type that conforms to the interface: interface IRectangle { property float width { get; } property float height { get; } } struct Square : IRectangle { float size; property float width { get { return size; } } property float height { get { return size; } } } Second, we allow a type to satisfy a `property` requirement with an ordinary field of the same name: struct Rectangle : IRectangle { float width; float height; // no explicit `property` declarations needed } The implementation of these features is mostly in `slang-check-decl.cpp` in the logic for checking conformance of a type to an interface. The first feature simply requires adding logic to checking whether a candidate satisfying `property` declaration matches a required `property` declaration. To do so, it must have the same type, and an accessor to satisfy each of the required accessors. The second feature requires adding logic to synthesize an AST `property` declaration for a type, based on a required `property` declaration and its accessors. This means that, more or less, any type where `this.name` yields a storage location that does what is needed can satisfy a property requirement (there is no specific rule that says the storage needs to be a field, although that is the most likely case). The way that witnesses are stored for property declarations probably merits some description. During IR lowering, an abstract storage declaration like a subscript or `property` more or less desugars away, so that the actual interface requirements correspond to the accessors within it (the `get`, `set`, etc.). This means that a witness table should have entries/keys corresponding to the accessors and not the property itself. The process of finding/recording witnesses for `property` requirements thus installs entries for the individual accessors (with care taken to only install accessor witnesses once we are sure we have witnesses for all the requirements). Currently, the code also installs an entry for the property itself, although that is not strictly required, and might not be something we continue to do long-term. (Aside: it was somewhat surprising that an end-to-end test of `property` declarations in `interface`s Just Worked without any changes to IR lowering.) As we continue to write more code that synthesizes and checks AST expressions/statements, it becomes necessary to refactor the semantic checking logic so that it splits the recursive part (e.g., checking the operands of an assignment) from the validation part (e.g., checking that the assignment itself is valid). It is probably too big of a change to justify at this point, but it might be valuable in the future to have distinct hierarchies that represent unchecked and checked ASTs, with semantic checking mostly being a transformation from one to the other. The benefit of such a change is we could factor out a distinct "builder" API for constructing validated/checked AST nodes, with both semantic checking and AST synthesis being clients of that API.
2020-06-30Initial work on property declarations (#1410)Tim Foley
* Initial work on property declarations Introduction ============ The main feature added here is support for `property` declarations, which provide a nicer experience for working with getter/setter pairs. If existing code had something like this: ```hlsl struct Sphere { float4 centerAndRadius; // xyz: center, w: radius float3 getCenter() { return centerAndRadius.xyz; } void setCenter(float3 newValue) { centerAndRadius.xyz = newValue; } // similarly for radius... } void someFunc(in out Sphere s) { float3 c = s.getCenter(); s.setCenter(c + offset); } ``` It can be expressed instead using a `property` declaration for `center`: ```hlsl struct Sphere { float4 centerAndRadius; // xyz: center, w: radius property center : float3 { get { return centerAndRadius.xyz; } set(newValue) { centerAndRadius.xyz = newValue; } } // similarly for radius... } void someFunc(in out Sphere s) { float3 c = s.center; s.center = c + offset; } ``` The benefits at the declaration site aren't that signficiant (e.g., in the example above we actually have slightly more lines of code), but the improvement in code clarity for users is significant. Having `property` declarations should also make it easier to migrate from a simple field to a property with more complex logic without having to first abstract the use-site code using a getter and setter. An important future benefit of `property` syntax will be if we allow `interface`s to include `property` requirements, and then also allow those requirements to be satisfied by ordinary fields in concrete types. Subscripts ---------- The Slang compiler already has limited (stdlib-use-only) support for `__subscript` declarations, which are conceptually similar to `operator[]` from the C++ world, but are expressed in a way that is more in line with `subscript` declarations in Swift. A `SubscriptDecl` in the AST contains zero or more `AccessorDecl`s, which correspond to the `get` and `set` clauses inside the original declaration (there is also a case for a `__ref` accessor, to handle the case where access needs to return a single address/reference that can be atomically mutated). A major goal of the implementation here is to re-use as much of the infrastructure as possible for `__subscript` declarations when implementing `property` declarations. Nonmutating Setters ------------------- One additional thing added in this change is the ability to mark a `set` accessor on either a subscript or a property as `[nonmutating]`, and indeed all of the existing `set` accessors declared in the stdlib have been marked this way. The need for this modifier is a bit subtle. If we think about a typical subscript or property: ```hlsl struct MyThing { int f; property p : int { get { return f; } set(newValue) { f = newValue; } } } ``` it is clear we want the `set` accessor to translate to output HLSL as something like: ``` void MyThing_p_set(inout MyThing this, int newValue) { this.f = newValue; } ``` Note how the implicit `this` parameter is `inout` even though we didn't mark anything as `[mutating]`. This is the obvious thing a user would expect us to generate given a property declaration. Now consider a case like the following: ```hlsl struct MyThing { RWStructuredBuffer<int> storage; property p : int { get { return storage[0]; } set(newValue) { storage[0] = newValue; } } } ``` This new declaration doesn't require (or want) an `inout` `this` parameter at all: ``` void MyThing_p_set(MyThing this, int newValue) { this.storage[0] = newValue; } ``` In fact, given the limitations in the current Slang compiler around functions that return resource types (or use them for `inout` parameters), we can only support a `set` operation like this if we can ensure that the `this` parameter is considered to be `in` instead of `inout`. This is exactly the behavior we allow users to opt into with a `[nonmutating] set` declaration. All of the subscript operations in the stdlib today have `set` accessors that don't actually change the value of `this` that they act on (e.g., storing into a `RWStructuredBuffer` using its `operator[]` doesn't change the value of the `RWStructuredBuffer` variable -- just its contents). We'd gotten away without this detail so far just because `set` accessors were only being declared in the stdlib and they were all implicitly `[nonmutating]` anyway, so it never surfaced as an issue that the code we generated assumed a setter wouldn't change `this`. Implementation ============== Parser and AST -------------- Adding a new AST node for `PropertyDecl` and the relevant parsing logic was mostly straightforward. The biggest change was allowing a `set` declaration to introduce an explicit name for the parameter that represents the new value to be set. This change also adds a `[nonmutating]` attribute as a dual to `[mutating]`, for reasons I will get to later. Semantic Checking ----------------- The `getTypeForDeclRef` logic was updated to allow references to `property` declarations. Some of the semantic checking work for subscripts was pulled out into re-usable subroutines to allow it to be shared by `__subscript` and `property` declarations. The checking of accessor declarations, which sets their result type based on the type of the outer `__subscript` was changed to also handle an outer `property`. Some special-case logic was added for checking of `set` declarations to make sure that their parameter is given the expected type. Some logic around deciding whether or not `this` is mutable had to be updated to correctly note that `this` should be mutable by default in a `set` accessor, with an explicit `[nonmutating]` modifier required to opt out of this default. (This is the inverse of how a typical method or `get` accessor works). IR Lowering ----------- The good news is that after IR lowering, access to properties turns into ordinary function calls (equivalent to what hand-written getters and setters would produce), so that subsequent compiler steps (including all the target-specific emit logic) doesn't have to care about the new feature. The bad news is that adding `property` declarations has revealed a few holes in how IR lowering was handling `__subscript` declarations and their accessors, so that it didn't trivially work for the new case as-is. The IR lowering pass already has the `LoweredValInfo` type that abstractly represents a value that resulted from lowering some AST code to the IR. One of the cases of `LoweredValInfo` was `BoundSubscript` that represented an expression of the form `baseVal[someIndex]` where the AST-level expression referenced a `__subscript` declaration. The key feature of `BoundSubscript` is that it avoided deciding whether to invoke the getter, the setter, or both "too early" and instead tried to only invoke the expected/required operations on-demand. This change generalizes `BoundSubscript` to handle `property` references as well, so it changes to `BoundStorage`. Making the type handle user-defined property declarations required fixing a bunch of issues: * When building up argument lists in the IR, we need to know whether an argument corresponds to an `in` or an `out`/`inout` parameter, to decide whether to pass the value directly or a pointer to the value. Some of the logic in the lowering pass had been playing fast and loose with this, so this change tries to make sure that whenever we care computing a list of `IRInst*` that represent the arguments to a call we have the information about the corresponding parameter. * Similarly, when emitting a call to an accessor in the IR, the information about the expected type of the callee was missing/unavailable, and the code was incorrectly building up the expected type of the callee based on the types of the arguments at the call site. The logic has been changed so that we can extract the expected signature of an accessor (how it will be translated to the IR) using the same logic that is used to produce the actual `IRFunc` for the accessor (so hopefully both will always agree). * Dealing with `in` vs. `inout` differences around parameters means also dealing with the "fixup" code that is used to assign from the temporary used to pass an `inout` argument back into the actual l-value expression that was used. That logic has all been hoisted out of the expression visitor(s) and into the global scope. Future Work =========== The entire approach to handling l-values in the IR lowering pass is broken, and it is in need a of a complete rewrite based on new first-principles design goals. While something like `LoweredValInfo` is decent for abstracting over the easy cases of r-values, addresses, and a few complicated l-value cases like swizzling, it just doesn't scale to highly abstract l-values like we get from `__subcript` and `property` declarations, nor other corner cases of l-values that we need to handle (e.g., passing an `int` to an `inout float` parameter is allowed in HLSL, and performs conversions in both directions!). It Should be Easy (TM) to extend the logic that tries to synthesize an interface conformance witness method when there isn't an exact match to also support synthesizing a property declaration (plus its accessors) to witness a required property when the type has a field of the same name/type. * fixup: pedantic template parsing error (thanks, clang!) * fixup: cleanups and review feedback * Removed some `#ifdef`'d out code from merge change * Added proper diagnostics for accessor parameter constraints, which led to some fixes/refactorings * Added a test case for the accessor-related diagnostics
2020-06-12Diagnose circularly-defined constants (#1384)Tim Foley
* Diagnose circularly-defined constants Work on #1374 This change diagnoses cases like the following: ```hlsl static const int kCircular = kCircular; static const int kInfinite = kInfinite + 1; static const int kHere = kThere; static const int kThere = kHere; ``` By diagnosing these as errors in the front-end we protect against infinite recursion leading to stack overflow crashes. The basic approach is to have front-end constant folding track variables that are in use when folding a sub-expression, and then diagnosing an error if the same variable is encountered again while it is in use. In order to make sure the error occurs whether or not the constant is referenced, we invoke constant folding on all `static const` integer variables. Limitations: * This only works for integers, since that is all front-end constant folding applies to. A future change can/should catch circularity in constants at the IR level (and handle more types). * This only works for constants. Circular references in the definition of a global variable are harder to diagnose, but at least shouldn't result in compiler crashes. * This doesn't work across modules, or through generic specialization: anything that requires global knowledge won't be checked * fixup: missing files * fixup: review feedback
2020-06-08Small fixes/improvements based on review. (#1379)jsmall-nvidia
2020-06-05Filter lookup results from interfaces in `visitMemberExpr`.Yong He
Fixes #1377
2020-06-05ASTNodes use MemoryArena (#1376)jsmall-nvidia
* Add a ASTBuilder to a Module Only construct on valid ASTBuilder (was being called on nullptr on occassion) * Add nodes to ASTBuilder. * Compiles with RefPtr removed from AST node types. * Initialize all AST node pointer variables in headers to nullptr; * Initialize AST node variables as nullptr. Make ASTBuilder keep a ref on node types. Make SyntaxParseCallback returns a NodeBase * Don't release canonicalType on dtor (managed by ASTBuilder). * Give ASTBuilders a name and id, to help in debugging. For now destroy the session TypeCache, to stop it holding things released when the compile request destroys ASTBuilders. * Moved the TypeCheckingCache over to Linkage from Session. * NodeBase no longer derived from RefObject. * Only add/dtor nodes that need destruction. First pass compile on linux.
2020-06-04First steps toward inheritance for struct types (#1366)Tim Foley
* First steps toward inheritance for struct types This change adds the ability for a `struct` type to declare a base type that is another `struct`: ```hlsl struct Base { int baseMember; } struct Derived : Base { int derivedMember; } ``` The semantics of the feature are that code like the above desugars into code like: ```hlsl struct Base { int baseMember; } struct Derived { Base _base; int derivedMember; } ``` At points where a member from the base type is being projected out, or the value is being implicitly cast to the base type, the compiler transforms the code to reference the implicitly-generated `_base` member. That means code like this: ```hlsl void f(Base b); ... Derived d = ...; int x = d.baseMember; f(d); ``` gets transformed into a form like this: ```hlsl void f(Base b); ... Derived d = ...; int x = d._base.baseMember; f(d._base); ``` Note that as a result of this choice, the behavior when passing a `Derived` value to a function that expects a `Base` (including to inherited member functions) is that of "object shearing" from the C++ world: the called function can only "see" the `Base` part of the argument, and any operations performed on it will behave as if the value was indeed a `Base`. There is no polymorphism going on because Slang doesn't currently have `virtual` methods. In an attempt to work toward inheritance being a robust feature, this change adds a bunch of more detailed logic for checking the bases of various declarations: * An `interface` declaration is only allowed to inherit from other `interface`s * An `extension` declaration can only introduce inheritance from `interface`s * A `struct` declaration can only inherit from at most one other `struct`, and that `struct` must be the first entry in the list of bases This change also adds a mechanism to control whether a `struct` or `interface` in one module can inherit from a `struct` or `interface` declared in another module: * If the base declaration is marked `[open]`, then the inheritance is allowed * If the base declaration is marked `[sealed]`, then the inheritance is allowed * If it is not marked otherwise, a `struct` is implicitly `[sealed]` * If it is not marked otherwise, an `interface` is implicitly `[open]` These seem like reasonable defaults. In order to safeguard the standard library a bit, the interfaces for builtin types have been marked `[sealed]` to make sure that a user cannot declare a `struct` and then mark it as a `BuiltinFloatingPointType`. This step should bring us a bit closer to being able to document and expose these interfaces for built-in types so that users can write code that is generic over them. There are some big caveats with this work, such that it really only represents a stepping-stone toward a usable inheritance feature. The most important caveats are: * If a `Derived` type tries to conform to an interface, such that one or more interface requirements are satisfied with members inherited from the `Base` type, that is likely to cause a crash or incorrect code generation. * If a `Derived` type tries to inherit from a `Base` type that conforms to one or more interfaces, the witness table generated for the conformance of `Derived` to that interface is likely to lead to a crash or incorrect code generation. It is clear that solving both of those issues will be necessary before we can really promote `struct` inheritance as a feature for users to try out. * fixup: trying to appease clang error * fixups: review feedback
2020-06-02Working matrix swizzle (#1354)Dietrich Geisler
* Working matrix swizzle. Supports one and zero indexing and multiple elements. Performs semantic checking of the swizzle. Matrix swizzles are transformed into a vector of indexing operations during lowering to the IR. This change does not handle matrix swizzle as lvalues. * Renaming * Added missing semicolon * Initialize variable for gcc * Added the expect file for diagnostics * Matrix swizzle updated per PR feedback * Stylistic fix * Formatting fixes * Fix compiling with AST change. Change indentation. Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>
2020-05-29Feature/ast syntax standard (#1360)jsmall-nvidia
* Small improvements to documentation and code around DiagnosticSink * Made methods/functions in slang-syntax.h be lowerCamel Removed some commented out source (was placed elsewhere in code) * Making AST related methods and function lowerCamel. Made IsLeftValue -> isLeftValue.
2020-05-28WIP: ASTBuilder (#1358)jsmall-nvidia
* Compiles. * Small tidy up around session/ASTBuilder. * Tests are now passing. * Fix Visual Studio project. * Fix using new X to use builder when protectedness of Ctor is not enough. Substitute->substitute * Add some missing ast nodes created outside of ASTBuilder. * Compile time check that ASTBuilder is making an AST type. * Moced findClasInfo and findSyntaxClass (essentially the same thing) to SharedASTBuilder from Session.
2020-05-22Tidy up around AST nodes (#1353)jsmall-nvidia
* Fields from upper to lower case in slang-ast-decl.h * Lower camel field names in slang-ast-stmt.h * Fix fields in slang-ast-expr.h * slang-ast-type.h make fields lowerCamel. * slang-ast-base.h members functions lowerCamel. * Method names in slang-ast-type.h to lowerCamel. * GetCanonicalType -> getCanonicalType * Substitute -> substitute * Equals -> equals ToString -> toString * ParentDecl -> parentDecl Members -> members
2020-04-21Diagnose attempts to call instance methods from static methods (#1330)Tim Foley
Currently we fail to diagnose code that calls an instance method from a static method using implicit `this`, and instead crash during lowering of the AST to the IR. This change introduces a bit more detail to the "this parameter mode" that is computed during lookup, so that it differentiates three cases. The existing two cases of a mutable `this` and immutable `this` remain, but we add a third case where the "this parameter mode" only allows for a reference to the `This` type. When turning lookup "breadcrumb" information into actual expressions, we respect this setting to construct either a `This` or `this` expression. In order to actually diagnose the incorrect reference, I had to add code around an existing `TODO` comment that noted how we should diagnose attempts to refer to instance members through a type. Enabling that diagnostic revealed a missing case needed by generics (including those in the stdlib) - a type-constraint member is always referenced statically. Putting the diagnostic for a static reference to a non-static member in its new bottleneck location meant that some code higher up the call static that handles explicit static member references had to be tweaked to not produce double error messages. This change includes a new diagnostic test to show that we now give an error message on code that makes this mistake, instead of crashing.
2020-04-02Add basic support for namespaces (#1304)Tim Foley
This change adds logic for parsing `namespace` declarations, referencing them, and looking up their members. * The parser changes are a bit subtle, because that is where we deal with the issue of "re-opening" a namespace. We kludge things a bit by re-using an existing `NamespaceDecl` in the same parent if one is available, and thereby ensure that all the members in the same namespace can see on another. * In order to allow namespaces to be referenced by name they need to have a type so that a `DeclRefExpr` to them can be formed. For this purpose we introduce `NamespaceType` which is the (singleton) type of a reference to a given namespace. * The new `NamespaceType` case is detected in the `MemberExpr` checking logic and routed to the same logic that `StaticMemberExpr` uses, and the static lookup logic was extended with support for looking up in a namespace (a thin wrapper around one of the existing worker routines in `slang-lookup.cpp`. * I made `NamespaceDecl` have a shared base class with `ModuleDecl` in the hopes that this would allow us to allow references to modules by name in the future. That hasn't been tested as part of this change. * I cleaned up a bunch of logic around `ModuleDecl` holding a `Scope` pointer that was being used for some of the more ad hoc lookup routines in the public API. Those have been switched over to something that is a bit more sensible given the language rules and that doesn't rely on keeping state sititng around on the `ModuleDecl`. * I added a test case to make sure the new funcitonality works, which includes re-opening a namespace, and it also tests both `.` and `::` operations for lookup in a namespace. * The main missing feature here is the ability to do something like C++ `using`. It would probably be cleanest if we used `import` for this, since we already have that syntax (and having both `import` and `using` seems like a recipe for confusion). Most of the infrastructure is present to support `import`ing one namespace into another (in a way that wouldn't automatically pollute the namespace for clients), but some careful thought needs to be put into how import of namespaces vs. modules should work.
2020-03-16Define compound intrinsic ops in the standard library (#1273)Tim Foley
* Define compound intrinsic ops in the standard library The current stdlib code has a notion of "compound" intrinsic ops, which use the `__intrinsic_op` modifier but don't actually map to a single IR instruction. Instead, most* of these map to multiple IR instructions using hard-coded logic in `slang-ir-lower.cpp`. (* One special case is `kCompoundOp_Pos` that is used for unary `operator+` and that maps to *zero* IR instructions) All of the opcodes that used to use the `kCompoundOp_` enumeration values now have definitions directly in the stdlib and use the new `[__unsafeForceInlineEarly]` attribute to ensure that they get inlined into their use sites so that the output code is as close as possible to the original. For the most part, generating the stdlib definitions for the compound ops is straightforward, but here's some notes: * The unary `operator+` I just defined directly in Slang code, since it doesn't share much structure with other cases * The unary increment/decrement ops are generated as a cross product of increment/decrement and prefix/postfix. The logic is a bit messy but given that we have scalar, vector, and matrix versions to deal with it still saves code overall * Because all the compound/assignment cases got moved out, the existing code for generating unary/binary ops can be simplified a bit * All the no-op bit-cast operations like `asfloat(float)` are now inline identity functions * A few other small cleanups are made by not having to worry about the compound ops (which used to be called "pseudo ops") sometimes being encoded in to the same type of value as a real IR opcode. The one big detail here is a fix for how IR lowering works for `let` declarations: they were previously being `materialize()`d which only guarantees that they've been placed in a contiguous and addressable location, but doesn't actually convert them to an r-value. As a result a `let` declaration could accidentally capture a mutable location by reference, which is definitely *not* what we wanted it to do. Fixing this was needed to make the new postfix `++` definition work (several existing tests end up covering this). One important forward-looking note: One subtle (but significant) choice in this change is that we actually reduce the number of declarations in the stdlib, because instead of having the compound operators include both per-type and generic overloads (just listing scalar cases for now): float operator+=(in out float left, float right) { ... } int operator+=(in out int left, int right) { ... } ... T operator+= <T:__BuiltinBlahBlah>(in out T left, T right) { ... } We now have *only* the single generic version: T operator+= <T:__BuiltinBlahBlah>(in out T left, T right) { ... } In running our current tests, this change didn't lead to any regressions (perhaps surprisingly). Given that we were able to reduce the number of overloads for `operator+=` by a factor of N (where N is the number of built-in types), it seems worth considering whether we could also reduce the number of overloads of `operator+` by the same factor by only having generic rather than per-type versions. One concern that this forward-looking question raises is whether the quality of diagnostic messages around bad calls to the operators might suffer when there are only generic overloads instead of per-type overloads. In order to feel out this problem I added a test case that includes some bad operator calls both to `+=` (which is now only generic with this change) and `+` (which still has per-type overloads). Overall, I found the quality of the error messages (in terms of the candidates that get listed) isn't perfect for either, but personally I prefer the output in the generic case. As part of adding that test, I also added some fixups to how overload resolution messages get printed, to make sure the function name is printed in more cases, and also that the candidates print more consistently. These changes affected the expected output for one other diagnostic test. * fixup: disable bad operator test on non-Windows targets
2020-02-25Fix a crash when a generic value argument isn't constant (#1241)Tim Foley
This arose when a user tried to specialize the DXR 1.1 `RayQuery` type to a local variable: ```hlsl RAY_FLAG rayFlags = RAY_FLAG_CULL_FRONT_FACING_TRIANGLES | RAY_FLAG_CULL_NON_OPAQUE; RayQuery<rayFlags> query; ``` In this case, we issued an error around `rayFlags` not being a constant as expected, but then we also crashes later on in checking because the `DeclRef` that was being used for the type had a null pointer for the generic argument corresponding to `rayFlags`. The main fix here was thus to add an `ErrorIntVal` case that can be used to represent something that should be an `IntVal` but where there was some kind of error in the input code so that the actual value isn't known to the compiler. A secondary fix here is that we were issuing error messages about expecting a constant for a parameter like `rayFlags` there *twice*, and one of those times was during the `JustChecking` part of overload resolution (when we are not supposed to emit any diagnostics). I fixed that up by allowing the `DiagnosticSink` to be used to be passed down explicitly (and allowing it to be null), while also leaving behind overloaded functions with the old signatures so that all the existing logic can continue to work unmodified.
2020-02-21Add surface syntax for "this type" (#1236)Tim Foley
Within the context of an aggregate type (or an `extension` of one), the programmer can use `this` to refer to the "current" instance of the surrounding type, but there is no easy way to utter the name of the type itself. This is especially relevant inside of an `interface`, where the type of `this` isn't actually the `interface` type, but rather a placeholder for the as-yet-unknown concrete type that will implement the interface. This change adds a keyword `This` that works similarly to `this`, but names the current *type* instead of the current instance. It can be used to declare things like binary methods or factory functions in an interface: ``` interface IBasicMathType { This absoluteValue(); This sumWith(This left); } T doSomeMath<T:IBasicMathType>(T value) { return value.sumWith(value.absoluteValue()); } ``` The `This` type is consistent with the type named `Self` in Rust and Swift (where Rust/Swift use `self` instead of `this`). Other names could be considered (e.g., `ThisType`) if we find that users don't like the name in this change.
2020-02-06Improve checks and diagnostics around redeclarations (#1201)Tim Foley
* Improve checks and diagnostics around redeclarations This change turns checking for redeclarations into a dedicated phase of semantic checking, and ensures that it applies to the main categories of declarations: functions, types, and variables. Note that "variables" here includes function parameters and `struct` fields in addition to the more obvious global and local variables. Some of the logic for checking redeclarations already existed for functions, and was refactored to deal with other cases of declarations. The checking for functions still needs to be special-cased because functions are much more flexible about the kinds of redeclarations that are allowed. In addition to improving the diagnosis of redeclaration itself, this change also changes the error message that is produced when referencing a symbol that is ambiguous due to begin redeclared. This is a small quality-of-life fix, and has the benefit of being much easier to implement than robust tracking of what variables have had redeclaration errors issued so that we can skip emitting an ambiguity error at the use site. A new test case was added to cover the redeclaration cases for variables (but not types or functions), and the test for function parameters was updated to account for the new more universal diagnostic message (since function parameters used to have special-case redeclaration checking). * fixup: missing file
2020-02-05Improve behavior when undefined identifier is a contextual keyword (#1200)Tim Foley
The HLSL language has keywords with very common names like `triangle`, and Slang doesn't want to preclude users from using such names for their variables/functions/etc. In addition, Slang adds new keywords on top of HLSL (like `extension`) and we don't want those to prevent us from compiling existing code. As a result, almost all keywords in Slang are contextual keywords, and they can be shadowed by user varaibles. The down-side to making all keywords contextual is that in a case like this: ``` int test() { return triangle; } ``` The identifier `triangle` is *not* undefined as far as lookup (it is defined as a modifier keyword), so the existing "undefined identifier" logic gets bypassed, and instead we ran into an internal compiler error trying to construct an expression that refers to a modifier keyword. Fortunately, the internal compiler error in that case was overkill, and the compiler already had defensive logic to produce an expression with an error type if it couldn't figure out what the type of a declaration reference should be. The main fix here is thus to emit an "undefined identifier" error instead of an internal compiler error at the point where we see an attempt to reference a declaration that shouldn't be available in an expression context. In order to improve the quality of the diagnostic, the code for constructing declaration references was updated to pass along a source location to be used in error messages.
2019-12-06Support conversion from int/uint to enum types (#1147)Tim Foley
* Support conversion from int/uint to enum types The basic feature here is tiny, and is summarized in the code added to the stdlib: ``` extension __EnumType { __init(int val); __init(uint val); } ``` The front-end already makes all `enum` types implicitly conform to `__EnumType` behind the scenes, and this `extension` makes it so that all such types inherit some initializers (`__init` declarations, aka. "constructors") that take `int` and `uint`. (Note: right now all `__init` declarations in Slang are assumed to be implemented as intrinsics using `kIROp_Construct`. This obviously needs to change some day, especially so that we can support user-defined initializers.) Actually making this *work* required a bit of fleshing out pieces of the compiler that had previously been a bit ad hoc to be a bit more "correct." Most of the rest of this description is focused on those details, since the main feature is not itself very exciting. When overload resolution sees an attempt to "call" a type (e.g., `MyType(3.0)`) it needs to add appropriate overload candidates for the initializers in that type, which may take different numbers and types of parameters. The existing code for handling this case was using an ad hoc approach to try to enumerate the initializer declarations to consider, which might be found via inheritance, `extension` declarations, etc. In practice, the ad hoc logic for looking up initializers was just doing a subset of the work that already goes into doing member lookup. Changing the code so that it effectively does lookup for `MyType.__init` allows us to look up initializers in a way that is consistent with any other case of member lookup. Generalizing this lookup step brings us one step closer to being able to go from an `enum` type `E` to an initializer defined on an `extension` of an `interface` that `E` conforms to. One casualty of using the ordinary lookup logic for initializers is that we used to pass the type being constructed down into the logic that enumerated the initializers, which made it easier to short-circuit the part of overload resolution that usually asks "what type does this candidate return." It might seem "obvious" that an initializer/constructor on type `Foo` should return a value of type `Foo`, but that isn't necessarily true. Consider the `__BuiltinFloatingPointType` interface, which requires all the built-in floating-point types (`float`, `double`, `half`) to have an initializer that can take a `float`. If we call that interface in a generic context for `T : __BuiltinFloatingPointType`, then we want to treat that initializer as returning `T` and not `__BuiltinFloatingPointType`. Without the ad hoc logic in initializer overload resolution, this is the exact problem that surfaced for the stdlib definition of `clamp`. The solution to the "what type does an initializer return" problem was to introduce a notion of a `ThisType`, which refers to the type of `this` in the body of an interface. More generally, we will eventually want to have the keyword `This` be the type-level equivalent of `this`, and be usable inside any type. The `calcThisType` function introduced here computes a reasonable `Type` to represent the value of `This` within a given declaration. Inside of concrete type it refers to the type itself, while in an `interface` it will always be a `ThisType`. The existing `ThisTypeSubstitution`s, previously only applied to associated types, now apply to `ThisType`s as well, in the same situations. The next roadblock for making the simple declarations for `__EnumType` work was that the lookup logic was only doing lookup through inheritance relationships when the type being looked up in was an `interface`. The logic in play was reasonable: if you are doing lookup in a type `T` that inherits from `IFoo`, then why bother looking for `IFoo::bar` when there must be a `T::bar` if `T` actually implements the interface? The catch in this case is that `IFoo::bar` might not be a requirement of `IFoo`, but rather a concrete method added via an `extension`, in which case `T` need not have its own concrete `bar`. The simple/obvious fix here was to make the lookup logic always include inherited members, even when looking up through a concrete type. Of course, if we allow lookup to see `IFoo::bar` when looking up on `T`, then we have the problem that both `T::bar` and `IFoo::bar` show up in the lookup results, and potentially lead to an "ambiguous overload" error. This problem arises for any interface rquirement (so both methods and associated types right now). In order to get around it, I added a somewhat grungy check for comparing overload candidates (during overload resolution) or `LookupResultItem`s (during resolution of simple overloaded identifiers) that considers a member of a concrete type as automatically "better" than a member of an interface. The Right Way to solve this problem in the long run requires some more subtlety, but for now this check should Just Work. One final wrinkle is that due to our IR lowering pass being a bit overzealous, we currently end up trying to emit IR for those new `__init` declarations, which ends up causing us to try and emit IR for a `ThisType`. That is a case that will require some subtlty to handle correctly down the line, for for now we do the expedient thing and emit the `ThisType` for `IFoo` as `IFoo` itself, which is not especially correct, but doesn't matter since the concrete initializer won't ever be called. * testing: add more debug output to Unix process launch function * testing: increase timeout when running command-line tests
2019-11-18Further refactoring of semantic checking (#1102)Tim Foley
* Split apart `SemanticsVisitor` The existing `SemanticsVisitor` type was the visitor for expressions, statements, and declarations, and its monolithic nature made it hard to introduce distinct visitors for different phases of checking (despite the fact that we had, de facto, multiple phases of declaration checking). This change splits up `SemanticsVisitor` as follows: * There is nosw a `SharedSemanticsContext` type which holds the shared state that all semantics visiting logic needs. This includes state that gets mutated during the course of semantic checking. * The `SemanticsVisitor` type is now a base class that holds a pointer to a `SharedSemanticsContext`. Most of the non-visitor functions are still defined here, just to keep the code as simple as possible. The `SemanticsVisitor` type is no longer a "visitor" in any meaningful way, but retaining the old name minimizes the diffs to client code. * There are distinct `Semantics{Expr|Stmt|Decl}Visitor` types that have the actual `visit*` methods for an appropriate subset of the AST hierarchy. These all inherit from `SemanticsVisitor` primarily so that they can have easy access to all the helper methods it defines (which used to be accessible because these were all the same object). Any client code that was constructing a `SemanticsVisitor` now needs to construct a `SharedSemanticsContext` and then use that to initialize a `SemanticsVisitor`. Similarly, any code that was using `dispatch()` to invoke the visitor on an AST node needs to construct the appropriate sub-class and then invoke `dispatch()` on it instead. This is a pure refactoring change, so no effort has been made to move state or logic onto the visitor sub-types even when it is logical. Similarly, no attempt has been made to hoist any code out of the common headers to avoid duplication between `.h` and `.cpp` files. Those cleanups will follow. The one cleanup I allowed myself while doing this was getting rid of the `typeResult` member in `SemanticsVisitor` that appears to be a do-nothing field that got written to in a few places (for unclear reasons) but never read. * Remove some statefulness around statement checking Some of the state from the old `SemanticsVisitor` was used in a mutable way during semantic checking: * The `function` field would be set and the restored when checking the body of a function so that things like `return` statements could find the outer function. * The `outerStmts` list was used like a stack to track lexically surrounding statements to resolve things like `break` and `continue` targets. Both of these meant that semantic checking code was doing fine-grained mutations on the shared semantic checking state even though the statefullness wasn't needed. This change moves the relevant state down to `SemanticsStmtVisitor`, which is a type we create on-the-fly to check each statement, so that we now only need to establish the state once at creation time. The list of outer statements is handled as a linked list threaded up through the stack (a recurring idiom through the codebase). There was one place where the `function` field was being used that wasn't strictly inside statement checking: it appears that we were using it to detect whether a variable declaration represents a local, so I added an `_isLocalVar` function to serve the same basic purpose. With this change, the only stateful part of `SharedSemanticsContext` is the information to track imported modules, which seems like a necessary thing (since deduplication requires statefullness). * Refactor declaration checking to avoid recursion The flexiblity of the Slang language makes enforcing ordering on semantic checking difficult. In particular, generics (including some of the built-in standard library types) can take value arguments, so that type expressions can include value expressions. This means that being able to determine the type of a function parameter may require checking expressions, which may in turn require resolving calls to an overloaded function, which in turn requires knowing the types of the parameters of candidate callees. Up to this point there have been two dueling approaches to handling the ordering problem in the semantic checking logic: 1. There was the `EnsureDecl` operation, supported by the `DeclCheckState` type. Every declaration would track "how checked" it is, and `EnsureDecl(d, s)` would try to perform whatever checks are needed to bring declaration `d` up to state `s`. 2. There was top-down orchestration logic in `visitModuleDecl()` that tried to perform checking of declarations in a set of fixed phases that ensure things like all function declarations being checked before any function bodies. Each of these options had problems: 1. The `EnsureDecl()` approach wasn't implemented completely or consistently. It only understood two basic levels of checking: the "header" of a declaration was checked, and then the "body," and it relied on a single `visit*()` routine to try and handle both cases. Things ended up being checked twice, or in a circular fashion. 2. Rather than fix the problems with `EnsureDecl()` we layered on the top-down orchestration logic, but doing so ignores the fact that no fixed set of phases can work for our language. The orchestration logic was also done in a relatively ad hoc fashion that relied on using a single visitor to implement all phases of checking, but it added a second metric of "checked-ness" that worked alongside `DeclCheckState`. This change strives to unify the two worlds and make them consistent. One of the key changes is that instead of doing everything through a single visitor type, we now have distinct visitors for distinct phases of semantic checking, and those phases are one-to-one aligned with the values of the `DeclCheckState` type. More detailed notes: * Existing sites that used to call `checkDecl` to directly invoke semantic checking recursively now use `ensureDecl` instead. This makes sure that `ensureDecl` is the one bottleneck that everything passes through, so that it can guarantee that each phase of checking gets applied to each declaration at most once. * The existing `visitModuleDecl` was revamped into a `checkModule` routine that does the global orchestration, but now it is just a driver routine that makes sure `ensureDecl` gets called on everything in an order that represents an idealized "default schedule" for checking, while not ruling out cases where `ensureDecl()` will change the ordering to handle cases where the global order is insufficient. * Because `checkModule` handles much of the recursion over the declaration hierarchy, many cases where a declaration `visit*()` would recurse on its members have been eliminated. The only case where a declaration should recursively `ensureDecl()` its members is when its validity for a certain phase depends on those members being checked (e.g., determining the type of a function declaration depends on its parameters having been checked). * All cases where a `visit*()` routine was manually checking the state/phase of checking have been eliminated. It is now the responsibility of `ensureDecl` to make sure that checking logic doesn't get invoked twice or in an inappropriate order. * Most cases where a `visit*()` routine was manually *setting* the `DeclCheckState` of a declaration have been eliminated. The common case is now handled by `ensureDecl()` directly, and `visit*()` methods only need to override that logic when special cases arise. E.g., when a variable is declared without a type `(e.g., `let foo = ...;`) then we need to check its initial-value expression to determine its type, so that we must check it further than was initially expected/required. * This change goes to some lengths to try and keep semantic checking logic at the same location in the `slang-check-decl.cpp` file, so each of the per-phase visitor types is forward declared at the top of the file, and then the actual `visit*()` routines are interleaved throughout the rest of the file. A future change could do pure code movement (no semantic changes) to arrive at a more logical organization, but for now I tried to stick with what would minimize the diffs (although the resulting diffs can still be messy at times). * One important change to the semantic checking logic was that the test for use of a local variable ahead of its declaration (or as part of its own initial-value expression) was moved around, since its old location in the middle of the `ensureDecl` logic made the overall flow and intention of that function less clear. There is still a need to fix this check to be more robust in the future. * Add some design documentation on semantic checking The main thing this tries to lay out is the strategy for declaration checking and the rules/constraints on programmers that follow from it. * fixup: typos found during review
2019-10-25Refactor semantic checking code into more files (#1097)Tim Foley
The semantic checking logic was all inside `slang-check.cpp` and as a result this was a monster file that was extremely hard to follow. This change splits `slang-check.cpp` into several smaller files, although some of the resulting files are still quite large. This change attempts to be a copy-paste job as much as possible and does *not* perform any cleanup on naming, structure, duplication, etc. in the code it deal with. No function bodies or signatures have been touched.