From c35b763f811298a6e9c61a4a8eaf805ea98bd608 Mon Sep 17 00:00:00 2001 From: Theresa Foley <10618364+tangent-vector@users.noreply.github.com> Date: Mon, 22 Sep 2025 18:20:13 -0700 Subject: Split overloaded uses of RefType in front-end (#8427) Overview ======== This change is the start of an attempt to address how the Slang compiler codebase has ended up conflating two similar, but semantically distinct, concepts: * The long-standing notion of `ref` parameters (only allowed for use in the builtin modules), which are encoded using a wrapper `Type` in the AST as part of the representation of the parameters of a `FuncType`. * A recently-introduced notion of explicit reference types that mirror the built-in `Ptr` type, with a relationship comparable to that between pointer and reference types in C++. The change splits the `Ref` type in the core module into two distinct types, with one for each of the two use cases. Similarly, the `RefType` class in the compiler's AST is split into two distinct classes, to represent the two cases. Background ========== The `Ref` type in the core module (hidden and not intended for users to ever see or use) was originally introduced to encode the `ref` parameter-passing mode, comparable to the hidden `Out` and `InOut` types used to encode `out` and `inout` parameter-passing modes. The `Ref` type in the core module was encoded as a instance of the `RefType` class in the Slang AST (similar to how `Out` mapped to an `OutType`). These AST classes were *only* intended to be used by the compiler front-end as part of its encoding of function types. The `FuncType` class needed a way to distinguish an `inout int` parameter from a plain (implicitly `in`) `int` parameter, so these wrapper like `RefType` and `OutType` were introduced to encode both the parameter type (`T`) and the parameter-passing mode in a form that could be passed around as a `Type`. Notably, the `Ref` type (and `Out`, etc.) were *not* intended to be type names that ever get uttered in Slang code (not even in the builtin modules), and the vast majority of the compiler code was not supposed to ever encounter them. They were an implementation detail of `FuncType`, and nothing else. (In hindsight it may have been a mistake to use a nominal type declared in the core module to implement these wrappers; it might have been a good idea to use an entirely separate class of `Type` for this case...) Recent changes to the builtin modules introduced functions that wanted to *return* a reference (so that the parameter-passing-mode modifiers like `ref` could not trivially be used), and as part of those changes the appealingly-named `Ref` type in the core module was re-used for this new case. Builtin operations were declared with an explicit `Ref` return type, and parts of the compiler front-end that had previously been blissfully unaware of the AST's `RefType` (and `InOutType`, etc.) had to start accounting for the possibility that an explicit `Ref` would show up. Related changes also introduced a comparable conflation of the (unfortunately-named) `constref` parameter-passing modifier and builtin operations that wanted to return an explicit reference that is read-only. Both use cases were mapped to the core-module `ConstRef` type, which appeared in the AST as an instance of the `ConstRefType` class. The overlapping use of `ConstRef`` is actually significantly more troublesome than the `Ref` case because, despite what its name implies, `constref` was not really supposed to be the read-only analogue of `ref`, but rather it is closer to the "immutable value borrow" analogue to `inout`'s "mutable value borrow." The semantics of a "value borrow" vs. a "memory reference" in Slang have not been very carefully codified, and the conflation around `ConstRef` has contributed to things becoming increasingly muddy in the compiler back-end. Main Changes ============ Core Module ----------- The `Ref` type has been replaced with two distinct types, with one for each use case: * `RefParam` is intended for use when encoding a `ref` parameter in a function type * `ExplicitRef` is intended for use when an operation in a builtin module wants to return a reference The other types used to represent parameter-passing modes (e.g., `InOut`) were renamed to better indicate that their role in defining parameter types (e.g., `InOutParam`). The `ExplicitRef` type was given additional generic parameters for the allowed access and the address space, akin to what `Ptr` now supports. The pointer dereference operator (prefix `*`) in the core module should now properly propagate the access and address space of the pointer over to the reference that gets returned. The two distinct use cases of `ConstRef` were not split in the way as `Ref`, instead the case for the `constref` parameter-passing mode uses `ConstParamRef`, while cases that previously used `ConstRef` to represent a read-only explicit reference instead now use `ExplicitRef`. Prior to this change there were two subscripts declared on pointers: one in the `Ptr` type itself, and another in an `extension` for pointers with `Access.ReadWrite`. The comments on the code seemed to indicate that the catch-all subscript used to only have a `get` accessor, while the `ref` was only available on read-write pointers, but it seems that subsequent changes converted the default subscript to support `ref`. This change eliminates the subscript added via `extension`, since it is redundant. AST and Front-End ================= Similar to the changes in the core module, the AST `RefType` class was split into: * `RefParamType` for the case of encoding `ref` parameters * `ExplicitRefType` for the case where the user meant an explicit reference type All the other classes that represent wrappers for encoding parameter-passing modes (e.g., `OutType`) were similarly renamed (e.g., `OutParamType`). The `ConstRefType` class was simply renamed to `ConstRefParamType`, because any use cases of `ConstRefType` that intended an explicit reference type will now use `ExplicitRefType` with `Acccess.Read`. For convenience, this change includes type aliases to map the old names for these types over to the new ones (e.g., `using OutType = OutParamType`) so that the change doesn't need to affect quite so many lines of code. The `RefType` and `ConstRefType` names are intentionally left undefined, since it woudl be unsafe to assume that existing use sites should default to either of the two possible interpretations. All use cases of `RefType` and `ConstRefType` (and their former shared base class `RefTypeBase`) were audited and updated to refer to either `RefParamType`/`ConstRefParamType` or `ExplicitRefType`, as appropriate (based on whether the context of the code indicated it was working with parameter-passing mode wrapper types, or explicit reference types). In many (many) cases comments were added to the code that was updated (and some unrelated code that needed to be audited along the way) to note cases where there appears to be something fishy going on in the compiler and/or there are obvious opportunities for next-step improvement. The `QualType` constructor used to infer l-value-ness when passed a `RefType` or `ConstRefType`; that code was introduced to support explicit reference types. The code was updated to consult the access argument of an `ExplicitRefType` to try and determine the right l-value-ness to use. There is some ambiguity about what should be done in the case where the value of the generic argument representing the access cannot be statically determined; a better solution may be needed. Many other cases in the front-end that were working with `RefType` and `ConstRefType` for explicit references also need to figure out l-value-ness, and these were changed to rely on the logic already added to `QualType` so that it wouldn't have to be duplicated. It isn't clear if this structure is the best way to tackle the problem, but it seems to at least be an upgrade over the more strictly ad-hoc logic that was in place before. Future Work =========== IR-Level Work ------------- The most obvious next step to take is that the split that was made in the compiler front-end needs to be properly plumbed through all of the back-end. There appears to be a lot of code in the back end of the compiler that has made the same conflation of `ref` parameters and explicit reference types that the front-end did. In practice, any uses of `ExplicitRef` in the front-end should desugar into plain pointer-based code in the IR. Clean Up Parameter-Passing Modes -------------------------------- The code that handles different parameter-passing modes (`ParameterDirection`s) and their wrapper types is somewhat scattered and messy (as found while auditing use cases of `RefType`). A cleanup pass is warranted to ensure that most code only needs to think about `ParameterDirection`s. There should ideally be only a single operation in the front-end that handles determining the `ParameterDirection` of a parameter based on its modifiers. Similarly, there should be one operation to wrap a value type based on a parameter direction, and one operation to derive a `ParameterDirection` from the wrapper type. Ideally, the accessors for `FuncType` should not provide unrestricted access to the potentially-wrapped parameter types, and should instead return some kind of `ParamInfo` struct that encodes both a `ParameterDirection` and the unwrapped `Type` of the parameter. Clean Up `QualType` ------------------- A significant piece of future work that appears required is to drastically clean up and improve the way that `QualType`s are represente and handled in the front-end. There are currently various distinct `bool` flags in `QualType` (some with very unclear meaning) and differnet parts of the codebase consult/modify only subsets of them; a clear enumeration of the "value categories" (to use the C++ terminology) that Slang supports could be quite helpful. Naively, a `QualType` should at least encode the basic information that a `Ptr` type encodes: * A value type * Allowed access (read-only, read-write, etc.) * Address space The main additional thing that a `QualType` needs is a way to distinguish cases where an expression evaluates to: * A reference to a memory location, where all the information from a `Ptr` is relevant * A simple value, such that the access and address space are irrelevant * A reference to an abstract storage location (a `property`, `subscript`, or an implicit conversion that needs to support being an l-value), in which case address space is irrelevant and the "allowed access" basically amounts to a listing of the accessors the storage location supports Eliminate Explicit Reference Types ---------------------------------- Finally, twe should eventually eliminate the `ExplicitRef` type from the core module (and all of the supporting code from the front-end), since the feature is not a good fit for the Slang language. We should find some other way to decorate operations in the builtin module that need to returns a reference rather than a value (note how `ref` accessors already avoided exposing explicit reference types, by design). --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com> --- source/slang/slang-check-conversion.cpp | 156 ++++++++++++++++++++++++++------ 1 file changed, 130 insertions(+), 26 deletions(-) (limited to 'source/slang/slang-check-conversion.cpp') diff --git a/source/slang/slang-check-conversion.cpp b/source/slang/slang-check-conversion.cpp index d8643c63f..1dcd0d737 100644 --- a/source/slang/slang-check-conversion.cpp +++ b/source/slang/slang-check-conversion.cpp @@ -1465,7 +1465,7 @@ bool SemanticsVisitor::_coerce( } } - // matrix type with different layouts are convertible + // matrix types with different layouts are convertible if (auto fromMatrixType = as(fromType)) { if (auto toMatrixType = as(toType)) @@ -1491,6 +1491,14 @@ bool SemanticsVisitor::_coerce( } } + // We allow a value of a `struct` type to be coerced to a function + // type if the `struct` provides an appropriate method for calling + // instances of that type. + // + // TODO(tfoley): This can and should be opened up to work for any + // type (or at least any nominal type) that supports the required + // operation. + // if (auto toFuncType = as(toType)) { if (auto fromLambdaType = isDeclRefTypeOf(fromType)) @@ -1541,6 +1549,16 @@ bool SemanticsVisitor::_coerce( // Is toType and fromType the same via some type equality witness? // If so there is no need to do any conversion. // + // Note that this is a somewhat messy case to have, since we *already* + // have a check for type equality above this point. For this code to + // execute we would need to have a case where the `To` and `From` types + // are considered distinct by `Type::equals` but `tryGetSubtypeWitness` + // is still able to produce a witness for the equality of the two types. + // + // TODO(tfoley): Try to set things up so that we can have an invariant + // that two types count as equal for `Type::equals` if and only if a + // type equality witness for those types can be dervied. + // if (isTypeEqualityWitness(fromIsToWitness)) { if (outToExpr) @@ -1562,29 +1580,54 @@ bool SemanticsVisitor::_coerce( return _failedCoercion(toType, outToExpr, fromExpr, sink); } - // We allow implicit conversion of a parameter group type like - // `ConstantBuffer` or `ParameterBlock` to its element - // type `X`. + // If the type that we are converting from is a parameter group type + // (something like `ConstantBuffer` or `ParameterBlock`) and we + // are converting to some type `Y`, then we want to allow for a multi-step + // conversion where we first implicitly dereference the parameter group + // to get an `X`, and then convert the resulting `X` to a `Y`. + // + // An important special case of the above is when `X == Y`, in which + // case we are just converting, e.g., a `ConstantBuffer` to an `X`. + // + // TODO(tfoley): When this conditional detects a parameter group type + // it funnels the coercion logic into only considering conversions that + // involve an automatic dereference. We need to ensure that any other + // kinds of conversion that could apply to a parameter group are considered + // earlier in this function, or else they will never actually be considered. + // Notably, with this logic in place it is impossible for there to be any + // conversion operations from a parameter-group type defined in code + // (e.g., a constructor for a `DescriptorHandle`-like type that takes + // a `ConstantBufer` parameter will never be considered as part of conversion + // logic, because we will first extract the `T` and then try to convert *that*). // if (auto fromParameterGroupType = as(fromType)) { auto fromElementType = fromParameterGroupType->getElementType(); - // If we convert, e.g., `ConstantBuffer to `A`, we will allow - // subsequent conversion of `A` to `B` if such a conversion - // is possible. - // - ConversionCost subCost = kConversionCost_None; - DerefExpr* derefExpr = nullptr; if (outToExpr) { + // TODO(tfoley): The logic here effectively assumes that any + // parameter-group type is read-only, because we are not + // setting the `isLeftValue` flag of the `QualType` based + // on the type of the container. That is, a `StorageBuffer` + // and a `ConstantBuffer` would both derive the `QualType` + // of the dereferenced expression from `X` alone, and ignore + // that one of these should yield an l-value and the other + // shouldn't. + // + // In practice, we should have a centralized function that + // handles dereferenencing of any `Expr`, and computes the + // correct type for the result, so that the logic here can + // exactly mirror other cases of implicit dereference. + // derefExpr = m_astBuilder->create(); derefExpr->base = fromExpr; derefExpr->type = QualType(fromElementType); derefExpr->checked = true; } + ConversionCost subCost = kConversionCost_None; if (!_coerce(site, toType, outToExpr, fromElementType, derefExpr, sink, &subCost)) { return false; @@ -1595,40 +1638,74 @@ bool SemanticsVisitor::_coerce( return true; } - if (auto refType = as(toType)) - { + // Because (for various bad reasons) we currently support an explicit + // `Ref` type (used to define some of our core-module functions), + // we have to account for the case where an expression of type `T` + // is being coerced to a `Ref`. + // + if (auto refType = as(toType)) + { + // TODO(tfoley): This logic is deeply and fundamentally incorrect. + // It presumes that if an expression of type `T` can coerce to + // type `U` then it can also coerce to a *reference* to `U`. + // That means that because we support, say, implicit coercion of + // an `int` to a `float`, this logic will support implicit coercion + // of an `int` l-value to a `Ref`!!!! + // ConversionCost cost; if (!canCoerce(refType->getValueType(), fromType, fromExpr, &cost)) return false; - if (as(toType) && !fromExpr->type.isLeftValue) + + // Depending on whether the result of the coercion would be an l-value + // or not, we may need to restrict the source to be an l-value. + // + // TODO(tfoley): Here we are again hijacking the `QualType` constructor + // to do the direct work. It's still not clear where this logic should + // live. In the longer run, I'm hopeful that we will get rid of + // the explicit `Ref` type entirely (since it was a design mistake to + // begin with), and thus not have to deal with the miserable mess that + // it pushes back on various parts of the compiler. + // + auto qualRefType = QualType(refType); + if (qualRefType.isLeftValue && !fromExpr->type.isLeftValue) + { + // The result type would be an l-value, but the source isn't, + // so there is no way to support the conversion. + // return false; + } + ConversionCost subCost = kConversionCost_GetRef; + if (outCost) + *outCost = subCost; - MakeRefExpr* refExpr = nullptr; if (outToExpr) { - refExpr = m_astBuilder->create(); + auto refExpr = m_astBuilder->create(); refExpr->base = fromExpr; - refExpr->type = QualType(refType); - refExpr->type.isLeftValue = false; + refExpr->type = qualRefType; refExpr->checked = true; *outToExpr = refExpr; } - if (outCost) - *outCost = subCost; + return true; } + // TODO(tfoley): I was told that explicit `Ref` types should not + // be seen by most of the compiler because they would be automatically + // eliminated via `maybeOpenRef()` before other code needs to deal + // with them... but that doesn't seem to be the case given how much + // code here in type coercion is having to account for the possibility + // of `Ref` types. - // Allow implicit dereferencing a reference type. - if (auto fromRefType = as(fromType)) + // If we find ourselves in a situation where we need to coerce an + // expression of type `Ref`, we will first unwrap the reference + // to get an expression of type `T` and then coerce *that*. + // + if (auto fromRefType = as(fromType)) { auto fromValueType = fromRefType->getValueType(); - // If we convert, e.g., `ConstantBuffer to `A`, we will allow - // subsequent conversion of `A` to `B` if such a conversion - // is possible. - // ConversionCost subCost = kConversionCost_None; Expr* openRefExpr = nullptr; @@ -1642,6 +1719,26 @@ bool SemanticsVisitor::_coerce( return false; } + // + // TODO(tfoley): This logic treats the implicit dereferencing + // of a `Ref` as an additional conversion cost, so that + // a function with an explicit `Ref` parameter would end up + // being preferred over one with just a `T`. + // + // Making that distinction and introducing this cost seems to have + // very little benefit, and risks causing developer confusion, + // because for the most part references are invisible to the user + // (intentionally). + // + // We don't want to support explicit `Ref` types in parameter + // positions anyway (people can use either a `ref` parameter or + // an explicit `Ptr`), so the whole thing is moot. + // + // For that matter, we probably should just remove explicit + // `Ref` types from the language, since they were never + // intended to be there in the first place. + // + if (outCost) *outCost = subCost + kConversionCost_ImplicitDereference; return true; @@ -2007,7 +2104,7 @@ bool SemanticsVisitor::tryCoerceLambdaToFuncType( for (auto param : invokeFunc->getParameters()) { auto paramType = getParamTypeWithDirectionWrapper(m_astBuilder, param); - auto toParamType = toFuncType->getParamType(paramId); + auto toParamType = toFuncType->getParamTypeWithDirectionWrapper(paramId); if (!paramType->equals(toParamType)) { return false; @@ -2189,6 +2286,13 @@ Expr* SemanticsVisitor::coerce( // clobber the type on `fromExpr`, and an invariant here is that coercion // really shouldn't *change* the expression that is passed in, but should // introduce new AST nodes to coerce its value to a different type... + // + // TODO(tfoley): Based on the comment above it seems like my past self + // wrote this code, but looking at it now, I'm unsure why we want to return + // an expression with an error type when we have the `toType` that is + // expected *right there*. It would be good to investigate whether changing + // this to return an expression of the expected type would Just Work. + // return CreateImplicitCastExpr(m_astBuilder->getErrorType(), fromExpr); } -- cgit v1.2.3