Unify handling of static and dynamic dispatch for interfaces (#1612)

Overview
========

Prior to this change, we had two different code generation strategies for interface/existential types in Slang that didn't always play nicely together:

* The "legacy" static specialization approach could handle plugging in an arbitrary concrete type for an existential type parameter (including types with resources, etc.), but wouldn't work well with things like a `StructuredBuffer<>` of an interface type, and requires somewhat counter-intuitive layout rules to make work.

* The new dynamic dispatch approach produces simpler, more easily understood layouts by assuming that values of interface type can fit into a fixed number of bytes. The tradeoff there is that it cannot handle types that include resources (only POD types).

The goal of this change is to make it so that the two strategies can co-exist. In particular, in cases where a shader is amenable to both static specialization and dynamic dispatch, the type layouts should agree.

In order to make the type layouts agree, we:

* Declare that *all* values of existential type reserve storage according to the dynamic-dispatch rules (so 16 bytes for the RTTI and witness-table information, plus whatever bytes are needed to store "any value" of a conforming type).

* Then we modify the "legacy" layout rules so that if a value of concrete type can fit in the reserved "any value" space for a given interface, then it is laid out there exactly like the dynamic dispatch rules would do. Otherwise, we fall back to the previous legacy rules (since we don't need to agree with the dynamic-dispatch layout on types that can't be used with dynamic dispatch).

Details
=======

* Renamed `ExistentialBox` to `BoundInterfaceType` to better clarify how it relates to `BindExistentialsType`

* Unconditionally apply the `lowerGenerics` pass during emit, since it is now responsible for aspects of the lowering of existential types when specialization is used.

* Made IR type layout take the target into account, so that the layout of resource types can vary by target (e.g., being POD on some targets, and invalid on others)

* Cleaned up some issues around using global shader parameters as the "key" for their layout information in the global-scope layout (only comes up when there are global-scope `uniform` parameters)

* Added a default any-value size (16) instead of making it an error to leave the size out. This was the simplest option; we could try to go back to having an error, but we'd need to only issue it if we are sure a type/interface is being used with dynamic dispatch, since static dispatch doesn't have to obey the restrictions.

* Changed lowering of existential types to tuples so that bound interfaces where the concrete type won't fit use a "pseudo-pointer" instead of an "any-value" to hold the payload

* Changed IR type legalization to handle the "pseudo-pointer" case and apply layout information from an interface type over to the payload part when static specialization was used.

* Changed some details of how witness tables were being lowered, so that we didn't have to create "proxy" witness tables for the constraints on associated types (just use the actual requirement entries we generate)

* Changed witness tables so that they know the subtype doing the conforming

* Added logic so that we don't generate pack/unpack logic and witness table wrapper functions for types that are incompatible with any-value/dynamic dispatch for a given interface.

* Changed the core AST-level type layout logic to use the dynamic-dispatch layout in case things fit, and the legacy static specialization case when things don't (while also reserving space for the dynamic-dispatch fields)

* Changed a bunch of test cases for static specialization to properly use the new layout (which introduces new buffers in some cases, and moves data around in others).

Future Work
===========

The experience of trying to reconcile our older way of handling interface-type specialization with our newer model (that supports dynamic dispatch) makes it clear that we really need to make similar changes to our handling of generic type parameters on entry points and at the global scope.

A future change should make it so that a global type parameter is lowered with a type layout similar to a value parameter of interface type, including the RTTI and witness-table pieces, and just leaving out the "any value" piece. A similar translation strategy should apply to entry-point generic parameters (mirroring how we lower generic functions for dynamic dispatch already), and value specialization parameters.

Co-authored-by: Yong He <yonghe@outlook.com>
author: Tim Foley <tfoleyNV@users.noreply.github.com> 2020-11-19 01:26:43 -0800
committer: GitHub <noreply@github.com> 2020-11-19 01:26:43 -0800
commit: 4459d4428761b0581b221c52eaea595d1b257a9f (patch)
tree: ff2f3558afb82ee0d1ce0e956b647b0a5e053a9e /source/slang/slang-type-layout.cpp
parent: b59451020eee59cd52e4d8231360ebed4fc59adb (diff)
1 files changed, 190 insertions, 57 deletions
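
Reviewer note: the sizing rule from the Overview can be sketched in a few lines. This is only an illustration under stated assumptions, not code from the commit. The names `existentialUniformSize`, `ConcreteTypeInfo`, `PayloadPlacement`, and `choosePlacement` are hypothetical; only the two 16-byte figures come from the commit message.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical names for illustration only; these are not Slang's actual API.
constexpr std::size_t kDefaultAnyValueSize = 16; // default "any value" reservation
constexpr std::size_t kRttiAndWitnessSize  = 16; // RTTI + witness-table information

// Uniform bytes reserved for one value of interface type, given the
// any-value size declared on the interface (or the default of 16).
constexpr std::size_t existentialUniformSize(
    std::size_t anyValueSize = kDefaultAnyValueSize)
{
    return anyValueSize + kRttiAndWitnessSize;
}

// A concrete type, reduced to the two facts that drive the decision.
struct ConcreteTypeInfo
{
    std::size_t uniformBytes; // ordinary data size under the any-value rules
    bool usesOtherResources;  // needs anything beyond ordinary uniform data
};

enum class PayloadPlacement
{
    InAnyValue,     // laid out exactly as dynamic dispatch would do
    LegacyElsewhere // falls back to the legacy static-specialization rules
};

// Chooses where the payload of a specialized existential would live.
constexpr PayloadPlacement choosePlacement(
    const ConcreteTypeInfo& info,
    std::size_t anyValueSize = kDefaultAnyValueSize)
{
    return (!info.usesOtherResources && info.uniformBytes <= anyValueSize)
        ? PayloadPlacement::InAnyValue
        : PayloadPlacement::LegacyElsewhere;
}
```

With the defaults, every existential value reserves 32 bytes of uniform storage whether or not the payload ends up in the any-value area, which is what lets the static and dynamic layouts agree.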
diff --git a/source/slang/slang-type-layout.cpp b/source/slang/slang-type-layout.cpp
index 33bdb4ef4..a015bfd78 100644
--- a/source/slang/slang-type-layout.cpp
+++ b/source/slang/slang-type-layout.cpp
@@ -1542,6 +1542,11 @@ bool isCUDATarget(TargetRequest* targetReq)
     }
 }
 
+bool areResourceTypesBindlessOnTarget(TargetRequest* targetReq)
+{
+    return isCPUTarget(targetReq) || isCUDATarget(targetReq);
+}
+
 static bool isD3D11Target(TargetRequest*)
 {
     // We aren't officially supporting D3D11 right now
@@ -3675,84 +3680,212 @@ static TypeLayoutResult _createTypeLayout(
         }
         else if( auto interfaceDeclRef = declRef.as<InterfaceDecl>() )
         {
+            RefPtr<ExistentialTypeLayout> typeLayout = new ExistentialTypeLayout();
+            typeLayout->type = type;
+            typeLayout->rules = rules;
+
             // When laying out a type that includes interface-type fields,
             // we cannot know how much space the concrete type that
             // gets stored into the field consumes.
             //
-            // If we were doing layout for a typical CPU target, then
-            // we could just say that each interface-type field consumes
-            // some fixed number of pointers (e.g., a data pointer plus a witness
-            // table pointer).
+            // For target platforms with flexible memory addressing,
+            // we can reserve a fixed amount of uniform/ordinary storage
+            // to hold a value of "any" type, with the expectation that:
             //
-            // We will borrow the intuition from that and invent a new
-            // resource kind for "existential slots" which conceptually
-            // represents the indirections needed to reference the
-            // data to be referenced by this field.
+            // * Values which fit entirely in the storage we've reserved
+            //   will be stored there directly.
             //
-
-            RefPtr<TypeLayout> typeLayout = new TypeLayout();
-            typeLayout->type = type;
-            typeLayout->rules = rules;
-
-            LayoutSize fixedExistentialValueSize = 0;
-            LayoutSize uniformSlotSize = 0;
-            bool targetSupportsPointer =
-                isCPUTarget(context.targetReq) || isCUDATarget(context.targetReq);
-
-            if (targetSupportsPointer)
+            // * Values that are too big to store directly will be referenced
+            //   indirectly, by a pointer stored in the reserved space.
+            //
+            // Note: the latter condition means that the minimum
+            // reservation must be large enough to store a pointer.
+            //
+            // Note: the layout choice here does *not* depend on whether
+            // or not specialization is being used, because we do not
+            // want host code that sets parameters to have to be re-run (and
+            // behave differently) depending on whether specialization is
+            // being used for a particular dispatch.
+            //
+            // For target platforms that do not support flexible memory
+            // addressing, we can follow the same approach in cases
+            // where a value fits in the reserved memory space, and we
+            // will discuss what happens in the other cases in a bit.
+            //
+            // The default reservation will be 16 bytes (and this number
+            // becomes part of our ABI contract), but the `interface`
+            // that is being used to bound the existential can have
+            // an attribute that specifies a different size to use for
+            // its instances.
+            //
+            // Note: changing the "any value size" attribute for an interface
+            // breaks binary compatibility with existing code that uses
+            // or implements that interface).
+            //
+            LayoutSize fixedExistentialValueSize = 16;
+            if (auto anyValueAttr =
+                    interfaceDeclRef.getDecl()->findModifier<AnyValueSizeAttribute>())
             {
-                fixedExistentialValueSize = 16;
-                if (auto anyValueAttr =
-                        interfaceDeclRef.getDecl()->findModifier<AnyValueSizeAttribute>())
-                {
-                    fixedExistentialValueSize = anyValueAttr->size;
-                }
-                // Append 16 bytes to accommodate RTTI pointer and witness table pointer.
-                uniformSlotSize = fixedExistentialValueSize + 16;
-                typeLayout->addResourceUsage(LayoutResourceKind::Uniform, uniformSlotSize);
+                fixedExistentialValueSize = anyValueAttr->size;
             }
+
+            // The `fixedExistentialValueSize` only accounts for the storage
+            // of a value that conforms to the interface type; you can think
+            // of it like a C `union` where it stores the bits of a value, but
+            // has no way of knowing what the type of the value is.
+            //
+            // For dynamic dispatch we also need to be able to know two key
+            // pieces of information:
+            //
+            // * Some kind of run-time type information (RTTI) that can identify
+            //   the actual type stored in the existential, and which can therefore
+            //   be used to allocate/copy/release the value stored.
+            //
+            // * A value that "witnesses" the fact that the above type actually
+            //   implements the interface, and thus gives us a way to look up
+            //   methods, etc. that implement the interface operations for that
+            //   type. For a C++-minded programmer, you can think of  this like
+            //   a virtual function table pointer, stored alongside the object pointer.
+            //
+            // We reserve 16 bytes to accomodate the RTTI and witness table information,
+            // which should be enough space to store a pointer for each on 64-bit
+            // platforms. Note that we don't try to vary this size based on platform-specific
+            // information, because we prefer to keep the encoding of existentials as
+            // simple as we can get away with.
+            //
+            // TODO: This layout logic does *not* accomodate the case where an
+            // existential type is formed from a conjuction of interfaces (e.g.,
+            // a type like `IReadable & IWritable`). In such a case we'd have
+            // to change the layout to accomodate N >= 0 witness tables, either
+            // stored directly in the existential value, or pointed to indirectly
+            // to keep the size independent of N.
+            //
+            LayoutSize uniformSlotSize = fixedExistentialValueSize + 16;
+            typeLayout->addResourceUsage(LayoutResourceKind::Uniform, uniformSlotSize);
+
+            // In addition to the uniform/ordinary storage, we will mark
+            // every interface-type parameter as consuming a few additional
+            // "fictitious" resources that allow applications to keep track
+            // of existential-type parameters in case they want to perform
+            // specialization.
+            //
+            // Each leaf parameter of existential type introduces a potential
+            // specialization parameter into the program, so we add the
+            // parameter to represent that here.
+            //
             typeLayout->addResourceUsage(LayoutResourceKind::ExistentialTypeParam, 1);
-            typeLayout->addResourceUsage(LayoutResourceKind::ExistentialObjectParam, 1);
 
-            // If there are any concrete types available, the first one will be
-            // the value that should be plugged into the slot we just introduced.
+            // A leaf parameter of existential type also introduces a conceptual
+            // "sub-object" that needs to be tracked by an application building
+            // a shader object or parameter block abstraction.
             //
-            if (context.specializationArgCount)
+            typeLayout->addResourceUsage(LayoutResourceKind::ExistentialObjectParam, 1);
+            //
+            // Note: It might be unclear at this point what the difference is between
+            // `ExistentialTypeParam` and `ExistentialObjectParam` is. The reason for
+            // the confusion is that in this code we are only looking at a single
+            // leaf parameter with a type like `ILight`, which both introduces the
+            // type parameter (for picking a specialized light type), and the object
+            // parameter (for passing in the actual light data).
+            //
+            // In a more general setting we might have `ILight someLights[10]`, and
+            // in that case we would expect to have ten `ExistentialObjectParam`s
+            // (one for each light in the array), but for specialization we would
+            // still only want one `ExistentialTypeParam`.
+            //
+            // Keeping the `LayoutResourceKind`s separate allows us to scale them
+            // differently when a type gets used as part of an array or buffer.
+
+            // At this point we have determined the layout of the existential
+            // type itself, but there are additional steps we need to take
+            // if we are on a platform that doesn't support general-purpose
+            // pointers and addressing *and* we also know of a concrete
+            // type argument that the parameter will be specialized to.
+            //
+            bool targetSupportsPointer =
+                isCPUTarget(context.targetReq) || isCUDATarget(context.targetReq);
+            bool hasConcreteSpecializationArg = context.specializationArgCount != 0;
+            if (!targetSupportsPointer && hasConcreteSpecializationArg)
             {
+                // We have a concrete specialization argument, so we
+                // can determine the concrete type that is going to
+                // be stored in this parameter.
+                //
                 auto& specializationArg = context.specializationArgs[0];
                 Type* concreteType = as<Type>(specializationArg.val);
                 SLANG_ASSERT(concreteType);
 
-                // Always use AnyValueRules regardless of the enclosing environment's layout rule
-                // for existential values.
+                // Our first job here is to figure out how `concreteType` will
+                // be laid out when stored into this existential.
+                //
+                // We know that *if* the value fits in the "any value" storage,
+                // then that is where it will be stored. We start by computing
+                // how much space the value would take up if stored in
+                // the any-value area.
+                //
                 auto anyValueRules = context.getRulesFamily()->getAnyValueRules();
+                RefPtr<TypeLayout> concreteTypeAnyValueLayout =
+                    createTypeLayout(context.with(anyValueRules), concreteType);
 
-                // TODO: for traditional GPU targets (HLSL/GLSL) we don't force
-                // anyValueRule for now, since it requires additional work to load
-                // the existential value. We should remove this special case logic
-                // and always use anyValueRule once we implement the correct loading
-                // code gen logic for these targets.
-                if (!targetSupportsPointer)
-                    anyValueRules = context.rules;
+                // We will look at the resource usage of the concrete type
+                // to determine if it "fits" in the reserved space.
+                //
+                bool fits = true;
+                for(auto usage : concreteTypeAnyValueLayout->resourceInfos)
+                {
+                    if(usage.kind == LayoutResourceKind::Uniform)
+                    {
+                        // If the amount of uniform storage that the concrete type
+                        // requires is more than has been reserved, when the
+                        // type does not fit.
+                        //
+                        if(usage.count > fixedExistentialValueSize)
+                        {
+                            fits = false;
+                            break;
+                        }
+                    }
+                    else
+                    {
+                        // If the concrete type requires any kind of storage
+                        // beyond ordinary uniform data, then it also
+                        // does not fit.
+                        //
+                        // TODO: Make sure this is okay with nested existentials.
+                        //
+                        fits = false;
+                        break;
+                    }
+                }
 
-                RefPtr<TypeLayout> concreteTypeLayout =
-                    createTypeLayout(context.with(anyValueRules), concreteType);
-                if (!targetSupportsPointer)
+                // If the value does fit, then there is nothing else to be
+                // done; the layout that would have been computed without
+                // knowing the `concreteType` is sufficient.
+                //
+                // If the value does *not* fit, then we need to figure out
+                // where the excess data will go.
+                //
+                if(!fits)
                 {
-                    // For targets that supports pointers, oversized existential values
-                    // should be placed in an overflow region and only a pointer is needed in
-                    // the place of the fixed sized uniform slot.
-                    // We only need the "pending layout" mechanism for targets that does not
-                    // support pointers.
-
-                    // For legacy targets without pointer support, the layout for this
-                    // specialized interface type then results in a type layout that tracks
-                    // both the resource usage of the interface type itself (just the
-                    // type + value slots introduced above), plus a "pending data" type that
-                    // represents the value conceptually pointed to by the interface-type
-                    // field/variable at runtime.
+                    // If we were doing layout for a typical CPU target, then
+                    // we could just say that the fixed-size storage contains
+                    // a data pointer to a "payload" of the data that wouldn't fit.
+                    //
+                    // We will borrow intuition from the approach, by saying that
+                    // the payload is stored somewhere else, but we will *not*
+                    // lock down where precisely "somewhere else" is going to be
+                    // at this point.
+                    //
+                    // Instead, we will store information about the layout of
+                    // the data that needs to go somewhere else, and leave it
+                    // up to the parent type/context to find a suitable place
+                    // for the data.
+                    //
+                    // Because we know the layout of the data, but not the placement,
+                    // it is considered to be a "pending" part of the type layout.
                     //
-                    typeLayout->pendingDataTypeLayout = concreteTypeLayout;
+                    typeLayout->pendingDataTypeLayout =
+                        createTypeLayout(context, concreteType);
                 }
             }
             // Interface type occupies a uniform slot for the fixed size storage, with alignment of 4 bytes.
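
Reviewer note: the "fits" test in the hunk above can be read in isolation as follows. This is a minimal standalone sketch, assuming simplified stand-ins for the real layout types. `ResourceKind`, `ResourceUsage`, `fitsInAnyValue`, and `needsPendingData` are hypothetical names and do not exist in Slang; the logic mirrors the loop over `resourceInfos` and the `!targetSupportsPointer && hasConcreteSpecializationArg` guard.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for LayoutResourceKind and the per-kind usage records.
enum class ResourceKind { Uniform, ShaderResource, SamplerState };

struct ResourceUsage
{
    ResourceKind kind;
    std::size_t count; // bytes for Uniform, slots for the other kinds
};

// Returns true when a concrete type, laid out under the any-value rules,
// can live entirely in the reserved any-value area: it may only consume
// uniform bytes, and no more of them than were reserved.
inline bool fitsInAnyValue(
    const std::vector<ResourceUsage>& usages,
    std::size_t reservedBytes)
{
    for (const auto& usage : usages)
    {
        if (usage.kind != ResourceKind::Uniform)
            return false; // any non-uniform resource means "does not fit"
        if (usage.count > reservedBytes)
            return false; // too many bytes for the reservation
    }
    return true;
}

// Pending data is only recorded on targets without general pointers,
// when a concrete specialization argument is known and it does not fit.
inline bool needsPendingData(
    bool targetSupportsPointer,
    bool hasConcreteSpecializationArg,
    bool fits)
{
    return !targetSupportsPointer && hasConcreteSpecializationArg && !fits;
}
```

Note that a type with no resource usage at all trivially fits, and that on pointer-capable targets (CPU and CUDA in the diff) the pending-data path is never taken.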