| author | Theresa Foley <10618364+tangent-vector@users.noreply.github.com> | 2025-06-18 17:11:16 -0700 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2025-06-19 00:11:16 +0000 |
| commit | 3ed77615924dc41b8b2f286d4ac646f625cd946c (patch) | |
| tree | ad278f918c90ffa3798bbae497ca3d73aa954ce8 /source | |
| parent | 97f328a669da035025a49d5b322a646bf97340f0 (diff) | |
Add support for on-demand AST deserialization (#7482)
Note that this change does not actually *enable* on-demand deserialization of ASTs, because doing so is incompatible with the current compiler architecture where we have both an `ASTBuilder` and a `SharedASTBuilder`, and there are important invariants about how all AST nodes related to the core module must be created before those of any module using the core module.
Instead, this change simply adds the *infrastructure* for on-demand deserialization, and ensures that those code paths get used at runtime, but actually "demands" all of the nodes in a given serialized AST immediately as part of the deserialization process.
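To make the distinction between "infrastructure in place" and "actually lazy" concrete, here is a minimal sketch of the general pattern, not the code from this change. The names `Node`, `LazyMembers`, and `demandAllImmediately` are hypothetical and exist only for illustration: slots start out null, each is filled on first access, and an eager flag forces every slot to be demanded up front while still going through the lazy code path.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an AST declaration; not a Slang type.
struct Node
{
    std::string name;
};

// Hypothetical container whose members are materialized on first access.
class LazyMembers
{
public:
    using Loader = std::function<Node*(std::size_t)>;

    LazyMembers(std::size_t count, Loader loader, bool demandAllImmediately)
        : m_slots(count, nullptr), m_loader(std::move(loader))
    {
        // The lazy code path is always used, but when this flag is set
        // every member is "demanded" right away, so loading is eager.
        if (demandAllImmediately)
            loadAll();
    }

    std::size_t count() const { return m_slots.size(); }

    // Returns the member at `index`, invoking the loader on first access.
    Node* get(std::size_t index)
    {
        assert(index < m_slots.size());
        if (!m_slots[index])
        {
            m_slots[index] = m_loader(index);
            ++m_loadCount;
        }
        return m_slots[index];
    }

    // Forces every slot to be filled by querying each index once.
    void loadAll()
    {
        for (std::size_t i = 0; i < m_slots.size(); ++i)
            get(i);
    }

    // Number of loader invocations so far (useful for checking laziness).
    std::size_t loadCount() const { return m_loadCount; }

private:
    std::vector<Node*> m_slots;
    Loader m_loader;
    std::size_t m_loadCount = 0;
};
```

With `demandAllImmediately` set to `false`, only the slots that are actually queried invoke the loader. With it set to `true`, observable behavior matches fully eager loading, which is roughly the state this change leaves the compiler in.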
Important notes about the implementation approach:
* PR #7242 ensured that all of the code accessing the direct member declarations of a `ContainerDecl` went through a small(-ish) set of accessor methods. This change takes advantage of that work by further abstracting the storage of the direct member declarations out in a type, `ContainerDeclDirectMemberDecls`, which makes it easy to add custom serialization logic for just that type.
* The `ContainerDeclDirectMemberDecls` type also stores two pointers (one a `RefPtr` and the other a plain pointer) that are only used in the case where the members of a given `ContainerDecl` are being accessed through on-demand deserialization. This can be queried using the `isUsingOnDemandDeserialization()` method but any code accessing a `ContainerDecl` through the intended public API should never need to care about that detail.
* Many of the accessor methods that were added in PR #7242 now branch on whether `isUsingOnDemandDeserialization()` is set. The normal code path is unchanged, and the implementation logic for the on-demand-deserialization case is largely held in `slang-serialize-ast.cpp`, to keep it close to the definitions of the serialized data structures themselves.
* A few types in the `slang-ast-*.h` headers have had `FIDDLE()` annotations added to them, so that they can be used to synthesize some of the serialization logic that was previously hand-written.
* The `_registerBuiltinDeclsRec()` function (which is used to scan the built-in module ASTs for the various "magic" declarations that the `SharedASTBuilder` needs to know about) was factored a bit to support the way that registration needs to behave differently in the case of loading a serialized module (if we kept using the existing recursive search, then it would force every declaration in the core module to be loaded right away). The new `_collectBuiltinDeclsThatNeedRegistrationRec()` function mirrors the overall traversal pattern to produce a flat list that gets included in the serialized AST module. Note in particular that we no longer call `registerBuiltinDecls()` from within `_readBuiltinModule()`.
* The interface of the `Module` type was slightly expanded so that there is a more complete API for accessing the declarations exported from the module. Previously they could only be queried by their mangled name, but the new API also allows the entire list to be iterated over. The `ensureLookupAcceleratorBuilt()` method factors out the logic for building those data structures for a module. Note that in the case where on-demand deserialization is being used for a module, the `findExportedDeclByMangledName()` query will use serialized data directly, rather than build the lookup accelerators as C++ data structures (this is required if we are to avoid immediately deserializing all of the (exported) declarations in the core module as soon as it is loaded).
* A few methods related to loading serialized modules (e.g., `loadSerializedModule()`) have been updated so that along with a pointer to the serialized `ModuleChunk` (which, for those who aren't aware, is a pointer directly into the serialized bytes of the module file), they receive an `ISlangBlob` that refers to the entire blob holding the serialized data (which the `ModuleChunk` is part of). Passing this pointer down allows code running under these methods to retain a reference-counted pointer to the blob to stop the memory of the serialized module from being released until deserialization has been completed.
* The data types defined in `slang-fossil.h` have been overhauled significantly:
* The most important change that is relevant to this work is the introduction of the `Fossilized<T>` template, which is used to statically map a "live" C++ type `T` to its binary fossilized representation. The `slang-fossil.h` file provides infrastructure allowing `Fossilized<T>` to be specialized for user-defined types, and also provides the necessary mappings for the core types like strings, arrays, and dictionaries.
* A key point is that in C++ code, one can take a value of some type `Foo`, serialize it using a `Fossil::SerialWriter`, get a pointer to that serialized data, and then directly cast it to a `Fossilized<Foo>*` and navigate the serialized data directly (without deserializing it back into a `Foo`). For that process to work, any specialization of `Fossilized<T>` must be sure to match the layout that will be produced by the `serialize()` implementation for `T`, when writing to a `Fossil::SerialWriter`.
* Another key change in the public interface of `slang-fossil.h` is that dynamically-typed traversal of the data used to be handled just with `FossilizedValRef`, but now uses a few different types. The `Fossil::ValRef<T>` and `Fossil::AnyValRef` types are used to capture the use cases that want reference-like behavior (basically a `Fossil::ValRef<T>` can be thought of as sort of like a `T&`), while `Fossil::ValPtr<T>` and `Fossil::AnyValPtr` are used for cases that want pointer-like behavior (akin to `T*`).
* Then there are related changes in `slang-serialize-fossil.*`:
* The implementation of `Fossil::SerialReader` has been changed to use `Fossil::AnyValPtr` in most places where it formerly used `FossilizedValRef`. Using pointers (that can be null) instead of a weird kind of pseudo-reference (that could still be null) to traverse things was making the code harder to follow than it ought to be, in terms of understanding the levels of indirection in various places.
* Some of the state that was previously in `Fossil::SerialReader` has been split into `Fossil::ReadContext`. This type allows multiple `Fossil::SerialReader`s to be created to read from the same serialized blob(s), while maintaining a persistent mapping from fossilized data pointers to live object pointers. The `ReadContext` also maintains the work list of deferred deserialization actions waiting to be performed, and only flushes that list when the last currently-open `SerialReader` is about to go out of scope.
* In order to support the split of `Fossil::SerialReader` described above (and also to clean up something that didn't quite feel right in the original serialization design) the base serialization framework in `slang-serialize.h` has been tweaked so that a `Serializer` now wraps *two* pointers instead of just one. The first pointer continues to be an implementation of `ISerializerImpl`, which handles the actual reading/writing of data, while the other pointer is an explicit "context" pointer for operations that need additional user-defined context.
* Similar to the changes made to the accessors for direct member declarations in a `ContainerDecl`, the `Module::findExportedDeclByMangledName()` method was updated to conditionally execute a different code path in the case of a module that has been loaded from serialized data.
* Some improvements have been made to the fiddle tool:
* Most importantly, the error-handling logic around Lua script execution has been cleaned up to better match correct Lua idiom. Native functions exposed to the Lua scripts have been changed to just use `lua_call` instead of `lua_pcall`, so rather than attempt to intercept Lua errors they will just automatically propagate them.
* All Lua-related errors are caught at the top level, and reported in a way that uses the source location of the fiddle template that was being evaluated when the error was raised. In most cases, a Lua error should be accompanied by a stack trace of the Lua evaluation state. The file paths and line numbers given should be accurate, but aren't directly double-clickable in the Visual Studio output panel, because they use a different format (a good future change might be to process the Lua stack trace and rewrite it into a format that is better for our needs).
* Fixed a subtle bug where having "raw" content (parts of the template that should neither be evaluated nor emitted into the output) that consisted of only whitespace could result in a template being translated to invalid Lua code.
* The bulk of the change is, unsurprisingly, in `slang-serialize-ast.cpp`.
* This file has been refactored enough to look like a complete rewrite. A lot of work has been put into comments that describe the overall approach being taken, so hopefully it can be understood even by somebody who wasn't familiar with the previous code. Some of these are just plain cleanups, rather than being directly related to on-demand serialization.
* Where possible, the code for reading and writing types that needed custom serialization has been moved so that the read/write functions are next to one another, making it easier to visually confirm that the serialized representations match on the read and write sides.
* Where possible, the serialization logic for all types (not just the AST nodes, as was the case before) is being generated via fiddle.
* Rather than just defining `serialize()` overloads for each of the relevant types, the code now defines `Fossilized<...>` specializations for these types as well, to enable statically-typed in-memory traversal of the serialized data. Note, however, that for the most part the `Fossilized<...>` representation types are *not* being used by the code (really only the `ASTModuleInfo` and `ContainerDeclDirectMemberDeclsInfo` types are traversed directly). This can be considered more as work to prove out the design of the `Fossilized<...>` template approach, and it may or may not end up being relevant in the future.
* The trivial bit of work to enable on-demand deserialization is in `ASTSerialReadContext::handleContainerDeclDirectMemberDecls()` where, rather than recursively reading the contained declarations, the method effectively just grabs the current cursor of the `Fossil::SerialReader` (which is pointed into the fossilized data) and stashes it into the `ContainerDeclDirectMemberDecls`, along with a `RefPtr` to the `ASTSerialReadContext` itself. Those stashed pointers are what enable the accessors on `ContainerDeclDirectMemberDecls` to look up information on-demand.
* The more interesting bits of the approach mostly come at the end of the file, where the accessor operations for on-demand deserialization are implemented. Once all the relevant work has been done to write the data structures, and produce `Fossilized<...>` types with the right layout, the work itself may seem almost trivial: a little bit of array iteration, and a little bit of binary-search lookup.
* As a reminder, all of this infrastructure for on-demand deserialization is now in place and able to be invoked by the rest of the compiler, but declarations are currently all being loaded eagerly. The `SLANG_DISABLE_ON_DEMAND_AST_DESERIALIZATION` macro is being used to enable a small bit of extra logic in `ASTSerialReadContext::_cleanUpASTNode` so that the "cleanup" on a just-deserialized `ContainerDecl` includes eagerly querying its list of direct member declarations, which will cause them to be recursively deserialized.
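The "binary-search lookup" mentioned in the notes above, combined with the three-way `compare()` helper this change adds for `UnownedStringSlice`, can be illustrated with a small sketch. It assumes a table of exported names sorted by mangled name. `ExportEntry`, `compareSlices`, `findExportIndex`, and the sample mangled names used with it are hypothetical, and `std::string_view` stands in for `UnownedStringSlice`. The real serialized layout in `slang-serialize-ast.cpp` may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string_view>
#include <vector>

// Three-way comparison in the style of the `compare()` helper in the diff:
// compare the shared prefix bytewise, then break ties on length.
// Returns <0, 0, or >0. (Explicit branches are used for the length
// tie-break instead of subtracting sizes; that is a choice of this sketch.)
inline int compareSlices(std::string_view lhs, std::string_view rhs)
{
    std::size_t shared = std::min(lhs.size(), rhs.size());
    int prefixCmp = shared ? std::memcmp(lhs.data(), rhs.data(), shared) : 0;
    if (prefixCmp != 0)
        return prefixCmp;
    if (lhs.size() == rhs.size())
        return 0;
    return lhs.size() < rhs.size() ? -1 : 1;
}

// Hypothetical entry in a table of exports, sorted by `mangledName`.
struct ExportEntry
{
    std::string_view mangledName;
    int declIndex; // index of the declaration to deserialize on a hit
};

// Binary search over the sorted table. Returns the matching `declIndex`,
// or -1 if no entry has the given name. No entry other than those probed
// is touched, which is what keeps this kind of lookup cheap.
inline int findExportIndex(const std::vector<ExportEntry>& sorted, std::string_view name)
{
    std::size_t lo = 0;
    std::size_t hi = sorted.size();
    while (lo < hi)
    {
        std::size_t mid = lo + (hi - lo) / 2;
        int cmp = compareSlices(sorted[mid].mangledName, name);
        if (cmp == 0)
            return sorted[mid].declIndex;
        if (cmp < 0)
            lo = mid + 1;
        else
            hi = mid;
    }
    return -1;
}
```

The table must be sorted with the same ordering that `compareSlices` defines (bytewise, shorter-prefix first), otherwise the search can miss entries that are present.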
Diffstat (limited to 'source')
26 files changed, 3941 insertions, 1203 deletions
diff --git a/source/core/core.natvis b/source/core/core.natvis index f2547b3fe..b9e7009e4 100644 --- a/source/core/core.natvis +++ b/source/core/core.natvis @@ -102,6 +102,14 @@ </Expand> </Type> +<Type Name="Slang::RelativePtr<*,*>"> + <SmartPointer Usage="Minimal">_offset == 0 ? nullptr : ($T1*)((char*)this + _offset)</SmartPointer> + <DisplayString Condition="_offset == 0">{($T1*)0}</DisplayString> + <DisplayString Condition="_offset != 0">{($T1*)((char*)this + _offset)}</DisplayString> + <Expand> + <ExpandedItem>_offset == 0 ? nullptr : ($T1*)((char*)this + _offset)</ExpandedItem> + </Expand> +</Type> <Type Name="Slang::Safe32Ptr<*>"> <Expand> diff --git a/source/core/slang-string.cpp b/source/core/slang-string.cpp index e9804eaa8..a0612ccda 100644 --- a/source/core/slang-string.cpp +++ b/source/core/slang-string.cpp @@ -727,6 +727,22 @@ UnownedStringSlice UnownedStringSlice::subString(Index idx, Index len) const return UnownedStringSlice(m_begin + idx, m_begin + idx + len); } +int compare(UnownedStringSlice const& lhs, UnownedStringSlice const& rhs) +{ + auto lhsSize = lhs.getLength(); + auto rhsSize = rhs.getLength(); + + auto lhsData = lhs.begin(); + auto rhsData = rhs.begin(); + + auto sharedPrefixSize = std::min(lhsSize, rhsSize); + int sharedPrefixCmp = memcmp(lhsData, rhsData, sharedPrefixSize); + if (sharedPrefixCmp != 0) + return sharedPrefixCmp; + + return int(lhsSize - rhsSize); +} + bool UnownedStringSlice::operator==(ThisType const& other) const { // Note that memcmp is undefined when passed in null ptrs, so if we want to handle diff --git a/source/core/slang-string.h b/source/core/slang-string.h index 3da0db6b9..6d84a0c95 100644 --- a/source/core/slang-string.h +++ b/source/core/slang-string.h @@ -218,6 +218,14 @@ protected: char const* m_end; }; +/// Three-way comparison of string slices. 
+/// +/// * Returns 0 if `lhs == rhs` +/// * Returns a value < 0 if `lhs < rhs` +/// * Returns a value > 0 if `lhs > rhs` +/// +int compare(UnownedStringSlice const& lhs, UnownedStringSlice const& rhs); + // A more convenient way to make slices from *string literals* template<size_t SIZE> SLANG_FORCE_INLINE UnownedStringSlice toSlice(const char (&in)[SIZE]) diff --git a/source/slang/slang-ast-base.h b/source/slang/slang-ast-base.h index 5b77f9d53..ac963861a 100644 --- a/source/slang/slang-ast-base.h +++ b/source/slang/slang-ast-base.h @@ -767,7 +767,7 @@ public: /// method, which ensures that the `_prevInContainerWithSameName` fields /// have been properly set for all declarations in that container. /// - Decl* _prevInContainerWithSameName = nullptr; + FIDDLE() Decl* _prevInContainerWithSameName = nullptr; bool isChecked(DeclCheckState state) const { return checkState >= state; } void setCheckState(DeclCheckState state) diff --git a/source/slang/slang-ast-decl.cpp b/source/slang/slang-ast-decl.cpp index f37ebef48..4d9b8718f 100644 --- a/source/slang/slang-ast-decl.cpp +++ b/source/slang/slang-ast-decl.cpp @@ -51,80 +51,167 @@ bool isInterfaceRequirement(Decl* decl) } // -// ContainerDecl +// ContainerDeclDirectMemberDecls // -List<Decl*> const& ContainerDecl::getDirectMemberDecls() +void ContainerDeclDirectMemberDecls::_initForOnDemandDeserialization( + RefObject* deserializationContext, + void const* deserializationData, + Count declCount) { - return _directMemberDecls.decls; + SLANG_ASSERT(deserializationContext); + SLANG_ASSERT(deserializationData); + + SLANG_ASSERT(!decls.getCount()); + SLANG_ASSERT(!onDemandDeserialization.context); + + onDemandDeserialization.context = deserializationContext; + onDemandDeserialization.data = deserializationData; + + for (Index i = 0; i < declCount; ++i) + decls.add(nullptr); } -Count ContainerDecl::getDirectMemberDeclCount() +void ContainerDeclDirectMemberDecls::_readAllSerializedDecls() const { - return 
_directMemberDecls.decls.getCount(); + SLANG_ASSERT(isUsingOnDemandDeserialization()); + + // We start by querying each of the contained decls + // by index, which should cause the entire `decls` + // array to be filled in. + // + auto declCount = getDeclCount(); + for (Index i = 0; i < declCount; ++i) + { + auto decl = getDecl(i); + SLANG_UNUSED(decl); + } + + // At this point, we have loaded all the information + // that was in the serialized representation, and + // don't need to keep doing on-demand loading. + // Thus, we clear out the pointer to the serialized + // data (which will cause later calls to + // `isDoingOnDemandSerialization()` to return `false`). + // + // Note that we do *not* clear out the `context` pointer + // used for on-demand deserialization, because in the + // case where we are storing the members of a `ModuleDecl`, + // that context will hold the additional state needed to + // look up declarations by their mangled names, and we + // want to retain that state. The + // `isUsingOnDemandDeserializationForExports()` query + // is based on the `context` pointer only, so it will + // continue to return `true`. + // + onDemandDeserialization.data = nullptr; + + _invalidateLookupAccelerators(); } -Decl* ContainerDecl::getDirectMemberDecl(Index index) +List<Decl*> const& ContainerDeclDirectMemberDecls::getDecls() const { - return _directMemberDecls.decls[index]; + if (isUsingOnDemandDeserialization()) + { + _readAllSerializedDecls(); + } + + return decls; } -Decl* ContainerDecl::getFirstDirectMemberDecl() +Count ContainerDeclDirectMemberDecls::getDeclCount() const { - if (getDirectMemberDeclCount() == 0) - return nullptr; - return getDirectMemberDecl(0); + // Note: in the case of on-demand deserialization, + // the number of elements in the `decls` list + // will be correct, although one or more of the + // pointers in it might be null. 
+ // + return decls.getCount(); } -DeclsOfNameList ContainerDecl::getDirectMemberDeclsOfName(Name* name) +Decl* ContainerDeclDirectMemberDecls::getDecl(Index index) const { - return DeclsOfNameList(findLastDirectMemberDeclOfName(name)); + auto decl = decls[index]; + if (!decl && isUsingOnDemandDeserialization()) + { + decl = _readSerializedDeclAtIndex(index); + decls[index] = decl; + } + return decl; } -Decl* ContainerDecl::findLastDirectMemberDeclOfName(Name* name) +Decl* ContainerDeclDirectMemberDecls::findLastDeclOfName(Name* name) const { - _ensureLookupAcceleratorsAreValid(); - if (auto found = _directMemberDecls.accelerators.mapNameToLastDeclOfThatName.tryGetValue(name)) - return *found; + if (isUsingOnDemandDeserialization()) + { + if (auto found = accelerators.mapNameToLastDeclOfThatName.tryGetValue(name)) + return *found; + + Decl* decl = _readSerializedDeclsOfName(name); + accelerators.mapNameToLastDeclOfThatName.add(name, decl); + return decl; + } + else + { + _ensureLookupAcceleratorsAreValid(); + if (auto found = accelerators.mapNameToLastDeclOfThatName.tryGetValue(name)) + return *found; + } return nullptr; } -Decl* ContainerDecl::getPrevDirectMemberDeclWithSameName(Decl* decl) +Dictionary<Name*, Decl*> ContainerDeclDirectMemberDecls::getMapFromNameToLastDeclOfThatName() const { - SLANG_ASSERT(decl); - SLANG_ASSERT(decl->parentDecl == this); + if (isUsingOnDemandDeserialization()) + { + // If we have been using on-demand deserialization, + // then the `mapNameToLastDeclOfThatName` dictionary + // may not accurately reflect the contained declarations. + // We need to force all of the declarations to be + // deserialized immediately, which will also have + // the effect of invalidating the accelerators so + // that they can be rebuilt to contain complete information. 
+ // + _readAllSerializedDecls(); + } _ensureLookupAcceleratorsAreValid(); - return decl->_prevInContainerWithSameName; + return accelerators.mapNameToLastDeclOfThatName; } -void ContainerDecl::addDirectMemberDecl(Decl* decl) -{ - if (!decl) - return; - decl->parentDecl = this; - _directMemberDecls.decls.add(decl); +List<Decl*> const& ContainerDeclDirectMemberDecls::getTransparentDecls() const +{ + if (isUsingOnDemandDeserialization()) + { + if (accelerators.filteredListOfTransparentDecls.getCount() == 0) + { + _readSerializedTransparentDecls(); + } + } + else + { + _ensureLookupAcceleratorsAreValid(); + } + return accelerators.filteredListOfTransparentDecls; } -List<Decl*> const& ContainerDecl::getTransparentDirectMemberDecls() +bool ContainerDeclDirectMemberDecls::isUsingOnDemandDeserialization() const { - _ensureLookupAcceleratorsAreValid(); - return _directMemberDecls.accelerators.filteredListOfTransparentDecls; + return onDemandDeserialization.data != nullptr; } -bool ContainerDecl::_areLookupAcceleratorsValid() +bool ContainerDeclDirectMemberDecls::_areLookupAcceleratorsValid() const { - return _directMemberDecls.accelerators.declCountWhenLastUpdated == - _directMemberDecls.decls.getCount(); + return accelerators.declCountWhenLastUpdated == decls.getCount(); } -void ContainerDecl::_invalidateLookupAccelerators() +void ContainerDeclDirectMemberDecls::_invalidateLookupAccelerators() const { - _directMemberDecls.accelerators.declCountWhenLastUpdated = -1; + accelerators.declCountWhenLastUpdated = -1; } -void ContainerDecl::_ensureLookupAcceleratorsAreValid() +void ContainerDeclDirectMemberDecls::_ensureLookupAcceleratorsAreValid() const { if (_areLookupAcceleratorsValid()) return; @@ -133,24 +220,28 @@ void ContainerDecl::_ensureLookupAcceleratorsAreValid() // the accelerators are entirely invalidated, and must be rebuilt // from scratch. 
// - if (_directMemberDecls.accelerators.declCountWhenLastUpdated < 0) + if (accelerators.declCountWhenLastUpdated < 0) { - _directMemberDecls.accelerators.declCountWhenLastUpdated = 0; - _directMemberDecls.accelerators.mapNameToLastDeclOfThatName.clear(); - _directMemberDecls.accelerators.filteredListOfTransparentDecls.clear(); + accelerators.declCountWhenLastUpdated = 0; + accelerators.mapNameToLastDeclOfThatName.clear(); + accelerators.filteredListOfTransparentDecls.clear(); } - // are we a generic? - GenericDecl* genericDecl = as<GenericDecl>(this); - - Count memberCount = _directMemberDecls.decls.getCount(); - Count memberCountWhenLastUpdated = _directMemberDecls.accelerators.declCountWhenLastUpdated; + Count memberCount = decls.getCount(); + Count memberCountWhenLastUpdated = accelerators.declCountWhenLastUpdated; SLANG_ASSERT(memberCountWhenLastUpdated >= 0 && memberCountWhenLastUpdated <= memberCount); + // are we a generic? + GenericDecl* genericDecl = nullptr; + if (memberCount > 0) + { + genericDecl = as<GenericDecl>(decls[0]->parentDecl); + } + for (Index i = memberCountWhenLastUpdated; i < memberCount; ++i) { - Decl* memberDecl = _directMemberDecls.decls[i]; + Decl* memberDecl = decls[i]; // Transparent member declarations will go into a separate list, // so that they can be conveniently queried later for lookup @@ -163,7 +254,7 @@ void ContainerDecl::_ensureLookupAcceleratorsAreValid() // if (memberDecl->hasModifier<TransparentModifier>()) { - _directMemberDecls.accelerators.filteredListOfTransparentDecls.add(memberDecl); + accelerators.filteredListOfTransparentDecls.add(memberDecl); } // Members that don't have a name don't go into the lookup dictionary. @@ -190,9 +281,7 @@ void ContainerDecl::_ensureLookupAcceleratorsAreValid() // all of the overloaded functions with a given name. 
// Decl* prevMemberWithSameName = nullptr; - _directMemberDecls.accelerators.mapNameToLastDeclOfThatName.tryGetValue( - memberName, - prevMemberWithSameName); + accelerators.mapNameToLastDeclOfThatName.tryGetValue(memberName, prevMemberWithSameName); memberDecl->_prevInContainerWithSameName = prevMemberWithSameName; // Whether or not there was a previous declaration with this @@ -200,18 +289,131 @@ void ContainerDecl::_ensureLookupAcceleratorsAreValid() // with that name encountered so far, and it is what we will // store in the lookup dictionary. // - _directMemberDecls.accelerators.mapNameToLastDeclOfThatName[memberName] = memberDecl; + accelerators.mapNameToLastDeclOfThatName[memberName] = memberDecl; } - _directMemberDecls.accelerators.declCountWhenLastUpdated = memberCount; + accelerators.declCountWhenLastUpdated = memberCount; SLANG_ASSERT(_areLookupAcceleratorsValid()); } + +// +// ContainerDecl +// + +List<Decl*> const& ContainerDecl::getDirectMemberDecls() +{ + return _directMemberDecls.getDecls(); +} + +Count ContainerDecl::getDirectMemberDeclCount() +{ + return _directMemberDecls.getDeclCount(); +} + +Decl* ContainerDecl::getDirectMemberDecl(Index index) +{ + return _directMemberDecls.getDecl(index); +} + +Decl* ContainerDecl::getFirstDirectMemberDecl() +{ + if (getDirectMemberDeclCount() == 0) + return nullptr; + return getDirectMemberDecl(0); +} + +DeclsOfNameList ContainerDecl::getDirectMemberDeclsOfName(Name* name) +{ + return DeclsOfNameList(findLastDirectMemberDeclOfName(name)); +} + +Decl* ContainerDecl::findLastDirectMemberDeclOfName(Name* name) +{ + return _directMemberDecls.findLastDeclOfName(name); +} + +Decl* ContainerDecl::getPrevDirectMemberDeclWithSameName(Decl* decl) +{ + SLANG_ASSERT(decl); + SLANG_ASSERT(decl->parentDecl == this); + + if (isUsingOnDemandDeserializationForDirectMembers()) + { + // Note: in the case of on-demand deserialization, + // we trust that the caller has previously + // invoked `findLastDirectMemberDeclOfName()` + 
// in order to get `decl` (or an earlier + // entry in the same linked list), so that + // the list threaded through the declarations + // of the same name is already set up. + // + // If that is ever *not* the case, then this + // query would end up returning the wrong results. + + return decl->_prevInContainerWithSameName; + } + else + { + _ensureLookupAcceleratorsAreValid(); + return decl->_prevInContainerWithSameName; + } +} + +void ContainerDecl::addDirectMemberDecl(Decl* decl) +{ + if (isUsingOnDemandDeserializationForDirectMembers()) + { + SLANG_UNEXPECTED("this operation shouldn't be performed on deserialized declarations"); + } + + if (!decl) + return; + + decl->parentDecl = this; + _directMemberDecls.decls.add(decl); +} + +List<Decl*> const& ContainerDecl::getTransparentDirectMemberDecls() +{ + return _directMemberDecls.getTransparentDecls(); +} + +bool ContainerDecl::isUsingOnDemandDeserializationForDirectMembers() +{ + return _directMemberDecls.isUsingOnDemandDeserialization(); +} + +bool ModuleDecl::isUsingOnDemandDeserializationForExports() +{ + return _directMemberDecls.onDemandDeserialization.context != nullptr; +} + +bool ContainerDecl::_areLookupAcceleratorsValid() +{ + return _directMemberDecls._areLookupAcceleratorsValid(); +} + +void ContainerDecl::_invalidateLookupAccelerators() +{ + _directMemberDecls._invalidateLookupAccelerators(); +} + +void ContainerDecl::_ensureLookupAcceleratorsAreValid() +{ + _directMemberDecls._ensureLookupAcceleratorsAreValid(); +} + void ContainerDecl:: _invalidateLookupAcceleratorsBecauseUnscopedEnumAttributeWillBeTurnedIntoTransparentModifier( UnscopedEnumAttribute* unscopedEnumAttr, TransparentModifier* transparentModifier) { + if (isUsingOnDemandDeserializationForDirectMembers()) + { + SLANG_UNEXPECTED("this operation shouldn't be performed on deserialized declarations"); + } + SLANG_ASSERT(unscopedEnumAttr); SLANG_ASSERT(transparentModifier); @@ -225,6 +427,11 @@ void ContainerDecl:: 
_removeDirectMemberConstructorDeclBecauseSynthesizedAnotherDefaultConstructorInstead( ConstructorDecl* decl) { + if (isUsingOnDemandDeserializationForDirectMembers()) + { + SLANG_UNEXPECTED("this operation shouldn't be performed on deserialized declarations"); + } + SLANG_ASSERT(decl); _invalidateLookupAccelerators(); @@ -237,6 +444,11 @@ void ContainerDecl:: VarDecl* oldDecl, PropertyDecl* newDecl) { + if (isUsingOnDemandDeserializationForDirectMembers()) + { + SLANG_UNEXPECTED("this operation shouldn't be performed on deserialized declarations"); + } + SLANG_ASSERT(oldDecl); SLANG_ASSERT(newDecl); SLANG_ASSERT(index >= 0 && index < getDirectMemberDeclCount()); @@ -251,6 +463,11 @@ void ContainerDecl::_insertDirectMemberDeclAtIndexForBitfieldPropertyBackingMemb Index index, VarDecl* backingVarDecl) { + if (isUsingOnDemandDeserializationForDirectMembers()) + { + SLANG_UNEXPECTED("this operation shouldn't be performed on deserialized declarations"); + } + SLANG_ASSERT(backingVarDecl); SLANG_ASSERT(index >= 0 && index <= getDirectMemberDeclCount()); diff --git a/source/slang/slang-ast-decl.h b/source/slang/slang-ast-decl.h index c46878945..a92f73e2a 100644 --- a/source/slang/slang-ast-decl.h +++ b/source/slang/slang-ast-decl.h @@ -4,6 +4,7 @@ #include "slang-ast-base.h" #include "slang-ast-decl.h.fiddle" +#include "slang-fossil.h" FIDDLE() namespace Slang @@ -34,23 +35,54 @@ class UnresolvedDecl : public Decl struct ContainerDeclDirectMemberDecls { public: - List<Decl*> const& getDecls() const { return decls; } + List<Decl*> const& getDecls() const; - List<Decl*>& _refDecls() { return decls; } + Count getDeclCount() const; + Decl* getDecl(Index index) const; + + Decl* findLastDeclOfName(Name* name) const; + + Dictionary<Name*, Decl*> getMapFromNameToLastDeclOfThatName() const; + + List<Decl*> const& getTransparentDecls() const; + + bool isUsingOnDemandDeserialization() const; + + void _initForOnDemandDeserialization( + RefObject* deserializationContext, + void const* 
deserializationData, + Count declCount); private: friend class ContainerDecl; + friend class ModuleDecl; friend struct ASTDumpContext; - List<Decl*> decls; + bool _areLookupAcceleratorsValid() const; + void _invalidateLookupAccelerators() const; + void _ensureLookupAcceleratorsAreValid() const; + + void _readSerializedTransparentDecls() const; + Decl* _readSerializedDeclAtIndex(Index index) const; + Decl* _readSerializedDeclsOfName(Name* name) const; + + void _readAllSerializedDecls() const; + + mutable List<Decl*> decls; - struct + mutable struct { Count declCountWhenLastUpdated = 0; Dictionary<Name*, Decl*> mapNameToLastDeclOfThatName; List<Decl*> filteredListOfTransparentDecls; } accelerators; + + mutable struct + { + RefPtr<RefObject> context; + void const* data = nullptr; + } onDemandDeserialization; }; /// A conceptual list of declarations of the same name, in the same container. @@ -199,6 +231,10 @@ class ContainerDecl : public Decl // void addMember(Decl* member) { addDirectMemberDecl(member); } + /// Is this declaration using on-demand deserialization for its direct members? + /// + bool isUsingOnDemandDeserializationForDirectMembers(); + // // NOTE: The operations after this point are *not* considered part of // the public API of `ContainerDecl`, and new code should not be @@ -737,6 +773,14 @@ class ModuleDecl : public NamespaceDeclBase /// This mapping is filled in during semantic checking, as `ExtensionDecl`s get checked. /// FIDDLE() Dictionary<AggTypeDecl*, RefPtr<CandidateExtensionList>> mapTypeToCandidateExtensions; + + /// Is this module using on-demand deserialization for its exports? + /// + bool isUsingOnDemandDeserializationForExports(); + + /// Find a declaration exported from this module by its `mangledName`. + /// + Decl* _findSerializedDeclByMangledExportName(UnownedStringSlice const& mangledName); }; // Represents a transparent scope of declarations that are defined in a single source file. 
diff --git a/source/slang/slang-ast-expr.h b/source/slang/slang-ast-expr.h index 177feff15..d6027b2b2 100644 --- a/source/slang/slang-ast-expr.h +++ b/source/slang/slang-ast-expr.h @@ -309,13 +309,17 @@ class StaticMemberExpr : public DeclRefExpr SourceLoc memberOperatorLoc; }; +FIDDLE() struct MatrixCoord { + FIDDLE(...) + bool operator==(const MatrixCoord& rhs) const { return row == rhs.row && col == rhs.col; }; bool operator!=(const MatrixCoord& rhs) const { return !(*this == rhs); }; + // Rows and columns are zero indexed - int row; - int col; + FIDDLE() Int32 row; + FIDDLE() Int32 col; }; FIDDLE() @@ -851,40 +855,41 @@ public: }; // The flavour and token describes how this was parsed - Flavor flavor; + FIDDLE() Flavor flavor; // The single token this came from Token token; // If this was a SlangValue or SlangValueAddr or SlangType, then we also // store the expression, which should be a single VarExpr because we only // parse single idents at the moment - Expr* expr = nullptr; + FIDDLE() Expr* expr = nullptr; // If this is part of a bitwise or expression, this will point to the // remaining operands values in such an expression must be of flavour // Literal or NamedValue - List<SPIRVAsmOperand> bitwiseOrWith = List<SPIRVAsmOperand>(); + FIDDLE() List<SPIRVAsmOperand> bitwiseOrWith = List<SPIRVAsmOperand>(); // If this is a named value then we calculate the value here during // checking. If this is an opcode, then the parser will populate this too // (or set it to 0xffffffff); - SpvWord knownValue = 0xffffffff; + FIDDLE() SpvWord knownValue = 0xffffffff; // Although this might be a constant in the source we should actually pass // it as an id created with OpConstant - bool wrapInId = false; + FIDDLE() bool wrapInId = false; // Once we've checked things, the SlangType and BuiltinVar flavour operands // will have this type populated. - TypeExp type = TypeExp(); + FIDDLE() TypeExp type = TypeExp(); }; FIDDLE() struct SPIRVAsmInst { FIDDLE(...) 
+ public: - SPIRVAsmOperand opcode; - List<SPIRVAsmOperand> operands; + FIDDLE() SPIRVAsmOperand opcode; + FIDDLE() List<SPIRVAsmOperand> operands; }; FIDDLE() diff --git a/source/slang/slang-ast-forward-declarations.h b/source/slang/slang-ast-forward-declarations.h index 717bca1d9..32cfd9ef0 100644 --- a/source/slang/slang-ast-forward-declarations.h +++ b/source/slang/slang-ast-forward-declarations.h @@ -1,10 +1,12 @@ // slang-ast-forward-declarations.h #pragma once +#include "../core/slang-basic.h" + namespace Slang { -enum class ASTNodeType +enum class ASTNodeType : Int32 { #if 0 // FIDDLE TEMPLATE: %for _, T in ipairs(Slang.NodeBase.subclasses) do diff --git a/source/slang/slang-ast-support-types.h b/source/slang/slang-ast-support-types.h index d05f24e56..e4002c237 100644 --- a/source/slang/slang-ast-support-types.h +++ b/source/slang/slang-ast-support-types.h @@ -232,10 +232,12 @@ FIDDLE() namespace Slang class Val; // Helper type for pairing up a name and the location where it appeared - struct NameLoc + FIDDLE() struct NameLoc { - Name* name; - SourceLoc loc; + FIDDLE(...) + + FIDDLE() Name* name; + FIDDLE() SourceLoc loc; NameLoc() : name(nullptr) @@ -572,10 +574,11 @@ FIDDLE() namespace Slang struct QualType { FIDDLE(...) - Type* type = nullptr; - bool isLeftValue = false; - bool hasReadOnlyOnTarget = false; - bool isWriteOnly = false; + + FIDDLE() Type* type = nullptr; + FIDDLE() bool isLeftValue = false; + FIDDLE() bool hasReadOnlyOnTarget = false; + FIDDLE() bool isWriteOnly = false; QualType() = default; @@ -1571,16 +1574,16 @@ FIDDLE() namespace Slang void add(Decl* decl, RequirementWitness const& witness); // The type that the witness table witnesses conformance to (e.g. an Interface) - Type* baseType; + FIDDLE() Type* baseType; // The type witnessed by the witness table (a concrete type). - Type* witnessedType; + FIDDLE() Type* witnessedType; // Whether or not this witness table is an extern declaration. 
- bool isExtern = false; + FIDDLE() bool isExtern = false; // Cached dictionary for looking up satisfying values. - RequirementDictionary m_requirementDictionary; + FIDDLE() RequirementDictionary m_requirementDictionary; RefPtr<WitnessTable> specialize(ASTBuilder* astBuilder, SubstitutionSet const& subst); }; @@ -1634,8 +1637,9 @@ FIDDLE() namespace Slang class DeclAssociation : public RefObject { FIDDLE(...) - DeclAssociationKind kind; - Decl* decl; + + FIDDLE() DeclAssociationKind kind; + FIDDLE() Decl* decl; }; /// A reference-counted object to hold a list of associated decls for a decl. diff --git a/source/slang/slang-check-decl.cpp b/source/slang/slang-check-decl.cpp index f0cd32e74..1a70e25d7 100644 --- a/source/slang/slang-check-decl.cpp +++ b/source/slang/slang-check-decl.cpp @@ -3494,14 +3494,8 @@ struct SemanticsDeclDifferentialConformanceVisitor } }; -/// Recursively register any builtin declarations that need to be attached to the `session`. -/// -/// This function should only be needed for declarations in the core module. -/// -static void _registerBuiltinDeclsRec(Session* session, Decl* decl) +void registerBuiltinDecl(SharedASTBuilder* sharedASTBuilder, Decl* decl) { - SharedASTBuilder* sharedASTBuilder = session->m_sharedASTBuilder; - if (auto builtinMod = decl->findModifier<BuiltinTypeModifier>()) { sharedASTBuilder->registerBuiltinDecl(decl, builtinMod); @@ -3514,6 +3508,25 @@ static void _registerBuiltinDeclsRec(Session* session, Decl* decl) { sharedASTBuilder->registerBuiltinRequirementDecl(decl, builtinRequirement); } +} + + +void registerBuiltinDecl(ASTBuilder* astBuilder, Decl* decl) +{ + registerBuiltinDecl(astBuilder->getSharedASTBuilder(), decl); +} + + +/// Recursively register any builtin declarations that need to be attached to the `session`. +/// +/// This function should only be needed for declarations in the core module. 
+/// +static void _registerBuiltinDeclsRec(Session* session, Decl* decl) +{ + SharedASTBuilder* sharedASTBuilder = session->m_sharedASTBuilder; + + registerBuiltinDecl(sharedASTBuilder, decl); + if (auto containerDecl = as<ContainerDecl>(decl)) { for (auto childDecl : containerDecl->getDirectMemberDecls()) @@ -3535,6 +3548,42 @@ void registerBuiltinDecls(Session* session, Decl* decl) _registerBuiltinDeclsRec(session, decl); } +void _collectBuiltinDeclsThatNeedRegistrationRec(Decl* decl, List<Decl*>& ioDecls) +{ + if (decl->findModifier<BuiltinTypeModifier>()) + { + ioDecls.add(decl); + } + else if (decl->findModifier<MagicTypeModifier>()) + { + ioDecls.add(decl); + } + else if (decl->findModifier<BuiltinRequirementModifier>()) + { + ioDecls.add(decl); + } + + if (auto containerDecl = as<ContainerDecl>(decl)) + { + for (auto childDecl : containerDecl->getDirectMemberDecls()) + { + if (as<ScopeDecl>(childDecl)) + continue; + + _collectBuiltinDeclsThatNeedRegistrationRec(childDecl, ioDecls); + } + } + if (auto genericDecl = as<GenericDecl>(decl)) + { + _collectBuiltinDeclsThatNeedRegistrationRec(genericDecl->inner, ioDecls); + } +} + +void collectBuiltinDeclsThatNeedRegistration(ModuleDecl* moduleDecl, List<Decl*>& outDecls) +{ + _collectBuiltinDeclsThatNeedRegistrationRec(moduleDecl, outDecls); +} + Type* unwrapArrayType(Type* type) { for (;;) diff --git a/source/slang/slang-check-modifier.cpp b/source/slang/slang-check-modifier.cpp index d135744af..7779957ef 100644 --- a/source/slang/slang-check-modifier.cpp +++ b/source/slang/slang-check-modifier.cpp @@ -2225,7 +2225,6 @@ void SemanticsVisitor::checkRayPayloadStructFields(StructDecl* structDecl) { auto readModifier = fieldVarDecl->findModifier<RayPayloadReadSemantic>(); auto writeModifier = fieldVarDecl->findModifier<RayPayloadWriteSemantic>(); - bool hasReadModifier = readModifier != nullptr; bool hasWriteModifier = writeModifier != nullptr; diff --git a/source/slang/slang-check.h b/source/slang/slang-check.h 
index f4d86cff9..ebeff1afe 100644 --- a/source/slang/slang-check.h +++ b/source/slang/slang-check.h @@ -21,8 +21,13 @@ class TranslationUnitRequest; bool isGlobalShaderParameter(VarDeclBase* decl); bool isFromCoreModule(Decl* decl); +void registerBuiltinDecl(SharedASTBuilder* sharedASTBuilder, Decl* decl); +void registerBuiltinDecl(ASTBuilder* astBuilder, Decl* decl); + void registerBuiltinDecls(Session* session, Decl* decl); +void collectBuiltinDeclsThatNeedRegistration(ModuleDecl* moduleDecl, List<Decl*>& outDecls); + Type* unwrapArrayType(Type* type); Type* unwrapModifiedType(Type* type); diff --git a/source/slang/slang-compiler.h b/source/slang/slang-compiler.h index 43b51b231..cee80f882 100644 --- a/source/slang/slang-compiler.h +++ b/source/slang/slang-compiler.h @@ -1794,7 +1794,16 @@ public: /// Given a mangled name finds the exported NodeBase associated with this module. /// If not found returns nullptr. - NodeBase* findExportFromMangledName(const UnownedStringSlice& slice); + Decl* findExportedDeclByMangledName(const UnownedStringSlice& mangledName); + + /// Ensure that any accelerator(s) used for `findExportedDeclByMangledName` + /// have already been built. + /// + void ensureExportLookupAcceleratorBuilt(); + + Count getExportedDeclCount(); + Decl* getExportedDecl(Index index); + UnownedStringSlice getExportedDeclMangledName(Index index); /// Get the ASTBuilder ASTBuilder* getASTBuilder() { return m_astBuilder; } @@ -1906,7 +1915,7 @@ private: // Holds map of exported mangled names to symbols. m_mangledExportPool maps names to indices, // and m_mangledExportSymbols holds the NodeBase* values for each index. StringSlicePool m_mangledExportPool; - List<NodeBase*> m_mangledExportSymbols; + List<Decl*> m_mangledExportSymbols; // Source files that have been pulled into the module with `__include`. Dictionary<SourceFile*, FileDecl*> m_mapSourceFileToFileDecl; @@ -2451,6 +2460,7 @@ public: /// Otherwise, return null. 
/// RefPtr<Module> findOrLoadSerializedModuleForModuleLibrary( + ISlangBlob* blobHoldingSerializedData, ModuleChunk const* moduleChunk, RIFF::ListChunk const* libraryChunk, DiagnosticSink* sink); @@ -2458,6 +2468,7 @@ public: RefPtr<Module> loadSerializedModule( Name* moduleName, const PathInfo& moduleFilePathInfo, + ISlangBlob* blobHoldingSerializedData, ModuleChunk const* moduleChunk, RIFF::ListChunk const* containerChunk, //< The outer container, if there is one. SourceLoc const& requestingLoc, @@ -2466,6 +2477,7 @@ public: SlangResult loadSerializedModuleContents( Module* module, const PathInfo& moduleFilePathInfo, + ISlangBlob* blobHoldingSerializedData, ModuleChunk const* moduleChunk, RIFF::ListChunk const* containerChunk, //< The outer container, if there is one. DiagnosticSink* sink); diff --git a/source/slang/slang-fossil.cpp b/source/slang/slang-fossil.cpp index e3b2e2c9d..204430901 100644 --- a/source/slang/slang-fossil.cpp +++ b/source/slang/slang-fossil.cpp @@ -25,12 +25,12 @@ const char Fossil::Header::kMagic[16] = { '\n' // byte 15 }; -FossilizedValRef getRootValue(ISlangBlob* blob) +Fossil::AnyValPtr getRootValue(ISlangBlob* blob) { return getRootValue(blob->getBufferPointer(), blob->getBufferSize()); } -FossilizedValRef getRootValue(void const* data, Size size) +Fossil::AnyValPtr getRootValue(void const* data, Size size) { if (!data) { @@ -72,9 +72,7 @@ FossilizedValRef getRootValue(void const* data, Size size) SLANG_UNEXPECTED("bad format for fossil"); } - return FossilizedValRef( - rootValueVariant->getContentData(), - rootValueVariant->getContentLayout()); + return getVariantContentPtr(rootValueVariant); } } // namespace Fossil @@ -85,13 +83,13 @@ Size FossilizedStringObj::getSize() const return Size(*sizePtr); } -UnownedTerminatedStringSlice FossilizedStringObj::getValue() const +UnownedTerminatedStringSlice FossilizedStringObj::get() const { auto size = getSize(); return UnownedTerminatedStringSlice((char*)this, size); } -Count 
FossilizedContainerObj::getElementCount() const +Count FossilizedContainerObjBase::getElementCount() const { auto countPtr = (FossilUInt*)this - 1; return Size(*countPtr); @@ -103,52 +101,18 @@ FossilizedValLayout* FossilizedVariantObj::getContentLayout() const return (*layoutPtrPtr).get(); } -FossilizedValRef getPtrTarget(FossilizedPtrValRef ptrRef) -{ - auto ptrLayout = ptrRef.getLayout(); - auto ptrPtr = ptrRef.getData(); - return FossilizedValRef(ptrPtr->getTargetData(), ptrLayout->elementLayout); -} - -bool hasValue(FossilizedOptionalObjRef optionalRef) -{ - return optionalRef.getData() != nullptr; -} - -FossilizedValRef getValue(FossilizedOptionalObjRef optionalRef) -{ - auto optionalLayout = optionalRef.getLayout(); - auto valuePtr = optionalRef.getData(); - return FossilizedValRef(valuePtr, optionalLayout->elementLayout); -} - -Count getElementCount(FossilizedContainerObjRef containerRef) -{ - if (!containerRef) - return 0; - - auto containerPtr = containerRef.getData(); - return containerPtr->getElementCount(); -} - -FossilizedValRef getElement(FossilizedContainerObjRef containerRef, Index index) +Fossil::AnyValRef Fossil::ValRef<FossilizedContainerObjBase>::getElement(Index index) const { SLANG_ASSERT(index >= 0); - SLANG_ASSERT(index < getElementCount(containerRef)); + SLANG_ASSERT(index < getElementCount()); - auto containerLayout = containerRef.getLayout(); + auto containerLayout = getLayout(); auto elementLayout = containerLayout->elementLayout.get(); auto elementStride = containerLayout->elementStride; - auto elementsPtr = (Byte*)containerRef.getData(); - auto elementPtr = (FossilizedVal*)(elementsPtr + elementStride * index); - return FossilizedValRef(elementPtr, elementLayout); -} - -Count getFieldCount(FossilizedRecordValRef recordRef) -{ - auto recordLayout = recordRef.getLayout(); - return recordLayout->fieldCount; + auto elementsPtr = (Byte*)getDataPtr(); + auto elementPtr = (void*)(elementsPtr + elementStride * index); + return 
Fossil::AnyValRef(elementPtr, elementLayout); } FossilizedRecordElementLayout* FossilizedRecordLayout::getField(Index index) const @@ -160,28 +124,29 @@ FossilizedRecordElementLayout* FossilizedRecordLayout::getField(Index index) con return fieldsPtr + index; } - -FossilizedValRef getField(FossilizedRecordValRef recordRef, Index index) +Fossil::AnyValRef Fossil::ValRef<FossilizedRecordVal>::getField(Index index) const { SLANG_ASSERT(index >= 0); - SLANG_ASSERT(index < getFieldCount(recordRef)); + SLANG_ASSERT(index < getFieldCount()); - auto recordLayout = recordRef.getLayout(); - auto field = recordLayout->getField(index); + auto recordLayout = getLayout(); + auto fieldInfo = recordLayout->getField(index); - auto fieldsPtr = (Byte*)recordRef.getData(); - auto fieldPtr = (FossilizedVal*)(fieldsPtr + field->offset); - return FossilizedValRef(fieldPtr, field->layout); + auto fieldsPtr = (Byte*)getDataPtr(); + auto fieldPtr = (void*)(fieldsPtr + fieldInfo->offset); + return Fossil::AnyValRef(fieldPtr, fieldInfo->layout); } +#if 0 FossilizedValRef getVariantContent(FossilizedVariantObjRef variantRef) { return getVariantContent(variantRef.getData()); } +#endif -FossilizedValRef getVariantContent(FossilizedVariantObj* variantPtr) +Fossil::AnyValPtr getVariantContentPtr(FossilizedVariantObj* variantPtr) { - return FossilizedValRef(variantPtr->getContentData(), variantPtr->getContentLayout()); + return Fossil::AnyValPtr(variantPtr->getContentDataPtr(), variantPtr->getContentLayout()); } } // namespace Slang diff --git a/source/slang/slang-fossil.h b/source/slang/slang-fossil.h index dcc12bacb..8d2465ddb 100644 --- a/source/slang/slang-fossil.h +++ b/source/slang/slang-fossil.h @@ -18,8 +18,43 @@ #include "../core/slang-relative-ptr.h" +#include <optional> +#include <type_traits> + namespace Slang { + +struct FossilizedPtrLikeLayout; +struct FossilizedRecordLayout; +struct FossilizedValLayout; + +using FossilInt = int32_t; +using FossilUInt = uint32_t; + +/// Kinds of 
values that can appear in fossilized data. +enum class FossilizedValKind : FossilUInt +{ + Bool, + Int8, + Int16, + Int32, + Int64, + UInt8, + UInt16, + UInt32, + UInt64, + Float32, + Float64, + StringObj, + ArrayObj, + OptionalObj, + DictionaryObj, + Tuple, + Struct, + Ptr, + VariantObj, +}; + // A key part of the fossil representation is the use of *relative pointers*, // so that a fossilized object graph can be traversed directly in memory // without having to deserialize any of the intermediate objects. @@ -27,7 +62,17 @@ namespace Slang // Fossil uses 32-bit relative pointers, to keep the format compact. template<typename T> -using FossilizedPtr = RelativePtr32<T>; +struct FossilizedPtr : RelativePtr32<T> +{ +public: + using RelativePtr32<T>::RelativePtr32; + + using Layout = FossilizedPtrLikeLayout; + + static bool isMatchingKind(FossilizedValKind kind) { return kind == FossilizedValKind::Ptr; } +}; + +static_assert(sizeof(FossilizedPtr<void>) == sizeof(uint32_t)); // Various other parts of the format need to store offsets or counts, // and for consistency we will store them with the same number of @@ -37,78 +82,218 @@ using FossilizedPtr = RelativePtr32<T>; // pointer size down the line, we define type aliases for the // general-purpose integer types that will be used in fossilized data. -using FossilInt = FossilizedPtr<void>::Offset; -using FossilUInt = FossilizedPtr<void>::UOffset; + +static_assert(sizeof(FossilInt) == sizeof(FossilizedPtr<void>)); +static_assert(sizeof(FossilUInt) == sizeof(FossilizedPtr<void>)); // -// The fossil format supports data that is *self-describing*. +// A "live" type can declare what its fossilized representation +// is by specializing the `FossilizedTypeTraits` template. // -// A `FossilizedValLayout` describes the in-memory layout of a fossilized -// value. Given a `FossilizedValLayout` and a pointer to the data -// for a particular value, it is possible to inspect the structure -// of the fossilized data. 
+// By default, a type is fossilized as an opaque `FossilizedOpaqueVal` +// if no user-defined specialization is provided. +// + +template<typename T> +struct Fossilized_; + +template<typename T> +struct FossilizedTypeTraits +{ + using FossilizedType = Fossilized_<T>; +}; + +template<typename T> +using Fossilized = FossilizedTypeTraits<T>::FossilizedType; + // -// If all you have is a `FossilizedVal*`, then there is no way to access -// its contents without assuming it is of some particular type and casting -// it. +// In many cases, a new C++ type can be fossilized using +// the same representation as some existing type, so we +// allow them to conveniently declare that fact with +// a macro. // -// A `FossilizedVariantObj` is a fossilized value that is self-describing; -// it stores a (relative) pointer to a layout, which can be used to inspect -// its own data/state. + +#define SLANG_DECLARE_FOSSILIZED_AS(TYPE, FOSSILIZED_AS) \ + template<> \ + struct FossilizedTypeTraits<TYPE> \ + { \ + using FossilizedType = Fossilized<FOSSILIZED_AS>; \ + } + +// +// Another common pattern is when some aggregate type +// can simply be fossilized as one of its members. // -struct FossilizedVal; -struct FossilizedValLayout; -struct FossilizedVariantObj; +#define SLANG_DECLARE_FOSSILIZED_AS_MEMBER(TYPE, MEMBER) \ + template<> \ + struct FossilizedTypeTraits<TYPE> \ + { \ + using FossilizedType = Fossilized<decltype(TYPE::MEMBER)>; \ + } -/// Kinds of values that can appear in fossilized data. -enum class FossilizedValKind : FossilUInt +// +// Simple scalar values are fossilized into a wrapper +// `struct` that contains the underlying value. +// +// The reason to impose the wrapper `struct` is that +// it allows us to control the alignment of the type +// in case it turns out that different targets/compilers +// don't apply the same layout to all of the underlying +// scalar types. 
+// + +template<typename T, FossilizedValKind Kind> +struct FossilizedSimpleVal { - Bool, - Int8, - Int16, - Int32, - Int64, - UInt8, - UInt16, - UInt32, - UInt64, - Float32, - Float64, - String, - Array, - Optional, - Dictionary, - Tuple, - Struct, - Ptr, - Variant, +public: + using Layout = FossilizedValLayout; + static const FossilizedValKind kKind = Kind; + + T const& get() const { return _value; } + + operator T const&() const { return _value; } + + static bool isMatchingKind(FossilizedValKind kind) { return kind == kKind; } + +private: + T _value; }; -/// Layout information about a fossilized value in memory. -/// -/// -/// Every `FossilizedValLayout` stores the kind of the value. -/// Based on that kind, specific additional fields may be -/// available as part of the layout. -/// -struct FossilizedValLayout +#define SLANG_DECLARE_FOSSILIZED_SIMPLE_TYPE(TYPE, TAG) \ + template<> \ + struct FossilizedTypeTraits<TYPE> \ + { \ + using FossilizedType = FossilizedSimpleVal<TYPE, FossilizedValKind::TAG>; \ + }; + +SLANG_DECLARE_FOSSILIZED_SIMPLE_TYPE(int8_t, Int8) +SLANG_DECLARE_FOSSILIZED_SIMPLE_TYPE(int16_t, Int16) +SLANG_DECLARE_FOSSILIZED_SIMPLE_TYPE(int32_t, Int32) +SLANG_DECLARE_FOSSILIZED_SIMPLE_TYPE(int64_t, Int64) + +SLANG_DECLARE_FOSSILIZED_SIMPLE_TYPE(uint8_t, UInt8) +SLANG_DECLARE_FOSSILIZED_SIMPLE_TYPE(uint16_t, UInt16) +SLANG_DECLARE_FOSSILIZED_SIMPLE_TYPE(uint32_t, UInt32) +SLANG_DECLARE_FOSSILIZED_SIMPLE_TYPE(uint64_t, UInt64) + +SLANG_DECLARE_FOSSILIZED_SIMPLE_TYPE(float, Float32) +SLANG_DECLARE_FOSSILIZED_SIMPLE_TYPE(double, Float64) + +static_assert(sizeof(Fossilized<int8_t>) == 1); +static_assert(sizeof(Fossilized<int16_t>) == 2); +static_assert(sizeof(Fossilized<int32_t>) == 4); +static_assert(sizeof(Fossilized<int64_t>) == 8); + +static_assert(sizeof(Fossilized<uint8_t>) == 1); +static_assert(sizeof(Fossilized<uint16_t>) == 2); +static_assert(sizeof(Fossilized<uint32_t>) == 4); +static_assert(sizeof(Fossilized<uint64_t>) == 8); + 
+static_assert(sizeof(Fossilized<float>) == 4); +static_assert(sizeof(Fossilized<double>) == 8); + +// +// The `bool` type shouldn't be fossilized as itself, because +// its layout is not guaranteed to be consistent across targets. +// We instead fossilize it as an underlying `uint8_t`, and convert +// on reads. +// + +template<> +struct Fossilized_<bool> { - FossilizedValKind kind; +public: + using Layout = FossilizedValLayout; + static const FossilizedValKind kKind = FossilizedValKind::Bool; + + bool get() const { return _value != 0; } + + operator bool() const { return get(); } + + static bool isMatchingKind(FossilizedValKind kind) { return kind == kKind; } + +private: + uint8_t _value; }; -struct FossilizedPtrLikeLayout +static_assert(sizeof(Fossilized<bool>) == 1); + +// +// Some simple types can be fossilized as one of the +// scalar types above, with explicit casts between +// the "live" type and the "fossilized" type. +// +// A common example of this is `enum` types. +// + +template<typename LiveType, typename FossilizedAsType> +struct FossilizedViaCastVal { - // Note: we aren't using inheritance in the definitions - // of these types, because per the letter of the law in - // C++, a type is only "standard layout" when there is - // only a single type in the inheritance hierarchy that - // has (non-static) data members. +public: + LiveType get() const { return LiveType(_value.get()); } - FossilizedValKind kind; - FossilizedPtr<FossilizedValLayout> elementLayout; + operator LiveType() const { return get(); } + + +private: + Fossilized<FossilizedAsType> _value; +}; + +// +// By default we assume that an `enum` type should be fossilized +// as a signed 32-bit integer. 
+// +#define SLANG_DECLARE_FOSSILIZED_ENUM(TYPE) \ + template<> \ + struct Fossilized_<TYPE> : FossilizedViaCastVal<TYPE, int32_t> \ + { \ + }; + +// +// For many of the other kinds of types that get fossilized, +// the in-memory encoding will typically be as a (relative) +// pointer to the actual data. +// +// Here we distinguish between the *value* type (e.g., +// `FossilizedString`) that typically gets stored as the +// field of a record/tuple/whatever, and the *object* type +// that the value type is a (relative) pointer to (e.g., +// `FossilizedStringObj`). +// + +struct FossilizedStringObj +{ +public: + Size getSize() const; + UnownedTerminatedStringSlice get() const; + + operator UnownedTerminatedStringSlice() const { return get(); } + + using Layout = FossilizedValLayout; + + static bool isMatchingKind(FossilizedValKind kind) + { + return kind == FossilizedValKind::StringObj; + } + +private: + // Before the `this` address, there is a `FossilUInt` + // with the size of the string in bytes. + // + // At the `this` address there is a nul-terminated + // sequence of `getSize() + 1` bytes. }; +// +// The array and dictionary types are handled largely +// the same as strings, with the added detail that the +// object type is split into a base type without the +// template parameters, and a subtype that has those +// parameters. The base type enables navigating of +// these containers dynamically, based on layout. 
+// + struct FossilizedContainerLayout { FossilizedValKind kind; @@ -116,320 +301,824 @@ struct FossilizedContainerLayout FossilUInt elementStride; }; -struct FossilizedRecordElementLayout +struct FossilizedContainerObjBase { - FossilizedPtr<FossilizedValLayout> layout; - FossilUInt offset; +public: + using Layout = FossilizedContainerLayout; + + Count getElementCount() const; + + void const* getBuffer() const { return this; } + + static bool isMatchingKind(FossilizedValKind kind) + { + switch (kind) + { + default: + return false; + + case FossilizedValKind::ArrayObj: + case FossilizedValKind::DictionaryObj: + return true; + } + } + +private: + // Before the `this` address, there is a `FossilUInt` + // with the number of elements. + // + // At the `this` address there is a sequence of + // `getCount()` elements. The layout of those elements + // cannot be determined without having a `FossilizedContainerLayout` + // for this container. }; -struct FossilizedRecordLayout +template<typename T> +struct FossilizedContainerObj : FossilizedContainerObjBase { - FossilizedValKind kind; - FossilUInt fieldCount; +public: +}; - // FossilizedRecordElementLayout elements[]; +struct FossilizedArrayObjBase : FossilizedContainerObjBase +{ +public: + static bool isMatchingKind(FossilizedValKind kind) + { + return kind == FossilizedValKind::ArrayObj; + } +}; - FossilizedRecordElementLayout* getField(Index index) const; +template<typename T> +struct FossilizedArrayObj : FossilizedArrayObjBase +{ +}; + +// +// While we defined the core `FossilizedPtr` type above, there is +// some subtlety involved in defining the way that a C++ pointer +// type like `T*` maps to its fossilized representation via +// `Fossilized<T*>`. 
The reason for this is that the binary layout +// of fossilized data avoids storing redundant pointers-to-pointers, +// so because a `Dictionary<int, float>` would already be stored +// via an indirection in the binary layout, a pointer type +// `Dictionary<int, float> *` would be stored with the exact same +// binary layout. +// +// When computing what `Fossilized<T*>` is, the result will be +// `FossilizedPtr< FossilizedPtrTarget<T> >`. The `FossilizedPtrTarget<T>` +// template uses a set of helpers defined in a `details` namespace +// to compute the correct target type. +// + +namespace details +{ +// +// By default, a `Fossilized<T*>` will just be a `FossilizedPtr<Fossilized<T>>`. +// +template<typename T> +T fossilizedPtrTargetType(T*, void*); +} // namespace details + +template<typename T> +using FossilizedPtrTarget = decltype(details::fossilizedPtrTargetType( + std::declval<Fossilized<T>*>(), + std::declval<Fossilized<T>*>())); + + +template<typename T> +struct FossilizedTypeTraits<T*> +{ + using FossilizedType = FossilizedPtr<FossilizedPtrTarget<T>>; }; -/// A reference to a fossilized value in memory (of type T), and its layout. -/// template<typename T> -struct FossilizedValRef_ +struct FossilizedTypeTraits<RefPtr<T>> +{ + using FossilizedType = FossilizedPtr<FossilizedPtrTarget<T>>; +}; + + +// +// An optional value is effectively just a pointer, with +// the null case being used to represent the absence of +// a value. +// + +struct FossilizedPtrLikeLayout +{ + // Note: we aren't using inheritance in the definitions + // of these types, because per the letter of the law in + // C++, a type is only "standard layout" when there is + // only a single type in the inheritance hierarchy that + // has (non-static) data members. 
+ + FossilizedValKind kind; + FossilizedPtr<FossilizedValLayout> elementLayout; +}; + +struct FossilizedOptionalObjBase { public: - using Val = T; - using Layout = typename T::Layout; + void* getValue() { return this; } - /// Construct a null reference. - /// - FossilizedValRef_() {} + void const* getValue() const { return this; } - /// Construct a reference to the given `data`, assuming it has the given `layout`. - /// - FossilizedValRef_(T* data, Layout* layout) - : _data(data), _layout(layout) - { - } + using Layout = FossilizedPtrLikeLayout; - /// Get the kind of value being referenced. - /// - /// This reference must not be null. - /// - FossilizedValKind getKind() + static bool isMatchingKind(FossilizedValKind kind) { - SLANG_ASSERT(getLayout()); - return getLayout()->kind; + return kind == FossilizedValKind::OptionalObj; } - /// Get the layout of the value being referenced. - /// - Layout* getLayout() { return _layout; } +private: + // An absent optional is encoded as a null pointer + // (so `this` would be null), while a present value + // is encoded as a pointer to that value. Thus the + // held value is at the same address as `this`. +}; - /// Get a pointer to the value being referenced. 
- /// - T* getData() { return _data; } +template<typename T> +struct FossilizedOptionalObj : FossilizedOptionalObjBase +{ + T* getValue() { return this; } - operator T*() const { return _data; } + T const* getValue() const { return this; } +}; - T* operator->() { return _data; } +template<typename T> +struct FossilizedOptional +{ +public: + explicit operator bool() const { return _value.get() != nullptr; } + T const& operator*() const { return *_value.get(); } private: - T* _data = nullptr; - Layout* _layout = nullptr; + FossilizedPtr<T> _value; }; -using FossilizedValRef = FossilizedValRef_<FossilizedVal>; +template<typename T> +struct FossilizedTypeTraits<std::optional<T>> +{ + using FossilizedType = FossilizedOptional<FossilizedPtrTarget<T>>; +}; -/// A fossilized value in memory. -/// -/// There isn't a lot that can be done with a bare pointer to -/// a `FossilizedVal`. This type is mostly declared to allow -/// us to make it explicit when a pointer points to a fossilized -/// value (even if we don't know anything about its layout). -/// -struct FossilizedVal +static_assert(sizeof(Fossilized<std::optional<double>>) == sizeof(FossilUInt)); + +// +// With all of the various `Fossilized*Obj` cases defined above, +// we can now define the more direct versions of things that +// apply in the common case. For example, `Fossilized<String>` +// simply maps to the `FossilizedString` type, and the parallels +// are similar for arrays and dictionaries. +// + +struct FossilizedString { public: - using Kind = FossilizedValKind; - using Layout = FossilizedValLayout; + Size getSize() const { return _obj ? _obj->getSize() : 0; } - /// Determine if a value with the given `kind` should be allowed to cast to this type. - static bool _isMatchingKind(Kind kind) + UnownedTerminatedStringSlice get() const { - SLANG_UNUSED(kind); - return true; + return _obj ? 
_obj->get() : UnownedTerminatedStringSlice(); } -protected: - FossilizedVal() = default; - FossilizedVal(FossilizedVal const&) = default; - FossilizedVal(FossilizedVal&&) = default; - ~FossilizedVal() = default; + operator UnownedTerminatedStringSlice() const { return get(); } + +private: + FossilizedPtr<FossilizedStringObj> _obj; }; -template<typename T, FossilizedValKind kKind> -struct FossilizedSimpleVal : FossilizedVal +inline int compare(FossilizedString const& lhs, UnownedStringSlice const& rhs) +{ + return compare(lhs.get(), rhs); +} + +inline bool operator==(FossilizedString const& left, UnownedStringSlice const& right) +{ + return left.get() == right; +} + +inline bool operator!=(FossilizedString const& left, UnownedStringSlice const& right) +{ + return left.get() != right; +} + +inline bool operator==(FossilizedStringObj const& left, UnownedStringSlice const& right) +{ + return left.get() == right; +} + +inline bool operator!=(FossilizedStringObj const& left, UnownedStringSlice const& right) +{ + return left.get() != right; +} + +#define SLANG_DECLARE_FOSSILIZED_TYPE(LIVE, FOSSILIZED) \ + template<> \ + struct FossilizedTypeTraits<LIVE> \ + { \ + using FossilizedType = FOSSILIZED; \ + } + +SLANG_DECLARE_FOSSILIZED_TYPE(String, FossilizedString); +SLANG_DECLARE_FOSSILIZED_TYPE(UnownedStringSlice, FossilizedString); +SLANG_DECLARE_FOSSILIZED_TYPE(UnownedTerminatedStringSlice, FossilizedString); + +static_assert(std::is_same_v<Fossilized<String>, FossilizedString>); +static_assert(sizeof(Fossilized<String>) == sizeof(FossilUInt)); + +template<typename T> +struct FossilizedContainer { public: - T getValue() const { return _value; } + Count getElementCount() const + { + if (!_obj) + return 0; + return _obj->getElementCount(); + } + T const* getBuffer() const + { + if (!_obj) + return nullptr; + return (T const*)_obj.get()->getBuffer(); + } - /// Determine if a value with the given `kind` should be allowed to cast to this type. 
- static bool _isMatchingKind(Kind kind) { return kind == kKind; } + T const* begin() const { return getBuffer(); } + T const* end() const { return getBuffer() + getElementCount(); } private: - T _value; + FossilizedPtr<FossilizedContainerObj<T>> _obj; }; -using FossilizedInt8Val = FossilizedSimpleVal<int8_t, FossilizedValKind::Int8>; -using FossilizedInt16Val = FossilizedSimpleVal<int16_t, FossilizedValKind::Int16>; -using FossilizedInt32Val = FossilizedSimpleVal<int32_t, FossilizedValKind::Int32>; -using FossilizedInt64Val = FossilizedSimpleVal<int64_t, FossilizedValKind::Int64>; +template<typename T> +struct FossilizedArray : FossilizedContainer<T> +{ +public: + T const& operator[](Index index) const + { + SLANG_ASSERT(index >= 0 && index < this->getElementCount()); + return this->getBuffer()[index]; + } +}; -using FossilizedUInt8Val = FossilizedSimpleVal<uint8_t, FossilizedValKind::UInt8>; -using FossilizedUInt16Val = FossilizedSimpleVal<uint16_t, FossilizedValKind::UInt16>; -using FossilizedUInt32Val = FossilizedSimpleVal<uint32_t, FossilizedValKind::UInt32>; -using FossilizedUInt64Val = FossilizedSimpleVal<uint64_t, FossilizedValKind::UInt64>; +template<typename T> +struct FossilizedTypeTraits<List<T>> +{ + using FossilizedType = FossilizedArray<Fossilized<T>>; +}; -using FossilizedFloat32Val = FossilizedSimpleVal<float, FossilizedValKind::Float32>; -using FossilizedFloat64Val = FossilizedSimpleVal<double, FossilizedValKind::Float64>; +template<typename T, int N> +struct FossilizedTypeTraits<ShortList<T, N>> +{ + using FossilizedType = FossilizedArray<Fossilized<T>>; +}; -struct FossilizedBoolVal : FossilizedVal +template<typename T, size_t N> +struct FossilizedTypeTraits<T[N]> { -public: - bool getValue() const { return _value != 0; } + using FossilizedType = FossilizedArray<Fossilized<T>>; +}; - /// Determine if a value with the given `kind` should be allowed to cast to this type. 
- static bool _isMatchingKind(Kind kind) { return kind == Kind::Bool; } +static_assert(sizeof(Fossilized<List<int32_t>>) == sizeof(FossilUInt)); -private: - uint8_t _value; +template<typename K, typename V> +struct FossilizedKeyValuePair +{ + using Layout = FossilizedRecordLayout; + K key; + V value; +}; + +template<typename K, typename V> +struct FossilizedTypeTraits<KeyValuePair<K, V>> +{ + using FossilizedType = FossilizedKeyValuePair<Fossilized<K>, Fossilized<V>>; +}; + +template<typename K, typename V> +struct FossilizedTypeTraits<std::pair<K, V>> +{ + using FossilizedType = FossilizedKeyValuePair<Fossilized<K>, Fossilized<V>>; }; -struct FossilizedPtrVal : FossilizedVal +// +// In terms of the encoding, a fossilized dictionary +// is really just an array of key-value pairs, but +// we keep the types distinct to help with clarity. +// + +struct FossilizedDictionaryObjBase : FossilizedContainerObjBase { public: - using Layout = FossilizedPtrLikeLayout; + static bool isMatchingKind(FossilizedValKind kind) + { + return kind == FossilizedValKind::DictionaryObj; + } +}; - FossilizedVal* getTargetData() const { return _value.get(); } +template<typename K, typename V> +struct FossilizedDictionaryObj : FossilizedDictionaryObjBase +{ +}; + +template<typename K, typename V> +struct FossilizedDictionary : FossilizedContainer<FossilizedKeyValuePair<K, V>> +{ +public: + using Entry = FossilizedKeyValuePair<K, V>; +}; - /// Determine if a value with the given `kind` should be allowed to cast to this type. 
- static bool _isMatchingKind(Kind kind) { return kind == Kind::Ptr; } +template<typename K, typename V> +struct FossilizedTypeTraits<Dictionary<K, V>> +{ + using FossilizedType = FossilizedDictionary<Fossilized<K>, Fossilized<V>>; +}; -private: - FossilizedPtr<FossilizedVal> _value; +template<typename K, typename V> +struct FossilizedTypeTraits<OrderedDictionary<K, V>> +{ + using FossilizedType = FossilizedDictionary<Fossilized<K>, Fossilized<V>>; }; +static_assert(sizeof(Fossilized<Dictionary<String, String>>) == sizeof(FossilUInt)); -struct FossilizedRecordVal : FossilizedVal +// +// A record (struct or tuple) is stored simply as a sequence of field +// values, and its layout gives the total number of fields as well as +// the offset and layout of each. +// + +struct FossilizedRecordElementLayout +{ + FossilizedPtr<FossilizedValLayout> layout; + FossilUInt offset; +}; + +struct FossilizedRecordLayout +{ + FossilizedValKind kind; + FossilUInt fieldCount; + + // FossilizedRecordElementLayout elements[]; + + FossilizedRecordElementLayout* getField(Index index) const; +}; + +/// Stand-in for a fossilized record of unknown type. +/// +/// Note that user-defined fossilized types should *not* try +/// to inherit from `FossilizedRecordVal`, as doing so can +/// end up breaking the correlation between the binary layout +/// of fossilized data and the matching C++ declarations. +/// +struct FossilizedRecordVal { public: using Layout = FossilizedRecordLayout; - /// Determine if a value with the given `kind` should be allowed to cast to this type. 
- static bool _isMatchingKind(Kind kind) + static bool isMatchingKind(FossilizedValKind kind) { switch (kind) { default: return false; - case Kind::Struct: - case Kind::Tuple: + case FossilizedValKind::Struct: + case FossilizedValKind::Tuple: return true; } } }; // -// Some of the following subtypes of `FossilizedVal` are -// named as `Fossilized*Obj` rather than `Fossilized*Val`, -// to indicate that they will only ever be located on the -// other side of a pointer indirection. -// -// E.g., a field of a fossilized struct value should never -// have a layout claiming it to be of kind `String`; instead -// it should show as a field of kind `Ptr`, where the -// pointed-to type is `String`. The same goes for `Optional`, -// `Array`, and `Dictionary`. -// -// This distinction only matters when dealing with things like -// an *optional* string, because instead of an in-memory -// layout like `Ptr -> Optional -> Ptr -> String`, the fossilized -// data will simply store `Ptr -> Optional -> String`. +// A *variant* is a value that can conceptually hold data of any type/layout, +// and stores a pointer to layout information so that the data it holds +// can be navigated dynamically. // -struct FossilizedStringObj : FossilizedVal +struct FossilizedVariantObj { public: - Size getSize() const; - UnownedTerminatedStringSlice getValue() const; + using Layout = FossilizedValLayout; + static const FossilizedValKind kKind = FossilizedValKind::VariantObj; + + FossilizedValLayout* getContentLayout() const; - /// Determine if a value with the given `kind` should be allowed to cast to this type. - static bool _isMatchingKind(Kind kind) { return kind == Kind::String; } + void* getContentDataPtr() { return this; } + void const* getContentDataPtr() const { return this; } + + static bool isMatchingKind(FossilizedValKind kind) + { + return kind == FossilizedValKind::VariantObj; + } private: - // Before the `this` address, there is a `FossilUInt` - // with the size of the string in bytes. 
+ // Before the `this` address, there is a `FossilizedPtr<FossilizedValLayout>` + // with the layout of the content. // - // At the `this` address there is a nul-terminated - // serquence of `getSize() + 1` bytes. + // The content itself starts at the `this` address, with its + // layout determined by `getContentLayout()`. }; -struct FossilizedOptionalObj : FossilizedVal +struct FossilizedVariant { public: - using Layout = FossilizedPtrLikeLayout; +private: + FossilizedPtr<FossilizedVariantObj> _obj; +}; - /// Determine if a value with the given `kind` should be allowed to cast to this type. - static bool _isMatchingKind(Kind kind) { return kind == Kind::Optional; } +static_assert(sizeof(FossilizedVariant) == sizeof(FossilUInt)); - FossilizedVal* getValue() { return this; } +// +// Now that all of the relevant types for fossilized data have been defined, +// we can circle back to define the specializations of `FossilizedPtrTargetType` +// for the types that need it. +// - FossilizedVal const* getValue() const { return this; } +namespace details +{ +template<typename X> +FossilizedStringObj fossilizedPtrTargetType(X*, FossilizedString*); -private: - // An absent optional is encoded as a null pointer - // (so `this` would be null), while a present value - // is encoded as a pointer to that value. Thus the - // held value is at the same address as `this`. +template<typename X> +FossilizedVariantObj fossilizedPtrTargetType(X*, FossilizedVariant*); + +template<typename X, typename T> +FossilizedArrayObj<T> fossilizedPtrTargetType(X*, FossilizedArray<T>*); + +template<typename X, typename K, typename V> +FossilizedDictionaryObj<K, V> fossilizedPtrTargetType(X*, FossilizedDictionary<K, V>*); +} // namespace details + +// +// In addition to being able to expose a statically-known +// layout through `Fossilized<T>`, the fossil format also +// allows data to be *self-describing*, by carrying its layout +// with it. 
+// +// A `FossilizedValLayout` describes the in-memory layout of a fossilized +// value. Given a `FossilizedValLayout` and a pointer to the data +// for a particular value, it is possible to inspect the structure +// of the fossilized data. +// +// If all you have is a `void*` to a fossilized value, then there is no way +// to access its contents without assuming it is of some particular type and +// casting it. +// +// A `FossilizedVariantObj` is a fossilized value that is self-describing; +// it stores a (relative) pointer to a layout, which can be used to inspect +// its own data/state. +// + +struct FossilizedValLayout; +struct FossilizedPtrLikeLayout; +struct FossilizedContainerLayout; +struct FossilizedRecordLayout; +struct FossilizedVariantObj; + +/// Layout information about a fossilized value in memory. +/// +/// +/// Every `FossilizedValLayout` stores the kind of the value. +/// Based on that kind, specific additional fields may be +/// available as part of the layout. +/// +struct FossilizedValLayout +{ + FossilizedValKind kind; }; -struct FossilizedContainerObj : FossilizedVal +namespace Fossil +{ +/// A reference to a fossilized value in memory, along with layout information. +/// +template<typename T, typename L = typename T::Layout> +struct ValRefBase { public: - using Layout = FossilizedContainerLayout; + using Val = T; + using Layout = L; - Count getElementCount() const; + /// Construct a null reference. + /// + ValRefBase() {} - /// Determine if a value with the given `kind` should be allowed to cast to this type. - static bool _isMatchingKind(Kind kind) + /// Construct a reference to the given `data`, assuming it has the given `layout`. + /// + ValRefBase(T* data, Layout const* layout) + : _data(data), _layout(layout) { - switch (kind) - { - default: - return false; + } - case Kind::Array: - case Kind::Dictionary: - return true; - } + /// Construct a copy of `ref`. + /// + /// Only enabled if `U*` is convertible to `T*`. 
+ /// + template<typename U> + ValRefBase(ValRefBase<U> ref, std::enable_if_t<std::is_convertible_v<U*, T*>, void>* = nullptr) + : _data(ref.getDataPtr()), _layout((Layout const*)ref.getLayout()) + { } -private: - // Before the `this` address, there is a `FossilUInt` - // with the number of elements. - // - // At the `this` address there is a sequence of - // `getCount()` elements. The layout of those elements - // cannot be determined without having a `FossilizedContainerLayout` - // for this container. + /// Get a pointer to the value being referenced. + /// + T* getDataPtr() const { return _data; } + + /// Get a reference to the value being referenced. + /// + /// This accessor is disabled in the case where `T` is `void`. + /// + template<typename U = T> + std::enable_if_t<!std::is_same_v<U, void>, T>& getDataRef() const + { + return *_data; + } + + /// Get the layout of the value being referenced. + /// + Layout const* getLayout() const { return _layout; } + + /// Get the kind of value being referenced. + /// + /// This reference must not be null. + /// + FossilizedValKind getKind() const + { + SLANG_ASSERT(getLayout()); + return getLayout()->kind; + } + +protected: + T* _data = nullptr; + Layout const* _layout = nullptr; +}; + +/// A reference to a fossilized value in memory, along with layout information. +/// +template<typename T> +struct ValRef : ValRefBase<T> +{ + using ValRefBase<T>::ValRefBase; }; -struct FossilizedVariantObj : FossilizedVal +/// Specialization of `ValRef<T>` for the case where `T` is `void`. +/// +template<> +struct ValRef<void> : ValRefBase<void, FossilizedValLayout> +{ + using ValRefBase<void, FossilizedValLayout>::ValRefBase; +}; + +/// A pointer to a fossilized value in memory, along with layout information. +/// +template<typename T> +struct ValPtr { public: - FossilizedValLayout* getContentLayout() const; + using TargetVal = T; + using TargetLayout = typename ValRef<T>::Layout; + /// Construct a null pointer. 
+ /// + ValPtr() {} + ValPtr(std::nullptr_t) {} - FossilizedVal* getContentData() { return this; } - FossilizedVal const* getContentData() const { return this; } + /// Construct a pointer to the given `data`, assuming it has the given `layout`. + /// + ValPtr(T* data, TargetLayout const* layout) + : _ref(data, layout) + { + } - static bool _isMatchingKind(Kind kind) { return kind == Kind::Variant; } + /// Construct a pointer to the value referenced by `ref`. + /// + /// This constructor is basically equivalent to the address-of operator `&`. + /// We define it as a constructor as a slightly more preferable alternative + /// to overloading prefix `operator&` (which is almost always a Bad Idea) + /// + explicit ValPtr(ValRef<T> ref) + : _ref(ref) + { + } + + /// Construct a copy of `ptr`. + /// + /// Only enabled if `U*` is convertible to `T*`. + /// + template<typename U> + ValPtr(ValPtr<U> ptr, std::enable_if_t<std::is_convertible_v<U*, T*>, void>* = nullptr) + : _ref(*ptr) + { + } + + /// Get a pointer to the value being referenced. + /// + T* getDataPtr() const { return _ref.getDataPtr(); } + + /// Get the layout of the value being referenced. + /// + TargetLayout* getLayout() const { return _ref.getLayout(); } + + T* get() const { return _ref.getDataPtr(); } + operator T*() const { return get(); } + + /// Dereference this `ValPtr` to get a `ValRef`. + /// + ValRef<T> operator*() const { return _ref; } + + /// Dereference this `ValPtr` for member access. + /// + /// Note that an overloaded `operator->` must return either + /// a pointer or a type that itself overloads `operator->`. + /// Because `ValRef<T>` is not functionally a "smart pointer" + /// to a `T`, the logical behavior here is that we want + /// `someValPtr->foo` to be equivalent to `someValRef.foo`, + /// where `someValRef` is a reference to the same value + /// that `someValPtr` points to. 
The correct way to get + /// that behavior is for the `operator->` on `ValPtr` + /// to return a pointer to a `ValRef`. + /// + ValRef<T> const* operator->() const { return &_ref; } private: - // Before the `this` address, there is a `FossilizedPtr<FossilizedValLayout>` - // with the layout of the content. - // - // The content itself starts at the `this` address, with its - // layout determined by `getContentLayout()`. + ValRef<T> _ref; }; -/// Dynamic cast of a reference to a fossilized value. +/// Get a `ValPtr` pointing to the same value as the given `ref`. /// -template<typename T, typename U> -FossilizedValRef_<T> as(FossilizedValRef_<U> valRef) +template<typename T> +inline ValPtr<T> getAddress(ValRef<T> ref) { - if (!valRef || !T::_isMatchingKind(valRef.getKind())) - return FossilizedValRef_<T>(); - - return FossilizedValRef_<T>( - static_cast<T*>(valRef.getData()), - reinterpret_cast<typename T::Layout*>(valRef.getLayout())); + return ValPtr<T>(ref); } -using FossilizedInt8ValRef = FossilizedValRef_<FossilizedInt8Val>; -using FossilizedInt16ValRef = FossilizedValRef_<FossilizedInt16Val>; -using FossilizedInt32ValRef = FossilizedValRef_<FossilizedInt32Val>; -using FossilizedInt64ValRef = FossilizedValRef_<FossilizedInt64Val>; -using FossilizedUInt8ValRef = FossilizedValRef_<FossilizedUInt8Val>; -using FossilizedUInt16ValRef = FossilizedValRef_<FossilizedUInt16Val>; -using FossilizedUInt32ValRef = FossilizedValRef_<FossilizedUInt32Val>; -using FossilizedUInt64ValRef = FossilizedValRef_<FossilizedUInt64Val>; -using FossilizedFloat32ValRef = FossilizedValRef_<FossilizedFloat32Val>; -using FossilizedFloat64ValRef = FossilizedValRef_<FossilizedFloat64Val>; -using FossilizedBoolValRef = FossilizedValRef_<FossilizedBoolVal>; -using FossilizedStringObjRef = FossilizedValRef_<FossilizedStringObj>; -using FossilizedPtrValRef = FossilizedValRef_<FossilizedPtrVal>; -using FossilizedOptionalObjRef = FossilizedValRef_<FossilizedOptionalObj>; -using 
FossilizedContainerObjRef = FossilizedValRef_<FossilizedContainerObj>; -using FossilizedRecordValRef = FossilizedValRef_<FossilizedRecordVal>; -using FossilizedVariantObjRef = FossilizedValRef_<FossilizedVariantObj>; +using AnyValRef = ValRef<void>; +using AnyValPtr = ValPtr<void>; -FossilizedValRef getPtrTarget(FossilizedPtrValRef ptrRef); +// +// In order to make `ValRef<T>` more usable in contexts where we want +// to make use of the knowledge that it refers to a `T`, we define +// various specializations of `ValRef` for the specific types that +// are relevant for decoding serialized data. +// +// Note that we do not need to define any specializations of +// `ValPtr`, because that is ultimately just a wrapper around +// `ValRef`. +// -bool hasValue(FossilizedOptionalObjRef optionalRef); -FossilizedValRef getValue(FossilizedOptionalObjRef optionalRef); +template<> +struct ValRef<FossilizedStringObj> : ValRefBase<FossilizedStringObj> +{ +public: + using ValRefBase<FossilizedStringObj>::ValRefBase; + + Size getSize() const { return getDataPtr()->getSize(); } + UnownedTerminatedStringSlice get() const { return getDataPtr()->get(); } + + operator UnownedTerminatedStringSlice() const { return get(); } +}; -Count getElementCount(FossilizedContainerObjRef containerRef); -FossilizedValRef getElement(FossilizedContainerObjRef containerRef, Index index); -Count getFieldCount(FossilizedRecordValRef recordRef); -FossilizedValRef getField(FossilizedRecordValRef recordRef, Index index); +template<> +struct ValRef<FossilizedContainerObjBase> : ValRefBase<FossilizedContainerObjBase> +{ +public: + using ValRefBase<FossilizedContainerObjBase>::ValRefBase; -FossilizedValRef getVariantContent(FossilizedVariantObjRef variantRef); -FossilizedValRef getVariantContent(FossilizedVariantObj* variantPtr); + Count getElementCount() const + { + auto data = this->getDataPtr(); + if (!data) + return 0; + return data->getElementCount(); + } + AnyValRef getElement(Index index) const; +}; + + 
+template<> +struct ValRef<FossilizedArrayObjBase> : ValRefBase<FossilizedArrayObjBase> +{ +public: + using ValRefBase<FossilizedArrayObjBase>::ValRefBase; + + Count getElementCount() const + { + auto data = this->getDataPtr(); + if (!data) + return 0; + return data->getElementCount(); + } + + AnyValRef getElement(Index index) const; +}; + + +template<> +struct ValRef<FossilizedDictionaryObjBase> : ValRefBase<FossilizedDictionaryObjBase> +{ +public: + using ValRefBase<FossilizedDictionaryObjBase>::ValRefBase; + + Count getElementCount() const + { + auto data = this->getDataPtr(); + if (!data) + return 0; + return data->getElementCount(); + } + + AnyValRef getElement(Index index) const; +}; +template<> +struct ValRef<FossilizedOptionalObjBase> : ValRefBase<FossilizedOptionalObjBase> +{ +public: + using ValRefBase<FossilizedOptionalObjBase>::ValRefBase; + + bool hasValue() const { return this->getDataPtr() != nullptr; } + + AnyValRef getValue() const + { + SLANG_ASSERT(hasValue()); + return AnyValRef(this->getDataPtr(), this->getLayout()->elementLayout.get()); + } +}; + +template<> +struct ValRef<FossilizedRecordVal> : ValRefBase<FossilizedRecordVal> +{ +public: + using ValRefBase<FossilizedRecordVal>::ValRefBase; + + Count getFieldCount() const { return getLayout()->fieldCount; } + + AnyValRef getField(Index index) const; +}; + +template<typename T> +struct ValRef<FossilizedPtr<T>> : ValRefBase<FossilizedPtr<T>> +{ +public: + using ValRefBase<FossilizedPtr<T>>::ValRefBase; + + ValRef<T> getTargetValRef() const + { + auto ptrPtr = this->getDataPtr(); + return ValRef<T>(*ptrPtr, this->getLayout()->elementLayout.get()); + } + + ValPtr<T> getTargetValPtr() const { return ValPtr<T>(getTargetValRef()); } + + // ValRef<T> operator*() const; +}; + +// +// We support both static and dynamic casting of `ValPtr`s +// to fossilized data. In the dynamic case, the layout +// information associated with the pointer is used to +// determine if the cast is allowed. 
+// + +/// Statically cast a pointer to a fossilized value. +/// +template<typename T> +ValPtr<T> cast(AnyValPtr valPtr) +{ + if (!valPtr) + return ValPtr<T>(); + return ValPtr<T>( + static_cast<T*>(valPtr.getDataPtr()), + (typename T::Layout*)(valPtr->getLayout())); +} + +/// Dynamic cast of a pointer to a fossilized value. +/// +template<typename T> +ValPtr<T> as(AnyValPtr valPtr) +{ + if (!valPtr || !T::isMatchingKind(valPtr->getKind())) + { + return nullptr; + } + + return ValPtr<T>( + static_cast<T*>(valPtr.getDataPtr()), + (typename T::Layout*)(valPtr->getLayout())); +} + +} // namespace Fossil + +/// Get a dynamically-typed pointer to the content of a fossilized variant. +/// +/// This operation does not require a dynamically-typed `Fossil::ValPtr` +/// or `Fossil::ValRef` as input, because it makes use of the way that +/// a fossilized variant stores a (relative) pointer to the layout of +/// its content. +/// +Fossil::AnyValPtr getVariantContentPtr(FossilizedVariantObj* variantPtr); namespace Fossil { @@ -466,13 +1155,15 @@ struct Header FossilizedPtr<FossilizedVariantObj> rootValue; }; +static_assert(sizeof(Header) == 32); + /// Get the root object from a fossilized blob. /// /// This operation performs some basic validation on the blob to /// ensure that it doesn't seem incorrectly sized or otherwise /// corrupted/malformed. /// -FossilizedValRef getRootValue(ISlangBlob* blob); +Fossil::AnyValPtr getRootValue(ISlangBlob* blob); /// Get the root object from a fossilized blob. /// @@ -480,7 +1171,7 @@ FossilizedValRef getRootValue(ISlangBlob* blob); /// ensure that it doesn't seem incorrectly sized or otherwise /// corrupted/malformed. 
/// -FossilizedValRef getRootValue(void const* data, Size size); +Fossil::AnyValPtr getRootValue(void const* data, Size size); } // namespace Fossil } // namespace Slang diff --git a/source/slang/slang-module-library.cpp b/source/slang/slang-module-library.cpp index c3f4a1349..df3ae3687 100644 --- a/source/slang/slang-module-library.cpp +++ b/source/slang/slang-module-library.cpp @@ -39,6 +39,7 @@ void* ModuleLibrary::castAs(const Guid& guid) } SlangResult loadModuleLibrary( + ISlangBlob* blobHoldingSerializedData, const Byte* inData, size_t dataSize, String path, @@ -69,8 +70,11 @@ SlangResult loadModuleLibrary( for (auto moduleChunk : container->getModules()) { - auto loadedModule = - linkage->findOrLoadSerializedModuleForModuleLibrary(moduleChunk, container, sink); + auto loadedModule = linkage->findOrLoadSerializedModuleForModuleLibrary( + blobHoldingSerializedData, + moduleChunk, + container, + sink); if (!loadedModule) return SLANG_FAIL; @@ -111,6 +115,7 @@ SlangResult loadModuleLibrary( // Load the module ComPtr<IModuleLibrary> library; SLANG_RETURN_ON_FAIL(loadModuleLibrary( + blob, (const Byte*)blob->getBufferPointer(), blob->getBufferSize(), path, diff --git a/source/slang/slang-module-library.h b/source/slang/slang-module-library.h index 836366e93..2c25a8fb7 100644 --- a/source/slang/slang-module-library.h +++ b/source/slang/slang-module-library.h @@ -46,6 +46,7 @@ public: }; SlangResult loadModuleLibrary( + ISlangBlob* blobHoldingSerializedData, const Byte* inBytes, size_t bytesCount, String Path, diff --git a/source/slang/slang-serialize-ast.cpp b/source/slang/slang-serialize-ast.cpp index c5e54b835..1c05e3ce2 100644 --- a/source/slang/slang-serialize-ast.cpp +++ b/source/slang/slang-serialize-ast.cpp @@ -2,117 +2,985 @@ #include "slang-serialize-ast.h" #include "slang-ast-dispatch.h" +#include "slang-check.h" #include "slang-compiler.h" #include "slang-diagnostics.h" #include "slang-mangle.h" +#include "slang-parser.h" +#include 
"slang-serialize-ast.cpp.fiddle" #include "slang-serialize-fossil.h" #include "slang-serialize-riff.h" +#define SLANG_ENABLE_AST_DESERIALIZATION_STATS 0 +#define SLANG_DISABLE_ON_DEMAND_AST_DESERIALIZATION 1 + +FIDDLE() namespace Slang { -// TODO(tfoley): have the parser export this, or a utility function -// for initializing a `SyntaxDecl` in the common case. // -NodeBase* parseSimpleSyntax(Parser* parser, void* userData); +// The big picture here is that we will serialize all of the structures +// that make up the AST using the framework in `slang-serialize.h`, +// and the specific *implementation* of serialization from `slang-fossil.h`. +// +// There's a certain amount of work that needs to be done on a per-type basis +// to make all of the serialization magic work the way we want. In order to +// help illustrate what's going on before we grind through all the different +// types, we will start slow and define the needed pieces for a somewhat +// trivial type: `RefObject`. +// + +// +// For the general-purpose serialization framework in `slang-serialize.h`, the +// main requirement is that any type that we want to serialize should have an +// available overload of `serialize()`. +// +// +// In principle, the declarations and definitions of these functions ought to +// be more closely associated with the types that they pertain to, but for now +// they are all just getting dumped here in the AST serialization logic, because +// it is currenly the only place that cares about this stuff. +// +void serialize(Serializer const&, RefObject&) +{ + // There's actually no data stored in a `RefObject`, since it only exists + // to make reference-counting possible for other types. This function is + // primarily useful for cases where we might codegen logic to serialize + // a type by serializing its base class (if it has one) and then its fields. + // If the base class is `RefObject`, we want there to be an available + // overload of `serialize()` to handle that case. 
+} // -// Many of the types used in the AST can be serialized using -// just the `Serializer` type, so we will handle all of those first. +// In addition to using the general-purpose serialization system, we are +// specifically encoding the AST using the "fossil" format defined in `slang-fossil.h`. +// This format allows us to load the serialized data into memory and easily +// navigate it without having to deserialize any of its content. +// +// There are really two modes in which fossilized data can be navigated: +// +// * As a dynamically-typed graph of nodes, where a reference to a node +// comprises a data pointer and a layout pointer, with the layout +// describing the type and format of the data. +// +// * As a statically-typed data structure, where code can just cast a +// pointer to fossilized data to the type that it knows/expects it to +// have, and then access it like Just Another C++ Type. +// +// In order to enable the second of these modes, we need to do a little +// work to define the mapping from a "live" C++ type to its fossilized +// equivalent. +// +// In cases where a live type can use an existing fossilized type as +// its representation, we can specialize the `FossilizedTypeTraits` template: // -void serialize(Serializer const& serializer, ASTNodeType& value) +template<> +struct FossilizedTypeTraits<RefObject> { - serializeEnum(serializer, value); -} + struct FossilizedType + { + }; +}; -void serialize(Serializer const& serializer, TypeTag& value) +// +// The handling of `RefObject` was trivial, so let's cover a few more +// simple examples in detail before we move on to the rest of the things, +// where we will be using fiddle to generate a lot of the boilerplate +// code we'd otherwise be writing by hand. +// +// The `MatrixCoord` type is a fairly simple `struct` with two fields. 
+// While we could include this among the types we handle using fiddle, +// let's implement it by hand here, starting with the `serialize()` function: +// +void serialize(Serializer const& serializer, MatrixCoord& value) { - serializeEnum(serializer, value); + // We start with one of the `SLANG_SCOPED_SERIALIZER_*` + // macros, which basically just handles calling + // `ISerializerImpl::beginTuple()` at the start of our + // scope, and `ISerializer::endTuple()` at the end. + // + SLANG_SCOPED_SERIALIZER_TUPLE(serializer); + + // Next, we call `serialize()` on each of the fields + // of our type, in order. Ordinary overload resolution + // will pick the right function to call based on the type + // of the field itself. + // + serialize(serializer, value.row); + serialize(serializer, value.col); + + // Note: this one function handles both the read and write + // directions. For simple types like `MatrixCoord` the logic + // for reading and writing is symmetric, and writing it as + // one function ensures that the two paths are kept in sync. + // + // Some of the later examples will perform different logic + // in the read and write cases, and all of those require + // more conscious effort by contributors to keep the two + // paths matching. } -void serialize(Serializer const& serializer, BaseType& value) +// +// Writing the `serialize()` function is one piece of the picture, but +// if we want to be able to navigate a fossilized `MatrixCoord` in +// memory, we need to declare what its fossilized version will look like. +// +// We do that here by writing an explicit specialization of `FossilizedTypeTraits`: +// +template<> +struct FossilizedTypeTraits<MatrixCoord> { - serializeEnum(serializer, value); -} + // The `MatrixCoord` type can't map directly to any type + // for fossilized data, so we declare it here as a custom + // `struct`. 
+ // + struct FossilizedType + { + // The contents of a fossilized struct will typically + // just be the fossilized representation of each of + // its fields. + // + // We use `decltype()` to access the type of each of + // the fields, and `Fossilized<...>` to map those + // types to their fossilized equivalents. + // + // Note that this type definition must be consistent + // with the implementation of `serialize()` above, + // so it is important to keep the two consistent. + // That requirement of consistency is part of why + // it helps to generate these definitions rather + // than author them by hand. + // + Fossilized<decltype(MatrixCoord::row)> row; + Fossilized<decltype(MatrixCoord::col)> col; + }; +}; -void serialize(Serializer const& serializer, TryClauseType& value) +// +// In some cases we don't really want to serialize a type directly, +// and instead want to translate it to some intermediate format +// that can be serialized more conveniently. +// +// For example, the `SemanticVersion` type conceptually has multiple +// fields, but it is also designed so that it can be encoded conveniently +// as a single scalar value. We'll define our `serialize()` function +// so that it serializes that "raw" value instead: +// +void serialize(Serializer const& serializer, SemanticVersion& value) { - serializeEnum(serializer, value); + // This function is doing something a little "clever" to + // handle the fact that it might be used to either + // *write* a `SemanticVersion` to the serialized format, + // or to *read* one. + // + // In the case where we are writing, the following line + // will copy the `value` we want to write into the + // local variable `raw`, but if we are *reading* instead, + // this operation doesn't do anything useful. 
+ // + // The assumption being made here is that it is safe to + // call `getRawValue()` on any `SemanticVersion`, including + // one that has been default-constructed, because we have + // no guarantee that the incoming `value` represents anything + // useful or even *valid* in the case where we are reading + // (and thus expected to overwrite `value`). + // + SemanticVersion::RawValue raw = value.getRawValue(); + + // Depending on whether we are reading or writing, this next + // line will either write out the value of `raw` that was + // computed above, or it will read serialized data into `raw`, + // and overwrite the useless value from before. + // + serialize(serializer, raw); + + // Finally, we overwrite the `value` by converting `raw` + // back to a `SemanticVersion`. If we are in reading mode, + // this will do exactly what the caller wants/expects. + // If we are in *writing* mode, this line makes a few more + // subtle assumptions: + // + // * It assumes that we can safely round-trip any `SemanticVersion` + // through its `RawValue` without changing its meaning. + // + // * It assumes that the passed-in `value` will never be a + // reference to read-only memory, and that in the case where + // there are other concurrent accesses to `value`, this write + // will not somehow create a difficult-to-debug data hazard + // (e.g., there might be an overload of `operator=` that + // temporarily sets the object into a state that shouldn't be + // observed). + // + value = SemanticVersion::fromRaw(raw); + + // In cases where a given type doesn't satisfy all the assumptions + // being made above, it is relatively simple to just split the + // logic into distinct cases based on `isReading(serializer)` and + // avoid all the concerns. That conditional involves a virtual + // function call, so in cases where it can easily be avoided, + // we prefer to do the redundant copy-in and copy-out on a local + // variable, like in the code above. 
} -void serialize(Serializer const& serializer, DeclVisibility& value) +// +// Given the definition of `serialize()` above, it is clear that the +// fossilized representation of `SemanticVersion` would be the same +// as whatever the fossilized representation of `SemanticVersion::RawValue` +// would be. +// +// The fossil header provides some convenient macros for defining that +// one type gets fossilized as another: +// +SLANG_DECLARE_FOSSILIZED_AS(SemanticVersion, SemanticVersion::RawValue); + +// +// While in some cases we want to serialize something via an intermediate +// type that already exists (like for `SemanticVersion` and +// `SemanticVersion::RawValue` above), in other cases we need to *define* +// an intermediate type to store the data we care about in a more +// direct fashion. +// +// When serializing an AST `ModuleDecl`, there are certain pieces +// of data that are implicitly encoded in the object graph under +// that module declaration that are beneficial to make explicit +// in the serialized representation. +// +// As a concrete example, there are various declarations in the Slang +// core module that have to be "registered" with the `SharedASTBuilder` +// being used, so that they can be looked up by a well-defined tag +// (whether an integer or string) by other logic in the compiler. +// Those declarations can be found by doing a recursive search over +// the entire `Decl` hierarchy of a module, but doing such a recursive +// search would force us to load and inspect every single declaration +// in a module as part of deserialization, which would negate any +// possible benefits to supporting on-demand deserialization of those +// declarations. +// +// Thus, we define an intermediate `ASTModuleInfo` type that holds +// the pre-computed information that we want to serialize (and thus +// also represents the data that we will want to navigate in the +// serialized representation). 
+// +FIDDLE() +struct ASTModuleInfo { - serializeEnum(serializer, value); -} + FIDDLE(...) + + // We still want to serialize the original module declaration, + // and everything it transitively refers to. + // + FIDDLE() ModuleDecl* moduleDecl; + + // The intermediate type will store an explicit list of all of + // the declarations that we need to register upon loading + // this module (this list is expected to be empty for everything + // other than the core module). + // + FIDDLE() List<Decl*> declsToRegister; + + // Another example of data that we want to store explicitly + // rather than leave implicit in the declaration hierarchy is + // the set of declarations exported from the module, and their + // mangled names. + // + FIDDLE() OrderedDictionary<String, Decl*> mapMangledNameToDecl; +}; + +// +// Another case where we want to define an intermediate type is +// the `ContainerDeclDirectMemberDecls` type used to encapsulate +// the list of direct members for a `ContainerDecl` along with +// the acceleration structures used to enable efficient lookup +// of those declarations. +// +FIDDLE() +struct ContainerDeclDirectMemberDeclsInfo +{ + FIDDLE(...) + + // We need to store the ordered list of declarations, + // because many parts of the compiler need to access + // all of the direct members of a container, and the + // order of the direct members often matters (e.g., + // for layout). + // + FIDDLE() List<Decl*> decls; + + // One of the acceleration structures that a `ContainerDecl` + // may build and store is a list of those entries in + // `decls` that are marked as "transparent." This is + // not a commonly-occurring case, used only to support + // a few legacy features, so it is a bit wasteful to + // store such a list on *every* container decl in the + // serialized format, but the format itself isn't + // currently optimized for size, so we consider this + // fine for now. 
+ // + FIDDLE() List<FossilUInt> transparentDeclIndices; + + + // The other main acceleration structure that a `ContainerDecl` + // may build and store is a dictionary to map a string name to + // a declaration of that name (which is then the first node + // in an internally-linked list of *all* the declarations with + // the given name). + // + FIDDLE() OrderedDictionary<String, FossilUInt> mapNameToDeclIndex; +}; + +// +// Okay, that's enough examples for now. Let's move on to the next big +// topic... +// +// Many types in the AST need additional context information to be able to +// read or write them properly, so instead of passing around the basic +// `Serializer` type (which wraps an `ISerializerImpl`), for those types +// that need extra context we will be passing around an `ASTSerializer` +// (which pairs an `ISerializerImpl` with an `ASTSerialContext`, the latter +// interface providing the callbacks to handle the data types that need +// special-case behavior). +// + +struct ASTSerialContext; +using ASTSerializer = Serializer_<ISerializerImpl, ASTSerialContext>; + +/// Context interface for AST serialization +struct ASTSerialContext +{ +public: + virtual void handleASTNode(ASTSerializer const& serializer, NodeBase*& value) = 0; + virtual void handleASTNodeContents(ASTSerializer const& serializer, NodeBase* value) = 0; + virtual void handleName(ASTSerializer const& serializer, Name*& value) = 0; + virtual void handleSourceLoc(ASTSerializer const& serializer, SourceLoc& value) = 0; + virtual void handleToken(ASTSerializer const& serializer, Token& value) = 0; + virtual void handleContainerDeclDirectMemberDecls( + ASTSerializer const& serializer, + ContainerDeclDirectMemberDecls& value) = 0; +}; + + +// +// Now that we've covered some of the big-picture structure, and shown +// a few small examples, we will try to use fiddle to generate the code +// to handle as many of the remaining types as we can. 
+// +// TODO: It would be great to have more of this logic be driven by information +// that the fiddle tool scraped from the relevant declarations. +// -void serialize(Serializer const& serializer, BuiltinRequirementKind& value) +// +// We start with the easiest case, which is the various `enum` types that +// get stored as part of the AST. +// +#if 0 // FIDDLE TEMPLATE: +% +%local enumTypeNames = { +% "ASTNodeType", +% "TypeTag", +% "BaseType", +% "TryClauseType", +% "DeclVisibility", +% "BuiltinRequirementKind", +% "ImageFormat", +% "PreferRecomputeAttribute::SideEffectBehavior", +% "TreatAsDifferentiableExpr::Flavor", +% "LogicOperatorShortCircuitExpr::Flavor", +% "RequirementWitness::Flavor", +% "CapabilityAtom", +% "DeclAssociationKind", +% "TokenType", +% "ValNodeOperandKind", +% "SPIRVAsmOperand::Flavor", +% "SlangLanguageVersion", +%} +% +%for _,T in ipairs(enumTypeNames) do + +/// Serialize a `value` of type `$T`. +void serialize(Serializer const& serializer, $T& value) { serializeEnum(serializer, value); } -void serialize(Serializer const& serializer, ImageFormat& value) +% -- The `serializeEnum()` function encodes enum values as `FossilUInt`s +% -- so we declare the fossilized representation of these types to match. +% +// Declare fossilized representation of `$T` +SLANG_DECLARE_FOSSILIZED_AS($T, FossilUInt); + +%end +#else // FIDDLE OUTPUT: +#define FIDDLE_GENERATED_OUTPUT_ID 0 +#include "slang-serialize-ast.cpp.fiddle" +#endif // FIDDLE END + +// +// Next we have a few `struct` types that can be serialized just +// based on the information that the fiddle tool is able to +// scrape from their declarations. +// +// The main wrinkle here, as compared to the `enum` handling above, +// is that we will split this logic between two fiddle templates: +// one to generate forward declarations, and another to fill in +// the actual implementations. 
+// +// The forward declarations are needed to resolve ordering issues +// when types in the AST can transitively reference themselves +// through pointer chains. Because of the way that the `serialize()` +// approach relies on overload resolution, and the `FossilizedTypeTraits` +// approach relies on partial template specialization, it is important +// that the relevant declarations/specializations get seen before +// any use sites are encountered. +// +// TODO: Ideally we would be placing the declarations of the `serialize()` +// functions and the `FossilizedTypeTraits` specializations next to the +// declarations of the types themselves. It would be great if the scraper +// part of the fiddle tool could generate those for `FIDDLE()`-annotated +// types. +// +// The basic idea here is that for each struct type `Foo` that +// we want to serialize, we will forward-declare the `serialize()` +// function, and also forward-declare a type `Fossilized_Foo` +// that will represent a fossilized `Foo` (and we also wire it +// up so that `Fossilized<Foo>` will map to `Fossilized_Foo`). +// +// All of this can be done without ever iterating over the members +// of `Foo`, so we don't run into any ordering issues. +// +#if 0 // FIDDLE TEMPLATE: +% +% -- TODO: This declaration would ideally be `local` in Lua, +% -- but the way that fiddle currently translates the templates +% -- in a C++ file over to Lua puts each distinct template in +% -- its own nested function, which means that their `local` +% -- scopes are distinct. We should see if we can change the +% -- translation so that the code directly nested under a +% -- template like this is in the global scope. 
+% +%astStructTypes = { +% Slang.QualType, +% Slang.SPIRVAsmOperand, +% Slang.DeclAssociation, +% Slang.NameLoc, +% Slang.WitnessTable, +% Slang.SPIRVAsmInst, +% Slang.ASTModuleInfo, +% Slang.ContainerDeclDirectMemberDeclsInfo, +%} +% +%for _,T in ipairs(astStructTypes) do + +/// Fossilized representation of a `$T` +struct Fossilized_$T; + +SLANG_DECLARE_FOSSILIZED_TYPE($T, Fossilized_$T); + +/// Serialize a `$T` +void serialize(ASTSerializer const& serializer, $T& value); +%end +#else // FIDDLE OUTPUT: +#define FIDDLE_GENERATED_OUTPUT_ID 1 +#include "slang-serialize-ast.cpp.fiddle" +#endif // FIDDLE END + +// +// Now we move on to the AST nodes themselves (subtypes of `NodeBase`). +// +// The handling of these is largely the same as for the struct +// types above, except that the function that handles serializing +// them is called `_serializeASTNodeContents()`, because of the +// logic that we use to handle the polymorphism of `NodeBase`. +// +#if 0 // FIDDLE TEMPLATE: +% +%astNodeClasses = Slang.NodeBase.subclasses +% +%for _,T in ipairs(astNodeClasses) do + +/// Fossilized representation of a `$T` +struct Fossilized_$T; + +SLANG_DECLARE_FOSSILIZED_TYPE($T, Fossilized_$T); + +/// Serialize the content of a `$T` +void _serializeASTNodeContents(ASTSerializer const& serializer, $T* value); +%end +#else // FIDDLE OUTPUT: +#define FIDDLE_GENERATED_OUTPUT_ID 2 +#include "slang-serialize-ast.cpp.fiddle" +#endif // FIDDLE END + + +// +// We will define two different implementations of `ASTSerialContext`, one for +// the writing case, and one for the reading case. The writing direction is +// the simpler one, so let's look at it first: +// + +/// Context for writing a Slang AST to a serialized format. +/// +/// This type only provides the contextual information needed +/// to correctly write AST-related types, and delegates the +/// lower-level serialization operations to an underlying +/// `ISerializerImpl`. 
+/// +struct ASTSerialWriteContext : ASTSerialContext { - serializeEnum(serializer, value); +public: + /// Construct a context for writing a serialized AST. + /// + /// * `module` is the module that is being serialized, and will be + /// used to detect whether declarations are part of the module, + /// or imported from other modules. + /// + /// * `sourceLocWriter` will be used to handle translation of + /// `SourceLoc`s into a format suitable for serialization. + /// + ASTSerialWriteContext(ModuleDecl* module, SerialSourceLocWriter* sourceLocWriter) + : _module(module), _sourceLocWriter(sourceLocWriter) + { + } + +private: + ModuleDecl* _module = nullptr; + SerialSourceLocWriter* _sourceLocWriter = nullptr; + + // + // For the most part, this type just implements the methods + // of the `ASTSerialContext` interface, and then has some + // support routines needed by those implementations. + // + + virtual void handleName(ASTSerializer const& serializer, Name*& value) override; + virtual void handleSourceLoc(ASTSerializer const& serializer, SourceLoc& value) override; + virtual void handleToken(ASTSerializer const& serializer, Token& value) override; + virtual void handleASTNode(ASTSerializer const& serializer, NodeBase*& node) override; + virtual void handleASTNodeContents(ASTSerializer const& serializer, NodeBase* node) override; + virtual void handleContainerDeclDirectMemberDecls( + ASTSerializer const& serializer, + ContainerDeclDirectMemberDecls& value) override; + + void _writeImportedModule(ASTSerializer const& serializer, ModuleDecl* moduleDecl); + void _writeImportedDecl( + ASTSerializer const& serializer, + Decl* decl, + ModuleDecl* importedFromModuleDecl); + + ModuleDecl* _findModuleForDecl(Decl* decl) + { + for (auto d = decl; d; d = d->parentDecl) + { + if (auto m = as<ModuleDecl>(d)) + return m; + } + return nullptr; + } + + ModuleDecl* _findModuleDeclWasImportedFrom(Decl* decl) + { + auto declModule = _findModuleForDecl(decl); + if (declModule == 
nullptr) + return nullptr; + if (declModule == _module) + return nullptr; + return declModule; + } +}; + +// +// The reading direction is where things get a bit more interesting. +// +// In order to support on-demand deserialization, we need an object +// that persists across multiple deserialization requests, and that +// stores the state about what values have/haven't already been +// deserialized. Concretely, multiple requests to deserialize the +// same serialized declaration had better return the same `Decl*`. +// +// This is the place where we start concretely assuming that the +// AST will be written and read using the fossil format. +// + +/// Context for on-demand AST deserialization. +/// +/// This type owns the mapping from fossilized AST declarations +/// to their live `Decl*` counterparts. +/// +/// A single `ASTSerialReadContext` should be created and +/// maintained for the entire duration during which fossilized +/// declarations might need to be revitalized. Using multiple +/// contexts could result in the same declaration getting turned +/// into multiple distinct `Decl*`s. +/// +struct ASTSerialReadContext : public ASTSerialContext, public RefObject +{ +public: + /// Construct an AST deserialization context. + /// + /// The `linkage`, `astBuilder`, and `sink` arguments must + /// all remain valid for as long as this context will be used. + /// + /// The context will retain the `sourceLocReader` and the + /// `blobHoldingSerializedData`. It is assumed that the + /// `fossilizedModuleInfo` is a pointer into the + /// `blobHoldingSerializedData`, so that keeping the blob + /// alive will ensure that the pointer stays valid. 
+ /// + ASTSerialReadContext( + Linkage* linkage, + ASTBuilder* astBuilder, + DiagnosticSink* sink, + SerialSourceLocReader* sourceLocReader, + SourceLoc requestingSourceLoc, + Fossilized<ASTModuleInfo> const* fossilizedModuleInfo, + ISlangBlob* blobHoldingSerializedData) + : _linkage(linkage) + , _astBuilder(astBuilder) + , _sink(sink) + , _sourceLocReader(sourceLocReader) + , _requestingSourceLoc(requestingSourceLoc) + , _fossilizedModuleInfo(fossilizedModuleInfo) + , _blobHoldingSerializedData(blobHoldingSerializedData) + { + } + + /// Translate a fossilized declaration into a live `Decl*`. + /// + /// If the same `fossilizedDecl` address has been passed to this + /// operation before, it will return the same `Decl*`. + /// + /// Otherwise, this operation will trigger deserialization + /// of the `fossilizedDecl` and return the result. + /// + /// It is assumed that the `fossilizedDecl` comes from the same + /// serialized AST and the same data blob that were passed into + /// the constructor for `ASTSerialReadContext`. + /// + Decl* readFossilizedDecl(Fossilized<Decl>* fossilizedDecl); + + /// Look up an export from the fossilized module, by its mangled name. + /// + /// If a matching export is found in the serialized data, returns + /// the corresponding declaration as if `readFossilizedDecl()` was + /// invoked on it. + /// + /// If no matching export is found, returns null. + /// + Decl* findExportedDeclByMangledName(UnownedStringSlice const& mangledName); + +private: + Linkage* _linkage = nullptr; + ASTBuilder* _astBuilder = nullptr; + DiagnosticSink* _sink = nullptr; + RefPtr<SerialSourceLocReader> _sourceLocReader = nullptr; + SourceLoc _requestingSourceLoc; + Fossilized<ASTModuleInfo> const* _fossilizedModuleInfo; + ComPtr<ISlangBlob> _blobHoldingSerializedData; + + // + // The actual cache for the mapping from fossilized declaration pointers + // to their revitalized `Decl*`s is maintained by the `Fossil::ReadContext`. 
+ // + + Fossil::ReadContext _readContext; + +#if SLANG_ENABLE_AST_DESERIALIZATION_STATS + Count _deserializedTopLevelDeclCount = 0; +#endif + + // + // Much like the `ASTSerialWriteContext`, for the most part this + // type just implements the `ASTSerialContext` interface, + // plus a small number of utility methods that serve those + // implementations. + // + + virtual void handleName(ASTSerializer const& serializer, Name*& value) override; + virtual void handleSourceLoc(ASTSerializer const& serializer, SourceLoc& value) override; + virtual void handleToken(ASTSerializer const& serializer, Token& value) override; + virtual void handleASTNode(ASTSerializer const& serializer, NodeBase*& outNode) override; + virtual void handleASTNodeContents(ASTSerializer const& serializer, NodeBase* node) override; + virtual void handleContainerDeclDirectMemberDecls( + ASTSerializer const& serializer, + ContainerDeclDirectMemberDecls& value) override; + + ModuleDecl* _readImportedModule(ASTSerializer const& serializer); + NodeBase* _readImportedDecl(ASTSerializer const& serializer); + + void _cleanUpASTNode(NodeBase* node); + void _assignGenericParameterIndices(GenericDecl* genericDecl); +}; + +// +// Let's look at a concrete example of how the `ASTSerialReadContext` +// and `ASTSerialWriteContext` get applied to handle one of the types +// that needs them for additional context. +// +// The `serialize()` function for `SourceLoc` is declared to take +// an `ASTSerializer` argument instead of a simple `Serializer`: +// +void serialize(ASTSerializer const& serializer, SourceLoc& value) +{ + // Its body is trivial, because the actual handling of `SourceLoc` + // serialization is delegated to the `ASTSerialWriteContext` and + // `ASTSerialReadContext`. 
+ // + serializer.getContext()->handleSourceLoc(serializer, value); } -void serialize(Serializer const& serializer, PreferRecomputeAttribute::SideEffectBehavior& value) +void ASTSerialWriteContext::handleSourceLoc(ASTSerializer const& serializer, SourceLoc& value) { - serializeEnum(serializer, value); + // Writing of source location information can be disabled by + // compiler options, and in that case the `_sourceLocWriter` + // may be null. + // + // In order to handle that possibility, we serialize a `SourceLoc` + // as an optional value, dependent on whether we have a + // `_sourceLocWriter` that can be used. + // + SLANG_SCOPED_SERIALIZER_OPTIONAL(serializer); + if (_sourceLocWriter != nullptr) + { + // The `SourceLoc` type is implemented under the hood as an + // integer offset that can only be decoded using the specific + // `SourceManager` that created it. + // + // The source location writer handles the task of translating + // the under-the-hood representation to a single integer value + // (represented as `SerialSourceLocData::SourceLoc`) that can + // be decoded on the other side using other data that the + // source location writer will write out as part of its own + // representation (all of which goes into the dedicated debug + // data chunk, distinct from the AST). + // + SerialSourceLocData::SourceLoc rawValue = _sourceLocWriter->addSourceLoc(value); + serialize(serializer, rawValue); + } } -void serialize(Serializer const& serializer, TreatAsDifferentiableExpr::Flavor& value) +void ASTSerialReadContext::handleSourceLoc(ASTSerializer const& serializer, SourceLoc& value) { - serializeEnum(serializer, value); + // Because the source location was *written* as an optional, + // we clearly need to *read* it as one. 
+ // + SLANG_SCOPED_SERIALIZER_OPTIONAL(serializer); + if (hasElements(serializer)) + { + SerialSourceLocData::SourceLoc rawValue; + serialize(serializer, rawValue); + + // Even if the serialized optional had a value, it is + // possible that the debug-data chunk got stripped from + // the compiled module file, in which case we wouldn't + // have access to the data needed to decode it. + // + // In that case, the `_sourceLocReader` member would be + // null, so we handle that possibility here. + // + if (auto sourceLocReader = _sourceLocReader) + { + value = sourceLocReader->getSourceLoc(rawValue); + } + } } -void serialize(Serializer const& serializer, LogicOperatorShortCircuitExpr::Flavor& value) +// Now that we've seen the relevant serialization logic, it is clear that +// a `SourceLoc` gets fossilized the same way that an optional wrapping +// an integer (of type `SerialSourceLocData::SourceLoc`) would. +// +SLANG_DECLARE_FOSSILIZED_AS(SourceLoc, std::optional<SerialSourceLocData::SourceLoc>); + + +// +// Earlier we generated forward declarations for all of the types +// that we'll be able to handle with fiddle, but there are still a +// large number of types that we currently have to hand-write the +// serialization logic for. We'll go over those here. +// + +// +// A `Name` is basically just a string, but we need to handle +// a `Name*` as a pointer, and deal with the possibility that +// it might be null. +// +// TODO: It might be better to customize the serialization of +// `Name*` itself, so that it is handled as an optional string. 
+// + +SLANG_DECLARE_FOSSILIZED_AS(Name, String); + +void serializeObject(ASTSerializer const& serializer, Name*& value, Name*) { - serializeEnum(serializer, value); + serializer.getContext()->handleName(serializer, value); } -void serialize(Serializer const& serializer, RequirementWitness::Flavor& value) +void ASTSerialWriteContext::handleName(ASTSerializer const& serializer, Name*& value) { - serializeEnum(serializer, value); + serialize(serializer, value->text); } -void serialize(Serializer const& serializer, CapabilityAtom& value) +void ASTSerialReadContext::handleName(ASTSerializer const& serializer, Name*& value) { - serializeEnum(serializer, value); + String text; + serialize(serializer, text); + value = _astBuilder->getNamePool()->getName(text); } -void serialize(Serializer const& serializer, DeclAssociationKind& value) +// +// A `Token` is *almost* an easy type to handle, and +// the declaration for its fossilized representation +// makes it look like it should be a simple `struct` +// that we can let fiddle generate the implementation +// for: +// + +template<> +struct FossilizedTypeTraits<Token> +{ + struct FossilizedType + { + Fossilized<decltype(Token::type)> type; + Fossilized<decltype(Token::loc)> loc; + Fossilized<decltype(Token::flags)> flags; + Fossilized<String> content; + }; +}; + +void serialize(ASTSerializer const& serializer, Token& value) { - serializeEnum(serializer, value); + serializer.getContext()->handleToken(serializer, value); } -void serialize(Serializer const& serializer, TokenType& value) +// +// The cracks start to show when we look at the logic +// for writing a `Token`: +// + +void ASTSerialWriteContext::handleToken(ASTSerializer const& serializer, Token& value) { - serializeEnum(serializer, value); + SLANG_SCOPED_SERIALIZER_STRUCT(serializer); + serialize(serializer, value.type); + serialize(serializer, value.loc); + + // The flags stored in a `Token` have one bit + // (`TokenFlag::Name`) that we don't want + // to have read in 
on the other side, because + // it relates to some aspects of the underlying + // in-memory representation that don't actually + // relate to the semantic *value* we are serializing. + + TokenFlags flags = TokenFlags(value.flags & ~TokenFlag::Name); + serialize(serializer, flags); + + // The content of a token is basically just a + // string, but it can be encoded in different + // ways, so we extract it here for writing. + // + String content = value.getContent(); + serialize(serializer, content); } -void serialize(Serializer const& serializer, ValNodeOperandKind& value) +// +// The reading logic adds yet more complexity... +// + +void ASTSerialReadContext::handleToken(ASTSerializer const& serializer, Token& value) { - serializeEnum(serializer, value); + SLANG_SCOPED_SERIALIZER_STRUCT(serializer); + serialize(serializer, value.type); + serialize(serializer, value.loc); + + serialize(serializer, value.flags); + + String content; + serialize(serializer, content); + + // Note that we cannot just call `value.setContent(...)` + // and pass in an `UnownedStringSlice` of `content`, + // because the `Token` will not take ownership of its own + // textual content. + // + // Instead, we need to get the text we just loaded + // into something that the `Token` can refer to, + // and the easiest way to accomplish that is to + // represent the text using a `Name`. + // + Name* name = _astBuilder->getNamePool()->getName(content); + value.setName(name); } -void serialize(Serializer const& serializer, SPIRVAsmOperand::Flavor& value) +// +// While we use fiddle to generate a lot of the code related to +// specific subclasses of `NodeBase`, the logic to serialize +// a `NodeBase*` itself needs to be special-cased by intercepting +// the `serializeObject()` customization point provided by +// the serialization system. 
+// +// We'll cover the implementations of `handleASTNode()` for the +// reading and writing cases later; what matters now is to +// establish this declaration before any code that tries to +// serialize any pointers to AST nodes. +// + +template<typename T> +void serializeObject(ASTSerializer const& serializer, T*& value, NodeBase*) { - serializeEnum(serializer, value); + // The general-purpose serialization layer defines + // a variant as akin to a struct, but where the + // specific number and type of fields that get written + // can vary from value to value, for the same type. + // + // The fossil encoding of a variant is always via indirection, + // as a pointer to a memory region holding the particular + // value, along with a pointer to the layout information + // for that value. + // + // Because `NodeBase` is the base class of a polymorphic + // class hierarchy, we treat all pointers to `NodeBase`-derived + // types as variants for serialization purposes. + // + SLANG_SCOPED_SERIALIZER_VARIANT(serializer); + serializer.getContext()->handleASTNode(serializer, reinterpret_cast<NodeBase*&>(value)); } -void serialize(Serializer const& serializer, SlangLanguageVersion version) +// +// We also intercept the `serializeObjectContents()` customization +// point, which is used to read/write most of the actual members +// of an AST node, whereas the `serializeObject()` step just deals +// with the parts that are necessary to allocate (or find) an +// object in the reading direction. 
+// + +void serializeObjectContents(ASTSerializer const& serializer, NodeBase* value, NodeBase*) { - serializeEnum(serializer, version); + serializer.getContext()->handleASTNodeContents(serializer, value); } -void serialize(Serializer const& serializer, MatrixCoord& value) +// +// The handling of the members of a `ContainerDecl` is another +// complicated part of the serialization process, so we will +// define the `serialize()` function here, but defer the +// implementation of the handler it calls until later. +// +// We know, however, that the members will be serialized via +// the intermediate `ContainerDeclDirectMemberDeclsInfo` type +// that was defined earlier in this file. +// + +SLANG_DECLARE_FOSSILIZED_AS(ContainerDeclDirectMemberDecls, ContainerDeclDirectMemberDeclsInfo); + +void serialize(ASTSerializer const& serializer, ContainerDeclDirectMemberDecls& value) { - SLANG_SCOPED_SERIALIZER_TUPLE(serializer); - serialize(serializer, value.row); - serialize(serializer, value.col); + serializer.getContext()->handleContainerDeclDirectMemberDecls(serializer, value); } +// +// Pointers to diagnostics (which can be referenced in attributes +// related to enabling/disabling warnings) get serialized as +// the integer diagnostic ID. +// + +SLANG_DECLARE_FOSSILIZED_AS(DiagnosticInfo const*, Int32); + void serializePtr(Serializer const& serializer, DiagnosticInfo const*& value, DiagnosticInfo const*) { Int32 id = 0; @@ -128,13 +996,35 @@ void serializePtr(Serializer const& serializer, DiagnosticInfo const*& value, Di } } -void serialize(Serializer const& serializer, SemanticVersion& value) + +// +// A `DeclRef<T>` is just a wrapper around a `DeclRefBase*`, +// and we'll serialize it as such. 
+// + +template<typename T> +void serialize(ASTSerializer const& serializer, DeclRef<T>& value) { - auto raw = value.getRawValue(); - serialize(serializer, raw); - value = SemanticVersion::fromRaw(raw); + serialize(serializer, value.declRefBase); } +template<typename T> +struct FossilizedTypeTraits<DeclRef<T>> +{ + // TODO: This case can't be declared with `SLANG_DECLARE_FOSSILIZED_AS()` + // because of the need for the template parameter `T`. A more advanced + // version of that macro could also allow for template parameters, + // but for now it is okay to just write these cases out long-form. + // + using FossilizedType = Fossilized<DeclRefBase*>; +}; + +// +// A `SyntaxClass<T>` is a wrapper around an `ASTNodeType`: +// + +SLANG_DECLARE_FOSSILIZED_AS(SyntaxClass<NodeBase>, ASTNodeType); + void serialize(Serializer const& serializer, SyntaxClass<NodeBase>& value) { ASTNodeType raw = ASTNodeType(0); @@ -150,118 +1040,112 @@ void serialize(Serializer const& serializer, SyntaxClass<NodeBase>& value) } // -// Many types in the AST need additional context (beyond -// what the `Serializer` has) in order to serialize -// themselves or their members. +// The `Modifiers` type is just a wrapper around the way +// that the `Modifier` type uses an internally-linked list. // -// We define a custom serializer interface to capture -// the cases that can't be handled by a `Serializer` -// alone. +// We serialize `Modifiers` as if they were just using +// the ordinary `List<T>` type (which maybe they should...). // -/// Interface for AST serialization -struct ASTSerializerImpl -{ -public: - virtual void handleASTNode(NodeBase*& value) = 0; - virtual void handleASTNodeContents(NodeBase* value) = 0; - virtual void handleName(Name*& value) = 0; - virtual void handleSourceLoc(SourceLoc& value) = 0; - virtual void handleToken(Token& value) = 0; - - // Note that this type does *not* inherit from `ISerializerImpl`. 
- // - // We want to decouple the AST-specific context information - // from the lower-level details of the serialization format. - // - // Instead of using inheritance, we expect that any - // `ASTSerializerImpl` will aggregate a lower-level - // serializer, and the interface exposes access to - // that base serializer implementation. - - virtual ISerializerImpl* getBaseSerializer() = 0; -}; +SLANG_DECLARE_FOSSILIZED_AS(Modifiers, List<Modifier*>); -/// Specialization of `Serializer_` for AST serialization. -template<> -struct Serializer_<ASTSerializerImpl> : SerializerBase<ASTSerializerImpl> +void serialize(ASTSerializer const& serializer, Modifiers& value) { -public: - using SerializerBase::SerializerBase; + SLANG_SCOPED_SERIALIZER_ARRAY(serializer); + // Because we are dealing with a list, rather + // than a more mundane aggregate type like + // a struct, we need our logic to distinguish + // between the writing and reading cases. // - // In order to allow an `ASTSerializer` to be used with - // functions that expect an ordinary `Serializer`, we - // implement an implicit conversion operator. - // - - operator Serializer() const { return Serializer(get()->getBaseSerializer()); } -}; + if (isWriting(serializer)) + { + for (auto modifier : value) + { + serialize(serializer, modifier); + } + } + else + { + Modifier** link = &value.first; -/// Context type for AST serialization. 
-using ASTSerializer = Serializer_<ASTSerializerImpl>; + while (hasElements(serializer)) + { + Modifier* modifier = nullptr; + serialize(serializer, modifier); -template<typename T> -void serializeObject(ASTSerializer const& serializer, T*& value, NodeBase*) -{ - SLANG_SCOPED_SERIALIZER_VARIANT(serializer); - serializer->handleASTNode(*(NodeBase**)&value); + *link = modifier; + link = &modifier->next; + } + } } -void serializeObjectContents(ASTSerializer const& serializer, NodeBase* value, NodeBase*) -{ - serializer->handleASTNodeContents(value); -} +// +// For the purposes of serialization, a `TypeExp` is just +// a wrapper around a `Type*`. +// +// (Under the hood a `TypeExp` has room to store both a +// type *expression* (an `Expr*`) and the `Type*` that +// we compute as a result of checking that type expression. +// For any AST that has passed front-end semantic checking, +// the `Type*` part is expected to be filled in, and the +// `Expr*` part is no longer relevant.) +// +// Here we use another convenience macro to declare that +// the fossilized representation of a `TypeExp` is the same +// as that of the `TypeExp::type` member. +// +SLANG_DECLARE_FOSSILIZED_AS_MEMBER(TypeExp, type); -template<typename T> -void serialize(ASTSerializer const& serializer, DeclRef<T>& value) +void serialize(ASTSerializer const& serializer, TypeExp& value) { - serialize(serializer, value.declRefBase); + serialize(serializer, value.type); } -void serialize(ASTSerializer const& serializer, SourceLoc& value) +// +// The `CandidateExtensionList` and `DeclAssociationList` types +// are simple wrappers around a single field. 
+// + +SLANG_DECLARE_FOSSILIZED_AS_MEMBER(CandidateExtensionList, candidateExtensions); + +void serialize(ASTSerializer const& serializer, CandidateExtensionList& value) { - serializer->handleSourceLoc(value); + serialize(serializer, value.candidateExtensions); } -void serialize(ASTSerializer const& serializer, RequirementWitness& value) + +SLANG_DECLARE_FOSSILIZED_AS_MEMBER(DeclAssociationList, associations); + +void serialize(ASTSerializer const& serializer, DeclAssociationList& value) { - SLANG_SCOPED_SERIALIZER_VARIANT(serializer); - serialize(serializer, value.m_flavor); - switch (value.m_flavor) - { - case RequirementWitness::Flavor::none: - break; + serialize(serializer, value.associations); +} - case RequirementWitness::Flavor::declRef: - serialize(serializer, value.m_declRef); - break; +// +// The various types used to store capabilities on declarations +// are all semantically equivalent to simpler types. +// - case RequirementWitness::Flavor::val: - serialize(serializer, value.m_val); - break; +// A `CapabilityAtomSet` is an optimized representation of a +// set of `CapabilityAtom`s (which we can encode as just +// a sequence). +// +SLANG_DECLARE_FOSSILIZED_AS(CapabilityAtomSet, List<CapabilityAtom>); - case RequirementWitness::Flavor::witnessTable: - serialize(serializer, value.m_obj); - break; - } -} +// A `CapabilityStageSet` can simply be encoded using its `atomSet` member. +// +SLANG_DECLARE_FOSSILIZED_AS_MEMBER(CapabilityStageSet, atomSet); -void serialize(ASTSerializer const& serializer, WitnessTable& value) -{ - SLANG_SCOPED_SERIALIZER_STRUCT(serializer); - serialize(serializer, value.baseType); - serialize(serializer, value.witnessedType); - serialize(serializer, value.isExtern); +// A `CapabilityTargetSet` is really just a wrapper around a `CapabilityStageSets` +// (which is itself just a dictionary of `CapabilityStageSet`s). 
+// +SLANG_DECLARE_FOSSILIZED_AS(CapabilityTargetSet, CapabilityStageSets); - // TODO(tfoley): In theory we should be able to streamline - // this so that we only encode the requirements that we - // absolutely need to (which basically amounts to `associatedtype` - // requirements where the satisfying type is part of the public - // API of the type). - // - serialize(serializer, value.m_requirementDictionary); -} +// A `CapabilitySet` is really just a wrapper around a `CapabilityTargetSets` +// (which is itself just a dictionary of `CapabilityTargetSet`s). +// +SLANG_DECLARE_FOSSILIZED_AS(CapabilitySet, CapabilityTargetSets); void serialize(Serializer const& serializer, CapabilityAtomSet& value) { @@ -326,85 +1210,61 @@ void serialize(Serializer const& serializer, CapabilitySet& value) } } -void serialize(ASTSerializer const& serializer, CandidateExtensionList& value) -{ - serialize(serializer, value.candidateExtensions); -} - -void serialize(ASTSerializer const& serializer, DeclAssociation& value) -{ - SLANG_SCOPED_SERIALIZER_STRUCT(serializer); - serialize(serializer, value.kind); - serialize(serializer, value.decl); -} +// +// The `RequirementWitness` type is a variant, where the `m_flavor` +// field determines what data can follow. 
+// +// For now we will skip declaring those additional members as part +// of the fossilized representation, because we do not have any +// code that wants to navigate them directly on that representation: +// -void serialize(ASTSerializer const& serializer, DeclAssociationList& value) +template<> +struct FossilizedTypeTraits<RequirementWitness> { - serialize(serializer, value.associations); -} + struct FossilizedType + { + Fossilized<decltype(RequirementWitness::m_flavor)> m_flavor; + }; +}; -void serialize(ASTSerializer const& serializer, Modifiers& value) +void serialize(ASTSerializer const& serializer, RequirementWitness& value) { - SLANG_SCOPED_SERIALIZER_ARRAY(serializer); - if (isWriting(serializer)) - { - for (auto modifier : value) - { - serialize(serializer, modifier); - } - } - else + SLANG_SCOPED_SERIALIZER_VARIANT(serializer); + serialize(serializer, value.m_flavor); + switch (value.m_flavor) { - Modifier** link = &value.first; - - while (hasElements(serializer)) - { - Modifier* modifier = nullptr; - serialize(serializer, modifier); - - *link = modifier; - link = &modifier->next; - } - } -} + case RequirementWitness::Flavor::none: + break; -void serialize(ASTSerializer const& serializer, TypeExp& value) -{ - serialize(serializer, value.type); -} + case RequirementWitness::Flavor::declRef: + serialize(serializer, value.m_declRef); + break; -void serialize(ASTSerializer const& serializer, QualType& value) -{ - SLANG_SCOPED_SERIALIZER_STRUCT(serializer); - serialize(serializer, value.type); - serialize(serializer, value.isLeftValue); - serialize(serializer, value.hasReadOnlyOnTarget); - serialize(serializer, value.isWriteOnly); -} + case RequirementWitness::Flavor::val: + serialize(serializer, value.m_val); + break; -void serialize(ASTSerializer const& serializer, Token& value) -{ - serializer->handleToken(value); + case RequirementWitness::Flavor::witnessTable: + serialize(serializer, value.m_obj); + break; + } } -void serialize(ASTSerializer const& 
serializer, SPIRVAsmOperand& value) -{ - SLANG_SCOPED_SERIALIZER_STRUCT(serializer); - serialize(serializer, value.flavor); - serialize(serializer, value.token); - serialize(serializer, value.expr); - serialize(serializer, value.bitwiseOrWith); - serialize(serializer, value.knownValue); - serialize(serializer, value.wrapInId); - serialize(serializer, value.type); -} +// +// The `ValNodeOperand` type, used to store the operands of +// a `Val`-derived AST node, is a variant that gets handled +// similarly to `RequirementWitness` above. +// -void serialize(ASTSerializer const& serializer, SPIRVAsmInst& value) +template<> +struct FossilizedTypeTraits<ValNodeOperand> { - SLANG_SCOPED_SERIALIZER_STRUCT(serializer); - serialize(serializer, value.opcode); - serialize(serializer, value.operands); -} + struct FossilizedType + { + Fossilized<decltype(ValNodeOperand::kind)> kind; + }; +}; void serialize(ASTSerializer const& serializer, ValNodeOperand& value) { @@ -423,25 +1283,108 @@ void serialize(ASTSerializer const& serializer, ValNodeOperand& value) } } -void serializeObject(ASTSerializer const& serializer, Name*& value, Name*) +// +// Now that we've covered the types that required hand-writing their +// serialization logic, we return to the types that will have their +// serialization logic generated using fiddle. +// +// We start with the ordinary types (everything other than the +// `NodeBase`-derived stuff). +// +// The code being generated here is the same sort of thing that was +// in the hand-written case for types like `MatrixCoord` way earlier +// in this file (in fact, `MatrixCoord` could be handled by this +// logic, and is only hand-written to help illustrate what's going on). +// +// WARNING: The way these declarations are currently being generated +// uses inheritance in the definitions of the `Fossilized_*` types, +// which isn't actually something we can be confident will work correctly +// across compilers. 
This isn't a problem right now, because there +// doesn't end up being any code that will actually use these generated +// types in the case where there is inheritance going on that would +// break C++ "standard layout" rules. +// +// TODO: If we reach a point where the use of inheritance ends up +// breaking things, then we'll have to do a fair bit more. It might +// seem like we could just turn the `: public Whatever` base into +// a `Whatever super;` field declaration, but that wouldn't give +// us the correct layout in cases where `Whatever` is an empty +// type. +// +#if 0 // FIDDLE TEMPLATE: +%for _,T in ipairs(astStructTypes) do +% TRACE(T) +/// Fossilized representation of a value of type `$T` +struct Fossilized_$T +% if T.directSuperClass then + : public Fossilized<$(T.directSuperClass)> +% else + : public FossilizedRecordVal +% end { - serializer->handleName(value); -} +% for _,f in ipairs(T.directFields) do + Fossilized<decltype($T::$f)> $f; +% end +}; -void serialize(ASTSerializer const& serializer, NameLoc& value) +/// Serialize a `value` of type `$T` +void serialize(ASTSerializer const& serializer, $T& value) { + SLANG_UNUSED(value); SLANG_SCOPED_SERIALIZER_STRUCT(serializer); - serialize(serializer, value.name); - serialize(serializer, value.loc); +% if T.directSuperClass then + serialize(serializer, static_cast<$(T.directSuperClass)&>(value)); +% end +% for _,f in ipairs(T.directFields) do + serialize(serializer, value.$f); +% end } +%end +#else // FIDDLE OUTPUT: +#define FIDDLE_GENERATED_OUTPUT_ID 3 +#include "slang-serialize-ast.cpp.fiddle" +#endif // FIDDLE END -void serialize(ASTSerializer const& serializer, ContainerDeclDirectMemberDecls& value) +// +// After the ordinary struct types come the AST node classes. +// As with the declarations, the definitions here aren't all +// that different from how the structs are being handled. +// +// Note that the big "WARNING" on the comment before the struct +// cases also applies to the inheritance here. 
It just turns out +// that no code (currently) wants to navigate serialized AST +// nodes in memory (so the `Fossilized_*` declarations are largely +// just there to be convenient when debugging). +// +// One wrinkle we deal with here is that the `astNodeType` +// field is not treated as part of the "content" of an AST +// node for the purposes of the `_serializeASTNodeContents()` +// functions, but it needs to be present in the `Fossilized_NodeBase` +// type declaration in order for the layout of these types to +// be correct. We handle that with a small but ugly conditional +// in the logic to define `Fossilized_*`. +// +#if 0 // FIDDLE TEMPLATE: +%for _,T in ipairs(astNodeClasses) do + +/// Fossilized representation of a value of type `$T` +struct Fossilized_$T +% if T.directSuperClass then + : public Fossilized_$(T.directSuperClass) +% else + : public FossilizedVariantObj +% end { - serialize(serializer, value._refDecls()); -} +% if T == Slang.NodeBase then + Fossilized<ASTNodeType> astNodeType; +% end -#if 0 // FIDDLE TEMPLATE: -%for _,T in ipairs(Slang.NodeBase.subclasses) do +% for _,f in ipairs(T.directFields) do + Fossilized<decltype($T::$f)> $f; +% end +}; + +/// Serialize the contents of an AST node of type `$T` void _serializeASTNodeContents(ASTSerializer const& serializer, $T* value) { SLANG_UNUSED(serializer); @@ -452,315 +1395,85 @@ void _serializeASTNodeContents(ASTSerializer const& serializer, $T* value) % for _,f in ipairs(T.directFields) do serialize(serializer, value->$f); % end - } +} %end #else // FIDDLE OUTPUT: -#define FIDDLE_GENERATED_OUTPUT_ID 0 +#define FIDDLE_GENERATED_OUTPUT_ID 4 #include "slang-serialize-ast.cpp.fiddle" #endif // FIDDLE END -void serializeASTNodeContents(ASTSerializer const& serializer, NodeBase* node) -{ - ASTNodeDispatcher<NodeBase, void>::dispatch( - node, - [&](auto n) { _serializeASTNodeContents(serializer, n); }); -} - -enum class PseudoASTNodeType -{ - None, - ImportedModule, - ImportedDecl, -}; - -static 
PseudoASTNodeType _getPseudoASTNodeType(ASTNodeType type) -{ - return int(type) < 0 ? PseudoASTNodeType(~int(type)) : PseudoASTNodeType::None; -} - -static ASTNodeType _getAsASTNodeType(PseudoASTNodeType type) -{ - return ASTNodeType(~int(type)); -} - -struct ASTEncodingContext : ASTSerializerImpl -{ -public: - ASTEncodingContext( - ISerializerImpl* writer, - ModuleDecl* module, - SerialSourceLocWriter* sourceLocWriter) - : _writer(writer), _module(module), _sourceLocWriter(sourceLocWriter) - { - } - -private: - ISerializerImpl* _writer = nullptr; - ModuleDecl* _module = nullptr; - SerialSourceLocWriter* _sourceLocWriter = nullptr; - - virtual ISerializerImpl* getBaseSerializer() override { return _writer; } - - virtual void handleName(Name*& value) override; - virtual void handleSourceLoc(SourceLoc& value) override; - virtual void handleToken(Token& value) override; - virtual void handleASTNode(NodeBase*& node) override; - virtual void handleASTNodeContents(NodeBase* node) override; - - void _writeImportedModule(ModuleDecl* moduleDecl); - void _writeImportedDecl(Decl* decl, ModuleDecl* importedFromModuleDecl); - - ModuleDecl* _findModuleForDecl(Decl* decl) - { - for (auto d = decl; d; d = d->parentDecl) - { - if (auto m = as<ModuleDecl>(d)) - return m; - } - return nullptr; - } - - ModuleDecl* _findModuleDeclWasImportedFrom(Decl* decl) - { - auto declModule = _findModuleForDecl(decl); - if (declModule == nullptr) - return nullptr; - if (declModule == _module) - return nullptr; - return declModule; - } -}; - -struct ASTDecodingContext : ASTSerializerImpl -{ -public: - ASTDecodingContext( - Linkage* linkage, - ASTBuilder* astBuilder, - DiagnosticSink* sink, - ISerializerImpl* reader, - SerialSourceLocReader* sourceLocReader, - SourceLoc requestingSourceLoc) - : _linkage(linkage) - , _astBuilder(astBuilder) - , _sink(sink) - , _sourceLocReader(sourceLocReader) - , _requestingSourceLoc(requestingSourceLoc) - , _reader(reader) - { - } - -private: - Linkage* _linkage = 
nullptr; - ASTBuilder* _astBuilder = nullptr; - DiagnosticSink* _sink = nullptr; - SerialSourceLocReader* _sourceLocReader = nullptr; - SourceLoc _requestingSourceLoc; - ISerializerImpl* _reader = nullptr; - - virtual ISerializerImpl* getBaseSerializer() override { return _reader; } - - virtual void handleName(Name*& value) override; - virtual void handleSourceLoc(SourceLoc& value) override; - virtual void handleToken(Token& value) override; - virtual void handleASTNode(NodeBase*& outNode) override; - virtual void handleASTNodeContents(NodeBase* node) override; - - ModuleDecl* _readImportedModule(); - NodeBase* _readImportedDecl(); - - void _cleanUpASTNode(NodeBase* node) - { - if (auto expr = as<Expr>(node)) - { - expr->checked = true; - } - else if (auto decl = as<Decl>(node)) - { - decl->checkState = DeclCheckState::CapabilityChecked; - - if (auto genericDecl = as<GenericDecl>(node)) - { - _assignGenericParameterIndices(genericDecl); - } - else if (auto syntaxDecl = as<SyntaxDecl>(node)) - { - syntaxDecl->parseCallback = &parseSimpleSyntax; - syntaxDecl->parseUserData = (void*)syntaxDecl->syntaxClass.getInfo(); - } - else if (auto namespaceLikeDecl = as<NamespaceDeclBase>(node)) - { - auto declScope = _astBuilder->create<Scope>(); - declScope->containerDecl = namespaceLikeDecl; - namespaceLikeDecl->ownedScope = declScope; - } - } - } - - void _assignGenericParameterIndices(GenericDecl* genericDecl) - { - int parameterCounter = 0; - for (auto m : genericDecl->getDirectMemberDecls()) - { - if (auto typeParam = as<GenericTypeParamDeclBase>(m)) - { - typeParam->parameterIndex = parameterCounter++; - } - else if (auto valParam = as<GenericValueParamDecl>(m)) - { - valParam->parameterIndex = parameterCounter++; - } - } - } -}; - -// -// We are matching up the corresponding `handle*()` operations from the -// `AST{Encoding|Decoding}Context` types here, so that it is easier -// to visually verify that they are serializing the same data with the -// same ordering. 
// - +// Each of the `_serializeASTNodeContents()` functions handles one class in the hierarchy, +// but we need to be able to dispatch to the correct one based on the run-time type of +// a particular AST node. // -// AST{Encoding|Decoding}Context::handleName() +// The `serializeASTNodeContents()` function is a wrapper around those underscore-prefixed +// functions, and dispatches to the correct one based on the type of the given node. // -void ASTEncodingContext::handleName(Name*& value) -{ - serialize(ASTSerializer(this), value->text); -} - -void ASTDecodingContext::handleName(Name*& value) +void serializeASTNodeContents(ASTSerializer const& serializer, NodeBase* node) { - String text; - serialize(ASTSerializer(this), text); - value = _astBuilder->getNamePool()->getName(text); + ASTNodeDispatcher<NodeBase, void>::dispatch( + node, + [&](auto n) { _serializeASTNodeContents(serializer, n); }); } // -// AST{Encoding|Decoding}Context::handleSourceLoc() -// - -void ASTEncodingContext::handleSourceLoc(SourceLoc& value) -{ - ASTSerializer serializer(this); - SLANG_SCOPED_SERIALIZER_OPTIONAL(serializer); - if (_sourceLocWriter != nullptr) - { - auto rawValue = _sourceLocWriter->addSourceLoc(value); - serialize(serializer, rawValue); - } -} - -void ASTDecodingContext::handleSourceLoc(SourceLoc& value) -{ - ASTSerializer serializer(this); - SLANG_SCOPED_SERIALIZER_OPTIONAL(serializer); - if (hasElements(serializer)) - { - SerialSourceLocData::SourceLoc rawValue; - serialize(serializer, rawValue); - - if (_sourceLocReader) - { - value = _sourceLocReader->getSourceLoc(rawValue); - } - } -} - +// At this point we can get back to the handling of reading/writing actual AST nodes. 
// -// AST{Encoding|Decoding}Context::handleToken() +// We'll start with the writing logic, because that gives a good idea of +// the overall structure, which the reading logic will need to follow: // -void ASTDecodingContext::handleToken(Token& value) +void ASTSerialWriteContext::handleASTNode(ASTSerializer const& serializer, NodeBase*& node) { - ASTSerializer serializer(this); - - SLANG_SCOPED_SERIALIZER_STRUCT(serializer); - serialize(serializer, value.type); - serialize(serializer, value.loc); - - serialize(serializer, value.flags); - + // The first complication that needs to be handled is that when we + // run into a `Decl*` that is being written, we need to check + // whether it comes from an imported module (as opposed to the + // module we are being asked to serialize). + // + if (auto decl = as<Decl>(node)) { - SLANG_SCOPED_SERIALIZER_OPTIONAL(serializer); - if (hasElements(serializer)) + if (auto moduleDeclWasImportedFrom = _findModuleDeclWasImportedFrom(decl)) { - String content; - serialize(serializer, content); - - // An important note here is that we cannot just - // call `value.setContent(...)` and pass in an - // `UnownedStringSlice` of `content`, because the - // `Token` will not take ownership of its own - // textual content. + // If we find that the declaration is imported, then there + // are two sub-cases that we want to handle a bit differently: // - // Instead, we need to get the text we just loaded - // into something that the `Token` can refer info, - // and the easiest way to accomplish that is to - // represent the text using a `Name`. + // * When the `decl` we are writing is itself a module + // (and thus identical to `moduleDeclWasImportedFrom`). 
// - Name* name = _astBuilder->getNamePool()->getName(content); - value.setName(name); - } - } -} - -void ASTEncodingContext::handleToken(Token& value) -{ - ASTSerializer serializer(this); - - SLANG_SCOPED_SERIALIZER_STRUCT(serializer); - serialize(serializer, value.type); - serialize(serializer, value.loc); - - TokenFlags flags = TokenFlags(value.flags & ~TokenFlag::Name); - serialize(serializer, flags); - - { - SLANG_SCOPED_SERIALIZER_OPTIONAL(serializer); - if (value.hasContent()) - { - String content = value.getContent(); - serialize(serializer, content); - } - } -} - -// -// AST{Encoding|Decoding}Context::handleASTNode() -// - -void ASTEncodingContext::handleASTNode(NodeBase*& node) -{ - if (auto decl = as<Decl>(node)) - { - if (auto importedFromModule = _findModuleDeclWasImportedFrom(decl)) - { - if (decl == importedFromModule) + // * The ordinary case, where `decl` is one of the declarations + // contained in `moduleDeclWasImportedFrom`. + // + if (decl == moduleDeclWasImportedFrom) { - _writeImportedModule(importedFromModule); + _writeImportedModule(serializer, moduleDeclWasImportedFrom); return; } else { - _writeImportedDecl(decl, importedFromModule); + _writeImportedDecl(serializer, decl, moduleDeclWasImportedFrom); return; } } } - ASTSerializer serializer(this); - + // The next complication we need to deal with is that + // for most AST nodes we will want to defer writing + // out their contents until a later step (to avoid + // going into an infinite recursion when there are + // cycles in the object graph), but because of the + // way that AST nodes derived from `Val` are + // deduplicated as part of creation, we can't + // defer reading their operands. + // + // Thus we branch here based on whether we are + // writing a `Val`-derived node, or not. 
+ // if (auto val = as<Val>(node)) { val = val->resolve(); - // On the reading side of things, sublcasses of `Val` - // are deduplicated as part of creation, and will read the - // operands out immediately, so we mirror that approach - // on the writing side to make sure the code is consistent. - // serialize(serializer, val->astNodeType); serialize(serializer, val->m_operands); } @@ -771,26 +1484,79 @@ void ASTEncodingContext::handleASTNode(NodeBase*& node) } } -void ASTDecodingContext::handleASTNode(NodeBase*& outNode) +// +// In order to be able to encode the cases for imported +// modules and declarations, we get a little bit "clever" +// with the representation and store some out-of-range +// values in an `ASTNodeType` to represent these +// additional cases. +// + +enum class PseudoASTNodeType { - ASTSerializer serializer(this); + None, + ImportedModule, + ImportedDecl, +}; + +// All valid `ASTNodeType`s will be non-negative integers, +// so the `PseudoASTNodeType` are encoded into an +// `ASTNodeType` as negative values that are the bitwise +// negation of their value in the `PseudoASTNodeType` enumeration. + +static PseudoASTNodeType _getPseudoASTNodeType(ASTNodeType type) +{ + return Int32(type) < 0 ? PseudoASTNodeType(~Int32(type)) : PseudoASTNodeType::None; +} + +static ASTNodeType _getAsASTNodeType(PseudoASTNodeType type) +{ + return ASTNodeType(~Int32(type)); +} + +// +// With the `PseudoASTNodeType` trickery introduced, +// it is possible to show the reading logic for +// `NodeBase`-derived types: +// - ASTNodeType typeTag = ASTNodeType(0); +void ASTSerialReadContext::handleASTNode(ASTSerializer const& serializer, NodeBase*& outNode) +{ + // We start by reading the `ASTNodeType`, because + // we will dispatch differently based on what + // value we see there. 
+ // + ASTNodeType typeTag = ASTNodeType::NodeBase; serialize(serializer, typeTag); + + // In the case where the `ASTNodeType` is actually + // smuggling in one of our `PseudoASTNodeType` + // values, we can delegate to the correct + // subroutine to handle that case. + // + // These two cases mirror the cases for imported + // modules and declarations in + // `ASTSerialWriteContext::handleASTNode()`. + // switch (_getPseudoASTNodeType(typeTag)) { default: break; case PseudoASTNodeType::ImportedModule: - outNode = _readImportedModule(); + outNode = _readImportedModule(serializer); return; case PseudoASTNodeType::ImportedDecl: - outNode = _readImportedDecl(); + outNode = _readImportedDecl(serializer); return; } + // Next we check whether the `typeTag` + // indicates that we are looking at a + // subclass of `Val`, because we need + // to handle those differently. + // auto syntaxClass = SyntaxClass<NodeBase>(typeTag); if (syntaxClass.isSubClassOf<Val>()) { @@ -811,6 +1577,11 @@ void ASTDecodingContext::handleASTNode(NodeBase*& outNode) } else { + // In the ordinary case, we can allocate an empty + // shell of an AST node to represent the object, + // and defer actually serializing the contents + // of that object until later. 
+ auto node = syntaxClass.createInstance(_astBuilder); outNode = node; @@ -819,41 +1590,36 @@ void ASTDecodingContext::handleASTNode(NodeBase*& outNode) } // -// AST{Encoding|Decoding}Context::handleASTNodeContents() -// - -void ASTEncodingContext::handleASTNodeContents(NodeBase* node) -{ - ASTSerializer serializer(this); - serializeASTNodeContents(serializer, node); -} - -void ASTDecodingContext::handleASTNodeContents(NodeBase* node) -{ - ASTSerializer serializer(this); - serializeASTNodeContents(serializer, node); - - _cleanUpASTNode(node); -} - -// -// AST{Encoding|Decoding}Context::_{write|read}ImportedModule() +// Imported modules are serialized using one of the +// `PseudoASTNodeType` cases as its tag, and then +// store a single field with the name of the module. // -void ASTEncodingContext::_writeImportedModule(ModuleDecl* moduleDecl) +void ASTSerialWriteContext::_writeImportedModule( + ASTSerializer const& serializer, + ModuleDecl* moduleDecl) { ASTNodeType type = _getAsASTNodeType(PseudoASTNodeType::ImportedModule); auto moduleName = moduleDecl->getName(); - ASTSerializer serializer(this); serialize(serializer, type); serialize(serializer, moduleName); } -ModuleDecl* ASTDecodingContext::_readImportedModule() +ModuleDecl* ASTSerialReadContext::_readImportedModule(ASTSerializer const& serializer) { - ASTSerializer serializer(this); - + // In the reading direction, we need to actually + // kick off the logic to import the module + // that this one depends on. + // + // TODO: It might be cleaner if we changed up + // the representation so that imported modules + // get listed at the top level, as part of + // the `ASTModuleInfo`, and thus allowing the + // process of importing them to be handled + // by logic that isn't deep in the guts of the + // serialization code. 
+ // Name* moduleName = nullptr; serialize(serializer, moduleName); auto module = _linkage->findOrImportModule(moduleName, _requestingSourceLoc, _sink); @@ -865,24 +1631,28 @@ ModuleDecl* ASTDecodingContext::_readImportedModule() } // -// AST{Encoding|Decoding}Context::_{write|read}ImportedModule() +// Imported declarations use a `PseudoASTNodeType` +// to define their type tag, and are then serialized +// like a struct that contains a pointer to the +// module that the declaration was imported from, +// and the mangled name of the specific declaration. // -void ASTEncodingContext::_writeImportedDecl(Decl* decl, ModuleDecl* importedFromModuleDecl) +void ASTSerialWriteContext::_writeImportedDecl( + ASTSerializer const& serializer, + Decl* decl, + ModuleDecl* importedFromModuleDecl) { ASTNodeType type = _getAsASTNodeType(PseudoASTNodeType::ImportedDecl); auto mangledName = getMangledName(getCurrentASTBuilder(), decl); - ASTSerializer serializer(this); serialize(serializer, type); serialize(serializer, importedFromModuleDecl); serialize(serializer, mangledName); } -NodeBase* ASTDecodingContext::_readImportedDecl() +NodeBase* ASTSerialReadContext::_readImportedDecl(ASTSerializer const& serializer) { - ASTSerializer serializer(this); - ModuleDecl* importedFromModuleDecl = nullptr; String mangledName; @@ -896,7 +1666,7 @@ NodeBase* ASTDecodingContext::_readImportedDecl() } auto importedDecl = - importedFromModule->findExportFromMangledName(mangledName.getUnownedSlice()); + importedFromModule->findExportedDeclByMangledName(mangledName.getUnownedSlice()); if (!importedDecl) { SLANG_ABORT_COMPILATION( @@ -906,6 +1676,288 @@ NodeBase* ASTDecodingContext::_readImportedDecl() } // +// Handling the contents of an AST node is mostly the +// same logic between the reading and writing directions. 
+// The only difference is that when we are reading in +// an AST node there is some cleanup work we have to +// do after reading is complete, in order to make +// the AST node actually usable. +// + +void ASTSerialWriteContext::handleASTNodeContents(ASTSerializer const& serializer, NodeBase* node) +{ + serializeASTNodeContents(serializer, node); +} + +void ASTSerialReadContext::handleASTNodeContents(ASTSerializer const& serializer, NodeBase* node) +{ + serializeASTNodeContents(serializer, node); + + _cleanUpASTNode(node); +} + +void ASTSerialReadContext::_cleanUpASTNode(NodeBase* node) +{ + if (auto expr = as<Expr>(node)) + { + expr->checked = true; + } + else if (auto decl = as<Decl>(node)) + { + decl->checkState = DeclCheckState::CapabilityChecked; + + if (auto genericDecl = as<GenericDecl>(node)) + { + _assignGenericParameterIndices(genericDecl); + } + else if (auto syntaxDecl = as<SyntaxDecl>(node)) + { + syntaxDecl->parseCallback = &parseSimpleSyntax; + syntaxDecl->parseUserData = (void*)syntaxDecl->syntaxClass.getInfo(); + } + else if (auto namespaceLikeDecl = as<NamespaceDeclBase>(node)) + { + auto declScope = _astBuilder->create<Scope>(); + declScope->containerDecl = namespaceLikeDecl; + namespaceLikeDecl->ownedScope = declScope; + } + +#if SLANG_ENABLE_AST_DESERIALIZATION_STATS + if (auto moduleDecl = as<ModuleDecl>(decl->parentDecl)) + { + auto& deserializedCount = _sharedContext->_deserializedTopLevelDeclCount; + deserializedCount++; + + Count totalCount = moduleDecl->getDirectMemberDeclCount(); + + fprintf( + stderr, + "loaded %d / %d direct members of module '%s' (%f%%)\n", + int(deserializedCount), + int(totalCount), + moduleDecl->getName() ? 
moduleDecl->getName()->text.getBuffer() : "", + float(deserializedCount) * 100.0f / float(totalCount)); + } +#endif + + // TODO(tfoley): If we are disabling on-demand deserialization + // for now (because of other changes that are needed before we + // can enable it), then we will intentionally load all of the + // direct member declarations of a container declaration + // up-front. +#if SLANG_DISABLE_ON_DEMAND_AST_DESERIALIZATION + if (auto containerDecl = as<ContainerDecl>(decl)) + { + auto& directMemberDecls = containerDecl->getDirectMemberDecls(); + SLANG_UNUSED(directMemberDecls); + } +#endif + } +} + +void ASTSerialReadContext::_assignGenericParameterIndices(GenericDecl* genericDecl) +{ + int parameterCounter = 0; + for (auto m : genericDecl->getDirectMemberDecls()) + { + if (auto typeParam = as<GenericTypeParamDeclBase>(m)) + { + typeParam->parameterIndex = parameterCounter++; + } + else if (auto valParam = as<GenericValueParamDecl>(m)) + { + valParam->parameterIndex = parameterCounter++; + } + } +} + + +// +// +// + +template<typename K, typename V> +static void _sortByKey(List<KeyValuePair<K, V>>& array) +{ + array.sort([](KeyValuePair<K, V> const& lhs, KeyValuePair<K, V> const& rhs) + { return lhs.key < rhs.key; }); +} + +static void _collectASTModuleInfo(ModuleDecl* moduleDecl, ASTModuleInfo& moduleInfo) +{ + auto module = moduleDecl->module; + + moduleInfo.moduleDecl = moduleDecl; + collectBuiltinDeclsThatNeedRegistration(moduleDecl, moduleInfo.declsToRegister); + + // We want to store a dictionary of exported declarations + // from the module, mapping from a mangled name to the + // declaration with that name. + // + // In order to accelerate search on the reading side, we will + // conspire to make the entries in the serialized dictionary + // be in sorted order by their keys. 
+ // + List<KeyValuePair<String, Decl*>> exportNameDeclPairs; + + auto exportCount = module->getExportedDeclCount(); + for (Index exportIndex = 0; exportIndex < exportCount; ++exportIndex) + { + auto exportMangledName = String(module->getExportedDeclMangledName(exportIndex)); + auto exportDecl = module->getExportedDecl(exportIndex); + + exportNameDeclPairs.add(KeyValuePair(exportMangledName, exportDecl)); + } + _sortByKey(exportNameDeclPairs); + + for (auto& entry : exportNameDeclPairs) + { + moduleInfo.mapMangledNameToDecl.add(entry.key, entry.value); + } +} + +// +// The `ContainerDeclDirectMemberDecls` type is serialized via the +// intermediate type `ContainerDeclDirectMemberDeclsInfo`. We start +// by defining the logic to collect the required information: +// + +static ContainerDeclDirectMemberDeclsInfo _collectContainerDeclDirectMemberDeclsInfo( + ContainerDeclDirectMemberDecls const& decls) +{ + ContainerDeclDirectMemberDeclsInfo info; + info.decls = decls.getDecls(); + + // In order to ensure that the accelerators that we serialize + // match with those created by the compiler front-end, we + // will pull the data from `decls` via its public API rather + // than try to reconstruct any of that information. + // + // Because the public API of `ContainerDeclDirectMemberDecls` + // traffics in `Decl*`s but we want to serialize indices, + // we will create a dictionary to reverse the mapping so that + // we can serialize out indices. + // + Dictionary<Decl*, FossilUInt> mapDeclToIndex; + Count declCount = info.decls.getCount(); + for (Index i = 0; i < declCount; ++i) + { + auto decl = info.decls[i]; + if (!decl) + continue; + + mapDeclToIndex[decl] = FossilUInt(i); + } + + // With our decl-to-index mapping created, filling + // out the to-be-serialized list of transparent + // declarations is a simple matter. 
+ // + for (auto decl : decls.getTransparentDecls()) + { + if (!decl) + continue; + + auto found = mapDeclToIndex.tryGetValue(decl); + SLANG_ASSERT(found); + + info.transparentDeclIndices.add(*found); + } + + // Handling the name-to-declaration mapping is a bit + // more complicated, simply because we want to store + // the entries of the resulting dictionary in sorted + // order to enable them to be looked up via a binary + // search. Thus we start by creating a list of the + // key-value pairs, which we will then sort. + // + List<KeyValuePair<String, FossilUInt>> nameIndexPairs; + for (auto& entry : decls.getMapFromNameToLastDeclOfThatName()) + { + auto name = entry.first; + if (!name) + continue; + + auto decl = entry.second; + if (!decl) + continue; + + auto found = mapDeclToIndex.tryGetValue(decl); + SLANG_ASSERT(found); + + nameIndexPairs.add(KeyValuePair(name->text, *found)); + } + _sortByKey(nameIndexPairs); + + // The `info.mapNameToDeclIndex` is stored as an `OrderedDictionary`, + // so it will preserve the order in which we insert its entries here. + // + for (auto& entry : nameIndexPairs) + { + info.mapNameToDeclIndex.add(entry.key, entry.value); + } + + return info; +} + +void ASTSerialWriteContext::handleContainerDeclDirectMemberDecls( + ASTSerializer const& serializer, + ContainerDeclDirectMemberDecls& value) +{ + // Writing the members of a container declaration is + // just a matter of collecting the information into + // the intermediate type, and then writing *that*. 
+ + ContainerDeclDirectMemberDeclsInfo info = _collectContainerDeclDirectMemberDeclsInfo(value); + + serialize(serializer, info); +} + +void ASTSerialReadContext::handleContainerDeclDirectMemberDecls( + ASTSerializer const& serializer, + ContainerDeclDirectMemberDecls& value) +{ + // In the reading direction, we will intentionally + // *not* deserialize things the usual way, because + // we want to support deserializing only a subset + // of the direct member declarations of a given + // container, on-demand. + + // We start by reading a pointer to a single fossilized + // value from the underlying `Fossil::Reader` that we + // are using, and cast it to the type that we expect to + // find there. + // + // The underlying reader was passed in as part of the + // `serializer` parameter, but it is only typed as an + // `ISerializerImpl`, whereas we *know* it has a more + // specific type, which we want to make use of. + // + ISerializerImpl* readerImpl = serializer.getImpl(); + auto fossilReader = static_cast<Fossil::SerialReader*>(readerImpl); + // + auto fossilizedInfo = + (Fossilized<ContainerDeclDirectMemberDeclsInfo>*)fossilReader->readValPtr().get(); + + // We can read specific fields out of the `fossilizedInfo` + // without triggering full deserialization. At this point + // we will do exactly that to read the number of direct + // member declarations. + // + auto declCount = fossilizedInfo->decls.getElementCount(); + + // We will set up the `ContainerDeclDirectMemberDecls` to + // be in on-demand deserialization mode, in which it will + // retain a pointer to this context (which is being used for + // the entire AST module), along with a pointer to the + // fossilized information for this specific container's + // member declarations. 
+ // + value._initForOnDemandDeserialization(this, fossilizedInfo, declCount); +} + + +// // {write|read}SerializedModuleAST() // @@ -917,43 +1969,457 @@ void writeSerializedModuleAST( // TODO: we might want to have a more careful pass here, // where we only encode the public declarations. + // Rather than serialize the `ModuleDecl` directly, we instead + // collect the information we want to serialize into an intermediate + // `ASTModuleInfo` value, and then serialize *that*. + // + // This choice allows us to build up some data structures that will + // be very useful when reading the serialized data later, and that + // would not naturally "fall out" of serializing the module more + // directly. + + ASTModuleInfo moduleInfo; + _collectASTModuleInfo(moduleDecl, moduleInfo); + + // At the most basic, we are building a single "blob" of data + // (in the sense of the `ISlangBlob` interface). + // BlobBuilder blobBuilder; { + // The architecture of the serialization system means that + // we need a few steps to set up everything before we can + // actually call `serialize()`: + // + // * We need an implementation of `ISerializerImpl` to do + // the actual writing, which in this case will be a + // `Fossil::SerialWriter`. + // + // * We need the additional context information that many + // of the AST types require in their `serialize()` overloads, + // which will be an `ASTSerialWriteContext`. + // + // * We need to wrap those two values up in an `ASTSerializer` + // (which is more or less just a pair of pointers, to the two + // values described above). + // Fossil::SerialWriter writer(blobBuilder); + ASTSerialWriteContext context(moduleDecl, sourceLocWriter); + ASTSerializer serializer(&writer, &context); + + // Once we have our `serializer`, we can finally invoke + // `serialize()` on the `ASTModuleInfo` to cause everything + // to be recursively written. 
+ // + serialize(serializer, moduleInfo); - ASTEncodingContext context(&writer, moduleDecl, sourceLocWriter); - serialize(ASTSerializer(&context), moduleDecl); + // Note that we wrapped these steps in a scope, because + // it is the destructor for `Fossil::SerialWriter` that + // will actually "flush" any pending serialization operations + // and cause the full blob to be written. } + // We can now grab the serialized data as a single `ISlangBlob`. + // ComPtr<ISlangBlob> blob; blobBuilder.writeToBlob(blob.writeRef()); + // While the AST serialization system is using fossil, the + // overall module serialization is still based on the RIFF + // container format, so we immediately turn around and + // add the blob we just created as a single data chunk in + // the RIFF hierarchy. + // + // TODO: This step copies the entire blob. If that copy + // operation ever becomes a performance concern, we should + // be able to tweak things so that the `BlobBuilder` uses + // the same memory arena that the RIFF builder is using, + // and then employ the `RIFF::BuildCursor::addUnownedData()` + // method to add the data without copying. + // void const* data = blob->getBufferPointer(); size_t size = blob->getBufferSize(); - cursor.addDataChunk(PropertyKeys<Module>::ASTModule, data, size); } +// +// The reading direction is significantly more subtle than the +// writing direction, because we will be traversing some of +// the fossilized data structures without first deserializing +// them into ordinary C++ objects. +// +// In order for this code to work, we need to know that the +// fossilized layout for the types we will access directly +// (such as `ASTModuleInfo`) will exactly match what we expect. +// +// As a small safety measure, we include some static assertions +// about the key properties we expect of the fossilized `ASTModuleInfo`. 
+// + +static_assert(sizeof(Fossilized<ASTModuleInfo>) == 12); +static_assert(offsetof(Fossilized<ASTModuleInfo>, moduleDecl) == 0); +static_assert(offsetof(Fossilized<ASTModuleInfo>, declsToRegister) == 4); +static_assert(offsetof(Fossilized<ASTModuleInfo>, mapMangledNameToDecl) == 8); + ModuleDecl* readSerializedModuleAST( Linkage* linkage, ASTBuilder* astBuilder, DiagnosticSink* sink, + ISlangBlob* blobHoldingSerializedData, RIFF::Chunk const* chunk, SerialSourceLocReader* sourceLocReader, SourceLoc requestingSourceLoc) { + // We expect the `chunk` that was passed in to be a RIFF + // data chunk (matching what was written in `writeSerializedModuleAST()`), + // and to be proper fossil-format data. + // auto dataChunk = as<RIFF::DataChunk>(chunk); + if (!dataChunk) + { + SLANG_UNEXPECTED("invalid format for serialized module AST"); + } + + Fossil::AnyValPtr rootValPtr = + Fossil::getRootValue(dataChunk->getPayload(), dataChunk->getPayloadSize()); + if (!rootValPtr) + { + SLANG_UNEXPECTED("invalid format for serialized module AST"); + } - auto rootVal = Fossil::getRootValue(dataChunk->getPayload(), dataChunk->getPayloadSize()); + // We don't want to simply mirror the `writeSerializedModuleAST()` logic + // here and deserialize an entire `ASTModuleInfo`. Instead, we will + // traverse the `Fossilized<ASTModuleInfo>` directly, and extract only + // the information we need. + // + // The `rootValPtr` above uses the `Fossil::AnyValPtr` type, which + // is basically a dynamically-typed pointer to fossilized data of + // any type, and carries around its own layout information. We could + // in principle traverse the structure using that type by making dynamic + // queries (and doing so would let us detect various error cases where + // the serialized format might not match what we expect), but instead + // we are going to simply perform an unchecked cast on that dynamically-typed + // pointer to get out a statically-typed pointer to what we expect to + // find there. 
+ // + Fossilized<ASTModuleInfo>* fossilizedModuleInfo = cast<Fossilized<ASTModuleInfo>>(rootValPtr); + + // We now have enough information to construct an `ASTSerialReadContext`, + // which is the mirror to the `ASTSerialWriteContext`, but which has the + // important difference that the `ASTSerialReadContext` is allowed to + // persist past when this function returns. Thus we cannot allocate the + // read context on the stack like we did for the write context, and + // we instead allocate it as a reference-counted object. + // + auto sharedDecodingContext = RefPtr(new ASTSerialReadContext( + linkage, + astBuilder, + sink, + sourceLocReader, + requestingSourceLoc, + fossilizedModuleInfo, + blobHoldingSerializedData)); + + // The `sharedDecodingContext` will allow us to deserialize individual + // `Decl`s from the AST one-by-one. One declaration that we *know* + // we need right away is the actual `ModuleDecl` (since we need to + // return it from this function). + // + ModuleDecl* moduleDecl = + as<ModuleDecl>(sharedDecodingContext->readFossilizedDecl(fossilizedModuleInfo->moduleDecl)); + SLANG_ASSERT(moduleDecl); + +#if SLANG_ENABLE_AST_DESERIALIZATION_STATS + fprintf( + stderr, + "finished loading the `ModuleDecl` for '%s'\n", + moduleDecl->getName()->text.getBuffer()); +#endif + + // In the case where we are reading one of the builtin modules (e.g. + // the core module), there may be declarations inside that module + // that need to be registered with the `SharedASTBuilder`, because + // parts of the C++ compiler code need to be able to form references + // to those declarations. + // + // We will handle those here by traversing the fossilized equivalent + // of the `ASTModuleInfo::declsToRegister` array, then deserializing + // and registering each entry we find. 
+ // + for (Fossilized<Decl>* fossilizedDecl : fossilizedModuleInfo->declsToRegister) + { + Decl* decl = sharedDecodingContext->readFossilizedDecl(fossilizedDecl); + registerBuiltinDecl(astBuilder, decl); + } - Fossil::SerialReader reader(rootVal); +#if SLANG_ENABLE_AST_DESERIALIZATION_STATS + fprintf( + stderr, + "finished registering builtins for '%s'\n", + moduleDecl->getName()->text.getBuffer()); +#endif - ASTDecodingContext - context(linkage, astBuilder, sink, &reader, sourceLocReader, requestingSourceLoc); + // + // At this point any further data in the serialized AST can be read + // on-demand as needed, via the accessor methods on `ContainerDeclDirectMemberDecls` + // and `ModuleDecl` that are implemented below. + // - ModuleDecl* moduleDecl = nullptr; - serialize(ASTSerializer(&context), moduleDecl); return moduleDecl; } +// +// A key facility that makes on-demand deserialization possible is +// the ability to read individual serialized declarations out of +// a module, just based on a pointer to their fossilized representation. +// + +Decl* ASTSerialReadContext::readFossilizedDecl(Fossilized<Decl>* fossilizedDecl) +{ + // AST nodes are all fossilized as variants, which means that they + // carry their own layout information. We can exploit this fact + // to get from the raw pointer that was passed in to a `Fossil::AnyValPtr` + // that includes the layout information that a `Fossil::SerialReader` + // needs. + // + Fossil::AnyValPtr contentValPtr = getVariantContentPtr(fossilizedDecl); + + // One subtle issue is that when we call `serialize()` below to read + // a `Decl*`, the `SerialReader` wants to *read* a pointer to the + // serialized object from its current cursor position. But what we have + // is a pointer to the object... not a pointer to a *pointer* to the object. 
+ // + // We thus tweak the `InitialStateType` used for the `SerialReader` to + // tell it that it should treat our `contentValPtr` as if there was + // an additional level of pointer indirection above it. + // + Fossil::SerialReader reader( + _readContext, + contentValPtr, + Fossil::SerialReader::InitialStateType::PseudoPtr); + ASTSerializer serializer(&reader, this); + + Decl* decl = nullptr; + serialize(serializer, decl); + return decl; +} + +// +// We now turn our attention to the various accessors on AST types +// that need to read from the serialized data on-demand. +// +// One key design choice in the current encoding is that the +// fossilized dictionaries that we will use for lookup operations +// are written as ordinary `FossilizedDictionary<K,V>` values (which +// are ultimately just flat arrays of `K`,`V` pairs), but have their +// keys sorted before being written out. +// +// We can thus look up entries in these fossilized dictionaries +// using a binary search. +// + +template<typename T> +T const* _findEntryInFossilizedDictionaryWithSortedKeys( + FossilizedDictionary<FossilizedString, T> const& dictionary, + UnownedStringSlice const& key) +{ + Index lo = 0; + Index hi = dictionary.getElementCount() - 1; + + auto elements = dictionary.getBuffer(); + + while (lo <= hi) + { + Index mid = lo + ((hi - lo) >> 1); + + auto element = elements + mid; + int cmp = compare(element->key, key); + if (cmp == 0) + return &element->value; + + if (cmp < 0) + lo = mid + 1; + else + hi = mid - 1; + } + + return nullptr; +} + +Decl* ModuleDecl::_findSerializedDeclByMangledExportName(UnownedStringSlice const& mangledName) +{ + // Each of the accessors defined in this file should only + // ever be invoked in the case where the corresponding + // AST node is using on-demand deserialization. 
+ // + SLANG_ASSERT(isUsingOnDemandDeserializationForExports()); + + // The `context` pointer stored in the `ContainerDeclDirectMemberDecls` type is + // a raw `RefPtr<RefObject>`, so that the definition of the `ASTSerialReadContext` + // doesn't need to be exposed outside this file. + // + // In order to access the context pointer, we thus need to cast it. + // + auto sharedContext = + as<ASTSerialReadContext>(_directMemberDecls.onDemandDeserialization.context); + + // The `sharedContext` has the information needed to do lookup + // based on mangled names, so we delegate the actual work to it. + // + return sharedContext->findExportedDeclByMangledName(mangledName); +} + +Decl* ASTSerialReadContext::findExportedDeclByMangledName(UnownedStringSlice const& mangledName) +{ + // The read context has retained a pointer to the `Fossilized<ASTModuleInfo>`, + // which allows us to perform a lookup in the serialized `mapMangledNameToDecl` + // without ever deserializing it. + // + auto found = _findEntryInFossilizedDictionaryWithSortedKeys( + _fossilizedModuleInfo->mapMangledNameToDecl, + mangledName); + if (!found) + return nullptr; + + // If the given `mangledName` does indeed map to a pointer to + // a fossilized declaration, then we will read the declaration + // on-demand before returning it. + // + // Note that if we've seen the same declaration before (whether + // via a previous call to `findExportedDeclByMangledName()` or + // through some other path leading to `readFossilizedDecl()`), + // this will return the same `Decl*` that was previously + // deserialized. 
+ // + auto decl = readFossilizedDecl(*found); + return decl; +} + +Decl* ContainerDeclDirectMemberDecls::_readSerializedDeclsOfName(Name* name) const +{ + // All of these accessors on `ContainerDeclDirectMemberDecls` start with + // a similar pattern of asserting that they are only used when on-demand + // deserialization is active, and then casting the `void*` that is stored + // in the AST representation over to the correct fossilized type (a type + // that is only used/visible within this file). + // + SLANG_ASSERT(isUsingOnDemandDeserialization()); + auto& fossilizedInfo = + *(Fossilized<ContainerDeclDirectMemberDeclsInfo>*)onDemandDeserialization.data; + + // TODO: It isn't clear why the compiler will sometimes perform by-name + // lookup using a null name, but it happens and thus the code here + // needs to be defensive against that scenario. + // + if (name == nullptr) + return nullptr; + + // Once we are sure that `name` is valid, the overall logic here + // is quite similar to `findExportedDeclByMangledName()` above: + // we do a lookup in the serialized dictionary by binary search. + // + auto found = _findEntryInFossilizedDictionaryWithSortedKeys( + fossilizedInfo.mapNameToDeclIndex, + name->text.getUnownedSlice()); + if (!found) + return nullptr; + + // Unlike the case for `findExportedDeclByMangledName()`, the + // dictionary stored on a container declaration only holds indices + // rather than pointers. The reason for this is that we want to + // bottleneck deserialization of direct member declarations through + // the by-index accessor, when possible. + // + // One thing to note here is that this function is being called + // to get the list of *all* declarations with a given name, but + // it seems to only fetch one. 
In practice this works fine because + // the `_prevInContainerWithSameName` field in `Decl` is part of + // the state that gets serialized for a `Decl`, so that loading + // the head of the linked list will cause the rest to get + // deserialized eagerly. + // + // TODO: We could avoid serializing the `_prevInContainerWithSameName` + // field on every `Decl` (since it is almost always null), and instead + // use a more complicated lookup structure here. E.g., the dictionary + // entries could either refer to a single declaration by index (the + // common case, we hope) or to a sequence of two or more indices stored + // in some side-band structure (in the case where multiple declarations + // have the same name). For now we are sticking with the simpler + // representation; further complexity would need to be motivated by + // profiling information showing there's a problem to be solved. + // + Index declIndex = *found; + return getDecl(declIndex); +} + +void ContainerDeclDirectMemberDecls::_readSerializedTransparentDecls() const +{ + SLANG_ASSERT(isUsingOnDemandDeserialization()); + auto& fossilizedInfo = + *(Fossilized<ContainerDeclDirectMemberDeclsInfo>*)onDemandDeserialization.data; + + // This particular function works by filling in the `filteredListOfTransparentDecls` + // part of the lookup accelerators. If it has been called once and put anything + // into that array, then `filteredListOfTransparentDecls` should be used instead + // of calling this method again. We enforce this invariant in an attempt to + // avoid overhead that might be associated with this method. + // + SLANG_ASSERT(accelerators.filteredListOfTransparentDecls.getCount() == 0); + + // If this is the first time this method is being called (or, in the very common + // corner case, when there are no transparent decls at all...) we loop over the + // fossilized array holding the indices of the transparent members. 
+ // + for (auto index : fossilizedInfo.transparentDeclIndices) + { + // For each index that is found, we do a by-index query for + // the member and then add it to the list. This is another + // case of us trying to bottleneck access to members through + // the by-index accessor as much as possible. + // + auto decl = getDecl(index); + accelerators.filteredListOfTransparentDecls.add(decl); + } +} + +Decl* ContainerDeclDirectMemberDecls::_readSerializedDeclAtIndex(Index index) const +{ + SLANG_ASSERT(isUsingOnDemandDeserialization()); + auto& fossilizedInfo = + *(Fossilized<ContainerDeclDirectMemberDeclsInfo>*)onDemandDeserialization.data; + + // + // It isn't visible here, but `ContainerDeclDirectMemberDecls::getDecl(index)` + // will automatically cache the decl that we return, so that subsequent queries + // for the same index shouldn't call this method at all (unless we end up + // returning null, for some unexpected reason). + // + + // The logic here is fairly simple: we directly read from the array of (fossilized) + // declaration pointers in the (fossilized) `ContainerDeclDirectMemberDeclsInfo`, + // to get a pointer to the (fossilized) declaration we want. + // + // Note that it is important that the variable here is *not* being declared + // with `auto`. If we were to simply write `auto fossilizedDecl` then the type + // that gets inferred would be `Fossilized<Decl*>` which is a `FossilizedPtr<...>` + // - a 32-bit relative pointer. On systems with a 64-bit address space, it is + // not guaranteed that a 32-bit offset is enough to refer to part of the serialized + // AST blob (in the heap) from this local variable (on the stack). + // + // The two options are to use `auto&` so that we capture a *reference* to the + // fossilized pointer (rather than try to copy it), or to declare the variable + // as an ordinary "live" pointer. 
+ // + Fossilized<Decl>* fossilizedDecl = fossilizedInfo.decls[index]; + + // Once we have a pointer to the (fossilized) declaration that we want, + // we can use the `ASTSerialReadContext` to read it on-demand. Because + // the `context` pointer declared on the actual AST type is untyped + // (to avoid needing to expose `ASTSerialReadContext` outside this file), + // we need to cast the pointer before we can perform the read. + // + auto sharedContext = as<ASTSerialReadContext>(onDemandDeserialization.context); + auto decl = sharedContext->readFossilizedDecl(fossilizedDecl); + return decl; +} + } // namespace Slang diff --git a/source/slang/slang-serialize-ast.h b/source/slang/slang-serialize-ast.h index 86ba6e772..ec4023a62 100644 --- a/source/slang/slang-serialize-ast.h +++ b/source/slang/slang-serialize-ast.h @@ -22,6 +22,7 @@ ModuleDecl* readSerializedModuleAST( Linkage* linkage, ASTBuilder* astBuilder, DiagnosticSink* sink, + ISlangBlob* blobHoldingSerializedData, RIFF::Chunk const* chunk, SerialSourceLocReader* sourceLocReader, SourceLoc requestingSourceLoc); diff --git a/source/slang/slang-serialize-fossil.cpp b/source/slang/slang-serialize-fossil.cpp index da1516399..003f4227a 100644 --- a/source/slang/slang-serialize-fossil.cpp +++ b/source/slang/slang-serialize-fossil.cpp @@ -56,7 +56,7 @@ void SerialWriter::_initialize(ChunkBuilder* chunk) // that the last field of the header is a relative pointer // to the root-value chunk. // - headerChunk->writeRelativePtr<Fossil::RelativePtrOffset>(rootValueChunk); + headerChunk->writeRelativePtr<FossilInt>(rootValueChunk); // The root value should always be a variant, and we want to // set up to write into it in a reasonable way. 
@@ -140,7 +140,7 @@ void SerialWriter::handleFloat64(double& value) void SerialWriter::handleString(String& value) { auto size = value.getLength(); - if (_shouldEmitWithPointerIndirection(FossilizedValKind::String)) + if (_shouldEmitPotentiallyIndirectValueWithPointerIndirection()) { if (size == 0) { @@ -154,14 +154,14 @@ void SerialWriter::handleString(String& value) auto ptrLayout = (ContainerLayoutObj*)_reserveDestinationForWrite(FossilizedValKind::Ptr); - _mergeLayout(ptrLayout->baseLayout, FossilizedValKind::String); + _mergeLayout(ptrLayout->baseLayout, FossilizedValKind::StringObj); _commitWrite(ValInfo::relativePtrTo(existingChunk)); return; } } - _pushPotentiallyIndirectValueScope(FossilizedValKind::String); + _pushPotentiallyIndirectValueScope(FossilizedValKind::StringObj); auto data = value.getBuffer(); _writeValueRaw(ValInfo::rawData(data, size + 1, 1)); @@ -176,7 +176,7 @@ void SerialWriter::handleString(String& value) void SerialWriter::beginArray() { - _pushContainerScope(FossilizedValKind::Array); + _pushContainerScope(FossilizedValKind::ArrayObj); } void SerialWriter::endArray() @@ -186,7 +186,7 @@ void SerialWriter::endArray() void SerialWriter::beginDictionary() { - _pushContainerScope(FossilizedValKind::Dictionary); + _pushContainerScope(FossilizedValKind::DictionaryObj); } void SerialWriter::endDictionary() @@ -258,7 +258,7 @@ void SerialWriter::endTuple() void SerialWriter::beginOptional() { - _pushIndirectValueScope(FossilizedValKind::Optional); + _pushIndirectValueScope(FossilizedValKind::OptionalObj); } void SerialWriter::endOptional() @@ -266,7 +266,7 @@ void SerialWriter::endOptional() _popIndirectValueScope(); } -void SerialWriter::handleSharedPtr(void*& value, Callback callback, void* userData) +void SerialWriter::handleSharedPtr(void*& value, Callback callback, void* context) { // Because we are writing, we only care about the // pointer that is already present in `value`. 
@@ -305,7 +305,7 @@ void SerialWriter::handleSharedPtr(void*& value, Callback callback, void* userDa fossilizedObject->ptrLayout = ptrLayout; fossilizedObject->liveObjectPtr = liveObjectPtr; fossilizedObject->callback = callback; - fossilizedObject->userData = userData; + fossilizedObject->context = context; _fossilizedObjects.add(fossilizedObject); _mapLiveObjectPtrToFossilizedObject.add(liveObjectPtr, fossilizedObject); @@ -313,15 +313,15 @@ void SerialWriter::handleSharedPtr(void*& value, Callback callback, void* userDa _commitWrite(ValInfo::relativePtrTo(chunk)); } -void SerialWriter::handleUniquePtr(void*& value, Callback callback, void* userData) +void SerialWriter::handleUniquePtr(void*& value, Callback callback, void* context) { // We treat all pointers as shared pointers, because there isn't really // an optimized representation we would want to use for the unique case. // - handleSharedPtr(value, callback, userData); + handleSharedPtr(value, callback, context); } -void SerialWriter::handleDeferredObjectContents(void* valuePtr, Callback callback, void* userData) +void SerialWriter::handleDeferredObjectContents(void* valuePtr, Callback callback, void* context) { // Because we are already deferring writing of the *entirety* of // an object's members as part of how `handleSharedPtr()` works, @@ -330,7 +330,7 @@ void SerialWriter::handleDeferredObjectContents(void* valuePtr, Callback callbac // (In practice the `handleDeferredObjectContents()` operation is // more for the benefit of reading than writing). 
// - callback(valuePtr, userData); + callback(valuePtr, this, context); } SerialWriter::LayoutObj* SerialWriter::_createSimpleLayout(FossilizedValKind kind) @@ -356,7 +356,7 @@ SerialWriter::LayoutObj* SerialWriter::_createSimpleLayout(FossilizedValKind kin case FossilizedValKind::Float64: return new (_arena) SimpleLayoutObj(kind, 8); - case FossilizedValKind::String: + case FossilizedValKind::StringObj: return new (_arena) SimpleLayoutObj(kind); default: @@ -369,23 +369,19 @@ SerialWriter::LayoutObj* SerialWriter::_createLayout(FossilizedValKind kind) { switch (kind) { - case FossilizedValKind::Array: - case FossilizedValKind::Optional: - case FossilizedValKind::Dictionary: + case FossilizedValKind::ArrayObj: + case FossilizedValKind::OptionalObj: + case FossilizedValKind::DictionaryObj: return new (_arena) ContainerLayoutObj(kind, nullptr); case FossilizedValKind::Ptr: - return new (_arena) ContainerLayoutObj( - kind, - nullptr, - sizeof(Fossil::RelativePtrOffset), - sizeof(Fossil::RelativePtrOffset)); + return new (_arena) ContainerLayoutObj(kind, nullptr, sizeof(FossilInt), sizeof(FossilInt)); case FossilizedValKind::Struct: case FossilizedValKind::Tuple: return new (_arena) RecordLayoutObj(kind); - case FossilizedValKind::Variant: + case FossilizedValKind::VariantObj: // A variant is being treated like a container in this context, // because it wants to be able to track the layout of what it // ended up holding... @@ -403,7 +399,7 @@ SerialWriter::LayoutObj* SerialWriter::_createLayout(FossilizedValKind kind) case FossilizedValKind::UInt64: case FossilizedValKind::Float32: case FossilizedValKind::Float64: - case FossilizedValKind::String: + case FossilizedValKind::StringObj: { if (auto found = _simpleLayouts.tryGetValue(kind)) return *found; @@ -435,7 +431,7 @@ SerialWriter::LayoutObj* SerialWriter::_mergeLayout(LayoutObj*& dst, FossilizedV // then we want to have a unique layout object for each // instance. 
// - if (kind == FossilizedValKind::Variant) + if (kind == FossilizedValKind::VariantObj) { auto src = _createLayout(kind); return src; @@ -462,9 +458,9 @@ void SerialWriter::_mergeLayout(LayoutObj*& dst, LayoutObj* src) switch (src->getKind()) { - case FossilizedValKind::Array: - case FossilizedValKind::Optional: - case FossilizedValKind::Dictionary: + case FossilizedValKind::ArrayObj: + case FossilizedValKind::OptionalObj: + case FossilizedValKind::DictionaryObj: case FossilizedValKind::Ptr: { auto dstContainer = (ContainerLayoutObj*)dst; @@ -473,10 +469,10 @@ void SerialWriter::_mergeLayout(LayoutObj*& dst, LayoutObj* src) } break; - case FossilizedValKind::String: + case FossilizedValKind::StringObj: break; - case FossilizedValKind::Variant: + case FossilizedValKind::VariantObj: // Recursive merging should not be applied to variants; // each variant is unique until later deduplication. break; @@ -557,7 +553,7 @@ Size SerialWriter::ValInfo::getAlignment() const switch (kind) { case Kind::RelativePtr: - return sizeof(Fossil::RelativePtrOffset); + return sizeof(FossilInt); case Kind::ContentsOfChunk: return chunk->getAlignment(); @@ -598,13 +594,13 @@ void SerialWriter::_popInlineValueScope() void SerialWriter::_pushVariantScope() { - _pushPotentiallyIndirectValueScope(FossilizedValKind::Variant); + _pushPotentiallyIndirectValueScope(FossilizedValKind::VariantObj); } void SerialWriter::_popVariantScope() { SLANG_ASSERT(_state.layout); - SLANG_ASSERT(_state.layout->kind == FossilizedValKind::Variant); + SLANG_ASSERT(_state.layout->kind == FossilizedValKind::VariantObj); auto variantLayout = (ContainerLayoutObj*)_state.layout; auto valueLayout = variantLayout->baseLayout; SLANG_ASSERT(valueLayout); @@ -631,7 +627,7 @@ void SerialWriter::_popVariantScope() void SerialWriter::_pushPotentiallyIndirectValueScope(FossilizedValKind kind) { - if (_shouldEmitWithPointerIndirection(kind)) + if (_shouldEmitPotentiallyIndirectValueWithPointerIndirection()) { 
_pushIndirectValueScope(kind); } @@ -647,12 +643,10 @@ ChunkBuilder* SerialWriter::_popPotentiallyIndirectValueScope() // conditional to select between the functions for the // indirect and inline cases. - auto valueLayout = _state.layout; auto valueChunk = _state.chunk; _popState(); - auto valueKind = valueLayout->getKind(); - if (_shouldEmitWithPointerIndirection(valueKind)) + if (_shouldEmitPotentiallyIndirectValueWithPointerIndirection()) { return _writeKnownIndirectValueSharedLogic(valueChunk); } @@ -729,11 +723,14 @@ void SerialWriter::_writeValueRaw(ValInfo const& val) case ValInfo::Kind::RelativePtr: _ensureChunkExists(); - _state.chunk->writeRelativePtr<Fossil::RelativePtrOffset>(val.chunk); + _state.chunk->writeRelativePtr<FossilInt>(val.chunk); break; case ValInfo::Kind::ContentsOfChunk: { + if (!val.chunk) + return; + if (!_state.chunk) { _state.chunk = val.chunk; @@ -751,29 +748,14 @@ void SerialWriter::_writeValueRaw(ValInfo const& val) } } -bool SerialWriter::_shouldEmitWithPointerIndirection(FossilizedValKind kind) +bool SerialWriter::_shouldEmitPotentiallyIndirectValueWithPointerIndirection() { - switch (kind) - { - default: - return false; - - case FossilizedValKind::Optional: - return true; - - case FossilizedValKind::Array: - case FossilizedValKind::Dictionary: - case FossilizedValKind::String: - case FossilizedValKind::Variant: - break; - } - switch (_state.layout->getKind()) { default: return true; - case FossilizedValKind::Optional: + case FossilizedValKind::OptionalObj: case FossilizedValKind::Ptr: return false; } @@ -794,10 +776,10 @@ SerialWriter::LayoutObj*& SerialWriter::_reserveDestinationForWrite() break; case FossilizedValKind::Ptr: - case FossilizedValKind::Optional: - case FossilizedValKind::Array: - case FossilizedValKind::Dictionary: - case FossilizedValKind::Variant: + case FossilizedValKind::OptionalObj: + case FossilizedValKind::ArrayObj: + case FossilizedValKind::DictionaryObj: + case FossilizedValKind::VariantObj: { auto 
containerLayout = (ContainerLayoutObj*)_state.layout; auto& elementLayout = containerLayout->baseLayout; @@ -849,17 +831,17 @@ void SerialWriter::_commitWrite(ValInfo const& val) } break; - case FossilizedValKind::Optional: + case FossilizedValKind::OptionalObj: case FossilizedValKind::Ptr: - case FossilizedValKind::Array: - case FossilizedValKind::Dictionary: - case FossilizedValKind::Variant: + case FossilizedValKind::ArrayObj: + case FossilizedValKind::DictionaryObj: + case FossilizedValKind::VariantObj: { auto elementIndex = _state.elementCount++; switch (outerKind) { - case FossilizedValKind::Optional: + case FossilizedValKind::OptionalObj: case FossilizedValKind::Ptr: if (elementIndex > 0) { @@ -911,7 +893,10 @@ void SerialWriter::_flush() _state = State(fossilizedObject->ptrLayout, fossilizedObject->chunk); - fossilizedObject->callback(&fossilizedObject->liveObjectPtr, fossilizedObject->userData); + fossilizedObject->callback( + &fossilizedObject->liveObjectPtr, + this, + fossilizedObject->context); } // Once we've written out all the payload data, we can start to work on @@ -921,7 +906,7 @@ void SerialWriter::_flush() for (auto variantInfo : _variants) { auto layoutChunk = _getOrCreateChunkForLayout(variantInfo.layout); - variantInfo.chunk->addPrefixRelativePtr<Fossil::RelativePtrOffset>(layoutChunk); + variantInfo.chunk->addPrefixRelativePtr<FossilInt>(layoutChunk); } } @@ -964,22 +949,22 @@ ChunkBuilder* SerialWriter::_getOrCreateChunkForLayout(LayoutObj* layout) break; case FossilizedValKind::Ptr: - case FossilizedValKind::Optional: + case FossilizedValKind::OptionalObj: { auto containerLayout = (ContainerLayoutObj*)layout; auto elementLayout = containerLayout->baseLayout; auto elementLayoutChunk = _getOrCreateChunkForLayout(elementLayout); - chunk->writeRelativePtr<Fossil::RelativePtrOffset>(elementLayoutChunk); + chunk->writeRelativePtr<FossilInt>(elementLayoutChunk); } break; - case FossilizedValKind::Array: - case FossilizedValKind::Dictionary: + 
case FossilizedValKind::ArrayObj: + case FossilizedValKind::DictionaryObj: { auto containerLayout = (ContainerLayoutObj*)layout; auto elementLayout = containerLayout->baseLayout; auto elementLayoutChunk = _getOrCreateChunkForLayout(elementLayout); - chunk->writeRelativePtr<Fossil::RelativePtrOffset>(elementLayoutChunk); + chunk->writeRelativePtr<FossilInt>(elementLayoutChunk); UInt32 elementStride = 0; if (elementLayout) @@ -1004,7 +989,7 @@ ChunkBuilder* SerialWriter::_getOrCreateChunkForLayout(LayoutObj* layout) { auto& field = recordLayout->fields[i]; auto fieldLayoutChunk = _getOrCreateChunkForLayout(field.layout); - chunk->writeRelativePtr<Fossil::RelativePtrOffset>(fieldLayoutChunk); + chunk->writeRelativePtr<FossilInt>(fieldLayoutChunk); auto fieldOffset = UInt32(field.offset); chunk->writeData(&fieldOffset, sizeof(fieldOffset)); @@ -1043,9 +1028,9 @@ bool SerialWriter::LayoutObjKey::operator==(LayoutObjKey const& that) const default: break; - case FossilizedValKind::Array: - case FossilizedValKind::Dictionary: - case FossilizedValKind::Optional: + case FossilizedValKind::ArrayObj: + case FossilizedValKind::DictionaryObj: + case FossilizedValKind::OptionalObj: case FossilizedValKind::Ptr: { auto thisContainer = (ContainerLayoutObj*)obj; @@ -1117,9 +1102,9 @@ void SerialWriter::LayoutObjKey::hashInto(Hasher& hasher) const default: break; - case FossilizedValKind::Array: - case FossilizedValKind::Dictionary: - case FossilizedValKind::Optional: + case FossilizedValKind::ArrayObj: + case FossilizedValKind::DictionaryObj: + case FossilizedValKind::OptionalObj: case FossilizedValKind::Ptr: { auto container = (ContainerLayoutObj*)obj; @@ -1152,16 +1137,83 @@ void SerialWriter::LayoutObjKey::hashInto(Hasher& hasher) const // SerialReader // -SerialReader::SerialReader(FossilizedValRef valRef) +SerialReader::SerialReader( + ReadContext& context, + Fossil::AnyValPtr valPtr, + InitialStateType initialState) + : _context(context) { - _state.type = State::Type::Root; - 
_state.baseValue = valRef; + // We track the number of active `SerialReader`s that + // are working with the same `ReadContext`, and will + // make use of this count in the destructor below. + // + context._readerCount++; + + switch (initialState) + { + case InitialStateType::Root: + _state.type = State::Type::Root; + break; + + case InitialStateType::PseudoPtr: + _state.type = State::Type::PseudoPtr; + break; + } + + _state.baseValPtr = valPtr; _state.elementIndex = 0; _state.elementCount = 1; } SerialReader::~SerialReader() { + // If an application is designed to perform something + // like on-demand deserialization, it may create + // additional `SerialReader`s attached to the same + // `ReadContext`, potentially even in the body of a + // callback that was invoked by an operation on another + // `SerialReader` further up the stack. + // + // If we were to track the deferred actions that get + // enqueued on a per-`SerialReader` basis, and then + // flush them when the given `SerialReader` is destructed, + // it could potentially lead to very deep call stacks. + // + // Instead, we track a single list of deferred actions + // on the `ReadContext`, which means that we need to + // figure out when to actually flush that list. + // + // What is implemented here is a "last one out shuts the door" + // policy. When a `SerialReader` is being destroyed, before + // it decrements the count on the shared `ReadContext`, it + // checks to see if it is the last remaining `SerialReader`, + // in which case it takes responsibility for flushing the deferred + // actions that were enqueued by *all* of the readers. + // + // Note that the ordering here is critical: we check whether + // we are the last reader and, if so, perform the `_flush()` + // operation all *before* decrementing the counter. 
If we + // were to decrement the count before invoking `_flush()` + // then any nested `SerialReader`s that get created by the + // deferred actions would (incorrectly) believe themselves + // to be the "last one out" and try to perform their own + // `flush()`, which could quickly lead to unbounded + // recursion. + // + if (_context._readerCount == 1) + { + _flush(); + } + _context._readerCount--; +} + +Fossil::AnyValPtr SerialReader::readValPtr() +{ + return _readValPtr(); +} + +void SerialReader::flush() +{ _flush(); } @@ -1172,94 +1224,83 @@ SerializationMode SerialReader::getMode() void SerialReader::handleBool(bool& value) { - auto valRef = _readValRef(); - value = as<FossilizedBoolVal>(valRef)->getValue(); + handleSimpleVal(value); } void SerialReader::handleInt8(int8_t& value) { - auto valRef = _readValRef(); - value = as<FossilizedInt8Val>(valRef)->getValue(); + handleSimpleVal(value); } void SerialReader::handleInt16(int16_t& value) { - auto valRef = _readValRef(); - value = as<FossilizedInt16Val>(valRef)->getValue(); + handleSimpleVal(value); } void SerialReader::handleInt32(Int32& value) { - auto valRef = _readValRef(); - value = as<FossilizedInt32Val>(valRef)->getValue(); + handleSimpleVal(value); } void SerialReader::handleInt64(Int64& value) { - auto valRef = _readValRef(); - value = as<FossilizedInt64Val>(valRef)->getValue(); + handleSimpleVal(value); } void SerialReader::handleUInt8(uint8_t& value) { - auto valRef = _readValRef(); - value = as<FossilizedUInt8Val>(valRef)->getValue(); + handleSimpleVal(value); } void SerialReader::handleUInt16(uint16_t& value) { - auto valRef = _readValRef(); - value = as<FossilizedUInt16Val>(valRef)->getValue(); + handleSimpleVal(value); } void SerialReader::handleUInt32(UInt32& value) { - auto valRef = _readValRef(); - value = as<FossilizedUInt32Val>(valRef)->getValue(); + handleSimpleVal(value); } void SerialReader::handleUInt64(UInt64& value) { - auto valRef = _readValRef(); - value = 
as<FossilizedUInt64Val>(valRef)->getValue(); + handleSimpleVal(value); } void SerialReader::handleFloat32(float& value) { - auto valRef = _readValRef(); - value = as<FossilizedFloat32Val>(valRef)->getValue(); + handleSimpleVal(value); } void SerialReader::handleFloat64(double& value) { - auto valRef = _readValRef(); - value = as<FossilizedFloat64Val>(valRef)->getValue(); + handleSimpleVal(value); } void SerialReader::handleString(String& value) { - auto valRef = _readPotentiallyIndirectValRef(); - if (!valRef) + auto valPtr = _readPotentiallyIndirectValPtr(); + if (!valPtr) { value = String(); } else { - value = as<FossilizedStringObj>(valRef)->getValue(); + value = as<FossilizedStringObj>(valPtr)->get(); } } void SerialReader::beginArray() { - auto valRef = _readPotentiallyIndirectValRef(); - auto arrayRef = as<FossilizedContainerObj>(valRef); + auto valPtr = _readPotentiallyIndirectValPtr(); + auto arrayPtr = as<FossilizedArrayObjBase>(valPtr); _pushState(); _state.type = State::Type::Array; - _state.baseValue = valRef; + _state.baseValPtr = arrayPtr; _state.elementIndex = 0; - _state.elementCount = getElementCount(arrayRef); + _state.elementCount = arrayPtr->getElementCount(); } void SerialReader::endArray() @@ -1269,15 +1310,15 @@ void SerialReader::endArray() void SerialReader::beginDictionary() { - auto valRef = _readPotentiallyIndirectValRef(); - auto dictionaryRef = as<FossilizedContainerObj>(valRef); + auto valPtr = _readPotentiallyIndirectValPtr(); + auto dictionaryPtr = as<FossilizedDictionaryObjBase>(valPtr); _pushState(); _state.type = State::Type::Dictionary; - _state.baseValue = valRef; + _state.baseValPtr = dictionaryPtr; _state.elementIndex = 0; - _state.elementCount = getElementCount(dictionaryRef); + _state.elementCount = dictionaryPtr->getElementCount(); } void SerialReader::endDictionary() @@ -1292,15 +1333,15 @@ bool SerialReader::hasElements() void SerialReader::beginStruct() { - auto valRef = _readValRef(); - auto recordRef = 
as<FossilizedRecordVal>(valRef); + auto valPtr = _readValPtr(); + auto recordPtr = as<FossilizedRecordVal>(valPtr); _pushState(); _state.type = State::Type::Struct; - _state.baseValue = valRef; + _state.baseValPtr = valPtr; _state.elementIndex = 0; - _state.elementCount = getFieldCount(recordRef); + _state.elementCount = recordPtr->getFieldCount(); } void SerialReader::endStruct() @@ -1310,18 +1351,20 @@ void SerialReader::endStruct() void SerialReader::beginVariant() { - auto valRef = _readPotentiallyIndirectValRef(); - auto variantRef = as<FossilizedVariantObj>(valRef); - - auto contentValRef = getVariantContent(variantRef); - auto contentRecordRef = as<FossilizedRecordVal>(contentValRef); + auto valPtr = _readPotentiallyIndirectValPtr(); + if (auto variantPtr = as<FossilizedVariantObj>(valPtr)) + { + auto contentValPtr = getVariantContentPtr(variantPtr); + valPtr = contentValPtr; + } + auto recordPtr = as<FossilizedRecordVal>(valPtr); _pushState(); _state.type = State::Type::Struct; - _state.baseValue = contentValRef; + _state.baseValPtr = recordPtr; _state.elementIndex = 0; - _state.elementCount = getFieldCount(contentRecordRef); + _state.elementCount = recordPtr->getFieldCount(); } void SerialReader::endVariant() @@ -1339,15 +1382,15 @@ void SerialReader::handleFieldKey(char const* name, Int index) void SerialReader::beginTuple() { - auto valRef = _readValRef(); - auto recordRef = as<FossilizedRecordVal>(valRef); + auto valPtr = _readValPtr(); + auto recordPtr = as<FossilizedRecordVal>(valPtr); _pushState(); _state.type = State::Type::Tuple; - _state.baseValue = valRef; + _state.baseValPtr = recordPtr; _state.elementIndex = 0; - _state.elementCount = getFieldCount(recordRef); + _state.elementCount = recordPtr->getFieldCount(); } void SerialReader::endTuple() @@ -1357,15 +1400,15 @@ void SerialReader::endTuple() void SerialReader::beginOptional() { - auto valRef = _readIndirectValRef(); - auto optionalRef = as<FossilizedOptionalObj>(valRef); + auto valPtr = 
_readIndirectValPtr(); + auto optionalPtr = as<FossilizedOptionalObjBase>(valPtr); _pushState(); _state.type = State::Type::Optional; - _state.baseValue = valRef; + _state.baseValPtr = optionalPtr; _state.elementIndex = 0; - _state.elementCount = Count(hasValue(optionalRef)); + _state.elementCount = Count(optionalPtr->hasValue()); } void SerialReader::endOptional() @@ -1373,14 +1416,21 @@ void SerialReader::endOptional() _popState(); } -void SerialReader::handleSharedPtr(void*& value, Callback callback, void* userData) +void SerialReader::handleSharedPtr(void*& value, Callback callback, void* context) { - // The fossilized value at our cursor must be a pointer, - // and we can resolve what it is pointing to easily enough. - // - auto valRef = _readValRef(); - auto ptrRef = as<FossilizedPtrVal>(valRef); - auto targetValRef = getPtrTarget(ptrRef); + Fossil::AnyValPtr targetValPtr; + + if (_state.type == State::Type::PseudoPtr) + { + _state.type = State::Type::Root; + targetValPtr = _readValPtr(); + } + else + { + auto valPtr = _readValPtr(); + auto ptrPtr = as<FossilizedPtr<void>>(valPtr); + targetValPtr = ptrPtr->getTargetValPtr(); + } // The logic here largely mirrors what appears in // `SerialWriter::handleSharedPtr`. @@ -1388,7 +1438,7 @@ void SerialReader::handleSharedPtr(void*& value, Callback callback, void* userDa // We first check for an explicitly written null pointer. // If we find one our work is very easy. // - if (!targetValRef) + if (!targetValPtr) { value = nullptr; return; @@ -1397,7 +1447,7 @@ void SerialReader::handleSharedPtr(void*& value, Callback callback, void* userDa // Now we need to check if we've previously read in // a reference to the same object. 
// - if (auto found = _mapFossilizedObjectPtrToObjectInfo.tryGetValue(targetValRef.getData())) + if (auto found = _context.mapFossilizedObjectPtrToObjectInfo.tryGetValue(targetValPtr.get())) { auto objectInfo = *found; @@ -1440,9 +1490,9 @@ void SerialReader::handleSharedPtr(void*& value, Callback callback, void* userDa // object index that has not yet been read at all. // auto objectInfo = RefPtr(new ObjectInfo()); - _mapFossilizedObjectPtrToObjectInfo.add(targetValRef.getData(), objectInfo); + _context.mapFossilizedObjectPtrToObjectInfo.add(targetValPtr.get(), objectInfo); - objectInfo->fossilizedObjectRef = targetValRef; + objectInfo->fossilizedObjectPtr = targetValPtr; // We cannot return from this function until we have // stored a pointer into `value`, to represent the @@ -1473,7 +1523,7 @@ void SerialReader::handleSharedPtr(void*& value, Callback callback, void* userDa // _pushState(); _state.type = State::Type::Object; - _state.baseValue = objectInfo->fossilizedObjectRef; + _state.baseValPtr = objectInfo->fossilizedObjectPtr; _state.elementIndex = 0; _state.elementCount = 1; @@ -1491,7 +1541,7 @@ void SerialReader::handleSharedPtr(void*& value, Callback callback, void* userDa // that objects and stores a pointer to it into the output // parameter. // - callback(&objectInfo->resurrectedObjectPtr, userData); + callback(&objectInfo->resurrectedObjectPtr, this, context); _popState(); @@ -1500,15 +1550,15 @@ void SerialReader::handleSharedPtr(void*& value, Callback callback, void* userDa value = objectInfo->resurrectedObjectPtr; } -void SerialReader::handleUniquePtr(void*& value, Callback callback, void* userData) +void SerialReader::handleUniquePtr(void*& value, Callback callback, void* context) { // We treat all pointers as shared pointers, because there isn't really // an optimized representation we would want to use for the unique case. 
// - handleSharedPtr(value, callback, userData); + handleSharedPtr(value, callback, context); } -void SerialReader::handleDeferredObjectContents(void* valuePtr, Callback callback, void* userData) +void SerialReader::handleDeferredObjectContents(void* valuePtr, Callback callback, void* context) { // Unlike the case in `SerialWriter::handleDeferredObjectContents()`, // we very much *do* want to delay invoking the callback until later. @@ -1528,9 +1578,9 @@ void SerialReader::handleDeferredObjectContents(void* valuePtr, Callback callbac deferredAction.savedState = _state; deferredAction.resurrectedObjectPtr = valuePtr; deferredAction.callback = callback; - deferredAction.userData = userData; + deferredAction.context = context; - _deferredActions.add(deferredAction); + _context._deferredActions.add(deferredAction); } void SerialReader::_flush() @@ -1538,7 +1588,7 @@ void SerialReader::_flush() // We need to flush any actions that were deferred // and are still pending. // - while (_deferredActions.getCount() != 0) + while (_context._deferredActions.getCount() != 0) { // TODO: For simplicity we are using the `_deferredActions` // array as a stack (LIFO), but it would be good to @@ -1546,15 +1596,15 @@ void SerialReader::_flush() // large the array would need to grow for a FIFO vs. LIFO, // and pick the better option. 
// - auto deferredAction = _deferredActions.getLast(); - _deferredActions.removeLast(); + auto deferredAction = _context._deferredActions.getLast(); + _context._deferredActions.removeLast(); _state = deferredAction.savedState; - deferredAction.callback(deferredAction.resurrectedObjectPtr, deferredAction.userData); + deferredAction.callback(deferredAction.resurrectedObjectPtr, this, deferredAction.context); } } -FossilizedValRef SerialReader::_readValRef() +Fossil::AnyValPtr SerialReader::_readValPtr() { switch (_state.type) { @@ -1563,7 +1613,7 @@ FossilizedValRef SerialReader::_readValRef() SLANG_ASSERT(_state.elementCount == 1); SLANG_ASSERT(_state.elementIndex == 0); _state.elementIndex++; - return _state.baseValue; + return _state.baseValPtr; case State::Type::Struct: case State::Type::Tuple: @@ -1571,8 +1621,8 @@ FossilizedValRef SerialReader::_readValRef() SLANG_ASSERT(_state.elementIndex < _state.elementCount); auto index = _state.elementIndex++; - auto recordRef = as<FossilizedRecordVal>(_state.baseValue); - return getField(recordRef, index); + auto recordPtr = as<FossilizedRecordVal>(_state.baseValPtr); + return getAddress(recordPtr->getField(index)); } case State::Type::Optional: @@ -1580,8 +1630,8 @@ FossilizedValRef SerialReader::_readValRef() SLANG_ASSERT(_state.elementCount == 1); SLANG_ASSERT(_state.elementIndex == 0); - auto optionalRef = as<FossilizedOptionalObj>(_state.baseValue); - return getValue(optionalRef); + auto optionalPtr = as<FossilizedOptionalObjBase>(_state.baseValPtr); + return getAddress(optionalPtr->getValue()); } case State::Type::Array: @@ -1590,8 +1640,8 @@ FossilizedValRef SerialReader::_readValRef() SLANG_ASSERT(_state.elementIndex < _state.elementCount); auto index = _state.elementIndex++; - auto containerRef = as<FossilizedContainerObj>(_state.baseValue); - return getElement(containerRef, index); + auto containerPtr = as<FossilizedContainerObjBase>(_state.baseValPtr); + return Fossil::ValPtr(containerPtr->getElement(index)); 
} default: @@ -1600,24 +1650,25 @@ FossilizedValRef SerialReader::_readValRef() } } -FossilizedValRef SerialReader::_readIndirectValRef() +Fossil::AnyValPtr SerialReader::_readIndirectValPtr() { - auto ptrValRef = _readValRef(); - auto ptrRef = as<FossilizedPtrVal>(ptrValRef); + auto baseValPtr = _readValPtr(); + auto basePtrPtr = as<FossilizedPtr<void>>(baseValPtr); - auto valRef = getPtrTarget(ptrRef); - return valRef; + auto targetValPtr = basePtrPtr->getTargetValPtr(); + return targetValPtr; } -FossilizedValRef SerialReader::_readPotentiallyIndirectValRef() +Fossil::AnyValPtr SerialReader::_readPotentiallyIndirectValPtr() { - auto valRef = _readValRef(); - if (auto ptrRef = as<FossilizedPtrVal>(valRef)) + auto baseValPtr = _readValPtr(); + if (auto basePtrPtr = as<FossilizedPtr<void>>(baseValPtr)) { - return getPtrTarget(ptrRef); + auto targetValRef = basePtrPtr->getTargetValRef(); + return Fossil::ValPtr(targetValRef); } - return valRef; + return baseValPtr; } void SerialReader::_pushState() diff --git a/source/slang/slang-serialize-fossil.h b/source/slang/slang-serialize-fossil.h index 0393c5784..930719935 100644 --- a/source/slang/slang-serialize-fossil.h +++ b/source/slang/slang-serialize-fossil.h @@ -260,7 +260,7 @@ private: /// Callback information used by the ISerializer interface. Callback callback = nullptr; - void* userData = nullptr; + void* context = nullptr; }; List<FossilizedObjectInfo*> _fossilizedObjects; @@ -465,10 +465,10 @@ private: void _pushPotentiallyIndirectValueScope(FossilizedValKind kind); ChunkBuilder* _popPotentiallyIndirectValueScope(); - /// Determine if a potentially-indirect value of `kind` should be + /// Determine if a potentially-indirect value of should be /// emitted indirectly, in the current state. 
/// - bool _shouldEmitWithPointerIndirection(FossilizedValKind kind); + bool _shouldEmitPotentiallyIndirectValueWithPointerIndirection(); /// Helper function to share details between `_popIndirectValueScope` /// and `_popPotentiallyIndirectValueScope`. @@ -548,10 +548,10 @@ private: virtual void handleFieldKey(char const* name, Int index) override; - virtual void handleSharedPtr(void*& value, Callback callback, void* userData) override; - virtual void handleUniquePtr(void*& value, Callback callback, void* userData) override; + virtual void handleSharedPtr(void*& value, Callback callback, void* context) override; + virtual void handleUniquePtr(void*& value, Callback callback, void* context) override; - virtual void handleDeferredObjectContents(void* valuePtr, Callback callback, void* userData) + virtual void handleDeferredObjectContents(void* valuePtr, Callback callback, void* context) override; }; @@ -559,10 +559,39 @@ private: struct SerialReader : ISerializerImpl { public: - SerialReader(FossilizedValRef valRef); + struct ReadContext; + + enum class InitialStateType + { + Root, + PseudoPtr, + }; + + SerialReader( + ReadContext& context, + Fossil::AnyValPtr valPtr, + InitialStateType initialState = InitialStateType::Root); ~SerialReader(); + /// Read a value from the current cursor position. + /// + /// This operation can be used to skip over an entire value + /// that might otherwise need to be read with a sequence of + /// operations of the `ISerializerImpl` interface. + /// + /// The saved pointer can then be used to construct another + /// `Fossil::SerialReader` to read the contents of the value + /// at some later time, or code can simply navigate the + /// data in memory using their own logic. + /// + Fossil::AnyValPtr readValPtr(); + + void flush(); + private: + /// The shared context that this reader is using. + ReadContext& _context; + /// A state that the reader can be in. 
struct State { @@ -577,6 +606,8 @@ private: Tuple, Struct, Object, + + PseudoPtr, }; /// The type of state. @@ -588,7 +619,7 @@ private: /// that will be read (e.g., for the `Root` case), or it might be /// a container that is a parent of the next value to be read. /// - FossilizedValRef baseValue; + Fossil::AnyValPtr baseValPtr; /// Index of next element to read. /// @@ -613,15 +644,16 @@ private: /// Stack of saved states. List<State> _stack; - void _pushState(); - void _popState(); - // // Like other `ISerializerImpl`s for reading, we track objects // that are in the process of being read in, to avoid possible // unbounded recursion (and detect circularities when they // occur). // + // A key difference here is that the actual mapping is being + // stored in the shared `ReadContext`, rather than in the + // `SerialReader` itself. + // enum class ObjectState { @@ -634,9 +666,8 @@ private: ObjectState state = ObjectState::Unread; void* resurrectedObjectPtr = nullptr; - FossilizedValRef fossilizedObjectRef; + Fossil::AnyValPtr fossilizedObjectPtr; }; - Dictionary<void*, RefPtr<ObjectInfo>> _mapFossilizedObjectPtrToObjectInfo; // // Again, like other `ISerializerImpl`s for reading, we @@ -652,9 +683,12 @@ private: State savedState; Callback callback; - void* userData; + void* context; }; - List<DeferredAction> _deferredActions; + + void _pushState(); + void _popState(); + /// Execute all deferred actions that are still pending. void _flush(); @@ -663,14 +697,14 @@ private: /// /// This is the case for scalars, tuples, and structs. /// - FossilizedValRef _readValRef(); + Fossil::AnyValPtr _readValPtr(); /// Read an indirect value. /// /// This is the case for things like optionals, that are /// always encoded as a pointer. /// - FossilizedValRef _readIndirectValRef(); + Fossil::AnyValPtr _readIndirectValPtr(); /// Read a potentially-indirect value. /// @@ -679,7 +713,31 @@ private: /// /// Otherwise, this will return a reference to the value itself. 
/// - FossilizedValRef _readPotentiallyIndirectValRef(); + Fossil::AnyValPtr _readPotentiallyIndirectValPtr(); + + + template<typename T> + void handleSimpleVal(T& value) + { + auto valPtr = _readValPtr(); + value = as<Fossilized<T>>(valPtr)->getDataRef(); + } + +public: + struct ReadContext + { + public: + ReadContext() = default; + + private: + friend struct SerialReader; + + Dictionary<void*, RefPtr<ObjectInfo>> mapFossilizedObjectPtrToObjectInfo; + List<DeferredAction> _deferredActions; + + Count _readerCount = 0; + }; + private: // @@ -728,13 +786,15 @@ private: virtual void beginOptional() override; virtual void endOptional() override; - virtual void handleSharedPtr(void*& value, Callback callback, void* userData) override; - virtual void handleUniquePtr(void*& value, Callback callback, void* userData) override; + virtual void handleSharedPtr(void*& value, Callback callback, void* context) override; + virtual void handleUniquePtr(void*& value, Callback callback, void* context) override; - virtual void handleDeferredObjectContents(void* valuePtr, Callback callback, void* userData) + virtual void handleDeferredObjectContents(void* valuePtr, Callback callback, void* context) override; }; +using ReadContext = SerialReader::ReadContext; + } // namespace Fossil } // namespace Slang diff --git a/source/slang/slang-serialize-riff.cpp b/source/slang/slang-serialize-riff.cpp index 469803c2a..fb8acd2bd 100644 --- a/source/slang/slang-serialize-riff.cpp +++ b/source/slang/slang-serialize-riff.cpp @@ -217,7 +217,7 @@ void RIFFSerialWriter::endOptional() _cursor.endChunk(); } -void RIFFSerialWriter::handleSharedPtr(void*& value, Callback callback, void* userData) +void RIFFSerialWriter::handleSharedPtr(void*& value, Callback callback, void* context) { // Because we are writing, we only care about the // pointer that is already present in `value`. 
@@ -279,22 +279,22 @@ void RIFFSerialWriter::handleSharedPtr(void*& value, Callback callback, void* us ObjectInfo objectInfo; objectInfo.ptr = ptr; objectInfo.callback = callback; - objectInfo.userData = userData; + objectInfo.context = context; _objects.add(objectInfo); } -void RIFFSerialWriter::handleUniquePtr(void*& value, Callback callback, void* userData) +void RIFFSerialWriter::handleUniquePtr(void*& value, Callback callback, void* context) { // We treat all pointers as shared pointers, because there isn't really // an optimized representation we would want to use for the unique case. // - handleSharedPtr(value, callback, userData); + handleSharedPtr(value, callback, context); } void RIFFSerialWriter::handleDeferredObjectContents( void* valuePtr, Callback callback, - void* userData) + void* context) { // Because we are already deferring writing of the *entirety* of // an object's members as part of how `handleSharedPtr()` works, @@ -303,7 +303,7 @@ void RIFFSerialWriter::handleDeferredObjectContents( // (In practice the `handleDeferredObjectContents()` operation is // more for the benefit of reading than writing). // - callback(valuePtr, userData); + callback(valuePtr, this, context); } void RIFFSerialWriter::_writeObjectReference(ObjectIndex index) @@ -363,7 +363,7 @@ void RIFFSerialWriter::_flush() // can set the pointed-to pointer to whatever object it // allocates or finds. 
// - objectInfo.callback(&objectInfo.ptr, objectInfo.userData); + objectInfo.callback(&objectInfo.ptr, this, objectInfo.context); // TODO(tfoley): There is an important invariant here that // the callback had better only write *one* value, but @@ -572,7 +572,7 @@ RIFFSerialReader::ObjectIndex RIFFSerialReader::_readObjectReference() return objectIndex; } -void RIFFSerialReader::handleSharedPtr(void*& value, Callback callback, void* userData) +void RIFFSerialReader::handleSharedPtr(void*& value, Callback callback, void* context) { // The logic here largely mirrors what appears in // `RIFFSerialWriter::handleSharedPtr`. @@ -686,7 +686,7 @@ void RIFFSerialReader::handleSharedPtr(void*& value, Callback callback, void* us // that objects and stores a pointer to it into the output // parameter. // - callback(&objectInfo.ptr, userData); + callback(&objectInfo.ptr, this, context); _popCursor(); @@ -706,7 +706,7 @@ void RIFFSerialReader::handleUniquePtr(void*& value, Callback callback, void* us void RIFFSerialReader::handleDeferredObjectContents( void* valuePtr, Callback callback, - void* userData) + void* context) { // Unlike the case in `RIFFSerialWriter::handleDeferredObjectContents()`, // we very much *do* want to delay invoking the callback until later. 
@@ -726,7 +726,7 @@ void RIFFSerialReader::handleDeferredObjectContents( deferredAction.savedCursor = _cursor; deferredAction.valuePtr = valuePtr; deferredAction.callback = callback; - deferredAction.userData = userData; + deferredAction.context = context; _deferredActions.add(deferredAction); } @@ -777,7 +777,7 @@ void RIFFSerialReader::_flush() _deferredActions.removeLast(); _cursor = deferredAction.savedCursor; - deferredAction.callback(deferredAction.valuePtr, deferredAction.userData); + deferredAction.callback(deferredAction.valuePtr, this, deferredAction.context); } } diff --git a/source/slang/slang-serialize-riff.h b/source/slang/slang-serialize-riff.h index 87f83f3f0..256c185d4 100644 --- a/source/slang/slang-serialize-riff.h +++ b/source/slang/slang-serialize-riff.h @@ -162,8 +162,8 @@ private: /// Callback that can be invoked to serialize the object's data. Callback callback; - /// User-data pointer for `callback` - void* userData; + /// Context pointer for `callback` + void* context; }; /// The chunk where object definitions are listed. @@ -237,10 +237,10 @@ private: virtual void beginOptional() override; virtual void endOptional() override; - virtual void handleSharedPtr(void*& value, Callback callback, void* userData) override; - virtual void handleUniquePtr(void*& value, Callback callback, void* userData) override; + virtual void handleSharedPtr(void*& value, Callback callback, void* context) override; + virtual void handleUniquePtr(void*& value, Callback callback, void* context) override; - virtual void handleDeferredObjectContents(void* valuePtr, Callback callback, void* userData) + virtual void handleDeferredObjectContents(void* valuePtr, Callback callback, void* context) override; }; @@ -262,6 +262,19 @@ public: /// ~RIFFSerialReader(); + /// Read a chunk from the current cursor position. 
+ /// + /// This operation can be used to skip over an entire value + /// that might otherwise need to be read with a sequence of + /// operations of the `ISerializerImpl` interface. + /// + /// The saved pointer can then be used to construct another + /// `RIFFSerialReader` to read the contents of the chunk + /// at some later time, or code can simply navigate the + /// chunk in memory using their own logic. + /// + RIFF::Chunk const* readChunk(); + private: /// Representation of a read cursor in the serialized RIFF data. using Cursor = RIFF::BoundsCheckedChunkPtr; @@ -344,8 +357,8 @@ private: /// The callback to apply to read data into the `valuePtr` Callback callback; - /// The user-data pointer for the `callback`. - void* userData; + /// The context pointer for the `callback`. + void* context; }; /// Deferred actions that are still pending. @@ -427,10 +440,10 @@ private: virtual void beginOptional() override; virtual void endOptional() override; - virtual void handleSharedPtr(void*& value, Callback callback, void* userData) override; - virtual void handleUniquePtr(void*& value, Callback callback, void* userData) override; + virtual void handleSharedPtr(void*& value, Callback callback, void* context) override; + virtual void handleUniquePtr(void*& value, Callback callback, void* context) override; - virtual void handleDeferredObjectContents(void* valuePtr, Callback callback, void* userData) + virtual void handleDeferredObjectContents(void* valuePtr, Callback callback, void* context) override; }; diff --git a/source/slang/slang-serialize.h b/source/slang/slang-serialize.h index 591f43139..261f3f2d6 100644 --- a/source/slang/slang-serialize.h +++ b/source/slang/slang-serialize.h @@ -370,7 +370,7 @@ struct ISerializerImpl virtual void handleFieldKey(char const* name, Int index) = 0; /// A callback function used to handle serialization of pointers. 
- typedef void (*Callback)(void* valuePtr, void* userData); + typedef void (*Callback)(void* valuePtr, void* impl, void* context); /// Handle a pointer value that is expected to be unique. /// @@ -387,7 +387,7 @@ struct ISerializerImpl /// /// When writing, if the `value` is non-null, then the callback /// will be invoked, either immediately or at some later point, - /// as `callback(&ptr, userData)` where `ptr` is a variable + /// as `callback(&ptr, this, context)` where `ptr` is a variable /// holding a copy of the `value` that was passed in. The callback /// is expected to write the members of the pointed-to type. /// @@ -396,7 +396,7 @@ struct ISerializerImpl /// for ensuring that its internal state has been restored to /// be compatible with what it was when `handleUniquePtr` was called. /// - virtual void handleUniquePtr(void*& value, Callback callback, void* userData) = 0; + virtual void handleUniquePtr(void*& value, Callback callback, void* context) = 0; /// Handle a pointer value that may have multiple references. /// @@ -411,7 +411,7 @@ struct ISerializerImpl /// the `callback` will not be invoked, and instead `value` will /// be set to the pointer that was previously read. /// - virtual void handleSharedPtr(void*& value, Callback callback, void* userData) = 0; + virtual void handleSharedPtr(void*& value, Callback callback, void* context) = 0; /// Defer serialization of the contents of an object. /// @@ -422,14 +422,14 @@ struct ISerializerImpl /// passed to `handleUniquePtr()` or `handleSharedPtr()`. /// /// This operation schedules the given `callback` to be called - /// at some later point a `callback(value, userData)`, with + /// at some later point a `callback(value, this, context)`, with /// the state of the serializer implementation restored to what /// it was when `handleDeferredObjectContents()` was called. /// /// Some concrete serializer implementations might implement /// this operation by invoking `callback` immediately. 
/// - virtual void handleDeferredObjectContents(void* value, Callback callback, void* userData) = 0; + virtual void handleDeferredObjectContents(void* value, Callback callback, void* context) = 0; }; // @@ -439,50 +439,68 @@ struct ISerializerImpl // // While the `ISerializerImpl` interface can cover a wide range of // types that need to be serialized, it is common for types to require -// more specific context to be available in order to perform serialization. +// more specific *context* to be available in order to perform serialization. // For example, code might need access to a factory object in order // to construct objects of a type being read. // -// To support more specialized serializer implementations, we allow -// the smart pointer used for a serializer to depend on the type -// of the underlying implementation object. +// To support more specialized serializer implementations, the smart +// pointer type used for a serializer actually wraps *two* pointers: +// one for an `ISerializerImpl`-derived type, and one for a context +// type. The smart pointer is templated on both of these types. // /// Base type for serialization contexts. /// -/// The type parameter `T` should be a type of object that -/// holds the context information needed. +/// The type parameter `Impl` should be a type that derives from +/// `ISerializerImpl`, and the `Context` type parameter can be any +/// type that passes along additional context information needed. 
/// -template<typename T> +template<typename Impl, typename Context> struct SerializerBase { public: SerializerBase() = default; - SerializerBase(T* ptr) - : _ptr(ptr) + SerializerBase(Impl* impl, Context* context = nullptr) + : _impl(impl), _context(context) { } - T* get() const { return _ptr; } - T* operator->() const { return get(); } + template<typename I, typename C> + SerializerBase( + SerializerBase<I, C> const& serializer, + std::enable_if_t< + std::is_convertible_v<I*, Impl*> && std::is_convertible_v<C*, Context*>, + void>* = nullptr) + : _impl(serializer.getImpl()), _context(serializer.getContext()) + { + } + + Impl* getImpl() const { return _impl; } + Context* getContext() const { return _context; } + + Impl* get() const { return _impl; } + Impl* operator->() const { return get(); } + private: - T* _ptr = nullptr; + Impl* _impl = nullptr; + Context* _context = nullptr; }; /// A serialization context. /// -/// The type parameter `T` should be a type of object that -/// holds the context information needed. +/// The type parameter `Impl` should be a type that derives from +/// `ISerializerImpl`, and the `Context` type parameter can be any +/// type that passes along additional context information needed. /// -template<typename T> -struct Serializer_ : SerializerBase<T> +template<typename Impl, typename Context> +struct Serializer_ : SerializerBase<Impl, Context> { - using SerializerBase<T>::SerializerBase; + using SerializerBase<Impl, Context>::SerializerBase; }; /// Default serialization context. 
-using Serializer = Serializer_<ISerializerImpl>; +using Serializer = Serializer_<ISerializerImpl, void>; // // We define namespace-scope functions that mirror some @@ -944,22 +962,22 @@ void serializeObjectContents(S const& serializer, T* value, void*) serialize(serializer, *value); } -template<typename S, typename T> -void _serializeObjectContentsCallback(void* valuePtr, void* userData) +template<typename I, typename C, typename T> +void _serializeObjectContentsCallback(void* valuePtr, void* impl, void* context) { - auto serializerImpl = (S*)userData; + Serializer_<I, C> serializer((I*)impl, (C*)context); auto value = (T*)valuePtr; - serializeObjectContents(Serializer_<S>(serializerImpl), value, (T*)nullptr); + serializeObjectContents(serializer, value, (T*)nullptr); } -template<typename S, typename T> -void deferSerializeObjectContents(Serializer_<S> const& serializer, T* value) +template<typename I, typename C, typename T> +void deferSerializeObjectContents(Serializer_<I, C> const& serializer, T* value) { ((Serializer)serializer) ->handleDeferredObjectContents( value, - _serializeObjectContentsCallback<S, T>, - serializer.get()); + _serializeObjectContentsCallback<I, C, T>, + serializer.getContext()); } template<typename S, typename T> @@ -972,26 +990,32 @@ void serializeObject(S const& serializer, T*& value, void*) deferSerializeObjectContents(serializer, value); } -template<typename S, typename T> -void _serializeObjectCallback(void* valuePtr, void* userData) +template<typename I, typename C, typename T> +void _serializeObjectCallback(void* valuePtr, void* impl, void* context) { - auto serializerImpl = (S*)userData; + Serializer_<I, C> serializer((I*)impl, (C*)context); auto& value = *(T**)valuePtr; - serializeObject(Serializer_<S>(serializerImpl), value, (T*)nullptr); + serializeObject(serializer, value, (T*)nullptr); } -template<typename S, typename T> -void serializeSharedPtr(Serializer_<S> const& serializer, T*& value) +template<typename I, typename C, 
typename T> +void serializeSharedPtr(Serializer_<I, C> const& serializer, T*& value) { ((Serializer)serializer) - ->handleSharedPtr(*(void**)&value, _serializeObjectCallback<S, T>, serializer.get()); + ->handleSharedPtr( + *(void**)&value, + _serializeObjectCallback<I, C, T>, + serializer.getContext()); } -template<typename S, typename T> -void serializeUniquePtr(Serializer_<S> const& serializer, T*& value) +template<typename I, typename C, typename T> +void serializeUniquePtr(Serializer_<I, C> const& serializer, T*& value) { ((Serializer)serializer) - ->handleUniquePtr(*(void**)&value, _serializeObjectCallback<S, T>, serializer.get()); + ->handleUniquePtr( + *(void**)&value, + _serializeObjectCallback<I, C, T>, + serializer.getContext()); } template<typename S, typename T> diff --git a/source/slang/slang.cpp b/source/slang/slang.cpp index 2c76f035c..f2f28e3e3 100644 --- a/source/slang/slang.cpp +++ b/source/slang/slang.cpp @@ -745,6 +745,7 @@ SlangResult Session::_readBuiltinModule( linkage, astBuilder, nullptr, // no sink + fileContents, astChunk, sourceLocReader, SourceLoc()); @@ -755,11 +756,6 @@ SlangResult Session::_readBuiltinModule( moduleDecl->module = module; module->setModuleDecl(moduleDecl); - if (isFromCoreModule(moduleDecl)) - { - registerBuiltinDecls(this, moduleDecl); - } - // After the AST module has been read in, we next look // to deserialize the IR module. 
// @@ -4198,6 +4194,7 @@ void Linkage::loadParsedModule( } RefPtr<Module> Linkage::findOrLoadSerializedModuleForModuleLibrary( + ISlangBlob* blobHoldingSerializedData, ModuleChunk const* moduleChunk, RIFF::ListChunk const* libraryChunk, DiagnosticSink* sink) @@ -4244,6 +4241,7 @@ RefPtr<Module> Linkage::findOrLoadSerializedModuleForModuleLibrary( return loadSerializedModule( moduleName, modulePathInfo, + blobHoldingSerializedData, moduleChunk, libraryChunk, SourceLoc(), @@ -4253,6 +4251,7 @@ RefPtr<Module> Linkage::findOrLoadSerializedModuleForModuleLibrary( RefPtr<Module> Linkage::loadSerializedModule( Name* moduleName, const PathInfo& moduleFilePathInfo, + ISlangBlob* blobHoldingSerializedData, ModuleChunk const* moduleChunk, RIFF::ListChunk const* containerChunk, SourceLoc const& requestingLoc, @@ -4286,6 +4285,7 @@ RefPtr<Module> Linkage::loadSerializedModule( if (SLANG_FAILED(loadSerializedModuleContents( module, moduleFilePathInfo, + blobHoldingSerializedData, moduleChunk, containerChunk, sink))) @@ -4352,6 +4352,7 @@ RefPtr<Module> Linkage::loadBinaryModuleImpl( RefPtr<Module> module = loadSerializedModule( moduleName, moduleFilePathInfo, + moduleFileContents, moduleChunk, rootChunk, requestingLoc, @@ -5269,7 +5270,27 @@ void Module::_processFindDeclsExportSymbolsRec(Decl* decl) } } -NodeBase* Module::findExportFromMangledName(const UnownedStringSlice& slice) +Decl* Module::findExportedDeclByMangledName(const UnownedStringSlice& mangledName) +{ + // If this module is a serialized module that is being + // deserialized on-demand, then we want to use the + // mangled name mapping that was baked into the serialized + // data, rather than attempt to enumerate all of the declarations + // in the module (as would be done if we proceeded to call + // `ensureExportLookupAcceleratorBuilt()`). 
+ // + if (this->m_moduleDecl->isUsingOnDemandDeserializationForExports()) + { + return m_moduleDecl->_findSerializedDeclByMangledExportName(mangledName); + } + + ensureExportLookupAcceleratorBuilt(); + + const Index index = m_mangledExportPool.findIndex(mangledName); + return (index >= 0) ? m_mangledExportSymbols[index] : nullptr; +} + +void Module::ensureExportLookupAcceleratorBuilt() { // Will be non zero if has been previously attempted if (m_mangledExportSymbols.getCount() == 0) @@ -5284,9 +5305,25 @@ NodeBase* Module::findExportFromMangledName(const UnownedStringSlice& slice) m_mangledExportSymbols.add(nullptr); } } +} - const Index index = m_mangledExportPool.findIndex(slice); - return (index >= 0) ? m_mangledExportSymbols[index] : nullptr; +Count Module::getExportedDeclCount() +{ + ensureExportLookupAcceleratorBuilt(); + + return m_mangledExportPool.getSlicesCount(); +} + +Decl* Module::getExportedDecl(Index index) +{ + ensureExportLookupAcceleratorBuilt(); + return m_mangledExportSymbols[index]; +} + +UnownedStringSlice Module::getExportedDeclMangledName(Index index) +{ + ensureExportLookupAcceleratorBuilt(); + return m_mangledExportPool.getSlices()[index]; } // ComponentType @@ -6657,6 +6694,7 @@ void Linkage::setFileSystem(ISlangFileSystem* inFileSystem) SlangResult Linkage::loadSerializedModuleContents( Module* module, const PathInfo& moduleFilePathInfo, + ISlangBlob* blobHoldingSerializedData, ModuleChunk const* moduleChunk, RIFF::ListChunk const* containerChunk, DiagnosticSink* sink) @@ -6753,6 +6791,7 @@ SlangResult Linkage::loadSerializedModuleContents( this, astBuilder, sink, + blobHoldingSerializedData, astChunk, sourceLocReader, serializedModuleLoc); @@ -7399,8 +7438,10 @@ SlangResult EndToEndCompileRequest::addLibraryReference( // We need to deserialize and add the modules ComPtr<IModuleLibrary> library; + auto libBlob = RawBlob::create((const Byte*)libData, libDataSize); + SLANG_RETURN_ON_FAIL( - loadModuleLibrary((const Byte*)libData, 
libDataSize, basePath, this, library)); + loadModuleLibrary(libBlob, (const Byte*)libData, libDataSize, basePath, this, library)); // Create an artifact without any name (as one is not provided) auto artifact = diff --git a/source/slang/slang.natvis b/source/slang/slang.natvis index 341a390f2..38ec85b62 100644 --- a/source/slang/slang.natvis +++ b/source/slang/slang.natvis @@ -667,4 +667,55 @@ </Expand> </Type> <!--~UIntSet--> + + <Type Name="Slang::FossilizedPtr<*>"> + <SmartPointer Usage="Minimal">_offset == 0 ? nullptr : ($T1*)((char*)this + _offset)</SmartPointer> + <DisplayString Condition="_offset == 0">{($T1*)0}</DisplayString> + <DisplayString Condition="_offset != 0">{($T1*)((char*)this + _offset)}</DisplayString> + <Expand> + <ExpandedItem>_offset == 0 ? nullptr : ($T1*)((char*)this + _offset)</ExpandedItem> + </Expand> + </Type> + + + <Type Name="Slang::FossilizedString"> + <DisplayString Condition="_obj._offset == 0">""</DisplayString> + <DisplayString Condition="_obj._offset != 0">{((char*)this + _obj._offset),s8}</DisplayString> + <!-- + <Expand> + <ExpandedItem>(_text._offset == 0 ? 
"" : ((char*)this + _text._offset)),s8</ExpandedItem> + </Expand> + --> + </Type> + + <Type Name="Slang::FossilizedArray<*>"> + <DisplayString Condition="_obj._offset == 0">{{ count = 0 }}</DisplayString> + <DisplayString Condition="_obj._offset != 0">{{ count = {*((UInt32*)this - 1)} }}</DisplayString> + <Expand> + <Item Name="[count]">*((UInt32*)this - 1)</Item> + <ArrayItems> + <Size>*((UInt32*)this - 1)</Size> + <ValuePointer>($T1*)((char*)this + _obj._offset)</ValuePointer> + </ArrayItems> + </Expand> + </Type> + + <Type Name="Slang::FossilizedDictionary<*,*>"> + <DisplayString Condition="_obj._offset == 0">{{ count = 0 }}</DisplayString> + <DisplayString Condition="_obj._offset != 0">{{ count = {*((UInt32*)this - 1)} }}</DisplayString> + <Expand> + <Item Name="[count]">*((UInt32*)this - 1)</Item> + <ArrayItems> + <Size>*((UInt32*)this - 1)</Size> + <!-- + <ValuePointer>($T1*)((char*)this + _elements._offset)</ValuePointer> + --> + <ValuePointer>(Slang::KeyValuePair<$T1,$T2> *) ((char*)this + _obj._offset)</ValuePointer> + <!-- + <ValuePointer>(Slang::KeyValuePair<$T1,$T2>*)((char*)this + _elements._offset)</ValuePointer> + --> + </ArrayItems> + </Expand> + </Type> + </AutoVisualizer> |

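The `slang-serialize.h` hunks above change the serializer callback from `(valuePtr, userData)` to `(valuePtr, impl, context)`, and make the serializer handle carry both an implementation pointer and a context pointer. The following is a minimal, self-contained sketch of that pattern, not the Slang code itself. `ToyImpl`, `ToySerializer`, `NameTable`, `Node`, `readContents`, `contentsCallback`, `deferContents`, and `runDemo` are all hypothetical names invented for illustration. The toy implementation only models deferred callbacks and does not restore a read cursor as `RIFFSerialReader::_flush()` does.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an `ISerializerImpl`-like type. It only
// knows about type-erased callbacks and an opaque context pointer.
struct ToyImpl
{
    using Callback = void (*)(void* valuePtr, void* impl, void* context);

    struct Deferred
    {
        void* valuePtr;
        Callback callback;
        void* context;
    };

    std::vector<Deferred> pending;

    // Queue the callback instead of running it immediately.
    void handleDeferredObjectContents(void* valuePtr, Callback callback, void* context)
    {
        pending.push_back({valuePtr, callback, context});
    }

    // Run pending callbacks, passing `this` as the `impl` argument so
    // the callback can rebuild a typed serializer handle.
    void flush()
    {
        while (!pending.empty())
        {
            Deferred action = pending.back();
            pending.pop_back();
            action.callback(action.valuePtr, this, action.context);
        }
    }
};

// Hypothetical two-pointer handle: one pointer to the implementation,
// one to whatever extra context the serialized types need.
template<typename Impl, typename Context>
struct ToySerializer
{
    Impl* impl = nullptr;
    Context* context = nullptr;
};

// Hypothetical context type: a table used to resolve names while reading.
struct NameTable
{
    std::vector<std::string> names;
};

struct Node
{
    std::size_t nameIndex = 0;
    std::string resolvedName;
};

// Type-specific logic that needs the context to do its work.
void readContents(ToySerializer<ToyImpl, NameTable> const& serializer, Node* node)
{
    node->resolvedName = serializer.context->names[node->nameIndex];
}

// Trampoline: recover the typed handle from the two `void*` arguments.
template<typename Impl, typename Context, typename T>
void contentsCallback(void* valuePtr, void* impl, void* context)
{
    ToySerializer<Impl, Context> serializer{
        static_cast<Impl*>(impl),
        static_cast<Context*>(context)};
    readContents(serializer, static_cast<T*>(valuePtr));
}

// Only the context travels with the deferred action; the implementation
// supplies itself when it eventually invokes the callback.
template<typename Impl, typename Context, typename T>
void deferContents(ToySerializer<Impl, Context> const& serializer, T* value)
{
    serializer.impl->handleDeferredObjectContents(
        value,
        contentsCallback<Impl, Context, T>,
        serializer.context);
}

std::string runDemo()
{
    ToyImpl impl;
    NameTable table;
    table.names.push_back("float");
    table.names.push_back("int");

    Node node;
    node.nameIndex = 1;

    ToySerializer<ToyImpl, NameTable> serializer{&impl, &table};
    deferContents(serializer, &node);
    assert(node.resolvedName.empty()); // the callback has not run yet

    impl.flush();
    return node.resolvedName;
}
```

In this sketch the slot that previously held the implementation pointer (as `userData`) is free to hold the context, because the implementation passes itself as the second callback argument. That appears to be the motivation for the three-argument `Callback` in the diff, though the commit message does not state it explicitly.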