From 352576546bf192b06e425e922d305f0a7d0845fb Mon Sep 17 00:00:00 2001 From: Theresa Foley <10618364+tangent-vector@users.noreply.github.com> Date: Wed, 6 Aug 2025 12:48:02 -0700 Subject: A new approach to AST node deduplication (#8072) * A new approach to AST node deduplication Background ---------- The Slang compiler currently relies on the idea that AST nodes derived from `Val` are always deduplicated based on their opcode and operands. Deduplication requires caching, and we thus have to determine the right scope at which to allocate and cache different `Val`s. We need to ensure that `Val`s that refer to declarations/nodes in a specific compilation session/linkage are not cached in a scope that will outlive that session/linkage (or else the cache will contain garbage pointers). Conversely, we also need to ensure that any `Val`s that are referred to by something like a builtin module's AST will remain alive at least as long as that module/AST, and that every compilation session/linkage that refers to that module will find that `Val` cached if they try to create an equivalent one. The existing approach to deduplication has some subtleties: * A builtin module like the core module needs to be fully loaded (using the `ASTBuilder` for the global session) before any other compilation session might construct `Val`s referring to nodes in the AST of that module. * The various declarations/types/values cached on the `SharedASTBuilder` need to be created before any user-defined compilation occurs, to ensure that all of the relevant `Val`s are created on the global session's AST builder (see `Session::finalizeSharedASTBuilder`). * Related to all of the above: the deduplication cache on a non-top-level `ASTBuilder` is initialized by copying the cache from the top-level (global session) `ASTBuilder` on creation. Any subsequent allocations on the global-session `ASTBuilder` will not be visible to the linkage-level `ASTBuilder`. This led to weird things like the *options parsing* logic for `-load-core-module` reaching in and performing a manual copy of the cache from the global session to a linkage to try to restore the invariants. * Notably, the deduplication logic doesn't actually care if two different linkages create distinct `Val`s that represent the same thing, so long as nothing in the AST of a builtin module will refer to that thing. E.g., if the builtin modules never mention `Ptr`, then two different linkages that both refer to that type will likely construct distinct `Val`s to represent it. Thus the deduplication is not as complete as some might assume. The subtle ordering considerations when creating `Val`s related to builtin modules has proved to be a sticking point for implementing on-demand deserialization of the builtin modules (in order to improve startup times). Overview -------- This change implements a new approach that hopefully makes it easier to ensure correctness, even in the case where AST nodes for builtin modules and `Val`s that refer to those nodes sometimes get created after other compilation has been performed. The key points of the new approach are: * `ASTBuilder`s are now explicitly (rather than just implicitly) linked into a hierarchy. Currently the compiler codebase will only create a two-level hierarchy: the `ASTBuilder` for a global session is the parent to all of the `ASTBuilder`s created for `Linkage`s. The expectation is that the code can and will generalize to more levels of nesting. * When a request is made to create a `Val` (or find it in a cache), there is a single `ASTBuilder` that is determined to be responsible for owning that `Val` for its lifetime. The logic involved is comparable to the way that the `IRBuilder` decides where to insert a "hoistable" (deduplicated) instruction, with the simplification that we are dealing with a tree instead of a CFG. Details ------- * `Session::finalizeASTBuilder()` is gone, because it is no longer needed * The logic in the `ASTBuilder` constructor and in `slang-options.cpp` that was manually copying the `m_cachedNodes` from the global-session `ASTBuilder` over to a linkage's `ASTBuilder` is removed, since it is no longer needed. * Made every AST node (`NodeBase`) carry a pointer to the `ASTBuilder` that created it. This is wasteful, but makes it easier to be sure the implementation will work. * Introduced a class `RootASTBuilder`, derived from `ASTBuilder` to represent the root of a given hierarchy of AST builders. * Every non-root `ASTBuilder` is now constructed with a pointer to its parent builder. * Changed it so that instead of allocating a `SharedASTBuilder` and then passing it in to create one or more `ASTBuilder`s, the shared AST builder state is more of an implementation detail of the `ASTBuilder` type, and is automatically allocated behind the scenes as part of creating an `ASTBuilder`. * The inline (defined in header) `ASTBuilder::_getOrCreateImpl()` now just does a first-pass check for an existing cached `Val` and, if it doesn't find one, delegates to `ASTBuilder::_getOrCreateImplSlowPath()`, which encapsulates the logic for the cache-miss case (even if that logic is just two lines). * The `ASTBuilder::_findAppropriateASTBuilderForVal()` method inspects a `ValNodeDesc` to determine the "deepest" of the AST builders among its operands (falling back to the root AST builder if there are no relevant operands). * The `ASTBuilder::_getOrCreateValDirectly()` is intended for use when the correct AST builder to use for caching/allocation has already been identified. * Moved the caching of generic arguments for "default substitutions" out of `ASTBuilder` and onto `GenericDecl` itself. Note that the naming of the old field (`m_cachedGenericDefaultArgs`) was unclear about the fact that this is related to "default substitutions" (where each generic parameter is fed an argument that refers to the parameter itself), and has *nothing* to do with any default argument values that might be set on the generic parameters. * Changed the global session (`Session`) so that instead of storing pointers to both the root/builtin `ASTBuilder` *and* the corresponding `SharedASTBuilder`, it just retains a pointer to the root `ASTBuilder`. This is consistent with the move toward making the `SharedASTBuilder` just an implementation detail. Possible Future Work -------------------- * In theory this change should unblock on-demand AST deserialization, so it would be good to get back to those changes once this new approach lands. * Many AST node subclasses end up storing their own `ASTBuilder` pointers, which are likely now redundant with the one in `NodeBase`. These should be eliminated where possible. * A lot of code for AST-related manipulations has been changed over time to require an `ASTBuilder` to be passed in, but at this point such a builder should always be derive-able so long as the operation has at least one non-null AST node available to it. * Most of the accessors that are currently on `SharedASTBuilder` can/should be migrated to just be on `ASTBuilder`, where they can use the data from the `SharedASTBuilder` as part of their implementation. Ideally most code should only nee to interface with `ASTBuilder`s directly, and will never need to talk to the `SharedASTBuilder`. * This change cleaned up some of the ownership hierarchy, and made it so that the global session only retains a pointer to the root AST builder (and not the shared state as well). A logical follow-on for that would be to make the `Linkage` more properly own (and thus allocate) its own AST builder (and other related objects), and have the global session only store a pointer to the root linkage (and not point to the various sub-objects directly). * The `Linkage` and `SourceManager` types have their own form of parent/child hierarchy (restricted to two levels), and could be generalized in a way that is similar to what this change does for `ASTBuilder`. Support for a hierarchy of `Linkage`s could be a powerful tool for end users, if we expose it in the right way. * format code --------- Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com> --- source/slang/slang-ast-builder.cpp | 247 ++++++++++++++++++++++++++++++++----- 1 file changed, 214 insertions(+), 33 deletions(-) (limited to 'source/slang/slang-ast-builder.cpp') diff --git a/source/slang/slang-ast-builder.cpp b/source/slang/slang-ast-builder.cpp index d3e99ee4b..d576ed0fb 100644 --- a/source/slang/slang-ast-builder.cpp +++ b/source/slang/slang-ast-builder.cpp @@ -10,21 +10,16 @@ namespace Slang // !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! SharedASTBuilder !!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -SharedASTBuilder::SharedASTBuilder() {} - -void SharedASTBuilder::init(Session* session) +SharedASTBuilder::SharedASTBuilder(Session* session, RootASTBuilder* rootASTBuilder) { m_namePool = session->getNamePool(); // Save the associated session m_session = session; - // We just want as a place to store allocations of shared types - { - RefPtr astBuilder(new ASTBuilder); - astBuilder->m_sharedASTBuilder = this; - m_astBuilder = astBuilder.detach(); - } + // The root AST builder is the one that owns this `SharedASTBuilder`. + // + m_astBuilder = rootASTBuilder; // Clear the built in types memset(m_builtinTypes, 0, sizeof(m_builtinTypes)); @@ -169,20 +164,6 @@ Type* SharedASTBuilder::getOverloadedType() return m_overloadedType; } -SharedASTBuilder::~SharedASTBuilder() -{ - // Release built in types.. - for (Index i = 0; i < SLANG_COUNT_OF(m_builtinTypes); ++i) - { - m_builtinTypes[i] = nullptr; - } - - if (m_astBuilder) - { - m_astBuilder->releaseReference(); - } -} - void SharedASTBuilder::registerBuiltinDecl(Decl* decl, BuiltinTypeModifier* modifier) { auto type = DeclRefType::create(m_astBuilder, makeDeclRef(decl)); @@ -222,22 +203,32 @@ Decl* SharedASTBuilder::tryFindMagicDecl(const String& name) // !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! ASTBuilder !!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -ASTBuilder::ASTBuilder(SharedASTBuilder* sharedASTBuilder, const String& name) - : m_sharedASTBuilder(sharedASTBuilder) - , m_name(name) - , m_id(sharedASTBuilder->m_id++) - , m_arena(2097152) +/// Default block size of 2MB. +static const size_t kASTBuilderMemoryArenaBlockSize = 2 * 1024 * 1024; + +ASTBuilder::ASTBuilder(ASTBuilder* parent, String const& debugName) + : m_parent(parent), m_name(debugName), m_arena(kASTBuilderMemoryArenaBlockSize) { + SLANG_ASSERT(parent); + auto sharedASTBuilder = parent->getSharedASTBuilder(); SLANG_ASSERT(sharedASTBuilder); - // Copy Val deduplication map over so we don't create duplicate Vals that are already - // existent in the core module. - m_cachedNodes = sharedASTBuilder->getInnerASTBuilder()->m_cachedNodes; + + m_depth = parent->m_depth + 1; + + m_sharedASTBuilder = sharedASTBuilder; + m_id = sharedASTBuilder->m_id++; } ASTBuilder::ASTBuilder() - : m_sharedASTBuilder(nullptr), m_id(-1), m_arena(2097152) + : m_arena(kASTBuilderMemoryArenaBlockSize) { - m_name = "SharedASTBuilder::m_astBuilder"; +} + +RootASTBuilder::RootASTBuilder(Session* globalSession) + : m_sharedASTBuilderStorage(globalSession, this) +{ + m_sharedASTBuilder = &m_sharedASTBuilderStorage; + m_name = "RootASTBuilder"; } ASTBuilder::~ASTBuilder() @@ -250,6 +241,196 @@ ASTBuilder::~ASTBuilder() incrementEpoch(); } +Val* ASTBuilder::_getOrCreateImplSlowPath(ValNodeDesc&& desc) +{ + // The most important thing we need to determine here + // is whether the node described by `desc` would be + // created using the arena of this `ASTBuilder`, or + // one of its ancestors (and if so, which one...). + // + ASTBuilder* astBuilderToUse = _findAppropriateASTBuilderForVal(desc); + + // Once we've identified the right level of the hierarchy, + // we can check the cache at that level and create + // the node if it doesn't already exist. + // + Val* valNode = astBuilderToUse->_getOrCreateValDirectly(std::move(desc)); + + // If the chosen `astBuilderToUse` was `this`, then the + // call to `_getOrCreateValDirectly` will have updated + // `m_cachedNodes` already. + // + // If the node was created using a different builder, + // which is an ancestor than this one (which would mean + // its depth is lower), then we can also update our + // own cache to match. + // + if (astBuilderToUse->m_depth < this->m_depth) + { + // Our approach to caching assumes that we cannot + // mix-and-match AST nodes from builders that aren't + // in some kind of ancestor/descendent relationship. + // Thus, if the builder that was chosen is less deep + // than `this`, we expect that to be because it is + // an ancestor. + // + SLANG_ASSERT(this->isDescendentOf(astBuilderToUse)); + + m_cachedNodes.add(ValKey(valNode), valNode); + } + // + // Note that we do *not* want to update our cache in + // the case where the chosen builder has higher depth + // then `this`, because `this` could outlive the chosen + // builder, and we don't want to be left with + // garbage pointers sitting in the cache. + // + // We also don't consider that case to be an error, + // because it is reasonable for code to do things like + // construct a specialized decl-ref for `Foo` using + // the builder associated with declaration `Foo`, even + // when specializing to a type `Bar` that comes from a + // deeper/child builder. + + return valNode; +} + +ASTBuilder* ASTBuilder::_findAppropriateASTBuilderForVal(ValNodeDesc const& desc) +{ + // AST builders are arranged in a hierarchy, where a child builder + // can see nodes cached in its ancestors, but not vice versa. + // + // We basically want to allocate a given `Val` as far down + // the hierarchy as we can (away from the root), so that the + // lifetime of those allocations can be narrowly scoped. However, + // we also need to ensure that `Val`s are cached far enough + // *up* the hierarchy that deduplication is possible, and that + // we can be sure a `Val` lives at least as long as each of + // its operands. + // + // Our approach to the caching problem relies on a key + // constraint, that the `ASTBuilder` used for a `_getOrCreateImpl()` + // operation and all of the `ASTBuilder`s used to create the + // nodes referenced as operands in the `desc` must be part + // of a single path of parent links in the hierarchy. + // Put another way: for any two `ASTBuilder`s involved in the + // creation of the node or its operands, they must be in some + // kind of ancestor/descendent relationship. + // + // Given this constraint, we can determine that the `Val` should + // be allocated and cached on the *deepest* AST builder + // from among the operands (or on the root AST builder in the + // case where there are no operands). + // + // We thus initialize our variable to the *shallowest* builder, + // which is the one we'll use if there are no operands. + // + ASTBuilder* deepestBuilder = getSharedASTBuilder()->getInnerASTBuilder(); + for (auto const& operand : desc.operands) + { + // We are only interested in operands that reference + // an AST node, so we will skip over all others. + // + switch (operand.kind) + { + default: + continue; + + case ValNodeOperandKind::ASTNode: + case ValNodeOperandKind::ValNode: + break; + } + + // We now know that the operand is represented + // as an AST node, but we need to skip over + // null operands because they aren't relevant + // to picking the right AST builder to use. + // + NodeBase* node = operand.values.nodeOperand; + if (!node) + continue; + + // Once we have an AST node worth looking at, + // we find the AST builder responsible for + // allocating that node. + // + ASTBuilder* nodeBuilder = node->getASTBuilder(); + SLANG_ASSERT(nodeBuilder); + + // The approach we are taking here relies on all + // the AST builders involved being part of a single + // path in the hierarchy, so we will do a minimal + // amount of validation in debug builds to ensure + // that each of the node builders for the operands + // is in some kind of ancestor/descendent relationship + // with the builder being used to make the request. + // + SLANG_ASSERT(nodeBuilder->isDescendentOf(this) || this->isDescendentOf(nodeBuilder)); + + // If the builder we are looking at is deeper than the + // deepest builder we've seen previously, then we update + // our candiate for the deepest builder. + // + if (nodeBuilder->m_depth > deepestBuilder->m_depth) + deepestBuilder = nodeBuilder; + } + + // + // At the end of that loop, we have a maximally-deep builder, + // and because we require all the builders to come from + // a single path in the hierarchy, that builder is also + // uniquely determined (a maximum rather than just maximal). + // + + return deepestBuilder; +} + +bool ASTBuilder::isDescendentOf(ASTBuilder* ancestor) +{ + SLANG_ASSERT(ancestor); + + auto builder = this; + while (builder) + { + if (builder == ancestor) + return true; + builder = builder->m_parent; + } + return false; +} + + +Val* ASTBuilder::_getOrCreateValDirectly(ValNodeDesc&& desc) +{ + // This operation should only be called if `this` + // was determined to be the appropriate AST builder + // to use when allocating/caching a `Val` based on `desc`. + // + SLANG_ASSERT(this == _findAppropriateASTBuilderForVal(desc)); + + // We start by checking the cache. This might have + // already been done as part of `_getOrCreateImpl()`, + // but it is also possible that the `_getOrCreateImpl()` + // call was made on a descendent `ASTBuilder` and its + // cache might not (yet) contain the given node. + // + if (auto found = m_cachedNodes.tryGetValue(desc)) + return *found; + + // If we don't have a cache hit at this level, + // then we just need to create the node and + // update our cache. + // + auto node = as(desc.type.createInstance(this)); + SLANG_ASSERT(node); + for (auto& operand : desc.operands) + node->m_operands.add(operand); + + m_cachedNodes.add(ValKey(node), _Move(node)); + + return node; +} + Index ASTBuilder::getEpoch() { return getSharedASTBuilder()->m_session->m_epochId; -- cgit v1.2.3