<feed xmlns='http://www.w3.org/2005/Atom'>
<title>slang.git/source/slang/slang-global-session.cpp, branch master</title>
<subtitle>Making it easier to work with shaders</subtitle>
<id>https://git.yummers.dev/slang.git/atom?h=master</id>
<link rel='self' href='https://git.yummers.dev/slang.git/atom?h=master'/>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/'/>
<updated>2025-08-06T19:48:02+00:00</updated>
<entry>
<title>A new approach to AST node deduplication (#8072)</title>
<updated>2025-08-06T19:48:02+00:00</updated>
<author>
<name>Theresa Foley</name>
<email>10618364+tangent-vector@users.noreply.github.com</email>
</author>
<published>2025-08-06T19:48:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=352576546bf192b06e425e922d305f0a7d0845fb'/>
<id>urn:sha1:352576546bf192b06e425e922d305f0a7d0845fb</id>
<content type='text'>
* A new approach to AST node deduplication

Background
----------

The Slang compiler currently relies on the idea that AST nodes derived from `Val` are always deduplicated based on their opcode and operands.
Deduplication requires caching, and we thus have to determine the right scope at which to allocate and cache different `Val`s.

We need to ensure that `Val`s that refer to declarations/nodes in a specific compilation session/linkage are not cached in a scope that will outlive that session/linkage (or else the cache will contain garbage pointers).
Conversely, we also need to ensure that any `Val`s that are referred to by something like a builtin module's AST will remain alive at least as long as that module/AST, and that every compilation session/linkage that refers to that module will find that `Val` cached if they try to create an equivalent one.

The existing approach to deduplication has some subtleties:

* A builtin module like the core module needs to be fully loaded (using the `ASTBuilder` for the global session) before any other compilation session might construct `Val`s referring to nodes in the AST of that module.

* The various declarations/types/values cached on the `SharedASTBuilder` need to be created before any user-defined compilation occurs, to ensure that all of the relevant `Val`s are created on the global session's AST builder (see `Session::finalizeSharedASTBuilder`).

* Related to all of the above: the deduplication cache on a non-top-level `ASTBuilder` is initialized by copying the cache from the top-level (global session) `ASTBuilder` on creation. Any subsequent allocations on the global-session `ASTBuilder` will not be visible to the linkage-level `ASTBuilder`. This led to weird things like the *options parsing* logic for `-load-core-module` reaching in and performing a manual copy of the cache from the global session to a linkage to try to restore the invariants.

* Notably, the deduplication logic doesn't actually care if two different linkages create distinct `Val`s that represent the same thing, so long as nothing in the AST of a builtin module will refer to that thing. E.g., if the builtin modules never mention `Ptr&lt;String&gt;`, then two different linkages that both refer to that type will likely construct distinct `Val`s to represent it. Thus the deduplication is not as complete as some might assume.

The subtle ordering considerations when creating `Val`s related to builtin modules has proved to be a sticking point for implementing on-demand deserialization of the builtin modules (in order to improve startup times).

Overview
--------

This change implements a new approach that hopefully makes it easier to ensure correctness, even in the case where AST nodes for builtin modules and `Val`s that refer to those nodes sometimes get created after other compilation has been performed.
The key points of the new approach are:

* `ASTBuilder`s are now explicitly (rather than just implicitly) linked into a hierarchy. Currently the compiler codebase will only create a two-level hierarchy: the `ASTBuilder` for a global session is the parent to all of the `ASTBuilder`s created for `Linkage`s. The expectation is that the code can and will generalize to more levels of nesting.

* When a request is made to create a `Val` (or find it in a cache), there is a single `ASTBuilder` that is determined to be responsible for owning that `Val` for its lifetime. The logic involved is comparable to the way that the `IRBuilder` decides where to insert a "hoistable" (deduplicated) instruction, with the simplification that we are dealing with a tree instead of a CFG.

Details
-------

* `Session::finalizeASTBuilder()` is gone, because it is no longer needed

* The logic in the `ASTBuilder` constructor and in `slang-options.cpp` that was manually copying the `m_cachedNodes` from the global-session `ASTBuilder` over to a linkage's `ASTBuilder` is removed, since it is no longer needed.

* Made every AST node (`NodeBase`) carry a pointer to the `ASTBuilder` that created it. This is wasteful, but makes it easier to be sure the implementation will work.

* Introduced a class `RootASTBuilder`, derived from `ASTBuilder` to represent the root of a given hierarchy of AST builders.

* Every non-root `ASTBuilder` is now constructed with a pointer to its parent builder.

* Changed it so that instead of allocating a `SharedASTBuilder` and then passing it in to create one or more `ASTBuilder`s, the shared AST builder state is more of an implementation detail of the `ASTBuilder` type, and is automatically allocated behind the scenes as part of creating an `ASTBuilder`.

* The inline (defined in header) `ASTBuilder::_getOrCreateImpl()` now just does a first-pass check for an existing cached `Val` and, if it doesn't find one, delegates to `ASTBuilder::_getOrCreateImplSlowPath()`, which encapsulates the logic for the cache-miss case (even if that logic is just two lines).

* The `ASTBuilder::_findAppropriateASTBuilderForVal()` method inspects a `ValNodeDesc` to determine the "deepest" of the AST builders among its operands (falling back to the root AST builder if there are no relevant operands).

* The `ASTBuilder::_getOrCreateValDirectly()` is intended for use when the correct AST builder to use for caching/allocation has already been identified.

* Moved the caching of generic arguments for "default substitutions" out of `ASTBuilder` and onto `GenericDecl` itself. Note that the naming of the old field (`m_cachedGenericDefaultArgs`) was unclear about the fact that this is related to "default substitutions" (where each generic parameter is fed an argument that refers to the parameter itself), and has *nothing* to do with any default argument values that might be set on the generic parameters.

* Changed the global session (`Session`) so that instead of storing pointers to both the root/builtin `ASTBuilder` *and* the corresponding `SharedASTBuilder`, it just retains a pointer to the root `ASTBuilder`. This is consistent with the move toward making the `SharedASTBuilder` just an implementation detail.

Possible Future Work
--------------------

* In theory this change should unblock on-demand AST deserialization, so it would be good to get back to those changes once this new approach lands.

* Many AST node subclasses end up storing their own `ASTBuilder` pointers, which are likely now redundant with the one in `NodeBase`. These should be eliminated where possible.

* A lot of code for AST-related manipulations has been changed over time to require an `ASTBuilder` to be passed in, but at this point such a builder should always be derive-able so long as the operation has at least one non-null AST node available to it.

* Most of the accessors that are currently on `SharedASTBuilder` can/should be migrated to just be on `ASTBuilder`, where they can use the data from the `SharedASTBuilder` as part of their implementation. Ideally most code should only nee to interface with `ASTBuilder`s directly, and will never need to talk to the `SharedASTBuilder`.

* This change cleaned up some of the ownership hierarchy, and made it so that the global session only retains a pointer to the root AST builder (and not the shared state as well). A logical follow-on for that would be to make the `Linkage` more properly own (and thus allocate) its own AST builder (and other related objects), and have the global session only store a pointer to the root linkage (and not point to the various sub-objects directly).

* The `Linkage` and `SourceManager` types have their own form of parent/child hierarchy (restricted to two levels), and could be generalized in a way that is similar to what this change does for `ASTBuilder`. Support for a hierarchy of `Linkage`s could be a powerful tool for end users, if we expose it in the right way.

* format code

---------

Co-authored-by: slangbot &lt;186143334+slangbot@users.noreply.github.com&gt;</content>
</entry>
<entry>
<title>Organize code better by splitting some big files (#7890)</title>
<updated>2025-07-24T19:59:58+00:00</updated>
<author>
<name>Theresa Foley</name>
<email>10618364+tangent-vector@users.noreply.github.com</email>
</author>
<published>2025-07-24T19:59:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=8ccd495d5eaa82cb831378c28dd190e657b6c999'/>
<id>urn:sha1:8ccd495d5eaa82cb831378c28dd190e657b6c999</id>
<content type='text'>
* Organize code better by splitting some big files

The basic change here is that the majority of the declarations in `slang-compiler.h` have been split out into a set of smaller and more focused files.
As a result, the implement of those declarations have been moved from `slang-compiler.cpp` and `slang.cpp` over to those new files when the proper home for code is obvious.

I have tried as much as possible to *not* make any edits to the code along the way, and just copy-paste declarations from one place to another as-is.
The exceptions I am aware of are:

* In some cases a function that used to be file-scope `static` was used by code that landed in two or more different `.cpp` files. In these cases, I changed the function to be non-`static` (removing the `_` prefix from its name, if it had one, per our naming conventions), and put a declaration for the function into the most appropriate header I could identify.

* I added a few comments in places where I saw ugly or unfortunate things in the code I was moving, and wanted to tag them with `TODO`s so we can hopefully get to them in the fullness of time.

* I added top-level comments to each of the new `.h` files that was introduced to try to explain the logic for what goes into that file.

* In cases where one of the new header files mostly existed to declare a single type, I sometimes added more detail to the doc comment on that type, to better explain the type and its role in the compiler (this is text that otherwise might have gone into the comment at the top leve lof the file, but I figured that the doc comment would have higher discoverability).

I expect that the most contentious choice here is that the `Session` class lands in `slang-global-session.h` while `slang-session.h` holds the `Linkage` class.
The names used in this change are consistent with how the relevant concepts in the public Slang API are named, and are consistent with how we *intend* to rename the classes themselves in time.

* format code

* fixup

---------

Co-authored-by: slangbot &lt;186143334+slangbot@users.noreply.github.com&gt;</content>
</entry>
</feed>
