| author | Theresa Foley <10618364+tangent-vector@users.noreply.github.com> | 2025-04-17 09:53:37 -0700 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2025-04-17 09:53:37 -0700 |
| commit | 8d1dca337e4b74c4b88a434eb2df5889410aff7c (patch) | |
| tree | 8d729a92b249d90723863264d547e3d4f2fae012 /source/compiler-core | |
| parent | 04db5a95657a8c1ad1db36570eadaeedbea01cbb (diff) | |
Eliminate back-reference in ChildStmt (#6835)
* Eliminate back-reference in ChildStmt
This change is part of a larger effort to improve the code for AST
serialization in the Slang compiler.
Tree structures are understandably easier to serialize than DAGs,
and DAGs are easier than fully general graphs.
The Slang AST nodes form a tree structure... except when they don't.
Among the exceptions to nice tree-structured ASTs are:
1. References to `Decl`s are encoded as pointers to the AST `Decl`
nodes themselves. This can result in cycles in the graph, and
requires care in serialization.
2. Nodes that inherit from `Val` represent, well, *values* instead
of actual pieces of syntax, and as such they are deduplicated so
that identical values will (hopefully) be identical pointers.
This results in a DAG structure for `Val`s, but at least it's not
a general graph (except for cycles that go through a `Decl`).
3. There are some minor cases of DAG-structured sharing that the
parser can introduce to deal with cases when a traditional-style
declaration includes multiple declarators. E.g., given:
```
static int a, b;
```
The resulting `DeclGroup` will include distinct `Decl`s for `a`
and `b`, which will share the `static` modifier through a
`SharedModifiers` node, and the `int` type specifier through a
`SharedTypeExpr` node.
This sharing can be ignored for the purposes of serialization,
since duplicating those parts of the AST has no major downsides.
4. There is the case of `ChildStmt`, used for things like `break`
and `continue`, which stores a direct `Stmt*` to the enclosing
parent statement being targeted. Storing the target is useful so
that IR lowering doesn't need to repeat the work that the semantic
checking logic did to associate each child statement with its parent.
The parent link inside of `ChildStmt` creates a cycle in the AST
`Stmt` hierarchy, since the outer statement contains the inner,
and the inner statement stores a pointer to the outer.
This change eliminates the last of these sources of complication for
AST serialization, by changing the `ChildStmt` type to store an
integer ID for the enclosing statement that it matches to, and having
each `BreakableStmt` (used to represent the outer `switch`, or loop,
or whatever) generate its own unique ID as part of semantic checking.
Note: if necessary, it is reasonable for the outer statement to have
its unique ID generated as part of parsing, rather than semantic
checking.
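As a rough illustration of the ID-based scheme described above, here is a
minimal sketch. All of the type and field names below (`CheckContext`,
`uniqueID`, `targetOuterStmtID`, `checkStmt`) are simplified stand-ins
chosen for this example and are not the actual Slang declarations.
```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical, simplified stand-ins for the AST node types.
struct Stmt
{
    virtual ~Stmt() = default;
};

constexpr int kInvalidStmtID = 0;

// An outer statement (loop, `switch`, ...) that `break`/`continue` can target.
struct BreakableStmt : Stmt
{
    int uniqueID = kInvalidStmtID;            // assigned during checking
    std::vector<std::unique_ptr<Stmt>> body;  // owned children: a plain tree edge
};

// A `break` or `continue`: it records an ID instead of a `Stmt*` to its parent,
// so the statement hierarchy contains no back-pointer.
struct ChildStmt : Stmt
{
    int targetOuterStmtID = kInvalidStmtID;
};

// Assumed checker state: an ID counter and the stack of enclosing statements.
struct CheckContext
{
    int nextID = 1;
    std::vector<BreakableStmt*> outerStmts;
};

void checkStmt(Stmt* stmt, CheckContext& ctx)
{
    if (auto* outer = dynamic_cast<BreakableStmt*>(stmt))
    {
        // Give the outer statement its ID, then check its body with the
        // statement pushed as the innermost enclosing target.
        outer->uniqueID = ctx.nextID++;
        ctx.outerStmts.push_back(outer);
        for (auto& child : outer->body)
            checkStmt(child.get(), ctx);
        ctx.outerStmts.pop_back();
    }
    else if (auto* child = dynamic_cast<ChildStmt*>(stmt))
    {
        // A real compiler would emit a diagnostic here; the sketch asserts.
        assert(!ctx.outerStmts.empty() && "break/continue outside a breakable statement");
        child->targetOuterStmtID = ctx.outerStmts.back()->uniqueID;
    }
}
```
With this shape, a later pass such as IR lowering can match a child
statement to its target by comparing IDs, without redoing the scope lookup.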
* format code
* Change unique ID to be a proper Decl
The fix here is to make the "unique ID" representation be a full
`Decl`-derived AST node, so that it is both allowed to break the
tree-structuring rules cleanly, and it is also trivially guaranteed
to be unique across all loaded ASTs.
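A minimal sketch of the `Decl`-based variant, assuming an ID node that
carries no data and is identified purely by its address. The names
`StmtIDDecl`, `bindToOuter`, and `targets` are hypothetical and only serve
this example; the real node types differ.
```cpp
#include <cassert>
#include <memory>

// Hypothetical, simplified stand-ins for the AST node types.
struct Decl
{
    virtual ~Decl() = default;
};

// An ID node with no payload: two IDs are equal only if they are the same
// object, so IDs from separately loaded ASTs cannot collide.
struct StmtIDDecl : Decl
{
};

struct Stmt
{
    virtual ~Stmt() = default;
};

struct BreakableStmt : Stmt
{
    // The outer statement owns its ID node.
    std::unique_ptr<StmtIDDecl> uniqueID = std::make_unique<StmtIDDecl>();
};

struct ChildStmt : Stmt
{
    // A reference to a Decl rather than a Stmt*: serialization already has
    // to handle Decl references, so no new kind of edge is introduced.
    StmtIDDecl* targetOuterStmtID = nullptr;
};

void bindToOuter(ChildStmt& child, const BreakableStmt& outer)
{
    assert(outer.uniqueID && "outer statement must own an ID node");
    child.targetOuterStmtID = outer.uniqueID.get();
}

bool targets(const ChildStmt& child, const BreakableStmt& outer)
{
    return child.targetOuterStmtID != nullptr &&
           child.targetOuterStmtID == outer.uniqueID.get();
}
```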
* format code
---------
Co-authored-by: slangbot <186143334+slangbot@users.noreply.github.com>
Diffstat (limited to 'source/compiler-core')
0 files changed, 0 insertions, 0 deletions
