From a3fd4e2bc40cfc77db953b14744c30e7a18e7c1d Mon Sep 17 00:00:00 2001
From: Tim Foley
Date: Fri, 15 Feb 2019 09:08:19 -0800
Subject: Split front- and back-ends (#846)

* Split front- and back-ends

This change is a major refactor of several of the types that provide the behind-the-scenes implementation of the public C API. The goal of this refactor is primarily to allow for future API services that let the user operate both the front- and back-ends of the compiler in a more complex fashion.

For example, a user should be able to compile a bunch of source code into modules, look up types, functions, etc. in those modules, specialize generic types/functions to the types they've looked up, and then finally request target code to be generated for specialized entry points. The back-end code generation they trigger should re-use the front-end compilation work (parsing, semantic checking, IR generation) that was already performed.

The most visible change is that `CompileRequest` has been split up into several smaller types that take responsibility for parts of what it did:

* The `Linkage` type owns the storage for `import`ed modules, as well as the `TargetRequest`s that represent code-generation targets. The intention is that an application could use a single `Linkage` for the duration of its runtime (so long as it was okay with the memory usage), so that each `import`ed module only gets loaded once. For now, this type needs to manage the search paths, file system, and source manager, because of its responsibility for loading files.

* A `FrontEndCompileRequest` owns the stuff related to parsing, semantic checking, and initial IR generation. This most notably includes the `TranslationUnitRequest`s and the `FrontEndEntryPointRequest`s (which used to be just `EntryPointRequest`s). Its main job is to produce AST and IR modules for each translation unit, and to find and validate the entry points.

  The front-end request does *not* interact with generic arguments for global or entry-point generic parameters.

* The main output of both `import` operations and front-end translation units is the `Module` type, which is just a simple container for both the AST module (to service the reflection/layout APIs, and also for semantic checking of code that `import`s the module) and the IR module (for linking and code generation). This type captures the commonalities between the old `LoadedModule` (which is now just an alias for `Module`) and `TranslationUnitRequest` (which now owns a `Module`).

* The secondary output of front-end compilation is a `Program`, which comprises a list of referenced `Module`s and validated `EntryPoint`s that will be used together. Layout and code generation both need a `Program` to tell them what modules and entry points will be used together (we don't want to just code-gen everything that has ever been loaded into the linkage). The `Program`s created by the front-end do not include generic arguments, so they may provide incomplete layout information and/or be unsuitable for code generation.

* A `BackEndCompileRequest` owns stuff related to turning a `Program` into output kernels for the targets of a `Linkage`. Most of the data it owns beyond the `Program` to be compiled is minor, so this is a good candidate for demotion from a heap-allocated object to just a `struct` of options that gets passed around.

* The `CompileRequestBase` type is an attempt to wrap up the common functionality of both front-end and back-end compile requests. Most of it is just exposing the availability of a linkage and `DiagnosticSink`, so this type is a good candidate for subsequent removal. The main interesting thing it has is the flags related to dumping and validation of IR, so there is probably a good refactoring still to be made around deciding how options should be handled going forward.

* Behind the scenes, the `Program` type is set up to handle some level of on-line compilation and layout work. The `Program` knows the `Linkage` it belongs to, and allows for a `TargetProgram` to be looked up based on a specific `TargetRequest`. A `TargetProgram` then allows layout information and compiled kernel code to be asked for on-demand, in order to support eventual "live" compilation scenarios.

* The `EndToEndCompileRequest` type is a composition/coordination type that replaces the old `CompileRequest` in a way that uses the services of the various other types. It owns a few pieces of state that only make sense in the context of an end-to-end compile (e.g., there is really no way to "pass through" code when the front- and back-ends are run separately) or a command-line compile (everything to do with specifying output paths for files is really just for the benefit of `slangc`, and might even be moved there over time).

* One important detail is that the `EndToEndCompileRequest` owns all of the string-based generic arguments for both global and entry-point generic parameters. The logic in `check.cpp` for dealing with those arguments has been heavily refactored to separate out the parsing steps that are specific to end-to-end compilation with string-based type arguments, and the semantic checking steps that result in a specialized `Program` (which can be exposed through new APIs that aren't tied to end-to-end compilation).

It is perhaps not surprising that this change had a lot of consequences, so I'll briefly run over some of the main categories of changes required:

* I changed the way that global generic arguments are passed via API (use `spSetGlobalGenericArgs` instead of the generic arguments for `spAddEntryPointEx`, which are now just for entry-point generics), which has been a change that we've needed for a long time. This is technically a breaking API change, although we should have very few client applications that care about it.

* A bunch of places that used to take "big" objects like `CompileRequest` now just take the sub-pieces they care about (e.g., a function might have only needed a `Linkage` and a `DiagnosticSink`). This makes many subroutines or "context" struct types more generally useful, at the cost of taking more parameters.

* In a few cases the conceptually clean separation of the layers breaks down (often for edge-case or compatibility features), and so we may pass along additional objects that are allowed to be null, but are used when present. A big example of this is how the back-end code generation routines accept an `EndToEndCompileRequest` that is optional, and only used to check whether "pass through" compilation is needed. We should probably look into cleaning this kind of logic up over time so that we don't need to violate the apparent separation of phases of compilation.

* In cases where separation of layers was being broken for the sake of GLSL features, I went ahead and ripped them out, since all of that should be dead code anyway.

* In many cases I increased the encapsulation of data in the core types to help track down use sites and make sure they are following invariants better.

* In cases where code was doing, e.g., `context->shared->compileRequest->session->getThing()` I have tried to introduce convenience routines so that the usage site is just `context->getThing()` to improve encapsulation and allow changes to be made more easily going forward.

* The `noteInternalErrorLoc` functionality was moved off of the compile request and into `DiagnosticSink`, since that is the one type you can rely on having around when you want to note an internal error. Going forward, we may want to consider whether (and how) it should reset the counter used for noting locations on internal errors.

* A few APIs now take `DiagnosticSink*` arguments where they didn't before, and as a result some public APIs need to create `DiagnosticSink`s to pass in, before going ahead and ignoring the messages.
In the future there should be variations of these APIs that accept an `ISlangBlob**` parameter for the output.

* fixup: missing include for compilers with accurate template checking (non-VS)

* fixup: review feedback
---
 source/slang/parser.cpp | 250 ++++++------------------------------------------
 1 file changed, 29 insertions(+), 221 deletions(-)

(limited to 'source/slang/parser.cpp')

diff --git a/source/slang/parser.cpp b/source/slang/parser.cpp
index e2085eb7d..3abc47ede 100644
--- a/source/slang/parser.cpp
+++ b/source/slang/parser.cpp
@@ -8,7 +8,7 @@
 
 namespace Slang
 {
-    // Pre-declare
+    // pre-declare
     static Name* getName(Parser* parser, String const& text);
 
     // Helper class useful to build a list of modifiers.
@@ -79,7 +79,11 @@ namespace Slang
     class Parser
     {
     public:
-        TranslationUnitRequest* translationUnit;
+        NamePool* namePool;
+        SourceLanguage sourceLanguage;
+
+        NamePool* getNamePool() { return namePool; }
+        SourceLanguage getSourceLanguage() { return sourceLanguage; }
 
         int anonymousCounter = 0;
 
@@ -124,27 +128,26 @@ namespace Slang
             currentScope = currentScope->parent;
         }
         Parser(
+            Session* session,
             TokenSpan const& _tokens,
             DiagnosticSink * sink,
             RefPtr const& outerScope)
             : tokenReader(_tokens)
             , sink(sink)
             , outerScope(outerScope)
+            , m_session(session)
         {}
         Parser(const Parser & other) = default;
 
-        Session* getSession()
-        {
-            return translationUnit->compileRequest->mSession;
-        }
-        RefPtr Parse();
+        Session* m_session = nullptr;
+        Session* getSession() { return m_session; }
+
         Token ReadToken();
         Token ReadToken(TokenType type);
         Token ReadToken(const char * string);
         bool LookAheadToken(TokenType type, int offset = 0);
         bool LookAheadToken(const char * string, int offset = 0);
         void parseSourceFile(ModuleDecl* program);
-        RefPtr ParseProgram();
         RefPtr ParseStruct();
         RefPtr ParseClass();
         RefPtr ParseStatement();
@@ -578,11 +581,6 @@ namespace Slang
         return false;
     }
 
-    RefPtr Parser::Parse()
-    {
-        return ParseProgram();
-    }
-
     RefPtr ParseTypeDef(Parser* parser, void* /*userData*/)
     {
         RefPtr typeDefDecl = new TypeDefDecl();
@@ -694,7 +692,7 @@ namespace Slang
         Token token(TokenType::Identifier, scopedIdentifier, scopedIdSourceLoc);
 
         // Get the name pool
-        auto namePool = parser->translationUnit->compileRequest->getNamePool();
+        auto namePool = parser->getNamePool();
 
         // Since it's an Identifier have to set the name.
         token.ptrValue = namePool->getName(token.Content);
@@ -910,7 +908,7 @@ namespace Slang
 
     static Name* getName(Parser* parser, String const& text)
     {
-        return parser->translationUnit->compileRequest->getNamePool()->getName(text);
+        return parser->getNamePool()->getName(text);
     }
 
     static NameLoc expectIdentifier(Parser* parser)
@@ -1859,7 +1857,7 @@ namespace Slang
         }
 
         // GLSL allows `[]` directly in a type specifier
-        if (parser->translationUnit->sourceLanguage == SourceLanguage::GLSL)
+        if (parser->getSourceLanguage() == SourceLanguage::GLSL)
         {
             typeExpr = parsePostfixTypeSuffix(parser, typeExpr);
         }
@@ -1929,7 +1927,7 @@ namespace Slang
             // Just as a safety net, only apply this logic for
             // a file that is being passed in as "true" Slang code.
             //
-            if(parser->translationUnit->sourceLanguage == SourceLanguage::Slang)
+            if(parser->getSourceLanguage() == SourceLanguage::Slang)
             {
                 if(typeSpec.decl)
                 {
@@ -2313,171 +2311,6 @@ namespace Slang
         return ParseHLSLBufferDecl(parser, "TextureBuffer");
     }
 
-    static void removeModifier(
-        Modifiers& modifiers,
-        RefPtr modifier)
-    {
-        RefPtr* link = &modifiers.first;
-        while (*link)
-        {
-            if (*link == modifier)
-            {
-                *link = (*link)->next;
-                return;
-            }
-
-            link = &(*link)->next;
-        }
-    }
-
-    static RefPtr parseGLSLBlockDecl(
-        Parser* parser,
-        Modifiers& modifiers)
-    {
-        // An GLSL block like this:
-        //
-        //     uniform Foo { int a; float b; } foo;
-        //
-        // is treated as syntax sugar for a type declaration
-        // and then a global variable declaration using that type:
-        //
-        //     struct $anonymous { int a; float b; };
-        //     Block<$anonymous> foo;
-        //
-        // where `$anonymous` is a fresh name.
-        //
-        // If a "local name" like `foo` is not given, then
-        // we make the declaration "transparent" so that lookup
-        // will see through it to the members inside.
-
-
-        SourceLoc pos = parser->tokenReader.PeekLoc();
-
-        // The initial name before the `{` is only supposed
-        // to be made visible to reflection
-        auto reflectionNameToken = parser->ReadToken(TokenType::Identifier);
-
-        // Look at the qualifiers present on the block to decide what kind
-        // of block we are looking at. Also *remove* those qualifiers so
-        // that they don't interfere with downstream work.
-        String blockWrapperTypeName;
-        if( auto uniformMod = modifiers.findModifier() )
-        {
-            removeModifier(modifiers, uniformMod);
-            blockWrapperTypeName = "ConstantBuffer";
-        }
-        else if( auto inMod = modifiers.findModifier() )
-        {
-            removeModifier(modifiers, inMod);
-            blockWrapperTypeName = "__GLSLInputParameterGroup";
-        }
-        else if( auto outMod = modifiers.findModifier() )
-        {
-            removeModifier(modifiers, outMod);
-            blockWrapperTypeName = "__GLSLOutputParameterGroup";
-        }
-        else if( auto bufferMod = modifiers.findModifier() )
-        {
-            removeModifier(modifiers, bufferMod);
-            blockWrapperTypeName = "__GLSLShaderStorageBuffer";
-        }
-        else
-        {
-            // Unknown case: just map to a constant buffer and hope for the best
-            blockWrapperTypeName = "ConstantBuffer";
-        }
-
-        // We are going to represent each buffer as a pair of declarations.
-        // The first is a type declaration that holds all the members, while
-        // the second is a variable declaration that uses the buffer type.
-        RefPtr blockDataTypeDecl = new StructDecl();
-        RefPtr blockVarDecl = new VarDecl();
-
-        addModifier(blockDataTypeDecl, new ImplicitParameterGroupElementTypeModifier());
-        addModifier(blockVarDecl, new ImplicitParameterGroupVariableModifier());
-
-        // Attach the reflection name to the block so we can use it
-        auto reflectionNameModifier = new ParameterGroupReflectionName();
-        reflectionNameModifier->nameAndLoc = NameLoc(reflectionNameToken);
-        addModifier(blockVarDecl, reflectionNameModifier);
-
-        // Both declarations will have a location that points to the name
-        parser->FillPosition(blockDataTypeDecl.Ptr());
-        parser->FillPosition(blockVarDecl.Ptr());
-
-        // Generate a unique name for the data type
-        blockDataTypeDecl->nameAndLoc.name = generateName(parser, "ParameterGroup_" + String(reflectionNameToken.Content));
-
-        // TODO(tfoley): We end up constructing unchecked syntax here that
-        // is expected to type check into the right form, but it might be
-        // cleaner to have a more explicit desugaring pass where we parse
-        // these constructs directly into the AST and *then* desugar them.
-
-        // Construct a type expression to reference the buffer data type
-        auto blockDataTypeExpr = new VarExpr();
-        blockDataTypeExpr->loc = blockDataTypeDecl->loc;
-        blockDataTypeExpr->name = blockDataTypeDecl->getName();
-        blockDataTypeExpr->scope = parser->currentScope.Ptr();
-
-        // Construct a type exrpession to reference the type constructor
-        auto blockWrapperTypeExpr = new VarExpr();
-        blockWrapperTypeExpr->loc = pos;
-        blockWrapperTypeExpr->name = getName(parser, blockWrapperTypeName);
-        // Always need to look this up in the outer scope,
-        // so that it won't collide with, e.g., a local variable called `ConstantBuffer`
-        blockWrapperTypeExpr->scope = parser->outerScope;
-
-        // Construct a type expression that represents the type for the variable,
-        // which is the wrapper type applied to the data type
-        auto blockVarTypeExpr = new GenericAppExpr();
-        blockVarTypeExpr->loc = blockVarDecl->loc;
-        blockVarTypeExpr->FunctionExpr = blockWrapperTypeExpr;
-        blockVarTypeExpr->Arguments.Add(blockDataTypeExpr);
-
-        blockVarDecl->type.exp = blockVarTypeExpr;
-
-        // The declarations in the body belong to the data type.
-        parseAggTypeDeclBody(parser, blockDataTypeDecl.Ptr());
-
-        if( parser->LookAheadToken(TokenType::Identifier) )
-        {
-            // The user gave an explicit name to the block,
-            // so we need to use that as our variable name
-            blockVarDecl->nameAndLoc = NameLoc(parser->ReadToken(TokenType::Identifier));
-
-            // TODO: in this case we make actually have a more complex
-            // declarator, including `[]` brackets.
-        }
-        else
-        {
-            // synthesize a dummy name
-            blockVarDecl->nameAndLoc.name = generateName(parser, "parameterGroup_" + String(reflectionNameToken.Content));
-
-            // Otherwise we have a transparent declaration, similar
-            // to an HLSL `cbuffer`
-            auto transparentModifier = new TransparentModifier();
-            transparentModifier->loc = pos;
-            addModifier(blockVarDecl, transparentModifier);
-        }
-
-        // Expect a trailing `;`
-        parser->ReadToken(TokenType::Semicolon);
-
-        // Because we are constructing two declarations, we have a thorny
-        // issue that were are only supposed to return one.
-        // For now we handle this by adding the type declaration to
-        // the current scope manually, and then returning the variable
-        // declaration.
-        //
-        // Note: this means that any modifiers that have already been parsed
-        // will get attached to the variable declaration, not the type.
-        // There might be cases where we need to shuffle things around.
-
-        AddMember(parser->currentScope, blockDataTypeDecl);
-
-        return blockVarDecl;
-    }
-
     static void parseOptionalInheritanceClause(Parser* parser, AggTypeDeclBase* decl)
     {
         if (AdvanceIf(parser, TokenType::Colon))
@@ -3020,27 +2853,8 @@ namespace Slang
             //
             // - A keyword-based declaration (e.g., `cbuffer ...`)
             // - The beginning of a type in a declarator-based declaration (e.g., `int ...`)
-            // - A GLSL block declaration (e.g., `uniform Foo { ... }`)
-
-            // Let's deal with the GLSL block case first. This is something like:
-            //
-            //     uniform Foo { ... };
-            //
-            // The `uniform` keyword has already been parsed as a modifier,
-            // so the identifier we are looking at is `Foo`. If the token
-            // after that is `{`, we assume this is a block.
-            //
-            // Of course, we only want to allow this syntax when parsing GLSL...
-            if (parser->translationUnit->sourceLanguage == SourceLanguage::GLSL)
-            {
-                if( parser->LookAheadToken(TokenType::LBrace, 1) )
-                {
-                    decl = parseGLSLBlockDecl(parser, modifiers);
-                    break;
-                }
-            }
 
-            // Next we will check whether we can use the identifier token
+            // First we will check whether we can use the identifier token
             // as a declaration keyword and parse a declaration using
             // its associated callback:
             RefPtr parsedDecl;
@@ -3184,15 +2998,6 @@ namespace Slang
         currentScope = nullptr;
     }
 
-    RefPtr Parser::ParseProgram()
-    {
-        RefPtr program = new ModuleDecl();
-
-        parseSourceFile(program.Ptr());
-
-        return program;
-    }
-
     RefPtr Parser::ParseStruct()
     {
         RefPtr rs = new StructDecl();
@@ -3591,7 +3396,7 @@ namespace Slang
         // parsing HLSL code.
         //
 
-        bool brokenScoping = translationUnit->sourceLanguage == SourceLanguage::HLSL;
+        bool brokenScoping = getSourceLanguage() == SourceLanguage::HLSL;
 
         // We will create a distinct syntax node class for the unscoped
         // case, just so that we can correctly handle it in downstream
@@ -4439,14 +4244,18 @@ namespace Slang
         return parsePrefixExpr(this);
     }
 
-    RefPtr parseTypeFromSourceFile(TranslationUnitRequest* translationUnit,
+    RefPtr parseTypeFromSourceFile(
+        Session* session,
         TokenSpan const& tokens,
         DiagnosticSink* sink,
-        RefPtr const& outerScope)
+        RefPtr const& outerScope,
+        NamePool* namePool,
+        SourceLanguage sourceLanguage)
     {
-        Parser parser(tokens, sink, outerScope);
-        parser.translationUnit = translationUnit;
+        Parser parser(session, tokens, sink, outerScope);
         parser.currentScope = outerScope;
+        parser.namePool = namePool;
+        parser.sourceLanguage = sourceLanguage;
         return parser.ParseType();
     }
 
@@ -4457,12 +4266,11 @@ namespace Slang
         DiagnosticSink* sink,
         RefPtr const& outerScope)
     {
-        Parser parser(tokens, sink, outerScope);
-
-        parser.translationUnit = translationUnit;
-
+        Parser parser(translationUnit->getSession(), tokens, sink, outerScope);
+        parser.namePool = translationUnit->getNamePool();
+        parser.sourceLanguage = translationUnit->sourceLanguage;
 
-        return parser.parseSourceFile(translationUnit->SyntaxNode.Ptr());
+        return parser.parseSourceFile(translationUnit->getModuleDecl());
     }
 
     static void addBuiltinSyntaxImpl(
-- 
cgit v1.2.3
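
The main pattern in this patch is that `Parser` stops holding a `TranslationUnitRequest*` and instead holds only the narrow pieces it reads (a name pool and a source language), with call sites going through accessors. The following is a minimal, self-contained sketch of that pattern, not the Slang implementation: `Name`, `NamePool`, `SourceLanguage`, and `Parser` here are simplified stand-ins written for illustration, and `usesBrokenScoping` is a hypothetical helper that only mirrors the `brokenScoping` check in the diff.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for illustration; these are NOT the real Slang types.
enum class SourceLanguage { Unknown, Slang, HLSL, GLSL };

struct Name { std::string text; };

// Interns strings so that equal text always maps to the same Name object.
class NamePool {
public:
    Name* getName(std::string const& text) {
        auto it = names.find(text);
        if (it == names.end())
            it = names.emplace(text, Name{text}).first;
        // std::unordered_map keeps element addresses stable across rehashes.
        return &it->second;
    }
private:
    std::unordered_map<std::string, Name> names;
};

// The parser depends only on what it actually uses, rather than on a
// larger request object that it would have to reach through.
class Parser {
public:
    Parser(NamePool* namePool, SourceLanguage sourceLanguage)
        : namePool(namePool), sourceLanguage(sourceLanguage) {}

    NamePool* getNamePool() { return namePool; }
    SourceLanguage getSourceLanguage() { return sourceLanguage; }

    // Hypothetical helper mirroring the HLSL-only scoping check in the diff.
    bool usesBrokenScoping() { return getSourceLanguage() == SourceLanguage::HLSL; }

private:
    NamePool* namePool;
    SourceLanguage sourceLanguage;
};

// Call sites use the accessor instead of a chain of owning objects.
Name* getName(Parser* parser, std::string const& text) {
    return parser->getNamePool()->getName(text);
}
```

Under these assumptions, a caller that has only a name pool and a language tag can construct a parser directly, which is what lets `parseTypeFromSourceFile` in the diff drop its `TranslationUnitRequest*` parameter.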