slang.git - Making it easier to work with shaders

Age	Commit message (Collapse)	Author
2019-10-24	Strip IR after front-end steps are done (#1092)	Tim Foley
	* Strip IR after front-end steps are done The main feature of this change is to unconditonally strip out the `IRHighLevelDeclDecoration`s in an IR module once the "mandatory" IR passes in the front end have run. This ensures that later IR passes (e.g., code emission) cannot rely on AST-level information to get their job done. Since I was already writing a pass to remove some instructions at the end of the front-end passes, I went ahead and also made the `-obfuscate` flag apply to the front-end IR generation by causing it to strip `IRNameHintDecoration`s while it is doing the other stripping. With this, the main identifying information left in IR modules (other than semantics and entry-point names) is mangled name strings for imported/exported symbols. A few other things got changes along the way: * Removed the `.expected` file for one of the tests, where that file seemingly shouldn't have been checked in at all. * Updated the signature of the DCE pass both so that it doesn't require a back-end compile request (it wasn't using it anyway), and so that it takes some options to decide whether to keep symbols marked `[export(...)]` alive (the front-end wants to keep these, while back-end passes currently need to be able to eliminate them). * Moved the `obfuscateCode` flag from the back-end compile request to the base class shared between front- and back-end requests, and updated the options and repro logic to set both as needed. An obvious improvement in the future would be to have the front- and back-end requests share these settings by referencing a single common object in the end-to-end case, rather than each having their own copy. * Removed logic that was keeping layout instructions alive in DCE, even if they weren't used. This seems to have been a vestige of an intermediate step between AST and IR layout. * fixup: add the new files
2019-10-22	User IR-based layout for all IR steps (#1084)	Tim Foley
	This change builds on previous work that moves toward a more IR-based representation of layout. Those steps added some instructions for representing layout in the IR (initially just proxies for the AST layout objects), and an explicit lowering pass that could build a target-specific IR module that binds parameters and entry points to layout information. This change aims to complete that work, in the sense that the IR representation of layout is now self-contained and does not rely on having pointers back into the AST-level representation. Achieving this requires two main kinds of work: 1. Update any code that used layout information derived from the IR (most notably all the `slang-emit-` code) to use the new IR representation and its accessors. 2. Update any code that constructs* layouts using information derived from the IR to construct IR layouts instead. The biggest new infrastructure feature in this change is support for "attributes" in the IR (I'd welcome feedback on the naming). An attribute can either be thought of like key/value arguments that can be added to certain instructions to encode optional data, or alternatively like a decoration that is referenced as an operand instead of a child. The value of attributes over decorations is that they can affect the hash/identity of an instruction (which decorations can't), while the advantage of decorations is that they can easily be added/removed over the lifetime of an instruction (which attributes can't). We mostly use them here to represent operands that are logically optional. Once attributes are available, the encoding of layout information into the IR is mostly straightforward: * An `IRVarLayout` has a fixed operand for its type layout, and can accept a few different attributes * Zero or more `IRVarOffsetAttr`s that specify the offset of the variable for a given resource kind. These are equivalent to the `VarLayout::ResourceInfo`s at the AST level. * An optional `IRUserSemanticAttr` and `IRSystemValueSemanticAttr` to represent the (possibly derived) semantic of a varying input/output parameter. * An option `IRStageAttr` to represent the known stage for a parameter. * An `IREntryPointLayout` has a var layout for the entry point parameters (logically grouped in to a struct) and another var layout for the result parameter. * There is a small type hierarchy rooted at `IRTypeLayout` where each subtype can add fixed operands and attributes that are expected to appear. It also supports `IRTypeSizeAttr`s that serve a similar role to the `IRVarOffsetAttr`s. * Structure types maintain the mapping of fields to their var layouts using `IRStructFieldLayoutAttr`s. With the encoding in place, most of the changes in category (1) (code that just uses rather than creates layouts) was straightforward. The biggest different beyond name changes was that everything needs to be fetched using accessors instead of bare fields. It would have been possible to stage this commit and make the diffs smaller by first introducing mandatory acessors to the AST layout types. The changes in category (2) were more involved. There were a lot of places in the existing code where a `TypeLayout` or `VarLayout` would be created, and then initialized piecemeal over several lines of code (and sometimes even across functions). Because of the way that layouts need to support many optional properties, it did not seem practical to just have monolithic factory functions that took all the options as arguments, so I instead opted for a builder approach. The builders for `IRVarLayout` and `IREntryPointLayout` are both straightforward, and honestly there is no realy need for a builder for entry point layouts right now, but I was trying to future-proof in case we decidd to add some optional attributes to them. The builders for type layouts are more involved because of the inheritance hierarchy. Each concrete sub-type of type layout needs to define its own builder type that customizes the opcode, operands, and attributes of the final instruction. The refactoring that had to go into this change was a nice excuse to clean up a few ugly warts in the AST layout code that were largely there to support IR use cases. While this change adds a lot of new infrastructure code to the IR, most of the client code has stayed the same or gotten simpler. One annoying wart that remains with this change is the notion of an "offset element type layout" for parameter group types. That idea was added to deal with a legacy feature in the reflection API that we realized was a mistake, but unfortunately having that "offset" layout handy made writing a few other pieces of code simpler so that there are use cases of the feature even in the IR. Removing those uses is do-able, but requires careful refactoring so it is best left to a follow-on change. Another thing that could be considered for a follow-on change is how much information should be specified when constructing a `Builder` for an IR type layout, and how much should be allowed to be specified statefully/piecemeal. It would be nice to force all the required operands to be specified up front, but `IRParameterGroupTypeLayout::Builder` doesn't currently work that way because so much of the client code that needs it involved a lot of stateful setting and would need to be refactored heavily to provide the necessary information up front.
2019-10-17	Initial work on representing layout at IR level (#1079)	Tim Foley
	* Initial work on representing layout at IR level This change starts the process of making the back-end of the compiler independent of the AST-level layout information (`TypeLayout`, `VarLayout`, etc.) so that it instead only relies on layout information that is embedded into IR modules. This brings us incrementally closer to a world in which the back-end could be run without the AST-level structures even existing (e.g., for an application that just wants to ship IR without any AST information for IP protection, while still supporting some amount of linking and specialization). The main parts of the change are: * There is a bunch of incidental churn related to specifying entry points by index instead of the `EntryPoint` object for certain operations. This ends up being a better choice because we can use the index to look up side-band information about the entry point that might not be stored on the `EntryPoint` object itself. In particular... * We expand the `ComponentType` interface to support looking up the mangled name of an entry point by index. In common cases (no generic/interface specialization) this would be the same as asking the `EntryPoint` for its mangled name, but in cases where we have specialized a generic entry point, the mangled name would include speicalization arguments that are only available on the `SpecializedComponentType` that wraps the entry point. This part of the change isn't ideal and there might be a better solution waiting to be invented. Note that we store mangled entry point names as strings rather than using `DeclRef`s because that ensures that the information could be serialized and deserialized without a dependence on the AST. * The `TargetProgram` type (which represents binding a specific `ComponentType` for a shader program to a specific `TargetRequest` that represents the target platform) is expanded to include an `IRModule` that represents layout information, in addition to the AST-level `ProgramLayout` it already contained. We create both of these objects at the same time (on-demand) to simplify the overall flow (so that any code that triggers creation of the AST-level layout will also ensure that the IR-level layout exists). * A bunch of code in the emit passes that was passing down layout-related objects has been eliminated. It appears that most of those objects weren't actually being used, so this is just a cleanup, but it helps ensure that the back-end steps are "clean" and don't depend on the AST-level information. The one big exception here is that the emit logic needs to know the stage for the entry point being emitted (to deal with one wrinkle in translating DXR to VKRT). * A big change (actually introduced by @jsmall-nvidia in a branch that this change copied and then built from) is to introduce some more explicit IR instructions to represent layout information, notably an `IRTypeLayout` and an `IRVarLayout`. For now these objects still reference their AST equivalents, but the separation gives us an incremental path to move information from the AST-level objects over to the IR ones. This work includes logic in `IRBuilder` to construct the IR-level layout objects from the AST-level ones on-demand, so that the existing code paths that try to attach AST-level layout will continue to work for now. * Because layout information is now embedded in the IR, the `slang-ir-link.cpp` logic loses a lot of cases that used to deal with attaching AST-level layout objects to IR-level instructions during the linking process. Instead, the linker now assumes that one (or more) of the input IR modules will have layout information associated with it, and the linker makes sure to copy layout decorations (and the instructions they reference) from the input IR module(s) to the output using its more ordinary mechanisms. * Inside `slang-lower-to-ir.cpp`, we add logic to construct an IR module in a `TargetProgram` that simply references the global shader parameters, entry points, etc. and attaches IR layout decorations to them. This is akin to the existing pass in the same file that constructs IR to represent specialization information, and both of these passes share infrastructure with the main AST->IR lowering pass. Eventually, it is expected that this pass will encompass more of the logic for copying AST-level layout information over to IR-level equivalents. * One small wrinkle with this change was that the output for an HLSL generation test case changed some of its `#line` directives. The old code was actually more inaccurate than the new, so this change just updated the baseline. It also added some logic in the linker to make sure that when an IR instruction has multiple definitions, we try to pick up a source location from any of them, in case the "main" one somehow didn't get a location. * Another small fix was that the key/value map in `StructTypeLayout` for mapping fields/members to their layouts was keyed on `Decl` when it really should have been `VarDeclBase`. This change should in principle be a pure refactoring with no functionality changes, so no new tests were added. It is unfortunately also a change that has a high probability of breaking at least some client code, so we may want to be defensive and mark this with a new major version number (well, a new minor version number since we are pre-`1.0`) to give us some room for releasing hotfixes to the old version if needed. * fixup: infinite recursion bug detected by clang * fixup: remove commented-out code
2019-09-18	Clean up some behavior of operator% (#1060)	Tim Foley
	Work on #1059 The `%` operator in the Slang implementation had several issues, and this change tries to address some of them: * Renamed most occurences of "mod" describing this operator to be "rem" for "remainder" to better match its semantics in HLSL * Split the operator into distinct integer and floating-point variants (`IRem` and `FRem`) to simplify having different codegen for the two * Added floating-point variants of `operator%` and `operator%=` to the stdlib. * Added custom C++ codegen for `kIROp_FRem` such that it maps to the standard C/C++ `remainder()` function * Added custom GLSL codegen so that `kIROp_FRem` maps to the GLSL `mod()` function (which isn't correct...) * Added a test case to confirm that D3D11, D3D12, and CPU targets all agree on the definition of floating-point `%` * Fixed `render-test-tool` to allow a negative integer in a `data=...` specification. This didn't end up being used in the final test, but still seems like a good fix. * Added a customized baseline for the Vulkan flavor of that test to confirm that we are not compiling correctly to SPIR-V just yet Addressing the correctness of the output for GLSL/SPIR-V will have to come as a later change given that the operation we want is not exposed directly by unextended GLSL.
2019-08-22	WIP: CPU compute coverage (#1030)	jsmall-nvidia
	* Add support for '=' when defining a name in test. * Add support for double intrinsics. * Add support for asdouble Add findOrAddInst - used instead of findOrEmitHoistableInst, for nominal instructions. Support cloning of string literals. C++ working on more compute tests. * Constant buffer support in reflection. Fixed debugging into source for generated C++. buffer-layout.slang works. * Added cpu test result. * Remove some commented out code. Comment on next fixes. * Improvements to reflection CPU code. * C++ working with ByteAddressBuffer. * Enabled more compute tests for CPU. * Enabled more compute tests on CPU. Added support for [] style access to a vector. * Enabled more CPU compute tests. * Handling of buffer-type-splitting.slang Named buffers can be paths to resources * Fix some warnings, remove some dead code. * Fix problem with verification of number of operands for asuint/asint as they can have 1 or 3 operands. asdouble takes 2. * Fix handling in MemoryArena around aligned allocations. That _allocateAlignedFromNewBlock assumed the block allocated has the aligment that was requested and so did not correct the start address.
2019-08-08	Revise new COM-lite API (#1007)	Tim Foley
	* Revise new COM-lite API This change revises the "COM-lite" API that was recently introduced to try to streamline it and introduce some missing central/base concepts. The central new abstraction in the API is the notion of a "component type," which is a unit of shader code composition. A component type can have: * IR code for some number of functions/types/etc. * Zero or more global shader parameters * Zero or more "entry point" functions at which execution can start * Zero or more "specialization" parameters (types or values that must be filled in before kernel code can be generated) * Zero or more "requirements" (dependencies on other component types that must be satisfied before kernel code can be generated) Both individual compiled modules, and validated entry points are then examples of component types, and we additionally define a few services that apply to all component types: * We can take N component types and compose them to create a new component type that combines their code, shader parameters, entry points, and specialization parameters. A composed component type may also include requirements from the sub-component types, but it is also possible that by composing thing we satisfy requirements (if `A` requires `B`, and we compose `A` and `B`, then the requirement is now satisfied, and doesn't appear on the composite). * We can take a component type with N specialization parameters, and specialize it by giving N compatible specialization arguments. The result of specialization is a new component type with zero specialization parameters. Under the right circumstances the specialzed component type will be layout compatible with the unspecialized one. * One more example that isn't exposed in the public API today is that we can take a component with requirements and "complete" it by automatically composing it with component types that satisfy those requirements. This can be seen as a kind of linking step that pulls together the transitive closure of dependencies. * We can query the layout for the shader parameters and entry points of a component type, for a specific target. * We can query compiled kernel code for an entry point in a component type (for a specific target). This only works for component types with zero specialization parameters and zero requirements. The idea is that by giving users a fairly general algebra of operations on component types, they can compose final programs in ways that meet their requirements. For example, it becomes possible to incrementally "grow" a component type to represent the global root signature for ray tracing shaders as new entry points are added, in such a way that it always stays layout-compatible with kernels that have already been compiled. Much of the implementation work here is in implementing the unifying component type abstraction, and in particular re-writing code that used to assume a program consisted of a flat list of modules and entry points to work with a hierarchical representation that reflects the underlying algebra (e.g., with types to represent composite and specialized component types). There's also a hidden "legacy" case of a component type to deal with some legacy compiler behaviors that can't be directly modeled on top of the simple algebra with modules and entry points. This API is by no means feature-complete or fully developed. It is expected that we will flesh it out more when bringing up application code (e.g., Falcor) on top of the revamped API. One notable thing that went away in this change is explicit support for "entry point groups" and notions of local root signatures (especially the Falcor-specific handling of the `shared` keyword, which a previous change turned into an explicitly supported feature). With the new "building blocks" approach, it should be possible for a DXR application to deal with local root signatures as a matter of policy (on top of the API we provide). If/when we need to provide some kind of emulation of local root signatures for Vulkan (and/or if Vulkan is extended with an explicit notion of local root signatures), we might need to revisit this choice. * Fix debug build There was invalid code inside an `assert()`, so the release build didn't catch it. * fixup: warnings * fixup: more warnings-as-errors * fixup: review notes * fixup: use component type visitors in place of dynamic casting
2019-07-18	Add back a notion of IR global constants (#1002)	Tim Foley
	This change adds back a little bit of explicit support for global constants in the IR, after a previous change completely removed the existing `IRGlobalConstant` node type. The new `IRGlobalConstant` is not a parent instruction, and doesn't function at all like the old one. Instead it is effectively a simple instruction that takes zero or one operands: * The zero-operand case represents a constant with unknown value. This would usually come from another module, and thus would have an `[import(...)]` linkage decoration, so that after linking it resolves to a constant with a known value. * In the one-operand case, the single operand represents the value of the constant, so that the operation semantically behaves like an identity function. It exists just to give decorations something to "attach" to, so that a global constant with a value can have, e.g., an `[export(...)]` decoration to establish linkage. The IR lowering pass was updated to create the new node type to wrap any global constants. For now we do this both for global `static const` variables and function-scope `static const`, although the latter doesn't really need the extra indirection. The IR linking logic was extended to handle linking of global constants akin to how other global instructions are handled. The new logic is mostly boilerplate, and it is likely that a refactor of the linking logic would eliminate the need for this kind of per-instruction-opcode handling of IR instructions that can have linkage. A custom pass was added that is intended to be run right after linking (it could arguably be folded into `linkIR()`, but I thought it was safer to keep each pass as small as possible). This pass replaces any `IRGlobalConstant` that has a value (operand) with that value, so that global constants should be eliminated after the linking step. This ensures that downstream optimization/transformation passes don't have to deal with the possibility of global constants. Almost all the existing passes would Just Work if global constants were left in the IR. The two big exceptions are: * Anything that relies on testing `IRInst` identity as a way to test for things having the same value would break, since a global constant is a distinct `IRInst` from its value. * The type legalization pass doesn't handle `IRGlobalConstant` instructions with non-simple types. This could be added if we ever wanted it, but it seemed silly to write this code now if it would always be dead (and thus untested). I went ahead and updated the emit logic to handle an `IRGlobalConstant`s that still existing in the IR module at emit time, since the amount of code required was small so that being robust to that case seemed safest (e.g., in case we ever want to have a path that emits code directly while skipping some/all of our IR transformation passes). There should be no visible changes to the functionality of the compiler with this change, but it should help make IR dumps from the front-end more clear/explicit (since each constant will be a distinct instruction with its own name), and paves the way for supporting proper cross-module linkage of constants.
2019-07-17	Change how global-scope constants are handled (#1001)	Tim Foley
	Before this change, global and function-scope `static const` declarations were represented as instructions of type `IRGlobalConstant`, which was represented similarly to an `IRGlobalVar`: with a "body" block of instructions that compute/return the initial value. This representation inhibited optimizations (because a reference to a global constant would not in general be replaced with a reference to its value), and also caused problems for resource type legalization because the logic for type legalization did not (and still does not) handle initializers on globals (so global variables that contain resource types are still unsupported). The change here is simple at the high level: we get rid of `IRGlobalConstant` and instead handle global-scope constants as "ordinary" instructions at the global scope. E.g., if we have a declaration like: static const int a[] = { ... } that will be represented in the IR as a `makeArray` instruction at the global scope, referencing other global-scope instructions that represent the values in the array. This simple choice addresses both of the main limitations. A `static const` variable of integer/float/whatever type is now represented as just a reference to the given IR value and thus enables all the same optimizations. When a `static const` variable uses a type with resources, the existing legalization logic (which can handle most of the "ordinary" instructions already) applies. Another secondary benefit of this approach is that the hacky `IREmitMode` enumeration is no longer needed to help us special-case source code emit for `static const` variables. Beyond just removing `IRGlobalConstant`, and updating the lowering logic to use the initializer direclty, the main change here is to the emit logic to make it properly handle "ordinary" instructions that might appear at global scope. One open issue with this change, that could be addressed in a follow-up change, is that "extern" global constants that need to be imported from another module (but which might not have a known value when the current module is compiled) aren't supported - we don't have a way to put a linkage decoration on them. A future change might re-introduce global constants as a distinct IR instruction type that just references the value as an operand (if it is available). We would then need to replace references to an IR constant with references to its value right after linking.
2019-07-09	WIP: slang to C++ code generation (#997)	jsmall-nvidia
	* WIP: Emitting Cpp * Added HLSLType instead of using IRInst - because they don't seem to be deduped. * Removed need for lexer to take a String. Added mechansim to lookup intrinsic functions on C++. * A c/c++ cross compilation test. * WIP Cpp output using cloning and slang types. * More work to generate mul funcs. * WIP: Outputting some simple C++. * Expose findOrEmitHoistableInst to IRBuilder to aid cloning, * Simplification for checking for BasicTypes. Test infrastructure compiles output C++ code. * Dot and mat/vec multiplication output. * First pass at swizzling. * First support for binary ops. * Builtin binary and unary functions. * Any and all. * WIP adding support for other functions. Added code to generate function signature. * Add scalar functions to slang-cpp-prelude.h * Support for most built in operations. * Tested first ternary. * Checking the emitting of corner cases functions - normalize, length, any, all, normalize, reflect. * Check asfloat etc work. * Fmod support. * WIP Array handling in C++. * First stage in being able to handl arbitrary type output for CLikeSourceEmitter * Removed Handler/Emitter split - so can implement more easily complex type naming. * Array passing by value first pass. * Rename Array -> FixedArray * Outputs structs in C++. * Emit the thread config. * Dimension -> TypeDimension * SpecializedOperation -> SpecializedIntrinsic Operation -> IntrinsicOp Use shared impl of isNominalOp Commented use of m_uniqueModule etc. * Add code to test slang->cpp when compiled doesn't have errors. Does so by building shared library and exporting the entry point. * Fix linux clang/gcc compile error about override not being specified. * Make sure c-cross-compile is run on linux targets/smoke. * Remove c-cross-compile.slang from smoke. * Fix running tests/cross-compile/c-cross-compile.slang on Ubuntu 16.04 * Only add -std=c++11 for C++ source.
2019-05-31	Use slang- prefix on slang compiler and core source (#973)	jsmall-nvidia
	* Prefixing source files in source/slang with slang- * Prefix source in source/slang with slang- prefix. * Rename core source files with slang- prefix. * Update project files. * Fix problems from automatic merge.