slang.git - Making it easier to work with shaders

Age	Commit message (Collapse)	Author
2019-11-14	Initial work on direct emission of SPIR-V (#1118)	Tim Foley
	* Initial work on direct emission of SPIR-V This change adds a first vertical slice of support for emitting SPIR-V code directly from the Slang IR, instead of generating it indirectly via GLSL. This work isn't usable for anything valuable right now; the goal is just to get something checked in that we can incrementally extend over time. When invoking `slangc`, the `-emit-spirv-directly` option can be used to turn on the new code path. I have not bothered to add an equivalent API option, because this flag is only intended to be used for testing in the immediate future. The existing `emitEntryPoint()` function has become `emitEntryPointSource()` to more accurately reflect its role in a world where we can also emit entry points to a binary format. Much of the logic that was inside `emitEntryPoint()` had to do with linking and then optimizing/transforming Slang IR code to get it ready for emission on a particular target. This logic has been factored into a new `linkAndOptimizeIR()` function that can be shared between the path that emits source and the new one that emits SPIR-V. The meat of the change is then the `emitSPIRVFromIR()` function in `slang-emit-spirv.cpp`, which is called after all the optimizations and transformations have been applied to the Slang IR to get it ready. Rather than repeat myself here, I will try to make the comments in `slang-emit-spirv.cpp` usable as documentation of the approach being taken. Smaller notes: * I've included a test case that compares `slangc` output directly to expected SPIR-V. This is perhaps not an ideal plan for how to test SPIR-V emission going forward, but it suffices for now. * The `external/` directory needed to be added to the include dirs for the `slang` project so that the new code can depend on the SPIR-V header. * In `slang-ir-link`, the direct SPIR-V generation path means that we now link with a target of SPIR-V instead of GLSL. In principle this can be used to ensure that appropriate variants of intrinsics are selected based on the knowledge that we are emitting SPIR-V. In practice, that isn't being used at all. * Fixup: path for SPIR-V headers While working on this PR I used a copy of `spirv.h` that I placed into the repository tree manually, but since I started the work we ended up with SPIR-V headers in our tree anyway, albeit at a different path. This change tries to fix things up so that my code uses the headers that were already placed in the repository. * fixup; 64-bit build issue * fixup: typo fixes based on review
2019-11-07	* Removed strip pass from emit as no longer needed (#1114)	jsmall-nvidia
	* If obfuscate is enabled do strip on Layout * Add option to keep insts that have layout decoration (else DCE strips layout) * Add NameHint back in lowering - as strip now correctly removes. We may want NameHints in some stages even with obfuscation (for error messages in IR passes), as long as they are removed appropriately at the end
2019-11-06	Feature/obfuscate improvements (#1107)	jsmall-nvidia
	* Added RiffReadHelper * Move type to fourCC in Chunk simplifies some code. * Make MemoryArena able to track external blocks. Allow ownership of Data to vary. Changed IR serialization to use moved allocations to avoid copies. As it turns out all of the array writes could use unowned data, but doing so requires the IRData to stay in scope longer than IRSerialData, which it does at the moment - but perhaps needs better naming or a control for the feature. * Write out slang-module container. * WIP on -r option. Loading modules - with -r. * Making the serialized-module run (without using imported module). * Split compiling module from the test. * Separate module compilation with a function working. * Remove serialization test as not used. * Fix warning on gcc. * Updated test to have types across module boundary. * Allow entry point declaration. A test that tries to build with just an entry point declaration and a module. * Try to make link work with multiple modules. * Multi module linking first pass working. * Multi module test working with -module-name option * Added feature to repro manifest of approximation of command line that was used. * Use isDefinition - for determining to add decorations to entry point lowering. * Added support for repo-file-system.h More precise control of CacheFileSystem. Allow RelativeFileSystem to strip paths optionally. Use canonical paths in PathInfo cache. Fix bug in -D options for command line output of StateSerailizeUtil * Add missing slang-options.h * Fix bug in bit slang-state-serialize.cpp with bit removal. * Added documentation around -repro-file-system Added spLoadReproAsFileSystem function. * Fix warning. * spAddLibraryReference * * Add support for slang-lib extension * Container output when using -no-codegen option * Use the m_containerFormat to determine if the module container is constructed. Store the result in a blob. This allows for potential access via the API. Write the blob if a filename is set. Use m_ prefix for container variables. * Added spGetContainerCode. Made spGetCompileRequestCode work. * * Put obfuscateCode on linkage * Remove obfuscation from variable names - as can be achieved by either stripping and/or removing NameHintDecorations at lowering * Remove name hints being added during lowering * Add stripping of SourceLoc location in strip phase * Hashing of linkage import/export names. * Do final strip in emitEntryPoint, removes any remaining SourceLoc.
2019-10-24	Strip IR after front-end steps are done (#1092)	Tim Foley
	* Strip IR after front-end steps are done The main feature of this change is to unconditonally strip out the `IRHighLevelDeclDecoration`s in an IR module once the "mandatory" IR passes in the front end have run. This ensures that later IR passes (e.g., code emission) cannot rely on AST-level information to get their job done. Since I was already writing a pass to remove some instructions at the end of the front-end passes, I went ahead and also made the `-obfuscate` flag apply to the front-end IR generation by causing it to strip `IRNameHintDecoration`s while it is doing the other stripping. With this, the main identifying information left in IR modules (other than semantics and entry-point names) is mangled name strings for imported/exported symbols. A few other things got changes along the way: * Removed the `.expected` file for one of the tests, where that file seemingly shouldn't have been checked in at all. * Updated the signature of the DCE pass both so that it doesn't require a back-end compile request (it wasn't using it anyway), and so that it takes some options to decide whether to keep symbols marked `[export(...)]` alive (the front-end wants to keep these, while back-end passes currently need to be able to eliminate them). * Moved the `obfuscateCode` flag from the back-end compile request to the base class shared between front- and back-end requests, and updated the options and repro logic to set both as needed. An obvious improvement in the future would be to have the front- and back-end requests share these settings by referencing a single common object in the end-to-end case, rather than each having their own copy. * Removed logic that was keeping layout instructions alive in DCE, even if they weren't used. This seems to have been a vestige of an intermediate step between AST and IR layout. * fixup: add the new files
2019-10-17	Initial work on representing layout at IR level (#1079)	Tim Foley
	* Initial work on representing layout at IR level This change starts the process of making the back-end of the compiler independent of the AST-level layout information (`TypeLayout`, `VarLayout`, etc.) so that it instead only relies on layout information that is embedded into IR modules. This brings us incrementally closer to a world in which the back-end could be run without the AST-level structures even existing (e.g., for an application that just wants to ship IR without any AST information for IP protection, while still supporting some amount of linking and specialization). The main parts of the change are: * There is a bunch of incidental churn related to specifying entry points by index instead of the `EntryPoint` object for certain operations. This ends up being a better choice because we can use the index to look up side-band information about the entry point that might not be stored on the `EntryPoint` object itself. In particular... * We expand the `ComponentType` interface to support looking up the mangled name of an entry point by index. In common cases (no generic/interface specialization) this would be the same as asking the `EntryPoint` for its mangled name, but in cases where we have specialized a generic entry point, the mangled name would include speicalization arguments that are only available on the `SpecializedComponentType` that wraps the entry point. This part of the change isn't ideal and there might be a better solution waiting to be invented. Note that we store mangled entry point names as strings rather than using `DeclRef`s because that ensures that the information could be serialized and deserialized without a dependence on the AST. * The `TargetProgram` type (which represents binding a specific `ComponentType` for a shader program to a specific `TargetRequest` that represents the target platform) is expanded to include an `IRModule` that represents layout information, in addition to the AST-level `ProgramLayout` it already contained. We create both of these objects at the same time (on-demand) to simplify the overall flow (so that any code that triggers creation of the AST-level layout will also ensure that the IR-level layout exists). * A bunch of code in the emit passes that was passing down layout-related objects has been eliminated. It appears that most of those objects weren't actually being used, so this is just a cleanup, but it helps ensure that the back-end steps are "clean" and don't depend on the AST-level information. The one big exception here is that the emit logic needs to know the stage for the entry point being emitted (to deal with one wrinkle in translating DXR to VKRT). * A big change (actually introduced by @jsmall-nvidia in a branch that this change copied and then built from) is to introduce some more explicit IR instructions to represent layout information, notably an `IRTypeLayout` and an `IRVarLayout`. For now these objects still reference their AST equivalents, but the separation gives us an incremental path to move information from the AST-level objects over to the IR ones. This work includes logic in `IRBuilder` to construct the IR-level layout objects from the AST-level ones on-demand, so that the existing code paths that try to attach AST-level layout will continue to work for now. * Because layout information is now embedded in the IR, the `slang-ir-link.cpp` logic loses a lot of cases that used to deal with attaching AST-level layout objects to IR-level instructions during the linking process. Instead, the linker now assumes that one (or more) of the input IR modules will have layout information associated with it, and the linker makes sure to copy layout decorations (and the instructions they reference) from the input IR module(s) to the output using its more ordinary mechanisms. * Inside `slang-lower-to-ir.cpp`, we add logic to construct an IR module in a `TargetProgram` that simply references the global shader parameters, entry points, etc. and attaches IR layout decorations to them. This is akin to the existing pass in the same file that constructs IR to represent specialization information, and both of these passes share infrastructure with the main AST->IR lowering pass. Eventually, it is expected that this pass will encompass more of the logic for copying AST-level layout information over to IR-level equivalents. * One small wrinkle with this change was that the output for an HLSL generation test case changed some of its `#line` directives. The old code was actually more inaccurate than the new, so this change just updated the baseline. It also added some logic in the linker to make sure that when an IR instruction has multiple definitions, we try to pick up a source location from any of them, in case the "main" one somehow didn't get a location. * Another small fix was that the key/value map in `StructTypeLayout` for mapping fields/members to their layouts was keyed on `Decl` when it really should have been `VarDeclBase`. This change should in principle be a pure refactoring with no functionality changes, so no new tests were added. It is unfortunately also a change that has a high probability of breaking at least some client code, so we may want to be defensive and mark this with a new major version number (well, a new minor version number since we are pre-`1.0`) to give us some room for releasing hotfixes to the old version if needed. * fixup: infinite recursion bug detected by clang * fixup: remove commented-out code
2019-08-20	User defined downstream compiler prelude (#1028)	jsmall-nvidia
	* Added setDownstreamCompilerPrelude Renamed setPassThroughPath to setDownstreamCompilerPath. Fixed tests. Added prelude directory & code to TestToolUtil to setup default preludes for testing/command line apis. * Fix merge problem * Remove hacks to make prelude work by adding a search path as no longer needed with 'user prelude'. * Split up prelude into scalar intrinsics, and types. Use slang.h for main header. slang-cpp-prelude.h can now just include what it needs (relative to prelude directory) and define the few remaining things/work arounds. * Fix typo.
2019-08-08	WIP: Preliminary Slang -> C++ code generation (#1009)	jsmall-nvidia
	* Expanded prelude for some other resource types. Disable C++ output for ParameterGroup. * WIP: Layout for CPU. * Fixes to CPU layout. * WIP: The uniform is output, but the variable definition is not. * WIP: Entry point parameters to global scope in C++. Handling of resource types (in so far as outputting) * Some discussion of ABI and different input types. * WIP: More C++ support around resource types. * WIP: Split up variables into different structures on emit. * WIP: Emitting C++ with wrapping up of 'Context' * WIP: C++ code has access to semantic values. Wrap in struct so can use method calls to pass shared state. Disable legalizeResourceTypes and legalizeExistentialTypeLayout * Fix structured buffer layout for CPU. * Remove testing/handling of global uniforms on CPU path. Typo fix. Changed CPU tests to use new CPU calling convention. * Check globals are working. Initalize context to zero globals. * Order the global parameters for C++ ouput by their layout. Note - that layout isn't quite working correctly because the StructuredBuffer<int> the int seems to be consuming uniform space. * Work around for reflection not having all data needed for layout ordering for C++ code. * Output constant buffers as pointers. * Entry point parameters accessed through pointer to struct. * WIP: Layout for CPU is reasonable for test case. * Only output 'f' after float literal if type marks as a float. * Cast construction works on C++. * Made IntrinsicOp::ConvertConstruct to make intent clearer. * C++ handling construction from scalar. Handle access of a scalar with .x. Check default initialization. * Comment about need for split of kIROp_construct. Release build works. * Added support from constructVectorFromScalar to C/C++ target. * Handling of in/out in C/C++. * First pass documentation CPU support. * Improvements to C++/C slang code generation documentation. * Small doc change to include need for mechansim to specify cpp compiler path. * Better handling of swizzling - allow swizzling a scalar into a vector.
2019-08-08	Revise new COM-lite API (#1007)	Tim Foley
	* Revise new COM-lite API This change revises the "COM-lite" API that was recently introduced to try to streamline it and introduce some missing central/base concepts. The central new abstraction in the API is the notion of a "component type," which is a unit of shader code composition. A component type can have: * IR code for some number of functions/types/etc. * Zero or more global shader parameters * Zero or more "entry point" functions at which execution can start * Zero or more "specialization" parameters (types or values that must be filled in before kernel code can be generated) * Zero or more "requirements" (dependencies on other component types that must be satisfied before kernel code can be generated) Both individual compiled modules, and validated entry points are then examples of component types, and we additionally define a few services that apply to all component types: * We can take N component types and compose them to create a new component type that combines their code, shader parameters, entry points, and specialization parameters. A composed component type may also include requirements from the sub-component types, but it is also possible that by composing thing we satisfy requirements (if `A` requires `B`, and we compose `A` and `B`, then the requirement is now satisfied, and doesn't appear on the composite). * We can take a component type with N specialization parameters, and specialize it by giving N compatible specialization arguments. The result of specialization is a new component type with zero specialization parameters. Under the right circumstances the specialzed component type will be layout compatible with the unspecialized one. * One more example that isn't exposed in the public API today is that we can take a component with requirements and "complete" it by automatically composing it with component types that satisfy those requirements. This can be seen as a kind of linking step that pulls together the transitive closure of dependencies. * We can query the layout for the shader parameters and entry points of a component type, for a specific target. * We can query compiled kernel code for an entry point in a component type (for a specific target). This only works for component types with zero specialization parameters and zero requirements. The idea is that by giving users a fairly general algebra of operations on component types, they can compose final programs in ways that meet their requirements. For example, it becomes possible to incrementally "grow" a component type to represent the global root signature for ray tracing shaders as new entry points are added, in such a way that it always stays layout-compatible with kernels that have already been compiled. Much of the implementation work here is in implementing the unifying component type abstraction, and in particular re-writing code that used to assume a program consisted of a flat list of modules and entry points to work with a hierarchical representation that reflects the underlying algebra (e.g., with types to represent composite and specialized component types). There's also a hidden "legacy" case of a component type to deal with some legacy compiler behaviors that can't be directly modeled on top of the simple algebra with modules and entry points. This API is by no means feature-complete or fully developed. It is expected that we will flesh it out more when bringing up application code (e.g., Falcor) on top of the revamped API. One notable thing that went away in this change is explicit support for "entry point groups" and notions of local root signatures (especially the Falcor-specific handling of the `shared` keyword, which a previous change turned into an explicitly supported feature). With the new "building blocks" approach, it should be possible for a DXR application to deal with local root signatures as a matter of policy (on top of the API we provide). If/when we need to provide some kind of emulation of local root signatures for Vulkan (and/or if Vulkan is extended with an explicit notion of local root signatures), we might need to revisit this choice. * Fix debug build There was invalid code inside an `assert()`, so the release build didn't catch it. * fixup: warnings * fixup: more warnings-as-errors * fixup: review notes * fixup: use component type visitors in place of dynamic casting
2019-07-18	Add back a notion of IR global constants (#1002)	Tim Foley
	This change adds back a little bit of explicit support for global constants in the IR, after a previous change completely removed the existing `IRGlobalConstant` node type. The new `IRGlobalConstant` is not a parent instruction, and doesn't function at all like the old one. Instead it is effectively a simple instruction that takes zero or one operands: * The zero-operand case represents a constant with unknown value. This would usually come from another module, and thus would have an `[import(...)]` linkage decoration, so that after linking it resolves to a constant with a known value. * In the one-operand case, the single operand represents the value of the constant, so that the operation semantically behaves like an identity function. It exists just to give decorations something to "attach" to, so that a global constant with a value can have, e.g., an `[export(...)]` decoration to establish linkage. The IR lowering pass was updated to create the new node type to wrap any global constants. For now we do this both for global `static const` variables and function-scope `static const`, although the latter doesn't really need the extra indirection. The IR linking logic was extended to handle linking of global constants akin to how other global instructions are handled. The new logic is mostly boilerplate, and it is likely that a refactor of the linking logic would eliminate the need for this kind of per-instruction-opcode handling of IR instructions that can have linkage. A custom pass was added that is intended to be run right after linking (it could arguably be folded into `linkIR()`, but I thought it was safer to keep each pass as small as possible). This pass replaces any `IRGlobalConstant` that has a value (operand) with that value, so that global constants should be eliminated after the linking step. This ensures that downstream optimization/transformation passes don't have to deal with the possibility of global constants. Almost all the existing passes would Just Work if global constants were left in the IR. The two big exceptions are: * Anything that relies on testing `IRInst` identity as a way to test for things having the same value would break, since a global constant is a distinct `IRInst` from its value. * The type legalization pass doesn't handle `IRGlobalConstant` instructions with non-simple types. This could be added if we ever wanted it, but it seemed silly to write this code now if it would always be dead (and thus untested). I went ahead and updated the emit logic to handle an `IRGlobalConstant`s that still existing in the IR module at emit time, since the amount of code required was small so that being robust to that case seemed safest (e.g., in case we ever want to have a path that emits code directly while skipping some/all of our IR transformation passes). There should be no visible changes to the functionality of the compiler with this change, but it should help make IR dumps from the front-end more clear/explicit (since each constant will be a distinct instruction with its own name), and paves the way for supporting proper cross-module linkage of constants.
2019-07-09	WIP: slang to C++ code generation (#997)	jsmall-nvidia
	* WIP: Emitting Cpp * Added HLSLType instead of using IRInst - because they don't seem to be deduped. * Removed need for lexer to take a String. Added mechansim to lookup intrinsic functions on C++. * A c/c++ cross compilation test. * WIP Cpp output using cloning and slang types. * More work to generate mul funcs. * WIP: Outputting some simple C++. * Expose findOrEmitHoistableInst to IRBuilder to aid cloning, * Simplification for checking for BasicTypes. Test infrastructure compiles output C++ code. * Dot and mat/vec multiplication output. * First pass at swizzling. * First support for binary ops. * Builtin binary and unary functions. * Any and all. * WIP adding support for other functions. Added code to generate function signature. * Add scalar functions to slang-cpp-prelude.h * Support for most built in operations. * Tested first ternary. * Checking the emitting of corner cases functions - normalize, length, any, all, normalize, reflect. * Check asfloat etc work. * Fmod support. * WIP Array handling in C++. * First stage in being able to handl arbitrary type output for CLikeSourceEmitter * Removed Handler/Emitter split - so can implement more easily complex type naming. * Array passing by value first pass. * Rename Array -> FixedArray * Outputs structs in C++. * Emit the thread config. * Dimension -> TypeDimension * SpecializedOperation -> SpecializedIntrinsic Operation -> IntrinsicOp Use shared impl of isNominalOp Commented use of m_uniqueModule etc. * Add code to test slang->cpp when compiled doesn't have errors. Does so by building shared library and exporting the entry point. * Fix linux clang/gcc compile error about override not being specified. * Make sure c-cross-compile is run on linux targets/smoke. * Remove c-cross-compile.slang from smoke. * Fix running tests/cross-compile/c-cross-compile.slang on Ubuntu 16.04 * Only add -std=c++11 for C++ source.
2019-06-19	Start exposing a new COM-lite API (#987)	Tim Foley
	* Start exposing a new COM-lite API This change is mostly about exposing a new API to the Slang compiler that allows more fine-grained control over the compilation flow. The basic concepts in the new API are: * An `IGlobalSession` is the granularity at which we load/parse the Slang stdlib, and therefore gives applications a way to amortize startup cost for the library across multiple compiles. This is a concept that might be able to go away in a future version of Slang. * An `ISession` owns all the code that gets loaded/compiled/generated. Any `import`ed modules are shared across everything in a session (we don't re-parse/-check the code when we see another `import` for the same module). Any generic- or interface-based code in the session can be specialized using types from the same session (but not necessarily across sessions). * An `IModule` is the unit of code loading and scoping. It doesn't expose any API in this change, but would be the right scope for looking up types or entry points by name. * An `IProgram` is a "linked" combination of modules and entry points from which code can be generated and reflection information queried. This change re-uses the existing reflection API types, rather than introduce a new API that duplicates that functionality. That will probably change in a future revision. There are two major pieces of functionality added here that aren't related to the new API: * We now have an API concept of "entry point groups" which are one or more entry points that are intended to be used together so that they need to have non-overlapping parameters. For now this is being used to handle "hit groups" and local root signatures for ray tracing, but I'm not sure this is a concept we will keep in the long run. * We have a very special-case (client-application-specific) flag that ascribes special meaning to the `shared` keyword, so that it can be attached to global parameters to indicate that they are actually to be part of the local root signature rather than the global one for DXR. None of the API design (including naming) here is finalized; the only reason to check in the changes at this point to avoid having a long-running branch that leads to merge pain. Clients should not try to depend on the new API just yet, since it is still a work in progress. * fixup: clang warning * fixup: try to detect clang C++11 support * fixup * fixup * fixup * fixup * fixup: review feedback
2019-06-06	Split out target code generation from CLikeSourceEmitter (#976)	jsmall-nvidia
	* * Added SourceStyle to CLikeSourceEmitter, to limit cases to actual target types. * Made Impl methods _ prefixed * Small tidyup * * SourceStream -> SourceWriter * use slang-emit- prefix on SourceWriter file * * Remove EmitContext -> merge into CLikeSourceEmitter * slang-c-like-source-emitter -> slang-emit-source.cpp * ExtensionUsageTracker -> GLSLExtensionTracker slang-extension-usage-tracker.cpp/.h -> slang-emit-glsl-extension-tracker.cpp/.h * emit-source.cpp.h -> emit-c-like.cpp/.h * Small fix to move where some _ prefixed functions are declared in CLikeSourceEmitter. * * CLikeSourceEmitter::CInfo -> Desc * Functions to get and find CodeGenTarget by name * Split out empty language impls * Create an impl based on SourceStyle * * CodeGenTarget conversion to and from string * Move HLSL specific functions to HLSLEmitSource. * Emitting texture and image types. * Move move GLSL specific functionality to GLSLSourceEmitter * Split more out of slang-emit-c-like * Refactor more out of slang-emit-c-like * * tryEmitIRInstExprImpl(IRInst* inst, IREmitMode mode, const EmitOpInfo& inOuterPrec) * Fix bug around output of uintBitsToFloat * More work refactoring out target specifics from slang-emit-c-like * Move functions that are only implemented once in GLSL impl into their Impl method. * Move rate qualification out of slang-emit-c-like * * Added getEmitOpForOp - allows for table usage so different ops can be dealt with the same way * Moved vector comparison to slang-emit-glsl * * * Use EmitOpInfo to control output in slang-emit-c-like.cpp for unary ops * Move more functionality from CLikeSourceEmitter to HLSLSourceEmitter * Make output of parameters implementaion specific. * Extracted interpolation modifiers. * Remove IR from methods that don't need them. * Remove IR from method names. * Refactor handling of output of types - to make the impls implement the full path without lots of cases for specific impls * Add variable declaration modifiers and matrix layout to larget specific in slang-emit. * Make target specific internal functions _ prefixed.
2019-06-04	Review improvements on #971: WIP: Support for other source target languages ↵	jsmall-nvidia
	(#974) * * Added SourceStyle to CLikeSourceEmitter, to limit cases to actual target types. * Made Impl methods _ prefixed * Small tidyup * * SourceStream -> SourceWriter * use slang-emit- prefix on SourceWriter file * * Remove EmitContext -> merge into CLikeSourceEmitter * slang-c-like-source-emitter -> slang-emit-source.cpp * ExtensionUsageTracker -> GLSLExtensionTracker slang-extension-usage-tracker.cpp/.h -> slang-emit-glsl-extension-tracker.cpp/.h * emit-source.cpp.h -> emit-c-like.cpp/.h * Small fix to move where some _ prefixed functions are declared in CLikeSourceEmitter.
2019-05-31	Use slang- prefix on slang compiler and core source (#973)	jsmall-nvidia
	* Prefixing source files in source/slang with slang- * Prefix source in source/slang with slang- prefix. * Rename core source files with slang- prefix. * Update project files. * Fix problems from automatic merge.