slang.git - Making it easier to work with shaders

	Commit message (Collapse)	Author	Age
*	Working on better handling of builtin functions in IR (#196)	Tim Foley	2017-10-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The main change I was working on here was to start having more of the builtin functions (in this case, `cos`, `sin`, and `saturate`) just lower to the IR as calls to builtin functions (with declarations but no definition), rather than expect/require them to map to individual IR opcodes in every case. The main change there was the removal of some `intrinsic_op` modifiers in the stdlib. This then requires the `isTargetInstrinsic` logic in IR-based code emit to avoid emitting declarations for these intrinsics. The corresponding logic for emitting calls to these intrinsics is currently being skipped. Along the way, a variety of fixups were added: - In order to support lowering to GLSL, we need to handle cases where a variable/function name uses a GLSL reserved word. The right long-term fix there is to always use generated or mangled names, but for now I'm hacking it by adding a `_s` prefix to all names during IR-based emit. - This needs a flag to disable it, since some of our tests currently rely on checking binding information from generated HLSL/SPIR-V that will include these mangled/modified names. - Emit matrix layout modifiers appropriately for GLSL - Specialize IR parameter-block emission between GLSL and HLSL - Fix up argument count/index logic for a couple of opcodes that weren't fixed when removing the types from the explicit operand list - Fix up IR generation for calls to declarations with generic arguments. We were briefly adding the generic args to the ordinary argument list, which added complexity in several places. We now rely on the declaration-reference nodes in the IR to carry that extra info. - TODO: We actually need to make sure that this is the case, since we don't currently correctly generated specialized decl-refs when building IR for function calls The main test that would have been affected by this is `cross-compile-entry-point`, but I was not able to get that working fully with the IR. The main problem in this case was that when emitting GLSL we will need to perform certain required transformations on the IR to get legal code for GLSL. Notably: - We need to hoist entry-point parameters away from being function parameters, and make them be global variables. This is currently being hand-waved during the emit logic, but it seems way better to have it all get cleaned up in the IR first. - We need to scalarize entry-point parameters, because structure input/output is not supported as vertex input or fragment output (and it may be best to always scalarize anyway, to match HLSL semantics). (Note: "scalarize" here means to bust up structures, but not matrices/vectors)
*	IR: handle control flow constructs (#186)	Tim Foley	2017-09-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* IR: handle control flow constructs This change includes a bunch of fixes and additions to the IR path: - `slang-ir-assembly` is now a valid output target (so we can use it for testing) - This uses what used to be the IR "dumping" logic, revamped to support much prettier output. - A future change will need to add back support for less prettified output to use when actually debugging - IR generation for `for` loops and `if` statements is supported - HLSL output from the above control flow constructs is implemented - Revamped the handling of l-values, and in particular work on compound ops like `+=` - Add basic IR support for `groupshared` variables - Add basic IR support for storing compute thread-group size - Output semantics on entry point parameters - This uses the AST structures to find semantics, so its still needs work - Pass through loop unroll flags - This is required to match `fxc` output, at least until we implement unrolling ourselves. * Fixup: 64-bit build issues. * fixup for merge
*	Continue work on IR-based codegen	Tim Foley	2017-09-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This gets us far enough that we can convert a single test case to use the IR, under the new `-use-ir` flag. Getting this merged into mainline will at least ensure that we keep the IR path working in a minimal fashion, even when we have to add functionality the existing AST-based path There is definitely some clutter here from keeping both IR-based and AST-based translation around, but I don't want to have a long-lived branch for the IR that gets further and further away from the `master` branch that is actually getting used and tested. Summary of changes: - Add pointer types and basic `load` operation to be able to handle variable declarations - Add basic `call` instruction type - Add simple address math for field reference in l-value - Always add IR for referenced decls to global scope - Add notion of "intrinsic" type modifier, which maps a type declaration directly to an IR opcode (plus optional literal operands to handle things like texture/sampler flavor) - Improve printing of IR instructions, types, operands - Add constant-buffer type to IR - Allow any instruction to be detected as "should be folded into use sites" and use this to tag things of constant-buffer type - Also add logic for implicit base on member expressions, to handle references to `cbuffer` members - Add connection back to original decl to IR variables (including global shader parameters...) - Use reflection name instead of true name when emitting HLSL from IR (so that we can match HLSL output) - Make IR include decorations for type layout - Re-use existing emit logic for HLSL semantics to output `register` semantics for IR-based code - Make IR-based codegen be an option we can enable from the command line - It still isn't on by default (it can barely manage a trivial shader), but it seems better to enable it always instead of putting it under an `#ifdef` - Fix up how we check for intrinsic operations suring AST-based cross compilation so that adding new intrinsic ops for the IR won't break codegen.
*	Add a flag to control type splitting	Tim Foley	2017-08-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The `-split-mixed-types` flag can be provided to command-line `slangc`, and the `SLANG_COMPILE_FLAG_SPLIT_MIXED_TYPE` flag can be passed to `spSetCompileFlags`. Either of these turns on a mode where Slang will split types that included both resource and non-resource fields. The declaration of such a type will just drop the resource fields, while a variable declare using such a type turns into multiple declararations: one for the non-resource fields, and then one for each resource field (recursively). This behavior was already implemented for GLSL support, and this change just adds a flag so that the user can turn it on unconditionally. Caveats: - This does not apply in "full rewriter" mode, which is what happens if the user doesn't use any `import`s. I could try to fix that, but it seems like in that mode people are asking to bypass as much of the compiler as possible. - When it does apply, it applies to user code as well as library/Slang code. So this will potentially rewrite the user's own HLSL in ways they wouldn't expect. I don't see a great way around it, though.
*	Make source location lightweight	Tim Foley	2017-08-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixes #24 So far the code has used a representation for source locations that is heavy-weight, but typical of research or hobby compilers: a `struct` type containing a line number and a (heap-allocated) string. This is actually very convenient for debugging, but it means that any data structure that might contain a source location needs careful memory management (because of those strings) and has a tendency to bloat. The new represnetation is that a source location is just a pointer-sized integer. In the simplest mental model, you can think of this as just counting every byte of source text that is passed in, and using those to name locations. Finding the path and line number that corresponds to a location involves a lookup step, but we can arrange to store all the files in an array sorted by their start locations, and do a binary search. Finding line numbers inside a file is similarly fast (one you pay a one-time cost to build an array of starting offsets for lines). More advanced compilers like clang actually go further and create a unique range of source locations to represent a file each time it gets included, so that they can track the include stack and reproduce it in diagnostic messages. I'm not doing anything that clever here.
*	Add a `-o` option to command-line `slangc`	Tim Foley	2017-07-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixes #11 - This adds a `-o` command-line option for specifying an output file. - The code tries to be a bit smart, to glean an output format from a file extension, and also to associate multiple `-o` options with multiple `-entry` options if needed. - There is a restriction that all the output files need to agree on the code generation target. This is reasonable for now, but might be something to lift eventualy - There is a restriction that only one output file is allowed per entry point - Together with the previous item this means you can't output both a `.spv` and a `.spv.asm` in one pass, even though both should be possible - There is currently a restriction that output paths only apply to entry points - This means there is no way to output reflection JSON to a file with `-o` (but that is mostly just a debugging feature for now) - This also means we don't support any "container" formats that can encapsulate multiple compiled entry points
*	Try to improve handling of failures during compilation	Tim Foley	2017-07-19
\| \| \| \| \| \| \|	The change is mostly about trying to make sure the compiler "fails safe" when it encounters an internal assumption that isn't met. Most internal errors will now throw exceptions (yes, exceptions are evil, but this will work for now), and these get caught in `spCompile` so that they don't propagate to the user (they just see a message that compilation aborted due to an internal error). Subsequent changes are going to need to work on diagnosing as many of these situations as possible, so that users can at least know what construct in their code was unexpected or unhandled by the compiler.
*	Cleanups for test cases:	Tim Foley	2017-07-10
\| \| \| \| \| \|	- Allow a code-generation target of `NONE` in order to suppress ordinary output in test cases where we don't care about the actual output (just pass/fail result) - Add explicit `location` layout qualifiers to intermediate vertex-to-fragment variables in GLSL test cases for rendering, to work around apparent Intel driver bugs.
*	Pick layout rules based on target languge, not source.	Tim Foley	2017-07-09
\| \| \| \|	The tricky bit here was that the `reflection-json` output format isn't really a code generation target like the others, and we need to be able to have multiple "targets" active to make sense of it. This needs cleaning-up.
*	Fix many warnings-as-errors issues.	Tim Foley	2017-07-06
\| \| \| \| \|	The code should now compile cleanly with warnings as errors for VS2015 with `W3`. Most of the changes had to do with propagating a real pointer-sized integer type through code that had been using `int`.
*	Overhaul `RefPtr` and `String`	Tim Foley	2017-06-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- `RefPtr` no longer tries to have distinct cases for interal-vs-external reference counts. Instead we always require an internal reference count. - Types the used `RefPtr` but weren't `RefObject` were made to inherit `RefObject` - The `ReferenceCounted` base class was removed, so that only `RefObject` remains - Implicit conversion from `RefPtr<T>` to `T` added - This created some complicates for other types that relied on implicit conversions, so this isn't a net cleanup right now - The main type that got messed up by the above was `String`, which previously held a `RefPtr<char, ...>`. This change thus also* includes a major overhaul of `String`: - `String` now holds all its data via indirection, using a `StringRepresentation` that is a `RefObject`. This object holds a length, capacity, and directly stores the character data in its allocation. This means that `sizeof(String)==sizeof(void*)` - It is now possible to directly mutate a `String` by appending to its representation (we just need to ensure it has a reference count of `1`, possibly by cloning it). This means that `StringBuilder` is now basically just an idomatic use of `String` - A couple operations that just return sub-ranges of a `String` now return `StringSlice` to avoid allocation when it isn't needed. This required more work. - Indices into strings changed from `int` to `UInt` (which is pointer-sized). This had a bunch of follow-on changes because the value `-1` sometimes needs to be special-cased in code that uses indices. Further cleanups are probably needed here.
*	Replace "auto-import" with `#import`	Tim Foley	2017-06-26
\| \| \| \| \| \|	Right now `#import` only differs from `#include` in that it takes a string literal for a file name instead of a raw identifier (to which `.slang` gets appended). The next step is to make `#import` respect preprocessor state, while `__import` doesn't.
*	Overhaul handling of entry points and translation units.	Tim Foley	2017-06-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The main user-visible change here is that instead of `spAddTranslationUnitEntryPoint` we have `spAddEntryPoint`, to reflect that the list of entry points is "global" to a compile request. As a result, `spGetEntryPointSource` now only needs the entry point index, and not the translation unit index. There are a bunch more behind-the-scenes changes, though, reflecting a streamlining of the concepts related to compilation into a smaller number of classes. Now there is: - `Session` (unchanged) to manage the lifetimes of shared stuff like the stdlib - `CompileRequest` (merges in `CompileOptions`) to handle all the lifetime related to a single invocation of the compiler - `TranslationUnitRequest` (merges `TranslationUnitOptions`, `CompileUnit`) to represent a single translation unit ("module") that the user is trying to compile. This is a single file for HLSL/GLSL, but can be multiple files for Slang. - `EntryPointRequest` (merges `EntryPointOption` and a bit of `EntryPointResult`) to track a single entry point that the user is asking to compile (that entry point always comes from a single translation unit) A lot of functions used to take some combination of these and end up with really long signatures. I've given most of the objects "parent" pointers so that they can get back to all the context they need, so most functions don't need as many parameters. It may eventually be important to tease these apart again, in particular: - The code-generation side of things (the `*Result` types) might need to be pulled out in case we want to codegen multiple times from the same AST - Similarly, the layout stuff may also need to be pulled out, in case we want to lay things out multiple times with different rules.
*	Allow for automatic importing of Slang code	Tim Foley	2017-06-19
	The basic idea of this change is that user code can just write: #include "foo.h" and then if `foo.h` gets found in a list of registered directories for "auto-import," then it actually gets interpreted as if the user had writte, more or less: __import foo; That is, the code in `foo.h` will be treated as Slang, and will be fully parsed and checked (no matter what the source language had been), and the scoping rules will be those of `__import` instead of `#include`. This is a really big hammer, and I could imagine it smashing fingers if used poorly. I'm not sure this feature will pan out, but we need to try things to know. One big piece of that that I'll likely keep in either case is an overhaul of command-line options parsing for `slangc`. In particular, this logic has been moved into the core `slang` library (so that users can just pass options in via the API), and it is all done on UTF-8 strings rather than wide strings (which was always going to be Windows-specific).