slang.git - Making it easier to work with shaders

Age	Commit message (Collapse)	Author
2020-06-18	Fix and improvements around repro (#1397)	jsmall-nvidia
	* * Fix output in slang repro command line * Profile uses lowerCamel method names (had mix of upper and lower) * Rename slang-serialize-state/SerializeStateUtil to slang-repro and ReproUtil.
2020-06-18	Improvements around C++ code generation (#1396)	jsmall-nvidia
	* * Remove UniformState and UniformEntryPointParams types * Put all output C++ source in an anonymous namespace * If SLANG_PRELUDE_NAMESPACE is set, make what it defines available in generated file. * Fix signature issue in performance-profile.slang * Context -> KernelContext to avoid ambiguity. * Fix issues around dynamic dispatch and anonymous namespace. * Fix typo.
2020-06-18	Associate a downstream compiler for prelude lookup even if output is source. ↵	jsmall-nvidia
	(#1395) Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>
2020-06-17	Add != operator for enum types (#1394)	Tim Foley
	This was an oversight in the stdlib, and the `!=` definition follows the `==` in a straightforward fashion.
2020-06-17	Generate dynamic C++ code for the minimal test case. (#1391)	Yong He
	* Add IR pass to lower generics into ordinary functions. * Fix project files * Emit dynamic C++ code for simple generics and witness tables. Fixes #1386. * Remove -dump-ir flag. * Fixups.
2020-06-17	Hotfix/slangc unreleased compile request (#1393)	jsmall-nvidia
	* Releases compile request if there is an error. * Arrange so that caller can clean up CompileRequest so don't have to capture all paths.
2020-06-15	Merge branch 'master' into glsl-loop	Tim Foley

2020-06-15	Specialize function calls involving array arguments. (#1389)	Yong He
	Fixes #890. Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>
2020-06-15	Remove implicit conversions to `void` (#1388)	Tim Foley
	* Remove implicit conversions to `void` Fixes #1372 The standard library code had accidentally introduced implicit-conversion `__init` operations on the `void` type that accepted each of the other basic types, so that a function written like: ```hlsl void bad() { return 1; } ``` would translate to: ```hlsl void bad() { return (void)1; } ``` The dual problesm are that the input code should have produced a diagnostic of some kind, and the output code doesn't appear to compile correctly through fxc. This change introduces several fixes aimed at this issue: * First, the problem in the stdlib code is plugged: we don't introduce implicit conversion operations to or from `void` (we'd only been banning it in one direction before) * Next, an explicit `__init` was added to `void` that accepts any type so that existing HLSL code that might do `(void) someExpression` to ignore a result will continue to work. This is a compatibility feature, and it might be argued that we should at least warn when it is used. Note that this function is expected to never appear in output HLSL/GLSL because its result will never be used, and it is marked `[__readNone]` allowing calls to it to be eliminated as dead code. * During IR lowering, we now take care to only emit the `IRReturnVal` instruction type if there is a non-`void` value being returned, and use `IRReturnVoid` for both the case where no expression was used in the `return` statement and the case where an expression of type `void` is returned. * A test case was added to confirm that returning `1` from a `void` function isn't allowed, while returning `(void) 1` is. The net result of these changes is that we now produce an error for the bad input code, we allow explicit casts to `void` as a compatibility feature, and we are more robust about treating `void` as if it is an ordinary type in the front-end. * fixup: missing file
2020-06-15	Merge branch 'master' into glsl-loop	Yong He

2020-06-15	Generate IRType for interfaces, and reference them as `operand[0]` in ↵	Yong He
	IRWitnessTable values (#1387) * Generate IRType for interfaces, and use them as the type of IRWitnessTable values. This results the following IR for the included test case: ``` [export("_S3tu010IInterface7Computep1pii")] let %1 : _ = key [export("_ST3tu010IInterface")] [nameHint("IInterface")] interface %IInterface : _(%1); [export("_S3tu04Impl7Computep1pii")] [nameHint("Impl.Compute")] func %Implx5FCompute : Func(Int, Int) { block %2( [nameHint("inVal")] param %inVal : Int): let %3 : Int = mul(%inVal, %inVal) return_val(%3) } [export("_SW3tu04Impl3tu010IInterface")] witness_table %4 : %IInterface { witness_table_entry(%1,%Implx5FCompute) } ``` * Fixes per code review comments. Moved interface type reference in IRWitnessTable from their type to operand[0]. * Fix typo in comment.
2020-06-13	Emit [[dont_unroll]] attribute in GLSL	Yong He

2020-06-12	Diagnose circularly-defined constants (#1384)	Tim Foley
	* Diagnose circularly-defined constants Work on #1374 This change diagnoses cases like the following: ```hlsl static const int kCircular = kCircular; static const int kInfinite = kInfinite + 1; static const int kHere = kThere; static const int kThere = kHere; ``` By diagnosing these as errors in the front-end we protect against infinite recursion leading to stack overflow crashes. The basic approach is to have front-end constant folding track variables that are in use when folding a sub-expression, and then diagnosing an error if the same variable is encountered again while it is in use. In order to make sure the error occurs whether or not the constant is referenced, we invoke constant folding on all `static const` integer variables. Limitations: * This only works for integers, since that is all front-end constant folding applies to. A future change can/should catch circularity in constants at the IR level (and handle more types). * This only works for constants. Circular references in the definition of a global variable are harder to diagnose, but at least shouldn't result in compiler crashes. * This doesn't work across modules, or through generic specialization: anything that requires global knowledge won't be checked * fixup: missing files * fixup: review feedback
2020-06-10	Add compiler flag to disable specialization pass.	Yong He

2020-06-08	Small fixes/improvements based on review. (#1379)	jsmall-nvidia

2020-06-05	Filter lookup results from interfaces in `visitMemberExpr`.	Yong He
	Fixes #1377
2020-06-05	Merge branch 'master' into findtypebynamefix	Yong He

2020-06-05	ASTNodes use MemoryArena (#1376)	jsmall-nvidia
	* Add a ASTBuilder to a Module Only construct on valid ASTBuilder (was being called on nullptr on occassion) * Add nodes to ASTBuilder. * Compiles with RefPtr removed from AST node types. * Initialize all AST node pointer variables in headers to nullptr; * Initialize AST node variables as nullptr. Make ASTBuilder keep a ref on node types. Make SyntaxParseCallback returns a NodeBase * Don't release canonicalType on dtor (managed by ASTBuilder). * Give ASTBuilders a name and id, to help in debugging. For now destroy the session TypeCache, to stop it holding things released when the compile request destroys ASTBuilders. * Moved the TypeCheckingCache over to Linkage from Session. * NodeBase no longer derived from RefObject. * Only add/dtor nodes that need destruction. First pass compile on linux.
2020-06-05	Merge branch 'master' into findtypebynamefix	Yong He

2020-06-05	Fix FindTypeByName reflection API not finding stdlib types.	Yong He

2020-06-05	Merge branch 'master' into loop_attrib	Tim Foley

2020-06-05	Fixes for active mask synthesis + tests (#1370)	Tim Foley
	* Fixes for active mask synthesis + tests There are two fixes here: * The code generation that follows active mask synthesis was requiring CUDA SM architecture version 7.0 for one of the introduced instructions, but not all of them. This change centralizes the handling of upgrading the required CUDA SM architecture version, and makes sure that the instructions introduced by active mask synthesis request version 7.0. * The tests for active mask synthesis were not flagged as requiring the `cuda_sm_7_0` feature when invoking `render-test-tool`, which meant they run but produce unexpected results when invoked on a GPU without the required semantics for functions like `__ballot_sync()`. This change adds the missing `-render-feature cuda_sm_7_0` to those tests. * fixup: mark more tests that rely on implicit active mask
2020-06-04	Add a ASTBuilder to a Module (#1369)	jsmall-nvidia
	Only construct on valid ASTBuilder (was being called on nullptr on occassion)
2020-06-04	Remove aborting in emitLoopControlDecoration default case.	Yong He

2020-06-04	Emit [loop] attribute to output HLSL.	Yong He

2020-06-04	First steps toward inheritance for struct types (#1366)	Tim Foley
	* First steps toward inheritance for struct types This change adds the ability for a `struct` type to declare a base type that is another `struct`: ```hlsl struct Base { int baseMember; } struct Derived : Base { int derivedMember; } ``` The semantics of the feature are that code like the above desugars into code like: ```hlsl struct Base { int baseMember; } struct Derived { Base _base; int derivedMember; } ``` At points where a member from the base type is being projected out, or the value is being implicitly cast to the base type, the compiler transforms the code to reference the implicitly-generated `_base` member. That means code like this: ```hlsl void f(Base b); ... Derived d = ...; int x = d.baseMember; f(d); ``` gets transformed into a form like this: ```hlsl void f(Base b); ... Derived d = ...; int x = d._base.baseMember; f(d._base); ``` Note that as a result of this choice, the behavior when passing a `Derived` value to a function that expects a `Base` (including to inherited member functions) is that of "object shearing" from the C++ world: the called function can only "see" the `Base` part of the argument, and any operations performed on it will behave as if the value was indeed a `Base`. There is no polymorphism going on because Slang doesn't currently have `virtual` methods. In an attempt to work toward inheritance being a robust feature, this change adds a bunch of more detailed logic for checking the bases of various declarations: * An `interface` declaration is only allowed to inherit from other `interface`s * An `extension` declaration can only introduce inheritance from `interface`s * A `struct` declaration can only inherit from at most one other `struct`, and that `struct` must be the first entry in the list of bases This change also adds a mechanism to control whether a `struct` or `interface` in one module can inherit from a `struct` or `interface` declared in another module: * If the base declaration is marked `[open]`, then the inheritance is allowed * If the base declaration is marked `[sealed]`, then the inheritance is allowed * If it is not marked otherwise, a `struct` is implicitly `[sealed]` * If it is not marked otherwise, an `interface` is implicitly `[open]` These seem like reasonable defaults. In order to safeguard the standard library a bit, the interfaces for builtin types have been marked `[sealed]` to make sure that a user cannot declare a `struct` and then mark it as a `BuiltinFloatingPointType`. This step should bring us a bit closer to being able to document and expose these interfaces for built-in types so that users can write code that is generic over them. There are some big caveats with this work, such that it really only represents a stepping-stone toward a usable inheritance feature. The most important caveats are: * If a `Derived` type tries to conform to an interface, such that one or more interface requirements are satisfied with members inherited from the `Base` type, that is likely to cause a crash or incorrect code generation. * If a `Derived` type tries to inherit from a `Base` type that conforms to one or more interfaces, the witness table generated for the conformance of `Derived` to that interface is likely to lead to a crash or incorrect code generation. It is clear that solving both of those issues will be necessary before we can really promote `struct` inheritance as a feature for users to try out. * fixup: trying to appease clang error * fixups: review feedback
2020-06-03	Devirtualize AST types (#1368)	jsmall-nvidia
	* Make getSup work with more general non-virtual 'virtual' mechanism. * WIP: Non virtual AST types. * Project change. * Type doesn't implement equalsImpl * Fix macro invocation Make Overridden functions public to make simply accessible by base types. * Use SLANG_UNEXPECTED. * GetScalarType -> getScalarType Use SLANG_UNEXPECTED instead on ASSERT in NamedExpressionType and TypeType
2020-06-02	Added spGetBuildTagString. (#1365)	jsmall-nvidia

2020-06-02	Make stdlib path just be the filename. (#1364)	jsmall-nvidia
	* Made bad-operaor-call available on all targets. Fix the line filename to not inclue path, to avoid paths being absolute and therefores value be host environment dependent (causing tests to fail). * Disable on linux because theres still a problem on gcc x86 where the file path is different. * Fix to some typos in bad-operator-call.slang * Fix diagnostic for bad-operator-call.slang
2020-06-02	Working matrix swizzle (#1354)	Dietrich Geisler
	* Working matrix swizzle. Supports one and zero indexing and multiple elements. Performs semantic checking of the swizzle. Matrix swizzles are transformed into a vector of indexing operations during lowering to the IR. This change does not handle matrix swizzle as lvalues. * Renaming * Added missing semicolon * Initialize variable for gcc * Added the expect file for diagnostics * Matrix swizzle updated per PR feedback * Stylistic fix * Formatting fixes * Fix compiling with AST change. Change indentation. Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>
2020-05-29	Bug fix problem with ray tracing from fragment shader (#1362)	jsmall-nvidia
	* Added GLSL_460 if ray tracing is used on fragment shader. Moved GLSL specific setup init function. * Split out _requireRayTracing method.
2020-05-29	NodeBase types constructed with astNodeType member set (#1363)	jsmall-nvidia
	* Maked Substituions derived from NodeBase * * Add astNodeTYpe field to NodeBase * Make Substitutions derived from NodeBase * Make all construction through ASTBuilder * Make getClassInfo non virtual (just uses the astNodeType)
2020-05-29	Feature/ast syntax standard (#1360)	jsmall-nvidia
	* Small improvements to documentation and code around DiagnosticSink * Made methods/functions in slang-syntax.h be lowerCamel Removed some commented out source (was placed elsewhere in code) * Making AST related methods and function lowerCamel. Made IsLeftValue -> isLeftValue.
2020-05-28	Small improvements to documentation and code around DiagnosticSink (#1359)	jsmall-nvidia

2020-05-28	WIP: ASTBuilder (#1358)	jsmall-nvidia
	* Compiles. * Small tidy up around session/ASTBuilder. * Tests are now passing. * Fix Visual Studio project. * Fix using new X to use builder when protectedness of Ctor is not enough. Substitute->substitute * Add some missing ast nodes created outside of ASTBuilder. * Compile time check that ASTBuilder is making an AST type. * Moced findClasInfo and findSyntaxClass (essentially the same thing) to SharedASTBuilder from Session.
2020-05-26	Synthesize "active mask" for CUDA (#1352)	Tim Foley
	* Synthesize "active mask" for CUDA The Big Picture =============== The most important change here is to `hlsl.meta.slang`, where the declaration of `WaveGetActiveMask()` is changed so that instead of mapping to `__activemask()` on CUDA (which is semantically incorrect) it maps to a dedicated IR instruction. The other `WaveActive()` intrinsics that make use of the implicit "active mask" concept had already been changed in #1336 so that they explicitly translate to call the equivalent `WaveMask()` intrinsic with the result of `WaveGetActiveMask()`. As a result, all of the `WaveActive()` functions are now no different from a user-defined function that uses `WaveGetActiveMask()`. The bulk of the work in this change goes into an IR pass to replace the new instruction for getting the active mask gets replaced with appropriately computed values before we generate output CUDA code. That work is in `slang-ir-synthesize-active-mask.{h,cpp}`. Utilities ========= There are a few pieces of code that were helpful in writing the main pass but that can be explained separately: IR instructions were added corresponding to the Slang `WaveMaskBallot()` and `WaveMaskMatch()` functions, which map to the CUDA `__ballot_sync()` and `__match_any_sync()` operations, respectively. These are only implemented for the CUDA target because they are only being generated as part of our CUDA-only pass. * The `IRDominatorTree` type was updated to make it a bit more robust in the presence of unreachable blocks in the CFG. It is possible that the same ends could be achieved more efficiently by folding the corner cases into the main logic, but I went ahead and made things very explicit for now. * I added an `IREdge` utility type to better encapsulate the way that certain code operating on the predecessors/successors of an `IRBlock` were using an `IRUse` to represent a control-flow edge. The `IREdge` type makes the logic of those operations more explicit. A future change should proably change it so that `IRBlock::getPredecessors()` and `getSuccessors()` are instead `getIncomingEdges()` and `getOutgoingEdges()` and work as iterators over `IREdge` values, given the way that the predecessor and successor lists today can contain duplicates. Using the above `IREdge` type, the logic for detecting and break critical edges was broken down into something that is a bit more clear (I hope), and that also factors out the breaking of an edge (by inserting a block along it) into a reusable subroutine. The Main Pass ============= The implementation of the new pass is in `slang-ir-synthesize-active-mask.cpp`, and that file attempts to include enough comments to make the logic clear. A brief summary for the benefit of the commit history: * The first order of business is to identify functions that need to have the active mask value piped into them, and to add an additional parameter to them so that the active mask is passed down explicitly. Call sites are adjusted to pass down the active mask which can then result in new functions being identified as needing the active mask. * The next challenge is for a function that uses the active mask, to compute the active mask value to use in each basic block. The entry block can easily use the active mask value that was passed in, while other blocks need more work. * When doing a conditional branch, we can compute the new mask for the block we branch to as a function of the existing mask and the branch condition. E.g., the value `WaveMaskBallot(existingMask, condition)` can be used as the mask for the "then" block of an `if` statement. * When control flow paths need to "reconverge" at a point after a structured control-flow statement, we need to insert logic to synchronize and re-build the mask that will execute after the statement, while also excluding any lanes/threads that exited the statement in other ways (e.g., an early `return` from the function). The explanation here is fairly hand-wavy, but the actual pass uses much more crisp definitions, so the code itself should be inspected if you care about the details. Tests ===== The tests for the new feature are all under `tests/hlsl-intrinsic/active-mask/`. Most of them stress a single control-flow construct (`if`, `switch`, or loop) and write out the value of `WaveGetActiveMask()` at various points in the code. In practice, our definition of the active mask doesn't always agree with what D3D/Vulkan implementations seem to produce in practice, and as a result a certain amount of effort has gone into adding tweaks to the tests that force them to produce the expected output on existing graphics APIs. These tweaks usually amount to introducing conditional branches that aren't actually conditional in practice (the branch condition is always `true` or always `false` at runtime), in order to trick some simplistic analysis approaches that downstream compilers seem to employ. One test case currently fails on our CUDA target (`switch-trivial-fallthrough.slang`) and has been disabled. This is an expected failure, because making it produce the expected value requires a bit of detailed/careful coding that would add a lot of additional complexity to this change. It seemed better to leave that as future work. Future Work =========== * As discussed under "Tests" above, the handling of simple `switch` statements in the current pass is incomplete. * There's an entire can of worms to be dealt with around the handling of fall-through for `switch`. * The current work also doesn't handle `discard` statements, which is unimportant right now (CUDA doesn't have fragment shaders), but might matter if we decide to synthesize masks for other targets. Similar work would probably be needed if we ever have `throw` or other non-local control flow that crosses function boundaries. * An important optimization opportunity is being left on the floor in this change. When block that comes "after" a structured control-flow region (which is encoded explicitly in Slang IR and SPIR-V) post-dominates the entry block of the region, then we know that the active mask when exiting the region must be the same as the mask when entering the region, and there is no need to insert explicit code to cause "re-convergence." This should be addressed in a follow-on change once we add code to Slang for computing a post-dominator tree from a function CFG. * Related to the above, the decision-making around whether a basic block "needs" the active mask is perhaps too conservative, since it decides that any block that precedes one needing the active mask also needs it. This isn't true in cases where the active mask for a merge block can be inferred by post-dominance (as described above), so that the blocks that branch to it don't need to compute an active mask at all. * If/when we extend the CPU target to support these operations (along with SIMD code generation, I assume), we will also need to synthesize an active mask on that platform, but the approach taken here (which pretty much relies on support for CUDA "cooperative groups") wouldn't seem to apply in the SIMD case. * Similarly, the approach taken to computing the active mask here requires a new enough CUDA SM architecture version to support explicit cooperative groups. If we want to run on older CUDA-supporting architectures, we will need a new and potentially very different strategy. * Because the new pass here changes the signature of functions that require the active mask (and not those that don't), it creates possible problems for generating code that uses dynamic dispatch (via function pointers). In principle, we need to know at a call site whether or not the callee uses the active mask. There are multiple possible solutions to this problem, and they'd need to be worked through before we can make the implicit active mask and dynamic dispatch be mutually compatible. * Related to changing function signatures: no effort is made in this pass to clean up the IR type of the functions it modifies, so there could technically be mismatches between the IR type of a function and its actual signature. If/when this causes problems for downstream passes we probably need to do some cleanup. * fixup: backslash-escaped lines I did some "ASCII art" sorts of diagrams to explain cases in the CFG, and some of those diagrams used backslash (`\`) characters as the last character on the line, causing them to count as escaped newlines for C/C++. The gcc compiler apparently balked at those lines, since they made some of the single-line comments into multi-line comments. I solved the problem by adding a terminating column of `\|` characters at the end of each line that was part of an ASCII art diagram. * fixup: typos Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>
2020-05-26	Improvements around hashing (#1355)	jsmall-nvidia
	* Fields from upper to lower case in slang-ast-decl.h * Lower camel field names in slang-ast-stmt.h * Fix fields in slang-ast-expr.h * slang-ast-type.h make fields lowerCamel. * slang-ast-base.h members functions lowerCamel. * Method names in slang-ast-type.h to lowerCamel. * GetCanonicalType -> getCanonicalType * Substitute -> substitute * Equals -> equals ToString -> toString * ParentDecl -> parentDecl Members -> members * * Make hash code types explicit * Use HashCode as return type of GetHashCode * Added conversion from double to int64_t * Split Stable from other hash functions * toHash32/64 to convert a HashCode to the other styles. GetHashCode32/64 -> getHashCode32/64 GetStableHashCode32/64 -> getStableHashCode32/64 * Other Get/Stable/HashCode32/64 fixes * GetHashCode -> getHashCode * Equals -> equals * CreateCanonicalType -> createCanonicalType * Catches of polymorphic types should be through references otherwise slicing can occur. * Fixes for newer verison of gcc. Fix hashing problem on gcc for Dictionary. * Another fix for GetHashPos * Fix signed issue around GetHashPos
2020-05-25	Remove non-ascii characters from source file.	Yong He

2020-05-22	Tidy up around AST nodes (#1353)	jsmall-nvidia
	* Fields from upper to lower case in slang-ast-decl.h * Lower camel field names in slang-ast-stmt.h * Fix fields in slang-ast-expr.h * slang-ast-type.h make fields lowerCamel. * slang-ast-base.h members functions lowerCamel. * Method names in slang-ast-type.h to lowerCamel. * GetCanonicalType -> getCanonicalType * Substitute -> substitute * Equals -> equals ToString -> toString * ParentDecl -> parentDecl Members -> members
2020-05-21	Non virtual accept implementation on AST types (#1351)	jsmall-nvidia
	* First pass impl of making accept on AST node types non virtual. * A single switch for ITypeVistor on Val type. * Use ORIGIN to choose ITypeVisitor dispatch. * Don't use ORIGIN - we don't need special handling for ITypeVisitor on Val derived types.
2020-05-21	AST dump improvements (#1350)	jsmall-nvidia
	* Add support for parsing array types to C++ extractor. * C++ extractor looks for 'balanced tokens'. Use for extracting array suffixes. * First pass at field dumping. * Update project for field dumping. * WIP AST Dumper. * More AST dump compiling. * Fix bug in StringSlicePool where it doesn't use the copy of the UnownedStringSlice in the map. * Add support for SLANG_RELFECTED and SLANG_UNREFLECTED More AST dump support. * Support for hierarchical dumping/flat dumping. Use SourceWriter to dump. * Add -dump-ast command line option. * Add fixes to VS project to incude AST dump. * Fix compilation on gcc. * Add fix for type ambiguity issue on x86 VS. * Fixes from merge of reducing Token size. * Fix comment about using SourceWriter. * Improvement AST dumping around * Pointers (write as hex) * Scope * Turn off some unneeded fields in AST hierarchy * Only output the initial module in full.
2020-05-20	AST dumping via C++ Extractor reflection (#1348)	jsmall-nvidia
	* Add support for parsing array types to C++ extractor. * C++ extractor looks for 'balanced tokens'. Use for extracting array suffixes. * First pass at field dumping. * Update project for field dumping. * WIP AST Dumper. * More AST dump compiling. * Fix bug in StringSlicePool where it doesn't use the copy of the UnownedStringSlice in the map. * Add support for SLANG_RELFECTED and SLANG_UNREFLECTED More AST dump support. * Support for hierarchical dumping/flat dumping. Use SourceWriter to dump. * Add -dump-ast command line option. * Add fixes to VS project to incude AST dump. * Fix compilation on gcc. * Add fix for type ambiguity issue on x86 VS. * Fixes from merge of reducing Token size. * Fix comment about using SourceWriter.
2020-05-19	Reduce the size of Token (#1349)	jsmall-nvidia
	* Token size on 64 bits is 24 bytes (from 40). On 32 bits is 16 bytes from 24. * Added hasContent method to Token. Some other small improvements around Token.
2020-05-14	Change to make a single implementation of SLANG_ABSTRACT_CLASS and ↵	jsmall-nvidia
	SLANG_CLASS, to simplify macro injection and not require macro redefinition in each file. (#1345)
2020-05-11	Add GLSL translation for HLSL fmod() (#1342)	Tim Foley
	The existing code was assuming `fmod()` was available as a builtin in GLSL, which isn't true. It also isn't possible to translate the HLSL `fmod()` to the GLSL `mod()` since the two have slightly different semantics. This change introduces a definition for `fmod(x,y)` that amounts to `x - y*trunc(x/y)` which should agree with the HLSL version except in corner cases (e.g., there are some cases where the HLSL version returns `-0` and this one will return `0`).
2020-05-08	AST nodes using C++ Extractor (#1341)	jsmall-nvidia
	* Extractor builds without any reference to syntax (as it will be helping to produce this!). * Change macros to include the super class. * WIP replacing defs files. * Added indexOf(const UnownedSubString& in) to UnownedSubString. Refactored extractor * Output a macro for each type with the extracted info - can be used during injection in class * Simplify the header file - as can get super type and last from macro now * Store the 'origin' of a definition * Some small tidy ups to the extractor. * Improve comments on the extractor options. * Made CPPExtractor own SourceOrigins * Small fixes around SourceOrigin. * Small tidy up around macroOrign * WIP Visitor seems now to work correctly. Split out types used by ast into slang-ast-support-types.h * Fix remaining problems with C++ extractor being used with AST nodes. Add CountOf to extractor type ids. Added ReflectClassInfo::getInfo to turn an ASTNodeType into a ReflectClassInfo * Fix compiling on linux. Fix typo in memset. * Small tidy up around comments/layout. Moved NodeBase casting to NodeBase. * Make premake generate project that builds with cpp-extractor for AST. * Get the source directory from the filter in premake. * Fix typo in source path * Explicitly set the source path for premake generation for AST. * Special case handling of override to apease Clang. * Use a more general way to find the slang-ast-reflect.h file to run the extractor. * Appveyor is not triggering slang-cpp-extractor - try putting dependson together. * Put building slang-cpp-extractor first. * Disable some project options to stop MSBuild producing internal compiler errors. * Try reordering the projects in premake5.lua * Hack to try and make slang-cpp-extractor built on appveyor. * Disable flags - not required for MSBuild on appveyor. * Disable flags not required for build on AppVeyor. * Updated Visual Studio projects with slang-cpp-extractor. * Added Visual Studio slang-cpp-extractor project.
2020-05-07	Enhanced C++ extractor (#1340)	jsmall-nvidia
	* Extractor builds without any reference to syntax (as it will be helping to produce this!). * Change macros to include the super class. * Added indexOf(const UnownedSubString& in) to UnownedSubString. Refactored extractor * Output a macro for each type with the extracted info - can be used during injection in class * Simplify the header file - as can get super type and last from macro now * Store the 'origin' of a definition * Some small tidy ups to the extractor. * Improve comments on the extractor options. * Made CPPExtractor own SourceOrigins * Small fixes around SourceOrigin. * Small tidy up around macroOrign
2020-05-04	C++ Extractor (#1337)	jsmall-nvidia
	* WIP: Doing texing using slangs lexer for cpp-extractor * Node tree for C++ extraction. * Bug fixing. Add dump of hierarchy. * First pass at extracting fields. * Parse template types. * Use diagnostics defs for C++ extractor. * Simplify Diagnostic Defs. * Remove the brace stack. * Added IdentifierLookup. * Add handling for >> style template close. * Improved identifier handling/keywords. * Added ability to check if reader is at cursor position. * Handling of an unspecified root type. * Parsing code comments. Tidy up some parsing - to use advanceIf functions more. * Improve path handling. * Fixes around changes to Path interface. * Working Range, Type and Scope header. * Extract the middle part of marker and put in output. Gives more flexibility at macro injection, and in class definitions. * Split DERIVED types into it's own macro, to provide way to generate for derived types. * Fix clang/g++ compile issue. * Tabs -> spaces. * Fix small bug in getFileNameWithoutExt * Small improvement around naming. Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>
2020-05-04	Make stdlib WaveActive* call WaveMask* (#1336)	Tim Foley
	This change makes the various `WaveActive()` functions have default implementations that call `WaveMask()` passing `WaveActiveMask()`. The new definitions will be used during CUDA code generation, which simplifies some of the duplication that was occuring in the `__target_intrinsic` modifiers. This change does not add logic to make computation of `WaveGetActiveMask()` corect on CUDA, so these functions will still fail to provide the behavior that users need/expect. A future change will need to add logic to synthesize the value of `WaveGetActiveMask()` automatically.
2020-05-01	Improve GLSL coverage of boolean binary ops (#1335)	Tim Foley
	* Improve GLSL coverage of boolean binary ops This change ensures that the `&&`, `\|\|`, `&`, `\|`, and `^` apply correctly to vectors of `bool` values when targetting GLSL. Most of the changes are in the GLSL emit path, where the IR instructions for these operators are bottlenecked through a small set of helper routines to cover the different cases. In general: * The vector variants of the operations are implemented by casting to `uint` vectors, performing bitwise ops, then casting back * The scalar variants are handled by conveting the bitwise operations to their equivalent logical operator (the one interesting case there is bitwise `^` where the equivalent logical operation on `bool` is `!=`) This change makes it clear that our IR really shouldn't have distinct opcodes for logical vs. bitwise and/or/xor, and instead should just have a single family of operations where the behavior differs based on the type of the operand. That is already de facto the way things work (a user can always write `&`, `\|` and `^` and expect them to work on `bool` and vectors of `bool`), so that the GLSL output path has to deal with the overlap. Having two sets of IR ops here actually makes for more code instead of less. * Fixups: review feedback and test ! operator