<feed xmlns='http://www.w3.org/2005/Atom'>
<title>slang.git/source/slang/slang-modifier-defs.h, branch master</title>
<subtitle>Making it easier to work with shaders</subtitle>
<id>https://git.yummers.dev/slang.git/atom?h=master</id>
<link rel='self' href='https://git.yummers.dev/slang.git/atom?h=master'/>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/'/>
<updated>2020-05-08T18:31:40+00:00</updated>
<entry>
<title>AST nodes using C++ Extractor (#1341)</title>
<updated>2020-05-08T18:31:40+00:00</updated>
<author>
<name>jsmall-nvidia</name>
<email>jsmall@nvidia.com</email>
</author>
<published>2020-05-08T18:31:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=798f3bc2236ce81499b05662dc11e7c071e7cde8'/>
<id>urn:sha1:798f3bc2236ce81499b05662dc11e7c071e7cde8</id>
<content type='text'>
* Extractor builds without any reference to syntax (as it will be helping to produce this!).

* Change macros to include the super class.

* WIP replacing defs files.

* Added indexOf(const UnownedSubString&amp; in) to UnownedSubString.
Refactored extractor
* Output a macro for each type with the extracted info - can be used during injection in class
* Simplify the header file - as can get super type and last from macro now
* Store the 'origin' of a definition

* Some small tidy ups to the extractor.

* Improve comments on the extractor options.

* Made CPPExtractor own SourceOrigins

* Small fixes around SourceOrigin.

* Small tidy up around macroOrign

* WIP Visitor seems now to work correctly.
Split out types used by ast into slang-ast-support-types.h

* Fix remaining problems with C++ extractor being used with AST nodes.
Add CountOf to extractor type ids.
Added ReflectClassInfo::getInfo to turn an ASTNodeType into a ReflectClassInfo

* Fix compiling on linux.
Fix typo in memset.

* Small tidy up around comments/layout.
Moved NodeBase casting to NodeBase.

* Make premake generate project that builds with cpp-extractor for AST.

* Get the source directory from the filter in premake.

* Fix typo in source path

* Explicitly set the source path for  premake generation for AST.

* Special case handling of override to appease Clang.

* Use a more general way to find the slang-ast-reflect.h file to run the extractor.

* Appveyor is not triggering slang-cpp-extractor - try putting dependson together.

* Put building slang-cpp-extractor first.

* Disable some project options to stop MSBuild producing internal compiler errors.

* Try reordering the projects in premake5.lua

* Hack to try and make slang-cpp-extractor built on appveyor.

* Disable flags - not required for MSBuild on appveyor.

* Disable flags not required for build on AppVeyor.

* Updated Visual Studio projects with slang-cpp-extractor.

* Added Visual Studio slang-cpp-extractor project.</content>
</entry>
<entry>
<title>CUDA version handling (#1301)</title>
<updated>2020-03-30T23:23:09+00:00</updated>
<author>
<name>jsmall-nvidia</name>
<email>jsmall@nvidia.com</email>
</author>
<published>2020-03-30T23:23:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=ea7690558bca71ce3a9453adff4e0135352a352f'/>
<id>urn:sha1:ea7690558bca71ce3a9453adff4e0135352a352f</id>
<content type='text'>
* render feature for CUDA compute model.

* Use SemanticVersion type.

* Enable CUDA wave tests that require CUDA SM 7.0.
Provide mechanism for DownstreamCompiler to specify version numbers.

* Enabled wave-equality.slang

* Allow the CUDA SM major version to be more than a single digit.

* Fix assert.

* DownstreamCompiler::Version -&gt; CapabilityVersion</content>
</entry>
<entry>
<title>Define compound intrinsic ops in the standard library (#1273)</title>
<updated>2020-03-16T16:03:19+00:00</updated>
<author>
<name>Tim Foley</name>
<email>tfoleyNV@users.noreply.github.com</email>
</author>
<published>2020-03-16T16:03:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=256a20a163ef6ee93a817472adcb24c076b0c0dc'/>
<id>urn:sha1:256a20a163ef6ee93a817472adcb24c076b0c0dc</id>
<content type='text'>
* Define compound intrinsic ops in the standard library

The current stdlib code has a notion of "compound" intrinsic ops, which use the `__intrinsic_op` modifier but don't actually map to a single IR instruction.
Instead, most* of these map to multiple IR instructions using hard-coded logic in `slang-ir-lower.cpp`.

(* One special case is `kCompoundOp_Pos` that is used for unary `operator+` and that maps to *zero* IR instructions)

All of the opcodes that used to use the `kCompoundOp_` enumeration values now have definitions directly in the stdlib and use the new `[__unsafeForceInlineEarly]` attribute to ensure that they get inlined into their use sites so that the output code is as close as possible to the original.

For the most part, generating the stdlib definitions for the compound ops is straightforward, but here are some notes:

* The unary `operator+` I just defined directly in Slang code, since it doesn't share much structure with other cases

* The unary increment/decrement ops are generated as a cross product of increment/decrement and prefix/postfix. The logic is a bit messy but given that we have scalar, vector, and matrix versions to deal with it still saves code overall

* Because all the compound/assignment cases got moved out, the existing code for generating unary/binary ops can be simplified a bit

* All the no-op bit-cast operations like `asfloat(float)` are now inline identity functions

* A few other small cleanups are made by not having to worry about the compound ops (which used to be called "pseudo ops") sometimes being encoded into the same type of value as a real IR opcode.

The one big detail here is a fix for how IR lowering works for `let` declarations: they were previously being `materialize()`d which only guarantees that they've been placed in a contiguous and addressable location, but doesn't actually convert them to an r-value. As a result a `let` declaration could accidentally capture a mutable location by reference, which is definitely *not* what we wanted it to do. Fixing this was needed to make the new postfix `++` definition work (several existing tests end up covering this).
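
To make the aliasing problem concrete, here is a small model in Python rather than Slang IR. It is only a sketch: the `Cell` class and both function names are hypothetical stand-ins for a mutable location and for the two ways a `let` in a postfix `++` definition could be lowered.

```python
class Cell:
    """A mutable storage location (hypothetical stand-in for an l-value)."""

    def __init__(self, value: int) -> None:
        self.value = value


def post_increment_by_reference(x: Cell) -> int:
    # Buggy model: `old` aliases the mutable location instead of copying it.
    old = x
    x.value = old.value + 1
    return old.value  # observes the value after the store


def post_increment_by_value(x: Cell) -> int:
    # Fixed model: `old` is an r-value snapshot taken before the store.
    old = x.value
    x.value = old + 1
    return old


a = Cell(5)
b = Cell(5)
print(post_increment_by_reference(a), a.value)  # 6 6
print(post_increment_by_value(b), b.value)      # 5 6
```

Only the second version returns the value from before the increment, which is what a postfix `++` has to produce.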

One important forward-looking note:

One subtle (but significant) choice in this change is that we actually reduce the number of declarations in the stdlib, because instead of having the compound operators include both per-type and generic overloads (just listing scalar cases for now):

    float operator+=(in out float left, float right) { ... }
    int operator+=(in out int left, int right) { ... }
    ...
    T operator+= &lt;T:__BuiltinBlahBlah&gt;(in out T left, T right) { ... }

We now have *only* the single generic version:

    T operator+= &lt;T:__BuiltinBlahBlah&gt;(in out T left, T right) { ... }

In running our current tests, this change didn't lead to any regressions (perhaps surprisingly).

Given that we were able to reduce the number of overloads for `operator+=` by a factor of N (where N is the number of built-in types), it seems worth considering whether we could also reduce the number of overloads of `operator+` by the same factor by only having generic rather than per-type versions.

One concern that this forward-looking question raises is whether the quality of diagnostic messages around bad calls to the operators might suffer when there are only generic overloads instead of per-type overloads. In order to feel out this problem I added a test case that includes some bad operator calls both to `+=` (which is now only generic with this change) and `+` (which still has per-type overloads). Overall, I found the quality of the error messages (in terms of the candidates that get listed) isn't perfect for either, but personally I prefer the output in the generic case.

As part of adding that test, I also added some fixups to how overload resolution messages get printed, to make sure the function name is printed in more cases, and also that the candidates print more consistently. These changes affected the expected output for one other diagnostic test.

* fixup: disable bad operator test on non-Windows targets</content>
</entry>
<entry>
<title>Add a basic inlining facility for use in the stdlib (#1271)</title>
<updated>2020-03-11T19:53:09+00:00</updated>
<author>
<name>Tim Foley</name>
<email>tfoleyNV@users.noreply.github.com</email>
</author>
<published>2020-03-11T19:53:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=69f7d288313eb238bfb42943694dfcd9bb911d3e'/>
<id>urn:sha1:69f7d288313eb238bfb42943694dfcd9bb911d3e</id>
<content type='text'>
The main feature visible to the stdlib here is the `[__unsafeForceInlineEarly]` attribute, which can be attached to a function definition and forces calls to that function to be inlined immediately after initial IR lowering.

The pass is implemented in `slang-ir-inline.{h,cpp}` and currently only handles the completely trivial case of a function with no control flow that ends with a single `return`. The lack of support for any other cases motivates the `__unsafe` prefix on the attribute.
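
The trivial case can be pictured with a toy model in Python. This is a sketch under stated assumptions, not the code in `slang-ir-inline.cpp`: the `Func` type, the tuple-based instruction encoding, and `inline_call` are all hypothetical. The callee is straight-line code with one returned value, so inlining amounts to cloning its instructions with arguments substituted for parameters.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# An instruction is (result name, opcode, operand names).
Inst = Tuple[str, str, Tuple[str, ...]]


@dataclass
class Func:
    params: List[str]
    body: List[Inst]  # straight-line code, no control flow
    returned: str     # the value named by the single trailing `return`


def inline_call(callee: Func, args: List[str], prefix: str) -> Tuple[List[Inst], str]:
    """Clone the callee body at a call site, substituting arguments for parameters.

    Returns the cloned instructions and the name that replaces the call's result.
    """
    env: Dict[str, str] = dict(zip(callee.params, args))
    cloned: List[Inst] = []
    for result, opcode, operands in callee.body:
        mapped = tuple(env.get(o, o) for o in operands)
        fresh = prefix + result  # rename to avoid clashing with caller names
        cloned.append((fresh, opcode, mapped))
        env[result] = fresh
    return cloned, env.get(callee.returned, callee.returned)


# A comma-operator-like function: takes two operands and returns the second.
comma = Func(params=["left", "right"], body=[], returned="right")
print(inline_call(comma, ["a", "b"], "inl0_"))

# A slightly larger straight-line callee: r = (x * y) + x.
madd = Func(
    params=["x", "y"],
    body=[("t", "mul", ("x", "y")), ("r", "add", ("t", "x"))],
    returned="r",
)
print(inline_call(madd, ["a", "b"], "inl1_"))
```

For the comma-like callee no instructions are cloned and the call is simply replaced by its second argument, which is why the inlined output can match hand-written code.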

In order to test that the pass works, I modified the "comma operator" in the standard library to be defined directly (rather than relying on special-case handling in IR lowering), and then added a test that uses that operator to make sure it generates code as expected. The compute version of the test confirms that we generate semantically correct code for the operator, while the SPIR-V cross-compilation test confirms that our output matches GLSL where the comma operator has been inlined, rather than turned into a subroutine.

Notes for the future:

* With this change it should be possible (in principle) to redefine all the compound operators in the stdlib to instead be ordinary functions with the new attribute, removing the need for `slang-compound-intrinsics.h`.

* Once the compound intrinsics are defined in the stdlib, it should be easier/possible to start making built-in operators like `+` be ordinary functions from the standpoint of the IR

* The attribute and pass here could be extended to include an alternative inlining attribute that happens later in compilation (after linking) but otherwise works the same. This could in theory be used for functions where we don't want to inline the definition into generated IR, but still want to inline things before generating final HLSL/GLSL/whatever.

* The inlining pass itself could be generalized to work for less trivial functions pretty easily; for the most part it would just mean "splitting" the block with the call site and then inserting clones of the blocks in the callee. Any `return` instructions in the clone would become unconditional branches (with arguments) to the block after the call (which would get a parameter to represent the returned value).

* The "hard" part for such an inlining pass would be handling cases where the control flow that results from inlining can't be handled by our later restructuring passes. The long-term fix there is to implement something like the "relooper" algorithm to restructure control flow as required for specific targets.</content>
</entry>
<entry>
<title>Feature/glslang spirv version (#1256)</title>
<updated>2020-03-05T15:59:54+00:00</updated>
<author>
<name>jsmall-nvidia</name>
<email>jsmall@nvidia.com</email>
</author>
<published>2020-03-05T15:59:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=6684d32db1f5693bcfb4971558cb30e855cd3bad'/>
<id>urn:sha1:6684d32db1f5693bcfb4971558cb30e855cd3bad</id>
<content type='text'>
* WIP add support for __spirv_version .

* Added IRRequireSPIRVVersionDecoration

* SPIR-V version passed to glslang.
Enable VK wave tests.
Split ExtensionTracker out, so can be cast and used externally to emit.
Added SourceResult.

* Fix warning on Clang.

* Missing hlsl.meta.h

* Refactor communication/parsing of __spirv_version with glslang.

* Fix some debug typos.
Be more precise in substring handling.

* Make glslang forwards and backwards binary compatible.

* Small comment improvements.

* Added slang-spirv-target-info.h/cpp

* Fix for major/minor on gcc.

* Another fix for gcc/clang.

* VS projects include slang-spirv-target-info.h/cpp

* Removed SPIRVTargetInfo
Added SemanticVersion.
Don't bother with passing a target to glslang. Should be separate from 'version'.

* Renamed slang-emit-glsl-extension-tracker.cpp/.h -&gt; slang-glsl-extension-tracker.cpp/.h
Fixed some VS project issues.

* Fix a comment.

* Added slang-semantic-version.cpp/.h

* Added slang-glsl-extension-tracker.cpp/.h

* Added a split that can check that the input has all been parsed.

* Fix problem on x86 win build.
</content>
</entry>
<entry>
<title>__spirv_version Decoration (#1255)</title>
<updated>2020-03-03T23:41:07+00:00</updated>
<author>
<name>jsmall-nvidia</name>
<email>jsmall@nvidia.com</email>
</author>
<published>2020-03-03T23:41:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=5951d2a45f3546a619fb5b032a4a422229c46e4c'/>
<id>urn:sha1:5951d2a45f3546a619fb5b032a4a422229c46e4c</id>
<content type='text'>
* WIP add support for __spirv_version .

* Added IRRequireSPIRVVersionDecoration
</content>
</entry>
<entry>
<title>Add attributes to enable dual-source blending on Vulkan (#1210)</title>
<updated>2020-02-10T18:25:29+00:00</updated>
<author>
<name>Tim Foley</name>
<email>tfoleyNV@users.noreply.github.com</email>
</author>
<published>2020-02-10T18:25:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=60dfb62e638a06ebdcef27138b63033b828ec2ef'/>
<id>urn:sha1:60dfb62e638a06ebdcef27138b63033b828ec2ef</id>
<content type='text'>
This change adds support for the `[[vk::location(...)]]` and `[[vk::index(...)]]` attributes, which can be used together to mark up shader outputs for dual-source blending on Vulkan. HLSL/Slang code like the following:

```hlsl
struct Output
{
    [[vk::location(0)]]
    float4 a : SV_Target0;

    [[vk::location(0), vk::index(1)]]
    float4 b : SV_Target1;
}

[shader("fragment")]
Output main(...) { ...}
```

can be used to set up dual-source blending on both D3D and Vulkan APIs. The output GLSL for the above will look something like:

```glsl
layout(location = 0)            out vec4 a;
layout(location = 0, index = 1) out vec4 b;

void main() { ... }
```

The more or less straightforward parts of this change were:

* Added new `attribute_syntax` declarations to the stdlib, for `[[vk::location(...)]]` and `[[vk::index(...)]]`

* Added new AST node types for the new attribute cases, sharing a base class so that argument checking can be shared

* Added checks for the arguments to the new attributes in `slang-check-modifier.cpp` (eventually this kind of logic shouldn't be needed for new attributes)

* Updated GLSL emit logic so that it treats the `index`/`space` parts of a variable layout as the `location`/`index` for varying parameters.

* Updated GLSL legalization so that when it translates entry-point parameters into globals (and scalarizes structures) it handles both a binding index and space for the parameters.

* Added a cross-compilation test case to verify that the basics of the feature work

The remaining work is all in `slang-parameter-binding.cpp`.

There is some work that isn't technically related to this change (and which could be reverted if it causes problems), around the detection and handling of fragment shader outputs with `SV_Target` semantics. The basic changes (which could be backed out and then merged separately) are:

* Made the special-case `SV_Target` logic only trigger for fragment shaders (that is the only place where `SV_Target` should appear, but we weren't guarding against it)

* Made the logic to reserve a `u&lt;N&gt;` register for `SV_Target&lt;N&gt;` only trigger for D3D Shader Model 5.0 and below (since it is not required for SM 5.1 and up). This could be a breaking change for some users, but that seems unlikely.

* Fixed one test case that relied on the behavior of reserving `u0` for `SV_Target0` even though it was a SM6.0 test.

* Also added more comments to the system-value handling logic.

The more interesting changes come up starting in `processEntryPointVaryingParameterDecl()`. The basic issue is that we have so far only supported implicit layout for varying parameters on GLSL/Vulkan, but the `[[vk::location(...)]]` attribute is a form of explicit layout annotation. Rather than try to kludge something that only works in narrow cases, I instead opted to try to fix things more generally.

In `processEntryPointVaryingParameterDecl()` we now check for the `location` and `index` attributes when we are on "Khronos" targets (Vulkan/OpenGL/GLSL) and immediately add them to the variable layout being constructed if they are found. There is nothing in this logic specific to fragment-shader outputs, so this feature now applies to any varying input/output on Khronos targets.

Allowing explicit layouts creates the potential for mixing implicit and explicit layout. For example, consider:

```hlsl
struct Output
{
    float4 color : COLOR;
    [[vk::location(0)]] float3 normal : NORMAL;
}
```

What `location` should `color` get? Should this code be an error? There are two cases where this conundrum can come up: when working with `struct` types used for varying parameters, and the entry-point parameter list itself.

For the varying `struct` case we currently make an expedient choice. We handle fields with either implicit or explicit layout with appropriate logic, but logic that doesn't account for the case of mixing the two. Then at the end of layout for the `struct` we issue an error if there was a mix of implicit and explicit layout (such that our results aren't likely to be valid).

For the entry point varying parameter case, things were already using a `ScopeLayoutBuilder` type (that encapsulates some logic shared between entry-point and global parameters). The entry-point-specific bits were moved out into a `SimpleScopeLayoutBuilder` and it was updated so that rather than assuming all parameters use implicit layout it does a two-phase layout approach similar to what we use for the global scope:

* First all parameters are enumerated to collect explicit bindings and mark certain ranges as "used"

* Next the parameters are enumerated again and those without explicit bindings get allocated space using a "first fit" algorithm
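
As a rough illustration of the two phases, here is a simplified sketch in Python. It is not the code in `slang-parameter-binding.cpp`: the names `Param` and `assign_locations` are hypothetical, each parameter is assumed to occupy exactly one location, and the `index` part of a binding is ignored.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Param:
    name: str
    explicit_location: Optional[int] = None  # e.g. from [[vk::location(N)]]


def assign_locations(params: List[Param]) -> Dict[str, int]:
    """Two-phase layout: explicit bindings first, then first fit for the rest."""
    used = set()
    result: Dict[str, int] = {}

    # Phase 1: record explicit bindings and mark their locations as used.
    for p in params:
        if p.explicit_location is not None:
            used.add(p.explicit_location)
            result[p.name] = p.explicit_location

    # Phase 2: give every remaining parameter the lowest unused location.
    for p in params:
        if p.explicit_location is None:
            loc = 0
            while loc in used:
                loc += 1
            used.add(loc)
            result[p.name] = loc

    return result


layout = assign_locations([
    Param("color"),
    Param("normal", explicit_location=0),
    Param("uv"),
])
print(layout)
```

With these inputs `normal` keeps its explicit location 0, while `color` and `uv` are placed at 1 and 2, the first free slots found in declaration order.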

In principle we could extend the two-phase approach to apply to `struct` types as well, but that would be best saved for a future refactoring of some of this parameter binding logic, since I would like to exploit more of the opportunities for sharing code across the uniform/varying and struct/entry-point/global cases.

Moving the point where entry point parameters get their offsets assigned made it necessary to move some of the logic that removes varying parameter usage (and other things that shouldn't "leak" out of an entry point) to a different point in the entry point layout process.

While adding these various pieces does not quite enable us to support explicit bindings on entry point parameters (e.g., putting `uniform Texture2D t : register(t0)` in an entry point parameter list) or in `struct` types (e.g., explicit `packoffset` annotations on fields), it starts to provide some of the infrastructure that we'd need in order to support those cases.</content>
</entry>
<entry>
<title>Clean up the concept of "pseudo ops" (#1136)</title>
<updated>2019-11-23T02:54:38+00:00</updated>
<author>
<name>Tim Foley</name>
<email>tfoleyNV@users.noreply.github.com</email>
</author>
<published>2019-11-23T02:54:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=c6393465795e700a68458ff618041104f89ed42b'/>
<id>urn:sha1:c6393465795e700a68458ff618041104f89ed42b</id>
<content type='text'>
* Clean up the concept of "pseudo ops"

Built-in functions in the Slang standard library can be marked with `__intrinsic_op(...)` to indicate that they should not lower to functions in the IR, and that instead call sites to those functions should be translated directly to the IR.
There are two cases where `__intrinsic_op(...)` gets used:

1. In the case where the argument to `__intrinsic_op(...)` is an actual IR instruction opcode, the IR lowering logic directly translates a call into an instruction with the given opcode. The arguments to the call become the operands of the instruction.

2. In the case where the argument to `__intrinsic_op(...)` is one of a set of "pseudo" instruction opcodes, the IR lowering logic directly handles the lowering to IR with dedicated code. The operands to the call might be handled differently depending on the kind of operation.

The compound operators like `+=` are the most important example of these "pseudo" instructions.
It doesn't make sense to handle them as true function calls (although that would work semantically), nor does it make sense to have a single IR instruction with such complicated semantics.

An earlier version of the compiler used the same enumeration for both the true IR instruction opcodes and these "pseudo" opcodes, with the simple constraint that the pseudo opcodes were all negative while the real opcodes were positive. That design got changed up over a few refactorings, and because there was never a good explanation in the code itself of what "pseudo" opcodes were, we eventually ended up in a place where the in-memory and serialized IR encodings included logic to try to deal with the possibility of these "pseudo" opcodes, even though the entire design of the lowering pass meant that they'd never appear in generated IR.

This change tries to clean up the mess in a few ways:

* The terminology is now that these are "compound" intrinsic ops, to differentiate them from the more common case of intrinsic ops that map one-to-one to IR instructions.

* The declaration of the compound intrinsic ops is no longer in a file related to the IR, and doesn't use the `IR` naming prefix, so somebody looking at the IR opcodes cannot become confused and think the compound ops are allowed there.

* The IR encoding in memory and when serialized is updated to not account for or worry about the possibility of "pseudo" ops.

* The compound ops are declared in such a way that ensures their enumerant values are all negative, so that they are yet again trivially disjoint from the true IR opcodes.
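
The disjointness in the last bullet can be sketched in Python. The enumerant names below are hypothetical stand-ins, not the actual Slang opcodes; the point is only that a sign check is enough to tell the two families apart.

```python
from enum import IntEnum


class IROp(IntEnum):
    # Hypothetical stand-ins for true IR opcodes: all non-negative.
    ADD = 0
    MUL = 1
    LOAD = 2


class CompoundOp(IntEnum):
    # Hypothetical stand-ins for compound intrinsic ops: all negative.
    ADD_ASSIGN = -1
    PRE_INC = -2
    POST_INC = -3


def is_compound(op_value: int) -> bool:
    """A negative value can only name a compound op, never an IR opcode."""
    return 0 > op_value


def lowering_strategy(op_value: int) -> str:
    if is_compound(op_value):
        # Handled by dedicated lowering code; never stored in the IR itself.
        return "dedicated lowering for " + CompoundOp(op_value).name
    # The call becomes one instruction whose operands are the call arguments.
    return "single instruction " + IROp(op_value).name


print(lowering_strategy(IROp.ADD))               # single instruction ADD
print(lowering_strategy(CompoundOp.ADD_ASSIGN))  # dedicated lowering for ADD_ASSIGN
```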

A more drastic change might have split `__intrinsic_op` into two different modifier types: one for the trivial single-instruction case and one for the compound case.
Doing this would make the change more invasive, though, because there are places in the meta-code that generates the standard library that intentionally handle both single-instruction and compound ops (because built-in operators can translate to either case).

* fixup: missing file

* cleanups based on review feedback
</content>
</entry>
<entry>
<title>Support for [__extern] attribute (#1111)</title>
<updated>2019-11-06T19:11:41+00:00</updated>
<author>
<name>jsmall-nvidia</name>
<email>jsmall@nvidia.com</email>
</author>
<published>2019-11-06T19:11:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=835eb1b6a6d75c206fc65cf5e9e5ac132c5200a0'/>
<id>urn:sha1:835eb1b6a6d75c206fc65cf5e9e5ac132c5200a0</id>
<content type='text'>
* Added RiffReadHelper

* Move type to fourCC in Chunk simplifies some code.

* Make MemoryArena able to track external blocks.
Allow ownership of Data to vary.
Changed IR serialization to use moved allocations to avoid copies.

As it turns out all of the array writes could use unowned data, but doing so requires the IRData to stay in scope longer than IRSerialData, which it does at the moment - but perhaps needs better naming or a control for the feature.

* Write out slang-module container.

* WIP on -r option.
Loading modules - with -r.

* Making the serialized-module run (without using imported module).

* Split compiling module from the test.

* Separate module compilation with a function working.

* Remove serialization test as not used.

* Fix warning on gcc.

* Updated test to have types across module boundary.

* Allow entry point declaration.
A test that tries to build with just an entry point declaration and a module.

* Try to make link work with multiple modules.

* Multi module linking first pass working.

* Multi module test working with -module-name option

* Added to the repro manifest an approximation of the command line that was used.

* Use isDefinition - for determining to add decorations to entry point lowering.

* Added support for repo-file-system.h
More precise control of CacheFileSystem.
Allow RelativeFileSystem to strip paths optionally.
Use canonical paths in PathInfo cache.
Fix bug in -D options for command line output of StateSerailizeUtil

* Add missing slang-options.h

* Fix bug in slang-state-serialize.cpp with bit removal.

* Added documentation around -repro-file-system
Added spLoadReproAsFileSystem function.

* Fix warning.

* spAddLibraryReference

* * Add support for slang-lib extension
* Container output when using -no-codegen option

* Use the m_containerFormat to determine if the module container is constructed.
Store the result in a blob. This allows for potential access via the API.
Write the blob if a filename is set.
Use m_ prefix for container variables.

* Added spGetContainerCode.
Made spGetCompileRequestCode work.

* * Put obfuscateCode on linkage
* Remove obfuscation from variable names - as can be achieved by either stripping and/or removing NameHintDecorations at lowering
* Remove name hints being added during lowering
* Add stripping of SourceLoc location in strip phase

* Hashing of linkage import/export names.

* Do final strip in emitEntryPoint, removes any remaining SourceLoc.

* Support for [__extern] to mark struct/function that are defined elsewhere.

* Allow adding extern to any decl.

* Use ExternAttribute to apply import decoration, rather than using an IR extern decoration.

* Added a test for [__extern]

* Improved comment around [__extern]
</content>
</entry>
<entry>
<title>IR types for subset of Attributes (#1067)</title>
<updated>2019-10-04T13:46:03+00:00</updated>
<author>
<name>jsmall-nvidia</name>
<email>jsmall@nvidia.com</email>
</author>
<published>2019-10-04T13:46:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=7c8527d20e433c3a10736136d31e4cd882a3baaa'/>
<id>urn:sha1:7c8527d20e433c3a10736136d31e4cd882a3baaa</id>
<content type='text'>
* IROutputControlPointsDecoration

* IROutputTopologyDecoration

* IRPartitioningDecoration

* IRDomainDecoration

* Use IRPatchConstantDecoration alone for hlsl output.

* IRMaxVertexCountDecoration

* IRInstanceDecoration

* Removed  _emitHLSLAttributeSingleString and _emitHLSLAttributeSingleInt
Removed GLSLBindingAttribute and just use NumThreadsAttribute

* Added IRNumThreadsDecoration.

* Added IRNumThreadsDecoration

* Fix build problem on x86.
Improve diagnostic text based on review.
</content>
</entry>
</feed>
