slang.git - Making it easier to work with shaders

Age	Commit message (Collapse)	Author
2025-09-02	render-test: Change D3D12 default to sm_6_5 (#8320)	James Helferty (NVIDIA)
	Changes default for render-test to sm_6_5. Since sm_6_5 is the new default, remove the -use-dxil option, add -use-dxcb option Remove -use-dxil option from all test cases. Add -use-dxcb to two tests that needed it. Fixes #7611
2025-08-01	Fix 7441: CUDA boolean vector layout to use 1-byte elements (#7862)	Harsh Aggarwal (NVIDIA)
	* Fix 7441: CUDA boolean vector layout to use 1-byte elements Boolean vectors (bool1, bool2, bool3, bool4) were incorrectly implemented as integer-based types using 4 bytes per element instead of actual 1-byte boolean elements on CUDA targets. Changes: - Update CUDA prelude to define boolean vectors as structs with bool fields instead of typedef aliases to integer vectors - Implement CUDALayoutRulesImpl::GetVectorLayout to use 1-byte alignment for boolean vectors, matching actual CUDA memory layout behavior - Update make_bool functions to populate struct fields correctly This ensures boolean vectors have the same memory layout as bool[4] arrays: - bool1: 1 byte (was 4 bytes) - bool2: 2 bytes (was 8 bytes) - bool3: 3 bytes (was 12 bytes) - bool4: 4 bytes (was 16 bytes) Fixes memory layout mismatch between Slang reflection API and actual CUDA compilation, achieving 75% memory savings for boolean vector usage. * Fix CI issues - Add and update associated functions and operators * Make boolX same as uchar * Use align construct on struct for boolX * Improve Test case for robust alignment checks * Formatting * Disable selected slangpy tests * add metal check which is slightly different than cuda * Test-1 * Test-2 * Test-3 * Test-4 * ReflectionChange * cleanup and update * _slang_select with plain bool is needed for reverse-loop-checkpoint-test
2025-07-25	Add combined texture-sampler flag to reflection API to differentiate ↵	Copilot
	Texture2D from Sampler2D (#7901) * Initial plan * Add SLANG_TEXTURE_COMBINED_FLAG to differentiate combined texture-samplers Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> * Fix regression in hlsl-to-vulkan-combined test by updating expected output Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: csyonghe <2652293+csyonghe@users.noreply.github.com>
2025-07-09	Avoid adding underscore to _snorm format if it has one (#7664)	aidanfnv

2025-05-15	Add checking for hlsl register semantic. (#7118)	Yong He
	* Add checking for hlsl register semantic. * Fix. * Fix test. * Fix switch error. * Fix tests.
2025-02-27	Add inner texture type to reflection json (#6416)	Devon
	* Add inner texture type to reflection json * Add expected result of test * Adjust test expected results * Fix ci test result --------- Co-authored-by: Yong He <yonghe@outlook.com>
2024-12-21	Fixed stage and result field names in json reflection (#5927)	Stan

2024-12-12	Bump spirv-header/tools/glslang (#5834)	Ellie Hermaszewska
	* Back out "Update SPIRV submodules (#5815)" This backs out commit e50aac13e2c161d672b137a62f6d66820d0f9ff1. * Use upstream spirv-tools * Fix bump-glslang.sh for newer versions of spirv-tools * Use upstream glslang * Add --do-fetch option to bump glslang * Bump glslang and friends Supersedes https://github.com/shader-slang/slang/pull/5815 * Regenerate glslang and spirv-tools outputs * Fixes to slang-glslang * Correct spirv intrinsic for OpImageSampleFootprintNV Note that this currently fails validation with the following error: ``` error: line 145: Result <id> from OpSampledImage instruction must not appear as operand for OpImageSampleFootprintNV, since it is not specified as taking an OpTypeSampledImage. Found result <id> '55[%sampledImage]' as an operand of <id> '56[%resultVal]'. %sampledImage = OpSampledImage %54 %51 %40 ``` This seems to be in error as the spec for SPV_NV_shader_image_footprint states that "Sampled Image must be an object whose type is OpTypeSampledImage" https://refined-github-html-preview.kidonng.workers.dev/KhronosGroup/SPIRV-Registry/raw/refs/heads/main/extensions/NV/SPV_NV_shader_image_footprint.html glslang also seems to fail with the same validation error * Fix spv storage class test
2024-11-06	Fix IntVal unification logic to insert type casts + buffer element lowering ↵	Yong He
	regression. (#5508) * Fix IntVal unification logic to insert type casts. * Fix regression.
2024-11-06	Fix WGSL parameter block binding. (#5500)	Yong He
	* Fix WGSL parameter block binding. * Re-enable tests. * Update failure list. * Fix entrypoint parameters. * Update tests. * Enable stat-var test.
2024-10-29	format	Ellie Hermaszewska
	* format * Minor test fixes * enable checking cpp format in ci
2024-10-15	Enable WebGPU tests in CI (#5239)	Anders Leino

2024-10-09	Use user defined type name in glsl buffer declarations. (#5242)	Yong He

2024-09-30	Add COM API for querying metadata. (#5168)	Yong He
	* Add COM API for querying metadata. * Fix tests. * fix test.
2024-07-30	Fixes for Metal ParameterBlock support. (#4752)	Yong He

2024-07-23	Allow only specific spv storage classes for binding decoration (#4713)	kaizhangNV
	* Allow only specific spv storage classes for binding decoration In https://registry.khronos.org/vulkan/specs/1.3/html/chap37.html#VUID-StandaloneSpirv-DescriptorSet-06491 it states that If a variable is decorated by DescriptorSet or Binding, the Storage class must be UniformConstant, Uniform and StorageBuffer. So apply this rule to our emit-spirv logic. * Add a unit test * Address few comments
2024-07-19	Support parameter block in metal shader objects. (#4671)	Yong He
	* Support parameter block in metal shader objects. * Ingore parameter block tests on devices without tier2 argument buffer. * Fix warning. * Fix texture subscript test. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2024-07-10	Neaten filecheck output for non-passing nested-parameter-block test (#4592)	Ellie Hermaszewska
	With this context filecheck can generate a better error when it fails
2024-06-13	Metal: misc fixes and enable more tests. (#4374)	Yong He
	* Fix and enable tests for metal. * Fix. * Fix. * Fix tests. * Fix warnings. * Fix. --------- Co-authored-by: Yong He <yonghe@Yongs-Mac-mini.local>
2024-06-12	Implement Sampler2D for CPP target (#4371)	Jay Kwak
	Closes #4267 Co-authored-by: Yong He <yonghe@outlook.com>
2024-06-08	SPIRV `Block` decoration fixes. (#4303)	Yong He
	* SPIRV `Block` decoration fixes. - SPIRV does not allow duplicate `Block` decorations. So we shouldn't be generating them. - Also fixes duplication of OpName. - SPIRV and HLSL do not allow ConstantBuffer with trailing unsized arrays. Added a check in the front-end against such code. * Convert failing cross-compile tests to filecheck. --------- Co-authored-by: Jay Kwak <82421531+jkwak-work@users.noreply.github.com>
2024-06-07	Metal compute tests (#4292)	skallweitNV

2024-05-29	Improve compile time performance. (#3857)	Yong He
	* Handle type check cache update on extensions more gracefully. * Correctness fix. * Cache implcit cast overload resolution results. * Fix. * More optimizations. * Cache implicit default ctor resolution. * Disable redundancy removal. * Fix. * Fix test. * Fix. * Correctness fix. * Fix. * Fix, * Fix test. * Small tweak.
2024-03-25	Fix default space assignment. (#3833)	Yong He

2024-03-23	Make `-no-mangle` option work, add `-no-hlsl-binding`. (#3817)	Yong He

2024-03-01	Make slangc commandline parsing compatible with renderdoc. (#3658)	Yong He
	* Make slangc commandline parsing compatible with renderdoc. * Fix tests.
2024-01-24	[SPIRV] Support `globallycoherent` and `[vk::index()]`. (#3488)	Yong He
	* [SPIRV] Support `globallycoherent` modifier. * Fix. * Disable executable cooperative vector tests. * Update expected failure. * [SPIRV] Emit varying output index decoration. * Add test. * Update tests. * Fix test. * Emit `SpvExecutionModeEarlyFragmentTests`. * Lower `StructuredBuffer<bool>`. * Support globallycoherent on ByteAddressBuffer. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-12-06	Support visibility control and default to `internal`. (#3380)	Yong He
	* Support visibility control and default to `internal`. * Fix wip. * Fixes. * Fix. * Fix test. * Add legacy language detection and compatibility for existing code. * Add doc. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-11-01	Parameter binding and gfx fixes. (#3302)	Yong He
	* Parameter binding and gfx fixes. * Add diagnostics on entry point parameters. * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-09-05	SPIR-V image operations (#3163)	Ellie Hermaszewska
	* Add __truncate and __sampledType for spirv_asm Allows some texture tests to start passing * add __isVector Currently unused * Add 1-vector legalization pass (WIP) * Add capabilities for image types * neaten instruction dumping * add 1-vector test * Add a couple of cases to vec1 legalization * Remove texture tests from expected failures * comment * regenerate vs projects * Remove redundant define form synchapi emulation * refactoring image methods * All sample functions refactored * Remove incorrect glsl intrinsics Partially addresses https://github.com/shader-slang/slang/issues/3174 * __subscript image ops via writing funcs * Extract texture struct writing from core.meta.slang * Abstract out cuda intrinsic * Remvoe erroneous call to opDecorateIndex * spirv asm IR utils * Correct position of loads for SPIR-V asm inst operands * Raise constructors to global scope during spir-v legalization * Correct snippet output * Implement most texture sampling ops for SPIR-V * Legalize 1-vectors for glsl too * Make SPIR-V inst operands non-hoistable * Better 1-vector legalization * Put textures in ptrs for spirv * insert missing break * Add vec1 legalization test * Add some missing pieces to slang-ir-insts * Greatly neaten vec1 legalization * a * Neaten vec1 legalization * Add image read and write intrinsics for spir-v * Squash warnings * regenerate vs projects * Drop redundant guards * Drop 5 tests from expected failure list * Inst numbering changes to cross compile tests * vec1 legalization tests only on vk * Correct location of asm op emit * Inline constant in spirv-asm * Correct signedness for lane in wave intrinsics * Extract element from float1 for cuda * squash warnings * Neaten spirv-emit * dedupe more capabilities * warnings * neaten assert * comments * comments
2023-09-03	Proper lowering of functiosn that returns NonCopyable values. (#3179)	Yong He
	* Proper lowering of functiosn that returns NonCopyable values. * Fix tests. * Fix clang errors. * Fix. * Fix clang error. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-07-27	Fix -fvk-u-shift not applying to global constant buffer. (#3033)	Yong He
	* Fix -fvk-u-shift not applying to global constant buffer. * Fix test. * Fix. * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-07-26	Fix -fvk-u-shift not applying to RWStructuredBuffer on glsl output. (#3027)	Yong He
	* Fix -fvk-u-shift not applying to RWStructuredBuffer on glsl output. * Add `transpose` to `ObjectToWorld4x3`. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-07-19	Support for vk-shift-* without explicit bindings (#3000)	jsmall-nvidia
	* Improvements to HLSLToVulkanLayoutOptions. * WIP vk-shift-* with HLSL like binding. Detecting clashes. * Shift example seems to be working correctly. One oddness is that "used" data is now reflected, as we only enable for D3D shader resource types. Now we use those with inferred VK mode they appear. * Implicit seems to work. * Disable inference with Sampler/CombinedTextureSampler. I guess? we could just use the HLSL texture register binding to infer. * Report overlapping ranges in diagnostic. The hlsl-to-vulkan-shift-diagnostic result might be surprising but it is correct, because u is automatically laid out so consumes DescriptorSlot 0, but that's already consumed by c. * First attempt at array layout with infer on Vulkan. * Fix the vulkan shift output. * Array example.
2023-07-14	Fix vk-shift-* mapping issue (#2993)	jsmall-nvidia
	* Fix vk-shift-* mappings. * Add some doc info about vk-shift. * Fix diagnostic test.
2023-06-28	Add support for vk::image_format attribute (#2945)	jsmall-nvidia

2023-05-04	Improvements around HLSLToVulkanLayout (#2867)	jsmall-nvidia
	* #include an absolute path didn't work - because paths were taken to always be relative. * Improve the HLSLToVulkanLayoutOptions interface. Add more diagnostics. Add diagnostics test. * Add check for global binding using file check. * Fix issues with some tests around making some diagnostics ids unique. * Small improvements with doc/handling of vk-<>-shift option setup. --------- Co-authored-by: Yong He <yonghe@outlook.com>
2023-05-03	HLSL->Vulkan binding support (#2865)	jsmall-nvidia
	* WIP around VK shift binding. * Refactor around options parsing. * Remove needless passing around of sink. * Some more tidying around OptionsParser. * Handle vulkan shift parsing. * Fix small issue around vk binding and "all". * Fixing some small issues. Missing break. * Split out VulkanLayoutOptions * WIP binding taking into account HLSL->Vulkan options. * First attempt at making binding work with HLSLVulkanOptions. * VulkanLayoutOptions -> HLSLToVulkanLayoutOptions * WIP with HLSL-Vulkan binding. * Some more testing around vk-shift. * Improvements around global binding. More tests. * Improve test coverage. Improve checking for requirements around default space. * Update command line options. * Small fixes. * Small fix in options reporting. * Fix warning issue. * Some fixes for isDefault for HLSLToVulkanLayoutOptions. * Update hlsl-to-vulkan-shift output. The difference was due to default handling if shift isn't specified, and not being specified was not correctly tracked.
2023-05-02	Various dxc/fxc compatibility fixes. (#2863)	Yong He
	* Various dxc/fxc compatibility fixes. * Cleanup. * Fix test cases. * Fix comments. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-02-27	Detect and deduplicate read-only resource access. (#2680)	Yong He
	* Detect and deduplicate read-only resource access. * Fix tests. * Fix tests. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2019-12-06	Remove legacy feature for merging global shader parameters (#1139)	Tim Foley
	* Remove legacy feature for merging global shader parameters There is a fair amount of special-case code in the Slang compiler today to deal with the scenario where a programmer declares the "same" shader parameter across two different translation units: ```hlsl // a.hlsl Texture2D a; cbuffer C { float4 c; } ``` ```hlsl // b.hlsl cbuffer C { float4 c; } Texture2D b; ``` An important note here is that the declaration of `C` may be in a header file that both `a.hlsl` and `b.hlsl` `#include`, because from the standpoint of the parser and later stages of the compiler, there is no difference between `C` being in an included file vs. it being copy-pasted across both `a.hlsl` and `b.hlsl`. When a user invokes `slangc a.hlsl b.hlsl` (or the equivalent via the API), then they may decide that it is "obvious" that the shader parameter `C` is the "same" in both `a.hlsl` and `b.hlsl`. Knowing that the parameter is the "same" may lead them to make certain assumptions: * They may assume that generated code for entry points in `a.hlsl` and `b.hlsl` will both agree on the exact `register`/`binding` occupied by `C`. * They may assume that reflection information for their program will only reflect `C` once, and it will reflect it in a way that is applicable to entry points in both `a.hlsl` and `b.hlsl` * They may assume that the compiler can and should handle this use case even when `C` contains fields with `struct` types that are declared in both `a.hlsl` and `b.hlsl` that have the "same" definition. * They may assume that in cases where `C` is declared inconsistently between `a.hlsl` and `b.hlsl` the compiler can and will diagnose an error. Making these assumptions work in practice required a lot of special-case code: * When composing/linking programs was `ComponentType`s we had to include a special case `LegacyProgram` type that could provide these "do what I mean" semantics, since they are not what one would want in the general case for a `CompositeComponentType`. * During enumeration of global shader parameter in a `LegacyProgram`, we had to detect parameters from distinct modules (translation units) with the same name, and then enforce that they must have the "same" type (via an ad hoc recursive structural type match). No other semantic checking logic needs or uses that kind of structural check. * During parameter binding generation, we need to handle the case where a single global shader parameter might have multiple declarations, and make sure to collect explicit bindings from all of them (checking for inconsistency) and also to apply generated bindings to all of them. * The `mapVarToLayout` member in `StructTypeLayout` is a concession to the fact that we might have multiple `VarDecl`s for each field of the struct that represents the global scope, we might need to look up a field and its layout using any of those declarations (much of the need for this field had gone away now that IR passes are largely using IR-based layout). All of these different special cases added more complex code in many places in the compiler, all to support a scenario that isn't especially common. Most users won't be affected by the original issue, because they will do one of several things that rule it out: * Anybody using `slangc` like a stand-in for `fxc` or `dxc` and compiling one translation unit at a time will not suffer from any problems. If/when such users want consistent bindings across translation units, they already use either explicit binding or rely on consistent ordering and implicit binding. * Anybody who puts all the entry points that get combined into a pass/pipeline in a single file will not have problems. They will automatically get consistent bindings because of Slang's guarantees, and there can't be duplicated declarations when there is only one translation unit. * Anybody using `import` to factor out common declarations while compiling multiple translation units at once will not be affected. Parameters declared in an `import`ed module are the "same" in a much deeper way that it is trivial for Slang to support. Only users of the Falcor framework are likely to be affected by this, and they have two easy migration paths: either put related entry points into the same file, or factor common parameters into an `import`ed module. (It is also worth noting that for command-line `slangc`, it is possible to have a single module with multiple `.slang` files in it, which can all see global declarations like parameters across all the files. Anybody who buys into doing things the Slang Way should have no problem avoiding duplicated declarations) With the rationale out of the way, the actual change mostly just amounts to deleting lots of code that is no longer needed. An astute reviewer might notice several `assert`-fail conditions where complex Slang features were never actually made to work correctly with this legacy behavior. A small number of test cases broke with the code changes, but these were tests that specifically exercised the behavior being removed. In the case of the tests around binding/reflection generating, I rewrote the tests to use one of the idomatic workarounds (putting the shared parameters into an `import`ed module), but doing so required me to add support for `#include` when doing pass-through compilation with `fxc`. That logic added a bit more cruft than I had originally hoped to this commit, but having `#include` support when doing pass-through compilation is probably a net win. * fixup: 64-bit warning
2019-02-15	Split front- and back-ends (#846)	Tim Foley
	* Split front- and back-ends This change is a major refactor of several of the types that provide the behind-the-scenes implementation of the public C API. The goal of this refactor is primarily to allow for future API services that let the user operate both the front- and back-ends of the compiler in a more complex fashion. For example, as user should be able to compile a bunch of source code into modules, look up types, functions, etc. in those modules, specialize generic types/functions to the types they've looked up, and then finally request target code to be gernerated for specialized entry points. The back-end code generation they trigger should re-use the front-end compilation work (parsing, semantic checking, IR generation) that was already performed. The most visible change is that `CompileRequest` has been split up into several smaller types that take responsibility for parts of what it did: * The `Linkage` type owns the storage for `import`ed modules, and well as the `TargetRequest`s that represent code-generation targets. The intention is that an application could use a single `Linkage` for the duration of its runtime (so long as it was okay with the memory usage), so that each `import`ed module only gets loaded once. For now, this type needs to manage the search paths, file system, and source manager, because of its responsibility for loading files. * A `FrontEndCompileRequest` owns the stuff related to parsing, semantic checking, and initial IR generation. This most notably includes the `TranslationUnitRequest`s and the `FrontEndEntryPointRequest`s (which used to be just `EntryPointRequest`s). It's main job is to produce AST and IR modules for each translation unit, and to find and validate the entry points. The front-end request does not interact with generic arguments for global or entry-point generic parameters. * The main output of both `import` operations and front-end translation units is the `Module` type, which is just a simple container for both the AST module (to service the reflection/layout APIs, and also for semantic checking of code that `import`s the module) and the IR module (for linking and code generation). This type captures the commonalities between the old `LoadedModule` (which is now just an alias for `Module`) and `TranslationUnitRequest` (which now owns a `Module`). * The secondary output of front-end compilation is a `Program`, which comprises a list of referenced `Module`s and validated `EntryPoint`s that will be used together. Layout and code generation both need a `Program` to tell them what modules and entry points will be used together (we don't want to just code-gen everythin that has ever been loaded into the linakge). The `Program`s created by the front-end do not include generic arguments, so they may provide incomplete layout information and/or be unsuitable for code generation. * A `BackEndCompileRequest` owns stuff related to turning a `Program` into output kernels for the targets of a `Linkage`. Most of the data it owns beyond the `Program` to be compiled is minor, so this is a good candidate for demotion from a heap-allocated object to just a `struct` of options that gets passed around. * The `CompileRequestBase` type is an attempt to wrap up the common functionality of both front-end and back-end compile requests. Most of it is just exposing the availability of a linkage and `DiagnosticSink`, so this type is a good candidate for subsequent removal. The main interesting thing it has is the flags related to dumping and validation of IR, so there is probably a good refactoring still to be made around deciding how options should be handled going forward. * Behind the scenes, the `Program` type is set up to handle some level of on-line compilation and layout work. The `Program` knows the `Linkage` it belongs to, and allows for a `TargetProgram` to be looked up based on a specific `TargetRequest`. A `TargetProgram` then allows layout information and compiled kernel code to be asked for on-demand, in order to support eventual "live" compilation scenarios. * The `EndToEndCompileRequest` type is a composition/coordination type that replaces the old `CompileRequest` in a way that uses the services of the various other types. It owns a few pieces of state that only make sense in the context of an end-to-end compile (e.g., there is really no way to "pass through" code when the front- and back-ends are run separately) or a command-line compile (everything to do with specifying output paths for files is really just for the benefit of `slangc`, and might even be moved there over time). * One important detail is that the `EndToEndCompilRequest` owns all of the string-based generic arguments for both global and entry-point generic parameters. The logic in `check.cpp` for dealing with those arguments has been heavily refactored to separate out the parsings steps that are specific to end-to-end compilation with string-based type arguments, and the semantic checking steps that result in a specialized `Program` (which can be exposed through new APIs that aren't tied to end-to-end compilation). It is perhaps not surprising that this change had a lot of consequences, so I'll briefly run over some of the main categories of changes required: * I changed the way that global generic arguments are passed via API (use `spSetGlobalGenericArgs` instead of the generic arguments for `spAddEntryPointEx`, which are not just for entry-point generics), which has been a change that we've needed for a long time. This is technically a breaking API change, although we should have very few client applications that care about it. * A bunch of places that used to take "big" objects like `CompileRequest` now just take the sub-pieces they care about (e.g., a function might have only needed a `Linkage` and a `DiagnosticSink`). This makes many subroutines or "context" struct types more generally useful, at the cost of taking more parameters. * In a few cases the conceptually clean separation of the layers breaks down (often for edge-case or compatibility features), and so we may pass along additional objects that are allowed to be null, but are used when present. A big example of this is how the back-end code generation routines accept an `EndToEndCompileRequest` that is optional, and only used to check whether "pass through" compilation is needed. We should probably look into cleaning this kind of logic up over time so that we don't need to violate the apparent separation of phases of compilation. * In cases where separation of layers was being broken for the sake of GLSL features, I went ahead and ripped them out, since all of that should be dead code anyway. * In many cases I increased the encapsulation of data in the core types to help track down use sites and make sure they are following invariants better. * In cases where code was doing, e.g., `context->shared->compileRequest->session->getThing()` I have tried to introduce convenience routines so that the usage site is just `context->getThing()` to improve encapsulation and allow changes to be made more easily going forward. * The `noteInternalErrorLoc` functionality was moved off of the compile request and into `DiagnosticSink`, since that is the one type you can rely on having around when you want to note an internal error. We may consider going forward if (and how) it should reset the counter used for noting locations on internal errors. * A few APIs now take `DiagnosticSink` arguments where they didn't before, and as a result some public APIs need to create `DiagnosticSink`s to pass in, before going ahead and ignoring the messages. In the future there should be variations of these APIs that accept an `ISlangBlob` parameter for the output. fixup: missing include for compilers with accurate template checking (non-VS) * fixup: review feedback
2018-12-14	Represent global shader parameters explicitly in the IR (#756)	Tim Foley
	Before this change, global shader parameters were represented in the IR as just being ordinary global variables. The only indication that a particular global represented a parameter was when it got a layotu attached to it (as part of back-end processing), and we've had a number of bugs related to layouts being dropped so that what should have been a shader parameter turned into an ordinary global variable in the output. This change is more strongly motivated by the fact that making shader parameters look like globals means that we cannot easily reason about their value when doing IR transformations. If we see two `load`s from the same global variable can we assume they yield the same value? In the general case we cannot, and this means that any transformation that wants to rely on the fact that an input `Texture2D` shader parameter can't actually change over the life of the program needs to do extra work. The fix here is to introduce a new kind of IR instruction that represents a global shader parameter directly (not a pointer to it as a global would), at which point there isn't even such a notion as a "load" from the parameter, since it represents the value directly. In several cases logic that used to apply to global variables in case they were shader parameters (by looking for a layout) is now moved to apply to these global parameters. The biggest source of issues in this change was that switching from pointers to plain values to represent these shader parameters stresses different cases in type legalization. I also had to deal with the case of legalization for GLSL where we actually do need global shader parameters that are writable (since varying output goes in the global scope), but in that case I borrowed the use of pointer-like `Out<...>` and `InOut<...>` types to represent that intent, which we were already using for function parameters representing outputs. A few tests started failing because the changes lead to a slightly different order of code emission, which in some HLSL tests resulted in a function parameter named `s` getting emitted before a global parameter named `s`, leading to the latter getting the name `s_1` instead of `s_0`. A few SPIR-V tests started failing because the new approach means that we no longer end up performing a load from all varying input parameters at the start of `main` and instead reference the varying inputs directly. The resulting code is more idomatic, but it differed from the baselines for those tests.
2018-12-07	Change how buffers are emitted (#741)	Tim Foley
	* Change how buffers are emitted This is a change with a lot of pieces, which can't always be separated out cleanly. I'm going to walk through them in what I hope is a logical order. The main goal of this change was to allow arrays of structured buffers to translate to Vulkan. Consider two declarations of structured buffers in HLSL/Slang: ```hlsl StructuredBuffer<X> single; StructuredBuffer<Y> multiple[10]; ``` The current translation logic was handling `single` by translating it into an unnamed GLSL `buffer` block like: ```glsl layout(std430) buffer _S1 { X single[]; }; ``` That syntax allows an expression like `single[i]` in Slang to be translated simply as `single[i]` in GLSL. But that naive translating doesn't work for `multiple`, since we need to declare a array of blocks in GLSL, which requires giving the whole thing a name: ```glsl layout(std430) buffer _S2 { Y _data[]; } multiple[10]; ``` Now a reference to `multiple[i][j]` in Slang needs to become `multiple[i]._data[j]` in GLSL. To avoid having way too many special cases around single structured buffers vs. arrays, it makes sense to allows emit things in the latter form, so that we instead lower `single` as: ```glsl layout(std430) buffer _S1 { X _data[]; } single; ``` So that now a reference to `single[i]` becomes `single._data[i]` in GLSL. Most of that can be handled in the standard library translation of the structured buffer indexing operations. The only wrinkle there is that there were some old special-case instructions in the IR intended to handle buffer load/store operations (these were added back when I was trying to keep the "VM" path working). These aren't really needed to have structured-buffer operations work; they can be handled as ordinary functions as far as the stdlib is concerned. I removed the old instructions. Along the way, it became clear that a few other cases follow the same pattern. Byte-addressed buffers are an obvious case. We were lowering HLSL/Slang: ```hlsl ByteAddressBuffer b; ... uint x = b.Load(0); ``` to GLSL like: ```glsl layout(std430) buffer _S1 { uint b[]; }; ... uint x = b[0]; ``` That logic would fail for arrays the same way that the structured buffer case was failing. The fix is the same: use named `buffer` blocks and then introduce an explicit `_data` field: ```glsl layout(std430) buffer _S1 { uint _data[]; } b; ... uint x = b._data[0]; ``` Just like with structured buffers, all of the VK translation for operations on byte-addressed buffers can be implemented directly in teh stdlib, so once the emit logic was changed it was just a matter of adding `._data` to a bunch of VK tranlsations. It turns out that arrays of constant buffers have more or less the same problem, and furthermore we have some problems with any code that directly uses the modern HLSL `ConstantBuffer<T>` type. Note: the emit logic around constant buffers sometimes refers to "parameter groups" because that is being used in the compiler as a catch-all term for constant buffers, texture buffers, and parameter blocks. The existing code was going out of its way to reproduce the way that constant buffer declarations are implicitly referenced in HLSL: ```hlsl cbuffer C { float f; } ... float tmp = f; // No reference to `C` here ``` This can be seen in the emit logic with the `isDerefBaseImplicit` function, which is used to take the internal IR representation for a reference to `f` (which is closer to the expression `(C).f` or `C->f`) and leave off any reference to `C` so that we emit just `f`. That kind of logic just flat out doesn't work in some important cases. Arrays of constant buffers are a clear one: ```hlsl ConstantBuffer<X> cbArray[3]; ... X x = cbArray[0]; ``` There is no way to translate that to an ordinary `cbuffer` declaration at all. The same problem can be created without arrays, though: ```hlsl ConstantBuffer<X> singleCB; ... X x = singleCB; ``` The current strategy for translating constant buffers was translating `singleCB` into a `cbuffer` declaration that reproduced the fields of `X` as its members, which just wouldn't work: ```hlsl cbuffer singleCB { float f; // field of `X` } ... X x = singleCB; // ERROR: there is nothing named `singleCB` in this HLSL ``` The new strategy is more consistent. We still generate a `cbuffer` declaration for a single constant buffer, but we always give it a single field of the chosen element type: ```hlsl cbuffer singleCB { X singleCB; } ... X x = singleCB; // this works fine! ``` And in the array case we generate code that uses the explicit `ConstantBuffer<T>` type: ```hlsl ConstantBuffer<X> cbArray[3]; ... X x = cbArray[0]; ``` The GLSL output is more complicated because unlike with HLSL there is no implicit conversion from a uniform block to its element type (there is no notion of an element type). The array case thus needs a `_data` field similar to what we do for structured buffers: ```glsl layout(std140) uniform _S3 { X _data; } cbArray[3]; ... X x = cbArray[0]._data; ``` And then the non-array case needs to have a similar `_data` field for consistency: ```glsl layout(std140) uniform _S1 { X _data; } singleCB; ... X x = singleCB._data; ``` This is handled by inserting the necessary reference to `_data` whenever we dereference a constant buffer, either as part of a load instruction (loading from the whole CB as a pointer), or an `IRFieldAddress` instruction which forms a pointer into the CB (e.g., `&(singleCB->f)` becomes `singleCB._data.f`). The current emit logic handles `ParameterBlock<X>` differently from `ConstantBuffer<X>`, but really only to allow parameter blocks to be explicitly named in the output, while constant buffers were left implicit by default. Thus the only difference was a legacy one (from back when trying to exactly reproduce the HLSL text we got as input was considered an important goal), and the new approach to emitting constant buffers would get rid of it. I removed the separate logic for emitting `ParameterBlock<X>` and just let the handling for constant buffers deal with it. Note that any resource types inside of a `ParameterBlock<X>` would have been moved out as part of legalization, so that a parameter block is 100% equivalent to a constant buffer when it comes time to emit code. Unsurprisingly, changing the way we generate HLSL and GLSL output for all these buffer types meant that any tests that were directly comparing the output of `slangc` against `fxc`, `dxc`, or `glslang` broke. The basic approach to fixing the breakage in GLSL tests was to update the GLSL baseline to reflect the new output startegy. In some cases I used macros to name the various `_S<digits>` temporaries so that future renaming will hopefully be easier (it would be great if we auto-generated temporary names with a bit more context). There was one GLSL test (`tests/bugs/vk-structured-buffer-binding`) that was using raw GLSL expected output, and this was changed to use a GLSL baseline to generate SPIR-V for comparison. For HLSL tests we were sometimes running the same input file through `slangc` and `fxc`/`dxc`, and in these cases I macro-ized the various `cbuffer` declarations to generate different declarations depending on the compiler. I completely dropped the tests coming from the D3D SDK because they aren't providing much coverage, and updating them would change them so far from the original code that the purported benefit (using a body of existing shaders) would be lost. I also dropped the explicit matrix layout qualifiers in the `matrix-layout` test because the new output strategy breaks those for GLSL (you can't put matrix layout qualifiers on `struct` fields, and now the body of every constant buffer is inside a `struct`). This isn't as big of a loss as it seems, because our handling of those qualifiers wasn't really right to begin with. Slang users should only be setting the matrix layout mode globally (and we should probably switch to error out on the explicit qualifiers for now). The other thing that got dropped is tests involving `packoffset` modifiers. Slang already warns that it doesn't support these, and the way they were used in the test cases is actually misleading. For the binding/layout-related tests, the goal was to show that Slang reproduces the same layout as fxc, in which case explicitly enforcing a layout via `packoffset` seems like cheating (are we sure we enforced the layout fxc would have produced?). The real reason was that Slang used to emit explicit `packoffset` on every* field of a `cbuffer` it would output, because of an `fxc` bug where you couldn't use `register` on textures/samplers declared inside a `cbuffer` unless every field in the `cbuffer` used a `register` or `packoffset` modifier. Slang hasn't required that behavior in a while because it now splits textures and samplers, and the one test case where we needed `packoffset` to work around the `fxc` bug in the baseline HLSL has been macro-ified even more to work around the bug. The amount of churn in the test cases is unfortunate, but it continues to point at the weakness of any testing strategy that checks for exact equivalent between Slang's output and that of other compilers. We need to keep working to replace these tests with better alternatives. In `check.cpp` there is logic to perform implicit dereferencing, so that if you write `obj.f` where `obj` is a `ConstantBuffer<X>` (or some other "pointer-like" type) and `f` is a field in `X`, then this effectively translates as `(obj).f`. That is, we dereference the value of type `ConstantBuffer<X>` to get a value of type `X`, and then refer to the field of the `X` value. There was a problem where the logic to insert that kind of implicit dereference operation was using a reference (`auto& type = ...`) for the type of the expression being dereferenced, and then clobbering it. This would mean that an expression of type `ConstantBuffer<X>` would have its type overwritten to be just `X` and then codegen would break later on. I'm not sure how we haven't run into that before. The `array-of-buffers` test case was added to confirm that we now support arrays of constant, structured, and byte-address buffers for both DXIL and SPIR-V output. Okay, so that was a lot of stuff, but hopefully it is clear how this all works to make the output of the compiler more consistent and explicit, while also supporting the required new functionality. fixup: review feedback
2018-10-29	Rework command-line options handling for entry points and targets (#697)	Tim Foley
	* Rework command-line options handling for entry points and targets Overview: * The biggest functionality change is that the implicit ordering constraints when multiple `-entry` options are reversed: any `-stage` option affects the `-entry` to its left instead of to its right as it used to. This is technically a breaking change, but I expect most users aren't using this feature. * The options parsing tries to handle profile versions and stages as distinct data (rather than using the combined `Profile` type all over), and treats a `-profile` option that specifies both a profile version and a stage (e.g., `-profile ps_5_0`) as if it were sugar for both a `-profile` and a `-stage` (e.g., `-profile sm_5_0 -stage fragment`). * We now technically handle multiple `-target` options in one invocation of `-slangc`, but do not advertise that fact in the documentation because it might be confusing for users. Similar to the relationship between `-stage` and `-entry`, any `-profile` option affects the most recent `-target` option unless there is only one `-target`. * The logic for associating `-o` options with corresponding entry points and targets has been beefed up. The rule is that a `-o` option for a compiled kernel binds to the entry point to its left, unless there is only one entry point (just like for `-stage`). The associated target for a `-o` option is found via a search, however, because otherwise it would be impossible to specify `-o` options for both SPIR-V and DXIL in one pass. * The handling of output paths for entry points in the internal compiler structures was changed, because previously it could only handle one output path per entry point (even when there are multiple targets). The new logic builds up a per-target mapping from an entry point to its desired output path (if any). Details: * Support for formatting profile versions, stages, and compile targets (formats) was added to diagnostic printing, so that we can make better error messages. This is fairly ad hoc, and it would be nice to have all of the string<->enum stuff be more data-driven throughout the codebase. * Test cases were added for (almost) all of the error conditions in the current options validation. The main one that is missing is around specifying an `-entry` option before any source file when compiling multiple files. This is because the test runner is putting the source file name first on the command line automatically, so we can't reproduce that case. * Several reflection-related tests now reflect entry points where they didn't before, because the logic for detecting when to infer a default `main` entry point have been made more loose * On the dxc path, beefed up the handling of mapping from Slang `Profile`s to the coresponding string to use when invoking dxc. * A bunch of tests cases were in violation of the newly imposed rules, so those needed to be cleaned up. * There were also a bunch of test cases that had accidentally gotten "disabled" at some point because there were comparing output from `slangc` both with and without a `-pass-through` option, but that meant that any errors in command-line parsing produced the same error output in both the Slang and pass-through cases. This change updates `slang-test` to always expect a successful run for these tests, and then manually updates or disables the various test cases that are affected. * When merging the updated test for matrix layout mode, I found that the new command-line logic was failing to propagate a matrix layout mode passed to `render-test` into the compiler. This was because the `-matrix-layout` options were implemented as per-target, but the target was being set by API while the option came in via command line (passed through the API). It seems like we want matrix layout mode to be a global option anyway (rather than per-target), so I made that change here. Add missing expected output files * A 64-bit fix * Remove commented-out code noted in review
2018-09-21	Remove the "hack sampler" workaround (#648)	Tim Foley
	* Update glslang version * Fix build for new glslang The latest glslang required a few changes to our manual build for their code (because we are not taking a dependency on CMake). * Rebuild project files using premake, which picks up a few files added to glslang, but also a few diffs in Slang's own project files in cases where they were edited manually instead of using premake. * Fix up the declaration our our device limits (which are inentionally set to not limit what code passes through our glslang), because the underlying structure definition in glslang has changed. This is a kludgy bit of glslang's design, but it doesn't make sense for us to invest in a more serious workaround. * Remove the "hack sampler" workaround When the `GL_KHR_vulkan_glsl` spec was introduced to allow GLSL to be compiled for Vulkan SPIR-V, it made an annoying mistake by leaving a few builtins as taking `sampler2D`, etc. when the equivalent SPIR-V operations only require a `texture2D`, etc. The relevant builtins are: * `textureSize` * `textureQueryLevels` * `textureSamples` * `texelFetch` * `texelFetchOffset` This means that shader code that wanted to use those operations needed to conspire to have a `sampler` handy so they could write, e.g.: ```glsl vec4 val = texelFetch(sampler2D(myTexture, someRandomSampler), p, lod); ``` when what they really wanted was this: ```glsl vec4 val = texelFetch(myTexture, p, lod); ``` That is annoying but probably something each to work around for a GLSL programmer, but when cross-compiling from HLSL, you might have an operation like: ```hlsl float4 val = myTexure.Load(p); ``` in which case a cross-compiler needs to manufacture a sampler out of thin air. If the shader happened to use a sampler for something else you could snag that, but in the worse case you had to cross-compile to GLSL that declared a new sampler. Slang did this by declaring a sampler called `SLANG_hack_samplerForTexelFetch` (because `texelFetch` is the operation that first surfaced the issue). For complex reasons we always define this sampler, even if we turn out not to need it in a particular output kernel. This choice has a bunch of annoying consequences: * There is always a sampler defined in descriptor set zero, because that's where we put the hack sampler, so a user-defined parameter block always has a set number of 1 or greater (see #646). * The hack sampler shows up in reflection output because users need to size their descriptor sets appropriately to pass along this sampler that won't actually be used if they don't want to get debug spew from the validation layers. We filed an issue on glslang about this problem, and eventually some kind folks from the gamedev community (who also saw the same problem) defined an extension spec (`GL_EXT_samplerless_texture_functions`) to fix the underlying issue and contributed a patch to glslang to make it support that extension. This change just backs the hack out of Slang now that we have a glslang version that supports the extension to get past the defect in the original GLSL-for-Vulkan definition. Besides yanking out the code for the hack, we also change the relevant builtins to declare that they require this new GLSL extension (so that we properly request it from glslang when the builtins are used), and fix some reflection test cases that exposed the existence of the "hack sampler." * Fixup: syntax error in stdlib generator files * Remove more code for hack sampler There was logic to ensure we always have a "default" register space/set when cross-compiling, because the hack sampler would need it. This is no longer necessary once we remove the hack sampler. * Fix expected test output. Fixing the root cause of issue #646 means that one of our test cases that tickles that issue now produces different output (luckily it can now be used as a regression test for the issue).
2018-05-04	Allow more complex compound expressions when emitting from IR (#552)	Tim Foley
	The emit logic already had an idea of when an instruction should be "folded" it its use site(s), and this change just expands on that logic to try to be more aggressive. The basic idea is that instead of outputting this: ```hlsl float4 _S3 = a_0 + b_0; float4 _S4 = c_0 * _S3; d_0 = _S4; ``` we can hopefully output something like this: ```hlsl d_0 = c_0 * (a_0 + b_0); ``` The way this works is that after dealing with the various special cases that decide an instruction `I` must/cannot be folded in, we look and see if it has the following properites: * `I` has no side effects * `I` has a single user, `U` * `I` and `U` are in the same block (and `I` comes before `U` in that block) * for every instruction `X` between `I` and `U` (exclusive), `X` has no side effects If all of these conditions are true, then `I` can be folded in as a sub-expression when we emit `U`. This change doesn't affect most of our test output, but there is still a single test with SPIR-V output that we compare against a GLSL baseline, and so that baseline had to be modified to match the GLSL we now generate. Similar to #547, this change is not meant to provide a complete solution, but rather to take a concrete but low-risk step toward improving our output. Opportunities to improve the results further include: * We can/should ensure that when outputting sub-expressions we keep extra parentheses to a minimum. The old logic for emitting from an AST had support for "unparsing" expressions with minimal parentheses, and we should try to do the same. This can be error-prone, because omitting parentheses can lead to silent failures, so it must be done carefully. * We could try to be more aggressive about detecting what operations might have side effects. The most interesting case is function calls, where we should try to check if the callee is a function known to be side-effect-free. We could start by annotating most builtin functions with an attribute/decoration that indicates freedom from side effects. Deriving this attribute for user functions could be interesting, but we'd have to be careful since "nontermination" is technically a side effect. * We could try to be more aggressive about determining what side effects in instructions `X` are "safe" for the instruction `I` to move across. For example, if `I` is a load from variable `a` and `X` is a store to variable `b`, then that would seem to be safe. This starts to get into issues of instruction scheduling, though, and that is probably beyond what we want Slang to be doing.
2018-05-03	Pass through original names for most declarations (#547)	Tim Foley
	The basic idea here is that when lowering to the IR, the front-end will attach a "name hint" to the IR instruction(s) that represent a given declaration, and then the passes that work on the IR will try to preserve and propagate those names, and then finally the emit logic will use them in place of mangled or unique names when available. This change does not try to deal with the issues that arise when we try to use those variable names in the output without any modification (e.g., handling cases where they might clash with keywords or builtins in the target language). Instead, it tries to establish baseline behavior for propagating through names, so that a later change can concentrate on the issue of using those names exactly when it is legal to do so. In order to avoid issues around the name "hints" causing problems we take two main steps: 1. We "scrub" each name to reduce it down to the allowed set of identifier characters in C-like languages, and then ensure that it doesn't do things that would be illegal in some downstream languages (e.g., consecutive underscores are not allowed in GLSL) or could clash with Slang's mangled names. This process isn't guaranteed to give distinct results for distinct inputs (it isn't a mangling scheme, after all). 2. We generate a unique ID for each occurence of a given name and always use that as a suffix. This means that even if a name happens to overlap with a keyword (if you somehow have a variable named `do`), we will still add a suffix that makes it not a problem (we'd output `do_0` which is fine). The logic for generating these names is mostly straightforward. For simple variables, we use their given name directly, while for other declarations we try to form a name that includes their parent declaration (e.g. `SomeType.someMethod`). Various IR passes need to propagate or preserve this information. The most interesting is type legalization, when we take a variable with an aggregate type and split some of the fields out into their own variables. In that case we generate "dotted" names like `someVar.someTexture` and rely on the emit logic to turn that into `someVar_someTexture`. During SSA generation, if we are promoting a variable to SSA temporaries, we will try to propagate the name of the variable over to the temporaries (unless they already have a name from some other place). The same applies to block parameters ("phi nodes"). Many of the test changes need their expected output to be updated for this change. Luckily in most cases the output has gotten easier to understand.
2018-04-11	Introduce an IR-level type system (#481)	Tim Foley
	* Introduce an IR-level type system Up to this point, the Slang IR has used the front-end type system to represent types in the IR. As a result (but ultimately more importantly) the IR representation of generics and specialization has used AST-level concepts embedded in the IR. For example, to express the specialization of `vector<T,N>` to a concrete type `float` for `T`, we needed an IR operation that could represent the specialization, with operands that somehow represented the type argument `float`. The whole thing was very complicated. The big idea of this change is to introduce a new representation in which types in the IR are just ordinary instructions, so that using them as operands makes sense. The hierarchy of IR types closely mirrors the AST-side hierarchy for now, and that will probably be something we should maintain going forward. In order to make these changes work, though, I also had to do major overhauls of things like the way substitutions are performed, how we check interface conformances, the way lookup through interface types is done, etc. etc. This is a big change, and unfortunately any attempt to summarize it in the commit message wouldn't do it justice. * Fix 64-bit build warning * Fix up some clang warnings/errors
2018-03-29	Change uses of "spire" to "slang" (#461)	Tim Foley
	Fixes #350 When the Slang project forked off from the Spire research effort, we renamed things as we went, but many cases seem to have slipped through the cracks. The two biggest diffs here are: - The `hello` example program was incorrectly talking about what was in the shader file (Slang no longer supports the "module" or "pipeline" constructs from Spire), and so it wasn't just a simple rename. - The files under `tests/bindings` were mistakenly using `__SPIRE__` as a preprocessor guard, which means that they weren't actually testing what they meant to. Luckily, it looks like the relevant functionality didn't regress while these tests were unintentionally deactivated.