slang.git - Making it easier to work with shaders

Age	Commit message	Author
2023-07-26	Refactor `dmul(This, Differential)` to `dmul<T:Real>(T, Differential)` (#3029)	Sai Praveen Bangaru
	* Refactor `dmul(This, Differential)` to `dmul<T:Real>(T, Differential)` - Add AST synthesis support for generic containers - Refactor relevant tests * Merge dmul synthesis with dadd and dzero, and disambiguate using an enum * Fix trailing spaces
2023-07-26	Fix scalar swizzle causes invalid glsl output. (#3028)	Yong He
	* Fix -fvk-u-shift not applying to RWStructuredBuffer on glsl output. * Add `transpose` to `ObjectToWorld4x3`. * Fix scalar swizzle causes invalid glsl output. Fixes #3026. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-07-26	Fix -fvk-u-shift not applying to RWStructuredBuffer on glsl output. (#3027)	Yong He
	* Fix -fvk-u-shift not applying to RWStructuredBuffer on glsl output. * Add `transpose` to `ObjectToWorld4x3`. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-07-26	Add glsl intrinsics for CalculateLevelOfDetail (#3023)	Ellie Hermaszewska
	Translates to textureQueryLod().x (with the Unclamped variant being returned in the .y component) Co-authored-by: Yong He <yonghe@outlook.com>
2023-07-26	Add GatherCmp* for texture objects (#3024)	Ellie Hermaszewska
	The translation to GLSL is incomplete as intrinsics only exist for some combination of comparison and channel (just channel 0) Closes https://github.com/shader-slang/slang/issues/3021
2023-07-25	Add slang.natjmc. (#3018)	Yong He
	This allows Visual Studio debugger to skip over AST visitor dispatch functions and stop directly at the `visit` functions when stepping into a `dispatch` call. Co-authored-by: Yong He <yhe@nvidia.com>
2023-07-25	Allow parsing more than 10 intrinsic arguments (#3014)	Ellie Hermaszewska

2023-07-24	Remove [__readNone] on clip. (#3016)	Yong He
	Co-authored-by: Yong He <yhe@nvidia.com>
2023-07-21	Don't error on disabled warnings when treatWarnAsErr. (#3013)	Yong He
	Co-authored-by: Yong He <yhe@nvidia.com>
2023-07-21	Add support for `-fvk-invert-y`. (#3012)	Yong He
	Co-authored-by: Yong He <yhe@nvidia.com>
2023-07-21	Add sampleCount parameter for read-only textures. (#3011)	Yong He
	Co-authored-by: Yong He <yhe@nvidia.com>
2023-07-21	Fix data-flow analysis not propagating diff property through differentiable ↵	Sai Praveen Bangaru
	calls (#3010) * Add test for nodiff diagnostic for non-diff call propagated through diff call * Add logic to disambiguate calls to differentiable and non-differentiable methods * Add expected results for test * Simplify test * Update slang-ir-check-differentiability.cpp * Added comments for TreatAsDifferentiableExpr flavors --------- Co-authored-by: Yong He <yonghe@outlook.com>
2023-07-21	Better handling of bindings with multiple resource kind "aliases" for GLSL ↵	jsmall-nvidia
	emit (#3009) * A more robust way to handle resource consumption that might use multiple `kind`s on GLSL emit. * Improve method naming and some comments. * Small consistency fix.
2023-07-20	Fix issue with loop elimination not working on certain side-effect-free ↵	Sai Praveen Bangaru
	loops (#3005) Co-authored-by: Yong He <yonghe@outlook.com>
2023-07-20	Fix for vk-shift-* GLSL emit issue (#3004)	jsmall-nvidia
	* Handle different resource kinds that can appear via the vk-shift-* allowing some HLSL kinds in GLSL emit. * Determine the used kind for emit. * Added vk-shift-uniform-issue.slang * Use a better function name. Improve comments.
2023-07-19	Add `sampleCount` parameter for MS textures. (#3001)	Yong He
	* Add `sampleCount` parameter for MS textures. * Fix test. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-07-19	Support for vk-shift-* without explicit bindings (#3000)	jsmall-nvidia
	* Improvements to HLSLToVulkanLayoutOptions. * WIP vk-shift-* with HLSL like binding. Detecting clashes. * Shift example seems to be working correctly. One oddness is that "used" data is now reflected, as we only enable for D3D shader resource types. Now we use those with inferred VK mode they appear. * Implicit seems to work. * Disable inference with Sampler/CombinedTextureSampler. I guess? we could just use the HLSL texture register binding to infer. * Report overlapping ranges in diagnostic. The hlsl-to-vulkan-shift-diagnostic result might be surprising but it is correct, because u is automatically laid out so consumes DescriptorSlot 0, but that's already consumed by c. * First attempt at array layout with infer on Vulkan. * Fix the vulkan shift output. * Array example.
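The shift-and-clash behaviour described in this commit can be sketched roughly as follows. This is a minimal Python illustration, not Slang's implementation: it assumes the usual `-fvk-<class>-shift` convention in which the Vulkan binding is the HLSL register index plus a per-class shift, and the names `assign_bindings` and `find_clashes` are hypothetical.

```python
from collections import defaultdict


def assign_bindings(resources, shifts):
    """Map HLSL registers to Vulkan bindings as index + shift[class].

    `resources` is a list of (name, register_class, register_index, count).
    `shifts` maps a register class ('b', 't', 's', 'u') to its shift.
    Both shapes are assumptions made for this sketch.
    """
    bindings = {}
    for name, reg_class, index, count in resources:
        first = index + shifts.get(reg_class, 0)
        bindings[name] = range(first, first + count)
    return bindings


def find_clashes(bindings):
    """Return sorted (binding, [names]) pairs claimed by more than one resource."""
    owners = defaultdict(list)
    for name, slots in bindings.items():
        for slot in slots:
            owners[slot].append(name)
    return sorted((slot, names) for slot, names in owners.items() if len(names) > 1)


resources = [
    ("c", "b", 0, 1),  # constant buffer at register b0
    ("t", "t", 0, 2),  # array of two textures at t0..t1
    ("u", "u", 0, 1),  # UAV at u0
]

no_shift = assign_bindings(resources, {})
shifted = assign_bindings(resources, {"t": 10, "u": 20})

print(find_clashes(no_shift))  # every resource lands on binding 0
print(find_clashes(shifted))   # shifts separate the register classes
```

Without shifts, `c`, `t` and `u` all claim binding 0, which is the kind of overlap the diagnostic in this commit reports. Reporting the whole range per resource is what makes array resources participate in the check.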
2023-07-19	Optimize specialization, and remove unnecessary calls to `simplifyIR`. (#2999)	Yong He
	* Remove unnecessary calls to `simplifyIR`. * fix. * Delete obsolete hoistConst pass. * Fix. * Small improvements. * Fix. * Fix enum lowering. * fix * tweaks. * tweaks. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-07-18	nsight Aftermath crash example (#2984)	jsmall-nvidia
	* Small fixes and improvements around reflection tool. * Make PrettyWriter printing a class. * Aftermath crash demo WIP. * Enable aftermath in test project. * Setting failCount. * Dumping out of source maps. * Improve comments. Simplify handling of compile products. * Other small fixes to aftermath example. * Added Emit SourceLocType. Track sourcemap association meaning. Improved documentation. * Small improvements. * Capture debug information for D3D11/D3D12/Vulkan. * Enable debug info. * Small improvements. * Improve aftermath example README.md.
2023-07-18	Simplify Lookup and improve compiler performance. (#2996)	Yong He
	* Simplify lookup. * Various bug fixes. * Report type dictionary size in perf benchmark. * Remove type duplication. * increase initial dict size. * Bug fix. * Fix bugs. * Fixup. * Revert type legalization looping. * Fix specialization pass. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-07-14	Fix vk-shift-* mapping issue (#2993)	jsmall-nvidia
	* Fix vk-shift-* mappings. * Add some doc info about vk-shift. * Fix diagnostic test.
2023-07-14	Allow setting of HLSLToVulkan options without having a target specified on ↵	jsmall-nvidia
	the command line options. (#2989)
2023-07-14	Robustness fixes around reverse-mode differentiation of variables & inout ↵	Sai Praveen Bangaru
	parameters (#2985) * Add a new test case for checking loop in the reverse mode * Create duplicate var for primal value to avoid inconsistent inputs to backward call Fixes an issue with inout parameters where the backprop call may use the post-call value of the var instead of the pre-call value. * `IRStore`s transpose to `IRLoad` and an `IRStore(0)` to clear differential. Fixes some subtle issues around transposing * Simplify test * Delete out-edited.hlsl --------- Co-authored-by: Lifan Wu <lifanw@nvidia.com> Co-authored-by: Yong He <yonghe@outlook.com>
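The inout issue fixed here can be illustrated with a hand-written reverse-mode example. This is only a sketch in Python under simplifying assumptions, not the compiler's generated code: `square_inplace`, `backward_square` and `transpose_store` are hypothetical names, and a dictionary stands in for memory.

```python
def square_inplace(state):
    """Primal function: overwrites state['x'] with x * x (an inout-style update)."""
    state["x"] = state["x"] * state["x"]


def backward_square(x_before, d_out):
    """Backward pass of y = x * x, which needs the value of x from before the call."""
    return 2.0 * x_before * d_out


def transpose_store(d_memory, addr):
    """Transpose of `store(addr, v)`: load the accumulated differential for v,
    then store 0 so the slot does not contribute a second time."""
    d_value = d_memory[addr]
    d_memory[addr] = 0.0
    return d_value


state = {"x": 3.0}
x_saved = state["x"]   # duplicate the primal value before the call
square_inplace(state)  # state["x"] now holds the post-call value

correct = backward_square(x_saved, 1.0)   # uses the pre-call value
wrong = backward_square(state["x"], 1.0)  # uses the post-call value

d_memory = {"p": 2.5}
d_v = transpose_store(d_memory, "p")

print(correct, wrong, d_v, d_memory["p"])
```

Feeding the post-call value into the backward pass gives a gradient for the wrong input, which is why the commit keeps a duplicate of the primal variable.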
2023-07-12	Pool inst worklists and hashsets to avoid rehashing. (#2982)	Yong He
	Co-authored-by: Yong He <yhe@nvidia.com>
2023-07-12	Create and cache flattened inheritance lists (#2740)	Theresa Foley
	* Create and cache flattened inheritance lists The basic change here is to have a cached lookup that can map a `Type`, or a `DeclRef` that might refer to a type or `extension`, to a list of the facets that comprise it. The notion of a facet here is similar to what the C++ standard calls "sub-objects". A declared type like a `struct` has: * a facet for its own direct members * one facet for each of its (transitive) base `struct` types * one facet for each `interface` it conforms to * one facet for each `extension` that applies to that type The set of facets for a type is de-duplicated (so that "diamond" inheritance patterns don't cause issues) and deterministically ordered, using a variation of the C3 linearization algorithm. The creation of a linearized list of facets should help the compiler implementation in two key places: * Testing if a type implements an interface (or inherits from a base type) should now only take time linear in the number of (transitive) bases of that type. We can simply scan the linearized facet list to see if it contains a facet corresponding to the given base. * Looking up the members of a type (or a value of a given type) should be greatly simplified, since all of the members can be found in a single linear scan of the facet list. In addition, those facets will be ordered so that facets for "more derived" types will precede those for "less derived" types, so that shadowing in the case of overrides should be easier to implement. This change only implements the first of these two improvements, since there is already a lot of churn involved. 
Notes and caveats: * The handling of conjunction types (e.g., `IFoo & IBar`) complicates the implementation, both because the simple approach to subtype testing alluded to above is no longer complete, and also because we need to be more careful about what forms of subtype witnesses we construct, so that we can maintain the currently-required invariant that two witnesses are only equal if they have matching structure. * We don't implement the full/"proper" C3 algorithm here because it has some failure cases that we'd still like to support. In particular if we have both `IX : IA, IB` and `IY : IB, IA`, the C3 algorithm says it is illegal to have `IZ : IX, IY` because the two bases it inherits from disagree on the relative ordering of `IA` and `IB` in their own linearizations. Handling such cases may make our implementation less efficient, and it will also require testing of those corner cases. * When it comes time to revamp the implementation of lookup, we will need to deal with the fact that a single linear list (seemingly) cannot give us sufficient information to decide which of two members of the same name should shadow the other, or if there is an ambiguity. Or rather, it can give us that information if we are willing to accept some very user-unfriendly behavior and simply say that declarations earlier in the linearization always shadow later declarations, even if the facets involved are not related by an inheritance relationship of any kind. * In order to remove one kind of vicious circularity from the approach, the linearization that we are computing for `extension` declarations will not be sufficient for lookups in the body of such an `extension`. A future change may need to have support for creating and caching two distinct linearizations for each `extension`: one that is to be used when that `extension` is pulled into the linearization for a type that it applies to, and another for when lookup will be performed in the context of the `extension` itself. 
* This change does not include the simple expedient of adding a direct cache for subtype tests to the `SharedSemanticsContext`, although adding such a cache would be a simple matter. * This change introduces more deduplication for subtype witnesses, which should enable more deduplication for other `Val`s (including `Type`s), but it does not introduce any assumptions that equal `Val`s or `Type`s must have identical pointer representations. * Eventually we may find that, similar to the situation with `Type`s, we will want to have a split between surface-level and canonicalized versions of other `Val`s, including subtype witnesses. * Fix clang error. * remove debugging code. --------- Co-authored-by: Yong He <yonghe@outlook.com>
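For readers unfamiliar with C3, the textbook algorithm that this commit takes as its starting point can be sketched as below. This is the standard "proper" C3 in Python, not the variation Slang implements (the message above says the full algorithm is deliberately not used), and `c3_merge`, `linearize` and `implements` are hypothetical helper names.

```python
def c3_merge(sequences):
    """Merge linearizations, always taking a head that appears in no other tail."""
    sequences = [list(s) for s in sequences if s]
    result = []
    while sequences:
        for seq in sequences:
            head = seq[0]
            if not any(head in other[1:] for other in sequences):
                break
        else:
            raise ValueError("bases disagree on relative ordering")
        result.append(head)
        # Drop the chosen head everywhere, then discard exhausted sequences.
        sequences = [[x for x in s if x != head] for s in sequences]
        sequences = [s for s in sequences if s]
    return result


def linearize(name, bases):
    """Return the de-duplicated, deterministically ordered list for `name`."""
    direct = bases.get(name, [])
    return [name] + c3_merge([linearize(b, bases) for b in direct] + [direct])


def implements(name, base, bases):
    """Subtype test as a single linear scan of the flattened list."""
    return base in linearize(name, bases)


# Diamond inheritance: A appears once, after both B and C.
diamond = {"D": ["B", "C"], "B": ["A"], "C": ["A"], "A": []}
print(linearize("D", diamond))

# The failure case from the message: IX and IY order IA and IB differently.
conflicting = {"IZ": ["IX", "IY"], "IX": ["IA", "IB"], "IY": ["IB", "IA"],
               "IA": [], "IB": []}
try:
    linearize("IZ", conflicting)
except ValueError as err:
    print("rejected:", err)
```

The rejected `IZ : IX, IY` case is the one the message says Slang would still like to support, so its linearization must relax this check in some way that the sketch does not attempt to reproduce.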
2023-07-12	Use scratchData on `IRInst` to replace HashSets. (#2978)	Yong He
	* Use scratchData on `IRInst` to replace HashSets. * Update test results. * Initialize scratchData. * Update autodiff documentation. * Use enum instead of bool. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-07-12	Extend `no_diff` to support subscript operations on resources and array ↵	Sai Praveen Bangaru
	variables… (#2981) * Extend `no_diff` to support subscript operations on resources and array variables * Update autodiff.slang.expected
2023-07-12	Fix native string emit for CUDA/Cpp backend. (#2980)	Yong He
	Co-authored-by: Yong He <yhe@nvidia.com>
2023-07-11	Add perf benchmark utility. (#2977)	Yong He
	* Add perf benchmark utility. * Update documentation. * Fix. * Fix doc. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-07-10	Add support for texture footprint queries (#2970)	Theresa Foley

2023-07-10	Fix hit object emit for HLSL + FuncType specialization bug fix. (#2976)	Yong He
	* Fix hit object emit for HLSL. * Fix a bug involving specialization of function type. * Add a test case. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-07-10	Add glsl intrinsic for SampleCmpLevelZero with offset and correct existing ↵	Ellie Hermaszewska
	intrinsic (#2975) * Correct glsl intrinsic for SampleCmpLevelZero without offset * Add glsl intrinsic for SampleCmpLevelZero with offset * Add test for samplecmplevelzero glsl translation --------- Co-authored-by: Yong He <yonghe@outlook.com>
2023-07-10	Do not fail when emitting GLSL using unorm/snorm textures (#2973)	Ellie Hermaszewska
	* Do not fail when emitting GLSL using unorm/snorm textures Ignored in glslang https://github.com/KhronosGroup/glslang/blob/main/glslang/HLSL/hlslGrammar.cpp#L1476 * Add test for unorm modifier on glsl
2023-07-07	Make DeclRefBase a Val, and DeclRef<T> a helper class. (#2967)	Yong He
	* Make DeclRefBase a Val, and DeclRef<T> a helper class. * Fixes. * Workaround gcc parser issue. * Revert NodeOperand change. * Fix. * Fix clang incomplete class complains. * Fix code review. * Small cleanups and improvements. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-07-07	Do not use member function of incomplete SemanticsVisitor (#2968)	Ellie Hermaszewska
	Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>
2023-07-06	Fix erroneous error claiming variable is being used before its declaration ↵	Ellie Hermaszewska
	(#2958) * Simplify type of diagnoseImpl * Show source line for Note diagnostics, opting out of this where appropriate * Make declared after use diagnostic clearer * Fix erroneous error claiming variable is being used before its declaration Closes https://github.com/shader-slang/slang/issues/2936 * Fix build on msvc --------- Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>
2023-07-06	Add support for length for scalar floating point types. (#2965)	jsmall-nvidia

2023-07-06	Work around for NonUniformResourceIndex with non integral types. (#2963)	jsmall-nvidia
	* Work around for NonUniformResourceIndex with non integral types. * Make the non integral NonUniformResourceIndex, inline early. * Add a deprecation warning.
2023-07-05	Bottleneck DeclRef creation through ASTBuilder. (#2689)	Yong He
	* Bottleneck DeclRef creation through ASTBuilder. * Fix clang error. * Fix. * Fix. * More fix. * Rebase on top of tree. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-07-05	Squash some warnings (#2956)	Ellie Hermaszewska
	* restrict -Wno-assume to clang (gcc does not have this warning) * Add move where possible Annoyingly this warns for c++17, but will not be necessary with c++20 * Do not partially initialize struct * Remove unused variable * Silence unused var warning It is actually still referenced from an uninstantiated (for now) template * Use unused var --------- Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>
2023-07-05	Disable l-value coercion for ref types (#2960)	jsmall-nvidia
	* Make lvalue coercion not work for ref, to stop problem with atomics (for GLSL output). * Improve some comments.
2023-07-05	Initial sizeof/alignof implementation. (#2954)	jsmall-nvidia
	* Initial sizeof implementation. * Small macro improvement. * Fix some typos. * Refactor NaturalSize. Add more sizeof tests. * Use _makeParseExpr to add sizeof support. * Add size-of.slang diagnostic result. * Fix typo in folding with macro change. * Add a sizeof test of This. * Some more NaturalSize coverage. * Simple alignof support. * Testing for alignof. * Added 8 bit enum to check enums values are correctly sized. * Add alignof to completion. * Lower sizeof/alignof to IR. sizeof/alignof IR pass. Tests for simple generic scenarios. * Make append handle invalid properly. Improve comments. --------- Co-authored-by: Theresa Foley <10618364+tangent-vector@users.noreply.github.com>
2023-07-03	Refactor "meta" decls for stdlib texture types (#2932)	Theresa Foley
	We use some ad-hoc "template engine" code generation / metaprogramming to generate many of the declarations in the Slang standard library. In many cases the level of meta-ness is (relatively) manageable, but one of the biggest tangles in the whole thing is the generation of the texture-related types. We basically have a single set of nested `for` loops that generate all types of the form: (RW|RasterizerOrdered|/*/)(Texture|Sampler)(1D|2D|...)Array?MS? Inside that loop we then have tons of conditional logic to determine: * Which points in the cross-product space should be skipped, rather than emitted as a type. * Which methods to emit, or not. * The type signature(s) of those methods. * The translation of those methods for each target (via `__target_intrinsic`) The code ends up being long, complicated, and very hard to maintain or extend. This change takes a first small step to try to help us get the complexity more under control. The basic approach is that the data that defines each point in the cross-product space is aggregated into a `TextureTypeInfo` structure in the meta-level code, and then the logic for emitting the declarations related to a given texture type is expressed as a member function of that type. The intention is that this design will more easily allow the meta-level code to be factored into distinct subroutines, and enable us to clean up and re-use recurring bits of text that need to appear in the output. It is possible (though I am not yet predicting it) that we will end up wanting to utilize a bit of an inheritance hierarchy on `TextureTypeInfo` to allow us to more cleanly factor out code that is specific to certain cases (e.g., there is only a small amount of sharing between `RW`/`RasterizerOrdered` and read-only texture types). It is intentional that this step introduces no significant changes to the logic that used to be inside the loop (and is now inside of a method). 
Instead, the goal is to minimize the scale of the diffs that reviewers might be expected to deal with in follow-on changes. Co-authored-by: Yong He <yonghe@outlook.com>
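The refactoring idea in this commit (one record per point in the cross-product, with emission logic as a method on that record) might look roughly like this. It is a Python sketch rather than the actual C++ meta-level code: only the name `TextureTypeInfo` comes from the message, while the fields, the skip rules in `is_valid`, and the placeholder output of `emit_decl` are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class TextureTypeInfo:
    access: str     # "", "RW" or "RasterizerOrdered"
    base: str       # "Texture" or "Sampler"
    shape: str      # "1D", "2D", "3D" or "Cube"
    is_array: bool
    is_ms: bool

    def is_valid(self):
        # Illustrative skip rules only; the real list of skipped points is larger.
        if self.is_ms and self.shape != "2D":
            return False
        if self.is_array and self.shape == "3D":
            return False
        return True

    def type_name(self):
        # HLSL spells the multisampled array type Texture2DMSArray,
        # so "MS" is placed before "Array" here.
        return (self.access + self.base + self.shape
                + ("MS" if self.is_ms else "")
                + ("Array" if self.is_array else ""))

    def emit_decl(self):
        # Placeholder text; real declarations carry methods and intrinsics.
        return f"struct {self.type_name()} {{ /* methods go here */ }};"


def all_texture_types():
    """Enumerate the cross-product and keep only the points that are emitted."""
    points = product(["", "RW", "RasterizerOrdered"], ["Texture", "Sampler"],
                     ["1D", "2D", "3D", "Cube"], [False, True], [False, True])
    return [info for info in (TextureTypeInfo(*p) for p in points) if info.is_valid()]


names = [info.type_name() for info in all_texture_types()]
print(len(names), names[:4])
```

Keeping the skip rules and naming in methods on the record, rather than inline in the nested loops, is what lets later changes split the emission logic into smaller subroutines.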
2023-06-30	Non-Recursive CFG DFS. (#2953)	Yong He
	Co-authored-by: Yong He <yhe@nvidia.com>
2023-06-30	Fix for operator assignment issue (#2951)	jsmall-nvidia
	* WIP handling LValue coercion via LValueImplicitCast * Need to have the ptr type for the cast. * Casting conversion working on C++. * Make the LValue casts record if in or in/out as we can produce better code if we know the difference. * WIP LValueCast pass * Fix tests so we don't fail because downstream compilers detect use of uninitialized variable. * Do conversions through tmp for l-value scenarios that can't work other ways. * Fix a typo. * Change diagnostic implicit-cast-lvalue for a type that still exhibits the issue. * Add matrix test. * Added a bit more clarity around LValue casting choices. * Small comment improvements. Improvements based on comments on PR. * Use findOuterGeneric.
2023-06-29	Issue diagnostic for incorrect parameter types & directionality when ↵	Sai Praveen Bangaru
	defining custom derivatives (#2947) * Issue diagnostic for incorrect directionality when defining custom derivative * Better diagnostics on invalid custom derivatives * Avoid duplicating `getParameterDirection()` --------- Co-authored-by: Yong He <yonghe@outlook.com>
2023-06-29	Apply SCCP on global scope before unrolling loops. (#2952)	Yong He
	* Apply SCCP on global scope before unrolling loops. * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-06-29	Small fixes to GLSL-legalize and func-property prop. (#2950)	Yong He
	Co-authored-by: Yong He <yhe@nvidia.com>
2023-06-29	Warn on semicolon after `if`. (#2948)	Yong He
	* Warn on semicolon after `if`. * add test result --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-06-28	Fix parameter block loads in GLSL emit. (#2946)	Yong He
	* Fix parameter block loads in GLSL emit. * Revert `[NoSideEffect]` declarations in DXR1.1 API. * fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>