| Age | Commit message (Collapse) | Author |
|
|
|
|
|
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
|
|
* Add option to use original entrypoint in spirv output.
* Fix.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Redesign DeclRef + Deduplicate Val.
* Update project files
* Fix warning.
* Fix.
* Fix.
* Remove `Val::_equalsImplOverride`.
* Rmove `Val::_getHashCodeOverride`.
* Remove `semanticVisitor` param from `resolve`.
* Cleanups.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
|
|
* Disable code motion for expensive insts (call & div)
The current redundancy removal pass does not consider control-flow within loops and as a result can sometimes move dynamic dispatch code outside their switch blocks, if they are nested in a single-iter-loop.
* Update liveness.slang.expected
|
|
* Cast integer literals.
* Fix expected output.
* For CUDA, search global instructions to see what types are used.
Improve lookup for fp16 header in CUDA.
* Fix issue with f16tof32
* Small improvement around finding used base types.
|
|
* Generalize collectInductionValues
* Support affine transformations of loop index as induction variables
* Test for generalized induction value collection
* Neaten inductive variable finding
* Store the type of implication success when finding inductive variables
* Test that loop induction finding does not alway succeed
* Support chains of additions and branches of additions in induction variable finding
* Use c++17 for downstream compilers
|
|
* A more way robust way to handle resource consumption might use multiple `kind`s on GLSL emit.
* Improve method naming and some comments.
* Small consistency fix.
* Fix issue with #elif evaluation.
* Add a test.
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Fix -fvk-u-shift not applying to global constant buffer.
* Fix test.
* Fix.
* Fix.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Refactor `dmul(This, Differential)` to `dmul<T:Real>(T, Differential)`
- Add AST synthesis support for generic containers
- Refactor relevant tests
* Merge dmul synthesis with dadd and dzero, and disambiguate using an enum
* Fix trailing spaces
|
|
* Fix -fvk-u-shift not applying to RWStructuredBuffer on glsl output.
* Add `transpose` to `ObjectToWorld4x3`.
* Fix scalar swizzle causes invalid glsl output.
Fixes #3026.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Fix -fvk-u-shift not applying to RWStructuredBuffer on glsl output.
* Add `transpose` to `ObjectToWorld4x3`.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
Translates to textureQueryLod().x (with the Unclameped variant being returned in the .y component)
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
The translation to GLSL is incomplete as intrinsics only exist for some combination of comparison and channel (just channel 0)
Closes https://github.com/shader-slang/slang/issues/3021
|
|
This allows Visual Studio debugger to skip over AST visitor dispatch functions and stop directly at the `visit*` functions when stepping into a `dispatch*` call.
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
calls (#3010)
* Add test for nodiff diagnostic for non-diff call propagated through diff call
* Add logic to disambiguate calls to differentiable and non-differentiable methods
* Add expected results for test
* Simplify test
* Update slang-ir-check-differentiability.cpp
* Added comments for TreatAsDifferentiableExpr flavors
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
emit (#3009)
* A more way robust way to handle resource consumption might use multiple `kind`s on GLSL emit.
* Improve method naming and some comments.
* Small consistency fix.
|
|
loops (#3005)
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
* Handle different resource kinds that can appear via the vk-shift-* allowing some HLSL kinds in GLSL emit.
* Determine the used kind for emit.
* Added vk-shift-uniform-issue.slang
* Use a better function name.
Improve comments.
|
|
* Add `sampleCount` parameter for MS textures.
* Fix test.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Improvements to HLSLToVulkanLayoutOptions.
* WIP vk-shift-* with HLSL like binding.
Detecting clashes.
* Shift example seems to be working correctly.
One oddness is that "used" data is now reflected, as we only enable for D3D shader resource types. Now we use those with inferred VK mode they appear.
* Implicit seems to work.
* Disable inference with Sampler/CombinedTextureSampler.
I guess? we could just use the HLSL texture register binding to infer.
* Report overlapping ranges in diagnostic.
The hlsl-to-vulkan-shift-diagnostic result might be surprising but it is correct, because u is automatically laid out so consumes DescriptorSlot 0, but that's already consumed by c.
* First attempt at array layout with infer on Vulkan.
* Fix the vulkan shift output.
* Array example.
|
|
* Remove unneccessary calls to `simplifyIR`.
* fix.
* Delete obsolete hoistConst pass.
* Fix.
* Small improvements.
* Fix.
* Fix enum lowering.
* fix
* tweaks.
* tweaks.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Small fixes and improvements around reflection tool.
* Make PrettyWriter printing a class.
* Aftermath crash demo WIP.
* Enable aftermath in test project.
* Setting failCount.
* Dumping out of source maps.
* Improve comments.
Simplify handling of compile products.
* Other small fixes to aftermath example.
* Added Emit SourceLocType.
Track sourcemap association meaning.
Improved documentation.
* Small improvements.
* Capture debug information for D3D11/D3D12/Vulkan.
* Enable debug info.
* Small improvements.
* Improve aftermath example README.md.
|
|
* Simplify lookup.
* Various bug fixes.
* Report type dictionary size in perf benchmark.
* Remove type duplication.
* increase initial dict size.
* Bug fix.
* Fix bugs.
* Fixup.
* Revert type legalization looping.
* Fix specialization pass.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Fix vk-shift-* mappings.
* Add some doc info about vk-shift.
* Fix diagnostic test.
|
|
the command line options. (#2989)
|
|
parameters (#2985)
* Add a new test case for checking loop in the reverse mode
* Create duplicate var for primal value to avoid inconsistent inputs to backward call
Fixes an issue with inout parameters where the backprop call may use the post-call value of the var instead of the pre-call value.
* `IRStore`s transpose to `IRLoad` and an `IRStore(0)` to clear differential.
Fixes some subtle issues around transposing
* Simplify test
* Delete out-edited.hlsl
---------
Co-authored-by: Lifan Wu <lifanw@nvidia.com>
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Create and cache flattened inheritance lists
The basic change here is to have a cached lookup that can map a `Type`,
or a `DeclRef` that might refer to a type or `extension`, to a list of
the *facets* that comprise it.
The notion of a *facet* here is similar to what the C++ standard calls
"sub-objects".
A declared type like a `struct` has:
* a facet for its own direct members
* one facet for each of its (transitive) base `struct` types
* one facet for each `interface` it conforms to
* one facet for each `extension` that applies to that type
The set of facets for a type is de-duplicated (so that "diamond"
inheritance patterns don't cause issues) and deterministically ordered,
using a variation of the C3 linearization algorithm.
The creation of a linearized list of facets should help the compiler
implementation in two key places:
* Testing if a type implements an interface (or inherits from a base
type) should now only take time linear in the number of (transitive)
bases of that type. We can simply scan the linearized facet list to
see if it contains a facet corresponding to the given base.
* Looking up the members of a type (or a value of a given type) should
be greatly simplified, since all of the members can be found in a
single linear scan of the facet list. In addition, those facets will
be ordered so that facets for "more derived" types will precede those
for "less derived" types, so that shadowing in the case of overrides
should be easier to implement.
This change only implements the first of these two improvements, since
there is already a *lot* of churn involved.
Notes and caveats:
* The handling of conjunction types (e.g., `IFoo & IBar`) complicates
the implementation, both because the simple approach to subtype
testing alluded to above is no longer complete, and also because
we need to be more careful about what forms of subtype witnesses
we construct, so that we can maintain the currently-required invariant
that two witnesses are only equal if they have matching structure.
* We don't implement the full/"proper" C3 algorithm here because it has
some failure cases that we'd still like to support. In particular if
we have both `IX : IA, IB` and `IY : IB, IA`, the C3 algorithm says it
is illegal to have `IZ : IX, IY` because the two bases it inherits
from disagree on the relative ordering of `IA` and `IB` in their
own linearizations. Handling such cases may make our implementation
less efficient, and it will also require testing of those corner
caes.
* When it comes time to revamp the implementation of lookup, we will
need to deal with the fact that a single linear list (seemingly)
cannot give us sufficient information to decide which of two members
of the same name should shadow the other, or if there is an ambiguity.
Or rather, it *can* give us that information if we are willing to
accept some very user-unfriendly behavior and simply say that
declarations earlier in the linearization always shadow later
declarations, even if the facets involved are not related by an
inheritance relationship of any kind.
* In order to remove one kind of vicious circularity from the approach,
the linearization that we are computing for `extension` declarations
will not be sufficient for lookups in the body of such an `extension`.
A future change may need to have support for creating and caching
two distinct linearizations for each `extension`: one that is to be
used when that `extension` is pulled into the linearization for a
type that it applies to, and another for when lookup will be performed
in the context of the `extension` itself.
* This change does *not* include the simple expedient of adding a direct
cache for subtype tests to the `SharedSemanticsContext`, although
adding such a cache would be a simple matter.
* This change introduces more deduplication for subtype witnesses,
which should enable more deduplication for other `Val`s (including
`Type`s), but it does not introduce any assumptions that equal
`Val`s or `Type`s must have identical pointer representations.
* Eventually we may find that, similar to the situation with `Type`s,
we will want to have a split between surface-level and canonicalized
versions of other `Val`s, including subtype witnesses.
* Fix clang error.
* remove debugging code.
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
* Use scratchData on `IRInst` to replace HashSets.
* Update test results.
* Initialize scratchData.
* Update autodiff documentation.
* Use enum instead of bool.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
variables… (#2981)
* Extend `no_diff` to support subscript operations on resources and array variables
* Update autodiff.slang.expected
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Add perf benchmark utility.
* Update documentation.
* Fix.
* Fix doc.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
|
|
* Fix hit object emit for HLSL.
* Fix a bug involving specialization of functon type.
* Add a test case.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
intrinsic (#2975)
* Correct glsl intrinsic for SampleCmpLevelZero without offset
* Add glsl intrinsic for SampleCmpLevelZero with offset
* Add test for samplecmplevelzero glsl translation
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
* Do not fail when emitting GLSL using unorm/snorm textures
Ignored in glslang https://github.com/KhronosGroup/glslang/blob/main/glslang/HLSL/hlslGrammar.cpp\#L1476
* Add test for unorm modifier on glsl
|
|
* Make DeclRefBase a Val, and DeclRef<T> a helper class.
* Fixes.
* Workaround gcc parser issue.
* Revert NodeOperand change.
* Fix.
* Fix clang incomplete class complains.
* Fix code review.
* Small cleanups and improvements.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>
|
|
(#2958)
* Simplify type of diagnoseImpl
* Show source line for Note diagnostics, opting out of this where appropriate
* Make declared after use diagnostic clearer
* Fix erroneous error claiming variable is being used before its declaration
Closes https://github.com/shader-slang/slang/issues/2936
* Fix build on msvc
---------
Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>
|