| Age | Commit message (Collapse) | Author |
|
|
|
* Add type layout for structured buffer
* Default to generating spirv directly
* vk test for compute simple
* Add spirv-dis as a downstream compiler
* Emit Array types in SPIR-V
* makevector for spirv
* Dump whole spirv module on validation failure
* register array types
todo, use emitTypeInst
* Neater formatting for unhandled inst printing
* break out emitCompositeConstruct
* Correct array type generation
* neaten
* Allow getElement for vector
* Remove unused
* Allow predicating target intrinsics on types
* Consider functions with intrinsics to have definitions
We need to specialize these if they are predicated on types
* Correct array type generation
* makeArray for spir-v
* replace getElement with getElementPtr for spirv
* Correct translation of field access for spirv
* Push layouts to types for spirv
* Spirv intrinsics * operator now makes a pointer
* Add structured buffer of struct test
* Preserve type layout in spirv structured buffer legalization
* neaten
* makeVectorFromScalar for SPIRV
* placeholder for layouts on param groups
* More type safe spirv op construction
* Know that constants and types only go in one section
* Remove emitTypeInst
* Add todo for spirv sampling
* Add links to spirv documentation on emit functions
* OpTypeImage support for SPIR-V
* Add simpler texture test for spirv
* s/spirv_direct/spirv/g
* Allow several string literals in target_intrinsic
* Handle global params without a var layour for SPIR-V
For example groupshared vars
* uint spirv asm type
* Add todo for isDefinition
It is currently too broad
* Some atomic op spirv intrinsics
* Strip ConstantBuffer wrappers for spirv
* Add todo for matrix annotations
* Do not associate decorations insts with spirv counterparts
* Correct entry point parameter generation
* Spelling
* Assert that fieldAddress is returning a pointer
* Add error for existential type layout getting to spir-v emit
* Add IRTupleTypeLayout
Unused so far
* Allow getElementPtr to work with vectors
* Correct target name in test
* Hide default spirv direct behind a premake option --default-spirv-direct=true
* Do not insert space at start of intrinsic def
* Correct asm rendering in tests
* remove redundant option
* Emit directly from direct test
* Add source language options for spirv-dis
* Add comments to spirv dis
* Add dead debug print for before spirv module
* Correct asm rendering in tests
* s/spirv_direct/spirv/g
* Only specialize intrinsic functions with predicates
* regenerate vs projects
* squash warnings
* squash warnings
* remove duplication
* Silence warnings from msvc
* squash warnings
* Overload for zero sized array
* More msvc warnings
* warnings
* Add spirv-tools to path for tests
* Do not be specific about dxc version for diag test
* Normalize line endings from spirv-dis
* Correct filecheck matches
* Temporarily disable two spirv tests
Failing on CI, undebuggable hang :/
* Do not emit storage class more than once for spirv snippet
* Do not pass spir-v to spirv-dis by stdin
* Do not get spirv-dis output via stream, use file
* normalize file endings in spirv-dis output
|
|
* Support per field matrix layout
* Fix warnings.
* Fix.
* Fix tests.
* Fix spiv gen.
* Fix.
* More test fixes.
* Fix.
* Run only GPU tests on self-hosted servers.
* Remove -use-glsl-matrix-layout-modifier.
* Fix.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Support implciit casted swizzled lvalue.
* Fix warnings.
* Fix.
* fix comment.
* Prefer mangled linkage name for global params.
* Update tests.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Default target intrinsic should not apply to SPIR-V
* CapabilitySet is a conjunction
* Add TEXTUAL_SOURCE capability class
|
|
* Redesign DeclRef + Deduplicate Val.
* Update project files
* Fix warning.
* Fix.
* Fix.
* Remove `Val::_equalsImplOverride`.
* Rmove `Val::_getHashCodeOverride`.
* Remove `semanticVisitor` param from `resolve`.
* Cleanups.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
calls (#3010)
* Add test for nodiff diagnostic for non-diff call propagated through diff call
* Add logic to disambiguate calls to differentiable and non-differentiable methods
* Add expected results for test
* Simplify test
* Update slang-ir-check-differentiability.cpp
* Added comments for TreatAsDifferentiableExpr flavors
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
* Add `sampleCount` parameter for MS textures.
* Fix test.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Remove unneccessary calls to `simplifyIR`.
* fix.
* Delete obsolete hoistConst pass.
* Fix.
* Small improvements.
* Fix.
* Fix enum lowering.
* fix
* tweaks.
* tweaks.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Simplify lookup.
* Various bug fixes.
* Report type dictionary size in perf benchmark.
* Remove type duplication.
* increase initial dict size.
* Bug fix.
* Fix bugs.
* Fixup.
* Revert type legalization looping.
* Fix specialization pass.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Create and cache flattened inheritance lists
The basic change here is to have a cached lookup that can map a `Type`,
or a `DeclRef` that might refer to a type or `extension`, to a list of
the *facets* that comprise it.
The notion of a *facet* here is similar to what the C++ standard calls
"sub-objects".
A declared type like a `struct` has:
* a facet for its own direct members
* one facet for each of its (transitive) base `struct` types
* one facet for each `interface` it conforms to
* one facet for each `extension` that applies to that type
The set of facets for a type is de-duplicated (so that "diamond"
inheritance patterns don't cause issues) and deterministically ordered,
using a variation of the C3 linearization algorithm.
The creation of a linearized list of facets should help the compiler
implementation in two key places:
* Testing if a type implements an interface (or inherits from a base
type) should now only take time linear in the number of (transitive)
bases of that type. We can simply scan the linearized facet list to
see if it contains a facet corresponding to the given base.
* Looking up the members of a type (or a value of a given type) should
be greatly simplified, since all of the members can be found in a
single linear scan of the facet list. In addition, those facets will
be ordered so that facets for "more derived" types will precede those
for "less derived" types, so that shadowing in the case of overrides
should be easier to implement.
This change only implements the first of these two improvements, since
there is already a *lot* of churn involved.
Notes and caveats:
* The handling of conjunction types (e.g., `IFoo & IBar`) complicates
the implementation, both because the simple approach to subtype
testing alluded to above is no longer complete, and also because
we need to be more careful about what forms of subtype witnesses
we construct, so that we can maintain the currently-required invariant
that two witnesses are only equal if they have matching structure.
* We don't implement the full/"proper" C3 algorithm here because it has
some failure cases that we'd still like to support. In particular if
we have both `IX : IA, IB` and `IY : IB, IA`, the C3 algorithm says it
is illegal to have `IZ : IX, IY` because the two bases it inherits
from disagree on the relative ordering of `IA` and `IB` in their
own linearizations. Handling such cases may make our implementation
less efficient, and it will also require testing of those corner
caes.
* When it comes time to revamp the implementation of lookup, we will
need to deal with the fact that a single linear list (seemingly)
cannot give us sufficient information to decide which of two members
of the same name should shadow the other, or if there is an ambiguity.
Or rather, it *can* give us that information if we are willing to
accept some very user-unfriendly behavior and simply say that
declarations earlier in the linearization always shadow later
declarations, even if the facets involved are not related by an
inheritance relationship of any kind.
* In order to remove one kind of vicious circularity from the approach,
the linearization that we are computing for `extension` declarations
will not be sufficient for lookups in the body of such an `extension`.
A future change may need to have support for creating and caching
two distinct linearizations for each `extension`: one that is to be
used when that `extension` is pulled into the linearization for a
type that it applies to, and another for when lookup will be performed
in the context of the `extension` itself.
* This change does *not* include the simple expedient of adding a direct
cache for subtype tests to the `SharedSemanticsContext`, although
adding such a cache would be a simple matter.
* This change introduces more deduplication for subtype witnesses,
which should enable more deduplication for other `Val`s (including
`Type`s), but it does not introduce any assumptions that equal
`Val`s or `Type`s must have identical pointer representations.
* Eventually we may find that, similar to the situation with `Type`s,
we will want to have a split between surface-level and canonicalized
versions of other `Val`s, including subtype witnesses.
* Fix clang error.
* remove debugging code.
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
* Use scratchData on `IRInst` to replace HashSets.
* Update test results.
* Initialize scratchData.
* Update autodiff documentation.
* Use enum instead of bool.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
variables… (#2981)
* Extend `no_diff` to support subscript operations on resources and array variables
* Update autodiff.slang.expected
|
|
|
|
* Fix hit object emit for HLSL.
* Fix a bug involving specialization of functon type.
* Add a test case.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Make DeclRefBase a Val, and DeclRef<T> a helper class.
* Fixes.
* Workaround gcc parser issue.
* Revert NodeOperand change.
* Fix.
* Fix clang incomplete class complains.
* Fix code review.
* Small cleanups and improvements.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Bottleneck DeclRef creation through ASTBuilder.
* Fix clang error.
* Fix.
* Fix.
* More fix.
* Rebase on top of tree.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* restrict -Wno-assume to clang (gcc does not have this warning)
* Add move where possible
Annoyingly this warns for c++17, but will not be necessary with c++20
* Do not partially initialize struct
* Remove unused variable
* Silence unused var warning
It is actually still referenced from an uninstantiated (for now) template
* Use unused var
---------
Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>
|
|
* Initial sizeof implementation.
* Small macro improvement.
* Fix some typos.
* Refactor NaturalSize.
Add more sizeof tests.
* Use _makeParseExpr to add sizeof support.
* Add size-of.slang diagnostic result.
* Fix typo in folding with macro change.
* Add a sizeof test of This.
* Some more NaturalSize coverage.
* Simple alignof support.
* Testing for alignof.
* Added 8 bit enum to check enums values are correctly sized.
* Add alignof to completion.
* Lower sizeof/alignof to IR.
sizeof/alignof IR pass.
Tests for simple generic scenarios.
* Make append handle invalid properly.
Improve comments.
---------
Co-authored-by: Theresa Foley <10618364+tangent-vector@users.noreply.github.com>
|
|
* WIP handling LValue coercion via LValueImplicitCast
* Need to have the ptr type for the cast.
* Casting conversion working on C++.
* Make the LValue casts record if in or in/out as we can produce better code if we know the difference.
* WIP LValueCast pass
* Fix tests so we don't fail because downstream compilers detect use of uninitialized variable.
* Do conversions through through tmp for l-value scenarios that can't work other ways.
* Fix a typo.
* Change diagnostic implicit-cast-lvalue for a type that still exhibits the issue.
* Add matrix test.
* Added a bit more clarity around LValue casting choices.
* Small comment improvements.
Improvements based on comments on PR.
* Use findOuterGeneric.
|
|
* WIP looking at reflection with pointers.
* Added GetPointerLayout.
* Initial test via reflection with layout of ptr type.
* WIP handles ptrs to types that have layout that hasn't been completed.
* Move tests to ptr.
* WIP try to take into account lowering correctly between AggTypeDecl and Type, but doesn't quite work.
* WIP a different path to handling recursive lowering problem with Ptr.
* Fix issues with reflection output.
* Small tidy.
* Fix for infinite recursion issue.
* Lower IRPointerTypeLayout
* Working with generics.
Has a hack to work around Layout around Ptr in IR.
The reflection around the generic - the name isn't much use, it should probably have the generic parameters, but that would require getName to do something more sophisticated.
* Fix issue around calling finishOuterGenerics to early.
* Remove feature/ptr test.
* Fix type legalization being an infinite loop with Ptr self referencing.
* Disable the pointer self reference test because produces an infintie loop on emit.
* Fixed comment based on review.
* Fix for issue with emit and pointers causing infinite recursion.
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Small fixes and improvements around reflection tool.
* Make PrettyWriter printing a class.
* Add HLSL output support for [flatten] and [branch]
* Handle [branch] on switch.
|
|
* Fixes for Shader Execution Reordering on VK
There are some mismatches between the way that hit objects are
handled between the current NVAPI/HLSL and proposed GLSL extensions
for shader execution reordering. These mismatches create complications
for generating valid GLSL/SPIR-V code from input Slang.
Many of the problems that apply to `HitObject` also apply to the
existing `RayQuery<>` type used for "inline" ray tracing.
In the case of `RayQuery<>` we have that for *both* HLSL and
GLSL/SPIR-V:
* A `RayQuery` (or `rayQueryEXT`) is an opaque handle to underlying
mutable storage
* The storage that backs a `RayQuery` is allocated as part of the
"defualt constructor" for a local variable declared with type
`RayQuery`.
* The `RayQuery` API provides numerous operations that mutate the
storage referred to by the opaque handle.
The key difference between HLSL and GLSL/SPIR-V for the case of a
`RayQuery` amounts to:
* In HLSL, local variables of type `RayQuery` can be assigned to,
and assignment has by-reference semantics. It is possible to create
multiple aliased handles to the same underlying storage.
* In GLSL/SPIR-V, local variables of type `rayQueryEXT` cannot be
assigned to, returned from functions, etc. It is impossible to
create multiple aliased handles to the same underlying storage.
The case for `HitObject`s is signicantly *more* messy, because:
* In NVAPI/HLSL a `HitObject` is effectively a "value type" in that
it only exposes constructors, and there is no way to mutate the
state of a `HitObject` other than by assignment to a variable of that
type. It makes no semantic difference whether a `HitObject` directly
stores the value(s), or if it is a handle, since there is no way
to introduce aliasing of mutable state. Assignment of `HitObject`s
semantically creates a copy.
* In GLSL/SPIR-V, a `hitObjectNV` is, like a `rayQueryEXT`, a handle
to underlying mutable state. These handles cannot be assigned,
returned from functions, etc. There is no way to make a copy of
a hit object.
This change includes several changes to how *both* `RayQuery<>` and
`HitObject` are implemented, with the intention of getting more cases
to work correctly when compiling for GLSL/SPIR-V, and to set up a
more clear mental model for the semantics we want to give to these
types in Slang, and how those semantics can/should map to our targets.
An overview of important changes:
* Marked a few operations on `RayQuery` as `[mutating]` that
realistically should have already been that way.
* Marked the `HitObject` type as being non-copyable (an attribute we
do not currently enforce), and marked the various GLSL operations that
construct a hit object as having an `out` parameter of the `HitObject`
type (even if they are nominally specified in GLSL as not writing
to the correspondign parameter).
* Added a distinct IR opcode (`allocateOpaqueHandle`) to represent the
implicit allocation that happens when declaring a variable of type
`HitObject` or `RayQuery`, and made the "implicit constructor" for
those types map to the new op. This operation took a lot of tweaking
to get emitting in a reasonable way, and I'm still not 100% sure that
all of the emission-related logic for it is strictly required
(or correct).
* Added new IR instructions for `HitObject` and `RayQuery` types, and
made the stdlib types map to those IR instructions.
* Treat `HitObject` and `RayQuery` as resource types for the purpose
of our existing pass that specializes calls to functions that have
outputs of resource type
* Added a new test case that includes a function that returns a
`HitObject` as its result.
* Many test cases saw slight changes in their output (especially around
the relative ordering of declarations of `HitObject`s and `RayQuery`s
with other instructions)
* Remove debugging logic
|
|
* Preserve type cast during AST constant folding.
Fixes #2891.
* Fix.
* Fix truncating.
* fix test.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Diagnose on div-by-zero during sccp.
* fix
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Fusion pass for saturated_cooperation
* simplify assert
* regenerate vs projects
* missing test output files
* rename shadowing variable to appease msvc
* Fuse calls to sat_coop with differing inputs
* formatting
* add cpu test for hof simple
* Make higher-order functions into compute comparison tests
* comment tests
* remove redundant test
* Add test to confirm inlining in sat_coop fuse
* Add clarifying comment for sat coop fusing
* Add KnownBuiltin decoration
* s/CanUseFuncSignature/TypesFullyResolved for higher order function checking
* Add TODO
* spelling
* Correct detection of sat_coop calls
* Disable tests which are unsupported on testing infra
|
|
* MVP for higher order functions
* Add shader subgroup partitioned glsl intrinsics
* Implement parsing and checking for tuple types
Currently there is no way to do anything useful with them from the source language however
* neaten
* Correct precedence of function type parsing
* neaten
* higher order function tests
* function types of any arity
* Inference for higher order functions
* Add second test for unsynchronized params
* regenerate vs projects
* dx11 -> dx12 for saturated cooperations tests
* Disable saturated cooperation tests on vulkan
They fail on release builds in CI, not essential for the higher order function work however
* remove saturated-cooperation tests
* Remove unnecessary assert and clarify control flow in AddDeclRefOverloadCandidates
* Add Tuple type name mangling
* Use functype keyword to introduce function types
* Add more inference tests for hof
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
|
|
* Various dxc/fxc compatibility fixes.
* Cleanup.
* Fix test cases.
* Fix comments.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
|
|
|
|
* Autodiff support for dynamically dispatched generic method.
* Fix.
* Support dynamically dispatched generic type.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* WIP lowerCamel Dictionary.
* WIP more lowerCamel fixes for Dictionary.
* Add/Remove/Clear
* GetValue/Contains
* Fix tabs in dictionary.
Count -> getCount
* Fix fields with caps.
* Key -> key
Value -> value
Use m_ for members where appropriate.
Use lowerCamel in linked list.
* Some small fixes/improvements to Dictionary.
* Kick CI.
* Small tidy on String.
* Append -> append
* ToString -> toString
ProduceString -> produceString
* Small fixes.
* StringToXXX -> stringToXXX
* Fix typo introduced by Append -> append.
* Made intToAscii do reversal at the end.
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* WIP lowerCamel Dictionary.
* WIP more lowerCamel fixes for Dictionary.
* Add/Remove/Clear
* GetValue/Contains
* Fix tabs in dictionary.
Count -> getCount
* Fix fields with caps.
* Key -> key
Value -> value
Use m_ for members where appropriate.
Use lowerCamel in linked list.
* Some small fixes/improvements to Dictionary.
* Kick CI.
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Make output of obfuscation locs work in a slang-module.
* Tidy up detection for writing serialized source locs.
* Support for .zip references.
Handling of obfuscated source maps read from containers.
A test to check obfuscated source map working on a module.
* When using obfuscation, always obfuscate locs instead of stripping them. We keep a source map, so we can still produce reasonable errors.
* Write out source locs if debug information is enabled.
* Check output without sourcemap.
* Small fixes.
* Small improvements around hash calculation for source map name.
* Disable test that fails on x86 gcc linux for now.
* Fix issues around obfuscated source map using lines rather than columns.
Fix some issues around encoding/decoding.
* Make column calculation of source locs take into account utf8/tabs.
Don't special case obfuscated source map for lookup for source loc.
* Support following multiple source maps.
* Small fixes/improvements around SourceMap lookup.
|
|
* Diagnose on using uninitialized `out` param.
* Hack to allow `out Vertices<T>`.
* Fix.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Add a bunch of builder emit wrappers for constant indices
To avoid cluttering any calling code with int instruction construction
* Matrix swizzle stores
Closes https://github.com/shader-slang/slang/issues/2512
* Matrix swizzle store tests
* Squash vs warnings
* Select scalar for singular swizzles
* Test singular swizzle materialization
* Use IRIntegerValue over UInt for IR wrappers
* Correct size of swizzle vector type
* Remove variable shadowing
|
|
|
|
* Cleaner impl of unary stdlib derivative functions.
* fixup
* Fix.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Update checkpoint policy to make obvious recompute decisions.
Also adds an optimization to fold updateElement chains on the same array or struct into a single makeArray or makeStruct.
* Bug fixes around array types with different int typed count.
* change test.
* Fix.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Add PyTorch C++ binding generation.
* fix
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
|
|
* Fix crash.
* Fix `[ForwradDerivative]` on member functions.
* Update comments.
* Fix crash when [BackwardDerivative] is provided but not [ForwardDerivative].
* Allow calling dynamic dispatched generic method from differentiable func.
* Fix.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* WIP source map.
* Split out handling of RttiTypeFuncs to a map type.
* Make RttiTypeFuncsMap hold default impls.
* Slightly more sophisticated RttiTypeFuncsMap
* Source map decoding.
* Fix tabs.
* Fix asserts due to negative values.
* Use less obscure mechanisms in SourceMap.
* Source map decoding.
Simplifying SourceMap usage.
* First attempt at ouputting a source map as part of emit.
* Added support for -source-map option. SourceMap is added to the artifact.
* Small improvements around column calculation in SourceWriter.
* Source Loc obuscation WIP.
* Fix some issues around SourceMap obfuscation.
* Split out obfuscation into its own file.
* Keep obfuscated SourceMap even through serialization bottleneck.
|
|
|
|
* Add support for emitting cuda kernel and host functions.
* Update test.
* Fix cuda preamble emit.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Add `[CudaDeviceExport]` to allow exporting CUDA device functions.
* Fix.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|