| Age | Commit message (Collapse) | Author |
|
* [Metal] Fix global constant array emit.
* Try enable more tests.
|
|
* Implement for metal `SV_GroupIndex`
1. If we don't have `sv_GroupThreadId` available we create one using `SV_GroupIndex`s location data.
2. We emit code emulating `sv_GroupThreadId` from the same logic that CUDA/CPP uses.
* address most review comments
Addressed all but two: [1](https://github.com/shader-slang/slang/pull/4385#discussion_r1639058473) and [2](https://github.com/shader-slang/slang/pull/4385#issuecomment-2166934855)
I want to enable tests and be sure there is no bugs using CI before I redesign the code so I have a working fallback.
* address comment, enable tests
enable now functioning tests due to `SV_GroupIndex` working with metal
* syntax error with groupThreadID search
did `= param` instead of `= i.param`
* add `sv_groupid` for test + test fixes
* disable test that won't work regardless
|
|
* Fix and enable tests for metal.
* Fix.
* Fix.
* Fix tests.
* Fix warnings.
* Fix.
---------
Co-authored-by: Yong He <yonghe@Yongs-Mac-mini.local>
|
|
* Enable full test on macos.
* Add failing test to expected list.
* Fix CI script.
* Update expected failure list.
* Update test list.
|
|
* capability upgrade warning/error
adjusted implementation + tests to support a warning/error if capabilities are implicitly upgraded and test accordingly.
* add glsl profile caps
* add GLSL and HLSL capabilities to the associated capability
* syntax error in capdef
* only error if user explicitly enables capabilities
1. changed testing infrastructure to not set a `profile` explicitly,
2. Added tests to be sure this works as intended with user API and with slangc command line
* Change capability atom definitions and how Slang manages them to fix errors
1. most `glsl_spirv` version atoms have been removed from `.capdef`, instead we will translate `spirv` version atoms into `glsl_spirv` since there is no point in writing the same code twice in `.capdef` files to define `spirv` versions.
2. add spirv version, and hlsl sm version (and equivlent) capability dependencies
3. removed some stage requirments which were set on objects, keep the wrapper capabilities. I am keeping the wrapper capabilities since I am unaware on if there are stage limitations (spec says code in practice does not work).
* check internal version instead of version profile (_spirv_1_5 vs. spirv_1_5)
* remove unused OpCapability. adjust SPIRV version'ing again for glsl_spirv
* apply workaround for glslang bug with rayquery usage
* ensure capabilities targetted by a profile and added together by a user are valid
* remove additions to `spirv_1_*` wrapper
* spirv_* -> glsl_spirv fix
* fix bug where incompatable profiles would cause invalid target caps
* try to avoid joining invalid capabilities
* fix the warning/error & printing
* run through tests to fix capability system and test mistakes
many mistakes were mesh shaders doing `-profile glsl_450+spirv_1_4`. This is not allowed for a few reasons
1. the test tooling does not handle arguments the same as `slangc`
2. glsl_450 core profile does not support mesh shaders, nor does spirv_1_4. sm_6_5 does work in this senario
* set some sm_4_1 intrinsics to sm_4_0
* replace `GLSL_` defs with `glsl_`
* swap the unsupported render-test syntax for working syntax
* set d3d11/d3d12 profile defaults
this is required since sm version changes compiled code & behavior
* adjusted nvapi capabilities with atomics + d3d11 set to use sm_5_0 as per default
* cleanup
* address review
* incorrect styling
* change `bitscanForward` to work as intended on 32 bit targets
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
|
|
|
|
|
|
Fixes #4062
This change enables wide load/stores for byte-address-buffer backed
resources, when the data is accessed at an offset that is aligned.
**Goals**
- Improve performance by issuing wider instructions instead of sequence
of scalar instructions, for load and stores of byte-address buffers.
- Reduce code-size and readability of the generated shaders.
- Help naive users as well as ninja programmers, generate optimal code.
**Non Goals**
- Help with Structured buffers, or other resources.
- Target compilation time improvements.
**Key changes**
Adds 2 new overloads for Load and Store operations on ByteAddress Buffers.
1. Load / Store with an extra alignment parameter
```
resource.Load<T>(offset, alignment);
resource.Store<T>(offset, value, alignment);
```
2. LoadAligned / StoreAligned with no extra parameter,
with the same signature as orignial Load / Store.
```
resource.LoadAligned<T>(offset);
resource.StoreAligned<T>(offset, value);
```
- This overload will implicitly identify the alignment value,
from the base type T of the elementary unit of the resource.
**Supported resources**
1. Vectors
This can be upto 4 elements, i.e. float -- float4.
2. Arrays
This does not have a limit on number of elements, but on a
conservative estimate, we can limit to few hundreds.
3. Structures
This is used to group a resource of a single type.
```
struct {
float4 x;
}
```
**Code updates**
- Modified byte-address-ir legalize to handle struct, array and vector
kinds of load or store access
- Added custom hlsl stdlib functions to implement all the overloads for Load,
Store etc.
- Added C-like emitter, SPIR-V emitter for handling ByteAddressBuffers.
- Added a new core stdlib function intrinsic to wrap around alignOf<T>().
- Added a new peephole optimization entry to identify the equivalent
IntLiteral value from the alignOf<T>() inst.
- Added tests to check explicit, and implicit aligned Load and Store
operations.
|
|
* Avoid synthesis for when types can be used as their own differenial
+ Add test
* Add missing files..
* Fix issue with method synthesis for self-differential types
+ Add a generic test
* Fix
* Fix issue with out-of-date type resolution cache.
Witness tables created during the conformance checking phase not being taken into account during the decl type resolution phase because the epoch is not updated after conformance checking.
This leads to certain complex associated-type lookup chains (such as the one in tests/compute/assoctype-nested-lookup) not resolving properly and causing errors.
* Delete self-differential-type-synthesis-extension.slang
* Quick fix to repopulate stdlib cache for deferred stdlib loading
* Update slang-check-decl.cpp
|
|
Fixes #3533
- Add logic to perform aligned memory operations for loading from and storing to composite resources, like vectors within the ByteAddress legalize pass.
- Checks
Added a new test for byte address with/without alignment.
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
* Switch to direct-to-spirv backend as default.
* Fix slang-test.
* Fix.
* Fix.
|
|
Fixes #387676* ForceInline SampleLevel to allow decorations to apply
* explictly add all the SPIRVAsmOperand Insts in non-differentiable list, which might get inadvertently processed when these functions are inlined into the main shader
* Support NonUniformResourceIndex for SPIR-V target
Fixes #3876
* add a new IR instruction for NonUniformResourceIndex
* slang ir emitter for nonuniform resource index
* update the hlsl meta slang
* Add test cases for NonUniformResourceIndex access for buffers and textures, with/without cast, nested access etc.
* add default c-like emitter for nonuniformresourceinfo
* added hlsl emitter
* added glsl emitter
* requisites for spirv enabling
- new decorator for nonuniformresourceindex
- emitter for nonuniformresourceindex signature change
* add hasResourceType checker
* add rwStructBuffType in resourcetype checker
* add a case for nonuniformres in emitDecorations
* DO NOT COMMIT: This change adds special handling for RWStructBuf within the isResourceType function, if it is a pointer to this resource, return true to make it work with nonuniformres test
* spirv emitter for decorations - update the emitLocalInst to perform decorations at the end
* added main spirv emitter code
* slang emit spirv bugfix
* hacky way of supporting Call Inst
* move code to cleanup nonuniform inst into helper function
* remove stale codefrom test
* add spirv decoration for nonuniform
* update test to remove global variables
* update coherent-2 test
* update comment for special handling
* update the spirv legalize to handle nested nonuniforms
improved logic that handles call ops, rwstructbuf, nested nonuniforms
etc.
* update nonuniform-array-of-tex test
* missed removing nonuniform inst causing duplicate decorations
* add glsl and hlsl variants of nonuniform tests
* repurpose the hasResource function into something specific for nonuniform inst decoration helper
* clean up comments and code around spirv-legalization to emit nonuniform inst by recursively looking into the inst
* use the helper canDecorateNonUniformInst to convert `nonUniformResourceInfo` inst to decoration
* converted compute/unbounded-array-of-array cross compile test into a simple check test
* update contains Resource helper function to be more generic
* clean up the case for opcall handling with nonuniform resource inst
* update ptr to struct buffer check to be more explicit and rename the function to check for ptr to resource type
* update comments and fix the test for coherent
* fix typos
* update logic on spirv legalize to delete dead instructions - for some reason this doesn't automatically happen
* add comments to declarations
* add NonuniformResourceIndex to the non-differential inst list
|
|
|
|
* Implement short-circuit logic operator
Implement short-circuit evaluation for logic && and ||
operator.
The short-circuit behavior is only used when the operands
involved are scalar and the parent function is non-differentiable.
In implementation, we define a new class 'LogicOperatorShortCircuitExpr'
derived from 'OperatorExpr'. In the visitInvoke() call, we will create
a new expression object 'LogicOperatorShortCircuitExpr' if the
expression is logic && or ||. So that we can generate new IR code in the
new visit function 'visitLogicOperatorShortCircuitExpr' to implement the
short-circuit behavior.
Add new test to test the short-circuit behavior.
* Fix an compile issue occurred in Falcon test
Previously, we early return when at least one of the operands of
"&&" or "||" is vector in convertToLogicOperatorExpr call. However,
in that case the arguments involved in the expression have already been
type checked. When it falls-back to 'visitInvokeExpr', it will check
the arguments again, and some unexpected behavior could occur
which could in turn cause some internal error.
So we add a check in the 'visitInvokeExpr' to avoid double type checking
of arguments.
* Update glsl subgroup test to not use short-circuit
Since the short-circuit evaluation could cause the threads
diverging in subgroup intrinsics. So change the test to not
using "&&" to chain those subgroup intrinsics together. Instead,
using "&" to chain them together because those test functions have
the return value as bool.
* Disable short-circuit in few situations
Disable short-circuit in following situations:
1. generic parameter list
2. static const varible initialization
* Use a flag to indicate the enablement of short-circuit
Instead of using a struct to indicate the state of the outer
environment of current expression, use a simple bool flag to
indicate whether or not apply the short-circuit to current
expression because there few situations where we will disable
short-circuiting and in those circumstances, there is no nested.
Therefore, a flag is good enough to indicate the case.
* Disable short-circuit in index expression
Also fix the build issue. (A cleanup for the last change.)
* check both 'static' and 'const' modifiers
Previously we only check HLSLStaticModifier to decide whether or
not using short-circuit, but we really should check both 'static'
and 'const' modifiers together, because we only want to disable
the short circuit for init expression for 'static const' variable.
* relax the restriction of short-circuit for index expression
Disable the short-circuit for index expression only when declare
an array.
* Simplify the logic by creating subVisitor
Simplify the logic by create a sub expression visitor so
that we don't need to introduce extra recursion.
* Call convertToLogicOperatorExpr after args check
Change to call convertToLogicOperatorExpr after arguments
check in visitInvokeExpr such that we don't have to check
whether the arguments checked to avoid the double checking
issue.
|
|
* Refactor compiler option representation.
* Fix binary compatibility.
* Add a test for specifying compiler options at link time.
* Fix binary compatibility.
* Fix binary compatibility.
* Fix backward compatibility on matrix layout.
* Fix.
* Fix.
* Fix.
* Fix gfx.
* Fix gfx.
* Fix dynamic dispatch.
* Polish.
|
|
* [SPIRV] Support `globallycoherent` modifier.
* Fix.
* Disable executable cooperative vector tests.
* Update expected failure.
* [SPIRV] Emit varying output index decoration.
* Add test.
* Update tests.
* Fix test.
* Emit `SpvExecutionModeEarlyFragmentTests`.
* Lower `StructuredBuffer<bool>`.
* Support globallycoherent on ByteAddressBuffer.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Fix incorrect behavior of operator%
Fixes #1059.
This change fixes the incorrect translation of "operator%" from
HLSL to SPIRV.
The issue stems from the fact that the behavior of "operator%" in
GLSL differs from that in HLSL. In HLSL it behaves as "remainder"
where as it behaves as "modulus" in GLSL.
We have been using SpvOpFMod for operator% when Slang compiles
from HLSL to SPRIV, which is incorrect. This change switches it
to SpvOpFRem.
The tests are slightly modified to reveal any potential issues.
* Change output type of test/compute/frem
For testing the operator%, we are using "int" as the output type of the
test, "test/compute/frem.slang".
Since the operands are in float type, it is more preferable to have a
float type as the resulting type. This can be done with an option,
"-output-using-type".
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
|
|
* More robust input and output selection in generator tools
* Add cmake build system
* Get slang-test running with cmake
* Bump lz4 and miniz dependencies
* Make cmake build more declarative
* Correct preprocessor logic in slang.h
* Add cuda test to compute/simple
* Remove empty cmake files
* output placement for cmake, and commenting
* Correct include paths in spirv-embed-generator
* Format cmake with gersemi
* Make cmake build clerer
* Neaten header generation
Also work around https://gitlab.kitware.com/cmake/cmake/-/issues/18399
by introducing correct_generated_properties to set the GENERATED flag in
the correct scope
* remove unused files
* use 3.20 to set GENERATOR property properly
* spelling
* more flexible linker arg setting
* replace slang-static with obj collection
* Set rpath and linker path correctly
* neaten generated file generation
* tests working with cmake build
* fix premake5 build
* comment and neaten cmake
* remove unnecessary dependency
* Build aftermath example only when aftermath is enabled
* Add slang-llvm and other dependencies
* Put modules alongside binaries
* Find slang-glslang correctly
* Better option handling
* comments
* add llvm build test
* Better option handling
* cmake wobble
* use UNICODE and _UNICODE
* remove other workflows
* use ccache
* neaten
* limit parallel for llvm build
* use ninja for build
* Windows and Darwin slang-llvm builds
* cache key
* verbose llvm build
* cl on windows
* sccache and cl.exe
* use cl.exe
* Correct package detection
* less verbosity
* Simplify miniz inclusion
* fix build with sccache
* Neaten llvm building
* neaten
* Neaten slang-llvm fetching
* more surgical workarounds
* Add ci action
* Get version from git
* better variable naming
* add missing include
* clean up after premake in cmake
* more docs on cmake build
* ci wobble
* add imgui target
* more selective source
* do not download swiftshader
* Some missing dependencies
* only build llvm on dispatch
* Disable /Zi in CI where sccache is present
* simplify
* set PIC for miniz
* set policies before project
* reengage workaround
* more runs on ci
* Add cmake presets
* Add cpack
* move iterator debug level to preset
* Correct lib flag
* simplify action
* Neaten cmake init
* Add todo
* Add simple test wrapper
* Add tests to workflow presets
* rename packing preset
* Correctly set definitions
* docs
* correct preset names
* Make slang-test depend on test-server/test-process
* neaten
* use workflow in actions
* install docs
* Correct module install dir
* debug dist workflow
* Install headers
* neaten header globbing
* Neaten dependency handling
* make lib and bin variables
* Do not set compiler for vs builds, unnecessary
* docs
* allow setting explicit source for target
* maintain archive subdir
* cmake docs
* install headers
* place targets into folders
* cmake docs
* nest external projects in folder
* remove name clash
* Neater external packages
* meta targets in folder structure
* cleaner slang-glslang dll
* Add missing static directive to slang-no-embedded-stdlib
* more robust module copying
* make slang-test the startup project
* folder tweak
* Make FETCH_BINARY the default on all platforms
* Set DEBUG_DIR
* add natvis files to source
* skip spirv tests
* remove test step from debug dist
* Add build to .gitignore
* redo warnings to be more like premake
* Update imgui
* clean more premake files
* Disable PCH for glslang, gcc throws a warning
* Add /MP for msvc builds
* warning wobble
* Add script to build llvm
* Add slang-llvm and generators components
* Build slang-llvm in ci
* comments
* fetch llvm with git
* better abi approximation for cache
* better sccache key
* formatting
* Correct logic around disabling problematic debug info for ccache
* exclude gcc and clang from windows ci
* Make dist workflows use system llvm
* naming
* restore normal dist builds
* formatting
* run tests in ci
* Correct slang-llvm url setting
* Rely on the system to find the test tool library
* actions matrix wiggle
* cope with OSX ancient bash
* Correct compilers on windows
* more ci debugging
* Correct rpath handling on OSX
* neaten
* correct path to slang-llvm
* Correct rpath separator on osx
* Find slang-llvm correctly
* smoke tests only on osx
* ci wobble
* Give MacOS module a dylib suffix
* get swiftshader correctly
* cope with bsd cp
* remove debug output
* full tests on osx
* ci wobble
* Add some vk tests to expected failures
* simplify ci
* ci wobble
* exclude dx12 tests from github ci
* remove cmake code for building llvm
* warnings
* warnings as errors for cl
* spirv-tools in path
* add aarch64 ci build
* Add SLANG_GENERATORS_PATH option for prebuilt generators
* neaten
* Correct generator target name
* remove yaml anchors because github actions does not support them
* Demote CMake in docs
Also add info on cross compiling
* Restore premake CI
* use minimal ci for cmake
* Write miniz_export for premake build
and .gitignore it
* Mention build config tool options in docs
* Remove redefined macro for miniz
* regenerate vs project
|
|
* Support visibility control and default to `internal`.
* Fix wip.
* Fixes.
* Fix.
* Fix test.
* Add legacy language detection and compatibility for existing code.
* Add doc.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Unify Texture types in stdlib into 1 generic type.
* Fixes.
* Fix.
* Fixes.
* Fix reflection.
* Fix binding reflection.
* Add gather intrinsics.
* Fix gather intrinsics.
* Fix texture type toText.
* Fix intrinsic.
* fix cuda intrinsic.
* Fix project files.
* cleanup.
* Fix.
* Fix.
* Fix sampler feedback test.
* Fix getDimension intrinsics.
* Fix spirv sample image intrinsics.
* Fix test.
* Fix GLSL intrinsic.
* Cleanup.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Various SPIRV fixes.
- Geometry shader support (WIP).
- Fix texture get dimension and load.
- Fold global GetElement(MakeArray/MakeVector) insts.
- Call spvopt to inline all functions.
- Translate OpImageSubscript.
- Emit struct member names and global variable names.
- Fix lowering of OpBitNot -> OpNot, instead of OpBitReverse.
* Fix test.
* Fix geometry shader.
* Fix geometry shader emit.
* Add atomic Image access test.
* Fix tests.
* don't fail if spirv-opt fails.
* Update comments.
* Fix test.
* Cleanups.
* indentation
---------
Co-authored-by: Yong He <yhe@nvidia.com>
Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
|
|
exporting type information (#3209)
* Initial: add a DiffTensor impl
* Auto-binding and diff tensor implementations now work
* Refactored diff-tensor implementation + added py-export for struct types
* Cleanup
* Update slang-ir-pytorch-cpp-binding.cpp
* Updated test names
* Update autodiff-data-flow.slang.expected
* Add more versions of load/store & default generic args for DiffTensorView.
* Add diagnostic for default generic arg and more tests
* Add more `[AutoPyBind]` tests
|
|
* Add __truncate and __sampledType for spirv_asm
Allows some texture tests to start passing
* add __isVector
Currently unused
* Add 1-vector legalization pass (WIP)
* Add capabilities for image types
* neaten instruction dumping
* add 1-vector test
* Add a couple of cases to vec1 legalization
* Remove texture tests from expected failures
* comment
* regenerate vs projects
* Remove redundant define form synchapi emulation
* refactoring image methods
* All sample functions refactored
* Remove incorrect glsl intrinsics
Partially addresses https://github.com/shader-slang/slang/issues/3174
* __subscript image ops via writing funcs
* Extract texture struct writing from core.meta.slang
* Abstract out cuda intrinsic
* Remvoe erroneous call to opDecorateIndex
* spirv asm IR utils
* Correct position of loads for SPIR-V asm inst operands
* Raise constructors to global scope during spir-v legalization
* Correct snippet output
* Implement most texture sampling ops for SPIR-V
* Legalize 1-vectors for glsl too
* Make SPIR-V inst operands non-hoistable
* Better 1-vector legalization
* Put textures in ptrs for spirv
* insert missing break
* Add vec1 legalization test
* Add some missing pieces to slang-ir-insts
* Greatly neaten vec1 legalization
* a
* Neaten vec1 legalization
* Add image read and write intrinsics for spir-v
* Squash warnings
* regenerate vs projects
* Drop redundant guards
* Drop 5 tests from expected failure list
* Inst numbering changes to cross compile tests
* vec1 legalization tests only on vk
* Correct location of asm op emit
* Inline constant in spirv-asm
* Correct signedness for lane in wave intrinsics
* Extract element from float1 for cuda
* squash warnings
* Neaten spirv-emit
* dedupe more capabilities
* warnings
* neaten assert
* comments
* comments
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Compile append and consume structured buffers to glsl.
* Fix.
* Update CI config.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Add test for generic param inference bug for nested generics
* Change description & simplify test
* Add expected file
* Check parent decl before unifying type parameters
|
|
* Add type layout for structured buffer
* Default to generating spirv directly
* vk test for compute simple
* Add spirv-dis as a downstream compiler
* Emit Array types in SPIR-V
* makevector for spirv
* Dump whole spirv module on validation failure
* register array types
todo, use emitTypeInst
* Neater formatting for unhandled inst printing
* break out emitCompositeConstruct
* Correct array type generation
* neaten
* Allow getElement for vector
* Remove unused
* Allow predicating target intrinsics on types
* Consider functions with intrinsics to have definitions
We need to specialize these if they are predicated on types
* Correct array type generation
* makeArray for spir-v
* replace getElement with getElementPtr for spirv
* Correct translation of field access for spirv
* Push layouts to types for spirv
* Spirv intrinsics * operator now makes a pointer
* Add structured buffer of struct test
* Preserve type layout in spirv structured buffer legalization
* neaten
* makeVectorFromScalar for SPIRV
* placeholder for layouts on param groups
* More type safe spirv op construction
* Know that constants and types only go in one section
* Remove emitTypeInst
* Add todo for spirv sampling
* Add links to spirv documentation on emit functions
* OpTypeImage support for SPIR-V
* Add simpler texture test for spirv
* s/spirv_direct/spirv/g
* Allow several string literals in target_intrinsic
* Handle global params without a var layour for SPIR-V
For example groupshared vars
* uint spirv asm type
* Add todo for isDefinition
It is currently too broad
* Some atomic op spirv intrinsics
* Strip ConstantBuffer wrappers for spirv
* Add todo for matrix annotations
* Do not associate decorations insts with spirv counterparts
* Correct entry point parameter generation
* Spelling
* Assert that fieldAddress is returning a pointer
* Add error for existential type layout getting to spir-v emit
* Add IRTupleTypeLayout
Unused so far
* Allow getElementPtr to work with vectors
* Correct target name in test
* Hide default spirv direct behind a premake option --default-spirv-direct=true
* Do not insert space at start of intrinsic def
* Correct asm rendering in tests
* remove redundant option
* Emit directly from direct test
* Add source language options for spirv-dis
* Add comments to spirv dis
* Add dead debug print for before spirv module
* Correct asm rendering in tests
* s/spirv_direct/spirv/g
* Only specialize intrinsic functions with predicates
* regenerate vs projects
* squash warnings
* squash warnings
* remove duplication
* Silence warnings from msvc
* squash warnings
* Overload for zero sized array
* More msvc warnings
* warnings
* Add spirv-tools to path for tests
* Do not be specific about dxc version for diag test
* Normalize line endings from spirv-dis
* Correct filecheck matches
* Temporarily disable two spirv tests
Failing on CI, undebuggable hang :/
* Do not emit storage class more than once for spirv snippet
* Do not pass spir-v to spirv-dis by stdin
* Do not get spirv-dis output via stream, use file
* normalize file endings in spirv-dis output
|
|
* Support per field matrix layout
* Fix warnings.
* Fix.
* Fix tests.
* Fix spiv gen.
* Fix.
* More test fixes.
* Fix.
* Run only GPU tests on self-hosted servers.
* Remove -use-glsl-matrix-layout-modifier.
* Fix.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Use scratchData on `IRInst` to replace HashSets.
* Update test results.
* Initialize scratchData.
* Update autodiff documentation.
* Use enum instead of bool.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Initial sizeof implementation.
* Small macro improvement.
* Fix some typos.
* Refactor NaturalSize.
Add more sizeof tests.
* Use _makeParseExpr to add sizeof support.
* Add size-of.slang diagnostic result.
* Fix typo in folding with macro change.
* Add a sizeof test of This.
* Some more NaturalSize coverage.
* Simple alignof support.
* Testing for alignof.
* Added 8 bit enum to check enums values are correctly sized.
* Add alignof to completion.
* Lower sizeof/alignof to IR.
sizeof/alignof IR pass.
Tests for simple generic scenarios.
* Make append handle invalid properly.
Improve comments.
---------
Co-authored-by: Theresa Foley <10618364+tangent-vector@users.noreply.github.com>
|
|
* Be lenient on same-size unsigend->signed conversion.
* Fix tests.
* Use 250.
* wip
* Fix.
* Fix tests.
* Fix.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Remove unused COM annotation
* Move SLANG_ENABLE_DXBC_SUPPORT to slang.h
* Add DX11 simple compute test
* Remove unnecessary COM parameter annotation
* Run compute smoke test for DX12
* Ignore d3d11 tests when we do not have fxc
* Do not try to find NVAPI on Linux
* Add some logs to .gitignore
* Minor cleanups in d3d12 headers
* Fix tautological comparison (due to integer overflow)
* Limit OutputDebugStringA to Windows
|
|
* Add warning for returning without initializing out parameter
* Add unused prelude function to squash uninitialized out variable warnings
|
|
* Correct case of windows.h includes
* Use Slang::SharedLibrary to load directx dlls
* s/max/std::max/
* Factor common OS code in calcHasApi
* Add DXIL test for compute/simple
* s/false/FALSE for calls to WinAPI functions
* Factor common OS code in gfxGetAdapters
* 2 missing headers
d3d12sdklayers for ID3DDebug
climits for UINT_MAX
* Define out unused function on Linux
* Only try to load Vulkan and CUDA on Windows or Linux
* simplify D3DUtil::getDxgiModule
* Remove WIN32_LEAN_AND_MEAN &co from source files
Add a global define
* Set WIN32_LEAN_AND_MEAN &friends in headers
Restore previous state also
* regenerate vs projects
|
|
* Add missing expected.txt for test
* Diagnostics -> StdWriters in render test
* Allow specifying several test prefixes to run
`slang-test -- tests/foo tests/bar`
* Squash warnings in some tests
* Enable gfx debug layer in gfx test util
Makes this issue present consistently: https://github.com/shader-slang/slang/issues/2766
* Allow DebugDevice to return interfaces instantiated by the debugged object
* Check that we actaully have a shader cache for shader cache tests
* Implement FileCheck tests for several test commands
- SIMPLE, SIMPLE_EX
- SIMPLE_LINE
- REFLECTION, CPU_REFLECTION
- CROSS_COMPILE
It does not currently support the render tests or the COMPARE_COMPUTE commands
It is invoked by adding `(filecheck=MY_FILECHECK_PREFIX)` to the test command, for example
TEST:CROSS_COMPILE(filecheck=SPIRV): -target spirv-assembly
* Move LLVM FileCheck interface to slang-llvm
* Neaten slang-test tests
* Refine handling of expected output in slang-test
* Add example FileCheck buffer test
* Add cuda-kernel-export tests
Which were waiting on FileCheck
* Bump vs project files
* Make createLLVMFileCheck_V1 return a void* rather than specifically an IFileCheck
* Remove use of CharSlice from filecheck interface
* Bump slang-llvm version
---------
Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Add missing expected.txt for test
* Diagnostics -> StdWriters in render test
* Allow specifying several test prefixes to run
`slang-test -- tests/foo tests/bar`
* Squash warnings in some tests
* Enable gfx debug layer in gfx test util
Makes this issue present consistently: https://github.com/shader-slang/slang/issues/2766
* Allow DebugDevice to return interfaces instantiated by the debugged object
* Check that we actaully have a shader cache for shader cache tests
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
* Overhaul global inst deduplication and cpp/cuda backend.
* Update IR documentation.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Test for disabling warnings.
* Output diagnostic if argument parsing fails in render test.
* More improvements around disabling diagnostics.
* Add support for re enabling a warning.
* Add warning controls to help text.
* Tidy up around NameConventionUtil.
* Make NameConvention an enum.
* Handle leading underscores.
* Update comment, and remove intial handling of _ prefix.
|
|
* Warning on bool to float conversion.
* Fix test cases.
* Improve.
* LanguageServer: don't show constant value for non constant variables.
* Fix tests.
* Fix warnings in tests.
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Expose internals of dce and use it to implement call graph walk.
* Specialize calls in generic functions.
* Fix clang error.
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Use TerminatedUnownedStringSlice for literals in output C++.
* Remove Escape/Unescape functions used in slang-token-reader.cpp
Add target type of 'host-cpp' etc to map to the target types.
* Fix some corner cases around string encoding.
* Added unit test for string escaping.
Fixed some assorted escaping bugs.
* Updated test output.
* Added decode test.
* Stop using hex output, to get around 'greedy' aspect. Use octal instead.
* Added HostHostCallable
Small changes to use ArtifactDesc/Info instead of large switches.
* Fix C++ emit to handle arbitrary function export.
* Add options handling for callable without an output being specified.
* Can compile with COM interface. Added example using com interface.
* Use the IR Ptr type instead of hack in C++ emit for interfaces.
* Fix issue with outputting the COM call when ptr is used.
* Fix crash issue on compilation failure.
* Add support for __global.
* Added `ActualGlobalRate`
Added special handling around globals and COM interfaces.
Tested out in cpu-com-example.
* Fix typo in NodeBase.
* Support for accessing globals by name working.
* Bounds checking for C++
Improved bounds checks for CUDA.
* Check that actual global initialization is working.
* Fix typo.
* Refactor the com replacement such that it doesn't need a cache or do anything special with GlobalVar.
* Fix typo in CUDA prelude.
* Remove context.
Only create replacement if needed.
* Split out COM host-callable into a unit-test.
* host-callable com testing on C++and llvm.
* Comment around the COM ptr replacement.
* WIP Zero bound test.
* Disable com test on vs 32 bit.
Fix C++ prelude
* Disable 32 bit targets testing com host-callable.
* For now disable zero index test.
* Enable bounds checking for CPU/CUDA.
* Small fixes.
Disable CUDA zero index bound fix.
* Add test result for bound check.
* Work around for index wrapping issue.
* Added Fixed array test.
* Only enable prelude asserts via SLANG_PRELUDE_ENABLE_ASSERT (unless defined by the user)
|
|
* Use IR pass to eliminate phi nodes
"Phi nodes" are one of the key contrivances that makes SSA (Static
Single Assignment) form work. Because SSA is so great for compiler
IRs, we kind of need to deal with phi nodes, but they also get in
the way because they don't have a direct analog in most lower-level
machine ISAs or execution models, nor in most of the high-level
languages a transpiler wants to emit. As a result a compiler like
ours needs to be able to eliminate the phi nodes from a program as
part of generating output code.
(For any clever people noting that SPIR-V supports phi nodes
directly: yes, it does. It doesn't need to and it probably *shouldn't*.
Anybody involved in the decision-making knows my reasoning, and
anybody else should feel free to ask me if they want the lecture.
Anyway...)
The basic idea of elimiating phi nodes is simple enough. We replace
each phi node with a temporary variable. Uses of the phi use values
loaded from the temporary. The operation of the phi itself
(assigning a value based on the branch taken) amounts to an assignment
into the temporary.
Previously, the Slang compiler dealt with phi nodes very late in
the process of generating code: in the middle of emitting strings
of source code in a high-level language like HLSL or GLSL. Doing the
work that late in compilation has two big drawbacks:
1. Our ability to emit clean and/or optimal code is limited because
we may not be able to make certain changes to the IR, or because we
cannot make use of additional information like a dominator tree that
might be available at other points in compilation.
2. Any other IR passes that relate to temporary variables won't be
able to see the variables that we generate for phi nodes. This could
raise issues with correctness (e.g., if we want to compute live-range
information for *all* temporary variables), or performance (we have no
way to run additional IR optimization passes after phis are eliminated).
This change addresses these problems by making the elimination of
phi nodes an explicit IR pass. Additional optimizations can easily be
run after this pass (although we'd need to be careful not to run
passes that could end up introducing new phis). The pass makes use
of the information available to it to try to produce code that will
emit to "clean" HLSL/GLSL.
The core of the pass is in `slang-ir-eliminate-phis.cpp`, and is
heavily commented, so I won't describe the approach in detail here.
There are two related issues that came up, though:
First, it turned out that our emit logic for local variables (`IRVar`
instructions) wasn't using the function we'd defined named `emitVar()`.
One worrying consequence of that oversight was that the `precise`
modifier would impact generated HLSL/GLSL for variables that turned
into SSA values (including phi nodes), but *not* for local variables
that had not been SSA'd (or that had been SSA'd and then de-SSA'd).
This change also fixes that bug; it is unclear how widespread the
impact of the original issue might be.
Second, generating explicit IR temporaries for phi nodes exposed a
pre-existing bug in the `slang-ir-restructure-scoping` pass. That pass
basically detects cases where we have an instruction `I` with a use
`U` such that the use follows the rules of SSA form ("def dominates
use," meaning `I` dominations `U`), but does not follow the more
restrictive scoping rules of high-level-language output (where a value
computed "inside" a loop is not automatically visible to code outside
the loop just because it dominates that code). That pass did not
correctly account for the case where `I` was a temporary variable.
It seems that case could not arise before now because we didn't have
any passes that would move `var`, `load`, or `store` operations out
of the basic block they started in. The fix for that pass was relatively
simple, and will make the whole thing more robust in case we add more
aggressive optimizations later.
* fixup: expected test output
|
|
Read/write resource types (what D3D/HLSL often refer to as UAVs) can be broadly categorized based on whether they require an underlying format (e.g., a `DXGI_FORMAT`) for reads, or not. D3D refers to the ones that require a format as "typed" UAVs (even though a `RWStructuredBuffer<MyData>` is clearly "typed" at the HLSL level). Vulkan refers to these cases as "storage images" and "storage texel buffers."
Under the D3D model, an application does not have to specify the exact format for a formatted/"typed" UAV in order for loads to work, but it *does* need to specify if an HLSL resource with a declared `float` or vector-of-`float` element type will be backed by data with a `*_UNORM` or `*_SNORM` format. This is where the `unorm` and `snorm` type modifiers come in.
Superficially, it might seem that adding this feature to the Slang compiler is "just" a matter of adding the two modifiers, which is easily done with a pair of one-line `syntax` declarations in `core.meta.slang` plus the corresponding AST node types.
Unfortunately the superficial view misses the detail that, to date, Slang has not had any support for *type modifiers* at all, and has only supported *declaration modifiers*. The distinction has so far not mattered, even with modifiers like `const` because, e.g., the difference between a "`const` array of `float`" and an "array of `const float`" doesn't really matter.
So, adding these two modifiers required introducing a lot of infrastructure along the way. Let's walk through what needed to happen:
* As described above, the actual `syntax` was added easily in the Slang stdlib
* I added a new subclass of `Modifier` for `TypeModifier`s in the AST, and added the AST nodes for `unorm` and `snorm` as subclasses of that.
* In order to syntactically support modifiers applied to types (e.g., `unorm float`), I needed to add a `ModifiedTypeExpr` subclass of `Expr` that represents a base type expression with one or more modifiers applied
* The parser needed some subtle new logic. There are two main cases where type modifiers will come up:
1. In contexts where we might be parsing a declaration (e.g., `const unorm float a`), we need to support a list of modifiers that might freely mix type modifiers and "declaration modifiers" which are not intended to apply to types. In this case we need to split the lis tof modifiers into the type-related ones and the declaration-related ones, and attach each subset to the appropriate place. This is very important for features like C-style pointers, where in `static const float* a;`, the `static` modifier applies to the entire declaration of `a`, but the `const` modifier *only* applies to the `float` type specifier, and *not* to the outer pointer type (the actual type of `a`).
2. In contexts where we are not parsing a declaration (e.g., a generic type argument), we need to support a list of modifiers and appy them *all* to the type specifier being parsed, even if some of them might not be appropriate.
* While working in the parser I implemented a certain amount of unrelated cleanup for code that was using raw `Modifier*`s to represent lists of modifiers, instead of the purpose-built `Modifiers` type.
* The `_parseGenericArg` case needed specific work, because it is an important case in the grammar where we need to parse *either* a type expression or a value exprssion, but cannot easily predict which we will see. The fix implemented for now is to always try to parse modifiers and, if we see any, to assume we are in the type case. Because of the rules for how modifiers in a C-like language inhere to the type specifier (and not necessarily the entire type), we need to refactor some of the type expression parsing routines to support parsing a "suffix" of a type expression.
* Note: I decided to be conservative and only make these changes in `_parseGenericArg` because that is place that is *needed* in order for user code with `unorm`/`snorm` to work, but in practice a user could still confuse our parser by using type modifiers as part of a cast (e.g., `x = (unorm float)y;`). While there is currently no reason why a user should want to do this, it *does* suggest that we need to be prepared to see type modifiers in other ambiguous "expression or type?" contexts. We have so far preferred to avoid looking up built-in syntax declarations like modifiers in expression contexts, because we want to allow users to create variable names that might conflict with some of the more surprising modifier keywords in HLSL (e.g., both `triangle` and `sample` are modifier keyword). A nuanced strategy may be required when we get around to closing this gap (which will be needed around when we want full pointer support, since a cast like `(const SomeType*)somePtr` is pretty common).
* In semantic checking, we now need a `visitModifiedTypeExpr`, which visits the base expression to produce a `Type` and then checks each of the `Modifier`s attached to it. During this process we need to translate the AST-level `Modifier`s into something that can exist properly in the universe of `Type`s. We introduce a `ModifiedType` subclass of `Type`, distinct from the `ModifiedTypeExpr` subclass of `Expr`. Furthermore, we introduce a `ModifierVal` subclass of `Val`, distinct from `Modifier`/`TypeModifier`.
* One unfortunate thing here is that it means we have both, e.g., `UNormModifier` to represent the parsed syntax, and `UNormModifierVal` to represent the `Type`/`Val`-level representation of the same concept. It is quite likely that we are near the point where we can/should consider having two distinct AST representations: one for freshly-parsed ASTs and one for semantically-checked ASTs. The `Type`/`Val` hierarchy clearly belongs to the latter.
* No actual semantic checking is currently being applied to the `unorm` and `snorm` modifiers, although we should in principle check that they are only being applied to `float` and vector-of-`float` types.
* In an attempt to simplify some of the creation logic and build a tiny bit of reusable infrastructure, I went ahead and added the skeleton of a dedupe-caching system in `ASTBuilder` so that we can easily ensure only a single `UNormModifierVal` and a single `SNormModifierVal` ever get created inside the scope of a single builder.
* TODO: Thinking about this, I'm now worried the deduplication does not mean I can make the simplifications I currently do in semantic checking by assuming that any two `UNormModifierVal`s will be pointer-identical. This is because we do not currently (IIRC) have the required "bottleneck" in the compiler where all ASTs get serialized after initial checking, and then deserialized when `import`ed into a downstream module, so that every AST node during a checking step comes from a single `ASTBuilder`. Hmm...
* If we can rely on deduplication to do its thing, then the `Val` and `Type` implementations of modifiers can be relatively simple.
* TODO: One issue here is that the equality comparison for `ModifiedType` currently checks for the same base type and the same modifiers in the same order. This works for now when we only have a small number of type modifiers and any given type will hae at most one, but in the longer run it relies on us to implement some kind of canonicalization scheme, which would both ensure that between `Modified(T, {A, B})` and `Modified(T, {B, A})` only one is allowed (that is, a canonical ordering on modifiers), and that we do not allow `Modified(Modified(T, {A}), {B})`.
* TODO: One other issues is that the `ModifiedType` case does not currently interact correctly with the `as()`-based casting for types (whereas that operation *does* interact in a semantically-correct fashion with `typedef`s). Fixing this issue in a robust way really depends on us re-architecting the `Type` system so that *any* `Type` can have modifiers attached, with modifiers affecting type identity/deduplication.
* The key place where `ModifiedType` creates a complication in semantic checking is type conversion/coercion. A user is likely to declare a `RWTexture2D<unorm float>`, fetch from it (producing a value of type `unorm float`) and then assign the result to a `float` variable, prompting for a conversion from `unorm float` to `float` (because they are distinct `Type`s).
* We handle this case in the core `_coerce()` operation by checking if either `toType` or `fromType` is a `ModifiedType`. If *either* one is a modified type, we apply logic to check for modifiers that are present on one and not the other. Basically we check which modifiers need to be "dropped" and which need to be "added" during conversion, and validate that these modifiers *can* be dropped/added without creating a semantic error. The only type modifiers we support right now *can* be dropped/added like this, so we are fine.
* TODO: When we add more complete pointer support, we could need logic here to validate when casts between, e.g., `const int*` and `int*` should/shouldn't be allowed.
* Note: Even opening the door to type modifiers at all creates the same kind of challenges for user-defined generic types (and functions!) since `MyType<int>` and `MyType<const int>` are distinct instantiations in a future where we support `const` as a type modifier. We *may* need to plan to restrict where modified types can be used, so that certain built-in generic types support modified types as arguments, but user-defined types don't (or at least might need to opt-in to get support).
* The result of a `_coerce()` that drops/adds modifiers is a `ModifierCastExpr`, which is a kind of no-op AST node that merely expresses that the conversion is allowed and valid.
* In IR lowering we currently do the simple thing and translate a `ModifiedType` to a distinct IR node called `AttributedType`.
* The change in terminology from "modifier" to "attribute" is to follow the way that these kinds of modifiers best map to the `IRAttr` case in the IR (rather than the `IRDecoration` case). We probably ought to do a careful terminology scrub here, because having this terminology mismatch between IR and AST could be a source of confusion.
* TODO: In principle, using `IRAttributedType` creates the same basic problems as using `ModifiedType`: code that is usin `as()` or similar operations to check for a specific subclass of `IRType` may not see the case they were looking for due to use of `IRAttributedType`.
* Initially I had hoped to avoid the problem by having the `IRAttr`s be attached directly as operands to an otherwise-ordinary `IRType`. E.g., a lowered `unorm float4` would be an `IRVectorType` with an "extra" operand that is an `IRUNormAttr`, something like: `Vector<Float, 4, UNorm>`. This sounds great (and looks great!), but runs into the problem that it is incompatible with the way we currently represent things like generic type parameters. A generic type parameter `T` is represented as an `IRParam`, and it does *not* make sense to have an additional `IRParam` to represent `const T` or `unorm T`, etc.
* The Right Way to solve this stuff at both the AST and IR levels is to avoid passing around bare `Type*` or `IRType*` in general, and instead use a value type that implements the needed policy more directly: something like a `TypeHolder` or `IRTypeHolder` (placeholder name). The `*Holder` type would abstract over the various "wrapper" nodes required to store all the additional data like attributes but, importantly, would *not* allow that extra information to be dropped or lost during operations like casting (e.g., note how the current `Type` implementation of `as()` loses information on `typedef` names, making our error messages slightly worse). This is actually quite similar to how we currently use the `DeclRef<T>` system to allow working with what is *usually* a `T*` under the hood, but in a way that ensures we don't lose track of any generic substitution information.
* During C-like code emit we have a process that turns an `IRType` into a chain of declarators as needed to emit a C-like declaration with pointers, arrays, etc. The `IRAttributedType` case needs to get folded into this logic. Basically, when we see an `IRAttributedType` we immediately emit any modifiers that are required to be in a prefix position, then recursively emit the underlying type with an extra layer of declarator that tracks the modifiers, so that we can emit any modifiers that should be placed in a postfix position *after* the type. As a specific example, our C/C++ back-end would want to use the postifx option to handle `const`, because then it can properly emit stuff like `int const * const *` and not the incorrect `const const int**`.
* The HLSL emit logic overrides the prefix case for handling type attributes, and uses it to emit `unorm` and `snorm` where they occur.
* One unfortunate detail is that (apparently) some downstream HLSL compilers do not allow the `unorm`/`snorm` modifiers to apply to `vector<float, *>` types, even though that should be semantically valid. Instead, they only support `float`, `float2`, `float3`, and `float4` explicitly. To work around this issue, we go ahead and change our HLSL emit logic so that when we encountered 1-to-4 component vectors of `float`, `int`, or `uint` we emit the type name using the typical HLSL shorthand. This is actually a signficicant change in our HLSL output, but it both seemed like a good fix to have anyway, and was also the only obvious way to address the downstream parser shortcomings without a massive kludge.
* As a result of this change the `half-texture.slang` test broke, since it was using raw HLSL as the expected output. I changed the test to do a DXIL comparison instead, which is our preferred way of testing cross-compilation behavior (since it is more robust in the face of small changes to our source output).
|
|
First, we have a CUDA-only test that simply needed a format name to be changed to match the new conventions in `gfx`.
Second, we have one of the "active mask" tests that seems to produce different results locally for developers (under Vulkan) than it does on CI. This is almost certainly down to differences in GPUs and/or drivers. The inconsistency ultimately proves the point that I was trying to make when I wrote those tests - the "active mask" concept is effectively meaningless as exposed in D3D and Vulkan because it has not been specified in a way that allows programmers to reason about its value, and drivers have implemented wildly different interpretations of its supposed semantics for so long that there is no real hope of turning `WaveGetActiveMask()` into something that returns a well-defined value in any but the most trivial cases.
TLDR: I disabled that test for Vulkan, which means it is completely disabled.
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Support for test proxy.
* Turn on testing using proxy.
* Don't pass sink into check of downstream compiler.
* Small change to kick off build.
* Remove register specification on transcendental.
* Increase poll timeout.
Small improvements to proxy.
* Disable gfx unit tests.
* Put test runner in shared library mode by default.
* Change comment. Kick off another CI test.
* Small edit to kick off builds.
* Run unit tests on proxy.
* Turn on using proxy for now.
* Enable swift shader.
* Fix typo.
Add exception support.
* Make the default spwan type SharedLibrary
Use isolation for gfx unit tests.
* Update slang-binaries.
* Fix typo.
* Report unit test output information.
|