| Commit message (Collapse) | Author | Age |
* Remove premake lua
* Remove premake generated vs project
* remove deps file
* Remove premake driving bat files
* Full test matrix under CMake
* Remove premake based ci workflows
* Wiggle CI
* remove cmake from ci name
* find frameworks correctly on osx
* remove cmake from ci name
* Cope with sccache not being available
* cmake based falcor tests
* ci wobble
* only install ninja if necessary
* more appropriate cache name
* Remove premake from build instructions
* Add some docs on ci setup
* remove premake from regression tests
* remove premake from perf test
* Set SLANGC_PATH
* ci wobble
* bump slang-binaries
* ci wobble
* Bump spirv tools
* dont use timestamp in cache
* remote debug code
* cache key wobble
* Install sccache after building llvm
* Do not build llvm tools
* ci wobble
* ci wobble
* ci wobble
* ci wobble
* ci wobble
* Tests spirv via glsl in ci
* Define SLANG_ENABLE_XLIB=1
* osx builds on aarch64
* ci wobble
* ci wobble
* ci wobble
* ci wobble
* ci wobble
* ci wobble
* package documentation and metadata with cmake
* ci wobble
* Split hlsl double intrinsic tests
* ci wobble
* Correct type for double log10
Fixes https://github.com/shader-slang/slang/issues/4549
* remove working test from expected failures
* add broken test to expected failures
* smaller build for falcor tests
* ci wobble
* A few exclusions in ci
* wip, release script
* Enable examples in ci
* neaten release script
* Correct building docs
* Only use xlib vulkan when slang_enable_xlib is true
* bump slang-llvm version
* Remove toolchain file use
* Bump slang-llvm preset version
* slash direction
* Improve build directions
* Add msvc cross build documentation
* Disable old release files
* Smaller set of releases for test
* Allow not building llvm
* simplify release matrix
* Cross releases
* formatting
* formatting
* ci wiggle
* ci wiggle
* cleaner
* neaten
* ci wobble
* formatting
* Install cross tools on linux
* do not clean build dir
* neaten ci
* neaten ci
* neaten ci
* remove unused release workflow files
* Build llvm on some platforms
* neaten ci
* notarize on osx
* s/x64/x86_64
* ci wobble
* Embed stdlib for release build
* wobble ci
* wobble ci
* s/x64/x86_64
* ci wobble
* ci wobble
* ci wobble
* vk-gl-cts on cmake
* neaten ci
* neaten ci
* bump cache action version
* Cope with windows being weird about case
* old glibc version
* old glibc version
* Correct action file
* Keep cache hot on main branch
* separate small script for old glibc releases
* ci wobble
* ci wobble
* Run cmake outside of docker
* only sign on releases
* Revert "Run cmake outside of docker"
This reverts commit a58aaba939a4aa35fe70962fd60d9512b143592f.
* python3 on build image
* less parallel
* ci wobble
* ci wobble
* ci wobble
* newer git
* ci wobble
* ci wobble
* Use newer docker image
* Use newer docker image
* sccache wobble
* permissions issue
* neaten
* build llvm in ci
* build llvm in ci
* Remove linux clang build in ci
* Only install crossbuild tools on non-aarch64 systems
* neaten ci.yml
* Correct github matrix
* Simplify github matrix
* ci wobble
* Disable broken test
See https://github.com/shader-slang/slang/issues/4589
* ci wobble
* Neater slang-llvm archive filename
* Neater path for uploading artifacts
* Neater ci names
* Use Windows SDK 10.0.19041.0 in cmake builds
|
* Capability def parsing & codegen + disjoint sets
This change adds a capability definition file, and a code generator
to produce C++ code that defines the capability enums and necessary
data structures around the capabilities.
Extends the existing CapabilitySet class to support expressing
disjoint sets of capabilities. This sets up for the next change
that will enhance our type checking with reasoning of capability
requirements.
* Fix cmake.
* Fix warning.
* Fix.
* Fix isBetterForTarget to prefer less specialized option.
* Fix.
* Fix premake.
* Fix intrinsic.
* Fix vs sln file.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
* Add a tool to dump/replay compute pipeline creation from gfx.
* Fix x86 build.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
* Add -spirv-core-grammar option to load alternate spirv defs
Also embed a version to use by default
* Use perfect hash for spv op lookup
* Neaten perfect hash embedding
* Refactor spirv grammar lookup in preparation for more kinds of lookups
* Load spirv capability list from spec
* Add all SPIR-V enums to lookup table
* regenerate vs projects
* appease msvc
* Use string slices for spir-v core grammar lookups
* wiggle
* comment
* Add OpInfo for spv ops
* regenerate vs projects
* Embed op names
* Add min/max operand counts and enum categories to spirv info
* neaten
* Operand kinds for spirv ops
* Store and embed all information relating to spirv enums and qualifiers
* Use SPIR-V spec to position instructions in spirv_asm blocks
* Neaten spir-v info embedding
* Neaten perfect hash embedding
* Add assignment syntax to spirv_asm snippets
* Better errors for spirv_asm parser
* Add warning for too many operands in spirv asm
* squash warnings
* neaten
* test wiggle
* Lookup enums for spirv
* Put OpCapability and OpExtension in the correct place for spirv_asm blocks
* Tests for OpCapability and OpExtension
* ci wiggle
* Add expected failure
* Allow raising immediate values to constant ids where necessary in spirv_asm blocks
* Allow bitwise or expressions and numeric literals in spirv_asm blocks
* test numeric literals
* Fix memory issues.
* fix.
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
* Add quad texture example.
* delete output image
* remove irrelevant files
* update project files
* fix
* Update example.
* Fix.
* remove out-texture
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
* s/emititng blobal/emitting global
* Use SPIR-V opcode names rather than numbers
* regenerate Visual Studio project files
* Use names for extended SPIR-V GLSL instructions
* Add missing operand for SPIR-V extended instruction
* Add warning against modifying generated hashing files
* Squash warnings on MSVC
|
* Fix build script for macos aarch64.
* fix1
* Fix2.
* update vs files
Co-authored-by: Yong He <yhe@nvidia.com>
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Use TerminatedUnownedStringSlice for literals in output C++.
* Remove Escape/Unescape functions used in slang-token-reader.cpp
Add target type of 'host-cpp' etc to map to the target types.
* Fix some corner cases around string encoding.
* Added unit test for string escaping.
Fixed some assorted escaping bugs.
* Updated test output.
* Added decode test.
* Stop using hex output, to get around 'greedy' aspect. Use octal instead.
* Added HostHostCallable
Small changes to use ArtifactDesc/Info instead of large switches.
* Fix C++ emit to handle arbitrary function export.
* Add options handling for callable without an output being specified.
* Can compile with COM interface. Added example using com interface.
* Use the IR Ptr type instead of hack in C++ emit for interfaces.
* Fix issue with outputting the COM call when ptr is used.
* Fix crash issue on compilation failure.
|
* Add language server daemon.
* Fix.
Co-authored-by: Yong He <yhe@nvidia.com>
Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>
Co-authored-by: Theresa Foley <10618364+tangent-vector@users.noreply.github.com>
|
* Various fixes to gfx.
* Fix.
* Fixes.
* Fix.
Co-authored-by: Yong He <yhe@nvidia.com>
|
* Fixed naming conflicts in heterogeneous-hello-world
Added 3 new modifiers (`__unmangled`, `__exportDirectly`, `__externLib`)
`__unmangled` causes mangleName() to return the normal name of the decl.
`__exportDirectly` changes parent decl name concatenation behavior to use
"::" instead of "." (for Name Hint) and emits the name hint when it exists,
otherwise it emits the mangled name.
`__externLib` stops Slang from emitting the corresponding struct.
Also made necessary changes to heterogeneous-hello-world so that this new
functionality is shown off.
* Undo unintentional formatting changes
Co-authored-by: Yong He <yonghe@outlook.com>
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Use 'Process' to communicate with a command line tool.
* Remove slang-win-stream
* Tidy up windows ProcessUtil.
* First version of BufferedReadStream.
* Windows working IPC for streams.
* Test proxy count option.
* Split Process/ProcessUtil. Process is platform dependent. ProcessUtil are functions that are platform independent.
* First implementation of Unix Process interface.
* Unix process compiles on cygwin.
* Fix typo in unix process.
* Separate unix pipe stream error of invalid access, from pipe availability.
* Fix in standard line extraction.
* Make fd non blocking.
* Fix issues with Windows Process streams.
* Added UnixPipe.
* Some fixes around UnixPipeStream.
* Make a unix stream closed explicit.
* Hack to debug linux process/stream.
* Revert to old linux pipe handling.
* Pass executable path for unit tests.
Split out CommandLine into own source.
* Small improvements in process/command line.
* Check process behavior with crash.
* Make stderr and stdout unbuffered for crash testing.
* Only turn disable buffering in crash test.
* Disable crash test on CI.
* Fix crash on clang/linux.
* Enable crash test.
Remove _appendBuffer as can use StreamUtil functionality.
* Added initial processing for http headers.
* Small improvements to HttpHeader.
* First pass HTTPPacketConnection working on windows.
* Enable other Process communication tests.
* Update comments.
* WIP JSON RPC.
* Add terminate to Process.
Made JSONRPC a Util.
* Small tidy up around HTTPPacketConnection.
* Improve process termination options.
* WIP for test-server.
* Add diagnostics error handling to test-server.
* Improved JSON support.
Parsing/creating JSON-RPC messages.
* WIP JSONRPC parsing.
* First pass RttiInfo support.
* WIP converting between JSON/native types.
* Project files.
* Split out RttiUtil.
Made RttiInfo construction thread safe.
* WIP RTTI<->JSON.
* Add diagnostics to JSON<->native conversions.
* Make RttiInfo for structs globals. Avoids problem around derived types (like pointers), being able to cause an abort.
* Add pointer support to RTTI.
Fixed some compilation issues on linux.
* Add fixed array support.
* Added Rtti unit test.
* Add rtti unit test.
* Split out quoted/unquoted key handling.
Fix bugs in JSON value/container.
Added JSON native test.
* Make default array allocator use malloc/free.
Remove the new[] handler (doesn't work on Visual Studio).
* Fix for linux warning.
* Remove some test code.
* Fix issues on x86 win.
* Fix warning on aarch64.
* Fix some bugs in JSON parsing/handling.
Make Rtti work copy/dtor/ctor struct types.
* Testing JSON<->native with fixed array.
Make makeArrayView explicit if it's just a single value.
Added array type.
* Fix getting arrayView.
* Improve JSON diagnostic name.
* First pass refactor using Rtti for JSON RPC.
* First pass of test server using RTTI/JSON-RPC.
* Added JSONRPCConnection.
* Fix some naming issues.
* First pass of test-server working.
* Added unit test support for JSON-RPC test server.
* Fix compilation issues on linux around template handling.
* Typo fix.
* Fix a bug around SourceLoc lookup with JSONContainer.
* Set the console type to console for ISlangWriters.
* Small improvements to test-server.
* Small improvements in test-server.
* Small fix.
* Remove test-proxy. Make test-process a process that can be used to unit test 'Process'.
Adding mechanism to control spawning that will create a new process for every test.
* Ability to remove source manager for JSONValue.
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Use 'Process' to communicate with a command line tool.
* Remove slang-win-stream
* Tidy up windows ProcessUtil.
* First version of BufferedReadStream.
* Windows working IPC for streams.
* Test proxy count option.
* Split Process/ProcessUtil. Process is platform dependent. ProcessUtil are functions that are platform independent.
* First implementation of Unix Process interface.
* Unix process compiles on cygwin.
* Fix typo in unix process.
* Separate unix pipe stream error of invalid access, from pipe availability.
* Fix in standard line extraction.
* Make fd non blocking.
* Fix issues with Windows Process streams.
* Added UnixPipe.
* Some fixes around UnixPipeStream.
* Make a unix stream closed explicit.
* Hack to debug linux process/stream.
* Revert to old linux pipe handling.
* Pass executable path for unit tests.
Split out CommandLine into own source.
* Small improvements in process/command line.
* Check process behavior with crash.
* Make stderr and stdout unbuffered for crash testing.
* Only turn disable buffering in crash test.
* Disable crash test on CI.
* Fix crash on clang/linux.
* Enable crash test.
Remove _appendBuffer as can use StreamUtil functionality.
* Added initial processing for http headers.
* Small improvements to HttpHeader.
* First pass HTTPPacketConnection working on windows.
* Enable other Process communication tests.
* Update comments.
* WIP JSON RPC.
* Add terminate to Process.
Made JSONRPC a Util.
* Small tidy up around HTTPPacketConnection.
* Improve process termination options.
* WIP for test-server.
* Add diagnostics error handling to test-server.
* Improved JSON support.
Parsing/creating JSON-RPC messages.
* WIP JSONRPC parsing.
* First pass RttiInfo support.
* WIP converting between JSON/native types.
* Project files.
* Split out RttiUtil.
Made RttiInfo construction thread safe.
* WIP RTTI<->JSON.
* Add diagnostics to JSON<->native conversions.
* Make RttiInfo for structs globals. Avoids problem around derived types (like pointers), being able to cause an abort.
* Add pointer support to RTTI.
Fixed some compilation issues on linux.
* Add fixed array support.
* Added Rtti unit test.
* Add rtti unit test.
* Split out quoted/unquoted key handling.
Fix bugs in JSON value/container.
Added JSON native test.
* Make default array allocator use malloc/free.
Remove the new[] handler (doesn't work on Visual Studio).
* Fix for linux warning.
* Remove some test code.
* Fix issues on x86 win.
* Fix warning on aarch64.
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Support for test proxy.
* Turn on testing using proxy.
* Don't pass sink into check of downstream compiler.
* Small change to kick off build.
* Remove register specification on transcendental.
* Increase poll timeout.
Small improvements to proxy.
* Disable gfx unit tests.
* Put test runner in shared library mode by default.
* Change comment. Kick off another CI test.
* Small edit to kick off builds.
* Run unit tests on proxy.
* Turn on using proxy for now.
* Enable swift shader.
* Fix typo.
Add exception support.
* Make the default spawn type SharedLibrary
Use isolation for gfx unit tests.
* Update slang-binaries.
* Fix typo.
* Report unit test output information.
|
* Diagnostic for no type conformance + bug fix.
* Fixes.
* Fix.
* Include heterogeneous example only with --enable-experimental-projects premake flag
Co-authored-by: Yong He <yhe@nvidia.com>
Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>
|
* Bring heterogeneous-hello-world back up to date.
* Reintroduced heterogeneous-hello-world into the premake
* No longer uses compiled bytecode for entry point, instead a loadModule
call is hardcoded with the slang file name.
* Entry point is, similarly, hardcoded for now.
* Added a bypass to slang-legalize-types for an unneeded GPUForeach check
* Run premake and change to relative path
* Removed experimental and added README
* Add prebuild command to premake for heterogeneous example
* Pass in entry point as parameter (also remove shader bytecode)
* Pass in module name as parameter
* Squashed commit of the following:
commit 5b13b57fe600724344c556fe4309a5d6bb3d39ab
Author: Kai Yao <kyao@nvidia.com>
Date: Thu Oct 7 23:38:50 2021 -0700
Return diagnostics data when encountering module load error by exception (#1966)
commit 112e1515c30fa972ff56f91514b70946153c718c
Author: jsmall-nvidia <jsmall@nvidia.com>
Date: Thu Oct 7 16:12:29 2021 -0400
Disable test crashing CI (#1965)
* #include an absolute path didn't work - because paths were taken to always be relative.
* Disable test that appears to be crashing.
commit da32069a0c1c8c723d7ef45100049a8f0dd5d9c4
Author: Kai Yao <kyao@nvidia.com>
Date: Mon Oct 4 13:58:51 2021 -0700
Modified barrier API to accept multiple resources per call (#1959)
Co-authored-by: Yong He <yonghe@outlook.com>
commit 97bb82ebcdf8f1391b9d93b5a8d7b1dfc4e88e52
Author: jsmall-nvidia <jsmall@nvidia.com>
Date: Mon Oct 4 14:15:51 2021 -0400
Removing exceptions from core/compiler-core (#1953)
* #include an absolute path didn't work - because paths were taken to always be relative.
* Refactor Stream. Working on all tests.
* Split out CharEncode.
* Make method names lower camel.
m_prefix in Writer/Reader
* Tidy up around CharEncode interface.
* Small improvements around encode/decode.
* Better use of types.
* Remove readLine from TextReader.
* Remove exceptions from Stream/Text handling.
* Fix some typos.
* Fix tabbing.
* Fix missing override.
* Remove remaining exception throw/catch via using signal mechanism.
* Remove exceptions that are not used anymore.
* Document the Stream interface.
* Remove index for decoding 'get byte' function.
* Fix CharReader -> ByteReader.
commit b3dfe383c6d31ff3dbd76dcfb32de8d536382f3e
Author: lucy96chen <47800040+lucy96chen@users.noreply.github.com>
Date: Mon Oct 4 09:46:33 2021 -0700
Get native handles for TextureResource and BufferResource (#1960)
* Added getNativeHandle() to TextureResource and BufferResource; Implemented getNativeHandle() in Vulkan and D3D12; Added new unit test files for the aforementioned implementation
* Added missing getNativeHandle() implementations to renderer-shared.cpp and CUDA
* Finished new getNativeHandle() unit tests for ITextureResource and IBufferResource; Modified ICommandQueue and ICommandBuffer unit tests to call QueryInterface to convert to IUnknown then back and compare resulting pointers for equality
* Unit tests updated and pass locally
* Cast m_buffer.m_buffer and m_image to uint64_t
commit 35bca4cc432613af3926da3bed217a6baa9cbd26
Author: lucy96chen <47800040+lucy96chen@users.noreply.github.com>
Date: Fri Oct 1 13:08:25 2021 -0700
Add getNativeHandle() to ICommandQueue and ICommandBuffer (#1952)
* Added support for getting command buffer and command queue handles to ICommandBuffer and ICommandQueue; D3D12Device, VkDevice, and DebugDevice modified to implement this new functionality; immediate-renderer-base.cpp also modified to implement the new functions
* Removed excess boilerplate
* Changed readRef() to get() in D3D12 getNativeHandle() implementation for ICommandBuffer and ICommandQueue
* Added unit tests for new getNativeHandle() implementations, unfinished
* Queue test added; Minor cleanup changes
* getBufferHandleTestImpl() now closes the command buffer before returning
* Added getNativeHandle() implementations to CUDADevice
* Added comment clarifying that the Vulkan check is checking for a null handle, which is defined to be 0
commit 6c6200f547c7387598743b23bb3c8f0d375d9494
Author: Kai Yao <kyao@nvidia.com>
Date: Thu Sep 30 20:25:34 2021 -0700
VK Resource Barrier (#1955)
* Resource barrier API and VK implementation
* Stub implementations
* Handle VK Acceleration Structure flag
* Add a couple more cases to pipeline barrier stages
commit 627fc976bac5c2381dbace9c7925cb6a68b8de12
Author: Yong He <yonghe@outlook.com>
Date: Thu Sep 30 19:48:47 2021 -0700
Fix aarch64 build on github (#1957)
Co-authored-by: Yong He <yhe@nvidia.com>
commit 122d701513e116856bd59c999221ce36a373d7db
Author: Yong He <yonghe@outlook.com>
Date: Thu Sep 30 17:51:56 2021 -0700
Fix GitHub release (#1956)
* Fix aarch64 release build config.
* Fix for WinAarch64 build.
* Update premake for embed-std-lib build on aarch64.
* `platform` fix for aarch64 build.
* Try revert back to use absolute output path for slang-stdlib-generated.h
* Fix
* fix
Co-authored-by: Yong He <yhe@nvidia.com>
commit aa8f7b899b7b562b3d3c6e25c3da41569505e70c
Author: Chad Engler <englercj@live.com>
Date: Wed Sep 29 13:02:47 2021 -0700
Fix ARM64 detection for MSVC (#1951)
commit 6736b0c1c5fa3e89bc561eb7965a1a0d17af3466
Author: Yong He <yonghe@outlook.com>
Date: Wed Sep 29 11:29:46 2021 -0700
Add ISession::loadModuleFromSource. (#1950)
Co-authored-by: Yong He <yhe@nvidia.com>
commit d8e452412e14a6a8ba137f2adcae13b398e5cecb
Author: Yong He <yonghe@outlook.com>
Date: Tue Sep 28 15:03:03 2021 -0700
Fix AbortCompilationException leaking through loadModule API. (#1949)
* Fix AbortCompilationException leaking through loadModule API.
* Update.
* Fix.
Co-authored-by: Yong He <yhe@nvidia.com>
commit cdf1b2c007fefdca128584d2a9f63dec3d350e16
Author: Yong He <yonghe@outlook.com>
Date: Tue Sep 28 11:54:24 2021 -0700
Improvements to the unit test framework. (#1948)
commit af788b62e18bbd55cd748ad60400a74cf1bc93ee
Author: lucy96chen <47800040+lucy96chen@users.noreply.github.com>
Date: Fri Sep 24 16:53:41 2021 -0700
Add existing device handle support unit test (#1946)
commit bec8e6aec85b6e3f875c58bdd59eb15613978358
Author: Yong He <yonghe@outlook.com>
Date: Fri Sep 24 11:33:44 2021 -0700
Move existing unit tests to a standalone dll. (#1945)
commit f2a3c933bc11a498c622fa18694c84beca8ca031
Author: lucy96chen <47800040+lucy96chen@users.noreply.github.com>
Date: Thu Sep 23 12:19:49 2021 -0700
Add method to retrieve native handles (#1944)
* Added a getNativeHandle() method that retrieves the natively created handles; Modified RendererBase, VKDevice, D3D12Device, and DebugDevice to implement this new method
* Moved ExistingDeviceHandles out of Desc directly inside IDevice and renamed to NativeHandles; Modified calls accessing the struct accordingly in RendererBase, DebugDevice, VKDevice, and D3D12Device
* Minor cleanup changes (renames, etc.)
commit b9b398d038b524f15a86ff27cd6888d54e8754e0
Author: Yong He <yonghe@outlook.com>
Date: Wed Sep 22 10:06:59 2021 -0700
Add gfx unit testing framework. (#1943)
* Add gfx unit testing framework.
* Fix compilation error.
* Reset gfxDebugCallback after render_test.
* Pass enabledApi flags through.
* Fix for code review suggestions.
Co-authored-by: Yong He <yhe@nvidia.com>
commit 6e9cee69b3588ddae09b08b9f580f59ad899983f
Author: lucy96chen <47800040+lucy96chen@users.noreply.github.com>
Date: Tue Sep 21 18:46:32 2021 -0700
Support for existing device/instance handles in Vulkan (#1942)
commit b1f04c8544c650de3947955ca68f679535d249aa
Author: lucy96chen <47800040+lucy96chen@users.noreply.github.com>
Date: Wed Sep 15 20:22:45 2021 -0700
Allow D3D12Device to use an existing device handle (#1940)
* Added a new field for an existing device handle to IDevice::Desc; Modified D3D12Device::initialize to set the device stored in desc if it already exists instead of creating a new one
* Turned existingDeviceHandle into a struct containing an array of two elements; Updated D3D12Device::initialize to match changes to existingDeviceHandle; Updated comments
* Fixed style error for ExistingDeviceHandles struct
commit 2f7b9f5ae8be21c6c1d75ae9caefbc7b3f8986a9
Author: Pablo Delgado <private@pablode.com>
Date: Thu Sep 16 01:17:57 2021 +0200
Fix incorrect WIN32 macros and missing Windows.h inclusion (#1939)
* Replace WIN32 preprocessor macros with _WIN32
* Add missing Windows.h include for InterlockedIncrement
commit 11d43642008905ac69a3832eb8a9b2ae7b785f86
Author: Yong He <yonghe@outlook.com>
Date: Tue Sep 14 11:36:44 2021 -0700
Avoid upcasting to f32 in 16bit float-uint bit cast. (#1938)
Co-authored-by: Yong He <yhe@nvidia.com>
commit 502aa3812a82cf0d091cff0c67804e4ee448ac78
Author: David Siher <32305650+dsiher@users.noreply.github.com>
Date: Tue Sep 14 12:59:55 2021 -0400
Bring heterogeneous-hello-world back up to date. (#1935)
* Bring heterogeneous-hello-world back up to date.
* Reintroduced heterogeneous-hello-world into the premake
* No longer uses compiled bytecode for entry point, instead a loadModule
call is hardcoded with the slang file name.
* Entry point is, similarly, hardcoded for now.
* Added a bypass to slang-legalize-types for an unneeded GPUForeach check
* Run premake and change to relative path
* Removed experimental and added README
Co-authored-by: Yong He <yonghe@outlook.com>
* Revert "Squashed commit of the following:"
This reverts commit 4f665858d65f7c332c616ef6db9fdafa1c5e0b9f.
* Run premake
* Remove prebuild command (only works on Windows?)
* Rerun premake
* Fix heterogeneous prebuild command
* Remove linux specific prebuild command
* Fix prebuild command (again)
* Change target from dxbc to hlsl to see if that fixes linux issues
* Use Path::getFileNameWithoutExt
* Change string-literal.slang.expected to have extra filename in decoration
Co-authored-by: Yong He <yonghe@outlook.com>
|
* Add gfx unit testing framework.
* Fix compilation error.
* Reset gfxDebugCallback after render_test.
* Pass enabledApi flags through.
* Fix for code review suggestions.
Co-authored-by: Yong He <yhe@nvidia.com>
|
* Bring heterogeneous-hello-world back up to date.
* Reintroduced heterogeneous-hello-world into the premake
* No longer uses compiled bytecode for entry point, instead a loadModule
call is hardcoded with the slang file name.
* Entry point is, similarly, hardcoded for now.
* Added a bypass to slang-legalize-types for an unneeded GPUForeach check
* Run premake and change to relative path
* Removed experimental and added README
Co-authored-by: Yong He <yonghe@outlook.com>
|
* Experimental DXR1.0 support in gfx.
- Add `dispatchRays` command.
- Add `createRayTracingPipelineState` method to construct a D3D ray tracing state object from a linked slang program and user specified shader table.
Limitations/simplifications: no local root signature support, shader table entries contains only shader identifiers and is specified at pipeline creation time, owned by the pipeline state object.
* Root object binding for raytracing pipelines.
* `maybeSpecializePipeline` implementation for raytracing pipelines.
* Add ray-tracing-pipeline example.
* Fixes.
* Update README.md
* Update comments on the lifespan of specialized pipelines
Co-authored-by: Yong He <yhe@nvidia.com>
Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>
|
* Update VS projects to 2019.
* Empty commit to trigger build
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Make slang-test the startup project.
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Split out compiler-core initially with just slang-source-loc.cpp
* More lexer, name, token to compiler-core.
* Split Lexer and Core diagnostics.
* Move slang-file-system to core.
* Add slang-file-system to core.
* More DownstreamCompiler into compiler-core
* Fix typo.
* Add compiler-core to bootstrap proj.
* Small fixes to premake
* For linux try with compiler-core
* Remove compiler-core from examples.
* Added NameConventionUtil to compiler-core
* Add global function to CharUtil to *hopefully* avoid linking issue.
* Hack to make linkage of CharUtil work on linux.
|
* Enable building glslang from source
Somehow the slang-glslang binaries we are currently using aren't the most up-to-date ones, so I am enabling building glslang from source so that we can produce new binaries.
* fixup: run generators
|
* Refactor window library.
* Fix project file
* Fix warnings.
|
* Refactor `gfx` to surface `CommandBuffer` interface.
* Fixes.
* Fix code review issues, and make vulkan runnable on devices without VK_EXT_extended_dynamic_states.
* Update solution files
* Move out-of-date examples to examples/experimental
Co-authored-by: Yong He <yhe@nvidia.com>
|
This change also switches the build back to using prebuilt glslang binaries instead of always building from source.
|
* Update glslang to 11.1.0
This change pulls new versions of glslang, spirv-headers, and spirv-tools as submodules, and makes the necessary changes to other files in the repository to get it all building (at least on Windows).
This change also enables building of glslang from source by default, so that we can easily generate new binaries for inclusion in the `slang-binaries` repository.
* fixup: missing file
|
* Make `gfx` compile to a DLL.
* Fix cuda
* Fix cuda build
* Bug gl screen capture bug.
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Testing out use of lz4.
* Added ICompressionSystem, and LZ4 implementation.
* Add support for deflate compression.
Simplify compression interface - to make more easily work across apis.
* WIP on CompressedFileSystem.
* ImplicitDirectoryCollector
* SubStringIndexMap - > StringSliceIndexMap.
* WIP save stdlib in different containers.
* Support for different archive types for stdlib.
* Fix project.
* CompressedFileSystem -> ArchiveFileSystem.
Added CompressionSystemType::None
* Added ArchiveFileSystem
* Fix problem with RiffFileSystem load without compression system.
* Test archive types.
Improve diagnostic message.
* Fix typo in testing file system archives.
* Split out archive detection.
* Fix gcc warning issue.
* Fix warning.
* RiffArchiveFileSystem -> RiffFileSystem
Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>
|
| |
|
| |
Co-authored-by: Yong He <yhe@nvidia.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* "Shader Toy" example and related fixes
This change introduces a new `shader-toy` example program that is primarily designed to show how Slang's features for type-based encapsulation and modularity can be applied to modularity for effects along the lines of those from `shadertoy.com`.
The Example
-----------
The example is being checked in with an example "toy" effect that I hastily put together, so that it would not be encumbered with any IP concerns. I wrote the effect using the shadertoy.com editor, so I can be sure it is valid GLSL. During bringup of the application I used a pre-existing and larger effect for testing, so some of the support code that was added is not being used at present.
The big-picture idea here is to have an example that shows how to modularize things using Slang interfaces and generics, and then to use the Slang compiler API to manage the compilation, composition, specialization, and linking steps. For better or worse this leads to the sequence of API calls involved being much longer than what was in something like the `hello-world` example.
Future Work (Example)
---------------------
There is a lot of room for improvement and expansion here, so this should be viewed as a checkpoint of work in progress rather than something I'm claiming as a finalized demonstration of all we'd like to achieve. Areas for future work include:
* We need to copy the integration of "Dear ImGui" that was already done for the `model-viewer` example so that this example can have a UI.
* Now that the compilation flow is broken into all these additional steps, it should be possible to have the application load multiple effects as distinct modules, and then provide a UI for switching between them. The chosen effect module would be used to specialize the top-level shader(s) before kernel generation.
* The checked-in logic includes a compute shader that can execute an effect, but that hasn't been tested nor has it been wired up to any kind of UI. We should have a way to switch between multiple execution methods, with a goal of eventually including CPU execution.
* The "GLSL compatibility" code needs a lot of improvements before it is likely to be usable for a nontrivial number of shaders. Some of that work is waiting on Slang compiler fixes, though.
* We should consider allowing the individual "toy" effects to define their own uniform parameters and expose those via a UI and reflection. The catch in this case is not that this would be difficult to do, but that it would be a semantic change to how shader toy effects currently work.
The Compiler Fixes
------------------
Doing this work exposed a few bugs in Slang, and this change includes fixes for the ones that were quick to address.
We already had logic in `slang-check-shader.cpp` that was validating the entry points in a compile request - either by checking the explicitly-listed entry points, or by scanning for `[shader("...")]` attributes. The problem is that the routine that did that checking was not being invoked on all compiles. The logic that handled entry points was only being run for manual compiles using `SlangCompileRequest`, while anything using `import` or `loadModule` would ignore entry points. I refactored the relevant code into a subroutine that will be invoked in all compilation scenarios.
There were already `TODO` comments in `SpecializedComponentType` which made the point about how a specialized entry point like `myShader<YourType>` would need to properly show that it has dependencies on both the module that defines `myShader` *and* the module that defines `YourType`, while only the former was being handled at present. I went ahead and implemented the logic to scan the generic arguments for a specialized component type in order to determine what module(s) the arguments depend on (both type arguments and witness tables). With that change, using `IComponentType::link` on a specialized component will properly pull in the module(s) that the generic arguments come from.
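The dependency scan described above can be pictured with a small sketch. This is not the code in `SpecializedComponentType`; the `GenericArg` and `SpecializedEntryPoint` structs and the `collectModuleDependencies` function are hypothetical stand-ins that only model the idea of gathering modules from both the base entry point and its generic arguments.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins; these are NOT Slang's real types.
struct GenericArg
{
    std::string definingModule;     // module that declares the argument type
    std::string witnessTableModule; // module providing the conformance; may be empty
};

struct SpecializedEntryPoint
{
    std::string baseModule;         // module that defines e.g. `myShader`
    std::vector<GenericArg> args;   // generic arguments, e.g. `YourType`
};

// Collects every module the specialized entry point depends on:
// the module of the unspecialized entry point plus the modules that
// each generic argument (type and witness table) comes from.
std::set<std::string> collectModuleDependencies(const SpecializedEntryPoint& ep)
{
    assert(!ep.baseModule.empty());
    std::set<std::string> deps;
    deps.insert(ep.baseModule);
    for (const GenericArg& arg : ep.args)
    {
        if (!arg.definingModule.empty())
            deps.insert(arg.definingModule);
        if (!arg.witnessTableModule.empty())
            deps.insert(arg.witnessTableModule);
    }
    return deps;
}
```

Using a set means a module that supplies both the type and its witness table is only listed once.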
In `slang-ir-legalize-types.cpp` we could run into assertion failures in debug builds because of code trying to legalize layout `IRAttr`s for fields or parameters with types that need legalization. In practice it is safe to skip these layout attributes, because legalization of the fields/parameters they pertain to would result in creation of entirely new layout attributes, and the old ones would then be unreferenced.
Future Work (Fixes)
-------------------
There are other compiler bugs that this work exposed, but which this change does not address. These will need to be resolved as part of subsequent changes:
* Slang allows for default-initialization of variables of a generic type. That is, given `<T : ISomething>` a user is allowed to declare `T x = {};` and the Slang front-end does not complain. Instead, this leads to an internal compiler error during IR lowering.
* The Slang `__init()` feature probably needs to be upgraded to a properly supported feature, and we probably need a way to make implementing default-initialization an easy thing (e.g., any `struct` type that has initial-value expressions for all its fields should automatically and implicitly satisfy an `init();` requirement declared in an interface)
* Inside an `__init()` definition, code has mutable access to members of the enclosing type, but for some reason the front-end is incorrectly treating `this` as immutable in those contexts. As a result you can write to `someField` but not `this.someField`.
* User-defined operator overloads flat out don't work (which isn't surprising given that no clients have decided to use them yet, and we have no test coverage for them). This is actually due to the shadowing rules being used for lookup right now, so a fix for this issue is going to have far-reaching consequences around what overloads are visible where (and anything that impacts overload resolution is a big can of worms, including around performance).
* fixup: test case had missing main function
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* #include an absolute path didn't work - because paths were taken to always be relative.
* Move reflection to reflection-api.
* Slight reorg to pull out potentially Slang internal functions from the reflection API impls.
* Remove visual studio projects
* Fix for slang-binaries copy.
* Add the visual studio projects in build/visual-studio
* Remove miniz project.
* Differentiate the linePath from the filePath.
* Improve comment in premake5.lua + to kick off CI.
* Kick CI.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* #include an absolute path didn't work - because paths were taken to always be relative.
* Add miniz
* Fix for separator in CacheFileSystem.
Add compression unit test for zip.
* Put zip compression into core.
* Remove delimiter stripping if simplifying a path - as stripping will fix delimiters.
* ZipFileSystem WIP.
* More ZipFileSystem working.
* Added isEmpty.
Fixed small bug in contains.
* First pass support for mutability on zip.
* Improvements to File::read/writeAllBytes
* Can access and save archive - but has memory leaks.
* Fix memory leak.
* Some ZIP compression tests.
* Fix memory leak on ScopedAllocation.
Fix off by one bug on UIntSet
* Bug fix in UIntSet
* Fix remaining ZipFileSystem issues.
Added standalone unit test.
* Turn tabs to spaces in slang-io.h
* Renamed mode ReadWrite (instead of just Write)
* Make miniz its own project.
* Fix windows warning on win32.
* Remove warnings needed when miniz was included as a header library.
* Set the C++ standard via 'flags' in premake.
* Add support for 'implicit' paths.
* Add testing for implicit directories.
Better handling of implicit directories.
* Improve comments in ZipFileSystem.
* Update comment around reader/writer transformation.
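The "implicit" paths mentioned in this change are directories that an archive never stores as entries of their own but that are implied by the paths of the files it does store. A minimal sketch of finding them, assuming entry paths use `/` separators and explicit directory entries end in `/` (the function name `collectImplicitDirectories` is hypothetical and is not the `ZipFileSystem` code):

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Returns every directory that is implied by an entry path but has
// no explicit directory entry of its own.
std::set<std::string> collectImplicitDirectories(const std::vector<std::string>& entries)
{
    // First pass: remember directories that are stored explicitly ("dir/").
    std::set<std::string> explicitDirs;
    for (const std::string& e : entries)
    {
        assert(e.find('\\') == std::string::npos); // sketch assumes '/' separators
        if (!e.empty() && e.back() == '/')
            explicitDirs.insert(e.substr(0, e.size() - 1));
    }

    // Second pass: every prefix that ends just before a '/' names a
    // parent directory; keep the ones that were not stored explicitly.
    std::set<std::string> implicitDirs;
    for (const std::string& e : entries)
    {
        for (std::size_t pos = e.find('/'); pos != std::string::npos; pos = e.find('/', pos + 1))
        {
            const std::string dir = e.substr(0, pos);
            if (!dir.empty() && explicitDirs.count(dir) == 0)
                implicitDirs.insert(dir);
        }
    }
    return implicitDirs;
}
```

For an archive holding `a/`, `a/b/c.txt` and `d/e.txt`, this reports `a/b` and `d` as implicit, since only `a` has an entry of its own.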
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Embed default prelude for CUDA
Slang supports the notion of a "prelude" that gets prepended to the source code we generate in a target language. For some targets, a prelude is not necessary (e.g., we compile to HLSL/GLSL and then on to DXBC/DXIL/SPIR-V just fine without a prelude), but some targets have been implemented in a way that makes a prelude necessary (notably CPU and CUDA). For the targets that require a prelude, the Slang codebase includes usable preludes under the `prelude/` directory.
Prior to this change, if a user was compiling for such a target (whether via command-line or API), they had to take responsibility for specifying the prelude to use (usually by passing in the contents of the prelude file(s) already included in the Slang distribution).
It is reasonable for a user to expect an out-of-the-box experience where compilation to CUDA PTX or native CPU code should Just Work, similarly to how compilation to SPIR-V Just Works. This change is a step in the direction of providing a user experience that Just Works for common cases.
The main addition here is a tool called `slang-embed` that we run during our build to turn the `prelude/*.h` files into `prelude/*.h.cpp` files that embed the contents of the original `.h` file as a `const` variable.
By compiling and linking in the generated `.h.cpp` file for the CUDA prelude, we are then able to set the default prelude to use for CUDA at the time a session/linkage is created. That default prelude will be used unless the user manually specifies their own prelude (which current users of the CUDA back-end must be doing).
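To make the embedding step concrete, here is a minimal sketch of the idea, not the actual `slang-embed` implementation: it takes the text of a header and produces C++ source that defines a `const` array with the same bytes. The function name `makeEmbeddedSource` and the byte-array output format are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: returns C++ source text that defines
// `extern const unsigned char <varName>[]` holding the bytes of
// `text` followed by a terminating zero. Emitting numeric byte
// values avoids having to escape quotes, backslashes, or newlines.
std::string makeEmbeddedSource(const std::string& varName, const std::string& text)
{
    assert(!varName.empty());
    std::string out = "extern const unsigned char " + varName + "[] = {";
    for (std::size_t i = 0; i < text.size(); ++i)
    {
        if (i % 12 == 0)
            out += "\n    "; // start a new row every 12 bytes
        char buf[8];
        const unsigned value = static_cast<unsigned char>(text[i]);
        std::snprintf(buf, sizeof(buf), "0x%02x, ", value);
        out += buf;
    }
    out += "\n    0x00\n};\n";
    return out;
}
```

A build step could write the returned text to a `.h.cpp` file and compile it with the rest of the library, so that the prelude is available as an ordinary symbol at link time.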
This change only sets up a default prelude for CUDA because of the way that the CPU prelude is split across multiple files. A strategy that provides a good default prelude for CPU may take more work, but that work might also be unnecessary if we switch to a strategy of using LLVM to generate native code.
The implementation of the `slang-embed` tool is intentionally simple, and it will likely run into issues if/when we need to embed binary files or larger text files. The assumption being made here is that we can address those issues when they arise, and there is no reason to over-engineer the tool right now.
The way that `slang-embed` is integrated into our build process is likely to require some iteration to make sure that it works across all platforms. I expect that this change will have multiple follow-up fixes related to trying to get the build to work as expected across all targets on CI.
* fixup: trying to ensure that embedded prelude gets compiled into slang
* fixup: properly clean up allocations in slang-embed
* fixup: fix double free introduced by previous change
* fixup: off-by-one allocation error
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Put the running of generators into a separate project, to try to ensure the generated products are available for other dependencies when compiling with multiple threads on linux.
* Made paths Strings in slang-generate. Made paths use / for path separators (rather than \ on windows which causes some problems with #line).
* Make the run-generators proj a utility step.
* Made run-generators a StaticLib.
* Fix problem with generating when not necessary.
* Trying to get abspath to work on linux.
* Add run-generator-main.cpp dummy file.
* Add comment about the issues around linux and correct build triggering.
* Add updated projects.
* Remove the run-generators-main.cpp as no longer needed for 'run-generators' tool.
Removed the adding of files by default from baseSlangProject
Made the run generators project use slang-string.cpp as the file it builds from core.
* Add the run-generators VS project.
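The path-separator change in this series (using `/` rather than `\` in generated paths) can be illustrated with a short sketch. The likely reason `\` is troublesome in `#line` is that the file name is written as a string literal, where a backslash starts an escape sequence; that explanation is an assumption, since the commit only says it "causes some problems". The helper names below are hypothetical and are not taken from `slang-generate`.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Replaces every backslash with a forward slash.
std::string toForwardSlashes(std::string path)
{
    std::replace(path.begin(), path.end(), '\\', '/');
    return path;
}

// Builds a `#line` directive whose file name is safe to place inside
// a string literal because it no longer contains backslashes.
std::string makeLineDirective(int line, const std::string& path)
{
    assert(line > 0);
    return "#line " + std::to_string(line) + " \"" + toForwardSlashes(path) + "\"";
}
```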
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Introduced heterogeneous example. Example includes C++ source and
header files, and does not currently make use of the associated slang
file when building. The intent of this commit is to introduce the
example as a baseline for later updates as the heterogeneous model is
expanded.
* Changing namespace
* Renamed and rewrote README
* Updated example to account for compiler updates
* Updated path
Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Synthesize "active mask" for CUDA
The Big Picture
===============
The most important change here is to `hlsl.meta.slang`, where the declaration of `WaveGetActiveMask()` is changed so that instead of mapping to `__activemask()` on CUDA (which is semantically incorrect) it maps to a dedicated IR instruction.
The other `WaveActive*()` intrinsics that make use of the implicit "active mask" concept had already been changed in #1336 so that they explicitly translate to call the equivalent `WaveMask*()` intrinsic with the result of `WaveGetActiveMask()`. As a result, all of the `WaveActive*()` functions are now no different from a user-defined function that uses `WaveGetActiveMask()`.
The bulk of the work in this change goes into an IR pass that replaces the new instruction for getting the active mask with appropriately computed values before we generate output CUDA code. That work is in `slang-ir-synthesize-active-mask.{h,cpp}`.
Utilities
=========
There are a few pieces of code that were helpful in writing the main pass but that can be explained separately:
* IR instructions were added corresponding to the Slang `WaveMaskBallot()` and `WaveMaskMatch()` functions, which map to the CUDA `__ballot_sync()` and `__match_any_sync()` operations, respectively. These are only implemented for the CUDA target because they are only being generated as part of our CUDA-only pass.
* The `IRDominatorTree` type was updated to make it a bit more robust in the presence of unreachable blocks in the CFG. It is possible that the same ends could be achieved more efficiently by folding the corner cases into the main logic, but I went ahead and made things very explicit for now.
* I added an `IREdge` utility type to better encapsulate the way that certain code operating on the predecessors/successors of an `IRBlock` were using an `IRUse*` to represent a control-flow edge. The `IREdge` type makes the logic of those operations more explicit. A future change should probably change it so that `IRBlock::getPredecessors()` and `getSuccessors()` are instead `getIncomingEdges()` and `getOutgoingEdges()` and work as iterators over `IREdge` values, given the way that the predecessor and successor lists today can contain duplicates.
* Using the above `IREdge` type, the logic for detecting and breaking critical edges was broken down into something that is a bit more clear (I hope), and that also factors out the breaking of an edge (by inserting a block along it) into a reusable subroutine.
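As a rough illustration of the critical-edge logic described in the last bullet, here is a sketch on a toy CFG. It uses the textbook definition (an edge is critical when its source has more than one successor and its target has more than one predecessor); the `Cfg` and `Edge` types are hypothetical and much simpler than `IRBlock` and `IREdge`. Identifying an edge by its slot in the source block's successor list keeps duplicate edges distinct, which is similar in spirit to identifying an edge by a use.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy CFG: block i lists the indices of its successor blocks.
struct Cfg
{
    std::vector<std::vector<int>> successors;
};

// An edge is a (source block, slot in its successor list) pair.
struct Edge
{
    int from;
    std::size_t slot;
};

// Counts incoming edges of `block`, including duplicates.
int countPredecessorEdges(const Cfg& cfg, int block)
{
    int n = 0;
    for (const std::vector<int>& succs : cfg.successors)
        for (int s : succs)
            if (s == block)
                ++n;
    return n;
}

bool isCritical(const Cfg& cfg, Edge e)
{
    const std::vector<int>& succs = cfg.successors[e.from];
    assert(e.slot < succs.size());
    return succs.size() > 1 && countPredecessorEdges(cfg, succs[e.slot]) > 1;
}

// Breaks an edge by inserting a new block along it; returns the new block.
int breakEdge(Cfg& cfg, Edge e)
{
    const int target = cfg.successors[e.from][e.slot];
    const int inserted = static_cast<int>(cfg.successors.size());
    cfg.successors.push_back(std::vector<int>(1, target)); // new block jumps to the old target
    cfg.successors[e.from][e.slot] = inserted;             // source now jumps to the new block
    return inserted;
}

// Breaks every critical edge leaving the original blocks; returns how many.
int breakCriticalEdges(Cfg& cfg)
{
    int broken = 0;
    const std::size_t originalCount = cfg.successors.size();
    for (std::size_t b = 0; b < originalCount; ++b)
    {
        for (std::size_t slot = 0; slot < cfg.successors[b].size(); ++slot)
        {
            const Edge e = { static_cast<int>(b), slot };
            if (isCritical(cfg, e))
            {
                breakEdge(cfg, e);
                ++broken;
            }
        }
    }
    return broken;
}
```

Keeping `breakEdge` separate from the detection loop mirrors the factoring the commit describes, where inserting a block along an edge is reusable on its own.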
The Main Pass
=============
The implementation of the new pass is in `slang-ir-synthesize-active-mask.cpp`, and that file attempts to include enough comments to make the logic clear. A brief summary for the benefit of the commit history:
* The first order of business is to identify functions that need to have the active mask value piped into them, and to add an additional parameter to them so that the active mask is passed down explicitly. Call sites are adjusted to pass down the active mask which can then result in new functions being identified as needing the active mask.
* The next challenge, for a function that uses the active mask, is to compute the active mask value to use in each basic block. The entry block can easily use the active mask value that was passed in, while other blocks need more work.
* When doing a conditional branch, we can compute the new mask for the block we branch to as a function of the existing mask and the branch condition. E.g., the value `WaveMaskBallot(existingMask, condition)` can be used as the mask for the "then" block of an `if` statement.
* When control flow paths need to "reconverge" at a point after a structured control-flow statement, we need to insert logic to synchronize and re-build the mask that will execute after the statement, while also excluding any lanes/threads that exited the statement in other ways (e.g., an early `return` from the function).
The explanation here is fairly hand-wavy, but the actual pass uses much more crisp definitions, so the code itself should be inspected if you care about the details.
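The branch and reconvergence rules above can be modeled on the host with plain bit masks. This is only a sketch under stated assumptions, not the pass itself: `waveMaskBallotModel` stands in for `WaveMaskBallot()` by taking the per-lane conditions as a bit vector, the "else" mask formula is an assumption (the text only gives the "then" case), and `reconvergeModel` ignores the synchronization a real implementation needs.

```cpp
#include <cassert>
#include <cstdint>

// Host-side model of a ballot over a 32-lane warp. Bit i of
// `conditionBits` is lane i's branch condition; only lanes that are
// present in `mask` contribute to the result.
std::uint32_t waveMaskBallotModel(std::uint32_t mask, std::uint32_t conditionBits)
{
    return mask & conditionBits;
}

struct BranchMasks
{
    std::uint32_t thenMask;
    std::uint32_t elseMask;
};

// Splits the mask of the block that ends in a conditional branch.
BranchMasks splitMaskAtBranch(std::uint32_t existingMask, std::uint32_t conditionBits)
{
    BranchMasks result;
    result.thenMask = waveMaskBallotModel(existingMask, conditionBits);
    result.elseMask = existingMask & ~result.thenMask; // assumption: the remaining lanes
    assert((result.thenMask | result.elseMask) == existingMask);
    return result;
}

// Mask for the block after the statement: lanes from either side,
// minus lanes that left early (e.g. through a `return`).
std::uint32_t reconvergeModel(BranchMasks masks,
                              std::uint32_t exitedEarlyInThen,
                              std::uint32_t exitedEarlyInElse)
{
    return (masks.thenMask & ~exitedEarlyInThen) | (masks.elseMask & ~exitedEarlyInElse);
}
```

For example, with four active lanes (`0xF`) where lanes 0 and 2 take the branch, the "then" mask is `0x5` and the "else" mask is `0xA`; if lane 0 returns inside the "then" block, the mask after the statement is `0xE`.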
Tests
=====
The tests for the new feature are all under `tests/hlsl-intrinsic/active-mask/`. Most of them stress a single control-flow construct (`if`, `switch`, or loop) and write out the value of `WaveGetActiveMask()` at various points in the code.
In practice, our definition of the active mask doesn't always agree with what D3D/Vulkan implementations seem to produce in practice, and as a result a certain amount of effort has gone into adding tweaks to the tests that force them to produce the expected output on existing graphics APIs. These tweaks usually amount to introducing conditional branches that aren't actually conditional in practice (the branch condition is always `true` or always `false` at runtime), in order to trick some simplistic analysis approaches that downstream compilers seem to employ.
One test case currently fails on our CUDA target (`switch-trivial-fallthrough.slang`) and has been disabled. This is an expected failure, because making it produce the expected value requires a bit of detailed/careful coding that would add a lot of additional complexity to this change. It seemed better to leave that as future work.
Future Work
===========
* As discussed under "Tests" above, the handling of simple `switch` statements in the current pass is incomplete.
* There's an entire can of worms to be dealt with around the handling of fall-through for `switch`.
* The current work also doesn't handle `discard` statements, which is unimportant right now (CUDA doesn't have fragment shaders), but might matter if we decide to synthesize masks for other targets. Similar work would probably be needed if we ever have `throw` or other non-local control flow that crosses function boundaries.
* An important optimization opportunity is being left on the floor in this change. When the block that comes "after" a structured control-flow region (which is encoded explicitly in Slang IR and SPIR-V) post-dominates the entry block of the region, then we know that the active mask when exiting the region must be the same as the mask when entering the region, and there is no need to insert explicit code to cause "re-convergence." This should be addressed in a follow-on change once we add code to Slang for computing a post-dominator tree from a function CFG.
* Related to the above, the decision-making around whether a basic block "needs" the active mask is perhaps too conservative, since it decides that any block that precedes one needing the active mask also needs it. This isn't true in cases where the active mask for a merge block can be inferred by post-dominance (as described above), so that the blocks that branch to it don't need to compute an active mask at all.
* If/when we extend the CPU target to support these operations (along with SIMD code generation, I assume), we will also need to synthesize an active mask on that platform, but the approach taken here (which pretty much relies on support for CUDA "cooperative groups") wouldn't seem to apply in the SIMD case.
* Similarly, the approach taken to computing the active mask here requires a new enough CUDA SM architecture version to support explicit cooperative groups. If we want to run on older CUDA-supporting architectures, we will need a new and potentially very different strategy.
* Because the new pass here changes the signature of functions that require the active mask (and not those that don't), it creates possible problems for generating code that uses dynamic dispatch (via function pointers). In principle, we need to know at a call site whether or not the callee uses the active mask. There are multiple possible solutions to this problem, and they'd need to be worked through before we can make the implicit active mask and dynamic dispatch be mutually compatible.
* Related to changing function signatures: no effort is made in this pass to clean up the IR type of the functions it modifies, so there could technically be mismatches between the IR type of a function and its actual signature. If/when this causes problems for downstream passes we probably need to do some cleanup.
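The post-dominance condition mentioned in the optimization bullets above can be checked with a standard iterative dataflow computation. The sketch below works on a toy CFG of at most 64 blocks with a single exit block that every block can reach; it is an illustration of the textbook algorithm, not code from Slang, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns, for every block b, a bit set in which bit p is set when
// block p post-dominates b (every path from b to the exit goes through p).
std::vector<std::uint64_t> computePostDominators(
    const std::vector<std::vector<int>>& successors, int exitBlock)
{
    const std::size_t n = successors.size();
    assert(n > 0 && n <= 64);
    const std::uint64_t one = 1;
    const std::uint64_t all = (n == 64) ? ~std::uint64_t(0) : ((one << n) - 1);

    // Start from "everything post-dominates everything" and shrink.
    std::vector<std::uint64_t> pdom(n, all);
    pdom[exitBlock] = one << exitBlock;

    bool changed = true;
    while (changed)
    {
        changed = false;
        for (std::size_t b = 0; b < n; ++b)
        {
            if (static_cast<int>(b) == exitBlock)
                continue;
            // A block is post-dominated by itself plus whatever
            // post-dominates all of its successors.
            std::uint64_t meet = all;
            for (int s : successors[b])
                meet &= pdom[s];
            const std::uint64_t updated = meet | (one << b);
            if (updated != pdom[b])
            {
                pdom[b] = updated;
                changed = true;
            }
        }
    }
    return pdom;
}

bool postDominates(const std::vector<std::uint64_t>& pdom, int p, int b)
{
    return ((pdom[b] >> p) & 1) != 0;
}
```

In an `if` region where both sides flow to the merge block, the merge block post-dominates the region entry, so the mask after the region equals the mask before it. If one side returns early (an edge straight to the exit), the merge block no longer post-dominates the entry and explicit reconvergence is still needed.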
* fixup: backslash-escaped lines
I did some "ASCII art" sorts of diagrams to explain cases in the CFG, and some of those diagrams used backslash (`\`) characters as the last character on the line, causing them to count as escaped newlines for C/C++.
The gcc compiler apparently balked at those lines, since they made some of the single-line comments into multi-line comments.
I solved the problem by adding a terminating column of `|` characters at the end of each line that was part of an ASCII art diagram.
* fixup: typos
Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Extractor builds without any reference to syntax (as it will be helping to produce this!).
* Change macros to include the super class.
* WIP replacing defs files.
* Added indexOf(const UnownedSubString& in) to UnownedSubString.
Refactored extractor
* Output a macro for each type with the extracted info - can be used during injection in class
* Simplify the header file - as can get super type and last from macro now
* Store the 'origin' of a definition
* Some small tidy ups to the extractor.
* Improve comments on the extractor options.
* Made CPPExtractor own SourceOrigins
* Small fixes around SourceOrigin.
* Small tidy up around macroOrigin
* WIP Visitor seems now to work correctly.
Split out types used by ast into slang-ast-support-types.h
* Fix remaining problems with C++ extractor being used with AST nodes.
Add CountOf to extractor type ids.
Added ReflectClassInfo::getInfo to turn an ASTNodeType into a ReflectClassInfo
* Fix compiling on linux.
Fix typo in memset.
* Small tidy up around comments/layout.
Moved NodeBase casting to NodeBase.
* Make premake generate project that builds with cpp-extractor for AST.
* Get the source directory from the filter in premake.
* Fix typo in source path
* Explicitly set the source path for premake generation for AST.
* Special case handling of override to appease Clang.
* Use a more general way to find the slang-ast-reflect.h file to run the extractor.
* Appveyor is not triggering slang-cpp-extractor - try putting dependson together.
* Put building slang-cpp-extractor first.
* Disable some project options to stop MSBuild producing internal compiler errors.
* Try reordering the projects in premake5.lua
* Hack to try and make slang-cpp-extractor built on appveyor.
* Disable flags - not required for MSBuild on appveyor.
* Disable flags not required for build on AppVeyor.
* Updated Visual Studio projects with slang-cpp-extractor.
* Added Visual Studio slang-cpp-extractor project.
|