summaryrefslogtreecommitdiffstats
path: root/prelude
Commit message (Collapse)AuthorAge
...
* Generate lookup tables from cmake (#3461)Ellie Hermaszewska2024-01-24
| | | | | | | | | | | | | | | | | | | * Generate lookup tables from cmake * Correct add_custom_command generator dependencies * set options for lookup table source * include path * use slang_add_target for capability generated targets * vs project regenerate * ci wobble --------- Co-authored-by: Yong He <yonghe@outlook.com>
* WIP: CMake (#3326)Ellie Hermaszewska2023-12-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * More robust input and output selection in generator tools * Add cmake build system * Get slang-test running with cmake * Bump lz4 and miniz dependencies * Make cmake build more declarative * Correct preprocessor logic in slang.h * Add cuda test to compute/simple * Remove empty cmake files * output placement for cmake, and commenting * Correct include paths in spirv-embed-generator * Format cmake with gersemi * Make cmake build clerer * Neaten header generation Also work around https://gitlab.kitware.com/cmake/cmake/-/issues/18399 by introducing correct_generated_properties to set the GENERATED flag in the correct scope * remove unused files * use 3.20 to set GENERATOR property properly * spelling * more flexible linker arg setting * replace slang-static with obj collection * Set rpath and linker path correctly * neaten generated file generation * tests working with cmake build * fix premake5 build * comment and neaten cmake * remove unnecessary dependency * Build aftermath example only when aftermath is enabled * Add slang-llvm and other dependencies * Put modules alongside binaries * Find slang-glslang correctly * Better option handling * comments * add llvm build test * Better option handling * cmake wobble * use UNICODE and _UNICODE * remove other workflows * use ccache * neaten * limit parallel for llvm build * use ninja for build * Windows and Darwin slang-llvm builds * cache key * verbose llvm build * cl on windows * sccache and cl.exe * use cl.exe * Correct package detection * less verbosity * Simplify miniz inclusion * fix build with sccache * Neaten llvm building * neaten * Neaten slang-llvm fetching * more surgical workarounds * Add ci action * Get version from git * better variable naming * add missing include * clean up after premake in cmake * more docs on cmake build * ci wobble * add imgui target * more selective source * do not download swiftshader * Some missing dependencies * only build llvm on dispatch * Disable /Zi in CI where sccache is present * simplify * set PIC for miniz * set policies before project * reengage workaround * more runs on ci * Add cmake presets * Add cpack * move iterator debug level to preset * Correct lib flag * simplify action * Neaten cmake init * Add todo * Add simple test wrapper * Add tests to workflow presets * rename packing preset * Correctly set definitions * docs * correct preset names * Make slang-test depend on test-server/test-process * neaten * use workflow in actions * install docs * Correct module install dir * debug dist workflow * Install headers * neaten header globbing * Neaten dependency handling * make lib and bin variables * Do not set compiler for vs builds, unnecessary * docs * allow setting explicit source for target * maintain archive subdir * cmake docs * install headers * place targets into folders * cmake docs * nest external projects in folder * remove name clash * Neater external packages * meta targets in folder structure * cleaner slang-glslang dll * Add missing static directive to slang-no-embedded-stdlib * more robust module copying * make slang-test the startup project * folder tweak * Make FETCH_BINARY the default on all platforms * Set DEBUG_DIR * add natvis files to source * skip spirv tests * remove test step from debug dist * Add build to .gitignore * redo warnings to be more like premake * Update imgui * clean more premake files * Disable PCH for glslang, gcc throws a warning * Add /MP for msvc builds * warning wobble * Add script to build llvm * Add slang-llvm and generators components * Build slang-llvm in ci * comments * fetch llvm with git * better abi approximation for cache * better sccache key * formatting * Correct logic around disabling problematic debug info for ccache * exclude gcc and clang from windows ci * Make dist workflows use system llvm * naming * restore normal dist builds * formatting * run tests in ci * Correct slang-llvm url setting * Rely on the system to find the test tool library * actions matrix wiggle * cope with OSX ancient bash * Correct compilers on windows * more ci debugging * Correct rpath handling on OSX * neaten * correct path to slang-llvm * Correct rpath separator on osx * Find slang-llvm correctly * smoke tests only on osx * ci wobble * Give MacOS module a dylib suffix * get swiftshader correctly * cope with bsd cp * remove debug output * full tests on osx * ci wobble * Add some vk tests to expected failures * simplify ci * ci wobble * exclude dx12 tests from github ci * remove cmake code for building llvm * warnings * warnings as errors for cl * spirv-tools in path * add aarch64 ci build * Add SLANG_GENERATORS_PATH option for prebuilt generators * neaten * Correct generator target name * remove yaml anchors because github actions does not support them * Demote CMake in docs Also add info on cross compiling * Restore premake CI * use minimal ci for cmake * Write miniz_export for premake build and .gitignore it * Mention build config tool options in docs * Remove redefined macro for miniz * regenerate vs project
* CUDA: Fixes for NVRTC 12.x and warp mask ambiguity; adds CC 8.x warp ↵Neil Bickford2023-11-07
| | | | | | | | | | | reduction intrinsics. (#3314) * CUDA: Fixes for NVRTC 12.x, warp mask ambiguity; add reduction partial specializations. * Fixes running NVRTC on CUDA 12 without a specified profile (used in testing, e.g. `slang-test -api cuda -category wave`) * Fixes mask ambiguity between getting the lane index from threadId.x and a full mask of threads. * Adds partial specializations for compute capability 8.x warp reduction intrinsics. * Fix formatting
* Make the exponent return value from frexp int (#3284)Ellie Hermaszewska2023-10-26
| | | | | | | | | | | * Make the exponent return value from frexp int Fixes https://github.com/shader-slang/slang/issues/3282 * Update slang-llvm. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* More `slangpy` features + polishing (#3233)Sai Praveen Bangaru2023-09-23
| | | | | | | | | | | | | | | | | * Update user-guide with new slangpy features * More polishing of new slangpy docs * Update a1-02-slangpy.md * Only require contiguity for vector element types * Added `loadOnce/storeOnce` and subscript operations * Added docs, `DiffTensorView.dims()` & `DiffTensorView.stride(uint)` * Add constructors, remove storeOnce/loadOnce test * Adjusted intrinsic definitions
* Add check for contiguous tensors (#3199)Sai Praveen Bangaru2023-09-08
| | | | | Otherwise, this can lead to undetected scenario where the strides are incorrect for non-scalar types (`float2`, `float3`, etc..) Users must call `tensor = tensor.contiguous()` on the inputs to avoid this error.
* Remove unsupported torch types + add bool type. (#3197)Sai Praveen Bangaru2023-09-08
| | | Co-authored-by: Yong He <yonghe@outlook.com>
* Misc. SPIRV Fixes, Part 2. (#3147)Yong He2023-08-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Misc. SPIRV Fixes, Part 2. * Fix up. * Fix. * Add system smenatic values. * 16 bit int and floats, matrix/vector reshape, bool ops. * Fix. * Fix. * Allow push constant entry point params. * entrypoint params. * swizzleSet and swizzledStore. * packoffset. * string hash. * Fix. * Matrix arithmetics. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Lower all ByteAddressBuffer uses for SPIRV. (#3143)Yong He2023-08-23
| | | Co-authored-by: Yong He <yhe@nvidia.com>
* Only define atomics for `float2` and `float4` when CUDA arch<900 (#3041)Sai Praveen Bangaru2023-08-02
| | | From Ada onwards, these definitions are already available in CUDA's stdlib and will cause a compiler error.
* Avoid implicit casts or device transfers. (#2992)Sai Praveen Bangaru2023-07-14
|
* Fix native string emit for CUDA/Cpp backend. (#2980)Yong He2023-07-12
| | | Co-authored-by: Yong He <yhe@nvidia.com>
* Support for infinite literal of from 34.2432#INF (#2944)jsmall-nvidia2023-06-27
|
* Various fixes for autodiff and slangpy. (#2876)Yong He2023-05-09
| | | | | | | | | | | | | * Various fixes for autodiff and slangpy. * Fix cuda code gen for `select`. * Fix getBuildTagString(). * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Set sharedMem argument to 0 when launching cuda kernel. (#2799)Yong He2023-04-13
| | | Co-authored-by: Yong He <yhe@nvidia.com>
* Small fixes to TorchTensor. (#2790)Yong He2023-04-11
| | | Co-authored-by: Yong He <yhe@nvidia.com>
* Fix linking issue in slangpy + no mask param for kernels. (#2778)Yong He2023-04-05
| | | | | | | | | | | | | * Fix linking issue in slangpy + no mask param for kernels. * add cuda header changes * fix * More correct change of active mask insertion. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* More builtin library support in torch backend. (#2760)Yong He2023-03-30
| | | Co-authored-by: Yong He <yhe@nvidia.com>
* Convert tensor types in `make_tensor_view`. (#2755)Yong He2023-03-29
| | | Co-authored-by: Yong He <yhe@nvidia.com>
* Add slangpy doc, fix cuda prelude. (#2748)Yong He2023-03-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Add slangpy doc, fix cuda prelude. * more bug fix. * fix. * fix. * More fix. * fix. * f * fix prelude. * update prelude. * update doc * Update prelude. * add zeros_like * update doc. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Update slang-llvm (#2735)Yong He2023-03-26
|
* Add PyTorch C++ binding generation. (#2734)Yong He2023-03-26
| | | | | | | | | * Add PyTorch C++ binding generation. * fix --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Add support for emitting cuda kernel and host functions. (#2712)Yong He2023-03-17
| | | | | | | | | | | * Add support for emitting cuda kernel and host functions. * Update test. * Fix cuda preamble emit. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Overhaul global inst deduplication and cpp/cuda backend. (#2654)Yong He2023-02-16
| | | | | | | | | * Overhaul global inst deduplication and cpp/cuda backend. * Update IR documentation. --------- Co-authored-by: Yong He <yhe@nvidia.com>
* Preliminary debugBreak support (#2647)jsmall-nvidia2023-02-14
| | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * Preliminary support for debug break. * Add C++ debug break support. Add details about usage. * Improve debug break test details. * Make HLSL output a comment about no support. * Handle specialize for target assert, without a body if it has spv_instruction/target intrinsic
* Fix code generation for matrix reshape. (#2568)Yong He2022-12-14
| | | Co-authored-by: Yong He <yhe@nvidia.com>
* Fix inlining pass. (#2506)Yong He2022-11-10
| | | | | | | | | | | | | | | * Fix inlining pass. * Add more check against corner cases. * Revise comments. * Fixes. * Fix premake script. * Fixes. Co-authored-by: Yong He <yhe@nvidia.com>
* f32tof16 and f16tof32 support for CPU targets (#2500)jsmall-nvidia2022-11-09
| | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * Float16 support for C++/CPU based targets with f16tof32 and f32tof16. * Small correction around INF/NAN handling for f32tof16 * Small improvement to f16tof32 * Disable CUDA test for now.
* Make cpp-host prelude include scalar intrinsics. (#2478)Yong He2022-10-31
| | | | | | | * Make cpp-host prelude include scalar intrinsics. * Fix. Co-authored-by: Yong He <yhe@nvidia.com>
* Run simple compute kernel in gfx-smoke test. (#2400)Yong He2022-09-15
|
* Add gfx interface definition in Slang. (#2364)Yong He2022-08-16
| | | | | | | | | | | | | | | | | | * Add gfx interface definition in Slang. - add gfx interface definitons in Slang. - fix slang compiler to correctly type-check `out` interface argument. - modify gfx interface to be fully COM compatible - add convenient ShaderProgram creation methods to gfx. * Fix compile errors and warnings. * Update project files * Fix cuda. * Properly implement queryInterface in command encoder impls. Co-authored-by: Yong He <yhe@nvidia.com>
* Language server pointer type support + add `DLLImport` test (#2350)Yong He2022-08-10
| | | | | | | | | | | | | | | | | | | * Language server pointer type support. + Natvis for AST. * Add completion suggestion for GUID. * Make executable test able to use slang-rt. * Fix gcc argument for rpath. * Fix DLLImport on linux. * Fix windows. * Fix. Co-authored-by: Yong He <yhe@nvidia.com>
* Allow `class` to implement COM interface, [DLLExport] (#2338)Yong He2022-07-25
| | | | | | | * Allow `class` to implement COM interface, [DLLExport] * Fix [COM] usage in tests and examples with UUIDs. Co-authored-by: Yong He <yhe@nvidia.com>
* Improved bounds checking for C++/CUDA (#2263)jsmall-nvidia2022-06-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * Use TerminatedUnownedStringSlice for literals in output C++. * Remove Escape/Unescape functions used in slang-token-reader.cpp Add target type of 'host-cpp' etc to map to the target types. * Fix some corner cases around string encoding. * Added unit test for string escaping. Fixed some assorted escaping bugs. * Updated test output. * Added decode test. * Stop using hex output, to get around 'greedy' aspect. Use octal instead. * Added HostHostCallable Small changes to use ArtifactDesc/Info instead of large switches. * Fix C++ emit to handle arbitrary function export. * Add options handling for callable without an output being specified. * Can compile with COM interface. Added example using com interface. * Use the IR Ptr type instead of hack in C++ emit for interfaces. * Fix issue with outputting the COM call when ptr is used. * Fix crash issue on compilation failure. * Add support for __global. * Added `ActualGlobalRate` Added special handling around globals and COM interfaces. Tested out in cpu-com-example. * Fix typo in NodeBase. * Support for accessing globals by name working. * Bounds checking for C++ Improved bounds checks for CUDA. * Check that actual global initialization is working. * Fix typo. * Refactor the com replacement such that it doesn't need a cache or do anything special with GlobalVar. * Fix typo in CUDA prelude. * Remove context. Only create replacement if needed. * Split out COM host-callable into a unit-test. * host-callable com testing on C++and llvm. * Comment around the COM ptr replacement. * WIP Zero bound test. * Disable com test on vs 32 bit. Fix C++ prelude * Disable 32 bit targets testing com host-callable. * For now disable zero index test. * Enable bounds checking for CPU/CUDA. * Small fixes. Disable CUDA zero index bound fix. * Add test result for bound check. * Work around for index wrapping issue. * Added Fixed array test. * Only enable prelude asserts via SLANG_PRELUDE_ENABLE_ASSERT (unless defined by the user)
* Actual global support (#2262)jsmall-nvidia2022-06-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * Use TerminatedUnownedStringSlice for literals in output C++. * Remove Escape/Unescape functions used in slang-token-reader.cpp Add target type of 'host-cpp' etc to map to the target types. * Fix some corner cases around string encoding. * Added unit test for string escaping. Fixed some assorted escaping bugs. * Updated test output. * Added decode test. * Stop using hex output, to get around 'greedy' aspect. Use octal instead. * Added HostHostCallable Small changes to use ArtifactDesc/Info instead of large switches. * Fix C++ emit to handle arbitrary function export. * Add options handling for callable without an output being specified. * Can compile with COM interface. Added example using com interface. * Use the IR Ptr type instead of hack in C++ emit for interfaces. * Fix issue with outputting the COM call when ptr is used. * Fix crash issue on compilation failure. * Add support for __global. * Added `ActualGlobalRate` Added special handling around globals and COM interfaces. Tested out in cpu-com-example. * Fix typo in NodeBase. * Support for accessing globals by name working. * Check that actual global initialization is working. * Refactor the com replacement such that it doesn't need a cache or do anything special with GlobalVar. * Remove context. Only create replacement if needed. * Split out COM host-callable into a unit-test. * host-callable com testing on C++and llvm. * Comment around the COM ptr replacement. * Disable com test on vs 32 bit. Fix C++ prelude * Disable 32 bit targets testing com host-callable. * Use JSON parsing to locate VS version. * Need platform detection in C++prelude. * Fix com host callable test for LLVM. * Work around for not being able to include "targetConditionals.h"
* COM interfaces with host callable (#2258)jsmall-nvidia2022-06-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * Use TerminatedUnownedStringSlice for literals in output C++. * Remove Escape/Unescape functions used in slang-token-reader.cpp Add target type of 'host-cpp' etc to map to the target types. * Fix some corner cases around string encoding. * Added unit test for string escaping. Fixed some assorted escaping bugs. * Updated test output. * Added decode test. * Stop using hex output, to get around 'greedy' aspect. Use octal instead. * Added HostHostCallable Small changes to use ArtifactDesc/Info instead of large switches. * Fix C++ emit to handle arbitrary function export. * Add options handling for callable without an output being specified. * Can compile with COM interface. Added example using com interface. * Use the IR Ptr type instead of hack in C++ emit for interfaces. * Fix issue with outputting the COM call when ptr is used. * Fix crash issue on compilation failure.
* Support `[DllImport]` (#2181)Yong He2022-04-12
| | | | | | | | | | | | | | | | | * Support `[DllImport]` * Fix. * Fix. * Fix array type emit in cpp. * Fix. * Fix. * Fix Co-authored-by: Yong He <yhe@nvidia.com>
* Allow slangc to generate exe from .slang file. (#2170)Yong He2022-03-28
|
* Fixed naming conflicts in heterogeneous-hello-world (#2114)David Siher2022-02-03
| | | | | | | | | | | | | | | | | | | | * Fixed naming conflicts in heterogeneous-hello-world Added 3 new modifiers (`__unmangled`, `__exportDirectly`, `__externLib`) `__unmangled` causes mangleName() to return the normal name of the decl. `__exportDirectly` changes parent decl name concatenation behavior to use "::" instead of "." (for Name Hint) and emits the name hint when it exists, otherwise it emits the mangled name. `__externLib` stops Slang from emitting the corresponding struct. Also made necessary changes to heterogeneous-hello-world so that this new functionality is shown off. * Undo unintentional formatting changes Co-authored-by: Yong He <yonghe@outlook.com>
* Generalize heterogenous code emit (#1968)David Siher2021-10-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Bring heterogeneous-hello-world back up to date. * Reintroduced heterogeneous-hello-world into the premake * No longer uses compiled bytecode for entry point, instead a loadModule call is hardocoded with the slang file name. * Entry point is, similarly, hardcoded for now. * Added a bypass to slang-legalize-types for an unneeded GPUForeach check * Run premake and change to relative path * Removed experimental and added README * Add prebuild command to premake for heterogeneous example * Pass in entry point as parameter (also remove shader bytecode) * Pass in module name as parameter * Squashed commit of the following: commit 5b13b57fe600724344c556fe4309a5d6bb3d39ab Author: Kai Yao <kyao@nvidia.com> Date: Thu Oct 7 23:38:50 2021 -0700 Return diagnostics data when encountering module load error by exception (#1966) commit 112e1515c30fa972ff56f91514b70946153c718c Author: jsmall-nvidia <jsmall@nvidia.com> Date: Thu Oct 7 16:12:29 2021 -0400 Disable test crashing CI (#1965) * #include an absolute path didn't work - because paths were taken to always be relative. * Disable test that appears to be crashing. commit da32069a0c1c8c723d7ef45100049a8f0dd5d9c4 Author: Kai Yao <kyao@nvidia.com> Date: Mon Oct 4 13:58:51 2021 -0700 Modified barrier API to accept multiple resources per call (#1959) Co-authored-by: Yong He <yonghe@outlook.com> commit 97bb82ebcdf8f1391b9d93b5a8d7b1dfc4e88e52 Author: jsmall-nvidia <jsmall@nvidia.com> Date: Mon Oct 4 14:15:51 2021 -0400 Removing exceptions from core/compiler-core (#1953) * #include an absolute path didn't work - because paths were taken to always be relative. * Refactor Stream. Working on all tests. * Split out CharEncode. * Make method names lower camel. m_prefix in Writer/Reader * Tidy up around CharEncode interface. * Small improvements around encode/decode. * Better use of types. * Remove readLine from TextReader. * Remove exceptions from Stream/Text handling. * Fix some typos. * Fix tabbing. * Fix missing override. * Remove remaining exception throw/catch via using signal mechanism. * Remove exceptions that are not used anymore. * Document the Stream interface. * Remove index for decoding 'get byte' function. * Fix CharReader -> ByteReader. commit b3dfe383c6d31ff3dbd76dcfb32de8d536382f3e Author: lucy96chen <47800040+lucy96chen@users.noreply.github.com> Date: Mon Oct 4 09:46:33 2021 -0700 Get native handles for TextureResource and BufferResource (#1960) * Added getNativeHandle() to TextureResource and BufferResource; Implemented getNativeHandle() in Vulkan and D3D12; Added new unit test files for the aforementioned implementation * Added missing getNativeHandle() implementations to renderer-shared.cpp and CUDA * Finished new getNativeHandle() unit tests for ITextureResource and IBufferResource; Modified ICommandQueue and ICommandBuffer unit tests to call QueryInterface to convert to IUnknown then back and compare resulting pointers for equality * Unit tests updated and pass locally * Cast m_buffer.m_buffer and m_image to uint64_t commit 35bca4cc432613af3926da3bed217a6baa9cbd26 Author: lucy96chen <47800040+lucy96chen@users.noreply.github.com> Date: Fri Oct 1 13:08:25 2021 -0700 Add getNativeHandle() to ICommandQueue and ICommandBuffer (#1952) * Added support for getting command buffer and command queue handles to ICommandBuffer and ICommandQueue; D3D12Device, VkDevice, and DebugDevice modifieid to implement this new functionality; immediate-renderer-base.cpp also modified to implement the new functions * Removed excess boilerplate * Changed readRef() to get() in D3D12 getNativeHandle() implementation for ICommandBuffer and ICommandQueue * Added unit tests for new getNativeHandle() implementations, unfinished * Queue test added; Minor cleanup changes * getBufferHandleTestImpl() now closes the command buffer before returning * Added getNativeHandle() implementations to CUDADevice * Added comment clarifying that the Vulkan check is checking for a null handle, which is defined to be 0 commit 6c6200f547c7387598743b23bb3c8f0d375d9494 Author: Kai Yao <kyao@nvidia.com> Date: Thu Sep 30 20:25:34 2021 -0700 VK Resource Barrier (#1955) * Resource barrier API and VK implementation * Stub implementations * Handle VK Acceleration Structure flag * Add a couple more cases to pipeline barrier stages commit 627fc976bac5c2381dbace9c7925cb6a68b8de12 Author: Yong He <yonghe@outlook.com> Date: Thu Sep 30 19:48:47 2021 -0700 Fix aarch64 build on github (#1957) Co-authored-by: Yong He <yhe@nvidia.com> commit 122d701513e116856bd59c999221ce36a373d7db Author: Yong He <yonghe@outlook.com> Date: Thu Sep 30 17:51:56 2021 -0700 Fix GitHub release (#1956) * Fix aarch64 release build config. * Fix for WinAarch64 build. * Update premake for embed-std-lib build on aarch64. * `platform` fix for aarach64 build. * Try revert back to use absolute output path for slang-stdlib-generated.h * Fix * fix Co-authored-by: Yong He <yhe@nvidia.com> commit aa8f7b899b7b562b3d3c6e25c3da41569505e70c Author: Chad Engler <englercj@live.com> Date: Wed Sep 29 13:02:47 2021 -0700 Fix ARM64 detection for MSVC (#1951) commit 6736b0c1c5fa3e89bc561eb7965a1a0d17af3466 Author: Yong He <yonghe@outlook.com> Date: Wed Sep 29 11:29:46 2021 -0700 Add ISession::loadModuleFromSource. (#1950) Co-authored-by: Yong He <yhe@nvidia.com> commit d8e452412e14a6a8ba137f2adcae13b398e5cecb Author: Yong He <yonghe@outlook.com> Date: Tue Sep 28 15:03:03 2021 -0700 Fix AbortCompilationException leaking through loadModule API. (#1949) * Fix AbortCompilationException leaking through loadModule API. * Update. * Fix. Co-authored-by: Yong He <yhe@nvidia.com> commit cdf1b2c007fefdca128584d2a9f63dec3d350e16 Author: Yong He <yonghe@outlook.com> Date: Tue Sep 28 11:54:24 2021 -0700 Improvements to the unit test framework. (#1948) commit af788b62e18bbd55cd748ad60400a74cf1bc93ee Author: lucy96chen <47800040+lucy96chen@users.noreply.github.com> Date: Fri Sep 24 16:53:41 2021 -0700 Add existing device handle support unit test (#1946) commit bec8e6aec85b6e3f875c58bdd59eb15613978358 Author: Yong He <yonghe@outlook.com> Date: Fri Sep 24 11:33:44 2021 -0700 Move existing unit tests to a standalone dll. (#1945) commit f2a3c933bc11a498c622fa18694c84beca8ca031 Author: lucy96chen <47800040+lucy96chen@users.noreply.github.com> Date: Thu Sep 23 12:19:49 2021 -0700 Add method to retrieve native handles (#1944) * Added a getNativeHandle() method that retrieves the natively created handles; Modified RendererBase, VKDevice, D3D12Device, and DebugDevice to implement this new method * Moved ExistingDeviceHandles out of Desc directly inside IDevice and renamed to NativeHandles; Modified calls accessing the struct accordingly in RendererBase, DebugDevice, VKDevice, and D3D12Device * Minor cleanup changes (renames, etc.) commit b9b398d038b524f15a86ff27cd6888d54e8754e0 Author: Yong He <yonghe@outlook.com> Date: Wed Sep 22 10:06:59 2021 -0700 Add gfx unit testing framework. (#1943) * Add gfx unit testing framework. * Fix compilation error. * Reset gfxDebugCallback after render_test. * Pass enabledApi flags through. * Fix for code review suggestions. Co-authored-by: Yong He <yhe@nvidia.com> commit 6e9cee69b3588ddae09b08b9f580f59ad899983f Author: lucy96chen <47800040+lucy96chen@users.noreply.github.com> Date: Tue Sep 21 18:46:32 2021 -0700 Support for existing device/instance handles in Vulkan (#1942) commit b1f04c8544c650de3947955ca68f679535d249aa Author: lucy96chen <47800040+lucy96chen@users.noreply.github.com> Date: Wed Sep 15 20:22:45 2021 -0700 Allow D3D12Device to use an existing device handle (#1940) * Added a new field for an existing device handle to IDevice::Desc; Modified D3D12Device::initialize to set the device stored in desc if it already exists instead of creating a new one * Turned existingDeviceHandle into a struct containing an array of two elements; Updated D3D12Device::initialize to match changes to existingDeviceHandle; Updated comments * Fixed style error for ExistingDeviceHandles struct commit 2f7b9f5ae8be21c6c1d75ae9caefbc7b3f8986a9 Author: Pablo Delgado <private@pablode.com> Date: Thu Sep 16 01:17:57 2021 +0200 Fix incorrect WIN32 macros and missing Windows.h inclusion (#1939) * Replace WIN32 preprocessor macros with _WIN32 * Add missing Windows.h include for InterlockedIncrement commit 11d43642008905ac69a3832eb8a9b2ae7b785f86 Author: Yong He <yonghe@outlook.com> Date: Tue Sep 14 11:36:44 2021 -0700 Avoid upcasting to f32 in 16bit float-uint bit cast. (#1938) Co-authored-by: Yong He <yhe@nvidia.com> commit 502aa3812a82cf0d091cff0c67804e4ee448ac78 Author: David Siher <32305650+dsiher@users.noreply.github.com> Date: Tue Sep 14 12:59:55 2021 -0400 Bring heterogeneous-hello-world back up to date. (#1935) * Bring heterogeneous-hello-world back up to date. * Reintroduced heterogeneous-hello-world into the premake * No longer uses compiled bytecode for entry point, instead a loadModule call is hardocoded with the slang file name. * Entry point is, similarly, hardcoded for now. * Added a bypass to slang-legalize-types for an unneeded GPUForeach check * Run premake and change to relative path * Removed experimental and added README Co-authored-by: Yong He <yonghe@outlook.com> * Revert "Squashed commit of the following:" This reverts commit 4f665858d65f7c332c616ef6db9fdafa1c5e0b9f. * Run premake * Remove prebuild command (only works on Windows?) * Rerun premake * Fix heterogeneous prebuild command * Remove linux specific prebuild command * Fix prebuild command (again) * Change target from dxbc to hlsl to see if that fixes linux issues * Use Path::getFileNameWithoutExt * Change string-literal.slang.expected to have extra filename in decoration Co-authored-by: Yong He <yonghe@outlook.com>
* Bring heterogeneous-hello-world back up to date. (#1935)David Siher2021-09-14
| | | | | | | | | | | | | | | | | | * Bring heterogeneous-hello-world back up to date. * Reintroduced heterogeneous-hello-world into the premake * No longer uses compiled bytecode for entry point, instead a loadModule call is hardocoded with the slang file name. * Entry point is, similarly, hardcoded for now. * Added a bypass to slang-legalize-types for an unneeded GPUForeach check * Run premake and change to relative path * Removed experimental and added README Co-authored-by: Yong He <yonghe@outlook.com>
* First Slang LLVM integration (#1934)jsmall-nvidia2021-09-10
| | | | | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * First integration with 'slang-llvm'. * Fix project. * Fix test output. * First pass assert support. * Add inline impls for min and max. * Add abs inline abs impl for llvm. * Make abs not use ternary op * Fix typo in slang-llvm.h * Sundary fixes to make remaining tests using llvm backend pass.
* CUDA layout corner cases/testing (#1881)jsmall-nvidia2021-06-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * Add support for sizeOf/alignOf/offsetOf to stdlib. Add $G intrinsic expansion that works of the generic parameters not the param type * Test cuda layout. * Fix CUDA layout issues. Fix reflection to handle other built in types. Fix __offsetOf * Tests of reflection and layout as reported directly from CUDA. * Comment about use of aligned size as size. * Fix warning from VS. * Check alignment is pow2. * Small improvements to alignment calcs. * Tab to spaces. * Fix alignment pointer sizes on 32 bit OS for CUDA. * Fix CUDA reflection on 32 bit.
* Enable tracing rays with OptiX backend (#1871)Nathan V. Morrical2021-06-04
| | | | | | | | | | | | | | | * OptiX ray payload can now be read from and written to using the two payload register pointer method * changing op to more descriptive name * small tweak to allow for dumping out intermediate source for cuda targets * initial trace ray call compiling * hit attributes now work for float and int types, and vectors thereof * Hitgroups using structs and arrays now work with optix Co-authored-by: T. Foley <tfoleyNV@users.noreply.github.com>
* OptiX ray payload read/write support in raytracing pipeline shaders (#1853)Nathan V. Morrical2021-05-25
| | | | | | | | | * OptiX ray payload can now be read from and written to using the two payload register pointer method * changing op to more descriptive name * fixup: comment change to re-trigger CI Co-authored-by: T. Foley <tfoleyNV@users.noreply.github.com>
* Read half->float RWTexture conversion (#1842)jsmall-nvidia2021-05-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * Fix for writing to RWTexture with half types on CUDA. * CUDA half functionality doc updates. * First pass support for sust.p RWTexture format conversion on write. * Tidy up implementation of $C. Made clamping mode #define able. * A simple test for RWTexture CUDA format conversion. * Add support for float2 and float4. * WIP conversion testing. * Use $E to fix byte addressing in X in CUDA. * Do not scale when accessing via _convert versions of surface functions. * Revert to previous test. * Test with half/float convert write/read. * More broad half->float read conversion testing. * Improve documentation around half and RWTexture conversion.
* Surface access on CUDA is byte addressed in X (#1841)jsmall-nvidia2021-05-15
| | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * Fix for writing to RWTexture with half types on CUDA. * CUDA half functionality doc updates. * First pass support for sust.p RWTexture format conversion on write. * Tidy up implementation of $C. Made clamping mode #define able. * A simple test for RWTexture CUDA format conversion. * Use $E to fix byte addressing in X in CUDA. * Do not scale when accessing via _convert versions of surface functions.
* Support for HW format conversions for RWTexture on CUDA (#1840)jsmall-nvidia2021-05-15
| | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * Fix for writing to RWTexture with half types on CUDA. * CUDA half functionality doc updates. * First pass support for sust.p RWTexture format conversion on write. * Tidy up implementation of $C. Made clamping mode #define able. * A simple test for RWTexture CUDA format conversion.
* CUDA half RWTexture write support/doc improvements (#1839)jsmall-nvidia2021-05-14
| | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * Fix for writing to RWTexture with half types on CUDA. * CUDA half functionality doc updates.
* Support for reads from RWTexture<half> (#1837)jsmall-nvidia2021-05-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * #include an absolute path didn't work - because paths were taken to always be relative. * Split out StringEscapeUtil. * Added StringEscapeUtil. * Fix typo in unix quoting type. * Small comment improvements. * Try to fix linux linking issue. * Fix typo. * Attempt to fix linux link issue. * Update VS proj even though nothing really changed. * Fix another typo issue. * Fix for windows issue. Fixed bug. * Make separate Utils for escaping. * Fix typo. * Split out into StringEscapeHandler. * Windows shell does handle removing quotes (so remove code to remove them). * Handle unescaping if not initiating using the shell. * Slight improvement around shell like decoding. * Simplify command extraction. * Add shared-library category type. * Fix bug in command extraction. * Typo in transcendental category. * Enable unit-test on in smoke test category. * Make parsing failing output as a failing test. * Fixes for transcendental tests. Disable tests that do not work. * Changed category parsing. * Removed the TestResult parameter from _gatherTestsForFile. Made testsList only output. * Remove testing if all tests were disabled. * Make args of CommandLine always unescaped. * Add category. * Don't need escaping on unix/linux. * Remove some no longer used functions. * Add requireSMVersion to CUDAExtensionTracker. * half-calc.slang now works for CUDA. * bit-cast-16-bit works on CUDA. * WIP handling of CUDA vector<half> types. * Half swizzle CUDA. * Half vector test. * Fix swizzle half bug. * Fix compilation issue with narrowing to Index. * Add unary ops. * Add some vector scalar maths ops. * Add half vector conversions for CUDA. * Fix erroneous comment. * Support for half comparisons. * First pass test for half compare. * Fix bug in CUDA specialized emit control. Updated tests to have pre and post inc/dec. * Removed unneeded parts of the cuda prelude. * Half structured buffer works on CUDA. * Added name lookup for Gfx::Format * Support half texture type in test system. * Test for half reading on CUDA. * Add half formats to Vk and D3D utils. * Fix getAt for CUDA - where there might not be a .x member in a vector. * Template specialization for half surface access works. * Half RWTexture support. * Test for half RWTexture access. * Update half-rw-texture test. * Remove test function from CUDA prelude.