slang.git - Making it easier to work with shaders

Age	Commit message	Author
2024-01-05	Add test for glsl groupshared init (#3433)	Ellie Hermaszewska

2023-12-08	WIP: CMake (#3326)	Ellie Hermaszewska
	* More robust input and output selection in generator tools * Add cmake build system * Get slang-test running with cmake * Bump lz4 and miniz dependencies * Make cmake build more declarative * Correct preprocessor logic in slang.h * Add cuda test to compute/simple * Remove empty cmake files * output placement for cmake, and commenting * Correct include paths in spirv-embed-generator * Format cmake with gersemi * Make cmake build clerer * Neaten header generation Also work around https://gitlab.kitware.com/cmake/cmake/-/issues/18399 by introducing correct_generated_properties to set the GENERATED flag in the correct scope * remove unused files * use 3.20 to set GENERATOR property properly * spelling * more flexible linker arg setting * replace slang-static with obj collection * Set rpath and linker path correctly * neaten generated file generation * tests working with cmake build * fix premake5 build * comment and neaten cmake * remove unnecessary dependency * Build aftermath example only when aftermath is enabled * Add slang-llvm and other dependencies * Put modules alongside binaries * Find slang-glslang correctly * Better option handling * comments * add llvm build test * Better option handling * cmake wobble * use UNICODE and _UNICODE * remove other workflows * use ccache * neaten * limit parallel for llvm build * use ninja for build * Windows and Darwin slang-llvm builds * cache key * verbose llvm build * cl on windows * sccache and cl.exe * use cl.exe * Correct package detection * less verbosity * Simplify miniz inclusion * fix build with sccache * Neaten llvm building * neaten * Neaten slang-llvm fetching * more surgical workarounds * Add ci action * Get version from git * better variable naming * add missing include * clean up after premake in cmake * more docs on cmake build * ci wobble * add imgui target * more selective source * do not download swiftshader * Some missing dependencies * only build llvm on dispatch * Disable /Zi in CI where sccache is 
present * simplify * set PIC for miniz * set policies before project * reengage workaround * more runs on ci * Add cmake presets * Add cpack * move iterator debug level to preset * Correct lib flag * simplify action * Neaten cmake init * Add todo * Add simple test wrapper * Add tests to workflow presets * rename packing preset * Correctly set definitions * docs * correct preset names * Make slang-test depend on test-server/test-process * neaten * use workflow in actions * install docs * Correct module install dir * debug dist workflow * Install headers * neaten header globbing * Neaten dependency handling * make lib and bin variables * Do not set compiler for vs builds, unnecessary * docs * allow setting explicit source for target * maintain archive subdir * cmake docs * install headers * place targets into folders * cmake docs * nest external projects in folder * remove name clash * Neater external packages * meta targets in folder structure * cleaner slang-glslang dll * Add missing static directive to slang-no-embedded-stdlib * more robust module copying * make slang-test the startup project * folder tweak * Make FETCH_BINARY the default on all platforms * Set DEBUG_DIR * add natvis files to source * skip spirv tests * remove test step from debug dist * Add build to .gitignore * redo warnings to be more like premake * Update imgui * clean more premake files * Disable PCH for glslang, gcc throws a warning * Add /MP for msvc builds * warning wobble * Add script to build llvm * Add slang-llvm and generators components * Build slang-llvm in ci * comments * fetch llvm with git * better abi approximation for cache * better sccache key * formatting * Correct logic around disabling problematic debug info for ccache * exclude gcc and clang from windows ci * Make dist workflows use system llvm * naming * restore normal dist builds * formatting * run tests in ci * Correct slang-llvm url setting * Rely on the system to find the test tool library * actions matrix wiggle * 
cope with OSX ancient bash * Correct compilers on windows * more ci debugging * Correct rpath handling on OSX * neaten * correct path to slang-llvm * Correct rpath separator on osx * Find slang-llvm correctly * smoke tests only on osx * ci wobble * Give MacOS module a dylib suffix * get swiftshader correctly * cope with bsd cp * remove debug output * full tests on osx * ci wobble * Add some vk tests to expected failures * simplify ci * ci wobble * exclude dx12 tests from github ci * remove cmake code for building llvm * warnings * warnings as errors for cl * spirv-tools in path * add aarch64 ci build * Add SLANG_GENERATORS_PATH option for prebuilt generators * neaten * Correct generator target name * remove yaml anchors because github actions does not support them * Demote CMake in docs Also add info on cross compiling * Restore premake CI * use minimal ci for cmake * Write miniz_export for premake build and .gitignore it * Mention build config tool options in docs * Remove redefined macro for miniz * regenerate vs project
2023-12-06	Support visibility control and default to `internal`. (#3380)	Yong He
	* Support visibility control and default to `internal`. * Fix wip. * Fixes. * Fix. * Fix test. * Add legacy language detection and compatibility for existing code. * Add doc. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-11-16	Unify stdlib `Texture` types into one generic type. (#3327)	Yong He
	* Unify Texture types in stdlib into 1 generic type. * Fixes. * Fix. * Fixes. * Fix reflection. * Fix binding reflection. * Add gather intrinsics. * Fix gather intrinsics. * Fix texture type toText. * Fix intrinsic. * fix cuda intrinsic. * Fix project files. * cleanup. * Fix. * Fix. * Fix sampler feedback test. * Fix getDimension intrinsics. * Fix spirv sample image intrinsics. * Fix test. * Fix GLSL intrinsic. * Cleanup. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-09-27	Various SPIRV fixes. (#3231)	Yong He
	* Various SPIRV fixes. - Geometry shader support (WIP). - Fix texture get dimension and load. - Fold global GetElement(MakeArray/MakeVector) insts. - Call spvopt to inline all functions. - Translate OpImageSubscript. - Emit struct member names and global variable names. - Fix lowering of OpBitNot -> OpNot, instead of OpBitReverse. * Fix test. * Fix geometry shader. * Fix geometry shader emit. * Add atomic Image access test. * Fix tests. * don't fail if spirv-opt fails. * Update comments. * Fix test. * Cleanups. * indentation --------- Co-authored-by: Yong He <yhe@nvidia.com> Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
2023-09-19	Added `[AutoPyBindCUDA]` for automatic kernel binding + `[PyExport]` for exporting type information (#3209)	Sai Praveen Bangaru
	* Initial: add a DiffTensor impl * Auto-binding and diff tensor implementations now work * Refactored diff-tensor implementation + added py-export for struct types * Cleanup * Update slang-ir-pytorch-cpp-binding.cpp * Updated test names * Update autodiff-data-flow.slang.expected * Add more versions of load/store & default generic args for DiffTensorView. * Add diagnostic for default generic arg and more tests * Add more `[AutoPyBind]` tests
2023-09-05	SPIR-V image operations (#3163)	Ellie Hermaszewska
	* Add __truncate and __sampledType for spirv_asm Allows some texture tests to start passing * add __isVector Currently unused * Add 1-vector legalization pass (WIP) * Add capabilities for image types * neaten instruction dumping * add 1-vector test * Add a couple of cases to vec1 legalization * Remove texture tests from expected failures * comment * regenerate vs projects * Remove redundant define form synchapi emulation * refactoring image methods * All sample functions refactored * Remove incorrect glsl intrinsics Partially addresses https://github.com/shader-slang/slang/issues/3174 * __subscript image ops via writing funcs * Extract texture struct writing from core.meta.slang * Abstract out cuda intrinsic * Remvoe erroneous call to opDecorateIndex * spirv asm IR utils * Correct position of loads for SPIR-V asm inst operands * Raise constructors to global scope during spir-v legalization * Correct snippet output * Implement most texture sampling ops for SPIR-V * Legalize 1-vectors for glsl too * Make SPIR-V inst operands non-hoistable * Better 1-vector legalization * Put textures in ptrs for spirv * insert missing break * Add vec1 legalization test * Add some missing pieces to slang-ir-insts * Greatly neaten vec1 legalization * a * Neaten vec1 legalization * Add image read and write intrinsics for spir-v * Squash warnings * regenerate vs projects * Drop redundant guards * Drop 5 tests from expected failure list * Inst numbering changes to cross compile tests * vec1 legalization tests only on vk * Correct location of asm op emit * Inline constant in spirv-asm * Correct signedness for lane in wave intrinsics * Extract element from float1 for cuda * squash warnings * Neaten spirv-emit * dedupe more capabilities * warnings * neaten assert * comments * comments
2023-08-23	Lower all ByteAddressBuffer uses for SPIRV. (#3143)	Yong He
	Co-authored-by: Yong He <yhe@nvidia.com>
2023-08-21	Compile append and consume structured buffers to glsl. (#3142)	Yong He
	* Compile append and consume structured buffers to glsl. * Fix. * Update CI config. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-08-15	Fix bug with overload resolution under nested generics (#3107)	Sai Praveen Bangaru
	* Add test for generic param inference bug for nested generics * Change description & simplify test * Add expected file * Check parent decl before unifying type parameters
2023-08-15	SPIR-V WIP (#3064)	Ellie Hermaszewska
	* Add type layout for structured buffer * Default to generating spirv directly * vk test for compute simple * Add spirv-dis as a downstream compiler * Emit Array types in SPIR-V * makevector for spirv * Dump whole spirv module on validation failure * register array types todo, use emitTypeInst * Neater formatting for unhandled inst printing * break out emitCompositeConstruct * Correct array type generation * neaten * Allow getElement for vector * Remove unused * Allow predicating target intrinsics on types * Consider functions with intrinsics to have definitions We need to specialize these if they are predicated on types * Correct array type generation * makeArray for spir-v * replace getElement with getElementPtr for spirv * Correct translation of field access for spirv * Push layouts to types for spirv * Spirv intrinsics * operator now makes a pointer * Add structured buffer of struct test * Preserve type layout in spirv structured buffer legalization * neaten * makeVectorFromScalar for SPIRV * placeholder for layouts on param groups * More type safe spirv op construction * Know that constants and types only go in one section * Remove emitTypeInst * Add todo for spirv sampling * Add links to spirv documentation on emit functions * OpTypeImage support for SPIR-V * Add simpler texture test for spirv * s/spirv_direct/spirv/g * Allow several string literals in target_intrinsic * Handle global params without a var layour for SPIR-V For example groupshared vars * uint spirv asm type * Add todo for isDefinition It is currently too broad * Some atomic op spirv intrinsics * Strip ConstantBuffer wrappers for spirv * Add todo for matrix annotations * Do not associate decorations insts with spirv counterparts * Correct entry point parameter generation * Spelling * Assert that fieldAddress is returning a pointer * Add error for existential type layout getting to spir-v emit * Add IRTupleTypeLayout Unused so far * Allow getElementPtr to work with vectors * Correct target name 
in test * Hide default spirv direct behind a premake option --default-spirv-direct=true * Do not insert space at start of intrinsic def * Correct asm rendering in tests * remove redundant option * Emit directly from direct test * Add source language options for spirv-dis * Add comments to spirv dis * Add dead debug print for before spirv module * Correct asm rendering in tests * s/spirv_direct/spirv/g * Only specialize intrinsic functions with predicates * regenerate vs projects * squash warnings * squash warnings * remove duplication * Silence warnings from msvc * squash warnings * Overload for zero sized array * More msvc warnings * warnings * Add spirv-tools to path for tests * Do not be specific about dxc version for diag test * Normalize line endings from spirv-dis * Correct filecheck matches * Temporarily disable two spirv tests Failing on CI, undebuggable hang :/ * Do not emit storage class more than once for spirv snippet * Do not pass spir-v to spirv-dis by stdin * Do not get spirv-dis output via stream, use file * normalize file endings in spirv-dis output
2023-08-14	Support per field matrix layout (#3101)	Yong He
	* Support per field matrix layout * Fix warnings. * Fix. * Fix tests. * Fix spirv gen. * Fix. * More test fixes. * Fix. * Run only GPU tests on self-hosted servers. * Remove -use-glsl-matrix-layout-modifier. * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-07-12	Use scratchData on `IRInst` to replace HashSets. (#2978)	Yong He
	* Use scratchData on `IRInst` to replace HashSets. * Update test results. * Initialize scratchData. * Update autodiff documentation. * Use enum instead of bool. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-07-05	Initial sizeof/alignof implementation. (#2954)	jsmall-nvidia
	* Initial sizeof implementation. * Small macro improvement. * Fix some typos. * Refactor NaturalSize. Add more sizeof tests. * Use _makeParseExpr to add sizeof support. * Add size-of.slang diagnostic result. * Fix typo in folding with macro change. * Add a sizeof test of This. * Some more NaturalSize coverage. * Simple alignof support. * Testing for alignof. * Added 8 bit enum to check enums values are correctly sized. * Add alignof to completion. * Lower sizeof/alignof to IR. sizeof/alignof IR pass. Tests for simple generic scenarios. * Make append handle invalid properly. Improve comments. --------- Co-authored-by: Theresa Foley <10618364+tangent-vector@users.noreply.github.com>
2023-06-01	Be lenient on same-size unsigned->signed conversion. (#2913)	Yong He
	* Be lenient on same-size unsigned->signed conversion. * Fix tests. * Use 250. * wip * Fix. * Fix tests. * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-04-29	Minor tidyings around d3d usage (#2854)	Ellie Hermaszewska
	* Remove unused COM annotation * Move SLANG_ENABLE_DXBC_SUPPORT to slang.h * Add DX11 simple compute test * Remove unnecessary COM parameter annotation * Run compute smoke test for DX12 * Ignore d3d11 tests when we do not have fxc * Do not try to find NVAPI on Linux * Add some logs to .gitignore * Minor cleanups in d3d12 headers * Fix tautological comparison (due to integer overflow) * Limit OutputDebugStringA to Windows
2023-04-21	Add warning for returning without initializing out parameter (#2807)	Ellie Hermaszewska
	* Add warning for returning without initializing out parameter * Add unused prelude function to squash uninitialized out variable warnings
2023-04-14	Some small fixes with Windows/DX usage (#2797)	Ellie Hermaszewska
	* Correct case of windows.h includes * Use Slang::SharedLibrary to load directx dlls * s/max/std::max/ * Factor common OS code in calcHasApi * Add DXIL test for compute/simple * s/false/FALSE for calls to WinAPI functions * Factor common OS code in gfxGetAdapters * 2 missing headers d3d12sdklayers for ID3DDebug climits for UINT_MAX * Define out unused function on Linux * Only try to load Vulkan and CUDA on Windows or Linux * simplify D3DUtil::getDxgiModule * Remove WIN32_LEAN_AND_MEAN &co from source files Add a global define * Set WIN32_LEAN_AND_MEAN &friends in headers Restore previous state also * regenerate vs projects
2023-04-11	Implement FileCheck tests for several test commands (#2747)	Ellie Hermaszewska
	* Add missing expected.txt for test * Diagnostics -> StdWriters in render test * Allow specifying several test prefixes to run `slang-test -- tests/foo tests/bar` * Squash warnings in some tests * Enable gfx debug layer in gfx test util Makes this issue present consistently: https://github.com/shader-slang/slang/issues/2766 * Allow DebugDevice to return interfaces instantiated by the debugged object * Check that we actually have a shader cache for shader cache tests * Implement FileCheck tests for several test commands - SIMPLE, SIMPLE_EX - SIMPLE_LINE - REFLECTION, CPU_REFLECTION - CROSS_COMPILE. It does not currently support the render tests or the COMPARE_COMPUTE commands. It is invoked by adding `(filecheck=MY_FILECHECK_PREFIX)` to the test command, for example `TEST:CROSS_COMPILE(filecheck=SPIRV): -target spirv-assembly` * Move LLVM FileCheck interface to slang-llvm * Neaten slang-test tests * Refine handling of expected output in slang-test * Add example FileCheck buffer test * Add cuda-kernel-export tests Which were waiting on FileCheck * Bump vs project files * Make createLLVMFileCheck_V1 return a void* rather than specifically an IFileCheck * Remove use of CharSlice from filecheck interface * Bump slang-llvm version --------- Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>
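The `(filecheck=PREFIX)` mechanism from #2747 relies on prefix-tagged check lines in the test file being matched, in order, against the tool output. The following is a minimal Python sketch of that general idea only; it is not the slang-test or LLVM FileCheck implementation, it handles plain literal patterns only, and the `//` comment style and the sample SPIR-V text are assumptions made for illustration.

```python
from typing import List


def collect_directives(test_source: str, prefix: str) -> List[str]:
    """Return the literal patterns that follow `PREFIX:` in the test file."""
    marker = prefix + ":"
    patterns = []
    for line in test_source.splitlines():
        idx = line.find(marker)
        if idx != -1:
            patterns.append(line[idx + len(marker):].strip())
    return patterns


def filecheck(test_source: str, output: str, prefix: str) -> bool:
    """True if every pattern occurs in `output`, each after the previous match."""
    pos = 0
    for pattern in collect_directives(test_source, prefix):
        found = output.find(pattern, pos)
        if found == -1:
            return False
        pos = found + len(pattern)
    return True


# Hypothetical test file and compiler output, for illustration only.
TEST_SOURCE = """\
//TEST:CROSS_COMPILE(filecheck=SPIRV): -target spirv-assembly
// SPIRV: OpEntryPoint
// SPIRV: OpReturn
"""
GOOD_OUTPUT = 'OpEntryPoint GLCompute %main "main"\nOpReturn\n'
BAD_OUTPUT = 'OpReturn\nOpEntryPoint GLCompute %main "main"\n'

print(filecheck(TEST_SOURCE, GOOD_OUTPUT, "SPIRV"))
print(filecheck(TEST_SOURCE, BAD_OUTPUT, "SPIRV"))
```

The test command line itself contains `SPIRV)` rather than `SPIRV:`, so it is not picked up as a check line in this sketch.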
2023-04-04	Diagnose on using assignment as predicate expr. (#2774)	Yong He
	Co-authored-by: Yong He <yhe@nvidia.com>
2023-04-02	Fix several silently failing tests (#2767)	Ellie Hermaszewska
	* Add missing expected.txt for test * Diagnostics -> StdWriters in render test * Allow specifying several test prefixes to run `slang-test -- tests/foo tests/bar` * Squash warnings in some tests * Enable gfx debug layer in gfx test util Makes this issue present consistently: https://github.com/shader-slang/slang/issues/2766 * Allow DebugDevice to return interfaces instantiated by the debugged object * Check that we actually have a shader cache for shader cache tests --------- Co-authored-by: Yong He <yonghe@outlook.com>
2023-02-16	Overhaul global inst deduplication and cpp/cuda backend. (#2654)	Yong He
	* Overhaul global inst deduplication and cpp/cuda backend. * Update IR documentation. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-02-07	Arithmetic simplifications and more IR clean up logic. (#2632)	Yong He

2023-01-23	Full address insts elimination for backward autodiff. (#2604)	Yong He
	Co-authored-by: Yong He <yhe@nvidia.com>
2022-09-28	Improvements around diagnostic controls (#2414)	jsmall-nvidia
	* #include an absolute path didn't work - because paths were taken to always be relative. * Test for disabling warnings. * Output diagnostic if argument parsing fails in render test. * More improvements around disabling diagnostics. * Add support for re-enabling a warning. * Add warning controls to help text. * Tidy up around NameConventionUtil. * Make NameConvention an enum. * Handle leading underscores. * Update comment, and remove initial handling of _ prefix.
2022-08-17	Warning on lossy implicit casts. (#2367)	Yong He
	* Warning on bool to float conversion. * Fix test cases. * Improve. * LanguageServer: don't show constant value for non constant variables. * Fix tests. * Fix warnings in tests. Co-authored-by: Yong He <yhe@nvidia.com>
2022-06-25	Specialize generic/existential calls within generic functions. (#2294)	Yong He
	* Expose internals of dce and use it to implement call graph walk. * Specialize calls in generic functions. * Fix clang error. Co-authored-by: Yong He <yhe@nvidia.com>
2022-06-08	Improved bounds checking for C++/CUDA (#2263)	jsmall-nvidia
	* #include an absolute path didn't work - because paths were taken to always be relative. * Use TerminatedUnownedStringSlice for literals in output C++. * Remove Escape/Unescape functions used in slang-token-reader.cpp Add target type of 'host-cpp' etc to map to the target types. * Fix some corner cases around string encoding. * Added unit test for string escaping. Fixed some assorted escaping bugs. * Updated test output. * Added decode test. * Stop using hex output, to get around 'greedy' aspect. Use octal instead. * Added HostHostCallable Small changes to use ArtifactDesc/Info instead of large switches. * Fix C++ emit to handle arbitrary function export. * Add options handling for callable without an output being specified. * Can compile with COM interface. Added example using com interface. * Use the IR Ptr type instead of hack in C++ emit for interfaces. * Fix issue with outputting the COM call when ptr is used. * Fix crash issue on compilation failure. * Add support for __global. * Added `ActualGlobalRate` Added special handling around globals and COM interfaces. Tested out in cpu-com-example. * Fix typo in NodeBase. * Support for accessing globals by name working. * Bounds checking for C++ Improved bounds checks for CUDA. * Check that actual global initialization is working. * Fix typo. * Refactor the com replacement such that it doesn't need a cache or do anything special with GlobalVar. * Fix typo in CUDA prelude. * Remove context. Only create replacement if needed. * Split out COM host-callable into a unit-test. * host-callable com testing on C++and llvm. * Comment around the COM ptr replacement. * WIP Zero bound test. * Disable com test on vs 32 bit. Fix C++ prelude * Disable 32 bit targets testing com host-callable. * For now disable zero index test. * Enable bounds checking for CPU/CUDA. * Small fixes. Disable CUDA zero index bound fix. * Add test result for bound check. * Work around for index wrapping issue. * Added Fixed array test. 
* Only enable prelude asserts via SLANG_PRELUDE_ENABLE_ASSERT (unless defined by the user)
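The "stop using hex output ... use octal instead" item in #2263 refers to a property of C/C++ string literals: a `\x` escape consumes every hex digit that follows it, while an octal escape consumes at most three digits. A minimal Python sketch of an escaper that sidesteps the problem with fixed-width octal is shown below; the function name is hypothetical and this is not the code in the Slang repository.

```python
def escape_c_string(data: bytes) -> str:
    """Render bytes as a C string literal using three-digit octal escapes."""
    out = []
    for b in data:
        ch = chr(b)
        if ch in ('"', "\\"):
            # Quote and backslash need a backslash in front of them.
            out.append("\\" + ch)
        elif 0x20 <= b < 0x7F:
            # Printable ASCII is emitted as-is.
            out.append(ch)
        else:
            # Always three octal digits, so a following digit or letter
            # can never be absorbed into the escape.
            out.append("\\%03o" % b)
    return '"' + "".join(out) + '"'


# Byte 0x01 followed by the letter 'f'. Written with hex as "\x01f", a C
# compiler would read a single character 0x1f; the octal form is unambiguous.
print(escape_c_string(b"\x01f"))
```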
2022-05-10	Use IR pass to eliminate phi nodes (#2226)	Theresa Foley
	* Use IR pass to eliminate phi nodes
	"Phi nodes" are one of the key contrivances that make SSA (Static Single Assignment) form work. Because SSA is so great for compiler IRs, we kind of need to deal with phi nodes, but they also get in the way because they don't have a direct analog in most lower-level machine ISAs or execution models, nor in most of the high-level languages a transpiler wants to emit. As a result a compiler like ours needs to be able to eliminate the phi nodes from a program as part of generating output code.
	(For any clever people noting that SPIR-V supports phi nodes directly: yes, it does. It doesn't need to and it probably shouldn't. Anybody involved in the decision-making knows my reasoning, and anybody else should feel free to ask me if they want the lecture. Anyway...)
	The basic idea of eliminating phi nodes is simple enough. We replace each phi node with a temporary variable. Uses of the phi use values loaded from the temporary. The operation of the phi itself (assigning a value based on the branch taken) amounts to an assignment into the temporary.
	Previously, the Slang compiler dealt with phi nodes very late in the process of generating code: in the middle of emitting strings of source code in a high-level language like HLSL or GLSL. Doing the work that late in compilation has two big drawbacks:
	1. Our ability to emit clean and/or optimal code is limited because we may not be able to make certain changes to the IR, or because we cannot make use of additional information like a dominator tree that might be available at other points in compilation.
	2. Any other IR passes that relate to temporary variables won't be able to see the variables that we generate for phi nodes. This could raise issues with correctness (e.g., if we want to compute live-range information for all temporary variables), or performance (we have no way to run additional IR optimization passes after phis are eliminated).
	This change addresses these problems by making the elimination of phi nodes an explicit IR pass. Additional optimizations can easily be run after this pass (although we'd need to be careful not to run passes that could end up introducing new phis). The pass makes use of the information available to it to try to produce code that will emit to "clean" HLSL/GLSL.
	The core of the pass is in `slang-ir-eliminate-phis.cpp`, and is heavily commented, so I won't describe the approach in detail here. There are two related issues that came up, though:
	First, it turned out that our emit logic for local variables (`IRVar` instructions) wasn't using the function we'd defined named `emitVar()`. One worrying consequence of that oversight was that the `precise` modifier would impact generated HLSL/GLSL for variables that turned into SSA values (including phi nodes), but not for local variables that had not been SSA'd (or that had been SSA'd and then de-SSA'd). This change also fixes that bug; it is unclear how widespread the impact of the original issue might be.
	Second, generating explicit IR temporaries for phi nodes exposed a pre-existing bug in the `slang-ir-restructure-scoping` pass. That pass basically detects cases where we have an instruction `I` with a use `U` such that the use follows the rules of SSA form ("def dominates use," meaning `I` dominates `U`), but does not follow the more restrictive scoping rules of high-level-language output (where a value computed "inside" a loop is not automatically visible to code outside the loop just because it dominates that code). That pass did not correctly account for the case where `I` was a temporary variable. It seems that case could not arise before now because we didn't have any passes that would move `var`, `load`, or `store` operations out of the basic block they started in. The fix for that pass was relatively simple, and will make the whole thing more robust in case we add more aggressive optimizations later.
* fixup: expected test output
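The basic replacement described in #2226 (one temporary per phi, a store in each predecessor, a load where the phi used to be) can be sketched on a toy IR as follows. This is a minimal Python illustration under stated assumptions: `Block`, the tuple-based instructions, and the `_tmp` naming are hypothetical and do not reflect Slang's IR or the actual logic in `slang-ir-eliminate-phis.cpp`, which also aims to produce clean HLSL/GLSL.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Block:
    """A basic block in a toy SSA IR (hypothetical, not Slang's IR)."""
    name: str
    # Each phi is (dest, {predecessor_name: incoming_value}).
    phis: List[Tuple[str, Dict[str, object]]] = field(default_factory=list)
    # Ordinary instructions, kept as plain tuples for illustration.
    body: List[tuple] = field(default_factory=list)
    # The branch ending the block, kept separate so stores land before it.
    terminator: tuple = ("return",)


def eliminate_phis(blocks: Dict[str, Block]) -> None:
    """Rewrite every phi into stores in predecessors plus a load at the head."""
    for block in blocks.values():
        loads = []
        for dest, incoming in block.phis:
            tmp = dest + "_tmp"  # one temporary variable per phi
            for pred_name, value in incoming.items():
                # "Assigning a value based on the branch taken" becomes a
                # store at the end of the predecessor's body.
                blocks[pred_name].body.append(("store", tmp, value))
            # Uses of the phi now see a value loaded from the temporary.
            loads.append(("load", dest, tmp))
        block.body[:0] = loads
        block.phis = []


def make_diamond() -> Dict[str, Block]:
    """if/else diamond whose join block merges 1 and 2 into x."""
    return {
        "entry": Block("entry", terminator=("branch_if", "cond", "then", "else")),
        "then": Block("then", terminator=("jump", "join")),
        "else": Block("else", terminator=("jump", "join")),
        "join": Block("join", phis=[("x", {"then": 1, "else": 2})],
                      body=[("use", "x")]),
    }


blocks = make_diamond()
eliminate_phis(blocks)
for b in blocks.values():
    print(b.name, b.body, b.terminator)
```

Because the result is ordinary `store`/`load` instructions, later passes can see and optimize the temporaries, which is the benefit the commit message describes.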
2022-01-25	Add support for HLSL unorm/snorm (#2095)	Theresa Foley
	Read/write resource types (what D3D/HLSL often refer to as UAVs) can be broadly categorized based on whether they require an underlying format (e.g., a `DXGI_FORMAT`) for reads, or not. D3D refers to the ones that require a format as "typed" UAVs (even though a `RWStructuredBuffer<MyData>` is clearly "typed" at the HLSL level). Vulkan refers to these cases as "storage images" and "storage texel buffers."
	Under the D3D model, an application does not have to specify the exact format for a formatted/"typed" UAV in order for loads to work, but it does need to specify if an HLSL resource with a declared `float` or vector-of-`float` element type will be backed by data with a `_UNORM` or `_SNORM` format. This is where the `unorm` and `snorm` type modifiers come in.
	Superficially, it might seem that adding this feature to the Slang compiler is "just" a matter of adding the two modifiers, which is easily done with a pair of one-line `syntax` declarations in `core.meta.slang` plus the corresponding AST node types. Unfortunately the superficial view misses the detail that, to date, Slang has not had any support for type modifiers at all, and has only supported declaration modifiers. The distinction has so far not mattered, even with modifiers like `const` because, e.g., the difference between a "`const` array of `float`" and an "array of `const float`" doesn't really matter.
	So, adding these two modifiers required introducing a lot of infrastructure along the way. Let's walk through what needed to happen:
	* As described above, the actual `syntax` was added easily in the Slang stdlib
	* I added a new subclass of `Modifier` for `TypeModifier`s in the AST, and added the AST nodes for `unorm` and `snorm` as subclasses of that.
	* In order to syntactically support modifiers applied to types (e.g., `unorm float`), I needed to add a `ModifiedTypeExpr` subclass of `Expr` that represents a base type expression with one or more modifiers applied
	* The parser needed some subtle new logic. There are two main cases where type modifiers will come up:
	1. In contexts where we might be parsing a declaration (e.g., `const unorm float a`), we need to support a list of modifiers that might freely mix type modifiers and "declaration modifiers" which are not intended to apply to types. In this case we need to split the list of modifiers into the type-related ones and the declaration-related ones, and attach each subset to the appropriate place. This is very important for features like C-style pointers, where in `static const float* a;`, the `static` modifier applies to the entire declaration of `a`, but the `const` modifier only applies to the `float` type specifier, and not to the outer pointer type (the actual type of `a`).
	2. In contexts where we are not parsing a declaration (e.g., a generic type argument), we need to support a list of modifiers and apply them all to the type specifier being parsed, even if some of them might not be appropriate.
	* While working in the parser I implemented a certain amount of unrelated cleanup for code that was using raw `Modifier`s to represent lists of modifiers, instead of the purpose-built `Modifiers` type.
	The `_parseGenericArg` case needed specific work, because it is an important case in the grammar where we need to parse either a type expression or a value expression, but cannot easily predict which we will see. The fix implemented for now is to always try to parse modifiers and, if we see any, to assume we are in the type case.
	Because of the rules for how modifiers in a C-like language inhere to the type specifier (and not necessarily the entire type), we need to refactor some of the type expression parsing routines to support parsing a "suffix" of a type expression.
	* Note: I decided to be conservative and only make these changes in `_parseGenericArg` because that is the place where it is needed in order for user code with `unorm`/`snorm` to work, but in practice a user could still confuse our parser by using type modifiers as part of a cast (e.g., `x = (unorm float)y;`). While there is currently no reason why a user should want to do this, it does suggest that we need to be prepared to see type modifiers in other ambiguous "expression or type?" contexts. We have so far preferred to avoid looking up built-in syntax declarations like modifiers in expression contexts, because we want to allow users to create variable names that might conflict with some of the more surprising modifier keywords in HLSL (e.g., both `triangle` and `sample` are modifier keywords). A nuanced strategy may be required when we get around to closing this gap (which will be needed around when we want full pointer support, since a cast like `(const SomeType)somePtr` is pretty common).
	In semantic checking, we now need a `visitModifiedTypeExpr`, which visits the base expression to produce a `Type` and then checks each of the `Modifier`s attached to it. During this process we need to translate the AST-level `Modifier`s into something that can exist properly in the universe of `Type`s. We introduce a `ModifiedType` subclass of `Type`, distinct from the `ModifiedTypeExpr` subclass of `Expr`. Furthermore, we introduce a `ModifierVal` subclass of `Val`, distinct from `Modifier`/`TypeModifier`.
	* One unfortunate thing here is that it means we have both, e.g., `UNormModifier` to represent the parsed syntax, and `UNormModifierVal` to represent the `Type`/`Val`-level representation of the same concept.
	It is quite likely that we are near the point where we can/should consider having two distinct AST representations: one for freshly-parsed ASTs and one for semantically-checked ASTs. The `Type`/`Val` hierarchy clearly belongs to the latter.
	* No actual semantic checking is currently being applied to the `unorm` and `snorm` modifiers, although we should in principle check that they are only being applied to `float` and vector-of-`float` types.
	* In an attempt to simplify some of the creation logic and build a tiny bit of reusable infrastructure, I went ahead and added the skeleton of a dedupe-caching system in `ASTBuilder` so that we can easily ensure only a single `UNormModifierVal` and a single `SNormModifierVal` ever get created inside the scope of a single builder.
	* TODO: Thinking about this, I'm now worried the deduplication does not mean I can make the simplifications I currently do in semantic checking by assuming that any two `UNormModifierVal`s will be pointer-identical. This is because we do not currently (IIRC) have the required "bottleneck" in the compiler where all ASTs get serialized after initial checking, and then deserialized when `import`ed into a downstream module, so that every AST node during a checking step comes from a single `ASTBuilder`. Hmm...
	* If we can rely on deduplication to do its thing, then the `Val` and `Type` implementations of modifiers can be relatively simple.
	* TODO: One issue here is that the equality comparison for `ModifiedType` currently checks for the same base type and the same modifiers in the same order. This works for now when we only have a small number of type modifiers and any given type will have at most one, but in the longer run it relies on us to implement some kind of canonicalization scheme, which would both ensure that between `Modified(T, {A, B})` and `Modified(T, {B, A})` only one is allowed (that is, a canonical ordering on modifiers), and that we do not allow `Modified(Modified(T, {A}), {B})`.
* TODO: One other issue is that the `ModifiedType` case does not currently interact correctly with the `as()`-based casting for types (whereas that operation does interact in a semantically-correct fashion with `typedef`s). Fixing this issue in a robust way really depends on us re-architecting the `Type` system so that any `Type` can have modifiers attached, with modifiers affecting type identity/deduplication. * The key place where `ModifiedType` creates a complication in semantic checking is type conversion/coercion. A user is likely to declare a `RWTexture2D<unorm float>`, fetch from it (producing a value of type `unorm float`) and then assign the result to a `float` variable, prompting for a conversion from `unorm float` to `float` (because they are distinct `Type`s). * We handle this case in the core `_coerce()` operation by checking if either `toType` or `fromType` is a `ModifiedType`. If either one is a modified type, we apply logic to check for modifiers that are present on one and not the other. Basically we check which modifiers need to be "dropped" and which need to be "added" during conversion, and validate that these modifiers can be dropped/added without creating a semantic error. The only type modifiers we support right now can be dropped/added like this, so we are fine. * TODO: When we add more complete pointer support, we may need logic here to validate when casts between, e.g., `const int` and `int` should/shouldn't be allowed. * Note: Even opening the door to type modifiers at all creates the same kind of challenges for user-defined generic types (and functions!) since `MyType<int>` and `MyType<const int>` are distinct instantiations in a future where we support `const` as a type modifier. We may need to plan to restrict where modified types can be used, so that certain built-in generic types support modified types as arguments, but user-defined types don't (or at least might need to opt-in to get support). 
* The result of a `_coerce()` that drops/adds modifiers is a `ModifierCastExpr`, which is a kind of no-op AST node that merely expresses that the conversion is allowed and valid. * In IR lowering we currently do the simple thing and translate a `ModifiedType` to a distinct IR node called `AttributedType`. * The change in terminology from "modifier" to "attribute" is to follow the way that these kinds of modifiers best map to the `IRAttr` case in the IR (rather than the `IRDecoration` case). We probably ought to do a careful terminology scrub here, because having this terminology mismatch between IR and AST could be a source of confusion. * TODO: In principle, using `IRAttributedType` creates the same basic problems as using `ModifiedType`: code that is using `as()` or similar operations to check for a specific subclass of `IRType` may not see the case they were looking for due to use of `IRAttributedType`. * Initially I had hoped to avoid the problem by having the `IRAttr`s be attached directly as operands to an otherwise-ordinary `IRType`. E.g., a lowered `unorm float4` would be an `IRVectorType` with an "extra" operand that is an `IRUNormAttr`, something like: `Vector<Float, 4, UNorm>`. This sounds great (and looks great!), but runs into the problem that it is incompatible with the way we currently represent things like generic type parameters. A generic type parameter `T` is represented as an `IRParam`, and it does not make sense to have an additional `IRParam` to represent `const T` or `unorm T`, etc. * The Right Way to solve this stuff at both the AST and IR levels is to avoid passing around bare `Type` or `IRType` in general, and instead use a value type that implements the needed policy more directly: something like a `TypeHolder` or `IRTypeHolder` (placeholder name). 
The `Holder` type would abstract over the various "wrapper" nodes required to store all the additional data like attributes but, importantly, would *not* allow that extra information to be dropped or lost during operations like casting (e.g., note how the current `Type` implementation of `as()` loses information on `typedef` names, making our error messages slightly worse). This is actually quite similar to how we currently use the `DeclRef<T>` system to allow working with what is usually a `T` under the hood, but in a way that ensures we don't lose track of any generic substitution information. During C-like code emit we have a process that turns an `IRType` into a chain of declarators as needed to emit a C-like declaration with pointers, arrays, etc. The `IRAttributedType` case needs to get folded into this logic. Basically, when we see an `IRAttributedType` we immediately emit any modifiers that are required to be in a prefix position, then recursively emit the underlying type with an extra layer of declarator that tracks the modifiers, so that we can emit any modifiers that should be placed in a postfix position after the type. As a specific example, our C/C++ back-end would want to use the postfix option to handle `const`, because then it can properly emit stuff like `int const * const ` and not the incorrect `const const int`. The HLSL emit logic overrides the prefix case for handling type attributes, and uses it to emit `unorm` and `snorm` where they occur. * One unfortunate detail is that (apparently) some downstream HLSL compilers do not allow the `unorm`/`snorm` modifiers to apply to `vector<float, >` types, even though that should be semantically valid. Instead, they only support `float`, `float2`, `float3`, and `float4` explicitly. To work around this issue, we go ahead and change our HLSL emit logic so that when we encounter 1-to-4 component vectors of `float`, `int`, or `uint` we emit the type name using the typical HLSL shorthand. 
This is actually a significant change in our HLSL output, but it both seemed like a good fix to have anyway, and was also the only obvious way to address the downstream parser shortcomings without a massive kludge. As a result of this change the `half-texture.slang` test broke, since it was using raw HLSL as the expected output. I changed the test to do a DXIL comparison instead, which is our preferred way of testing cross-compilation behavior (since it is more robust in the face of small changes to our source output).
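The modifier handling in `_coerce()` described in this commit message (work out which modifiers are "dropped" and which are "added", then validate each one) can be sketched roughly as follows. This is a minimal illustration and not the actual Slang implementation: `ModifierKind`, `ModifierSet`, `diffModifiers`, `canFreelyDropOrAdd`, and `isModifierCastAllowed` are hypothetical names, and the rule that rejects `Const` is only a placeholder for the pointer-related validation the message leaves as a TODO. Storing modifiers in an ordered set also gives the canonical ordering the message asks for, so `{A, B}` and `{B, A}` compare equal.

```cpp
#include <algorithm>
#include <iterator>
#include <set>

// Hypothetical stand-ins for type-level modifier values; these are not
// the real Slang classes.
enum class ModifierKind { UNorm, SNorm, Const };

// An ordered set stores modifiers in one canonical order, so two sets
// with the same members always compare equal.
using ModifierSet = std::set<ModifierKind>;

struct ModifierDiff
{
    ModifierSet dropped; // present on the source type only
    ModifierSet added;   // present on the destination type only
};

// Compute which modifiers a conversion from `from` to `to` drops and adds.
ModifierDiff diffModifiers(const ModifierSet& from, const ModifierSet& to)
{
    ModifierDiff diff;
    std::set_difference(from.begin(), from.end(), to.begin(), to.end(),
                        std::inserter(diff.dropped, diff.dropped.end()));
    std::set_difference(to.begin(), to.end(), from.begin(), from.end(),
                        std::inserter(diff.added, diff.added.end()));
    return diff;
}

// Assumption for this sketch: only `unorm`/`snorm` may be dropped or
// added freely; `Const` is rejected as a conservative placeholder.
bool canFreelyDropOrAdd(ModifierKind kind)
{
    return kind == ModifierKind::UNorm || kind == ModifierKind::SNorm;
}

// A conversion is allowed when every dropped and every added modifier
// passes the policy check above.
bool isModifierCastAllowed(const ModifierSet& from, const ModifierSet& to)
{
    const ModifierDiff diff = diffModifiers(from, to);
    auto allFree = [](const ModifierSet& modifiers) {
        return std::all_of(modifiers.begin(), modifiers.end(), canFreelyDropOrAdd);
    };
    return allFree(diff.dropped) && allFree(diff.added);
}
```

Under these assumptions, converting `unorm float` to `float` corresponds to `isModifierCastAllowed({ModifierKind::UNorm}, {})`, which drops one freely-droppable modifier and is accepted.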
2021-11-11	Fix two test cases that were causing problems (#2011)	Theresa Foley
	First, we have a CUDA-only test that simply needed a format name to be changed to match the new conventions in `gfx`. Second, we have one of the "active mask" tests that seems to produce different results locally for developers (under Vulkan) than it does on CI. This is almost certainly down to differences in GPUs and/or drivers. The inconsistency ultimately proves the point that I was trying to make when I wrote those tests - the "active mask" concept is effectively meaningless as exposed in D3D and Vulkan because it has not been specified in a way that allows programmers to reason about its value, and drivers have implemented wildly different interpretations of its supposed semantics for so long that there is no real hope of turning `WaveGetActiveMask()` into something that returns a well-defined value in any but the most trivial cases. TLDR: I disabled that test for Vulkan, which means it is completely disabled.
2021-10-26	Runs all gfx unit tests through a 'test proxy' (#1981)	jsmall-nvidia
	* #include an absolute path didn't work - because paths were taken to always be relative. * Support for test proxy. * Turn on testing using proxy. * Don't pass sink into check of downstream compiler. * Small change to kick off build. * Remove register specification on transcendental. * Increase poll timeout. Small improvements to proxy. * Disable gfx unit tests. * Put test runner in shared library mode by default. * Change comment. Kick off another CI test. * Small edit to kick off builds. * Run unit tests on proxy. * Turn on using proxy for now. * Enable swift shader. * Fix typo. Add exception support. * Make the default spawn type SharedLibrary. Use isolation for gfx unit tests. * Update slang-binaries. * Fix typo. * Report unit test output information.
2021-10-26	Expanded gfx::Format to include additional formats (#1982)	lucy96chen
	* Format list updated with additional formats supported by both D3D and Vulkan; D3DUtil::getMapFormat() and VkUtil::getVkFormat() updated to include additional formats; GFX_FORMAT() updated with all additional formats (BC compression unfinished) * Finished updating GFX_FORMAT with newly added formats and sizes; Pixel size is now tracked using the FormatPixelSize struct containing the values for bytes per block and pixels per block to accommodate BC formats; Updated gfxGetFormatSize and associated sub-calls to return FormatPixelSize instead of uint8_t; Most calls to gfxGetFormatSize() updated to reflect changes, a couple calls still unupdated * Changes to accommodate new formats finished, debugging slang-literal unit test * First format unit test working * One test added for BC1Unorm and RGBA8Unorm_SRGB, both passing * Refactored format testing code to merge BC1Unorm and RGBA8Unorm SRGB into a single file * All unit tests added for BC and Srgb formats * Most tests added and working; Added five additional formats (still need tests) and made the appropriate changes to support these; createTextureView() modified for D3D11, D3D12, and Vulkan to take into account the format specified in the texture view desc when the texture's format is typeless * Format enums renamed to more closely match their D3D counterparts; Added a universal float and uint buffer and buffer view for use across all Format tests * Remaining tests added; D3D12 tests pass, but Vulkan crashes in BC1_UNORM and D3D11 spits out a bunch of D3D11 Errors (but supposedly passes) * re-run premake * Added Sint versions of test shaders; Vulkan and D3D11 tests also pass * Size struct for format unit tests no longer use initializer lists * Fixed a Size struct missed in the previous pass * Fixed minor bugs causing tests to fail * Added documentation detailing all currently unsupported formats * Skip tests causing unsupported format warnings due to swiftshader * updated several test using old Format enum names * 
Revert change to compareComputeResult() that was added for debugging purposes * DEBUGGING: Added prints to identify which formats are failing on CI * Reverted attempted debugging changes; Fixed texture2d-gather.hlsl to use updated Format enums * Fixed incorrect array sizes in d3d11 _initSrvDesc() * Commented out further tests that produce unexpected results when tested for Vulkan with swiftshader * Revert "Merge branch 'expanded-format-support' of https://github.com/lucy96chen/slang into expanded-format-support" This reverts commit 20008f0d3ecc3b1405ecac8c138edaa3cd37ed6b, reversing changes made to 6081e95827315fee50e18409394d5abd62fac787. * Added a fuzzy comparison function for use with floats * submodule update * Revert messed up changes caused by previous revert after automatically merging on github
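The reason this commit tracks pixel size as "bytes per block" plus "pixels per block" is that block-compressed (BC) formats do not have a whole number of bytes per pixel. A minimal sketch of how such a struct could be used is shown below. The struct layout mirrors the description in the message, but the field names, the constants, and `storageSizeInBytes` are assumptions made for illustration, not the actual `gfx` API. The numeric values are the standard ones: RGBA8 uses 4 bytes per pixel, and BC1 stores each 4x4 block of pixels in 8 bytes.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical mirror of the FormatPixelSize struct described in the
// commit message; the field names are assumptions.
struct FormatPixelSize
{
    std::size_t bytesPerBlock;  // bytes occupied by one block
    std::size_t pixelsPerBlock; // pixels covered by one block
};

// Uncompressed RGBA8: every "block" is a single 4-byte pixel.
constexpr FormatPixelSize kRGBA8UnormSize{4, 1};

// BC1: each 4x4 block (16 pixels) is stored in 8 bytes.
constexpr FormatPixelSize kBC1UnormSize{8, 16};

// Storage needed for `pixelCount` pixels. This sketch assumes the pixel
// count is a whole number of blocks; real code would also need to round
// each texture dimension up to the block width and height.
std::size_t storageSizeInBytes(FormatPixelSize size, std::size_t pixelCount)
{
    assert(pixelCount % size.pixelsPerBlock == 0);
    return pixelCount / size.pixelsPerBlock * size.bytesPerBlock;
}
```

For a 64x64 texture this gives 16384 bytes for RGBA8 and 2048 bytes for BC1, a ratio that a single `uint8_t` bytes-per-pixel value could not express.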
2021-10-21	Passing associated type arguments to existential parameters + packing for `bool`. (#1987)	Yong He
	* Passing associated type arguments to existential parameters + packing for `bool`. * fix typo Co-authored-by: Yong He <yhe@nvidia.com>
2021-10-04	Removing exceptions from core/compiler-core (#1953)	jsmall-nvidia
	* #include an absolute path didn't work - because paths were taken to always be relative. * Refactor Stream. Working on all tests. * Split out CharEncode. * Make method names lower camel. m_prefix in Writer/Reader * Tidy up around CharEncode interface. * Small improvements around encode/decode. * Better use of types. * Remove readLine from TextReader. * Remove exceptions from Stream/Text handling. * Fix some typos. * Fix tabbing. * Fix missing override. * Remove remaining exception throw/catch via using signal mechanism. * Remove exceptions that are not used anymore. * Document the Stream interface. * Remove index for decoding 'get byte' function. * Fix CharReader -> ByteReader.
2021-09-13	Bug fix in 16bit type emit, vk validation error fix. (#1936)	Yong He
	+ Implement bit_cast between float16 and uint16 in GLSL. + Enable pack-any-value-16bit test on vk. Co-authored-by: Yong He <yhe@nvidia.com>
2021-09-09	`reinterpret` and 16-bit value packing. (#1933)	Yong He
	* `reinterpret` and 16-bit value packing. * Update `half-texture` cross-compile test reference result. * Revert inadvertent reformatting of slang-ir-inst-defs.h Co-authored-by: Yong He <yhe@nvidia.com>
2021-09-03	Fix crash: dynamic dispatch of generic interface method. (#1929)	Yong He
	* Fix crash: dynamic dispatch of generic interface method. * Fix memory error. Co-authored-by: Yong He <yhe@nvidia.com>
2021-08-26	Add API to control interface specialization. (#1925)	Yong He

2021-08-25	Add `createDynamicObject` stdlib function. (#1923)	Yong He
	This function takes a user provided `typeID` and arbitrary typed value, and turns them into an existential value whose `witnessTableID` is `typeID` and whose `anyValue` is the user provided value. This allows the users to pack the runtime type id info in arbitrary way.
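As a rough illustration of what `createDynamicObject` is described as doing, the C++ sketch below pairs a user-provided type ID with the raw bytes of a value. It is a conceptual model only and not the stdlib implementation: `AnyValue16`, `DynamicObject`, `createDynamicObjectSketch`, and `unpackAnyValue` are hypothetical names, and the 16-byte payload capacity is an arbitrary choice for the example.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical fixed-size payload; the 16-byte capacity is arbitrary.
struct AnyValue16
{
    unsigned char bytes[16];
};

// Conceptual model of an existential value: an ID that selects the
// witness table, plus the packed bytes of the underlying value.
struct DynamicObject
{
    std::uint32_t witnessTableID;
    AnyValue16 anyValue;
};

// Pack `value` into the payload and tag it with the caller's `typeID`.
template <typename T>
DynamicObject createDynamicObjectSketch(std::uint32_t typeID, const T& value)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "only plain data can be packed byte-for-byte");
    static_assert(sizeof(T) <= sizeof(AnyValue16),
                  "value does not fit in the payload");
    DynamicObject result{}; // zero-initialize unused payload bytes
    result.witnessTableID = typeID;
    std::memcpy(result.anyValue.bytes, &value, sizeof(T));
    return result;
}

// Recover the packed value; the caller must know the original type.
template <typename T>
T unpackAnyValue(const DynamicObject& object)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "only plain data can be unpacked byte-for-byte");
    static_assert(sizeof(T) <= sizeof(AnyValue16),
                  "value does not fit in the payload");
    T value;
    std::memcpy(&value, object.anyValue.bytes, sizeof(T));
    return value;
}
```

In this model the caller decides how type IDs are assigned, which matches the message's point that users can pack the runtime type ID information however they like.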
2021-07-21	Work to mitigate SPIR-V bloat (#1914)	Theresa Foley
	* Work to mitigate SPIR-V bloat SPIR-V is not an especially compact format, but some patterns in how Slang generates code and then runs it through `spirv-opt` lead to many redundant field-by-field copy operations being emitted. This change attempts to address some of the resulting bloat from the Slang side of things. Note: experimentation shows that the bloat is less pronounced when running either no SPIR-V optimizations or full SPIR-V optimizations, so it is also likely that the bloat should be addressed by changing which `spirv-opt` passes the Slang compiler runs in default (`-O1`) builds. Such changes should come as a distinct pull request. This change primarily does two things: First, the code generation strategy for passing arguments to `out` and `inout` parameters has been changed. In the past, the compiler would always copy the argument value into a temporary, then pass the address of the temporary, and then write back the value after the call. The new code generation strategy attempts to identify when an argument value already has a simple address in memory and passes that address directly when possible. This eliminates many copy operations that occur before/after calls to functions with `out`/`inout` parameters. Second, we introduce an IR optimization pass that detects call sites where the entire contents of a buffer (usually a constant buffer) is being passed to a callee function, such that many bytes are loaded and then passed even if only very few are used in the callee. The pass moves the load operations from the caller to a specialized version of the callee where possible (e.g., when the constant buffer in question is a global shader parameter). Doing this eliminates another major category of copies. Notes: * The IR lowering logic is complicated by the fact that several kinds of l-values (values that are usable as the destination of assignment, or for `out`/`inout` arguments) are not actually addressable. 
An easy example is a non-contiguous swizzle like `v.xwz` on a `float4`, where the value occupies 12 bytes, but not 12 consecutive bytes with a single address. There are many more corner cases like that and the IR lowering pass carries a lot of complexity to deal with them. A more systematic overhaul is due some time soon. * The IR representation of `out` and `inout` parameters deserves some careful scrutiny when making these kinds of changes. The official semantics of `inout` in HLSL has been "copy in copy out" (and `out` is just "copy out") which is observably different from any solution that passes in the address of an l-value directly. By making this change we are saying that Slang's semantics are not precisely those of legacy HLSL, and that our semantics for `inout` parameters are closer to those of `inout` in Swift or of a mutable borrow in Rust. In the Swift case the implementation can freely pass the underlying storage of an l-value or the address of a temporary, and valid programs may not observe the difference. It is thus illegal to observe the value in a storage location while a mutation to that location is "in flight." All of this is way more detailed and technical than 99% of Slang users will ever care about, but importantly it gives us semantic cover to eliminate these copies in the IR, and also to emit output C++ code that implements `out` and `inout` as by-reference parameter passing. * There was an existing generic pass for specializing functions based on call sites that uses a "template method" style of pattern to customize its behavior. That pass needed to be generalized to handle this use case because it had previously operated on the assumption that the "desire" to specialize a callee function must be driven by the parameter declarations of that function, and not on the argument values passed in. The code has been slightly refactored to allow the policy for specialization to consider both parameters and arguments. 
* Unsurprisingly, a bunch of the GLSL (and thus SPIR-V) generated has changed with this work, so several baseline `.slang.glsl` files needed to be updated. * This change is incomplete in that it does not address broader cases of buffer loads, including both partial loads from constant buffers (just loading one field, but a field that uses a "large" structure type), and loads from multi-element buffers (a load from a structured buffer where the element type is "large"). The main question in each of those cases is how to define how "large" a structure needs to be before we decide to try and sink loads into callee functions like this. In the worst case, sinking loads in this way may actually create more memory traffic (because the same values get loaded in multiple callee functions). * fixup: run premake * fixup: typo
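The observable difference between "copy in copy out" and by-reference passing mentioned in this commit message can be demonstrated with a small C++ model. This is only an illustration of the semantics being discussed, not code from the Slang compiler; all names are hypothetical. The callee modifies its parameter and then reads a global that happens to be the same storage as the argument, which is exactly the kind of access the message says valid programs must not rely on.

```cpp
// A global that is also passed as the `inout` argument below.
int gCounter = 0;

// Models a callee with an `inout` parameter that also reads the global
// while the mutation of the parameter is still "in flight".
int addOneAndReadGlobal(int& param)
{
    param += 1;
    return gCounter; // what the callee observes depends on the strategy
}

// By-reference strategy: the callee writes straight into gCounter, so it
// observes its own update (returns 11).
int callByReference()
{
    gCounter = 10;
    return addOneAndReadGlobal(gCounter);
}

// Legacy "copy in copy out" strategy: the callee works on a temporary,
// so the global still holds the old value during the call (returns 10).
int callCopyInCopyOut()
{
    gCounter = 10;
    int temp = gCounter;                      // copy in
    int observed = addOneAndReadGlobal(temp); // mutate the temporary
    gCounter = temp;                          // copy out
    return observed;
}
```

Both strategies leave `gCounter` at 11 after the call; only the value observed during the call differs. Treating such mid-call observation as invalid is what lets the compiler choose either strategy freely.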
2021-07-09	Enable testing with Swiftshader. (#1906)	Yong He

2021-07-08	Allow render-test to run inline ray tracing tests. (#1903)	Yong He
	* Update VS projects to 2019. * Empty commit to trigger build * Implement gfx inline ray tracing on D3D12. * Allow render-test to run inline ray tracing tests. Co-authored-by: Yong He <yhe@nvidia.com>
2021-06-09	Enable some VK texture tests (#1878)	jsmall-nvidia

2021-06-08	Fix RWTexture issues on CUDA (#1876)	jsmall-nvidia
	* #include an absolute path didn't work - because paths were taken to always be relative. * Re-enable CUDA RWTexture tests. Re-enable RWTexture1D test. Make sure tests have only a single mip for RWTexture (required for CUDA). * Fix issue with reading CUDA surface. Re-enable working CUDA RWTextureTest. Enable 1D case.
2021-06-06	Fixed issue around 4xFloat16 texture on CUDA (#1874)	jsmall-nvidia
	* #include an absolute path didn't work - because paths were taken to always be relative. * Fixes around Float16. Incorrect calculation of 'elementSize'.
2021-06-02	Various Fixes to gfx, reflection and emit. (#1867)	Yong He
	* Various Fixes to gfx, reflection and emit. - Fix GLSL emit to properly output `bitsTo` functions for `IRBitCast` insts. - Add line directive mode setting for `ISession`. - Extend `TypeLayout::getElementStride` to handle `VectorType` case. - Fix the D3D12 implementation of `IDevice::readBufferResource` to copy only the requested bytes out. - Fix `render-test` to use the `ISession` from `gfx` instead of creating its own `ISession` to make sure `gfx` and `render-test` agree on WitnessTable and RTTI IDs. - Extend `render-test` to support filling vector and matrix values in the new `set x = ...` TEST_INPUT syntax. - Add a `dynamic-dispatch-15` test case to make sure packing / unpacking works correctly across all targets, and to make sure render-test's RTTI/WitnessTable ID filling logic is correct for non-trivial cases. * Remove default-major test * Fix cyclic reference in `ExtendedTypeLayout`. * Move `lineDirectiveMode` setting to `TargetDesc`. Add `structureSize` to `TargetDesc` and `SessionDesc` for future binary compatibility. * Cleanup. Co-authored-by: Yong He <yhe@nvidia.com>
2021-05-25	Rework shader object specialization control interface. (#1857)	Yong He

2021-05-25	Allow overriding specialization args via `IShaderObject`. (#1854)	Yong He
	* Allow overriding specialization args via `IShaderObject`. * Fixes. Co-authored-by: T. Foley <tfoleyNV@users.noreply.github.com>
2021-05-21	[gfx] Support StructuredBuffer<IInterface>. (#1851)	Yong He
	Co-authored-by: T. Foley <tfoleyNV@users.noreply.github.com>