slang.git - Making it easier to work with shaders

Age	Commit message (Collapse)	Author
2024-06-04	Use memcpy to replace strncpy_s (#4270)	kaizhangNV
	Use memcpy to replace strncpy_s in SlangProfiler::SlangProfiler to fix the error in Windows.
2024-06-04	Add APIs to get profile of compile time (#4242)	kaizhangNV
	* Add APIs to get profile of compile time Add serial time measurement Add profiler to measure lots of stages in slang compilation, and it can accumulate the time spent in each thread in multi-threads case and finally report a serial timing info. * Add invocation times to the profiler * Simplify the profiler and provide a 'clear' option Change the profiler design to only return the thread_local profiler to user. We create a ISlangProfiler interface to carry the thread_local variable PerformanceProfilerImpl profiler to user. In addition, we provide a new option in the input parameter to control whether or not user want to clear the previous profile data. So spGetCompileProfile() can always returns a fresh new profiling data. * Change to use slang container List Stop using std::vector, instead use slang's container List. Generate a UUID for ISlangProfiler
2024-05-31	Capabilities generator inclusive join and misc (#4237)	ArielG-NV

2024-05-29	Add options to speedup compilation. (#4240)	Yong He
	* Add options to speedup compilation. * Fix. * Plumb options to DCE pass. * Revert debug change. * Fix regressions. * More optimizations. * more cleanup and fixes. * remove comment. * Fixes. * Another fix. * Fix errors. * Fix errors. * Add comments.
2024-05-28	Print memory leak info in Debug build of slangc.exe (#4210)	Jay Kwak
	When memory leak is detected, this commit will dump the information about the memory leak. This feature is available only in Debug build on Windows platform. Also note that the message will not be printed on the client applications that use slang.dll, because the printing happens as a part of slangc.exe not slang.dll. I found a bug that Slang::StdWriters was closing `stdout` and `stderr` in its destructor, which prevented Crt functions to print the messages to `stdout` and `stderr`.
2024-05-24	Fix clang-18 build (#4222)	exdal
	* Update slang-performance-profiler.cpp * modified: source/core/slang-performance-profiler.cpp * reviews --------- Co-authored-by: Jay Kwak <82421531+jkwak-work@users.noreply.github.com>
2024-05-16	Capabilities System, CapabilitySet Logic Overhaul (#4145)	ArielG-NV
	* Capabilities System, Backing Logic Overhaul Fixes #4015 Problems to address: 1. Currently the capabilities system spends anywhere from 25-50% of compile time on the CapabilityVisitor. Most of this time is spent on join logic: 1. Finding abstract atoms 2. Comparing list1<->list2. This should and can be made significantly faster. 2. Error system does not produce errors with auxiliary information. This will require a partial redesign to provide more useful semantic information for debugging. What was addressed: 1. Array backed `CapabilityConjunctionSet` was replaced in-favor for a `UIntSet` backed `CapabilityTargetSets`. The design is described below. Design: * `CapabilityTargetSets` is a `Dictionary<targetAtom, CapabilityTargetSet>`. This is not an array for 2 reasons: 1. Easy to figure out which target is missing between two `CapabilityTargetSets` 2. To statically allocate an array requires the preprocessor to manually annotate which Capability is a target and link that Capability to an index. This means a dictionary is required for lookup regardless of implementation. * `CapabilityTargetSet` is an intermediate representation of all capabilities for a singular `target` atom (`glsl`, `hlsl`, `metal`, ...). This structure contains a dictionary to all stage specific capability sets for fast lookup of stage capabilities supported by a `CapabilitySet` for a `target` atom. This reduces number of sets searched. * `CapabilityStageSet` is an intermediate representation of all capabilities for a singular `stage` atom (`vertex`, `fragment`, ...). This structure holds all disjoint capability sets for a `stage`. A disjoint set is rare, but may exist in some scenarios (as an example): `{glsl, EXT_GL_FOO}{glsl, _GLSL_130, _GLSL_150}`. This reduces the number of sets searched. * `UIntSet` is the main reason for the redesign for better performance and memory usage. All set operations only require a few operations, making all set logic trivial and with minimal cost to run. All algorithms were modified to focus around `UIntSet` operations. 2. Errors * Semantic information are now better linked to the calling function to provide a connection of function<->function_body for when saving semantic information for errors. * Missing targets now print errors much like other error code by finding code which could be a cause of incompatibility. What is missing: 1. Add non naive support for non-stage specific capabilities such as `{hlsl, _sm_5_0}`. Currently non stage specific targets emulate the behavior through assigning such capabilities to every stage: `{hlsl, _sm_5_0, vertex} {hlsl, _sm_5_0, fragment}...`. Removal of this behavior would remove redundant shader stage sets being made at construction time (~80% of new implementation runtime). This is an addition, not an overhaul. 2. Optionally: `UIntSet` should be modified to support SIMD operations for significantly faster operations. This is not required immediately since `UIntSet` is already not a performance constraint. Notes: * UIntSet had implementation bugs which were fixed in this PR. * The old capabilities system had bugs which were fixed in this PR when transforming to the new implementation. * fix .natvis debug view * Small optimizations I found while working on the addition the AST building pass looks like so now: 1% = ~capabilitySet 2% = capabilitySet() 1.5% capabilitySet::unionWith() 0.8% capabilitySet::join() 1.5% auxillary info for debugging ~0.5-1% extra visitor overhead ~5% total for the visitor ~6.5% for total runtime costs * fix caps which were wrong but worked * push minor syntax fix (still looking for why other tests fail) * perf & bug fixes 1. did not properly remake isBetterForTarget for this->empty case with that as Invalid. This is best case in this senario. 2. Remade seralizer for stdlib generation. Faster (more direct) & cleaner code. NOTE: did not address review comments * fix glsl.meta caps error * fixing findBest logic again & UIntSet wrapper findBest was not checking for 'more specialized' targets & was element counter was flawed * faster getElements algorithm + natvis for UIntSet + wrong warning * type incompatability of bitscanForward implementations * try to fix warnings again * remove ptr for clang intrinsic * add missing header * ifdef to allow clang compile * compiler hackery to fix up platform/type independent operations * bracket * fix MSVC error * missing template * change types out again * changes to fix compiling * adjustment to parameter for Clang/GCC * added iterator to delay processing all atomSets of a CapabilitySet * add a few missing consts's * ensure we never have more than 1 disjointSet Added a wrapper + assert + union functionality to all possible disjoint sets. This was done in favor of a removal of the LinkedList for 2 reasons: 1. We still need 0-1 set functionality. 2. Might as well keep the code, just disallow the problematic functionality. * address review comments non linked-list refactor review comments addressed; add doc comments + remove redundant code * comments + remove isValid for bool operator * push removal of linkedlist for capabilities * add missing break * address review comments minor adjustments of syntax * push a fix to the `CapabilitySet({shader, missing target})` code * quality + error 1. add iterator to UIntSet 2. do not specialize target_switch if profile is derived from case (GLSL_150 is not compatable with GLSL_400) * fix target_switch erroring + temporarily remove UIntSet::Interator temporarily remove UIntSet::Interator. It will be added after, testing code on CI first so I can multi-task fixing the UIntSet Iterator * fix the UIntSet iterator * Revert "fix the UIntSet iterator" temporarily to pull from master * add metal error as per texture.slang (took a while I realize this was why things were breaking, likely should adjust errors to reflect this) * Rework UIntSet to have a template for output type This is done so it is reasonable to debug the iterator output and not just dealing with messy int's Fix problems with the iterators implemented + invalid capabilities handling * removed incorrect `__target_switch` capability barycentric was being used with anticipation of `profile glsl450`, this does not expand into `GL_EXT_fragment_shader_barycentric`, this instead caused an error which is hidden during cross-compile. * remove some uses of getElements * remove undeclared_stage for now * remove redundant code associated with `undeclared_stage` * remove unused variable * address review specifically to note removed static in a thread dangerous scope. Now using a `const static` for read only (thread safe) which precompile steps generate * move GLSL_150 capdef change to sm_4_1 (more accurate) * address most review comments did not address: https://github.com/shader-slang/slang/pull/4145#discussion_r1602256776 * revert incorrect code review suggestion * push changes for all code review suggestions
2024-05-14	Slang: Support UTF-8 with Byte Order Markers (#4135)	cheneym2
	Slang APIs are documented as taking UTF-8 encoded shader source, though it's not explicitly documented whether it is allowed to include a BOM (Byte Order Marker). This change adds support for UTF-8 BOM markers by virtue of disposing of BOM data. As a bonus, UTF-16 input which can cleanly decode to UTF-8 is now also accepted. Throwing out the BOM on input is done by leveraging existing functionality in "determineEncoding()", however a bug exists there for null-terminated single character input, where the null byte caused a heuristic to guess UTF-16, even though the null byte isn't part of the string. The bug in "determineEncoding" is fixed by only guessing when bytes >= 2 and not looking past the end of the buffer. The 'implicit-cast' test was mistakenly relying on the bug to pass, as its expected file was being read as UTF16 and cropped to zero length due to the bug. The expected output of implicit-cast is updated to pass with the bug fix in place. The decoding of UTF-16 to UTF-8 is done through an existing 'decode' method. This change fixes a bug in UTF16-LE 'decode' where it was decoded as if it were Big-Endian. Adds 3 small tests to ensure the compiler doesn't choke on source files in UTF-8 (with BOM), UTF16-LE, or UTF16-BE. Bonus: Fixes a bug in diagnostic reporting where hex values were incorrectly translated to text, leading to incorrect, possibly truncated strings. Fixes #4046 Co-authored-by: Yong He <yonghe@outlook.com>
2024-05-10	Fix race-condition and visual artifacts issues (#4152)	kaizhangNV
	* Fix race-condition and visual artifacts issues In PerformanceProfiler::getProfiler() we return a static object for the profiler implementation, this is not thread-safe, so change it to thead_local. There is still some visual artifacts when using slang as the shading language. We don't know the root cause yet, but found out it's related to our loop inversion algorithm. So stage this feature for now, and turn it into an internal option and default off. We will re-enable it after more investigation on this optimization. File an new issue 4151 to track it. * Add '-loop-inversion' to the few tests
2024-05-09	fix typo (#4144)	Tomáš Pazdiora
	Co-authored-by: Yong He <yonghe@outlook.com>
2024-05-03	Add host shared library target. (#4098)	Yong He
	* Add host shared library target. * Attempt fix. * Fix warnings. * try fix. * Fix test. * Fix.
2024-04-19	Add metal downstream compiler + metallib target. (#3990)	Yong He
	* Add metal downstream compiler + metallib target. * Add more comments. * Add missing override.
2024-04-19	allow preludes in include folder (#3976)	Tomáš Pazdiora
	Co-authored-by: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com> Co-authored-by: Yong He <yonghe@outlook.com>
2024-04-17	Add skeleton for metal backend. (#3971)	Yong He

2024-04-15	[GFX] Fix d3d12 buffer view creation logic for StructuredBuffers. (#3954)	Yong He

2024-04-09	typos (#3913)	Pema Malling

2024-03-07	Uniformity analysis. (#3704)	Yong He
	* Uniformity analysis. * Add [NonUniformReturn] decorations to some hlsl intrinsic functions.
2024-02-22	Add API for querying and reusing precompiled binary modules. (#3614)	Yong He

2024-02-20	Support link time type specialization. (#3604)	Yong He

2024-02-06	Improve Capability System (#3555)	Yong He
	* Improve capability system. * Update documentation. * Tuning semantics. * LSP: hierarchical diagnostics. * Fix test. * Fix test.
2023-12-08	WIP: CMake (#3326)	Ellie Hermaszewska
	* More robust input and output selection in generator tools * Add cmake build system * Get slang-test running with cmake * Bump lz4 and miniz dependencies * Make cmake build more declarative * Correct preprocessor logic in slang.h * Add cuda test to compute/simple * Remove empty cmake files * output placement for cmake, and commenting * Correct include paths in spirv-embed-generator * Format cmake with gersemi * Make cmake build clerer * Neaten header generation Also work around https://gitlab.kitware.com/cmake/cmake/-/issues/18399 by introducing correct_generated_properties to set the GENERATED flag in the correct scope * remove unused files * use 3.20 to set GENERATOR property properly * spelling * more flexible linker arg setting * replace slang-static with obj collection * Set rpath and linker path correctly * neaten generated file generation * tests working with cmake build * fix premake5 build * comment and neaten cmake * remove unnecessary dependency * Build aftermath example only when aftermath is enabled * Add slang-llvm and other dependencies * Put modules alongside binaries * Find slang-glslang correctly * Better option handling * comments * add llvm build test * Better option handling * cmake wobble * use UNICODE and _UNICODE * remove other workflows * use ccache * neaten * limit parallel for llvm build * use ninja for build * Windows and Darwin slang-llvm builds * cache key * verbose llvm build * cl on windows * sccache and cl.exe * use cl.exe * Correct package detection * less verbosity * Simplify miniz inclusion * fix build with sccache * Neaten llvm building * neaten * Neaten slang-llvm fetching * more surgical workarounds * Add ci action * Get version from git * better variable naming * add missing include * clean up after premake in cmake * more docs on cmake build * ci wobble * add imgui target * more selective source * do not download swiftshader * Some missing dependencies * only build llvm on dispatch * Disable /Zi in CI where sccache is present * simplify * set PIC for miniz * set policies before project * reengage workaround * more runs on ci * Add cmake presets * Add cpack * move iterator debug level to preset * Correct lib flag * simplify action * Neaten cmake init * Add todo * Add simple test wrapper * Add tests to workflow presets * rename packing preset * Correctly set definitions * docs * correct preset names * Make slang-test depend on test-server/test-process * neaten * use workflow in actions * install docs * Correct module install dir * debug dist workflow * Install headers * neaten header globbing * Neaten dependency handling * make lib and bin variables * Do not set compiler for vs builds, unnecessary * docs * allow setting explicit source for target * maintain archive subdir * cmake docs * install headers * place targets into folders * cmake docs * nest external projects in folder * remove name clash * Neater external packages * meta targets in folder structure * cleaner slang-glslang dll * Add missing static directive to slang-no-embedded-stdlib * more robust module copying * make slang-test the startup project * folder tweak * Make FETCH_BINARY the default on all platforms * Set DEBUG_DIR * add natvis files to source * skip spirv tests * remove test step from debug dist * Add build to .gitignore * redo warnings to be more like premake * Update imgui * clean more premake files * Disable PCH for glslang, gcc throws a warning * Add /MP for msvc builds * warning wobble * Add script to build llvm * Add slang-llvm and generators components * Build slang-llvm in ci * comments * fetch llvm with git * better abi approximation for cache * better sccache key * formatting * Correct logic around disabling problematic debug info for ccache * exclude gcc and clang from windows ci * Make dist workflows use system llvm * naming * restore normal dist builds * formatting * run tests in ci * Correct slang-llvm url setting * Rely on the system to find the test tool library * actions matrix wiggle * cope with OSX ancient bash * Correct compilers on windows * more ci debugging * Correct rpath handling on OSX * neaten * correct path to slang-llvm * Correct rpath separator on osx * Find slang-llvm correctly * smoke tests only on osx * ci wobble * Give MacOS module a dylib suffix * get swiftshader correctly * cope with bsd cp * remove debug output * full tests on osx * ci wobble * Add some vk tests to expected failures * simplify ci * ci wobble * exclude dx12 tests from github ci * remove cmake code for building llvm * warnings * warnings as errors for cl * spirv-tools in path * add aarch64 ci build * Add SLANG_GENERATORS_PATH option for prebuilt generators * neaten * Correct generator target name * remove yaml anchors because github actions does not support them * Demote CMake in docs Also add info on cross compiling * Restore premake CI * use minimal ci for cmake * Write miniz_export for premake build and .gitignore it * Mention build config tool options in docs * Remove redefined macro for miniz * regenerate vs project
2023-12-05	Support `include` for pulling file into the current module. (#3377)	Yong He
	* Support `include` for pulling file into the current module. * Add auto-completion, hover info and goto-def support. * Disable warning for missing `module` declaration for now. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-10-25	Fix warnings for gcc 12.3 (#3286)	Ellie Hermaszewska
	* Silence a few gcc out of bounds warnings * Search upwards from executable for prelude directory instead of assuming depth * comment wording * Check return values of read and write * Correct path to vulkan headers in gfx * Correct diagnostic on missing file in slang-embed * Do not use absolute path to libraries in test-context.cpp --------- Co-authored-by: Yong He <yonghe@outlook.com>
2023-10-09	Run curated spirv-opt passes through slang-glslang. (#3266)	Yong He
	* Run curated spirv-opt passes through slang-glslang. * Cleanup. * Replace spirv-dis downstream compiler with glslang. * delete slang-spirv-opt.cpp. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-10-06	Small type system fixes. (#3265)	Yong He

2023-10-04	SPIRV compiler performance fixes. (#3258)	Yong He
	* SPIRV compiler performance fixes. * Cleanup. * update project files * Cleanup debug code. * Make redundancy removal non-recursive. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-09-29	Fix for problem with OrderedHashSet causing crash (#3251)	jsmall-nvidia
	* Fix for problem with OrderedHashSet causing crashes during running tests on on g++ 7.3 * Fix typo
2023-09-13	Add all RayQuery SPIRV Intrinsics. (#3204)	Yong He
	* Add all RayQuery SPIRV Intrinsics. * Fix * Fix. * fix. * Fix. * Fix. * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-09-07	Fix GLSL output for `gl_ClipDistance` input builtin. (#3193)	Yong He
	* Fix GLSL output for `gl_ClipDistance` input builtin. * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-09-07	Add -repro-fallback-directory (#3188)	jsmall-nvidia
	Co-authored-by: Yong He <yonghe@outlook.com>
2023-09-01	Fix GLSL code gen around RayQuery and HitObject types. (#3173)	Yong He
	* Update slang-llvm. * Fix. * fix. * Fix unit tests for multi-thread execution. * Fix tests. * fixes. * update tests. * Add gfx-smoke to linux expected failure list. * Try fix test. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-08-28	Allow bitwise or expressions and numeric literals in spirv_asm blocks (#3157)	Ellie Hermaszewska
	* Add -spirv-core-grammar option to load alternate spirv defs Also embed a version to use by default * Use perfect hash for spv op lookup * Neaten perfect hash embedding * Refactor spirv grammar lookup in preperation for more kinds of lookups * Load spirv capability list from spec * Add all SPIR-V enums to lookup table * regenerate vs projects * appease msvc * Use string slices for spir-v core grammar lookups * wiggle * comment * Add OpInfo for spv ops * regenerate vs projects * Embed op names * Add min/max operand counts and enum categories to spirv info * neaten * Operand kinds for spirv ops * Store and embed all information relating to spirv enums and qualifiers * Use SPIR-V spec to position instructions in spirv_asm blocks * Neaten spir-v info embedding * Neaten perfect hash embedding * Add assignment syntax to spirv_asm snippets * Better errors for spirv_asm parser * Add warning for too many operands in spirv asm * squash warnings * neaten * test wiggle * Lookup enums for spirv * Put OpCapability and OpExtension in the correct place for spirv_asm blocks * Tests for OpCapability and OpExtension * ci wiggle * Add expected failure * Allow raising immediate values to constant ids where necessary in spirv_asm blocks * Allow bitwise or expressions and numeric literals in spirv_asm blocks * test numeric literals * Fix memory issues. * fix. --------- Co-authored-by: Yong He <yonghe@outlook.com>
2023-08-16	Run vk tests on spirv backend with expected failure list. (#3128)	Yong He

2023-08-15	squash warnings (#3113)	Ellie Hermaszewska
	* Remove unused variables * Silence gcc out of bounds warnings * Squash strict-aliasing warnings It is still a naughty thing to be casting to T like this though * Correct equality check when val is nullptr --------- Co-authored-by: Yong He <yonghe@outlook.com>
2023-08-16	Use ankerl/unordered_dense as a hashmap implementation (#3036)	Ellie Hermaszewska
	* Correct namespace for getClockFrequency * missing const * Add missing assignment operator * Remove unused variables * Return correct modified variable * Use stable hash code for file system identity * terse static_assert * Structured binding for map iteration * Make (==) and getHashCode const on many structs * Add ConstIterator for LinkedList * Replace uses of ItemProxy::getValue with Dictionary::at * Extract list of loads from gradientsMap before updating it * Const correctness in type layout * Add unordered_dense hashmap submodule * Use wyhash or getHashCode in slang-hash.h * refactor slang-hash.h * Use ankerl/unordered_dense as a hashmap implementation Notable changes: - The subscript operator returns a reference directly to the value, rather than a lazy ItemProxy (pair of dict pointer and key) slang-profile time (95% over 10 runs): - Before: 6.3913906 (±0.0746) - After: 5.9276123 (±0.0964) * 64 bit hash for strings So they have the same hash as char buffers with the same contents * Narrowing warnings for gcc to match msvc * revert back to c++17 * Correct c++ version for msvc * Use path to unordered_dense which keeps tests happy * Do not assign to and read from map in same expression * Remove redundant map operations in primal-hoist * Split out stable hash functions into slang-stable-hash.h * 64 bit hash by default * regenerate vs projects * Correct return type from HashSetBase::getCount() * correct width for call to Dictionary::reserve * Use stable hash for obfuscated module ids * Signed int for reserve * clearer variable naming * Parameterize Dictionary on hash and equality functors * Allow heterogenous lookup for Dictionary * missing const * Use set over operator[] in some places * Remove unused function * s/at/getValue
2023-08-15	SPIR-V WIP (#3064)	Ellie Hermaszewska
	* Add type layout for structured buffer * Default to generating spirv directly * vk test for compute simple * Add spirv-dis as a downstream compiler * Emit Array types in SPIR-V * makevector for spirv * Dump whole spirv module on validation failure * register array types todo, use emitTypeInst * Neater formatting for unhandled inst printing * break out emitCompositeConstruct * Correct array type generation * neaten * Allow getElement for vector * Remove unused * Allow predicating target intrinsics on types * Consider functions with intrinsics to have definitions We need to specialize these if they are predicated on types * Correct array type generation * makeArray for spir-v * replace getElement with getElementPtr for spirv * Correct translation of field access for spirv * Push layouts to types for spirv * Spirv intrinsics * operator now makes a pointer * Add structured buffer of struct test * Preserve type layout in spirv structured buffer legalization * neaten * makeVectorFromScalar for SPIRV * placeholder for layouts on param groups * More type safe spirv op construction * Know that constants and types only go in one section * Remove emitTypeInst * Add todo for spirv sampling * Add links to spirv documentation on emit functions * OpTypeImage support for SPIR-V * Add simpler texture test for spirv * s/spirv_direct/spirv/g * Allow several string literals in target_intrinsic * Handle global params without a var layour for SPIR-V For example groupshared vars * uint spirv asm type * Add todo for isDefinition It is currently too broad * Some atomic op spirv intrinsics * Strip ConstantBuffer wrappers for spirv * Add todo for matrix annotations * Do not associate decorations insts with spirv counterparts * Correct entry point parameter generation * Spelling * Assert that fieldAddress is returning a pointer * Add error for existential type layout getting to spir-v emit * Add IRTupleTypeLayout Unused so far * Allow getElementPtr to work with vectors * Correct target name in test * Hide default spirv direct behind a premake option --default-spirv-direct=true * Do not insert space at start of intrinsic def * Correct asm rendering in tests * remove redundant option * Emit directly from direct test * Add source language options for spirv-dis * Add comments to spirv dis * Add dead debug print for before spirv module * Correct asm rendering in tests * s/spirv_direct/spirv/g * Only specialize intrinsic functions with predicates * regenerate vs projects * squash warnings * squash warnings * remove duplication * Silence warnings from msvc * squash warnings * Overload for zero sized array * More msvc warnings * warnings * Add spirv-tools to path for tests * Do not be specific about dxc version for diag test * Normalize line endings from spirv-dis * Correct filecheck matches * Temporarily disable two spirv tests Failing on CI, undebuggable hang :/ * Do not emit storage class more than once for spirv snippet * Do not pass spir-v to spirv-dis by stdin * Do not get spirv-dis output via stream, use file * normalize file endings in spirv-dis output
2023-08-07	Add spirv-dis as a downstream compiler (#3059)	Ellie Hermaszewska
	* Add spirv-dis as a downstream compiler * Add TODO for spirv-dis downstream compiler * Do not use SpirvDis by default * tabs to spaces * regenerate vs projects * correct test * correct calling convention --------- Co-authored-by: Yong He <yonghe@outlook.com>
2023-08-08	Add missing empty case to Array::makeArray (#3058)	Ellie Hermaszewska

2023-08-08	Allow parsing some SPIR-V enums in intrinsics (#3062)	Ellie Hermaszewska
	* Add TokenReader::AdvanceIf overlaod for TokenType * Add some spirv defs to parser * Add comment
2023-08-04	Redesign `DeclRef` and systematic `Val` deduplication (#3049)	Yong He
	* Redesign DeclRef + Deduplicate Val. * Update project files * Fix warning. * Fix. * Fix. * Remove `Val::_equalsImplOverride`. * Rmove `Val::_getHashCodeOverride`. * Remove `semanticVisitor` param from `resolve`. * Cleanups. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-08-01	Generalize collectInductionValues (#3031)	Ellie Hermaszewska
	* Generalize collectInductionValues * Support affine transformations of loop index as induction variables * Test for generalized induction value collection * Neaten inductive variable finding * Store the type of implication success when finding inductive variables * Test that loop induction finding does not alway succeed * Support chains of additions and branches of additions in induction variable finding * Use c++17 for downstream compilers
2023-07-18	Simplify Lookup and improve compiler performance. (#2996)	Yong He
	* Simplify lookup. * Various bug fixes. * Report type dictionary size in perf benchmark. * Remove type duplication. * increase initial dict size. * Bug fix. * Fix bugs. * Fixup. * Revert type legalization looping. * Fix specialization pass. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-07-12	Create and cache flattened inheritance lists (#2740)	Theresa Foley
	* Create and cache flattened inheritance lists The basic change here is to have a cached lookup that can map a `Type`, or a `DeclRef` that might refer to a type or `extension`, to a list of the facets that comprise it. The notion of a facet here is similar to what the C++ standard calls "sub-objects". A declared type like a `struct` has: * a facet for its own direct members * one facet for each of its (transitive) base `struct` types * one facet for each `interface` it conforms to * one facet for each `extension` that applies to that type The set of facets for a type is de-duplicated (so that "diamond" inheritance patterns don't cause issues) and deterministically ordered, using a variation of the C3 linearization algorithm. The creation of a linearized list of facets should help the compiler implementation in two key places: * Testing if a type implements an interface (or inherits from a base type) should now only take time linear in the number of (transitive) bases of that type. We can simply scan the linearized facet list to see if it contains a facet corresponding to the given base. * Looking up the members of a type (or a value of a given type) should be greatly simplified, since all of the members can be found in a single linear scan of the facet list. In addition, those facets will be ordered so that facets for "more derived" types will precede those for "less derived" types, so that shadowing in the case of overrides should be easier to implement. This change only implements the first of these two improvements, since there is already a lot of churn involved. Notes and caveats: * The handling of conjunction types (e.g., `IFoo & IBar`) complicates the implementation, both because the simple approach to subtype testing alluded to above is no longer complete, and also because we need to be more careful about what forms of subtype witnesses we construct, so that we can maintain the currently-required invariant that two witnesses are only equal if they have matching structure. * We don't implement the full/"proper" C3 algorithm here because it has some failure cases that we'd still like to support. In particular if we have both `IX : IA, IB` and `IY : IB, IA`, the C3 algorithm says it is illegal to have `IZ : IX, IY` because the two bases it inherits from disagree on the relative ordering of `IA` and `IB` in their own linearizations. Handling such cases may make our implementation less efficient, and it will also require testing of those corner caes. * When it comes time to revamp the implementation of lookup, we will need to deal with the fact that a single linear list (seemingly) cannot give us sufficient information to decide which of two members of the same name should shadow the other, or if there is an ambiguity. Or rather, it can give us that information if we are willing to accept some very user-unfriendly behavior and simply say that declarations earlier in the linearization always shadow later declarations, even if the facets involved are not related by an inheritance relationship of any kind. * In order to remove one kind of vicious circularity from the approach, the linearization that we are computing for `extension` declarations will not be sufficient for lookups in the body of such an `extension`. A future change may need to have support for creating and caching two distinct linearizations for each `extension`: one that is to be used when that `extension` is pulled into the linearization for a type that it applies to, and another for when lookup will be performed in the context of the `extension` itself. * This change does not include the simple expedient of adding a direct cache for subtype tests to the `SharedSemanticsContext`, although adding such a cache would be a simple matter. * This change introduces more deduplication for subtype witnesses, which should enable more deduplication for other `Val`s (including `Type`s), but it does not introduce any assumptions that equal `Val`s or `Type`s must have identical pointer representations. * Eventually we may find that, similar to the situation with `Type`s, we will want to have a split between surface-level and canonicalized versions of other `Val`s, including subtype witnesses. * Fix clang error. * remove debugging code. --------- Co-authored-by: Yong He <yonghe@outlook.com>
2023-07-12	Use scratchData on `IRInst` to replace HashSets. (#2978)	Yong He
	* Use scratchData on `IRInst` to replace HashSets. * Update test results. * Initialize scratchData. * Update autodiff documentation. * Use enum instead of bool. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-07-11	Add perf benchmark utility. (#2977)	Yong He
	* Add perf benchmark utility. * Update documentation. * Fix. * Fix doc. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-06-09	Small improvements around StringBlob (#2924)	jsmall-nvidia
	* #include an absolute path didn't work - because paths were taken to always be relative. * Small fixes and improvements around reflection tool. * Make PrettyWriter printing a class. * Sundary improvements around StringBlob.
2023-06-08	Improvements around StringBlob (#2921)	jsmall-nvidia
	* #include an absolute path didn't work - because paths were taken to always be relative. * Small fixes and improvements around reflection tool. * Make PrettyWriter printing a class. * Improvements around handling StringBlob and storing stdlib source in ISlangBlob. * Fix some issues with comments around StringBlob. * Default initialize StringBlob fields.
2023-05-05	Squash a couple of warnings on clang (#2870)	Ellie Hermaszewska
	* Squash a couple of warnings on clang Redisable -Wunused-local-typedefs * unused variable
2023-05-04	Add SLANG_ASSUME and use it in release asserts (#2859)	Ellie Hermaszewska
	* Remove UNREACHABLE * formatting * Remove unused SLANG_EXPECT macros * Add SLANG_ASSUME and use it in release asserts * Reassure GCC that we are using memcpy responsibly --------- Co-authored-by: Yong He <yonghe@outlook.com>
2023-05-02	Markdown CommandOptions (#2860)	jsmall-nvidia
	* WIP CommandOptions * Fix some output issues. * Simplify word wrapping. * Add file extensions. * Change how lookup takes place. Add appendSplit functions to StringUtil. Make Categories hold the index range of their options. * Small improvement. * Lookup with partial option names. * Associate user values. * Encoding flags in the name. * Refactor setting up of command options. * Use CommandOptions in slang-options. * Remove old help text. * Cache the CommandOptions on the Session. * Range checking. Fix bug in the Options handling. * Extra checks for validity. * Get categories directly. * Slight improvements over output. * Added NameValue types. * Fix typo. Remove some now unused diagnostics. Fix diagnostic in testing, as output has changed. * Add minimal usage message. * Remove platform executable extension from diagnostics output. * Some improvements around getting names from NameValue types. * Improve some option descriptions. * Small fixes. * WIP improvements around CommandOptions. * Split out CommandOptionsWriter. * Add links to options. * Add command line options reference. * Link to the reference command line information. * Add quick links. * Improvements around lookup. Add categories to linking. * Small additional fixes. * Add LinkFlags control. * Small text fixes. * Fix typo. * Fix typo. * Fix typo. * Add support for -g and -O using CommandOptions. * Improve generated doc output/descriptions. Remove options listed directly in documentation.