summaryrefslogtreecommitdiff
path: root/source/slang/slang.cpp
AgeCommit message (Collapse)Author
2024-05-20Printing a timing of stdlib build time (#4190)Jay Kwak
2024-05-17Add `-minimum-slang-optimization` to favor compile time. (#4186)Yong He
2024-05-17capture/relay: Add capture interface classes (#4177)kaizhangNV
* capture/relay: Add capture interface classes Add `ModuleCapture` class for capturing `IModule` - The `IModule` can only be created from -- `ISession::loadModule` -- `ISession::loadModuleFromIRBlob` -- `ISession::loadModuleFromSource` -- `ISession::loadModuleFromSourceString` so, we create the `ModuleCapture` at those methods in `SessionCapture` class. We use a hash map to store a map from `IModule` to `ModuleCapture` to avoid creating new `ModuleCapture` when there is already an old one. - In `SessionCapture::getLoadedModule`, we will assert on not finding a `ModuleCapture` instance. Add `EntryPointCapture` class for capturing `IEntryPoint`. - The `IEntryPoint` can only be created from: -- `IModule::findEntryPointByName` -- `IModule::findAndCheckEntryPoint` so, we create the `EntryPointCapture` at those methods in `ModuleCapture`. Similarly, we use a hash map to store a map from `IEntryPoint` to `EntryPointCapture`. - In `IModule::getDefinedEntryPoint`, we will assert on not finding a `EntryPointCapture` instance. Add `CompositeComponentTypeCapture` class for capturing CompositeComponentType, but since user is only exposed to `IComponentType`, so `CompositeComponentTypeCapture` just inherits from `IComponentType`. - `CompositeComponentType` can only be created from: -- ISession::createCompositeComponentType so create it here. Add `TypeConformanceCapture` class for capturing `ITypeConformance`. - The `ITypeConformance` can only be created from: -- `ISession::createTypeConformanceComponentType` so create it here. In addition, because `EntryPointCapture` and `ModuleCapture` share a some base class `IComponentType`, we generate the COM GUID for those two classes to differentiate them. * Fix the build issue * Add nullptr check for output parameter * define the SLANG_CAPTURE_ASSERT macro used in both debug and release build
2024-05-16Capabilities System, CapabilitySet Logic Overhaul (#4145)ArielG-NV
* Capabilities System, Backing Logic Overhaul Fixes #4015 Problems to address: 1. Currently the capabilities system spends anywhere from 25-50% of compile time on the CapabilityVisitor. Most of this time is spent on join logic: 1. Finding abstract atoms 2. Comparing list1<->list2. This should and can be made significantly faster. 2. Error system does not produce errors with auxiliary information. This will require a partial redesign to provide more useful semantic information for debugging. What was addressed: 1. Array backed `CapabilityConjunctionSet` was replaced in-favor for a `UIntSet` backed `CapabilityTargetSets`. The design is described below. Design: * `CapabilityTargetSets` is a `Dictionary<targetAtom, CapabilityTargetSet>`. This is not an array for 2 reasons: 1. Easy to figure out which target is missing between two `CapabilityTargetSets` 2. To statically allocate an array requires the preprocessor to manually annotate which Capability is a target and link that Capability to an index. This means a dictionary is required for lookup regardless of implementation. * `CapabilityTargetSet` is an intermediate representation of all capabilities for a singular `target` atom (`glsl`, `hlsl`, `metal`, ...). This structure contains a dictionary to all stage specific capability sets for fast lookup of stage capabilities supported by a `CapabilitySet` for a `target` atom. This reduces number of sets searched. * `CapabilityStageSet` is an intermediate representation of all capabilities for a singular `stage` atom (`vertex`, `fragment`, ...). This structure holds all disjoint capability sets for a `stage`. A disjoint set is rare, but may exist in some scenarios (as an example): `{glsl, EXT_GL_FOO}{glsl, _GLSL_130, _GLSL_150}`. This reduces the number of sets searched. * `UIntSet` is the main reason for the redesign for better performance and memory usage. All set operations only require a few operations, making all set logic trivial and with minimal cost to run. All algorithms were modified to focus around `UIntSet` operations. 2. Errors * Semantic information are now better linked to the calling function to provide a connection of function<->function_body for when saving semantic information for errors. * Missing targets now print errors much like other error code by finding code which could be a cause of incompatibility. What is missing: 1. Add non naive support for non-stage specific capabilities such as `{hlsl, _sm_5_0}`. Currently non stage specific targets emulate the behavior through assigning such capabilities to every stage: `{hlsl, _sm_5_0, vertex} {hlsl, _sm_5_0, fragment}...`. Removal of this behavior would remove redundant shader stage sets being made at construction time (~80% of new implementation runtime). This is an addition, not an overhaul. 2. Optionally: `UIntSet` should be modified to support SIMD operations for significantly faster operations. This is not required immediately since `UIntSet` is already not a performance constraint. Notes: * UIntSet had implementation bugs which were fixed in this PR. * The old capabilities system had bugs which were fixed in this PR when transforming to the new implementation. * fix .natvis debug view * Small optimizations I found while working on the addition the AST building pass looks like so now: 1% = ~capabilitySet 2% = capabilitySet() 1.5% capabilitySet::unionWith() 0.8% capabilitySet::join() 1.5% auxillary info for debugging ~0.5-1% extra visitor overhead ~5% total for the visitor ~6.5% for total runtime costs * fix caps which were wrong but worked * push minor syntax fix (still looking for why other tests fail) * perf & bug fixes 1. did not properly remake isBetterForTarget for this->empty case with that as Invalid. This is best case in this senario. 2. Remade seralizer for stdlib generation. Faster (more direct) & cleaner code. NOTE: did not address review comments * fix glsl.meta caps error * fixing findBest logic again & UIntSet wrapper findBest was not checking for 'more specialized' targets & was element counter was flawed * faster getElements algorithm + natvis for UIntSet + wrong warning * type incompatability of bitscanForward implementations * try to fix warnings again * remove ptr for clang intrinsic * add missing header * ifdef to allow clang compile * compiler hackery to fix up platform/type independent operations * bracket * fix MSVC error * missing template * change types out again * changes to fix compiling * adjustment to parameter for Clang/GCC * added iterator to delay processing all atomSets of a CapabilitySet * add a few missing consts's * ensure we never have more than 1 disjointSet Added a wrapper + assert + union functionality to all possible disjoint sets. This was done in favor of a removal of the LinkedList for 2 reasons: 1. We still need 0-1 set functionality. 2. Might as well keep the code, just disallow the problematic functionality. * address review comments non linked-list refactor review comments addressed; add doc comments + remove redundant code * comments + remove isValid for bool operator * push removal of linkedlist for capabilities * add missing break * address review comments minor adjustments of syntax * push a fix to the `CapabilitySet({shader, missing target})` code * quality + error 1. add iterator to UIntSet 2. do not specialize target_switch if profile is derived from case (GLSL_150 is not compatable with GLSL_400) * fix target_switch erroring + temporarily remove UIntSet::Interator temporarily remove UIntSet::Interator. It will be added after, testing code on CI first so I can multi-task fixing the UIntSet Iterator * fix the UIntSet iterator * Revert "fix the UIntSet iterator" temporarily to pull from master * add metal error as per texture.slang (took a while I realize this was why things were breaking, likely should adjust errors to reflect this) * Rework UIntSet to have a template for output type This is done so it is reasonable to debug the iterator output and not just dealing with messy int's Fix problems with the iterators implemented + invalid capabilities handling * removed incorrect `__target_switch` capability barycentric was being used with anticipation of `profile glsl450`, this does not expand into `GL_EXT_fragment_shader_barycentric`, this instead caused an error which is hidden during cross-compile. * remove some uses of getElements * remove undeclared_stage for now * remove redundant code associated with `undeclared_stage` * remove unused variable * address review specifically to note removed static in a thread dangerous scope. Now using a `const static` for read only (thread safe) which precompile steps generate * move GLSL_150 capdef change to sm_4_1 (more accurate) * address most review comments did not address: https://github.com/shader-slang/slang/pull/4145#discussion_r1602256776 * revert incorrect code review suggestion * push changes for all code review suggestions
2024-05-14Propagate warning settings on `Linkage` to IR passes. (#4156)Yong He
2024-05-08`slangc` tool experience improvements. (#4140)Yong He
* `slangc` tool experience improvements. Fixes #4123. Fixes #4127. * Update doc.
2024-05-03Add host shared library target. (#4098)Yong He
* Add host shared library target. * Attempt fix. * Fix warnings. * try fix. * Fix test. * Fix.
2024-05-01Copy default target's optionSet to code-gen target's optionSet (#4073)kaizhangNV
In current implementation, the some options will be to added to the target that is only specified by command line "-target". But if user specifies the target by just using slang API, e.g. 'spAddCodeGenTarget', those options will be missed. To fix the problem, we copy the default target's options to the code-gen target's option set. The default target will only be useful when there is no target specified in the command line.
2024-05-01Fix compile failures when using debug symbol. (#4069)Yong He
* Fix compile failures when using debug symbol. * Various fixes. * Fix intrinsic. * Fix test.
2024-04-30Avoid classifying methods with `[numthreads]` as entry points for ↵Sai Praveen Bangaru
CUDA-related targets (#4063)
2024-04-30Change stdlib to not depend on short-circuit (#4056)kaizhangNV
Do not use "&&" to implement the intrinsic kIROp_And, instead define a 'and' function in stdlib. So it will be up to us to determine whether we want to use 'short-circuit' behavior in stdlib.
2024-04-30Add option -disable-short-circuit (#4054)kaizhangNV
Add option -disable-short-circuit to disable short circuit for logic operators && and ||. Also, disable the short circuit by default in the stdlib.
2024-04-30Metal: Vertex/Fragment builtin and layouts. (#4044)Yong He
* Metal: Vertex/Fragment builtin and layouts. * Fix. * Fix test. * Emit user semantic on vertex/fragment attributes.
2024-04-19Add metal downstream compiler + metallib target. (#3990)Yong He
* Add metal downstream compiler + metallib target. * Add more comments. * Add missing override.
2024-04-17Add skeleton for metal backend. (#3971)Yong He
2024-04-12Fix the issue that 'spGetDependencyFilePath' report 'unknown' (#3927) (#3939)kaizhangNV
Fix the issue that 'spGetDependencyFilePath' will report "unknown" for the source code is from string. We only reported valid file path when the source code is file a file, so we change that to report a valid file name even when the source code is from the string.
2024-04-12Add missing astBuilder setting in `ComponentType::tryFoldIntVal`. (#3934)Yong He
2024-04-11Fix the issues when compiling slang to library (#3936)kaizhangNV
2024-04-09Allow COM based API to discover and check entrypoints without [shader] ↵Yong He
attribute. (#3914) * Allow COM based API to discover and check entrypoints without [shader] attribute. * Undo changes. * More comments.
2024-03-19Fix inconsistent digest of precompiled module. (#3796)Yong He
2024-03-18Fix name mangling and source file finding logic for precompiled module ↵Yong He
validation. (#3784) * Fix name mangling. * Fix source validation. * Caching and search path fixes.
2024-03-12Fix `sessionDesc.defaultMatrixLayoutMode` being ineffective. (#3753)Yong He
* Fix `sessionDesc.defaultMatrixLayoutMode` being ineffective. * Fix matrix layout in buffer pointer. * Attempt to fix. * Fix buffer element type lowering for buffer pointers. * Add comment. * Fix test. * Fix member lookup in `Ref<T>`. * Fix validation error. * Enhance test.
2024-03-11Link-time specialization fixes. (#3734)Yong He
* Fix method synthesis logic for static differentiable methods. * Support link-time constants in thread group size reflection.
2024-03-08Parser and module finding logic fixes. (#3720)Yong He
* Fix parsing logic of `struct` decl. Fixes #3716. * Allow `loadModule` to find modules with underscores. * Fix test.
2024-03-07Link-time constant and linkage API improvements. (#3708)Yong He
* Link-time constant and linkage API improvements. * Fix. * Allow module name to be empty. * Fix. * Fix. * Fix compile error.
2024-03-04Add `IGlobalSession::getSessionDescDigest`. (#3669)Yong He
* Add `IGlobalSession::getSessionDescDigest`. * Fix.
2024-02-28[SPIRV] Add NonSemanticDebugInfo for step-through debugging. (#3644)Yong He
* [SPIRV] Add NonSemanticDebugInfo for step-through debugging. * Fix. * Fix.
2024-02-26Allow default values for `extern` symbols. (#3632)Yong He
* Allow default values for `extern` symbols. * Fix. * Fix test.
2024-02-23Add slangc interface to compile and use ir modules. (#3615)Yong He
* Add slangc interface to compile and use ir modules. * Fix glsl scalar layout settings not copied to target. * Fix. * Cleanups.
2024-02-22Add API for querying and reusing precompiled binary modules. (#3614)Yong He
2024-02-21Fix SPIRV lowering issue. (#3608)Yong He
* Fix SPIRV pointer lowering issue. Fixes #3605. * Add another pointer test. Fixes #3601. * Fixes #3600. * Fix #3595.
2024-02-20Language server robustness fix. (#3607)Yong He
* Language server robustness fix. * Allow parameter name to be the same as its type. * fix * Fix test.
2024-02-20Refactor compiler option representations. (#3598)Yong He
* Refactor compiler option representation. * Fix binary compatibility. * Add a test for specifying compiler options at link time. * Fix binary compatibility. * Fix binary compatibility. * Fix backward compatibility on matrix layout. * Fix. * Fix. * Fix. * Fix gfx. * Fix gfx. * Fix dynamic dispatch. * Polish.
2024-02-15Support loading serialized modules. (#3588)Yong He
* Support loading serialized modules. * Fix. * Fix vs solution files * Fix glsl module loading. * C++ fix. * Fix. * Try fix c++ error. * Try fix. * Fix. * Fix.
2024-02-06Unify GLSL and HLSL buffer block parsing. (#3552)Yong He
* Unify GLSL and HLSL buffer block parsing. Automatic GLSL module recognition. * Fix.
2024-02-02Capability type checking. (#3530)Yong He
* Capability type checking. * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2024-02-02Atomics+Wave ops intrinsics fixes. (#3542)Yong He
* Fix atomics intrinsics, increase kMaxDescriptorSets. * Add SPIRVASM to known non-differentiable insts. * Support fp16 wave ops when targeting glsl. * Fixes. * Fix vk validation errors. * Fix. * Add to allowed failures.
2024-02-01FP16 atomics for RWByteAddresBuffer, fp32 atomics for images. (#3536)Yong He
* FP16 atomics for RWByteAddresBuffer, fp32 atomics for images. * Fix spelling. * Add overload. * Fix test failures. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2024-02-01Add slangc option to specialize entrypoint + auto glsl mode. (#3531)Yong He
* Add slangc option to specialize entrypoint. * Auto enable glsl mode when input file has glsl extension name. * Fix test. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2024-01-18Capability def parsing & codegen + disjoint sets (#3451)Yong He
* Capability def parsing & codegen + disjoint sets This change adds a capability definition file, and a code generator to produce C++ code that defines the capability enums and necessary data structures around the capabilities. Extends the existing CapabilitySet class to support expressing disjoint sets of capabilities. This sets up for the next change that will enhance our type checking with reasoning of capability requirements. * Fix cmake. * Fix warning. * Fix. * Fix isBetterForTarget to prefer less specialized option. * Fix. * Fix premake. * Fix intrinsic. * Fix vs sln file. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2024-01-09Add compiler settings to shader cache key (#3439)skallweitNV
2024-01-03Fix issue with entry point result not being available via ↵jsmall-nvidia
`spGetEntryPointCodeBlob` if defined in a serialized module. (#3431)
2023-12-08Handle import, entrypoint and global params in included files. (#3395)Yong He
* Handle `import`, entrypoint and global params in included files. * Fix language server. * Extend `_createScopeForLegacyLookup` for `__include`. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-12-06Change default visibility of interface members and update docs. (#3381)Yong He
* Update behavior around interfaces and docs. * Update toc --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-12-05Support `include` for pulling file into the current module. (#3377)Yong He
* Support `include` for pulling file into the current module. * Add auto-completion, hover info and goto-def support. * Disable warning for missing `module` declaration for now. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-11-14Add GLSL Compatibility. (#3321)Yong He
* Parse glsl buffer blocks to GLSLInterfaceBlockDecl * Parse glsl local size layout declarations * Parse (and ignore) glsl version directives * spelling * Better l-value interpretation for glsl interface blocks * Better l-value interpretation for glsl interface blocks * Add compile flag for enabling glsl * Parse and ignore precision modifiers. * Automatically import `glsl` module for compatiblity. * Complete vector and matrix types for glsl * Remove generated file from repo * Bump .gitignore * do not mark out globals as params * Synthesize entrypoint layout from global inout vars. * update test result. * Allow HLSL semantic on global variables. * Fix. * Fix test. * Fix win32 compile error. * Add more builtin input/output and texture intrinsics. * Add struct/array constructor syntax. * Skip `#extension` lines. * overide operator * for matrix/vector multiplication. * Add `matrixCompMult`. * Parse modifiers in for loop init var declr. * Add more glsl intrinsics, add stage into to var layout. * Allow `int[3] x` syntax. * Fix array type syntax. --------- Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com> Co-authored-by: Yong He <yhe@nvidia.com>
2023-10-11Report spirv-opt time. (#3271)Yong He
* Report spirv-opt time. * Removing timing logic in `loadModule`. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-09-26Fix for epoch/ASTBuilder* nullptr issue (#3240)jsmall-nvidia
* Fix issue with failing tests tests/serialization/serialized-module-test.slang tests/serialization/extern/extern-test.slang * Fix issue with session destruction order on Session. * Improve comment.
2023-09-07Fix GLSL output for `gl_ClipDistance` input builtin. (#3193)Yong He
* Fix GLSL output for `gl_ClipDistance` input builtin. * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com>
2023-08-28Allow bitwise or expressions and numeric literals in spirv_asm blocks (#3157)Ellie Hermaszewska
* Add -spirv-core-grammar option to load alternate spirv defs Also embed a version to use by default * Use perfect hash for spv op lookup * Neaten perfect hash embedding * Refactor spirv grammar lookup in preperation for more kinds of lookups * Load spirv capability list from spec * Add all SPIR-V enums to lookup table * regenerate vs projects * appease msvc * Use string slices for spir-v core grammar lookups * wiggle * comment * Add OpInfo for spv ops * regenerate vs projects * Embed op names * Add min/max operand counts and enum categories to spirv info * neaten * Operand kinds for spirv ops * Store and embed all information relating to spirv enums and qualifiers * Use SPIR-V spec to position instructions in spirv_asm blocks * Neaten spir-v info embedding * Neaten perfect hash embedding * Add assignment syntax to spirv_asm snippets * Better errors for spirv_asm parser * Add warning for too many operands in spirv asm * squash warnings * neaten * test wiggle * Lookup enums for spirv * Put OpCapability and OpExtension in the correct place for spirv_asm blocks * Tests for OpCapability and OpExtension * ci wiggle * Add expected failure * Allow raising immediate values to constant ids where necessary in spirv_asm blocks * Allow bitwise or expressions and numeric literals in spirv_asm blocks * test numeric literals * Fix memory issues. * fix. --------- Co-authored-by: Yong He <yonghe@outlook.com>