| Age | Commit message (Collapse) | Author |
|
|
|
|
|
* capture/relay: Add capture interface classes
Add `ModuleCapture` class for capturing `IModule`
- The `IModule` can only be created from
-- `ISession::loadModule`
-- `ISession::loadModuleFromIRBlob`
-- `ISession::loadModuleFromSource`
-- `ISession::loadModuleFromSourceString`
so, we create the `ModuleCapture` at those methods in `SessionCapture`
class. We use a hash map to store a map from `IModule` to `ModuleCapture`
to avoid creating new `ModuleCapture` when there is already an old one.
- In `SessionCapture::getLoadedModule`, we will assert on not finding
a `ModuleCapture` instance.
Add `EntryPointCapture` class for capturing `IEntryPoint`.
- The `IEntryPoint` can only be created from:
-- `IModule::findEntryPointByName`
-- `IModule::findAndCheckEntryPoint`
so, we create the `EntryPointCapture` at those methods in `ModuleCapture`.
Similarly, we use a hash map to store a map from `IEntryPoint` to
`EntryPointCapture`.
- In `IModule::getDefinedEntryPoint`, we will assert on not finding
a `EntryPointCapture` instance.
Add `CompositeComponentTypeCapture` class for capturing CompositeComponentType,
but since user is only exposed to `IComponentType`, so `CompositeComponentTypeCapture`
just inherits from `IComponentType`.
- `CompositeComponentType` can only be created from:
-- ISession::createCompositeComponentType
so create it here.
Add `TypeConformanceCapture` class for capturing `ITypeConformance`.
- The `ITypeConformance` can only be created from:
-- `ISession::createTypeConformanceComponentType`
so create it here.
In addition, because `EntryPointCapture` and `ModuleCapture` share a some
base class `IComponentType`, we generate the COM GUID for those two
classes to differentiate them.
* Fix the build issue
* Add nullptr check for output parameter
* define the SLANG_CAPTURE_ASSERT macro used in both debug and release build
|
|
* Capabilities System, Backing Logic Overhaul
Fixes #4015
Problems to address:
1. Currently the capabilities system spends anywhere from 25-50% of compile time on the CapabilityVisitor. Most of this time is spent on join logic: 1. Finding abstract atoms 2. Comparing list1<->list2. This should and can be made significantly faster.
2. Error system does not produce errors with auxiliary information. This will require a partial redesign to provide more useful semantic information for debugging.
What was addressed:
1. Array backed `CapabilityConjunctionSet` was replaced in-favor for a `UIntSet` backed `CapabilityTargetSets`. The design is described below.
Design:
* `CapabilityTargetSets` is a `Dictionary<targetAtom, CapabilityTargetSet>`. This is not an array for 2 reasons: 1. Easy to figure out which target is missing between two `CapabilityTargetSets` 2. To statically allocate an array requires the preprocessor to manually annotate which Capability is a target and link that Capability to an index. This means a dictionary is required for lookup regardless of implementation.
* `CapabilityTargetSet` is an intermediate representation of all capabilities for a singular `target` atom (`glsl`, `hlsl`, `metal`, ...). This structure contains a dictionary to all stage specific capability sets for fast lookup of stage capabilities supported by a `CapabilitySet` for a `target` atom. This reduces number of sets searched.
* `CapabilityStageSet` is an intermediate representation of all capabilities for a singular `stage` atom (`vertex`, `fragment`, ...). This structure holds all disjoint capability sets for a `stage`. A disjoint set is rare, but may exist in some scenarios (as an example): `{glsl, EXT_GL_FOO}{glsl, _GLSL_130, _GLSL_150}`. This reduces the number of sets searched.
* `UIntSet` is the main reason for the redesign for better performance and memory usage. All set operations only require a few operations, making all set logic trivial and with minimal cost to run. All algorithms were modified to focus around `UIntSet` operations.
2. Errors
* Semantic information are now better linked to the calling function to provide a connection of function<->function_body for when saving semantic information for errors.
* Missing targets now print errors much like other error code by finding code which could be a cause of incompatibility.
What is missing:
1. Add non naive support for non-stage specific capabilities such as `{hlsl, _sm_5_0}`. Currently non stage specific targets emulate the behavior through assigning such capabilities to every stage: `{hlsl, _sm_5_0, vertex} {hlsl, _sm_5_0, fragment}...`. Removal of this behavior would remove redundant shader stage sets being made at construction time (~80% of new implementation runtime). This is an addition, not an overhaul.
2. Optionally: `UIntSet` should be modified to support SIMD operations for significantly faster operations. This is not required immediately since `UIntSet` is already not a performance constraint.
Notes:
* UIntSet had implementation bugs which were fixed in this PR.
* The old capabilities system had bugs which were fixed in this PR when transforming to the new implementation.
* fix .natvis debug view
* Small optimizations I found while working on the addition
the AST building pass looks like so now:
1% = ~capabilitySet
2% = capabilitySet()
1.5% capabilitySet::unionWith()
0.8% capabilitySet::join()
1.5% auxillary info for debugging
~0.5-1% extra visitor overhead
~5% total for the visitor
~6.5% for total runtime costs
* fix caps which were wrong but worked
* push minor syntax fix (still looking for why other tests fail)
* perf & bug fixes
1. did not properly remake isBetterForTarget for this->empty case with that as Invalid. This is best case in this senario.
2. Remade seralizer for stdlib generation. Faster (more direct) & cleaner code.
NOTE: did not address review comments
* fix glsl.meta caps error
* fixing findBest logic again & UIntSet wrapper
findBest was not checking for 'more specialized' targets & was element counter was flawed
* faster getElements algorithm + natvis for UIntSet + wrong warning
* type incompatability of bitscanForward implementations
* try to fix warnings again
* remove ptr for clang intrinsic
* add missing header
* ifdef to allow clang compile
* compiler hackery to fix up platform/type independent operations
* bracket
* fix MSVC error
* missing template
* change types out again
* changes to fix compiling
* adjustment to parameter for Clang/GCC
* added iterator to delay processing all atomSets of a CapabilitySet
* add a few missing consts's
* ensure we never have more than 1 disjointSet
Added a wrapper + assert + union functionality to all possible disjoint sets. This was done in favor of a removal of the LinkedList for 2 reasons:
1. We still need 0-1 set functionality.
2. Might as well keep the code, just disallow the problematic functionality.
* address review comments
non linked-list refactor review comments addressed; add doc comments + remove redundant code
* comments + remove isValid for bool operator
* push removal of linkedlist for capabilities
* add missing break
* address review comments
minor adjustments of syntax
* push a fix to the `CapabilitySet({shader, missing target})` code
* quality + error
1. add iterator to UIntSet
2. do not specialize target_switch if profile is derived from case (GLSL_150 is not compatable with GLSL_400)
* fix target_switch erroring + temporarily remove UIntSet::Interator
temporarily remove UIntSet::Interator. It will be added after, testing code on CI first so I can multi-task fixing the UIntSet Iterator
* fix the UIntSet iterator
* Revert "fix the UIntSet iterator" temporarily to pull from master
* add metal error as per texture.slang
(took a while I realize this was why things were breaking, likely should adjust errors to reflect this)
* Rework UIntSet to have a template for output type
This is done so it is reasonable to debug the iterator output and not just dealing with messy int's
Fix problems with the iterators implemented + invalid capabilities handling
* removed incorrect `__target_switch` capability
barycentric was being used with anticipation of `profile glsl450`, this does not expand into `GL_EXT_fragment_shader_barycentric`, this instead caused an error which is hidden during cross-compile.
* remove some uses of getElements
* remove undeclared_stage for now
* remove redundant code associated with `undeclared_stage`
* remove unused variable
* address review
specifically to note removed static in a thread dangerous scope. Now using a `const static` for read only (thread safe) which precompile steps generate
* move GLSL_150 capdef change to sm_4_1 (more accurate)
* address most review comments
did not address: https://github.com/shader-slang/slang/pull/4145#discussion_r1602256776
* revert incorrect code review suggestion
* push changes for all code review suggestions
|
|
|
|
* `slangc` tool experience improvements.
Fixes #4123.
Fixes #4127.
* Update doc.
|
|
* Add host shared library target.
* Attempt fix.
* Fix warnings.
* try fix.
* Fix test.
* Fix.
|
|
In current implementation, the some options will be to added
to the target that is only specified by command line "-target".
But if user specifies the target by just using slang API,
e.g. 'spAddCodeGenTarget', those options will be missed.
To fix the problem, we copy the default target's options to
the code-gen target's option set. The default target will only be
useful when there is no target specified in the command line.
|
|
* Fix compile failures when using debug symbol.
* Various fixes.
* Fix intrinsic.
* Fix test.
|
|
CUDA-related targets (#4063)
|
|
Do not use "&&" to implement the intrinsic kIROp_And,
instead define a 'and' function in stdlib. So it will be
up to us to determine whether we want to use 'short-circuit'
behavior in stdlib.
|
|
Add option -disable-short-circuit to disable short circuit for logic
operators && and ||. Also, disable the short circuit by default in the
stdlib.
|
|
* Metal: Vertex/Fragment builtin and layouts.
* Fix.
* Fix test.
* Emit user semantic on vertex/fragment attributes.
|
|
* Add metal downstream compiler + metallib target.
* Add more comments.
* Add missing override.
|
|
|
|
Fix the issue that 'spGetDependencyFilePath' will report "unknown"
for the source code is from string. We only reported valid file path
when the source code is file a file, so we change that to report a valid
file name even when the source code is from the string.
|
|
|
|
|
|
attribute. (#3914)
* Allow COM based API to discover and check entrypoints without [shader] attribute.
* Undo changes.
* More comments.
|
|
|
|
validation. (#3784)
* Fix name mangling.
* Fix source validation.
* Caching and search path fixes.
|
|
* Fix `sessionDesc.defaultMatrixLayoutMode` being ineffective.
* Fix matrix layout in buffer pointer.
* Attempt to fix.
* Fix buffer element type lowering for buffer pointers.
* Add comment.
* Fix test.
* Fix member lookup in `Ref<T>`.
* Fix validation error.
* Enhance test.
|
|
* Fix method synthesis logic for static differentiable methods.
* Support link-time constants in thread group size reflection.
|
|
* Fix parsing logic of `struct` decl.
Fixes #3716.
* Allow `loadModule` to find modules with underscores.
* Fix test.
|
|
* Link-time constant and linkage API improvements.
* Fix.
* Allow module name to be empty.
* Fix.
* Fix.
* Fix compile error.
|
|
* Add `IGlobalSession::getSessionDescDigest`.
* Fix.
|
|
* [SPIRV] Add NonSemanticDebugInfo for step-through debugging.
* Fix.
* Fix.
|
|
* Allow default values for `extern` symbols.
* Fix.
* Fix test.
|
|
* Add slangc interface to compile and use ir modules.
* Fix glsl scalar layout settings not copied to target.
* Fix.
* Cleanups.
|
|
|
|
* Fix SPIRV pointer lowering issue.
Fixes #3605.
* Add another pointer test.
Fixes #3601.
* Fixes #3600.
* Fix #3595.
|
|
* Language server robustness fix.
* Allow parameter name to be the same as its type.
* fix
* Fix test.
|
|
* Refactor compiler option representation.
* Fix binary compatibility.
* Add a test for specifying compiler options at link time.
* Fix binary compatibility.
* Fix binary compatibility.
* Fix backward compatibility on matrix layout.
* Fix.
* Fix.
* Fix.
* Fix gfx.
* Fix gfx.
* Fix dynamic dispatch.
* Polish.
|
|
* Support loading serialized modules.
* Fix.
* Fix vs solution files
* Fix glsl module loading.
* C++ fix.
* Fix.
* Try fix c++ error.
* Try fix.
* Fix.
* Fix.
|
|
* Unify GLSL and HLSL buffer block parsing.
Automatic GLSL module recognition.
* Fix.
|
|
* Capability type checking.
* Fix.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Fix atomics intrinsics, increase kMaxDescriptorSets.
* Add SPIRVASM to known non-differentiable insts.
* Support fp16 wave ops when targeting glsl.
* Fixes.
* Fix vk validation errors.
* Fix.
* Add to allowed failures.
|
|
* FP16 atomics for RWByteAddresBuffer, fp32 atomics for images.
* Fix spelling.
* Add overload.
* Fix test failures.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Add slangc option to specialize entrypoint.
* Auto enable glsl mode when input file has glsl extension name.
* Fix test.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Capability def parsing & codegen + disjoint sets
This change adds a capability definition file, and a code generator
to produce C++ code that defines the capability enums and necessary
data structures around the capabilities.
Extends the existing CapabilitySet class to support expressing
disjoint sets of capabilities. This sets up for the next change
that will enhance our type checking with reasoning of capability
requirements.
* Fix cmake.
* Fix warning.
* Fix.
* Fix isBetterForTarget to prefer less specialized option.
* Fix.
* Fix premake.
* Fix intrinsic.
* Fix vs sln file.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
|
|
`spGetEntryPointCodeBlob` if defined in a serialized module. (#3431)
|
|
* Handle `import`, entrypoint and global params in included files.
* Fix language server.
* Extend `_createScopeForLegacyLookup` for `__include`.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Update behavior around interfaces and docs.
* Update toc
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Support `include` for pulling file into the current module.
* Add auto-completion, hover info and goto-def support.
* Disable warning for missing `module` declaration for now.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Parse glsl buffer blocks to GLSLInterfaceBlockDecl
* Parse glsl local size layout declarations
* Parse (and ignore) glsl version directives
* spelling
* Better l-value interpretation for glsl interface blocks
* Better l-value interpretation for glsl interface blocks
* Add compile flag for enabling glsl
* Parse and ignore precision modifiers.
* Automatically import `glsl` module for compatiblity.
* Complete vector and matrix types for glsl
* Remove generated file from repo
* Bump .gitignore
* do not mark out globals as params
* Synthesize entrypoint layout from global inout vars.
* update test result.
* Allow HLSL semantic on global variables.
* Fix.
* Fix test.
* Fix win32 compile error.
* Add more builtin input/output and texture intrinsics.
* Add struct/array constructor syntax.
* Skip `#extension` lines.
* overide operator * for matrix/vector multiplication.
* Add `matrixCompMult`.
* Parse modifiers in for loop init var declr.
* Add more glsl intrinsics, add stage into to var layout.
* Allow `int[3] x` syntax.
* Fix array type syntax.
---------
Co-authored-by: Ellie Hermaszewska <ellieh@nvidia.com>
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Report spirv-opt time.
* Removing timing logic in `loadModule`.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Fix issue with failing tests
tests/serialization/serialized-module-test.slang
tests/serialization/extern/extern-test.slang
* Fix issue with session destruction order on Session.
* Improve comment.
|
|
* Fix GLSL output for `gl_ClipDistance` input builtin.
* Fix.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Add -spirv-core-grammar option to load alternate spirv defs
Also embed a version to use by default
* Use perfect hash for spv op lookup
* Neaten perfect hash embedding
* Refactor spirv grammar lookup in preperation for more kinds of lookups
* Load spirv capability list from spec
* Add all SPIR-V enums to lookup table
* regenerate vs projects
* appease msvc
* Use string slices for spir-v core grammar lookups
* wiggle
* comment
* Add OpInfo for spv ops
* regenerate vs projects
* Embed op names
* Add min/max operand counts and enum categories to spirv info
* neaten
* Operand kinds for spirv ops
* Store and embed all information relating to spirv enums and qualifiers
* Use SPIR-V spec to position instructions in spirv_asm blocks
* Neaten spir-v info embedding
* Neaten perfect hash embedding
* Add assignment syntax to spirv_asm snippets
* Better errors for spirv_asm parser
* Add warning for too many operands in spirv asm
* squash warnings
* neaten
* test wiggle
* Lookup enums for spirv
* Put OpCapability and OpExtension in the correct place for spirv_asm blocks
* Tests for OpCapability and OpExtension
* ci wiggle
* Add expected failure
* Allow raising immediate values to constant ids where necessary in spirv_asm blocks
* Allow bitwise or expressions and numeric literals in spirv_asm blocks
* test numeric literals
* Fix memory issues.
* fix.
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|