| Age | Commit message (Collapse) | Author |
|
|
|
* Move switch statement bodies to their own lines
* format
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
* format
* Minor test fixes
* enable checking cpp format in ci
|
|
(#5415)
This commit changes the word "stdlib" or "standard library" to "core module" in the source code.
|
|
* Squash redundant move warnings
* Move C interface from slang.h to slang-deprecated.h
spGetBuildTagString remains, because it's useful to have before the
global session exists.
This C API is used quite pervasively in the C++ helpers (for example
slang::UserAttribute. It's not trivial to move these to
slang-deprecated.h as they're entangled with some enums which are
themselves used elsewhere in the compiler.
The fact that these helpers use the C API can be viewed as an
implementation detail for now, and this usage moved to slang-deprecated
in due course.
Closes https://github.com/shader-slang/slang/issues/4758
* Squash warnings for our usage of our deprecated API
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
* Use the assembly description as target when disassembling
I believe this is a bugfix.
It seems to have worked before because up until the WGSL case, the disassembler has been
the same executable as the one producing the binary to be disassembled.
* Add Tint as a downstream compiler
This closes issue #5104.
* Add downstream compiler for Tint.
* Tint is wrapped in a shared library, 'slang-tint' available from [1].
* The header file for slang-tint.dll is added in external/slang-tint-headers.
* Add some boilerplate for WGSL targets.
* Add an entry point test for WGSL.
[1] https://github.com/shader-slang/dawn/releases/tag/slang-tint-0
* Add WGSL_SPIRV as supported target for Glslang
* Add WebGPU support to slang-test
This helps to address issue #5051.
* Disable lots of crashing compute tests for 'wgpu'
This closes issue #5051.
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
* Implemented Combined-texture for WGSL
* Remove unnecessary comment
* Limit to std430 layout
* Fix compiler warning for unused variable
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
* Transferring source locations when creating phi instructions
* Tracking for simple variables
* Deriving source locations for loop counters
* Printing checkpoint structure breakdown
* More readable output format
* Special behavior for loop counters
* Writing report to file
* Add slangc option to enable checkpoint reports
* Display types of checkpointed fields
* Message in case there are no checkpointing contexts
* Catch source locations for function calls
* Source cleanup
* Fix compilation warnings
* Remove stray dump()
* Provide the report through diagnostic notes
* Add missing path for sourceLoc during unzip pass
* Add tests for reporting intermediates
* Include more transfer cases for source locations
* Fix ordering in address elimination
* Fill in more holes with source location transfer
* Remove debugging line
* Reverting changes to diagnostic sink
* Simplify address elimination using source location RAII contexts
* Eliminating manual source loc transfers in forward transcription
* Fix local var adaptation to use RAII location setter
* Simplify primal hoisting logic for source location transfer
* Simplify unzipping with RAII location scopes
* Simplify transpose logic
* Cleaning up for rev.cpp
* Reverting spacing changes
* Fix mistake with source loc RAII instantiation
* Fix formatting issues
|
|
* Add WGSL as a target
This is required for #4807.
* C-like emitter: Allow the function header emission to be overloaded
WGSL-style function headers are pretty different from normal C-style headers:
Normal C-style headers:
ReturnType Func(...)
void VoidFunc(...)
WGSL-style headers:
fn Func(...) -> ReturnType
fn VoidFunc(...)
This change allows the header style to be overloaded, in order to accomodate WGSL-style
headers as required to resolve issue #4807, but retains normal C-style headers as the
default implementation.
[1] https://www.w3.org/TR/WGSL/#function-declaration-sec
* C-like emitter: Allow emission of switch case selectors to be overloaded
The C-like emitter will emit code like this:
switch(a.x)
{
case 0:
case 1:
{
...
} break;
...
}
This is not allowed in WGSL. Instead, selectors for cases that share a body must [1] be
separated by commas, like this:
switch(a.x)
{
case 0, 1:
{
...
} break;
...
}
To prepare for addressing issue #4807, this patch makes the emission of switch case
selectors overloadable.
[1] https://www.w3.org/TR/WGSL/#syntax-case_selectors
* C-like emitter: Support WGSL-style declarations
This patch helps to address issue 4807.
C-like languages declare variables like this:
i32 a;
WGSL declares variables like this:
var a : i32
The patch introduces overloads so that the forthcoming WGSL emitter can output WGSL-style
declarations, which helps to resolve #4807.
* C-like emitter: Support overloading of declarators
Unlike C-like languages, WGSL does not support the following types at the syntax level,
via declarators:
- arrays
- pointers
- references
For this reason, this patch introduces support for overloading the declarator emitter,
in order to help address issue #4807.
C-like languages:
int a[3]; // Array-ness of type is mixed into the "declarator"
WGSL:
var a : array<int, 3>; // Array-ness of type is part of the... type_specifier!
* C-like emitter: Allow struct declaration separator to be overridden
C-like languages use ';' as a separator, and languages like e.g. WGSL use ','.
This change prepares for addressing issue #4807.
* C-like emitter: Allow overriding of whether pointer-like syntax is necessary
Things like e.g. structured buffers map to "ptr-to-array" in WGSL, but ptr-typed
expressions don't always need C-style pointer-like syntax.
Therefore, make it overrideable whether or not such syntax is emitted in various cases in
order to address #4807.
* C-like emitter: Emit parenthesis to avoid warning about & and + precedence
This helps with #4807 because WGSL compilers (e.g. Tint) treat absence of parenthesis as
an error.
* C-like emitter: Add hook for emitting struct field attributes
WGSL requires @align attributes to specify explicit field alignment in certain cases.
Thus, this patch prepares for addressing #4807.
* C-like emitter: Add hook for emitting global param types
Declarations of structured buffers map to global array declarations in WGSL.
However, in all other cases such as when structured buffers are used in operands, their
types map to *ptr*-to-array.
This patch makes it possible for the WGSL back-end to say that structured buffers
generally map to "ptr-to-array" types, but still have a special case of just "array" when
declaring the global shader parameter.
Thus, this patch helps with addressing #4807.
* IR lowering: Use std140 for WGSL uniform buffers
This patch just cuts out some logic that prevented std140 to be chosen for WGSL uniform
buffers.
Note that WGSL buffers in the uniform address space is not quite std140, but for now it's
close enough to avoid compile issues.
Later on, a custom layout should be created for WGSL uniform buffers.
When that's done, this change will be revisited, but for now it helps to resolve #4807.
* Don't emit line directives in WGSL by default
WGSL does not support line directives [1].
The plan currently seems to be to instead support source-map [2].
This is part of addressing issue #4807.
[1] https://github.com/gpuweb/gpuweb/issues/606
[2] https://github.com/mozilla/source-map
* WGSL IR legalization: Map SV's
The implementation closely follows the cooresponding one for Metal.
Supported:
- DispatchThreadID
- GroupID
- GroupThreadID
- GroupThreadID
Unsupported:
- GSInstanceID
This is not complete, but it helps to address #4807.
* WGSL emitter: Add support for basic language constructs
A lot of the basics are added in order to generate correct WGSL code for basic Slang language constructs.
This addresses issue #4807.
This adds support for at least the following:
- statments
- if statements
- ternary operator
- while statement
- for statements
- variable declarations
- switch statements
- Note: Slang may emit non-constant case expressions, see issue 4834
- literals
- integer literals
- u?int[16|32|64]_t
- float and half literals
- bool literals
- vector literals and splatting (e.g 1.xxx)
- function definitions
- assignments
- +=, *=, /=
- array assignments
- vector assignments/updates
- swizzles of other vectors
- from matrix rows ('m[i]' notation)
- from matrix cols (using swizzle notation, e.g 'm._11_12_13')
- matrix assignments/updates
- to rows ('m[i]' notation)
- to cols (using swizzle notation, e.g 'm._11_12_13')
- declarations
- arrays
[1] https://www.w3.org/TR/WGSL/#syntax-switch_body
* Add some WGSL capabilities
This patch registers some WGSL capabilities required to pass many of the initial compute
shader compile tests.
Many capabilities still remain to be added -- this is just an initial set to help resolve
issue #4807.
- asint
- min and max
- cos and sin
- all and any
* WGSL and C-like emitters: Add hack to bitcast case expression
In WGSL, the switch condition and case types must match.
https://www.w3.org/TR/WGSL/#switch-statement
Slang currently allows these types to mismatch, as pointed out in #4921.
Issue #4921 should eventually be addressed in the front-end by a patch like [1].
However, at the moment that would break Falcor tests.
Thus, this patch temporarily works around the issue in the WGSL emitter only in order to
help resolve #4807.
In the future, the Falcor tests should be fixed, this patch should be dropped and [1]
should be merged instead.
[1] a32156ef52f43b8503b2c77f2f1d51220ab9bdea
|
|
* Initial -embed-spirv support
Add support for SPIR-V precompilation using the framework
established for DXIL.
Work on #4883
* SLANG_UNUSED
* Add linkage attributes to exported spirv functions
* Combine DXIL and SPIRV paths
* Whitespace fix
* Merge remaining precompiled spirv/dxil paths
* Change inst accessors to return codegentarget
* Add unit test for precompiled spirv
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
* Support entrypoints defined in a namespace.
* Fix test.
|
|
* Support mixture of precompiled and non-precompiled modules
This changes the implementation of precompile DXIL modules to
accept combinations of modules with precompiled DXIL, ones without,
and ones with a mixture of precompiled DXIL and Slang IR.
During precompilation, module IR is analyzed to find public functions
which appear to be capable of being compiled as HLSL, and those
functions are given a HLSLExport decoration, ensuring they are emitted
as HLSL and preserved in the precompiled DXIL blob. The IR for those
functions is then tagged with a new decoration AvailableInDXIL, which
marks that their implementation is present in the embedded DXIL blob.
The DXIL blob is attached to the IR as before, inside a EmbeddedDXIL
BlobLit instruction.
The logic that determines whether or not functions should be
precompiled to DXIL is a placeholder at this point, returning true
always. A subsequent change will add selection criteria.
During module linking, the full module IR is available, as well
as the optional EmbeddedDXIL blob. The IR for functions implemented
by the blob are tagged with AvailableInDXIL in the module IR.
After linking the IR for all modules to program level IR, the IR for
the functions marked AvailableInDXIL are deleted from the linked IR,
prior to emitting HLSL and compiling linking the result.
This change also changes the point of time when the module IR is
checked for EmbeddedDXIL blobs. Instead of happening at load time
as before, it happens during immediately before final linking, meaning
that the blob does not need to be independently stored with the module
separate from the IR as was done previously.
Work on #4792
* Clean up debug prints
* Call isSimpleHLSLDataType stub
* Address feedback on precompiled dxil support
Allow for IR filtering both before and after linking.
Only mark AvailableInDXIL those functions which pass
both filtering stages. Functions are corrlated using
mangled function names.
Rather than delete functions entirely when linking with
libraries that include precompiled DXIL, instead convert
the IR function definitions to declarations by gutting
them, removing child blocks.
* Use artifact metadata and name list instead of linkedir hack
* Use String instead of UnownedStringSlice
* Update tests
* Renaming
* Minor edits
* Don't fully remove functions post-link
* Unexport before collecting metadata
|
|
* Allow capabilities to be used with `[shader("...")]`
Fixes: #4917
Changes:
1. Allow using capabilities instead of `Stage`s with `EntryPointAttribute`.
2. When resolving capabilities for an entrypoint+profile (per entrypoint) in `resolveStageOfProfileWithEntryPoint` add our `EntryPointAttribute` and resolved capability
3. Added tests and some capabilities related clean-up
* fix a warning made by a mistake in syntax
* change fineStageByName to assume it is passed a stage without a '_'
* test with and without prefix '_'
* cleanup some profiles and reprisentation to work better with 'Stage' and 'Profile'
This use case is why we need to clean all profile-usage into `CapabilityName`s directly.
* change how we compare
* only change profiles
* let all capabilities be resolved by 'shader' profile for now
* fix warning checks I accidently broke
* meshshading_internal to _meshshading
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
Adds a new Github CI action for benchmarking the slangc compiler on the MDL shaders. For now, the results are only dumped to the output of the CI, which can be later viewed through raw logs. The next step is to use github-action-benchmark to push these results into a page which will show the benchmark results over time as commits are pushed.
|
|
* Add embedded precompiled binary IR ops
Add IR operations to embed precompiled DXIL or SPIR-V blobs
into IR. Adds a BlobLit literal that is mostly identical to
StringLit except for its inability to be displayed, e.g.
in dumped IR. In the future, the blob might be dumped as
hexadecimal, but for now it is summarized as "<binary blob>".
* EmbeddedDXIL and SPIR-V options
The options, '-embed-dxil' and '-embed-spirv' in slangc, will
cause a target dxil or spirv to be compiled and stored in the
translation unit IR when written to a slang-module. Subsequent
changes actually implement the options.
* Per-translation unit DXIL precompilation
When -embed-dxil is specified, perform a precompilation to DXIL of
each TU, linked only with stdlib. Embed the resulting DXIL for
the TU in a IR op. Being part of IR, the precompiled DXIL can be
serialized to disk in a slang-module.
Upon loading slang-modules, the new IR op will be searched for and
the precompiled DXIL blob is saved with the loaded Module. During
linking, if all the Modules have precompiled blobs they will be
sent to the downstream compile commands as libraries instead of
source, skipping the downstream compilation, using DXC only for
linking.
Fixes Issue #4580
* Remove placeholder embedded SPIRV support
Code was added only to sketch out how other precompiled bins
will be supported.
* Remove the rest of the SPIRV placeholder support
* Fix warnings, test error on non-windows
* Remove lib_6_6 hack, add dxil_lib capability
* Allocate blob value from irmodule memarena
* Add null check after memarena allocation
* Restore the request->e2erequest code path for generatewholeprogram
* Update capability handling, move EmbedDXIL enum to end to preserve abi
* Remove lib_6_6 hack
* Move ICompileRequest functions to end
|
|
entry-point (#4670)
* Fixes #4656
Changes:
1. Setting a profile via slangc no-longer sets an entry-point target-stage, this is to allow slangc to follow how the SLANG-API works (else `main` is assumed to be the default entry-point)
2. If the stage specified by a profile is not equal to the stage specified by a entry-point, we throw a capability error.
3. Resolving the stage of an entry point was changed to function (mostly) equally for when 0 entry-points are specified versus to when there are 1 or more.
4. changed capabilitySet Iterator so it is invalid if backing data is nullptr (although this should never happen, it would stop crashes in the worst case).
* remove the breaking change since it likely is going to be a lot more than just a simple change due to the implicit `main` and stage through `profile` code.
* print out profile name with errors
* use target's profile for printing
* change logic to print warning in a different method (account for more cases)
* set unknown stages
|
|
* Add API for querying dependency files on IModule
* return nullptr
|
|
|
|
* capability upgrade warning/error
adjusted implementation + tests to support a warning/error if capabilities are implicitly upgraded and test accordingly.
* add glsl profile caps
* add GLSL and HLSL capabilities to the associated capability
* syntax error in capdef
* only error if user explicitly enables capabilities
1. changed testing infrastructure to not set a `profile` explicitly,
2. Added tests to be sure this works as intended with user API and with slangc command line
* Change capability atom definitions and how Slang manages them to fix errors
1. most `glsl_spirv` version atoms have been removed from `.capdef`, instead we will translate `spirv` version atoms into `glsl_spirv` since there is no point in writing the same code twice in `.capdef` files to define `spirv` versions.
2. add spirv version, and hlsl sm version (and equivlent) capability dependencies
3. removed some stage requirments which were set on objects, keep the wrapper capabilities. I am keeping the wrapper capabilities since I am unaware on if there are stage limitations (spec says code in practice does not work).
* check internal version instead of version profile (_spirv_1_5 vs. spirv_1_5)
* remove unused OpCapability. adjust SPIRV version'ing again for glsl_spirv
* apply workaround for glslang bug with rayquery usage
* ensure capabilities targetted by a profile and added together by a user are valid
* remove additions to `spirv_1_*` wrapper
* spirv_* -> glsl_spirv fix
* fix bug where incompatable profiles would cause invalid target caps
* try to avoid joining invalid capabilities
* fix the warning/error & printing
* run through tests to fix capability system and test mistakes
many mistakes were mesh shaders doing `-profile glsl_450+spirv_1_4`. This is not allowed for a few reasons
1. the test tooling does not handle arguments the same as `slangc`
2. glsl_450 core profile does not support mesh shaders, nor does spirv_1_4. sm_6_5 does work in this senario
* set some sm_4_1 intrinsics to sm_4_0
* replace `GLSL_` defs with `glsl_`
* swap the unsupported render-test syntax for working syntax
* set d3d11/d3d12 profile defaults
this is required since sm version changes compiled code & behavior
* adjusted nvapi capabilities with atomics + d3d11 set to use sm_5_0 as per default
* cleanup
* address review
* incorrect styling
* change `bitscanForward` to work as intended on 32 bit targets
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
|
|
* SPIRV `Block` decoration fixes.
- SPIRV does not allow duplicate `Block` decorations. So we shouldn't be generating them.
- Also fixes duplication of OpName.
- SPIRV and HLSL do not allow ConstantBuffer with trailing unsized arrays. Added a check in the front-end against such code.
* Convert failing cross-compile tests to filecheck.
---------
Co-authored-by: Jay Kwak <82421531+jkwak-work@users.noreply.github.com>
|
|
|
|
* fix all Clang-14 warnings
* remove a clang-14 warning fix because it is a MSVC warning...
|
|
* Capabilities System, Backing Logic Overhaul
Fixes #4015
Problems to address:
1. Currently the capabilities system spends anywhere from 25-50% of compile time on the CapabilityVisitor. Most of this time is spent on join logic: 1. Finding abstract atoms 2. Comparing list1<->list2. This should and can be made significantly faster.
2. Error system does not produce errors with auxiliary information. This will require a partial redesign to provide more useful semantic information for debugging.
What was addressed:
1. Array backed `CapabilityConjunctionSet` was replaced in-favor for a `UIntSet` backed `CapabilityTargetSets`. The design is described below.
Design:
* `CapabilityTargetSets` is a `Dictionary<targetAtom, CapabilityTargetSet>`. This is not an array for 2 reasons: 1. Easy to figure out which target is missing between two `CapabilityTargetSets` 2. To statically allocate an array requires the preprocessor to manually annotate which Capability is a target and link that Capability to an index. This means a dictionary is required for lookup regardless of implementation.
* `CapabilityTargetSet` is an intermediate representation of all capabilities for a singular `target` atom (`glsl`, `hlsl`, `metal`, ...). This structure contains a dictionary to all stage specific capability sets for fast lookup of stage capabilities supported by a `CapabilitySet` for a `target` atom. This reduces number of sets searched.
* `CapabilityStageSet` is an intermediate representation of all capabilities for a singular `stage` atom (`vertex`, `fragment`, ...). This structure holds all disjoint capability sets for a `stage`. A disjoint set is rare, but may exist in some scenarios (as an example): `{glsl, EXT_GL_FOO}{glsl, _GLSL_130, _GLSL_150}`. This reduces the number of sets searched.
* `UIntSet` is the main reason for the redesign for better performance and memory usage. All set operations only require a few operations, making all set logic trivial and with minimal cost to run. All algorithms were modified to focus around `UIntSet` operations.
2. Errors
* Semantic information are now better linked to the calling function to provide a connection of function<->function_body for when saving semantic information for errors.
* Missing targets now print errors much like other error code by finding code which could be a cause of incompatibility.
What is missing:
1. Add non naive support for non-stage specific capabilities such as `{hlsl, _sm_5_0}`. Currently non stage specific targets emulate the behavior through assigning such capabilities to every stage: `{hlsl, _sm_5_0, vertex} {hlsl, _sm_5_0, fragment}...`. Removal of this behavior would remove redundant shader stage sets being made at construction time (~80% of new implementation runtime). This is an addition, not an overhaul.
2. Optionally: `UIntSet` should be modified to support SIMD operations for significantly faster operations. This is not required immediately since `UIntSet` is already not a performance constraint.
Notes:
* UIntSet had implementation bugs which were fixed in this PR.
* The old capabilities system had bugs which were fixed in this PR when transforming to the new implementation.
* fix .natvis debug view
* Small optimizations I found while working on the addition
the AST building pass looks like so now:
1% = ~capabilitySet
2% = capabilitySet()
1.5% capabilitySet::unionWith()
0.8% capabilitySet::join()
1.5% auxillary info for debugging
~0.5-1% extra visitor overhead
~5% total for the visitor
~6.5% for total runtime costs
* fix caps which were wrong but worked
* push minor syntax fix (still looking for why other tests fail)
* perf & bug fixes
1. did not properly remake isBetterForTarget for this->empty case with that as Invalid. This is best case in this senario.
2. Remade seralizer for stdlib generation. Faster (more direct) & cleaner code.
NOTE: did not address review comments
* fix glsl.meta caps error
* fixing findBest logic again & UIntSet wrapper
findBest was not checking for 'more specialized' targets & was element counter was flawed
* faster getElements algorithm + natvis for UIntSet + wrong warning
* type incompatability of bitscanForward implementations
* try to fix warnings again
* remove ptr for clang intrinsic
* add missing header
* ifdef to allow clang compile
* compiler hackery to fix up platform/type independent operations
* bracket
* fix MSVC error
* missing template
* change types out again
* changes to fix compiling
* adjustment to parameter for Clang/GCC
* added iterator to delay processing all atomSets of a CapabilitySet
* add a few missing consts's
* ensure we never have more than 1 disjointSet
Added a wrapper + assert + union functionality to all possible disjoint sets. This was done in favor of a removal of the LinkedList for 2 reasons:
1. We still need 0-1 set functionality.
2. Might as well keep the code, just disallow the problematic functionality.
* address review comments
non linked-list refactor review comments addressed; add doc comments + remove redundant code
* comments + remove isValid for bool operator
* push removal of linkedlist for capabilities
* add missing break
* address review comments
minor adjustments of syntax
* push a fix to the `CapabilitySet({shader, missing target})` code
* quality + error
1. add iterator to UIntSet
2. do not specialize target_switch if profile is derived from case (GLSL_150 is not compatable with GLSL_400)
* fix target_switch erroring + temporarily remove UIntSet::Interator
temporarily remove UIntSet::Interator. It will be added after, testing code on CI first so I can multi-task fixing the UIntSet Iterator
* fix the UIntSet iterator
* Revert "fix the UIntSet iterator" temporarily to pull from master
* add metal error as per texture.slang
(took a while I realize this was why things were breaking, likely should adjust errors to reflect this)
* Rework UIntSet to have a template for output type
This is done so it is reasonable to debug the iterator output and not just dealing with messy int's
Fix problems with the iterators implemented + invalid capabilities handling
* removed incorrect `__target_switch` capability
barycentric was being used with anticipation of `profile glsl450`, this does not expand into `GL_EXT_fragment_shader_barycentric`, this instead caused an error which is hidden during cross-compile.
* remove some uses of getElements
* remove undeclared_stage for now
* remove redundant code associated with `undeclared_stage`
* remove unused variable
* address review
specifically to note removed static in a thread dangerous scope. Now using a `const static` for read only (thread safe) which precompile steps generate
* move GLSL_150 capdef change to sm_4_1 (more accurate)
* address most review comments
did not address: https://github.com/shader-slang/slang/pull/4145#discussion_r1602256776
* revert incorrect code review suggestion
* push changes for all code review suggestions
|
|
* Add host shared library target.
* Attempt fix.
* Fix warnings.
* try fix.
* Fix test.
* Fix.
|
|
CUDA-related targets (#4063)
|
|
* Add metal downstream compiler + metallib target.
* Add more comments.
* Add missing override.
|
|
|
|
|
|
* Add slangc interface to compile and use ir modules.
* Fix glsl scalar layout settings not copied to target.
* Fix.
* Cleanups.
|
|
* Refactor compiler option representation.
* Fix binary compatibility.
* Add a test for specifying compiler options at link time.
* Fix binary compatibility.
* Fix binary compatibility.
* Fix backward compatibility on matrix layout.
* Fix.
* Fix.
* Fix.
* Fix gfx.
* Fix gfx.
* Fix dynamic dispatch.
* Polish.
|
|
* Support loading serialized modules.
* Fix.
* Fix vs solution files
* Fix glsl module loading.
* C++ fix.
* Fix.
* Try fix c++ error.
* Try fix.
* Fix.
* Fix.
|
|
* Capability type checking.
* Fix.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Fix atomics intrinsics, increase kMaxDescriptorSets.
* Add SPIRVASM to known non-differentiable insts.
* Support fp16 wave ops when targeting glsl.
* Fixes.
* Fix vk validation errors.
* Fix.
* Add to allowed failures.
|
|
* Capability def parsing & codegen + disjoint sets
This change adds a capability definition file, and a code generator
to produce C++ code that defines the capability enums and necessary
data structures around the capabilities.
Extends the existing CapabilitySet class to support expressing
disjoint sets of capabilities. This sets up for the next change
that will enhance our type checking with reasoning of capability
requirements.
* Fix cmake.
* Fix warning.
* Fix.
* Fix isBetterForTarget to prefer less specialized option.
* Fix.
* Fix premake.
* Fix intrinsic.
* Fix vs sln file.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* More robust input and output selection in generator tools
* Add cmake build system
* Get slang-test running with cmake
* Bump lz4 and miniz dependencies
* Make cmake build more declarative
* Correct preprocessor logic in slang.h
* Add cuda test to compute/simple
* Remove empty cmake files
* output placement for cmake, and commenting
* Correct include paths in spirv-embed-generator
* Format cmake with gersemi
* Make cmake build clerer
* Neaten header generation
Also work around https://gitlab.kitware.com/cmake/cmake/-/issues/18399
by introducing correct_generated_properties to set the GENERATED flag in
the correct scope
* remove unused files
* use 3.20 to set GENERATOR property properly
* spelling
* more flexible linker arg setting
* replace slang-static with obj collection
* Set rpath and linker path correctly
* neaten generated file generation
* tests working with cmake build
* fix premake5 build
* comment and neaten cmake
* remove unnecessary dependency
* Build aftermath example only when aftermath is enabled
* Add slang-llvm and other dependencies
* Put modules alongside binaries
* Find slang-glslang correctly
* Better option handling
* comments
* add llvm build test
* Better option handling
* cmake wobble
* use UNICODE and _UNICODE
* remove other workflows
* use ccache
* neaten
* limit parallel for llvm build
* use ninja for build
* Windows and Darwin slang-llvm builds
* cache key
* verbose llvm build
* cl on windows
* sccache and cl.exe
* use cl.exe
* Correct package detection
* less verbosity
* Simplify miniz inclusion
* fix build with sccache
* Neaten llvm building
* neaten
* Neaten slang-llvm fetching
* more surgical workarounds
* Add ci action
* Get version from git
* better variable naming
* add missing include
* clean up after premake in cmake
* more docs on cmake build
* ci wobble
* add imgui target
* more selective source
* do not download swiftshader
* Some missing dependencies
* only build llvm on dispatch
* Disable /Zi in CI where sccache is present
* simplify
* set PIC for miniz
* set policies before project
* reengage workaround
* more runs on ci
* Add cmake presets
* Add cpack
* move iterator debug level to preset
* Correct lib flag
* simplify action
* Neaten cmake init
* Add todo
* Add simple test wrapper
* Add tests to workflow presets
* rename packing preset
* Correctly set definitions
* docs
* correct preset names
* Make slang-test depend on test-server/test-process
* neaten
* use workflow in actions
* install docs
* Correct module install dir
* debug dist workflow
* Install headers
* neaten header globbing
* Neaten dependency handling
* make lib and bin variables
* Do not set compiler for vs builds, unnecessary
* docs
* allow setting explicit source for target
* maintain archive subdir
* cmake docs
* install headers
* place targets into folders
* cmake docs
* nest external projects in folder
* remove name clash
* Neater external packages
* meta targets in folder structure
* cleaner slang-glslang dll
* Add missing static directive to slang-no-embedded-stdlib
* more robust module copying
* make slang-test the startup project
* folder tweak
* Make FETCH_BINARY the default on all platforms
* Set DEBUG_DIR
* add natvis files to source
* skip spirv tests
* remove test step from debug dist
* Add build to .gitignore
* redo warnings to be more like premake
* Update imgui
* clean more premake files
* Disable PCH for glslang, gcc throws a warning
* Add /MP for msvc builds
* warning wobble
* Add script to build llvm
* Add slang-llvm and generators components
* Build slang-llvm in ci
* comments
* fetch llvm with git
* better abi approximation for cache
* better sccache key
* formatting
* Correct logic around disabling problematic debug info for ccache
* exclude gcc and clang from windows ci
* Make dist workflows use system llvm
* naming
* restore normal dist builds
* formatting
* run tests in ci
* Correct slang-llvm url setting
* Rely on the system to find the test tool library
* actions matrix wiggle
* cope with OSX ancient bash
* Correct compilers on windows
* more ci debugging
* Correct rpath handling on OSX
* neaten
* correct path to slang-llvm
* Correct rpath separator on osx
* Find slang-llvm correctly
* smoke tests only on osx
* ci wobble
* Give MacOS module a dylib suffix
* get swiftshader correctly
* cope with bsd cp
* remove debug output
* full tests on osx
* ci wobble
* Add some vk tests to expected failures
* simplify ci
* ci wobble
* exclude dx12 tests from github ci
* remove cmake code for building llvm
* warnings
* warnings as errors for cl
* spirv-tools in path
* add aarch64 ci build
* Add SLANG_GENERATORS_PATH option for prebuilt generators
* neaten
* Correct generator target name
* remove yaml anchors because github actions does not support them
* Demote CMake in docs
Also add info on cross compiling
* Restore premake CI
* use minimal ci for cmake
* Write miniz_export for premake build
and .gitignore it
* Mention build config tool options in docs
* Remove redefined macro for miniz
* regenerate vs project
|
|
|
|
* Run curated spirv-opt passes through slang-glslang.
* Cleanup.
* Replace spirv-dis downstream compiler with glslang.
* delete slang-spirv-opt.cpp.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
* Correct namespace for getClockFrequency
* missing const
* Add missing assignment operator
* Remove unused variables
* Return correct modified variable
* Use stable hash code for file system identity
* terse static_assert
* Structured binding for map iteration
* Make (==) and getHashCode const on many structs
* Add ConstIterator for LinkedList
* Replace uses of ItemProxy::getValue with Dictionary::at
* Extract list of loads from gradientsMap before updating it
* Const correctness in type layout
* Add unordered_dense hashmap submodule
* Use wyhash or getHashCode in slang-hash.h
* refactor slang-hash.h
* Use ankerl/unordered_dense as a hashmap implementation
Notable changes:
- The subscript operator returns a reference directly to the value,
rather than a lazy ItemProxy (pair of dict pointer and key)
slang-profile time (95% over 10 runs):
- Before: 6.3913906 (±0.0746)
- After: 5.9276123 (±0.0964)
* 64 bit hash for strings
So they have the same hash as char buffers with the same contents
* Narrowing warnings for gcc to match msvc
* revert back to c++17
* Correct c++ version for msvc
* Use path to unordered_dense which keeps tests happy
* Do not assign to and read from map in same expression
* Remove redundant map operations in primal-hoist
* Split out stable hash functions into slang-stable-hash.h
* 64 bit hash by default
* regenerate vs projects
* Correct return type from HashSetBase::getCount()
* correct width for call to Dictionary::reserve
* Use stable hash for obfuscated module ids
* Signed int for reserve
* clearer variable naming
* Parameterize Dictionary on hash and equality functors
* Allow heterogenous lookup for Dictionary
* missing const
* Use set over operator[] in some places
* Remove unused function
* s/at/getValue
|
|
* Add type layout for structured buffer
* Default to generating spirv directly
* vk test for compute simple
* Add spirv-dis as a downstream compiler
* Emit Array types in SPIR-V
* makevector for spirv
* Dump whole spirv module on validation failure
* register array types
todo, use emitTypeInst
* Neater formatting for unhandled inst printing
* break out emitCompositeConstruct
* Correct array type generation
* neaten
* Allow getElement for vector
* Remove unused
* Allow predicating target intrinsics on types
* Consider functions with intrinsics to have definitions
We need to specialize these if they are predicated on types
* Correct array type generation
* makeArray for spir-v
* replace getElement with getElementPtr for spirv
* Correct translation of field access for spirv
* Push layouts to types for spirv
* Spirv intrinsics * operator now makes a pointer
* Add structured buffer of struct test
* Preserve type layout in spirv structured buffer legalization
* neaten
* makeVectorFromScalar for SPIRV
* placeholder for layouts on param groups
* More type safe spirv op construction
* Know that constants and types only go in one section
* Remove emitTypeInst
* Add todo for spirv sampling
* Add links to spirv documentation on emit functions
* OpTypeImage support for SPIR-V
* Add simpler texture test for spirv
* s/spirv_direct/spirv/g
* Allow several string literals in target_intrinsic
* Handle global params without a var layour for SPIR-V
For example groupshared vars
* uint spirv asm type
* Add todo for isDefinition
It is currently too broad
* Some atomic op spirv intrinsics
* Strip ConstantBuffer wrappers for spirv
* Add todo for matrix annotations
* Do not associate decorations insts with spirv counterparts
* Correct entry point parameter generation
* Spelling
* Assert that fieldAddress is returning a pointer
* Add error for existential type layout getting to spir-v emit
* Add IRTupleTypeLayout
Unused so far
* Allow getElementPtr to work with vectors
* Correct target name in test
* Hide default spirv direct behind a premake option --default-spirv-direct=true
* Do not insert space at start of intrinsic def
* Correct asm rendering in tests
* remove redundant option
* Emit directly from direct test
* Add source language options for spirv-dis
* Add comments to spirv dis
* Add dead debug print for before spirv module
* Correct asm rendering in tests
* s/spirv_direct/spirv/g
* Only specialize intrinsic functions with predicates
* regenerate vs projects
* squash warnings
* squash warnings
* remove duplication
* Silence warnings from msvc
* squash warnings
* Overload for zero sized array
* More msvc warnings
* warnings
* Add spirv-tools to path for tests
* Do not be specific about dxc version for diag test
* Normalize line endings from spirv-dis
* Correct filecheck matches
* Temporarily disable two spirv tests
Failing on CI, undebuggable hang :/
* Do not emit storage class more than once for spirv snippet
* Do not pass spir-v to spirv-dis by stdin
* Do not get spirv-dis output via stream, use file
* normalize file endings in spirv-dis output
|
|
* Fix -fvk-use-entrypoint-name.
* Remove irrelevant changes.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Add option to use original entrypoint in spirv output.
* Fix.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Redesign DeclRef + Deduplicate Val.
* Update project files
* Fix warning.
* Fix.
* Fix.
* Remove `Val::_equalsImplOverride`.
* Rmove `Val::_getHashCodeOverride`.
* Remove `semanticVisitor` param from `resolve`.
* Cleanups.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Create and cache flattened inheritance lists
The basic change here is to have a cached lookup that can map a `Type`,
or a `DeclRef` that might refer to a type or `extension`, to a list of
the *facets* that comprise it.
The notion of a *facet* here is similar to what the C++ standard calls
"sub-objects".
A declared type like a `struct` has:
* a facet for its own direct members
* one facet for each of its (transitive) base `struct` types
* one facet for each `interface` it conforms to
* one facet for each `extension` that applies to that type
The set of facets for a type is de-duplicated (so that "diamond"
inheritance patterns don't cause issues) and deterministically ordered,
using a variation of the C3 linearization algorithm.
The creation of a linearized list of facets should help the compiler
implementation in two key places:
* Testing if a type implements an interface (or inherits from a base
type) should now only take time linear in the number of (transitive)
bases of that type. We can simply scan the linearized facet list to
see if it contains a facet corresponding to the given base.
* Looking up the members of a type (or a value of a given type) should
be greatly simplified, since all of the members can be found in a
single linear scan of the facet list. In addition, those facets will
be ordered so that facets for "more derived" types will precede those
for "less derived" types, so that shadowing in the case of overrides
should be easier to implement.
This change only implements the first of these two improvements, since
there is already a *lot* of churn involved.
Notes and caveats:
* The handling of conjunction types (e.g., `IFoo & IBar`) complicates
the implementation, both because the simple approach to subtype
testing alluded to above is no longer complete, and also because
we need to be more careful about what forms of subtype witnesses
we construct, so that we can maintain the currently-required invariant
that two witnesses are only equal if they have matching structure.
* We don't implement the full/"proper" C3 algorithm here because it has
some failure cases that we'd still like to support. In particular if
we have both `IX : IA, IB` and `IY : IB, IA`, the C3 algorithm says it
is illegal to have `IZ : IX, IY` because the two bases it inherits
from disagree on the relative ordering of `IA` and `IB` in their
own linearizations. Handling such cases may make our implementation
less efficient, and it will also require testing of those corner
caes.
* When it comes time to revamp the implementation of lookup, we will
need to deal with the fact that a single linear list (seemingly)
cannot give us sufficient information to decide which of two members
of the same name should shadow the other, or if there is an ambiguity.
Or rather, it *can* give us that information if we are willing to
accept some very user-unfriendly behavior and simply say that
declarations earlier in the linearization always shadow later
declarations, even if the facets involved are not related by an
inheritance relationship of any kind.
* In order to remove one kind of vicious circularity from the approach,
the linearization that we are computing for `extension` declarations
will not be sufficient for lookups in the body of such an `extension`.
A future change may need to have support for creating and caching
two distinct linearizations for each `extension`: one that is to be
used when that `extension` is pulled into the linearization for a
type that it applies to, and another for when lookup will be performed
in the context of the `extension` itself.
* This change does *not* include the simple expedient of adding a direct
cache for subtype tests to the `SharedSemanticsContext`, although
adding such a cache would be a simple matter.
* This change introduces more deduplication for subtype witnesses,
which should enable more deduplication for other `Val`s (including
`Type`s), but it does not introduce any assumptions that equal
`Val`s or `Type`s must have identical pointer representations.
* Eventually we may find that, similar to the situation with `Type`s,
we will want to have a split between surface-level and canonicalized
versions of other `Val`s, including subtype witnesses.
* Fix clang error.
* remove debugging code.
---------
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
* Use scratchData on `IRInst` to replace HashSets.
* Update test results.
* Initialize scratchData.
* Update autodiff documentation.
* Use enum instead of bool.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Add API for querying total compile time.
* Optimize.
* Remove redundant simplifyIR calls.
* Fix.
---------
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Fix typo.
* Add options for source embedding.
* Small improvements.
* Working with tests.
* Add check for supported language types for embedding.
* Try and remove assume warning.
* Fix warning on MacOSX.
* Some extra checking around Style::Text.
* Some small improvements to docs/handling for headers extensions.
* Fix md issue.
* Small fixes around zeroing partial last element.
* Another small fix....
* Small improvement in hex conversion.
* Add an assert for unsignedness.
|
|
* WIP CommandOptions
* Fix some output issues.
* Simplify word wrapping.
* Add file extensions.
* Change how lookup takes place.
Add appendSplit functions to StringUtil.
Make Categories hold the index range of their options.
* Small improvement.
* Lookup with partial option names.
* Associate user values.
* Encoding flags in the name.
* Refactor setting up of command options.
* Use CommandOptions in slang-options.
* Remove old help text.
* Cache the CommandOptions on the Session.
* Range checking.
Fix bug in the Options handling.
* Extra checks for validity.
* Get categories directly.
* Slight improvements over output.
* Added NameValue types.
* Fix typo.
Remove some now unused diagnostics.
Fix diagnostic in testing, as output has changed.
* Add minimal usage message.
* Remove platform executable extension from diagnostics output.
* Some improvements around getting names from NameValue types.
* Improve some option descriptions.
* Small fixes.
|
|
|