| Age | Commit message (Collapse) | Author |
|
fixes #373
fixes bug that misses current translation unit's scope when resolving entry-point global type argument expression.
|
|
reflection data
|
|
|
|
|
|
- support overloaded generic function. this involves adding a new expression type, `OverloadedExpr2` to hold the candidate expressions for the generic function decl being referenced.
- make BitNot a normal IROp instead of an IRPseudoOp
- make sure we clone the decorations of parameters when cloning ir functions
- propagate geometry shader entry point attributes (`[maxvertexcount]` and `[instance]`) through HLSL emit
- IR emit: handle geometry shader entry-point parameter decorations, such as 'triangle'.
- IR emit: treat geometry shader stream output typed ir value as `should fold into use`.
|
|
1. prevent cyclic lookups when an interface inherits transitively from itself.
2. in `createGlobalGenericParamSubstitution`, create a default substitution for the base type declref before using it to lookup the witness table.
|
|
This completes item 5 in issue #361.
The interesting change is that when checking for interface conformance, we include the requirements (include transitive interfaces) defined in extensions as well. (check.cpp line 1946)
All the other changes are for one thing: reoder the semantic checkings to two explicit stages: check header and check body. In check header phase, we check everything except function bodies, register all extensions with their target decls, then check interface conformances for all concrete types. In body checking phase, we look inside the function bodies and check concrete statements/expressions. This change ensures that we take extension into consideration in all places where it should be.
|
|
|
|
This commit is a bunch of quick hacks to get transitive interfaces to work. The idea is for each concrete type we create one giant witness table that contains entries for all the transitively reachable interface requirements, and then create one copy of that witness table for each interface it implements.
`DoLocalLookupImpl` now also looks up in inherited interface decles when looking up for a symbol in an interface decl.
When visiting `InheritanceDecl` in `lower-to-ir`, create copies of the giant witness table for each transitively inherited interface, so that these witness tables can be found later when the IR is specialized.
Re-enable the `copy all witness tables` hack in `specializeIRForEntryPoint` to ensure those transitive witness tables are copied over.
|
|
|
|
|
|
|
|
This fixes item 2 in #361
Modifies existing extension-multi-interface.slang test case to cover the additional scenario.
|
|
Also support the scenario that the extension declares conformance to interface I, and a method M in I is already supported by the base implementation.
|
|
`createDefaultSubstitutions` now responsible for creating a `ThisTypeSubstitution` when `decl` is an `InterfaceDecl`. This is to ensure a reference to an associated type decl from the same interface that defines the assoctype decl will get a `ThisTypeSubstitution` so that the right hand side of it can be replaced by future substitutions.
|
|
|
|
fixes #362
|
|
This commit changes the type of `DeclRefBase::substitutions` from `RefPtr<Substitutions>` to `SubstitutionSet`, which is a new type defined as following:
```
struct SubstitutionSet
{
RefPtr<GenericSubstitution> genericSubstitutions;
RefPtr<ThisTypeSubstitution> thisTypeSubstitution;
RefPtr<GlobalGenericParamSubstitution> globalGenParamSubstitutions;
}
```
This change get rid of most helper functions to retreive the substitution of a certain type, as well as surgery operations to insert a `ThisTypeSubstitution` or `GlobalGenericTypeSubstittuion` at top or bottom of the substitution chain. It also simplies type comparison when certain type of substitution should not be considered as part of type definition.
|
|
|
|
* fix #353
* move validateEntryPoint to after all entrypoints has been checked
* bug fix: DeclRefType::SubstituteImpl should change ioDiff
* bug fix: generic resource usage should have count of 1 instead of 0.
* update test case
|
|
Global type argument lookup should be done in both loaded modules and current trnaslation units. This is the same as the logic of spReflection_FindTypeByName, so it is extracted into `CompileRequest::lookupGlobalDecl(Name*)` method and reused in places.
|
|
|
|
Should fix #351
The basic problem is that the type layout logic in Slang isn't taking into account the way that resource-type fields in aggregate types get split. When you just have a bare aggregate, this oversight doesn't cause a problem, but once you put those aggregates into an array, the problems become clear.
Given:
```hlsl
struct Test
{
Texture2D a;
Texture2D b;
};
Test test[8];
```
The default type-layout algorithm gives `Test::a` an offset of zero, and `Test::b` an offset of one.
However, after splitting, we have something like:
```hlsl
Texture2D test_a[8];
Texture2D test_b[8];
```
It is clear in this case that `test_b` can't start at an offset of one relative to `test_a` - it needs to start at `register(t8)`.
This change handles things by adjusting the layout of an array type to account for this detail as soon as it is created. The alternative would have been to not change layout rules at all, but to instead try to adjust things at the point where types get split (and the layout for the un-split case gets applied to the split variable). The reason for doing it the way it is in this change is that the reflection API will hopefully provide accurate information.
Related to reflection information, one thing that is missing here is proper computation of the "stride" for an array like this. We'll see if that needs to be addressed in a follow-up.
|
|
|
|
|
|
Added two API functions:
1. `spReflection_FindTypeByName`, which returns a DeclRefType to the struct type with the given name. The function finds from all loaded modules in a `CompileRequest` for a decl with the given name, construct a `Type` object and cache it in `CompileRequest::types` dictionary. The subsequent calls to `spReflection_FindTypeByName` with the same name will simply returned the cached Type objects.
2. `spReflection_GetTypeLayout`, which returns a `TypeLayout` for a given `Type`. This function creates and caches the `TypeLayout` in the `TargetRequest` object that is used to create the `ProgramLayout`.
|
|
|
|
* Add core.natvis file to Slang DLL build
It seems that when `slang.dll` gets loaded into a user's project, the debugger is able to pick up the custom visualizers implemented in `slang.natvis` (which is directly added to the DLL project) but not `core.natvis` (which is added to a static library project that the DLL project references). Adding `core.natvis` to the DLL project directly seems to resolve this and greatly improve the debugging experience when in user code.
* Bug fix: emit type of CB before CB when using IR
The problem here was the logic for emitting types used by an IR declaration before the declaration.
I refactored it to share logic between variables with initializers and functions, but in doing so failed to have an ordinary variable (which includes constant buffers) ensure that its own type was emitted before the variable.
This is a one-line fix.
|
|
* no-codegen compile flag and global generics reflection
1. Add SLANG_COMPILE_FLAG_NO_CODEGEN (-no-codegen) compiler flag to skip code generation stage, so that a shader that uses global generic type parmameters can be parsed, checked and introspected without knowing the final specialization.
2. Add reflection API to query for global generic type parameters, global parameters of generic type, and the generic type parameter index related to a global generic parameter.
3. Add a reflection test case for global generic type parameters.
* add expected result for global-type-params test case.
* fix reflection json output.
* fix branch condition errors
* fix expected result for global-type-params.slang
* fix expected test case output
|
|
Fixes #345
A brief refresher: a `SourceLoc` in the Slang implementation is just an integer (more or less an absolute byte index into all of the source compiled so far). We convert that integer to a "humane" source location (a file name and line/column numbers) by finding the file and line that match the integer via binary search. The data structures used for that search are owned by a `SourceManager`.
In order to avoid running out of source locations when used in a long-running application (that might reload shaders many times), the implementation creates one `SourceManager` per `CompileRequest`, along with a single shared `SourceManager` that is used for locations in the builtin libraries.
The root of the bug here was that some code was using the `SourceManager` for a compile request when it should have been using the one for the builtins. This happened because one source manager was asked to translate a `SourceLoc` into a humane location, which first involves "expanding" that location (figuring out which file it belongs to, and which source manager owns that file), and failed to realize that the expanded location might use a different source manager (either the current one or one of its "parents").
I fixed this by reworking the API so that the mapping from an expanded location to a humane one is no longer a member of a source manager (since the correct source manager can be looked up in the associated expanded location). Hopefully this will prevent this class of error in the future.
|
|
Fixes #333
The code in `ast-legalize` is passed an array of declarations that have been ordered by dependencies using a topological sort. Unfortunately, it was only using that list in the case where the request was considered to be a "rewrite" request, and would otherwise rely on the order in which things get forced during the recursive walk (which doesn't really work for our needs).
|
|
|
|
|
|
fixes #341
When a typedef definition is used to satisfy an associated type, we must also substitute the resulting typedef type using parent substitution, in the case that the typedef is a generic application.
|
|
fixes #339
`NamedExpressionType::CreateCanonicalType()` may return a deleted pointer. The original implementation is as follows:
```
Type* NamedExpressionType::CreateCanonicalType()
{
return GetType(declRef)->GetCanonicalType();
}
```
If `GetType()` returns a newly constructed Type (this happens when the `typedef` is defined inside a generic parent, which triggers a non-trivial substitution), the temporary type will be deleted when the function returns. The fix is to store the temporary type as a field of NamedExpressionType (`innerType`).
A relevant fix (though not the true cause of issue #339) is to have `Type::GetCanonicalType()` also hold a `RefPtr` to the constructed canonical type, when the canonical type is not `this`. This prevents a returned canonical type being assigned to a RefPtr, which makes it possible for that RefPtr to be the sole owner of the canonical type and deleteing the canonical type when that RefPtr is destroyed.
|
|
function's return type expression.
fixes #336
Add a `ReplaceScopeVisitor` to replace the scopes of the return type expression tree to use the generic decl's scope instead of module's scope after parser has determined the decl is a generic function header decl.
|
|
|
|
fixes #325
This commit includes following changes:
1. Including a default DeclaredSubtypeWitness argument when creating a default GenericSubstitution for a DeclRefType, so that the witness argument can be successfully replaced with an actual witness table after specialization. (check,cpp)
2. Not emitting full mangled name for struct field members. Since the declref of the member access instruction do not include necessary generic substitutions for its parent generic parameters, so the mangled names of the declaration site and use site mismatches. Instead we just emit the original name for struct fields. (emit.cpp)
3. Allow IRWitnessTable to represent a generic witness table for generic structs. Adds necessary fields to IRWitnessTable for generic specialization. For now, the user field of the IRUse is not used and is nullptr. (ir-inst.h)
4. Make IRProxyVal use an IRUse instead of an IRValue*, so that an IRValue referenced by IRProxyVal (as a substitution argument) can be managed by the def-use chain for easy replacement. This is used for specializing witness tables. (ir.cpp, ir.h)
5. Add a `String dumpIRFunc(IRFunc*)` function for debugging.
6. Add name mangling for generic / specialized witness tables (mangle.cpp)
7. improved natvis file for inspecting witness tables.
8. Add specialization of witness tables:
1) `findWitnessTable` will simply return the specialize IRInst for a generic witness table.
2) make `cloneSubstitutionArg` call `cloneValue` to clone the argument instead of calling `context->maybeCloneValue`, so we can make use of the cloned value lookup machanism to directly return the specialized witness table (which is done when we process the `specialize` instruction on the generic witness table before process the decl ref).
3) bug fix: the argument in ir.cpp:3338 should be `newArg` instead of `arg`.
4) add `specializeWitnessTable` function to specailize a generic witness table. It clones the witness table, and recursively calls `getSpecailizedFunc` for the witness table entries.
5) make `specailizeGenerics` function also handle the case when an operand of the `specialize` instruction is a witness table. We will call `specializeWitnessTable` here and replace the `specialize` instruction with the specialized witness table. The replacement mechanism based on IR def-use chain works here because we have already make IRProxyVal a part of the def-use chain.
9. Add two more test cases for nested generics with constraints. (generic-list.slang and generic-struct-with-constraint.slang)
|
|
|
|
|
|
* Change stdlib `saturate` to explicitly specialize `clamp`
This exposes issue #329, and so gives us an easy way to see if transitive subtype witnesses have been implemented correctly.
* Fixup: invoke correct `clamp` overloads
When switching the `clamp` calls in the stdlib definition of `saturate` I made two big mistakes:
1. I was passing in `<T>` in all cases, instead of, e.g., `<vector<T,N>>` in the vector case
2. Of course, the overloads don't actually take `<vector<T,N>>` for the vector case, because `vector<T,N>` is not a `__BuiltinArithmeticType` (`T` is), so instead it should be `clamp<T,N>(...)`.
The issue behind (2) is that we don't support "conditional conformances," which would be a way to say that when `T : __BuiltinArithmeticType` then `vector<T,N> : __BuiltinArithmeticType`. That would be a great long-term wish-list feature, but not something I can see us adding in a hurry.
Anyway the fix here is the simple one: change the vector/matrix call sites to invoke the correct overload in each case.
* Add a notion of transitive subtype witnesses
There are two pieces here:
1. Add the `TransitiveSubtypeWitness` class. This is a witness that `A : C` that works by storing nested subtype witnesses that show that `A : B` and `B : C` for some intermediate type `B`. All the basic `Val` operations are easy enough to define on this.
- The one gotcha case is whether we can ever simplify away a `TransitiveSubtypeWitness` as part of substitution. That is, if we end up substituting so that both `A` and `B` end up as the same type, then we really just need the `B : C` sub-part. Stuff like that is left as future work.
2. Make the logic in `check.cpp` that constructs subtype witnesses based on found inheritance and constraint declarations able to build up transitive chains. Most of the required infrastructure was already there (the search process maintains a trail of "breadcrumbs" that represent all the steps getting from `A : B` to `B : C` to `C : D` ...).
This change does *not* deal with the required changes in the IR to take advantage of transitive witnesses.
|
|
Fixes #326
This basically just copy-pastes logic from the explicit case over to the implicit case.
After we've solved for the explicit type/value arguments, we loop over the constraints and for each one we try to find a suitable subtype witness to use (after substituting in the arguments solved so far).
This change includes a test case for the new functionality.
|
|
Fixes #318
Most of the required support was actually in place, so this is just a bunch of fixes:
- Detect when we are in "full IR" mode, so that we can always emit `struct` declarations with their mangled named (which will produce different names for different specializations, since we emit decl-refs)
- Carefully exclude builtin types from this for now. We'll need a more complete solution for mapping HLSL/Slang builtin types to their GLSL equivalents soon.
- Skip emitting types referenced by generic IR functions, since they might not be usable.
- Also fix things up so that we emit types used in the initializer for any global variables.
- Fix bug in generic specialization where we specialize the same function more than once, with different type arguments. We were crashing on a `Dictionary::Add` call where the key already exists from a previous specialization attempt.
- Fix name-mangling logic so that when outputting a possibly-specialized generic it looks for the outer-most `GenericSubstitution` rather than just the first one in the list. This is to handle the way that we insert other substitutions willy-nilly in places where they realistically don't belong. :(
All of these changes together allow us to pass a slightly modified (more advanced) version of the test case posted to #318.
|
|
* IR: fixes for subscript accessors
Fixes #320
This is a bunch of fixes for handling of `__subscript` operations on builtin types (notably `RWStructuredBuffer` and `StructuredBuffer` at this point).
- Automatically add a `GetterDecl` to any subscript decalratio was declithout any accessors. This avoids hitting a null- dereference in the emit logic.
- Add a notion of a `RefAccessor` (declared with `ref`) as a peer to getters and setters. The idea is that a `ref` accessor returns a pointer to the element data, so that it can be used for both getting and setting values. This is closer to the behavior of `RWStructuredBuffer` element access in HLSL.
- Fixes for dealing with "access chains" where there might be a combination of a subscript (where the is a `get` and `set` but no `ref`) and member access, so that we have to read the base value into a temp, modify it, and then write it back.
- This logic is still a bit of a mess, so we will eventually want to take a more consistent pass over this to deal with how we "materialize" values for setters.
- Update `RWStructuredBuffer` to have a `ref` accessor, and then fix up the IR tests to handle the new opcode that I added for it.
- Note: I didn't handle this as an intrinsic simply because the `tests/ir/*` tests aren't really set up to handle builtins with ugly mangled names.
* Fixup: type error in VM for buffer element ref
I was using the result type of the op as the element type for computing the element address, but the result type is a pointer to the real element type.
This caused test failures on 64-bit platforms, where the stride of the buffer in the `ir/factorial` test needs to be 4.
The fix is to assume the result type is a pointer, and extract the pointed-to type out of that.
|
|
* Fix: try to improve float literal formatting
An earlier change switched to using the C++ stdlib to format floats, but I neglected to check that it would correctly print values that happen to be integral with a decimal place (e.g., print `1.0` instead of just `1`). That causes some obscure cases to fail (e.g., when you write `1.0.xxxx` in HLSL, and we print out `1.xxxx`).
I've tried to resolve the issue by using the "fixed" output format, but that may also create problems for extremely large or small values (which really need to use scientific notation). However, this at least puts us back where we were before...
* Add support for source locations to the IR
- Add a `sourceLocation` field to all `IRValue`s
- Add some logic to associate locations with operations while lowering (using a slightly "clever" RAII approach)
- Make sure that when cloning instructions, we also clone the location
- Make the emit logic use the existing support for source locations when emitting code from the IR
A nice cleanup to this work would be to have the source locations for IR values not be hyper-specific about both line and column; it is probably enough to just emit on the correct line.
|
|
* Support simple generics syntax.
This commit enables simpler generics syntax, e.g.
T test<T>(T arg) {}
or
struct Gen<T>{T x;};
* Support simple generics syntax.
This commit enables simpler generics syntax, e.g.
T test<T>(T arg) {}
or
struct Gen<T>{T x;};
* add expected test result for compute/generics-syntax.slang
|
|
* Fix up parameter block emit for mixed rewriter+IR
The basic problem here arose when a parameter block was declared in code that will be lowered via IR, but is referenced from user code that will use the AST-based lowering pass. The type legalization would split up the parameter block, and the AST would refer to the new variables, but it would fail to recognize when those variables represent constant buffers, and thus should get implicit-dereference behavior.
The fix here was just to propagate type information through when creating new AST (pseudo-)expressions to represent IR declarations that were split.
A more complete fix down the road should try to make sure that both the expression emit logic and the declaration-emit logic agree on whether a particular declaration of a parameter block or constant buffer needs to support the "implicit dereference" case or not. I'm leaving that to future work.
With this change, two test cases that had been disabled now pass.
* Fixup: don't do implicit deref for `ParameterBlock` values
Right now the rules for `ParameterBlock` and all other uniform praameter groups needs to be different, but the `isImplicitBaseExpr` logic wasn't taking that into account.
|
|
There was a bug that arose in the context of Falcor, because:
- Slang uses `sprintf` to format floating-point values when outputting HLSL/GLSL source
- Falcor (or a library it uses) does the equivalent of `setlocale(LC_ALL, "")` to set the global locale for the process based on the user's environment variables
This led to a floating-point literal of `0.5` getting printed as `0,5f`, with a comma for the decimal point. This then gets consumed by `glslang` which (luckily for us) complains that `5f` is not a valid floating-point literal in their language (since it has neither decimal point nor exponent).
The most expedient fix in this case was to switch from using C `sprintf` for formatting floating-point numbers over to using the C++ `<stdio>` implementation, which allows the locale to be set per-stream so that we don't have to rely on (or potentially disrupt) the global locale set by an application using Slang.
Longer term if we have the time/resources to do the Right Thing, it would be best to implement our own printing logic for floating-point numbers, since we would eventually need/want to support arbitrary-precision numbers for literals.
|
|
* Work on getting rewriter + IR playing nice together.
There are a few different changes here, with the goal of improving the interaction between the "rewriter" code generation approach and the new IR and type legalization code.
The main changes are:
- Add a new pass that occurs before the AST legalization pass, which walks the (used) AST declarations and tries to discover (1) which declarations need to be specialized/lowered via the IR, and (2) which declarations need to be included in the resulting AST module.
- AST-based legalization now uses the generated list when in "rewriter" mode, so that we should be working around issues that users were seeing with types not getting emitted.
- TODO: we still need an equivalent fixup in the case of non-"rewriter" emit, so this may still be a problem for `.slang` files.
- IR type legalization now precedes AST legalization, so that we can record information on how any IR global values got legalized (e.g., if they got split). Then AST legalization includes logic to reconstruct suitable tuple expressions to reference a split global.
- When emitting using IR + AST, we walk all of the declarations that we decided belonged to the IR, but which were subsequently referenced in the AST, to make sure they get output (this would include `struct` types that are declared in a file compiled via IR, but never used in IR-based code).
The rewriter+IR use case still doesn't *quite* work, but the logic for walking the AST in a pre-pass ends up being needed/useful to fix some pure rewriter bugs, so I'm getting this checked in sooner rather than later.
* Fixup: walk arguments to generic declaration reference
The gotcha here is that the code for walking the AST would walk a line of code like:
SomeType a;
and know to traverse the declaration of `SomeType`, but if it saw a line of code like:
ParameterBlock<SomeType> b;
it would traverse the declaration of `ParameterBlock`, but fail to visit that of `SomeType`.
|
|
* Add sample-rate-input detection for HLSL.
In the HLSL case, it is possible to do this detection entirely based on declared signatures (it doesn't have a dependency on code generation like the GLSL case does). I've added test cases for the two main ways that a shader can become sample rate:
1. Qualify a fragment input with `sample`
2. Accept an input with the `SV_SampleIndex` semantic
In each case I nested the input inside a `struct` to try to match common HLSL idiom, and to make sure that we handle the nested case.
This code is *not* robust against shaders that declare such an input and then never use it, but that is to be expected given the goals for Slang.
* Fixup: add missing test output files
|