| Age | Commit message (Collapse) | Author |
|
The underlying problem here requires that we have an object-like macro with an expansion that starts with a non-identifier token:
```
```
Then we need a function-like macro that uses a token paste in a way that can expand to that object-like macro:
```
```
Finally, for the specific case a user ran into, we need to invoke that function-like macro in the context of a preprocessor conditional expression:
```
// ...
#error "unimplemented"
```
The way a problem manifest is that the preprocessor logic that handles conditional expressions tries to "peek" one token ahead and see what is coming, and while the peeking logic handles macro expansion it does *not* handle token pasting right now. That means that the peek operation sees `MY_FEATURE` and assumes that it is seeing an identifier in a preprocessor conditional that doesn't have a macro expansion. The logic then goes on to read the token, but what it gets back is *not* an identifier, and is instead the numeric literal token `1`, because the reading logic handles token pasting.
The quick fix I applied here is to make the logic that deals with preprocessor conditionals go ahead and automatically consume a token from the input, and then decide what to do based on that token, so that it always makes use of the reading logic that handles token pasting.
The lingering problem is that we still have cases in the preprocessor that use the peeking logic which doesn't handle pasting, and we might find that those cases have reason to want the same kind of expansion behavior we needed here. A more systematic fix would be to have the peeking logic automatically handle token pasting as well as macro expansion, but doing so would be a more complicated change because detecting the `##` when peeking ahead requires two tokens of lookahead, and our current implementation only assumes we can support one.
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
* Explicit swapchain interface in `gfx`.
* Correctly return nullptr when `IRenderer` creation failed.
* Fix crashes on CUDA tests.
* Cleanups.
|
|
This operation was added along with the `SV_Barycentrics` system-value input, and allows for a `nointerpolation` varying input to a fragment shader to be fetched at a specific vertex index within the primitive that is causing the fragment shader to be invoked.
This change adds support for the new operations in the standard library, and also includes a test case to make sure that we emit it correctly when producing HLSL/DXIL.
This change also includes a small bug fix to our emission logic for function parameters so that we properly emit layout-related attributes for varying parameters declared directly on an entry point.
(Note that most attribute end up being declared in `struct` types in existing HLSL shaders, and our IR passes produce only global variables for attributes on GLSL; the only case this affects is inidividual scalar/vector attributes declared declared as entry-point parameters, when outputting HLSL)
Note that this change only adds support for the new function on the HLSL/DXIL path, and doesn't yet add any cross-compilation support for GLSL/SPIR-V. The reason for this is that the equivalent GLSL feature(s) appear to use a different model to the HLSL version, and we need to invent a suitable approach to align them to make portable code possible.
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* WIP extracting source documentation.
* WIP doc extraction.
* More stuff around doc markup extraction.
* More WIP around doc extraction.
* Fix some indexing issues.
* Initial doc extraction working.
* Renaming of types in markup extraction process.
* Extracting markup content.
Removing indenting.
Other fixes and improvements around document tools.
* WIP support for documentation system.
* Remove some commented out sections.
* Remove some comments that no longer apply.
* Improvements around SourceFile - such that more granularity around line ops.
Made some functionality explicitly work without source.
Improved Doc types nameing.
|
|
The `AdvanceIfMatch()` method was introduced to the parser as a way to avoid infinite loops when parsing nested list structures (e.g., `()`-enclosed parameter lists). The basic idea is that it tries to detect if we have scanned "too far" looking for a closing token, and reports a match to whatever logic was doing the looping to break the statemate.
Unfortunately, the `TryRecoverBefore` logic was changed at some point so that it doesn't necessarily advance any tokens at all, because we generally don't want to skip over a `}` while searching for a `)`. As a result, we could still end up in an infinite loop where we didn't consume any additional tokens as part of recovery, but wouldn't bail out of the search for a match.
This change tries to introduce a slightly more systematic setup where `AdvanceIfMatch` is now parameterized on a type of matched token pair (not just the closing token), and each such matched token pair introduces a list of tokens where if we see them as our lookahead we should bail out (e.g., when looking for a `)` we should give up the search upon seeing a `}`).
After installing that fix I found that my simple test case still gave a surprising error because when mistakenly parsing a function body the parser would look for a `{` and then a `}` to close the body. The search for a closing `}` could accidentally consume a `}` meant for an outer scope, and lead to a cascading failure. I madea quick fix to the parsing of block statements so that we don't look for a closing `}` if we never had an opening `{`, but that isn't really a systematic solution like we truly need.
For now, these fixes will avoid the infinite-loop case, and should give a better diagnostic in the case a user ran into, but we need to take time to do some more top-down work on the parser sooner or later.
|
|
Both D3D "rasterizer ordered views" (ROVs) and GLSL "fragment shader interlock" (FSI) are aimed at the same basic use case: they allow for fragment shaders to contain operations that require mutual exclusion and/or deterinistics ordering between fragment shader invocations that affect the same framebuffer coordinates. The language-level exposure of the features varies greatly between the two API families, though:
* ROVs define an implicit ordering and mutual exclusion constraint: certain resoure parameters are marked as `RasterizerOrdered`, and reads/writes to these resources must be sequences *as if* fragment-shader invocations ran in sequential order for each pixel.
* FSI defines paired begin/end functions that mark a critical section of code. All memory operations in the critical section must be sequences *as if* fragment-shader invocations ran in sequential order for each pixel. In order to make this model tractable, only a single critical section is allowed per fragment shader, and the begin/end must appear at the top level of the shader entry point function (not under control flow or after a possible conditional `return`.
The simplest way for Slang to support portable programs that run across both API families is to insist that code that cares about these ordering guarantees must use *both* mechanisms, and then each of them will only affect the API that cares about it.
Slang already supports ROV resource types, and already lowers them to plain textures for GLSL/SPIR-V.
This change adds the missing feature of a begin/end function pair for FSI, which will map to empty functions on non-GLSL targets.
|
|
* Add a chapter on target platforms
The primary goals of this chapter are:
* Make users aware of just how many different ways of handling things there are across targets. If a user leaves this chapter thinking "how in the world can you abstract over all these differences?", then we have done our job, because they are primed to understand why layout and parameter binding are **necessarily** complicated.
* Help users to understand/recall the relevant capabilities and restrictions of the platforms they care about most. If somebody only cares about D3D12 and Vulkan, I want them to leave with a detailed understanding of how those two differ so they can understand the *specifics* of where the layout and parameter-binding algorithms have to treat those targets differently.
All of this could conceptually be just a background section in the layout and parameter-binding chapter, but putting it off in its own chapter avoids that one taking forever to actually get where it is going.
* Typos
|
|
* Make gfx library visible to external user.
* Fixup
|
|
|
|
This change kind of rolls together two different simplifications:
1. The `createShaderObject()` shouldn't really need to take an `IShaderObjectLayout` because it could just take the `slang::TypeLayoutReflection` instead and create the shader-object layout behind the scenes.
2. For that matter, it needn't take a `slang::TypeLayoutReflection` either, becaues it could just take a `slang::TypeReflection` and query the layout of that type behind the scenes.
The combination of these two changes means:
* `IShaderObjectLayout` is gone from the public API, as is `createShaderObjectLayout()`
* `createShaderObject()` directly takes a `slang::TypeReflection` and allocates a shader object of that type
The result is simpler and more streamlined application code.
Note that under the hood the implementation still has shader-object layouts, using the `ShaderObjectLayoutBase` class. A few locations had to change to use `RefPtr`s instead of `ComPtr`s now that the class is no longer a public COM-lite API type.
The hope is that this change makes it easier to allocate/cache layouts for things like specialized types "under the hood," as is needed to implement parameter setting for static specialization.
|
|
Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>
|
|
This change makes it so that the shared shader object implementation across graphics APIs (everything except CUDA and CPU) uses a host-memory buffer to store ordinary (aka "uniform") data while the shader object is being set up / modified, and then allocates and initializes a GPU-memory buffer for the data on-demand once setup is complete.
This choice is a necessary step for supporting interface/existential-type fields in the presence of static specialization, because any fixed-size GPU buffer we would try to allocate at the time an object is first created might not turn out to be large enough if static specialization must handle a concrete type that doesn't "fit" into the fixed-size space reserved for an existential value (resulting in the value having to be placed in an overflow region outside the original object).
This change does *not* include any of the work related to actually laying out existential-type fields in this fashion. It instead just focuses on changing when and where the GPU memory allocation is performed to one that is more appropriate for those subsequent changes.
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* WIP: First pass in supporting output of line error information.
* Add support for lexing to better be able to indicate SourceLocation information.
* Fix lexer usage in DiagnosticSink in C++ extractor.
* Update diagnostics tests to have line location info.
* Fixed test expected output that now have source location information in them.
* Better handling of tab.
* Fix test expected results for tabbing change.
* DiagnosticLexer -> DiagnosticSink::SourceLocationLexer
Added line continuation tests.
* Fix typo.
* Added String::appendRepeatedChar
* Change to rerun tests.
* Added source locations to IR dumping.
* Output column for IR dump source loc.
* Add support for closing brace location to AST.
Use closing brace location in lowering when adding return void.
* Set the source location through SourceLoc - simplifies identifying if current loc is valid.
* Copy terminator sloc.
* Test for improved #line handling.
* Made writer the last parameter for dumpIR.
Small improvements to comments.
* Disable sloc output on dump IR by default.
* Fix issue with #line and inlining.
* Fix for output with improved #line output.
* Small comment change - mainly to kick off TC build.
Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>
|
|
* Add `SampleGrad` overload for lod clamp.
* Fix gfx to run the test on vulkan.
* Whitespace change to trigger CI build
* remove presentFrame call in render-test
Co-authored-by: Yong He <yhe@nvidia.com>
Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>
|
|
The purpose of these changes is to make the `shader-object` example work correctly on CUDA.
Originally I had tried to add changes to the "flat" reflection information so that it introduced descriptor ranges to match the binding ranges it added for interface/existential-type fields. This approach helped the CUDA code that was using that information to try and compute uniform offsets for those fields, but it broke most of the other renderer back-ends. Instead, I removed the relevant asserts from `CUDAShaderObject::setObject()`.
Note taht there are leftover changes from my edits to the flat reflection information, around how it handles "leaf" fields that consume multiple resource kinds. I believe that those changes are, on balance, "more correct" now than they were before, so I decided to leave them in.
The other major fix here is to specialize the `CUDAShaderObject::setObject()` logic to handle the case of setting a shader object for a parameter that has interface type instead of a constant-buffer or parameter block. Mostly I just copy bytes from the child object into the parent object. There are a few caveats, though:
* I am not writing the RTTI or witness-table information, so dynamic dispatch won't work.
* I am assuming a hard-coded offset of 16 bytes for the any-value, which will work for now but is a bit too "magical" and might also break once we support conjunctions of interfaces with dynamic dispatch
* I am assuming that the child value to be writen into the field will "fit" into the any-value area. We need some way to determine whether or not things fit dynamically (ideally using the reflection data), and adapt accordingly.
* I had to add another method on the base CUDA shader object type to handle setting data using a device-memory pointr instead of a host-memory pointer
* There's not a lot we can do about it, but in the case of assigning an ordinary `CUDAShaderObject` into an interface-type field of a `CUDAEntryPointShaderObject` we end up needing to perform a device->host memory copy, because the bytes of the value will have already been written to GPU memory, but need to be in GPU memory for the dispatch call.
* The implementation I'm using here basically assumes that the child shader object must have been finalized before it gets plugged into the parent shader object. We haven't yet made a policy decision about that bit.
|
|
* Add an accessor for IRInst opcode
This main changing is renaming `IRInst::op` over to `IRInst::m_op` and then adds an accessor `IRInst::getOp()` to read it. The rest of the changes are just changing use sites to `getOp` (or to `m_op` in the limited cases where we write to it).
This work is in anticipation of a future change that might need to store an extra bit in the same field as the opcode. It seemed better to do this massive refactoring as a separate PR.
* fixup
|
|
* Add associated type and generic value parameter doc section
* Typos and corrections.
|
|
This change adds initial support for a feature being proposed for inclusion in dxc: https://github.com/microsoft/DirectXShaderCompiler/pull/3171.
The main features are:
* A `[payload]` attribute that indicates which `struct` types are intended to be used as payloads. Consistent use of this attribute should mean that an application no longer needs to manually specify a maximum payload size when creating a ray-tracing pipeline.
* `read(...)` and `write(...)` qualifiers which can be attached to fields of `struct` types (usually `[payload]`-attributed types) to indicate which ray tracing pipeline stages are allowed read/write access to that part of the payload. Use of these qualifiers should allow an implementation to optimize storage of ray payload elements across RT pipeline stages.
The work in this change just adds basic parsing for these features, translation to matching IR decorations, and then emission of HLSL text based on those decorations.
Notable gaps in this first change include:
* No work is currently being done to validate access to ray payloads in RT entry points based on these qualifiers.
* The stage names in `read(...)` and `write(...)` are not being validated, and are being stored in the IR as text. These should probably use the `Stage` enumeration in some fashion, but we would need to have a way to encode the additional `caller` pseudo-stage that the feature uses.
* No work is currently being done to adjust or react to the chosen shader model when emitting HLSL code. We should *either* have these attributes force a switch to a higher shader model, *or* skip emission of these attributes if the chosen shader model / profile does not imply support for them.
* No tests are currently included for this work, because tests would rely on using a custom `dxcompiler.dll` build with the new feature supported.
|
|
Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>
|
|
* Move around the conventional/convenience features chapters
* Add a first draft of a section on compilation using `slangc` and the COM-lite API
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
* Support `bit_cast` between complex types.
* Fix vs project file
* Fix clang build error
* fix
* fix
* Fix
* FIx
* Fix
* Fix
* Fix
* Fix
* Fix linux compile error
Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* WIP: First pass in supporting output of line error information.
* Add support for lexing to better be able to indicate SourceLocation information.
* Fix lexer usage in DiagnosticSink in C++ extractor.
* Update diagnostics tests to have line location info.
* Fixed test expected output that now have source location information in them.
* Better handling of tab.
* Fix test expected results for tabbing change.
* DiagnosticLexer -> DiagnosticSink::SourceLocationLexer
Added line continuation tests.
* Fix typo.
* Added String::appendRepeatedChar
* Change to rerun tests.
Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>
|
|
* Fix getting started doc
* Add convenience features chapter in user-guide doc
Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>
|
|
The underlying problem here is that our `SharedIRBuilder` (which currently owns the "global" value-numbering map) has a subtle invariant ("subtle" in the sense of "dangerous and bad"). The value-numbering map stores `IRInst`s for things like constants and types, and if those instructions end up getting modified or deleted (deleting an instruction currently runs its destructor but does not free the pool-allocated memory), then it is possible for the computed hash code for an instruction to no longer match what it was when it was inserted.
The trigger in this case was a use of the `IRInst::removeAndDeallocate()` operation inside of the AST-to-IR lowering pass, which uses a single `SharedIRBuilder`. If that `removeAndDeallocate()` happens to apply to a value in the value-numbering map, then it risks breaking the next time the map gets rehashed.
The short-term fix here is simple: never try to delete an instruction during IR lowering, even if it is known to be unused. Instead, we can rely on the subsequent DCE pass to eliminate the instruction.
A longer-term fix here would involve fixing our entire strategy around value numbering. We know we need to do that, but that would be a big enough change that it couldn't be pursued as part of a simple bug fix like this.
|
|
This change adds a first draft of an Introduction chapter, along with a chapter about the "conventional" features of Slang (when compared to HLSL, GLSL, and C/C++).
|
|
* Add getting started documentation
* wording
* wording
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Fix typo
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Fix bugs with m_features on Dx12 and gl.
Fix issue about GFX_NVAPI availability.
* Fix handling of SLANG_E_NOT_AVAILABLE on renderer startup.
* Clarify comment.
* Improve comment.
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Copy source loc information when inlining.
|
|
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Typo for renderer name for DX12.
|
|
The basic feature here is the ability to use the `&` operator to produce the conjunction/intersection of two interfaces. That is, you can have interfaces:
interface IFirst { int getFirst(); }
interface ISecond { int getSecoond(); }
and if you need a generic function where the type parameter `T` must conform to *both* of these interfaces, you express that by constraining the parameter to the intersection of the interfaces:
void someFunction<T : IFirst & ISecond>(T value) { ... }
Without this feature, the main alternative an application would have is to define an intermediate interface, like:
interface IBoth : IFirst, ISecond {}
Forcing users to deal with an intermediate interface creates more work for type authors (they need to remember to inherit from the right combined interface(s)), or for `extension` authors (when you add `ISecond` to a type that used to just support `IFirst`, you had better also add `IBoth`). In the worst case, a family of N related "leaf" interfaces would give rise to an exponential number of intermediate interfaces to represnt the possible combinations.
A conjunction like `IFirst & ISecond` is officially its own type, and can be used to declare a type alias:
typealias IBoth = IFirst & ISecond;
This change only includes the first pass of work on this feature, so there are several caveats to be aware of:
* Using a conjunction as part of an inheritance clause is not yet supported (e.g., `struct X : IFirst & ISecond`). This is true even if the conjunction was introduced by an intermediate `typealias`
* The `&` syntax introduced here is only parsed in places where only a type (not an expression) is possible. This means you cannot do things like cast to a conjunction with `(IFirst & ISecond)(someValue)`.
* This work *should* apply to conjunctions of more than two interfaces (like `IA & IB & IC`) but that has not yet been tested
* In the long run it may be sensible to allow conjunctions that use concrete types, but we really ought to have the semantic checking logic rule that out for now.
* During testing, I encountered compiler crashes when trying to use this feature together with `property` declarations. Further investigation and debugging is called for.
* The handling of conjunction types is currently incomplete, in that there are many equivalences the compiler does not yet understand. For example, it is clear that `IA & IB` is equivalent to `IB & IA`, but the compiler currently does not understand this and will treat them as different types. A deeper implementation approach is called for.
* Conjunctions are currently only supported for generic type parameter constraints, when performing full specialization. Use of conjunctions for existential-type value parameters or with dynamic dispatch is not yet supported.
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* WIP diagnostics for line number output.
* Small param naming change
* Use x macro for pass through compile human name lookup/getting.
* WIP on parsing downstream compiler output.
* Split out parsing into ParseDiagnosticUtil.
Added test result of single line.
* Dump out the std output on fail to parse diagnostics.
* Change test type for syntax-error-intrinsic.slang be TEST not TEST_DIAGNOSTIC
* Use Index for StringUtil.
* WIP: First pass support for parsing Slang diagnostics.
* WIP Testing comparing with ParseDiagnosticUtil with previous ad-hoc mechanism.
* Use the new parsing mechanism for diagnostic comparisons.
* Fix layout on GLSL, doesn't have CR so runs into main.
* Split out switch on outputting intrinsic 'specials'.
Output code around intrinsic as emit - so that we get the appropriate indenting (and potentially other benefits).
* Improvements to diagnostics parsing.
Better error handling, and fallback handling.
Added ability to parse downstream compilers without a prefix.
Added ability to parse Slang with a prefix.
* DownstreamDiagnostic::Type -> Severity and related fixes.
* Small fixes around moving from DownstreamDiagnostic::Type -> Severity
* Fix handling of 'special intrinsic' expansion
* Split out the handling of intrinsic expansion into it's own type and files.
* Fixes to reading expected output - for SimpleLine test.
* Test using += to check #line output.
* A test around += and return.
* Small comment fixes.
Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>
|
|
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* WIP diagnostics for line number output.
* Small param naming change
* Use x macro for pass through compile human name lookup/getting.
* WIP on parsing downstream compiler output.
* Split out parsing into ParseDiagnosticUtil.
Added test result of single line.
* Dump out the std output on fail to parse diagnostics.
* Change test type for syntax-error-intrinsic.slang be TEST not TEST_DIAGNOSTIC
* Use Index for StringUtil.
* WIP: First pass support for parsing Slang diagnostics.
* WIP Testing comparing with ParseDiagnosticUtil with previous ad-hoc mechanism.
* Use the new parsing mechanism for diagnostic comparisons.
* Improvements to diagnostics parsing.
Better error handling, and fallback handling.
Added ability to parse downstream compilers without a prefix.
Added ability to parse Slang with a prefix.
* DownstreamDiagnostic::Type -> Severity and related fixes.
* Small fixes around moving from DownstreamDiagnostic::Type -> Severity
* Small comment fixes.
Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>
|
|
This change pertains to `static` variables in function scope (including things like methods, initializers, property accessors, etc.). Note that it does *not* have anything to do with global-scope `static` variables or with `static const` variables (whether inside a function or not).
The old code generation strategy had a lot of "clever" code to deal with the problem of a `static` variable inside a generic function (or inside a function inside a generic type, etc.). Basically, if you had input code like:
int myFunc<T>(int newVal) {
static int state = 0;
int result = state;
state = newVal;
return result;
}
The language semantics are that `myFunc<float3>` should have a different `state` variable than `myFunc<int2>`. The way that the existing codegen handled that was to generate the `state` variable into its own dedicated `IRGeneric`. Something like:
generic myFunc_state<T0> { global_var g_ptr : int*; return g_ptr; }
generic myFunc<T1> {
func f(int newVal) {
let result : int = load(state<T>);
store(state<T1>, newVal);
return result;
}
}
The catch there is that you end up needing to generate an entire second `IRGeneric`, and then references to `state` need to explicitly use `specialize` to instantiate that generic using the same parameters as `myFunc` was passed (note how `T0` and `T1` are distinct IR generic parameters, despite both representing `T` here).
Things get even more complicated when you consider function-`static` variables with initialization logic, since we need to be sure we only perform that initialization once, but the initialization could refer to arguments of the outer function, and thus needs to be done inside the function body. To handle that case we emit an additional `bool` global if a function-`static` variable has an initializer, and that `bool` gets wrapped up in yet another generic.
That whole approach seems silly in retrospect, and a much simpler solution is possible: just emit the function-`static` variable immediately before the IR function it pertains to, which means it will be nested under the *same* IR generic if there is one (and at module scope if there isn't). The result is something like:
generic myFunc<T1> {
global_var state_ptr : int*;
func f(int newVal) {
let result : int = load(state_ptr);
store(state_ptr, newVal);
return result;
}
}
This change implements that simplification, and all the same tests pass (including whatever tests we had for function-`static` variables).
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* WIP diagnostics for line number output.
* Small param naming change
* Use x macro for pass through compile human name lookup/getting.
* WIP on parsing downstream compiler output.
* Split out parsing into ParseDiagnosticUtil.
Added test result of single line.
* Dump out the std output on fail to parse diagnostics.
* Change test type for syntax-error-intrinsic.slang be TEST not TEST_DIAGNOSTIC
* Use Index for StringUtil.
* WIP: First pass support for parsing Slang diagnostics.
* WIP Testing comparing with ParseDiagnosticUtil with previous ad-hoc mechanism.
* Use the new parsing mechanism for diagnostic comparisons.
* Improvements to diagnostics parsing.
Better error handling, and fallback handling.
Added ability to parse downstream compilers without a prefix.
Added ability to parse Slang with a prefix.
|
|
The `GlobalGenericParamSubsitution` class used to be used to represent the mapping of global-scope generic parameters to their concrete arguments, so that we could make use of those concrete arguments for things like layout. That representation caused a lot of pain for other parts of the compiler, though, because everything that dealt with `Substitution`s needed to account for the possibility of global-generic-param subsitutions even if they logically could not occur in most parts of the compiler.
We have since moved to a model where the values for global-scope generic parameters are stored in a single explicit global structure that is used by both layout computation and IR lowering. There is no actual code that construct `GlobalGenericParamSubstitution`s from scratch any more, so all of the support code for them was actually unused.
This change removes all the unused code, and shows that the tests still pass without it (even the tests that use global-scope generic parameters).
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* WIP diagnostics for line number output.
* Small param naming change
* Use x macro for pass through compile human name lookup/getting.
* WIP on parsing downstream compiler output.
* Split out parsing into ParseDiagnosticUtil.
Added test result of single line.
* Dump out the std output on fail to parse diagnostics.
* Change test type for syntax-error-intrinsic.slang be TEST not TEST_DIAGNOSTIC
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Small typo fixes for docs on CUDA target.
|
|
The problem would manifest for any code that declared a DXR 1.1 `RayQuery` value, but then only used it as one location in their code.
The most common way for this to arise in user code was declaring a `RayQuery` and then handing it off to a helper/worker subroutine.
RayQuery<0> myRayQuery;
helperRoutine(myRayQuery, ...);
The root cause was in the emit logic, where the initialization of `myRayQuery` above (a `defaultConstruct` operation in our IR) was getting folded into its (only) use site.
This folding makes some sense, because the initialization of a ray query is not an operation with side effects, but doesn't work in practice because our way of handling default construction in HLSL output is by using a variable declaration.
The simple fix here is to ensure that `defaultConstruct` instructions never get folded into use sites.
If we decide to revisit the logic here, it might be possible to separate out the case where a `defaultConstruct` is being used as a stand-alone instruction, where we can emit it as:
RayQuery<0> myRayQuery;
versus cases where the `defaultConstruct` is being used as a sub-expression, such as:
helperRoutine(RayQuery<0>(), ...);
Whether or not we can emit the latter form (or if it would be equivalent) depends on details of how constructors like this are being implemented in dxc.
For now it seems safest to emit things in a form that is obviously expected to work.
Aside: Historically, the HLSL language has had no notion of "constructors" as being a thing.
A variable that is declared but not initialized in HLSL has always been left uninitialized, since the first version of the language.
The `RayQuery` type in DXR 1.1 is the first example of a type that appears to have a C++-style "default constructor," although HLSL as implemented by dxc still does not expose constructors as a user-visible or documented feature.
(There is the small detail that the DXR 1.0 `HitGroup` type also relied on C++ constructor syntax, but I'm not aware of anybody using that feature right now, so it is mostly a curiosity.)
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Added trying out section to README.md
Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>
|
|
|
|
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Work around for issue with obfuscation (and lack of name hints) leading to names in output not being correctly uniquified.
* Improve appendChar
Remove unrequired memory juggling to scrub names.
* Remove test code.
* Small fixes in comments and method called.
* Remove linkage decoration on functions that are specialized.
* Obfuscation naming with specialization test.
* Fix instruction deletion.
Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* WIP more sophisticated mechanism to find NVRTC.
* Improve nvrtc searching to include PATH.
* Make getting an extension able to differentiate between no extension, and just a .
* Add comment.
* Add support for searching instance path.
* Small improvements around scope and finding NVRTC.
* Improve documentation around NVRTC loading.
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Add other NVRTC versions.
Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>
|
|
* Further flatten IR natvis views
* improvements
* formatting
Co-authored-by: Yong He <yhe@nvidia.com>
Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>
|
|
* Fix existential specialization of mutable buffer loads.
* fix
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|