<feed xmlns='http://www.w3.org/2005/Atom'>
<title>slang.git/source/core/slang-nvrtc-compiler.cpp, branch master</title>
<subtitle>Making it easier to work with shaders</subtitle>
<id>https://git.yummers.dev/slang.git/atom?h=master</id>
<link rel='self' href='https://git.yummers.dev/slang.git/atom?h=master'/>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/'/>
<updated>2021-04-01T17:39:11+00:00</updated>
<entry>
<title>Added compiler-core project (#1775)</title>
<updated>2021-04-01T17:39:11+00:00</updated>
<author>
<name>jsmall-nvidia</name>
<email>jsmall@nvidia.com</email>
</author>
<published>2021-04-01T17:39:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=fa31d21ba92669a521a7768467246918e3947e02'/>
<id>urn:sha1:fa31d21ba92669a521a7768467246918e3947e02</id>
<content type='text'>
* #include an absolute path didn't work - because paths were taken to always be relative.

* Split out compiler-core initially with just slang-source-loc.cpp

* Move lexer, name, token to compiler-core.

* Split Lexer and Core diagnostics.

* Move slang-file-system to core.

* Add slang-file-system to core.

* Move DownstreamCompiler into compiler-core

* Fix typo.

* Add compiler-core to bootstrap proj.

* Small fixes to premake

* For linux try with compiler-core

* Remove compiler-core from examples.

* Added NameConventionUtil to compiler-core

* Add global function to CharUtil to *hopefully* avoid linking issue.

* Hack to make linkage of CharUtil work on linux.</content>
</entry>
<entry>
<title>DownstreamDiagnostic::Type -&gt; Severity (#1687)</title>
<updated>2021-02-04T19:23:32+00:00</updated>
<author>
<name>jsmall-nvidia</name>
<email>jsmall@nvidia.com</email>
</author>
<published>2021-02-04T19:23:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=7f266f1ea7a51213069282296a905650fd405c3f'/>
<id>urn:sha1:7f266f1ea7a51213069282296a905650fd405c3f</id>
<content type='text'>
* #include an absolute path didn't work - because paths were taken to always be relative.

* WIP diagnostics for line number output.

* Small param naming change

* Use x macro for pass through compile human name lookup/getting.

* WIP on parsing downstream compiler output.

* Split out parsing into ParseDiagnosticUtil.
Added test result of single line.

* Dump out the std output on fail to parse diagnostics.

* Change test type for syntax-error-intrinsic.slang to be TEST not TEST_DIAGNOSTIC

* Use Index for StringUtil.

* WIP: First pass support for parsing Slang diagnostics.

* WIP Testing comparing with ParseDiagnosticUtil with previous ad-hoc mechanism.

* Use the new parsing mechanism for diagnostic comparisons.

* Improvements to diagnostics parsing.
Better error handling, and fallback handling.
Added ability to parse downstream compilers without a prefix.
Added ability to parse Slang with a prefix.

* DownstreamDiagnostic::Type -&gt; Severity and related fixes.

* Small fixes around moving from DownstreamDiagnostic::Type -&gt; Severity

* Small comment fixes.

Co-authored-by: Tim Foley &lt;tfoleyNV@users.noreply.github.com&gt;</content>
</entry>
<entry>
<title>Improved NVRTC location finding (#1674)</title>
<updated>2021-01-26T17:15:08+00:00</updated>
<author>
<name>jsmall-nvidia</name>
<email>jsmall@nvidia.com</email>
</author>
<published>2021-01-26T17:15:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=798d7731eca286df456bc2ec56c0695ba006b472'/>
<id>urn:sha1:798d7731eca286df456bc2ec56c0695ba006b472</id>
<content type='text'>
* #include an absolute path didn't work - because paths were taken to always be relative.

* WIP more sophisticated mechanism to find NVRTC.

* Improve nvrtc searching to include PATH.

* Make getting an extension able to differentiate between no extension, and just a .

* Add comment.

* Add support for searching instance path.

* Small improvements around scope and finding NVRTC.

* Improve documentation around NVRTC loading.</content>
</entry>
<entry>
<title>Add nvrtc shared library/dll names (#1673)</title>
<updated>2021-01-22T21:18:04+00:00</updated>
<author>
<name>jsmall-nvidia</name>
<email>jsmall@nvidia.com</email>
</author>
<published>2021-01-22T21:18:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=00fad59d49d31538270b811903aeb449c97ca152'/>
<id>urn:sha1:00fad59d49d31538270b811903aeb449c97ca152</id>
<content type='text'>
* #include an absolute path didn't work - because paths were taken to always be relative.

* Add other NVRTC versions.

Co-authored-by: Tim Foley &lt;tfoleyNV@users.noreply.github.com&gt;</content>
</entry>
<entry>
<title>Search for multiple NVRTC versions (#1543)</title>
<updated>2020-09-16T21:19:39+00:00</updated>
<author>
<name>Tim Foley</name>
<email>tfoleyNV@users.noreply.github.com</email>
</author>
<published>2020-09-16T21:19:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=8dd0d26466b7b84b0575031bff2ced8b3b1a1bac'/>
<id>urn:sha1:8dd0d26466b7b84b0575031bff2ced8b3b1a1bac</id>
<content type='text'>
* Search for multiple NVRTC versions

The main change here is that when locating the NVRTC compiler we try multiple library names and take the first one that loads successfully (with an ordering that means we try newer versions before older ones).

In order to support this change, I needed to fix the wrapping logic that invokes the downstream compiler "locator" function, so that it does not report every failed dynamic library load as an error diagnostic (leading to compilation failure), but instead only reports such failures once the locator has reported failure.

The form of the diagnostic output for failures is also changed, in that we now report a single umbrella error about failing to load a downstream compiler, and then report the actual dynamic library load failures as notes on that diagnostic instead of errors of their own. This choice seems appropriate since for cases like NVRTC it is *not* the case that each failed library load is a compilation error. We only need one of the listed libraries to be loadable, so reporting them all as errors risks confusing users.
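
As a rough illustration of the search and reporting behavior described above, here is a minimal Python sketch. It is not the actual C++ implementation: the candidate library names, the function names, and the message wording are all hypothetical, and only the overall shape (try names newest first, keep failures as notes, emit one umbrella error if nothing loads) comes from this change.

```python
import ctypes

# Hypothetical candidate names, newest first. Real NVRTC file names vary by
# platform and release and are not taken from this commit.
CANDIDATE_LIBRARIES = ["nvrtc-example-11", "nvrtc-example-10"]


def locate_first_loadable(names, load=ctypes.CDLL):
    """Return (library, notes); library is None when every candidate fails."""
    notes = []
    for name in names:
        try:
            # The first successful load wins and earlier failures are dropped.
            return load(name), []
        except OSError as error:
            # A failed load is not an error by itself; keep it as a note.
            notes.append("note: failed to load '%s': %s" % (name, error))
    return None, notes


def format_failure(compiler_name, notes):
    """Build one umbrella error followed by the per-library notes."""
    lines = ["error: failed to load downstream compiler '%s'" % compiler_name]
    lines.extend(notes)
    return "\n".join(lines)
```

A caller would invoke `locate_first_loadable(CANDIDATE_LIBRARIES)` and call `format_failure` only when the returned library is `None`, so a successful load after earlier misses produces no diagnostics at all.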

One wrinkle that arose during testing is that the 11.0 release of NVRTC dropped support for the `compute_30` target, which had previously been the minimum and default. I had to add logic to check for versions of 11 or greater and switch to `compute_35` as the default. Similar changes may be required as part of supporting newer NVRTC versions if support for more architectures gets deprecated and removed.
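
The version check described above can be sketched in a few lines of Python. The function name and the use of a bare major version number are assumptions made for illustration; the real code is C++ and may represent versions differently.

```python
def default_compute_target(nvrtc_major_version):
    """Pick the default architecture: compute_35 for NVRTC 11 or newer."""
    # NVRTC 11.0 dropped compute_30, the previous minimum and default.
    if nvrtc_major_version >= 11:
        return "compute_35"
    return "compute_30"
```

If later NVRTC releases drop further architectures, a version-to-minimum table would probably scale better than a chain of comparisons like this one.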

A more complete implementation of this logic might try to load multiple NVRTC versions such that the Slang compiler can identify a suitable compiler based on the minimum feature level that code actually requires. That kind of cleanup is left as future work, since for most users the current approach will be sufficient.

* testing: use verbose mode for running tests by default

* fixup: guard against null diagnostic sink</content>
</entry>
<entry>
<title>Initial work to support OptiX output for ray tracing shaders (#1307)</title>
<updated>2020-04-08T20:57:24+00:00</updated>
<author>
<name>Tim Foley</name>
<email>tfoleyNV@users.noreply.github.com</email>
</author>
<published>2020-04-08T20:57:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=6274e175a2b6a07f448feadd4d7da35b2784d746'/>
<id>urn:sha1:6274e175a2b6a07f448feadd4d7da35b2784d746</id>
<content type='text'>
* Initial work to support OptiX output for ray tracing shaders

This change represents in-progress work toward allowing Slang/HLSL ray-tracing shaders to be cross-compiled for execution on top of OptiX. The work as it exists here is incomplete, but the changes are incremental and should not disturb existing supported use cases.

One major unresolved issue in this work is that the OptiX SDK does not appear to set an environment variable, so the path to the SDK is currently hard-coded.

Changes include:

* Modified the premake script to support new options for adding OptiX to the build. Right now the default path to the OptiX SDK is hard-coded because the installer doesn't seem to set an environment variable. We will want to update that to have a reasonable default path for both Windows and Unix-y platforms in a later change.

* I ran the premake generator on the project since I added new options, which resulted in a bunch of diffs to the Visual Studio project files that are unrelated to this change. Many of the diffs come from previous edits that added files using only the Visual Studio IDE rather than by re-running premake, so it is arguably better to have the checked-in project files more accurately reflect the generated files used for CI builds.

* The "downstream compiler" abstraction was extended to have an explicit notion of the kind of pipeline that shaders are being compiled for (e.g., compute vs. rasterization vs. ray tracing). This option is used to tell the NVRTC case when it needs to include the OptiX SDK headers in the search path for shader compilation (and also when it should add a `#define` to make the prelude pull in OptiX). This code again uses a hard-coded default path for the OptiX SDK; we will need to modify that to have a better discovery approach and also to support an API or command-line override.

  * One note for the future is that instead of passing down a "pipeline type" we could instead pass down the list/set of stages for the kernels being compiled, and the OptiX support could be enabled whenever there is *any* ray tracing entry point present in a module. That approach would allow mixing RT and compute kernels during downstream compilation. We will need to revisit these choices when we start supporting code generation for multiple entry points at a time.

* The CUDA emit logic is currently mostly unchanged. The biggest difference is that when emitting a ray-tracing entry point we prefix the name of the generated `__global__` function with a marker for its stage type, as required by the OptiX runtime (e.g., a `__raygen__` prefix is required on all ray-generation entry points).

* The `Renderer` abstraction had a bare minimum of changes made to be able to understand that ray-tracing pipelines exist, and also that some APIs will require the name of each entry point along with its binary data in order to create a program.

* The `ShaderCompileRequest` type was updated so that only a single "source" is supported (rather than distinct source for each entry point), and also the entry points have been turned into a single list where each entry identifies its stage instead of a fixed list of fields for the supported entry-point types.

* The CUDA compute path had a lot of code added to support execution for the new ray-tracing pipeline type. The logic is mostly derived from the `optixHello` example in the OptiX SDK, and at present only supports running a single ray-generation shader with no parameters. The code here is not intended to be ready for use, but represents a significant amount of learning-by-doing.

* The `slang-support.cpp` file in `render-test` was updated so that instead of having separate compilation logic for compute vs. rasterization shaders (which would mean adding a third path for ray tracing), there is now a single flow to the code that works for all pipeline types and any kind of entry points.

  * Implicit in the new code is dropping support for the way GLSL was being compiled for pass-through render tests, which means pass-through GLSL render tests will no longer work. It seems like we didn't have any of those to begin with, though, so it is no great loss.

  * Also implicit are some new invariants about how shaders without known/default entry points need to be handled. For example, the ray tracing case intentionally does not fill in entry points on the `ShaderCompileRequest` and instead fully relies on the Slang compiler's support for discovering and enumerating entry points via reflection. As a consequence of those edits the `-no-default-entry-point` flag on `render-test` is probably not working, but it seems like we don't have any test cases that use that flag anyway.

Given the seemingly breaking changes in those last two bullets, I was surprised to find that all our current tests seem to pass with this change. If there are things that I'm missing, I hope they will come up in review.
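
To make the entry-point naming rule from the CUDA emit bullet above concrete, here is a small Python sketch. Only the `__raygen__` prefix is stated in this change; the stage keys, the other prefixes (which follow the OptiX naming convention as I understand it), and the function name are assumptions for illustration and do not mirror the actual C++ emit code.

```python
# Stage-to-prefix table. `__raygen__` comes from the text above; the stage
# keys and the remaining entries are assumptions for illustration.
OPTIX_STAGE_PREFIXES = {
    "raygeneration": "__raygen__",
    "miss": "__miss__",
    "closesthit": "__closesthit__",
    "anyhit": "__anyhit__",
    "intersection": "__intersection__",
}


def emitted_entry_point_name(stage, name):
    """Prefix ray-tracing entry points; leave other stages unchanged."""
    return OPTIX_STAGE_PREFIXES.get(stage, "") + name
```

With this table, a ray-generation entry point named `main` would be emitted as `__raygen__main`, while a compute entry point keeps its original name.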

* fixup: issues from review and CI

* Some issues noted during the review process (e.g., a missing `break`)

* Fix logic for render tests with `-no-default-entry-point`. I had somehow missed that we had tests reliant on that flag. This required a bit of refactoring to pass down the relevant flag (luckily the function in question was already being passed most of what was in `Options`, so just passing that in directly actually simplifies the call sites a bit).

* There was a missing line of code to actually add the default compute entry points to the compile request. I think this was a problem that slipped in as part of some pre-PR refactoring/cleanup changes that I failed to re-test.</content>
</entry>
<entry>
<title>CUDA version handling (#1301)</title>
<updated>2020-03-30T23:23:09+00:00</updated>
<author>
<name>jsmall-nvidia</name>
<email>jsmall@nvidia.com</email>
</author>
<published>2020-03-30T23:23:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=ea7690558bca71ce3a9453adff4e0135352a352f'/>
<id>urn:sha1:ea7690558bca71ce3a9453adff4e0135352a352f</id>
<content type='text'>
* render feature for CUDA compute model.

* Use SemanticVersion type.

* Enable CUDA wave tests that require CUDA SM 7.0.
Provide mechanism for DownstreamCompiler to specify version numbers.

* Enabled wave-equality.slang

* Make CUDA SM version major version not just a single digit.

* Fix assert.

* DownstreamCompiler::Version -&gt; CapabilityVersion</content>
</entry>
<entry>
<title>Better diagnostics on failure on CUDA. (#1288)</title>
<updated>2020-03-25T18:08:21+00:00</updated>
<author>
<name>jsmall-nvidia</name>
<email>jsmall@nvidia.com</email>
</author>
<published>2020-03-25T18:08:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=28a0ca96a1ad2a3f0e09cc97b866f3b6338a09fa'/>
<id>urn:sha1:28a0ca96a1ad2a3f0e09cc97b866f3b6338a09fa</id>
<content type='text'>
* Better diagnostics on failure on CUDA.

* Catch exceptions in render-test

* Added ability to disable reporting on CUDA failures
* Stopped using exception for reporting (just write to StdWriter::out())
* Removed CUDAResult type

* Don't set arch type on nvrtc to see if fixes CI issues.

* Try compute_30 on CUDA.

* Added ability to IGNORE_ a test
Disabled rw-texture-simple and texture-get-dimensions

* Disable tests that require CUDA SM7.0
Use DISABLE_ prefix to disable tests.

* Disable signalUnexpectedError doing printf.</content>
</entry>
<entry>
<title>CUDA support for vector/matrix Wave intrinsics (#1266)</title>
<updated>2020-03-09T16:40:04+00:00</updated>
<author>
<name>jsmall-nvidia</name>
<email>jsmall@nvidia.com</email>
</author>
<published>2020-03-09T16:40:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=7e0aa9315f7f65033229c1f76d7df47ccd2da3d0'/>
<id>urn:sha1:7e0aa9315f7f65033229c1f76d7df47ccd2da3d0</id>
<content type='text'>
* Distinguish between __activeMask and _getConvergedMask().
Remove need to pass in mask to CUDA wave impls.

* Add support for vector/matrix Wave intrinsics for CUDA.
Fix issue with CUDA parsing of errors.

* Fix typo.</content>
</entry>
<entry>
<title>Renamed UnownedStringSlice::size to getLength to match String. (#1254)</title>
<updated>2020-03-03T00:14:18+00:00</updated>
<author>
<name>jsmall-nvidia</name>
<email>jsmall@nvidia.com</email>
</author>
<published>2020-03-03T00:14:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.yummers.dev/slang.git/commit/?id=cbba1f2ba451f31e910d59fb9efbadc5e370c095'/>
<id>urn:sha1:cbba1f2ba451f31e910d59fb9efbadc5e370c095</id>
<content type='text'>
</content>
</entry>
</feed>
