| Age | Commit message | Author |
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Upgrade to slang-llvm-13.x-33
* Kick - as build failed on download egress.
* Output "static" on methods in doc output.
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
flow works now (#2595)
|
|
|
|
* Fixes for crash when inlining at global scope
Recent changes to the way inlining is implemented in the Slang compiler
have broken certain scenarios involving `static const` declarations.
The basic problem is that the initial-value expression for a `static const`
gets lowered into IR code at the global scope of a module, and if
that code includes `call`s to stdlib operations marked `forceInlineEarly`,
then we end up trying to apply inlining to code at module scope.
The current inlining operation assumes that all `call`s are in basic
blocks, and that the correct way to do inlining involves splitting
those blocks.
This change adds logic to detect when the callee at a call site to
be inlined consists of a single basic block ending in a `return`,
and in that case it invokes specialized inlining logic that doesn't
split basic blocks and doesn't need to care if the original `call`
is in a basic block.
Thus we are able to inline calls to single-basic-block `forceInlineEarly`
functions called as part of the initialization for global-scope
`static const` variables.
This logic does *not* solve the problem of calls to multi-block
`forceInlineEarly` functions from the global scope. Such calls cannot
really be inlined.
A secondary problem that arises when inlining such calls is that the
callee might include local temporaries (`var` instructions) that are
read and written (`load`s and `store`s), and none of those instructions
should be allowed at the global scope.
In the case of the functions being inlined here, the `load`/`store`
operations are superfluous, and should be cleaned up by our SSA pass.
The only reason that they seem to *not* be getting cleaned up in the
case that has been triggering crashes is that the callee is a generic.
The current logic for the SSA pass was skipping the bodies of generic
functions, so they would not be cleaned up. This change enables the SSA
pass to apply to the bodies of generic functions, and also ensures that
SSA cleanups are applied *before* any `forceInlineEarly` functions get
inlined.
* fixup: liveness test outputs
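
The single-basic-block case described in this commit message can be illustrated with a toy model. This is only a sketch under stated assumptions, not the Slang implementation: `Inst`, `Func`, and `inline_single_block_call` are hypothetical names, values are plain strings, and the "global scope" is modeled as a flat list of instructions that sit in no basic block. The point it shows is that when the callee is one block ending in `return`, its body can be cloned in front of the `call` and the call's uses redirected to the returned value, with no block splitting required.

```python
from dataclasses import dataclass, field


@dataclass
class Inst:
    name: str                      # result name, "" if the inst has no result
    op: str                        # e.g. "call", "mul", "return"
    operands: list = field(default_factory=list)


@dataclass
class Func:
    params: list                   # parameter names
    blocks: list                   # list of basic blocks, each a list of Inst


def is_single_block_with_return(func):
    """True if the callee is exactly one non-empty block ending in `return`."""
    return (len(func.blocks) == 1
            and len(func.blocks[0]) > 0
            and func.blocks[0][-1].op == "return")


def inline_single_block_call(insts, index, callee, suffix):
    """Inline the call at insts[index] without splitting any block.

    `insts` may be a basic block or a flat module-scope list; the routine
    never looks at a parent block. Returns False (and changes nothing)
    when the callee has more than one block.
    """
    call = insts[index]
    assert call.op == "call"
    if not is_single_block_with_return(callee):
        return False

    # operands[0] is the callee name; the rest are the arguments.
    mapping = dict(zip(callee.params, call.operands[1:]))
    body = callee.blocks[0]

    # Clone every instruction except the trailing `return`.
    cloned = []
    for inst in body[:-1]:
        new_name = f"{inst.name}.{suffix}"
        new_operands = [mapping.get(o, o) for o in inst.operands]
        cloned.append(Inst(new_name, inst.op, new_operands))
        mapping[inst.name] = new_name

    # The value of the call is whatever the `return` returned.
    returned = body[-1].operands[0]
    result = mapping.get(returned, returned)

    # Splice the clones in place of the call, then redirect its uses.
    insts[index:index + 1] = cloned
    for inst in insts:
        inst.operands = [result if o == call.name else o for o in inst.operands]
    return True
```

In this model, a module-scope `call` to a one-block function is replaced by the cloned body, while a call to a multi-block callee is left untouched, which mirrors the limitation noted above for multi-block `forceInlineEarly` functions.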
|
|
* Frontend work for `[BackwardDerivative]` and `[BackwardDerivativeOf]`.
* Fix clang issue.
* Fix.
* fix gcc issue
* fix formatting.
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Work around for some issue seen with a repro.
* Small improvement in doing IDifferentiable check.
* Fix around obfuscation linkage.
|
|
* Make backward differentiation work with generics.
* Fix.
* Another fix.
* More fix.
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* Work around for some issue seen with a repro.
* Small improvement in doing IDifferentiable check.
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
nullptr. (#2583)
* #include an absolute path didn't work - because paths were taken to always be relative.
* Fix output when layout is nullptr in emitInterpolationModifiersImpl
|
|
* Split bwd_diff op into separate ops for primal and propagate func.
* Fix.
* Download swiftshader with github actions instead of curl on linux.
* Fix github action.
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Initial multi-block implementation
* Implemented multi-block reverse-mode (without loops)
* Added logic to remove block-level decorations to avoid confusing IR simplification passes
* Fixed issues with block-level decorations during IR simplification by removing them prior to simplification.
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
Supersedes #2532
|
|
* Add format checking attributes on printf-like functions
* Don't use printf format attributes on MSVC, where they are not supported
|
|
* Further unify the autodiff passes.
* Fix clang compilation error.
* Rename ForwardDerivativeTranscriber->ForwardDiffTranscriber.
* Remove unused fields from Transcriber classes.
* More small cleanups.
* Cleanup.
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
function. (#2569)
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
Closes #2561
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Added initial support for nested calls
* removed comments
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
* Fix a bug in Path::find
* Fix code formatting
* Fix LockFile and add LockFileGuard
* Add PersistentCache and unit test
* Replace file path dependency list with source file dependency list
* Add note on ordering in Module/FileDependencyList
* Remove old shader cache code
* Refactor shader cache implementation
* Temporarily skip unit tests reading/writing files
* Fix warning
* Reenable lock file test
* Rename shader cache tests and disable crashing test
* Testing
* Stop using Path::getCanonical
* Fix persistent cache lock and test
* Fix threading issues
* Move adding file dependency hashes to getEntryPointHash()
* Fix handling of #include files
* Allow specifying additional search paths for gfx testing device
* Work on shader cache tests
* Update project files
* Revive shader cache graphics tests
* Split graphics pipeline test
* Fix compilation
|
|
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Add vector arithmetic test. Make gradient accumulation work for any IRLoad
* Added support for general vector types, and split transposition into transpose & materialize to allow emitting the fully accumulated gradient for complex types.
* Several bug fixes + finished up support for vector & struct types + removed prop pass
* minor fixes (int/uint casts)
* Removed IRConstruct
* Added some type casts to prevent warnings
* minor fix for unused variable
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Consolidate crypto functions into single module
* Migrate rest of code to new crypto module
* Fix name conflict
|
|
* #include an absolute path didn't work - because paths were taken to always be relative.
* WIP inlining of functions that take or return string related types on GPU targets.
* Small fixes.
* Added a test.
* Add checking that any getStringHash insts are valid.
* Support getStringHash on CUDA.
* Tweak diagnostic.
|
|
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
|
|
ExtractExistentialValueExpr. (#2541)
* Fix missing semantic highlighting in attributes and ExtractExistentialValueExpr.
* Fix regression on partially specialized generic expr highlighting.
* Add regression test.
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Cleanup DigestBuilder and MD5HashGen
* Fix templates
Co-authored-by: Yong He <yonghe@outlook.com>
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
|
|
* Added partial implementation for reverse-mode
* Fixing several compile and runtime errors.
* Fixed several issues with reverse-mode passes.
* Fixed more issues. Basic reverse-mode tests passing
Co-authored-by: Edward Liu <shiqiu1105@gmail.com>
|
|
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Draft FileStream-based implementation for updating cache file
* File streams fully integrated into shader cache code paths; Tests will not run unless the file system is on disk, as file streams do not play nicely with in-memory file systems
* Brought old code back as fallback path, but tests need to ensure previous is freed first
* Testing structure updated, beginning cleanup work
* All tests working
* Cleanup changes
* Removed an extra tab at the end of a line
* Cleanup change
* Undo externals change
* Removed redundant logic for OS vs memory file system handling of the shader cache; Removed extra helper function left over from old cache implementation
* Reverted performance change to generate contents hashes when modules are being loaded as this code path is not always followed; Contents hashing now uses a combination of hashing and checking the last modified time for all file dependencies, only reading in and hashing the contents of all files if the last modified hash does not match
* Added handling to Module::updateContentsBasedHash for file dependencies which are not from a physical source file on disk; Added test for above
Co-authored-by: Lucy Chen <lucchen@nvidia.com>
Co-authored-by: Yong He <yonghe@outlook.com>
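
The contents-hashing strategy described in this commit (check a cheap last-modified hash first, and only read and hash file contents when it no longer matches) can be sketched roughly as follows. This is a minimal illustration under assumptions, not the code in `Module::updateContentsBasedHash`: the class `ContentsHashCache`, its `full_reads` counter, and the choice of SHA-256 are all hypothetical, and the sketch does not handle dependencies that have no physical file on disk.

```python
import hashlib
import os


class ContentsHashCache:
    """Hypothetical cache: re-hash file contents only when mtimes change."""

    def __init__(self):
        self._mtime_digest = None      # digest over (path, mtime) pairs
        self._contents_digest = None   # digest over the file contents
        self.full_reads = 0            # how many times contents were re-read

    @staticmethod
    def _mtime_hash(paths):
        # Cheap check: hashes only the path and last-modified time of each file.
        h = hashlib.sha256()
        for path in paths:
            h.update(path.encode("utf-8"))
            h.update(str(os.stat(path).st_mtime_ns).encode("ascii"))
        return h.hexdigest()

    @staticmethod
    def _contents_hash(paths):
        # Expensive path: reads every dependency in full.
        h = hashlib.sha256()
        for path in paths:
            with open(path, "rb") as f:
                h.update(f.read())
        return h.hexdigest()

    def get(self, paths):
        """Return the contents digest, recomputing it only if an mtime changed."""
        mtime_digest = self._mtime_hash(paths)
        if mtime_digest != self._mtime_digest:
            self._contents_digest = self._contents_hash(paths)
            self._mtime_digest = mtime_digest
            self.full_reads += 1
        return self._contents_digest
```

Calling `get` twice on an unchanged dependency list reads the files only once; the trade-off in this sketch is that an edit which leaves the modification time untouched would go unnoticed.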
|
|
argument. (#2536)
* Fix non-static generic func call issue.
* Add test case.
* Revert unnecessary change.
* Update test comment.
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Add LockFile helper class
|
|
* Make differentiable data-flow pass recognize interface methods.
* Make existing test work with `[TreatAsDifferentiable]`.
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Fix issues around dynamic generic function and autodiff.
* Fix return type issue.
* Fix type unification for generic `inout` parameter.
* Fix.
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Autodiff through simple dynamic dispatch.
* Revert changes.
* Fix.
Co-authored-by: Yong He <yhe@nvidia.com>
|
|
* Initial refactor
* Refactor passes tests
* Removed Differential Bottom references from the IR side
|