| author | jsmall-nvidia <jsmall@nvidia.com> | 2019-08-08 17:23:03 -0400 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2019-08-08 17:23:03 -0400 |
| commit | 41247c3942210df33b9e3dd733eafb23573a4f2f (patch) | |
| tree | a49a88e680078bd379fe183cf826d49d7f935737 /docs | |
| parent | c1cc93dd962a6db6c839341f11d2654cf0e62e37 (diff) | |
WIP: Preliminary Slang -> C++ code generation (#1009)
* Expanded prelude for some other resource types. Disable C++ output for ParameterGroup.
* WIP: Layout for CPU.
* Fixes to CPU layout.
* WIP: The uniform is output, but the variable definition is not.
* WIP: Entry point parameters to global scope in C++.
Handling of resource types (in so far as outputting)
* Some discussion of ABI and different input types.
* WIP: More C++ support around resource types.
* WIP: Split up variables into different structures on emit.
* WIP: Emitting C++ with wrapping up of 'Context'
* WIP: C++ code has access to semantic values.
Wrap in struct so can use method calls to pass shared state.
Disable legalizeResourceTypes and legalizeExistentialTypeLayout
* Fix structured buffer layout for CPU.
* Remove testing/handling of global uniforms on CPU path.
Typo fix.
Changed CPU tests to use new CPU calling convention.
* Check globals are working. Initialize context to zero globals.
* Order the global parameters for C++ output by their layout.
Note that layout isn't quite working correctly: for StructuredBuffer<int>, the int seems to be consuming uniform space.
* Work around for reflection not having all data needed for layout ordering for C++ code.
* Output constant buffers as pointers.
* Entry point parameters accessed through pointer to struct.
* WIP: Layout for CPU is reasonable for test case.
* Only output 'f' after a float literal if the type is marked as a float.
* Cast construction works on C++.
* Made IntrinsicOp::ConvertConstruct to make intent clearer.
* C++ handling construction from scalar.
Handle access of a scalar with .x.
Check default initialization.
* Comment about need for split of kIROp_construct.
Release build works.
* Added support for constructVectorFromScalar to C/C++ target.
* Handling of in/out in C/C++.
* First pass documentation CPU support.
* Improvements to C++/C slang code generation documentation.
* Small doc change to include need for mechanism to specify cpp compiler path.
* Better handling of swizzling - allow swizzling a scalar into a vector.
Diffstat (limited to 'docs')
| -rw-r--r-- | docs/cpu-target.md | 213 |
1 files changed, 213 insertions, 0 deletions
diff --git a/docs/cpu-target.md b/docs/cpu-target.md
new file mode 100644
index 000000000..cc1e15b08
--- /dev/null
+++ b/docs/cpu-target.md
@@ -0,0 +1,213 @@
+Slang CPU target Support
+========================
+
+Slang has preliminary support for producing CPU source and binaries.
+
+# Features
+
+* Can compile C/C++/Slang to binaries (executables and/or shared libraries)
+* Can compile Slang source into C++ source code
+* Supports compute style shaders
+* C/C++ backend abstracts the command line options, and parses the compiler errors/output such that output from all supported compilers is available in the same format
+
+# Limitations
+
+These limitations apply to Slang source; with C/C++ the limitations are whatever the compiler requires
+
+* Only supports 64 bit targets (specifically it assumes all pointers are 64 bit)
+* Barriers are not supported (making these work would require an ABI change)
+* Atomics are not supported
+* Complex resource types (such as say Texture2D) are a work in progress
+* Out of bounds access to resources has undefined behavior
+* ParameterBlocks are not currently supported
+
+For current C++ source output, the compiler needs to support partial specialization.
+
+# How it works
+
+The initial version works by adding 'back end' compiler support for C/C++ compilers. Currently this is tested to work with Visual Studio, Clang and G++/Gcc on Windows and Linux. The C/C++ backend can be directly accessed much like 'dxc', 'fxc' or 'glslang' can, using the pass-through mechanism with the following new backends...
+
+```
+SLANG_PASS_THROUGH_CLANG, ///< Clang C/C++ compiler
+SLANG_PASS_THROUGH_VISUAL_STUDIO, ///< Visual studio C/C++ compiler
+SLANG_PASS_THROUGH_GCC, ///< GCC C/C++ compiler
+SLANG_PASS_THROUGH_GENERIC_C_CPP, ///< Generic C or C++ compiler, which is decided by the source type
+```
+
+Sometimes it is not important which C/C++ compiler is used, and this can be specified via the 'Generic C/C++' option. This will aim to use the compiler that is most likely binary compatible with the compiler that was used to build the slang binary being used.
+
+To make it possible for slang to produce CPU code, we now need a mechanism to convert slang code into C/C++. The first iteration only supports C++ generation. If source is desired instead of a binary this can be specified via the SlangCompileTarget. These can be specified on the slangc command line as `-target c` or `-target cpp`
+
+In the API the `SlangCompileTarget`s are
+
+```
+SLANG_C_SOURCE, ///< The C language
+SLANG_CPP_SOURCE, ///< The C++ language
+```
+
+If a CPU binary is required this can be specified as a `SlangCompileTarget` of
+
+```
+SLANG_EXECUTABLE, ///< Executable (for hosting CPU/OS)
+SLANG_SHARED_LIBRARY, ///< A shared library/Dll (for hosting CPU/OS)
+```
+
+These can also be specified on the slang command line as `-target exe` and `-target dll` or `-target sharedlib`.
+
+In order to be able to use the slang code on CPU, there needs to be binding via values passed to a function that the C/C++ code will produce and export. How this works is described in the ABI section.
+
+Note that if a binary target is requested, the binary contents will be returned in an ISlangBlob just like for other targets. To use the CPU binary it typically must be saved as a file and then potentially marked for execution by the OS before executing. It may be possible to load shared libraries or dlls from memory - but this is a non standard feature that requires unusual workarounds.
+
+Under the covers when slang is used to generate a binary via a C/C++ compiler, it must do so through the file system. Currently this means that the source (say generated by slang) and the binary (produced by the C/C++ compiler) must all be files. To make this work slang uses temporary files. Note that the reasoning for hiding this mechanism - and not returning, say, filenames - is so that in the future when binaries are produced directly (for example with LLVM), nothing will need to change.
+
+ABI
+===
+
+Say we have some slang source like the following.
+
+```
+struct Thing { int a; int b; }
+
+Texture2D<float> tex;
+SamplerState sampler;
+
+[numthreads(4, 1, 1)]
+void computeMain(
+    uint3 dispatchThreadID : SV_DispatchThreadID,
+    uniform Thing thing,
+    uniform Thing thing2)
+{
+    // ...
+}
+```
+
+When it is compiled into a shared library/dll - how is it invoked? The entry point is exported with a signature
+
+```
+void computeMain(ComputeVaryingInput* varyingInput, UniformState* uniformState);
+```
+
+The UniformState struct typically varies by shader, and it holds all of the bindings. Where these are located can be determined by reflection. For example
+
+```
+struct UniformState
+{
+    Thing_0* thing3_0;
+    RWStructuredBuffer<int32_t> outputBuffer_0;
+    Texture2D<float > tex_0;
+    SamplerState sampler_0;
+    _S1* _S2;
+};
+```
+
+Note that for C++ targets, the templated types are defined in the slang-cpp-prelude.h that is included. Note that `slang-cpp-prelude.h` *MUST* currently be within the search path passed to the compiler. By default with the CPU path, the path to the slang file is included as a 'system' include path, such that placing the slang-cpp-prelude.h file in the same directory as the slang source file should mean that it is found.
+
+ConstantBuffers will become pointers to the type they hold (as thing3_0 is in the above structure).
+
+StructuredBuffer/RWStructuredBuffer/ByteAddressBuffer/RWByteAddressBuffer become, in effect, the following (where for ByteAddressBuffers T is uint32_t).
+
+```
+    T* data;
+    size_t count;
+```
+
+Resource types become pointers to interfaces that implement their features. For example `Texture2D` becomes a pointer to an `ITexture2D` interface that has to be implemented in client side code. Similarly SamplerState and SamplerComparisonState become `ISamplerState` and `ISamplerComparisonState`.
+
+The `_S1` struct in the example above (which may have different names) is actually a struct that holds all of the entry point uniforms if there are any, in this case
+
+```
+struct _S1
+{
+    Thing_0 thing_0;
+    Thing_0 thing2_0;
+};
+```
+
+Note that this pointer is not directly reflected (although the layout of the uniform parameters in the struct is). Currently this pointer is just placed after all the other reflected bindings.
+
+
+It may be useful to be able to include `slang-cpp-prelude.h` in C++ code to access the types that are used in the generated code. This introduces a problem in that the types used in the generated code might clash with types in client code. To work around this problem, you can wrap all of the types defined in the prelude with a namespace of your choosing. For example
+
+```
+#define SLANG_PRELUDE_NAMESPACE CPPPrelude
+#include "../../tests/cross-compile/slang-cpp-prelude.h"
+```
+
+Would wrap all the slang prelude types in the namespace `CPPPrelude`.
+
+Language aspects
+================
+
+# Arrays passed by Value
+
+Slang follows the HLSL convention that arrays are passed by value. This is in contrast to C/C++, where arrays are passed by reference. To make generated C/C++ follow this convention an array is turned into a 'FixedArray' struct type. Since classes by default in C/C++ are passed by value, the wrapped array is also.
+
+To get something more similar to C/C++ operation the array can be marked in out or inout to make it passed by reference.
+
+Limitations
+===========
+
+# Out of bounds access
+
+In HLSL code if an access is made out of bounds of a StructuredBuffer, execution proceeds. If an out of bounds read is performed, a zeroed value is returned. If an out of bounds write is performed it's effectively a noop, as the value is discarded.
+
+On the CPU target this behaviour is *NOT* supported. For a debug CPU build an out of bounds access will assert, for a release build the behaviour is undefined.
+
+The reason for this is that such an access is quite difficult and/or slow to implement on the CPU. The underlying reason is that operator[] typically returns a reference to the contained value. If this is out of bounds - it's not clear what to return, in particular because the value may be read or written and moreover elements of the type might be written. In practice this means a global zeroed value cannot be returned.
+
+This could be supported if code gen worked as follows for say
+
+```
+RWStructuredBuffer<float4> values;
+
+values[3].x = 10;
+```
+
+Produces
+
+```
+template <typename T>
+struct RWStructuredBuffer
+{
+    T& at(size_t index, T& defValue) { return index < size ? values[index] : defValue; }
+
+    T* values;
+    size_t size;
+};
+
+RWStructuredBuffer<float4> values;
+
+// ...
+Vector<float, 4> defValue = {}; // Zero initialize such that access
+values.at(3, defValue).x = 10;
+```
+
+Note that [] would be turned into the `at` function, which takes the default value as a parameter provided by the caller. If this is then written to then only the defValue is corrupted. Even this mechanism may not be quite right, because if we write and then read again from the out of bounds reference in HLSL we may expect that 0 is returned, whereas here we get the value that was written.
+
+TODO
+====
+
+# Main
+
+* Complete support (in terms of interfaces) for 'complex' resource types - such as Texture
+* Interface implementation for complex resource types
+* Parameter block support (the difficulty is around layout)
+* Split out entry point uniforms into a separate pointer passed to the entry point
+* Test system executes and tests for CPU targets
+* Slang API allows for compilation into loaded binary such that functions can be directly executed
+* Output C/C++ compiler errors as 'externalCompiler' errors through diagnostic system
+* Improve documentation
+* Output of header files
+* Mechanism to specify where C/C++ binaries are located
+
+# Internal Slang compiler features
+
+These issues are more internal Slang features/improvements
+
+* Currently we only support 64 bit targets (it is assumed in layout that pointers are 64 bit)
+* Slang compute tests work (where appropriate)
+* Currently only generates C++ code, it would be fairly straightforward to support C (especially if we have 'intrinsic definitions')
+* Have 'intrinsic definitions' in standard library - such that they can be generated where appropriate
+  + This will simplify the C/C++ code generation as it means the slang language will generate most of the appropriate code
+* Currently 'construct' IR inst is supported as is, we may want to split out to separate instructions for specific scenarios
+
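Reviewer note on the "Out of bounds access" section of the added document: the caller-supplied fallback idea can be tried out in isolation. What follows is a minimal, self-contained sketch and not the code Slang generates. `Float4`, `BufferView`, and both helper functions are hypothetical names introduced only for illustration; the real prelude types may differ.

```cpp
#include <cstddef>

// Hypothetical stand-in for a 4-component float vector (not the prelude type).
struct Float4 { float x, y, z, w; };

// Hypothetical buffer view: a pointer plus an element count, with an accessor
// that returns a caller-provided fallback when the index is out of bounds.
template <typename T>
struct BufferView
{
    T& at(std::size_t index, T& fallback)
    {
        return index < count ? data[index] : fallback;
    }

    T* data;
    std::size_t count;
};

// Returns true if an out-of-bounds write leaves the backing storage untouched.
bool outOfBoundsWriteIsDiscarded()
{
    Float4 storage[2] = {};
    BufferView<Float4> values{storage, 2};

    Float4 fallback = {};              // zero initialized
    values.at(3, fallback).x = 10.0f;  // index 3 is out of bounds, so this writes to fallback

    for (const Float4& v : storage)
    {
        if (v.x != 0.0f) return false;
    }
    return true;
}

// Writes out of bounds, then reads the same out-of-bounds element back.
// The result is the written value, which is the mismatch with HLSL
// (where a zero would be expected) that the document points out.
float readBackAfterOutOfBoundsWrite()
{
    Float4 storage[2] = {};
    BufferView<Float4> values{storage, 2};

    Float4 fallback = {};
    values.at(3, fallback).x = 10.0f;
    return values.at(3, fallback).x;
}
```

Under these assumptions the write never touches the real storage, but the fallback keeps the written value. A closer match to HLSL would need the fallback re-zeroed before each access, at some extra cost per access.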
