Slang CPU target Support
========================

Slang has preliminary support for producing CPU source and binaries.

# Features

* Can compile C/C++/Slang to binaries (executables and/or shared libraries)
* Can compile Slang source into C++ source code
* Supports compute style shaders
* C/C++ backend abstracts the command line options, and parses the compiler errors/output such that the output of all supported compilers is available in the same format

# Limitations

These limitations apply to Slang source. With C/C++ the limitations are whatever the compiler requires.

* Only supports 64 bit targets (specifically it assumes all pointers are 64 bit)
* Barriers are not supported (making these work would require an ABI change)
* Atomics are not supported
* Complex resource types (such as say Texture2D) are work in progress
* Out of bounds access to resources has undefined behavior
* ParameterBlocks are not currently supported

For current C++ source output, the compiler needs to support partial specialization.

# How it works

The initial version works by adding 'back end' compiler support for C/C++ compilers. Currently this is tested to work with Visual Studio, Clang and G++/GCC on Windows and Linux. The C/C++ backend can be directly accessed much like 'dxc', 'fxc' or 'glslang' can, using the pass-through mechanism with the following new backends...

```
SLANG_PASS_THROUGH_CLANG,           ///< Clang C/C++ compiler
SLANG_PASS_THROUGH_VISUAL_STUDIO,   ///< Visual Studio C/C++ compiler
SLANG_PASS_THROUGH_GCC,             ///< GCC C/C++ compiler
SLANG_PASS_THROUGH_GENERIC_C_CPP,   ///< Generic C or C++ compiler, which is decided by the source type
```

Sometimes it is not important which C/C++ compiler is used, and this can be specified via the 'Generic C/C++' option. This will aim to use the compiler that is most likely binary compatible with the compiler that was used to build the slang binary being used.

To make it possible for slang to produce CPU code, we now need a mechanism to convert slang code into C/C++.
The first iteration only supports C++ generation.

If source is desired instead of a binary this can be specified via the `SlangCompileTarget`. These can be specified on the slangc command line as `-target c` or `-target cpp`.

In the API the `SlangCompileTarget`s are

```
SLANG_C_SOURCE,             ///< The C language
SLANG_CPP_SOURCE,           ///< The C++ language
```

If a CPU binary is required this can be specified as a `SlangCompileTarget` of

```
SLANG_EXECUTABLE,           ///< Executable (for hosting CPU/OS)
SLANG_SHARED_LIBRARY,       ///< A shared library/Dll (for hosting CPU/OS)
```

These can also be specified on the slangc command line as `-target exe` and `-target dll` or `-target sharedlib`.

In order to be able to use the slang code on CPU, there needs to be binding via values passed to a function that the C/C++ code will produce and export. How this works is described in the ABI section.

Note that if a binary target is requested, the binary contents will be returned in an ISlangBlob just like for other targets. To use the CPU binary it typically must be saved as a file and then potentially marked for execution by the OS before executing. It may be possible to load shared libraries or dlls from memory, but this is a non standard feature that requires unusual workarounds.

Under the covers when slang is used to generate a binary via a C/C++ compiler, it must do so through the file system. Currently this means that the source (say generated by slang) and the binary (produced by the C/C++ compiler) must all be files. To make this work slang uses temporary files. The reason for hiding this mechanism, and not returning say filenames, is so that in the future when binaries are produced directly (for example with LLVM), nothing will need to change.

ABI
===

Say we have some slang source like the following.

```
struct Thing { int a; int b; }

Texture2D tex;
SamplerState sampler;

[numthreads(4, 1, 1)]
void computeMain(
    uint3 dispatchThreadID : SV_DispatchThreadID,
    uniform Thing thing,
    uniform Thing thing2)
{
    // ...
}
```

When it is compiled into a shared library/dll, how is it invoked? The entry point is exported with the signature

```
void computeMain(ComputeVaryingInput* varyingInput, UniformState* uniformState);
```

The UniformState struct typically varies by shader, and it holds all of the bindings. Where these are located can be determined by reflection. For example

```
struct UniformState
{
    Thing_0* thing3_0;
    RWStructuredBuffer outputBuffer_0;
    Texture2D tex_0;
    SamplerState sampler_0;
    _S1* _S2;
};
```

Note that for C++ targets, the templated types are defined in the `slang-cpp-prelude.h` that is included. `slang-cpp-prelude.h` *MUST* currently be within the search path passed to the compiler. By default with the CPU path, the path to the slang file is included as a 'system' include path, such that placing the `slang-cpp-prelude.h` file in the same directory as the slang source file should mean that it is found.

ConstantBuffers become pointers to the type they hold (as `thing3_0` is in the above structure).

StructuredBuffer/RWStructuredBuffer/ByteAddressBuffer/RWByteAddressBuffer in effect become the following (where for the ByteAddressBuffer types T is uint32_t).

```
T* data;
size_t count;
```

Resource types become pointers to interfaces that implement their features. For example `Texture2D` becomes a pointer to an `ITexture2D` interface that has to be implemented in client side code. Similarly SamplerState and SamplerComparisonState become `ISamplerState` and `ISamplerComparisonState`.
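
To make the binding rules above more concrete, here is a minimal host-side sketch in C++. It is not code generated by Slang and it does not use the real prelude: the `RWStructuredBuffer`, `UniformState` and `ComputeVaryingInput` definitions are simplified, hypothetical stand-ins (in particular the `threadIndex` field is an assumption, the real `ComputeVaryingInput` is defined by `slang-cpp-prelude.h`), and `computeMain` is a hand-written stand-in for the function a compiled shared library would export. `runOnHost` is likewise a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the prelude's buffer type: a pointer plus an element count.
template <typename T>
struct RWStructuredBuffer
{
    T* data;
    size_t count;
};

struct Thing
{
    int32_t a;
    int32_t b;
};

// Hypothetical binding struct for an imagined shader. Real layouts come from reflection.
struct UniformState
{
    Thing* thing;                              // a ConstantBuffer becomes a pointer to its contents
    RWStructuredBuffer<int32_t> outputBuffer;  // a buffer becomes data + count
};

// Placeholder only: the real ComputeVaryingInput is defined by the prelude.
struct ComputeVaryingInput
{
    uint32_t threadIndex;
};

// Stand-in for the entry point that a compiled shared library would export.
void computeMain(ComputeVaryingInput* varyingInput, UniformState* uniformState)
{
    assert(varyingInput && uniformState);
    const uint32_t i = varyingInput->threadIndex;
    if (i < uniformState->outputBuffer.count)
    {
        uniformState->outputBuffer.data[i] =
            uniformState->thing->a * int32_t(i) + uniformState->thing->b;
    }
}

// Host side: fill in the bindings, then invoke the entry point once per thread.
std::vector<int32_t> runOnHost(size_t threadCount)
{
    std::vector<int32_t> output(threadCount, 0);
    Thing thing = {2, 1};

    UniformState state = {};
    state.thing = &thing;
    state.outputBuffer.data = output.data();
    state.outputBuffer.count = output.size();

    for (size_t i = 0; i < threadCount; ++i)
    {
        ComputeVaryingInput input = {uint32_t(i)};
        computeMain(&input, &state);
    }
    return output;
}
```

In this sketch the host owns all of the memory: the constant data and the output storage live in ordinary C++ objects, and the binding struct only points at them. With a real shared library the same pattern would presumably apply, except that `computeMain` would be looked up in the loaded library and the struct layout would be taken from reflection.
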

The `_S1` struct in the `UniformState` example (which may have different names) is actually a struct that holds all of the entry point uniforms if there are any, in this case

```
struct _S1
{
    Thing_0 thing_0;
    Thing_0 thing2_0;
};
```

Note that this pointer is not directly reflected (although the layout of the uniform parameters in the struct is). Currently this pointer is just placed after all the other reflected bindings.

It may be useful to be able to include `slang-cpp-prelude.h` in C++ code to access the types that are used in the generated code. This introduces a problem in that the types used in the generated code might clash with types in client code. To work around this problem, you can wrap all of the types defined in the prelude with a namespace of your choosing. For example

```
#define SLANG_PRELUDE_NAMESPACE CPPPrelude
#include "../../tests/cross-compile/slang-cpp-prelude.h"
```

would wrap all the slang prelude types in the namespace `CPPPrelude`.

Language aspects
================

# Arrays passed by Value

Slang follows the HLSL convention that arrays are passed by value. This is in contrast to C/C++, where arrays are passed by reference. To make generated C/C++ follow this convention an array is turned into a 'FixedArray' struct type. Since structs in C/C++ are passed by value by default, the wrapped array is too.

To get something more similar to C/C++ operation the array can be marked `in out` or `inout` to make it passed by reference.

Limitations
===========

# Out of bounds access

In HLSL code if an access is made out of bounds of a StructuredBuffer, execution proceeds. If an out of bounds read is performed, a zeroed value is returned. If an out of bounds write is performed it's effectively a noop, as the value is discarded.

On the CPU target this behaviour is *NOT* supported. For a debug CPU build an out of bounds access will assert, for a release build the behaviour is undefined.

The reason for this is that such an access is quite difficult and/or slow to implement on the CPU. The underlying reason is that operator[] typically returns a reference to the contained value. If this is out of bounds, it's not clear what to return, in particular because the value may be read or written, and moreover elements of the type might be written. In practice this means a global zeroed value cannot be returned.

This could be supported if code gen worked as follows. Say we have

```
RWStructuredBuffer values;
values[3].x = 10;
```

This would produce

```
template <typename T>
struct RWStructuredBuffer
{
    T& at(size_t index, T& defValue) { return index < size ? values[index] : defValue; }

    T* values;
    size_t size;
};

RWStructuredBuffer values;

// ...
Vector defValue = {};           // Zero initialize so that an out of bounds read returns zero
values.at(3, defValue).x = 10;
```

Note that [] would be turned into the `at` function, which takes the default value as a parameter provided by the caller. If this is then written to, only the defValue is corrupted. Even this mechanism may not be quite right, because if we write and then read again from the out of bounds reference, in HLSL we may expect that 0 is returned, whereas here we get the value that was written.
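
The `at` idea and its caveat can be tried out in isolation. The following is a minimal, self-contained C++ sketch under stated assumptions, not what Slang currently generates: `Float3` is a hypothetical element type standing in for a vector, and the two helper functions are illustrative names only.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical element type standing in for a 3 component vector.
struct Float3
{
    float x, y, z;
};

template <typename T>
struct RWStructuredBuffer
{
    // Returns the element when index is in range, otherwise the caller's scratch value.
    T& at(size_t index, T& defValue) { return index < size ? values[index] : defValue; }

    T* values;
    size_t size;
};

// Writes through an out of bounds index, then reads the same index back.
float writeThenReadOutOfBounds()
{
    Float3 storage[2] = {};
    RWStructuredBuffer<Float3> values = {storage, 2};

    Float3 defValue = {};               // Zero initialized scratch value
    values.at(3, defValue).x = 10.0f;   // Lands in defValue, not in storage

    // The real storage is untouched by the out of bounds write.
    assert(storage[0].x == 0.0f && storage[1].x == 0.0f);

    // HLSL would read back 0 here; this sketch reads back the value just written.
    return values.at(3, defValue).x;
}

// Writes through an in range index and returns what ended up in the storage.
float inBoundsWriteReachesStorage()
{
    Float3 storage[2] = {};
    RWStructuredBuffer<Float3> values = {storage, 2};

    Float3 defValue = {};
    values.at(1, defValue).x = 5.0f;
    return storage[1].x;
}
```

`writeThenReadOutOfBounds` demonstrates the mismatch described above: the out of bounds write is kept away from the buffer, but a later read through the same `defValue` sees the written value rather than zero. Resetting `defValue` before every access would restore the HLSL behaviour, at additional cost per access.
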

TODO
====

# Main

* Complete support (in terms of interfaces) for 'complex' resource types - such as Texture
* Interface implementation for complex resource types
* Parameter block support (the difficulty is around layout)
* Split out entry point uniforms into a separate pointer passed to the entry point
* Test system executes and tests for CPU targets
* Slang API allows for compilation into loaded binary such that functions can be directly executed
* Output C/C++ compiler errors as 'externalCompiler' errors through diagnostic system
* Improve documentation
* Output of header files
* Mechanism to specify where C/C++ binaries are located

# Internal Slang compiler features

These issues are more internal Slang features/improvements

* Currently we only support 64 bit targets (it is assumed in layout that pointers are 64 bit)
* Slang compute tests work (where appropriate)
* Currently only generates C++ code, it would be fairly straightforward to support C (especially if we have 'intrinsic definitions')
* Have 'intrinsic definitions' in standard library - such that they can be generated where appropriate
  + This will simplify the C/C++ code generation, as it means the slang language will generate most of the appropriate code
* Currently 'construct' IR inst is supported as is, we may want to split out to separate instructions for specific scenarios