Slang -> CUDA kernel runs correctly in test infrastructure (#1167)

* First pass at BindLocation.
* Added BindSet::init - for initializing with two input constant buffers. Needs better name, and perhaps should be another class.
* Fix handling of constant buffer stripping. Improved initialization.
* Trying to generalize BindLocation a little more. Split out CPULikeBindRoot.
* More work to make BindLocation et al work with non-uniform bindings.
* Added parsing to a location.
* WIP: Trying to get CPU working with BindLocation.
* Describe problem of knowing the type of the reference point in the binding table.
* More ideas on getBindings fix.
* Remove BindSet as member of BindLocation.
* Added BindLocation::Invalid.
* Made BindLocation able to be key in hash.
* Use BindLocation for bindings on BindingSet.
* Added cuda and nvrtc categories to test infrastructure. Disabled CUDA synthetic tests by default. Fixed such that all tests now produce something in BindLocation style.
* Use m_userIndex instead of m_userData on Resource. Move the binding setup out of cpu-compute-util (as no longer CPU specific).
* Removed CPUBinding - used BindLocation/BindSet instead. Fixed some bugs around indexOf around uniform indirection.
* Renamed BindSet::Resource -> BindSet::Value.
* Document BindLocation.
* Fixes for Clang/GCC. Improve invariant requirement handling when constructing from BindPoints.
* WIP: First attempt to run CUDA kernel.
* Fix some issues around doing CUDA kernel launch.
* Fix issues around use of cudaMemcpy.
* Better CUDA runtime error checking mechanism.
* Fixed bug in passing parameters to CUDA kernel launch. Simplified initialisation of context.
* WIP: Fix CUDA runtime issues.
* Add explicit CUDA synchronize so failures don't appear on implicit ones.
* Fix problem emitting non-shared variable on CUDA.
* Fix some typos in CUDA layout. Use just a pointer for now for CUDA StructuredBuffer.
* Arg order for CUDA launch was wrong.
* First compute kernel runs on CUDA.
author: jsmall-nvidia <jsmall@nvidia.com> 2020-01-17 09:15:06 -0500
committer: GitHub <noreply@github.com> 2020-01-17 09:15:06 -0500
commit: a8669ade5cb3add8b9ce08e2c3bd96e93190bca8 (patch)
tree: 63be2fa7829c5bf956a5ce4d52af4e1d4073bf84 /tests/cuda
parent: 662721ba4ab0e38924701df4c876a86eb8390968 (diff)
1 files changed, 7 insertions, 17 deletions
diff --git a/tests/cuda/compile-to-cuda.slang b/tests/cuda/compile-to-cuda.slang
index 6166aaf0b..be7d775bd 100644
--- a/tests/cuda/compile-to-cuda.slang
+++ b/tests/cuda/compile-to-cuda.slang
@@ -1,29 +1,19 @@
 //DISABLE_TEST(smoke):SIMPLE: -target ptx -entry computeMain -stage compute 
+//DISABLE_TEST(compute):COMPARE_COMPUTE:-cpu -compute 
+//TEST(compute):COMPARE_COMPUTE:-cuda -compute 
 
 //TEST_INPUT:ubuffer(data=[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0], stride=4):out,name=outputBuffer
 RWStructuredBuffer<int> outputBuffer : register(u0);
 
-int quantize(double value)
-{
-    return int(value * 256);
-}
-
-int quantize(float value)
-{
-    return int(value * 256);
-}
-
 [numthreads(4, 1, 1)]
 void computeMain(uint3 dispatchThreadID : SV_DispatchThreadID)
 {
-    float values[] = { -9, 9, -3, 3 };
 
 	int tid = int(dispatchThreadID.x);
-    float value = values[tid];
-    
-    outputBuffer[tid * 4] = quantize(sin(value));
-    outputBuffer[tid * 4 + 1] = quantize(cos(value));
     
-    outputBuffer[tid * 4 + 2] = quantize(sin(double(value)));
-    outputBuffer[tid * 4 + 3] = quantize(cos(double(value)));
+    outputBuffer[tid * 4] = tid;
+    outputBuffer[tid * 4 + 1] = tid + 1;
+    outputBuffer[tid * 4 + 2] = tid + 2; 
+    outputBuffer[tid * 4 + 3] = tid + 3;
+ 
 }
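
For reference, the values the simplified kernel should leave in outputBuffer can be reproduced on the host. The following is a minimal sketch, not part of the commit: the function name referenceOutput is hypothetical, and it assumes a single thread group is dispatched, so that the four threads declared by numthreads(4, 1, 1) each fill four consecutive elements of the 16-element buffer.

```cpp
#include <array>
#include <cassert>

// Hypothetical host-side reference for the simplified computeMain.
// Assumption: one thread group is dispatched, so tid ranges over 0..3.
std::array<int, 16> referenceOutput()
{
    std::array<int, 16> out{};
    for (int tid = 0; tid < 4; ++tid)
    {
        for (int k = 0; k < 4; ++k)
        {
            const int index = tid * 4 + k;
            assert(index < 16);   // stays inside the 16-element buffer
            out[index] = tid + k; // mirrors outputBuffer[tid * 4 + k] = tid + k
        }
    }
    return out;
}
```

Under that assumption the buffer holds 0 1 2 3, 1 2 3 4, 2 3 4 5, 3 4 5 6, which depends only on integer arithmetic and so avoids the sin/cos precision differences the removed code could introduce between targets.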