Accelerate Your HDK Project with CUDA

Accelerate Your HDK Project with CUDA


Did you ever wonder how we can render volumes? Simulate sub surface scattering without a sub surface? After all the geometry we are working are nothing more than polygons connected with vertices. They don’t have hidden molecules made of bits inside them.

Until last year I too wasn’t really conserned about these questions, and they actually may not be worth considering if all you want a nice picture. But as always i decided to look under the hood and begins my journey as a photon in “Light Transport”. 

The first book I’ve read was of course “Physically Based Rendering” by Matt Phar ( But the problem was, I was reading it like reading a novel and couldn’t really wrap my head around it. Then I started venturing in the realm of volume rendering and read articles by Pixar and Solidangle. But something was still wrong as I was missing an essential ingredient in understanding the concept. That missing thing was the mathematical basics of the theorems that made up ray tracing possible. Things like ray-sphere intersection, orthonormal basis  and most importantly Mote Carlo integration.  

Then recently I stumbled upon a blog by Peter Shirley named “Ray Tracing in One Weekend” ( promising to teach the very basics of ray tracing in a weekend. Decided to take a look into it and i was sold. Implemented in a Qt5 gui in a couple days. Then I realized that I had another repo lying in my github page that i intended to do volumetric path tracing in Houdini SOP level. I dropped the pbrt implementation and decided to go with Peter Shirley’s much simpler approach.

All was good and well but a tiny problem: It was slow!! Then I saw the Cuda implementation of “ray tracing in one weekend” by Nvidia and decided that this could speed things up (And it actually did: I’ve gone up from ~12s for a frame to 3.6 fps )

Now the only thing that remains is how to link hcustom with cuda code. 

Cuda kernels are generally put in a file that ends with *.cu (or *.cuh) and compiled by nvcc, and we have our plugin entrance files which contains cookmysop() operator. To link these together I’ve used hcustom’s linking ability.  

After compiling the file with nvcc, I’ve created a build.bat (I’m on windows) that would link the generated render_kernels.o object file with the main cpp file.  The build. bat is as follows:

hcustom -E -i .\x64\Release -l cudart_static.lib -l render_kernels.obj -L "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\lib\x64" -L ".\x64\Release" -I "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\include" renderHub.cpp

hcustom -l renderHub.o -l cudart_static.lib -l render_kernels.obj -L "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\lib\x64" -L ".\x64\Release" -I "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\include" SOP_vpt.cpp

As you can see I’ve also created a renderHub.cpp file to maintain data io between the node and cuda kernels. The rest is to link the kernels with this file first, and then to link all of it with main file.   

That pretty much sums up the procedure you can follow to integrate cuda to your current workflow. The better way would be to use cmake to automate the process but I’ve had some trouble with nvcc complaining about stuff and didn’t look into it much further. I will write up if i get some progress. 

The github repo for SOP level ray tracer is here: and the one with the Qt5 gui is located here:

Thank you for stopping by.