Final Report

Abstract

In this project, I built a proof-of-concept program to determine whether ray tracing can be performed in a shader program running on the GPU. Real-time performance was achieved for smaller scenes at the cost of visual quality.

Technical approach

Techniques used

I started my final project by combining Project 4 (Cloth Simulation) and Project 3 (Ray Tracing). I wanted to target OpenGL 3.3 Core so that even older computers could run the GPU ray tracer. Since Project 4 already uses OpenGL 3.3 with shaders, my initial plan was to base the final project on Project 4 and add features from Project 3. I ended up doing the opposite, basing my code on Project 3 and adding code from Project 4, because that made it easier to get a successful build. I also had to figure out how the CMake build system worked through trial and error.

Once I had a successful build, I added dynamic shader reloading so that I did not have to restart the program every time I changed the shader code. This sped up the shader-writing part of development.

My original plan was to convert the Bounding Volume Hierarchy (BVH) to a GPU-friendly format and upload it to the GPU so that BVH traversal could run inside the shader. Because this part carried the risk of unforeseen problems and might need more time than estimated, I decided to first convert and transfer the primitives and lights and port the ray tracing code from Project 3 into the fragment shader. That way I could still get visual results even without the BVH.

To convert primitives to a GPU-friendly format, I added a method to the Triangle class that packs the triangle's vertex positions, vertex normals, and material information into a vector of floats. Spheres use a similar method that packs the sphere's origin and radius instead. To tell triangles and spheres apart, an additional struct member signifies which shape is packed. To upload all the primitives to the GPU, an uploading method iterates over the primitives, calls the convert method on each one, and appends the resulting floats to the end of one large vector containing all the primitives processed so far. Finally, the large vector of floats is uploaded as a Texture Buffer Object (TBO) using some example code from GitHub [1]. The code allocates a new buffer and texture from OpenGL and calls glBufferData to place the vector of floats in the new buffer that is managed by OpenGL [1]. Then the texture is bound to the buffer by calling glTexBuffer [1]. Since the texture is bound to GL_TEXTURE0 using glActiveTexture, the shader uniform for the texture is set to 0 [1]. Inside the shader, the data is read with texelFetch, which returns a vector of 4 floats at a given offset in the texture [2]. I wrote a shader function that wraps several texelFetch calls so that I can retrieve an entire primitive from the texture given the offset of the primitive. Each primitive is ultimately used in intersection testing.
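The packing step described above can be sketched on the CPU side roughly as follows. This is a minimal illustration and not the project's actual code: the fixed stride of 7 texels per primitive, the `albedo` stand-in for material data, the position of the shape tag, and all names (`push_texel`, `pad_to_stride`, `pack_scene`) are assumptions made for the example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical fixed stride: every primitive occupies 7 vec4 texels, so
// primitive i starts at texel i * kTexelsPerPrimitive in the buffer texture.
constexpr std::size_t kTexelsPerPrimitive = 7;
constexpr std::size_t kFloatsPerPrimitive = kTexelsPerPrimitive * 4;

// Shape tags are stored as floats because the whole buffer is float data.
constexpr float kShapeTriangle = 0.0f;
constexpr float kShapeSphere = 1.0f;

struct Vec3 { float x, y, z; };

// Appends one vec4-sized texel to the output buffer.
inline void push_texel(std::vector<float>& out, float a, float b, float c, float d) {
    out.insert(out.end(), {a, b, c, d});
}

// Zero-pads the primitive that began at index `start` up to the fixed stride.
inline void pad_to_stride(std::vector<float>& out, std::size_t start) {
    out.resize(start + kFloatsPerPrimitive, 0.0f);
}

struct Triangle {
    Vec3 p[3];    // vertex positions
    Vec3 n[3];    // vertex normals
    Vec3 albedo;  // stand-in for the material information (assumption)

    void pack(std::vector<float>& out) const {
        const std::size_t start = out.size();
        push_texel(out, kShapeTriangle, albedo.x, albedo.y, albedo.z);
        for (const Vec3& v : p) push_texel(out, v.x, v.y, v.z, 0.0f);
        for (const Vec3& v : n) push_texel(out, v.x, v.y, v.z, 0.0f);
        pad_to_stride(out, start);  // no-op here: a triangle fills all 7 texels
    }
};

struct Sphere {
    Vec3 center;
    float radius;
    Vec3 albedo;  // stand-in for the material information (assumption)

    void pack(std::vector<float>& out) const {
        const std::size_t start = out.size();
        push_texel(out, kShapeSphere, albedo.x, albedo.y, albedo.z);
        push_texel(out, center.x, center.y, center.z, radius);
        pad_to_stride(out, start);  // fills the 5 unused texels with zeros
    }
};

// Concatenates every packed primitive into one large float vector.
std::vector<float> pack_scene(const std::vector<Triangle>& triangles,
                              const std::vector<Sphere>& spheres) {
    std::vector<float> data;
    data.reserve((triangles.size() + spheres.size()) * kFloatsPerPrimitive);
    for (const Triangle& t : triangles) t.pack(data);
    for (const Sphere& s : spheres) s.pack(data);
    return data;
}
```

With a fixed stride like this, the vector returned by `pack_scene` could be passed to glBufferData and exposed to the shader through glTexBuffer with a four-component float format, so that each texelFetch returns one of the texels written by `push_texel`. A variable-length layout would save memory for spheres but would require storing per-primitive offsets.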

To convert lights to a GPU-friendly format, a process similar to converting primitives is used. A method converts each light into a struct that is placed into one large array of light structs. Each struct contains the light’s type, position, direction, radiance, dimensions, and whether it is a point light or not. The array of light structs is uploaded as a Uniform Buffer Object (UBO). At first, I thought I would need raw OpenGL calls to set up the UBO upload manually, but by looking at the NanoGui header file for its shader interface I discovered that it already has a class that can manage the buffer for the UBO. So I created an instance of the class, pushed each light struct into it, and then passed it to the shader’s setuniform method. On the shader side, the UBO is accessed like a regular array (e.g. u_lights[i] for the i’th light).
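As a rough illustration of the light conversion, the sketch below packs each light into a record made only of vec4-sized members. It assumes the uniform block uses the std140 layout, under which a struct of four vec4 members has a 64-byte array stride that matches the C++ struct exactly. The field grouping, the type tags, and the names `Light`, `GpuLight`, `to_gpu`, and `pack_lights` are hypothetical and are not taken from the project or from NanoGui.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical light type tags, stored as floats alongside the position.
constexpr float kLightPoint = 0.0f;
constexpr float kLightArea = 1.0f;

// CPU-side light description, loosely following the fields named in the text.
struct Light {
    float type;
    float position[3];
    float direction[3];
    float radiance[3];
    float width, height;  // area-light dimensions (assumed to be 2D)
    bool is_point;
};

// GPU-side record built only from vec4-sized members, so the C++ layout
// matches an std140 array element without any hidden padding.
struct GpuLight {
    float position_type[4];    // xyz = position,  w = light type tag
    float direction_point[4];  // xyz = direction, w = 1.0 if point light
    float radiance[4];         // rgb = radiance,  w unused
    float dimensions[4];       // xy = width and height, zw unused
};
static_assert(sizeof(GpuLight) == 64, "GpuLight must be four tightly packed vec4s");

// Converts one CPU-side light into its GPU record.
GpuLight to_gpu(const Light& l) {
    GpuLight g{};
    for (int i = 0; i < 3; ++i) {
        g.position_type[i] = l.position[i];
        g.direction_point[i] = l.direction[i];
        g.radiance[i] = l.radiance[i];
    }
    g.position_type[3] = l.type;
    g.direction_point[3] = l.is_point ? 1.0f : 0.0f;
    g.dimensions[0] = l.width;
    g.dimensions[1] = l.height;
    return g;
}

// Builds the array that would back the uniform buffer.
std::vector<GpuLight> pack_lights(const std::vector<Light>& lights) {
    std::vector<GpuLight> out;
    out.reserve(lights.size());
    for (const Light& l : lights) out.push_back(to_gpu(l));
    return out;
}
```

Folding the scalar fields into the unused w components avoids the alignment surprises that vec3 members can cause in uniform blocks, at the cost of slightly less readable shader code.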

I decided to use a Uniform Buffer Object (UBO) for lights and a Texture Buffer Object (TBO) for primitives because the two types of buffers have different properties [3]. UBOs are good for small amounts of sequential data [3]. Since lights are few and the shader iterates sequentially through all the lights to do importance sampling, a UBO is a good fit for lights. TBOs, on the other hand, offer a larger amount of storage and are better for random access [3]. Since the number of primitives can be very large and the shader does not necessarily visit the primitives in order, especially with a BVH traversal implementation, a TBO is a good fit for primitives.

The next step was porting the ray tracer code from Project 3 into GLSL shader code. The process was not too difficult since GLSL's syntax is close to that of C++. I needed to rename types (e.g. Vector3D to vec3, Matrix3x3 to mat3x3) and convert the object-oriented form of the code to a more functional form since GLSL does not have classes. I accomplished this by writing functions that take the struct they belong to as their first parameter, much like the hidden this parameter in the methods of C++ classes. One of the functions I was porting was a recursive ray tracing function, which I needed to convert to an iterative form since GLSL shaders do not allow recursion. I tried to convert the function on my own at first but found it too difficult. I ended up basing the new iterative function on a similar ray tracing function from the textbook Physically Based Rendering: From Theory to Implementation, Third Edition, Section 14.5.4 [4]. I modified the function slightly because the textbook version accounts for the radiance from the light hitting the camera when the ray depth is 0, whereas in my implementation that radiance is accounted for outside of the function. The textbook function also has some lines of code that account for reflection and refraction through glass, which I removed since those were not implemented in my shader ray tracer.
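The core idea behind the recursive-to-iterative conversion can be shown on a toy model. In the sketch below, a path is reduced to a list of hits, where each hit contributes some radiance and attenuates everything that arrives from deeper bounces. The `Hit` struct and both function names are hypothetical; in a real path tracer `emitted` would correspond to the direct lighting estimate at the hit and `albedo` to the BSDF value times the cosine term divided by the sampling pdf. The loop carries a running throughput (called `beta` in the textbook) instead of relying on the call stack.

```cpp
#include <cstddef>
#include <vector>

// Toy stand-in for one bounce of a path (assumption, not the project's types).
struct Hit {
    float emitted;  // radiance contributed at this hit
    float albedo;   // fraction of deeper radiance carried back through this hit
};

// Recursive form: radiance at this depth is the local contribution plus the
// attenuated radiance returned by the next bounce.
float radiance_recursive(const std::vector<Hit>& path, std::size_t depth) {
    if (depth >= path.size()) return 0.0f;
    const Hit& h = path[depth];
    return h.emitted + h.albedo * radiance_recursive(path, depth + 1);
}

// Iterative form: the same sum, accumulated front to back while tracking how
// much of each later contribution still reaches the camera.
float radiance_iterative(const std::vector<Hit>& path) {
    float L = 0.0f;     // accumulated radiance
    float beta = 1.0f;  // path throughput: product of all albedos so far
    for (const Hit& h : path) {
        L += beta * h.emitted;
        beta *= h.albedo;
    }
    return L;
}
```

For the path {1, 0.5}, {2, 0.5}, {4, 0} both forms evaluate to 1 + 0.5 * (2 + 0.5 * 4) = 3. The iterative form needs only two running variables per path, which is what makes it usable in a language without recursion.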

Finally, I needed a way to generate random numbers for the ray tracing process. I researched various ways to generate random numbers inside a shader. On Stack Exchange I found an example that combines a Linear Congruential Generator with a Tausworthe (Taus) generator to produce pseudorandom numbers from a seed [5]. On the CPU side, I generate four random numbers for use as a seed, which is passed into the shader as a uniform. Since that seed is the same for all pixels, every pixel would generate the same random numbers. To fix this, a per-pixel seed is added to the passed-in seed as shown in another Stack Exchange post [6]. However, this was not enough: the visual output contained vertical and horizontal lines because the per-pixel seeds were too strongly correlated. After more research I found another website explaining that the seed should be hashed to generate better quality random numbers [7]. The website provides a hash function [7], which I used, resulting in much better visual output without the lines.
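The seeding scheme described above might look roughly like the following when written out in C++. This is a sketch under assumptions rather than the shader code itself: the hash is the Wang hash as it is commonly published, the generator uses the widely circulated constants for a hybrid of three Tausworthe steps and one LCG step, and the way the pixel index is folded into the seed (`make_pixel_rng`) is hypothetical.

```cpp
#include <cstdint>

// Wang hash (as commonly published): scrambles correlated inputs such as
// neighbouring pixel indices into well-spread 32-bit values.
inline std::uint32_t wang_hash(std::uint32_t seed) {
    seed = (seed ^ 61u) ^ (seed >> 16);
    seed *= 9u;
    seed = seed ^ (seed >> 4);
    seed *= 0x27d4eb2du;
    seed = seed ^ (seed >> 15);
    return seed;
}

// One Tausworthe step; updates the state component z and returns it.
inline std::uint32_t taus_step(std::uint32_t& z, int s1, int s2, int s3, std::uint32_t m) {
    const std::uint32_t b = ((z << s1) ^ z) >> s2;
    z = ((z & m) << s3) ^ b;
    return z;
}

// One linear congruential step; updates the state component z and returns it.
inline std::uint32_t lcg_step(std::uint32_t& z, std::uint32_t a, std::uint32_t c) {
    z = a * z + c;  // unsigned arithmetic wraps modulo 2^32
    return z;
}

// Hybrid generator: three Tausworthe components XORed with one LCG component.
struct HybridTaus {
    std::uint32_t z1, z2, z3, z4;

    // Returns a pseudorandom float in [0, 1].
    float next() {
        const std::uint32_t t1 = taus_step(z1, 13, 19, 12, 4294967294u);
        const std::uint32_t t2 = taus_step(z2, 2, 25, 4, 4294967288u);
        const std::uint32_t t3 = taus_step(z3, 3, 11, 17, 4294967280u);
        const std::uint32_t l = lcg_step(z4, 1664525u, 1013904223u);
        return 2.3283064365387e-10f * static_cast<float>(t1 ^ t2 ^ t3 ^ l);
    }
};

// Hypothetical per-pixel seeding: add the pixel index to each component of
// the per-frame seed, then hash so neighbouring pixels get unrelated states.
inline HybridTaus make_pixel_rng(const std::uint32_t frame_seed[4],
                                 std::uint32_t px, std::uint32_t py,
                                 std::uint32_t width) {
    const std::uint32_t pixel = py * width + px;
    return HybridTaus{wang_hash(frame_seed[0] + pixel),
                      wang_hash(frame_seed[1] + pixel),
                      wang_hash(frame_seed[2] + pixel),
                      wang_hash(frame_seed[3] + pixel)};
}
```

Without the hash, adjacent pixels start from states that differ only by 1, and the first few outputs of the generator remain visibly correlated across rows and columns. Hashing decorrelates those starting states before the generator is advanced.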

Due to project time constraints, BVH conversion, uploading, and traversal were not implemented.

Problems Encountered and Solutions

Lessons learned

Results

Direct illumination of CBspheres_lambertian
Global illumination of CBspheres_lambertian
Global illumination of CBgems
Video of moving around the CBspheres_lambertian scene while ray tracing in real time. The noise in the video is due to the random sampling in ray tracing and the low sample rate of 4 samples per pixel.

References

1: https://gist.github.com/roxlu/5090067
2: https://www.khronos.org/opengl/wiki/Buffer_Texture
3: http://rastergrid.com/blog/2010/01/uniform-buffers-vs-texture-buffers/
4: http://www.pbr-book.org/3ed-2018/Light_Transport_I_Surface_Reflection/Path_Tracing.html#Implementation
5: https://math.stackexchange.com/a/340028
6: https://gamedev.stackexchange.com/a/164659
7: http://www.reedbeta.com/blog/quick-and-easy-gpu-random-numbers-in-d3d11/
8: https://community.khronos.org/t/how-to-crash-a-glsl-shader/62192/3
9: https://community.khronos.org/t/intensive-shaders-1-second-per-primitive/60537
10: https://renderdoc.org/
11: https://www.khronos.org/registry/OpenGL/specs/gl/GLSLangSpec.3.30.pdf

Contributions From Each Team Member

I am the only member of the team and did all of the work for the final project.