**CS112 PA3: Accelerated Ray Tracing** ***Due: Monday, November 30 by 11:59 pm (Pacific time)*** Downloading and Compiling Nori ======================================== For this assignment, we will be using the [Nori educational renderer](https://wjakob.github.io/nori/). Nori is a minimalistic ray tracer written in C++. It runs on all major operation systems including Windows, Linux, and Mac OS. While Nori provides much support code to simplify your development work as much as possible, the code does very little: it loads a scene and saves a rendered image as an OpenEXR image ut the actual rendering code is missing, hence the output image just consists of black pixels. We provide two options for setting up Nori: - You can choose to [compile and build the source code natively](./pa3_compile_base_code.md.html). - Alternatively, your can use Nori within a [virtual machine](./pa3_vm_setup.md.html) (that runs Linux), where the codebase has already been downloaded and can be easily compiled. To further familiarize yourself with Nori, check [this document](./pa3_nori_tutorial.md.html). Ray Tracing Acceleration Data Structures ==================================================== In this assignment, you get to build your own implementation of an octree, a hierarchical data structure that enables fast ray intersection computations involving very large numbers of triangles ![](./images/octree.png) Each node of an octree covers an axis-aligned region of space denoted by two points $p_{min}$ and $p_{max}$. An interior node has exactly eight child nodes that partition this space into eight equal-sized subregions. The region of space covered by the *i*-th child of an octree-node is defined as the bounding box containing both the center and the *i*-th corner of its parent (using a suitable ordering of the corners). In the context of rendering, our goal will be to construct a tree where each leaf node only contains a small number of triangles (e.g. less than 10). At render time, this will allow most ray intersections to be pruned away since any ray will only touch a small number of octree nodes. An efficient ray intersection algorithm will then only traverse this subset of nodes until the closest intersection is found. For this homework, we will be using the scene **`scenes/pa3/ajax-normals.xml`**. This scene references the 3D scan of a bust (**`ajax.obj`**) that is fairly large (~500K triangles). Try rendering this new scene. You will find that rendering is very slow. The reason for this is that the Nori basecode currently implements a brute force ray intersection algorithm found in the files **`include/nori/accel.h`** and **`src/accel.cpp`**. Our main concern is the function ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~c++ bool rayIntersect(const Ray3f &ray, Intersection &its, bool shadowRay) const; ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ which traces a ray **`ray`** against all triangles and stores the resulting information in special **`Intersection`** data structure. The function returns **true** if an intersection was found and **false** otherwise. The parameter **`shadowRay`** signals to the implementation whether a shadow ray is being traced. Tracing shadow rays is cheaper since we only care about whether a ray has *any* intersections at all, rather than searching the closest one. Furthermore, the **`Intersection`** data structure does not need to be filled for shadow rays. Currently, the problematic part of the implementation looks as follows: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~c++ /* Brute force search through all triangles */ for (uint32_t idx = 0; idx < m_mesh->getTriangleCount(); ++idx) { float u, v, t; if (m_mesh->rayIntersect(idx, ray, u, v, t)) { /* An intersection was found! Can terminate immediately if this is a shadow ray query */ if (shadowRay) return true; ray.maxt = its.t = t; its.uv = Point2f(u, v); its.mesh = m_mesh; f = idx; foundIntersection = true; } } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In the above snippet, *u* and *v* denote the barycentric coordinates of the intersection (if it exists), and *t* is the distance traveled along the ray. Part 1: Octree construction (50 %) ---------------------------------------------- Clearly, traversing 500K triangles for every ray in an image is not going to work, so we'll need a way to organize them more intelligently. Your objective for the first part of this assignment is to develop a tree construction algorithm that organizes the triangle data of the scene's meshes into an octree. At a high level, your code should look something like this: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~c++ Node *build(bounding box, triangles) { if (no triangles) return nullptr; if (only few triangles) return new leaf node with triangles; triangle_list list[8]; for (every triangle) { for (int i = 0; i < 8; ++i) { if (triangle overlaps sub-node i) add to list[i]; } } Node *node = new Node(); for (int i = 0; i < 8; ++i) node.child[i] = build(bounding box of sub-node i, list[i]); return node; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The specifics of the implementation (naming conventions, memory allocation, choice of auxiliary data structures) are completely up to you. We'll define "few triangles" as less than 10. For simplicity, you can assume that a triangle overlaps with an octree cell if the bounding box of the triangle's vertices overlap with the cell. Hint: Take a good look at **include/nori/bbox.h**. You will find that it provides a number of functions that might come quite handy. You may need to limit the depth of the tree to avoid pathological situations where a triangle list can never be split into less than 10 elements. Your construction code will invariably need to create a large number of octree nodes, which will in turn occupy a considerable amount of memory. A more compact representation that uses less memory per node will be more efficient when tracing rays at render time, since more of the tree will fit into the processor's cache. Think about what information truly needs to be stored, and what information is redundant and can be recomputed on the fly if needed later on. Part 2: Ray traversal (25 %) ---------------------------------------------- Having completed the acceleration data structure, the next step is to actually use it to trace rays. Modify the function ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~c++ bool rayIntersect(const Ray3f &ray, Intersection &its, bool shadowRay) const; ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ so that it performs a hierarchical traversal of the tree, which only visits octree nodes that overlap with the ray extents (in any order). Once more, take a good look at functionality that is already provided in **`include/nori/bbox.h`**. Part 3: Improved ray traversal (25 %) ---------------------------------------------- Modify your original implementation so that it performs an ordered traversal of the child cells of an octree node from *closest* to *farthest*, where distance is defined as the ray distance *t* of the closer intersection between the ray and the bounding box of a child node. The function **`std::sort`** may be helpful. Note: when an intersection was previously found at a distance $t_1$, it is unnecessary to traverse a child cell with distance $t_2 > t_1$. This can be used to prune away additional parts of the tree. !!! note **Disclaimer**: An efficient ray tracing implementation is essential in modern rendering systems, and considerable care is taken to ensure that this operation runs as quickly as possible. A general rule of thumb is that certain expensive C++ constructs should never be used while tracing a ray (i.e. during octree traversal in your case). This includes operations that perform dynamic memory allocation (**`malloc`**, the new operator, or STL data structures that internally perform dynamic memory allocation, such as **`std::vector`**). Note that this doesn mean that you can use **`std::vectors`** in your code--in fact, you will find many of them in existing Nori code. However, a **`std::vector`** should never be created "per ray" inside performance-critical ray traversal code. Bonus: Hacker Points (extra 10 %) ---------------------------------------------- !!! error **Disclaimer**: Hacker points are nderpriced bonus points for the daring few. Sometimes you might be required to implement something that was not taught in class and you might have to do some research and creative thinking. Hacker Points are awarded only to students who implemented all of the remaining assignment, and they are added to the final score. While the octree greatly accelerates ray tracing, its construction unfortunately requires a considerable amount of time. What's worse is that all of this time is spent in serial computations that just use a single processor. However, the octree construction can clearly be parallelized: each subtree is independent of all others and can be processed by an separate thread. Find out how to use the Thread Building Blocks (TBB) library to benefit from this kind of parallelism and realize this in your construction code. Submission ======================================== Two files to submit on Gradescope: - A header file: **`accel.h`** in `include/nori/`; - A cpp file: **`accel.cpp`** in `src/`. Also, remember to put your name and student ID in both `accel.h` and `accel.cpp` files. ![](./images/submission.png width=300)