tag:blogger.com,1999:blog-57437516550599418582024-03-05T13:16:37.877+03:00A GPU-Based Path TracerMustafa Işıkhttp://www.blogger.com/profile/08086920856262102562noreply@blogger.comBlogger9125tag:blogger.com,1999:blog-5743751655059941858.post-79693324986551143372017-06-16T16:53:00.004+03:002017-06-17T16:31:01.291+03:00Subsurface ScatteringI chose to implement subsurface scattering for the term project of the course. My first intention was to implement one of the diffusion approximation algorithms. Although they are much faster than simulating the radiative transfer equation with volumetric path tracing, they either do not work with arbitrary geometry or need a preprocessing step, which I found cumbersome. Therefore, I implemented a volumetric path tracer that simulates subsurface scattering. The most challenging part for me was finding appropriate absorption and scattering coefficients for the tests. Thankfully, the series of outputs below shows the effect of scattering inside the medium clearly. I also captured a video to show the progressive feature of my path tracer.<br />
<br />
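The two sampling steps at the heart of a volumetric path tracer can be sketched as follows. This is only a minimal sketch under assumptions: the function names are hypothetical, the medium is assumed homogeneous, and the Henyey-Greenstein phase function is assumed because the materials below list an anisotropy value; the post itself does not name the phase function it uses.<br />

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helpers for illustration; not the renderer's actual code.

// Sample the distance to the next interaction in a homogeneous medium with
// extinction coefficient sigma_t (absorption + scattering), using a uniform
// random number u in [0, 1). Distances follow p(t) = sigma_t * exp(-sigma_t * t).
inline double sampleFreeFlightDistance(double sigma_t, double u) {
    assert(sigma_t > 0.0 && u >= 0.0 && u < 1.0);
    return -std::log(1.0 - u) / sigma_t;
}

// Henyey-Greenstein phase function (assumed model). cosTheta is the cosine of
// the angle between the propagation direction before and after scattering;
// g in (-1, 1) is the anisotropy, with g > 0 favoring forward scattering.
inline double henyeyGreenstein(double cosTheta, double g) {
    const double kPi = 3.14159265358979323846;
    const double denom = 1.0 + g * g - 2.0 * g * cosTheta;
    return (1.0 - g * g) / (4.0 * kPi * denom * std::sqrt(denom));
}

// Draw cosTheta with density proportional to the Henyey-Greenstein function,
// given a uniform random number u in [0, 1].
inline double sampleHenyeyGreensteinCos(double g, double u) {
    if (std::fabs(g) < 1e-3) {
        return 1.0 - 2.0 * u;  // nearly isotropic: uniform in cosTheta
    }
    const double s = (1.0 - g * g) / (1.0 - g + 2.0 * g * u);
    return (1.0 + g * g - s * s) / (2.0 * g);
}
```

With a scattering coefficient of 50, the mean free path is only 1/50 of a scene unit, which is why paths inside such a medium scatter many times before leaving it.<br />
<br />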
Additionally, I changed the scene input format quite a bit. I used <a href="http://assimp.sourceforge.net/" target="_blank">assimp</a> to load the 3D models. Since it supports many of the popular file formats, scene design is much more flexible than before. An example scene file is <a href="https://files.fm/u/hj6aj6zz" target="_blank">here</a>.<br />
<br />
Here are the outputs for the same scene with different material settings that my path tracer supports.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjDo_eyyue_E1ANwLmrnaB26UXcDMPM5MvjaFTtffYR6D7pgjwmW3F2cQEqvUoeJe5nqTHlViHcppXdfsbiOf5dQ_0cJO7ctpp0IdMdffWiczpcNGww0vcMM0aeeuOAI_PLVEqdiWRrY3Z1/s1600/buddha_diffuse_2109.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" data-original-height="800" data-original-width="800" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjDo_eyyue_E1ANwLmrnaB26UXcDMPM5MvjaFTtffYR6D7pgjwmW3F2cQEqvUoeJe5nqTHlViHcppXdfsbiOf5dQ_0cJO7ctpp0IdMdffWiczpcNGww0vcMM0aeeuOAI_PLVEqdiWRrY3Z1/s400/buddha_diffuse_2109.png" title="buddha_lambertian.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Lambertian Material<br />
kd: (1.0, 1.0, 1.0)<br />
2109 samples</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiUClrUV3gMhfv8ZFB0UDZjC1Sn5-uTtKA5RQOoZtR4tcm-3iuPTUqy4Svn8oy3j9OcQQ7awVzVScCzw8GxO2fdF7-qz5oNjcy_PzSstGah7LSfukRettQusbKFPDQ-nG10EnFMsCZhRqmd/s1600/_1785_buddha.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" data-original-height="800" data-original-width="800" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiUClrUV3gMhfv8ZFB0UDZjC1Sn5-uTtKA5RQOoZtR4tcm-3iuPTUqy4Svn8oy3j9OcQQ7awVzVScCzw8GxO2fdF7-qz5oNjcy_PzSstGah7LSfukRettQusbKFPDQ-nG10EnFMsCZhRqmd/s400/_1785_buddha.png" title="buddha_specular.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Perfect Specular Material<br />
ks: (1.0, 1.0, 1.0)<br />
1785 samples</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhCymyHRUWUv5C7f1ESdf52fxYop2ckIW2q8ZM0Svr9DfSbEmHaUEsAMNhckuyHgwSkXkQko9Zm6A5d1kgxJUUeo-Lyl7lKCqI-zkLVh2_35vIYYoqr212g9yCEVgQIXiO4fQL9ytDrsp96/s1600/buddha_refractive_2470.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" data-original-height="800" data-original-width="800" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhCymyHRUWUv5C7f1ESdf52fxYop2ckIW2q8ZM0Svr9DfSbEmHaUEsAMNhckuyHgwSkXkQko9Zm6A5d1kgxJUUeo-Lyl7lKCqI-zkLVh2_35vIYYoqr212g9yCEVgQIXiO4fQL9ytDrsp96/s400/buddha_refractive_2470.png" title="buddha_refractive.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Perfect Refractive Material<br />
tintcolor: (0.79, 0.53, 0.79)<br />
tintdistance: 0.2<br />
index of refraction: 2.0<br />
2470 samples</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiRLTlewvoPeXoxbpL5CmIFckDCq4tJtovKqZvpYIHBfN30jW1GtK9Hfol36VHgXMPY4Rai3wK_TsbuYath_Y2jRDZoE0K2_1vZyZC_LTNosYoT5zzQnfAo-ZG7VYwQjzVNOc7-FYANAnXL/s1600/buddha_translucent1_1015.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" data-original-height="800" data-original-width="800" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiRLTlewvoPeXoxbpL5CmIFckDCq4tJtovKqZvpYIHBfN30jW1GtK9Hfol36VHgXMPY4Rai3wK_TsbuYath_Y2jRDZoE0K2_1vZyZC_LTNosYoT5zzQnfAo-ZG7VYwQjzVNOc7-FYANAnXL/s400/buddha_translucent1_1015.png" title="buddha_translucent_1.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Translucent Material<br />
tintcolor: (0.79, 0.53, 0.79)<br />
tintdistance: 0.2<br />
index of refraction: 2.0<br />
scattering coefficient: 50<br />
anisotropy: 0.7<br />
1015 samples</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi93Y36Dmw-MlcQJKcK9irz7UG-vNjEzpUlsmVZ3kcWouHiyrm7t9ivegRMhNpPKuq8Y0vWyWyXf9swuSGgwUpo1sTOxEqj5wEm91M0p7ClatOVjcWccyKuKLiS2v3ViXNl8eVG_v45DSYF/s1600/_3311_buddha.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" data-original-height="800" data-original-width="800" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi93Y36Dmw-MlcQJKcK9irz7UG-vNjEzpUlsmVZ3kcWouHiyrm7t9ivegRMhNpPKuq8Y0vWyWyXf9swuSGgwUpo1sTOxEqj5wEm91M0p7ClatOVjcWccyKuKLiS2v3ViXNl8eVG_v45DSYF/s400/_3311_buddha.png" title="buddha_translucent_2.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Translucent Material<br />tintcolor: (0.0, 0.8, 0.0)<br />tintdistance: 0.1<br />index of refraction: 1.5<br />scattering coefficient: 100<br />anisotropy: 0.7<br />3311 samples</td></tr>
</tbody></table>
<div style="text-align: center;">
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg21OLTw3VuZWNP5kXsqcLPwtcw7qyYJvGvM2Ch-h1lQnTMe5ULqlZaKqyvnw54unN-ojZtqwllRuo2ppgLcG3XLOb-Mg2IRoSqocrL8LNG_S9jTCR4tobUbWVZyYGEfB-bTSEPQS2bF0ig/s1600/buddha_translucent3_4900.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" data-original-height="800" data-original-width="800" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg21OLTw3VuZWNP5kXsqcLPwtcw7qyYJvGvM2Ch-h1lQnTMe5ULqlZaKqyvnw54unN-ojZtqwllRuo2ppgLcG3XLOb-Mg2IRoSqocrL8LNG_S9jTCR4tobUbWVZyYGEfB-bTSEPQS2bF0ig/s400/buddha_translucent3_4900.png" title="buddha_translucent_3.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Translucent Material<br />
tintcolor: (0.8, 0.8, 0.8)<br />
tintdistance: 0.2<br />
index of refraction: 1.0<br />
scattering coefficient: 150<br />
anisotropy: 0.7<br />
4900 samples</td></tr>
</tbody></table>
</div>
<div style="text-align: center;">
Finally, here is a video to demonstrate the progressive feature.</div>
<div class="separator" style="clear: both; text-align: center;">
<iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/jERfvtd5YGs/0.jpg" frameborder="0" height="266" src="https://www.youtube.com/embed/jERfvtd5YGs?feature=player_embedded" width="320"></iframe></div>
Mustafa Işıkhttp://www.blogger.com/profile/08086920856262102562noreply@blogger.com1tag:blogger.com,1999:blog-5743751655059941858.post-55146871215725018972017-06-05T03:52:00.001+03:002017-06-06T20:50:00.156+03:00Assignment 9: Monte Carlo Path TracingSo far, we have not considered any radiance arriving at the surface interaction points from indirect sources. The model we have used could not simulate global illumination, so we had to fake it by adding an ambient term. Even if the ambient term is adjusted to a reasonable level, we cannot simulate phenomena such as color bleeding and caustics, which can only be simulated correctly by global illumination algorithms. Therefore, implementing a global illumination algorithm is a must to produce more realistic images.<br />
<div>
<br /></div>
<div>
In this assignment, we are required to solve <a href="https://www.cs.rpi.edu/~cutler/classes/advancedgraphics/S11/papers/kajiya_rendering_equation_86.pdf" target="_blank">the rendering equation</a> by using Monte Carlo techniques. Producing an unbiased image with low variance is not a simple task and has been a research topic since the early days of this field. Thankfully, there are great resources offering several variance reduction techniques that reduce noise and speed up convergence. Here are some of the resources I have been using:</div>
<div>
<ul>
<li><a href="https://graphics.stanford.edu/papers/veach_thesis/thesis-bw.pdf" target="_blank">Robust Monte Carlo Methods for Light Transport Simulation</a>: The must-read resource for gaining insight into the Monte Carlo methods used to solve the light transport equation.</li>
<li><a href="https://www.amazon.com/Advanced-Global-Illumination-Second-Philip/dp/1568813074" target="_blank">Advanced Global Illumination, Second Edition</a>: An amazing book that takes you from the very beginning to the end by covering almost all related concepts.</li>
<li><a href="https://www.amazon.com/Physically-Based-Rendering-Third-Implementation/dp/0128006455/ref=sr_1_1?s=books&ie=UTF8&qid=1496619547&sr=1-1&keywords=Physically+Based+Rendering%2C+Third+Edition" target="_blank">Physically Based Rendering, Third Edition</a>: This book also has excellent chapters on global illumination and Monte Carlo methods.</li>
</ul>
<div>
<br /></div>
<div>
After reading and learning a lot from these resources, I started to implement my path tracer. Although my path tracer is very simple and uses only a few of the techniques that I've learned, I am planning to implement the rest in time. For now, my path tracer has the following features:</div>
</div>
<div>
<ul>
<li>Progressive path tracing: It is great to see your images converge in real time and to move around the scene you created. For most of the scenes, I get interactive frame rates.</li>
<li>Cosine-weighted importance sampling: It is the only variance reduction strategy I used for the indirect lighting calculation. In the future, however, importance sampling for the BSDF term seems inevitable.</li>
<li>Explicit light sampling: At every bounce, I take samples from each light source. If many light sources are involved, we may instead choose one of them uniformly or according to a pdf defined by some criterion, for example, the power of the lights.</li>
<li>Smooth dielectrics: Perfectly reflective and refractive surfaces are easy to implement if you have already implemented them in your ray tracer. Since they have delta distributions, only a single path is meaningful to follow, so everything is the same as in our old ray tracer. However, we do not send both reflected and refracted rays for refractive surfaces, since recursion is an overhead in general and is especially harmful to the performance of CUDA programs. Instead, we choose one of them with a probability determined by the Fresnel equations, which I talked about in previous posts.</li>
</ul>
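<div>
Cosine-weighted sampling from the list above can be sketched as below. This is a minimal sketch, not the renderer's actual code: the names are hypothetical, and the direction is expressed in a local frame whose z axis is assumed to be the surface normal.</div>

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };  // hypothetical minimal vector type

// Draw a direction on the hemisphere around +z with density cos(theta) / pi,
// from two uniform random numbers u1, u2 in [0, 1]. A point is sampled
// uniformly on the unit disk and then projected up onto the hemisphere.
inline Vec3 sampleCosineHemisphere(double u1, double u2) {
    assert(u1 >= 0.0 && u1 <= 1.0 && u2 >= 0.0 && u2 <= 1.0);
    const double kPi = 3.14159265358979323846;
    const double r = std::sqrt(u1);        // disk radius
    const double phi = 2.0 * kPi * u2;     // disk angle
    const double z = std::sqrt(std::fmax(0.0, 1.0 - u1));
    return Vec3{r * std::cos(phi), r * std::sin(phi), z};
}

// Probability density (with respect to solid angle) of the direction above.
inline double cosineHemispherePdf(double cosTheta) {
    const double kPi = 3.14159265358979323846;
    return cosTheta > 0.0 ? cosTheta / kPi : 0.0;
}
```

<div>
For a Lambertian surface with BRDF kd / pi, the estimator weight BRDF * cos(theta) / pdf reduces to kd, so the cosine term never has to be evaluated explicitly.</div>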
<div>
<br /></div>
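<div>
The stochastic choice between reflection and refraction described for smooth dielectrics can be sketched as follows. The function names are hypothetical, and the unpolarized Fresnel formula for dielectrics is used here as an assumption about what "the Fresnel equations" means in the renderer.</div>

```cpp
#include <cassert>
#include <cmath>

// Unpolarized Fresnel reflectance at a boundary between two dielectrics.
// cosI is the cosine of the incident angle (>= 0); etaI and etaT are the
// refractive indices on the incident and transmitted sides.
inline double fresnelDielectric(double cosI, double etaI, double etaT) {
    assert(cosI >= 0.0 && cosI <= 1.0 && etaI > 0.0 && etaT > 0.0);
    const double ratio = etaI / etaT;
    const double sinT2 = ratio * ratio * (1.0 - cosI * cosI);  // Snell's law
    if (sinT2 >= 1.0) {
        return 1.0;  // total internal reflection
    }
    const double cosT = std::sqrt(1.0 - sinT2);
    const double rPar = (etaT * cosI - etaI * cosT) / (etaT * cosI + etaI * cosT);
    const double rPerp = (etaI * cosI - etaT * cosT) / (etaI * cosI + etaT * cosT);
    return 0.5 * (rPar * rPar + rPerp * rPerp);
}

// Pick exactly one branch per interaction: reflect with probability equal to
// the Fresnel reflectance, refract otherwise. u is uniform in [0, 1).
inline bool chooseReflection(double reflectance, double u) {
    return u < reflectance;
}
```

<div>
Because each branch is chosen with probability equal to its own Fresnel weight, the weight and the probability cancel in the estimator, and the path throughput is left unchanged by the choice.</div>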
<div>
Also, I am planning to</div>
</div>
<div>
<ul>
<li>read <a href="http://dl.acm.org/citation.cfm?id=2383874" target="_blank">this</a> paper to implement both refraction and reflection for rough surfaces.</li>
<li>add multiple importance sampling for direct lighting to reduce the noise related to glossy reflections.</li>
<li>add importance sampling for the BSDF term of the indirect lighting.</li>
</ul>
</div>
<div>
<br /></div>
<div>
Finally, let me sum up the basics of a path tracer in a few brief paragraphs. We separate incoming radiance into two parts, direct and indirect. We sample lights explicitly for the direct part, and while sampling this part, it is reasonable to importance sample the lights as well as the BSDFs. For the indirect part, however, we do not know which directions carry higher radiance, since that is the very problem we want to solve. We can nevertheless say which directions can contribute more according to the cosine term and the BSDF.</div>
<div>
<br /></div>
<div>
Furthermore, since we explicitly sample the lights, if an indirect ray hits an area light source, we cannot take its contribution, because that light was already sampled at the previous bounce and counting it again would add the same energy twice. That is, if we hit a light that we already sampled, we treat it as if it does not contribute. Also, note that we do not sample the lights explicitly for perfectly reflective and refractive surfaces, because that would be meaningless due to their delta distributions. Therefore, if a ray sent after an interaction with a smooth surface hits a light source, we do take the contribution it carries.</div>
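<div>
The rule above for when a ray that hits a light may add its emission can be written as a small predicate. This is a sketch with hypothetical names; radiance is a scalar for brevity, and treating camera rays like specular ones (so that lights are directly visible) is an assumption the post does not state.</div>

```cpp
#include <cassert>

// What kind of interaction produced the ray that has now hit a light.
enum class BounceType { Camera, Diffuse, Specular };

// Emission is skipped only when the previous vertex already sampled the
// lights explicitly, which in this sketch is the case for diffuse bounces.
inline bool shouldAddEmission(BounceType previousBounce) {
    return previousBounce != BounceType::Diffuse;
}

// Add the emitted radiance, scaled by the path throughput, when allowed.
inline double accumulateEmission(double radiance, double throughput,
                                 double emitted, BounceType previousBounce) {
    assert(throughput >= 0.0 && emitted >= 0.0);
    if (shouldAddEmission(previousBounce)) {
        return radiance + throughput * emitted;
    }
    return radiance;  // already accounted for by explicit light sampling
}
```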
<div>
<br /></div>
<div>
As before, I used jittered samples to send the rays not only through the center of each pixel but across the entire square covered by the pixel.<br />
<br />
I tried several Russian roulette strategies and fixed depth values for ray termination. Russian roulette can decrease the fps substantially, whereas fixed depth values result in better performance. However, unlike fixed depth values, Russian roulette generates unbiased images, which is more important for me now. The outputs shown below are generated using a Russian roulette strategy that considers the radiance contribution weight of the ray.</div>
<div>
<br /></div>
<div>
Let us see the outputs.</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiOxveQI_OQNf8xy3Xi4qd0HGt_sPyDC3nUG942vQaco8wnvpiBOv45AyeWp5wQAkR2f0rwiLeUGCiSweJxTR1ljvP4s1M79P59lFW0c-88LV7JTJP8v6wy-kIilnL2x56tWAUNM9nw2q9h/s1600/3795_roulette_weight.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" data-original-height="600" data-original-width="750" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiOxveQI_OQNf8xy3Xi4qd0HGt_sPyDC3nUG942vQaco8wnvpiBOv45AyeWp5wQAkR2f0rwiLeUGCiSweJxTR1ljvP4s1M79P59lFW0c-88LV7JTJP8v6wy-kIilnL2x56tWAUNM9nw2q9h/s400/3795_roulette_weight.png" title="cornellbox_ldr.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Cornell box rendered with 3800 samples</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgGi4vvNjjwY6La2qThfmAWrhnrA9eW51PSzKMQqok5tiI2T1g-CQGxJ7xKOsk2RZwzHZXlSsu4mbwcUt03lZmAt5wCf9cB19-kOGjqCbNBJL_UOJoJDfDqdHgp5SL7xEHFygiRFmA4LFEG/s1600/sponza_direct.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" data-original-height="675" data-original-width="1200" height="225" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgGi4vvNjjwY6La2qThfmAWrhnrA9eW51PSzKMQqok5tiI2T1g-CQGxJ7xKOsk2RZwzHZXlSsu4mbwcUt03lZmAt5wCf9cB19-kOGjqCbNBJL_UOJoJDfDqdHgp5SL7xEHFygiRFmA4LFEG/s400/sponza_direct.png" title="sponza_direct.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Sponza considering only direct illumination</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhQ4anEXtQTsNIyFndwCjYgGvTD_Rg5yrU4J8La8crE7Irp-YnPJhlDO_RUELTkwC2fOnuINRlxaMHvRejYWY68YSU7Ww4U4lfdvS6lSdNGCBzdpkhcEaP9mn0LUoxobAYkcxk9pAno-bLc/s1600/_650_sponza.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" data-original-height="675" data-original-width="1200" height="225" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhQ4anEXtQTsNIyFndwCjYgGvTD_Rg5yrU4J8La8crE7Irp-YnPJhlDO_RUELTkwC2fOnuINRlxaMHvRejYWY68YSU7Ww4U4lfdvS6lSdNGCBzdpkhcEaP9mn0LUoxobAYkcxk9pAno-bLc/s400/_650_sponza.png" title="sponza.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Sponza rendered with 650 samples.</td></tr>
</tbody></table>
<div style="text-align: center;">
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiJ-3cuZly2_HCBsy5MRT1PYIJFVrVzDUl7FVYvKYGiYB33IaVIpPuzP74VFNQU958Bz4AK6-VV_QSMxjnoOw0i6IgjrOGgeoIotbv-kTq2xqgnXmZZkSmQlSEbxUePnaFde04_JiZqiYUv/s1600/_8555_dragon_area.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" data-original-height="800" data-original-width="800" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiJ-3cuZly2_HCBsy5MRT1PYIJFVrVzDUl7FVYvKYGiYB33IaVIpPuzP74VFNQU958Bz4AK6-VV_QSMxjnoOw0i6IgjrOGgeoIotbv-kTq2xqgnXmZZkSmQlSEbxUePnaFde04_JiZqiYUv/s400/_8555_dragon_area.png" title="color_bleeding.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Color Bleeding Effect</td></tr>
</tbody></table>
</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjI7S5DAunnrAbLNP6onmlFQS77kswXFLN-8bZEJ7320G3TzI8FQdenQvL_kNNoUinhc-zVOIlBkUn8uhnT716p1_B373Nq0A5BZij_fjxFcHbRGHGzotKL_xnaeIg98jDN0I_rGnQ1_W2-/s1600/_23821_dragon_area.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" data-original-height="800" data-original-width="800" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjI7S5DAunnrAbLNP6onmlFQS77kswXFLN-8bZEJ7320G3TzI8FQdenQvL_kNNoUinhc-zVOIlBkUn8uhnT716p1_B373Nq0A5BZij_fjxFcHbRGHGzotKL_xnaeIg98jDN0I_rGnQ1_W2-/s400/_23821_dragon_area.png" title="caustic.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Caustic Effect</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh-cphDbF6bsBUQuYiUIxXyTY2bPe5KEDQu0Qh5t2YOZUufDgsp9CuGOZk1uCy016Dnjj482D7NDe9nPAadhC_T3Twy7hn6W3DEpgtzZDcivsaAbH0e8dStSoDbpmvZH46eZNVZBENuMu8j/s1600/_2470_dragon_area.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" data-original-height="800" data-original-width="800" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh-cphDbF6bsBUQuYiUIxXyTY2bPe5KEDQu0Qh5t2YOZUufDgsp9CuGOZk1uCy016Dnjj482D7NDe9nPAadhC_T3Twy7hn6W3DEpgtzZDcivsaAbH0e8dStSoDbpmvZH46eZNVZBENuMu8j/s400/_2470_dragon_area.png" title="nothing_special.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Nothing Special</td></tr>
</tbody></table>
Mustafa Işıkhttp://www.blogger.com/profile/08086920856262102562noreply@blogger.com0tag:blogger.com,1999:blog-5743751655059941858.post-4028117857309149242017-05-14T12:47:00.002+03:002017-05-14T12:48:42.885+03:00Assignment 8: BRDF ModelsI have used the well-known <a href="https://en.wikipedia.org/wiki/Blinn%E2%80%93Phong_shading_model" target="_blank">Blinn–Phong shading model</a> so far in my project. However, it is not physically accurate since it does not include a <a href="http://www.thetenthplanet.de/archives/255" target="_blank">normalization</a> factor. Also, unless we use the modified version of the Blinn–Phong shading model, we cannot normalize it properly. Therefore, to get more realistic results, one may use the modified version with a normalization factor. The same applies to the <a href="https://en.wikipedia.org/wiki/Phong_reflection_model" target="_blank">Phong shading model</a>.<br />
<div>
<br /></div>
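<div>
As a rough illustration of what the normalization factor looks like, here is a sketch of the specular lobes with hypothetical function names. The factor (n + 2) / (2 pi) for the modified Phong lobe follows from integrating the lobe over the hemisphere at normal incidence; the factor (n + 8) / (8 pi) for modified Blinn–Phong is a widely used approximation rather than an exact result, and neither is confirmed to be what the assignment scenes use.</div>

```cpp
#include <cassert>
#include <cmath>

// Specular lobe of the modified Phong BRDF with normalization.
// cosAlpha is the cosine between the mirror direction and the view direction.
inline double normalizedModifiedPhong(double ks, double exponent, double cosAlpha) {
    assert(ks >= 0.0 && exponent >= 0.0);
    const double kPi = 3.14159265358979323846;
    if (cosAlpha <= 0.0) {
        return 0.0;
    }
    return ks * (exponent + 2.0) / (2.0 * kPi) * std::pow(cosAlpha, exponent);
}

// Specular lobe of the modified Blinn-Phong BRDF with the approximate
// normalization factor. nDotH is the cosine between normal and half vector.
inline double normalizedModifiedBlinnPhong(double ks, double exponent, double nDotH) {
    assert(ks >= 0.0 && exponent >= 0.0);
    const double kPi = 3.14159265358979323846;
    if (nDotH <= 0.0) {
        return 0.0;
    }
    return ks * (exponent + 8.0) / (8.0 * kPi) * std::pow(nDotH, exponent);
}
```

<div>
With the factor in place, raising the exponent makes the highlight narrower and brighter instead of only narrower, so the reflected energy stays roughly constant.</div>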
<div>
On the other hand, we are required to implement a physically based BRDF model, <a href="https://renderman.pixar.com/view/cook-torrance-shader" target="_blank">Cook-Torrance</a>. It simulates specular reflection by treating every surface as if it consists of many microfacets. In this model, we consider the Fresnel reflectance of the surface, the distribution of microfacets described by a distribution function, and the geometric consequences of this distribution, namely shadowing and masking.</div>
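<div>
The three terms can be combined as in the sketch below. The names are hypothetical, and the specific choices (Beckmann distribution, Schlick's Fresnel approximation, the original Cook-Torrance geometry term, and the 4 (n.l)(n.v) denominator) are assumptions; the post does not say which variants its implementation uses.</div>

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Beckmann microfacet distribution; m is the roughness (RMS slope).
inline double beckmannD(double nDotH, double m) {
    assert(m > 0.0);
    const double kPi = 3.14159265358979323846;
    if (nDotH <= 0.0) return 0.0;
    const double cos2 = nDotH * nDotH;
    const double tan2 = (1.0 - cos2) / cos2;
    return std::exp(-tan2 / (m * m)) / (kPi * m * m * cos2 * cos2);
}

// Geometry term accounting for shadowing and masking between microfacets.
inline double geometryCookTorrance(double nDotH, double nDotV, double nDotL, double vDotH) {
    const double masking = 2.0 * nDotH * nDotV / vDotH;
    const double shadowing = 2.0 * nDotH * nDotL / vDotH;
    return std::min(1.0, std::min(masking, shadowing));
}

// Schlick's approximation to the Fresnel reflectance; f0 is the reflectance
// at normal incidence.
inline double schlickFresnel(double f0, double vDotH) {
    return f0 + (1.0 - f0) * std::pow(1.0 - vDotH, 5.0);
}

// Specular Cook-Torrance BRDF value for the given cosines.
inline double cookTorranceSpecular(double nDotL, double nDotV, double nDotH,
                                   double vDotH, double m, double f0) {
    if (nDotL <= 0.0 || nDotV <= 0.0 || vDotH <= 0.0) return 0.0;
    const double d = beckmannD(nDotH, m);
    const double g = geometryCookTorrance(nDotH, nDotV, nDotL, vDotH);
    const double f = schlickFresnel(f0, vDotH);
    return d * f * g / (4.0 * nDotL * nDotV);
}
```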
<div>
<br /></div>
<div>
Let us see the images of this week.</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj4UJeZmrY_kzAll6jxxUs-TvLsTG-hKQ7mCKpxK7pLRYk8uWeo03nHplXOIb5coxrXSi35aKNg3WDBgb8zQ-_3DwcQcq_oAThENBVZKFO6vfSrZemmdL51guG6n4Q6Onib2v4O2ROAN_bb/s1600/brdf_phong_original.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj4UJeZmrY_kzAll6jxxUs-TvLsTG-hKQ7mCKpxK7pLRYk8uWeo03nHplXOIb5coxrXSi35aKNg3WDBgb8zQ-_3DwcQcq_oAThENBVZKFO6vfSrZemmdL51guG6n4Q6Onib2v4O2ROAN_bb/s400/brdf_phong_original.png" title="brdf_phong_original.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">output of brdf_phong_original.xml<br />
kernel execution time: 6.1 milliseconds</td></tr>
</tbody></table>
<div style="text-align: center;">
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiPfyokcQ135CcgZJnE3h1a-4XxhxXUjmidcpHFhftjyLj2AZvZh8k8n8Y0bclC6tI_PfkHOREtu46Qa411uWFPjSrRi0_76YOfeUvqYYZHHverfbUaEWAJNywTPQDXM9OomZK1gRYh4hDv/s1600/brdf_phong_modified.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiPfyokcQ135CcgZJnE3h1a-4XxhxXUjmidcpHFhftjyLj2AZvZh8k8n8Y0bclC6tI_PfkHOREtu46Qa411uWFPjSrRi0_76YOfeUvqYYZHHverfbUaEWAJNywTPQDXM9OomZK1gRYh4hDv/s400/brdf_phong_modified.png" title="brdf_phong_modified.png" width="400" /></a></td></tr>
<tr><td class="tr-caption">output of brdf_phong_modified.xml<br />
kernel execution time: 6.1 milliseconds</td></tr>
</tbody></table>
</div>
<div style="text-align: center;">
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiL4fwF_OQVBzQkF_50j3-j5zvhC8CH8e6yppKpjJvqAs1FeWgHLIkfgUwrZQe0g9I2tdJp0YLuxNEJylniqzyGzaw15zpYe54mOf6h7gjy5tkgXAVnZVe6zxkAvegVYkHhRBfl3k5p7Nd0/s1600/brdf_phong_modified_normalized.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiL4fwF_OQVBzQkF_50j3-j5zvhC8CH8e6yppKpjJvqAs1FeWgHLIkfgUwrZQe0g9I2tdJp0YLuxNEJylniqzyGzaw15zpYe54mOf6h7gjy5tkgXAVnZVe6zxkAvegVYkHhRBfl3k5p7Nd0/s400/brdf_phong_modified_normalized.png" title="brdf_phong_modified_normalized.png" width="400" /></a></td></tr>
<tr><td class="tr-caption">output of brdf_phong_modified_normalized.xml<br />
kernel execution time: 6.1 milliseconds</td></tr>
</tbody></table>
</div>
<div style="text-align: center;">
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiOFIF6L28AqPT3WibBy9tL5YsdicNYUgtPbe4SVr5modSuMBl1_6fLoGIywIghwtiIWix9BPoGFSasWVxXc-_KwhPJlBBLNcoTiXNOU_bhAxlMcf2OvDFZ3kEcvjc68ogvc09eni8Zbk9q/s1600/brdf_blinnphong_original.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiOFIF6L28AqPT3WibBy9tL5YsdicNYUgtPbe4SVr5modSuMBl1_6fLoGIywIghwtiIWix9BPoGFSasWVxXc-_KwhPJlBBLNcoTiXNOU_bhAxlMcf2OvDFZ3kEcvjc68ogvc09eni8Zbk9q/s400/brdf_blinnphong_original.png" title="brdf_blinnphong_original.png" width="400" /></a></td></tr>
<tr><td class="tr-caption">output of brdf_blinnphong_original.xml<br />
kernel execution time: 6.1 milliseconds</td></tr>
</tbody></table>
</div>
<div style="text-align: center;">
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg7-9fsZJGVu4FsqkD7K1HrtVe42W0x5Dij0iidc4Si7CU2NQDOA_zkZTOCHiuZrX7hxsSJbqlxORQ7Ejh_O3WmgnNLgN4r2F_2cRzRjjeJjjvnDblIq7p5NALz6EzCV8hXN249Fc6llZv9/s1600/brdf_blinnphong_modified.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg7-9fsZJGVu4FsqkD7K1HrtVe42W0x5Dij0iidc4Si7CU2NQDOA_zkZTOCHiuZrX7hxsSJbqlxORQ7Ejh_O3WmgnNLgN4r2F_2cRzRjjeJjjvnDblIq7p5NALz6EzCV8hXN249Fc6llZv9/s400/brdf_blinnphong_modified.png" title="brdf_blinnphong_modified.png" width="400" /></a></td></tr>
<tr><td class="tr-caption">output of brdf_blinnphong_modified.xml<br />
kernel execution time: 6.1 milliseconds</td></tr>
</tbody></table>
</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhMO-CTbFpGyCU-Cn2aI2o8UC114EfipSrw3qwgGSDjqm9eW8ITKNcWitmN5UWrT9gEe8C2QQJLNeTG2hSSxNYLGWT1OCsmWeRdNYaUjmJA2S4nnIBu-4HZgEapxn0WzaKxlTigFjH5zdap/s1600/brdf_blinnphong_modified_normalized.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhMO-CTbFpGyCU-Cn2aI2o8UC114EfipSrw3qwgGSDjqm9eW8ITKNcWitmN5UWrT9gEe8C2QQJLNeTG2hSSxNYLGWT1OCsmWeRdNYaUjmJA2S4nnIBu-4HZgEapxn0WzaKxlTigFjH5zdap/s400/brdf_blinnphong_modified_normalized.png" title="brdf_blinnphong_modified_normalized.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">output of brdf_blinnphong_modified_normalized.xml<br />
kernel execution time: 6.1 milliseconds</td></tr>
</tbody></table>
<div style="text-align: center;">
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgQNrVuLLJDAPbiMPUSpRG1V3Zk7PJlJrd7Mpa1bzCAyOKcoJ2yS-7KQphACbMx-0Any74hqD63EChh67cb6AFMtNO2tL71WYfCsz-50ewSFchQkLLh-rThwf4uAtMaDongl_aGvhrx9xGe/s1600/brdf_torrancesparrow.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgQNrVuLLJDAPbiMPUSpRG1V3Zk7PJlJrd7Mpa1bzCAyOKcoJ2yS-7KQphACbMx-0Any74hqD63EChh67cb6AFMtNO2tL71WYfCsz-50ewSFchQkLLh-rThwf4uAtMaDongl_aGvhrx9xGe/s400/brdf_torrancesparrow.png" title="brdf_torrancesparrow.png" width="400" /></a></td></tr>
<tr><td class="tr-caption">output of brdf_torrancesparrow.xml<br />
kernel execution time: 6.1 milliseconds</td></tr>
</tbody></table>
</div>
<div style="text-align: center;">
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjWvqMpdJHHqmaTf3Gdlqqvbzf33JPWIUfmFuUk5aK8IdcOBXy8mxwMENCSsbMfCmNdB_hWWCX4NgFshLvsp39pO0r8zuPBOVY3HmzDZNdThjy79peMZ8zXVfhA7goFqpAc4NlKIEeFpI2M/s1600/killeroo_blinnphong_modified_normalized.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjWvqMpdJHHqmaTf3Gdlqqvbzf33JPWIUfmFuUk5aK8IdcOBXy8mxwMENCSsbMfCmNdB_hWWCX4NgFshLvsp39pO0r8zuPBOVY3HmzDZNdThjy79peMZ8zXVfhA7goFqpAc4NlKIEeFpI2M/s400/killeroo_blinnphong_modified_normalized.png" title="killeroo_blinnphong_modified_normalized.png" width="400" /></a></td></tr>
<tr><td class="tr-caption">output of killeroo_blinnphong_modified_normalized.xml<br />
kernel execution time: 57 milliseconds</td></tr>
</tbody></table>
</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjhO4nFPbXtPO5s-vdZlxKUnBtsFMzD9Yhvx3rWi8HPtn49bUYte6UBvQ7wVISDrDGDqRZkECjQHLWW25K7_EZhOiWKw_Z9Ib6JFaXQLwKvDY1Hb1kQ7Zf-AyJaiciWAXKqcarfQQDmlkTO/s1600/killeroo_blinnphong_modified_normalized_closeup.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjhO4nFPbXtPO5s-vdZlxKUnBtsFMzD9Yhvx3rWi8HPtn49bUYte6UBvQ7wVISDrDGDqRZkECjQHLWW25K7_EZhOiWKw_Z9Ib6JFaXQLwKvDY1Hb1kQ7Zf-AyJaiciWAXKqcarfQQDmlkTO/s400/killeroo_blinnphong_modified_normalized_closeup.png" title="killeroo_blinnphong_modified_normalized_closeup.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">output of killeroo_blinnphong_modified_normalized.xml<br />
kernel execution time: 63 milliseconds</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgz0oozNLqDjx-BRedvTtGQLqE36vGdpNqSF9ts1NKSCCzvgugreybaxKdThyIq2WzSW8r54aPAF_CKiEhY2F1OnOBMHt4akJAL7LSHvWmZD4lnrFWK52Oe4wzYReQGoN61ur7xh8dH8qoO/s1600/killeroo_torrancesparrow.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgz0oozNLqDjx-BRedvTtGQLqE36vGdpNqSF9ts1NKSCCzvgugreybaxKdThyIq2WzSW8r54aPAF_CKiEhY2F1OnOBMHt4akJAL7LSHvWmZD4lnrFWK52Oe4wzYReQGoN61ur7xh8dH8qoO/s400/killeroo_torrancesparrow.png" title="killeroo_torrancesparrow.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">output of killeroo_torrancesparrow.xml<br />
kernel execution time: 57 milliseconds</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg0FYxT4GkHkvgozAamhvP8FjFxlc-2-6kclVXt55WWhphHfnBLjQHptYKSfaAL-5I5-b7WYy1VeLD3DzAm9wuiOg043h69dpQtZgRJTeu1oAlxP0eIS9zEv3yr0KsypDwcOGqoQ7mFmx8-/s1600/killeroo_torrancesparrow_closeup.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg0FYxT4GkHkvgozAamhvP8FjFxlc-2-6kclVXt55WWhphHfnBLjQHptYKSfaAL-5I5-b7WYy1VeLD3DzAm9wuiOg043h69dpQtZgRJTeu1oAlxP0eIS9zEv3yr0KsypDwcOGqoQ7mFmx8-/s400/killeroo_torrancesparrow_closeup.png" title="killeroo_torrancesparrow_closeup.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">output of killeroo_torrancesparrow.xml<br />
kernel execution time: 63 milliseconds</td></tr>
</tbody></table>
Mustafa Işıkhttp://www.blogger.com/profile/08086920856262102562noreply@blogger.com0tag:blogger.com,1999:blog-5743751655059941858.post-42341968168331760542017-04-30T00:10:00.001+03:002017-04-30T17:56:23.430+03:00Assignment 6&7: Texture Mapping, Procedural Textures and Bump MappingTextures add a great deal of realism to ray-traced images. Plain colors alone are not enough to recreate real-world scenes, which is why texture support is almost a must for every ray tracer.<br />
<div>
<br /></div>
<div>
For these two assignments, I implemented texture mapping, Perlin noise textures and bump mapping. CUDA helped, as it does once in a while, with texture operations such as fetching, filtering and setting up the various sampling parameters. Figuring out which function to use and how to use it was a bit painful, but now all is clear and I will talk about it in the "Lessons learned" section. For the procedural textures, I implemented improved Perlin noise.<br />
<br />
Bump mapping is relatively easy once your ray tracer has texturing functionality. However, there is one big problem when dealing with the tangent vectors and the surface normal: how are you going to transform them if you are using instancing? Transforming the normal is well known, but what about the others? Tangent vectors should not be transformed the way normals are. Say you figured out that you should use the transformation matrix <b>M</b> rather than the inverse-transpose of <b>M</b>. That works only as long as the determinant of the transformation matrix is positive. If <b>M</b> has a negative determinant (for example, a mirroring transformation), the handedness of the transformed tangent frame flips relative to the transformed normal. Thanks to a friend who gave me this information, I was able to solve the problem by taking the tangent vectors from object space to world space by transforming them with <b>M</b> and then multiplying one of the tangent vectors by the sign of the determinant of <b>M</b>.</div>
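The determinant fix described above can be sketched in plain host-side C++. This is only a minimal illustration, not the code of my ray tracer: `Vec3`, `Mat3`, `TangentFrame` and `transform_tangent_frame` are hypothetical names, and the matrix is assumed to be the upper-left 3x3 part of the instance transformation, stored row-major.

```cpp
#include <array>
#include <cassert>

// Hypothetical minimal types, used only for this illustration.
struct Vec3 { float x, y, z; };
using Mat3 = std::array<std::array<float, 3>, 3>;  // row-major 3x3 matrix

// Matrix-vector product M * v.
Vec3 mul(const Mat3& m, const Vec3& v) {
    return { m[0][0] * v.x + m[0][1] * v.y + m[0][2] * v.z,
             m[1][0] * v.x + m[1][1] * v.y + m[1][2] * v.z,
             m[2][0] * v.x + m[2][1] * v.y + m[2][2] * v.z };
}

// Determinant of a 3x3 matrix (cofactor expansion along the first row).
float determinant(const Mat3& m) {
    return m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
         - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
         + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]);
}

struct TangentFrame { Vec3 tangent, bitangent; };

// Takes both tangent vectors from object space to world space with M itself
// (not the inverse-transpose) and flips the bitangent when det(M) < 0, so the
// frame keeps its handedness relative to the transformed normal.
TangentFrame transform_tangent_frame(const Mat3& m, const Vec3& tangent, const Vec3& bitangent) {
    const float det = determinant(m);
    assert(det != 0.0f);  // a singular transformation cannot be used for instancing
    const float sign = det < 0.0f ? -1.0f : 1.0f;
    const Vec3 t = mul(m, tangent);
    const Vec3 b = mul(m, bitangent);
    return { t, { sign * b.x, sign * b.y, sign * b.z } };
}
```

Which of the two tangent vectors gets flipped is a convention; the sketch flips the bitangent, but flipping the tangent instead should work equally well as long as it is done consistently.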
<div>
<br /></div>
<div>
Let us see the outputs of this week.</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgDeYXin7wkoxDb0tCx2dO2EKXfObpQ63Ygc9p8G1F0AXjIABTluEXPy1EKkjlDV9YTgmy4NcEuSLkOhnVLCqLMln1TSZZy6EC5T7GPP0022d30QqDjYqpQT9QtrNJ06yM5bH9REPmrhEu6/s1600/simple_texture.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgDeYXin7wkoxDb0tCx2dO2EKXfObpQ63Ygc9p8G1F0AXjIABTluEXPy1EKkjlDV9YTgmy4NcEuSLkOhnVLCqLMln1TSZZy6EC5T7GPP0022d30QqDjYqpQT9QtrNJ06yM5bH9REPmrhEu6/s400/simple_texture.png" title="simple_texture.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">output of simple_texture.xml<br />
kernel execution time: 9.7 milliseconds</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjNbBFzBeiNEZDzjeBB42g-ZkytvxJrY0KE1Z4nBE-wZCihtlflx52qGOjOTADJY_PxAufyninsruDFZvJO3BVm5BkcAd3As0kQ72zl8N5bv1Fq1xG_NDdUrLMQ23gOtnJ8s61ctiGwdu_R/s1600/skybox.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjNbBFzBeiNEZDzjeBB42g-ZkytvxJrY0KE1Z4nBE-wZCihtlflx52qGOjOTADJY_PxAufyninsruDFZvJO3BVm5BkcAd3As0kQ72zl8N5bv1Fq1xG_NDdUrLMQ23gOtnJ8s61ctiGwdu_R/s400/skybox.png" title="skybox.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">output of skybox.xml<br />
kernel execution time: 923 milliseconds</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEht_hG5T1JZOPL9q_Ms-gL0lykRAlDq3VnFJkNNolPFRcPWt9jM3X5RA5d4M4KxbxpRnNbIZ_eHhST5oxKTwL9c6wTpt9cCe7USVCUvq03mmZbecKleOF4Lp1Twg5EtjZSrOpFw5t4RELzQ/s1600/ellipsoids_texture.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEht_hG5T1JZOPL9q_Ms-gL0lykRAlDq3VnFJkNNolPFRcPWt9jM3X5RA5d4M4KxbxpRnNbIZ_eHhST5oxKTwL9c6wTpt9cCe7USVCUvq03mmZbecKleOF4Lp1Twg5EtjZSrOpFw5t4RELzQ/s400/ellipsoids_texture.png" title="ellipsoids_texture.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">output of ellipsoids_texture.xml<br />
kernel execution time: 11.4 milliseconds</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgv_Z-scXMSzQOFmhSGVEnit3joCIfjdiC8lG8kJ4QSaS2fFcJunUcDxj48CKIvAkQePUZz4x20wqddp_ZfTkJlh-qRgIIg5mRPoTDxHDhIdCMt3Q4b5GnqxkpzxC0gNCREMeB29H97VKUy/s1600/sphere_texture_blend_bilinear.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgv_Z-scXMSzQOFmhSGVEnit3joCIfjdiC8lG8kJ4QSaS2fFcJunUcDxj48CKIvAkQePUZz4x20wqddp_ZfTkJlh-qRgIIg5mRPoTDxHDhIdCMt3Q4b5GnqxkpzxC0gNCREMeB29H97VKUy/s400/sphere_texture_blend_bilinear.png" title="sphere_texture_blend_bilinear.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">output of sphere_texture_blend_bilinear.xml<br />
kernel execution time: 10.7 milliseconds</td></tr>
</tbody></table>
<div style="text-align: center;">
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjR8gxbxKqJ5evxOpJgp-StLw1tU4GUiVRx1c8jwYd3nxcvlWpZ8MAsilKpv9ZhyphenhyphenwBZMMU5UZOEveex30tbqkd52vVDKSGMU8TdA1bIvtz_pvYTAIRUW13KBguRop4bsvSbm2H1_HyQ3-jM/s1600/sphere_texture_replace_nearest.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjR8gxbxKqJ5evxOpJgp-StLw1tU4GUiVRx1c8jwYd3nxcvlWpZ8MAsilKpv9ZhyphenhyphenwBZMMU5UZOEveex30tbqkd52vVDKSGMU8TdA1bIvtz_pvYTAIRUW13KBguRop4bsvSbm2H1_HyQ3-jM/s400/sphere_texture_replace_nearest.png" title="sphere_texture_replace_nearest.png" width="400" /></a></td></tr>
<tr><td class="tr-caption">output of sphere_texture_replace_nearest.xml<br />
kernel execution time: 10.8 milliseconds</td></tr>
</tbody></table>
</div>
<div style="text-align: center;">
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiHJ4Ddc8xL3_DYI2jQghnVLjw6DLNHwymceJVr6j8q2pU1IyEOEG6KmmjKnTJ7PZ7JPKNJH5ja5glnFHnQ10el-oCUKxC1ZOur7jjbkIoA3sVMkp9wef0pQbhms7Ev8kePUThk-kxE6RJ_/s1600/sphere_texture_replace_bilinear.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiHJ4Ddc8xL3_DYI2jQghnVLjw6DLNHwymceJVr6j8q2pU1IyEOEG6KmmjKnTJ7PZ7JPKNJH5ja5glnFHnQ10el-oCUKxC1ZOur7jjbkIoA3sVMkp9wef0pQbhms7Ev8kePUThk-kxE6RJ_/s400/sphere_texture_replace_bilinear.png" title="sphere_texture_replace_bilinear.png" width="400" /></a></td></tr>
<tr><td class="tr-caption">output of sphere_texture_replace_bilinear.xml<br />
kernel execution time: 10.9 milliseconds</td></tr>
</tbody></table>
</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg4NMGKvGhffym1PnBYJCXpZZrFphUWKfbCj6bfYHa2GAmxZA4_7MGAiQI8womk5pY6ITE86omqWEhwobnYv09Rsh8NJG29nJ6UpH8X22AnoI9I_WUKuk37JhDuVW64lDkoz8R5lTCQemx_/s1600/killeroo_diffuse_specular_texture.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg4NMGKvGhffym1PnBYJCXpZZrFphUWKfbCj6bfYHa2GAmxZA4_7MGAiQI8womk5pY6ITE86omqWEhwobnYv09Rsh8NJG29nJ6UpH8X22AnoI9I_WUKuk37JhDuVW64lDkoz8R5lTCQemx_/s400/killeroo_diffuse_specular_texture.png" title="killeroo_diffuse_specular_texture.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">output of killeroo_diffuse_specular_texture.xml<br />
kernel execution time: 492 milliseconds</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgeH-l5ThdEJTnAiQCAIItGIkGtSqQiwFmwBPnoraB6NH__duAMsynQ9Vqv2zZ0kC7Kuhtpg_p3qPfhDr1gYMDjueHOdmVEHrqOcv-lbB09QhHLviqeIgkbnYkI6bp-D8xuLP-ImmBPNGLK/s1600/perlin_types.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgeH-l5ThdEJTnAiQCAIItGIkGtSqQiwFmwBPnoraB6NH__duAMsynQ9Vqv2zZ0kC7Kuhtpg_p3qPfhDr1gYMDjueHOdmVEHrqOcv-lbB09QhHLviqeIgkbnYkI6bp-D8xuLP-ImmBPNGLK/s400/perlin_types.png" title="perlin_types.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">output of perlin_types.xml<br />
kernel execution time: 14.1 milliseconds</td></tr>
</tbody></table>
<div style="text-align: center;">
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgnbFr_cC72pRRsvs00nq36_MyJrBpaePIpAWraOfCC7D-xGLkFNqXGMuFqXA5_bEn4pGzBjiDExrXek3XzoU7Q5otqVedVO94IJt2Oe-4iNX-AXFcPEoP-sPb1hmcOdDIw1-rW_8cLF_Vu/s1600/bump_mapping_basic.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgnbFr_cC72pRRsvs00nq36_MyJrBpaePIpAWraOfCC7D-xGLkFNqXGMuFqXA5_bEn4pGzBjiDExrXek3XzoU7Q5otqVedVO94IJt2Oe-4iNX-AXFcPEoP-sPb1hmcOdDIw1-rW_8cLF_Vu/s400/bump_mapping_basic.png" title="bump_mapping_basic.png" width="400" /></a></td></tr>
<tr><td class="tr-caption">output of bump_mapping_basic.xml<br />
kernel execution time: 350 milliseconds</td></tr>
</tbody></table>
</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjq4ZDbFbVex_vt8arVFTrNGOlUaj9KNY9wcgQB4nSrceZsqsOOm6tSMXXyD7BEY7FocUBdTseK2W3odc95TzBa3UHoboF0kJ7rkMKU992VfNQeIhK8HrP94_T1gxKC_Ud5vCtmj8baqMtl/s1600/bump_mapping_transformed.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjq4ZDbFbVex_vt8arVFTrNGOlUaj9KNY9wcgQB4nSrceZsqsOOm6tSMXXyD7BEY7FocUBdTseK2W3odc95TzBa3UHoboF0kJ7rkMKU992VfNQeIhK8HrP94_T1gxKC_Ud5vCtmj8baqMtl/s400/bump_mapping_transformed.png" title="bump_mapping_transformed.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">output of bump_mapping_transformed.xml<br />
kernel execution time: 335 milliseconds</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEigTw-uHLiy_uPSzCROKEZ662HhIcaQYUPXGD331-Os-KzCKCkDBa3EBXNoJEWKqLRwx1WIniuROdJEmk_jcWbyvtbJVDFIYuKh2xasVHEPpiXjJxuLolUVcStrkaG6ceghFqlWg7CBzlvD/s1600/killeroo_bump_walls.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEigTw-uHLiy_uPSzCROKEZ662HhIcaQYUPXGD331-Os-KzCKCkDBa3EBXNoJEWKqLRwx1WIniuROdJEmk_jcWbyvtbJVDFIYuKh2xasVHEPpiXjJxuLolUVcStrkaG6ceghFqlWg7CBzlvD/s400/killeroo_bump_walls.png" title="killeroo_bump_walls.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">output of killeroo_bump_walls.xml<br />
kernel execution time: 515 milliseconds</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiztj-1g5f6YeVrwXhDpTrTgylNILrBL8N3CVHYMpRnpgoZhdpWXKVekDCM9bM-bolZzpERJ913wPxr-4R8xVleOWu3KKVdepCbW4vOahy0antIJwohLIwRlObUTyRyGcFvWd5vxGUpjEYh/s1600/sphere_bump_nobump.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiztj-1g5f6YeVrwXhDpTrTgylNILrBL8N3CVHYMpRnpgoZhdpWXKVekDCM9bM-bolZzpERJ913wPxr-4R8xVleOWu3KKVdepCbW4vOahy0antIJwohLIwRlObUTyRyGcFvWd5vxGUpjEYh/s400/sphere_bump_nobump.png" title="sphere_bump_nobump.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">output of sphere_bump_nobump.xml<br />
kernel execution time: 192 milliseconds</td></tr>
</tbody></table>
<div style="text-align: left;">
<span style="background-color: white; color: #444444; font-family: "arial" , "tahoma" , "helvetica" , "freesans" , sans-serif; font-size: 13px;">Lessons learned:</span></div>
<div>
<ol>
<li>I stored the random number sequence for the Perlin noise in constant memory. Since the numbers do not change during execution, constant memory is a fast and appropriate choice.</li>
<li><span style="color: #444444; font-family: "arial" , "tahoma" , "helvetica" , "freesans" , sans-serif;"><span style="font-size: 13px;">Texturing with CUDA is really simple. The only confusing part is figuring out which function to use and how. First of all, I recommend implementing a texture manager class that manages all the textures loaded for your scene. Your scene file may use the same texture more than once, and without such a manager you might allocate space for the same image many times. The following code snippet summarizes how to create a texture in CUDA.</span></span></li>
</ol>
</div>
<pre style="background-color: #eeeeee; border: 1px dashed #999999; color: black; font-family: "andale mono" , "lucida console" , "monaco" , "fixed" , monospace; font-size: 12px; line-height: 14px; overflow: auto; padding: 5px; width: 100%;"> <code style="color: black; word-wrap: normal;">
cudaArray* cuda_array = nullptr;
//1-Create and fill a channel description.
cudaChannelFormatDesc channel_desc = cudaCreateChannelDesc(8, 8, 8, 8, cudaChannelFormatKindUnsigned); //RGBA
HANDLE_ERROR(cudaMallocArray(&cuda_array, &channel_desc, image_width, image_height));
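//"bits" is the pointer to your tightly packed RGBA8 image in host memory.
//cudaMemcpyToArray is deprecated in newer CUDA toolkits, so the 2D variant is used here (pitch and width are in bytes).
HANDLE_ERROR(cudaMemcpy2DToArray(cuda_array, 0, 0, bits, image_width * 4, image_width * 4, image_height, cudaMemcpyHostToDevice));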
//2-Create and fill a resource description.
cudaResourceDesc res_desc;
memset(&res_desc, 0, sizeof(res_desc));
res_desc.resType = cudaResourceTypeArray;
res_desc.res.array.array = cuda_array;
//3-Create and fill a texture description.
cudaTextureDesc tex_desc;
memset(&tex_desc, 0, sizeof(tex_desc));
tex_desc.addressMode[0] = cudaAddressModeClamp;
tex_desc.addressMode[1] = cudaAddressModeClamp;
tex_desc.filterMode = cudaFilterModeLinear;
tex_desc.readMode = cudaReadModeNormalizedFloat;
tex_desc.normalizedCoords = 1;
//4-Create the texture object
cudaTextureObject_t texture = 0;
HANDLE_ERROR(cudaCreateTextureObject(&texture, &res_desc, &tex_desc, nullptr));
..
..
..
..
//Fetch a texel in device code; u and v are normalized coordinates since normalizedCoords is set.
float4 pixel = tex2D&lt;float4&gt;(texture, u, v);</code></pre>
Note that we do not bind the texture before using it, since we are using texture objects rather than texture references. To understand the difference and what each parameter does, see the <a href="http://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TEXTURE__OBJECT.html" target="_blank">documentation</a>.
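The texture manager recommended in the lessons above could look roughly like the following host-side sketch. It is an assumption-laden illustration rather than my actual class: `TextureManager`, `TextureHandle` and the `Loader` callback are hypothetical names, and the handle type merely stands in for `cudaTextureObject_t`. In a real renderer the loader would read the image file and run the texture creation steps shown in the snippet above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Stand-in for cudaTextureObject_t so that the sketch compiles without CUDA.
using TextureHandle = std::uint64_t;

// Caches textures by file path so that an image referenced several times in a
// scene file is loaded and uploaded only once.
class TextureManager {
public:
    // Hypothetical callback that loads the image and creates the texture object.
    using Loader = std::function<TextureHandle(const std::string&)>;

    explicit TextureManager(Loader loader) : loader_(std::move(loader)) {
        assert(static_cast<bool>(loader_));
    }

    // Returns the cached handle if the path was seen before, otherwise loads it.
    TextureHandle get(const std::string& path) {
        const auto it = cache_.find(path);
        if (it != cache_.end()) {
            return it->second;
        }
        const TextureHandle handle = loader_(path);
        cache_.emplace(path, handle);
        return handle;
    }

    // Number of distinct textures created so far.
    std::size_t size() const { return cache_.size(); }

private:
    Loader loader_;
    std::unordered_map<std::string, TextureHandle> cache_;
};
```

A complete version would also destroy the texture objects and free the CUDA arrays when the scene is unloaded; that part is left out to keep the sketch short.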
Mustafa Işıkhttp://www.blogger.com/profile/08086920856262102562noreply@blogger.com0Ankara, Türkiye39.9333635 32.85974190000001739.5435735 32.21429490000002 40.3231535 33.505188900000014tag:blogger.com,1999:blog-5743751655059941858.post-78319947251790128842017-04-18T01:01:00.004+03:002017-04-18T03:04:53.662+03:00Assignment 5: Distributed Ray TracingAlthough we have rendered some nice pictures so far, we didn't concentrate on making them more realistic. Looking at them closely, we can easily see hard shadows and jagged edges, which are fundamental problems of computer-generated images. In this assignment, we address them with multisampling so that we get softer shadows and smoother edges. Of course, there are ways to post-process the final images to make them more realistic, but that is a topic for another discussion. Unfortunately, realism comes at a price in our case, since we are now sending many more rays per pixel; needless to say, it is worth it.<br />
<div>
<br /></div>
<div>
Taking more than one sample per pixel not only enables multisampling but also makes it possible to have area lights, depth of field and some other nice things at no further cost. Our starting point is to choose some random points around a pixel center and combine the radiance values computed for the rays shot through these points into the final value of that pixel. Many strategies can be adopted for choosing these points and for combining the radiance values. The easiest is to select all the points purely at random. However, purely random selection creates quite noisy images. Instead, we can adopt jittered sampling, multi-jittered sampling, n-rooks sampling and many more. I implemented jittered sampling, which divides the pixel into a grid of cells and picks one random point inside each cell, and the results are quite satisfying. For the "combining" part, I simply take the average of the radiance values returned by these rays.</div>
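Jittered sampling and the averaging step can be sketched as follows. This is a CPU-side illustration with hypothetical names (`Sample2D`, `jittered_samples`, `average_radiance`), using the standard library random generator instead of whatever generator runs on the GPU, and a single float per radiance value for brevity.

```cpp
#include <cassert>
#include <random>
#include <vector>

struct Sample2D { float x, y; };

// Generates n*n jittered samples in the unit square: the square is divided
// into an n-by-n grid and one random point is picked inside every cell.
// Samples are stored row by row.
std::vector<Sample2D> jittered_samples(int n, std::mt19937& rng) {
    assert(n > 0);
    std::uniform_real_distribution<float> uniform(0.0f, 1.0f);
    const float cell = 1.0f / n;  // edge length of one grid cell
    std::vector<Sample2D> samples;
    samples.reserve(static_cast<std::size_t>(n) * n);
    for (int row = 0; row < n; ++row) {
        for (int col = 0; col < n; ++col) {
            const float x = (col + uniform(rng)) * cell;
            const float y = (row + uniform(rng)) * cell;
            samples.push_back({x, y});
        }
    }
    return samples;
}

// Combines the radiance values of one pixel by taking their plain average.
float average_radiance(const std::vector<float>& radiance) {
    assert(!radiance.empty());
    float sum = 0.0f;
    for (const float value : radiance) {
        sum += value;
    }
    return sum / static_cast<float>(radiance.size());
}
```

The samples live in the unit square, so to use one of them for a pixel it would be offset by the pixel's corner and scaled by the pixel size.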
<div>
<br /></div>
<div>
Once you have computed the jittered samples for a pixel, you can reuse them for the area light and the depth of field as well. For example, say you are going to take 36 samples per pixel. That means you need 36 jittered sample points around the pixel center, 36 sample points on the area light and 36 sample points on the aperture of the thin-lens camera. Once you have generated one of these sample sets, you can use it for the others too. This way, the samples on each of the planes you are sampling are more uniformly distributed. Of course, this does not apply if you want a different number of samples for each effect, for example more samples for area lights than for multisampling.</div>
<div>
<br /></div>
<div>
<a href="https://www.amazon.com/Ray-Tracing-Ground-Kevin-Suffern/dp/1568812728" target="_blank">Ray Tracing from the Ground Up</a> explains these concepts clearly. Also, see <a href="http://mathworld.wolfram.com/DiskPointPicking.html" target="_blank">this page</a> for how to pick a random point on a disk, which you will need for disk-shaped area lights and for the depth of field effect.</div>
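Picking a point on a disk from a sample in the unit square can be sketched like this. The function follows the square-root formula from the disk point picking page linked above; `Point2D` and `sample_unit_disk` are hypothetical names chosen for the illustration, and the inputs are assumed to be two numbers in [0, 1], for example one jittered sample.

```cpp
#include <cassert>
#include <cmath>

struct Point2D { float x, y; };

// Maps a point (u, v) of the unit square to a point on the unit disk.
// Taking the square root of u keeps the density uniform over the area;
// using r = u directly would cluster the samples near the center.
Point2D sample_unit_disk(float u, float v) {
    assert(u >= 0.0f && u <= 1.0f && v >= 0.0f && v <= 1.0f);
    const float two_pi = 6.28318530717958647692f;
    const float r = std::sqrt(u);
    const float theta = two_pi * v;
    return { r * std::cos(theta), r * std::sin(theta) };
}
```

For a disk-shaped area light or a lens aperture, the returned point would then be scaled by the radius and placed in the plane of the light or lens using two orthogonal unit vectors spanning that plane.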
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<div>
Apart from these subjects, I implemented what I've explained in the "lessons learned" part of the <a href="http://raytracer.mustafaisik.net/2017/04/assignment-4-reflections-and-refractions.html" target="_blank">previous post</a>. I reduced the rendering time from 1.34 seconds to 0.47 seconds for the image below.<br />
<div class="separator" style="clear: both; text-align: center;">
</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgMEKk0WclGeETbOiTD2k5KlvIffmy8rzr4oAaZ0hcTPJ5rYGApBRKWVunBq8ubRwjCyWv4W_KHXwiNgGdqMkGZuvK5acHT_eBWs-z6vutnYnLNNqZeIVXd886CKBDpgYK5rpx_07SKuGe8/s1600/glass_plates_point.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgMEKk0WclGeETbOiTD2k5KlvIffmy8rzr4oAaZ0hcTPJ5rYGApBRKWVunBq8ubRwjCyWv4W_KHXwiNgGdqMkGZuvK5acHT_eBWs-z6vutnYnLNNqZeIVXd886CKBDpgYK5rpx_07SKuGe8/s320/glass_plates_point.png" title="glass_plates_point" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">previous approach, 32 samples</td></tr>
</tbody></table>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiD7ktbZ_NKNzJPFG_FKoeCzRzJUL2yT5k5PKS2gR2GzwENuPvYL-fSfnv_8Pq0kXjxtWZGFDUXGOs9rPQWOz9xJKe4t0DEnIXClmMZYVghruC4kSC6ANTRe7CKsGWh4hILX7tGK2AqNijI/s1600/glass_plates_point_169.png" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img alt="" border="0" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiD7ktbZ_NKNzJPFG_FKoeCzRzJUL2yT5k5PKS2gR2GzwENuPvYL-fSfnv_8Pq0kXjxtWZGFDUXGOs9rPQWOz9xJKe4t0DEnIXClmMZYVghruC4kSC6ANTRe7CKsGWh4hILX7tGK2AqNijI/s200/glass_plates_point_169.png" title="glass_plates_point_169" width="200" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">mentioned approach, 169 samples</td></tr>
</tbody></table>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgDLIaeAyxYUBhwz-iBQLldt5pT2_KF4_5t003pdbL7At8sF1aPFX1HglQuo9jt7Afkh1zXbwu32cUQjeU3vulC7nCfjg35yZevdZNqn9mdCFU8dNLbg0mt3lA0tuCavy87Vz3-cIZmgclb/s1600/glass_plates_point_32.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img alt="" border="0" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgDLIaeAyxYUBhwz-iBQLldt5pT2_KF4_5t003pdbL7At8sF1aPFX1HglQuo9jt7Afkh1zXbwu32cUQjeU3vulC7nCfjg35yZevdZNqn9mdCFU8dNLbg0mt3lA0tuCavy87Vz3-cIZmgclb/s200/glass_plates_point_32.png" title="glass_plates_point_32" width="200" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">mentioned approach, 32 samples</td></tr>
</tbody></table>
<br />
<br />
<br />
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
</div>
</div>
<div>
However, not unexpectedly, the resulting images are a bit noisy. Since we will be taking many more samples per pixel anyway once we move on to the <a href="https://en.wikipedia.org/wiki/Monte_Carlo_method" target="_blank">Monte Carlo method</a>, this approach will be the right choice.</div>
<div>
<br />
Lastly, let us see what we rendered this week.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhBv43xUk0zlzQhQmU-SFmqeNLoPGU6Wxhjv4hDPhfvZLdnLUmQ4xplYhVHkTAKr_RVjMpRqHRMV1_PeasiCW5WGgFe7NSySUO2H-mHziymOwlfDZkL6zJGN2879Y5rEppdJNVxydLDURdN/s1600/glass_plates_point.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhBv43xUk0zlzQhQmU-SFmqeNLoPGU6Wxhjv4hDPhfvZLdnLUmQ4xplYhVHkTAKr_RVjMpRqHRMV1_PeasiCW5WGgFe7NSySUO2H-mHziymOwlfDZkL6zJGN2879Y5rEppdJNVxydLDURdN/s400/glass_plates_point.png" title="glass_plates_point.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">output of glass_plates_point.xml<br />
kernel execution time: 1.34 seconds</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh1gDwjpJzA0Q55e6jrnb398mahi93R1j_Rb_YabL_enai9_mZ7e1DNi3_JcRjs_6KIVSkBz85irIXFUZQLiLTQUGIDWyo7HBKywYOWldwSC3grg2T0-e4c2jgEF1YicGhT1NdMV2wFO6yN/s1600/glass_plates_area.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh1gDwjpJzA0Q55e6jrnb398mahi93R1j_Rb_YabL_enai9_mZ7e1DNi3_JcRjs_6KIVSkBz85irIXFUZQLiLTQUGIDWyo7HBKywYOWldwSC3grg2T0-e4c2jgEF1YicGhT1NdMV2wFO6yN/s400/glass_plates_area.png" title="glass_plates_area.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">output of glass_plates_area.xml<br />
kernel execution time: 1.37 seconds</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjDirLrw3MsJVH_jjXoPbdNNvMzuXWtRxl_O6lQzHa-uS0fLj13R5Mcih-MKKlFPEB4q08LgXsTC6dh0X4ncWKSMgCGRD5p1AazMtyfIVu8xckGp52QODaoyIPFnphUhtxtDExPAJyInl-Y/s1600/dragon_spot_light.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjDirLrw3MsJVH_jjXoPbdNNvMzuXWtRxl_O6lQzHa-uS0fLj13R5Mcih-MKKlFPEB4q08LgXsTC6dh0X4ncWKSMgCGRD5p1AazMtyfIVu8xckGp52QODaoyIPFnphUhtxtDExPAJyInl-Y/s400/dragon_spot_light.png" title="dragon_spot_light.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">output of dragon_spot_light.xml<br />
kernel execution time: 26.7 milliseconds</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEioYrTrXm-X9mX_LkM6dCva2sIu4yfn38dFHECja4z0THIfXYDSNJB5Ourh1qXUg1JDeAlwrIwi-DiGClnzzSlMvzMd5aifXLbI19FXsTDckVqq8FnvYmJjSGIu0PjiTS2l8QOimDbTi1tX/s1600/dragon_spot_light_msaa.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEioYrTrXm-X9mX_LkM6dCva2sIu4yfn38dFHECja4z0THIfXYDSNJB5Ourh1qXUg1JDeAlwrIwi-DiGClnzzSlMvzMd5aifXLbI19FXsTDckVqq8FnvYmJjSGIu0PjiTS2l8QOimDbTi1tX/s400/dragon_spot_light_msaa.png" title="dragon_spot_light_msaa.png" width="400" /></a></td></tr>
<tr><td class="tr-caption">output of dragon_spot_light_msaa.xml<br />
kernel execution time: 2.07 seconds</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiSe08Q9wuqF_CmXoP2MRFeyZTNveBS78XErnowzJCz_M76tLIlgIQuJrxrPEwKgyj98GjPW5RAQ3IaNMqMoW6cyJAvVll71HCcPROVblkLVXbdHmBpGwP-rNI06t6GVHWYStyl2VmmDCLX/s1600/spheres_dof.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiSe08Q9wuqF_CmXoP2MRFeyZTNveBS78XErnowzJCz_M76tLIlgIQuJrxrPEwKgyj98GjPW5RAQ3IaNMqMoW6cyJAvVll71HCcPROVblkLVXbdHmBpGwP-rNI06t6GVHWYStyl2VmmDCLX/s400/spheres_dof.png" title="spheres_dof.png" width="400" /></a></td></tr>
<tr><td class="tr-caption">output of spheres_dof.xml<br />
kernel execution time: 331 milliseconds</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj7kCdh8gqo0teqClTPMfCXDJBY0iZbtdpey3yO6bQOVYnHcdaRpw_YbXev-ed2nG17F78IEBxRg7z6WUJc5fnSYR8F6KjtS_eMpDXizROi9b1bFPFdespEVqvRaCaUEsnapLEoiB6s58Ws/s1600/metal_plates_area.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj7kCdh8gqo0teqClTPMfCXDJBY0iZbtdpey3yO6bQOVYnHcdaRpw_YbXev-ed2nG17F78IEBxRg7z6WUJc5fnSYR8F6KjtS_eMpDXizROi9b1bFPFdespEVqvRaCaUEsnapLEoiB6s58Ws/s400/metal_plates_area.png" title="metal_plates_area.png" width="400" /></a></td></tr>
<tr><td class="tr-caption">output of metal_plates_area.xml<br />
kernel execution time: 1.07 seconds</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjYq80UIxkysrEaPoRize7pElsdFqPrOYV3qcqeCjcdKSKOAhFEE5A3XWxwTwyOOsYjykdCNmgSXnQuo8kmBzDXll1mQDmY2_RfXOvuEExMJkj0AnxpKWujBhMDB7IpawJYkfa7rx3LS5RV/s1600/dragon_area.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjYq80UIxkysrEaPoRize7pElsdFqPrOYV3qcqeCjcdKSKOAhFEE5A3XWxwTwyOOsYjykdCNmgSXnQuo8kmBzDXll1mQDmY2_RfXOvuEExMJkj0AnxpKWujBhMDB7IpawJYkfa7rx3LS5RV/s400/dragon_area.png" title="dragon_area.png" width="400" /></a></td></tr>
<tr><td class="tr-caption">The dragon under an area light</td></tr>
</tbody></table>
</div>
Mustafa Işıkhttp://www.blogger.com/profile/08086920856262102562noreply@blogger.com0tag:blogger.com,1999:blog-5743751655059941858.post-2830307359820676442017-04-09T15:09:00.000+03:002017-04-09T22:27:19.736+03:00Assignment 4: Reflections and RefractionsRay tracing is, by its nature, good at simulating the reflection and refraction of light. The first thing to consider when simulating reflective and refractive materials is that we have to shoot rays besides the primary and shadow rays. When a primary ray hits a reflective or refractive surface, we should generate new rays based on the physics of light.<br />
<br />
A perfect specular surface reflects incoming light in a single direction. Although perfect specular surfaces are hard to find in nature (I am not sure whether they exist at all), they are implemented in every ray tracer since they are the starting point for simulating other reflective surfaces. On the other hand, when light interacts with a refractive surface, some of the photons are reflected while others are refracted, depending on the incident angle of the light. This effect can be easily seen in the picture of a lake below.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg79fmtMlcBXcDx3cnMN0MleW_PDz-VTMFcLvw528UxQDYua4XU1SpanWNKkdgyZGHFgEz4TunRtEiMHChc6KIEN5apJJ59U7Z2D3We7gP3pdIfCRqYXFIR7iBRhg5cJ7aP3tN_ZZdQql4y/s1600/The-mountain-lake.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="250" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg79fmtMlcBXcDx3cnMN0MleW_PDz-VTMFcLvw528UxQDYua4XU1SpanWNKkdgyZGHFgEz4TunRtEiMHChc6KIEN5apJJ59U7Z2D3We7gP3pdIfCRqYXFIR7iBRhg5cJ7aP3tN_ZZdQql4y/s400/The-mountain-lake.jpg" width="400" /></a></div>
<div>
<div>
In the pixels near the bottom-right corner of the image, we can easily see the transparent nature of water, because the incident angle (measured from the surface normal) is small there. However, as the view direction approaches perpendicular to the surface normal, the incident angle gets larger and the photons are mostly reflected. <a href="https://en.wikipedia.org/wiki/Fresnel_equations" target="_blank">Fresnel equations</a> explain this behaviour, and based on these equations we can compute what proportion of the photons is reflected and what proportion is refracted. Furthermore, <a href="https://en.wikipedia.org/wiki/Schlick%27s_approximation" target="_blank">Schlick’s approximation</a> is a well-known approximation of the Fresnel equations, which can be used to decrease the computation time. Also, <a href="https://en.wikipedia.org/wiki/Snell%27s_law" target="_blank">Snell's law</a> helps us compute the direction of the refracted light. Finally, by means of <a href="https://en.wikipedia.org/wiki/Beer%E2%80%93Lambert_law" target="_blank">Beer's law</a>, we can compute the attenuation of the light traveling through the medium. <a href="https://graphics.stanford.edu/courses/cs148-10-summer/docs/2006--degreve--reflection_refraction.pdf" target="_blank">This document</a> clearly explains the aforementioned equations and laws except Beer's law.</div>
</div>
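<div>
<br />
To make the laws above concrete, here is a minimal host-side C++ sketch of the four building blocks: mirror reflection, refraction via Snell's law, Schlick's approximation and Beer's law. It is not the code of my CUDA kernels; the names <i>Vec3</i>, <i>reflect</i>, <i>refract</i>, <i>schlick</i> and <i>beerTransmittance</i> are made up for this illustration, and all direction vectors are assumed to be normalized.</div>

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

inline Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
inline Vec3 operator*(float s, Vec3 v) { return {s * v.x, s * v.y, s * v.z}; }
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Mirror direction for a unit incident direction d and a unit normal n.
inline Vec3 reflect(Vec3 d, Vec3 n) { return d - 2.0f * dot(d, n) * n; }

// Snell's law in vector form. d and n are unit vectors with dot(d, n) < 0,
// and eta = n1 / n2. Returns false on total internal reflection.
inline bool refract(Vec3 d, Vec3 n, float eta, Vec3& t)
{
    float cosI = -dot(d, n);
    float sin2T = eta * eta * (1.0f - cosI * cosI);
    if (sin2T > 1.0f) return false;
    float cosT = std::sqrt(1.0f - sin2T);
    t = eta * d + (eta * cosI - cosT) * n;
    return true;
}

// Schlick's approximation of the Fresnel reflectance for the cosine of the
// incident angle and the refractive indices on both sides of the surface.
inline float schlick(float cosI, float n1, float n2)
{
    float r0 = (n1 - n2) / (n1 + n2);
    r0 *= r0;
    float m = 1.0f - cosI;
    return r0 + (1.0f - r0) * m * m * m * m * m;
}

// Beer's law: fraction of light that survives a path of length `distance`
// through a medium with absorption coefficient `sigmaA`.
inline float beerTransmittance(float sigmaA, float distance)
{
    return std::exp(-sigmaA * distance);
}
```

<div>
For an air-to-glass interface (n1 = 1, n2 = 1.5) at normal incidence, <i>schlick</i> returns r0 = 0.04, that is, about 4% of the light is reflected. When the ray travels from the denser to the thinner medium, the linked document recommends using the cosine of the transmitted angle in Schlick's formula instead; the sketch leaves that choice to the caller.</div>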
<div>
<br /></div>
<div>
Here are the scenes we had to render for this assignment.</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgQmDQGP6YVkCCoGImvusS2VW2eE-FLzwi_H9ZtHTi146J5t8vbZvE2RJEpnHZxppu_11pJbHHWGSozo6GZbJYCTrr7ik0X5o6jNRkS37gA-20Z-OcAHcEmhTx9z9oRSAHNOD9SlOnRikKr/s1600/cornellbox_glass.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgQmDQGP6YVkCCoGImvusS2VW2eE-FLzwi_H9ZtHTi146J5t8vbZvE2RJEpnHZxppu_11pJbHHWGSozo6GZbJYCTrr7ik0X5o6jNRkS37gA-20Z-OcAHcEmhTx9z9oRSAHNOD9SlOnRikKr/s400/cornellbox_glass.png" title="cornellbox_glass.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">output of cornellbox_glass.xml<br />
kernel execution time: 31.1 milliseconds</td></tr>
</tbody></table>
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhpiifBiX3tCrV660qHyCO8l8-m147kuPz5LGjWR7ksIhqWBVib09UdWPMusjlBwipV3-B-o20vBarcAT6yZ0Cjb0houyaQW_hgw5EtGI1uf5Hfb0RmLGSWcr3hrGBrxra1RdCr2KqHKlh5/s1600/horse_and_glass_mug.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhpiifBiX3tCrV660qHyCO8l8-m147kuPz5LGjWR7ksIhqWBVib09UdWPMusjlBwipV3-B-o20vBarcAT6yZ0Cjb0houyaQW_hgw5EtGI1uf5Hfb0RmLGSWcr3hrGBrxra1RdCr2KqHKlh5/s400/horse_and_glass_mug.png" title="horse_and_glass_mug.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">output of horse_and_glass_mug.xml<br />
kernel execution time: 571 milliseconds</td></tr>
</tbody></table>
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiKgqY6jJmwJGsqYMJC6HLMjbEMEZge_rx2sX9PR-mupGzVfocxlqYt_1Eh48TEOPIFJbITS474zVcNdB_QZ_aDM64u4zZ4TKDavo6Iz4v2ydrqjZAD2pKcnH3gmLxUh1LtntSaMK9XeS8D/s1600/glass_plates.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiKgqY6jJmwJGsqYMJC6HLMjbEMEZge_rx2sX9PR-mupGzVfocxlqYt_1Eh48TEOPIFJbITS474zVcNdB_QZ_aDM64u4zZ4TKDavo6Iz4v2ydrqjZAD2pKcnH3gmLxUh1LtntSaMK9XeS8D/s400/glass_plates.png" title="glass_plates.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">output of glass_plates.xml<br />
kernel execution time: 55.8 milliseconds</td></tr>
</tbody></table>
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhO-J3F7EvFVDYZp3q735PYp1jLB6Wmze2eulYadKLc17erjEpxH5IFxB9t6uiXUq5fr9zx1gJSfXlO3_6FhKfcxu43OvOZWvtH6XeX975hZVXQ3DL3S_6R_3IdbAZbS-_mUpiQlxAymMBd/s1600/killeroo_glass.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhO-J3F7EvFVDYZp3q735PYp1jLB6Wmze2eulYadKLc17erjEpxH5IFxB9t6uiXUq5fr9zx1gJSfXlO3_6FhKfcxu43OvOZWvtH6XeX975hZVXQ3DL3S_6R_3IdbAZbS-_mUpiQlxAymMBd/s400/killeroo_glass.png" title="killeroo_glass.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">output of killeroo_glass.xml<br />
kernel execution time: 531 milliseconds</td></tr>
</tbody></table>
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjq3NI32cxTYd9Qp7ojQfaeCmG9nVuE85jYEL4ZyLJSyuuMtLLbo5zX5BYnl9BzPZYiXYiKLjGKifd55HAhiTMF9HaYdf4juMWI2031BPnWBtaL5XK-igGFEydaWe-wWfe8c0AdI0x7Q2cC/s1600/killeroo_half_mirror.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjq3NI32cxTYd9Qp7ojQfaeCmG9nVuE85jYEL4ZyLJSyuuMtLLbo5zX5BYnl9BzPZYiXYiKLjGKifd55HAhiTMF9HaYdf4juMWI2031BPnWBtaL5XK-igGFEydaWe-wWfe8c0AdI0x7Q2cC/s400/killeroo_half_mirror.png" title="killeroo_half_mirror.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">output of killeroo_half_mirror.xml<br />
kernel execution time: 82.5 milliseconds</td></tr>
</tbody></table>
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgcADQi1TWj-nl0Tz-EinJ-ncuM_o-bu5a7-gG6fG8qxN8fyNijWam8Zq5vcSR5uVuO72xij1Kw-gZtdk805_Drjyvph8aZg5GMiYURU3IyTA4VUB9MS006D_SyIxcLUpEcRa55_D8raxrY/s1600/killeroo_mirror.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgcADQi1TWj-nl0Tz-EinJ-ncuM_o-bu5a7-gG6fG8qxN8fyNijWam8Zq5vcSR5uVuO72xij1Kw-gZtdk805_Drjyvph8aZg5GMiYURU3IyTA4VUB9MS006D_SyIxcLUpEcRa55_D8raxrY/s400/killeroo_mirror.png" title="killeroo_mirror.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">output of killeroo_mirror.xml<br />
kernel execution time: 82.5 milliseconds</td></tr>
</tbody></table>
Lessons learned:<br />
<div>
<ol>
<li>Rendering times increase considerably for scenes that include refractive materials. This is expected; however, the biggest reason for the slowdown is the stack used to recursively trace all the rays generated by these materials. Even without recursion, that is, just defining a 2 KB stack in every thread and not touching it, my ray tracer slows down by 2x. A workaround is as follows: instead of generating two rays at a refractive surface, we generate one ray and choose whether to refract or reflect it using a uniform random number and the reflectance given by the Fresnel equations or Schlick's approximation. The problem is that with just one sample per pixel the output is noisy and might look incorrect. Therefore, we should take many samples per pixel, since the output converges to the expected result as the number of samples grows. For these reasons, I postponed the implementation of this approach to the next assignment, where I will implement multisampling. Once it is in place, every ray-surface interaction will generate at most one ray, so I can get rid of the stack.</li>
</ol>
</div>
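<br />
The workaround from the lesson above can be sketched in a few lines of host-side C++. This is only an illustration under simplifying assumptions, not my CUDA code: <i>chooseReflection</i> and <i>estimatePixel</i> are hypothetical names, and the two radiance values stand in for whatever the reflected and refracted rays would return after being traced.<br />

```cpp
#include <cstdint>
#include <random>

// Returns true when the single continuation ray should be the reflected one.
// `reflectance` is the Fresnel (or Schlick) term in [0, 1], `u` is uniform in [0, 1).
inline bool chooseReflection(float reflectance, float u)
{
    return u < reflectance;
}

// Averages `samples` one-ray estimates for a pixel. `reflected` and `refracted`
// are placeholders for the radiance carried by each kind of ray. Because the
// reflected ray is picked with probability equal to the reflectance, no extra
// weighting is needed and the average converges to
// reflectance * reflected + (1 - reflectance) * refracted.
inline float estimatePixel(float reflectance, float reflected, float refracted,
                           int samples, std::uint32_t seed)
{
    std::mt19937 rng(seed);
    double sum = 0.0;
    for (int i = 0; i < samples; ++i) {
        // Use 24 random bits so that the value is exactly representable in [0, 1).
        float u = static_cast<float>(rng() >> 8) / 16777216.0f;
        sum += chooseReflection(reflectance, u) ? reflected : refracted;
    }
    return static_cast<float>(sum / samples);
}
```

With one sample the pixel is either fully "reflected" or fully "refracted", which is why a single sample per pixel looks wrong; averaging many samples recovers the blend that tracing both rays would have produced.<br />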
Mustafa Işık<br />
<br />
Assignment 3: Accelerating with BVH (published 2017-03-25)<br />
<br />
Drawing fancy stuff with ray tracing can be very slow even with modern GPUs. In the previous assignment, my ray tracer rendered a horse model, which contains approximately 11k triangles, in 3.54 seconds! Without any acceleration structure, the rendering time is O(k) for k triangles in the scene. 3.54 seconds might seem reasonable for the time being, but what happens if you want to render the <a href="http://graphics.stanford.edu/data/3Dscanrep/dragon.jpg" target="_blank">dragon</a> with 871k triangles and send lots of rays besides the primary rays? My ray tracer rendered the dragon model in 97 seconds while sending only primary and shadow rays. So, an acceleration structure is essential to render scenes in a reasonable amount of time.<br />
<div>
<br /></div>
<div>
This assignment required us to implement a BVH tree to speed things up. Before delving into coding, I found some papers implementing BVHs efficiently. Let's list some of them:</div>
<div>
<ul>
<li><a href="http://www.sci.utah.edu/~wald/Publications/2007/ParallelBVHBuild/fastbuild.pdf" target="_blank">On fast Construction of SAH-based Bounding Volume Hierarchies</a>: This is one of the most popular papers. It describes how to build BVHs with the surface area heuristic (SAH) in a binned fashion. I implemented this paper and got roughly 10-15% faster rendering times compared to a BVH built with median splitting.</li>
</ul>
</div>
<div>
<ul>
<li><a href="http://www.nvidia.ca/docs/IO/77714/sbvh.pdf" target="_blank">Spatial Splits in Bounding Volume Hierarchies</a>: This paper offers a way to construct BVH trees that give the best ray tracing performance. It splits the overlapping primitives and creates high-quality trees. However, construction is much slower than with other BVH techniques. It is definitely worth trying. As soon as I find some free time, I will implement this paper and share the results.</li>
</ul>
</div>
<div>
<ul>
<li><a href="http://rapt.technology/data/pssbvh.pdf" target="_blank">Parallel Spatial Splits in Bounding Volume Hierarchies</a>: This one presents a fast way to build SBVHs on modern CPUs by means of parallelism and vectorization.</li>
</ul>
</div>
<div>
<ul>
<li><a href="http://luebke.us/publications/eg09.pdf" target="_blank">Fast BVH Construction on GPUs</a>: This paper introduces the linear bounding volume hierarchy (LBVH) algorithm, which makes use of <a href="https://en.wikipedia.org/wiki/Z-order_curve" target="_blank">Morton codes</a>. While construction times are really fast, ray tracing performance is not that promising. However, it offers a useful trade-off when you have animated objects in your scene and have to rebuild the BVH every frame.</li>
</ul>
<div>
<ul>
<li><a href="https://research.nvidia.com/sites/default/files/publications/karras2012hpg_paper.pdf" target="_blank">Maximizing Parallelism in the Construction of BVHs, Octrees, and k-d Trees</a>: Another paper for LBVHs.</li>
</ul>
</div>
</div>
<div>
<ul>
<li><a href="https://mediatech.aalto.fi/~timo/publications/karras2013hpg_paper.pdf" target="_blank">Fast Parallel Construction of High-Quality Bounding Volume Hierarchies</a>: This paper takes a different approach to building BVHs. The algorithm first constructs an LBVH tree and then restructures the tree nodes to minimize the SAH cost.</li>
</ul>
<div>
<ul>
<li><a href="http://www.students.science.uu.nl/~3220516/advancedgraphics/papers/understanding_the_efficiency_of_ray_traversal_on_gpus.pdf" target="_blank">Understanding the Efficiency of Ray Traversal on GPUs</a>: This paper presents several ways to make your ray tracer run much faster on GPUs.</li>
</ul>
</div>
</div>
<div>
<ul>
<li><a href="http://wscg.zcu.cz/wscg2014/Full%5CM83-full.pdf" target="_blank">Review and Comparative Study of Ray Traversal Algorithms on a Modern GPU Architecture</a>: Finally, this paper compares several acceleration structures by building them on the CPU and ray tracing on the GPU. You might want to look at this before you decide which acceleration structure to go with.</li>
</ul>
</div>
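<div>
<br />
To give an idea of what "binned SAH" from the first paper in the list means, here is a simplified host-side C++ sketch along a single axis. It is not my actual builder: <i>Aabb</i>, <i>makeBox</i>, <i>Split</i> and <i>bestBinnedSplit</i> are hypothetical names, and the cost is the unnormalized sum of area times primitive count on both sides, which leaves out the constant traversal term and the division by the parent area because they do not change which split wins.</div>

```cpp
#include <algorithm>
#include <limits>
#include <vector>

struct Aabb {
    float lo[3] = { 1e30f,  1e30f,  1e30f};
    float hi[3] = {-1e30f, -1e30f, -1e30f};
    void grow(const Aabb& b) {
        for (int a = 0; a < 3; ++a) {
            lo[a] = std::min(lo[a], b.lo[a]);
            hi[a] = std::max(hi[a], b.hi[a]);
        }
    }
    float area() const {
        float dx = hi[0] - lo[0], dy = hi[1] - lo[1], dz = hi[2] - lo[2];
        return 2.0f * (dx * dy + dy * dz + dz * dx);
    }
};

inline Aabb makeBox(float x0, float y0, float z0, float x1, float y1, float z1) {
    Aabb b;
    b.lo[0] = x0; b.lo[1] = y0; b.lo[2] = z0;
    b.hi[0] = x1; b.hi[1] = y1; b.hi[2] = z1;
    return b;
}

// Primitives whose bin index is smaller than `bin` go to the left child.
// bin == -1 means that no useful split was found.
struct Split { int bin; float cost; };

inline Split bestBinnedSplit(const std::vector<Aabb>& prims, int axis, int numBins)
{
    Split best{-1, std::numeric_limits<float>::max()};
    const int total = static_cast<int>(prims.size());

    // Range of the primitive centroids along the chosen axis.
    float cMin = 1e30f, cMax = -1e30f;
    for (const Aabb& p : prims) {
        float c = 0.5f * (p.lo[axis] + p.hi[axis]);
        cMin = std::min(cMin, c);
        cMax = std::max(cMax, c);
    }
    if (cMax <= cMin) return best;  // all centroids coincide

    // Drop every primitive into the bin that contains its centroid.
    std::vector<Aabb> binBox(numBins);
    std::vector<int> binCount(numBins, 0);
    float scale = numBins / (cMax - cMin);
    for (const Aabb& p : prims) {
        float c = 0.5f * (p.lo[axis] + p.hi[axis]);
        int b = std::min(numBins - 1, static_cast<int>((c - cMin) * scale));
        binBox[b].grow(p);
        ++binCount[b];
    }

    // Right-to-left sweep: rightCost[b] = area * count of bins b..numBins-1.
    std::vector<float> rightCost(numBins, 0.0f);
    Aabb acc;
    int n = 0;
    for (int b = numBins - 1; b > 0; --b) {
        if (binCount[b] > 0) { acc.grow(binBox[b]); n += binCount[b]; }
        rightCost[b] = n > 0 ? acc.area() * n : 0.0f;
    }

    // Left-to-right sweep: evaluate every plane between two adjacent bins.
    Aabb left;
    int nl = 0;
    for (int b = 1; b < numBins; ++b) {
        if (binCount[b - 1] > 0) { left.grow(binBox[b - 1]); nl += binCount[b - 1]; }
        if (nl == 0 || nl == total) continue;  // one side would be empty
        float cost = left.area() * nl + rightCost[b];
        if (cost < best.cost) best = {b, cost};
    }
    return best;
}
```

<div>
A full builder would run this for all three axes, compare the winner against the cost of making a leaf, and recurse on both halves. Median splitting, in contrast, ignores the surface areas entirely.</div>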
<div>
<br /></div>
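<div>
The LBVH papers in the list sort the primitives by the Morton code of their centroids. A common way to compute a 30-bit code, shown here as a plain C++ sketch (the function names <i>expandBits</i> and <i>morton3D</i> are just the ones used in this illustration), assumes that the centroid has already been normalized to the unit cube:</div>

```cpp
#include <algorithm>
#include <cstdint>

// Spreads the lower 10 bits of v so that two zero bits separate
// each pair of consecutive original bits.
inline std::uint32_t expandBits(std::uint32_t v)
{
    v = (v * 0x00010001u) & 0xFF0000FFu;
    v = (v * 0x00000101u) & 0x0F00F00Fu;
    v = (v * 0x00000011u) & 0xC30C30C3u;
    v = (v * 0x00000005u) & 0x49249249u;
    return v;
}

// 30-bit Morton code for a point inside the unit cube [0, 1]^3.
// Each coordinate is quantized to 10 bits and the bits are interleaved.
inline std::uint32_t morton3D(float x, float y, float z)
{
    auto quantize = [](float f) {
        float s = std::min(std::max(f * 1024.0f, 0.0f), 1023.0f);
        return static_cast<std::uint32_t>(s);
    };
    return expandBits(quantize(x)) * 4u
         + expandBits(quantize(y)) * 2u
         + expandBits(quantize(z));
}
```

<div>
Sorting the primitives by this code places spatially close primitives next to each other in memory, which is what lets the hierarchy be built in parallel from the sorted array.</div>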
<div>
Now, let's have a look at the images rendered for this assignment and respective rendering times.</div>
<div>
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgEGq59SCs8P9ZF81odE90G04YmcJ9g__77T-coF4qflmJ53iBjkwmCraAUE_OYU1XG8cXHtVQDOkUSd_XDQqmAId6KsVqG26usn_cz6gSao-8zj4N9tT1iOUywZSGt6kpRauEY1v8ojqlN/s1600/killeroo.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgEGq59SCs8P9ZF81odE90G04YmcJ9g__77T-coF4qflmJ53iBjkwmCraAUE_OYU1XG8cXHtVQDOkUSd_XDQqmAId6KsVqG26usn_cz6gSao-8zj4N9tT1iOUywZSGt6kpRauEY1v8ojqlN/s400/killeroo.png" title="killeroo.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">output of killeroo.xml<br />
kernel execution time without BVH: 8860 milliseconds<br />
kernel execution time with BVH: 13.8 milliseconds<br />
speed-up: 642x</td></tr>
</tbody></table>
<div>
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEixX3Owv69TBwk4yRHTp6yv4iN3oMINZ88PAeG7t50y4uEtESR7wjxJSiQO4nHUprRFj_1SofGqarbB3Me6BR3iYAwDyeO50i5yYkXtnjQKqN60OUZxWl_UGgMtsK1-xbylZ4oH2KZxDvsr/s1600/dragon.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEixX3Owv69TBwk4yRHTp6yv4iN3oMINZ88PAeG7t50y4uEtESR7wjxJSiQO4nHUprRFj_1SofGqarbB3Me6BR3iYAwDyeO50i5yYkXtnjQKqN60OUZxWl_UGgMtsK1-xbylZ4oH2KZxDvsr/s400/dragon.png" title="dragon.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">output of dragon.xml<br />
kernel execution time without BVH: 97096 milliseconds<br />
kernel execution time with BVH: 21.5 milliseconds<br />
speed-up: 4516x</td></tr>
</tbody></table>
<div>
<span style="background-color: white; color: #444444; font-family: "arial" , "tahoma" , "helvetica" , "freesans" , sans-serif; font-size: 13px;"><br />Lessons learned:</span></div>
<div>
<ol>
<li>Every BVH node is 64 bytes, including the bounding boxes of the left and right children, the node indices of the left and right children, and the start and end indices into the triangle soup of the mesh. Since memory is the greatest bottleneck in CUDA, care must be taken with the ordering of the nodes. I found that trees laid out in depth-first order give better results (a 5% performance improvement) than trees laid out in breadth-first order.</li>
<li>In the <a href="http://raytracer.mustafaisik.net/2017/03/assignment-2-instancing-and-smooth.html" target="_blank">previous post</a>, I mentioned divergence. Well, I saw how it affected my ray tracer when traversing the nodes of the tree. My first traversal implementation, which suffered from divergence, checked whether a node is a leaf or an inner node. If it was an inner node, I pushed it onto the traversal stack, and if it was a leaf node, I did the ray-triangle intersections. However, while some of the threads in a warp do the computationally expensive ray-triangle intersections, the threads that did not hit a leaf node have to wait until these intersections finish. The reason is that threads in a warp execute in lock-step. That is, they execute the same instruction at every instruction cycle. If they reach a branch, the threads that take the branch execute their path while the others wait, and vice versa. Eventually, I came up with another implementation where leaf nodes are pushed onto a second stack. After the tree traversal is finished, every thread does the intersection tests by iterating through the stack of leaf nodes. I am sure there are better ways to solve this; however, this gave me a 15% performance improvement.</li>
</ol>
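<br />
One possible layout that adds up to the 64 bytes described in the first lesson is sketched below in C++. The field names are invented for this illustration and the convention that a negative child index marks a leaf is an assumption; only the sizes follow from the description (two boxes of six floats each plus four 32-bit indices). <i>depthFirstOrder</i> lists the nodes in the order a depth-first layout would store them.<br />

```cpp
#include <cstdint>
#include <vector>

struct Bounds {
    float min[3];
    float max[3];
};  // 6 floats = 24 bytes

struct BvhNode {
    Bounds leftBounds;           // bounding box of the left child   (24 bytes)
    Bounds rightBounds;          // bounding box of the right child  (24 bytes)
    std::int32_t leftChild;      // node index, negative for a leaf (assumed convention)
    std::int32_t rightChild;     // node index, negative for a leaf (assumed convention)
    std::int32_t firstTriangle;  // start index into the triangle soup
    std::int32_t lastTriangle;   // end index into the triangle soup
};

static_assert(sizeof(BvhNode) == 64, "a node is expected to occupy 64 bytes");

// Appends the node indices in depth-first (pre-order) order. Copying the nodes
// into a new array in this order, and remapping the child indices accordingly,
// yields a depth-first memory layout.
inline void depthFirstOrder(const std::vector<BvhNode>& nodes, std::int32_t node,
                            std::vector<std::int32_t>& order)
{
    order.push_back(node);
    if (nodes[node].leftChild < 0) return;  // leaf
    depthFirstOrder(nodes, nodes[node].leftChild, order);
    depthFirstOrder(nodes, nodes[node].rightChild, order);
}
```

In a depth-first layout, a node and its left subtree are contiguous in memory, so a thread that keeps descending to the left touches neighbouring addresses.<br />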
</div>
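The two-stack traversal from the second lesson can be sketched on the host side as follows. This is a simplified C++ illustration rather than my CUDA kernel: <i>TraversalNode</i> and <i>traverseDeferred</i> are hypothetical names, and the ray-box and ray-triangle tests are passed in as callables so that the control flow stays visible.<br />

```cpp
#include <vector>

struct TraversalNode {
    int left;   // index of the left child, negative for a leaf (assumed convention)
    int right;  // index of the right child, negative for a leaf
};

// Phase 1 walks the tree and only records the leaves whose boxes the ray hits.
// Phase 2 runs the expensive primitive tests for the recorded leaves.
// hitsBox(i) reports whether the ray overlaps the box of node i;
// intersectLeaf(i) returns the hit distance inside leaf i, or a value <= 0 on a miss.
template <typename BoxTest, typename LeafTest>
float traverseDeferred(const std::vector<TraversalNode>& nodes,
                       BoxTest hitsBox, LeafTest intersectLeaf, float tMax)
{
    if (nodes.empty()) return tMax;

    std::vector<int> nodeStack;  // inner nodes that still have to be visited
    std::vector<int> leafStack;  // leaves postponed to the second phase
    nodeStack.push_back(0);

    while (!nodeStack.empty()) {
        int i = nodeStack.back();
        nodeStack.pop_back();
        if (!hitsBox(i)) continue;
        if (nodes[i].left < 0) {
            leafStack.push_back(i);  // no triangle test here
            continue;
        }
        nodeStack.push_back(nodes[i].left);
        nodeStack.push_back(nodes[i].right);
    }

    float closest = tMax;
    for (int leaf : leafStack) {
        float t = intersectLeaf(leaf);
        if (t > 0.0f && t < closest) closest = t;
    }
    return closest;
}
```

The trade-off is that the first phase cannot use the closest hit found so far to skip distant boxes, so more leaves may be tested than in the interleaved version; in my case the reduced divergence still paid off.<br />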
Mustafa Işık<br />
<br />
Assignment 2: Instancing and Smooth Triangles (published 2017-03-16)<br />
<div class="separator" style="clear: both; text-align: center;">
</div>
Hello again. For this assignment, we were required to implement smooth triangles and object instances. Smooth triangles use three per-vertex normal vectors instead of a single normal vector for the whole triangle. By interpolating these normals with the barycentric coordinates of the intersection point, we get smoother shading. The outputs show this clearly.<br />
<br />
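The interpolation itself is short. Below is a minimal C++ sketch, assuming that the ray-triangle test already returned the barycentric coordinates (u, v) of the hit point with respect to the second and third vertices; <i>Vec3</i> and <i>smoothNormal</i> are names chosen for this example.<br />

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

// Interpolates the per-vertex normals n0, n1, n2 at the barycentric
// coordinates (u, v), where the hit point is (1 - u - v) * p0 + u * p1 + v * p2,
// and renormalizes the result.
inline Vec3 smoothNormal(Vec3 n0, Vec3 n1, Vec3 n2, float u, float v)
{
    float w = 1.0f - u - v;
    Vec3 n{w * n0.x + u * n1.x + v * n2.x,
           w * n0.y + u * n1.y + v * n2.y,
           w * n0.z + u * n1.z + v * n2.z};
    float len = std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
    return {n.x / len, n.y / len, n.z / len};
}
```

At a vertex the function returns that vertex's normal, and in between it blends the three normals, which hides the facets of the mesh in the shading.<br />
<br />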
The other task was instancing. Instead of storing a separate copy of the mesh in memory for every instance, I used an instance class so that every instance shares the same vertices but has its own transformation matrix. Briefly, we do not transform the vertices; instead, we apply the inverse transformation to the ray. This is more practical in general, and it makes things easier when it comes to spheres.<br />
<br />
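Here is a small C++ sketch of the idea for a sphere instance. To keep it short, the hypothetical <i>Instance</i> below supports only a per-axis scale followed by a translation instead of a full matrix; the other names are made up for this example as well. The important detail is that the transformed direction is not normalized, so the distance parameter found in object space is also valid in world space.<br />

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct Ray { Vec3 origin, dir; };

// Simplified instance transform: object space -> world space is
// "scale per axis, then translate".
struct Instance { Vec3 scale; Vec3 translation; };

// Applies the inverse transform to the ray. Points are translated and scaled,
// directions are only scaled, and the direction is deliberately NOT normalized.
inline Ray toObjectSpace(const Ray& r, const Instance& inst)
{
    Vec3 o{(r.origin.x - inst.translation.x) / inst.scale.x,
           (r.origin.y - inst.translation.y) / inst.scale.y,
           (r.origin.z - inst.translation.z) / inst.scale.z};
    Vec3 d{r.dir.x / inst.scale.x, r.dir.y / inst.scale.y, r.dir.z / inst.scale.z};
    return {o, d};
}

// Analytic intersection with the unit sphere at the origin: solves
// a t^2 + b t + c = 0 and keeps a = dot(dir, dir) instead of assuming 1.
// Returns the smallest positive t, or -1 if there is no hit.
inline float intersectUnitSphere(const Ray& r)
{
    float a = dot(r.dir, r.dir);
    float b = 2.0f * dot(r.origin, r.dir);
    float c = dot(r.origin, r.origin) - 1.0f;
    float disc = b * b - 4.0f * a * c;
    if (disc < 0.0f) return -1.0f;
    float s = std::sqrt(disc);
    float t0 = (-b - s) / (2.0f * a);
    float t1 = (-b + s) / (2.0f * a);
    if (t0 > 0.0f) return t0;
    if (t1 > 0.0f) return t1;
    return -1.0f;
}
```

For a unit sphere scaled by 2 and moved to z = 10, a world-space ray from the origin along +z yields t = 8 from the object-space test, which is exactly the world-space distance to the front of the sphere.<br />
<br />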
Here are the outputs:<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgJyvhLVn7EynQI0eFiPw4ijwkgid_Fh5Xx7__vnSyU-LHJu189tlli0fAPYMctld20RHaYPRPO-U4Ly4Z4tS73_9nMmFhfW5Yu08o2c0hvH2k-JVj6M1I7apTYZ-FwqPbVsJG4_-cc6oMU/s1600/horse.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgJyvhLVn7EynQI0eFiPw4ijwkgid_Fh5Xx7__vnSyU-LHJu189tlli0fAPYMctld20RHaYPRPO-U4Ly4Z4tS73_9nMmFhfW5Yu08o2c0hvH2k-JVj6M1I7apTYZ-FwqPbVsJG4_-cc6oMU/s400/horse.png" title="horse.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">output of horse.xml<br />
kernel execution time: 3.54 seconds</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiXzLOoDo8VSPEMunyLc5oqrMCLUS75WFZlM0mbiFwAAX9mPEbYkO-v0TYpLzxZusEnfCbhVRlLn3Fo79KnDIY23CmWa6cEsc_Qw41vS3qYFJis6vaykZ21xIMs5wx-eW9A2ccOz3aYW9Xn/s1600/horse_instanced.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiXzLOoDo8VSPEMunyLc5oqrMCLUS75WFZlM0mbiFwAAX9mPEbYkO-v0TYpLzxZusEnfCbhVRlLn3Fo79KnDIY23CmWa6cEsc_Qw41vS3qYFJis6vaykZ21xIMs5wx-eW9A2ccOz3aYW9Xn/s400/horse_instanced.png" title="horse_instanced.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">output of horse_instanced.xml<br />
kernel execution time: 10.48 seconds</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgBJ2tyh1QSeehyphenhyphenjS09BajDycyXIW4A-QwZ9dn0Eyg7sSsJhe4P-8yPzHOlk84WDcX6Wui4cBoLFqbmDmGiMcw_KqcOwpch_O_A3nerSS32Ah0geQa95EzpfrPqOzniKOg29kplxuVIqsX7/s1600/simple_transform.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgBJ2tyh1QSeehyphenhyphenjS09BajDycyXIW4A-QwZ9dn0Eyg7sSsJhe4P-8yPzHOlk84WDcX6Wui4cBoLFqbmDmGiMcw_KqcOwpch_O_A3nerSS32Ah0geQa95EzpfrPqOzniKOg29kplxuVIqsX7/s400/simple_transform.png" title="simple_transform.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">output of simple_transform.xml<br />
kernel execution time: 2.92 milliseconds</td></tr>
</tbody></table>
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhQO4s8MjDsqIZurpNEblb1XzmrQNuBABiKrA7zRV_71mmkR2VYhtb5LvYKhkGl5bvs23HTUMuQkWDkPDs37otPFyunc6PkkrPaYOxoV1zVLnCYqKsZE82MgOO276VnIJbFUG8np9N2WZz_/s1600/spheres_transform.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhQO4s8MjDsqIZurpNEblb1XzmrQNuBABiKrA7zRV_71mmkR2VYhtb5LvYKhkGl5bvs23HTUMuQkWDkPDs37otPFyunc6PkkrPaYOxoV1zVLnCYqKsZE82MgOO276VnIJbFUG8np9N2WZz_/s400/spheres_transform.png" title="spheres_transform.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">output of spheres_transform.xml<br />
kernel execution time: 3.32 milliseconds</td></tr>
</tbody></table>
Lessons learned:<br />
<ol>
<li>I used -use_fast_math in the command line of the nvcc compiler. This option makes nvcc replace some functions and arithmetic operations with intrinsics that approximate the results. It sped up my ray tracer by up to 1.25x. You can find the details in the related section of the <a href="http://docs.nvidia.com/cuda/cuda-c-programming-guide/#intrinsic-functions" target="_blank">CUDA C Programming Guide</a>.</li>
<li>Occupancy is one of the most important concepts when you are dealing with CUDA. It is basically the ratio of the number of active warps (a warp contains 32 threads) to the maximum number of active warps (the device limit). However, 100% occupancy does not always give the best results. For example, in my case, the number of registers per multiprocessor was the limiting factor. If I limit it to 32 registers per thread (65536 registers per multiprocessor / 2048 active threads per multiprocessor), occupancy increases to 100% but each thread can use only a limited number of registers. If I let nvcc decide, it uses 48 registers per thread, which gives poor occupancy. I manually set the limit to 36 and got the best results. However, the best value may differ from case to case.</li>
<li>CUDA-capable graphics cards do not have branch predictors. Using branches in your code is not the end of the world. However, if you use them unnecessarily, or if you put large amounts of code (or functions that contain large amounts of code) on different execution paths within the same warp, you will suffer from divergence. It is explained neatly in the <a href="http://docs.nvidia.com/cuda/cuda-c-best-practices-guide/#branching-and-divergence" target="_blank">CUDA C Best Practices Guide</a>.</li>
<li>Besides the CUDA-related issues, there was one thing that kept me busy: the instancing implementation for spheres. After we transform a ray to object space by applying the inverse transformation of the object, we calculate the distance parameter in that space. Our professor told us that we can compare this parameter with parameters calculated in other object spaces or in world space. This worked very well for triangles, but I was not sure whether it works for spheres. However, <a href="https://groups.google.com/forum/#!topic/pbrt/yqbvJQZ0MOU" target="_blank">here</a>, Matt Pharr verifies that it works for spheres as well. The only difference is that I was using the geometric approach for the ray-sphere intersection while they use the analytic solution. After some drawings and calculations on paper, I understood that a distance parameter calculated in object space cannot be used anywhere else if you are using the geometric approach. Finally, I adopted the analytic solution for the ray-sphere intersection. Keep in mind that you should not normalize the direction vector of the transformed ray. If you do, the distance parameter is no longer comparable across spaces. The reason is well explained in the last link.</li>
</ol>
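<br />
The occupancy numbers in the second lesson can be reproduced with a back-of-the-envelope calculation. The C++ sketch below is a deliberately simplified model: it only considers the register file and ignores register allocation granularity, block size and shared memory, so real tools may report different values. The function name and the default limits (65536 registers and 2048 threads per multiprocessor, as quoted above) are the assumptions of this example.<br />

```cpp
// Upper bound on occupancy imposed by register usage alone.
// Simplified model: ignores allocation granularity, block size and shared memory.
inline float registerLimitedOccupancy(int regsPerThread,
                                      int regsPerSM = 65536,
                                      int maxThreadsPerSM = 2048,
                                      int warpSize = 32)
{
    int maxWarps = maxThreadsPerSM / warpSize;                 // 64 with the defaults
    int warpsByRegs = regsPerSM / (regsPerThread * warpSize);  // warps that fit in the register file
    int active = warpsByRegs < maxWarps ? warpsByRegs : maxWarps;
    return static_cast<float>(active) / static_cast<float>(maxWarps);
}
```

Under this model, 32 registers per thread allow all 64 warps (100%), 48 registers allow 42 warps (about 66%) and 36 registers allow 56 warps (87.5%). The best-performing setting does not have to be the one with the highest occupancy, as the measurements above show.<br />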
Mustafa Işık<br />
<br />
Assignment 1: First Blood (published 2017-03-11)<br />
<br />
Hi! I will implement this ray tracer (I may use "ray tracer" and "path tracer" interchangeably) throughout this semester for the "Advanced Ray Tracing" course taught in Computer Engineering at Middle East Technical University. I built my first ray tracer on the CPU, for another course at Saarland University. Since then, I have always wanted to implement one that utilizes the massively parallel performance of GPUs. Thanks to this course, I found the motivation to start implementing an <a href="https://en.wikipedia.org/wiki/Embarrassingly_parallel" target="_blank">embarrassingly parallel</a> ray tracer.<br />
<br />
There were two options to get going: OpenCL or CUDA. Although OpenCL is a cross-platform solution, I decided to go with CUDA since it seems to have better support. I will run every program on my <a href="http://www.geforce.com/hardware/notebook-gpus/geforce-gtx-960m" target="_blank">GeForce GTX 960M</a>.<br />
<br />
I will report the execution time needed to produce every image. I only consider the kernel execution time on the GPU, which covers basically the whole rendering work. Therefore, memory allocations, copying host memory to the GPU's global memory, etc. are not included in the reported times.<br />
<br />
I designed two different modes to run the ray tracer. The first one is the photo mode. It basically takes a photo, saves it and returns. The video mode, on the other hand, presents an interactive real-time ray tracer. It can produce up to 500 frames per second for simple scenes. However, since I haven't implemented an acceleration structure yet, the ray tracer doesn't reach real-time performance in complex scenes.<br />
<br />
Enough talking, let's show the outputs and respective execution times.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj6LQhqlW3A4o2oF6u2czszPr4fkQY10Llk55qkaEJXMXIbE0hPwyfCI1vZTYYi-2hH0aFIOPQucXEOb8XnkIAynm9S2Rj_hSe6zeBo-7X5ykmwpAuHnX0LBJw6ypgfim3QpUwMzd35l6ZK/s1600/simple.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj6LQhqlW3A4o2oF6u2czszPr4fkQY10Llk55qkaEJXMXIbE0hPwyfCI1vZTYYi-2hH0aFIOPQucXEOb8XnkIAynm9S2Rj_hSe6zeBo-7X5ykmwpAuHnX0LBJw6ypgfim3QpUwMzd35l6ZK/s400/simple.png" title="simple.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">output of simple.xml<br />
kernel execution time: 2.7 milliseconds</td></tr>
</tbody></table>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiIFfc1J1KfWBdeKjfJNQbwvxj44Pp7zCMgjPnlCnToM2urWd1gvVwXCISmiSUeeQXm5mxd3UGDAdcv8SRvRkNafSxHBXGQ80SXS1q0rvUP0dqSJ83812nY3xngfzx1nm4nLSv6lkj0RVtT/s1600/simple_shading.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiIFfc1J1KfWBdeKjfJNQbwvxj44Pp7zCMgjPnlCnToM2urWd1gvVwXCISmiSUeeQXm5mxd3UGDAdcv8SRvRkNafSxHBXGQ80SXS1q0rvUP0dqSJ83812nY3xngfzx1nm4nLSv6lkj0RVtT/s400/simple_shading.png" title="simple_shading.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">output of simple_shading.xml<br />
kernel execution time: 3.42 milliseconds</td></tr>
</tbody></table>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhnM11lR7QDpz6Xv6FWTuKE1oC5mMQmm1jVsTXxLOYJi2x92cp0OyUXRipGEGy9qtpWHlWnheMBx3wcD_tmxohV2J_ugMYzW5o31mTipNo3EYV47HLBu_dIg_8o1cBhyuciT9859Nq4FqOf/s1600/bunny.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhnM11lR7QDpz6Xv6FWTuKE1oC5mMQmm1jVsTXxLOYJi2x92cp0OyUXRipGEGy9qtpWHlWnheMBx3wcD_tmxohV2J_ugMYzW5o31mTipNo3EYV47HLBu_dIg_8o1cBhyuciT9859Nq4FqOf/s400/bunny.png" title="bunny.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">output of bunny.xml<br />
kernel execution time: 351 milliseconds</td></tr>
</tbody></table>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
Let's list some lessons learned from this assignment:</div>
<div class="separator" style="clear: both; text-align: left;">
</div>
<ol>
<li>Choose the number of threads per block wisely. Generally, 8x8 or 16x16 gives the best performance.</li>
<li>Be careful when you copy an instance of a class with virtual methods from host memory to device memory. When you copy such an instance, the vtable pointer is copied along with the member variables. The problem is that this vtable pointer points to a memory location in host memory, so when you try to call a virtual method on the device side, the behaviour is undefined.</li>
<li>The "kernel execution time limit" cost me a whole day. It could have been really easy to solve, but I had not implemented a mechanism to check the errors returned by CUDA runtime calls. If your default display graphics card and the card on which your CUDA code executes are the same, you are likely to have this issue. If the GPU cannot finish executing the kernel within 5-6 seconds (I do not know the exact limit), the kernel execution is simply aborted. The workaround to this problem can be found <a href="http://mzshehzanayub.blogspot.com.tr/2012/10/cuda-kernel-time-out.html" target="_blank">here</a>.</li>
<li>As I mentioned in the previous item, write a macro, or whatever you prefer, to check the errors returned by CUDA calls.</li>
</ol>
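<br />
Regarding the first lesson: once a block size such as 16x16 is chosen, the number of blocks has to cover the whole image even when the resolution is not a multiple of the block size. A minimal C++ sketch of that arithmetic follows; <i>Dim2</i>, <i>gridFor</i> and <i>insideImage</i> are names invented for this example and stand in for the values one would pass to a kernel launch and the guard one would place at the top of a kernel.<br />

```cpp
struct Dim2 { int x, y; };

// Number of blocks per dimension so that every pixel gets a thread
// (integer division rounded up).
inline Dim2 gridFor(int width, int height, Dim2 block)
{
    return {(width + block.x - 1) / block.x,
            (height + block.y - 1) / block.y};
}

// Threads in the partially filled blocks on the right and bottom edges
// fall outside the image and must return early.
inline bool insideImage(int x, int y, int width, int height)
{
    return x < width && y < height;
}
```

For an 800x800 image and 16x16 blocks this gives a 50x50 grid; for 1000x600 it gives 63x38, where the last column and row of blocks are only partially used.<br />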
Mustafa Işık