Gaming Market
Fueled by the on going growth of the gaming market and its insatiable demand for better 3D graphics, Nvidia has evolved the GPU into the world's leading parallel processing engine for many computationally-intensive applications. In addition to rendering highly realistic and immersive 3D games, Nvidia also accelerate content creation workflows, high performance computing and numerous artificial intelligence systems and applications. so in this article we will try to explore the trending architecture in Nvidia industry called Turing architecture in details and deep dive into the technology and specifications that lie behind "Nvidia RTX 2080TI"


NVIDIA's Turing Architecture 
Nividia's Microarchitecture is built on TSMC's 12nm "FFN" process Nvidia has been looking to drive and entire paradigm shift in how games are rendered and how PC video cards are evaluated. CEO Jensen Huang has called Turing NVIDIA's most important GPU architecture since 2006's Tesla GPU architecture and from a features standpoint it's clear that he's not overstating matters.

Turing is the code name for GPU microarchitecture developed by Nvidia it was named after prominent mathematician and computer scientist "Alan Turing". This architecture was first introduced in August 2018  at SIGGRAPH 2018. 
The turing architecture introduces the first consumer products capable of real-time ray-tracing for about 10GigaRays/sec , the main part was introducing a dedicated artifical intelligence processors("Tensor cores"). 
Turing delivers a dramatic boost in shading efficiency achieving 50% improvement in delivered performance per CUDA core compared to the Predecessor PASCAL generation.

Turing Tensor Cores
Turing Tensor Cores: Leveraging Deep Learning Inference for Gaming ...
Tensor cores are specialized execution units designed specifically for performing the tensor/matrix operations that are the core compute function used in Deep Learning.
The Turing tensor cores provide tremendous speed-ups for matrix computations at the heart of deep learning neural network training and inference operations.
 

the main changes in Turing than the precious voltas tensor cores is that it enabled new hardware datapaths, and performed dot products to accumulate into INT32 product,INT8 mode operates at double the FP16 rate, or 2048 integer operations per clock. INT4 mode operates at quadruple the FP16 rate, or 4096 integer ops per clock. which increased its speed very highly than before



Hybrid Rendering
NVIDIA is stating that the fastest Ge-force RTX part can cast 10 Billion (Giga) rays per second, which is compared to the unaccelerated Pascal is a 25x improvement in ray tracing performance. allowing the combination of rasterization and compute-based techniques with hardware-accelerated ray tracing and deep learning. It use cases for raytracing include reflections, diffuse global illumination, shadows, and ambient occlusion while primary visibility is is resolved through rasterization as before. 

In next posts we will explore together in details about raytracing , Hybrid and Dynamic Rendering and alot more about memory management technology inside GPUs



Author Image

About Author
Hisham Elreedy is Digital Electronics Engineer, Graphics Designer, Blogger, Youtuber. Inspired to teach all he knows from his experience in studying undergraduate engineering by creating useful posts