March 18, 2024

Introduction

The Nvidia RTX 20 series needs no introduction, unless you’ve been living under a rock, in which case, here’s a recap. Nvidia held a presentation at SIGGRAPH 2018 where it revealed the Quadro RTX cards based on Turing, a brand new GPU micro-architecture built on TSMC’s 12nm process which, in Nvidia’s own words, provides the “Greatest Leap since 2006 CUDA GPU”. A follow-up presentation at Gamescom 2018 saw the reveal of three new RTX 20 series GPUs aimed at gamers, namely the RTX 2070, RTX 2080 and RTX 2080Ti. So what’s so good about Turing that Nvidia made such a bold claim, you might ask? Well, let’s start from the basics. Apart from the switch from GTX to RTX branding, the 12nm node, GDDR6 memory and a redesigned Founders Edition cooler with two fans, Turing is named after the British mathematician and codebreaker Alan Turing, widely known for his role in breaking Germany’s “Enigma” cipher in World War II and hailed as the father of modern computing.

A micro-architecture named after Turing has a lot to live up to, so Nvidia chased the Holy Grail of computer graphics: ray tracing, and more specifically, real-time recursive ray tracing, pioneered by Turner Whitted back in 1979, who in fact works at Nvidia now. I won’t go into the details of ray tracing, but in a nutshell, it allows for leaps and bounds better realism in computer graphics than what was previously achievable with simple rasterization. The catch is its extreme hunger for computational power, which is why doing it in real time was impossible until now: our graphics cards simply weren’t fast enough.
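To make “recursive” a little more concrete, here’s a minimal Whitted-style sketch in C++ against a single hard-coded reflective sphere. It’s a toy illustration of the idea, not any real renderer’s code: every hit spawns another ray, and a real renderer does this (plus shadow and refraction rays) for millions of pixels, every frame, which is exactly where the hunger for compute comes from.

```cpp
#include <cmath>
#include <cstdio>

struct Vec3 {
    double x, y, z;
    Vec3 operator+(Vec3 o) const { return {x + o.x, y + o.y, z + o.z}; }
    Vec3 operator-(Vec3 o) const { return {x - o.x, y - o.y, z - o.z}; }
    Vec3 operator*(double s) const { return {x * s, y * s, z * s}; }
};
double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
Vec3 normalize(Vec3 v) { return v * (1.0 / std::sqrt(dot(v, v))); }

struct Ray { Vec3 origin, dir; };

// The entire "scene": one reflective sphere.
const Vec3 kCenter{0.0, 0.0, -3.0};
const double kRadius = 1.0;

// Distance along a (normalized) ray to the sphere, or -1.0 on a miss.
double intersect(const Ray& r) {
    Vec3 oc = r.origin - kCenter;
    double b = 2.0 * dot(oc, r.dir);
    double c = dot(oc, oc) - kRadius * kRadius;
    double disc = b * b - 4.0 * c;
    if (disc < 0.0) return -1.0;
    double t = (-b - std::sqrt(disc)) / 2.0;
    return (t > 1e-4) ? t : -1.0;  // small epsilon avoids self-hits
}

// The heart of Whitted-style tracing: shade the hit point, then fire a
// secondary ray along the mirror direction and recurse until depth runs out.
double trace(const Ray& ray, int depth) {
    double t = intersect(ray);
    if (depth <= 0 || t < 0.0)
        return 0.5 * (ray.dir.y + 1.0);  // simple sky gradient on a miss
    Vec3 p = ray.origin + ray.dir * t;
    Vec3 n = normalize(p - kCenter);
    Vec3 refl = ray.dir - n * (2.0 * dot(ray.dir, n));
    // Blend a bit of local shade with the reflected (recursive) contribution.
    return 0.2 * (0.5 * (n.y + 1.0)) + 0.8 * trace({p, normalize(refl)}, depth - 1);
}

int main() {
    // One primary ray standing in for one pixel; a renderer fires millions.
    Ray primary{{0.0, 0.0, 0.0}, normalize({0.1, 0.1, -1.0})};
    std::printf("pixel shade = %.3f\n", trace(primary, 4));
}
```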

So what’s new about Turing that makes the impossible happen? Let me break it down for you. As you can see, there’s a lot going on in every single ray-traced frame rendered by Turing. First, you have the brand new RT core, short for Ray Tracing core, which has been in development since 2006. It’s a dedicated core that accelerates ray tracing, so at first, the RT core and traditional floating point 32 (FP32) shading run simultaneously; once FP32 shading is done, integer 32 (INT32) work runs alongside the RT core; and at the end comes DNN processing, which “guesses” the missing data in the frame using artificial intelligence so the GPU doesn’t have to keep waiting for all the other cores to complete their work. That last step runs on the Tensor core, a type of core that accelerates AI workloads and first made its debut in Volta, Nvidia’s professional-market-only micro-architecture. This is the result.
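A toy model helps show why this overlap matters. The stage timings below are entirely made up for illustration, and the scheduling is a gross simplification of whatever Turing actually does, but it captures the pitch: when shading and RT-core work run concurrently instead of back-to-back, a frame costs roughly as much as the longest stage instead of the sum of all of them.

```cpp
#include <algorithm>
#include <cstdio>

int main() {
    // Made-up per-frame workloads in milliseconds, purely for illustration.
    double fp32 = 20.0, int32 = 8.0, rt = 30.0, dnn = 5.0;

    // If every unit had to wait for the previous one, the costs would add up.
    double serial = fp32 + int32 + rt + dnn;

    // Turing's pitch: shading work overlaps with RT-core traversal, so the
    // frame costs roughly the longer of the two, plus the DNN pass at the end.
    double overlapped = std::max(fp32 + int32, rt) + dnn;

    std::printf("serial:     %5.1f ms  (%4.1f FPS)\n", serial, 1000.0 / serial);
    std::printf("overlapped: %5.1f ms  (%4.1f FPS)\n", overlapped, 1000.0 / overlapped);
}
```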

A single Turing-based GPU can run the now-famous ray-traced Star Wars demo in real time with each frame completed in “just” 45 ms, which works out to 22 FPS, compared to 55 ms (18 FPS) from four Volta V100s. Yes, four V100s, the same GPU that powers five of the top seven fastest supercomputers and costs over $10,000 apiece. And where does the Pascal-based 1080Ti stand? 308 ms, a mere 3 FPS. A single Turing GPU gets 22 FPS while the 1080Ti gets 3… JUST 3. That’s around 7x more performance.
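For reference, FPS is just the reciprocal of the frame time, so those numbers are easy to sanity-check:

```cpp
#include <cstdio>

int main() {
    // FPS = 1000 ms / per-frame time in ms; figures from the demo above.
    double turing = 45.0, four_voltas = 55.0, pascal = 308.0;
    std::printf("Turing (45 ms):     %.1f FPS\n", 1000.0 / turing);      // ~22
    std::printf("4x V100 (55 ms):    %.1f FPS\n", 1000.0 / four_voltas); // ~18
    std::printf("1080Ti (308 ms):    %.1f FPS\n", 1000.0 / pascal);      // ~3
    std::printf("Turing over 1080Ti: %.1fx\n", pascal / turing);         // ~6.8x
}
```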


7x more? GREAT! Now I can run AAA games on my triple 4K display at 144 FPS!

Whoa there! Hold your horses for a moment and look at the slide. Going from left to right in the graphic, you first have rasterization, then shading, then ray tracing. As you can see, all three micro-architectures spend most of their frame time on ray tracing, the very thing Turing has dedicated RT cores for, and the very thing that’s non-existent in every game on the market right now. That doesn’t mean the brand new Turing will outperform your GTX 1080Ti by 7x in today’s games. It means the games that support real-time ray tracing will be leaps and bounds faster. But even then, despite the enormous lead over Pascal, Turing still manages a mere 22 FPS. Not something you’d like to play games at.

This was all but confirmed by the folks over at PC Games Hardware, who managed to get their hands on a Turing-based RTX 2080Ti running Shadow of the Tomb Raider with real-time ray tracing enabled, and what’s supposed to be THE world’s fastest gaming graphics card was pushing a mere 45 FPS at just 1080p. Square Enix has responded that the game’s RTX implementation is still a work in progress, but going from 45 FPS at 1080p to 60 FPS at 4K? Or even 1440p? I’d be surprised if it managed a constant 60 FPS at 1080p, and that speaks volumes about real-time ray tracing. It also explains why Nvidia launched the 2080Ti alongside the 2080 when the Ti flavor usually arrives much later in the product cycle: if the 2080Ti struggles to push ~45 FPS at 1080p with ray tracing, imagine what kind of performance you’d get out of the 2080. It wouldn’t make Nvidia look good if its flagship GPU managed only ~30 FPS at 1080p, hence the 2080Ti at launch. Sure, ray tracing is now possible, and you can turn it on for the best eye candy in single-player games, but it’s not something everyone will run 24×7 given the toll it takes on performance, that is, if you’re playing one of the games that supports real-time ray tracing in the first place.

What about “normal” games? How does Turing perform there? Ah, see, this is where it gets a little complicated. During the Gamescom keynote, Nvidia founder and CEO Jensen Huang said that Turing possesses the ability to do floating point AND integer operations at the same time, which supposedly makes each Turing SM (Streaming Multiprocessor, 64 CUDA cores apiece) around 1.5x more performant than the previous generation’s. So does that mean Turing should be 50% faster than Pascal in games when matched core for core? Well, the question here is what you define as “previous generation”. If it’s Pascal, since Volta never made it to consumers, then sure, Turing does offer the advantage of concurrent FP and INT compute, but compared to Volta? No. Volta also possesses the ability to do FP and INT operations at the same time, yet you don’t see the Volta-based Titan V winning by miles because of BETTER SM units; it wins because it has MORE SM units. Why? Because this is something games need to be coded for. It’s not something Nvidia can do at the driver level; the game developer has to support it. In fact, AMD has something very similar called “Primitive Shaders” in its Vega micro-architecture, a feature that doesn’t really work for the same reason: it needs that support from game developers. Unlike AMD, though, Nvidia has the resources to incentivize game developers to code for it, but half-baked features bolted onto games as an afterthought never work as well as features designed in from a game’s very conception. So every game on the market for the next few years will either A) not support this feature at all or B) support it partially and fail to leverage it properly, meaning you can kiss your dreams of 1.5x more performance per SM goodbye. Not to mention that 1.5x was always the best-case scenario.

Realistically, on average, Turing packs ~15% more CUDA cores than the Pascal GPUs it replaces. Add the fact that Turing uses GDDR6, which provides more bandwidth than GDDR5 and GDDR5X, plus it’s built on the 12nm node versus 16nm for Pascal, which should mean a 5% to 10% clock speed bump as well, and you’re looking at around 25% to 30% more performance from Turing compared to Pascal.
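That 25% to 30% figure is easy to sanity-check with back-of-the-envelope arithmetic. The inputs below are the rough numbers from this article, not measured data:

```cpp
#include <cstdio>

int main() {
    // Rough inputs from the paragraph above: ~15% more CUDA cores and a
    // 5% to 10% clock bump from the 12nm node. Not measured data.
    double cores = 1.15;
    double clock_low = 1.05, clock_high = 1.10;

    // Shader throughput scales roughly with cores x clocks, all else equal.
    double low = cores * clock_low;    // ~1.21x
    double high = cores * clock_high;  // ~1.27x

    std::printf("compute-only estimate: +%.0f%% to +%.0f%%\n",
                (low - 1.0) * 100.0, (high - 1.0) * 100.0);
    // The extra GDDR6 bandwidth is what plausibly nudges this into the
    // 25% to 30% range quoted above.
}
```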


There’s no such thing as a bad product. It’s just bad pricing.

A 25% to 30% performance uplift seems rather decent, but it sure comes at a cost. You see, the RTX 20 series commands a premium.

Because of this pricing, we cannot compare the RTX 2070 with the GTX 1070, the RTX 2080 with the GTX 1080, or the RTX 2080Ti with the GTX 1080Ti, as the GTX 10 series cards have MSRPs of $399, $549 and $699 for the GTX 1070, 1080 and 1080Ti respectively, not to mention the recent price drops on the GTX 10 series due to the new launch, which make the disparity even worse for the new RTX lineup. The RTX 2070 may well be 25% faster than the 1070 it replaces, but when it’s punching in the GTX 1080 price range, we cannot compare it to the GTX 1070 anymore. We’ll have to wait for actual benchmarks, but one thing we can agree on is that, considering it’s been two years since the 10 series launched, Turing isn’t that great on price/performance in normal games, that is, if it even manages to beat the discounted 10 series there at all.

To be clear, 12nm is not a completely new node compared to the 16nm Nvidia was previously using; TSMC in fact groups its 12nm and 16nm nodes together. That means Turing doesn’t get much of the die-area savings expected from a new node, and since Turing packs two new types of cores compared to Pascal, namely the RT core and the Tensor core, while also increasing the number of SM units, we can safely assume the Turing GPUs powering the GeForce RTX cards are mammoths in terms of die size. That, along with the price premium GDDR6 currently commands over its predecessor, explains their higher cost, so I can understand the price increase on Nvidia’s part. But does Turing really cost that much more to make than Pascal? Only Nvidia knows. In the end, it’s you who’s footing the bill, so these tidbits shouldn’t matter to you; as the end user, you should simply look for value for the money you’re paying.
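To put the price/performance argument in numbers, here’s a hedged back-of-the-envelope comparison. The +25% Turing uplift is this article’s own estimate, the GTX 1080’s lead over the 1070 is an assumed round figure, and $599 is the RTX 2070 Founders Edition price:

```cpp
#include <cstdio>

int main() {
    struct Card { const char* name; double perf; double price; };

    // Performance is indexed to GTX 1070 = 100. The +25% Turing uplift is
    // this article's estimate; the GTX 1080's lead is an assumed round figure.
    const Card cards[] = {
        {"GTX 1070 ($399 MSRP)", 100.0, 399.0},
        {"GTX 1080 ($549 MSRP)", 125.0, 549.0},
        {"RTX 2070 ($599 FE)",   125.0, 599.0},
    };

    for (const Card& c : cards)
        std::printf("%-22s %.3f perf per dollar\n", c.name, c.perf / c.price);
    // If the 2070 lands near GTX 1080 performance at a higher price, its
    // perf per dollar trails even the two-year-old cards at launch MSRP.
}
```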

Speaking of pricing, Turing-based cards seem to come in two different versions: a “Reference Model” and a “Founders Edition”, each with different base and boost clock speeds. All we can say for certain is that Mr. Jensen Huang quoted pricing “starting from” $499, $599 and $999 for the RTX 2070, 2080 and 2080Ti respectively in the keynote. Compare that with the pricing being charged right now, and it seems Nvidia will make cheaper reference versions of the RTX cards available at a later date. Will the Reference Model have dual fans like the Founders Edition? When will it launch? We simply don’t know.


Conclusion

Turing is an amazing micro-architecture from a technological point of view. It takes an interesting approach to getting us to real-time ray tracing and delivers, but for the average Joe who just wants to play games, it doesn’t appear to be nearly as groundbreaking. What’s more, after accounting for the reduced 10 series pricing, it might not even offer better performance per dollar at launch. So should you buy one? I’ll ask you to wait for benchmarks instead, but from what we can see right now, Turing doesn’t offer any real reason for GTX 10 series owners to fork out the ludicrous Founders Edition pricing, unless you really want the ray tracing capability. Both ray tracing and the concurrent FP and INT execution will take time to catch on in the industry and to be implemented properly in new games, and by the time there are plenty of games supporting them, the new 30 series GPUs will be upon us. For the vast majority of games out right now, the performance uplift Turing offers over the products it replaces diminishes once pricing is taken into account. If you have money to burn and want the latest hardware in your system, then sure, buy Turing, though I’d still ask you to wait for the cheaper models; but hey, if you’re spending anywhere from $600 to $1200 for 25% to 30% more performance, I assume you don’t care about saving a few hundred bucks anyway.

As for everyone else? Wait for benchmarks. Right now we can only guess; it may be an educated guess, but a guess nonetheless. And while you’re waiting for those benchmarks, it would be better not to build up any crazy expectations either. Is Turing a failure then? Far from it. It’s one of the most revolutionary micro-architectures in years, but being the first of its kind, it will face first-generation issues: a catch-22 of poor support, because the hardware for all of this didn’t exist before, no game yet takes full advantage of it. A revolutionary micro-architecture, though, doesn’t necessarily make an awesome upgrade for 10 series owners. Turing does put the whole industry on a brand new path, and where Turing pioneered real-time ray tracing, the next 30 series will perfect it. Till then, wait for benchmarks and buy whatever provides the best performance per dollar for your use case, even if it’s the two-year-old GTX 10 series.