It has been 21 years since Nvidia launched what it claims to be the world’s first GPU, the GeForce 256. For the company, it introduced one of its most profitable brands, and for gamers, a sense of pride at being part of the ‘GeForce Family’. Two years ago, in its SIGGRAPH keynote, Nvidia ‘reinvented’ computer graphics with the launch of Turing. Now, after two whole years, Ampere-based RTX 3000 cards are making their way into gaming PCs, not only returning performance per rupee to roughly where it was in the Pascal days but also evolving all the technologies Nvidia pushed forward with RTX 20.
GeForce RTX 3000-Series GPU information:
Specification | RTX 3090 | RTX 3080 | RTX 3070 |
---|---|---|---|
GPU | Samsung 8N NVIDIA Custom Process GA102 | Samsung 8N NVIDIA Custom Process GA102 | Samsung 8N NVIDIA Custom Process GA104 |
Transistors | 28 billion | 28 billion | 17 billion |
SMs | 82 | 68 | 46 |
CUDA Cores | 10496 CUDA Cores | 8704 CUDA Cores | 5888 CUDA Cores |
Boost Clock | 1.7 GHz | 1.71 GHz | 1.73 GHz |
Shader FLOPS | 36 Shader TFLOPS | 30 Shader TFLOPS | 20 Shader TFLOPS |
RT FLOPS | 69 RT TFLOPS | 58 RT TFLOPS | 40 RT TFLOPS |
Tensor FLOPS | 285 Tensor TFLOPS | 238 Tensor TFLOPS | 163 Tensor TFLOPS |
Tensor Cores | 328 3rd Gen Tensor Cores | 272 3rd Gen Tensor Cores | 184 3rd Gen Tensor Cores |
Memory Interface | 384-bit | 320-bit | 256-bit |
Memory Speed | 19.5 Gbps | 19 Gbps | 14 Gbps |
Memory Bandwidth | 936 GB/s | 760 GB/s | 448 GB/s |
VRAM Size | 24GB GDDR6X | 10GB GDDR6X | 8GB GDDR6 |
L2 Cache | 6144 KB | 5120 KB | 4096 KB |
Max TGP | 350W | 320W | 220W |
PSU Requirement | 750W | 750W | 650W |
Price | ₹1,52,000 MSRP | ₹71,000 MSRP | ₹51,000 MSRP |
Release Date | September 24th | September 17th | October |
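
The Shader TFLOPS row can be sanity-checked from the CUDA core counts and boost clocks in the same table. The sketch below assumes the usual convention of counting one fused multiply-add (2 floating-point operations) per CUDA core per clock; the function and dictionary names are illustrative only.

```python
# Peak FP32 throughput, assuming each CUDA core retires one fused
# multiply-add (counted as 2 FLOPs) per clock at the boost frequency.
SPECS = {
    "RTX 3090": {"cuda_cores": 10496, "boost_ghz": 1.70},
    "RTX 3080": {"cuda_cores": 8704, "boost_ghz": 1.71},
    "RTX 3070": {"cuda_cores": 5888, "boost_ghz": 1.73},
}


def shader_tflops(cuda_cores, boost_ghz):
    """Return peak FP32 TFLOPS for a given core count and clock in GHz."""
    gflops = 2 * cuda_cores * boost_ghz  # cores x GHz gives GFLOPS
    return gflops / 1000.0


for name, spec in SPECS.items():
    print(f"{name}: {shader_tflops(**spec):.1f} TFLOPS")
```

Rounded to whole numbers, these come out at the 36, 30 and 20 TFLOPS quoted in the table.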
So let’s talk price. While the 3090, which Jensen claimed is a Titan series replacement, is priced at an eye-watering ₹1.52 lakh, its slightly cut-down sibling, the 3080, which replaces the flagship 2080 Ti, is only ₹71,000. That is a far cry from the ₹1.30 lakh people had to shell out for a 2080 Ti, and the new card is touted to be at least 40% faster at 4K. The more enticing card is the 3070, which, like the 1070 and the 2070 Super before it, delivers the performance of the outgoing flagship at a much more affordable price. However, even though gen-on-gen pricing in the US has not increased, prices here have gone up by about ₹6,000 to ₹7,000. Still, leaks suggested much worse pricing, and I am sure the looming threat of both the Xbox Series X and RDNA 2 GPUs from AMD forced Nvidia to consider pricing very seriously.
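As a rough back-of-the-envelope check on that value claim, the sketch below compares performance per rupee for the 3080 and the 2080 Ti. It uses only the prices quoted above and assumes the "at least 40% faster" figure is measured against the 2080 Ti, which is my reading rather than a confirmed benchmark; the function name is illustrative.

```python
def perf_per_rupee_gain(new_price, old_price, speedup):
    """How many times more performance per rupee the new card offers."""
    return speedup * old_price / new_price


RTX_3080_PRICE = 71_000     # MSRP quoted above, in rupees
RTX_2080TI_PRICE = 130_000  # price quoted above, in rupees
ASSUMED_SPEEDUP = 1.40      # assumption: 40% faster than the 2080 Ti at 4K

gain = perf_per_rupee_gain(RTX_3080_PRICE, RTX_2080TI_PRICE, ASSUMED_SPEEDUP)
print(f"Performance per rupee: {gain:.2f}x the 2080 Ti")
```

Under those assumptions the 3080 works out to roughly two and a half times the performance per rupee of the 2080 Ti.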
GeForce RTX 3000 Series Hardware
At the moment, Nvidia has not released its Ampere white paper, so I am going off the information Nvidia has shared on its website and what Ryan Smith over at AnandTech has revealed in his article.
GA102 and GA104 fabrication node
Nvidia claims it has worked with Samsung to implement a custom 8nm node for its gaming Ampere line of GPUs. This is a departure from the TSMC 7nm node it used for the GA100. We have only limited information about this process, mostly because it hasn’t been used in too many places, but at a high level it’s Samsung’s densest traditional, non-EUV process, derived from its earlier 10nm process. With rumored higher yields and lower costs, Nvidia, with its custom hardware and massive GPUs, needed this process to make a profitable lineup in this economically uncertain era.
Ampere Streaming Multiprocessor
Nvidia claims on its website:
“Streaming Multiprocessors (SMs) are at the heart of NVIDIA GPUs, and our newest NVIDIA Ampere Streaming Multiprocessors are our best yet. Compared to previous-gen SMs on the Turing-architecture GeForce RTX 2080 Ti, new NVIDIA Ampere Architecture SMs offer 2x FP32 throughput for superior performance.”
So the doubling of cores may not be a literal doubling of the core count but rather CUDA cores that are twice as powerful, which would explain the massive CUDA core numbers Nvidia is claiming. This should become clear once we get the white paper in our hands on September 17th.
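One thing the spec table does let us check today is how many CUDA cores Nvidia is counting per SM. The sketch below divides the listed CUDA core counts by the SM counts. The RTX 2080 Ti figures (4352 cores, 68 SMs) are not from this article; they are taken from that card's public spec sheet and included only for comparison.

```python
# (CUDA cores, SMs) as listed in the spec table above.
AMPERE = {
    "RTX 3090": (10496, 82),
    "RTX 3080": (8704, 68),
    "RTX 3070": (5888, 46),
}

# Assumption: public RTX 2080 Ti specs, not taken from this article.
TURING_2080TI = (4352, 68)


def cores_per_sm(cuda_cores, sms):
    """Return how many CUDA cores are counted per streaming multiprocessor."""
    return cuda_cores // sms


for name, (cores, sms) in AMPERE.items():
    print(f"{name}: {cores_per_sm(cores, sms)} CUDA cores per SM")

print(f"RTX 2080 Ti: {cores_per_sm(*TURING_2080TI)} CUDA cores per SM")
```

By this count every Ampere card lists twice as many CUDA cores per SM as the 2080 Ti, which lines up with the "2x FP32 throughput" wording. How those extra FP32 lanes are actually implemented is something only the white paper can confirm.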
2nd Generation Ray Tracing Cores & 3rd Generation Tensor Cores
NVIDIA made real-time videogame ray tracing a reality with the invention of Ray Tracing Cores, dedicated processing cores on the GPU specifically designed to tackle performance-intensive ray tracing workloads. On GeForce RTX 3000 Series GPUs, they are introducing 2nd Generation RT Cores, which have up to 2x the throughput.
The addition of 3rd Generation Tensor Cores hugely improves the AI functionality of the cards and makes DLSS a lot more performant, to the point that a 3090 can drive games at 8K and 60 fps with DLSS enabled.
PCIe Gen 4.0 Support
Right out of the gate, Nvidia is advertising its move to PCIe Gen 4 for better I/O performance. In most cases even a 2080 Ti is not bottlenecked by a PCIe 3.0 x16 slot, but cases have started creeping up where even the PCIe 3.0-limited RTX 2080 Ti shows some frame rate improvement when moved to a PCIe 4.0 system powered by Ryzen 3000. The lowly (by 3080/3090 standards) 5700 XT also shows this phenomenon at 4K. It’s no surprise that Ampere moved to PCIe 4.0: with up to 50% more performance on tap, the link can become a very big factor.
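To put the generational jump in numbers, the sketch below computes the theoretical one-direction bandwidth of an x16 link from the published per-lane transfer rates (8 GT/s for PCIe 3.0, 16 GT/s for PCIe 4.0) and the 128b/130b line encoding both generations use. These are peak link figures, not measured game performance, and the function name is illustrative.

```python
def pcie_bandwidth_gb_s(transfer_rate_gt_s, lanes=16):
    """Theoretical one-direction bandwidth of a PCIe 3.0/4.0 link in GB/s."""
    encoding_efficiency = 128 / 130  # 128b/130b line encoding overhead
    gigabits_per_second = transfer_rate_gt_s * lanes * encoding_efficiency
    return gigabits_per_second / 8   # bits to bytes


gen3 = pcie_bandwidth_gb_s(8.0)    # PCIe 3.0: 8 GT/s per lane
gen4 = pcie_bandwidth_gb_s(16.0)   # PCIe 4.0: 16 GT/s per lane

print(f"PCIe 3.0 x16: {gen3:.2f} GB/s")
print(f"PCIe 4.0 x16: {gen4:.2f} GB/s")
```

The link doubles from about 15.75 GB/s to about 31.5 GB/s, so any workload that was already close to saturating PCIe 3.0 has room to grow.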
PSU Requirements:
SKU | Power Supply Requirements |
---|---|
GeForce RTX 3090 Founders Edition | 750W Required |
GeForce RTX 3080 Founders Edition | 750W Required |
GeForce RTX 3070 Founders Edition | 650W Required |
- A lower power rating PSU may work depending on system configuration. Please check with PSU vendor.
- RTX 3090 and 3080 Founders Edition require a new type of 12-pin connector (adapter included).
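
To see why a smaller PSU "may work depending on system configuration", here is a minimal sketch that estimates how heavily a PSU would be loaded. The GPU figure is the 3080's 320W TGP from the spec table. The CPU and rest-of-system wattages are hypothetical placeholders, and the estimate ignores transient power spikes and PSU ageing, so treat it as an illustration rather than buying advice.

```python
def psu_load_fraction(psu_watts, gpu_tgp_watts, cpu_watts, other_watts):
    """Return the fraction of the PSU's rated capacity the system would draw."""
    total_draw = gpu_tgp_watts + cpu_watts + other_watts
    return total_draw / psu_watts


RTX_3080_TGP = 320             # from the spec table above
HYPOTHETICAL_CPU_WATTS = 125   # assumption: a high-end desktop CPU
HYPOTHETICAL_OTHER_WATTS = 75  # assumption: board, RAM, drives, fans

for psu in (650, 750):
    load = psu_load_fraction(
        psu, RTX_3080_TGP, HYPOTHETICAL_CPU_WATTS, HYPOTHETICAL_OTHER_WATTS
    )
    print(f"{psu}W PSU: {load:.0%} of rated capacity")
```

With these placeholder numbers a 650W unit would sit at about 80% load and a 750W unit at about 69%, which is the kind of margin the vendor recommendation is buying you.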
Cooling
While the 220W 3070 FE card uses a somewhat familiar dual-fan design, the 3080 and especially the 3090 are beefcakes. Both of these GPUs have a unique V-shaped PCB and fans facing in opposite directions for cooling. Both cards come with a vapor chamber design, and Nvidia claims core temperatures up to 30°C lower with this new cooler.
GDDR6X – World’s Fastest Graphics Memory
Nvidia claims it is using the world’s fastest graphics memory in the form of GDDR6X, and while that is technically true, it is also misleading. HBM2 exists and can offer better bandwidth for less power. However, it is very expensive and apparently not feasible on a ₹1.5 lakh card.
According to Nvidia, GDDR6X uses innovative PAM4 signal transmission technology to once again double the data rate per clock, delivering unprecedented graphics memory performance to feed the most data-hungry workloads, such as gaming, professional visualization, and AI inference.
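The "double the data rate" claim comes down to how many bits each signal symbol carries. Conventional two-level (NRZ) signalling, as used by GDDR6, carries one bit per symbol, while four-level PAM4 carries two. The sketch below shows that, then derives total memory bandwidth from the per-pin speeds and bus widths listed in the spec table. The function names are illustrative.

```python
import math


def bits_per_symbol(voltage_levels):
    """Bits carried by one symbol with the given number of signal levels."""
    return int(math.log2(voltage_levels))


def memory_bandwidth_gb_s(pin_rate_gbps, bus_width_bits):
    """Total memory bandwidth in GB/s from per-pin rate and bus width."""
    return pin_rate_gbps * bus_width_bits / 8  # bits to bytes


print("NRZ :", bits_per_symbol(2), "bit per symbol")
print("PAM4:", bits_per_symbol(4), "bits per symbol")

# (per-pin speed in Gbps, interface width in bits) from the spec table.
CARDS = {
    "RTX 3090": (19.5, 384),
    "RTX 3080": (19.0, 320),
    "RTX 3070": (14.0, 256),
}

for name, (rate, width) in CARDS.items():
    print(f"{name}: {memory_bandwidth_gb_s(rate, width):.0f} GB/s")
```

This gives 936 GB/s for the 3090 and 760 GB/s for the 3080, while the 3070's 14 Gbps GDDR6 on a 256-bit bus works out to 448 GB/s.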
Other Features and Technologies:
- NVIDIA Reflex
  - NVIDIA Reflex is a new suite of technologies that optimize and measure system latency in competitive games.
  - It includes:
    - NVIDIA Reflex Low-Latency Mode, a new technology to reduce game and rendering latency by up to 50 per cent. Reflex is being integrated in top competitive games including Apex Legends, Fortnite, Valorant, Call of Duty: Warzone, Call of Duty: Black Ops Cold War, Destiny 2, and more.
    - NVIDIA Reflex Latency Analyzer, which detects clicks coming from the mouse and then measures the time it takes for the resulting pixels (for example, a gun muzzle flash) to change on the screen. Reflex Latency Analyzer is integrated in new 360Hz NVIDIA G-SYNC Esports displays and supported by top esports peripherals from ASUS, Logitech, Razer, and SteelSeries.
  - Measuring system latency has previously been extremely difficult to do, requiring over $7,000 in specialized high-speed cameras and equipment.
- NVIDIA Broadcast
  - New AI-powered Broadcast app
  - Three key features:
    - Noise Removal: remove background noise from your microphone feed – be it a dog barking or the doorbell ringing. The AI network can even be used on incoming audio feeds to mute that one keyboard-mashing friend who won’t turn on push-to-talk.
    - Virtual Background: remove the background of your webcam feed and replace it with game footage, a replacement image, or even a subtle blur.
    - Auto Frame: zooms in on you and uses AI to track your head movements, keeping you at the centre of the action even as you shift from side to side. It’s like having your own cameraperson.
- RTX I/O
  - A suite of technologies that enable rapid GPU-based loading and game asset decompression, accelerating I/O performance by up to 100x compared to hard drives and traditional storage APIs.
  - When used with Microsoft’s new DirectStorage for Windows API, RTX I/O offloads up to dozens of CPU cores’ worth of work to your RTX GPU, improving frame rates, enabling near-instantaneous game loading, and opening the door to a new era of large, incredibly detailed open-world games.
- NVIDIA Machinima
  - An easy-to-use cloud-based app that provides tools to enable gamers’ creativity, for a new generation of high-quality machinima.
  - Users can take assets from supported games, and use their web camera and AI to create characters, add high-fidelity physics and face and voice animation, and publish film-quality cinematics using the rendering power of their RTX 30 Series GPU.
Conclusion
With the reveal of the RTX 3000 series, Nvidia has corrected many of the issues the tech press had found with the RTX 20 series. The return to much better pricing, especially at the high end, and a true generational leap mean that existing Maxwell, Pascal, and some FPS-junkie Turing owners will have a compelling upgrade to look forward to. However, AMD is also looking very promising, and buyers around Diwali and Christmas will have two very compelling options across both graphics vendors.