March 29, 2024

Introduction

TSMC has not delivered a process shrink in the last 3 years, so GPU manufacturers have had to pack ever more efficient architectures into the same 28nm manufacturing process.

Nvidia showed off its first-generation Maxwell parts in the form of the GTX 750 and GTX 750 Ti, which delivered an unbelievable amount of performance within a very low power envelope, something AMD has not yet fully answered.

We will be reviewing the second-generation Maxwell GPU, GM204, in its gaming-oriented flagship form, the GTX 980.

gm204_die_1

GM204 follows up on the earlier GM107, along with a few tweaks and tricks of its own that truly make it Maxwell V2. We will look at it in detail, but first let us take a look at what we have today and where exactly it stands relative to the older Nvidia cards.

NVIDIA GPU Specification Comparison

| Specification | GTX 980 | GTX 970 (Corrected) | GTX 780 Ti | GTX 770 |
|---|---|---|---|---|
| CUDA Cores | 2048 | 1664 | 2880 | 1536 |
| Texture Units | 128 | 104 | 240 | 128 |
| ROPs | 64 | 56 | 48 | 32 |
| Core Clock | 1126MHz | 1050MHz | 875MHz | 1046MHz |
| Boost Clock | 1216MHz | 1178MHz | 928MHz | 1085MHz |
| Memory Clock | 7GHz GDDR5 | 7GHz GDDR5 | 7GHz GDDR5 | 7GHz GDDR5 |
| Memory Bus Width | 256-bit | 256-bit | 384-bit | 256-bit |
| VRAM | 4GB | 4GB | 3GB | 2GB |
| FP64 | 1/32 FP32 | 1/32 FP32 | 1/24 FP32 | 1/24 FP32 |
| TDP | 165W | 145W | 250W | 230W |
| GPU | GM204 | GM204 | GK110 | GK104 |
| Transistor Count | 5.2B | 5.2B | 7.1B | 3.5B |
| Manufacturing Process | TSMC 28nm | TSMC 28nm | TSMC 28nm | TSMC 28nm |
| Launch Date | 09/18/14 | 09/18/14 | 11/07/13 | 05/30/13 |
| Launch Price | $549 | $329 | $699 | $399 |
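To put a couple of the table's figures in perspective, here is a small Python sketch that derives peak memory bandwidth and compares board power. It assumes that "7GHz GDDR5" means an effective data rate of 7 Gbps per pin; the dictionary layout and function names are ours, purely for illustration.

```python
# Figures copied from the specification table above.
# Assumption: "7GHz GDDR5" is read as 7 Gbps effective per memory pin.
CARDS = {
    "GTX 980":    {"mem_gbps": 7.0, "bus_bits": 256, "tdp_w": 165},
    "GTX 970":    {"mem_gbps": 7.0, "bus_bits": 256, "tdp_w": 145},
    "GTX 780 Ti": {"mem_gbps": 7.0, "bus_bits": 384, "tdp_w": 250},
    "GTX 770":    {"mem_gbps": 7.0, "bus_bits": 256, "tdp_w": 230},
}


def memory_bandwidth_gbs(mem_gbps, bus_bits):
    """Peak bandwidth in GB/s: per-pin data rate times bus width, divided by 8 bits per byte."""
    return mem_gbps * bus_bits / 8


def tdp_ratio(card, baseline):
    """TDP of `card` as a fraction of the TDP of `baseline`."""
    return CARDS[card]["tdp_w"] / CARDS[baseline]["tdp_w"]


if __name__ == "__main__":
    for name, spec in CARDS.items():
        bandwidth = memory_bandwidth_gbs(spec["mem_gbps"], spec["bus_bits"])
        print(f"{name}: {bandwidth:.0f} GB/s peak, {spec['tdp_w']} W TDP")
    ratio = tdp_ratio("GTX 980", "GTX 780 Ti")
    print(f"GTX 980 TDP is {ratio:.2f}x that of the GTX 780 Ti")
```

By this arithmetic the GTX 980 has 224 GB/s of peak bandwidth against 336 GB/s on the GTX 780 Ti, and its TDP works out to 0.66 of the older card's.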

The top-end GM204 lineup consists of two cards as of now, the GTX 980 and its slightly cut-down sibling, the GTX 970, which from an NVIDIA product standpoint will replace the GTX 780/780 Ti and GTX 770 respectively.

The top-end part is the GTX 980, which looks to be roughly 10% faster than the GTX 780 Ti purely by specs while consuming about one third less power (165W TDP versus 250W). This makes the GTX 980 the fastest single GPU right now, but at a price.

Maxwell 1 : the birth of GM107

Before we jump to the GTX 980’s specifics and performance in games, let us look back at Maxwell 1 and the GM107 chip which started it all.

GM107, or Maxwell V1 as we call it, was the end product of Nvidia’s “mobile-first” design philosophy.

The mobile SoC market is unforgiving: power and thermal considerations have to come first, and the GPU is then designed around them. This philosophy led Nvidia to design its GPU lineup bottom-up, building for mobile first and then scaling up for the GeForce desktop/laptop segment.

Maxwell in itself is not very different from the earlier Kepler architecture. What Nvidia did was rework the old architecture with a few new design tweaks to increase efficiency.

They went from this

keplar_smx
Kepler SMX

to this

maxwell_smm
Maxwell SMM

We can clearly see a couple of tweaks at the die level.

The Kepler SMX was a flat design with 4 warp schedulers and 15 different execution blocks, while the SMM has been partitioned. Physically each SMM is still one unit, but logically the execution blocks each warp scheduler can access have been marked out.

The next tweak is cutting back on shared resources. While they work very well when the workload can fill them, they are pretty useless and draw unnecessary power when idle. So while Nvidia has lost a bit of performance, it makes up for it in power and space efficiency.

Last but not least, Nvidia made some IPC tweaks: the scheduler has been rewritten to make it more efficient. On the memory side we see an increase in L2 cache, from 256KB on GK107 to 2MB on GM107, and from 512KB on GK104 to 2MB on GM204. There are also some die-level optimizations, but Nvidia wouldn’t make those public.

Introducing GM204:

With the main Maxwell architecture taken care of, let’s dive into Maxwell V2, the GM204.

GTX_980_Block

Even though a few new tricks and secret sauces went into GM204, functionally it’s still a Maxwell GM107, just a bigger version with more SMMs and ROPs. This is understandable, since Nvidia was working within the limitations of the ageing 28nm process and had to squeeze the last bits of performance out of it.

While GM107 was built on 5 SMMs, GM204 is made up of 16, divided into 4 GPCs in place of GM107’s one GPC. Paired with 64 ROPs and 4×64-bit memory controllers, this gives a 4x increase in the number of ROPs and a 2x increase in memory bus width.
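The scaling from GM107 to GM204 can be checked with a few lines of Python. This is only a sketch built from the unit counts quoted in this review; the dictionary keys and the helper name are ours, and GM107's bus width is not stated directly, so it is inferred here from the 2x figure.

```python
# Unit counts taken from the article text and the specification table.
GM107 = {"smm": 5, "gpc": 1, "rops": 16}
GM204 = {"smm": 16, "gpc": 4, "rops": 64, "cuda_cores": 2048, "bus_bits": 4 * 64}


def scale(new, old, key):
    """How many times larger `key` is on the new chip than on the old one."""
    return new[key] / old[key]


# Per-block figures implied by the totals.
cores_per_smm = GM204["cuda_cores"] // GM204["smm"]
smm_per_gpc = GM204["smm"] // GM204["gpc"]

# Assumption: GM107's bus width is inferred from the stated 2x increase.
gm107_bus_bits = GM204["bus_bits"] // 2

if __name__ == "__main__":
    print(f"CUDA cores per SMM: {cores_per_smm}")
    print(f"SMMs per GPC on GM204: {smm_per_gpc}")
    print(f"SMM scaling: {scale(GM204, GM107, 'smm'):.1f}x")
    print(f"ROP scaling: {scale(GM204, GM107, 'rops'):.1f}x")
    print(f"Implied GM107 bus width: {gm107_bus_bits}-bit")
```

So the ROP count grows faster (4x) than the SMM count (3.2x), which fits the point that GM204 is a scaled-up GM107 with proportionally more back-end resources.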

maxwell_smm

In terms of design, the GM204 SMM is identical to the GM107 SMM; however, GM204 has 96KB of shared memory while GM107 has 64KB. The PolyMorph Engine has also been updated.

Other than that, each SMM has 4 texture units shared between every 2 of its partitions (8 per SMM, which matches the 128 total in the specification table) and a 512Kb register file, while the chip as a whole moves to a 16:1 ROP:MC ratio.

GM204DieB

One of the main changes from GM107 is the increase in the ROP:MC ratio on GM204. It went from 8:1 to 16:1; the last change of this kind came when GDDR5 was introduced.

ROP To Memory Controller Ratios

| GPU | ROP:MC Ratio | Total ROPs |
|---|---|---|
| Maxwell (GM204) | 16:1 | 64 |
| Maxwell (GM107) | 8:1 | 16 |
| Kepler (GK110) | 8:1 | 48 |
| Fermi (GF110) | 8:1 | 48 |
| GT200 | 4:1 | 32 |
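The ratios above also tell us how many memory controllers each chip has, since the controller count is simply total ROPs divided by the ROPs attached to each controller. The Python sketch below works that out; the 64-bit controller width is stated in this review only for GM204, so applying it to the older chips is an assumption on our part, and all names are illustrative.

```python
# (GPU, ROPs per memory controller, total ROPs) from the table above.
ROP_TABLE = [
    ("Maxwell (GM204)", 16, 64),
    ("Maxwell (GM107)", 8, 16),
    ("Kepler (GK110)", 8, 48),
    ("Fermi (GF110)", 8, 48),
    ("GT200", 4, 32),
]

# Stated for GM204 in the text; assumed here for the other chips.
CONTROLLER_WIDTH_BITS = 64


def controller_count(rops_per_mc, total_rops):
    """Number of memory controllers implied by a ROP:MC ratio."""
    count, remainder = divmod(total_rops, rops_per_mc)
    if remainder:
        raise ValueError("total ROPs must be a multiple of the ROP:MC ratio")
    return count


def implied_bus_width(rops_per_mc, total_rops):
    """Memory bus width in bits, assuming 64-bit controllers."""
    return controller_count(rops_per_mc, total_rops) * CONTROLLER_WIDTH_BITS


if __name__ == "__main__":
    for gpu, ratio, rops in ROP_TABLE:
        print(f"{gpu}: {controller_count(ratio, rops)} controllers, "
              f"{implied_bus_width(ratio, rops)}-bit bus")
```

For GM204 this gives 4 controllers and a 256-bit bus, and for GK110 6 controllers and a 384-bit bus, both in line with the specification table earlier in the review.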

With that aside let’s take a look at the card itself.

GTX 980 Showcase

I will let the pictures do the talking, since there is not much to say about the card. The design follows the same philosophy we have been accustomed to since the GTX Titan/780/770/780 Ti days, though Nvidia for some reason has stopped using vapour-chamber cooling.

P_20150312_113055

Here you can see the GTX 980 in its natural habitat, roaming freely… unbound… oh wait, wrong show!

Nvidia retains the single-fan design from the earlier cards, with the blower slightly offset to the right to suck in air, pass it over the chip and VRAM, and finally push it out through the bracket.

P_20150312_115644

The intake looks like one half of a BMW grille to me.

P_20150312_114115

On the output side we see quite a nice assortment: 1x DVI-D, 1x HDMI 2.0 and 3x DisplayPort 1.2 ports.

P_20150312_114008

Not much to say about the logo, except that you can control the LED through Nvidia GeForce Experience, making it glow steadily or pulse along with the beat of whatever sound is playing on your system.

Gaming Performance

First, a little about overclocking. Even though we could overclock the card by a few MHz, overclocking the reference card is not at all easy since it has a locked BIOS; the reference cooler doesn’t leave much headroom for overclocking either.
You can roast a few marshmallows on the exhaust… albeit pretty slowly.

This is where we stopped in our overclocking adventure.
http://www.techpowerup.com/gpuz/details.php?id=c3f8r

The end overclock was 13% on Core and 6% on memory.
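As a rough guide to what those percentages mean in absolute terms, the sketch below applies them to the reference clocks from the specification table. That the percentages are relative to the reference base clock (1126MHz) and to the 7GHz effective memory clock is our assumption here; the GPU-Z validation linked above holds the actual readings.

```python
# Reference figures from the specification table.
# Assumption: the overclock percentages are relative to these values.
REFERENCE_CORE_MHZ = 1126      # GTX 980 base core clock
REFERENCE_MEMORY_MHZ = 7000    # "7GHz" effective GDDR5 clock
BUS_WIDTH_BITS = 256


def overclocked(reference, percent_gain):
    """Clock after applying a percentage overclock to a reference clock."""
    return reference * (1 + percent_gain / 100)


core_mhz = overclocked(REFERENCE_CORE_MHZ, 13)
memory_mhz = overclocked(REFERENCE_MEMORY_MHZ, 6)

# Peak bandwidth in GB/s: effective Gbps per pin times bus width, in bytes.
bandwidth_gbs = memory_mhz / 1000 * BUS_WIDTH_BITS / 8

if __name__ == "__main__":
    print(f"Estimated core clock: {core_mhz:.0f} MHz")
    print(f"Estimated memory clock: {memory_mhz:.0f} MHz effective")
    print(f"Estimated peak bandwidth: {bandwidth_gbs:.1f} GB/s")
```

Under these assumptions the overclock lands at roughly 1272MHz on the core and 7420MHz effective on the memory.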

BioShock Infinite

BioShock Infinite is a first-person shooter developed by Irrational Games and published by 2K Games. Based on Unreal Engine 3, the game is set in 1912 during the growth of American exceptionalism. Its protagonist, former Pinkerton agent Booker DeWitt, is sent to the floating air-city of Columbia to find a young woman, Elizabeth, who has been held captive there for most of her life. Though Booker rescues Elizabeth, the two are pursued by the city’s warring factions: the nativist and élite Founders, who strive to keep the city for pure Americans, and the Vox Populi, rebels representing the common people. Booker finds Elizabeth to be central to this conflict, and learns that she possesses strange powers to manipulate rifts in the space-time continuum that ravage Columbia.

Ultra Settings

image001

Ultra + DDOF
image002

Hitman Absolution

Hitman: Absolution (HMA) is an action-adventure stealth game developed by IO Interactive and published by Square Enix. It is the fifth entry in the Hitman series and runs on IO Interactive’s proprietary Glacier 2 game engine. One of the key points of this game is its lighting and its ability to render up to 1200 NPCs at a time.

Ultra Settings
image007

Company of Heroes 2

Company of Heroes 2 is a real-time strategy game developed by Relic Entertainment and published by Sega for Microsoft Windows. It is the sequel to the critically acclaimed 2006 game Company of Heroes. As with the original, the game is set in World War II, but with the focus on the Eastern Front: players primarily control the Soviet Red Army through various stages of the campaign, from Operation Barbarossa to the Battle of Berlin. Company of Heroes 2 runs on Relic Entertainment’s proprietary Essence 3.0 game engine.

Ultra Settings

image006

Middle-earth: Shadow of Mordor

Middle-earth: Shadow of Mordor is an action role-playing game set within Tolkien’s legendarium, developed by Monolith Productions and published by Warner Bros. Interactive Entertainment. The story takes place between the events of The Hobbit and The Lord of the Rings. It was released for Microsoft Windows, PlayStation 4 and Xbox One in September 2014, and for PlayStation 3 and Xbox 360 in November 2014. The game runs on the LithTech Jupiter EX engine (modified with the Nemesis System).

Ultra Settings

image014

Metro 2033

Metro 2033 is a first-person shooter with survival horror elements, based on the novel Metro 2033 by Russian author Dmitry Glukhovsky. It was developed by 4A Games in Ukraine and released in March 2010. The game is played from the perspective of Artyom, the player character, and the story takes place in post-apocalyptic Moscow. The game uses the 4A Engine, which supports Direct3D 9, 10 and 11 along with Nvidia’s PhysX and 3D Vision.

Ultra Settings

image008

Final words

The GTX 980 is an engineering marvel, as any flagship GPU should be. Nvidia has shown that even on a 2-year-old manufacturing node you can still extract extra efficiency.

For power-conscious users it’s a treat, and a very good example of proper R&D and execution of a design.

However, at the end of the day, other than the exceptional performance-per-watt, there is not much new brought to the table. Then again, you cannot blame a GPU vendor solely for this, since in the end TSMC is the one that cannot grow out of the 28nm rut.

The price is eye-watering as well. With most 980s retailing north of 45K INR, this is a significant investment. I feel that Nvidia somehow missed the killing blow here: had it priced the card more aggressively, AMD’s current-generation flagships would have had a really hard time. At present, AMD can lower prices, still offer value and bide its time before bringing out its next-gen cards.

Apart from that, I cannot find any flaws in the card unless I resort to nitpicking. The GTX 980 is the fastest single-GPU gaming card on the planet (notwithstanding the Titan X), and like all flagships it demands a flagship price. If that’s not your cup of tea, then this probably isn’t the card you’re looking for.

TechARX rating : 8.5/10