NVIDIA GeForce GTX480 Video Card Review :: Graphics Fermi 100 Architecture

03-26-2010 · Category: Hardware - Video Cards

By Benjamin Sun

I'm going to break the features of the GeForce GTX 480 down, with a page for each feature. Here's a picture of the Fermi architecture in all of its glory. The Graphics Fermi 100, or GF100, has 512 CUDA cores, more than double the 240 of the GT200 that preceded it. It also has 16 PolyMorph Engines, four Raster Engines, 64 texture units (compared to 80 on the GT200), 48 ROPs (raster operations units) versus 32 on the GT200, and a 384-bit GDDR5 memory interface.

Core

As you can see, the 512 CUDA cores are split into four GPCs, or Graphics Processing Clusters, each composed of four Streaming Multiprocessors (SMs) with 32 CUDA cores apiece. Each GPC has its own Raster Engine, four SMs, sixteen texture units and four PolyMorph Engines. The Raster Engine sits at the GPC level, while each SM contains its own PolyMorph Engine. The Raster Engine does the triangle setup, rasterization and Z-cull. The PolyMorph Engine does vertex attribute fetch and tessellation. The GPC is basically a GPU without the ROPs.
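The hierarchy above can be checked with a little arithmetic. This is only a sketch: the per-SM figures are derived by dividing the chip totals quoted in the text, and the function and constant names are made up for illustration.

```python
# Chip-level figures quoted in the text for a full GF100.
GPCS = 4
SMS_PER_GPC = 4
TOTAL_CUDA_CORES = 512
TOTAL_TEXTURE_UNITS = 64

# Per-SM figures are derived here, not quoted from NVIDIA documentation.
CORES_PER_SM = TOTAL_CUDA_CORES // (GPCS * SMS_PER_GPC)
TEX_UNITS_PER_SM = TOTAL_TEXTURE_UNITS // (GPCS * SMS_PER_GPC)


def gf100_totals(gpcs=GPCS, sms_per_gpc=SMS_PER_GPC):
    """Roll per-SM and per-GPC unit counts up to chip totals."""
    sms = gpcs * sms_per_gpc
    return {
        "sms": sms,
        "cuda_cores": sms * CORES_PER_SM,
        "texture_units": sms * TEX_UNITS_PER_SM,
        "polymorph_engines": sms,   # one PolyMorph Engine per SM
        "raster_engines": gpcs,     # one Raster Engine per GPC
    }


totals = gf100_totals()
for unit, count in totals.items():
    print(f"{unit}: {count}")
```

Running it for the default four-GPC layout reproduces the totals listed at the top of the review, including the sixteen texture units per GPC.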

There are 48 ROPs on the GF100 for blending pixels, anti-aliasing and atomic memory operations. The ROP units are arranged in six groups of eight, with each group tied to its own memory controller. The full GF100 chip therefore has six memory controllers, each driving a separate 64-bit memory bus, which gives the full card a 384-bit memory bus. The GF100 uses GDDR5 memory, which has double the bandwidth of GDDR3 memory at the same clock. A RADEON HD 5870 with 1GHz memory on a 256-bit bus has 128GB/second of memory bandwidth; a similarly clocked GF100 would have 192GB/second thanks to its 384-bit bus.
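The bandwidth figures follow from bus width times transfer rate. Here is a minimal sketch of that arithmetic. It assumes "1GHz memory" means a 1000MHz command clock, and that GDDR5 moves four bits per pin per clock against two for GDDR3. The function name and parameters are illustrative only.

```python
def memory_bandwidth_gb_s(bus_width_bits, memory_clock_mhz, transfers_per_clock=4):
    """Peak bandwidth in GB/s (1 GB = 1e9 bytes).

    transfers_per_clock=4 is the assumed GDDR5 rate; use 2 for GDDR3.
    """
    bytes_per_transfer = bus_width_bits / 8
    transfers_per_second = memory_clock_mhz * 1e6 * transfers_per_clock
    return bytes_per_transfer * transfers_per_second / 1e9


# Six 64-bit memory controllers, each paired with eight ROPs.
MEMORY_CONTROLLERS = 6
gf100_bus_bits = MEMORY_CONTROLLERS * 64
gf100_rops = MEMORY_CONTROLLERS * 8

print(gf100_bus_bits, "bit bus,", gf100_rops, "ROPs")
print("256-bit GDDR5 at 1GHz:", memory_bandwidth_gb_s(256, 1000), "GB/s")
print("384-bit GDDR5 at 1GHz:", memory_bandwidth_gb_s(gf100_bus_bits, 1000), "GB/s")
print("384-bit GDDR3 at 1GHz:", memory_bandwidth_gb_s(gf100_bus_bits, 1000, 2), "GB/s")
```

The last line shows why the memory type matters as much as the bus width: at the same clock, a GDDR3 card would need twice the bus width to match.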

One thing that NVIDIA could do easily is scale this design up and down. Cutting one SM results in a 15 SM part with 480 CUDA cores, which is what the GTX 480 has. Cutting two SMs results in a 448 CUDA core part, which is what the GeForce GTX 470 is. Cutting a whole GPC would result in a 384 core part with three Raster Engines and 12 PolyMorph Engines. One can foresee a five-GPC part with 640 CUDA cores if NVIDIA could shrink the die by moving to a smaller fabrication process, but that will be hard considering the difficulty of releasing even a full 512 CUDA core part. The possibilities are many, much like ATI's approach with the HD 5000 series, which scales from 1600 stream processors down to 400.
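The scaling options can be worked out the same way. This sketch assumes 32 CUDA cores and one PolyMorph Engine per SM, four SMs and one Raster Engine per GPC. The `scaled_part` helper is hypothetical, and the configurations beyond the GTX 480 and GTX 470 are speculative, as in the text.

```python
CORES_PER_SM = 32   # derived from 512 cores across 16 SMs
SMS_PER_GPC = 4


def scaled_part(gpcs=4, disabled_sms=0):
    """Unit counts for a GF100-style part with some SMs fused off."""
    sms = gpcs * SMS_PER_GPC - disabled_sms
    if sms <= 0:
        raise ValueError("at least one SM must remain enabled")
    return {
        "sms": sms,
        "cuda_cores": sms * CORES_PER_SM,
        "polymorph_engines": sms,   # one per SM
        "raster_engines": gpcs,     # one per GPC
    }


configs = {
    "full GF100": scaled_part(),
    "one SM cut (GTX 480)": scaled_part(disabled_sms=1),
    "two SMs cut (GTX 470)": scaled_part(disabled_sms=2),
    "one GPC cut": scaled_part(gpcs=3),
    "one GPC added (speculative)": scaled_part(gpcs=5),
}

for name, part in configs.items():
    print(f"{name}: {part['cuda_cores']} cores, "
          f"{part['raster_engines']} raster, {part['polymorph_engines']} PolyMorph")
```

Each step of one SM is worth 32 cores and each GPC is worth 128, so a 640 core part corresponds to five GPCs under these assumptions.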