I Can Hazwell?

So it seems pretty much confirmed that Haswell GT3e will have 128 MB eDRAM on-package.

Apple already wanted this to happen with Ivy Bridge for their MacBook Pro with Retina display, but apparently they were the only customer, so it got canned. Understandable at a $50 premium.
 
So it seems pretty much confirmed that Haswell GT3e will have 128 MB eDRAM on-package.
99.999% certainty it's just plain ole DRAM, likely on a quite wide bus.

Understandable at a $50 premium.
A $50 premium for a premium notebook like a retina MacBook or a similar device isn't much of an issue; however, Apple didn't have any competitors in the ultra-high-res laptop screen market when they launched. Curiously, now there are some other similar laptops out there, Google's Chromebook Pixel and whatnot, so it could have been nice had Intel finished developing the Ivy Bridge GT3e (or whatever it would have been called) anyway. They would have had a bigger market than just Apple for it a couple of months down the road.
 
So it seems pretty much confirmed that Haswell GT3e will have 128 MB eDRAM on-package.

Apple already wanted this to happen with Ivy Bridge for their MacBook Pro with Retina display, but apparently they were the only customer, so it got canned. Understandable at a $50 premium.
I don't think just integrating DRAM would have made any sense for Ivy Bridge, unless they are really talking about a GT3 Ivy, and I don't know if Intel would have been ready for that.
 
99.999% certainty just plain ole DRAM, likely on a quite wide bus.
LOL, so much for that prediction: http://www.anandtech.com/show/6911/...usiness-haswell-gt3e-to-integrate-128mb-edram

(Should teach me to keep my yap shut on topics I really don't know anything about... :oops:)

This is a summary of an article by David Kanter, who digs through information Intel has released to try to figure out what kind of graphics setup the highest tier of Haswell CPUs will offer. What seems to be the conclusion is 128MB of eDRAM, fabricated on Intel's own 22nm tri-gate process. ...With the catch that this is all speculation, as there's no official word to either confirm or deny. It looks to be anyone's best guess this far, though.
 
LOL, so much for that prediction
Egg on my face too.
This is a summary of an article by David Kanter, who digs through information Intel has released to try to figure out what kind of graphics setup the highest tier of Haswell CPUs will offer. What seems to be the conclusion is 128MB of eDRAM, fabricated on Intel's own 22nm tri-gate process.

Not only that, but as far as I can tell, it's actually something completely new. From the comment thread: "A high aspect-ratio, 3-D metal-insulator-metal capacitor trench has been integrated into the ultra-low-k interlayer dielectric and Cu metallization used for interconnect stacks."

Remember from the last page where I said that the kind of caps grown on top of the transistors was completely incompatible with large metal stacks? Well, they're not anymore. Apparently, Intel builds their cap trenches with and within the metal stacks.

With the catch that this is all speculation, as there's no official word to either confirm or deny. It looks to be anyone's best guess this far, though.

David Kanter is pretty much as reliable a news source as there is in tech. I'd consider him more reliable than most official press releases...
 
Apparently, Intel builds their cap trenches with and within the metal stacks.
How much of a stack can you really need on a DRAM die, though...? Maybe that helps with construction; I can't imagine you'd need anywhere near the 11+ layers used in modern microprocessors.

David Kanter is pretty much as reliable a news source as there is in tech.
Kanter's really great, yeah. He also writes excellent articles that are easy to read for basically anyone really. :)
 
How much of a stack can you really need on a DRAM die, though...? Maybe that helps with construction; I can't imagine you'd need anywhere near the 11+ layers used in modern microprocessors.


Kanter's really great, yeah. He also writes excellent articles that are easy to read for basically anyone really. :)

I think Dave's articles are great, but personally I find his postings to be very pro-Intel. Which is fair enough, and in a way that makes his articles even better: despite his bias (as I perceive it) he can still put it aside when doing an actual published analysis. (Insert the thumbs-up emoticon that B3D lacks.)
 
If GT3e's memory is indeed Intel's custom eDRAM (and 128MB seems to be a very reasonable estimate), then the question is how Intel's graphics driver is going to manage this memory.

David Kanter's estimate of its bandwidth (around 64GB/s) is not very high. So, other than frame buffers (which are not very big; even the 2880x1800 retina resolution with triple buffering takes less than 60MB), it shouldn't be used as a cache for textures. It would be better to use it alongside main memory as texture memory, in order to utilize main memory bandwidth as additional bandwidth. However, this will require intelligent texture management, preferably with a block-based system.
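
To sanity check that frame buffer number (a minimal sketch; the 4 bytes per pixel for a standard 8-bit RGBA buffer is my assumption):

```python
# Back-of-the-envelope frame buffer footprint, assuming 4-byte (8-bit RGBA)
# pixels and triple buffering at the retina MacBook Pro's resolution.
width, height = 2880, 1800
bytes_per_pixel = 4
buffers = 3

total = width * height * bytes_per_pixel * buffers
print(f"{total / 2**20:.1f} MB")  # ~59.3 MB, i.e. under 60 MB as stated
```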
 
Intel had the i740, which did virtual texturing way back in the early AGP bus days, so there's a little bit of history with block-based schemes for them. Pretty sure the i740 did it all in hardware too, without any driver intervention. I believe the GameCube/Wii works similarly too, with the hardware taking care of stuffing its cache with textures, without needing to fiddle with it via game code...

Something similar could have been implemented here.
 
Intel had the i740, which did virtual texturing way back in the early AGP bus days, so there's a little bit of history with block-based schemes for them. Pretty sure the i740 did it all in hardware too, without any driver intervention. I believe the GameCube/Wii works similarly too, with the hardware taking care of stuffing its cache with textures, without needing to fiddle with it via game code...

Something similar could have been implemented here.

IIRC the first i740 (I'm not sure about later revisions) does not cache textures in local memory at all. All texture accesses go directly to main memory, so it's very slow. Certainly it's better for GT3e not to work this way (although, if you have two high-resolution monitors, 128MB could be filled up pretty quickly).
 
The implementation in the i740 surely was very crude, considering the extreme age of the device (I remember playing the original Half-Life on one of these puppies!); I was merely referencing the principle.

Also, some sort of texture cache, even a minuscule one, could very well have been part of the i740, enough to hold a small chunk of texels for filtering and burst transfer capability. Surely the card did not access textures one texel at a time across the bus; that would have been monstrously inefficient and maybe not even technically possible.
 
The implementation in the i740 surely was very crude, considering the extreme age of the device (I remember playing the original Half-Life on one of these puppies!); I was merely referencing the principle.

Also, some sort of texture cache, even a minuscule one, could very well have been part of the i740, enough to hold a small chunk of texels for filtering and burst transfer capability. Surely the card did not access textures one texel at a time across the bus; that would have been monstrously inefficient and maybe not even technically possible.

No, the i740 still has onboard memory (local memory), but it only uses it for the frame buffer. All textures are stored in main memory and have to be accessed across the AGP bus. Of course it has an on-chip texture cache, just like every 3D chip, but that's not the topic of this discussion.

My point is, if GT3e only uses the eDRAM for the frame buffer, that'd be pretty boring. However, just using the eDRAM as a "texture cache" (not the same as the traditional on-chip texture cache) is not ideal either, because the relatively small size of the eDRAM would cause a lot of cache thrashing. A better way is to make the eDRAM and main memory into some sort of "unified" memory where some parts are simply quicker. The system would have to determine which textures (or which parts of some textures) are used more frequently and should be stored in the eDRAM, while the others stay in main memory.
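
To illustrate the kind of placement heuristic I mean, here's a minimal sketch. The names, numbers, and the samples-per-frame metric are all hypothetical; this is one conceivable driver policy, not anything Intel has described:

```python
# Illustrative only: promote the most frequently sampled textures into a
# fast 128 MB pool, leave the rest in main memory. Everything here is a
# made-up example of the idea, not Intel's actual driver behavior.
EDRAM_BYTES = 128 * 2**20

def place_textures(textures):
    """textures: list of (name, size_bytes, samples_per_frame) tuples.
    Returns (names placed in eDRAM, names left in main memory)."""
    # Rank by samples per byte, so small-but-hot textures win.
    ranked = sorted(textures, key=lambda t: t[2] / t[1], reverse=True)
    edram, main, used = [], [], 0
    for name, size, _ in ranked:
        if used + size <= EDRAM_BYTES:
            edram.append(name)
            used += size
        else:
            main.append(name)
    return edram, main

edram, main = place_textures([
    ("terrain_albedo", 96 * 2**20, 900_000),  # big and hot
    ("skybox",         48 * 2**20,  50_000),  # big, rarely sampled
    ("ui_atlas",        8 * 2**20, 400_000),  # small and very hot
])
print(edram)  # ['ui_atlas', 'terrain_albedo']
print(main)   # ['skybox']
```

A real driver would of course have to re-evaluate this continuously as the working set changes from frame to frame.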
 
No, the i740 still has onboard memory (local memory), but it only uses it for the frame buffer. All textures are stored in main memory and have to be accessed across the AGP bus. Of course it has an on-chip texture cache, just like every 3D chip, but that's not the topic of this discussion.

My point is, if GT3e only uses the eDRAM for the frame buffer, that'd be pretty boring. However, just using the eDRAM as a "texture cache" (not the same as the traditional on-chip texture cache) is not ideal either, because the relatively small size of the eDRAM would cause a lot of cache thrashing. A better way is to make the eDRAM and main memory into some sort of "unified" memory where some parts are simply quicker. The system would have to determine which textures (or which parts of some textures) are used more frequently and should be stored in the eDRAM, while the others stay in main memory.

That makes me wonder if they'll use it in a similar way to the eSRAM in Microsoft's Durango. Granted, this is significantly larger than the pool of available fast memory on Durango.

Regards,
SB
 
No, the i740 still has onboard memory (local memory), but it only uses it for the frame buffer. All textures are stored in main memory and have to be accessed across the AGP bus.
Thank you, I know that. I was around when these things actually sat on shelves in shops! :) (Actually, that's what they mostly did, unfortunately for Intel...)

My point is, if GT3e only uses the eDRAM for the frame buffer, that'd be pretty boring.
Yeah, and as the article I linked to alludes, it's not; instead it's speculated to be a last-level cache for the entire APU, both GPU and CPU, according to Intel papers. Exactly how it is managed may not even be publicly disclosed, although considering Intel may want devs to optimize for their hardware, they will probably reveal its inner workings closer to launch.

The reason I brought up the i740 is that it is an Intel device (Real3D's, really, but meh) which did virtual texturing, like Haswell GT3e could be doing as well, although the exact mechanisms and implementations they would employ to achieve that would of course differ vastly.
 
I think Dave's articles are great, but personally I find his postings to be very pro-Intel. Which is fair enough, and in a way that makes his articles even better: despite his bias (as I perceive it) he can still put it aside when doing an actual published analysis. (Insert the thumbs-up emoticon that B3D lacks.)

Are these pro-Intel postings wrong or less accurate because of the bias you detect?
It may well be that he has better access to technical resources at Intel. Some vendors that might have technically interesting products are not as open. This is a relatively common refrain I've seen at various times about ARM products.
AMD seems to be suffering a dearth of news, although maybe Jaguar is a change relative to Bobcat (if they haven't fired/lost everyone who might talk about it).
 
Exactly how it is managed may not even be publicly disclosed

If it is a cache, it isn't managed by software. Optimizing for it would mean keeping the amount of data touched per frame below 128MB.

Example: a 1920x1080 render target with 4x MSAA and 16 bytes of G-buffer data per fragment is 128MB, but since we only need the 12 non-Z bytes of a fragment's G-buffer when we're rendering edges, the actual touched memory footprint is much lower. Assuming 40% extra fragments for 4x MSAA, we get 1920x1080 pixels x 4 Z-samples/pixel x 4 bytes/Z-sample + 1920x1080 pixels x 1.4 fragments/pixel x 12 bytes/fragment = 65MB, leaving 63MB for textures, which should be plenty; it's more than 22 bytes per fragment.
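
The same arithmetic as a small script, for anyone who wants to poke at the assumptions:

```python
# Reproducing the footprint example above. Assumptions as stated: 1920x1080,
# 4x MSAA, 16-byte G-buffer per fragment (4 bytes of it Z), ~40% extra
# fragments from MSAA edges, 128 MB of eDRAM.
pixels = 1920 * 1080
z_bytes  = pixels * 4 * 4        # 4 Z-samples/pixel, 4 bytes/Z-sample
gb_bytes = pixels * 1.4 * 12     # 1.4 fragments/pixel, 12 non-Z bytes
touched  = z_bytes + gb_bytes
print(f"touched:  {touched / 2**20:.0f} MB")            # ~65 MB
left = 128 * 2**20 - touched
print(f"textures: {left / 2**20:.0f} MB")               # ~63 MB
print(f"per frag: {left / (pixels * 1.4):.1f} bytes")   # ~22.8 bytes
```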

Cheers
 
As a cache, that amount of memory is going to need a lot of storage dedicated to tags if this bandwidth cache is organized and accessed as if it were a standard cache.

The tags could be kept in the eDRAM, although that burns bandwidth by firing off accesses to both main memory and the eDRAM, unless there's something else, like a page table entry, filtering out accesses.
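
To put a rough number on it (the 64-byte line size and ~32 bits of tag plus state per line are my assumptions, not anything disclosed):

```python
# Rough tag-storage estimate for a 128 MB cache. Line size and bits of
# tag + state per line are assumed values for illustration only.
cache_bytes = 128 * 2**20
line_bytes  = 64
tag_bits    = 32

lines = cache_bytes // line_bytes        # 2,097,152 lines
tags  = lines * tag_bits // 8            # bytes of tag storage
print(f"{tags / 2**20:.0f} MB of tags")  # 8 MB

# With 4 KB lines the same cache needs only ~128 KB of tags, which is one
# reason a cache this large might track page-sized chunks instead.
```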
 
Are these pro-Intel postings wrong or less accurate because of the bias you detect?
It may well be that he has better access to technical resources at Intel.

I wouldn't say it's his pro-Intel-ness, but more a general negativity (that's too harsh a word) towards companies that aren't Intel. I'm a glass-half-full kind of guy, and it seems Dave is too when it comes to Intel, but for other companies he's a bit glass-half-empty.
 
That makes me wonder if they'll use it in a similar way to the eSRAM in Microsoft's Durango. Granted, this is significantly larger than the pool of available fast memory on Durango.

Regards,
SB

They can't. The Durango memory will be under full programmer control; it will do nothing on its own. This can be done because MS knows that all titles on the platform will be designed and optimized for it. If the GT3e buffer were similarly managed, only the games specifically optimized for it would gain any advantage, and since Intel can't hope to get most of the titles that would possibly run on it optimized in such a way, that is simply unacceptable.
 