What custom hardware features could benefit a console?

Shifty Geezer

I've been saying for a while that there's not really anywhere for consoles to innovate in hardware, hence the next-gen systems being described as very dull, PC-like APUs. However, the rumour that Durango has three extra custom 'blocks' suggests otherwise, which gets me wondering: what features could be added in hardware to good effect?

We tend to think of console workloads as graphics on the GPU and AI+physics on the CPU, with whatever crossover. But as bkilian revealed in another thread, workloads we take for granted, like audio, can have significant requirements that take away from the CPU. That's where specialised custom hardware can bring efficiencies, providing a larger overall performance gain than the same silicon spent on programmable hardware: e.g. a cheap, small Broadcom H.264 decoder block versus a large amount of CPU/GPU power doing the same thing. And this is what consoles relied on in the past. So what software systems have we been overlooking that could benefit from custom hardware? A few historical ideas:

1) Hardware sprites.
A 2D portion of memory overlaid on the display.

2) Blitter
A memory manipulation processor that copies/moves RAM contents from place to place, work that is otherwise performed on the CPU.

3) DSP/audio processor
Sound generation and manipulation.

4) Compression hardware
(De)compress on the fly

All these functions (save sprites, which were only applicable to scanline displays) can be performed on programmable hardware, but they certainly found benefits as custom hardware on older machines. I don't know which modern workloads would benefit similarly.
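To make item 2 concrete, the core of a blitter's job is just rectangular memory copies. A minimal software sketch (all names hypothetical, buffers modelled as flat Python lists):

```python
def blit(src, src_w, dst, dst_w, sx, sy, dx, dy, w, h):
    """Copy a w*h rectangle from a linear source buffer to a
    linear destination buffer, one scanline at a time."""
    for row in range(h):
        s = (sy + row) * src_w + sx   # start of source scanline
        d = (dy + row) * dst_w + dx   # start of destination scanline
        dst[d:d + w] = src[s:s + w]   # one scanline copy

# Copy a 4x4 source pattern into an 8x8 destination at (2, 2).
src = list(range(16))
dst = [0] * 64
blit(src, 4, dst, 8, 0, 0, 2, 2, 4, 4)
```

A hardware blitter does exactly this loop with dedicated address generators and DMA, leaving the CPU free.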

Well, a DSP has obvious benefits given what bkilian said about audio being very processor intensive. That's one area where spending n% of the total system silicon budget on dedicated audio (and other signal-processing functions) provides a >n% gain in total system performance.
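To show why audio eats CPU, even the most naive software mixing touches every sample of every voice, every frame. A toy sketch (real engines add resampling, filtering, and effects on top; all names here are hypothetical):

```python
def mix(voices, gains):
    """Mix several mono sample streams into one, with per-voice gain,
    clamping to the 16-bit range a DAC would expect."""
    out = []
    for frame in zip(*voices):
        s = sum(int(sample * gain) for sample, gain in zip(frame, gains))
        out.append(max(-32768, min(32767, s)))
    return out

a = [1000, 2000, 3000]
b = [100, -200, 300]
print(mix([a, b], [1.0, 0.5]))  # → [1050, 1900, 3150]
```

At 48 kHz with dozens of voices this inner loop runs millions of times per second, which is exactly what a DSP's multiply-accumulate pipelines are built for.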


What else? Is there value in a blitter? I don't know what memory operations are performed in a typical 3D game beyond the obvious. Clearly a blitter would benefit 2D, but as 2D games are typically less demanding on the CPU, I don't see that adding a blitter would help much there; the CPU is quite capable. Heck, 2D is handled as textures composited as quads on the GPU anyhow. I think of memory access on a console (or in a modern game) as 'load assets, fetch audio/textures/objects from RAM and process', so this looks pretty worthless to me, like sprites. Or could it be useful in ways I don't know? What about shifting data from slow RAM to fast RAM on systems with discrete memory pools?


We know compression hardware features on GPUs/SoCs for video playback. Is there value in a compression engine that could load/stream compressed assets, like JPEG 2000 textures? Once in RAM the data is stored in GPU-friendly formats, so are the benefits really worth it?

The real open question here is: what areas of gaming technology can be improved beyond just more bandwidth, storage, and processing power? Have any devs hit issues regarding audio, IO, memory functions, or something I'm completely unaware of, where they can see that some bit of custom hardware would really help?
 
My wild and uninformed guess? A frame-rate interpolator specifically designed for games to upscale animation, resolution, etc.

15 FPS -> 30 FPS -> 60 FPS, à la the experimental SW: TFU2 paper but done better.
 
My wild and uninformed guess? A frame-rate interpolator specifically designed for games to upscale animation, resolution, etc.

15 FPS -> 30 FPS -> 60 FPS, à la the experimental SW: TFU2 paper but done better.

God I hope not.
Games have enough issues with latency as it is; adding features that add more doesn't seem like a good plan.
 
Maybe we could see some new CPU instructions that allow execution flow to break out of the CPU pipeline and transfer to the compute units of the GPU, then go back to the CPU if required. Maybe the GPU could benefit from the CPU prefetcher and write results directly into the L2, ready for further work.

You could perhaps split CUs up between CPU clusters so each would have its own bank to work with, and locate them close to each other on chip (rather than just bung the GPU on one side of the chip).
 
There should be quite a few (ideas/concepts) in the existing Cell jobs, including ray casting/tracing and H.265 stuff.
 
Well, the obvious answer would be a dedicated Raytracing chip, I'd guess. Although the current ones probably aren't that fast.
 
Embedded FPGA, close to the memory. Too expensive, too cumbersome to program, and probably a patent minefield, too. Could be used by generic engines, like having a number of coprocessors munching on UnrealScript bytecode (is that even a thing?) at 200 MHz. Could implement cycle-accurate emulators for older archs (and I mean 8/16-bit archs, not previous-gen consoles) for a power-saving advantage (nobody cares; 8/16-bit emulation will just use 10% of the expected CPU power in next-gen machines). Could replace the sound DSP with more generic custom HW (but include a bitstream for it to be used as a normal DSP).

Would there be a point in having programmable ROPs? Wasn't there some paper a year ago about logarithmic z-buffers?

What kind of HW would speed up deferred renderers?

What about virtual texturing? Don't the latest AMD cards have some kind of VT helper hardware? They did some kind of "record the page fault without stalling and just render from the available mips", didn't they?
 
There should be quite a few (ideas/concepts) in the existing Cell jobs, including ray casting/tracing and H.265 stuff.
I did think after posting that if the custom chips alleviate the CPU a lot, and the Cell is a good match, that'd be an argument for the inclusion of Cell in PS4. Include it for BC and use it as a DSP+(de)compressor and whatnot in PS4 mode. I imagine it's very big and hot compared to specialist hardware, though, and that's not really the topic I'm trying to get into. ;)

Embedded FPGA, close to the memory. Too expensive, too cumbersome to program, and probably a patent minefield, too. Could be used by generic engines, like having a number of coprocessors munching on UnrealScript bytecode (is that even a thing?) at 200 MHz.
Is there value in a "state transition engine"? It's quite a common job, and I wonder if there are hardware choices specific to the job that could improve on the CPU's architecture.
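If "state transition engine" means the table-driven state machines common in game logic, the job amounts to a wide table lookup, which is the kind of thing fixed-function hardware does well. A toy sketch (the states and events are invented for illustration):

```python
# Transition table: (state, event) -> next state. In hardware this
# lookup could be a wide ROM/CAM instead of branchy CPU code.
TRANSITIONS = {
    ("idle", "see_player"): "chase",
    ("chase", "lost_player"): "search",
    ("search", "see_player"): "chase",
    ("search", "timeout"): "idle",
}

def step(state, event):
    """Advance one transition; unknown (state, event) pairs stay put."""
    return TRANSITIONS.get((state, event), state)

state = "idle"
for event in ["see_player", "lost_player", "timeout"]:
    state = step(state, event)
print(state)  # → idle
```

Thousands of agents stepping such tables every frame is branch-predictor-hostile on a CPU, which is presumably where dedicated hardware could win.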

Would there be a point in having programmable ROPs? Wasn't there some paper year ago about log z-buffers?
I'd count those as a GPU feature rather than custom hardware.

What kind of HW would speed-up deferred renderers?
Yeah, they've proven popular. Again, I think of them as a GPU function, but maybe a memory manipulation processor could help in some way with all those buffers? My expectation is not: you render them, then combine them on the GPU. I can't see any need for messing around with the contents in memory.

Another thought that's occurred: how about a vector UI processor? That seems a job perfectly suited to a specialist chip with curve and filled-surface support, and one that's used in every game. I don't know if Flash will still be the UI platform of choice, but if it is, something optimal to run it might be good.
 
The HW zlib decompressor listed in the old Orbis speclist is neat, though I'd wonder about other compression formats too. Still, it's something.
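For a sense of what such a block would offload, here is streaming zlib decompression using Python's standard zlib module, feeding data in chunks the way a DMA-driven decompressor might (the chunk size is an arbitrary choice):

```python
import zlib

def stream_decompress(compressed, chunk=4096):
    """Decompress a zlib stream incrementally, yielding output
    as it becomes available rather than in one shot."""
    d = zlib.decompressobj()
    for i in range(0, len(compressed), chunk):
        out = d.decompress(compressed[i:i + chunk])
        if out:
            yield out
    tail = d.flush()          # drain any buffered remainder
    if tail:
        yield tail

asset = b"texture data " * 1000
packed = zlib.compress(asset)
restored = b"".join(stream_decompress(packed))
assert restored == asset
```

A fixed-function unit doing this frees the CPU entirely and can sit directly on the IO path, decompressing as data streams in from disc.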
 
I did think after posting that if the custom chips alleviate the CPU a lot, and the Cell is a good match, that'd be an argument for the inclusion of Cell in PS4. Include it for BC and use it as a DSP+(de)compressor and whatnot in PS4 mode. I imagine it's very big and hot compared to specialist hardware, though, and that's not really the topic I'm trying to get into. ;)

...

Another thought that's occurred: how about a vector UI processor? That seems a job perfectly suited to a specialist chip with curve and filled-surface support, and one that's used in every game. I don't know if Flash will still be the UI platform of choice, but if it is, something optimal to run it might be good.

Yeah, not necessarily an SPU implementation per se, just a custom, fast, and inexpensive implementation.
(Do the Steamroller/Jaguar CPUs have any powerful vector engines? If not, an SPU-like engine may be a good consideration?)

For specialized functions, I'm curious about tessellation and other massive polygon manipulation capabilities.

The other clear area from the Cell experience would be post processing.
 
I personally think that eDRAM will be a waste of time next time around unless they get it over 32 MB.

720p + 4xAA is ~30 MB?
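Roughly, yes, under common assumptions (32-bit colour plus 32-bit depth/stencil per sample):

```python
width, height, samples = 1280, 720, 4   # 720p with 4x MSAA
bytes_per_sample = 4 + 4                # RGBA8 colour + D24S8 depth/stencil
total = width * height * samples * bytes_per_sample
print(total)            # 29491200 bytes
print(total / 2**20)    # 28.125 MiB, i.e. the "~30 MB" figure
```

So a 32 MB eDRAM pool would just fit 720p with 4x MSAA, with little headroom for anything else.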
 
God I hope not.
Games have enough issues with latency as it is; adding features that add more doesn't seem like a good plan.

What about a GPU for rendering the current frame and a second GPU (or pseudo GPU xD) for post-processing jobs?
 
What would it mean if the Xbox had raytracing as one of the subsystems of the GPU? I ask this because I use ray tracing for visualization in 3ds Max, and even with high-end GPUs you can't get it to render fast enough during movement for it not to be noticeable. I don't think hardware has advanced enough for the consumer base to make raytracing viable for gaming.
 
Video acceleration is the classic IP block added.
From the CD-i and CD32 MPEG-1 add-ons
To Creative's PC MPEG-2 card and the PS2's HW for DVD

It's possible that a 4K H.265 IP block could be added to get quiet, low-power 4K video.

Audio is another possibility (except on PS3 and N64, audio has always had assist HW).

Both could be useful for large streaming worlds, especially if hooked up to a page-level virtual texture system...
 
I'd say some form of H.265 is a given for Sony at this time. Not that it makes a lot of sense for gaming, but it might help their struggling TV branch. Might as well use it for games; as we all know, Sony's own games tend to ship with a lot of FMVs.
 
What would it mean if the Xbox had raytracing as one of the subsystems of the GPU?
Raytracing for graphics is out of reach, I believe, but a hardware-assisted ray-casting system for use in graphics, AI, physics, or whatever else (even audio) is a possibility.
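The sort of inner loop such a ray-cast assist would accelerate can be sketched as a ray-vs-axis-aligned-box "slab" test, the building block of line-of-sight and occlusion queries; this is an illustrative sketch, not any console's actual unit:

```python
def ray_hits_aabb(origin, direction, box_min, box_max):
    """Slab test: does a ray from `origin` along `direction` hit
    the axis-aligned box [box_min, box_max]?"""
    t_near, t_far = 0.0, float("inf")
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if abs(d) < 1e-12:              # ray parallel to this slab
            if o < lo or o > hi:
                return False
        else:
            t1, t2 = (lo - o) / d, (hi - o) / d
            if t1 > t2:
                t1, t2 = t2, t1
            t_near, t_far = max(t_near, t1), min(t_far, t2)
            if t_near > t_far:          # slabs no longer overlap
                return False
    return True

# Line-of-sight queries against a wall modelled as a box from (2,-1,-1)
# to (3,1,1): the +x ray hits it, the +y ray misses.
print(ray_hits_aabb((0, 0, 0), (1, 0, 0), (2, -1, -1), (3, 1, 1)))  # True
print(ray_hits_aabb((0, 0, 0), (0, 1, 0), (2, -1, -1), (3, 1, 1)))  # False
```

AI visibility, physics sweeps, and audio occlusion all issue thousands of such queries per frame, so even a modest fixed-function unit for them could pay off.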
 
Raytracing for graphics is out of reach, I believe, but a hardware-assisted ray-casting system for use in graphics, AI, physics, or whatever else (even audio) is a possibility.
I see; it sounds like it could provide substantial help if done right. Maybe that's how a 1.2 TF GPU performs like a 3 TF one. Should be interesting.
 
Hell, I'd be happy if all these "secret sauces" got somewhere close to 2 TF from a 1 TF setup; it doesn't have to hit 3 TF, which just seems like crazy pie-in-the-sky wishing.

Although if AMD and Microsoft did indeed engineer a console able to hit 3 TF of real-world game performance without a 3 TF GPU, I would tip my top hat to them.

I guess we'll have to wait and see what they came up with.
 
A Copperlist? :LOL:

I like the idea of a Blitter. Maybe it could be useful to have a blitter with advanced framebuffer functions instead of just the bitwise stuff: being aware of pixel formats, a nice bicubic scaler, a source that could be a different resolution and pixel format than the destination, and it could even go through a 3D LUT, texture decompression, alpha blending, HDR stuff, or even anti-aliasing. Today a GPU can do all this anyway, but I think a separate Blitter would have a very small footprint and let the GPU do what it's best at. All of these operations would be wasting the GPU otherwise, because they are too simple and would be limited by memory I/O.
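As one concrete example of those framebuffer functions, a per-pixel alpha blend of the kind such a blitter could do in fixed function (straight alpha, 8-bit channels; the conventions here are assumptions for illustration):

```python
def alpha_blend(src_rgba, dst_rgb):
    """Blend one RGBA source pixel over an RGB destination pixel
    (straight alpha, 8-bit channels): out = src*a + dst*(255 - a), /255."""
    *rgb, a = src_rgba
    return tuple((s * a + d * (255 - a)) // 255 for s, d in zip(rgb, dst_rgb))

# Half-transparent red over opaque blue.
print(alpha_blend((255, 0, 0, 128), (0, 0, 255)))  # → (128, 0, 127)
```

One multiply-add per channel per pixel: trivial logic, but bandwidth-bound, which is exactly why it would waste a big GPU and suit a small dedicated unit.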
 