The most important hardware feature/tech so far

Fox5 said:
What do you mean by a SIMD cpu? Surely not say...a 3ghz P4 (it has SIMD capabilities through SSE); it doesn't seem like it would have the computational ability a powerful GPU would require. If not a 3ghz P4 then is there something else you had in mind? A 3ghz G5? A 3ghz Power5 (which I believe would be multiple times the power of a P4 ..

well, something with a madd definitely. like altivec/vmx .. and some decent superscalarity, pardon, super-vectority. basically, picture a xenon with 20MB L1 cache, and instead of 6 contexts a 4-to-6-way co-issue - that's what i'd like. could be a bit expensive ATM, though.

..but don't current gpus offer over 30x the computational ability of a p4 for their shaders?

yes, in their best case gpus do fine. unfortunately their best cases are essentially shaders from today's desktop/console gaming industry. but throw in a couple of dynamic branches and watch a G70 choke. so it would be very curious to see some recent production-level prman shaders translated (as much as possible) and benchmarked on one of these sm3 cards.
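
A rough sketch of why a couple of dynamic branches can hurt a lockstep pixel pipeline. It assumes (hypothetically) that pixels are processed in fixed-size batches and that a batch pays for both sides of a branch whenever its pixels disagree. The batch size and instruction costs are made up for illustration and are not G70 figures.

```python
# Toy cost model for branching on lockstep SIMD hardware.
# Assumption: pixels run in fixed-size batches; if the pixels in a batch
# disagree on a branch, the batch executes both sides (with masking).
# BATCH, COST_THEN and COST_ELSE are illustrative, not real G70 numbers.

BATCH = 4
COST_THEN = 10   # instructions on the "then" side
COST_ELSE = 2    # instructions on the "else" side


def batch_cost(conditions):
    """Instruction slots one batch spends on an if/else."""
    cost = 0
    if any(conditions):        # at least one pixel needs the "then" side
        cost += COST_THEN
    if not all(conditions):    # at least one pixel needs the "else" side
        cost += COST_ELSE
    return cost


def total_cost(conditions, batch=BATCH):
    """Sum of batch costs over all pixels, processed batch by batch."""
    return sum(
        batch_cost(conditions[i:i + batch])
        for i in range(0, len(conditions), batch)
    )


coherent = [True] * 4 + [False] * 4   # neighbouring pixels agree
divergent = [True, False] * 4         # neighbouring pixels disagree

print(total_cost(coherent))   # 12: one batch pays 10, the other pays 2
print(total_cost(divergent))  # 24: both batches pay 10 + 2
```

Same pixels, same branch outcomes overall, but the divergent ordering costs twice as much in this model, which is the kind of case a gaming-style shader rarely hits and an offline shader might.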
 
darkblu said:
well, something with a madd definitely. like altivec/vmx .. and some decent superscalarity, pardon, super-vectority. basically, picture a xenon with 20MB L1 cache, and instead of 6 contexts a 4-to-6-way co-issue - that's what i'd like. could be a bit expensive ATM, though.



yes, in their best case gpus do fine. unfortunately their best cases are essentially shaders from today's desktop/console gaming industry. but throw in a couple of dynamic branches and watch a G70 choke. so it would be very curious to see some recent production-level prman shaders translated (as much as possible) and benchmarked on one of these sm3 cards.

Would a cpu with that much cache still be able to keep the cache high speed?
Anyhow, current games are designed around current gpus, so while you may be able to create a situation where a super cpu outperforms a gpu I don't know if current games would perform better. Is the cpu to perform the rasterization as well? Even a DX6 level of functionality in a video card would still free up a lot of cpu time over just having a cpu do everything.
 
Fox5 said:
Would a cpu with that much cache still be able to keep the cache high speed?

it may end up being technologically challenging/prohibitively expensive. for instance, cell has merely 8x256KB + .5MB = 2.5MB of sweet memory, and that's akin to L2 cache characteristics. but that's today, let's see what tomorrow has for us.

Anyhow, current games are designed around current gpus, so while you may be able to create a situation where a super cpu outperforms a gpu I don't know if current games would perform better.

as long as you're landmarking against current games, current gpus are doing fine, and you'd hardly gain anything by a more flexible architecture.

try putting gpus in the pro league; there have already been lab cases of cell outperforming a g70 at shader work, when that was not constrained to the 'games circa 2002' ruleset. ironically, you don't need much shading ability to do the latter - see doom3's case with shaders. that's why i want to see current gpus crunching prman shaders.

fact of life is, all gpus did, do and will ever do is 'tackle' the bandwidth and latency requirements of the insatiable graphics algorithms by playing the game by the most favorable rules. of course that's fine and dandy, but it also entrenches the game. so try playing one step out of the box and see what happens.

Is the cpu to perform the rasterization as well? Even a DX6 level of functionality in a video card would still free up a lot of cpu time over just having a cpu do everything.

sure, but i thought we were talking graphics technologies here, in context of doing purely graphics.
 
darkblu said:
well, something with a madd definitely. like altivec/vmx ..
For heaven's sake man, think what you're saying. :oops:
If you are making a future wishlist - please steer WAY clear of VMX/Altivec/SSE/3Dnow and other prehistoric relics that have long outlived their usefulness.

That said, if you replaced your SIMD engine with something modern and actually user-friendly, then yea, I could see it as an interesting (and expensive) thing to play with.
 
Fafalada said:
That said, if you replaced your SIMD engine with something modern and actually user-friendly, then yea, I could see it as an interesting (and expensive) thing to play with.
You keep talking about this - I'm curious to read what you think should be done.

Jawed
 
Panajev2001a said:
PSP's VFPU with good integer/logical instruction support on the side (cough... VU1 sucks a bit there... cough...) would be a better start :).
Dunno.. Faf talks to me about VFPU magic all the time, but I haven't had the opportunity to work on PSP :p
 
Fafalada said:
For heaven's sake man, think what you're saying. :oops:
If you are making a future wishlist - please steer WAY clear of VMX/Altivec/SSE/3Dnow and other prehistoric relics that have long outlived their usefulness.

my apologies, but being a desktop coder i just don't know any better :(
 
Me, too!

I'll be the 15th to say programmable pixel/vertex shading, particularly since sm 2.0

MSAA (plus decent, usable af) second.

HDR should become important over the next year but it's too soon for a high rank.
 
darkblu said:
i'd have said HLSL if the thread title didn't state 'hardware feature'.
The more I think about it, the more convinced I am that HLSL is basically a "hardware feature".

Recognizing the need for an HLSL that practically exposes the hardware's underlying parallelism has put graphics on a much faster-moving track than it would be if we were stuck in the fixed function or assembly-language world.
 
Jawed said:
You keep talking about this - I'm curious to read what you think should be done.
Well others have already made good points about what direction would be pretty good.
But ok, if we list it as features (so hw engineers can chastise me later) as far as FP computation goes - in order of importance for me:
component access, broadcasting, bi-directional view of the register set, decorators (inc. swizzle), "horizontal" dot product (technically it also becomes two-way because of the register set requirement), a couple of FxP <-> FP conversions.
I am ok with aligned-only load/stores - but with decorator support (masking/swizzles).

Most likely it would be implemented as a 64-bit VLIW (2 instructions per word - 2 execution pipelines) - because we want to have those 128 registers as well, so we need the instruction space.

With this feature set the unit also doubles as a scalar FPU with 512 registers (in the sense that a compiler could actually use it as such without generating shitty slow and bloated code), so we don't need any separate scalar units (IEEE compliance freaks can go and... well do something else :p ).
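
For readers who want something concrete, here is a minimal sketch of a few of the features named above (broadcast, swizzle, write masking, madd, "horizontal" dot product, and component access over 128 four-wide registers). Every function name is hypothetical and it describes no real instruction set; it only models the semantics on plain Python lists, and it takes "component access" to mean addressing each of the 128 x 4 = 512 components as its own scalar, which is an assumption.

```python
# Toy model of the listed vector features on 4-component vectors.
# All names are hypothetical; this is not any real ISA.

INDEX = {"x": 0, "y": 1, "z": 2, "w": 3}


def broadcast(s):
    """Fill all four components from one scalar."""
    return [s, s, s, s]


def swizzle(v, pattern):
    """Reorder or replicate components, e.g. swizzle(v, "wzyx")."""
    return [v[INDEX[c]] for c in pattern]


def masked_write(dst, src, mask):
    """Write only the components named in mask, e.g. "xz"."""
    out = list(dst)
    for c in mask:
        out[INDEX[c]] = src[INDEX[c]]
    return out


def madd(a, b, c):
    """Component-wise multiply-add: a * b + c."""
    return [x * y + z for x, y, z in zip(a, b, c)]


def dot(a, b):
    """'Horizontal' dot product: sums across the components."""
    return sum(x * y for x, y in zip(a, b))


# 128 vector registers of 4 components each = 512 scalar slots.
regs = [[0.0] * 4 for _ in range(128)]


def scalar_read(i):
    """Read slot i of the register file viewed as 512 scalars."""
    return regs[i // 4][i % 4]


def scalar_write(i, value):
    """Write slot i of the register file viewed as 512 scalars."""
    regs[i // 4][i % 4] = value
```

For example, `madd([1, 2, 3, 4], broadcast(2), broadcast(1))` scales a vector by a scalar and adds a bias in one step, with no separate scalar unit involved.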

darkblu said:
my apologies, but being a desktop coder i just don't know any better
Hehe, it's quite alright. It's the same reason why I don't whine about in-order but desktop ppl do ;)
 
Aha Faf, so it sounds like you want CPUs to support vector/scalar operations in the same way as GPUs do. Is that right?

I dunno what broadcast (fill a vector's components from a scalar?) and bi-directional view are though :oops:

In fact it sounds like you want to write all your FP/vector programs on G70's fragment pipeline, though the branching support might get ya.

Jawed
 
Programmable graphics hardware, of course. Is there even another candidate? All the other single things, like HDR, multisampling, blabla, are really secondary and unimportant.
 
Reverend said:
The more I think about it, the more convinced I am that HLSL is basically a "hardware feature".

Recognizing the need for an HLSL that practically exposes the hardware's underlying parallelism has put graphics on a much faster-moving track than it would be if we were stuck in the fixed function or assembly-language world.

there's much truth in what you say, HLSL will tremendously affect the hardware, but that's just the side-effect of software breakthroughs ; )
 
Not vertex shaders though, because that functionality can still reasonably be moved to the cpu. The performance hit can be moderate depending on the power of the cpu(s).

MSAA is another great function, but it is basically a performance boost over SSAA and one that is becoming less important as the trend is moving towards blended SS/MS modes to deal with alpha transparency, old-fashioned fence textures and HDR.
Also, no PC game to date REQUIRES AA or AF to run.

I'd have to go with pixel shaders as the obvious choice.
 