AMD RV770 refresh -> RV790

Think about fluid simulation, for example. If the recursive depth is high enough, you'd basically have every simulated particle potentially influencing every other particle, which would make the problem quite serial to solve, wouldn't it?

Now, nobody would do such a thing outside of scientific simulations and tech demos. But the larger your "kernel" (or whatever the technically correct term for the influence diameter is) becomes, the less parallel your physics becomes. Talking about the butterfly effect.

Sure - you can make it scale almost perfectly, like in 3DMark Vantage's physics test, where no single system can influence the other systems. You can also parallelize things like a particle system for smoke, a system for cloth sim and some rigid body simulation - okay. But the more realistic it becomes, the more serial it gets.

The more serial it becomes, the worse it will run on a GPU relative to a CPU. The main advantage of a GPU for physics modeling is how incredibly parallel it is.
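To make that concrete, here's a rough sketch (toy struct and function names, nothing from any real engine) of an all-pairs force loop where every particle within some influence radius contributes to every other. The outer loop can still be split across threads, but as the radius grows each particle has to read more and more of the shared data set, so the work stops being small, independent chunks.

```cpp
#include <cmath>
#include <vector>

struct Particle { float x, y, z; float fx, fy, fz; };

// Toy force accumulation: every particle within `radius` of particle i
// contributes to i's force. The loop over i is still parallelisable, but
// the larger the radius, the more of the whole data set each particle
// depends on, up to the worst case of O(N^2) coupling.
void accumulateForces(std::vector<Particle>& p, float radius) {
    const float r2 = radius * radius;
    for (size_t i = 0; i < p.size(); ++i) {
        for (size_t j = 0; j < p.size(); ++j) {
            if (i == j) continue;
            float dx = p[j].x - p[i].x;
            float dy = p[j].y - p[i].y;
            float dz = p[j].z - p[i].z;
            float d2 = dx * dx + dy * dy + dz * dz;
            if (d2 > r2 || d2 == 0.0f) continue;  // outside the influence radius
            float inv = 1.0f / std::sqrt(d2);     // toy 1/r falloff, purely illustrative
            p[i].fx += dx * inv;
            p[i].fy += dy * inv;
            p[i].fz += dz * inv;
        }
    }
}
```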

Also, as someone noted elsewhere, PhysX isn't even taking advantage of things like SSE3/4. I'd have to go looking, but I believe it doesn't use anything more advanced than MMX on the CPU side, which means it's even more crippled on the CPU and wasting even more cycles.
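For illustration only - this is a generic sketch, not PhysX's actual code - here's the kind of gap being talked about: a plain scalar position update versus an SSE version that processes four floats per instruction.

```cpp
#include <xmmintrin.h>   // SSE intrinsics

// Scalar update: one float per operation.
void integrateScalar(float* pos, const float* vel, float dt, int n) {
    for (int i = 0; i < n; ++i)
        pos[i] += vel[i] * dt;
}

// SSE update: four floats per instruction. For brevity this assumes n is a
// multiple of 4 and that both pointers are 16-byte aligned.
void integrateSSE(float* pos, const float* vel, float dt, int n) {
    __m128 vdt = _mm_set1_ps(dt);
    for (int i = 0; i < n; i += 4) {
        __m128 p = _mm_load_ps(pos + i);
        __m128 v = _mm_load_ps(vel + i);
        _mm_store_ps(pos + i, _mm_add_ps(p, _mm_mul_ps(v, vdt)));
    }
}
```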

The advantage of an open source or more practically a non-GPU specific solution is that it would theoretically also be optimized to use the CPU fairly efficiently. A GPU should still be faster in most situations since it's highly parallel. A CPU would naturally excel in things that are much more serial in nature.

As it is, PhysX is optimized for the PPU and GPU, and appears to be deliberately castrated with regard to the CPU. Or if not deliberately castrated, then deliberately ignored and left unoptimized, since a well-optimized CPU path would make the PPU/GPU look less attractive.

In that sense, yes I do hope PhysX dies horribly fast after OpenCL is released.

Regards,
SB
 
In that sense, yes I do hope PhysX dies horribly fast after OpenCL is released.

OpenCL isn't a replacement for PhysX. I don't know why people refer to them as equivalents. You're still going to have to wait for somebody to develop an OpenCL physics solution and integrate it into their engine.
 
Considering PhysX is its own complete library, while OpenCL is a compute language - one that could indeed be used to build a physics library, but that library would have to be developed by someone else and would be entirely up to the dev - saying OpenCL is a PhysX killer is pretty far-fetched.
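To make the distinction concrete, here's a rough sketch (hypothetical names, error handling stripped) of what a developer would have to write themselves just to push a toy integration step through the standard OpenCL C API - and that's before any of the actual physics (broad phase, contacts, constraint solving) exists at all.

```cpp
#include <CL/cl.h>
#include <vector>

// The "physics" here is only a toy Euler step; collision detection, contact
// generation and solving would all still be the engine developer's problem.
static const char* kSource = R"CLC(
__kernel void integrate(__global float4* pos,
                        __global const float4* vel,
                        const float dt) {
    size_t i = get_global_id(0);
    pos[i] += vel[i] * dt;
}
)CLC";

void runToyStep(std::vector<cl_float4>& pos, const std::vector<cl_float4>& vel, float dt) {
    cl_platform_id platform; cl_device_id device;
    clGetPlatformIDs(1, &platform, nullptr);
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, nullptr);
    cl_context ctx = clCreateContext(nullptr, 1, &device, nullptr, nullptr, nullptr);
    cl_command_queue queue = clCreateCommandQueue(ctx, device, 0, nullptr);

    cl_program prog = clCreateProgramWithSource(ctx, 1, &kSource, nullptr, nullptr);
    clBuildProgram(prog, 1, &device, nullptr, nullptr, nullptr);
    cl_kernel kernel = clCreateKernel(prog, "integrate", nullptr);

    size_t bytes = pos.size() * sizeof(cl_float4);
    cl_mem dPos = clCreateBuffer(ctx, CL_MEM_READ_WRITE | CL_MEM_COPY_HOST_PTR, bytes, pos.data(), nullptr);
    cl_mem dVel = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, bytes, (void*)vel.data(), nullptr);

    clSetKernelArg(kernel, 0, sizeof(dPos), &dPos);
    clSetKernelArg(kernel, 1, sizeof(dVel), &dVel);
    clSetKernelArg(kernel, 2, sizeof(float), &dt);

    size_t global = pos.size();
    clEnqueueNDRangeKernel(queue, kernel, 1, nullptr, &global, nullptr, 0, nullptr, nullptr);
    clEnqueueReadBuffer(queue, dPos, CL_TRUE, 0, bytes, pos.data(), 0, nullptr, nullptr);
    // Resource cleanup omitted for brevity.
}
```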
 
Sure, but now you're talking about the relative market positioning of Nvidia and Microsoft. There isn't anything inherently more open about DirectX except for the fact that Microsoft has a massive OS monopoly.

It wasn't so badly phrased that you couldn't understand what I really meant in the end.

Or let me put it this way. Is it any harder for AMD/ATI to adopt CUDA than it is for them to adopt OpenCL? Nope.

I'm not that naive to believe that CUDA was not tailored for NV's GPUs.

nicolasb: How is CUDA specific to Nvidia hardware?

I don't think he meant that any other IHV couldn't use CUDA in theory; the burning question however is whether it favours GeForces or not.
 
So, what about taking these bird droppings that make up PhysX and CUDA into their own thread?

We still have things to guess about the RV790, after all ;)
 
the burning question however is whether it favours GeForces or not.

From Eric Demers 6 months ago on rage3d:
R3D: Since we're in the neighborhood, this is a good time to ask about GPGPU. Are you fully focused on OpenCL?
Eric: We're strong supporters of OpenCL. We've worked closely with Apple, we're members of the Khronos board responsible for OpenCL. We're always going to be focused on open standards that move the industry forward ... relying on proprietary solutions isn't an adequate long-term solution. OpenCL holds a lot of promise.

R3D: That's an elegant way of saying that you won't support CUDA, in spite of nVidia's claims that you could/should/must. :)
Eric: Irrespective of what nVidia says, the truth is that CUDA is their proprietary solution, which means that if we were to use it we'd be stuck being second place and following their lead; that's not really a position we want to be in. Another characteristic of CUDA is that it's very G80 centric, so first we'd have to build a G80 in order to have a reason to use CUDA. So, no, we're not going to support/use CUDA in any foreseeable future.

So you can probably bet your house that the extra die area on the RV790 is not for CUDA ;)
 
I don't think he meant that any other IHV couldn't use CUDA in theory; the burning question however is whether it favours GeForces or not.

Obviously Nvidia's implementation favors Geforces, just like their DirectX implementation favors Geforces. Do we have an example of a characteristic of the CUDA API that favors a specific architecture? Aren't CUDA and OpenCL all about structuring the problem in a parallel fashion and firing off kernels? I don't think there's anything about the API itself that demands a specific hardware implementation or memory architecture. For example, there's no reason shared memory couldn't be in off chip ram. It'd just be slow, but it would be the same for OpenCL as well.
 
Mirror's Edge is a console port scaling with more than two cores on PC. It even scales quite a bit with a dedicated PhysX processor, which means that there aren't too many unused cores left in the system.
Eh?

The game does not scale appreciably with 2 extra CPU cores yet it scales dramatically with a GPU or a PhysX card. Clearly NVidia is taking the piss.

Now, of course you could fill those remaining cores up with physics calculations, but how to control the amount of processing time?
Huh?

Cores are ~ idling :!:

If, in some game, I turn on MSAA, the performance hit varies with the scene depending on a variety of factors. There's no pre-determined "limit of scaling" programmed into MSAA. It goes as fast as it can. Why would CPU effects-physics be any different?

Let's say 2 cores run the effects physics and graphics at 7 fps, while 4 cores can run the same at 30 fps, and with the GPU doing the effects physics it runs at 60 fps. That would be credible and indicate genuine worth.

But if 2 cores and 4 cores hardly differ in performance then it's quite obvious that we're being marchitectured. No surprise, of course.

The CPU should be maxed out. What else is the game doing? Since the game is able to run on a dual-core PC, the extra cores are genuinely spare.

You simply cannot sell a PC game running decently only on quad- or octacore machines.
So you think running decently only on an 8800GT or better is reasonable?

As it happens this is a non argument, because these are just effects physics, like god rays or bloom in graphics. Atmosphere, not gameplay.

See above. I have difficulties imagining how to effectively control the FLOPS used for physics. The obvious solution would be a slider for the number of pieces some stuff would break up into, but AFAIK every instance of this slider would have its own set of pre-tessellated geometry to work on, i.e. pre-defined breaking points.
If the game engine were designed properly it would detect the frame rate and adjust. The same goes for graphics effects: game engines should be dynamically scaling effects to frame rate.
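Something like this hypothetical controller is what I mean (made-up names, just a sketch): measure the frame time each frame and nudge the effects budget up or down, instead of a fixed quality slider. As the quoted post points out, scaling the piece count would still need pre-authored break levels in practice, but the budget itself can follow the frame rate.

```cpp
#include <algorithm>

// Hypothetical per-frame controller: adjust the effects-physics budget
// (e.g. how many debris pieces a shatter may spawn) toward whatever the
// measured frame time can afford.
class EffectsBudget {
public:
    explicit EffectsBudget(int maxPieces) : maxPieces_(maxPieces) {}

    // Call once per frame with the measured frame time in milliseconds.
    void update(float frameMs, float targetMs = 16.7f) {
        if (frameMs > targetMs * 1.1f)
            scale_ = std::max(0.1f, scale_ * 0.95f);   // over budget: cut back
        else if (frameMs < targetMs * 0.9f)
            scale_ = std::min(1.0f, scale_ * 1.02f);   // headroom: ramp up slowly
    }

    // How many pieces a shatter event is allowed to spawn this frame.
    int piecesForShatter() const {
        return std::max(1, static_cast<int>(maxPieces_ * scale_));
    }

private:
    int   maxPieces_;
    float scale_ = 1.0f;
};
```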

edit:
As to how scalable physics is in games, just look at the link you've given above: a 9600 GT achieving a mere 10% higher fps than the mighty 8600 GT. Physics does not even scale that well within Nvidia's GeForce GPUs - and there it scales with clock rate rather than with the number of cores.
Incompetence in software/hardware is no defence when the convenient, marketing-driven, end result is that CPU performance on 4 cores is useless.

Jawed
 
But if 2 cores and 4 cores hardly differ in performance then it's quite obvious that we're being marchitectured. No surprise, of course.

Yeah, there's no doubt Nvidia has only made token efforts to get PhysX up and running on CPUs. But do we even have a highly optimized CPU-based physics implementation to use as a benchmark? I haven't seen Havok doing anything particularly impressive with quad-cores either. So until they do, is there really a basis for criticizing PhysX's inadequate CPU utilization?
 
So until they do, is there really a basis for criticizing PhysX's inadequate CPU utilization?
Why do you need a third-party comparison to validate an opinion that's entirely defensible based solely upon the poor PhysX performance scaling on a CPU? At least come up with a technical defence (e.g. memory bandwidth).

Right now I'm playing with Novodex Rocket 1.1 (2.1.1 engine) :p which is very entertaining, literally hours of fun. But it only uses a single CPU core.

My A64 3200X2 (2GHz) is running the Building Explode test at about 8-13fps. That's 5008 blocks starting off in stacks with collision/friction interactions amongst themselves. Apparently that's how demanding the shattering glass is in Mirror's Edge :rolleyes:

Jawed
 
Why do you need a third-party comparison to validate an opinion that's entirely defensible based solely upon the poor PhysX performance scaling on a CPU? At least come up with a technical defence (e.g. memory bandwidth).

Well, you don't from a purely technical standpoint. But since this isn't happening in a vacuum, I'm just pointing out that even CPU-focused middleware hasn't stepped up to the plate either.

My A64 3200X2 (2GHz) is running the Building Explode test at about 8-13fps. That's 5008 blocks starting off in stacks with collision/friction interactions amongst themselves. Apparently that's how demanding the shattering glass is in Mirror's Edge

Well do we have an example of an actual game that has physically simulated shattering glass besides Mirror's Edge? We're making a lot of assumptions without anything to compare to as a baseline.


Wow, so where did RV790 come from in the first place? Wasn't there talk of moving to 55GT at TSMC in order to reach those clocks?
 
Well do we have an example of an actual game that has physically simulated shattering glass besides Mirror's Edge? We're making a lot of assumptions without anything to compare to as a baseline.
What is it about shattering glass that makes it more difficult than 5000 of these blocks? There's no friction amongst the pieces of glass, merely collision detection. It's a comparison of cuboids (regular shape, relatively easy) that collide and have mutual friction (can touch numerous other cuboids simultaneously, pretty difficult) versus irregular planar objects (i.e. each is a fixed configuration of planar triangles within a bounding cuboid) that merely collide.
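To put "merely collide" in concrete terms: the broad-phase part of either scenario is just a few comparisons per candidate pair, as in this generic sketch (not code from Rocket or PhysX). The expensive part of the blocks demo is the iterative contact and friction solving layered on top of that, which the glass shards don't need.

```cpp
// Axis-aligned bounding box for broad-phase overlap tests.
struct AABB { float min[3], max[3]; };

// Two boxes overlap iff their extents overlap on all three axes. This is
// the cheap part of both scenarios; resolving persistent stacked contacts
// with friction is an iterative solver problem, not a handful of compares.
bool overlaps(const AABB& a, const AABB& b) {
    for (int axis = 0; axis < 3; ++axis) {
        if (a.max[axis] < b.min[axis] || b.max[axis] < a.min[axis])
            return false;
    }
    return true;
}
```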

Jawed
 
What is it about shattering glass that makes it more difficult than 5000 of these blocks?

Nothing, theoretically. But that's an isolated test, not a full-blown game engine. All I'm saying is that we don't have anything to use as a benchmark. If these things are so easy to do on quad-cores, why haven't they been picked up by Havok-enabled games? Everyone raves about the physics in Source but it's pretty rudimentary stuff, just integrated very well with gameplay. We can infer what should be possible on a CPU but as with anything else the proof is in the pudding.
 
We can infer what should be possible on a CPU but as with anything else the proof is in the pudding.
We can also infer that NVidia, like Ageia beforehand, is taking the piss by hobbling the CPU code - it has everything to gain and nothing to lose. What's entertaining is seeing how blatant the evidence for this fraud is, yet reviewers are not calling them out on it.

It was easy to see that GT200 was over-priced and under-performing even before RV770's launch. This is similarly easy to see.

It's Far Cry HDR nonsense all over again.

Jawed
 
Wow, so where did RV790 come from in the first place? Wasn't there talk of moving to 55GT at TSMC in order to reach those clocks?
I can trace the RV790 rumor's origin back to TPU, which honestly looked like pure conjecture and speculation.

I don't think "RV790" was ever confirmed by CJ or other legit sources.
 