RSX = Nvidia 7900???

Mintmaster said:
On another note, I wonder if we'll see analysts reduce their projected cost of PS3 now that they know G71 is so tiny.
Not a lot, because the RSX was not a very expensive part in their estimates, but I think there will be a major re-evaluation of the costs based on the information Sony will release in connection with the end of their financial year, maybe as early as Wednesday. We'll see.

But back on the RSX-G71 topic: don't you think Barbarian's talk of multithreading on the RSX requires some more logic as well, even though it seemed to be a fairly simple design where the different threads used different registers?
Barbarian said:
As far as I know, the latency can be hidden by the hardware threading, but the pixel shaders have to use half the registers (compared to shaders when texturing from VRAM), otherwise there will be a performance degradation. It seems registers are shared between all hardware threads, and more threads are needed to hide longer latencies, hence fewer registers available per thread.
I think it sounds like a pretty clever design which will help to get more efficient use of the shader hardware, and it fits well in a closed console environment, where you can use every dirty trick in the book.
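
To put rough numbers on the trade-off Barbarian describes, here is a minimal sketch of a toy model, not a description of the actual RSX hardware. It assumes (my assumption, not something stated in the thread) that every thread does a fixed amount of ALU work between texture fetches and that one shared register file is split evenly between threads. The function names, the 128-register file, and the latencies are all hypothetical.

```python
import math


def threads_needed(fetch_latency_cycles, alu_cycles_per_fetch):
    """Threads required so that the other threads' ALU work covers one fetch.

    Toy model: while one thread waits on a fetch, each of the other
    threads contributes alu_cycles_per_fetch cycles of useful work.
    """
    if fetch_latency_cycles < 0 or alu_cycles_per_fetch <= 0:
        raise ValueError("latency must be >= 0 and ALU cycles must be > 0")
    return 1 + math.ceil(fetch_latency_cycles / alu_cycles_per_fetch)


def registers_per_thread(shared_registers, threads):
    """Even split of one shared register file between all threads."""
    return shared_registers // threads


SHARED_REGISTERS = 128  # hypothetical size, for illustration only

for latency in (40, 80, 160):  # hypothetical fetch latencies in cycles
    t = threads_needed(latency, alu_cycles_per_fetch=10)
    print(latency, t, registers_per_thread(SHARED_REGISTERS, t))
```

In this model, doubling the fetch latency roughly doubles the number of threads and so roughly halves the registers each thread can use, which is in the spirit of the "half the registers" remark.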

In one of the Nvidia patents that have been discussed, they use multiple threads that reuse the arithmetic units of some fixed-function calculation units. I have a pet theory that the RSX actually implements that patent; it could be a way to get a bit closer to the incredible 1.8 TFLOPS figure they flashed at last year's E3.

http://patft.uspto.gov/netacgi/nph-...t&s1=6987517.WKU.&OS=PN/6987517&RS=PN/6987517
 
Crossbar said:
I think it sounds like a pretty clever design which will help to get more efficient use of the shader hardware, and it fits well in a closed console environment, where you can use every dirty trick in the book.
It sounds as if PS3 developers can explicitly change the batch size, something which only the drivers can do on PC as discussed here:
http://www.beyond3d.com/forum/showthread.php?t=29041


In one of the Nvidia patents that have been discussed, they use multiple threads that reuse the arithmetic units of some fixed-function calculation units. I have a pet theory that the RSX actually implements that patent; it could be a way to get a bit closer to the incredible 1.8 TFLOPS figure they flashed at last year's E3.

http://patft.uspto.gov/netacgi/nph-...t&s1=6987517.WKU.&OS=PN/6987517&RS=PN/6987517
I don't think RSX is different from NV4x/G7x in that regard.
 
Xmas said:
It sounds as if PS3 developers can explicitly change the batch size, something which only the drivers can do on PC as discussed here:
http://www.beyond3d.com/forum/showthread.php?t=29041
Are you implying the drivers could already be hiding some multithreading?

Xmas said:
I don't think RSX is different from NV4x/G7x in that regard.
You are likely correct. What I find intriguing is the fact that the patents explicitly use shader quads in the illustrations.

rs.JPG

rs2.JPG
 
Crossbar said:
...
You are likely correct. What I find intriguing is the fact that the patents explicitly use shader quads in the illustrations.
...

That patent was posted here,

NVIDIA Patent: G70/G80/RSX?

It's quite clearly a unified shader architecture at the hardware level. RSX has discrete VS/PS units as disclosed at E3...
 
Jaws said:
It's quite clearly a unified shader architecture at the hardware level.
You are right, but if they implemented multithreading as Barbarian suggests, they may have borrowed some ideas from this patent. Well, it's just speculation on my part.
 
Crossbar said:
You are right, but if they implemented multithreading as Barbarian suggests, they may have borrowed some ideas from this patent. Well, it's just speculation on my part.

Well, a basis for speculation with a patent and multi-threading could be this NV patent,

Across-thread out of order instruction dispatch in a multithreaded graphics processor

patent said:
Across-thread out of order instruction dispatch in a multithreaded graphics processor

Abstract

Instruction dispatch in a multithreaded microprocessor such as a graphics processor is not constrained by an order among the threads. Instructions are fetched into an instruction buffer that is configured to store an instruction from each of the threads. A dispatch circuit determines which instructions in the buffer are ready to execute and may issue any ready instruction for execution. An instruction from one thread may be issued prior to an instruction from another thread regardless of which instruction was fetched into the buffer first. Once an instruction from a particular thread has issued, the fetch circuit fills the available buffer location with the following instruction from that thread...
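
The dispatch scheme in the abstract can be illustrated with a small toy simulation. This is only a sketch of the idea as worded above, not the patented circuit: the one-issue-per-cycle limit, the "lowest thread id wins" tie-break, and the `wait` field (cycles until an instruction's operands are ready after it is fetched) are assumptions made for illustration, and the function name `simulate` is hypothetical.

```python
def simulate(programs):
    """Issue instructions from per-thread programs, at most one per cycle.

    programs: {thread_id: [(name, wait_cycles), ...]}
    The buffer holds at most one instruction per thread. Any instruction
    whose ready cycle has arrived may issue, regardless of fetch order.
    Returns a list of (cycle, name) in issue order.
    """
    next_index = {t: 0 for t in programs}  # per-thread program counter
    buffer = {}                            # thread_id -> (name, ready_cycle)

    def fetch(thread, now):
        # Refill this thread's buffer slot with its next instruction, if any.
        i = next_index[thread]
        if i < len(programs[thread]):
            name, wait = programs[thread][i]
            next_index[thread] = i + 1
            buffer[thread] = (name, now + wait)

    for thread in programs:
        fetch(thread, 0)

    issued = []
    cycle = 0
    while buffer:
        ready = [t for t, (_, ready_at) in buffer.items() if ready_at <= cycle]
        if ready:
            thread = min(ready)            # assumed tie-break: lowest thread id
            name, _ = buffer.pop(thread)
            issued.append((cycle, name))
            fetch(thread, cycle + 1)       # slot is refilled from the same thread
        cycle += 1
    return issued


# Thread 0 is fetched first, but its first instruction stalls for 3 cycles.
print(simulate({0: [("A0", 3), ("A1", 0)], 1: [("B0", 0), ("B1", 0)]}))
```

In the example, both of thread 1's instructions issue while thread 0's first instruction is still waiting, even though thread 0 was fetched into the buffer first.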
 
Jaws said:
Well, a basis for speculation with a patent and multi-threading could be this NV patent,

Across-thread out of order instruction dispatch in a multithreaded graphics processor
Could be, but it's not a perfect match for what Barbarian described either, as the patent does not require the threads to use different subsets of the registers.

I guess there may be a few more patents being processed right now. We'll just have to be patient for a few more weeks until Sony drops the strict NDAs concerning the RSX.
 
Crossbar said:
Could be, but it's not a perfect match for what Barbarian described either, as the patent does not require the threads to use different subsets of the registers.

I guess there may be a few more patents being processed right now. We'll just have to be patient for a few more weeks until Sony drops the strict NDAs concerning the RSX.

Well, this part from the patent sounds like what Barbarian described,

patent said:
...[0007] One way to reduce this inefficiency is by increasing the number of threads that can be executed concurrently by the core. This, however, is an expensive solution because each thread requires additional circuitry. For example, to accommodate the frequent thread switching that occurs in this parallel design, each thread is generally provided with its own dedicated set of data registers. Increasing the number of threads increases the number of registers required, which can add significantly to the cost of the processor chip, the complexity of the design, and the overall chip area. Other circuitry for supporting multiple threads, e.g., program counter control logic that maintains a program counter for each thread, also becomes more complex and consumes more area as the number of threads increases.

[0008] It would therefore be desirable to provide an execution core architecture that efficiently and effectively reduces the occurrence of bubbles in the execution pipeline without requiring substantial increases in chip area.
...

It could also describe a unified architecture...
 
Crossbar said:
Are you implying the drivers could already be hiding some multithreading?
Hide multithreading? I don't know what you mean. :-?



What Barbarian describes is already present in a similar form in NV30, though I think it has been greatly improved in NV40. The number of quads (threads) in flight in the pipeline is not fixed. There is a big register file providing space for N FP32 4-vectors (or 2N FP16 4-vectors). For any pixel shader program the compiler calculates the number of temporary registers used, and N divided by that number determines how many quads may be in flight simultaneously.

The more quads are in flight, the more of the latency of fetching texels from memory can be hidden, because there are more other quads to process before the texels for a given quad are needed.
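
The register-file arithmetic above can be sketched as follows. This takes the description literally (capacity divided by temporaries per shader) and is not a statement about the real NV30/NV40 allocator; the value of N and the function name are hypothetical.

```python
def quads_in_flight(n_fp32_vec4, temps_per_shader, fp16=False):
    """Quads that fit in the register file for a given shader.

    n_fp32_vec4: capacity N in FP32 4-vectors (2N if the shader uses FP16).
    temps_per_shader: temporary registers the compiler counted for the shader.
    """
    if temps_per_shader <= 0:
        raise ValueError("a shader must use at least one temporary register")
    capacity = n_fp32_vec4 * (2 if fp16 else 1)
    return capacity // temps_per_shader


N = 256  # hypothetical register file size, for illustration only

for temps in (2, 4, 8):
    print(temps, quads_in_flight(N, temps), quads_in_flight(N, temps, fp16=True))
```

A shader that needs twice as many temporaries gets half as many quads in flight, and so has less work available to cover texture latency; dropping to FP16 buys the factor of two back.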
 
Jaws said:
Well, this part from the patent sounds like what Barbarian described,

It could also describe a unified architecture...
Yes, I guess it leaves some room for interpretation. :smile:

Xmas said:
Hide multithreading? I don't know what you mean.
It was just a question. I couldn't really relate your post to mine in a good way.
But maybe the rest of your post is the answer to my question. I guess you are suggesting that a single shader could already have several threads executing simultaneously (as Barbarian indicates) in the NV30, and that this has been improved in the NV40.
If that is the case my speculation is indeed outdated.
 
!eVo!-X Ant UK said:
RSX = Nvidia 7900???

nope.

RSX to GDDR3 is not going to have anywhere near the bandwidth of GF 7900 to GDDR3.

It is also likely (though not confirmed) that RSX will only have 8 ROPs instead of the 16 ROPs that NV40, G70, and G71 all have.
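
For a feel of the size of the bandwidth gap, peak memory bandwidth is just bus width times data rate. The sketch below is illustrative only: the bus widths and data rates are assumed figures that are not given anywhere in this thread, and the function name is hypothetical.

```python
def peak_bandwidth_gb_s(bus_width_bits, data_rate_mt_s):
    """Peak bandwidth in GB/s: bytes per transfer times transfers per second."""
    bytes_per_transfer = bus_width_bits // 8
    return bytes_per_transfer * data_rate_mt_s / 1000


# Assumed figures for illustration, not taken from the thread:
narrow = peak_bandwidth_gb_s(bus_width_bits=128, data_rate_mt_s=1400)
wide = peak_bandwidth_gb_s(bus_width_bits=256, data_rate_mt_s=1600)
print(narrow, wide)
```

With these assumed inputs, the narrower bus ends up with well under half the bandwidth of the wider one.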
 