#1 |
|
Member
Join Date: May 2003
Posts: 295
|
http://pc.watch.impress.co.jp/docs/2.../kaigai199.htm
SCEI and PS3 development kit schedule announced

Sony Computer Entertainment held its yearly summary meeting for the PlayStation, the PlayStation Meeting 2005. At that meeting, new information regarding the PlayStation 3 was released. First of all, the scheduled launch of the PS3 in the spring of 2006 was reconfirmed, and immediately before that, a pre-event called the “PlayStation Conference” will be held.

At first, SCEI used the Cell Evaluation System to do software stack validation. This machine was supposedly used as a debugging machine in internal company labs, and it was also provided to a select group of vendors for evaluation. It has a 2.4 GHz Cell processor, 256MB of XDR DRAM, and an nVidia graphics board.

Next, SCEI developed the much-anticipated “PS3 Evaluation System” for customer evaluation. The machine number is CEB-2030 and the codename is “Cytology.” SCEI has been distributing these machines to software vendors since this spring. The specs of the PS3 Evaluation System will be explained later, but basically it has a 2.4GHz Cell, 512MB of XDR DRAM, and a GeForce 7800 (G70).

In December 2005, SCEI is scheduled to release the “PS3 Reference Tool”, which has nearly the same architecture as the actual PS3. It will have a 3.2 GHz Cell, the RSX, 512MB of XDR DRAM, and a BD drive. Currently, it is set to be a 2U rack mount unit, but vertical configurations are being considered.

SCEI will continue to provide PS3 Evaluation Systems until November. Currently, 450 units have been sent out, and the number will continue to increase according to the following supply figures, to answer the intense demand for the machine. August: 200 units; September: 300 units; October: 3,000 units; November: 3,000+ units.

CELL and XDR DRAM are 75% of PS3’s capability

The PS3 Evaluation System differs from the final PS3 specs in various ways. First of all, the Cell operating frequency is 2.4 GHz, which is 75% of the production board's clock.
In the case of the CPU, it is not uncommon to hold down the clock speed until validation is completed. Of course, while it is not possible for this machine to perform at the PS3 final spec, the knowledge that it is at 75% [and then compensating for it] should be enough to get by.

The memory is XDR DRAM, and the Cell chip used is connected to the XDR DRAM by the XDR DRAM interface (XIO). This is also not full-spec. At least in June, the XDR DRAM data transfer rate in the PS3 Evaluation System was held to 2.4 Gbps. The PS3’s XDR DRAM data rate will be 3.2 Gbps, so this is also at 75% capability.

The XDR DRAM data rate drop can be seen as being in sync with the CPU clock speed drop. What this shows is the possibility that the Cell CPU core and the XDR DRAM interface were developed at the same time. Simultaneous development is easier, and has other advantages. In particular, latency is a very important factor between CPU and memory, so simultaneous development has many advantages.

Most importantly, the XDR DRAM rate may have been dropped to compensate for the yield rate of the new XDR DRAM. It might be difficult to create 3.2 Gbps XDR DRAM samples at this early stage. If we think about the DRAM cell core clock (Internal Column Frequency), 3.2Gbps XDR DRAM is rather difficult. When XDR DRAM mass production for the PS3 begins, it will be moved to a 90nm process, but for now, it is being built on 100-110nm processes, which is bad for yield rates.

Additionally, in the PS3 Evaluation System, RIMMs (Rambus memory modules) are used. These modules might eat into the timing margins. The PS3 Evaluation System introduced at this conference has 512 MB of XDR DRAM. This is twice the 256 MB of the PS3. This increase might be due to the RIMM modules. In June, it was explained that the PS3 Evaluation System was designed to also be able to use RIMMs. This large amount of memory is meant for verification [appraisal, testing] purposes.
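The 75% figure quoted above for both the Cell clock and the XDR data rate can be checked with a quick calculation. This is only a minimal sketch: the variable and function names are made up for illustration, and the 3.2 GHz final clock is an assumption taken from the PS3 Reference Tool spec, which the article describes as nearly the same as the actual PS3.

```python
# Rough check of the "75%" claims. All names here are illustrative only.
EVAL_CELL_GHZ = 2.4    # PS3 Evaluation System Cell clock (from the article)
FINAL_CELL_GHZ = 3.2   # ASSUMED final clock, based on the Reference Tool spec
EVAL_XDR_GBPS = 2.4    # XDR data rate in the June evaluation system
FINAL_XDR_GBPS = 3.2   # XDR data rate planned for the PS3


def ratio(eval_value, final_value):
    """Return the evaluation system's value as a fraction of the final one."""
    return eval_value / final_value


cell_ratio = ratio(EVAL_CELL_GHZ, FINAL_CELL_GHZ)
xdr_ratio = ratio(EVAL_XDR_GBPS, FINAL_XDR_GBPS)

print(f"Cell clock: {cell_ratio:.0%} of final")
print(f"XDR rate:   {xdr_ratio:.0%} of final")
```

Both ratios come out the same, which is what the article means when it says the data rate drop is in sync with the clock drop.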
The XDR DRAM interface is configurable, so it has high flexibility. It is an x16 interface, but is also capable of being configured as x8 or x4. XDR DRAM has a point-to-point connection with the Cell chip. For example, by changing from x16 to x8, one channel can support connections with twice as many DRAM chips. The RIMM module takes advantage of this property, allowing one channel to support two RIMMs while maintaining a point-to-point connection. (Trans. note: by context it is clear that “point-to-point” means a direct connection between two ICs, with no intervening chip.) On the other hand, the final PS3 design has the XDR DRAM memory directly integrated into the motherboard.

Currently, the graphics are connected by PCI-Express x4

In the PS3 Evaluation System, the PC-centric GeForce 7800 GTX (G70) is used as a substitute for the RSX. The RSX and G70 are made from about the same shaders, and the internal shader architecture is predicted to be quite similar. Because of that, as far as graphics are concerned, using the G70 as a base for software development should not create many problems. Shader programs should be able to run as if the two chips were the same. However, the G70 has a lower clock speed than the RSX, and will certainly have some level of performance difference.

An even greater difference than the internal GPU performance, however, is the interface. In the PS3, the Cell and RSX are connected by a parallel interface developed by Rambus called FlexIO (Redwood), which has a wide 35GB/sec bandwidth (20GB/sec down, 15GB/sec up). However, the G70, which has a PCI Express x16 interface, cannot be directly connected to the Cell’s FlexIO interface.

Therefore, in the PS3 Evaluation System the G70 is connected to the south bridge by PCI Express. In the June PS3 Evaluation System, they were connected by PCI Express x4. The south bridge used by the PS3 Evaluation System is basically the same as the south bridge developed by IBM for the Cell Workstation.
Because of that, the chip has a peripheral I/O PCI Express x4 link meant for server applications. In the final version, PCI Express will disappear from the south bridge, but currently, the G70 is connected by it.

For that reason, the PCI Express x16 interface in the G70 currently cannot realize its full potential. According to the spec of the south bridge, Cell has only a 5 GB/sec FlexIO interface to the south bridge. If we assume the same is true for the PS3 Evaluation System, it will have drastically less bandwidth than the actual machine. Furthermore, the G70 is connected to the south bridge by PCI Express x4, which, at 2GB/sec, is even less. If we compare Cell->GPU bandwidth, we see that the PS3 Evaluation System has only 1/20 of the PS3's.

According to SCEI, in the PS3 Evaluation System, the graphics side has been increased to 512MB of GDDR3 memory. In the actual PS3, there will be 256 MB of GDDR3. The reason for this increase in the video side memory is to allow buffering of data into the graphics side when the bus is idle. However, it will be difficult to use the PS3 Evaluation System to effectively evaluate the wide connection between the Cell and RSX in the PS3. Additionally, the GDDR3 interface of the RSX is 128 bits wide, whereas that of the G70 is 256 bits wide, which means that if both use x32 512Mbit DRAM chips, the G70 can support twice as much memory.

The special characteristic of the PS3 is the connection between Cell and RSX

The big special characteristic of PS3 graphics is the connection between Cell and RSX. The RSX itself has a similar architecture to the G70, but the host interface for the G70 is meant for the PC and is completely different. The G70 uses PCI Express x16 to connect to the chipset at 8GB/sec (4GB/sec one-way), and it cannot directly access main memory. In contrast, the RSX has a 35GB/sec (20GB/sec down, 15GB/sec up) direct connection to the Cell, and can directly render from the main memory on the Cell side.
This is a big difference, because it allows a completely different way of using the GPU from PC architectures, SCEI explained.

First of all, because the bus is wider, the Cell can perform a great amount of geometry operations, then send the vertex data [to the RSX]. Conversely, the RSX side can easily send data back to the Cell.

“The Cell processor can do both pre-processing and post-processing. For example, tessellation, dot filling, etc. Cell can perform physics processing like collision and motion calculations, and transform the vertex array,” said David B. Kirk, Chief Scientist of nVidia.

SCEI basically expects higher abstraction levels to be processed by Cell, and the details (like vertices and pixels) to be processed by the GPU. This is reasonable. For example, when the CPU side handles geometry transformation, collision detection, which is important in games, is not a problem. When the GPU handles geometry transformation, clipping issues may occur if the data is not sent back to the CPU. In the case of the PS3, the Cell side can perform transformations, and even if the GPU is used for transformation, it is comparatively easy to send the data back to the CPU side.

In architectures up to now, either the CPU or the GPU has been the bottleneck. If this is not resolved, we cannot go any further. To address this, in the PS3 architecture the workload can be shifted: if the GPU becomes the bottleneck, it can shift work to the Cell, and if the Cell becomes the bottleneck, it can send work to the GPU. For example, depending on the software, the Cell side can take on more of the graphics processing or, conversely, the graphics work can easily be left to the GPU, it was explained. In summary, the balance between the two programmable processors, CPU and GPU, can be adjusted flexibly.
In previous PC architectures, because they were limited by the CPU<->GPU pipe, geometry operations were held to a certain limit, and how rich an environment you could create within that limit became the main technical challenge. In contrast, PlayStation 2-type game consoles generated large numbers of polygons, but beyond that they did not have the expressiveness of PCs. (Trans. note: probably means that PS2 is less capable in applying different effects to polygons than the PC, despite pumping out more polygons.) In the case of the PS3, both are possible, with the flexibility to balance the two.

However, in the case of the currently available PS3 Evaluation System, because of restrictions in the architecture, it is not possible to evaluate the balancing [of Cell and RSX]. This is a difficulty and a weakness, but, stated differently, software demos on the current systems still do not demonstrate the full potential of the PS3. It is possible that the actual PS3 will have performance greater than current demos.

Significantly, when it comes to bus bandwidth, the Xbox 360 CPU-GPU connection is 21.6GB/sec, which is much wider than that of PCs. Among the next-generation consoles, a wide-bandwidth CPU-GPU connection is not a characteristic of the PS3 alone.

PS2’s simple boot-up started with firmware, and it loaded the OS and libraries from the disk. In comparison, the PS3 starts from “Haipaabaiza” (Hyper-visor?) firmware. Haipaabaiza is a type of VMM (Virtual Machine Manager) software, which runs not on top of but under the OS, providing machine virtualization. Even when using only the Cell OS for gameplay, Haipaabaiza will always start first, and the pre-defined OS (guest OS) runs on top of it. The OS, along with Haipaabaiza, forms a two-layer image. This basic OS layering is the same in the PS3 Evaluation System.
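The bandwidth and memory figures quoted in the article can be cross-checked with a back-of-the-envelope sketch like the one below. All names are made up for illustration. It assumes first-generation PCI Express at 250 MB/sec per lane per direction, and it reads the article's 2GB/sec figure for x4 as a bidirectional total, the same way the article quotes x16 as 8GB/sec (4GB/sec one-way). Under that reading, the "1/20" comparison is one-way x4 against the 20GB/sec Cell-to-RSX direction of FlexIO.

```python
# Back-of-the-envelope check of the article's numbers. Names are illustrative.
PCIE_LANE_GB_S = 0.25   # ASSUMED: PCIe 1.x, 250 MB/sec per lane, per direction
FLEXIO_DOWN_GB_S = 20.0  # Cell -> RSX, from the article
FLEXIO_UP_GB_S = 15.0    # RSX -> Cell, from the article


def pcie_one_way_gb_s(lanes):
    """One-way bandwidth of a PCIe 1.x link with the given lane count."""
    return lanes * PCIE_LANE_GB_S


def gddr3_capacity_mb(bus_bits, chip_io_bits=32, chip_mbit=512):
    """Memory capacity when the bus is filled with one chip per x32 slice."""
    chips = bus_bits // chip_io_bits
    return chips * chip_mbit // 8  # Mbit -> MB


x4_one_way = pcie_one_way_gb_s(4)
x16_one_way = pcie_one_way_gb_s(16)
eval_vs_ps3 = x4_one_way / FLEXIO_DOWN_GB_S

print(f"PCIe x4:  {x4_one_way} GB/sec one-way, {2 * x4_one_way} GB/sec total")
print(f"PCIe x16: {x16_one_way} GB/sec one-way, {2 * x16_one_way} GB/sec total")
print(f"Eval system Cell->GPU vs PS3: 1/{round(1 / eval_vs_ps3)}")
print(f"RSX (128-bit): {gddr3_capacity_mb(128)} MB")
print(f"G70 (256-bit): {gddr3_capacity_mb(256)} MB")
```

With x32 512Mbit chips, the 128-bit and 256-bit buses work out to the 256 MB and 512 MB figures given for the PS3 and the Evaluation System respectively.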
__________________
With the return of school, I have regained the need to procrastinate, and hence, the need to come to Beyond3d... It's too bad all the posts I want to reply to are locked. |
|
|
|
|
|
#2 |
|
Member
Join Date: May 2003
Posts: 295
|
NOTES:
This is supposed to be a sentence-by-sentence translation, but that's not 100% true. Sometimes I rearranged things for clarity in English. Things not explicitly stated but implied, and alternative translations, are marked in [] or written as translator notes. It's obvious in the original article that at some point in June, a certain set of specs for the PS3 Evaluation Machine was released, and the author bases some of his analysis on that, but he is not confident that those specs are still current. That's why he keeps referring to June. Also, at some point, specs for the south bridge of the Cell Workstation were released. Thanks to matmarkyau for informing me that the experimental forum (where this was originally posted) was gonna die, and that the normal one was back up. Also, if something's wrong, let me know - my Japanese is decent but not great. :/ Enjoy! And thanks DaveB for keeping this place alive.
|
|
|
|
|
#3 |
|
Member
Join Date: Oct 2003
Location: Australia,Brisbane
Posts: 491
|
No Problem.
|
|
|
|
|
|
#4 |
|
Artist formely known as Vysez
Join Date: Mar 2004
Location: Paris, France
Posts: 3,899
|
Thanks for the translation, nondescript.
Excellent work! Now, we'll be able to discuss the whole article, not the diagrams only.
__________________
- Power corrupts and absolute power is kinda neat. - If at first you don't succeed, put it out for beta test. --Internets |
|
|
|
|
|
#5 | |
|
Member
Join Date: Apr 2005
Posts: 659
|
Quote:
Last we heard from David Kirk he seemed to criticize X360 developers for moving pixel operations to the CPU side. I wonder if this idea will be used much on both systems. Or will Xenos rely on the unified shaders for load balancing? |
|
|
|
|
|
|
#6 | |
|
Regular
Join Date: Dec 2004
Posts: 5,670
|
Quote:
Anyway, great translation nondescript - thank you very much! |
|
|
|
|
|
|
#7 | |
|
Gamerscore Wh...
Join Date: Jan 2002
Posts: 12,946
|
Quote:
|
|
|
|
|
|
|
#8 | ||
|
Gamerscore Wh...
Join Date: Jan 2002
Posts: 12,946
|
Quote:
|
||
|
|
|
|
|
#9 | |||
|
Regular
Join Date: Dec 2004
Posts: 5,670
|
Quote:
Quote:
|
|||
|
|
|
|
|
#10 |
|
Senior Member
Join Date: Jun 2005
Location: Chicago, IL
Posts: 1,527
|
EDIT: question answered by the great DaveB, while I typed.
|
|
|
|
|
|
#11 | |
|
Gamerscore Wh...
Join Date: Jan 2002
Posts: 12,946
|
Quote:
|
|
|
|
|
|
|
#12 |
|
Member
Join Date: Apr 2005
Posts: 659
|
David Kirk was trying to imply that the automatic workload balancing of unified shaders wasn't sufficient.
Which would make sense if the scene was both vertex and pixel bound. In the unified case you would split the shaders 50/50. 100% dedicated pixel shading on the GPU + CPU support seems like it would be a better option for performance. And it's possible now thanks to the wide CPU<->GPU pipe. |
|
|
|
|
|
#13 | |
|
Gamerscore Wh...
Join Date: Jan 2002
Posts: 12,946
|
Quote:
|
|
|
|
|
|
|
#14 | |
|
Naughty Boy!
Join Date: Feb 2002
Posts: 6,802
|
Quote:
__________________
I've got a working quantum computer prototype in my backyard. The only problem is, it crashes at temperatures above absolute zero therefore is not very overclocker friendly. |
|
|
|
|
|
|
#15 | |||
|
Member
Join Date: May 2003
Posts: 295
|
Quote:
Quote:
The author's probably wrong in this case. It wouldn't be the first time that's happened. And I believe DaveB is right in saying that the author is referring to G70 in the general PC environment.
|||
|
|
|
|
|
#16 |
|
Gamerscore Wh...
Join Date: Jan 2002
Posts: 12,946
|
I didn't think it was your translation that was the issue.
|
|
|
|
|
|
#17 | |
|
Member
Join Date: May 2003
Posts: 295
|
Quote:
|
|
|
|
|
|
#18 | ||
|
Member
Join Date: Apr 2005
Posts: 659
|
Quote:
or you could have 10 dedicated shaders (which are 25% more efficient) allocated for pixel and let the CPU handle geometry. The second option seems to make more sense. |
||
|
|
|
|
|
#19 | |||
|
Naughty Boy!
Join Date: Feb 2002
Posts: 6,802
|
Quote:
|||
|
|
|
|
|
#20 | |
|
Member
Join Date: Apr 2005
Posts: 659
|
Quote:
|
|
|
|
|
|
|
#21 | ||
|
Naughty Boy!
Join Date: Feb 2002
Posts: 6,802
|
Quote:
||
|
|
|
|
|
#22 | |
|
Member
Join Date: May 2003
Posts: 295
|
Quote:
The reason I translated this is that I wanted to hear what devs have to say about it. This seems to be a major weakness in PS3 development support, if the PS3 Eval System architecture encourages a different data-passing scheme and CPU-GPU workload allocation than the one that will bring out maximum performance in PS3.
|
|
|
|
|
|
#23 |
|
Grumpy Mod
Join Date: Dec 2004
Location: In a pretty pink padded cell
Posts: 25,994
|
I don't know how the current setup will affect first-gen games. There's still an existing system comparable with PC setup so devs can use existing PC communication.
Process data, fill RAM, GPU collects data from RAM and renders pixels. This works well enough for, say, UE3 1st gen games even on the eval kits. The huge CPU<>GPU bandwidth of PS3 is something devs will probably need to have a good long look at to make real use of it for more than just feeding the GPU. Worst case, a dev can work on an algorithm for doing something over FlexIO by communicating over the slow main RAM, and know that they'll have 15x the BW for the real thing. As long as they get the algorithm working, even if slow, they'll make headway into using FlexIO. The only unknown is *how* to address the GPU or CPU storages. I think they've all got their own addresses so it's all accessed with pointers, in which case going from reading/writing main memory to reading/writing directly across the FlexIO should be a doddle. Incidentally, how is memory addressed on Cell in a scalable environment? In the case you have, say, 2 PS3's linked up sharing resources, how would PS3a access PS3b's RAM? I'm guessing this is intrinsic to Cell's design given scalability was a key priority. Are there indications as to where in the memory map the SPE LS's and GPU caches (or other onboard storage on RSX) are located?
__________________
Shifty Geezer ... Tolerance for internet moronism is exhausted. Anyone talking about people's attitudes in the Console fora, rather than games and technology, will feel my wrath. Read the FAQ to remind yourself how to behave and avoid unsightly incidents. |
|
|
|
|
|
#24 | |
|
Member
Join Date: Apr 2005
Posts: 659
|
Quote:
|
|
|
|
|
|
|
#25 | ||
|
Naughty Boy!
Join Date: Feb 2002
Posts: 6,802
|
Quote:
||
|
|
|