ERP said:Quote:
I think the same 1998 argument applies: if PS2 were to have 2.4 Gpix/s, a WTF would be applied! Six years later in 2004, simple Moore's law should take us above 30 Gpix/s, otherwise I'd sack my R&D team! All those billions of yen...
On the subject of what we'd do with 10's of Gpix/sec: since the entire GPU is programmable, including the pixel engine, would we not be able to implement a Reyes pipeline? Or other exotic delights?
Alright, you missed my point......
My point is: what am I trading for those 10's of billions of pixels? Increasing the die area to increase fillrate means that die area can't be used for, say, more ALU blocks, or better texture filtering, etc etc.....
How useful is 10 billion pixels per second if you're entirely limited by ALU speed? The statement is more about balance than it is about fillrate.
My concern about PS3 in general is exactly what Sony will leave off the die for cost reasons. I have to assume we'll get decent texture filtering that works this time, I have to assume we'll have a complete set of blending ops, but I do worry that they might decide on a "novel" architecture and then have someone that doesn't understand it cut significant features for cost reasons...... IMO this is what happened to the GS.....
When it comes to system performance the devil is in the details and it's the details of what I'm not seeing or hearing that worry me. We'll know soon enough and it's not like I have any control over it so........
SUMMARY OF THE INVENTION
[0012] The present invention has been made under the above circumstances, and therefore an object of the present invention is to realize various operations by one computing unit without increasing the costs...
Paul said:Sony only cares about peak performance and raw numbers, it's absolutely true. To think that they will limit the fillrate to 8GP/S just because more would be 'pointless' is wishful thinking.
PSurge said:I'm not sure if I understood the patent correctly, but it seems to me that a SALP is meant to replace fixed function math units dedicated to things like texture filtering, LOD calculations, blending etc... and that their general purpose nature means that each one can service many different currently hardwired needs. My guess is that performance suffers for a given operation versus a dedicated hardware implementation, but that this is made up for by much better utilization (and flexibility).
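To make that reading a bit more concrete, here is a minimal C sketch of what a single bilinear texture fetch costs when it is done as plain math on a general-purpose unit rather than inside a fixed-function sampler. The texture layout, the function names and the clamp addressing are my own assumptions for illustration, not anything taken from the patent.

```c
#include <math.h>

/* Hypothetical 8-bit single-channel texture, row-major in memory. */
typedef struct {
    const unsigned char *texels;
    int width, height;
} Texture;

static int clampi(int v, int lo, int hi) {
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Bilinear sample at normalized coordinates (u, v) in [0,1].
   A fixed-function sampler does all of this per fetch "for free";
   a general-purpose unit spends the muls/adds and four loads itself. */
float sample_bilinear(const Texture *t, float u, float v) {
    float x = u * (float)(t->width  - 1);
    float y = v * (float)(t->height - 1);
    int x0 = (int)floorf(x), y0 = (int)floorf(y);
    float fx = x - (float)x0, fy = y - (float)y0;

    int x1 = clampi(x0 + 1, 0, t->width  - 1);
    int y1 = clampi(y0 + 1, 0, t->height - 1);
    x0 = clampi(x0, 0, t->width  - 1);
    y0 = clampi(y0, 0, t->height - 1);

    float t00 = t->texels[y0 * t->width + x0];
    float t10 = t->texels[y0 * t->width + x1];
    float t01 = t->texels[y1 * t->width + x0];
    float t11 = t->texels[y1 * t->width + x1];

    float top    = t00 + (t10 - t00) * fx;   /* lerp along x */
    float bottom = t01 + (t11 - t01) * fx;
    return top + (bottom - top) * fy;        /* lerp along y */
}
```

Counted this way, one bilinear fetch is roughly a dozen ALU ops plus four loads, which is the utilization-versus-dedicated-speed trade PSurge is describing.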
Jaws said:I suppose it could be stretched to fit the 4 VS GPU, but I still find it odd that the GPU is hooked off the BE in Diag A but off the main bus in Diag C. Something just doesn't add up there? :?
All things equal with the BE and VS, which bus layout seems the most efficient, Diag A or Diag C?
ERP said:I just have to laugh when people propose these ludicrous polygon counts they're expecting to see in next-gen titles.
ERP wrote:
I just have to laugh when people propose these ludicrous polygon counts they're expecting to see in next-gen titles.
You were talking about a car game; how many polys do you think we can expect from a classic racer next-gen? Around 50Mpps, 100Mpps, 200Mpps or more?
I do not see evidence of major features that almost made it in, but were cut out, except maybe the mip-map LOD calculation Hardware.
I disagree - it's going to decrease. If nothing else, it will decrease relative to the number of math ops used, and probably further yet when you have math ops that are comparably fast to using a texture lookup table approximation instead.
Panajev said:
No, it will not explode like that, but it still is going to increase compared to what you do now, especially because I do not see the jump in Math ops per cycle being so massive as to completely eliminate the use of cube-maps, 3D Textures, etc... as look-ups/shortcuts.
APUs won't work on one pixel/vertex at a time, it makes no sense to do that. Even with the short 4-cycle instruction latency on the VUs you need to write a pipelined loop with several vertices in flight to get optimal performance, and the APU will only have a deeper pipeline, perhaps much deeper than that.
How slow would it be for the APU to DMA the current context ( Stack and PC ) back to shared DRAM ( we only have 128 KB of LS per APU ) and start processing on a new pixel?
I guess that context switching would not be needed when shading pixels from the same primitive: the problem is, if primitives start descending into the 1-4 pixel range in terms of area, then we have to make sure each APU is processing multiple primitives in parallel.
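As a rough illustration of the "several vertices in flight" point above, here is a hedged C sketch: the same transform loop written naively and then unrolled so that independent vertices overlap, which is what lets a deep ALU pipeline stay busy while earlier results are still in flight. The 4-wide unroll factor and the struct layout are arbitrary assumptions, not anything known about the real hardware; on a VU/APU-style in-order unit this would really be interleaved by hand in assembly.

```c
typedef struct { float x, y, z, w; } Vec4;

/* Naive loop: each iteration's multiply-adds depend on a single vertex,
   so on an in-order core a deep pipeline stalls waiting for each result
   before the next iteration's work can be issued. */
void transform_naive(Vec4 *out, const Vec4 *in, const float m[16], int n) {
    for (int i = 0; i < n; ++i) {
        out[i].x = m[0]*in[i].x + m[4]*in[i].y + m[8]*in[i].z  + m[12]*in[i].w;
        out[i].y = m[1]*in[i].x + m[5]*in[i].y + m[9]*in[i].z  + m[13]*in[i].w;
        out[i].z = m[2]*in[i].x + m[6]*in[i].y + m[10]*in[i].z + m[14]*in[i].w;
        out[i].w = m[3]*in[i].x + m[7]*in[i].y + m[11]*in[i].z + m[15]*in[i].w;
    }
}

/* "Pipelined" version: four independent vertices are kept in flight per
   outer iteration, so their multiply-adds can be interleaved to cover
   the ALU latency with work from the other three vertices. */
void transform_pipelined(Vec4 *out, const Vec4 *in, const float m[16], int n) {
    int i = 0;
    for (; i + 4 <= n; i += 4) {
        for (int k = 0; k < 4; ++k) {            /* unrolled block of 4 */
            const Vec4 v = in[i + k];
            out[i + k].x = m[0]*v.x + m[4]*v.y + m[8]*v.z  + m[12]*v.w;
            out[i + k].y = m[1]*v.x + m[5]*v.y + m[9]*v.z  + m[13]*v.w;
            out[i + k].z = m[2]*v.x + m[6]*v.y + m[10]*v.z + m[14]*v.w;
            out[i + k].w = m[3]*v.x + m[7]*v.y + m[11]*v.z + m[15]*v.w;
        }
    }
    for (; i < n; ++i)                           /* leftover vertices */
        transform_naive(&out[i], &in[i], m, 1);
}
```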
The problem is that some might just as well... overinflating fillrate at the expense of everything else could very well make the machine weaker...
wunderchu said:
"wishful thinking" .... you make it sound as though people want the PlayStation 3 to be less powerful ............
Fafalada said:
I disagree - it's going to decrease. If nothing else, it will decrease relative to the number of math ops used, and probably further yet when you have math ops that are comparably fast to using a texture lookup table approximation instead.
Panajev said:
No, it will not explode like that, but it still is going to increase compared to what you do now, especially because I do not see the jump in Math ops per cycle being so massive as to completely eliminate the use of cube-maps, 3D Textures, etc... as look-ups/shortcuts.
Eg. when a vector normalize no longer costs an arm and a leg, using a bunch of cube-lookups for normalizing just stops making sense.
Actually, on the subject of working on multiple primitives at a time, that kinda brings up the question of how pixel data to process will be submitted to the APUs in the first place. Sounds a little less complex to handle than texture fetches, but it still has its own set of issues I wouldn't be sure about off hand...
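A minimal C sketch of the vector-normalize trade Fafalada mentions above. The "cube-map" here is stubbed out as a small lookup table rather than real texture hardware, and the table resolution is an arbitrary assumption; the point is only to contrast a cheap rsqrt-style math path against a fetch-plus-decode lookup path.

```c
#include <math.h>

typedef struct { float x, y, z; } Vec3;

/* Math path: one dot product, one reciprocal square root, three multiplies.
   Cheap once the ALU has a fast rsqrt-style instruction. */
Vec3 normalize_math(Vec3 v) {
    float inv = 1.0f / sqrtf(v.x*v.x + v.y*v.y + v.z*v.z);
    Vec3 r = { v.x * inv, v.y * inv, v.z * inv };
    return r;
}

/* Lookup path: pick the major axis, index a precomputed "normalization
   cube-map" with the two remaining coordinates, and return the stored
   unit vector. Stubbed with a plain array here (assumed filled elsewhere)
   just to show the shape of the work: face select, divides, fetch, decode. */
#define CUBE_RES 64
static float cube_table[6][CUBE_RES][CUBE_RES][3];

Vec3 normalize_cubemap(Vec3 v) {
    float ax = fabsf(v.x), ay = fabsf(v.y), az = fabsf(v.z);
    int face; float major, s, t;
    if (ax >= ay && ax >= az) { face = v.x > 0 ? 0 : 1; major = ax; s = v.y; t = v.z; }
    else if (ay >= az)        { face = v.y > 0 ? 2 : 3; major = ay; s = v.x; t = v.z; }
    else                      { face = v.z > 0 ? 4 : 5; major = az; s = v.x; t = v.y; }
    int si = (int)((s / major * 0.5f + 0.5f) * (CUBE_RES - 1));
    int ti = (int)((t / major * 0.5f + 0.5f) * (CUBE_RES - 1));
    const float *e = cube_table[face][ti][si];
    Vec3 r = { e[0], e[1], e[2] };
    return r;
}
```

Once the math path is only a handful of cycles, the lookup path's divides, fetch latency and decode stop being a win, which is the direction Fafalada is arguing.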
Megadrive1988 said:so Graphics Synthesizer design was frozen in 1997?
Panajev2001a said:
Fafalada said:
I disagree - it's going to decrease. If nothing else, it will decrease relative to the number of math ops used, and probably further yet when you have math ops that are comparably fast to using a texture lookup table approximation instead.
Panajev said:
No, it will not explode like that, but it still is going to increase compared to what you do now, especially because I do not see the jump in Math ops per cycle being so massive as to completely eliminate the use of cube-maps, 3D Textures, etc... as look-ups/shortcuts.
Eg. when a vector normalize no longer costs an arm and a leg, using a bunch of cube-lookups for normalizing just stops making sense.
You hit very good points Fafalada ( about the ability to avoid Cube-maps for things such as Vector Normalization, etc... ), but while I see the ratio of Texture Fetches vs Math Ops decreasing, I still do not see a decrease in use of Texture Ops from what we do now, instead I still see an increase ( which seems to be matched by a possible increase in Texture Fetch latency ) just one that is not as fast as the increase in Math Ops usage.
That increase unfortunately might hold the pipeline back and force programmers to write pipelined loops with more and more vertices and primitives in flight which, pardon the pun, is not something primitive.
Actually, on the subject of working on multiple primitives at a time, that kinda brings up the question of how pixel data to process will be submitted to the APUs in the first place. Sounds a little less complex to handle than texture fetches, but it still has its own set of issues I wouldn't be sure about off hand...
As food for thought, can you please expand on this point?
System and method for data compression
Abstract
A system and method for compressing video graphics data are provided. The system and method include generating in a graphics pipeline, from video graphics data modeling objects, vertex data corresponding to the objects, rendering the video graphics data to produce a current frame of pixel data and a reference frame of pixel data, and, based upon the vertex data, defining a search area within the reference frame for calculating a motion vector for a block of pixel data within the current frame. The current frame then is compressed using the motion vector. The use of vertex data from the graphics pipeline to define the search area substantially reduces the amount of searching necessary to generate motion vectors and perform data compression....
....BACKGROUND OF THE INVENTION
[0002] The preparation, storage and transmission of video data, and, in particular, video graphics data generated by a computer (for example, video graphics data for a computer game), require extensive computer resources and broadband network connections. These requirements are particularly severe when such data are transmitted in real time among a group of individuals connected over a local area network or a wide area network such as the Internet. Such transmitting occurs, for example, when video games are played over the Internet. Such playing, moreover, is becoming increasingly popular.
[0003] In order to reduce the amount of network capacity and computer resources required for the transmission of video data, various encoding schemes for data compression are employed. These data compression schemes include various versions of the MPEG (Motion Picture Experts Group) encoding standard, for example, MPEG-1, MPEG-2 and MPEG-4, and others. These data compression schemes reduce the amount of image information required for transmitting and reproducing motion picture sequences by eliminating redundant and non-essential information in the sequences.
[0004] For example, the only difference in many cases between two adjacent frames in a motion picture sequence is the slight shifting of certain blocks of pixels. Large blocks of pixels, representing, for example, regions of sky, walls and other stationary objects, often do not change at all between consecutive frames. Compression algorithms such as MPEG exploit this temporal redundancy to reduce the amount of data transmitted or stored for each frame.
[0005] For example, in the MPEG standard, three types of frames are defined, namely, intra frames (I-frames), predicted frames (P-frames) and bi-directionally interpolated frames (B-frames). As illustrated in FIG. 1, I-frames are reference frames for B-frames and P-frames and are only moderately compressed. P-frames are encoded with reference to a previous frame. The previous frame can be either an I-frame or a P-frame. B-frames are encoded with reference to both a previous frame and a future frame. The reference frames for B-frames also can be either an I-frame or a P-frame. B-frames are not used as references.
[0006] In order to encode predicted frames and interpolated frames from reference frames, the MPEG scheme uses various motion estimation algorithms. These motion estimation algorithms include full search algorithms, hierarchical searching algorithms and telescopic algorithms. As illustrated in FIG. 2, under the MPEG standard, each frame typically is divided into blocks of 16 by 16 pixels called a macro block. A macro block of a current frame is encoded using a reference frame by estimating the distance that the macro block moved in the current frame from the block's position in the reference frame. The motion estimation algorithm performs this estimating by comparing each macro block of the current frame to macro blocks within a search area of the reference frame to find the best matching block in the reference frame. For example, for macro block 201 of current frame 207, a comparison is made within search area 203 of reference frame 209 between macro block 201 of the current frame and each macro block 205 of the reference frame to find the best matching block in the reference frame. The position of this best matching macro block within the reference frame then is used to calculate a motion vector for macro block 201 of the current frame. Rather than transmit for current frame 207 all of the video data corresponding to macro block 201, only the motion vector is transmitted for this block. In this way, the video data for the current block are compressed.
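For reference, a minimal C sketch of the brute-force block matching that paragraph [0006] describes: compare the current macro block against every candidate position inside a search window of the reference frame using a sum of absolute differences, and keep the offset of the best match as the motion vector. The frame layout and the search range are assumptions for illustration only.

```c
#include <stdlib.h>
#include <limits.h>

#define MB 16  /* macro block is 16x16 pixels, as in MPEG */

/* Sum of absolute differences between a macro block at (cx,cy) in the
   current frame and one at (rx,ry) in the reference frame. */
static long sad_16x16(const unsigned char *cur, const unsigned char *ref,
                      int stride, int cx, int cy, int rx, int ry) {
    long sad = 0;
    for (int y = 0; y < MB; ++y)
        for (int x = 0; x < MB; ++x)
            sad += labs((long)cur[(cy + y) * stride + cx + x] -
                        (long)ref[(ry + y) * stride + rx + x]);
    return sad;
}

/* Full search over a +/-range window; writes the best motion vector. */
void full_search(const unsigned char *cur, const unsigned char *ref,
                 int width, int height, int cx, int cy, int range,
                 int *mvx, int *mvy) {
    long best = LONG_MAX;
    *mvx = 0; *mvy = 0;
    for (int dy = -range; dy <= range; ++dy) {
        for (int dx = -range; dx <= range; ++dx) {
            int rx = cx + dx, ry = cy + dy;
            if (rx < 0 || ry < 0 || rx + MB > width || ry + MB > height)
                continue;  /* candidate block must lie inside the frame */
            long sad = sad_16x16(cur, ref, width, cx, cy, rx, ry);
            if (sad < best) { best = sad; *mvx = dx; *mvy = dy; }
        }
    }
}
```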
[0007] Executing motion estimation algorithms, however, also requires substantial computer resources. Since each macro block of a current frame must be compared to numerous macro blocks of one or more reference frames, an extensive number of computations is required. For example, the three-step-search algorithm (TSS) (a hierarchical algorithm) evaluates matches at a center location and eight surrounding locations of a search area. The location that produces the smallest difference then becomes the center of the next search, and the search step is halved. This sequence is repeated three times.
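A corresponding sketch of the three-step search from paragraph [0007], reusing sad_16x16() and MB from the full-search sketch above: evaluate the centre and its eight neighbours at the current step size, recentre on the best match, halve the step, and repeat for three step sizes.

```c
/* Three-step search: step sizes 4, 2, 1 cover roughly a +/-7 window.
   Reuses sad_16x16() and MB from the full-search sketch above. */
void three_step_search(const unsigned char *cur, const unsigned char *ref,
                       int width, int height, int cx, int cy,
                       int *mvx, int *mvy) {
    int bx = 0, by = 0;
    for (int step = 4; step >= 1; step /= 2) {
        long best = LONG_MAX;
        int nbx = bx, nby = by;
        for (int dy = -step; dy <= step; dy += step) {
            for (int dx = -step; dx <= step; dx += step) {
                int rx = cx + bx + dx, ry = cy + by + dy;
                if (rx < 0 || ry < 0 || rx + MB > width || ry + MB > height)
                    continue;
                long sad = sad_16x16(cur, ref, width, cx, cy, rx, ry);
                if (sad < best) { best = sad; nbx = bx + dx; nby = by + dy; }
            }
        }
        bx = nbx; by = nby;   /* recentre on the best match so far */
    }
    *mvx = bx; *mvy = by;
}
```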
[0008] A need exists, therefore, for a more efficient and effective method for compressing video graphics data, particularly in view of the increasing demand for systems capable of playing video games in real time over the Internet and other networks.
SUMMARY OF THE INVENTION
[0009] Data compression encoders, such as MPEG encoders, employ the same method for compressing video data regardless of the source of the video data. Video data from a live performance recorded by a digital camera and simulated video data generated by a computer, therefore, are compressed in accordance with the same data compression scheme and motion estimation algorithm. When video data are generated by a computer, however, information regarding the nature and movement of objects is known prior to the data's encoding and compression. Unlike present data compression encoders, the present invention takes advantage of this information to reduce the computational steps necessary to perform data compression.....
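The summary stops short of implementation details, but the idea it states can be sketched: because the renderer knows where each vertex projected in both the current and the reference frame, the screen-space displacement of the vertices covering a block gives a strong prediction of that block's motion, so the encoder only has to refine within a small window around the prediction instead of searching blindly. The helper below is a hypothetical illustration of that seeding step, not the patent's actual method; it reuses sad_16x16() and MB from the earlier sketches, and the ProjectedVertex bookkeeping and the +/-2 refinement window are assumptions.

```c
#include <limits.h>

/* Screen-space positions of one vertex in the current and reference frames,
   assumed to be recorded by the graphics pipeline (hypothetical structure). */
typedef struct { float x_cur, y_cur, x_ref, y_ref; } ProjectedVertex;

void seeded_search(const unsigned char *cur, const unsigned char *ref,
                   int width, int height, int cx, int cy,
                   const ProjectedVertex *verts, int nverts,
                   int *mvx, int *mvy) {
    /* 1. Predict the block's motion from the vertices covering it:
          where, on average, did this piece of the image come from? */
    float sx = 0.0f, sy = 0.0f;
    for (int i = 0; i < nverts; ++i) {
        sx += verts[i].x_ref - verts[i].x_cur;
        sy += verts[i].y_ref - verts[i].y_cur;
    }
    int px = nverts ? (int)(sx / (float)nverts) : 0;
    int py = nverts ? (int)(sy / (float)nverts) : 0;

    /* 2. Refine with a small +/-2 search around the predicted offset
          instead of a blind full search over the whole window. */
    long best = LONG_MAX;
    *mvx = px; *mvy = py;
    for (int dy = -2; dy <= 2; ++dy) {
        for (int dx = -2; dx <= 2; ++dx) {
            int rx = cx + px + dx, ry = cy + py + dy;
            if (rx < 0 || ry < 0 || rx + MB > width || ry + MB > height)
                continue;
            long sad = sad_16x16(cur, ref, width, cx, cy, rx, ry);
            if (sad < best) { best = sad; *mvx = px + dx; *mvy = py + dy; }
        }
    }
}
```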
Tahir said:Yea but how much RAM does it have?