"1-. Interestingly, it seems everything is real. Eventually the CPU core + DSP etc 1.3 Tflops is the only part that has been used until now and it seems the GFX CORE will be taught this E3. The GFX CORE has 2.6 Tflops, I guess that's the performance that occurs through the middleware through final work environments embodied in games right ?, what is the total number of Tflops under gaming environment ?, according to oNE XBox calculations should yield 5.2 Tflops of raw, but there is a variable between 3.2 and 5.2, what the actual number would be within a gaming environment ?, you will use the GFX CORE in Gears of War 4 ? I imagine there will be more games this E3 teach what the machine is capable, right?
2-. Of course, the computational calculation for X1 gives a total of 5.2 Tflops counting the SRA partition. The GFX CORE gives a total of 4 Tflops, plus the CP and DSP. The SRA is the part that was shown at Hot Chips; it is the Main SoC used to resolve the frame. In total, 5.2 Tflops.
The SRA partition is used to process the last block before the video output, where the small render output units (ROPs) live. The ERA is divided into two physical blocks, which is why the eSRAM is also divided. So the ERA has 2 display planes and the SRA has 1 (for the system).
But once most engines use the GFX CORE, some parts of the SRA will remain unused, since for the same cost it will be better to use the ERA.
1-. But can the ERA also be used for the multimedia system, or as if it were a frontend? So, as I understand it, 4 Tflops for the ERA and 1.3 for the SRA, or is it the other way around? I understood that if in the end 5.2 Tflops are accessible to devs, PS NEO has no chance against X1 and obviously is much less able to cope with X1 PRO.
Xbox One: a console prepared to take advantage of the true features of DX12 hardware
2-. Yes, you've got it backwards. The SRA is the system part, the part resolved from the backend to the frontend, the VLIW part that has 1.3 Tflops and was shown at Hot Chips. The ERA partition is the exclusive part where the NextGen resources reside; this is where FL12_1 lives, with 48 CUs that are also scalable in the CP/scheduler. And when we talk about scaling to 48 CUs through three-dimensional circuit integration (3DIC), it runs at a relatively low frequency, 500-600 MHz, and all of that performance is modular. 48 CUs at 426 MHz give 2.6 Tflops.
3DIC Configuration
1-. Okay, now I understand why I had thought the GFX CORE was only 2.6 Tflops instead of 4: pushing the frequency raises the GFX CORE's capacity, and the GFX CORE can feed the Scalar area, another partition.
2-. The GFX CORE is primarily designed and built for the ERA block: 768 SPUs. 1 SPU is 4 ALUs, so in the end the result is 3072 arithmetic logic units, which make up the 48 CUs in total.
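The arithmetic behind these figures can be checked with a short sketch. It uses only the numbers quoted in the interview (768 SPUs, 4 ALUs per SPU, 48 CUs, 426 MHz). The factor of 2 flops per ALU per cycle (one fused multiply-add) is an assumption the interview does not state, and the variable names are purely illustrative.

```python
# Figures quoted in the interview
SPU_COUNT = 768
ALUS_PER_SPU = 4
CU_COUNT = 48
CLOCK_HZ = 426e6  # the 426 MHz figure quoted in the interview

# Assumption (not stated in the interview): each ALU retires one
# fused multiply-add per cycle, counted as 2 floating-point operations.
FLOPS_PER_ALU_PER_CYCLE = 2

total_alus = SPU_COUNT * ALUS_PER_SPU      # 768 * 4
alus_per_cu = total_alus // CU_COUNT       # ALUs in each of the 48 CUs
peak_tflops = total_alus * FLOPS_PER_ALU_PER_CYCLE * CLOCK_HZ / 1e12

print(f"ALUs: {total_alus}, ALUs per CU: {alus_per_cu}, peak: {peak_tflops:.2f} Tflops")
```

Under that assumption, the quoted 2.6 Tflops is consistent with 3072 ALUs at 426 MHz. This only checks that the interview's numbers agree with each other; it does not confirm the hardware itself.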
GFX CORE layout visible in the first XBox ONE chip
The ERA partition consists of 1.3 Tflops in scalar mode plus 2.6 Tflops (48 CUs) in vector mode. So the final actual configuration would look like this:
We have 768 SPUs; each SPU includes 1 scalar unit plus 4 vector ALUs. The whole scalar part serves the ARM part directly, and the CP as well, which is why we say that CP + DSP equals a scalar workload of 1.3 Tflops, while vector mode has a workload of 2.6 Tflops.
In the end we know the reason for the double ALUs
So from the point of view of the old programming paradigm we have a 2.6 Tflops GFX CORE. If we combine scalar mode and vector mode, we have 768 scalar operations that result in 1.3 Tflops raw.
But from the point of view of the new paradigm, combining flops where scalar operations can also count as flops (you know this from Mike Mantor's work ;-)), the X1 can get a yield of 4 Tflops from the GFX CORE: 1.3 Tflops from the scalar part plus 2.6 Tflops from the vectorized part.
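As a quick check of how the interview's totals add up, here is a minimal sketch that sums the quoted figures. All values are the interview's own claims; the variable names are illustrative, and the SRA value is the 1.3 Tflops attributed to the VLIW part earlier in the conversation.

```python
# Figures quoted in the interview, in Tflops (claims, not measurements)
SCALAR_TFLOPS = 1.3  # scalar part of the GFX CORE
VECTOR_TFLOPS = 2.6  # vector part of the GFX CORE (48 CUs)
SRA_TFLOPS = 1.3     # VLIW part attributed to the SRA partition

# Scalar plus vector gives 3.9, which the interview rounds to "4 Tflops".
gfx_core_tflops = round(SCALAR_TFLOPS + VECTOR_TFLOPS, 1)

# Adding the SRA part gives the 5.2 Tflops total mentioned at the start.
total_tflops = round(gfx_core_tflops + SRA_TFLOPS, 1)

print(f"GFX CORE: {gfx_core_tflops} Tflops, total with SRA: {total_tflops} Tflops")
```

So the "4 Tflops" figure is 3.9 rounded up, and the 5.2 total follows only if the SRA's 1.3 Tflops is counted on top of it.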
1-. Okay, now I get it perfectly. So the configuration is that the ERA partition is capable of 1.3 Tflops in scalar mode plus 2.6 Tflops in vector mode (48 CUs), and the SRA part consists of 1.2 Tflops.
2-. Yes, exactly. But when the ERA starts to be used, the slow SRA will not be used for games; it will be used only for the system. It is much better to use the NEXT GEN part because the SRA is based on VLIW. So a developer can use a balance of 2.6 Tflops for the ERA and 1.3 for the SRA, or could put everything in the ERA and use the SRA only for system notifications.
The X1 PRO is just the evolution of this system, where the ERA partition can rise from the current 500-600 MHz to more than 1 GHz.
1-. So can we expect Microsoft to show this part of the GFX CORE at E3 through some of the existing middleware? In the roadmap I reported on, HSA and full integration are announced for 2017-18, and from what I see, X1 PRO will arrive at the end of 2017, just when X1 is fully unleashed. Apparently it is exactly the same architecture as X1 and X1 SLIM, only with the NEXT GEN portion clocked higher.
2-. I suspect that at this E3 they will show us how good Gears of War 4 looks. I suspect that the HDMI update and HDR will be announced, and that would be enough for me; probably also Forza Motorsport 3 with some incredible visuals, to finish unlocking it all in 2017.
Obviously X1 PRO is the same concept as X1 but with a new hardware block, a higher frequency and more CUs.
Because it is pointless to unlock the full capacity of X1 if developers still do not use the new paradigm. Remember, that is why the eSRAM can make everything work much better: we need developers to route data intelligently through it in order to use the ERA. What I mean is that the GFX CORE can deliver much more performance if the data is placed in the eSRAM so that the ERA can use it in a smarter way (streaming data model, DMEs).
eSRAM of CPU, DSPs, Scratch...
The amount of eSRAM in XBox ONE is 2 * 32 MB. The memory that acts as scratch is the emb/SRAM, which is notably slower; the last block of memory is the one specifically linked to the GFX CORE. The move engines are there for a concrete purpose: they are used to move data back and forth between slow and fast memory. That is why the Jaguar has direct access to the slow memory, while the eSRAM dedicated exclusively to the GFX cluster cannot be accessed directly by the CPU, nor does the CPU need to.
What we were shown at Hot Chips was just the Main SoC, with 47 MB in total. They did not mention 3DIC.
1-. So can we expect a relatively large L3 memory configuration for processing in memory (PIM)? From what I see, the Main SoC presented at Hot Chips has 47 MB in total; what would the final configuration be, counting the 3DIC? (What is a PIM?, here)
2-. Add 32 MB of CPU SRAM, plus 10-11 MB more for the old VLIW paradigm; so in total there are the 47 MB of the Main SoC plus 32 MB more. A total of 79 MB.
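The total quoted here is a simple sum, sketched below with illustrative variable names. Both inputs are the interview's own figures: the 47 MB attributed to the Main SoC shown at Hot Chips and the additional 32 MB of SRAM claimed for the stacked configuration.

```python
# Figures quoted in the interview, in MB (claims, not measurements)
MAIN_SOC_MB = 47    # total attributed to the Main SoC shown at Hot Chips
EXTRA_SRAM_MB = 32  # additional SRAM claimed on top of the Main SoC

total_mb = MAIN_SOC_MB + EXTRA_SRAM_MB
print(f"Claimed on-chip memory: {total_mb} MB")
```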
Visible 2D lithography
The memory you see between the Jaguar clusters is, as you know, 32 MB of 1T SRAM, and it is also stacked. Then there is the 6T eSRAM, which is the faster of the two; it sits below and has a configuration for each of the clusters of 24 CUs plus a DSP plus an ARM CPU.
Are visual parts of the ARM stacked?
As I said, the eSRAM under the Jaguar is the slower of the two, which is why it is commonly said that the eSRAM can be accessed from the Jaguar CPU. The fast eSRAM, however, is accessible only by the GFX CORE, which includes, as we said, an ARM CPU that acts as a PC within the same block. The slow eSRAM is physically close to the Jaguar clusters. The fast eSRAM is the one near the GPU; physically it is 3DIC, and you can see the VLIW part.
For The Nonstoppers
1-. So the memory of the GFX CORE is partially unused, and the one that has been used so far is the one near the Jaguar...
2-. When the GFX CORE starts to be used, the eSRAM dedicated to the GFX CORE will start to be used too, but it will probably go the same way as the current eSRAM: there is a learning curve. I think Gears of War will set the standard on XBox ONE, and it goes up from there. Gears of War 4 will be a "technical showcase." And soon developers will begin to use the streaming engine model correctly, which means they will start to use tiled volumes correctly, and that is when we will begin to see incredible things, things we did not think the machine was capable of. And all of it relates to the PC, since soon most graphics cards will be FL12_1.
XBox ONE: ready for SM6 and LLVM, features of feature level 12_1 through DX12
1-. Well, apparently, if this new path opened by Microsoft is real, exciting things await us this year and next. It seems it is time to confirm everything we had investigated about what the convergence of DX12 architectures and the new paradigm means, XBox ONE being among them, which so many have insisted on discrediting. Thank you for answering my questions and guiding the TRUEGAMER community. You're a good friend; I send you a lot of positive energy and a hug.
2-. Thank you for your words, I appreciate the attention. We'll be in touch; my best wishes.
Future-proof. We said so from the beginning."
Original article:
http://www.truegamers.org/viewtopic.php?f=74&t=1665#p85222