# AMD announces new GPGPU card, hints at RV670 specs

Discussion in 'GPGPU Technology & Programming' started by Dave Baumann, Nov 8, 2007.

1. ### Jawed Legend

Joined:
Oct 2, 2004
Messages:
10,873
767
Location:
London
Something different

Any chance that modifying/widening the DP4 paths will provide the requisite stages?

Jawed

2. ### Farhan Newcomer

Joined:
May 19, 2005
Messages:
152
13
Location:
Regardless of whether it's a pipelined add, you can't do that shifting thing for the MUL because that would be incorrect. The alignment for p1 and p2 is always fixed (they are not 2 completely independent FP numbers, think of them as having a shared exponent). The addition is always between the top 54 bits of p1 and the bottom 54 bits of p2, with the carry propagation having to go through all the way to the MSB of p2 (27 bits).

3. ### Jawed Legend

Joined:
Oct 2, 2004
Messages:
10,873
767
Location:
London
I've diagrammed a possible set of exponents:

Code:
```
Blo         27
Alo         27
---
w55
Bhi       53
Alo       27
---
z81
---
Z+W
=====
z82    partial sum 1
=====

Blo       27
Ahi       53
---
y81
Bhi     53
Ahi     53
---
107
---
X+Y
=====
108    partial sum 2
=====

p1       z82
p2    +108
=======
109
=======
```
For the sake of clarity, both A and B have exponent 53. When split into hi and lo parts, the hi parts keep their exponent, 53, while the lo parts are normalised to exponent 27 (though it could be lower for either of them). I've then worked through the multiplications and additions, calculating the maximum value of each of the resulting exponents.

Doing this, I think I've understood my mistake. When I said "the count of significant bits in p2 determines how many bits from p1 are used, i.e. 54-p2+27", that was wrong: it should be the difference in exponents, as there are always 54 significant bits in p2.
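To make the hi/lo split concrete, here is a minimal sketch that models the 53-bit significands as plain Python integers, forms the four partial products, and adds the two partial sums with a fixed alignment. The names `SPLIT`, `split`, `partial_sums` and `combine` are illustrative assumptions, and the split point of 27 bits is taken from the diagram; this says nothing about how the actual hardware is wired. Bit counts in this integer model are counted from zero, so they may differ by one or two from the exponent figures in the diagram.

```python
# Sketch only: significands are plain integers, not real FP hardware state.
# SPLIT and all function names are hypothetical, chosen for illustration.

SPLIT = 27                     # width of the "lo" part, as in the diagram
MASK = (1 << SPLIT) - 1


def split(sig):
    """Return the (hi, lo) parts of a 53-bit significand."""
    return sig >> SPLIT, sig & MASK


def partial_sums(a, b):
    """Return (p1, p2) such that p1 + p2 == a * b."""
    a_hi, a_lo = split(a)
    b_hi, b_lo = split(b)
    w = b_lo * a_lo                      # lo * lo
    z = (b_hi * a_lo) << SPLIT           # hi * lo
    y = (b_lo * a_hi) << SPLIT           # lo * hi
    x = (b_hi * a_hi) << (2 * SPLIT)     # hi * hi
    return z + w, x + y                  # partial sum 1, partial sum 2


def combine(p1, p2):
    """Add p1 and p2 with a fixed alignment.

    The low 27 bits of p2 are always zero, so the low 27 bits of p1
    pass straight through and only the upper bits of p1 enter the adder.
    """
    low = p1 & MASK
    high = (p1 >> SPLIT) + (p2 >> SPLIT)
    return (high << SPLIT) | low


a = (1 << 53) - 1   # largest 53-bit significand
b = (1 << 53) - 1
p1, p2 = partial_sums(a, b)
assert combine(p1, p2) == a * b
```

In this model the alignment between p1 and p2 never depends on the operand values, which is the point made earlier in the thread about the two partial sums effectively sharing an exponent.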

---

My suggestion is the addition, p1+p2, is done on the final adder in the pipeline (in lanes X and Y). This adder is required to perform a DADD instruction, so in this case it is also used for p1+p2. Since DADD has to support two 53-bit operands by being a 54-bit adder, the addition of p1+p2, 27 bits + 54 bits requires no extra hardware dedicated to MUL.

So, what I'm thinking is that a conventional single precision DP4 needs to perform a final ADD on 4 MULs. So the DP4 instruction requires a 4 operand adder. I'm wondering if this same adder can also support:
• DMUL p1, p2
C comes from A*B+C.

Does DP4 work like that, though?

Jawed

4. ### itaru Newcomer

Joined:
May 27, 2007
Messages:
156
15

The AMD Stream Team is pleased to announce the availability of AMD Stream SDK v1.1-beta!

The AMD Stream Computing website will be updated in the next few days to reflect this new release.

With v1.1-beta comes:

- AMD FireStream 9170 support
- Linux support (RHEL 5.1 and SLES 10 SP1)
- Brook+ integer support
- Brook+ #line number support for easier .br file debugging
- Various bug fixes and runtime enhancements
- Preliminary Microsoft Visual Studio 2008 support

If you have any questions, please do not hesitate to post your question to the forum.

Sincerely,
AMD Stream Team

5. ### wingless Newcomer

Joined:
Aug 5, 2007
Messages:
79
0
Location:
Houston, Texas
Awesome. I hope we see more ATI support in GPGPU before CUDA takes over the market.

6. ### Karoshi Newcomer

Joined:
Aug 31, 2005
Messages:
181
0
Location:
Mars
Wishlist:
- Brook CUDA backend.

A quick search around here didn't find any references to this. I think I read a post suggesting CUDA on CTM or CAL a few days ago. Brook on CUDA seems easier.
Disclaimer: I know CUDA and AMD's Stream SDK only at the executive PDF level.

I see advantages to a Brook port to CUDA.

7. ### itaru Newcomer

Joined:
May 27, 2007
Messages:
156
15
http://www.amd.com/us-en/Corporate/VirtualPressRoom/0,,51_104_543~126593,00.html
AMD Stream Processor First to Break 1 Teraflop Barrier

—Next-generation AMD FireStream™ 9250 processor accelerates scientific
and engineering calculations, efficiently delivering supercomputer performance at
up to eight gigaflops-per-watt —

The AMD FireStream 9250 stream processor includes a second-generation
double-precision floating point hardware implementation delivering
more than 200 gigaflops, building on the capabilities of the earlier
AMD FireStream™ 9170, the industry’s first GP-GPU with double-precision floating point support.
The AMD FireStream 9250’s compact size makes it ideal for small 1U servers
as well as most desktop systems, workstations, and larger servers and
it features 1GB of GDDR3 memory, enabling developers to handle large, complex problems.

AMD is also working closely with world class application and solution providers
to ensure customers can achieve optimum performance results.
Stream computing application and solution providers include CAPS entreprise,
Mercury Computer Systems, RapidMind, RogueWave and VizExperts.
Mercury Computer Systems provides high-performance computing systems
and software designed for complex image, sensor, and signal processing applications.
Its algorithm team reports that it has achieved 174 GFLOPS performance for
large 1D complex single-precision floating point FFTs on the AMD FireStream 9250.

8. ### MfA Legend

Joined:
Feb 6, 2002
Messages:
6,833
481
174 GFLOPs is incredibly fast (CUFFT did around 20 on the G80 last I looked).

9. ### Anarchist4000 VeteranRegular

Joined:
May 8, 2004
Messages:
1,439
359
1 TFLOP, <150W power, 1GB GDDR3 in a single slot? This ought to be interesting. I was sort of expecting a 2-slot card with a leaf blower.
