Does Cell Have Any Other Advantages Over XCPU Other Than FLOPS?

_xxx_ said:
:???:
You can calculate everything as floating point. Integer 1234 is float 1234.0F, no problem there except maybe for higher memory usage and a few more CPU cycles needed.

if you mean single-precision fp - unfortunately no. sp floats do not have enough mantissa bits to represent correctly every integer value above 2^24 (up to max (unsigned) int32). say, 2^32-1 (i.e. max unsigned int32) cannot be represented precisely in sp fp. same with double-precision floats and integers above 2^53.
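quick sketch of where it breaks down, assuming the usual 32-bit ieee float (values picked just for illustration):
Code:
#include <stdio.h>

int main(void) {
    float f = 16777217.0f;              /* 2^24 + 1: needs 25 mantissa bits */

    /* a single-precision float only has 24 mantissa bits, so the value
       silently rounds to the nearest representable one, which is 2^24 */
    printf("%.1f\n", f);                /* prints 16777216.0 */
    printf("%d\n", f == 16777216.0f);   /* prints 1 - the two compare equal */
    return 0;
}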
 
Uhm, ok, but do you have to count integers with flops?
Like why not make it "flop only"? :p

What stuff uses integer anyway? Physics and ai are flops right?
 
weaksauce said:
Uhm, ok, but do you have to count integers with flops?
Like why not make it "flop only"? :p

i'm not quite getting your question, but if you make an architecture 'flop only' you'd have to restrict yourself to integers not greater than 2^24 (if flops are single-precision).

What stuff uses integer anyway? Physics and ai are flops right?

everything that uses pointers/references, counters/indices, bitfields/booleans. that basically amounts to 99.99% of the software known to mankind.
 
aaaaa00 said:
Actually that's not a good idea, because floating point math results are inexact. For example, it would be stupid to calculate pointer offsets or loop counters using floating point. ;-) Use the right tool for the right job.

To borrow a famous example:

http://docs.sun.com/source/806-3568/ncg_goldberg.html
Code:
#include <stdio.h>

int main() {
    double q;
    q = 3.0/7.0;
    if (q == 3.0/7.0) printf("Equal\n");
    else printf("Not Equal\n");
    return 0;
}

What this code will print actually depends on the CPU and compiler (and sometimes even the compiler settings) you run it on. :devilish: ;)


If your second example fails, your compiler has a problem.
Results are inexact but they are also deterministic. There is one exception I can think of to this on PC: D3D will mess with the rounding mode, and can cause non-deterministic results in some very rare cases. But technically that is a bug, and it wouldn't affect your above example anyway.

Floating point numbers just seem to be massively misunderstood; they are just as deterministic as ints: for the same set of operations you will get the same results.
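To illustrate the rounding mode point, here's a rough sketch using the standard C99 <fenv.h> interface rather than anything D3D-specific (strictly speaking you also need #pragma STDC FENV_ACCESS ON for the compiler to guarantee it honours the mode change; the volatiles are just there to keep the divisions at run time):
Code:
#include <fenv.h>
#include <stdio.h>

int main(void) {
    volatile double a = 1.0, b = 3.0;   /* volatile blocks constant folding */

    fesetround(FE_TONEAREST);           /* the default rounding mode */
    double q1 = a / b;

    fesetround(FE_UPWARD);              /* what a library could silently switch you to */
    double q2 = a / b;

    fesetround(FE_TONEAREST);           /* restore the default */

    /* same source expression, different rounding mode, different bits */
    printf("%s\n", (q1 == q2) ? "Equal" : "Not Equal");   /* typically "Not Equal" */
    return 0;
}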
 
MfA said:
Results are exact ERP ... but you forgot to note the type of the expression "3.0/7.0" :)

*freaking post timeouts, batman!*
ok, this time very tersely:

you may end up with a problem with the original example if the following conditions are all at hand:
1. q does not get spilled to memory and the target architecture is of that ingenious type that carries out fp computations at 'extended' precision *cough..x86..cough*
2. the compiler is in 'floats exact mode' where accuracy-problematic computations don't get precomputed but are carried out as written (so you don't have a precomputed 3/7 at double precision in both places (the assignment and the compare))
3. the compiler is a dick .. actually that one alone can ruin your day
 
darkblu said:
*freaking post timeouts, batman!*
ok, this time very tersely:

you may end up with a problem with the original example if the following conditions are all at hand:
1. q does not get spilled to memory and the target architecture is of that ingenious type that carries out fp computations at 'extended' precision *cough..x86..cough*
2. the compiler is in 'floats exact mode' where accuracy-problematic computations don't get precomputed but are carried out as written (so you don't have a precomputed 3/7 at double precision in both places (the assignment and the compare))
3. the compiler is a dick .. actually that one alone can ruin your day

Wait, what will the incorrect output be?
 
darkblu said:
'Not Equal'

Except, in truth they aren't really equal ;) We just prefer that they be equal.

q is only equal to a double's worth of precision on 3.0/7.0, and in theory, 3.0/7.0 alone is an undefined precision.

Just like 3.1415926535897 is not the same as pi.
 
Bobbler said:
Except, in truth they aren't really equal ;) We just prefer that they be equal.

q is only equal to a double's worth of precision on 3.0/7.0, and in theory, 3.0/7.0 alone is an undefined precision.

Just like 3.1415926535897 is not the same as pi.

this has zero relevance to the problem at hand, as the domain of the latter is the finite precision arithmetic as found in the contemporary fp units (what you're basically saying is that fpu's are of finite precision - hardly news).
 
darkblu said:
'Not Equal'

Lol, I should have looked at the code before I asked that.

Anyhow, won't most compilers either ask you to specify a precision for 3.0/7.0 or implicitly convert to double? Actually, I wouldn't know; I've only programmed in Java, which I believe pretends to be independent of the hardware it's running on.

From what I recall of my Java programming classes, the default integer precision is 32-bit and the default floating point precision is 64-bit, with the letter f after the number being required to define it as a 32-bit floating point variable.
 
darkblu said:
this has zero relevance to the problem at hand, as the domain of the latter is the finite precision arithmetic as found in the contemporary fp units (what you're basically saying is that fpu's are of finite precision - hardly news).

It was merely an explanation of why the problems can happen. No need to get uppity.

It wasn't directed at you, as I'm sure you know, but rather as additional information based on what you said (towards Fox5).
 
Bobbler said:
Except, in truth they aren't really equal ;) We just prefer that they be equal.

q is only equal to a double's worth of precision on 3.0/7.0, and in theory, 3.0/7.0 alone is an undefined precision.

Just like 3.1415926535897 is not the same as pi.

3.0/7.0 is in double precision. The C++ standard is quite explicit about the default fp precision being double.

3.0/7.0 != 3.0f/7.0f, but that's to be expected, and you'll get a nice compiler warning when you do it.
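For example (a small sketch in the same spirit as the earlier snippet):
Code:
#include <stdio.h>

int main(void) {
    float  qf = 3.0f / 7.0f;   /* quotient rounded to 24 mantissa bits */
    double qd = 3.0  / 7.0;    /* quotient rounded to 53 mantissa bits */

    /* qf is promoted to double for the comparison, but the mantissa
       bits it lost in single precision don't come back */
    printf("%s\n", (qf == qd) ? "Equal" : "Not Equal");   /* prints "Not Equal" */
    return 0;
}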
 
Bobbler said:
It was merely an explanation of why the problems can happen. No need to get uppity.

sorry, did not mean to. my point was that the root of the observed problem is not in the fact that fpu's are of finite precision (which is ok) but in a completely different place (which is not ok and can/should be addressed). yes, if fpu's were of infinite precision that'd eliminate the problem, but that'd be by shifting the domain of the problem. as it is now it's not caused by the limitations of the arithmetic precision.
 
The "problems" with FP are really no different than the problems with integers (meaning they aren't inherent to CPUs having issues accurately calculating floating point stuff, as ERP already mentioned). All the problems are precision related. Like me trying to store a number over 4.3 billion in an unsigned Int32, it either won't work or you'd lose some digits (and in FP's case you usually just lose some of the ending digits).
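Rough sketch of both cases (the printed values assume a 32-bit unsigned int and IEEE-754 doubles):
Code:
#include <stdio.h>

int main(void) {
    /* unsigned 32-bit "overflow": the value simply wraps around */
    unsigned int u = 4294967295u;      /* UINT_MAX, roughly 4.3 billion */
    u = u + 1u;                        /* wraps to 0 */

    /* double mantissa "overflow": 2^53 + 1 needs 54 bits, so the
       last digit is silently dropped when the literal is rounded */
    double d = 9007199254740993.0;     /* 2^53 + 1 */

    printf("u = %u\n", u);             /* prints u = 0 */
    printf("d = %.0f\n", d);           /* prints d = 9007199254740992 */
    return 0;
}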

I probably should have made myself more clear in my original post -- when I said "3.0/7.0 alone" I meant outside the realm of programming. I was talking in general terms and trying to give an explanation of why there would even be a problem for whoever was wondering.

A quick question though: what was the point in some CPUs internally using 80bit (or other non 32-64-128) FP when things like C/C++/etc are pretty standard in their use of 32/64bit?
 
Bobbler said:
The "problems" with FP are really no different than the problems with integers (meaning they aren't inherent to CPUs having issues accurately calculating floating point stuff, as ERP already mentioned). All the problems are precision related. Like me trying to store a number over 4.3 billion in an unsigned Int32, it either won't work or you'd lose some digits (and in FP's case you usually just lose some of the ending digits).

I probably should have made myself more clear in my original post -- when I said "3.0/7.0 alone" I meant outside the realm of programming. I was talking in general terms and trying to give an explanation of why there would even be a problem for whoever was wondering.

A quick question though: what was the point in some CPUs internally using 80bit (or other non 32-64-128) FP when things like C/C++/etc are pretty standard in their use of 32/64bit?


The rounding rules are precisely defined and results have to be rounded after every operation. So the internal precision is irrelevant unless you care about all 80 or 128 bits.

If you do a set of operations in a different order, or a different set of operations that are mathematically equivalent, then it is likely the actual results will be different, but this is true of integer math as well when numbers under or overflow.

Using doubles to store ints and rounding after every operation will give you identical results to using ints, unless the int overflows. Failing to round after every computation will introduce some potential variation.
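For instance, just reassociating one sum is enough (a minimal sketch; the small term gets absorbed by the big one in one grouping but not the other):
Code:
#include <stdio.h>

int main(void) {
    double big = 1.0e30;

    /* mathematically identical, but each intermediate result is
       rounded to a double as it is produced */
    double left  = (big - big) + 1.0;   /* 0.0 + 1.0 = 1.0 */
    double right = big + (-big + 1.0);  /* -big + 1.0 rounds back to -big, so the sum is 0.0 */

    printf("left  = %g\n", left);       /* prints 1 */
    printf("right = %g\n", right);      /* prints 0 */
    return 0;
}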
 
ERP said:
The rounding rules are precisely defined and results have to be rounded after every operation. So the internal precision is irrelevant unless you care about all 80 or 128 bits.

Not strictly true. Rounding to single or double precision does not deal with the extra 4 bits in the exponent. Thus, two platforms running the same portable C-code expression could come up with different results. This was one of the reasons why strictfp was introduced into Java (at a huge performance cost): people running cross-platform code noticed unpredictable results when the compiler took advantage of the full 80-bit extended format even when rounding was being used.

Thus, if you don't care about portability across platforms, then what you say is true. But if you want a given sequence of code to produce the same result on two different processor architectures when dealing with doubles, then it's not true. *Unless* the compiler commits performance suicide for you by using a technique to fix the exponent.
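A rough sketch of the kind of code that exposes it - what this prints genuinely depends on whether the compiler keeps the quotient in an 80-bit x87 register or rounds it to a 64-bit double in memory:
Code:
#include <stdio.h>

int main(void) {
    volatile double x = 3.0, y = 7.0;   /* volatile just to block constant folding */

    double q = x / y;                    /* may live in an 80-bit x87 register... */

    /* ...or be spilled to memory and rounded to 64 bits. If the compare
       below recomputes x / y at extended precision, the two can differ. */
    printf("%s\n", (q == x / y) ? "Equal" : "Not Equal");
    return 0;
}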
 
Bobbler said:
All the problems are precision related. Like me trying to store a number over 4.3 billion in an unsigned Int32, it either won't work or you'd lose some digits (and in FP's case you usually just lose some of the ending digits).

ok, let me try to be more clear myself and end any misunderstandings here.

the particular problem we were originally discussing manifests itself through precision but, again, is not caused by the platform's computational precision per se. the same problem could have occurred if the value at hand was mathematically finite and was nicely fitting one scalar type (say, a double) but was too big for another, smaller type (say, a single). the problem stems from the language semantics and the compiler's failure to adhere to those while applying certain optimizations to certain arithmetic code. the 'extended' fp precision i used in the potential scenario was just an arbitrary precision exceeding the datatypes used in the example. yet, given proper language semantics and correct compiler behaviour this problem would not occur, regardless of the effective platform's precision.

I probably should have made myself more clear in my original post -- when I said "3.0/7.0 alone" I meant outside the realm of programming. I was talking in general terms and trying to give an explanation of why there would even be a problem for whoever was wondering.

yes, i understood you from the first time. the issue with your explanation though was that it took us a bit further from the solution than we originally were. i.e. it'd have been useful if people were totally unfamiliar with computers' real-number representation.

A quick question though: what was the point in some CPUs internally using 80bit (or other non 32-64-128) FP when things like C/C++/etc are pretty standard in their use of 32/64bit?

it was considered hip at the time. nobody cared about performance of fp arithmetic (heck, nobody cared about fp per se) so intel decided if they were making something as exotic as an fpu why not go totally overboard with it to collect the audience's ovations. suckers.
 
DemoCoder said:
Not strictly true. Rounding to single or double precision does not deal with the extra 4 bits in the exponent. Thus, two platforms running the same portable C-code expression could come up with different results. This was one of the reasons why strictfp was introduced into Java (at a huge performance cost): people running cross-platform code noticed unpredictable results when the compiler took advantage of the full 80-bit extended format even when rounding was being used.

Thus, if you don't care about portability across platforms, then what you say is true. But if you want a given sequence of code to produce the same result on two different processor architectures when dealing with doubles, then it's not true. *Unless* the compiler commits performance suicide for you by using a technique to fix the exponent.

Demo, in the example at hand it's enough for the compiler to be consistent wrt spilled vs unspilled values and the code would always produce a deterministic result. except on platforms which use an odd-even rounding mode, that is.
 