Humus said:Ah, I see. "The memory is good but short."
If you happen to be in the uni area on Nov 7, feel free to come to the presentation of the examination work I did at ATI.
Hmm, maybe I will. When and where?
demalion said:Why would you want to design your hardware in a non-traditional way that had register performance issues? That particular thing is what I'm saying IHVs have reason to avoid.
The NV3X is non-traditional in the sense that register usage isn't free, like it tends to be on other architectures. Does that in itself mean it's a bad architecture? No.
"Bad" in what sense? It is indeed inefficient by a whole host of metrics.
If nVidia could implement their own compiler to reduce the register usage and bring this thing up to performance levels of the 9800, then it would be a good architecture. However, with DX9 HLSL they can't do this.
It's not because of DX9 HLSL that they couldn't do this; it is because of the hardware. That's my point.
This way the GPU industry will just become more uniform and less innovative, simply because they have to design their hardware to meet certain expectations the compiler has.
Well, the expectation under discussion is not to discard computation opportunities due to a temporary register capability that is severely limited for your designated workload. I don't think that is much of an innovation, and I think there is quite a bit of room for other innovations that remain... like a solution for your architecture that gives you a better return on your transistor budget.
Humus said:I'm not saying register usage limitations are a feature, or in themselves desirable. They're certainly a drawback. But why were they accepted in NV3X? The design probably added something useful somewhere else, or maybe just saved enough transistors to implement another feature or improve performance somewhere else in the pipeline. Now what if it is possible to achieve the same level of performance on such hardware as on more traditional hardware by using a compiler written for it? Would you still think the hardware is flawed?
Bjorn said:Hmm, maybe I will. When and where?
Humus said:Nov 7 at 13.00 in A1514.
Now lads! Remember what they say about meeting someone you've only met in a chat room on the internet: take an adult with you.
DemoCoder said:Is there anything inherently wrong with a stack-oriented pipeline that has no registers what so ever? Should we accept an intermediate instruction set that forces future HW vendors to "target" their architecture towards it (in terms of datatypes, registers, instruction types)?
Well it does have registers... it's just that you can't really access the damned things! It was the same with the transputer. AFAICS stack-based architectures are great for teaching simple compilers and CPUs but in practice they are pretty awful from a performance point of view.
Simon F said:AFAICS stack-based architectures are great for teaching simple compilers and CPUs but in practice they are pretty awful from a performance point of view.
Do not disparage the name of Forth in such a way!
Simon F said:AFAICS stack-based architectures are great for teaching simple compilers and CPUs but in practice they are pretty awful from a performance point of view.
Philip Koopman said:Stacks work pretty well if someone takes the trouble to write a
stack-scheduling compiler. The problem has always been that
register-based machines with whizzy compilers were compared with
idiot-simple stack code -- not a fair comparison. Similarly, most of
the old "stacks are worse than registers/memory" arguments were based
on small code snippets that didn't exploit reuse of on-stack
variables, or didn't use realistic instruction sets that permitted
nondestructive accesses to the top couple stack elements. The stack
machines I worked on a decade ago were optimized for cost/performance,
not just raw performance. (But they did pretty well at raw
performance too.)
I published a paper on a first cut at such an optimizing stack
compiler back in 1994; I've since moved on to other pursuits. See:
http://www.ices.cmu.edu/koopman/stack_compiler/index.html It's
suitable for stack architectures that keep the top 2 or 3 values in
registers, which is commonly the case.
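To make the point about reusing on-stack values concrete, here is a minimal sketch in Python. It is not Koopman's algorithm, just a toy stack machine invented for illustration: the opcode names (PUSH, DUP, ADD, MUL), the `run` helper, and the expression x*x + x are all assumptions. It counts memory loads for "idiot-simple" code versus code that uses a nondestructive copy of the top of stack.

```python
# Toy stack machine (hypothetical; opcodes and helper names are illustrative only).
# PUSH reads a variable from memory; DUP copies the top of stack without a memory access.

def run(program, memory):
    """Execute a list of instruction tuples; return (result, number of memory loads)."""
    stack = []
    loads = 0
    for instr in program:
        op = instr[0]
        if op == "PUSH":
            stack.append(memory[instr[1]])
            loads += 1
        elif op == "DUP":
            stack.append(stack[-1])
        elif op == "ADD":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b = stack.pop()
            a = stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack.pop(), loads

# x*x + x, compiled naively: every use of x is a fresh memory load.
NAIVE = [("PUSH", "x"), ("PUSH", "x"), ("MUL",), ("PUSH", "x"), ("ADD",)]

# The same expression with x loaded once and reused from the stack.
SCHEDULED = [("PUSH", "x"), ("DUP",), ("DUP",), ("MUL",), ("ADD",)]

if __name__ == "__main__":
    memory = {"x": 5}
    print("naive:    ", run(NAIVE, memory))
    print("scheduled:", run(SCHEDULED, memory))
```

Both sequences compute the same value, but the second touches memory once instead of three times. That is the kind of difference that disappears from a comparison if the stack code is generated without any scheduling.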
DemoCoder said:Simon F said:AFAICS stack-based architectures are great for teaching simple compilers and CPUs but in practice they are pretty awful from a performance point of view.
I think the key is, "in practice". I'm not aware of any theoretical proofs that stack architectures are inherently less efficient. I tried looking for papers, but couldn't find any that claimed stacks were less efficient than registers. I did find this from comp.compilers:
I haven't got a theoretical argument, but how about the following hand-waving exercise...
Source:
A := B+C
D := E+F

Register machine:
LD R0, B
LD R1, C
LD R2, E
LD R3, F        ; gives time for B and C to load
ADD R0, R0, R1  ; add B & C
ADD R2, R2, R3  ; add E & F
STO A, R0       ; hopefully B+C has finished executing by now...
STO D, R2

Stack machine:
PUSH B
PUSH C
ADD             ; add B and C... note B and C might not have loaded... let's assume that the hardware is really clever and won't actually stall unless we tried to read the result
PUSH E
PUSH F
ADD             ; same as above
POP D           ; STALL?
POP A
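The hand-waving above can be turned into a rough cycle count. The following Python sketch is a toy model, not a description of any real hardware. It assumes one instruction issues per cycle, that a load takes 3 cycles and an add takes 1 (both numbers are made up), and, as in the comment on the stack ADD, that the hardware only stalls when a store or pop actually needs a value that is not ready yet. The function names and program encodings are hypothetical.

```python
# Toy latency model (all latencies and names are assumptions for illustration).
LOAD_LATENCY = 3  # cycles from issuing a load until the value is usable
ADD_LATENCY = 1   # cycles from operands being ready until the sum is usable

def run_register(program):
    """program: list of (op, dest, sources). Returns (total cycles, stall cycles)."""
    ready = {}  # register name -> cycle at which its value becomes available
    cycle = stalls = 0
    for op, dst, srcs in program:
        operands_ready = max([cycle] + [ready[r] for r in srcs])
        if op == "LD":
            ready[dst] = cycle + LOAD_LATENCY
        elif op == "ADD":
            # "Clever" hardware: issue now, result arrives once the operands have.
            ready[dst] = operands_ready + ADD_LATENCY
        elif op == "STO":
            # Only reading a result can stall the pipeline.
            stalls += operands_ready - cycle
            cycle = operands_ready
        else:
            raise ValueError(f"unknown opcode: {op}")
        cycle += 1
    return cycle, stalls

def run_stack(program):
    """program: list of opcode strings. Returns (total cycles, stall cycles)."""
    stack = []  # each entry is the cycle at which that stack slot becomes available
    cycle = stalls = 0
    for op in program:
        if op == "PUSH":
            stack.append(cycle + LOAD_LATENCY)
        elif op == "ADD":
            b = stack.pop()
            a = stack.pop()
            stack.append(max(cycle, a, b) + ADD_LATENCY)
        elif op == "POP":
            value_ready = max(cycle, stack.pop())
            stalls += value_ready - cycle
            cycle = value_ready
        else:
            raise ValueError(f"unknown opcode: {op}")
        cycle += 1
    return cycle, stalls

REGISTER_CODE = [
    ("LD", "R0", []), ("LD", "R1", []), ("LD", "R2", []), ("LD", "R3", []),
    ("ADD", "R0", ["R0", "R1"]), ("ADD", "R2", ["R2", "R3"]),
    ("STO", None, ["R0"]), ("STO", None, ["R2"]),
]
STACK_CODE = ["PUSH", "PUSH", "ADD", "PUSH", "PUSH", "ADD", "POP", "POP"]

if __name__ == "__main__":
    print("register (cycles, stalls):", run_register(REGISTER_CODE))
    print("stack    (cycles, stalls):", run_stack(STACK_CODE))
```

With these assumed latencies the register version finishes in 8 cycles with no stalls, while the stack version takes 10 cycles with a 2-cycle stall at POP D. The stack order forces the E and F loads to be issued after the first ADD, so the second sum is the one that is needed first and ready last. Different latencies, or a stack compiler that reorders the pushes, would change the numbers.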