darkblu said:
...
doh, i tried to clean up a recurring misconception (i.e. increased aliasing having been associated with bandwidth savings) and you come at me with the above?
Isn't it Chalnoth who said that?
ok, let's get this aspect of the discussion straight: your original reply to Chalnoth's pretty straightforward and logical statement (except for the 'aliasing' lead) didn't say anything at all except that you just disagreed with him.
No.
The line you quoted "didn't say anything except that I disagreed with him"; my reasons, which addressed his assertion about "Every ounce of speed", were stated above that.
on what basis? - you didn't care to explain initially, at least not to a degree where a logical discussion could be led.
Actually, what I think prohibited logical discussion was your reducing my reasons to the status of non-existence by cutting them out of what you quoted. I didn't pick what you quoted and replied to, you did.
actually, you just threw in an accusation of beliefs, which i just reflected back to you, nothing more, nothing less.
Actually, I made an accusation of bias after pointing out that Chalnoth had ignored almost every point I'd made on the issue and asking, specifically, how he summarily dismissed any question of computational performance cost and texture cache utilization.
so if you cut down on the blaming attitude, we could hope for the signal-to-noise ratio of this topic to eventually get better.
darkblu, this is a pot and kettle situation.
likely breaking the caching efficiency in the course, and hampering performance; it's the less detail that bandwidth savings (read 'speed ups') come from.
No, it is from maintaining texture cache efficiency and thereby saving bandwidth and avoiding having latency become visible. Just because reducing LOD is one way to avoid breaking texture cache efficiency doesn't mean reducing LOD and maintaining texture cache efficiency/increasing performance therefore become interchangeable.
i never implied reduced LOD and high cache efficiency were interchangeable. it's a uni-directional relation - i.e. decreasing LOD improves cache efficiency, not vice versa.
Chalnoth was arguing that reducing LOD is the only way to gain performance by the expedient of ignoring any consideration of other ways to improve cache efficiency. If you don't want to imply something, stop defending Chalnoth because he happens to agree with your personal preference in AF, and discuss the points of mine he sidestepped in where you actually started your discussion. You probably want an aid for finding where the problematic and myopic defense of Chalnoth is occurring to prevent reasonable discourse...as an alternative to restating it all here, away from the examples, I'll color code (yes, that was one above...more to follow).
apparently you're smart enough to know that, w/o the need from my side to step into details, so don't make me waste forum bandwidth.
apropos, if we care to introduce latencies in the picture, there are other factors (aside from caching) that affect memory read performance, and therefore the performance of successive operations. like page breaks, for example. so yes, you could decrease LOD, and even w/o any cache you could get higher texturing performance just because you end up accessing fewer texture pages overall.
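The "fewer texture pages" point can be illustrated with a rough sketch. Everything in it is an assumption made for illustration: the 8x8-texel tile stands in for a memory page, the texture is 1024x1024 at mip level 0, and `tiles_touched` is a hypothetical helper, not a model of any real chip. It only counts how many distinct tiles a fixed set of point samples lands in as the mip level coarsens.

```python
def tiles_touched(base_size, mip_level, samples, tile=8):
    """Count distinct tile x tile texel blocks hit by point samples at one mip level.

    The tile is a stand-in for a memory page (an assumption, not a hardware model).
    """
    size = max(1, base_size >> mip_level)  # each mip level halves the resolution
    touched = set()
    for u, v in samples:
        x = min(size - 1, int(u * size))
        y = min(size - 1, int(v * size))
        touched.add((x // tile, y // tile))
    return len(touched)


# 64 x 64 screen pixels spread evenly over the whole texture.
samples = [((i + 0.5) / 64, (j + 0.5) / 64) for i in range(64) for j in range(64)]

# Distinct tiles touched at mip levels 0..4 of a 1024 x 1024 texture.
counts = [tiles_touched(1024, level, samples) for level in range(5)]
print(counts)  # [4096, 4096, 1024, 256, 64]
```

The sample count is the same at every level; only the number of distinct tiles falls once the footprint of neighbouring pixels starts sharing tiles.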
So, if you go back to the initial post you quoted, and my comments above that quote, can you see my point about Chalnoth's assertion regarding the only way to get performance from an AF method that had angle-dependent issues?
Can we now go back to the rest of my points Chalnoth used his assertion to dismiss?
...
For example, for your question regarding detail, doesn't bilinear filtering result in a lower level of detail than point sampling?
nope. per a given LOD bilinear sampling would reconstruct better or equally compared to nearest sampling.
But it would omit detail on angled surfaces in comparison to point sampling for a given LOD. Are you just ignoring angled surfaces? There is less aliasing, which is why I discussed the original comment separately from the basis for your accusation.
care to explain how bilinear sampling would omit details compared to point sampling per a given LOD?
Pardon, I was thinking of bilinear with mip levels versus point sampling without, and how the first could take more transistors to offer less (color) detail.
I was focused on the idea of transistor tradeoffs and constructed a bad example in response to your only mentioning LOD alone in reply to me.
...
what architectural opportunities are introduced by scrapping the whole bloody part and using the silicon left for children's toy electronic watches? what is your basis for your selection? you open up the vga box at home only to find out it's full of nice pinkie toy watches - how's that for an argument?
Well, I don't see how you'd get any benefit for 3D hardware out of your example, call me crazy.
I have this strange idea that saving transistors in one place allows you to spend them somewhere else in a design when you have a limited transistor budget, as opposed to allowing you to make toy watches.
This seems to make one of the examples relevant to the topic under discussion, and the other noise.
What opportunities might exist for an implementation of a sampling determination approximation to spend transistors on bandwidth savings that don't depend on fewer texture samples for gain?
zero opportunities (for fewer texture samples per clock, not for fewer texture samples per se. that's what bandwidth saving is - you put through less data per unit of time.)
Hmm? That ignores the impact of a cache on performance completely, by ignoring bandwidth savings on the throughput going into the cache while at the same time measuring samples retrieved from the cache as bandwidth throughput. This seems to effectively omit performance from the picture AFAIK.
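To make the distinction concrete, here is a minimal sketch that separates samples served from a texture cache from fetches that go out to memory. It assumes a plain LRU cache of 16 blocks and two made-up access orders over the same 64 blocks; `external_fetches` is a hypothetical helper and none of this models real hardware.

```python
from collections import OrderedDict


def external_fetches(block_ids, capacity):
    """Count misses (fetches from external memory) for block accesses under LRU."""
    cache = OrderedDict()
    misses = 0
    for block in block_ids:
        if block in cache:
            cache.move_to_end(block)       # hit: mark as most recently used
        else:
            misses += 1                    # miss: costs external bandwidth
            cache[block] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used block
    return misses


# The same 4096 samples over the same 64 blocks, in two different orders.
coherent = [b for b in range(64) for _ in range(64)]   # finish one block, then the next
scattered = [b for _ in range(64) for b in range(64)]  # sweep all blocks on every pass

coherent_misses = external_fetches(coherent, capacity=16)
scattered_misses = external_fetches(scattered, capacity=16)
print(coherent_misses, scattered_misses)  # 64 4096
```

Both orders deliver the same number of samples, but the traffic going into the cache differs by a factor of 64 in this toy case, which is the gain that does not depend on taking fewer samples.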
absolutely. i don't care what bombastic performance gains a bad approximation would produce if it's bad.
And we enter the Twilight zone where completely performance penalty free 16xAF with angle dependency for FP32 textures would be bad, because "angle dependency" is "bad", and nothing else matters.
All you had to do was bypass my point, which was that your usage of "bandwidth savings" implies you get no bandwidth savings from avoiding texture cache misses. You did so by stating an absolute in disagreement with my point about how you could gain performance without reducing texture samples (i.e., with more of them coming from a successfully managed texture cache).
He says it is impossible to gain performance from angle dependent AF implementation except by reducing LOD or increasing aliasing. I've discussed my commentary on texture cache, computation, and transistor budget several times now. To me, it would have saved time if Chalnoth actually addressed where I already mentioned them in my original post. But perhaps that is just me.
he didn't say it's impossible - he's said it's logical that a significant part of the increased cache performance must have come from the LOD decrease alone, which is bloody logical.
Darkblu, read this with me:
Chalnoth said:
The speed of this crappy anisotropic degree selection algorithm comes only from selecting pixels to be at a lower degree of anisotropy than a more correct algorithm. Every ounce of speed that you get out of it is paid for by a pixel that is displayed with less detail (or more texture aliasing) than you'd get with, say, the GeForce4's anisotropic.
as aniso degree computation would have likely been done in a clock, so the approximation itself saves nothing but transistors.
"Would have likely"? You mean What if?
What if, for example, aniso degree computation was used to predict how the texture cache should be managed?
What if the transistor savings allowed it to be replicated to manage more blend operations in parallel?
What if a trick of organization of the mip level determination/storage allowed an approximation to more quickly pick samples and blend an AF result according to the relation to the screen, but the approximation failed to select usefully at other than 90 and 0 degrees? What if replicating some of the approximation algorithm twice, but with failure at an offset angle, maintained speed and improved image quality?
In such cases there is a benefit besides transistors, that might manifest with varying significance in the situations I initially outlined, and that are done a disservice by being simply dismissed as "crappy" or "stupid" without examination.
Your assertion that the only benefit is in transistor savings is simply "What if the only reason for angle dependency is to turn down AF degree at angles", and I don't think the occurrence of surfaces at the reduced-AF angles in common game scenes supports it. Does performance drop when looking down right angled/45 degree angled hallways, and rise in hallways angled differently with similar screen surface area? That's what Chalnoth proposes in response to what I mention, and you keep defending.
where those AF-saved transistors have eventually ended up is irrelevant - actually they may have not been used at all, for christ's sake. hypothesising that they could have been used to further increase the performance of the already 'naturally boosted' AF is nothing but pure sophistry
No, it is simply having a recollection of AF performance behavior that indicates the tradeoff can be more significant than simply capping AF degree at certain angles and gaining performance from the reduced AF on certain angled surfaces. I welcome you to discuss that topic some time instead of...
- they could have ended up as personal profit in somebody's bank account just as well.
Adding noise.
but at the end of the day we have 16th degree of aniso turning up as 4th degree (or even worse) at certain angles.
Yes, but I thought we agreed that the relationship between LOD and texture cache efficiency was unidirectional in dependency? If we have, why are you defending Chalnoth?
if you can show me an algorithm for Nth degree of anisotropic texture sampling, which algorithm somehow reconstructs data from way lower LODs than would be required in the ideal case then I'm all for it, and all power to you.
Well, that would be one tradeoff relationship, but not the only one, that would follow what I propose.
until then - any degree-dropping-by-the-angle implementations will remain details-wise inferior.
To what, and when?
Not to 4x "ideal AF" at common game angles. Not to no AF at all. And, probably, not in performance/transistor tradeoff relationship to the "ideal" AF.
actually yes, yes and possibly yes.
yes - it's inferior to ideal 4x as you know what you get independently of z-axis angle, so it saves you unpleasant surprises, especially if you're a developer.
Are you playing a game by dropping "at common game angles" so you can disagree with me and restate your opinion, or did you just not finish reading the phrase?
yes - it's inferior to no AF at all because regardless of how loose an approximation it has ended up, i can bet my testicles on it that it has consumed quite a bit of transistors.
Ah, so now using up more transistors is the only determining factor. So "ideal" AF is inferior to no AF? What if ideal uses "quite a bit more transistors" than angle dependent AF...doesn't that make ideal inferior to angle dependent? Maybe it is simply that any degree of angle-dependent AF provides worse image quality than No AF?
Or maybe it is my mistake in thinking there was a point other than saying something contrary?
possibly yes - because r&d money could have been invested in actually producing a better approximation, which could have eventually ended up as the 'ideal' implementation.
Don't let consistency limit you.
What if you decided the "ideal" implementation simply wasn't within your transistor budget with your existing R&D? A chip design has finite limits, right?
Tell me, have I said angle-dependent AF is ideal, or have I said viewing even "4x GF4 AF" as ideal in comparison to any degree of angle-dependent AF depends on picking and choosing what you look at?
it's the good old argument of meeting expectations, demalion.
And it's the good old argument of your own personal expectations not making other considerations irrelevant. I'm not proposing that my evaluation of angle-dependent AF makes your disapproval irrelevant, just that it doesn't make it useful to ignore consideration of what benefits might have resulted depending on details of the implementation beyond the "angle dependency".
a z-angle-dependent aniso implementation fails to meet quite a few people's expectations. we could carry out a poll, if that would show you anything.
For Pete's sake, I'm not disputing the existence of your personal expectations, just some of the things you've gone on to say to validate them. You'll find that I've quoted them and made a relevant reply in conjunction.
What I was doing was pointing out why the mandates Chalnoth proposed to dismiss my points did not make sense, and then you interjected your defense of your expectations to defend Chalnoth's sharing of them, ignoring those points, and going off on a noisy tangent.