NVIDIA shows signs ... [2008 - 2017]

And we already went over it. Newegg is not going to just hold them for fun unless someone else is paying them to. The longer they wait, the cheaper they will need to go to move them. Newegg has been in business for quite some time, so I figure they understand this. Either people who do not know better are still buying them, or someone is paying Newegg for the loss. There is no way they would just hoard them. Of course, it is possible they have 1 in stock and are just keeping it there for window dressing, I suppose.

I assume the same goes for the $288 3870 X2 that they are selling?
 
And we already went over it. Newegg is not going to just hold them for fun unless someone else is paying them to. The longer they wait, the cheaper they will need to go to move them. Newegg has been in business for quite some time, so I figure they understand this. Either people who do not know better are still buying them, or someone is paying Newegg for the loss. There is no way they would just hoard them. Of course, it is possible they have 1 in stock and are just keeping it there for window dressing, I suppose.

That's sorta true and sorta not.

Many B&M stores will not sell a product below their cost, even if it means keeping it on the shelf for over a year. Then you have some that will start discounting a product below cost as soon as 3 months after official EOL.

And that's assuming they saw ZERO sales during those 3 months.

If a store can even manage to sell 1 unit a week they won't reduce the price below cost.

This isn't to say they aren't moving at Newegg, but you can't infer that they are moving either.

It's entirely possible that Newegg has 10-20 units (or however many) and they don't yet feel the need to take it in the pants.

This is the complete opposite of what happens with an official manufacturer reduction in a product's MSRP, where large retailers get compensated in some way to lower the price to the new MSRP. Smaller retailers who don't get compensated, on the other hand, will almost always show the odd effect of older retail products carrying a higher price than their newer stock of the same product.

It's a bit naive to think that Newegg will just decide to price a product under cost when it has been EOL for only a month or less. Especially if at least one person is buying one a week.

Again, is that happening? Who knows, only Newegg does. None of us can say whether it is or isn't.

The only thing we do know is that Nvidia for Q3 was doing relatively worse than ATI/Intel and it's only going to get worse for Q4.

Regards,
SB
 
The only reason a business would keep a product is b/c they can write it off for taxes after they keep it for so long. In your example, SB, they are still selling the product to someone. What justification would a business have to pay for space to store a product they are not planning to sell?

If you sell 10 products for $100 you make $1,000. If you simply keep them, you lose money every day b/c they take up space. If you can sell one a week you are still selling them, and that was rather my point. As long as people are buying them they have no reason to reduce the price, but if people stop, then the price will fall, b/c it is better to sell them for $50 less than to keep them as paperweights. It is possible that Newegg has loads of extra space due to the economic situation and therefore isn't running into that constraint at the moment, but if their business starts to grow they will want those products gone. (BTW, some of the Nvidia high-end cards are limited to 99 purchases per customer, which implies they have more stock than I would want if I were selling cards.)

I assume the same goes for the $288 3870 X2 that they are selling?
You mean the product that was returned to them with physics support :LOL:? It is open box. Eventually they will discount it heavily to move it. (The LOL is at the physics support on the 3800, not at you.)
 
As long as people are buying them they have no reason to reduce the price, but if people stop then the price will fall b/c it is better to sell them for $50 less than keep them for paperweights.

Retail doesn't quite work that way. At some point certain businesses will decide that the cost of shelf/warehouse space overcomes the loss associated with selling a product below cost. That point will vary between businesses, but in my experience in retail it's generally between 3 and 12 months. Occasionally (rarely) shorter and occasionally (rarely) longer.

If they have stock of say 1000 units in a warehouse that's going to be costly and they'll want to liquidate sooner. If they have stock of 10 units in a warehouse, it's not that costly and may not be worth liquidating. Especially if sales trends indicate you'll probably only move single digits weekly.

And even then, if you are going 50 USD under cost, that's going to be 50,000 USD for 1000 units worth of stock. At that point you'll have to determine whether the generally slim margins on internet sales (in the case of Newegg) are worth freeing up 1000 units of space that could be used to sell products weekly. Chances are it'll probably be worth it after a while.

10 units in a warehouse at 50 USD under cost each would only be 500 USD against your profit margin. But at that point, 10 extra units of space to move product on a weekly basis might not be a big deal. Especially if you manage to sell even 1 unit every month or every two months. Thus you'll generally want to wait much longer before discounting below cost.

1 month after official EOL would be too soon to say whether they should be cutting into their profit margin yet or not. Most especially if they don't have much stock and do not expect to get much more if any stock.

Regards,
SB
 
And the entire point of an e-tailer is that warehouse (shelf) space is cheap because you can build a massive warehouse in the middle of nowhere and post your stock to your customers, as opposed to having to 'display' it on retail shelves that are costly because you are paying premiums on the land because it attracts a high number of people in its catchment area.
 
Retail doesn't quite work that way. At some point certain businesses will decide that the cost of shelf/warehouse space overcomes the loss associated with selling a product below cost. That point will vary between businesses, but in my experience in retail it's generally between 3 and 12 months. Occasionally (rarely) shorter and occasionally (rarely) longer.

If they have stock of say 1000 units in a warehouse that's going to be costly and they'll want to liquidate sooner. If they have stock of 10 units in a warehouse, it's not that costly and may not be worth liquidating. Especially if sales trends indicate you'll probably only move single digits weekly.

And even then, if you are going 50 USD under cost, that's going to be 50,000 USD for 1000 units worth of stock. At that point you'll have to determine whether the generally slim margins on internet sales (in the case of Newegg) are worth freeing up 1000 units of space that could be used to sell products weekly. Chances are it'll probably be worth it after a while.

10 units in a warehouse at 50 USD under cost each would only be 500 USD against your profit margin. But at that point, 10 extra units of space to move product on a weekly basis might not be a big deal. Especially if you manage to sell even 1 unit every month or every two months. Thus you'll generally want to wait much longer before discounting below cost.

1 month after official EOL would be too soon to say whether they should be cutting into their profit margin yet or not. Most especially if they don't have much stock and do not expect to get much more if any stock.

Regards,
SB

I am aware of all of that (and agree), but none of these arguments mean that you should just hold onto them hoping that at some point they become collectors' items. These are not baseball cards; they are losing value all the time. If Newegg doesn't move them, they will be worth less with every single week that passes. That is the calculation someone there is (or should be) doing: how many units are they selling per week, and how much value do they lose per week? Losing $5,000 now is better than holding onto them, even if space is cheap, and losing $10,000 next January when they have to cut the price by $100 to sell them.* In other words, it may be better to take a loss now than to wait and take a bigger loss. I am not even suggesting we disagree, SB; I am just saying that a business would rather take a loss of $X per product than a loss of $X+$Y per product (X and Y are positive :devilish:). Theoretically they are trying to get people to pay right up to their willingness to pay for each card, but if the volume is too low they (Newegg) will lose more money. That is what rebates are really for: just a hoop to jump through so that some people will pay more while you still sell in higher volumes.


*All numbers are examples so don't anyone get uptight that $50 price cut isn't enough to make them move.
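The sell-now-versus-hold trade-off described above can be sketched with a few lines of arithmetic. All figures here are made up for illustration, as in the footnote; the two helper functions are hypothetical, not anyone's actual pricing model:

```python
# Hypothetical comparison: clear stock now at a below-cost discount, or hold
# it while it depreciates every week and also costs warehouse space.

def loss_if_sold_now(units, discount_per_unit):
    """Total loss if the whole stock is cleared at a below-cost discount today."""
    return units * discount_per_unit

def loss_if_held(units, weekly_depreciation, weekly_space_cost, weeks):
    """Loss from holding: the cards lose value every week and storage isn't free."""
    depreciation = units * weekly_depreciation * weeks
    storage = weekly_space_cost * weeks
    return depreciation + storage

# Example: 100 cards, $5/week depreciation, $20/week storage, held 10 weeks
hold = loss_if_held(100, 5, 20, 10)   # 100*5*10 + 20*10 = 5200
sell = loss_if_sold_now(100, 30)      # 100*30 = 3000
print(sell < hold)                    # True: a $30/unit cut now beats holding
```

With these numbers the $X-now loss beats the $X+$Y-later loss, which is exactly the point being argued; change the depreciation or storage inputs and the answer flips.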
 
I'm not sure how detailed your simulation is, but constraints like strands having to retain their position when merged (e.g. modulo SIMD width or register file width), or being limited to 8 or 16 threads, could cause some grief.
Finally a weekend...
I did it as described before, modulo the SIMD width and being limited to 4 or 16 threads.

Disappointing, but still surprising, results: they work better for the worst cases, and better the farther apart the threads being joined are. Using Z-order is more important with it than I was imagining, and minimum utilization increased significantly. Could you imagine a case where a SIMD of width 262144 could perform better than width 65536?

[attached chart: mandelbrot2.png]


And the raw data:
Code:
        RowMajor        Zorder          RowMajorJoin4   ZorderJoin4     RowMajorJoin16  ZorderJoin16
2       98,7540%        98,7540%        98,9687%        98,9330%        99,1042%        99,1514%
4       97,0092%        97,4384%        97,3834%        97,7816%        97,7935%        98,2001%
8       94,2091%        95,6798%        94,8427%        96,2264%        95,8732%        96,7593%
16      89,6497%        93,6318%        91,0865%        94,2476%        91,9864%        95,0773%
32      82,9938%        90,6495%        84,9150%        91,5120%        85,6839%        92,7493%
64      74,4234%        87,3385%        75,4607%        88,6953%        82,1762%        90,0713%
128     61,6048%        82,9404%        61,7358%        84,6782%        68,9146%        87,1188%
256     41,6041%        77,6104%        58,8620%        79,8417%        59,0756%        82,4077%
512     24,6509%        72,0214%        41,2583%        73,7824%        41,3965%        75,8474%
1024    21,1462%        65,1978%        21,2066%        67,0696%        21,2993%        69,0363%
2048    20,8100%        57,0875%        21,0265%        60,6157%        21,4175%        64,4733%
4096    20,1047%        49,1887%        20,7442%        51,3486%        21,0862%        53,5220%
8192    19,2807%        41,2887%        20,7194%        43,9344%        20,7194%        46,9527%
16384   18,3604%        32,7539%        19,4470%        32,7539%        19,4470%        38,6817%
32768   17,3629%        23,9855%        17,3629%        23,9855%        19,4547%        34,5705%
65536   15,6383%        17,3451%        15,6383%        17,3451%        19,4464%        31,1793%
131072  13,0412%        11,1637%        13,0412%        15,6261%                
262144   9,7873%         9,7873%        13,0412%        19,5364%                
524288   9,7873%         9,7873%                                
1048576  9,7873%         9,7873%

@all
Ok guys, I know you hate off-topics, and fortunately for you my time is scarce. Getting all this data (and mainly the future data based on the current one, which was fairly simple) proved not to be as simple and fast as I initially thought, so this is the last post with charts and tables in this thread.

@moderators
Could you split this discussion in another thread? It's getting interesting for GPGPU reference.
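For anyone wanting to replicate the idea, here is a minimal clause-counting sketch along the lines described above. This is not the poster's actual program; the grid size, iteration cap, and escape region are made up, and it only models the simple case where every SIMD group runs as many loop iterations as its slowest lane:

```python
# Estimate SIMD utilization for Mandelbrot under branch divergence:
# a group of `width` pixels must execute max(iterations in group) passes,
# so utilization = useful lane-iterations / executed lane-iterations.

def mandel_iters(cx, cy, max_iter=64):
    """Iteration count for one Mandelbrot pixel."""
    x = y = 0.0
    for i in range(max_iter):
        if x * x + y * y > 4.0:
            return i
        x, y = x * x - y * y + cx, 2 * x * y + cy
    return max_iter

def simd_utilization(iters, width):
    """Fraction of lane-iterations doing useful work when the (already
    ordered) pixel list is grouped `width` at a time."""
    useful = executed = 0
    for g in range(0, len(iters), width):
        group = iters[g:g + width]
        useful += sum(group)
        executed += max(group) * len(group)
    return useful / executed

# Row-major ordering over a small 32x32 window of the complex plane
iters = [mandel_iters(-2 + 3 * i / 32, -1.5 + 3 * j / 32)
         for j in range(32) for i in range(32)]
print(simd_utilization(iters, 4))   # narrow SIMD: divergence hurts less
print(simd_utilization(iters, 64))  # wide SIMD: divergence hurts more
```

Reordering `iters` (e.g. into Z-order) before grouping, or merging partially diverged groups, is what produces the other columns in the table above.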
 
Finally a weekend...
I did it as described before, modulo the SIMD width and being limited to 4 or 16 threads.
Ooh, nice work.

Was it done on the basis of recombination for each loop iteration (if divergence has just occurred)? And did you account for stalls while waiting to combine? I imagine those things are pretty tricky.

Disappointing, but still surprising, results: they work better for the worst cases, and better the farther apart the threads being joined are.
Yeah, it really doesn't look like this is enough. Wilson Fung's initial simulations indicated about 20% performance gain based on NVidia's 32-wide threads. I haven't seen any major conceptual advances beyond his initial work.

Using Z-order is more important with it than I was imagining, and minimum utilization increased significantly. Could you imagine a case where a SIMD of width 262144 could perform better than width 65536?
SIMD widths 1024 and up have hit the dimensional limits of the rendering, haven't they?

Have you seen the latest results in the Julia rendering thread:

http://forum.beyond3d.com/showthread.php?t=55344

There are some pretty curious things happening there, all with linear rather than Z ordering. The control flow's deep nesting complicates things substantially, but even so it's interesting to see that linear order can outperform Z-order (i.e. pixel shader). I guess the pixel shader is suffering from the finely-grained scheduling pattern across the set of SIMDs. Apart from driver troubles, I can't think what else it might be - though I can't actually work out the mechanism there. Other simulation work has shown non-obvious performance variations depending on how work is scheduled to SIMDs...

Jawed
 
Was it done on the basis of recombination for each loop iteration (if divergence has just occurred)? And did you account for stalls while waiting to combine? I imagine those things are pretty tricky.
Yes, I did all that. Well, about stalls: I assumed recombination latency was non-existent or completely hidden. I really don't think there is a need for a stall with such a simple recombination algorithm.

Yeah, it really doesn't look like this is enough. Wilson Fung's initial simulations indicated about 20% performance gain based on NVidia's 32-wide threads. I haven't seen any major conceptual advances beyond his initial work.
It's very workload dependent... Which one did he use? I wouldn't give up just because of Mandelbrot's poor results...

SIMD widths 1024 and up have hit the dimensional limits of the rendering, haven't they?
They take more than one line at once; the last line of each took the whole image at once.

Have you seen the latest results in the Julia rendering thread:
I read the comments, but didn't look at the source code to understand what is happening.
 
In case you don't have them:

http://www.microarch.org/micro40/talks/7-3.ppt
http://www.ece.ubc.ca/~aamodt/papers/wwlfung.micro2007.pdf

Yes, I did all that. Well, about stalls: I assumed recombination latency was non-existent or completely hidden. I really don't think there is a need for a stall with such a simple recombination algorithm.
I'm just thinking that while threads are pooled waiting to be combined (e.g. three pooled waiting for a fourth to join them), then the population of available threads is reduced, exposing the ALUs to stalls caused by other latencies (e.g. fetching, or simply clause switching in ATI).

Jawed
 
I took a look on it yesterday, someday will read it carefully.

I'm just thinking that while threads are pooled waiting to be combined (e.g. three pooled waiting for a fourth to join them), then the population of available threads is reduced, exposing the ALUs to stalls caused by other latencies (e.g. fetching, or simply clause switching in ATI).
To make things doable I assumed they always happen inside a group, a bit more restrictive than the full DWF described in the paper; I think some of the ideas there were like it... Whatever, being Mandelbrot, it isn't affected by many of the problems the paper had to care about.

Keep in mind I don't have a cycle-accurate simulator, nor the time to write one yet; still, a simple program that counts executed clauses, written during coffee breaks, is better than "I guess".
 
Indicative of a move in software terms, yes. The hardware was built for graphics. The irony about this is that it's the inefficiencies of the fixed function hardware that make NVidia's design so large. The GPU size, for "equivalent realisable" FLOPs in vector form in NVidia's design might only have been 10% smaller (that's total GPU size, not mm² per FLOP).
Well given how things have developed since G80 I'm not prepared to accept that Nvidia designed a GPU and tacked on CUDA as a convenient afterthought :)

Ok, so GT200 has fewer shaders, inefficient shaders, and inefficient texturing. How does it ever manage to keep pace with the competition?
What arbitrary benchmark?
100% utilization.
Optimising for an architecture is more than just optimising for the ALUs. My beef is with a comparison that starts and ends with the "scalar has perfect utilisation" mantra.
Agreed, but there's nothing wrong with pointing out that a given architecture has inherently better utilization.
Fermi issues purely in-order from each warp. Until we know more about GF100's design we won't know how realisable it was in G80's timeframe. We'll also probably never really know how wasteful G80's approach was. Fermi, Larrabee and R800 all being in-order is a bit of a clue though, don't you think?
Hmmm I must have missed that bit. Where did they describe a change in Fermi with respect to instruction issue vs G80? As far as the whitepaper goes it doesn't preclude having multiple ready instructions from each warp available to the scheduler. Also, what waste are you referring to with respect to G80 instruction issue?
 
Not only motherboards...

iride4u said:
According to an EVGA tech, installing 195.39 may help, but if damage has already been done it won't help. I have had 2 video cards and 1 motherboard trashed by these drivers.
 
Wow, killed the MB and video cards? Thank goodness I don't know anyone with nForce MBs other than me anymore. And mine is a server chipset, so I never need to upgrade the drivers, thank goodness.

All my friends were smart enough to switch to Intel, AMD or ULi chipsets.

Regards,
SB
 
Well, I have 3 nForce MBs myself at the moment :) and all are working a-ok. Of course, I rarely update drivers unless something is wrong. I will have to check what I used on the recent one I put together, though. It might be this driver, and that would be bad.
 