#1 |
|
Dangerously Mirthful
Join Date: Feb 2002
Location: Winfield, IN USA
Posts: 15,314
|
It's the Spider kit from Tahoe; just got it this week. It's a real 9900 at 2.6 GHz, but it's still a B2.
__________________
Elite Bastards - Adminish “Be polite, be professional, but have a plan to kill everybody you meet.” - General James N. Mattis |
|
|
|
|
|
#2 |
|
Senior Member
Join Date: Jul 2002
Location: BelleVue Sanatorium, Billary, NY. Patient privileges: Internet access
Posts: 2,694
|
Dig, assuming you haven't been brainwashed into peremptorily installing the TLB patch you might not need, how is the software you are running, running?
Last edited by WaltC; 26-Dec-2007 at 23:19. Reason: typo |
|
|
|
|
|
#3 |
|
chaos dunk
Join Date: May 2003
Location: Mountain View, CA
Posts: 3,274
|
"Oh, my machine MIGHT not die with a machine check exception at any time due to a race condition in the L3 cache, so I won't install the TLB patch!" er...
|
|
|
|
|
|
#4 |
|
Dangerously Mirthful
Join Date: Feb 2002
Location: Winfield, IN USA
Posts: 15,314
|
I've actually had the TLB patch disabled since I got it and haven't had a single issue.
|
|
|
|
|
#5 | |
|
chaos dunk
Join Date: May 2003
Location: Mountain View, CA
Posts: 3,274
|
Quote:
From what AMD has said, there's a race condition in L3 when one core sets the dirty bit and another reads from the now-dirty page soon after; this leads to hilarious memory corruption and then generates a machine check error, from which the processor cannot be restarted (i.e., you have to reboot--as far as I can tell you can't just set an interrupt handler to deal with it). Of course, that race condition kind of has to be based on clock speed, as far as I can tell, so I don't know what the deal is there.
I get the feeling if you're running multithreaded apps with both heavy memory accesses and poor cache coherency, you're going to encounter the crash. There's no mention of any virtualization-specific conditions that would cause this anywhere, so it's not going to be virtualization-specific--it's just the right kind of workload to generate the crash.
holy crap are we ever off topic |
|
|
|
|
|
|
#6 |
|
Dangerously Mirthful
Join Date: Feb 2002
Location: Winfield, IN USA
Posts: 15,314
|
And I feel you're being a bit of an epipolar bear again.
I see your point: I shouldn't like my PC because it could lock up at any time... but it hasn't, and I've been f-ing trying, so I'm just starting to think/feel/believe that mebbe when they say it's a rare condition that doesn't come up much, it really doesn't. I'm not saying you're wrong, Tim, just relaying my personal experiences with the actual hardware.
|
|
|
|
|
#7 | |
|
chaos dunk
Join Date: May 2003
Location: Mountain View, CA
Posts: 3,274
|
Quote:
1. Multiple threads (or at least some sort of way for multiple cores to simultaneously access the same page, so multiple threads is probably the easiest way to imagine this).
2. The threads are accessing the same memory segments.
3. Some page (let's call it X) is cached in L3.
4. Core 1 writes to page X, which causes the dirty bit to be set on the page in L3.
5. Sometime very soon after (how soon that is, I have no idea), Core 2 writes to page X.
Now, here's where the TLB erratum hits you--the dirty bit (which exists in the TLB, hence the name) is ignored, you get memory corruption, and then the processor detects that things have gone HORRIBLY WRONG and generates the aforementioned machine check exception. Then the processor stops and you reboot.
My big problem with the idea that the TLB patch can be ignored is that while it's probably pretty stable right now, that kind of workload will be more common in six months. Six months later, it'll be even more common, and so on. Claiming that it's just not necessary for most people may be true-ish right now (you'll probably still find apps that need it, but maybe they'll be rare), but I don't think it will be as true in the future. |
|
|
|
|
|
|
#8 |
|
S K R Y I N G
Join Date: Jul 2005
Posts: 4,815
|
Wait, so the only issue is that it reboots the system? No long-term damage? What the hell is the worry then? Just run the system till you run into the issue, and once you do, enable the fix. Easy...
|
|
|
|
|
|
#9 | |
|
chaos dunk
Join Date: May 2003
Location: Mountain View, CA
Posts: 3,274
|
Quote:
I'm tempted to write an app that would basically make the thing crash if the erratum is correct, try to get some data on when it actually occurs. Wouldn't really be hard... |
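A rough sketch of what such an app might look like, following the workload described earlier in the thread (several cores writing to the same page at once). Everything here is an assumption for illustration: the names `hammer` and `run_stress`, the choice of forked processes sharing one anonymous page, and above all the idea that this access pattern would be enough to hit the erratum. It assumes a POSIX system where the `fork` start method is available. On healthy hardware it just returns the per-worker counters.

```python
import mmap
import multiprocessing as mp
import struct

PAGE = mmap.PAGESIZE          # one page of shared memory
SLOT = struct.Struct("<Q")    # each worker owns one 8-byte counter on that page


def hammer(shared, slot, iterations):
    """Repeatedly write to this worker's own slot on the shared page."""
    offset = slot * SLOT.size
    for i in range(1, iterations + 1):
        SLOT.pack_into(shared, offset, i)


def run_stress(workers=2, iterations=100_000):
    """Hypothetical stress loop: several processes write to the same page."""
    ctx = mp.get_context("fork")      # assumption: POSIX with fork available
    shared = mmap.mmap(-1, PAGE)      # anonymous mapping, shared across fork
    procs = [
        ctx.Process(target=hammer, args=(shared, n, iterations))
        for n in range(workers)
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    if any(p.exitcode != 0 for p in procs):
        raise RuntimeError("a worker exited abnormally")
    # Read back each worker's final counter value.
    counts = [SLOT.unpack_from(shared, n * SLOT.size)[0] for n in range(workers)]
    shared.close()
    return counts
```

Calling `run_stress()` in a loop and logging how long the machine stays up would be one way to collect the "when does it actually occur" data. Each worker writes only to its own slot, so any mismatch in the returned counters would point at the hardware rather than at a bug in the test itself.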
|
|
|
|
|
|
#10 | |
|
S K R Y I N G
Join Date: Jul 2005
Posts: 4,815
|
Quote:
I still haven't seen anyone able to make the issue pop up under normal everyday use, barring purposefully trying to force it. |
|
|
|
|
|
|
#11 | |
|
Dangerously Mirthful
Join Date: Feb 2002
Location: Winfield, IN USA
Posts: 15,314
|
Quote:
I double-dog dare you.
|
|
|
|
|
|
#12 |
|
Darlek ******
Join Date: Jun 2004
Posts: 9,651
|
My god, aren't you easily pleased.
|
|
|
|
|
|
#13 |
|
Retarded moron
|
Where did you get the details about the TLB issue, Tim?
If what you are saying is true, then this issue is quite big. I make heavy use of multiple threads in certain parts of my work that access shared memory. If some parts of that memory are cached in the level 3 cache and multiple threads access it after my main thread has written to it, then I'm pretty much screwed? Thank goodness I didn't buy a Phenom.
__________________
I eat coffee. |
|
|
|
|
|
#14 | |
|
Heteroscedasticitate
Join Date: Mar 2005
Posts: 2,362
|
Quote:
__________________
Donald Knuth: Science is what we understand well enough to explain to a computer. Art is everything else we do. |
|
|
|
|
|
|
#15 |
|
Senior Member
Join Date: Sep 2003
Location: Well within 3d
Posts: 4,242
|
The way I've interpreted the description of the error is that updates to TLB entries are not (edit: completely) atomic.
TLB entries exist in memory, and while the TLB used directly by the core is separate from the L1 cache, TLB entries that are evicted from the TLB can reside as data in either the L2 or L3.
The problem as AMD described it is that there is a window of time when a TLB entry that is present in the L2 needs to be updated due to a memory operation by the core that sits above it with the L1 and TLB (setting the aforementioned accessed and dirty bits), but another memory operation forces the L2 data to be evicted to the L3. What this means is that the L3 gets an old copy of the TLB entry that can then be loaded by another core. As a result, two cores have two different versions of the same TLB entry.
The time window is very small: one core must update a TLB entry that is cached in its L2 at the same time that some other operation evicts the old TLB data in the L2 to the L3. Then, if another core loads up that old data, it, the system, or system data is screwed.
Virtualization goes through a lot of common TLB accesses, which is why it is likely a bigger problem for server and virtualized loads. The problem with testing for this is that it requires a certain combination of events and data accesses that can force an L2 eviction at just the right time.
edit: The erratum as I saw it described is a 2-parter. One part involves evictions to the L3; the other occurs if the same L2 cache line is probed, in which case the core might simply forget to set the accessed and dirty bits and, as an added bonus, may corrupt another completely unrelated cache operation.
Here's a description of the bug and the OS workaround in Linux that avoids most of the performance penalty associated with the BIOS workaround. https://www.x86-64.org/pipermail/dis...er/010259.html
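The eviction window described in this post can be sketched as a toy model. This is only an illustration of the ordering problem, not a description of how the hardware actually works: the `PTE` class, the `simulate` function, and the frame number are all made up for the example, and the caches are reduced to plain variables.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PTE:
    """Toy translation entry with the two status bits discussed above."""
    frame: int
    accessed: bool = False
    dirty: bool = False


def simulate(evict_during_update: bool):
    """Return (core 0's view, core 1's view) of the same entry."""
    l2_line = PTE(frame=0x1234)   # core 0's L2 holds the entry (hypothetical frame)
    l3_line = None                # the shared L3 starts without it

    # Core 0 begins a write to the page: it reads the entry first...
    snapshot = l2_line
    if evict_during_update:
        # ...and in the unlucky window, the line leaves the L2 for the L3
        # before the new bits are merged in, so the L3 gets the old copy.
        l3_line = l2_line

    # Core 0 finishes the update with the accessed and dirty bits set.
    updated = replace(snapshot, accessed=True, dirty=True)
    core0_view = updated

    if not evict_during_update:
        # Normal ordering: the update lands in the L2 first, and a later
        # eviction carries the updated entry to the L3.
        l2_line = updated
        l3_line = l2_line

    # Core 1 loads whatever the L3 holds.
    core1_view = l3_line
    return core0_view, core1_view
```

With `simulate(False)` both cores end up with the same entry. With `simulate(True)` core 1 gets an entry whose dirty bit is still clear, while core 0 believes the page is dirty, which is the "two different versions of the same TLB entry" situation.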
__________________
Dreaming of a .065 micron etch-a-sketch. Last edited by 3dilettante; 27-Dec-2007 at 18:04. Reason: clarifications added |
|
|
|
|
|
#16 |
|
Regular
Join Date: Jun 2003
Posts: 6,177
|
I run a MySQL database as a back end for a news reader. I would not be surprised to see that sort of app running on multiple cores could trigger this kind of problem. Rebooting your machine in the middle of disk writes to your relational database and shafting the in-ram disk and db caches is potentially a recipe for lots of hassle.
I suppose it's a case of how important stability is for you. If you don't mind your machine rebooting once a month while playing games, it's not a big deal. If you trash a database every few hours that then needs repair/rebuilding/restoring, it's too much hassle to live with. I think the biggest problem is that the fix for this kills even more performance off chips that are already under-performing, which is why people are trying to justify ignoring the fix. No one would care if the fix didn't lower Phenom performance even more, and everyone would just install the patch and be done with it. |
|
|
|
|
|
#17 |
|
Mostly Harmless
|
Not everybody is running mission-critical apps on their PC. Certainly people who are should be using the patch. People who aren't, and own the cpu, might reasonably want to see if it bites them on the butt with their specific workload before making that decision.
__________________
"We'll thrash them --absolutely thrash them."--Richard Huddy on Larrabee "Our multi-decade old 3D graphics rendering architecture that's based on a rasterization approach is no longer scalable and suitable for the demands of the future." --Pat Gelsinger, Intel ". . .its taking us longer than we would have liked to get a [Crossfire game] profiling system out there" --Terry Makedon, ATI, July 2006 "Christ, this is Beyond3D; just get rid of any f**ker talking about patterned chihuahuas! Can the dog write GLSL? No. Then it can f**k off." --Da Boss |
|
|
|
|
|
#18 | |
|
Red-headed step child
Join Date: Jun 2004
Location: Guess ;)
Posts: 3,088
|
Quote:
__________________
"...twisting my words" |
|
|
|
|
|
|
#19 |
|
Senior Member
Join Date: Sep 2003
Location: Well within 3d
Posts: 4,242
|
The patch isn't really an option for most people.
The limited release of Phenom before this bug popped up + the limited number of Spider boards that were sold to run Phenom + the need for BIOS updates to run Phenom as a drop-in replacement = not too many people.
Barcelona customers are pretty rare, and a number of the big ones are HPC installations that might use the OS workaround instead.
Going forward, every BIOS is going to have it and there won't be many that will have a way to turn it off. I read that AMD's Overdrive currently doesn't have a switch for it either. There's a nebulous "Turbo" button that apparently does turn it off, but what else does it do?
__________________
Dreaming of a .065 micron etch-a-sketch. Last edited by 3dilettante; 31-Dec-2007 at 19:33. |
|
|
|
|
|
#20 | |
|
Dangerously Mirthful
Join Date: Feb 2002
Location: Winfield, IN USA
Posts: 15,314
|
Quote:
Just gotta click on the little light thingy in the upper-right hand corner. Green is on, yellow middling, red off.
|
|
|
|
|
|
#21 |
|
Senior Member
Join Date: Sep 2003
Location: Well within 3d
Posts: 4,242
|
Perhaps I should restate what I said as "there is no explicit switch to disable the TLB patch".
There is a way to turn the workaround off, but it is not really marked as a switch for that one issue. It is not confirmed that the setting that disables the workaround doesn't alter other settings. This confused state is highlighted by your saying there are 3 settings. The workaround has a total of 2 states: on or off. That means other things may be affected.
__________________
Dreaming of a .065 micron etch-a-sketch. |
|
|
|