Dell hard drive failure (intermittent)



gophersnake
Member with 151 posts.
THREAD STARTER
 
Join Date: Mar 2006
Location: Northern California
Experience: program Assembly Language
22-Mar-2009, 03:41 AM #1
The problem first happened about three weeks ago. I wrote about it here. At the time it looked like changing the CMOS battery had fixed it, but now it's back. Here's what I know about it so far:

-- It's most likely to happen after the computer has been on for several hours, then is shut down briefly and restarted. I'm beginning to think it might be temperature-related because it's most likely to clear up after the computer has sat unused for a few hours.

-- So far at least, that drive has never shown any signs of trouble during a session -- only at startup. When it fails, it seems to disappear altogether -- the CMOS says it's supposed to be there but nothing is able to find/access it.

-- At least some of the times this has happened, the drive light has stayed on continuously. I didn't notice what it was doing the other times.

-- At bootup, the computer goes through its power-on self-test more or less normally though it may linger a little toward the end (where the white progress bar fills the whole box). The POST display disappears as usual. At this point, if it's going to boot normally the Windows flag logo appears. If not, the screen stays black, there are two short beeps, and I get the message:
Code:
strike F1 to retry boot, F2 for Setup utility
.

-- Once there's a boot failure, if I turn off the power and wait a few minutes I just get the same result. If I boot from a floppy or CD-R, there appears to be no hard drive present but everything else works normally.

-- I ran the Dell Diagnostics from CD-R while the problem was happening. For the hard drive I got the error message:
Code:
Test Errors
IDE Disk 0 - Confidence Test                               : Fail
  Status: Fail  Status Code: DOS DDG-D DISK 192 066
  Device: IDE_Disk_0  Test: Confidence_Test-Read_Test
  Release: 1073 Module(s): Disk
  Msg: Block 0: Address not found.
This message was logged to the RESULTS file in "test-one-device" mode. In automatic mode, the Dell Diagnostics would deliver the same message over and over for block after block.

-- A few hours later the computer booted normally. I ran the Dell Diagnostics again, this time from the C: drive in DOS mode. This time the result for that same drive was:
Code:
IDE Disk 0 - Confidence Test                               : Pass
  Status: Pass  Status Code: DOS DDG-D DISK 190 000
  Device: IDE_Disk_0  Test: Confidence_Test
  Release: 1073 Module(s): Disk

End testing: 03/21/2009 23:26:57 - 0 errors
I've found several references online to similar problems. One poster blamed it on defective hard-drive cables while others suspected the CMOS battery.

Since the battery is only a few weeks old, I haven't disturbed the cables, and the problem acts as if it might be at least a little heat-sensitive, I'm thinking some chip or other might be cutting out and back in. I'd try freezing spray but I don't know what part(s) of the computer to aim at first. I'd try a new hard drive (and cables) but I'd rather rule out the motherboard first, if possible. Ideas or information, anyone?

-------------------
Dell Precision 220, circa 2003
"Intel 82820 chipset" (whatever that means)
Pentium III, 733 MHz
256 MB RAM
Hard drive described as "20 GB WDC WD20BB-75AUA1" (whatever that means)
(Should be: Western Digital WD200BB)
It's probably the original drive installed by Dell. Setup deals with it as "auto".

Windows 98 SE
Floppy drive, CD-R, and CD-RW available.
Windows Startup Diskette, Spinrite 5 diskette, and Dell Diagnostics CD available.
Freezing spray available.
Spare CR2032 battery available.
Spare computer available (but not very desirable).

Last edited by gophersnake; 23-Mar-2009 at 03:41 AM. Reason: Fixed drive model number (in red)
jack-o-bytes
Member with 2,575 posts.
 
Join Date: Jan 2009
Location: Shopshire, England
Experience: Intermediate
22-Mar-2009, 04:35 AM #2
Go to http://www.almico.com/sfdownload.php and click on the SpeedFan 4.37 link (it is in small print just below and to the right of the "Download" title). Once the program is installed, open it and wait for it to load. You will then see a list of temperatures on the right; post them on this forum later. Also click the S.M.A.R.T. tab at the top of the program and pick the hard drive you want to check from the drop-down list. At the bottom there will be two bars labeled Fitness and Performance. If the bars are full, it isn't your hard drive; if they are low, it is your hard drive.
Elvandil
Member with 51,993 posts.
 
Join Date: Aug 2003
Location: Vermont
22-Mar-2009, 05:31 AM #3
Run some thorough tests on the drive.

Make sure everything you want to keep is backed up.

Free Hard Drive Testing Applications:

Manufacturer's Tests
Victoria for DOS
Victoria for Windows (Both versions of Victoria are among the best and most thorough tests available.)
HD Tune
CheckDisk 1.03 (Marks bad sectors as unusable.)
HDAT2 (Diagnostics and bad sector recovery)
MHDD Low-level Diagnostics
Bootable Hitachi Drive Fitness Test Floppy or CD Image (works on most drives)

Hard Drive Manufacturers' Diagnostic Utilities Links:

TachTech
BleepingComputer
__________________
Microsoft MVP
異驚の界世 ípןɹoʍ ǝɥʇ ɟo sɹǝpuoʍ ǝɥʇ ɟo ǝuo sı ǝpoɔıun ʞuıɥʇ ı
gophersnake
Member with 151 posts.
THREAD STARTER
 
Join Date: Mar 2006
Location: Northern California
Experience: program Assembly Language
22-Mar-2009, 07:00 PM #4
I'm trying jack-o-bytes's suggestion first.

(Elvandil, thanks for yours too. I have been keeping good backups. The hard drive, when it comes up at all, seems to perform entirely normally both with real data and with the Dell Diagnostics. Eventually, when I won't need the computer for a while, I'll run Spinrite 5 on it but I don't expect it to turn up anything unusual.)

I did download SpeedFan. It shows me four temperatures, one of which (with a red flame icon, of course) never seems to change from "127C". Since that equals 7FH, I suspect it's not a real temperature.
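That suspicion is easy to sanity-check: 127 is the largest value a signed 8-bit register can hold, so a sensor slot pinned there is more plausibly an unconnected input or a placeholder than a measurement. A minimal Python sketch, where the function name and the rule "never moves off 127 means placeholder" are assumptions for illustration and not anything SpeedFan documents:

```python
def looks_like_placeholder(readings, sentinel=0x7F):
    """Return True if there is at least one sample and every sample equals 127 (0x7F)."""
    return len(readings) > 0 and all(r == sentinel for r in readings)

# 127 decimal, 7F hex, and the signed 8-bit maximum are the same number.
assert 127 == 0x7F == 2**7 - 1

stuck = [127, 127, 127, 127]   # the reading that never changes
busy = [33, 52, 47, 34]        # made-up samples from a sensor that does move
print(looks_like_placeholder(stuck))  # True
print(looks_like_placeholder(busy))   # False
```

A single 127 among varying readings would not trip the check; only a reading that never moves does.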

Of the other three temps, the first two always seem to hover in the 32 to 34 range. The third settles into that range too, but heads rapidly for the 50s when, for example, I print a document or (just to try something) start an AVG virus scan. It settles back down within half a minute or so when the print job finishes or I interrupt the virus scan. Both operations use both the CPU and the hard drive a lot but so far I have no idea which (if either) of those the temp reading is coming from. In time I might come up with some tasks that use one more than the other but I'd have to be able to run them from Windows so I could still monitor the temp(s) with SpeedFan. That lets out Spinrite and the Dell Diagnostics, both of which run only in DOS mode.

It took me a while to find the "Fitness" and "Performance" bars; the S.M.A.R.T. window was blank and I kept skipping on by till I noticed the box for selecting a hard drive. Both of my ratings are 18 blue squares out of 20. I repeated the reading while AVG was scanning (and the temp readout climbing) but the Fitness and Performance stayed the same.

I guess I'll look in on those temps occasionally and see if I can catch any of them misbehaving or correlate them with (a.) recent drive activity or (b.) boot failure soon afterwards.

[One more thing I thought of would be to be sure to check those temps right before a restart, and especially to see if restarts become any more iffy when the temps have been on the high side right before shutting down.]
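One way to keep track of that idea is a hand-kept log of the hottest reading before each shutdown and whether the next restart worked, then comparing the averages. The sketch below is only an illustration in Python: the log entries are invented placeholder numbers, not measurements from this machine, and the function name is hypothetical.

```python
from statistics import mean

# Invented placeholder log: (hottest reading before shutdown in C, did the restart succeed?)
log = [
    (34, True),
    (51, False),
    (33, True),
    (48, False),
    (36, True),
]

def mean_temp_by_outcome(entries):
    """Average the pre-shutdown temperature separately for good and failed boots."""
    ok = [t for t, booted in entries if booted]
    failed = [t for t, booted in entries if not booted]
    return (mean(ok) if ok else None, mean(failed) if failed else None)

ok_avg, fail_avg = mean_temp_by_outcome(log)
print(f"good boots: {ok_avg} C, failed boots: {fail_avg} C")
```

With only a handful of restarts, a gap between the two averages is a hint at best, but no gap at all would argue against the heat theory.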

Last edited by gophersnake; 22-Mar-2009 at 07:05 PM. Reason: To add afterthought
gophersnake
Member with 151 posts.
THREAD STARTER
 
Join Date: Mar 2006
Location: Northern California
Experience: program Assembly Language
23-Mar-2009, 02:05 AM #5
Developments since my last update:

I downloaded a drive-testing utility from the Western Digital site. I must say I'm not too impressed with either the software itself or their instructions for downloading and installing it. The first time I ran Dlgdiag5.exe, with the C: drive present and working normally, it gave the drive a clean bill of health. Later, when the drive wasn't coming up at reboot, I tried to run it again from the bootable CD I'd made. It crashed with an "Exception 6, Invalid Opcode" error and a register dump before it even had a chance to display its licensing agreement. Admittedly, one difference between the two attempts was that the first time it was running under Windows, the second time under Nero's "Caldera DR DOS". I may try burning another CD where the boot files are from Windows.

In between those two attempts I ran Spinrite 5 at Level 1 (read every sector, don't try to fix anything) on the entire C: drive. As usual, it didn't turn up anything but did give the drive plenty of exercise. It was right after I took out the Spinrite floppy that the system again couldn't find the C: drive to boot from. As usual, it booted normally after sitting (and cooling down, incidentally or not) for a couple of hours. Of course SpeedFan wasn't accessible in the meantime.

From everything I've been able to determine, the one temp that fluctuates a lot is most likely the CPU. I'm no closer than before to figuring out what the other two are.

The only other thing I'm looking at right now is exactly what happens, and in what order, during normal and abnormal startup. I know, for instance, that the drive lights for the CD-R, the floppy and the hard drive come on at least briefly during the POST. Sometimes when the system hasn't been able to boot from C:, I've noticed its drive light staying on for what seems longer than usual, maybe even constantly, but I've never gotten around to noting exactly when that starts or how it differs from normal.

(Elvandil -- Do you see any point in testing that drive any further with any of the stuff you recommended? When it's been working it's passed everything I've thrown at it. What I'm trying to figure out is what happens during those times when nothing can even find it.)
Elvandil
Member with 51,993 posts.
 
Join Date: Aug 2003
Location: Vermont
23-Mar-2009, 02:30 AM #6
The SpinRite test didn't really tell you anything except that some data could be read from each cluster. It didn't determine if that data was corrupted or if the cluster allowed writing correctly. You really need a read-write-read test to be sure. But also be sure it is a non-destructive one. You should also check the drive's SMART data to see if it has logged any self-reported errors.
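To make "non-destructive read-write-read" concrete, here is a rough Python sketch of the idea on a single block: save the contents, write a test pattern, read it back, then restore the original. It runs against an in-memory buffer standing in for a drive (an assumption, for safety), the function name and the 0xAA pattern are illustrative choices, and it is not how SpinRite or any of the listed tools actually work. Real utilities also have to get past the operating system's and the drive's own caches, which this sketch ignores.

```python
import io

def read_write_read(dev, offset, size, pattern=0xAA):
    """Non-destructively test one block: save it, write a pattern,
    read the pattern back, then restore the original data.
    Returns True if both the pattern and the restored data verify."""
    dev.seek(offset)
    original = dev.read(size)
    if len(original) != size:
        return False  # short read: the block is not fully readable

    test_data = bytes([pattern]) * size
    try:
        dev.seek(offset)
        dev.write(test_data)
        dev.seek(offset)
        pattern_ok = dev.read(size) == test_data
    finally:
        # Always put the original contents back, even if the check above raised.
        dev.seek(offset)
        dev.write(original)

    dev.seek(offset)
    restored_ok = dev.read(size) == original
    return pattern_ok and restored_ok

# Demo on an in-memory buffer of four 512-byte blocks.
contents = bytes(range(256)) * 8
fake_drive = io.BytesIO(contents)
print(all(read_write_read(fake_drive, off, 512) for off in range(0, 2048, 512)))
print(fake_drive.getvalue() == contents)  # data is unchanged afterwards
```

The `finally` clause is the part that makes it non-destructive: the saved block goes back even when the pattern check fails.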

The drive light thing is interesting. You might try disconnecting other drives just to see what happens.

Intermittent problems are a real headache.
daniel_b2380
Member with 2,391 posts.
 
Join Date: Jan 2003
Location: mid-atlantic
25-Mar-2009, 12:21 AM #7
one of these times when it isn't booting,
try checking the bios temps,
see what the cpu temp is,
if it is the one doing the most fluctuating,
.
maybe even,
just to clear that possibility,
remove and clean off the old compound and apply a new dab of arctic silver?
.
then leave side cover off,
to see if cpu fan may be the culprit?
gophersnake
Member with 151 posts.
THREAD STARTER
 
Join Date: Mar 2006
Location: Northern California
Experience: program Assembly Language
25-Mar-2009, 03:02 AM #8
Quote:
Originally Posted by daniel_b2380
one of these times when it isn't booting,
try checking the bios temps,
see what the cpu temp is,
if it is the one doing the most fluctuating,
Once the boot failure monster strikes, I won't have access to SpeedFan and the temperatures until everything's had plenty of time to cool down. I've been watching the temps in the meantime, and two of them don't seem to do much -- just hang out in the low to mid 30s. The third one, almost certainly the CPU, fluctuates so wildly that I have my doubts about it. It's sure enough correlated with CPU use but it'll sometimes jump from 38 to 46 (or the other way) in one sampling.
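For anyone who wants to quantify "fluctuates wildly", a small Python sketch like the following flags sample-to-sample jumps in a list of readings copied down by hand. The 5-degree threshold and the sample list are assumptions for illustration; only the 38-to-46 step comes from the readings described above.

```python
def sudden_jumps(samples, threshold=5):
    """Return (index, previous, current) for each consecutive pair
    of samples that differs by more than `threshold` degrees."""
    return [
        (i, prev, cur)
        for i, (prev, cur) in enumerate(zip(samples, samples[1:]), start=1)
        if abs(cur - prev) > threshold
    ]

# Made-up series; the 38 -> 46 step mirrors the jump mentioned in the post.
readings = [34, 38, 46, 45, 38, 33]
print(sudden_jumps(readings))
```

A sensor that regularly moves 8 degrees between adjacent samples is either sitting on a very small thermal mass or being misread, which is one reason to compare it against the BIOS figure when one is available.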

I guess I'll find out when I swap drives if the problem follows the drive or stays with the motherboard.

I guess I'll take a look at the fan and heatsink (and dust) situation next time I have the case open.

What, pray tell, is Arctic Silver? Sounds like some kind of heat sink compound. I may still have some 20-year-old silicone around; if not, I might have to make a pilgrimage to Fry's.

It's hard to believe that the CPU, hot or cold, has much to do with this problem. Even when the hard drive goes missing, if I boot from floppy or CD, everything I can get to acts pretty normal.
daniel_b2380
Member with 2,391 posts.
 
Join Date: Jan 2003
Location: mid-atlantic
25-Mar-2009, 09:35 AM #9
gophersnake,
arctic silver:
http://www.arcticsilver.com/
they devote whole pages to the explanation,
no sense me trying to compete with that,

Quote:
Once the boot failure monster strikes, I won't have access to SpeedFan and the temperatures until everything's had plenty of time to cool down.
your bios doesn't have a 'health' page?
where the voltages, temps are displayed?
that has been pretty much standard for a LONG time now,
speedfan is okayyyy, but.......
the numbers displayed SOMETIMES have to be INTERPRETED,
[it's sometimes difficult to say WHICH number is for WHAT sensor],
[as you've already found out],
the ones you get from your bios are specific to that machine,
AND,
in that the bios is NOT dependent upon the harddrive,
should be accessible,
.
if you've moved it,
it might be as simple as a 'latch' coming loose,
.
and get that cover off, so you can see the cpu fan,
.
personally, i'd do the arctic silver BEFORE i messed anymore with the harddrive,
[your machine, your choice though],