
Dell hard drive failure (intermittent)

Discussion in 'Hardware' started by gophersnake, Mar 22, 2009.

Thread Status:
Not open for further replies.
  1. gophersnake

    gophersnake Thread Starter

    Joined:
    Mar 5, 2006
    Messages:
    153
    The problem first happened about three weeks ago. I wrote about it here. At the time it looked like changing the CMOS battery had fixed it, but now it's back. Here's what I know about it so far:

    -- It's most likely to happen after the computer has been on for several hours, then is shut down briefly and restarted. I'm beginning to think it might be temperature-related because it's most likely to clear up after the computer has sat unused for a few hours.

    -- So far at least, that drive has never shown any signs of trouble during a session -- only at startup. When it fails, it seems to disappear altogether -- the CMOS says it's supposed to be there but nothing is able to find/access it.

    -- At least some of the times this has happened, the drive light has stayed on continuously. I didn't notice what it was doing the other times.

    -- At bootup, the computer goes through its power-on self-test more or less normally though it may linger a little toward the end (where the white progress bar fills the whole box). The POST display disappears as usual. At this point, if it's going to boot normally the Windows flag logo appears. If not, the screen stays black, there are two short beeps, and I get the message:
    Code:
    strike F1 to retry boot, F2 for Setup utility
    .

    -- Once there's a boot failure, if I turn off the power and wait a few minutes I just get the same result. If I boot from a floppy or CD-R, there appears to be no hard drive present but everything else works normally.

    -- I ran the Dell Diagnostics from CD-R while the problem was happening. For the hard drive I got the error message:
    Code:
    Test Errors
    IDE Disk 0 - Confidence Test                               : Fail
      Status: Fail  Status Code: DOS DDG-D DISK 192 066
      Device: IDE_Disk_0  Test: Confidence_Test-Read_Test
      Release: 1073 Module(s): Disk
      Msg: Block 0: Address not found.
    This message was logged to the RESULTS file in "test-one-device" mode. In automatic mode, the Dell Diagnostics would deliver the same message over and over for block after block.

    -- A few hours later the computer booted normally. I ran the Dell Diagnostics again, this time from the C: drive in DOS mode. This time the result for that same drive was:
    Code:
    IDE Disk 0 - Confidence Test                               : Pass
      Status: Pass  Status Code: DOS DDG-D DISK 190 000
      Device: IDE_Disk_0  Test: Confidence_Test
      Release: 1073 Module(s): Disk
    
    End testing: 03/21/2009 23:26:57 - 0 errors
    I've found several references online to similar problems. One poster blamed it on defective hard-drive cables while others suspected the CMOS battery.

    Since the battery is only a few weeks old, I haven't disturbed the cables, and the problem acts as if it might be at least a little heat-sensitive, I'm thinking some chip or other might be cutting out and back in. I'd try freezing spray but I don't know what part(s) of the computer to aim at first. I'd try a new hard drive (and cables) but I'd rather rule out the motherboard first, if possible. Ideas or information, anyone?

    -------------------
    Dell Precision 220, circa 2003
    "Intel 82820 chipset" (whatever that means)
    Pentium III, 733 MHz
    256 MB RAM
    Hard drive described as "20 GB WDC WD20BB-75AUA1" (whatever that means)
    (Should be: Western Digital WD200BB)
    It's probably the original drive installed by Dell. Setup deals with it as "auto".

    Windows 98 SE
    Floppy drive, CD-R, and CD-RW available.
    Windows Startup Diskette, Spinrite 5 diskette, and Dell Diagnostics CD available.
    Freezing spray available.
    Spare CR2032 battery available.
    Spare computer available (but not very desirable).
     
  2. jack-o-bytes

    jack-o-bytes

    Joined:
    Jan 27, 2009
    Messages:
    2,582
    Go to http://www.almico.com/sfdownload.php and click on SpeedFan 4.37 (the link is in small text just below and to the right of the "Download" title). Once the program is installed, open it and wait for it to load. When it has loaded, you will see a list of temperatures on the right; post them on this forum. Also click the S.M.A.R.T. tab at the top of the program and pick the hard drive you want to check from the drop-down list. At the bottom there will be two bars labeled Fitness and Performance. If the bars are full, it isn't your hard drive; if they are low, it is your hard drive.
     
  3. Elvandil

    Elvandil

    Joined:
    Aug 1, 2003
    Messages:
    51,988
    Run some thorough tests on the drive.

    Make sure everything you want to keep is backed up.

    Free Hard Drive Testing Applications:

    Manufacturer's Tests
    Victoria for DOS
    Victoria for Windows (Both versions of Victoria are among the best and most thorough tests available.)
    HD Tune
    CheckDisk 1.03 (Marks bad sectors as unusable.)
    HDAT2 (Diagnostics and bad sector recovery)
    MHDD Low-level Diagnostics
    Bootable Hitachi Drive Fitness Test Floppy or CD Image (works on most drives)

    Hard Drive Manufacturers' Diagnostic Utilities Links:

    TachTech
    BleepingComputer
     
  4. gophersnake

    gophersnake Thread Starter

    Joined:
    Mar 5, 2006
    Messages:
    153
    I'm trying jack-o-bytes's suggestion first.

    (Elvandil, thanks for yours too. I have been keeping good backups. The hard drive, when it comes up at all, seems to perform entirely normally both with real data and with the Dell Diagnostics. Eventually, when I won't need the computer for a while, I'll run Spinrite 5 on it but I don't expect it to turn up anything unusual.)

    I did download SpeedFan. It shows me four temperatures, one of which (with a red flame icon, of course) never seems to change from "127C". Since that equals 7FH, I suspect it's not a real temperature.
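The arithmetic behind that suspicion can be checked in a few lines. This is only a sketch, assuming the sensor value is reported as a signed 8-bit number (the thread doesn't say how the chip reports it); `looks_like_sentinel` is a hypothetical helper name, and the sample readings are illustrative values in the ranges described in this thread.

```python
# Largest value a signed 8-bit register can hold (assumed register width).
INT8_MAX = 2**7 - 1

def looks_like_sentinel(reading_c: int) -> bool:
    """Return True if a reading sits exactly at the signed 8-bit maximum (0x7F)."""
    return reading_c == INT8_MAX

readings = [33, 34, 52, 127]  # illustrative values, not measured data
suspect = [r for r in readings if looks_like_sentinel(r)]
print(hex(127), suspect)  # prints: 0x7f [127]
```

A reading that is pinned at exactly the largest representable value and never moves is more likely an unconnected or unsupported sensor than a real temperature.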

    Of the other three temps, the first two always seem to hover in the 32 to 34 range. The third settles into that range too, but heads rapidly for the 50s when, for example, I print a document or (just to try something) start an AVG virus scan. It settles back down within half a minute or so when the print job finishes or I interrupt the virus scan. Both operations use both the CPU and the hard drive a lot but so far I have no idea which (if either) of those the temp reading is coming from. In time I might come up with some tasks that use one more than the other but I'd have to be able to run them from Windows so I could still monitor the temp(s) with SpeedFan. That lets out Spinrite and the Dell Diagnostics, both of which run only in DOS mode.

    It took me a while to find the "Fitness" and "Performance" bars; the S.M.A.R.T. window was blank and I kept skipping on by till I noticed the box for selecting a hard drive. Both of my ratings are 18 blue squares out of 20. I repeated the reading while AVG was scanning (and the temp readout climbing) but the Fitness and Performance stayed the same.

    I guess I'll look in on those temps occasionally and see if I can catch any of them misbehaving or correlate them with (a.) recent drive activity or (b.) boot failure soon afterwards.

    [One more thing I thought of would be to be sure to check those temps right before a restart, and especially to see if restarts become any more iffy when the temps have been on the high side right before shutting down.]
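One way to keep track of that kind of check is a simple log of the temperature seen just before each shutdown and whether the next boot worked. The sketch below is only an illustration: the records are made-up numbers, not readings from this machine, and `mean_temp_by_outcome` is a hypothetical helper name.

```python
from statistics import mean

# Each record: (highest temperature in C seen just before shutdown,
#               whether the following boot succeeded).
# These values are invented for illustration only.
log = [
    (33, True),
    (34, True),
    (51, False),
    (35, True),
    (48, False),
]

def mean_temp_by_outcome(records):
    """Return (mean temp before successful boots, mean temp before failed boots)."""
    ok = [t for t, booted in records if booted]
    failed = [t for t, booted in records if not booted]
    return (mean(ok) if ok else None, mean(failed) if failed else None)

ok_mean, failed_mean = mean_temp_by_outcome(log)
print(ok_mean, failed_mean)
```

With only a handful of restarts a gap between the two averages proves nothing, but a consistent gap over a couple of weeks would support the heat theory.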
     
  5. gophersnake

    gophersnake Thread Starter

    Joined:
    Mar 5, 2006
    Messages:
    153
    Developments since my last update:

    I downloaded a drive-testing utility from the Western Digital site. I must say I'm not too impressed with either the software itself or their instructions for downloading and installing it. The first time I ran Dlgdiag5.exe, with the C: drive present and working normally, it gave the drive a clean bill of health. Later, when the drive wasn't coming up at reboot, I tried to run it again from the bootable CD I'd made. It crashed with an "Exception 6, Invalid Opcode" error and a register dump before it even had a chance to display its licensing agreement. Admittedly, one difference between the two attempts was that the first time it was running under Windows, the second time under Nero's "Caldera DR DOS". I may try burning another CD where the boot files are from Windows.

    In between those two attempts I ran Spinrite 5 at Level 1 (read every sector, don't try to fix anything) on the entire C: drive. As usual, it didn't turn up anything but did give the drive plenty of exercise. It was right after I took out the Spinrite floppy that the system again couldn't find the C: drive to boot from. As usual, it booted normally after sitting (and cooling down, incidentally or not) for a couple of hours. Of course SpeedFan wasn't accessible in the meantime.

    From everything I've been able to determine, the one temp that fluctuates a lot is most likely the CPU. I'm no closer than before to figuring out what the other two are.

    The only other thing I'm looking at right now is exactly what happens, and in what order, during normal and abnormal startup. I know, for instance, that the drive lights for the CD-R, the floppy and the hard drive come on at least briefly during the POST. Sometimes when the system hasn't been able to boot from C:, I've noticed its drive light staying on for what seems longer than usual, maybe even constantly, but I've never gotten around to noting exactly when that starts or how it differs from normal.

    (Elvandil -- Do you see any point in testing that drive any further with any of the stuff you recommended? When it's been working it's passed everything I've thrown at it. What I'm trying to figure out is what happens during those times when nothing can even find it.)
     
  6. Elvandil

    Elvandil

    Joined:
    Aug 1, 2003
    Messages:
    51,988
    The SpinRite test didn't really tell you anything except that some data could be read from each sector. It didn't determine whether that data was corrupted or whether the sector could be written correctly. You really need a read-write-read test to be sure. But also be sure it is a non-destructive one. You should also check the drive's SMART data to see if the drive itself has logged any errors.
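For anyone unsure what a non-destructive read-write-read test involves, here is a rough sketch of the idea. It runs against an in-memory buffer standing in for a drive and is not how any particular tool implements it; `nondestructive_rw_test` and the 512-byte block size are assumptions made for the example. A real tool also has to bypass operating-system and drive caches, which this sketch does not attempt.

```python
import io

def nondestructive_rw_test(dev, block_size=512):
    """Read-write-read each block of a seekable binary stream, restoring the data.

    Returns a list of block indexes whose test pattern did not read back correctly.
    """
    bad_blocks = []
    dev.seek(0, io.SEEK_END)
    total = dev.tell()
    for index in range(total // block_size):
        offset = index * block_size
        dev.seek(offset)
        original = dev.read(block_size)              # 1. read and keep the real data
        pattern = bytes(b ^ 0xFF for b in original)  # inverted bits as the test pattern
        try:
            dev.seek(offset)
            dev.write(pattern)                       # 2. write the pattern
            dev.seek(offset)
            if dev.read(block_size) != pattern:      # 3. read it back and compare
                bad_blocks.append(index)
        finally:
            dev.seek(offset)
            dev.write(original)                      # always put the original data back
    return bad_blocks

# Demonstration on an in-memory buffer (2048 bytes = 4 blocks), not a real drive.
fake_drive = io.BytesIO(bytes(range(256)) * 8)
before = fake_drive.getvalue()
print(nondestructive_rw_test(fake_drive), fake_drive.getvalue() == before)  # prints: [] True
```

The key point is the restore step: the original contents go back even if the comparison fails, which is what separates this from a destructive write test.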

    The drive light thing is interesting. You might try disconnecting other drives just to see what happens.

    Intermittent problems are a real headache.
     
  7. daniel_b2380

    daniel_b2380

    Joined:
    Jan 31, 2003
    Messages:
    2,392
    one of these times when it isn't booting,
    try checking the bios temps,
    see what the cpu temp is,
    if it is the one doing the most fluctuating,
    .
    maybe even,
    just to clear that possibility,
    remove and clean off the old compound and apply a new dab of arctic silver?
    .
    then leave side cover off,
    to see if cpu fan may be the culprit?
     
  8. gophersnake

    gophersnake Thread Starter

    Joined:
    Mar 5, 2006
    Messages:
    153
    Once the boot failure monster strikes, I won't have access to SpeedFan and the temperatures until everything's had plenty of time to cool down. I've been watching the temps in the meantime, and two of them don't seem to do much -- just hang out in the low to mid 30s. The third one, almost certainly the CPU, fluctuates so wildly that I have my doubts about it. It's sure enough correlated with CPU use but it'll sometimes jump from 38 to 46 (or the other way) in one sampling.

    I guess I'll find out when I swap drives if the problem follows the drive or stays with the motherboard.

    I guess I'll take a look at the fan and heatsink (and dust) situation next time I have the case open.

    What, pray tell, is Arctic Silver? Sounds like some kind of heat sink compound. I may still have some 20-year-old silicone around; if not, I might have to make a pilgrimage to Fry's.

    It's hard to believe that the CPU, hot or cold, has much to do with this problem. Even when the hard drive goes missing, if I boot from floppy or CD, everything I can get to acts pretty normal.
     
  9. daniel_b2380

    daniel_b2380

    Joined:
    Jan 31, 2003
    Messages:
    2,392
    gophersnake,
    arctic silver:
    http://www.arcticsilver.com/
    they devote whole pages to the explanation,
    no sense me trying to compete with that, :eek:

    your bios doesn't have a 'health' page?
    where the voltages, temps are displayed?
    that has been pretty much standard for a LONG time now,
    speedfan is okayyyy, but.......
    the numbers displayed SOMETIMES have to be INTERPRETED,
    [it's sometimes difficult to say WHICH number is for WHAT sensor],
    [as you've already found out],
    the ones you get from your bios are specific to that machine,
    AND,
    in that the bios is NOT dependent upon the harddrive,
    should be accessible,
    .
    if you've moved it,
    it might be as simple as a 'latch' coming loose,
    .
    and get that cover off, so you can see the cpu fan,
    .
    personally, i'd do the arctic silver BEFORE i messed anymore with the harddrive,
    [your machine, your choice though],
     

Short URL to this thread: https://techguy.org/811552