OT: More failures for larger harddrives?

Comments

PeterDuke wrote on 9/30/2009, 8:58 PM
To properly assess the effect of temperature on failure rate you should have a large number of units all CONTROLLED (i.e. set by the experimenter) in terms of the various factors such as temperature, humidity, shock, etc. which might affect reliability. The factors should be allocated to the units at random (not chosen by the experimenter). Analysis of variance will then reveal the factors and interactions which are significant (provided that the sample size is large enough). A non-significant finding does not prove that there is no relationship, because the sample size may not have been large enough. Observing a correlation between two factors such as heat and failure in an uncontrolled experiment does not prove cause and effect; it merely indicates a relationship to be studied further. Note also that correlation and analysis of variance assume a linear relationship, which may not be strictly true, although it might be good enough. Non-linear multiple regression would be better.
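As a rough illustration of the random allocation step only (not the analysis of variance itself), here is a minimal Python sketch. The factor names, levels, drive labels and the `allocate_treatments` helper are all made up for the example; a real experiment would choose its own.

```python
import itertools
import random


def allocate_treatments(unit_ids, factor_levels, seed=None):
    """Randomly assign each unit one combination of factor levels.

    Combinations are repeated evenly before shuffling, so the design
    is balanced whenever the unit count is a multiple of the number
    of combinations.
    """
    rng = random.Random(seed)
    combos = list(itertools.product(*factor_levels.values()))
    repeats = -(-len(unit_ids) // len(combos))  # ceiling division
    pool = (combos * repeats)[:len(unit_ids)]
    rng.shuffle(pool)  # chance, not the experimenter, decides who gets what
    names = list(factor_levels)
    return {unit: dict(zip(names, combo)) for unit, combo in zip(unit_ids, pool)}


# Hypothetical factors and levels, for illustration only.
factor_levels = {"temp_c": [30, 40, 50], "humidity_pct": [20, 60]}
units = [f"drive-{i:02d}" for i in range(12)]
plan = allocate_treatments(units, factor_levels, seed=1)

for unit, treatment in sorted(plan.items()):
    print(unit, treatment)
```

With 12 units and 6 combinations, each combination is used exactly twice; which drive gets which combination depends only on the shuffle.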

I have neither the time nor money to set up such an experiment. After having a few internal disk failures in past years and noting that they were too hot to touch, I mounted some fans near the disks and now they just feel warm. I haven't had any disk failures since (at least five years).
GlennChan wrote on 9/30/2009, 9:22 PM
1- If the drive has a Prolific chipset, the chipset could be causing problems. You will need to flash its firmware.

2- If you start a drive at one temperature and *really* heat it up, then the drive will screw up because the size of the platters changes and the hard drive doesn't figure it out.
I believe rebooting it will make things fine.

If you cool the drive, that won't happen.

3- A faulty power supply can kill your hardware. Replace it at the first sign of failure, otherwise you'll have expensive repairs when it takes out everything else :/
SeaJohn2 wrote on 10/3/2009, 2:15 PM
Subject: RE: OT: More failures for larger harddrives?
Reply by: johnmeyer
Date: 9/29/2009 10:04:38 PM

I just skimmed the Google report on drive reliability and there are so many problems with it that I have to stop myself from writing a stupidly long post that no one will read. However, if I just concentrate on the temperature correlations, most of the unexpected, inverted results (i.e., higher failure rate at lower temperatures) happened at temperatures WAY below what would normally affect longevity. As I stated earlier, you don't get interesting things (higher failure rates) until you get above 140F. That is 60C.

Their temperature graph stops at 50C !!!

==========================================================

There isn't much reason to look at temps above 50°C. I've worked in the hard drive industry for 20 years, and 50°C (55°C for some few models) is the upper end of the drive operational spec; this is for the reason John mentioned - it's a lot harder to maintain reliability above these levels, mainly for mechanical reasons when it comes to disc drives.

Logging mechanisms were implemented about 10 years ago which allow us to keep track of drive environment, and if a customer (a large corporate customer, not a general computer user) complains about failures, the drive log is one of the first things we look at. If a company complains that they have a 50% failure rate with our drives, but the temperature logs in the drives show that they are running them at 60°C, there is nothing we can do; the gist of our response is, "too bad - don't do that."

Obviously, we don't word it like that, and we will work with them to help redesign their cabinet or airflow path, but in the end it is up to xxxxx Computer Co. to design their enclosure to supply an environment within a specified range for our drives (temperature, humidity, vibration and shock).
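
The kind of check described above, comparing logged temperatures against the operational spec, could be sketched roughly as follows in Python. The actual drive log format is not described in this thread, so the list of readings, the `summarize_temperature_log` helper and the sample values are purely hypothetical; only the 50°C limit comes from the post.

```python
def summarize_temperature_log(readings_c, spec_max_c=50.0):
    """Summarize how often logged temperatures exceed the spec limit.

    readings_c: temperatures in Celsius (hypothetical log format).
    spec_max_c: upper operating limit; 50 C is the figure quoted above.
    """
    if not readings_c:
        raise ValueError("no temperature readings supplied")
    over = [t for t in readings_c if t > spec_max_c]
    return {
        "samples": len(readings_c),
        "max_c": max(readings_c),
        "over_spec": len(over),
        "fraction_over_spec": len(over) / len(readings_c),
    }


# Invented readings for illustration; not real drive data.
sample_log = [41.0, 47.5, 52.0, 58.5, 60.0, 49.0, 50.0, 61.5]
summary = summarize_temperature_log(sample_log)
print(summary)
```

A reading exactly at the limit is treated as within spec here; that is a choice made for the sketch, not something stated in the post.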
johnmeyer wrote on 10/3/2009, 3:57 PM
"There isn't much reason to look at temps above 50°C. I've worked in the hard drive industry for 20 years, and 50°C (55°C for some few models) is the upper end of the drive operational spec; this is for the reason John mentioned - it's a lot harder to maintain reliability above these levels, mainly for mechanical reasons when it comes to disc drives."

Well, if that is true, then that pretty much answers the original question about failure rates in external drives if the external enclosure has neither a fan nor a really good heat sink. 50C is only 122F, and I can absolutely guarantee that if you are doing a disk-intensive task that lasts for a few hours (like a copy operation or non-cpu-intensive render that exercises the disk a lot), a disk in a plastic enclosure with no fan is going to get hot enough to smell -- I know this from experience with my first enclosures.

So, based on your experience in the industry, it sounds like extended operation at these temperatures or above is going to result in bad things happening.
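
For anyone checking the Fahrenheit and Celsius figures quoted in this thread (140F, 50C, 55C, 60C), the standard conversion can be written as a small Python sketch. The function names are just illustrative.

```python
def f_to_c(deg_f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (deg_f - 32) * 5 / 9


def c_to_f(deg_c):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return deg_c * 9 / 5 + 32


print(f_to_c(140))  # 60.0
print(c_to_f(50))   # 122.0
print(c_to_f(55))   # 131.0
```

So the 50C spec limit is 122F, the 55C limit for a few models is 131F, and 140F is the 60C mentioned earlier.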
srode wrote on 10/3/2009, 4:40 PM
There are a couple of companies that make backplanes that fit in 2, 3, or 4 of the 5 inch drive bays and have good cooling built into them. I have 2 backplanes taking up 4 of the 5 inch bays in my computer, holding a total of 6 drives. Another 3 drives are installed in the onboard drive cage, which has a good cooling fan blowing on them.

I agree heat is likely what is killing your drives. However, also be aware that there are several different levels of 1TB drives, Black and Enterprise being the best quality with 5 year warranties. I personally only use WD Caviar Blacks and haven't had a single failure (knock on wood). The Caviar Blacks are more expensive, but as with most things you get what you pay for.