Not patched by October 2020? Your drives could get bricked…
A number of Solid State Drives (SSDs) made by SanDisk suffer from a flaw that can see them wipe out everything stored on them at 40,000 hours of operation (roughly four and a half years of continuous power-on time), with HPE now joining Dell in naming SanDisk owner Western Digital as responsible for the bug, which has seen system administrators scramble to identify and fix affected servers.
Failing to install the firmware fix “will result in drive failure and data loss at 40,000 hours of operation and require restoration of data from backup if there is no fault tolerance, such as RAID, or even in a fault tolerance RAID mode if more SSDs fail than can be supported by the fault tolerance of the RAID mode on the logical drive,” HPE said.
It added: “After the SSD failure occurs, neither the SSD nor the data can be recovered. In addition, SSDs which were put into service at the same time will likely fail nearly simultaneously.” (Many experienced teams build stacks with non-sequential serial numbers and storage products from different vendors, but that’s not always easy…)
HPE guidance for customers published on March 20 said that, based on its analysis of when servers equipped with the SanDisk SSDs started shipping, customers should not suffer issues before October 2020, giving end-users plenty of time to apply the critical patch before their drives get bricked. Other OEMs are likely to be affected.
(Computer Business Review has not yet seen any further customer advisories. If you received one from another server vendor, get in touch with our editor…)
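For a rough sense of how a date like October 2020 falls out of the 40,000-hour figure, the arithmetic can be sketched as below. This is only an illustration: the in-service date in the usage line and the `duty_cycle` parameter are hypothetical, and HPE has not said exactly how it arrived at its own estimate.

```python
from datetime import datetime, timedelta

# Threshold quoted in the HPE and Dell advisories.
FAILURE_THRESHOLD_HOURS = 40_000


def projected_failure_date(in_service: datetime, duty_cycle: float = 1.0) -> datetime:
    """Estimate when a drive reaches the power-on-hours threshold.

    duty_cycle is an assumption for illustration: the fraction of wall-clock
    time the drive is powered on (1.0 means it never powers down).
    """
    if not 0 < duty_cycle <= 1:
        raise ValueError("duty_cycle must be in the range (0, 1]")
    return in_service + timedelta(hours=FAILURE_THRESHOLD_HOURS / duty_cycle)


def threshold_in_years() -> float:
    """Convert the threshold to years of continuous operation."""
    return FAILURE_THRESHOLD_HOURS / (24 * 365.25)


if __name__ == "__main__":
    # Hypothetical in-service date, not taken from any advisory.
    print(projected_failure_date(datetime(2016, 1, 1)))
    print(round(threshold_in_years(), 2))
```

A server that is powered down part of the time reaches the threshold later, which is why the counter that matters is the drive's own power-on hours rather than its purchase date.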
Hey, Western Digital: Thanks for That
Naming Western Digital for the first time today (an earlier statement had just cited a “Solid State Drive manufacturer”), HPE told Computer Business Review in an emailed statement: “HPE was notified by Western Digital of a manufacturer firmware defect in certain SAS SSD models used across the industry.
“Because this defect only causes drive failure after 40,000 hours of operation, no HPE customers are in danger of failing for several months. HPE has received Serial Number information on the drives shipped to HPE customers, and we are actively reaching out to those customers to provide updated firmware.”
Western Digital did not respond to requests for comment. The company earlier told Blocks and Files that, “Per Western Digital corporate policy, we are unable to provide comments regarding other vendors’ products. As this falls within HPE’s portfolio, all related product questions would best be addressed with HPE directly.” (Which, given it is ultimately a Western Digital product flaw, seems un-chivalrous…)
SanDisk SSD Bug: Dell Told Customers in February
Dell, meanwhile, notified its customers in February, emailing them to say that it had “identified a potentially critical issue where certain solid state drives may experience failure and potential data loss due to an issue with the drives’ firmware, the drives may fail after approximately 40,000 hours of usage.”
SanDisk drives ranging from 200GB to 1.6TB are understood to be affected. These can be found in a sprawling range of Dell and HPE servers: both companies have provided users with a full list of affected products.
HPE has made Linux, VMware, and Windows scripts available which perform an SSD drive firmware check for the 40,000 power-on-hours failure issue, as has Dell, which pointed the finger at SanDisk model numbers LT0200MO, LT0400MO, LT0800MO, LT1600MO, LT0200WM, LT0400WM, LT0800WM, LT0800RO and LT1600RO.
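The vendor scripts are the authoritative check, but the underlying logic is simple enough to sketch for anyone triaging an inventory export. The example below is not HPE's or Dell's script: the function name, the substring matching of model strings and the status wording are all assumptions made for illustration, and it deliberately does not check firmware versions because the fixed version numbers are not given here.

```python
# Model numbers named in Dell's advisory, as listed above.
AFFECTED_MODELS = (
    "LT0200MO", "LT0400MO", "LT0800MO", "LT1600MO",
    "LT0200WM", "LT0400WM", "LT0800WM",
    "LT0800RO", "LT1600RO",
)

# Power-on-hours threshold quoted in the advisories.
FAILURE_THRESHOLD_HOURS = 40_000


def assess_drive(model: str, power_on_hours: int) -> str:
    """Classify a drive from its reported model string and power-on hours.

    Assumption: the reported model string contains one of the listed model
    numbers somewhere in it (for example with a vendor prefix).
    """
    normalized = model.strip().upper()
    if not any(listed in normalized for listed in AFFECTED_MODELS):
        return "not listed"
    remaining = FAILURE_THRESHOLD_HOURS - power_on_hours
    if remaining <= 0:
        return "listed: at or past threshold"
    return f"listed: {remaining} power-on hours until threshold"


if __name__ == "__main__":
    # Hypothetical inventory rows: (model string, power-on hours).
    for model, hours in [("SanDisk LT0800MO", 39_000), ("EXAMPLE123", 12_000)]:
        print(model, "->", assess_drive(model, hours))
```

Power-on hours would normally come from the drive's own health counters as reported by the controller or a SMART tool. A drive that comes back as listed still needs the vendor's firmware update, whatever its remaining hours.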
Attentive systems administrators should have little trouble identifying the affected servers and patching them in a hopefully bug-free fashion, but the issue is frustrating for large OEMs like Dell and HPE, which face having to identify and notify all affected customers, and which will no doubt take the brunt of any criticism.