Can an email data trail ever fully disappear?
The Congressional investigation into missing IRS emails has raised many questions about the technology being used by the agency, and whether its account of losing politically sensitive data in computer crashes is really plausible.
Are those emails truly lost forever? In this interconnected world, can an email data trail ever fully disappear, or could computer forensics experts possibly reconstruct the missing information?
On the trail of missing emails
The missing data issue first came to light earlier this month, when the IRS admitted it could not find thousands of emails from Lois Lerner, the former head of the IRS's Tax-Exempt Division. The emails had been requested by the House Ways and Means Committee as part of its investigation into claims that the agency unfairly scrutinized conservative groups that were seeking tax-exempt status.
The IRS said the emails, dating from 2009 through April 2011, had been permanently lost when Lerner's computer crashed in 2011. About 24,000 of her emails from that period were retrieved from other IRS employees who were sent copies, but an unknown number sent to people outside the agency remain missing.
When the committee questioned IRS Commissioner John Koskinen on Friday, June 20, about the missing emails, Koskinen blamed the crash on "bad sectors" on Lerner's hard drive. Bad sectors are portions of a hard drive that stop working either because of physical damage to the hardware or a software error.
He said forensic data recovery experts from the IRS's Criminal Investigation Division tried for three weeks to recover her data but were unsuccessful.
He said the hard drive was then demagnetized and sent to a recycler for destruction.
Koskinen added that the IRS continued to try to track down her lost emails by looking for them in places other than the hard drive.
How bad are "bad sectors"?
Lois Lerner wasn't the only IRS official to blame a computer crash for missing data. Six other IRS officials whose records were subpoenaed also said they experienced computer hard drive crashes.
The IRS Commissioner stated that it is an "industry standard" to expect "3 to 5 percent of computers" to crash for a variety of reasons, though that rate rises "after the warranty period to 10 to 16 percent." Rep. Jim Renacci, R-Ohio, noted that given agency statistics cited by Koskinen, the IRS failure rate is about 9 percent.
A crash could be caused either by software or hardware problems. Experts can usually recover data from software-related crashes fairly easily. Even hardware-related crashes do not necessarily eradicate all data.
Tom Hakim of We Recover Data in New York says that even with "bad sectors," the chances of recovering data are very good. "Bad sectors are usually not bad - more likely unstable," he told CBS News.
In Hakim's ten years of experience and roughly 100,000 disk recovery cases, he did find some disks in which the data could not be retrieved: "I'd guess maybe one in 50 cases."
However, in Lerner's case the question is now moot, since the damaged hard drive was recycled in 2011.
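For a rough sense of scale, Hakim's estimate can be turned into numbers. The sketch below is illustrative arithmetic only: it treats his "roughly 100,000" cases and his "one in 50" guess as exact figures, which they are not, and the variable names are made up for this example.

```python
# Illustrative arithmetic only; both inputs are Hakim's rough estimates.
total_cases = 100_000        # "roughly 100,000 disk recovery cases"
unrecoverable_one_in = 50    # "maybe one in 50 cases"

unrecoverable_cases = total_cases // unrecoverable_one_in
unrecoverable_rate = 1 / unrecoverable_one_in
recoverable_rate = 1 - unrecoverable_rate

print(f"Unrecoverable: about {unrecoverable_cases:,} cases ({unrecoverable_rate:.0%})")
print(f"Recoverable: about {recoverable_rate:.0%}")
```

On those assumptions, about 2,000 of his cases would have ended with no data retrieved, and roughly 98 percent with at least some recovery.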
It's hard to kill a hard drive
At Friday's hearing, Rep. Erik Paulsen, R-Minn., asked Koskinen, "Can you rule out that Lois Lerner destroyed her own computer?"
"There is no evidence that she did," Koskinen replied, adding that "the evidence is, she worked very hard to try to restore her emails."
Computer experts say even if you want to kill a hard drive, it is not that easy.
A hard drive usually has a read/write head that flies on top of a rapidly spinning platter. The platter is a thin piece of glass coated with magnetic particles that store the data. As long as the metal particles remain intact on the glass, the data are probably recoverable. "If the platter is scratched, we may still recover some data," notes Hakim. But if the head crashes very hard onto the glass and scrapes off the magnetic particles, or the particles are demagnetized, recovery is almost impossible.
On the other hand, glass and metal particles are pretty resistant to temperature or chemical damage. One Representative on the committee showed a slide of a hard drive burned by fire. He noted that data were "100 percent recoverable."
In 2008, Anthony Verducci, a reporter at Popular Mechanics, "took two laptop drives, ... beat the heck out of them until we heard the signature clicking of mechanical hard-drive failure. Then we submerged one of the drives in custom-made storm-surge floodwaters (salt water, construction debris, oil) and let it soak for four days." He still got nearly 100 percent data recovery.
But in the case of Adam Lanza, who reportedly smashed his hard drive to bits before the Sandy Hook school shooting, the FBI said it was unable to retrieve information from his computer.
Even in cases of extreme damage, data recovery experts at the security firm Kroll told CBS News in 2012 that retrieving data is sometimes possible. Depending on the model, hard drives hold on average five disks per cartridge and record data on both sides, explained Erik Venema, who runs the forensics side of Kroll's operation. "It is like a cake. If the top one gets damaged, it may be that we can see the data that the user created on the other platters, which may not be damaged."
The life and death of an IRS email
Could Lerner's emails be backed up or retrieved somewhere else?
Most IRS officials use a desktop computer to send and receive email. At Friday's hearing, Koskinen called the IRS computer system "somewhat antiquated," noting that the IRS was still trying to "get all its workers onto Windows 7," which was released by Microsoft in October 2009. Many IRS employees also use laptops or smartphones, especially BlackBerrys.
Jim Gerretson owns and runs Gerretson LLC, a federal IT contractor. In an email to CBS News, he says: "All email is controlled by a server (or servers) in the Agency. Outlook is a client that must be connected to an Agency server."
As of 2011, each IRS employee was allocated 500 MB of storage space on the IRS email server. Employees who reached the 500 MB limit had to erase emails and attachments from the server to make room for more. The server was backed up regularly onto tapes, but the tapes were kept for only 6 months before they were overwritten with new backup data.
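To see why that rotation schedule matters, here is a minimal sketch of the two rules just described. It is not IRS software: the function names and the sample dates are hypothetical, and it assumes the last tape holding a message was written just before the message was erased from the server.

```python
from datetime import date

MAILBOX_QUOTA_MB = 500      # per-employee server allocation, per the article
TAPE_RETENTION_MONTHS = 6   # tapes overwritten after 6 months, per the article


def must_delete_mail(used_mb: float) -> bool:
    """True once a mailbox has reached the server quota."""
    return used_mb >= MAILBOX_QUOTA_MB


def whole_months_between(earlier: date, later: date) -> int:
    """Count whole calendar months elapsed from earlier to later."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    if later.day < earlier.day:
        months -= 1
    return months


def tape_may_still_exist(erased_on: date, searched_on: date) -> bool:
    """True if a backup tape written before the erasure could still exist."""
    return whole_months_between(erased_on, searched_on) < TAPE_RETENTION_MONTHS


# Hypothetical dates, chosen only to illustrate the retention window.
print(tape_may_still_exist(date(2011, 4, 1), date(2011, 8, 1)))
print(tape_may_still_exist(date(2011, 4, 1), date(2014, 6, 20)))
```

Under these assumptions, a search four months after an erasure could still turn up a tape, while a search three years later could not.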
If an email was considered a "federal record" as defined by the Federal Records Act, then the individual employee who generated or received it was held responsible for keeping that email stored somewhere. The state of the art for archiving a federal document is paper. According to the IRS's manual:
"If you create or receive email messages during the course of your daily work, you are responsible for ensuring that you manage them properly. The Treasury Department's current email policy requires emails and attachments that meet the definition of a federal record be added to the organization's files by printing them (including the essential transmission data) and filing them with related paper records.... Please note that maintaining a copy of an email or its attachments within the IRS email MS Outlook application does not meet the requirements of maintaining an official record. Therefore, print and file email and its attachments if they are either permanent records or if they relate to a specific case."
Not all emails are "federal records," though the definition is open to interpretation. Under a recently implemented IRS policy, all email is now retained, whether it is a federal record or not. However, that was not the case back in 2011.
Emails that leave the agency travel through the Internet using servers outside the IRS system. When asked if that leaves a digital breadcrumb trail that one might follow, Gerretson replied, "Always, but it's not easy."