Updated: Nov 7, 2018
August is upon us and with it comes an overkill of back-to-school commercials highlighting special deals on everything from backpacks to Air Jordans to even black markers. And while we see these ads every summer, a funny thought recently occurred to me during one of them: it wasn’t all that long ago when redaction was actually performed with black markers. Even as recently as ten years ago, there’d be hundreds of thousands of papers that had to be flipped through one-by-one during discovery, and sensitive information would be blacked out by hand. (A good deal on black markers could have come in handy!)
But with the amount of digital data in the universe doubling every two years by some accounts, the days of hard copies and manual work are slipping further and further behind us. The ediscovery software market is expanding rapidly (recently estimated to grow by 17.36% over the next four years), and redaction software is fast becoming commonplace in organizations. Adios, black marker; hello, cursor!
That’s good news, right?
Not necessarily. Though redaction software, when used properly, prevents the inadvertent disclosure of sensitive information, many programs like Adobe Acrobat are not yet foolproof when it comes to redacting personally identifiable information (PII). Many also still require manual steps that are prone to human error. Employees might think they are redacting correctly or producing documents properly, but the process leaves a lot of room for error, both technological and clerical. Take redacting PDF documents, for example. PDF documents are constructed in layers consisting of text and images. A redaction tool that merely cloaks an image or text is therefore not a foolproof method of redaction: it only adds another layer to the document, and that layer can be peeled back to reveal what’s underneath, because the original content is still in the file.
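To make the layering problem concrete, here is a minimal Python sketch. It is a toy model, not a real PDF parser: the `TextRun` and `BlackBox` classes, the page-as-a-list representation, and the sample values (including the placeholder SSN) are all assumptions made up for illustration. It contrasts covering text with a box against actually removing the text underneath.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextRun:
    """A piece of selectable text placed on the page (hypothetical model)."""
    text: str
    x: float
    y: float
    width: float
    height: float


@dataclass(frozen=True)
class BlackBox:
    """An opaque rectangle drawn on top of the page (hypothetical model)."""
    x: float
    y: float
    width: float
    height: float


def overlaps(run, box):
    """True if the text run's rectangle intersects the box's rectangle."""
    return (run.x < box.x + box.width and box.x < run.x + run.width
            and run.y < box.y + box.height and box.y < run.y + run.height)


def extract_text(page):
    """What copy-and-paste sees: every text run, whatever is drawn over it."""
    return " ".join(obj.text for obj in page if isinstance(obj, TextRun))


def cover_only(page, box):
    """Flawed 'redaction': add a box as a new layer, leave the text in place."""
    return page + [box]


def redact(page, box):
    """Safer redaction: delete the covered text runs, then draw the box."""
    kept = [obj for obj in page
            if not (isinstance(obj, TextRun) and overlaps(obj, box))]
    return kept + [box]


page = [
    TextRun("Account holder:", 10, 100, 90, 12),
    TextRun("SSN 000-00-0000", 110, 100, 90, 12),  # placeholder value
]
box = BlackBox(108, 98, 94, 16)

covered = cover_only(page, box)
redacted = redact(page, box)

print(extract_text(covered))   # the "hidden" text is still extractable
print(extract_text(redacted))  # the covered text is gone from the page
```

In this model both pages would look identical on screen, but only the second has actually lost the sensitive text. Real PDF files are considerably more complex, so treat this purely as an illustration of why a cover layer alone is not redaction.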
This misuse and mistrust of redaction tools can have serious consequences. Here’s what we consider to be the five most epic redaction software “fails” to make headlines in the last several years:
1. HSBC – On December 3rd, 2009, HSBC issued public notification letters about sensitive data it had redacted in Chapter 13 bankruptcy forms that were filed electronically. Due to a “bug in its imaging software,” the information turned out to be viewable. In other words, with more and more legal documents being filed online, an electronic redaction that is not done properly can be undone with a simple cut-and-paste technique. The data disclosed might have included HSBC credit card numbers, as well as line-of-credit and mortgage information, a spokeswoman announced.
2. TSA – Another redaction fail in December of 2009 occurred when a Transportation Security Administration contract employee manually redacted a 93-page operating manual that was then put up on a government website. Shortly thereafter, it was discovered that the employee simply drew black boxes over text, so the redacted text was still there and available to read with standard cut-and-paste commands. As a result, secretive TSA screening methods designed to prevent terrorism became public knowledge.
3. New York Times – In January of 2014, the New York Times also suffered a major manual redaction failure. As part of its reporting on leaked Snowden documents, the New York Times redacted both the name of an NSA agent and a confidential NSA program target from a PDF it uploaded to its site. After the PDF was posted, a cryptography website downloaded a copy and discovered that three of the redactions intended to obscure sensitive national security information were easily accessible by – you guessed it – highlighting, copying, and pasting the text. As Bob Cesca of the Daily Banter put it, “All of this is due to the incompetence of whoever failed to properly redact the PDF before publishing it for the world to see.”
4. Citigroup – In July of 2013, Citigroup acknowledged that it failed to safeguard Social Security numbers, birthdates, and other sensitive data for nearly 150,000 customers who filed for bankruptcy between 2007 and 2011. Citi had discovered a problem with the way its software redacted customer data on bankruptcy filings for secured loans. The bank said in a statement, “As a result of this limitation in technology, personally identifiable information could be exposed and read if electronic versions of the court records were accessed and downloaded from the courts’ online docket system, and if the person downloading the information had the technical knowledge and software to restore the redacted information.”
5. Numerous FOIA redactions – With all of these epic redaction fails out there in the news over the past five years or so, it’s easy to see why people would want to err on the side of caution. But in many instances, responses to Freedom of Information Act (FOIA) requests have gone way overboard with redaction, making their efforts come off as both lazy and suspicious. One example involves a simple request for complaints regarding Amtrak’s lounge cars by Politico reporter Connor Skelding. The request sat in processing for nine months, only for a FOIA officer to respond with a heavily redacted reply.
In another instance, ex-AOL worker Jason Smathers requested talking points generated by the National Security Agency between 2009 and 2012, a period just before everyone was up in arms about bulk data collection. It’s understandable for FOIA officers to be protective of information handed over in FOIA requests, especially after seeing all of these inadvertent leaks of sensitive information due to human error or redaction software failures. FOIA officers also often work under tight time-frames, with limited resources and a huge backlog of requests to deal with, so over-redaction is one way to cope. However, after nearly two years of processing, Smathers learned that apparently all of the NSA’s activities are too sensitive for the public to see.
This is a major reason why many lawyers still prefer black marker techniques, even with the growing amounts of data and costs of ediscovery. According to Fernando M. Pinguelo, a partner with Norris, McLaughlin & Marcus, P.A., when it comes to digital redaction, “You can’t help but wonder if that black line across the text is going to remain once you print the document.”
But regressing to black marker techniques might not be the brightest move in today’s digital age. With ESI volumes growing at an alarming rate, it is important to establish a proper and effective redaction strategy that is affordable, doesn’t inadvertently leak sensitive information to the public, and doesn’t suspiciously over-redact. Organizations and agencies that deal in sensitive information are starting to look to redaction automation software that solves both problems: it produces foolproof, irreversible redactions, and it does so automatically. Less time spent on manual work means more time is available for QC by a qualified human reviewer.