One of my favorite Bill Gates quotes, one I refer to often, is “technology is just a tool. In terms of getting the kids working together and motivating them, the teacher is the most important.” If we think about that quote in terms of technology-assisted review (TAR), we are the teacher and TAR tools are the students.
The two major uses for TAR tools are supervised and unsupervised machine learning, but each of those functions relies on the integration of process and workflow in order to work well.
Unsupervised machine learning is a grouping mechanism that doesn’t need human intervention, other than to hit the ‘GO’ button. We use this in tools like clustering, find similar, near-duplicate detection and other visual display approaches – the TAR algorithms simply look for documents that look alike textually to a certain degree. It’s like what we used to do when we got boxes of paper documents – we’d sort them into piles of similar documents for better review, ignoring some piles, beelining for others. That is how the unsupervised learning tools work. Deciding which piles to dive into and which ones to ignore is where process and workflow come into play. Using these tools, we can move piles into review for full analysis, we can decide to simply spot-check them for Responsiveness, or we can push them aside for lack of relevance.
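To make the pile-sorting idea concrete, here is a minimal Python sketch. It is an illustration only, not how any particular TAR product works: the word-overlap (Jaccard) measure, the 0.5 threshold, the function names and the sample documents are all assumptions made for the example.

```python
import re

def tokens(text):
    """Return the set of lowercase words in a document."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a, b):
    """Share of words two documents have in common (0.0 to 1.0)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def group_similar(docs, threshold=0.5):
    """Greedy one-pass grouping: each document joins the first pile whose
    first document it resembles closely enough, else it starts a new pile.
    The threshold is an assumed value chosen for this example."""
    piles = []  # each pile is a list of document indexes
    for i, doc in enumerate(docs):
        doc_words = tokens(doc)
        for pile in piles:
            if jaccard(doc_words, tokens(docs[pile[0]])) >= threshold:
                pile.append(i)
                break
        else:
            piles.append([i])
    return piles

# Hypothetical sample documents
docs = [
    "Quarterly sales report for the northeast region",
    "Quarterly sales report for the southwest region",
    "Lunch menu for the office party on Friday",
    "Office party lunch menu for Friday",
]
piles = group_similar(docs)
for pile in piles:
    print([docs[i] for i in pile])
```

With these four sample documents, the two sales reports land in one pile and the two lunch notices in another. Nobody told the code what the documents were about; deciding which pile deserves review is still up to us.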
Supervised learning is more nuanced and needs a teacher to provide guidance and direction. In order to find what we’re looking for as fast as possible, with as little review of irrelevant material as possible, we have to work with the algorithms and teach them how to find the documents we want. No algorithm is going to be able to tell you, on its own, whether a document is Responsive or not. However, with the right exemplars and training, the tools can tell us which documents are similar enough to those we’ve already decided are Responsive and Not Responsive. This helps us focus our review on those documents that are likely Responsive and, just as importantly, highlights the large pile of documents we don’t have to waste our precious time on.
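The teaching step can be sketched in a few lines of Python. This is a deliberately simple word-count scorer, assumed here for illustration and far cruder than the models real TAR tools use; the function names, the exemplars and the unreviewed documents are all hypothetical.

```python
import re
from collections import Counter

def words(text):
    """Split a document into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def train(labeled):
    """Count how often each word appears in Responsive (True) and
    Not Responsive (False) exemplars coded by a human reviewer."""
    counts = {True: Counter(), False: Counter()}
    for text, responsive in labeled:
        counts[responsive].update(words(text))
    return counts

def score(text, counts):
    """Positive when the wording leans toward the Responsive exemplars,
    negative when it leans toward the Not Responsive ones."""
    return sum(counts[True][w] - counts[False][w] for w in words(text))

# Hypothetical exemplars: (document text, reviewer's Responsive decision)
labeled = [
    ("pricing agreement with the distributor", True),
    ("revised distributor pricing terms", True),
    ("holiday party parking instructions", False),
    ("parking garage closed Friday", False),
]
unreviewed = [
    "parking pass for the holiday party",
    "draft pricing terms for the distributor",
]

model = train(labeled)
# Most likely Responsive first, so review effort goes there.
ranked = sorted(unreviewed, key=lambda d: score(d, model), reverse=True)
for doc in ranked:
    print(score(doc, model), doc)
```

The scorer never decides that a document is Responsive. It only ranks unreviewed documents by how closely their wording matches what the teacher has already coded, which is the division of labor described above.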
Document volumes used to be smaller, so small, agile, focused teams of reviewers were the best way to help counsel respond to Discovery requests. These small teams became experts in the subject matter, and remaining consistent across the team was relatively simple. In the last 10 years, as data volumes have grown, so too has the need for giant teams of reviewers to get through large volumes of documents in short time frames. Giant teams can be unwieldy, and consistency across the team and productions becomes a search for the Holy Grail. Adding TAR tools can drastically reduce the volume of documents for review and therefore make it possible to get back to those small, agile and focused review teams.
Continuous Active Learning (CAL) seems to be the way forward, the way that allows for those small review teams. Supervised machine learning that continues to learn as we learn is a great representation of the human/teacher and machine/student relationship. But it’s the process that seals the deal. We use the TAR tools to do what they can do (see relationships we can’t yet see) and then we create workflows to help us move documents into review and into training. It is an iterative process that truly does allow for continuous and active learning.
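The iterative loop can be sketched as follows. This is a toy under stated assumptions, not a description of any CAL product: the scoring is a simple word-count difference, the reviewer is simulated by a keyword check, and the stopping rule (stop after one batch with no Responsive documents) is assumed purely for illustration and is not a defensible stopping criterion on a real matter.

```python
import re
from collections import Counter

def words(text):
    """Split a document into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def score(text, counts):
    """Higher when the wording matches documents coded Responsive so far."""
    return sum(counts[True][w] - counts[False][w] for w in words(text))

def cal_review(docs, seed_ids, reviewer, batch_size=2, stop_after_empty=1):
    """Review in batches, relearning from every new coding decision."""
    counts = {True: Counter(), False: Counter()}
    coded = {}  # document index -> reviewer's decision

    def code(doc_id):
        decision = reviewer(docs[doc_id])          # the teacher decides
        coded[doc_id] = decision
        counts[decision].update(words(docs[doc_id]))  # the student learns
        return decision

    for doc_id in seed_ids:
        code(doc_id)

    empty_batches = 0
    while empty_batches < stop_after_empty:
        remaining = [i for i in range(len(docs)) if i not in coded]
        if not remaining:
            break
        # Re-rank everything unreviewed using all decisions made so far.
        remaining.sort(key=lambda i: score(docs[i], counts), reverse=True)
        hits = sum(code(i) for i in remaining[:batch_size])
        empty_batches = empty_batches + 1 if hits == 0 else 0
    return coded

# Hypothetical collection; the lambda stands in for a human reviewer.
docs = [
    "distributor pricing agreement draft",
    "holiday party parking instructions",
    "revised pricing terms for distributor",
    "pricing schedule attached to agreement",
    "parking garage closed Friday",
    "cafeteria menu for next week",
    "team lunch on Friday",
    "garage door repair notice",
]
reviewer = lambda text: "pricing" in text.lower()

coded = cal_review(docs, seed_ids=[0, 1], reviewer=reviewer)
never_reviewed = sorted(set(range(len(docs))) - set(coded))
print("Reviewed:", sorted(coded), "Never reviewed:", never_reviewed)
```

In this small run, every pricing document is pulled into review within the first batch after the seed set, and two of the non-pricing documents are never reviewed at all. On a real matter the stopping decision would rest on the workflow and QC around the tool rather than on a one-line rule like this one.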
These workflows let us strike a good balance between reviewing too many documents and reviewing too few. Documents move through the workflows as decisions and QC are happening, and as teaching and learning are happening in real time. The TAR tools create good audit trails throughout the process, keeping the steps taken visible and defensible.
We have seen that this newest partnership between teacher (human) and student (machine) can drastically reduce the time and cost of review and increase the comfort level that you’ve taken great strides (within reason) to find all Responsive documents. CAL is just a technology tool, but a CAL-driven workflow, guided by TCDI’s expertise, is a real solution.