Bill Gates once said that “technology is just a tool. In terms of getting the kids working together and motivating them, the teacher is the most important.” Applied to technology-assisted review (TAR), that makes total sense.
Members of my dev team and I were talking yesterday about the future of technology-assisted review and the direction we should take in developing our TAR tools and methodologies. Two major topics we discussed were:
- supervised and unsupervised machine learning
- the importance of process and workflow over the technology
We all agreed that humans need to keep driving the TAR tools if the tools are going to work the way we need them to. Unsupervised machine learning has its place in clustering, find-similar and other visual display approaches, but in order to find what we’re looking for as fast as possible, with as little review of irrelevant material as possible (the goal), we have to work with the algorithms and teach them how to find the documents we want. No algorithm is going to be able to tell you, on its own, that a document is Responsive. However, with the right education by exemplars, it can tell us which documents are similar enough to those we’ve already decided are Responsive that we should tag them the same way.
It’s a slight difference, but a key difference. Unsupervised machine learning can group together contextually similar documents, but it will never be able to apply a binary IN/OUT decision to those similar stacks of documents without a human’s say-so. That is the training that will always be necessary in TAR workflows.
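To make the "education by exemplars" idea concrete, here is a minimal sketch in Python. It is not how any particular TAR product works. It assumes a simple bag-of-words cosine similarity, and the function names (`vectorize`, `cosine`, `suggest_tag`) and sample documents are hypothetical. The point is only that the tag always originates with a human; the code just carries it over to the most similar document.

```python
import math
from collections import Counter


def vectorize(text):
    """Turn text into a bag-of-words term-count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def suggest_tag(doc, exemplars):
    """Suggest the tag of the most similar human-coded exemplar.

    exemplars: list of (text, tag) pairs that a human reviewer has coded.
    Returns (tag, similarity); tag is None when there are no exemplars.
    """
    doc_vec = vectorize(doc)
    best_tag, best_score = None, -1.0
    for text, tag in exemplars:
        score = cosine(doc_vec, vectorize(text))
        if score > best_score:
            best_tag, best_score = tag, score
    return best_tag, best_score


# Hypothetical exemplars: the human decides what Responsive means.
exemplars = [
    ("pricing agreement with distributor signed", "Responsive"),
    ("lunch menu for friday team outing", "Not Responsive"),
]
tag, score = suggest_tag("draft pricing agreement for the distributor", exemplars)
print(tag, round(score, 2))
```

Without the `exemplars` list there is nothing to suggest, which is the whole argument: similarity can be computed unsupervised, but the IN/OUT call has to be taught.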
The concern that machines will un-employ attorneys doing this kind of work is a false concern. Yes, it will reduce the number of reviewers necessary, but large teams of reviewers are a relatively new occurrence anyway. Document volumes used to be smaller, so smaller teams of reviewers were common. In the last 10 years, as data volumes have grown, so too has the need for giant teams of reviewers to get through documents in short time frames. Adding TAR tools can, and should, reduce the volume of documents for review and thus make the use of large review teams unnecessary again.
The other topic we discussed a lot yesterday was that the way we use the TAR tools is even more important than which technology we use. There are a lot of different algorithms out there that group similar documents together, all slightly different, but all doing essentially the same thing. While we did talk about which ones were best, we quickly agreed that how we use the tools (at which stages in the process, with what workflows, with what reporting, etc.) was even more important.
Continuous Active Learning (CAL) and hybrids of TAR 1 and CAL seem to be the way forward. Supervised machine learning that continues to learn as we learn is a great representation of the human/teacher and machine/student relationship. Couple that with representative sampling of all sub-populations not reviewed and we have both reduced the volume of documents needing review and increased the confidence level that we’ve viewed enough documents.
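The sampling step can be sketched roughly as follows. This is an illustration under stated assumptions, not our actual method: it draws the same fraction from every unreviewed sub-population and uses a simple normal-approximation margin of error, which is only one of several ways to build an interval and behaves poorly for very small samples. The sub-population names, the `stratified_sample` and `estimate_rate` helpers, and the counts are all made up.

```python
import math
import random


def stratified_sample(subpopulations, fraction, seed=0):
    """Draw the same fraction of document ids from every unreviewed sub-population.

    subpopulations: dict mapping a sub-population name to a list of document ids.
    At least one document is drawn from each non-empty sub-population.
    """
    rng = random.Random(seed)
    sample = {}
    for name, doc_ids in subpopulations.items():
        if not doc_ids:
            sample[name] = []
            continue
        k = min(len(doc_ids), max(1, round(len(doc_ids) * fraction)))
        sample[name] = rng.sample(doc_ids, k)
    return sample


def estimate_rate(responsive_found, sample_size, z=1.96):
    """Estimate the Responsive rate in the sampled population.

    Returns (rate, margin) using a normal approximation; z=1.96 corresponds
    to roughly 95% confidence.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    p = responsive_found / sample_size
    margin = z * math.sqrt(p * (1 - p) / sample_size)
    return p, margin


# Hypothetical unreviewed sub-populations, keyed by custodian.
unreviewed = {
    "custodian_a": list(range(0, 1000)),
    "custodian_b": list(range(1000, 1200)),
}
sample = stratified_sample(unreviewed, fraction=0.1)

# Suppose human reviewers found 4 Responsive documents in a 400-document sample.
rate, margin = estimate_rate(responsive_found=4, sample_size=400)
```

A human still codes every sampled document; the arithmetic only turns those decisions into a statement about how much Responsive material is likely to remain unreviewed.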
But it’s the process that seals the deal. We create workflows to help us use the TAR tools to move documents into review and into training for more active learning. The decision to review, or not review, a document is both a push and a pull. I, a human reviewer, submit examples of Responsive and Not Responsive documents to the algorithm (push). It in turn shows me other documents it can and cannot code and puts them in front of me for more teaching (pull). And this goes on and on, through the workflow steps until I feel I have seen everything I need to see.
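The push and pull described above can be sketched as a toy loop. It is a deliberately simplified stand-in for continuous active learning, not a real CAL implementation: the "model" is just term overlap with the two human-coded piles, the `reviewer` callable stands in for the human, and the stopping rule (stop when a batch turns up nothing Responsive) is an assumption made for the example.

```python
from collections import Counter


def score(doc, responsive_terms, other_terms):
    """Score a document by term overlap with each human-coded pile."""
    tokens = doc.lower().split()
    return sum(responsive_terms[t] - other_terms[t] for t in tokens)


def active_review(docs, seed_ids, reviewer, batch_size=2):
    """Toy continuous-learning loop.

    docs: dict of id -> text. reviewer: callable(text) -> bool (the human decision).
    Returns (labels, rounds), where labels maps each reviewed id to True/False.
    """
    labels = {}
    responsive_terms, other_terms = Counter(), Counter()

    def teach(doc_id):
        # Push: the human codes a document and the term counts are updated.
        decision = reviewer(docs[doc_id])
        labels[doc_id] = decision
        target = responsive_terms if decision else other_terms
        target.update(docs[doc_id].lower().split())
        return decision

    for doc_id in seed_ids:
        teach(doc_id)

    rounds = 0
    while True:
        remaining = [d for d in docs if d not in labels]
        if not remaining:
            break
        # Pull: the highest-scoring unreviewed documents come back for review.
        ranked = sorted(
            remaining,
            key=lambda d: score(docs[d], responsive_terms, other_terms),
            reverse=True,
        )
        rounds += 1
        found = [teach(d) for d in ranked[:batch_size]]
        # Assumed stopping rule: stop once a batch contains nothing Responsive.
        if not any(found):
            break
    return labels, rounds


# Hypothetical collection; the lambda plays the part of the human reviewer.
docs = {
    1: "pricing agreement with distributor",
    2: "lunch menu friday",
    3: "revised pricing agreement draft",
    4: "distributor pricing schedule",
    5: "parking garage closed friday",
    6: "team lunch photos",
    7: "holiday party invitation",
    8: "gym membership renewal",
}
labels, rounds = active_review(docs, seed_ids=[1, 2], reviewer=lambda t: "pricing" in t)
```

In this toy run the loop stops with some documents never reviewed, which is exactly the population that sampling would then need to check before anyone declares the review complete.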
The balance between reviewing too many documents and not enough documents can be achieved through workflows incorporating both a human reviewer and a TAR bot. Processes and workflows are created upfront. Documents move through the workflow as teaching and learning occur, and documentation and audit trails are created throughout the process to maintain visibility into the steps taken.
We have seen that this newest partnership between teacher (human) and student (machine) can drastically reduce the time and cost of review and increase the comfort level that you’ve taken great strides (within reason) to find all Responsive documents. TAR is just a technology tool, but TAR-driven workflows led by human expertise can be a solution.