Situation
A large real-estate conglomerate engaged TCDI to review more than one million documents following a substantial data breach. The client needed to identify individuals whose Personally Identifiable Information (PII) may have been impacted.
During this project, TCDI leveraged its Process-Driven AI approach, which combines legal process optimization, legacy AI technologies, Generative AI technologies, and human validation through TCDI’s Military Spouse Managed Review (MSMR) program. This approach proved remarkably efficient and cost-effective in handling the PII notification review.
Reduced Document Population by 97% using legacy AI tools and metadata culling
Resolution
Given the immense volume of data, TCDI began the engagement by running the data set against its extensive list of PII categories and search terms. Using legacy AI tools and metadata culling, TCDI reduced the initial volume of 1,060,742 documents by approximately 97%, leaving 35,608 documents potentially containing PII.
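The culling figure can be checked directly from the two document counts reported above. The short Python sketch below is only a sanity check on the arithmetic; the variable names are illustrative and are not part of TCDI's tooling.

```python
# Document counts as reported in this case study.
initial_docs = 1_060_742
remaining_docs = 35_608

# Documents removed by legacy AI tools and metadata culling.
culled_docs = initial_docs - remaining_docs
reduction_pct = 100 * culled_docs / initial_docs

print(f"Culled: {culled_docs:,} documents")
print(f"Reduction: {reduction_pct:.1f}%")
```

The exact reduction is about 96.6%, which rounds to the reported 97%.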
These documents were then processed through a proprietary Generative AI tool from TCDI’s partner, TackleAI. The Gen AI tool identified 53 documents containing true PII, which were then reviewed and validated by TCDI’s MSMR team. In addition, the MSMR team manually reviewed exception documents, which represented less than 3% of the data set. Exception documents were those that were too large, too small, or contained no text.
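The exception handling described above amounts to a routing rule applied before Gen AI processing. The sketch below illustrates one way such a rule could look. It is not TackleAI's or TCDI's implementation: the `Document` type, the `route` function, and both size thresholds are hypothetical, since the case study does not state the limits that were used.

```python
from dataclasses import dataclass

# Hypothetical thresholds for illustration only; the case study does not
# state the actual size limits.
MAX_CHARS = 500_000
MIN_CHARS = 20


@dataclass
class Document:
    doc_id: str
    text: str


def route(doc: Document) -> str:
    """Return 'manual_review' for exception documents, else 'gen_ai'."""
    stripped = doc.text.strip()
    if not stripped:
        return "manual_review"  # no extractable text
    if len(stripped) < MIN_CHARS or len(stripped) > MAX_CHARS:
        return "manual_review"  # too small or too large
    return "gen_ai"
```

For example, `route(Document("doc-1", ""))` returns `"manual_review"`, while a document with a few paragraphs of text is routed to `"gen_ai"`.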
Once the identified document IDs from the notification list were reviewed, TCDI’s MSMR team performed two additional rounds of QC. The first round, QC Set 1, involved two statistical samples of 381 documents from the Generative AI-processed set that were not flagged for PII. The second round, QC Set 2, included two sequential statistical samples of 385 documents that were excluded from Generative AI processing (the null set) to verify proper culling during the legacy AI data reduction.
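The case study does not state how the sample sizes were chosen. As an illustration, the figures of 381 and 385 are consistent with Cochran's sample size formula with a finite population correction, assuming a 95% confidence level (z = 1.96), a 5% margin of error, and maximum variability (p = 0.5). These parameters are assumptions, as is the use of the full 35,608-document count as an approximation of the unflagged Gen AI-processed population. The `sample_size` function is a hypothetical helper.

```python
import math


def sample_size(population: int, z: float = 1.96,
                margin: float = 0.05, p: float = 0.5) -> int:
    """Cochran's sample size with finite population correction."""
    # Sample size for an effectively infinite population.
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    # Adjust for the finite population and round up.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


gen_ai_set = 35_608            # approximate population for QC Set 1
null_set = 1_060_742 - 35_608  # documents culled before Gen AI processing

print(sample_size(gen_ai_set))  # 381
print(sample_size(null_set))    # 385
```

Under these assumptions, the sample size barely grows as the population increases, which is why the much larger null set needs only four more documents per sample.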
Through this QC process, one document in QC Set 1 was found to contain PII that Gen AI had not flagged because the PII was in a foreign language. TCDI’s team worked closely with TackleAI to update the Gen AI prompts, and the analysis was rerun on the document set to include targeted terms, identify additional foreign-language content, and ensure no other instances of the PII existed. In addition, eight documents containing handwriting were identified and manually reviewed by the MSMR team. No errors were found in QC Set 2. At project conclusion, the total notification list included 48 individuals.
Impact
Through TCDI’s Process-Driven AI approach, the MSMR team achieved remarkable results for our client, including 90% automation of review work. This resulted in a reduction of more than 60% in total project costs while maintaining superior quality and consistency across the data set.
In addition, by maintaining a “human-in-the-loop” approach through human QC and validation, the project achieved its objectives with exceptional accuracy and speed. Most importantly, the notification list was completed within a 1.5-week period, significantly outpacing traditional PII review timelines for data sets of similar size, which would typically take 3-4 weeks with aggressive data reduction or up to 3 months without it.