by Stephen Stewart, CTO, Nuix
Preface: This is NOT about politics. This is all about the data discussed in Volume 1 of The Mueller Report.
I will admit, I am a total geek. When the government released the Mueller Report, I downloaded the PDF and, within a few minutes, ran it through Nuix.
For anyone who works with unstructured data for a living, the document would fall into the category of “Gross Data.” The PDF was a container for 447 JPGs with zero searchable text. Nuix made short work of this, and I was able to quickly OCR the images. Thanks to automatic rotation detection, I got good, clean text very quickly.
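For fellow geeks who want to sanity-check a "PDF full of pictures" claim without any special tooling, here is a minimal sketch. It is not how Nuix does it; it is just a rough heuristic that relies on the fact that JPEG images inside a PDF are stored as streams using the `/DCTDecode` filter. The function name `count_jpeg_streams` is my own invention for illustration, and the count is an approximation (it scans raw bytes rather than parsing the PDF properly).

```python
import re

# A /Filter entry naming DCTDecode, either directly (/Filter /DCTDecode)
# or inside a filter array (/Filter [/FlateDecode /DCTDecode]).
_JPEG_FILTER = re.compile(rb"/Filter\s*(?:\[[^\]]*)?/DCTDecode")


def count_jpeg_streams(pdf_bytes: bytes) -> int:
    """Roughly count JPEG-encoded streams in raw PDF bytes.

    Hypothetical helper: a byte-level heuristic, not a real PDF parser.
    """
    return len(_JPEG_FILTER.findall(pdf_bytes))


if __name__ == "__main__":
    import sys

    # Usage: python count_jpegs.py some_file.pdf
    with open(sys.argv[1], "rb") as fh:
        print(count_jpeg_streams(fh.read()))
```

If the number comes back close to the page count, there is a good chance every page is a scanned image and you will need OCR before you can search anything.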