Laura Sanchez discusses her research at DFRWS US 2019. Hello, I’m Laura Sanchez and I’m a graduate researcher at the University of New Haven. And I’ll be presenting the results of our survey that was conducted by my colleagues.
Our paper is a practitioner survey exploring the value of forensic tools, AI filtering and safer presentation for investigating child sexual abuse material, or CSAM. Believe it or not, that is actually a modified version of our original title. So our agenda for today is just a brief introduction, previous work, how the survey was designed, our results, challenges, future work and acknowledgements. We decided to do some research in this particular area for those investigating child exploitation cases, and we found two particular issues for those investigating these cases.
One is the investigative process itself. As many of you know, there may be large amounts of data that are obtained by people doing these types of investigations and it can be terabytes’ worth. So you have millions of images and hundreds of hours of video. And as such, it can be really difficult to get through all of this and manage a case, and it can be quite unproductive. So it leads to lost time, longer investigations and, unfortunately, backlogs.
And we also find that these types of investigations do have an impact on investigators and the victims themselves. So investigators, due to the exposure to explicit content, may end up having some kind of trauma which affects their life, their relationships and the work that they’re doing. And it also affects victims. It can be quite traumatizing or re-traumatizing to know that your sexual abuse might’ve been caught on camera and is out there permanently and it serves as a record.
So we explored different types of previous work. There’s a lot of great work out there on the psychological aspects and trauma for those working in the field of CSAM. We also looked at the current tools, techniques and automation that are currently in use by several different types of investigators. We looked at triaging, which is an important element of the investigation itself. And then artificial intelligence.
As for our contributions, we feel that, to the best of our knowledge, this is really the first comprehensive study that explores the value assigned to the technologies and tools used by investigators on CSAM cases. We also are really the first study that focuses on the data science techniques and technologies for these types of investigations. And the first one that also asks investigators specifically what false positives and false negatives they are currently expecting versus what their ideal would be.
So as I mentioned before, our motivation was really to tackle the two problems that are encountered by people in these types of investigations. So we really wanted to improve workflow to allow investigations to go by more quickly and to help out the investigators and the victims themselves. So shorten the amount of time that you might spend on an investigation as well as exposure to the type of material that you’re looking at.
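For readers less familiar with the two error types named here, the following is a minimal sketch of how false positive and false negative rates are conventionally computed for a detection tool. The function name and the counts are hypothetical illustrations, not figures from the survey.

```python
from typing import Tuple


def error_rates(tp: int, fp: int, tn: int, fn: int) -> Tuple[float, float]:
    """Return (false positive rate, false negative rate).

    tp/fn count relevant files the tool flagged/missed;
    fp/tn count irrelevant files the tool flagged/correctly passed over.
    """
    negatives = fp + tn  # all files that are actually irrelevant
    positives = tp + fn  # all files that are actually relevant
    fpr = fp / negatives if negatives else 0.0
    fnr = fn / positives if positives else 0.0
    return fpr, fnr


# Hypothetical counts, for illustration only.
fpr, fnr = error_rates(tp=90, fp=30, tn=870, fn=10)
print(f"false positive rate: {fpr:.3f}, false negative rate: {fnr:.3f}")
```

A false positive costs an investigator review time, while a false negative means relevant material is missed, which is why the two rates are usually reported separately.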
Our questions did focus on the tools and technologies in use in this type of investigation. And we really focused on asking practitioners of these investigations, what they feel about it, what kind of value they assigned to them. And as you can see there, we have several different questions. And it really depended on what we were asking them, the style that we chose. So our questions were divided up accordingly.
So there were demographics, then we focused on the processing and detection tools themselves, and then really focused on the technology offered by the tools, asking people what was available to them, what they were utilizing, and how they really felt about it.
And then we were interested in the current workflow that they’re implementing at their organization and what they would actually like to see.
And we also presented them [with] a proposed workflow. We had 106 participants. Our ideal sample size was actually 97, so we were quite content having more people respond to the survey. Unfortunately, and it’s something that we discuss in our challenges and throughout the paper, even though we had 106 participants, not every single person responded to every single question. So you’ll see that some of the questions were answered by more people than others, but it is discussed quite frequently in the paper.
So in terms of demographics, our population was majority white males ranging in age from 35 to 54 with at least a high school diploma. And for those that did have higher than a high school diploma, a bachelor’s degree was the highest degree that they had, and it was typically in the fields of technology and law.
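The talk does not say how the ideal sample size of 97 was derived. As a hedged illustration only, one common approach that happens to give the same figure is Cochran's formula for a proportion, assuming a 95% confidence level, a margin of error of 10 percentage points and maximum variability (p = 0.5). All three parameter values below are assumptions, not statements from the talk.

```python
import math
from statistics import NormalDist


def cochran_sample_size(confidence: float, margin: float, p: float = 0.5) -> int:
    """Minimum sample size for estimating a proportion in a large population.

    n = z^2 * p * (1 - p) / margin^2, rounded up to the next whole respondent.
    """
    # Two-sided z-score for the requested confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


# Assumed parameters: 95% confidence, +/-10 point margin, p = 0.5.
print(cochran_sample_size(confidence=0.95, margin=0.10))
```

Under these assumptions the formula yields 97, so 106 participants would exceed the target. Because not every participant answered every question, the effective sample for individual questions is smaller.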
And we also asked people what they felt they were most competent at. And unsurprisingly, we found that 99% of the respondents felt that they were much more comfortable in digital forensics even though not many people had had a degree in digital forensics.
And we also found that 41% were not competent at data science, which we found interesting given the amounts of data that people are coming into contact with. And we found it interesting because we feel that, on some level, if you are coming into so much contact with so much data, you should be a little bit comfortable, because we feel like having that comfort level will allow you to identify certain problems with the tools and technologies that you’re using. It might provide a way to be able to address that.
And then we also found that 69% have received formal training for investigating CSAM cases. And most of those trainings were from government funded projects. So we have a lot of people who had received training at ICAC [and] the National White Collar Crime Center as well as the National Center for Missing and Exploited Children.
So we asked people about the tools that they utilize for processing as well as for their actual investigative process later on. And we divided it into tools that are used for imaging, or for images, pardon, as well as for videos.
And we wanted to see if there was a difference in the types of tools and technologies that were available to people and what they were actually using to process their images and then to process their videos. And unsurprisingly it does appear that for both processing images and videos, commercial tools are the most utilized. So among those were Cellebrite, Magnet Forensics, and the Forensic Toolkit. We also asked about their limitations, and many people felt that the limitations were generally feature and capability related. So there was a lack of filtering, safe viewing, carving, photo enhancement and photo grouping, which was actually something that people were very much interested in having.
In terms of detection, we specifically focused on certain things that are currently out there. So we wanted to know if people were utilizing ICOP or ICAC, the Yahoo ‘not safe for work’, both, or neither. And as you can see, 50% of the respondents had identified ICOP as something that they had utilized and continue to utilize, whereas only 2.56 percent of the respondents said that they had used the tool provided by Yahoo. The majority also said that they hadn’t used either of them. So when asked about the benefits, several respondents said that it was quick. In addition, those that did utilize a hash database said that’s one of the reasons why they found it to be so quick. On the other hand, though, they also said that that was a limitation because you could only identify known or hashed content.
So while it provided a way to quickly identify images, it was limited in that it could only identify known images. In terms of implementation and use, we asked about the following technologies that were implemented in image and video processing tools. So these are what we consider the filtering technologies. And so as you can see here, skin tone detection and face detection are among those that are implemented the most in both image and video processing tools.
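The known-content limitation that respondents described can be illustrated with a minimal sketch of hash-set matching. This is not how any of the tools named in the talk is implemented; the function names, file names and byte strings are placeholders, and SHA-256 is an assumption chosen for illustration.

```python
import hashlib
from typing import Dict, List, Set, Tuple


def sha256_hex(data: bytes) -> str:
    """Hex digest used as the lookup key for a file's exact contents."""
    return hashlib.sha256(data).hexdigest()


def triage(files: Dict[str, bytes], known: Set[str]) -> Tuple[List[str], List[str]]:
    """Split file names into (matched a known hash, not in the hash set)."""
    matched: List[str] = []
    unmatched: List[str] = []
    for name, content in files.items():
        # Set membership is a constant-time lookup, which is why this is fast.
        if sha256_hex(content) in known:
            matched.append(name)
        else:
            unmatched.append(name)
    return matched, unmatched


# Placeholder data: one catalogued file, one altered copy, one new file.
known_hashes = {sha256_hex(b"previously catalogued file")}
evidence = {
    "a.bin": b"previously catalogued file",
    "b.bin": b"previously catalogued file.",  # one byte added
    "c.bin": b"never seen before",
}
matched, unmatched = triage(evidence, known_hashes)
print(matched, unmatched)
```

The lookup is quick because it is a single set-membership test per file, but anything not already in the hash set, including a copy that differs by one byte, is not matched and still needs another detection method or manual review.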
And in addition, we asked people if they’re actually using these technologies. Just because it’s available to you doesn’t necessarily mean that you’re using it. So we were interested in seeing if people were actually using them and it appears that when it is available, people are using them. So again, skin tone detection and face detection are among the highest ones to be used.
We also focused on asking investigators about the value that they assign to the filtering technologies. And as we suspected, filtering technologies have a higher value than what we call safe viewing technologies, which consist of things like least explicit frame neural net detection, tag presentation, selective body part viewing, nudity blocker and flow face presentation.
And actually I’m just going to kind of touch on this a little bit. So one of the reasons, in discussions with investigators, was they felt the filtering technologies were more conducive to their investigations because they still have to look at these images in order to develop their case and to be able to do what they need to do. While safe viewing is on some level appreciated, the filtering technology has more value to them because they feel it’ll help assist them more in getting to their main objective, which is getting the images or the videos that they need for their case.
So we also asked about workflow. Currently, what people feel to be a limitation to their workflow is that the current tools and technology are very limiting. They don’t help them speed up the process. They are not as efficient as they’d like them to be. And there are not enough people or resources to be able to manage the workload that they have. Then there is the time available: if you have too many cases and not enough people, it’s often time consuming. If your workflow is also not efficient, it takes up even more time.
And then again, resources. So that’s money, equipment, staff. And then participants’ suggestions included, again, having more resources, so for better tools and technology, and adding a CSAM hash database that’s accessible to investigators to also share newly discovered CSAM content.
So making it easier for people to share this and to do it in a legal fashion. And then more training for management and the implementation of policies and standards for current workflows. So this is the workflow that had been presented to the respondents and it was developed by MITRE. So we had rapid acquisition, multimedia file extraction and analysis, automated leads presentation and safer presentation of explicit material.
When presented to the respondents, about 45% did find it very valuable and provided some feedback on that. And then about 35% found it very valuable, followed by slightly to moderately valuable. And so these are some of the suggestions that people gave.
We asked them: if you could improve this particular workflow, what would you add to it? What do you think we could improve? And again, most of it had to do with some resources, time, tools, technology as well as other things.
Now, in terms of the challenges for our survey, as I had mentioned, while our ideal sample size had been met, we found that a lot of the participants did not answer all of the questions fully. So there was a lot of intentional skipping of questions and early dropout. And that’s something that we hope to look at a little bit more closely, and reapproach.
So again, we had questions with varying responses. We also feel that some of the wording may have caused a little bit of misinterpretation of the questions, so some non-related answers were provided for some of our questions. And then we also found some inconsistencies in terms of the answers provided by the respondents.
In terms of recommendations, we would like to see more AI, software design engineering and data science in digital forensics programs to make people a little bit more comfortable in that realm and to understand sort of what they’re dealing with, and to be able to give their own input into fixing some of the problems that they’re encountering.
We also would like to see more research supporting CSAM investigations and having a funding model for it. Encouraging the development and use of CSAM-centered open source tools, and establishing and implementing an up-to-date and standardized workflow. So something similar to what was presented by MITRE, but something a little bit more updated.
And then encouraging non-practitioners to engage in training to better understand the work that investigators are dealing with in terms of CSAM and then the resources that are needed. And a lot of people said that they would specifically like to see this for their management. I think management does the best that they can. But what we found from several respondents was that there didn’t seem to be quite the understanding on their part of what is actually needed by the investigators themselves to do this work.
And moving away from hash value identification. As we had mentioned, it was identified as an area that makes a lot of the tools work quickly in terms of identifying CSAM, but it’s also limited in that it will only identify what’s known. So new images and videos that are coming up may not be noticed right away. And then focusing more research on age estimation.
When we asked respondents about the value of age estimation tools, all of them said that it would be incredibly beneficial, particularly for the age group of zero to 12. And then when broken down there were specific age groups that they felt a tool focusing on that age group would be more valuable than others.
Developing a technology that can identify and group images of the same victim. So a lot of people said that they would like to see something where a tool can identify one particular person and then sort of start cataloging other videos and images, maybe from other sources, and kind of pull them all together so they can have all the information on one victim in one place.
Employing novel filtering techniques beyond skin tone detection, and novel techniques for providing leads. And then finally developing technologies allowing for newly identified CSAM to be added and shared amongst investigators.
So finally we would like to thank [indecipherable] who gave us this amazing opportunity to work on this research. And a special thanks to all the practitioners that took the survey and provided their responses. And our friends, I’ve met Otto Schwann who really dedicated his time and effort in helping design this survey, to UNH students who assisted in the testing phase, and then to DFRWS, who really made it possible to be here. So thank you.
There is time for questions.
Audience member: Okay, thank you. I’m very interested in this because obviously this is a huge problem. CSAM investigations are one of the most prevalent types of investigation, so whatever we can do is great, and understanding the use case. I’m curious as to whether your survey or your responses included any information about the difference in the time of using different techniques. Skin filtering and hash searches, [which] upfront are not very time intensive, do allow a lot of the computational stuff [to be] done quickly so an examiner can go in. When you start talking about AI, age estimation, higher processing on video content to find relevant frames, et cetera, that tends to take a lot more time computationally upfront. Were there any responses that had to do with why, maybe? I noticed that the skin filtering and the hash are what’s most commonly used regardless of the availability in tools, if that has anything to do with it.
Laura: Time essentially was of the essence in a lot of the responses that we received… we actually also did a whole section on asking respondents about the amount of time it takes to acquire the videos and images for processing and all this. And we asked them for computers and Android devices and whatnot. And we had varying responses and times of how long it actually takes. But it does seem that for a lot of the people, when implementing some of these technologies, it is a little bit more time-consuming, but also there was a concern of having false positives or false negatives. So that was another concern. So it seems to be tied in together between the time and then the results.
Audience member: Thank you. Yeah.
Host: Other questions, comments in the back?
Audience member: I wanted to provide a comment on some of your survey. I work for a company that deals with AI in security practitioning and what we see with a lot of our customers, our CSOs and folks is that the term ‘data science’ scares people. And so when you had that comment about people feeling that they’re not competent, I think that’s because we as an industry, not just here but across all industries, almost see data science as almost like a magic space. And none of us dare say we want to be, you know, assume any competence in that level. So it may be something to consider when you look at how you’re analyzing your respondents.
Host: Thank you. Thanks. Any other questions? Thank you. [inaudible].