by Christa Miller, Forensic Focus
Mark Zuckerberg’s new “privacy manifesto” for Facebook marks not just a pivot in terms of how the social network shapes modern-day communication. It also marks what The Verge’s Casey Newton called “the end of the News Feed era.”
Zuckerberg’s opening statement draws a distinction between the “digital equivalent of a town square” which Facebook and Instagram have helped to build over the past 15 years, and the “digital equivalent of the living room” in which more users prefer to spend time together. Most child exploitation domain experts would be quick to point out, however, that child abuse is far more pervasive in living rooms and other private spaces than it is in town squares.
For example, the United Kingdom’s National Society for the Prevention of Cruelty to Children (NSPCC) cites studies from 2011 and 2014 showing that more than 90% of sexually abused children were abused by someone they knew. That’s consistent with US-based research showing that only about 10% of perpetrators of child sexual abuse are strangers to the child.
Online, of course, Facebook and its properties Instagram and WhatsApp are well known to law enforcement:
- In late 2018, TechCrunch reported that WhatsApp’s encryption, as well as its lack of human moderation, made it easier for child abusers to use the app’s private groups to share child sexual abuse material (CSAM).
- Business Insider reported in September 2018 that Instagram’s IGTV service had recommended video content containing CSAM.
- Engineering and Technology described Facebook’s efforts in late 2018 to delete more than eight million pieces of content that violated the site’s rules on child nudity and exploitation in the previous three months alone.
- In March 2019, Forbes reported NSPCC research showing that Instagram has become the leading platform for child grooming in the United Kingdom.
This isn’t to say that non-Facebook-owned social media, including Tumblr, YouTube, and others, are immune to similar problems. However, Facebook’s new focus — including its recent announcement regarding its own cryptocurrency — highlights the ongoing tensions between safety and privacy, as well as between law enforcement and the private sector. How might Facebook’s planned shift from public to private sharing affect forensic investigations?
Digital Forensic Artifacts
Zuckerberg’s statement focuses on “private messaging, ephemeral stories, and small groups,” which he notes are the fastest growing areas of online communication. As reasons for this shift, he cites user desire for one-on-one or one-to-few communications, as well as concerns around permanent records of communications.
On the one hand, any limit on access to content hurts investigations. “Most people today don’t jailbreak their $1000 iPhone or root their expensive Androids nearly as often as they used to,” explains Domenica “Lee” Crognale, a SANS Certified Instructor and co-author of the SANS FOR585: Advanced Smartphone Forensics course, “and this alone is keeping us from taking a look at some of these applications forensically.”
Indeed, the constantly evolving handshake between mobile operating systems and apps can make it difficult for forensic vendors to keep up, or to communicate new changes to forensic examiners.
In Crognale’s research, for instance, it’s common for databases or proprietary file formats to be inaccessible if an app developer limits backup data to, say, attachments or pictures. “This means that we are often missing a piece of the puzzle,” she says; in other words, chat, email, and call content.
On the other hand, Crognale says she’s come across very few databases that are encrypted in their entirety. “It is more common to see certain records within a database encrypted (think secret messages),” she says.
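One way examiners triage a database for such selectively encrypted records is to measure byte entropy: encrypted or compressed blobs look nearly random, while readable text does not. A minimal sketch, with illustrative sample data not tied to any particular app:

```python
import math

def shannon_entropy(data: bytes) -> float:
    """Bits per byte; values near 8.0 suggest encrypted or compressed content."""
    if not data:
        return 0.0
    counts = {}
    for b in data:
        counts[b] = counts.get(b, 0) + 1
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

# Readable text has a limited symbol distribution and scores low.
plain = b"meet me at the usual place at eight" * 4
# A uniform byte spread stands in here for ciphertext.
random_like = bytes(range(256))

print(round(shannon_entropy(plain), 2))        # low: likely plaintext
print(round(shannon_entropy(random_like), 2))  # 8.0: likely encrypted/compressed
```

Scoring each record’s blob this way lets an examiner flag which rows of a database warrant decryption effort and which are ordinary plaintext.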
Facebook Messenger and Instagram Direct have enabled private messaging for some time, of course. It’s the extension of WhatsApp’s end-to-end encryption (E2EE) to these services, and to additional planned services such as video chats, payments, and commerce, that are of concern.
Acknowledging “a growing concern among some that technology may be centralizing power in the hands of governments and companies like ours,” Zuckerberg heralds E2EE as a “decentralizing” tool that helps ensure the freedom of dissidents worldwide — as well as acknowledging misuse by “people doing… truly terrible things like child exploitation, terrorism, and extortion.”
Of course, the difference between data at rest and data in motion has been well documented. Joseph Pochron, President of Forensic Technology & Consulting at TransPerfect Legal Solutions, notes that the transition to E2EE is “a positive step, but a lot of the public doesn’t realize WhatsApp makes data at rest accessible via forensic tools.”
Brett Shavers, a former law enforcement investigator and a digital forensics practitioner, concurs. “I have seen instances where encryption was marketed, but not employed as well as [it was] marketed… unless the user has control of the encryption scheme, there is no way to ensure the encryption is truly encryption (that is, without backdoors). With that, law enforcement should never assume that if something is encrypted, that it is impossible to access, because what is inaccessible today may be accessible tomorrow.”
Bringing Facebook and Instagram apps more in line with WhatsApp, then, means existing forensic research on WhatsApp and similar apps — accessible via Shavers’ website, DFIR.training — might lend some clues as to what to expect from future changes.
Forensic vendor tools, meanwhile, offer solutions that allow examiners to get around E2EE in apps by:
- Obtaining data using the device phone number, or app tokens stored on the mobile device in place of users’ login details.
- Acquiring and decrypting WhatsApp backups stored in an Android user’s iCloud or Google Drive account, or directly from the WhatsApp server. These include backups that the examiners themselves can make.
However, these methods require investigators to know some piece of information or to acquire a level of access that isn’t always possible; for example, when the user is uncooperative or deceased, or can’t be found.
In those cases, investigators have to approach the provider for data. Law enforcement can write search warrants; corporate litigants have to subpoena service providers. Preservation orders prevent marked-for-deletion data from being wiped until the legal paperwork can be served, and the provider generally returns the data to the requester within a few weeks or months.
This process is in flux, says Pochron, thanks to laws like the European Union’s General Data Protection Regulation (GDPR) and the California Consumer Privacy Act, which are changing how providers deal with personally identifiable information (PII).
These laws require providers to enable users to see what data providers retain about them, and to revoke permission for providers to use the data. As a result, users can download their own data, which can in turn be imported into a forensic tool. In many cases investigators can work with victims and even some suspects to accomplish this.
The same data retention requirements make it easier to request data from a service provider. Under the GDPR, a company has 30 days to respond to a request for user data they retain; under California’s Privacy Act, as of 2020, a company will have 45 days to respond. (Notably, Apple’s Data & Privacy Portal states that the company may take up to seven days to verify and respond to a request.)
Pochron says such user data requests can often result in “quite a lot of information.” To that end, data that’s too difficult for forensic examiners to recover from a device may increasingly be available from the cloud — and with a richer dataset. For law enforcement, he stresses, this might make real-time consent to search and collect data increasingly important.
Overall, Pochron doesn’t anticipate that Facebook’s move to E2EE will be much of a game-changer for the industry unless Facebook follows “ephemeral by design” apps like Signal, Wickr, or Telegram.
That’s the condition at the heart of what Zuckerberg calls a “permanence problem”: when large quantities of messages and photos, collected over time, “become a liability” owing to embarrassment or, well, criminal behavior. Facebook may address this through ephemerality, the ability to set content to expire or be archived automatically.
Going further than the platform’s Stories, automatic deletion or expiration could extend anywhere from a month or more to as little as a few seconds or minutes. That’s why ephemeral apps pose what Crognale calls “the biggest problem for data recovery.” “These time-bombed messages are really gone, in most every case that I have examined, when the time limit has elapsed,” she explains.
As a result, ephemeral messages carry two potential implications for forensic examiners, says Pochron. The first is, of course, whether ephemeral content includes “breadcrumbs” that can help build a case, or the “smoking gun” that can make it. That’s of particular concern to criminal investigators.
The second issue is legal. “If a business deploys apps that are ephemeral by nature, then it can’t retain those business records, and won’t be able to comply with legal hold requirements,” Pochron says. He points to a 2017 case, Waymo v. Uber, in which the plaintiffs accused the defendants of using Wickr — an ephemeral messaging app — to avoid such requirements.
So, while preservation orders would be useful in some cases, in others, investigators may have to rely on whether a victim was able to screenshot and store the evidence on their device — though Crognale notes that it’s possible for devices to take screenshots on the user’s behalf. “[For instance] if an incoming call interrupts what you were doing in one application and minimizes your screen,” she says.
Just as WhatsApp forensics could contain clues on how to address encryption of all Facebook and Instagram private messages, examiners may look to research published on Telegram forensics (ResearchGate; ScienceDirect), as well as Snapchat forensics (CarpeIndicium; ScienceDirect; Eidebailly), for clues on how to deal with ephemeral messages.
Crognale’s research shows that sometimes, unsuccessful message transmissions are cached on a device as data meant for destruction — though it’s becoming more common, she adds, for databases to purge information marked for deletion.
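That soft-delete pattern can be sketched against a hypothetical SQLite schema. The table and column names below are invented for illustration, not taken from any real app’s database, but the idea is common: rows are first flagged for deletion, and remain recoverable with an ordinary query until a later purge physically removes them.

```python
import sqlite3

# Hypothetical chat-app schema: a "deleted" flag marks rows for later purging.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE messages (
        id INTEGER PRIMARY KEY,
        sender TEXT,
        body TEXT,
        sent_ts INTEGER,
        deleted INTEGER DEFAULT 0   -- 1 = marked for deletion, not yet purged
    )
""")
conn.executemany("INSERT INTO messages VALUES (?, ?, ?, ?, ?)", [
    (1, "alice", "see you at 8", 1552000000, 0),
    (2, "bob",   "deleted draft", 1552000300, 1),
    (3, "alice", "failed send",   1552000600, 1),
])

# Flagged-but-unpurged records are still recoverable with a plain query.
recoverable = conn.execute(
    "SELECT id, sender, body FROM messages WHERE deleted = 1"
).fetchall()
for rec in recoverable:
    print(rec)
```

Once an app vacuums the database or overwrites freed pages, even this avenue closes, which is consistent with Crognale’s observation that purging is becoming more common.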
The fallback for now: user habits. “Luckily for us, lots of people like to rely on going back to messages… so we can only hope that they choose secret messages over the ephemeral kind, because we have more success with the recovery of secret messages.”
In Shavers’ view, the forensic recovery and analysis of ephemeral data is similar to verbal harassment. “The credibility of the victim and other factors are needed to corroborate what a victim saw on a device before it disappeared, if forensics cannot retrieve it,” he says. Exceptions: if a recording was made, or if metadata exists to help corroborate a complaint.
Another way law enforcement proactively identifies bad actors and their content is through direct participation. ABC7 in Chicago reported in late 2017 that undercover agents had infiltrated a number of Facebook groups set up to deal in drugs and guns. That probe resulted in 50 arrests.
Drugs and guns aren’t the only currency in private Facebook groups, though. The previously cited article from Engineering and Technology, dated October 2018, quoted NCMEC’s chief operating officer, Michelle DeLaune, who commented on the “crucial blind spot [of] encrypted chat apps and secretive ‘dark web’ sites where much of new [CSAM] originates.”
Although participation in CSAM-related groups is often predicated on sharing CSAM, which federal law prohibits and which therefore bars individual agents from that kind of live activity, other technology may be able to help.
Late in 2018 Facebook announced: “In addition to photo-matching technology, we’re using artificial intelligence and machine learning to proactively detect child nudity and previously unknown child exploitative content when it’s uploaded.” Additionally, new machine learning-driven software, which Facebook is helping NCMEC to develop, could help to prioritize the tips that are ultimately shared with law enforcement.
Zuckerberg’s statement referred to technology that could detect “patterns of activity or through other means, even when we can’t see the content of the messages.” However, he acknowledged, “… we face an inherent tradeoff because we will never find all of the potential harm we do today when our security systems can see the messages themselves.”
Whether investigators themselves will be able to work more proactively using AI and similar technology depends on how fast new software, or new functionality in existing software (for example, current monitoring across peer-to-peer networks), could be developed. Undercover investigations are time-consuming and resource-intensive, and social media monitoring continues to be the subject of much contentious debate.
To some extent, as described in last year’s press release, Facebook already takes proactive steps to cooperate with law enforcement. Another possibility, as described in the TechCrunch article, could involve other third parties: “[WhatsApp] suggested that on-device scanning for illegal content would have to be implemented by phone makers to prevent its spread without hampering encryption.”
Interoperability Across Facebook’s Three Services
Zuckerberg’s manifesto calls for interoperability, or the ability for users to choose whichever service they prefer to send and receive messages. In other words, an Instagram Direct message might have originated with Facebook Messenger or WhatsApp — even SMS. (Acknowledging that the SMS protocol isn’t currently encrypted, Zuckerberg claims interoperability of SMS with Facebook messages would ensure encryption of these messages.)
Users would be able to maintain account separation across the three services, but to connect senders and recipients as part of an investigation, forensic examiners will need to go a step further. Metadata and any content fragments will have to help piece together who sent what, when, who received it, and which apps were used across devices.
That could be difficult, Pochron says, if Facebook follows Apple’s example of continuity. An iPhone and Mac computer connected via Bluetooth, he says, wouldn’t show a handoff between the two — only that an iMessage was “sent.”
If no clean artifact or metadata exists to show what device a message was sent from or whether it was received, in other words, an examiner may not be able to testify to this level of detail. Crognale says the best option in this case is to put the device — or account user — at a certain spot via network artifacts created at the same time a call was made (or vice versa).
That degree of interoperability could make it easier for bad actors to obfuscate their activity. Zuckerberg’s example, the ability for a user to use WhatsApp to receive messages sent to their Facebook account without sharing their phone number, as they currently do on Facebook Marketplace, could make it more difficult for investigators to tie activity to a particular device.
On the other hand, Crognale says this may not always be the case. Apps run in their own sandbox: like a “mini virtual machine,” she says, spun up for each app on the device. “Some applications contain flags to show where the message originated (i.e. WhatsApp or Instagram) so you can tell which app that user was interacting with when they made the post or sent the message,” she says.
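A sketch of what such an origin flag might look like in an interoperable message store. The `origin_app` column and table layout below are hypothetical, invented to illustrate the idea; real column names vary by app and version:

```python
import sqlite3

# Hypothetical interoperable message store: each row records which
# client app the message originated from.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (id INTEGER PRIMARY KEY, origin_app TEXT, body TEXT)"
)
conn.executemany(
    "INSERT INTO messages (origin_app, body) VALUES (?, ?)",
    [("whatsapp", "hi"), ("instagram", "dm"), ("whatsapp", "ok")],
)

# Tally messages per originating app to show which client the user
# was interacting with.
by_origin = dict(conn.execute(
    "SELECT origin_app, COUNT(*) FROM messages GROUP BY origin_app"
).fetchall())
print(by_origin)
```

If Facebook’s merged backend retains flags like this, an examiner could still attribute a message to the app, and potentially the device, where it was composed.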
Shavers says interoperability’s “consumer convenience” also helps because even if data isn’t stored in one database, it may be stored in another. “Basically, the more interconnected devices that a criminal may use, the more likely evidence will be strewn in too many locations to not be found by law enforcement,” he says. “Even when some of the evidence may be encrypted, and practically inaccessible, there may typically be enough evidence that is accessible to make a case.”
Even so, while some apps such as Viber “do a very good job at tracking every little thing in the application,” Crognale says, “others track the [bare] minimum.” According to her research, apps like WhatsApp, Instagram, and Facebook Messenger seem to fall somewhere in between.
The Mosaic of Metadata
Another key set of artifacts comes from message metadata. Even without content, metadata can be a powerful tool in helping investigators track who was talking to whom, how frequently, and over what period of time. Geolocation data, attachment data, and other metadata can be so important that in 2012, the United States Supreme Court considered whether the “mosaic” of data from disparate sources could be assembled into a separate search requiring its own warrant.
As Slate’s Ari Ezra Waldman pointed out, even with the incipient changes, Facebook can still collect plenty of metadata, and the integration and expansion of services means that Facebook could “piece together new kinds of data.”
Again looking to WhatsApp as an example, Cosimo Anglano’s 2015 research on artifact correlation showed how the contact list, event logs, sent and received files, and other WhatsApp artifacts on the Android platform could be correlated to reconstruct contact lists and message chronologies.
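The core of that correlation step can be sketched as a join between separately recovered artifacts. The schema below is a simplification for illustration, not the actual WhatsApp database layout: a contacts table resolves internal identifiers to names, and sorting the joined message log by timestamp rebuilds the chronology.

```python
import sqlite3

# Hypothetical simplified artifacts: a contacts table and a message log
# keyed on the same internal identifier (jid).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contacts (jid TEXT PRIMARY KEY, display_name TEXT);
    CREATE TABLE messages (jid TEXT, body TEXT, ts INTEGER);
""")
conn.executemany("INSERT INTO contacts VALUES (?, ?)",
                 [("u1", "Alice"), ("u2", "Bob")])
conn.executemany("INSERT INTO messages VALUES (?, ?, ?)",
                 [("u2", "second", 200), ("u1", "first", 100)])

# Join the artifacts and order by timestamp to reconstruct the chronology.
chronology = conn.execute("""
    SELECT c.display_name, m.body
    FROM messages m JOIN contacts c ON c.jid = m.jid
    ORDER BY m.ts
""").fetchall()
print(chronology)
```

The same pattern extends to event logs and transferred files: any artifact sharing an identifier or a timestamp can be folded into the timeline.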
This outcome appears to hold for other Facebook services as well, as a 2018 Recode article about Facebook’s Portal home assistant hardware describes. Built on the Messenger infrastructure, Portal collects “the same types of information (i.e. usage data such as length of calls, frequency of calls) that we collect on other Messenger-enabled devices,” a spokesperson quoted in the piece said.
This may be why Zuckerberg states an intent “to limit the amount of time we store messaging metadata… [and] to collect less personal data in the first place, which is the way WhatsApp was built from the outset.”
The collection of less metadata ostensibly means less evidence, says Shavers, but that doesn’t mean no evidence at all. “Even if a provider collects less data, or stores data for less time,” he explains, “the mere propagation of data across devices and providers means that data exists in so many places that investigators can practically target exactly what they want, if they know what they need.”
For example, he says, metadata from pen registers — phone numbers being dialed or called in, with dates, times, and length of calls — can alone tie suspects together, even without a wiretap warrant to collect the content of those calls.
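How little is needed to do that can be shown with a toy example. The numbers and records below are invented; the point is that caller, callee, time, and duration alone, with no content, already link numbers into a contact graph whose hubs stand out:

```python
from collections import defaultdict

# Pen-register-style records: caller, callee, timestamp, duration.
# All values below are fabricated for illustration.
calls = [
    ("555-0101", "555-0202", "2019-03-01T09:00", 120),
    ("555-0202", "555-0303", "2019-03-01T09:05", 45),
    ("555-0101", "555-0202", "2019-03-02T21:30", 600),
]

# Build an undirected contact graph from the metadata alone.
graph = defaultdict(set)
for caller, callee, _ts, _dur in calls:
    graph[caller].add(callee)
    graph[callee].add(caller)

# 555-0202 connects the other two numbers: a hub worth further scrutiny.
print(sorted(graph["555-0202"]))
```

Scaled up to thousands of records, the same structure reveals hubs and clusters that can tie suspects together long before any content warrant is sought.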
The same goes for email headers in harassment cases, social media connections, and so on. “Metadata is the easiest to get court approval to obtain, easiest technically to obtain, and there is so much of it that in combination with other metadata, builds up quite the file of evidence,” Shavers says.
The major challenge with metadata isn’t its lack of content, says Pochron, but whether a forensic tool can properly map the right fields. What’s available through the provider’s API, versus actual app functionality, can change or be structured differently.
That points to the need to validate forensic tools through manual methods. Crognale says most apps have fields and tables within one or multiple SQLite databases that store artifacts related to social media activity, direct messaging, phone calls, video chats, GPS, even photo editing. “While the amount of different flags may not provide an immediately clear way to identify the actions that have occurred on a device,” she explains, “these artifacts can be easily determined through various testing methods on devices running the same version of the application.”
Investigative Information Sharing and Collaboration
Zuckerberg’s assertion that Facebook “won’t store sensitive data in countries with weak records on human rights like privacy and freedom of expression in order to protect data from being improperly accessed” could make things more difficult for investigators in countries where human trafficking, child exploitation, and other crimes are prevalent.
Whether Facebook’s services could be blocked in these or other countries, as Zuckerberg anticipates, likely means simply that child exploitation investigators will need to remain savvy to the tools bad actors are using apart from Facebook Messenger, Instagram, and WhatsApp.
“This will always be an issue, in both civil and criminal investigations,” Shavers notes. “Even with agreements, [international] cooperation and communication will never be as seamless as it is within one country. In these instances, investigators should always consider that a copy of the data that they need may actually be stored locally (in their own country) by virtue of being on a suspect’s device or a service provider in-country.”
The device itself is still an option, too, says Crognale. “For those devices that we are fortunate enough to be able to jailbreak or root prior to acquiring them, we can usually still find some very interesting data,” she explains.
Much remains uncertain until Facebook starts to make its intended changes, but Pochron says businesses can prepare now by enacting stronger information governance programs and tighter controls over “shadow IT,” or the tendency of employees to use unapproved apps that could result in legal sanctions in the event of litigation.
Meanwhile, forensic examiners in both public and private sectors can take cues from prior research and experience as to what to expect. “Purge by design,” encryption, and interoperability features don’t help, but they may not end up hurting as much as they might appear. “All data is ephemeral by nature in some regard,” says Pochron, and with that in mind, forensic examiners can approach these latest changes with curiosity to learn.