Abstract
Privacy is a core aspect of our lives. We have the fundamental right to control our personal data, physically or virtually. However, as we use products from external vendors, particularly the FAANG companies (Facebook, Amazon, Apple, Netflix, Google), our digital footprint is continuously being expanded. Fortunately, FAANG provides a service that enables us to export our data to a local drive in just seconds. From now on, we will refer to it as the “export service.”
While this ease of use benefits the customer experience, it comes at a cost.
Attackers can use the export service to obtain extensive information about you, including PII (Personally Identifiable Information). To achieve their goal, they have a variety of initial entry points to choose from.
Some notable examples include cookie stealing, planting malware on your device or even gaining physical access to your data. Each of these operations could give threat actors access to your information. We’re going to show you how this can permanently ruin someone’s life.
This article will explore the hidden threats within your sensitive data and outline the mitigation steps you and your company can take to enhance your security posture.
Introduction
In Jack London’s wonderful tale White Fang, he describes the story of a lone wolf navigating a wilderness teeming with predators. Many characters in the story exploit Fang’s survival skills, much as threat actors can exploit the personal data collected by FAANG. While Fang does not necessarily intend to harm anyone, we all know that with great power comes great responsibility, and FAANG’s excessive data harvesting can quickly become a predator’s tool.
One of FAANG’s loyal customers is Joe. He is just an average guy who works for a large tech company and is generally security aware. We have analyzed his exported information using the export services of Apple, Google and Meta.
Based on our analysis, we will explore adversarial scenarios in which Joe’s personal data, as held by each of the companies mentioned above, could be exploited. For each scenario, we will present the associated risks through pseudocode as a proof of concept (POC). These POCs use various input files exported from the respective export services, while de-identification preserves Joe’s anonymity.
We primarily focus on the following adversarial tactics: defense evasion, lateral movement and reconnaissance. Reconnaissance is a challenging tactic for defenders, as there’s often very little they can do about it in terms of prevention.
The starting point of our story is a compromised personal account (that is, an unprivileged user in Apple, Google or Meta), from which we demonstrate how attackers can collect and exploit our personal data.
The full code examples, including the input file paths from the export that direct to the sensitive information, can be found in the following repository: WhiteFAANG.
Predators in the Wild
Using personal accounts in corporate environments is much more common than you might think.
Many seasoned tech industry professionals and non-tech users utilize their personal Google accounts for various tasks. Some examples include file sharing and writing documents, effectively creating a “creative” bypass of email filtering and DLPs (Data Loss Prevention systems). According to a CyberArk survey conducted with more than 14,000 participants, approximately 63% of employees reported using personal accounts on their work laptops, with Google being the most used platform.
Not only do these employees put their entire organization at risk of possible exfiltration attempts, but they also expose it to accidental password synchronization issues.
One company that suffered from this kind of behavior is Cisco. In 2022, a threat actor compromised a Cisco employee’s “personal” Google account. Unfortunately, the employee had entered sensitive corporate credentials while signed in to his Google account, which allowed the passwords to synchronize with that account. The attacker could then leverage these credentials to access the corporate VPN using MFA bypass techniques.
Another incident involving a personal Google account was the Okta support system breach. David Bradbury (Okta’s Chief Security Officer) said, “An employee had signed in to their personal Google profile … the username and password of the service account had been saved into the employee’s personal Google account.” The personal account was the initial entry point used by the attacker in what later became a major incident for Okta, affecting significant customers such as Cloudflare, 1Password and BeyondTrust.
At this point, it has become clear that our personal accounts serve as a common attack vector for threat actors. We will now cover adversarial scenarios for Apple, Google and Meta under the base assumption of a compromised personal account. Let’s start with Apple.
Apple
The output of Apple’s export service is structured in the following manner:
Figure 2 Apple export directory structure
The adversarial use cases are:
Find Physical Devices
Today, most organizations integrate multi-factor authentication (MFA) as a core practice in the authentication process. The MFA challenge is usually performed using edge devices such as mobile phones.
To bypass a corporate MFA flow, we first need to map Joe’s available edge devices and collect as much metadata as possible. The export allows us to gather the precise OS version of the active device and learn about Joe’s patching habits. An adversary can use these insights to pick vulnerabilities that match the device’s patch level.
Another meaningful piece of information available to us through the export involves Bluetooth accessories. The information is available from “AccessoryDeviceInfo.json” as part of the export.
Not only can we map Joe’s Bluetooth devices, but we can also access their respective MAC addresses.
By analyzing the mapping, we can exploit specific vulnerabilities, such as the recent AirPods CVE-2024-27867, which enables an attacker within Bluetooth range to eavesdrop on an AirPods microphone using only a MAC address. It is worth noting that Android users who use AirPods do not receive automatic updates and are, therefore, likely susceptible to this vulnerability.
Find a trusted device:
    Read JSON file "Devices Registered with Apple Messaging.json"
    Print os-version from devices array
Find Bluetooth devices:
    Read JSON file "AccessoryDeviceInfo.json"
    For accessory in devices
        print 'Accessory Name'
        print 'Bluetooth Mac Address'
Output |
iPhone OS,17.2.1,18D42 |
AirPods Pro 1a:2b:3c:4a:5b:6c |
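The Bluetooth step of the pseudocode above can be sketched in Python as follows. This is a minimal illustration rather than the code from the WhiteFAANG repository: the top-level `devices` key and the two field names come from the pseudocode, while the exact JSON layout of a real export, the `list_accessories` helper and the sample record are assumptions. It is also a quick way to review what your own export reveals.

```python
import json


def list_accessories(export_json: str):
    """Return (name, MAC) pairs from the text of AccessoryDeviceInfo.json.

    Assumes a top-level "devices" array whose entries carry the
    'Accessory Name' and 'Bluetooth Mac Address' fields named in the
    pseudocode; a real export may be laid out differently.
    """
    data = json.loads(export_json)
    pairs = []
    for accessory in data.get("devices", []):
        name = accessory.get("Accessory Name")
        mac = accessory.get("Bluetooth Mac Address")
        if name and mac:
            # Lower-case the MAC so the same address always compares equal.
            pairs.append((name, mac.lower()))
    return pairs


# Hypothetical, de-identified sample record (not real export data).
SAMPLE_EXPORT = json.dumps({
    "devices": [
        {"Accessory Name": "AirPods Pro",
         "Bluetooth Mac Address": "1A:2B:3C:4A:5B:6C"},
        {"Accessory Name": "Accessory without an address"},
    ]
})

for name, mac in list_accessories(SAMPLE_EXPORT):
    print(name, mac)
```

Entries without a MAC address are skipped, since the address is the piece of metadata this scenario depends on.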
Find ISP (Internet Service Provider) and Mobile Carrier Name
Reconnaissance is an integral part of every attack. We can look for mobile carrier information, which might let us move laterally into Joe’s primary mobile device using social engineering. The export, generated by the export service, contains a great deal of PII, including the last four digits of Joe’s credit card number (which can be found in “Billing Information History.csv” as part of the export). Banks and help desks, including those of Joe’s mobile carrier and ISP, commonly use these digits as a standard identification method.
We can combine vishing (voice phishing) with the last four digits of his card to manipulate the help desk employee to our advantage. In nature, this situation can be likened to quicksand. The target, in this case the help desk employee, feels safe and unsuspecting, much like someone walking on what they believe to be solid ground. Once lured in, they sink deeper and deeper into the trap, and the hidden dangers are revealed only after the damage is done. In the past, we have seen real incidents that involved MFA resets (the MGM attack), SIM swapping (an attack on U.S. insurers) and many others.
Another possible attack vector is vulnerable equipment, such as unpatched routers supplied by the ISP we discovered. Unpatched products lack critical security updates and, therefore, pose a serious risk.
Find IP company:
    Read CSV file "iTunes Payment Stack - Activity.csv"
    Extract IP Company column from file
    Find value in the column where value != None
    Print value
Find mobile carrier:
    Read CSV file "Subscription Click Activity.csv"
    For entry in file
        Use Regex to find pattern: carrier": "([^"]+)
        print match
Output |
Verizon |
AT&T |
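To make the carrier lookup concrete, here is a minimal Python sketch of the regex step. The pattern is the one given in the pseudocode. The `find_carriers` helper, the column names and the idea that the carrier sits inside a JSON-like cell are assumptions made for illustration, so a real “Subscription Click Activity.csv” may need adjustments.

```python
import csv
import io
import re

# Pattern from the pseudocode: capture the value that follows `carrier": "`.
CARRIER_RE = re.compile(r'carrier": "([^"]+)')


def find_carriers(csv_text: str):
    """Return the distinct carrier names found anywhere in the CSV cells."""
    carriers = []
    for row in csv.reader(io.StringIO(csv_text)):
        for cell in row:
            for match in CARRIER_RE.findall(cell):
                if match not in carriers:
                    carriers.append(match)
    return carriers


# Hypothetical sample: the column names and payload shape are assumptions.
SAMPLE_CSV = '''Event Date,Payload
2024-01-05,"{""carrier"": ""AT&T"", ""event"": ""click""}"
2024-01-06,"{""carrier"": ""AT&T"", ""event"": ""view""}"
'''

print(find_carriers(SAMPLE_CSV))
```

Scanning every cell rather than a named column keeps the sketch independent of the export’s actual header names.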
Find All Developers Who Created the Apps on Your iOS Device (Top 10, Sorted by Date)
Adversaries commonly use software supply chains to avoid defense mechanisms by targeting the weakest link. This attack vector can be leveraged by using our downloaded applications. The export contains information about every application we have ever downloaded to our iOS device. Each app developer represents an extension of trust, enlarging the attack surface.
We can flag the weakest app developers as candidates for a supply chain attack. Joe inherently trusts these providers, and compromising their CI/CD flows could, in turn, compromise his device.
Looking at the big picture, Joe has trusted more than 1,000 development organizations over the years. Did he intend to trust such a tremendous number of developers scattered across more than 50 countries? How many of these countries do he and his government consider hostile?
Find paid app providers:
    Read CSV file "Store Transaction History.csv"
    Extract Seller column
    Drop duplicates
    Count occurrences
    print count
    Sort by 'Item Purchased Date' in descending order
    Print top 10 rows
Find free app providers:
    Read CSV file "Store Transaction History - Free Apps.csv"
    Perform the same steps as for the paid apps
Output
The user trusted 431 different paid app providers and 842 free app providers.
Here are the top 10 paid providers:
Seller |
Audible, Inc. |
JoyTunes |
Apple Inc. |
Sony Music |
Microsoft Corporation |
Netflix, Inc. |
Spotify Ltd. |
Adobe Inc. |
Disney |
Nintendo Co., Ltd. |
Here are the top 10 free providers:
Seller |
Apple Inc. |
Duolingo |
Grammarly, Inc |
Google LLC |
OpenAI, L.L.C. |
Meta Platforms, Inc. |
Snap Inc. |
Zoom Video Communications, Inc. |
TikTok Inc. |
Imangi Studios (known for “Temple Run”) |
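The seller analysis can be sketched with the Python standard library as shown below. The `Seller` and `Item Purchased Date` column names come from the pseudocode. The `summarize_sellers` helper, the ISO date format and the sample rows are assumptions, and a real export may store dates differently.

```python
import csv
import io
from datetime import datetime


def summarize_sellers(csv_text: str, top: int = 10, date_format: str = "%Y-%m-%d"):
    """Return (number of distinct sellers, sellers ordered by latest purchase).

    date_format is an assumption; adjust it to the format used in the export.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # Newest purchases first, so each seller is ranked by its latest purchase.
    rows.sort(
        key=lambda row: datetime.strptime(row["Item Purchased Date"], date_format),
        reverse=True,
    )
    sellers = []
    for row in rows:
        seller = row["Seller"].strip()
        if seller and seller not in sellers:  # drop duplicates, keep order
            sellers.append(seller)
    return len(sellers), sellers[:top]


# Hypothetical sample rows (not real export data).
SAMPLE_CSV = '''Seller,Item Purchased Date
"Audible, Inc.",2024-03-01
JoyTunes,2024-02-11
"Audible, Inc.",2023-12-25
Apple Inc.,2022-07-04
'''

count, latest = summarize_sellers(SAMPLE_CSV)
print(f"The user trusted {count} different providers; most recent: {latest}")
```

The same function can be pointed at the free-apps file, which mirrors the “same steps” line of the pseudocode.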
Find the Top 3 Most Common Event Locations Based on Joe’s Personal Calendar
The export gives us access to Joe’s schedule via his calendar. It enables us to identify patterns in Joe’s routines, including the exact locations in which he is expected to be.
This information gives us various options if we wish to target Joe. We can map the physical security posture of the discovered locations to find the weakest link, which could enable physical damage, asset theft or espionage.
Based on our research, Apple even records events that were deleted from your calendar (deletions might indicate a desire to hide something).
Read ICS file "Joe.ics"
Read calendar from file
For event in calendar
    extract location
Count occurrences per location
Print top 3 rows
Output
Location | Count |
Shake Shack, 400 W 8th St, Los Angeles, CA 90014, United States | 9 |
Corgi Cafe, C. de la Indústria, 78, Gràcia, 08025 Barcelona, Spain | 6 |
South Jersey Sports Center, 100 Pike Rd Bldg C, Mt Laurel Township, NJ 08054, United States | 3 |
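The calendar analysis can be approximated without a dedicated iCalendar library. The sketch below is a simplified reader and not a full RFC 5545 parser. It unfolds continuation lines, reads `LOCATION` properties and undoes only the comma and semicolon escapes. The `top_locations` helper and the sample events are assumptions for illustration.

```python
from collections import Counter


def top_locations(ics_text: str, n: int = 3):
    """Return the n most common LOCATION values as (location, count) pairs."""
    # iCalendar folds long lines: a continuation starts with a space or tab.
    unfolded = []
    for line in ics_text.splitlines():
        if line[:1] in (" ", "\t") and unfolded:
            unfolded[-1] += line[1:]
        else:
            unfolded.append(line)

    counts = Counter()
    for line in unfolded:
        name, sep, value = line.partition(":")
        # The property name may carry parameters, e.g. LOCATION;LANGUAGE=en.
        if sep and name.split(";")[0].upper() == "LOCATION" and value.strip():
            # Simplified unescaping: only "\," and "\;" are handled here.
            value = value.replace("\\,", ",").replace("\\;", ";")
            counts[value.strip()] += 1
    return counts.most_common(n)


# Hypothetical sample calendar (not real export data).
SAMPLE_ICS = "\n".join([
    "BEGIN:VCALENDAR",
    "BEGIN:VEVENT",
    "LOCATION:Shake Shack\\, 400 W 8th St",
    "END:VEVENT",
    "BEGIN:VEVENT",
    "LOCATION:Shake Shack\\, 400 W",
    "  8th St",  # folded line: the first space is only the fold marker
    "END:VEVENT",
    "BEGIN:VEVENT",
    "LOCATION:Corgi Cafe",
    "END:VEVENT",
    "END:VCALENDAR",
])

for location, count in top_locations(SAMPLE_ICS):
    print(location, count)
```

Property parameters that contain a colon inside quotes would confuse this reader, so a dedicated iCalendar parser is the safer choice for real data.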
Google
The output of Google’s export service is structured in the following manner:
Figure 3 Google export directory structure
The adversarial use cases are:
Find the Top 3 User Agents Used by Joe
Identities are at the heart of most security incidents. Therefore, large enterprises implement security controls, which often include UBA (user behavior analytics) and ITDR (Identity Threat Detection and Response).
To bypass these security controls, an attacker can simulate the victim’s actions using the victim’s most common user agents. A user agent is a string that represents a client, including its software version and operating system. Like a wolf in sheep’s clothing, the threat actor disguises himself by adopting the victim’s identity, waiting for the right opportunity to strike.
Although user agents are considered a “weak” user identifier in comparison to other methods (IPs, session tokens, etc.), security vendors commonly integrate them into a multi-layered anomaly detection engine.
Read HTML file ".SubscriberInfo.html"
Extract user agent table from the file
Group by "Raw User Agents"
Count occurrences
Sort by count in descending order
Print top 3 rows
Output
User-Agents | Count |
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36 | 8 |
Mozilla/5.0 (iPhone; CPU iPhone OS 16_0 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.0 Mobile/15E148 Safari/604.1 | 5 |
Mozilla/5.0 (Linux; Android 13; SM-G991U) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/115.0.5790.136 Mobile Safari/537.36 | 3 |
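A rough Python sketch of the grouping step is shown below, using only the standard library. The “Raw User Agents” column name comes from the pseudocode. The `TableCells` and `top_user_agents` names, the assumption that the data sits in a plain HTML table with a header row, and the sample markup are all illustrative.

```python
from collections import Counter
from html.parser import HTMLParser


class TableCells(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def top_user_agents(html_text: str, column: str = "Raw User Agents", n: int = 3):
    """Count the values in the given column and return the n most common."""
    parser = TableCells()
    parser.feed(html_text)
    parser.close()
    counts = Counter()
    index = None
    for row in parser.rows:
        if column in row:  # header row: remember where the column sits
            index = row.index(column)
        elif index is not None and index < len(row) and row[index]:
            counts[row[index]] += 1
    return counts.most_common(n)


# Hypothetical sample markup; the other column names are assumptions.
SAMPLE_HTML = """
<table>
<tr><th>Timestamp</th><th>IP Address</th><th>Raw User Agents</th></tr>
<tr><td>t1</td><td>198.51.100.7</td><td>UA-A</td></tr>
<tr><td>t2</td><td>198.51.100.7</td><td>UA-B</td></tr>
<tr><td>t3</td><td>198.51.100.9</td><td>UA-A</td></tr>
</table>
"""

print(top_user_agents(SAMPLE_HTML))
```

Ranking is by occurrence count, which is what the output table above reports. Defenders can run the same count over their own logs to see which user agents an anomaly detector would treat as normal.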
Find Sensitive Data for Blackmail Purposes Using Past Searches
According to our research, the Google export contains every term you have searched for on Google in recent years. Adversaries could leverage this goldmine of confidential information for blackmail.
An attacker can build a list of custom keywords that he considers sensitive (focusing on addiction, financial trouble, etc.), and use the matches against Joe.
The export even includes terms typed locally in the browser URL bar without the user hitting “enter.” This behavior gives Joe a false sense of privacy, as Joe believes his keystrokes are local to his endpoint.
Read HTML file "My Activity.html"
For word in sensitive word list
    if word in file.text
        print word, text
Output
“loan”, You searched and visited “affordable loans for everyone!” |
“overdue”, You searched and visited “Your bill is overdue? Contact our lawyers now!” |
Meta
The output of Meta’s export service is structured in the following manner:
Figure 4 Meta export directory structure
The adversarial use cases are:
Find Joe’s Physical Location
The export can be used to find the exact user location (including postal code). Its accuracy depends on whether Joe has enabled the GPS settings on his mobile device. Otherwise, Meta will only use IP and publicly available information.
Knowing Joe’s home address allows us to target his Wi-Fi network using known methods, such as the Evil Twin and deauthentication attacks.
Another possible adversarial route is the use of social engineering or espionage practices.
Read HTML file "primary_location.html"
Find div '_2ph_ _a6-p'
For section in div
    if section isdigit
        assign location and break
Print location
Output
Address | postalCode |
redacted address in US | redacted postal code |
Find the Top 20 Posts That Captivated Joe’s Attention the Most
One of the more interesting data pieces adversaries may use against you is your post preferences. We found that the data collected about you over the past couple of years is so detailed that it records every Facebook post you view and the exact number of seconds it spends on your screen.
This highly intrusive data collection enables us to create a sophisticated social engineering campaign based on Joe’s interests by sorting the posts Joe viewed according to the time he spent on each one.
We can then use an LLM to auto-generate a phishing email tailored for Joe.
Define file pattern as page_.html
Define post array
For file in pattern
    read HTML file
    extract table
    extract headers
    for row in table
        extract cells
        map headers to cell value
        if cells > 2 and third cell contains a link
            extract URL
            extract Time
            add {URL, Time} entry to post array
Sort post array by Time
Pick top 10 posts
print post array
Output
Content | secondsViewd |
https://www.facebook.com/groups/politicsForFun/permalink/163343214453678/ | 2984.7 |
https://www.facebook.com/loanMasterMoneyTalks/videos/332116553356/ | 2690.2 |
Recap
What do we know about Joe so far?
Based on “personal” information only, we managed to map Joe’s most critical assets. These include his active MFA device, which likely serves as the active MFA factor for his corporate assets. Knowing the exact device metadata information (such as OS version) enables us to target this device effectively.
We have learned much about our victim’s digital footprint, including his common user agents, mobile carrier, IP company and primary location. The information gathered has helped us with defense evasion and reconnaissance of our target.
Using his Bluetooth devices (which are commonly used in a corporate office environment), we could listen in on Joe’s microphone and hear his deepest secrets. Is he having any problems with his wife? Or maybe he has been kind enough to share the password to his corporate VPN?
The voice samples we obtain can be fed into a deepfake model, facilitating easier social engineering attacks.
These adversarial examples are just the tip of the iceberg. Hundreds of other use cases lurk in the shadows, waiting to exploit your data. Some notable unexplored examples include guessing security questions using AI, extracting sensitive documents from Google Drive for social engineering purposes, password guessing to reduce brute-force search space and many more.
Attackers can leverage the scenarios we present in the article to create a fruitful reconnaissance framework that can be easily expanded to include various new techniques.
Joe’s life may never be the same, as the threat actor targeted all of Joe’s most important domains, as described in the diagram below.
Figure 5 Joe’s life ruined diagram
Mitigation
So, what can we do to protect ourselves?
- Use strong phishing-resistant MFA for all accounts, while ensuring proper password complexity. Many do not view social media accounts as “sensitive” and ignore critical security controls.
- Do not sync passwords between your personal and work accounts. It is easy to do this accidentally, so the best practice is to avoid using personal accounts in corporate environments altogether. If you have already synced one of your passwords to your Google account, visit Google Password Manager to view and remove it.
- Monitor personal account export actions as sensitive operations and respond accordingly. This detection should be integrated with an effective ITDR strategy.
- Request the deletion of your personal data for idle accounts. This is your legal right under GDPR and similar regulations, known as the “right to be forgotten.”
- Perform a secure disk wipe of the local export information by overwriting the relevant section of the local drive with random 0s and 1s to permanently delete the data. This action can be done using the shred command on Linux or SDelete (Sysinternals) on Windows. Use caution when operating these tools, as files will not be recoverable after deletion.
- Use an enterprise-grade protected browser. These products enable auditing capabilities while reducing the attack surface originating from web session data, password and form autofill syncs, risky commands, and personal data uploads and downloads.
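To illustrate the overwrite-then-delete idea behind the disk wipe recommendation, here is a minimal Python sketch. It is not a replacement for shred or SDelete. The `overwrite_and_delete` helper is hypothetical, and on SSDs, copy-on-write or journaling file systems, and cloud-synced folders, an in-place overwrite may not reach every physical copy of the data.

```python
import os


def overwrite_and_delete(path: str, passes: int = 1, chunk_size: int = 1024 * 1024):
    """Overwrite a file in place with random bytes, then remove it.

    Illustrative only: the storage medium or file system may keep old
    copies that this function cannot reach.
    """
    size = os.path.getsize(path)
    with open(path, "r+b") as handle:
        for _ in range(passes):
            handle.seek(0)
            remaining = size
            while remaining > 0:
                step = min(chunk_size, remaining)
                handle.write(os.urandom(step))
                remaining -= step
            # Push the overwritten bytes from the OS cache to the device.
            handle.flush()
            os.fsync(handle.fileno())
    os.remove(path)
```

As with the dedicated tools, the file cannot be recovered afterwards, so double-check the path before calling the function on an extracted export.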
Conclusion
In the predatory landscape in which we operate, threat actors can use our personal information against us. This not only allows threat actors to harm us, but it also puts our employer at risk. We should take responsibility by treating personal information as a critical asset, applying security hygiene practices (MFA, password complexity) and the proactive observability measures (detection and response) set by the blue teams. By employing these practices, we can avoid the next major breach while keeping our due right to privacy.
The attack vectors we have covered are expected to evolve drastically in the future in terms of both variety and quantity. Let’s ensure that, as a community, we are well-prepared and vigilant in securing our precious information.
Lior Yakim is a threat researcher at CyberArk Labs.