Last updated: Nov 15, 2024, 11:30 a.m. PT
A threat actor has been observed using stolen customer credentials to target organizations utilizing Snowflake databases in a campaign focused on data theft and extortion. The full extent of their activities is still under investigation, and we will update this section as new information becomes available.
Background
- The threat actor is running an emerging data theft and extortion campaign against organizations that use Snowflake databases.
- The threat actor primarily exploited environments lacking two-factor authentication (2FA) and originated from commercial VPN IPs.
- An attack tool named “rapeflake” has been identified in these incidents, though detailed information about the tool itself remains unknown.
- The threat actor has directly extorted organizations, further pressuring them by publicly posting stolen data for sale on hacker forums.
Mitiga customers were already notified if they were affected. To learn more about how this threat can be detected and investigated in Snowflake environments, continue reading.
November 2024 Update: Threat Actor Arrest and Indictment
In a significant development in November 2024, Canadian authorities arrested Alexander "Connor" Moucka, known online by the aliases Judische and Waifu (The Hacker News). Moucka was implicated in a series of cyberattacks targeting customers of Snowflake’s cloud data platform. Working with accomplice John Erin Binns, Moucka allegedly exploited credentials stolen via infostealer malware to infiltrate accounts that lacked multi-factor authentication (MFA). This operation reportedly impacted over 165 organizations, including prominent companies such as AT&T, Santander, and Ticketmaster, exposing their sensitive data and operational vulnerabilities.
The attackers’ methods extended beyond data theft to extortion, with Moucka and his group demanding payments to prevent the sale or public release of stolen information. In one high-profile instance, the group is alleged to have stolen 50 billion customer call and text records from a major U.S. telecommunications company, widely believed to be AT&T, which reportedly paid $370,000 to secure the deletion of the compromised data. The group is reported to have extorted a total of $2.5 million across multiple victims, underscoring the lucrative and dangerous nature of these cybercrimes.
Moucka's arrest followed a request for extradition by U.S. authorities, showcasing international collaboration in addressing cross-border cybercrime. His indictment also shed light on a larger cybercrime syndicate known as "the Com," to which Moucka is believed to belong. This group is notorious for blending digital and physical theft methods to execute financially motivated attacks. The case underscores the importance of securing cloud environments like Snowflake, not only by implementing MFA but also by adopting robust threat-hunting practices to detect and neutralize such sophisticated adversaries before breaches occur.
What is Snowflake?
Snowflake is a cloud-based data warehousing and analytics platform designed to handle large-scale data storage and processing. It offers a highly scalable architecture that enables organizations to manage and analyze massive amounts of data seamlessly. Snowflake is widely utilized across various industries for its ability to consolidate data from multiple sources, providing robust support for data-driven decision-making and advanced analytics.
Snowflake is highly popular, boasting 9,437 global customers and holding a significant 21.51% market share in the data warehousing market. Its widespread adoption across various industries underscores its robust capabilities and efficiency in handling large-scale data operations.
Conducting a Threat Hunt in Snowflake Environments
Proactive measures you can take to verify your Snowflake environment is secure:
- SSO Enforcement: Single Sign-On (SSO) may be in place, but is it truly enforced? Users may still be able to authenticate directly to Snowflake with a username and password, bypassing SSO. Double-check to prevent unauthorized access.
- MFA (Multi-Factor Authentication) Enforcement: Is MFA enforced across your organization? Ensure it is mandatory for all users, not merely self-enrolled, to add an extra layer of security (a hunting sketch for spotting password-only logins follows this list).
- Network Exposure: Is your Snowflake database exposed to the internet? Consider using PrivateLink to limit exposure, or whitelist access only for authorized IP addresses to enhance network security.
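From the hunting side, the first two items can be verified by looking for successful logins that used only a password. Below is a minimal PySpark sketch, assuming the LOGIN_HISTORY view has been loaded into a DataFrame named login_history_df (a loading sketch appears later in this post) and using the documented FIRST_AUTHENTICATION_FACTOR and SECOND_AUTHENTICATION_FACTOR columns:

from pyspark.sql.functions import col

def password_only_logins(login_history_df):
    # Successful logins where the first factor was a password and no second
    # factor was recorded -- i.e., the login bypassed both SSO and MFA
    return (login_history_df
            .where(col("IS_SUCCESS") == "YES")
            .where(col("FIRST_AUTHENTICATION_FACTOR") == "PASSWORD")
            .where(col("SECOND_AUTHENTICATION_FACTOR").isNull())
            .select("EVENT_TIMESTAMP", "USER_NAME", "CLIENT_IP", "REPORTED_CLIENT_TYPE"))

Any account that appears in these results is authenticating outside your SSO and MFA controls and should be reviewed.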
What is Threat Hunting?
Threat hunting is a proactive cybersecurity practice where analysts actively search through networks, systems, and databases to detect and isolate potential security threats that may have bypassed traditional security measures. Unlike reactive measures that respond to detected threats, threat hunting involves continuous monitoring and analysis to uncover hidden or emerging threats before they can cause significant damage.
How to Conduct a Threat Hunt in Snowflake
To investigate a potential breach in your Snowflake environment and detect any possible data exfiltration, organizations can leverage the forensic information available within Snowflake’s built-in database and schemas.
Forensic Information for Threat Hunting:
In every Snowflake environment, there is a database named "Snowflake" housing a schema called "ACCOUNT_USAGE." This schema holds metadata and historical usage data for the current Snowflake account, updating with each action taken, providing a comprehensive audit trail.
Key Views in the ACCOUNT_USAGE Schema:
- QUERY_HISTORY:
- Logs all queries run within the account, aiding in identifying suspicious or unauthorized queries potentially indicative of data exfiltration attempts.
- See the Snowflake documentation for the full column list.
- LOGIN_HISTORY:
- Tracks all login attempts, facilitating the detection of irregular login activity, such as repeated failed attempts or logins from unfamiliar locations.
- See the Snowflake documentation for the full column list.
- SESSIONS:
- Captures details of all created sessions, allowing monitoring of session activity and detection of anomalies in session behavior.
- See the Snowflake documentation for the full column list.
- ACCESS_HISTORY (Requires at least Enterprise Edition):
- Records the objects each query read or wrote, crucial for monitoring user actions and identifying unauthorized access or data manipulation.
- See the Snowflake documentation for the full column list.
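Before hunting, these views must be pulled into your analysis environment. The PySpark examples in this post assume DataFrames named query_history_df and login_history_df; one way to create them is with the Snowflake Spark connector. This is a minimal sketch under that assumption, and every connection value is a placeholder you must replace:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("snowflake-threat-hunt").getOrCreate()

# All connection values below are placeholders -- replace with your own
sf_options = {
    "sfURL": "<account_identifier>.snowflakecomputing.com",
    "sfUser": "<hunting_user>",
    "sfPassword": "<password_or_token>",
    "sfDatabase": "SNOWFLAKE",
    "sfSchema": "ACCOUNT_USAGE",
    "sfWarehouse": "<warehouse>",
}

query_history_df = (spark.read.format("net.snowflake.spark.snowflake")
                    .options(**sf_options)
                    .option("dbtable", "QUERY_HISTORY")
                    .load())
login_history_df = (spark.read.format("net.snowflake.spark.snowflake")
                    .options(**sf_options)
                    .option("dbtable", "LOGIN_HISTORY")
                    .load())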
In the following sections, we'll demonstrate how you can leverage the QUERY_HISTORY and LOGIN_HISTORY logs to effectively identify and investigate suspicious behavior within your Snowflake environment.
Analyzing Query History for Anomalies
In the "QUERY_HISTORY" view, we focus on spotting unusual user activities, which could suggest data exfiltration attempts. Here are examples of what to look out for:
- More Data Scanned than Average: Detect users scanning far more data than their baseline, which might indicate unauthorized access (a sketch follows this list).
- More Data Written to Results than Average: Identify users writing excessive data to query results, possibly extracting data improperly.
- Accessing an Unusually High Number of Warehouses or Databases: Flag users accessing far more warehouses or databases than usual, which could signal unauthorized exploration.
- Anomaly Detection in New Daily Resource Access: Compare each day's new resource accesses to a historical baseline to detect significant deviations, surfacing users who accessed an anomalous number of new resources (databases, warehouses) in a single day.
- Rare Client Applications Used by a User: Highlight uncommon client applications, as they might indicate suspicious tooling.
- Exfiltration Through an Inline URL to an External Cloud Storage Location: Identify instances where data might be copied from Snowflake tables to external cloud storage, such as Amazon S3, Google Cloud Storage, or Azure Blob Storage, by scanning for the COPY INTO command followed by a valid URL.
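As a sketch of the first detection, the function below baselines each user's daily BYTES_SCANNED and flags days far above that user's own average; the 3-standard-deviation threshold is an illustrative assumption, not a tuned value:

from pyspark.sql.functions import avg, col, stddev_pop, sum as sum_, to_date

def more_data_scanned_than_average(query_history_df):
    # Total bytes scanned per user per day
    daily = (query_history_df
             .withColumn("DATE", to_date(col("START_TIME")))
             .groupBy("DATE", "USER_NAME")
             .agg(sum_("BYTES_SCANNED").alias("daily_bytes_scanned")))
    # Per-user baseline: average and standard deviation of the daily totals
    baseline = (daily.groupBy("USER_NAME")
                .agg(avg("daily_bytes_scanned").alias("avg_bytes"),
                     stddev_pop("daily_bytes_scanned").alias("stddev_bytes")))
    joined = daily.join(baseline, on="USER_NAME")
    # Flag days more than 3 standard deviations above the user's own average
    return joined.where(col("daily_bytes_scanned") > col("avg_bytes") + 3 * col("stddev_bytes"))

The same pattern applies to BYTES_WRITTEN_TO_RESULT for the second detection. For the last detection, exfiltration through an inline URL, the following function scans QUERY_TEXT for COPY INTO statements that reference an external location: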
from pyspark.sql.functions import col, lower, regexp_extract, when

def inline_url_exfil(query_history_df):
    # Regex for a URL following the keyword "COPY INTO" with an inline external
    # location; an optional quote is allowed because the location is usually
    # quoted, e.g. COPY INTO 's3://bucket/path' ...
    # Ref: https://docs.snowflake.com/en/sql-reference/sql/copy-into-location
    url_pattern = r"copy into\s+'?((?:s3|gcs|azure)://[^'\s]+|https?://[^'\s]+)"
    # Convert QUERY_TEXT to lowercase and keep only rows containing "copy into"
    filtered_df = query_history_df.withColumn('QUERY_TEXT', lower(col('QUERY_TEXT')))
    filtered_df = filtered_df.where(col('QUERY_TEXT').contains("copy into"))
    # Extract the URL from the lowercased QUERY_TEXT
    filtered_df = filtered_df.withColumn('EXTRACTED_URL', regexp_extract(col('QUERY_TEXT'), url_pattern, 1))
    # Indicate whether a URL was found
    filtered_df = filtered_df.withColumn('URL_FOUND', when(col('EXTRACTED_URL') != '', True).otherwise(False))
    final_df = filtered_df.where(col('URL_FOUND'))
    return final_df
The function extracts any URL found in QUERY_TEXT and flags those instances. If an extracted URL is unfamiliar or not part of your organization's regular data transfer locations, treat it as a red flag for potential malicious activity: an attacker could leverage this method to exfiltrate sensitive data to an external storage location under their control.
By default, Snowflake allows the use of COPY INTO <location> to unload data to external URLs. To mitigate this risk and prevent unauthorized data exfiltration, you can configure Snowflake to block such actions. This can be done by setting the PREVENT_UNLOAD_TO_INLINE_URL parameter to true. This setting ensures that attempts to unload data to an inline URL are automatically prevented, adding an essential layer of security to your Snowflake environment.
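As an illustration, here is a minimal sketch using the snowflake-connector-python package to set the parameter; the connection values are placeholders, and the change requires an appropriately privileged role (typically ACCOUNTADMIN):

import snowflake.connector

# Placeholder connection values -- replace with your own
conn = snowflake.connector.connect(
    account="<account_identifier>",
    user="<admin_user>",
    password="<password>",
)
try:
    # Block COPY INTO <location> unloads to inline external URLs account-wide
    conn.cursor().execute(
        "ALTER ACCOUNT SET PREVENT_UNLOAD_TO_INLINE_URL = TRUE")
finally:
    conn.close()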
In the QUERY_HISTORY view, various detections help identify outliers. During threat hunting, one standout is Anomaly Detection in New Daily Resource Access, which serves as an excellent lead generator for further investigation. This detection method combines two layers of anomaly detection:
- Identifying New Resource Access: Detecting resources that a user accesses for the first time in recorded history offers valuable insight. To ensure reliable results, the historical data must be of sufficient length, and when reviewing the results it is crucial to verify that a flagged day is not the user's first appearance in the history; otherwise, all of that user's resources would be flagged as new.
- Calculating Daily Average and Standard Deviation: By calculating the average and standard deviation for the daily number of new resource accesses, we can identify days that stand out. Anomalies may indicate unusual patterns of resource access, warranting further investigation.
Below, we present our implementation in PySpark for the described logic:
from pyspark.sql import Window
from pyspark.sql.functions import (avg, col, collect_set, count, row_number,
                                   stddev_pop, to_date)

def snowflake_query_first_time_db_anomaly(query_history_df):
    query_history_df = query_history_df.withColumn("START_TIME", col("START_TIME").cast("timestamp"))
    # Define a window per user and database, ordered by query start time
    window_spec = Window.partitionBy("USER_NAME", "DATABASE_NAME").orderBy("START_TIME")
    first_time_df = query_history_df.withColumn("row_number", row_number().over(window_spec))
    # Keep only rows with row number 1, indicating the first-ever access
    first_time_df = first_time_df.filter(col("row_number") == 1)
    # Group the detections daily
    first_time_df = first_time_df.withColumn("DATE", to_date("START_TIME"))
    # Group by date and user name to count the new resources accessed on the same day
    daily_new_resource_count = first_time_df.groupBy("DATE", "USER_NAME").agg(
        count("*").alias("count"),
        collect_set("DATABASE_ID").alias("database_ids"),
        collect_set("DATABASE_NAME").alias("database_names"))
    # Calculate the average and standard deviation of daily new resource accesses
    daily_stats = daily_new_resource_count.select(
        avg("count").alias("avg_count"),
        stddev_pop("count").alias("stddev_count")).collect()[0]
    average_count = daily_stats["avg_count"]
    stddev_count = daily_stats["stddev_count"]
    upper_bound = average_count + 3 * stddev_count
    lower_bound = average_count - 3 * stddev_count
    # Keep only days whose new-resource count is an outlier (beyond 3 sigma)
    final_df = daily_new_resource_count.filter((col("count") > upper_bound) | (col("count") < lower_bound))
    return final_df
A subsequent investigation may involve examining database names, the success or failure of attempts, and reasons for access denial. For instance, an attacker exploring the Snowflake environment may attempt to access databases they lack permission for, resulting in an excessive number of insufficient privilege errors.
from pyspark.sql.functions import col, to_date, when

def calculate_daily_error_percentage(query_history_df):
    # Derive a calendar date from the query start time
    query_history_df = query_history_df.withColumn("date", to_date(col("START_TIME")))
    # Failed queries have a non-null ERROR_CODE
    filtered_df = query_history_df.where(col("ERROR_CODE").isNotNull())
    error_stats = filtered_df.groupBy("date", "USER_NAME").count()
    error_stats = error_stats.withColumnRenamed("count", "error_count")
    total_queries = query_history_df.groupBy("date", "USER_NAME").count()
    total_queries = total_queries.withColumnRenamed("count", "total_queries")
    final_df = total_queries.join(error_stats, on=["date", "USER_NAME"], how="left")
    # Users with no errors on a given day get 0 instead of null
    final_df = final_df.withColumn("daily_error_percentage",
                                   when(col("error_count").isNull(), 0)
                                   .otherwise(col("error_count") / col("total_queries")))
    return final_df
By keeping an eye on these anomalies in the "QUERY_HISTORY" view, organizations can better detect and respond to potential security threats in their Snowflake environment.
Analyzing Login History for Suspicious Activity
In the "LOGIN_HISTORY" view, our goal is to identify unusual IP addresses and detect suspicious login patterns, such as brute force attacks. Here's what to focus on:
- Rare IP Addresses: Detect IP addresses that rarely appear in login history logs. These uncommon IPs may signal potential security risks and merit further scrutiny.
- Brute-Force Detection: Implement a detection mechanism for patterns of multiple failed login attempts from the same IP address within a short timeframe (a sketch of this and the rare-IP check follows this list).
- Threat Intelligence Integration: Utilize threat intelligence data to evaluate the risk associated with rare IP addresses. Look for traits such as anonymous VPNs, TOR exit nodes, public proxies, hosting providers, and other indicators of suspicious activity.
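A minimal sketch of the first two checks, again assuming LOGIN_HISTORY is loaded as login_history_df; the rarity and brute-force thresholds are illustrative assumptions to tune against your own baseline:

from pyspark.sql import functions as F

def rare_ips(login_history_df, max_occurrences=3):
    # IPs that appear only a handful of times across the whole login history
    return (login_history_df.groupBy("CLIENT_IP")
            .agg(F.count("*").alias("login_count"),
                 F.collect_set("USER_NAME").alias("users"))
            .where(F.col("login_count") <= max_occurrences))

def brute_force_candidates(login_history_df, window_minutes=10, threshold=5):
    # Many failed logins from a single IP within a short tumbling window
    failed = login_history_df.where(F.col("IS_SUCCESS") == "NO")
    return (failed
            .groupBy(F.window(F.col("EVENT_TIMESTAMP").cast("timestamp"),
                              f"{window_minutes} minutes"),
                     F.col("CLIENT_IP"))
            .agg(F.count("*").alias("failed_attempts"),
                 F.collect_set("USER_NAME").alias("targeted_users"))
            .where(F.col("failed_attempts") >= threshold))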
By monitoring these anomalies in the "LOGIN_HISTORY" view, organizations can enhance their ability to detect and respond to potential security threats in their Snowflake environment.
Next Steps of Action
Once threat hunters identify suspicious behavior through QUERY_HISTORY and LOGIN_HISTORY logs, their next steps are crucial in uncovering the extent of potential threats. Upon detecting anomalies, such as unusual data scanning or rare IP addresses in login attempts, it's imperative for investigators to delve deeper. This involves conducting thorough analysis of associated user activities, cross-referencing contextual information, and correlating findings with other relevant logs or external threat intelligence sources.
One effective follow-up technique is to pivot on the SESSION_ID column in the QUERY_HISTORY view. This enables tracking of all activity associated with a potentially compromised user session, including every executed query. By examining related activity within the same session, analysts can assess the risk level, understand what data was accessed, and potentially identify any exfiltrated information.
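A minimal sketch of that pivot, where suspicious_session_id comes from an earlier detection:

from pyspark.sql.functions import col

def queries_for_session(query_history_df, suspicious_session_id):
    # Every query executed in the suspicious session, in chronological order
    return (query_history_df
            .where(col("SESSION_ID") == suspicious_session_id)
            .orderBy("START_TIME")
            .select("START_TIME", "USER_NAME", "QUERY_TEXT", "DATABASE_NAME",
                    "BYTES_SCANNED", "BYTES_WRITTEN_TO_RESULT"))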
Mitiga's Response: Assisting Customers Amid the Snowflake Campaign
To ensure our clients were well-prepared and protected, we reached out with a preliminary alert informing them about the emerging campaign. This alert included essential information on potential risks and initial steps to safeguard their Snowflake environments. Following this, we conducted a dedicated, event-driven threat hunt tailored to each client's environment, meticulously tracking Indicators of Compromise (IOCs) and unusual activity within their systems.
Our threat hunt process involved analyzing key forensic information from Snowflake’s built-in database and schemas. We looked for anomalies such as unusual data scans or excessive data writing, which could indicate unauthorized access. We also scrutinized rare IP addresses, integrating threat intelligence to assess their risk levels.
Additionally, we developed custom Indicators of Attack (IOAs) to suit the unique nature of this threat, enhancing our ability to detect and mitigate potential breaches. Understanding the importance of proper security configurations, we collaborated closely with our clients to verify and optimize the security settings of their Snowflake instances. This included implementing strong authentication mechanisms, proper user permissions, and comprehensive logging and monitoring setups.
We invite organizations concerned about their Snowflake security to reach out to us. Our expert team is ready to assist in assessing your current security configurations and conducting thorough threat hunts to protect your valuable data.
Summary
In today's dynamic threat landscape, collecting forensic data from SaaS environments like Snowflake is paramount to safeguarding against potential security breaches. By proactively conducting continuous threat hunts and analyzing logs such as QUERY_HISTORY and LOGIN_HISTORY, organizations can detect and respond to suspicious activities effectively. These measures not only aid in identifying anomalies like unusual data scanning or rare IP addresses but also enable deeper investigations into potential threats. As the Snowflake campaign incident demonstrates, staying vigilant and leveraging advanced analytics are essential for mitigating risks and protecting critical data assets. Moving forward, we commit to updating this blog with any developments related to this campaign and the threat group, ensuring our readers stay informed and empowered in their cybersecurity efforts.
Learn more about this ongoing threat plus how researchers are helping teams understand the impact by watching this webinar.