CODASPY '25: Proceedings of the Fifteenth ACM Conference on Data and Application Security and Privacy


SESSION: Keynote Talks

Privacy - From the Ivory Tower to the Trenches in the Parliament

Privacy is a fundamental human right, but it is also one of the more complicated rights. Not only is privacy highly context- and society-dependent, it is also hard to define without involving adversaries trying to break it. The perceived tension between "national security" or "public order" and the basic human right of privacy impacts the adoption of Privacy Enhancing Technologies (PETs) and, at times, hinders the ability to offer privacy to users.

In this talk, we will discuss a few topics where such a conflict exists, such as biometric data, contact tracing, and COVID certificates. We will discuss the importance of incorporating privacy into our technical work, thus making a real-life impact by offering solutions that offer better privacy. We will discuss the problems in disseminating state-of-the-art scientific knowledge to the public.

We will then consider the problem of discussing these issues with policy makers, as they are the ones making the call, while not necessarily understanding the impact of complicated technologies for (and on) privacy.

Covert Social Influence Operations: Past, Present, and Future

Covert Social Influence Operations (CSIOs) have been studied for almost a dozen years. Since a first study of CSIOs in the 2014 Indian election and the DARPA Twitter Influence Bot Detection Challenge of 2015 under the SMISC Program, the field has come a long way. After a quick review of CSIOs of the past, this talk will quickly move on to how recent advances in AI will influence the direction of CSIOs. We can think of CSIOs as involving a threat actor (CSIO operator) targeting a defender (e.g., a social platform). Though the extraordinary ability of modern AI to generate realistic text, image, video, audio, and multimodal content poses a potential threat, I will argue that the even more extraordinary ability of AI to dynamically adapt to changing circumstances and defender tactics will likely pose an even bigger threat. (The second part of this talk reflects joint work with Saurabh Kumar, Valerio La Gatta, Andrea Pugliese, Andrew Pulver, Jiazhi Zhang, and Youzhi Zhang.)

SESSION: Session 1: Web and Browser Security

CodeX: Contextual Flow Tracking for Browser Extensions

Browser extensions put millions of users at risk when misusing their elevated privileges. Despite the current practice of semi-automated code vetting, privacy-violating extensions still thrive in the official stores. We propose an approach for tracking contextual flows from browser-specific sensitive sources, such as cookies, browsing history, bookmarks, and search terms, to suspicious network-request sinks. We demonstrate the effectiveness of the approach with a prototype called CodeX that leverages the power of CodeQL while breaking away from the conservativeness of the bug-finding flavors of traditional CodeQL taint analysis. Applying CodeX to the extensions published on the Chrome Web Store between March 2021 and March 2024 identified 1,588 extensions with risky flows. Manual verification of 339 of those extensions resulted in flagging 212 as privacy-violating, impacting up to 3.6M users.
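
For intuition, the kind of contextual flow CodeX looks for can be pictured as reachability from a sensitive browser API to a network sink in an extension's data-flow graph. The following is only a minimal sketch of that idea (not the authors' CodeQL queries); the source and sink names and the example graph are illustrative.

```python
# Minimal sketch: flag flows from browser-specific sensitive sources to network sinks
# via reachability over a (hypothetical) extension data-flow graph.
from collections import deque

SENSITIVE_SOURCES = {"chrome.cookies.getAll", "chrome.history.search",
                     "chrome.bookmarks.getTree", "chrome.omnibox.onInputEntered"}
NETWORK_SINKS = {"fetch", "XMLHttpRequest.send", "navigator.sendBeacon"}

def risky_flows(dataflow_edges):
    """dataflow_edges: dict mapping a node (API call or variable) to the set of
    nodes its value flows into. Returns (source, sink) pairs that are reachable."""
    flows = []
    for src in SENSITIVE_SOURCES & dataflow_edges.keys():
        seen, queue = {src}, deque([src])
        while queue:
            node = queue.popleft()
            if node in NETWORK_SINKS:
                flows.append((src, node))
            for nxt in dataflow_edges.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return flows

# Example: cookie data flows into a payload that is later sent over the network.
edges = {
    "chrome.cookies.getAll": {"cookieData"},
    "cookieData": {"payload"},
    "payload": {"fetch"},
}
print(risky_flows(edges))  # [('chrome.cookies.getAll', 'fetch')]
```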

Coding Malware in Fancy Programming Languages for Fun and Profit

The continuous increase in malware samples, both in sophistication and number, presents many challenges for organizations and analysts, who must cope with thousands of new heterogeneous samples daily. This requires robust methods to quickly determine whether a file is malicious. Due to its speed and efficiency, static analysis is the first line of defense.

In this work, we illustrate how the practical state-of-the-art methods used by antivirus solutions may fail to detect evident malware traces. The reason is that they depend heavily on very strict signatures, where minor deviations prevent them from detecting shellcode that would otherwise be flagged immediately as malicious. Thus, our findings illustrate that malware authors may drastically decrease detection rates by porting their code base to less commonly used programming languages. To this end, we study the features that such programming languages introduce in executables and the practical issues that arise for practitioners trying to detect malicious activity.

SemFinder: A Semantics-Based Approach to Enhance Vulnerability Analysis in Web Applications

Modern web applications are becoming increasingly complex. They include multiple dynamic runtime constructs that are difficult to analyze by static application security testing (SAST) tools. These tools often use a graph representation of the code for their analysis. However, built statically, such graphs may miss important data and control flows dependent on runtime information. In addition, the presence of difficult-to-analyze code patterns in modern web applications, referred to as testability tarpits, further reduces the accuracy of statically built graphs. As a result, current SAST tools have several false negatives because of 'hidden' paths, which are not present in the graphs. In this paper, we present SemFinder, an approach designed to automatically detect such hidden paths. SemFinder uses natural language semantics to hypothesize connections between different locations in the code based on the meaning and similarity of the variables in those locations and test those hypotheses dynamically. We evaluate SemFinder on 30 PHP applications and discover 215 new exploitable hidden paths with respect to existing SAST tools, leading to the submission of 31 new CVEs.

Evaluating Website Data Leaks through Spam Collection on Honeypots

Nowadays, people rely heavily on online services in their daily lives, such as communication, education, shopping, and entertainment. While online services offer convenience in daily living, users often receive large amounts of spam as a result. While previous studies have linked spam receipt primarily to user behavior, this research proposes that spam can serve as a forensic indicator of data leaks by websites. To test our hypothesis, we conducted an experiment deploying 148 honeypots across 370 websites spanning 12 communities. We monitored and audited the spam received by our honeypots for 47 weeks and analyzed its nature, patterns, and origin. The results reveal that some legitimate websites leak user data despite having privacy policy statements. The findings also highlight that some websites automatically enroll users in newsletters or mailing lists without asking for consent during sign-up. This issue arises from conflating privacy policies with spam subscription and third-party sharing agreements. To address these issues, we suggest that regulators require websites to separate subscription agreements from privacy policy statements and that direct consent for third-party sharing be requested at sign-up. Websites should also evaluate their third-party chains to ensure user data protection.

Enhanced Threat Modeling and Attack Scenario Generation for OAuth 2.0 Implementations: Data/Toolset paper

OAuth 2.0 is a widely adopted authorization framework enabling secure, delegated access to resources on behalf of a user. While the protocol is robust when implemented correctly, real-world deployments often exhibit vulnerabilities due to misconfigurations, incomplete mitigations, or misunderstandings of its intricacies. (Semi-)automated testing is therefore essential to identify and address these security flaws. Among available tools, OAuch offers the most comprehensive benchmark for assessing OAuth IdP implementations by identifying potential threats based on the OAuth threat model and related standards. However, OAuch has notable limitations, including an incomplete threat model, ambiguous threat classifications, and a lack of support for multi-vulnerability attack scenarios. This paper presents enhancements to OAuch that improve the tool's usability, including enriched metadata, the introduction of attack scenarios for multi-threat analyses, and a likelihood assessment to prioritize mitigation efforts.

SESSION: Session 2: Differential Privacy

Harmonizing Differential Privacy Mechanisms for Federated Learning: Boosting Accuracy and Convergence

Differentially private federated learning (DP-FL) offers a compelling approach to collaborative model training by ensuring robust privacy for clients. Despite its potential, current methods face challenges in effectively balancing privacy, utility, and performance across diverse federated learning scenarios. Addressing these challenges, we introduce UDP-FL, to our knowledge the first DP-FL framework that universally harmonizes any randomization mechanism, including those considered optimal, by employing the Gaussian Moments Accountant (viz. DP-SGD). Central to UDP-FL is the 'Harmonizer,' a dynamic module engineered to intelligently select and apply the most suitable DP mechanism tailored to each client's specific privacy requirements, data sensitivities, and computational capacities. This selection process is driven by the principle of Rényi Differential Privacy, which serves as a crucial mediator for aligning privacy budgets effectively. Our comprehensive evaluation of UDP-FL, benchmarked against established baseline methods, demonstrates superior performance in upholding privacy guarantees and enhancing model functionality. The framework's robustness has been rigorously tested against a broad spectrum of privacy attacks, making it one of the most thorough validations of a DP-FL framework to date.
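
To illustrate why Rényi Differential Privacy (RDP) is a convenient common currency for harmonizing heterogeneous client mechanisms, the sketch below uses the standard RDP curve of the Gaussian mechanism and the standard RDP-to-(ε, δ) conversion to pick a per-client noise level for a given budget. This is an illustration only, not the UDP-FL code; the `pick_sigma` search strategy, sensitivity of 1, and parameter values are assumptions.

```python
# Hedged sketch: comparing per-client Gaussian mechanisms via Rényi DP accounting.
import math

def gaussian_rdp(alpha, sigma, sensitivity=1.0):
    # RDP of the Gaussian mechanism at order alpha (Mironov, 2017).
    return alpha * sensitivity**2 / (2 * sigma**2)

def rdp_to_dp(rdp_eps, alpha, delta):
    # Standard conversion from an RDP guarantee at order alpha to (eps, delta)-DP.
    return rdp_eps + math.log(1.0 / delta) / (alpha - 1)

def pick_sigma(target_eps, delta, rounds, alphas=range(2, 64)):
    """Hypothetical 'harmonizer' step: find a noise multiplier whose composed RDP
    over `rounds` steps still meets the client's (eps, delta) target."""
    sigma = 0.5
    while True:
        best = min(rdp_to_dp(rounds * gaussian_rdp(a, sigma), a, delta) for a in alphas)
        if best <= target_eps:
            return sigma
        sigma += 0.1

print(pick_sigma(target_eps=4.0, delta=1e-5, rounds=100))
```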

Differentially Private Iterative Screening Rules for Linear Regression

Linear L1-regularized models have remained one of the simplest and most effective tools in data science. Over the past decade, screening rules have risen in popularity as a way to eliminate features when producing the sparse regression weights of L1 models. However, despite the increasing need for privacy-preserving models in data analysis, to the best of our knowledge, no differentially private screening rule exists. In this paper, we develop the first private screening rule for linear regression. We initially find that this screening rule is too strong: the private screening step screens out too many coefficients. However, a weakened implementation of private screening reduces overscreening and improves performance.
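
To make the idea concrete, a screening rule discards a feature when a correlation statistic falls below a threshold; privatizing the rule means adding calibrated noise to that statistic before comparing. The sketch below is a hedged illustration of that pattern, not the paper's exact rule; the sensitivity bound, the `slack` term, and all parameter values are assumptions.

```python
# Illustrative noisy screening step for L1 (Lasso-style) regression.
import numpy as np

def private_screen(X, y, w, lam, epsilon, slack=0.0, rng=np.random.default_rng(0)):
    """Return a boolean mask of features kept after private screening.
    X: (n, d) design matrix, y: (n,) targets, w: current weights, lam: L1 penalty."""
    n, d = X.shape
    residual = y - X @ w
    corr = np.abs(X.T @ residual) / n                         # per-feature correlation statistic
    sensitivity = np.max(np.abs(X)) * np.max(np.abs(y)) / n   # crude bound, for illustration only
    noisy_corr = corr + rng.laplace(scale=sensitivity / epsilon, size=d)
    # A weaker threshold (larger slack) screens fewer features, reducing over-screening.
    return noisy_corr >= lam - slack

X = np.random.default_rng(1).normal(size=(200, 20))
y = X[:, 0] * 3.0 + np.random.default_rng(2).normal(size=200)
keep = private_screen(X, y, np.zeros(20), lam=0.5, epsilon=1.0, slack=0.2)
print(keep.sum(), "features kept")
```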

Spend Your Budget Wisely: Towards an Intelligent Distribution of the Privacy Budget in Differentially Private Text Rewriting

Differentially Private Text Rewriting refers to a class of text privatization techniques in which (sensitive) input textual documents are rewritten under Differential Privacy (DP) guarantees. The motivation behind such methods is to hide both explicit and implicit identifiers that could be contained in text, while still retaining the semantic meaning of the original text, thus preserving utility. Recent years have seen an uptick in research output in this field, offering a diverse array of word-, sentence-, and document-level DP rewriting methods. Common to these methods is the selection of a privacy budget (i.e., the ε parameter), which governs the degree to which a text is privatized. One major limitation of previous works, stemming directly from the unique structure of language itself, is the lack of consideration of where the privacy budget should be allocated, as not all aspects of language, and therefore text, are equally sensitive or personal. In this work, we are the first to address this shortcoming, asking how a given privacy budget can be intelligently and sensibly distributed across a target document. We construct and evaluate a toolkit of linguistics- and NLP-based methods used to allocate a privacy budget to constituent tokens in a text document. In a series of privacy and utility experiments, we empirically demonstrate that, given the same privacy budget, intelligent distribution leads to higher privacy levels and more positive trade-offs than a naive distribution of epsilon. Our work highlights the intricacies of text privatization with DP, and furthermore, it calls for further work on finding more efficient ways to maximize the privatization benefits offered by DP in text rewriting.
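
As a rough illustration of non-uniform budget allocation, one could weight each token's share of ε by a per-token sensitivity score (e.g., from an NER or POS tagger), giving sensitive tokens a smaller per-token ε and therefore stronger privatization. This is a minimal sketch under those assumptions, not the paper's allocation strategy; the scoring values and the inverse weighting are hypothetical.

```python
# Hedged sketch: distributing a document-level epsilon across tokens by sensitivity.
def allocate_budget(tokens, sensitivity_scores, total_epsilon):
    """Higher-scored (more sensitive) tokens receive a SMALLER share of epsilon,
    i.e. stronger privatization; low-risk tokens keep more of their utility."""
    inverse = [1.0 / (1.0 + s) for s in sensitivity_scores]
    norm = sum(inverse)
    return {tok: total_epsilon * inv / norm for tok, inv in zip(tokens, inverse)}

tokens = ["Alice", "visited", "Berlin", "yesterday"]
scores = [0.9, 0.1, 0.8, 0.2]          # e.g., from an NER model (hypothetical values)
print(allocate_budget(tokens, scores, total_epsilon=8.0))
```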

SESSION: Session 3: Access Control Management and Policy Compliance

Enhancing Relationship-Based Access Control Policies with Negative Rule Mining

Relationship-based access control (ReBAC) policies often rely solely on positive authorization rules, implicitly denying all other requests by default. However, many scenarios require explicitly stated negative authorization rules to capture exceptions or special restrictions that are not naturally enforced by deny-by-default semantics. This work presents a systematic method to mine ReBAC policies that integrate both positive and negative authorization rules from observed authorizations. We formalize the mining problem, show its NP-hardness, and develop an approach that identifies minimal policies while accurately reflecting observed access decisions. Our experimental evaluations on representative datasets demonstrate the feasibility of the approach and show that including negative rules leads to more concise and semantically complete policies, confirming the necessity of explicit negative authorizations in complex access control settings.
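
To see why explicit negative rules matter beyond deny-by-default, consider deny-overrides evaluation: a subject that satisfies a positive rule is still denied if it also matches a negative rule. The sketch below is only an illustration of that semantics (not the paper's mining algorithm); the relation names and simplified rule syntax are placeholders.

```python
# Minimal deny-overrides evaluation of a ReBAC policy with positive and negative rules.
def authorized(subject, resource, relationships, positive_rules, negative_rules):
    """relationships: set of (subject, relation, resource) triples.
    A rule is simply a relation name; negative rules override positive ones."""
    held = {rel for (s, rel, r) in relationships if s == subject and r == resource}
    if held & set(negative_rules):           # explicit exception -> deny
        return False
    return bool(held & set(positive_rules))  # otherwise deny-by-default

rels = {("alice", "colleague", "doc1"), ("alice", "former-employee", "doc1")}
print(authorized("alice", "doc1", rels,
                 positive_rules=["colleague"],
                 negative_rules=["former-employee"]))   # False: the negative rule wins
```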

To the Best of Knowledge and Belief: On Eventually Consistent Access Control

We are used to the conventional model in which access control is provided by a trusted central entity or by a set of distributed entities that coordinate to mimic a central entity. Authorization decisions are based on a single, append-only total ordering of all actions, including policy updates, which leads to strong consistency guarantees. Recent systems based on conflict-free replicated data types (CRDTs) break with this conceptual model to gain fundamental advantages in latency, availability, resilience, and Byzantine fault tolerance. These systems replace up-front coordination with subsequent reconciliation of decentralized authorization decisions and policy updates. One of these is the Matrix group communication system, whose massive public sector deployments in Europe necessitate timely characterization of the underlying alternative conceptual model. Similarly to eventually consistent replication in CRDTs, we define 'eventually consistent access control' and present its consequences. Our model postulates thinking in two orderings of actions with different consequences for authorization: a partial ordering for storage, where the past of an action is final knowledge, and a total ordering for execution, where the past of an action is a mutable belief.

Proof of Compliance (PoC): A Consensus Mechanism to Verify the Compliance with Informed Consent Policy in Healthcare

Healthcare industries are subject to various laws and regulatory oversight, just like other industries such as pharmaceuticals, telecommunications, education, and financial services. Compliance with these regulations is essential for an organization's operation and growth. To help organizations detect non-compliance issues early, this paper proposes a consensus mechanism, Proof of Compliance (PoC), in which a set of distributed, decentralized, and independent auditor nodes perform audit operations to determine the compliance status of any logical operations or accesses that have already been approved, granted, or executed in the system. The Proof of Compliance consensus mechanism helps organizations minimize compliance challenges. Organizations can act on PoC outputs to reduce non-compliance cases and avoid compliance issues and business losses. The PoC reports do not support final regulatory compliance certification. However, such certification becomes possible if one or more audit nodes are deployed and maintained in the consensus mechanism by the corresponding regulatory, government, or compliance authority.

SESSION: Session 4: Systems and Hardware Security

Exploiting HDMI and USB Ports for GPU Side-Channel Insights

Modern computers rely on USB and HDMI ports for connecting external peripherals and display devices. Despite their built-in security measures, these ports remain susceptible to passive power-based side-channel attacks. This paper presents a new class of attacks that exploit power consumption patterns at these ports to infer GPU activities. We develop a custom device that plugs into these ports and demonstrates that its high-resolution power measurements can drive successful inferences about GPU processes, such as neural network computations and video rendering. The ubiquitous presence of USB and HDMI ports allows for discreet placement of the device, and its non-interference with data channels ensures that no security alerts are triggered. Our findings underscore the need to reevaluate and strengthen the current generation of HDMI and USB port security defenses.

VS-TEE: A Framework for Virtualizing TEEs in ARM Cloud Contexts

Cloud computing processes and stores critical data, necessitating robust protections against unauthorized access. Confidential Computing (CC) technologies address this need by enabling secure computation in hardware-backed Trusted Execution Environments (TEEs). While solutions like AMD's Secure Encrypted Virtualization (SEV) provide strong protections, they remain vulnerable to attacks targeting applications within virtual machines (VMs). Similarly, the recent Armv9-A architecture introduces a promising Realm World for enhanced security, but its adoption is limited by hardware availability and upgrade constraints. ARM TrustZone, while widely supported, lacks native support for multiple isolated TEEs. In this paper, we propose VS-TEE, a framework that eliminates the need for these components in the Trusted Computing Base (TCB), enabling secure integration of TEEs with VMs. It features a VS-TEE Driver for VM interaction and a VS-TEE Hypervisor for secure communication, ensuring compatibility with ARM TrustZone and OP-TEE libraries. We developed and evaluated an open-source prototype, demonstrating its effectiveness in addressing challenges like memory translation, resource management, and interoperability. Our framework enhances security for cloud environments, allowing multiple VMs to securely share TEE capabilities.

Defining Security Limits in Biometrics

Biometric systems are widely used for authentication and identification. The False Match Rate (FMR) quantifies the probability of matching a biometric template to a non-corresponding template and serves as an indicator of the system's robustness against security threats. We analyze biometric systems through two main contributions. First, we study untargeted attacks, where an adversary aims to impersonate any user in the database. We compute the number of trials needed for a successful impersonation and derive the critical population size (i.e., the maximum database size) and critical FMR required to maintain security against untargeted attacks as the database grows. Second, we address the biometric birthday problem, which quantifies the probability that there exist two distinct users who collide (i.e., can impersonate each other). We compute approximate and exact probabilities of collision and derive the associated critical population size and critical FMR to bound the risk of biometric collisions. These thresholds provide actionable insights for designing biometric systems that mitigate the risks of impersonation and biometric collisions, particularly in large-scale databases. Nevertheless, our findings show that current systems fail to meet the required security level against untargeted attacks, even in small databases, and face significant challenges with the biometric birthday problem as databases grow.
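
For intuition, the two quantities studied admit well-known first-order forms under the simplifying assumption of independent, equally likely false matches; the paper's exact and approximate derivations are more careful, so treat the following only as a sketch.

```latex
% Untargeted attack: probability that a single probe matches at least one of N templates,
% and the expected number of probes until the first success.
P_{\text{untargeted}} = 1 - (1 - \mathrm{FMR})^{N},
\qquad
\mathbb{E}[\text{trials}] \approx \frac{1}{1 - (1 - \mathrm{FMR})^{N}}.

% Biometric birthday problem: probability that some pair of the N enrolled users collide.
P_{\text{collision}} \approx 1 - \exp\!\left(-\binom{N}{2}\,\mathrm{FMR}\right).
```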

Probabilistic Data Structures in the Wild: A Security Analysis of Redis

Redis (Remote Dictionary Server) is a general-purpose, in-memory database that supports a rich array of functionality, including various Probabilistic Data Structures (PDS), such as Bloom filters, Cuckoo filters, and cardinality and frequency estimators. These PDS typically perform well in the average case. However, given that Redis is intended to be used across a diverse array of applications, it is crucial to evaluate how these PDS perform under worst-case scenarios, i.e., when faced with adversarial inputs. We offer a comprehensive analysis to address this question. We begin by carefully documenting the different PDS implementations in Redis, explaining how they deviate from those PDS as described in the literature. Then we show that these deviations enable a total of 10 novel attacks that are more severe than the corresponding attacks on generic versions of the PDS. We highlight the critical role of Redis' decision to use non-cryptographic hash functions in the severity of these attacks. We conclude by discussing countermeasures to these attacks.
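
To convey why non-cryptographic hashing matters, the generic illustration below shows a targeted false-positive attack on a plain Bloom filter whose hash is public and predictable: the adversary crafts insertions that set exactly the bits of a victim key it never inserted. This is a generic sketch, not a Redis exploit; the seeded FNV-1a variant, the filter parameters, and the victim key are assumptions.

```python
# Generic Bloom filter pollution sketch against a public, non-cryptographic hash.
def fnv1a(data: bytes, seed: int) -> int:
    h = (0x811c9dc5 ^ seed) & 0xffffffff        # seeded FNV-1a variant, for illustration
    for b in data:
        h = ((h ^ b) * 0x01000193) & 0xffffffff
    return h

M, K = 1024, 4                                   # filter size (bits) and number of hashes
positions = lambda item: {fnv1a(item, seed) % M for seed in range(K)}

victim = b"admin-session"
target_bits = positions(victim)

bloom, inserted = set(), []
for bit in target_bits:                          # cover each of the victim's bit positions
    i = 0
    while True:
        cand = f"junk-{bit}-{i}".encode()
        if bit in positions(cand):
            bloom |= positions(cand)
            inserted.append(cand)
            break
        i += 1

print(len(inserted), "crafted insertions;",
      "victim (never inserted) now reported present:", target_bits <= bloom)
```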

Padding Matters - Exploring Function Detection in PE Files: Data/Toolset paper

Function detection is a well-known problem in binary analysis. While prior work has focused on Linux/ELF binaries, Windows/PE binaries have only partially been considered. This paper introduces FuncPEval, a dataset for Windows x86 and x64 PE files, featuring Chromium and the Conti ransomware, along with ground truth data for 1,092,820 function starts. Using FuncPEval, we evaluate five heuristics-based (Ghidra, IDA, Nucleus, rev.ng, SMDA) and three machine-learning-based (DeepDi, RNN, XDA) function start detection tools. Among these, IDA achieves the highest F1-score (98.44%) for Chromium x64, while DeepDi follows closely (97%) and stands out as the fastest. Towards explainability, we examine the impact of padding between functions on the detection results and find that all tested tools except rev.ng are susceptible to randomized padding. Randomized padding significantly diminishes the effectiveness of the RNN, XDA, and Nucleus. DeepDi exhibits the least sensitivity among the learning-based tools, while Nucleus is the most adversely affected among the non-learning-based tools.

SESSION: Session 5: Privacy Inference and Data Aggregation

Secure and Efficient Video Inferences with Compressed 3-Dimensional Deep Neural Networks

Deep neural network (DNN) services have been widely deployed for efficient and accurate inferences in many different domains. In practice, a client may send its private data (e.g., images, text messages and videos) to the service to get the inferences with the proprietary DNN models. Significant privacy and security concerns would emerge in such scenarios. Cryptographic inference systems have been proposed to address such privacy and security concerns. However, existing systems are tailored for DNNs on image inferences, and are not directly applicable to video inference tasks that operate on spatio-temporal (3D) features. To address such critical deficiencies, we design and implement the first cryptographic inference system, Crypto3D, which privately and efficiently infers videos with compressed 3D DNNs while ensuring rigorous privacy guarantees. We also update most cryptographic inference systems (designed for images) to support video understanding on 3D features with non-trivial extensions, treating them as baselines. We evaluate Crypto3D and benchmark it against baselines utilizing the widely adopted C3D and I3D models on the UCF-101 and HMDB-51 datasets. Our results demonstrate that Crypto3D significantly outperforms existing systems on execution time: 554.68× vs. CryptoDL (3D), 189.21× vs. HEANN (3D), 182.61× vs. MP-SPDZ (3D), 133.56× vs. E2DM (3D), 11.09× vs. Intel SGX (3D), 8.90× vs. Gazelle (3D), 3.71× vs. Delphi (3D), 12.97× vs. CryptFlow2 (3D), and 1.49× vs. Cheetah (3D); accuracy: 82.4% vs. <80% for all of them. Code is available at https://github.com/datasec-lab/crypto3D.

Buffalo: A Practical Secure Aggregation Protocol for Buffered Asynchronous Federated Learning

Federated Learning (FL) has become a crucial framework for collaboratively training Machine Learning (ML) models while ensuring data privacy. Traditional synchronous FL approaches, however, suffer from delays caused by slower clients (called stragglers), which hinder the overall training process. Specifically, in a synchronous setting, model aggregation happens once all the intended clients have submitted their local updates to the server. To address these inefficiencies, Buffered Asynchronous FL (BAsyncFL) was introduced, allowing clients to update the global model as soon as they complete local training. In such a setting, the new global model is obtained once the buffer is full, thus removing synchronization bottlenecks. Despite these advantages, existing Secure Aggregation (SA) techniques, designed to protect client updates from inference attacks, rely on synchronized rounds, making them unsuitable for asynchronous settings.

In this paper, we present Buffalo, the first practical SA protocol tailored for BAsyncFL. Buffalo leverages lattice-based encryption to handle scalability challenges in large ML models and introduces a new role, the assistant, to support the server in securely aggregating client updates. To protect against an actively corrupted server, we enable clients to verify that their local updates have been correctly integrated into the global model. Our comprehensive evaluation, incorporating theoretical analysis and real-world experiments on benchmark datasets, demonstrates that Buffalo is an efficient and scalable privacy-preserving solution in BAsyncFL environments.
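
For readers unfamiliar with Secure Aggregation, the toy sketch below shows the classic synchronous-style guarantee with pairwise additive masking: the server recovers only the sum of the buffered updates, never an individual one. It also illustrates why such schemes assume a fixed, synchronized participant set, which is exactly the constraint Buffalo's lattice-based design removes; the scheme, seeds, and dimensions shown are illustrative, not Buffalo's protocol.

```python
# Toy pairwise-masking Secure Aggregation over a fixed buffer of three clients.
import numpy as np

DIM, MOD = 4, 2**32
rng = np.random.default_rng(42)

def masked_update(update, my_id, buffer_ids, pair_seeds):
    """Each pair of clients sharing the buffer also shares a seed; their masks cancel in the sum."""
    masked = update.astype(np.uint64) % MOD
    for other in buffer_ids:
        if other == my_id:
            continue
        mask = np.random.default_rng(pair_seeds[frozenset((my_id, other))]).integers(
            0, MOD, DIM, dtype=np.uint64)
        masked = (masked + mask if my_id < other else masked - mask) % MOD
    return masked

buffer_ids = [1, 2, 3]                     # clients whose updates landed in the same buffer
pair_seeds = {frozenset(p): s for s, p in enumerate([(1, 2), (1, 3), (2, 3)], start=7)}
updates = {cid: rng.integers(0, 10, DIM) for cid in buffer_ids}

aggregate = sum(masked_update(updates[c], c, buffer_ids, pair_seeds) for c in buffer_ids) % MOD
print(np.array_equal(aggregate, sum(updates.values()) % MOD))  # True: server sees only the sum
```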

Multi-Device Context-Sensitive Attacks Against Privacy

As the adoption of wearable and smart devices increases, their privacy and security remain a concern. These devices collect sensitive data and constantly communicate with each other, posing new privacy threats that need to be understood and addressed. In this paper, we analyze the privacy of smart devices from a multi-device perspective. The central premise of our work is that the information available at each device may be non-sensitive or only lightly so, but by orchestrating information from multiple connected smart devices, it is possible to infer sensitive content. To verify this, we conduct a user study to understand user perceptions of privacy on smart devices and contrast them with users' actual behavior while operating these devices. We then present an attack framework that can leverage tightly coupled and connected smart devices, such as mobile phones, wearables, and smart TVs, to leak sensitive information inferred from individually non-sensitive data. Finally, we introduce a tool based on NLP techniques to identify potential privacy vulnerabilities on smart devices and propose an integrated solution to increase smart devices' security. This analysis helps close the gap between users' perception and the reality of privacy risks within their smart ecosystems.

Why You've Got Mail: Evaluating Inbox Privacy Implications of Email Marketing Practices in Online Apps and Services

This study explores the widespread perception that personal data, such as email addresses, may be shared or sold without informed user consent, investigating whether these concerns are reflected in the actual practices of popular online services and apps. Over the course of a year, we collected and analyzed the source, volume, frequency, and content of emails received by users after signing up for the 150 most popular online services and apps across various sectors. By examining patterns in email communications, we aim to identify consistent strategies used across industries, including potential signs of third-party data sharing. This analysis provides a critical evaluation of how email marketing tactics may intersect with data-sharing practices, with important implications for consumer privacy and regulatory oversight. Our study, conducted post-CCPA and post-GDPR, finds that while no spam from unknown third parties was detected, internal and authorized third-party email marketing practices were pervasive, with companies frequently sending promotional and CRM emails despite opt-out preferences. The framework established in this work is designed to be scalable, allowing for continuous monitoring, and can be extended to include a more diverse set of apps and services for broader analysis, ultimately contributing to transparency in email address privacy practices.

How Feasible is Augmenting Fake Nodes with Learnable Features as a Counter-strategy against Link Stealing Attacks?

Graph Neural Networks (GNNs) are widely used and deployed for graph-based prediction tasks. However, as good as GNNs are at learning graph data, they also come with the risk of privacy leakage. For instance, an attacker can run carefully crafted queries on a GNN and, from the responses, infer the existence of an edge between a pair of nodes. This attack, dubbed a link-stealing attack, can jeopardize users' privacy by leaking potentially sensitive information. To protect against this attack, we propose an approach called Node Augmentation for Restricting Graphs from Insinuating their Structure (NARGIS) and study its feasibility. NARGIS focuses on reshaping the graph embedding space so that the posterior from the GNN model still provides utility for the prediction task but introduces ambiguity for link-stealing attackers. NARGIS applies spectral clustering on the given graph so that it can be augmented with new nodes whose features are learned rather than fixed. It utilizes tri-level optimization to learn parameters for the GNN model, a surrogate attacker model, and our defense model (i.e., the learnable node features). We extensively evaluate NARGIS on three benchmark citation datasets over eight knowledge-availability settings for the attackers. We also evaluate model fidelity and defense performance against influence-based link inference attacks. Through our studies, we find that NARGIS's main strength is its superior fidelity-privacy trade-off in a significant number of cases. We also identify cases that need improvement and propose ways to integrate different schemes to make the model more robust against link-stealing attacks.

SESSION: Session 6: Threat Detection and Intelligence

Citar: Cyberthreat Intelligence-driven Attack Reconstruction

Security Operation Centers (SOCs) are the first line of defense against an increasingly complex and sophisticated environment of advanced persistent threats (APTs). Inside SOCs, analysts deal with thousands of alerts every day and have to make real-time decisions about whether alerts are worth investigating further. However, they face several challenges in efficiently investigating a significant number of alerts daily and reconstructing attack scenarios from those alerts. In this paper, we present Citar, an approach for leveraging cyber threat intelligence (CTI) to facilitate attack scenario reconstruction. Citar enhances alert investigation by attributing alerts to potential attacker groups and examining audit logs for related attack instances. Utilizing a new correlation analysis developed for this purpose, we identify potential connections between flagged alerts and known attack behaviors present in a system. Citar is evaluated using a DARPA public dataset and 10 new attack scenarios (five real-world APT groups and five popular malware families). Our evaluation shows that augmenting existing detection mechanisms with Citar improves detection performance by up to 57%, significantly aiding SOC analysts in alert investigations and attack reconstructions.

SmishViz: Towards A Graph-based Visualization System for Monitoring and Characterizing Ongoing Smishing Threats

SMS phishing (aka 'smishing') threats have grown into a serious concern for mobile users around the globe. In cases of successful smishing, attackers take advantage of users' trust through deceptive text messages to trick them into downloading malicious content, disclosing private information, or becoming victims of fraud. Current studies on smishing mostly focus on classifying smishing (or spam) messages apart from benign ones as a means of defense. However, there is no systematic study characterizing smishing threats and their landscape that would let us monitor ongoing campaigns from a bird's-eye perspective and apply effective defenses. In this paper, we propose SmishViz, a graph-based visualization system that aids defenders (i.e., analysts) in characterizing ongoing smishing threats in the wild and allows them to monitor connected campaigns and campaign operations through an effective graph visualization approach integrated with a state-of-the-art open-source visualization tool. This paper also provides a case study with a real-world smishing dataset to showcase the efficacy of the SmishViz system in practical use-case scenarios. Our case study results reveal that the proposed system can help defenders track and monitor ongoing smishing campaigns, understand attackers' tactics, formulate strategic defenses, and uproot attack operations.

TerrARA: Automated Security Threat Modeling for Infrastructure as Code

The emergence of DevOps is accompanied by an increased use of Infrastructure as Code (IaC) to specify and manage deployment configurations, infrastructure, and associated resources. Terraform is one such IaC solution. However, improper configurations can lead to serious security threats. This paper introduces an approach, implemented as TerrARA, that provides a systematic and structured way of automatically eliciting security threats from Terraform configuration files. Specifically, TerrARA: (1) automates the construction of an abstract model, an enriched Data Flow Diagram (DFD), from Terraform configuration files for Amazon Web Services (AWS), and can be extended to other resources and cloud providers via profiles; (2) encodes cloud computing threat patterns, which are utilized by the SPARTA threat modeling engine to automatically identify security threats; and (3) demonstrates its capability in accurately extracting DFDs from Terraform projects and eliciting relevant cloud computing security threats, achieving high accuracy and reasonable performance compared to existing tools and approaches such as StartLeft and GPT-4o. By integrating it into CI/CD pipelines, the automated reconstruction and analysis enable continuous security assessments that systematically incorporate cloud infrastructure artifacts into the threat modeling process.

SESSION: Session 7: Blockchain and Decentralized Finance

Protecting DeFi Platforms against Non-Price Flash Loan Attacks

Smart contracts in Decentralized Finance (DeFi) platforms are attractive targets for attacks, as their vulnerabilities can lead to massive financial losses. Flash loan attacks, in particular, pose a major threat to DeFi protocols, which hold a Total Value Locked (TVL) exceeding $106 billion. These attacks use the atomicity property of blockchains to drain funds from smart contracts in a single transaction. While existing research primarily focuses on price manipulation attacks, such as oracle manipulation, mitigating non-price flash loan attacks, which often exploit smart contracts' zero-day vulnerabilities, remains largely unaddressed. These attacks are challenging to detect because of their unique patterns, time sensitivity, and complexity. In this paper, we present FlashGuard, a runtime detection and mitigation method for non-price flash loan attacks. Our approach targets smart contract function signatures to identify attacks in real time and counterattacks by disrupting the attack transaction's atomicity, leveraging the short window when transactions are visible in the mempool but not yet confirmed. When FlashGuard detects an attack, it dispatches a stealthy dusting counter-transaction to miners to change the victim contract's state, which disrupts the attack's atomicity and forces the attack transaction to revert. We evaluate our approach using 20 historical attacks and several unseen attacks. FlashGuard achieves an average real-time detection latency of 150.31ms, a detection accuracy of over 99.93%, and an average disruption time of 410.92ms. FlashGuard could have potentially rescued over $405.71 million in losses had it been deployed prior to these attack instances. FlashGuard demonstrates significant potential as a DeFi security solution for mitigating and handling the rising threat of non-price flash loan attacks.
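
The general shape of such a mempool-racing defense can be sketched as follows: scan pending transactions' 4-byte function selectors against known attack signatures and, on a hit, plan a higher-fee counter-transaction that perturbs the victim's state before the attack confirms. This is a hedged illustration, not FlashGuard's implementation; the selector values, transaction fields, and `plan_counterattacks` helper are hypothetical.

```python
# Hedged sketch of selector-based mempool screening and counter-transaction planning.
KNOWN_RISKY_SELECTORS = {"0xabcd1234", "0x5ef31337"}    # hypothetical example selectors

def plan_counterattacks(pending_txs):
    """pending_txs: iterable of dicts with 'to' (victim contract) and 'input' (calldata).
    Returns dusting counter-transactions to broadcast with a boosted fee."""
    actions = []
    for tx in pending_txs:
        selector = tx["input"][:10]                      # '0x' + 8 hex chars (4 bytes)
        if selector in KNOWN_RISKY_SELECTORS:
            actions.append({"victim": tx["to"],
                            "type": "dusting",
                            "priority_fee_boost": 2.0})  # outbid the attacker for ordering
    return actions

mempool_snapshot = [{"to": "0xVictimPool", "input": "0xabcd1234" + "00" * 64}]
print(plan_counterattacks(mempool_snapshot))
```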

SolRPDS: A Dataset for Analyzing Rug Pulls in Solana Decentralized Finance

Rug pulls in Solana have caused significant damage to users interacting with Decentralized Finance (DeFi). A rug pull occurs when developers exploit users' trust and drain liquidity from token pools on Decentralized Exchanges (DEXs), leaving users with worthless tokens. Although rug pulls in Ethereum and Binance Smart Chain (BSC) have gained attention recently, analysis of rug pulls in Solana remains largely under-explored. In this paper, we introduce SolRPDS (Solana Rug Pull Dataset), the first public rug pull dataset derived from Solana's transactions. We examine approximately four years of DeFi data (2021-2024) covering suspected and confirmed tokens that exhibit rug pull patterns. The dataset, derived from 3.69 billion transactions, consists of 62,895 suspicious liquidity pools. The data is annotated with inactivity states, a key indicator, and includes detailed liquidity activities, such as additions, removals, and last interactions, as well as other attributes, such as inactivity periods and withdrawn token amounts, to help identify suspicious behavior. Our preliminary analysis reveals clear distinctions between legitimate and fraudulent liquidity pools, and we found that 22,195 tokens in the dataset exhibit rug pull patterns during the examined period. SolRPDS can support a wide range of future research on rug pulls, including the development of data-driven and heuristic-based solutions for real-time rug pull detection and mitigation.

Using Venom to Flip the Coin and Peel the Onion: Measurement Tool and Dataset for Studying the Bitcoin - Dark Web Synergy: Data/Toolset Paper

Bitcoin and the Dark Web present an interesting synergy that enables both legitimate anonymity and illicit activities, making this an important landscape to understand, especially as the Dark Web, with its hidden services, relies heavily on Bitcoin as a pseudonymous currency for transactions. However, a lack of scalable tools and timely datasets has limited systematic analysis of this ecosystem. To address this gap, we introduce Venom, a scalable framework for mapping Bitcoin activity on the Dark Web. Venom integrates multithreaded crawling, data extraction, and dataset generation, resulting in a comprehensive resource that allows us to easily collect snapshots of over 177,000 onion sites in roughly 24 hours. Alongside the paper, we share both the tool and an example snapshot containing per-site metadata and Bitcoin transaction data. Preliminary analysis reveals concentrated activity among key players and widespread content mirroring, offering new insights into the Dark Web's economic structure. Venom provides a critical resource for advancing research and monitoring in this domain.

SESSION: Session 8: AI and Security

Espresso: Robust Concept Filtering in Text-to-Image Models

Diffusion-based text-to-image models are trained on large datasets scraped from the Internet, potentially containing unacceptable concepts (e.g., copyright-infringing or unsafe). We need concept removal techniques (CRTs) that are i) effective in preventing the generation of images with unacceptable concepts, ii) utility-preserving on acceptable concepts, and iii) robust against evasion with adversarial prompts. No prior CRT satisfies all these requirements simultaneously. We introduce Espresso, the first robust concept filter based on Contrastive Language-Image Pre-Training (CLIP). We identify unacceptable concepts by using the distance between the embedding of a generated image and the text embeddings of both unacceptable and acceptable concepts. This lets us fine-tune for robustness by separating the text embeddings of unacceptable and acceptable concepts while preserving utility. We present a pipeline to evaluate various CRTs and show that Espresso is more effective and robust than prior CRTs, while retaining utility.
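
The filtering decision described above can be pictured as comparing a generated image's embedding to the text embeddings of unacceptable versus acceptable concepts. The sketch below is a minimal illustration of that comparison, assuming CLIP-style embeddings are available; the toy vectors, the `margin` parameter, and the decision rule are assumptions rather than Espresso's exact scoring.

```python
# Minimal embedding-distance concept filter, with toy vectors standing in for CLIP embeddings.
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def is_unacceptable(image_emb, unacceptable_embs, acceptable_embs, margin=0.0):
    closest_bad = max(cosine(image_emb, u) for u in unacceptable_embs)
    closest_ok = max(cosine(image_emb, a) for a in acceptable_embs)
    return closest_bad > closest_ok + margin   # fine-tuning can widen this separation

image_emb = np.array([0.9, 0.1, 0.0])          # toy 3-d "embedding" of a generated image
unacceptable = [np.array([1.0, 0.0, 0.0])]     # text embedding of an unacceptable concept
acceptable = [np.array([0.0, 1.0, 0.0])]       # text embedding of an acceptable concept
print(is_unacceptable(image_emb, unacceptable, acceptable))   # True: filter this image
```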

Laminator: Verifiable ML Property Cards using Hardware-assisted Attestations

Regulations increasingly call for various assurances from machine learning (ML) model providers about their training data, training process, and model behavior. For better transparency, industry (e.g., Huggingface and Google) has adopted model cards and datasheets to describe various properties of training datasets and models. In the same vein, we introduce the notion of inference cards to describe the properties of a given inference (e.g., binding of the output to the model and its corresponding input). We coin the term ML property cards to collectively refer to these various types of cards.

To prevent a malicious model provider from including false information in ML property cards, they need to be verifiable. We show how to construct verifiable ML property cards using property attestation, technical mechanisms by which a prover (e.g., a model provider) can attest to various ML properties to a verifier (e.g., an auditor). Since prior attestation mechanisms based purely on cryptography are often narrowly focused (lacking versatility) and inefficient, we need an efficient mechanism to attest different types of properties across the entire ML model pipeline.

Emerging widespread support for confidential computing has made it possible to run and even train models inside hardware-assisted trusted execution environments (TEEs), which provide highly efficient attestation mechanisms. We propose Laminator, which uses TEEs to provide the first framework for verifiable ML property cards via hardware-assisted ML property attestations. Laminator is efficient in terms of overhead, scalable to large numbers of verifiers, and versatile with respect to the properties it can prove during training or inference.

The Ephemeral Threat: Assessing the Security of Algorithmic Trading Systems powered by Deep Learning

We study the security of stock price forecasting using Deep Learning (DL) in computational finance. Despite abundant prior research on the vulnerability of DL to adversarial perturbations, such work has hitherto hardly addressed practical adversarial threat models in the context of DL-powered algorithmic trading systems (ATS).

Specifically, we investigate the vulnerability of ATS to adversarial perturbations launched by a realistically constrained attacker. We first show that existing literature has paid limited attention to DL security in the financial domain, which is naturally attractive for adversaries. Then, we formalize the concept of ephemeral perturbations (EP), which can be used to stage a novel type of attack tailored for DL-based ATS. Finally, we carry out an end-to-end evaluation of our EP against a profitable ATS. Our results reveal that the introduction of small changes to the input stock prices not only (i) induces the DL model to behave incorrectly but also (ii) leads the whole ATS to make suboptimal buy/sell decisions, resulting in worse financial performance for the targeted ATS.

PromptShield: Deployable Detection for Prompt Injection Attacks

Application designers have moved to integrate large language models (LLMs) into their products. However, many LLM-integrated applications are vulnerable to prompt injections. While attempts have been made to address this problem by building prompt injection detectors, many are not yet suitable for practical deployment. To support research in this area, we introduce PromptShield, a benchmark for training and evaluating deployable prompt injection detectors. Our benchmark is carefully curated and includes both conversational and application-structured data. In addition, we use insights from our curation process to fine-tune a new prompt injection detector that achieves significantly higher performance in the low false positive rate (FPR) evaluation regime compared to prior schemes. Our work suggests that careful curation of training data and larger models can contribute to strong detector performance.

A Dataset for Evaluating LLMs Vulnerability Repair Performance in Android Applications: Data/Toolset paper

Automated Program Repair (APR) is a well-established research area that enhances software reliability and security by automatically fixing bugs, reducing manual effort, and accelerating debugging. Despite progress in publishing benchmarks to evaluate APR tools, datasets specifically targeting Android are lacking.

To address this gap, we introduce a dataset of 272 real-world violations of Google's Android Security Best Practices, identified by statically analyzing 113 real-world Android apps. In addition to the faulty code, we manually crafted repairs based on Google's guidelines, covering 176 Java-based and 96 XML-based violations from Android Java classes and Manifest files, respectively. Additionally, we leveraged our novel dataset to evaluate Large Language Models (LLMs), as they are the latest promising APR tools. In particular, we evaluated GPT-4o, Gemini 1.5 Flash, and Gemini in Android Studio, and we found that GPT-4o outperforms Google's models, demonstrating higher accuracy and robustness across a range of violation types. With this dataset, we aim to provide valuable insights for advancing APR research and improving tools for Android security.

SESSION: Session 9: Cryptography and Secure Communication

CryptMove: Moving Stealthily through Legitimate and Encrypted Communication Channels

To move laterally inside an enterprise environment, Advanced Persistent Threat (APT) attacks have used multiple techniques. Due to the arms race between the attacks and the defenses, such techniques have evolved over time, with the latest one capable of reusing existing network connections for stealthy lateral movement. However, this technique has limited impact because it cannot reuse encrypted connections that are becoming the norm. In this paper, we present CryptMove, a novel technique that can abuse existing and encrypted channels for lateral movement. CryptMove secretly accesses the memory of the target process to duplicate the security context that is used by the target process to perform encryption/decryption; it also secretly duplicates sockets owned by the target process and injects encrypted malicious commands through these sockets into the encrypted communication channels. Since the location of the security context is specific to the target application, CryptMove employs automated analysis of the target application's binary code, in order to learn a path to reach the security context via a sequence of memory accesses. To demonstrate the feasibility of CryptMove, we built PoC attack tools (on both Windows and Linux) that successfully attacked popular applications (e.g., OpenSSH, PuTTY, WinSCP and WinRM) under 63 different cipher-protocol combinations. We also confirmed that the CryptMove PoC is not detectable by several popular Antivirus and Endpoint Detection and Response systems.

Blind Brother: Attribute-Based Selective Video Encryption

The emergence of video streams as a primary medium for communication and the demand for high-quality video sharing over the internet have given rise to several security and privacy issues, such as unauthorized access and data breaches. To address these issues, various Selective Video Encryption (SVE) schemes have been proposed, which encrypt specific portions of a video while leaving others unencrypted. The SVE approach balances security and usability, granting unauthorized users access to certain parts while encrypting sensitive content. However, existing SVE schemes adopt an all-or-nothing, coarse-grained encryption approach, where a user with a decryption key can access all the contents of a given video stream. This paper proposes and designs a fine-grained, access-control-based selective video encryption scheme, ABSVE, and a use-case protocol called Blind Brother. Our scheme encrypts each identified Region of Interest (ROI) with a unique symmetric key and applies a Ciphertext-Policy Attribute-Based Encryption (CP-ABE) scheme to tie these keys to specific access policies. This method provides multiple access levels for a single encrypted video stream. Crucially, we provide formal syntax and security definitions for ABSVE, allowing rigorous security analysis of this and similar schemes, which is absent in prior works. Finally, we provide an implementation and evaluation of our protocol in the Kvazaar HEVC encoder. Overall, our constructions enhance security and privacy while allowing controlled access to video content, and achieve efficiency comparable to compression without encryption.

Private Eyes: Zero-Leakage Iris Searchable Encryption

This work introduces Private Eyes, the first zero-leakage biometric database. The only leakage of the system is unavoidable: 1) the log of the dataset size and 2) the fact that a query occurred. Private Eyes is built from oblivious symmetric searchable encryption. Approximate proximity queries are used: given a noisy reading of a biometric, the goal is to retrieve all stored records that are close enough according to a distance metric.

Private Eyes combines locality-sensitive hashing (LSH; Indyk and Motwani, STOC 1998) with oblivious maps, which map keywords to values. One computes many LSHs of each record in the database and uses these hashes as keywords in the oblivious map, with the matching biometric readings concatenated as the value. At search time, given a noisy reading, one computes its LSHs and retrieves the disjunction of the resulting values from the map. The underlying oblivious map needs to answer disjunction queries efficiently.

We focus on the iris biometric, which requires a large number of LSHs, approximately 1000. Boldyreva and Tang's (PoPETS 2021) design yields a suitable map for a small number of LSHs (their application was zero-leakage k-nearest-neighbor search).

Our solution is a zero-leakage disjunctive map designed for the setting when most clauses do not match any records. For the iris, on average at most 6% of LSHs match any stored value.

For the largest tested parameters, a synthetic iris database of 5,000 entries, a search requires 18 rounds of communication and 25ms of parallel computation. Our scheme is implemented and open-sourced.
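
For intuition about the retrieval structure, the sketch below replaces the oblivious map with a plain dictionary and uses a toy bit-sampling LSH: enrollment inserts every LSH keyword of a template, and a query takes the disjunction (union) of lookups over the keywords of a noisy reading. It is only an illustration of the data layout; the zero-leakage disjunctive protocol and the real iris LSHs are what the paper actually contributes, and the parameters below are assumptions.

```python
# Plain-dict stand-in for the LSH-keyword -> record-id map used at enrollment and search.
import random
from collections import defaultdict

def lsh_keywords(template, num_hashes=8, bits_per_hash=4, seed=0):
    """Toy LSH over a binary template: each 'hash' samples a fixed set of bit positions."""
    rnd = random.Random(seed)
    samples = [rnd.sample(range(len(template)), bits_per_hash) for _ in range(num_hashes)]
    return [(i, tuple(template[j] for j in idx)) for i, idx in enumerate(samples)]

index = defaultdict(set)                       # keyword -> record ids (oblivious map stand-in)

def enroll(record_id, template):
    for kw in lsh_keywords(template):
        index[kw].add(record_id)

def search(noisy_template):
    hits = set()
    for kw in lsh_keywords(noisy_template):    # disjunctive query over all LSH keywords
        hits |= index.get(kw, set())
    return hits

alice = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1]
enroll("alice", alice)
noisy = alice.copy()
noisy[2] ^= 1                                  # one bit flipped by sensor noise
print(search(noisy))                           # likely {'alice'}: most LSH keywords still match
```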

Trilobyte: Plausibly Deniable Communications Through Single Player Games: Data/Toolset Paper

Plausibly deniable communication solutions built on services popular in Western countries may invite closer scrutiny into the activities of their users in censored countries. This paper investigates the ability of popular single-player games to provide the medium for plausibly deniable communications. We introduce Trilobyte, a system that hides data in game state generated opportunistically during regular game-playing activities, and shares data-hiding state through accounts on gaming platforms. We show that even in the presence of hypothetical censors that inspect game state, Trilobyte can hide up to 5.3 MB of data in game state saved in a one hour gaming session. We investigate the practicality of Trilobyte through surveys with 285 Chinese gamers, and by renting and purchasing thousands of gaming accounts. We find that most investigated games, including games developed in China, allow users to communicate keywords considered sensitive in China, when compressed, encrypted or hidden in game state or chat channels.

SESSION: Session 10: Vulnerability and Intrusion Detection

VulPatrol: Interprocedural Vulnerability Detection and Localization through Semantic Graph Learning

The growing complexity of software systems and the management of large, rapidly evolving codebases necessitate the analysis of immense volumes of code each day due to modifications and refactoring. Despite the use of static and dynamic analysis, test coverage, and rigorous code reviews, traditional methods often fail to accurately detect all security vulnerabilities, resulting in significant risks in production software. Recently, deep learning models have shown promise for improving vulnerability detection. Yet, there remains a clear gap between the abilities of current deep learning approaches and the level of performance required for precise source code vulnerability detection. To bridge this gap, it is crucial to develop enhancements in two fundamental areas: a code representation that accurately captures the semantics of programs, and a model architecture with adequate expressiveness to analyze this representation effectively. We introduce VulPatrol, a semantic-aware, deep neural network-based system that constructs interprocedural code property graphs over LLVM-IR from C/C++ source code. VulPatrol employs message-passing neural networks to capture complex dependencies and dynamic interactions within the code. As a result, it enhances the model's ability to classify potential vulnerabilities. Furthermore, we generate the first vulnerability database of compilable C/C++ open-source software lowered to LLVM-IR, along with an obfuscated version. Our extensive evaluation on different benchmark datasets, including real-world programs, shows that VulPatrol outperforms state-of-the-art baselines, improving the F1 measure by up to 12% for identifying vulnerable functions. Additionally, we evaluate VulPatrol on obfuscated code, where it yields superior results with respect to string variation and dissimilarity from the original codebase.

IoTDSCreator: A Framework to Create Labeled Datasets for IoT Intrusion Detection Systems

Intrusion detection systems (IDSes) are critical building blocks for securing Internet-of-Things (IoT) devices and networks. Advances in AI techniques are contributing to enhancing the efficiency of IDSes, but their performance typically depends on high-quality training datasets. The scarcity of such datasets is a major concern for the effective use of machine learning for IDSes in IoT networks. To address this need, we present IoTDSCreator, a tool for the automatic generation of labeled datasets that supports various devices, connectivity technologies, and attacks. IoTDSCreator provides the user with DC-API, an API by which the user can describe a target network and an attack scenario against it. Based on the description, the framework configures the network, leveraging virtualization techniques on user-provided physical machines, performs single- or multi-step attacks, and finally returns labeled datasets. Thereby, IoTDSCreator dramatically reduces the manual effort of generating labeled and diverse datasets. We release the source code of IoTDSCreator and 16 generated datasets with 193 features, based on 26 types of IoT devices, 2 types of communication links, and 15 types of IoT applications.

Sherlock: A Dataset for Process-aware Intrusion Detection Research on Power Grid Networks: Dataset Paper

Physically distributed components and legacy protocols make the protection of power grids against increasing cyberattack threats challenging. Infamously, the 2015 and 2016 blackouts in Ukraine were caused by cyberattacks, and the German Federal Office for Information Security (BSI) recorded over 200 cyber incidents against the German energy sector between 2023 and 2024. Intrusion detection promises to quickly detect such attacks and mitigate the worst consequences. However, public datasets of realistic scenarios are vital to evaluate these systems. This paper introduces Sherlock, a dataset generated with the co-simulator Wattson. In total, Sherlock covers three scenarios with various attacks manipulating the process state by injecting malicious commands or manipulating measurement values. We additionally test five recently-published intrusion detection systems on Sherlock, highlighting specific challenges for intrusion detection in power grids. Dataset and documentation are available at https://sherlock.wattson.it/.