IWSPA '25: Proceedings of the 10th ACM International Workshop on Security and Privacy Analytics

SESSION: Session 1: APT Detection (keynote)

Real-time Analytics for APT Detection and Threat Hunting Using Cyber-threat Intelligence and Provenance Graphs

The persistent and stealthy nature of Advanced Persistent Threats (APTs) poses a significant challenge to enterprise security. Traditional detection mechanisms often fall short in identifying coordinated multi-step attacks or leveraging the rich context available in Cyber Threat Intelligence (CTI).

The three works presented tackle this problem from complementary angles -- real-time detection, correlation-based threat hunting, and automated intelligence extraction. A unifying thread across them is a shared reliance on provenance graphs as a powerful abstraction for capturing and reasoning about complex attacker behavior. Together, the approaches form a cohesive ecosystem: Extractor extracts threat knowledge, POIROT hunts for manifestations of that knowledge, and HOLMES detects emergent threats in real time, all grounded in a common graph-based representation of system activity and threat behavior.

HOLMES introduces a real-time detection framework aimed at identifying the coordinated activities typical of APT campaigns. It does so by correlating suspicious information flows to generate a robust detection signal and constructing high-level provenance graphs that summarize attacker behavior for analyst response. Its evaluation shows high precision and low false alarm rates, supporting its applicability in live operational environments.
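As a rough illustration of the provenance-graph abstraction such systems build on (a sketch, not the HOLMES implementation; the entity names and audit events below are invented), the following Python snippet assembles a small provenance graph from hypothetical audit records and flags information flows that reach a sensitive sink.

```python
# Minimal sketch (not the HOLMES implementation): build a provenance graph
# from hypothetical audit events and flag flows reaching a sensitive sink.
import networkx as nx

# Each event: (source entity, relation, target entity) -- illustrative only.
audit_events = [
    ("firefox.exe", "forked", "dropper.exe"),
    ("dropper.exe", "wrote", "/tmp/payload"),
    ("/tmp/payload", "executed_as", "payload.exe"),
    ("payload.exe", "read", "/etc/passwd"),
    ("payload.exe", "connected_to", "10.0.0.99:443"),
]

G = nx.DiGraph()
for src, rel, dst in audit_events:
    G.add_edge(src, dst, relation=rel)

# Flag any path from an untrusted origin to a sensitive sink.
untrusted, sinks = "firefox.exe", ["/etc/passwd", "10.0.0.99:443"]
for sink in sinks:
    if nx.has_path(G, untrusted, sink):
        path = nx.shortest_path(G, untrusted, sink)
        print("suspicious flow:", " -> ".join(path))
```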

POIROT builds on the growing use of CTI standards by actively leveraging the relationships between indicators—often underused in practice—for threat hunting. It treats the problem as a graph pattern matching task, aligning CTI-derived graphs with system-level provenance data obtained from kernel audits. Its novel similarity metric enables efficient search through massive graphs, revealing APT traces within minutes and demonstrating the operational utility of CTI relationship data.
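The graph pattern matching idea can be sketched in a few lines. The snippet below scores a hypothesized alignment between a CTI-derived query graph and a kernel-audit provenance graph by the fraction of query edges preserved as directed paths; this is a deliberate simplification, not POIROT's actual alignment metric, and both graphs are invented for illustration.

```python
# Simplified sketch of CTI-query-to-provenance graph matching.
# This is NOT POIROT's alignment metric; it scores a candidate node
# alignment by the fraction of query edges preserved in the provenance graph.
import networkx as nx

def alignment_score(query: nx.DiGraph, prov: nx.DiGraph, mapping: dict) -> float:
    """mapping: query node -> provenance node (a hypothesized alignment)."""
    preserved = 0
    for u, v in query.edges():
        pu, pv = mapping.get(u), mapping.get(v)
        # Count the query edge as preserved if the provenance graph
        # connects the aligned nodes via some directed path.
        if pu in prov and pv in prov and nx.has_path(prov, pu, pv):
            preserved += 1
    return preserved / max(query.number_of_edges(), 1)

query = nx.DiGraph([("malicious_doc", "dropper"), ("dropper", "c2_server")])
prov = nx.DiGraph([("report.doc", "winword.exe"), ("winword.exe", "drop.exe"),
                   ("drop.exe", "203.0.113.5")])
mapping = {"malicious_doc": "report.doc", "dropper": "drop.exe",
           "c2_server": "203.0.113.5"}
print(alignment_score(query, prov, mapping))  # 1.0: all query edges preserved
```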

Extractor addresses the challenge of unstructured CTI reports by automatically transforming them into structured, machine-usable provenance graphs. Without requiring strict assumptions about the input text, it extracts concise behavioral indicators that can be fed into threat-hunting tools like POIROT, bridging the gap between raw intelligence and analytical application.

Together, these systems represent a shift toward graph-based, intelligence-driven detection and response. They emphasize the value of integrating real-time monitoring with structured threat intelligence and automation, setting the stage for more adaptive and effective cybersecurity operations.

SESSION: Session 2: LLMs for Security and Privacy

Enhancing Security Insights with KnowGen-RAG: Combining Knowledge Graphs, LLMs, and Multimodal Interpretability

We present KnowGen-RAG, a hybrid Retrieval-Augmented Generation (RAG) framework that integrates knowledge graphs of entities and relationships with LLM-based natural language generation for application security, privacy, and compliance. The framework aims to enhance the accuracy and relevance of retrieved information in order to produce more context-aware and actionable security recommendations, identify potential privacy risks, and detect vulnerabilities by leveraging structured knowledge about entities (e.g., access control mechanisms, security policies, privacy-preserving algorithms, protocols, software vulnerabilities) and their inter-relationships in the security context. We also extend the multimodal-LLM interpretability paradigm by generating contextual explanations for equations and tables drawn from unstructured, highly technical documents. KnowGen-RAG proves superior for security-related information retrieval and contextual reasoning, significantly outperforming both the LLM's output without RAG and Baseline RAG in precision and reliability. Specifically, KnowGen-RAG increases accuracy on the CyberMetric dataset, where the original approach struggled to perform consistently for lightweight LLMs and Baseline RAG achieved only marginal improvements. Additionally, KnowGen-RAG improves answer quality on our curated security dataset, SecMD, demonstrating its effectiveness and improved understanding of security-related techniques and digital artifacts when addressing complex questions. The system aims to strengthen the learning of security professionals by providing thorough insights into the security landscape, encouraging informed decision-making in the face of sophisticated challenges.
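To make the knowledge-graph-augmented retrieval idea concrete, here is a toy Python sketch (not the KnowGen-RAG implementation; the triples and the matching rule are invented for illustration) that retrieves security triples relevant to a question and folds them into an LLM prompt.

```python
# Toy sketch of knowledge-graph-augmented retrieval for an LLM prompt.
# Not the KnowGen-RAG implementation; the triples are invented examples.
knowledge_graph = [
    ("TLS 1.3", "mitigates", "downgrade attacks"),
    ("TLS 1.3", "removes", "RSA key exchange"),
    ("HSTS", "enforces", "HTTPS-only connections"),
]

def retrieve_triples(question: str, kg):
    """Return triples whose subject or object appears in the question."""
    q = question.lower()
    return [t for t in kg if t[0].lower() in q or t[2].lower() in q]

def build_prompt(question: str, kg) -> str:
    facts = "\n".join(f"- {s} {p} {o}" for s, p, o in retrieve_triples(question, kg))
    return (f"Use the following facts from the security knowledge graph:\n"
            f"{facts}\n\nQuestion: {question}\nAnswer:")

print(build_prompt("Why should we prefer TLS 1.3 for our API gateway?", knowledge_graph))
```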

Generative AI vs. Human Deception: A Comparative Analysis of ChatGPT, Gemini, and Human-Generated Disinformation

Large Language Models (LLMs) are powerful tools for generating human-like text, yet their potential to produce and automate disinformation poses significant challenges. This study investigates the characteristics and detectability of disinformation created by both humans and generative AI, using LLMs such as ChatGPT and Google's Gemini. We began with a dataset of verified health statements from peer-reviewed journals, which we then paired with falsified statements generated by both humans and AI models using structured prompts. This approach allowed us to compare the stylistic features of AI-generated and human-generated health disinformation. Using machine learning models, we assessed the accuracy of detecting disinformation generated by humans and by AI based on features of the verified and falsified statements. We investigated how stylistic, readability, sentiment, and stance features differentiate AI-crafted from human-crafted disinformation. Our findings reveal that AI-crafted disinformation exhibits predictable stylistic and structural features, whereas human-crafted disinformation is harder for machine learning classifiers to detect. Our results emphasize the need for customized disinformation detection approaches that account for the differences between disinformation generated by LLMs and that produced by humans.
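A minimal sketch of feature-based detection along these lines, with invented texts, toy labels, and a hand-crafted feature set rather than the paper's features or data, might look as follows.

```python
# Minimal sketch of feature-based disinformation detection (not the paper's
# feature set or data): a few hand-crafted stylistic features feed a
# logistic regression classifier.
import numpy as np
from sklearn.linear_model import LogisticRegression

def stylistic_features(text: str) -> list[float]:
    words = text.split()
    sentences = [s for s in text.split(".") if s.strip()]
    return [
        len(words),                                            # length
        float(np.mean([len(w) for w in words])),               # mean word length
        len(words) / max(len(sentences), 1),                   # words per sentence
        sum(w.isupper() for w in words) / max(len(words), 1),  # all-caps ratio
    ]

texts = ["Vaccines undergo multi-phase clinical trials before approval.",
         "SHOCKING: doctors HIDE the one cure THEY don't want you to see!",
         "Regular exercise is associated with lower cardiovascular risk.",
         "Miracle herb reverses diabetes overnight, experts silenced!"]
labels = [0, 1, 0, 1]  # 0 = verified, 1 = falsified (toy labels)

X = np.array([stylistic_features(t) for t in texts])
clf = LogisticRegression().fit(X, labels)
print(clf.predict(X))
```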

Exploring Prompt Patterns for Effective Vulnerability Repair in Real-World Code by Large Language Models

Large Language Models (LLMs) have shown promise in automating code vulnerability repair, but their effectiveness in handling real-world code remains limited. This paper investigates the capability of LLMs in repairing vulnerabilities and proposes a systematic approach to enhance their performance through specialized prompt engineering. Through extensive evaluation of 5,826 code samples, we found that while LLMs successfully repair vulnerabilities in simple cases, they struggle with complex real-world code that involves intricate dependencies, contextual requirements, and multi-file interactions. To address these limitations, we first incorporated Control Flow Graphs (CFGs) as supplementary prompts, achieving a 14.4% success rate in fixing previously unresolvable vulnerabilities. Through analysis of repair failures, we identified three primary challenge categories and developed corresponding prompt patterns incorporating techniques such as granular contextual information provision and progressive code simplification. Evaluation on real-world projects demonstrated that our approach significantly improved LLMs' repair capabilities, achieving success rates above 85% across all identified challenge categories. Our findings suggest that while LLMs have inherent limitations in handling complex vulnerabilities independently, they can become effective tools for automated vulnerability repair when guided by carefully crafted prompts.
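The CFG-as-supplementary-prompt idea can be illustrated with a short sketch; the prompt wording, code snippet, and CFG edges below are hypothetical and do not reproduce the paper's prompt patterns.

```python
# Sketch of a CFG-augmented repair prompt (illustrative only; not the paper's
# exact prompt pattern). The CFG edges here are a hypothetical summary that
# would normally come from static analysis of the vulnerable function.
vulnerable_code = """
char buf[16];
strcpy(buf, user_input);   /* CWE-121: stack-based buffer overflow */
"""

cfg_edges = [
    ("entry", "declare buf[16]"),
    ("declare buf[16]", "strcpy(buf, user_input)"),
    ("strcpy(buf, user_input)", "return"),
]

def build_repair_prompt(code: str, cfg) -> str:
    cfg_text = "\n".join(f"{src} -> {dst}" for src, dst in cfg)
    return (
        "You are repairing a security vulnerability.\n"
        "Control-flow graph of the affected function:\n"
        f"{cfg_text}\n"
        "Vulnerable code:\n"
        f"{code}\n"
        "Rewrite only the vulnerable statement, preserving behavior, and "
        "explain which bound check you added."
    )

print(build_repair_prompt(vulnerable_code, cfg_edges))
```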

SESSION: Session 3: Security of LLMs (tutorial)

LLMs Under Attack: Understanding the Adversarial Mindset

With Large Language Models (LLMs) powering critical applications, adversarial threats present urgent challenges to their safety and reliability. This tutorial explores adversarial threats against LLMs by covering foundational concepts, identifying key security implications, examining specific attack vectors (such as data poisoning, evasion techniques, and prompt-engineering vulnerabilities), and highlighting LLMs' dual roles as both targets and enablers of malicious activity. We critically assess current defensive approaches, discuss recent criticisms regarding detection reliability and ethical considerations, and outline key open research challenges. Attendees will gain practical insights into anticipating and mitigating adversarial threats to secure the deployment and application of LLM systems.

SESSION: Session 4: Blockchains for Security (keynote)

Secure and attack-resilient Release of Timed Information using Blockchains

With recent advancements in data-oriented technologies, we are witnessing a huge amount of data exchange on the Internet. Timed release of information refers to data security primitives that protect data until a prescribed point in time, after which the data is made available to the recipient. Numerous applications require timed release of information. Secure auctions require the bidding information to be protected until all bids arrive, to ensure the integrity of the auction. Similarly, timed release of information plays an important role in secure voting schemes, where all votes need to arrive before the individual votes are made available for counting. Current implementations of timed data release are heavily centralized, involving a single point of trust and control. Such centralized schemes are prone to insider threats, external malware, and denial-of-service attacks even when the service provider offering the service is trustworthy. These concerns with centralized systems are reflected in a recent, significant shift towards decentralized peer-to-peer and blockchain-enabled technologies that can enable decentralized protection of data integrity while ensuring data transparency. This talk will present our research on a special class of data infrastructures, called self-emerging data infrastructures, that use blockchain platforms to support timed release of data based on a novel decentralized data release model. The self-emerging data release system keeps the protected data secure and undiscovered, so that it is unavailable prior to the legitimate release time and appears automatically at the release time. Our research has developed a highly distributed solution for building decentralized timed release of data using blockchain-based smart contracts that provide guaranteed protection against adversarial attempts to access the data before the release time. A key aspect of this model is a suite of novel secure data management techniques that allow data to be published at a future time point without a single point of trust (i.e., the data becomes self-emerging at the release time; prior to the release time, it remains undiscovered and unavailable). The timed-release mechanisms route the protected data within the self-emerging data release system in a deterministically pseudo-random manner, enabling it to appear automatically at the release time. Our research also augments the timed data release techniques with mechanisms for decentralized timed transactions that enable scheduling of transaction functions without revealing the function inputs prior to the execution time. Prior cryptographic solutions, known as Timed-Release Encryption (TRE), have tackled the timed data release problem in two ways: using time-lock puzzles and using third-party time servers. The time server used in these approaches is typically a single point of trust and becomes a security bottleneck for the overall system. Furthermore, solving a time-lock puzzle for each timed data release is not only computationally expensive but also not scalable to a large-scale data infrastructure such as the one considered in our research. In contrast, the self-emerging data infrastructure developed in our research does not involve a central point of trust and is highly scalable, involving only a modest computational cost compared to the cryptographic solutions.
The choice of blockchain platforms as the underlying data infrastructure network in the proposed approach is motivated by the fact that blockchain networks are large-scale, massively distributed systems that make complete decentralization possible, and they are inherently designed to be reliable and robust to failures. This talk will first introduce our research on decentralized infrastructures that use blockchain platforms to support timed release of data. We will discuss how our techniques provide a highly distributed solution for building decentralized timed release of data using blockchain-based smart contracts that provide guaranteed protection against adversarial attacks aimed at obtaining access to the data before the release time. Specifically, we will illustrate how our protocols guarantee provable resilience of timed data release against both rational and malicious adversaries. We will also review our techniques for enhancing the reliability of timed data release and for supporting dynamic control of data. Finally, we will provide some insights into our techniques for facilitating decentralized timed transactions, which allow transaction functions to be scheduled such that their inputs are revealed only at execution time.
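For contrast with the decentralized approach described in the talk, the toy Python sketch below shows the classical time-lock-puzzle idea mentioned above (repeated squaring modulo an RSA-style composite): recovering the data early requires performing the sequential squarings, while the puzzle creator can take a shortcut using the factorization. The parameters are tiny and purely illustrative.

```python
# Toy time-lock puzzle sketch (the cryptographic prior approach the talk
# contrasts with, not the blockchain-based scheme). Parameters are tiny and
# purely illustrative; real puzzles use a large RSA modulus and a huge t.
p, q = 1000003, 1000033            # small primes, for illustration only
n, phi = p * q, (p - 1) * (q - 1)
t = 100_000                        # number of sequential squarings required

secret = 424242                    # the "data" to be released after time t
# The puzzle creator knows phi, so it can shortcut the exponent 2**t mod phi.
key = pow(2, pow(2, t, phi), n)
puzzle = (secret + key) % n        # blind the secret with the key

# The solver must perform t sequential squarings; no known shortcut without phi.
x = 2
for _ in range(t):
    x = (x * x) % n
print((puzzle - x) % n == secret)  # True: the data "emerges" only after the work
```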

SESSION: Session 5: Adversarial Attacks and Verification of AI pipelines

Poisoning Attacks against Quantile L1 Regression in CPS Anomaly Detection Frameworks

Data poisoning attacks pose a significant threat to learning-based anomaly detection methods applied in cyber-physical systems (CPS). Quantile L1 regression learning has proven beneficial for learning thresholds in anomaly detection frameworks, especially in smart living CPS. This paper addresses the need to quantify how vulnerable anomaly detection methods that use quantile L1 regression learning are to data poisoning attacks. We introduce a novel poisoning attack strategy against quantile L1 regression that minimizes the perturbation incurred per training point and adapts seamlessly to varying levels of training data access and to adversaries' differing stealth preferences. To quantitatively evaluate our strategy, we introduce metrics such as Perceptibility, which measures stealthiness. Furthermore, we conduct a comprehensive evaluation of this vulnerability with a state-of-the-art anomaly detection framework that uses quantile L1 regression on two proof-of-concept CPS (smart metering and smart transportation). Experimental results demonstrate enhanced stealthiness and efficiency compared to gradient-based poisoning strategies. Additionally, we quantify the impact of poisoning attacks on CPS and propose mitigation strategies to bolster resilience against such threats.
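For readers unfamiliar with the quantile L1 (pinball) loss, the sketch below shows the loss, the threshold it induces for a constant predictor, and a naive upward shift of a few training points that inflates the learned threshold. This is only an illustration of the attack surface, not the paper's poisoning strategy or evaluation; the data are synthetic.

```python
# Sketch of the quantile (pinball) loss underlying quantile L1 regression
# thresholds, plus a naive illustration of how shifting a few training points
# moves the learned quantile. This is NOT the paper's poisoning strategy.
import numpy as np

def pinball_loss(y, y_hat, tau):
    """Quantile loss: tau-weighted L1 penalty, asymmetric around the quantile."""
    diff = y - y_hat
    return np.mean(np.maximum(tau * diff, (tau - 1) * diff))

def fit_constant_quantile(y, tau):
    """For a constant predictor, the pinball-loss minimizer is the tau-quantile."""
    return np.quantile(y, tau)

rng = np.random.default_rng(0)
clean = rng.normal(loc=10.0, scale=1.0, size=500)   # e.g. synthetic sensor readings
tau = 0.95
print("clean threshold:", fit_constant_quantile(clean, tau))

# Naive poisoning: the attacker nudges 5% of accessible points upward,
# inflating the learned anomaly threshold so later attacks slip underneath it.
poisoned = clean.copy()
idx = rng.choice(len(poisoned), size=25, replace=False)
poisoned[idx] += 3.0                                  # small, low-perceptibility shift
print("poisoned threshold:", fit_constant_quantile(poisoned, tau))
```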

A Framework for Cryptographic Verifiability of End-to-End AI Pipelines

The increasing integration of Artificial Intelligence across sectors necessitates robust mechanisms for ensuring transparency, trust, and auditability of its development and deployment. This is particularly important in light of recent calls in various jurisdictions to introduce regulation on AI safety. We propose a framework for complete verifiable AI pipelines, identifying key components and analysing existing cryptographic approaches that contribute to verifiability across different stages of the AI lifecycle, from data sourcing to training, inference, and unlearning. This framework could be used to combat misinformation by providing cryptographic proofs alongside AI-generated assets to allow downstream verification of their provenance and correctness. Our findings underscore the importance of ongoing research to develop cryptographic tools that are not only efficient for isolated AI processes, but that are efficiently 'linkable' across different processes within the AI pipeline, to support the development of end-to-end verifiable AI technologies.
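As a much weaker but concrete illustration of linkable records across pipeline stages (a sketch under simplifying assumptions, not the proposed framework, which targets full cryptographic proofs of each stage), the snippet below chains SHA-256 digests of stage artifacts so that a verifier can detect tampering with any stage's record.

```python
# Minimal sketch of artifact provenance via a hash chain over pipeline stages.
# This only illustrates linkable, checkable records; the framework in the paper
# concerns far stronger cryptographic proofs (e.g., of training and inference),
# which a bare hash chain does not provide.
import hashlib, json

def stage_record(prev_digest: str, stage: str, artifact: bytes) -> dict:
    payload = {
        "prev": prev_digest,
        "stage": stage,
        "artifact_sha256": hashlib.sha256(artifact).hexdigest(),
    }
    payload["record_sha256"] = hashlib.sha256(
        json.dumps(payload, sort_keys=True).encode()).hexdigest()
    return payload

chain, prev = [], "0" * 64
for stage, artifact in [("data_sourcing", b"raw dataset bytes"),
                        ("training", b"model weights bytes"),
                        ("inference", b"generated asset bytes")]:
    rec = stage_record(prev, stage, artifact)
    chain.append(rec)
    prev = rec["record_sha256"]

def verify(chain) -> bool:
    """Recompute every link; any tampered stage breaks the chain."""
    prev = "0" * 64
    for rec in chain:
        body = {k: rec[k] for k in ("prev", "stage", "artifact_sha256")}
        digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        if rec["prev"] != prev or digest != rec["record_sha256"]:
            return False
        prev = rec["record_sha256"]
    return True

print(verify(chain))  # True
```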

SESSION: Session 6: Analytics (tutorial)

Sparse Moving Averages for Lifelong Open-Ended Probabilistic Prediction

In this tutorial we give an overview of our work on online probabilistic multiclass prediction in non-stationary, open-ended settings. We motivate the task within the context of Prediction Games, in which the use of a growing set of concepts that predict one another creates non-stationarities. The techniques developed are applicable to other changing domains, such as personalized assistants, and more generally whenever autonomous continual learning is required or desired, for instance when privacy and security are constraints and we seek to ship products that continually learn and adapt on their own, i.e., without needing access to a central shared (training) data repository. We introduce Sparse Moving Averages (SMAs), including adaptations of the sparse exponential moving average (EMA) as well as queue-based methods with dynamic per-item histories. For performance evaluation, we develop a bounded version of log-loss to handle new items. Our findings, on a range of synthetic and real data streams, show that dynamic predictand-specific learning rates enhance both convergence speed and stability.
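A simplified sketch of a sparse moving average over an open-ended item set is shown below; it keeps probability mass only for items seen so far and shrinks each item's learning rate with its observation count. It illustrates the flavor of the approach, not the authors' SMA algorithms, and it omits the bounded log-loss evaluation.

```python
# Simplified sketch of a sparse moving average over an open-ended item set.
# Illustration only, not the authors' SMA algorithms.
from collections import defaultdict

class SparseEMA:
    def __init__(self, min_rate=0.01):
        self.weights = defaultdict(float)   # unseen items implicitly 0
        self.counts = defaultdict(int)
        self.min_rate = min_rate

    def update(self, observed_item):
        self.counts[observed_item] += 1
        # Dynamic learning rate: fast for newly seen items, slower for stable ones.
        rate = max(1.0 / self.counts[observed_item], self.min_rate)
        for item in list(self.weights):
            self.weights[item] *= (1.0 - rate)
        self.weights[observed_item] += rate

    def predict(self):
        total = sum(self.weights.values()) or 1.0
        return {item: w / total for item, w in self.weights.items()}

sma = SparseEMA()
for item in ["coffee", "coffee", "tea", "coffee", "scone", "tea"]:
    sma.update(item)
print(sma.predict())
```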

SESSION: Session 7: Security and Privacy of AI Robots and Data

Analyzing Security & Privacy (S&P) Labels For AI Integrated Social Robots: A Novel User Assessment Based Research Study

Recent times have seen the emergence of governance, risk, and compliance (GRC) requirements for human-centric, artificial intelligence (AI) integrated consumer technology. Currently, there is strong advocacy for AI-enabled consumer technology devices to adhere to security and privacy requirements as part of the humane AI characteristics connected to ethically aligned, trustworthy AI. These security and privacy requirements apply to AI-integrated tech devices, such as social robots, which fall under the class of autonomous intelligent systems. This paper begins by analyzing multiple designs of security and privacy (S&P) labels that are relevant to the explainability of tech products, such as social robots, which come with AI features and functionalities. It closely inspects the novel S&P label designs of three social robots, namely the Zümi, Cozmo, and Finch robots, and examines how these S&P labels communicate security- and privacy-related descriptors and attributes to device users and stakeholders. The main contribution of this paper is a novel user assessment study, which we conduct to determine the practicality, utility, and potential value of these S&P labels. The focus of this work is to validate the design, relevance, and socio-technical prospects of these S&P labels. As part of our preliminary user assessment study, we collected data from 59 users and stakeholders about the design, utility, and other prospective values of these S&P labels. In this paper, we share and discuss the research data gathered through our user study. To our knowledge, this is a first-of-its-kind user assessment study of S&P labels for AI-enabled tech products, such as social robots, exploring opinions, perceptions, and feedback from consumers and manufacturers who are users and stakeholders of AI consumer technology devices like social robots.

Privacy Risks and Protections in 2021 Texas State Abortion Data

Before it was overturned by the Supreme Court of the United States in June 2022, Roe v. Wade had protected a woman's constitutional right to an abortion. Even before June 2022, Texas maintained some of the strictest abortion laws in the country, which have been bolstered by recent state legislation. The promulgation of Senate Bill 8 in 2021, for example, introduced financial incentives for individuals to report those seeking abortion-related care and has stoked fears of abortion vigilantism and surveillance. To understand the privacy risks care-seeking patients in Texas may face, this paper quantifies the extent to which individual privacy may be undermined through race and age information available at the county level in publicly available state records on induced terminations of pregnancy. It finds pronounced identity and membership disclosure risks. For example, because there is a single estimated Asian woman of reproductive age in Lamb County and a single record of an abortion by an Asian woman in Lamb County in that same year, this woman is exposed to certain (i.e., 100%) identity disclosure risk. It finds strong evidence, via k-Anonymity and δ-Presence analyses, that women in Texas are afforded varying levels of privacy protection. In response to these risks, we evaluate generalization-based methods to balance the utility of these data disclosures against stronger privacy protections for the published data.
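The equivalence-class analysis behind such disclosure-risk findings can be sketched briefly; the records below are invented (they are not the Texas data), and the 1/size risk figure is the usual simplification for identity disclosure within an equivalence class.

```python
# Toy sketch of a k-anonymity check over quasi-identifiers with invented
# records; it mirrors the kind of equivalence-class analysis described above
# but is not the paper's data or methodology.
from collections import Counter

k = 3
records = [
    ("Lamb",   "Asian", "25-29"),
    ("Harris", "White", "25-29"),
    ("Harris", "White", "25-29"),
    ("Harris", "White", "25-29"),
    ("Travis", "Black", "30-34"),
    ("Travis", "Black", "30-34"),
]

class_sizes = Counter(records)
for quasi_id, size in class_sizes.items():
    if size < k:
        # Any record in a class smaller than k carries elevated
        # identity-disclosure risk (size 1 means unique re-identification).
        print(f"at-risk class {quasi_id}: {size} record(s), risk = {1/size:.0%}")
```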