By: Nikhil Kassetty
One of the most difficult challenges in present-day finance is real-time fraud detection. Payments across cards, digital wallets, neobanks, BNPL platforms, and instant P2P rails authorize and settle in milliseconds. Fraudsters have adapted to this world: they no longer rely on a single high-value fraudulent payment that is easy to identify; they organize rings of mule accounts, shell merchants, disposable devices, and rotating IPs. Attacks can be dispersed across large numbers of small transactions that are harmless on their own but, when aggregated, form a coordinated campaign.
The Limits of Isolated, Per-Transaction Models
Traditional rule-based systems and even state-of-the-art ML models that score each transaction in isolation are increasingly being outmaneuvered. Fraud is, at the fundamental level, relational. A payment is not just a sum of money moving between a card and a merchant; it is embedded in a graph of relationships among people, accounts, devices, IP addresses, merchants, and time-varying behaviors. This is the intuition behind applying graph AI to fraud detection. Instead of treating a transaction as a single row of features, we model the ecosystem as a graph: customers are connected to their cards and accounts; merchants and terminals are connected to many accounts; devices are connected to many accounts; IP addresses are connected to devices; emails and phone numbers are connected to identities; and transactions form edges between these entities.

In this graph, fraud rings show up as recognizable structures: clusters of accounts sharing devices and IP addresses, networks of newly created accounts trading with the same group of friendly merchants, or subgraphs that interact with the rest of the network only to cash out. Graph neural networks (GNNs) are well suited to this world because they can learn from both structure and attributes. A conventional model would only see that at 2:03 AM on a certain day in Country X, Card A paid Merchant B 120 dollars. A GNN-augmented system may also observe that the device used in this transaction has been seen 20 times before, with 3 accounts already known to be fraudulent, and that this merchant belongs to a tight community of merchants with unusually high chargeback rates. During training, the GNN propagates fraud-related signals across the graph and adjusts the representations of nodes and edges so that entities that are similar in their relational context end up close to one another in the learned embedding space.
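To make the relational framing concrete, here is a minimal sketch of a heterogeneous payment graph built from plain Python dictionaries, with one simple graph signal: how many known-fraud accounts share a device with a given account. All account and device names are illustrative, not from any real system.

```python
from collections import defaultdict

# Illustrative edges of a heterogeneous payment graph:
# account -> devices it has used, and device -> accounts seen on it.
account_devices = defaultdict(set)
device_accounts = defaultdict(set)

def link(account: str, device: str) -> None:
    """Record that an account was observed on a device."""
    account_devices[account].add(device)
    device_accounts[device].add(account)

# Hypothetical observations: acct_1..acct_3 share device dev_A.
link("acct_1", "dev_A")
link("acct_2", "dev_A")
link("acct_3", "dev_A")
link("acct_4", "dev_B")

known_fraud = {"acct_2", "acct_3"}  # labels from past chargebacks

def fraud_neighbors_via_device(account: str) -> int:
    """Count distinct known-fraud accounts sharing any device with `account`."""
    neighbors = set()
    for device in account_devices[account]:
        neighbors |= device_accounts[device]
    neighbors.discard(account)
    return len(neighbors & known_fraud)

print(fraud_neighbors_via_device("acct_1"))  # 2: shares dev_A with two fraud accounts
print(fraud_neighbors_via_device("acct_4"))  # 0: its device is isolated
```

Even this handcrafted count is a useful feature; GNNs generalize the idea by learning which relational patterns matter instead of hard-coding them.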
GraphSAGE, graph attention networks, and relational GCNs are common architecture choices because they handle large, evolving, heterogeneous graphs and generalize to new nodes as the network changes.
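As a rough illustration of the idea behind GraphSAGE-style layers (a sketch, not a production implementation), a single layer mean-aggregates neighbor features and combines them with the node's own features through weight matrices that would be learned in practice; the graph and weights below are made-up toy values:

```python
import numpy as np

def sage_layer(h, neighbors, W_self, W_neigh):
    """One GraphSAGE-style layer with mean aggregation.

    h:         (num_nodes, d_in) node feature matrix
    neighbors: dict node_id -> list of neighbor node_ids
    W_self, W_neigh: (d_in, d_out) weight matrices (learned in practice)
    """
    out = np.zeros((h.shape[0], W_self.shape[1]))
    for v in range(h.shape[0]):
        nbrs = neighbors.get(v, [])
        agg = h[nbrs].mean(axis=0) if nbrs else np.zeros(h.shape[1])
        out[v] = np.maximum(0.0, h[v] @ W_self + agg @ W_neigh)  # ReLU
    # L2-normalize the embeddings, as in the original GraphSAGE formulation
    norms = np.linalg.norm(out, axis=1, keepdims=True)
    return out / np.clip(norms, 1e-12, None)

# Toy graph: node 0 connected to nodes 1 and 2; node 3 is isolated.
rng = np.random.default_rng(0)
h = rng.normal(size=(4, 8))
neighbors = {0: [1, 2], 1: [0], 2: [0], 3: []}
W_self = rng.normal(size=(8, 4))
W_neigh = rng.normal(size=(8, 4))
emb = sage_layer(h, neighbors, W_self, W_neigh)
print(emb.shape)  # (4, 4)
```

Stacking such layers lets information flow multiple hops, which is how a device's fraud history can influence the embedding of an account two edges away.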
In practice, a payments company or neobank maintains both a streaming event pipeline and a near-real-time view of its graph, which captures new transactions, logins, KYC events, and chargebacks as they arrive. Entity embeddings for accounts, devices, merchants, and IPs are retrained periodically or updated via streaming. When a transaction request arrives, the fraud engine looks up the relevant embeddings, combines them with traditional attributes such as amount, MCC code, geography, channel, and recent velocity, and passes the result to a classifier. The classifier outputs a risk score, which business policy translates into an action: approve immediately, escalate, or reject. Nonetheless, even the most advanced graph model is trained on past information. Fraud labels are scarce, heavily imbalanced, and often lag reality.
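The decision-path logic just described can be sketched as a lookup-and-score function; the feature store contents, embedding values, and classifier weights below are hypothetical stand-ins for components that would be trained and populated offline:

```python
import math

# Precomputed embeddings, refreshed offline and served from a feature store.
embedding_store = {
    ("account", "acct_42"): [0.12, -0.80, 0.33],
    ("merchant", "merch_7"): [0.55, 0.10, -0.21],
}

# Stand-in linear classifier: weights over [amount_norm, *acct_emb, *merch_emb].
weights = [0.9, 0.4, -1.2, 0.3, 1.1, 0.2, -0.5]
bias = -1.0

def risk_score(account_id: str, merchant_id: str, amount: float) -> float:
    """Combine graph embeddings with classic transaction features and score."""
    acct = embedding_store[("account", account_id)]
    merch = embedding_store[("merchant", merchant_id)]
    features = [amount / 1000.0] + acct + merch
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid -> probability-like score

score = risk_score("acct_42", "merch_7", amount=120.0)
print(round(score, 3))
```

In production the linear model would typically be a gradient-boosted tree or neural classifier, but the shape of the decision path, fast lookups followed by lightweight scoring, stays the same.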
Attack patterns also change: as soon as a specific scheme is identified and blocked, fraudsters switch to new tactics. This is where generative AI can complement graph AI. Generative models can produce realistic synthetic fraud data and simulate attack scenarios that have not yet occurred in production. On structured transaction data, generative models such as GANs or variational autoencoders can be trained to learn the joint distribution of features conditioned on the fraud label. After training, they can create new synthetic fraudulent records that appear statistically realistic but are not copies of individual customers. Sequence models can generate full account histories, from onboarding and device binding through initial transactions, small test payments, and ultimately aggressive cash-out behavior. In the graph domain, generative graph models can produce clusters of mule accounts, clusters of shared devices and IP addresses, and their networks with collusive merchants. Generative AI also makes it easier to orchestrate and create red-team scenarios.
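As a deliberately simplified stand-in for a conditional GAN or VAE, the sketch below fits a per-class Gaussian to a handful of made-up fraud-class transaction features and samples new synthetic fraud records conditioned on that label; a real system would use a learned deep generative model, but the interface, fit on labeled data and then sample, is the same:

```python
import numpy as np

rng = np.random.default_rng(7)

# Made-up historical fraud-class features: [amount, hour_of_day, txns_last_24h]
fraud_rows = np.array([
    [480.0, 2.0, 9.0],
    [520.0, 3.0, 11.0],
    [450.0, 1.0, 8.0],
    [600.0, 4.0, 12.0],
])

# "Fit" the conditional model: mean and covariance of the fraud class.
mu = fraud_rows.mean(axis=0)
cov = np.cov(fraud_rows, rowvar=False)

# Sample new synthetic fraud records from N(mu, cov) to rebalance training data.
synthetic = rng.multivariate_normal(mu, cov, size=100)
print(synthetic.shape)  # (100, 3)
```

The synthetic rows match the fraud class statistically without replicating any single historical record, which is the property the privacy discussion below depends on.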
A fraud analyst can describe an attack in natural language, such as a slow-burning collusive merchant scheme in which a fraudulent merchant earns trust over time through low-value transactions before suddenly increasing ticket sizes and funneling money to mule networks, and an LLM can help simulate it. The model can suggest how many accounts and devices to add, how long the scenario should run, how transaction amounts should evolve, and even produce code or queries to generate the data. This synthetic world can then be fed into the fraud engine to measure how the existing rules and models respond: how many events are detected, at which points, and where the system's weak spots lie. Over time, the system can be adversarially trained on these new synthetic attacks; the models are updated to recognize those patterns, creating a more robust defense. Naturally, any application of generative models in a financial setting must be governed carefully. Synthetic data should be generated in a privacy-preserving way, using de-identification and mitigations against memorizing personal records. Its distribution must be monitored so it does not distort vital operational statistics such as amount distributions, geographic mix, and channel usage. And synthetic samples must be labeled consistently with their behavioral narratives, or models will learn to identify synthetic artifacts rather than meaningful indicators of fraud.
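The slow-burning collusive merchant narrative can be turned into replayable synthetic events with a small simulator like this; all parameters, such as the trust-building period and amount ranges, are illustrative choices an analyst or LLM might propose:

```python
import random

def simulate_collusive_merchant(days=90, trust_days=60, seed=42):
    """Generate a synthetic slow-burn scenario: small legitimate-looking
    tickets during the trust-building period, then a sudden cash-out spike."""
    rng = random.Random(seed)
    events = []
    for day in range(days):
        if day < trust_days:
            # Trust-building phase: a few low-value sales per day.
            for _ in range(rng.randint(1, 3)):
                events.append({"day": day, "type": "sale",
                               "amount": round(rng.uniform(5, 40), 2)})
        else:
            # Cash-out phase: ticket sizes jump and mule transfers appear.
            for _ in range(rng.randint(5, 10)):
                events.append({"day": day, "type": "sale",
                               "amount": round(rng.uniform(300, 900), 2)})
            events.append({"day": day, "type": "transfer_to_mule",
                           "amount": round(rng.uniform(1000, 5000), 2)})
    return events

events = simulate_collusive_merchant()
early = [e["amount"] for e in events if e["day"] < 60 and e["type"] == "sale"]
late = [e["amount"] for e in events if e["day"] >= 60 and e["type"] == "sale"]
print(max(early) < min(late))  # the cash-out spike is visible in the amounts
```

Replaying these events through the fraud engine shows which day, if any, the scenario would first trip a rule or model.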
Customer experience is the other half of the challenge. Users of digital wallets and neobanks expect near-instant approval to use their money, particularly for day-to-day payments. Fraud controls that frequently decline or scrutinize valid transactions can quickly frustrate and drive away customers. The goal is not only to catch more fraud but to do so without creating friction for good users. Current systems mitigate this with a risk-based orchestration layer on top of the raw fraud score. All transactions are rated using the combined intelligence of graph models, conventional ML, and business rules. Medium-risk events can trigger adaptive step-up actions, such as an in-app message, an OTP, or biometric authentication, when the transaction involves a new device, a sensitive merchant, or an unusual time. High-risk events, particularly those involving suspicious graph neighborhoods or known compromise indicators, can be rejected or flagged for manual investigation. Graph signals are especially useful for calibrating trust and suspicion. When an account sits in a healthy part of the graph, with long-lived, diverse, low-risk connections and stable device and IP behavior, the institution can afford to give it higher limits or fewer challenges. On the other hand, if an otherwise unremarkable transaction originates from a device linked to many chargebacks, or the account is closely tied to known mule clusters, even minor anomalies are treated with greater seriousness. Generative models can also help design and test new UX policies, and large language models can draft clear, empathetic notifications when a transaction is held or disputed, minimizing confusion and customer frustration. Satisfying user demand for instant payments requires the entire process to operate under very strict latency constraints.
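The step-up logic described above amounts to a policy over the risk score plus a few context flags; here is a minimal hypothetical version, with thresholds that would in reality be tuned per portfolio:

```python
def orchestrate(score: float, new_device: bool = False,
                in_mule_cluster: bool = False) -> str:
    """Map a fraud score plus graph/context signals to a business action.
    Thresholds are illustrative; real ones are tuned against loss targets."""
    if in_mule_cluster or score >= 0.9:
        return "DENY"        # suspicious graph neighborhood or very high risk
    if score >= 0.5 or (score >= 0.3 and new_device):
        return "STEP_UP"     # e.g., OTP or biometric challenge
    return "APPROVE"

print(orchestrate(0.1))                        # APPROVE
print(orchestrate(0.35, new_device=True))      # STEP_UP
print(orchestrate(0.2, in_mule_cluster=True))  # DENY
```

Notice that the graph context (`in_mule_cluster`, a hypothetical flag derived from graph signals) can override an otherwise low score, which is exactly the "minor anomaly, bad neighborhood" case described above.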
Fraud-scoring pipelines must respond in tens of milliseconds, which rules out heavy computation on the decision path. Instead, the system is designed so that expensive operations, such as GNN training and embedding computation, run in the background, with results stored in a low-latency feature store. At decision time, the fraud engine performs quick lookups and lightweight scoring, while deeper checks run asynchronously to inform subsequent transactions or trigger post-authorization monitoring. End to end, a modern fraud-detection architecture for a digital wallet or neobank typically includes a streaming event ingestion layer, a graph storage and computation layer, a feature store exposing graph-derived and conventional features, an ML modeling layer that fuses GNN embeddings with classifier models, and an orchestration layer that maps scores to business actions. Analyst tools provide a visual representation of the payment graph and fraud clusters, and a testing environment allows replaying synthetic and real-world attack patterns. Because payments is a regulated domain, governance and explainability are not optional. Regulators and internal risk committees will want to know why transactions were declined and whether models perform fairly across customer groups.
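The split between expensive background work and the fast decision path can be sketched as two functions sharing a cache; the refresh job below is a placeholder for GNN inference over the graph, and the account data and rule are hypothetical:

```python
import time

feature_store = {}  # low-latency cache, populated out of band

def refresh_embeddings() -> None:
    """Expensive background job (stand-in for GNN inference over the graph)."""
    time.sleep(0.05)  # simulate heavy computation kept off the decision path
    feature_store["acct_1"] = {"emb": [0.2, -0.4], "fraud_degree": 2}

def decide(account_id: str, amount: float) -> str:
    """Fast path: dictionary lookups and a trivial rule, no heavy compute."""
    feats = feature_store.get(account_id)
    if feats is None:
        return "STEP_UP"  # degrade safely when features are missing
    return "DENY" if feats["fraud_degree"] >= 2 and amount > 100 else "APPROVE"

refresh_embeddings()  # runs on a schedule or stream, never per request
start = time.perf_counter()
action = decide("acct_1", 120.0)
elapsed_ms = (time.perf_counter() - start) * 1000
print(action, f"{elapsed_ms:.3f} ms")  # the decision itself is sub-millisecond
```

The design choice worth noting is the fallback: when precomputed features are missing or stale, the policy degrades to a challenge rather than blocking the request on a slow computation.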
Graph-based systems should therefore be able to give humans understandable explanations, such as "the device used for this payment has previously been linked to a series of fraudulent accounts" or "this account's transaction pattern is abnormal given its lifecycle and peer profile." Post-hoc explainability techniques can translate the model's complex reasoning into auditable reason codes. Fairness checks, including on synthetic data pipelines, should run periodically to verify that neither the models nor the generated scenarios encode undesirable biases. For most organizations, an incremental roadmap is realistic. The first step is to construct a payment graph and derive simple handcrafted graph features, which are incorporated into existing fraud models to validate the uplift. The second step introduces GNN-based embeddings and evaluates improvements in metrics like AUC and recall at a fixed false-positive rate, along with reductions in chargebacks. Once the graph foundation is stable, generative data can be introduced in a controlled way to correct class imbalance and improve detection of rare fraud patterns, and a scenario-simulation capability can be built on generative AI. Finally, the rich signals from both the graph and generative components can power a mature risk orchestration layer that balances security and user experience. Real-time fraud detection for digital wallets and neobanks is ultimately a systems problem that cuts across data, modeling, infrastructure, product, and compliance. Graph AI lets institutions perceive fraud in its actual form: a networked, relational phenomenon. Generative AI offers the ability to model and simulate attacks, enabling proactive rather than purely reactive defense.
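Translating graph signals into auditable reason codes can start as simply as mapping feature thresholds to a fixed code set; the codes and thresholds below are hypothetical, since real code sets are defined with compliance teams:

```python
# Hypothetical reason codes; real code sets are agreed with compliance teams.
REASON_CODES = {
    "R01": "Device previously linked to confirmed fraudulent accounts",
    "R02": "Account closely connected to a known mule cluster",
    "R03": "Transaction pattern abnormal for account lifecycle and peer profile",
}

def explain(features: dict) -> list:
    """Return auditable (code, description) pairs for a declined transaction."""
    codes = []
    if features.get("device_fraud_links", 0) >= 2:
        codes.append("R01")
    if features.get("mule_cluster_distance", 99) <= 1:
        codes.append("R02")
    if features.get("peer_anomaly_score", 0.0) >= 0.8:
        codes.append("R03")
    return [(c, REASON_CODES[c]) for c in codes]

for code, text in explain({"device_fraud_links": 3, "peer_anomaly_score": 0.9}):
    print(code, "-", text)
```

Post-hoc explainers over the GNN would populate features like these; the fixed code table is what makes the output stable and auditable across model versions.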
When carefully integrated into a risk- and UX-focused framework, these technologies have the potential to assist payment providers in managing fraud while maintaining the fast, seamless experiences customers expect.
Disclaimer: The information provided in this article is intended for informational purposes only. The potential benefits and capabilities of generative AI and graph AI in fraud detection are based on current research and applications. Results may vary depending on specific use cases and implementation. No guarantees or assurances are made regarding the effectiveness of these technologies in all situations.