
Research and publications

AI Garage drives cutting-edge research by developing novel algorithms across diverse fields. Our team actively contributes to the global research community, publishing consistently in top-tier peer-reviewed conferences and specialized workshops.


Machine learning

SHIP: Structural Hierarchies for Instance-dependent Partial Labels

Venue: WACV

SHIP introduces a plug-and-play hierarchy module for partial-label learning that derives structural label hierarchies directly from instance-dependent candidate label sets. It splits feature representations into multiple heads corresponding to hierarchy levels and supervises them using dynamically generated coarse-to-fine targets. This reduces mistake severity and improves representation quality across datasets with intrinsic or simulated hierarchies while adding minimal overhead to base PLL architectures.

Authors: Tushar Kadam, Utkarsh Mishra and Aakarsh Malhotra

 

 

Machine learning

SALE-MLP: Structure Aware Latent Embeddings for GNN to Graph-free MLP Distillation

Venue: IJCAI

SALE-MLP proposes a structure-aware Graph-to-MLP distillation method that learns graph-semantic latent embeddings from node features without using the graph at inference time. It aligns a student MLP's feature space with a teacher GNN via unsupervised structural losses instead of relying on precomputed GNN embeddings. The approach achieves superior performance to existing G2M methods on node classification and link prediction in both transductive and inductive settings, with notable gains in inductive scenarios.

Authors: Harsh Pal, Sarthak Malik, Rajat Patel and Aakarsh Malhotra

 

 

Machine learning

Tag2M: A Task-Agnostic Knowledge Distillation Framework for Distilling GNN to MLP

Venue: KDD

Tag2M presents a task-agnostic GNN-to-MLP distillation framework that transfers structural knowledge from a teacher GNN into a lightweight MLP for graph-free, few-shot inference. It uses a self-supervised contrastive loss to encode topology from node attributes and Lipschitz positional embeddings plus an inference-time prompt head for rapid task adaptation. Tag2M generalizes across homophilous and heterophilous graphs and delivers large speedups (up to 20–200×) while outperforming prior distillation methods on multiple node-level tasks over 11 public datasets.

Authors: Ram Ganesh V, Ayush Singh, Aditi Rai, Harsh Pal, Deepanshu, Akshay Sethi, Aakarsh Malhotra and Sayan Ranu

 

 

Fraud Detection

Prodem: Proactive Detection of Model Degradation in Financial Fraud Prediction Under Label Delay

Venue: ECML PKDD

Prodem targets proactive detection of performance degradation in fraud prediction models deployed under significant label delays typical of financial chargeback workflows. It develops monitoring signals and detection mechanisms that operate before true fraud labels fully materialize, enabling timely intervention. Experiments on real-world fraud pipelines show that Prodem flags degradation earlier and more reliably than conventional delayed-label monitoring, helping maintain fraud catch rates and business KPIs.

Authors: Akshay Sethi, Priyanshi Gupta, Sparsh Kansotia, Kamal Kant and Nitish Srivastava

 

 

Machine learning

FairFusion: Debiasing Diffusion Models for Fair Synthetic Tabular Data Generation

Venue: ECAI

FairFusion introduces a debiasing framework for diffusion models that generate synthetic tabular data while enforcing fairness across sensitive groups. The method integrates fairness-aware constraints and loss terms into the diffusion process, reducing disparate treatment and impact in downstream models trained on the synthetic data. Empirical results on benchmark tabular datasets demonstrate that FairFusion achieves competitive utility while substantially improving group fairness relative to standard diffusion-based generators.

Authors: Ruma Roy, Darshika Tiwari and Anubha Pandey

 

 

Others

BiGReachFRauD: Bipartite Graph Representation Learning using Breached Sources for Financial Fraud Detection

Venue: ECAI (PAIS)

BiGReachFRauD builds bipartite-graph representations by linking payment entities to externally breached identifiers (such as leaked emails or devices) to augment fraud detection. It designs a representation learning pipeline over this bipartite structure to capture reachability patterns indicative of compromised entities. The learned embeddings, when fed into downstream fraud models, improve detection of compromised merchants and cards compared with baselines that ignore breached-source connectivity.

Authors: Manasvi, Suhas, Deepanshu, Hariom and Yatin

 

 

Others

FgenXAI: A Generative AI Framework for Explainable Financial Records Summarization

Venue: KDD Workshop

FgenXAI proposes a generative-and-explainable AI framework that turns model explanations (e.g., feature attributions) into user-friendly summaries of financial records. The architecture includes query filtering, parsing/context building, response synthesis, and safety-focused response checking, loosely inspired by RAG-style modularity. Experiments on real financial workflows evaluate hallucination, refusal, and jailbreak robustness, showing that FgenXAI enables interactive, safer explanation consumption compared to one-shot XAI methods like SHAP/LIME alone.

Authors: Rakshit Rao, Manoj Mangam, Shivam Arora, Raahul Nallasamy, Sherin Bharathiy M, Aakarsh Malhotra and Alok Mani Singh

 

 

Machine learning

Towards Equitable Coreset Selection: Addressing Challenges Under Class Imbalance

Venue: CIKM Short

This work introduces Equitable Coreset Selection (ECS), a coreset framework explicitly designed for imbalanced classification settings. ECS adaptively prunes data while preserving minority-class coverage, mitigating the overrepresentation of majority classes seen in standard coreset methods. On multiple benchmarks, ECS consistently improves performance and robustness under severe class imbalance compared to state-of-the-art coreset baselines.
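The minority-preserving idea behind ECS can be illustrated with a toy per-class quota selector (a minimal sketch under stated assumptions: the equal-quota rule, the `scores` importance values, and the global top-up step are illustrative, not the paper's actual selection criterion):

```python
import numpy as np

def equitable_coreset(labels, scores, budget):
    """Pick `budget` examples: give each class an equal quota (capped by
    class size), fill each quota by importance score, then spend any
    leftover budget on the highest-scoring unselected examples.
    This preserves minority-class coverage that score-only pruning loses."""
    labels, scores = np.asarray(labels), np.asarray(scores, dtype=float)
    classes = np.unique(labels)
    quota = budget // len(classes)
    chosen = []
    for c in classes:
        idx = np.where(labels == c)[0]
        ranked = idx[np.argsort(-scores[idx])]      # best-scoring first
        chosen.extend(ranked[: min(quota, len(idx))].tolist())
    picked = set(chosen)
    leftovers = [int(i) for i in np.argsort(-scores) if i not in picked]
    chosen.extend(leftovers[: budget - len(chosen)])
    return sorted(chosen)
```

With a 18:2 class imbalance and low-scoring minority examples, a pure score-based top-k would drop the minority class entirely; the quota keeps it represented.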

Authors: Liyana Sahir, Anugu Namratha Reddy, B Srinath Achary, Ashutosh Sharma, Krisha Shah, Sonia Gupta and Siddhartha Asthana

 

 

Others

EvenOddML: Even and Odd Aggregation with Multi-Level Contrastive Learning for Bipartite Graph

Venue: CIKM Full

EvenOddML is a bipartite-graph representation learning model that aggregates information from immediate neighbors and 2-hop same-type neighbors via an even-and-odd encoder. It couples this encoder with a three-level contrastive learning scheme (layer, type-global, network-global) to jointly capture local and global structures. Evaluations on recommendation and link prediction tasks show that EvenOddML outperforms existing bipartite GNN methods, especially in capturing indirect same-type influences.

Authors: Manasvi Aggarwal, Jahnavi Methukumalli, Deepanshu Bagotia and Suhas Power

 

 

Trustworthy AI

Unmasking Bias in Financial AI: A Robust Framework for Evaluating and Mitigating Hidden Biases in LLMs

Venue: ICAIF

This paper proposes a systematic framework to surface and quantify hidden biases in LLM-based financial AI applications. It designs stress-test prompts, evaluation protocols, and mitigation strategies that target fairness, robustness, and regulatory concerns in financial decision-support use cases. Results highlight non-trivial biases in off-the-shelf LLMs and show that the framework’s mitigation pipeline can significantly reduce disparate behaviors across demographic or customer segments.

Authors: Shresth, Balraj, Raghavendra, Hrishikesh and Puspita

 

 

Others

BMI-GP: Unsupervised Breach Merchant Identification via Adaptive Graph Pruning

Venue: ICAIF

BMI-GP addresses unsupervised identification of potentially breached merchants by modeling transaction networks as graphs and pruning them adaptively. The method constructs merchant-centric graphs and applies graph-pruning strategies to isolate suspicious connectivity patterns without labeled breach data. Experimental results on real payment data indicate that BMI-GP surfaces high-risk merchants earlier and with fewer false positives than heuristic thresholding approaches.

Authors: Kamna Meena, Subham Kumar Singh, Priyanshi Gupta, Gaurav Oberoi, Nitish Srivastava and Siddhartha Asthana

 

 

Others

Temporal Boosting for Incremental Tree-based Learning on Tabular Data

Venue: CODS (ADS)

This work proposes a temporal boosting strategy that incrementally updates tree-based models as new time-stamped tabular data streams in, without full retraining. The method adjusts boosting weights and tree updates to respect temporal drift while preserving past knowledge. Across several temporal tabular benchmarks, it yields better accuracy–latency trade-offs than standard batch retraining or naive online updates.

Authors: Rahul, Payal, Bhanu, Maneet, Josh and Chris

 

 

Others

Time-dependent Check-in Attribute Prediction via Domain-aware CSMTPP

Venue: CODS

The paper models time-dependent user check-in attributes using a domain-aware variant of continuous-time spatio-temporal point processes (CSMTPP). It incorporates domain-specific signals (such as location semantics or periodicity) into the intensity function to better predict attributes associated with future check-ins. Experiments on real-world mobility datasets show improved predictive performance over generic spatio-temporal baselines.


Authors: Anand, Ushmita and Maneet

 

 

Others

Can curriculum learning overcome structural disparity in MP-GNNs?

Venue: CODS

This work investigates whether curriculum learning can mitigate structural disparity issues in message-passing GNNs, where nodes with different structural roles are hard to learn jointly. It designs curricula that schedule training over graph regions or structural patterns, gradually increasing complexity. Results on multiple graph benchmarks indicate that appropriate curricula improve stability and accuracy of MP-GNNs under structural heterogeneity compared to standard training.


Authors: Ushmita Pareek, Raunak Pandey, Krisha, Srinath, Sonia and Siddhartha

Machine learning

LocaRank: Merchant Store Location ranking in a Geo-Spatial Graph with GNNs

Venue: GCLR@AAAI 2024

The multi-outlet model (e.g., Target, Walmart, CVS) is a focus of many brands in the retail industry. They want to broaden their market presence and increase accessibility in all areas so that they can minimize the demand-supply gap. Hence, it is imperative that store locations are chosen in a manner that maximizes performance metrics, which may vary with evolving consumer trends and business objectives. In this work, we review existing frameworks for ranking traditional brick-and-mortar stores and highlight their limitations in addressing factors such as a) changing customer demands, needs, and preferences, and b) changing business requirements and priorities. With this paper, we aim to understand the relationship between changing market trends over time and develop a data-driven solution that can help businesses rank and benchmark the performance of a store location against other stores on custom-defined metrics. We evaluate the method using real-world data from merchant stores in the city of San Francisco. We experiment with two different techniques for generating embeddings using a Siamese network and evaluate the performance of different GNNs for ranking merchant stores in a given geo-spatial graph.

Authors: Garima Arora, Akash Choudhary, Kanishk Goyal, Siddhartha Asthana and Deepak Yadav

 

 

Machine learning

A Closer look at Consistency Regularization for Semi-Supervised Learning

Venue: CODS COMAD

Several state-of-the-art deep learning models have utilized consistency regularization by augmenting data during training. In addition to contributing to the generalizability of a model, data augmentation techniques have also been used in semi-supervised learning, where a trained network is used to pseudolabel unlabelled data. During this process, a supervised model assigns pseudolabels generated from augmented variations of the unlabelled data. This allows the model to look at different prediction vectors over such augmented versions of each unlabelled data sample. However, some of these augmentations are stronger than others, depending on the challenges they pose for a supervised model that has been trained on very limited data. We present a thorough study of data augmentation techniques and show that only using the mean response of the model on augmentations, as previous semi-supervised methods do, may not be the best idea for pseudolabelling in such a weakly-supervised paradigm of learning. In particular, for this work, we study consistency regularization from the perspective of pseudolabelling data for a self-training based student-teacher learning framework.

Authors: Soumyadeep Ghosh, Sanjay Kumar, Awanish Kumar and Janu Verma

 

 

Machine learning

HierTGAN: Hierarchical Time Series Generation with Aggregation Constraints

Venue: CODS COMAD

Generative models for time series data have been able to preserve the temporal dynamics of the original time series and are extremely successful in generating realistic synthetic data. However, in the real world, time series data can be disaggregated by various attributes of interest, thereby forming a hierarchical structure, often referred to as hierarchical time series data. Existing models for time series generation do not capture the structural dynamics (inter-level relationships of the hierarchy) of hierarchical time series data. Therefore, in this research, for the first time, we introduce HierTGAN, an auto-regressive generative adversarial network (GAN) for hierarchical time series generation. The proposed HierTGAN solves for an equivalent inter-level relationship within the embedding space generated by an autoencoder. Multiple experiments have been performed to evaluate the effectiveness of HierTGAN in generating realistic synthetic hierarchical time series data.

Authors: Srini Rohan Gujulla Leel, Vikrant Dey, Puspita Majumdar and Ankit Khairkar

 

 

Machine learning

SGD-MLP: Structure Generation and Distillation using a Graph-free MLP

Venue: CODS COMAD

While Graph Neural Networks (GNNs) have shown great results on graph-structured data, they are difficult to use in real-world scenarios due to scalability constraints. Existing methods try to solve this issue by distilling the knowledge from trained GNNs into MLPs. However, these methods still require the graph structure representation during inference, significantly increasing inference latency. While there are strategies available to sparsify or simplify the graph structure and enhance the speed of GNNs, such as minimizing computations like multiplication and accumulation through pruning and quantization, the inherent graph dependency remains. The main hurdle that remains unaddressed is the interdependence of data, which significantly limits the potential for speed improvement. Driven by the distinct advantages and limitations of GNNs and MLPs, we propose SGD-MLP, which integrates the two by leveraging neighbourhood contrastive learning. The key idea is to pre-train the student MLP with structural information alongside knowledge distillation (KD), a process we term structure induction with KD. Inference for SGD-MLP is 225 times faster than GNNs, without sacrificing much accuracy on average. SGD-MLP enhances accuracy by 11.52% compared to standalone MLPs, performs on par with GNNs on 3 out of 5 datasets, and beats the state-of-the-art GLNN by 1.47% on average.
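The distillation side of the idea above can be sketched as a supervised loss plus a soft-target term pulling the student MLP toward the teacher GNN's predictions (a minimal sketch, assuming a standard temperature-scaled KD objective; the weighting `lam`, temperature, and the contrastive pre-training stage are illustrative, not the paper's exact formulation):

```python
import numpy as np

def softmax(z, temp=1.0):
    z = z / temp
    z = z - z.max(axis=1, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def g2m_distill_loss(student_logits, teacher_logits, labels, lam=0.5, temp=2.0):
    """Cross-entropy on true labels plus temperature-scaled
    KL(teacher || student) on softened logits (classic KD shape)."""
    n = len(labels)
    p_s = softmax(student_logits)
    ce = -np.log(p_s[np.arange(n), labels] + 1e-12).mean()
    p_t = softmax(teacher_logits, temp)
    p_st = softmax(student_logits, temp)
    kl = (p_t * (np.log(p_t + 1e-12) - np.log(p_st + 1e-12))).sum(axis=1).mean()
    return ce + lam * (temp ** 2) * kl
```

At inference only the student MLP is evaluated, which is where the graph-free speedup comes from.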

Authors: Sanjay Kumar Patnala, Sumedh B G, Akshay Sethi, Sonia Gupta and Siddhartha Asthana

 

 

Graph Learning

CPa-WAC: Constellation Partitioning-based Scalable Weighted Aggregation Composition for Knowledge Graph Embedding

Venue: IJCAI 2024

Scalability and training time are crucial for any graph neural network model processing a knowledge graph (KG). While partitioning knowledge graphs helps reduce training time, prediction accuracy drops significantly compared to training the model on the whole graph. In this paper, we propose CPa-WAC: a lightweight architecture that incorporates graph convolutional networks and modularity-maximization-based constellation partitioning to harness the power of local graph topology. The proposed CPa-WAC method reduces the training time and memory cost of knowledge graph embedding, making the learning model scalable. Results from our experiments on standard databases, such as WordNet and Freebase, show that by achieving meaningful partitioning, any knowledge graph can be broken down into subgraphs and processed separately to learn embeddings. Furthermore, these learned embeddings can be used for knowledge graph completion, retaining similar performance to training a GCN on the whole KG while speeding up the training process by up to five times. Additionally, the proposed CPa-WAC method outperforms several other state-of-the-art KG embedding methods in terms of prediction accuracy.

Authors: Sudipta Modak, Aakarsh Malhotra, Sarthak Malik, Anil Surisetty and Esam Abdel-Raheem

 

 

Machine learning

CASH via Optimal Diversity for Ensemble Learning

Venue: SIKDD 2024

The Combined Algorithm Selection and Hyperparameter Optimization (CASH) problem is pivotal in Automatic Machine Learning (AutoML). Most leading approaches combine Bayesian optimization with post-hoc ensemble building to create advanced AutoML systems. Bayesian optimization (BO) typically focuses on identifying a singular algorithm and its hyperparameters that outperform all other configurations. Recent developments have highlighted an oversight in prior CASH methods: the lack of consideration for diversity among the base learners of the ensemble. This oversight was overcome by explicitly injecting the search for diversity into the traditional CASH problem. However, despite recent developments, BO’s limitation lies in its inability to directly optimize ensemble generalization error, offering no theoretical assurance that increased diversity correlates with enhanced ensemble performance. Our research addresses this gap by establishing a theoretical foundation that integrates diversity into the core of BO for direct ensemble learning. We explore a theoretically sound framework that describes the relationship between pair-wise diversity and ensemble performance, which allows our Bayesian optimization framework Optimal Diversity Bayesian Optimization (OptDivBO) to directly and efficiently minimize ensemble generalization error. OptDivBO guarantees an optimal balance between pairwise diversity and individual model performance, setting a new precedent in ensemble learning within CASH. Empirical results on 20 public datasets show that OptDivBO achieves the best average test ranks of 1.57 and 1.4 in classification and regression tasks.

Authors: Pranav Poduval, Sanjay Kumar Patnala, Gaurav Oberoi, Nitish Srivasatava and Siddhartha Asthana

Graph Learning

MEGA: Multi-Encoder GNN Architecture for stronger task collaboration and generalization

Venue: ECML PKDD 2024

Self-supervised learning in graphs has emerged as a promising avenue for harnessing unlabeled graph data, leveraging pretext tasks to generate informative node representations. However, the reliance on a single pretext task often constrains generalization across various downstream tasks and datasets. Recent advancements in multi-task learning on graphs aim to tackle this limitation by integrating multiple pretext tasks, framing the problem as a multi-objective optimization to train a shared set of parameters. However, these approaches frequently encounter task interference, where competing tasks degrade overall performance by conflicting with each other due to the limited expressivity of the model. In this work, we introduce MEGA, a novel multi-encoder graph neural network architecture designed to alleviate task interference by providing distinct parameter spaces for the decoupled training of each task. This architecture allows for independent learning from multiple pretext tasks, followed by a simple self-supervised dimensionality reduction technique to combine the insights gleaned. Through extensive experiments, we demonstrate the superiority of our approach, showcasing consistent average performance improvements across three commonly used downstream tasks (i.e., link prediction, node classification, and partition prediction) and nine benchmark datasets.

Authors: Faraz Khoshbakhtian, Gaurav Oberoi, Dionne Aleman and Siddhartha Asthana

 

 

Others

GraTeD-MLP: Efficient Node Classification via Graph Transformer Distillation to MLP‎

Venue: Learning on Graphs (LoG)

Graph Transformers (GTs) like NAGphormer have shown impressive performance by encoding a graph's structural information and node features. However, their self-attention and complex architectures require high computation and memory, hindering deployment. Thus, we propose a novel framework called Graph Transformer Distillation to Multi-Layer Perceptron (GraTeD-MLP). GraTeD-MLP leverages knowledge distillation (KD) and a novel decomposition of attentional representation to distill the learned representations from the teacher GT to a student MLP. During distillation, we incorporate a gated MLP architecture where two branches learn the decomposed attentional representation for a node while the third predicts node embeddings. Encoding the attentional representation mitigates the MLP's over-reliance on node features, enabling robust performance even in inductive settings. Empirical results demonstrate that the proposed GraTeD-MLP has significantly faster inference than the teacher GT model, with speed-ups ranging from 20x to 40x, and up to 25% improved performance over a vanilla MLP. Furthermore, we empirically show that GraTeD-MLP outperforms other GNN distillation methods on seven datasets in both inductive and transductive settings.

Authors: Sarthak Malik, Aditi Rai, Ram Ganesh V, Himank Sehgal, Akshay Sethi and Aakarsh Malhotra

 

 

Others

Progressive Label Disambiguation for Partial Label Learning in Homogeneous Graphs

Venue: International Conference on Information and Knowledge Management (CIKM)

Many existing Graph Neural Network (GNN) methods assume that labels are reliable and sufficient, which may not be the case in real-world scenarios. This paper addresses one such problem: Partial Label Learning (PLL) on graph-structured data. In PLL for graphs, each node is represented by a candidate set of labels, where only one is true while the others are inaccurate. Despite advancements with PLL in the tabular and vision domains, graph-structured data remains underexplored. In this work, we first define PLL for graphs. Subsequently, we propose a new PLD-Graph algorithm for PLL in homogeneous graphs with scarce labels. We utilize graph augmentation to reduce the effects of inexact labels and provide additional supervision from unlabeled nodes. Progressive label disambiguation is performed based on the model's ability to predict correct classes. Furthermore, an additional loss estimates the label corruption matrix to capture associations between correct and incorrect labels. We show the effectiveness of the proposed algorithm on multiple graph datasets, with two types of noise and varying levels of ambiguous labels. Overall, the proposed PLD-Graph algorithm outperforms state-of-the-art PLL methods.
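The progressive-disambiguation step can be sketched as a momentum update of each node's candidate-label weights toward the model's prediction restricted to its candidate set (a toy sketch in the style of PiCO-like PLL methods; the momentum form and `candidate_mask` interface are illustrative assumptions, not the paper's exact rule):

```python
import numpy as np

def disambiguate_step(weights, probs, candidate_mask, momentum=0.9):
    """One progressive-disambiguation step: renormalize the model's class
    probabilities over each node's candidate set, then move the node's
    label weights toward them with a momentum update."""
    p = probs * candidate_mask                       # zero out non-candidates
    p = p / (p.sum(axis=1, keepdims=True) + 1e-12)   # renormalize over candidates
    w = momentum * weights + (1.0 - momentum) * p
    return w / w.sum(axis=1, keepdims=True)
```

Iterating this concentrates each node's weight on the candidate the model grows confident about, while non-candidate classes can never gain mass.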

Authors: Rajat Patel, Aakarsh Malhotra, Sudipta Modak and Siddharth Yeramsetty

 

 

Machine learning

Learning Representations for Bipartite Graphs using Multi-Task Self-Supervised Learning

Venue: ECML PKDD: Machine Learning and Knowledge Discovery in Databases: Research Track, 2023

Representation learning for bipartite graphs is a challenging problem due to their unique structure and characteristics. The primary challenge is the lack of extensive supervised data and the bipartite graph structure, where two distinct types of nodes exist with no direct connections between nodes of the same kind. Hence, recent algorithms utilize Self-Supervised Learning (SSL) to learn effective node embeddings without needing costly labeled data. However, conventional SSL methods learn through a single pretext task, making the trained model specific to the downstream task. This paper proposes a novel approach for learning generalized representations of bipartite graphs using multi-task SSL. The proposed method utilizes multiple self-supervised tasks to learn improved embeddings that capture different aspects of bipartite graphs, such as graph structure, node features, and local-global information. We utilize deep multi-task learning (MTL) to further assist in learning a generalizable self-supervised solution. To mitigate negative transfer when related and unrelated tasks are trained in MTL, we propose a novel DST++ algorithm. The proposed DST++ optimization algorithm improves on existing DST by considering task affinities and groupings for better initialization and training. The end-to-end proposed method, with complementary SSL tasks and DST++ multi-task optimization, is evaluated on three tasks: node classification, link prediction, and node regression, using four publicly available benchmark datasets. The results demonstrate that our proposed method outperforms state-of-the-art methods for representation learning in bipartite graphs. Specifically, our method achieves up to 12% improvement in accuracy for node classification and up to 9% improvement in AUC for link prediction tasks compared to the baseline methods.

Authors: Akshay Sethi, Sonia Gupta, Aakarsh Malhotra and Siddhartha Asthana

 

 

Machine learning

Contrastive Representation through Angle and Distance based Loss for Partial Label Learning

Venue: ECML PKDD: Machine Learning and Knowledge Discovery in Databases: Research Track, 2023

Partial label learning (PLL) is a form of weakly supervised learning that aims to train a deep network from training instances and their corresponding label sets. The label set, also known as the candidate set, is a group of labels associated with each training instance, out of which only one is the ground truth. Contrastive learning is one of the popular techniques used to learn from a partially labeled dataset, intending to reduce intra-class distance while maximizing inter-class distance. In this paper, we suggest improving the contrastive technique used in PiCO. The proposed Contrastive Representation via Angle and Distance based Loss (CRADL) segregates the contrastive loss into two parts: an angle-based loss and a distance-based loss. The angle-based loss covers the angular separation between two contrastive vectors. However, we showcase a scenario where such an angular loss cannot prefer one contrastive vector over another when both have the same angle. The second, distance-based loss term fixes this issue. We show experiments on CIFAR-10 and CIFAR-100, where the corresponding PLL databases are generated using uniform noise. The experiments show that PLL algorithms learn better with the proposed CRADL-based learning and generate distinguishing representations, as observed by the compact cluster formation with CRADL. This eventually results in CRADL outperforming the current state-of-the-art studies in the PLL setup at different uniform noise rates.
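The angle/distance decomposition can be illustrated with a toy pairwise loss (a minimal sketch; the exact loss forms and the `alpha`/`beta` weights are illustrative assumptions, not the paper's definitions):

```python
import numpy as np

def angle_loss(u, v):
    """Angular part: 1 - cosine similarity (zero for aligned vectors),
    blind to vector magnitude."""
    cos = u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12)
    return 1.0 - cos

def distance_loss(u, v):
    """Distance part: squared Euclidean gap, sensitive to magnitude."""
    return float(np.sum((u - v) ** 2))

def cradl_style_loss(u, v, alpha=1.0, beta=0.1):
    """Combined loss: the distance term breaks ties the angle term cannot."""
    return alpha * angle_loss(u, v) + beta * distance_loss(u, v)
```

Two vectors at the same angle but different magnitudes get identical angular loss, which is exactly the failure case the abstract describes; the distance term separates them.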

Authors: Priyanka Chudasama, Tushar Kadam, Rajat Patel, Aakarsh Malhotra and Mangam Manoj

 

 

Machine learning & Trustworthy AI

Practical Bias Mitigation through Proxy Sensitive Attribute Label Generation

Venue: Workshop on Modelling Uncertainty in the Financial World (MUFin'23) in conjunction with AAAI, 2023

Addressing bias in a trained machine learning system often requires access to sensitive attributes. In practice, these attributes are not available, either due to legal and policy regulations or data unavailability for a given demographic. Existing bias mitigation algorithms are limited in their applicability to real-world scenarios as they require access to sensitive attributes to achieve fairness. In this research work, we aim to address this bottleneck through our proposed unsupervised proxy-sensitive attribute label generation technique. To this end, we propose a two-stage approach of unsupervised embedding generation followed by clustering to obtain proxy-sensitive labels. The efficacy of our work relies on the assumption that bias propagates through non-sensitive attributes that are correlated with the sensitive attributes and, when mapped to the high-dimensional latent space, produce clusters of the different demographic groups that exist in the data. Experimental results demonstrate that bias mitigation using existing algorithms such as Fair Mixup and Adversarial Debiasing yields comparable results on the derived proxy labels when compared against using true sensitive attributes.

Authors: Bhushan Chaudhari, Anubha Pandey, Deepak Bhatt and Darshika Tiwari

 

 

Machine learning

TETRAA - Trained and Selective Transmutation of Encoder-based Adversarial Attack

Venue: International Joint Conference on Neural Networks (IJCNN)

Adversarial attacks are a pivotal field of research. These attacks can help organizations identify model vulnerabilities and protect them from financial and reputational loss. Researchers have already explored several adversarial attacks on computer vision applications. However, these attacks fail on transactional datasets, which carry the added constraints of limited attack attempts and scalability over a large feature space containing both categorical merchant codes and continuous transaction amounts. The existing literature on adversarial attacks for the tabular domain is limited to numerical features only. To overcome these challenges, we introduce TETRAA, a novel genetic algorithm for black-box attacks that produces evolution-based, specialized perturbations through an encoder/autoencoder for each example, giving the advantage of query efficiency. A novel fitness function is formalized for handling the perturbation of categorical features under the constraint of realistic samples. Further, we improve the success rate and the perturbation norm of generated samples using a binary-search-based approach. Experiments show TETRAA requires significantly fewer model queries than several state-of-the-art black-box adversarial attacks (ZOO, boundary attack, HopSkipJump, and ESPA), along with a better success rate and perturbation norm when tested on untargeted attacks.

Authors: Sarthak Malik, Himanshi Charotia and Gaurav Dhama

 

 

Machine learning

Auto-TabTransformer: Hierarchical Transformers for Self and Semi Supervised Learning in Tabular Data

Venue: 2023 International Joint Conference on Neural Networks (IJCNN)

Self- and semi-supervised learning have shown promising results in language and computer vision but are still underexplored in the context of tabular data. This paper focuses on exploring self- and semi-supervised methods for tabular data. Towards this, we propose Auto-TabTransformer, a method for training hierarchical transformers in a self- and semi-supervised setup using redundancy reduction. The technique focuses on key aspects of self- and semi-supervised learning: feature encoding, pre-training objective, training methodology, and neural architecture. Performing extensive experiments on four publicly accessible datasets, we show that Auto-TabTransformer achieves state-of-the-art (SOTA) results in the low-labelled-data regime. We conduct extensive ablation studies detailing the importance of all the components used.

Authors: Akshay Sethi, Sonia Gupta, Ayush Agarwal, Nancy Agrawal and Siddhartha Asthana

 

 

Machine learning

FraudAmmo: Large Scale Synthetic Transactional Dataset for Payment Fraud Detection‎

Venue: IJCNN, 2023‎‎

Global losses due to payment fraud have more than tripled, from $9.84 billion in 2011 to $32.39 billion in 2020, and are expected to reach $40.62 billion by 2027. In addition to the financial losses, fraud negatively impacts brand reputation and leads to a poor customer experience. Advanced machine learning has been actively adopted to tackle the fraud detection problem at scale. However, the scarcity of open datasets leads to less reproducible research, especially in the payments domain. We have released FraudAmmo, a synthetic transactional dataset containing 3 million transactions generated from real-world datasets. FraudAmmo is diverse with respect to transactional channels, geography, customer and fraud types. We leverage the idea of privacy preservation in tabular Generative Adversarial Networks (GANs) to generate FraudAmmo. In addition to privacy-preserving GANs, we apply further checks to ensure customers' differential privacy. The quality of the generated dataset is evaluated on various metrics, including machine learning efficacy, statistical similarity and privacy preservability. We have also benchmarked results on FraudAmmo using machine learning algorithms such as bagging, boosting, MLP and logistic regression. To the best of our knowledge, this is the first large-scale synthetic fraud transaction dataset aimed at helping academia and research groups develop and validate their fraud detection models. The dataset is available at https://github.com/karthi2107/FraudAmmo

Authors: Karthikeswaren R, Kanishka Kayathwal, Gaurav Dhama and Hardik Wadhwa

 

 

Machine learning

TBoost: Gradient Boosting Temporal Graph Neural Networks‎

Venue: TGLR@NeurIPS 2023

Fraud prediction, compromised account detection, and attrition signaling are vital problems in the financial domain. These are generally temporal classification problems, as the labels change with time and exhibit temporal dependence. Each financial transaction contains heterogeneous data like account number, merchant, amount, decline status, etc., and a financial dataset contains chronological transactions. This data possesses three distinct characteristics: heterogeneity, relational structure, and temporal nature. Previous efforts fall short of modeling all these characteristics in a unified way: gradient-boosted decision trees (GBDTs) tackle heterogeneity, graph neural networks (GNNs) model relational information, and temporal GNNs account for temporal dependencies. In this paper, we propose a novel unified framework, TBoost, which combines GBDTs and temporal GNNs to jointly model the heterogeneous, relational, and temporal characteristics of the data. It leverages both node- and edge-level dynamics to solve temporal classification problems. To validate the effectiveness of TBoost, we conduct extensive experiments, demonstrating its superiority in handling the complexities of financial data.

Authors: P Nath, G Waghmare, N Agrawal, N Kumar and S Asthana

 

 

Machine learning

Structure Aware Transformers on Graphs for Node Classification‎

Venue: NeurIPS 2023 GLFrontiers‎

Transformers have achieved state-of-the-art performance in computer vision (CV) and natural language processing (NLP). Inspired by this, recent architectures have incorporated transformers into graph neural networks. Most existing graph transformers either take the set of all nodes as the input sequence, leading to quadratic time complexity, or take only one-hop or k-hop neighbours as the input sequence, thereby completely ignoring long-range interactions. To this end, we propose Structure Aware Transformer on Graphs (SATG), which captures both short-range and long-range interactions in a computationally efficient manner. When dealing with non-Euclidean spaces like graphs, positional encoding becomes an integral component for providing structural knowledge to the transformer. Upon observing the shortcomings of existing positional encodings, we introduce a new class of positional encodings trained on a Neighbourhood Contrastive Loss that effectively captures the entire topology of the graph. We also introduce a method to capture long-range interactions without incurring quadratic time complexity. Extensive experiments on five benchmark datasets show that SATG consistently outperforms GNNs by a substantial margin and also outperforms other graph transformers.

Authors: Sumedh B G, Sanjay Patnala, Himil Vasava, Akshay Sethi and Sonia Gupta

 

 

Machine learning

Learning Temporal Representations of Bipartite Financial Graphs‎

Venue: International Conference on AI in Finance 2023

Dynamic bipartite graphs are naturally suited for modeling temporally evolving interactions in several domains, including digital payments and social media. Though dynamic graphs are widely studied, existing work focuses on homogeneous graphs. This paper proposes a novel framework for representation learning on temporally evolving bipartite graphs. It introduces a bipartite graph transformer layer, an attention-based temporal bipartite graph encoder for learning node representations, and extends the information-maximization objective based on noise contrastive learning to temporal bipartite graphs. This combination of bipartite encoder layer and noise contrastive loss ensures each node-set in the temporal bipartite graph is represented uniquely and disentangled from the other node-set. We experiment on four public datasets with temporal bipartite characteristics. The proposed model shows promising results on transductive and inductive dynamic link prediction and on temporal recommendation.

Authors: Pritam Kumar Nath, Govind Waghmare, Nikhil Tumbde, Nitish Kumar and Siddhartha Asthana

 

 

Machine learning

Improving the Robustness of Financial Models through Identification of the Minimal Vulnerable Feature Set‎

Venue: International Conference on AI in Finance 2023

Research in adversarial robustness has primarily focused on neural networks in domains like computer vision, neglecting the heterogeneous tabular datasets prevalent in finance. The financial domain faces a heightened risk, as malicious actors may exploit ML model vulnerabilities to manipulate transactions and gain unauthorized access. To address this gap, we simulate adversaries' intentions on heterogeneous tabular data and focus on identifying a minimal vulnerable set of features most susceptible to an external attack. Identifying such features enables developers to safeguard their models against adversarial attacks by updating or refining the rules of deployed models. To this effect, we propose a GAN-based architecture, termed the Feature Selector Network, for learning the minimal vulnerable feature set. The proposed method is evaluated on attack imperceptibility metrics, the number of queries, and the time to generate attacks using existing state-of-the-art attack algorithms. Experimental evaluation shows a substantial reduction in the number of queries and in overall attack generation time, along with a significant improvement in imperceptibility metrics such as the perturbation norm and distance to the closest neighbour, while achieving a good success rate. Further analysis showed that a model trained on adversaries generated using the proposed pipeline yields a notable decrease in the adversarial attack success rate on the test set, giving developers a robust technique for safeguarding ML models.

Authors: Anubha Pandey, Himanshu Chaudhary, Alekhya Bhatraju, Deepak Bhatt and Maneet Singh

 

 

Machine learning

Study of Topology Bias in GNN-based Knowledge Graphs Algorithms‎

Venue: MLoG@ICDM‎

Graph neural networks (GNNs) have recently been integrated into knowledge graph representation learning. The efficient message-passing functions in GNNs capture latent relationships between entities within these semantic networks, which aids various downstream tasks such as link prediction, node classification, and entity alignment. However, there is a general deficiency in representation learning on graphs with loops (cycles) and self-loops: traditional message-passing functions induce biased learning on knowledge graphs, leading to skewed predictions. This work presents a detailed analysis of the representation bias generated by these functions on knowledge graphs containing short loops and self-loops. We demonstrate the variance in performance on knowledge graphs with varying topology over two downstream tasks: link prediction and entity alignment. The experiments show that representations from popular learning algorithms are prone to capturing biases in the graphs' structures. These biases, however, have different effects on the formulated downstream tasks, motivating research on topology-invariant representation algorithms for knowledge graphs.

Authors: Anil Surisetty, Aakarsh Malhotra, Deepak Chaurasiya, Sudipta Modak, Siddharth Yerramsetty, Alok Singh, Liyana Sahir and Esam Abdel-Raheem

 

 

Machine learning

INFRANET: Forecasting intermittent time series using DeepNet with parameterized conditional demand and size distribution‎

Venue: AI4TS workshop at ICDM 2023‎

Real-life time series data deals with the problem of intermittency, where demand appears sporadically in time, i.e., long runs of zero demand are observed before periods of nonzero demand. Modelling time series for intermittent demand forecasting (IDF) plays a significant role in several industries, such as retail, automotive, and slow-moving spare parts, to enable effective inventory management. Existing methods forecast inter-demand time and size independently, predicting constant values over the entire interval; this static forecasting reduces overall point accuracy. To tackle this issue, we introduce INFRANET (Intermittent demand Forecasting using Autoregressive Network), a novel approach for generating accurate probabilistic forecasts by parameterizing conditional demand probability and size distributions. The framework incorporates an autoregressive network that jointly models the demand probability and demand size via a weighted loss function. Additionally, we explore three new decoding techniques, two probabilistic and one deterministic, for accurate demand forecasting. Our model is specifically designed to address intermittent and lumpy demand forecasting as well as obsolescence scenarios. Extensive empirical evaluation on multiple datasets shows that the proposed model outperforms existing methods.
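Schematically, a weighted loss that jointly models demand occurrence and demand size can be written as below; the symbols and weights are illustrative, and the paper's exact parameterization may differ:

```latex
% z_t \in \{0,1\}: demand occurrence; y_t: demand size; \mathbf{h}_t: autoregressive hidden state
\mathcal{L}_t = -\,\alpha \left[ z_t \log \pi_t + (1 - z_t) \log (1 - \pi_t) \right]
                \;-\; \beta \, z_t \log f_\phi(y_t \mid \mathbf{h}_t),
\qquad \pi_t = p_\theta(z_t = 1 \mid \mathbf{h}_t)
```

Note the size term contributes only at steps with nonzero demand, which keeps the objective well defined over long zero-demand runs.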

Authors: Diksha Shrivastava, Sarthak Pujari, Yatin Katyal, Siddhartha Asthana, Chandrudu K and Aakashdeep Singh

 

 

Machine learning

On Incorporating new Variables during Evaluation‎

Venue: TRL @ NeurIPS 2023 Poster‎

Any classification or regression model needs access at inference time to the same features and inputs that were utilized to train it. However, in real-world scenarios, many models remain in operation for years, and new variables/features may become available at the inferencing stage. Ordinarily, utilizing such features would require capturing their values in the training dataset and retraining the model. We propose a model-agnostic approach in which a model trained without access to these features can still benefit from the additional features available during testing. We show that, without any access to the extra features during the training phase, the proposed approach improves model performance on four real-world tabular datasets. We provide extensive analysis of how, and which, variables result in improvement over the model trained without the extra feature(s).

Authors: Harsimran Bhasin and Soumyadeep Ghosh

Machine learning, Others

RePS: Relation, Position and Structure aware Entity Alignment‎

Venue: Graph Learning Workshop in conjunction with The ACM Web Conference 2022


Authors: Anil Surisetty, Deepak Chaurasiya, Nitish Kumar, Alok Singh, Gaurav Dhama, Aakarsh Malhotra, Vikrant Dey and Ankur Arora

 

 

Machine learning

Modeling Inter-Dependence Between Time and Mark in Multivariate Temporal Point Processes‎

Venue: ACM International Conference on Information and Knowledge Management, 2022‎

Temporal point processes (TPPs) are probabilistic generative frameworks that model discrete event sequences localized in continuous time. Real-life events generally reveal descriptive information, known as marks; marked TPPs model the time and mark of an event together for practical relevance. Conditioned on past events, marked TPPs aim to learn the joint distribution of the time and mark of the next event. For simplicity, conditionally independent TPP models assume time and marks are independent given the event history, factorizing the conditional joint distribution of time and mark into the product of individual conditional distributions. This structural limitation in the design of TPP models hurts predictive performance on entangled time-mark interactions. In this work, we model the conditional inter-dependence of time and mark to overcome the limitations of conditionally independent models. We construct a multivariate TPP that conditions the time distribution on the current event's mark in addition to past events. Besides conventional intensity-based models for the conditional joint distribution, we also draw on flexible intensity-free TPP models from the literature. The proposed TPP models outperform conditionally independent and dependent models in standard prediction tasks. Our experimentation on various datasets with multiple evaluation metrics highlights the merit of the proposed approach.
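The structural difference at stake can be written out directly; with $\mathcal{H}_t$ the event history, the two factorizations of the next event's time $t$ and mark $m$ are:

```latex
% Conditionally independent models: time and mark decoupled given history
p(t, m \mid \mathcal{H}_t) = p(t \mid \mathcal{H}_t)\, p(m \mid \mathcal{H}_t)
% Proposed inter-dependent models: time conditioned on the current mark
p(t, m \mid \mathcal{H}_t) = p(m \mid \mathcal{H}_t)\, p(t \mid m, \mathcal{H}_t)
```

The second factorization is exact by the chain rule, so the gain comes from letting the time model see the mark rather than from any extra assumption.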


Authors: Govind Waghmare, Ankur Debnath, Siddhartha Asthana and Aakarsh Malhotra

 

 

Others

CaPE: Category Preserving Embeddings for Similarity-Search in Financial Graphs

Venue: ACM International Conference on AI in Finance. 2022‎

Similarity search is an important problem for the payment industry, which holds rich user-merchant interaction data. It finds merchants similar to a given merchant and supports tasks like peer-set generation, recommendation, community detection, and anomaly detection. Recent works have shown that, by leveraging interaction data, graph neural networks (GNNs) can generate node embeddings for entities like merchants, which can then be used for such similarity-search tasks. However, most real-world financial data comes with high-cardinality categorical features, such as city, industry and super-industry, which are fed to GNNs in a one-hot-encoded manner. Current GNN algorithms are not designed for such sparse features, which makes it difficult for them to learn embeddings that preserve this information. In this work, we propose CaPE, a Category Preserving Embedding generation method that preserves high-cardinality feature information in the embeddings; CaPE is designed to preserve other important numerical feature information as well. We compare CaPE with the latest GNN embedding generation algorithms to showcase its superiority on peer-set generation tasks over real-world datasets, both external and internal (synthetically generated). We also compare our method on a downstream link prediction task.

Authors: Gaurav Oberoi, Pranav Poduval, Karamjit Singh, Sangam Verma and Pranay Gupta

 

 

Fraud Detection, others

Guided Self-Training based Semi-Supervised Learning for Fraud Detection

Venue: ACM International Conference on AI in Finance, 2022‎

Semi-supervised learning has attracted the attention of AI researchers in the recent past, especially after the advent of deep learning methods and their success in several real-world applications. Most deep learning models require large amounts of labelled data, which is expensive to obtain. Fraud detection is a very important problem for several industries, and large amounts of data are often available; however, obtaining labels is cumbersome, so semi-supervised learning is perfectly positioned to aid in building robust and accurate supervised models. In this work, we consider different fraud detection paradigms and show that a self-training-based semi-supervised learning approach can produce significant improvements over a model trained on a limited set of labelled data. We propose a novel self-training approach using a guided sharpening technique based on a pair of autoencoders, which provide useful cues for incorporating unlabelled data into the training process. We conduct thorough experiments and analysis on three different real-world datasets to showcase the effectiveness of the approach. On the Elliptic Bitcoin fraud dataset, we show that utilizing unlabelled data improves the F1-score of a model trained on limited labelled data by around 10%.
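The paper's guided sharpening relies on a pair of autoencoders; the generic confidence-thresholded self-training loop it builds on can be sketched with a toy nearest-centroid classifier standing in for the fraud model (all names are hypothetical):

```python
import numpy as np

def self_train(x_lab, y_lab, x_unlab, rounds=5, threshold=0.9):
    """Confidence-thresholded self-training: repeatedly fit a toy
    nearest-centroid classifier, pseudo-label the unlabelled pool, and
    promote only confident pseudo-labels into the labelled set."""
    x_lab, y_lab = x_lab.copy(), y_lab.copy()
    for _ in range(rounds):
        # "Train": one centroid per class from the current labelled pool
        centroids = np.stack([x_lab[y_lab == c].mean(0) for c in (0, 1)])
        if len(x_unlab) == 0:
            break
        d = np.linalg.norm(x_unlab[:, None, :] - centroids[None], axis=2)
        p = np.exp(-d) / np.exp(-d).sum(1, keepdims=True)  # crude confidence
        conf, pred = p.max(1), p.argmax(1)
        keep = conf >= threshold
        if not keep.any():
            break
        # Promote confident pseudo-labels into the labelled pool
        x_lab = np.concatenate([x_lab, x_unlab[keep]])
        y_lab = np.concatenate([y_lab, pred[keep]])
        x_unlab = x_unlab[~keep]
    return centroids

rng = np.random.default_rng(1)
x0 = rng.normal(loc=-3, size=(5, 2))   # class-0 cluster
x1 = rng.normal(loc=3, size=(5, 2))    # class-1 cluster
x_l = np.concatenate([x0[:2], x1[:2]]); y_l = np.array([0, 0, 1, 1])
x_u = np.concatenate([x0[2:], x1[2:]])  # unlabelled remainder
cents = self_train(x_l, y_l, x_u)
```

The guided sharpening of the paper would replace the softmax confidence here with cues from the autoencoder pair.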

Authors: Awanish Kumar, Soumyadeep Ghosh and Janu Verma

 

 

Fraud Detection, others

Adversarial Fraud Generation for Improved Detection

Venue: ACM International Conference on AI in Finance, 2022‎‎

Generative Adversarial Networks (GANs) are known for their ability to learn data distributions and hence are a suitable alternative for handling class imbalance through oversampling. However, they still fail to capture the diversity of the minority class owing to its limited representation, for example, frauds in our study; in particular, fraudulent patterns closer to the class boundary get missed by the model. This paper proposes using GANs to simulate fraud transaction patterns conditioned on genuine transactions, enabling the model to learn a translation function between both spaces. Further, to synthesize fraudulent samples from the class boundary, we train GANs using losses inspired by the data-poisoning-attack literature and discuss their efficacy in improving fraud detection classifier performance. The efficacy of the proposed framework is demonstrated through experimental results on the publicly available European Credit-Card Dataset and the CIS Fraud Dataset.

Authors: Anubha Pandey, Alekhya Bhatraju, Shiv Markam and Deepak Bhatt

 

 

Others

A Semi-Supervised Vulnerability Management System

Venue: ‎Intelligent Systems Conference (IntelliSys) 2022​​​​​​​‎

With the advent of modern network security advancements, the computational resources of an organization are always under threat from external entities, such as hackers or miscreants who might cause significant damage to data and other software or hardware resources. A vulnerability is a general way of representing a weakness in the software or hardware resources of an organization's computational infrastructure. Such vulnerabilities may be minor software issues, or in some cases may expose vital computational resources to external threats. The first step is to scan the entire computational infrastructure for such vulnerabilities; once they are ascertained, a patching process is carried out to mitigate the threats. For effective mitigation, the most serious vulnerabilities should be given higher priority, which requires a scoring mechanism for all scanned vulnerabilities. We present an end-to-end deployed vulnerability management system that scores these vulnerabilities using their natural-language descriptions.

Authors: Soumyadeep Ghosh, Sourojit Bhaduri, Sanjay Kumar, Janu Verma, Yatin Katyal and Ankur Saraswat

 

 

Trustworthy AI, others

FLiB: Fair Link Prediction in Bipartite Network‎

Venue: 26th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD-22), 2022‎

Graph neural networks (GNNs) have become a popular modeling choice in many real-world applications like social networks, recommender systems, and molecular science. GNNs have been shown to exhibit greater bias than other ML models trained on i.i.d. data, and as they are applied to many socially consequential use cases, it becomes imperative for the model results and learned representations to be fair. Real-world applications of GNNs involve learning over heterogeneous networks with several node and edge types. We show that various kinds of nodes in a heterogeneous network can pick up bias from a particular node type and remain non-trivial to debias using standard fairness algorithms. We propose a novel framework, Fair Link Prediction in Bipartite Networks (FLiB), that ensures fair link prediction while learning fair representations for all node types with respect to the sensitive attribute of one of them. We further propose S-FLiB, which mitigates bias at the subgroup level by regularising model predictions for subgroups defined over problem-specific grouping criteria.

Authors: Piyush, Nitish Kumar, Sangam Verma, Karamjit Singh and Pranav Poduval

 

 

Trustworthy AI, others

GroupMixNorm Layer for Learning Fair Models

Venue: Workshop on Interpolation Regularizers and Beyond in conjunction with NeurIPS, 2022‎‎

Recent research has focused on proposing algorithms for mitigating bias in automated prediction systems. Most techniques include convex surrogates of fairness metrics, such as demographic parity or equalized odds, in the loss function, which are not easy to estimate. Further, these fairness constraints are mostly data-dependent and aim to minimize disparity among the protected groups during training, so they may not achieve similar performance on the test set. To address these limitations, this research proposes a novel GroupMixNorm layer for bias mitigation in deep learning models. As an alternative to solving a constrained optimization separately for each fairness metric, we formulate bias mitigation as a problem of distribution alignment across the groups identified through the protected attributes. To this effect, the GroupMixNorm layer probabilistically mixes group-level feature statistics of samples across different groups based on the protected attribute. The proposed method improves upon several fairness metrics with minimal impact on accuracy. Experimental evaluation and extensive analysis on benchmark tabular and image datasets demonstrate its efficacy in achieving state-of-the-art performance.
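A minimal sketch of the core idea is to normalize each sample with its own group's statistics and re-scale with a mixture of both groups' statistics, aligning the group-conditional feature distributions. The actual layer is trained end-to-end inside the network; the details below (including the Beta-distributed mixing coefficient and all names) are assumptions for illustration:

```python
import numpy as np

def group_mix_norm(x, group, lam=None, rng=None):
    """Mix group-level feature statistics across two protected groups.

    Each sample is normalized with its own group's (mean, std), then
    re-scaled with a convex mixture of both groups' statistics, pushing
    group-conditional feature distributions toward alignment.
    """
    rng = rng or np.random.default_rng()
    stats = {g: (x[group == g].mean(0), x[group == g].std(0) + 1e-8)
             for g in (0, 1)}
    if lam is None:
        lam = rng.beta(2.0, 2.0)  # mixup-style mixing coefficient (assumed)
    mu_mix = lam * stats[0][0] + (1 - lam) * stats[1][0]
    sd_mix = lam * stats[0][1] + (1 - lam) * stats[1][1]
    out = np.empty_like(x, dtype=float)
    for g in (0, 1):
        mu_g, sd_g = stats[g]
        out[group == g] = (x[group == g] - mu_g) / sd_g * sd_mix + mu_mix
    return out

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 1, (100, 3)), rng.normal(5, 2, (100, 3))])
g = np.array([0] * 100 + [1] * 100)      # protected-group labels
x_mixed = group_mix_norm(x, g, lam=0.5)
gap_before = abs(x[g == 0].mean() - x[g == 1].mean())
gap_after = abs(x_mixed[g == 0].mean() - x_mixed[g == 1].mean())
```

After the layer, both groups share the mixed mean and scale, so the group-wise mean gap collapses while within-group structure is preserved.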

Authors: Anubha Pandey, Aditi Rai, Maneet Singh, Deepak Bhatt and Tanmoy Bhowmik

 

 

Others

Post-pandemic Economic Transformations in the United States of America‎

Venue: Workshop on Social Data Mining in the Post-pandemic Era (SocDM 2022) in conjunction with IEEE Conference on Data Mining, 2022‎

The COVID-19 pandemic has impacted economic activity not only in the United States but across the globe. Lockdowns and travel restrictions imposed by local authorities have led to changes in customer preferences and thus a transformation of economic activity from traditional areas to new regions. While most changes have been temporary and short-term, some have been observed to be of a permanent nature. Using large-scale aggregated and anonymized transaction data across various socio-economic groups, we analyse and discuss this temporary relocation of citizens' economic activities in metropolitan areas of 15 US states. The results of this study have extensive implications for urban planners and business owners, and can provide insights into the temporary relocation of economic activities resulting from an extreme exogenous shock like the COVID-19 pandemic.

Authors: Avi Chawla, Nidhi Mulay, Vikas Bishnoi, Yatin Katyal, Ankur Saraswat, Mohsen Bahrami, Esteban Moro and Alex Pentland

 

 

Fraud Detection, Machine learning

Label-aware Sampling using Contrastive Learning for GNN-based Fraud Detection

Venue: Workshop on ML in Finance​​​​​​​ in conjunction with 28th SIGKDD Conference on Knowledge Discovery and Data Mining, 2022‎

Graph-based methods have garnered a lot of attention in fraud detection tasks due to the relational nature of fraud behaviour. Owing to the success of graph neural networks (GNNs) in various graph-analytical problems like link prediction, node classification and graph classification, various GNN-based fraud detection models have been proposed. Most GNN-based approaches rely on aggregating information from neighbours to make inferences for a given node; however, these architectures do not explicitly identify which neighbours are valuable to the learning task, and aggregating uninformative neighbours may harm model performance. In many real-world fraud situations, the label distribution is highly skewed, with a small fraction of fraud events compared to non-fraud events. The problem of sampling relevant neighbours for GNN aggregation is further exacerbated under heavy class imbalance, since a fraudulent node can easily camouflage among many non-fraud nodes and rely on neighbour aggregation to evade the fraud detector. In this paper, we propose a novel GNN-based imbalanced fraud detection model. Our approach first splits a node's full neighbourhood into label-aware sub-graphs, which are then sampled by means of a separate Siamese network trained with a contrastive loss. The contrastive network assigns a score to each pair of neighbouring nodes, and sampling the neighbourhood by contrastive score allows us to under-sample the majority-class neighbourhood. We then employ separate GNN layers for each filtered sub-graph to aggregate information and build corresponding node embeddings, which are combined via an aggregation function into a final node representation that is mapped to the class label by a multi-layer perceptron. Experiments on the real-world Bitcoin transaction dataset (Elliptic) demonstrate that the proposed framework outperforms state-of-the-art baselines.
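The label-aware sampling step can be sketched as follows; cosine similarity stands in for the trained Siamese/contrastive scorer, and all names are hypothetical:

```python
import numpy as np

def sample_label_aware(node_feat, neigh_feats, neigh_labels, k):
    """Split a node's neighbourhood into label-aware sub-graphs and
    under-sample the majority-class sub-graph by a relevance score."""
    def score(a, b):
        # Cosine similarity as a stand-in for the contrastive scorer
        return (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)

    subgraphs = {}
    for label in (0, 1):  # 0 = non-fraud (majority), 1 = fraud (minority)
        idx = np.where(neigh_labels == label)[0]
        if len(idx) > k:  # keep only the k highest-scoring neighbours
            s = np.array([score(node_feat, neigh_feats[i]) for i in idx])
            idx = idx[np.argsort(-s)[:k]]
        subgraphs[label] = idx
    return subgraphs

rng = np.random.default_rng(2)
node = rng.normal(size=8)
feats = rng.normal(size=(50, 8))
labels = np.array([0] * 45 + [1] * 5)   # heavy class imbalance
sub = sample_label_aware(node, feats, labels, k=5)
```

Each label-aware sub-graph would then be fed to its own GNN layer, as described above.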

Authors: Garima Arora, Adarsh Patankar, Akash Choudhary and Janu Verma

 

 

Machine learning, Fraud Detection

FairGen: Fair Synthetic Data Generation

Venue: DataPerf2022 Workshop in conjunction with International Conference on Machine Learning (ICML), 2022‎‎

With the rising adoption of machine learning across domains like banking, pharmaceuticals and ed-tech, it has become of utmost importance to adopt responsible AI methods to ensure models do not unfairly discriminate against any group. Given the lack of clean training data, generative adversarial techniques are preferred for generating synthetic data, with several state-of-the-art architectures readily available across domains, from unstructured data such as text and images to structured datasets modelling fraud detection and more. These techniques overcome challenges such as class imbalance, limited training data, and restricted access to data due to privacy issues. Existing work on generating fair data either works only for a certain GAN architecture or is very difficult to tune across GANs. In this paper, we propose a pipeline to generate fairer synthetic data independent of the GAN architecture. The pipeline utilizes a pre-processing algorithm to identify and remove bias-inducing samples. In particular, we claim that while generating synthetic data, most GANs amplify bias present in the training data, but by removing these bias-inducing samples, GANs focus more on real, informative samples. Our experimental evaluation on two open-source datasets demonstrates that the proposed pipeline generates fairer data, along with improved performance in some cases.


Authors: Himanshu Chaudhary, Bhushan Chaudhari, Aakash Agarwal, Kamna Meena and Tanmoy Bhowmik

 

 

Fraud Detection

TeGraF: Temporal and Graph based Fraudulent Transaction Detection Framework

Venue: 2nd ACM International Conference on Artificial Intelligence in Finance (ICAIF), 2021‎‎

Detection of fraudulent transactions is an imperative research area in the financial domain, affecting the different entities involved in the payment process. An accurate fraud detection algorithm helps identify fraudulent transactions, facilitating immediate response and dispute resolution. To this effect, this research proposes TeGraF, a novel framework for detecting fraudulent transactions by modeling temporal and structural features from a given input. The proposed algorithm operates at the intersection of two key research areas: temporal point processes (TPPs) and graph neural networks (GNNs). Due to the wide occurrence of sequential data in the financial domain, TPPs are very useful for modeling sequences of transactions. In parallel, financial data can also be represented as a graph structure capturing interactions between users and vendors/merchants. Thus, the proposed algorithm utilizes temporal features learned via a TPP-based model and structural features captured via a GNN for modeling fraudulent transactions. Different graph representation learning techniques, including Node2Vec, Metapath2Vec, LINE, DeepWalk, and BiNE, are employed to compare overall performance. Experiments on a synthetic dataset containing 62K users and 4M transactions demonstrate the improved performance of the proposed technique compared to existing algorithms.


Authors: Shivshankar Reddy, Pranav Poduval, Anand Vir Singh Chauhan, Maneet Singh, Sangam Verma, Karamjit Singh and Tanmoy Bhowmik

 

 

Machine learning

AuthSHAP — Authentication Vulnerability Detection on Tabular Data in Black Box Setting

Venue: 2nd ACM International Conference on Artificial Intelligence in Finance (ICAIF), 2021‎‎‎

Adversarial archetypes exploit the workings of any system to disrupt the robustness and decision-making of the underlying machine learning algorithms. This is an important area of concern in the banking industry, where the global adoption of real-time and frictionless payment systems has prompted financial institutions to invest in superior authentication solutions. Identifying fraudulent actors poses an inherent challenge for varied reasons: thorough fraud detection can hurt customer satisfaction, since it slows down the authentication process while customers demand faster responses. Present systems tend to produce numerous false positives and remain vulnerable to ingenious fraudsters, which, in conjunction with dollar loss, also leads to reputation loss for the issuers. In this paper, we present AuthSHAP, a first-of-its-kind, model-agnostic and robust application of SHAP (SHapley Additive exPlanations) to uncover the extent to which key features are (or are not) leveraged by a model in its decision making. This 'knowledge' is significant information for a fraudster designing intelligent or adversarial attacks. We show that even in black-box settings, where the attacker has only a fair process- or business-level understanding of the raw features passed through the system and of its responses, it is possible to understand the vulnerability. Thus, this can be used by any financial institution as an active preventive measure (a) to shield their authentication system from adversarial attacks and (b) to reduce false declines that cause a sub-optimal customer experience, amounting to both revenue and reputation loss for entities across the payment value chain. We propose an evaluation method using a simulation where we create a decision system (model) with a desired vulnerability and identify it via our proposed methodology in a black-box setting. We extend this by performing several experiments on aggregated and anonymized real-world financial transaction data and validate our findings with internal subject matter experts.
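AuthSHAP builds on Shapley values; as a rough illustration of the underlying computation, the exact Shapley attribution for a toy black-box scoring function can be computed by enumerating feature coalitions (the scoring function and feature values below are hypothetical; real systems rely on SHAP's sampling-based approximations):

```python
from itertools import combinations
from math import factorial

def shapley_values(f, x, baseline):
    """Exact Shapley values for a black-box scoring function f.

    Features in a coalition take their value from x; the rest are
    held at the baseline. Enumerating coalitions is feasible only
    for a handful of features, but it mirrors what SHAP
    approximates at scale.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            for subset in combinations(others, k):
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                with_i = [x[j] if j in subset or j == i else baseline[j] for j in range(n)]
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                phi[i] += weight * (f(with_i) - f(without_i))
    return phi

# Hypothetical linear fraud score: the Shapley values recover each
# feature's contribution relative to the baseline.
score = lambda v: 0.7 * v[0] + 0.3 * v[1]
phi = shapley_values(score, x=[1.0, 1.0], baseline=[0.0, 0.0])
```

Only query access to the scoring function is needed here, which is why attributions of this kind are both an explanation tool and, as the paper argues, a potential attack surface.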


Authors: Debasmita Das, Yatin Katyal, Rajesh Kumar Ranjan, Ram Ganesh V and Rohit Bhattacharya

 

 

Machine learning

MeTGAN: Memory efficient Tabular GAN for high cardinality categorical datasets

Venue: 28th International Conference on Neural Information Processing (ICONIP), 2021‎

Generative Adversarial Networks (GANs) have seen their use for generating synthetic data expand from unstructured data like images to structured tabular data. One of the recently proposed models in the field of tabular data generation, CTGAN, demonstrated state-of-the-art performance on this task even in the presence of a high class imbalance in categorical columns or multiple modes in continuous columns. Many recently proposed methods have also derived ideas from CTGAN. However, training CTGAN requires a high memory footprint when dealing with high-cardinality categorical columns in the dataset. In this paper, we propose MeTGAN, a memory-efficient version of CTGAN, which reduces memory usage by roughly 80% with a minimal effect on performance. MeTGAN uses sparse linear layers to overcome the memory bottlenecks of CTGAN. We compare the performance of MeTGAN with other models on publicly available datasets. Quality of data generation, memory requirements, and the privacy guarantees of the models are the metrics considered in this study. This paper also aims to draw the attention of the research community to the computational footprint of tabular data generation methods, to enable them on larger datasets, especially those with high-cardinality categorical variables.
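As a rough sketch of why sparse layers cut memory (the toy layer below is an illustrative stand-in, not MeTGAN's actual implementation): a layer that stores only its nonzero weights scales with the number of nonzeros rather than with in_features × out_features, which is what dominates when one-hot-encoded high-cardinality columns feed a dense layer.

```python
import random

class SparseLinear:
    """Toy sparse linear layer: stores only nonzero weights in a
    (row, col) -> value map, so memory scales with the number of
    nonzeros rather than in_features * out_features. MeTGAN applies
    the same idea to the layers touching high-cardinality inputs."""

    def __init__(self, in_features, out_features, density=0.1, seed=0):
        rng = random.Random(seed)
        self.out_features = out_features
        self.weights = {}
        for o in range(out_features):
            for i in range(in_features):
                if rng.random() < density:
                    self.weights[(o, i)] = rng.uniform(-0.1, 0.1)

    def forward(self, x):
        # Accumulate only over stored (nonzero) weights.
        y = [0.0] * self.out_features
        for (o, i), w in self.weights.items():
            y[o] += w * x[i]
        return y

layer = SparseLinear(in_features=1000, out_features=64, density=0.05)
dense_params = 1000 * 64
sparse_params = len(layer.weights)
```

At 5% density the layer holds roughly a twentieth of the dense parameter count, which is the kind of saving the paper exploits.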


Authors: Shreyansh Singh, Kanishka Kayathwal, Hardik Wadhwa and Gaurav Dhama

 

 

Time Series Modelling

Deviation-based Marked Temporal Point Process for Marker Prediction (2021)

Venue: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD), 2021‎

Temporal Point Processes (TPPs) are useful for modeling event sequences that do not occur at regular time intervals. For example, TPPs can be used to model the occurrence of earthquakes, social media activity, financial transactions, etc. Owing to their flexible nature and applicability in several real-world scenarios, TPPs have gained wide attention from the research community. In the literature, TPPs have mostly been used to predict the occurrence time of the next event, with limited focus on the type/category of the event, termed the marker. Further, limited focus has been given to modeling the inter-dependency of the event time and marker information for more accurate predictions. To this effect, this research proposes a novel Deviation-based Marked Temporal Point Process (DMTPP) algorithm focused on predicting the marker corresponding to the next event. Specifically, the deviation between the estimated and actual occurrence of the event is modeled for predicting the event marker. The DMTPP model is particularly useful in scenarios where the marker information is not known immediately upon the event's occurrence but is instead obtained after some time. DMTPP utilizes a Recurrent Neural Network (RNN) as its backbone for encoding the historical sequence pattern, and models the dependence between the marker and event time prediction. Experiments have been performed on three publicly available datasets for different tasks, where the proposed DMTPP model demonstrates state-of-the-art performance. For example, an accuracy of 91.76% is obtained on the MIMIC-II dataset, an improvement of over 6% from the state-of-the-art model.

Authors: Anand Vir Singh Chauhan, Shivshankar Reddy, Maneet Singh, Karamjit Singh and Tanmoy Bhowmik

 

 

Fraud Detection

CuRL: Coupled Representation Learning of Cards and Merchants to detect Transaction Frauds (2021)

Venue: 30th International Conference on Artificial Neural Networks (ICANN), 2021‎

Payment networks like Mastercard or Visa process billions of transactions every year. A significant number of these transactions are fraudulent and cause huge losses to financial institutions. Conventional fraud detection methods fail to capture higher-order interactions between payment entities, i.e., cards and merchants, which could be crucial to detect out-of-pattern, possibly fraudulent transactions. Several works have focused on capturing these interactions by representing the transaction data either as a bipartite graph or as homogeneous graph projections of the payment entities. In a homogeneous graph, higher-order cross-interactions between the entities are lost and hence the representations learned are sub-optimal. In a bipartite graph, the sequences generated through random walks are stochastic, computationally expensive to generate, and sometimes drift away to include uncorrelated nodes. Moreover, scaling graph-learning algorithms and using them for real-time fraud scoring is an open challenge. In this paper, we propose CuRL and tCuRL, coupled representation learning methods that can effectively capture the higher-order interactions in a bipartite graph of payment entities. Instead of relying on random walks, the proposed methods generate coupled session-based interaction pairs of entities, which are then fed as input to a skip-gram model to learn entity representations. The model learns the representations for both entities simultaneously and in the same embedding space, which helps to capture their cross-interactions effectively. Furthermore, considering the session-constrained neighborhood structure of an entity makes the pair generation process efficient. This paper demonstrates that the proposed methods run faster than many state-of-the-art representation learning algorithms and produce embeddings that outperform other relevant baselines on the fraud classification task.
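The session-based pair generation can be sketched as follows (a simplified, hypothetical version of the paper's scheme; the resulting pairs would then be fed to a skip-gram model to learn card and merchant embeddings in one space):

```python
from collections import defaultdict

def session_pairs(transactions, session_gap=3600):
    """Generate coupled training pairs per session, in the spirit of
    CuRL: a card's transactions are split into sessions by time gaps,
    and entities seen within a session are paired.

    transactions: list of (card, merchant, unix_time) tuples.
    """
    by_card = defaultdict(list)
    for card, merchant, t in transactions:
        by_card[card].append((t, merchant))
    pairs = []
    for card, events in by_card.items():
        events.sort()
        session, last_t = [], None
        for t, merchant in events:
            if last_t is not None and t - last_t > session_gap:
                session = []                    # time gap: new session
            pairs.append((card, merchant))      # card-merchant pair
            for prev in session:                # merchant-merchant pairs
                pairs.append((prev, merchant))
            session.append(merchant)
            last_t = t
    return pairs

txns = [("c1", "m1", 0), ("c1", "m2", 100), ("c1", "m3", 10000)]
pairs = session_pairs(txns, session_gap=3600)
```

Merchants visited close together in time end up paired, while the session reset keeps unrelated entities apart, in contrast to random walks that can drift to uncorrelated nodes.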


Authors: Maitrey Gramopadhye*, Shreyansh Singh*, Kushagra Agarwal, Nitish Srivasatava, Alok Mani Singh, Siddhartha Asthana and Ankur Arora

 

 

Time Series Modelling

A Survey on Classical and Deep Learning based Intermittent Time Series Forecasting Methods (2021)

Venue: International Joint Conference on Neural Networks (IJCNN), 2021

Demand forecasting is a fundamental aspect of inventory and supply chain management. Due to the sporadic nature of the demand, demand forecasting involves dealing with intermittent time series in domains such as retail and manufacturing. Conventional forecasting methods do not work well for intermittent time series due to the inherent sparsity in such series. Researchers have proposed multiple methods to deal with intermittent time series, such as Croston's method and its variants. Our work aims to provide insight into the various forecasting methods traditionally known to work well for intermittent series. We have also explored deep learning methods proposed in recent literature. These methods are thoroughly reviewed and explained in this survey paper. Additionally, experiments are done on two publicly available datasets to compare the performance of the traditional methods with deep learning models. Furthermore, a hybrid model made of independent classification and regression trees has been implemented and studied as well. We provide a comprehensive evaluation that aims at selecting the appropriate method given the dataset, context, and objectives that have to be met by the forecasting practitioner/researcher.
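For reference, Croston's method, the classical baseline such surveys start from, smooths the nonzero demand sizes and the intervals between them separately; the per-period forecast is size divided by interval (a minimal sketch; initialization conventions vary across implementations):

```python
def croston(demand, alpha=0.1):
    """Croston's method for intermittent demand: exponential
    smoothing of nonzero demand sizes (z) and of inter-demand
    intervals (p), forecasting z / p per period. Forecasts are
    None until the first nonzero demand is observed."""
    z = p = None       # smoothed demand size, smoothed interval
    q = 1              # periods since the last nonzero demand
    forecasts = []
    for d in demand:
        forecasts.append(None if z is None else z / p)
        if d > 0:
            if z is None:
                z, p = float(d), float(q)      # initialize on first demand
            else:
                z += alpha * (d - z)
                p += alpha * (q - p)
            q = 1
        else:
            q += 1
    return forecasts

f = croston([0, 3, 0, 0, 3, 0], alpha=0.1)
```

Because the forecast is a ratio of two smoothed quantities rather than a smoothed series of mostly zeros, it does not collapse toward zero between demand spikes, which is why it remains the standard baseline for intermittent series.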

Authors: Karthikeswaren Ramachandran, Kanishka Kayathwal, Gaurav Dhama and Ankur Arora

 

 

Fraud Detection

Application of Reinforcement Learning to Payment Fraud (2021)‎

Venue: Workshop on Multi-Armed Bandits and Reinforcement Learning: Advancing Decision Making in E-Commerce and Beyond in conjunction with the 27th ACM Conference on Knowledge Discovery and Data Mining (KDD), 2021‎

The large variety of digital payment choices available to consumers today has been a key driver of e-commerce transactions in the past decade. Unfortunately, this has also given rise to cybercriminals and fraudsters who are constantly looking for vulnerabilities in these systems by deploying increasingly sophisticated fraud attacks. A typical fraud detection system employs standard supervised learning methods where the focus is on maximizing the fraud recall rate. However, we argue that such a formulation can lead to sub-optimal solutions. The design requirements for these fraud models require that they be robust to the high class imbalance in the data, adaptive to changes in fraud patterns, maintain a balance between the fraud rate and the decline rate to maximize revenue, and be amenable to asynchronous feedback, since there is usually a significant lag between the transaction and the fraud realization. To achieve this, we formulate fraud detection as a sequential decision-making problem by including the utility maximization within the model in the form of the reward function. The historical decline rate and fraud rate define the state of the system, with a binary action space composed of approving or declining the transaction. In this study, we primarily focus on utility maximization and explore different reward functions to this end. The performance of the proposed Reinforcement Learning system has been evaluated on two publicly available fraud datasets using Deep Q-learning and compared with different classifiers. We aim to address the rest of the issues in future work.
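The utility-maximization idea can be made concrete with a toy reward function (the coefficients and exact form below are illustrative assumptions; the paper explores several reward designs):

```python
APPROVE, DECLINE = 1, 0

def reward(action, is_fraud, amount, margin=0.02, fraud_penalty=1.0):
    """Illustrative utility-style reward for a fraud-detection agent:
    approving a genuine transaction earns a small revenue margin,
    approving fraud costs the transaction amount, and declining a
    genuine transaction forfeits that margin (a false decline)."""
    if action == APPROVE:
        return -fraud_penalty * amount if is_fraud else margin * amount
    return 0.0 if is_fraud else -margin * amount
```

Under a reward like this, a policy that maximizes expected return must trade the fraud rate off against the decline rate, rather than maximizing recall alone.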


Authors: Siddharth Vimal, Kanishka Kayathwal, Hardik Wadhwa and Gaurav Dhama

 

 

Machine learning

DEDD: Deep Encoder with Dual Decoder Architecture for Stability and Specificity Preserving Encoding and Translation of Embedding between Domains (2021)

Venue: International Conference on Information Technology and Cloud Computing (ITCC) in conjunction with International Conference on Computing, Networks and Internet of Things (CNIOT), 2021‎

We propose a deep learning-based encoder with a dual-decoder system to enrich the expressive power of embeddings pre-trained on two different corpora, along with switching representations between domains. There are two scenarios: (a) each corpus pertains to a different subject matter or topic of interest, and (b) one corpus is a vast super-domain with generic, non-specific embeddings while the second pertains to one specific sub-domain. In either case, the criterion for high-quality training is to have enough common words between them. Mapping the contextual embeddings from both corpora into a common latent space blends the semantic richness of both corpus-specific learnings while maintaining embedding stability. Furthermore, there is one dedicated decoder for each domain for generating representations from the common latent space. We evaluated our method for cross-learning between generalized GloVe embeddings and a very specialized skill embedding developed by random walks on a graph-based Skills Hierarchy. We demonstrate that our method preserves the stability of the generic embedding and the specificity of the skill domain, and enriches the semantic representation of either domain through the switching enabled by the encoder-to-dual-decoder path.


Authors: Rajesh Kumar Ranjan, Debasmita Das, Ram Ganesh V., Yatin Katyal and Tanmoy Bhowmik

 

 

Time Series Modelling

Deep Learning based Time Series Forecasting (2021)‎

Venue: Book chapter in the ‘Deep Learning Applications, Volume 3’ to be published by Springer, 2021‎

For decision-makers in the forecasting sector, decision processes such as facility planning and optimal day-to-day operation within the domain are complex, with several different levels to be considered. These decisions address widely different time horizons and aspects of the system, making them difficult to model. The advent of deep learning in forecasting removed the need for expensive hand-crafted features and deep domain knowledge. This work aims to give structure to the existing literature on deep learning-based time series forecasting. Based on the underlying structure of each technique, such as RNN, CNN, and Transformer, we categorize various deep learning-based time series forecasting techniques and provide a consolidated report. Additionally, we perform experiments to compare these techniques on four different publicly available datasets. Finally, based on these experiments, we provide intuitive reasoning behind their performance. We believe this work will help researchers choose relevant techniques for future research.

Authors: Kushagra Agarwal, Lalasa Dheekollu, Gaurav Dhama, Ankur Arora, Siddhartha Asthana and Tanmoy Bhowmik

 

 

Machine learning

A comparative study on Transformers for Word Sense Disambiguation‎

Venue: 28th International Conference on Neural Information Processing (ICONIP), 2021‎

Recent years of research in Natural Language Processing (NLP) have witnessed dramatic growth in training large models for generating context-aware language representations. In this regard, numerous NLP systems have leveraged the power of neural network-based architectures to incorporate sense information in embeddings, resulting in Contextualized Word Embeddings (CWEs). Despite this progress, the NLP community has not witnessed any significant work performing a comparative study on the contextualization power of such architectures. This paper presents a comparative study and an extensive analysis of nine widely adopted Transformer models: BERT, CTRL, DistilBERT, OpenAI-GPT, OpenAI-GPT2, Transformer-XL, XLNet, ELECTRA, and ALBERT. We evaluate their contextualization power using two lexical sample Word Sense Disambiguation (WSD) tasks, SensEval-2 and SensEval-3. We adopt a simple yet effective approach to WSD that uses a k-Nearest Neighbor (kNN) classification on CWEs. Experimental results show that the proposed techniques achieve superior results over the current state-of-the-art on both WSD tasks.
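The kNN-on-CWEs approach is simple enough to sketch end to end: assign a query occurrence the majority sense among its k nearest labelled occurrences by cosine similarity (toy two-dimensional vectors stand in for real Transformer embeddings):

```python
from math import sqrt
from collections import Counter

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def knn_wsd(query_emb, labelled_occurrences, k=3):
    """kNN word sense disambiguation over contextualized embeddings:
    rank labelled occurrences of the target word by cosine similarity
    to the query occurrence and take the majority sense of the top k.

    labelled_occurrences: list of (embedding, sense) pairs."""
    ranked = sorted(labelled_occurrences,
                    key=lambda item: cosine(query_emb, item[0]),
                    reverse=True)
    votes = Counter(sense for _, sense in ranked[:k])
    return votes.most_common(1)[0][0]

# Toy inventory for the ambiguous word "bank" (vectors are made up).
inventory = [([1.0, 0.1], "bank/finance"),
             ([0.9, 0.2], "bank/finance"),
             ([0.1, 1.0], "bank/river")]
pred = knn_wsd([0.95, 0.15], inventory, k=3)
```

Because the classifier itself is fixed and trivial, any accuracy difference between models isolates the contextualization power of the embeddings, which is the point of the comparison.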


Authors: Avi Chawla, Nidhi Mulay, Vikas Bishnoi, Gaurav Dhama and A.K. Singh

 

 

Machine learning

Improving the performance of Transformer Context Encoders for NER‎

Venue: IEEE Conference on Information Fusion (FUSION), 2021​‎

Large Transformer-based models have provided state-of-the-art results on a variety of Natural Language Processing (NLP) tasks. While these Transformer models perform exceptionally well on a wide range of NLP tasks, their usage in sequence labeling has been mostly muted. Although pretrained Transformer models such as BERT and XLNet have been successfully employed as input representations, the use of the Transformer model as a context encoder for sequence labeling is still minimal, and most recent works still use a recurrent architecture as the context encoder. In this paper, we compare the performance of the Transformer and recurrent architectures as context encoders on the Named Entity Recognition (NER) task. We vary the character-level representation module from previously proposed NER models in the literature and show how the modification can improve the NER model's performance. We also explore data augmentation as a method for enhancing performance. Experimental results on three NER datasets show that our proposed techniques establish a new state-of-the-art using the Transformer encoder over previously proposed models in the literature using only non-contextualized embeddings.


Authors: Avi Chawla, Nidhi Mulay, Vikas Bishnoi and Gaurav Dhama

 

 

Machine learning

Evolutionary adversarial attacks on Payment Systems‎

Venue: 20th IEEE International Conference on Machine Learning and Applications (ICMLA), 2021‎‎‎

Credit card fraud detection is arguably the most critical use case of machine learning for any payment system. Deep neural networks and tree-based classifiers can provide state-of-the-art performance for fraud classification. However, we try to emphasize that these models have serious vulnerabilities that need to be addressed. Studies show that it is possible to fool machine learning models with curated input samples known as adversarial examples. Attackers can use these examples to deceive the fraud classifiers deployed by institutions, causing considerable financial harm. We feel that the literature on adversarial examples for fraud detection systems has been limited to simpler datasets. In this paper, we use two large publicly available datasets for credit card fraud detection to benchmark the performance of some conventional machine learning models and compare the effectiveness of different black-box attacks on the best-performing model. Lastly, we introduce a novel gradient-free approach to black-box attacks, which uses evolution-based specialized perturbations to create attacks (ESPA). We show that the new method requires far fewer queries than other black-box attack methods like Zeroth Order optimization, Boundary Attack, and HopSkipJump, and can leverage the information gained from previously successful attacks.
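A minimal gradient-free, evolution-style black-box attack loop looks like the following (an illustrative sketch under simplifying assumptions, not the paper's ESPA algorithm; the toy fraud scorer is hypothetical):

```python
import math
import random

def evolve_attack(classify, x, steps=200, pop=8, sigma=0.05, seed=0):
    """Evolution-style black-box attack sketch: each generation
    mutates the current best input with Gaussian noise, keeps the
    mutant that most lowers the model's fraud score, and stops once
    the decision flips below 0.5. Each classify() call is one query,
    so fewer generations means fewer queries."""
    rng = random.Random(seed)
    best = list(x)
    best_score = classify(best)
    for _ in range(steps):
        if best_score < 0.5:          # decision flipped: attack found
            break
        candidates = []
        for _ in range(pop):
            cand = [v + rng.gauss(0, sigma) for v in best]
            candidates.append((classify(cand), cand))
        score, cand = min(candidates)  # elitism: keep only improvements
        if score < best_score:
            best_score, best = score, cand
    return best, best_score

# Toy differentiable-free target: a logistic score over two features.
fraud_score = lambda v: 1 / (1 + math.exp(-(v[0] + v[1])))
adv, adv_score = evolve_attack(fraud_score, [1.0, 1.0])
```

The elitist update guarantees the score never increases, and no gradient information is ever requested, which is the defining property of the attack family the paper benchmarks against.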

Authors: Nishant Kumar, Siddharth Vimal, Kanishka Kayathwal and Gaurav Dhama

 

 

Machine learning

Modelling Approaches for Silent Attrition Prediction in Payment Networks‎

Venue: 20th IEEE International Conference on Machine Learning and Applications (ICMLA), 2021‎‎

Predicting customer attrition (churn) is a well-known problem in industries that provide services, such as financial institutions, telecommunications, e-commerce, and retail. There are two kinds of attrition: active and passive (silent). Active attrition is usually associated with subscription-based business models, commonly seen in telecommunications and internet businesses like Netflix. In industries like finance, retail, and e-commerce, we see the other kind: silent attrition, where customers stop doing business without formal notice. This makes the silent attrition prediction problem even more challenging, because it is difficult to differentiate between attrited and inactive customers. We focus our work on predicting silent attrition, which is still under-explored in the payment card industry (i.e., Mastercard, Visa). The contribution of our work is threefold. First, we present a data-driven approach to define silent attrition as customer inactivity. Second, we discuss multiple procedures to generate synthetic data, thereby preserving customers' privacy. Last, we present a comprehensive view of the various machine learning (ML) pathways in which this churn prediction problem can be framed and solved, each requiring specific feature engineering. We present experimental results corresponding to each pathway for comparative analysis. We believe this work will be beneficial to researchers and ML practitioners who often have to deal with sensitive financial data but have limited permission to use it. In this direction, we demonstrate the use of synthetic data generation to reduce the risk of data leakage and other privacy concerns relating to ML model development.
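A data-driven inactivity definition can be sketched by deriving the attrition window from observed inter-transaction gaps (an illustrative construction with a hypothetical quantile choice, not necessarily the paper's exact procedure):

```python
def inactivity_threshold(txn_days, quantile=0.95):
    """Data-driven inactivity window: take a high quantile of the
    customer base's inter-transaction gaps, so 'silently attrited'
    means 'inactive for longer than almost any active customer
    ever is between purchases'.

    txn_days: list of per-customer sorted transaction-day lists."""
    gaps = sorted(b - a for days in txn_days for a, b in zip(days, days[1:]))
    return gaps[min(int(quantile * len(gaps)), len(gaps) - 1)]

def is_silently_attrited(last_txn_day, today, threshold):
    """Label a customer attrited once their silence exceeds the
    data-derived window (no formal churn event exists to label)."""
    return (today - last_txn_day) > threshold

threshold = inactivity_threshold([[0, 10, 20], [0, 30]])
```

Deriving the window from the data, rather than fixing it by fiat, is what separates attrited customers from merely infrequent ones.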


Authors: Lalasa Dheekollu, Hardik Wadhwa, Siddharth Vimal, Anubhav Gupta, Siddhartha Asthana, Ankur Arora and Smriti Gupta

 

 

Machine learning

Label-Value Extraction from Documents using Co-SSL Framework‎

Venue: 17th International Conference on Advanced Data Mining and Applications (ADMA), 2021‎

Label-value extraction from documents refers to the task of extracting relevant values for corresponding labels/fields. For example, it encompasses extracting the total amount from receipts, the date value from invoices/patents/forms, or the tax amount from receipts/invoices. Automated label-value extraction has widespread applicability in real-world scenarios of document understanding, book-keeping, reconciliation and content summarization. Recent research has focused on developing label-value extraction models; however, to the best of our knowledge, limited attention has been given to developing a lightweight, compact label-value extraction module generalizable across different document types. Since in real-world deployment a model is often required to process different types of documents for the same label/field type, this research proposes a novel Context-based Semi-supervised (Co-SSL) framework for the task. The proposed Co-SSL framework focuses on identifying candidates for each label/field, followed by the generation of their context based on spatial cues. Further, novel data augmentation strategies are proposed that are specifically applicable to the problem of information extraction from documents. The extracted information (candidate and context) is then provided to a deep learning based model trained in a novel semi-supervised setting for applicability in real-world scenarios of limited training data. The performance of the Co-SSL framework has been demonstrated on three challenging datasets containing different document types (receipts, patents and forms).

Authors: Sara Sai Abhishek, Maneet Singh, Bhanupriya Pegu and Karamjit Singh

 

 

Fraud Detection

Temporal Debiasing using Adversarial Loss based GNN Architecture for Crypto Fraud Detection

Venue: ‎20th IEEE International Conference on Machine Learning and Applications (ICMLA), 2021‎

The tremendous rise of cryptocurrency in the payment domain has unlocked huge opportunities but has also raised numerous challenges involving cybercriminal activities such as money laundering, terrorist financing, and illegal and risky services, owing to its anonymous and decentralized setup. The demand for building a more transparent cryptocurrency network, resilient to such activities, has risen extensively as more financial institutions look to incorporate it into their networks. While a plethora of traditional machine learning and graph-based deep learning techniques have been developed to detect illicit activities in a cryptocurrency transaction network, the challenge of generalization and robust model performance on future timesteps still exists. In this paper, we show that models learned on the transactional feature set provided in the Elliptic dataset carry a temporal bias, i.e., they are highly dependent on the timesteps in which transactions occur. Deploying temporally biased models limits their performance on future timesteps. To address this, we propose a temporal debiasing technique using a GNN-based architecture that ensures generalization by adversarially learning between fraud classification and temporal classification. (Fraud and illicit are used interchangeably in this paper.) The constructed adversarial loss optimizes the embeddings to ensure that they (1) perform well on the fraud classification task and (2) do not contain temporal bias. The proposed architecture captures the underlying fraud patterns that remain consistent over time. We evaluate the performance of our proposed architecture on the Elliptic dataset and compare it with existing machine learning and graph-based architectures.

Authors: Aditya Singh, Anubhav Gupta, Hardik Wadhwa, Siddhartha Asthana and Ankur Arora

 

 

Fraud Detection

Med-Dynamic Meta Learning — A multi-layered representation to identify provider fraud in healthcare

Venue: The International FLAIRS Conference Proceedings, Vol. 34, 2021‎‎

Every year, health insurance fraud costs taxpayers billions of dollars and puts patients' health and welfare at risk. Existing solutions to detect fraudulent providers (hospitals, physicians, etc.) aim to find unusual patterns in claim-level features but fail to harness provider-provider and provider-patient interaction information. We propose a novel framework, Med-Dynamic Meta Learning (MeDML), that extends the capability of traditional fraud detection by learning patterns from (1) patient-provider interaction, using temporal and geo-spatial characteristics, (2) a provider's treatment, using encounter data (e.g., medical codes, mix of attended patients), and (3) referral, using underlying provider-provider relationships based on common patient visits within 30 days. To the best of our knowledge, MeDML is the first framework that can model fraud using a multi-aspect representation of providers. MeDML also encapsulates a provider's phantom-billing index, which identifies excessive and unnecessary services provided to patients, by segmenting frequently co-occurring diagnoses and procedures in non-fraudulent providers' claims. It uses a novel framework to aggregate the learned representations, capturing their task-specific relative importance via an attention mechanism. We test the dynamically generated meta embedding using various downstream models and show that it outperforms all baseline algorithms on the provider fraud prediction task.
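The attention-based aggregation of the per-aspect representations can be sketched as a softmax-weighted sum (the relevance scores are given here for illustration; in MeDML they are learned jointly with the task):

```python
from math import exp

def attention_aggregate(embeddings, scores):
    """Attention-style fusion of per-aspect provider embeddings
    (e.g., interaction, treatment, referral): softmax the relevance
    scores into weights, then take the weighted sum of the vectors."""
    weights = [exp(s) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    dim = len(embeddings[0])
    return [sum(w * e[d] for w, e in zip(weights, embeddings))
            for d in range(dim)]

# Two toy aspect embeddings with equal relevance fuse to their mean.
meta = attention_aggregate([[1.0, 0.0], [0.0, 1.0]], scores=[0.0, 0.0])
```

The softmax weights make the relative importance of each aspect explicit and task-specific, which is the interpretability benefit attention brings over plain concatenation.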


Authors: Nitish Kumar, Deepak Chaurasiya, Alok Singh, Siddhartha Asthana, Kushagra Agarwal and Ankur Arora

 

 

Machine learning

MoDest: Multi-module Design Validation for Documents‎

Venue: ACM India Joint International Conference on Data Science and Management of Data (ACM CODS-COMAD), 2021‎

Information extraction (IE) from Visually Rich Documents (VRDs) is a common need for businesses, where extracted information is used for various purposes such as verification, design validation, or compliance. Most of the research in IE from VRDs has focused on textual documents such as invoices and receipts, while extracting information from multi-modal VRDs remains a challenging task. This research presents a novel end-to-end design validation framework for multi-modal VRDs containing textual and visual components, checking compliance against a pre-defined set of rules. The proposed Multi-mOdule DESign validaTion (MoDest) framework constitutes two steps: (i) information extraction using five modules for obtaining the textual and visual components, followed by (ii) validation of the extracted components against a pre-defined set of design rules. Given an input multi-modal VRD image, the MoDest framework either accepts or rejects its design while providing an explanation for the decision. The proposed framework is tested for design validation on a particular type of VRD, banking cards, under the real-world constraint of limited and highly imbalanced training data with more than 99% of card designs belonging to one class (accepted). Experimental evaluation on real-world images from our in-house dataset demonstrates the effectiveness of the proposed MoDest framework. Analysis drawn from the real-world deployment of the framework further strengthens its utility for design validation.

Authors: Bhanupriya Pegu, Maneet Singh, Kamal Kant, Karamjit Singh and Tanmoy Bhowmik

 

 

Trustworthy AI

Simultaneous Improvement of ML Model Fairness and Performance by Identifying Bias in Data‎

Venue: Data-centric AI Workshop in conjunction with Conference on Neural Information Processing Systems (NeurIPS), 2021‎

Machine learning models built on datasets containing discriminative instances, attributable to various underlying factors, result in biased and unfair outcomes. It is a well-founded and intuitive observation that existing bias mitigation strategies often sacrifice accuracy in order to ensure fairness. But when an AI engine's predictions are used for decision making that affects revenue or operational efficiency, such as credit risk modeling, the business would prefer that accuracy be reasonably preserved. This conflicting requirement of maintaining both accuracy and fairness motivates our research. In this paper, we propose a fresh approach for simultaneously improving the fairness and accuracy of ML models within a realistic paradigm. The essence of our work is a data preprocessing technique that can detect instances ascribing a specific kind of bias that should be removed from the dataset before training, and we further show that such instance removal has no adverse impact on model accuracy. In particular, we claim that in problem settings where instances exist with similar features but different labels caused by variation in protected attributes, an inherent bias gets induced in the dataset, which can be identified and mitigated through our novel scheme. Our experimental evaluation on two open-source datasets demonstrates how the proposed method can mitigate bias while improving rather than degrading accuracy, all while offering a certain degree of control to the end user.
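The core preprocessing idea, flagging instances whose non-protected features match but whose labels differ across protected groups, can be sketched as follows (a simplified, hypothetical version using exact feature matches; the paper's detection scheme is more general):

```python
from collections import defaultdict

def biased_instances(rows):
    """Flag instances that encode label bias: groups of rows with
    identical non-protected features where the label varies along
    with the protected attribute. Such rows are candidates for
    removal before training.

    rows: list of (features_tuple, protected_value, label)."""
    groups = defaultdict(list)
    for idx, (feats, prot, label) in enumerate(rows):
        groups[feats].append((idx, prot, label))
    flagged = set()
    for feats, members in groups.items():
        labels_by_prot = defaultdict(set)
        for idx, prot, label in members:
            labels_by_prot[prot].add(label)
        # Bias signal: multiple protected groups share these features
        # but receive different label sets.
        label_sets = set(frozenset(s) for s in labels_by_prot.values())
        if len(labels_by_prot) > 1 and len(label_sets) > 1:
            flagged.update(idx for idx, _, _ in members)
    return flagged

# Rows 0 and 1 have identical features but opposite labels that track
# the protected attribute; row 2 is untouched.
rows = [((1, 2), "a", 1), ((1, 2), "b", 0), ((3, 4), "a", 1)]
flagged = biased_instances(rows)
```

Removing only these contradictory instances is what lets the method improve fairness without the usual accuracy penalty: the remaining training data is more self-consistent, not merely smaller.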


Authors: Aakash Agarwal, Bhushan Chaudhari and Tanmoy Bhowmik

 

 

Others

Effects of stimulus checks on spending patterns of different economic groups

Venue: ‎IEEE International Conference on Data Mining Workshops (ICDMW), 2021‎‎

This paper uses daily anonymous aggregated transaction data to analyze the changes in consumer spending caused by receipt of the stimulus payments in the United States during the COVID-19 pandemic. The stimulus checks were provided as part of the CARES Act, aiming to provide emergency assistance to individuals and businesses affected by the pandemic. We analyze the impact of the receipt of those payments on the aggregated daily spending of different socio-economic groups and industries. We show that the transaction patterns of low-spending consumers were the most impacted by the stimulus payments among the different spending groups. Our results also indicate that consumer responses after the first stimulus check (April 2020) were substantial and significant in industries that sell daily essential items, whereas consumer responses after the third stimulus check (March 2021) were significant for non-essential goods (e.g., the luxury and entertainment sectors). The results of this study are of crucial importance because they could help policy makers better shape stimulus payments that may be needed in future emergencies.


Authors: Nidhi Mulay, Vikas Bishnoi, Yatin Katyal, Mohsen Bahrami, Esteban Moro, Ankur Saraswat and Alex Pentland

 

 

Time Series Modelling

Adversarial Generation of Temporal Data: A Critique on Fidelity of Synthetic Data‎

Venue: First International Workshop on Machine Learning for Irregular Time Series in conjunction with European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD), 2021

Generative modelling for temporal data has seen a paradigm shift from autoregressive to adversarial models. Adversarial generation algorithms have proven to be more efficient in capturing the complex temporal correlations that the simplistic autoregressive model could not. Albeit, high-fidelity remains a concern even for adversarial modelling. The generation of high-fidelity data requires the model to have three strengths: capture feature correlations, model long-term dependencies, and scalability in dimensions. This paper analyzes these strengths on the existing methods of adversarial temporal generation regarding the fidelity of synthetic data. Towards this, we evaluate different algorithms for adversarial temporal generation on five different datasets of varying dynamics (long-term vs. short-term dependency) and dimensionality. We conclude by discussing gaps in the literature and future directions for high fidelity temporal data generation through adversarial methods.

Authors: Ankur Debnath, Govind Waghmare, Hardik Wadhwa, Siddhartha Asthana and Ankur Arora

 

 

Time Series Modelling

Exploring Generative Data Augmentation in Multivariate Time Series Forecasting: Opportunities and Challenges (2021)‎

Venue: Workshop on Mining and Learning from Time Series (MiLeTS) in conjunction with ACM International Conference on Knowledge Discovery and Data Mining (KDD), 2021‎

In multivariate time series (MTS), each time point comprises multiple time-dependent variables. The short-term and long-term correlation of these variables is a significant characteristic of MTS and a key challenge when modelling it. While classical auto-regressive models are heavily used to model MTS, neural models are more flexible and efficient. However, neural models rely on a large amount of labelled data for training, and the availability of labelled time series data can be a bottleneck in real-world scenarios. This scarcity of labelled data can be mitigated by data augmentation. In MTS, augmentation techniques need to capture both short-term correlations and long-term temporal dynamics. In this work, we introduce a novel meta-algorithm for time-series data augmentation to address the data scarcity problem. Due to the intrinsic ordering of samples in time series, we argue that one cannot simply add synthetic samples to the real samples for augmentation. To this end, we generate synthetic MTS data that preserves temporal dynamics using an off-the-shelf generative algorithm and frame augmentation in MTS as a transfer learning problem. In addition, we point out the drawbacks of generative models in MTS augmentation. We show the effectiveness of our method on publicly available MTS datasets for forecasting, and perform qualitative and quantitative analysis of synthetic MTS data and its applicability to long-term forecasting. To the best of our knowledge, this is the first study of generative data augmentation for MTS forecasting.
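The transfer-learning framing above — pretrain a forecaster on synthetic series, then fine-tune on scarce real data — can be sketched as a toy. The linear AR(p) forecaster and sine-wave data below are our own illustrative stand-ins, not the paper's generator or model:

```python
import numpy as np

def fit_ar(series, p=3, w=None, lr=0.1, epochs=200):
    """Fit a linear AR(p) one-step forecaster by gradient descent.
    Passing `w` warm-starts from weights pretrained on synthetic data."""
    X = np.stack([series[i:i + p] for i in range(len(series) - p)])
    y = series[p:]
    w = np.zeros(p) if w is None else w.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)  # mean-squared-error gradient
        w -= lr * grad
    return w

rng = np.random.default_rng(0)
# plentiful synthetic series vs. a scarce real one (hypothetical data)
synthetic = np.sin(np.linspace(0, 20, 500)) + 0.1 * rng.normal(size=500)
real = np.sin(np.linspace(0, 20, 80)) + 0.1 * rng.normal(size=80)

w_pre = fit_ar(synthetic)         # pretrain on synthetic data
w_final = fit_ar(real, w=w_pre)   # fine-tune on the small real set
```

The key point is that the synthetic samples never enter the real training set directly; they only shape the initialization, respecting the intrinsic ordering of the real series.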

Authors: Ankur Debnath, Govind Waghmare, Hardik Wadhwa, Siddhartha Asthana and Ankur Arora

 

 

Machine learning

Self-Training with Ensemble of Teacher Models (2021)‎

Venue: Workshop on Weakly Supervised Representation Learning in conjunction with the 30th International Joint Conference on Artificial Intelligence (IJCAI), 2021‎‎

Training robust deep learning models requires large amounts of labelled data. In the absence of such large repositories of labelled data, unlabeled data can be exploited instead. Semi-supervised learning aims to utilize such unlabeled data for training classification models. Recent progress in self-training based approaches has shown promise in this area, motivating this study, where we utilize an ensemble approach for the same. A by-product of any semi-supervised approach may be a loss of calibration of the trained model, especially in scenarios where the unlabeled data contains out-of-distribution samples, which leads to this investigation of how to adapt to such effects. Our proposed algorithm carefully avoids common pitfalls in utilizing unlabeled data and leads to a more accurate and calibrated supervised model compared to vanilla self-training based student-teacher algorithms. We perform several experiments on the popular STL-10 database, followed by an extensive analysis of our approach and a study of its effects on model accuracy and calibration.
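The student-teacher step at the heart of such methods can be illustrated with a toy pseudo-labelling routine. The ensemble averaging and confidence filter below are a minimal sketch of the general idea, not the paper's algorithm; the probabilities and threshold are hypothetical:

```python
import numpy as np

def pseudo_label(teacher_probs, threshold=0.9):
    """Average the softmax outputs of an ensemble of teachers and keep
    only unlabeled samples the ensemble agrees on with high confidence,
    which also filters likely out-of-distribution samples."""
    avg = np.mean(teacher_probs, axis=0)   # (n_samples, n_classes)
    conf = avg.max(axis=1)
    keep = conf >= threshold
    return np.where(keep)[0], avg[keep].argmax(axis=1)

# three hypothetical teachers scoring four unlabeled samples, two classes
probs = np.array([
    [[0.95, 0.05], [0.6, 0.4], [0.1, 0.9], [0.55, 0.45]],
    [[0.97, 0.03], [0.5, 0.5], [0.2, 0.8], [0.45, 0.55]],
    [[0.99, 0.01], [0.7, 0.3], [0.15, 0.85], [0.5, 0.5]],
])
idx, labels = pseudo_label(probs)
# only sample 0 (average confidence ~0.97) earns a pseudo-label
```

The retained pairs would then be added to the student's training set for the next round.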


Authors: Soumyadeep Ghosh, Sanjay Kumar, Janu Verma and Awanish Kumar

 

 

Time Series Modelling

Semi-supervised Learning for Marked Temporal Point Processes‎

Venue: Workshop on Weakly Supervised Representation Learning in conjunction with the 30th International Joint Conference on Artificial Intelligence (IJCAI), 2021‎

Temporal Point Processes (TPPs) are often used to represent sequences of events ordered by time of occurrence. Owing to their flexible nature, TPPs have been used to model different scenarios and have shown applicability in various real-world applications. While TPPs focus on modeling event occurrence, Marked Temporal Point Processes (MTPPs) also model the category/class of the event (termed the marker). Research in MTPP has garnered substantial attention over the past few years, with an extensive focus on supervised algorithms. Despite this focus, limited attention has been given to the challenging problem of developing solutions for semi-supervised settings, where algorithms have access to a mix of labeled and unlabeled data. This research proposes a novel algorithm for Semi-supervised Learning for Marked Temporal Point Processes (SSL-MTPP) applicable in such scenarios. The proposed SSL-MTPP algorithm utilizes a combination of labeled and unlabeled data to learn a robust marker prediction model, using an RNN-based Encoder-Decoder module to learn effective representations of the time sequence. The efficacy of the proposed algorithm is demonstrated via multiple protocols on the Retweet dataset, where SSL-MTPP achieves improved performance in comparison to the traditional supervised learning approach.

Authors: Shivshankar Reddy, Anand Vir Singh Chauhan, Maneet Singh and Karamjit Singh

 

 

Machine learning

Table Structure Recognition using CoDec Encoder-Decoder (2021)

Venue: Workshop on Machine Learning in conjunction with the 16th International Conference on Document Analysis and Retrieval (ICDAR), 2021‎

Automated document analysis and parsing has been a focus of research for a long time. An important component of document parsing is understanding tabular regions: identifying their structure, followed by precise information extraction. While substantial effort has gone into table detection and information extraction from documents, table structure recognition remains a long-standing task demanding dedicated attention. Identifying the table structure enables the extraction of structured information from tabular regions, which can then be utilized in further applications. To this end, this research proposes a novel table structure recognition pipeline consisting of row identification and column identification modules. The column identification module utilizes a novel Column Detector Encoder-Decoder model (termed the CoDec Encoder-Decoder), trained via a novel loss function to predict the column mask for a given input image. Experiments analyze the different components of the proposed pipeline, supporting their inclusion for enhanced performance. The proposed pipeline is evaluated on the challenging ICDAR 2013 table structure recognition dataset, where it demonstrates state-of-the-art performance.


Authors: Bhanupriya Pegu, Maneet Singh, Aakash Agarwal, Aniruddha Mitra and Karamjit Singh

 

 

Fraud Detection

Intent2Vec: Representation Learning of Cardholder and Merchant intent from Temporal Interaction Sequences for Fraud Detection‎

Venue: Workshop on Applied Semantics Extraction and Analytics in conjunction with the 30th International Joint Conference on Artificial Intelligence (IJCAI), 2021‎

Fraud detection has been a challenging problem for financial institutions, as fraud causes a loss of $24.2 billion per annum globally. This paper focuses on transaction fraud, the most prevalent type of fraud in the payment industry. The ability to detect and decline potentially fraudulent transactions in real time is crucial to guarantee a robust and secure environment for both cardholders and merchants. Conventional fraud detection techniques predominantly use rule-based methods or extensive manual feature engineering for machine learning models. These fraud models rely on detecting anomalies in the attributes of a transaction; however, they fail to capture any interaction between the cardholder and merchant involved in a transaction. The proposed approach, Intent2Vec, extends the capability of traditional fraud models by learning representations of payment entities using NLP approaches to semantically capture the intent behind a transaction. The modelled intent enables us to predict the next set of plausible merchants for a card, and vice versa. Any deviation between the predicted and observed card or merchant can point towards potential fraud. We test the relevance of intent-based semantics on the downstream task of fraud detection, where classifiers utilizing the entities' learnt intent outperform other baseline algorithms on metrics such as AUC-PR and F1 score.
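A minimal sketch of how learnt intent can flag a deviation: score a candidate merchant against the mean embedding of a card's recent merchants. The embeddings and merchant names below are hypothetical toys; the paper learns real entity representations with NLP methods:

```python
import numpy as np

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def intent_score(history, candidate, emb):
    """Cosine similarity between a candidate merchant's embedding and the
    mean embedding ("intent") of the card's recently visited merchants.
    A low score can flag a potentially fraudulent transaction."""
    intent = np.mean([emb[m] for m in history], axis=0)
    return cosine(intent, emb[candidate])

# hypothetical 2-d embeddings for three merchants
emb = {
    "grocery_a": np.array([1.0, 0.1]),
    "grocery_b": np.array([0.9, 0.2]),
    "casino_x":  np.array([-0.8, 1.0]),
}
usual = intent_score(["grocery_a", "grocery_b"], "grocery_a", emb)
odd = intent_score(["grocery_a", "grocery_b"], "casino_x", emb)
# the in-pattern merchant scores near 1; the outlier scores negative
```

In practice such a score would be one feature feeding a downstream fraud classifier rather than a decision rule on its own.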

Authors: Nitish Kumar, Shinyjohn Shaju, Kanishka Kayathwal, Deepak Chaurasiya, Kushagra Agarwal, Alok Singh, Siddhartha Asthana and Ankur Arora

 

 

Trustworthy AI

Transitioning from Real to Synthetic Data: Quantifying the Bias in Model

Venue: Workshop on Synthetic Data Generation: Quality, Privacy, Bias in conjunction with the International Conference on Learning Representations (ICLR), 2021

With the advent of generative modeling techniques, synthetic data has penetrated various domains, from unstructured data such as images and text to structured datasets modeling healthcare outcomes, risk decisioning in the financial domain, and many more. It overcomes challenges such as limited training data, class imbalance, and restricted access to datasets owing to privacy issues. To ensure that models used for automated decisioning make fair decisions, prior work exists to quantify and mitigate these issues. This study aims to establish the trade-off between bias and fairness in models trained on synthetic data. Variants of synthetic data generation techniques, including differentially private generation schemes, were studied to understand bias amplification. Through experiments on a tabular dataset, we demonstrate that models trained on synthetic data exhibit varying levels of bias impact. Techniques generating less correlated features perform well on fairness metrics, with 94%, 82%, and 88% relative drops in DPD (demographic parity difference), EoD (equality of odds) and EoP (equality of opportunity) respectively, and a 24% relative improvement in DPR (demographic parity ratio) with respect to the real dataset. We believe the outcome of our research will help data science practitioners understand the bias inherent in the use of synthetic data.
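The fairness metrics mentioned (DPD and DPR) reduce to simple comparisons of group-wise positive-prediction rates; a minimal sketch, assuming binary predictions and a binary protected attribute:

```python
import numpy as np

def demographic_parity(y_pred, group):
    """Demographic parity difference (DPD) and ratio (DPR): the absolute
    gap and the ratio of positive-prediction rates between two groups.
    DPD near 0 and DPR near 1 indicate a fairer model."""
    rate0 = y_pred[group == 0].mean()
    rate1 = y_pred[group == 1].mean()
    return abs(rate0 - rate1), min(rate0, rate1) / max(rate0, rate1)

# hypothetical predictions for eight individuals, four per group
y_pred = np.array([1, 1, 0, 1, 0, 0, 1, 0])
group = np.array([0, 0, 0, 0, 1, 1, 1, 1])
dpd, dpr = demographic_parity(y_pred, group)
# group 0 positive rate 0.75 vs. group 1 rate 0.25 → DPD 0.5, DPR 1/3
```

Equality of odds and equality of opportunity are computed analogously, but on group-wise true-positive (and false-positive) rates rather than raw prediction rates.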


Authors: Aman Gupta, Deepak Bhatt and Anubha Pandey

 

 

Others

Server-Language Processing: A Semi-Supervised approach to Server Failure Detection

Venue: International Conference on Information Technology and Cloud Computing (ITCC) in conjunction with the International Conference on Computing, Networks and Internet of Things (CNIOT), 2021

As industrial systems continue to grow in scale and complexity, an effective and proactive failure management approach helps mitigate the impact of server failure. While supervised methods fail to perform well on real-world servers due to label noise in log data and their inability to detect unseen failures, unsupervised techniques are often too naive to differentiate between complex log structures. We propose an NLP-based semi-supervised solution that learns a rich understanding of healthy and failure log patterns using an ensemble of deep learning based density and sequential models. Our hypothesis is that server logs follow a language of their own, which we attempt to decipher through Server-Language Processing. Experimental evaluations on real-world log data show that our proposed solution outperforms other existing log-based anomaly detection methods in real-world applications. The solution was deployed on 3,000 servers over 6 months of log data and was able to pick up server failures up to 2 weeks in advance without raising an excess of false alarms.
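The density-plus-sequential ensemble idea can be sketched on toy log templates: rare templates and unlikely template-to-template transitions both raise the anomaly score. The count-based scores below are an illustrative stand-in for the paper's deep learning models:

```python
from collections import Counter
import numpy as np

def ensemble_anomaly_scores(templates, alpha=0.5):
    """Blend a density score (rare log templates are suspicious) with a
    sequential score (unlikely transitions are suspicious); `alpha`
    weights the two members of the ensemble."""
    counts = Counter(templates)
    n = len(templates)
    density = {t: -np.log(c / n) for t, c in counts.items()}

    trans = Counter(zip(templates, templates[1:]))
    def seq_score(prev, cur):  # add-one smoothed transition surprisal
        return -np.log((trans[(prev, cur)] + 1) / (counts[prev] + len(counts)))

    return [alpha * density[templates[i]]
            + (1 - alpha) * seq_score(templates[i - 1], templates[i])
            for i in range(1, len(templates))]

# a hypothetical log stream: one rare "disk_error" among heartbeats
logs = ["heartbeat"] * 20 + ["disk_error"] + ["heartbeat"] * 5
scores = ensemble_anomaly_scores(logs)
# the rare "disk_error" entry receives by far the highest score
```

A deployed system would replace both members with learned models (e.g. a density estimator and a sequence model over log embeddings) while keeping this blending structure.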

Authors: Sonali Syngal, Sangam Verma, Kandukuri Karthik, Yatin Katyal and Soumyadeep Ghosh

 

 

Others

Pandemic Spread Prediction and Healthcare Preparedness Through Financial and Mobility Data

Venue: Workshop on AI for Public Health in conjunction with the International Conference on Learning Representations (ICLR), 2021

Pandemics like Coronavirus disease 2019 (COVID-19) require governments and health professionals to make time-sensitive, critical decisions about travel restrictions and resource allocation. This paper identifies various factors that affect the spread of the disease using transaction data and proposes a model to predict the degree of spread, and hence the number of medical resources required, in upcoming weeks. We perform a region-wise analysis of these factors to identify control measures that affect the smallest portion of the population. Our model also helps estimate surges in clinical demand and identify when medical resources will become saturated. Using this estimate, we suggest preventive as well as corrective measures to avoid critical situations.

Authors: Nidhi Mulay, Vikas Bishnoi, Himanshi Charotia, Siddhartha Asthana, Gaurav Dhama and Ankur Arora

Time Series Modelling

Deep Learning based Time Series Forecasting (2020)‎

Venue: 19th IEEE International Conference on Machine Learning and Applications (ICMLA), 2020‎

For decision-makers in the forecasting sector, decision processes such as facility planning and optimal day-to-day operation within the domain are complex, with several different levels to be considered. These decisions address widely different time horizons and aspects of the system, making it difficult to model. The advent of deep learning in forecasting removed the need for expensive hand-crafted features and deep domain knowledge. This work aims to give structure to the existing literature on time-series forecasting with deep learning. Based on the underlying structure of each technique, such as RNN, CNN, and Transformer, we categorize various deep learning based time series forecasting techniques and provide a consolidated report. Additionally, we perform experiments to compare these techniques on 4 different publicly available datasets. Finally, based on these experiments, we provide intuitive reasoning behind their performance. We believe this work will help researchers choose relevant techniques for future research.

Authors: Kushagra Agarwal, Lalasa Dheekollu, Gaurav Dhama, Ankur Arora, Siddhartha Asthana and Tanmoy Bhowmik

 

 

Machine learning

Deep Learning Algorithm to Rank-Order Resumes using Discriminative Embedding Space Session Track (2020)‎

Venue: Grace Hopper Celebration India (GHCI), 2020‎‎

Authors: Sonali Syngal and Debasmita Das

 

 

Others

Information Retrieval and Extraction on COVID-19 Clinical Articles using Graph Community Detection and BIO-Bert Embeddings (2020)

Venue: Workshop on NLP for COVID-19 in conjunction with the 58th Annual Meeting of the Association for Computational Linguistics, 2020‎

In this paper, we present an information retrieval system on a corpus of scientific articles related to COVID-19. We build a similarity network on the articles where similarity is determined via shared citations and biological domain-specific sentence embeddings. Ego-splitting community detection on the article network is employed to cluster the articles and then the queries are matched with the clusters. Extractive summarization using BERT and PageRank methods is used to provide responses to the query. We also provide a Question-Answer bot on a small set of intents to demonstrate the efficacy of our model for an information extraction module.
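The PageRank step of the extractive summarizer can be sketched with a plain power iteration over a sentence-similarity matrix. The toy matrix below is a hypothetical stand-in for the BIO-BERT embedding similarities the system uses:

```python
import numpy as np

def pagerank_summarize(sim, k=2, d=0.85, iters=100):
    """Rank sentences by PageRank over a nonnegative similarity matrix
    and return the indices of the top-k most central sentences."""
    n = len(sim)
    M = sim / sim.sum(axis=0, keepdims=True)  # column-stochastic transitions
    r = np.full(n, 1.0 / n)
    for _ in range(iters):                    # damped power iteration
        r = (1 - d) / n + d * M @ r
    return np.argsort(r)[::-1][:k]

# toy similarities for 4 sentences; sentence 0 is similar to all others
sim = np.array([
    [0.0, 0.9, 0.8, 0.7],
    [0.9, 0.0, 0.1, 0.1],
    [0.8, 0.1, 0.0, 0.1],
    [0.7, 0.1, 0.1, 0.0],
], dtype=float)
top = pagerank_summarize(sim)
# sentence 0, the most central one, ranks first
```

The selected sentences form the extractive summary returned for a matched query cluster.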


Authors: Debasmita Das, Yatin Katyal, Janu Verma, Rajesh Kumar Ranjan, Shashank Dubey, Aakash Deep Singh, Sourojit Bhaduri and Kushagra Agarwal

 

 

Machine learning

Word and Graph Embeddings for COVID-19 Retweet Prediction (2020)

Venue: AnalytiCup Workshop in conjunction with the 29th ACM International Conference on Information and Knowledge Management (CIKM), 2020

In this paper, we present our solution to the COVID-19 retweet prediction challenge. The proposed approach consists of feature engineering and modeling. For feature engineering, we leverage both hand-crafted and unsupervised learning features. As the provided data set is large, we implement auto-encoding algorithms to reduce feature dimensionality. To develop predictive models, we utilize ensemble learning and deep learning algorithms, and then combine these models to generate the final blended model. Moreover, to stabilize the predictions, we apply bagging as well as down-sampling techniques that remove tweets where the number of retweets equals zero. Our solution ranked first on the public test set and second on the private test set.
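The blending step reduces to a convex combination of base-model outputs; a minimal sketch with hypothetical predictions from two base models:

```python
import numpy as np

def blend(preds, weights):
    """Blend predictions from several base models (e.g. gradient boosting
    and deep nets) with a convex weight combination; weights are
    normalized so they sum to one."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    return np.tensordot(weights, np.asarray(preds), axes=1)

gbm = np.array([10.0, 0.0, 3.0])  # hypothetical retweet-count predictions
dnn = np.array([12.0, 1.0, 2.0])
final = blend([gbm, dnn], [0.6, 0.4])
# → [10.8, 0.4, 2.6]
```

In competition pipelines the weights are typically chosen on a hold-out fold rather than fixed by hand as here.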

Authors: Tam T. Nguyen, Karamjit Singh, Sangam Verma, Hardik Wadhwa, Siddharth Vimal, Lalasa Dheekollu, Sheng Jie Lui, Divyansh Gupta, Dong Yang Jin and Zha Wei

 

 

Fraud Detection

Limitations and Applicability of GANs in Banking Domain‎

Venue: Workshop on Applied Deep Generative Networks in conjunction with the 24th European Conference on Artificial Intelligence (ECAI), 2020 

Threats due to payment-related fraud are always a primary concern for financial institutions (FIs), often leading to huge losses and impacting the consumer experience. To combat emerging fraud and improve system robustness, FIs need an efficient system to detect fraud while authorizing payments. The biggest challenge in developing a fraud detection system is the high degree of class imbalance between fraudulent and legitimate transactions. Recently, Generative Adversarial Networks (GANs) have been employed as an oversampling technique to augment the dataset with synthetic minority samples. In this paper, we present a systematic study of training GANs for synthetic fraud generation, demonstrating improved classifier performance in detecting fraud. GANs are trained in various settings, including the min-max objective alone and with an auxiliary loss discriminating synthetic fraud and real fraud from non-fraud samples; the auxiliary loss is obtained using contrastive loss or triplet loss. The quality of the trained GANs is estimated by evaluating the lift in classifier performance when trained on a dataset augmented with synthetic fraud. Further, we study the effect of Discriminator Rejection Sampling (DRS) on the selection of synthetic samples used for training data augmentation. The performance comparison of the different settings proposed in this study is evaluated on a publicly available credit-card dataset and shows an absolute improvement of up to 6% in recall and 3% in precision. We hope this paper advances the applicability of GANs with practical insight into the research done on this topic so far and opens doors to interesting future research directions.
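Discriminator Rejection Sampling keeps a synthetic sample with a probability derived from its discriminator score; a simplified sketch (sigmoid of the logit shifted by the batch maximum, not the exact DRS correction from the literature):

```python
import numpy as np

def drs_accept_prob(logits, gamma=0.0):
    """Simplified DRS-style acceptance probability: synthetic samples the
    discriminator scores as most realistic (high logit) are kept with the
    highest probability; `gamma` trades acceptance rate against quality."""
    return 1.0 / (1.0 + np.exp(-(logits - logits.max() - gamma)))

# hypothetical discriminator logits for three synthetic fraud samples
p = drs_accept_prob(np.array([3.0, -2.0, 2.5]))
# the best-scored sample gets probability 0.5 (sigmoid of 0); the
# low-scored sample is almost always rejected
```

Samples would then be kept with these probabilities before being added to the classifier's augmented training set.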



Authors: Anubha Pandey, Deepak Bhatt and Tanmoy Bhowmik
