Ivaxi Sheth

I am a PhD student at CISPA Helmholtz Center for Information Security under the supervision of Prof. Mario Fritz. My research interests fall under the broad area of Trustworthy Machine Learning, particularly generalization and explainability. I am currently exploring the intersection of causality and language models.

Previously, I was at Mila - Quebec AI Institute. I graduated with an MEng (Hons) in Electrical and Electronic Engineering from Imperial College London, where I focused on Machine Learning and Computer Vision under the supervision of Dr. Carlo Ciliberto. Before that, I worked as an AI Research Engineer at Imagination Technologies, UK, under Dr. Cagatay Dikici, on hardware acceleration of neural networks.

Email  /  Google Scholar  /  Twitter  /  LinkedIn

Publications
LLM Task Interference: An Initial Study on the Impact of Task-Switch in Conversational History
Akash Gupta, Ivaxi Sheth, Vyas Raina, Mark Gales, Mario Fritz.
Preprint
[ Paper, Code ]

Large Language Models (LLMs) can perform a wide range of tasks, but their performance can degrade when the task switches partway through a conversation. This work is the first to formalize the study of such vulnerabilities, revealing that both very large and small LLMs can be susceptible to performance degradation from task-switches.

Auxiliary Losses for Learning Generalizable Concept-based Models
Ivaxi Sheth, Samira Ebrahimi Kahou
NeurIPS 2023
[ Paper, Code ]

We propose a multi-task learning paradigm for Concept Bottleneck Models (CBMs) that introduces inductive bias into concept learning. Our proposed model, coop-CBM, improves downstream task accuracy over standard black-box models. Using the concept orthogonal loss, we introduce orthogonality among concepts during CBM training.

Transparent Anomaly Detection via Concept-based Explanations
Laya Rafiee Sevyeri*, Ivaxi Sheth*, Farhood Farahnak*, Samira Ebrahimi Kahou, Shirin Abbasinejad Enger
NeurIPS 2023, XAI in Action
[ Paper ]

We propose Transparent Anomaly Detection Concept Explanations (ACE). ACE provides human-interpretable explanations in the form of concepts alongside its anomaly predictions. Our proposed model achieves results higher than or comparable to those of uninterpretable black-box models.

Relational UNet for Image Segmentation
Ivaxi Sheth*, Pedro Braga*, Shivakanth Sujit*, Sahar Dastani, Samira Ebrahimi Kahou
International Workshop on Machine Learning in Medical Imaging 2023
[ Paper, Code ]

We propose RelationalUNet, which introduces relational feature transformation to the UNet architecture. RelationalUNet models the dynamics between the visual and depth dimensions of a 3D medical image by introducing Relational Self-Attention blocks in the skip connections.

Learning from uncertain concepts via test time interventions
Ivaxi Sheth, Aamer Abdul Rahman, Laya Rafiee Sevyeri, Mohammad Havaei, Samira Ebrahimi Kahou
NeurIPS 2022, Trustworthy and Socially Responsible Machine Learning Workshop
[ Paper ]

We propose an uncertainty-based strategy to select interventions in Concept Bottleneck Models during inference.

FHIST: A Benchmark for Few-shot Classification of Histological Images
Fereshteh Shakeri, Malik Boudiaf, Sina Mohammadi, Ivaxi Sheth, Mohammad Havaei, Ismail Ben Ayed, Samira Ebrahimi Kahou
In submission

Our benchmark builds few-shot tasks and base-training data with various tissue types, different levels of domain shift stemming from different cancer sites, and different class granularity levels, thereby reflecting realistic clinical settings. We evaluate the performance of state-of-the-art few-shot learning methods, initially designed for natural images, on our histology benchmark.

Three-stream network for enriched Action Recognition
Ivaxi Sheth
CVPRW '21

We propose a three-stream network for action recognition and detection, with each stream operating on a different input frame rate. We test our work on popular datasets such as Kinetics, UCF-101 and AVA. The results on the AVA dataset in particular show the effectiveness of using attention in each stream.

Patents
Hardware implementation of windowed operations in three or more dimensions
Ivaxi Sheth, Cagatay Dikici, Aria Ahmadi, James Imber
In submission
