Hi! I'm Akshita.

Data Science, NLP, Explainable AI, and Theatre Enthusiast 🤖

About Me


I am a graduate student at Carnegie Mellon University’s School of Computer Science, pursuing a Master’s in Computational Data Science. My work centers on Natural Language Processing, Multimodal Machine Learning, and Retrieval-Augmented Generation (RAG), with a focus on building explainable, faithful, and culturally inclusive AI systems.

Most recently, I worked as an AI/ML Software Engineering Intern at Amazon Web Services, where I developed a self-correcting Generative AI application for verifying LLM responses and correcting them using Automated Reasoning in AWS Bedrock. Previously, as a Software Engineer at Cisco, I designed a proof-of-concept leveraging eBPF and machine learning to gain packet path insights in ASR9k routers.

At CMU, I am conducting research in Prof. Maarten Sap’s lab, studying Theory of Mind in LLMs and using chain-of-thought reasoning to improve faithfulness and commonsense reasoning on biased data. Alongside research, I serve as a Teaching Assistant for the Advanced NLP course under Prof. Sean Welleck, mentoring students on graduate-level assignments and projects.

During my undergraduate studies, I conducted research at Samsung Research on understanding intents in code-mixed languages, and at the Indian Institute of Science, where I was involved in the Spire Project, a large-scale multilingual corpus for Indic languages. During the pandemic, I also surveyed methods of Affect Recognition in Online Education that combine multiple modalities to understand student emotions.

Projects @ CMU


Search Engines, September 2025 - Present

Building a query- and ranking-based search engine as part of my coursework!

Theory of Mind for Explainable AI, January 2025 - Present

Started my research on simulatability metrics and on evaluating LLM faithfulness on biased data as part of my Capstone project!

Exploring the use of test-time scaling for Multimodal Reasoning Tasks, January 2025 - May 2025

Compared scaling methods such as chain-of-thought reasoning and synthetic data generation on multimodal datasets such as MathVista.

Multi-Service Cloud Infrastructure, January 2025 - May 2025

Created a distributed microservice architecture, optimized for performance-critical workloads using Vert.x-based asynchronous servers. Placed among the top 10 teams for cost optimization and performance!

Movie Recommender System, January 2025 - May 2025

Deployed a real-time SVD movie recommender with continuous retraining, deployment, and testing, monitoring of online and offline metrics, and provenance tracking, using Flask, Prometheus, Grafana, and MLflow.

Debiasing Large Language Models through Causal-Guided Active Learning, December 2024

Improved performance on novel evaluation metrics by 5% through prompt fine-tuning for position bias in the MT-Bench dataset.

Retrieval Augmented Generation (RAG) Model using LLMs, September 2025

Developed an end-to-end question-answering RAG model for Pittsburgh events on Llama 3.2 using LangChain and Ollama.

Publications


"Multimodality in Online Education: A Comparative Study." Multimedia Tools and Applications (2024)

  • Proposed a majority decision-level fusion model for multiple modalities - Facial Expressions, Posture, Speech, Eye-tracking.
  • Compiled a dataset of over 4k images for posture recognition, achieving an accuracy of 95.96% with a CNN and 93.7% with an SVM.
  • https://link.springer.com/article/10.1007/s11042-024-20540-0
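
As a rough illustration of what majority decision-level fusion means in general (this is not the paper's implementation), each modality's classifier outputs its own label and the fused prediction is the label chosen most often. The function name, the modality keys, and the labels "engaged" and "bored" below are hypothetical, and the tie-breaking rule is an arbitrary choice made for this sketch.

```python
from collections import Counter


def majority_fusion(predictions):
    """Fuse per-modality labels by majority vote.

    `predictions` maps a modality name (hypothetical keys such as "face"
    or "posture") to the label that modality's classifier produced.
    Ties go to the label that appears first in the mapping, which is an
    assumption of this sketch rather than anything stated in the paper.
    """
    if not predictions:
        raise ValueError("need at least one modality prediction")
    # Count how many modalities voted for each label.
    counts = Counter(predictions.values())
    # most_common(1) returns the top (label, count) pair; labels with
    # equal counts keep the order in which they were first seen.
    label, _ = counts.most_common(1)[0]
    return label


# Hypothetical example: three of four modalities agree.
votes = {
    "face": "engaged",
    "posture": "engaged",
    "speech": "bored",
    "eye_tracking": "engaged",
}
fused = majority_fusion(votes)
```

Here `fused` takes the label that most modalities agree on. A weighted vote based on each classifier's validation accuracy would be a natural variation, but the simple count keeps the idea visible.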

"Joint Intent Classification and Slot Tagging on Agricultural Dataset for Indic Languages." ICACCS (2023)

"Comparison of Perplexity Scores of Language Models for Telugu Data Corpus in the Agricultural Domain." ICICCS (2024)