PhD Student in Computational Linguistics at Indiana University Bloomington. Researcher in Arabic NLP, Natural Language Inference, and Mechanistic Interpretability.
I am a dual-major PhD student in Computational Linguistics and Middle Eastern Languages & Cultures at Indiana University Bloomington, with a minor in Computer Science. My research centers on Natural Language Inference (NLI) and extends to NLP pipelines designed for real-world applications in Arabic.
I have built and evaluated datasets and systems for ellipsis detection, dialect classification, sexism detection, and named entity recognition in Arabic financial news. I have also studied public discourse on Arabic Twitter and analyzed language use in depression narratives. More recently, I have expanded my focus toward AI safety, explainability, and mechanistic interpretability, examining how transformer-based models internally represent linguistic features.
I developed Rasid, a 900M+ word Arabic Twitter corpus organized by year, month, and week to enable longitudinal studies of Arabic language change. I also created RogueTeX, a web-based LaTeX editor with cloud compilation and storage, released publicly in early 2026.
Developing datasets, tools, and models for Arabic that cover dialect variation, morphology, and low-resource settings. Work includes ellipsis corpus creation, NER for financial news, and dialect identification.
Modeling entailment, contradiction, and pragmatic inference in Arabic texts. Focus on morphological and syntactic cues, discourse markers, and cross-lingual label transfer.
Probing transformer circuits to understand how speech and language models encode linguistic nativeness, syntactic structure, and semantic features. Committed to building interpretable, trustworthy AI.
Exploring appraisal theory, discourse analysis, and sentiment in clinical narratives, social media, and religious texts. Applying NLP to socially sensitive domains such as mental health and Islamic discourse.