About

Muhammad S. Abdo

PhD Student in Computational Linguistics at Indiana University Bloomington. Researcher in Arabic NLP, Natural Language Inference, and Mechanistic Interpretability.

I am a dual-major PhD student in Computational Linguistics and Middle Eastern Languages & Cultures at Indiana University Bloomington, with a minor in Computer Science. My research centers on Natural Language Inference (NLI) and extends to NLP pipelines designed for real-world applications in Arabic.

I have built and evaluated datasets and systems for ellipsis detection, dialect classification, sexism detection, and named entity recognition in Arabic financial news. I have also studied public discourse on Arabic Twitter and analyzed language use in depression narratives. More recently, I have expanded my focus to AI safety, explainability, and mechanistic interpretability, examining how transformer-based models internally represent linguistic features.

I developed Rasid, a 900M+ word Arabic Twitter corpus organized by year, month, and week, which enables longitudinal studies of Arabic language change. I also created RogueTeX, a web-based LaTeX editor with cloud compilation and storage, released publicly in early 2026.

quick_facts.sh
# Location
echo "Bloomington, Indiana, USA"

# PhD Major 1
echo "Computational Linguistics"

# PhD Major 2
echo "Middle Eastern Languages & Cultures"

# Minor
echo "Computer Science"

# Languages
echo "Arabic (Native)"
echo "English (Fluent)"
echo "Persian (Intermediate)"

Affiliations

🏛️ Dept. of Linguistics
Indiana University Bloomington
🏛️ Dept. of MELC
Indiana University Bloomington
🏛️ Hamilton Lugar School
Global & International Studies, IU

Academic Trajectory

Aug 2022 – Present
Dual-Major PhD — Computational Linguistics & Middle Eastern Languages & Cultures
Indiana University Bloomington · Minor in Computer Science
Research in Arabic NLP, NLI, and Mechanistic Interpretability. Advisors: Dr. Sandra Kübler (Linguistics) and Dr. Nader Morkus (MELC). Passed qualifying exams (MELC, High Pass, 2026).
Jan 2021
MA in English Linguistics
Ain Shams University (Al-Alsun), Cairo, Egypt
Thesis: "Analyzing Appraisal in Major and Bipolar Depression Patients' Narratives in Mental Health Forums: A Corpus-Based Study." Advisor: Dr. Nihal Nagi Sarhan.
Aug 2018
PGCE — Postgraduate Certificate in Education
Monofiya University, Egypt
Jun 2013
BA in English Language and Literature
Benha University, Egypt

Research Interests

01

Arabic NLP

Developing datasets, tools, and models for Arabic that cover dialect variation, morphology, and low-resource challenges. This work includes ellipsis corpus creation, NER for financial news, and dialect identification.

MSA · Dialects · Corpus

02

Natural Language Inference

Modeling entailment, contradiction, and pragmatic inference in Arabic texts. Focus on morphological and syntactic cues, discourse markers, and cross-lingual label transfer.

Entailment · Pragmatics · Discourse

03

Mechanistic Interpretability

Probing transformer circuits to understand how speech and language models encode linguistic nativeness, syntactic structure, and semantic features. Committed to building interpretable, trustworthy AI.

AI Safety · Probing · Speech

04

Computational Pragmatics

Exploring appraisal theory, discourse analysis, and sentiment in clinical narratives, social media, and religious texts. Applying NLP to socially sensitive domains like mental health and Islamic discourse.

Appraisal · Sentiment · Social Media

Technical Expertise

NLP & ML
Transformers (HuggingFace) · PyTorch · scikit-learn · LSTM · RAG · Few-Shot Prompting · Fine-tuning

NLP Tasks
Dialect ID · NER · Ellipsis Recovery · Text Classification · Sentiment Analysis · NLI · Information Extraction

Programming
Python · Pandas · Unix / Bash · Jupyter · JavaScript · LaTeX

Data & Annotation
Corpus Design · Annotation Schemes · CQL (Corpus Query Language) · IAA Evaluation

Corpus Tools
LIWC · Stylo · ConceptNet · ArTenTen Corpus · Sketch Engine

Evaluation
F1 / MCC · Human-in-the-Loop · Bias & Fairness · Annotation Quality