Usman Naseem | Macquarie University

LLM Alignment should go beyond Harmlessness–Helpfulness and incorporate Human Agency

Usman Naseem, Tanmoy Chakraborty, Kai-Wei Chang, Mark Dras, Preslav Nakov, Nanyun Peng & Soujanya Poria

Cognitive Computation, 2026

Do Large Language Models Reflect Demographic Pluralism in Safety?

Usman Naseem, Gautam Siddharth Kashyap, Sushant Kumar Ray, Rafiq Ali, Ebad Shabbir, Abdullah Mohammad

EACL 2026

Are Aligned Large Language Models Still Misaligned?

Usman Naseem, Gautam Siddharth Kashyap, Rafiq Ali, Ebad Shabbir, Sushant Kumar Ray, Abdullah Mohammad

arXiv:2602.11305 (2026)

Can Large Language Models Make Everyone Happy?

Usman Naseem, Gautam Siddharth Kashyap, Ebad Shabbir, Sushant Kumar Ray, Abdullah Mohammad, Rafiq Ali

arXiv:2602.11091 (2026)

A Survey of Progress in LLM Alignment from the Perspective of Reward Design

Miaomiao Ji, Yanqiu Wu, Zhibin Wu, Shoujin Wang, Jian Yang, Mark Dras, Usman Naseem

IEEE Transactions on Artificial Intelligence, 2026

Mechanistic Interpretability for Large Language Model Alignment: Progress, Challenges, and Future Directions

Usman Naseem

Preprint, 2026

Activation-Space Personality Steering: Hybrid Layer Selection for Stable Trait Control in LLMs

Pranav Bhandari, Nicolas Fay, Sanjeevan Selvaganapathy, Amitava Datta, Usman Naseem, Mehwish Nasim

EACL 2026

Beyond the Black Box: Demystifying Multi-Turn LLM Reasoning with VISTA

Yiran Zhang, Ming Lin, Mark Dras, Usman Naseem

AAAI 2026 (Demo)

VISPA: Pluralistic Alignment via Automatic Value Selection and Activation

Shenyan Zheng, Jiayou Zhong, Anudeex Shetty, Heng Ji, Preslav Nakov, Usman Naseem

arXiv:2601.12758 (2026)

CogMem: A Cognitive Memory Architecture for Sustained Multi-Turn Reasoning in Large Language Models

Yiran Zhang, J Hu, Mark Dras, Usman Naseem

arXiv:2512.14118 (2025)

From Native Memes to Global Moderation: Cross-Cultural Evaluation of Vision-Language Models for Hateful Meme Detection

Mo Wang, Kaixuan Ren, Pratik Jalan, Ahmed Ashraf, Tuong Vy Vu, Rahul Seetharaman, Shah Nawaz, Usman Naseem

The Web Conference (WebConf) 2026

Robust Harmful Meme Detection under Missing Modalities via Shared Representation Learning

Felix Breiteneder, Mohammad Belal, Muhammad Saad Saeed, Shahed Masoudian, Usman Naseem, Juhi Kulshrestha, Markus Schedl, Shah Nawaz

The Web Conference (WebConf) 2026

Revealing the Truth with ConLLM for Detecting Multi-Modal Deepfakes

Gautam Siddharth Kashyap, H Joshi, N Jain, Ebad Shabbir, J Gao, N Joshi, Usman Naseem

EACL 2026

Health-ORSC-Bench: A Benchmark for Measuring Over-Refusal and Safety Completion in Health Context

Z Zhang, L Huang, G Wu, Preslav Nakov, H Ji, Usman Naseem

arXiv:2601.17642 (2026)

Selected Publications