Ph.D. Candidate · Informatics · Indiana University

Aligning humans and AI

I study human–AI alignment: probing and mitigating the biases and limitations of large language models so that they can be implemented more faithfully and interpreted reliably. My work spans natural language processing, large language models, and computational social science.

About

I am Fan Huang, a fourth-year Informatics Ph.D. candidate in Complex Networks and Systems at Indiana University Bloomington, advised by Prof. Jisun An and Prof. Haewoon Kwak. I also work with Prof. Yong-Yeol Ahn and Prof. Filippo Menczer on research projects.

My current projects investigate human–AI alignment problems, with the aim of mitigating LLMs' biases and limitations and enabling better implementations of AI models. I work across Natural Language Processing, Large Language Models, and Computational Social Science.

I am actively seeking Research Scientist, Post-Doc, Research Fellow, or Tenure-Track Faculty opportunities starting Spring/Fall 2027, around the world.

Education

Ph.D. in Informatics · Indiana University Bloomington · 2023–present
Ph.D. in Computer Science · Singapore Management University · 2022–2023
M.S. in Information Systems · Nanyang Technological University · 2020–2022
B.S. in Computer Science & Technology · Central South University · 2014–2018

Latest Publications

All 19 →

Simulating Hate Speech Cascades with Multi-LLM Agents · Preprint · 2026
ReFlect: A Harness System for Long-Horizon LLM Reasoning · Preprint · 2026
CogBias: Measuring & Mitigating Cognitive Bias in LLMs · Preprint · 2026

Featured project

CNS-IU · Human Reference Atlas (HuBMAP)

HRA Cell Embeddings

A single-cell transcriptomics visualization pipeline over ~453k cells — turning massive datasets into legible maps.

Research Directions

Keeping models aligned with people · Human–AI alignment
Looking inside the chain of thought · Reasoning & interpretability
Auditing and mitigating harm · Bias, safety & online harm
LLMs as instruments for studying society · Computational social science

News

June 2026

Delivered a talk, "LLM Moral Reasoning Trajectory Analysis," invited by Prof. Han Xue's lab, Donghua University, Shanghai.

April 2026

Delivered a talk, "AI-assisted Research," invited by Prof. Thai Le's lab, Indiana University Bloomington.

April 2026

Our paper "Understanding Moral Reasoning Trajectories in Large Language Models" was accepted at IC2S2 2026 (Parallel Talk track).

April 2026

Our paper "XChoice: Explainable Evaluation of AI-Human Alignment" was accepted at IC2S2 2026 (Poster track).

April 2026

Our paper "Vulnerability of LLMs' Stated Belief?" was accepted at ACL 2026 Findings.

March 2026

Presenting "Vulnerability of LLMs' Stated Belief?" at MSLD 2026 (UIUC).

Sept 2025

Our paper "Is DeepSeek a New Voice Among LLMs in Public Opinion Simulation" was accepted at ICDM 2025, Washington DC.

April 2025

Passed my Ph.D. Qualification Exam at IUB!

Featured Publications

Vulnerability of LLMs' Stated Belief? LLMs Belief Resistance Check Through Strategic Persuasive Conversation Interventions

Fan Huang, Haewoon Kwak, Jisun An

ACL Findings · 2026

Are LLMs persuaded out of their beliefs? Using the Source–Message–Channel–Receiver (SMCR) framework across six mainstream LLMs and three domains (facts, medical QA, social bias), we measure how stable each model's stated beliefs are under persuasive pressure. The smallest model flips on 82.5% of attempts at the first turn — and, counterintuitively, asking models to verbalize confidence makes them more vulnerable.

Beliefs & opinions · LLMs & responsible AI

Understanding Moral Reasoning Trajectories in Large Language Models: Toward Probing-Based Explainability

Fan Huang, Haewoon Kwak, Jisun An

IC2S2 · 2026

We introduce moral reasoning trajectories — sequences of ethical-framework invocations across intermediate reasoning steps — and analyze their dynamics across six models and three benchmarks. Reasoning switches frameworks 55–58% of the time; linear probes localize framework-specific encoding to model-specific layers, and lightweight activation steering modulates how models integrate them.

Reasoning & interpretability · LLMs & responsible AI

Is ChatGPT better than Human Annotators? Potential and Limitations of ChatGPT in Explaining Implicit Hate Speech

Fan Huang, Haewoon Kwak, Jisun An

The Web Conference (WWW) · 2023

Can ChatGPT explain why a post is implicitly hateful? On the LatentHatred dataset, ChatGPT reaches 80% agreement with human labels, and a user study shows its natural-language explanations can shift lay perceptions of hatefulness — revealing both the promise and the risks of LLM-generated explanations.

Online harm · NLP & generation

ChatGPT Rates Like Human: Towards a Better Alignment of Text Explanation Quality Assessments

Fan Huang, Haewoon Kwak, Kunwoo Park, Jisun An

LREC-COLING · 2024

Can ChatGPT judge the quality of natural-language explanations the way people do? Across three NLE datasets and 900 human annotations, ChatGPT aligns well with humans on coarse-grained scales, and paired comparisons plus dynamic prompting further improve agreement — charting where LLM-based evaluation can stand in for human raters.

NLP & generation · Human–AI alignment

Chain of Explanation: New Prompting Method to Generate Higher Quality Natural Language Explanation for Implicit Hate Speech

Fan Huang, Haewoon Kwak, Jisun An

The Web Conference (WWW) · 2023

A prompting method — Chain of Explanation (CoE) — that generates natural-language explanations for why a post is implicitly hateful, guided by heuristic words and the target group. On LatentHatred, CoE sharply improves explanation quality (BLEU 44.0 → 62.3), outperforming baseline generation across models and metrics.

Online harm · NLP & generation

Projects

Engineering & applied work

Research tooling, evaluation harnesses, and data-science pipelines — from a single-cell visualization pipeline for the Human Reference Atlas to LLM probing and belief-resistance toolkits. Building toward enterprise-grade AI applications.

HRA Cell Embeddings
LLM-Morality
LLM-Stated-Belief
ChatGPTRater
Datasets & benchmarks

Tech Package

Hands-on LLM engineering

Beyond the papers: a lab of techniques I've built end-to-end — parameter-efficient fine-tuning (LoRA / QLoRA) of Llama-2 & Llama-3.1, local model serving with Ollama, and running DeepSeek-R1 under 1.58-bit dynamic quantization.

LoRA / QLoRA
Llama-2 · Llama-3.1
Ollama serving
DeepSeek-R1 · 1.58-bit
llama.cpp
