Projects

Engineering and applied work — research tooling, evaluation harnesses, and data-science pipelines. I'm building toward enterprise-grade AI applications; for now, here are projects I've shipped.

CNS-IU · Human Reference Atlas (HuBMAP)

HRA Cell Embeddings (Featured)

A visualization pipeline for single-cell transcriptomics embeddings from the Human Reference Atlas (HRA) and the HuBMAP consortium. It starts from raw UMAP projections and moves to tuned PCA + t-SNE combinations over ~453k cells, then isolates the top 24 cell types for clearer biological interpretation. The result turns massive single-cell datasets into legible maps.

Python · AnnData / h5ad · UMAP · t-SNE · PCA · Jupyter

muyuhuatang/llm_morality

LLM-Morality

Probing-based toolkit for analyzing moral reasoning trajectories in LLMs — linear probes for ethical-framework attribution and lightweight activation steering across reasoning steps.

Python · PyTorch · Probing · Steering

muyuhuatang/llm_stated_belief

LLM-Stated-Belief

Evaluation harness for LLM belief resistance under strategic persuasion, built on the Source–Message–Channel–Receiver (SMCR) framework across six models and three domains.

Python · LLM eval · Fine-tuning

muyuhuatang/ChatGPTRater

ChatGPTRater

Dataset and code studying how ChatGPT rates the quality of natural-language explanations compared with human assessments (LREC-COLING 2024).

Dataset · NLP · Evaluation

muyuhuatang/LLMsFT_LoRA

LLMs Fine-tuning · LoRA

A clean replication of efficient LLM fine-tuning with LoRA (low-rank adaptation) — a compact reference for parameter-efficient adaptation.

Python · LoRA · Fine-tuning

muyuhuatang/IsChatGPTBetter

Is ChatGPT Better — Dataset

Dataset supporting the study of ChatGPT vs. human annotators in explaining implicit hate speech (The Web Conference, WWW 2023).

Dataset · Hate speech · NLE