Narrative Sense Disambiguation with LLMs
Published:
Technologies: PyTorch, Hugging Face, GPT-4, LoRA
Description
- Pipeline Engineering: Built a graded word-sense plausibility pipeline by fine-tuning Transformer encoders (DeBERTa, RoBERTa) to quantify semantic ambiguity in narrative contexts.
- Synthetic Data Generation: Designed and implemented a GPT-4 synthetic data generation system with structured prompting; the ~800 ambiguity-rich samples it produced improved encoder rank correlation (Spearman’s ρ) by over 10%.
- LLM Benchmarking: Developed an automated LLM-as-a-Judge framework to benchmark Mistral-7B and Qwen-2.5 against human annotators, and applied LoRA fine-tuning to keep training feasible on resource-constrained hardware.
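
The rank-correlation metric named above (Spearman's ρ) can be illustrated with a minimal, dependency-free sketch. It is not the project's evaluation code: the function names and the example scores are hypothetical, and the project may well have used a library routine instead. The sketch ranks both score lists (tied values share the mean rank) and then takes the Pearson correlation of the ranks.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values equal to values[order[i]].
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(predicted, human):
    """Spearman's rho: Pearson correlation computed on the two rank vectors."""
    if len(predicted) != len(human) or len(predicted) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    rx, ry = average_ranks(predicted), average_ranks(human)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical plausibility scores, invented for illustration only.
    human_scores = [1.0, 2.5, 4.0, 5.0, 3.0]
    model_scores = [1.2, 2.0, 4.5, 4.1, 3.3]
    print(round(spearman_rho(model_scores, human_scores), 3))
```

Because ρ depends only on rank order, it rewards a model for ordering sense plausibility the way annotators do, even when its raw scores are on a different scale.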
