Dr. Sarah Chen
Associate Professor of Computer Science
Stanford University
My research focuses on natural language processing, machine learning, and the intersection of language understanding with knowledge representation.
Cross-Lingual Semantic Parsing with Minimal Supervision
Published in EMNLP, 2025 — Oral Presentation
Sarah Chen, Kwame Okafor, Tobias Müller
Abstract
Semantic parsing—the task of mapping natural language utterances to formal meaning representations—has seen dramatic improvements in English thanks to large pre-trained language models. However, extending these advances to the world's other 7,000+ languages remains a formidable challenge. Most languages lack the annotated training data required for supervised approaches, and even multilingual pre-trained models exhibit significant performance gaps on low-resource languages.
We present XSP-Transfer, a cross-lingual transfer method for semantic parsing that requires only 50 annotated examples in the target language. Our approach combines three techniques: (1) a language-agnostic meaning representation alignment objective that maps utterances from different languages into a shared semantic space, (2) a structure-aware code-switching augmentation strategy that generates synthetic training data by swapping aligned phrases between high- and low-resource languages, and (3) a confidence-based self-training loop that iteratively expands the target-language training set with high-confidence model predictions.
We evaluate XSP-Transfer on the Mschema2QA and MTOP benchmarks across 10 typologically diverse languages. With only 50 target-language examples, our method achieves 85% of the fully supervised performance on average, and outperforms the previous best few-shot method by 14 points in exact-match accuracy. Ablation studies reveal that the alignment objective and code-switching augmentation contribute roughly equally to the gains, while self-training provides an additional 3–5 point improvement.
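As an illustration of the third technique in the abstract, the confidence-based self-training loop, the sketch below shows the general pattern of growing a small seed set with high-confidence predictions. It is not the paper's implementation: the function name `self_train`, the `train` callback, the 0.9 threshold, and the round limit are all assumptions made for this example, and the abstract does not specify how confidence is computed or which stopping rule is used.

```python
from typing import Callable, List, Tuple

# An example pairs an utterance with its meaning representation.
Example = Tuple[str, str]
# A predictor maps an utterance to (predicted parse, confidence in [0, 1]).
Predictor = Callable[[str], Tuple[str, float]]


def self_train(
    seed: List[Example],
    unlabeled: List[str],
    train: Callable[[List[Example]], Predictor],  # hypothetical training routine
    threshold: float = 0.9,  # assumed cutoff, not taken from the paper
    max_rounds: int = 5,     # assumed round limit, not taken from the paper
) -> List[Example]:
    """Iteratively add high-confidence predictions to the training set."""
    labeled = list(seed)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        if not pool:
            break
        # Retrain on everything labeled so far (seed plus pseudo-labels).
        predict = train(labeled)
        remaining = []
        added = 0
        for utterance in pool:
            parse, confidence = predict(utterance)
            if confidence >= threshold:
                # Accept the prediction as a pseudo-label.
                labeled.append((utterance, parse))
                added += 1
            else:
                # Keep the utterance for a later round.
                remaining.append(utterance)
        pool = remaining
        if added == 0:
            # No prediction cleared the threshold, so further rounds cannot help.
            break
    return labeled
```

In this sketch, `seed` would hold the 50 annotated target-language examples and `unlabeled` a pool of unannotated target-language utterances. Retraining from the full labeled set each round keeps the loop simple; a real system might instead fine-tune incrementally or re-score earlier pseudo-labels.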
Citation
S. Chen, K. Okafor, T. Müller. (2025). "Cross-Lingual Semantic Parsing with Minimal Supervision." In Proceedings of EMNLP 2025.