Overview
HeadRouter is a training-free audio token pruning method for large audio language models. It exploits task-dependent attention-head behavior, routing token importance scoring toward semantic, acoustic, or mixed head-weight profiles.
Method
HeadRouter combines position-bias-reduced text-to-audio probing with dynamic head-weight routing. The router softly mixes task profiles to score and retain the most informative audio tokens for each input.
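The routing and scoring steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the softmax mixing rule, the temperature parameter, and the toy numbers are all assumptions made for illustration. The sketch takes the probed text-to-audio attention as given and does not attempt the position-bias reduction, whose details are not stated here.

```python
import math

def softmax(xs, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp((x - m) / temperature) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route_head_weights(router_logits, profiles, temperature=1.0):
    """Softly mix per-task head-weight profiles into one weight per head.

    router_logits: one score per task profile (hypothetical router output).
    profiles[p][h]: weight of head h under task profile p (hypothetical).
    """
    mix = softmax(router_logits, temperature)
    num_heads = len(profiles[0])
    return [sum(m * p[h] for m, p in zip(mix, profiles)) for h in range(num_heads)]

def score_tokens(attn, head_weights):
    """attn[h][t]: text-to-audio attention from head h onto audio token t."""
    num_tokens = len(attn[0])
    return [sum(w * row[t] for w, row in zip(head_weights, attn))
            for t in range(num_tokens)]

def keep_top_tokens(scores, keep_ratio):
    """Return the indices of the highest-scoring tokens, in original order."""
    k = max(1, math.ceil(keep_ratio * len(scores)))
    top = sorted(range(len(scores)), key=lambda t: scores[t], reverse=True)[:k]
    return sorted(top)

# Toy example: two heads, four audio tokens, two task profiles.
profiles = [[1.0, 0.0],   # assumed "semantic" profile: trust head 0
            [0.0, 1.0]]   # assumed "acoustic" profile: trust head 1
attn = [[0.1, 0.6, 0.2, 0.1],
        [0.5, 0.1, 0.1, 0.3]]

semantic_keep = keep_top_tokens(
    score_tokens(attn, route_head_weights([2.0, 0.0], profiles)), keep_ratio=0.5)
acoustic_keep = keep_top_tokens(
    score_tokens(attn, route_head_weights([0.0, 2.0], profiles)), keep_ratio=0.5)
```

In this toy setup the two router outputs retain different tokens from the same attention maps, which is the behavior a single fixed head profile cannot provide.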
Head Behavior Analysis
Representative visualizations show why one fixed head profile is insufficient: semantic and acoustic tasks exhibit different selectivity patterns and separable head-behavior clusters.
Selectivity heatmap
Head selectivity is more diffuse for semantic tasks, while acoustic tasks concentrate on smaller groups of highly selective heads.
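One way to make "selectivity" concrete is shown below. The page does not define the measure behind the heatmap, so this is only an assumed stand-in: one minus the normalized entropy of a head's attention over audio tokens, which is 0 for uniform attention and 1 when all attention falls on a single token. The function name is hypothetical.

```python
import math

def head_selectivity(attn_row):
    """Assumed selectivity measure: 1 - normalized entropy of one head's
    attention over audio tokens (0 = uniform, 1 = single-token focus)."""
    if len(attn_row) < 2:
        raise ValueError("need at least two audio tokens")
    total = sum(attn_row)
    if total <= 0:
        raise ValueError("attention row must have positive mass")
    probs = [a / total for a in attn_row]
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return 1.0 - entropy / math.log(len(probs))

diffuse = head_selectivity([0.25, 0.25, 0.25, 0.25])    # uniform attention
selective = head_selectivity([0.97, 0.01, 0.01, 0.01])  # sharply peaked attention
```

Computing such a value per head and per task would give a heads-by-tasks grid of the kind a selectivity heatmap displays.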
t-SNE of head behavior
Per-sample head-behavior vectors form structured task clusters, supporting input-adaptive routing.
Results
HeadRouter improves the trade-off between compression and task performance by preserving task-relevant audio tokens more consistently than fixed or task-agnostic pruning strategies.
Figures: Motivation and oracle comparison; Local comparison; Efficiency; Ablation.
Citation
Citation information will be added after release.
@article{headrouter2026,
title = {HeadRouter: Dynamic Head-Weight Routing for Task-Adaptive Audio Token Pruning in Large Audio Language Models},
author = {He, Peize and others},
year = {2026}
}