FrontierLab Publications

2025

RAIDEN Benchmark: Evaluating Role-playing Conversational Agents with Measurement-Driven Custom Dialogues

Wu et al. | COLING 2025

[PDF] [Code] [Model]
