Red Teaming Language Models for Processing Contradictory Dialogues
Published in EMNLP 2024
Xiaofei Wen, Bangzheng Li, Tenghao Huang, and Muhao Chen. 2024. Red Teaming Language Models for Processing Contradictory Dialogues. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 11611–11630, Miami, Florida, USA. Association for Computational Linguistics. [paper] [code]