An Evaluation of Chinese Human-Computer Dialogue Technology

Published • Version 1 • Vol 3 (2): 274-286, 2021

Zixian Feng, Caihai Zhu, Weinan Zhang, Zhigang Chen, Wanxiang Che, Minlie Huang, Linlin Li

DOI: 10.1162/dint_a_00090

2020-11-18

2021-01-27

2021-02-03

Abstract & Keywords

Abstract: There is growing interest in developing human-computer dialogue systems, an important branch of artificial intelligence (AI). However, evaluating large-scale Chinese human-computer dialogue remains a challenging task. To draw more attention to dialogue evaluation, we held the fourth Evaluation of Chinese Human-Computer Dialogue Technology (ECDT). It consists of two tasks: few-shot learning in spoken language understanding (SLU) (Task 1) and a knowledge-driven multi-turn dialogue competition (Task 2), with data sets provided by Harbin Institute of Technology and Tsinghua University. In this paper, we introduce the evaluation tasks and data sets in detail, analyze the evaluation results, and discuss the existing problems in the evaluation.

Keywords: Chinese human-computer dialogue evaluation; Evaluation data; Few-shot learning; Knowledge-driven multi-turn dialogue

Acknowledgements

We would like to thank the Social Media Processing committee of the Chinese Information Processing Society of China (CIPS-SMP) for its strong support for this evaluation. We thank Huawei Technologies Co., Ltd. for providing financial support, iFLYTEK Co., Ltd. for providing data and evaluation support, and Kaiyan Zhang and Jiale Zhang for their indispensable support during the evaluation. This work is supported by the National Natural Science Foundation of China (No. 62076081, No. 61772153 and No. 61936010).

Article and author information

Cite As

Feng, Z.X., et al.: An evaluation of Chinese human-computer dialogue technology. Data Intelligence 3(2), 274-286 (2021). doi: 10.1162/dint_a_00090

Zixian Feng

This work was a collaboration between all of the authors. Z.X. Feng (zxfeng@ir.hit.edu.cn) summarized the data sets and results of SMP2020-ECDT and drafted the paper. All the authors have made meaningful and valuable contributions in revising and proofreading the resulting manuscript.

Zixian Feng is a postgraduate in the Research Center for Social Computing and Information Retrieval, School of Computer Science and Technology, Harbin Institute of Technology. Her current research interests are mainly in human-computer dialogue system evaluation.

Caihai Zhu

This work was a collaboration between all of the authors. C.H. Zhu (chzhu@ir.hit.edu.cn) drew the whole picture of the evaluation.

Caihai Zhu is a postgraduate in the Research Center for Social Computing and Information Retrieval, School of Computer Science and Technology, Harbin Institute of Technology. His current research interests are mainly in conversational recommendation.

Weinan Zhang

This work was a collaboration between all of the authors. W.N. Zhang (wnzhang@ir.hit.edu.cn) is the leader of 2020-ECDT. W.X. Che (car@ir.hit.edu.cn), Z.G. Chen (zgchen@iflytek.com), M.L. Huang (aihuang@tsinghua.edu.cn), and L.L. Li (lilinlin@huawei.com) guided the evaluation process and summarized the conclusion part of this paper. All the authors have made meaningful and valuable contributions in revising and proofreading the resulting manuscript.

Weinan Zhang is an associate professor in the Research Center for Social Computing and Information Retrieval, School of Computer Science and Technology, Harbin Institute of Technology. His research interests include human-computer dialogue, natural language processing and information retrieval.

Zhigang Chen

Zhigang Chen joined iFLYTEK Corporation in 2003 and is currently Vice President of the AI Research Institute of iFLYTEK Corporation. He is mainly responsible for cognitive intelligence research and productization.

Wanxiang Che

Minlie Huang

Minlie Huang is an associate professor at the Department of Computer Science and Technology, Tsinghua University. His research interests include artificial intelligence, deep learning, reinforcement learning, and natural language processing.

Linlin Li

Linlin Li is the leader of Intelligent Voice Assistant, Huawei Consumer BG. She is in charge of Huawei Xiaoyi's voice assistant business. Her main work involves the collaboration and mutual assistance of pan-terminal equipment for voice services, multi-language internationalization, multi-modal semantic understanding and end-to-end intelligence.

Publication records

Published: July 7, 2021 (Version 1)