Published Versions 1 Vol 3 (2) : 274-286 2021
An Evaluation of Chinese Human-Computer Dialogue Technology
: 2020 - 11 - 18
: 2021 - 01 - 27
: 2021 - 02 - 03
Abstract & Keywords
Abstract: There is growing interest in developing human-computer dialogue systems, an important branch of artificial intelligence (AI). However, the evaluation of large-scale Chinese human-computer dialogue is still a challenging task. To attract more attention to dialogue evaluation work, we held the fourth Evaluation of Chinese Human-Computer Dialogue Technology (ECDT). It consists of two tasks: few-shot learning in spoken language understanding (SLU) (Task 1) and a knowledge-driven multi-turn dialogue competition (Task 2), with data sets provided by Harbin Institute of Technology and Tsinghua University. In this paper, we introduce the evaluation tasks and data sets in detail, analyze the evaluation results, and discuss the existing problems in the evaluation.
Keywords: Chinese human-computer dialogue evaluation; Evaluation data; Few-shot learning; Knowledge-driven multi-turn dialogue
Acknowledgements
We would like to thank the Social Media Processing committee of the Chinese Information Processing Society of China (CIPS-SMP) for its strong support for this evaluation. We thank Huawei Technologies Co., Ltd. for providing financial support and iFLYTEK Co., Ltd. for providing data and evaluation support. We also thank Kaiyan Zhang and Jiale Zhang for their indispensable support during the evaluation. This paper is supported by the National Natural Science Foundation of China (No. 62076081, No. 61772153 and No. 61936010).
Article and author information
Cite As
Feng, Z.X., et al.: An evaluation of Chinese human-computer dialogue technology. Data Intelligence 3(2), 274-286 (2021). doi: 10.1162/dint_a_00090
Zixian Feng
This work was a collaboration between all of the authors. Z.X. Feng (zxfeng@ir.hit.edu.cn) summarized the data sets and results of SMP2020-ECDT and drafted the paper. All the authors have made meaningful and valuable contributions in revising and proofreading the resulting manuscript.
Zixian Feng is a postgraduate in the Research Center for Social Computing and Information Retrieval, School of Computer Science and Technology, Harbin Institute of Technology. Her current research interests are mainly in human-computer dialogue system evaluation.
Caihai Zhu
This work was a collaboration between all of the authors. C.H. Zhu (chzhu@ir.hit.edu.cn) drew the whole picture of the evaluation.
Caihai Zhu is a postgraduate in the Research Center for Social Computing and Information Retrieval, School of Computer Science and Technology, Harbin Institute of Technology. His current research interests are mainly in conversational recommendation.
Weinan Zhang
This work was a collaboration between all of the authors. W.N. Zhang (wnzhang@ir.hit.edu.cn) is the leader of 2020-ECDT. W.X. Che (car@ir.hit.edu.cn), Z.G. Chen (zgchen@iflytek.com), M.L. Huang (aihuang@tsinghua.edu.cn), and L.L. Li (lilinlin@huawei.com) guided the evaluation process and summarized the conclusion part of this paper. All the authors have made meaningful and valuable contributions in revising and proofreading the resulting manuscript.
Weinan Zhang is an associate professor in the Research Center for Social Computing and Information Retrieval, School of Computer Science and Technology, Harbin Institute of Technology. His research interests include human-computer dialogue, natural language processing and information retrieval.
Zhigang Chen
This work was a collaboration between all of the authors. W.N. Zhang (wnzhang@ir.hit.edu.cn) is the leader of 2020-ECDT. W.X. Che (car@ir.hit.edu.cn), Z.G. Chen (zgchen@iflytek.com), M.L. Huang (aihuang@tsinghua.edu.cn), and L.L. Li (lilinlin@huawei.com) guided the evaluation process and summarized the conclusion part of this paper. All the authors have made meaningful and valuable contributions in revising and proofreading the resulting manuscript.
Zhigang Chen joined iFLYTEK Corporation in 2003 and is currently Vice President of the AI Research Institute of iFLYTEK Corporation. He is mainly responsible for cognitive intelligence research and productization.
Wanxiang Che
This work was a collaboration between all of the authors. W.N. Zhang (wnzhang@ir.hit.edu.cn) is the leader of 2020-ECDT. W.X. Che (car@ir.hit.edu.cn), Z.G. Chen (zgchen@iflytek.com), M.L. Huang (aihuang@tsinghua.edu.cn), and L.L. Li (lilinlin@huawei.com) guided the evaluation process and summarized the conclusion part of this paper. All the authors have made meaningful and valuable contributions in revising and proofreading the resulting manuscript.
Minlie Huang
This work was a collaboration between all of the authors. W.N. Zhang (wnzhang@ir.hit.edu.cn) is the leader of 2020-ECDT. W.X. Che (car@ir.hit.edu.cn), Z.G. Chen (zgchen@iflytek.com), M.L. Huang (aihuang@tsinghua.edu.cn), and L.L. Li (lilinlin@huawei.com) guided the evaluation process and summarized the conclusion part of this paper. All the authors have made meaningful and valuable contributions in revising and proofreading the resulting manuscript.
Minlie Huang is an associate professor at the Department of Computer Science and Technology, Tsinghua University. His research interests include artificial intelligence, deep learning, reinforcement learning, and natural language processing.
Linlin Li
This work was a collaboration between all of the authors. W.N. Zhang (wnzhang@ir.hit.edu.cn) is the leader of 2020-ECDT. W.X. Che (car@ir.hit.edu.cn), Z.G. Chen (zgchen@iflytek.com), M.L. Huang (aihuang@tsinghua.edu.cn), and L.L. Li (lilinlin@huawei.com) guided the evaluation process and summarized the conclusion part of this paper. All the authors have made meaningful and valuable contributions in revising and proofreading the resulting manuscript.
Linlin Li is the leader of Intelligent Voice Assistant, Huawei Consumer BG. She is in charge of Huawei Xiaoyi’s voice assistant business. Her main work involves the collaboration and mutual assistance of pan-terminal equipment for voice services, multi-language internationalization, multi-modal semantic understanding and end-to-end intelligence.
Publication records
Published: July 7, 2021 (Version 1)
Data Intelligence