Published Versions 1 Vol 3 (2) : 10.1162/dint_a_00094 2021
Overview of SMP-CAIL2020-Argmine: The Interactive Argument-Pair Extraction in Judgement Document Challenge
: 2021 - 01 - 10
: 2021 - 03 - 23
: 2021 - 04 - 25
Abstract & Keywords
Abstract: In this paper we present the results of the Interactive Argument-Pair Extraction in Judgement Document Competition, held jointly by the Chinese AI and Law Challenge (CAIL) and the Chinese National Social Media Processing Conference (SMP), and introduce the related data set, SMP-CAIL2020-Argmine. The task challenged participants to choose, among five candidate arguments proposed by the defense, the one that refutes or acknowledges a given argument made by the plaintiff, given the full context recorded in the judgement documents of both parties. We received entries from 63 competing teams, 38 of which scored higher than the provided baseline model (BERT) in the first phase and entered the second phase. The best-performing systems in the two phases achieved accuracies of 0.856 and 0.905, respectively. We present the results of the competition and summarize the systems, highlighting commonalities and innovations among them. The SMP-CAIL2020-Argmine data set and baseline models have already been released.
Keywords: Argumentation Mining; Judgement Documents; Natural Language Understanding; Pretrained Language Model
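The evaluation described in the abstract (pick one of five defense candidates per plaintiff argument, scored by accuracy) can be illustrated with a minimal sketch. The 1-to-5 index convention, the function name `accuracy` and the toy labels below are assumptions made for illustration only; they are not taken from the released data set or from the official scoring script.

```python
from typing import Sequence

# The abstract states that each plaintiff argument comes with five candidates.
NUM_CANDIDATES = 5


def accuracy(predicted: Sequence[int], gold: Sequence[int]) -> float:
    """Return the fraction of instances whose predicted candidate index
    equals the gold candidate index (indices assumed to run from 1 to 5)."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if len(gold) == 0:
        raise ValueError("cannot compute accuracy on an empty set")
    for idx in list(predicted) + list(gold):
        if not 1 <= idx <= NUM_CANDIDATES:
            raise ValueError(f"candidate index out of range: {idx}")
    correct = sum(1 for p, g in zip(predicted, gold) if p == g)
    return correct / len(gold)


# Toy example (hypothetical labels): three of four choices are correct.
gold_labels = [1, 3, 5, 2]
system_choices = [1, 3, 4, 2]
print(accuracy(system_choices, gold_labels))  # 0.75
```

Under this reading, a system that guesses uniformly at random among the five candidates would be expected to score about 0.2, which gives a reference point for the reported accuracies.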
Acknowledgements
This work is partially supported by the National Key Research and Development Plan (No. 2018YFC0830600) and was carried out in cooperation with the China Justice Big Data Institute, which provided the judgement documents and employed the professional annotators. The competition is also sponsored by Beijing Thunisoft Information Technology Co., Ltd., and supported by both CAIL and SMP organizers.
Article and author information
Cite As
Yuan, J., et al.: Overview of SMP-CAIL2020-Argmine: The interactive argument-pair extraction in judgement document challenge. Data Intelligence 3(2), 287-307 (2021). doi: 10.1162/dint_a_00094
Jian Yuan
All of the authors have made meaningful and valuable contributions to the resulting manuscript. J. Yuan (19210980107@fudan.edu.cn) ran the code tests for the task, summarized the evaluation task and drafted the paper.
Jian Yuan is currently a graduate student of the School of Data Science, Fudan University. His research interests include argumentation mining, legal artificial intelligence and knowledge representation.
0000-0002-3201-9844
Zhongyu Wei
All of the authors have made meaningful and valuable contributions to the resulting manuscript. Z. Wei (zywei@fudan.edu.cn), S. Zou, D. Li (lidh18@mails.tsinghua.edu.cn), D. Zhao (dhzhao@fudan.edu.cn) and X. Huang (xjhuang@fudan.edu.cn) designed, released and promoted the shared task.
Zhongyu Wei is an Associate Professor in the School of Data Science at Fudan University and serves as the secretary of the Social Media Processing (SMP) committee of the Chinese Information Processing Society of China (CIPS). He received his PhD from The Chinese University of Hong Kong in 2014. His research focuses on multi-modality information understanding and generation across vision and language, argumentation mining and some cross-disciplinary topics.
0000-0003-3789-8507
Yixu Gao
All of the authors have made meaningful and valuable contributions to the resulting manuscript. Y. Gao (yxgao19@fudan.edu.cn) and W. Chen (chenwei18@fudan.edu.cn) participated in providing baseline models to the contestants.
Yixu Gao is currently a graduate student in the School of Data Science, Fudan University. Her research interests include argument mining, reinforcement learning and recommendations.
0000-0002-6605-6416
Wei Chen
Wei Chen is currently a PhD student in the School of Data Science at Fudan University. His research interests include dialogue systems and natural language generation.
0000-0001-9431-9247
Yun Song
All of the authors have made meaningful and valuable contributions to the resulting manuscript. Y. Song (1171991@s.hlju.edu.cn), J. Ma (mqstssf2009@126.com) and Z. Hu (huz06@126.com) helped formulate the shared task from a professional law perspective.
Yun Song is a doctoral student of the Law School, Heilongjiang University, Harbin, China. Her research interests include the history of law, artificial intelligence and justice.
0000-0001-5032-0107
Donghua Zhao
Donghua Zhao is an Associate Professor of the School of Mathematical Sciences, Fudan University, Shanghai, China. She received her PhD degree in Applied Mathematics from Fudan University in 2005. Her research interests include differential equations, complex networks, natural language processing and time-series analysis.
0000-0002-4959-2647
Jinglei Ma
Jinglei Ma is a product manager (PM) of the China Judicial Big Data Institute. He graduated from Beihang University with a Master of Laws. His research interest is judicial big data.
0000-0001-5854-2425
Zhen Hu
Zhen Hu is an independent researcher who is interested in machine learning, natural language processing and control systems. He received his PhD degree in Automation from Tsinghua University in 2015, and afterward led projects on smart cities, legal intelligence and other related subjects.
0000-0001-9587-3493
Shaokun Zou
Shaokun Zou is the Chief Executive Officer (CEO) of Beijing Huayu Yuandian Information Service Co., Ltd. His research interest involves legal artificial intelligence and automatic legal systems.
Donghai Li
Donghai Li is a professorate senior engineer, deputy general manager and chief technology officer of Beijing Huayu Yuandian Information Service Co., Ltd. He is a D.Eng. candidate in the Leading Talents for Innovation Program in the Department of Computer Science and Technology at Tsinghua University. His major research interests are in legal search technologies. He is dedicated to the research and application of legal artificial intelligence and was one of the first to apply knowledge graph technologies to the law tech field.
0000-0001-5177-3335
Xuanjing Huang
Xuanjing Huang is a Professor of the School of Computer Science, Fudan University, Shanghai, China. She received her PhD degree in Computer Science from Fudan University in 1998. Her research interests include artificial intelligence, natural language processing, information retrieval and social media processing.
0000-0001-9197-9426
Publication records
Published: July 9, 2021 (Versions 1)