Published Versions 1 Vol 3 (3) : 376-388 2021
Overview of CCKS 2020 Task 3: Named Entity Recognition and Event Extraction in Chinese Electronic Medical Records
: 2021 - 02 - 11
: 2021 - 03 - 11
: 2021 - 03 - 15
Abstract & Keywords
Abstract: The CCKS 2020 Evaluation Task 3 covered clinical named entity recognition and event extraction for Chinese electronic medical records. Two annotated data sets and additional resources for the two subtasks were provided to participants. The competition attracted 354 teams, 46 of which submitted valid results. Pretrained language models were widely applied in this evaluation task, and data augmentation and external resources also proved helpful.
Keywords: Chinese electronic medical records; Event extraction; Named entity recognition; Clinical text; CCKS
Acknowledgments
[1]
Lample, G., et al.: Neural architectures for named entity recognition. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 260-270. Stroudsburg, Association for Computational Linguistics (2016)
[2]
Chiu, J.P.C., Nichols, E.: Named entity recognition with bidirectional LSTM-CNNs. Transactions of the Association for Computational Linguistics 4, 357–370 (2016)
[3]
Ma, X.Z., Hovy, E.: End-to-end sequence labeling via bi-directional LSTM-CNNs-CRF. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, pp. 1064–1074. Stroudsburg, Association for Computational Linguistics (2016)
[4]
Devlin, J., et al.: BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
[5]
Uzuner, Ö., et al.: 2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text. Journal of the American Medical Informatics Association 18(5), 552–556 (2011)
[6]
Suominen, H., et al.: Overview of the ShARe/CLEF eHealth evaluation lab 2013. In: The Fourth CLEF Conference, pp. 212–231. Berlin: Springer (2013)
[7]
Pradhan, S., et al.: SemEval-2014 task 7: Analysis of clinical text. In: Proceedings of the 8th International Workshop on Semantic Evaluation, SemEval@COLING 2014, pp. 54–62. Stroudsburg, Association for Computational Linguistics (2014)
[8]
He, J.Z., Wang, H.F.: Chinese named entity recognition and word segmentation based on character. In: Proceedings of the Sixth SIGHAN Workshop on Chinese Language Processing, pp. 128–132. Available at: https://www.aclweb.org/anthology/I08-4022.pdf.
[9]
Liu, Z.X., Zhu, C.H., Zhao, T.J.: Chinese named entity recognition with a sequence labeling approach: Based on characters, or based on words? In: International Conference on Intelligent Computing, pp. 634–640. Berlin: Springer (2010)
[10]
Zhang, Y., Yang, J.: Chinese NER using lattice LSTM. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, pp. 1554–1564. Stroudsburg, Association for Computational Linguistics (2018)
[11]
Ding, R.X., et al.: A neural multi-digraph model for Chinese NER with gazetteers. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 1462–1467. Stroudsburg, Association for Computational Linguistics (2019)
[12]
Liu, W., et al.: An encoding strategy-based word-character LSTM for Chinese NER. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 2379–2389. Stroudsburg, Association for Computational Linguistics (2019)
[13]
Sui, D.B., et al.: Leverage lexical knowledge for Chinese named entity recognition via collaborative graph network. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 3830–3840. Stroudsburg, Association for Computational Linguistics (2019)
[14]
Xue, M.G., et al.: Porous lattice transformer encoder for Chinese NER. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 3831-3841. Available at: https://www.aclweb.org/anthology/2020.coling-main.340.pdf.
[15]
Chen, Y.B., et al.: Event extraction via dynamic multi-pooling convolutional neural networks. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pp. 167–176. Stroudsburg, Association for Computational Linguistics (2015)
[16]
Nguyen, T.H., Cho, K., Grishman, R.: Joint event extraction via recurrent neural networks. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 300–309. Stroudsburg, Association for Computational Linguistics (2016)
[17]
Liu, X., Luo, Z.C., Huang, H.Y.: Jointly multiple event extraction via attention-based graph information aggregation. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 1247–1256. Stroudsburg, Association for Computational Linguistics (2018)
[18]
Liu, J., et al.: Event extraction as machine reading comprehension. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1641–1651. Stroudsburg, Association for Computational Linguistics (2020)
[19]
Kalpathy-Cramer, J., et al.: Evaluating performance of biomedical image retrieval systems – an overview of the medical image retrieval task at ImageCLEF 2004–2014. Computerized Medical Imaging and Graphics 39, 55–61 (2015)
[20]
Magge, A., Scotch, M., Gonzalez-Hernandez, G.: Clinical NER and relation extraction using bi-char-LSTMs and random forest classifiers. In: International Workshop on Medication and Adverse Drug Event Detection, pp. 25–30. PMLR (2018). Available at: http://proceedings.mlr.press/v90/magge18a/magge18a.pdf.
[21]
Ghiasvand, O., Kate, R.J.: Learning for clinical named entity recognition without manual annotation. Informatics in Medicine Unlocked 13, 122–127 (2018)
[22]
Yadav, S., et al.: Exploring disorder-aware attention for clinical event extraction. ACM Transactions on Multimedia Computing, Communications, and Applications 16(1s), Article 31 (2020)
[23]
Peters, M., et al.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Stroudsburg, Association for Computational Linguistics (2018)
[24]
Cui, Y.M., et al.: Revisiting pre-trained models for Chinese natural language processing. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings, pp. 657–668. Stroudsburg, Association for Computational Linguistics (2020)
[25]
Liu, Y.H., et al.: RoBERTa: A robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692 (2019)
Article and author information
Cite As
Citation: Li, X., et al.: Overview of CCKS 2020 Task 3: Named entity recognition and event extraction in Chinese electronic medical records. Data Intelligence 3(3), 376-388 (2021). doi: 10.1162/dint_a_00093
Xia Li
All of the authors contributed equally to the work. X. Li summarized the evaluation task and drafted the paper. All the authors have made meaningful and valuable contributions in revising and proofreading the resulting manuscript.
Xia Li is currently the head of the Clinical Pharmacy Department of the 305th Hospital of the Chinese People’s Liberation Army. She received her Bachelor’s degree from the Second Military Medical University in 2002. Her recent research interests center on clinical knowledge graphs and intelligent decision support for medication.
0000-0002-8420-1226
Qinghua Wen
All of the authors contributed equally to the work. Q.H. Wen reviewed the method documents submitted by the participating teams and undertook the code running test of the participating teams to ensure that the results were correct and fair. All the authors have made meaningful and valuable contributions in revising and proofreading the resulting manuscript.
Qinghua Wen is currently a graduate student in the Department of Computer Science and Technology, Tsinghua University. His research interests include knowledge engineering, relation extraction and data mining.
0000-0002-4116-2140
Hu Lin
All of the authors contributed equally to the work. H. Lin summarized the result and discussion part of this paper. All the authors have made meaningful and valuable contributions in revising and proofreading the resulting manuscript.
Hu Lin is currently the head of the Department of Medical Administration of the 305th Hospital of the Chinese People’s Liberation Army. His research interests focus on health management, intelligent follow-up systems and medical knowledge graphs.
0000-0003-1525-5922
Zengtao Jiao
All of the authors contributed equally to the work. Z.T. Jiao was responsible for producing data sets and labeling results by medical experts. All the authors have made meaningful and valuable contributions in revising and proofreading the resulting manuscript.
Zengtao Jiao is currently the director of the AI lab at Yidu Cloud Technology Co., Ltd. His research interests focus on key and difficult problems in the field of medical artificial intelligence, such as medical text information extraction, disease prediction models, and medical knowledge mining.
0000-0002-3534-479X
Jiangtao Zhang
All of the authors contributed equally to the work. J.T. Zhang was the organizer of this evaluation task who designed and released the shared task. All the authors have made meaningful and valuable contributions in revising and proofreading the resulting manuscript.
zhang-jt13@tsinghua.org.cn
Jiangtao Zhang received his PhD degree in Computer Science from Tsinghua University in 2018. He is now working as the director of the Information Center of the 305th Hospital of the Chinese People’s Liberation Army. His research interests include knowledge graphs, data mining and natural language processing in the medical domain. He organized and released a series of shared evaluation tasks for clinical knowledge discovery in CCKS 2017, CCKS 2018, CCKS 2019 and CCKS 2020.
0000-0001-8462-3915
Publication records
Published: Sept. 15, 2021 (Versions 1)
Data Intelligence