Abstract: Deep learning approaches have shown great ability in the automatic interpretation of the electrocardiogram (ECG). However, large-scale public 12-lead ECG data are still limited, and the diagnostic labels are not uniform, which widens the semantic gap with clinical practice. In this study, we present a large-scale multi-label 12-lead ECG database with standardized diagnostic statements. The dataset contains 25770 ECG records from 24666 patients, acquired at Shandong Provincial Hospital (SPH) between 2019/08 and 2020/08. The record length is between 10 and 60 seconds. The diagnostic statements of all ECG records fully comply with the AHA/ACC/HRS recommendations for the standardization and interpretation of the electrocardiogram, and consist of 44 primary statements and 15 modifiers as per the standard. Of the records in the dataset, 46.04% contain ECG abnormalities, and 14.45% have multiple diagnostic statements. The dataset also contains additional patient demographics.