Journal: International Journal of Population Data Science
Electronic ISSN: 2399-4908
Publication year: 2019
Volume: 4
Issue: 3
Pages: 1-1
DOI: 10.23889/ijpds.v4i3.1292
Publisher: Swansea University
Abstract: Introduction: Coded diagnoses (ICD-9, ICD-10) are available in the routine data of the Austrian health-care system only in connection with sick leave or inpatient hospital stays, so they cover only a small part of the population. Coded diagnoses from the outpatient sector are not documented. The aim of the project is to estimate diagnoses from filled prescriptions reimbursed by a public health insurance institution. The result is a model that provides probable diagnoses (ICD-10 coding) based on individual medication (ATC coding). Methods: Beginning in 2008/2009, the project ATC->ICD-9 was developed using a statistical procedure in which hospital and sick-leave diagnoses, together with data on received medication, are used to determine assignment probabilities. In this project, we developed a new method to derive diagnoses from medications. Our method is based on the word2vec algorithm: patient histories are used as input phrases, so that low-dimensional embeddings of medications and diseases are learned. In the learned vector space, similar medications and diseases are close to each other. Results: To evaluate our model, we compute the vector representation of a medication and look for nearby diseases. For example, the diseases closest to typical diabetes medications are different kinds of diabetes and retinal disorders, while gout and kidney diseases are found near gout medications. Conclusion: For the given examples, our model provides reasonable results. It yields not only the typical diseases for a medication but also common secondary symptoms. This motivates applying the model to further use cases. For example, given an anonymized list of patients and their medications, the disease distribution of these patients can be computed.
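Illustration: The evaluation step described in the abstract (take a medication's vector and look for nearby diseases) can be sketched as a cosine-similarity lookup. This is a minimal sketch, not the authors' implementation. The three-dimensional vectors below are made up by hand purely for illustration; in the described approach they would be learned by word2vec from patient histories. The `ATC:`/`ICD:` key prefixes and the function names `cosine` and `nearest_diseases` are likewise assumptions.

```python
import math

# Hypothetical toy embeddings, hand-written for illustration only.
# In the approach described in the abstract, these vectors would be
# learned by word2vec from patient histories.
EMBEDDINGS = {
    "ATC:A10BA02": [0.9, 0.1, 0.0],   # metformin (diabetes medication)
    "ATC:M04AA01": [0.0, 0.2, 0.9],   # allopurinol (gout medication)
    "ICD:E11": [0.8, 0.2, 0.1],       # type 2 diabetes mellitus
    "ICD:H36": [0.7, 0.4, 0.0],       # retinal disorders
    "ICD:M10": [0.1, 0.1, 0.95],      # gout
    "ICD:N18": [0.2, 0.5, 0.7],       # chronic kidney disease
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_diseases(med_code, k=2, embeddings=EMBEDDINGS):
    """Return the k ICD codes whose vectors are closest to the medication's."""
    med_vec = embeddings[med_code]
    scored = [
        (cosine(med_vec, vec), code)
        for code, vec in embeddings.items()
        if code.startswith("ICD:")
    ]
    # Sort by similarity, highest first, and keep only the codes.
    scored.sort(reverse=True)
    return [code for _, code in scored[:k]]


if __name__ == "__main__":
    print(nearest_diseases("ATC:A10BA02"))
    print(nearest_diseases("ATC:M04AA01"))
```

With these toy vectors, the diabetes medication lands closest to the diabetes and retinal-disorder codes, and the gout medication closest to the gout and kidney-disease codes, mirroring the qualitative pattern reported in the abstract. Real embeddings would have far more dimensions and would need to be trained on actual prescription and diagnosis histories.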