Background: In some clinical situations, the use of echocardiography is governed by clinical recommendations, but such recommendations are usually based on expert opinion rather than on studies of the effectiveness of the diagnostic method. High-quality echocardiographic diagnostics requires expensive equipment and highly qualified specialist physicians. Nevertheless, using echocardiography as a screening method in patients without acute symptoms is of practical importance.
Objective: This article presents an analysis of echocardiographic (ECHO) test data from cardiovascular patients. Data from more than 145,000 echocardiographic tests were analyzed. One objective of the study was to identify patterns and relationships among patient characteristics so that the procedure can be prescribed more accurately, based on the history of the disease and the individual characteristics of the patient.
Method: Electronic medical records (EMR) from the medical information system were converted into data frames. Echocardiography data are only partly structured: they contain both tables and natural-language records. In the first step, the history, diagnoses, and examination results were extracted from the EMR, and the natural-language records were processed. In the next step, the data frames were analyzed to identify correlations, and other data mining methods were applied (the sub-process in Figure 1 includes not only machine learning methods for solving the classification problem but also clustering, class specification, etc.). The resulting data frame contained 62 features, including target classes for finding significant pathologies. Next, gaps in the data were filled (for example, with the mean, the median, or digests). The model was then trained using machine learning methods (naive Bayes classifier, k-nearest neighbors, random forest, decision trees). The last step was the analysis of the features and the clinical interpretation of the decision trees and other data analysis results.
Results: The EMR of 145,966 echocardiographic examinations (about 80,000 patients) with different reasons for seeking treatment were analyzed. The most frequent reasons for echocardiography were arterial hypertension, observation of patients with heart murmurs, atherosclerosis and stenosis, heart attacks, and congenital and acquired defects of membranes and valves. Patients with fibrillation were considered as a particular case. The characteristics of the left ventricle, which reflect the severity of changes in the cardiovascular system associated with hypertension, were among the critical parameters for the classification. In addition, anthropometric indicators (height, weight), sex, and age make it possible to distinguish a group of patients with a high probability of a significant pathology being found.
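As an illustration of two of the steps named in the Method (filling gaps with the median, then classifying with k-nearest neighbors), the following is a minimal sketch and not the implementation used in the study. The three features (age, weight, left ventricular wall thickness), all numeric values, the labels, and the helper names `impute_median` and `knn_predict` are hypothetical and chosen only for demonstration; feature scaling is omitted for brevity.

```python
from collections import Counter
from math import dist
from statistics import median


def impute_median(rows):
    """Replace each None with the median of the observed values in its column."""
    columns = list(zip(*rows))
    medians = [median(v for v in col if v is not None) for col in columns]
    return [[m if v is None else v for v, m in zip(row, medians)] for row in rows]


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training rows closest to query."""
    nearest = sorted(zip(train_x, train_y), key=lambda pair: dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Synthetic rows: [age_years, weight_kg, lv_wall_thickness_mm]; None marks a gap.
rows = [
    [45, 70, 9],
    [50, None, 10],
    [38, 65, 9],
    [67, 92, 14],
    [72, 88, None],
    [70, 95, 13],
]
# Synthetic target: 1 = significant pathology found, 0 = not found.
labels = [0, 0, 0, 1, 1, 1]

filled = impute_median(rows)
prediction = knn_predict(filled, labels, [69, 90, 13], k=3)
print(filled[1], filled[4], prediction)
```

In practice the features would be standardized before computing distances, since unscaled kNN lets the feature with the largest numeric range dominate the result.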
Conclusions: It was possible to identify the classes and characteristics of patients for whom repeated diagnostic procedures are justified. Calculating personal risks from empirical retrospective data helps to identify the disease at an early stage. Identifying patients at high risk of disease complications allows physicians to make the right decisions about timely treatment, which can significantly improve the quality of treatment, help avoid complications, optimize costs, and improve the quality of medical care.