The 12-lead ECG is used to detect many cardiac abnormalities, including electrical conduction defects and myocardial infarction (MI). The accuracy of the 12-lead ECG has, however, been called into question, largely because the necessary diagnostic information may not be captured by the recording sites that make up this format. To counter this, investigators have looked to alternative recording techniques to capture more useful information. The most extreme example is the body surface potential map (BSPM), in which ECG information is recorded from as many as 200 sites on the torso. This level of spatial sampling provides a much more comprehensive picture of cardiac activity, as effectively all ECG information projected onto the body's surface is captured. Although superior in terms of their diagnostic yield, BSPMs are not widely used in clinical practice: the large number of recording channels makes the acquisition process cumbersome, and BSPMs have seen little use outside the research laboratory. Despite this, there is much to be gained from the study of BSPM data, since it offers this more comprehensive picture of cardiac activity. In this paper we investigate the use of BSPM data in the classification of old MI. In particular, we focus on locating the most useful diagnostic information in BSPMs and use it to address the classification problem.
In this study we analyze a set of 192 lead BSPMs that were recorded from a mixture of patients that were previously diagnosed as being normal or having old MI. The clinical data and experimental procedures are described as follows.
Clinical Data
The 192-lead BSPMs were recorded from a group of 116 subjects: 57 subjects with MI at various locations and 59 normal subjects. The breakdown of this dataset is listed in Table 1. The recording procedure has been described previously and is summarized as follows. On each subject, 16 columns of 12 electrodes were positioned on the torso. These columns were equally spaced around the thoracic circumference; a schematic of this electrode array is shown in Figure 1. For each subject the 192 channels were sampled simultaneously for a number of seconds. After recording, the data were averaged to represent one cardiac cycle. Beat markers were then inserted on this averaged beat by a human expert.
Table 1. Composition of BSPM Dataset used during experiments.
Fig. 1. Schematic of electrode array employed to record BSPM data. This illustration depicts the array as an unrolled cylindrical matrix of 16 x 12 = 192 recording sites. The top row corresponds to a horizontal line running around the circumference at the level of the suprasternal notch. The bottom row corresponds to a horizontal line at the level of the umbilicus.
Due to the abundance of data recorded in BSPMs, techniques have been developed that reduce these data prior to interpretation. A widely adopted technique in BSPM representation is the use of 'isointegral' or 'isoarea' maps. In this approach the area under a specific portion of the ECG waveform is calculated for each recorded lead, and the resulting values are used to generate a contour map. This technique summarizes the information contained in dozens of instantaneous maps in one picture.
Although some information is lost, isointegral maps are useful as they indicate the mean distribution of potentials over the selected interval. Two commonly studied maps are the QRS isointegral and the STT isointegral, which indicate the mean distributions during ventricular activation (depolarization) and recovery (repolarization) respectively. QRST isointegral maps have also been studied as they provide an indication of the 'ventricular gradient', a measure of the extent to which depolarization and repolarization fail to cancel one another out in any particular lead. Figure 2 illustrates the regions of a representative cardiac cycle incorporated in each isointegral.
Fig. 2. Illustration of area incorporated by (a) QRS, (b) STT and (c) QRST isointegrals.
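The isointegral calculation described above can be sketched for a single lead as follows. This is a minimal illustration only: it assumes the averaged beat is held as a list of samples at a fixed sampling interval and that the beat markers are sample indices for QRS onset, QRS offset and T-wave end. The function names, the marker names and the choice of trapezoidal integration are assumptions made for this sketch and are not details of the original recording or analysis software.

```python
def trapezoid_area(samples, start, end, dt):
    """Area under samples[start..end] (inclusive) by the trapezoidal rule."""
    segment = samples[start:end + 1]
    if len(segment) < 2:
        return 0.0
    # Sum of trapezoids: every sample counts fully except the two end points.
    return dt * (sum(segment) - 0.5 * (segment[0] + segment[-1]))


def isointegrals(beat, qrs_onset, qrs_offset, t_end, dt):
    """Return (QRS, STT, QRST) isointegral values for one lead.

    beat       -- averaged beat for this lead (hypothetical input format)
    qrs_onset  -- sample index of QRS onset (assumed marker)
    qrs_offset -- sample index of QRS offset (assumed marker)
    t_end      -- sample index of T-wave end (assumed marker)
    dt         -- sampling interval in seconds
    """
    qrs = trapezoid_area(beat, qrs_onset, qrs_offset, dt)
    stt = trapezoid_area(beat, qrs_offset, t_end, dt)
    qrst = trapezoid_area(beat, qrs_onset, t_end, dt)
    return qrs, stt, qrst
```

Applying `isointegrals` to each of the 192 leads in turn would give three values per recording site, which is how the three maps per subject arise.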
In the current study QRS, STT and QRST isointegrals were calculated. Each isointegral consists of 192 values, which are used to generate a contour map. Because the clinician studies the patterns of extrema (maxima and minima) of such a map in order to reach a diagnosis, and because these patterns are characterized by the 192 calculated values, these values can be treated as features for computerized classification. Calculating three such maps resulted in 576 features for each subject (3 x 192), that is, one QRS, one STT and one QRST value for each recording site.
Feature selection
After calculation of the isointegral values, further reduction in dimensionality was achieved by employing a signal-to-noise ratio-based feature ranking procedure. This approach is similar to the 'filter' method, in which each individual feature is ranked based on its utility when considered as an input to a single variable classifier (SVC). In the current study each variable is ranked using a signal-to-noise ratio-based feature ranking criterion. Let μ1(fi) and μ2(fi) be the mean values of the ith feature fi for classes 1 and 2, and let σ1(fi) and σ2(fi) be the respective standard deviations of fi for the same classes. The score Si is then determined as:

$$S_i = \frac{\mu_1(f_i) - \mu_2(f_i)}{\sigma_1(f_i) + \sigma_2(f_i)} \qquad (1)$$
A higher value of |Si| indicates a stronger correlation between the feature value and the class distinction, and hence implies that such a feature is useful in discriminating between classes.
Classification
Following the signal-to-noise ratio-based feature ranking, the best subsets of three, six and ten measurements (features) from the 192 available for each isointegral were used as inputs to four classification models: naive Bayes (NB), support vector machine (SVM), multilayer perceptron (MLP) and random forest (RF). A brief description of these common classifiers is given as follows:
NB is a simple probabilistic classifier. It is based on the Bayes rule of conditional probability and it naively assumes independence between features. It uses the normal distribution to model numeric attributes by calculating the mean and standard deviation of each attribute for each class.
SVM is a kernel-based classifier. The basic training for SVMs involves finding a function that optimizes a bound on the generalization capability, i.e., performance on unseen data. By using the kernel trick, an SVM can apply linear classification techniques to non-linear classification problems.
An MLP is a non-linear classification approach that may be trained using the back-propagation algorithm. An MLP consists of multiple layers of computational units (an input layer, one or more hidden layers and one output layer).
An RF classifier constructs a number of decision trees. Each tree is grown from a different set of training data, randomly selected with replacement. At each decision node the RF determines the best splitting feature from a randomly selected subspace of features. The final classification of an instance is decided by majority vote among the trees in the forest.
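To make the simplest of these four models concrete, the following is a minimal sketch of a Gaussian NB classifier of the kind described above: it estimates a mean and standard deviation per feature per class and combines the resulting normal densities under the independence assumption. It is an illustration written for this text, not the Weka implementation used in the study; the function names and the small constant added to each standard deviation are assumptions.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate the class prior and per-feature mean/std for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        stds = []
        for col, m in zip(columns, means):
            variance = sum((v - m) ** 2 for v in col) / n
            # Small floor (an assumption of this sketch) avoids division by zero.
            stds.append(math.sqrt(variance) + 1e-9)
        model[label] = (n / len(X), means, stds)
    return model


def predict_gaussian_nb(model, row):
    """Return the class with the highest log posterior for one sample."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, stds) in model.items():
        score = math.log(prior)
        for v, m, s in zip(row, means, stds):
            # Log of the normal density; features are treated as independent.
            score += -math.log(s * math.sqrt(2 * math.pi)) - (v - m) ** 2 / (2 * s * s)
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

In the setting of this paper, each row of `X` would hold the three, six or ten selected isointegral values for one subject and `y` would hold the corresponding class labels (normal or MI).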
In the evaluation of each classifier we used ten-fold cross validation. The quality of each classifier was assessed by the extent to which the correct class labels were assigned. To appreciate the experimental outcomes it is important not only to examine how many samples have been correctly classified in relation to a particular class, but also to indicate how well a classifier can classify an unknown sample as not belonging to a particular class. This study therefore evaluates classifiers using three statistical measures: precision (Ppv) (equation 2), true positive rate (also known as sensitivity, Se) (equation 3) and true negative rate (also known as specificity, Sp) (equation 4), which are calculated as follows:

$$\mathrm{Ppv} = \frac{TP}{TP + FP} \qquad (2)$$

$$\mathrm{Se} = \frac{TP}{TP + FN} \qquad (3)$$

$$\mathrm{Sp} = \frac{TN}{TN + FP} \qquad (4)$$

where TP is true positive (samples correctly classified as belonging to the class of interest), FN is false negative (samples incorrectly classified as not belonging to that class), FP is false positive (samples incorrectly classified as belonging to that class), and TN is true negative (samples correctly classified as not belonging to that class).
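The three measures can be computed directly from the four counts defined above, using the standard definitions Ppv = TP / (TP + FP), Se = TP / (TP + FN) and Sp = TN / (TN + FP). The sketch below is illustrative only; the function names and the convention of returning NaN when a denominator is zero are assumptions of this example.

```python
def confusion_counts(y_true, y_pred, positive):
    """Count TP, FP, TN and FN with respect to the class named `positive`."""
    tp = fp = tn = fn = 0
    for truth, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def precision_sensitivity_specificity(tp, fp, tn, fn):
    """Return (Ppv, Se, Sp); NaN is returned where a denominator is zero."""
    nan = float("nan")
    ppv = tp / (tp + fp) if (tp + fp) else nan
    se = tp / (tp + fn) if (tp + fn) else nan
    sp = tn / (tn + fp) if (tn + fp) else nan
    return ppv, se, sp
```

For a two-class problem such as normal versus MI, the measures would typically be reported with MI treated as the positive class.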
All four classification models were implemented within the framework provided by the Weka open-source platform. The configuration of the various classification models is summarized as follows:
The SVM results were obtained using a polynomial kernel. For the MLP, the results were obtained using a model with one hidden layer containing six nodes when evaluating the top ten features, four nodes when evaluating the top six features and two nodes when evaluating the top three features (the choice of feature subsets is discussed later in the paper). Each MLP was trained for 500 epochs with a learning rate of 0.3. For the RF algorithm, ten trees were grown in each run and the minimum number of instances per leaf was two. A more detailed description of the selection of learning parameters for these models has been given elsewhere.
The feature selection approach resulted in a set of scores for each of the isointegral types studied. As there are 192 feature scores per isointegral, one for each recording site, we were able to plot them as a contour map over the electrode array. These plots are shown in Figure 3, with one map for each isointegral. This means of representation has previously been referred to as a lead performance map (LPM). The LPMs show the spatial distribution of the score calculated for each feature using equation 1. Based on these values the features were ranked and the top three, six and ten features were selected. The locations of the recording sites from which these features are measured are illustrated in Figure 4.
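The scoring and ranking step can be sketched as follows. The sketch assumes that the signal-to-noise criterion of equation 1 takes the widely used form Si = (μ1 − μ2) / (σ1 + σ2), and it uses the population standard deviation; both choices, along with the function names, are assumptions of this illustration rather than confirmed details of the original implementation.

```python
import statistics


def snr_score(class1_values, class2_values):
    """Signal-to-noise score of one feature: (mu1 - mu2) / (sd1 + sd2).

    Assumes at least one class has non-zero spread; otherwise the
    denominator is zero and a ZeroDivisionError is raised.
    """
    mu1 = statistics.mean(class1_values)
    mu2 = statistics.mean(class2_values)
    sd1 = statistics.pstdev(class1_values)
    sd2 = statistics.pstdev(class2_values)
    return (mu1 - mu2) / (sd1 + sd2)


def rank_features(X, y, class1, class2, k):
    """Score every feature and return (indices of the top k by |S|, all scores)."""
    n_features = len(X[0])
    scores = []
    for i in range(n_features):
        c1 = [row[i] for row, label in zip(X, y) if label == class1]
        c2 = [row[i] for row, label in zip(X, y) if label == class2]
        scores.append(snr_score(c1, c2))
    # Rank by absolute score, largest first.
    order = sorted(range(n_features), key=lambda i: abs(scores[i]), reverse=True)
    return order[:k], scores
```

With the 192 values of one isointegral as the columns of `X`, calling `rank_features` with `k` set to 3, 6 or 10 would return the indices of the recording sites retained in each subset, and the full list of scores is what a lead performance map displays.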
Fig. 3. Lead Performance Maps showing spatial distribution of values as defined by Equation 1. Figures (a), (b) and (c) represent QRS, STT and QRST isointegrals respectively.
Fig. 4. Positions of recording sites required to measure features selected using ranking method. Figures (a), (b) and (c) represent QRS, STT and QRST features respectively. In each case the top three features are shown as stars, the next three as triangles and the remaining four as squares.
Classification
The classification accuracies of the three feature subsets for each isointegral are listed in Table 2. These results illustrate the performance of the four different classifiers (NB, SVM, MLP and RF) on each feature subset. Final accuracies are based on ten-fold cross validation as previously described.
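The ten-fold cross validation procedure behind these accuracies can be sketched generically as follows. The sketch shuffles the samples once, deals them into ten folds of near-equal size and pools the held-out predictions. The fold assignment scheme, the fixed seed and the `fit`/`predict` callables are assumptions of this illustration; Weka's own fold construction may differ, for example by stratifying on class.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and deal them into k folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validated_accuracy(X, y, fit, predict, k=10, seed=0):
    """Pool predictions over the k held-out folds and return overall accuracy.

    fit(train_X, train_y) must return a model; predict(model, row) must
    return a label. Both are supplied by the caller.
    """
    correct = 0
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_X = [X[i] for i in range(len(X)) if i not in held_out]
        train_y = [y[i] for i in range(len(y)) if i not in held_out]
        model = fit(train_X, train_y)
        correct += sum(predict(model, X[i]) == y[i] for i in test_idx)
    return correct / len(X)
```

With the 116 subjects of this dataset, ten folds would contain either 11 or 12 subjects each, so every subject is used for testing exactly once.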
We have divided this section between the discussions of the (a) selected features and associated recording sites and (b) the actual classification results.
Feature selection
Referring first to the QRS isointegral-based LPM depicted in Figure 3a, it can be seen that there are two areas where the correlation is greatest. These are a region on the inferior anterior torso, beneath the area interrogated by the standard precordial leads, and a region on the superior posterior torso, almost between the two shoulders. In the case of the STT isointegrals (Figure 3b) there are again two regions that correlate highly with the output, located on the anterior and posterior surfaces. This time, however, the area of high correlation on the anterior surface is located more laterally (towards the subject's left), and the region of high correlation on the posterior surface lies closer to the right shoulder than in the QRS map. The characteristics of the QRST LPM lie somewhere between those of the other two maps. This is to be expected, as the QRST data are effectively a combination of the QRS and STT portions. Overall, these observations indicate that, for this population, there would be benefit in locating recording sites outside the area interrogated by the standard locations of the 12-lead ECG. This is consistent with the findings of similar previous studies.
Classification
From the classification results presented in Table 2, it can be seen that for each subset of features the QRS-based features exhibit the poorest performance. Regardless of the classifier or the size of the feature subset, the classification accuracy attained with these features does not exceed 75%. The lowest accuracy of all, 62.9% using the RF classifier, is also observed with this isointegral. The STT-based features generally exhibit superior performance to the QRS-based features, as in most cases a classification accuracy in excess of 75% is observed. The exception is the subset of three STT features in conjunction with the RF classifier, which exhibits an accuracy of just under 70%. The QRST-based features exhibit performance comparable to that obtained with the STT features. The QRST features also give the highest accuracy attained, 83.6%, which was obtained using the RF classifier with the subset of ten features. That similar results can be attained using the STT and QRST features may be explained by the fact that the QRST data encompass both the QRS and STT distributions.
Table 2. Performance of feature subsets for all four classifiers.
Based on the above experiments and the results presented, we have illustrated how diagnostic electrocardiographic information is localized on the body surface. It can also be seen that this information may lie outside the regions currently interrogated using the standard 12-lead ECG. These results support our initial hypothesis that it is possible to improve the automated diagnostic process of cardiac assessment by identifying alternative subsets of features from BSPMs. Such findings offer the potential for future recommendations on alternative lead sets for cardiac assessment.
The authors would like to acknowledge the support of Professor Robert L. Lux of the University of Utah, Salt Lake City in the realization of this study. In particular they would like to thank him for providing the clinical data used.
Wagner G. S.: Marriott's Practical Electrocardiography, 10th Edition, Lippincott Williams & Wilkins, 2001.
Menown I. B. A., Patterson R. S. H. W., MacKenzie G., Adgey A. A. J.: Body Surface Map Models for Early Diagnosis of Acute Myocardial Infarction. Journal of Electrocardiology 1998; 31, pp. 180-188.
Sun G., Thomas C. W., Liebman J., Rudy Y., Reich Y., Stilli D., Macchi E.: Classification of Normal and Ischemia from BSPM by Neural Network Approach. In: Proceedings of the 10th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 1988, pp. 1504-1505.
Hoekema R., Uijen G. J. H., van Oosterom A.: On Selecting a Body Surface Mapping Procedure. Journal of Electrocardiology 1999; 32(2), pp. 93-101.
Maynard S. J., Menown I. B., Manoharan G., Allen J., McC Anderson J., Adgey A. A.: Body Surface Mapping Improves Early Diagnosis of Acute Myocardial Infarction in Patients with Chest Pain and Left Bundle Branch Block. Heart 2003; 89(9), pp. 998-1002.
Lux R. L., Smith C. R., Wyatt R. F., Abildskov J. A.: Limited Lead Selection for the Estimation of Body Surface Potential Maps in Electrocardiography. IEEE Transactions on Biomedical Engineering 1978; 25(3), pp. 270-276.
Lux R. L., Burgess M. J., Wyatt R. F., Evans A. K., Vincent G. M., Abildskov J. A.: Clinically Practical Lead Systems for Improved Electrocardiography: Comparison with Precordial Grids and Conventional Lead Systems. Circulation 1979; 59(2), pp. 356-363.
Lux R. L., Evans A. K., Burgess M. J., Wyatt R. F., Abildskov J. A.: Redundancy Reduction for Improved Display and Analysis of Body Surface Potential Maps. I. Spatial Compression. Circulation Research 1981; 49, pp. 186-196.
Taccardi B., Punske B. B., Lux R. L., MacLeod R. S., Ershler P. R., Dustman T. J., Vyhmeister Y.: Useful Lessons from Body Surface Mapping. Journal of Cardiovascular Electrophysiology 1998; 9, pp. 773-786.
Flowers N. C., Horan L. G.: Body Surface Potential Mapping. In: Cardiac Electrophysiology: From Cell to Bedside (Eds. D. Zipes and J. Jalife), Saunders, 1995, pp. 1049-1067.
Finlay D. D., Nugent C. D., McCullagh P. J., Black N. D.: Mining for Diagnostic Information in Body Surface Potential Maps: A Comparison of Feature Selection Techniques. Biomedical Engineering Online 2005; 4(51).
Golub T. R., Slonim D. K., Tamayo P., Huard C., Gaasenbeek M., Mesirov J. P., Coller H., Loh M. L., Downing J. R., Caligiuri M. A., Bloomfield C. D., Lander E. S.: Molecular Classification of Cancer: Class Discovery and Class Prediction by Gene Expression Monitoring. Science 1999; 286, pp. 531-537.
Kornreich F., Montague T. J., Rautaharju P. M.: Identification of First Acute Q Wave and non-Q Wave Myocardial Infarction by Multivariate Analysis of Body Surface Potential Maps. Circulation 1991; 84, pp. 2422-2453.
Kozmann G., Green L. S., Lux R. L.: Nonparametric Identification of Discriminative Information in Body Surface Maps. IEEE Transactions on Biomedical Engineering 1991; 38(11), pp. 1061-1068.
Rish I.: An Empirical Study of the Naive Bayes Classifier. IJCAI 2001 Workshop on Empirical Methods in Artificial Intelligence, 2001.
Burges C.: A Tutorial on Support Vector Machines for Pattern Recognition. Data Mining and Knowledge Discovery 1998; 2, pp. 121-167.
Hampshire J. B., Perlmutter B. A.: Equivalence Proofs for Multilayer Perceptron Classifiers and the Bayesian Discriminant Function. In: Proceedings of the 1990 Connectionist Models Summer School, 1990.
Witten I. H., Frank E.: Data Mining: Practical Machine Learning Tools with Java Implementations. Morgan Kaufmann, San Francisco, 2005.
Lopez J. A., Nugent C. D., van Herpen G., Kors J. A., Finlay D., Black N. D.: Visualisation of Electrocardiographic Features in Myocardial Infarction. In: Proceedings of the 29th Annual Conference of the International Society for Computerized Electrocardiology (ISCE), Journal of Electrocardiology. Vol. 37, 2004, p. 149.
Barr R. C., Spach M. S., Herman-Giddens S.: Selection of the Number and Position of Measuring Locations for Electrocardiography. IEEE Transactions on Biomedical Engineering 1971; 18, pp. 125-138.
Kors J. A., van Herpen G.: How Many Electrodes and Where: A "Poldermodel" for Electrocardiography. Journal of Electrocardiology 2002; 35(suppl.), pp. 7-12.
Finlay D. D., Nugent C. D., Donnelly M. P., Lux R. L., McCullagh P. J., Black N. D.: Selection of Optimal Recording Sites for Limited Lead Body Surface Potential Mapping: A Sequential Selection Based Approach. BMC Medical Informatics and Decision Making; 6(9), pp. 1-9.
Kozmann G., Lux R. L., Scott M.: Sample Size and Dimensionality in Multivariate Classification: Implications for Body Surface Potential Mapping. Computers and Biomedical Research 1991; 24, pp. 170-182.