
Screening for Chagas disease from the electrocardiogram using a deep neural network

Abstract

Background

Worldwide, it is estimated that over 6 million people are infected with Chagas disease (ChD). It is a neglected disease that can lead to severe heart conditions in its chronic phase. While early treatment can avoid complications, the early-stage detection rate is low. We explore the use of deep neural networks to detect ChD from electrocardiograms (ECGs) to aid in the early detection of the disease.

Methods

We employ a convolutional neural network model that uses 12-lead ECG data to compute the probability of a ChD diagnosis. Our model is developed using two datasets which jointly comprise over two million entries from Brazilian patients: The SaMi-Trop study focusing on ChD patients, enriched with data from the CODE study from the general population. The model’s performance is evaluated on two external datasets: the REDS-II, a study focused on ChD with 631 patients, and the ELSA-Brasil study, with 13,739 civil servant patients.

Findings

Evaluating our model, we obtain an AUC-ROC of 0.80 (CI 95% 0.79-0.82) for the validation set (samples from CODE and SaMi-Trop), and in external validation datasets: 0.68 (CI 95% 0.63-0.71) for REDS-II and 0.59 (CI 95% 0.56-0.63) for ELSA-Brasil. On these two external datasets, we report a sensitivity of 0.52 (CI 95% 0.47-0.57) and 0.36 (CI 95% 0.30-0.42) and a specificity of 0.77 (CI 95% 0.72-0.81) and 0.76 (CI 95% 0.75-0.77), respectively. Additionally, when considering only patients with Chagas cardiomyopathy as positive, the model achieved an AUC-ROC of 0.82 (CI 95% 0.77-0.86) for REDS-II and 0.77 (CI 95% 0.68-0.85) for ELSA-Brasil.

Interpretation

The neural network detects chronic Chagas cardiomyopathy (CCC) from the ECG, with weaker performance for early-stage cases. Future work should focus on curating large, higher-quality datasets. The CODE dataset, our largest development dataset, includes self-reported and therefore less reliable labels, limiting performance for non-CCC patients. Our findings can improve ChD detection and treatment, particularly in high-prevalence areas.

Author summary

Chagas disease (ChD) is a neglected tropical disease, and the diagnosis relies on blood testing of patients from endemic areas. However, there is no clear recommendation on how to select patients for testing in endemic regions. Since most cases of chronic ChD are asymptomatic, the diagnostic rates are low, preventing patients from receiving adequate treatment.

The electrocardiogram (ECG) is a widely available, low-cost exam, often available in primary care settings. We present an artificial intelligence (AI) model for automatically detecting ChD from the ECG. AI algorithms have allowed the detection of hidden conditions on the ECG and, to the best of our knowledge, this is the first study to do so for ChD. We utilize large cohorts of patients from the relevant population of all-comers in affected regions in Brazil to develop a model for ChD detection that is then validated on datasets with ground truth labels obtained directly from the patients’ serological status.

Our findings demonstrate a promising AI-ECG-based model for discriminating patients with chronic Chagas cardiomyopathy (CCC). The capacity to detect ChD patients without CCC is still limited, but we believe this can be improved with the addition of epidemiological questions, and that such models can become useful tools for pre-selecting patients for further testing.

Introduction

Worldwide it is estimated that Chagas disease (ChD) infects more than 6 million people, with thousands of deaths each year [1]. Caused by the protozoan parasite Trypanosoma cruzi (T. cruzi), the disease is endemic to countries in continental Latin America, but migration has carried ChD to new regions, including Europe and the United States [2]. The most critical consequence of ChD is chronic Chagas cardiomyopathy (CCC), which occurs in 20–40% of the infected individuals [3]. CCC comprises a wide range of manifestations, including heart failure, arrhythmias, heart blocks, sudden death, thromboembolism, and stroke [1, 3].

ChD is often a lifelong infection in which most chronically infected patients remain asymptomatic but at risk of progression to cardiac damage [4, 5]. The incidence of cardiomyopathy in those in this asymptomatic (indeterminate) form of ChD varies from 0.9 to 7% new cases annually [1] and is related to the parasite burden [5, 6]. There is no single gold-standard laboratory test for diagnosing chronic Chagas disease. Instead, at least two serological tests with different methods for detecting antibodies to T. cruzi and complementary sensitivity and specificity are needed to confirm infection [1, 3]. Treatment with antitrypanosomal drugs such as benznidazole can prevent progression to the cardiac form [7, 8], but it does not seem to prevent death and cardiac complications in those with advanced cardiomyopathy [9]. Thus, the early recognition of chronic ChD patients is a necessary step for treatment in the early phases, when treatment success rates are higher and severe organ damage can be prevented from occurring [10].

Even if the newly diagnosed patient has established cardiomyopathy, an early diagnosis will allow the initiation of guideline-directed medical therapy for clinical conditions, such as heart failure and atrial fibrillation, to halt disease progression and eventually prevent death [10]. ChD patients generally have low socioeconomic status and limited access to health services, and they frequently do not realize that they are infected. The awareness of ChD among healthcare providers is also low, and there is a lack of knowledge on whom to screen as well as a lack of clarity on the appropriate tests and clinical management [11, 12].

In many countries, detection rates are below 10% and, more frequently still, below 1%. The low detection rates act as a barrier to care, preventing patients from receiving adequate treatment [13]. The under-appreciation of early diagnosis and treatment, especially at the primary healthcare level, represents a missed opportunity for modifying the natural history of the disease [10]. For this reason, the theme of World Chagas Disease Day 2022 was “finding and reporting every case to defeat Chagas disease” [13].

Here we study the possibility of using the electrocardiogram (ECG) to screen for ChD. The ECG is a widely available, low-cost exam, often provided in primary care settings in endemic countries [14]. The automated analysis of ECG is a successful technology and has already improved the analysis of this exam over the past decades [15].

The field of artificial intelligence, in particular deep learning [16], has demonstrated promising performance for automated analysis. Besides the success of classifying common ECG diagnoses with high performance [17, 18], the technology has presented successes in predicting and screening for diseases and diagnoses which traditionally were not directly possible only from the ECG. These include detection of myocardial infarction without ST-elevation [19], predicting the future development of atrial fibrillation from sinus rhythm exams [20, 21] and the ability to screen for cardiac contractile dysfunction [22]. Indeed, there is evidence that deep learning reading of ECGs detects more than traditional features, as is indicated by studies showing good prediction of age and even the risk of death [23–25].

In this study, we investigate whether a deep neural network can detect ChD and CCC from ECG tracings. Being able to evaluate ChD from this exam can help to detect cases at an early stage and enable earlier and more effective treatment.

Methods

Data sets

We develop our model using the SaMi-Trop data set [26] and the CODE data set [27]. The SaMi-Trop data set is a collection of ChD patients from the northern part of Minas Gerais, Brazil. The CODE data set [27] is more general, collected by the Telehealth Network of Minas Gerais (TNMG), Brazil [28]. For testing or external validation, we use the REDS-II data set [29] and the ELSA-Brasil data set [30]. The baseline characteristics of all four data sets are summarised in Tables 1 and 2.

Table 1. Development data sets baseline characteristics.

For CODE, the Chagas status is self-reported by the patient, while for SaMi-Trop a blood sample is used to determine the serological status. CCC stands for chronic Chagas cardiomyopathy, which is not available (n.a.) for the CODE data set. MI stands for myocardial infarction.

https://doi.org/10.1371/journal.pntd.0011118.t001

Table 2. Test data sets baseline characteristics.

Blood sample is used to determine the serological status in both datasets.

https://doi.org/10.1371/journal.pntd.0011118.t002

Definitions.

Chronic ChD is diagnosed by the presence of two positive different serological tests against T. cruzi in both SaMi-Trop and REDS-II cohorts, as recommended by international guidelines [3]. In the ELSA-Brasil study, a cohort primarily designed to study chronic non-communicable diseases, Chagas disease was defined by a single positive serological test. In the CODE study, Chagas disease was self-reported by the patients since this electronic cohort is formed by patients under care in primary care units in the state of Minas Gerais. For the SaMi-Trop, REDS-II and ELSA-Brasil cohorts, ECGs were transmitted to an ECG reading center at the ‘Centro de Telessaúde in Hospital das Clínicas’ in Belo Horizonte, Minas Gerais for standardized measurement, reporting and codification according to the Minnesota coding criteria in a validated ECG data management software [31]. Major ECG abnormalities were considered according to standard definitions [32], and all tracings with a major ECG abnormality have been reviewed by an experienced cardiologist.

CODE.

The Clinical Outcomes in Digital Electrocardiography (CODE) data set was developed with the database of digital ECG exams of the TNMG and a detailed description of the cohort can be obtained at [27]. The data set was collected between 2010 and 2017 from 811 counties in the state of Minas Gerais, Brazil. A subset of 15% of this data set is available online [33].

From an initial data set of 2,470,424 ECGs, 1,773,689 patients were identified. This initial data set contains the SaMi-Trop data set. Therefore, we first remove the patients from the SaMi-Trop study to avoid any overlap. Additionally, we exclude the ECGs with technical problems and those from patients under age 16, resulting in a total of 2,304,596 ECG records from 1,556,767 patients.

In this data set, the labels of ChD rely on self-reported diagnoses during the consultation. A total of 47,474 ECGs (2.0%) from 25,252 patients (1.6%) are labelled as positive ChD cases. The serological status of the self-reported Chagas labels has not been checked, and it is also unclear whether the patient has already developed CCC or not.

SaMi-Trop.

The study was conducted through a collaboration between scientists within the São Paulo-Minas Gerais Tropical Medicine Research Center (SaMi-Trop), formed with a specific research focus on ChD [34]. The study selected eligible patients with self-reported ChD diagnosis. This data set was collected in 21 Brazilian municipalities from ECGs taken between 2010 and 2012 by the TNMG. The connection to the TNMG explains the intersection of the SaMi-Trop data set with the CODE data set. The study has a follow-up time of two years. It is partially available in [35].

A total of 2,157 patients were assessed in the study. Among the patients from the original SaMi-Trop study, we removed 22 patients with an undefined serological status, and a further 83 who did not have a paired ECG recording. After the exclusions, the resulting data set comprises 2,054 patients with 1,910 ChD positive patients (93.4%). The positive patients consist of 1,111 patients with CCC (54.1% of total sample) and 799 without (38.9% of total sample).

Some of the patients had multiple ECG recordings taken during an exam, which we utilize during development as a form of data augmentation. Hence, we have 5,019 SaMi-Trop ECG traces available, including 2,693 traces with CCC (53.7%) and 1,961 traces without (39.1%).

REDS-II.

The Retrovirus Epidemiology Donor Study-II (REDS-II) data set was collected from blood donors to observe the natural history of ChD patients in São Paulo and Montes Claros, Brazil. Seropositive and seronegative patients examined in 1996–2002 were re-examined in 2008–10 [4] with ECG exams and again in 2018–19 [29]. The data set consists of 631 patients who underwent an ECG in the last visit in 2018–19, including 348 ChD patients (55.8%), of which 149 patients had CCC (23.6% of the total sample). The model is evaluated using a single exam from each patient (the first one).

ELSA-Brasil.

The Brazilian Longitudinal Study of Adult Health (ELSA-Brasil) aimed to examine risk factors and the long-term incidence of chronic diseases with focus on cardiovascular diseases and diabetes. The baseline evaluation was performed in 2008–2010 and recruited active and retired civil servants from five universities and research institutes from 6 different Brazilian states. ChD serological status and standardized ECG were obtained from all participants [36, 37].

The data set consists of 15,105 patients in total. We remove 27 patients where the ChD serological status is not available, 12 patients where the serological status is inconclusive, and 1,327 patients from which the ECG traces are not available. After the exclusions, we have a data set with a total of 13,739 patients. ChD was confirmed in 280 of the patients (2.0%), of which 46 had CCC (0.3% of the total sample). The model is evaluated using a single exam from each patient (the first one).

Model

Data preprocessing.

The ECG signals have been re-sampled such that all ECGs have the same sampling frequency of 400 Hz. Each input ECG has 4,096 time samples for each of the 12 standard ECG leads. Original signals of a shorter time span have been extended through zero-padding. The output data comprises binary scalar variables corresponding to a positive or negative diagnosis. We combine positive cases with and without CCC in our model in order to focus on the class of positive ChD cases in general.
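As an illustration of the preprocessing described above, the following is a minimal sketch rather than the authors' pipeline. The 400 Hz rate and the 4,096-sample length come from the text; the use of linear interpolation for re-sampling, the symmetric placement of the zero-padding, the centre-cropping of longer signals, and all function names are assumptions made for the example.

```python
import numpy as np

TARGET_FS = 400     # Hz, as stated in the text
TARGET_LEN = 4096   # samples per lead, as stated in the text


def resample_linear(ecg, fs_in, fs_out=TARGET_FS):
    """Resample a (leads, samples) array by linear interpolation (assumed method)."""
    n_in = ecg.shape[1]
    n_out = int(round(n_in * fs_out / fs_in))
    t_in = np.arange(n_in) / fs_in
    t_out = np.arange(n_out) / fs_out
    return np.stack([np.interp(t_out, t_in, lead) for lead in ecg])


def pad_or_crop(ecg, length=TARGET_LEN):
    """Zero-pad symmetrically (assumption) or centre-crop (assumption) to `length`."""
    n = ecg.shape[1]
    if n >= length:
        start = (n - length) // 2
        return ecg[:, start:start + length]
    left = (length - n) // 2
    right = length - n - left
    return np.pad(ecg, ((0, 0), (left, right)))  # default mode pads with zeros


def preprocess(ecg, fs_in):
    """Bring one 12-lead recording to 400 Hz and 4,096 samples per lead."""
    return pad_or_crop(resample_linear(ecg, fs_in))
```

For example, a 10-second recording sampled at 500 Hz has 5,000 samples per lead; it would be re-sampled to 4,000 samples and then padded with 48 zeros on each side.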

Architecture.

The deep learning model consists of a residual neural network (ResNet) adapted to uni-dimensional signals, and includes convolutional layers both before and within the residual blocks. Our network architecture is visualised in Fig 1. We make use of the same network architecture as [17], where the CODE data set was utilised to classify multiple ECG abnormalities; we refer to that work for further details and note that we have modified the final output layer in adaptation to our binary classification. The model is implemented in PyTorch [38], building upon code used in related work [39, 40].

Fig 1. Network architecture.

The figure was originally illustrated in [17].

https://doi.org/10.1371/journal.pntd.0011118.g001

Parameter tuning.

The learnable parameters of the neural network are chosen through minimisation of the binary cross-entropy loss function. For increased computational efficiency, we split the training data into mini-batches of size 32.

We use both the CODE and SaMi-Trop data sets during the training phase. This way, we utilise the size of the CODE data set (with many examples of negative diagnoses) as well as the high-quality (mainly positive) entries of SaMi-Trop. Each data set contributes 50% of the data that the model sees in each mini-batch. The validation data is an independent mix of 30% of the SaMi-Trop entries and twice as many entries from CODE.
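The 50/50 composition of each mini-batch can be sketched as follows. This is only one way to realise it: the batch size of 32 comes from the text, while the function name, the random draw of CODE entries for every batch, and the choice to let one pass over SaMi-Trop define an epoch are assumptions.

```python
import random


def balanced_batches(code_ids, samitrop_ids, batch_size=32, seed=0):
    """Yield mini-batches in which half the entries come from each data set.

    One epoch is one pass over the smaller SaMi-Trop set; CODE entries are
    drawn at random for every batch (both choices are assumptions).
    """
    rng = random.Random(seed)
    half = batch_size // 2
    code_pool = list(code_ids)
    order = list(samitrop_ids)
    rng.shuffle(order)
    # Incomplete trailing batches are dropped to keep the 50/50 split exact.
    for start in range(0, len(order) - half + 1, half):
        sami = [("samitrop", i) for i in order[start:start + half]]
        code = [("code", i) for i in rng.sample(code_pool, half)]
        batch = sami + code
        rng.shuffle(batch)
        yield batch
```

Each yielded batch is a list of `(source, index)` pairs that a data loader could map to the corresponding ECG tensors.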

The dropout rate is 0.5, and we use a weight decay of 0.001 to reduce the risk of overfitting. The learning rate is initially set to 0.001 and is decreased in a step-wise manner by a factor of 10 when the validation loss has not decreased for ten subsequent epochs (counted with respect to SaMi-Trop). We terminate the optimisation if the learning rate drops below 10^-7. We apply early stopping by using the network parameter values associated with the lowest validation loss for testing.
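The interplay between the step-wise learning-rate decay, the termination criterion and early stopping can be made concrete with a small, framework-free sketch. The numerical settings (initial rate 0.001, factor 10, ten epochs, floor 10^-7) are taken from the text; the function name and the exact bookkeeping are assumptions, and the behaviour is similar in spirit to PyTorch's `ReduceLROnPlateau` scheduler, although the text does not say which implementation was used.

```python
def replay_schedule(val_losses, lr=1e-3, factor=0.1, patience=10, min_lr=1e-7):
    """Replay per-epoch validation losses through the schedule described above.

    Returns (best_epoch, final_lr, epochs_run); `best_epoch` is the epoch whose
    parameters early stopping would keep.
    """
    best_loss, best_epoch = float("inf"), None
    stale = 0          # epochs since the validation loss last improved
    epochs_run = 0
    for epoch, loss in enumerate(val_losses):
        epochs_run = epoch + 1
        if loss < best_loss:
            best_loss, best_epoch, stale = loss, epoch, 0
        else:
            stale += 1
        if stale >= patience:
            lr *= factor   # step-wise decrease by a factor of 10
            stale = 0
            # Small tolerance so float rounding at exactly min_lr does not stop early.
            if lr < min_lr * (1 - 1e-9):
                break
    return best_epoch, lr, epochs_run
```

With a loss that improves for three epochs and then stagnates, the rate is reduced every ten epochs until it falls below the floor, while the parameters from the third epoch are the ones retained.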

To reduce the sensitivity to the weight initialisation, we use an ensemble approach by running the optimisation 15 times with different random seeds, and then averaging the outputs of the final models. The progression of the losses evaluated on the training and validation data sets is displayed in Fig 2.
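The ensemble output is simply the per-exam mean of the member outputs. A minimal sketch, assuming each trained model returns one probability per exam (the function name is hypothetical):

```python
def ensemble_average(member_outputs):
    """Average per-exam probabilities across ensemble members.

    `member_outputs` holds one list of probabilities per trained model,
    all in the same exam order.
    """
    n_models = len(member_outputs)
    return [sum(per_exam) / n_models for per_exam in zip(*member_outputs)]
```

In the setting described above, `member_outputs` would contain 15 lists, one per random seed.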

Fig 2. Loss function evaluation.

The shaded regions correspond to the maximum and minimum values of 15 separate learning processes with different weight initialisations. The solid lines are the averages.

https://doi.org/10.1371/journal.pntd.0011118.g002

Threshold selection.

The model output is a value between 0 and 1 and can loosely be interpreted as the predicted probability of ChD being present in the exam analysed. The Chagas diagnosis is predicted as positive when the model output is above a given classification threshold.

We consider two different approaches to selecting the threshold. The first one is by maximising the F1 score (i.e. the harmonic mean of precision and recall) on the validation data. This threshold is suitable for balanced or moderately imbalanced data sets where the main interest is to diagnose the patients under consideration.

The second approach is to choose the threshold by requiring a certain specificity on the validation data. The higher the specificity, the more likely the model is to correctly classify a negative patient as negative. As a high specificity is typically desired for screening purposes, this approach to threshold selection is suited to highly imbalanced data sets (which reflect the Chagas prevalence in the population as a whole).

The first approach is used on the REDS-II test set since this data set is only moderately imbalanced (55.8% ChD and 23.6% CCC ECGs). On the ELSA-Brasil test set the threshold is selected according to the second approach since this data set is more imbalanced (2.0% ChD and 0.3% CCC ECGs). We select the threshold by requiring a 90% specificity on the validation data.
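Both threshold-selection rules can be sketched in a few lines of plain Python. This is an illustrative implementation, not the authors' code: the candidate thresholds (zero plus every observed score), the tie-breaking towards the lowest threshold, and all function names are assumptions. A prediction is positive when the score is strictly above the threshold, as described above.

```python
def confusion(y_true, scores, thr):
    """Return (tp, fp, fn, tn) when scores strictly above `thr` are positive."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s > thr)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s > thr)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s <= thr)
    tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s <= thr)
    return tp, fp, fn, tn


def f1_at(y_true, scores, thr):
    tp, fp, fn, _ = confusion(y_true, scores, thr)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


def specificity_at(y_true, scores, thr):
    _, fp, _, tn = confusion(y_true, scores, thr)
    return tn / (tn + fp) if (tn + fp) else 0.0


def candidate_thresholds(scores):
    # 0.0 allows the "everything positive" operating point (assumed grid).
    return [0.0] + sorted(set(scores))


def threshold_max_f1(y_true, scores):
    """First approach: threshold that maximises F1 on the validation data."""
    return max(candidate_thresholds(scores), key=lambda t: f1_at(y_true, scores, t))


def threshold_for_specificity(y_true, scores, target=0.90):
    """Second approach: lowest threshold reaching the target specificity."""
    for t in candidate_thresholds(scores):
        if specificity_at(y_true, scores, t) >= target:
            return t
    return max(scores)
```

Choosing the lowest threshold that meets the specificity target keeps sensitivity as high as possible under that constraint, since specificity can only increase as the threshold is raised.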

Evaluation

Metrics.

Recall (also known as sensitivity), specificity and precision are threshold-dependent metrics that we used to evaluate and report the model performance. Recall or sensitivity specifies the ratio of true positive predictions to positive cases (i.e. the ratio of the positive cases that are indeed predicted as positive); specificity denotes the ratio of true negative predictions to negative cases; and precision is the ratio of true positive predictions to all positive predictions (the ratio of all positive predictions that are correct).

We also report two threshold-independent metrics. The AUC-ROC (also known as the c-statistic) is the area under the receiver operating characteristic (ROC) curve, and can be interpreted as the probability that a randomly chosen sample with positive label is assigned a higher output than a randomly chosen sample with negative label. Lastly, we report the average precision, which is obtained by integrating the precision-recall curve and thereby summarising it into a single value.
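The definitions above translate directly into code. The sketch below is illustrative: the AUC-ROC is computed from its pairwise-ranking interpretation (ties counted as one half), and the average precision uses a step-wise sum over the ranked exams, which is one common estimator of the area under the precision-recall curve and is an assumption here, as are the function names.

```python
def threshold_metrics(y_true, scores, thr):
    """Sensitivity (recall), specificity and precision at threshold `thr`."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s > thr)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s > thr)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s <= thr)
    tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s <= thr)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
    }


def auc_roc(y_true, scores):
    """Probability that a random positive scores higher than a random negative."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Step-wise area under the precision-recall curve (tied scores not grouped)."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    n_pos = sum(y_true)
    tp, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            tp += 1
            total += tp / rank   # precision at each new recall level
    return total / n_pos
```

The helper `threshold_metrics` assumes that at least one positive prediction, one positive case and one negative case are present; otherwise the corresponding ratio is undefined.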

Analysis of the results in groups.

As part of the model analysis we evaluate the model performance in different subgroups of patients. We stratify the patients by age group {16–40, 40–49, 50–59, 60–69, 70+} and sex {male, female}. Bootstrapping [41] is used to analyse the empirical distribution of the metrics in each subgroup. We generate 1,000 different data sets by sampling with replacement from the test set (each with the same number of samples as in the test set). Using the bootstrapped data sets, we compute the evaluation metrics described above and present the results in box plots.
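The bootstrap procedure described above can be sketched as follows. The 1,000 resamples of the same size as the test set come from the text; the redrawing of resamples that lack one of the two classes, the percentile interval, the example metric and all function names are assumptions made for illustration.

```python
import random


def sensitivity(y_true, scores, thr=0.5):
    """Example metric: fraction of positive cases scored above `thr`."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    return sum(1 for s in pos if s > thr) / len(pos)


def bootstrap_metric(y_true, scores, metric, n_boot=1000, seed=0):
    """Evaluate `metric` on `n_boot` resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(y_true)
    values = []
    while len(values) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:   # metric may be undefined without both classes; redraw
            continue
        values.append(metric(ys, [scores[i] for i in idx]))
    return values


def percentile_interval(values, alpha=0.05):
    """Percentile interval (an assumed choice) from the bootstrap distribution."""
    ordered = sorted(values)
    lo = ordered[int((alpha / 2) * len(ordered))]
    hi = ordered[int((1 - alpha / 2) * len(ordered)) - 1]
    return lo, hi
```

The list returned by `bootstrap_metric` is what a box plot per subgroup would summarise; `percentile_interval` turns the same list into an approximate 95% interval.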

Visualisation tools.

To identify possible patterns in the classification, we highlight parts of the ECG that the model focuses on for its prediction using an adaptation of the Grad-CAM visualisation method [42]. Visualisations are generated in two steps: in a forward pass we compute the activations of the neural network in an intermediary layer (we use the first convolutional layer of the first residual block), and in a backward pass we compute the gradients corresponding to these activations. The gradients are averaged to get the relative importance of each channel, which is then used to compute a proportional mean of the activations.

In essence, these plots highlight which parts of the ECG the network assigns particularly high importance. We generated the Grad-CAM plots for 20 cases (10 with CCC and 10 without) with the highest probability among the true positive cases. These plots were then inspected and analysed by a cardiologist for possible medical patterns.
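The two-step computation described above can be sketched for a one-dimensional signal as follows. The sketch assumes that the activations and their gradients at the chosen layer have already been extracted as arrays of shape (channels, time); the final rectification and normalisation follow the standard Grad-CAM recipe and may differ from the adaptation used here, and the function name is hypothetical.

```python
import numpy as np


def grad_cam_1d(activations, gradients):
    """Combine layer activations into a per-time-step importance map.

    activations, gradients: arrays of shape (channels, time) taken from the
    same intermediate layer in the forward and backward pass, respectively.
    """
    # Average the gradients over time: one importance weight per channel.
    weights = gradients.mean(axis=1)
    # Weight each channel's activations and combine them across channels.
    cam = (weights[:, None] * activations).sum(axis=0)
    # Keep only positive evidence and rescale to [0, 1] (standard Grad-CAM
    # steps, assumed here).
    cam = np.maximum(cam, 0.0)
    peak = cam.max()
    return cam / peak if peak > 0 else cam
```

The resulting map has the temporal resolution of the chosen layer and would need to be interpolated back to the 4,096 input samples before being overlaid on the ECG leads.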

Results

We evaluated the model performance on the validation data and the external test data sets. The ROC curve performance is displayed in Fig 3. The model attains AUC-ROC values of 0.80 (CI 95% 0.79–0.82) for the validation data set, 0.68 (CI 95% 0.63–0.71) for REDS-II and 0.59 (CI 95% 0.56–0.63) for ELSA-Brasil. The confidence intervals have been formed by bootstrapping the output of the ensemble model. Table 3 lists all performance metrics evaluated on the validation data for two different thresholds selected through the aforementioned approaches. The same metrics evaluated on the test data sets are listed in Table 4. Additionally, we also analysed the precision-recall curve and the empirical probabilities predicted by the model. These results are displayed in S1 and S3 Figs. The metrics for subgroups stratified by age and sex are displayed in Fig 4.

Fig 3. ROC curves on validation and test data: ChD vs normal.

Receiver operating characteristics (ROC) computed on the validation data set (SaMi-Trop + CODE) (a) and the test data sets, REDS-II (b) and ELSA-Brasil (c). The shaded regions encapsulate the maximum and minimum values corresponding to 15 different weight initialisations. The outputs of the 15 trained models are averaged to produce the output of the ensemble model, the result of which is shown by the solid lines. The 95% CI is obtained by bootstrapping the ensemble model. The dotted blue lines correspond to completely random assignment of class probabilities.

https://doi.org/10.1371/journal.pntd.0011118.g003

Fig 4. Results stratified by subgroup.

Box plots of the model performance on the REDS-II (top row) and ELSA-Brasil (bottom row) test sets stratified by age (left column) and sex (right column). The box plots give the performance on 1000 bootstrapped samples.

https://doi.org/10.1371/journal.pntd.0011118.g004

Table 3. Results on validation data.

Metrics and 95% confidence intervals evaluated on the validation data set for two different classification thresholds: 0.60 (selected by maximising the F1 score) and 0.71 (corresponding to 90% specificity).

https://doi.org/10.1371/journal.pntd.0011118.t003

Table 4. Results on test data.

Metrics and 95% confidence intervals evaluated on two different configurations of the test data sets. On the left, both ChD and CCC are considered positive. On the right, only CCC is considered positive. The classification thresholds are 0.60 for REDS-II and 0.71 for ELSA-Brasil.

https://doi.org/10.1371/journal.pntd.0011118.t004

We also evaluated the model when considering only patients with CCC as positive. In this case, the model attains an AUC-ROC of 0.82 (CI 95% 0.77–0.86) for REDS-II and 0.77 (CI 95% 0.68–0.85) for ELSA-Brasil (see Fig 5). All metrics for this configuration are included in Table 4. We also analysed the precision-recall curve and the empirical probabilities predicted by the model (S2 and S4 Figs).

Fig 5. ROC curves on test data: CCC vs all.

Receiver operating characteristics computed in REDS-II and ELSA-Brasil for predicting Chagas cardiomyopathy. The shaded regions encapsulate the maximum and minimum values corresponding to 15 different weight initialisations. The outputs of these models are averaged to produce the output of the ensemble model, the result of which is given by the solid lines. The dotted blue lines correspond to completely random assignment of class probabilities.

https://doi.org/10.1371/journal.pntd.0011118.g005

In S5 Fig and S1 Table, we show additional results for another test set configuration, in which the patients with CCC have been excluded and the remaining patients in whom ChD was detected constitute the positive cases (this configuration is indicated “no CCC”). We also show the result of a model trained to detect CCC (with all others being considered negative): S6 Fig shows the training curve, S7 Fig shows the ROC curves, precision-recall curves and empirical distribution of the probabilities, and finally, S2 and S3 Tables give the performance metrics in this case.

The Grad-CAM analysis is presented in Fig 6, which shows three representative leads of a patient with CCC from the ELSA-Brasil data set. The shaded regions illustrate what parts of the signals the model considers to be of particular importance for the prediction. In S8 Fig we include the equivalent plots for another three patients with a positive Chagas diagnosis, one with and two without CCC.

Fig 6. Grad-CAM analysis.

Grad-CAM plot for a patient with CCC from the ELSA-Brasil data set, correctly classified by the model as Chagas positive. This plot includes three representative leads (top to bottom: aVL, V1 and V6). The shading indicates regions that the model assigns particular importance for its prediction.

https://doi.org/10.1371/journal.pntd.0011118.g006

Discussion

Deep neural network-enabled analysis of the ECG is a topic of intense research [19–25]. Such methods have shown promising potential in detecting diverse conditions that are not traditionally diagnosed from the ECG, such as contractile dysfunction [22] or non-STEMI myocardial infarction [19]. ChD is the parasitic disease with the most impact in South America [43] and it affects the lives of millions of individuals worldwide. Early detection of this disease can therefore have a huge impact. Antiparasitic drugs are most effective in the early stage of the disease; however, most patients only become aware that they are infected much later, when the disease is already in a later stage and presents other manifestations. Using artificial intelligence or machine learning methods to detect the disease, and thereby enable early treatment, is therefore a promising alternative. To the best of the authors’ knowledge, this is the first study to present such an application.

The development of data-driven methods for automatic diagnosis of neglected diseases presents a challenge of its own. These diseases usually affect areas where the population is underprivileged and has little access to the healthcare system. The data might not come in well-organised databases or might not even be stored in electronic format. In this sense, the CODE, SaMi-Trop, ELSA-Brasil and REDS-II cohorts are extremely valuable: they are medium or large-size and well-kept data sets that can be used for developing and testing such tools.

The results we present are promising and indicate that the model is capable of detecting patients with CCC from the ECG tracings with high discrimination. For patients without CCC, the discrimination is lower.

In light of the results, it is natural to ask if we can further improve the performance with respect to patients with CCC. Therefore, we restrict the positive diagnoses to patients with CCC during the training phase and consider all patients without CCC as negatives (this implies that ChD positive patients without CCC are considered negative in this scenario). The result of this approach is given in S2 and S3 Tables. All metrics considered, except for the recall, are indeed improved. Thus, this model might be the preferable choice for CCC detection.

Chagas cardiomyopathy is characterised by a group of typical ECG abnormalities, frequently combining conduction disturbances, especially right bundle branch block with left anterior hemiblock, associated with rhythm disorders, such as ventricular ectopic beats and atrial fibrillation [44–46]. Thus, it is unsurprising that our Grad-CAM analysis depicts exactly the late portion of the QRS in cases with a bundle branch block. It is interesting that the Grad-CAM map also depicts the QRS complex when recognising the ChD patients with CCC, maybe related to the presence of high frequency, low amplitude abnormalities typical of fibrosis, which can occur early in the natural history of ChD [47]. However, this type of analysis has clear limitations [48, 49], since heatmaps can show where the critical area for the neural network's decision is but cannot tell whether the abnormality is related to changes in voltage, duration or morphology of the ECG tracing. Moreover, recurrent features, like the RR interval, are not shown in this kind of heatmap. Our analysis here is also limited to a small set of correct model predictions and does not represent a statistical analysis. Hence, we cannot deduce general rules for the diagnosis of ChD, but the unsurprising areas on which the model focuses suggest that it does not rely on unrelated proxy information to make its predictions.

Comparing the two test data sets, we obtain similar performance for discrimination in terms of AUC-ROC, but very different precision. This indicates that our model predicts many false positives for the ELSA-Brasil data set. Given the vast difference in prevalence for ChD patients in ELSA-Brasil (2.0%) and REDS-II (55.1%), it is expected that our model will have lower precision on ELSA-Brasil. We can also observe the large portion of false positive cases in S3 Fig panel C when choosing a threshold of 0.60 (based on F1 score) or even 0.71 (based on 90% specificity). We believe the performance could be improved with the addition of epidemiological questions, and that our model can be a useful tool in helping to pre-select patients for further testing in order to determine infection with ChD.
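The dependence of precision on prevalence follows from Bayes' rule: for a fixed sensitivity and specificity, the share of true positives among all positive predictions shrinks as the condition becomes rarer. The sketch below illustrates this with the two prevalences quoted above and a purely hypothetical operating point (sensitivity 0.5, specificity 0.8); it does not reproduce the tabulated results, and the function name is an assumption.

```python
def precision_from_rates(sensitivity, specificity, prevalence):
    """Precision (positive predictive value) implied by Bayes' rule."""
    true_pos = sensitivity * prevalence                  # share of population: true positives
    false_pos = (1.0 - specificity) * (1.0 - prevalence)  # share of population: false positives
    return true_pos / (true_pos + false_pos)


# Hypothetical operating point, evaluated at the two prevalences from the text.
low_prevalence = precision_from_rates(0.5, 0.8, 0.02)     # ELSA-Brasil-like setting
high_prevalence = precision_from_rates(0.5, 0.8, 0.551)   # REDS-II-like setting
```

Under these assumed rates the precision is about 0.05 at 2.0% prevalence and about 0.75 at 55.1% prevalence, even though the classifier itself is unchanged.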

As previously mentioned, the ChD status in the CODE data set is based on self-reporting by the patients, and the labels are thus subject to notable uncertainty. Testing on these labels might therefore be uninformative, and we have used more reliable databases such as ELSA-Brasil and REDS-II to get a better estimate of our model performance. Nonetheless, the labels in CODE still contain a sufficient amount of information to learn about CCC patients and the data set was indeed useful in developing a better-performing model. Methods designed to reduce the impact of label noise (see e.g. [50, 51]) could potentially be employed for more efficient use of the CODE data.

Our evaluation could be even more informative if we could test the model on other openly available data sets. However, data sets about neglected diseases are scarce, and both ELSA-Brasil and REDS-II are valuable medium to large-scale sources for rigorously testing the model. Furthermore, a comparison with other models or software for Chagas detection would be useful, but unfortunately it is not possible: to the best of our knowledge, this is the first work that tackles automatic diagnosis of Chagas directly from the ECG. Therefore, this study serves as a first baseline that opens a new line of work for further improvements.

Our findings are particularly valuable given the scarcity of validated strategies to detect ChD patients in endemic regions. Current recommendations for screening include all patients who were born in or have lived for an extended period in ChD endemic zones [44], which can be challenging, especially in endemic countries, since it can encompass the whole population of a region. A risk score was developed specifically to answer the question, “Does my patient have chronic Chagas disease?” but it seems to have limited practical value since it includes 13 variables obtained from clinical and epidemiological history and from a conventionally analysed 12-lead ECG [52]. This suggests that the best approach would merge conventional and non-conventional methods [53], including the use of rapid point-of-care serological tests [54].

A clinical study would be particularly valuable, as the performance of the model could then be evaluated directly by clinicians and patients. At this stage, we foresee our model as a method for pre-selecting patients for further serological screening. It is important to underline that more available data will enable improvements to the model, which could then be adapted for daily clinical practice. We hope that a future study will evaluate the clinical relevance of our model for improving the early diagnosis of ChD.

Supporting information

S1 Table. Results on test data: No-CCC configuration.

Metrics and 95% confidence intervals evaluated on the no-CCC configuration of the test data sets (see the text for details). The classification thresholds are 0.60 for REDS-II and 0.71 for ELSA-Brasil.

https://doi.org/10.1371/journal.pntd.0011118.s001

(XLSX)

S2 Table. Results on validation data: CCC-specific training.

Equivalent to Table 3 when the training is adapted to specifically target patients with CCC. The classification thresholds are 0.51 (selected by maximising the F1 score) and 0.33 (corresponding to 90% specificity).

https://doi.org/10.1371/journal.pntd.0011118.s002

(XLSX)
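
The two threshold selection rules named in the S2 Table caption (maximising the F1 score and fixing the specificity at 90%) can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions: a sample is taken as positive when its score is at least the threshold, candidate thresholds are the observed scores, and all function names are hypothetical rather than taken from the study code.

```python
def f1_at(threshold, scores, labels):
    """F1 score when samples with score >= threshold are predicted positive."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def threshold_max_f1(scores, labels):
    """Candidate threshold (an observed score) with the highest F1 score."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: f1_at(t, scores, labels))


def threshold_for_specificity(scores, labels, target=0.90):
    """Smallest candidate threshold whose specificity is at least `target`."""
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not negatives:
        raise ValueError("specificity is undefined without negative samples")
    for t in sorted(set(scores)):
        true_negatives = sum(1 for s in negatives if s < t)
        if true_negatives / len(negatives) >= target:
            return t
    # No observed score reaches the target; only an infinite threshold
    # (predicting every sample as negative) would.
    return float("inf")
```

Both rules should be applied to validation data only, so that the thresholds reported for the test sets are not tuned on the test labels.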

S3 Table. Results on test data: CCC-specific training.

Equivalent to Table 4 when the training is adapted to specifically target patients with chronic Chagas cardiomyopathy. The classification thresholds are 0.51 for REDS-II and 0.33 for ELSA-Brasil.

https://doi.org/10.1371/journal.pntd.0011118.s003

(XLSX)

S1 Fig. Precision-recall curves on validation and test data: ChD+CCC vs normal.

The shaded regions encapsulate the maximum and minimum values corresponding to 15 different weight initialisations—the outputs of these models are averaged to produce the output of the ensemble model, the result of which is given by the solid lines.

https://doi.org/10.1371/journal.pntd.0011118.s004

(EPS)
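
The ensemble described in the S1 Fig caption averages the outputs of models trained from different weight initialisations, and the shaded regions are the pointwise minimum and maximum over those models. A minimal sketch of both operations is given below; the function names and the list-of-lists layout (one inner list per model) are assumptions made for illustration only.

```python
def ensemble_average(member_outputs):
    """Average per-sample probabilities across ensemble members.

    member_outputs: list with one inner list of probabilities per model,
    all inner lists ordered by the same samples.
    """
    n_models = len(member_outputs)
    return [sum(per_sample) / n_models for per_sample in zip(*member_outputs)]


def envelope(member_curves):
    """Pointwise minimum and maximum across the members' curves.

    member_curves: list with one inner list of curve values per model,
    all evaluated on the same grid (e.g. precision at fixed recall levels).
    """
    lower = [min(values) for values in zip(*member_curves)]
    upper = [max(values) for values in zip(*member_curves)]
    return lower, upper
```

Here `ensemble_average` would give the probabilities behind the solid lines, while `envelope` applied to the individual models' curves would give the bounds of the shaded regions.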

S2 Fig. Precision-recall curves on test data: CCC vs rest.

We consider only CCC as positive and ChD as well as normal as negative here.

https://doi.org/10.1371/journal.pntd.0011118.s005

(EPS)

S3 Fig. Output histograms: ChD+CCC vs normal.

Histograms are computed on the validation data and the test data. Note the logarithmic scale of the y-axis. The numbers of false positives and false negatives can be read off by applying the selected thresholds on the x-axis: 0.60 for REDS-II and 0.71 for ELSA-Brasil.

https://doi.org/10.1371/journal.pntd.0011118.s006

(EPS)

S4 Fig. Output histograms: CCC vs rest.

Histograms are computed on the test data where we only consider CCC as positive and ChD as well as normal as negative. Note the logarithmic scale of the y-axis.

https://doi.org/10.1371/journal.pntd.0011118.s007

(EPS)

S5 Fig. Results on test data: No-CCC configuration.

Receiver operating characteristics (left), precision-recall curves (middle) and output histograms (right) computed on the test data for the no-CCC configuration (see the text for details). In this configuration, the CCC cases are removed, so the plots show ChD vs normal.

https://doi.org/10.1371/journal.pntd.0011118.s008

(EPS)

S6 Fig. Loss function evaluation: CCC-specific training.

Equivalent to Fig 2 when the training is adapted to specifically target patients with CCC.

https://doi.org/10.1371/journal.pntd.0011118.s009

(EPS)

S7 Fig. Results on test data: CCC-specific training.

Receiver operating characteristics (left), precision-recall curves (middle) and output histograms (right) computed on the validation data and the test data when the training is adapted to specifically target patients with CCC.

https://doi.org/10.1371/journal.pntd.0011118.s010

(EPS)

S8 Fig. Grad-CAM analysis: Additional patients.

Complementing Fig 6 with another three Grad-CAM plots for patients from the ELSA-Brasil data set, correctly classified by the model as Chagas positive. We here include one patient with chronic Chagas cardiomyopathy (a), and two without (b-c). The plots include three representative leads (top to bottom: aVL, V1 and V6). The shading indicates regions that the model assigns particular importance.

https://doi.org/10.1371/journal.pntd.0011118.s011

(EPS)

References

  1. Nunes M, Beaton A, Acquatella H, Bern C, Bolger A, Echeverría L, et al. Chagas Cardiomyopathy: An Update of Current Clinical Knowledge and Management: A Scientific Statement From the American Heart Association. Circulation. 2018;138:e169–e209. pmid:30354432
  2. Bern C. Chagas’ Disease. The New England Journal of Medicine. 2015;373(5):456–466. pmid:26222561
  3. Nunes MCP, Dones W, Morillo CA, Encina JJ, Ribeiro AL, Council on Chagas Disease of the Interamerican Society of Cardiology. Chagas disease: an overview of clinical and epidemiological aspects. Journal of the American College of Cardiology. 2013;62(9):767–776. pmid:23770163
  4. Sabino EC, Ribeiro AL, Salemi VMC, Di Lorenzo Oliveira C, Antunes AP, Menezes MM, et al. Ten-Year Incidence of Chagas Cardiomyopathy Among Asymptomatic Trypanosoma cruzi–Seropositive Former Blood Donors. Circulation. 2013;127(10):1105–1115. pmid:23393012
  5. Nunes MCP, Buss LF, Silva JLP, Martins LNA, Oliveira CDL, Cardoso CS, et al. Incidence and Predictors of Progression to Chagas Cardiomyopathy: Long-Term Follow-Up of Trypanosoma cruzi-Seropositive Individuals. Circulation. 2021;144(19):1553–1566. pmid:34565171
  6. Basquiera AL, Sembaj A, Aguerri AM, Omelianiuk M, Guzmán S, Moreno Barral J, et al. Risk progression to chronic Chagas cardiomyopathy: influence of male sex and of parasitaemia detected by polymerase chain reaction. Heart. 2003;89(10):1186–1190. pmid:12975414
  7. Viotti R, Vigliano C, Lococo B, Bertocchi G, Petti M, Alvarez MG, et al. Long-Term Cardiac Outcomes of Treating Chronic Chagas Disease with Benznidazole versus No Treatment. Annals of Internal Medicine. 2006;144(10):724–734. pmid:16702588
  8. Cardoso CS, Ribeiro ALP, Oliveira CDL, Oliveira LC, Ferreira AM, Bierrenbach AL, et al. Beneficial effects of benznidazole in Chagas disease: NIH SaMi-Trop cohort study. PLOS Neglected Tropical Diseases. 2018;12(11):1–12. pmid:30383777
  9. Morillo CA, Marin-Neto JA, Avezum A, Sosa-Estani S, Rassi A, Rosas F, et al. Randomized Trial of Benznidazole for Chronic Chagas’ Cardiomyopathy. New England Journal of Medicine. 2015;373(14):1295–1306. pmid:26323937
  10. Echeverría LE, Marcus R, Novick G, Sosa-Estani S, Ralston K, Zaidel EJ, et al. WHF IASC Roadmap on Chagas Disease. Global Heart. 2020;15.
  11. Miranda-Arboleda AF, Zaidel EJ, Marcus R, Pinazo MJ, Echeverría LE, Saldarriaga C, et al. Roadblocks in Chagas disease care in endemic and nonendemic countries: Argentina, Colombia, Spain, and the United States. The NET-Heart project. PLOS Neglected Tropical Diseases. 2022;15(12):1–12.
  12. Damasceno RF, Sabino EC, Ferreira AM, Ribeiro ALP, Moreira HF, Prates TEC, et al. Challenges in the care of patients with Chagas disease in the Brazilian public health system: A qualitative study with primary health care doctors. PLOS Neglected Tropical Diseases. 2020;14(11):1–13. pmid:33166280
  13. World Chagas Disease Day 2022—Finding and reporting every case to defeat chagas disease; 2022. www.who.int/news-room/events/detail/2022/04/14/default-calendar/world-chagas-disease-day-2022---finding-and-reporting-every-case-to-defeat-chagas-disease [Accessed: 31-05-2023].
  14. Alkmim MB, Silva CBG, Figueira RM, Santos DVV, Ribeiro LB, da Paixão MC, et al. Brazilian National Service of Telediagnosis in Electrocardiography. Studies in health technology and informatics. 2019;264:1635–1636. pmid:31438267
  15. Macfarlane PW, Kennedy J. Automated ECG Interpretation—A Brief History from High Expectations to Deepest Networks. Hearts. 2021;2(4):433–448.
  16. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436–444. pmid:26017442
  17. Ribeiro AH, Ribeiro MH, Paixão GMM, Oliveira DM, Gomes PR, Canazart JA, et al. Automatic diagnosis of the 12-lead ECG using a deep neural network. Nature communications. 2020;1760(11).
  18. Siontis KC, Noseworthy PA, Attia ZI, Friedman PA. Artificial intelligence-enhanced electrocardiography in cardiovascular disease management. Nat Rev Cardiol. 2021;18(7):465–478. pmid:33526938
  19. Gustafsson S, Gedon D, Lampa E, Ribeiro AH, Holzmann MJ, Schön TB, et al. Development and validation of deep learning ECG-based prediction of myocardial infarction in emergency department patients. Scientific Reports. 2022;12(1). pmid:36380048
  20. Attia ZI, Noseworthy PA, Lopez-Jimenez F, Asirvatham SJ, Deshmukh AJ, Gersh BJ, et al. An artificial intelligence-enabled ECG algorithm for the identification of patients with atrial fibrillation during sinus rhythm: a retrospective analysis of outcome prediction. The Lancet. 2019. pmid:31378392
  21. Biton S, Gendelman S, Ribeiro AH, Miana G, Moreira C, Ribeiro ALP, et al. Atrial fibrillation risk prediction from the 12-lead ECG using digital biomarkers and deep representation learning. European Heart Journal—Digital Health. 2021.
  22. Attia ZI, Kapa S, Lopez-Jimenez F, McKie PM, Ladewig DJ, Satam G, et al. Screening for cardiac contractile dysfunction using an artificial intelligence–enabled electrocardiogram. Nature Medicine. 2019;25(1):70–74. pmid:30617318
  23. Raghunath S, Ulloa Cerna AE, Jing L, vanMaanen DP, Stough J, Hartzel DN, et al. Prediction of mortality from 12-lead electrocardiogram voltage data using a deep neural network. Nat Med. 2020;26(6):886–891. pmid:32393799
  24. Raghunath S, Ulloa Cerna AE, Jing L, vanMaanen DP, Stough J, Hartzel DN, et al. Prediction of mortality from 12-lead electrocardiogram voltage data using a deep neural network. Nature Medicine. 2020. pmid:32393799
  25. Lima EM, Ribeiro AH, Paixão GM, Ribeiro MH, Filho MMP, Gomes PR, et al. Deep neural network estimated electrocardiographic-age as a mortality predictor. Nature Communications. 2021;12. pmid:34433816
  26. Cardoso CS, Sabino EC, Oliveira CDL, de Oliveira LC, Ferreira AM, Cunha-Neto E, et al. Longitudinal study of patients with chronic Chagas cardiomyopathy in Brazil (SaMi-Trop project): a cohort profile. BMJ Open. 2016;6(5).
  27. Ribeiro ALP, Paixão GMM, Gomes PR, Ribeiro MH, Ribeiro AH, Canazart JA, et al. Tele-electrocardiography and bigdata: The CODE (Clinical Outcomes in Digital Electrocardiography) study. Journal of Electrocardiology. 2019;57:S75–S78. pmid:31526573
  28. Alkmim M, Figueira R, Marcolino M, Cardoso C, Abreu M, Cunha L, et al. Improving patient access to specialized health care: the Telehealth Network of Minas Gerais, Brazil. Bulletin of the World Health Organization. 2012;90:373–8. pmid:22589571
  29. Nunes MCP, Buss LF, Silva JLP, Martins LNA, Oliveira CDL, Cardoso CS, et al. Incidence and Predictors of Progression to Chagas Cardiomyopathy: Long-Term Follow-Up of Trypanosoma cruzi-Seropositive Individuals. Circulation. 2021;144(19):1553–1566. pmid:34565171
  30. Aquino EML, Barreto SM, Bensenor IM, Carvalho MS, Chor D, Duncan BB, et al. Brazilian longitudinal study of adult health (ELSA-Brasil): Objectives and design. American Journal of Epidemiology. 2012;175(4):315–324. pmid:22234482
  31. Gomes PR, Paix GM, Lima EM, Marcolino MS, Ribeiro LB, Chequer G, et al. Electrocardiogram report system: the importance of decision-making tools. Journal of Electrocardiology. 2021;69:87.
  32. Denes P. Major and Minor ECG Abnormalities in Asymptomatic Women and Risk of Cardiovascular Events and Mortality. JAMA. 2007;297(9):978. pmid:17341712
  33. Ribeiro AH, Paixao GMM, Lima EM, Horta Ribeiro M, Pinto Filho MM, Gomes PR, et al. CODE-15%: a large scale annotated dataset of 12-lead ECGs; 2021. Available from: https://doi.org/10.5281/zenodo.4916206.
  34. Cardoso CS, Sabino EC, Oliveira CDL, Oliveira LCd, Ferreira AM, Cunha-Neto E, et al. Longitudinal study of patients with chronic Chagas cardiomyopathy in Brazil (SaMi-Trop project): a cohort profile. BMJ Open. 2016;6(5):e011181. pmid:27147390
  35. Ribeiro ALP, Ribeiro AH, Paixao GMM, Lima EM, Horta Ribeiro M, Pinto Filho MM, et al. Sami-Trop: 12-lead ECG traces with age and mortality annotations; 2021. Available from: https://doi.org/10.5281/zenodo.4905618.
  36. Resende BAM, Beleigoli AMR, Ribeiro ALP, Duncan B, Schmidt MI, Mill JG, et al. Chagas disease is not associated with diabetes, metabolic syndrome, insulin resistance and beta cell dysfunction at baseline of Brazilian Longitudinal Study of Adult Health (ELSA-Brasil). Parasitology International. 2021;85:102440. pmid:34411740
  37. Pinto-Filho MM, Brant LCC, Foppa M, Garcia-Silva KB, de Oliveira RAM, de Jesus Mendes da Fonseca M, et al. Major Electrocardiographic Abnormalities According to the Minnesota Coding System Among Brazilian Adults (from the ELSA-Brasil Cohort Study). The American Journal of Cardiology. 2017;119(12):2081–2087. pmid:28450038
  38. Paszke A, Gross S, Chintala S, Chanan G, Yang E, DeVito Z, et al. Automatic differentiation in PyTorch. In: NIPS-W; 2017.
  39. Lima EM, Ribeiro AH, Paixão GMM, Ribeiro MH, Pinto-Filho MM, Gomes PR, et al. Deep neural network-estimated electrocardiographic age as a mortality predictor. Nature communications. 2021;5117(12). pmid:34433816
  40. Ribeiro AH. ecg-age-prediction; 2021. https://github.com/antonior92/ecg-age-prediction.
  41. Efron B, Tibshirani RJ. An introduction to the bootstrap. CRC press; 1994.
  42. Selvaraju RR, Cogswell M, Das A, Vedantam R, Parikh D, Batra D. Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization. International Journal of Computer Vision. 2020;128(2):336–359.
  43. World Health Organization. Chagas disease in Latin America: an epidemiological update based on 2010 estimates. Weekly Epidemiological Record = Relevé épidémiologique hebdomadaire. 2015;90(06):33–44.
  44. Nunes MCP, Beaton A, Acquatella H, Bern C, Bolger AF, Echeverría LE, et al. Chagas Cardiomyopathy: An Update of Current Clinical Knowledge and Management: A Scientific Statement From the American Heart Association. Circulation. 2018;138(12):e169–e209. pmid:30354432
  45. Ribeiro ALP, Marcolino MS, Prineas RJ, Lima-Costa MF. Electrocardiographic abnormalities in elderly Chagas disease patients: 10-year follow-up of the Bambui Cohort Study of Aging. J Am Heart Assoc. 2014;3(1):e000632. pmid:24510116
  46. Brito BOdF, Ribeiro ALP. Electrocardiogram in Chagas disease. Rev Soc Bras Med Trop. 2018;51(5):570–577. pmid:30304260
  47. Ribeiro ALP, Cavalvanti PS, Lombardi F, Nunes MdCP, Barros MVL, Rocha MOdC. Prognostic value of signal-averaged electrocardiogram in Chagas disease. J Cardiovasc Electrophysiol. 2008;19(5):502–509. pmid:18266670
  48. Ghassemi M, Oakden-Rayner L, Beam AL. The false hope of current approaches to explainable artificial intelligence in health care. Lancet Digit Health. 2021;3(11):e745–e750. pmid:34711379
  49. Meira W, Ribeiro ALP, Oliveira DM, Ribeiro AH. Contextualized interpretable machine learning for medical diagnosis. Commun ACM. 2020;63(11):56–58.
  50. Patrini G, Rozza A, Krishna Menon A, Nock R, Qu L. Making deep neural networks robust to label noise: A loss correction approach. In: Proceedings of the IEEE conference on computer vision and pattern recognition; 2017. p. 1944–1952.
  51. Lukasik M, Bhojanapalli S, Menon A, Kumar S. Does label smoothing mitigate label noise? In: International Conference on Machine Learning. PMLR; 2020. p. 6448–6458.
  52. Brasil PEAAd, Xavier SS, Holanda MT, Hasslocher-Moreno AM, Braga JU. Does my patient have chronic Chagas disease? Development and temporal validation of a diagnostic risk score. Rev Soc Bras Med Trop. 2016;49(3):329–340. pmid:27384830
  53. Romero M, Postigo J, Schneider D, Chippaux JP, Santalla JA, Brutus L. Door-to-door screening as a strategy for the detection of congenital Chagas disease in rural Bolivia; 2011.
  54. Zamora LE, Palacio F, Kozlowski DS, Doraivelu K, Dude CM, Jamieson DJ, et al. Chagas Disease Screening Using Point-of-Care Testing in an At-Risk Obstetric Population. Am J Trop Med Hyg. 2020;104(3):959–963. pmid:33350375