A general framework to support cost-efficient fecal egg count methods and study design choices for large-scale STH deworming programs–monitoring of therapeutic drug efficacy as a case study

Luc E. Coffeng, et al.



Soil-transmitted helminths (STHs; Ascaris lumbricoides, Trichuris trichiura and the hookworm species Necator americanus and Ancylostoma duodenale) infect approximately 800 million individuals across the world and are responsible for the loss of more than three million disability-adjusted life years annually [1,2]. To control morbidity associated with these infections, the World Health Organization (WHO) strives to reduce the prevalence of moderate-to-heavy intensity (MHI) infections to less than 2% [3]. To reach this goal, anthelmintic drugs are periodically distributed to at-risk populations through large-scale deworming programs, so-called preventive chemotherapy [4]. In these programs, periodic follow-up surveys are conducted to determine whether the therapeutic efficacy of the administered drugs is still satisfactory [5], and whether stopping or scaling down drug administration is justified [6]. However, as these programs often operate in resource-poor settings, it is important to minimize operational costs without jeopardizing the correctness of the program decisions (e.g., avoiding premature scaling down of preventive chemotherapy or continued administration of anthelmintic drugs with reduced therapeutic efficacy). An important proportion of STH survey costs is related to the processing of stool samples and the counting of STH eggs under a microscope. Speich and colleagues [7] demonstrated that, independent of the evaluated diagnostic method, the lion’s share (~70%) of the total costs of performing egg counts in Zanzibar was made up of salaries. More recently, Leta and colleagues calculated that personnel salaries (~40%) and car rental fees (~50%) made up a combined ~90% of the total study costs when doing a national STH mapping survey in Ethiopia [8]. Hence, the number of samples to be screened, the speed at which technicians can process a single sample, the number of samples that can be processed per day, and thus the number of sampling days, are considered the major cost drivers of programmatic surveys for infection prevalence or therapeutic drug efficacy.

Several different microscopy-based methods (e.g., Kato-Katz thick smear (KK), Mini-FLOTAC, McMaster and FECPAKG2) are used to diagnose STH infections in stool, of which some are more complex than others [9–11]. Of all currently applied methods, the WHO-endorsed KK is the most widely established. This method produces smears of 41.7 mg of stool to visualize STH eggs for microscopic identification and counting [12] and is thought to be relatively easy and affordable [7]. The Mini-FLOTAC employs a flotation solution to separate STH eggs from stool debris in a special device prior to counting [9]. The FECPAKG2 method is the most recent and innovative diagnostic method [11,13,14]. It is also a flotation-based method, but instead of using a standard microscope, it employs a purpose-made device to accumulate STH eggs in one field of view, and to produce a digital image of this view that can later be marked up by a technician [11]. However, in a previous study, it was shown that both FECPAKG2 and Mini-FLOTAC had a clinical sensitivity equal or inferior to a single KK for all STHs [15], and that these flotation-based methods provided lower fecal egg counts (FECs; expressed as eggs per gram of stool (EPG)) compared to KK [16].

Making an evidence-based choice about which FEC method to use in STH control programs remains non-trivial, particularly when a decision-making framework is intended to be applied to a wide range of epidemiological settings. This is because the suitability and the cost of different survey designs and diagnostic techniques will vary by epidemiological setting [17]. For instance, the probability of making correct policy decisions may strongly depend on both the performance of a particular diagnostic method and the associated decision criterion in a particular epidemiological setting [18–23]. For STH, this performance depends on the average intensity of infection in a community as well as the level of variation in egg excretion (between individuals and within individuals over time), and in the case of the evaluation of drug efficacy, variation in individual drug responses [20,23]. Further, it is important to consider that the total operational cost of a survey will depend on the consumable costs of the diagnostic method used, the survey design (number of samples and number of days spent in the field), and the time needed to count eggs [20]. Importantly, the latter will depend on how many eggs need to be counted, which has not been considered before and which will vary by epidemiological setting and will depend on the goal of the survey (e.g., detecting infection (counting at least one egg) or quantifying intensity of infection (counting all eggs)). Quantifying these costs requires an in-depth analysis of the operational costs of processing samples with different FEC methods.

We aim to provide a general framework for evidence-based recommendations for cost-efficient decision-making in large-scale STH deworming programs based on FEC methods, using monitoring of therapeutic drug efficacy as a case study. To this end, we performed an in-depth analysis of the operational costs to process one sample for three FEC methods (KK, Mini-FLOTAC and FECPAKG2) based on the time-to-result and an itemized cost assessment. Next, we performed a simulation study to determine the probability of correctly concluding that the therapeutic drug efficacy is reduced based on different FEC methods, survey designs and numbers of individuals enrolled, while accounting for the variation in both egg counts and individual drug responses. Finally, we integrated the outcome of the in-depth cost-assessment into the simulation study to determine the most cost-efficient diagnostic test and survey design to detect presence of reduced drug efficacy for the different STH species across different scenarios of STH endemicity.


Ethics statement

Data were collected from four sites during a drug efficacy trial designed to test the equivalence of different FEC methods in attaining estimates of the therapeutic efficacy of a single oral dose of 400 mg albendazole (ALB) against STH infections in school-aged children (SAC) [24]. The trial was performed in Brazil, Ethiopia, Lao PDR and Zanzibar (Pemba Island). The study protocol for this trial was reviewed and approved by the Ethics Committee of the Faculty of Medicine and Health Sciences, the University Hospital of Ghent University, Belgium (Ref. No B670201627755; 2016/0266) and the national ethical committees associated with each trial site (Ethical Review Board of Jimma University, Jimma, Ethiopia: RPGC/547/2016; National Ethics Committee for Health Research (NECHR), Vientiane, Lao PDR: 018/NECHR; Zanzibar Medical Research and Ethics Committee, United Republic of Tanzania: ZAMREC/0002/February/2015; and the Institutional Review Board from Centro de Pesquisas René Rachou, Belo Horizonte, Brazil: 2.037.205). The trial was retrospectively registered on Clinicaltrials.gov (ID: NCT03465488) on March 7, 2018. Parent(s)/guardians of participants signed an informed consent document indicating that they understood the purpose and procedures of the study, and that they allowed their child to participate. If the child was ≥5 years, he or she had to orally assent in order to participate. Participants of ≥12 years of age were only included if they signed an informed consent document indicating that they understood the purpose and the procedures of the study, and were willing to participate.

In-depth analysis of the operational costs to process one sample for three FEC methods based on the time-to-result and an itemized cost assessment


Measuring time-to-result for the different FEC methods was part of the drug efficacy trial, which has been extensively described elsewhere [24,25]. During the trial, baseline stool samples were collected from SAC, who were subsequently treated with a single dose of 400 mg ALB. Between 14 and 21 days after treatment, SAC who were positive for any STH species at baseline were re-sampled to evaluate the reduction in egg output (egg reduction rate; ERR). At baseline and follow-up, stool samples were processed by duplicate KK (slide A and B), Mini-FLOTAC and FECPAKG2 to determine FECs (expressed in EPG) for each STH separately.

Upon arrival in the laboratory, stool samples were first grouped into batches of ten samples (with the remainder in a separate last batch). Subsequently, each individual stool sample was homogenized by stirring with a wooden tongue depressor. Finally, subsamples were taken to be processed according to the different FEC methods. Fig 1 provides an overview of the different steps timed for each FEC method, including preparing the sample for analysis, counting eggs and data entry (demographic data and FECs), which ultimately resulted in the time-to-result measurement.


Fig 1. Overview of the different operational steps for the different FEC methods.

The distinctive steps to perform a Kato-Katz (KK), Mini-FLOTAC or FECPAKG2 on a single stool sample are provided in chronological order per method. The procedures are grouped per main subject (blue: entry of demographic data; green: preparation of the sample; yellow: reading of the slide/device or the image to count STH eggs; red: entry of fecal egg count data). Waiting steps included in the procedure are indicated in grey and represent a fixed amount of time. The small clock symbol indicates what steps have been timed as part of this experiment. Clock clip art from https://openclipart.org/detail/125725/time-temps.


Detailed standard operating procedures (SOPs) to time the preparatory steps and the egg counting process are described elsewhere (see S3–S5 Infos of Vlaminck et al. [24]; S1 Info provides a brief summary). A summary of the SOP to time the data entry is provided in S2 Info.

We expressed the time (in seconds) needed to enter data and to prepare samples for analysis on a per-sample basis by dividing the total time recorded per batch by the number of samples within that batch. These calculations included batches gathered at baseline and follow-up. Batches containing fewer than 5 samples were not timed and were excluded from these calculations. We report the average reading time, preparation time and data entry time across batches. The overall mean preparation time per sample was calculated as the mean of batch-specific estimates of time per sample. The data on the timing of egg counting were analyzed at the level of samples. For each FEC method, the relationship between the time required to count and the absolute number of eggs in the sample was quantified using linear regression models. In these models, we predicted the log10-transformed time (in seconds) needed for egg counting (dependent variable) using the square of the log10-transformed total number of STH eggs counted plus 1 as the independent variable. Statistical analyses were conducted in R [26], Microsoft Excel v16.16.7 and Prism version 6.0 for Mac.
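The regression described above can be sketched as follows. This is a minimal illustration in Python, not the R code used for the analysis. It assumes that the predictor is read as (log10(eggs + 1))^2 and that an ordinary least-squares fit with an intercept is used; the function names and the example data are hypothetical.

```python
import math

def fit_reading_time(egg_counts, seconds):
    """Least-squares fit of log10(seconds) = a + b * (log10(eggs + 1))**2."""
    x = [math.log10(n + 1) ** 2 for n in egg_counts]
    y = [math.log10(t) for t in seconds]
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope

def predict_seconds(intercept, slope, eggs):
    """Back-transform the fitted line to a reading time in seconds."""
    return 10 ** (intercept + slope * math.log10(eggs + 1) ** 2)

# Hypothetical, noise-free data generated from a = 2.0 and b = 0.1,
# used only to show that the fit recovers the generating coefficients.
eggs = [0, 5, 20, 100, 500]
times = [predict_seconds(2.0, 0.1, n) for n in eggs]
a, b = fit_reading_time(eggs, times)
```

With real timing data, the fitted intercept corresponds to the reading time of a sample without eggs (on the log10 scale), and the slope describes how quickly reading time grows with the egg count.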

Itemized cost assessment.

We calculated the cost of materials to collect a stool sample in a school setting and the costs to perform the FEC method, including the required equipment, supplies and reagents. For this, we performed an itemized cost assessment considering the cost per unit, the usage over a one-year period and the expected duration of use (in years). A detailed itemized cost assessment to collect stool samples and to perform the FEC method is provided in S3 Info. For specific items, such as the KK kit, Mini-FLOTAC or FECPAKG2 devices, we used the prices that were either advertised online or obtained through the manufacturer (2020). To estimate the cost of everyday materials, such as scissors, paper, salt and buckets, Ethiopian market prices were used (2020). The cost of a microscope and computer for data-entry were each amortized over 10,000 FEC samples, assuming that they both would be useable for multiple surveys.
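As a rough illustration of this type of itemized assessment, the sketch below spreads the cost of a reusable item over its expected duration of use and amortizes durable equipment over a fixed number of samples. The exact formulas are given in S3 Info; the function names, the annualization rule and the microscope price used here are assumptions made for illustration only.

```python
def annualized_cost(unit_cost, units_needed, years_of_use):
    """Yearly cost of an item: total purchase cost spread over its lifetime (assumed rule)."""
    return unit_cost * units_needed / years_of_use

def amortized_per_sample(purchase_cost, samples=10_000):
    """Cost per sample of durable equipment amortized over a fixed number of FEC samples."""
    return purchase_cost / samples

# Hypothetical example: a 1,500 US$ microscope amortized over 10,000 samples.
microscope_per_sample = amortized_per_sample(1500.0)

# Hypothetical example: four buckets of 10 US$ each that last two years.
buckets_per_year = annualized_cost(10.0, 4, 2)
```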

Simulation study to assess the probability of correctly detecting a truly reduced therapeutic efficacy

Definition of a reduced therapeutic efficacy and survey designs.

For each candidate survey design, we determined the probability that the resulting ERR point-estimate confirmed the presence of reduced efficacy of a single oral dose of 400 mg ALB. Here, we assumed that the true efficacy was 5%-points under the species-specific thresholds specified by WHO (Table 1) [5], and we concluded the presence of reduced therapeutic efficacy if the ERR point-estimate was under the WHO threshold. Given that the endemicity at baseline has an impact on the statistical power [27] and the total survey costs [21], we determined the probability of correctly detecting a truly reduced therapeutic efficacy across different scenarios of endemicity.
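The decision rule can be written down in a few lines. The sketch below assumes a group-based ERR computed from arithmetic mean egg counts; the function names and the egg counts are hypothetical, and the 50% threshold is the value implied in this paper for T. trichiura (a simulated true efficacy of 45%, i.e., 5%-points under the threshold).

```python
def egg_reduction_rate(baseline_counts, followup_counts):
    """Group-based ERR: 1 - (arithmetic mean at follow-up / arithmetic mean at baseline)."""
    mean_baseline = sum(baseline_counts) / len(baseline_counts)
    mean_followup = sum(followup_counts) / len(followup_counts)
    return 1.0 - mean_followup / mean_baseline

def efficacy_is_reduced(err, threshold):
    """Conclude reduced efficacy when the ERR point estimate is under the threshold."""
    return err < threshold

# Hypothetical egg counts for four individuals.
baseline = [100, 200, 0, 100]
followup = [55, 110, 0, 55]
err = egg_reduction_rate(baseline, followup)
reduced = efficacy_is_reduced(err, threshold=0.50)
```

Because the rule uses only the point estimate, sampling variation alone can push an estimate over the threshold even when the true efficacy is below it, which is why the probability of a correct conclusion has to be assessed by simulation.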

Currently, it is recommended by WHO to determine the efficacy based on individuals that were egg-positive at baseline only; however, excluding individuals who were egg-negative at baseline from the analysis may result in a substantial overestimation of drug efficacy due to regression to the mean, particularly in low endemic settings or when true drug efficacy is low. Coffeng and colleagues showed previously that this bias could be avoided by a number of alternative survey designs [21]. However, an in-depth analysis of the associated operational costs was missing. Here, we re-evaluate some of the survey designs assessed by Coffeng and colleagues, including the WHO-recommended ‘screen and select’ design (SS; only egg-positive individuals are followed up), the ‘screen, select, and retest’ (SSR; only egg-positive individuals are followed up, but the analysis is based on a second separate baseline stool sample [21]), and the ‘no selection’ design (NS; all enrolled individuals are screened at baseline and follow-up). For the NS and SSR survey designs, we explored two variants, one that was based on a single FEC on the follow-up sample (NS1 × 1/1x1 and SSR1 × 1/1x1), and one that was based on duplicate FECs (NS1 × 1/1x2 and SSR1 × 1/1x2). We did not consider survey designs based on a single FEC on two consecutive stool samples at follow-up (NS1 × 1/2x1 and SS1 × 1/2x1), as sample collection on two consecutive days adds considerable logistical issues while yielding relatively little in terms of precision in drug efficacy estimates [21]. For the WHO-recommended SS design we considered two variants: one based on a single FEC both at baseline and follow-up (SS1 × 1/1x1), and one based on duplicate FECs at both time points (SS1 × 2/1x2). The former is recommended in the WHO manual to monitor the therapeutic efficacy of drugs against schistosomes and STH [5], and the latter is currently being piloted in a number of endemic countries as part of the Starworms project [27].

General simulation framework.

For the current simulation study, we adapted the framework described by Coffeng and colleagues [21], accounting for the following sources of variation in egg counts:

  1. Inter-individual variability in mean egg intensity due to variation in infection levels between individuals (assumed to follow a gamma distribution);
  2. Day-to-day variability in mean egg intensity within an individual due to heterogeneous egg excretion over time (assumed to follow a gamma distribution);
  3. Variability in egg counts between repeated aliquots of a stool sample due to the aggregated distribution of eggs in stool (assumed to follow a Poisson or a gamma-Poisson (i.e., negative binomial) distribution);
  4. Inter-individual variability in the effect of drug administration in terms of the ERR (assumed to follow a beta distribution).

For the quantification of each gamma distribution, we followed the approach of Denwood et al. [28] in using the coefficient of variation (cv) as a standardised measure of variability, which is related to the shape parameter k of a gamma distribution by taking k = cv^-2 (i.e., k = 1/cv^2). Species-specific variability between individuals and within individuals over time was estimated based on data from clinical trials during which a duplicate KK was performed on two consecutive stool samples both at baseline and follow-up [29]. STH species and FEC method-specific variability between repeated aliquots of the same stool sample was estimated from the egg count data published by Cools et al. [15]. The average difference between FEC methods in terms of egg recovery performance was also determined, as flotation techniques are known to miss unfertilized eggs [30] (see S4 Info for details). The parameterization of the simulation framework is summarized in Table 1. For a detailed description of the simulation model we refer to S5 Info.
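The layered sources of variation can be illustrated with a small sampler. The sketch below is not the eggsim implementation described in S5 Info: it assumes Poisson variation between aliquots, works on raw egg counts, and uses hypothetical parameter values and function names. It only shows how a gamma distribution parameterized by its mean and cv (shape k = cv^-2) can be nested with a Poisson count and a beta-distributed individual drug response.

```python
import math
import random

def gamma_with_mean_cv(rng, mean, cv):
    """Draw from a gamma distribution with the given mean and coefficient of variation."""
    if mean <= 0:
        return 0.0
    shape = cv ** -2                      # k = cv^-2
    return rng.gammavariate(shape, mean / shape)

def poisson(rng, lam):
    """Poisson draw: Knuth's method for small means, normal approximation otherwise."""
    if lam <= 0:
        return 0
    if lam > 500:                         # approximation, avoids underflow of exp(-lam)
        return max(0, round(rng.gauss(lam, math.sqrt(lam))))
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def simulate_individual(rng, pop_mean, cv_between, cv_day, beta_a, beta_b):
    """Return (baseline count, follow-up count) for one individual."""
    individual_mean = gamma_with_mean_cv(rng, pop_mean, cv_between)   # source 1
    baseline_day = gamma_with_mean_cv(rng, individual_mean, cv_day)   # source 2
    baseline = poisson(rng, baseline_day)                             # source 3
    reduction = rng.betavariate(beta_a, beta_b)                       # source 4
    followup_mean = individual_mean * (1.0 - reduction)
    followup_day = gamma_with_mean_cv(rng, followup_mean, cv_day)
    followup = poisson(rng, followup_day)
    return baseline, followup

# Hypothetical parameters: mean of 5 eggs per aliquot, cv of 1.5 between individuals,
# cv of 0.8 between days, and a beta(9, 11) drug response (mean reduction 45%).
rng = random.Random(1)
pairs = [simulate_individual(rng, 5.0, 1.5, 0.8, 9, 11) for _ in range(1000)]
```

Replacing the Poisson step with a gamma-Poisson draw (an extra gamma multiplier on the daily mean) would give the negative binomial variant mentioned in the list above.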

Using this framework, we simulated egg counts for all survey designs across four scenarios, each representing different population average baseline FECs. The selection of these scenarios was based on infection levels in the nationwide mapping of STH infections in Ethiopia [31] (see also S1 Fig), where each scenario represented the median of school-level mean FEC (in EPG) of one of four endemicity levels (prevalence between 1.0 and 9.9%, 10.0 and 19.9%, 20.0% and 49.9%, or ≥50.0%). For each survey design, we considered a range of 100 to 5,000 individuals (with increments of 5 individuals) that are initially tested at baseline. For each survey design, sample size and endemicity scenario, 10,000 repeated Monte Carlo simulations were performed. In each simulation, the group-based arithmetic mean ERR was calculated using the recommended procedure [32], and the ERR was considered reduced if under the STH species-specific threshold (Table 1). For each survey design and sample size, we then calculated the proportion of the 10,000 repeated Monte Carlo simulations that correctly identified therapeutic efficacy as truly reduced (probreduced). If a baseline survey resulted in fewer than 50 egg-positive individuals, the survey was considered to have failed and was discontinued. In those cases, it was considered to not have detected reduced efficacy. In the remainder of the text, the proportion of surveys that fail will be referred to as the “failure rate”. All simulations and calculations were performed using the eggsim package [19] in R [26]. This package allows the same calculations to be made for any arbitrary set of parameter values using highly performant C++ code, and is freely available [33].
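The summary step at the end of each set of simulations can be sketched as follows. The function name, the input format (one pair of values per simulated survey) and the example values are hypothetical; the rule that a survey with fewer than 50 egg-positive individuals counts as failed and as not having detected reduced efficacy follows the description above.

```python
def summarize_surveys(surveys, threshold, min_positives=50):
    """Summarize simulated surveys given as (egg-positives at baseline, ERR estimate) pairs."""
    n_surveys = len(surveys)
    failed = sum(1 for n_pos, _ in surveys if n_pos < min_positives)
    # Failed surveys are discontinued, so they never count as a detection.
    detected = sum(
        1 for n_pos, err in surveys
        if n_pos >= min_positives and err < threshold
    )
    return {
        "failure_rate": failed / n_surveys,
        "prob_reduced": detected / n_surveys,
    }

# Four hypothetical simulated surveys evaluated against a 50% threshold.
example = [(62, 0.41), (48, 0.30), (75, 0.55), (90, 0.47)]
summary = summarize_surveys(example, threshold=0.50)
```

Note that under this rule the failure rate puts an upper bound on the probability of detection: a design that fails in 25% of simulations cannot exceed a probability of 75%.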

Total operational costs to monitor drug efficacy

For each simulated survey, we calculated the total operational costs in terms of (i) the cost of consumables to collect and process samples, (ii) the cost of a single mobile field team comprised of one nurse and three laboratory technicians (including salary and lodging), and (iii) the cost of transport, including car rental, salary of the driver, and gasoline. First, we assumed that a working day consists of 8 working hours, that the daily salary of one team was 90 US$ (4 per diems of 22.5 US$) and that the daily cost for transport was 90 US$. Second, we assumed that the team collected samples in the morning (8:00–12:00), and that all collected samples were processed in the afternoon (13:00–17:00). Complete analysis of all samples on the same day implies that the number of samples that can be collected daily is limited, and that this number will vary across FEC methods, phase of the trial (less time for egg counting is required in follow-up samples) and endemicity (more time for egg counting is required in highly endemic areas). Note that we do not consider costs for the establishment and maintenance of laboratory infrastructure. We further assumed that all work takes place on regular working days, that the team does not take any breaks during processing, and that all samples are collected from a single school/community without loss to follow-up. All cost calculations were based on the itemized cost-assessment described above. Technical details on how the total costs were calculated can be found in S6 Info.
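To make the cost structure concrete, the sketch below computes the cost of a single survey phase under strong simplifications: a fixed processing time per sample, three technicians working in parallel during the four-hour afternoon window, and daily costs equal to four per diems of 22.5 US$ plus 90 US$ for transport. The actual calculation (S6 Info) lets the counting time depend on the number of eggs; the function names are hypothetical, and the example uses the mean time-to-result (507 sec) and material costs (0.57 and 1.37 US$) reported in this paper for a single KK.

```python
import math

AFTERNOON_SECONDS = 4 * 3600      # processing window, 13:00-17:00
TECHNICIANS = 3                   # laboratory technicians in one field team
TEAM_COST_PER_DAY = 4 * 22.5      # four per diems (US$)
TRANSPORT_COST_PER_DAY = 90.0     # car rental, driver and gasoline (US$)

def samples_per_day(seconds_per_sample):
    """Samples the team can fully process in one afternoon (assumes parallel work)."""
    return (TECHNICIANS * AFTERNOON_SECONDS) // seconds_per_sample

def phase_cost(n_samples, seconds_per_sample, cost_sample, cost_aliquot):
    """Return (field days, total cost in US$) for one survey phase."""
    days = math.ceil(n_samples / samples_per_day(seconds_per_sample))
    consumables = n_samples * (cost_sample + cost_aliquot)
    field = days * (TEAM_COST_PER_DAY + TRANSPORT_COST_PER_DAY)
    return days, consumables + field

# Baseline phase of 700 samples examined with a single KK.
days, total = phase_cost(700, 507, 0.57, 1.37)
```

In this simplified setting the field costs (team and transport) exceed the consumable costs, which illustrates why the number of field days is an important cost driver.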



During the drug efficacy trial surveys performed in Brazil, Ethiopia, Lao PDR, and Tanzania, we assessed the mean time-to-result (i) to prepare stool samples (Tprep,X, X representing the number of aliquots per sample), (ii) to count eggs, (iii) to digitize demographic data (Tdemography), and (iv) to digitize the FEC results (Trecord,X). The time analysis is illustrated in Fig 2. Overall, a duplicate KK consumed the most time, requiring on average (standard deviation) 989 sec (449). A single KK consumed the least amount of time and required on average 507 sec (318). The mean time-to-result for a single Mini-FLOTAC or FECPAKG2 method were 786 sec (513) and 802 sec (329), respectively. The percentage of time-to-result spent on the egg counting process was approximately 80% for both KK (single KK: 413 sec out of 507sec; duplicate KK: 820 sec out of 989 sec) and Mini-FLOTAC (632 sec out of 786 sec), while this was 23% for FECPAKG2 (185 sec out of 802 sec). For the latter method, most of the time-to-result (74%) was spent preparing the samples for analysis (596 sec out of 802 sec).


Fig 2. Time required to quantify soil-transmitted helminth infections in stool by four fecal egg count methods.

The height of the bars represents the mean time (in sec) needed to enter demographic data (blue), to perform the preparation phase (green), to count eggs (yellow) and to enter egg count data (red) for a single (1xKK) and duplicate Kato-Katz (2xKK), Mini-FLOTAC (MF) and FECPAKG2 (FP). The relative proportion (in %) of total time required to perform the preparation phase and to count is reported inside the bars.


As expected, counting a larger number of STH eggs required more reading time, where the log transformed total time required to count all eggs could be well described as a linear function of the square of the base-10 log-transformed total egg counts (Fig 3). Table 2 summarizes the average time required for the different steps included in the total survey costs for each FEC method separately. To obtain an estimate for a single KK preparation we divided the average time to prepare a duplicate KK (135 sec) by two (= 67 sec). As a second Mini-FLOTAC can be filled from the same Fill-FLOTAC (no need to weigh and homogenize the sample for a duplicate Mini-FLOTAC), we multiplied the mean time required to process a single Mini-FLOTAC (131 sec) by 1.5 to estimate the time for a duplicate Mini-FLOTAC (197 sec). To estimate the time for a duplicate FECPAKG2, we doubled the time for each of the different preparatory steps, except for the step to prepare the samples in the FECPAKG2 sedimenters, resulting in a total time of 1,050 sec (= 142 sec + 2 x 174 sec + 2 x 280 sec). Similarly, the mean time needed to enter one duplicate KK result was 18 sec; the time needed to record FEC results based on single KK and Mini-FLOTAC was assumed to be half that value (9 sec). For the FECPAKG2 method, no FEC data entry was required as the software automatically registers and stores mark-up data. S7 Info provides more detailed information (number of batches timed; the average units per batch; average and SD time) on each step of the sample analysis process, starting with the timing of the demographic data entry followed by the preparation phase, the egg counting process, and the time it took to enter FEC data.
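The derived preparation and data-entry times follow directly from the measured values quoted above; the short calculation below reproduces them. Variable names are ours, and the reported values are rounded to whole seconds.

```python
# Measured mean times (sec) as quoted in the text.
kk_duplicate_prep = 135
mini_flotac_single_prep = 131
fecpak_sedimenter_step = 142          # step that is not repeated for a duplicate
fecpak_repeated_steps = (174, 280)    # steps that are doubled for a duplicate
kk_duplicate_record = 18

# Derived estimates.
kk_single_prep = kk_duplicate_prep / 2                        # 67.5, reported as 67
mini_flotac_duplicate_prep = 1.5 * mini_flotac_single_prep    # 196.5, reported as 197
fecpak_single_prep = fecpak_sedimenter_step + sum(fecpak_repeated_steps)
fecpak_duplicate_prep = fecpak_sedimenter_step + 2 * sum(fecpak_repeated_steps)
single_record = kk_duplicate_record / 2                       # single KK and Mini-FLOTAC
```

The single FECPAKG2 preparation time obtained this way (596 sec) matches the preparation time reported for this method in the time-to-result analysis.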


Fig 3. The reading time as a function of the number of STH eggs counted in a sample.

This figure represents the reading time as a function of the number of STH eggs counted in a sample for single Kato-Katz (KK), Mini-FLOTAC and FECPAKG2 separately. All egg counts represent raw egg counts (not in eggs per gram of stool). The red line represents the linear regression line. The function of the regression line is also provided.


Itemized cost assessment

The costs associated with the materials for stool sample collection in schools and to process stool samples for a single or duplicate KK, Mini-FLOTAC and FECPAKG2 are reported in detail in Table 3. In summary, the cost associated with collecting a single sample (costsample) was US$ 0.57. The material costs to perform a single FEC (costaliquot,1) were US$ 1.37 for KK, US$ 1.51 for the Mini-FLOTAC method, and US$ 1.69 for the FECPAKG2 method. When a duplicate FEC was performed on the same sample, the material costs (costaliquot,2) were US$ 1.51 (KK), US$ 1.87 (Mini-FLOTAC), and US$ 2.73 (FECPAKG2).

Failure rate, probability of correctly detecting reduced therapeutic efficacy and the corresponding total survey costs

Given the large number of possible scenarios (981 sample sizes x 6 study designs x 4 levels of endemicity x 3 FEC methods x 3 STH species = 211,896 scenarios), and the different output parameters (failure rate, probreduced and costtotal), we first illustrate the performance of only KK-based survey designs in areas that are low endemic for hookworm (mean FEC = 3.7 EPG) (Fig 4). Fig 4A shows that the failure rate, i.e., the risk of observing fewer than 50 egg-positive individuals at baseline, was high (> 25%) when < 250 subjects were enrolled. For sample sizes of about 250 to 750 subjects, the failure rate was lower for a SS1 × 2/1x2 survey design compared to the other survey designs, as duplicate KK results in higher sensitivity for detecting at least one egg in the baseline samples. To reduce the failure rate to < 1%, at least 440 individuals needed to be enrolled for a SS1 × 2/1x2, while this was at least 690 for the other survey designs. For the NS and SSR survey designs, the probability of correctly detecting reduced drug efficacy (probreduced) increased with the number of individuals enrolled (Fig 4B) but varied between these survey designs. For example, when 700 individuals were recruited the probreduced equalled 84% for NS1 × 1/1x2, 79% for NS1 × 1/1x1, 71% for SSR1 × 1/1x2 and 67% for SSR1 × 1/1x1. For SS surveys, probreduced did not increase beyond 6% and decreased again with increasing sample sizes over 400, which is driven by the increasingly precise (due to sample size) but systematically overestimated drug efficacy (due to regression towards the mean).
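The number of scenarios quoted above can be reproduced from the simulation grid. The short check below uses the sample size range stated in the Methods (100 to 5,000 individuals in increments of 5); the variable names are ours.

```python
# Sample sizes from 100 to 5,000 individuals in steps of 5.
sample_sizes = range(100, 5001, 5)

n_sample_sizes = len(sample_sizes)
n_designs = 6          # survey designs
n_endemicity = 4       # endemicity levels
n_methods = 3          # FEC methods
n_species = 3          # STH species

n_scenarios = n_sample_sizes * n_designs * n_endemicity * n_methods * n_species
```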


Fig 4. The failure rate, the probability of correctly concluding reduced drug efficacy and the total survey cost across six survey designs.

This figure shows the impact of the survey design and sample size on the failure rate (Panel A), probability of correctly detecting truly reduced efficacy (probreduced; Panel B) and the mean total survey cost (costtotal; Panel C). To gain more insights into the most cost-efficient survey design, the probability of correctly detecting reduced drug efficacy probreduced was plotted as a function of the mean costtotal (Panel D). For each of the four panels, we only consider the use of Kato-Katz in areas with low levels of hookworm infection (mean FEC = 3.7 EPG). NS = no selection; SS = screen and select; SSR = screen, select, and retest. Note, for panel A, all survey designs other than SS1x2/1×2 are identical to SSR1x1/1×2.


For a given sample size, the most expensive survey designs were NS1 × 1/1x2, NS1 × 1/1x1 and SS1 × 2/1x2, while the two SSR designs and SS1 × 1/1x1 were the cheapest (Fig 4C). For example, when enrolling 700 individuals, the mean total survey cost (costtotal) was at least 4,000 US$ for NS1 × 1/1x2, NS1 × 1/1x1 and SS1 × 2/1x2, while for the two SSR designs and SS1 × 1/1x1 the mean costtotal was around 3,000 US$ or less. To determine the most cost-efficient survey design, we plotted the probability of detecting reduced efficacy (probreduced) against total survey cost (costtotal) (Fig 4D). For survey budgets up to 2,600 US$, the two SSR survey designs maximized the probability to detect reduced drug efficacy. For budgets between 2,600 and 4,200 US$, the SSR1 × 1/1x2 design was the most cost-efficient. For budgets between 4,200 and 5,200 US$, NS1 × 1/1x1 resulted in the highest probreduced, whereas for a budget of 5,200 US$ or more, NS1 × 1/1x2 was the most cost-efficient survey design. To reduce the risk of falsely concluding adequate drug efficacy to <20% (probreduced ≥ 80%), NS1 × 1/1x1 was the most cost-efficient option (red line; costtotal = 5,000 US$), with NS1 × 1/1x2 as a close runner-up (beige line; costtotal = 5,200 US$).

Second, we explored the impact of the different FEC methods across the six survey designs for hookworms in the same endemicity level as above (Fig 5). Generally, deploying Mini-FLOTAC and FECPAKG2 did not greatly improve the probreduced for SS survey designs. For the other survey designs, Mini-FLOTAC and KK were equally cost-efficient (lines are close to each other). For Mini-FLOTAC, the cheapest survey design to obtain a probreduced ≥ 80% was an NS1 × 1/1x2 survey based on 495 individuals, at a cost of 5,246 US$ (S6 Fig). For KK, this was NS1 × 1/1x1 based on 730 individuals at a cost of 4,987 US$ (Fig 4). For FECPAKG2, the probreduced remained below 85.2%, even when both sample size (2,000) and available budget (27,140 US$) were maximized (see S7 Fig for details on the impact of sample size).


Fig 5. The probability of correctly detecting presence of reduced drug efficacy and the total survey cost for three FEC methods across six survey designs.

This figure plots the probability of correctly identifying reduced therapeutic efficacy (probreduced) as a function of the mean total survey costs (costtotal) for the three different FEC methods (Kato-Katz thick smear (KK), Mini-FLOTAC and FECPAKG2; colored lines) and six survey designs (different panels). For each panel, we only consider areas that are low endemic for hookworm (mean FEC = 3.7 EPG). NS = no selection; SS = screen and select; SSR = screen, select, and retest.


When determining the most cost-efficient survey design for the other two STH species at low endemicity level (A. lumbricoides: mean FEC = 9.6 EPG, S2 and S3 Figs; T. trichiura: mean FEC = 2.8 EPG, S4 and S5 Figs), we noted three important differences compared to hookworm. First, the risk for a failed survey was remarkably lower for A. lumbricoides. While the failure rate was 8.5% when 250 subjects were enrolled for an SS1 × 2/1x2 survey targeting A. lumbricoides (S2A Fig), it was 98.7% and 97.4% for T. trichiura (S4A Fig) and hookworms (Fig 4A), respectively. As a consequence, the mean costtotal and the sample size at which probreduced ≥ 80% were lower compared to the other STHs (A. lumbricoides: S2D Fig: NS1 × 1/1x1 at mean costtotal = 2,522 US$; T. trichiura: S4D Fig: NS1 × 1/1x2 at mean costtotal = 19,628 US$; Hookworm: Fig 4D: NS1 × 1/1x1 at mean costtotal = 4,987 US$). Second, in contrast to hookworms, where probreduced for a given budget differed only marginally between Mini-FLOTAC and KK (Fig 5), the differences between FEC methods were more substantial for A. lumbricoides and T. trichiura. For A. lumbricoides (S3 Fig), KK provided the highest probreduced for the same budget, while this was Mini-FLOTAC for T. trichiura (S5 Fig). Third, none of the survey designs achieved a probreduced ≥ 80% for T. trichiura, given the maximum simulated sample size of 2,000 individuals (S4B Fig).

In Fig 6, we show the impact of pre-treatment endemicity on the probability of correctly identifying reduced drug efficacy based on KK. When surveys were conducted in areas with higher levels of endemicity, a higher probreduced was obtained for the same budget. Although this was observed for all three STH species and all six survey designs, this increase was most distinct for SS survey designs. This was to be expected as the bias due to regression towards the mean in SS survey designs is known to decrease with higher infection levels. Although SS survey designs rarely resulted in a correct detection of a truly reduced therapeutic drug efficacy when endemicity levels were low (top row panels Fig 6), they almost reached the highest probreduced at the cost of the cheapest survey design at the highest levels of endemicity for both A. lumbricoides and hookworms (bottom row panels Fig 6). For T. trichiura, the performance of SS survey designs remained relatively poor, which is logical as the regression towards the mean is expected to be higher when true drug efficacy is lower (45% for T. trichiura in the simulations). This figure also highlights a shift in the most cost-efficient survey design: while at low levels of endemicity the most cost-efficient survey design depends on the available funds, at higher levels of endemicity only the NS1 × 1/1x1 survey design maximizes the probreduced for any available budget.


Fig 6. The probability of correctly detecting presence of reduced drug efficacy and the total survey cost for six survey designs across four levels of endemicity when deploying Kato-Katz.

This figure plots the probability of correctly identifying reduced therapeutic efficacy (probreduced) as a function of the mean total survey costs (costtotal) across six survey designs for the three soil-transmitted helminth species and four levels of endemicity (see Table 1).


In Table 4, we provide the sample size and mean costtotal for those survey designs that detected reduced efficacy with probreduced ~ 80% at the lowest cost for each of the different STH species. Generally, this table confirms that the NS1 × 1/1x1 survey design in combination with KK was the most cost-efficient choice to assess therapeutic drug efficacy in all scenarios of STH species and endemicity. Only when surveys were conducted in areas where the endemicity of T. trichiura infections was low was the NS1 × 1/1x2 survey design combined with Mini-FLOTAC more cost-efficient.
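The selection logic behind such a table can be sketched as follows, assuming that each simulated scenario is summarized by its mean total cost and its probability of correctly detecting reduced efficacy. The `Candidate` record, the two helper functions and all numbers in the example list are hypothetical placeholders for illustration, not values from this study.

```python
from typing import NamedTuple, Optional


class Candidate(NamedTuple):
    design: str           # survey design label, e.g. "NS", "SS" or "SSR"
    method: str           # FEC method
    sample_size: int
    mean_cost_usd: float  # mean total survey cost
    prob_reduced: float   # probability of correctly detecting reduced efficacy


def cheapest_adequate(candidates, target=0.80) -> Optional[Candidate]:
    """Cheapest candidate whose detection probability reaches the target."""
    adequate = [c for c in candidates if c.prob_reduced >= target]
    return min(adequate, key=lambda c: c.mean_cost_usd) if adequate else None


def best_within_budget(candidates, budget_usd) -> Optional[Candidate]:
    """Candidate with the highest detection probability that fits the budget."""
    affordable = [c for c in candidates if c.mean_cost_usd <= budget_usd]
    return max(affordable, key=lambda c: c.prob_reduced) if affordable else None


# Hypothetical simulation summaries (placeholder numbers only).
candidates = [
    Candidate("NS", "KK", 300, 3000.0, 0.81),
    Candidate("NS", "KK", 500, 4500.0, 0.92),
    Candidate("SS", "KK", 500, 5200.0, 0.40),
    Candidate("SSR", "KK", 400, 5000.0, 0.83),
]

print(cheapest_adequate(candidates))
print(best_within_budget(candidates, budget_usd=4000.0))
```

The first helper mirrors the "lowest cost at a given probreduced" view of the table, while the second mirrors the "highest probreduced for a given budget" view of the figures. Both return `None` when no candidate qualifies, which corresponds to scenarios where no simulated design reaches the target.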


In this paper, we present new evidence-based recommendations for cost-efficient monitoring of therapeutic drug efficacy against STH, using a simulation framework that captures important interactions between STH epidemiology (variability in egg counts due to various sources), diagnostic test performance (species- and method-dependent egg recovery and count variability), survey design (bias and accuracy that change with endemicity) and operational costs (which change with endemicity, diagnostic method and survey design). With this framework, we address the challenge of minimizing operational costs of STH monitoring in resource-limited settings without jeopardizing the quality of the decision-making. We performed an in-depth analysis of the operational costs to process one sample for three FEC methods (KK, Mini-FLOTAC and FECPAKG2) based on the time-to-result and an itemized cost assessment. Next, we simulated how the probability of correctly detecting reduced therapeutic drug efficacy depends on different FEC methods and survey designs, accounting for sources of variation in egg counts as quantified based on several STH datasets. Finally, we integrated the outcome of the in-depth cost-assessment into the simulation study and determined the most cost-efficient survey design to detect presence of reduced drug efficacy across different scenarios of STH endemicity. Overall, we confirm that KK is the best FEC method to monitor therapeutic drug efficacy, but that the survey design currently recommended by WHO should be updated.

Single KK is the cheapest and least time-consuming method

The mean time and cost for material to process one sample varied from ~8.5 min (single KK) to ~16.5 min (duplicate KK), and from US$ 1.37 (single KK) to US$ 1.69 (FECPAKG2). Our study found that a single KK is both the cheapest and least time-consuming of the three FEC methods evaluated. Although a comparison across studies is not straightforward, as differences in laboratory time can be explained by differences in endemicity (laboratory time varies significantly with the number of eggs counted, as we show here) and possibly also the level of expertise of laboratory technicians, other researchers generally reported a similar laboratory time for both single ([35]: ~9.5 min; [34]: ~11.0 min; [8]: ~5.0 min) and duplicate KK ([7]: ~16.6 min). The costs for material estimated in the current study (single KK: US$ 1.95; duplicate KK: US$ 2.17) were higher than those reported by Speich et al. [7] (single KK: US$ 0.03; duplicate KK: US$ 0.04). These differences can be explained by the fact that we included fixed survey costs (0.60 US$, e.g., gloves and permanent markers). In addition, while Speich et al. [7] re-used the templates for 50 samples, we opted for single use of materials as there was only limited mesh and cellophane in one kit. For the other two FEC methods, the laboratory time and cost for material were ~13.1 min and US$ 1.51 for Mini-FLOTAC, and ~13.5 min and US$ 1.69 for FECPAKG2. However, data on laboratory time and cost for material with which to compare our results are either scarce (Mini-FLOTAC: 8–12 min [9]) or absent (FECPAKG2). It is also important to note that our estimates of laboratory time and material costs did not include the washing of devices, and that we based our costs on the Ethiopian market in 2020.
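As a rough illustration of how laboratory time and material cost combine into a cost per sample, the sketch below adds a personnel cost proportional to processing time to the material cost. The processing times and the US$ 1.37 to US$ 1.69 material costs are those quoted in this section; the hourly wage is a hypothetical placeholder, and the single fixed time per method is a simplification, since laboratory time varies with the number of eggs counted.

```python
# (minutes per sample, material cost in US$) as quoted in this section.
METHODS = {
    "single KK": (8.5, 1.37),
    "Mini-FLOTAC": (13.1, 1.51),
    "FECPAKG2": (13.5, 1.69),
}


def cost_per_sample(minutes, material_usd, wage_usd_per_hour):
    """Material cost plus personnel cost for the time spent on one sample."""
    return material_usd + wage_usd_per_hour * minutes / 60.0


def rank_methods(wage_usd_per_hour):
    """Return (method, cost per sample) pairs ordered from cheapest to most expensive."""
    costs = {
        name: cost_per_sample(minutes, material, wage_usd_per_hour)
        for name, (minutes, material) in METHODS.items()
    }
    return sorted(costs.items(), key=lambda item: item[1])


# Hypothetical wage of US$ 2.00 per hour (an assumption, not a study value).
for name, cost in rank_methods(wage_usd_per_hour=2.0):
    print(f"{name}: US$ {cost:.2f} per sample")
```

Because single KK has both the shortest processing time and the lowest material cost in these figures, it remains the cheapest option in this sketch for any non-negative wage.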

Revision of the WHO guidelines to monitor drug efficacy is warranted

WHO currently recommends a selection and screen approach during which a single stool sample is processed by a single KK both at baseline and follow-up [5]. Although our study confirms that KK is the FEC method of choice, it indicates that the recommended survey design will often result in poor decision-making due to overestimation of drug efficacy (because of regression towards the mean) at a relatively high cost. Instead, a KK-based survey design where all children are followed up regardless of their baseline infection status (the “no selection” or NS design) should be preferred, as it yields unbiased results at the lowest operational cost. The “screen, select, and retest” strategy, where individuals who are egg-positive at baseline are retested based on a new pre-treatment stool sample (the SSR design) [21], was found to be somewhat less cost-efficient than the NS design. However, as previously discussed [21], because the SSR design will yield more egg-positive individuals than an NS design based on the same budget, SSR could still be considered for study objectives that require a minimal number of eggs or egg-positive individuals, such as genotyping to identify resistance-conferring polymorphisms.

Identifying the most cost-efficient study design for any programmatic survey

In the present study, the laboratory time and cost analyses were used to identify the most cost-efficient design for monitoring drug efficacy, but these analyses can also be used to identify the most cost-efficient study design for any other type of survey. Although this concept is not new and has been applied in the past [35,36], the level of detail that we present for each of the different FEC methods allows for a fine-tuned cost-efficiency analysis of any programmatic survey. For instance, this framework would also lend itself well to assessing the cost and performance of surveys for decisions about stopping or scaling down preventive chemotherapy against STH, and/or to assessing the potential value and cost-efficiency of (new) diagnostic techniques with different (hopefully better) performance and throughput than FEC methods, but potentially higher reagent costs [37].

Automated egg counting would further reduce operational costs

As highlighted by the present study, egg counting is the most time-consuming step for KK and Mini-FLOTAC (80%). An obvious strategy to further reduce operational costs is automated egg counting, using a scanning/imaging device and artificial intelligence-based egg-recognition software to identify and report egg counts. A variety of artificial intelligence-based digital pathology (AI-DP) devices are currently being studied [38–42]. However, a complete AI-DP device is currently not commercially available, despite the successful examples for other parasitic infections (malaria: CellsCheck, http://www.biosynex.com; Loa loa: [43]). At the time of writing, FECPAKG2 was probably the most advanced, but automated egg recognition on the created images by existing STH egg-recognition software has proven difficult or impossible (S5 Info of Cools et al. [15]). In addition, our study highlighted that, due to its poor diagnostic performance [15], FECPAKG2 is not recommended to monitor therapeutic drug efficacy in STH control programs. Despite these challenges, there are ongoing investments around each of the FEC methods to progress towards a complete point-of-care platform with automated egg counting and built-in data analysis [39,42].

Strengths and limitations

This is the first comprehensive study that compares the operational costs between the most-used FEC methods in STH surveys. It is important to note that the estimated costs are institute- and context-specific, and hence the reported values should not be interpreted as absolute. However, a major strength of our framework is that assumptions about costs can be easily adapted to represent particular settings. Because our framework aims to compare different survey designs and diagnostic methods, we do not consider costs that can be reasonably assumed to be similar across different survey designs and different FEC methods: salary for senior staff to supervise the field activities, report, and analyze the data; power supplies; laboratory rent; per diems for days when work is not possible (e.g., weekends); time required for the field team to travel to and return from the study location at the start and end of the survey; and time required to set up and clean laboratories and inform schools and local health authorities prior to surveys. In addition, we do not consider the time required to travel to another study location if the target sample size cannot be reached in a single site, meaning that some of the larger recommended survey designs (e.g., for T. trichiura) are potentially somewhat more expensive than we estimated. A possibly relevant simplifying assumption that we made is that survey teams work constantly without any break. This may have led to a slight overestimation of the performance per cost (Fig 5) for the FEC methods with manual egg counting (which would require more breaks) compared to FECPAKG2. Finally, it is important to highlight that each of the trials was conducted by well-trained teams (and hence the laboratory time for a less experienced team might be underestimated), and that we assumed that no individuals would be lost to follow-up. 
Theoretically, each of the aforementioned factors could be included in our simulation framework, although we do not expect that the presented relative rankings of FEC methods and survey designs would be affected by the inclusion of this additional real-life complexity. However, some of these aspects, like laboratory infrastructure, will have to be considered when comparing FEC methods with other diagnostic techniques such as quantitative polymerase chain reaction.

Supporting information


  1. GBD 2017 DALYs and HALE Collaborators. Global, regional, and national disability-adjusted life-years (DALYs) for 359 diseases and injuries and healthy life expectancy (HALE) for 195 countries and territories, 1990–2017: a systematic analysis for the Global Burden of Disease Study 2017. Lancet. 2018 Nov 10;392(10159):1859–1922. https://doi.org/10.1016/S0140-6736(18)32335-3. Erratum in: Lancet. 2019 Jun 22;393(10190):e44.
  2. GBD 2016 DALYs and HALE Collaborators. Global, regional, and national disability-adjusted life-years (DALYs) for 333 diseases and injuries and healthy life expectancy (HALE) for 195 countries and territories, 1990–2016: a systematic analysis for the Global Burden of Disease Study 2016. Lancet. 2017 Sep 16;390(10100):1260–1344. https://doi.org/10.1016/S0140-6736(17)32130-X. Erratum in: Lancet. 2017 Oct 28;390(10106):e38.
  3. World Health Organization. Preventive chemotherapy to control soil-transmitted helminth infections in at-risk population groups. 2017, World Health Organization; Geneva, Switzerland.
  4. World Health Organization. Soil-transmitted helminthiases: eliminating soil-transmitted helminthiases as a public health problem in children: progress report 2001–2010 and strategic plan 2011–2020. 2012, World Health Organization; Geneva, Switzerland.
  5. World Health Organization. Assessing the efficacy of anthelminthic drugs against schistosomiasis and soil-transmitted helminthiases. 2013, World Health Organization; Geneva, Switzerland.
  6. World Health Organization. 2030 targets for soil-transmitted helminthiases control programmes. 2020, World Health Organization; Geneva, Switzerland.
  7. Speich B, Knopp S, Mohammed KA, Khamis IS, Rinaldi L, Cringoli G, Rollinson D, Utzinger J. Comparative cost assessment of the Kato-Katz and FLOTAC techniques for soil-transmitted helminth diagnosis in epidemiological surveys. Parasit Vectors. 2010 Aug 14;3:71. pmid:20707931
  8. Leta GT, French M, Dorny P, Vercruysse J, Levecke B. Comparison of individual and pooled diagnostic examination strategies during the national mapping of soil-transmitted helminths and Schistosoma mansoni in Ethiopia. PLoS Negl Trop Dis. 2018 Sep 10;12(9):e0006723. pmid:30199526
  9. Cringoli G, Maurelli MP, Levecke B, Bosco A, Vercruysse J, Utzinger J, Rinaldi L. The Mini-FLOTAC technique for the diagnosis of helminth and protozoan infections in humans and animals. Nat Protoc. 2017 Sep;12(9):1723–1732. pmid:28771238
  10. Bekana T, Mekonnen Z, Zeynudin A, Ayana M, Getachew M, Vercruysse J, Levecke B. Comparison of Kato-Katz thick-smear and McMaster egg counting method for the assessment of drug efficacy against soil-transmitted helminthiasis in school children in Jimma Town, Ethiopia. Trans R Soc Trop Med Hyg. 2015 Oct;109(10):669–71. pmid:26385937
  11. Ayana M, Vlaminck J, Cools P, Ame S, Albonico M, Dana D, et al. Modification and optimization of the FECPAKG2 protocol for the detection and quantification of soil-transmitted helminth eggs in human stool. PLoS Negl Trop Dis. 2018 Oct 15;12(10):e0006655. pmid:30321180
  12. Katz N, Chaves A, Pellegrino J. A simple device for quantitative stool thick-smear technique in Schistosomiasis mansoni. Rev Inst Med Trop Sao Paulo. 1972 Nov-Dec;14(6):397–400. pmid:4675644
  13. TechionGroup.com. 2017; Available from: https://www.techion.com/FECPAKG2.
  14. Rashid MH, Stevenson MA, Waenga S, Mirams G, Campbell AJD, Vaughan JL, Jabbar A. Comparison of McMaster and FECPAKG2 methods for counting nematode eggs in the faeces of alpacas. Parasit Vectors. 2018 May 2;11(1):278. pmid:29716657
  15. Cools P, Vlaminck J, Albonico M, Ame S, Ayana M, José Antonio BP, et al. Diagnostic performance of a single and duplicate Kato-Katz, Mini-FLOTAC, FECPAKG2 and qPCR for the detection and quantification of soil-transmitted helminths in three endemic countries. PLoS Negl Trop Dis. 2019 Aug 1;13(8):e0007446. pmid:31369558
  16. Levecke B, Cools P, Albonico M, Ame S, Angebault C, Ayana M, et al. Identifying thresholds for classifying moderate-to-heavy soil-transmitted helminth intensity infections for FECPAKG2, McMaster, Mini-FLOTAC and qPCR. PLoS Negl Trop Dis. 2020 Jul 2;14(7):e0008296. pmid:32614828
  17. Levecke B, Anderson RM, Berkvens D, Charlier J, Devleesschauwer B, Speybroeck N, Vercruysse J, Van Aelst S. Mathematical inference on helminth egg counts in stool and its applications in mass drug administration programmes to control soil-transmitted helminthiasis in public health. Adv Parasitol. 2015 Mar;87:193–247. pmid:25765196
  18. Gass KM. Rethinking the serological threshold for onchocerciasis elimination. PLoS Negl Trop Dis. 2018 Mar 15;12(3):e0006249. pmid:29543797
  19. Coffeng LE, Stolk WA, Golden A, de Los Santos T, Domingo GJ, de Vlas SJ. Predictive Value of Ov16 Antibody Prevalence in Different Subpopulations for Elimination of African Onchocerciasis. Am J Epidemiol. 2019 Sep 1;188(9):1723–1732. pmid:31062838
  20. Coffeng LE, Malizia V, Vegvari C, Cools P, Halliday KE, Levecke B, Mekonnen Z, Gichuki PM, Sayasone S, Sarkar R, Shaali A, Vlaminck J, Anderson RM, de Vlas SJ. Impact of Different Sampling Schemes for Decision Making in Soil-Transmitted Helminthiasis Control Programs. J Infect Dis. 2020 Jun 11;221(Suppl 5):S531–S538. pmid:31829425
  21. Coffeng LE, Levecke B, Hattendorf J, Walker M, Denwood MJ. Survey Design to Monitor Drug Efficacy for the Control of Soil-Transmitted Helminthiasis and Schistosomiasis. Clin Infect Dis. 2021 Jun 14;72(Suppl 3):S195–S202. pmid:33906226
  22. Coffeng LE, Le Rutte EA, Munoz J, Adams E, de Vlas SJ. Antibody and Antigen Prevalence as Indicators of Ongoing Transmission or Elimination of Visceral Leishmaniasis: A Modeling Study. Clin Infect Dis. 2021 Jun 14;72(Suppl 3):S180–S187. pmid:33906229
  23. Levecke B, Coffeng LE, Hanna C, Pullan RL, Gass KM. Assessment of the required performance and the development of corresponding program decision rules for neglected tropical diseases diagnostic tests: Monitoring and evaluation of soil-transmitted helminthiasis control programs as a case study. PLoS Negl Trop Dis. 2021 Sep 14;15(9):e0009740. pmid:34520474
  24. Vlaminck J, Cools P, Albonico M, Ame S, Ayana M, et al. Comprehensive evaluation of stool-based diagnostic methods and benzimidazole resistance markers to assess drug efficacy and detect the emergence of anthelmintic resistance: A Starworms study protocol. PLoS Negl Trop Dis. 2018 Nov 2;12(11):e0006912. pmid:30388108
  25. Vlaminck J, Cools P, Albonico M, Ame S, Ayana M, Cringoli G, et al. Therapeutic efficacy of albendazole against soil-transmitted helminthiasis in children measured by five diagnostic methods. PLoS Negl Trop Dis. 2019 Aug 1;13(8):e0007471. pmid:31369562
  26. R Core Team (2022). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.
  27. Vlaminck J, Cools P, Albonico M, Ame S, Chanthapaseuth T, Viengxay V, Do Trung D, et al. Piloting a surveillance system to monitor the global patterns of drug efficacy and the emergence of anthelmintic resistance in soil-transmitted helminth control programs: a Starworms study protocol. Gates Open Res. 2020 Mar 10;4:28. pmid:32266328
  28. Denwood MJ, Love S, Innocent GT, Matthews L, McKendrick IJ, Hillary N, et al. Quantifying the sources of variability in equine faecal egg counts: implications for improving the utility of the method. Vet Parasitol. 2012 Aug 13;188(1–2):120–6. Epub 2012 Mar 13. pmid:22469484
  29. Knopp S, Mohammed KA, Speich B, Hattendorf J, Khamis IS, Khamis AN, et al. Albendazole and mebendazole administered alone or in combination with ivermectin against Trichuris trichiura: a randomized controlled trial. Clin Infect Dis. 2010 Dec 15;51(12):1420–8. pmid:21062129
  30. Barda B, Cajal P, Villagran E, Cimino R, Juarez M, Krolewiecki A, et al. Mini-FLOTAC, Kato-Katz and McMaster: three methods, one goal; highlights from north Argentina. Parasit Vectors. 2014 Jun 14;7:271. pmid:24929554
  31. Nikolay B, Mwandawiro CS, Kihara JH, Okoyo C, Cano J, Mwanje MT, et al. Understanding Heterogeneity in the Impact of National Neglected Tropical Disease Control Programmes: Evidence from School-Based Deworming in Kenya. PLoS Negl Trop Dis. 2015 Sep 30;9(9):e0004108. pmid:26421808
  32. Levecke B, Speybroeck N, Dobson RJ, Vercruysse J, Charlier J. Novel insights in the fecal egg count reduction test for monitoring drug efficacy against soil-transmitted helminths in large-scale treatment programs. PLoS Negl Trop Dis. 2011 Dec;5(12):e1427. pmid:22180801
  33. R-package; Available from: http://ku-awdc.github.io/eggSim.
  34. Kure A, Mekonnen Z, Dana D, Bajiro M, Ayana M, Vercruysse J, Levecke B. Comparison of individual and pooled stool samples for the assessment of intensity of Schistosoma mansoni and soil-transmitted helminth infections using the Kato-Katz technique. Parasit Vectors. 2015 Sep 24;8:489. pmid:26400064
  35. Assefa LM, Crellen T, Kepha S, Kihara JH, Njenga SM, Pullan RL, Brooker SJ. Diagnostic accuracy and cost-effectiveness of alternative methods for detection of soil-transmitted helminths in a post-treatment setting in western Kenya. PLoS Negl Trop Dis. 2014 May 8;8(5):e2843. pmid:24810593
  36. Sturrock HJ, Gething PW, Clements AC, Brooker S. Optimal survey designs for targeting chemotherapy against soil-transmitted helminths: effect of spatial heterogeneity and cost-efficiency of sampling. Am J Trop Med Hyg. 2010 Jun;82(6):1079–87. pmid:20519603
  37. Kazienga A, Coffeng LE, de Vlas SJ, Levecke B. Two-stage lot quality assurance sampling framework for monitoring and evaluation of neglected tropical diseases, allowing for imperfect diagnostics and spatial heterogeneity. PLoS Negl Trop Dis. 2022 Apr 8;16(4):e0010353. pmid:35394996
  38. Larsson J, Hedberg R. Development of machine learning models for object identification of parasite eggs using microscopy. 2020.
  39. Dacal E, Bermejo-Peláez D, Lin L, Álamo E, Cuadrado D, Martínez Á, Mousa A, et al. Mobile microscopy and telemedicine platform assisted by deep learning for the quantification of Trichuris trichiura infection. PLoS Negl Trop Dis. 2021 Sep 7;15(9):e0009677. pmid:34492039
  40. Ephraim RK, Duah E, Cybulski JS, Prakash M, D’Ambrosio MV, Fletcher DA, et al. Diagnosis of Schistosoma haematobium infection with a mobile phone-mounted Foldscope and a reversed-lens CellScope in Ghana. Am J Trop Med Hyg. 2015 Jun;92(6):1253–6. pmid:25918211
  41. Cringoli G, Amadesi A, Maurelli MP, Celano B, Piantadosi G, Bosco A, et al. The Kubic FLOTAC microscope (KFM): a new compact digital microscope for helminth egg counts. Parasitology. 2021 Apr;148(4):427–434. pmid:33213534
  42. Ward P, Dahlberg P, Lagatie O, Larsson J, Tynong A, Vlaminck J, Zumpe M, Ame S, Ayana M, Khieu V, Mekonnen Z, Odiere M, Yohannes T, Van Hoecke S, Levecke B, Stuyver LJ. Affordable artificial intelligence-based digital pathology for neglected tropical diseases: A proof-of-concept for the detection of soil-transmitted helminths and Schistosoma mansoni eggs in Kato-Katz stool thick smears. PLoS Negl Trop Dis. 2022 Jun 17;16(6):e0010500. pmid:35714140
  43. D’Ambrosio MV, Bakalar M, Bennuru S, Reber C, Skandarajah A, Nilsson L, et al. Point-of-care quantification of blood-borne filarial parasites with a mobile phone microscope. Sci Transl Med. 2015 May 6;7(286):286re4. pmid:25947164
