Toward a paradigm shift from deficit-based to proactive speech and language treatment: Randomized pilot trial of the Babble

Background: Speech or language therapy is typically initiated reactively after a child starts showing delays. Infants with classic galactosemia (CG), an inborn error of metabolism with a known high risk for both speech and language disorders, hold the keys towards evaluating whether preventive treatment is effective when the risks are known at birth. We present pilot data from a randomized parallel trial of an innovative proactive speech and language intervention program, the Babble Boot Camp (BBC).
Methods: Five children with CG, otherwise healthy, participated in the BBC from approximately 2 to 24 months of age. One of these was randomly selected as control receiving conventional management. A pediatric speech-language pathologist met weekly via telepractice with the parents in the treatment cohort. Parents implemented the prespeech, speech, and language stimulation and expansion activities according to the protocol. The control child was still too young for conventional treatment. Primary outcome measures were speech sound production complexity in babble and speech and expressive vocabulary size. Secondary outcome measures were developmental milestones in communication, motor, and cognition. Outcomes in the treatment cohort were compared to typical children and the control child. The trial is ongoing.

Results: All four treated children had higher speech sound skills in babble, three had higher speech sound skills in meaningful speech, two had higher expressive vocabularies, and three had higher communication and personal-social skills, compared to the control child with CG.
Discussion: Given the high risk for speech and language delays in children with CG, finding on-schedule abilities in two or more of the treated children but not the untreated child is unexpected under random conditions. The trends toward beneficial effects of the BBC on speech sound production, expressive language, and communication milestones warrant appropriately powered larger clinical trials with full randomization.

Trial registration: NCT03838016
Keywords: speech disorder, language impairment, genetic risk, infant, very early intervention, prevention

Introduction
Difficulties with speech and language are common among young children. In the US, 11% of children aged 3 to 6 years have a communication disorder (Black et al., 2015). Many parents who are concerned about their child's ability to talk ask the child's doctor, who, in turn, may refer them to a speech-language pathologist. By the time a referral is made, the child may already be two, three, or even four years old and has passed critical stages in the process of speech and language development (Cates et al., 2012). There is strong evidence that early interventions in children with known risks or first signs of a variety of disorders are highly effective, for instance interventions for children with autism spectrum disorder as young as 12 months (Dawson et al., 2010; Guralnick, 2011; Rogers et al., 2014). However, very early speech and language services are not yet available, in part because speech and language are later-developing skills and disorders in these areas cannot be reliably diagnosed on behavioral grounds until an age when deficits become evident, greater than 24 months for speech ( , 1985). However, whether treatment targeting these early signs of communication has a beneficial effect on later speech and language development is unknown because such treatments have not been developed and validated. Going even further back in the developmental trajectory, some children are born with risk factors for speech and language disorders and this risk is known long before prespeech behaviors such as coo and babble and actual speech and language emerge. The question is whether proactive, preventive treatment, if it existed, could reduce the deleterious effects associated with the disease in these cases and thereby improve outcomes.
Infants with classic galactosemia (CG) are an ideal population to investigate whether proactive interventions during the first two years of life, long before traditional assessment and intervention are available, can significantly improve speech and language outcomes. CG is a recessively inherited inborn error of metabolism diagnosed via newborn screening, with incidence rates in the US ranging from 1/30,000 to 1/60,000. Newborn diagnosis can be lifesaving because of the deleterious effects of galactose buildup in the child's blood that can occur if dietary restrictions are not implemented immediately. Despite rigorous dietary management, however, children with this disease have a substantially higher risk, compared to the typically developing (TD) population, not only for motor and learning disabilities (Antshel et al., 2004; Karadag et al., 2013; Potter et al., 2013) but also, importantly, for severe speech and language disorders. Speech disorders were reported in 77% of children with CG (Hughes et al., 2009), compared to 3.8% among children generally (Shriberg et al., 1999), and language impairment in 90% of children with CG (Waggoner et al., 1990), compared to 7.4% among children generally (Tomblin et al., 1997). This elevated risk, coupled with the early identification, makes the CG population ideal to examine the efficacy of prospective intervention therapy. If proactive intervention is shown to be more effective than conventional management, this has the potential to change the management model from deficit-based to preventive services for these infants. It will also motivate similar studies in infants with other types of risk for communication disorders, for instance very low birth weight and 7q11.23 duplication syndrome. That is, children known to be at risk may benefit from early, prospective intervention, thus improving outcomes.
The Babble Boot Camp (BBC) is a program of activities and routines designed for children during the pre-speech and very early speech and language stages. It contains components intended to shape dyadic interactions across modalities, stimulate earliest vocalizations (coo, babble), support emergence of first words and sentences, and foster vocabulary and syntax growth. The active phase of intervention covers ages 2 to 24 months, with plans for follow-up testing using a professional evaluation of speech, language, and cognitive abilities at ages 36 and 48 months.
Here, we report pilot results of the BBC. This Phase 0 exploratory study demonstrates, with clear clinical application, a viable proactive early intervention approach for minimizing speech and language disorders in a vulnerable population of infants with a known genetic risk for these disorders. The purpose of this pilot study was to examine if children with CG participating in the BBC show age-expected markers of expressive speech and language and to compare their outcomes to those in the control child. The primary focus is speech and language development measured with standardized assessments, with secondary attention to cognitive and motor development.

Methods
This study was conducted with approval of the Institutional Review Boards at Arizona State University (IRB ID # STUDY00004969) and Washington State University (IRB ID # 13099). It began on January 31, 2017, and the trial is ongoing. Parents learned about the study through online research announcements and referrals from physicians or other service providers and contacted the research team. Once eligibility for participation was established and parents made the decision to participate, they gave written permission for their infants' participation and written consent for their own participation. The study is listed on ClinicalTrials.gov under NCT03838016.

Participants
The current participants are 14 children with CG and their parents. Here, we report on a subset of the children, the five oldest children with the longest participation record, for whom a nearly complete dataset is available through age 21 months. This pilot treatment cohort consists of four children, two girls (codes CG1, CG2) and two boys (CG4, CG5). Note that CG1 only participated in the study up to age 18 months, due to personal circumstances. One additional boy with CG was randomly selected to serve as a control who did not participate in the BBC treatment program. All families participated in the close monitoring components of the study, described further below. In the near future, this parallel design will be built out by recruiting more children into the treatment and control cohorts using randomization and blinding, the latter of which can only be applied to research team members who analyze the data.
Inclusionary and exclusionary criteria are identical for the treatment and control children with CG and are designed to evaluate the effects of the treatment while keeping all other factors the same. Age at entry into the study is approximately 2 months. All infants are required to have a newborn diagnosis of CG. Infants with additional health or sensory diagnoses are excluded. Boys and girls of any racial/ethnic background are equally eligible to participate. Primary language in the home must be English and at least one parent must have at least an 8th grade education. Because the intervention is implemented via telepractice software, any family whose primary language is English can participate, regardless of country of residence. The cohort as a whole currently includes families in the US and the UK.

Materials and procedures
The BBC is implemented via parent training by a speech-language pathologist (SLP) with expertise in early childhood. The SLP uses a HIPAA-compliant telepractice computer interface to connect with the families. Parents learn about the typical milestones of prespeech, speech, and language development, potential red flags for delays, and, importantly, specific activities that support typical development for all stages of the program and development beyond. Children in the BBC treatment cohort progress through these milestones not based on age but on present levels of ability.
One key principle underlying all activities is the zone of proximal development or scaffolding in which parents provide speech and language models that bridge what the child can already do and what is slightly beyond the child's skill set: the model is in the zone of skills that the child can do with help (Vygotsky, 1979).
One key skill that is targeted throughout the program is imitation.

Outcome measures
Because the BBC is designed to beneficially influence speech and language development, the primary outcome variables are speech sound production in babble and speech and expressive language ability. Secondary areas of interests are cognitive and motor development and quality of life, but these are only partially addressed in this report. During the active BBC phase, we monitor all of these areas as well as a range of other enrichment variables including volubility of child vocalizations, environmental influences, and demographic factors.
Specifically to assess speech sound development and language growth, we use a combination of several standardized tests and established clinical procedures. In the present pilot report, we quantify speech and language growth using several metrics. First, we focus on the complexity of the produced speech sounds during babble and early speech. Once per month per child starting at age 6 months, we compute the Mean Babbling Level (MBL) (Stoel-Gammon, 1989), a clinical measure of speech sound complexity for babbling. To compute the MBL, a set of at least 50 utterances is compiled and transcribed into the International Phonetic Alphabet. An expert rater assigns a score of 1, 2, or 3 to each utterance and computes the average; thus, MBL scores range from 1 to 3 for each child at any given time point. A score of 1 is assigned to simple utterances consisting of a vowel, a syllabic consonant, or a consonant-vowel (CV) or vowel-consonant (VC) sequence where the consonant is either a glide ("w", "y") or a glottal stop, defined as a brief silence interrupting a vowel; note that glottal stops and glides are not considered to be true consonants. Examples of level 1 utterances are "m" and "wawa". A score of 2 is assigned to utterances containing at least one CV or VC sequence with a true consonant; if there are two or more syllables, the consonants may be the same ones or differ only in whether they are voiced or not. Examples are "bapa" and "dida". A score of 3 is assigned to utterances containing at least two true consonants produced in different parts of the mouth and/or with different airflow characteristics. Examples are "gaba", and "adap". The three scores thus express the progression from motorically and linguistically simple to more complex skills. MBL scores in the BBC treatment and control cohort were compared to MBL scores in TD children at equivalent ages (Morris, 2010). 
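The averaging step of the MBL can be illustrated with a minimal Python sketch. It assumes that an expert rater has already assigned a level of 1, 2, or 3 to each transcribed utterance; the function name and the sample scores are hypothetical and are not study data.

```python
from statistics import mean

MIN_UTTERANCES = 50  # minimum sample size stated in the text


def mean_babbling_level(utterance_scores):
    """Average the per-utterance complexity scores (each 1, 2, or 3)."""
    if len(utterance_scores) < MIN_UTTERANCES:
        raise ValueError("MBL requires at least 50 scored utterances")
    if any(score not in (1, 2, 3) for score in utterance_scores):
        raise ValueError("each utterance score must be 1, 2, or 3")
    return mean(utterance_scores)


# Hypothetical sample: 20 level-1, 20 level-2, and 10 level-3 utterances.
sample_scores = [1] * 20 + [2] * 20 + [3] * 10
mbl = mean_babbling_level(sample_scores)  # (20*1 + 20*2 + 10*3) / 50
```

Because every utterance contributes a value between 1 and 3, the resulting average is always bounded by 1 and 3, as described above.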
Whereas the MBL is applied to babble, which does not convey lexical meaning, an equivalent measure called Syllable Structure Level (SSL) exists for meaningful speech (Paul & Jennings, 1992), and as for the MBL, we used this measure in tandem with typical norms (Morris, 2010). The source materials for the MBL and SSL are daylong naturalistic audio recordings captured with a passive, wearable audio recorder (LENA Research Foundation, Boulder, CO). The recordings are obtained in the natural environment of the families and children. The recorder is returned to the research labs and processed offline to obtain the raw, daylong audio file, which provided the source for the present MBL and SSL scores. Other measures based on the recordings will be reported in the future, including estimates of child volubility such as syllables per hour and vocalizations per hour, speech production and discourse characteristics of the target child and other family members, and other variables known to be important for language and communication development in children. All measures collected in this way are objective and algorithm-driven. For purposes of the present work, we extracted multiple 5-minute audio segments with the highest occurrence of child utterances in order to transcribe these utterances into the International Phonetic Alphabet and to compute MBL and SSL scores for each child and month.
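The segment selection described above can be sketched as follows. This is an illustration only: it assumes that a count of child utterances is already available for each 5-minute segment of a recording, and the function name, the number of segments selected, and the counts are hypothetical rather than taken from the study protocol.

```python
def top_segments(utterance_counts, k):
    """Return the start times (in minutes) of the k segments with the most child utterances.

    utterance_counts maps a segment's start time in minutes to the number of
    child utterances detected in that 5-minute segment (hypothetical input).
    """
    ranked = sorted(utterance_counts.items(), key=lambda item: item[1], reverse=True)
    # Report the chosen segments in chronological order for transcription.
    return sorted(start for start, _ in ranked[:k])


# Hypothetical counts for five consecutive 5-minute segments.
counts = {0: 3, 5: 12, 10: 7, 15: 20, 20: 1}
selected = top_segments(counts, k=2)
```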
Second, we use the MacArthur-Bates Communicative Development Inventories 2 (MBCDI-2) (Fenson et al., 2007) to capture early expressive and receptive vocabulary sizes as reported by the parents when filling out the MBCDI-2 protocol forms. MBCDI-2 questionnaires were collected from each family at regular intervals (ages 12, 15, 18, 21, and 24 months), and computed percentile scores were compared between the treated and untreated participants. Percentiles were based on the publisher's norms, separately for boys and girls. Here, we focus on receptive vocabulary, available up to age 15 months, and expressive vocabulary, available for all reported ages but regarding our pilot sample, only through age 21 months.
Third, the Ages and Stages Questionnaires -3 (ASQ3) parent questionnaires (Squires & Bricker, 2009) capture communication abilities and personal-social development of young children. The evaluation of communication abilities differs from the expressive and receptive vocabulary checks in the MBCDI-2 in that it samples broader communication abilities, as relevant for child age, such as comprehending phrases, using language to achieve a goal, and producing multi-word sentences. The personal-social development component queries skills in activities of daily living such as feeding and dressing oneself and social interactions, both verbal and nonverbal, such as giving/receiving objects, asking for help, and role-play, as relevant for age. The ASQ3 is considered valid and reliable for purposes of tracking typical development in these and three other areas (gross motor, fine motor, and problem solving) and for identifying areas of concern, where the problem solving component is an estimate of cognitive ability (Schonhaut et al., 2013). All five questionnaire topics are applicable to children with CG because of the known risks for deficits in each of these areas (Antshel et al., 2004; Karadag et al., 2013; Potter et al., 2013). Each component is scored by summing the raw scores, then assigning one of three categories, based on the provided norms: above cutoff ("On schedule"), close to cutoff ("Provide learning activities and monitor"), or below cutoff ("Assess further"). The ASQ3 questionnaires are available in 21 separate, age-appropriate and age-normed sets from ages 2 to 60 months. We administer this tool in 6-month intervals and report here on ages 12 and 18 months.
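The scoring logic described above (sum the raw scores, then assign one of three categories) can be sketched in Python. The threshold values, the item scores, and the handling of totals that fall exactly on a boundary are assumptions made for illustration; the actual cutoffs come from the publisher's age-specific norms and are not reproduced here.

```python
def asq3_category(item_scores, cutoff, monitor_upper):
    """Sum the raw item scores for one area and map the total to a category.

    cutoff and monitor_upper are hypothetical thresholds standing in for the
    published norms; boundary handling is an assumption of this sketch.
    """
    total = sum(item_scores)
    if total < cutoff:
        return "Assess further"
    if total < monitor_upper:
        return "Provide learning activities and monitor"
    return "On schedule"


# Hypothetical thresholds and item scores for a single developmental area.
CUTOFF, MONITOR_UPPER = 20, 30
on_schedule = asq3_category([10, 10, 5, 10, 5, 10], CUTOFF, MONITOR_UPPER)  # total 50
monitor = asq3_category([5, 5, 5, 5, 0, 5], CUTOFF, MONITOR_UPPER)          # total 25
assess = asq3_category([0, 5, 0, 5, 0, 5], CUTOFF, MONITOR_UPPER)           # total 15
```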

Data analysis
Results and trends are presented descriptively, supported with graphs. Because of the small sample size of this pilot study, statistical tests for group differences and other inferential statistical procedures were not possible here, but will be used in future reports of the study.

Quality control
Questionnaire data from the MBCDI-2 and ASQ3 were entered into the database by teams of at least two trained research assistants. The MBL and SSL scores were computed by three trained research assistants; 15% of their scores were double-scored by other team members, and differences of over 10% in a given child's MBL score will be resolved by consensus; however, such a discrepancy has not occurred in the pilot data.
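The discrepancy check for double-scored MBL values can be sketched as follows. The text does not state how the 10% difference is computed; this sketch assumes a relative difference with respect to the primary scorer's value, and the function name and example scores are hypothetical.

```python
def needs_consensus(primary_score, second_score, tolerance=0.10):
    """Flag a double-scored MBL value whose relative difference exceeds the tolerance.

    The difference is taken relative to the primary score (an assumption).
    """
    relative_difference = abs(primary_score - second_score) / primary_score
    return relative_difference > tolerance


# Hypothetical double-scored values for two children.
small_gap = needs_consensus(2.0, 2.1)  # 5% difference
large_gap = needs_consensus(2.0, 2.3)  # 15% difference
```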

Results
Mean Babbling Level (MBL)
MBL scores of the children with CG in the treatment cohort consistently exceeded those of the control child with CG and typical control children without CG (Figure 1). For the ages for which data were available for nearly all children (11 through 21 months), the treatment cohort scored on average 1.01 points higher than the control child with CG, a substantial difference. For the ages for which data were available for both the control child with CG and the typical control children without CG (12, 15, 16, 18, and 20 months, listed in Morris, 2010), the average difference was -0.04, indicating that the control child with CG obtained MBL scores roughly equivalent to those of the typical controls without CG, but note the declining trend for the control child at the most recent ages. The children in the treatment cohort scored on average 0.77 points higher than the typical children, indicating that they outpaced the typical children. Figure 1 shows MBL scores for all children and all available ages.
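The average differences reported above are computed over the ages at which both groups have data. A minimal Python sketch of that calculation follows; the function name and the scores are hypothetical and do not reproduce the study data. A positive result means the first group scored higher.

```python
def mean_difference(scores_a, scores_b):
    """Average of (a - b) over the ages in months present in both score tables."""
    shared_ages = sorted(set(scores_a) & set(scores_b))
    if not shared_ages:
        raise ValueError("no ages in common between the two groups")
    differences = [scores_a[age] - scores_b[age] for age in shared_ages]
    return sum(differences) / len(differences)


# Hypothetical MBL scores keyed by age in months (not study data).
treatment_mean = {12: 2.0, 15: 2.2, 18: 2.4}
control_child = {12: 1.0, 15: 1.5, 20: 1.2}

# Only ages 12 and 15 are shared, so the result is the mean of 1.0 and 0.7.
difference = mean_difference(treatment_mean, control_child)
```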

Syllable Structure Level (SSL)
Similar to the MBL, the SSL is calculated on the basis of 50 utterances, but unlike the MBL, it is based on meaningful speech. For one child in the treatment cohort, CG1, not enough meaningful utterances could be identified in any of the recording sessions up to 18 months, when the child left the study, to compute the SSL. Of the remaining children and for the available ages at 12, 13, 15, 19, and 20 months, the children in the treatment cohort with CG outperformed the control child with CG by 1.0. For ages 15 and 19 months, the only ages for which data were available for the treatment cohort and the typical children without CG, the children in the treatment group outperformed the typical children by 0.6. The control child with CG and the typical children without CG could not be compared directly because data at the same ages were not available, but at ages 20 and 21 months, his scores appear lower than the typical controls' scores at age 19 and 22 months. Note that, with only one exception, the highest scores were obtained by CG2 and CG5. Figure 2 summarizes the SSL scores.

Expressive and receptive vocabulary
Regarding expressive vocabulary size, the control child obtained one of the two or three lowest rankings for the ages for which data were available (Figure 3). Two children in the treatment cohort, CG2 and CG5, obtained the highest percentile rankings, which, at age 21 months, were 73 and 72, respectively. Two children, CG1 and CG4, obtained low expressive vocabulary scores, with CG1's percentile score being 1.7 at 18 months and CG4's, 4.0 at 21 months. The control child, similar to CG1 and CG4 in the treatment cohort, showed declining expressive vocabulary scores, below the 8th percentile at 21 months. Note that not having any words at all at age 12 months, which was the case for CG1 and the control child, corresponds to the 25th percentile for boys and the 20th percentile for girls, but the ranking drops rapidly with age if the expressive vocabulary does not increase substantially.
Regarding receptive vocabulary, the control child obtained the second-lowest percentile ranking (Figure 4), but note that all children including the control child show typical receptive vocabulary scores at the two ages represented in the questionnaires. The lowest percentile score, 10.5, is considered low average; it was obtained by CG1 at age 12 months. All other scores are solidly within normal limits. Figure 3 and Figure 4 summarize expressive and receptive vocabulary percentiles, respectively.

Ages and Stages Questionnaires -3 (ASQ3)
The ASQ3 questionnaires at 12 and 18 months indicated that three of the four children in the treatment cohort had communication and personal-social abilities as expected for age. One child in this cohort had communication abilities below expectation for age at 18 months, and personal-social abilities close to the cutoff for age expectations. The control child scored close to the cutoff for age expectation at 18 months for communication, and at 12 and 18 months for personal-social abilities.
Regarding problem solving, of the children in the treatment cohort, only CG1 and CG4 had scores that were not consistently in the range of scores expected for age, both at age 18 months, whereas the control child was close to the cutoff at both time points. Regarding fine and gross motor skills, CG1 was the only child in the treatment cohort with scores that were below (fine motor) or close to (gross motor) the cutoff, whereas the control child had scores close to the cutoff in both areas at one time point and, for gross motor, at both time points. Across all areas, mainly one child in the treatment cohort (CG1) and the control child did not show developmental skills on schedule. Table 1 summarizes the ASQ3 scores for all children in the five assessed areas.

Discussion
In a small sample of very young children with a known and highly predictive risk for speech and language disorders due to CG, we show that a program of preventive activities and routines, the BBC, may have beneficial effects in three important regards.
First, very early attention to speech sound production may increase children's ability to produce more complex speech sounds in babble and speech. The children with CG in the treatment cohort obtained greater MBL scores during babble than the control child with CG as well as typical control children without CG. A similar pattern was seen for the SSL during meaningful speech. Together, these findings suggest that very early attention to speech sound production increases the ability to produce more complex speech sounds in babble and speech, in this case even beyond the expected levels of typically developing children. It is possible that MBL and SSL do not measure areas of crucial weakness in children with CG in general, but that targeting speech sound skills may give them a clear boost that may have beneficial effects on later speech and language development. Continued data collection and analysis in this population will provide conclusive evidence in this regard.
Second, the expressive vocabulary, and possibly to a lesser degree the receptive vocabulary, of children with CG may improve as a result of the BBC. Expressive vocabulary scores were completely within normal limits for two children in the treatment group, CG2 and CG5, whereas the control child with CG and two children in the treatment cohort, CG1 and CG4, scored below expectation. Given the 90% risk for language impairment in children with CG, finding two of four children in the treatment cohort with typical language skills may show the beneficial effects of the BBC program. The fact that receptive vocabulary scores were within normal limits for all children including the control child with CG may indicate that CG affects expressive language skills more than receptive language skills.
Third, there may be gains in competence in using language and nonverbal means to interact with others. Three of the children in the treatment cohort had communication skills and personal-social skills as expected for age, as measured with the ASQ3. These children may be gaining age-appropriate competence in activities of daily living as well.
Comparing the results from all measures, the two children with the overall highest scores were CG2 and CG5. They showed no evidence of deficits in any of the evaluated areas; in fact, in the areas of speech sound complexity in meaningful speech and expressive vocabulary size, they scored at the top of the pilot sample. By contrast, CG1 did not produce enough meaningful speech for SSL scores, had low expressive vocabulary scores, and scored near or below the cutoff for expectation in all five developmental domains assessed with the ASQ3. CG4 had intermediate MBL and SSL scores but low expressive vocabulary scores and one low score in the ASQ3, namely in problem solving. The control child obtained some of the lowest MBL, SSL, expressive vocabulary, and ASQ3 scores. These patterns are consistent with cross-domain associations among the evaluated skills (speech, language, social interaction, cognitive ability, motor ability). Whether these patterns reflect global levels of disease severity, global benefits from the treatment, or a combination of these cannot be ascertained based on the present data.

Limitations and future directions
Because of the small sample size in this pilot study, generalizations to other children with CG are not possible. Whereas the MBCDI-2 percentiles were based on sex-adjusted norms, the other measures were not, but note that the two children with the overall highest performance were a girl and a boy. It is not possible to identify which components of the BBC had the greatest impact, if any, on speech, language, cognitive, and motor development. Many other variables in the study, for instance volubility of child vocalizations and environmental factors such as quantity of child-directed speech, remain to be analyzed, not only in this pilot cohort but also in the full set of families in this study. Longer-term outcomes in the primary and secondary outcome variables will be evaluated as we follow the children until age 4 years and evaluate a more complete spectrum of outcome variables including speech, language, cognitive and motor development, and quality of life. Most importantly, the trends toward beneficial effects of the BBC on the primary outcome variables of speech sound production and expressive language warrant appropriately powered larger clinical trials.

Data availability
The project contains the following underlying data file: UnderlyingData_MBL_SSL_MBCDI2.xlsx. This spreadsheet contains the MBL, SSL, and MBCDI-2 data for the participants.
The underlying data also include audio files recorded in the participants' homes and hand-written questionnaires with identifiable information; therefore, these cannot be openly shared. De-identified questionnaire files can be made available to qualified researchers by contacting the first author at Beate.Peter@asu.edu. Those might help to understand data outcomes on child variables measured.

Is the study design appropriate and is the work technically sound?
In this developmental period, parent training is a well-established way to conduct intervention. Some outcome variables focused on intervention fidelity and range of potential differences across parents would be appropriate.
Are sufficient details of methods and analysis provided to allow replication by others?
Very good information on child measurements with respected analyses or tests. No information about parent input analyses. I do not believe they are available.
If applicable, is the statistical analysis and its interpretation appropriate?
Not applicable for this pilot study.
Are all the source data underlying the results available to ensure full reproducibility?
Raw data not available but types of analyses are standard in the field. This aspect of the study is well described.

Are the conclusions drawn adequately supported by the results?
In this pilot cohort of four CG children, one control, and data from a group of comparison TD children, results do not support assertions made in the Discussion. No measure is provided of parent behaviors, the method of implementation for the BBC intervention. Either diverse implementation by parents of the activities they were taught via telepractice, or diverse types of potential individual differences in the children, could account for the findings in the pilot data on MBL and SSL, the primary issue of interest in measuring BBC intervention outcomes. I am very supportive of the importance of intervention research and focus on early intervention in 'at risk' populations. Unfortunately, the number of unknowns that are not yet accounted for in this pilot study of four children and a control suggests that it may be premature to make assertions about effects of BBC parent intervention with CG children based on this dataset.

This manuscript focuses on an important and relevant issue: determining whether proactive intervention strategies are beneficial for infants at risk for later speech or language deficits associated with a specific condition (Classic Galactosemia, henceforth "CG"). This is an ambitious longitudinal project. In this manuscript, the authors report on preliminary findings and directions for research.

Is the work clearly and accurately presented and does it cite the current literature?
The primary limitation of the work in its current form is a lack of specificity. First, the literature review fails to give the reader a full sense of the related research or of the related clinical services that have gone before. For example, the authors state that, "very early speech and language services are not yet available" (p. 3). They add that, "The question is whether proactive, preventive treatment, if it existed, could reduce the deleterious effects associated with the disease [that results in risk for speech-language delay] …and thereby improve outcomes" (p. 3). These statements seriously downplay decades of research and of speech-language services in early intervention. As long ago as 1983, Tomasello and Todd showed that levels of parental joint attention impacted their infants' vocabularies between 12-18 months. Roberts and Kaiser's systematic review documents the research demonstrating that training parents to interact with their children in certain ways can impact linguistic outcomes. For example, more recently, Roberts et al. trained caregivers to use enhanced milieu teaching using a "teach-model-coach-review" approach. This included not only telling and showing the caregivers what to do, but also coaching them as they tried out the strategies and reviewing their success afterwards. Treatment fidelity on the part of the trainers and on the part of the caregivers were both measured. These authors demonstrated that the frequency and accuracy with which the caregivers carried out the strategies that they were taught had an impact on the children's communication progress. These are but a few of many studies documenting the value of SLPs training parents to interact with their infants and toddlers in a manner that will facilitate communication development, with the result that SLPs involved in early intervention typically do train parents to use these strategies.
For example, ASHA's practice portal document on early intervention states that, "Services that include opportunities for families and caregivers to directly participate in intervention are essential to strengthen existing knowledge and skills and to promote the development of new abilities that enhance child and family outcomes" and that, "SLPs often coach families, caregivers, and other team members in how to implement functional, language-enhancing strategies during daily activities." Therefore, the authors need to specify how their study is similar to and different from what's been done before. Unique and important characteristics of their study might include the focus on stimulating prelinguistic vocalizations specifically (versus communication/cognition more generally), beginning at such a young age, and intervention for children with this specific syndrome. However, the very brief descriptions that they provide of their caregiver trainings make it impossible to judge exactly what they are training the parents to do, how they are training them to do it, and whether or not the treatment fidelity of the trainers or of the parents is actually measured. In this sense, the work is not clearly presented. See below for further discussion of the current lack of specificity in these areas.

Is the study design appropriate and is the work technically sound?
The authors are gathering specific data systematically from several children, selected according to