Developing and Selecting Auditory Warnings for a Real-Time Behavioral Intervention

Real-time sensing and computing technologies are increasingly used in the delivery of real-time health behavior interventions. Auditory signals play a critical role in many of these interventions, impacting not only behavioral response but also treatment adherence and participant retention. Yet, few behavioral interventions that employ auditory feedback report the characteristics of sounds used and even fewer design signals specifically for their intervention. This paper describes a four-step process used in developing and selecting auditory warnings for a behavioral trial designed to reduce indoor secondhand smoke exposure. In step one, relevant information was gathered from ergonomic and behavioral science literature to assist a panel of research assistants in developing criteria for intervention-specific auditory feedback. In step two, multiple sounds were identified through internet searches and modified in accordance with the developed criteria, and two sounds were selected that best met those criteria. In step three, a survey was conducted among 64 persons from the primary sampling frame of the larger behavioral trial to compare the relative aversiveness of sounds, determine respondents' reported behavioral reactions to those signals, and assess participants' preferences between sounds. In the final step, survey results were used to select the appropriate sound for auditory warnings. Ultimately, a single-tone pulse, 500 milliseconds (ms) in length that repeats every 270 ms for three cycles was chosen for the behavioral trial. The methods described herein represent one example of steps that can be followed to develop and select auditory feedback tailored for a given behavioral intervention.


Introduction
Devices that provide real-time feedback are becoming increasingly affordable, accurate, and widely used. Glucose meters, [1,2] air particle monitors, [3,4,5] heart rate monitors, [6] and accelerometers [7] are among the real-time instruments employed to better understand and improve human health and behavior. Reductions in the size and cost of sound processors and speakers allow for widespread use of auditory warnings on real-time devices. [8] When equipped with real-time sensors, onboard computing, and audio, these devices can provide real-time feedback that is state-of-the-art for the implementation of behavioral interventions [1,6,9,10].
Auditory warnings are often discussed in the context of occupational environments where they are designed to attract attention and simultaneously provide information to users. [11] When auditory warnings are reliably delivered as an antecedent or consequence of human action, they can modify the behavior(s) and related behaviors that resulted in the warning. Consequential feedback that immediately follows a target action is more powerful than delayed feedback for modifying operant behaviors such as secondhand smoke exposure [12].
Real-time sensor and computing technology can be used to instantly detect behaviors and trigger audio (or visual) feedback that immediately follows. These auditory warnings can be engineered to shape behavior independent of, or synergistically with, brief coaching and education. [4,9,13,14] However, auditory warnings have not been evaluated for use in behavioral interventions. Considering that they can and do function to modify behavior, auditory warnings should be subject to the same scientific investigation as the coaching and therapy models they complement or replace [15,16,17].
Carefully designing auditory warnings can improve treatment adherence and participant retention. To effectively attract attention, warnings should be sufficiently loud and unique [18]. When used as a consequence intended to reduce the frequency of a target behavior, auditory warnings should be mildly aversive. [19,20] If too aversive, participants may become annoyed, leave the study or express counter-aggression. [21] Excessively loud or irritating signals also contribute to users turning off sounds, thereby avoiding future signals. [22] These adverse reactions pose challenges for treatment adherence for interventions using auditory warnings as the mechanism of behavioral change. To prevent attrition and non-adherence to treatment, auditory warning designers can involve the intended users in the development process and assess their preferences for specific warnings. At present, this is not standard practice.
This paper details the procedures used to design auditory warnings that are an integral component of a real-time behavioral intervention to reduce secondhand smoke exposure in homes with children throughout San Diego County. The study used custom Dylos DC1700 laser particle counters [3] calibrated to give mass concentrations for tobacco smoke [4] to measure air particle levels in the homes of tobacco smokers. A behavioral module outfitted with onboard computing, sound processors, and speakers (OWL, EME Systems, Berkeley, CA) was attached to the Dylos particle counter (Figure 1) and programmed to deliver auditory warnings at two particle concentrations, the first of which was chosen as indicative of tobacco smoke; the second signaled higher particle concentrations. The volume at which warnings played could be tailored for each home. The auditory feedback was designed to serve as aversive stimuli that evoke behaviors that stop or avoid the signals (e.g., reducing indoor smoking frequency). If aversive characteristics of the sounds are too strong, participants could experience unpleasant reflexive responses that may lead to operant reactions [23] (e.g., turning off particle monitors) that contribute to interruption of real-time feedback (thereby rendering the audible component of the intervention ineffective) or to participant attrition. This paper describes a four-step process to purposively design and select auditory warnings to be used as real-time feedback in a health behavior intervention study. The steps described provide one example of design procedures to help increase the probability that the audio feedback functions as intended without contributing to undue burden that can result in treatment non-compliance or loss to follow-up.

Materials and Methods
A four-step process was used in the development, testing, and selection of audio warnings. The following sections describe the steps taken to design and evaluate auditory warnings intended for use as a real-time feedback component of a multicomponent behavioral intervention.

A Review of Ergonomic and Behavioral Literature
Auditory warnings are functionally defined as audible stimuli that capture attention and provide listeners with information, thereby prompting behavior. [11] Patterson's hierarchical framework [24] described a warning sound as comprised of a set of sound bursts that are themselves made up of a set of sound pulses [25]. A pulse is the fundamental unit of sound containing one tone or multiple harmonics [11].
Technical characteristics of warning sounds include elements of the signal's sound spectrum and its temporal dimension; both characteristics can affect perceptions of the sound and consequently impact behavioral responses. Key features of an auditory warning's sound spectrum include: the length of the sound pulse; the frequency of each harmonic within the pulse; the length of any onset and/or offset envelopes; and the sound intensity (often measured in decibels). The temporal dimension includes the number of times each pulse repeats, the time interval(s) between repeated pulses (speed), the total duration of the warning sound, and any changes in the pitch or fundamental frequency of pulses over the duration of the warning sound.
Psychoacoustic experiments identified speed and fundamental frequency (the lowest frequency among harmonics within a warning sound) as having the most notable impacts on perceived urgency and reaction times. [26,27,28] Higher speed and higher fundamental frequencies evoke shorter reaction times and are both associated with higher perceived urgency; [26,27,28] however, speed had the largest effect [26,27]. The total duration and loudness level of auditory warnings can impact behavioral response. [11,29,30] Aversive aural stimuli with short durations (e.g., 5000 milliseconds [ms]) reduce unwanted behaviors more reliably than stimuli with longer durations. [29,30] When sounds are very short (i.e., ≤ 200 ms), the human ear fails to discriminate acoustic properties typical of warning signals. [11] Thus, auditory warnings expressly designed to modify behavior might best fit in the 200 ms to 5000 ms range. Loudness level is another important characteristic to consider when designing a signal to modify behavior. Sounds below the masking threshold of a given environment are difficult to discriminate and could fail to command attention or provide information. Sounds that are too loud can irritate users, especially when applied in environments with noise-sensitive persons, interfere with interpersonal communications, and lead users to switch the sounds off. [18] To avoid undesirable consequences of unnecessarily loud signals, and for ethical reasons, the loudness level of such stimuli must be determined considering the context in which they will be used [20,25].

Criteria Development
A sound panel comprised of five research assistants (RAs) synthesized design considerations identified in the literature (described above), listened to sample warning signals to become conversant with design considerations, and developed criteria for selecting appropriate sounds for the intervention. Investigators for the larger behavioral intervention oversaw criteria development and defined one overarching criterion: that sounds not be easily confused with sounds that are common in households or workplaces (including musical instruments) so that the auditory feedback would be associated with behaviors specific to the intervention. The resulting auditory warning selection criteria were as follows. Warnings were to:
• Be aversive, but not so aversive as to elicit irritation or evoke counter-aggression (e.g., damaging the instruments).
• Be perceived as urgent (i.e., with a fast pace and high frequency), with the sound for the higher particle concentrations more urgent than the initial warning used to indicate tobacco smoke.
• Be comprised of sound bursts with either constant or descending frequencies.
• Be sufficiently loud to be heard over televisions and music played at a "normal" volume.
• Consist of synthetic sounds that did not resemble musical instruments or a human voice.
• Have a duration greater than 200 ms but less than 5000 ms.
• Not signal previously learned responses (e.g., an alarm clock).
• Not be commonly used in homes or workplace settings.

Development of Auditory Warnings
The sound panel then searched internet sources for free sample sounds that could be used and/or modified to fit the sound selection criteria. Ten sounds were initially identified. The sound panel convened to listen to each sound and discuss how well it fit the a priori defined criteria. Two sounds, (1) a single sine wave and (2) a sound burst containing a single sine wave repeated three times, were selected for modification. The remaining sounds were excluded from further testing because they were perceived to be signals for previously learned actions and/or were commonly used in home or workplace settings (e.g., cell phone ring tone, nuclear meltdown signal, morning wake-up alarm).
Audacity for Windows [31] was used to convert the selected sounds into auditory warnings by modifying the frequency, speed, and number of cycles each sound was repeated. Two warnings were created from each sound, one with a frequency of 500 hertz (Hz) and the other 800 Hz. Technical descriptions of the four resulting sounds are presented in Table 1. The total duration of each resulting warning was between 200 ms and 5000 ms. Sounds comprised of tones of 800 Hz were considered the highest frequency tolerable and were used to indicate higher particle concentrations. Signals comprised of 500 Hz tones were determined to be sufficiently different from 800 Hz signals to make the two easily distinguishable. After modification, audio files containing warning signals were saved as LPCM-encoded Waveform files at 44,100 samples per second, 16 bits per sample.
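For readers who wish to reproduce a comparable signal programmatically, the following is a minimal sketch in Python rather than the Audacity workflow used in the study. It synthesizes a pure-tone warning and writes it as mono 16-bit LPCM at 44,100 samples per second, matching the file format stated above. The function names, the amplitude, and the treatment of the interval between pulses as a silent gap are assumptions made for illustration; the exact parameters of each warning are those reported in Table 1, not the values used here.

```python
import math
import struct
import wave

SAMPLE_RATE = 44_100  # samples per second, as stated in the text
SAMPLE_WIDTH = 2      # bytes per sample (16-bit LPCM), as stated in the text


def sine_pulse(freq_hz, pulse_ms, amplitude=0.5):
    """Return one pure-tone pulse as floats in [-1, 1] (amplitude is an assumption)."""
    n = int(SAMPLE_RATE * pulse_ms / 1000)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(n)]


def build_warning(freq_hz, pulse_ms, gap_ms, cycles):
    """Concatenate `cycles` pulses, separated by silent gaps (hypothetical layout)."""
    silence = [0.0] * int(SAMPLE_RATE * gap_ms / 1000)
    samples = []
    for k in range(cycles):
        samples.extend(sine_pulse(freq_hz, pulse_ms))
        if k < cycles - 1:  # no trailing gap after the last pulse
            samples.extend(silence)
    return samples


def duration_ms(samples):
    """Total duration of a sample list in milliseconds."""
    return 1000 * len(samples) / SAMPLE_RATE


def write_wav(target, samples):
    """Write samples as mono 16-bit LPCM; `target` is a path or a binary file object."""
    frames = b"".join(struct.pack("<h", int(round(s * 32767))) for s in samples)
    with wave.open(target, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(SAMPLE_WIDTH)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(frames)
```

As a usage illustration, `write_wav("warning_500hz.wav", build_warning(500, 500, 270, 3))` would produce a 500 Hz signal whose total duration can be checked against the 200 ms to 5000 ms criterion with `duration_ms`. The file name and the pulse, gap, and cycle values in that call are placeholders.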

Representative of the Intended Users
The four warning sounds were subsequently tested with a convenience sample from San Diego State University (SDSU) Research Foundation Special Supplemental Nutrition Program for Women, Infants and Children (WIC) offices, as this population was the primary sampling frame for the larger intervention study. WIC serves pregnant, breastfeeding, and postpartum women as well as children up to 5 years old from low-income families. WIC participants make regular office visits to receive nutrition-related support and services. The protocol for sound testing was approved by the institutional review board at SDSU.
At WIC offices, persons who appeared to be at least 18 years of age were asked to voluntarily complete a brief sound survey. Surveys were conducted in May 2012, in English or Spanish, depending on the preference of the participant. Of the 85 individuals approached, 83 were age 18 or older, of whom 64 provided verbal consent and completed the anonymous survey.
Following consent, participants were given a paper survey along with audio headphones connected to a netbook computer. Each participant was presented with one of two sound pairs, resulting in two study groups: a low-frequency group and a high-frequency group. The low-frequency group tested warning sounds 1 and 3; the high-frequency group tested warning sounds 2 and 4 (see Table 1 for descriptions of the sounds).
At the beginning of each survey session, RAs flipped a coin to select the first sound to be played for the first participant (e.g., heads = warning sound 1 and tails = warning sound 3); the first sound played for each subsequent participant was alternated. RAs presented the first sound and asked participants to "think about the sound playing in their home after an event occurs". Participants were then asked to answer five questions (see Survey Measures, below) about the first sound. All sounds were played at the same volume and participants were permitted to listen to the warning more than once, if needed. The same procedure was subsequently followed for the second sound. After completing the questions about each sound, participants answered questions comparing the two sounds, including which sound they "liked most", and a few additional questions about demographics and tobacco use. Upon completion, the surveys were placed in locked boxes and participants were given incentives that included recipes for healthy smoothies and coloring sheets for their children.
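The presentation-order procedure described above (a coin flip for the first participant of a session, then alternation) can be sketched as follows. This is an illustrative reconstruction only: the function name and the use of a pseudo-random generator in place of a physical coin are assumptions.

```python
import random


def presentation_orders(sound_pair, n_participants, rng=None):
    """Return one (first, second) sound order per participant in a session.

    A simulated coin flip picks the first sound for the first participant;
    the order then alternates for each subsequent participant.
    """
    rng = rng or random.Random()
    a, b = sound_pair
    first, second = (a, b) if rng.random() < 0.5 else (b, a)  # coin flip
    orders = []
    for i in range(n_participants):
        # Even-indexed participants get the coin-flip order, odd-indexed the reverse.
        orders.append((first, second) if i % 2 == 0 else (second, first))
    return orders
```

For example, `presentation_orders((1, 3), 6)` yields six orderings of warning sounds 1 and 3 in which each sound is played first for half of the participants.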

Survey Measures
The survey was designed to measure the relative aversiveness of each auditory warning, determine participants' reported behavioral reactions to those warnings, and distinguish participant preference for the two warnings. The following variables were created from the survey questions:
Sound description. The survey asked participants to select one of seven options (Cell phone, Warning or alert, Fire alarm, Smoke detector, Encouraging, School bell, or Other) that best described each sound. A blank space was provided next to "Other" for an open-ended response.
Aversiveness scales. Respondents indicated how each sound made them feel on two 5-point ordinal scales: 1=very unhappy to 5=very happy; and 1=very anxious to 5=very calm. They were also asked to rate the texture of each sound on two 5-point ordinal scales: 1=very hard to 5=very soft; and 1=very rough to 5=very smooth. Responses were used to create an aversiveness scale for each of the four warnings by a) recoding responses so higher values corresponded to higher levels of aversiveness (i.e., very unhappy, very anxious, very hard, and very rough); b) summing each respondent's scores on the feeling and texture items; and c) dividing by the number of completed items. The resulting scales exhibited Cronbach's alpha coefficients of .75 (warning sound 1), .52 (warning sound 2), .84 (warning sound 3), and .69 (warning sound 4); each scale contained 4 items.
Color. The survey asked respondents to associate each sound with one of three colors: red, yellow or green.
Active response. The survey asked participants what each sound made them want to do (Take a nap, Relax, Run/get away, Make it stop, and Other; those selecting Other were provided space to describe their answer). Responses were used to create a dichotomous variable, 'Active Response', which was coded Yes/No as follows: Yes (if respondent selected either or both of Run/get away or Make it stop, or if they selected Other and wrote that they would take action to investigate or stop the sound) or No (if responses indicated that no substantive response would be taken, e.g., "Be alert", "Check my phone", "Hang up my phone", "Nothing").
Sound preference. After listening to both sounds, respondents were asked to select the warning signal they "liked most". Responses to this comparison item were used to generate a binary variable, 'Sound Preference'.
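The scoring steps for the aversiveness scale described above (reverse-code, sum, divide by the number of completed items) and the Cronbach's alpha reliability coefficient can be sketched as follows. This is a minimal illustration, not the software used in the study. The function names are hypothetical, skipped items are represented as `None` by assumption, and the alpha calculation assumes complete cases and the standard formula based on sample variances.

```python
from statistics import variance


def aversiveness_score(responses):
    """Mean of reverse-coded 5-point items; None marks a skipped item.

    Raw items run from 1 (e.g., very unhappy) to 5 (e.g., very happy),
    so 6 - r makes higher values mean more aversive.
    """
    recoded = [6 - r for r in responses if r is not None]
    if not recoded:
        return None  # no completed items, no score
    return sum(recoded) / len(recoded)


def cronbach_alpha(rows):
    """Cronbach's alpha for complete cases.

    rows: one list of k item scores per respondent.
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(rows[0])
    item_variances = sum(variance(col) for col in zip(*rows))
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_variances / total_variance)
```

Dividing by the number of completed items, rather than by four, keeps scores on the 1 to 5 range for respondents who skipped an item.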

Statistical Analysis
Descriptive statistics were used to summarize respondents' characteristics. For paired-sample data (e.g., responses to warning sound 1 and warning sound 3 from the same respondents), the Wilcoxon signed-rank test was used for dichotomous and ordinal variables, and Fisher's exact test was used for unordered categorical variables. Analyses were first conducted for low-frequency and high-frequency groups separately. Warning sounds 1 and 2 were acoustically similar and would be used together in the same particle monitor (the same is true of warning sounds 3 and 4). As a result, analyses were also conducted for the low- and high-frequency groups combined.
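To make the paired comparison concrete, the following is a minimal sketch of a Wilcoxon signed-rank test using the normal approximation with a tie correction and no continuity correction. It is an illustration under stated assumptions and not the statistical software used in the study: the function name is hypothetical, zero differences are dropped, and the choice of the normal approximation is an assumption.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Paired Wilcoxon signed-rank test (normal approximation, tie-corrected).

    Returns (W_plus, z, two_sided_p). Pairs with a zero difference are dropped.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        t = j - i + 1               # size of this tie group
        tie_term += t ** 3 - t
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w_plus - mean) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p from the standard normal
    return w_plus, z, p
```

In this setting `x` and `y` would hold the same respondents' scores for the two sounds in a pair (e.g., aversiveness for warning sounds 1 and 3). With 5-point ordinal items, ties are common, which is why the tie correction to the variance is included.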

Results of Sound Testing
Characteristics of the sample are shown in Table 2. The majority of participants (86%) were WIC clients. Approximately 44% were between the ages of 25 and 34 years. Spanish was the primary language spoken at home for 50% of participants, English for 47%.
Analytical results are shown in Table 3. Respondents' descriptions of sounds differed significantly between warning sounds 1 and 3 (p = .025) and between warning sounds 2 and 4 (p < .001). Warning sounds 1 and 2 were more often associated with a warning or alert while warning sounds 3 and 4 were more often associated with a cell phone. When low- and high-frequency groups were analyzed separately, differences between sounds on the aversiveness scale were not statistically significant. However, when low- and high-frequency groups were aggregated, aversiveness levels for warning sounds 1 and 2 (median = 2.9) were significantly higher than those for warning sounds 3 and 4 (median = 2.6), Z = 2.01, p = .045. The distribution of colors associated with warning sound 2 and warning sound 4 differed significantly (p = .001), and when low- and high-frequency groups were analyzed together, combined responses to sounds 1 and 2 differed significantly from combined responses to sounds 3 and 4 (p = .009). No significant differences in 'Active Response' were found when low- and high-frequency groups were analyzed independently. When analyzed together (high- and low-frequency groups combined), a larger proportion of respondents reported they would take action as a result of hearing warning sounds 1 and 2 (76.2%) compared to sounds 3 and 4 (60.7%), Z = 1.96, p = .05. Regarding 'Sound Preference', there were no significant differences among low-frequency, high-frequency, or combined groups.

Auditory Warning Selection
Results from empirical sound tests were used to select the auditory warnings for the behavioral intervention. Findings indicated that warning sounds 1 and 2 were more aversive, more often associated with a warning or alert and with the color red (as opposed to yellow or green), and more likely to elicit a behavioral response (self-reported by participants) compared to warning sounds 3 and 4. Furthermore, participants were indifferent with regard to their sound preference for warnings. Based on the results, warning sounds 1 and 2 were selected.

Discussion
The purposeful design of warning signals is critical, especially when the warnings are a central component of a behavioral intervention. The present paper describes a four-step process used to develop, test, and select warning signals intended to modify smoking and ventilation behaviors to prevent indoor secondhand smoke exposure. The selected sounds have now been programmed into the behavioral modules attached to particle monitors and are being used and evaluated in an ongoing randomized controlled trial.
To our knowledge, the four-step process presented above represents the first model for developing auditory warnings for use as real-time feedback in a behavioral intervention intended for residential settings. Previous approaches for designing and evaluating auditory warning signals have focused primarily on workplace applications. Edworthy and Stanton [22] presented a 10-step user-centered approach whereby users of auditory warnings worked with designers to produce discriminable warnings where sounds and their meaning were strongly associated. Another approach included designing "intelligent sound alarms" and "bringing together multi-disciplinary teams, taking into account engineering, ergonomics and sound design" and developing the sounds within larger systems of alarms that were managed primarily by artificial intelligence. [32] Both approaches are similar to our four-step model in that they involve representatives of the intended users in an effort to arrive at signals that are effective, tolerable, and fit within the user's environment.
While it is essential to purposefully design warning sounds before using them with real-time technology to modify behavior, it is also critical to fully describe the characteristics of the signals used. Previous work has rarely included technical descriptions of the warnings used, thereby presenting challenges for replicating findings and evaluating the effects of sound warnings on human behavior. Table 1 was modeled on the work of Edworthy and Stanton [22] and provides an example of how the key components of warning signals can be reported. In accordance with Patterson's hierarchical model of auditory warnings, the table is stratified by characteristics of the sound pulse (sound spectrum) and by characteristics of the burst and the overall warning signal (temporal description). With this information, practitioners and researchers can more accurately reproduce auditory warnings identified as effective mechanisms of behavior change, thereby increasing the methodological fidelity of trials intended to replicate results. The reader is directed to the original manuscript on which Table 1 was based for more examples of how to comprehensively describe the technical characteristics of warning sounds. [22]
There were several limitations to the procedures described in this paper. The sample of clients used to test alerts was not randomly selected, potentially biasing estimates and limiting the generalizability of our findings. However, threats to generalizing results to the broader San Diego WIC population were limited by approaching all clients who entered WIC offices during the survey period and achieving a completion rate of 77% (64/83).
Although statistical power was limited by the small sample sizes of the low-frequency (n=30) and high-frequency (n=34) groups, we were able to identify important differences as functional and unlikely to be due to chance. To minimize Type II errors and to evaluate differences between sound bursts as they were to be used (i.e., a low-frequency warning to indicate the first particle concentration threshold and a high-frequency warning to signal higher particle concentrations), the high- and low-frequency groups were combined and analyzed.
Following signal development and field testing, it was identified that warning sounds 1 and 2 did not exactly fit Patterson's model for a warning sound; instead they represented sound bursts. However, these sounds were clearly identified as warning signals by study investigators and by the sound panel, and when compared to warning sounds 3 and 4, by a sample of WIC participants and their families. If the definition of a sound burst were relaxed to include relatively long, uninterrupted sound pulses, warning sounds 1 and 2 would fit Patterson's model. That said, all sounds designed and tested in the present study consisted of sound bursts composed of single tones. Complex warnings composed of multi-tonal bursts may convey information with more precision and thus may make better auditory warnings. Future designers should consider both single- and multi-tonal sound pulses when designing auditory warnings.
This study highlights a critical, and often overlooked, component of behavioral interventions with auditory real-time feedback, i.e., the nature of the auditory feedback itself. It offers a first illustration of the research and development that might take place to maximize the functional reactions to auditory feedback. By empirically assessing characteristics of candidate warning signals, we gained knowledge of previously unknown variables that could explain the effects (or non-effects) of the larger trial, which should help us more thoroughly understand the mechanisms of the intervention. This report also serves as a model for investigating the characteristics of other types of feedback (e.g., lights) that can be employed in mobile, real-time and telemetry technologies for behavior change purposes. This approach offers an opportunity for researchers using new mobile technologies to alter or sustain health behavior, and to improve theoretical fidelity by designing behavioral consequences based on empirical evidence of their aversive or reinforcing functions [33].

Conclusion
Warning sounds are increasingly used as real-time feedback in behavioral interventions, yet are often not subject to the same systematic scrutiny as the individual and group therapy models these warnings complement or sometimes replace. Rigorous design methods can help create and select sounds that increase the likelihood of desired behavioral responses while improving treatment adherence and preventing the early exit of participants from research or treatment endeavors. The four steps discussed in this paper included examination of behavioral and ergonomic literature, iterative review of candidate sounds by investigators and a sound panel, and empirical tests of sounds among a population similar to the intended users. The process converged on a pair of sounds (warning sounds 1 and 2) that would best function to modify smoking and ventilation behaviors among participants in a real-time secondhand smoke intervention study. Our ongoing randomized trial will determine the effects of this real-time feedback as well as provide information about the inter- and intra-person variability in the aversiveness of the auditory stimuli selected [34].