Blended Learning Simulation Based Model Development

This paper presents the formative evaluation of an instructional model developed for Aerodrome Flight Information training. The instructional model blends e-briefing with face-to-face role-play simulation and debriefing. The aim of this study was to improve the effectiveness of the developed instructional model. The study employed the formative evaluation approach of Tessmer (2013), using both qualitative and quantitative methods. The field test respondents were 22 students of the 2016-2017 academic year. The formative evaluation consisted of expert review, one-to-one, small group, and field test stages. The research was carried out at Indonesia Civil Aviation Institute. The data were collected through open-ended interviews, questionnaires, and field observation. The data were analyzed using descriptive statistics. The findings indicate that the blended learning simulation-based model increased learning outcomes (+12.55); students and instructors were more satisfied when they implemented this model (4.47); and the model increased students' attention level (4.5). Based on the findings, the proposed instructional model can be implemented in this institution.


Introduction
Aviation training has developed rapidly along with the development of technology and science. It is therefore necessary for instructional designers to produce new instructional models that take advantage of the results of educational research and development, in order to keep pace with this development. Effective instructional design focuses on performing authentic tasks, complex knowledge, and genuine problems. Thus, effective instructional design promotes high fidelity between learning environments and actual work settings [1]. In the process of research and development, formative evaluation is done in the final step, after the prototype model is completely developed. Recently, the term formative evaluation has evolved into terms such as pilot test, formative assessment, dry run, alpha/beta testing, quality control, and learner verification and revision [2]. Formative evaluation is the process designers use to obtain data for revising their instruction to make it more efficient and effective. Its emphasis is on the collection and analysis of data and the revision of the instruction. The evaluator in this formative evaluation is the instructional designer who develops the instructional model [3]. Program developers conduct formative evaluation while the program or product is under development, in order to support the process of improving its effectiveness. In some situations, formative evaluation findings may result in a decision to abort further development, so that resources are not wasted on a program that has little chance of ultimately being effective [4]. Formative evaluation is used as a tool during training while knowledge, skills, or attitudes are being formed. Formative evaluations provide insight into a course for instructional designers, so that improvements can be made before the course is finalized [5].
Formative evaluation is different from summative evaluation. Summative evaluation is conducted to determine how worthwhile the final program is, and is usually done by individuals other than the program developers [6].

Methods
In conducting a formative evaluation, a variety of methods can be employed that vary in the types of data collected (quantitative or qualitative), technical sophistication, type of researcher/participant contact (e.g. interviews or e-mailed questionnaires), the specific questions addressed, and the number of people who contribute data. Multiple methods not only provide a richer understanding but also help to cross-validate the results [7]. This study uses both qualitative and quantitative approaches, following Tessmer's layers of formative evaluation. Figure 1 depicts the layers.
This formative evaluation is divided into two phases, namely the designing phase and the conducting phase.

Designing Formative Evaluation Phase
After the prototype model, or rough draft model, is completely developed, the instructional designer performs a self-evaluation. It aims to find obvious errors and revise them immediately. When the self-evaluation is completed, the research instrument blueprint is designed. This blueprint includes instruments for the expert review, one-to-one, small group, and field test stages. Data are collected using interviews, questionnaires, and participatory observation.
The participants are experts from several fields, students, and instructors. The experts consist of subject matter experts, linguists, instructional design experts, printed media specialists, and multimedia experts. The students are learners in Aerodrome Flight Information training at Indonesia Civil Aviation Institute. They are young adults, aged 18 to 20. The instructors are lecturers in Aerodrome Flight Information training at Indonesia Civil Aviation Institute. They have more than ten years of experience working at AFIS aerodromes and teaching in this institution. Table 1 shows the evaluation instrument blueprint, which is adapted from Dick, Carey, and Carey (2015).

Evaluation Instrument Validation
The procedure for instrument validation by experts is as follows. The formative evaluation instruments that have been prepared are first examined by the supervisors. The experts who validate the instruments are selected based on their expertise and on holding an Educational Technology, Evaluation, or Aviation Course Developer certificate. The formative evaluation instruments are then submitted to the experts to be validated. After validation, the experts give suggestions for revising the instruments. Each item of the formative evaluation instruments is then analyzed based on the validation results from the three experts to determine whether it is used without revision, used with revision, or not used. After this stage the instruments are ready to be tested.
The evaluation instruments were validated by three experts qualified as learning design experts and aviation research and development experts.
The results of the evaluation instrument validation by the experts are as follows:
1. Subject Matter Expert evaluation instrument: all questions can be used without revision.
2. Instructional Design Expert evaluation instrument: questionnaire item number 5 is used with revision and number 9 is not used.
3. Printed Media Expert evaluation instrument: item number 4 can be used with revision.
4. Multimedia Specialist evaluation instrument: questionnaire item number 12 can be used with revision.
5. Linguist evaluation instrument: all questions can be used without revision.
6. One-to-One evaluation instrument: question number 19 can be used with revision.
7. Small Group evaluation instrument: all questions can be used without revision.
8. Field Trial evaluation instrument: all questions can be used without revision.
9. Small Group for Instructor evaluation instrument: questionnaire items number 1, 2, 5, 6, 7, 8, 9, 10, 11, and 12 can be used with revision.
10. Field Trial for Instructor evaluation instrument: questionnaire items number 1, 2, 5, 6, 7, 8, 9, 10, 11, and 12 can be used with revision.

Evaluation Instrument Trials
After the instruments were sorted according to the validation results, their reliability was tested. The interview instrument was tested with four respondents using the test-retest method, while the questionnaires were administered twice to twenty-two students and the reliability coefficient was calculated using Cronbach's alpha. The coefficient was 0.963 for the first administration and 0.961 for the second. The small group and field test questionnaires for instructors were likewise administered twice, and the reliability coefficient was calculated using Cronbach's alpha in Excel. The coefficient was 0.833 for the first administration and 0.719 for the second. These results indicate that the instruments are reliable enough to be used.
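The reliability coefficient described above can be illustrated with a minimal sketch of Cronbach's alpha, assuming the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name `cronbach_alpha` and the response matrix are hypothetical and are not the study's data; the study itself reports using Excel for this calculation.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    n_items = len(scores[0])
    if n_items < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(n_items)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical 5-point questionnaire responses (4 respondents x 3 items).
responses = [
    [4, 5, 4],
    [3, 4, 3],
    [5, 5, 5],
    [2, 3, 2],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

Values closer to 1 indicate that the items vary together, which is the sense in which the coefficients reported above are read as evidence of reliability.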

Conducting Formative Evaluation Phase
According to Tessmer (2013), as depicted in Figure 1, formative evaluation is conducted in four steps. The first step is expert review, followed by revision of draft model 1. The second step is one-to-one evaluation with students, followed by revision of draft model 2. The third step is small group evaluation, followed by revision of draft model 3. The fourth step is the field test, followed by revision of draft model 4. The result of these revisions is a final model that can be implemented in the institution.

Expert Review
The aim of this evaluation is to gain input from the experts for revising the developed instructional model. Data were gathered through direct interviews with the experts. The experts consist of a subject matter expert, an instructional design expert, a printed media expert, a multimedia specialist, and a linguist. This step uses draft model 1.
The information obtained was then used to make the first improvement, which resulted in draft 2 of the learning materials. The expert reviews through interviews obtained the following results:
(1) The result of the interview with the subject matter expert is as follows:
a. Vocabulary and consistency of writing still need to be refined.
b. The language used in the aviation community is English. Some English terms therefore cannot be translated and are kept as they are; such terms should be written in italics.
c. Some figures, such as images, still do not fit the context.
d. In studying the terminology, students need to be able to imagine the object described, so the more images there are, the better for learning.
e. For practice in the AFIS simulator, the scenarios should be more varied and the number of exercises should be increased, so that learners are interested and gain a richer view of the training.
(4) The result of the interview with the multimedia specialist is input on the placement of lesson plans in the e-briefing resources.
(5) The result of the interview with the linguist is as follows:
a. The grammar, spelling, and vocabulary used are not always clearly understood, because some sentences are too long or the relation between subject and predicate is unclear. There are several misspellings, but the text can still be understood. Word choice should be watched so that it is not wordy or repetitive.
b. Most of the sentences used are in accordance with the ability of students, but some of them are too long, so they become very complex and accumulate too many ideas.
c. English words that can be translated into Bahasa should be written in Bahasa, unless the terms are retained in English.
All of these results are used for revising the instructional material, both e-briefing and printed material. Draft model 1 thus becomes draft model 2.

One-to-One
This evaluation aims to identify and eliminate the most visible errors in the learning materials, and to obtain initial indications of outcomes and of students' reactions to the content of the learning materials. During this stage there is direct interaction between the instructional developer and the student. This step uses draft model 2.
The three dimensions of one-to-one evaluation are clarity, impact, and feasibility. Data were collected by interviewing three students in order to explore further their opinions about the developed instructional model. The respondents were three students of Aeronautical Communications selected based on their learning results in the previous semester: one student with above-average ability, one with average ability, and one with below-average ability. The Aerodrome Flight Information instructor contributed to this selection.
Students can also underline or highlight unfamiliar words, unfamiliar examples, illustrations, paragraphs, images, or confusing tables in the given instructional materials. The instructor guides the students through the stages of accessing the e-briefings until they are able to do so independently. Students can also directly write a note about a confusing video or slide, or stop the video and ask the instructional developer directly about their confusion.
The results of the interviews are as follows: a. The video size in the role-play demo should be enlarged; b. The distribution of the chapters in Schoology needs to be improved. The chapters should not be separated in a way that requires students to re-register for the next lesson.
All of these results are used for revising the instructional material, both e-briefing and printed material. Draft model 2 thus becomes draft model 3.

Small Group
There are two main objectives of a small group evaluation: a) to determine the effectiveness of the improvements made after expert review and one-to-one evaluation and to identify learning problems that students may encounter; and b) to determine whether students can use the instructional materials on pre-briefing and role-play demonstrations independently. This step uses draft model 3.
The effectiveness of the instruction is measured by comparing the average pretest and posttest scores. In addition, questionnaires and interviews are used to determine the attitudes and opinions of students and instructors. The information collected includes: (1) the continuity of the delivery of instruction in accordance with the specified format and environment; and (2) behaviors in implementing or managing learning.
Eight students representing low, medium, and high learning achievement were selected as respondents. The target population is students of Aeronautical Communication at Indonesia Civil Aviation Institute.
Quantitative and qualitative data are summarized and analyzed. Quantitative data consist of the pretest and posttest scores. The effectiveness of the expert review and one-to-one improvements is shown by the average test scores before and after the small group evaluation. The average pretest score is 68.13, while the average posttest score is 83.38. The difference between pretest and posttest scores is 15.25.
After that, the normality of the distribution of pretest and posttest scores is tested using the Shapiro-Wilk test. This test is suitable for small samples [10]. In the small group evaluation the sample consists of eight students. The test was calculated with SPSS version 22. The significance of the Shapiro-Wilk test is 0.097 for the pretest and 0.475 for the posttest. The decision rule is that if the significance value is > 0.05, the research data are normally distributed. The Shapiro-Wilk test shows that both pretest and posttest data are normally distributed.
The paired t-test of H0: µ pretest = µ posttest gives t = -14.795 with 7 degrees of freedom. The significance value for the two-tailed test is reported as 0.000 (p < 0.001), which is smaller than α = 0.05. The null hypothesis H0: µ pretest = µ posttest is therefore rejected, and it can be concluded that the mean pretest and posttest scores differ significantly. This difference means that draft model 3, which resulted from the improvements made after expert review and one-to-one evaluation, effectively improved learners' learning outcomes.
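The t statistic above can be illustrated with a minimal sketch of a paired t-test, assuming the usual definition t = mean(d) / (sd(d) / sqrt(n)) with differences taken as pretest minus posttest, which matches the negative sign of the reported value. The function name `paired_t` and the scores are hypothetical and are not the study's data. The last line only rearranges the same formula to show what the reported figures (mean difference 15.25, n = 8, |t| = 14.795) imply about the spread of the differences.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(pre, post):
    """Return (t, degrees of freedom) for paired scores, using pre - post."""
    if len(pre) != len(post):
        raise ValueError("pre and post must have the same length")
    diffs = [a - b for a, b in zip(pre, post)]
    n = len(diffs)
    # Standard error of the mean difference uses the sample standard deviation.
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical pretest and posttest scores for five learners.
pre = [60, 65, 70, 72, 68]
post = [75, 78, 88, 85, 84]
t_stat, df = paired_t(pre, post)
print(round(t_stat, 3), df)

# Standard deviation of the differences implied by the reported values.
implied_sd = 15.25 * sqrt(8) / 14.795
print(round(implied_sd, 2))
```

Under these assumptions the reported statistics correspond to a standard deviation of the paired differences of roughly 2.9 points, that is, gains that were fairly uniform across the eight students.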
Qualitative data consist of descriptions from questionnaires and interviews with students. Figure 2 depicts the level of students' satisfaction based on the questionnaire results.

Field Test
The field test is the final stage of formative evaluation, aimed at determining whether the improvements made from the small group trial results are effective. This step uses draft model 4.
The data collected include the attitudes and achievements of students, instructor opinions, and resources such as time and equipment.
Respondents consisted of 22 students who attended Aerodrome Flight Information training and two instructors.
Student achievement data are grouped into pretest and posttest results. Information on the behavior of students and instructors is grouped separately. A summary of these data helps identify the areas where learning is less effective. Information from this large group trial is used to make final improvements to the draft learning materials.
Quantitative and qualitative data are summarized and analyzed. Quantitative data consist of test scores. To determine the effectiveness of the improvements made after small group evaluation, it is necessary to calculate whether learners' scores changed between the pretest and the posttest in the field trial stage. The mean pretest score is 73.5 and the mean posttest score is 86.05. The difference between pretest and posttest scores is 12.55.
The normality of the distribution of pretest and posttest scores was tested using the Shapiro-Wilk test. The sample size is 22 students. The calculation was done using SPSS version 22.
The significance of the Shapiro-Wilk test is 0.318 for the pretest and 0.011 for the posttest. The decision rule is that if the significance value is > 0.05, the research data are normally distributed; if the significance value is < 0.05, the research data are not normally distributed. The Shapiro-Wilk test results indicate that one of the data sets is not normally distributed. Therefore, the non-parametric Wilcoxon signed-rank test was used to compare the pretest and posttest scores [11]. The Wilcoxon signed-rank test of H0: µ pretest = µ posttest shows that H0 is rejected. It can be concluded that the pretest and posttest scores differ significantly. This difference means that draft model 4, which resulted from the small group evaluation, effectively improved learners' learning outcomes.
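The decision rule and the rank-based comparison above can be sketched as follows. This is an illustrative outline only: the function names `choose_paired_test` and `wilcoxon_rank_sums` are hypothetical, the scores are made up, and the study itself ran the tests in SPSS. The sketch computes only the positive and negative rank sums of the Wilcoxon signed-rank test; obtaining a p-value additionally requires a table or a statistics package.

```python
def choose_paired_test(p_pre, p_post, alpha=0.05):
    """Pick the test from the Shapiro-Wilk significance values, as in the text."""
    if p_pre > alpha and p_post > alpha:
        return "paired t-test"
    return "Wilcoxon signed-rank test"


def wilcoxon_rank_sums(pre, post):
    """Return (W_plus, W_minus) for post - pre; zero differences are dropped."""
    diffs = [b - a for a, b in zip(pre, post) if b != a]
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        # Find the run of tied absolute differences starting at position i.
        j = i
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        # Tied values share the average of the 1-based ranks i+1 .. j+1.
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus


# Significance values reported for the small group and the field test.
small_group_test = choose_paired_test(0.097, 0.475)
field_test = choose_paired_test(0.318, 0.011)
print(small_group_test, "|", field_test)

# Hypothetical pretest and posttest scores for six learners.
pre = [70, 75, 80, 68, 72, 74]
post = [82, 85, 78, 80, 72, 90]
w_plus, w_minus = wilcoxon_rank_sums(pre, post)
print(w_plus, w_minus)
```

A positive rank sum much larger than the negative one indicates that most learners improved, which is the pattern the rejection of H0 reflects.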
Qualitative data consist of descriptions from behavioral questionnaires and interviews with students. Figure 4 depicts students' satisfaction in the field test. Figure 5 shows students' attention level in the field test. All of these results are used for revising the instructional material, both e-briefing and printed material. Draft model 4 thus becomes the final model that can be implemented in this institution.

Discussion
In the field test, the mean pretest score is 73.5 and the mean posttest score is 86.05. The difference between the two tests is 12.55. This indicates a significant increase in student learning outcomes. This can happen because students are well prepared through e-briefing before performing the role-play simulation. In the instructors' opinion, the concept and learning material are very good for equipping students to become reliable AFIS officers. If this instructional model is supported with adequate facilities and infrastructure, the learning process will reach the expected target.
The average student and instructor satisfaction is 4.7 on a 5-point scale. This means that this learning model can satisfy students and instructors. Meanwhile, the learners' level of attention to this learning model is 4.5 on a 5-point scale. This means that this learning model can attract the attention of learners to study independently.
The learning system is embedded in a simulation model that prepares learners through e-briefing activities before they enter a role-play simulation session and debriefing, in order to achieve the Aerodrome Flight Information learning objectives. The implementation of this simulation-based blended learning cannot be separated from the learning materials, either in the form of e-briefing or printed materials consisting of textbooks, the learner's guide, and the instructor's guide.
Feasibility is how well the learning fits the available resources, i.e. time and context. Context is defined as the learners, the learning media, and the learning environment.
The subject matter expert states that the age and degree of independence of the learners are sufficient to complete the learning within the prescribed time, and that the time available is sufficient for the implementation of the lesson. Meanwhile, according to the instructional design expert, young learners have the potential to complete the learning because they are still quite energetic.
Instructors argue that the motivation of learners is sufficient to follow the instructional activities, and that the time available is sufficient for the implementation of this instruction. The results of the questionnaire given to the three instructors also showed that the motivation of learners is adequate for following the learning, and that the time available is adequate for carrying out the learning.
This study is closely similar to a study conducted by Triantafillou, Pomportsis, and Demetriadis (2003), which used Tessmer's methodology for designing and implementing the formative evaluation of an adaptive educational system based on cognitive styles [12]. Triantafillou, Pomportsis, and Demetriadis conducted formative evaluation in three steps: expert review, one-to-one evaluation, and small group evaluation. They did not conduct field tests.

Conclusion
The findings indicate that (1) the blended learning simulation-based model increased learning outcomes. Students had enough time to prepare the simulation material before conducting the role-play simulation. Students could ask questions of their instructor or their peers through the online discussion forum, so they had enough confidence to practice the scenarios in the simulator; (2) students and instructors were more satisfied when they implemented this model. Students and instructors had more interaction, which made the simulation run better. As a consequence, students could achieve the instructional objectives and increase their learning motivation; and (3) the model increased students' attention level. The model applies information technology that suits the needs of students who belong to the millennial generation. Therefore, students are more interested in and enthusiastic about learning activities.
Based on the findings, the proposed instructional model can be implemented in this institution.