Building a House From Lego Blocks: Using Cross Cultural Validation to Develop the Constructed Motivation Questionnaire (CMQS) in Science

Annotation. This study focuses on constructing an instrument based on cross-cultural validation: the Constructed Motivation Questionnaire in Science (CMQS), which measures student motivation in science. Exploratory factor analysis (EFA) and confirmatory factor analysis (CFA) met acceptable criteria. The reliability of the latent factors in the CMQS ranges from 0.828 to 0.967. There is no significant bias based on gender. The assessment of all latent factors is discussed in detail in the full article.


Introduction
The rapid growth of science necessitates the provision for public policy in complex fields, such as health, engineering, genetic engineering, science, energy, and education. To contribute effectively to the society, individuals must possess and comprehend scientific knowledge and literacy to make well-informed decisions, and they should also be able to analyse various scientific questions and determine how human activities affect the natural world (OECD, 2007). Moreover, individuals must be first scientifically literate for gaining further scientific knowledge to support this change. In the education field, it is not just essential to learn science but also to discover factors that motivate students to grasp this subject. Indonesian students obtained the 8th lowest rank on student performance in science according to the Programme for International Student Assessment (PISA) 2018 report. The performance of boys from Indonesia with regard to scientific knowledge is one of the lowest from among the countries and economies that are part of the PISA. On the basis of the report, the average performance score in science of Indonesian students was 391 in 2018, which indicated a 12-point decrease compared with 2015, which placed Indonesia in a far worse position from that of the average score of other countries (OECD, 2018). A low performance score in science can be an indication of the lack of motivation in learning science; thus, educators must identify factors related to student motivation in science during the early years itself to improve their willingness to learn science (Lin-Siegler et al., 2016;Kusurkar et al., 2013;Hazrati-Viari, Rad, & Torabi, 2012). Therefore, science motivation can be assumed as an essential construct that decides students' achievements in the field.
In recent years, science education has contributed to the development of science and literacy (National Academy of Sciences, 2010). Students' ability to understand scientific literacy issues that aid in grasping further knowledge is not a process that can be cultivated spontaneously (Glynn et al., 2009). However, in contemporary education, students' capabilities and interests are limited on the basis of standardised evaluations due to pre-determined benchmarks; hence, teachers fail to identify the factors that motivate students in science. Moreover, they are also unable to explain why students' interest in science is pivotal to comprehend. This knowledge for the teacher has a direct corresponding effect and can improve student performances in learning science (Chen et al., 2014). A decline in science learning achievement occurs when teachers fail to comprehend what motivates students in learning science, especially during middle and secondary school education (Chen et al., 2014). Wang and Liou (2017) suggested that student motivation in learning can increase with specific attention. Student motivation is a factor that influences learning achievement, a rationale to learn science, student interest and beliefs about a specific task in science (Ho & Liang, 2015). Thus, student motivation in science can help teachers and students to improve science learning outcomes.
The self-determination theory (SDT) is a motivation and growth paradigm that is used to analyse ideal human activities and progress (Niemiec & Ryan, 2009). Intrinsic and extrinsic motivation are the basic tenets of the SDT, which is described in the more detailed cognitive assessment approach of Deci and Ryan (1985). Although some scholars differ in their perspectives, these two structures are part of a continuum. They are: (a) lack of motivation; (b) four levels of extrinsic motivation, which include external, introjected, identified and integrated regulations and (c) an intrinsic motivation (Deci & Ryan, 2000). Achievement goal theory (AGT) focuses on the reasons why students choose to be involved in various activities and assignments in learning. Two aspects play an essential role in learning goals, namely, mastery and performance goals. In turn, there are various other factors that affect student motivation and learning outcomes (Mayer & Alexander, 2016). Students who endorse mastery goals want to invest their time and effort in the task due to their interest in learning further. In mastery goals, students tend to compare their past performances with their current ones in learning instead of comparing their abilities in learning with those of other individuals. Students who endorse performance goals are focused on demonstrating their capabilities to other individuals. Furthermore, they are concerned about exhibiting their competence and comparing their capabilities with those of others in the learning process (Skaalvik & Federici, 2016). Nonetheless, several other factors also influence student motivation in learning as outlined in the social cognitive theory (SCT), such as anxiety, self-esteem, self-efficacy and self-regulation (Senler, 2016). Some constructs from three prominent theories are thus included in the questionnaire that was developed in the study in relation to student motivation and science learning.
Science teachers can help students who lack motivation in learning science through individual consultations or by creating like-minded groups for imparting education. In addition, these learning processes also provide information on what aspects underlie the motivation of students in learning (Altun, 2017). However, at the beginning of the learning process, how can science teachers identify which students lack motivation in learning science? What causes the lack of student motivation in learning science? Why are they not motivated to study science? These questions are adequate as primary instruments that can guide science teachers to motivate students based on their understanding of the general response to these questions. Nevertheless, answering these queries may be fairly difficult for teachers, especially for those who are preoccupied with administrative tasks and the evaluation of learning in an institution.
Moreover, assessing students personally is uncommon for institutions. To solve this problem, some researchers have developed questionnaires to assess students' motivation in science. Glynn et al. (2011) developed and validated a science motivation questionnaire for science and non-science majors on the basis of SCT. Hsiao et al. (2005) also developed a questionnaire to measure student motivation in science learning on the basis of environmental influences and learning. However, according to the initial search and literature review conducted in this study, a questionnaire that combines constructs from the achievement goal, self-determination and social cognitive theories has not yet been developed. This research was thus directed at measuring student motivation in science using advanced statistical analysis by establishing and validating distinct constructs related to student motivation in science and its peripheral aspects on the basis of three prominent motivation theories.
Questionnaires are a tool science teachers can employ to efficiently collect student information that is useful during consultation sessions of a more personal nature. In addition, questionnaires can investigate student motivation in science learning and the relationship of motivation to other aspects. Validation is the basic principle in developing, evaluating, and revising a research instrument, mainly a questionnaire in research.
Validation in practice and theory is also crucial because it refers to the relationship between theories and facts and is used in interpreting the results obtained in the form of a scored questionnaire (American Educational Research Association, 1999). Validity is a unitary concept that draws on several sources of evidence, and it has three types, namely, content validity, criterion validity, and construct validity (Osterlind, 2006). However, this research focuses on establishing constructs based on three motivation theories through the cross-cultural adaptation of several dimensions related to motivation in science learning. Thus, this research uses the principles of cross-cultural adaptation for assessments and instruments by Hambleton et al. (2004), who proposed the standards for test adaptation and development.

Theoretical Framework
A theoretical framework is necessary to support the measurement model and discuss the results. This section will elaborate on establishing constructs in the used questionnaire on the basis of the motivation theories. The motivation of students for science is essential for their learning and achievements and for future career choices (Areepattamannil et al., 2011;Taskinen et al., 2013). Motivation is often considered a background aspect of learning and choices in science education research. This study represents some constructs related to the three motivation theories.
From SDT, three relevant constructs are included in this study, namely, intrinsic motivation, extrinsic motivation, and identified motivation. Intrinsic motivation characterises participants who engage in an activity because they value it as interesting, fun, and rewarding. Extrinsic motivation describes participants who engage in a scientific activity not for its inherent value but for reasons linked to external values, for instance, to obtain proper qualifications in science (Ryan & Deci, 2000). Identified motivation describes a person who respects the mission and embraces the regulatory process to some degree, for example, students who devote extra time to studying because they genuinely feel that they can maximise their ability, although they are not satisfied with the work itself.
Motivation has roots in student goals for learning science. Two student goals are emphasised in an achievement goal theory approach, namely, mastery and performance goals. Mastery goals have been theorised to produce similar effects as performance goals in any educational context and not to weaken each other (Dweck, 1986;Nicholls, 1984). Performance and mastery goals illustrate the different values of the learning process and distinct views regarding what must be learned and why some scientific phenomena happen. These goals also relate to diverse factors, which are the reasons for engaging in multiple activities. Students oriented with mastery goals focus on doing tasks in learning and mastering new skills in science. Mastery goals are commonly associated with high-quality learning approaches, high levels of willingness, and metacognition to evaluate current scientific knowledge (Senko et al., 2011). Students oriented with performance goals focus on mastering skills to compare their performances with those of other students. Students tend to link self-value with individual performance, such as intrinsic and extrinsic motivations. Mastery and performance goals are not two separate aspects, but they are factors that co-exist in motivation with the purpose of learning science (Hidi & Harackiewicz 2000). On the basis of its association with other factors, student achievement or orientation goal is central to many motivational and academic outcomes (Midgley & Urdan 2001;Pintrich, 2000).
From SCT, self-efficacy is chosen because in some studies this construct is related to intrinsic motivation, extrinsic motivation, performance goal, and mastery goal (Maulana et al., 2016;Schumm & Bogner, 2016). Self-efficacy describes the perception of individuals regarding achieving goals and completing specific tasks. Students will be highly motivated to learn if they believe that they can obtain what they want (Bandura, 1986), whereas if they have low self-efficacy, they will fear hard work because they expect it to produce something negative (Glynn et al., 2011). Pajares (2002) affirmed that self-efficacy is an essential predictor in learning and is related to student achievement and learning goals. Self-efficacy is also believed to be a determining factor that influences the decisions of students when they reach adulthood (Bandura et al., 2001).
Another factor included in the Constructed Motivation Questionnaire in Science (CMQS) is anxiety. Anxiety is a human emotional component that manifests itself in the form of apprehensive behaviour and restlessness with regard to endeavours in life. When this emotion arises in relation to testing or assessment, it is called test anxiety, which is the focus of this study. Test anxiety is an experience that expresses itself in the mind and behaviour of the candidate in the form of fear of failure or negative self-assessment. The more nervous or concerned people are about how they might be treated, the more apprehensive, afraid, and powerless they become (Olatoye and Afuwape, 2003). Additionally, test anxiety is a significant predictor of academic performance. Sgoutas-Emch et al. (2007) reported that the achievement of students in a science course was significantly predicted by the level of perceived preparedness, self-efficacy, previous exposure to the course materials, and test anxiety. Furthermore, Thomas and Gadbois (2007) verified that test anxiety was a significant predictor of examination grades. In the PISA 2015, test anxiety became one of the background factors that affected student learning and achievement in science (Kuger et al., 2016).
The seven factors or constructs related to student motivation in science that have been selected and adapted for the developed questionnaire are mastery goal, performance goal, intrinsic motivation, identified motivation, extrinsic motivation, self-efficacy, and test anxiety. After adapting all of the related constructs, we combined them into the questionnaire used in this study, which we named the CMQS.

Present Study
After constructing the questionnaire, we conducted cross-cultural validation to demonstrate whether the developed instrument is suitable for the Indonesian context. Cross-cultural validation aims to determine whether the developed instrument can be used in different cultures in similar studies, especially in the Indonesian or the non-western context (Huang & Wong, 2014). The original constructs were in English before they were combined into the questionnaire, which was administered in an Indonesian version.
All of the processes in the adaptation of the questionnaire were referenced to the International Test Commission (ITC) guidelines for test adaptation by Hambleton, Merenda and Spielberger (2004). The adaptation and development of instruments have four main principles, namely, context, development and adaptation, administration and score interpretation. All the principles in the test adaptation guidelines were followed to adapt and develop the CMQS. Seven factors from different established constructs from previous research were combined in the developed questionnaire in this study (CMQS), namely, mastery goal, performance goal (Hellgren & Lindberg, 2017), intrinsic motivation, extrinsic motivation (Nielsen, 2018), identified motivation (Maulana et al., 2016), self-efficacy (Schumm & Bogner, 2016) and test anxiety (PISA, 2015).
Validity and reliability of the measurement model are essential aspects to investigate before exploring gender differences in student motivation in science using the CMQS in the Indonesian context. Reliability was calculated as internal consistency using Cronbach's alpha and composite reliability (CR). Validity was analysed using exploratory factor analysis (EFA) and confirmatory factor analysis (CFA). EFA provides preliminary information on how students construct their knowledge in learning science with psychometric factors from the motivation questionnaire. EFA was also used because this study adapted, added and changed some statements in the chosen constructs. The psychometric factors indicate that some motivational components are conceptualised differently. The factors formed are called latent factors, and they should correspond to the constructs in the CMQS. CFA was used to confirm the results from EFA. Furthermore, CFA provides information on convergent validity, discriminant validity, and the goodness of fit (GoF) indexes of the measurement model.
Finally, the research goals of this study are to investigate the validity and reliability of the measurement model and to explore gender differences in student motivation in science using the CMQS. The research questions related to these goals are as follows:
(1) How valid is the CMQS, according to EFA and CFA, for measuring student motivation in science?
(2) How reliable is the CMQS according to internal consistency using Cronbach's alpha and CR?
(3) Are there any differences in the CMQS across gender at the model and dimension levels?

Participants
The participants were selected using stratified random sampling. The study was conducted in West Kalimantan Province, Indonesia. We first conducted a power analysis using G*Power software to calculate the minimum sample required (Faul et al., 2009). With an alpha level of 5%, a required power of 95%, and a medium effect size (0.3), the required sample size according to the G*Power analysis was 111. The study included 311 senior high school students in the 10th to 12th grades of the science major from 10 different schools and 500 undergraduate students in science majors from three universities. The participants thus comprised a total of 811 students (40.4% males and 59.6% females) aged 15 to 24 years. The questionnaire was prepared by the researcher in both paper-based and online forms. The online-based questionnaire was administered through the eDia platform, which is an online system for diagnostic assessments. This platform has been used for evaluations and assessments across learning research, ranging from pre-school to higher education (Csapó & Molnár, 2019). Table 1 describes the sociodemographic characteristics of the participants.

Procedures
This study developed, adapted, and added constructs in the questionnaire according to the four main principles of adapting and developing an instrument from the ITC guidelines for test adaptation (Hambleton, Merenda, & Spielberger, 2004). The first principle is the context. Considering that the sampling target in this study comprises Indonesian students, researchers must eliminate the effect of cultural differences that are not relevant to the main purposes of the study, such as the language and wording of each item in the questionnaire. The researchers (one senior instructor from Pakuan University and one doctoral student majoring in linguistics at the University of Szeged) translated the items in the questionnaire using forward and back translation, from English to Indonesian and then from Indonesian to English. All the statements or wordings in the questionnaire were adjusted on the basis of the Indonesian context. Thereafter, the revised questionnaire was initially constructed. To examine the clarity of each item, five postgraduate students were asked to provide comments and opinions related to each construct in the questionnaire and to check whether any constructs overlapped, to improve the questionnaire quality. According to their comments, three items in identified motivation and two items in extrinsic motivation were paraphrased because their meaning was difficult to understand. Subsequently, a questionnaire named the CMQS with 37 items was produced. The second principle is test development and adaptation. After ensuring that the questionnaire was adapted and developed with the linguistic factors and cultural differences of the target sample (Indonesian students) in mind, the researchers prepared a scoring rubric and a questionnaire manual in the Indonesian language. The remaining principles are administration and score interpretation.
The researchers also requested ethical approval from the University of Szeged and registered the questionnaire in the eDia system. Scoring was performed on a scale ranging from 1 to 5, except for test anxiety, which was scored on a scale ranging from 1 to 4. The test anxiety scale retained its original response format because this construct was adopted from the PISA 2015 field trial. The scoring rubric was used to interpret data from the paper- and online-based tests with the eDia system.

Instruments
The CMQS was developed by adapting seven factors from well-established constructs, with 37 items, to measure student motivation in science. Four items of mastery goal and four items of performance goal were measured using adapted items from learning goal constructs (Hellgren & Lindberg, 2017). Five items of intrinsic motivation and seven items of extrinsic motivation were measured using adapted items from motivated strategies for learning (Nielsen, 2018). Moreover, seven items of identified motivation were adapted from the autonomous motivation subscale, which originally has four items (Maulana et al., 2016). Five items of self-efficacy were also measured using a subscale of science motivation for adolescents (Schumm & Bogner, 2016), and the responses were given on a 5-point scale ranging from 1 = not at all like me to 5 = very much like me. Five items of the anxiety scale were measured using adapted items from the PISA 2015 field trial, and the responses were given on a 4-point scale ranging from 1 = strongly disagree to 4 = strongly agree. A higher score indicates a higher level of the corresponding factor in the CMQS. The CMQS generates ordinal data, which were analysed as interval data following the procedure and recommendations of Glynn et al. (2011) and Wu and Leung (2017).

Data Analysis
The data collected with this questionnaire were analysed using the Statistical Package for the Social Sciences (SPSS) version 22 and the Analysis of Moment Structures (AMOS) version 24. First, this study applied data screening to check for missing data and to exclude outliers using the Mahalanobis distance. In the initial data, we found 51 outliers out of 811 students, and 760 students were analysed. Three (0.39%) students had missing values for one item in extrinsic motivation and one item in performance goal. Missing values were replaced using the mean of nearby points. Descriptive statistics and zero-order correlations were calculated for the seven factors. EFA was used to analyse the questionnaire responses using SPSS version 22. We used maximum likelihood as the extraction method and Promax rotation because, after running EFA, we applied CFA to check the fit of the measurement model in AMOS version 24. The Kaiser-Meyer-Olkin (KMO) measure and Bartlett's test of sphericity were used to examine the appropriateness of our sample for factor analysis using EFA (Kaiser, 1970). The correlation matrix was checked for very high (r > 0.9) and very low (r < 0.3) correlations. After the first analysis, three items from extrinsic motivation and one item from identified motivation were excluded due to low loadings and high cross-loadings. Subsequently, factor analysis was applied again. We also reran the KMO test and calculated Cronbach's alpha to determine internal consistency for the remaining 33 items and for each subscale. The Kaiser-Guttman criterion was used to determine the number of factors to extract, retaining factors with an eigenvalue higher than one (Kaiser, 1960). Given the number of variables and the communalities in our analysis, this criterion should provide an accurate solution (Stevens, 2009).
After completing EFA and checking internal consistency with Cronbach's alpha, we drew the measurement model and ran CFA using the pattern matrix builder plugin in AMOS (Gaskin & Lim, 2016). In CFA, we checked the factor loading of each item on its construct, the model fit indices, reliability using CR, and construct validity. We also ran a multi-group analysis, or invariance test across gender, to check whether the measurement model measures the same factors across gender. For the specific analysis of gender differences, we ran an independent-samples t-test with a corresponding 95% confidence interval and used Cohen's d as the effect size to measure the magnitude of the differences. The means of every factor were compared by gender using a bar chart with the standard error.

Common Method Bias (CMB)
This study employed Harman's single factor test in SPSS version 22 to check for CMB, that is, to determine whether a single factor accounts for the greater part of the covariance between the measures, using principal axis factoring with a single factor to extract (Podsakoff et al., 2003). The result indicated that a single factor solution accounted for only 41.427% of the variance, which is less than 50%. Therefore, CMB is not considered an issue in this study.

Exploratory Factor Analysis (EFA)
EFA is used in cases where the relationship between the variables observed in an instrument is uncertain (Glynn et al., 2011). EFA was needed to assess the responses of students to the questionnaire in this study because the CMQS is an instrument composed of seven factors based on aspects of the SDT, AGT and SCT in the Indonesian context. The EFA results for Bartlett's test of sphericity (Chi-square = 28209.251, DF = 528, p < 0.001) and the KMO measure of sampling adequacy (KMO = 0.942) indicate that the instrument has distinct and reliable factors and that the sample is of good quality for further analysis (Kaiser, 1970;Field, 2013). Using maximum likelihood extraction and Promax rotation, seven factors were extracted from the 37 items, with an absolute loading value above 0.5 as the threshold for retaining an item in a latent factor (Hair et al., 2010;Kock, 2014). Items with cross-loadings or low loadings from identified motivation (one item) and extrinsic motivation (three items) were excluded from the factor analysis, and the final form of the CMQS consists of 33 items (see Appendix 1).

Reliability
Reliability is a measure of internal consistency in the responses of respondents across the items in questionnaires or other research instruments. Generally, all items in a subscale of the research instrument are used to describe the same basic construct; hence, the scores of respondents on those items should be correlated with one another (Wieland et al., 2017). This study used two techniques to measure reliability: internal consistency assessed by Cronbach's alpha, and CR. Both statistics met the acceptable threshold in the CMQS, which requires values above 0.7 (Dijkstra and Henseler, 2015;Hair et al., 2019;Streiner, 2003). The reliability of latent factors in the CMQS ranges from 0.828 to 0.967. Table 2 shows Cronbach's alpha and the CR values of mastery goal, performance goal, intrinsic motivation, extrinsic motivation, identified motivation, self-efficacy, and test anxiety. The overall reliability values of the CMQS show that the instrument is highly reliable.
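As an illustration of the internal consistency statistic used here, the following minimal sketch computes Cronbach's alpha from raw item scores with the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores). The function name and the toy responses are hypothetical and are not taken from the CMQS data; the study itself computed alpha in SPSS.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for one subscale.

    responses -- list of respondents, each a list of k item scores
    """
    k = len(responses[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical answers of five students to a three-item subscale (1-5 scale).
toy_responses = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 3],
]
alpha = cronbach_alpha(toy_responses)
print(f"alpha = {alpha:.3f}, above 0.7 threshold: {alpha > 0.7}")
```

The same function would be applied once per subscale, since alpha assumes that all items passed to it describe the same construct.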

Convergent validity
Convergent validity was used to measure the level of correlation among multiple variables in the same construct of an instrument, which means that convergent validity is achieved if the variables in a factor are highly correlated. To assess convergent validity, CR, factor loadings and the average variance extracted (AVE) should be calculated (Ab Hamid, Sami & Sidek, 2017). Generally, the smaller the sample size, the higher the loading score required. It is best to have loading scores of more than 0.5 for each item regardless of the sample size. The AVE should be above 0.5 for each composite factor, and CR should be 0.70 or above (Hair et al., 2019). However, when the AVE value is below 0.5 and the CR is higher than 0.6, the convergent validity of the construct still meets the minimum thresholds (Fornell & Larcker, 1981;Malhotra and Dash, 2011).
The AVE and CR values were computed using the master validity tool (Gaskin & Lim, 2016), and the factor loadings were computed in EFA. All the item loadings are greater than 0.5 (see Appendix 1). For the seven latent factors in the CMQS, the AVE values range from 0.497 to 0.852, and the CR values range from 0.828 to 0.967 (Table 2). The AVE for anxiety is slightly below the threshold (0.497), but convergent validity and reliability can still be established from CR alone because the AVE is often too strict (Malhotra & Dash, 2011).
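To make the two statistics concrete, the sketch below computes AVE and CR from standardised factor loadings with the commonly used Fornell and Larcker (1981) formulas, assuming uncorrelated error terms. It is not the master validity tool itself, and the loadings are hypothetical values chosen to show how a factor can fall just under the AVE threshold while its CR stays above 0.7.

```python
def average_variance_extracted(loadings):
    """AVE: mean of the squared standardised loadings of one factor."""
    return sum(l * l for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """CR: (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances).

    The error variance of each item is taken as 1 - loading^2, which assumes
    standardised loadings and uncorrelated errors.
    """
    squared_sum = sum(loadings) ** 2
    error_sum = sum(1 - l * l for l in loadings)
    return squared_sum / (squared_sum + error_sum)


# Hypothetical standardised loadings for a five-item factor.
toy_loadings = [0.7, 0.7, 0.7, 0.7, 0.7]
ave = average_variance_extracted(toy_loadings)
cr = composite_reliability(toy_loadings)
print(f"AVE = {ave:.3f} (threshold 0.5), CR = {cr:.3f} (threshold 0.7)")
```

Because CR depends on the squared sum of the loadings while AVE depends on the mean squared loading, CR rises with the number of items and AVE does not, which is why the two thresholds can disagree for the same factor.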

Discriminant validity
Discriminant validity was used to determine the extent to which latent factors differ empirically from one another (Hair, Hult, Ringle, & Sarstedt, 2016). Fornell and Larcker (1981) recommended that discriminant validity is achieved when the square root of the AVE of a latent factor is higher than the correlations between that factor and the other latent factors. The square root of the AVE should be above 0.5 and greater than the inter-correlation of latent factors in the model (Hair, Black, Babin, & Anderson, 2010). Table 3 shows the validity measurement on the basis of the Fornell and Larcker criterion; it contains the significance of correlations (p), the correlation matrix between latent factors, the AVE values (in bold), the CR values, and the square root of the AVE of each latent factor on the diagonal (in bold). All latent factors in the CMQS achieve the discriminant validity threshold, especially for the comparison between the square root of the AVE and the inter-correlation of latent factors.
Table 3 (excerpt, LGMG row): LGMG 0.956 0.845 0.834*** −0.068† 0.343*** 0.741*** 0.839*** 0.058 0.919
Note. Significance of correlations: † p < 0.100, * p < 0.050, ** p < 0.010, *** p < 0.001. SMIDM, identified motivation; SE, self-efficacy; SMEM, extrinsic motivation; SMIM, intrinsic motivation;
This study also employed a newer criterion to assess discriminant validity, the heterotrait-monotrait ratio (HTMT). To establish discriminant validity, HTMT values should be less than 0.90 or, under the stricter criterion, less than 0.85. The results in Table 4 show that the CMQS establishes discriminant validity on the basis of the HTMT 0.85 criterion, as all of the HTMT values are less than 0.85.
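The HTMT ratio for a pair of constructs can be sketched as follows: the mean correlation between items of different constructs is divided by the geometric mean of the average within-construct item correlations. The function name, the item indices and the correlation matrix below are hypothetical, and the sketch uses raw rather than absolute correlations, which is an assumption about the variant applied.

```python
from itertools import combinations, product
from statistics import mean


def htmt(corr, items_a, items_b):
    """Heterotrait-monotrait ratio for two constructs.

    corr    -- square item correlation matrix as a list of lists
    items_a -- indices of the items of construct A (at least two)
    items_b -- indices of the items of construct B (at least two)
    """
    # Mean correlation between items that belong to different constructs.
    hetero = mean([corr[i][j] for i, j in product(items_a, items_b)])
    # Mean correlation between distinct items within each construct.
    mono_a = mean([corr[i][j] for i, j in combinations(items_a, 2)])
    mono_b = mean([corr[i][j] for i, j in combinations(items_b, 2)])
    return hetero / (mono_a * mono_b) ** 0.5


# Hypothetical correlations: items 0-1 form construct A, items 2-3 construct B.
toy_corr = [
    [1.0, 0.8, 0.4, 0.4],
    [0.8, 1.0, 0.4, 0.4],
    [0.4, 0.4, 1.0, 0.8],
    [0.4, 0.4, 0.8, 1.0],
]
ratio = htmt(toy_corr, [0, 1], [2, 3])
print(f"HTMT = {ratio:.3f}, below 0.85: {ratio < 0.85}")
```

A full check would call the function for every pair of the seven latent factors and compare each ratio with the chosen cutoff.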

Confirmatory Factor Analysis (CFA)
In covariance-based structural equation modelling (CB-SEM), there are two kinds of models, namely, measurement and structural models. This study, as an initial step, assesses the measurement model with CFA in AMOS version 24. CFA is employed to confirm that all latent factors in the measurement model operate adequately and that the GoF indexes are achieved; in subsequent studies, researchers can then be more confident in examining relationships between latent factors and testing hypotheses in structural models (Byrne, 2001). To assess measurement quality, following the suggestion of Chuah et al. (2016), we analysed CR, convergent validity, and discriminant validity (Table 2). We drew the CFA diagram of the measurement model using the pattern matrix builder plugin by Gaskin (2016) to assess the model fit. The CFA results show that model fit was achieved in the first analysis, CMIN/DF = 3.943, p < 0.001, GFI = 0.872, AGFI = 0.848, TLI = 0.945, CFI = 0.950, RMSEA = 0.062, PClose = 0.000. To improve the model fit further, we examined the modification indices and covaried the error terms of items within the same factor that had modification index values of more than 10. The most appropriate modification in the measurement model is to covary error terms that are part of the same factor (Hermida, 2015). The improved model achieved CMIN/DF = 2.720, p < 0.001, GFI = 0.812, AGFI = 0.891, TLI = 0.968, CFI = 0.972, RMSEA = 0.048, PClose = 0.891 (see Figure 1). Figure 1 depicts the CFA diagram after applying the modification indices and gives the values for the GoF. According to the cut-off criteria for fit indices in covariance structure analysis by Hu and Bentler (1999), the CMQS has achieved excellent criteria in the measurement model. The cut-off criteria for an excellent model fit are CMIN/DF > 1, CFI > 0.95, SRMR < 0.08, RMSEA < 0.06 and PClose > 0.05.
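The comparison between the reported indices and the cut-off criteria for an excellent fit listed above can be sketched as a simple lookup. The index values are those reported in this section; the dictionary layout and the function name are illustrative assumptions, and indices that were not reported (SRMR) are simply skipped.

```python
import operator

# Cut-off criteria for an excellent fit as stated in the text.
EXCELLENT_CUTOFFS = {
    "CMIN/DF": (operator.gt, 1.0),
    "CFI": (operator.gt, 0.95),
    "SRMR": (operator.lt, 0.08),
    "RMSEA": (operator.lt, 0.06),
    "PClose": (operator.gt, 0.05),
}


def check_fit(indices):
    """Return {index name: True/False} for every reported index with a cut-off."""
    return {
        name: compare(indices[name], cutoff)
        for name, (compare, cutoff) in EXCELLENT_CUTOFFS.items()
        if name in indices
    }


# Values reported for the first model and for the model after modification.
initial_model = {"CMIN/DF": 3.943, "CFI": 0.950, "RMSEA": 0.062, "PClose": 0.000}
modified_model = {"CMIN/DF": 2.720, "CFI": 0.972, "RMSEA": 0.048, "PClose": 0.891}

for label, model in (("initial", initial_model), ("modified", modified_model)):
    print(label, check_fit(model))
```

Under these cut-offs only the modified model satisfies every reported criterion, which matches the decision to covary error terms before interpreting the measurement model.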

Figure 1
CFA After Modification Indexes, Standardised Factor Loading and Correlation (N = 760)

Multigroup Analysis
We conducted a multigroup analysis through CFA on the measurement model, forming two groups by gender (females and males), to ensure that the measurement model in this study measures the same thing across gender. In other words, the instrument should not behave differently when it is applied to the two groups, which would indicate that there is no gender bias. The global test results confirm that no significant difference exists across gender (p = 1), DF = 908, χ² unconstrained = 2094.908, χ² constrained = 2094.908. We also recalculated the model fit for females and males separately. The results show that both the female and male groups meet the criteria of the GoF indexes.
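A global test of this kind is commonly a chi-square difference test between the constrained and unconstrained multigroup models. The sketch below illustrates that logic under stated assumptions: `chi2_sf` is a small standard-library helper written for this example (not an AMOS function), and the difference in degrees of freedom `delta_df` is a hypothetical placeholder because the text does not report it.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)                # Q(1, z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))     # Q(1/2, z)
    # Recurrence: Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while a < df / 2.0:
        q += math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
        a += 1.0
    return min(q, 1.0)


chi2_unconstrained = 2094.908  # reported in the text
chi2_constrained = 2094.908    # reported in the text
delta_df = 26                  # hypothetical: not reported in the text

delta_chi2 = chi2_constrained - chi2_unconstrained
p_value = chi2_sf(delta_chi2, delta_df)
print(delta_chi2, p_value)  # 0.0 1.0
```

Because the two reported chi-square values are identical, the difference is zero and the p-value is 1 for any positive `delta_df`, which agrees with the reported p = 1.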

Analysis of Scale Scores for CMQS Components
To analyse the scale scores for all components in the CMQS, we compared the means of the seven latent factors by gender using the independent sample t-test. We also computed Cohen's d effect size. Effect sizes are classified as negligible (0-0.19), small (0.2-0.49), medium (0.5-0.79) and large (0.8 and above) (Cohen, 1992). According to the independent sample t-test, we found the following: mastery goal (t (758) Cohen's d = 0.166). From this analysis, we further found that mastery goal, performance goal and intrinsic motivation differ between males and females, with small effect sizes. Figure 2 shows a bar chart of every subscale in the CMQS by gender.
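The effect size classification described above can be sketched as follows. The pooled-standard-deviation form of Cohen's d is assumed here, and the group sizes, means and standard deviations in the usage example are hypothetical values chosen only so that the two groups sum to N = 760; they are not results from this study.

```python
import math


def cohens_d(mean1: float, sd1: float, n1: int,
             mean2: float, sd2: float, n2: int) -> float:
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var)


def effect_size_label(d: float) -> str:
    """Classify |d| using the thresholds quoted in the text (Cohen, 1992)."""
    size = abs(d)
    if size < 0.2:
        return "negligible"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"


# Hypothetical group statistics (NOT taken from the study).
d = cohens_d(mean1=3.60, sd1=0.50, n1=400, mean2=3.45, sd2=0.50, n2=360)
print(round(d, 2), effect_size_label(d))  # 0.3 small
```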

Figure 2
Comparison of Seven Factors in the CMQS (N = 760)
Note. Error bars show 95% CI. Mean score across all latent factors: M ± SD = 3.49 ± 0.49.
The bar chart for all latent factors shows slight differences (more than 0.10) in the scale scores between males and females in the mastery goal, performance goal and intrinsic motivation components. Generally, the scale scores of the science motivation components are similar for males and females.

Conclusion
The CMQS instrument developed in this study is valid and reliable according to the statistical analysis. In the EFA, Bartlett's test of sphericity (Chi-square = 28209.251, df = 528, p < 0.001) and KMO = 0.942 indicate that the instrument can appropriately differentiate seven latent factors in the CMQS. Four of the 37 items were excluded because their loadings were below 0.5. Reliability according to Cronbach's alpha and CR was achieved, ranging from 0.828 to 0.967. Convergent validity was achieved with good criteria: all latent factors have AVE values above 0.5 except test anxiety (0.497), for which convergent validity can still be accepted because of the high CR value (> 0.6) for all the components. There is no issue with discriminant validity because the CMQS meets both the Fornell-Larcker criterion and the HTMT0.85 criterion. In CFA, the GoF index values are excellent both before and after the modification indices. Multigroup analysis through CFA shows that the model measures the same thing across gender, indicating that the instrument has no bias when measuring the two groups, males and females. For the scale score analysis, small differences emerge in the mastery goal, performance goal and intrinsic motivation components according to the t-test, with small effect sizes according to Cohen's d criteria.
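For readers who wish to reproduce the convergent validity summary, CR and AVE are conventionally computed from standardised factor loadings. The sketch below uses the usual formulas, AVE as the mean of the squared loadings and CR as (sum of loadings)² / ((sum of loadings)² + sum of (1 - loading²)). The three input values are the pattern-matrix loadings listed in Appendix 1 for items LGPG1 to LGPG3, borrowed purely as sample input; the CR and AVE values reported in this study come from the CFA estimates, so the output here is illustrative only.

```python
from typing import Sequence


def average_variance_extracted(loadings: Sequence[float]) -> float:
    """AVE: mean of the squared standardised loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)


def composite_reliability(loadings: Sequence[float]) -> float:
    """CR: (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    squared_sum = sum(loadings) ** 2
    error_variance = sum(1.0 - l ** 2 for l in loadings)
    return squared_sum / (squared_sum + error_variance)


# Sample input only: pattern-matrix loadings for LGPG1 to LGPG3 (Appendix 1).
sample_loadings = [0.817, 0.883, 0.818]

ave = average_variance_extracted(sample_loadings)
cr = composite_reliability(sample_loadings)
print(round(ave, 3), round(cr, 3))  # 0.705 0.878

# Thresholds quoted in the text: AVE above 0.5 and CR above 0.6.
print(ave > 0.5, cr > 0.6)  # True True
```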

Limitations and Future Directions
Although this study provides knowledge about how to validate an instrument and how to measure the motivation component in science properly, our results have some limitations. First, this study merely assesses the measurement model without conducting further analysis on the relationships among latent factors. Second, this study is cross-sectional; hence, some disadvantages exist, including the challenges in analysing behaviour over a time period and collating samples on the basis of a variable on the studied population. Third, some bias in the research remains possible although we adopted appropriate procedures for data collection and took the necessary precautions.
In future studies, researchers can analyse the structural model using covariance-based or partial least squares structural equation modelling. Investigating test anxiety as a moderator variable in student motivation in science may be an exciting topic for future research. The results confirm that correlations exist among latent factors in the CMQS, such as achievement goal and intrinsic motivation, but no clear model describes that interaction. Thus, modelling the interaction among latent factors in the CMQS appears to be the next topic of interest for researchers.

Appendix 1
LGPG1 For me, it is important to be better than other students in science lessons.
.084 .001 −.008 .035 .817 −.012 .039
LGPG2 I will try to obtain better grades on tasks and exams than other students.
.101 −.015 −.045 .026 .883 .011 −.013
LGPG3 My goal in learning science is to be better than other students.
.155 .017 .034 .007 .818 .016 −.062
LGPG4 My goal is to avoid worse results in science exams compared with other students.
LGMG4 To understand every concept in science is my main priority following science lessons.
.169 −.016 −.004 .104 .058 .002 .632
LGMG2 I want to learn all things in science even if the materials do not appear in exams.