Developing and Validating Instrument for Data Integration Governance Framework

Data integration is an important subfield of data management. It allows users to access the same data from multiple sources without redundancy while preserving data integrity. The Data Integration Governance Framework (DIGF) is being developed to guide the implementation of data integration; it serves as a reference and guideline for working-level staff implementing data integration. The instrument used to validate the DIGF therefore needs to be developed and validated for accuracy, applicability, and suitability of use. The instrument comprises items structured as a questionnaire. This study uses Lawshe's technique to establish the content validity of the instrument. The technique involves calculating the Content Validity Ratio (CVR) for each item in the questionnaire, which was developed based on the factors identified for the DIGF. Each item that meets the minimum CVR value of 0.75 is endorsed as part of the final DIGF instrument to be used in the Delphi Technique evaluation.
Keywords: Content validity; instrument development; data integration governance; Lawshe's technique


I. INTRODUCTION
Data integration plays an important role in providing clean, integrated, and secure data for decision making and operations in the public sector [1], [2], [3]. The public sector, as the biggest owner of data, needs efficient data integration governance to support its digitalization plan [4]. Efficient governance should incorporate all aspects that influence data integration governance in the public sector. Previous studies show that focusing only on technology leads to failure in data integration governance [5], [6], [7].
To address this issue, this study identified the dimensions and factors influencing data integration governance in the public sector through a literature review, adoption of theories, and interviews. The identified factors and dimensions were then assembled into the public sector Data Integration Governance Framework (DIGF). The DIGF needs to be validated to ensure that it suits the practical needs and requirements of public sector data integration initiatives.
This study uses the Delphi Technique to validate the framework, with a Likert-scale questionnaire as the instrument. However, the Delphi Technique process should be preceded by instrument validation [8], [9]. Hence, this study focuses on the content validation process for the instrument that will be used in the Delphi Technique. The Content Validity Ratio (CVR) and the Index of Content Validity (CVI) are used as the measurement methods for this process. This paper is organized into three major parts: first, a description of the DIGF, covering its dimensions and factors; second, the methods used, with a summary of the questionnaire; and last, the results and discussion of the data analysis.

A. DIGF
Three dimensions have been identified in this study: people, process, and technology. The factors are culture, clarity of roles and responsibility, and communication under the people dimension; law and regulation, and policy under the process dimension; and tools and technology under the technology dimension. The relationship between the three dimensions and six factors forms the foundation of the DIGF. DIGF development involved a literature review of previous studies, adoption of theories, and interviews with experts to relate the dimensions and factors influencing data integration governance in the public sector. The dimensions and factors are described in Table I. Based on these descriptions and the connections between the dimensions and factors, this study produced a framework for data integration governance in the public sector, shown in Fig. 1.
The public sector DIGF is a strategic, basic framework, as its dimensions and factors are connected and described in general terms. It can serve as a reference and be adapted by any kind of organization, including those in the private sector, in governing data integration initiatives.

B. Development of Questionnaire
The protocol suggested by [16] and [17] comprises four processes and six supporting steps for developing a questionnaire in the content validation process. This study adapts that protocol into four processes and eleven supporting steps. Fig. 2 shows the process adapted by this study for questionnaire development.

TABLE I. DESCRIPTION OF THE DIMENSIONS AND FACTORS
Dimension: People. People refers to the entities that perform the activities, using the tools and technology provided, according to the objectives and principles set. [10]
Factor: Culture. Culture involves knowledge, beliefs, habits, capability, and norms in an organization that influence the individual's and the organization's goals. [11]
Factor: Clarity of roles and responsibility. Roles and responsibility describe the contribution of personnel to the activities in the organization, commensurate with their expertise and qualifications. Clarity of roles and responsibility facilitates the governance of any initiative. [10]
Factor: Communication. Communication relates to human behavior. It also applies to other entities such as hardware and software. Communication connects all the dimensions and factors together. [12], [13]
Dimension: Process. Process is a set of related activities with input, value add, and procedures that produces a specific output. Process is automated by technology and facilitated by people. [14]
Factor: Law and regulation. Law and regulation cover the laws (acts) and official orders issued by the government or the authorities to control or govern the implementation of activities and human behavior.
Factor: Policy. Policy is a simple and comprehensive mandatory formal statement that outlines the rules and commands for an organization in performing any activity.
Dimension: Technology. Technology refers to the tools and techniques used by people in implementing any activity. Technology creates innovative human resources and automates processes. [14], [15]
Factor: Tools and technology. Tools and technology is the factor that supports and facilitates the process and people's tasks. [10], [14], [15]

The questionnaire booklet is divided into four components: (1) panel information, (2) items on factors influencing data integration governance, (3) definitions of data integration and data integration governance, and (4) the Data Integration Governance Framework (DIGF). Component (1) collects the panel's information, such as job designation, place of work, years of experience, and contact information. Component (2) consists of 94 specific items and 6 generic items on the factors embedded in the DIGF. Component (3) contains the definitions of data integration and data integration governance that need to be validated by the experts, and component (4) explains the DIGF.
Specific items are the individual items for each factor, while generic items represent each factor as a whole. Generic items are important because they give the experts the opportunity to evaluate the factors in general [18]. The content of the items listed in the questionnaire for DIGF validation is summarized in Table II. The items were developed from the literature review, based on points discussed in previous studies.

Fig. 2. Questionnaire development process.
Process 1: Planning and Strategizing
Step 1: Define type of method and questionnaire
Step 2: Clarify administrative process
Process 2: Defining Content
Step 3: Provide conceptual definition of dimensions and factors
Step 4: Develop items for factors
Step 5: Define measurement skills
Step 6: Identify the experts
Process 3: Designing Questionnaire
Step 7: Design and develop the questionnaire
Process 4: Validating Questionnaire
Step 8: Conduct content validation process
Step 9: Calculate CVR value
Step 10: Calculate CVI value
Step 11: Analyse the results

TABLE II. SUMMARY OF THE CONTENT OF THE QUESTIONNAIRE ITEMS FOR DIGF VALIDATION
Dimension A: People
Factor A1 - Culture
• Culture includes the aspects of organization culture and individual culture as a member of the organization.
• Good organization culture ensures the alignment between corporate strategy and IT strategy.
• A data sharing culture fostered through data integration initiatives plays an important role for data in the public sector.
• To successfully implement a data integration initiative, organization culture and individual work culture must be aligned and well understood.
• Ethics must be incorporated into the organization culture in data management, especially in data integration.
Factor A2 - Clarity of roles and responsibilities (Specific: A2-1 to A2-15; Generic: 7)
• Clarity of roles and responsibilities ensures that members of the organization understand their job scopes and tasks in data integration governance.
• Clear distribution of power and job scope determines the accountability and responsibility of members of the organization.
• Efficient leadership spearheads effective data integration governance in the organization.
Factor A3 - Communication (Specific: A3-1 to A3-15; Generic: 10)
• Communication is an enabler that ensures other factors can be adapted efficiently.
• Clear, structured, and effective communication helps members of the organization comprehend the objectives, terms of reference, and planning of the data integration initiative in the organization.
• The benefits of data integration should be communicated to members of the organization so that they support the implementation.
• The use of data standards and standard terms in the integration team assists data integration implementation.
• The organization needs to provide effective communication channels to facilitate data integration governance.
• The organization needs to provide and execute an effective change management plan to facilitate data integration governance.

Dimension B: Process
Factor B1 - Law and regulation (Specific: B1-1 to B1-18; Generic: 13)
• Law and regulation include establishing acts to protect and guide data integration governance.
• Former acts regarding data integration and data sharing should be updated and aligned.
• An act should be enforced to protect data security, privacy, and confidentiality.
• Alignment between federal, state, and local council laws and regulations should be established to support data integration initiatives in the public sector.
Factor B2 - Policy (Specific: B2-1 to B2-16; Generic: 16)
• Policy in an organization, or in the public sector itself, helps to determine the direction, guidelines, and rules for data integration implementation in the public sector.
• Establishing clear and systematic policy leads to good data integration governance and efficient implementation.
• Policy alignment between federal, state, and local councils should be established to support data integration initiatives in the public sector.

Dimension C: Technology
Factor C1 - Tools and technology (Specific: C1-1 to C1-14; Generic: 19)
• Choosing the right technology is crucial to assure the compatibility, maintainability, reliability, and security of a data integration initiative.
• Choosing the right technology also provides high quality data through well-equipped functions such as data cleansing, data profiling, data stewardship, and others.
• Tools and technology selection must be aligned with, and comply with, law and regulation, policy, culture, and the organization's corporate and IT strategy.

III. MATERIAL AND METHOD
The Delphi Technique has been identified as the validation method for the DIGF. It involves obtaining consensus from experts to validate research output through an iterative process [19], [20]. However, the questionnaire that will be used as the instrument in the Delphi Technique needs to be validated in a pilot study to ensure that the perceived constructs are clear and valid and that they reflect the intended content [17], [18]. This is known as the content validation process.
Many content validation methods are available, such as psychometric analysis using the Rasch Model [8] and the modified kappa statistic [21]. This study adopts CVR and CVI to validate the Delphi Technique instrument because they rely on experts' evaluation and are commonly used for content validation of Delphi Technique instruments [17], [18].

A. Selection of Experts
Selecting experts is the first and most crucial part of the content validation process. The criteria for selecting experts include (1) technical knowledge and experience in the research area, (2) willingness to participate, (3) having ample time to take part in the process, and (4) good communication skills [22], [23]. In this study, experts were selected based on their experience and knowledge in data integration, the research process, and the Delphi Technique. The number of experts is normally determined by the research scope, the resources available (including time and cost), and the research objectives [24], [25]. Nonetheless, there is no definite mechanism for determining the right number of experts for the content validation process and the Delphi Technique in any given study [26].
Eight experts were identified for the content validation process based on the criteria and requirements set for this study, as listed in Table III.

B. Content Validity Ratio (CVR)
CVR is a content validation method introduced by Lawshe [27]. This method has been widely used in many research domains, including computer science and engineering. According to Google Scholar, as of January 2022 Lawshe's CVR technique had been cited 7,149 times in various research publications. A review of the Scopus database identified 19 studies in computer science and engineering from 2016 to 2020 that used the CVR method to validate their content, including three that validated the content of a Delphi Technique instrument using CVR [17], [28], [29].
CVR uses a Likert scale with three indicators: "1-not necessary", "2-useful (but not essential)", and "3-essential". A three-indicator scale is used to simplify the evaluation and keep it objective for the experts [17], [18]. A comments section is provided for the experts to give opinions and suggest improvements to the items. The CVR calculation assesses the experts' agreement on each item in the questionnaire using the formula introduced by Lawshe:
$$\mathrm{CVR} = \frac{n_e - N/2}{N/2}$$
where $n_e$ is the number of experts who chose indicator 2 or 3, and $N$ is the total number of experts.
This definition of $n_e$ follows [17], [18], and [30], who concluded that indicators "2-useful (but not essential)" and "3-essential" both represent positive feedback from the experts, which leads to acceptance of the items.
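In Lawshe's original formulation, which counts only the "3-essential" responses, the formula behaves as follows: (1) when fewer than half of the experts choose "essential", the CVR is negative; (2) when exactly half choose "essential", the CVR is zero; (3) when all experts choose "essential", the CVR is 1.00; and (4) when more than half but not all choose "essential", the CVR lies between zero and 0.99.
However, for this study, as suggested by [17] and [18], indicator "2-useful (but not essential)" is also accepted, as both indicators 2 and 3 reflect positive acceptance and relevance to the study. This study also follows the recommendation of [27] on the minimum CVR value for item acceptance, which depends on the number of participating experts, as shown in Table IV.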
Since eight experts participated in this study (see Table V), the minimum accepted CVR value is 0.75 for each item. All items that obtain a CVR value of at least 0.75 proceed to the Delphi Technique process.
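The CVR calculation described above can be illustrated with a minimal Python sketch. The function name `content_validity_ratio`, the `positive` parameter, and the sample ratings are hypothetical and introduced only for illustration; the ratings are not data from this study. The sketch assumes the three-indicator scale and counts indicators 2 and 3 as positive, as adopted in this study.

```python
from typing import Sequence, Tuple


def content_validity_ratio(ratings: Sequence[int],
                           positive: Tuple[int, ...] = (2, 3)) -> float:
    """Return Lawshe's CVR for one item: (n_e - N/2) / (N/2).

    ratings  : one rating per expert on the 1-3 scale
    positive : indicators counted towards n_e (2 and 3 in this study;
               pass (3,) for Lawshe's original "essential only" rule)
    """
    n_total = len(ratings)
    if n_total == 0:
        raise ValueError("at least one expert rating is required")
    n_e = sum(1 for rating in ratings if rating in positive)
    half = n_total / 2
    return (n_e - half) / half


# Hypothetical ratings from eight experts for a single item:
# seven positive ratings (2 or 3) and one "1-not necessary".
sample_ratings = [3, 3, 2, 3, 1, 3, 2, 3]
cvr = content_validity_ratio(sample_ratings)
accepted = cvr >= 0.75  # minimum CVR for eight experts, per the text
print(cvr, accepted)
```

With eight experts, a single "1-not necessary" rating gives a CVR of (7 - 4) / 4 = 0.75, which is exactly the acceptance threshold, so an item can tolerate at most one negative rating.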

C. Index of Content Validity (CVI)
CVI is used to evaluate whether the instrument as a whole measures the right and relevant items [31]. According to [32], because the content validation process is essential for endorsing the instrument of a study, it needs to be carried out systematically, with strong justification and credible proof. In this study, the CVI calculation is adapted from [27] and [33], in which CVI equals the mean of the CVR values. With $t$ denoting the number of accepted items and $\mathrm{CVR}_i$ the CVR of accepted item $i$, the CVI is calculated as:
$$\mathrm{CVI} = \frac{1}{t}\sum_{i=1}^{t} \mathrm{CVR}_i$$
According to [33], the closer the CVI is to 0.99, the higher the content validity, and hence the higher the level of acceptance for the whole instrument.
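As a minimal sketch of the CVI calculation described above, the following Python snippet takes the mean of the CVR values of the accepted items. The function name `content_validity_index`, the item labels, and the CVR values are hypothetical and serve only as an illustration; the 0.75 default reflects the minimum CVR for eight experts stated in this paper.

```python
from statistics import mean
from typing import Mapping


def content_validity_index(cvr_by_item: Mapping[str, float],
                           min_cvr: float = 0.75) -> float:
    """Return the CVI as the mean CVR over the accepted items.

    An item is treated as accepted when its CVR is at least min_cvr.
    """
    accepted = [value for value in cvr_by_item.values() if value >= min_cvr]
    if not accepted:
        raise ValueError("no item reaches the minimum CVR")
    return mean(accepted)


# Hypothetical CVR values for four items; "item-4" falls below the
# threshold and is therefore excluded from the mean.
example = {"item-1": 1.0, "item-2": 0.75, "item-3": 1.0, "item-4": 0.5}
cvi = content_validity_index(example)
print(round(cvi, 2))
```

Here three items are accepted (t = 3), so the CVI is (1.0 + 0.75 + 1.0) / 3.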

IV. RESULTS AND DISCUSSION
A. Selection of Experts
Experts were selected based on the criteria in Table III. Eight experts from the public sector and academia were selected and agreed to participate in this study. The experts are listed in Table V.

B. Questionnaire Distribution
The content validation process was completed within two weeks. Experts were invited by email, followed up by telephone calls. Experts who agreed to participate received an official invitation from the faculty, and the questionnaire was then distributed by email. Further explanation was given by email, telephone call, and WhatsApp as needed.

C. Analysis of Questionnaires
As mentioned above, the minimum accepted CVR value depends on the number of experts involved. For this study, with eight experts, the minimum accepted value is 0.75. Accordingly, both specific and generic items with a CVR value of 0.75 or above are brought to the first round of the Delphi Technique process for DIGF validation.
The content validation analysis based on Lawshe's technique is presented below.
2) Item 1.A1-3 described that organization culture should not be a limitation in a data integration initiative. Item 1.A1-4 stated that organization culture should be considered when designing a data integration initiative in an organization. Item 1.A1-5 suggested that the organization must ensure there is no conflict of culture while adopting data integration. Each of these three items received one answer with indicator "1=not necessary". However, the experts did not leave any comments on the items.
3) Item 5.A2-2 described that clarity of roles and responsibility supports members of the organization in performing their tasks to the best of their capability and skill. Item 5.A2-6 explained that clarity of roles and responsibility balances and incorporates technical and management aspects in data integration governance. For each item, one expert answered "1=not necessary". However, no comments were provided by the experts on these items.
4) Item 11.B1-5 explained that managing the law and regulation factor is important in data integration governance so that it does not become an obstacle to the adoption and utilization of new technology. One expert picked "1=not necessary" for this item, with no comment provided.
5) Item 18.C1-10 and item 18.C1-13 each received one "1=not necessary" from one expert. Item 18.C1-10 stated that technology needs to accommodate the human resource capability in the organization. The expert's comment on item 18.C1-10 was, "human resource needs to adapt with technology and not other way around". Item 18.C1-13 described that the tools and technology adopted must be free from vendor lock-in. No comments were received for item 18.C1-13.
6) As all items obtained a CVR value of 0.75 or above, all items are accepted and brought forward to the first round of the Delphi Technique.
7) For the generic items, all six items earned a CVR value of 1.00. This demonstrates that all experts agreed on the importance of every factor included in the DIGF.
8) All factors earned a CVI of more than 0.95, and the overall CVI for the questionnaire is 0.98. This indicates that the questionnaire as a whole measures the right things for the DIGF and has been validated by the experts.
A summary of the CVR and CVI calculations for the 94 items included in the questionnaire is presented in Table VI.
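The reported figures can be cross-checked with a short Python sketch under stated assumptions. The item counts for factors A2 to C1 follow the item ranges given in Table II, while the count of 16 for factor A1 is an assumption inferred from the stated total of 94 specific items. The sketch also assumes that only the eight items named in the analysis received one "1=not necessary" rating and that every other specific item received a CVR of 1.00. All variable and function names are hypothetical.

```python
N_EXPERTS = 8

# Specific items per factor. A2..C1 follow the ranges in Table II;
# A1 = 16 is an ASSUMPTION inferred from the stated total of 94 items.
ITEMS_PER_FACTOR = {"A1": 16, "A2": 15, "A3": 15, "B1": 18, "B2": 16, "C1": 14}

# Items the analysis names as having one "1=not necessary" rating.
# ASSUMPTION: all other specific items were rated positively by all experts.
ONE_NEGATIVE_RATING = {
    "A1-3", "A1-4", "A1-5", "A2-2", "A2-6", "B1-5", "C1-10", "C1-13",
}


def cvr(n_e: int, n_total: int) -> float:
    """Lawshe's CVR: (n_e - N/2) / (N/2)."""
    half = n_total / 2
    return (n_e - half) / half


def item_cvr(item_id: str) -> float:
    """CVR of one item under the assumptions stated above."""
    n_e = N_EXPERTS - 1 if item_id in ONE_NEGATIVE_RATING else N_EXPERTS
    return cvr(n_e, N_EXPERTS)


# CVR values grouped by factor, then CVI as the mean CVR per factor.
cvr_by_factor = {
    factor: [item_cvr(f"{factor}-{k}") for k in range(1, count + 1)]
    for factor, count in ITEMS_PER_FACTOR.items()
}
cvi_by_factor = {f: sum(v) / len(v) for f, v in cvr_by_factor.items()}

all_cvrs = [value for values in cvr_by_factor.values() for value in values]
overall_cvi = sum(all_cvrs) / len(all_cvrs)

for factor, value in cvi_by_factor.items():
    print(factor, round(value, 3))
print("overall", round(overall_cvi, 2))
```

Under these assumptions, every factor has a CVI above 0.95 and the overall CVI rounds to 0.98, which is in line with the values reported in the analysis.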

V. CONCLUSION
From the analysis, all 94 specific items and six generic items in the questionnaire were accepted by the experts. This indicates that the items attached to the six factors in the DIGF have been validated through the content validation process using CVR and CVI calculations based on Lawshe's technique. The questionnaire is therefore ready to be used in the Delphi Technique process to validate the DIGF. The validated DIGF will then be adopted to support the successful implementation of data integration initiatives.