HOSPITAL FOOD SERVICE QUALITY IMPROVEMENT QUESTIONNAIRE (HFSQIQ): DEVELOPMENT, TRANSLATION AND VALIDATION OF A QUESTIONNAIRE

The hospital food service department provides meals to staff, patients, and their caregivers while adhering to dietary therapy guidelines and promoting nutritional wellness. High-quality food service plays a pivotal role in offering inpatients nourishing meals that promote physical and mental well-being, aiding their recovery and overall health during hospitalisation. This study aimed to develop and validate a tool for measuring and evaluating hospital food service operations using the Total Quality Management approach. A literature review, in-depth interviews with food service employees, and a peer-review process were conducted to identify the domains and items for the questionnaire. A “Hospital Food Service Quality Improvement Questionnaire” (HFSQIQ) with 61 items in six domains was developed, and content validation was performed by seven experts. The questionnaire was translated into Malay, and the internal consistency of the HFSQIQ was examined using Cronbach’s alpha. The HFSQIQ demonstrated high validity and reliability, with high I-CVI and kappa index ratings for most items and Cronbach’s alpha values of 0.97 and 0.98 for the importance and performance scales, respectively. In conclusion, the HFSQIQ is a useful tool for evaluating and improving the quality of hospital food service operations.


Introduction
All hospitals in the healthcare sector provide similar services, but service quality differs. Given the growing competition, healthcare organisations have given top priority to service quality. A hospital food service department caters food for staff, patients, and their caregivers by attending to diet therapy regimes and enhancing the application of nutrition (1). The quality of food service plays a significant role by offering inpatients nourishing meals that benefit their physical and mental health and recovery during hospital stays. The primary objective of a hospital food service is met when meals are meticulously planned and tailored to satisfy each patient's specific dietary requirements (2). Patients who are well-nourished upon admission have the right to maintain their current nutritional condition after discharge (3). Failure to provide food services of acceptable quality may result in poor food intake, thereby increasing recovery time, complication rates, and length of stay. These events may culminate in increased healthcare expenses, especially among the elderly (4).
Patient satisfaction is a widely accepted measure and a key indicator of food service quality. Food quality has been demonstrated to be one of the most significant predictors of overall satisfaction with a hospital stay (5). Patients' nutritional status commonly deteriorates during hospitalisation (6,7). Hospital food is often negatively perceived as cold, tasteless, poorly presented, and badly served (8). Besides, high patient satisfaction is influenced by several factors, including appreciation of meal services, staff interactions, and the eating and physical environments (9). Patient satisfaction has also been associated with emotions, morals, medical discourse, and cultures of gratitude (10).
However, satisfaction assessment is typically limited to a few general questions about food service, which are insufficient to elicit information from patients about objectives and interpersonal aspects, and to examine patients' wishes for personalised service (11).
In Malaysia, most studies used instruments, such as surveys and measures of plate or food waste, adapted from earlier studies conducted in other countries to measure or assess patients' food intake and satisfaction with hospital foods (9,12). A qualitative study found that emotions indirectly influenced patients' meal experience and food intake in hospitals (13). In addition, staff play a primary role in providing a positive meal experience for patients during hospitalisation (14).
Various models have been developed to measure service quality. Total Quality Management (TQM) is one approach for fundamental measurement and continuous improvement. TQM is a concept rooted in the Japanese management style (15), which helps to improve the quality of services and goods through a collaborative approach and standardised performance. A previous study highlighted that all departments and individuals contribute to TQM in attaining standards of customer service and end-user satisfaction, which brings excellence to a business (16). Nevertheless, there is insufficient information on questionnaires designed for the Malaysian population. It is challenging to use instruments from other settings to measure food preferences because of differences in food culture and practices, which are linked to multi-ethnicity and diverse religious and socio-demographic backgrounds (17). Thus, this study aims to develop and validate a measurement tool using the TQM approach for monitoring and evaluating hospital food service operations.

Questionnaire development
In this study, the domains of the newly proposed questionnaire were identified through a literature review, in-depth interviews, and a peer-review process. A literature search on hospital food service quality improvement was used as a guide in generating the questionnaire's items (18-21). Next, interviews were conducted with 24 food service employees, and the acquired data were thematically analysed. The findings helped to determine which questions should be added to or removed from the initial questionnaire. The interviews also helped to build and improve the answer options. Details of the thematic analysis of the interview sessions used for item development can be found in a previous study (22). Next, the questionnaire underwent a peer-review process to identify any overlapping or duplicate items. Through this process, a preliminary Hospital Food Service Quality Improvement Questionnaire (HFSQIQ) comprising 61 items in six domains was developed.

Content validation
The preliminary HFSQIQ underwent content validation, in which the relevance of the items was judged further. According to Yusoff (23), the minimum number of experts for content validation should be six and the maximum number should not exceed ten. In addition, Polit et al. (24) and Polit and Beck (25) recommended using a CVI cut-off score of at least 0.83 with a minimum of six experts for the content validation procedure. Therefore, seven experts were chosen for this study from among lecturers in food service or food technology, dietitians or catering dietitians, and catering officers working in the food service divisions of public and private hospitals. The experts were selected based on criteria established by the researchers, such as their in-depth knowledge of scale development and/or the relevant domain, their independence from the individuals who developed the item pool, and the application of systematic expert judgement to avoid bias in the evaluation of items. The experts were given the definition of each domain and the items describing it in the content validation form. Each domain and its underlying items were critically evaluated by the experts before each item was scored.
The experts were asked to provide written or verbal feedback to enhance each item's relevance to the intended domain. All feedback was considered when improving the domains and their items. Following the review of the domains and items, the experts were required to score each item individually on the relevance scale, which ranged from 1 = Not relevant to 4 = Highly relevant. To avoid a neutral and ambiguous midpoint, Lynn (26) suggests adopting a 4-point rating scale instead of a 3- or 5-point rating scale (24). Each domain's item-level (I-CVI) and scale-level (S-CVI) content validity indices were manually calculated. Two different indices were calculated to determine the S-CVI: 1) the proportion of items on a scale rated as valid by all experts (S-CVI/UA = universal agreement among experts), and 2) the average proportion of items on a scale rated as valid (S-CVI/Ave = average agreement by experts) (27). Each rating of 3 or 4 was transformed to valid ('1'), whereas ratings of 1 or 2 were transformed to not valid ('0'). The S-CVI/Ave was determined using two formulas:
I-CVI = (agreed item) / (number of raters)
S-CVI/Ave = (sum of all I-CVI values) / (number of items)
All I-CVI values were obtained first using the first formula; they were then summed and divided by the number of items using the second formula to give the average agreement. Next, the number of items with 100% agreement was divided by the total number of items in that particular domain to determine the S-CVI/UA (23). For the S-CVI/UA and S-CVI/Ave, a value of 0.8 was considered acceptable (24,25). The items were revised or removed based on the panel's recommendations, and experts were contacted for a second round of review to clarify any uncertainties.
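The index calculations described above can be illustrated with a minimal Python sketch. The study reports that the indices were calculated manually, so this is not the authors' procedure; the function name `content_validity_indices` and the example ratings are hypothetical and serve only to show the arithmetic.

```python
from typing import Dict, List


def content_validity_indices(ratings: List[List[int]]) -> Dict[str, object]:
    """Compute I-CVI, S-CVI/Ave and S-CVI/UA for one domain.

    ratings[i][j] is the relevance rating (1-4) given to item i by expert j.
    """
    if not ratings or any(len(row) != len(ratings[0]) for row in ratings):
        raise ValueError("ratings must be a non-empty rectangular table")
    n_experts = len(ratings[0])
    # Dichotomise: ratings of 3 or 4 count as valid (1), ratings of 1 or 2 as not valid (0).
    valid = [[1 if r >= 3 else 0 for r in row] for row in ratings]
    # I-CVI: proportion of experts who rated the item as valid.
    i_cvi = [sum(row) / n_experts for row in valid]
    # S-CVI/Ave: mean of the I-CVI values across the items.
    s_cvi_ave = sum(i_cvi) / len(i_cvi)
    # S-CVI/UA: proportion of items rated as valid by every expert.
    s_cvi_ua = sum(1 for row in valid if sum(row) == n_experts) / len(valid)
    return {"i_cvi": i_cvi, "s_cvi_ave": s_cvi_ave, "s_cvi_ua": s_cvi_ua}


# Hypothetical ratings for three items scored by seven experts (not study data).
example = [
    [4, 4, 3, 4, 3, 4, 4],  # all seven experts rate the item 3 or 4
    [4, 3, 4, 2, 4, 3, 4],  # six of seven experts rate the item 3 or 4
    [3, 2, 4, 1, 3, 4, 4],  # five of seven experts rate the item 3 or 4
]
result = content_validity_indices(example)
print([round(v, 2) for v in result["i_cvi"]])
print(round(result["s_cvi_ave"], 2), round(result["s_cvi_ua"], 2))
```

In this hypothetical domain only one of the three items reaches universal agreement, which shows how the S-CVI/UA can fall well below the S-CVI/Ave even when most experts agree on most items.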

The translation process of the pre-final HFSQIQ from English into Malay
The newly developed questionnaire, prepared in English, was translated into Malay following content validation. The purpose of the translation was to maintain the text's original meaning, style, and impact while conveying the genuine context from English to Malay. The forward and backward translation process was adapted from the guidelines of Sousa and Rojjanasrirat (28).
The English version of the HFSQIQ was translated into Malay by two qualified independent translators. The first translator was familiar with healthcare terminology and the questionnaire's content in both languages. In contrast, the second translator was a naive translator who was not made aware of the objective the questionnaire intended to measure. The translators were instructed to independently prepare a forward translation conceptually equivalent to the original HFSQIQ. A professional and qualified proof-reader then reviewed the translated questionnaire for inaccuracies. Lastly, the four members of the supervisory research team, who are proficient in English and Malay, examined and compared the questionnaire items translated into Malay to check for discrepancies between the translated text and the original language (English). To produce a preliminary forward translation of the HFSQIQ, the translations were reconciled by consensus among the research team members.
The next stage of the translation process entailed sending the preliminary forward-translated Malay version of the HFSQIQ to a third translator. The third translator back-translated the tool from the target language (Malay) into the original language (English). To avoid reference to existing sources of teamwork assessment, the translator was not informed that the tool was being back-translated. After the backward translation was produced, the supervisory research team members reconciled the two English versions by comparing the back-translated version with the original version. Members of the supervisory research team were responsible for evaluating, revising, and consolidating the back-translated questionnaire to ensure conceptual, semantic, and content equivalence. They were also in charge of creating the pre-final target-language questionnaire for pilot and psychometric testing (27). The team then discussed any disparities between the back-translated version and the original to guide the selection of phrases and words in the Malay version. All comments and amendments made for the pre-final version of the HFSQIQ were documented.

Face validation of pre-final version in both the English and Malay versions of HFSQIQ
Upon completing the content validation, a panel of 10 volunteers selected from among academicians and healthcare professionals performed the face validation. The aim was to determine the clarity and understandability of the translated items. Based on the comprehensibility and clarity of the source and translated items in the HFSQIQ, the raters were instructed to score each item on a scale from 1 (item not clear and understandable) to 4 (item very clear and understandable). Ratings of 1 and 2 were then reclassified as 0 (not clear and understandable), and ratings of 3 and 4 were reclassified as 1 (clear and understandable). The item-level face validity index (I-FVI) was computed in Microsoft Excel from the raw scores for each item's comprehensibility and clarity. In addition, the average of the I-FVI scores (S-FVI/Ave) for all items on the scale, that is, the average proportion of clarity and comprehension as evaluated by all raters, was calculated (29). According to Marzuki et al. (30), the minimum acceptable number of raters for an online survey is 10, with acceptable FVI values of at least 0.83. The following formulas were used:
I-FVI = (agreed item) / (number of raters)
S-FVI/Ave = (sum of I-FVI scores) / (number of items)
A modified kappa index was also generated alongside the I-CVI. The modified kappa (κ*) is an index of agreement among experts that indicates, beyond chance agreement, that the item is relevant, clear, or possesses the quality being rated. The formula of Polit et al. (24) was used to calculate the modified kappa index in this study. For each item, the probability of chance agreement (Pc) was first computed using the formula below:
Pc = [N! / (A! (N - A)!)] × 0.5^N
where N represents the total number of experts, and A represents the number of experts or target users who agreed that the item was comprehensible, relevant, and clear. The next step was to calculate the kappa value using the following formula:
κ* = (I-CVI - Pc) / (1 - Pc)
Microsoft Excel was used for the kappa calculation. Based on the formula above, a value above 0.74 is considered excellent, 0.60 to 0.74 is acceptable, and 0.54 to 0.59 is fair (23).
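As an illustration of the chance-corrected agreement index, the following Python sketch assumes the binomial chance-agreement term of Polit et al., Pc = [N! / (A!(N - A)!)] × 0.5^N, and then applies κ* = (I-CVI - Pc) / (1 - Pc). The study used Microsoft Excel; the function name `modified_kappa` and the example panel are hypothetical.

```python
from math import comb


def modified_kappa(n_experts: int, n_agree: int) -> float:
    """Modified kappa for one item, given the panel size and the number of agreeing raters."""
    if n_experts < 1 or not 0 <= n_agree <= n_experts:
        raise ValueError("n_agree must lie between 0 and n_experts")
    # Item-level index: proportion of raters who agreed.
    i_cvi = n_agree / n_experts
    # Probability of chance agreement under a binomial model with p = 0.5.
    pc = comb(n_experts, n_agree) * 0.5 ** n_experts
    # Agreement adjusted for chance.
    return (i_cvi - pc) / (1 - pc)


# Hypothetical item on which six of seven experts agree (not study data).
kappa = modified_kappa(n_experts=7, n_agree=6)
print(round(kappa, 3))
```

Because Pc shrinks quickly as the panel grows, the adjustment matters most for small panels, which is one reason to report κ* next to the I-CVI when only six to ten experts are involved.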

Psychometric testing of the pre-final version of the HFSQIQ in a sample of the target population
This final step was to establish the initial psychometric properties of the newly designed questionnaire using a sample of the population of interest. To examine the internal structure of the questionnaire, the current study conducted a reliability test, in which reliability is represented by a high value of the internal consistency coefficient, commonly Cronbach's alpha (α) (27). In the present study, the formula presented by Bonett was used to estimate the sample size (31). The minimal sample size required to achieve at least 80.0% power is 22 hospitals based on an alpha value of 0.05. A sample of 27 respondents from 27 hospitals was employed to determine the internal consistency by conducting a reliability test. Internal consistency was considered acceptable when Cronbach's alpha exceeded 0.70 (29). Furthermore, the corrected item-total correlation and Cronbach's alpha if an item is deleted were examined to test the reliability of the newly created HFSQIQ. A good correlation between an item and the total excluding that item is indicated by a corrected item-total correlation greater than 0.5, and the minimum acceptable value is 0.30 (32).
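The corrected item-total correlation mentioned above can be sketched in Python as the Pearson correlation between each item and the sum of the remaining items. This is an illustrative sketch only: the helper names `pearson` and `corrected_item_total` and the small score table are hypothetical, and the study itself used SPSS.

```python
from math import sqrt
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def corrected_item_total(scores: List[List[float]]) -> List[float]:
    """scores[r][i] is respondent r's score on item i.

    Each item is correlated with the total of all OTHER items, so the item
    does not inflate its own correlation with the total.
    """
    n_items = len(scores[0])
    result = []
    for i in range(n_items):
        item = [row[i] for row in scores]
        rest = [sum(row) - row[i] for row in scores]
        result.append(pearson(item, rest))
    return result


# Hypothetical scores of five respondents on three items (not study data).
data = [[4, 5, 4], [3, 3, 2], [5, 5, 5], [2, 3, 2], [4, 4, 3]]
correlations = corrected_item_total(data)
# Flag items below the minimum acceptable value of 0.30 cited in the text.
weak_items = [i for i, r in enumerate(correlations) if r < 0.30]
print([round(r, 3) for r in correlations], weak_items)
```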

Preliminary HFSQIQ questionnaire design
The literature review identified a few models or methodologies designed to evaluate the performance of hospital food service operations. Additionally, a qualitative content analysis using semi-structured in-depth interviews with 24 staff from the dietetics and food service departments at two hospitals was undertaken to identify the indicators within six domains: food service operational management, food production and distribution management, patient/customer service management, equipment and facility management, staff management, and nutritional management. The domain of each indicator was determined conceptually by combining the qualitative research with the literature review. A summary of the development of the HFSQIQ is presented in Table 1.
The target population of this pilot study was selected from among management representatives of the food service departments of government, private, and teaching hospitals in Malaysia, including heads of department, catering officers/assistant catering officers, operating managers, and dietitians/catering dietitians. The respondents were chosen based on the researchers' inclusion and exclusion criteria. They had to be Malaysian citizens, hold a position on the administrative team of the dietetics and food service department, have at least six months of experience working in a hospital food service department, and be proficient in Malay and/or English. To facilitate data collection, the questionnaire was distributed via an online Google Form. A URL link was sent to each respondent by email or WhatsApp. Respondents were requested to evaluate the indicators based on their perceptions of the importance and performance of food service operations.

Data analysis
Reliability analysis was performed using the Statistical Package for Social Sciences (SPSS) version 26.0. Cronbach's alpha (α) was used to examine the internal consistency of the HFSQIQ for the two scales measuring the importance and performance of food service indicators. Additionally, Cronbach's alpha was calculated for the whole HFSQIQ. Internal consistency was considered acceptable if the Cronbach's alpha value exceeded 0.70 (29).
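For readers without access to SPSS, Cronbach's alpha and the "alpha if item deleted" statistic can be sketched in Python using the standard formula α = k / (k - 1) × (1 - Σ item variances / variance of total scores). The function names and the small score table are hypothetical and are not the study data.

```python
from statistics import variance
from typing import List


def cronbach_alpha(scores: List[List[float]]) -> float:
    """scores[r][i] is respondent r's score on item i; at least two items are required."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of the respondents' total scores.
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def alpha_if_item_deleted(scores: List[List[float]]) -> List[float]:
    """Recompute alpha k times, each time leaving one item out."""
    k = len(scores[0])
    return [
        cronbach_alpha([[v for j, v in enumerate(row) if j != i] for row in scores])
        for i in range(k)
    ]


# Hypothetical scores of five respondents on three items (not study data).
data = [[4, 5, 4], [3, 3, 2], [5, 5, 5], [2, 3, 2], [4, 4, 3]]
alpha = cronbach_alpha(data)
print(round(alpha, 3), alpha > 0.70)
print([round(a, 3) for a in alpha_if_item_deleted(data)])
```

An item whose removal raises alpha noticeably is a candidate for revision, whereas stable values across deletions suggest that no single item drives the scale's consistency.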

Expert panel judgement of HFSQIQ's validity
The average and universal agreement CVI scores are displayed in Table 2. The overall scale CVI (0.88) indicated good validity, while the S-CVI/Ave ranged from 0.79 to 0.93. The findings revealed that all the domains had S-CVI/UA scores below 0.80 (ranging from 0.17 to 0.50). The I-CVI ratings for 53 items (86.9%) were greater than or equal to 0.80, with a κ index greater than or equal to 0.74. In contrast, 8 items (13.1%) had I-CVI ratings below 0.80 and a κ index below 0.74. Six items (9.8%) had an I-CVI of 0.71 and a κ index of 0.17, one item (1.6%) had an I-CVI of 0.57 and a κ index of 1.05, and one item (1.6%) had an I-CVI of 0.29 and a κ index of 1.00.
The S-FVI/Ave ranged from 0.99 to 1.00 for the pre-final HFSQIQ English version, whereas the S-FVI/UA ranged from 0.89 to 1.00. Meanwhile, the S-FVI/Ave varied from 0.93 to 1.00 and the S-FVI/UA ranged from 0.79 to 1.00 for the Malay version of the pre-final HFSQIQ. For the English version, the I-FVI ratings for all items indicated excellent face validity, with values greater than 0.80 (ranging from 0.90 to 1.00). The items of the Malay version were likewise rated at or above 0.80, with a κ index greater than 0.74 (ranging from 0.90 to 1.00) (Table 3). Based on these findings, both the English and Malay versions of the pre-final HFSQIQ were appropriate, as each received excellent face validity ratings.
Five males (18.5%) and twenty-two females (81.5%), representing 27 hospitals, signed the online consent form and completed the questionnaire for the reliability study. The respondents' mean age (± standard deviation) was 34.74 (± 7.128) years. Most respondents were dietitians or catering dietitians (59.3%) (Table 4).
Scoring scale analysis was also performed for the final HFSQIQ by assessing the internal consistency and reliability of the importance and performance scales, as depicted in Tables 5 and 6, respectively. The Cronbach's alpha (α) values for the overall importance and performance scales were 0.97 and 0.98, respectively. On the importance scale, the Cronbach's alpha values for the food service operational management, food production and distribution management, patient/customer service management, equipment and facility management, and staff management subscales were 0.809, 0.937, 0.900, 0.919, and 0.939, respectively. On the performance scale, the values reported for the food service operational management, production and distribution management, patient/customer service management, equipment and facility management, and staff management subscales were 0.907, 0.972, 0.933 and 0.914. The questionnaire's Cronbach's alpha remained consistent, without a significant difference, if an item was deleted from the importance and performance scales, demonstrating that the newly developed questionnaire has excellent internal reliability.

Discussion
This study aimed to develop a valid and reliable questionnaire for measuring the performance of hospital food service operations in Malaysia. It described the item development process, translation, and validation of a newly proposed "Hospital Food Service Quality Improvement Questionnaire" (HFSQIQ). The TQM approach proposed by Balasubramanian for basic measurement and continuous improvement provided a framework for developing the questionnaire domains (15). In addition, a literature search on hospital food service quality improvement served as a guide for developing the questionnaire's items (18-21). Some prior studies applied the importance-performance analysis (IPA) developed by Martilla and James (33) to quantify quality attributes based on two measurement scales: 1) their value to operations (importance), and 2) the effectiveness of the operations or management (performance) (18,20,21). The development of questionnaire items should start with identifying significant elements of food service operation management from previous research in the same or related areas. Various qualitative research methods, such as focus groups, personal interviews, and managerial discretion, are essential for identifying potentially significant variables that might otherwise be overlooked.
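To make the two-scale logic of IPA concrete, the following Python sketch places each indicator in one of the four quadrants named by Martilla and James. It assumes that the quadrants are split at the grand mean of each scale, which is one common convention and not necessarily the one used in the cited studies. The function name `ipa_quadrants` and all scores are hypothetical.

```python
from statistics import mean
from typing import Dict


def ipa_quadrants(importance: Dict[str, float], performance: Dict[str, float]) -> Dict[str, str]:
    """Assign each indicator to an IPA quadrant, splitting at the grand means (an assumption)."""
    if importance.keys() != performance.keys():
        raise ValueError("both scales must cover the same indicators")
    importance_cut = mean(importance.values())
    performance_cut = mean(performance.values())
    quadrants = {}
    for name, imp in importance.items():
        perf = performance[name]
        if imp >= importance_cut and perf < performance_cut:
            quadrants[name] = "Concentrate here"       # important but underperforming
        elif imp >= importance_cut:
            quadrants[name] = "Keep up the good work"  # important and performing well
        elif perf < performance_cut:
            quadrants[name] = "Low priority"           # less important, lower performance
        else:
            quadrants[name] = "Possible overkill"      # less important, high performance
    return quadrants


# Hypothetical mean domain scores on a 5-point scale (not study data).
importance = {"operations": 4.6, "production": 4.8, "patient": 4.3, "equipment": 4.2, "staff": 4.4}
performance = {"operations": 4.1, "production": 3.6, "patient": 4.0, "equipment": 3.9, "staff": 3.4}
quadrants = ipa_quadrants(importance, performance)
for name, label in quadrants.items():
    print(f"{name}: {label}")
```

Splitting at the grand means always spreads indicators relative to one another; splitting at the scale midpoint instead would be an equally defensible choice when most ratings cluster at the high end.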
This study utilised several approaches, including a literature review, in-depth interviews, peer review, and expert panel judgment, to identify the relevant topics or domains and items for measuring the perceived importance and performance of hospital food service management.The first version of HFSQIQ, which included six domains with 61 items, was generated before the expert evaluation.
After expert review and content validation analysis, the number of items in the second version of the HFSQIQ was reduced to 57 while the six domains were maintained.
Item 10 was removed after discussion and agreement with the supervisory team, despite its excellent validity, because it was not considered a crucial component of the measurement scale. Additionally, since staff members other than dietitians cannot evaluate performance in the nutritional management domain and the questionnaire must focus on food service management, the supervisory team opted to eliminate Items 51, 52, 54, 55, 56, and 57. However, Item 53 was moved to the patient/customer service management domain because it is pertinent to that domain. Finally, after face validity analysis, the third version of the HFSQIQ was revised to five domains and 51 items.
Before the translation process, the first version of the HFSQIQ underwent content validation, which included a detailed evaluation of the items' relevance and clarity. Face validation followed the translation process to evaluate the understandability of the questionnaire by the target users or population. The content validity index is easy to use and understand, and it provides detailed information on the strengths and weaknesses of each item, so that items are deleted or modified for a valid reason (34). Face validity, on the other hand, focuses more on the design or structure of the questionnaire and its readability by the targeted users (35). In this study, two different indices, the S-CVI/UA and the S-CVI/Ave, were computed for content validation. The findings revealed a high level of content validity based on the average agreement among experts, although the experts' universal agreement ratings were less than 0.80 for all domains.
Questions were raised regarding the calculation of the agreement indices and the possibility of inaccuracy.
The agreement indices are just one step in determining content validity; other factors should also be used to decide whether to reject or modify items (34). Most scale developers employ the 0.80 criterion set by Davis (36) as the minimum acceptable S-CVI value for new instruments or questionnaires (27). For example, Polit et al. (24) stated that even if the content validity of the scale was insufficient using the S-CVI/UA approach (< 0.80), it is still sufficient to employ the S-CVI/Ave approach (> 0.80). As a supplement to the content validity index, the modified kappa index was utilised in this study to verify the findings, as it accounts for chance agreement. Including a multi-rater kappa coefficient in the content validation, as suggested by Wynd et al. (37), is a significant supplement to the CVI since the kappa coefficient offers information regarding the degree of agreement that exceeds chance. Therefore, the high content and face validity index scores indicate that the HFSQIQ was established appropriately and is suitable for hospital food service operations in Malaysia.
The translation, adaptation, and validation method used in this study aligned with the thorough and detailed guidelines developed by Sousa and Rojjanasrirat (28) to maintain the items' original impact, style, and meaning when they are translated from English into Malay. Since some technical terms are more commonly used in English than in Malay, more effort was required to translate them. For example, the direct translation of item 30, "providing various food choices for a patient with normal diet", from English to Malay as "menyediakan pelbagai makanan untuk pesakit diet normal" was accurate. However, based on review by the certified translator, the translation was harmonised to "menyediakan pelbagai pilihan makanan untuk pesakit dengan diet normal". Hence, the participation of professional translators proficient in both the target and source languages is required to ensure that the items are appropriately translated and understood by respondents.
Reliability is one of the important components of test quality. It concerns either an examinee's performance on the test items or the consistency or reproducibility of the results. A measurement is considered reliable if it consistently produces the same results under the same conditions (32). Internal consistency reliability was used in this study to evaluate the consistency of results across the HFSQIQ items. Cronbach's alpha is the most common internal consistency statistic used to identify the relationship between all test items (32). The HFSQIQ was evaluated among 27 respondents. Cronbach's alpha for the five subscales of the importance and performance scales ranged from 0.809 to 0.973. This result demonstrated that the newly developed HFSQIQ can be relied upon to evaluate the performance of hospital food service operations in Malaysia.
Several limitations were identified during data collection in this study. It was not feasible to collect face-to-face data in hospitals due to the COVID-19 pandemic; thus, a web-based data collection method was applied. The responses were obtained through phone interviews or virtual meetings using the Google Meet platform. The questionnaires were distributed using the Google Forms application. Although virtual meetings have a few advantages, some technical concerns, such as a sudden internet disruption or slowdown, may have resulted in communication problems. To solve this problem, separate virtual meetings were held with the respondents who experienced issues with their phone or internet connections. Another limitation is that the items derived from research conducted in other countries may not be relevant to Malaysian settings. Thus, a Delphi technique study is recommended for future research in developing tools to evaluate hospital food service performance. This could be achieved by soliciting the opinions of experts to identify a consensus position and present findings on a specific topic or set of questions based on the knowledge and experience of experts in the field (38). Finally, although internal consistency reliability is the most basic test used for newly developed instruments, additional reliability measures, such as test-retest reliability, would be beneficial for determining the consistency of a set of parameters.

Conclusion
The results of this study demonstrate the high validity and reliability of a newly developed questionnaire entitled the HFSQIQ. The HFSQIQ is a tool that can evaluate the importance and performance of hospital food service aspects or components for quality enhancement. Future research should include a criterion validation study, as this would show how well the HFSQIQ predicts the outcome of another measure or domain. This questionnaire can also be used in other businesses or industries that provide food services because it is a straightforward and practical tool for identifying food service-related components for continuous quality improvement of food service operations.

Table 1: Summary of the different versions of the HFSQIQ from Stages 1 and 2 of questionnaire development

Table 2: The I-CVI and modified kappa index for items in the first version of the HFSQIQ

Table 3: FVI of item understandability and modified kappa agreement index for the English and Malay versions of the pre-final HFSQIQ (N = 10)

Table 4: The characteristics of the respondents for the reliability study (N = 27)

Table 5: The internal consistency of the item-total statistics for the importance scale

Columns: Scale mean if item deleted; Scale variance if item deleted; Corrected item-total correlation; Cronbach's alpha if item deleted
*Overall Cronbach's alpha (α) = 0.970

Table 6: The internal consistency of the item-total statistics for the performance scale
