Socio-Demographic, Clinical and Lifestyle Determinants of Low Response Rate on a Self-Reported Psychological Multi-Item Instrument Assessing the Adults’ Hostility and its Direction: ATTICA Epidemiological Study (2002-2012)

Background: Missing data are common, especially in questionnaire-based population surveys and epidemiological studies, and they significantly affect the statistical power, efficiency and validity of the analyses conducted. The aim of the present work was to investigate the socio-demographic, lifestyle and clinical determinants of low response rate in a self-rating multi-item scale estimating individuals’ hostility and direction of hostility. Methods: 3042 apparently healthy volunteers residing in the Athens metropolitan area participated in the ATTICA epidemiological study; 1514 (49.8%) were men [46 years old (SD = 13 years)] and 1528 (50.2%) were women [45 years old (SD = 14 years)]. Hostility and direction of hostility were assessed with the Hostility and Direction of Hostility Questionnaire (HDHQ). Binary logistic regression with backward model selection was used to identify the key demographic, clinical and lifestyle determinants of a higher non-response rate in the HDHQ scale. Results: The vast majority of the participants (87.0%) had missing information in the HDHQ scale. Older age, lower educational level, poorer health status and unhealthy dietary habits were found to be significant determinants of a high non-response rate, while female participants were more likely to have missing data in the items of the HDHQ scale. Conclusions: The present work augments prior evidence that higher non-response to health surveys is significantly affected by responders’ background characteristics, and it motivates further research into the as yet unexplored mechanisms behind this association.


INTRODUCTION
Missing data are common, especially in questionnaire-based population surveys and epidemiological studies. The presence of missing data reduces the representativeness of the selected sample, causes bias, and decreases the a priori designed statistical power as well as the efficiency and validity of the conducted analyses; it therefore distorts inferences about the referent population [1,2]. Although several methodological frameworks have been proposed to reduce missingness during data collection in quantitative surveys, missing data remain, unfortunately, very common in research. The problem becomes more critical when the missing data concern a multi-item, health-related instrument (or scale, score) that is applied to measure a latent construct which is difficult or impossible to measure directly [3]. A variety of such instruments have been developed to measure the symptomatology of psychological disorders (such as anxiety, depression, stress, hostility) [4], dietary patterns and behaviors [5], and several clinical conditions [6]. Lack of information in even one of an instrument's items makes it impossible to calculate the total score of the instrument, rendering the whole procedure useless, since the individual cannot be correctly classified into the health class to which he or she belongs.
The main sources of item non-response are the type of research (e.g., topic of research, referent population), the structure of the questionnaire or instrument, the interviewer (e.g., easy acceptance of "don't know" (DK) answers), and the background characteristics of the respondents [7][8][9]. Identifying the profile of individuals with missing data is crucial for a study and its results to be valid. For example, individuals with missing data may differ systematically from those with complete information, either regarding the outcome of interest or their prognosis in general. A review of the sources of missingness in health surveys revealed that older and less educated individuals, as well as females and those with poorer health status, tend to have higher levels of missing information [10].
Although several methodologies have been proposed, missing data analysis is still not well studied or understood [11]. The aim of the present work was to investigate the demographic, clinical and lifestyle profile of those participants of the ATTICA epidemiological study who had missing data in a health-related scale estimating individuals' hostility and its direction.

Design
The ATTICA study is a prospective, observational cohort investigation that was initiated in 2001-2002 and included two follow-up examinations in 2006 and in 2011-2012, respectively [12][13][14].

Sample
At baseline (2001-2002), 3,042 apparently healthy volunteers residing in 40 municipalities of the Athens metropolitan area agreed to participate (75% participation rate among the n = 4,056 individuals initially approached); no significant differences in basic sociodemographic characteristics were observed between participants and non-participants (all p-values > 0.05) [15]. Of the enrolled participants, n = 1,514 (49.8%) were men [46 years old (SD = 13 years)] and n = 1,528 (50.2%) were women [45 years old (SD = 14 years)]. During the baseline examination, a detailed clinical evaluation was performed by trained physicians; all participants were free of CVD and other chronic diseases, as evaluated by the physicians of the study. Further details regarding the methods and the sampling procedure applied in the ATTICA study have been described previously [14].
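The percentages quoted above can be checked directly from the reported counts. The short sketch below only reuses the numbers given in the text; the variable names are illustrative.

```python
# Counts reported for the ATTICA baseline sample.
approached = 4056
enrolled = 3042
men, women = 1514, 1528

# Participation rate among those initially approached.
participation_rate = enrolled / approached

# Sex distribution among the enrolled participants.
share_men = men / enrolled
share_women = women / enrolled

print(f"participation rate: {participation_rate:.1%}")
print(f"men: {share_men:.1%}, women: {share_women:.1%}")
```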

Bioethics
The ATTICA study was approved by the Bioethics Committee of Athens Medical School. The study was carried out in accordance with the Declaration of Helsinki (1989) of the World Medical Association. All participants were informed about the study aims and procedures and provided written informed consent.

Socio-Demographic and Anthropometric Characteristics
The socio-demographic, anthropometric and lifestyle characteristics assessed included, among others, age (in years), sex (male/female), educational level (no formal studies/primary education (≤ 6 years)/secondary education (≤ 12 years)/higher education (> 12 years)) and body mass index (according to standard guidelines, overweight/obesity was defined as a body mass index ≥ 25 kg/m²).
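As an illustration of the overweight/obesity definition above, a minimal sketch follows. The function names are hypothetical, and the standard formula for body mass index (weight in kg divided by squared height in m) is assumed, since the text does not spell it out.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2 (standard formula, assumed here)."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def is_overweight_or_obese(bmi: float) -> bool:
    """Threshold stated in the text: BMI >= 25 kg/m^2."""
    return bmi >= 25.0


# Hypothetical participant: 90 kg, 1.80 m.
bmi = body_mass_index(90, 1.80)
print(round(bmi, 1), is_overweight_or_obese(bmi))
```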

Lifestyle Characteristics
Physical Activity Level
Participants' physical activity status was evaluated through the validated short Greek version (9 items) of the "International Physical Activity Questionnaire" (IPAQ). According to the reported physical activities, participants were classified into four categories: inactive, low (i.e., < 150 metabolic equivalent (MET)-minutes/week), moderate (150-300 MET-minutes/week) and Healthy Engaged Physically Active, HEPA (> 300 MET-minutes/week). For the purposes of the present work, participants were further classified into two main categories: inactive (sedentary) and physically active.
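The four-level and two-level physical activity classifications can be sketched as follows. The text does not give a numeric cut-off for "inactive", so this sketch assumes that inactive means zero reported MET-minutes/week; that assumption and the function names are illustrative only.

```python
def ipaq_category(met_minutes_per_week: float) -> str:
    """Four-level classification using the cut-offs quoted in the text.

    Assumption (not from the source): 'inactive' = 0 MET-minutes/week.
    """
    if met_minutes_per_week < 0:
        raise ValueError("MET-minutes/week cannot be negative")
    if met_minutes_per_week == 0:
        return "inactive"
    if met_minutes_per_week < 150:
        return "low"
    if met_minutes_per_week <= 300:
        return "moderate"
    return "HEPA"


def is_physically_active(met_minutes_per_week: float) -> bool:
    """Two-level classification: inactive (sedentary) vs physically active."""
    return ipaq_category(met_minutes_per_week) != "inactive"


for value in (0, 90, 150, 300, 450):
    print(value, ipaq_category(value), is_physically_active(value))
```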

Smoking Habits
Participants' smoking habits were evaluated through pack-years of smoking (a pack-year was defined as twenty cigarettes smoked daily for one year). Current smokers were defined as those who reported smoking at least one cigarette or any type of tobacco per day at the time of the interview, while former smokers were defined as those who had previously smoked but had quit within the year before enrolment.
Based on their smoking habits and physical activity status, participants were further classified into two main categories: those having a healthy lifestyle (non-smokers and physically active) and those having an unhealthy lifestyle (current/former smokers, or physically inactive).
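A minimal sketch of the pack-year computation and the combined lifestyle classification described above. The status labels ("never", "former", "current") and function names are hypothetical; mapping "non-smokers" to "never" is an assumption of this sketch.

```python
def pack_years(cigarettes_per_day: float, years_smoked: float) -> float:
    """One pack-year = twenty cigarettes smoked daily for one year."""
    if cigarettes_per_day < 0 or years_smoked < 0:
        raise ValueError("inputs must be non-negative")
    return cigarettes_per_day / 20 * years_smoked


def lifestyle(smoking_status: str, physically_active: bool) -> str:
    """Healthy = non-smoker AND physically active; otherwise unhealthy.

    Assumption: 'never' stands for the non-smokers of the text.
    """
    if smoking_status not in {"never", "former", "current"}:
        raise ValueError(f"unknown smoking status: {smoking_status!r}")
    if smoking_status == "never" and physically_active:
        return "healthy"
    return "unhealthy"


print(pack_years(30, 10))          # 1.5 packs/day for 10 years
print(lifestyle("never", True))
print(lifestyle("former", True))
```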

Dietary Assessment-Level of Adherence to the Mediterranean Diet
The MedDietScore, an instrument (scale) used to estimate the level of adherence to the Mediterranean diet, was applied to all participants [16]. This scale consists of 11 items estimating the frequency with which individuals consume several foods that are either close to the Mediterranean dietary pattern (e.g., fruits, vegetables, non-refined cereals and products) or far from it (e.g., meat and meat products). Higher values of this scale indicate adherence to the traditional Mediterranean diet, while lower values indicate adherence to the "Westernized" diet.
Further details regarding the methods and measurements applied in the ATTICA study have been previously detailed [14].

Clinical Characteristics
Clinical characteristics (hypertension, hypercholesterolemia, and diabetes mellitus) were assessed according to established physical examination procedures and pharmaceutical treatment [16]. In particular, diabetes mellitus was defined as a fasting blood sugar > 125 mg/dl or the use of antidiabetic medication and, thus, participants were classified as diabetic or non-diabetic. Participants whose average blood pressure levels, measured by the study's investigators through a standard procedure, were greater than or equal to 140/90 mm Hg, or who were under antihypertensive medication, were classified as having hypertension. Based on the total serum cholesterol levels measured, participants were classified into three groups (Group I: desirable levels (< 200 mg/dL); Group II: borderline levels (200-239 mg/dL); Group III: high levels (≥ 240 mg/dL)), with those belonging to Groups II and III characterized as hypercholesterolemic.
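The clinical definitions above translate into simple threshold rules, sketched below. Two points are assumptions of this sketch rather than statements from the study: the blood pressure criterion is read as systolic ≥ 140 or diastolic ≥ 90 mm Hg, and a cholesterol value of exactly 240 mg/dL is placed in the high group. Function names are illustrative.

```python
def has_diabetes(fasting_glucose_mg_dl: float, on_antidiabetic_medication: bool) -> bool:
    """Fasting blood sugar > 125 mg/dl or use of antidiabetic medication."""
    return fasting_glucose_mg_dl > 125 or on_antidiabetic_medication


def has_hypertension(systolic: float, diastolic: float,
                     on_antihypertensive_medication: bool) -> bool:
    """Assumed reading of '>= 140/90 mm Hg': either component at or above its cut-off."""
    return systolic >= 140 or diastolic >= 90 or on_antihypertensive_medication


def cholesterol_group(total_cholesterol_mg_dl: float) -> str:
    """Group I: desirable, Group II: borderline, Group III: high.

    Assumption: exactly 240 mg/dL is treated as 'high'.
    """
    if total_cholesterol_mg_dl < 200:
        return "desirable"
    if total_cholesterol_mg_dl < 240:
        return "borderline"
    return "high"


def is_hypercholesterolemic(total_cholesterol_mg_dl: float) -> bool:
    """Groups II and III, i.e. total cholesterol >= 200 mg/dL."""
    return cholesterol_group(total_cholesterol_mg_dl) != "desirable"


print(has_diabetes(130, False), has_hypertension(135, 92, False),
      cholesterol_group(215), is_hypercholesterolemic(215))
```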

Evaluation-Estimation of Individuals' Hostility
A translated and validated version of the Hostility and Direction of Hostility Questionnaire (HDHQ) was used, in order to assess the levels of hostility of the participants. The scale consists of 51 items and comprises 5 subscales, namely, AH (Urge to Act Out Hostility), CO (Criticism of Others), PH (Projected Delusional or Paranoid Hostility), SC (Self-Criticism) and DG (Delusional Guilt). Higher scores are indicative of more severe levels of hostility [17].

Outcomes
The outcome examined in the present work was the likelihood of a participant having at least one missing item in the HDHQ scale. Specifically, participants were classified as those without missing data and those with missing data in at least one item of the scale, in order to investigate the characteristics of the latter.
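The binary outcome can be derived from item-level responses as sketched below. This is an illustration under stated assumptions: missing items are encoded as `None`, and the simple sum used for the total is only a stand-in, not the actual HDHQ scoring key. It also shows why a single missing item prevents a total score from being computed.

```python
from typing import Optional, Sequence

N_ITEMS = 51  # number of HDHQ items stated in the text


def has_missing_item(responses: Sequence[Optional[int]]) -> bool:
    """Outcome: True if at least one of the 51 items is missing (None)."""
    if len(responses) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item responses")
    return any(r is None for r in responses)


def total_score(responses: Sequence[Optional[int]]) -> Optional[int]:
    """Placeholder total (plain sum); undefined when any item is missing."""
    if has_missing_item(responses):
        return None
    return sum(r for r in responses if r is not None)


def proportion_with_missing(sample: Sequence[Sequence[Optional[int]]]) -> float:
    """Share of participants with at least one missing item."""
    return sum(has_missing_item(r) for r in sample) / len(sample)


# Two hypothetical participants: one complete, one with a single skipped item.
complete = [1] * N_ITEMS
incomplete = [1] * N_ITEMS
incomplete[3] = None
print(total_score(complete), total_score(incomplete),
      proportion_with_missing([complete, incomplete]))
```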

Statistical Analysis
Continuous variables are presented as mean values (standard deviation, SD) and categorical variables are presented as relative frequencies (%).

Investigation of the Participants' Profile with Missing Data
Associations between categorical variables and the binary indicator of missing data in the HDHQ scale (no missing data/missing data in at least one item) were tested with the Pearson chi-square test. Odds ratios (OR) and their corresponding 95% confidence intervals (95% CI) were estimated through univariable and multivariable logistic regression analysis, which was used to identify the participants' characteristics that were significantly associated with the likelihood of having missing data in at least one item of the HDHQ. Backward model selection was used to determine the final significant predictors. All statistical analyses were performed in Stata, version 14.
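For a single binary characteristic, the Pearson chi-square test and the univariable odds ratio with its Wald 95% CI can be computed from a 2×2 table, as in the sketch below. This is not the study's Stata analysis and does not reproduce the multivariable model or the backward selection; the counts in the example are invented for illustration, and the function names are hypothetical.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("a row or column total is zero")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


def odds_ratio_wald_ci(a: int, b: int, c: int, d: int, z: float = 1.959964) -> tuple:
    """Odds ratio (a*d)/(b*c) with a Wald confidence interval on the log scale."""
    if min(a, b, c, d) <= 0:
        raise ValueError("all cells must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Invented counts: rows = exposed / unexposed, columns = missing / complete.
stat, p_value = chi_square_2x2(30, 10, 20, 40)
odds_ratio, lower, upper = odds_ratio_wald_ci(30, 10, 20, 40)
print(f"chi2 = {stat:.2f}, p = {p_value:.5f}")
print(f"OR = {odds_ratio:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

With one binary predictor, this odds ratio and Wald interval coincide with those of a univariable logistic regression; adjusted estimates require fitting the multivariable model in a statistical package.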

Sample Characteristics
As presented in Table 1, the vast majority of the participants (87.0%) had missing information in the HDHQ scale. Almost half of the participants were younger than 45 years; 50.2% were females and at least 8 out of 10 (81.5%) had at least secondary education (≥ 7 years of education). The prevalence of the clinical conditions studied was 58.9% for overweight/obesity, 30% for hypertension, 6.9% for diabetes mellitus and 39.6% for hypercholesterolaemia, while almost 8 out of 10 participants (77.9%) were either smokers or physically inactive (unhealthy lifestyle).

Participants' Profile with Missing Data
Participants with missing data in at least one item of the HDHQ scale were older, less educated and had unhealthier dietary habits, and were more likely to be diabetic, hypertensive and to have higher levels of total serum cholesterol (Table 1). Based on the results of the multivariable logistic regression model, participants with missing data were more likely to be at least 45 years old (OR = 2.47; 95% CI = 1.90-3.22), female (men vs. women: OR = 0.64; 95% CI = 0.50-0.81) and less educated (14+ years vs. 0-6 years: OR = 0.55, 95% CI = 0.36-0.83; 7-13 years vs. 0-6 years: OR = 0.64, 95% CI = 0.43-0.97). As for their clinical condition, participants with missing data had 1.3 times higher odds of being overweight/obese (OR = 1.29; 95% CI = 1.01-1.65) and hypertensive (OR = 1.30; 95% CI = 0.97-1.74), while they were also found to have 1.5 times higher odds of having high total serum cholesterol levels (OR = 1.49; 95% CI = 1.15-1.92) (Figure 1).

Notes:
Diabetes mellitus was defined as a fasting blood sugar > 125 mg/dl or the use of antidiabetic medication; patients whose average blood pressure levels were greater than or equal to 140/90 mm Hg or who were under antihypertensive medication were classified as hypertensives; the definition of hypercholesterolemia was based on the total serum cholesterol levels (≥ 200 mg/dl). OR = Odds Ratio; CI = Confidence Interval; Ref. category = Reference category (***p < 0.001, **p < 0.05, *p < 0.10).
Results are based on the multivariable logistic regression analysis; OR = Odds Ratio; CI = Confidence Interval; p = p-value; overweight/obesity was defined as Body Mass Index (BMI) ≥ 25 kg/m²; patients whose average blood pressure levels were greater than or equal to 140/90 mm Hg or who were under antihypertensive medication were classified as hypertensives; the definition of hypercholesterolemia was based on the total serum cholesterol levels (≥ 200 mg/dl).
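When reading the odds ratios reported above, two small computations can help: reversing the reference category (for example, expressing the men vs. women estimate as women vs. men) and checking whether a 95% CI excludes 1. The sketch below only reuses estimates quoted in the text; the helper names are hypothetical.

```python
def invert_odds_ratio(odds_ratio: float, lower: float, upper: float) -> tuple:
    """Swap the reference category: take reciprocals and exchange the CI bounds."""
    return 1 / odds_ratio, 1 / upper, 1 / lower


def ci_excludes_one(lower: float, upper: float) -> bool:
    """True when the interval lies entirely above or entirely below 1."""
    return lower > 1 or upper < 1


# Men vs. women: OR = 0.64 (95% CI 0.50-0.81), re-expressed as women vs. men.
women_vs_men = invert_odds_ratio(0.64, 0.50, 0.81)
print("women vs. men:", tuple(round(x, 2) for x in women_vs_men))

# Overweight/obesity and hypertension estimates as quoted in the text.
print("overweight/obesity CI excludes 1:", ci_excludes_one(1.01, 1.65))
print("hypertension CI excludes 1:", ci_excludes_one(0.97, 1.74))
```

The hypertension interval (0.97-1.74) includes 1, so that estimate is weaker evidence of an association than the others.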

DISCUSSION
The present work aimed to identify the profile of individuals with missing data in a multi-item instrument that is widely used to estimate individuals' hostility and direction of hostility. Data analysis revealed that the presence of missing data in such a structured questionnaire was significantly associated with various demographic, clinical and lifestyle characteristics, suggesting that the missingness mechanism could be considered MAR (Missing At Random) or MNAR (Missing Not At Random) rather than completely at random; researchers should therefore consider special treatment of missing data before further analyses. In general, a higher non-response rate was significantly associated with older age, lower educational level, poorer health status and unhealthier dietary habits, while a higher likelihood of having missing data was significantly associated with female sex. Despite the limitations of the present cross-sectional analysis, our findings reveal the profile of participants to whom researchers should pay special attention when collecting such psychological data.
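For reference, the missingness mechanisms mentioned above are commonly formalized as follows. The notation is an assumption of this note and does not appear in the study: R is the missingness indicator, and Y_obs and Y_mis are the observed and unobserved parts of the data.

```latex
% R: missingness indicator; Y_obs, Y_mis: observed and unobserved data (notation assumed)
\begin{align*}
\text{MCAR:}\quad & P(R \mid Y_{\text{obs}}, Y_{\text{mis}}) = P(R) \\
\text{MAR:}\quad  & P(R \mid Y_{\text{obs}}, Y_{\text{mis}}) = P(R \mid Y_{\text{obs}}) \\
\text{MNAR:}\quad & P(R \mid Y_{\text{obs}}, Y_{\text{mis}}) \text{ depends on } Y_{\text{mis}} \text{ even given } Y_{\text{obs}}
\end{align*}
```

An association between missingness and observed characteristics such as age or education argues against MCAR, but by itself it cannot distinguish MAR from MNAR.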
There is a substantial body of literature investigating the characteristics of individuals with missing data in surveys; however, this is the first study that focuses on the characteristics of individuals with missing data in a widely used multi-item instrument such as the HDHQ. As reported by previous studies, dropout risk can be viewed as the co-occurrence of common demographic (e.g., sex, age), social (e.g., ethnicity, marital and educational status), personal (e.g., attitude, knowledge, time constraints, financial or other resource limitations, lack of support from family and friends, stigma or embarrassment about health-related problems) and need factors or barriers (e.g., mental health disorders, comorbidities, etc.) [18,19]. Missing data in psychological instruments are widely acknowledged to occur even more frequently when the interviewers are mental health practitioners, since research participants are less likely to share private information about their mental health status [20]. More specifically, stigma and lack of trust in mental health providers constitute crucial obstacles [21]. According to recent studies on the avoidance of help-seeking, individuals' assumptions about how others see them and whether they are considered mentally ill are likely to be important determinants [22]. Common beliefs ('conventional understanding') that mental health problems amount to incurable mental instability contribute both to the cultivation of fear of negative evaluation and to concern about the possible consequences of being evaluated as a mental health client within the care system [23].
The present results are in accordance with a previous study reporting that several demographic, clinical and lifestyle factors, including age, body mass index, physical activity and parity, are significantly associated with the number of items left blank in a structured questionnaire [24]. Furthermore, lower educational level, as a proxy measure of the participants' socio-economic status, was significantly associated with higher non-response, which agrees with the study of Wilks et al. reporting that individuals in lower socioeconomic groups tend to present higher non-response rates in health surveys [25]. Education is considered a key factor in the adaptive expression of anger and its components. In particular, education and other learning opportunities can provide better access to knowledge and stronger critical thinking skills, as well as problem-solving, self-control, perseverance, conscientiousness and negotiation skills [26]. Thus, more educated people are likely to have a greater ability to understand, judge and act upon healthcare information [27]. Previous studies have also stated that low-educated women drop out of prevention programs more frequently than the more highly educated, resulting in low retention rates [28]. In addition, the present findings seem to agree with the study conducted by Ying, who found that younger and more highly educated men were more likely to respond to the entire structured instrument, while middle-aged men and older women had the highest non-response rates [29]. According to Mody et al., older individuals are at greater risk of item non-response through missing or skipping items, due either to cognitive impairment or to physical problems such as vision impairment [30].
Moreover, participants' poorer health status was also associated with a higher non-response rate in the HDHQ scale, which is in accordance with other studies reporting higher non-response rates in individuals with lower subjective health and poorer physical, cognitive and psychological functioning [15,31,32].
Our finding of a higher non-response rate among females contrasts with various previous studies reporting that female participants are more likely to participate in surveys [33][34][35]. Similarly, several past studies have outlined that men are more likely than women to reject psychological assessment because they underestimate disease risk, believe that they have less control over their health or better health than women, deny that their behavior is problematic and, finally, feel that they are stronger than women and therefore have to control their emotions ('hegemonic masculinity'), with the exception of anger, which they consider a means of influence [36][37][38]. Nevertheless, other studies have indicated that women are significantly more likely than men to drop out early from healthcare services [39]. Additionally, one study found that women who completed all stages of the research had a different profile from women who completed only the initial assessments [40]. Traditionally, women are more likely to internalize their emotions, while men are more likely to externalize them [41]. In particular, women are expected to be more polite and modest than men, and if they do not follow this prescriptive stereotype, they might elicit negative impressions and responses from others [42]. The expression of anger by women can be considered a violation of gender-stereotypic prescriptions, and angry women can be evaluated more negatively than angry men [43]. On this basis, we could assume that female participants may not have completed the HDHQ scale for fear of negative evaluation and/or as a means of suppressing their anger, while men have lower rates of missing data in the items of the HDHQ scale because they are not afraid of scoring high in hostility.

LIMITATIONS
To the best of our knowledge, this is the first study to investigate in such depth the profile of individuals with missing data in the Hostility and Direction of Hostility Questionnaire. No imputation methodology that would strengthen the empirical data analyses was applied, as this was not the purpose of the present work. Moreover, the conclusions should be considered in light of some limitations, such as the cross-sectional nature of the data, which does not allow causal associations to be drawn.

CONCLUSIONS
In summary, older and less educated individuals, as well as those with morbidities and unhealthier lifestyle habits, constitute a risk group for higher non-response rates when such data are collected; researchers should therefore pay special attention when interviewing them, in order to keep the response rate high.

DIRECTION FOR FUTURE RESEARCH
Anger suppression, low educational status and obesity are commonly proposed as risk factors not only for physical health problems but also for depression, especially in women. Hence, future studies should explore the role of depression in missing data in the assessment of hostility, since those who experience symptoms of psychological distress tend to avoid health services in general (e.g., preventive care) [44,45]. Finally, apart from medical factors in physical health, our findings indicate that avoidance behavior in relation to psychological assessment could also be linked to clinical conditions such as hypertension and elevated total serum cholesterol levels. This, however, remains unclear, and further work is needed to address the questions raised by the present research.

CONFLICT OF INTEREST
Professor Demosthenes Panagiotakos is a member of the Journal's Editorial Board. The remaining authors declare that they have no conflict of interest.

FUNDING
The ATTICA study is supported by research grants from the Hellenic Cardiology Society (HCS2002) and the Hellenic Atherosclerosis Society (HAS2003). This work was supported by the Ageing Trajectories of Health: Longitudinal Opportunities and Synergies (ATHLOS) project, which has received funding from the European Union's Horizon 2020 research and innovation program under grant agreement No 635316.

AUTHOR CONTRIBUTIONS
TT wrote the manuscript (interpretation of the results and discussion) and performed the statistical analysis. CV and TP contributed to the interpretation of the results and discussion. DBP was responsible for the study's design and implementation and critically reviewed the manuscript. All authors have read and agreed to the published version of the manuscript.