#### **2.3 Methods of data collection**

A support letter was obtained from the College of Natural and Computational Science, Biology Department, Hawassa University, to secure permission from the respective authorities to select the study participants and conduct interviews in extension offices, health institutions and hospitals. Four experienced data collectors received two days of training. The training covered the purpose of the study, the meaning and interpretation of the scientific terminology used in each question, and the procedure for obtaining consent from every participant. The data collectors spoke the local languages (Amharic and Sidamingia). They conducted door-to-door visits, guided by a list of household members, to collect responses and complete the questionnaire. The household list was coded, and names were omitted to preserve anonymity and confidentiality. The researchers conducted the in-depth interviews with officials and physicians with the help of a professional translator who spoke both Amharic and English.

#### **2.4 Data analysis**

All data were coded and analyzed using SPSS version 25. Descriptive statistics were used to summarize frequencies and proportions, and the results were presented in tables and charts. A multiple logistic regression model was used to determine the effect of the independent variables on farmers' knowledge and on the prevalence of self-reported toxicity symptoms. Multiple logistic regression was chosen because it is a powerful method for modeling a binomial outcome with categorical data [19]. The chi-square test was first used to measure the association between the independent and dependent variables, and the Hosmer and Lemeshow test was used to check whether the model fit the data. Results were summarized as odds ratios with 95% confidence intervals, at a significance level of 0.05.
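To make these summary statistics concrete, the following minimal sketch computes an odds ratio, its 95% confidence interval and a Pearson chi-square test for a single binary predictor cross-tabulated against a binary outcome. It is an illustration only, not the SPSS procedure used in the study. For a single binary predictor, the unadjusted logistic regression odds ratio equals the cross-product ratio shown here, whereas adjusted odds ratios from a multiple model would generally differ. The function name `two_by_two_summary` and the cell counts are hypothetical.

```python
import math


def two_by_two_summary(a, b, c, d, z=1.96):
    """Summarize a 2x2 table (hypothetical helper, not from the study).

    a: exposed, outcome present     b: exposed, outcome absent
    c: unexposed, outcome present   d: unexposed, outcome absent
    All four cells must be non-zero.
    """
    # Cross-product odds ratio.
    odds_ratio = (a * d) / (b * c)

    # Wald (Woolf) confidence interval, built on the log-odds scale.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    ci_lower = math.exp(log_or - z * se_log_or)
    ci_upper = math.exp(log_or + z * se_log_or)

    # Pearson chi-square statistic for a 2x2 table (1 degree of freedom,
    # no continuity correction).
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))

    return {
        "odds_ratio": odds_ratio,
        "ci_lower": ci_lower,
        "ci_upper": ci_upper,
        "chi2": chi2,
        "p_value": p_value,
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    summary = two_by_two_summary(a=20, b=10, c=10, d=20)
    for name, value in summary.items():
        print(f"{name}: {value:.4f}")
```

An association would be read as statistically significant at the 0.05 level when the p-value is below 0.05, which for a Wald interval corresponds to a 95% confidence interval that excludes 1.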

#### **2.5 Sampling technique and sample size determination**

The study employed a multi-stage sampling technique because it allows sampling in more than one stage and combining several sampling techniques. The multi-stage sampling in this study entailed four stages. In the first stage, the Tula sub-city was purposively selected because it is relatively accessible to researchers. In the second stage, the Finchawa and Tullo rural kebeles were also purposively selected because of the considerable number of farmlands in both kebeles, the extensive use of pesticides on those farmlands, and their strategic location around Lake Hawassa; both kebeles lie within the catchment area of Lake Hawassa. In the third stage, simple random sampling was used to select farmers from both kebeles. All participants agreed to take part in the study by signing informed consent forms. In the fourth stage, convenience sampling was used to select one official from the extension office in Finchawa, one official from the extension office in Tullo, one physician from the Bushullo Health Institution and one physician from the Referral Hospital.

The farmers' representative for the Finchawa and Tullo rural kebeles estimated that 100 farmers use pesticides on their farmland, distributed as follows: Finchawa 49% and Tullo 51%. The sample size was determined using the formula of Kothari [20] at a 95% confidence level. Accordingly, the total sample, including a 10% contingency, was 73.
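The text does not reproduce the formula or all of its inputs. One finite-population form commonly attributed to Kothari is n = z²·p·q·N / (e²·(N − 1) + z²·p·q), where N is the population size, z the standard normal value for the chosen confidence level, p the assumed proportion, q = 1 − p, and e the margin of error. The sketch below assumes this form. Only N = 100, the 95% confidence level and the 10% contingency come from the text; the values p = 0.5 and e = 0.05 are illustrative assumptions, so the output is not expected to reproduce the reported sample of 73.

```python
import math


def kothari_sample_size(population, z=1.96, p=0.5, e=0.05):
    """Finite-population sample size (assumed form of the Kothari formula).

    p and e are illustrative defaults, not values stated in the study.
    Returns the unrounded sample size.
    """
    q = 1.0 - p
    numerator = z ** 2 * p * q * population
    denominator = e ** 2 * (population - 1) + z ** 2 * p * q
    return numerator / denominator


def with_contingency(n, rate=0.10):
    """Inflate a sample size by a contingency rate and round up."""
    return math.ceil(n * (1.0 + rate))


if __name__ == "__main__":
    n = kothari_sample_size(population=100)
    print(f"Base sample size: {n:.2f}")
    print(f"With 10% contingency: {with_contingency(n)}")
```

The result is sensitive to the margin of error: a larger e reduces the required sample, so the study's own choice of e and p determines the final figure.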

#### **2.6 Pilot testing**

The questionnaire was piloted with 20 farmers (10 from Finchawa and 10 from Tullo) who did not participate in the main study. All comments received regarding sentence wording, vague sentences and unclear scientific ideas were addressed to ensure the validity of the items. The reliability of the instrument was also assessed. The reliability of the binary items was tested using Kuder–Richardson 20 (KR-20), which can be applied to any test item responses that are dichotomously scored [21]. The resulting value suggested a good level of reliability. Further, the internal consistency of the Likert scale items was tested using Cronbach's alpha. Cronbach's (1951) alpha was developed to evaluate items scored in multiple answer categories [21]. The resulting value also indicated a good level of reliability.
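The two reliability coefficients can be sketched as follows, using their standard textbook definitions with population variances. This is an illustration, not the computation performed in the study; the function names and the response matrices are hypothetical, with each row representing one respondent and each column one item.

```python
from statistics import pvariance


def cronbach_alpha(rows):
    """Cronbach's alpha for items scored in multiple categories."""
    k = len(rows[0])  # number of items
    item_variances = [pvariance(col) for col in zip(*rows)]
    total_variance = pvariance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def kr20(rows):
    """Kuder-Richardson 20 for dichotomous (0/1) items."""
    n = len(rows)     # number of respondents
    k = len(rows[0])  # number of items
    # Sum of p*q over items, where p is the proportion scoring 1.
    sum_pq = 0.0
    for col in zip(*rows):
        p = sum(col) / n
        sum_pq += p * (1 - p)
    total_variance = pvariance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum_pq / total_variance)


if __name__ == "__main__":
    # Hypothetical pilot responses, for illustration only.
    binary_items = [
        [1, 1, 1, 1],
        [1, 1, 1, 0],
        [1, 1, 0, 0],
        [1, 0, 0, 0],
        [0, 0, 0, 0],
    ]
    likert_items = [
        [4, 5, 4],
        [3, 3, 4],
        [2, 2, 3],
        [5, 5, 5],
        [1, 2, 2],
    ]
    print(f"KR-20: {kr20(binary_items):.3f}")
    print(f"Cronbach's alpha: {cronbach_alpha(likert_items):.3f}")
```

For 0/1 data the variance of an item equals p·q, so KR-20 is the special case of Cronbach's alpha for dichotomous items, and the two functions return the same value on a binary matrix.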
