## **2.4. Outcome measures**

*Primary outcome:* Pain was assessed using a visual analogue scale (VAS) and the Laitinen scale. The VAS is a 10 cm line on which the left end represents 0 (no pain) and the right end represents 10 (unbearable pain). Participants marked on the line their current level of pain after their usual daily activities. The values, in centimeters, were recorded for statistical analysis. The same therapist, who was blinded to the treatment, administered the measurements for all participants.

*Secondary outcome:* To obtain a specific index of disability, the WOMAC was used as a subjective measure of perceived health. It is a questionnaire with three sections that can be completed in a few minutes. It contains 24 questions: 5 on pain, 2 on stiffness, and 17 on physical function [10, 11]. In our study, we used the more detailed Likert-scale version of the WOMAC, in which patients rate each item on a five-point scale (0 = none, 1 = mild, 2 = moderate, 3 = severe, and 4 = extreme). A higher score indicates a lower level of perceived health. All the scores were summed and coded. When answering the questions, the patients described their state during the past 3 days. The same therapist, who was blinded to the treatment, administered the measurements for all participants.
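The summation of WOMAC item scores described above can be illustrated with a minimal sketch. The function name `score_womac` and the item ordering (pain items first, then stiffness, then physical function) are assumptions made for illustration; the text does not specify how the scores were coded in practice.

```python
from typing import Dict, Sequence

# Assumed item layout (not stated in the text): 5 pain items first,
# then 2 stiffness items, then 17 physical-function items.
SUBSCALES = {
    "pain": range(0, 5),
    "stiffness": range(5, 7),
    "function": range(7, 24),
}


def score_womac(responses: Sequence[int]) -> Dict[str, int]:
    """Sum 24 Likert responses (0-4) into subscale and total scores."""
    if len(responses) != 24:
        raise ValueError("expected 24 item responses")
    if any(r not in (0, 1, 2, 3, 4) for r in responses):
        raise ValueError("each response must be an integer from 0 to 4")

    # Sum the items belonging to each subscale.
    scores = {
        name: sum(responses[i] for i in items)
        for name, items in SUBSCALES.items()
    }
    # Total is the sum of the three subscales (possible range 0 to 96).
    scores["total"] = sum(scores.values())
    return scores
```

With 24 items scored 0 to 4, the total ranges from 0 to 96, and a higher total corresponds to a lower level of perceived health.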

## **2.5. Data analysis**

The sample size was determined a priori, given an anticipated Cohen's d effect size of 0.8, a significance level of 5%, and a desired statistical power of 80%. We estimated that a minimum of 26 participants was needed in each group. The data were analyzed with descriptive statistics (mean and standard deviation (SD) for each of the two groups, mean (SD) within-group differences, and mean between-group differences with 95% confidence intervals (95% CI)) and with inferential techniques. The mean within-group differences and the mean between-group differences (95% CI) were calculated for each outcome from the change scores (i.e., after minus before scores). The Shapiro-Wilk test indicated that the VAS and WOMAC data were not normally distributed. Between-group differences were therefore analyzed using the Mann-Whitney U test. To describe the magnitude of the difference between treatments, the between-group effect size was calculated using Cohen's d and classified as small (d = 0.2), moderate (d = 0.5), or large (d = 0.8) [12]. Statistical significance was set at a two-tailed p-value of 0.05. The analyses were performed by a blinded, independent statistician according to a prespecified statistical analysis plan on an intention-to-treat basis [13].
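As a rough check on the sample size and effect size calculations, the following sketch uses only the Python standard library. It is not the authors' procedure: `n_per_group` uses a normal approximation with a commonly used small-sample correction term rather than an exact noncentral t calculation, and `cohens_d` assumes the pooled-SD form of Cohen's d. All function names are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance


def n_per_group(d: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate per-group n for a two-sided, two-sample comparison.

    Normal approximation plus a small-sample correction (z_alpha^2 / 4);
    an assumption for illustration, not an exact power calculation.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-tailed critical value
    z_beta = z.inv_cdf(power)
    n = 2 * (z_alpha + z_beta) ** 2 / d ** 2 + z_alpha ** 2 / 4
    return math.ceil(n)


def change_scores(before, after):
    """Change score per participant: after minus before."""
    return [a - b for b, a in zip(before, after)]


def cohens_d(group_a, group_b) -> float:
    """Cohen's d between two groups using the pooled standard deviation."""
    na, nb = len(group_a), len(group_b)
    pooled_var = (
        (na - 1) * variance(group_a) + (nb - 1) * variance(group_b)
    ) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# Inputs taken from the text: d = 0.8, alpha = 0.05, power = 0.80.
print(n_per_group(0.8))  # 26
```

Under these assumptions the approximation gives 26 participants per group, which agrees with the minimum stated in the text. In use, `cohens_d` would be applied to the change scores of the two groups, as returned by `change_scores`.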
