#### *2.1.3 Map recommendations*

We coded and classified the recommendations into a structured vocabulary using a deductive approach. The recommendations were mapped into the seven categories of Peter Morville's [37] usability honeycomb, as shown in **Figure 2**. Usability is about designing products to be effective, efficient, and satisfying, and it encompasses user experience design. This may include general aspects that affect everyone and do not disproportionately affect people with disabilities. The caveat here is that usability practice and research often do not sufficiently address the needs of people with disabilities [24].

**Figure 2.** *User experience honeycomb [37].*

<sup>3</sup> http://www.seniorsonly.club/

<sup>4</sup> https://www.seniorforums.com/

<sup>5</sup> https://gs.statcounter.com/os-market-share/mobile/worldwide

We also considered two alternatives to Morville's definition of usability. First, in 1994, Jakob Nielsen [38] identified five attributes that together constitute usability: Learnability, Efficiency, Memorability, Errors, and Satisfaction. Then, in 2003, Whitney Quesenbery [39] proposed the 5E model, which describes usability as Effective, Efficient, Engaging, Error Tolerant, and Easy to Learn. Subsequently, in 2004, Peter Morville used a 'honeycomb' to illustrate usability with seven facets: Useful, Usable, Findable, Valuable, Desirable, Accessible, and Credible. We chose the latter because it is the most recent and the most granular of the three (seven categories compared to five), and because it is well recognised and highly cited.

#### *2.1.4 Validate recommendations using inter-rater reliability*

Different coefficients can be used to evaluate agreement in the classification of recommendations between the three raters, or inspectors.

**Proportion Agreement-** A straightforward approach to evaluating agreement is to consider the proportion of ratings on which the raters agree. This is, however, considered naive, as some agreement may occur solely by chance. According to [40], the proportion or percentage of agreement tends to produce higher values than other measures of agreement, and its use is discouraged because scientific measurement should err on the side of conservatism rather than liberality. In addition, the proportion of agreement can be unreliable [41]. Therefore, we did not choose proportion or percentage agreement as our evaluative measure.

**S-Coefficient-** Another option for evaluating agreement was the S-coefficient proposed by [42]. However, it assumes that chance agreement arises from raters assigning sub-categories/classes to the recommendations at random and at equal rates.
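For context, the S-coefficient (in its commonly stated form; we paraphrase rather than quote [42]) fixes chance agreement at the level implied by $k$ equally likely categories, adjusting the observed proportion of agreement $P_o$ as:

$$S = \frac{P_o - 1/k}{1 - 1/k}$$

With our seven honeycomb categories, chance agreement would therefore be fixed at 1/7 regardless of how the raters actually distribute their labels, which is precisely the assumption we found unrealistic.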

**Cohen Kappa-** An alternative definition of chance agreement reflects the raters' tendency to distribute classifications in a particular way. This seems a reasonable assumption *a priori* in an inspection context, and it is the assumption underlying Cohen's Kappa coefficient [43]. We chose it because the three researchers, based on their theoretical knowledge of the domain, would be expected to classify the recommendations in a specific way, and because, with seven categories in which to classify the text snippets from the forums, there was ample room for error and for differences to emerge. It has been established that good agreement as measured by Cohen's Kappa can produce slightly higher reliability results when there are more than seven categories [44]. Moreover, the Kappa coefficient is widely used in the social and medical sciences and has thousands of citations to date [45]. In the medical domain, it has been presented as a measure of agreement in reliability studies [46]. A variant called weighted Kappa [47] was also considered, but it is most useful for non-nominal scales and for cases where the relative costs of disagreement can be quantified. The analysis of these three options led us to use Cohen's Kappa [48] because it is a robust and useful statistical tool for inter-rater reliability testing.
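To make the contrast with raw proportion agreement concrete, the following minimal Python sketch (illustrative only; the ratings are hypothetical, not drawn from our data, and Kappa is computed for a single rater pair, as in Cohen's original formulation) implements $\kappa = (P_o - P_e)/(1 - P_e)$, where the chance agreement $P_e$ is derived from each rater's own label distribution:

```python
from collections import Counter

def proportion_agreement(labels_a, labels_b):
    """Share of items on which two raters assign the same label."""
    return sum(a == b for a, b in zip(labels_a, labels_b)) / len(labels_a)

def cohen_kappa(labels_a, labels_b):
    """Cohen's Kappa for two raters over nominal categories."""
    n = len(labels_a)
    p_o = proportion_agreement(labels_a, labels_b)
    # Chance agreement: for each category, the probability that both
    # raters pick it independently, given their marginal frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical classifications of ten snippets into honeycomb categories.
rater_1 = ["Usable", "Useful", "Findable", "Usable", "Credible",
           "Accessible", "Usable", "Desirable", "Useful", "Valuable"]
rater_2 = ["Usable", "Useful", "Findable", "Useful", "Credible",
           "Accessible", "Usable", "Desirable", "Usable", "Valuable"]

print(f"Proportion agreement: {proportion_agreement(rater_1, rater_2):.2f}")
print(f"Cohen's Kappa:        {cohen_kappa(rater_1, rater_2):.2f}")
```

On these made-up labels the raw agreement is 0.80 while $\kappa \approx 0.76$, because Kappa discounts the matches expected by chance; with three raters, Kappa is typically computed for each pair.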

#### *2.1.5 Transform recommendations to design patterns*

Once we had identified and developed a consolidated set of recommendations, we structured them into a design pattern format consisting of the following sections: Problem, Rationale, Solution, Type, Sub-type, Related Patterns, and References/Evidence. This format is shown in [36], which explains each heading and what is included in each section.
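As an illustration only, the pattern format behaves like a fixed record; the Python sketch below mirrors the section headings listed above (the class name, field names, and example values are ours, not taken from [36]):

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class DesignPattern:
    """One recommendation recast in the fixed pattern format described above."""
    problem: str                 # the usability problem being addressed
    rationale: str               # why it matters, with supporting reasoning
    solution: str                # the recommended design response
    type: str                    # honeycomb category, e.g. "Usable"
    sub_type: str                # finer-grained grouping within the type
    related_patterns: List[str] = field(default_factory=list)
    references: List[str] = field(default_factory=list)  # evidence sources

# A hypothetical instance, for illustration only.
touch_targets = DesignPattern(
    problem="Small touch targets cause mis-taps for older users.",
    rationale="Declines in fine motor control make small controls error-prone.",
    solution="Use large, well-spaced touch targets for primary actions.",
    type="Usable",
    sub_type="Touch interaction",
    references=["Forum snippets", "Prior literature"],
)
```

Because every pattern shares this structure, readers can scan straight to the section they need.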

Qualitative evidence suggests that design patterns can aid maintenance, but they do not appear to help novice designers learn about design [49]. Bieman, in particular, observes that 'there is very little empirical evidence of the claimed benefits of design patterns and other design practices when applied to real development projects' [50]. This implies that simply using patterns does not ensure good design; they must be used appropriately [49]. The initial attempt to develop design patterns was made by Christopher Alexander to document and discuss design solutions in building architecture [51]. Design patterns became popular among software developers after the publication of the "Gang of Four" book [52]. A pattern is defined as:

*"A pattern is a structured document comprising of set of pre-defined sections; this means that all patterns in a given pattern language have the same general structure, making it easy for readers to find information"* [53].

There are subtle differences in the structure of the design patterns used in existing research. This paper follows the format proposed by [54], as it is also used in numerous studies, e.g., [53, 55, 56].

#### **2.2 Data analysis strategy**

In qualitative research, data collection, data analysis, and report writing are not always completed as distinct steps; they are often interrelated and occur simultaneously throughout the research process [57]. We used thematic analysis to develop themes that were emphasised during the Think Aloud sessions or on the ageing forums. Thematic analysis is a method for identifying, analysing, organising, describing, and reporting themes found within a data set. Braun and Clarke (2006) [58] present a linear, six-phase method for thematic analysis: becoming familiar with the data, generating initial codes, searching for themes, reviewing themes, defining and naming themes, and producing the report. In practice, it is an iterative and reflective process that develops over time and involves constant movement back and forth between phases. We adapted the guidelines of [57, 58] to analyse the qualitative data generated by the mixed methods. This was accomplished using Microsoft Excel: each snippet was matched against the theme columns and annotated with a zero, a one, a tick, or a cross. To facilitate code creation, we used a single sentence as the unit of analysis for extracting and defining the themes. Author 1 developed an initial set of themes, which were validated by Author 3.
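As a rough sketch of that spreadsheet workflow (purely illustrative: the snippets, themes, and marks below are hypothetical, and the actual annotation was done in Microsoft Excel rather than code), the coding can be pictured as a snippet-by-theme matrix:

```python
# Hypothetical snippet-by-theme coding matrix, mimicking the Excel sheet:
# one row per sentence-level snippet, one column per theme, and a mark
# (tick/cross, or 1/0) recording whether the snippet fits the theme.
themes = ["Readability", "Navigation", "Feedback"]

matrix = {
    "The font is too small to read without glasses.":
        {"Readability": 1, "Navigation": 0, "Feedback": 0},
    "I could not find the back button on this screen.":
        {"Readability": 0, "Navigation": 1, "Feedback": 0},
}

# List each snippet with the themes it was assigned to.
for snippet, marks in matrix.items():
    assigned = [t for t in themes if marks[t]]
    print(f"{snippet!r} -> {assigned}")
```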

**Step 1: Reading-** Author 1 read through the verbatim transcripts and took notes on first impressions. After that, he re-read the transcripts line by line.

**Step 2: Coding-** Phrases, sentences, and sections were labelled according to their importance. A sentence was considered important or relevant based on several factors: for example, whether the participant explicitly said that an aspect was important, whether something new or surprising came up, whether something previously published was discussed, or whether something was repeated in several places.

**Step 3: Categorising-** Analysing all the codes created, we combined the most important and relevant codes to form categories or themes. Trivial codes with no relevance to the research objectives were ignored. The themes generated at this stage were general and abstract.

**Step 4: Label and Define the Categories-** These high-level themes were labelled and developed from the significant codes. They constitute the main results of this study, elaborating new knowledge from the perspective of tech-savvy older adults. The definitions of these themes were also outlined.

**Step 5: Rectification of Unseen Bias-** To alleviate any bias in the findings and to increase reliability, Author 1 sent the snippets to Author 3 in random order and without themes assigned; the list and definitions of the themes were sent separately. Author 3 commented on the snippets, assigned them to themes, and identified new themes. This resulted in the addition, deletion, and modification of several themes. Any unresolved issues were discussed among all authors until a consensus was reached. After the completion of this step, we had a final, mutually agreed set of 3 key themes and 24 sub-themes.
