**1. Introduction**

The COVID-19 pandemic, which struck in 2019, had varying effects on different sectors and industries. It affected educational systems worldwide, resulting in an almost complete closure of higher education institutions. Students and teachers were compelled to adjust immediately and switch to online teaching and learning activities. Universities also had to allow some flexibility in conducting examinations in order to eliminate in-person physical interaction.

Two years after the outbreak, all educational institutions are back open. However, there is no denying that not everything is going back to how it used to be. The 'new normal' has forced us to go digital, adopting a hybrid education that combines face-to-face and virtual activities. Adaptation to new technologies seems obligatory and has become part of the daily routine for educational systems globally. We can observe that online learning is on the rise and that assessments can be conducted remotely.

With regard to medical education, e-learning helps students to adjust and adapt to an online medical environment. Yet, it limits students' interpersonal contact with patients and their opportunities for clinical practice and professional development. Nevertheless, the pandemic has brought a new insight: medical teaching and learning, as well as student assessment, can be conducted virtually. At Universiti Putra Malaysia, conducting online assessments during the pandemic made us realise the necessity to continue maximising the use of technology. The transition from the traditional method of assessment and the resulting paradigm shifts are discussed further below.

### **2. Student assessment in medical education**

Assessment is the process of documenting the level of a learner's knowledge, skills, and attitude; its purpose is to make judgements and decisions about a student's learning against a certain standard or benchmark [1]. Assessment can be classified as formative or summative. Formative assessment, also known as 'assessment for learning', is an ongoing process that aims to monitor students' learning. It is usually low-stakes and conducted informally in class. This assessment is a powerful diagnostic tool that helps students pinpoint which areas they have mastered and which remain weak, so they can concentrate their efforts on the latter moving forward. Constructive feedback on students' strengths and weaknesses is the cornerstone of formative assessment, shaping and improving future learning. In many cases, educators modify their instructional materials and clarify content to help students achieve the expected learning outcomes. Examples of formative assessment include short quizzes during class, direct observation of procedural skills (DOPS) and the mini-clinical evaluation exercise (mini-CEX).

Summative assessment, on the other hand, known as 'assessment of learning', takes place at the end of a course of study and is usually high-stakes. Its purpose is to provide an accurate pass-or-fail decision about students and a final measure of student performance. In health professions education, summative assessment is conducted to determine whether students have met the minimum standards for progression, graduation and licensure, assuring that the public is protected from incompetent practitioners. Concurrently, medical educators may obtain feedback on the appropriateness of learning outcomes and the effectiveness of learning instruction based on post-assessment analysis [2]. Examples of summative assessments include those that occur at the end of a course, semester or year, or before newly graduated doctors can begin to practise medicine professionally.

Medical students must acquire and demonstrate various domains of competency throughout the training. However, there is no single method of assessment that can adequately evaluate their performance across all domains. Each assessment method has its own advantages and disadvantages. Therefore, a variety of assessment methods are required to ensure that students achieve all required competencies before graduation.

More than 30 years ago, psychologist George Miller proposed a hierarchical framework for assessing clinical competence [3]. It is a valuable model showing the levels of knowledge and skills assessed in medical education. The iconic Miller's pyramid distinguishes between the assessment of cognition and of behaviour in practice (**Figure 1**). The base of the pyramid is knowledge ('knows'), followed by the application of knowledge ('knows how'). Acquiring medical knowledge is the essential precursor to clinical problem-solving. The 'knows' level can be assessed by written assessments such as multiple-choice questions (MCQs), while 'knows how' adds a level of complexity to the cognitive scheme. Students need to apply their knowledge, manipulate the information, and demonstrate an understanding of the relationship between concepts and applications [1]. Appropriate assessment methods include higher-order MCQs, essays and viva or oral exams.

#### **Figure 1.** *Miller's pyramid [3].*

*Perspective Chapter: Paradigm Shift on Student Assessment Due to COVID-19 Pandemic... DOI: http://dx.doi.org/10.5772/intechopen.109555*

The third level of the pyramid moves to performance assessment and represents clinical skills competency, usually assessed in a controlled environment ('shows how'). The assessments here are simulated and standardised. The objective structured clinical examination (OSCE) is one example, in which students demonstrate clinical skills such as communicating with, or performing a physical examination on, a simulated patient. Finally, the top of the pyramid is clinical performance, assessed by direct observation in authentic clinical settings ('does'). Examples include workplace-based assessments such as the mini-CEX or DOPS, where students demonstrate clinical performance with actual patients by integrating their knowledge, skills and abilities in the real-world clinical setting.

Miller's pyramid is frequently used with other taxonomy frameworks such as Bloom's revised taxonomy. Bloom's taxonomy encompasses six levels of the cognitive domain, from the lowest level, remembering information, up through successively more complex higher-order levels, the highest of which is creating [4]. The taxonomy is useful when deciding on expected cognitive outcomes and constructing written assessment items.

In selecting the appropriate assessment methods in a programme, the purpose of the assessment should be considered. *Is it for formative or summative purposes?* As stated above, different levels of clinical competence are required. *Do the different assessment methods cover all levels of clinical competence in Miller's pyramid? Are they adequate?*

In 1996, van der Vleuten proposed a conceptual model for defining the utility of an assessment method [5]. The model involves five criteria, namely reliability (*does it consistently measure what it is supposed to?*), validity (*does it measure what it is purported to?*), educational impact (*how does it affect teaching and learning?*), acceptability (*is it acceptable to relevant stakeholders?*) and cost (*is it practical and feasible?*). Using this model, the utility of an assessment method can be derived by conceptually multiplying the weighted values of each criterion.

*Assessment utility = reliability × validity × educational impact × acceptability × cost* (1)

It is important to note that this is not a mathematical formula, but a notional one. The weight of each criterion depends on the purpose of the assessment. For formative purposes, more weight is given to educational impact while for summative purposes, more weight is given to reliability [6].
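The purpose-dependent weighting described above can be illustrated with a short sketch. All numbers, weight values and function names below are invented for illustration only; van der Vleuten's model is notional and assigns no actual scores:

```python
# Illustrative sketch of van der Vleuten's notional utility model (Eq. 1).
# All ratings and weights are hypothetical -- the model is conceptual,
# not a real scoring formula.

CRITERIA = ["reliability", "validity", "educational_impact",
            "acceptability", "cost"]

def notional_utility(scores, weights):
    """Multiply the weighted criterion scores, in the spirit of Eq. (1)."""
    utility = 1.0
    for criterion in CRITERIA:
        utility *= scores[criterion] ** weights[criterion]
    return utility

# Hypothetical ratings (0-1) for a written MCQ paper.
scores = {"reliability": 0.9, "validity": 0.7, "educational_impact": 0.5,
          "acceptability": 0.8, "cost": 0.9}

# For summative purposes, reliability is weighted more heavily;
# for formative purposes, educational impact dominates.
summative_weights = {"reliability": 2.0, "validity": 1.0,
                     "educational_impact": 0.5, "acceptability": 1.0,
                     "cost": 1.0}
formative_weights = {"reliability": 0.5, "validity": 1.0,
                     "educational_impact": 2.0, "acceptability": 1.0,
                     "cost": 1.0}

print(notional_utility(scores, summative_weights))
print(notional_utility(scores, formative_weights))
```

Because the criteria are multiplied rather than added, a near-zero value on any one criterion drives the overall utility towards zero, which is the key intuition of the model.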

Later, Norcini et al. published a consensus statement identifying seven criteria of a good assessment. Five of them were derived from van der Vleuten's model, while another two are equivalence (*does it produce similar results in different groups?*) and catalytic effect (*does it create, enhance, and support education?*) [7]. Therefore, it is essential to evaluate these criteria when considering the appropriate and suitable assessment methods or tools in the programme.

#### **2.1 Assessment of knowledge acquisition and application**

Written assessments are widely used in medical education to assess knowledge acquisition, comprehension of basic principles and clinical reasoning. Although these skills are positioned at the base of Miller's pyramid ('knows' and 'knows how'), they form a foundational set of skills that students need to master before achieving clinical competence. Written assessments are inexpensive, convenient and produce reliable scores. The types commonly used for medical students are covered in the following subsections.

#### *2.1.1 Multiple choice questions (MCQs)*

This is certainly the most popular assessment method globally because of its validity, reliability and practicality. The A-type MCQs require examinees to select one best answer from several options. They are also known as single-best answer questions (SBAQs) or one-best answer (OBA) questions. The question consists of a stem, which can be a clinical or non-clinical vignette, a lead-in statement and three or more answer options.

The R-type MCQs, also called extended matching items (EMIs) or extended matching questions (EMQs), are an extended version of the A-type format. A set of EMIs has a theme, a list of options (typically from seven to 20), a lead-in statement and a minimum of two items or vignettes. All items should be relevant to the theme. For each item, examinees choose the correct answer from the list of options. Both A-type and R-type MCQs can be used to assess the theory and application of knowledge, critical thinking and problem-solving skills.

Multiple true-false (MTF) questions are becoming less popular among medical schools. They are normally used to test factual recall. However, this type of assessment can cover more breadth of a topic, which makes it suitable for formative assessment. Each item consists of a stem, followed by five statements related to the stem. For each statement, examinees select either true or false. In a pen-and-paper examination, optical mark recognition (OMR) sheets are used by the examinees to mark their answers. These sheets are analysed by an OMR machine and the scores can be obtained instantly. Certain OMR machines can also perform a concurrent evaluation of the quality of the questions based on the examinees' responses and scores.
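As a rough illustration of how MTF responses read from answer sheets might be scored automatically, consider the sketch below. The data format, item names and one-mark-per-correct-statement rule are assumptions for illustration; real OMR software and mark schemes vary between institutions:

```python
# Hypothetical scorer for multiple true-false (MTF) items.
# Each item has five true/false statements; here, one mark is awarded
# per statement answered correctly (an assumed mark scheme).

def score_mtf(answer_key, responses):
    """Return marks per item: the number of statements answered correctly."""
    marks = {}
    for item, key in answer_key.items():
        # A missing or blank response matches nothing and scores zero.
        given = responses.get(item, [None] * len(key))
        marks[item] = sum(1 for k, g in zip(key, given) if k == g)
    return marks

# Answer key: item -> truth value of each of its five statements.
key = {"Q1": [True, False, True, True, False]}
# One examinee's marked responses, as read from an OMR sheet.
resp = {"Q1": [True, False, False, True, False]}

print(score_mtf(key, resp))  # {'Q1': 4}
```

A per-item tally like this is also the raw input for the item-quality statistics some OMR systems report, since each statement's proportion of correct responses can be computed from the same data.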


#### *2.1.2 Short answer questions (SAQs) and essay questions*

This type of assessment consists of open-ended questions that require examinees to write brief answers (SAQs) or long answers (essays). They can assess either lower-order or higher-order thinking. Usually, examinees are assessed on the application of knowledge ('knows how') and clinical reasoning. The disadvantage of both methods is that answers have to be marked manually by examiners, which can be resource-intensive with a large number of examinees per cohort, particularly for essay questions. In certain cases, the answer scripts are marked by more than one examiner based on the answer scheme. Although this can reduce each examiner's workload, having multiple examiners per question may affect the reliability of the scores.

#### **2.2 Assessment of clinical performance**

#### *2.2.1 Objective structured clinical examination (OSCE)*

The OSCE consists of several structured stations in a circuit through which an examinee moves in sequence. The number of stations and the duration of each station can vary based on the complexity of the skills being assessed [8]. It allows examinees to demonstrate a specific clinical skill at each station in a standardised medical scenario. It is usually conducted as a summative assessment. The OSCE is widely implemented because of its high validity and reliability in assessing across different cases and skills. A large number of students can be assessed in the same way using multiple concurrent circuits. The use of standardised or simulated patients is common during an OSCE, so that examinees may interact with them to perform history taking, physical examination, counselling and other tasks. An examiner at each station observes and scores the examinees based on a pre-determined checklist.
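The checklist-based scoring described above can be sketched as a simple aggregation across stations. The station names, binary checklist items and percentage-based aggregation below are illustrative assumptions; actual OSCEs often use weighted items, global ratings and standard-setting methods not shown here:

```python
# Hypothetical aggregation of OSCE checklist scores across stations.
# Each station holds an examiner's checklist: 1 = item performed
# correctly, 0 = not performed (a simplified binary scheme).

def osce_percentage(stations):
    """Overall score: total marks earned / total marks available, as a %."""
    earned = sum(sum(checklist) for checklist in stations.values())
    available = sum(len(checklist) for checklist in stations.values())
    return 100.0 * earned / available

# One examinee's checklists from a three-station circuit (invented data).
stations = {
    "history_taking":       [1, 1, 0, 1, 1],
    "physical_examination": [1, 0, 1, 1, 0],
    "counselling":          [1, 1, 1, 1, 1],
}

print(osce_percentage(stations))  # 80.0
```

Aggregating over many short, independently scored stations is what gives the OSCE its reliability advantage over a single long encounter: no one station (or examiner) dominates the final score.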

#### *2.2.2 Long case and short case examinations*

The long case is a traditional clinical examination that assesses student competence at the 'shows how' level of Miller's pyramid. It requires a student to spend approximately an hour with a patient, taking a history and carrying out a physical examination unobserved. The student then summarises the findings to one to three examiners and answers several questions. The examiners score the student using unstructured marking criteria. Despite many concerns regarding its reliability [9], the long case remains popular due to its authenticity and its ability to assess the clinical approach holistically. To increase the validity and reliability of long cases, several modifications have been implemented, such as observing students while they interact with a patient, using a structured marking scheme and increasing the number of cases [10].

The short case, on the other hand, requires a student to spend about 5–10 minutes with a patient, examining the patient and detecting signs under observation. The student then needs to formulate a clinical or differential diagnosis. Similar to the long case, the student is scored according to unstructured marking criteria. In many medical schools, the introduction of the OSCE has replaced long-case and short-case examinations, especially for high-stakes examinations.

#### *2.2.3 Workplace-based assessment (WBA)*

WBA encompasses a group of assessment methods that evaluate students' performance in an actual clinical setting. Examples of WBA include the mini-clinical evaluation exercise (mini-CEX), direct observation of procedural skills (DOPS) and case-based discussion (CBD). These assessment methods have high authenticity and are located at the tip of Miller's pyramid ('does'). They are usually conducted as formative assessments, with the main aim of aiding learning through feedback.

The mini-CEX requires a student to perform a focused clinical task, such as history taking or physical examination, with an actual patient within a short, specified time. The performance is graded using a structured evaluation form, and constructive feedback is provided. This assessment occurs on multiple occasions in daily practice, with different assessors and in different settings. DOPS is a variation on the mini-CEX that focuses mainly on procedural skills. It is specifically designed to evaluate practical skills, for example in surgical, medical or general practice, against pre-determined criteria, followed by a face-to-face feedback session.

On the other hand, CBD is a focused discussion driven by an existing case the student has encountered. The discussion centres on what was done, why it was done and how any investigation and intervention was made. After the discussion, the assessor scores the quality of performance and provides constructive feedback.
