This resource guide is designed for authorities who plan to introduce WBA. It provides information about the principles of WBA and the essentials for developing a successful clinical assessment program conducted within the workplace. It is designed to complement the AMC accreditation standards for WBA for IMGs. However, the information contained in this guide is generally applicable to any health profession.
IMGs who have obtained qualifications from authorities that are currently not designated as competent authorities are required to undertake the CAT MCQ examination followed by further assessment in either the Standard Pathway (AMC Examinations) or the Standard Pathway (workplace based assessment). Both of these pathways lead to the award of the AMC Certificate and eligibility for general registration.
The goal of the Standard Pathway (WBA) is to ensure that an IMG possesses an adequate and appropriate set of clinical skills and other essential attributes to practise safely within the Australian health care environment and cultural setting. WBA is additional to normal supervision requirements that apply to all IMGs and doctors-in-training.
The opportunity to conduct the assessment over a sustained period allows a comprehensive assessment of clinical competency and performance.
WBAs must have established reliability and validity to ensure defensible decisions are made. WBA should assess performance across a prescribed range of clinical areas and domains of clinical practice. This resource addresses these requirements.
Reliability and validity are two important concepts that characterise all measurements.
To have confidence in measurements of the IMG’s performance in the workplace, it is important that the methods used are both reliable and valid.
WBA has an important summative function in forming a judgment about an IMG’s capacity for safe and independent practice in Australian clinical settings; formative assessment is also a component of WBA.
These terms are explained below:
Assessment terms
Formative assessment
Formative assessment is primarily aimed at helping IMGs to identify the effective and ineffective aspects of their clinical performance. The likelihood that IMGs are able to improve performance as a result of formative assessment is increased markedly if the assessor:
Feedback, and how it can be delivered effectively, is discussed in more detail in resource guide 7.
Summative assessment
Summative assessment aims to pass judgment on an IMG’s clinical performance. To orient IMGs to the system of WBA that they will encounter, it is a good strategy to allow a number of formative ‘practice’ assessments before the first summative assessment. Once the summative assessments are underway, a systematic recording of outcomes is essential, including the assessor’s global judgment of the IMG’s performance, and notes indicating the strengths and weaknesses observed and plans developed with the IMG for improvements. A well-constructed and managed spreadsheet or database is essential so that the assessing authority can keep track of the IMG’s progress.
Reliability
Reliability is a measure of the accuracy of the score derived from an assessment. It is a function of the size of the sample of performance used as a basis for the assessment. It is known that performance in medicine differs widely from case to case, which means that performance must be sampled widely over cases to enable an accurate estimation of a clinician’s overall level of performance. This issue is discussed further in this course in resource 4, in relation to specific assessment methods.
Score accuracy is a function of the consistency of measurement over different events and by different assessors. Consistency across assessors can be aided by reference to explicit, observable performance criteria against which a performance is judged. Consistency can also be aided by training assessors in the use of assessment methods.
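As a general illustration from classical test theory (not a formula prescribed by the AMC guidelines), the Spearman-Brown prophecy formula shows how the reliability of a composite score over n observations, r_n, grows with the reliability of a single observation, r_1:

$$ r_n = \frac{n\,r_1}{1 + (n-1)\,r_1} $$

For example, under an assumed single-encounter reliability of 0.30, the composite reliability over ten observed encounters is approximately 0.81. This is why repeated, widely sampled observations are emphasised throughout this guide.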
Validity
When assessing IMGs’ performance in the workplace, the questions posed by validity are ‘does the assessment measure what it was intended to measure?’ and ‘do the assessment data provide a basis for making a decision about overall level of IMG performance?’
A precondition for validity is reliability. Reliability is largely a function of the adequacy of sampling of cases and different assessors, and is therefore influenced by the total assessment time. Validity is largely a function of whether the sampling was representative across the types of patients, and types of skills for which the IMG is expected to exhibit competent performance. Assessment data can therefore be a reliable measure but may not be a valid measure if representative sampling is not achieved. Resource 5 on blueprinting addresses the issue of content validity in greater detail.
Purpose of the WBA
WBA is used to:
Making judgments about performance
In WBA, judgments about an IMG’s level of performance are made by assessors. These judgments are recorded on rating scales, where each scale position is associated with a term such as unsatisfactory, meets expectations, or exceeds expectations. It is essential that IMG assessors reference their judgments to the expected level of performance of a local graduate at the end of the first postgraduate year (PGY1).
WBA involves the measurement of abilities and attitudes within highly complex areas of activity. Performance in a complex area of activity requires focused observation of IMGs in clinical practice. Assessors should be clinicians who are experienced in relevant clinical practice and trained in the assessment of IMGs and other medical trainees and practitioners.
Principles underpinning WBA
1. Use a variety of methods
Clinical performance is multifaceted and no single method can assess all of the elements that contribute to good medical practice. Assessment methods vary in how effectively they assess different elements of performance. It is essential to use a set of assessment methods that complement each other in terms of what they assess well.
2. Use multiple observations
To ensure that an assessment decision is reliable and justifiable, the assessment must be planned to sample broadly across clinical areas. It is recommended that multiple observations occur in various clinical settings over a period of time.
3. Use multiple assessors
All assessors should undertake appropriate training to calibrate their judgements. Assessors need to be mindful of conflicts of interest which may bias their assessments. To moderate these influences, ensure that the assessment of each candidate is based on the opinions of multiple assessors.
Summary
The selection of assessment methods is one of the most critical of all influences on the success of WBA.
Medical education literature shows that the number and specific nature of tasks and the total time involved are key factors underpinning reliability and validity. Choice and design of assessment methods, clearly described criteria, and assessor training are essential elements of a successful WBA program.
To plan a WBA program, its purpose must be clear. For the assessment of IMGs in the Standard Pathway (WBA), the AMC has developed a list of clinical domains and clinical areas of practice.
The focus of WBA is on IMGs’ application of their knowledge and skills in their clinical work. Specifically, WBA should assess the following aspects of their performance:
Clinical skills (history taking, physical examination, investigations and diagnosis, prescribing and management, counselling/patient education and clinical procedures)
Applying clinical knowledge and skills, including a suitable approach to each patient and the ability to take a history, conduct a physical examination, order investigations, interpret physical signs and investigations, formulate a diagnosis and management plans, prescribe therapies and counsel patients.
Clinical judgment
Synthesising information obtained about and from a patient to prioritise investigations and treatment, demonstrating the safe and effective application of clinical knowledge and skills within Australian health care settings; demonstrating safe practice when dealing with unfamiliar conditions or contexts.
Communication skills
Demonstrating effective oral, non-verbal and written communication skills, including effective listening skills.
Ability to work as an effective member of the health care team:
Demonstrating respect, teamwork and effective communication.
Ability to apply aspects of public health relevant to clinical settings
Recognising the importance of applying public health principles relevant to the Australian health care setting.
Cultural competence
Demonstrating an ability to value diversity, to communicate and work effectively in cross-cultural situations and the capacity for cultural self-awareness.5
Professionalism and attention to patient safety
The ability to demonstrate respect, compassion and empathy for the patient; to work safely and effectively within the legal and ethical guidelines for practice within Australia; to recognise the limitations of one’s own knowledge and skills; to recognise the need for continuing professional development; and to meet the responsibilities of positions within the Australian health care setting, including teaching responsibilities.
The Standard Pathway (WBA) covers the same clinical areas as the Standard Pathway (AMC Examination). These clinical areas comprise:
It is recognised that whilst WBA should cover all of these areas, it may not be possible to assess in every clinical setting.
Assessment methods need to be selected to ensure that all domains of clinical practice are appropriately assessed in the workplace.
The AMC WBA accreditation guidelines for IMGs specify requirements for:
References:
5 Adapted from Cultural competency in health: A guide for policy, partnerships and participation, Australian Government, NHMRC 2006.
This resource describes the methods that have been developed for WBA, and outlines strategies for their use. As explained in resource guide 2, no one assessment method is able to assess all the abilities and characteristics that IMGs must demonstrate in the workplace. A range of complementary assessment methods are needed to assess the abilities and characteristics that together constitute effective clinical performance.
The methods adopted to assess an IMG’s performance should possess two important characteristics:
The key skills expected of the IMG must be aligned with the assessment method used and the feedback provided.
It is commonly stated that assessment drives learning. What is assessed conveys to an IMG the things that are important, both to learn and to do. Learning and skill development are fundamental in an IMG’s progress towards required levels of clinical performance.
There are many strategies available for assessing workplace based performance. This course will focus on the more commonly used strategies which are supported by studies of the reliability (accuracy) of the data they provide, and of the validity of the judgments made on the basis of these data. These strategies are:
The first two of these methods are referred to as direct methods of assessment, as they are based upon direct observation of an IMG’s performance in the workplace. The latter two methods are indirect methods of assessment, as they are based upon records of an IMG’s performance. ITA may or may not include direct observation. Effective WBA of IMGs should incorporate several of these strategies, and should include direct observation of the IMG in patient encounters.
The process of directly observing a doctor in a focused patient encounter for purposes of assessment is called a Mini-Clinical Evaluation Exercise (mini-CEX). A mini-CEX entails observing an IMG with a real patient (typically for 10-15 minutes) on a focused task such as taking a history, examining or counselling a patient, whilst recording judgments of the IMG’s performance on a rating form and then conducting a feedback session with that individual on his/her performance (a further 10-15 minutes).
The feedback session following the observation should be highly focused in guiding the IMG’s learning by identifying the strengths and weaknesses of his/her performance (formative assessment), and planning for their subsequent learning and skill development. The ratings recorded on the form, when cumulated over multiple patients, multiple observers, and different clinical tasks will provide a defensible basis for a judgment of an IMG’s level of overall performance.
Many different mini-CEX rating forms exist. One whose reliability and validity have been widely studied was developed by John Norcini (1995).9 This form has been specifically adapted and evaluated for use with IMGs in Australia and Canada with effective outcomes and is recommended for use in assessing IMGs in the Standard Pathway.
Ratings are elicited relative to the following aspects of an IMG’s performance: medical interviewing skills, physical examination skills, professionalism/humanistic qualities, counselling skills, clinical judgment, organisation/efficiency, and overall clinical competence. Performance is rated using a standardised nine-point scale where 1, 2 and 3 are unsatisfactory, 4, 5 and 6 are satisfactory, and 7, 8 and 9 are superior. The form provides an option for noting that a particular aspect of performance was insufficiently observed or unobserved. The form also elicits narrative information on the details of the encounter, and provides space for providing feedback on the performance observed.
The following steps guide the effective use of the mini-CEX.
Attributes of assessors
A mini-CEX assessor should be clinically competent in the area of the patient’s problem(s). The assessor can be one of the IMG’s clinical supervisors, a senior vocational trainee, or a visiting doctor who is serving as an external assessor of the IMG.
Orientation and training of assessors
Assessors should be trained to use the mini-CEX rating form, to be consistent, to reference their assessment to the same standard, and to provide effective feedback. Such a program for providing assessor orientation and training is described in resource guide 8 and in Appendix 2.
Orientation for IMGs
IMGs should be oriented to the mini-CEX assessment process, the rating form and the descriptions of the rating categories. Ideally, they should be given the opportunity to engage in some formative or practice mini-CEX assessments prior to participating in those that will ‘count’.
Schedule of mini-CEX observations
Each IMG should undergo a number of mini-CEX assessments conducted by different assessors. Approximately 30 minutes should be allocated for each assessment, to observe the encounter, complete the rating form and conduct a feedback session. Support staff may be assigned responsibility for scheduling mini-CEX assessments, for obtaining permission from patients as appropriate for their observed encounter with the IMG, and so on.
Selecting the encounters to be observed
A ‘blueprint’ is constructed to guide the selection of encounters to be observed. This blueprint enables a systematic selection of patients comprising a range of:
The development of a blueprint is discussed in resource guide 5.
Assessors need to be sufficiently familiar with the patient’s health concern to enable them to critically judge the performance being reviewed. For example, if a physical examination is being observed, the assessor needs to be aware of the patient’s history (to the degree that it may guide which aspects of the physical examination are undertaken) and the patient’s physical findings, to enable an assessment of the IMG’s accuracy in eliciting these findings.
The assessor’s role in the mini-CEX assessment process
Mini-CEX assessors should remain uninvolved in the encounter (no comments, suggestions, questions), and be as unobtrusive as possible (become a ‘fixture’ in the room, ideally so that the IMG and the patient forget that the assessor is there), unless there are risks to patient safety. If an assessor identifies issues to follow up with the patient (for example, check findings, refine a treatment plan), this should be done after the IMG has completed the encounter with the patient. The rating form should be completed and then discussed with the IMG. All questions on the form should be completed, with both effective and ineffective aspects of performance noted.
Provision of effective feedback and a plan for further development
Principles of giving effective feedback are outlined in resource guide 7, including the challenge of giving feedback where performances are poor.
Return of mini-CEX materials
The mini-CEX evaluation forms should be returned to a designated administrative person or unit for data entry and record keeping.
To be able to defend a final judgment on an IMG’s overall level of performance, follow the protocol for mini-CEX assessments. Such a defence will be a function of the reliability of the composite rating derived from the mini-CEX over multiple observations, and the validity of this rating as a global measure of an IMG’s level of clinical performance. The former is largely a function of the number of mini-CEX observations and observers, and reliability estimates can be calculated. The latter is a logical function of the representative sampling of performance across the spectrum of clinical situations in which the IMG would be expected to be proficient.
To obtain reliable scores from the mini-CEX ensure that IMGs are observed in multiple encounters and by several different assessors. The key to making valid inferences of an IMG’s ability from their mini-CEX scores is to ensure that they are observed over a representative sample of clinical domains, clinical areas, and over a range of clinical settings. A mini-CEX study conducted in Australia (Nair et al, 2008) found that scores derived from as few as ten mini-CEX encounters possessed a reliability coefficient exceeding 0.80.11 This result is consistent with those from overseas studies.12,13,14 Nair et al reported that the process had face validity; that is, the IMGs viewed the mini-CEX as superior to most other assessment methods, including OSCEs, for assessing their clinical performance. Other studies have provided evidence of construct validity, reporting high correlations between mini-CEX and other measures of performance for undergraduate and postgraduate trainees (Kogan et al 2003, Norcini et al, 2003).15,16
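As a minimal sketch only, the following Python fragment applies the Spearman-Brown prophecy formula (introduced in the reliability section above) to show how composite reliability grows with the number of mini-CEX encounters. The single-encounter reliability used here is an assumed value for illustration and is not drawn from Nair et al or the AMC guidelines.

```python
# Illustrative sketch: how composite reliability grows with the number of
# mini-CEX encounters, using the Spearman-Brown prophecy formula from
# classical test theory. The single-encounter reliability below is assumed.

def spearman_brown(single_encounter_reliability: float, n_encounters: int) -> float:
    """Reliability of the composite score over n_encounters observations."""
    r = single_encounter_reliability
    return (n_encounters * r) / (1 + (n_encounters - 1) * r)

def encounters_needed(single_encounter_reliability: float, target: float = 0.80) -> int:
    """Smallest number of encounters whose composite reliability reaches the target."""
    n = 1
    while spearman_brown(single_encounter_reliability, n) < target:
        n += 1
    return n

if __name__ == "__main__":
    assumed_single_r = 0.30  # assumed reliability of a single mini-CEX rating
    for n in (1, 5, 10, 20):
        print(f"{n:>2} encounters -> composite reliability {spearman_brown(assumed_single_r, n):.2f}")
    print("Encounters needed to reach 0.80:", encounters_needed(assumed_single_r))
```

Under that assumed single-encounter reliability, roughly ten encounters are needed to reach a composite reliability of 0.80, which is broadly consistent with the published findings cited above.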
A form of mini-CEX called the Professionalism Mini-Evaluation Exercise (P-MEX) has been developed to assess behaviours related to professional attributes.17 The four main attributes assessed by the P-MEX are doctor-patient relationship skills, reflective skills, time management skills and inter-professional relationship skills, and these are captured by 24 behaviour items. The P-MEX rating form uses a four-point scale where 1 is unacceptable, 2 is below expectations, 3 is met expectations, and 4 is exceeded expectations.
The Direct Observation of Procedural Skills (DOPS) is a form of mini-CEX where the focus is on observing and assessing an IMG’s performance of a procedure on a real patient. A DOPS assessment generally requires 10-15 minutes of observation time followed by 5 minutes of feedback and the completion of the DOPS rating form.
The issues discussed earlier in the lesson on the mini-CEX are all relevant to a DOPS. Many different DOPS rating forms have been developed and, at the vocational training level, forms have been developed that are specific to a given procedure. In the context of assessing IMGs, it is recommended that a generic DOPS rating form be employed. It has been adapted from a form developed in the UK for the Foundation Program, which has demonstrated acceptable reliability and validity.18,19,20 This form elicits assessor ratings on component skills related to the procedure observed, such as obtaining informed consent, appropriate pre-procedure preparation, technical ability, communication skills and overall clinical competence in performing the procedure.
A DOPS assessment should focus on the core skills that IMGs should possess when undertaking an investigative or therapeutic clinical procedure. DOPS is a focused observation or ‘snapshot’ of an IMG undertaking the procedure. Not all elements need be assessed on each occasion. The studies cited above have shown that multiple DOPS over time, using multiple assessors, provide a valid, reliable measure of performance with procedures. The following steps guide the effective use of the DOPS.
The logistics of arranging DOPS assessments can be challenging. Morris et al reported that opportunities for DOPS are found in emergency departments and in the operating theatre during routine procedures.21 As with the mini-CEX, decisions on which procedures to sample in DOPS observations are best guided by a blueprint. Assessors should be medical practitioners who are familiar with the case being reviewed, who possess expertise relative to the patient’s problems, and who have received orientation or training in the case-based discussion assessment process.
There are few reports available on studies of the reliability or validity of DOPS. In a review of tools used to assess procedural skills of surgical residents, the reliability of the assessment is reported to be enhanced through the use of objective and structured performance criteria, such as those on DOPS assessment forms.22 A DOPS assessment appears to possess high face validity; that is, a DOPS assessment ‘looks to be valid’ because it is a structured assessment of an IMG’s ability to perform a procedure with a real patient in a real clinical setting.
Case-based discussion (CBD) is an alternative term for chart stimulated recall, an assessment technique originally developed by the American Board of Emergency Medicine. It is designed to allow the assessor to probe the candidate’s clinical reasoning, decision making and application of medical knowledge in direct relation to patient care in real clinical situations. It is a validated and reliable tool for assessing the performance of candidates and identifying those in difficulty. The CBD tool has greater validity and reliability when aligned with specific constructs in discussing the patient (i.e. elements of history, examination, investigations, problem solving, management, referral and discharge planning).
Case-based discussion is designed to:
For more information, see the background paper: Norcini J, Burch V. Workplace-based assessment as an educational tool: AMEE Guide No. 31. Medical Teacher 2007;29:855-871.
The guidelines for conducting the case-based discussion are similar to those for the feedback session for the mini-CEX. As the goal of a case-based discussion is to obtain an assessment of the IMG’s clinical reasoning and decision-making, the discussion should be interactive. For example, the assessor could pose questions which elicit the IMG’s interpretation of data in the record, the reasons for particular tests being ordered and what the results mean, what other tests could have been ordered, recommendations on the next steps in the management of the patient, treatment options and what the IMG would recommend and why, as well as the prognosis, and so on. The assessment form is completed following the assessment encounter.
Several studies support the validity of case-based discussions. Maatsch et al 1983 found a high correlation between case-based discussion scores and the initial certification score (which took place ten years earlier) of the doctors taking part in the study.23 Furthermore, the doctors involved in the study considered this method to be the most valid measure of their practising ability. Other studies have shown that case-based discussions correlate well with scores on previous and current oral examinations.24,25
In-training assessment reports (also referred to as ‘structured supervision reports’) are based upon direct observation of IMGs in real clinical settings over a period of time.
Observations are carried out by the supervisor(s) assigned to the IMG, but others may play a role. For example, nurses and other health team members are often asked to contribute to in-training assessments of communication and inter-personal skills, ethical behaviour, reliability and professional integrity. Pharmacists could be asked for comment on prescribing ability. Patient feedback, peer assessment, self assessment and medical record audits may also contribute to the judgments recorded in in-training assessment reports.
The use of multiple sources of information as a basis for ratings of IMG performance is highly effective in reducing the subjectivity of these ratings, although subjectivity of in-training assessments remains a concern. In many studies of in-training assessment systems, the individuals being assessed have not been directly observed, reducing significantly the reliability of the assessment and its ability to discriminate accurately.
For example, a study of medical students found that nineteen separate in-training assessments would be needed to achieve a reliability coefficient of 0.80. A value of 0.80 is the reliability that should be obtained when making decisions about an individual’s performance.26 In-training assessments remain an important means for WBA, as they enable a broad range of professional behaviours to be assessed. For example, in-training assessments can capture evidence of behaviours such as honesty, being reliable and working well as a team member.
Structured ITA reports contribute to evidence about the IMG’s progress through the required supervision period.
Structured in-training assessment reports are widely used in medical training in Australia, serving to signify trainees’ preparedness to move to the next level or rotation of training. The progress of IMGs in the Standard Pathway (AMC Examination) has a history of being monitored in this way, with structured in-training assessment reports contributing to decisions about the IMG’s progress through the required supervision period. For IMGs in the Standard Pathway who elect to replace the clinical examination option with a workplace based option, it is likely that in-training assessment will continue to be a component of their WBA process.
What is multisource feedback, and how is it put into practice?
The use of multiple assessors helps to address potential ‘conflict of interest’ problems.
Multisource feedback, or 360 degree assessment as it is more commonly called, provides evidence on the performance of IMGs from a variety of sources. These sources may include colleagues, other co-workers (nurses, allied health) and patients. Questionnaires, completed by each of these groups, assess an IMG’s performance over time, rather than in a single patient encounter. This assessment method is gaining popularity in medicine, and has been used with medical students through to sub-specialists. Multisource feedback enables the assessment of a group of proficiencies that underpin safe and effective clinical practice, yet are often difficult to assess. These proficiencies include interpersonal and communication skills, team work, professionalism, clinical management, and teaching abilities.
In the context of assessing trainees at the PGY1 level, the Foundation Program in the UK has developed a single questionnaire called the ‘mini-PAT’ (Peer Assessment Tool) which elicits ratings from fellows, senior trainees, nurses and allied health personnel. This form is discussed by Norcini et al (2007).27 In IMG assessment in Canada, 360 degree assessment has been pursued on a broader basis with separate forms for colleagues, co-workers, self, and patients. Both of these options could be useful in the assessment of IMGs in the Standard Pathway, though the 360 degree approach is favoured in the Canadian system.
Some studies of 360 degree assessment with practising doctors have shown the technique to possess limited ability to discriminate levels of performance, with average ratings typically being high (for example, 4.6 out of 5). This limitation does not appear to be as great in the context of IMG assessment, where the range of performance is much wider than that observed with local graduates. Other studies from Canada, the United States and Scotland have shown that 360 degree assessment can be a reliable, valid and feasible approach that contributes to improved practice.28 Reliability analyses indicate that samples of 8-10 co-workers, 8-10 medical colleagues and 25 patients are required.
There are a number of other methods of WBA that could be employed in the assessment of IMGs. Several of these methods can be found in the 2010 publication, Assessment Methods in Undergraduate Medical Education.
These methods include:
Other methods are discussed in Norcini & Burch (2007).29 These methods include:
The methods outlined above are not used as extensively as the methods described in detail in this lesson, nor are there data supporting their reliability and validity. As such, they are not currently recommended for adoption in WBA of IMGs in the Standard Pathway.
Hospitals and health providers seeking to develop their system of WBA of IMGs in the Standard Pathway should formulate an assessment strategy which draws on the methods and strategies presented in this resource. Importantly, the assessment strategy should:
This resource has described methods of WBA for IMGs in the Standard Pathway (WBA) as an alternative to the existing AMC clinical examination. Successful completion of the Standard Pathway assessment program should address whether an IMG possesses an adequate and appropriate set of clinical skills and other essential characteristics to practise safely and effectively within the Australian health care environment.
References:
6 Frederiksen N. The real test bias: Influences on testing and teaching and learning. Am Psychol 1984;39:193-202.
7 Swanson DB, Norman GR, Linn RL. Performance-based assessment: Lessons from the health professions. Educ Res 1995;24:5-11.
8 Shepard LA. The role of assessment in a learning culture. Educ Res 2000;29:4-14
9 Norcini J, Blank L, Arnold G, Kimball H. The mini-CEX (clinical evaluation exercise): a preliminary investigation. Ann Intern Med 1995;123(10):795-799.
10 Nair BR, Alexander HG, McGrath BP, Parvathy MS, Kilsby EC, Wenzel J, Frank IB, Pachev GS, Page GG. The mini clinical evaluation exercise (mini-CEX) for assessing clinical performance of international medical graduates. Med J Aust 2008;189(3):159-161.
11 ibid.
12 Cruess R, McIlroy J, Cruess S, Ginsburg S, Steinert Y. The professionalism mini-evaluation exercise: A preliminary investigation. Acad Med 2006;81(10 Suppl):S74-S78.
13 op. cit. Norcini et al. 1995 #9.
14 Norcini JJ. Peer assessment of competence. Med Educ 2003;37(6):539-543.
15 Kogan J, Bellini L, Shea J. Feasibility, reliability and validity of the mini-clinical evaluation exercise (mini-CEX) in a medicine core clerkship. Acad Med 2003;78(10 Suppl):S33-S35.
16 Norcini J, Blank L, Duffy F, Fortna G. The mini-CEX: a method for assessing clinical skills. Ann Intern Med 2003;138(6):476-481.
17 op. cit. Cruess R, et al. 2006 #12.
18 Wragg A, Wade W, Fuller G, Cowan G, Mills P. Assessing the performance of specialist registrars. Clin Med 2003;3(2):131-4.
19 Wilkinson J, Benjamin A, Wade W. Assessing the performance of doctors in training. BMJ 2003;327:s91-2.
20 Davies H, Archer J, Heard S. Assessment tools for Foundation Programmes—a practical guide. BMJ Career Focus 2005;330(7484):195-6.
21 Morris A, Hewitt J, Roberts C. Practical experience of using directly observed procedures, mini clinical evaluation examinations, and peer observation in pre-registration house officer (FY1) trainees. Postgrad Med J 2006;82:285-88.
22 Reznick R. Teaching and testing technical skills. Am J Surg 1993;165:358-61.
23 Maatsch JL, Huang R, Downing S, Barker B. Predictive validity of medical specialist examinations. Final report for Grant HS02038-04, National Center for Health Services Research. Office of Medical Education Research and Development, Michigan State University, East Lansing, MI; 1983.
24 Norman GR, David D, Painvin A, Lindsay E, Rath D, Ragbeer M. Comprehensive assessment of clinical competence of family/general physicians using multiple measures. Proceedings of the Research in Medical Education Conference, 1989: 75-79.
25 Solomon DJ, Reinhart MA, Bridgham RG, Munger BS, Starnaman S. An assessment of an oral examination format for evaluating clinical competence in emergency medicine. Acad Med 1990;65:S43-S44.
26 Daelmans HEM, van der Hem-Stokroos HH, Hoogenboom RJI, Scherpbier AJJA, Stehouwer CDA, van der Vleuten CPM. Feasibility and reliability of an in-training assessment programme in an undergraduate clerkship. Med Educ 2004;38(12):1270-1277.
27 Norcini J, Burch V. Workplace-based assessment as an educational tool: AMEE Guide No. 31. Med Teach 2007;29:855-871.
28 Garman AN, Tyler JL, Darnall JS. Development and validation of a 360-degree-feedback instrument for healthcare administrators. Journal of Healthcare Management 2004;49(5):307-21.
29 op. cit. Norcini JJ, Burch V. 2007 #27
A blueprint depicts the relationship between what must be assessed and how it is assessed.
Assessment involves a sampling of a candidate’s performance. It is impossible to test for all clinical domains for all medical conditions. The sample of performance tested must be representative of safe medical practice. A blueprint defines the aspects of an IMG’s performance that will be assessed, how each aspect will be assessed, and the number of performances that are to be assessed. It ensures that there is an appropriate balance between what is assessed and how it is assessed. Blueprints are constructed to ensure that assessments are balanced and valid, whether they be for summative or formative purposes.
A blueprint is a matrix where the rows relate to what is being assessed and the columns relate to how to assess. The cells within the matrix are completed as decisions are made about which method best assesses which attribute.
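As an illustrative sketch only (not an AMC tool), the following Python fragment shows one way such a blueprint matrix could be represented and checked so that every clinical dimension is covered by at least one assessment method. The dimension names and method names follow this guide, but the tick placements are hypothetical examples.

```python
# Illustrative sketch: a macro level blueprint as a matrix, with rows as
# clinical dimensions and columns as assessment methods. Tick placements
# are hypothetical examples only.

METHODS = ["mini-CEX", "ITA / structured supervision", "360 degree assessment"]

blueprint = {
    "Clinical skills": {"mini-CEX"},
    "Clinical judgment": {"mini-CEX"},
    "Communication skills": {"mini-CEX", "ITA / structured supervision", "360 degree assessment"},
    "Teamwork": {"360 degree assessment"},
    "Professionalism and patient safety": {"mini-CEX", "ITA / structured supervision", "360 degree assessment"},
}

def unassessed(plan: dict) -> list:
    """Return the dimensions that no assessment method currently covers."""
    return [dimension for dimension, methods in plan.items() if not methods]

if __name__ == "__main__":
    # Print a simple tick matrix: one row per dimension, one column per method.
    for dimension, planned in blueprint.items():
        ticks = ["x" if method in planned else "." for method in METHODS]
        print(f"{dimension:<40} {' '.join(ticks)}")
    print("Uncovered dimensions:", unassessed(blueprint) or "none")
```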
1. Levels of assessment: Identify the levels for which the blueprint is being constructed.
Blueprinting can be undertaken at different levels. For an undergraduate medical program, a blueprint might be constructed at the macro level, or whole program, showing where various methods are used to assess objectives or outcomes across all years of the program. There may be an additional blueprint to show how an individual year assessment is planned. There may then be a more detailed, micro-blueprint at the level of a semester-long course showing the relationship between learning objectives/semester content and written and clinical examination methods.
For WBA, blueprints need to be constructed at different levels. At the macro level, a blueprint should be developed to show how the methods selected assess the defined clinical domains of performance; this will be the overall assessment plan. A micro level blueprint is needed to plan the implementation of each assessment method at the clinical level. For example, a blueprint for mini-CEX observations should be developed to define how to distribute observed encounters across clinical disciplines, clinical tasks and possible clinical sites. Examples of blueprints are shown later in this resource.
2. What to assess: Decide on what is being assessed.
For WBA in the IMG context, the domains of clinical performance that are to be assessed have been defined in the AMC WBA accreditation guidelines for IMGs.
The clinical domains are:
3. Suitability: Decide on what assessment methods are suitable
Consider which methods will appropriately assess the various dimensions of performance. Figure 2 shows an example of a macro level blueprint mapping the assessment methods to the clinical dimensions. Figure 3 shows an example of a micro level blueprint showing the distribution of mini-CEX encounters across different aspects of the clinical skills dimension and across clinical areas. Other plans are possible.
Figure 2 – An example of a macro level blueprint

| Clinical Dimensions | Direct observation of mini-CEX | ITA / Structured supervision | 360° assessment |
| --- | --- | --- | --- |
| Clinical Skills | ✓ | | |
| Clinical Judgment | ✓ | | |
| Communication skills | ✓ (With patients) | ✓ (With patients & colleagues) | ✓ (With colleagues) |
| Ability to work as an effective member of the health care team | ✓ | | |
| Ability to apply aspects of public health relevant to clinical settings | ✓ | | |
| Cultural competence | ✓ | | |
| Professionalism and attention to patient safety | ✓ | ✓ | ✓ |
Figure 3 – An example of a micro level blueprint

This figure shows the distribution of twelve mini-CEX encounters (Encounters A–L) across different aspects of the clinical skills dimension and across the clinical areas (Adult Health – Medicine, Adult Health – Surgery, Women’s Health O&G, Child Health, Mental Health, Emergency Medicine). The predominant focus of each observed encounter is:

- History Taking: Encounters G and I
- Physical Examination: Encounters A and C
- Investigations and Diagnosis: Encounters E and K
- Prescribing and Management: Encounters B, F and J
- Counselling/Patient Education: Encounter H
- Clinical Procedures: Encounters D and L
4. Balance
Decide on the balance between methods and the relative importance of content. This enables the cells within the matrix to be completed. Decide if all the dimensions of performance defined carry equal weight, and if there are mandatory components that must be completed by all candidates.
5. Stakeholder input
Commitment to WBA from senior leadership is essential to its success. The feasibility of the assessment is important. Stakeholder input on the development of the blueprint is desirable to ensure that the plan devised is acceptable and feasible.
6. Training
Ensure all assessors are familiar with the blueprint. Assessors need to understand how the assessment encounter they are involved in fits into the whole assessment of a candidate and how the results of the assessment will be used. This can be incorporated into assessor training (see Resource 8). Failure to do so runs the risk that individual assessors will not appreciate the importance of their individual task, or may extend their assessment to include other attributes or clinical cases that have been or will be assessed by others.
7. Access to blueprint
A macro level blueprint should be available to assessors and IMGs, to demonstrate how candidates will be assessed in the workplace.
8. Monitoring
Processes should be established to monitor adherence to the blueprint over the period of the assessment process. Ideally, assessments should be scheduled over a period of time.
In any assessment, a standard must be set to enable assessors to form judgments on candidates’ performance. Standards for passing the assessment must be set in a manner consistent with the purpose of the assessment and must be defensible. Several formal methods are available for setting a standard. In essence, all are to some degree arbitrary and rest on the judgment of a group of experienced and trained assessors.
Standards may be relative or absolute. Relative standards use methods such as bell curves to set a proportion of candidates that will pass or fail; one candidate’s performance is compared to other candidates’ performances in forming the judgment. Absolute standards describe what the candidate must demonstrate to pass, and this is the more appropriate approach in WBA.
The overall standard for WBA for IMGs in the Standard Pathway has been set at the level of a minimally competent (just competent) medical officer at the end of PGY1. For those developing WBA programs, while the overall standard has been set, decisions need to be made about how to apply the standard to the individual and combined assessment formats.
Deciding on a passing standard:
Issues to consider in deciding a passing standard are as follows:
After the assessment has been completed, review the consequences of the standard set. The overall passing rates should be reviewed and input sought from a multidisciplinary group of clinicians involved in the assessment process.
The built-in feedback loop is an advantage of WBA. WBA is based on IMGs seeing real patients, and the feedback given enables action plans to be set for improved future performance.
WBA comprises observation (to enable a judgment to be made on performance) and feedback (information that is timely, specific to the IMG’s performance and relevant to effective performance in the workplace).
There is a need for observations to be guided by clear and coherent assessment criteria, which can also serve as focal points during feedback discussions with IMGs.
Here are some examples of feedback that are unlikely, on their own, to effect change in workplace performance:
Comments that fail to assist improvement are typically too general, given too long after the event, based on second-hand views, or solely negative (criticism without advice for improvement), and/or do not ensure that the IMG has understood the advice and knows how to address the problem.
There is no place for personal issues or perceived personality clashes to be raised during feedback on clinical performance. Effective feedback involves a dialogue between the assessor and the IMG, aiming to identify what was done well and not done well, and helps to develop a plan for improvement. In a feedback session, IMGs should be challenged to address each of these issues, with the assessor prompting when they lack the insight to do so on their own. Useful feedback requires time, commitment and precision.
Start with the learner’s agenda when giving feedback. Ask ‘What do you think you did well?’ ‘What do you think needs improvement?’
In giving feedback, some words that describe effective feedback are ‘specific’, ‘immediate’, ‘first-hand’, ‘constructive’, ‘descriptive’, ‘action-affirming’ and ‘adequate’. These descriptors of feedback are outlined as follows:
Specific
The feedback is restricted to the task just performed, and does not include comments that refer generally to other events.
Immediate
The feedback is provided immediately following, or as soon as practicable after, the observed performance.
First-hand
The feedback describes what has just been observed by the supervisor/assessor, and does not include what others might be saying.
Constructive
The feedback provides helpful suggestions for improving performance and/or directs the IMG to resources that can assist; it serves to motivate and reinforce desirable behaviour.
Descriptive
The feedback describes what was good about the performance, plus what was missing and what needs to be done to improve. An honest appraisal, which may contain information the IMG would prefer not to hear, is most appropriately delivered by describing what has just been observed and specifying the actions/behaviour that were not satisfactory. Describe behaviours with ‘I’ statements, such as: ‘I observed that…’, ‘This is what I think you did well…’, ‘These are the areas that I saw need improvement’.
Action-affirming
The feedback sketches out an action plan—which may be recorded on the spot—to give the IMG a summary of expectations. Encourage self-assessment: ‘How might you try to improve?’ ‘Here are some ways you might like to consider.’ Indicate if there are resources that can support achievement.
Adequate
The feedback is detailed and clear, and ensures that the IMG has understood the message being given.
Feedback on under-performance
While assessors and candidates would like to see a successful outcome of the assessment process, the reality is that this will not always be the case. Many assessors find giving feedback to candidates difficult where candidates are not proceeding through the assessment process as might reasonably be expected, or have failed their assessment. The most difficult feedback sessions are those with individuals who lack insight and fail to reflect on their actions, or have not been successful in their performance.
It is important that assessors meet their responsibilities in this regard – a poor or failing performance should be recorded as such.
To deal with this situation, assessors will need:
It will assist candidates in this situation to:
For the assessment system to be robust and defensible it is important that:
For more information on feedback to candidates, visit the Giving Effective Feedback resource of this website.
Selection of assessors
The question ‘who can assess?’ is addressed in the AMC WBA accreditation guidelines for IMGs. Effective assessors are essential to any assessment program.
WBA is most appropriately conducted by clinicians who are:
Typically, assessors are drawn from the clinician pool within the workplace where the assessment is conducted. Alternatively, the assessment may be conducted—or supplemented—by an appropriately experienced and trained team of externally-based clinicians.
Training and calibration of assessors
The expectation is that assessors will have adequate support in fulfilling their assessment responsibilities. Training is important to promote understanding of and confidence in assuming the assessor role.
The aim of a training and calibration program is to ensure that assessors are:
A full plan for a mini-CEX training session is included in Appendix 2. This includes nine taped patient encounters with different candidates and three feedback sessions that can be used as a basis for mini-CEX training.
The basic elements of any training program are shown below.
Part 1. Introduction
Explain the purpose and format of the workshop and present an overview of the assessment method being discussed (its history, and research evidence for its effectiveness).
Part 2. What is being assessed?
Before the participants view the details of the proposed assessment methodology, show examples of the types of performance being assessed and ask the participants to identify the key aspects of each performance – what was good, what was not, and what criteria could be used to judge it.
Part 3: The assessment method
Provide the participants with a sample of the assessment method used and any rating forms involved. Ensure the participants understand the dimensions being assessed and the scale used.
Part 4: Frame of reference training
This type of training can improve the accuracy of ratings. It works by providing the assessor with a context or ‘frame’ for their ratings. Participants view a sample of performances and are asked to rate these using the rating scale provided. After each ‘frame’, participants discuss their scores and the reasons for any variance.
Part 5: Giving effective feedback
Discuss the steps in giving effective feedback. Focus on situations where the IMG has displayed poor performance, as many clinicians find this the most difficult feedback to give. Appendix 2 has three feedback video sessions recorded after mini-CEX patient encounters.
Biased judgments may arise through conflicts of interest in the assessment process. Assessors who work directly with those whom they are assessing face extreme pressure in situations where they observe, and are obliged to document, unsatisfactory performance. It is extremely unlikely, however, that all assessors can be totally free of conflicts of interest.
Measures that may be taken to avoid or reduce the risk of bias in judgments include:
It is possible that situations will arise where a conflict of interest is declared and the assessor(s) absent themselves from the assessment of a particular IMG.
Appropriate and adequate resources are crucial to the success of the assessment process. They contribute significantly to the functioning of a defensible assessment system and are key factors in ensuring that the assessment system is sustainable.
Different environments call for differences in implementing assessment processes; and it is expected that resources and support structures will vary across assessment sites. Practical experience indicates that the resources outlined below are needed to ensure a defensible and sustainable WBA system.
Leadership
Visible commitment from leadership is essential.
Staffing
IMG orientation
Administrative system
A functional WBA system requires dedicated staff support.
Data storage
A spreadsheet for storing and managing data should meet the specific needs of the assessment processes and AMC/medical board reporting requirements.
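As a minimal sketch of the kind of record keeping described above, the following Python fragment appends assessment records to a CSV spreadsheet. The column names and example values are assumptions for illustration only; the actual fields must be matched to local assessment processes and AMC/medical board reporting requirements.

```python
# Illustrative sketch: one possible record layout for a WBA tracking
# spreadsheet, written as a CSV file. Column names are assumed examples.
import csv
from pathlib import Path

COLUMNS = [
    "img_id", "assessment_date", "assessor", "method",        # e.g. mini-CEX, DOPS, CBD
    "clinical_area", "clinical_task", "global_rating",        # e.g. 1-9 mini-CEX scale
    "strengths", "areas_for_improvement", "action_plan",
]

def append_record(path: str, record: dict) -> None:
    """Append a single assessment record, writing the header row if the file is new."""
    file = Path(path)
    new_file = not file.exists() or file.stat().st_size == 0
    with file.open("a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=COLUMNS)
        if new_file:
            writer.writeheader()
        writer.writerow(record)

if __name__ == "__main__":
    append_record("wba_records.csv", {
        "img_id": "IMG-001", "assessment_date": "2024-03-15", "assessor": "Dr A",
        "method": "mini-CEX", "clinical_area": "Adult Health - Medicine",
        "clinical_task": "History taking", "global_rating": "6",
        "strengths": "Structured, patient-centred history",
        "areas_for_improvement": "Closing the encounter",
        "action_plan": "Observe a senior colleague; repeat mini-CEX in two weeks",
    })
```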
Quality control
Assessment data must provide reliable indicators that serve as a basis for valid inferences about the ability to practise independently.
Budgetary considerations
Educational expertise
Authorities conducting WBA should consider drawing regularly on educational expertise and experience. Guidance from an educator might be sought in the following areas:
This resource summarises key messages from the preceding resources. The factors cited here are essential to effective systems of WBA and can serve as a checklist to guide the development and evaluation of your system of assessing IMGs.
1. Visible commitment from leadership for WBA
Developing and implementing a system of WBA that will provide defensible decisions about the performance of IMGs is a significant undertaking. Given the demands of effective systems of WBA, they must exist in a context in which clinical, administrative, and other leaders in your setting convey public support for their importance, and provide staff involved with substantive support.
2. An administrative infrastructure that supports assessors and IMGs in the WBA process
A functional WBA system must offer dedicated staff support time for tasks such as the scheduling of assessments, the selection of patients, and the provision and collection of assessment forms and consent letters. Infrastructure support should also include a system and personnel for filing assessment forms, recording assessment data in electronic spreadsheets, and analysing assessment data.
3. Effective assessors
The most important factor in WBA is effective assessors. ‘Effectiveness’ entails a commitment to the task of:
Commitment should be premised on a system which provides for assessors’ time and support (for example, administrative, financial, educational), rather than adding WBA to an already full schedule. Expertise in WBA should be established through orientation and training programs for assessors.
4. Orientation and commitment of IMGs to the assessment process and their role in it.
IMGs must understand the WBA process, and that the process is used to judge and report their performance. With their understanding of the importance of these assessments to their future, IMGs are likely to be highly focused and ensure that they receive the prescribed number and types of assessments. IMGs must also understand their responsibilities in implementing the WBA system. For example, IMGs may be assigned the responsibility for scheduling a specific number of assessments in a term.
5. A plan (blueprint) defining ‘what’ is to be assessed
An important challenge facing the WBA system is to obtain a sample of an IMG’s performance which permits generalisation about the overall level of performance. It is important that this sample be elicited over a wide range of patients (for example, age, complaints, gender, culture, acute and chronic illness) and clinical dimensions/areas and clinical settings (such as in-patient and ambulatory settings), and employ sufficient numbers of observations and assessors. As outlined in the AMC WBA accreditation guidelines, representative sampling can be planned by developing a blueprint identifying the spread of observations across disciplines and across clinical areas.
6. A plan which defines the selected assessment strategies
A WBA strategy should include a plan for multiple snapshots over time, using multiple methods and multiple assessors. The assessment strategies should be selected on the basis of what the provider wants to assess in the clinical context, and what each method assesses well. Using multiple assessors will take into account that some assessors are ‘hawks’ while others are ‘doves’, and that different assessors focus on different aspects of clinical performance. It is also important to ensure that assessors do not have a conflict of interest in their roles as assessors, through either their personal or work relationships with those whom they are assessing.
7. A quality control system for monitoring the assessment process and the measurement qualities of the assessment data.
Steps should be taken to ensure that the WBA system is providing reliable assessment data on the IMGs and serving as a basis for valid inferences about their performance. Evidence of validity and reliability can be obtained by eliciting data from assessors and IMGs on their perceptions of factors that may affect these attributes, and by calculating reliability estimates for your assessment data. Help in calculating reliability estimates can be obtained from educational experts.
8. A plan to identify and address practical considerations which may affect assessments
A key factor which may invalidate assessments of IMGs is the conflict of interest that assessors may have. Ideally, an assessor should be ‘at arm’s length’, not having a friendship, collegial or employment relationship with the individual being assessed. This is a difficult condition to meet in many settings. One of the principles underpinning WBA is to use multiple assessors, and it is important to ensure that at least some of these have no conflict of interest in their relationship with the IMG.
A second factor that may affect assessments is the restricted nature of the patient population in some settings. If, for example, the setting does not provide encounters with women’s health, paediatric or psychiatric patients, the assessment system will need to identify alternative clinical venues where such patients exist and where IMG assessment can occur.
9. Mechanisms to evaluate and support feasibility
Effective systems of WBA for IMGs require a significant commitment of resources. A key resource is the availability and time of clinician assessors. In planning systems of WBA, projections should be made regarding the number of clinicians available and the time required of clinician assessors. These projections should be translated into costs, and strategies for addressing these costs should be defined.
Another key resource for WBA is access to a suitable spectrum of patients.