Blueprint & Other Guidelines for Paper Setters

“My Lord, increase me in knowledge.” (Holy Quran 20 : 114)

Learning Objectives

Explain what a blueprint is
Explain the purpose of a blueprint
Describe how a blueprint is made

Introduction

Students often come out of an exam hall complaining that:

Unimportant questions were given
Questions came from outside the syllabus
Too many questions came from a particular topic/chapter/unit
Not enough time to complete the paper; paper was too lengthy
There was no question from such an important unit/chapter
Most of the questions were too difficult
Paper was too easy, a piece of cake
Questions were too vague
This was not taught to us

The Blueprint

Written examinations are the most commonly employed method to assess knowledge in medical education and are used to test recall abilities as well as higher-order cognitive functions, such as the interpretation of data and problem-solving skills.

Blueprinting can be defined as the creation of a template to determine the content of a test; it is a detailed plan of action for a paper.

A blueprint is a two-way matrix that ensures that all aspects of the curriculum and educational domains are covered by assessment programs over a specified period of time. It is a chart which shows the placement of each question in respect of the objective and the content area that it tests.

It should be drawn up by paper setters for any formative or summative test, including MCQs, SEQs, OSPEs and structured viva. It should be drawn up for a small class test, term exam and sendup, as well as university examinations.

It lists the number and type of questions across the course content, with relative weightage given to each topic/chapter/unit (depending on the examination). Weightage given is determined according to number of learning objectives & their relative importance.

There may be subtle or huge differences between what is taught and emphasized in lectures, what is mentioned in the syllabus or textbooks, and what is assessed. All this has to be systematically organized.

Blueprinting is increasingly used in the field of medical education worldwide; the assessments are prepared in such a way that those students who have not met important learning outcomes are not able to graduate.

Content validity gauges the extent to which an assessment covers a representative sample of the material which should be assessed; for example, if examination questions cover the learning objectives of the syllabus, the examination is considered to have content validity. Other objectives of blueprinting are given below.

Objectives/Purpose of a Blueprint

An examination blueprint:

Increases validity and reliability of the paper
Is a guide to paper construction
Ensures proper weighting of marks for important topics
Aligns questions with learning objectives
Distributes questions according to clinical importance
Minimizes inter-examiner variations in selecting questions
Ensures uniform distribution of questions from the syllabus
Prevents over or under-sampling of questions from a single topic
Ensures papers are well organized and test in-depth subject knowledge
Ensures appropriate overall difficulty level for an average student
Ensures sufficient time given for attempting the paper
Helps students focus on key areas of clinical importance/public health.
Improves examination results

Criteria & Significance of a Good Assessment

Good Validity

Validity is a measure of the degree to which the assessment actually reflects the qualities in the candidate that it is intended to measure
Content Validity: Questions should be congruent with learning objectives of the course and their relative importance
Validity is the strength of the conclusions, inferences or propositions we draw from the assessment.

High Reliability

The degree to which the result of a measurement, calculation, or specification can be depended on to be accurate. Reliability is the degree of consistency of a measure. A test will be reliable when it gives the same repeated result under the same conditions.
Individual marks would not change appreciably over many more assessments.

Reproducibility

This means that if a student were given a similar examination repeatedly (incorporating many different cases, many different written questions, etc.) he or she would come up with a similar mark on each occasion.
This constitutes the proof that the assessment is accurately reflecting the candidate’s actual ability, and not just their good luck or bad luck in meeting a particular group of questions, patients (in a clinical exam), or examiners.

Generalizability:

The marks given for the sample of cases or questions really do represent the marks that would be obtained if a very much wider range of questions had been attempted.

Uniformity & Consistency

There should be planned and organized distribution of questions from the syllabus. Proper weightage should be given to each area
It also implies that papers should be of similar difficulty each year. It should not be that there is a difficult paper one year and an easy paper the next year

Feasibility

The state or degree of being easily or conveniently done
Questions should be at the level of the students (undergraduate etc.)
Questions should be from the prescribed syllabus
Questions should be clearly written and unambiguous
An average student should be able to answer the questions in the allotted time

Acceptability

Both candidates and the examiners find the purposes and format of the assessment reasonable and acceptable
Item indices (difficulty index, discrimination index etc.) should be determined after the paper for MCQs and SEQs

Factors Affecting Difficulty & Discrimination Indices

Objectivity

Should fulfill learning objectives
‘Objective’ has two senses: being the object or goal of one’s efforts or actions; and not influenced by personal feelings, interpretations, or prejudice, i.e. based on facts and unbiased

Transparency, Fairness and Accountability

Unfair and irregular practices in the exam system can impede access of worthy candidates to progress
Right to information on the qualifications and experience of the paper setter(s)
A common request for information is to ask to see the marks given by examiners on the answer papers

https://www.thedailystar.net/opinion/perspective/making-public-exams-transparent-1519834

Making of the Blueprint

Learning Objectives
The syllabus, especially the topics of the exam or test, is used to make a blueprint
Examination content should match the syllabus
Learning objectives of each topic should be well designed and preferably displayed beforehand
Learning objectives are designed according to university syllabus, recommended textbooks, past papers and other necessary information for undergraduate students
Learning objectives for each topic are scored according to quantity, clinical importance (impact on health and frequency/prevalence in society) and time given to them.
The relevant units and/or topics are then given weightage, i.e. the number of questions (SEQs, MCQs) to be given to them, based on the above information
The individual weightage of each unit, chapter or topic is determined from the indicators of a blueprint

Indicators of a Blueprint

1. Impact

Relative Impact Score:

Critical
Essential
Important
Need to Learn
Nice to learn
Trivial

The impact score is determined by the following factors

Number of learning objectives
Number of lectures/time devoted to the area
Number of Pages
Incidence & prevalence in the society
These are the reasons why the number of questions from CNS in the UHS Pharmacology paper has to be increased

The final impact score ranges from 1 to 3.

Impact score of 1 implies ‘Critical’ and ‘Essential’
Impact score of 2 implies ‘Important’ and ‘Need to Learn’
Impact score of 3 implies ‘Nice to learn’ and ‘Trivial’

2. Frequency

This tells us about how frequently the topic/learning objective/question has been asked in previous formative and summative examinations.

It also ranges from 1 – 3.

Frequency score 1 means less frequently asked question
Frequency score 2 means moderate frequency of asking question
Frequency score 3 means high frequency of asking question

Weightage of Each Content

The following steps are followed to decide the weightage of each content area:

Calculate I × F, i.e. impact of the topic × frequency of asking questions from the topic
Calculate the sum of all I × F values; this is labeled “T”
Calculate the weightage coefficient (W) as (I × F) / T
Multiply the weightage coefficient (W) by the total number of items
Calculate the adjusted weightage of each content area as per total marks
All this can be displayed in a table in MS Excel or MS Word
The same can be displayed for MCQs, SEQs etc.
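
The weightage steps can also be worked through in a few lines of code instead of a spreadsheet. The following is only a minimal sketch in Python: the topic names, their impact (I) and frequency (F) scores, and the total of 20 items are made-up values for illustration, and rounding to whole questions is left to the paper setter.

```python
# Illustrative sketch of the weightage steps; all input values are assumptions.
# Each topic maps to (impact score I, frequency score F), both on the 1-3 scale.
topics = {
    "Topic A": (1, 3),
    "Topic B": (2, 2),
    "Topic C": (3, 1),
    "Topic D": (2, 1),
}
TOTAL_ITEMS = 20  # assumed number of questions in the paper


def blueprint_weights(topics, total_items):
    """Return I x F, weightage coefficient W and item share for each topic."""
    # Step 1: I x F for every topic
    products = {name: i * f for name, (i, f) in topics.items()}
    # Step 2: T is the sum of all I x F values
    t = sum(products.values())
    rows = {}
    for name, product in products.items():
        w = product / t  # Step 3: W = (I x F) / T
        # Step 4: W multiplied by the total number of items (not yet rounded)
        rows[name] = {"IxF": product, "W": w, "items": w * total_items}
    return rows


rows = blueprint_weights(topics, TOTAL_ITEMS)
for name, row in rows.items():
    print(f"{name}: IxF={row['IxF']}, W={row['W']:.3f}, items={row['items']:.2f}")
```

The item shares are fractional, so they still have to be adjusted by hand to whole questions that add up to the total. The same function can be run with total marks in place of total items.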

The difficulty level of each question should also be displayed i.e. C1, C2 or C3 level (example ahead)

C1 (Recall), C2 (Understanding) and C3 (Analysis)

Levels of Cognition

C1. Recall. Student remembers or memorizes

e.g. Enumerate 5 causes of fever.

This is the lowest level of knowledge.

C2. Understand. Student describes or explains

e.g. Explain findings of an ECG or X ray chest

C3. Analysis. Student uses information gained from different sources to reach a diagnosis

e.g. considering history, exam, blood gas levels, CT findings. Scenario based.

This is the level we should aim for in teaching & in assessment.

For a long time there were 3 levels of cognition, C1 – C3. Now, as shown in the picture, there are 6 levels.

The highest level is ‘Creation’. Student is asked to use knowledge to come up with his/her own plan/solution to a problem

e.g. Design a management plan for a 30-year-old lady who has reported with high-grade fever, 6 days after a delivery at home.

Showing the cognitive level of each question is necessary so that, at the end, one can see whether the levels are appropriately distributed, i.e. not all questions should be of C1 level or of C3 level.

This means for early formative exams, like class tests, 50 % of questions should be of C1 level, 40 % of C2 and 10 % of C3 level.

The percentages may be altered, e.g. by increasing the ratio of C3 level questions later on in the session, in term, sendup and professional exams.
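
Once each question has been tagged with its cognitive level, the spread can be checked against the intended percentages. The sketch below assumes a hypothetical 10-question class test and uses the 50/40/10 split mentioned above as the target; the tagged question list is invented for illustration.

```python
from collections import Counter

# Target split for early formative exams, as stated in the text (percent).
TARGET = {"C1": 50, "C2": 40, "C3": 10}


def level_distribution(levels):
    """Return the percentage of questions at each cognitive level."""
    counts = Counter(levels)
    total = len(levels)
    return {lvl: 100 * counts.get(lvl, 0) / total for lvl in TARGET}


# Hypothetical class test: six C1, three C2 and one C3 question.
paper = ["C1"] * 6 + ["C2"] * 3 + ["C3"] * 1
actual = level_distribution(paper)

for lvl in TARGET:
    diff = actual[lvl] - TARGET[lvl]
    print(f"{lvl}: actual {actual[lvl]:.0f}%, target {TARGET[lvl]}%, difference {diff:+.0f}")
```

For later exams, only the TARGET values need to change, e.g. with a larger share of C3 questions.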

Cognitive competence goes far deeper than merely remembering facts. Exercise of these higher levels of cognition should be actively promoted and explicitly assessed

Marks allotted to each question or part of the question pertaining to the topic or learning objective should be mentioned in a separate column.

Different segments of the content should be assessed in MCQs and SEQs; this also helps in avoiding repetition.

In the last column, the time allotted to each question should be given, i.e. the approximate time in which we realistically expect an average student to answer the question.

It is also recommended that blueprints should be prepared by different subject experts every time and should be peer reviewed.

An example of a blueprint is given below:

Time Factor

As one can see in the above table, the time expected for answering each SEQ is given. This can be estimated from the time a student requires to answer the SEQ (one can ask a demonstrator to attempt the SEQ). The same can be done with MCQs (average 1 minute; some should be answered in 15-30 seconds, some in 30 seconds to 1 minute, and C3 level MCQs in 1.5 minutes).
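
The MCQ timings above can be turned into a rough estimate of the time needed for a whole MCQ paper. This is only a sketch: assigning the 15-30 second range to C1 and the 30 seconds to 1 minute range to C2 is an assumption (the text only ties 1.5 minutes to C3), and the 50-MCQ paper is hypothetical.

```python
# Seconds per MCQ as (low, high). Mapping the first two ranges to C1 and C2
# is an assumption; only the 1.5 minutes for C3 is stated explicitly.
MCQ_SECONDS = {"C1": (15, 30), "C2": (30, 60), "C3": (90, 90)}


def mcq_time_range(counts):
    """Return (low, high) estimated minutes for a paper with the given counts."""
    low = sum(n * MCQ_SECONDS[lvl][0] for lvl, n in counts.items())
    high = sum(n * MCQ_SECONDS[lvl][1] for lvl, n in counts.items())
    return low / 60, high / 60


# Hypothetical paper: 20 C1, 20 C2 and 10 C3 MCQs.
counts = {"C1": 20, "C2": 20, "C3": 10}
low_min, high_min = mcq_time_range(counts)
print(f"Estimated time: {low_min:.0f} to {high_min:.0f} minutes")
```

If the upper estimate is more than the time allotted, the paper is probably too lengthy for an average student.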

Conclusion

In conclusion, constructing a test blueprint includes the following steps:

Identify the content areas of the subject matter (sections) to be measured by the exam

Identify the learning outcomes (domains) across each section to be measured by the exam

Weight the sections and domains in terms of their relative importance

Construct a spreadsheet in accordance with relative weights by distributing the test items

Allocate the relevant cells of the spreadsheet with their level of cognition

The question paper should be fairly distributed over the whole syllabus prescribed for the course during the academic semester. No question or part thereof should be out of the prescribed syllabus. Repetition of questions must be avoided.

The blueprint makes the assessment clear, explicit and transparent to everyone involved in the process of learning. It makes assessment ‘fair’ to the students as they can have a clear idea of what is being examined and can direct their learning efforts in that direction. Blueprints arising from these detailed specifications form an exact sampling plan for the content domain to be tested.

Other Examination Guidelines

When asking questions, action verbs from Bloom’s Taxonomy should be used. Action verbs used in learning objectives should also be used in question papers, such as ‘enlist, explain, tabulate, describe, compare, enumerate, classify’ etc.

Learning objectives should be SMART (specific, measurable, attainable, relevant and time-focused)

Use following verbs for the given cognitive levels:

C1: Enlist, Enumerate, Classify, Define, Describe, Name, Label

C2: Explain, Rationalize, Tabulate, Differentiate, Calculate

C3: Compare and Contrast, Justify, Discriminate

Bloom’s Taxonomy Verb Chart

Avoid using ‘what’, ‘write down’, ‘give’ etc. in questions. These are ambiguous: students and examiners will not know whether one is being asked to define, enlist, describe, explain etc.

Include the mark allocation for each question and parts of a question, with a more detailed breakdown where necessary
Proofread the text once again oneself and then pass the paper on to the reviser, preferably a senior colleague, for the final proofreading.
Pass on the finalized draft of the paper to an external reviser who has to proof read the text again, ensure that no test item is out of syllabus, check that all set tasks are workable and that the paper can be completed in the set time.
One should be able to give satisfactory answers to the following questions :

Window Dressing :

‘superficial or misleading presentation of something, designed to create a favourable impression.’

Verbosity:

‘the fact or quality of using more words than needed’

The above picture shows questions that are NOT scenario based! They are simple questions which have undergone window dressing/verbosity. There is no link between the scenario and the lead question (underlined in red).

These lead questions could have been asked separately WITHOUT the preceding window dressing.

A true scenario based question has to have a link between the stem and the lead question, whether for an MCQ or a SEQ.

Examples of scenario based questions given below:

Also read this blog on flaws in designing MCQs :

Tips for Answering MCQs for the ‘Testwise’ Students

With great acknowledgments and thanks to my mentor in Medical Education Dr. Fahad (Dental Surgeon, Medical Educationist)

Dr Nauman Shad
