Student information about the Progress Test of September 2022

What is adaptive testing?

(Computer) adaptive testing presents each student with a series of questions adapted to that student’s knowledge level. Because the difficulty of every question is known, every student can receive a Progress Test that fits his or her knowledge. Such an individualized, automatically generated test has the benefit that not all students have to take it at exactly the same time.

How does computer adaptive testing work?

With computer adaptive testing, the computer selects questions from a large question bank. The selection of every subsequent question is based on the answer given to the prior question. For example, if the student answers a question correctly, the algorithm will select a slightly more difficult next question from the question bank. This continues until the student gives an incorrect answer, after which the algorithm selects a question that is easier than the preceding one. This performance-based selection of questions continues until the system has gathered enough information to correctly estimate the student’s knowledge in certain disciplines. However, the blueprint of the Progress Test (i.e., its content specifications: categories and disciplines) will not change compared to the ‘original’ version. The algorithm will still cover all disciplines.
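The selection rule described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the algorithm used for the Progress Test: the fixed step size, the difficulty scale, the question identifiers and the function name `run_adaptive_test` are all hypothetical, and the sketch ignores the blueprint (disciplines and categories).

```python
def run_adaptive_test(bank, answer_fn, n_questions, start=0.0, step=0.5):
    """Simplified adaptive loop (hypothetical, for illustration only).

    bank        -- dict mapping question id -> known difficulty
    answer_fn   -- callable(question_id, difficulty) -> True if answered correctly
    n_questions -- number of questions to ask
    """
    estimate = start            # current guess of the student's knowledge level
    remaining = dict(bank)      # questions that have not been asked yet
    history = []
    for _ in range(n_questions):
        # Pick the unused question whose difficulty is closest to the estimate.
        qid = min(remaining, key=lambda q: abs(remaining[q] - estimate))
        difficulty = remaining.pop(qid)
        correct = answer_fn(qid, difficulty)
        # Correct answer: aim higher next time; incorrect answer: aim lower.
        estimate += step if correct else -step
        history.append((qid, difficulty, correct, estimate))
    return history


# Usage: a made-up student who answers every question up to difficulty 1.0 correctly.
demo_bank = {f"q{i}": -3.0 + 0.25 * i for i in range(25)}
for row in run_adaptive_test(demo_bank, lambda qid, d: d <= 1.0, n_questions=10):
    print(row)
```

In this sketch the estimate climbs while answers are correct and then oscillates around the level at which the student starts to make mistakes, which is the wavy pattern the text describes.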

The principle of adaptive testing is illustrated in the following figure:

[Figure: Adaptive testing principle]

Considering that a student does not perform equally well in every discipline, a practical example will look like this:

[Figure: Run 1]

The horizontal line is the student’s predicted knowledge level at the end of the Progress Test. The other, more wavy line between the ticks shows the knowledge level determined by the computer during the Progress Test, which was used to determine the difficulty level of the following question.

What are the benefits of adaptive testing?

Computer adaptive testing ensures a completely individualized test. The test is thus different for every student. Computer adaptive testing is a state-of-the-art technology that enables precise estimation of students’ knowledge in a short period of time. The number of questions that needs to be answered for this precise estimation is lower than before. The total number of questions during the Progress Test of September 2022 is 135.

Since all students will take an individual and therefore different test, the Progress Test does not have to be taken by all students at the same time. In this way, education and clinical activities do not have to be interrupted for the Progress Test to take place. Additionally, the implementation of the computer-adaptive Progress Test enables the use of other media, including sound and video, for questions in the future.

In short, the most important benefits of adaptive testing are the flexibility that it creates and its precise measurements.

The technology behind adaptive testing is explained in the following video: https://www.youtube.com/watch?v=ZvFNwR8ABo4&t=65s

What changes about the questions with adaptive testing?

The new approach of the Progress Test offers many benefits. However, some changes were necessary. Firstly, students can no longer take the test booklet with questions home, since the adaptive Progress Test will now be held exclusively online. The questions will remain in the question bank. To ensure that students still remember what the questions of the Progress Test were about, they will receive a short description of the content of each question, as well as whether or not the question was answered correctly. Secondly, the question mark option will be removed. This feature does not suit the adaptive environment, because it does not provide adequate information on whether or not the student has knowledge of the topic of the question. It is therefore important that you answer every question as well as possible, as this determines the next question that you will receive. However, the adaptive characteristic will eventually improve the Progress Test, resulting in fewer questions that you will not be able to answer.

If you want to check the type of questions that are presented during a Progress Test, please visit the iVTG website (www.ivtg.nl), which offers a complete Progress Test of 200 questions. Additionally, the question bank is made available (to a certain extent) in collaboration with Medisch Contact through the website “Arts in spe”, to be found at www.medischcontact.nl/kennis-carriere/voortgangstoets.htm.

What changes about the testing with adaptive testing?

The adaptive Progress Test will only be held digitally (on a computer). This approach has been tested for several years and has already been implemented in the international medical curriculum of Maastricht. In May 2022, many students of several universities took the Progress Test in its original form, as well as in the new adaptive form. This showed that the computer-adaptive testing system works well, and that the test results of individual students are comparable between the original and the adaptive form of testing. From September 2022 onwards, the Progress Test will no longer be held in its original form. Every student will now take the Progress Test digitally during an assigned time slot on campus. However, this time slot will not be the same for every student, since every student will take a different, individualized test.

Since the adaptive Progress Test consists of 135 questions, the examination time has been reduced to 3 hours. The Progress Test has to be completed to ensure a valid result.

The questions will be adapted to the student’s knowledge level. Therefore, you will receive fewer questions that can be answered without having to think about them. In turn, the adaptive Progress Test will most likely feel like a more difficult test than the original Progress Test. Because the CAT system chooses each question based on the result of the previous question, it is not possible to go back to an earlier question; every question must therefore be answered when it is presented.

Why did the Progress Test approach change to adaptive testing?

The current Progress Test is a very long test consisting of 200 questions, with a maximum time of 4 hours. The test is based on the level of a basic medical doctor (‘basisarts’), which means that many students (especially in their bachelor’s) have difficulties answering the questions. Furthermore, the fact that all students have to take the original Progress Test at exactly the same time is a logistical challenge.

The adaptive Progress Test will be shorter and will offer questions at the level of the student. Adaptive testing has also been shown to be more efficient, since students require less time to finish their test. The test score will not only be more informative, but will also correspond better to the estimated knowledge level. Additionally, the reliability of the test score is comparable to that of a paper Progress Test that is twice as long.

The adaptive Progress Test will not be held at one time point, but all students will be distributed over time slots during one week. The Progress Test will still be held four times a year.

What are the reasons to change the method of how the Progress Test is taken?

There are several reasons to change the way students take the Progress Test:

  • The transition from knowledge-based questions to context-rich, relevant questions led to longer question texts. For example: the previous test booklet was 42 pages long. The Progress Test is meant to test knowledge and its application in the medical field, not your ability to focus and concentrate.
  • Mainly in the first years of the medical program, most questions are far too difficult. As a result, the student is effectively tested on only the small number of questions that he or she could potentially answer. This results in an unfavorable signal-to-noise ratio, with a bigger chance of an unjustified insufficient grade.
  • It is difficult to ensure security and safety measures for one identical test that is taken by an increasing number of students every year, resulting in an increased risk of (large-scale) cheating/fraud.
  • An identical Progress Test requires examination at the same time, which is not always possible for every participating faculty. Considering the increasing number of students having to take the Progress Test, finding proper locations is more difficult too, which compromises the current system of progress testing.

What will not change with adaptive testing?

The Progress Test will still be held four times per year. The rules for determining the end result at the end of the year will still apply.

What kind of questions will be asked during an adaptive Progress Test?

Computer-adaptive testing is only possible if the difficulty of a question is known. This difficulty can only be estimated by asking the question. There is a large pool of questions available of which the difficulty is known, because they have been asked during previous Progress Tests. However, this pool of questions has to be updated regularly, meaning that newly developed questions will also be asked during the adaptive Progress Test. Because the difficulty of these novel questions is not known, these questions will not count for the test score. The results will, however, be used to calculate the difficulty of each novel question so that it can be added to the question bank. The Progress Test consists of 135 questions, of which 15 are newly developed.

Who will encounter the Progress Test changes with adaptive testing?

All medical students studying at the medical faculties of Maastricht, Nijmegen, Groningen, Leiden, Amsterdam (VU and UvA) and Rotterdam will take the new adaptive Progress Test from September 2022 onwards. Utrecht will start with pilots in December 2022 and complete implementation of the adaptive system in 2023-2024.

What is different about the test score and feedback with adaptive testing?

After finishing the Progress Test, students will receive a score in TestVision once the standard (norm) has been determined. In addition, students will receive feedback on their progress in the Prof system. All of this remains the same. Since the difficulty of the questions is determined prior to the test, the standard setting will eventually no longer be adapted to the test results. In other words, the standard will then be absolute rather than relative. How the standard is set during the transition period is explained under the final question in the list below.

Two things will change:

  1. The score will be shown on a different scale. It is necessary to express the score on a standardized scale, since every student will take a different test. More information on standardized scaling can be found at: https://www.youtube.com/watch?v=2JjaWQZChqs

     The scale will be set up so that it matches the scores that you are used to as closely as possible. The Prof system will provide graphical information on your performance relative to the entire cohort. Research has shown that properly analyzing the feedback in Prof results in better performance.

  2. You will receive a report of all correctly and incorrectly answered questions in TestVision. All questions in the question bank will get a short question topic. You can use this question topic as an indication of what to study.

How is the difficulty of novel questions determined?

The level of new questions is determined by the so-called Rasch model using the item-response theory. This means that the chance that one student answers a question correctly is compared to the chance that this student answers the other questions of the test correctly (the knowledge level of the student). These two aspects can be plotted to determine the difficulty of the novel question with respect to the “average question”. The figure below illustrates this principle. The vertical, black line is depicted just right of the 0 line, which indicates that this question is a bit more difficult than the average.

[Figure: FAQ3]
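As a rough illustration of this principle, the sketch below uses the standard Rasch formula, in which the chance of a correct answer depends only on the difference between the student’s knowledge level and the question’s difficulty. It then estimates the difficulty of a novel question from the answers of students whose knowledge levels are assumed to be known already. The function names, the search interval and the example numbers are assumptions made for this sketch; the actual calibration procedure may differ.

```python
import math


def rasch_probability(ability, difficulty):
    """Rasch model: chance that a student with this ability answers correctly."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def estimate_difficulty(abilities, responses, lo=-6.0, hi=6.0, tol=1e-6):
    """Maximum-likelihood difficulty of one question, given known abilities.

    abilities -- knowledge levels of the students who saw the question
    responses -- 1 (correct) or 0 (incorrect) for each of those students
    The search interval [lo, hi] is an assumption of this sketch.
    """
    observed = sum(responses)
    if observed == 0 or observed == len(responses):
        raise ValueError("need at least one correct and one incorrect answer")

    def excess(difficulty):
        # Expected minus observed number of correct answers.
        expected = sum(rasch_probability(a, difficulty) for a in abilities)
        return expected - observed

    # 'excess' shrinks as the difficulty grows, so bisection finds its root.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if excess(mid) > 0:
            lo = mid   # too many expected correct answers: question is harder
        else:
            hi = mid
    return (lo + hi) / 2.0


# Usage with made-up data: only the strongest of four students answers correctly,
# so the estimated difficulty lies above the average (0).
print(estimate_difficulty([-1.0, 0.0, 1.0, 2.0], [0, 0, 0, 1]))
```

A positive result corresponds to a question that is more difficult than the “average question”, like the vertical line to the right of 0 described above.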

How is the information for calibration collected?

Calibration information is based on all participating students rather than on the students of one faculty. The number of participating students has been around 10,000 for the past couple of years. All questions from 2007 onwards have been calibrated and also checked for validity/usefulness. This check is performed by two faculties for every single question; where there was no consensus, the question was discussed (for a third time) in a meeting in which all faculties were represented.

Is it correct that 80% of the questions of the adaptive Progress Test are actually “old” questions?

The newly developed questions do not count for the test score. This means that 100% of the questions that count for the test result are “old” questions.

How many questions are currently in the question bank?

Approximately 7800 questions.

When a question is answered incorrectly, how often will questions be asked from the same discipline?

The result of an individual question does not affect the choice of discipline of the subsequent question.

My test featured multiple questions on the same subject, is that supposed to happen?

If you have difficulty with a particular subject, multiple questions on this exact subject might be more noticeable. The consequences of this are presumed to be small, considering the limited impact of two questions on your final results.

There are two explanations for two or three questions on the same subject:

  1. In a certain section of the question bank there could be multiple very similar questions, and two or more could be given to you by chance. As the level of difficulty is adjusted along the way, based on your answers to previous questions, a question answered incorrectly at the beginning might return later on in a simpler format, making it easier to answer.
  2. One of the questions could count towards your final result while the other is a pretest item and does therefore not count. How does that work? Not all questions in your test contribute to your final result. Dispersed throughout the test are 15 so-called pretest items. These are inserted in the test to assess their difficulty level for inclusion in the question bank and use in future progress tests. In the selection of these pretest items, the content of the previous questions is not taken into account.

Although having duplicate questions does not have major repercussions, it could discourage studying, and the progress test commission therefore aims not to ask multiple questions on the same subject. For that reason, all questions (ca. 7000) and all new questions are additionally checked for textual similarities; similar questions are then compared manually to judge whether they ask the same thing. The questions that are considered too similar are marked as ‘enemy items’; TestVision then automatically prevents these questions from appearing together in a student’s test.
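The idea behind enemy items can be sketched as a simple filter on the candidate questions. How TestVision implements this is not described here, so the function below, its name and its data layout are purely hypothetical.

```python
def pick_question(candidates, already_asked, enemy_pairs):
    """Return the first candidate that is not an enemy of an asked question.

    candidates    -- question ids in order of preference
    already_asked -- set of question ids already in this student's test
    enemy_pairs   -- iterable of (id_a, id_b) pairs marked as enemy items
    Returns None when no suitable candidate is left.
    """
    blocked = set()
    for a, b in enemy_pairs:
        # If one half of an enemy pair was asked, the other half is blocked.
        if a in already_asked:
            blocked.add(b)
        if b in already_asked:
            blocked.add(a)
    for qid in candidates:
        if qid not in already_asked and qid not in blocked:
            return qid
    return None


# Usage: q2 is an enemy of q1, which was already asked, so q3 is chosen.
print(pick_question(["q1", "q2", "q3"], {"q1"}, [("q1", "q2")]))
```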

To identify these enemy items, your input is very helpful. If you find two or more (nearly) identical questions, please report which questions they are, either during the test, directly afterwards or during the post-test review session. This report can be filed in TestVision’s commentary box. Please mention the question numbers and the subject so we can process the report properly.

Which possibilities are offered for students with a disability (e.g., having difficulties with digital testing)?

This will be managed by the local exam committee/examiner.

Is there still a measuring moment? Will this test result be comparable to previous test results?

The adaptive Progress Test will be held 4 times a year, resulting in a total of 24 measuring moments. The distribution of the Progress Tests throughout the medical program therefore does not change with the start of adaptive testing. The Progress Tests follow your study plan up to measuring moment 12. After you graduate from the bachelor’s program, the Progress Tests continue up to measuring moment 24. The scale on which the test result of the adaptive Progress Test is expressed is comparable to that of the previous test results.

How long will it take for students to receive their test result?

This depends on the time slot of your Progress Test and on when the last Progress Test is held during that week. The maximum amount of time to receive the test result is established in the local education and examination regulations (Dutch: “opleiding- en examenreglement” or OER).

How is the result of the test determined and how does this relate to the paper test?

The result of an individual student is determined according to the “Weighted Likelihood Estimation” (WLE) method by Thomas A. Warm (1989). This score is compared with the mean of the total student population and is expressed as a z-score. This z-score is then transformed to a standard scale with a mean of 35 and a standard deviation of 15. These two values are based on the mean progress test scores of the last years; their aim is to make the scores of the adaptive test fit the scores we were used to in the paper progress test as closely as possible. This works well. However, because the principle of the adaptive test is rather different from the old paper test, the scores and results behave differently to some extent. For example, there can be negative scores. This can be explained by the fact that the z-score can be negative (a z-score of 0 corresponds to the mean, and -1 or +1 to one standard deviation below or above it). For example: after the transformation to the standard scale, a student scoring 3 SD below the mean of the year group receives a score of -10 (35 - 3 × 15). In the old paper test a negative score was also theoretically possible because of the “correction for guessing”.
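The scoring steps above can be sketched as follows. The transformation to the standard scale (mean 35, standard deviation 15) follows directly from the text. The ability estimate is a minimal Rasch-model version of Warm’s weighted likelihood estimate, solved by bisection; it is a sketch under assumptions and not the production scoring code. All function names, the search interval and the example numbers are hypothetical.

```python
import math


def rasch_probability(ability, difficulty):
    """Rasch model: chance of a correct answer."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def wle_ability(difficulties, responses, lo=-10.0, hi=10.0, tol=1e-8):
    """Warm's weighted likelihood estimate of ability under the Rasch model.

    Finds a root of: sum(x_i - P_i) + I'(theta) / (2 * I(theta)) = 0,
    where I is the test information. The interval [lo, hi] is an assumption.
    """
    def estimating_function(theta):
        ps = [rasch_probability(theta, b) for b in difficulties]
        score = sum(x - p for x, p in zip(responses, ps))
        info = sum(p * (1.0 - p) for p in ps)
        d_info = sum(p * (1.0 - p) * (1.0 - 2.0 * p) for p in ps)
        return score + d_info / (2.0 * info)

    # Positive at the low end and negative at the high end: bisect for a root.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if estimating_function(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def z_score(ability, cohort_mean, cohort_sd):
    """Position of the student relative to the total student population."""
    return (ability - cohort_mean) / cohort_sd


def to_standard_scale(z, mean=35.0, sd=15.0):
    """Transform a z-score to the reporting scale described in the text."""
    return mean + sd * z


# Usage with made-up numbers: a student 3 SD below the cohort mean scores -10.
print(to_standard_scale(-3.0))
theta = wle_ability([-1.0, 0.0, 1.0], [1, 1, 0])
print(to_standard_scale(z_score(theta, cohort_mean=0.0, cohort_sd=1.0)))
```

Unlike a plain maximum-likelihood estimate, the weighted estimate stays finite even when a student answers every question correctly or incorrectly.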

The adaptive test uses questions of which the difficulty is precisely known. Therefore, a relative norm is no longer necessary for a fair judgement. During the first year of adaptive testing, the results are carefully monitored for changes compared with the old paper setting, especially around the I/S cut-off points.