The effectiveness of automatic item generation for the development of cognitive ability tests
Research has shown that the increased use of computer-based testing has brought about new challenges. Because tests can now be administered online so easily, a large number of items is needed to maintain the item bank and minimise item exposure rates. However, the traditional item development process is time-consuming and costly, so alternative ways of creating items are needed. Automatic Item Generation (AIG), which uses algorithms to create test questions, is an effective method for generating items rapidly and efficiently. However, many existing generators are closed source and available only to a select few. There is a lack of open-source, publicly available generators that researchers can use to study AIG in greater depth and to generate items for their own research. Furthermore, research has indicated that AIG is far from fully understood, and more research into its methodology and into the psychometric properties of the generated items is needed for it to be used effectively. The studies conducted in this thesis have achieved the following: 1) Five open-source item generators were created, and the items they produce were evaluated and validated. 2) Empirical evidence showed that a weak theory approach to developing item generators was just as credible as a strong theory approach, even though the two are theoretically distinct. 3) The psychometric properties of the generated items were estimated using various item response theory (IRT) models to assess the impact of the template features used to create the items. 4) Joint modelling of responses and response times was employed to provide new insights into cognitive processes that go beyond those obtained from typical IRT models.
This thesis suggests that AIG provides a tangible solution for improving the item development process for content generation and reducing the procedural cost of generating a large number of items, with the possibility of a unified approach to test administration (i.e. adaptive item generation). Nonetheless, this thesis focused on rule-based algorithms. The application of other item generation methods and the potential for measuring the intelligence of artificial general intelligence (AGI) are discussed in the final chapter, which proposes that the use of AIG techniques creates new opportunities as well as challenges for researchers that will redefine the assessment of intelligence.