A similarity threshold-based tool for generating and assessing essay computer-based examinations.
Longe, Olumide B.
Introduction
Testing is done in schools to determine whether learners have mastered what
has been taught. Conventional examinations employ paper answer booklets as
the medium on which responses to paper-based questions are written. These
are collected at the end of each examination, marked and recorded to
determine whether the student should move to the next level, or as a basis
for course completion. Since their introduction in the early 1980s, personal
computers (PCs) have seen wide use in educational settings, from
computer-based tutorials and computer-aided learning to computer-based
assessment. Test questions and results could be stored on computers days
before they were printed, so the number of people with access to the
questions was reduced to just the teacher and the computer operator. Testing
in particular has improved considerably with the introduction of the
internet. Teachers teaching the same subject or course in different
locations can now come together over the network to set questions and send
the test over the network on the day of the exam (centralizing exam
setting), thereby making cheating more difficult.
Even in light of all these developments, teachers are still actively
involved in drawing up examination questions and can still make the
assessment process porous by leaking questions, deliberately or
inadvertently, to students before the examination. Since teachers also mark
essay examinations, there is the potential for undue favors and for results
to be manipulated in favor of some students at the expense of others. In
this paper, we present the development of an essay examination generator
called EssayTest. The tool requires minimal external input in the generation
and marking of questions and also provides a mechanism to store and record
the results.
Related Works
Technology is playing an increasingly influential role in education
globally. Computers and mobile phones are now being used to promote
electronic learning and facilitate lectures across the globe in
real-time mode (Sadiq, 2012). Multimedia facilities promote student
engagement, interaction and collaborations in virtual learning
environments. Technology is being used not only in administration and
teaching but also for educational assessment (Dede, 2002). In some
cases, conventional electronics with higher penetration, such as
television and radio, are interlinked with the internet to reach
learners in remote communities. For instance, the Kothmale Community
Radio Internet employs this hybrid to provide educational opportunities
in a rural community in Sri Lanka (Sally, 2008). The Indira Gandhi
National Open University in India uses a combination of print, recorded
audio and video, broadcast radio and conferencing technologies to reach
learners (Madanmohan, 2006). The distance learning program at the
University of Ibadan, Nigeria also engages a mix of these technologies
to reach learners on its Diamond FM station, the University Radio
Station (UIDLC, 2012).
Existing standardized computer-based tests include the Scholastic Aptitude
Test (SAT); the Graduate Record Examination (GRE), used to evaluate students
applying for graduate degree programs; the Metropolitan Achievement Test
(MAT); the California Achievement Test (CAT); the Comprehensive Test of
Basic Skills; the Iowa Test of Basic Skills; the Preliminary Scholastic
Aptitude Test (PSAT), taken in preparation for the SAT and used to select
National Merit Scholarship winners; and the American College Test (ACT), an
aptitude test taken in addition to or in place of the SAT.
While advocates of standardized tests maintain that test scores
provide a valid measure of academic aptitude and contend that these
examinations are impartial in comparing students from a variety of
social and educational backgrounds (Schmitt and Dorans, 1988; Scholes
and McCoy, 1998), critics argue that the tests do not account for
differences in socioeconomic background and do not accurately assess
the scholastic performance of students (Shepard and Graue, 1993;
Willingham et al., 2000; Willingham et al., 1988; Young, 2004; Steele and
Aronson, 1998). They also argue that emphasis on high test scores
encourages teachers to focus only on the material likely to be covered in
the tests rather than provide a comprehensive education. Computer-Aided
Learning (CAL) describes the use of technology in the teaching and
learning process (Zwick, 2004; 2006). Test generators are computer
programs that aid the student assessment process. The first test
generator was essentially a simple program that generated random numbers
for each student; the questions corresponding to these numbers were
printed out for the students to answer. The limitation of this approach
was that in major examinations, given the number of students and the
limited number of questions, it was highly likely that more than one
student would have the same questions to answer (Achim and Christophe,
2005).
Later generators divided the students into groups, and the students in
each group were given completely different questions from those in all
other groups. It became the responsibility of examiners to ensure that no
two students in the same group sat next to each other (Carlos and
Abelardo, 2004). In this scenario, questions were still generated by the
teachers and answers were still marked manually. An improvement on this
was the development of automated test generators that allow teachers to
set multiple-choice questions, with results generated automatically
(Reggie et al., 2002). The difficulty of assessing students on essay-type
examinations led to the development of a new type of test generator that
required teachers to submit lecture notes. The generator then built a
table of keywords (i.e., the words that occur most frequently in the
notes) and used it to remove segments of the notes for students to fill
in. The challenge was that there were lectures for which the keyword table
contained mostly words that were not relevant to the concepts taught in
class, so any tests generated this way could not properly assess the
student's understanding of the materials presented. These systems are
fraught with so many challenges that in most educational settings
essay-based computer tests have been abandoned entirely and only
multiple-choice computer assessments remain in use.
Research Direction
Most computer-based assessments (CBA) employ test generators that
produce multiple-choice questions, usually with four answer options. The
limitation of this type of evaluation is that students can randomly
select or guess answers, with a 25% chance of choosing the right answer
per question. The implication is that there is a one-in-four probability
per question that students can pass such examinations without
understanding the content taught in class and without studying for the
examination, simply by guessing answers. Multiple-choice testing cannot
in all cases provide evidence that students have learned, and it is
therefore not an effective way of testing students' ability in some
courses.
To overcome the limitations of the keyword-based system, we
proposed and developed an essay-type test generator that allows teachers
to input "likely questions" and answers into a database.
Questions are then selected at random from this pool and assigned to
students. The answers students submit in response are compared to the
pre-recorded answers in the database, and the students are graded
accordingly. The system randomly assigns questions to students in such a
way that no two students have the same question at any point in time. To
mark a paper, the submitted answer is broken into tokens and tested
against the answer supplied by the tutor to determine whether they mean
the same thing. A database is used to store all the data and to
facilitate easy comparison, recording and updating.
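As an illustration of this marking step, the sketch below (our own simplified rendering, not the actual EssayTest code; the class name SimilarityGrader and the 0.6 threshold value are assumptions) tokenizes the model answer and the student's answer and awards the question's mark when their token overlap reaches a similarity threshold:

import java.util.HashSet;
import java.util.Set;

// Illustrative sketch of threshold-based grading (not the actual EssayTest code).
// Both answers are tokenized, their Jaccard (set-overlap) similarity is computed,
// and the question's mark is awarded only when the score reaches the threshold.
public class SimilarityGrader {

    private final double threshold; // e.g. 0.6 requires 60% token overlap (an assumed value)

    public SimilarityGrader(double threshold) {
        this.threshold = threshold;
    }

    // Lower-case the text and split it into word tokens.
    private static Set<String> tokenize(String text) {
        Set<String> tokens = new HashSet<>();
        for (String t : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Jaccard similarity: |intersection| / |union| of the two token sets.
    public static double similarity(String modelAnswer, String studentAnswer) {
        Set<String> a = tokenize(modelAnswer);
        Set<String> b = tokenize(studentAnswer);
        if (a.isEmpty() && b.isEmpty()) {
            return 1.0;
        }
        Set<String> common = new HashSet<>(a);
        common.retainAll(b);
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        return (double) common.size() / union.size();
    }

    // Award the full mark when the overlap reaches the threshold, otherwise zero.
    public int grade(String modelAnswer, String studentAnswer, int maxMark) {
        return similarity(modelAnswer, studentAnswer) >= threshold ? maxMark : 0;
    }

    public static void main(String[] args) {
        SimilarityGrader grader = new SimilarityGrader(0.6);
        String model = "A database stores related data in tables";
        String answer = "Data is stored by a database in related tables";
        System.out.println("Awarded mark: " + grader.grade(model, answer, 5));
    }
}

A simple set-overlap score of this kind cannot recognize synonyms or paraphrases, so in practice the threshold would have to be tuned per course; richer semantic matching is beyond the scope of this sketch.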
EssayTest System Overview
Our intention is to automate the whole essay-based testing process:
setting questions, grading answers, and storing and displaying results.
The system allows the tutor to input questions before the examination,
and these are saved in the database. From the questions previously
submitted by the tutor via the lecturer's side of the system, random
questions are generated for each student sitting the examination. If
enough questions have been entered into the database, no two students in
the entire hall will be answering the same question at the same time.
After the exam, the answers are automatically marked, graded and stored
in the database. To remark a script, the student's results are recalled
from the database at the click of a button.
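One simple way to realize the guarantee that no two students receive the same question, assuming the question pool is at least as large as the number of candidates, is to shuffle the pool once per sitting and deal questions out in order. The sketch below is illustrative only; the class name QuestionAssigner is an assumption, not taken from EssayTest:

import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;
import java.util.Optional;

// Illustrative sketch (not the actual EssayTest code): shuffle the question pool
// once per sitting, then deal one question per student so that no question is
// assigned twice while the pool lasts.
public class QuestionAssigner {

    private final Deque<String> shuffledPool = new ArrayDeque<>();

    public QuestionAssigner(List<String> questionPool) {
        List<String> copy = new ArrayList<>(questionPool);
        Collections.shuffle(copy); // random order for this examination sitting
        shuffledPool.addAll(copy);
    }

    // Returns a question not yet given to any other student, or empty if the pool
    // is exhausted (i.e. fewer questions were entered than there are candidates).
    public Optional<String> nextQuestion() {
        return Optional.ofNullable(shuffledPool.pollFirst());
    }

    public static void main(String[] args) {
        QuestionAssigner assigner = new QuestionAssigner(List.of(
                "Define normalization.",
                "Explain the purpose of a primary key.",
                "What is a foreign key?"));
        System.out.println(assigner.nextQuestion()); // first student's question
        System.out.println(assigner.nextQuestion()); // second student's question
    }
}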
The main users of this system would be:
* Students
* Professors/Tutors/Lecturers
* Administrators
Students sit for exams via the system and their scripts are graded by
the system. Lecturers use the system to set questions, view results and
search for scripts. The administrator is the individual placed in charge
of the testing system. The administrator's job is to regulate the system,
manage users so that they do not interfere with one another, check that
questions set by lecturers are up to standard, view results, search for
scripts, and set the exam time and duration. Use-case scenarios give a
more specific view of the individual functions that must be implemented
to support the general functions mentioned above; the use cases are
listed in the Use Case section below.
System Design
The system design was divided into two phases:
1. Logical Design
2. Physical Design
Logical Design
A logical data flow diagram shows the flow of data through a
transaction processing system without regard to the time at which the
data flows or the processing procedures occur. The software was modeled
logically using the Data Flow Diagram (DFD) and Entity Relationship
Diagram (ERD) techniques.
Physical Design
A user-friendly interface was developed for the EssayTest generator
using the Java programming language.
Use Case
The Use Case Diagram is a UML Diagram that is used to show the
actors in a given system and the activities they perform. The actors are
as follows:
1. Students
2. Lecturer
List of Use Cases
1. Get Question
2. Answer Question
3. Mark Answers
4. Input Questions
5. Input Answers
6. View Result
System Implementation
The system consists of two usage platforms: the lecturer's platform
and the student's. At the lecturer's end there are seven (7) sections,
namely login, add questions, edit questions, view results, remark
scripts, modify login, and course details, while on the student's side
there are only three (3) sections: login, answer questions and results.
Login Section
This is the first page that lecturers see when they start the
application. This section keeps the entire application secure, as people
without the correct passwords are not allowed to access any other part of
the application. It also ensures that lecturers are able to access only
their own courses and no one else's.
Home
If the user has successfully entered correct login details, he is
taken to the home page, which contains shortcuts to other sections of
the application.
Input Questions
On this page the lecturer can enter questions, the answers to those
questions and the marks for those questions, all of which are saved in
the database.
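A minimal sketch of how this page might persist a question, its model answer and its mark is shown below. The table and column names (questions, course_code, question_text, model_answer, mark) and the in-memory H2 connection URL are assumptions made for illustration, not the paper's actual schema:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Statement;

// Illustrative sketch (not the paper's actual schema): persist a question, its
// model answer and its mark in a relational table via JDBC. The connection URL
// below assumes an in-memory H2 database driver is on the classpath.
public class QuestionStore {

    private final Connection connection;

    public QuestionStore(Connection connection) {
        this.connection = connection;
    }

    public void saveQuestion(String courseCode, String question,
                             String modelAnswer, int mark) throws SQLException {
        String sql = "INSERT INTO questions (course_code, question_text, model_answer, mark) "
                + "VALUES (?, ?, ?, ?)";
        try (PreparedStatement stmt = connection.prepareStatement(sql)) {
            stmt.setString(1, courseCode);
            stmt.setString(2, question);
            stmt.setString(3, modelAnswer);
            stmt.setInt(4, mark);
            stmt.executeUpdate();
        }
    }

    public static void main(String[] args) throws SQLException {
        try (Connection conn = DriverManager.getConnection("jdbc:h2:mem:essaytest")) {
            try (Statement s = conn.createStatement()) {
                s.execute("CREATE TABLE questions (course_code VARCHAR(16), "
                        + "question_text VARCHAR(1000), model_answer VARCHAR(4000), mark INT)");
            }
            new QuestionStore(conn).saveQuestion("CSC101", "Define a database.",
                    "An organised collection of logically related data.", 5);
        }
    }
}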
Review Questions
Here the lecturer can view the questions currently in the database
and can modify a question, its answer or the mark associated with it and
save the updated version in the database, or delete the question from the
database entirely.
Check Result
Here the lecturer can view the results of the students who took the
course that year and can also print this result.
Student's Login
On the day of the exam, students sit for their exams via the
student's side of the application. To do so, they first have to log in
with their student ID number, which ensures that only registered students
are permitted to sit for the exam.
Exam Page
On login, questions are randomly generated and presented to the
student until either the allocated time for the exam elapses or the
student has answered the required number of questions.
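The session rule can be summarized by the sketch below, which serves a further question only while the exam clock has not expired and the required number of questions has not yet been answered; the class name ExamSession and both limits are illustrative assumptions, not the paper's code:

import java.time.Duration;
import java.time.Instant;

// Illustrative sketch (not the actual EssayTest code) of the exam session rule:
// keep serving questions until the allotted time elapses or the student has
// answered the required number of questions.
public class ExamSession {

    private final Instant deadline;
    private final int questionsRequired;
    private int questionsAnswered = 0;

    public ExamSession(Duration allowedTime, int questionsRequired) {
        this.deadline = Instant.now().plus(allowedTime);
        this.questionsRequired = questionsRequired;
    }

    // Another question is served only while time remains and the quota is unmet.
    public boolean hasNextQuestion() {
        return Instant.now().isBefore(deadline) && questionsAnswered < questionsRequired;
    }

    public void recordAnswer(String answerText) {
        // Saving and grading the answer would happen here; this sketch only counts it.
        questionsAnswered++;
    }

    public static void main(String[] args) {
        ExamSession session = new ExamSession(Duration.ofMinutes(60), 3); // assumed limits
        System.out.println("Serve a question? " + session.hasNextQuestion());
    }
}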
Conclusion and Future Works
Most computer-based assessments (CBA) employ test generators that
produce multiple-choice questions, usually with four answer options. The
limitation of this type of evaluation is that students can pass without
necessarily having mastered the concepts taught. We developed EssayTest,
an automated test generator for essay examinations, as a solution to the
inadequacies associated with multiple-choice computer-based examinations.
Future work will seek to increase and improve system functionality by
providing components for biometric authentication of test takers and the
ability for the system to upload graphics or accept answers that require
the student to draw.
End Notes /Appreciation
The author appreciates the efforts and collaboration of Mr. Abiodun
Ajayi for his input into programming the interface.
References
Achim, R. and Christophe, B. (2005): New trends and technologies in
computer-aided learning. FIP TC10 Working Conference: EduTech 2005,
October 20-21, 2005, Perth, Australia.
https://library.villanova.edu/Find/Record/986996
Carlos, D. and Abelardo, P. (2004): Computer-aided design meets
computer-aided learning. IFIP 18th World Computer Congress; TC10/WG10.5
EduTech Workshop, 22-27 August 2004, Toulouse, France.
www.informatik.uni-trier.de/~ley/db/indices/a
Dede, C. (2002). Vignettes about the future of learning
technologies. In Visions 2020: Transforming education and training
through advanced technologies. Washington, DC: U.S. Department of
Commerce. http://www.technology.gov/reports/TechPolicy/2020Visions.pdf
Reggie, K., Jimmy, C., Weijia, J., Anthony, F. and Ronnie, C.
(2002): Web-based learning: men & machines. Proceedings of the First
International Conference on Web-Based Learning (ICWL 2002), China.
http://isbndb.com/d/publisher/world_scientific_publishing_co.html?start
Sadiq, F.I (2012). eCollaboration for Tertiary Education Using
Mobile Systems. Computing, Information Systems & Development
Informatics Journal. Vol 3, No. 1. pp 5-9
Schmitt, A. P. & Dorans, N. J. (1988). Differential item
functioning for minority examinees on the SAT (ETS Research Report
88-32). Princeton, NJ: Educational Testing Service.
Scholes, R. J., & McCoy, T. R. (1998, April). The effects of
type, length, and content of test preparation activities on ACT
assessment scores. Paper presented at the annual meeting of the American
Educational Research Association, San Diego, CA.
Shepard, L.A., & Graue, M.E. (1993). The morass of school
readiness testing: Research on test use and test validity. In B.
Spodek (Ed.), Handbook of Research on the Education of Young Children.
New York: Teachers College Press.
Steele, C. M., & Aronson, J. (1998). Stereotype threat and the
test performance of academically successful African Americans. In C.
Jencks & M. Phillips (Eds.), The Black-White test score gap (pp.
401-427). Washington, DC: Brookings Institution Press.
Willingham, W. W., Pollack, J. M., & Lewis, C. (2000). Grades
and test scores: Accounting for observed differences (ETS Research
Report 00-15). Princeton, NJ: Educational Testing Service.
Willingham, W. W., Ragosta, M., Bennett, R. E., Braun, H., Rock, D.
A., & Powers, D. E. (1988). Testing handicapped people. Boston:
Allyn and Bacon, Inc.
Young, J. W. (2004). Differential validity and prediction: Race and
sex differences in college admissions testing. In Zwick, R. (Ed.),
Rethinking the SAT: The Future of Standardized Testing in University
Admissions. pp. 289-301. New York: RoutledgeFalmer.
Zwick, R. (2004). Is the SAT a "wealth test?" The link
between educational achievement and socioeconomic status. In R. Zwick
(ed.), Rethinking the SAT: The Future of Standardized Testing in
University Admissions, pp. 203-216. New York: RoutledgeFalmer.
Zwick, R. (2006). Higher education admissions testing. In R. L.
Brennan (Ed.), Educational Measurement (4th ed.), pp. 647-679 Westport,
CT: American Council on Education/Praeger.
Madanmohan, R. (2006) The nature of the information society: A
developing world perspective.
http://www.itu.int/osg/spu/visions/papers/developingpaper.pdf
Sally D. B. (2008) Food & Agriculture Organisation, Rome MDE Programme, Athabasca University, Canada.
http://www.irrodl.org/index.php/irrodl/article/view/563/1038
Sawyer, R. L. (1985). Using demographic information in predicting
college freshman grades. (ACT Research Report No. 87) Iowa City: ACT,
Inc.
UIDLC (2012). University of Ibadan, Nigeria Distance Learning
Centre. http://www.dlc.ui.edu.ng/sub-degree--diplomalife-long/admission-prospective-applicants
Longe Olumide Babatope
Fulbright Fellow
International Centre for Information Technology & Development
Southern University System
Southern University
Baton Rouge, LA, USA
longeolumide@fulbrightmail.org