Conference

Detecting LLM-Generated Text in Computing Education: Comparative Study for ChatGPT Cases

Abstract

Owing to recent improvements and their wide availability, Large Language Models (LLMs) pose a serious threat to academic integrity in education. Modern LLM-generated text detectors attempt to combat this problem by offering educators services that assess whether a given text is LLM-generated. In this work, we collected 124 submissions from computer science students written before the release of ChatGPT and generated 40 additional submissions with ChatGPT. We used this data to evaluate eight publicly available LLM-generated text detectors in terms of accuracy, false positives, and resilience. Our results show that CopyLeaks is the most accurate detector, GPTKit is the best at reducing false positives, and GLTR is the most resilient. We also note that all detectors are less accurate on code, on languages other than English, and after the use of paraphrasing tools.
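
For context on the evaluation measures named in the abstract, the sketch below shows one common way to compute accuracy and false-positive rate from binary detector verdicts. It is illustrative only and not taken from the paper; the function names, data layout, and labeling convention (1 = LLM-generated, 0 = human-written) are assumptions.

    # Illustrative sketch (assumption, not the authors' code): computing
    # accuracy and false-positive rate from binary detector verdicts.
    from typing import List

    def accuracy(y_true: List[int], y_pred: List[int]) -> float:
        # Fraction of submissions the detector labels correctly.
        return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

    def false_positive_rate(y_true: List[int], y_pred: List[int]) -> float:
        # Fraction of human-written submissions (label 0) wrongly flagged
        # as LLM-generated (prediction 1).
        human = [p for t, p in zip(y_true, y_pred) if t == 0]
        return sum(human) / len(human) if human else 0.0

    # Hypothetical usage mirroring the dataset sizes in the abstract:
    # y_true = [0] * 124 + [1] * 40, with y_pred produced by a detector.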

Authors

Orenstrakh MS; Karnalim O; Suárez CA; Liut M

Volume

00

Pagination

pp. 121-126

Publisher

Institute of Electrical and Electronics Engineers (IEEE)

Publication Date

July 4, 2024

DOI

10.1109/compsac61105.2024.00027

Name of conference

2024 IEEE 48th Annual Computers, Software, and Applications Conference (COMPSAC)
