
Table 2 Related work: preprints

From: Testing of detection tools for AI-generated text

Source: Khalil & Er 2023
Detection tools used: 3 (iThenticate, Turnitin, ChatGPT)
Dataset: 50 essays generated by ChatGPT on various topics (such as physics laws, data mining, global warming, driving schools, machine learning, etc.)
Evaluation metrics: True positive, False negative

Source: Wang et al. 2023
Detection tools used: 6 (GPT2-Detector, RoBERTa-QA, DetectGPT, GPTZero, Writer, OpenAI Text Classifier)
Dataset:
• Q&A-GPT: 115 K pairs of human-generated answers (taken from Stack Overflow) and ChatGPT generated answers (for the same topic) for 115 K questions
• Code2Doc-GPT: 126 K samples from CodeSearchNet and GPT code description for 6 programming languages
• 226.5 K pairs of code samples human and ChatGPT generated (APPS-GPT, CONCODE-GPT, Doc2Code-GPT)
• Wiki-GPT dataset: 25 K samples of human-generated and GPT polished texts
Evaluation metrics: AUC scores, False positive rate, False negative rate

Source: Pegoraro et al. 2023
Detection tools used: 24 approaches and tools, among them online tools ZeroGPT, OpenAI Text Classifier, GPTZero, Hugging Face, Writefull, Copyleaks, Content at Scale, Originality.ai, Writer, Draft and Goal
Dataset: 58,546 responses generated by humans and 72,966 responses generated by the ChatGPT model, resulting in 131,512 unique samples that address 24,322 distinct questions from various fields, including medicine, open-domain, and finance
Evaluation metrics: True positive rate, True negative rate
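
For readers comparing the evaluation metrics listed above, the following is a minimal sketch of how the four rate-style metrics relate to raw classification counts. It assumes that "positive" means "AI-generated" and that a detector's output has been reduced to a binary decision; the cited studies may define their classes differently. The names `ConfusionCounts` and `detection_rates` and the counts in the usage note are hypothetical illustrations, not taken from any of the cited preprints. AUC is left out because it needs per-sample scores rather than counts.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ConfusionCounts:
    """Raw counts for a binary detector (assumption: positive = AI-generated)."""
    tp: int  # AI-generated texts flagged as AI-generated
    fn: int  # AI-generated texts labelled as human-written
    tn: int  # human-written texts labelled as human-written
    fp: int  # human-written texts flagged as AI-generated


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    # A rate is undefined when its class has no samples at all.
    return numerator / denominator if denominator else None


def detection_rates(c: ConfusionCounts) -> Dict[str, Optional[float]]:
    """Return TPR, FNR, TNR and FPR derived from the raw counts."""
    positives = c.tp + c.fn  # all AI-generated samples
    negatives = c.tn + c.fp  # all human-written samples
    return {
        "true_positive_rate": _ratio(c.tp, positives),
        "false_negative_rate": _ratio(c.fn, positives),
        "true_negative_rate": _ratio(c.tn, negatives),
        "false_positive_rate": _ratio(c.fp, negatives),
    }
```

With hypothetical counts such as `ConfusionCounts(tp=40, fn=10, tn=45, fp=5)`, the true positive and false negative rates sum to 1, as do the true negative and false positive rates. A study that reports only one rate from each pair, as Pegoraro et al. do, therefore still determines the other two. A dataset containing only AI-generated text, as in Khalil & Er, leaves the two human-text rates undefined, which the sketch reports as `None`.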