Souping Up Turnitin et al. to Identify ChatGPT Output in Academia?

Is ChatGPT's rephrasing and summarizing of internet content the new normal in schools and academia? Will the service determine what sources university students use, what they say, and how they say it? Professors and tutors should continue to be able to determine how much effort their students have invested in their written assignments. And there is the unresolved question of commercial, copyrighted content, which first came into the spotlight when a famous newspaper filed a complaint against a generative AI service for training its algorithms on the paper's own journalistic articles.

So far, there is no software or application capable of identifying to what extent ChatGPT has been used in students' research, writing assignments, and theses. It would, however, be expedient to create a tool for university staff that tracks the output of ChatGPT, even though ChatGPT itself does not memorize its own output. But would it be fruitful for universities to program their own plagiarism software? On the technical side, this would require large server farms and considerable computing power, independent of the core functions of ChatGPT. Offered as an open-source or commercial service, however, such a tool could become part and parcel of checking student-submitted content, relatively quickly and rather reliably.

There are already some alternatives to ChatGPT, and their number is growing, which makes things more confusing. On the economic side, a detection service would have to keep pace with all the tools competing with ChatGPT's functions. Without advertising, it would have to charge fees to cover considerable costs while staying true to privacy considerations, even though most of its users would likely be private individuals rather than companies and educational institutions.

There are three basic technical solutions, plus a fourth, more restrictive option:

  • Existing plagiarism checkers, such as Turnitin, Grammarly, and other proven anti-plagiarism brands, might be upgraded with interfaces to ChatGPT, and to its many look-alikes, to sniff out passages of text created by automated services. This could require logging on to the ChatGPT software operating locally on students' machines, which is problematic: students could simply use a different app. A new comparative service might also prompt students to do their research themselves, or to rephrase internet texts alongside the generative tools' results, in order to reach good grades. In short, some amount of proper initiative would be encouraged, but the role of the AI would no longer be visible. Some professors would consider this fraud; others might simply allow the use of ChatGPT because it has become so popular.

  • The second viable way would be to create new stand-alone software specifically designed to identify generative artificial intelligence output that is claimed to be original ‘human’ content. A one-does-it-all package would make it easier for supervisors to weigh yellow and red flags concerning both classical sources and AI output, and it might be more efficient thanks to its focus on generative AI-created content. A second level of content analysis, such as a plagiarism checker, would still need to be in place; once that stage is passed, the use of AI would be scrutinized, in some cases within AI packages, in others by a stand-alone tool. In any case, that checking instance would be separate from the content-creating tool: it would save the original output of the generator, without the generating part of the package being able to read or store the text in any way, so that the output could later be compared and identified by user-side software for teachers and professors, which would sound the alarm once a match is found (a minimal sketch of such hash-based matching follows this list). Such an instance could be made legally binding for the creators of the underlying models, such as the makers of ChatGPT itself, rather than for the many user-built interfaces that use ChatGPT as a backbone.

  • The third way to find AI-generated text would be watermarks included in the output of the respective apps and software solutions. Since the output is plain text, such watermarks would have to take the form of statistical patterns or specific phrasings, detectable by linguistic algorithms (a sketch of statistical watermark detection also follows this list). Students could later paraphrase the watermarked passages, however, making it difficult or impossible for software checkers to differentiate.

  • Finally, in theory, there is a fourth way: remaining conservative about content and banning ChatGPT apps and software applications at schools and universities. Otherwise, the actual achievement of young scholars in drafting and completing their own texts and documents would not be transparent; instead, their versatility in operating ChatGPT would be measured, if that is desirable at all.
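
To make the second approach more concrete, here is a minimal sketch, in Python, of how a detection instance could store fingerprints of generated output without retaining readable text, and how a checker on the teacher's side could measure overlap with a student submission. The shingle length, the SHA-256 hashing, and the function names are illustrative assumptions, not an existing product's design.

```python
import hashlib
import re

def shingles(text, n=8):
    """Split text into overlapping word n-grams ('shingles')."""
    words = re.findall(r"\w+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def fingerprint(text, n=8):
    """Hash each shingle so the registry never stores readable text."""
    return {hashlib.sha256(s.encode()).hexdigest() for s in shingles(text, n)}

# Hypothetical server-side hook: every generated answer is fingerprinted
# at the moment of creation; only the hashes, never the text, are kept.
registry = set()

def register_output(generated_text):
    registry.update(fingerprint(generated_text))

# Checker side: estimate what fraction of a submission's shingles
# matches previously registered AI output.
def overlap_ratio(submission):
    fp = fingerprint(submission)
    return len(fp & registry) / len(fp) if fp else 0.0
```

Because only hashes are stored, the registry itself reveals nothing about what was generated, which matches the requirement that the generating side cannot read back the saved text. Exact shingle matching is also brittle: a student who rephrases the output breaks the hashes, so a production system would need fuzzier matching, for instance MinHash or embedding similarity.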
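As for the third approach, statistical watermarking schemes discussed in the research literature bias a generator toward a pseudo-random 'green list' of words; a detector then counts green words and flags text whose count is improbably high. The following sketch shows only the detection side, with the hash seeding and the 0.5 green fraction as illustrative assumptions; real schemes operate on model tokens during sampling, not on finished words.

```python
import hashlib
import re

GREEN_FRACTION = 0.5  # share of the vocabulary the generator is biased toward

def is_green(prev_word, word):
    # Pseudo-random green/red assignment, seeded by the preceding word,
    # mirroring how a watermarking generator would bias its sampling.
    digest = hashlib.sha256(f"{prev_word}|{word}".encode()).digest()
    return digest[0] < 256 * GREEN_FRACTION

def green_zscore(text):
    """z-score of the green-word count: plain human text hovers near 0,
    while heavily watermarked text scores several sigma higher."""
    words = re.findall(r"\w+", text.lower())
    pairs = list(zip(words, words[1:]))
    if not pairs:
        return 0.0
    hits = sum(is_green(prev, cur) for prev, cur in pairs)
    n = len(pairs)
    expected = GREEN_FRACTION * n
    variance = n * GREEN_FRACTION * (1 - GREEN_FRACTION)
    return (hits - expected) / variance ** 0.5
```

The z-score degrades gracefully: every word a student paraphrases pulls the count back toward chance, which is exactly the fragility described in the third bullet above.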

Thorsten Koch, MA, PgDip
Policyinstitute.net
2 March 2023
Updated: 20 January 2024
