Recently, OpenAI launched its new artificial intelligence product, GPT-4, the latest model behind its ChatGPT chatbot. The system uses machine learning (a form of artificial intelligence) to generate text that looks like it was written by a human, in response to typed requests. But is GPT-4 as smart as it seems?
The GPT-4 release comes at a time when chatbots of this kind have already achieved enormous popularity.
Since its debut in November 2022, ChatGPT has become the fastest-growing technology platform in history, reaching 100 million users in just two months.
ChatGPT allows users to ask it for help with things like writing essays, drafting business plans, generating computer code, and even composing music.
This artificial intelligence is capable of understanding human language with an astonishing level of comprehension, answering almost anything it is asked in a coherent and clear manner. It is able to analyze large amounts of data and write a summary. It has enhanced reasoning, or “common sense,” ability. It also translates between languages, as other systems already do.
The field of artificial intelligence has been making great strides lately, and GPT-4 is a good example of this. Even so, progress is not without stumbles. This is attested not only by the experiences of some users but also by various scientific studies carried out on ChatGPT and other systems.
The international team of David Wood, a professor of accounting at Brigham Young University in the United States, conducted an extensive study comparing the accuracy of accounting students’ answers to exam questions with the accuracy of ChatGPT’s answers to those same questions. The study was motivated in part by fears that ChatGPT would give students an effective new way to cheat, achieving higher scores than they would by giving only their own answers.
The team led by David Wood, a professor of accounting at Brigham Young University, found that the ability of the ChatGPT system to answer accounting questions was below the average ability of a large group of students, at least until recently. (Photo: Nate Edwards/BYU)
The study has more than 300 authors from 186 educational institutions in 14 countries. More than 25,000 questions from accounting exams were used, and additional questions were prepared specifically to test ChatGPT.
Although ChatGPT’s performance was impressive, the students performed better. They obtained an overall average score of 76.7%, compared with 47.4% for ChatGPT. On 11.3% of the questions, ChatGPT scored above the student average, especially in accounting information systems and in auditing. But the artificial intelligence bot did worse in other fields, such as taxation.
Regarding question type, ChatGPT obtained better results on true-or-false questions (68.7% correct) and multiple-choice questions (59.5% correct), but it had problems with questions requiring a short written answer (between 28.7% and 39.1% correct). In general, ChatGPT had more difficulty answering higher-order questions.
The authors of this study also detected flaws that, in theory, should already be fixed or in the process of being fixed. Some of these flaws led ChatGPT to make basic arithmetic errors, or to cite references to works that do not exist, sometimes attributed to authors who do not exist either.
This study is titled “The ChatGPT Artificial Intelligence Chatbot: How Well Does It Answer Accounting Assessment Questions?”. And it has been published in the academic journal Issues in Accounting Education.
Another fear about artificial intelligence in general is that, when generating text from written sources, these systems end up plagiarizing fragments of articles and books, that is, copying significantly long passages without citing the source. Dongwon Lee’s team, from Pennsylvania State University in the United States, is currently presenting the results of a study on this issue at the 2023 ACM Web Conference, held in Austin, Texas, United States. The study is titled “Do Language Models Plagiarize?”. Its conclusion is that the systems examined do show this worrying tendency to plagiarize. (Source: NCYT de Amazings)