It has become clear that, without a legislative change, companies building AI models are going to use pretty much any data a Google search can give them, and this also includes forums and social networks. That is why some companies like Twitter and Reddit have started charging for this data.
Now it is Stack Overflow’s turn. The forum, which had earlier taken a negative stance by prohibiting the use of ChatGPT to answer users on the platform, is now hardening that position by requiring artificial intelligence companies to pay for the use of the data hosted on the website.
This can be difficult for some users to understand, but some datasets, such as Google’s C4, on which the artificial intelligence models of various companies are being trained, contain data from Reddit, Stack Overflow, Wikipedia, and thousands of other web pages. One statistic mentions that the symbol “©” appears in this dataset more than 200 million times, which would constitute a massive copyright breach.
It remains to be seen whether the companies will respect this; for now, it is still too early to know whether it will help.
Jordi Bercial
Avid technology and electronics enthusiast. I have tinkered with computer components almost since I could walk. I started working at Geeknetic after winning a contest on their forum for writing hardware articles. Lover of drifting, mechanics and photography. Do not hesitate to leave a comment on my articles if you have any questions.