ChatGPT is in trouble. OpenAI is being sued in the US for allegedly using content from the internet illegally to train its large language models (LLMs). The company has been called out for unauthorised data mining to augment its information database.
As reported by First Post, a class action lawsuit has been filed against OpenAI, the creator of ChatGPT, claiming that the company’s AI training methods violated the privacy and copyright of practically everyone who has ever shared content online. OpenAI gathered an enormous amount of data from various sources on the internet to train its advanced AI language models.
These datasets consist of a wide range of materials, such as Wikipedia articles, popular books, social media posts, and even explicit content of niche genres. More importantly, OpenAI acquired all this data without seeking permission from the content creators. If this sounds familiar, it may be because of the earlier incident in which code from Samsung’s semiconductor division, along with other confidential data, ended up in ChatGPT.
What The Trouble Entails
The class action lawsuit, filed in California, argues that OpenAI’s failure to adhere to proper protocols, including obtaining consent from content creators, amounts to outright data theft.
The lawsuit filing stated, “Instead of following established procedures for the acquisition and usage of personal information, the Defendants resorted to theft. They systematically scraped 300 billion words from the internet, including ‘books, articles, websites, and posts,’ which also included personal information obtained without consent.”
How OpenAI Nicks Your Ideas And Work
The argument is a valid one: if you have been active online in recent decades, your digital contributions have likely been incorporated into OpenAI’s datasets. Consequently, any output generated by OpenAI’s language models, which are used for profit, may contain fragments of your data obtained through silent scraping.
Ryan Clarkson, Managing Partner at the law firm suing OpenAI, explained to The Washington Post that “all of that information is being taken at scale”, even though it was never intended to be used by a large language model.
Letter to the Editor
Do you have an opinion about this story? Do you have some thoughts you’d like to share with our readers? APMEN News would love to hear from you!
Email your letter to the Editorial Team at [email protected]