
I really fear for the internet and what it will become in even just another year, with the rise of AI writing and AI art being used in place of real people. And now OpenAI openly state they need to use copyrighted works for training material.

As reported by The Guardian, the New York Times sued OpenAI and Microsoft over copyright infringement and just recently OpenAI sent a submission to the UK House of Lords Communications and Digital Select Committee where OpenAI said pretty clearly:

Because copyright today covers virtually every sort of human expression– including blog posts, photographs, forum posts, scraps of software code, and government documents–it would be impossible to train today’s leading AI models without using copyrighted materials. Limiting training data to public domain books and drawings created more than a century ago might yield an interesting experiment, but would not provide AI systems that meet the needs of today’s citizens.

Worth noting OpenAI put up their own news post "OpenAI and journalism" on January 8th.


Why am I writing about this here? Well, the reasoning is pretty simple. AI writing is (on top of other things) accelerating the race to the bottom of content for clicks. Search engines have quickly become a mess when you're trying to find what you actually want, and it's only going to get far worse thanks to all these SEO (Search Engine Optimisation) bait content farms, with more popping up all the time. We've already seen some bigger websites trial AI writing. The internet is a mess.

As time goes on, and as more people use AI to pinch content and write entire articles, we're going to hand off profitable writing to only a select few big names who can weather the storm. A lot of smaller-scale websites are just going to die off. Any time you search for something, it will be those big names sprinkled in between the vast AI website farms, all with very similar robotic, plain writing styles.

Many (most?) websites make content for search engines, not for people. The Verge recently did a rather fascinating piece on this showing how websites are designed around Google, and it really is something worth scrolling through and reading.

One thing you can count on: my perfectly imperfect writing full of terrible grammar continuing without the use of AI. At least it's natural, right? I write as I speak, for better or worse. By humans, for humans: a tagline I plan to stick with until AI truly takes over and I have to go find a job flipping burgers or something. But then again, there will be robots for that too. I think I need to learn how to fish…

Article taken from GamingOnLinux.com.
Tags: Editorial, Misc
About the author -
I am the owner of GamingOnLinux. After discovering Linux back in the days of Mandrake in 2003, I constantly came back to check on the progress of Linux until Ubuntu appeared on the scene and it helped me to really love it. You can reach me easily by emailing GamingOnLinux directly.
68 comments

Arehandoro Jan 10
Quote: Because copyright today covers virtually every sort of human expression– including blog posts, photographs, forum posts, scraps of software code, and government documents–it would be impossible to train today’s leading AI models without using copyrighted materials. Limiting training data to public domain books and drawings created more than a century ago might yield an interesting experiment, but would not provide AI systems that meet the needs of today’s citizens.

Yet their own platform, LLM, code, etc. are copyrighted and not released under an open-source licence. I could potentially believe their shit if AI was to benefit everyone, not just them and/or a few companies.

Quote: Because copyright today covers virtually every sort of human expression– including blog posts, photographs, forum posts, scraps of software code, and government documents–it would be impossible to train today’s leading AI models without using copyrighted materials. Limiting training data to public domain books and drawings created more than a century ago might yield an interesting experiment, but would not provide AI systems that meet the needs of today’s citizens.

That's like saying that I cannot profit massively without destroying the environment and exploiting employees. Oh, wait...

Quote: Because copyright today covers virtually every sort of human expression– including blog posts, photographs, forum posts, scraps of software code, and government documents–it would be impossible to train today’s leading AI models without using copyrighted materials. Limiting training data to public domain books and drawings created more than a century ago might yield an interesting experiment, but would not provide AI systems that meet the needs of today’s citizens.

Today's citizens' needs are not a fascinating AI built on top of other people's works. Citizens today need job stability, affordable housing, free, quality public education and healthcare, and the list could be virtually endless.
Lachu Jan 10
Does Microsoft need to sell copyrighted GNU/GPL code as public domain to its customers? I think so. I must remind you that some organization decided to take Microsoft to court because MS is stealing Free Software. Looking at the history of this company, it has always stolen. MS stole the GUI from Xerox, stole network protocols (Kerberos), stole many solutions from the Linux desktop, and now sells a way to steal GNU/GPL code without needing to break the law.
JustinWood Jan 10
Bold move to say the quiet part loudly. Stupid move too, but then again, when has that ever stopped this particular brand of pond scum?
Quoting: scaine
Quoting: NathanaelKStottlemyer
Quoting: Purple Library Guy
Quoting: NathanaelKStottlemyer
P.S. According to LanguageTool, three commas were needed in the article.
Ehhh, IMO commas are kind of a "soft" punctuation mark--there are stylistic differences in how people use them. There are many situations where it's not really technically "wrong" either to use one or not to use one, and others where it is wrong by some technical standards to do it a particular way, but doing it that "wrong" way still works given the flow of the sentence and the way people talk. Periods, for instance, are a lot clearer--if you're at the end of a sentence you should be using one, period. Well, unless you have a reason to use a question mark or exclamation point instead. But commas are comparatively mushy, and I don't trust computerized guidance about how to use them.

All the places where LanguageTool said a comma was needed, I wouldn't care either way. However, I personally err on the side of using the commas, because they save lives after all.

This joke?

A comma is the difference between:
- Let's eat, Grandma!
and
- Let's eat Grandma!

It's not a joke, it's serious. Every day, countless lives are lost!
TheRiddick Jan 11
"I have to go find a job flipping burgers or something."

Sorry, that job has already been taken by AI. What we do have is a job where rich folk need someone to live in their toilet to wipe their asses! Requires a degree!


Last edited by TheRiddick on 11 January 2024 at 4:36 am UTC
whatever Jan 11
genAI is like an averaging filter, the internet has been gooified by it, a gray goo of mediocrity, everything is bland, a blanket of blandness is now covering everything, in all domains of artistry.
well, not really everything, GamingOnLinux is a corner of interesting stuff made by humans for humans, and these little corners must be preserved for the sake of humanity.
14 Jan 13
I said it before, I think companies' models are going to become the sweet sauce. If OpenAI loses this copyright case, they will need to start paying for copyrighted content to be ingested into their model. It will be a content subscription just like streaming media companies.

I think there is an argument that reading copyrighted material is the same as a human doing so and then writing their own creative work. However, it doesn't stand against companies' acceptable use policies, which often deny or limit scraping by bots. This is exactly the same as bot scraping, where the difference from typical usage is that a machine is doing it, as well as the volume (resource expense).
scaine Jan 14
Quoting: 14
I think there is an argument that reading copyrighted material is same as a human doing so and then writing their own creative work

When people do this, they pay for the privilege, or access libraries and can only check out books and copyrighted materials for private use. OpenAI and others aren't doing that. They're just consuming all the content on the internet, even pirated material and content behind paywalls, and using it to train their model.

Of course, proving that will be the court battle.
LoudTechie Jan 14
Quoting: 14
I think there is an argument that reading copyrighted material is same as a human doing so and then writing their own creative work

Look up the legal standing of fan fiction. Then repeat that statement.
Using copyrighted "aspects" is enough to be considered a copyright violation.
Quoting: LoudTechie
Quoting: 14
I think there is an argument that reading copyrighted material is same as a human doing so and then writing their own creative work

Look up the legal standing of fan fiction. Then repeat that statement.
Using copyrighted "aspects" is enough to be considered a copyright violation.
I suggest looking up Marion Zimmer Bradley.

Quote: For many years, Bradley actively encouraged Darkover fan fiction. She encouraged submissions from unpublished authors and reprinted some of it in commercial Darkover anthologies. This ended after a dispute with a fan over an unpublished Darkover novel of Bradley's that had similarities to one of the fan's stories. As a result, the novel remained unpublished and Bradley demanded the cessation of all Darkover fan fiction.
The fan threatened to take Marion Zimmer Bradley to court for infringing on the fan's copyright. The fan holds the copyright to their own prose. The fan clearly does not hold the copyright to the characters. But should the author of the original work use prose from a fan work...well, things get dicey.

You'd also expect to face some legal trouble if you ripped some fan subs and tried to pass them off as your own translation (which has been done before).

Of note is the Organization for Transformative Works, which works to protect fan works and has this to say:

Quote: Copyright is intended to protect the creator’s right to profit from her work for a period of time to encourage creative endeavor and the widespread sharing of knowledge. But this does not preclude the right of others to respond to the original work, either with critical commentary, parody, or, we believe, transformative works.

In the United States, copyright is limited by the fair use doctrine. The legal case of Campbell v. Acuff-Rose held that transformative uses receive special consideration in fair use analysis. For those interested in reading in-depth legal analysis, more information can be found on the Fanlore Legal Analysis page.
And:

Quote: While case law in this area is limited, we believe that current copyright law already supports our understanding of fanfiction as fair use.

We seek to broaden knowledge of fan creators’ rights and reduce the confusion and uncertainty on both fan and pro creators’ sides about fair use as it applies to fanworks. One of our models is the documentary filmmakers’ statement of best practices in fair use, which has helped clarify the role of fair use in documentary filmmaking.

It's certainly not as cut and dry as you might think.