Generative AI Has a Visual Plagiarism Problem::Experiments with Midjourney and DALL-E 3 show a copyright minefield

  • chemical_cutthroat@lemmy.world
    63 upvotes · 9 downvotes · edited 7 months ago

    I’m getting really tired of this shit. These images are so heavily cherry-picked. If you put those prompts into Midjourney you may get something similar, but it isn’t going to be anywhere near that close. My guess: someone used the copyrighted images as part of the prompt but is leaving that bit out of their documentation. I use Midjourney daily, and it’s a struggle to get what I want most of the time; generic prompts like the ones they show won’t get you there. Yes, you can reroll the prompt over and over again, but landing on something as precise as what they have is a one-in-a-million chance on your first roll, or even your 100th. I’ll attach the “90’s cartoon” prompt to illustrate my point.

    The Minions bit is pretty accurate, but the Simpsons is WAAAAY off. The thing is, it didn’t return copyrighted images; it returned strange amalgams of things that it blends together in its algorithms. Getting exact scenes from movies isn’t something it’s going to just give you. You have to make an effort to get those, and just putting in “half-way through Infinity War” won’t do it.

    At best that falls under fair use. If a human made it, it would be fanart, and not copyrighted scenes. This is all just lawyers looking to get rich on a new fad by pouring fear into rich movie studios, celebrities, and publishers. “Look at this! It looks just like yours! We can sue them, and you’ll get 25% of what we win after my fees. Trust me, it’s ironclad. Of course, I’ll need my fees upfront.”

    • Even_Adder@lemmy.dbzer0.com
      21 upvotes · 2 downvotes · edited 7 months ago

      The new version of Midjourney has a real overfitting problem. The way it was done, if I remember correctly, is that someone found out v6 was trained partially on Stockbase image pairs, so they went to Stockbase, found some images, and used those exact tags in the prompts. The output greatly resembled the training data, and that’s what ignited this whole thing.

      Edit: I found the image I saw a few days ago. They need to go back and retrain their model, IMO. When the output is this close to the training data, it has to be hurting the creativity of the model. This should only happen with images that haven’t been de-duped in the training set, so I don’t know what’s going on here.
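
      For readers unfamiliar with de-duping, here is a minimal sketch of one common idea, near-duplicate filtering with an average hash. It is not Midjourney’s pipeline, which is not described in this thread. The names `average_hash`, `hamming`, and `dedupe`, the `max_distance` threshold, and the use of small grayscale pixel grids in place of real image files are all assumptions made for illustration. A real pipeline would first downscale each image to a fixed small size so the hashes are comparable.

```python
from typing import Dict, List, Sequence

# Hypothetical stand-in for a decoded image: a small grid of grayscale values.
# Assumes every image has already been downscaled to the same dimensions.
Image = Sequence[Sequence[int]]


def average_hash(img: Image) -> int:
    """Build a bit string: 1 where a pixel is brighter than the image mean."""
    pixels = [p for row in img for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bit positions where two hashes differ."""
    return bin(a ^ b).count("1")


def dedupe(images: Dict[str, Image], max_distance: int = 1) -> List[str]:
    """Keep the first image of each near-duplicate group, in input order."""
    kept: List[str] = []
    kept_hashes: List[int] = []
    for name, img in images.items():
        h = average_hash(img)
        # Keep the image only if it is far enough from everything kept so far.
        if all(hamming(h, k) > max_distance for k in kept_hashes):
            kept.append(name)
            kept_hashes.append(h)
    return kept


if __name__ == "__main__":
    sample = {
        "a": [[10, 200], [200, 10]],
        "b": [[12, 198], [201, 9]],   # slightly noisy copy of "a"
        "c": [[200, 10], [10, 200]],  # inverted pattern, a different image
    }
    print(dedupe(sample))
```

      The comparison against every kept hash is quadratic, so this only illustrates the idea; large training sets would need an index or bucketing scheme on top of it.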

      • Blue_Morpho@lemmy.world
        1 upvote · 7 months ago

        In 15 minutes I can get Google to give me a link to pirated content. Hosting links to pirated content gets you arrested in the US. But Google doesn’t just hand you the pirate links, which is why it is legal. It’s a tool that you can use to get them if you work at it a little.

    • stevedidWHAT@lemmy.world
      13 upvotes · 7 downvotes · edited 7 months ago

      They’ll do anything to slow the progress of publicly accessible power.

      Fight them tooth and nail. Self-governance over interference from ignorant, decrepit politicians.

      Also, stop using copyrighted materials when training. Go the extra mile now, and you’ll be able to make your own (automated) copyrighted material.

    • TwilightVulpine@lemmy.world
      5 upvotes · 7 months ago

      I’m sorry to tell you, but fanart is subject to copyright, as are all derivative works that aren’t sufficiently transformative, even if they aren’t used commercially. It’s a subjective measure, but I doubt any judge would say those top images are completely distinct from the Minions or the Simpsons. What usually happens is that the rights owners don’t chase every single infringement, out of goodwill or simply because it would be too expensive to litigate every unauthorized use.

      To be fair, personally I think that’s excessive, especially because it makes artists’ lives more difficult. However, AI isn’t making it any easier either…

    • Zoboomafoo@slrpnk.net
      3 upvotes · 7 months ago

      That Tree God on the bottom right looks really neat, and a worthy addition to the “Villain with legitimate grievances that murders for no good reason” club

    • burliman@lemmy.world
      4 upvotes · 2 downvotes · 7 months ago

      Thank you for saying this way better than I would have, and saving me the effort too! Agreed! I am getting tired of this shit too.