@patatahooligan

patatahooligan@lemmy.world · 4 days ago

The point of encrypting something that gets decrypted midway by an organization is that there are worse actors than the organization out there. I’m not really scared of Steam abusing my credit card info, but I am afraid of random internet strangers.

Also remember that https doesn’t just protect your data, it also verifies that you’re actually on the website you think you are. The internet is basically unusable without this guarantee, especially on a network you share with others.
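To make the "verifies that you're actually on the website you think you are" part concrete, here is a minimal sketch using Python's standard `ssl` module. It only illustrates the two checks a default TLS client performs. The helper name `fetch_peer_cert` is made up for this example and is not something from the discussion.

```python
import socket
import ssl

# The default client context enables both checks discussed above.
context = ssl.create_default_context()

# The certificate must match the host name we asked for.
checks_hostname = context.check_hostname
# The certificate must chain up to a certificate authority we trust.
requires_cert = context.verify_mode == ssl.CERT_REQUIRED


def fetch_peer_cert(host: str, port: int = 443) -> dict:
    """Open a TLS connection and return the server certificate as a dict.

    Raises ssl.SSLCertVerificationError if the certificate is untrusted
    or was issued for a different host name.
    """
    with socket.create_connection((host, port), timeout=5) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()
```

On a shared network, someone who redirects your traffic to their own server would fail the host name or trust check, so the connection is refused instead of silently going to the wrong machine.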

patatahooligan@lemmy.world · 10 days ago

So no vetting at all presumably since you didn’t mention it? So how do you know that Dashlane is safer than a password scheme that might be guessed by someone after they’ve already compromised a couple of your passwords?

patatahooligan@lemmy.world · 12 days ago

For someone to work it out, they would have to be targeting you specifically. I would imagine that is not as common as, eg, using a database of leaked passwords to automatically try as many username-password combinations as possible. I don’t think it’s a great pattern either, but it’s probably better than what most people would do to get easy-to-remember passwords. If you string it with other patterns that are easy for you to memorize you could get a password that is decently safe in total.

Don’t complicate it. Use a password manager. I know none of my passwords and that’s how it should be.

A password manager isn’t really any less complicated. You’ve just out-sourced the complexity to someone else. How have you actually vetted your password manager and what’s your backup plan for when they fuck up?

patatahooligan@lemmy.world · 3 months ago

It’s not just Batman. This is a common trope in the superhero genre. Pop Culture Detective has a great video on the subject: https://youtu.be/LpitmEnaYeU

patatahooligan@lemmy.world · 4 months ago

The source code in this torrent is a clone of the git repo. I don’t know if there are missing branches but it should have the entirety of the master branch history at least.

patatahooligan@lemmy.world · edited 4 months ago

I have my own backup of the git repo and I downloaded this to compare and make sure it’s not some modified (potentially malicious) copy. The most recent commit on my copy of master was dc94882c9062ab88d3d5de35dcb8731111baaea2 (4 commits behind OP’s copy). I can verify:

that the history up to that commit is identical in both copies
after that commit, OP’s copy only has changes to translation files which are functionally insignificant

So this does look to be a legitimate copy of the source code as it appeared on github!
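For anyone who wants to repeat this kind of check against their own backup, here is a rough sketch of one way it could be done. It is not necessarily how the check above was performed. It relies on the fact that a git commit hash covers the whole history behind it, so if the same commit exists in both clones, the history up to it is the same. The `translations/` prefix and the function names are hypothetical and only there for illustration.

```python
import subprocess


def git(repo, *args):
    """Run a git command inside `repo` and return its stripped stdout."""
    result = subprocess.run(
        ["git", "-C", repo, *args],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def has_commit(repo, sha):
    """True if `sha` names a commit object that exists in `repo`."""
    proc = subprocess.run(
        ["git", "-C", repo, "cat-file", "-e", f"{sha}^{{commit}}"],
        capture_output=True,
    )
    return proc.returncode == 0


def files_changed_since(repo, sha, branch="master"):
    """List the paths that differ between `sha` and the tip of `branch`."""
    return git(repo, "diff", "--name-only", sha, branch).splitlines()


def outside_prefixes(paths, allowed_prefixes):
    """Return the paths that do NOT start with any allowed prefix."""
    return [p for p in paths if not p.startswith(tuple(allowed_prefixes))]
```

Usage would look something like `has_commit("downloaded-copy", "dc94882c9062ab88d3d5de35dcb8731111baaea2")`, followed by `outside_prefixes(files_changed_since("downloaded-copy", "dc94882c9062ab88d3d5de35dcb8731111baaea2"), ["translations/"])`, which should come back empty if only translation files changed. Running `git fsck` on the downloaded copy first is a sensible extra step, since it checks that the stored objects actually match their hashes.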

Clarifications:

This was just a random check, I do not have any reason to be suspicious of OP personally
I did not check branches other than master (yet?)
I did not (and cannot) check the validity of anything beyond the git repo
You don’t have a reason to trust me more than you trust OP… It would be nice if more people independently checked and verified against their own copies.

I will be seeding this for the foreseeable future.

patatahooligan@lemmy.world · 5 months ago

First time I’ve heard of Mojeek. Why should I trust it more than any other company? Is there anything particular about its economic model or governance that makes it less likely to decide to be unethical?

patatahooligan@lemmy.world · 5 months ago

Aha I see what you’re saying. It’s possible that dr CD considered the second part to be crucial, but it doesn’t seem that people who listened to his message felt the same way, myself included. I probably speak for a lot of people when I say we hadn’t realized just how much these platforms are “subsidized” and how much damage that does to the entire market. So that part ended up being associated in our minds with the term enshittification.

patatahooligan@lemmy.world · 5 months ago

“Enshittification” does not mean “I don’t like it”. It is specifically about platforms that start out looking too good to be true and turn to shit when the user base is locked in. The term is generally used for cases where the decline in quality was pre-planned and not due to external factors. Using the same term each time is, in my opinion, an appropriate way to point out just how common this pattern is.

patatahooligan@lemmy.world · 7 months ago

I read through the article but it doesn’t seem to specify the nature of the book. How do we know it’s a “knock off”? It might very well be fanfiction. Copyright law aside, fanfiction can be original and is a valid artistic expression.

This is quite a nuanced issue. The author is claiming that the Rings of Power copied his ideas. Even if the author didn’t have the legal right to publish this book, he might have put original ideas into his work, and the Tolkien Estate should not automatically own these. The copyright owner “should” (within the current legal framework) be able to make you take down your derivative work, but they don’t own it. The article doesn’t specify why the original lawsuit was dismissed.

patatahooligan@lemmy.world · 7 months ago

Then the site is wrong to tell you that you can use the images in any way you want.

That’s what I’m saying.

intentionally violate copyright

Why is it intentional? Some characters come up even in very generic prompts. I’ve been toying around with it and I’m finding it hard to come up with prompts containing “superhero” that don’t include Superman in the outputs. Even asking explicitly for original characters doesn’t work.

For the most part it hasn’t happened.

And how do you measure that? Do you have a way for me to check if my prompt for “Queer guy standing on top of a mountain gazing solemnly into the distance” is strikingly similar to some unknown person’s DeviantArt uploads, just like my prompt containing “original superhero” was to Superman?

The status quo…

Irrelevant to the discussion. We’re talking about copyright law here, ie about what rights a creator has on their original work, not whether they decide to exercise them in regards to fan art.

until they get big enough

Right, so now that multi-billion dollar companies are taking in the work of everyone under the sun to build services threatening to replace many jobs, are they “big enough” for you? Am I allowed to discuss it now?

This is an argument-by-comparison.

It’s not an argument by comparison (or it is a terrible one) because you compared it to something that differs in all the crucial parts of the issue (or you avoided mentioning them). The discussion around AI exists specifically because of how the data to train them is sourced, because of the specific mechanisms they implement to produce their output, and because of demonstrated cases of producing output that is very clearly a copy of copyrighted work. By leaving the crucial aspects unspecified, you are trying to paint my argument as being that we should ban every device of any nature that could produce output that might under any circumstances happen to infringe on someone’s copyright, which is much easier for you to argue against without having to touch on any of the real talking points. This is why this is a strawman argument.

You don’t own a copyright on a pattern

Wrong. In the context of training AI, I’m talking about any observable pattern in the input data, which does include some forms of patterns that are copyrightable, eg the general likeness of a character rather than a specific drawing of them.

your idea of how copyright should work here is regressive, harmful

My ideas on copyright are very progressive actually. But we’re not discussing my ideas, we’re discussing existing copyright law and whether the “transformation” argument used by AI companies is bullshit. We’re discussing if it’s giving them a huge and unearned break from the copyright system that abuses the rest of us for their benefit.

a description specific enough to produce Mickey Mouse from a machine that’s never seen it.

Right, but then you would have to very strictly define Mickey Mouse in your prompt. You would be the one providing this information, instead of it being part of the model. That would clearly not be an infringement on the model’s part!

But then you would also have to solve the copyright infringement of Superman, Obi-Wan, Pikachu, some random person’s DeviantArt image depicting “Queer guy standing on top of a mountain gazing solemnly into the distance”, and so on. In the end, the only model that can claim without reasonable objection to have no tendency to illegally copy other people’s works is a model that is trained only on data with explicit permission.

patatahooligan@lemmy.world · 7 months ago

If AI companies were predominantly advertising themselves as “we make your pictures of Mickey Mouse” you’d have a valid point.

Doesn’t matter what it’s advertised as. That picture is, you agree, unusable. But the site I linked to above is selling this service and it’s telling me I can use the images in any way I want. I’m not stupid enough to use Mickey Mouse commercially, but what happens when the output is extremely similar to a character I’ve never heard of? I’m going to use it assuming it is an AI-generated character, and the creator is very unlikely to find out unless my work ends up being very famous. The end result is that the copyright of everything not widely recognizable is practically meaningless if we accept this practice.

But at this point you’re basically arguing that it should be impossible to sell a magical machine that can draw anything you ask from it because it could be asked to draw copyright images.

Straw man. This is not a magical device that can “draw anything”, and it doesn’t just happen to be able to draw copyrighted images as a side-effect of being able to create every imaginable thing, as you try to make it sound. This is a mundane device whose sole function is to try to copy patterns from its input set, which unfortunately is pirated. If you want to prove me wrong, make your own model without a single image of Mickey Mouse or a tag with his name, then try to get it to draw him like I did before. You will fail because this machine’s ability to draw him is dependent on being trained on images of him.

There are many ways this could be done ethically, like:

build it on open datasets, or on datasets you own, instead of pirating
don’t commercialize it
allow non-commercial uses, like research or just messing around (which would be a real transformative use)

patatahooligan@lemmy.world · 7 months ago

Seems like a petty technicality to me.

The “transformation” is the petty technicality in my opinion. Would it be transformative if I sold you a database of base64 encoded images? What about if they were encrypted?
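To spell out why base64 is such a weak candidate for "transformation", here is a tiny sketch showing that the encoding is fully reversible. The bytes below are a made-up stand-in, not a real image.

```python
import base64

# Stand-in for the raw bytes of an image file (hypothetical data).
original = b"\x89PNG\r\n\x1a\n stand-in image bytes"

# Encoding changes how the data looks...
encoded = base64.b64encode(original)

# ...but decoding gives back exactly the same bytes.
decoded = base64.b64decode(encoded)
round_trips = decoded == original
```

The encoded form looks nothing like the input, yet nothing has been added or lost, which is the point of the question.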

Hell, you can hire me to paint based on prompts you give me. That’s the exact same service an AI provides, no? I’m going to study copyrighted materials to get better at my service. Surely if pictures -> AI model is transformative, then pictures -> knowledge in my brain is transformative as well. So you give me the prompt “Mickey Mouse” and I draw this. This is “custom art”. You think you can use that commercially? And if you realize that you can’t, why do you think I should be able to legally sell you this service?

patatahooligan@lemmy.world · 7 months ago

Pictures and things that draw pictures aren’t the same thing.

And that’s completely irrelevant because “things that draw pictures” is not the work being sold. You’re buying pictures.

patatahooligan@lemmy.world · 7 months ago

Except it’s not really transformative because the end product is not the model itself. The product is a service that writes code or draws pictures. It is literally the exact same as the input and it is intended specifically to avoid having to buy the inputs.

patatahooligan@lemmy.world · edited 9 months ago

You can argue that “open source” can mean other things than what the OSI defined it to mean, but the truth of the matter is that almost everyone thinks of the OSI definition or a similar one when they talk about “open source”. Insisting on using the term this way is deliberately misleading. Even your own links don’t support your argument.

A bit further down in the Wikipedia page is this:

Main article: Open-source software

Generally, open source refers to a computer program in which the source code is available to the general public for use for any (including commercial) purpose, or modification from its original design.

And if you go to the main article, it is apparent that the OSI definition is treated as the de facto definition of open source. I’m not going to quote everything, but here are examples of this:
https://en.wikipedia.org/wiki/Open-source_software#Definitions
https://en.wikipedia.org/wiki/Open-source_software#Open-source_versus_source-available

And from Red Hat, literally the first sentence

Open source is a term that originally referred to open source software (OSS). Open source software is code that is designed to be publicly accessible—anyone can see, modify, and distribute the code as they see fit.

…

What makes software open source?

And if we follow that link:

In actuality, neither free software nor open source software denote anything about cost—both kinds of software can be legally sold or given away.

But the Red Hat page is a bad source anyway because it is written like a short intro and not a formal definition of the concept. Taking a random sentence from it and arguing that it doesn’t mention distribution makes no sense.

Here is a more comprehensive page from Red Hat, that clearly states that they evaluate whether a license is open source based on OSI and the FSF definitions.

patatahooligan@lemmy.world · 9 months ago

They could make new updates to lemmy proprietary

Maybe not even that. Lemmy is released under the AGPL3. This means that modified versions of Lemmy have to also be released as free software under the AGPL3 or a compatible license. To release a derivative work under an incompatible license you would need to own the code or be given permission by each contributor to do so. For any contribution where you can’t make a deal with the author, you would have to rip it out of the codebase entirely. Note that this is true for lemmy devs as well. If there is no Contributor License Agreement that states otherwise, they cannot distribute the work of other contributors under an AGPL3-incompatible license.

patatahooligan@lemmy.world · 10 months ago

It’s not about “accomplishing” something that couldn’t be done with a database. It’s about making these items tradeable on a platform that doesn’t belong to a single entity, which is often the original creator of the item you want to sell. As good as the Steam marketplace might be for some people, every single sale pays a tax to Valve, and the terms could change at any moment with no warning. The changes could be devastating for the value of your collectibles that you might have paid thousands of dollars for. This could not happen on any decentralized system. It could be something else that isn’t NFTs but it would absolutely have to be decentralized. Anything centralized that “accomplishes the same thing” doesn’t really accomplish the same thing.

It’s worth noting that this sort of market control would never be considered ok on any other market. Can you imagine a car manufacturer requiring every sale to go through them? Would you accept paying them a cut when you resell your car? Would you accept having to go through them even to transfer ownership of the car to a family member? If a car manufacturer tried to enforce such terms on a sale they would be called out for it and it would most likely be ruled to be unlawful. But nobody questions the implications of the same exact situation in a digital marketplace.

patatahooligan@lemmy.world · 1 year ago

Let’s remove the context of AI altogether.

Yeah sure if you do that then you can say anything. But the context is crucial. Imagine that you could prove in court that I went down to the public library with a list that read “Books I want to read for the express purpose of mimicking, and that I get nothing else out of”, and on that list was your book. Imagine you had me on tape saying that for me writing is not a creative expression of myself, but rather I am always trying to find the word that the authors I have studied would use. Now that’s getting closer to the context of AI. I don’t know why you think you would need me to sell verbatim copies of your book to have a good case against me. Just a few passages should suffice given my shady and well-documented intentions.

Well that’s basically what LLMs look like to me.

patatahooligan@lemmy.world · 1 year ago

But what an LLM does meets your listed definition of transformative as well

No it doesn’t. Sometimes the output is used in completely different ways but sometimes it is a direct substitute. The most obvious example is when it is writing code that the user intends to incorporate into their work. The output is not transformative by this definition as it serves the same purpose as the original works and adds no new value, except stripping away the copyright of course.

everything it outputs is completely original

[citation needed]

that you can’t use to reconstitute the original work

Who cares? That has never been the basis for copyright infringement. For example, as far as I know I can’t make and sell a doll that looks like Mickey Mouse from Steamboat Willie. It should be considered transformative work. A doll has nothing to do with the cartoon. It provides a completely different sort of value. It is not even close to being a direct copy or able to reconstitute the original. And yet, as far as I know I am not allowed to do it, and even if I am, I won’t risk going to court against Disney to find out. The fear alone has made sure that we mere mortals cannot copy and transform even the smallest parts of copyrighted works owned by big companies.

I would find it hard to believe that if there is a Supreme Court ruling which finds digitalizing copyrighted material in a database is fair use and not derivative work

Which case are you citing? Context matters. LLMs aren’t just a database. They are also a frontend for extracting the data from these databases, one that is being heavily marketed and sold to people who might otherwise have bought the original works instead.

The lossy compression is also irrelevant, otherwise literally every pirated movie/series release would be legal. How lossy is it even? How would you measure it? I’ve seen github copilot spit out verbatim copies of code. I’m pretty sure that if I ask ChatGPT to recite me a very well known poem it will also be a verbatim copy. So there are at least some works that are included completely losslessly. Which ones? No one knows and that’s a big problem.