No, my statement is extreme, but it is only designed to place a stake in the sand as far away from the lie “we open source” as is feasible.
Replying to @iwasleeg @NoraDotCodes
I am not sure how the argument holds. If Copilot's own source code does not include any of that GPL code, I am not sure that the training data matters, as it's not part of the software distribution.
Replying to @ITStudiosi @iwasleeg
True - but if it includes AGPL code, which it does, even the server-side code is covered.
Replying to @NoraDotCodes @ITStudiosi
And my position is that if GPL code is part of the training data, and all they’ve done is multiply that by some sigma values, the generated code is a derived work based on that GPL code and thus must be distributed under the same terms.
you may have forgotten the triviality bit: copyright won't apply to a trivial bit of code -- and these all seem trivial, and you can imagine that's how the model is built: include only snippets with X000 or more appearances in the corpus cc
@hipsterelectron who's also thinking
Replying to @luisbruno @iwasleeg and
about this. i prefer the copyrighted model approach: the Oracle v. Google case revolved around the structure, sequence, and organization of the N thousand APIs: even if individually e.g. java.lang.Math.max() isn't copyrightable, the whole body is. maybe that's a better copyright argument
Replying to @luisbruno @iwasleeg and
to say that the use of the code isn't transformative, but that the whole model has been trained on a corpus whose sets of licenses have to be respected anyway. I'd rather hear an actual lawyer talk about this one ;)
Replying to @luisbruno @iwasleeg and
danny mcClanahan retweeted danny mcClanahan
i have determined that we are absolutely fucked but feel free to wait for a lawyer to confirm https://twitter.com/hipsterelectron/status/1410391147868946433?s=21 …
Replying to @hipsterelectron @iwasleeg and
you're allowing the “this was mediated by tech” logic to confuse you: if a snippet is big enough and you have copyright on a similar enough bit of code -- the mechanism by which someone copied it doesn't matter. copyright protects the expression of an idea regardless of copy method
Replying to @luisbruno @hipsterelectron and
if a large snippet is constructed by you+copilot in multiple steps and ends up looking just like someone else's, you done goofed ;) i would also like to attach to the model itself, though -- an ML model is a derivative work on its own, and i wish this wasn't still being debated.
Yeah, that latter bit was my original point. It's easy to hide a derivative work generated by the model, but the model itself MUST comply with the licenses it is trying to launder.
Replying to @NoraDotCodes @hipsterelectron and
my apologies, i noticed it earlier but ran out of characters to fully word it; yes, you had mentioned it, and worded it better than i did ;)
Replying to @luisbruno @NoraDotCodes and
oh wow. now that you mention it, it's a very interesting question whether it's even possible for an ML model to comply with the GPL.