Also, the majority of the data used to train Copilot (which came from GPT-3) is not code. The biggest part was the GPT-3 base, which was trained on public internet content (which could also be licensed). So you also can't state that "everything" (+)
Replying to @_odelucca @NoraDotCodes and others
GitHub used to train Copilot was potentially licensed, since the majority of the training was done in the transformer (GPT-3). So the "code" part is just the tip of the iceberg; it only specializes GPT-3 to write code (+)
Replying to @_odelucca @NoraDotCodes and others
Anyway, we need a LOT of debate on this subject. Not only for Copilot but for ANY licensed content. For example:
- If we train an AI to write songs, what should we do, since we used licensed music to train it? (+)
Replying to @_odelucca @NoraDotCodes and others
- If we train an AI to write movie scripts, what should we do, since we used licensed movie scripts to train it?
- If we train an AI to take pictures, what should we do, since we used professional pictures to train it? (+)
Replying to @_odelucca @NoraDotCodes and others
This is a WHOLE new area. We can't simplify it as "OMG! COPILOT IS WRITING THE SAME LINE AS WE CAN SEE HERE IN THIS SOFTWARE". Come on, it is way more complicated than that!
Actually, we can, and we should, simplify it, because no matter how complex the AI system is, it boils down to the same end result. It is designed to provide useful recommendations, not to avoid copyright infringement. That's the problem.
You're mixing concepts, maybe because (as you've said) you're not an expert in machine learning. Recommendations are different from predictions. What Copilot (and also GPT-3) does is predict what you expect based on the context given so far. (+)
Replying to @_odelucca @awakecoding and others
As I've said, if we go down this road, we can cancel any machine learning initiative, since every ML initiative uses publicly available data, regardless of whether it is licensed or not.
Good.
Just a final thought: I don't have the answer to this! That's why I've said that this needs a LOT of debate, since both positions could be correct, and I don't have a final opinion yet. I truly don't know if using publicly available licensed data is morally correct.
It doesn't need "debate", it needs litigation.
I think it needs a lot of debate, because this extends to any machine learning algorithm, not only GPT-3.
Obviously? But how would debate help this situation? It's well understood by the parties involved, we just need someone (the courts) to decide who is right.