1. Consider sharing your code as a tool to build on, more than a snapshot of your work:
- others will build stuff that you can't imagine => give them easy access to the core elements
- don't overdo it => no need for one-liner abstractions that won't fit others' needs; keep it clean & simple
2. Put yourself in the shoes of a master's student who has to start from scratch with your code:
- give them a ride up to the end with pre-trained models
- focus examples/code on open-access datasets (not everybody can pay for CoNLL-2003)
3. Give clear instructions on how to run the code, at least evaluation, in such a way that, combined with pretrained models, it allows for fast test/debug
4. Use the least amount of dependencies: if you are using an internal framework to build the model => copy the relevant part
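To make tip 3 concrete, here is a minimal sketch of what a one-command evaluation entry point could look like. Everything in it is an assumption made for illustration: the `--model` and `--data` flags, the JSON "checkpoint" (a plain token-to-label lookup standing in for real pretrained weights), and the tab-separated data format are all hypothetical, not taken from any particular repo.

```python
import argparse
import json
import os
import tempfile


def load_model(path):
    # Hypothetical "pretrained model": a token -> label lookup stored as JSON.
    with open(path) as f:
        return json.load(f)


def evaluate(model, examples):
    # Fraction of tokens whose predicted label matches the gold label.
    # Tokens unknown to the model are predicted as "O".
    correct = sum(model.get(token, "O") == gold for token, gold in examples)
    return correct / len(examples)


def main(argv=None):
    parser = argparse.ArgumentParser(description="Evaluate a pretrained model")
    parser.add_argument("--model", required=True, help="path to pretrained weights")
    parser.add_argument("--data", required=True, help="TSV file: token<TAB>label")
    args = parser.parse_args(argv)

    model = load_model(args.model)
    with open(args.data) as f:
        examples = [line.rstrip("\n").split("\t") for line in f if line.strip()]

    accuracy = evaluate(model, examples)
    print(f"accuracy: {accuracy:.2f}")
    return accuracy


# Smoke test on tiny made-up files, so a newcomer can check the whole
# pipeline in seconds before touching real data.
with tempfile.TemporaryDirectory() as tmp:
    model_path = os.path.join(tmp, "model.json")
    data_path = os.path.join(tmp, "dev.tsv")
    with open(model_path, "w") as f:
        json.dump({"Paris": "LOC", "Alice": "PER"}, f)
    with open(data_path, "w") as f:
        f.write("Alice\tPER\nvisited\tO\nParis\tLOC\nBerlin\tLOC\n")
    demo_accuracy = main(["--model", model_path, "--data", data_path])
```

The point of the sketch is the shape, not the model: one documented command, a shipped checkpoint, and a small open dataset are enough for someone to confirm the code runs before they start modifying it.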
5. Spend 4 days to do it well. Open-sourcing a good code base takes some time, but you should consider it as important as your paper
6. Consider merging with a larger repo: are you working on language models? Transformers is probably happy to help you: https://github.com/huggingface/transformers …
7. Now, what if you want to build a large-scale tool like Transformers? Here are a few additional tips:
A. focus on one essential feature that your community really needs and no one provides
B. do it well
C. keep putting yourself in the shoes of people using your tool for the 1st time
D. Open-sourcing ML can be very different from other types of open-sourcing:
- ML bugs are silent => researchers need to know exactly what's happening inside your code
- researchers will create things you have no idea about => they'll want to dive into your code and modify it
=> you need to keep everything clear & visible. No unnecessary user-facing abstractions or layers. Direct access to the core. Each user-facing abstraction is a mask that can hide an ML bug, a potential source of misunderstandings, and a steeper learning curve for users. [End]
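As an illustration of what "ML bugs are silent" can mean in practice, here is a small made-up example (not from the thread): a softmax normalized over the wrong dimension. The toy logits and helper names are assumptions chosen for the sketch. Nothing crashes, the output has the right shape, and every value lies between 0 and 1, yet the rows are no longer probability distributions.

```python
import math


def softmax(values):
    # Numerically stable softmax over a flat list of scores.
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# logits[i][j]: score of class j for example i (2 examples, 3 classes).
logits = [[2.0, 1.0, 0.1],
          [0.5, 2.5, 0.3]]

# Correct: normalize over classes, one distribution per example.
probs_ok = [softmax(row) for row in logits]

# Buggy: normalize over the batch dimension (columns) instead, then
# transpose back. Same shape, values in [0, 1], and no error raised.
columns = [list(col) for col in zip(*logits)]
probs_bad = [list(row) for row in zip(*[softmax(col) for col in columns])]

row_sums_ok = [sum(row) for row in probs_ok]
row_sums_bad = [sum(row) for row in probs_bad]

print("correct row sums:", [round(s, 3) for s in row_sums_ok])
print("buggy row sums:  ", [round(s, 3) for s in row_sums_bad])
```

If the normalization were buried under a few layers of user-facing wrappers, the only symptom might be a slightly worse metric, which is why direct access to the core matters.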
I have made a repository that is a cookbook along these lines: https://github.com/nathanshammah/scikit-project …