"Reproducible machine learning with PyTorch and Quilt" (https://blog.paperspace.com/reproducible-data-with-pytorch-and-quilt/ …). Interesting post, but I think "reproducibility" is a bit overselling it. Would be curious to hear from people who tried Quilt how it's beneficial compared to the trad. code on github+data on AWS S3?
Wait, am I reading this right that public data is free to distribute?
-
Not sure, but I think that's for the free account. I also see private storage for paid accounts (much like GitHub).
-
1/ I think I now get what they mean by "reproducible": reproducible in the sense of tracking the history of changes to a dataset, not necessarily in the sense of sharing the dataset and processing pipeline (which one can do with GitHub + S3)
-
2/2 So, it may be useful in certain circumstances to reproduce some results. Although, I think if you document your data processing steps (ideally automate all steps via scripts), it would be overkill to version all the data between input and end product in many projects
-
that depends on how well you've locked down your environment, seeds, etc., and whether there are benefits (computational or otherwise) to keeping intermediate data in the pipeline.
-
but yes... history of changes is super valuable. currently, we release TBs of data, but users can only access the "latest" that we've released (or whatever they cached before we updated).
-
also, quilt.install does the local caching (including support for versions!!) that @dyfbrain & team had to build out for the allensdk. cc @nicholashcain
-
As I think about it more, the "data versioning" piece would integrate nicely into a Continuous Analysis strategy. https://www.nature.com/articles/nbt.3780 …
-
1/ Yeah, could be useful to some degree in that context. However, I think the best use case for quilt may be a scenario where e.g., a dataset/database is the end product of a (research) project, and where you are planning to make updates