Do people have favorite datasets to train new analysts and data scientists on? Ideally something messy enough to require some cleaning practice. @_brohrer_ @vboykis @chrisalbon @djpardis @generativist @ryxcommar
Replying to @dunmireg @_brohrer_ and
I think it depends on what you’re interested in! Generating your own data is beneficial for learning tools/packages, too! Think about which skills you want to hone (1/x)
Replying to @MikeDoesData @dunmireg and
For example:
Scraping tweets or reviews teaches you about XPath, regex, and string slicing.
Scraping images and using OpenCV teaches signal processing and MANY algorithms used in DL.
Rob Hyndman has great datasets for time series modeling techniques.
Also, http://H2O.ai
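To make the regex and string-slicing point concrete, here is a minimal sketch using only Python's standard library. The tweet text, the `parse_tweet` helper, and the 20-character preview length are all made up for illustration; real scraped text will be messier and will need more careful patterns.

```python
import re

URL_PATTERN = r"https?://\S+"


def parse_tweet(text):
    """Pull mentions, hashtags, and URLs out of raw tweet text (hypothetical helper)."""
    mentions = re.findall(r"@(\w+)", text)   # usernames without the leading @
    hashtags = re.findall(r"#(\w+)", text)   # tags without the leading #
    urls = re.findall(URL_PATTERN, text)
    # Remove URLs, trim whitespace, then slice off a short preview.
    text_only = re.sub(URL_PATTERN, "", text).strip()
    return {
        "mentions": mentions,
        "hashtags": hashtags,
        "urls": urls,
        "preview": text_only[:20],
    }


# Made-up example text, not a real tweet.
sample = "Loving #pandas for cleaning! Thanks @some_user https://example.com/x"
parsed = parse_tweet(sample)
```

Running patterns like these against a few hundred real tweets or reviews is a quick way to find the edge cases (emoji, trailing punctuation on URLs, mentions inside email addresses) that make the cleaning practice worthwhile.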
Replying to @MikeDoesData @dunmireg and
Steal my idea if you’d like: Scrape restaurants within a 10-mile radius of you. Gather food preferences for you and your friends. Build a recommender based on which combination of your friends is going out with you, and their top 3 preferences. (Just trying to help!!!)
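One possible way to sketch that recommender idea in Python is below. Everything in it is an assumption for illustration: the friend names, the cuisine tags, the restaurant list (which stands in for whatever you scrape), and the 3/2/1 point scheme for a person's first, second, and third choice.

```python
from collections import Counter

# Hypothetical data: each person's top-3 cuisines, favorite first.
PREFERENCES = {
    "me": ["thai", "pizza", "sushi"],
    "ana": ["sushi", "thai", "tacos"],
    "raj": ["tacos", "pizza", "thai"],
}

# Hypothetical stand-in for scraped restaurants within the radius.
RESTAURANTS = [
    {"name": "Bangkok Corner", "cuisine": "thai"},
    {"name": "Slice House", "cuisine": "pizza"},
    {"name": "Taqueria Uno", "cuisine": "tacos"},
    {"name": "Sakura", "cuisine": "sushi"},
]


def recommend(group, preferences=PREFERENCES, restaurants=RESTAURANTS):
    """Rank restaurants by summed preference points for the people going out."""
    scores = Counter()
    for person in group:
        # First choice earns 3 points, second 2, third 1.
        for rank, cuisine in enumerate(preferences[person][:3]):
            scores[cuisine] += 3 - rank
    # Highest-scoring cuisines first; drop restaurants nobody in the group wants.
    ranked = sorted(restaurants, key=lambda r: scores[r["cuisine"]], reverse=True)
    return [r["name"] for r in ranked if scores[r["cuisine"]] > 0]


tonight = recommend(["me", "ana"])
```

Because the score depends on who is in `group`, the same restaurant list produces different rankings for different combinations of friends. Swapping the point scheme for something learned from past outings would be a natural next step once you have real data.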
Scraping projects that satisfy an itch certainly motivated a lot of my learning over the years.