Daniel Smilkov
@dsmilkov
Co-creator of Know Your Data & TensorFlow.js // PAIR // Responsible AI // Google Brain // 🇲🇰🇺🇸 // ML & Visualization
Boston, MA · smilkov.com · Joined October 2010

Daniel Smilkov’s Tweets

Pinned Tweet
We started Know Your Data last year while still working on TensorFlow.js when we realized that ML developers and researchers weren’t spending enough time looking at and cleaning their datasets. So we set out to build a tool that would visualize machine learning datasets.
Quote Tweet
We are very excited to announce Know Your Data, a tool that we’ve been working on for over a year. We’ve been using it inside Google and today we're sharing a beta version. Try it here → knowyourdata.withgoogle.com Here is a thread with a few features of the tool: 1/8
We computer vision researchers rarely look at the individual data points inside our datasets. Mainly because we are too lazy and/or do not have the right tools. This needs to change. And now we have a great tool from datasets team: Know Your Data. A thread. 🧵 (1/5)
A case study on the Coco Captions dataset shows how simple statistics can often reveal unintentional bias in the data collection and annotation process. The analysis was done by […] using the knowyourdata.withgoogle.com tool.
Quote Tweet
Check out a case study with Know Your Data, a dataset exploration tool introduced earlier this year at Google I/O, that highlights how biases can be traced to both dataset collection and annotation practices. goo.gle/3CzY0WQ
Come work with us to push the limits of interactive data analytics and help mitigate bias and quality issues with ML data! Apply via the link 👇
Quote Tweet
Interested in working with the PAIR team at Google on projects like Know Your Data? Apply here, we're hiring! And yes, "full stack": we really care about folks thinking about the intersection between UX / frontend and large scale data munging. careers.google.com/jobs/results/1
Thank you 🙏 for graduating a research tool to a full-fledged production service. I’ve learned so much about prod at Google in the last couple of months from all of you!
Quote Tweet
I'm grateful to work at a company that rolls up its sleeves to fight ML bias. I formed a 20% Reliability/Productivity team to launch #KnowYourData at scale: knowyourdata.withgoogle.com. Just another day in my @LifeAtGoogle #wearehiring #sre #engprod #fightmlbias #GooglePAIR #GoogleIO
Thank you for your ideas and suggestions on the tool, and especially the nPMI idea which became the Relations tab!
Quote Tweet
New really cool tool from Google to analyze your datasets. I was able to help with this and I'm sad not to be there for this announcement. It's neat and useful, and has a lot more awesome stuff on the roadmap. twitter.com/googledevs/sta…
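For readers unfamiliar with nPMI (normalized pointwise mutual information), here is a minimal sketch of the standard definition computed from co-occurrence counts. How Know Your Data computes its Relations tab is not described in these tweets; the function name `npmi`, its count-based inputs, and the handling of the degenerate cases are assumptions made for illustration.

```typescript
// Normalized pointwise mutual information (nPMI) from raw counts.
//   nPMI(x, y) = ln( p(x,y) / (p(x) * p(y)) ) / ( -ln p(x,y) )
// Values range from -1 (never co-occur) through 0 (independent) to 1 (always co-occur).
// Hypothetical helper: not taken from Know Your Data.
function npmi(countXY: number, countX: number, countY: number, total: number): number {
  if (total <= 0) {
    throw new Error("total must be positive");
  }
  if (countXY === 0) {
    return -1; // limit value when x and y never appear together
  }
  const pXY = countXY / total;
  const pX = countX / total;
  const pY = countY / total;
  if (pXY === 1) {
    return 0; // assumption: both always present, so the ratio is undefined; treat as independent
  }
  return Math.log(pXY / (pX * pY)) / -Math.log(pXY);
}

// Example: 1000 captions; a word and a label each appear in 100 of them.
const alwaysTogether = npmi(100, 100, 100, 1000); // close to 1
const independent = npmi(20, 100, 200, 1000);     // close to 0
const neverTogether = npmi(0, 100, 200, 1000);    // exactly -1
console.log(alwaysTogether, independent, neverTogether);
```

A strongly positive or negative score between, say, a caption word and an image attribute is the kind of simple statistic that can point to an unintended association worth inspecting by hand.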
With KYD we aim to raise the bar for publishing a dataset. Just like research papers come with reproducible code, we want datasets to be accompanied by an interactive tool. We also hope that dataset owners will use the tool to improve their datasets.
We want people from wider backgrounds to have access to ML technologies, so we launched a beta of Know Your Data for Datasets. We hope for your feedback to make the tool usable not only by ML researchers, but also by PMs, engineers, and other decision makers.
No matter which battleground state you look at, the exit polls say it all. Thank you Blacks, Hispanics, Asians and others for saving the democracy of USA. (Screenshot for PA)
When people ask me about PAIR and Fernanda, I have a lot to share, but at the end I point to a simple fact: people often switch teams at Google (which is great), yet Fernanda has been my manager for 6 years now and I’m more excited about my work than ever.
It’s so important to have a mini, educational version of an ML model, especially since it lends itself to interactive visualizations that can help us understand it better. Looking forward to seeing those visualizations come out!
Quote Tweet
I wrote a minimal/educational GPT training library in PyTorch, am calling it minGPT as it is only around ~300 lines of code: github.com/karpathy/minGPT +demos for addition and character-level language model. (quick weekend project, may contain sharp edges)
A really nice tool for debugging language-based models! Goes to show you the potential when visualization experts partner with domain (NLP) experts! It’s 🔥
Quote Tweet
Introducing an early-release version of the Language Interpretability Tool (LIT), a visual, interactive, and extensible open-source tool for analyzing all sorts of NLP models 🔥 Code: github.com/pair-code/lit/ Paper: arxiv.org/abs/2008.05122 #NLProc (1/4)
We make a ton of engineering and human decisions when building even the simplest ML models. This podcast breaks those decisions down, step by step, giving us a fresh perspective on ML.
Quote Tweet
๐ŸŽ™๏ธSuper excited to launch a new project with @dweinberger! โ€œTic-Tac-Toe the Hard Wayโ€ is a 9 episode introductory podcast about the human choices that go into making machine learning systems. More at pair.withgoogle.com/thehardway/
Finalization is advancing to stage 4 (ready for inclusion in the standard)! This is super relevant to libraries that use WebGL and WASM, such as TensorFlow.js, where we do manual memory management. tidy() is here to stay, but WeakRefs will help a lot with one-off tensor leaks.
Quote Tweet
ECMAScript excitement 😉 Today TC39 advanced these features to Stage-4 🎉
- Intl.ListFormat
- Intl.DateTimeFormat: dateStyle & timeStyle
- Logical Assignment (&&= ||= ??=)
- Numeric Separators (1_000)
- Promise.any & AggregateError
- WeakRefs & FinalizationRegistry
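To make the memory-management point concrete, here is a minimal sketch of scope-based cleanup in the spirit of tidy(). It does not use TensorFlow.js itself: `FakeTensor`, `track`, and `tidySketch` are hypothetical names invented for this illustration, and a counter stands in for GPU or WASM memory that the JavaScript garbage collector cannot reclaim on its own.

```typescript
// Hypothetical stand-in for a tensor whose backing memory lives outside the JS heap.
class FakeTensor {
  static liveCount = 0; // tensors whose backing memory is still allocated
  disposed = false;

  constructor(public readonly values: number[]) {
    FakeTensor.liveCount++;
  }

  // Manual release, analogous to calling dispose() on a real tensor.
  dispose(): void {
    if (!this.disposed) {
      this.disposed = true;
      FakeTensor.liveCount--;
    }
  }
}

// Stack of active scopes; each scope records the tensors created inside it.
const scopes: FakeTensor[][] = [];

// Register a tensor with the innermost active scope, if there is one.
function track(t: FakeTensor): FakeTensor {
  if (scopes.length > 0) {
    scopes[scopes.length - 1].push(t);
  }
  return t;
}

// Run fn, then dispose every tensor created inside it except the returned one.
function tidySketch(fn: () => FakeTensor): FakeTensor {
  const scope: FakeTensor[] = [];
  scopes.push(scope);
  let result: FakeTensor | undefined;
  try {
    result = fn();
    return result;
  } finally {
    scopes.pop();
    for (let i = 0; i < scope.length; i++) {
      if (scope[i] !== result) {
        scope[i].dispose();
      }
    }
    if (result) {
      track(result); // hand the survivor to the enclosing scope, if any
    }
  }
}

const out = tidySketch(() => {
  const a = track(new FakeTensor([1, 2, 3]));
  const b = track(new FakeTensor([4, 5, 6]));
  return track(new FakeTensor(a.values.map((v, i) => v + b.values[i])));
});
console.log(FakeTensor.liveCount); // 1: only `out` is still allocated
```

A tensor created outside any scope and then forgotten is the one-off leak case: nothing ever calls dispose() on it. A FinalizationRegistry could act as a safety net there by releasing the backing memory once the wrapper object is collected, though its callbacks run at the garbage collector's discretion, so scope-based cleanup remains the deterministic path.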
International students: I know how resilient you can be. I was one of you. Work with your peers and schools to figure out a plan so your visa stays valid. Use your resilience to minimize the damage. Give the US another chance. I will use my voice and my vote and we will fix this.