Elena Samuylova
@elenasamuylova
CEO (YC S21) 🚀. Built ML products, now ML tools 🛠. Open source 💻.

Elena Samuylova’s Tweets

Here is what comes with the new release:
📊A new Text Overview preset.
🔡Domain classifier as a new drift detection method.
🔢Text descriptor drift as an alternative.
📚All relevant existing tests and metrics now support text data.
🗂Multi-modal data support. (2/3)
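To give a feel for the new API, here is a minimal sketch based on the Evidently API around this release; exact names and parameters may differ between versions, and the `review` column and toy data are made up for illustration:

```python
import pandas as pd

from evidently import ColumnMapping
from evidently.report import Report
from evidently.metric_preset import TextOverviewPreset

# Hypothetical reference and current batches with a raw text column
reference = pd.DataFrame({"review": ["great product", "works as expected", "fast delivery"]})
current = pd.DataFrame({"review": ["lorem ipsum dolor", "asdkjh!!!", "great product"]})

# Declare which columns contain raw text
column_mapping = ColumnMapping(text_features=["review"])

# The Text Overview preset mentioned in the release
# (some text descriptors may require extra NLTK data to be downloaded)
report = Report(metrics=[TextOverviewPreset(column_name="review")])
report.run(reference_data=reference, current_data=current, column_mapping=column_mapping)
report.save_html("text_overview.html")
```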
🚀Super happy to announce the latest release: text data support! It has been one of the most requested features: “How can I use Evidently to monitor and evaluate text data drift?” Now, you can do that! Our open-source library now supports raw text inputs. (1/3)
A lot of cool stuff is in the works:
📈More evaluations
⭐️Spark integration
🔡Text data support
If you like what we do, support Evidently by giving a ⭐️ on GitHub, and join us to celebrate! GitHub: github.com/evidentlyai/ev Event on Dec 15: lu.ma/lqo2oi0i (4/4)
📊 A customizable framework for batch model checks. You can mix and match these evaluation “lego bricks” to fit your monitoring scenario. You can also implement your own tests and metrics while reusing the underlying Evidently architecture to structure your evaluations. (3/4)
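To illustrate the “lego bricks” idea, a minimal sketch of a test suite; the test names follow the Evidently API of that period (exact names and conditions may differ across versions), and the toy data and thresholds are hypothetical:

```python
import pandas as pd
from evidently.test_suite import TestSuite
from evidently.tests import (
    TestNumberOfMissingValues,
    TestShareOfDriftedColumns,
    TestAccuracyScore,
)

# Hypothetical batches with a feature, ground truth, and model predictions
reference = pd.DataFrame(
    {"feature": [1.0, 2.0, 3.0, 4.0], "target": [0, 1, 0, 1], "prediction": [0, 1, 0, 1]}
)
current = pd.DataFrame(
    {"feature": [1.1, 2.2, 2.9, 4.3], "target": [0, 1, 1, 1], "prediction": [0, 1, 0, 1]}
)

# Mix and match checks to fit the monitoring scenario
suite = TestSuite(tests=[
    TestNumberOfMissingValues(eq=0),     # data integrity check
    TestShareOfDriftedColumns(lt=0.3),   # data drift check
    TestAccuracyScore(gte=0.7),          # model quality check
])
suite.run(reference_data=reference, current_data=current)
suite.save_html("model_checks.html")
```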
📚An open-source library of ML evaluation metrics and tests. Evidently has over 100 tests and metrics to evaluate data quality, data drift, and ML model performance. Each comes with relevant parameters, a clean API, and helpful visualizations. Expect more! (2/4)
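For instance, a couple of ready-made presets can be combined in one report. A sketch: the preset names follow the Evidently API, and the toy data is hypothetical:

```python
import pandas as pd
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset, DataQualityPreset

# Two batches with the same schema (hypothetical data)
reference = pd.DataFrame({"feature": [1.0, 2.0, 3.0, 4.0, 5.0]})
current = pd.DataFrame({"feature": [1.5, 2.5, 3.5, 4.5, 5.5]})

# Combine presets in a single report
report = Report(metrics=[DataDriftPreset(), DataQualityPreset()])
report.run(reference_data=reference, current_data=current)
report.save_html("drift_and_quality.html")
```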
🏦 Lastly, we have the business or product KPIs. No one uses ML to get "90% accuracy". There is a business need behind it: converting users, getting better forecasts, etc. Does the model bring value to the business? Are the product metrics the model affects still OK? (5/n)
🤖 Third, you have the hero: the ML model itself. No model is perfect, and no model lasts forever. But a model can still be useful, and you need to ensure it remains so. Is the model still fit for the task? Are the predictions accurate? Can you trust them? (4/n)
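A sketch of how that check could look with Evidently, assuming a classification model and hypothetical labeled batches with `target` and `prediction` columns:

```python
import pandas as pd
from evidently.report import Report
from evidently.metric_preset import ClassificationPreset

# Hypothetical labeled batches with model predictions
reference = pd.DataFrame({"target": [0, 1, 0, 1, 1], "prediction": [0, 1, 0, 0, 1]})
current = pd.DataFrame({"target": [1, 1, 0, 0, 1], "prediction": [0, 1, 1, 0, 1]})

# Evaluate model quality on the new batch against the reference
report = Report(metrics=[ClassificationPreset()])
report.run(reference_data=reference, current_data=current)
report.save_html("model_quality.html")
```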
📊 Second, you have the data. Production ML models take new data as input, and that data changes over time. There are also many possible issues with data quality and integrity. We must monitor this crucial component. Is the data OK? Can you trust it to generate predictions? (3/n)
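Some of these questions can be answered with very simple checks before the data even reaches the model. A sketch, with a hypothetical `age` column and illustrative thresholds:

```python
import pandas as pd

def check_input_batch(batch: pd.DataFrame) -> dict:
    """Basic data integrity checks for an incoming batch (illustrative only)."""
    return {
        "max_missing_share": float(batch.isna().mean().max()),   # worst column by missing values
        "duplicate_rows": int(batch.duplicated().sum()),          # exact duplicates
        "age_in_range": bool(batch["age"].between(0, 120).all()), # hypothetical range check
    }

batch = pd.DataFrame({"age": [25, 37, None, 142]})
print(check_input_batch(batch))
# {'max_missing_share': 0.25, 'duplicate_rows': 0, 'age_in_range': False}
```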
💻 First, you still have the software backend.  It can be a set of batch jobs or a real-time service. You need to know how this software component works. Did the prediction job execute successfully? Did the service respond? Is it working fast enough? (2/n)
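For the real-time case, even a trivial probe answers most of these questions. A sketch; the health endpoint URL is hypothetical:

```python
import time
import requests

def probe(url: str, timeout: float = 2.0) -> dict:
    """Check that a prediction service responds, and how fast (illustrative only)."""
    start = time.perf_counter()
    try:
        response = requests.get(url, timeout=timeout)
        ok = response.status_code == 200
    except requests.RequestException:
        ok = False
    return {"ok": ok, "latency_s": round(time.perf_counter() - start, 3)}

print(probe("http://localhost:8080/health"))  # hypothetical endpoint
```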
What exactly is ML monitoring? We've published a new entry-level blog post on everything it can comprise. When you deploy an ML system in production, it integrates into the business. It's ROI time! To track its performance, you need to look at more than one thing 👇🏼 (1/n)
Translating users' needs into a great product is hard. It seems even harder for dev tools: you deal with so many layers of abstraction and infrastructure, and still have to make a bunch of architecture calls quite early. Great thread 👇🏼
Quote Tweet
Bananas and ML infrastructure: I've asked around about cloud workflows, and most of the feedback expressed unhappiness with cloud tooling. This prompted a discussion in @chipro's MLops community -- why are MLops frameworks so bad? (1/9)
ML monitoring is still in its nascency. As a community, we are yet to define the right approaches to do it at scale. That is something we hope to solve with Evidently, and a big reason why we do it in open source. A great overview of the challenges 👇🏼
Quote Tweet
Over the last year, many people have told me that operationalizing machine learning isn’t a research problem. I disagree. In the final post of my ML monitoring series, I outline research challenges and solution ideas: shreya-shankar.com/rethinking-ml-
Awesome OSS integration to detect drift in ML pipelines: @zenml_io + @EvidentlyAI ❤️ So happy to see it live! We must have discussed the idea when both tools were about 2 months old. Thanks so much for making it happen and for a very kind blog:
Quote Tweet
A little love letter to a tool I've spent some time getting to know over the past few weeks in order to write an integration for @zenml_io. @EvidentlyAI is a monitoring tool for machine learning workflows. blog.zenml.io/zenml-loves-ev
Absolutely awesome news, especially for those two-to-three-person companies where YC is the first investor.
Quote Tweet
We're excited to announce our new standard deal at Y Combinator. When a company is accepted into YC, we now invest a total of $500,000. @gralston shares more on our blog: blog.ycombinator.com/ycs-standard-d
📈 This year has been eventful for Evidently AI. We started 2021 as a bootstrapped team of two and a small Python library. We finish the year as a team of five, a YC company, and a comprehensive tool for ML model evaluation and monitoring. Look what happened in this recap👇🏼
Quote Tweet
🎉 2021 has been an incredible year for Evidently! We got into Y Combinator, shipped a lot, hired the team, and (almost) hit the 2000 GitHub stars. Read the recap here: evidentlyai.com/blog/2021-reca Wish you all a very Happy New Year!
Before starting to work on Evidently, we spoke with 50+ data science teams and came to the same conclusion: the ML models in production most often use tabular data and don't use deep learning. Text comes second.
Quote Tweet
Text is slowly gaining more prominence in *conjunction* with relational data in the day-job of many Data Scientists. However, tabular data remains the cornerstone of what Data Science is still about in most work environments. 4/4
🔥 Our latest collaboration! Love this @DeepnoteHQ feature that combines code and text in one interactive article:
Quote Tweet
A new collaboration: 📊 Evidently reports + 📖 @DeepnoteHQ notebooks! Check out this Deepnote project to: - read more about drift detection - see how to use Evidently in Deepnote - directly play with the code 👇🏼 deepnote.com/@EvidentlyAI/E
We are hosting the first Evidently community call this week! 👇🏼
Quote Tweet
🎨 Customize it! A major update: you can easily add custom tests, metrics, and components to the Evidently reports. Join the Community Call this Thursday to learn more: eventbrite.com/e/evidently-co More details in the release blog post: evidentlyai.com/blog/evidently
Awesome OSS ML news 👇🏼
Quote Tweet
1/3 Big news!! After more than 750 pull requests from 89 contributors and 19 months since our first release in May 2020 we are happy to announce the 1.0 release of #Haystack! 🎉🎁🖖 buff.ly/3rMrZYo (release notes here: buff.ly/3lLWEBs)
1. Gross underestimation of the expense & effort of ML projects. Many believe a single DS or tool can save you.
2. Not defining problems correctly or aligning them to business outcomes.
3. Gaslighting DS into thinking they need to be “full stack” and master the entire surface area. 1/
With outlier detection, we look at individual unusual inputs. Our goal is to decide whether the model will handle them well. If we detect an outlier, we often send it for manual processing. We want this check to be sensitive enough to detect a single anomaly. (3/4)
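One common way to implement such a check (not necessarily how Evidently does it internally) is to score each incoming row against reference data, e.g. with an Isolation Forest:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)
X_reference = rng.normal(0, 1, size=(1000, 3))      # training-time data
X_new = np.vstack([rng.normal(0, 1, size=(5, 3)),   # typical inputs
                   [[8.0, 8.0, 8.0]]])              # one anomalous input

# Fit on reference data, then flag individual unusual rows in new data
detector = IsolationForest(contamination=0.01, random_state=42).fit(X_reference)
flags = detector.predict(X_new)  # -1 marks an unusual individual input
print(flags)                     # e.g. [ 1  1  1  1  1 -1]
```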
With data drift, we look at the "global" data distributions. Our goal is to decide if we can still trust the model. If we detect drift, model retraining is often the ideal action. We want this check to be robust to outliers and to react only to meaningful changes. (2/4)
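A minimal sketch of such a "global" check: a two-sample statistical test per feature, e.g. the Kolmogorov-Smirnov test, with an illustrative threshold:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
reference = rng.normal(loc=0.0, scale=1.0, size=1000)  # training-time distribution
current = rng.normal(loc=0.5, scale=1.0, size=1000)    # shifted production data

# Compare whole distributions; a single outlier barely moves this test
statistic, p_value = stats.ks_2samp(reference, current)
drift_detected = p_value < 0.05  # illustrative significance threshold
print(statistic, p_value, drift_detected)
```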
We've been experimenting with a new blog format: a summary write-up of an applied ML question we often get. Here is the first one. What is the difference between data drift and outlier detection for production ML models? (1/4)