Evidently AI
@EvidentlyAI
Open source tools to evaluate 📊, test 🚦, and monitor 📈 ML models. GitHub: github.com/evidentlyai/ev Community: discord.gg/xZjKRaNp8b
evidentlyai.com
Joined February 2020

Evidently AI’s Tweets

Pinned Tweet
🔥 A new blog! An introductory overview of the different facets of ML system monitoring. When one mentions “ML monitoring,” this can mean many things. Are you tracking service latency? Model accuracy? Data quality? We made a comprehensive overview:
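For a concrete flavor of the data-facing side, here is a minimal sketch using the open-source Evidently Python library (the Report API as it looked in early 2023; the CSV paths are placeholders):

import pandas as pd
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset

# reference = data the model was trained on; current = recent production data
reference = pd.read_csv("reference.csv")
current = pd.read_csv("current.csv")

# Compare the two datasets and save an HTML drift report for review
report = Report(metrics=[DataDriftPreset()])
report.run(reference_data=reference, current_data=current)
report.save_html("data_drift_report.html")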
📌 In case you missed it
Which metrics to track when monitoring ML systems in production? We suggested a framework that consists of 4 components:
🏥 Software health
✅ Data quality
📈 ML model quality
💰 Business KPI
More on ML monitoring facets 👇
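As a toy illustration of the framework (thresholds and names below are invented, not an Evidently API), each component can be reduced to one pass/fail check:

def monitoring_status(latency_p95_ms, share_missing, accuracy, conversion_rate):
    # Toy example: one boolean check per monitoring component
    return {
        "software_health": latency_p95_ms < 200,  # service answers in time
        "data_quality": share_missing < 0.01,     # few missing values
        "ml_model_quality": accuracy > 0.85,      # offline metric above floor
        "business_kpi": conversion_rate > 0.02,   # downstream impact holds
    }

print(monitoring_status(150, 0.003, 0.91, 0.025))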
We're very much looking forward to the talk! There are still free places available; just remember to sign up through the meetup link below...
Quote Tweet
Anyone in Edinburgh next week? Our CTO, @EmeliDral, will be talking about data drift at @PyDataEdinburgh on February 2nd. She'll discuss drift detection methods and how to incorporate them into the production ML workflow 🌊 Stop by if you're nearby 👇 meetup.com/PyData-Edinbur
5๏ธโƒฃ Take costs into account when deciding how often to retrain. Model retraining does not come for free ๐Ÿ’ต Compare associated system and labor costs with the expected benefits of retraining a model. More tips by the Nubank team ๐Ÿ‘‡ building.nubank.com.br/automatic-retr 6/6 ๐Ÿงต
4๏ธโƒฃ Do unit-testing (sort of) for ML models. How can we guarantee that a newly retrained model is better for every single example scored? We cannot ๐Ÿคทโ€โ™€๏ธ But we can run synthetic examples to verify that retrained model gets the expected outcome. 5/6
3๏ธโƒฃ Use fair comparison benchmarks. After retraining a model, compare it against the current model in production using: - Feature importances - Out-of-sample performance - Stability of predictions, etc. Whatever the metric, calculate on an unbiased holdout dataset ๐Ÿ‘ฉโ€๐Ÿ”ฌ 4/6
2๏ธโƒฃ Donโ€™t forget about other teams using the model. Whenever you retrain a model, you may need to adjust how its predictions are used. One way to make adjustments easier is to use calibrated probability predictions โš–๏ธ 3/6
1๏ธโƒฃ There are prerequisites for automatic retraining. It will be impossible without these 3 MLOps components: โœ… CI/CD pipeline for testing and deployment ๐Ÿ“ˆ ML monitoring to check the current model state ๐Ÿ”ข Solid data pipeline to generate train/test datasets 2/6
Sooner or later, every ML model decays. The world changes, and its static picture (aka the train set) becomes obsolete. Often the easiest way to update a model is to retrain it on newer data. Nubank retrains ML models automatically. Here are 5 tips from the Nubank team: 1/6 🧵
In case you missed it, we have an AMA session with Neal tomorrow! Join us to chat about:
🔥 Building high-performance ML teams
🚀 Scaling MLOps
✅ Moving from one model to many
🛠 Monzo’s ML stack
📦 modelstore
Register here 👇 lu.ma/k6en6l67
Anyone in Edinburgh next week? Our CTO, @EmeliDral, will be talking about data drift at @PyDataEdinburgh on February 2nd. She'll discuss drift detection methods and how to incorporate them into the production ML workflow 🌊 Stop by if you're nearby 👇 meetup.com/PyData-Edinbur
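One classic family of drift detection methods compares feature distributions with a two-sample statistical test; here is a sketch using SciPy's Kolmogorov-Smirnov test on made-up data (illustrative only, not tied to the talk's content):

import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(42)
reference = rng.normal(0.0, 1.0, size=1_000)  # training-time feature values
current = rng.normal(0.5, 1.0, size=1_000)    # shifted production values

stat, p_value = ks_2samp(reference, current)
print(f"KS statistic={stat:.3f}, p-value={p_value:.4f}")
# A tiny p-value is evidence that the feature's distribution has drifted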
💬 We’re kicking off the 2023 AMA season! Our new guest is Neal, Staff ML Engineer and former Associate Director of ML at Monzo. Join us to chat about building high-performance ML teams, scaling MLOps, and more! Register here 👇 lu.ma/k6en6l67
The last (but not least) AMA in 2022 was with Ben Wilson from Databricks 🤖 “I don’t care about the tools. You shouldn’t either.” Ben on approaching any ML project, AutoML, deploying models into the wild, and how to learn about ML deployment 👇 evidentlyai.com/blog/ama-ben-w 10/11
We chatted with Matt Squire on MLOps pain points. ❌ Here are 3 of those:
- Tools are heavy-weight and laborious to run
- Monitoring is hard
- Great tools are no good if nobody can use them
More discussions in the recap blog 👇 evidentlyai.com/blog/ama-matt- 4/11