ICYMI: We published a guest post on the Jupyter blog! 🤩
Ploomber
@ploomber
The fastest way to build data pipelines. Develop iteratively, deploy anywhere.
Join our community! ploomber.io/community
Ploomber’s Tweets
Creating evaluation reports is an important skill for data scientists. With #sklearn_evaluation you'll be able to create them in a snap! 🤌
Learn how to create your own reports:
👉 buff.ly/3jdRpfH
#DataScience #OpenSource #Ploomber
A ROC curve is used to evaluate and compare the performance of a binary classifier system. #sklearn_evaluation contains tools to generate a ROC curve easily!
Generate your ROC curves and evaluate your models easily:
👉 buff.ly/3wx9vwo
#DataScience #ML #Ploomber
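To make the idea concrete, here is a minimal sketch in plain Python of how the points on a ROC curve can be computed. It is not the sklearn_evaluation API; the function names `roc_points` and `auc` are made up for illustration, and it assumes labels are 0/1 with at least one of each.

```python
def roc_points(y_true, scores):
    """Return (false positive rate, true positive rate) pairs, one per distinct score."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    # Sort by descending score: lowering the threshold admits samples in this order
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only once all samples tied at this score are counted
        if i == len(ranked) - 1 or ranked[i + 1][0] != score:
            points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the curve using the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


print(auc(roc_points([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])))  # 0.75
```

A curve that hugs the top-left corner (area close to 1) indicates a better classifier; the diagonal (area 0.5) is what random guessing gives.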
Export your #SQL data into a #df! 🐼
Being able to export your SQL data into a #Pandas DataFrame is an essential skill for data scientists. Learn how to do it using #JupySQL in just a few steps!
👉 buff.ly/3JgIXqT
#DataScience #OpenSource #Ploomber
Try out ploomber-engine today! 🎩
As a #DataScientist, reproducibility is an important part of your work. With #ploomber_engine, you can re-run your notebooks and ensure that their outputs still match.
Get started with this guide:
👉 buff.ly/3DfE9hs
#OpenSource
Your #notebooks with interactive widgets! 📊
#Voilà turns #Jupyter notebooks into standalone web apps, which opens the door for using any notebook as an interactive web-based platform.
Try it now:
👉 buff.ly/3R5FpJV
#DataScience #OpenSource
Rank the #features in your #models! 📊
Feature ranking lets you evaluate single features or pairs of features using a variety of metrics that score them on a [-1, 1] or [0, 1] scale so they can be ranked.
Learn to rank your features:
👉 buff.ly/3WH6gwO
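As an illustration of single-feature ranking, the sketch below scores each feature by its Pearson correlation with the target (a metric on the [-1, 1] scale) and sorts by absolute value. This is an assumption about one possible metric, not the library's implementation; `pearson` and `rank_features` are hypothetical helper names.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation in [-1, 1]; assumes neither input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def rank_features(features, target):
    """Return (name, score) pairs, strongest absolute correlation first."""
    scores = {name: pearson(column, target) for name, column in features.items()}
    return sorted(scores.items(), key=lambda item: -abs(item[1]))


features = {
    "a": [1, 2, 3, 4],
    "b": [1, -1, 1, -1],
    "c": [4, 3, 2, 1.5],
}
ranked = rank_features(features, target=[2, 4, 6, 8])
print([name for name, _ in ranked])  # ['a', 'c', 'b']
```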
Don't miss our collaboration: #JupySQL x Hydra! 🐘
Hydra is an open-source data warehouse built on Postgres. Hydra leverages columnar storage, vectorized execution, and query parallelization.
Read the guide we prepared for you:
👉 buff.ly/3WC9tOr
Welcome to part II of #CausalInference and experimentation with #RDDs in #Python! 🐍
In this post, we illustrate how to build #Bayesian regression discontinuity designs (RDDs) with PyMC. Get ready for this awesome content!
👉 buff.ly/3j7aWi0
#DataScience #Ploomber
Have you tried #ploomber_engine's CLI? ⚙️
With ploomber-engine you can profile a #Jupyter notebook's memory usage. This is one of our newest features: try our CLI for running notebooks and profiling them! ⚡
Get started: buff.ly/3R9tt9L
#DataScience #Ploomber
Don't miss our collaboration: #JupySQL x #MindsDB! 🐻
MindsDB enables you to use your data and make forecasts. It speeds up the ML development process by bringing #MachineLearning into the database.
Read the guide we prepared for you:
👉 buff.ly/3XVkX0p
#DataScience
Meet the tool of the week: #ipydatagrid, an intuitive and highly performant datagrid widget for #Jupyter Notebook and #JupyterLab. 💠
The widget supports all common grid interactions, such as sorting and filtering of tabular data.
Try it out!
👉 buff.ly/3R2RZcI
Profile your notebooks with #ploomber_engine! 🔍
We developed a tool that allows you to find memory leaks and runtime bottlenecks so you can optimize your code.
Read our guide on how to profile the memory usage across cell runs in #Jupyter:
👉 buff.ly/3H1xCbf
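The general idea of per-cell memory profiling can be sketched with Python's standard `tracemalloc` module. This is only an approximation under stated assumptions, not how ploomber-engine works: "cells" are plain source strings, `profile_cells` is a hypothetical name, only memory allocated by Python is traced, and `tracemalloc.reset_peak` requires Python 3.9+.

```python
import tracemalloc


def profile_cells(cells):
    """Run each source string in a shared namespace; return peak traced bytes per cell."""
    namespace = {}
    peaks = []
    tracemalloc.start()
    try:
        for source in cells:
            # Reset the peak so each measurement covers only the current cell's run
            tracemalloc.reset_peak()
            exec(source, namespace)
            _, peak = tracemalloc.get_traced_memory()
            peaks.append(peak)
    finally:
        tracemalloc.stop()
    return peaks


cells = ["x = 1", "big = [0] * 1_000_000"]
for source, peak in zip(cells, profile_cells(cells)):
    print(f"{peak:>12,d} bytes  {source}")
```

Because the peak includes objects still alive from earlier cells, a number that keeps climbing from cell to cell is a hint that something is being held in memory.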
Do you know what #CumulativeGains and #LiftCharts are used for?
Both consist of a curve and a baseline for measuring #model performance. The greater the area between the curve and the baseline, the better the model.
Learn about them in this post:
👉 buff.ly/3XOC0kS
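Here is a rough sketch, in plain Python, of the numbers behind both charts. It assumes 0/1 labels with at least one positive; `cumulative_gains` and `lift` are illustrative names, not functions from any library.

```python
def cumulative_gains(y_true, scores):
    """Share of all positives captured in the top-k scored samples, for k = 1..n."""
    total_pos = sum(y_true)
    ranked = [label for _, label in sorted(zip(scores, y_true), key=lambda p: -p[0])]
    gains, captured = [], 0
    for label in ranked:
        captured += label
        gains.append(captured / total_pos)
    return gains


def lift(gains):
    """Gain divided by the baseline, i.e. the share of samples selected so far."""
    n = len(gains)
    return [gain / ((k + 1) / n) for k, gain in enumerate(gains)]


gains = cumulative_gains([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
print(gains)  # [0.5, 0.5, 1.0, 1.0]
print(lift(gains))
```

The baseline is what random selection would give: a gain equal to the share of samples selected, and a lift of 1.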
#sklearn_evaluation: The easiest way to evaluate your #ML models.
Don't believe us? Try it yourself.
👉 buff.ly/3XnLLqo
Learn about #CausalInference! 🧠
Regression Discontinuity Designs (#RDDs) are a special natural experiment framework for analyzing causal relationships. In this post, we will discuss causal inference in #DataScience via RDDs.
Check it out:
👉 buff.ly/3WoQ1og
To our friends from Google and Microsoft. Our sympathy is with you!
We have a few spots for Software Engineers and Data Scientists. Submit an application here: buff.ly/3kqjZuB
Feel free to retweet for visibility!
#microsoft #google #datascientists #layoffs
All about classification! 👽🖖
We prepared a quickstart guide on #classification using #sklearn_evaluation, which includes: data loading, data cleaning, model fitting, feature evaluation, and model comparison. It includes code samples!
Check it out!
👉 buff.ly/3XAD8IG
An essential skill every #DataScientist should master is how to EFFECTIVELY evaluate #ML models. One way is by using plots, which allow you to better understand your models. 🔝
Learn 5 evaluation plots to take your model performance to the next level: buff.ly/3Wm5xBd
Check out this quick tip to plot the K-S statistic the EASY way! ⚡️
The Kolmogorov–Smirnov statistic quantifies the distance between the empirical distribution functions of two samples.
Learn to plot it easily with #sklearn_evaluation:
👉 buff.ly/3klfEZz
#DataScience
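For reference, the two-sample statistic described above can be computed in a few lines of standard-library Python. This is a minimal sketch, not the sklearn_evaluation plotting API; `ks_statistic` is an illustrative name.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Largest absolute gap between the two empirical distribution functions."""
    a_sorted, b_sorted = sorted(sample_a), sorted(sample_b)

    def ecdf(sorted_sample, x):
        # Fraction of the sample that is <= x
        return bisect_right(sorted_sample, x) / len(sorted_sample)

    # Both functions are step functions, so the gap only changes at observed values
    return max(abs(ecdf(a_sorted, x) - ecdf(b_sorted, x))
               for x in a_sorted + b_sorted)


print(ks_statistic([1, 2, 3, 4], [3, 4, 5, 6]))  # 0.5
```

A value of 0 means the two empirical distributions coincide; a value of 1 means the samples do not overlap at all.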
#JupySQL is a powerful tool to work with #SQL in #Jupyter, allowing you to run queries and plot large datasets EASILY. 🦾
Get started: buff.ly/3iK1dOv
#DataScience #Ploomber #OpenSource
Follow this guide to use JupySQL in your notebooks! ⚡️
#DataScience involves running #SQL queries and accessing connections to data from different sources. With #JupySQL you can write your own queries in #Jupyter over existing #dataframes!
👉 buff.ly/3IWGiSQ
As a #DataScientist, correct documentation in notebooks and files is important! 📝
The #spellchecker is a #JupyterLab extension that highlights misspelled words in markdown cells within notebooks and in text files.
Check it out:
👉 buff.ly/3IVQlrp
#OpenSource
Good #DataScientists know how to measure classifiers!
A lift curve shows how effective a predictive #model is. The greater the area between the lift curve and the baseline, the better the model.
Plot a lift curve easily with #sklearn_evaluation:
👉 buff.ly/3XAeTKx
Can I trust my model’s probabilities? 🤔
A #calibration_curve is a graphical representation of a model’s calibration. It allows us to benchmark our model against a target: a perfectly calibrated model.
Evaluate your calibration with #sklearn_evaluation:
buff.ly/3kjciGz
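A calibration curve can be sketched in plain Python as follows: bin the predicted probabilities, then compare the mean prediction in each bin with the share of positives actually observed. This is an illustrative sketch with equal-width bins, not the sklearn_evaluation API; the function name and the `n_bins` parameter are assumptions.

```python
def calibration_curve(y_true, y_prob, n_bins=5):
    """Return (mean predicted probability, observed positive rate) per non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for label, prob in zip(y_true, y_prob):
        # Equal-width bins on [0, 1]; a probability of exactly 1.0 goes in the last bin
        index = min(int(prob * n_bins), n_bins - 1)
        bins[index].append((prob, label))
    curve = []
    for members in bins:
        if members:
            mean_pred = sum(prob for prob, _ in members) / len(members)
            frac_pos = sum(label for _, label in members) / len(members)
            curve.append((mean_pred, frac_pos))
    return curve


for mean_pred, frac_pos in calibration_curve([0, 0, 1, 1], [0.1, 0.1, 0.9, 0.9], n_bins=2):
    print(f"predicted {mean_pred:.2f}  observed {frac_pos:.2f}")
```

For a perfectly calibrated model the two numbers match in every bin, so the points fall on the diagonal.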
We've prepared a #quickstart guide just for you!
#JupySQL allows you to run #SQL and plot large datasets in Jupyter via the %sql, %%sql, and %sqlplot magics.
Check out this video and learn to use JupySQL!
👉 buff.ly/3QPLf1W
#DataScience #OpenSource
Plot large datasets with #JupySQL! 📊
The plotting module in JupySQL runs computations in the #SQL engine, delegating memory management to the engine so that intermediate computations do not eat up memory, allowing you to efficiently plot MASSIVE datasets.
👉 buff.ly/3QBY1Rt
With #ploomber_engine you can profile a #Jupyter notebook's memory usage, something papermill isn't capable of. 🔥
We're glad to present one of the newest features:
The #CLI for running notebooks, profiling them, and more!
Check it out: buff.ly/3X4gzMw
Run the entire #Ploomber ecosystem (and more) in #Colab using #Conda! 🐍
#k2s allows you to bootstrap #Jupyter kernels and #Python environments. This provides an easy installation of packages such as #JupySQL, #geopandas, and many more.
Try it out:
👉 buff.ly/3QyeOES