The finance industry has a saying for this: "past performance is no guarantee of future results". Your model scoring X on your test dataset doesn't mean it will perform at level X on the next N situations it encounters in the real world. The future may not be like the past.
So when asking the question, "would you rather use a model that was evaluated as 90% accurate, or a human that was evaluated as 80% accurate", the answer depends on whether your real-world data is typical of what the evaluation process covered. Humans are adaptable; models are not.
If significant uncertainty is involved, go with the human. They may have inferior pattern recognition capabilities (versus models trained on enormous amounts of data), but they understand what they do, they can reason about it, and they can improvise when faced with novelty.
If every possible situation is known and you want to prioritize scalability and cost-reduction, go with the model. Models exist to encode and operationalize human cognition in well-understood situations.
("well understood" meaning either that it can be explicitly described by a programmer, or that you can amass a dataset that densely samples the distribution of possible situations -- which must be static)
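The thread's core claim, that a test-set score only holds while inputs keep coming from the distribution the evaluation sampled, can be illustrated with a minimal sketch (not from the thread; the synthetic two-Gaussian setup and the shift amount are my own assumptions):

```python
# Sketch: a classifier's test accuracy holds on in-distribution data
# but degrades once the input distribution drifts at inference time.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def sample(n, shift=0.0):
    # Two Gaussian classes; `shift` moves both clusters, simulating
    # a world that changed after the evaluation was done.
    x0 = rng.normal(loc=-1.0 + shift, scale=1.0, size=(n, 2))
    x1 = rng.normal(loc=+1.0 + shift, scale=1.0, size=(n, 2))
    X = np.vstack([x0, x1])
    y = np.array([0] * n + [1] * n)
    return X, y

X_train, y_train = sample(1000)
model = LogisticRegression().fit(X_train, y_train)

X_iid, y_iid = sample(1000)            # same distribution as training
X_shift, y_shift = sample(1000, 2.0)   # distribution drifted in "production"

acc_iid = model.score(X_iid, y_iid)      # high: evaluation was representative
acc_shift = model.score(X_shift, y_shift)  # much lower: the future wasn't like the past
```

The same model, the same weights, and a very different "accuracy" — which is the sense in which the sampled distribution must be static for the test-set number to mean anything.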
My previously trained wetware learned that owls sometimes have visible feet from this single new image: https://twitter.com/cassmarketos/status/1229473344480673792?s=20
Wait, are there people who don't know this? Fuzz tests, input constraints, etc. remain as important for production inference systems as for any other piece of software. Over the years I have noticed a lack of basic product competence in a lot of ML engineers.
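A minimal sketch of the kind of input constraint the reply is talking about: reject inputs the model was never evaluated on instead of silently returning a confident-looking prediction. The feature names and bounds here are illustrative assumptions, not anything from the thread:

```python
import math

# Hypothetical ranges the model's evaluation data actually covered.
FEATURE_BOUNDS = {
    "humidity": (0.0, 100.0),    # percent
    "pressure": (870.0, 1085.0), # hPa
}

def validate(features: dict) -> dict:
    """Guard an inference endpoint: fail loudly on inputs outside the
    evaluated range, rather than trusting the model to extrapolate."""
    for name, (lo, hi) in FEATURE_BOUNDS.items():
        if name not in features:
            raise ValueError(f"missing feature: {name}")
        v = features[name]
        if not isinstance(v, (int, float)) or math.isnan(v):
            raise ValueError(f"non-numeric feature: {name}={v!r}")
        if not lo <= v <= hi:
            raise ValueError(f"{name}={v} outside evaluated range [{lo}, {hi}]")
    return features
```

This is ordinary software hygiene; the point of the tweet is that it applies unchanged to model-serving code.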
It's elementary, but many deep learning researchers are not aware of it, because in virtually all of the work they've ever done, the model's score on the test set was the end of the story. Most DL research starts from an existing dataset and stops at a test-set score.
This is so true. I see people saying their DL models are used in production, but there is no way to compare how good those models actually are in production; this is also one reason finance doesn't use deep models in production.
Just finished a capstone project on patient-similarity-driven predictive modeling of clinical data: 22 papers with excellent model performance, but when I looked for a real-world implementation... *crickets*