One thing we shipped last year that I'm looking forward to iterating on further later this year: TensorFlow Cloud. Train your TF/Keras model in a distributed way on GCP by adding just one line to your local script / Colab notebook / Kaggle notebook: https://www.tensorflow.org/guide/keras/training_keras_models_on_cloud
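For context, a minimal sketch of what that one line looks like, assuming the tensorflow-cloud package is installed and a GCP project is configured; the model and data below are just placeholders:

```python
import tensorflow as tf
import tensorflow_cloud as tfc

# The one added line: run locally, this call packages the rest of the
# script, submits it as a distributed training job on GCP, then exits.
tfc.run()

# Placeholder model and data; everything below runs unchanged on GCP.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(784,)),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0
model.fit(x_train, y_train, epochs=5, batch_size=128)
```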
-
Tekton / Kubeflow pipelines with Feast, Katib, and Horovod, deploying to TF Serving (also using other TFX components for quality control). We've also recently used the built-in cluster API and tf.distribute (available since TF 2.x) instead of Horovod for some pipelines. Works across various cloud providers.
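Roughly what the tf.distribute swap looks like; a hedged sketch using MultiWorkerMirroredStrategy with a placeholder model and dataset. In a real Kubeflow/Tekton run each worker gets its topology from the TF_CONFIG environment variable (injected e.g. by the TFJob operator); without TF_CONFIG this falls back to single-worker, so the sketch stays runnable as-is:

```python
import tensorflow as tf

# Reads worker topology from TF_CONFIG if present; single-worker otherwise.
strategy = tf.distribute.MultiWorkerMirroredStrategy()

with strategy.scope():
    # Variables created in this scope are mirrored across workers, and
    # gradients are all-reduced automatically: the work that Horovod's
    # DistributedOptimizer wrapper did explicitly.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(32,)),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")

# Placeholder data; a real pipeline would read features served by Feast.
dataset = tf.data.Dataset.from_tensor_slices(
    (tf.random.normal([1024, 32]), tf.random.normal([1024, 1]))
).batch(64)

model.fit(dataset, epochs=3)
```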
-
Alternatively, we use federated learning in many cases, offloading some of the training to edge devices. Quality control is significantly harder with that pattern, and it doesn't accomplish exactly the same things as traditional distributed training.
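To make the pattern concrete, a minimal FedAvg-style sketch in plain TF/Keras; all names and data here are hypothetical placeholders, and it only illustrates the weight-averaging idea, not a production federated stack:

```python
import numpy as np
import tensorflow as tf


def make_model():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(32, activation="relu", input_shape=(16,)),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="sgd", loss="mse")
    return model


# Hypothetical per-device shards; in a real deployment this data never
# leaves the device, which is why central quality control is harder.
client_data = [
    (np.random.randn(200, 16).astype("float32"),
     np.random.randn(200, 1).astype("float32"))
    for _ in range(3)
]

global_model = make_model()

for round_num in range(5):
    client_weights = []
    for x, y in client_data:
        local = make_model()
        local.set_weights(global_model.get_weights())  # push global model
        local.fit(x, y, epochs=1, batch_size=32, verbose=0)  # local step
        client_weights.append(local.get_weights())
    # Server step: average weights layer by layer (unweighted FedAvg).
    averaged = [np.mean(layer, axis=0) for layer in zip(*client_weights)]
    global_model.set_weights(averaged)
```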