But there are several key advantages of the Functional approach over the subclassing approach: 1. Your model has known input shapes. 2. You get access to the internal connectivity graph. 3. The model is a data structure, not a piece of bytecode. Let's see what these are about.
1. Because the model has known input shapes, it's capable of running input validation checks, for easy debugging: pic.twitter.com/1B8E7GXmK1
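A minimal sketch of this kind of check (the layer sizes here are my own, not from the screenshot): a Functional model declared with a 4-feature input rejects data whose feature axis doesn't match.

```python
import numpy as np
from tensorflow import keras

# Functional model with a declared input shape of (batch_size, 4)
inputs = keras.Input(shape=(4,))
outputs = keras.layers.Dense(2)(inputs)
model = keras.Model(inputs, outputs)

# Feeding data whose feature axis doesn't match the declared shape
# fails immediately with a descriptive error, instead of producing
# a confusing failure deep inside the forward pass.
try:
    model(np.zeros((8, 3)))
except Exception as err:
    print("validation error:", err)
```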
Further, it's even capable of standardizing inputs to what it expects: if you pass data of shape (batch_size,) to a model that expects (batch_size, 1), it will just reshape it. Likewise for dtype conversion (e.g. float64 will get converted to float32).
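For instance, here is the dtype side of that standardization, sketched with illustrative shapes of my own choosing: float64 data passed to a model built with float32 layers is cast automatically before the forward pass.

```python
import numpy as np
from tensorflow import keras

inputs = keras.Input(shape=(1,))
outputs = keras.layers.Dense(1)(inputs)
model = keras.Model(inputs, outputs)

# float64 input is cast to the model's float32 compute dtype
# automatically; no manual conversion needed.
x64 = np.zeros((4, 1), dtype="float64")
y = model(x64)
print(y.dtype)
```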
2. You get access to the internal connectivity graph. This means you can plot the model, for instance. This is great for debugging. Like this: pic.twitter.com/ZnG6ym9yei
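A small sketch (architecture invented for illustration): `model.summary()` prints a text view of the graph, and `keras.utils.plot_model` renders the same graph as an image, as in the screenshot (the latter requires pydot and graphviz to be installed).

```python
from tensorflow import keras

inputs = keras.Input(shape=(28, 28, 1))
x = keras.layers.Conv2D(32, 3, activation="relu")(inputs)
x = keras.layers.Flatten()(x)
outputs = keras.layers.Dense(10)(x)
model = keras.Model(inputs, outputs)

# Text view of the layer graph; for an image, use:
# keras.utils.plot_model(model, "model.png", show_shapes=True)
model.summary()
```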
Having access to internal nodes also means you can access an intermediate layer output and leverage it in a new model. This is a killer feature for feature extraction, fine-tuning, and ensembling. Let's add an extra output to the model above: pic.twitter.com/gCxafm21UF
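The pattern looks like this (a sketch with made-up layer names and sizes, not the model from the screenshot): build a second `keras.Model` that reuses the original graph and exposes an intermediate layer's output alongside the final one, with no retraining or weight copying involved.

```python
import numpy as np
from tensorflow import keras

inputs = keras.Input(shape=(784,))
features = keras.layers.Dense(64, activation="relu", name="features")(inputs)
outputs = keras.layers.Dense(10, name="predictions")(features)
model = keras.Model(inputs, outputs)

# A new model over the same graph, with an extra intermediate output --
# useful for feature extraction, fine-tuning, and ensembling.
extractor = keras.Model(
    inputs=model.input,
    outputs=[model.get_layer("features").output, model.output],
)

feats, preds = extractor(np.zeros((2, 784)))
print(feats.shape, preds.shape)
```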
3. The model is a data structure, not a piece of bytecode. This means it can be cleanly serialized and deserialized -- even across platforms. keras.Model.from_config(functional_model.get_config()) reconstructs the exact same model as the original.
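The round trip in full, on a small model of my own invention: the architecture serializes to a plain config dict and back, with no bytecode involved.

```python
from tensorflow import keras

inputs = keras.Input(shape=(8,))
outputs = keras.layers.Dense(3)(inputs)
functional_model = keras.Model(inputs, outputs)

# The architecture round-trips through a plain data structure.
config = functional_model.get_config()
clone = keras.Model.from_config(config)
```

The clone has the same graph of layers as the original (weights are saved separately, e.g. via `get_weights`/`set_weights`).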
If your model is a Python subclass, to serialize it you could either: a. Pickle the bytecode -- which is completely unsafe, won't work for production, and won't work across platforms
b. Save it as a SavedModel -- which is a form of one-way export (of the TF graph) and won't let you reconstruct the exact same Python object. A graph of layers is a data structure; defining and saving it as a data structure is the intuitive thing to do.
Many runtimes other than Python TensorFlow understand the Keras graph-of-layers format, such as TF.js, CoreML, DeepLearning4J... A high-level, human-readable saving format is much easier to implement for third-party platforms.
A last advantage of the Functional API I haven't listed here is that it is much less verbose, because it is less redundant (no need to list/name each layer twice). Consider this subclassed VAE vs. an equivalent Functional model... pic.twitter.com/hkxVE8eXlZ
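A toy illustration of the redundancy (a tiny MLP of my own, not the VAE from the screenshot): the subclass names every layer twice, once in `__init__` and once in `call()`, while the Functional version states each layer exactly once.

```python
from tensorflow import keras

# Subclassing: each layer appears twice.
class MLP(keras.Model):
    def __init__(self):
        super().__init__()
        self.dense1 = keras.layers.Dense(64, activation="relu")
        self.dense2 = keras.layers.Dense(10)

    def call(self, x):
        return self.dense2(self.dense1(x))

# Functional: each layer appears once.
inputs = keras.Input(shape=(32,))
x = keras.layers.Dense(64, activation="relu")(inputs)
outputs = keras.layers.Dense(10)(x)
functional = keras.Model(inputs, outputs)
```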
Note that you don't have to inline your Functional model definitions all the time -- complex models should be broken down into stateless functions (one function per architectural block). Here's an example of a Transformer for timeseries classification. pic.twitter.com/gBi4mO2FyT
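The pattern, sketched on a much simpler architecture than the Transformer in the screenshot: each block is a plain function that takes a tensor and returns a tensor, and the model is assembled by composing these functions.

```python
from tensorflow import keras

def mlp_block(x, units):
    # One architectural block as a stateless function:
    # takes a tensor in, returns a tensor out.
    x = keras.layers.Dense(units, activation="relu")(x)
    return keras.layers.Dropout(0.1)(x)

inputs = keras.Input(shape=(16,))
x = mlp_block(inputs, 64)
x = mlp_block(x, 32)
outputs = keras.layers.Dense(1, activation="sigmoid")(x)
model = keras.Model(inputs, outputs)
```

Each call to `mlp_block` creates fresh layers, so the functions stay stateless while the resulting model is still a single Functional graph.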
That's it for this tweetorial. Feel free to chime in with your own takes on pros and cons of the Functional and subclassing approaches!