Reference: https://blog.metaflow.fr/shapes-and-dynamic-dimensions-in-tensorflow-7b1fe79be363
The key idea is:
- The static shape is a tuple or a list of dimensions (possibly containing None), known at graph-construction time.
- The dynamic shape is itself a tensor that describes the shape of the original tensor and is only known at run time.
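In TF 1.x terms, a minimal sketch of the difference:

import numpy as np
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 224, 224, 3], name='input')
print(x.get_shape().as_list())  # static shape: [None, 224, 224, 3], known when the graph is built
dyn_shape = tf.shape(x)         # dynamic shape: an int32 tensor, only evaluated at run time
with tf.Session() as sess:
    print(sess.run(dyn_shape, {x: np.zeros((8, 224, 224, 3), np.float32)}))  # [8 224 224 3]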
Then, add some debug logging to the function MetaOptimizer::OptimizeGraph() in the following file:
tensorflow/core/grappler/optimizers/meta_optimizer.cc
Status MetaOptimizer::OptimizeGraph(Cluster* cluster, const GrapplerItem& item,
                                    GraphDef* optimized_graph) {
  // Dump the shape of every feed tensor exactly as Grappler sees it.
  for (const auto& f : item.feed) {
    const auto& shape = f.second.shape();
    VLOG(2) << "...[DEBUG] in OptimizeGraph(), input name:" << f.first
            << ", shape=" << shape.DebugString();
  }
  // ... rest of OptimizeGraph() unchanged ...
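These VLOG(2) messages only appear after rebuilding TensorFlow from source and raising the C++ log verbosity. One way to do that from the Python side (TF_CPP_MIN_VLOG_LEVEL must be set before TensorFlow is imported):

import os
os.environ['TF_CPP_MIN_VLOG_LEVEL'] = '2'  # enable VLOG(2) and below in the C++ runtime
import tensorflow as tf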
After that, define the input shapes of your model for two cases:
Case 1: Give a static shape with "None" for the batch size.
Case 2: Give a static shape with a specific value for the batch size.
[tf.placeholder(self.image_dtype, [self.batch_size, self.image_shape, self.image_shape, 3], 'input'),
 tf.placeholder(tf.int32, [self.batch_size], 'label')]
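The only difference between the two cases is the value of self.batch_size. As a sketch (the value 64 for Case 2 is taken from the logs below, and self.image_shape is assumed to be 224 for ResNet-50):

# Case 1: the batch dimension is unknown at graph-construction time
self.batch_size = None
# Case 2: every dimension of the feed is fully static
self.batch_size = 64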
For Case 1, the debug information will be like this (note that the unknown batch dimension shows up as 1):
tensorflow/core/grappler/optimizers/meta_optimizer.cc:202] ...[DEBUG] in OptimizeGraph(), input name:input, shape=[1,224,224,3]
tensorflow/core/grappler/optimizers/meta_optimizer.cc:202] ...[DEBUG] in OptimizeGraph(), input name:label, shape=[1]
For Case 2, the debug information will be like this:
tensorflow/core/grappler/optimizers/meta_optimizer.cc:202] ...[DEBUG] in OptimizeGraph(), input name:input, shape=[64,224,224,3]
tensorflow/core/grappler/optimizers/meta_optimizer.cc:202] ...[DEBUG] in OptimizeGraph(), input name:label, shape=[64]
So, you may think this is no big deal. But if you take a look at Grappler's cost analysis or GetPeakMemoryUsage(), you will find that only when the feed shape carries a specific batch-size value are the cost calculation and memory analysis reasonable and acceptable.
For instance, the peak memory usage of the ResNet-50 model comes out as follows (converting bytes at 1024^3 per GB):
Case 1: peak_mem_usage: 425138936 ==> ~0.395 GB
Case 2: peak_mem_usage: 6979111176 ==> ~6.5 GB
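For reference, this kind of estimate can be queried from Python through TF 1.x's grappler bindings. Note this is a non-public API; the module paths and the return format of DeterminePeakMemoryUsage below are assumptions that may vary between versions, and the toy graph merely stands in for a real model:

import tensorflow as tf
from tensorflow.python.grappler import cluster as gcluster
from tensorflow.python.grappler import item as gitem

g = tf.Graph()
with g.as_default():
    # Case 2 style feed: fully static shape, batch size 64.
    images = tf.placeholder(tf.float32, [64, 224, 224, 3], name='input')
    logits = tf.layers.dense(tf.layers.flatten(images), 1000)
    loss = tf.reduce_sum(logits)
    # Grappler picks up its fetch nodes from the 'train_op' collection.
    tf.add_to_collection(tf.GraphKeys.TRAIN_OP, loss)

metagraph = tf.train.export_meta_graph(graph=g)
grappler_item = gitem.Item(metagraph)
# Per-device peak memory estimate, derived from the static feed shapes.
peak_usage = gcluster.Cluster().DeterminePeakMemoryUsage(grappler_item)
print(peak_usage)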
So, what is the consequence?
Ans: For instance, "SwappingPass" in memory_optimizer.cc will not work properly: it decides what to swap out by comparing the estimated peak memory usage against the device's memory capacity, so when the batch size is unknown and the estimate comes out far too low, swapping never kicks in even though the real run would need it.
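For context, the swapping pass is the part of Grappler's memory optimizer selected by the SWAPPING_HEURISTICS setting. A minimal sketch of turning it on through the session config (TF 1.x); whether it actually swaps anything then depends on the peak-memory estimate discussed above:

import tensorflow as tf
from tensorflow.core.protobuf import rewriter_config_pb2

rewrite_options = rewriter_config_pb2.RewriterConfig(
    memory_optimization=rewriter_config_pb2.RewriterConfig.SWAPPING_HEURISTICS)
config = tf.ConfigProto(
    graph_options=tf.GraphOptions(rewrite_options=rewrite_options))
sess = tf.Session(config=config)  # Grappler's memory optimizer now runs when the graph is optimized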
P.S.:
Here is an article that mentions the static input batch size:
How to train Keras model x20 times faster with TPU for free
https://www.dlology.com/blog/how-to-train-keras-model-x20-times-faster-with-tpu-for-free/
"Input pipelines running on CPU and GPU are mostly free from the static shape requirement, while in the XLA/TPU environment, static shapes and batch size is imposed."
It also provides a full example:
https://colab.research.google.com/drive/1QZf1WeX3EQqBLeFeT4utFKBqq-ogG1FN