Model:
( An important point: give your placeholder a specific number for the batch size. If the placeholder that carries the batch dimension doesn't have that information, the cost estimation and model report below will be wrong, because Grappler falls back to a batch size of 1. )
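To make the point concrete, here is the difference between the two placeholder forms (the names x_unknown and x_fixed are just for illustration; the model below uses the fixed form):

# Batch dimension left unknown: Grappler can't infer it and assumes 1,
# so every per-op cost below would be estimated for a single sample.
x_unknown = tf.placeholder(tf.float32, [None, 784])

# Batch dimension fixed: shapes propagate through the whole graph and
# the reports reflect the real workload.
x_fixed = tf.placeholder(tf.float32, [100, 784])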
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
from tensorflow.python.client import timeline
from tensorflow.python.grappler import item as gitem
from tensorflow.python.grappler import cluster as gcluster
from tensorflow.core.framework import attr_value_pb2

# Common imports
import numpy as np
import os

# Create the model
batch_size = 100
x = tf.placeholder(tf.float32, [batch_size, 784])
w = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.matmul(x, w) + b

# Define loss and optimizer.
# The minimize() function builds the backward-propagation graph.
y_ = tf.placeholder(tf.int64, [batch_size])
cross_entropy = tf.losses.sparse_softmax_cross_entropy(labels=y_, logits=y)
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

init = tf.global_variables_initializer()
Then, this part generates the cost report for my simple model.
Generate Cost Report:
from tensorflow.python.grappler import cluster as gcluster
from tensorflow.python.grappler import item as gitem
cluster = gcluster.Cluster(disable_detailed_stats=False)
m = tf.train.export_meta_graph(graph=tf.get_default_graph())
cost_report = tf.pywrap_tensorflow.GenerateCostReport(m.SerializeToString(), True, False, cluster.tf_cluster)
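By the way, the second and third positional arguments here are, as far as I can tell from the TF source, per_node_report and verbose. The cost_analyzer module wraps the same pywrap entry point with keyword defaults; a minimal sketch, assuming your TF 1.x build ships tensorflow/python/grappler/cost_analyzer.py:

from tensorflow.python.grappler import cost_analyzer

# Equivalent to the pywrap call above: per-node report enabled, verbose off.
cost_report = cost_analyzer.GenerateCostReport(m, per_node_report=True, cluster=cluster)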
Result (you can see the cost details by operation):

print(cost_report)

Total time measured in ns (serialized): 119000
Total time measured in ns (actual): 1257333
Total time analytical in ns (upper bound): 0
Total time analytical in ns (lower bound): 0
Overall efficiency (analytical upper/actual): 0
Overall efficiency (analytical lower/actual): 0

Op, Count, Measured time (ns), Time percent, Acc percent, Analytical upper, Analytical lower, Overall eff, Compute eff, Memory eff
VariableV2, 2, 28000, 24%, 24%, 0, 0, 0%, 0%, 0%,
SparseSoftmaxCrossEntropyWithLogits, 1, 23000, 19%, 43%, 0, 0, 0%, 0%, 0%,
MatMul, 2, 17000, 14%, 57%, 0, 0, 0%, 0%, 0%,
ExpandDims, 1, 10000, 8.4%, 66%, 0, 0, 0%, 0%, 0%,
Const, 1, 9000, 7.6%, 73%, 0, 0, 0%, 0%, 0%,
Identity, 2, 9000, 7.6%, 81%, 0, 0, 0%, 0%, 0%,
Sum, 1, 7000, 5.9%, 87%, 0, 0, 0%, 0%, 0%,
ApplyGradientDescent, 2, 6000, 5%, 92%, 0, 0, 0%, 0%, 0%,
NoOp, 1, 4000, 3.4%, 95%, 0, 0, 0%, 0%, 0%,
Add, 1, 3000, 2.5%, 97%, 0, 0, 0%, 0%, 0%,
Mul, 1, 3000, 2.5%, 1e+02%, 0, 0, 0%, 0%, 0%,

Below is the per-node report summary:

Op, Measured time (ns), Compute time (ns), Memory time (ns), Compute eff, Memory eff, Inputs
VariableV2, 22000, 0, 0, -inf%, -inf%, []
Identity, 5000, 0, 0, -inf%, -inf%, [(784, 10)]
VariableV2, 6000, 0, 0, -inf%, -inf%, []
Identity, 4000, 0, 0, -inf%, -inf%, [(10)]
MatMul, 11000, 0, 0, -inf%, -inf%, [(784, 10)]
Add, 3000, 0, 0, -inf%, -inf%, [(100, 10), (10)]
SparseSoftmaxCrossEntropyWithLogits, 23000, 0, 0, -inf%, -inf%, [(100, 10), ]
ExpandDims, 10000, 0, 0, -inf%, -inf%, []
Mul, 3000, 0, 0, -inf%, -inf%, [(100, 1), ]
Sum, 7000, 0, 0, -inf%, -inf%, [(100, 10), ]
MatMul, 6000, 0, 0, -inf%, -inf%, []
Const, 9000, 0, 0, -inf%, -inf%, []
ApplyGradientDescent, 3000, 0, 0, -inf%, -inf%, [(784, 10), ]
ApplyGradientDescent, 3000, 0, 0, -inf%, -inf%, [(10), ]
NoOp, 4000, 0, 0, -inf%, -inf%, []
Generate Model Report:
from tensorflow.python.grappler import cluster as gcluster
from tensorflow.python.grappler import item as gitem
cluster = gcluster.Cluster(disable_detailed_stats=False)
m = tf.train.export_meta_graph(graph=tf.get_default_graph())
model_report = tf.pywrap_tensorflow.GenerateModelReport(m.SerializeToString(), True, False)
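Similarly, the two booleans here appear to be assume_valid_feeds and debug, and cost_analyzer wraps this entry point too (same assumption about the TF 1.x source as above):

from tensorflow.python.grappler import cost_analyzer

# Equivalent to the pywrap call above, with assume_valid_feeds=True, debug=False.
model_report = cost_analyzer.GenerateModelReport(m)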
Result (you can see the operations in detail):
print(model_report)

GradientDescent [NoOp]
GradientDescent/update_Variable_1/ApplyGradientDescent [ApplyGradientDescent] output 0 (float_ref) has shape [10]
gradients/add_grad/tuple/control_dependency_1 [Identity] output 0 (float) has shape [10]
gradients/add_grad/tuple/group_deps [NoOp]
gradients/add_grad/Reshape_1 [Reshape] output 0 (float) has shape [10]
gradients/add_grad/Shape_1 [Const] output 0 (int32) has shape [1]
gradients/add_grad/Sum_1 [Sum] output 0 (float) has shape ?
gradients/add_grad/BroadcastGradientArgs [BroadcastGradientArgs] output 0 (int32) has shape [x6] output 1 (int32) has shape [x7]
gradients/add_grad/Shape [Const] output 0 (int32) has shape [2]
gradients/sparse_softmax_cross_entropy_loss/xentropy/xentropy_grad/mul [Mul] output 0 (float) has shape [100, 10]
gradients/sparse_softmax_cross_entropy_loss/xentropy/xentropy_grad/PreventGradient [PreventGradient] output 0 (float) has shape [100, 10]
sparse_softmax_cross_entropy_loss/xentropy/xentropy [SparseSoftmaxCrossEntropyWithLogits] output 0 (float) has shape [100] output 1 (float) has shape [100, 10]
Placeholder_1 [Placeholder] output 0 (int64) has shape [100]
add [Add] output 0 (float) has shape [100, 10]
Variable_1/read [Identity] output 0 (float) has shape [10]
Variable_1 [VariableV2] output 0 (float_ref) has shape [10]
MatMul [MatMul] output 0 (float) has shape [100, 10]
Variable/read [Identity] output 0 (float) has shape [784, 10]
Variable [VariableV2] output 0 (float_ref) has shape [784, 10]
Placeholder [Placeholder] output 0 (float) has shape [100, 784]
gradients/sparse_softmax_cross_entropy_loss/xentropy/xentropy_grad/ExpandDims [ExpandDims] output 0 (float) has shape [100, 1]
gradients/sparse_softmax_cross_entropy_loss/xentropy/xentropy_grad/ExpandDims/dim [Const] output 0 (int32) has shape []
gradients/sparse_softmax_cross_entropy_loss/Mul_grad/tuple/control_dependency [Identity] output 0 (float) has shape [100]
gradients/sparse_softmax_cross_entropy_loss/Mul_grad/tuple/group_deps [NoOp]
gradients/sparse_softmax_cross_entropy_loss/Mul_grad/Reshape_1 [Reshape] output 0 (float) has shape []
gradients/sparse_softmax_cross_entropy_loss/Mul_grad/Shape_1 [Const] output 0 (int32) has shape [0]
gradients/sparse_softmax_cross_entropy_loss/Mul_grad/Sum_1 [Sum] output 0 (float) has shape ?
gradients/sparse_softmax_cross_entropy_loss/Mul_grad/BroadcastGradientArgs [BroadcastGradientArgs] output 0 (int32) has shape [x4] output 1 (int32) has shape [x5]
gradients/sparse_softmax_cross_entropy_loss/Mul_grad/Shape [Const] output 0 (int32) has shape [1]
gradients/sparse_softmax_cross_entropy_loss/Mul_grad/Mul_1 [Mul] output 0 (float) has shape [100]
gradients/sparse_softmax_cross_entropy_loss/Sum_grad/Tile [Tile] output 0 (float) has shape [100]
gradients/sparse_softmax_cross_entropy_loss/Sum_grad/Const [Const] output 0 (int32) has shape [1]
gradients/sparse_softmax_cross_entropy_loss/Sum_grad/Reshape [Reshape] output 0 (float) has shape [1]
gradients/sparse_softmax_cross_entropy_loss/Sum_grad/Reshape/shape [Const] output 0 (int32) has shape [1]
gradients/sparse_softmax_cross_entropy_loss/Sum_1_grad/Tile [Tile] output 0 (float) has shape []
gradients/sparse_softmax_cross_entropy_loss/Sum_1_grad/Const [Const] output 0 (int32) has shape [0]
gradients/sparse_softmax_cross_entropy_loss/Sum_1_grad/Reshape [Reshape] output 0 (float) has shape []
gradients/sparse_softmax_cross_entropy_loss/Sum_1_grad/Reshape/shape [Const] output 0 (int32) has shape [0]
gradients/sparse_softmax_cross_entropy_loss/div_grad/tuple/control_dependency [Identity] output 0 (float) has shape []
gradients/sparse_softmax_cross_entropy_loss/div_grad/tuple/group_deps [NoOp]
gradients/sparse_softmax_cross_entropy_loss/div_grad/Reshape_1 [Reshape] output 0 (float) has shape []
gradients/sparse_softmax_cross_entropy_loss/div_grad/Shape_1 [Const] output 0 (int32) has shape [0]
gradients/sparse_softmax_cross_entropy_loss/div_grad/Sum_1 [Sum] output 0 (float) has shape ?
gradients/sparse_softmax_cross_entropy_loss/div_grad/BroadcastGradientArgs [BroadcastGradientArgs] output 0 (int32) has shape [x2] output 1 (int32) has shape [x3]
gradients/sparse_softmax_cross_entropy_loss/div_grad/Shape [Const] output 0 (int32) has shape [0]
gradients/sparse_softmax_cross_entropy_loss/div_grad/mul [Mul] output 0 (float) has shape []
gradients/sparse_softmax_cross_entropy_loss/div_grad/RealDiv_2 [RealDiv] output 0 (float) has shape []
sparse_softmax_cross_entropy_loss/Select [Select] output 0 (float) has shape []
sparse_softmax_cross_entropy_loss/num_present [Sum] output 0 (float) has shape []
sparse_softmax_cross_entropy_loss/num_present/Const [Const] output 0 (int32) has shape [1]
sparse_softmax_cross_entropy_loss/assert_broadcastable/static_scalar_check_success [NoOp]
sparse_softmax_cross_entropy_loss/num_present/broadcast_weights [Mul] output 0 (float) has shape [100]
sparse_softmax_cross_entropy_loss/num_present/broadcast_weights/ones_like [Fill] output 0 (float) has shape [100]
sparse_softmax_cross_entropy_loss/num_present/broadcast_weights/ones_like/Const [Const] output 0 (float) has shape []
sparse_softmax_cross_entropy_loss/num_present/broadcast_weights/assert_broadcastable/static_scalar_check_success [NoOp]
sparse_softmax_cross_entropy_loss/num_present/broadcast_weights/ones_like/Shape [Const] output 0 (int32) has shape [1]
sparse_softmax_cross_entropy_loss/num_present/Select [Select] output 0 (float) has shape []
sparse_softmax_cross_entropy_loss/num_present/ones_like [Fill] output 0 (float) has shape []
sparse_softmax_cross_entropy_loss/num_present/ones_like/Const [Const] output 0 (float) has shape []
sparse_softmax_cross_entropy_loss/num_present/ones_like/Shape [Const] output 0 (int32) has shape [0]
sparse_softmax_cross_entropy_loss/num_present/zeros_like [Const] output 0 (float) has shape []
sparse_softmax_cross_entropy_loss/num_present/Equal [Equal] output 0 (bool) has shape []
sparse_softmax_cross_entropy_loss/num_present/Equal/y [Const] output 0 (float) has shape []
sparse_softmax_cross_entropy_loss/Const [Const] output 0 (float) has shape []
sparse_softmax_cross_entropy_loss/ones_like [Fill] output 0 (float) has shape []
sparse_softmax_cross_entropy_loss/ones_like/Const [Const] output 0 (float) has shape []
sparse_softmax_cross_entropy_loss/ones_like/Shape [Const] output 0 (int32) has shape [0]
sparse_softmax_cross_entropy_loss/Equal [Equal] output 0 (bool) has shape []
sparse_softmax_cross_entropy_loss/Equal/y [Const] output 0 (float) has shape []
gradients/sparse_softmax_cross_entropy_loss/div_grad/RealDiv_1 [RealDiv] output 0 (float) has shape []
gradients/sparse_softmax_cross_entropy_loss/div_grad/Neg [Neg] output 0 (float) has shape []
sparse_softmax_cross_entropy_loss/Sum_1 [Sum] output 0 (float) has shape []
sparse_softmax_cross_entropy_loss/Const_2 [Const] output 0 (int32) has shape [0]
sparse_softmax_cross_entropy_loss/Sum [Sum] output 0 (float) has shape []
sparse_softmax_cross_entropy_loss/Const_1 [Const] output 0 (int32) has shape [1]
sparse_softmax_cross_entropy_loss/Mul [Mul] output 0 (float) has shape [100]
gradients/sparse_softmax_cross_entropy_loss/value_grad/tuple/control_dependency [Identity] output 0 (float) has shape []
gradients/sparse_softmax_cross_entropy_loss/value_grad/tuple/group_deps [NoOp]
gradients/sparse_softmax_cross_entropy_loss/value_grad/Select_1 [Select] output 0 (float) has shape []
gradients/Fill [Fill] output 0 (float) has shape []
gradients/grad_ys_0 [Const] output 0 (float) has shape []
gradients/Shape [Const] output 0 (int32) has shape [0]
gradients/sparse_softmax_cross_entropy_loss/value_grad/zeros_like [Const] output 0 (float) has shape []
sparse_softmax_cross_entropy_loss/Greater [Greater] output 0 (bool) has shape []
sparse_softmax_cross_entropy_loss/Greater/y [Const] output 0 (float) has shape []
gradients/sparse_softmax_cross_entropy_loss/value_grad/Select [Select] output 0 (float) has shape []
gradients/sparse_softmax_cross_entropy_loss/div_grad/Reshape [Reshape] output 0 (float) has shape []
gradients/sparse_softmax_cross_entropy_loss/div_grad/Sum [Sum] output 0 (float) has shape ?
gradients/sparse_softmax_cross_entropy_loss/div_grad/RealDiv [RealDiv] output 0 (float) has shape []
gradients/sparse_softmax_cross_entropy_loss/Mul_grad/Reshape [Reshape] output 0 (float) has shape [100]
gradients/sparse_softmax_cross_entropy_loss/Mul_grad/Sum [Sum] output 0 (float) has shape ?
gradients/sparse_softmax_cross_entropy_loss/Mul_grad/Mul [Mul] output 0 (float) has shape [100]
gradients/add_grad/Reshape [Reshape] output 0 (float) has shape [100, 10]
gradients/add_grad/Sum [Sum] output 0 (float) has shape ?
GradientDescent/learning_rate [Const] output 0 (float) has shape []
GradientDescent/update_Variable/ApplyGradientDescent [ApplyGradientDescent] output 0 (float_ref) has shape [784, 10]
gradients/MatMul_grad/tuple/control_dependency_1 [Identity] output 0 (float) has shape [784, 10]
gradients/MatMul_grad/tuple/group_deps [NoOp]
gradients/MatMul_grad/MatMul_1 [MatMul] output 0 (float) has shape [784, 10]
gradients/add_grad/tuple/control_dependency [Identity] output 0 (float) has shape [100, 10]
gradients/MatMul_grad/MatMul [MatMul] output 0 (float) has shape [100, 784]
If you really want to know how Grappler generates these reports, here is a hint: take a look at the per-op cost and step-stats information returned by the cluster's MeasureCosts() function:
from tensorflow.core.protobuf import device_properties_pb2
from tensorflow.python.grappler import item

# Wrap the exported MetaGraphDef in a Grappler item and measure its costs
# on the cluster created above.
grappler_item = item.Item(m)
op_perfs, run_time, step_stats = cluster.MeasureCosts(grappler_item)
print("op_perfs...:", op_perfs)
print("run_time...:", run_time)
print("step_stats...:", step_stats)
You can check them yourself; below I just sample from the results.

op_perfs...:
...
...
compute_cost: 32000
node: "MatMul"
memory_time: 32000
op_memory {
output_memory: 40000
}
, op {
op: "Add"
attr {
key: "T"
value {
type: DT_FLOAT
}
}
inputs {
dtype: DT_FLOAT
shape {
dim {
size: 1000
}
dim {
size: 10
}
}
}
inputs {
dtype: DT_FLOAT
shape {
dim {
size: 10
}
}
}
device {
type: "GPU"
vendor: "NVIDIA"
model: "GeForce GTX 1080"
frequency: 1809
num_cores: 20
environment {
key: "architecture"
value: "6.1"
}
environment {
key: "cuda"
value: "9000"
}
environment {
key: "cudnn"
value: "7104"
}
num_registers: 65536
l1_cache_size: 24576
l2_cache_size: 2097152
shared_memory_size_per_multiprocessor: 98304
memory_size: 8504868864
bandwidth: 320320000
}
}
...
...
...
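Incidentally, the device block above is what the cluster detected on my machine. You can also hand Grappler a synthetic device, so it estimates costs for hardware you don't own. A minimal sketch, assuming gcluster.Cluster accepts a devices= list of NamedDevice protos in TF 1.x (the field values below are illustrative):

from tensorflow.core.protobuf import device_properties_pb2
from tensorflow.python.grappler import cluster as gcluster

# Describe a hypothetical GPU; the fields mirror the device block above.
named_device = device_properties_pb2.NamedDevice()
named_device.name = '/GPU:0'
named_device.properties.type = 'GPU'
named_device.properties.frequency = 1809
named_device.properties.num_cores = 20
named_device.properties.environment['architecture'] = '6.1'

virtual_cluster = gcluster.Cluster(devices=[named_device])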
step_stats...:
...
...
...
...
node_stats {
node_name: "MatMul"
op_end_rel_micros: 32
all_end_rel_micros: 32
output {
tensor_description {
dtype: DT_FLOAT
shape {
dim {
size: 1000
}
dim {
size: 10
}
}
allocation_description {
requested_bytes: 40000
allocated_bytes: 40000
}
}
}
timeline_label: "MatMul"
memory_stats {
}
}
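The step_stats returned by MeasureCosts is an ordinary StepStats proto, the same kind a fully-traced session run produces, so the timeline module imported at the top can convert it into a Chrome trace. A short sketch:

from tensorflow.python.client import timeline

# Convert the step stats into Chrome trace format; open the file at chrome://tracing.
trace = timeline.Timeline(step_stats).generate_chrome_trace_format()
with open('grappler_timeline.json', 'w') as f:
    f.write(trace)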