I recently attended the Intel AI workshop, where the presenters advised using numactl to improve the performance of deep learning training and inference with Intel Caffe. Here is some related information:
Tuesday, October 2, 2018
Tuesday, September 18, 2018
[XLA Research] How to use XLA AOT compilation in TensorFlow ( Part II )
My previous post, "[XLA Research] How to use XLA AOT compilation in TensorFlow", covers a simple example of using XLA AOT. If you want to see a more complicated example, take a look at this: https://gist.github.com/carlthome/6ae8a570e21069c60708017e3f96c9fd
Monday, September 17, 2018
[TFLMS] Large Model Support in TensorFlow by Graph Rewriting
This post introduces the paper "Large Model Support in TensorFlow by Graph Rewriting". The work was also submitted as a pull request to the TensorFlow repository as a contribution to the community. With TFLMS, the authors report training ResNet-50 and 3DUnet with 4.7x and 2x larger batch sizes, respectively. Quite amazing...
Friday, September 7, 2018
[TensorFlow] Why does the feed's shape matter in TensorFlow Grappler?
Before explaining this, you should understand what shapes and dynamic dimensions are in TensorFlow. The article below explains the concept very well.
https://blog.metaflow.fr/shapes-and-dynamic-dimensions-in-tensorflow-7b1fe79be363
The key idea is:
Tuesday, September 4, 2018
[XLA related] Sorting out my thoughts and notes on XLA
This post may be a little unstructured because it is mainly a set of notes for my own reference.
I recently found several slide decks on SlideShare that contain very good information and source code analysis of XLA, as follows:
Wednesday, August 29, 2018
[TensorFlow] My simple way to profile TensorFlow and dump variables and GPU memory
Sometimes we want to profile a TensorFlow graph to find out which operations take more time and which take less. This can be done with the TensorFlow timeline module like this:
( I omit the model definition to simplify the example code )
import tensorflow as tf
from tensorflow.python.client import timeline
...
# Ask the runtime to record a full trace for every run() call.
run_options = tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE)
run_metadata = tf.RunMetadata()
...
with tf.Session(config=config) as sess:
    init.run()
    for epoch in range(n_epochs):
        for iteration in range(10):
            sess.run(training_op,
                     feed_dict={X: picture, y: picture_label},
                     options=run_options,
                     run_metadata=run_metadata)
            # Convert the collected step stats to Chrome trace format.
            fetched_timeline = timeline.Timeline(run_metadata.step_stats)
            chrome_trace = fetched_timeline.generate_chrome_trace_format()
            # One JSON file per iteration; open it in chrome://tracing.
            with open('timeline_step_%d.json' % iteration, 'w') as f:
                f.write(chrome_trace)
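Once a timeline file has been written, it can also be inspected without the Chrome tracing UI. The following is only a minimal sketch, not part of the original post: it assumes the file follows the Chrome trace event format (a top-level "traceEvents" list whose complete events have "ph" set to "X" and a "dur" in microseconds) and that each event's "name" identifies the operation. The helper name summarize_trace and the sample data are hypothetical.

```python
import json
from collections import defaultdict


def summarize_trace(trace_json, top_n=5):
    """Return up to top_n (name, total_microseconds) pairs, slowest first."""
    events = json.loads(trace_json).get("traceEvents", [])
    totals = defaultdict(int)
    for event in events:
        # "X" marks a complete event, which carries a duration ("dur").
        # Metadata events ("M") and others are skipped.
        if event.get("ph") == "X":
            totals[event.get("name", "unknown")] += event.get("dur", 0)
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:top_n]


# Hypothetical sample standing in for the contents of a timeline_step_N.json file.
sample_trace = json.dumps({
    "traceEvents": [
        {"ph": "M", "name": "process_name", "pid": 0,
         "args": {"name": "/device:GPU:0"}},
        {"ph": "X", "name": "Conv2D", "pid": 0, "tid": 0, "ts": 0, "dur": 120},
        {"ph": "X", "name": "MatMul", "pid": 0, "tid": 0, "ts": 130, "dur": 40},
        {"ph": "X", "name": "Conv2D", "pid": 0, "tid": 0, "ts": 180, "dur": 100},
    ]
})

for name, total_us in summarize_trace(sample_trace):
    print("%-10s %6d us" % (name, total_us))
```

To use it on a real file, read the file's text and pass it to summarize_trace in place of sample_trace. Summing durations per name is a deliberately rough measure, since overlapping events on different devices are simply added together.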
Tuesday, August 21, 2018
[ONNX] Train in Tensorflow and export to ONNX (Part II)
If you read the previous post linked below, you may ask: if the input TF graph for freezing is not in binary format, what do we do?
http://danny270degree.blogspot.com/2018/08/onnx-train-in-tensorflow-and-export-to.html
Let us recall the previous example below. The file "graph.proto" is the binary protobuf file for the TensorFlow graph, generated by the following code:
with open("graph.proto", "wb") as file:
    graph = tf.get_default_graph().as_graph_def(add_shapes=True)
    file.write(graph.SerializeToString())
Friday, August 17, 2018
[TensorFlow] Rewriter_Config and Memory Optimization Passes
In the previous post linked below, I mentioned that the default value of rewrite_config seems to have changed a little.
https://danny270degree.blogspot.com/2018/06/tensorflow-compare-memory-options-in.html
To clear up my doubt, I checked TensorFlow's memory_optimizer.cc and put together the mapping table: