Here I want to introduce two ways to traverse every node's inputs and outputs in the graph using C++ in Grappler.
P.S.: you need to be able to obtain the GrapplerItem and GraphDef objects in your code.
First, check my example node's name in TensorBoard:
conv1/Conv2D
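The post itself walks the graph with Grappler's C++ APIs; as a rough illustration of the same idea in Python (the file name graph.pb is only a placeholder for this sketch), you can walk a GraphDef's nodes and collect each node's inputs and consumers like this:

import tensorflow as tf

# Load a serialized GraphDef (the file name here is just a placeholder).
with tf.gfile.GFile("graph.pb", "rb") as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())

# For each node, record which other nodes consume its output.
consumers = {}
for node in graph_def.node:
    for inp in node.input:
        # An input may look like "name", "name:1", or "^name" (control dep).
        producer = inp.lstrip('^').split(':')[0]
        consumers.setdefault(producer, []).append(node.name)

for node in graph_def.node:
    print(node.name, '| inputs:', list(node.input),
          '| feeds into:', consumers.get(node.name, []))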
Tuesday, October 2, 2018
[NUMACTL] How to use numactl in practice?
I recently attended the Intel AI workshop, where they advised using numactl to improve the performance of training and inference in deep learning with Intel Caffe. Here I post some related information as follows:
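For example, a minimal sketch of pinning a training job's threads and memory allocations to a single NUMA node (node 0 and the script name train.py are only illustrations; check your machine's topology first):
$ numactl --hardware
$ numactl --cpunodebind=0 --membind=0 python train.py
Binding both CPU and memory to the same node avoids remote-memory accesses across the inter-socket links, which is where the speed-up comes from.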
Tuesday, September 18, 2018
[XLA Study] How to use XLA AOT compilation in TensorFlow (Part II)
My previous post, [XLA Study] How to use XLA AOT compilation in TensorFlow, covered a simple example of using XLA AOT. If you want to see a more complicated example, please take a look at this: https://gist.github.com/carlthome/6ae8a570e21069c60708017e3f96c9fd
Monday, September 17, 2018
[TFLMS] Large Model Support in TensorFlow by Graph Rewriting
This post introduces the paper "Large Model Support in TensorFlow by Graph Rewriting", which was published as a pull request in the TensorFlow repository as a contribution to the TensorFlow community. According to the paper, TFLMS made it possible to train ResNet-50 and 3DUnet with 4.7x and 2x larger batch sizes, respectively. Quite amazing...
Friday, September 7, 2018
[TensorFlow] Why does the feed's shape matter in TensorFlow Grappler?
Before explaining this, you should understand what shapes and dynamic dimensions are in TensorFlow. The article below explains the concept very well.
https://blog.metaflow.fr/shapes-and-dynamic-dimensions-in-tensorflow-7b1fe79be363
The key idea is: a tensor in TensorFlow has both a static shape, which is known at graph-construction time and may be only partially defined (e.g. a None batch dimension), and a dynamic shape, which is the true shape known only at run time.
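As a minimal TF 1.x illustration of the difference (the shapes here are arbitrary):

import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[None, 784])
print(x.get_shape())   # static shape (?, 784): first dim unknown at graph time
dynamic = tf.shape(x)  # dynamic shape: a tensor whose value is known only at run time

with tf.Session() as sess:
    print(sess.run(dynamic, feed_dict={x: [[0.0] * 784]}))  # [  1 784]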
Tuesday, September 4, 2018
[XLA related] Sorting out my thoughts and notes about XLA
This post may be a little unstructured because it serves as my reference notes.
I recently found several slide decks on SlideShare that contain very good information and source-code analysis/study of XLA, as follows:
Wednesday, August 29, 2018
[TensorFlow] My simple way to profile TensorFlow and dump variables and GPU memory
If we want to profile a TensorFlow graph and find out which operations take more time and which take less, this can be done with the TensorFlow timeline module like this:
(I omit the model-definition part to simplify my example code.)
from tensorflow.python.client import timeline
...
run_options = tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE)
run_metadata = tf.RunMetadata()
...
with tf.Session(config=config) as sess:
    init.run()
    for epoch in range(n_epochs):
        for iteration in range(10):
            sess.run(training_op, feed_dict={X: picture, y: picture_label},
                     options=run_options, run_metadata=run_metadata)
            # Dump a Chrome-trace JSON file for this step.
            fetched_timeline = timeline.Timeline(run_metadata.step_stats)
            chrome_trace = fetched_timeline.generate_chrome_trace_format()
            with open('timeline_step_%d.json' % iteration, 'w') as f:
                f.write(chrome_trace)
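Each generated timeline_step_*.json file can then be loaded in Chrome's chrome://tracing page to inspect per-operation timings on a timeline.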
Tuesday, August 21, 2018
[ONNX] Train in Tensorflow and export to ONNX (Part II)
If you read the previous post linked below, you may ask a question: if the input TF graph for freezing is not in binary format, what do we do?
http://danny270degree.blogspot.com/2018/08/onnx-train-in-tensorflow-and-export-to.html
Let us recall the previous example below. The file "graph.proto" is the binary-format protobuf file for the TensorFlow graph, generated by the following code:
with open("graph.proto", "wb") as file:
graph = tf.get_default_graph().as_graph_def(add_shapes=True)
file.write(graph.SerializeToString())
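If instead the graph was saved in text format (for example with tf.train.write_graph(..., as_text=True)), a minimal sketch for loading it back and re-serializing it as binary looks like this; the file name graph.pbtxt is an assumption:

import tensorflow as tf
from google.protobuf import text_format

# Parse a text-format GraphDef (the file name is a placeholder).
with open("graph.pbtxt", "r") as f:
    graph_def = text_format.Parse(f.read(), tf.GraphDef())

# Re-serialize it into the binary format that the freezing tools expect.
with open("graph.proto", "wb") as f:
    f.write(graph_def.SerializeToString())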
Friday, August 17, 2018
[TensorFlow] Rewriter_Config and Memory Optimization Passes
In the previous post linked below, I mentioned that the default values of rewrite_config seem to have changed a little.
https://danny270degree.blogspot.com/2018/06/tensorflow-compare-memory-options-in.html
To clarify my doubt, I checked TensorFlow's memory_optimizer.cc and arranged the mapping table:
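As a minimal sketch of how these options are set from Python in TF 1.x (HEURISTICS is just one of the possible memory_optimization values):

import tensorflow as tf
from tensorflow.core.protobuf import rewriter_config_pb2

rewrite_options = rewriter_config_pb2.RewriterConfig(
    memory_optimization=rewriter_config_pb2.RewriterConfig.HEURISTICS)
graph_options = tf.GraphOptions(rewrite_options=rewrite_options)
config = tf.ConfigProto(graph_options=graph_options)

# Sessions created with this config run Grappler's memory optimizer
# with the chosen option.
with tf.Session(config=config) as sess:
    pass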
Thursday, August 16, 2018
[TensorFlow] How to print the timestamp of a node/operation of the computation graph at run time?
When people try for the first time to debug or print out information from a node/operation in the TensorFlow computation graph, they may be confused about how to do it. Fortunately, someone at Google gave a great explanation of the print function:
https://towardsdatascience.com/using-tf-print-in-tensorflow-aa26e1cff11e
After reading it, you should understand how the tf.Print() function works and how to use it.
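As a minimal sketch of the idea (tf.Print is an identity op: it returns its input unchanged, but logs the listed tensors to stderr at the moment the node executes):

import tensorflow as tf

inp = tf.placeholder(tf.float32, shape=[None, 2])
# Logs the shape and values of inp to stderr every time this node runs.
printed = tf.Print(inp, [tf.shape(inp), inp], message="x at run time: ")
out = printed * 2.0

with tf.Session() as sess:
    sess.run(out, feed_dict={inp: [[1.0, 2.0]]})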
Wednesday, August 8, 2018
[ONNX] Use ONNX_TF and nGraph_ONNX to do inference/prediction with ONNX model
Here I use a pre-trained model from the ONNX model zoo, where the models have already been converted from various deep learning frameworks. So I download the ResNet-50 model from the following URL and untar it:
wget https://s3.amazonaws.com/download.onnx/models/opset_8/resnet50.tar.gz
tar -xzvf resnet50.tar.gz
P.S.: pre-trained ONNX models: https://github.com/onnx/models
Then, I can do the inference/prediction using this ONNX model in two ways:
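For the first way, a minimal sketch with the onnx_tf backend; the path inside the untarred archive and the input shape are assumptions, and random data stands in for a real image:

import numpy as np
import onnx
from onnx_tf.backend import prepare

model = onnx.load("resnet50/model.onnx")   # assumed path inside the archive
tf_rep = prepare(model)                    # wrap the ONNX model as a TF backend
image = np.random.rand(1, 3, 224, 224).astype(np.float32)  # NCHW input
outputs = tf_rep.run(image)
print(outputs)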
[ONNX] Train in Tensorflow and export to ONNX (Part I)
From my point of view, ONNX is a model-description spec, and an ONNX model needs a deep learning framework or a backend tool/compiler that supports it in order to run.
The advantage of ONNX, as far as I know, is that models become portable and exchangeable between DL frameworks.
Here I will use the tutorial below to convert a TensorFlow model to an ONNX model myself.
https://github.com/onnx/tutorials/blob/master/tutorials/OnnxTensorflowExport.ipynb
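A rough sketch of the conversion step, assuming the onnx-tf frontend API available at the time (the file and output-node names are placeholders):

import tensorflow as tf
from onnx_tf.frontend import tensorflow_graph_to_onnx_model

# Load a frozen GraphDef (the file name is a placeholder).
with tf.gfile.GFile("frozen_graph.pb", "rb") as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())

# "output" must be replaced by the name of your graph's output node.
onnx_model = tensorflow_graph_to_onnx_model(graph_def, "output")
with open("model.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())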
Tuesday, July 31, 2018
[Fun] Compress and composite a dataset into one image file
Tuesday, July 17, 2018
[Confusion Matrix] How to calculate the confusion matrix and the precision and recall lists from scratch
I directly give an example with 10 categories, as in CIFAR-10 and MNIST. It explains how to calculate the confusion matrix and the precision and recall lists from scratch in Python. My data is generated at random; you should replace it with yours. Here it goes:
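A minimal from-scratch sketch of the calculation (random data stands in for real labels and predictions):

import numpy as np

num_classes = 10
y_true = np.random.randint(0, num_classes, size=1000)  # random ground truth
y_pred = np.random.randint(0, num_classes, size=1000)  # random predictions

# confusion[i][j] counts samples whose true label is i and predicted label is j.
confusion = np.zeros((num_classes, num_classes), dtype=np.int64)
for t, p in zip(y_true, y_pred):
    confusion[t][p] += 1

# Per-class precision = TP / column sum; recall = TP / row sum.
precision = [confusion[k, k] / max(confusion[:, k].sum(), 1)
             for k in range(num_classes)]
recall = [confusion[k, k] / max(confusion[k, :].sum(), 1)
          for k in range(num_classes)]
print("precision:", precision)
print("recall:", recall)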
Saturday, July 14, 2018
[Qt5] How to develop Qt5 GUI with TensorFlow C++ library?
Here I give a simple and complete example of how to develop a Qt5 GUI with the TensorFlow C++ library on the Linux platform. Please check out my GitHub repository as follows:
https://github.com/teyenliu/tf_inference_gui
Monday, July 9, 2018
[TensorFlow] How to implement LMDBDataset in tf.data API?
Thursday, July 5, 2018
[TensorFlow] How to build your C++ program or application with TensorFlow library using CMake
When you want to build your C++ program or application using the TensorFlow library or functions, you will probably encounter missing header files or linking problems. Here is the step list that I have verified and that works well.
1. Prepare TensorFlow (v1.10) and its third-party libraries
2. Modify .tf_configure.bazelrc
1. Prepare TensorFlow (v1.10) and its third-party libraries
$ git clone --recursive https://github.com/tensorflow/tensorflow
$ cd tensorflow/contrib/makefile
$ ./build_all_linux.sh
2. Modify .tf_configure.bazelrc
$ cd tensorflow/
$ vim .tf_configure.bazelrc
Append this line at the bottom of the file:
build --define=grpc_no_ares=true
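With the libraries in place, a minimal CMakeLists.txt sketch for your own program could look like the following; every path below is an assumption about where your TensorFlow source tree and its generated headers/libraries live, so adjust them to your layout:

cmake_minimum_required(VERSION 3.5)
project(tf_app)
set(TF_ROOT /path/to/tensorflow)  # assumption: your TensorFlow source tree
add_executable(tf_app main.cpp)
target_include_directories(tf_app PRIVATE
    ${TF_ROOT}
    ${TF_ROOT}/bazel-genfiles
    ${TF_ROOT}/tensorflow/contrib/makefile/gen/protobuf/include
    ${TF_ROOT}/tensorflow/contrib/makefile/downloads/eigen
    ${TF_ROOT}/tensorflow/contrib/makefile/downloads/nsync/public)
target_link_libraries(tf_app
    ${TF_ROOT}/bazel-bin/tensorflow/libtensorflow_cc.so
    ${TF_ROOT}/bazel-bin/tensorflow/libtensorflow_framework.so)
set_target_properties(tf_app PROPERTIES CXX_STANDARD 11)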
Wednesday, June 27, 2018
[XLA JIT] How to turn on XLA JIT compilation for multi-GPU training
Before I discuss this question, let's recall how to turn on XLA JIT compilation and use it in the TensorFlow Python API.
1. Session
Turning on JIT compilation at the session level will result in all possible operators being greedily compiled into XLA computations. Each XLA computation will be compiled into one or more kernels for the underlying device.
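As a minimal sketch at the session level in the TF 1.x Python API:

import tensorflow as tf

config = tf.ConfigProto()
# Turn on XLA JIT compilation for everything this session runs.
config.graph_options.optimizer_options.global_jit_level = tf.OptimizerOptions.ON_1

with tf.Session(config=config) as sess:
    pass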
Monday, June 25, 2018
[PCIe] How to read/write PCIe Switch Configuration Space?
Thursday, June 21, 2018
[TensorFlow] How to get CPU configuration flags (such as SSE4.1, SSE4.2, and AVX...) in a bash script for building TensorFlow from source
AVX, SSE4.2, and the others are instruction-set extensions offered by Intel CPUs (AVX and SSE4.2 are CPU features for faster matrix computations). Did you ever wonder which CPU configuration flags (such as SSE4.1, SSE4.2, and AVX...) you should use on your machine when building TensorFlow from source? If so, here is a quick solution for you.
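A quick sketch of the idea in bash (the flag list is illustrative; note that /proc/cpuinfo spells the flags with underscores while the matching gcc/bazel options use dots):

for f in sse4_1 sse4_2 avx avx2 fma; do
  grep -qw "$f" /proc/cpuinfo && echo "--copt=-m${f//_/.}"
done

Each emitted line (e.g. --copt=-msse4.1) can then be appended to the bazel build command.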
[TensorFlow Memory Optimization Experiment] Compare the memory options in Grappler Memory Optimizer
As we know, TensorFlow has an optimization module called "Grappler". It provides many kinds of optimization functionality, such as layout, memory, model pruning, and so on... In this experiment, we can see the effect of enabling some memory options in a simple CNN model using the MNIST dataset.
Thursday, June 14, 2018
[XLA Study] How to use XLA AOT compilation in TensorFlow
This document explains how to use AOT compilation in TensorFlow. We will use the tool tfcompile, a standalone tool that ahead-of-time (AOT) compiles TensorFlow graphs into executable code. It can reduce the total binary size and also avoid some runtime overheads. A typical use-case of tfcompile is to compile an inference graph into executable code for mobile devices. The steps are as follows:
1. Build tool: tfcompile
> bazel build --config=opt --config=cuda //tensorflow/compiler/aot:tfcompile
Friday, June 8, 2018
[XLA Study] Take a glance at the graph changes in XLA JIT compilation
To preface this article: understanding XLA JIT is pretty hard, because you probably need to understand the TensorFlow graph, the executor, LLVM, and the math... I have been through this painful study, so I hope my experience can help those who are interested in XLA but have not yet understood it.