Danny's tech notebook | 丹尼技術手札: Tensorflow

Showing posts with label Tensorflow. Show all posts

Tuesday, October 30, 2018

[TensorFlow] Train in Tensorflow and do inference with the trained model

If you want to train your model in Tensorflow and do inference with the trained model, you can refer to this post.

1. Train your model

I will use the simple CNN model in my previous post:
[ONNX] Train in Tensorflow and export to ONNX (Part II)
https://danny270degree.blogspot.com/2018/08/onnx-train-in-tensorflow-and-export-to_20.html

So, after training, you will get these files:

my_mnist/

├── checkpoint

├── graph.pbtxt

├── my_mnist_model.data-00000-of-00001

├── my_mnist_model.index

└── my_mnist_model.meta

[TensorFlow] Does it help the processing time and transmission time if increasing CUDA Steam number in TensorFlow?

Before starting to increase the CUDA Steam number in TensorFlow, I want to recap some ideas about the Executor module. When TensorFlow session runs, it will build Executor. Meanwhile, if you enable CUDA in TensorFlow build configuration, the Executor will add visible GPU devices and create TF device object (GPUDevice object) mapping to physical GPU device. There are 4 kinds of streams inside GPUDevice:

CUDA stream
Host_to_Device stream
Device_to_Host stream
Device_to_Device stream

[TensorFlow Grappler] How to do the topological sorting in TensorFlow Grappler?

If you try to implement some optimizers in TensorFlow Grappler, you must have to know how to deal with the directed computation graph. One of the most important tools/knowledges is topological sorting.
The definition from Wiki: Topological sorting
https://en.wikipedia.org/wiki/Topological_sorting
"In the field of computer science, a topological sort or topological ordering of a directed graph is a linear ordering of its vertices such that for every directed edge uv from vertex u to vertex v, u comes before v in the ordering."

[Tool] To draw a sequence diagram using online tool sequencediagram

This website provides an online free tool for users to draw the sequence diagram as follows:
https://sequencediagram.org/

Basically, you can follow the instructions at the left top corner button. Check it out.
Here is my example of the sequence diagram about tracing some source codes of XLA AOT in TensorFlow.

[TensorFlow Grappler] The ways to traverse all nodes' input and output in the graph using C++ in TensorFlow Grappler

Here I want to introduce 2 ways to traverse all nodes' input and output in the graph using C++ in Grappler.
P.S: you have to be able to get GrapplerItem and GraphDef objects in your code.

First, check my example node name in Tensorboard as follows:
conv1/Conv2D

[TensorFlow] Why does the feed's shape matter in TensorFlow Grappler?

Before explaining this, you should understand what Shapes and dynamic dimensions are in TensorFlow. This article below explains the concept very well.
https://blog.metaflow.fr/shapes-and-dynamic-dimensions-in-tensorflow-7b1fe79be363
The key idea is:

[XLA related] Sort out my thought and notes about XLA related

This post could be a little bit unstructured because it's for my reference in notes.
I recently found that there are several slides in SlideShare which contain very good information and source code analysis/study about XLA related as follows:

[TensorFlow] My simple way to profile TensorFlow and dump variables and GPU memory

As we know that if we want to profile Tensorflow graph and know what operations take more time and what less. This can be done with Tensorflow timeline module like this:
( I ignore the part of the model to simplify my example code )

...
run_options = tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE)
run_metadata = tf.RunMetadata()
...
with tf.Session(config=config) as sess:
    init.run()
    for epoch in range(n_epochs):
        for iteration in range(10):
            sess.run(training_op, feed_dict={X: picture, y:picture_label}, 
                     options=run_options, run_metadata=run_metadata)
            fetched_timeline = timeline.Timeline(run_metadata.step_stats)
            chrome_trace = fetched_timeline.generate_chrome_trace_format()
            with open('timeline_step_%d.json' % iteration, 'w') as f:
                f.write(chrome_trace)

[ONNX] Train in Tensorflow and export to ONNX (Part II)

If you read the previous post as the link below, you probably may ask a question: If the input TF graph for freezing is not a binary format, what do we do?
http://danny270degree.blogspot.com/2018/08/onnx-train-in-tensorflow-and-export-to.html

Let us recall the previous example below. The file "graph.proto" is the binary format of the protobuf file for TensorFlow graph generated from the following function:

  with open("graph.proto", "wb") as file:
    graph = tf.get_default_graph().as_graph_def(add_shapes=True)
    file.write(graph.SerializeToString())

[TensorFlow] Rewriter_Config and Memory Optimization Passes

In the previous post as the below link, I mentioned that the default value of rewrite_config seems to change a little bit.
https://danny270degree.blogspot.com/2018/06/tensorflow-compare-memory-options-in.html

To clarify my doubt, I check the TensorFlow's memory_optimizer.cc and arrange the mapping table:

[Qt5] How to develop Qt5 GUI with TensorFlow C++ library?

Here I give a simple and complete example of how to develop Qt5 GUI with TensorFlow C++ library on Linux platform. Please check out my GitHub's repository as follow:
https://github.com/teyenliu/tf_inference_gui

[TensorFlow] How to implement LMDBDataset in tf.data API?

I have finished implementing the LMDBDataset in tf.data API. It could be not the bug-free component, but at least it's my first time to try to implement C++ and Python function in TensorFlow. The API architecture looks like this:

[TensorFlow] How to build your C++ program or application with TensorFlow library using CMake

When you want to build your C++ program or application using TensorFlow library or functions, you probably will encounter some header file missed issues or linking problems. Here is the step list that I have verified and it works well.

1. Prepare TensorFlow ( v1.10) and its third party's library

$ git clone --recursive https://github.com/tensorflow/tensorflow
$ cd tensorflow/contrib/makefile
$ ./build_all_linux.sh

2. Modify .tf_.tf_configure.bazelrc

$ cd tensorflow/
$ vim .tf_configure.bazelrc
  append this line in the bottom of the file
  ==>
  build --define=grpc_no_ares=true

[XLA JIT] How to turn on XLA JIT compilation at multiple GPUs training

Before I discuss this question, let's recall how to turn on XLA JIT compilation and use it in TensorFlow python API.

1. Session
Turning on JIT compilation at the session level will result in all possible operators being greedily compiled into XLA computations. Each XLA computation will be compiled into one or more kernels for the underlying device.

[TensorFlow] How to get CPU configuration flags (such as SSE4.1, SSE4.2, and AVX...) in a bash script for building TensorFlow from source

The AVX and SSE4.2 and others are offered by Intel CPU. (AVX and SSE4.2 are CPU infrastructures for faster matrix computations) Did you wonder what CPU configuration flags (such as SSE4.1, SSE4.2, and AVX...) you should use on your machine when building Tensorflow from source? If so, here is a quick solution for you.

[TensorFlow 記憶體優化實驗] Compare the memory options in Grappler Memory Optimizer

As we know that in Tensorflow, there is an optimization module called "Grappler". It provides many kinds of optimization functionalities, such as: Layout, Memory, ModelPruner, and so on... In this experiment, we can see the effect of some memory options enabled in a simple CNN model using MNIST dataset.

[TensorFlow] My case to install TensorFlow with GPU enabled

My Operation System is Ubuntu 14.04 LTS 5 and GPU card is GeForce GTX 750Ti

1. Go to nvidia.com and download the driver (NVIDIA-Linux-x86_64-367.44.sh)

2. For Nvidia to find linux header files (*):
$ sudo apt-get install build-essential linux-headers-$(uname -r)

3. To enable full screen text mode (nomodeset):
$ sudo gedit /etc/default/grub
>> Edit GRUB_CMDLINE_LINUX_DEFAULT="quiet splash nomodeset"
Save it and reboot
$ sudo update-grub
$ sudo reboot

4. Log into with Ctl +Alt + F1

5. Stop the X Server service
$ sudo service lightdm stop

6. Install nVidia driver
$ sudo ./NVIDIA-Linux-x86_64-367.44.sh

7. Install CUDA (GPUs on Linux)

Download and install Cuda Toolkit

https://developer.nvidia.com/cuda-downloads

sudo dpkg -i cuda-repo-ubuntu1404-8-0-local_8.0.44-1_amd64.deb

sudo apt-get update

sudo apt-get install cuda

8. Download and install cuDNN

https://developer.nvidia.com/cudnn

tar xvzf cudnn-8.0-linux-x64-v5.1.tgz

cd cuda
sudo cp include/cudnn.h /usr/local/cuda-8.0/include
sudo cp lib64/* /usr/local/cuda-8.0/lib64
sudo chmod a+r /usr/local/cuda-8.0/lib64/libcudnn*

9. You also need to set the LD_LIBRARY_PATH and CUDA_HOME environment variables. Consider adding the commands below to your ~/.bash_profile. These assume your CUDA installation is in /usr/local/cuda:

$ vim ~/.bashrc

export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda-8.0/lib64"
export CUDA_HOME=/usr/local/cuda-8.0
export PATH="$CUDA_HOME/bin:$PATH"
export PATH="$PATH:$HOME/bin"

10. To install TensorFlow for Ubuntu/Linux 64-bit, GPU enabled:
$ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.10.1-cp27-none-linux_x86_64.whl

To find out which device is used, you can enable log device placement like this:
$ python
>>>> import tensorflow as tf
>>>> sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))

Tuesday, August 2, 2016

[Tensorflow] Fizz-Buzz example enhancement

I am just based on this Fizz-Buzz example as below to add 2nd convolution layer and guess what? The result is quicker to be learn. But, this is just the first step to learn "Deep Learning"...
There is still a lot of things and knowledge that need to learn more.
http://joelgrus.com/2016/05/23/fizz-buzz-in-tensorflow/

Before

After

Reference
http://www.slideshare.net/WrangleConf/wrangle-2016-lightning-talk-fizzbuzz-in-tensorflow

Danny's tech notebook | 丹尼技術手札

Tuesday, October 30, 2018

[TensorFlow] Train in Tensorflow and do inference with the trained model

1. Train your model

Tuesday, October 23, 2018

[TensorFlow] Does it help the processing time and transmission time if increasing CUDA Steam number in TensorFlow?

Thursday, October 18, 2018

[TensorFlow Grappler] How to do the topological sorting in TensorFlow Grappler?

[Tool] To draw a sequence diagram using online tool sequencediagram

Wednesday, October 17, 2018

[TensorFlow Grappler] The ways to traverse all nodes' input and output in the graph using C++ in TensorFlow Grappler

Friday, September 7, 2018

[TensorFlow] Why does the feed's shape matter in TensorFlow Grappler?

Tuesday, September 4, 2018

[XLA related] Sort out my thought and notes about XLA related

Wednesday, August 29, 2018

[TensorFlow] My simple way to profile TensorFlow and dump variables and GPU memory

Tuesday, August 21, 2018

[ONNX] Train in Tensorflow and export to ONNX (Part II)

Friday, August 17, 2018

[TensorFlow] Rewriter_Config and Memory Optimization Passes

Saturday, July 14, 2018

[Qt5] How to develop Qt5 GUI with TensorFlow C++ library?

Monday, July 9, 2018

[TensorFlow] How to implement LMDBDataset in tf.data API?

Thursday, July 5, 2018