Wednesday, August 30, 2017

[Caffe] Try out Caffe with Python code

This post is just a record of my experiments with Caffe's Python interface. I referred to this blog. With Python, we can easily access every data blob in each layer, including the diff blobs, weight blobs, and bias blobs. That makes it convenient to see how the weights change during the training phase and what happens at each step.
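As a small illustration of that kind of access, here is a minimal sketch that walks over a net and collects the shapes of its data, diff, weight, and bias blobs. It assumes the usual pycaffe layout, where `net.blobs` maps a blob name to an object with `.data` and `.diff`, and `net.params` maps a layer name to a list holding the weight blob and, optionally, the bias blob. The helper `summarize_net` and the stand-in `fake_net` are hypothetical names of mine, used only so that the sketch runs without Caffe installed.

```python
from types import SimpleNamespace


def summarize_net(net):
    """Collect the shapes of activations, gradients, weights, and biases.

    `net` is expected to look like a pycaffe `caffe.Net`: `net.blobs` maps a
    blob name to an object with `.data` and `.diff` arrays, and `net.params`
    maps a layer name to a list whose first entry is the weight blob and
    whose optional second entry is the bias blob.
    """
    summary = {"blobs": {}, "params": {}}
    for name, blob in net.blobs.items():
        summary["blobs"][name] = {"data": blob.data.shape, "diff": blob.diff.shape}
    for name, params in net.params.items():
        entry = {"weight": params[0].data.shape}
        if len(params) > 1:
            entry["bias"] = params[1].data.shape
        summary["params"][name] = entry
    return summary


# Hypothetical stand-in for a real net so the sketch runs without Caffe.
def _fake_blob(shape):
    array = SimpleNamespace(shape=shape)
    return SimpleNamespace(data=array, diff=array)


fake_net = SimpleNamespace(
    blobs={"data": _fake_blob((1, 3, 32, 32)), "ip1": _fake_blob((1, 10))},
    params={"ip1": [_fake_blob((10, 3072)), _fake_blob((10,))]},
)
summary = summarize_net(fake_net)
print(summary["params"]["ip1"])
```

With Caffe installed, the same helper should work on a real net loaded through `caffe.Net`; calling it before and after a solver step is one way to watch the blobs during training.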

Monday, August 7, 2017

[Caffe] How to use Caffe to solve the regression problem?

A question came to my mind recently: how can we use Caffe to solve a regression problem? Most of the examples we see are image recognition with labels, which are classification problems. In my experience, I have solved regression problems with TensorFlow but not with Caffe. Still, I think in theory the approach is the same. The key point is to use EuclideanLossLayer as the final loss layer; here are the details from the official web site:
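As a rough illustration of my own (not text from the official site), the Euclidean loss is half the mean of the squared differences between predictions and targets, E = 1/(2N) * sum ||pred_n - target_n||^2. The sketch below computes it in plain Python; the function name `euclidean_loss` and the sample numbers are made up for the example.

```python
def euclidean_loss(predictions, targets):
    """Return 1/(2N) * sum of squared differences over N samples.

    Each sample is a list of floats; predictions and targets must have
    the same number of samples and the same length per sample.
    """
    n = len(predictions)
    total = 0.0
    for pred, target in zip(predictions, targets):
        total += sum((p - t) ** 2 for p, t in zip(pred, target))
    return total / (2.0 * n)


# Three samples with one regression output each (made-up values).
preds = [[2.5], [0.0], [2.0]]
targets = [[3.0], [-0.5], [2.0]]
loss = euclidean_loss(preds, targets)
print(loss)
```

Unlike a softmax loss, this loss takes real-valued targets, which is why it fits regression.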

Wednesday, August 2, 2017

[Raspberry Pi] Use Wireless and Ethernet together

The following is my Raspberry Pi 3's configuration in /etc/network/interfaces. In my case, I use the wireless and Ethernet devices at the same time.
# Include files from /etc/network/interfaces.d:
source-directory /etc/network/interfaces.d

auto lo
iface lo inet loopback

auto wlan0
allow-hotplug wlan0
iface wlan0 inet manual
    wpa-conf /etc/wpa_supplicant/wpa_supplicant.conf

allow-hotplug eth0
iface eth0 inet static
    address 140.96.29.224
    netmask 255.255.255.0
    up ip route add 100.85.0.0/24 via 140.96.29.254 dev eth0
    up ip route add 140.96.29.0/24 via 140.96.29.254 dev eth0
    up ip route add 140.96.98.0/24 via 140.96.29.254 dev eth0
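
To make the intent of the three static routes concrete, here is a small sketch using Python's standard ipaddress module. It checks whether a destination address falls inside one of the subnets that the configuration sends through eth0. Where the remaining traffic goes (presumably the wlan0 default route) is an assumption, since the file does not show it. The function name `uses_eth0_route` is hypothetical.

```python
import ipaddress

# Subnets and gateway copied from the eth0 stanza above.
ETH0_ROUTES = [
    ipaddress.ip_network(net)
    for net in ("100.85.0.0/24", "140.96.29.0/24", "140.96.98.0/24")
]
ETH0_GATEWAY = ipaddress.ip_address("140.96.29.254")


def uses_eth0_route(destination):
    """Return True if `destination` matches one of the static eth0 routes."""
    addr = ipaddress.ip_address(destination)
    return any(addr in net for net in ETH0_ROUTES)


for dest in ("140.96.98.10", "100.85.0.7", "8.8.8.8"):
    via = "eth0 via %s" % ETH0_GATEWAY if uses_eth0_route(dest) else "another route"
    print(dest, "->", via)
```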

[Debug] Debugging Python and C++ exposed by boost together

While studying Caffe, I was curious about how Caffe provides its Python interface and which tool it uses for the wrapping. The answer is Boost.Python. I think it is worth a C++ developer's time to learn, and I will study it soon. In this post, I want to introduce the debugging techniques I found in the Stack Overflow question below. I believe they are very useful for tasks such as debugging Caffe with a Python layer. Here is the link:
https://stackoverflow.com/questions/38898459/debugging-python-and-c-exposed-by-boost-together

Tuesday, July 18, 2017

[PCIe] lspci command and the PCIe devices in my server

The following shows the PCIe devices and drivers on my server, along with the lspci command results.

$ cd /sys/bus/pci_express/drivers
$ ls -al
drwxr-xr-x 2 root root 0 Jul  6 15:33 aer/
drwxr-xr-x 2 root root 0 Jul  6 15:33 pciehp/
drwxr-xr-x 2 root root 0 Jul  6 15:33 pcie_pme/
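
The same listing can be collected from a script. The sketch below is a minimal example that reads the driver directory shown above and returns the driver names. The function name `list_pcie_port_drivers` is hypothetical, and the function returns an empty list on machines where that sysfs path does not exist.

```python
import os

# Path taken from the shell session above.
PCIE_DRIVER_DIR = "/sys/bus/pci_express/drivers"


def list_pcie_port_drivers(path=PCIE_DRIVER_DIR):
    """Return the sorted driver names under `path`, or [] if it is missing."""
    if not os.path.isdir(path):
        return []
    return sorted(
        name for name in os.listdir(path)
        if os.path.isdir(os.path.join(path, name))
    )


drivers = list_pcie_port_drivers()
print(drivers)
```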

Thursday, May 18, 2017

[Caffe] Install Caffe and the packages it depends on

This article is just a quick record for myself of all the steps to install the packages that Caffe depends on. So be careful: these steps may not suit your environment. ^_^

# Install CCMAKE

$ sudo apt-get install cmake-curses-gui

Monday, May 15, 2017

[NCCL] Build and run the test of NCCL


NCCL requires at least CUDA 7.0 and Kepler or newer GPUs. Best performance is achieved when all GPUs are located on a common PCIe root complex, but multi-socket configurations are also supported.

Note: NCCL may also work with CUDA 6.5, but this is an untested configuration.

Build & run

To build the library and tests:

$ cd nccl
$ make CUDA_HOME=<cuda install path> test
Test binaries are located in the subdirectories nccl/build/test/{single,mpi}.

$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:./build/lib
$ ./build/test/single/all_reduce_test 100000000
# Using devices
#   Rank  0 uses device  0 [0x04] GeForce GTX 1080 Ti
#   Rank  1 uses device  1 [0x05] GeForce GTX 1080 Ti
#   Rank  2 uses device  2 [0x08] GeForce GTX 1080 Ti
#   Rank  3 uses device  3 [0x09] GeForce GTX 1080 Ti
#   Rank  4 uses device  4 [0x83] GeForce GTX 1080 Ti
#   Rank  5 uses device  5 [0x84] GeForce GTX 1080 Ti
#   Rank  6 uses device  6 [0x87] GeForce GTX 1080 Ti
#   Rank  7 uses device  7 [0x88] GeForce GTX 1080 Ti

#                                                 out-of-place                    in-place
#      bytes             N    type      op     time  algbw  busbw      res     time  algbw  busbw      res
   100000000     100000000    char     sum   30.244   3.31   5.79    0e+00   29.892   3.35   5.85    0e+00
   100000000     100000000    char    prod   30.493   3.28   5.74    0e+00   30.524   3.28   5.73    0e+00
   100000000     100000000    char     max   29.745   3.36   5.88    0e+00   29.877   3.35   5.86    0e+00
   100000000     100000000    char     min   29.744   3.36   5.88    0e+00   29.868   3.35   5.86    0e+00
   100000000      25000000     int     sum   29.692   3.37   5.89    0e+00   29.754   3.36   5.88    0e+00
   100000000      25000000     int    prod   30.733   3.25   5.69    0e+00   30.697   3.26   5.70    0e+00
   100000000      25000000     int     max   29.871   3.35   5.86    0e+00   29.700   3.37   5.89    0e+00
   100000000      25000000     int     min   29.809   3.35   5.87    0e+00   29.852   3.35   5.86    0e+00
   100000000      50000000    half     sum   28.590   3.50   6.12    1e-02   27.545   3.63   6.35    1e-02
   100000000      50000000    half    prod   27.416   3.65   6.38    1e-03   27.375   3.65   6.39    1e-03
   100000000      50000000    half     max   30.811   3.25   5.68    0e+00   30.670   3.26   5.71    0e+00
   100000000      50000000    half     min   30.818   3.24   5.68    0e+00   30.931   3.23   5.66    0e+00
   100000000      25000000   float     sum   29.719   3.36   5.89    1e-06   29.750   3.36   5.88    1e-06
   100000000      25000000   float    prod   29.741   3.36   5.88    1e-07   30.029   3.33   5.83    1e-07
   100000000      25000000   float     max   28.400   3.52   6.16    0e+00   28.400   3.52   6.16    0e+00
   100000000      25000000   float     min   28.364   3.53   6.17    0e+00   28.434   3.52   6.15    0e+00
   100000000      12500000  double     sum   33.989   2.94   5.15    0e+00   34.104   2.93   5.13    0e+00
   100000000      12500000  double    prod   33.895   2.95   5.16    2e-16   33.833   2.96   5.17    2e-16
   100000000      12500000  double     max   30.228   3.31   5.79    0e+00   30.273   3.30   5.78    0e+00
   100000000      12500000  double     min   30.324   3.30   5.77    0e+00   30.341   3.30   5.77    0e+00
   100000000      12500000   int64     sum   29.914   3.34   5.85    0e+00   30.036   3.33   5.83    0e+00
   100000000      12500000   int64    prod   30.975   3.23   5.65    0e+00   31.083   3.22   5.63    0e+00
   100000000      12500000   int64     max   29.954   3.34   5.84    0e+00   29.949   3.34   5.84    0e+00
   100000000      12500000   int64     min   29.946   3.34   5.84    0e+00   29.952   3.34   5.84    0e+00
   100000000      12500000  uint64     sum   29.981   3.34   5.84    0e+00   30.100   3.32   5.81    0e+00
   100000000      12500000  uint64    prod   30.911   3.24   5.66    0e+00   30.800   3.25   5.68    0e+00
   100000000      12500000  uint64     max   29.890   3.35   5.85    0e+00   29.947   3.34   5.84    0e+00
   100000000      12500000  uint64     min   29.929   3.34   5.85    0e+00   29.964   3.34   5.84    0e+00

 Out of bounds values : 0 OK
 Avg bus bandwidth    : 5.81761
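
To show how the algbw and busbw columns relate to the bytes and time columns, here is a short sketch that recomputes the first row of the table. It rests on assumptions of mine that happen to reproduce the printed numbers: the time is in milliseconds, bandwidth is in GB/s with 1 GB = 1e9 bytes, and the bus bandwidth of an all-reduce over n ranks is the algorithm bandwidth scaled by 2(n-1)/n. The function name `allreduce_bandwidths` is hypothetical.

```python
def allreduce_bandwidths(num_bytes, time_ms, num_ranks):
    """Return (algbw, busbw) in GB/s for one all-reduce measurement.

    Assumes time_ms is in milliseconds and 1 GB = 1e9 bytes.
    """
    seconds = time_ms / 1000.0
    # Algorithm bandwidth: payload size divided by elapsed time.
    algbw = num_bytes / seconds / 1e9
    # Bus bandwidth: scale by 2(n-1)/n (assumed all-reduce correction factor).
    busbw = algbw * 2.0 * (num_ranks - 1) / num_ranks
    return algbw, busbw


# First row of the table: char sum, out-of-place, 8 GPUs.
algbw, busbw = allreduce_bandwidths(100000000, 30.244, 8)
print("algbw = %.2f GB/s, busbw = %.2f GB/s" % (algbw, busbw))
# prints "algbw = 3.31 GB/s, busbw = 5.79 GB/s"
```

Because the scaling factor depends only on the number of ranks, busbw makes runs with different GPU counts easier to compare than algbw alone.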