How can I tell from inside the Python shell whether TensorFlow is using GPU acceleration?

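I have installed TensorFlow on Ubuntu 16.04 by following the answer here, together with Ubuntu's built-in apt CUDA installation.

Now my question is: how can I test whether TensorFlow is really using the GPU? I have a GTX 960M GPU. When I import tensorflow, this is the output: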

I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcurand.so locally

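Is this output enough to confirm that TensorFlow is using the GPU?

Best answer

I don't think "successfully opened CUDA library" is enough to tell, because different nodes of the graph may be placed on different devices.

To find out which device each node uses, enable device placement logging like this: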

import tensorflow as tf
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
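
As a fuller illustration of the logging flag, here is a minimal sketch that builds a small matmul and runs it with log_device_placement=True. It assumes TensorFlow 1.13+ or 2.x, where tf.compat.v1 exists. The function name run_with_placement_logging is made up for this example, and the function returns None when TensorFlow is not installed.

```python
def run_with_placement_logging():
    """Run a small matmul with device placement logging turned on.

    Returns the product as nested lists, or None if TensorFlow is missing.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None

    # Assumption: tf.compat.v1 is available (TensorFlow 1.13+ or 2.x).
    v1 = tf.compat.v1

    # Use a private graph so the sketch also works where eager mode is the default.
    with tf.Graph().as_default():
        a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
        b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
        c = tf.matmul(a, b)

        config = v1.ConfigProto(log_device_placement=True)
        with v1.Session(config=config) as sess:
            # Placement information for the ops is written to the log (stderr).
            return sess.run(c).tolist()


if __name__ == '__main__':
    print(run_with_placement_logging())
```

On a working GPU setup, the logged placement lines for a, b and MatMul would be expected to name a GPU device. If they all name the CPU, the GPU is not being used for these ops.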

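Second-best answer

Besides using sess = tf.Session(config=tf.ConfigProto(log_device_placement=True)), which is described in the other answers and in the official TF documentation, you can try assigning a computation to the GPU and see whether you get an error.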

import tensorflow as tf

# Pin these ops to the first GPU
with tf.device('/gpu:0'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
    c = tf.matmul(a, b)

# Running the graph raises an error if no matching GPU device is registered
with tf.Session() as sess:
    print(sess.run(c))

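Here:

"/cpu:0": the CPU of your machine.
"/gpu:0": the GPU of your machine, if you have one.

If you have a GPU and TensorFlow can use it, you will see the result. Otherwise, you will see an error with a long stack trace. At the end you will see something like this:

Cannot assign a device to node 'MatMul': Could not satisfy explicit device specification '/device:GPU:0' because no devices matching that specification are registered in this process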
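
The "try it and see whether it fails" idea can be wrapped in a small helper. This is only a sketch under assumptions: it targets TensorFlow 1.13+ or 2.x through tf.compat.v1, the name can_run_on_gpu is hypothetical, and it relies on the placement failure surfacing as tf.errors.InvalidArgumentError, which is the error class the message above belongs to as far as I know.

```python
def can_run_on_gpu(device='/gpu:0'):
    """Return True if a matmul pinned to `device` runs, False if placement
    fails, or None if TensorFlow is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return None

    # Assumption: tf.compat.v1 is available (TensorFlow 1.13+ or 2.x).
    v1 = tf.compat.v1

    with tf.Graph().as_default():
        with tf.device(device):
            a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3])
            b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2])
            c = tf.matmul(a, b)

        # Soft placement stays off so that a missing GPU raises an error
        # instead of silently falling back to the CPU.
        config = v1.ConfigProto(allow_soft_placement=False)
        try:
            with v1.Session(config=config) as sess:
                sess.run(c)
        except tf.errors.InvalidArgumentError:
            return False
    return True


if __name__ == '__main__':
    print('GPU usable:', can_run_on_gpu())
```

Returning a boolean instead of letting the stack trace through makes the check easy to call from a setup script or a test.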

Third answer

The following code lists all devices available to TensorFlow.

from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())

Sample Output

[name: "/cpu:0" device_type: "CPU" memory_limit: 268435456 locality { } incarnation: 4402277519343584096,

name: "/gpu:0" device_type: "GPU" memory_limit: 6772842168 locality { bus_id: 1 } incarnation: 7471795903849088328 physical_device_desc: "device: 0, name: GeForce GTX 1070, pci bus id: 0000:05:00.0" ]
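
If only the GPU entries matter, the list can be filtered on device_type. The sketch below assumes the entries expose name, device_type and physical_device_desc fields, as the sample output suggests. The helper name list_gpu_devices is invented for illustration, and it returns None when TensorFlow cannot be imported.

```python
def list_gpu_devices():
    """Return (name, description) pairs for GPU devices visible to TensorFlow,
    or None if TensorFlow is not installed."""
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        return None

    # Keep only entries whose device_type is exactly "GPU".
    return [
        (device.name, device.physical_device_desc)
        for device in device_lib.list_local_devices()
        if device.device_type == 'GPU'
    ]


if __name__ == '__main__':
    gpus = list_gpu_devices()
    if gpus:
        for name, description in gpus:
            print(name, '->', description)
    else:
        print('No GPU devices visible to TensorFlow')
```

An empty list means TensorFlow imported correctly but found no GPU device.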

Fourth answer

I think there is an easier way to achieve this.

import tensorflow as tf
if tf.test.gpu_device_name():
    print('Default GPU Device: {}'.format(tf.test.gpu_device_name()))
else:
    print("Please install GPU version of TF")

It usually prints something like:

Default GPU Device: /device:GPU:0

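This seems easier to read to me than the verbose logs, which are cluttered with a lot of extra detail.

Fifth answer

I prefer to use nvidia-smi to monitor GPU usage. If usage goes up significantly when you start your program, that is a strong sign that TensorFlow is using the GPU.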
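
To compare usage before and after starting a program without reading the full table, nvidia-smi can be asked for CSV output. The sketch below uses the --query-gpu and --format options. The function names and dictionary keys are my own, and the sample line in the usage note is hand-written to mirror the numbers in this thread, not captured output.

```python
import shutil
import subprocess

QUERY = [
    'nvidia-smi',
    '--query-gpu=utilization.gpu,memory.used,memory.total',
    '--format=csv,noheader,nounits',
]


def _to_int(field):
    """Convert a CSV field to int, or None for values such as '[N/A]'."""
    field = field.strip()
    return int(field) if field.isdigit() else None


def parse_gpu_stats(csv_text):
    """Parse one 'utilization, used, total' line per GPU into dicts."""
    stats = []
    for line in csv_text.strip().splitlines():
        util, used, total = (_to_int(field) for field in line.split(','))
        stats.append({
            'util_percent': util,
            'memory_used_mib': used,
            'memory_total_mib': total,
        })
    return stats


def read_gpu_stats():
    """Return stats for every GPU, or None if nvidia-smi is not on PATH."""
    if shutil.which('nvidia-smi') is None:
        return None
    output = subprocess.check_output(QUERY, universal_newlines=True)
    return parse_gpu_stats(output)


if __name__ == '__main__':
    print(read_gpu_stats())
```

Calling read_gpu_stats() once before and once after the TensorFlow program starts gives two snapshots to compare. For a line such as "0, 5817, 6075", parse_gpu_stats reports 0 percent utilization with 5817 of 6075 MiB in use.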

Sixth answer

This should give the list of devices available to TensorFlow (under Python 3.6):

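import tensorflow as tf

# Use a separate name for the session so the tf module is not shadowed
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
print(sess.list_devices())
# Example entry: _DeviceAttributes(/job:localhost/replica:0/task:0/device:CPU:0, CPU, 268435456)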

Seventh answer

I find that querying the GPU from the command line is the simplest approach:

nvidia-smi

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.98                 Driver Version: 384.98                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 980 Ti  Off  | 00000000:02:00.0  On |                  N/A |
| 22%   33C    P8    13W / 250W |   5817MiB /  6075MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1060      G   /usr/lib/xorg/Xorg                            53MiB |
|    0     25177      C   python                                      5751MiB |
+-----------------------------------------------------------------------------+

If your training runs as a background process, the PID from jobs -p should match the PID shown by nvidia-smi.
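
The same PID match can be checked from inside the Python process itself. This is a rough sketch that assumes nvidia-smi accepts --query-compute-apps=pid,used_memory. The helper names are hypothetical, and only compute processes (type C in the table above) are expected to appear in this query.

```python
import os
import shutil
import subprocess


def parse_compute_pids(csv_text):
    """Map each PID to its GPU memory use in MiB (None if not reported)."""
    usage = {}
    for line in csv_text.strip().splitlines():
        pid, used = (field.strip() for field in line.split(','))
        usage[int(pid)] = int(used) if used.isdigit() else None
    return usage


def gpu_memory_for_pid(pid=None):
    """Return GPU memory (MiB) held by `pid` (default: this process).

    Returns None if nvidia-smi is unavailable or the PID is not listed.
    """
    if shutil.which('nvidia-smi') is None:
        return None
    output = subprocess.check_output(
        ['nvidia-smi',
         '--query-compute-apps=pid,used_memory',
         '--format=csv,noheader,nounits'],
        universal_newlines=True,
    )
    target = os.getpid() if pid is None else pid
    return parse_compute_pids(output).get(target)


if __name__ == '__main__':
    print('GPU memory held by this process (MiB):', gpu_memory_for_pid())
```

Run after TensorFlow has created a session in the same process, a non-None result would indicate that this process holds GPU memory.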

References

How to tell if tensorflow is using gpu acceleration from inside python shell?