
TensorFlow Serving: Installation and Usage


0 Background

Google open-sourced TensorFlow Serving in February 2016. This component takes a model trained with TensorFlow, exports it, and deploys it as a service that offers prediction through RESTful/RPC interfaces. With it, TensorFlow covers the full applied machine-learning workflow: training a model and tuning its parameters, packaging the model, and finally deploying it as a service, making it a genuinely complete research-to-production framework. TensorFlow Serving is a high-performance machine-learning serving system designed for production environments. It can run several large-scale deep-learning models at the same time, supports model lifecycle management and algorithm experiments, and uses GPU resources efficiently, so that models trained with TensorFlow can be put into real production quickly and conveniently.

TF Serving can currently be installed in three ways: Docker, APT (binary install), and building from source. For real production deployments and for simplicity, Docker is the recommended option; it is also the easiest way to give TensorFlow Serving GPU support, so the Docker installation is covered first.

1 Installing Docker

To use Docker with GPUs, Docker 19.03 or later is required. First, remove any old Docker versions:

sudo apt-get remove docker docker-engine docker.io containerd runc

It does not matter if apt reports that these packages are not installed. Next, update the apt package index:

sudo apt-get update

Install the dependencies:

sudo apt-get install \
    apt-transport-https \
    ca-certificates \
    curl \
    gnupg-agent \
    software-properties-common

Add Docker's official GPG key and verify the fingerprint:

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -

sudo apt-key fingerprint 0EBFCD88

Add the repository:

sudo add-apt-repository \
   "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
   $(lsb_release -cs) \
   stable"

Update the package index and install Docker:

sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io

Verify the installation:

sudo docker run hello-world

Check the version:

sudo docker -v

Output such as Docker version 19.03.1 indicates a successful installation.

Then configure GPU support:

# Add the package repositories
$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
$ curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
$ curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list

$ sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
$ sudo systemctl restart docker

At runtime, however, this failed with the error requirement error: unsatisfied condition: brand = tesla. Some searching showed it was a CUDA version problem: my server was using CUDA 9.0 by default, while 10.1 was required, so I installed and switched to CUDA 10.1 (for the switching procedure, see my earlier post on switching CUDA/cuDNN versions on Linux).

TF Serving clients and servers can communicate in two ways, gRPC and the RESTful API. The following sections show how to use each of them.

2 The RESTful API approach

Pull the image. A specific version can be chosen; here we pick a GPU-enabled image:

docker pull tensorflow/serving:1.12.3-gpu

Then download the demo model (it ships with the serving repository):

mkdir -p /tmp/tfserving
cd /tmp/tfserving
git clone https://github.com/tensorflow/serving

Run the TensorFlow Serving container, point it at this model, and expose the REST API port (8501):

docker run --runtime=nvidia -p 8501:8501 --mount type=bind,source=$(pwd)/serving/tensorflow_serving/servables/tensorflow/testdata/saved_model_half_plus_two_gpu,target=/models/half_plus_two -e MODEL_NAME=half_plus_two -t tensorflow/serving:1.12.3-gpu &

This runs the Docker container with the nvidia runtime so that the GPU is used, starts the TensorFlow Serving model server, binds the REST API port 8501 (the gRPC port is 8500), and maps the model we need from the host (source) to the location inside the container where models are expected (target). The model name is also passed in as an environment variable, which matters when querying the model. Tip: before querying the model, be sure to wait until a message like the one below appears, indicating that the server is ready to receive requests:

2018-07-27 00:07:20.773693: I tensorflow_serving/model_servers/main.cc:333]
Exporting HTTP/REST API at:localhost:8501 ...
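Instead of watching the log, readiness can also be checked programmatically. The sketch below is my own addition, not part of the original walkthrough: it polls the model status REST endpoint, GET /v1/models/<name>, which recent Serving images expose (it may not exist in very old versions), and assumes the requests package is installed.

import time
import requests

# Poll the model status endpoint until the "half_plus_two" model reports AVAILABLE.
# Assumes the container above is listening on localhost:8501.
status_url = "http://localhost:8501/v1/models/half_plus_two"

for _ in range(30):
    try:
        states = [v["state"] for v in requests.get(status_url).json()["model_version_status"]]
        if "AVAILABLE" in states:
            print("model is ready")
            break
    except requests.ConnectionError:
        pass  # server not up yet
    time.sleep(1)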

Open a new terminal and simulate a client query:

curl -d '{"instances": [1.0, 2.0, 5.0]}' \
  -X POST http://localhost:8501/v1/models/half_plus_two:predict

which returns:

{ "predictions": [2.5, 3.0, 4.5] }

Running this on a machine without a GPU fails with:

Cannot assign a device for operation 'a': Operation was explicitly assigned to /device:GPU:0

3 The gRPC approach

Next, a handwritten-digit recognition (MNIST) model is used as an example to walk through the TF Serving workflow.

First, train a model. To train on the GPU, edit mnist_saved_model.py and add the following line near the top of the file:

os.environ['CUDA_VISIBLE_DEVICES'] = '0'

Create an mnist directory, then run the script so that the resulting files are saved into it:

cd serving
mkdir mnist
python tensorflow_serving/example/mnist_saved_model.py mnist

The output looks like this:

Training model...
Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Extracting /tmp/train-images-idx3-ubyte.gz
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Extracting /tmp/train-labels-idx1-ubyte.gz
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Extracting /tmp/t10k-images-idx3-ubyte.gz
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting /tmp/t10k-labels-idx1-ubyte.gz
2019-09-18 11:24:11.010872: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-09-18 11:24:11.614926: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1392] Found device 0 with properties: 
name: TITAN V major: 7 minor: 0 memoryClockRate(GHz): 1.455
pciBusID: 0000:06:00.0
totalMemory: 11.78GiB freeMemory: 11.36GiB
2019-09-18 11:24:11.614978: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2019-09-18 11:24:15.718308: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-09-18 11:24:15.718387: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2019-09-18 11:24:15.718412: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2019-09-18 11:24:15.721216: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10974 MB memory) -> physical GPU (device: 0, name: TITAN V, pci bus id: 0000:06:00.0, compute capability: 7.0)
training accuracy 0.9092
Done training!
Exporting trained model to b'mnist/1'
Done exporting!

After training, the model files are generated under the mnist/1 directory:

mnist
└── 1
    ├── saved_model.pb
    └── variables
        ├── variables.data-00000-of-00001
        └── variables.index

Here, saved_model.pb is the serialized tensorflow::SavedModel; it contains one or more graph definitions of the model as well as the model's metadata, such as its signatures. variables is the directory containing the serialized variables of the graph.
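To double-check what the export actually contains before serving it, the SavedModel can be loaded and its signatures listed. The following is only a sketch, assuming TensorFlow 1.x as used in this post; the signature and tensor names it prints are what a gRPC client later has to reference.

import tensorflow as tf

# Load the exported SavedModel with the "serve" tag and list its signatures.
with tf.Session(graph=tf.Graph()) as sess:
    meta_graph = tf.saved_model.loader.load(
        sess, [tf.saved_model.tag_constants.SERVING], "mnist/1")
    for name, sig in meta_graph.signature_def.items():
        inputs = {k: v.name for k, v in sig.inputs.items()}
        outputs = {k: v.name for k, v in sig.outputs.items()}
        print(name, inputs, "->", outputs)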

Start the service.

CPU version:

docker run -p 8500:8500 --mount type=bind,source=$(pwd)/mnist,target=/models/mnist -e MODEL_NAME=mnist -t tensorflow/serving

GPU version:

docker run --runtime=nvidia -p 8500:8500 --mount type=bind,source=$(pwd)/mnist,target=/models/mnist -e MODEL_NAME=mnist -t tensorflow/serving:1.12.3-gpu

Verify with the client:

python tensorflow_serving/example/mnist_client.py --num_tests=1000 --server=127.0.0.1:8500

The output looks like this:

Extracting /tmp/train-images-idx3-ubyte.gz
Extracting /tmp/train-labels-idx1-ubyte.gz
Extracting /tmp/t10k-images-idx3-ubyte.gz
Extracting /tmp/t10k-labels-idx1-ubyte.gz
W0918 12:49:20.635714 140358698280704 lazy_loader.py:50] 
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.

........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
Inference error rate: 10.4%
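mnist_client.py wraps all of the above, but for reference, a stripped-down gRPC request looks roughly like the following. This is only a sketch, not a drop-in replacement for the example client: it assumes the tensorflow-serving-api and grpcio packages are installed, and that the served model uses the signature exported by mnist_saved_model.py (signature predict_images, input images, output scores).

import grpc
import numpy as np
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

# Connect to the gRPC port exposed by the container (8500).
channel = grpc.insecure_channel("127.0.0.1:8500")
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

# Build a PredictRequest against the "mnist" model.
request = predict_pb2.PredictRequest()
request.model_spec.name = "mnist"
request.model_spec.signature_name = "predict_images"

# A dummy all-zero image; a real client would send a normalized 28x28 image
# flattened to 784 floats.
image = np.zeros((1, 784), dtype=np.float32)
request.inputs["images"].CopyFrom(
    tf.contrib.util.make_tensor_proto(image, shape=[1, 784]))

# Call Predict with a 10-second deadline and print the class scores.
result = stub.Predict(request, 10.0)
print(result.outputs["scores"])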

 


Reposted from: https://blog.csdn.net/zong596568821xp/article/details/99715005