
TensorFlow Serving: Installation and Usage


0 Background

Google open-sourced TensorFlow Serving in February 2016. This component takes a model trained with TensorFlow, exports it, and deploys it as a service that offers prediction through RESTful/RPC interfaces. With it, TensorFlow covers the full applied machine-learning workflow: training a model and tuning its parameters, packaging the model, and finally deploying it as a service, making it a genuinely complete research-to-production framework. TensorFlow Serving is a high-performance machine-learning serving system designed for production environments. It can run several large-scale deep-learning models at the same time, supports model lifecycle management and algorithm experiments, and uses GPU resources efficiently, so that models trained with TensorFlow can be put into real production quickly and conveniently.

TF Serving can currently be installed in three ways: Docker, APT (binary install), and building from source. For real production deployments and for simplicity, Docker is the recommended option; it is also the easiest way to give TensorFlow Serving GPU support, so the Docker installation is covered first.

1 Installing Docker

To use Docker with GPUs, Docker 19.03 or later is required. First, remove any old Docker versions:

sudo apt-get remove docker docker-engine docker.io containerd runc

It does not matter if apt reports that these packages are not installed. Next, update the apt package index:

sudo apt-get update

Install the dependencies:

sudo apt-get install \
    apt-transport-https \
    ca-certificates \
    curl \
    gnupg-agent \
    software-properties-common

Add Docker's official GPG key and verify the fingerprint:

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -

sudo apt-key fingerprint 0EBFCD88

Add the repository:

sudo add-apt-repository \
   "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
   $(lsb_release -cs) \
   stable"

Update the package index and install Docker:

sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io

Verify the installation:

sudo docker run hello-world

Check the version:

sudo docker -v

Output such as Docker version 19.03.1 indicates a successful installation.

Then configure GPU support:

# Add the package repositories
$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
$ curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
$ curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list

$ sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
$ sudo systemctl restart docker

At runtime, however, this failed with the error requirement error: unsatisfied condition: brand = tesla. Some searching showed it was a CUDA version problem: my server was using CUDA 9.0 by default, while 10.1 was required, so I installed and switched to CUDA 10.1 (for the switching procedure, see my earlier post on switching CUDA/cuDNN versions on Linux).

TF Serving clients and servers can communicate in two ways, gRPC and the RESTful API. The following sections show how to use each of them.

2 The RESTful API approach

Pull the image. A specific version can be chosen; here we pick a GPU-enabled image:

docker pull tensorflow/serving:1.12.3-gpu

Then download the demo model (it ships with the serving repository):

mkdir -p /tmp/tfserving
cd /tmp/tfserving
git clone https://github.com/tensorflow/serving

Run the TensorFlow Serving container, point it at this model, and expose the REST API port (8501):

docker run --runtime=nvidia -p 8501:8501 --mount type=bind,source=$(pwd)/serving/tensorflow_serving/servables/tensorflow/testdata/saved_model_half_plus_two_gpu,target=/models/half_plus_two -e MODEL_NAME=half_plus_two -t tensorflow/serving:1.12.3-gpu &

This runs the Docker container with the nvidia runtime so that the GPU is used, starts the TensorFlow Serving model server, binds the REST API port 8501 (the gRPC port is 8500), and maps the model we need from the host (source) to the location inside the container where models are expected (target). The model name is also passed in as an environment variable, which matters when querying the model. Tip: before querying the model, be sure to wait until a message like the one below appears, indicating that the server is ready to receive requests:

2018-07-27 00:07:20.773693: I tensorflow_serving/model_servers/main.cc:333]
Exporting HTTP/REST API at:localhost:8501 ...
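Instead of watching the log, readiness can also be checked programmatically. The sketch below is my own addition, not part of the original walkthrough: it polls the model status REST endpoint, GET /v1/models/<name>, which recent Serving images expose (it may not exist in very old versions), and assumes the requests package is installed.

import time
import requests

# Poll the model status endpoint until the "half_plus_two" model reports AVAILABLE.
# Assumes the container above is listening on localhost:8501.
status_url = "http://localhost:8501/v1/models/half_plus_two"

for _ in range(30):
    try:
        states = [v["state"] for v in requests.get(status_url).json()["model_version_status"]]
        if "AVAILABLE" in states:
            print("model is ready")
            break
    except requests.ConnectionError:
        pass  # server not up yet
    time.sleep(1)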

Open a new terminal and simulate a client query:

curl -d '{"instances": [1.0, 2.0, 5.0]}' \
  -X POST http://localhost:8501/v1/models/half_plus_two:predict

which returns:

{ "predictions": [2.5, 3.0, 4.5] }

Running this on a machine without a GPU fails with:

Cannot assign a device for operation 'a': Operation was explicitly assigned to /device:GPU:0

3 The gRPC approach

Next, a handwritten-digit recognition (MNIST) model is used as an example to walk through the TF Serving workflow.

First, train a model. To train on the GPU, edit mnist_saved_model.py and add the following line near the top of the file:

os.environ['CUDA_VISIBLE_DEVICES'] = '0'

Create an mnist directory, then run the script so that the resulting files are saved into it:

cd serving
mkdir mnist
python tensorflow_serving/example/mnist_saved_model.py mnist

The output looks like this:

Training model...
Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Extracting /tmp/train-images-idx3-ubyte.gz
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Extracting /tmp/train-labels-idx1-ubyte.gz
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Extracting /tmp/t10k-images-idx3-ubyte.gz
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting /tmp/t10k-labels-idx1-ubyte.gz
2019-09-18 11:24:11.010872: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-09-18 11:24:11.614926: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1392] Found device 0 with properties: 
name: TITAN V major: 7 minor: 0 memoryClockRate(GHz): 1.455
pciBusID: 0000:06:00.0
totalMemory: 11.78GiB freeMemory: 11.36GiB
2019-09-18 11:24:11.614978: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2019-09-18 11:24:15.718308: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-09-18 11:24:15.718387: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2019-09-18 11:24:15.718412: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2019-09-18 11:24:15.721216: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10974 MB memory) -> physical GPU (device: 0, name: TITAN V, pci bus id: 0000:06:00.0, compute capability: 7.0)
training accuracy 0.9092
Done training!
Exporting trained model to b'mnist/1'
Done exporting!

After training, the model files are generated under the mnist/1 directory:

mnist
└── 1
    ├── saved_model.pb
    └── variables
        ├── variables.data-00000-of-00001
        └── variables.index

Here, saved_model.pb is the serialized tensorflow::SavedModel; it contains one or more graph definitions of the model as well as the model's metadata, such as its signatures. variables is the directory containing the serialized variables of the graph.
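To double-check what the export actually contains before serving it, the SavedModel can be loaded and its signatures listed. The following is only a sketch, assuming TensorFlow 1.x as used in this post; the signature and tensor names it prints are what a gRPC client later has to reference.

import tensorflow as tf

# Load the exported SavedModel with the "serve" tag and list its signatures.
with tf.Session(graph=tf.Graph()) as sess:
    meta_graph = tf.saved_model.loader.load(
        sess, [tf.saved_model.tag_constants.SERVING], "mnist/1")
    for name, sig in meta_graph.signature_def.items():
        inputs = {k: v.name for k, v in sig.inputs.items()}
        outputs = {k: v.name for k, v in sig.outputs.items()}
        print(name, inputs, "->", outputs)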

Start the service.

CPU version:

docker run -p 8500:8500 --mount type=bind,source=$(pwd)/mnist,target=/models/mnist -e MODEL_NAME=mnist -t tensorflow/serving

GPU version:

docker run --runtime=nvidia -p 8500:8500 --mount type=bind,source=$(pwd)/mnist,target=/models/mnist -e MODEL_NAME=mnist -t tensorflow/serving:1.12.3-gpu

Verify with the client:

python tensorflow_serving/example/mnist_client.py --num_tests=1000 --server=127.0.0.1:8500

The output looks like this:

Extracting /tmp/train-images-idx3-ubyte.gz
Extracting /tmp/train-labels-idx1-ubyte.gz
Extracting /tmp/t10k-images-idx3-ubyte.gz
Extracting /tmp/t10k-labels-idx1-ubyte.gz
W0918 12:49:20.635714 140358698280704 lazy_loader.py:50] 
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.

........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
Inference error rate: 10.4%
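mnist_client.py wraps all of the above, but for reference, a stripped-down gRPC request looks roughly like the following. This is only a sketch, not a drop-in replacement for the example client: it assumes the tensorflow-serving-api and grpcio packages are installed, and that the served model uses the signature exported by mnist_saved_model.py (signature predict_images, input images, output scores).

import grpc
import numpy as np
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

# Connect to the gRPC port exposed by the container (8500).
channel = grpc.insecure_channel("127.0.0.1:8500")
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

# Build a PredictRequest against the "mnist" model.
request = predict_pb2.PredictRequest()
request.model_spec.name = "mnist"
request.model_spec.signature_name = "predict_images"

# A dummy all-zero image; a real client would send a normalized 28x28 image
# flattened to 784 floats.
image = np.zeros((1, 784), dtype=np.float32)
request.inputs["images"].CopyFrom(
    tf.contrib.util.make_tensor_proto(image, shape=[1, 784]))

# Call Predict with a 10-second deadline and print the class scores.
result = stub.Predict(request, 10.0)
print(result.outputs["scores"])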

 


Reposted from: https://blog.csdn.net/zong596568821xp/article/details/99715005