TF Serving has been around for several years and is pretty mature (it predates TFX). You can bake the model into the container, have the container pull the model from blob storage, or skip the container entirely if you really want.
It's too bad you didn't find a good guide - if you have the training job dump a SavedModel at the end, you can have a production-quality serving microservice up and running in about two lines of code - https://www.tensorflow.org/tfx/serving/docker.
But it doesn't really matter since you got it working.
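For anyone else reading: the "two lines" from that docker guide look roughly like this (sketch only - assumes your SavedModel was exported to ./my_model/1/ and the model name "my_model" is just a placeholder):

```shell
# Pull the official TF Serving image
docker pull tensorflow/serving

# Mount the exported SavedModel directory and expose the REST port.
# TF Serving expects /models/<name>/<version>/ on the container side,
# so ./my_model should contain a numeric version subdirectory like 1/.
docker run -p 8501:8501 \
  --mount type=bind,source="$(pwd)/my_model",target=/models/my_model \
  -e MODEL_NAME=my_model -t tensorflow/serving
```

Once it's up, you can hit the REST API at http://localhost:8501/v1/models/my_model:predict with a JSON body of instances.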