# Version Matching
In general, version mismatch should not be a problem if you use up-to-date versions of Python libraries and KServe. However, significant differences between the training environment and KServe's serving runtime can lead to compatibility issues, sometimes requiring more careful version matching.
!!! tip
    If you encounter version conflicts, train your model with the same library versions as those specified in the image referenced by KServe's `ClusterServingRuntime` custom resource. Alternatively, you can try downgrading the `ClusterServingRuntime` image to match your training environment. See the recipes below for how to find out these versions.
## Why we might need matching versions
A typical version conflict shows up in the `InferenceService` status, e.g.:
```yaml
modelStatus:
  lastFailureInfo:
    exitCode: 1
    message: |
      INFO:root:Copying contents of /mnt/models to local
      Traceback (most recent call last):
        ...
        File "/prod_venv/lib/python3.9/site-packages/joblib/numpy_pickle.py", line 152, in read_array
          array = pickle.load(unpickler.file_handle)
      ModuleNotFoundError: No module named 'numpy._core'
    reason: ModelLoadFailed
```
This happens because Python serialization tools such as `pickle` or `joblib` store references to internal module paths rather than the code itself. If a model is saved in an environment with one version of a library (e.g. `numpy`), but loaded in another where the internal module layout has changed, deserialization can fail. In the log above, the model was saved with NumPy 2.x, which uses the `numpy._core` package, and loaded in a runtime shipping NumPy 1.x, where that package does not exist.
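A minimal sketch illustrates the mechanism (the `LinearModel` class here is a hypothetical stand-in for any library-defined model class): the pickle stream records the import path of the class, not its definition, so unpickling breaks if that path no longer exists.

```python
import pickle

class LinearModel:
    """Hypothetical stand-in for a library-defined model class."""
    def __init__(self, coef):
        self.coef = coef

blob = pickle.dumps(LinearModel([0.5, 1.2]))

# The stream stores a reference like "__main__.LinearModel"; unpickling
# re-imports that exact path, so it fails with ModuleNotFoundError if the
# module layout changed between save and load environments.
assert b"LinearModel" in blob

restored = pickle.loads(blob)
print(restored.coef)  # [0.5, 1.2]
```

The same applies to the internal classes that `numpy` or `scikit-learn` objects reference, which is why matching library versions between training and serving matters.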
## How to find out the cluster's KServe version
The KServe version corresponds to the tag of the `kserve/kserve-controller-manager` image in the `kserve-controller-manager` Deployment. You can check it using `kubectl`:

```shell
kubectl get deployment kserve-controller-manager -n kubeflow -o jsonpath="{.spec.template.spec.containers[?(@.name=='manager')].image}" | cut -d':' -f2
```
If you don't have access to the `kubeflow` namespace, there is a hacky way to get the KServe version from your `InferenceService`: the `storage-initializer` init container carries the same tag. Run the following against a pod in your namespace that corresponds to an `InferenceService` you deployed:

```shell
kubectl get pod <your-inference-service-pod-name> -n <your-namespace> -o jsonpath="{.spec.initContainers[?(@.name=='storage-initializer')].image}" | cut -d':' -f2
```
## How to find out matching Python and library versions
KServe's runtime library versions are defined in the `ClusterServingRuntime` custom resource. The runtimes' image tags correspond to the KServe version (see above). For some common runtimes you can quickly fetch the library versions with the following scripts (define your KServe tag first, e.g. `export KSERVE_TAG=v0.11.1`):

- sklearn:

  ```shell
  wget -qO- https://raw.githubusercontent.com/kserve/kserve/$KSERVE_TAG/python/sklearnserver/pyproject.toml | awk '/\[tool.poetry.dependencies\]/,/^$/{if($0 !~ /^\[.*\]/) print}'
  ```

- huggingface:

  ```shell
  wget -qO- https://raw.githubusercontent.com/kserve/kserve/$KSERVE_TAG/python/huggingface/pyproject.toml | awk '/\[tool.poetry.dependencies\]/,/^$/{if($0 !~ /^\[.*\]/) print}'
  ```
In the general case, you need to look into the source code to see how the `ClusterServingRuntime` image was built. E.g., if you run a scikit-learn model and your KServe version is 0.11.1, you would do the following:

- Go to the KServe repo.
- Select the corresponding tag in releases (0.11.1 in this example).
- Navigate to the `install/<your-kserve-version>/kserve-runtimes.yaml` file and see which image corresponds to your runtime (in our case `kserve/sklearnserver:v0.11.1`).
- If the image is provided by KServe, navigate to the `python` directory and find the directory corresponding to the desired `ClusterServingRuntime` (it typically has "server" in its name). In our case it is `sklearnserver`.
- Open `pyproject.toml` in this directory and see which library versions are specified there. For our example it is `scikit-learn = "~1.3.0"`.

If the image is not provided by KServe, you will need to check the upstream repo instead.
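Once you have the runtime's pins, you can compare them against your local training environment. A rough sketch follows; the `version_matches` helper is our own simplification that only handles `~X.Y.Z`-style pins (full Poetry constraint semantics, e.g. `^` caret ranges, would need a proper resolver), and the pin values are example values:

```python
from importlib import metadata

def version_matches(local: str, pin: str) -> bool:
    """Crude check: does the local version share major.minor with a
    '~X.Y.Z'-style pin? Not a full Poetry constraint resolver."""
    wanted = pin.lstrip("~^").split(".")[:2]
    return local.split(".")[:2] == wanted

# Pins you extracted from the runtime's pyproject.toml (example values):
runtime_pins = {"scikit-learn": "~1.3.0", "joblib": "~1.3.2"}

for pkg, pin in runtime_pins.items():
    try:
        local = metadata.version(pkg)
    except metadata.PackageNotFoundError:
        print(f"{pkg}: not installed in this environment")
        continue
    verdict = "OK" if version_matches(local, pin) else "MISMATCH"
    print(f"{pkg}: local {local} vs pin {pin} -> {verdict}")
```

A `MISMATCH` line flags a library worth re-pinning in your training environment before serializing the model.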