Running JupyterHub With Istio Service Mesh on Kubernetes — A Troubleshooting Journey
JupyterHub is an open-source tool that spins up Jupyter notebook servers on demand. The notebooks can be used for data analysis or to create and execute machine learning models. Istio is a service mesh that offers a secure and observable communication mechanism between services in a Kubernetes cluster.
One of the benefits of running JupyterHub in an istio-enabled cluster is gaining support for mTLS (mutual TLS) between the different JupyterHub components. mTLS ensures that all communication between the hub and the user notebook servers is encrypted and safe from eavesdropping. This capability has been requested by many users in the JupyterHub community.
To follow along on this journey, it is important to know the basic component interactions in JupyterHub:
- The Hub configures the proxy by calling proxy-api
- The proxy forwards all requests to the Hub by default
- The Hub handles login, and spawns single-user notebook servers on demand
- The Hub configures the proxy to forward url prefixes to single-user notebook servers
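For example, when a single-user server starts, the hub registers a route with the proxy over the proxy-api's small REST interface. The request has roughly this shape (the user path, target address, and token value here are illustrative, not taken from this setup):

```
POST /api/routes/user/alice HTTP/1.1
Host: proxy-api:8001
Authorization: token <configured-secret-token>

{"target": "http://10.1.0.20:8888"}
```

The GET /api/routes calls visible later in the sidecar logs are the hub polling this same interface.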
Setup
— Install istio
$ istioctl install --set profile=demo
— Install JupyterHub
Create the jupyterhub namespace to install JupyterHub in. Set the istio-injection label to configure automatic injection of the istio-proxy sidecar into pods that start in the namespace. Set the mTLS mode for all services in the namespace.
$ kubectl create ns jupyterhub
$ kubectl label namespace jupyterhub istio-injection=enabled
$ kubectl apply -n jupyterhub -f - <<EOF
apiVersion: "security.istio.io/v1beta1"
kind: "PeerAuthentication"
metadata:
  name: "default"
spec:
  mtls:
    mode: STRICT
EOF
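STRICT enforces mTLS for all traffic to workloads in the namespace. If you are migrating an existing deployment, istio also supports a PERMISSIVE mode (not used in this walkthrough) that accepts both mTLS and plaintext while sidecars roll out:

```yaml
apiVersion: "security.istio.io/v1beta1"
kind: "PeerAuthentication"
metadata:
  name: "default"
spec:
  mtls:
    mode: PERMISSIVE  # accept both mTLS and plaintext during migration
```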
Next, set up the helm chart repository.
$ helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/
$ helm repo update
Set up the config for the helm chart. Note that a plain echo -n does not expand \n escapes in many shells, so printf is more portable here:
$ printf "proxy:\n  secretToken: '%s'\n" "$(openssl rand -hex 32)" > config.yaml
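The generated config.yaml should look like this (the token value will differ, and is truncated here):

```yaml
proxy:
  secretToken: '2a0f…<64 hex characters in total>…'
```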
Install JupyterHub in the jupyterhub namespace:
$ helm template jupyterhub/jupyterhub \
--version=0.9.0 \
--values config.yaml | kubectl -n jupyterhub apply -f -
$ # Not using `helm install` is a personal preference. I prefer qbec for day-to-day use. helm is used here as it is how JupyterHub is packaged for Kubernetes in the community.
Next, we’ll verify the deployment to see if the pods are running. Both the hub and the proxy pods are running as expected.
$ kubectl -n jupyterhub get po
hub-fd88f65b6-6zqb9 2/2 Running 1 5m31s
proxy-98fdbb5fd-bv7nt 2/2 Running 0 5m31s
The 2/2 part shows that there are two containers in the pod: the main container and a sidecar istio-proxy container. Running
$ kubectl -n jupyterhub describe po hub-fd88f65b6-6zqb9
shows that the hub pod has an istio-init container and an istio-proxy sidecar.
Network traffic is routed through the istio-proxy sidecar. To validate, look at the access logs from the sidecar.
$ kubectl -n jupyterhub logs hub-fd88f65b6-6zqb9 -c istio-proxy
[2020-09-15T03:50:42.650Z] "GET /api/routes HTTP/1.1" 200 - "-" "-" 0 87 2 1 "-" "Mozilla/5.0 (compatible; pycurl)" "389f6b9c-c966-96d7-8cc3-2a565f623ccd" "10.106.101.111:8001" "10.1.0.19:8001" outbound|8001||proxy-api.jupyterhub.svc.cluster.local 10.1.0.18:52182 10.106.101.111:8001 10.1.0.18:36004 - default
So far so good; it looks like we have everything we need. But navigating to the proxy-public service results in an unexpected 404 error.
$ kubectl -n jupyterhub port-forward svc/proxy-public 8080:80
Port-forwarding directly to the proxy pod also fails with the same 404 error.
$ kubectl -n jupyterhub port-forward proxy-98fdbb5fd-265xq 8080:8000
Investigation
The 404 turns out to be a little tricky to debug. Everything checks out: the proxy container is able to connect to the hub container via the hub service, and the user request on the port-forward lands on the proxy container. Could this be an issue with istio? Disabling the sidecar injection makes everything work again, magically.
$ kubectl label namespace jupyterhub istio-injection=disabled --overwrite
The port-forward lands at the login page.
Digging deeper requires more networking expertise. Enabling sidecar injection again while tailing the logs of the istio-proxy (which is based on envoy) on the proxy pod shows the event that records the failure.
$ kubectl -n jupyterhub logs proxy-98fdbb5fd-svnzl -c istio-proxy -f
[2020-09-15T04:37:03.884Z] "GET / HTTP/1.1" 404 NR "-" "-" 0 0 0 - "127.0.0.1" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36" "b0bd27fe-90c3-9bfd-8ae7-e716baa9eb6e" "localhost:8080" "-" - - 10.105.254.81:8081 127.0.0.1:0 - -
10.105.254.81:8081 is the destination (the k8s service) where the hub is listening, but there is no corresponding inbound event in the logs of the istio-proxy on the hub pod. The real hint here is the 404 NR, which means “No Route”: the target service is not known to envoy, so envoy drops the outbound request, resulting in a 404 NOT FOUND response. To understand what’s going on, let’s fire up a shell on the proxy, make some tailored requests, and reproduce the issue.
$ kubectl -n jupyterhub exec -it deploy/proxy -c chp -- sh
# curl hub:8081
The proxy sidecar logs show the event with a successful redirect, as expected (a 302 in this case).
[2020-09-15T04:51:18.816Z] "GET / HTTP/1.1" 302 - "-" "-" 0 0 2 1 "-" "curl/7.67.0" "923377f0-4cb1-9c35-8ba9-0c42f8e247da" "hub:8081" "10.1.0.37:8081" outbound|8081||hub.jupyterhub.svc.cluster.local 10.1.0.38:46732 10.105.254.81:8081 10.1.0.38:50312 - default
The hub sidecar log also shows the corresponding inbound event. The X-REQUEST-ID field can be used to track a request across services in an istio service mesh.
[2020-09-15T04:51:18.816Z] "GET / HTTP/1.1" 302 - "-" "-" 0 0 1 1 "-" "curl/7.67.0" "923377f0-4cb1-9c35-8ba9-0c42f8e247da" "hub:8081" "127.0.0.1:8081" inbound|8081||hub.jupyterhub.svc.cluster.local 127.0.0.1:34838 10.1.0.37:8081 10.1.0.38:46732 outbound_.8081_._.hub.jupyterhub.svc.cluster.local default
Now let’s route the request through chp (configurable-http-proxy), the nodejs server that proxies the call. This results in a 404, the same error as the calls from the browser.
# curl localhost:8000 # the entry below is from proxy sidecar logs
[2020-09-15T04:54:19.216Z] "GET / HTTP/1.1" 404 NR "-" "-" 0 0 0 - "127.0.0.1" "curl/7.67.0" "0fcf1f52-a809-9855-bb58-50501a17f694" "localhost:8000" "-" - - 10.105.254.81:8081 127.0.0.1:0 - -
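Envoy's default access-log format places the response flags right after the status code, so failures like this can be spotted quickly. A rough sketch against a sample line (field positions assume the default log format):

```shell
# Status code is field 5 and response flags field 6 in envoy's default access log
log='[2020-09-15T04:54:19.216Z] "GET / HTTP/1.1" 404 NR "-" "-" 0 0 0 -'
echo "$log" | awk '{print $5, $6}'   # prints: 404 NR
```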
Setting the correct Host header on the request works as expected.
# curl localhost:8000 -H "Host: hub:8081" -H "X-REQUEST-ID: test"
[2020-09-15T05:01:35.988Z] "GET / HTTP/1.1" 302 - "-" "-" 0 0 2 2 "127.0.0.1" "curl/7.67.0" "test" "hub:8081" "10.1.0.37:8081" outbound|8081||hub.jupyterhub.svc.cluster.local 10.1.0.38:46732 10.105.254.81:8081 127.0.0.1:0 - default
So, the issue with chp is a mismatch of the Host header on the request: envoy drops the outbound request with an NR, and the proxy cannot reach the hub service when an external request is routed through it. (Some other tools and nodejs snippets used to narrow down the exact issue have been omitted for brevity.)
jupyterhub-istio-proxy to the rescue
While there are ways to hack the current proxy implementation(s), and in some cases use a less secure variant, doing so is largely redundant: istio (the underlying envoy, to be precise) offers first-class support for network proxying. Moreover, the chp proxy becomes a bottleneck as soon as JupyterHub traffic grows; due to its technical limitations it cannot be scaled beyond one pod. jupyterhub-istio-proxy can be used to configure istio to do the actual network routing based on user interactions with JupyterHub. It also offers the horizontally scalable solution needed to run production workloads at scale.
Create an istio gateway to handle ingress into the K8s cluster. The gateway is the entry point for network traffic.
$ kubectl -n jupyterhub apply -f https://gist.githubusercontent.com/harsimranmaan/4315477268fccea65accf8674f5c49ef/raw/0298f3f420365c7e56aedab7949ad39e00ffbcc3/jupyterhub-istio-proxy-gateway.yaml
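The gist defines the gateway; a minimal equivalent looks roughly like this (the port and hosts here are assumptions for this local setup and may differ from the gist's exact contents):

```yaml
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: jupyterhub-gateway
spec:
  selector:
    istio: ingressgateway  # istio's default ingress gateway deployment
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "*"
```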
Remove the proxy-public service as it is no longer needed.
$ kubectl -n jupyterhub delete svc proxy-public
Replace the proxy deployment with the jupyterhub-istio-proxy:
$ kubectl -n jupyterhub apply -f https://gist.githubusercontent.com/harsimranmaan/2e77cf65019439052122b7b89f926686/raw/d800b8c60c2ac10226d549c1fbc6d8d75e8e6142/jupyterhub-istio-proxy.yaml
Once the above config is applied, a new virtual service appears.
$ kubectl -n jupyterhub get vs
jupyterhub-8a5edab282632443219e051e4ade2d1d5bbc671c781051bf1437897cbdfea0f1 [jupyterhub-gateway] [*] 37m
Everything should now work in theory (right?), but there are a few more issues to address. By default, the hub tells the proxy-api (jupyterhub-istio-proxy) to route traffic to its IP instead of its service name. This causes the istio virtual service to be configured with the IP.
Hub log:
[I 2020-09-15 06:28:24.481 JupyterHub proxy:400] Adding default route for Hub: / => http://10.105.254.81:8081
resulting in an invalid VS config:
- destination:
    host: 10.105.254.81.jupyterhub.svc.cluster.local
    port:
      number: 8081
Patch the JupyterHub config to set the JupyterHub.hub_connect_ip property to the service name instead of the IP. The PROXY_PUBLIC_SERVICE_HOST and PROXY_PUBLIC_SERVICE_PORT environment variables are no longer in use and can be set to the external hostname and port (localhost:80 in this setup).
$ kubectl -n jupyterhub get cm/hub-config -o yaml \
    | sed "s/os\.environ\['HUB_SERVICE_HOST'\]/'hub'/g" \
    | sed "s/os\.environ\['PROXY_PUBLIC_SERVICE_HOST'\]/'localhost'/g" \
    | sed "s/os\.environ\['PROXY_PUBLIC_SERVICE_PORT'\]/'80'/g" \
    | kubectl -n jupyterhub apply -f -
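The substitution can be sanity-checked locally on a sample line of hub config (the line here is a simplified stand-in for what the real ConfigMap contains):

```shell
# Stub a single line of the hub config that reads the env var
line="c.JupyterHub.hub_connect_ip = os.environ['HUB_SERVICE_HOST']"
# Apply the same sed substitution used against the ConfigMap
echo "$line" | sed "s/os\.environ\['HUB_SERVICE_HOST'\]/'hub'/g"
# prints: c.JupyterHub.hub_connect_ip = 'hub'
```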
Restart the hub pod to pick up the new config. The service name is now set for the default route:
[I 2020-09-15 07:27:09.068 JupyterHub proxy:400] Adding default route for Hub: / => http://hub:8081
- destination:
    host: hub.jupyterhub.svc.cluster.local
    port:
      number: 8081
Follow the official guide to determine the istio gateway URL. Navigate to http://YOUR_GATEWAY_URL and you’ll see JupyterHub running. TLS termination for web requests is not covered here, but it is fairly straightforward to set up with the istio gateway and is left as an exercise for the reader.
The last missing piece in the puzzle is to ensure that user-notebook servers can be spun up and users can run their favourite notebooks.
This requires patching another JupyterHub component, kubespawner. The details can be found in this PR: https://github.com/jupyterhub/kubespawner/pull/425
Under the hood
jupyterhub-istio-proxy creates an istio virtual service for every route request from the hub. The hub forwards routing requests to jupyterhub-istio-proxy, which sets up the desired destination route and waits for the route to warm up before sending confirmation back to the hub. Once the route is created, the hub redirects the user to their notebook server.
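For illustration, a virtual service generated for a single-user route might look roughly like this (the route path, destination service, and name are assumptions for the sketch, not verbatim output from the proxy):

```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: jupyterhub-<hash-of-route>  # placeholder; the proxy derives the name from the route
spec:
  gateways:
  - jupyterhub-gateway
  hosts:
  - "*"
  http:
  - match:
    - uri:
        prefix: /user/alice/   # hypothetical user route registered by the hub
    route:
    - destination:
        host: jupyter-alice.jupyterhub.svc.cluster.local  # hypothetical per-user target
        port:
          number: 8888
```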
If you have questions or would like to contribute to the development of jupyterhub-istio-proxy, drop a note or send in your contributions at https://github.com/splunk/jupyterhub-istio-proxy/issues