How To Use Serverless In A K8s Cluster

Once you have accessed the panel and obtained the kubeconfig of your tenant, you can integrate the serverless GPU services into your private cluster.

Install the Chart

To use our services, you need to install our Chart on your cluster through Helm. To install the related chart, execute the following commands, replacing:

  • {tenant name} with the desired tenant name
  • {path to your tenant kubeconfig} with the absolute path to your kubeconfig

helm repo add clastix https://clastix.github.io/charts
helm repo update

helm upgrade --install k8sgpu clastix/k8sgpu \
  --namespace kube-system \
  --set "k8sgpuOptions.tenantName={tenant name}" \
  --set-file "kubeConfigSecret.content={path to your tenant kubeconfig}"
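
For example, with a hypothetical tenant named acme and a kubeconfig saved at /home/user/acme-kubeconfig.yaml, the command becomes:

helm upgrade --install k8sgpu clastix/k8sgpu \
  --namespace kube-system \
  --set "k8sgpuOptions.tenantName=acme" \
  --set-file "kubeConfigSecret.content=/home/user/acme-kubeconfig.yaml"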

Verify the Presence of the GPU Node

After installing the chart, the k8s.gpu node should appear in your cluster:

kubectl get nodes
NAME      STATUS   ROLES   AGE   VERSION
k8s.gpu   Ready    agent   37m   v1.0.0
...
...
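
You can also inspect the virtual node in more detail, for example to check its labels, taints, and allocatable resources (the exact fields shown depend on the agent version):

kubectl describe node k8s.gpu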

Verify Available RuntimeClass

Once the node is available, you can list all the available RuntimeClasses:

Info

A RuntimeClass is simply a set of GPU resources defined and made available to your Pod.

kubectl get runtimeclasses
NAME                   HANDLER   AGE
seeweb-nvidia-1xa100   nvidia    13h
seeweb-nvidia-1xa30    nvidia    13h
seeweb-nvidia-1xl4     nvidia    13h
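
For reference, a RuntimeClass is a small cluster-scoped object. A minimal sketch of what one of the classes above might look like (the real objects may carry additional scheduling or overhead fields):

apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: seeweb-nvidia-1xa30
handler: nvidia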

Use the RuntimeClass

Once you have selected the RuntimeClass of interest, simply create your Pod, specifying which RuntimeClass you want to use:

cat << EOF | kubectl create -f -
apiVersion: v1
kind: Pod
metadata:
  name: nvidia-smi
spec:
  restartPolicy: OnFailure
  runtimeClassName: seeweb-nvidia-1xa30
  #runtimeClassName: seeweb-nvidia-1xl4
  #runtimeClassName: seeweb-nvidia-1xa100
  containers:
  - name: nvidia
    image: nvidia/cuda:11.0.3-base-ubuntu20.04
    command: ["/bin/bash", "-c", "--"]
    args: ["sleep 3600"]
    imagePullPolicy: Always
  ## The following toleration is required on some distributions, e.g. AKS,
  ## as no CNI is assigned on virtual nodes
  # tolerations:
  # - key: "node.kubernetes.io/network-unavailable"
  #   operator: "Exists"
  #   effect: "NoSchedule"
EOF
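
The same runtimeClassName field also works in the Pod template of higher-level controllers. A minimal Deployment sketch (names are illustrative):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: gpu-worker
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gpu-worker
  template:
    metadata:
      labels:
        app: gpu-worker
    spec:
      runtimeClassName: seeweb-nvidia-1xa30
      containers:
      - name: cuda
        image: nvidia/cuda:11.0.3-base-ubuntu20.04
        command: ["/bin/bash", "-c", "--"]
        args: ["sleep infinity"]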

The Pod should now appear on the virtual GPU node, with access to the GPU specified by the RuntimeClass:

kubectl get pods
NAME         READY   STATUS    RESTARTS   AGE
nvidia-smi   1/1     Running   0          37m
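
To confirm the Pod was actually scheduled on the virtual GPU node, add -o wide to show the NODE column, which should report k8s.gpu:

kubectl get pods -o wide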

You can verify access to the GPU with the following command:

kubectl exec nvidia-smi -- nvidia-smi
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU Name                  Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan Temp   Perf           Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0 NVIDIA A30                       On |   00000000:06:00.0 Off |                    0 |
| N/A 32C    P0                27W / 165W |        0MiB / 24576MiB |       0%     Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
| GPU  GI  CI        PID  Type  Process name                                   GPU Memory |
|      ID  ID                                                                  Usage      |
|=========================================================================================|
| No running processes found                                                              |
+-----------------------------------------------------------------------------------------+
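
As a quick one-shot alternative, you can run the same check without writing a manifest by passing the RuntimeClass through --overrides (a sketch; adjust the RuntimeClass name to the one you selected):

kubectl run gpu-check --rm -it --restart=Never \
  --image=nvidia/cuda:11.0.3-base-ubuntu20.04 \
  --overrides='{"apiVersion": "v1", "spec": {"runtimeClassName": "seeweb-nvidia-1xa30"}}' \
  -- nvidia-smi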

Uninstall the Chart

To uninstall the Chart from your cluster, simply execute the following command:

helm uninstall k8sgpu --namespace kube-system
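
After the release is removed, the virtual k8s.gpu node should disappear from your cluster; you can confirm it is gone with:

kubectl get nodes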

Current Limitations

K8sGPU Agent is a work-in-progress solution. Like many new things, it can always be improved. We kindly ask you to be patient and to provide honest feedback to help us improve it. Currently, the following limitations apply (some will be removed in future versions):

  • Pods cannot mount local storage or CSI PersistentVolumes
  • Pods cannot access other local Kubernetes services
  • Pods can only access S3 storage
  • Pods cannot be exposed on the local cluster
  • Pods can only be exposed on the public Internet and are accessed through an HTTPS endpoint in the form https://<your_tenant_id>.k8sgpu.net (see the example below)
  • Remember to protect exposed endpoints with your own authentication
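
Once a workload is exposed, it is reached with a plain HTTPS request against the tenant endpoint. A minimal check, assuming your application serves a hypothetical /status route:

curl https://<your_tenant_id>.k8sgpu.net/status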