Getting Started

Learn how to use deployKF in production.
Easily deploy the best of Kubeflow and other MLOps tools as a complete platform!


Introduction

This page explains how to use deployKF in production. It covers the requirements, configuration options, deployment process, and basic usage of the platform.

We suggest new users start with the About deployKF and Local Quickstart pages:

  • About deployKF (Introduction)
  • Local Quickstart (Try Locally)

We encourage you to join our community and learn about support options!

  • Join the Community
  • Get Support

For existing Kubeflow users, we have a migration guide:

Migrate from Kubeflow Distributions

1. Requirements

Kubernetes Cluster

deployKF can run on any Kubernetes cluster, in any cloud or environment. See the version matrix for a list of supported Kubernetes versions.

Here are some Kubernetes distributions which are supported by deployKF:

Target Platform | Kubernetes Distribution
Amazon Web Services | Amazon Elastic Kubernetes Service (EKS)
Microsoft Azure | Azure Kubernetes Service (AKS) (see special requirements)
Google Cloud | Google Kubernetes Engine (GKE)
IBM Cloud | IBM Cloud Kubernetes Service (IKS)
Self-Hosted | Rancher (RKE) // kOps // Kubespray // kubeadm
Edge | k3s // k0s // MicroK8s
Local Machine | k3d // Kind // Minikube

Dedicated Cluster

We strongly recommend using a dedicated cluster for deployKF. This is because deployKF has a number of cluster-level dependencies which may conflict with other applications.

If you are unable to create a new Kubernetes cluster, you may consider using vCluster to create a virtual Kubernetes cluster within an existing one.

Kubernetes Configurations

deployKF requires some specific Kubernetes configurations to work correctly.

The following table lists these configurations and their requirements:

Configuration | Requirement
Node Resources | The nodes must collectively have at least 4 vCPUs, 16 GB RAM, and 64 GB storage.
CPU Architecture | The cluster must have x86_64 nodes.
Cluster Domain | The clusterDomain of your kubelet must be "cluster.local".
Service Type | By default, the cluster must support the LoadBalancer service type. Be careful not to expose your platform on the public internet by mistake.
Default StorageClass | The default StorageClass must support the ReadWriteOnce access mode.
Existing Argo Workflows | The cluster must NOT already have Argo Workflows installed. See deployKF/deployKF#116 to join the discussion.
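
Several of the requirements above can be spot-checked with read-only kubectl queries before you deploy. The following is a minimal sketch, assuming kubectl is already configured for the target cluster. The function names are hypothetical helpers and are not part of deployKF. The sketch does not verify that the default StorageClass supports ReadWriteOnce, because that depends on the underlying provisioner.

```shell
#!/bin/sh
# Read-only preflight helpers. The function names are hypothetical and
# are not part of deployKF; each one wraps a single kubectl query.

# Print allocatable CPU and memory per node, to compare against the
# collective minimum of 4 vCPUs and 16 GB RAM.
node_allocatable() {
  kubectl get nodes \
    -o custom-columns='NAME:.metadata.name,CPU:.status.allocatable.cpu,MEMORY:.status.allocatable.memory'
}

# Succeed if at least one node reports amd64 (the Kubernetes name for x86_64).
has_x86_64_nodes() {
  kubectl get nodes -o jsonpath='{.items[*].status.nodeInfo.architecture}' \
    | tr ' ' '\n' | grep -qx 'amd64'
}

# Print the name of each default StorageClass; fail if there is none.
default_storageclass() {
  kubectl get storageclass \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.annotations.storageclass\.kubernetes\.io/is-default-class}{"\n"}{end}' \
    | awk -F '\t' '$2 == "true" { print $1; found = 1 } END { exit (found ? 0 : 1) }'
}

# Succeed if the Argo Workflows CRD is NOT installed on the cluster.
argo_workflows_absent() {
  ! kubectl get crd workflows.argoproj.io >/dev/null 2>&1
}
```

For example, you might run `has_x86_64_nodes && default_storageclass && argo_workflows_absent && echo "preflight OK"`, and inspect the output of `node_allocatable` by hand.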

ARM64 Support

The next minor version of deployKF (v0.2.0) should have native ARM64 support for all core components. However, some upstream apps like Kubeflow Pipelines will need extra work to be production ready (#10309, #10308).

Other Service Types

For real-world usage, you should review the Expose Gateway and configure HTTPS guide.

To use a different service type, you can override the deploykf_core.deploykf_istio_gateway.gatewayService.type value:

deploykf_core:
  deploykf_istio_gateway:
    gatewayService:
      type: "NodePort" # or "ClusterIP"

Default StorageClass

If you do NOT have a compatible default StorageClass, you might consider the following options:

  1. Configure a default StorageClass that has ReadWriteOnce support
  2. Explicitly set the storageClass value for the following components:
  3. Disable components which require the StorageClass, and use external alternatives:
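
For the first option, Kubernetes marks the default StorageClass with the `storageclass.kubernetes.io/is-default-class` annotation. The sketch below assumes a StorageClass with ReadWriteOnce support already exists in your cluster. The function names are hypothetical, and the class name you pass in is your own.

```shell
#!/bin/sh
# Hypothetical helpers that toggle the standard Kubernetes
# "is-default-class" annotation on an existing StorageClass.

# Usage: set_default_storageclass <storageclass-name>
set_default_storageclass() {
  kubectl patch storageclass "$1" \
    -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
}

# Usage: unset_default_storageclass <storageclass-name>
# Use this on the previous default, so only one class is marked default.
unset_default_storageclass() {
  kubectl patch storageclass "$1" \
    -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"false"}}}'
}
```

If another StorageClass is already marked as default, unset it first so that the cluster has exactly one default.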

2. Platform Configuration

deployKF is highly configurable. You can use it to deploy a wide variety of machine learning platforms and integrate with your existing infrastructure.

deployKF Values

All aspects of your deployKF platform are configured with YAML-based configs named "values". See the values page for more information.

deployKF Versions

Each deployKF version may include different ML & Data tools or support different versions of cluster dependencies. See the version matrix for an overview, and the changelog for detailed information, including important tips for upgrading.

How can I get notified about new releases?

Watch the deployKF/deployKF repo on GitHub.
At the top right, click Watch > Custom > Releases, then confirm by selecting Apply.

Cluster Dependencies

deployKF has a number of cluster dependencies including Istio, cert-manager, and Kyverno. See the cluster dependencies page for an overview.

Existing Cluster Dependencies

deployKF installs its own versions of the cluster dependencies by default.
If you have existing versions on the cluster, you MUST configure deployKF to use them:

External Dependencies

deployKF has a number of external dependencies including MySQL and an Object Store. See the external dependencies page for an overview.

Connect External Dependencies

deployKF includes embedded versions of MySQL and MinIO for development and testing.
We strongly recommend connecting external versions for production use:


3. Deploy the Platform

⭐ Create ArgoCD Applications ⭐

deployKF uses ArgoCD to manage the deployment of the platform. The process to create the ArgoCD Applications will depend on which mode of operation you have chosen.

The following steps apply to ArgoCD Plugin Mode.

Step 1 - Install the ArgoCD Plugin

Your ArgoCD must have the deployKF ArgoCD Plugin.

Depending on your situation, there are different ways to install the plugin:

Step 2 - Define an App-of-Apps

Create a local file named deploykf-app-of-apps.yaml with the contents of the YAML below.

This will use deployKF version 0.1.4, read the sample-values.yaml from the deploykf/deploykf repo, and combine those values with the overrides defined in the values parameter.

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: deploykf-app-of-apps
  namespace: argocd
  labels:
    app.kubernetes.io/name: deploykf-app-of-apps
    app.kubernetes.io/part-of: deploykf
spec:

  ## NOTE: if not "default", you MUST ALSO set the `argocd.project` value
  project: "default"

  source:
    ## source git repo configuration
    ##  - we use the 'deploykf/deploykf' repo so we can read its 'sample-values.yaml'
    ##    file, but you may use any repo (even one with no files)
    ##
    repoURL: "https://github.com/deployKF/deployKF.git"
    targetRevision: "v0.1.4"
    path: "."

    ## plugin configuration
    ##
    plugin:
      name: "deploykf"
      parameters:

        ## the deployKF generator version
        ##  - available versions: https://github.com/deployKF/deployKF/releases
        ##
        - name: "source_version"
          string: "0.1.4"

        ## paths to values files within the `repoURL` repository
        ##  - the values in these files are merged, with later files taking precedence
        ##  - we strongly recommend using 'sample-values.yaml' as the base of your values
        ##    so you can easily upgrade to newer versions of deployKF
        ##
        - name: "values_files"
          array:
            - "./sample-values.yaml"

        ## a string containing the contents of a values file
        ##  - this parameter allows defining values without needing to create a file in the repo
        ##  - these values are merged with higher precedence than those defined in `values_files`
        ##
        - name: "values"
          string: |
            ##
            ## This demonstrates how you might structure overrides for the 'sample-values.yaml' file.
            ## For a more comprehensive example, see the 'sample-values-overrides.yaml' in the main repo.
            ##
            ## Notes:
            ##  - YAML maps are RECURSIVELY merged across values files
            ##  - YAML lists are REPLACED in their entirety across values files
            ##  - Do NOT include empty/null sections, as this will remove ALL values from that section.
            ##    To include a section without overriding any values, set it to an empty map: `{}`
            ##

            ## --------------------------------------------------------------------------------
            ##                                      argocd
            ## --------------------------------------------------------------------------------
            argocd:
              namespace: argocd
              project: default

            ## --------------------------------------------------------------------------------
            ##                                    kubernetes
            ## --------------------------------------------------------------------------------
            kubernetes:
              {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

            ## --------------------------------------------------------------------------------
            ##                              deploykf-dependencies
            ## --------------------------------------------------------------------------------
            deploykf_dependencies:

              ## --------------------------------------
              ##             cert-manager
              ## --------------------------------------
              cert_manager:
                {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

              ## --------------------------------------
              ##                 istio
              ## --------------------------------------
              istio:
                {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

              ## --------------------------------------
              ##                kyverno
              ## --------------------------------------
              kyverno:
                {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

            ## --------------------------------------------------------------------------------
            ##                                  deploykf-core
            ## --------------------------------------------------------------------------------
            deploykf_core:

              ## --------------------------------------
              ##             deploykf-auth
              ## --------------------------------------
              deploykf_auth:
                {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

              ## --------------------------------------
              ##        deploykf-istio-gateway
              ## --------------------------------------
              deploykf_istio_gateway:
                {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

              ## --------------------------------------
              ##      deploykf-profiles-generator
              ## --------------------------------------
              deploykf_profiles_generator:
                {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

            ## --------------------------------------------------------------------------------
            ##                                   deploykf-opt
            ## --------------------------------------------------------------------------------
            deploykf_opt:

              ## --------------------------------------
              ##            deploykf-minio
              ## --------------------------------------
              deploykf_minio:
                {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

              ## --------------------------------------
              ##            deploykf-mysql
              ## --------------------------------------
              deploykf_mysql:
                {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

            ## --------------------------------------------------------------------------------
            ##                                  kubeflow-tools
            ## --------------------------------------------------------------------------------
            kubeflow_tools:

              ## --------------------------------------
              ##                 katib
              ## --------------------------------------
              katib:
                {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

              ## --------------------------------------
              ##               notebooks
              ## --------------------------------------
              notebooks:
                {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

              ## --------------------------------------
              ##               pipelines
              ## --------------------------------------
              pipelines:
                {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

  destination:
    server: "https://kubernetes.default.svc"
    namespace: "argocd"

Step 3 - Configure Values

deployKF is configured by centralized values which define the desired state of the platform:


In ArgoCD Plugin Mode, you can define custom values in two ways:

  1. Within the app-of-apps YAML itself, using the values plugin parameter.
  2. From files in the repoURL git repository, using the values_files plugin parameter.

Each version of deployKF has sample values with all supported ML & Data tools enabled, along with some sensible security defaults. We recommend using these samples as a base for your custom values.

If you want to version your values files in git, you may update the spec.source.repoURL of your app-of-apps to any repo you have access to. You will need to push the upstream sample-values.yaml file to your repo. The following command will download the sample-values.yaml file for deployKF 0.1.4:

# download the `sample-values.yaml` file
curl -fL -o "sample-values-0.1.4.yaml" \
  "https://raw.githubusercontent.com/deployKF/deployKF/v0.1.4/sample-values.yaml"

Step 4 - Apply App-of-Apps Resource

If you have not already done so, create a local file named deploykf-app-of-apps.yaml with the contents of the app-of-apps YAML above.

Apply the resource to your cluster with the following command:

kubectl apply -f ./deploykf-app-of-apps.yaml
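
The following steps apply to Manifests Repo Mode, in which the deploykf CLI generates manifests that you commit to a git repo.

Step 1 - Install ArgoCD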

If you have not already installed ArgoCD on your cluster, you will need to do so.

Please see the ArgoCD Getting Started Guide for instructions.

Step 2 - Install the deployKF CLI

If you have not already installed the deploykf CLI on your local machine, you will need to do so.

Please see the CLI Installation Guide for instructions.

Step 3 - Prepare a Git Repo

You will need to create a git repo to store your generated manifests. If your repo is private (recommended), you will need to configure ArgoCD with git credentials so it can access the repo.

Step 4 - Create Values Files

deployKF is configured by centralized values which define the desired state of the platform:


Each version of deployKF has sample values with all supported ML & Data tools enabled, along with some sensible security defaults. We recommend using these samples as a starting point for your custom values.

The following command will download the sample-values.yaml file for deployKF 0.1.4:

# download the `sample-values.yaml` file
curl -fL -o "sample-values-0.1.4.yaml" \
  "https://raw.githubusercontent.com/deployKF/deployKF/v0.1.4/sample-values.yaml"
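
Because the version appears in both the URL and the output file name, it can help to derive both from a single argument when you later move to a newer release. This is a small sketch that follows the URL pattern shown above. The function names are hypothetical and are not part of the deploykf CLI.

```shell
#!/bin/sh
# Hypothetical helpers that build the sample-values URL from a version.

# Usage: sample_values_url <deploykf-version>   (for example: 0.1.4)
sample_values_url() {
  printf 'https://raw.githubusercontent.com/deployKF/deployKF/v%s/sample-values.yaml\n' "$1"
}

# Usage: download_sample_values <deploykf-version>
# Saves the file as "sample-values-<version>.yaml" in the current folder.
download_sample_values() {
  curl -fL -o "sample-values-$1.yaml" "$(sample_values_url "$1")"
}
```

For example, `download_sample_values "0.1.4"` is equivalent to the curl command above.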

To make upgrades easier, we recommend using the sample values as a base, and applying custom override files with only the values you want to change. This will help you swap out the sample values for a newer version in the future.

For example, you might structure your custom-overrides.yaml file like this:

##
## Notes:
##  - YAML maps are RECURSIVELY merged across values files
##  - YAML lists are REPLACED in their entirety across values files
##  - Do NOT include empty/null sections, as this will remove ALL values from that section.
##    To include a section without overriding any values, set it to an empty map: `{}`
##

## --------------------------------------------------------------------------------
##                                      argocd
## --------------------------------------------------------------------------------
argocd:
  namespace: argocd
  project: default

  source:
    ## the git repo where you will store your generated manifests
    ##  - url: the URL of the git repo
    ##  - revision: the git branch/tag/commit to read from
    ##  - path: the repo folder path where the generated manifests are stored
    ##
    repo:
      url: "https://github.com/deployKF/examples.git"
      revision: "main"
      path: "./GENERATOR_OUTPUT/"

## --------------------------------------------------------------------------------
##                                    kubernetes
## --------------------------------------------------------------------------------
kubernetes:
  {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

## --------------------------------------------------------------------------------
##                              deploykf-dependencies
## --------------------------------------------------------------------------------
deploykf_dependencies:

  ## --------------------------------------
  ##             cert-manager
  ## --------------------------------------
  cert_manager:
    {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

  ## --------------------------------------
  ##                 istio
  ## --------------------------------------
  istio:
    {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

  ## --------------------------------------
  ##                kyverno
  ## --------------------------------------
  kyverno:
    {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

## --------------------------------------------------------------------------------
##                                  deploykf-core
## --------------------------------------------------------------------------------
deploykf_core:

  ## --------------------------------------
  ##             deploykf-auth
  ## --------------------------------------
  deploykf_auth:
    {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

  ## --------------------------------------
  ##        deploykf-istio-gateway
  ## --------------------------------------
  deploykf_istio_gateway:
    {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

  ## --------------------------------------
  ##      deploykf-profiles-generator
  ## --------------------------------------
  deploykf_profiles_generator:
    {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

## --------------------------------------------------------------------------------
##                                   deploykf-opt
## --------------------------------------------------------------------------------
deploykf_opt:

  ## --------------------------------------
  ##            deploykf-minio
  ## --------------------------------------
  deploykf_minio:
    {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

  ## --------------------------------------
  ##            deploykf-mysql
  ## --------------------------------------
  deploykf_mysql:
    {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

## --------------------------------------------------------------------------------
##                                  kubeflow-tools
## --------------------------------------------------------------------------------
kubeflow_tools:

  ## --------------------------------------
  ##                 katib
  ## --------------------------------------
  katib:
    {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

  ## --------------------------------------
  ##               notebooks
  ## --------------------------------------
  notebooks:
    {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

  ## --------------------------------------
  ##               pipelines
  ## --------------------------------------
  pipelines:
    {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

Step 5 - Generate Manifests

The deploykf generate command writes generated manifests into a folder, using one or more values files.

The following command will use deployKF version 0.1.4 to generate manifests under ./GENERATOR_OUTPUT/:

deploykf generate \
    --source-version "0.1.4" \
    --values ./sample-values-0.1.4.yaml \
    --values ./custom-overrides.yaml \
    --output-dir ./GENERATOR_OUTPUT

Avoid Manual Changes

Manual changes in the --output-dir will be overwritten each time the deploykf generate command runs. If you find yourself needing to make manual changes, please raise an issue so we may consider adding a new value to support your use-case.

Multiple Values Files

If you specify --values multiple times, they will be merged with later ones taking precedence.
Learn more in the merging values guide.

Step 6 - Commit Generated Manifests

After running deploykf generate, you will need to commit the manifests to your repo, so ArgoCD can apply them to your cluster:

# for example, to directly commit changes to the 'main' branch of your repo
git add GENERATOR_OUTPUT
git commit -m "my commit message"
git push origin main
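
Steps 5 and 6 are usually repeated together whenever your values change. The following is a rough sketch of a wrapper that regenerates the manifests and commits only when something actually changed. It assumes the file names used in the earlier steps (sample-values-<version>.yaml and custom-overrides.yaml) and that you run it from the root of your manifests repo. The function name and commit message are hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper: regenerate manifests, then commit and push them
# only if the generated output differs from what is already committed.
# Usage: regenerate_and_commit <deploykf-version> <branch>
regenerate_and_commit() {
  dkf_version="$1"
  dkf_branch="$2"

  deploykf generate \
    --source-version "$dkf_version" \
    --values "./sample-values-$dkf_version.yaml" \
    --values ./custom-overrides.yaml \
    --output-dir ./GENERATOR_OUTPUT || return 1

  git add GENERATOR_OUTPUT || return 1

  # "git diff --cached --quiet" exits 0 when nothing is staged
  if git diff --cached --quiet -- GENERATOR_OUTPUT; then
    echo "no manifest changes to commit"
    return 0
  fi

  git commit -m "regenerate deployKF $dkf_version manifests" \
    && git push origin "$dkf_branch"
}
```

For example: `regenerate_and_commit "0.1.4" "main"`.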

Step 7 - Apply App-of-Apps Manifest

The only manifest you need to manually apply is the app-of-apps, which creates all the other ArgoCD applications.

The app-of-apps.yaml manifest is generated at the root of your --output-dir folder, so you can apply it with:

kubectl apply --filename GENERATOR_OUTPUT/app-of-apps.yaml

⭐ Sync ArgoCD Applications ⭐

Now that your deployKF app-of-apps has been applied, you must sync the ArgoCD applications to deploy your platform. Syncing an application will cause ArgoCD to reconcile the actual state in the cluster, to match the state defined by the application resource.

Danger

DO NOT sync all the Applications at once!!!

The deployKF Applications depend on each other, so they MUST be synced in the correct order to avoid errors. If you manually sync them all, you may need to uninstall and start over.

There are a few ways to sync the applications. You only need to use ONE of them.

The recommended way to sync the applications is with the automated script.

Step 1 - Run the Sync Script

We provide the sync_argocd_apps.sh script to automatically sync the applications that make up deployKF. Learn more about the automated sync script from the scripts folder README.

For example, to run the script, you might use the following commands:

# clone the deploykf repo
# NOTE: we use 'main', as the latest script always lives there
git clone -b main https://github.com/deployKF/deployKF.git ./deploykf

# ensure the script is executable
chmod +x ./deploykf/scripts/sync_argocd_apps.sh

# run the script
bash ./deploykf/scripts/sync_argocd_apps.sh

About the sync script

  • The script can take around 5-10 minutes to run on first install.
  • If the script fails or is interrupted, you can safely re-run it, and it will pick up where it left off.
  • There are a number of configuration variables at the top of the script which change the default behavior.
  • Learn more about the automated sync script from the scripts folder README in the deployKF repo.

Please be aware of the following issue when using the automated sync script:

Bug in ArgoCD v2.9

There is a known issue (deploykf/deploykf#70, argoproj/argo-cd#16266) with all 2.9.X versions of the ArgoCD CLI that will cause the sync script to fail with the following error:

==========================================================================================
Logging in to ArgoCD...
==========================================================================================
FATA[0000] cannot find pod with selector: [app.kubernetes.io/name=] - use the --{component}-name flag in this command or set the environmental variable (Refer to https://argo-cd.readthedocs.io/en/stable/user-guide/environment-variables), to change the Argo CD component name in the CLI

Please upgrade your argocd CLI to at least version 2.10.0 to resolve this issue.

Alternatively, you can sync the applications manually using the ArgoCD Web UI.

Step 1 - Access the ArgoCD Web UI

For production usage, you may want to expose ArgoCD with a LoadBalancer or Ingress.

For testing, you may use kubectl port-forwarding to expose the ArgoCD Web UI on your local machine:

kubectl port-forward --namespace "argocd" svc/argocd-server 8090:https

The ArgoCD Web UI should now be available at the following URL:

https://localhost:8090


If this is the first time you are using ArgoCD, you will need to retrieve the initial password for the admin user:

# decode the password, then print a trailing newline
kubectl -n argocd get secret/argocd-initial-admin-secret \
  -o jsonpath="{.data.password}" | base64 -d; echo

Once you log in with the admin user and above password, the Web UI should look like this:

[Screenshot: ArgoCD Web UI]

Step 2 - Sync deployKF Applications

You MUST sync the deployKF applications in the correct order. For each application, click the SYNC button, and wait for the application to become "Healthy" before syncing the next.

The applications are grouped and ordered as follows:

Group 0: "app-of-apps"

First, you must sync the app-of-apps application:

  1. deploykf-app-of-apps
  2. deploykf-namespaces (will only appear if using a remote destination)

Group 1: "deploykf-dependencies"

Second, you must sync the applications with the label app.kubernetes.io/component=deploykf-dependencies:

  1. dkf-dep--cert-manager (may fail on first attempt)
  2. dkf-dep--istio
  3. dkf-dep--kyverno

WARNING: for this group, each application MUST be synced INDIVIDUALLY and the preceding application MUST be "Healthy" before syncing the next.

Group 2: "deploykf-core"

Third, you must sync the applications with the label app.kubernetes.io/component=deploykf-core:

  1. dkf-core--deploykf-istio-gateway
  2. dkf-core--deploykf-auth
  3. dkf-core--deploykf-dashboard
  4. dkf-core--deploykf-profiles-generator (may fail on first attempt)

Group 3: "deploykf-opt"

Fourth, you must sync the applications with the label app.kubernetes.io/component=deploykf-opt:

  • dkf-opt--deploykf-minio
  • dkf-opt--deploykf-mysql

Group 4: "deploykf-tools"

Fifth, you must sync the applications with the label app.kubernetes.io/component=deploykf-tools:

  • (none yet)

Group 5: "kubeflow-dependencies"

Sixth, you must sync the applications with the label app.kubernetes.io/component=kubeflow-dependencies:

  • kf-dep--argo-workflows

Group 6: "kubeflow-tools"

Seventh, you must sync the applications with the label app.kubernetes.io/component=kubeflow-tools:

  • kf-tools--katib
  • kf-tools--notebooks--jupyter-web-app
  • kf-tools--notebooks--notebook-controller
  • kf-tools--pipelines
  • kf-tools--poddefaults-webhook
  • kf-tools--tensorboards--tensorboard-controller
  • kf-tools--tensorboards--tensorboards-web-app
  • kf-tools--training-operator
  • kf-tools--volumes--volumes-web-app
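
The same ordering can be expressed with the argocd CLI instead of the Web UI. The following is only an illustration of the "sync one, wait for Healthy, then sync the next" rule, assuming you have already logged in with `argocd login`. The function names are hypothetical, and the automated sync script remains the recommended approach.

```shell
#!/bin/sh
# Hypothetical helpers that sync ArgoCD applications one at a time.

# Sync one application, then block until it reports "Healthy".
# Retries once, because some applications may fail on the first attempt.
sync_and_wait() {
  sync_app="$1"
  sync_attempt=1
  while [ "$sync_attempt" -le 2 ]; do
    if argocd app sync "$sync_app" \
      && argocd app wait "$sync_app" --health --timeout 600; then
      return 0
    fi
    sync_attempt=$((sync_attempt + 1))
  done
  echo "failed to sync $sync_app" >&2
  return 1
}

# Sync the listed applications strictly in order,
# stopping at the first one that does not become healthy.
sync_in_order() {
  for app_name in "$@"; do
    sync_and_wait "$app_name" || return 1
  done
}

# Example for Group 1 (uncomment to run):
# sync_in_order dkf-dep--cert-manager dkf-dep--istio dkf-dep--kyverno
```

Call `sync_in_order` once per group, in the group order listed above, passing the application names for that group.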

4. Use the Platform

Now that you have a working deployKF machine learning platform, here are some things to try out!

⭐ Expose the deployKF Dashboard ⭐

The deployKF dashboard is the web-based interface for deployKF. It gives users authenticated access to tools like Kubeflow Pipelines, Kubeflow Notebooks, and Katib.

[Screenshot: deployKF Dashboard]

All public deployKF services (including the dashboard) are accessed via the deployKF Istio Gateway, so you will need to expose its Kubernetes Service.

Step 1 - Expose the Gateway

You may expose the deployKF Istio Gateway Service in a number of ways:

Step 2 - Log in to the Dashboard

See the authentication guide to define static credentials, or connect deployKF to an external identity provider like Okta or Active Directory.

There are a few default credentials set in the deploykf_core.deploykf_auth.dex.staticPasswords value:

Credentials: User 1

Username: user1@example.com
Password: user1

Credentials: User 2

Username: user2@example.com
Password: user2

Credentials: Admin (DO NOT USE - will be removed in future versions)

Username: admin@example.com
Password: admin

  • This account is the default "owner" of all profiles.
  • This account does NOT have access to "MinIO Console" or "Argo Server UI".
  • We recommend NOT using this account, and actually removing its staticPasswords entry.
  • We recommend leaving this account as the default "owner", even with @example.com as the domain (because profile owners can't be changed).

Step 3 - Explore the Tools

deployKF includes many ML & Data tools that address different stages of the machine learning lifecycle.

Here are a few popular tools to get started with:

We also provide a number of user-focused guides:

User Guide | Description
Access Kubeflow Pipelines API | Learn how to access the Kubeflow Pipelines API from both inside and outside the cluster with the Kubeflow Pipelines SDK.
GitOps for Kubeflow Pipelines Schedules | Learn how to use GitOps to manage Kubeflow Pipelines schedules (rather than manually creating them with the UI or Python SDK).

Next Steps


Last update: 2024-04-17
Created: 2023-04-24