Getting Started

Learn how to use deployKF in production.
Easily deploy the best of Kubeflow and other MLOps tools as a complete platform!


Introduction

This page is about using deployKF in production. We will cover requirements, configuration, the deployment process, and basic usage of the platform.

We suggest new users start with the introduction and local quickstart pages.

  • About deployKF (Introduction)
  • Local Quickstart (Try Locally)

For existing Kubeflow users, we have a migration guide.

Migrate from Kubeflow Distributions

We encourage everyone to join our community and learn how to get support!

  • Join the Community
  • Get Support

1. Requirements

Please ensure you meet the following requirements before using deployKF in production.

Kubernetes Cluster

deployKF can run on any Kubernetes cluster, in any cloud or environment. See the version matrix for a list of supported Kubernetes versions.

For example, deployKF can run on the following Kubernetes distributions:

Target Platform | Kubernetes Distribution
Amazon Web Services | Amazon Elastic Kubernetes Service (EKS)
Microsoft Azure | Azure Kubernetes Service (AKS) (see special requirements)
Google Cloud | Google Kubernetes Engine (GKE)
IBM Cloud | IBM Cloud Kubernetes Service (IKS)
Self-Hosted | Rancher (RKE) // kOps // Kubespray // kubeadm
Edge | k3s // k0s // MicroK8s
Local Machine | k3d // Kind // Minikube

Dedicated Cluster

We strongly recommend using a dedicated cluster for deployKF. This is because deployKF has a number of cluster-level dependencies which may conflict with other applications.

If you are unable to create a new Kubernetes cluster, you may consider using vCluster to create a virtual Kubernetes cluster within an existing one.

Argo CD Dependency

deployKF requires Argo CD for managing the platform.

You may either use deployKF with an existing Argo CD, or deploy a new one if you don't already have it. Both options are covered later in this guide.

Can I use <other tool> instead of Argo CD?

Not yet.

While we believe that Argo CD is currently the best in its category, we recognize that it's not the only option. In the future, we may support other Kubernetes GitOps tools (like Flux CD), or even build a deployKF-specific solution.

deployKF will make your MLOps life so much easier that it's still worth using, even if you don't already love Argo CD. If you want, you can largely treat Argo CD as a "black box" and just use the provided sync scripts to manage the platform.

To learn more about this decision, and participate in the discussion, see deployKF/deployKF#110.

Kubernetes Requirements

Your Kubernetes cluster must meet the following requirements:

Configuration | Requirement | Notes
Node Resources | The nodes must collectively have at least 4 vCPUs, 16 GB RAM, and 64 GB storage. | -
CPU Architecture | The cluster must have x86_64 CPU nodes. | ARM64 Support
Internet Access | The cluster must have internet access for pulling images and installing dependencies. | Offline Clusters
Cluster Domain | The clusterDomain of your kubelet must be "cluster.local". | -
Service Type | By default, the cluster must have a LoadBalancer service type. | Override Service Type
Default StorageClass | The default StorageClass must support the ReadWriteOnce access mode. | Override StorageClass
Existing Argo Workflows | The cluster must NOT already have Argo Workflows installed. Note, other Argo tools like Argo CD are fine. | Join the discussion: deployKF#116
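
As a rough pre-flight check for the "Node Resources" row, the following sketch totals CPU across nodes and compares it to the 4 vCPU minimum. The helper names (sum_cpu_millicores, meets_cpu_minimum) are hypothetical and not part of deployKF, and reading the allocatable CPU (rather than capacity) is an assumption about how you want to count resources.

```shell
# Hypothetical helpers (not part of deployKF) for a quick CPU pre-flight check.

# Reads one Kubernetes CPU quantity per line ("4" or "3920m")
# and prints the total in millicores.
sum_cpu_millicores() {
  awk '
    /m$/ { sub(/m$/, "", $1); total += $1; next }
    NF   { total += $1 * 1000 }
    END  { print total + 0 }
  '
}

# Succeeds when the given millicore total is at least 4 vCPUs (4000m).
meets_cpu_minimum() {
  [ "$1" -ge 4000 ]
}

# Example usage against a live cluster (requires kubectl access):
#   total="$(kubectl get nodes \
#     -o jsonpath='{range .items[*]}{.status.allocatable.cpu}{"\n"}{end}' \
#     | sum_cpu_millicores)"
#   meets_cpu_minimum "$total" && echo "CPU OK" || echo "CPU below minimum"
```

The same pattern could be extended to memory, but Kubernetes memory quantities use several suffixes (Ki, Mi, Gi), so that parsing is left out of this sketch.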

ARM64 Support

Currently, deployKF only supports x86_64 architecture clusters.

The next minor version of deployKF (v0.2.0) should have native ARM64 support for all core components. However, some upstream apps like Kubeflow Pipelines will need extra work to be production ready (#10309, #10308).

Offline Clusters

deployKF can be used in offline and air-gapped clusters, but there are additional steps required.

Please see the Air-Gapped Clusters guide for more information.

Override Service Type

By default, deployKF uses a LoadBalancer service type for the gateway.

For real-world usage, you should review the Expose the Gateway Service guide.

In some clusters, the LoadBalancer service type will create a public IP address. Consider the security implications before deploying, or use a different service type.

If you do not want this, you may override the service type to ClusterIP by setting the following value:

deploykf_core:
  deploykf_istio_gateway:
    gatewayService:
      type: "ClusterIP"

Override StorageClass

By default, deployKF requires a default StorageClass that supports the ReadWriteOnce access mode.

If you do NOT have a compatible default StorageClass, you might consider the following options:

  1. Configure a default StorageClass that has ReadWriteOnce support
  2. Explicitly set the storageClass value for the following components:
  3. Disable components which require the StorageClass, and use external alternatives:

Linux Node Requirements

If you are self-hosting your Kubernetes cluster, you must ensure that your Linux nodes meet the following requirements:

Configuration | Requirement | Notes
Inotify Limits | Linux nodes must have sufficient inotify limits. Note, common distributions like Ubuntu do not ship with sufficient defaults. | Increase Inotify Limits
Kernel Modules | Linux nodes must have the required kernel modules for Istio. | Istio Kernel Modules

Increase Inotify Limits

You may need to increase the fs.inotify.max_user_* sysctl values on your nodes (only for Linux nodes). Otherwise, you may encounter Pod crashes with an error message like this:

too many open files

This error has been discussed in the upstream Kubeflow repo (kubeflow/manifests#2087). To resolve it, you will need to increase your system's open/watched file limits:

  1. Modify /etc/sysctl.conf to include the following lines:

    fs.inotify.max_user_instances = 1280
    fs.inotify.max_user_watches = 655360
    
  2. Now, apply the changes immediately with the following command:

    sudo sysctl -p
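
To confirm the change took effect, a node's current limits can be compared against the values above. This is a minimal sketch: the helper names are hypothetical, and it assumes a Linux node where the limits are readable under /proc/sys/fs/inotify/.

```shell
# Minimums taken from the sysctl.conf lines above.
REQUIRED_MAX_USER_INSTANCES=1280
REQUIRED_MAX_USER_WATCHES=655360

# Succeeds when the current value ($1) is at least the required value ($2).
inotify_limit_ok() {
  [ "$1" -ge "$2" ]
}

# Prints one status line per limit, reading current values from /proc (Linux only).
report_inotify_limits() {
  local name required current
  for name in max_user_instances max_user_watches; do
    case "$name" in
      max_user_instances) required="$REQUIRED_MAX_USER_INSTANCES" ;;
      max_user_watches)   required="$REQUIRED_MAX_USER_WATCHES" ;;
    esac
    current="$(cat "/proc/sys/fs/inotify/$name")"
    if inotify_limit_ok "$current" "$required"; then
      echo "OK   fs.inotify.$name = $current"
    else
      echo "LOW  fs.inotify.$name = $current (want >= $required)"
    fi
  done
}
```

Running report_inotify_limits on each node after `sudo sysctl -p` should show both limits as OK.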
    
Istio Kernel Modules

Your nodes must have the required kernel modules for Istio. Otherwise, you may encounter crashes in the Istio sidecars or other strange network behaviour.

  1. Get a list of the currently loaded kernel modules by running lsmod:

    lsmod | awk '{print $1}' | sort
    
  2. At the time of writing, the following command will enable the required kernel modules on boot:

    ## NOTE: if you are using Istio ambient mode, there are additional modules required
    cat <<EOF | sudo tee /etc/modules-load.d/99-istio-modules.conf
    br_netfilter
    ip_tables
    iptable_filter
    iptable_mangle
    iptable_nat
    iptable_raw
    nf_nat
    x_tables
    xt_REDIRECT
    xt_conntrack
    xt_multiport
    xt_owner
    xt_tcpudp
    EOF
    
  3. Now, either reboot your nodes or immediately load the modules with the following commands (which will also indicate if any modules are missing):

    sudo modprobe br_netfilter
    sudo modprobe ip_tables
    sudo modprobe iptable_filter
    sudo modprobe iptable_mangle
    sudo modprobe iptable_nat
    sudo modprobe iptable_raw
    sudo modprobe nf_nat
    sudo modprobe x_tables
    sudo modprobe xt_REDIRECT
    sudo modprobe xt_conntrack
    sudo modprobe xt_multiport
    sudo modprobe xt_owner
    sudo modprobe xt_tcpudp
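
To see which modules are still missing on a node, the lsmod listing can be compared with the list above. The missing_modules helper below is a hypothetical sketch, not a deployKF tool. It assumes the required modules are built as loadable modules; anything compiled directly into the kernel does not appear in lsmod and would be reported as missing even though it is available.

```shell
# Module list copied from the modules-load.d file above.
ISTIO_MODULES="br_netfilter ip_tables iptable_filter iptable_mangle iptable_nat iptable_raw nf_nat x_tables xt_REDIRECT xt_conntrack xt_multiport xt_owner xt_tcpudp"

# Reads `lsmod` output on stdin and prints each module from the
# space-separated list in $1 that is not currently loaded.
missing_modules() {
  local required="$1" loaded module
  # Skip the lsmod header line and flatten the module names into one string.
  loaded=" $(awk 'NR > 1 { printf "%s ", $1 }') "
  for module in $required; do
    case "$loaded" in
      *" $module "*) ;;                 # module is loaded
      *) printf '%s\n' "$module" ;;     # module is missing
    esac
  done
}

# Example usage on a node:
#   lsmod | missing_modules "$ISTIO_MODULES"
```

An empty result means every listed module is loaded.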
    

2. Platform Configuration

deployKF is very configurable. You can use it to deploy a wide variety of machine learning platforms and integrate with your existing infrastructure.

deployKF Values

All aspects of your deployKF platform are configured with YAML-based configs named "values". See the values page for more information.

deployKF Versions

The "version" of your platform is the version of the generator package you are using. For information about upgrading, see the upgrade guide and changelog.

Can I be notified of new releases?

Yes. Watch the deployKF/deployKF repo on GitHub.
At the top right, click Watch, then Custom, then Releases, and confirm by selecting Apply.

Cluster Dependencies

deployKF has a number of cluster dependencies including Istio, cert-manager, and Kyverno. See the cluster dependencies page for an overview.

Existing Cluster Dependencies

deployKF installs its own versions of the cluster dependencies by default.
If you have existing versions on the cluster, you MUST configure deployKF to use them:

External Dependencies

deployKF has a number of external dependencies including MySQL and an Object Store. See the external dependencies page for an overview.

Connect External Dependencies

deployKF includes embedded versions of MySQL and MinIO for development and testing.
We strongly recommend connecting external versions for production use:


3. Deploy the Platform

⭐ Create ArgoCD Applications ⭐

deployKF uses ArgoCD to manage the deployment of the platform. The process to create the ArgoCD Applications will depend on which mode of operation you have chosen.

ArgoCD Plugin Mode

Step 1 - Prepare ArgoCD

You will need to have ArgoCD deployed on your cluster, and this ArgoCD instance must have the deployKF ArgoCD Plugin installed. Follow the appropriate guide for your situation:

Step 2 - Learn about Values

deployKF is configured by centralized values which define the desired state of the platform.


Sample Values:

Each version of deployKF has sample values with all ML & Data tools enabled, along with some sensible security defaults. We recommend using the sample values as a starting point for your custom values.

Here is the sample-values.yaml for deployKF 0.1.5.


Custom Values:

In ArgoCD Plugin Mode, values can be defined inline (values), or from a git repository (values_files).

Both methods may be used together. When a value is defined in multiple places, the result is calculated by merging, with files listed later taking precedence, and inline values having the highest precedence.

Tip

Learn about common configuration tasks in the ⭐ Configure deployKF ⭐ guide.

Step 3 - Define an App-of-Apps

Create a local file named deploykf-app-of-apps.yaml with the contents of the YAML below.

In this example, we will define an app-of-apps that:

  • Clones the deploykf/deploykf repo at the v0.1.5 tag.
  • Sets the source_version parameter to use deployKF version 0.1.5.
  • Sets the values_files parameter to read the sample-values.yaml from the repo.
  • Sets the values parameter with inline values that override the sample-values.yaml.

What is an App-of-Apps?

An app-of-apps is a special ArgoCD Application which manages other applications.

Can I read values from my own repo?

Yes. In this example, we only use the deploykf/deploykf repo to easily read the default sample-values.yaml file. See Step 4 to read values from a different repo.

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: deploykf-app-of-apps
  namespace: argocd
  labels:
    app.kubernetes.io/name: deploykf-app-of-apps
    app.kubernetes.io/part-of: deploykf
spec:

  ## NOTE: if not "default", you MUST ALSO set the `argocd.project` value
  project: "default"

  source:
    ## source git repo configuration
    ##  - we use the 'deploykf/deploykf' repo so we can read its 'sample-values.yaml'
    ##    file, but you may use any repo (even one with no files)
    ##
    repoURL: "https://github.com/deployKF/deployKF.git"
    targetRevision: "v0.1.5"
    path: "."

    ## plugin configuration
    ##
    plugin:
      name: "deploykf"
      parameters:

        ## the deployKF generator version
        ##  - available versions: https://github.com/deployKF/deployKF/releases
        ##
        - name: "source_version"
          string: "0.1.5"

        ## paths to values files within the `repoURL` repository
        ##  - the values in these files are merged, with later files taking precedence
        ##  - we strongly recommend using 'sample-values.yaml' as the base of your values
        ##    so you can easily upgrade to newer versions of deployKF
        ##
        - name: "values_files"
          array:
            - "./sample-values.yaml"

        ## a string containing the contents of a values file
        ##  - this parameter allows defining values without needing to create a file in the repo
        ##  - these values are merged with higher precedence than those defined in `values_files`
        ##
        - name: "values"
          string: |
            ##
            ## This demonstrates how you might structure overrides for the 'sample-values.yaml' file.
            ## For a more comprehensive example, see the 'sample-values-overrides.yaml' in the main repo.
            ##
            ## Notes:
            ##  - YAML maps are RECURSIVELY merged across values files
            ##  - YAML lists are REPLACED in their entirety across values files
            ##  - Do NOT include empty/null sections, as this will remove ALL values from that section.
            ##    To include a section without overriding any values, set it to an empty map: `{}`
            ##

            ## --------------------------------------------------------------------------------
            ##                                      argocd
            ## --------------------------------------------------------------------------------
            argocd:
              namespace: argocd
              project: default

            ## --------------------------------------------------------------------------------
            ##                                    kubernetes
            ## --------------------------------------------------------------------------------
            kubernetes:
              {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

            ## --------------------------------------------------------------------------------
            ##                              deploykf-dependencies
            ## --------------------------------------------------------------------------------
            deploykf_dependencies:

              ## --------------------------------------
              ##             cert-manager
              ## --------------------------------------
              cert_manager:
                {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

              ## --------------------------------------
              ##                 istio
              ## --------------------------------------
              istio:
                {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

              ## --------------------------------------
              ##                kyverno
              ## --------------------------------------
              kyverno:
                {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

            ## --------------------------------------------------------------------------------
            ##                                  deploykf-core
            ## --------------------------------------------------------------------------------
            deploykf_core:

              ## --------------------------------------
              ##             deploykf-auth
              ## --------------------------------------
              deploykf_auth:
                {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

              ## --------------------------------------
              ##        deploykf-istio-gateway
              ## --------------------------------------
              deploykf_istio_gateway:
                {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

              ## --------------------------------------
              ##      deploykf-profiles-generator
              ## --------------------------------------
              deploykf_profiles_generator:
                {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

            ## --------------------------------------------------------------------------------
            ##                                   deploykf-opt
            ## --------------------------------------------------------------------------------
            deploykf_opt:

              ## --------------------------------------
              ##            deploykf-minio
              ## --------------------------------------
              deploykf_minio:
                {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

              ## --------------------------------------
              ##            deploykf-mysql
              ## --------------------------------------
              deploykf_mysql:
                {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

            ## --------------------------------------------------------------------------------
            ##                                  kubeflow-tools
            ## --------------------------------------------------------------------------------
            kubeflow_tools:

              ## --------------------------------------
              ##                 katib
              ## --------------------------------------
              katib:
                {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

              ## --------------------------------------
              ##               notebooks
              ## --------------------------------------
              notebooks:
                {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

              ## --------------------------------------
              ##               pipelines
              ## --------------------------------------
              pipelines:
                {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

  destination:
    server: "https://kubernetes.default.svc"
    namespace: "argocd"

Step 4 - Read Values from Git (optional)

You may use the values_files parameter to read values from a git repo. This lets you version your values files in git, and easily update them without changing the app-of-apps resource.


Danger

We STRONGLY RECOMMEND using a PRIVATE repo for your values files!

If your git repo is private, you must configure ArgoCD with credentials to access the repo. For example, when using a GitHub repo, you might create a Secret with a Personal Access Token (PAT) as follows:

# create a secret with your GitHub credentials
# NOTE: kubectl can't create and label a secret in one command, so we use a pipe
kubectl create secret generic --dry-run=client -o yaml \
    "argocd-repository--MY_GITHUB_REPO" \
    --namespace "argocd" \
    --from-literal=type="git" \
    --from-literal=url="https://github.com/MY_GITHUB_ORG/MY_GITHUB_REPO.git" \
    --from-literal=username="MY_GITHUB_USERNAME" \
    --from-literal=password="MY_GITHUB_PAT" \
  | kubectl label --local --dry-run=client -o yaml -f - \
    "argocd.argoproj.io/secret-type"="repository" \
  | kubectl apply -f -

If you use the upstream sample-values.yaml as a base, you will also need to push that file to your repo.

The following command will download the sample-values.yaml file for deployKF 0.1.5:

# download the `sample-values.yaml` file
curl -fL -o "sample-values-0.1.5.yaml" \
  "https://raw.githubusercontent.com/deployKF/deployKF/v0.1.5/sample-values.yaml"
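
If you download sample values for several versions, the URL can be built from the version number. The sample_values_url helper below is hypothetical, and it assumes other releases follow the same tag and file layout as the 0.1.5 URL above.

```shell
# Hypothetical helper: prints the raw sample-values.yaml URL for a deployKF version.
# Accepts the version with or without a leading "v" (e.g. "0.1.5" or "v0.1.5").
# ASSUMPTION: every release uses the same "v<version>/sample-values.yaml" layout.
sample_values_url() {
  local version="${1#v}"
  printf 'https://raw.githubusercontent.com/deployKF/deployKF/v%s/sample-values.yaml\n' "$version"
}

# Example usage:
#   version="0.1.5"
#   curl -fL -o "sample-values-${version}.yaml" "$(sample_values_url "$version")"
```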

For example, say you now have the following files in your repo:

  • sample-values-0.1.5.yaml
  • values-1.yaml
  • values-2.yaml

Your app-of-apps resource may then be updated to look like this:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: deploykf-app-of-apps
  namespace: argocd
  labels:
    app.kubernetes.io/name: deploykf-app-of-apps
    app.kubernetes.io/part-of: deploykf
spec:
  project: "default"
  source:

    ## source git repo configuration
    ##
    repoURL: "https://github.com/MY_GITHUB_ORG/MY_GITHUB_REPO.git"
    targetRevision: "main"
    path: "."

    ## plugin configuration
    ##
    plugin:
      name: "deploykf"
      parameters:

        ## the deployKF generator version
        ##
        - name: "source_version"
          string: "0.1.5"

        ## paths to values files within the `repoURL` repository
        ##
        - name: "values_files"
          array:
            - "./sample-values-0.1.5.yaml"
            - "./values-1.yaml"
            - "./values-2.yaml"

        ## a string containing the contents of a values file
        ##  - this parameter allows defining values without needing to create a file in the repo
        ##  - these values are merged with higher precedence than those defined in `values_files`
        ##
        #- name: "values"
        #  string: |
        #    ...
        #    values file contents
        #    ...

  destination:
    server: "https://kubernetes.default.svc"
    namespace: "argocd"

Step 5 - Apply App-of-Apps Resource

Apply the deploykf-app-of-apps.yaml file to your cluster with the following command:

kubectl apply -f ./deploykf-app-of-apps.yaml

Manifests Repo Mode

Step 1 - Prepare ArgoCD

If you have not already deployed ArgoCD on your cluster, you will need to do so.
Please see the ArgoCD Getting Started Guide for instructions.

TIP: If you use an ArgoCD "management cluster" pattern, see the off-cluster ArgoCD guide.

Step 2 - Install the deployKF CLI

If you have not already installed the deploykf CLI on your local machine, you will need to do so.

Please see the CLI Installation Guide for instructions.

Step 3 - Prepare a Git Repo

You will need to create a git repo to store your generated manifests.

Danger

We STRONGLY RECOMMEND using a PRIVATE repo for your manifests!

If your git repo is private, you must configure ArgoCD with credentials to access the repo. For example, when using a GitHub repo, you might create a Secret with a Personal Access Token (PAT) as follows:

# create a secret with your GitHub credentials
# NOTE: kubectl can't create and label a secret in one command, so we use a pipe
kubectl create secret generic --dry-run=client -o yaml \
    "argocd-repository--MY_GITHUB_REPO" \
    --namespace "argocd" \
    --from-literal=type="git" \
    --from-literal=url="https://github.com/MY_GITHUB_ORG/MY_GITHUB_REPO.git" \
    --from-literal=username="MY_GITHUB_USERNAME" \
    --from-literal=password="MY_GITHUB_PAT" \
  | kubectl label --local --dry-run=client -o yaml -f - \
    "argocd.argoproj.io/secret-type"="repository" \
  | kubectl apply -f -

Step 4 - Create Values Files

deployKF is configured by centralized values which define the desired state of the platform.


Sample Values:

Each version of deployKF has sample values with all ML & Data tools enabled, along with some sensible security defaults. We recommend using the sample values as a starting point for your custom values.

The following command will download the sample-values.yaml file for deployKF 0.1.5:

# download the `sample-values.yaml` file
curl -fL -o "sample-values-0.1.5.yaml" \
  "https://raw.githubusercontent.com/deployKF/deployKF/v0.1.5/sample-values.yaml"

Custom Values:

In Manifests Repo Mode, values are passed to the deploykf generate command as YAML files. When a value is defined in multiple files, the result is calculated by merging, with files listed later taking precedence.

To make upgrades easier, we recommend using the sample values as a base, and applying custom override files with only the values you want to change. This allows you to swap out the sample values for a newer version in the future.

Tip

Learn about common configuration tasks in the ⭐ Configure deployKF ⭐ guide.

For example, you might structure your custom-overrides.yaml file like this:

##
## Notes:
##  - YAML maps are RECURSIVELY merged across values files
##  - YAML lists are REPLACED in their entirety across values files
##  - Do NOT include empty/null sections, as this will remove ALL values from that section.
##    To include a section without overriding any values, set it to an empty map: `{}`
##

## --------------------------------------------------------------------------------
##                                      argocd
## --------------------------------------------------------------------------------
argocd:
  namespace: argocd
  project: default

  source:
    ## the git repo where you will store your generated manifests
    ##  - url: the URL of the git repo
    ##  - revision: the git branch/tag/commit to read from
    ##  - path: the repo folder path where the generated manifests are stored
    ##
    repo:
      url: "https://github.com/deployKF/examples.git"
      revision: "main"
      path: "./GENERATOR_OUTPUT/"

## --------------------------------------------------------------------------------
##                                    kubernetes
## --------------------------------------------------------------------------------
kubernetes:
  {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

## --------------------------------------------------------------------------------
##                              deploykf-dependencies
## --------------------------------------------------------------------------------
deploykf_dependencies:

  ## --------------------------------------
  ##             cert-manager
  ## --------------------------------------
  cert_manager:
    {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

  ## --------------------------------------
  ##                 istio
  ## --------------------------------------
  istio:
    {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

  ## --------------------------------------
  ##                kyverno
  ## --------------------------------------
  kyverno:
    {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

## --------------------------------------------------------------------------------
##                                  deploykf-core
## --------------------------------------------------------------------------------
deploykf_core:

  ## --------------------------------------
  ##             deploykf-auth
  ## --------------------------------------
  deploykf_auth:
    {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

  ## --------------------------------------
  ##        deploykf-istio-gateway
  ## --------------------------------------
  deploykf_istio_gateway:
    {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

  ## --------------------------------------
  ##      deploykf-profiles-generator
  ## --------------------------------------
  deploykf_profiles_generator:
    {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

## --------------------------------------------------------------------------------
##                                   deploykf-opt
## --------------------------------------------------------------------------------
deploykf_opt:

  ## --------------------------------------
  ##            deploykf-minio
  ## --------------------------------------
  deploykf_minio:
    {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

  ## --------------------------------------
  ##            deploykf-mysql
  ## --------------------------------------
  deploykf_mysql:
    {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

## --------------------------------------------------------------------------------
##                                  kubeflow-tools
## --------------------------------------------------------------------------------
kubeflow_tools:

  ## --------------------------------------
  ##                 katib
  ## --------------------------------------
  katib:
    {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

  ## --------------------------------------
  ##               notebooks
  ## --------------------------------------
  notebooks:
    {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

  ## --------------------------------------
  ##               pipelines
  ## --------------------------------------
  pipelines:
    {} # <-- REMOVE THIS, IF YOU INCLUDE VALUES UNDER THIS SECTION!

Step 5 - Generate Manifests

The deploykf generate command writes manifests into a folder based on your values. When more than one --values file is provided, they are merged, with later files taking precedence.

For example, to generate manifests using deployKF version 0.1.5 under ./GENERATOR_OUTPUT/:

deploykf generate \
    --source-version "0.1.5" \
    --values ./sample-values-0.1.5.yaml \
    --values ./custom-overrides.yaml \
    --output-dir ./GENERATOR_OUTPUT
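
Because later --values files take precedence, the order of the flags matters. The following sketch prints the command for an ordered list of values files so it can be reviewed before running. The print_generate_command helper is hypothetical; it only uses the flags shown above and does not call the deploykf CLI itself.

```shell
# Hypothetical helper: prints a `deploykf generate` command line.
#   $1      = deployKF version (passed to --source-version)
#   $2      = output directory (passed to --output-dir)
#   $3...   = values files, in merge order (later files take precedence)
print_generate_command() {
  local version="$1" output_dir="$2" file
  shift 2
  printf 'deploykf generate --source-version "%s"' "$version"
  for file in "$@"; do
    printf ' --values "%s"' "$file"
  done
  printf ' --output-dir "%s"\n' "$output_dir"
}

# Example usage (base file first, overrides last):
#   print_generate_command "0.1.5" "./GENERATOR_OUTPUT" \
#     ./sample-values-0.1.5.yaml ./custom-overrides.yaml
```

Listing the sample values first and your overrides last keeps your changes on top after the merge.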

Do NOT Edit Manifests Directly

In general, you should NOT edit the manifests generated by deployKF. Changes in the --output-dir will be overwritten each time the deploykf generate command runs.

If you need to change something which is not configurable via values, please raise an issue so we can understand your use-case and potentially add a new configuration option.

Step 6 - Commit Generated Manifests

After running deploykf generate, you will need to commit the manifests to your repo, so ArgoCD can apply them to your cluster:

# for example, to directly commit changes to the 'main' branch of your repo
git add GENERATOR_OUTPUT
git commit -m "my commit message"
git push origin main

Step 7 - Apply App-of-Apps Manifest

The only manifest you need to manually apply is the app-of-apps, which creates all the other ArgoCD applications.

The app-of-apps.yaml manifest is generated at the root of your --output-dir folder, so you can apply it with:

kubectl apply --filename GENERATOR_OUTPUT/app-of-apps.yaml

Required Values - Azure AKS

When deploying on Azure AKS, you MUST set the following values, or the platform will not work correctly:

kubernetes:
  azure:
    admissionsEnforcerFix: true

For more information, please see the PR which introduced this value, deployKF/deployKF#85.

⭐ Sync ArgoCD Applications ⭐

Now that your deployKF app-of-apps has been applied, you must sync the ArgoCD applications to deploy your platform. Syncing an application causes ArgoCD to reconcile the actual state in the cluster to match the state defined by the application resource.

Danger

DO NOT sync all the Applications at once!!!

The deployKF Applications depend on each other, so they MUST be synced in the correct order to avoid errors. If you manually sync them all at once, you may need to uninstall and start over.

There are a few ways to sync the applications; you only need to use ONE of them.

The recommended way to sync the applications is with the automated script.

Step - Run the Sync Script

We provide the sync_argocd_apps.sh script to automatically sync the applications that make up deployKF. Learn more about the script in the scripts folder README.

For example, to run the script, you might use the following commands:

# download the latest version of the script
curl -fL -o "sync_argocd_apps.sh" \
  "https://raw.githubusercontent.com/deployKF/deployKF/main/scripts/sync_argocd_apps.sh"

# ensure the script is executable
chmod +x ./sync_argocd_apps.sh

# ensure your kubectl context is set correctly
kubectl config current-context

# run the script
bash ./sync_argocd_apps.sh

About the sync script

  • The script can take around 5-10 minutes to run on first install.
  • If the script fails or is interrupted, you can safely re-run it, and it will pick up where it left off.
  • There are a number of configuration variables at the top of the script which change the default behavior.
  • Learn more about the automated sync script from the scripts folder README in the deployKF repo.

Please be aware of the following issue when using the automated sync script:

Bug in ArgoCD v2.9

There is a known issue (deploykf/deploykf#70, argoproj/argo-cd#16266) with all 2.9.X versions of the ArgoCD CLI that will cause the sync script to fail with the following error:

==========================================================================================
Logging in to ArgoCD...
==========================================================================================
FATA[0000] cannot find pod with selector: [app.kubernetes.io/name=] - use the --{component}-name flag in this command or set the environmental variable (Refer to https://argo-cd.readthedocs.io/en/stable/user-guide/environment-variables), to change the Argo CD component name in the CLI

Please upgrade your argocd CLI to at least version 2.10.0 to resolve this issue.
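To check whether your installed CLI falls in the affected range before running the script, a small helper like the one below could be used. This is a sketch under assumptions: the function name `argocd_cli_is_affected` is hypothetical, and it only inspects a version string that you pass in.

```shell
# Hypothetical helper: succeed (exit 0) when the given version string
# belongs to the 2.9.X series, which is affected by the bug described above.
# Accepts forms like "2.9.3", "v2.9.3", or "argocd: v2.9.3+abcdef".
argocd_cli_is_affected() {
  cli_version="${1#argocd: }"    # drop an optional "argocd: " prefix
  cli_version="${cli_version#v}" # drop an optional leading "v"
  case "$cli_version" in
    2.9 | 2.9.*) return 0 ;;
    *) return 1 ;;
  esac
}
```

Assuming your CLI prints its version with `argocd version --client --short`, you might call it as `argocd_cli_is_affected "$(argocd version --client --short)" && echo "please upgrade the argocd CLI"`.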

Alternatively, you can sync the applications using the ArgoCD Web UI.

Step 1 - Access ArgoCD Web UI

For production usage, you may want to expose ArgoCD with a LoadBalancer or Ingress.

For testing, you may use kubectl port-forwarding to expose the ArgoCD Web UI on your local machine:

kubectl port-forward --namespace "argocd" svc/argocd-server 8090:https

The ArgoCD Web UI should now be available at the following URL:

https://localhost:8090


If this is the first time you are using ArgoCD, you will need to retrieve the initial password for the admin user:

echo $(kubectl -n argocd get secret/argocd-initial-admin-secret \
  -o jsonpath="{.data.password}" | base64 -d)

Once you log in with the admin user and above password, the Web UI should look like this:

[Screenshot: ArgoCD Web UI]

Step 2 - Sync Applications

You MUST sync the deployKF applications in the correct order. For each application, click the SYNC button, and wait for the application to become "Healthy" before syncing the next.

The applications are grouped and ordered as follows:

Group 0: "app-of-apps"

First, you must sync the app-of-apps application:

  1. deploykf-app-of-apps
  2. deploykf-namespaces (only exists when using off-cluster ArgoCD)

Group 1: "deploykf-dependencies"

Second, you must sync the applications with the label app.kubernetes.io/component=deploykf-dependencies:

  1. dkf-dep--cert-manager (may fail on first attempt)
  2. dkf-dep--istio
  3. dkf-dep--kyverno

WARNING: for this group, each application MUST be synced INDIVIDUALLY and the preceding application MUST be "Healthy" before syncing the next.
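If you prefer the command line to clicking SYNC, the "one at a time, wait for Healthy" rule for this group can be sketched with the argocd CLI. This is an illustrative sketch, not the official sync script: the function name `sync_in_order` and the 600 second timeout are assumptions, and it presumes you have already logged in with the argocd CLI.

```shell
# Hypothetical helper: sync each named application in the order given,
# waiting for it to report "Healthy" before moving on to the next one.
# Stops at the first failure so later applications are never synced early.
sync_in_order() {
  for sync_app in "$@"; do
    argocd app sync "$sync_app" || return 1
    argocd app wait "$sync_app" --health --timeout 600 || return 1
  done
}
```

For this group, the call would be `sync_in_order dkf-dep--cert-manager dkf-dep--istio dkf-dep--kyverno`. As noted above, cert-manager may fail on the first attempt, in which case the command can be run again.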

Group 2: "deploykf-core"

Third, you must sync the applications with the label app.kubernetes.io/component=deploykf-core:

  1. dkf-core--deploykf-istio-gateway
  2. dkf-core--deploykf-auth
  3. dkf-core--deploykf-dashboard
  4. dkf-core--deploykf-profiles-generator (may fail on first attempt)

Group 3: "deploykf-opt"

Fourth, you must sync the applications with the label app.kubernetes.io/component=deploykf-opt:

  • dkf-opt--deploykf-minio
  • dkf-opt--deploykf-mysql

Group 4: "deploykf-tools"

Fifth, you must sync the applications with the label app.kubernetes.io/component=deploykf-tools:

  • (none yet)

Group 5: "kubeflow-dependencies"

Sixth, you must sync the applications with the label app.kubernetes.io/component=kubeflow-dependencies:

  • kf-dep--argo-workflows

Group 6: "kubeflow-tools"

Seventh, you must sync the applications with the label app.kubernetes.io/component=kubeflow-tools:

  • kf-tools--katib
  • kf-tools--notebooks--jupyter-web-app
  • kf-tools--notebooks--notebook-controller
  • kf-tools--pipelines
  • kf-tools--poddefaults-webhook
  • kf-tools--tensorboards--tensorboard-controller
  • kf-tools--tensorboards--tensorboards-web-app
  • kf-tools--training-operator
  • kf-tools--volumes--volumes-web-app
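
Before moving from one group to the next, you can also confirm from the command line that every application in the group is "Synced" and "Healthy". The sketch below reads the status fields of the ArgoCD Application resources with kubectl. The function names are hypothetical, and it assumes ArgoCD is installed in the "argocd" namespace, as in the port-forward example above.

```shell
# Hypothetical helper: print "name<TAB>sync status<TAB>health status" for each
# ArgoCD Application carrying the given label (e.g. a group label from above).
list_group_status() {
  kubectl get applications.argoproj.io \
    --namespace "argocd" \
    --selector "$1" \
    --output jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.sync.status}{"\t"}{.status.health.status}{"\n"}{end}'
}

# Hypothetical helper: succeed only if every listed application is both
# Synced and Healthy; print the name of each application that is not.
# Note: an empty group counts as ready, and a kubectl failure is not detected.
group_is_ready() {
  list_group_status "$1" | awk -F '\t' '
    $2 != "Synced" || $3 != "Healthy" { bad = 1; print "not ready: " $1 }
    END { exit bad }
  '
}
```

For example, `group_is_ready "app.kubernetes.io/component=deploykf-core"` returns success once all four applications in Group 2 are synced and healthy.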

4. Use the Platform

Now that you have a working deployKF machine learning platform, here are some things to try out!

⭐ Expose the deployKF Dashboard ⭐

The deployKF dashboard is the web-based interface for deployKF. It gives users authenticated access to tools like Kubeflow Pipelines, Kubeflow Notebooks, and Katib.

[Screenshot: deployKF Dashboard]

All public deployKF services (including the dashboard) are accessed via the deployKF Istio Gateway, so you will need to expose its Kubernetes Service.

Step 1 - Expose the Gateway

You may expose the deployKF Istio Gateway Service in a number of ways:

Step 2 - Configure DNS

Trying to access deployKF with an IP address will NOT work. You MUST use a domain name.

See Configure DNS Records for more information.

This step is REQUIRED. You MUST configure DNS records or local /etc/hosts entries.
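
For local testing, an /etc/hosts entry is a single line containing an IP address followed by one or more hostnames. The helper below is a minimal sketch for building such a line. The function name `hosts_line`, the IP address `203.0.113.10`, and the hostname `deploykf.example.com` are placeholders, not values defined by deployKF. See the Configure DNS Records guide for the hostnames your gateway actually serves.

```shell
# Hypothetical helper: print an /etc/hosts line that maps one IP address
# to one or more hostnames.
# usage: hosts_line IP HOSTNAME [HOSTNAME...]
hosts_line() {
  hosts_ip="$1"
  shift
  echo "${hosts_ip} $*"
}
```

For example, `hosts_line 203.0.113.10 deploykf.example.com | sudo tee -a /etc/hosts` would append the entry to your local hosts file.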

Step 3 - Configure TLS (optional)

We recommend configuring valid TLS/HTTPS certificates to avoid browser warnings for your users.

See the Configure TLS Certificates guide for more information.

If you want to configure TLS later, just skip this step for now.
We use a self-signed certificate by default.

Step 4 - User Authentication (optional)

See the following guides to configure user authentication on your platform:

If you want to configure authentication later, just skip this step for now.
We provide a few static credentials by default.

Step 5 - Define Profiles (optional)

deployKF uses the concept of "Profiles" to group users and resources together. You might define profiles for different teams, projects, or even individual users.

See the User Authorization and Profile Management guide for more information.

If you want to define profiles later, just skip this step for now.
We provide default profiles named team-1 and team-1-prod.

Step 6 - Log In

You should now be presented with a "Log In" screen when you visit the exposed URL.

Remember, you can NOT access deployKF with an IP address. You MUST use a domain name.


By default, there are a few static credentials set by the deploykf_core.deploykf_auth.dex.staticPasswords value:

Credentials: User 1

Username: user1@example.com
Password: user1

Credentials: User 2

Username: user2@example.com
Password: user2

Credentials: Admin (DO NOT USE - will be removed in future versions)

Username: admin@example.com
Password: admin

  • This account is the default "owner" of all profiles.
  • This account does NOT have access to "MinIO Console" or "Argo Server UI".
  • We recommend NOT using this account, and actually removing its staticPasswords entry.
  • We recommend leaving this account as the default "owner", even with @example.com as the domain (because profile owners can't be changed).

Step 7 - Explore the Tools

deployKF includes many tools which address different stages of the data & machine learning lifecycle:

We also provide a number of user-focused guides for these tools:

Tool | User Guide
Kubeflow Pipelines | Access Kubeflow Pipelines API
Kubeflow Pipelines | GitOps for Kubeflow Pipelines Schedules

Next Steps


Last update: 2024-08-28
Created: 2023-04-24