Skip to main content

Kubernetes and Helm

This page explains how to deploy the job on Kubernetes with Helm.

Prerequisites

  • kubectl is authenticated to the target cluster
  • helm v3 is installed
  • gcloud is installed when you need to fetch cluster credentials or Workload Identity context from Google Cloud
  • The cluster can pull the published image
  • One supported storage mode is available:
    • PVC mode with a usable StorageClass
    • GCS mode on GKE with the GCS Fuse CSI driver enabled
  • Workload Identity is configured when the job must call the entitlement service with Google identity
  • A valid Marketplace reporting secret reference is available for supported Marketplace deployments

If you need to install the Kubernetes Application CRD used by some Marketplace flows:

kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/application/master/config/crd/bases/app.k8s.io_applications.yaml

Supported deployment profiles

The current supported profiles are:

ProfileTypical useExpected resources
StandardDefault production path2 to 4 vCPU, 8 to 16 GiB RAM, 50 GiB PVC
DeepExpanded evidence generation and GPU-heavy workflows4 to 8 vCPU, 32 to 64 GiB RAM, 1 NVIDIA GPU, 200 GiB PVC

Standard runs use the default image tag and config.runMode=standard. Deep runs use a GPU-tagged image and config.runMode=deep.

Run profile is not the same thing as category routing. config.runMode controls infrastructure depth and execution profile, while the staged inputs and policy determine the resolved category_id. Review Category Policy and Routing before assuming that a Deep deployment implies a structure-backed or physics-heavy analysis path.

1. Set the core variables

export APP_NAME="glassbox-mol-audit"
export NAMESPACE="glassbox-mol-audit"

export IMAGE_REPO="REGION-docker.pkg.dev/PROJECT/REPO/glassbox-mol-audit"
export IMAGE_TAG="PUBLISHED_VERSION_TAG"

export PROJECT_ID="test"
export RUN_ID="run_$(date +%Y%m%dT%H%M%SZ)"

export GCP_REGION="us-central1"
export GCP_LOCATION="${GCP_REGION}"
export ENTITLEMENT_URL="https://YOUR_CLOUD_RUN_SERVICE"
export ENTITLEMENT_AUTH_MODE="google"
export ENTITLEMENT_AUDIENCE="${ENTITLEMENT_URL}"
export WORKLOAD_IDENTITY_GSA="your-sa@project.iam.gserviceaccount.com"
export REPORTING_SECRET="marketplace-reporting-secret"
export UBBAGENT_IMAGE_REPO="us-docker.pkg.dev/glassbox-bio-public/glassbox-bio-molecular-audit/glassbox-mol-audit/ubbagent"

2. Create the namespace

kubectl create namespace "${NAMESPACE}" 2>/dev/null || true

3. Install infrastructure only

Install the chart with the job disabled first:

helm upgrade --install "${APP_NAME}" ./manifest/chart \
--namespace "${NAMESPACE}" --create-namespace \
-f ./manifest/chart/values-standard.yaml \
--set job.enabled=false \
--set image.repository="${IMAGE_REPO}" \
--set image.tag="${IMAGE_TAG}" \
--set config.projectId="${PROJECT_ID}" \
--set config.gcpRegion="${GCP_REGION}" \
--set config.gcpLocation="${GCP_LOCATION}" \
--set config.entitlementUrl="${ENTITLEMENT_URL}" \
--set config.entitlementAuthMode="${ENTITLEMENT_AUTH_MODE}" \
--set config.entitlementAudience="${ENTITLEMENT_AUDIENCE}" \
--set workloadIdentity.enabled=true \
--set workloadIdentity.gcpServiceAccount="${WORKLOAD_IDENTITY_GSA}" \
--set marketplace.reportingSecret="${REPORTING_SECRET}" \
--set ubbagent.enabled=true \
--set ubbagent.image.repository="${UBBAGENT_IMAGE_REPO}" \
--set config.runId="${RUN_ID}"

4. Stage inputs

The runner expects:

<input-root>/<project_id>/01_sources/

On the default PVC layout, that is usually:

/data/input/<project_id>/01_sources/

For the input contract, see Prepare Inputs. For the module-routing policy driven by those inputs, see Category Policy and Routing.

5. Enable the job

Delete any previous job before a new install or upgrade because Jobs are immutable:

kubectl -n "${NAMESPACE}" delete job "${APP_NAME}" --ignore-not-found

Then enable execution:

helm upgrade --install "${APP_NAME}" ./manifest/chart \
--namespace "${NAMESPACE}" \
-f ./manifest/chart/values-standard.yaml \
--set job.enabled=true \
--set image.repository="${IMAGE_REPO}" \
--set image.tag="${IMAGE_TAG}" \
--set config.projectId="${PROJECT_ID}" \
--set config.gcpRegion="${GCP_REGION}" \
--set config.gcpLocation="${GCP_LOCATION}" \
--set config.entitlementUrl="${ENTITLEMENT_URL}" \
--set config.entitlementAuthMode="${ENTITLEMENT_AUTH_MODE}" \
--set config.entitlementAudience="${ENTITLEMENT_AUDIENCE}" \
--set workloadIdentity.enabled=true \
--set workloadIdentity.gcpServiceAccount="${WORKLOAD_IDENTITY_GSA}" \
--set marketplace.reportingSecret="${REPORTING_SECRET}" \
--set ubbagent.enabled=true \
--set ubbagent.image.repository="${UBBAGENT_IMAGE_REPO}" \
--set config.runId="${RUN_ID}"

6. Watch the job

kubectl -n "${NAMESPACE}" get job "${APP_NAME}" -o wide
kubectl -n "${NAMESPACE}" logs "job/${APP_NAME}" --all-containers --timestamps -f

Storage modes

PVC mode

helm upgrade --install glassbox-mol-audit ./manifest/chart \
--set storage.type=pvc \
--set storage.pvc.storageClassName=standard \
--set storage.pvc.size=50Gi

GCS Fuse mode on GKE

helm upgrade --install glassbox-mol-audit ./manifest/chart \
--set storage.type=gcs \
--set storage.gcs.bucket=YOUR_BUCKET \
--set workloadIdentity.enabled=true \
--set workloadIdentity.gcpServiceAccount=your-sa@project.iam.gserviceaccount.com

Customer GCS deployment example

helm upgrade --install glassbox-mol-audit ./manifest/chart \
--namespace glassbox-mol-audit --create-namespace \
-f ./manifest/chart/values-standard.yaml \
-f ./examples/values-gcs.yaml \
--set storage.gcs.bucket=YOUR_BUCKET \
--set workloadIdentity.gcpServiceAccount=your-sa@project.iam.gserviceaccount.com \
--set image.repository=us-central1-docker.pkg.dev/PROJECT/REPO/glassbox-mol-audit \
--set image.tag=1.0.0 \
--set config.projectId=YOUR_PROJECT_ID \
--set marketplace.reportingSecret=marketplace-reporting-secret \
--set ubbagent.enabled=true

Supported Marketplace deployments keep billing enabled. Do not blank marketplace.reportingSecret or disable ubbagent in the customer path.

Identity-only entitlement flow

The current deployment model is identity-based. The customer job authenticates to the entitlement service with Workload Identity rather than a customer-managed token.

Set all of the following:

  • config.entitlementAuthMode=google
  • config.entitlementAudience=<entitlement-url>
  • workloadIdentity.enabled=true
  • workloadIdentity.gcpServiceAccount=<gsa>

The customer job then calls these endpoints during a run:

  1. POST /api/entitlement/check
  2. POST /api/entitlement/consume
  3. POST /api/entitlement/run_start
  4. POST /api/entitlement/run_complete

See API Overview for the grouped endpoint reference.

Outputs

Outputs land under a run-scoped directory:

<output-root>/<run_id>/

Typical artifacts include:

  • results/summary.json
  • results/metrics.json
  • HTML report files
  • run_manifest.json
  • seal/seal.json
  • seal/seal.sig
  • seal/seal.svg

For the full artifact map, see Output File Reference.

Retrieval paths

PVC retrieval

Use a helper pod mounted to the same claim, then copy the output directory locally.

GCS retrieval

gsutil -m cp -r gs://YOUR_BUCKET/<project_id>/results ./gbx_output/results

Clean uninstall

Default teardown keeps the namespace and the external reporting secret:

./tools/clean_uninstall.sh \
--namespace "${NAMESPACE}" \
--release "${APP_NAME}"

Full purge teardown is destructive:

./tools/clean_uninstall.sh \
--namespace "${NAMESPACE}" \
--release "${APP_NAME}" \
--delete-pvc \
--delete-namespace \
--yes

Delete the Marketplace reporting secret only when that is intentional:

./tools/clean_uninstall.sh \
--namespace "${NAMESPACE}" \
--release "${APP_NAME}" \
--delete-reporting-secret \
--reporting-secret "<app-name>-reporting-secret" \
--yes

Common operational failures

Namespace ownership mismatch

If Helm reports namespace ownership problems, make sure:

  • You always pass --namespace "${NAMESPACE}"
  • The namespace exists or is created by the same Helm release context

Immutable Job errors

If an upgrade fails because the Job spec is immutable, delete the job and rerun the Helm command:

kubectl -n "${NAMESPACE}" delete job "${APP_NAME}" --ignore-not-found

PVC shrink errors

Do not shrink a PVC between installs. Increase capacity or recreate the PVC if you truly need a smaller volume and can accept data loss.

Multi-attach errors

If a leftover helper pod is still attached to a ReadWriteOnce volume, remove the helper pod before rerunning the job:

kubectl -n "${NAMESPACE}" delete pod gbx-output-reader --ignore-not-found
kubectl -n "${NAMESPACE}" delete job "${APP_NAME}" --ignore-not-found

Entitlement auth errors

401 and 403 usually mean the identity token or IAM invocation settings are wrong. Review Error Codes and the entitlement settings above.

Backup and restore

  • In PVC mode, snapshot the persistent disk or copy /data/output to a secondary location
  • In GCS mode, rely on bucket versioning and lifecycle controls where appropriate

For a day-2 operations view, see Multi-Component Operations.