Kubernetes and Helm
This page is the operator-facing runbook for deploying the job on Kubernetes with Helm.
Prerequisites
kubectlis authenticated to the target clusterhelmv3 is installedgcloudis installed when you need to fetch cluster credentials or Workload Identity context from Google Cloud- The cluster can pull the published image
- One supported storage mode is available:
- PVC mode with a usable
StorageClass - GCS mode on GKE with the GCS Fuse CSI driver enabled
- PVC mode with a usable
- Workload Identity is configured when the job must call the entitlement service with Google identity
If you need to install the Kubernetes Application CRD used by some Marketplace flows:
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/application/master/config/crd/bases/app.k8s.io_applications.yaml
Supported deployment profiles
The current supported profiles are:
| Profile | Typical use | Expected resources |
|---|---|---|
| Standard | Default production path | 2 to 4 vCPU, 8 to 16 GiB RAM, 50 GiB PVC |
| Deep | Expanded evidence generation and GPU-heavy workflows | 4 to 8 vCPU, 32 to 64 GiB RAM, 1 NVIDIA GPU, 200 GiB PVC |
Standard runs use the default image tag and config.runMode=standard. Deep runs use a GPU-tagged image and config.runMode=deep.
Recommended install flow
1. Set the core variables
export APP_NAME="glassbox-mol-audit"
export NAMESPACE="glassbox-mol-audit"
export IMAGE_REPO="REGION-docker.pkg.dev/PROJECT/REPO/glassbox-mol-audit"
export IMAGE_TAG="PUBLISHED_VERSION_TAG"
export PROJECT_ID="test"
export RUN_ID="run_$(date +%Y%m%dT%H%M%SZ)"
export ENTITLEMENT_URL="https://YOUR_CLOUD_RUN_SERVICE"
export ENTITLEMENT_AUTH_MODE="google"
export ENTITLEMENT_AUDIENCE="${ENTITLEMENT_URL}"
export WORKLOAD_IDENTITY_GSA="your-sa@project.iam.gserviceaccount.com"
2. Create the namespace
kubectl create namespace "${NAMESPACE}" 2>/dev/null || true
3. Install infrastructure only
Install the chart with the job disabled first:
helm upgrade --install "${APP_NAME}" ./manifest/chart \
--namespace "${NAMESPACE}" --create-namespace \
-f ./manifest/chart/values-standard.yaml \
--set job.enabled=false \
--set image.repository="${IMAGE_REPO}" \
--set image.tag="${IMAGE_TAG}" \
--set config.projectId="${PROJECT_ID}" \
--set config.entitlementUrl="${ENTITLEMENT_URL}" \
--set config.entitlementAuthMode="${ENTITLEMENT_AUTH_MODE}" \
--set config.entitlementAudience="${ENTITLEMENT_AUDIENCE}" \
--set workloadIdentity.enabled=true \
--set workloadIdentity.gcpServiceAccount="${WORKLOAD_IDENTITY_GSA}" \
--set config.runId="${RUN_ID}"
4. Stage inputs
The runner expects:
<input-root>/<project_id>/01_sources/
On the default PVC layout, that is usually:
/data/input/<project_id>/01_sources/
For the input contract, see Prepare Inputs.
5. Enable the job
Delete any previous job before a new install or upgrade because Jobs are immutable:
kubectl -n "${NAMESPACE}" delete job "${APP_NAME}" --ignore-not-found
Then enable execution:
helm upgrade --install "${APP_NAME}" ./manifest/chart \
--namespace "${NAMESPACE}" \
-f ./manifest/chart/values-standard.yaml \
--set job.enabled=true \
--set image.repository="${IMAGE_REPO}" \
--set image.tag="${IMAGE_TAG}" \
--set config.projectId="${PROJECT_ID}" \
--set config.entitlementUrl="${ENTITLEMENT_URL}" \
--set config.entitlementAuthMode="${ENTITLEMENT_AUTH_MODE}" \
--set config.entitlementAudience="${ENTITLEMENT_AUDIENCE}" \
--set workloadIdentity.enabled=true \
--set workloadIdentity.gcpServiceAccount="${WORKLOAD_IDENTITY_GSA}" \
--set config.runId="${RUN_ID}"
6. Watch the job
kubectl -n "${NAMESPACE}" get job "${APP_NAME}" -o wide
kubectl -n "${NAMESPACE}" logs "job/${APP_NAME}" --all-containers --timestamps -f
Storage modes
PVC mode
helm upgrade --install glassbox-mol-audit ./manifest/chart \
--set storage.type=pvc \
--set storage.pvc.storageClassName=standard \
--set storage.pvc.size=50Gi
GCS Fuse mode on GKE
helm upgrade --install glassbox-mol-audit ./manifest/chart \
--set storage.type=gcs \
--set storage.gcs.bucket=YOUR_BUCKET \
--set workloadIdentity.enabled=true \
--set workloadIdentity.gcpServiceAccount=your-sa@project.iam.gserviceaccount.com
Customer GCS deployment example
helm upgrade --install glassbox-mol-audit ./manifest/chart \
--namespace glassbox-mol-audit --create-namespace \
-f ./manifest/chart/values-standard.yaml \
-f ./examples/values-gcs.yaml \
--set storage.gcs.bucket=YOUR_BUCKET \
--set workloadIdentity.gcpServiceAccount=your-sa@project.iam.gserviceaccount.com \
--set image.repository=us-central1-docker.pkg.dev/PROJECT/REPO/glassbox-mol-audit \
--set image.tag=1.0.0 \
--set config.projectId=YOUR_PROJECT_ID \
--set ubbagent.enabled=false \
--set marketplace.reportingSecret=""
Identity-only entitlement flow
The current deployment model is identity-based. The customer job authenticates to the entitlement service with Workload Identity rather than a customer-managed token.
Set all of the following:
config.entitlementAuthMode=googleconfig.entitlementAudience=<entitlement-url>workloadIdentity.enabled=trueworkloadIdentity.gcpServiceAccount=<gsa>
The customer job then calls these endpoints during a run:
POST /api/entitlement/checkPOST /api/entitlement/consumePOST /api/entitlement/run_startPOST /api/entitlement/run_complete
See API Overview for the grouped endpoint reference.
Outputs
Outputs land under a run-scoped directory:
<output-root>/<run_id>/
Typical artifacts include:
results/summary.jsonresults/metrics.json- HTML report files
run_manifest.jsonseal/seal.jsonseal/seal.sigseal/seal.svg
For the full artifact map, see Output File Reference.
Retrieval paths
PVC retrieval
Use a helper pod mounted to the same claim, then copy the output directory locally.
GCS retrieval
gsutil -m cp -r gs://YOUR_BUCKET/<project_id>/results ./gbx_output/results
Clean uninstall
Default teardown keeps the namespace and the external reporting secret:
./tools/clean_uninstall.sh \
--namespace "${NAMESPACE}" \
--release "${APP_NAME}"
Full purge teardown is destructive:
./tools/clean_uninstall.sh \
--namespace "${NAMESPACE}" \
--release "${APP_NAME}" \
--delete-pvc \
--delete-namespace \
--yes
Delete the Marketplace reporting secret only when that is intentional:
./tools/clean_uninstall.sh \
--namespace "${NAMESPACE}" \
--release "${APP_NAME}" \
--delete-reporting-secret \
--reporting-secret "<app-name>-reporting-secret" \
--yes
Common operational failures
Namespace ownership mismatch
If Helm reports namespace ownership problems, make sure:
- You always pass
--namespace "${NAMESPACE}" - The namespace exists or is created by the same Helm release context
Immutable Job errors
If an upgrade fails because the Job spec is immutable, delete the job and rerun the Helm command:
kubectl -n "${NAMESPACE}" delete job "${APP_NAME}" --ignore-not-found
PVC shrink errors
Do not shrink a PVC between installs. Increase capacity or recreate the PVC if you truly need a smaller volume and can accept data loss.
Multi-attach errors
If a leftover helper pod is still attached to a ReadWriteOnce volume, remove the helper pod before rerunning the job:
kubectl -n "${NAMESPACE}" delete pod gbx-output-reader --ignore-not-found
kubectl -n "${NAMESPACE}" delete job "${APP_NAME}" --ignore-not-found
Entitlement auth errors
401 and 403 usually mean the identity token or IAM invocation settings are wrong. Review Error Codes and the entitlement settings above.
Backup and restore
- In PVC mode, snapshot the persistent disk or copy
/data/outputto a secondary location - In GCS mode, rely on bucket versioning and lifecycle controls where appropriate
For a day-2 operations view, see Multi-Component Operations.