
MinIO Administration⚓︎

MinIO uses policy-based access control (PBAC) for buckets and objects. All other parts of the platform use role-based access control (RBAC) for their resources, so each MinIO policy needs to be linked to a role.

For administration, use the MinIO web UI accessible under https://<your-domain>/minio or via the link in the sidebar of the Kubeflow central dashboard.

Log in to MinIO using the single sign-on (SSO) of your prokube platform. Make sure to log in as a member of the pk-admin group to be able to administer MinIO. Follow the User Management section if you need to grant these privileges to a new user.

MinIO Login Page

Create Bucket⚓︎

Warning

Services like Kubeflow Pipelines and Jupyter Notebooks only have access to the buckets created by default. Users might have more privileges than these services. To grant these services additional privileges, create a new service account and replace the s3creds in the profile namespace with the updated access tokens.

In the menu on the left you can find the Buckets tab. This is the place where you can manage all MinIO buckets. Click on the Create Bucket button in the top right corner.

MinIO Create Bucket

It is advisable to choose a descriptive name and to create buckets only with sensible resource quotas. A quota limits how much of the storage a single bucket can fill, which simplifies recovery if the storage does run full. Alerting and monitoring are in place to prevent this, but quotas are still a good precaution.
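
The same setup can also be scripted with the mc client. The following is a minimal sketch, not the prokube procedure. It assumes that an mc alias named minio has already been configured with mc alias set, and that the installed mc release provides the mc quota set subcommand (older releases used a different subcommand). The helper name, bucket name and quota value are placeholders.

```shell
# Create a bucket and cap its size with a hard quota.
# Assumptions: the mc alias "minio" exists; mc is recent enough for "mc quota set".
create_bucket_with_quota() {
  bucket="$1"   # bucket name, e.g. team-a-artifacts (hypothetical)
  quota="$2"    # quota size, e.g. 50GiB (hypothetical)
  mc mb --ignore-existing "minio/${bucket}" &&
    mc quota set "minio/${bucket}" --size "${quota}"
}
```

Usage: create_bucket_with_quota team-a-artifacts 50GiB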

Create Policy⚓︎

MinIO uses policies to manage fine-grained permissions on objects and buckets. For a deeper look at MinIO's access management, see the official documentation.

  • Open the Policy tab and click on the Create Policy button.
  • Give the policy a descriptive name. Since the realm role has to have the same name, it is recommended to use a prefix to indicate that this role is bound to a MinIO bucket. Consider using s3: as a prefix.
  • Write your policy and create it.

As an example, this policy grants read and write access to all objects in example-bucket, and also allows listing and deleting the bucket itself.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ReadWriteBuckets",
            "Effect": "Allow",
            "Action": [
                "s3:DeleteBucket",
                "s3:GetBucketLocation",
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::example-bucket"
            ]
        },
        {
            "Sid": "ReadWriteObjects",
            "Effect": "Allow",
            "Action": [
                "s3:DeleteObject",
                "s3:GetObject",
                "s3:PutObject"
            ],
            "Resource": [
                "arn:aws:s3:::example-bucket/*"
            ]
        }
    ]
}
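
A read-only variant follows the same structure with the write and delete actions left out. The helper below is a sketch that prints such a policy for a given bucket name, so the output can be pasted into the Create Policy dialog. The function name and the Sid values are illustrative and not part of prokube.

```shell
# Print a read-only policy for one bucket (hypothetical helper).
make_readonly_policy() {
  bucket="$1"
  cat <<EOF
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ReadBucket",
            "Effect": "Allow",
            "Action": ["s3:GetBucketLocation", "s3:ListBucket"],
            "Resource": ["arn:aws:s3:::${bucket}"]
        },
        {
            "Sid": "ReadObjects",
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": ["arn:aws:s3:::${bucket}/*"]
        }
    ]
}
EOF
}
```

Usage: make_readonly_policy example-bucket > readonly-policy.json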

Grant Access to Bucket⚓︎

To give a user access to a bucket, two things are needed:

  1. Create a policy that grants the required privileges.
  2. Create a realm role in Keycloak and assign it to the user or group that should have access. The realm role must have the same name as the policy.
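
The second step can also be done from the command line with the Keycloak Admin CLI instead of the Keycloak web console. This is only a sketch under several assumptions: kcadm.sh is available and already authenticated (kcadm.sh config credentials), and the realm name, group name and helper name are placeholders that are not taken from this page.

```shell
# Create a realm role named after the MinIO policy and assign it to a group.
# Assumptions: kcadm.sh is installed and logged in; realm and group are placeholders.
# KCADM can be overridden if the CLI lives elsewhere.
grant_policy_to_group() {
  realm="$1"    # Keycloak realm (placeholder)
  policy="$2"   # must match the MinIO policy name, e.g. s3:example-bucket
  group="$3"    # Keycloak group that should get access (placeholder)
  "${KCADM:-kcadm.sh}" create roles -r "${realm}" -s "name=${policy}" &&
    "${KCADM:-kcadm.sh}" add-roles -r "${realm}" --gname "${group}" --rolename "${policy}"
}
```

Usage: grant_policy_to_group <realm> s3:example-bucket <group>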

Lifecycle rules⚓︎

MinIO supports different kinds of lifecycle rules to manage your objects. For example, you can set up rules that automatically delete old objects using Object Expiration. This is helpful for removing artifacts from pipeline runs that are older than a certain age.
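
An expiration rule can be added with the mc client as well as in the web UI. The sketch below assumes an mc alias named minio and a recent mc release in which lifecycle commands live under mc ilm rule (older releases used mc ilm add). The helper name and the 30-day value in the usage line are examples only.

```shell
# Add an expiration rule to a bucket and list the resulting rules.
# Assumptions: the mc alias "minio" exists; mc provides "mc ilm rule add".
expire_objects_after() {
  bucket="$1"   # e.g. mlpipeline
  days="$2"     # objects older than this many days are deleted
  mc ilm rule add --expire-days "${days}" "minio/${bucket}" &&
    mc ilm rule ls "minio/${bucket}"
}
```

Usage: expire_objects_after mlpipeline 30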

Changing the MinIO tenant volume size⚓︎

MinIO uses a default storage class to save its data. This is openebs-hostpath for single-node deployments and mayastor-three-replicas (mayastor) for high-availability (multinode) deployments.

With the mayastor storage class (our default for multinode deployments) you can increase the volume size, but not decrease it.
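
Whether a StorageClass permits expansion is recorded in its allowVolumeExpansion field, so it can be checked before starting. A small sketch, with a hypothetical helper name:

```shell
# Succeeds (exit code 0) if the given StorageClass allows volume expansion.
supports_expansion() {
  sc="$1"
  [ "$(kubectl get storageclass "${sc}" -o jsonpath='{.allowVolumeExpansion}')" = "true" ]
}
```

Usage: supports_expansion mayastor && echo "expansion is allowed"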

Increasing the volume size⚓︎

To increase the volume size:

  1. If possible, back up all MinIO data and settings first (see Re-creating the MinIO tenant volume for how to back up data with mc).
  2. Disable ArgoCD syncing of the MinIO tenant app.
  3. Note the current replica count, then scale down the StatefulSet of the MinIO tenant:
    kubectl get statefulset defaulttenant-ss-0 -n minio  # note the desired replicas
    kubectl scale statefulset defaulttenant-ss-0 -n minio --replicas=0
    
  4. Change the volume size in the PVC to your needs (mayastor supports online volume expansion).
  5. Scale the StatefulSet back up to its previous size:
    kubectl scale statefulset defaulttenant-ss-0 -n minio --replicas=<previous-replicas>
    
  6. Change the storage size in the MinIO tenant CR in the corresponding GitOps branch. This has no effect on the volume itself; it only keeps the manifest consistent with the actual volume size.
  7. Enable ArgoCD syncing for the MinIO tenant app again.
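
Step 4 can be done with kubectl patch instead of editing the PVC by hand. A minimal sketch, assuming the tenant PVC is named 0-defaulttenant-ss-0-0 as elsewhere on this page. The helper name and the target size in the usage line are examples.

```shell
# Request a larger size for a PVC in the minio namespace, then print the
# capacity currently reported in its status (it may lag behind the request).
resize_pvc() {
  pvc="$1"    # e.g. 0-defaulttenant-ss-0-0
  size="$2"   # new, larger size, e.g. 300Gi (example value)
  kubectl patch pvc "${pvc}" -n minio --type merge \
    -p "{\"spec\":{\"resources\":{\"requests\":{\"storage\":\"${size}\"}}}}" &&
    kubectl get pvc "${pvc}" -n minio -o jsonpath='{.status.capacity.storage}{"\n"}'
}
```

Usage: resize_pvc 0-defaulttenant-ss-0-0 300Gi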

Decreasing the volume size⚓︎

Mayastor volumes cannot be shrunk. To decrease the volume size, you have to re-create the tenant with a smaller volume. Follow the procedure in Re-creating the MinIO tenant volume, creating the new tenant with the smaller volume size.

Re-creating the MinIO tenant volume⚓︎

Some StorageClasses (e.g. mayastor) don't allow shrinking a volume, so to decrease a tenant's volume size you have to delete the old MinIO tenant and set up a new one with the desired volume. You can back up the data with mc:

# expose minio via port forwarding
kubectl port-forward -n minio defaulttenant-ss-0-0 59090:9000

# new terminal
# connect mc 
mc alias set minio http://localhost:59090 <key> <secret>
# copy over data to local disk
mc cp --recursive minio/ .
# delete old tenant and release pvc
kubectl delete tenant defaulttenant -n minio
kubectl delete pvc -n minio 0-defaulttenant-ss-0-0
kubectl delete pvc -n minio defaulttenant-prometheus-defaulttenant-prometheus-0

# create new tenant (set the desired, smaller volume size in the Tenant CR first)
kubectl apply -k paas/data_storage/minio/tenants

# remove old mc alias and connect new tenant (you will likely have to restart port forwarding)
mc alias remove minio
mc alias set minio http://localhost:59090 <key> <secret>
# create all buckets and copy over data
mc mb minio/mlpipeline
mc cp --recursive mlpipeline minio/
# ...
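
With many buckets, the last two commands can be repeated in a loop. The sketch below assumes that every top-level directory in the local backup directory corresponds to one bucket, which is the layout produced by the mc cp --recursive minio/ . command above, and that the minio alias already points at the new tenant. The helper name is hypothetical.

```shell
# Re-create one bucket per top-level directory of the backup and copy its data.
# Assumption: the mc alias "minio" points at the new tenant.
restore_buckets() {
  backup_dir="$1"
  for dir in "${backup_dir}"/*/; do
    [ -d "${dir}" ] || continue            # skip when there are no directories
    bucket="$(basename "${dir}")"
    mc mb --ignore-existing "minio/${bucket}" || return 1
    # Same form as above: run from inside the backup directory
    (cd "${backup_dir}" && mc cp --recursive "${bucket}" minio/) || return 1
  done
}
```

Usage: restore_buckets . (run from the directory that holds the backup)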

Transferring MinIO tenant to a new StorageClass⚓︎

Use case: a MinIO tenant already has a PV with data that uses an undesired StorageClass (e.g. OpenEBS Hostpath on managed Kubernetes). This section is similar to the previous one, but describes how to transfer the data through an intermediate PV instead of copying it to a local disk. Steps:

  1. Downscale the MinIO tenant StatefulSet to 0 (required because the underlying PVC uses ReadWriteOnce access mode):

    kubectl scale statefulset defaulttenant-ss-0 -n minio --replicas=0
    

  2. Create an intermediate PVC (which provisions the intermediate PV) and a migrator pod:

    # Set this to your target StorageClass
    STORAGE_CLASS="your-desired-storageclass"
    
    kubectl apply -f - <<EOF
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: tenant-migrate-pvc
      namespace: minio
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 200Gi
      storageClassName: ${STORAGE_CLASS}
    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: data-migrator
      namespace: minio
    spec:
      containers:
      - name: migrator
        image: alpine
        command: ["/bin/sh", "-c", "sleep 3600"]
        volumeMounts:
        - name: old
          mountPath: /mnt/old
        - name: new
          mountPath: /mnt/new
      volumes:
      - name: old
        persistentVolumeClaim:
          claimName: 0-defaulttenant-ss-0-0
      - name: new
        persistentVolumeClaim:
          claimName: tenant-migrate-pvc
    EOF
    

  3. Open a shell in the pod and copy the contents of the old tenant's PV:

    # exec into pod first
    kubectl exec -it data-migrator -n minio -- /bin/sh
    

    In the pod shell:

    cp -a /mnt/old/. /mnt/new/  # inside pod
    

    If copy fails

    If the cp command fails (e.g., connection to pod is lost), you may have incomplete files in the destination. Before retrying, run the following (inside the pod!) to remove any incomplete files:

    find /mnt/old -type f -print0 | while IFS= read -r -d '' f; do
      newf="/mnt/new/${f#/mnt/old/}"
      if [ -f "$newf" ]; then
        if [ "$(stat -c%s "$f")" -ne "$(stat -c%s "$newf")" ]; then
          echo "Removing incomplete $newf"
          rm "$newf"
        fi
      fi
    done
    

    Then resume copying:

    cp -au /mnt/old/. /mnt/new/
    

  4. Verify the contents are copied (inside pod):

    # Compare file counts
    find /mnt/old -type f | wc -l
    find /mnt/new -type f | wc -l
    
    # Compare directory sizes (note: sizes may differ slightly due to filesystem overhead)
    du -sh /mnt/old
    du -sh /mnt/new
    
    # OPTIONAL: Verify no files are missing by comparing checksums (for critical data, can be very slow)
    cd /mnt/old && find . -type f -exec md5sum {} \; | sort > /tmp/old.md5
    cd /mnt/new && find . -type f -exec md5sum {} \; | sort > /tmp/new.md5
    diff /tmp/old.md5 /tmp/new.md5
    

  5. Delete the migrator pod and the old PVC:

    kubectl delete po data-migrator -n minio
    kubectl delete pvc 0-defaulttenant-ss-0-0 -n minio  # this should also delete the corresponding PV
    

  6. Recreate the tenant:

    Using non-default StorageClass

    If you need to use a specific StorageClass (other than the cluster default), you must edit the tenant CR manifest to add or update spec.pools[0].volumeClaimTemplate.spec.storageClassName before applying. For example:

    spec:
      pools:
      - volumeClaimTemplate:
          spec:
            storageClassName: your-desired-storageclass
            # ... other specs
    

    If using GitOps with ArgoCD (prokube default), the general instructions are:

    # Step 1: Disable MinIO tenant autosync:
    kubectl -n argocd patch applicationset storage-apps \
      --type='json' \
      -p='[
        {"op":"remove","path":"/spec/template/spec/syncPolicy/automated"}
      ]'
    
    # Step 2: update the Tenant manifest (if needed, see note above) and push to your GitOps branch
    ...
    
    # Step 3: delete the existing tenant
    kubectl delete tenant defaulttenant -n minio
    
    # Step 4: enable autosync again - this should trigger tenant recreation
    kubectl -n argocd patch applicationset storage-apps \
      --type='merge' \
      -p='{"spec":{"template":{"spec":{"syncPolicy":{"automated":{}}}}}}'
    

    If GitOps is disabled, you can just recreate the tenant manually:

    kubectl delete tenant defaulttenant -n minio
    # command below will use current cluster DEFAULT StorageClass for tenant's new PV
    # if tenant was not edited
    kubectl apply -k paas/data_storage/minio/tenants  
    

  7. Downscale the tenant StatefulSet again (required for ReadWriteOnce PVC access):

    kubectl scale statefulset defaulttenant-ss-0 -n minio --replicas=0
    

  8. Migrate the copied content to the newly created PV:

    kubectl apply -f - <<EOF
    apiVersion: v1
    kind: Pod
    metadata:
      name: data-migrator-2
      namespace: minio
    spec:
      containers:
      - name: migrator
        image: alpine
        command: ["/bin/sh", "-c", "sleep 3600"]
        volumeMounts:
        - name: old
          mountPath: /mnt/old
        - name: new
          mountPath: /mnt/new
      volumes:
      - name: old
        persistentVolumeClaim:
          claimName: tenant-migrate-pvc
      - name: new
        persistentVolumeClaim:
          claimName: 0-defaulttenant-ss-0-0
    EOF
    

    Open shell in the pod:

    kubectl exec -it data-migrator-2 -n minio -- /bin/sh
    

    Copy the data in the pod shell:

    cp -a /mnt/old/. /mnt/new/  # inside pod
    

    If it fails, see the note in step 3. After it completes successfully, you may want to run the commands from step 4 to verify data integrity.

  9. Delete the migrator pod and upscale the tenant StatefulSet back to 1:

    kubectl delete po data-migrator-2 -n minio
    kubectl scale statefulset defaulttenant-ss-0 -n minio --replicas=1
    

  10. Verify that the tenant is working: you can log in to MinIO in the web UI and the MinIO data is shown there.

    Verify the default tenant PV is using the desired storage class:

    kubectl get pvc 0-defaulttenant-ss-0-0 -n minio -o jsonpath='{.spec.storageClassName}{"\n"}'
    

    Then delete the intermediate PVC:

    kubectl delete pvc tenant-migrate-pvc -n minio  # this should also delete the corresponding PV
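
The checksum comparison from step 4 can be wrapped into a single helper that reports a clear result, which is convenient because the check is run twice (after step 3 and after step 8). This is a sketch with a hypothetical function name. It uses only find, md5sum and sort, which the alpine image of the migrator pod is expected to provide, and it is just as slow as the original commands on large volumes.

```shell
# Compare two directory trees by relative path and MD5 checksum.
# Returns 0 if they match, 1 on a mismatch, 2 if a directory cannot be read.
compare_trees() {
  old="$1"
  new="$2"
  old_sums="$(cd "${old}" && find . -type f -exec md5sum {} \; | sort -k 2)" || return 2
  new_sums="$(cd "${new}" && find . -type f -exec md5sum {} \; | sort -k 2)" || return 2
  if [ "${old_sums}" = "${new_sums}" ]; then
    echo "OK: contents match"
  else
    echo "MISMATCH between ${old} and ${new}"
    return 1
  fi
}
```

Usage inside the migrator pod, after pasting the function into the shell: compare_trees /mnt/old /mnt/new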