Azure Container Storage for Azure Arc Edge Volumes - deploying on Azure Local AKS

Late last year, Microsoft released the latest version of the snappily titled ‘Azure Container Storage enabled by Azure Arc’ (ACSA), a solution that makes it easier to get data from your container workloads to Azure Blob Storage. You can read the overview here, but in essence it’s a pretty configurable solution that lets you set up locally resilient storage for your container apps, or use it for cloud ingest: sending data to Azure and purging it locally once the transfer is confirmed.

The purpose of this post is to give an example of the steps needed to get this set up on an Azure Local AKS cluster.

If you have an existing cluster you want to deploy to, take heed of the pre-reqs:

Single-node or 2-node cluster

per node:

  • 4 CPUs
  • 16 GB RAM

Multi-node cluster

per node:

  • 8 CPUs
  • 32 GB RAM

For a single-node or 2-node cluster, 16 GB of RAM should be fine, but in more active scenarios, 32 GB is recommended.
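Before going any further, it's worth confirming your nodes actually meet these numbers. This is standard kubectl, nothing ACSA-specific:

# Show allocatable CPU and memory per node
kubectl get nodes -o custom-columns='NAME:.metadata.name,CPU:.status.allocatable.cpu,MEMORY:.status.allocatable.memory'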

Prepare AKS enabled by Azure Arc cluster

Make sure you have the latest AZ CLI extensions installed.

Azure Arc Kubernetes Extensions Documentation

# Make sure the az extensions are installed
az extension add --name connectedk8s --upgrade
az extension add --name k8s-extension --upgrade
az extension add --name k8s-runtime --upgrade
az extension add --name aksarc --upgrade

# Login to Azure
az login
az account set --subscription <subscription-id>

As of the time of writing, the extensions above were all on their latest versions.
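You can check the versions you have installed with a standard az command:

# List installed az CLI extensions and their versions
az extension list --query "[].{Name:name, Version:version}" --output table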

If you have a brand-new cluster, you will need to install a load balancer.

Install MetalLB

# Check you have the relevant Graph permissions
az ad sp list --filter "appId eq '087fca6e-4606-4d41-b3f6-5ebdf75b8b4c'" --output json 

# If that command returns an empty result, use the alternative method: https://learn.microsoft.com/en-us/azure/aks/aksarc/deploy-load-balancer-cli#option-2-enable-arc-extension-for-metallb-using-az-k8s-extension-add-command


# Enable the extension
RESOURCE_GROUP_NAME="YOUR_RESOURCE_GROUP_NAME" # name of the resource group where the AKS Arc cluster is deployed
CLUSTER_NAME="YOUR_CLUSTER_NAME"
AKS_ARC_CLUSTER_URI=$(az aksarc show --resource-group ${RESOURCE_GROUP_NAME} --name ${CLUSTER_NAME} --query id -o tsv | cut -d'/' -f1-9) # keep the first nine '/'-separated segments of the resource ID, i.e. the parent connected-cluster URI

az k8s-runtime load-balancer enable --resource-uri $AKS_ARC_CLUSTER_URI

# Deploy the Load Balancer

LB_NAME="al-lb-01" # must be lowercase, alphanumeric, '-' or '.' (RFC 1123)
IP_RANGE="192.168.1.100-192.168.1.150"
ADVERTISE_MODE="ARP" # Options: ARP, BGP, Both

az k8s-runtime load-balancer create --load-balancer-name $LB_NAME \
--resource-uri $AKS_ARC_CLUSTER_URI \
--addresses $IP_RANGE \
--advertise-mode $ADVERTISE_MODE
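To confirm the load balancer was created, you can list the load balancers on the cluster. I'm assuming the k8s-runtime extension's list subcommand here; az k8s-runtime load-balancer --help will show what's available in your version:

# List the load balancers configured on the cluster
az k8s-runtime load-balancer list --resource-uri $AKS_ARC_CLUSTER_URI --output table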

Open Service Mesh (OSM) is used to deliver the ACSA capabilities, so deploy it on the Arc-connected AKS cluster using the following commands:

RESOURCE_GROUP_NAME="YOUR_RESOURCE_GROUP_NAME"
CLUSTER_NAME="YOUR_CLUSTER_NAME"

az k8s-extension create --resource-group $RESOURCE_GROUP_NAME \
--cluster-name $CLUSTER_NAME \
--cluster-type connectedClusters \
--extension-type Microsoft.openservicemesh \
--scope cluster \
--name osm \
--config "osm.osm.featureFlags.enableWASMStats=false" \
--config "osm.osm.enablePermissiveTrafficPolicy=false" \
--config "osm.osm.configResyncInterval=10s" \
--config "osm.osm.osmController.resource.requests.cpu=100m" \
--config "osm.osm.osmBootstrap.resource.requests.cpu=100m" \
--config "osm.osm.injector.resource.requests.cpu=100m"

Deploy IoT Operations Dependencies

The official documentation says to deploy the IoT Operations platform extension, specifically for its cert-manager component. It doesn't say whether you can skip this if you're not using Azure IoT Operations, so I deployed it anyway.

RESOURCE_GROUP_NAME="YOUR_RESOURCE_GROUP_NAME"
CLUSTER_NAME="YOUR_CLUSTER_NAME"

az k8s-extension create --cluster-name "${CLUSTER_NAME}" \
--name "${CLUSTER_NAME}-certmgr" \
--resource-group "${RESOURCE_GROUP_NAME}" \
--cluster-type connectedClusters \
--extension-type microsoft.iotoperations.platform \
--scope cluster \
--release-namespace cert-manager
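As a sanity check, confirm the cert-manager pods came up in the namespace set via --release-namespace:

# cert-manager pods should show as Running
kubectl get pods -n cert-manager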

Deploy the container storage extension

RESOURCE_GROUP_NAME="YOUR_RESOURCE_GROUP_NAME"
CLUSTER_NAME="YOUR_CLUSTER_NAME"

az k8s-extension create --resource-group "${RESOURCE_GROUP_NAME}" \
--cluster-name "${CLUSTER_NAME}" \
--cluster-type connectedClusters \
--name azure-arc-containerstorage \
--extension-type microsoft.arc.containerstorage
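As with the OSM extension, you can poll the provisioning state until it reports Succeeded:

# Check the container storage extension provisioned successfully
az k8s-extension show --resource-group "${RESOURCE_GROUP_NAME}" \
--cluster-name "${CLUSTER_NAME}" \
--cluster-type connectedClusters \
--name azure-arc-containerstorage \
--query provisioningState -o tsv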

Now it's time to deploy the edge storage configuration. As my cluster is deployed on Azure Local AKS and is connected to Azure Arc, I went with the Arc config option detailed in the docs.

cat <<EOF > edgeConfig.yaml
apiVersion: arccontainerstorage.azure.net/v1
kind: EdgeStorageConfiguration
metadata:
  name: edge-storage-configuration
spec:
  defaultDiskStorageClasses:
    - "default"
    - "local-path"
  serviceMesh: "osm"
EOF

kubectl apply -f "edgeConfig.yaml"

Once it's deployed, you can list the storage classes available to the cluster:

kubectl get storageclass
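Among the results you should see the classes the extension created, including cloud-backed-sc, which is the class the PVC later in this post uses. To inspect it:

# Inspect the cloud-backed storage class used by the PVC below
kubectl get storageclass cloud-backed-sc -o yaml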

Setting up cloud ingest volumes

Now we're ready to configure permissions on the Azure Storage Account so that the Edge Volume provider has access to upload data to the blob container.

Official Documentation

You can use the script below to get the extension identity and then assign the necessary role to the storage account:

RESOURCE_GROUP_NAME="YOUR_RESOURCE_GROUP_NAME"
CLUSTER_NAME="YOUR_CLUSTER_NAME"
EXTENSION_TYPE="microsoft.arc.containerstorage"
EXTENSION_IDENTITY_PRINCIPAL_ID=$(az k8s-extension list \
--cluster-name ${CLUSTER_NAME} \
--resource-group ${RESOURCE_GROUP_NAME} \
--cluster-type connectedClusters \
| jq --arg extType ${EXTENSION_TYPE} 'map(select(.extensionType == $extType)) | .[] | .identity.principalId' -r)

STORAGE_ACCOUNT_NAME="YOUR_STORAGE_ACCOUNT_NAME"
STORAGE_ACCOUNT_RESOURCE_GROUP="YOUR_STORAGE_ACCOUNT_RESOURCE_GROUP"

STORAGE_ACCOUNT_ID=$(az storage account show --name ${STORAGE_ACCOUNT_NAME} --resource-group ${STORAGE_ACCOUNT_RESOURCE_GROUP} --query id --output tsv)

az role assignment create --assignee ${EXTENSION_IDENTITY_PRINCIPAL_ID} --role "Storage Blob Data Contributor" --scope ${STORAGE_ACCOUNT_ID}
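Role assignments can take a couple of minutes to propagate. You can confirm it landed with:

# Verify the role assignment on the storage account
az role assignment list --assignee ${EXTENSION_IDENTITY_PRINCIPAL_ID} --scope ${STORAGE_ACCOUNT_ID} --output table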

Create a deployment to test the cloud ingest volume

Now we can test transferring data from edge to cloud. I'm using the demo from Azure Arc Jumpstart: Deploy demo from Azure Arc Jumpstart

First off, create a container on the storage account to store the data from the edge volume.

export STORAGE_ACCOUNT_NAME="YOUR_STORAGE_ACCOUNT_NAME"

# az storage container create is a data-plane command: it takes the account name
# and an auth mode rather than a resource group. --auth-mode login requires your
# signed-in identity to have a data-plane role such as Storage Blob Data Contributor.
az storage container create --name "fault-detection" --account-name ${STORAGE_ACCOUNT_NAME} --auth-mode login
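You can confirm the container exists:

# Confirm the container was created
az storage container show --name "fault-detection" --account-name ${STORAGE_ACCOUNT_NAME} --auth-mode login --output table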

Next, create a file called acsa-deployment.yaml using the following content:

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  ### Create a name for your PVC ###
  name: acsa-pvc
  ### Use a namespace that matched your intended consuming pod, or "default" ###
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  storageClassName: cloud-backed-sc
---
apiVersion: "arccontainerstorage.azure.net/v1"
kind: EdgeSubvolume
metadata:
  name: faultdata
spec:
  edgevolume: acsa-pvc
  path: faultdata # If you change this path, update the LOCAL_STORAGE env value in the webserver deployment below to match. Don't use a preceding slash.
  auth:
    authType: MANAGED_IDENTITY
  storageaccountendpoint: "https://${STORAGE_ACCOUNT_NAME}.blob.core.windows.net/"
  container: fault-detection
  ingestPolicy: edgeingestpolicy-default # Optional: See the following instructions if you want to update the ingestPolicy with your own configuration
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: acsa-webserver
spec:
  replicas: 1
  selector:
    matchLabels:
      app: acsa-webserver
  template:
    metadata:
      labels:
        app: acsa-webserver
    spec:
      containers:
        - name: acsa-webserver
          image: mcr.microsoft.com/jumpstart/scenarios/acsa_ai_webserver:1.0.0
          resources:
            limits:
              cpu: "1"
              memory: "1Gi"
            requests:
              cpu: "200m"
              memory: "256Mi"
          ports:
            - containerPort: 8000
          env:
            - name: RTSP_URL
              value: rtsp://virtual-rtsp:8554/stream
            - name: LOCAL_STORAGE
              value: /app/acsa_storage/faultdata
          volumeMounts:
            ### This name must match the volumes.name attribute below ###
            - name: blob
              ### This mountPath is where the PVC will be attached to the pod's filesystem ###
              mountPath: "/app/acsa_storage"
      volumes:
        ### User-defined 'name' that will be used to link the volumeMounts. This name must match volumeMounts.name as specified above. ###
        - name: blob
          persistentVolumeClaim:
            ### This claimName must refer to the PVC resource 'name' as defined in the PVC config. This name will match what your PVC resource was actually named. ###
            claimName: acsa-pvc

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: virtual-rtsp
spec:
  replicas: 1
  selector:
    matchLabels:
      app: virtual-rtsp
  minReadySeconds: 10
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  template:
    metadata:
      labels:
        app: virtual-rtsp
    spec:
      initContainers:
        - name: init-samples
          image: busybox
          resources:
            limits:
              cpu: "200m"
              memory: "256Mi"
            requests:
              cpu: "100m"
              memory: "128Mi"
          command:
          - wget
          - "-O"
          - "/samples/bolt-detection.mp4"
          - https://github.com/ldabas-msft/jumpstart-resources/raw/main/bolt-detection.mp4
          volumeMounts:
          - name: tmp-samples
            mountPath: /samples
      containers:
        - name: virtual-rtsp
          image: "kerberos/virtual-rtsp"
          resources:
            limits:
              cpu: "500m"
              memory: "512Mi"
            requests:
              cpu: "200m"
              memory: "256Mi"
          imagePullPolicy: Always
          ports:
            - containerPort: 8554
          env:
            - name: SOURCE_URL
              value: "file:///samples/bolt-detection.mp4"
          volumeMounts:
            - name: tmp-samples
              mountPath: /samples
      volumes:
        - name: tmp-samples
          emptyDir: { }
---
apiVersion: v1
kind: Service
metadata:
  name: virtual-rtsp
  labels:
    app: virtual-rtsp
spec:
  type: LoadBalancer
  ports:
    - port: 8554
      targetPort: 8554
      name: rtsp
      protocol: TCP
  selector:
    app: virtual-rtsp
---
apiVersion: v1
kind: Service
metadata:
  name: acsa-webserver-svc
  labels:
    app: acsa-webserver
spec:
  type: LoadBalancer
  ports:
    - port: 80
      targetPort: 8000
      protocol: TCP
  selector:
    app: acsa-webserver

Once created, apply the deployment:

export STORAGE_ACCOUNT_NAME="YOUR_STORAGE_ACCOUNT_NAME" # we need to export the storage account name so envsubst can substitute it
envsubst < acsa-deployment.yaml | kubectl apply -f -

[!NOTE]
This will deploy into the default namespace.

This creates the deployments and the volumes, with envsubst replacing the storage account name placeholder with the variable exported above.
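A quick note on the ingestPolicy field in the EdgeSubvolume above: it references the default policy, edgeingestpolicy-default. If you need different upload or purge timing, you can create your own EdgeIngestPolicy and point the subvolume at it. The sketch below follows the shape the official docs describe for this CRD, but treat the names and values as illustrative and verify them against the docs:

# Hypothetical policy name; reference it from the EdgeSubvolume's ingestPolicy field
cat <<EOF > myIngestPolicy.yaml
apiVersion: arccontainerstorage.azure.net/v1
kind: EdgeIngestPolicy
metadata:
  name: my-ingest-policy
spec:
  ingest:
    order: oldest-first # upload order for queued files
    minDelaySec: 60 # minimum age before a file is eligible for upload
  eviction:
    order: unordered # purge order for files already transferred
    minDelaySec: 300 # minimum age before a transferred file is purged locally
EOF

kubectl apply -f myIngestPolicy.yaml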

If you want to check the status of the edge volume, such as if it's connected or how many files are in the queue, you can use the following command:

# List the edge subvolumes
kubectl get edgesubvolume

kubectl describe edgesubvolume faultdata

Testing

Assuming everything has deployed without errors, you should be able to access the web server via the external IP of its service. You can find the IP address by running:

kubectl get svc acsa-webserver-svc

Obtain the EXTERNAL-IP and port (should be 80) and use that to access the web server.
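After the demo has been running for a little while, you can also check that files are actually arriving in the blob container. This assumes your signed-in identity has data-plane read access on the account:

# List blobs uploaded by the edge volume
az storage blob list --account-name ${STORAGE_ACCOUNT_NAME} \
--container-name "fault-detection" \
--auth-mode login \
--query "[].{Name:name, Size:properties.contentLength}" \
--output table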

Take a look at the edge subvolume for transfer metrics:

kubectl get edgesubvolume

And that’s how simple (?!) it is to set up. As long as you’ve met the pre-reqs and set the permissions properly, it’s pretty smooth to implement.