Late last year, Microsoft released the latest version of the snappily titled ‘Azure Container Storage enabled by Azure Arc’ (ACSA), a solution that makes it easier to get data from your container workloads to Azure Blob Storage. You can read the overview here, but in essence it’s pretty configurable: you can set up local resilient storage for your container apps, or use it for cloud ingest, sending data to Azure and purging the local copy once the transfer is confirmed.
The purpose of this post is to give an example of the steps needed to get this set up on an Azure Local AKS cluster.
If you have an existing cluster you want to deploy to, take heed of the pre-reqs:
Single-node or 2-node cluster
per node:
- 4 CPUs
- 16 GB RAM
Multi-node cluster
per node:
- 8 CPUs
- 32 GB RAM
16 GB of RAM should be fine, but 32 GB is recommended for more active scenarios.
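As a quick sanity check, the thresholds above can be turned into a small shell helper. This is only a sketch: the function name `meets_acsa_minimum` is made up for illustration, and it assumes "multi-node" means three or more nodes. Feed it the per-node values reported by `kubectl get nodes`.

```shell
# Hypothetical helper: succeed when one node meets the minimums listed above.
# Args: <node_count> <cpus_per_node> <memory_gib_per_node>
meets_acsa_minimum() (
  node_count=$1; cpus=$2; mem_gib=$3
  min_cpu=4; min_mem=16              # single-node or 2-node cluster
  if [ "$node_count" -gt 2 ]; then   # assumption: "multi-node" means 3+ nodes
    min_cpu=8; min_mem=32
  fi
  [ "$cpus" -ge "$min_cpu" ] && [ "$mem_gib" -ge "$min_mem" ]
)
```

For example, `meets_acsa_minimum 2 4 16 && echo "node size is OK"` checks a 2-node cluster with 4 CPUs and 16 GB per node.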
Prepare AKS enabled by Azure Arc cluster
Make sure you have the latest AZ CLI extensions installed.
Azure Arc Kubernetes Extensions Documentation
# Make sure the az extensions are installed
az extension add --name connectedk8s --upgrade
az extension add --name k8s-extension --upgrade
az extension add -n k8s-runtime --upgrade
az extension add --name aksarc --upgrade
# Login to Azure
az login
az account set --subscription <subscription-id>
As of time of writing, here are the versions of the extensions:
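The versions on your own machine may well differ. To see what you have installed, a loop over `az extension show` should do. The helper name `show_ext_versions` is my own invention; the extension names are the four installed above.

```shell
# Hypothetical helper: print "<extension> <version>" for each extension name given.
show_ext_versions() {
  for ext_name in "$@"; do
    # az extension show returns the metadata of an installed CLI extension
    ext_version=$(az extension show --name "$ext_name" --query version --output tsv)
    printf '%s %s\n' "$ext_name" "$ext_version"
  done
}
```

Call it as `show_ext_versions connectedk8s k8s-extension k8s-runtime aksarc`.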
If you have a fresh cluster, you will need to install the load balancer.
# Check you have the relevant Graph permissions
az ad sp list --filter "appId eq '087fca6e-4606-4d41-b3f6-5ebdf75b8b4c'" --output json
# If that command returns an empty result, use the alternative method: https://learn.microsoft.com/en-us/azure/aks/aksarc/deploy-load-balancer-cli#option-2-enable-arc-extension-for-metallb-using-az-k8s-extension-add-command
# Enable the extension
RESOURCE_GROUP_NAME="YOUR_RESOURCE_GROUP_NAME" # name of the resource group where the AKS Arc cluster is deployed
CLUSTER_NAME="YOUR_CLUSTER_NAME"
AKS_ARC_CLUSTER_URI=$(az aksarc show --resource-group ${RESOURCE_GROUP_NAME} --name ${CLUSTER_NAME} --query id -o tsv | cut -d'/' -f1-9)
az k8s-runtime load-balancer enable --resource-uri $AKS_ARC_CLUSTER_URI
# Deploy the Load Balancer
LB_NAME="al-lb-01" # must be lowercase, alphanumeric, '-' or '.' (RFC 1123)
IP_RANGE="192.168.1.100-192.168.1.150"
ADVERTISE_MODE="ARP" # Options: ARP, BGP, Both
az k8s-runtime load-balancer create --load-balancer-name $LB_NAME \
--resource-uri $AKS_ARC_CLUSTER_URI \
--addresses $IP_RANGE \
--advertise-mode $ADVERTISE_MODE
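Since the load balancer name has naming rules and the IP range is easy to mistype, it can be worth checking both before calling `az`. The two helpers below are a sketch with made-up names. They only check the shape of the values (no length limit, no check that each octet is 255 or less), and the range check only covers the `start-end` form used above.

```shell
# Hypothetical helper: succeed when the name is lowercase alphanumeric, '-' or '.',
# and starts and ends with an alphanumeric character.
is_valid_lb_name() {
  printf '%s\n' "$1" | grep -Eq '^[a-z0-9]([a-z0-9.-]*[a-z0-9])?$'
}

# Hypothetical helper: succeed when the value looks like "a.b.c.d-e.f.g.h".
looks_like_ip_range() {
  printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}-([0-9]{1,3}\.){3}[0-9]{1,3}$'
}
```

For example: `is_valid_lb_name "$LB_NAME" && looks_like_ip_range "$IP_RANGE" || echo "check LB_NAME and IP_RANGE" >&2`.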
ACSA relies on Open Service Mesh (OSM), so deploy the OSM extension to the connected AKS cluster with the following commands:
RESOURCE_GROUP_NAME="YOUR_RESOURCE_GROUP_NAME"
CLUSTER_NAME="YOUR_CLUSTER_NAME"
az k8s-extension create --resource-group $RESOURCE_GROUP_NAME \
--cluster-name $CLUSTER_NAME \
--cluster-type connectedClusters \
--extension-type Microsoft.openservicemesh \
--scope cluster \
--name osm \
--config "osm.osm.featureFlags.enableWASMStats=false" \
--config "osm.osm.enablePermissiveTrafficPolicy=false" \
--config "osm.osm.configResyncInterval=10s" \
--config "osm.osm.osmController.resource.requests.cpu=100m" \
--config "osm.osm.osmBootstrap.resource.requests.cpu=100m" \
--config "osm.osm.injector.resource.requests.cpu=100m"
Deploy IoT Operations Dependencies
The official documentation says to deploy the IoT Operations extension, specifically the cert-manager component. It doesn't say whether you can skip this step if you aren't using Azure IoT Operations, so I deployed it anyway.
RESOURCE_GROUP_NAME="YOUR_RESOURCE_GROUP_NAME"
CLUSTER_NAME="YOUR_CLUSTER_NAME"
az k8s-extension create --cluster-name "${CLUSTER_NAME}" \
--name "${CLUSTER_NAME}-certmgr" \
--resource-group "${RESOURCE_GROUP_NAME}" \
--cluster-type connectedClusters \
--extension-type microsoft.iotoperations.platform \
--scope cluster \
--release-namespace cert-manager
Deploy the container storage extension
RESOURCE_GROUP_NAME="YOUR_RESOURCE_GROUP_NAME"
CLUSTER_NAME="YOUR_CLUSTER_NAME"
az k8s-extension create --resource-group "${RESOURCE_GROUP_NAME}" \
--cluster-name "${CLUSTER_NAME}" \
--cluster-type connectedClusters \
--name azure-arc-containerstorage \
--extension-type microsoft.arc.containerstorage
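Before moving on, I'd check that the extensions have actually finished provisioning. The sketch below uses `az k8s-extension list` with a JMESPath query; the helper names are hypothetical, and it assumes a healthy extension reports a `provisioningState` of `Succeeded`.

```shell
# Hypothetical helper: print "<name> <type> <state>" for every extension on the cluster.
# Args: <resource_group> <cluster_name>
list_extension_states() {
  az k8s-extension list \
    --resource-group "$1" \
    --cluster-name "$2" \
    --cluster-type connectedClusters \
    --query "[].[name, extensionType, provisioningState]" \
    --output tsv
}

# Hypothetical helper: read those lines on stdin and succeed only when
# no line reports a state other than "Succeeded" (empty input also passes).
all_succeeded() {
  ! awk '{ print $NF }' | grep -qv '^Succeeded$'
}
```

Usage: `list_extension_states "$RESOURCE_GROUP_NAME" "$CLUSTER_NAME" | all_succeeded || echo "an extension is not ready yet" >&2`.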
Now it's time to deploy the edge storage configuration. As my cluster is deployed on Azure Local AKS and is connected to Azure Arc, I went with the Arc config option detailed in the docs.
cat <<EOF > edgeConfig.yaml
apiVersion: arccontainerstorage.azure.net/v1
kind: EdgeStorageConfiguration
metadata:
  name: edge-storage-configuration
spec:
  defaultDiskStorageClasses:
    - "default"
    - "local-path"
  serviceMesh: "osm"
EOF
kubectl apply -f "edgeConfig.yaml"
Once it's deployed, you can list the storage classes available to the cluster:
kubectl get storageclass
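The PVC later in this post uses the `cloud-backed-sc` storage class, so it's worth confirming it is in that list. A minimal sketch, with a made-up helper name, that relies on `kubectl get ... --output name` printing one `storageclass.storage.k8s.io/<name>` line per class:

```shell
# Hypothetical helper: succeed when a storage class with the given name exists.
has_storage_class() {
  kubectl get storageclass --output name | grep -Fqx "storageclass.storage.k8s.io/$1"
}
```

Usage: `has_storage_class cloud-backed-sc || echo "cloud-backed-sc is missing" >&2`.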
Setting up cloud ingest volumes
Now we're ready to configure permissions on the Azure Storage Account so that the Edge Volume provider has access to upload data to the blob container.
You can use the script below to get the extension identity and then assign the necessary role to the storage account:
RESOURCE_GROUP_NAME="YOUR_RESOURCE_GROUP_NAME"
CLUSTER_NAME="YOUR_CLUSTER_NAME"
export EXTENSION_TYPE=${1:-"microsoft.arc.containerstorage"}
EXTENSION_IDENTITY_PRINCIPAL_ID=$(az k8s-extension list \
--cluster-name ${CLUSTER_NAME} \
--resource-group ${RESOURCE_GROUP_NAME} \
--cluster-type connectedClusters \
| jq --arg extType ${EXTENSION_TYPE} 'map(select(.extensionType == $extType)) | .[] | .identity.principalId' -r)
STORAGE_ACCOUNT_NAME="YOUR_STORAGE_ACCOUNT_NAME"
STORAGE_ACCOUNT_RESOURCE_GROUP="YOUR_STORAGE_ACCOUNT_RESOURCE_GROUP"
STORAGE_ACCOUNT_ID=$(az storage account show --name ${STORAGE_ACCOUNT_NAME} --resource-group ${STORAGE_ACCOUNT_RESOURCE_GROUP} --query id --output tsv)
az role assignment create --assignee ${EXTENSION_IDENTITY_PRINCIPAL_ID} --role "Storage Blob Data Contributor" --scope ${STORAGE_ACCOUNT_ID}
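One thing to watch: if the `jq` filter finds no matching extension, the principal ID ends up empty, and if the extension has no identity it comes back as the string `null`. Either way the role assignment will fail with a fairly unhelpful error. A small guard (the helper name is my own) catches this before the `az role assignment create` call:

```shell
# Hypothetical helper: succeed when the argument has the 8-4-4-4-12 hex shape of a GUID.
is_guid() {
  printf '%s\n' "$1" | grep -Eiq '^[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$'
}
```

Usage: `is_guid "$EXTENSION_IDENTITY_PRINCIPAL_ID" || echo "no extension identity found" >&2`.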
Create a deployment to test the cloud ingest volume
Now we can test transferring data from edge to cloud. I'm using the demo from Azure Arc Jumpstart: Deploy demo from Azure Arc Jumpstart
First off, create a container on the storage account to store the data from the edge volume.
export STORAGE_ACCOUNT_NAME="YOUR_STORAGE_ACCOUNT_NAME"
STORAGE_ACCOUNT_RESOURCE_GROUP="YOUR_STORAGE_ACCOUNT_RESOURCE_GROUP"
az storage container create --name "fault-detection" --account-name ${STORAGE_ACCOUNT_NAME} --resource-group ${STORAGE_ACCOUNT_RESOURCE_GROUP}
Next, create a file called acsa-deployment.yaml using the following content:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  ### Create a name for your PVC ###
  name: acsa-pvc
  ### Use a namespace that matches your intended consuming pod, or "default" ###
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  storageClassName: cloud-backed-sc
---
apiVersion: "arccontainerstorage.azure.net/v1"
kind: EdgeSubvolume
metadata:
  name: faultdata
spec:
  edgevolume: acsa-pvc
  path: faultdata # If you change this path, line 33 in deploymentExample.yaml must be updated. Don't use a preceding slash.
  auth:
    authType: MANAGED_IDENTITY
  storageaccountendpoint: "https://${STORAGE_ACCOUNT_NAME}.blob.core.windows.net/"
  container: fault-detection
  ingestPolicy: edgeingestpolicy-default # Optional: See the following instructions if you want to update the ingestPolicy with your own configuration
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: acsa-webserver
spec:
  replicas: 1
  selector:
    matchLabels:
      app: acsa-webserver
  template:
    metadata:
      labels:
        app: acsa-webserver
    spec:
      containers:
        - name: acsa-webserver
          image: mcr.microsoft.com/jumpstart/scenarios/acsa_ai_webserver:1.0.0
          resources:
            limits:
              cpu: "1"
              memory: "1Gi"
            requests:
              cpu: "200m"
              memory: "256Mi"
          ports:
            - containerPort: 8000
          env:
            - name: RTSP_URL
              value: rtsp://virtual-rtsp:8554/stream
            - name: LOCAL_STORAGE
              value: /app/acsa_storage/faultdata
          volumeMounts:
            ### This name must match the volumes.name attribute below ###
            - name: blob
              ### This mountPath is where the PVC will be attached to the pod's filesystem ###
              mountPath: "/app/acsa_storage"
      volumes:
        ### User-defined 'name' that will be used to link the volumeMounts. This name must match volumeMounts.name as specified above. ###
        - name: blob
          persistentVolumeClaim:
            ### This claimName must refer to the PVC resource 'name' as defined in the PVC config. This name will match what your PVC resource was actually named. ###
            claimName: acsa-pvc
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: virtual-rtsp
spec:
  replicas: 1
  selector:
    matchLabels:
      app: virtual-rtsp
  minReadySeconds: 10
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  template:
    metadata:
      labels:
        app: virtual-rtsp
    spec:
      initContainers:
        - name: init-samples
          image: busybox
          resources:
            limits:
              cpu: "200m"
              memory: "256Mi"
            requests:
              cpu: "100m"
              memory: "128Mi"
          command:
            - wget
            - "-O"
            - "/samples/bolt-detection.mp4"
            - https://github.com/ldabas-msft/jumpstart-resources/raw/main/bolt-detection.mp4
          volumeMounts:
            - name: tmp-samples
              mountPath: /samples
      containers:
        - name: virtual-rtsp
          image: "kerberos/virtual-rtsp"
          resources:
            limits:
              cpu: "500m"
              memory: "512Mi"
            requests:
              cpu: "200m"
              memory: "256Mi"
          imagePullPolicy: Always
          ports:
            - containerPort: 8554
          env:
            - name: SOURCE_URL
              value: "file:///samples/bolt-detection.mp4"
          volumeMounts:
            - name: tmp-samples
              mountPath: /samples
      volumes:
        - name: tmp-samples
          emptyDir: { }
---
apiVersion: v1
kind: Service
metadata:
  name: virtual-rtsp
  labels:
    app: virtual-rtsp
spec:
  type: LoadBalancer
  ports:
    - port: 8554
      targetPort: 8554
      name: rtsp
      protocol: TCP
  selector:
    app: virtual-rtsp
---
apiVersion: v1
kind: Service
metadata:
  name: acsa-webserver-svc
  labels:
    app: acsa-webserver
spec:
  type: LoadBalancer
  ports:
    - port: 80
      targetPort: 8000
      protocol: TCP
  selector:
    app: acsa-webserver
Once created, apply the deployment:
export STORAGE_ACCOUNT_NAME="YOUR_STORAGE_ACCOUNT_NAME" # we need to export the storage account name so envsubst can substitute it
envsubst < acsa-deployment.yaml | kubectl apply -f -
[!NOTE]
This will deploy into the default namespace.
This will create the deployment and the volumes, substituting the values for the storage account name with the variables previously set.
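A gotcha with `envsubst`: if the variable isn't set (or isn't exported), it silently substitutes an empty string and you end up with an endpoint of `https://.blob.core.windows.net/`. Here's a minimal sketch of a wrapper that refuses to render in that case. The function name `render_manifest` is hypothetical, and passing `'${STORAGE_ACCOUNT_NAME}'` to `envsubst` limits substitution to that one variable.

```shell
# Hypothetical helper: render a manifest to stdout, refusing to run if the variable is unset.
# Arg: <path to manifest>
render_manifest() {
  if [ -z "${STORAGE_ACCOUNT_NAME:-}" ]; then
    echo "STORAGE_ACCOUNT_NAME is not set" >&2
    return 1
  fi
  # Only substitute this one variable so any other '$' in the file is left alone.
  # The variable must be exported for envsubst to see it.
  envsubst '${STORAGE_ACCOUNT_NAME}' < "$1"
}
```

Usage: `render_manifest acsa-deployment.yaml | kubectl apply -f -`.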
If you want to check the status of the edge volume, such as if it's connected or how many files are in the queue, you can use the following command:
# List the edge subvolumes
kubectl get edgesubvolume
kubectl describe edgesubvolume faultdata
Testing
Assuming everything has deployed without errors, you should be able to reach the web server on its external IP address. You can find the IP address by running:
kubectl get svc acsa-webserver-svc
Obtain the EXTERNAL-IP and port (should be 80) and use that to access the web server.
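If you'd rather script this, the IP can be pulled out with a JSONPath query. This is a sketch with a made-up helper name; it assumes the load balancer reports an IP address (rather than a hostname) and that the service listens on port 80, as it does here.

```shell
# Hypothetical helper: print the URL of a LoadBalancer service,
# or fail if no external IP has been assigned yet.
# Arg: <service name>
service_url() (
  svc_ip=$(kubectl get svc "$1" --output jsonpath='{.status.loadBalancer.ingress[0].ip}')
  [ -n "$svc_ip" ] && printf 'http://%s/\n' "$svc_ip"
)
```

Usage: `curl -I "$(service_url acsa-webserver-svc)"`.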
Take a look at the edge subvolume for metrics:
kubectl get edgesubvolume
And that’s how simple (?!) it is to set up. As long as you’ve met the pre-reqs and set permissions properly, it’s pretty smooth to implement.