Get Ready for Portworx Data Services

Over a year ago we were working on the final parts of the acquisition of Portworx. I knew Portworx was going to change everything at Pure. I also expected it to take a while. I knew that we were going to see amazing new things built on this Cloud Native Data Platform. In the last year I have witnessed customers do just that with their own stateful workloads. Examples include banks, online gaming, SaaS providers, retail chains and many more.

What I also knew would come someday was that Portworx Data Services would introduce all of us to a way to have stateful workloads as a service. Database as a service, anywhere k8s can run, on any cloud. On Tanzu, AWS, Azure, Google, Red Hat, Rancher and so many more. Your data managed but not locked into a proprietary platform. Managed in a way built for Cloud Native, built for Kubernetes. It is also here way faster than I thought. A big thank you to our Engineering teams for the amazing work to make Portworx Data Services a real thing.

More than just a deployment tool

This is not just “deploy me a container” with a database. This is a managed experience with the day 2 operations built in. You can work on getting results from your data while PDS manages the performance, protection and availability of your solution wherever you want it to be. Not locked to a specific cloud, but anywhere that runs K8s.

But Doesn’t Operator XYZ do that?

Maybe. A little bit. Today’s developers expect to choose the tools they need to deliver their application, not to be forced onto a single platform. This results in Database Administrators and DevOps teams supporting many different data services, all with their own nuances. Some places have 10-15 different databases or data services (some of them are not really databases). Imagine having to support the deployment and ongoing management of every one of those, in most cases with no extra time or resources. Normally you don’t get a new headcount every time a developer wants to use a new kind of data service.

Portworx Data Services lets you learn one API and one interface, and you get one vendor to support and manage the little things you don’t have time for, like performance, high availability and disaster recovery best practices. That includes making the data available in other sites or clouds for analytics or other use cases, and even building that data into Dev-Test-QA workflows.

The early access program is starting now and we want your input, so if you have use cases for multiple Cloud Native stateful workloads please find me at @jon_2vcps on Twitter.

Pure Storage Unveils Portworx Data Services

Portworx 2.8 with FlashArray and FlashBlade Getting Started

Getting Started with Portworx 2.8 and the FlashArray and FlashBlade

Last week Portworx 2.8 went GA, and with it came new support for Tanzu TKGs and TKGm (we have supported PKS/TKGi for a long time), as well as support for Cloud Drives for FlashArray and Direct Access for FlashBlade. It also simplified the installation of Portworx with Tanzu compared with this earlier version.

FlashArray Volumes for Portworx

Portworx will automate creating a storage pool from volumes provisioned from the FlashArray. This is done for you during the install. You may specify the size of the volumes in the spec generator at https://central.portworx.com

Process to install

NOTE as of 8/4/21: This feature is in Tech Preview (contact me if you want to run in Production)

  1. Create the px-pure-secret from the pure.json file
  2. Generate your Portworx Cluster spec from https://central.portworx.com
  3. Install the PX-Operator (command at the end of the spec generator).
  4. Install the Portworx Storage Cluster

  1. https://docs.portworx.com/reference/pure-json-reference/
    Also look at the pure.json reference in order to get the API key and token in a secret for you to use with Portworx. The installation of Portworx detects this Kubernetes secret and uses that information to provision drives from the array.
    Check out the YouTube demo: FlashArray and FlashBlade from Portworx

Sample pure.json

{
    "FlashArrays": [
        {
            "MgmtEndPoint": "<first-fa-management-endpoint>",
            "APIToken": "<first-fa-api-token>"
        },
        {
            "MgmtEndPoint": "<second-fa-management-endpoint>",
            "APIToken": "<second-fa-api-token>"
        }
    ],
    "FlashBlades": [
        {
            "MgmtEndPoint": "<fb-management-endpoint>",
            "APIToken": "<fb-api-token>",
            "NFSEndPoint": "<fb-nfs-endpoint>"
        },
        {
            "MgmtEndPoint": "<fb-management-endpoint>",
            "APIToken": "<fb-api-token>",
            "NFSEndPoint": "<fb-nfs-endpoint>"
        }
    ]
}
kubectl create secret generic px-pure-secret --namespace kube-system --from-file=pure.json

Remember the secret must be called px-pure-secret and be in the namespace where you install Portworx.
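A malformed pure.json is easy to create by hand, because strict JSON does not allow a trailing comma after the last key in an object. The sketch below is one way to check the file before creating the secret. It is not a Portworx tool, and the list of required keys is an assumption taken from the sample above, so check the pure.json reference for the authoritative schema. The endpoint and token values are made up.

```python
import json

# Assumption: required keys are copied from the sample pure.json in this post,
# not from an official schema.
REQUIRED_KEYS = {
    "FlashArrays": ("MgmtEndPoint", "APIToken"),
    "FlashBlades": ("MgmtEndPoint", "APIToken", "NFSEndPoint"),
}


def validate_pure_json(text):
    """Return a list of problems found in a pure.json document (empty if none)."""
    try:
        doc = json.loads(text)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err}"]
    if not isinstance(doc, dict):
        return ["top level must be a JSON object"]
    problems = []
    for section, keys in REQUIRED_KEYS.items():
        for index, entry in enumerate(doc.get(section, [])):
            if not isinstance(entry, dict):
                problems.append(f"{section}[{index}] must be an object")
                continue
            for key in keys:
                if not entry.get(key):
                    problems.append(f"{section}[{index}] is missing {key}")
    return problems


# Made-up values, for illustration only.
sample = '{"FlashArrays": [{"MgmtEndPoint": "10.0.0.1", "APIToken": "example-token"}]}'
print(validate_pure_json(sample) or "pure.json looks fine")
```

To check a real file, read its contents and pass the text to validate_pure_json before running the kubectl create secret command.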

Select Pure FlashArray

2. Generate the spec for the Portworx Cluster.

3. Install the PX-Operator – I suggest using what you get from the Spec generator, either applied straight from the online URL or downloaded to a local file.

kubectl apply -f pxoperator.yaml

4. Install the Storage Cluster

kubectl apply -f px-spec.yaml

For FlashBlade!

Also included in the pure.json is the API Token and IP information for my FlashBlade. Since the FlashBlade runs NFS, the K8s node mounts it directly. We call this Direct Access, and it allows you to leverage your FlashBlade for data that may exist outside of the PX-Cluster. Watch the demo to see it in action. Create a StorageClass for FlashBlade and a PVC using that class. Portworx automates the rest.
More info:
https://docs.portworx.com/portworx-install-with-kubernetes/storage-operations/create-pvcs/pure-flashblade/

Links

https://docs.portworx.com/cloud-references/auto-disk-provisioning/pure-flash-array/
https://docs.portworx.com/reference/pure-json-reference/
https://docs.portworx.com/portworx-install-with-kubernetes/storage-operations/create-pvcs/pure-flashblade/

Setting up Portworx on a Tanzu Kubernetes Grid aka TKG Cluster

First, this process works today on clusters made with the TKG tool that does not use the embedded management cluster. For clarity I call those clusters TKC or TKC Guest Clusters. They run as VMs. You just can’t add block devices outside of the Cloud Native Storage (VMware’s CSI Driver). At least I couldn’t.

Now TKG deploys using a Photon 3.0 template. When I wrote this blog and recorded the demo, the latest version was TKG 1.2.1 and the k8s template was 1.19.3-vmware.

Check the release notes here: https://docs.portworx.com/reference/release-notes/portworx/#improvements-4

First generate base64 encoded versions of your user and password to vCenter.

# Update the following items in the Secret template below to match your environment:

VSPHERE_USER: Use output of printf <vcenter-server-user> | base64
VSPHERE_PASSWORD: Use output of printf <vcenter-server-password> | base64
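If you would rather not put credentials on a shell command line, the same encoding can be sketched in Python. This is only an illustration of what printf piped to base64 produces. The function names are mine, and the user name is an example value to replace with your own.

```python
import base64


def encode_secret_value(value):
    """Base64-encode a string with no trailing newline, like printf piped to base64."""
    return base64.b64encode(value.encode("utf-8")).decode("ascii")


def decode_secret_value(encoded):
    """Reverse of encode_secret_value, handy for checking what a Secret holds."""
    return base64.b64decode(encoded).decode("utf-8")


# Example user name only; substitute your own vCenter account.
print(encode_secret_value("administrator@vsphere.local"))
# prints YWRtaW5pc3RyYXRvckB2c3BoZXJlLmxvY2Fs
```

Note that echo without -n appends a newline, which changes the encoded value. That is why printf is used in the commands above.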

Save the following as vsphere-secret.yaml, using your own base64 encoded vCenter user and password (from above).


apiVersion: v1
kind: Secret
metadata:
  name: px-vsphere-secret
  namespace: kube-system
type: Opaque
data:
  VSPHERE_USER: YWRtaW5pc3RyYXRvckB2c3BoZXJlLmxvY2Fs
  VSPHERE_PASSWORD: cHgxLjMuMEZUVw==


Run kubectl apply on the spec above after you update the template with your user and password.

Follow these steps:

# create a new TKG cluster
tkg create cluster tkg-portworx-cluster -p dev -w 3 --vsphere-controlplane-endpoint-ip 10.21.x.x 

# Get the credentials for your config
tkg get credentials tkg-portworx-cluster

# Apply the secret and the operator for Portworx
kubectl apply -f vsphere-secret.yaml
kubectl apply -f 'https://install.portworx.com/2.6?comp=pxoperator'

#generate your spec first, you get this from generating a spec at https://central.portworx.com
kubectl apply -f tkg-px.yaml 

# Wait till it all comes up.
watch kubectl get pod -n kube-system

# Check pxctl status
PX_POD=$(kubectl get pods -l name=portworx -n kube-system -o jsonpath='{.items[0].metadata.name}')
kubectl exec $PX_POD -n kube-system -- /opt/pwx/bin/pxctl status

You can now create your own or use the premade storageClass

kubectl get sc
NAME                             PROVISIONER                     RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
default (default)                csi.vsphere.vmware.com          Delete          Immediate           false                  7h50m
px-db                            kubernetes.io/portworx-volume   Delete          Immediate           false                  7h44m
px-db-cloud-snapshot             kubernetes.io/portworx-volume   Delete          Immediate           false                  7h44m
px-db-cloud-snapshot-encrypted   kubernetes.io/portworx-volume   Delete          Immediate           false                  7h44m
px-db-encrypted                  kubernetes.io/portworx-volume   Delete          Immediate           false                  7h44m
px-db-local-snapshot             kubernetes.io/portworx-volume   Delete          Immediate           false                  7h44m
px-db-local-snapshot-encrypted   kubernetes.io/portworx-volume   Delete          Immediate           false                  7h44m
px-replicated                    kubernetes.io/portworx-volume   Delete          Immediate           false                  7h44m
px-replicated-encrypted          kubernetes.io/portworx-volume   Delete          Immediate           false                  7h44m
stork-snapshot-sc                stork-snapshot                  Delete          Immediate           false                  7h44m

Now Deploy Kube-Quake

The example.yaml is from my fork of the kube-quake repo on github where I redirected the data to be on a persistent volume.

kubectl apply -f https://raw.githubusercontent.com/2vcps/quake-kube/master/example.yaml
deployment.apps/quakejs created
service/quakejs created
configmap/quake3-server-config created
persistentvolumeclaim/quake3-content created

k get pod
NAME                      READY   STATUS              RESTARTS   AGE
quakejs-668cd866d-6b5sd   0/2     ContainerCreating   0          7s
k get pvc
NAME             STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
quake3-content   Bound    pvc-6c27c329-7562-44ce-8361-08222f9c7dc1   10Gi       RWO            px-db          2m

k get pod
NAME                      READY   STATUS    RESTARTS   AGE
quakejs-668cd866d-6b5sd   2/2     Running   0          2m27s

k get svc
NAME         TYPE           CLUSTER-IP     EXTERNAL-IP   PORT(S)                                         AGE
kubernetes   ClusterIP      100.64.0.1     <none>        443/TCP                                         20h
quakejs      LoadBalancer   100.68.210.0   <pending>     8080:32527/TCP,27960:31138/TCP,9090:30313/TCP   2m47s

Now point your browser to: http://<some node ip>:32527
Or, if you have the LoadBalancer up and running, go to http://<Loadbalancer IP>:8080

Finishing some setup for my Unifi Dream Machine Pro

I wanted a better home router. During the learning from home phase of the 2020 pandemic I learned I could not have advanced security features of the USG (Unifi Security Gateway) turned on and get sufficient bandwidth for 3 Kids and myself to stream and zoom. So I wanted an upgrade. I went with the Unifi Dream Machine Pro.
https://store.ui.com/collections/unifi-network-routing-switching/products/udm-pro
For reals though, file this post under: I need to remember what I changed in case I have to do it again.

OpenDNS

The first thing that I did on my older routers was to configure OpenDNS as the external DNS for my networks. In order for OpenDNS to apply my content filtering settings it must know the source IP for my home. This can change because most ISPs use DHCP to assign the IPs. Although it seems that my ISP likes to reassign the same IP, I can’t trust that will always be true.

So first, make sure you sign up for an opendns and dns-o-matic account.

Log into the UDM UI

Click on the Settings Gear…

Click on Advanced Features -> Advanced Gateway Settings

Click Create new Dynamic DNS

For DNS-o-matic the settings look like:
Hostname: all.dnsomatic.com
Username: [Your DNS-o-matic user]
Password: [Your DNS-o-matic password]
Server: updates.dnsomatic.com/\/nic/update?hostname=%h&myip=%i

Links Below were very helpful

https://community.ui.com/questions/OpenDNS-not-working-with-UDM-Pro/c9d5589b-c14e-4c86-8470-4c228b0b5282

Very helpful link for getting the server URL. Also contains a few for some other services.
https://community.ui.com/questions/UDM-DynDNS-Google-Domains/fe9ba35d-66c3-437d-8323-debe2af55879#answer/2181146e-79b8-485c-8042-eb975c291242

https://community.ui.com/questions/Any-way-to-get-DNS-O-Matic-to-work-with-UDM-Pro-to-enable-OpenDNS-Home-with-dynamic-IP/ede30618-663c-43e0-b198-0f2cf2805e1d

DDClient

Another thing I want to do is set a DNS A record. I could probably use some form of the settings above to inform my Google Name Service to update the record with the dynamic IP. But why be boring? Let’s run the DDClient perl program in a container on my K3s cluster.

First, read the Google Domains documentation for dynamic records. I created a dynamic record, and it generates the host record along with a username and password that can be used via the API to update the IP associated with the domain name.

Next, why create the container if I don’t need to?

https://hub.docker.com/r/linuxserver/ddclient/tags?page=1&ordering=last_updated
My k3s is on some Raspberry Pis, so I chose the arm image.

Then another nice person built the deployment. Check out that blog for full detail. Without getting too distracted by kubesail and setting up k8s, I skipped to the YAML:
https://kubesail.com/template/loopDelicious/ddclient

Save this as ddclient-secret.yaml changing the info necessary for your google account.

apiVersion: v1
kind: Secret
metadata:
  name: ddclient-secret
  labels:
    app: ddclient
stringData:
  ddclient.conf: |
    daemon=300
    syslog=yes
    protocol=dyndns2
    use=web
    server=domains.google.com
    ssl=yes
    login=<google generated login>
    password=<google generated password> 
    your.domain.record.com

Now save this as ddclient.yaml, remember to modify the image for the type of arch your Kubernetes is running on.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: ddclient-deployment
  labels:
    app: ddclient
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
  replicas: 1
  selector:
    matchLabels:
      app: ddclient
  template:
    metadata:
      labels:
        app: ddclient
    spec:
      volumes:
        - name: ddclient-config-file
          secret:
            secretName: ddclient-secret
      containers:
        - name: ddclient
          image: linuxserver/ddclient:arm64v8-version-v3.9.1
          imagePullPolicy: Always
          volumeMounts:
            - mountPath: /config
              name: ddclient-config-file
          resources:
            requests:
              cpu: 10m
              memory: 64Mi
            limits:
              cpu: 50m
              memory: 128Mi

This deployment will use the secret for the settings and deploy the small container to update the Google Domain record with the new IP from the host.

kubectl create ns ddclient
kubectl -n ddclient apply -f ddclient-secret.yaml
kubectl -n ddclient apply -f ddclient.yaml

Some DNS stuff I might try later

This is an interesting repo that updates DNS with static hostnames. Unfortunately the UDM does not have this built in. I would suggest that Ubiquiti build pi-hole into the UDM Pro to integrate with its DHCP server and also provide some abilities to block bad DNS names for ads/phishing/malware.

A deeper view into OpenStack Cinder using Pure Storage FlashArrays

OpenStack administrators have to deal with a lot, including, potentially, many different storage backends in Cinder. Pure Storage now makes it easier for them to see what is going on with their Pure FlashArray backends.

With so many different storage backends available to OpenStack Cinder, administrators who want to understand how their Cinder backends are being utilized have, historically, had to log on to every backend, and therefore needed to be conversant with all the vendor-specific storage frontends in their environment. The OpenStack Horizon GUI is complex enough without having to learn other GUIs.

Additionally, OpenStack tenants who are interested in their storage utilization and performance have no way of getting this information without raising internal tickets for their storage support teams – and we all know how long those can take to get answered…

Well, Pure Storage has tried to alleviate these problems by providing an OpenStack plugin for Horizon.

From an OpenStack administrator’s perspective, the plugin gives a high-level view of the utilization levels of Pure Storage FlashArrays configured as Cinder backends, and for tenants it provides real-time volume utilization and performance information.

So what do you get with the plugin?

For the Administrator, there is a new Horizon panel in the Admin / System section called Pure Storage.

In this new panel you get a simple view of your FlashArray backends in the well-known Horizon format. Interesting information such as overall data reduction rates (with and without thin-provisioning included) is given as well as utilization levels against array limits – useful to see for both OpenStack dedicated arrays and those that have multiple workloads.

If you select the actual array name in the table, a new browser tab will open at the actual FlashArray GUI if you want to log in directly. However, if you select the Cinder Name in the table, you get a detailed view of the array in Horizon providing more capacity and performance information.

The Overview pie charts in this detailed view show the array specific limits for this array, so will be different depending on the Purity version of the FlashArray.

If you aren’t an Administrator and just a regular Tenant in OpenStack, you won’t see these options available to you, but you will be able to get more detail on any volumes you are using that are backed by Pure Storage FlashArrays.

By selecting a Pure backed volume in your Volumes page you will get enhanced, detailed information around the utilization, data reduction and performance of your volume. This data is current, so a refresh of the page will update these statistics.

Hopefully, OpenStack Admins and Users will find this new Horizon plugin useful.

To get more details on installing and configuring check out this GitHub repo.

Ephemeral or Persistent? The Storage Choices for Containers (Part 3)

In this, the final part of a 3-part series, I cover the latest developments in ephemeral storage. Part 1 covered traditional ephemeral storage and Part 2 covered persistent storage.

CSI Ephemeral Storage

With the release of Kubernetes 1.15 came the ability for CSI drivers that support this feature to create ephemeral storage for pods using storage provisioned from external storage platforms. Within 1.15 a feature gate needed to be enabled to allow this functionality, but with 1.16 and the beta release of this feature, the feature gate defaulted to true.

Conceptually, CSI ephemeral volumes are the same as the emptyDir volumes discussed in Part 1, in that the storage is managed locally on each node and is created together with other local resources after a Pod has been scheduled onto a node. Volume creation has to be unlikely to fail, otherwise the pod gets stuck at startup.

These types of ephemeral volumes are currently not covered by the storage resource usage limits of a Pod, because that is something that kubelet can only enforce for storage that it manages itself and not something provisioned by a CSI provisioner. Additionally, they do not support any of the advanced features that the CSI driver might provide for persistent volumes, such as snapshots or clones.

To identify if an installed CSI driver supports ephemeral volumes, just run the following command and check the supported modes:

# kubectl get csidriver
NAME       ATTACHREQUIRED   PODINFOONMOUNT   MODES                  AGE
pure-csi   true             true             Persistent,Ephemeral   28h

With the release of Pure Service Orchestrator v6.0.4, CSI ephemeral volumes are now supported by both FlashBlade and FlashArray storage. 

The following example shows how to create an ephemeral volume that would be included in a pod specification:

 volumes:
  - name: pure-vol
    csi:
      driver: pure-csi
      fsType: xfs
      volumeAttributes:
        backend: block
        size: "2Gi"

This volume is to be 2GiB in size, formatted as xfs and be provided from a FlashArray managed by Pure Service Orchestrator.
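To show where that fragment sits in a complete pod, here is a minimal sketch that builds a full Pod manifest around it and prints it as JSON, which kubectl accepts just like YAML. The pod name, the busybox image, the sleep command and the /data mount path are all assumptions for illustration. Only the csi section comes from the example above.

```python
import json


def ephemeral_pod_manifest(name, size="2Gi"):
    """Build a Pod manifest that mounts one CSI ephemeral volume at /data."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {
                    "name": "app",
                    "image": "busybox",  # hypothetical image, any image works
                    "command": ["sleep", "3600"],
                    "volumeMounts": [{"name": "pure-vol", "mountPath": "/data"}],
                }
            ],
            "volumes": [
                {
                    "name": "pure-vol",
                    # Same attributes as the fragment above.
                    "csi": {
                        "driver": "pure-csi",
                        "fsType": "xfs",
                        "volumeAttributes": {"backend": "block", "size": size},
                    },
                }
            ],
        },
    }


print(json.dumps(ephemeral_pod_manifest("ephemeral-demo"), indent=2))
```

Saving the output to a file and running kubectl apply -f on it should create the pod. The volume is created with the pod and removed when the pod is deleted.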

Even though these CSI ephemeral volumes are created as real volumes on storage platforms, they are not visible to Kubernetes other than in the description of the pod using them. They are not persistent volumes, and they have no associated Kubernetes objects or claims, so they are not visible through the kubectl get pv or kubectl get pvc commands.

When implemented by Pure Service Orchestrator, the name of the actual volume created on either a FlashArray or FlashBlade does not match the PSO naming convention for persistent volumes.

A persistent volume has the naming convention of:

<clusterID>-pvc-<persistent volume uid>

Whereas a CSI ephemeral volume’s naming convention is:

<clusterID>-<namespace>-<podname>-<unique numeric identifier>
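When looking at a list of volume names on the array, the two conventions can be told apart with a couple of regular expressions. This is only a sketch, not part of PSO. It assumes the persistent volume uid is a standard Kubernetes UUID and that the ephemeral suffix is all digits. The cluster ID, namespace and pod name may contain hyphens themselves, so it does not try to split them apart. The sample names are made up.

```python
import re

# Assumption: <persistent volume uid> is a lowercase UUID.
PERSISTENT = re.compile(
    r"^.+-pvc-[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
)
# Assumption: <unique numeric identifier> is a run of digits at the end.
EPHEMERAL = re.compile(r"^.+-.+-.+-\d+$")


def classify_volume(name):
    """Return 'persistent', 'ephemeral' or 'unknown' for an array volume name."""
    if PERSISTENT.match(name):
        return "persistent"
    if EPHEMERAL.match(name):
        return "ephemeral"
    return "unknown"


# Made-up names following the two conventions above.
for volume in ("k8s-pvc-6c27c329-7562-44ce-8361-08222f9c7dc1", "k8s-default-quakejs-12345"):
    print(volume, "->", classify_volume(volume))
```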

Generic Ephemeral Storage

For completeness, I thought I would add the next iteration of ephemeral storage that will become available.

With Kubernetes 1.19 the alpha release of Generic Ephemeral Volumes was made available, but you do need to enable a feature gate for this feature to be available.

This next generation of ephemeral volumes will again be similar to emptyDir volumes, but with more flexibility.

It is expected that the typical operations on volumes that are implemented by the driver will be supported, including snapshotting, cloning, resizing, and storage capacity tracking.

Conclusion

I hope this series of posts has been useful and informative.

Storage for Kubernetes has been through many changes over the last few years and this process shows no sign of stopping. More features and functionality are already being discussed in the Storage SIGs and I am excited to see what the future brings to both ephemeral and persistent storage for the containerized world.

Demo! Zero Data loss App Recovery in Kubernetes aka Disaster Recovery

For the demo I have 2 Kubernetes clusters with a single stretched Portworx cluster in AWS. This allows Metro DR to mirror the data between the 2 clusters so if there is a complete loss of Cluster 1 the application can be restarted with no loss of data.

You can have active workloads on both clusters. Just FYI.

Lots of new things to learn over the last month. I wanted to present everyone with my first demo with #portworxbypure. The official documentation is here. Always read the docs on how to set it up.

The ELB in Amazon can be set to provide little interaction when getting your app back up and working; for this demo I tell the deployment to fail over. Sort of the big red button for failover. Like all the things Cloud Native this can be automated.

Please check out this demo on YouTube and let me know what you think.

There are of course many options when it comes to how your app will work, and this is for a basic web frontend and database. Scale out databases can be treated differently. It all depends on how your application is architected and what the DR requirements will be.

Ephemeral or Persistent? The Storage Choices for Containers (Part 2)

In this, the second part of a 3-part series, I cover persistent storage. Part 1 covered traditional ephemeral storage.

Persistent Storage

Persistent storage, as the name implies, is storage that can maintain the state of the data it holds over the failure and restart of an application, regardless of the worker node on which the application is running. It is also possible with persistent storage to keep the data used or created by an application after the application has been deleted. This is useful if you need to reutilize the data in another application, or to enable the application to restart in the future and still have the latest dataset available. You can also leverage persistent storage to allow for disaster recovery or business continuity copies of the dataset.

StorageClass

A construct in Kubernetes that has to be understood for storage is the StorageClass. A StorageClass provides a way for administrators to describe the “classes” of storage they offer. Different classes might map to quality-of-service levels, or different access rules, or any arbitrary policies determined by the cluster administrators.

Each CSI storage driver will have a unique provisioner that is assigned as an attribute to a storage class. It instructs any persistent volumes associated with that storage class to use the named provisioner, or CSI driver, when provisioning the underlying volume on the storage platform.

Provisioning

Obtaining persistent storage for a pod is a three-step process:

  1. Define a PersistentVolume (PV), which is the disk space available for use
  2. Define a PersistentVolumeClaim (PVC), which claims usage of part or all of the PersistentVolume disk space
  3. Create a pod that references the PersistentVolumeClaim

In modern-day CSI drivers, the first two steps are usually combined into a single task and this is referred to as dynamic provisioning. Here the PersistentVolumeClaim is 100% of the PersistentVolume and the volume will be formatted with a filesystem on first attachment to a pod.
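As a sketch of what dynamic provisioning looks like from the application side, the snippet below builds a PersistentVolumeClaim and a pod that references it, wrapped in a v1 List so both can be applied in one go. No PersistentVolume is defined, because the provisioner named by the StorageClass creates it. The claim name, the StorageClass name "example", the busybox image and the /data mount path are hypothetical values chosen for illustration.

```python
import json


def claim_and_pod(claim_name, storage_class, size="10Gi"):
    """Return a v1 List holding a PVC and a Pod that mounts it at /data."""
    claim = {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": claim_name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }
    pod = {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": claim_name + "-pod"},
        "spec": {
            "containers": [
                {
                    "name": "app",
                    "image": "busybox",  # hypothetical image
                    "command": ["sleep", "3600"],
                    "volumeMounts": [{"name": "data", "mountPath": "/data"}],
                }
            ],
            # The pod only names the claim; it never refers to the PV directly.
            "volumes": [
                {"name": "data", "persistentVolumeClaim": {"claimName": claim_name}}
            ],
        },
    }
    return {"apiVersion": "v1", "kind": "List", "items": [claim, pod]}


print(json.dumps(claim_and_pod("demo-claim", "example"), indent=2))
```

The printed JSON can be saved to a file and passed to kubectl apply -f. After that, kubectl get pv should show a volume that was created and bound automatically.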

Manual provisioning can also be used with some CSI drivers to import existing volumes on storage devices into the control of Kubernetes by converting the existing volume into a PersistentVolume. In this case, the existing filesystem on the original volume is kept with all existing data when first mounted to the pod. An extension of this is the ability to import a snapshot of an existing volume, thereby creating a full read-write clone of the source volume the snapshot derived from.

When a PV is created it is assigned a storageClassName attribute and this class name controls many attributes of the PV as mentioned earlier. Note that the storageClassName attribute restricts the use of this volume to only the PVCs that request the equivalent StorageClass. In the case of dynamic provisioning, this is all managed automatically: the application only needs to name the StorageClass the PVC wants storage from, and the volume is created and then bound to a claim.

When the application is complete or is deleted, depending on the way the PV was initially created, the underlying volume construct can either be deleted or retained for use by another application, or a restart of the original application. This is controlled by the reclaimPolicy in the storageClass definition. In dynamic provisioning the normal setting for this is delete, meaning that when the PVC is deleted the associated PV is deleted and the underlying storage volume is also deleted. 

Setting the reclaimPolicy to retain allows for manual reclamation of the PV.

On deletion of the PVC, the associated PV is not deleted and can be reused by another PVC with the same name as the original PVC. This is the only PVC that can access the PV and this concept is used a lot with StatefulSets.

It should be noted that when a PV is retained, a subsequent deletion of the PV will result in the underlying storage volume NOT being deleted, so it is essential to have a simple way to ensure orphaned volumes do not adversely affect your underlying storage platform’s capacity.

At this point, I’d like to mention Pure Service Orchestrator eXplorer, which is an Open Source project to provide a single pane of glass for storage and Kubernetes administrators to visualize how Pure Service Orchestrator, the CSI driver provided by Pure Storage, is utilising storage. One of the features of PSOX is its ability to identify orphaned volumes from a Kubernetes cluster.
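The idea behind spotting orphaned volumes can be sketched as a simple set difference. This is not how PSOX is implemented. It assumes you already have two lists: the volume names on the array and the backing volume names of the PVs the cluster still knows about. It also assumes volumes belonging to the cluster share a known name prefix. All names below are made up.

```python
def orphaned_volumes(array_volume_names, pv_volume_names, cluster_prefix):
    """Return array volumes with this cluster's prefix that no PV refers to."""
    known = set(pv_volume_names)
    return sorted(
        name
        for name in array_volume_names
        if name.startswith(cluster_prefix) and name not in known
    )


# Hypothetical inventory: one volume still has a PV, one was left behind
# after a retained PV was deleted, and one belongs to something else.
on_array = ["k8s-pvc-aaa", "k8s-pvc-bbb", "vmware-datastore-1"]
in_cluster = ["k8s-pvc-aaa"]
print(orphaned_volumes(on_array, in_cluster, "k8s-"))
```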

Persistent Volume Granularity

There are a lot of options available when it comes to how the pod can access the persistent storage volume and these are controlled by Kubernetes. These different options are normally defined with a storageClass.

The most common of these is the accessMode which controls how the data in the PV can be accessed and modified. There are three modes available in Kubernetes:

  • ReadWriteMany (RWX) – the volume can be mounted as read-write by many nodes
  • ReadWriteOnce (RWO) – the volume can be mounted as read-write by a single node
  • ReadOnlyMany (ROX) – the volume can be mounted read-only by many nodes

Additional controls for the underlying storage volume that can be provided through the storageClass include mount options, volume expansion, and binding mode, which is usually used in conjunction with storage topology (also managed through the storageClass).

A storageClass can also apply specific, non-standard, granularity for different features a CSI driver can support.

In the case of Pure Service Orchestrator, all of the above-mentioned options are available to an administrator creating storage classes, plus a number of the non-standard features.

Here is an example of a storageClass definition configured to use Pure Service Orchestrator as the CSI provisioner:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: example
provisioner: pure-csi
parameters:
  iops_limit: "30000"
  bandwidth_limit: "10G"
  backend: block
  csi.storage.k8s.io/fstype: xfs
  createoptions: -q
mountOptions:
  - discard
allowedTopologies:
  - matchLabelExpressions:
      - key: topology.purestorage.com/rack
        values:
          - rack-0
          - rack-1
allowVolumeExpansion: true

This might look a little complex, but simplistically this example ensures that PersistentVolumes created through this storageClass will have the following attributes:

  • Quality of Service limits of 10Gb/s bandwidth and 30k IOPs
  • Volumes are capable of being expanded in size
  • On first use by a pod the volume will be formatted with the xfs filesystem and mounted with the discard flag
  • The volume will only be created on an underlying FlashArray found in either rack-0 or rack-1 (based on labels defined in the PSO configuration file)

Pure Service Orchestrator even allows the parameters setting to control the NFS ExportRules of PersistentVolumes created on a FlashBlade.

Check back for Part 3 of this series, where I’ll discuss the latest developments in ephemeral storage in Kubernetes.

Ephemeral or Persistent? The Storage Choices for Containers (Part 1)

In this series of posts, I’ll cover the difference between ephemeral and persistent storage as far as Kubernetes containers are concerned and discuss the latest developments in ephemeral storage. I’ll also occasionally mention Pure Service Orchestrator™ to show how this can provide storage to your applications no matter what type is required.

Back in the mists of time, when Kubernetes and containers in general were young, storage was only ephemeral. There was no concept of persistency for your storage, and the applications running in container environments were inherently ephemeral themselves, so there was no need for data persistency.

Initially, with the development of FlexDriver plugins and lately CSI compliant drivers, persistent storage has become a mainstream offering to enable applications that need or require state for their data. Persistent storage will be covered in the second blog in this series.

Ephemeral Storage

Ephemeral storage can come from several different locations, the most popular and simplest being emptyDir. This is, as the name implies, an empty directory created for a pod that can be mounted by one or more containers in that pod. When the pod is removed from its node, whether that be cleanly or through a failure event, the emptyDir storage is erased and all its contents are lost forever.

emptyDir

You might wonder where this “storage” used by emptyDir comes from, and that is a great question. It can come from one of two places. The most common is the physical storage available to the Kubernetes node running the pod, usually from the root partition. This space is finite and completely dependent on the available free capacity of the disk partition the directory is present on. This partition is also used for lots of other dynamic data, such as container logs, image layers, and container-writable layers, so it is potentially an ever-decreasing resource.

To create this type of ephemeral storage for the containers running in a pod, ensure the pod specification has the following section:

 volumes:
  - name: demo-volume
    emptyDir: {}

Note that the {} states that we are not providing any further requirements for the ephemeral volume. The name parameter is required so that containers can mount the emptyDir volume, like this:

   volumeMounts:
    - mountPath: /demo
      name: demo-volume

If multiple containers are running in the pod, they can all access the same emptyDir if they mount the same volume name.
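As a minimal sketch of that sharing, the snippet below writes a complete pod manifest in which two containers mount the same emptyDir. The pod name, container names, busybox image, commands, and file name are assumptions made for illustration; only the volumes and volumeMounts sections come from the fragments above.

```shell
# Write a hypothetical two-container pod that shares one emptyDir volume.
# Pod name, container names, image, and commands are illustrative assumptions.
cat > shared-emptydir-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: demo-pod
spec:
  containers:
  - name: writer
    image: busybox
    command: ["sh", "-c", "date > /demo/started; sleep 3600"]
    volumeMounts:
    - mountPath: /demo
      name: demo-volume
  - name: reader
    image: busybox
    command: ["sh", "-c", "sleep 3600"]
    volumeMounts:
    - mountPath: /demo
      name: demo-volume
  volumes:
  - name: demo-volume
    emptyDir: {}
EOF

# To try it on a cluster:
#   kubectl apply -f shared-emptydir-pod.yaml
#   kubectl exec demo-pod -c reader -- cat /demo/started
```

The file written by the first container should be readable from the second, because both mounts refer to the same volume name.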

From the container's perspective, the emptyDir is a real filesystem mapped to the root partition, which is already partly utilised, so you will see it in a df command executed in the container, as follows (this example has the pod running on a Red Hat CoreOS worker node):

# df -h /demo
Filesystem                Size      Used Available Use% Mounted on
/dev/mapper/coreos-luks-root-nocrypt
                        119.5G     28.3G     91.2G  24% /demo

If you want to limit the size of your ephemeral storage this can be achieved by adding resource limits to the container in the pod as follows:

    resources:
      requests:
        ephemeral-storage: "2Gi"
      limits:
        ephemeral-storage: "4Gi"

Here the container has requested 2GiB of local ephemeral storage, but the container has a limit of 4GiB of local ephemeral storage.

Note that if you use this method and you exceed the ephemeral-storage limits value, the Kubernetes eviction manager will evict the pod, so this is a very aggressive space limit enforcement method.

emptyDir from RAM

There might be instances where you only need a minimal scratch space area for your emptyDir and you don’t want to use any of the root partition. In this case, resources permitting, you can create this in RAM. The only difference in the creation of the emptyDir is that more information is passed during its creation in the pod specification as follows:

 volumes:
  - name: demo-volume
    emptyDir:
      medium: Memory

In this case, the emptyDir is mounted on tmpfs and its default size is half of the RAM of the node the pod is running on. For example, here the worker node has just under 32GB of RAM and therefore the emptyDir is 15.7GB, about half:

# df -h /demo
Filesystem                Size      Used Available Use% Mounted on
tmpfs                    15.7G         0     15.7G   0% /demo
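The arithmetic behind that figure can be sketched in a few lines of shell. This assumes the tmpfs default of half of physical RAM described above; the function name and the sample MemTotal value are made up for illustration.

```shell
# Hypothetical helper: given a node's MemTotal in KiB (the unit used in
# /proc/meminfo), print the default tmpfs size in KiB, i.e. half of RAM.
default_tmpfs_kib() {
  echo $(( $1 / 2 ))
}

# Sample value (an assumption) for a node with just under 32GB of RAM.
default_tmpfs_kib 32900000

# On a Linux node you could feed in the real figure instead:
#   default_tmpfs_kib "$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)"
```

Half of the sample value comes to roughly 15.7GiB, in line with the df output shown for the worker node.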

You can use the concept of sizeLimit for the RAM-based emptyDir, but this does not work as you would expect (at the time of writing). In this case, the sizeLimit is used by the Kubernetes eviction manager to evict any pods that exceed the sizeLimit specified in the emptyDir.
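For reference, here is a sketch of where sizeLimit sits in a pod specification. The pod name, image, file name, and the 1Gi figure are assumptions for illustration; the behaviour when the limit is exceeded is as described above.

```shell
# Write a hypothetical pod whose RAM-backed emptyDir declares a sizeLimit.
# Pod name, image, and the 1Gi figure are illustrative assumptions.
cat > ram-emptydir-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: demo-ram-pod
spec:
  containers:
  - name: app
    image: busybox
    command: ["sh", "-c", "sleep 3600"]
    volumeMounts:
    - mountPath: /demo
      name: demo-volume
  volumes:
  - name: demo-volume
    emptyDir:
      medium: Memory
      sizeLimit: 1Gi
EOF

# To try it on a cluster and compare the df output with the sizeLimit:
#   kubectl apply -f ram-emptydir-pod.yaml
#   kubectl exec demo-ram-pod -- df -h /demo
```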

Check back for Part 2 of this series, where I’ll discuss persistent storage in Kubernetes.

Portworx and TKG – Portworx Scalable Storage in TKG

Portworx + Pure Storage = awesome

I have recently been pretty occupied with learning TKG and, oh yeah, also Portworx. I wanted to share what I have learned so far when it comes to getting Portworx up and running in a TKG Cluster. So without too much introduction, let's dive right in.

Create a new cluster

You need 3 worker nodes for Portworx.

tkg create cluster px1 --plan=dev -w 3

Install Portworx

Get IPs for Ansible inventory
TKG uses DHCP for all of the deployed Kubernetes VMs, which is fine. This command will create an inventory.ini in order to run Ansible playbooks against the cluster. Remember to update the inventory.ini if you add nodes.

kubectl get nodes -o jsonpath='{.items[*].status.addresses[?(@.type=="ExternalIP")].address}' | awk -v ORS='\n' '{ for (i = 1; i <= NF; i++) print $i }' >inventory.ini
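To see what the awk stage of that pipeline does without a cluster, here is a small sketch. The function name and the sample addresses are assumptions; the awk program is the same loop used in the command above.

```shell
# Hypothetical helper mirroring the awk stage of the command above:
# read whitespace-separated addresses on stdin and print one per line.
ips_to_inventory() {
  awk '{ for (i = 1; i <= NF; i++) print $i }'
}

# Sample addresses (assumptions) standing in for the kubectl jsonpath output.
printf '10.0.0.11 10.0.0.12 10.0.0.13' | ips_to_inventory
```

The jsonpath expression returns all the addresses on a single line, so splitting them one per line gives Ansible a plain host list it can read as an inventory.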

Run the Ansible Playbook
This playbook installs the Linux headers that the TKG Photon template does not include. Copy this playbook and save it to playbook.yaml, for example.

---
- hosts: all
  become: yes
  tasks:
  - name: install linux headers for the running kernel
    raw: tdnf install -y linux-devel-$(uname -r)

Then run the playbook against the inventory:

ansible-playbook -i inventory.ini -b -v playbook.yaml -u capv

Notice that the username for the TKG nodes is capv.

Follow this link from Portworx for more details:

https://docs.portworx.com/cloud-references/auto-disk-provisioning/vsphere/

Create the vsphere credentials in a secret

Create a vsphere-secret.yaml file and paste the yaml below, making sure to replace the credentials with your own, generated with the base64 example below.

#VSPHERE_USER: Use output of printf <vcenter-server-user> | base64
#VSPHERE_PASSWORD: Use output of printf <vcenter-server-password> | base64
apiVersion: v1
kind: Secret
metadata:
  name: px-vsphere-secret
  namespace: kube-system
type: Opaque
data:
  VSPHERE_USER: YWRtaW5pc3RyYXRvckB2c3BoZXJlLmxvY2Fs
  VSPHERE_PASSWORD: cHgxLjMuMEZUVw==

Then apply the secret

kubectl apply -f vsphere-secret.yaml
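A quick sketch of generating those base64 values follows. The helper name and the sample user name are assumptions for illustration; substitute your own vCenter credentials.

```shell
# Hypothetical helper: base64-encode a value for a Secret's data field.
# printf '%s' avoids the trailing newline that echo would add to the value,
# and tr strips any line wrapping that base64 may apply to long input.
encode_secret_value() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Example with a made-up user name, followed by a round-trip check.
encoded=$(encode_secret_value 'admin')
echo "$encoded"
printf '%s' "$encoded" | base64 --decode
echo
```

Using printf rather than echo matters here: a trailing newline would be encoded into the secret and end up as part of the user name or password.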

# Hostname or IP of your vCenter server

export VSPHERE_VCENTER=vc01.fsa.lab


# Prefix of your shared ESXi datastore(s) names. Portworx will use datastores whose names match this prefix to create disks.

export VSPHERE_DATASTORE_PREFIX=px1


# Change this to the port number vSphere services are running on if you have changed the default port 443

export VSPHERE_VCENTER_PORT=443

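# Template for the disks Portworx will create on the datastores (thin provisioned, size 200)

export VSPHERE_DISK_TEMPLATE=type=thin,size=200


# Kubernetes server version with the leading "v" stripped, used in the spec URL below

export VER=$(kubectl version --short | awk -Fv '/Server Version: /{print $3}')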

curl -fsL -o px-spec.yaml "https://install.portworx.com/2.6?kbver=$VER&c=portworx-demo-cluster&b=true&st=k8s&csi=true&vsp=true&ds=$VSPHERE_DATASTORE_PREFIX&vc=$VSPHERE_VCENTER&s=%22$VSPHERE_DISK_TEMPLATE%22"

kubectl apply -f px-spec.yaml

So the curl command at the end of this code block will create the px-spec.yaml file that will install Portworx in your cluster. Notice all the variables that have to be set for this to work. If you skip any of these above or below you will have problems.
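One way to catch a skipped variable before running curl is a small guard like the sketch below. The require_vars function is a hypothetical helper, not part of Portworx or kubectl; the variable names are the ones exported above.

```shell
# Hypothetical guard: return non-zero if any named variable is unset or empty.
require_vars() {
  missing=0
  for name in "$@"; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage before the curl command:
#   require_vars VER VSPHERE_VCENTER VSPHERE_DATASTORE_PREFIX \
#     VSPHERE_DISK_TEMPLATE || exit 1
```

Failing early is cheaper than debugging a px-spec.yaml that was generated from a URL with empty parameters.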

Create a repl = 3 storage class or whatever you want to test.

Copy the text below to a new file called px-repl3-sc.yaml

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
    name: px-repl3-sc
provisioner: kubernetes.io/portworx-volume
parameters:
   repl: "3"

Then apply the new StorageClass

kubectl apply -f px-repl3-sc.yaml
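To check that the new StorageClass provisions volumes, you could create a small claim against it. This is a sketch: the claim name, file name, and 10Gi size are assumptions for illustration.

```shell
# Write a hypothetical PersistentVolumeClaim that uses the px-repl3-sc class.
# Claim name and the 10Gi size are illustrative assumptions.
cat > px-repl3-pvc.yaml <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: px-repl3-pvc
spec:
  storageClassName: px-repl3-sc
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
EOF

# To try it on the cluster:
#   kubectl apply -f px-repl3-pvc.yaml
#   kubectl get pvc px-repl3-pvc
```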

Installing PX Backup will also get you the PX-Central UI

helm install px-backup portworx/px-backup --namespace px-backup --create-namespace --set persistentStorage.enabled=true,persistentStorage.storageClassName="px-repl3-sc"

This will get you up and running on a trial license, which is enough to experiment and learn Portworx. If you are new to Helm, make sure to learn more here.