Zero RPO for TKG? How to get Synchronous Disaster Recovery for your Tanzu Cluster

KubeCon and VMware Explore are coming up. One of our most popular sessions from VMware Explore (and VMworld before it) is the Stretched Cluster for VMware/vVols. You may have noticed that SRM and other DR solutions do not work with Tanzu, but I want you all to know that PX-DR Sync, or Metro-DR, is supported for Tanzu. This gives you ZERO RPO when failing stateful workloads over from one cluster to another, for example from one vSphere cluster to another, each running TKG.

[Image: Metro-DR architecture]

More information on how to set up Sync DR with Tanzu can be found on our docs page:

https://docs.portworx.com/operations/operate-kubernetes/disaster-recovery/px-metro/

Pay close attention to the docs as Tanzu has some special steps in the setup because of the way the Cloud Drives are created and managed with raw CNS volumes.

This is done with a shared etcd between the two distinct TKG clusters. That etcd can run at a third site where you would run the “witness node”. I run it in a standalone admin K8s cluster that hosts all my internal services, like etcd, ExternalDNS, Harbor, and more. Note that this etcd is used by Portworx Enterprise only; it is not the one used by K8s.
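For illustration, pointing both Portworx installs at that shared external etcd happens in the StorageCluster spec. This is a minimal sketch of the relevant fragment only; the endpoint addresses are placeholders, and a production setup would likely add TLS and auth on top:

# Fragment of a Portworx StorageCluster spec (operator install).
# Both TKG clusters point at the same external etcd. Placeholder addresses.
kvdb:
  internal: false
  endpoints:
    - etcd:http://etcd-0.example.internal:2379
    - etcd:http://etcd-1.example.internal:2379
    - etcd:http://etcd-2.example.internal:2379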

[Image: a slightly better diagram of Metro-DR]

At the end of the process you have two TKG clusters and one Portworx cluster. We use async schedules to copy the Kubernetes objects between clusters, while the data is copied synchronously between nodes, limited only by latency (the max for Sync DR is 10ms). This means the deployment for Postgres or Cassandra in the picture above is copied on a schedule, and on the non-live (target) cluster it is scaled to 0 replicas. The RPO is 0 since the data is replicated synchronously; the RTO depends on how fast you can spin up the replicas on the target.
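The scheduled object copy is driven by a Stork MigrationSchedule, a CRD that ships with Portworx. Here is a hedged sketch; the cluster pair, schedule policy, and namespace names are all placeholders, and the docs linked above are the authority for the Tanzu specifics:

apiVersion: stork.libopenstorage.org/v1alpha1
kind: MigrationSchedule
metadata:
  name: app-migrationschedule
  namespace: demo
spec:
  template:
    spec:
      clusterPair: remotecluster   # ClusterPair created during Metro-DR setup
      includeResources: true       # copy the Kubernetes objects on a schedule
      includeVolumes: false        # data is already synchronously replicated
      startApplications: false     # keep the target scaled to 0 replicas
      namespaces:
        - demo
  schedulePolicyName: every-15min  # a SchedulePolicy you define separately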

Portworx Enterprise and Metro-DR work with any storage target supported by Tanzu (vSAN, NFS datastores, VMFS datastores, other vVols), but the SPBM and vVols integrations from Pure Storage with the FlashArray are the most used anywhere. The effort for the integration and collaboration between Pure and VMware Engineering is amazing; Cody Hosterman and his team have done some amazing things. Metro-DR works great with Pure vVols and is the perfect cloud-native complement to your stretched vVols VMs using FlashArray ActiveCluster. If you are interested in using both together, let your Pure Storage team know, or send me a message on Twitter and I will track them down for you.

Database as a Service Platform on Tanzu with Portworx Data Services

So I am a week or so late, but the latest update of Portworx Data Services now officially supports Tanzu. I say officially since it did kind of work the whole time; I just can’t declare our support to the world until it passes all the tests from Engineering. So go ahead: the easiest Database-as-a-Service platform can now be built on your Tanzu clusters. Go to https://central.portworx.com and contact your local PX team to get access.

The Proof is in the Docs!
https://pds.docs.portworx.com/prerequisites/#_supported_kubernetes_versions

[Video: a quick demo of PDS on Tanzu, plus a couple more Portworx-on-Tanzu demos just for fun]

Fastest Way to Get Started with Portworx Data Services

TL;DR: This is the process to get an EKS cluster with Portworx Enterprise installed, PDS installed, and everything registered to your account in the PDS control plane. By following this you can use one command:

eksctl create cluster -f gitops-cluster.yaml

Wait about 22 minutes and you are ready to go with Amazon EKS, PX Enterprise and Portworx Data Services!


With big help from the amazing Chris Kennedy, I would like to share how to quickly get a K8s cluster with Portworx Enterprise. To start, read Chris’ blog post here:
https://portworx.com/blog/how-to-deploy-portworx-using-gitops-workflows/

Go back and follow it carefully; this example assumes you have the above article fully working and are now up to speed on using Flux to deploy Portworx.

Save the EKS cluster spec below to a file; I called mine gitops-cluster.yaml.
For more on using eksctl to create your cluster and what Portworx requires, see https://docs.portworx.com/install-portworx/cloud/aws/aws-eks/eksctl/eksctl-operator/


# A complete ClusterConfig sketch; the metadata and node group values are placeholders, so size them per the Portworx eksctl docs linked above.
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: demo-cluster
  region: us-east-1
managedNodeGroups:
  - name: storage-nodes
    instanceType: m5.xlarge
    desiredCapacity: 3     # Portworx needs at least 3 nodes
gitops:
  flux:
    gitProvider: github      # required. options are github, gitlab or git
    flags:                   # required. arbitrary map[string]string for all flux args.
    # these args are not controlled by eksctl. see https://fluxcd.io/docs/get-started/ for all available flags
      owner: "yourgithubusername"
      repository: "demo-cloud-ops"
      private: "true"
      branch: "main"
      namespace: "flux-system"
      path: "clusters/demo-cluster"

If you followed Chris’ instructions, you will have a git repo with the needed files cloned from his example.
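For orientation, the files this walkthrough adds or edits sit roughly like this. This is a sketch: Chris’ Portworx StorageCluster files are omitted, and the cluster directory name must match the path in your gitops config above.

demo-cloud-ops/
├── clusters/
│   └── pds-flux-eksdemo/
│       ├── pds.yaml
│       └── px-ns.yaml
└── portworx/
    ├── namespaces/
    │   ├── kustomization.yaml
    │   └── px-ns.yaml
    └── pds/
        ├── kustomization.yaml
        ├── pds-helm-release.yaml
        └── pds-helm-repo.yaml

The file you need to pay attention to for PDS to install on bootstrap of your cluster: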

portworx/pds/kustomization.yaml

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: pds-system
resources:
  - pds-helm-release.yaml
  - pds-helm-repo.yaml

portworx/pds/pds-helm-repo.yaml

apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: HelmRepository
metadata:
  name: pds
  namespace: pds-system
spec:
  interval: 1m0s
  url: https://portworx.github.io/pds-charts

portworx/pds/pds-helm-release.yaml

apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: pds
  namespace: pds-system
spec:
  interval: 1m0s
  chart:
    spec:
      chart: pds-target
      sourceRef:
        kind: HelmRepository
        name: pds
        namespace: pds-system
      version: 1.5.0 #set to your current pds version
  values:
    tenantId: "insert your PDS tenant ID"
    bearerToken: "insert PDS bearer token"
    apiEndpoint: "insert your PDS API endpoint"

Notice the placeholder values; you must get these from the Add Target Cluster wizard in the PDS control plane UI.

clusters/pds-flux-eksdemo/pds.yaml

apiVersion: kustomize.toolkit.fluxcd.io/v1beta2
kind: Kustomization
metadata:
  name: portworx-pds
  namespace: flux-system
spec:
  interval: 1m0s
  dependsOn: 
    - name: portworx-storagecluster
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./portworx/pds
  prune: true
  wait: true

This file has an important part: the dependsOn entry makes the PDS install wait until the portworx-storagecluster Kustomization is finished.
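If you want to watch that ordering play out during bootstrap, the Flux CLI will show the Kustomizations reconciling (assuming you have flux installed locally):

flux get kustomizations --watch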

The next few files in the repo add the required namespaces for PDS, plus a “pds-demo” namespace with the label pds.portworx.com/available: “true”. This namespace label allows PDS to deploy data services to that namespace; all other namespaces are not shown in the PDS deployment UI.
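As an aside, you can opt an existing namespace into PDS later with the same label; the namespace name here is just an example:

kubectl label namespace my-team-ns pds.portworx.com/available=true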

clusters/pds-flux-eksdemo/px-ns.yaml

apiVersion: kustomize.toolkit.fluxcd.io/v1beta2
kind: Kustomization
metadata:
  name: namespaces
  namespace: flux-system
spec:
  interval: 1m0s
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./portworx/namespaces
  prune: true

portworx/namespaces/px-ns.yaml

kind: Namespace
apiVersion: v1
metadata:
  name: pds-system
---
kind: Namespace
apiVersion: v1
metadata:
  name: portworx
---
kind: Namespace
apiVersion: v1
metadata:
  name: pds-demo
  labels:
    pds.portworx.com/available: "true"

portworx/namespaces/kustomization.yaml

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - px-ns.yaml

Once you have your files edited, you can create your EKS cluster from the spec file that has the gitops settings pointed at your GitHub repo.

eksctl create cluster -f gitops-cluster.yaml

You will wait probably 20-25 minutes depending on how many nodes you are using. The minimum for Portworx is 3, but if you plan on having lots of Data Services running you will need to choose a bigger EC2 instance size and more nodes.
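Once the cluster is up, here is a quick sanity check that both halves landed. This assumes the portworx and pds-system namespaces created above; your StorageCluster name comes from Chris’ files:

kubectl -n portworx get storagecluster
kubectl -n pds-system get pods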

Have a great afternoon,

Test all Kubeconfig Contexts

One day I woke up and had like 14 clusters in my Kubeconfig. I didn’t remember which ones did what or if the clusters even still existed. So I cooked up this command to run through them all and make sure they actually responded:

kubectl config get-contexts -o name | xargs -I {} kubectl --context={} get nodes -o wide

This works for me. If you have an alternative way, please share in the comments.
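One tweak worth considering (my addition, not part of the original one-liner): add a request timeout so unreachable clusters fail fast instead of hanging the loop.

kubectl config get-contexts -o name | xargs -I {} kubectl --context={} --request-timeout=5s get nodes -o wide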

Connecting your Application to Cassandra on PDS

Next up is a way to test Cassandra when deployed with PDS. I saved my python application to GitHub here: https://github.com/2vcps/py-cassandra

The key here is to deploy Cassandra via PDS, then get the server connection names from PDS. Each step is explained in the repo; go over there and fork or clone it, or just use my settings. A quick summary (it is really this easy):

  1. Deploy Cassandra to your Target in PDS.
  2. Edit the env-secret.yaml file to match your deployment (a sketch follows this list).
  3. Apply the secret. kubectl -n namespace apply -f env-secret.yaml
  4. Apply the deployment. kubectl -n namespace apply -f worker.yaml
  5. Check the database in the Cassandra pod. kubectl -n namespace exec -it cas-pod -- bash
  6. Use cqlsh to check the table the app creates.
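To make step 2 concrete, here is a hedged sketch of what the secret might look like. The exact key names and values come from the repo and from your PDS deployment, so treat every name below as an assumption:

apiVersion: v1
kind: Secret
metadata:
  name: cas-secret                 # hypothetical name; match what worker.yaml references
type: Opaque
stringData:
  CASSANDRA_HOST: "cas-xxxx.pds-demo.svc"   # connection service name from the PDS UI
  CASSANDRA_USER: "pds-user"                # credentials from the PDS UI
  CASSANDRA_PASSWORD: "change-me"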

That is it, pretty easy, and it creates a lot of records in the database. You could also scale it up in order to test connections from many sources. I hope this helps you quickly use PDS, and if you have any updates or changes to my repo please submit a PR.
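Scaling it up would look something like this, assuming the deployment defined in worker.yaml is named worker (check the repo for the real name):

kubectl -n namespace scale deployment worker --replicas=5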

Testing Apache Kafka in Portworx Data Services

With the GA of Portworx Data Services, I needed a way to connect some test applications with Apache Kafka. Kafka is one of the most requested data services in PDS. Deploying Kafka is very easy with PDS, but I wanted to show how easy it was for a data team to connect their application to Kafka in PDS. I was able to find the kafka-python library, so I started working on a couple of things:

  1. A python script to create some kind of load on Kafka.
  2. Containerize it, so I can make it easy and repeatable.
  3. Create the Kubernetes deployments so it is quick and easy.

The following GitHub repo is the result of that project:
https://github.com/2vcps/py-kafka

See the repo for the steps on setting up the secret and deployments in K8s to use with your PDS Kafka. Honestly, it should work with any Kafka deployment where you have the connection service, username, and password.
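If you want to sanity-check the connection details before pointing the app at the cluster, a generic client like kcat works. This is my addition, not part of the repo, and the SASL settings are assumptions you should match to your actual Kafka deployment:

kcat -b <connection-service>:9092 \
  -X security.protocol=SASL_PLAINTEXT \
  -X sasl.mechanisms=SCRAM-SHA-512 \
  -X sasl.username=<user> \
  -X sasl.password=<password> \
  -L   # -L lists brokers and topics, proving the connection works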

[Video: a quick demo of it all in action]

Check out the YouTube demo I did above to see it all in action.

Portworx Data Services GA: An Admin’s View, So easy you won’t believe it…

Portworx Data Services (PDS), the DBaaS platform built on Portworx Enterprise, is One Platform for All Databases. This SaaS platform can work with your clusters in the cloud or in your datacenter. Check out this demo of some of the admin tasks available.

The good part: other than adding your DB consumers and your target clusters to run workloads, the rest is configured for you. The power of the platform is that you can change many settings, but the best practices are already put in place for you. Now you can have all databases with just one API and one UI.

You no longer need to learn an Operator or custom resources for every different data service your developers, data architects, and DBAs ask for. And your data teams don’t need to learn K8s; they work with databases without ever having to become platform experts. Just one API, one UI.
One Platform. All Databases.

Collection of PDS Links (as of May 18, 2022)

Blog from Umair Mufti on whether to use DBaaS or DIY
https://blog.purestorage.com/perspectives/dbaas-or-diy-build-versus-buy-comparison/
How to deploy Postgres, from Bhavin Shah, with a video demo
https://portworx.com/blog/accelerate-data-services-deployment-on-kubernetes-using-portworx-data-services/
Deploying Postgres via PDS, by Ron Ekins, with step-by-step details
https://ronekins.com/2022/05/18/portworx-data-services-pds-and-postgresql/
Portworx Data Services page on Purestorage.com
https://www.purestorage.com/enable/portworx/data-services.html
The PDS GA announcement from Pure Storage
https://www.purestorage.com/company/newsroom/press-releases/pure-boosts-developer-productivity-expanding-portworx-portfolio.html
Migrating existing PostgreSQL into Managed PostgreSQL in PDS
https://portworx.com/blog/migrating-postgresql-to-portworx-data-services/