Kubernetes and the Pure Storage FlexVolume Plugin

First, if you are using Pure Storage and Kubernetes, make life easier and take a look at our plugin. It is now at version 1.2.2 and GA.

https://hub.docker.com/r/purestorage/k8s/

Make sure to follow the directions on the page to pull and install the plugin. If you are using OpenShift, pay special attention to the Readme. I will post more on this in the near future.

Cockroach DB as our Persistent Database

I wanted a simple database that is easy to run in a container and that is not the same old thing. I built a Go app that writes to the database over and over to demonstrate the inner workings of the plugin, not to serve as a performance test.

To learn more about the steps I use in the video to deploy and manage CRDB in K8s please check out this link. https://www.cockroachlabs.com/docs/stable/orchestrate-cockroachdb-with-kubernetes.html

With that said, please check out how to deploy and scale a database with a persistent data platform from a Pure FlashArray. Watch this in full screen to make the CLI commands easier to see.

What you are seeing in the video:

  1. Deploy the initial 3 pods with volumes automatically created and connected on the Pure FA.
  2. Initialize the cluster.
  3. Fail a node and watch K8s redeploy a new container and re-attach the data volume.
  4. Run a load generation application as a K8s Job.
  5. Scale the DB cluster out to 8 nodes.

What is next?

This is a really easy and quick demo, but it shows how easy it is to use the Pure plugin to manage persistent data, so that you do not lose data when an app crashes, and to scale out. This can all be done via policy, and the deployment can be made even easier using Helm. In a future post we will see how we can take advantage of these methods and keep the same highly available, high-performance, and very easy-to-use persistent data platform for your application.

vSphere Container Hosts Storage Networking

In the last couple of days I had a couple of questions from customers implementing some kind of container host on top of vSphere. Each was doing it to make use of either the Kubernetes or the Docker Volume Plugin for Pure Storage. First, there was a little confusion about whether the container itself needs iSCSI access to the array. The container needs network access for sure (I mean, if you want someone to use the app), but it does not need access to the iSCSI network. Side note: iSCSI is not required to use the persistent storage plugins for Pure. Fibre Channel is supported. iSCSI may just be an easy path to using a Pure FlashArray, or NFS (10G network) for FlashBlade, with an existing vSphere setup.

To summarize all that: the container host VM needs to be able to talk directly to the storage. I accomplish this today with multiple vNICs, but you can do it however you like. There may be some vSwitches, physical NICs, and switches in the way, but the end result should be the VM talking to the FlashArray or FlashBlade.

More information on configuring our plugins is here:

  1. Docker/DCOS/Mesos – https://store.docker.com/plugins/pure-docker-volume-plugin
  2. Kubernetes and OpenShift – https://hub.docker.com/r/purestorage/k8s/

Basically, the container host needs to be able to talk to the MGMT interface of the array to do its automation: creating host objects and volumes and connecting them together (and removing them when you are finished). The thing to know is that the plugin does all the work for you. Then, when your application manifest requests the storage, the plugin mounts the device to the required mount point inside the container. The app (container) does not know or care anything about iSCSI, NFS, or Fibre Channel (and it should not).

Container HOST Storage Networking

Storage Networking for Container Hosts as VMs

If you are setting up iSCSI in vSphere for Pure, you should probably go see Cody’s pages on doing this. Most of it is a good foundation for what I am about to share.

https://www.codyhosterman.com/pure-storage-vmware-overview/flasharray-and-vmware-best-practices/iscsi-setup/

Make sure you can use MPIO, and follow the Linux best practices for Pure Storage inside your container hosts.

Do it the good old (new) GUI way

So what I normally do is set up 2 new port groups on my VDS.

Something like iscsi-1 and iscsi-2. I know, I am very original and creative.

Set the uplink for the Portgroup

We used to set up “in-guest iSCSI” for VMs that needed array-based snapshot features way back in the day. This is basically the same piping. After creating the new port groups, edit the settings in the HTML5 GUI as shown below.

Set the Failover Order

Go for iSCSI-1 on Uplink 1 and iSCSI-2 on Uplink 2

I favor putting the other Uplink into “Unused” as this gives me the straightest troubleshooting path in case something downstream isn’t working. You can put it in “standby” and probably be just fine.
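The failover layout described above can be written down as data and sanity-checked. This is only a sketch: the dictionary layout and the helper name `check_teaming` are made up for illustration and are not a vSphere API. It checks the two properties the text relies on, namely one active uplink per port group and no two port groups sharing an active uplink.

```python
# Hypothetical description of the two port groups from this post.
PORT_GROUPS = {
    "iscsi-1": {"active": ["Uplink 1"], "unused": ["Uplink 2"]},
    "iscsi-2": {"active": ["Uplink 2"], "unused": ["Uplink 1"]},
}


def check_teaming(port_groups):
    """Return a list of problems found in the failover order (empty if none)."""
    problems = []
    owner_of_uplink = {}
    for name, teaming in port_groups.items():
        active = teaming.get("active", [])
        if len(active) != 1:
            problems.append(
                f"{name}: expected exactly one active uplink, found {len(active)}"
            )
            continue
        uplink = active[0]
        if uplink in owner_of_uplink:
            problems.append(
                f"{name}: shares active uplink {uplink} with {owner_of_uplink[uplink]}"
            )
        owner_of_uplink[uplink] = name
    return problems


if __name__ == "__main__":
    print(check_teaming(PORT_GROUPS) or "failover order looks consistent")
```

Moving the second uplink from “unused” to “standby” would not change the result of this check, which matches the point above that either choice can work.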

CockroachDB with Persistent Data

There IS an Official Whitepaper!

While I was writing this post the awesome Simon Dodsley was writing a great whitepaper on persistent storage with Pure. As you can see, there are some very different ways to deploy CockroachDB, but the main goal is to keep your important data persistent no matter what happens to the containers as they scale, live, and die.

I know most everyone loved seeing the demo of the most mission critical app in my house. I also want to show a few quick ways to leverage the Pure plugin to provide persistent data to a database. I am posting the files I used to create the demo here: https://github.com/2vcps/crdb-demo-pure

First note
I started with the instructions provided here by Cockroach Labs.
This is an insecure installation for demo purposes. They do provide the instructions for a more Prod ready version. This is good enough for now.

Second note
The load balancer I used was created for my environment using the instructions to output the HAProxy file found here on the Cockroach Labs website:
https://www.cockroachlabs.com/docs/stable/generate-cockroachdb-resources.html

My YAML file refers to a Docker image I built for the HAProxy load balancer. If it works for you, cool! If not, please follow the instructions above to create your own. If you really need to know more, I can write another post showing how to take the Dockerfile and copy the CFG generated by CRDB into a new image just for you.

 

My nice little docker swarm

media_1501095950777.png

I have three VMware VMs running Ubuntu 16.04 with Docker CE and the Pure plugin already installed. Read more here if you want to install the plugin.

media_1501096079095.png

Run the deploy

https://github.com/2vcps/crdb-demo-pure/blob/master/3node-cockroachdb-pure.yml

version: '3.1'
services:
    db1:
      image: cockroachdb/cockroach:v1.0.2
      deploy:
            mode: replicated
            replicas: 1
      ports:
            - 8888:8080
      command: start --advertise-host=cockroach_db1 --logtostderr --insecure
      networks:
            - cockroachdb
      volumes:
            - cockroachdb-1:/cockroach/cockroach-data
    db2:
      image: cockroachdb/cockroach:v1.0.2
      deploy:
         mode: replicated
         replicas: 1
      command: start --advertise-host=cockroach_db2 --join=cockroach_db1:26257 --logtostderr --insecure
      networks:
         - cockroachdb
      volumes:
         - cockroachdb-2:/cockroach/cockroach-data
    db3:
      image: cockroachdb/cockroach:v1.0.2
      deploy:
         mode: replicated
         replicas: 1
      command: start --advertise-host=cockroach_db3 --join=cockroach_db1:26257 --logtostderr --insecure
      networks:
         - cockroachdb
      volumes:
         - cockroachdb-3:/cockroach/cockroach-data
    crdb-proxy:
      image: jowings/crdb-proxy:v1
      deploy:
         mode: replicated
         replicas: 1
      ports:
         - 26257:26257
      networks: 
         - cockroachdb

networks:
    cockroachdb:
        external: true

volumes:
    cockroachdb-1:
      driver: pure
    cockroachdb-2:
      driver: pure
    cockroachdb-3:
      driver: pure

 

$ docker stack deploy -c 3node-cockroachdb-pure.yml cockroach

As the compose file shows, this command deploys four services: three database nodes and one HAProxy load balancer. Each database node gets a brand new volume attached directly to the path by the Pure Docker Volume Plugin.
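The database entries in the compose file are nearly identical, so they could be generated instead of copied. Here is a minimal Python sketch, assuming the same naming pattern as the file above (`dbN`, `cockroach_dbN`, `cockroachdb-N`). The helper names `crdb_service` and `crdb_stack` are hypothetical, the HAProxy service is left out, and none of this is part of the linked repo.

```python
import json


def crdb_service(node, image="cockroachdb/cockroach:v1.0.2", stack="cockroach"):
    """Build the compose service entry for CockroachDB node number `node`."""
    command = f"start --advertise-host={stack}_db{node}"
    if node > 1:
        # Every node after the first joins the cluster through db1.
        command += f" --join={stack}_db1:26257"
    command += " --logtostderr --insecure"
    service = {
        "image": image,
        "deploy": {"mode": "replicated", "replicas": 1},
        "command": command,
        "networks": ["cockroachdb"],
        "volumes": [f"cockroachdb-{node}:/cockroach/cockroach-data"],
    }
    if node == 1:
        # Only db1 publishes the admin UI port in the file above.
        service["ports"] = ["8888:8080"]
    return service


def crdb_stack(nodes, image="cockroachdb/cockroach:v1.0.2"):
    """Build services, volumes, and the external network for `nodes` databases."""
    node_ids = range(1, nodes + 1)
    return {
        "version": "3.1",
        "services": {f"db{n}": crdb_service(n, image) for n in node_ids},
        "networks": {"cockroachdb": {"external": True}},
        # Each node gets its own volume from the Pure volume driver.
        "volumes": {f"cockroachdb-{n}": {"driver": "pure"} for n in node_ids},
    }


if __name__ == "__main__":
    # Printed as JSON only so the structure is easy to inspect.
    print(json.dumps(crdb_stack(3), indent=2))
```

Calling `crdb_stack(6, image="cockroachdb/cockroach:v1.0.3")` produces the database entries for a six node layout with the newer image tag.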

New Volumes

media_1501098437804.png

Each new volume is created, attached to the host via iSCSI, and mounted into the container.

Cool Dashboard

media_1501098544719.png

Other than there being no data, do you notice something else?
First, let’s generate some data.
I run this from a client machine, but you can also attach to one of the DB containers and run this command to generate some sample data.

cockroach gen example-data | cockroach sql --insecure --host [any host ip of your docker swarm]

media_1501098910914.png

I am also going to create a “bank” database and use a few containers to start inserting data over and over.

cockroach sql --insecure --host 10.21.84.7
# Welcome to the cockroach SQL interface.
# All statements must be terminated by a semicolon.
# To exit: CTRL + D.
root@10.21.84.7:26257/> CREATE database bank;
CREATE DATABASE
root@10.21.84.7:26257/> set database = bank;
SET
root@10.21.84.7:26257/bank> create table accounts (
-> id INT PRIMARY KEY,
-> balance DECIMAL
-> );
CREATE TABLE
root@10.21.84.7:26257/bank> ^D

I created a program in Go to insert some data into the database, just to make the charts interesting. This container starts, inserts a few thousand rows, then exits. I run it as a service with 12 replicas so it is constantly going. I call it gogogo because I am funny.
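To give a feel for what a load generator like this does, here is a small Python sketch that builds batched INSERT statements for the `accounts` table created above. It is not the gogogo source. The function names, the batch sizes, and the random balance range are assumptions, and it only builds SQL text rather than connecting to the cluster.

```python
import random


def build_insert(start_id, count, rng=None):
    """Build one multi-row INSERT for ids start_id .. start_id + count - 1."""
    rng = rng or random.Random()
    rows = ", ".join(
        f"({start_id + i}, {rng.randint(0, 10000)})" for i in range(count)
    )
    return f"INSERT INTO accounts (id, balance) VALUES {rows};"


def batches(first_id, total_rows, batch_size):
    """Yield (start_id, count) pairs that cover total_rows ids without overlap."""
    for offset in range(0, total_rows, batch_size):
        yield first_id + offset, min(batch_size, total_rows - offset)


if __name__ == "__main__":
    # A few thousand rows, split into statements of 500 rows each.
    for start_id, count in batches(1, 3000, 500):
        statement = build_insert(start_id, count)
        print(statement[:60], "...")
```

Because `id` is the primary key, several replicas running at once would each need their own id range (or generated ids) to avoid collisions.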

media_1501108005294.png

gogogo

media_1501108062456.png
media_1501108412285.png

You can see the data slowly going into the volumes.

media_1501171172944.png

Each node remains (roughly) balanced as CockroachDB stores the data.

What happens if a container dies?

media_1501171487843.png

Let’s make this one go away.

media_1501171632191.png

We kill it.
Swarm starts a new one. The Docker engine uses the Pure plugin and remounts the volume. The CRDB cluster keeps on going.
New container ID but the data is the same.

media_1501171737281.png

Alright what do I do now?

media_1501171851533.png

So you want to update the image to the latest version of Cockroach? Did you notice this in our first screenshot?

Also, our database is getting a lot of hits (not really, but let’s pretend), so we need to scale it out. What do we do now?

https://github.com/2vcps/crdb-demo-pure/blob/master/6node-cockroachdb-pure.yml

version: '3.1'
services:
    db1:
      image: cockroachdb/cockroach:v1.0.3
      deploy:
            mode: replicated
            replicas: 1
      ports:
            - 8888:8080
      command: start --advertise-host=cockroach_db1 --logtostderr --insecure
      networks:
            - cockroachdb
      volumes:
            - cockroachdb-1:/cockroach/cockroach-data
    db2:
      image: cockroachdb/cockroach:v1.0.3
      deploy:
         mode: replicated
         replicas: 1
      command: start --advertise-host=cockroach_db2 --join=cockroach_db1:26257 --logtostderr --insecure
      networks:
         - cockroachdb
      volumes:
         - cockroachdb-2:/cockroach/cockroach-data
    db3:
      image: cockroachdb/cockroach:v1.0.3
      deploy:
         mode: replicated
         replicas: 1
      command: start --advertise-host=cockroach_db3 --join=cockroach_db1:26257 --logtostderr --insecure
      networks:
         - cockroachdb
      volumes:
         - cockroachdb-3:/cockroach/cockroach-data
    crdb-proxy:
      image: jowings/crdb-haproxy:v2
      deploy:
         mode: replicated
         replicas: 1
      ports:
         - 26257:26257
      networks: 
         - cockroachdb
    db4:
      image: cockroachdb/cockroach:v1.0.3
      deploy:
         mode: replicated
         replicas: 1
      command: start --advertise-host=cockroach_db4 --join=cockroach_db1:26257 --logtostderr --insecure
      networks:
         - cockroachdb
      volumes:
         - cockroachdb-4:/cockroach/cockroach-data
    db5:
      image: cockroachdb/cockroach:v1.0.3
      deploy:
         mode: replicated
         replicas: 1
      command: start --advertise-host=cockroach_db5 --join=cockroach_db1:26257 --logtostderr --insecure
      networks:
         - cockroachdb
      volumes:
         - cockroachdb-5:/cockroach/cockroach-data
    db6:
      image: cockroachdb/cockroach:v1.0.3
      deploy:
         mode: replicated
         replicas: 1
      command: start --advertise-host=cockroach_db6 --join=cockroach_db1:26257 --logtostderr --insecure
      networks:
         - cockroachdb
      volumes:
         - cockroachdb-6:/cockroach/cockroach-data
networks:
    cockroachdb:
        external: true

volumes:
    cockroachdb-1:
      driver: pure
    cockroachdb-2:
      driver: pure
    cockroachdb-3:
      driver: pure
    cockroachdb-4:
      driver: pure
    cockroachdb-5:
      driver: pure
    cockroachdb-6:
      driver: pure

$ docker stack deploy -c 6node-cockroachdb-pure.yml cockroach

(It is important to provide the name of the stack you already used; otherwise you will get errors.)

media_1501172007803.png

We are going to update the services with the new images.

  1. This will replace each container with the new version, v1.0.3.
  2. This will reattach the already created FlashArray volumes to nodes db1, db2, and db3.
  3. It will also create new empty volumes for the new scaled-out nodes db4, db5, and db6.
  4. CockroachDB will begin replicating the data to the new nodes.
  5. My gogogo client “barrage” is still running.

This is kind of the shotgun approach in this non-prod demo environment. If you want no-downtime upgrades to containers, I suggest reading more on blue-green deployments. I will show how to make the application upgrade with no downtime and use blue-green in another post.

CockroachDB begins to rebalance the data.

media_1501172638117.png

6 nodes

media_1501172712079.png

If you notice the gap in the queries, it is because I updated every node all at once. A better way would be to do one at a time and make sure each node is back up while they “roll” through the upgrade to the new image. Not prod, remember?
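The “one at a time” idea can be sketched as a small loop: update a node, wait until it reports healthy, then move to the next. This Python sketch is only an illustration. The `update` and `is_healthy` callables are placeholders you would supply yourself (for example, wrappers around your own deploy and health-check commands); they are not Docker or CockroachDB APIs.

```python
import time


def rolling_update(nodes, update, is_healthy, poll_seconds=1.0, max_polls=60):
    """Update nodes one at a time, waiting for each to report healthy.

    Stops with RuntimeError if a node does not come back, so the
    remaining nodes are left untouched on the old version.
    """
    for node in nodes:
        update(node)
        for _ in range(max_polls):
            if is_healthy(node):
                break
            time.sleep(poll_seconds)
        else:
            raise RuntimeError(f"{node} did not become healthy; stopping the rollout")
    return list(nodes)


if __name__ == "__main__":
    # Fake callables so the sketch runs without a cluster.
    updated = []
    rolling_update(
        ["db1", "db2", "db3"],
        update=updated.append,
        is_healthy=lambda node: True,
        poll_seconds=0,
    )
    print("updated in order:", updated)
```

With a replicated database, keeping all but one node up at any moment is what avoids the gap in queries seen in the chart.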

media_1501172781312.png
media_1501172828992.png

The application says you are using 771MiB of your 192GB, while the FlashArray is using just maybe 105MB across these volumes.
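To put those two numbers side by side, here is a quick back-of-the-envelope calculation in Python. It assumes the dashboard figure is in MiB and the array figure is in decimal MB, and the 105MB is only approximate, so treat the result as a rough ratio rather than a measured data reduction number.

```python
MIB = 1024 ** 2   # bytes in a mebibyte
MB = 1000 ** 2    # bytes in a decimal megabyte


def reduction_ratio(app_bytes, array_bytes):
    """Ratio of what the application reports to what the array stores."""
    return app_bytes / array_bytes


app_reported = 771 * MIB     # reported by the CockroachDB dashboard
array_reported = 105 * MB    # approximate figure from the FlashArray

ratio = reduction_ratio(app_reported, array_reported)
print(f"roughly {ratio:.1f} to 1")   # roughly 7.7 to 1
```

If the array figure were also in MiB the ratio would be closer to 7.3 to 1, so either way the array is storing roughly a seventh of what the application reports.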

A little while later…

media_1501175811897.png

Now we are mostly balanced with replicas in each db node.

Conclusion
This is just scratching the surface of running highly scalable data applications in containers with persistent data on a FlashArray. Are you a Pure customer or potential Pure customer about to run stateful/persistent apps on Docker/Kubernetes/DCOS? I want to hear from you. Leave a comment or send me a message on Twitter @jon_2vcps.

If you are a developer and have no clue what your infrastructure team does or is doing I am here to help make everyone’s life better. No more weekend long deployments or upgrades. Get out of doing storage performance troubleshooting.

Go to more of your kids’ soccer games.

FlashStack Your Way to Awesomeness

You may or may not have heard about Pure Storage and Cisco partnering to provide solutions together to help our current and prospective customers using UCS, Pure Storage, and VMware. These predesigned and tested architectures provide a full solution for compute, network and storage. Read more here:

https://www.purestorage.com/company/technology-partners/cisco.html

http://blogs.cisco.com/datacenter/accelerate-vdi-success-with-cisco-ucs-and-pure-storage

This results in CVDs (Cisco Validated Designs):

http://www.cisco.com/c/en/us/td/docs/unified_computing/ucs/UCS_CVDs/ucs_flashstack_view62_5k.html

There are more coming for SQL, Exchange, SAP, and general virtual machines (I call it JBOVMs, Just a Bunch of VMs).

Turn-key like solution for compute, network, and storage

Know how much and what to purchase when it comes to compute, network, and storage, as we have worked with Cisco to validate with actual real workloads, many times mixed workloads, because who runs just SQL or just Active Directory? It is proven and works. Up and running in a couple of days. If a couple of months was not good enough (legacy way), and then 2-4 weeks (newer way with legacy HW) wasn’t good enough, how about 1-2 days? For reals, a next generation datacenter. Also, scale compute, network, and storage independently. Why buy extra hypervisor licenses when you just need 5 TB of space?

Ability to connect workloads from/to the public clouds (AWS, Azure)

I don’t think as many people know this as they should, but Rob Barker “Barkz” is awesome. He worked hard to prove out the ability to use Pure FlashArray with Azure compute. Great read and more details here:

Announcing: Pure Storage All-Flash Cloud for Microsoft Azure

Official Pure information here:

https://www.purestorage.com/resources/type-a/pure-storage-all-flash-cloud-azure-deployment-guide.html

Azure is ready now and AWS is in the works.

Ability to backup to the public clouds.

No secret here: we are working hard to integrate with backup software vendors. Some have been slow, and others have been willing to work with our API to make seamless backup and snapshot management integration with Pure an amazing thing.

Just one example of how Commvault is enabling backup to Azure:

http://www.commvault.com/resource-library/55fc5ff8991435a6ce000c9c/backup-to-azure-with-commvault.pdf

IntelliSnap and Pure Storage

https://documentation.commvault.com/commvault/v10/article?p=features/snap_backup/pure/overview.htm

Check out how easy it is to set up Commvault and Pure Storage.

https://www.youtube.com/watch?v=af-dxbYYo2g

Ease of storage allocation without the need of a storage specialist

If I have ever talked to you about Pure Storage and I didn’t say how simple it is to use or mention my own customers that are not “Storage Peeps” that manage it quite easily then I failed. Take away my Orange sunglasses.

If you are looking at FlashStack, or are just now finding out how easy it is, remember: no Storage Ph.D. required. We even have nearly everything you need built into our free vSphere Plugin. Info from my hero Cody Hosterman here.

The Pure Storage Plugin for the vSphere Web Client

Here is a demo if you want to see how it works. This is a little older but I know he is working on some new stuff.

Even better if you would like to automate end to end and tie the Pure Storage provisioning with UCS Director that is possible too! See here:

https://www.youtube.com/watch?v=cMkQhbh6_As

Pure//Accelerate

Have you registered for Pure Accelerate yet? You should do it right now.

The next great conference where you actually learn about what is pertinent to your passion for IT. Develop insight for what is next, and hear from your peers and industry experts about moving to the next generation of IT.

accelerate

In 10 years you will tell people, yeah, I was at the very first Pure//Accelerate, I was there before EVERYONE else. You can be the IT hipster all over again. Before it moved to Moscone and had 30,000 people. You can move to Portland and drink IPA’s and post pictures of them to Instagram.

JPEG image-069AC5307867-1

UNMAP – Do IT!

Pretty sure my friend Cody Hosterman has talked about this until he turned blue in the face.  Just a point I want to quickly re-iterate here for the record. Run unmap on your vSphere Datastores.

Read this if you are running Pure Storage, but even if you run other arrays (especially all-flash) find a way to do UNMAP on a regular basis:

http://www.codyhosterman.com/2016/01/flasharray-unmap-script-with-the-pure-storage-powershell-sdk-and-poweractions/

Additionally, start to learn the ins-n-outs of vSphere 6 and automatic unmap!

http://blog.purestorage.com/direct-guest-os-unmap-in-vsphere-6-0-2/

Speaking of In-n-out…. I want a double double before I start Whole 30.

in-n-out

Easy Storage Monitoring – Setting Up PureELK with Docker

[UPDATE June 2016: It appears this works with Ubuntu only, or maybe a Debian flavor. I am hearing RHEL is problematic to get the dependencies working.]

I have blogged in the past about setting up vROPS (vCOPS) and Splunk to monitor a Pure Storage FlashArray using the REST API. Scripts and GETs and PUTs are fun and all but what if there was a simple tool you can install to have your own on site monitoring and analytics of your FlashArrays?

Enter Pure ELK. Some super awesome engineers back in Mountain View wrote this integration for Pure and ELK, packaged it into an amazingly easy installation, and released it on GitHub! Open source and ready to go!
https://github.com/pureelk

and

http://github.com/pureelk/pureelk

Don’t know Docker? Cool, we will install it for you. Don’t know Kibana or Elasticsearch? Got you covered. One line in a fresh Ubuntu install (I used Ubuntu but I bet your favorite flavor will suffice).

Go ahead and try:

curl -s https://raw.githubusercontent.com/pureelk/pureelk/master/pureelk.sh | bash -s install

(fixed url to reflect no longer in Dev)

This will download and install Docker, set up all the dependencies for Pure ELK, and let you know where to go from your browser to configure your FlashArrays.

I had one small snag:

Connecting to the Docker Daemon!

media_1450716022076.png

My user was not in the right group to connect to Docker the first time. The Docker install, when it is not automated, actually tells you to add your user to the “docker” group in order to connect:

$ sudo usermod -aG docker [username]

Logging out and back in did the trick. If you know a better way for the change to be recognized without logging out, let me know in the comments.

I re-ran the install:
curl -s https://raw.githubusercontent.com/pureelk/pureelk/master/pureelk.sh | bash -s install

In about 4 minutes I was able to hit the management IP and start adding FlashArrays!

Quickly add all your FlashArrays

media_1450715719804.png

Click the giant orange PLUS button.

This is great if you have more than one FlashArray. If you only have one it still works. Everyone should have more Flash though right?

media_1450715771293.png

Fill in your FlashArray information. You can choose the time-to-live for the metrics and how often to pull data from the FlashArray.

Success!

media_1450715937834.png

I added a couple of arrays for fun and then clicked “Go to Kibana”
I could have gone to
https://[server ip]:5601

Data Already Collecting

media_1450716109188.png

This is just the beginning. In the next post I will share some of the pre-packaged dashboards and also some of the customizations you can make in order to visualize all the data PureELK is pulling from the REST API. Have fun with this free tool. It can be downloaded and set up in less than 10 minutes on a Linux machine, 15 minutes if you need to build a new VM.

Register: VMUG Webinar and Pure Storage September 22

Register here: http://tinyurl.com/pq5fd9k

On September 22 at 1:00pm Eastern time, Pure Storage and VMware will be highlighting the results of an ESG Lab Validation paper. The study on consolidating workloads with VMware and Pure Storage used a single FlashArray //m50 and deployed five virtualized mission-critical workloads: VMware Horizon View, Microsoft Exchange Server, Microsoft SQL Server (OLTP), Microsoft SQL Server (data warehouse), and Oracle (OLTP). While I won’t steal all the thunder, it is good to note that all of this was run with zero tuning on the applications. Want out of the business of tweaking and tuning everything in order to get just a little more performance from your application? Problem solved. Plus, check out the FlashArray and the consistent performance even during failures.

Tier 1 workloads in 3u of Awesomeness

wpid1910-media_1442835406510.png

You can see in the screenshot the results of running tier one applications on an array made to withstand the real-world ups and downs of the datacenter. Things happen to hardware, and even software, but it is good to see the applications still doing great. We always tell customers it is not how fast the array is in a pristine benchmark, but how it responds when things are not going well, when a controller loses power or a drive (or two) fails. That is what sets Pure Storage apart (that and data reduction and real Evergreen Storage).

Small note: Another proven environment with near 32k block sizes. This one hung out between 20k and 32k; don’t fall for 4k or 8k nonsense benchmarks. When the blocks hit the array from VMware, this is just what we see.

Register for the Webinar
http://tinyurl.com/pq5fd9k
You can win a GoPro too.

PureStorage + REST API + Splunk = Fun with Data about Data

A few months back I posted a PowerShell script to post Pure Storage data directly into VMware vCenter Operations Manager (now called vRealize Operations). Inspiration hit me like a brick when a big customer of mine said, “Do you have a plugin for Splunk?”

He had already written some scripts in Python to pull data from our REST API. He just said, “Sure wish I didn’t have to do this myself.” I took the hint. Now, I am not a Python person, so I did the best I could with the tools I have.
You will notice that the script is very similar to the one I wrote for vCOPS. That is because open REST APIs rock; if you don’t have one for your product, you are wrong. 🙂

The formatting in WordPress ALWAYS breaks scripts when I paste them. So head over to GitHub and download the script today.
https://github.com/2vcps/post-rest2splunk/tree/master

Like before I schedule this as a task to run every 5 minutes. That seems to not explode the tiny Splunk VM I am running in VMware Fusion to test this out.
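For readers who prefer Python, here is one possible shape for the “post array stats to Splunk” step. It assumes Splunk’s HTTP Event Collector on its default port 8088, which may well differ from what the linked PowerShell script does. The sourcetype, the metric names, and the helper name `build_hec_request` are made up for illustration, and the sketch only builds the request without sending it.

```python
import json


def build_hec_request(splunk_host, token, array_name, metrics, timestamp):
    """Build the URL, headers, and JSON body for one Splunk HEC event."""
    url = f"https://{splunk_host}:8088/services/collector/event"
    headers = {
        "Authorization": f"Splunk {token}",
        "Content-Type": "application/json",
    }
    body = json.dumps(
        {
            "time": timestamp,
            "host": array_name,
            "sourcetype": "purestorage:perf",  # hypothetical sourcetype
            "event": metrics,
        }
    )
    return url, headers, body


if __name__ == "__main__":
    # Hypothetical metric names; use whatever your REST call returns.
    sample = {"reads_per_sec": 1200, "writes_per_sec": 800, "usec_per_read_op": 350}
    url, headers, body = build_hec_request(
        "splunk.example.com", "YOUR-HEC-TOKEN", "lab-array-01", sample, 1429109420
    )
    print(url)
    print(body)
```

Run on a five minute schedule, each pass would build one event per array from whatever the REST API returned at that moment.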

Dashboards. Check.

wpid1855-media_1429109420445.png

Some very basic Dashboards I created. I am not a Splunk ninja, perhaps you know one? I am sure people that have done this for a while can pull much better visuals out of this data.

wpid1856-media_1429109524852.png
wpid1857-media_1429109617758.png

Pivot Table

wpid1858-media_1429109962843.png

Stats from a lab array, with some averages computed by Splunk.

Gauge Report of Max Latency (that is microseconds)

wpid1859-media_1429110138347.png

1,000 of these is 1 millisecond 🙂 pretty nice.

From Wikipedia
A microsecond is an SI unit of time equal to one millionth (0.000001 or 10⁻⁶ or 1/1,000,000) of a second. Its symbol is μs. One microsecond is to one second as one second is to 11.574 days. A microsecond is equal to 1000 nanoseconds or 1/1,000 of a millisecond.
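The unit arithmetic is easy to check. A couple of lines of Python (the function names are just for illustration) confirm both the gauge conversion and the 11.574 days comparison:

```python
def microseconds_to_milliseconds(microseconds):
    """1,000 microseconds make one millisecond."""
    return microseconds / 1000


def seconds_to_days(seconds):
    """86,400 seconds make one day."""
    return seconds / 86_400


print(microseconds_to_milliseconds(1000))      # 1.0
# One second holds a million microseconds; a million seconds is:
print(round(seconds_to_days(1_000_000), 3))    # 11.574
```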

Even if everything else didn’t help you at least you learned that today. Right?

The link to github again https://github.com/2vcps/post-rest2splunk/tree/master

Top 5 – Pure Storage Technical Blog Posts 2014

Today I thought it would be pretty cool to list out my favorite 5 technical blog posts that pertain to Pure Storage. These are posts that I use to show customers how to get things done without re-inventing the wheel. Big thanks to Barkz and Cody for all the hard work they put in this year. Looking forward to even more awesomeness this year.

SQL Server 2014 Prod/Dev with VMware PowerCLI and Pure Storage PowerShell Toolkit – Rob “Barkz” Barker

Enhanced UNMAP script using with PowerCLI and RESTful API – Cody Hosterman

VMware PowerCLI  and Pure Storage – Cody Hosterman
Check out the great script to set all the vSphere Best Practices for the Pure Storage Flash Array.

Pure Storage PowerShell Toolkit Enhancements – Rob “Barkz” Barker

PowerActions – The PowerCLI Plugin for the vSphere Web Client with UNMAP – Cody Hosterman

JO-Unicorn-Rainbow