Creating a Helm Repo with Github

Next step in learning helm is being able to take an existing helm package and put it in your own repo.

There are ways to do this with github pages. I don’t really want mess withthat right now, how can I use a Github repo to host my changes to the deployment?

For installing helm and an additional demo please see part 1 of this series.

http://54.88.246.86/2018/03/27/getting-started-with-helm-for-k8s/

Continue reading “Creating a Helm Repo with Github”

Getting Started with Helm for K8s

Over the last few weeks I was setting up Kubernetes in the lab. One thing I quickly learned was managing and editing yaml files for deployments, services and persistent volume claims became confusing and hard. Even when I had things commited in github sometimes I would make edits then not push them then rebuild my K8s cluster.

The last straw was when 2 of our Pure developers said that editing yaml in vi wasn’t very cool and to start using helm.

Needless to say that was good advice. I still have to remember to push my repos to github. Now my demostration applications are more “cloud native”. I can create and edit them in one environment and use helm install in another and have it just work.

Continue reading “Getting Started with Helm for K8s”

Using Snapshots with the Pure Storage Plugin for Kubernetes

One request from customers is not only provision persistent storage for Kubernetes but also integrate into workflows that may need to snap and copy the data for different environments. Much like we do this with powershell or python for SQL and Oracle environments to accelerate development or QA. Pure has enabled snapshots using the Pure Provisioner as part of our Kubernetes Plugin.

In this demo I am showing how I can take a users data directory for JupyterHub and clone it for another user to take advantage of all the benefits of Pure’s snapshots and clones. You instantly get access to a copy of the dataset. The dataset doesn’t take up room on the backend storage. Only globally unique changes will grow the volume. In this use case the Data Science team will see increases in productivity as they are not waiting for data to download from the cloud or copy from another place on the array.

The command to run the snap using kubectl is below:

kubectl exec <pure provisioner pod name> -- snapshot create -n <namespace> <pvc-claim-name>

Kubernetes and the Pure Storage FlexVolume Plugin

First, if you are using Pure Storage and Kubernetes make life easier and take a look at our plugin. Now version 1.2.2 and GA.

https://hub.docker.com/r/purestorage/k8s/

Make sure the follow the directions on the page to pull and install the plugin. If you are using Openshift pay special attention to the Readme. I will post more on this in the near future.

Cockroach DB as our Persistent Database

I want to simulate a very easy database that I can easily use in a container. That is also not the same old. I built a Go app that will write to a database over and over to kind of demonstrate the inner workings of the plugin but not necessarily supply a performance test.

To learn more about the steps I use in the video to deploy and manage CRDB in K8s please check out this link. https://www.cockroachlabs.com/docs/stable/orchestrate-cockroachdb-with-kubernetes.html

With that said, please check out how to deploy and scale a database with a persistent data platform from a Pure FlashArray. Watch this in Full screen to make the CLI commands easier to see.

What you are seeing in the video:

Deploy the initial 3 pods with volumes automatically created and connected on the Pure FA.
Initialize the cluster.
Fail a node and watch K8s redeploy a new container and re-attach the data volume.
Run a load generation application as a K8s Job.
Scale the DB cluster out to 8 nodes.

What is next?

This is a really easy and quick demo but it show the ease of using the Pure Plugin to manage the persistent data, making sure you do not lose data in the event of app crashes. Also easily scaling. This can all be done via policy and the deployment can be made even easier using Helm. In a future post we will see how we can take advantage of these methods and keep the same highly available, high performance and very easy to use persistent data platform for your application.

Four Resources that Got Me Started with Kuberenetes

In the last post I mentioned there are resources that have already gone through that do a better job than me in helping you understand containers and Kubernetes.
So if you are a virtualization admin like me and want to make 2018 the year you know enough to be dangerous I suggest the following resources.

Do Nigel Poulton’s Docker Deep Dive. A foundational understanding to containers will help the orchestration parts make sense.https://app.pluralsight.com/library/
Read Nigel’s The Kubernetes Book
Do Kubernetes the Hard Way. Once you see this the options that make K8s easier will seem a lot cooler and you will understand what they do in the background.
Go and Play with Docker and Kubernetes. Free sandboxes for you to try out.
https://labs.play-with-docker.com/
https://labs.play-with-k8s.com/

Start thinking: Does this app need a VM or a container? Once you are asking the question you will begin to think critically about the choices.

I am not sure we all need to move 100% off of VM’s today. Starting to ask the questions will help prepare us to provide these services to our customers when the workloads and workflows that require them to arise.

Anaconda with Jupyter Notebooks on Kubernetes

WARNING: YAML Heavy post. Sorry.

So I have been internally debating the best way to share this latest little thing I was working on/ learning. My goal over 2018 is to post more on migrating applications from virtual to containers managed by K8s. That transition isn’t for everything and has definetley required diving more into applications. There are many Kubernetes concepts I am going to skip over as others may already have explained them better. I do plan on doing a vSphere to K8s quick and easy to help us VCP’s and other Virtual Admins get started.

OK, getting started. Define some concepts

Anaconda, Conda for short.

Conda is a python package and environment manager for Data Science. You can download Anaconda here:
https://www.anaconda.com/download/

I wanted to keep it running in my lab and even though it works just fine on my local laptop, I switch between PC and Mac (2 of them) and wanted my environment (and data) available from a central place. Plus, I can’t learn Kubernetes without real applications to run.

Jupyter

Jupyter is an open source web application that allows you to display interactive code, equations and visualizations. I use it for Data Analytics in Python.

http://jupyter.org/

So jupyter is an application that can run in your conda environment. I want to run it as a container with persistent NFS storage in my Kuberenetes cluster in my basement. Notebooks are the files that contain the code and visualizations. I can post notebooks to github to allow others to test my work. In the github repo, I included a very basic file with some python. Once you have this all running you can play with it if you would like.

So how to get it to run. ContinuumIO the keepers of Anaconda provide a container image and some basic instructions for running the container on Docker. I googled for ways that people provide this in cluster environment. In the near future Jupyterhub will be the solution for you if you want multi-tenant jupyter deployments with Oauth and all kinds of fancy features I do not need in my tiny lab.

The following files are all available on my github at Conda-K8s. This worked in my environment with Kuberenetes 1.9. Your mileage may vary depending on access rights, version and anything you do that I don’t know about.

First create the persistent volume you will need to create and edit the following nfs-pv.yaml file.

nfs-pv.yaml

apiVersion: v1
kind: PersistentVolume
metadata:
  name: conda-notebooks
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteMany
  nfs:
    # FIXME: use the right IP and the right path
    server: 192.168.x.x
    path: "/nfs/repos/yourvalidpath"

First make sure you edit the file with your NFS server IP and valid already created path to your NFS Share. This is where your jupyter notebook data will be stored. If the POD crashes or the host server dies it will start elsewhere in the cluster, your data will persist. Brilliant!

via GIPHY

IF you want an automated way to create, mount and manage these volumes with Pure Storage check our our awesome flexvolume plugin for Kubernetes. Right now we will focus on making it work with any NFS path. This is manual and slow, so if you are serious about analytics get the plugin, and a FlashBlade.

$kubectl create -f nfs-pv.yaml

Then to view if your volume is ready run:

$kubectl get pv

Output for my system

NAME                                 CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS     CLAIM                                        STORAGECLASS   REASON    AGE
claim-jowings                        10Gi       RWX            Retain           Released   jupyter4me/hub-db-dir                                                 3d
conda-notebooks                      100Gi      RWX            Retain           Bound      default/conda-claim                                                   3d

Now that the volume object is created we can now create the “claim”
I am not going to get into the why of doing this but as far as my tiny brain can understand it is the way K8s manages what application can connect with what persistent volume. Notice how the request section of the yaml is asking for 100Gi, the size of my volume in the last step.

nfs-pvc.yaml.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: conda-claim
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  resources:
    requests:
      storage: 100Gi

kubectl create -f nfs-pvc.yaml

To view the results

kubectl get pvc

Finally we can create the POD. The pod is what kubernetes uses to schedule a application and its most basic component. It can be just one container. It can be more, for now we won’t get into what all that means.

conda-pod.yaml

kind: Pod
apiVersion: v1
metadata:
  generateName: conda-
  labels:
    app: conda
spec:
  volumes:
    - name: conda-volume
      persistentVolumeClaim:
       claimName: conda-claim
  containers:
    - name: conda
      image: continuumio/anaconda3
      env:
      - name: JUPYTERCMD
        value: "/opt/conda/bin/conda install jupyter nb_conda -y --quiet && /opt/conda/bin/jupyter notebook --notebook-dir=/opt/notebooks --ip='*' --port=8888 --no-browser --allow-root"
      command: ["bash"]
      args: ["-c","$(JUPYTERCMD)"]
      ports:
        - containerPort: 8888
          name: "http-server"
      volumeMounts:
        - mountPath: "/opt/notebooks"
          name: conda-volume

If you take a look at the file above there are some things we are doing to get conda and jupyter to work. First notice the “env” section I created. I didn’t want to create a custom container image but rather use the default image provided by continuumio. I don’t want to accidentally become reliant on my own proprietary image. Without the command and the arguments in the $JUYPTERCMD environment variable, the container starts, has nothing to do, and shuts down. K8s sees this as a failure so it starts it again (and again and again). Also we see in the volumes section we are telling the POD to use our “conda-claim” we created in the last step. Under containers the volumeMounts declaration tells k8s to mount the pv to the mountPath inside the container.

kubectl create -f conda-pod.yaml

Now lets see what the results look like:

kubectl get pod
NAME                                     READY     STATUS    RESTARTS   AGE
conda-742lc                              1/1       Running   0          2d

Very good, the pod is running and we have a “READY 1/1”

A few things we need to connect to the jupyter notebook. Run the following command and notice the output. It gives you a URL with a token to access the web app. Obviously localhost is going to not work from my remote workstations. Save that token for later though.

$kubectl logs conda-742lc


Package plan for installation in environment /opt/conda:

The following NEW packages will be INSTALLED:

    _nb_ext_conf:     0.4.0-py36_1         
    nb_anacondacloud: 1.4.0-py36_0         
    nb_conda:         2.2.1-py36h8118bb2_0 
    nb_conda_kernels: 2.1.0-py36_0         
    nbpresent:        3.0.2-py36h5f95a39_1 

The following packages will be UPDATED:

    anaconda:         5.0.1-py36hd30a520_1  --> custom-py36hbbc8b67_0
    conda:            4.3.30-py36h5d9f9f4_0 --> 4.4.7-py36_0         
    pycosat:          0.6.2-py36h1a0ea17_1  --> 0.6.3-py36h0a5515d_0 

+ /opt/conda/bin/jupyter-nbextension enable nbpresent --py --sys-prefix
Enabling notebook extension nbpresent/js/nbpresent.min...
      - Validating: OK
+ /opt/conda/bin/jupyter-serverextension enable nbpresent --py --sys-prefix
Enabling: nbpresent
- Writing config: /opt/conda/etc/jupyter
    - Validating...
      nbpresent  OK

+ /opt/conda/bin/jupyter-nbextension enable nb_conda --py --sys-prefix
Enabling notebook extension nb_conda/main...
      - Validating: OK
Enabling tree extension nb_conda/tree...
      - Validating: OK
+ /opt/conda/bin/jupyter-serverextension enable nb_conda --py --sys-prefix
Enabling: nb_conda
- Writing config: /opt/conda/etc/jupyter
    - Validating...
      nb_conda  OK

[I 17:09:25.393 NotebookApp] [nb_conda_kernels] enabled, 3 kernels found
[I 17:09:25.399 NotebookApp] Writing notebook server cookie secret to /root/.local/share/jupyter/runtime/notebook_cookie_secret
[W 17:09:25.421 NotebookApp] WARNING: The notebook server is listening on all IP addresses and not using encryption. This is not recommended.
[I 17:09:26.044 NotebookApp] [nb_anacondacloud] enabled
[I 17:09:26.050 NotebookApp] [nb_conda] enabled
[I 17:09:26.095 NotebookApp] ✓ nbpresent HTML export ENABLED
[W 17:09:26.095 NotebookApp] ✗ nbpresent PDF export DISABLED: No module named 'nbbrowserpdf'
[I 17:09:26.098 NotebookApp] Serving notebooks from local directory: /opt/notebooks
[I 17:09:26.098 NotebookApp] 0 active kernels 
[I 17:09:26.098 NotebookApp] The Jupyter Notebook is running at: http://[all ip addresses on your system]:8888/?token=08938eb3b2bc00f350c43f7535e38f6aa339f5915e12d912
[I 17:09:26.098 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 17:09:26.099 NotebookApp] 
    
    Copy/paste this URL into your browser when you connect for the first time,
    to login with a token:
        http://localhost:8888/?token=08938<blah blah blah

We must create a “service” in Kubernetes in order for the application to be accessible. There is a ton about services and ingress into applications. Since I am running on an private cluster. Not on Google or Amazon I am going to use the simplest way for this post to create external access. That is done using the “type” under the spec. See how it says NodePort? Also I am not specifying an inbound port (you can do that if you want). I am just telling it to find the app called “conda” and forward traffic to tcp 8888.

conda-svc.yaml

kind: Service
apiVersion: v1
metadata:
  name: conda-svc
spec:
  type: NodePort
  ports:
    - port: 8888
  selector:
    app: conda

kubectl create -f conda-svc.yaml

This creates the service from the file. This is actually a cool concept that allows the inbound traffic management (ingress) be disaggregated from the application pod/deployment. That means I can swap versions of the app without changing the inbound rules or loadbalancers (lb is a whole book unto itself). To see my services now I run:

$ kubectl get svc
NAME                              TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)           AGE
conda-svc                         NodePort    10.98.67.191     <none>        8888:32250/TCP    2d
kubernetes                        ClusterIP   10.96.0.1        <none>        443/TCP           36d
mc-nash-minecraft                 NodePort    10.105.112.153   <none>        25565:31642/TCP   31d
mc-shea-minecraft                 NodePort    10.111.206.174   <none>        25565:31048/TCP   31d
mc-survival-minecraft             NodePort    10.99.46.7       <none>        25565:31723/TCP   31d
prom-2vcps-prometheus-server-np   NodePort    10.104.173.0     <none>        80:31400/TCP      30d

Great, now we see the service is forwarding port 32250 (yours will be different) to 8888. Using the node port type I can actually hit any node in my cluster and my K8s CNI will forward the traffic.

now just go to and paste your token from earlier.

http://<a node ip>:32250/

In my github repo for this project I included a basic notebook file that shows some python code to simulate coin flips many many times. Feel free to “upload” and play with it and have fun with Data Science on Juypter / Conda running in a K8s cluster.

vSphere Container Hosts Storage Networking

In the last couple of days I had a couple of questions from customers implementing some kind of container host on top of vSphere. Each was doing it to make use of either Kubernetes or Docker Volume Plugin for Pure Storage. First, there was a little confusion if the actual container needs to have iSCSI access to the array. The container needs network access for sure (I mean if you want somone to use the app) but it does not need access to the iSCSI network. Side Note: iSCSI is not required to use the persistent storage plugins for Pure. Fiber channel is supported. ISCSI may just be an easy path to using a PureFlash Array or NFS (10G network) for FlashBlade with an existing vSphere Setup.

To summarize all that: The container host VM needs access to talk directly to the storage. I accomplish this today with multiple vnics but you can do it however you like. There may be some vSwitches, physical nics and switches in the way, but the end result should be the VM talking to the FlashArray or FlashBlade.

More information on configuring our plugins is here:

Docker/DCOS/Mesos – https://store.docker.com/plugins/pure-docker-volume-plugin
Kubernetes and OpenShift – https://hub.docker.com/r/purestorage/k8s/

Basically the container host needs to be able to talk to the MGMT interface of the array, to do it’s automation of creating host objects, volumes and connecting them together (also removing them when you are finished). The thing is to know the plugin does all the work for you. Then when your application manifest requests the storage the plugin mounts the device to the required mount point inside the container. The app (container) does not know or care anything about iSCSI, NFS or Fiber Channel (and it should not).

Container HOST Storage Networking

Container hosts as VM’s Storage Networking

If you are setting up iSCSI in vSphere for Pure, you should probably go see Cody’s pages on doing this most of this is a good idea as a foundation for what I am about to share.

https://www.codyhosterman.com/pure-storage-vmware-overview/flasharray-and-vmware-best-practices/iscsi-setup/

Make sure you can use MPIO. Follow the linux best practices for Pure Storage. Inside your container hosts.

Do it the good old (new) gui way

So what I normally do is setup 2 new port groups on my VDS.

something like… iscsi-1 and iscsi-2 I know I am very original and creative.

Set the uplink for the Portgroup

We used to setup “in guest iSCSI” for VM’s that needed array based snaphost features way back in the day. This is basically the same piping. After creating the new port groups edit the settings in the HTML5 GUI as shown below.

Set the Failover Order

Go for iSCSI-1 on Uplink 1 and iSCSI-2 on Uplink 2

I favor putting the other Uplink into “Unused” as this gives me the straightest troubleshooting path in case something downstream isn’t working. You can put it in “standby” and probably be just fine.

Kubernetes Anywhere and PhotonOS Template

Experimenting with Kubernetes to orchestrate and manage containers? If you are like me and already have a lot invested in vSphere (time, infra, knowledge) you might be exctied to use Kubernetes Anywhere to deploy it quickly. I won’t re-write the instruction found here:

https://github.com/kubernetes/kubernetes-anywhere

It works with

Google Compure Engine
Azure
vSphere

The vSphere option uses the Photon OS ova to spin up the container hosts and managers. So you can try it out easily with very little background in containers. That is dangerous as you will find yourself neck deep in new things to learn.

Don’t turn on the template!

If you are like me and *skim* instructions you could be in for hours of “Why do all my nodes have the same IP?” When you power on the Photon OS template the startup sequence generates a machine ID (and mac address). So even though I powered it back off, the cloning processes was producing identical VM’s for my kubernetes cluster. Those not hip to networking this is bad for communication.

Also, don’t try to be a good VMware Admin cad convert that VM to a VM Template. The Kubernetes Anywhere script won’t find it.

IF you do like me and skip a few lines reading (happens right) make sure to check this documenation out on Photon OS. It will help get you on the right track.

https://github.com/vmware/photon/blob/master/docs/photon-admin-guide.md#clearing-the-machine-id-of-a-cloned-instance-for-dhcp

This is clearly marked in the documentation now.