Thursday, October 02, 2014

CloudStack simulator on Docker

Docker is a lot of fun; one of its strengths is the portability of applications. This gave me the idea to package the CloudStack management server as a Docker image.

CloudStack has a simulator that can fake a data center infrastructure. It can be used to test some of the basic functionality. We use it to run our integration tests, like the smoke tests on TravisCI. The simulator allows us to configure an advanced or basic networking zone with fake hypervisors.

So I bootstrapped the CloudStack management server, configured the MySQL database with an advanced zone and created a Docker image with Packer. The resulting image is on DockerHub, and I realized after the fact that four other great minds had already done something similar :)

On a machine running docker:

docker pull runseb/cloudstack
docker run -t -i -p 8080:8080 runseb/cloudstack:0.1.4 /bin/bash
# service mysql restart
# cd /opt/cloudstack
# mvn -pl client jetty:run -Dsimulator

Then open your browser on http://<IP_of_docker_host>:8080/client and enjoy!
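You can also check that the API responds from the command line. Below is a minimal Python sketch using the unauthenticated integration API port; it assumes integration.api.port is enabled on 8096 in the management server settings, which may not be the case in every build:

    import requests

    # List the zones configured in the simulator (assumes the unauthenticated
    # integration API port 8096 is enabled; adjust if your setup differs)
    resp = requests.get('http://localhost:8096/client/api',
                        params={'command': 'listZones', 'response': 'json'})
    print(resp.json())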

Tuesday, September 30, 2014

On Docker and Kubernetes on CloudStack

Docker has pushed containers to a new level, making it extremely easy to package and deploy applications within containers. Containers are not new; Solaris containers and OpenVZ are among several container technologies going back to 2005. But Docker has caught on quickly, as mentioned by @adrianco. The startup speed is not surprising for containers, and the portability is reminiscent of the Java goal to "write once run anywhere". What is truly interesting with Docker is the availability of Docker registries (e.g., Docker Hub) to share containers and the potential to change application deployment workflows.

We should soon see IT move to Docker based application deployment, where developers package their applications and make them available to Ops, very much like we have been using war files. Embracing a DevOps mindset/culture should be easier with Docker. Where it becomes truly interesting is when we start thinking about an infrastructure whose sole purpose is to run containers. We can envision a bare operating system whose single goal is to manage Docker based services. This should make sysadmins' lives easier.

The role of the Cloud with Docker

While the buzz around Docker has been truly amazing and a community has grown overnight, some may think that this signals the end of the cloud. I think that is far from the truth, as Docker may indeed become the killer app of the cloud.

An IaaS layer is what it is: an infrastructure orchestration layer, while Docker and its ecosystem will become the application orchestration layer.

The question then becomes: how do I run Docker in the cloud? And there is a straightforward answer: just install Docker in your cloud templates. Whether on AWS, GCE, Azure or your private cloud, you can prepare Linux based templates that provide Docker support. If you are aiming for the bare operating system whose sole purpose is to run Docker, then the new CoreOS Linux distribution might be your best pick. CoreOS provides rolling upgrades of the kernel, systemd based services, a distributed key value store (i.e., etcd) and a distributed service scheduling system (i.e., fleet).
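As a quick aside, etcd is approachable enough to show in a couple of lines. Here is a minimal Python sketch of its HTTP key/value API, assuming a local etcd listening on 4001, its default client port on early CoreOS releases:

    import requests

    # Write a key through etcd's v2 keys API and read it back
    requests.put('http://127.0.0.1:4001/v2/keys/message',
                 data={'value': 'hello CoreOS'})
    print(requests.get('http://127.0.0.1:4001/v2/keys/message').json())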

exoscale, an Apache CloudStack based public cloud, recently announced the availability of CoreOS templates.

Like exoscale, any cloud provider, be it public or private, can make CoreOS templates available, providing Docker within the cloud instantly.

Docker application orchestration, here comes Kubernetes

Running one container is easy, but running multiple coordinated containers across a cluster of machines is not yet a solved problem. If you think of an application as a set of containers, then starting these on multiple hosts, replicating some of them, accessing distributed databases, providing proxy services and ensuring fault tolerance is the true challenge.

However, Google came to the rescue and announced Kubernetes, a system that solves these issues and makes managing scalable, fault-tolerant container based apps doable :)

Kubernetes is of course available on Google's public cloud GCE, but also on Rackspace, Digital Ocean and Azure. It can also be deployed easily on CoreOS.

Kubernetes on CloudStack

Kubernetes is under heavy development, but you can test it with Vagrant. Under the hood, aside from the Go code :), most of the deployment solutions use SaltStack recipes, but if you are a Chef, Puppet or Ansible user I am sure we will see recipes for those configuration management solutions soon.

But you surely got the same idea that I had :) Since Kubernetes can be deployed on CoreOS and CoreOS is available on exoscale, we are just a breath away from running Kubernetes on CloudStack.

It took a little more than a breath, but you can clone kubernetes-exoscale and get running in 10 minutes with a 3 node etcd cluster and a 5 node Kubernetes cluster running a Flannel overlay.

CloudStack supports EC2 style userdata, and the CoreOS templates on exoscale support cloud-init, hence passing cloud-config files at instance deployment was straightforward. I used libcloud to provision all the nodes, created proper security groups to let the Kubernetes nodes access the etcd cluster and let the Kubernetes nodes talk to each other, especially opening a UDP port to build the networking overlay with Flannel. Fleet is used to launch all the Kubernetes services. Try it out.
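To give a feel for the provisioning step, here is a minimal libcloud sketch along the lines of what the repo automates. The keys, offering name, security group and cloud-config file are illustrative, and the ex_userdata/ex_security_groups arguments are libcloud CloudStack driver extensions of the time; adjust to your libcloud version:

    from libcloud.compute.types import Provider
    from libcloud.compute.providers import get_driver

    cls = get_driver(Provider.EXOSCALE)
    conn = cls('your-api-key', 'your-secret-key')

    # Pick the CoreOS template and a compute offering advertised by the cloud
    image = [i for i in conn.list_images() if 'CoreOS' in i.name][0]
    size = [s for s in conn.list_sizes() if s.name == 'Medium'][0]

    # cloud-config file consumed by cloud-init at boot (etcd/fleet units)
    with open('node1.yml') as f:
        userdata = f.read()

    node = conn.create_node(name='etcd-node1', image=image, size=size,
                            ex_userdata=userdata,
                            ex_security_groups=['k8s'])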

Conclusions.

Docker is extremely easy to use, and by taking advantage of a cloud you can get started quickly. CoreOS will put your Docker work on steroids with the ability to start apps as systemd services over a distributed cluster. Kubernetes will up that by giving you replication of your containers and proxy services for free (time).

We might see pure Docker based public clouds (e.g., think of a Mesos cluster with a Kubernetes framework). These will look much more like PaaS, especially if they integrate a Docker registry and a way to automatically build Docker images (e.g., think of a continuous deployment pipeline).

But a "true" IaaS is actually very complimentary, providing multi-tenancy, higher security as well as multiple OS templates. So treating docker as a standard cloud workload is not a bad idea. With three dominant public clouds in AWS, GCE and Azure and a multitude of "regional" ones like exoscale it appears that building a virtualization based cloud is a solved problem (at least with Apache CloudStack :)).

So instead of talking about cloudifying your applications, maybe you should start thinking about Dockerizing them and letting them loose on CloudStack.

Friday, July 11, 2014

GCE Interface to CloudStack

Gstack, a GCE compatible interface to CloudStack

Google Compute Engine (GCE) is the Google public cloud. In December 2013, Google announced the General Availability (GA) of GCE. With AWS and Microsoft Azure, it is one of the three leading public clouds in the market. Apache CloudStack now has a brand new GCE compatible interface (Gstack) that lets users use the GCE clients (i.e., gcloud and gcutil) to access their CloudStack cloud. This has been made possible through the Google Summer of Code program.

Last summer Ian Duffy, a student from Dublin City University, participated in GSoC through the Apache Software Foundation (ASF) and worked on an LDAP plugin to CloudStack. He did such a great job that he finished early and was made an Apache CloudStack committer. Since he was done with his original GSoC project I encouraged him to take on a new one :), and he brought in a friend for the ride: Darren Brogan.

They remained engaged with the CloudStack community and, as a third-year project, worked on an Amazon EC2 interface to CloudStack using what they had learned from the GCE interface. They got an A :). Since they loved it so much, Darren applied to the GSoC program again and proposed to go back to Gstack, improve it, extend the unit tests and make it compatible with the GCE v1 API.

Technically, Gstack is a Python Flask application that provides a REST API compatible with the GCE API and forwards the requests to the corresponding CloudStack API. The source is available on GitHub and the binary is downloadable via PyPI. Let me show you how to use it.
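To make the translation idea concrete, here is a toy sketch (not Gstack's actual code) of the pattern: a Flask route in the GCE URL style answered by a CloudStack listZones call. The endpoint is illustrative, and the real application signs every request with the user's API keys:

    import requests
    from flask import Flask, jsonify

    app = Flask(__name__)
    CLOUDSTACK_ENDPOINT = 'https://api.exoscale.ch/compute'  # illustrative

    @app.route('/compute/v1/projects/<project>/zones')
    def list_zones(project):
        # Unsigned call for illustration only; Gstack signs its requests
        params = {'command': 'listZones', 'response': 'json'}
        data = requests.get(CLOUDSTACK_ENDPOINT, params=params).json()
        # Reshape the CloudStack response into a GCE style zone list
        items = [{'name': z['name'], 'status': 'UP'} for z in
                 data.get('listzonesresponse', {}).get('zone', [])]
        return jsonify(kind='compute#zoneList', items=items)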

Installation and Configuration of Gstack

You can grab the Gstack binary package from PyPI using pip in one single command:

pip install gstack

Or, if you plan to explore the source and work on it, you can clone the repository and install it by hand. Pull requests are of course welcome.

git clone https://github.com/NOPping/gstack.git
sudo python ./setup.py install

Both of these installation methods will install a gstack and a gstack-configure binary in your path. Before running Gstack you must configure it. To do so, run:

gstack-configure

And enter your configuration information when prompted. You will need to specify the host and port where you want gstack to run, as well as the CloudStack endpoint that you want gstack to forward requests to. In the example below we use the exoscale cloud:

$ gstack-configure
gstack bind address [0.0.0.0]: localhost
gstack bind port [5000]: 
Cloudstack host [localhost]: api.exoscale.ch
Cloudstack port [8080]: 443
Cloudstack protocol [http]: https
Cloudstack path [/client/api]: /compute

The information will be stored in a configuration file available at ~/.gstack/gstack.conf:

$ cat ~/.gstack/gstack.conf 
PATH = 'compute/v1/projects/'
GSTACK_BIND_ADDRESS = 'localhost'
GSTACK_PORT = '5000'
CLOUDSTACK_HOST = 'api.exoscale.ch'
CLOUDSTACK_PORT = '443'
CLOUDSTACK_PROTOCOL = 'https'
CLOUDSTACK_PATH = '/compute'

You are now ready to start Gstack in the foreground with:

gstack

That's all there is to running Gstack. To be able to use it as if you were talking to GCE, however, you need to use gcutil and configure it a bit.

Using gcutil with Gstack

The current version of Gstack only works with the stand-alone version of gcutil. Do not use the version of gcutil bundled in the Google Cloud SDK; instead install the 0.14.2 version of gcutil. Gstack comes with a self-signed certificate for the local endpoint, gstack/data/server.crt; copy the certificate to the gcutil certificates file gcutil/lib/httplib2/httplib2/cacerts.txt. A bit dirty I know, but that's a work in progress.

At this stage, your CloudStack API key and secret key need to be entered in the gcutil auth_helper.py file at gcutil/lib/google_compute_engine/gcutil/auth_helper.py.

Again not ideal, but hopefully gcutil or the Cloud SDK will soon be able to configure those without touching the source. Darren and Ian opened a feature request with Google to pass the client_id and client_secret as options to gcutil; hopefully a future release of gcutil will allow us to do so.

Now, create a cached parameters file for gcutil, assuming you are running gstack on your local machine with the defaults that were suggested during the configuration phase. Create ~/.gcutil_params with the following:

--auth_local_webserver
--auth_host_port=9999
--dump_request_response
--authorization_uri_base=https://localhost:5000/oauth2
--ssh_user=root
--fetch_discovery
--auth_host_name=localhost
--api_host=https://localhost:5000/

Warning: Make sure to set the --auth_host_name variable to the same value as GSTACK_BIND_ADDRESS in your ~/.gstack/gstack.conf file. Otherwise you will see certificate errors.

With this setup complete, gcutil will issue requests to the local Flask application, get an OAuth token, issue requests to your CloudStack endpoint and return the response in a GCE compatible format.

Example with exoscale.

You can now start issuing standard gcutil commands. For illustration purposes we use exoscale. Since there are several semantic differences, you will notice that as a project we use the account information from CloudStack; hence we pass our email address as the project value.

Let's start by listing the availability zones:

$ gcutil --cached_flags_file=~/.gcutil_params --project=runseb@gmail.com listzones
+----------+--------+------------------+
| name     | status | next-maintenance |
+----------+--------+------------------+
| ch-gva-2 | UP     | None scheduled   |
+----------+--------+------------------+

Let's list the machine types (in CloudStack terminology, the compute service offerings) and the available images:

$ ./gcutil --cached_flags_file=~/.gcutil_params --project=runseb@gmail.com listimages
$ gcutil --cached_flags_file=~/.gcutil_params --project=runseb@gmail.com listmachinetypes
+-------------+----------+------+-----------+-------------+
| name        | zone     | cpus | memory-mb | deprecation |
+-------------+----------+------+-----------+-------------+
| Micro       | ch-gva-2 |    1 |       512 |             |
+-------------+----------+------+-----------+-------------+
| Tiny        | ch-gva-2 |    1 |      1024 |             |
+-------------+----------+------+-----------+-------------+
| Small       | ch-gva-2 |    2 |      2048 |             |
+-------------+----------+------+-----------+-------------+
| Medium      | ch-gva-2 |    2 |      4096 |             |
+-------------+----------+------+-----------+-------------+
| Large       | ch-gva-2 |    4 |      8192 |             |
+-------------+----------+------+-----------+-------------+
| Extra-large | ch-gva-2 |    4 |     16384 |             |
+-------------+----------+------+-----------+-------------+
| Huge        | ch-gva-2 |    8 |     32184 |             |
+-------------+----------+------+-----------+-------------+

You can also list firewalls, which Gstack maps to CloudStack security groups. To create a security group, use the firewall commands:

$ ./gcutil --cached_flags_file=~/.gcutil_params --project=runseb@gmail.com addfirewall ssh --allowed=tcp:22
$ ./gcutil --cached_flags_file=~/.gcutil_params --project=runseb@gmail.com getfirewall ssh

To start an instance you can follow the interactive prompt given by gcutil. You will need to pass the --permit_root_ssh flag, another one of those semantic and access configuration differences that needs to be ironed out. The interactive prompt will let you choose the machine type and the image that you want, and it will then start the instance:

$ ./gcutil --cached_flags_file=~/.gcutil_params --project=runseb@gmail.com addinstance foobar
Selecting the only available zone: CH-GV2
1: Extra-large  Extra-large 16384mb 4cpu
2: Huge Huge 32184mb 8cpu
3: Large    Large 8192mb 4cpu
4: Medium   Medium 4096mb 2cpu
5: Micro    Micro 512mb 1cpu
6: Small    Small 2048mb 2cpu
7: Tiny Tiny 1024mb 1cpu
7
1: CentOS 5.5(64-bit) no GUI (KVM)
2: Linux CentOS 6.4 64-bit
3: Linux CentOS 6.4 64-bit
4: Linux CentOS 6.4 64-bit
5: Linux CentOS 6.4 64-bit
6: Linux CentOS 6.4 64-bit
<...snip>
INFO: Waiting for insert of instance . Sleeping for 3s.
INFO: Waiting for insert of instance . Sleeping for 3s.

Table of resources:

+--------+--------------+--------------+----------+---------+
| name   | network-ip   | external-ip  | zone     | status  |
+--------+--------------+--------------+----------+---------+
| foobar | 185.1.2.3    | 185.1.2.3    | ch-gva-2 | RUNNING |
+--------+--------------+--------------+----------+---------+

Table of operations:

+--------------+--------+--------------------------+----------------+
| name         | status | insert-time              | operation-type |
+--------------+--------+--------------------------+----------------+
| e4180d83-31d0| DONE   | 2014-06-09T10:31:35+0200 | insert         |
+--------------+--------+--------------------------+----------------+

You can of course list (with listinstances) and delete instances:

$ ./gcutil --cached_flags_file=~/.gcutil_params --project=runseb@gmail.com deleteinstance foobar
Delete instance foobar? [y/n]
y 
WARNING: Consider passing '--zone=CH-GV2' to avoid the unnecessary zone lookup which requires extra API calls.
INFO: Waiting for delete of instance . Sleeping for 3s.
+--------------+--------+--------------------------+----------------+
| name         | status | insert-time              | operation-type |
+--------------+--------+--------------------------+----------------+
| d421168c-4acd| DONE   | 2014-06-09T10:34:53+0200 | delete         |
+--------------+--------+--------------------------+----------------+

Gstack is still a work in progress, but it is now compatible with the GCE GA v1.0 API. The few differences in API semantics need to be investigated further, and additional API calls need to be supported. However, it provides a solid base to start working on hybrid solutions between the GCE public cloud and a CloudStack based private cloud.

GSoC has been terrific for Ian and Darren; they both learned how to work with an open source community and ultimately became part of it through their work. They learned tools like JIRA, git and Review Board, and became less shy about working publicly on mailing lists. Their work on Gstack and EC2stack is certainly of high value to CloudStack and should become the base for interesting products that make use of hybrid clouds.

Tuesday, June 03, 2014

Eutester with CloudStack

Eutester

An interesting tool based on Boto is Eutester. It was created by the folks at Eucalyptus to provide a framework for writing functional tests against AWS zones and Eucalyptus based clouds. What is interesting with Eutester is that it could be used to compare the AWS compatibility of multiple clouds. Therefore the interesting question you are going to ask is: can we use Eutester with CloudStack? And the answer is yes. It could certainly use more work, but the basic functionality is there.

Install eutester with:

pip install eutester

Then start a Python/IPython interactive shell, or write a script that imports ec2ops and creates a connection object to your AWS EC2 compatible endpoint. For example, using ec2stack:

    #!/usr/bin/env python

    from eucaops import ec2ops
    from IPython.terminal.embed import InteractiveShellEmbed

    accesskey = "my api key"
    secretkey = "my secret key"

    # Connect to the EC2 compatible endpoint, here ec2stack on localhost:5000
    conn = ec2ops.EC2ops(endpoint="localhost",
                         aws_access_key_id=accesskey,
                         aws_secret_access_key=secretkey,
                         is_secure=False,
                         port=5000,
                         path="/",
                         APIVersion="2014-02-01")

    ipshell = InteractiveShellEmbed(banner1="Hello Cloud Shell !!")
    ipshell()

Eutester at the time of this writing has 145 methods. Only the methods available through the CloudStack AWS EC2 interface will be available. For example, get_zones and get_instances would return:

In [3]: conn.get_zones()
Out[3]: [u'ch-gva-2']

In [4]: conn.get_instances()
[2014-05-21 05:39:45,094] [EUTESTER] [DEBUG]: 
--->(ec2ops.py:3164)Starting method: get_instances(self, state=None, 
     idstring=None, reservation=None, rootdevtype=None, zone=None,
     key=None, pubip=None, privip=None, ramdisk=None, kernel=None,
     image_id=None, filters=None)
Out[4]: 
[Instance:5a426582-3aa3-49e0-be3f-d2f9f1591f1f,
 Instance:95ee8534-b171-4f79-9e23-be48bf1a5af6,
 Instance:f18275f1-222b-455d-b352-3e7b2d3ffe9d,
 Instance:0ea66049-9399-4763-8d2f-b96e9228e413,
 Instance:7b2f63d6-66ce-4e1b-a481-e5f347f7e559,
 Instance:46d01dfd-dc81-4459-a4a8-885f05a87d07,
 Instance:7158726e-e76c-4cd4-8207-1ed50cc4d77a,
 Instance:14a0ce40-0ec7-4cf0-b908-0434271369f6]

This example shows that I am running eight instances at the moment in a zone called ch-gva-2, our familiar exoscale. Selecting one of these instance objects will give you access to all the methods available for instances. You could also list, delete and create keypairs, and list, delete and create security groups, etc.

Eutester is meant for building integration tests and easily creating test scenarios. If you are looking for a client to build an application with, use Boto.
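For completeness, here is a minimal sketch of what the plain Boto equivalent could look like against the same ec2stack endpoint; the keys and endpoint details mirror the Eutester example above and are illustrative:

    import boto
    from boto.ec2.regioninfo import RegionInfo

    # Point boto at the EC2 compatible endpoint, here ec2stack on localhost
    region = RegionInfo(name='cloudstack', endpoint='localhost')
    conn = boto.connect_ec2(aws_access_key_id='my api key',
                            aws_secret_access_key='my secret key',
                            is_secure=False, region=region, port=5000,
                            path='/', api_version='2014-02-01')
    print(conn.get_all_zones())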

The master branch of Eutester may still cause problems when listing images from a CloudStack cloud. I recently patched a fork of the testing branch and opened an issue on their GitHub page. You might want to check its status if you plan to use Eutester heavily.

Wednesday, March 19, 2014

Migrating from Publican to Sphinx and Read The Docs

Migration from Publican to Sphinx and Read The Docs

When we started with CloudStack we chose to use Publican for our documentation. I don't actually know why, except that Red Hat documentation is entirely based on Publican. Perhaps David Nalley's background with Fedora influenced us :) In any case Publican is a very nice documentation building system; it is based on the docbook format and has great support for localization. However, it can become difficult to read and organize lots of content, and builds may break for strange reasons. We also noticed that we were not getting many contributors to the documentation; in contrast, the translation efforts via transifex have had over 80 contributors. As more features got added to CloudStack, the quality of the content started to suffer and we also faced issues with publishing the translated documents. We needed to do something, mainly making it easier to contribute to our documentation. Enter ReStructuredText (RST) and Read The Docs (RTD).

Choosing a new format

We started thinking about how to make our documentation easier to contribute to. Looking at docbook, purely XML based, it is a powerful format but not very developer friendly. A lot of us are happy with a basic text editor, with some old farts like me mainly stuck with vi. Markdown has certainly helped a lot of folks in writing documentation and READMEs; just look at GitHub projects. I started writing in Markdown and my production in terms of documentation and tutorials skyrocketed; it is just a great way to write docs. ReStructuredText is another alternative, not really Markdown but pretty close. I got familiar with RST in the Apache libcloud project and fell in love with it, or at least liked it way more than docbook. RST is basically text with only a few markups to learn, and you're off.

Publishing Platform

A new format is one thing, but you then need to build documentation in multiple formats: html, pdf, epub and potentially more. How do you move from .rst to these formats for your projects? Enter Sphinx, pretty much an equivalent of Publican, originally aimed at Python documentation but now used for much more. Installing Sphinx is easy, for instance on Debian/Ubuntu:
apt-get install python-sphinx
You will then have the sphinx-quickstart command in your path; use it to create your sphinx project, add content in index.rst and build the docs with make html. Below is a basic example for a sample project.
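The heart of the generated project is conf.py, the Sphinx configuration file. A minimal sketch, trimmed to the essentials and with placeholder project values, looks like this:

    # conf.py -- generated by sphinx-quickstart, trimmed to the essentials
    project = u'Sample Project'          # placeholder name
    copyright = u'2014, Apache CloudStack'
    version = '4.3'                      # short X.Y version
    release = '4.3.0'                    # full version string
    master_doc = 'index'                 # top-level document, i.e. index.rst
    source_suffix = '.rst'
    html_theme = 'default'
    extensions = []                      # e.g. 'sphinx.ext.todo' goes here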
What really got me sold on ReStructuredText and Sphinx was Read The Docs (RTD). It hosts documentation for open source projects. It automatically pulls your documentation from your revision control system and builds the docs. The killer feature for me was the integration with GitHub (not just git). Using hooks, RTD can trigger builds on every commit, and it also displays an "edit on GitHub" icon on each documentation page. Click on this icon, and the docs repository gets forked automatically on your GitHub account. This means that people can edit the docs straight from the GitHub UI and submit pull requests as they read the docs and find issues.

Conversion

After [PROPOSAL] and [DISCUSS] threads on the CloudStack mailing list, we reached consensus and decided to make the move. This is still on-going, but we are getting close to going live with our new docs in RST and hosted by RTD. There were a couple of challenges:
  1. Converting the existing docbook based documentation to RST
  2. Setting up new repos, CNAMEs and Read The Docs projects
  3. Setting up the localization with transifex
The conversion was much easier than expected thanks to pandoc, one of those great command line utilities that saves your life:
pandoc -f docbook -t rst -o test.rst test.docbook
You get the gist of it: iterate through your docbook files and generate the RST files, then combine everything to reconstruct your chapters and books and re-organize as you wish. There are of course a couple of gotchas: the table formatting may not be perfect, the notes and warnings may be a bit out of whack, and the heading levels should probably be checked. All of these are actually good to check in a first pass through the docs to revamp the content and the way it is organized.
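The iteration itself can be a throwaway script. Here is a small Python sketch that walks a docbook source tree (the en-US directory is illustrative) and shells out to pandoc for each file:

    import os
    import subprocess

    for root, dirs, files in os.walk('en-US'):   # docbook source tree
        for name in files:
            if name.endswith('.xml'):
                src = os.path.join(root, name)
                dst = os.path.splitext(src)[0] + '.rst'
                subprocess.check_call(['pandoc', '-f', 'docbook',
                                       '-t', 'rst', '-o', dst, src])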
One thing that we decided to do before talking about changing the format was to move our docs to a separate repository. What we wanted was to be able to release docs on a different time frame than the code releases, as well as make any doc bug fixes go live as fast as possible and not wait for a code release (that's a long discussion...). With a documentation specific repo in place, we used Sphinx to create the proper directory structure and added the converted RST files. Then we created a project on Read The Docs and pointed it to the GitHub mirror of our Apache git repo. Pointing to the GitHub mirror allowed us to enable the nice GitHub interaction that RTD provides.
There is a bit more to it, as we actually created several repositories and used an RTD feature called subprojects to make all the docs live under the same CNAME, docs.cloudstack.apache.org. This is still work in progress but on track for the 4.3 code release. I hope to be able to announce the new documentation sites shortly after 4.3 is announced.
The final hurdle is localization support. Sphinx provides utilities to generate POT files. These can then be uploaded to transifex, and translation strings can be pulled to construct the translated docs. The big challenge we are facing is to not lose the existing translations that were done from the docbook files. Strings may have changed. We are still investigating how to avoid losing all that work and get back on our feet to serve the translated docs. The Japanese translators have started to look at this.
Overall the migration was easy. ReStructuredText is easy, Sphinx is also straightforward, and Read The Docs provides a great hosting platform well integrated with GitHub. Once we go live, we will see if our doc contributors increase significantly; we have already seen a few pull requests come in, which is very encouraging.
I will be talking about all of this at the Write The Docs conference in Budapest on March 31st and April 1st. If you are in the area stop by :)

Tuesday, March 04, 2014

Why CloudStack is not a Citrix project

I was at CloudExpo Europe in London last week for the Open Cloud Forum to give a tutorial on CloudStack tools. A decent crowd showed up, all carrying phones. Kind of problematic for a tutorial where I wanted the audience to install python packages and actually work :) Luckily I made it self-paced so you can follow at home. Giles from Shapeblue was there too, and he was part of a panel on Open Cloud. He was told once again "But Apache CloudStack is a Citrix project!" This in itself is a paradox, and as @jzb told me on twitter yesterday, "Citrix donated CloudStack to Apache, the end". Apache projects do not have any company affiliation.

I don't blame folks; with all the vendors seemingly supporting OpenStack, it does seem that CloudStack is a one-supporter project. The commit stats are also pretty clear, with 39% of commits coming from Citrix. This number is probably even higher, since those stats report gmail and apache as domains contributing 20% and 15% respectively; let's say 60% is from Citrix. But nonetheless, this is ignoring and misunderstanding what Apache is, and looking at the glass half empty.

When Citrix donated CloudStack to the Apache Software Foundation (ASF) it relinquished control of the software and the brand. This actually put Citrix in a bind, not being able to easily promote the CloudStack project. Indeed, CloudStack is now a trademark of the ASF, and Citrix had to rename its own product CloudPlatform (powered by Apache CloudStack). Citrix cannot promote CloudStack directly; it needs to get approval for sponsorship and follow the ASF trademark guidelines. Every committer, and especially the PMC members of Apache CloudStack, is now supposed to work to protect the CloudStack brand as part of the ASF and make sure that any confusion is cleared. This is what I am doing here.

Of course when the software was donated, an initial set of committers was defined, all from Citrix and mostly from the former cloud.com startup. Part of the incubating process at the ASF is to make sure that we can add committers from other organizations and attract a community. "Community over Code" is the bread and butter of the ASF, and so this is what we have all been working on: expanding the community outside Citrix, welcoming anyone who thinks CloudStack is interesting enough to contribute a little bit of time and effort. Looking at the glass half empty is saying that CloudStack is a Citrix project: "Hey look, 60% of their commits are from Citrix". Looking at it half full like I do is saying: "Oh wow, in the year since graduation, they have diversified the committer base, 40% are not from Citrix". Is 40% enough? Of course not. I wish it were the other way around; I wish Citrix were only a minority in the development of CloudStack.

A couple of other numbers: out of the 26 members of the project management committee (PMC), only seven are from Citrix, and looking at mailing list participation since the beginning of the year, 20% of the folks on the users mailing list and 25% on the developer list are from Citrix. We have diversified the community a great deal, but the "hand-over", that moment when new community members are actually writing more code than the folks who started it, has not happened yet. A community is not just about writing code, but I will give it to you that it is not good for a single company to "control" 60% of the development; this is not where we/I want to be.

This whole discussion is actually against Apache's modus operandi, since one of the biggest tenets of the foundation is non-affiliation. When I participate on the list I am Sebastien, I am not a Citrix employee. Certainly this can put some folks in conflicting situations at times, but the bottom line is that we do not and should not take into account company affiliation when working and making decisions for the project. But if you really want some company name dropping, let's make an ASF faux-pas and look at a few features:

The Nicira/NSX and OpenDaylight SDN integrations were done by Schuberg Philis, the OpenContrail plugin was done by Juniper, and Midokura created its own plugin for MidoNet, as did Stratosphere, giving us great SDN coverage. The LXC integration was done by Gilt, Klarna is contributing to the ecosystem with the vagrant and packer plugins, and CloudOps has been doing terrific work with Chef recipes, the Palo Alto Networks integration and Netscaler support. A Google Summer of Code intern wrote a brand new LDAP plugin and another GSoC student did the GRE support for KVM. Red Hat contributed the Gluster plugin, PCextreme contributed the Ceph interface, and Basho of course contributed the S3 plugin for secondary storage as well as major design decisions in the storage refactor. The SolidFire plugin was done by, well, SolidFire, and NetApp has developed a plugin as well for their virtual storage console. NTT contributed the CloudFoundry interface via BOSH. On the user side, Shapeblue is the leading user support company. So no, it's not just Citrix.

Are all these companies members of the CloudStack project? No. There is no such thing as a company being a member of an ASF project. There is no company affiliation, there is no lock-in, just a bunch of folks trying to make good software and build a community. And yes, I work for Citrix, and my job here will be done when Citrix only contributes 49% of the commits. Citrix is paying me to make sure it loses control of the software, that a healthy ecosystem develops and that CloudStack keeps on becoming a strong and vibrant Apache project. I hope one day folks will understand what CloudStack has become: an ASF project, like HTTP, Hadoop, Mesos, Ant, Maven, Lucene, Solr and 150 other projects. Come to Denver for #apachecon and you will see! The end.

Tuesday, January 21, 2014

PaaS with CloudStack

A few talks from CCC in Amsterdam

In November at the CloudStack Collaboration Conference I was pleased to see several talks on PaaS. We had Uri Cohen (@uri1803) from GigaSpaces, Marc-Elian Begin (@lemeb) from SixSq and Alex Heneveld (@ahtweetin) from Cloudsoft. We also had some related talks, depending on your definition of PaaS, about Docker and Vagrant.

PaaS variations

The differences between PaaS solutions are best explained by this picture from the AWS FAQ about application management.
There is clearly a spectrum that goes from operational control to pure application deployment. We could argue that a true PaaS abstracts the operational details and that management of the underlying infrastructure should be totally hidden. That said, automation of virtual infrastructure deployment has reached such a sophisticated state that it blurs the line between IaaS and PaaS. Not surprisingly, AWS offers services that cover the entire spectrum.
Since I am more on the operations side, I tend to see a PaaS as an infrastructure automation framework. For instance, I look for tools to deploy a MongoDB cluster or a RiakCS cluster. I am not looking for an abstract platform that has MongoDB pre-installed and where I can turn a knob to increase the size of the cluster or manage my shards. An application person will prefer to look at something like Google App Engine and its open source version, AppScale. I will get back to all these differences in a future post on PaaS, but this article by @DavidLinthicum that just came out is a good read.

Support for CloudStack

What is interesting for the CloudStack community is to look at the support for CloudStack in all these different solutions, wherever they are in the application management spectrum.
  • Cloudify from GigaSpaces was all over twitter about its support for OpenStack, and I was getting slightly bothered by the lack of CloudStack support. That's why it was great to see Uri Cohen in Amsterdam. He delivered a great talk and gave me a demo of Cloudify. I was very impressed, of course, by the slick UI but above all by the ability to provision complete application/infrastructure definitions on clouds. Underneath it uses Apache jclouds, so there was no reason it could not talk to CloudStack. Over Christmas Uri did a terrific job, and the CloudStack support is now tested and documented. It works not only with the commercial version from Citrix, CloudPlatform, but also with Apache CloudStack. And of course it works with my neighbors' cloud, exoscale.
  • SlipStream is not widely known but worth a look. At CCC @lemeb demoed a CloudStack driver. Since then, they have started offering a hosted version of their SlipStream cloud orchestration framework, which turns out to be hosted on exoscale's CloudStack cloud. SlipStream is more of a cloud broker than a PaaS, but it automates application deployment on multiple clouds, abstracting the various cloud APIs and offering application templates for deployments of virtual infrastructure. Check it out.
  • Cloudsoft's main application deployment engine is Brooklyn. It originated from Alex Heneveld's contribution to Apache Whirr, which I wrote about a couple of times. But it can use OpenShift for an additional level of PaaS. I will need to check with Alex how they are doing this, as I believe OpenShift uses LXC. Since CloudStack has LXC support, one ought to be able to use Brooklyn to deploy an LXC cluster on CloudStack and then use OpenShift to manage the deployed applications.
  • A quick note on OpenShift. As far as I understand, it actually uses a static cluster; the scalability comes from the use of containers on the nodes. So technically you could create an OpenShift cluster in CloudStack, but I don't think we will see OpenShift talking directly to the CloudStack API to add nodes. OpenShift bypasses the IaaS APIs. Of course I have not looked at it in a while and I may be wrong :)
  • Talking about PaaS for Vagrant is probably a bit far fetched, but it fits the infrastructure deployment criteria and could be compared with AWS OpsWorks. Vagrant helps define reproducible machines so that devs and ops can actually work on the same base servers. But Vagrant, with its plugins, can also help deploy on public clouds and can handle multiple server definitions. So one can look at a Vagrantfile as a template definition for a virtual infrastructure deployment. As a matter of fact, there are many Vagrant boxes out there to deploy things like Apache Mesos clusters, MongoDB, RiakCS clusters etc. It's not meant to manage that stack in production, but at a minimum it can help develop it. Vagrant has a CloudStack plugin, demoed by Hugo Correia from Klarna at CCC. Exoscale took the bait and created a set of exoboxes; that's real gold for developers deploying on exoscale, and any CloudStack provider should follow suit.
  • Which brings me on to Docker. There is currently no support for Docker in CloudStack. We do have LXC support, therefore it would not be too hard to have a 'docker' cluster in CloudStack. You could even install Docker within an image and deploy that on KVM or Xen. Of course some would argue that using containers within VMs defeats the purpose. In any case, with the Docker remote API (see the sketch after this list) you could then manage your containers. OpenStack already has a Docker integration; we will dig deeper into Docker functionality to see how best to integrate it in CloudStack.
  • AWS, as I mentioned, has several PaaS like layers with OpsWorks, CloudFormation and Beanstalk. CloudStack has an EC2 interface but also has a third party solution to enable CloudFormation. This is still under development but pretty close to full functionality; check out stackmate and its web interface stacktician. With a CF interface to CloudStack we could see an OpsWorks and a Beanstalk interface coming in the future.
  • Finally, not present at CCC, but the leader of PaaS for the enterprise is CloudFoundry. I am going to see Andy Piper (@andypiper) in London next week and will make sure to talk to him about the recent CloudStack support that was merged in the cloudfoundry community repo. It came from folks in Japan and I have not had time to test it. Certainly we as a community should look at this very closely to make sure there is outstanding support for CloudFoundry in ACS.
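Here is the Docker remote API sketch mentioned in the Docker bullet above: a few lines of Python against the daemon's HTTP endpoint. It assumes the daemon is bound to a local TCP socket, which is configuration dependent; the address and port are illustrative:

    import requests

    # Daemon assumed reachable over TCP (e.g. started with -H tcp://...)
    base = 'http://127.0.0.1:2375'

    # List running containers, the remote API equivalent of 'docker ps'
    for c in requests.get(base + '/containers/json').json():
        print(c['Id'][:12], c['Image'], c['Status'])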
It is not clear what the frontier between PaaS and IaaS is; it is highly dependent on the context, who you are and what you are trying to achieve. But CloudStack offers several interfaces to PaaS, or shall I say PaaS solutions offer several connectors to CloudStack :)