Friday 5 August 2016

TripleO Composable Services 101

Over the Newton cycle, we've been working very hard on a major refactor of our heat templates and puppet manifests, so that a much more granular and flexible "Composable Services" pattern is now followed throughout our implementation.

It's been a lot of work, but it's been a frequently requested feature for some time, so I'm excited to be in a position to say it's complete for Newton (kudos to everyone involved in making that happen!) :)

This post aims to provide an introduction to this work, an overview of how it works under the hood, some simple usage examples and a roadmap for some related follow-on work.



Why Composable Services?


It probably helps to start with some historical context here.  As described in previous posts, TripleO has provided a fixed architecture with five roles (where a "role" is a group of nodes): Controller, Compute, BlockStorage, CephStorage and ObjectStorage.

To configure each of these roles we used puppet, with one large manifest per role and some relatively inflexible assumptions about which services would run on each role.

This worked OK, but many users have been requesting more flexibility, such as:

  • Ability to easily disable services they don't need
  • Allow service placement choice, such as co-locating the Ceph OSD service with nova-compute services to reduce the required hardware footprint (so-called "hyperconverged" deployments)
  • Make it easier to integrate new services and integrate third-party pieces (get closer to a strongly defined "plugin" interface)


The pre-Newton TripleO architecture: one manifest and heat template per role.

So, how does it work?

Basically, we've made two fundamental changes to our interfaces:

  • Each service, e.g. "nova-api", is now defined by an individual heat template.  The interfaces for these are standardized, so all services must implement a basic subset of input parameters and output values.
  • Every service defines a small puppet "profile", which is a puppet manifest fragment that handles configuring that service.  Again a standard interface is used; in particular a "step" variable is passed to each puppet profile, so the author can choose which of the six configuration steps a service is configured in, relative to other services.

This is the basis of the TripleO "service plugin" interface, and it should enable *much* easier integration of new services, and hopefully provide a more accessible interface for new contributors.
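
To make that concrete, here's a heavily simplified sketch of what a service template looks like (loosely based on the Newton-era puppet/services/keystone.yaml; the real templates declare many more parameters and config_settings, so treat this as illustrative rather than exact):

heat_template_version: 2016-04-08

parameters:
  ServiceNetMap:
    type: json
    description: Mapping of service_name -> network name.
  EndpointMap:
    type: json
    description: Mapping of service endpoint -> protocol/port.

outputs:
  role_data:
    description: Role data for the Keystone service.
    value:
      service_name: keystone
      config_settings:
        # Hieradata consumed by the puppet profile at deploy time
        keystone::public_bind_host: {get_param: [ServiceNetMap, keystone_public_api_network]}
      step_config: |
        include ::tripleo::profile::base::keystone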

Inside the TripleO templates, we made use of the new-for-Mitaka Heat ResourceChain interface to compose a deployment from multiple services.  Basically, a ResourceChain is a group of resources that may have different types but conform to the same interfaces, which is exactly what we need to combine a bunch of service templates that all implement the same standard interface.
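
In simplified form, the pattern looks something like this (a sketch of the chain definition, not the exact overcloud template source):

  ControllerServiceChain:
    type: OS::Heat::ResourceChain
    properties:
      resources: {get_param: ControllerServices}
      concurrent: true
      resource_properties:
        ServiceNetMap: {get_param: ServiceNetMap}
        EndpointMap: {get_param: EndpointMap}

Each entry in the ControllerServices list is resolved to a service template via the resource_registry, and the chain collects the role_data output of every service in the role.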

Here's an illustration of how it works - essentially you define an input parameter which is a list of services, e.g. OS::TripleO::Services::NovaApi, each of which maps to the heat template for that service, e.g. puppet/services/nova-api.yaml, via the resource_registry interface discussed in previous posts.
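
In resource_registry terms, that's simply a mapping like this (this particular alias/path pair is the one from the example above; the full default list lives in overcloud-resource-registry-puppet.yaml):

resource_registry:
  OS::TripleO::Services::NovaApi: puppet/services/nova-api.yaml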

For Newton, each role has a ServiceChain that combines the chosen services for that role.

If you'd like more information on the implementation details, I'd encourage you to check out the developer documentation where we're starting to document these interfaces in more detail.

Ok, how do I use it?

Here I'm going to focus on usage of the feature rather than developing new services (which is pretty well covered in the aforementioned developer docs), and hopefully illustrate why this is an important step forward that improves operator deployment choices.

Scenario 1 - All-in-one minimal deployment

Let's say for a moment that you're a keystone developer and you want a shorter debug cycle and/or are resource constrained.  With the new interfaces, it's become very easy to deploy a minimal subset of services on a single node:

First, you create an environment file that overrides the default ControllerServices list (which at the time of writing contains about 50 services!) so it only includes OS::TripleO::Services::Keystone and the services keystone depends on.  We also set ComputeCount to zero, as we don't need any compute nodes.

$ cat keystone-only.yaml
parameter_defaults:
  ControllerServices:
      - OS::TripleO::Services::Keystone
      - OS::TripleO::Services::RabbitMQ
      - OS::TripleO::Services::HAproxy
      - OS::TripleO::Services::MySQL
  ComputeCount: 0


(Note that in some environments it may also be necessary to include OS::TripleO::Services::Pacemaker.)

You can then deploy your single node keystone-only environment:

openstack overcloud deploy --templates -e keystone-only.yaml

When this completes, you'll see the following message, and you can source the overcloudrc and get a token to prove the deployed keystone is working:

...
Overcloud Endpoint: http://192.0.2.15:5000/v2.0
Overcloud Deployed
[stack@instack ~]$ . overcloudrc
[stack@instack ~]$ openstack token issue
+------------+----------------------------------+
| Field      | Value                            |
+------------+----------------------------------+
| expires    | 2016-08-05 10:16:16+00:00        |
| id         | 976d5fcf9f744a5a9cf840e83d825560 |
| project_id | 99e92ae58d1f4147a5d7eda0af516060 |
| user_id    | 29fe578e45b24406ba6c5fd0baaeaa9c |
+------------+----------------------------------+


We can see, by looking at the undercloud nova (don't forget to source the stackrc after interacting with the overcloud above!), that there is one controller node:


[stack@instack ~]$ . stackrc
[stack@instack ~]$ nova list
+--------------------------------------+------------------------+--------+------------+-------------+--------------------+
| ID                                   | Name                   | Status | Task State | Power State | Networks           |
+--------------------------------------+------------------------+--------+------------+-------------+--------------------+
| d5155616-d2a6-4cee-a6d1-37bb83fccfe0 | overcloud-controller-0 | ACTIVE | -          | Running     | ctlplane=192.0.2.7 |
+--------------------------------------+------------------------+--------+------------+-------------+--------------------+



Scenario 2 - "hyperconverged" Ceph deployment

In this case, we want to move the Ceph OSD services, which normally run on the CephStorage role, and instead have them run on the Compute role.

To do this, we first look at the default values for the ComputeServices and CephStorageServices parameters in overcloud.yaml (as in the example above for the Controller role, these lists define the services to be deployed on the Compute and CephStorage roles respectively):

  ComputeServices:
    default:
      - OS::TripleO::Services::CephClient
      - OS::TripleO::Services::CephExternal
      - OS::TripleO::Services::Timezone
      - OS::TripleO::Services::Ntp
      - OS::TripleO::Services::Snmp
      - OS::TripleO::Services::NovaCompute
      - OS::TripleO::Services::NovaLibvirt
      - OS::TripleO::Services::Kernel
      - OS::TripleO::Services::ComputeNeutronCorePlugin
      - OS::TripleO::Services::ComputeNeutronOvsAgent
      - OS::TripleO::Services::ComputeCeilometerAgent

  CephStorageServices:
    default:
      - OS::TripleO::Services::CephOSD
      - OS::TripleO::Services::Kernel
      - OS::TripleO::Services::Ntp
      - OS::TripleO::Services::Timezone


Our aim is to deploy one Compute node running both the standard compute services and the OS::TripleO::Services::CephOSD service (the other services are clearly common to both roles).  We also don't need the OS::TripleO::Services::CephExternal service defined in ComputeServices, because we won't be referencing any external ceph cluster.  That gives us this:

$ cat ceph_osd_on_compute.yaml
parameter_defaults:
  ComputeServices:
      - OS::TripleO::Services::CephClient
      - OS::TripleO::Services::CephOSD
      - OS::TripleO::Services::Timezone
      - OS::TripleO::Services::Ntp
      - OS::TripleO::Services::Snmp
      - OS::TripleO::Services::NovaCompute
      - OS::TripleO::Services::NovaLibvirt
      - OS::TripleO::Services::Kernel
      - OS::TripleO::Services::ComputeNeutronCorePlugin
      - OS::TripleO::Services::ComputeNeutronOvsAgent
      - OS::TripleO::Services::ComputeCeilometerAgent


That is all that's required to enable a hyperconverged Ceph deployment!  :)

Since the default count for CephStorage is zero, we can then deploy like this:


[stack@instack ~]$ openstack overcloud deploy --templates /tmp/tripleo-heat-templates -e ceph_osd_on_compute.yaml -e /tmp/tripleo-heat-templates/environments/storage-environment.yaml

Here we can see I'm specifying a non-default location, /tmp/tripleo-heat-templates, for the template tree (this defaults to /usr/share/openstack-tripleo-heat-templates), passing the ceph_osd_on_compute.yaml environment to enable the OSD service on the Compute role, and finally passing storage-environment.yaml, which configures the storage services to be backed by Ceph.
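
For reference, the Ceph-related part of storage-environment.yaml boils down to parameter_defaults along these lines (simplified; check the copy in your template tree for the full set of options):

parameter_defaults:
  CinderEnableIscsiBackend: false
  CinderEnableRbdBackend: true
  NovaEnableRbdBackend: true
  GlanceBackend: rbd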

Logging onto the compute node after deployment we see this:

[root@overcloud-novacompute-0 ~]# ps ax | grep ceph
17437 ?        Ss     0:00 /bin/bash -c ulimit -n 32768; /usr/bin/ceph-osd -i 0 --pid-file /var/run/ceph/osd.0.pid -c /etc/ceph/ceph.conf --cluster ceph -f
17438 ?        Sl     0:00 /usr/bin/ceph-osd -i 0 --pid-file /var/run/ceph/osd.0.pid -c /etc/ceph/ceph.conf --cluster ceph -f


So, it worked, and we have the OSD service running on the Compute role! :)


Similar patterns to those described above can be used to achieve various deployment topologies which were not previously possible (for example an all-in-one deployment, including nova-compute, on a single node, as is now done in one of our CI jobs).

Future Work

Hopefully by now you can see that these new interfaces provide a much cleaner abstraction for services, and a lot more operator flexibility regarding their placement.  However, for some environments this is not enough, and completely new roles may be needed.  We're working towards enabling that via the custom-roles blueprint, which will hopefully land for Newton.

Another related piece of work is enabling more flexible environment merging inside Heat.  This will mean there is less need to specify the full list of services as described above; instead we'll be able to build up the list of services from multiple environment files (which are then merged by appending to the final list).
