OpenStack Monitoring: Tutorial and Course



OpenStack Monitoring: Overview




There are many ways to monitor computer systems and their services, but the principles remain the same. Adequate monitoring and alerting of services is the only way to ensure we know about a problem before our customers do. From SNMP traps to service-specific agents running on machines, configuring monitoring is an essential step in any production deployment of OpenStack.



OpenStack Monitoring: Monitoring Compute Services With Munin


Munin is a network and system monitoring application that outputs graphs through a web interface. It consists of a master server that gathers the output from agents running on each of our hosts.



We will be configuring Munin on a server that has access to the OpenStack Compute environment hosts. Ensure this server has enough RAM, disk, and CPU capacity for the environment you are running.



To set up Munin with OpenStack, carry out the following steps:





The Munin Master node is the server that provides the web interface for viewing the collected information about the nodes in our network. It must be installed first, as follows:



1. Configure a server running a 64-bit version of Ubuntu or Debian, with access to the servers in our OpenStack environment.



2. Install Munin from the Ubuntu repositories:




sudo apt-get update
sudo apt-get -y install apache2
sudo apt-get -y install munin munin-plugins-extra
sudo service apache2 restart



3. By default, the Apache configuration for Munin only allows access from 127.0.0.1. To allow access from our network, we edit /etc/apache2/conf.d/munin and list the server(s) or network(s) that may access Munin. For example, to allow access from 192.168.1.0/24, we add the following Allow line:




Allow from 192.168.1.
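For orientation, the stanza in /etc/apache2/conf.d/munin that we are editing looks roughly like the following on an Apache 2.2 system; the exact contents vary between package versions, so treat this as a sketch. Note that a trailing-dot address such as 192.168.1. matches the whole /24 network:

```apache
Alias /munin /var/cache/munin/www
<Directory /var/cache/munin/www>
        Order allow,deny
        Allow from localhost 127.0.0.0/8 ::1
        # Our addition: allow the management network to reach Munin
        Allow from 192.168.1.
        Options None
</Directory>
```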



4. We reload the Apache service to pick up this change. We do this as follows:




sudo service apache2 reload



5. At this stage, we have a basic installation of Munin that is gathering statistics for the running machine where we have just installed Munin. The web interface can be seen if you load up a web browser and browse to http://server/munin.



6. Configuration of the Munin Master is done in the /etc/munin/munin.conf file. Here, we tell Munin about our OpenStack hosts, which are specified by their FQDNs; Munin groups hosts under the same domain. For example, to add two OpenStack hosts with the addresses 172.16.0.1 (openstack1) and 172.16.0.2 (openstack2), we add the following section to the munin.conf file:




[openstack1.cloud.test]
    address 172.16.0.1
    use_node_name yes

[openstack2.cloud.test]
    address 172.16.0.2
    use_node_name yes



We can now proceed to configure the nodes openstack1 and openstack2.



With the Munin Master server installed, we can now configure the Munin nodes. Each node runs an agent, called munin-node, from which the master gathers the information it presents to the user.



1. We first need to install the munin-node package on our OpenStack hosts. So, for each one, we execute the following:




sudo apt-get update
sudo apt-get -y install munin-node munin-plugins-extra



2. Once installed, we need to configure this so that our Munin Master host is allowed to get information from the node. To do this, we edit the /etc/munin/munin-node.conf file and add in an allow line. To allow our Master on IP address 172.16.0.253, we add the following entry:




allow ^172\.16\.0\.253$
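The allow directive takes a regular expression, which is why the dots are escaped: an unescaped dot would match any character. A quick illustration of the difference using grep (purely to show the regex behaviour; it is not part of the Munin setup):

```shell
# An unescaped dot matches any character, so this bogus "address" matches:
echo "172q16r0s253" | grep -c '^172.16.0.253$'
# With the dots escaped, it correctly does not match (grep -c prints 0):
echo "172q16r0s253" | grep -c '^172\.16\.0\.253$' || true
```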



3. Once that line is in, we can restart the munin-node service to pick up the change.




sudo restart munin-node



With Munin Master installed, and having a couple of nodes with graphs showing up on the Master, we can add in plugins to pick up the OpenStack services and graph them. To do this, we check out some plugins from GitHub.



1. We first ensure we have the git client available to us on our OpenStack nodes:




sudo apt-get update
sudo apt-get -y install git



2. We can now check out the OpenStack plugins for Munin as they're not yet available in the munin-plugins-extra package:




git clone https://github.com/munin-monitoring/contrib.git



3. This checks out contributed code and plugins to a directory named contrib. We copy the relevant plugins for the OpenStack services into the Munin plugins directory, as follows:




cd contrib/plugins
sudo cp nova/* /usr/share/munin/plugins/
sudo cp keystone/* /usr/share/munin/plugins/
sudo cp glance/* /usr/share/munin/plugins/



4. Munin-node comes with a utility that allows us to enable appropriate plugins on our hosts automatically. We run the following commands to do this:




sudo munin-node-configure --suggest
sudo -i # get root shell
munin-node-configure --shell 2>&1 | egrep -v "^\#" | sh
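The second command works because munin-node-configure --shell prints a series of ln -s commands (plus comment lines); stripping comment lines with egrep and piping the rest to sh executes them. The pattern looks like this, shown with a stand-in generator command rather than munin-node-configure itself:

```shell
# Stand-in for a command that emits shell commands mixed with comments;
# egrep -v "^#" drops the comments and sh executes what remains.
printf '# a comment line\necho plugin_enabled\n' | egrep -v "^#" | sh
```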



5. The Keystone and Glance plugins don't get picked up automatically, so we add these to the plugins directory manually, with symlinks:




cd /etc/munin/plugins
sudo ln -s /usr/share/munin/plugins/keystone_stats
sudo ln -s /usr/share/munin/plugins/glance_size 
sudo ln -s /usr/share/munin/plugins/glance_status



6. We also need to add in an extra configuration file to sit alongside the OpenStack plugins, called /etc/munin/plugin-conf.d/openstack.




[nova_*]
user nova

[keystone_*]
user keystone

[glance_*]
user glance



7. With the appropriate plugins configured, we restart the munin-node service, as follows, to pick up the change:




sudo restart munin-node



8. When the Master server refreshes, we see OpenStack services as options and graphs we can click through to.



Munin is an excellent open source, networked resource-monitoring tool that can help analyze resource trends and identify problems in our OpenStack environment. Configuration is very straightforward, with the out-of-the-box configuration providing many useful graphs from RRD (Round Robin Database) files. By adding a few extra configuration options and plugins, we can extend Munin to monitor our OpenStack environment.



Once Munin has been installed, we have to do a few things to configure it to produce graphed statistics for our environment:



1. Configure the Master Munin server with the nodes we wish to get graphs from. This is done in the /etc/munin/munin.conf file, using sections that group host addresses under a domain in a tree-like structure.



2. We then configure each node with the munin-node service. The munin-node service has its own configuration file, where we set the IP address of our master Munin server; this authorizes the master server at that IP address to retrieve the collected data from the node. This is set in the allow line in the /etc/munin/munin-node.conf file.



3. Finally, we configure appropriate plugins for the services that we want to monitor. With the OpenStack plugins installed, we can monitor the Compute, Keystone, and Glance services and obtain statistics on the number of instances running, the number of floating IPs assigned, allocated, and used, and so on.



OpenStack Monitoring: Monitoring Instances Using Munin & Collectd


The health of the underlying infrastructure operating our on-premises cloud solution is important, but it is equally important to understand the metrics reported by the Compute instances themselves. For this, we can have metrics sent from them using a monitoring tool called collectd, and we can leverage Munin for an overall view of our running virtual instances.



To set Munin and Collectd up, carry out the following steps:



We can configure Munin to look at more than just the CPU, memory, and disk space of the host, by invoking the libvirt plugin to query values within the running instances on our Compute hosts.



1. The libvirt munin plugin is conveniently provided by the Ubuntu repositories, so we grab these in the usual way:




sudo apt-get update
sudo apt-get -y install munin-libvirt-plugins



2. Once downloaded, we then configure the munin libvirt plugins on the Compute host:




cd /etc/munin/plugins
sudo ln -s /usr/share/munin/plugins/libvirt-blkstat  
sudo ln -s /usr/share/munin/plugins/libvirt-ifstat
sudo ln -s /usr/share/munin/plugins/libvirt-cputime
sudo ln -s /usr/share/munin/plugins/libvirt-mem



3. With the plugins in place, we now need to configure them. This is done by placing a file in /etc/munin/plugin-conf.d/libvirt, with the following contents:




[libvirt*]
user root
env.address qemu:///system
env.tmpfile /var/lib/munin/plugin-state/libvirt



Once this is done, we restart the munin-node service, and we will see an additional category show up in Munin, named virtual machines, where we can then see how much of the system resources are being consumed on the host.



Collectd is set up in three parts. There is a collectd server that listens over UDP for data sent from clients. There is the client collectd service that sends the data to the collectd server. Finally, there is a web interface to Collectd, named collectd-web, that allows for easy viewing of the graphs sent from collectd.



1. We first install collectd and the required Perl resources in the usual way from Ubuntu's repositories:




sudo apt-get update
sudo apt-get -y install collectd libjson-perl



2. Once installed, we configure the service to listen on a port of our choosing. The configuration of collectd is done in /etc/collectd/collectd.conf. In the following configuration, we listen on UDP port 12345:




Hostname "servername"
Interval 10
ReadThreads 5

LoadPlugin network
<Plugin network>
  Listen "*" "12345"
</Plugin>

LoadPlugin cpu
LoadPlugin df
LoadPlugin disk
LoadPlugin load
LoadPlugin memory
LoadPlugin processes
LoadPlugin swap
LoadPlugin syslog
LoadPlugin users

LoadPlugin interface
<Plugin interface>
    Interface "eth0"
</Plugin>

LoadPlugin tcpconns

LoadPlugin rrdtool
<Plugin rrdtool>
  CacheFlush 120
  WritesPerSecond 50
</Plugin>

Include "/etc/collectd/filters.conf"
Include "/etc/collectd/thresholds.conf"



3. We restart the service to pick up these changes:




sudo service collectd restart



1. The collectd client and server both use the same package, so we install the client in the same way:




sudo apt-get update
sudo apt-get -y install collectd libjson-perl



2. The configuration file for the guest is the same as for the server, but we specify different options. Edit /etc/collectd/collectd.conf with the following contents:




FQDNLookup true
Interval 10
ReadThreads 5

LoadPlugin network
<Plugin network>
  Server "172.16.0.253" "12345"
</Plugin>

LoadPlugin cpu
LoadPlugin df
LoadPlugin disk
LoadPlugin load
LoadPlugin memory
LoadPlugin processes
LoadPlugin swap
LoadPlugin syslog
LoadPlugin users

LoadPlugin interface
<Plugin interface>
  Interface "eth0"
</Plugin>




3. Restart the collectd service to pick up this change:




sudo service collectd restart



1. At this point, data is being sent over to the collectd server. To view this data, we install another package that can interpret the RRD files and present them in an easy-to-use web interface. We first download the collectd-web tarball from the web.



2. We then unpack the archive, as follows:




tar zxvf collectd-web_X.X.X.tar.gz



3. Then, we copy everything over to the web server DocumentRoot directory:




sudo cp -a ./collectd-web /var/www



4. Create or modify the /etc/collectd/collection.conf file with the following contents:




datadir: "/var/lib/collectd/"
libdir: "/usr/lib/collectd/"



5. We then run the standalone server that will listen locally for requests from Apache:




cd /var/www/collectd-web
sudo nohup python runserver.py &



6. After this, we edit the vhost file that controls the DocumentRoot of our Apache setup (on Ubuntu, this is /etc/apache2/sites-enabled/000-default) to ensure that .htaccess files are honored, using the AllowOverride all configuration:





<Directory /var/www/>
      Options Indexes FollowSymLinks MultiViews
      AllowOverride all
      Order allow,deny
      allow from all
</Directory>




7. We can now simply reload Apache to pick up the changes, as follows:




sudo service apache2 reload



8. Now, we point our web browser to our installation, for example, http://172.16.0.253/collectd-web, to view the collectd stats from the listed servers.



Munin has plugins for various monitoring activities, including libvirt. As libvirt is used to manage the running instances on our Compute nodes, they hold an array of information that we can send to Munin to allow us to get a better understanding of what is happening in and on our OpenStack Compute hosts and instances.



Collectd is regarded as one of the standard ways of collecting resource information from servers and instances. It can act as a server and a client and, as such, we use the same installation binaries on both our monitoring host and guests. The difference is in the configuration file, /etc/collectd/collectd.conf. For the server, we specify that we listen on a specific port using the following lines in the server's configuration file:





<Plugin network>
  Listen "*" "12345"
</Plugin>




For the client configuration, we specify where we want the data sent to, using the following lines in the client's configuration file:





<Plugin network>
  Server "172.16.0.253" "12345"
</Plugin>




To bring the two together in a convenient interface to collectd, we install collectd-web, which runs a standalone service in conjunction with Apache to provide the interface.



OpenStack Monitoring: Monitoring The Storage Service Using StatsD/Graphite


When monitoring the OpenStack Storage service, Swift, we are looking at gathering key metrics from within the storage cluster in order to make decisions on its health. For this, we can use a small piece of middleware named swift-informant, together with StatsD and Graphite, to produce near real-time stats of our cluster.



We will be configuring StatsD and Graphite on a server that has access to the OpenStack Storage proxy server. Ensure this server has enough RAM, disk, and CPU capacity for the environment you are running.



Prerequisites



To install StatsD and Graphite, carry out the following steps:



For this, we will be configuring a new Ubuntu server. Once Ubuntu has been installed, we need to install some prerequisite packages.




sudo apt-get update
sudo apt-get -y install git python-pip gcc python2.7-dev apache2 libapache2-mod-python python-cairo python-django libapache2-mod-wsgi python-django-tagging



Graphite



1. Installation of Graphite is achieved using the Python Package Index tool, pip:




sudo pip install carbon
sudo pip install whisper
sudo pip install graphite-web



2. Once installed, we can configure the installation. Example configuration files for Graphite are found in /opt/graphite/conf. We rename these to their respective conf files:




cd /opt/graphite/conf
sudo mv carbon.conf.example carbon.conf
sudo mv storage-schemas.conf.example storage-schemas.conf
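The storage-schemas.conf file controls how long carbon keeps data, and at what resolutions, in its whisper files. A typical stanza for StatsD-fed metrics looks like the following (retentions are given as secondsPerPoint:pointsToStore; the pattern and values here are illustrative and should be adjusted for your environment):

```
[stats]
priority = 110
pattern = ^stats\..*
retentions = 10:2160,60:10080,600:262974
```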



3. We now create the vhost file for Apache that will load the Graphite frontend. Create /etc/apache2/sites-available/graphite with the following contents:





<VirtualHost *:80>
        ServerName 172.16.0.253
        DocumentRoot "/opt/graphite/webapp"
        ErrorLog /opt/graphite/storage/log/webapp/error.log
        CustomLog /opt/graphite/storage/log/webapp/access.log common

        # An equal number of processes and threads tends
        # to show the best performance for Graphite (ymmv).
        WSGIDaemonProcess graphite processes=5 threads=5 display-name='%{GROUP}' inactivity-timeout=120
        WSGIProcessGroup graphite
        WSGIApplicationGroup %{GLOBAL}
        WSGIImportScript /opt/graphite/conf/graphite.wsgi process-group=graphite application-group=%{GLOBAL}

        WSGIScriptAlias / /opt/graphite/conf/graphite.wsgi

        Alias /content/ /opt/graphite/webapp/content/
        <Location "/content/">
                SetHandler None
        </Location>

        Alias /media/ "/usr/lib/python2.7/dist-packages/django/contrib/admin/media/"
        <Location "/media/">
                SetHandler None
        </Location>

        # The graphite.wsgi file has to be accessible by Apache.
        # It won't be visible to clients because of the DocumentRoot, though.
        <Directory /opt/graphite/conf/>
                Order deny,allow
                Allow from all
        </Directory>
</VirtualHost>
        




4. We enable this website using the a2ensite utility:




sudo a2ensite graphite



5. We now need to enable the WSGI file for Graphite:




cd /opt/graphite/conf
sudo mv graphite.wsgi.example graphite.wsgi



6. Various areas need to change their ownership to that of the process running the Apache web server:




sudo chown -R www-data:www-data /opt/graphite/storage/log/
sudo touch /opt/graphite/storage/index
sudo chown www-data:www-data /opt/graphite/storage/index



7. We can now restart Apache to pick up these changes:




sudo service apache2 restart



8. The Graphite service runs with a SQLite database backend, so we need to initialize this:




cd /opt/graphite/webapp/graphite
sudo python manage.py syncdb



9. This will ask for some information, as displayed next:




You just installed Django's auth system, which means you don't have any superusers defined.
Would you like to create one now? (yes/no): yes
Username (Leave blank to use 'root'): 
E-mail address: user@somedomain.com
Password: 
Password (again): 
Superuser created successfully.
Installing custom SQL ...
Installing indexes ...
No fixtures found.



10. We also need to ensure that Apache can write to this, too:




sudo chown -R www-data:www-data /opt/graphite/storage



11. Finally, we start the services, thus:




cd /opt/graphite
sudo bin/carbon-cache.py start



StatsD



1. StatsD runs using node.js, so we have to install it first, using packages from Ubuntu's repositories:




sudo apt-get update
sudo apt-get -y install nodejs



2. We then check out the StatsD code from Git:




git clone https://github.com/etsy/statsd.git



3. Configuring StatsD is done by modifying an example configuration file:




cd statsd
cp exampleConfig.js Config.js



4. We need to modify the Config.js file to change the graphiteHost: parameter to localhost, as we're running Graphite on the same host as StatsD:




{
  graphitePort: 2003
, graphiteHost: "localhost"
, port: 8125
}



5. To start the service, we issue the following command:




nohup node stats.js Config.js &



swift-informant



We are now ready to configure the OpenStack Swift proxy server to include the swift-informant middleware in the pipeline. This is done by configuring the /etc/swift/proxy-server.conf file.



1. We first download and install the middleware by running the following commands:




git clone https://github.com/pandemicsyn/swift-informant.git
cd swift-informant
sudo python setup.py install



2. Once installed, we modify the pipeline in /etc/swift/proxy-server.conf to specify a filter named informant:




[pipeline:main]
pipeline = informant healthcheck cache swift3 s3token tokenauth keystone proxy-server



3. We then add in the informant filter section, specifying the address of our StatsD server, in the statsd_host section, as follows:




[filter:informant]
use = egg:informant#informant
statsd_host = 172.16.0.9
# statsd_port = 8125
# standard statsd sample rate 0.0 <= 1
# statsd_sample_rate = 0.5
# list of allowed methods, all others will generate a "BAD_METHOD" event
# valid_http_methods = GET,HEAD,POST,PUT,DELETE,COPY
# send multiple statsd events per packet as supported by statsdpy
# combined_events = no
# prepends name to metric collection output for easier recognition, e.g. company.swift.
# metric_name_prepend =



4. Once done, we simply restart our OpenStack proxy service:




sudo swift-init proxy-server restart



5. Load up your web browser and point it to your Graphite web installation, to see the graphs get populated in real time.







Gaining insight into what our OpenStack Storage cluster is doing can be achieved by including a piece of middleware in the pipeline of our OpenStack Storage proxy server, named swift-informant, along with StatsD and Graphite. StatsD is a node.js service that listens for statistics sent to it in UDP packets. Graphite takes this data and gives us a real-time graph view of our running services.
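The statistics swift-informant emits are plain-text StatsD metric lines of the form name:value|type, sent as UDP datagrams. A sketch of the format (the metric name here is made up for illustration):

```shell
# A StatsD counter increment: "name:value|type", where type "c" means counter
metric="swift.proxy-server.GET.200:1|c"
echo "$metric"
# One way to fire a metric at a StatsD server by hand (bash's /dev/udp; illustrative):
#   echo "$metric" > /dev/udp/172.16.0.9/8125
```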



Installation and configuration is done in stages. We first install and configure a server that will be used for StatsD and Graphite. Graphite can be installed from Python's Package Index (using the pip tool), and for this we install three pieces of software: carbon (the collector), whisper (the fixed-size, RRD-like database library), and the Django web interface, graphite-web. Installing with pip places these services under the /opt directory of our server.



Once the server for running Graphite and StatsD has been set up, we can configure the OpenStack Storage proxy service, so that statistics are then sent to the Graphite and StatsD server. With the appropriate configuration in place, the OpenStack Storage service will happily send events, via UDP, to the StatsD service.



Configuration of the Graphite interface is done in an Apache vhost file that we place in Ubuntu's Apache sites-available directory. We then enable this for our installation.



Note that the vhost needs to be configured appropriately for our environment, specifically the path to the DJANGO_ROOT area of our Python installation. For Ubuntu, this is /usr/lib/python2.7/dist-packages/django, which gives us the following in our vhost file:




Alias /media/ "/usr/lib/python2.7/dist-packages/django/contrib/admin/media/"



We then ensure that the Graphite WSGI (Web Server Gateway Interface) file is in place at the appropriate path, as specified by the WSGIScriptAlias directive, at /opt/graphite/conf/graphite.wsgi.



Once in place, we ensure that our filesystem has the appropriate permissions to allow Graphite to write various logs and information as it's running.



When this has been done, we simply restart Apache to pick up the changes.



With the Graphite web interface configured, we initialize the database; for this installation, we make use of a SQLite database. This is achieved by running the syncdb option of the Graphite manage.py script in the /opt/graphite/webapp/graphite directory. This asks us to create a superuser account to manage the system later.



Once this has been done, we can start the collector service, carbon, which starts the appropriate services that will listen for data being sent to it.



With all that in place, we move our efforts to the OpenStack Storage proxy service, where we check out the swift-informant middleware and insert it into the pipeline of our proxy service.



OpenStack Monitoring: Monitoring MySQL With Hyperic


Database monitoring can be quite complex and, depending on your deployment or experience, may already be set up. For those who don't have existing monitoring of a MySQL service, Hyperic from SpringSource is an excellent tool for setting up monitoring and alerting for MySQL. The software comes in two editions: an Open Source edition, suitable for smaller installations, and an Enterprise edition with paid-for support. The steps in the following section are for the Open Source edition.



Hyperic can monitor many aspects of our OpenStack environment including system load, network statistics, Memcached, and RabbitMQ status.



We will be configuring Hyperic on an Ubuntu server that has access to the MySQL server in our OpenStack environment. Ensure this server has enough RAM, disk, and CPU capacity for the environment you are running. Log in as a normal user to download and install the software.



To install Hyperic, carry out the following steps:



Hyperic server



1. We can find the Hyperic server installation package at the following URL:




http://www.springsource.com/landing/hyperic-open-source-download



2. Fill in the details, and you will be presented with two links. One is for the server, and the other for the agent. Download both.



3. On the server that will be running the Hyperic server, we unpack the Hyperic server installation package as follows:




tar zxvf hyperic-hq-installer-4.5-x86-64-linux.tar.gz



4. Once unpacked, change to the directory:




cd hyperic-hq-installer-4.5



5. The default install area for Hyperic is /home/hyperic, so we create this and ensure our unprivileged user can write to it:




sudo mkdir -p /home/hyperic
sudo chown openstack /home/hyperic



6. Once this area is ready, we can run the setup script to install Hyperic:




./setup.sh



7. During the installation, a message will pop up asking us to open up another terminal on our server as the root user to execute a small script.



8. In another terminal, log in as root and execute the script.



9. Return to the original shell and continue the installation. Eventually, the installation will complete. We can now start the Hyperic HQ service with the following command:




/home/hyperic/server-4.5/bin/hq-server.sh start



10. First-time start up can be quite slow, but eventually you will be able to point your web browser at the address the installation has presented to you, which will be http://server:7080/.



11. Log in with user hqadmin and password hqadmin.



Nodes



Each node that we want to monitor in Hyperic needs an agent installed, which then gets configured to talk back to the Hyperic server.



1. Copy the agent tarball to the server that we'll be monitoring in Hyperic.



2. Unpack the agent as follows:




tar zxvf hyperic-hq-agent-4.5-x86-64.tar.gz



3. Change to the unpacked directory:




cd hyperic-hq-agent-4.5



4. Start the agent, which will ask for information about the Hyperic server installation. Specify the server address, port, username (hqadmin), and password (hqadmin). When asked for the IP to use, specify the address that Hyperic can use to communicate with the server.




bin/hq-agent.sh start



5. This completes the installation of the agent.



6. Once done, the new node will appear in Hyperic, with auto-discovered services listed.



7. Click on the Add to Inventory button to accept these to be added to Hyperic, and you will see our new node listed with the services that have been discovered.



Monitoring MySQL



To monitor MySQL, carry out the following steps:



1. Monitoring MySQL involves the agent understanding how to authenticate with MySQL. We first add in the MySQL service to our host by selecting the host that has recently been added. This takes us to the main screen for that host, where we can click through services that are being monitored.



2. We then click on the Tools Menu option and select New Server.



3. This takes us to a screen where we can add in a label for the new service and the service type.




Name: openstack1 MySQL
Server Type: MySQL 5.x
Install Path: /usr



4. Clicking on OK takes us to the configuration screen for this new service. At the bottom of the page, there is a section named Configuration Properties. Click on the EDIT... button for this section.



5. We can now specify the username, password, and connect string, to connect to the running MySQL instance.




JDBC User: root
JDBC Password: openstack



These are the credentials for a user in MySQL that can see all databases. Check the Auto-Discover Tables option and leave the rest of the options at their default values, unless you need to change the address that the agent will connect to for MySQL.



6. By clicking on OK and then browsing back to the host, we will now have a monitoring option named openstack1 MySQL, as specified in step 3. The agent will then collect statistics about our MySQL instance.



Hyperic uses agents to collect information and sends this back to the Hyperic server, where we can view statistics about the environment and configure alerting based on thresholds. The agent is very flexible and can be configured to monitor many more services than just MySQL.



Configuration of the agent is done through the Hyperic server's interface, where a running node's service is known as a "server". Here, we can configure usernames, ports, and passwords to allow the agent to communicate with that service. For MySQL, this means providing the agent with the correct username, password, and address for the familiar JDBC (Java Database Connectivity) connect string.
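A JDBC connect string for MySQL has a standard shape; assuming MySQL listens on its default port (3306) on openstack1, it would look something like the following (the host address and database name here are illustrative):

```
jdbc:mysql://172.16.0.1:3306/mysql
```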



In your datacenter, you may have a MySQL cluster rather than a single server, where a view of the cluster as a whole is of equal (if not greater) importance than that of the individual nodes. An example cluster monitoring suite with both free and enterprise options is CMON, available from the Severalnines website.



OpenStack Monitoring: Further Reading