Nginxmgr - Nginx upstream pool manager.

What?

Nginxmgr is an application for managing pools of upstream servers in an Nginx deployment. It allows you to dynamically allocate, deallocate and monitor servers in an upstream pool via health checks and an XML-RPC interface.

Why?

I wrote this software because I use Nginx as a web application load balancer and have often found the need to monitor the upstream servers and have it react like a real load balancer would. I was also looking for a way to be able to manage the upstream servers programmatically with Python and monitor the health of the upstream pools by being aware of the number of servers available.

Use Case

A typical use case for Nginxmgr is a large environment with many upstream pools and servers. Of course, that’s not a requirement. You could also use it on a pool of just a few servers.

Checks/Handlers

Out of the box, Nginxmgr comes with one check named url_ping. Its job is to check the status of an upstream member every ‘n’ seconds and, if the check fails, call the appropriate handler. The handler will then remove the server from the upstream pool and reload Nginx. The url_ping check is also configurable, allowing checks on a port other than 80 and for status codes other than 200 (in case you want to get a 404 or something back).

Checks and handlers are Python modules and functions, which keeps things flexible and makes it easy to develop new checks/handlers should they be needed.
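To make the check side of this concrete, here is a minimal sketch of what a url_ping-style check could look like. This is not the code that ships with Nginxmgr: the names build_url, fetch_status and the fetch parameter are made up for illustration, and it is written against Python 3's urllib. It builds the URL from a "http://%s/" style template plus an optional port override, fetches it, and compares the status code with the expected one.

```python
import urllib.error
import urllib.request


def build_url(member, url="http://%s/", port=None):
    """Fill the URL template with the member, optionally replacing its port."""
    if port is not None:
        # Strip any existing port from 'IPADDR:PORT' and use the override.
        host = member.rsplit(":", 1)[0] if ":" in member else member
        member = "%s:%d" % (host, port)
    return url % member


def fetch_status(url, timeout=2):
    """Return the HTTP status code for url, or None if it is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses still carry a status code we may be expecting.
        return err.code
    except (urllib.error.URLError, OSError):
        return None


def url_ping(member, url="http://%s/", status=200, port=None, fetch=fetch_status):
    """True if the member answered with the expected status code."""
    return fetch(build_url(member, url, port)) == status
```

A False result is the point where a handler would take over, for example by removing the member from its pool and reloading Nginx.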

Interoperability With Nginx

Currently Nginxmgr runs alongside Nginx as a separate daemon; however, it will not attempt to start if Nginx is not running. It also does a check in the main loop before the health checks are executed to determine if Nginx was shut down or died since the last check. When that check happens, if Nginx isn’t running, Nginxmgr will shut itself down.
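One way such an "is Nginx still there?" check could be done is sketched below. It is an assumption for illustration, not how Nginxmgr actually does it: it reads a pid file and probes the process with signal 0, and the helper names read_pid and is_running are hypothetical.

```python
import os


def read_pid(pid_file):
    """Return the pid stored in pid_file, or None if it is missing or invalid."""
    try:
        with open(pid_file) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        return None
    return pid if pid > 0 else None


def is_running(pid_file):
    """True if the process named in pid_file currently exists."""
    pid = read_pid(pid_file)
    if pid is None:
        return False
    try:
        # Signal 0 performs the existence/permission check without
        # actually delivering a signal.
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True
```

A main loop could call is_running('/var/run/nginx.pid') before each round of health checks and shut down when it returns False.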

Installing Nginxmgr

First, you need to get the package. You can find links to the code below in the section titled “Get The Code”. Once you have the source tarball, you should extract it, then cd into the newly created nginxmgr directory. Once there you will then run: python setup.py install. This should install all the library code and the command line script. If you’re cloning the code (hg clone), all you will need to do is run the setup script as above in the root directory of your newly acquired repository.

Configuring Nginxmgr

There are two different configuration files for Nginxmgr. One is for the core code, and one is for the url_ping check command. You should only need to mess with the core config. Below are example configs with comments explaining what is going on:

/etc/nginxmgr/nginxmgr.cfg:

[core]
# Should I run in the background?
daemon           = True
# Should I enable debug level logging?
debug            = True
# Path to handler scripts
handler_path     = '/etc/nginxmgr/handlers/'
# Path to check scripts
checks_path      = '/etc/nginxmgr/checks/'
# Health check config definitions
healthcheck_defs = '/etc/nginxmgr/health_check_defs.cfg'
# How do you want the log to look?
log_format       = '%(asctime)s %(levelname)s %(threadName)s %(filename)s LINE %(lineno)s: %(message)s'
# Where should I log to?
log_file         = '/var/log/nginxmgr.log'
# Where should I deposit my pid?
pid_file         = '/var/run/nginxmgr.pid'
# Location of nginx base config
ngx_base_config  = '/etc/nginx/nginx.conf'
# Location of nginx binary
ngx_bin          = '/usr/sbin/nginx'
# Location of nginx upstreams config
ngx_config       = '/etc/nginx/conf.d/upstreams.conf'
# Location of nginx pid file
ngx_pid_file     = '/var/run/nginx.pid'
# Interface for XML-RPC to listen on
xml_rpc_ip       = "0.0.0.0"
# Port for XML-RPC to bind to
xml_rpc_port     = 9000

    [healthcheck]
    # Enable health checks?
    enabled = True

        [[checks]]
        # Individual checks

            [[[url_ping_test_upstreams]]]
            # Type maps to a python module in etc/checks
            type     = url_ping
            # Upstream maps to an upstream pool in nginx upstream.conf
            upstream = test-upstreams 
            # Handler maps to a python module:function in etc/handlers
            handler  = url_ping_handler:badupstream
                # Custom stuff to define for custom checks (defined by check module)
                [[[[extras]]]]
                # port — If port is defined, any existing port in the upstream
                # will be stripped and replaced.
                # port     = 2000
                url      = "http://%s/"
                status   = 200
            [[[url_ping_test_upstreams-two]]]
            # Type maps to a python module in etc/checks
            type     = url_ping
            # Upstream maps to an upstream pool in nginx upstream.conf
            upstream = test-upstreams-two
            # Handler maps to a python module:function in etc/handlers
            handler  = url_ping_handler:badupstream
                # Custom stuff to define for custom checks (defined by check module)
                [[[[extras]]]]
                # port — If port is defined, any existing port in the upstream
                # will be stripped and replaced.
                # port     = 2000
                url      = "http://%s/setup"
                status   = 404

/etc/nginxmgr/health_check_defs.cfg:

[url_ping]
# Check a url every little while to make sure it's up
# Corresponds to ./checks/check_name.py:function_name
check=url_ping:ping
# Timeout (in seconds)
schedule=2
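Both configs refer to code with a module:function string (url_ping:ping, url_ping_handler:badupstream). A minimal sketch of how such a string could be turned into a callable follows. The resolve helper is hypothetical, and it assumes the checks or handlers directory is already on sys.path; Nginxmgr's own loading code may work differently.

```python
import importlib


def resolve(spec):
    """Turn a 'module:function' string into the function object it names."""
    module_name, sep, func_name = spec.partition(":")
    if not sep or not module_name or not func_name:
        raise ValueError("expected 'module:function', got %r" % spec)
    module = importlib.import_module(module_name)
    return getattr(module, func_name)
```

With the checks directory importable, resolve("url_ping:ping") would return the ping function from url_ping.py.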

Running, Stopping and Reloading Nginxmgr

To start Nginxmgr, you would run this:

nginxmgr -c /etc/nginxmgr/nginxmgr.cfg

It will then fork off into the background and write a pid file to wherever you specify in the config (default is /var/run/nginxmgr.pid).

To shut it down properly, get the pid from the pid file and send it a kill signal (SIGTERM, SIGQUIT or SIGABRT). If not running in daemon mode, just hit CTRL-C.

To reload the config and the health checks send the pid a SIGHUP.
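The signal behaviour described above can be sketched with Python's signal module. This is an illustration under assumptions, not Nginxmgr's actual code: the DaemonState class and its attribute names are invented here.

```python
import signal

# Signals that ask the daemon to shut down cleanly.
SHUTDOWN_SIGNALS = (signal.SIGTERM, signal.SIGQUIT, signal.SIGABRT)


class DaemonState:
    """Flags that a main loop can poll between health check rounds."""

    def __init__(self):
        self.running = True
        self.reload_requested = False

    def handle(self, signum, frame):
        # Only set flags here; the main loop does the real work.
        if signum in SHUTDOWN_SIGNALS:
            self.running = False
        elif signum == signal.SIGHUP:
            self.reload_requested = True

    def install(self):
        # Must be called from the main thread.
        for sig in SHUTDOWN_SIGNALS + (signal.SIGHUP,):
            signal.signal(sig, self.handle)
```

From the outside, a reload is then just a matter of reading the pid file and calling os.kill(pid, signal.SIGHUP), or using the kill utility.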

You could also use Nginxmgr in an init script without difficulty.

XML-RPC Interface

The XML-RPC interface is ideal for managing upstream pools and nodes on the fly, without having to mess with Nginx itself. It is also nice because it provides a language agnostic way to automate operations on upstream pools and provide a way to monitor the health of the pools.

XML-RPC Exposed Methods

Methods exposed to the XML-RPC interface are:

  • get_pools() — lists all of the pools available
  • get_members('pool_name') — get all members available in specified pool
  • disable_member('pool_name', 'IPADDR:PORT') — Temporarily disable member in pool
  • remove_member('pool_name', 'IPADDR:PORT') — Remove member from pool
  • enable_member('pool_name', 'IPADDR:PORT') — Enable disabled member in pool
  • add_member('pool_name', 'IPADDR:PORT') — Add new member to pool
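As a rough idea of how a monitoring script could use these methods from Python, here is a small sketch. It assumes that get_pools() returns a list of pool names, that get_members() returns a list, and that the service answers at the root path on the configured port; none of that is confirmed here, and the helper names connect and pool_sizes are made up.

```python
import xmlrpc.client


def connect(host="127.0.0.1", port=9000):
    """Build a client proxy; no connection is made until a method is called."""
    return xmlrpc.client.ServerProxy("http://%s:%d/" % (host, port))


def pool_sizes(proxy):
    """Map each pool name to the number of members currently available."""
    return {pool: len(proxy.get_members(pool)) for pool in proxy.get_pools()}


def low_pools(proxy, minimum=2):
    """Names of pools that have fewer than `minimum` members available."""
    return sorted(name for name, size in pool_sizes(proxy).items() if size < minimum)
```

Something like low_pools(connect()) could feed an alert when a pool is running short of servers.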

Things to Note

I made some assumptions when developing this, and as such it should be noted that Nginxmgr has only been tested on Linux (CentOS and Ubuntu) with Python > 2.5. Nginxmgr makes some calls to utilities common to these systems such as ‘ps’, ‘kill’ and ‘pgrep’. The paths are currently hard coded, but that could easily be changed and/or stuffed in the configuration file.

I also make some assumptions about your Nginx setup. Mainly I assume that you separate your Nginx upstreams into their own file. That’s how I prefer to manage it, but it’s not required.

I have a major rewrite planned as an exercise in Test Driven Development as well as taking advantage of practices outlined in Clean Code by Robert (Uncle Bob) Martin.

In this rewrite I will also be experimenting with the Python multiprocessing module rather than threads. I’ve also considered REST instead of XML-RPC.

Documentation is non-existent. Sorry, that is one of the things I would like to work on :)

Purpose of This Blog Post

The purpose of this post is to introduce the software and see if there is even any interest for such a thing in the community. My hope is that someone else will have a use for it and wish to improve upon it.

Get The Code

I use mercurial for DVCS, so you can find the code on bitbucket here: https://bitbucket.org/benjaminws/nginxmgr/

From there you can either clone the repository or download a tarball of the source.

If for whatever reason there is a problem getting the code from bitbucket, I also have a copy of the code here: http://nginxmgr.code.just-another.net/

From there you can clone the repository.

Hope this software helps someone. Thanks for reading!

byteflow/django supervisord nginx = WIN

Recently I made the final step in converting my website over to nginx. I decided to manage my django application, byteflow, with supervisor. I have had great success with this over at AGI and thought I should advocate the success I’ve had to the public by extending it to my own site.

I first heard about supervisor at pycon last year and thought it could be useful in many ways, especially at the office. In a nutshell, supervisor ‘supervises’ processes, and allows you to manage them with a simple interface, but I’ll go into more detail about it later. Around the same time that I discovered supervisor I had also started experimenting with nginx and fastcgi to run my blog. I ended up going with lighty and fastcgi instead, however, mainly due to familiarity. Things change, and so do my opinions on technology. Nginx sold me on its performance and simple configuration, plain and simple. So now, here’s the meat and potatoes.

To break it down, here is what we’re looking at.

INTARWEBS -> nginx -> fastcgi -> django/byteflow(managed by supervisor)

Everything below assumes you have installed nginx, django, byteflow and supervisor.

So how about a look at the actual setup.

First, let’s take a look at the code to fire up the django app. I’ve dubbed it runserver.py and put it in my byteflow code tree.

#!/usr/bin/env python
if __name__ == '__main__':
    from flup.server.fcgi_fork import WSGIServer
    from django.core.handlers.wsgi import WSGIHandler
    WSGIServer(WSGIHandler()).run()

This keeps the django devs happy by not relying on manage.py to keep up with the server stuffs.

It’s pretty simple. It starts up flup’s WSGIServer and uses django’s WSGIHandler to do the dirty work. Not much to it, really.

Now, let’s take a look at the supervisor setup. You can get a lot of what you need by running:

# echo_supervisord_conf > /etc/supervisord.conf

That sets up a basic (but verbose!) supervisor config. I’ve trimmed it down to this:

[unix_http_server]
file=/tmp/supervisor.sock   ; (the path to the socket file)

[supervisord]
logfile=/var/log/supervisord/supervisord.log ; (main log file;default $CWD/supervisord.log)
logfile_maxbytes=50MB       ; (max main logfile bytes b4 rotation;default 50MB)
logfile_backups=10          ; (num of main logfile rotation backups;default 10)
loglevel=info               ; (log level;default info; others: debug,warn,trace)
pidfile=/var/run/supervisord.pid ; (supervisord pidfile;default supervisord.pid)
nodaemon=false              ; (start in foreground if true;default false)
minfds=1024                 ; (min. avail startup file descriptors;default 1024)
minprocs=200                ; (min. avail process descriptors;default 200)
user=nobody                 ; (default is current user, required if root)
childlogdir=/var/log/supervisord/            ; ('AUTO' child log dir, default $TEMP)

[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface

[supervisorctl]
serverurl=unix:///tmp/supervisor.sock ; use a unix:// URL  for a unix socket

; Production setup
[fcgi-program:django_fcgi]
socket=tcp://127.0.0.1:8080  ; We reference this later in nginx
command = /home/bsmith/Dev/byteflow/runserver.py  ; Calls the above code
environment=PYTHONPATH=/home/bsmith/Dev/byteflow,DJANGO_SETTINGS_MODULE=settings  ; Setup needed environment (one comma-separated list)

; Development setup
[fcgi-program:django_dev_fcgi]
socket=tcp://127.0.0.1:8081
command = /home/bsmith/Dev/byteflow_new/runserver.py
environment=PYTHONPATH=/home/bsmith/Dev/byteflow_new,DJANGO_SETTINGS_MODULE=settings

Simple enough, eh? Comments in the configuration should explain what’s going on.

Now you can crank it up by running:

$ sudo supervisord

and check the status…

$ sudo supervisorctl status
django_dev_fcgi:django_dev_fcgi_0 RUNNING    pid 15949, uptime 0:00:08
django_fcgi:django_fcgi_0        RUNNING    pid 15950, uptime 0:00:08

Could it be more simple? I doubt it. This barely begins to scratch the surface of what supervisor can do. Process groups/pools, XML-RPC interface for remote management, built-in web interface for process management (utilizing the XML-RPC interface), tons of process management options (priority, umask, user/group, capture std* pipes, environment variables, auto restart/start, process naming), event listeners/handling and a simple configuration to boot! All I’m sayin’ is, it’s awesome…I’m just sayin’.

Keeping with our simple but awesome theme, enter nginx…

My main config:

user www-data;
# Could vary by number of processors available.
worker_processes  1;

error_log  /var/log/nginx/error.log;
pid        /var/run/nginx.pid;

events {
    worker_connections  1024;
    use epoll;
}

http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;

    access_log      /var/log/nginx/access.log;

    sendfile        on;

    keepalive_timeout  65;
    tcp_nodelay        on;

    gzip  on;

    include /etc/nginx/sites/*;

}

and my various site-specific configurations that I keep in /etc/nginx/sites:

server {
        # We listen on port 80
        listen 80;
        server_name just-another.net;

        # access and error logs for our site
        access_log /var/log/nginx/my_site_access_log;
        error_log /var/log/nginx/my_site_error_log;

        # Configure redirect for our fastcgi server
        # The fastcgi server runs on localhost port 8080
        location / {
                expires 10d;
                # Pass to the fastcgi server; could use a unix socket instead..
                #fastcgi_pass unix:{project_location}/log/django.sock;
                fastcgi_pass 127.0.0.1:8080;
                fastcgi_param PATH_INFO $fastcgi_script_name;
                fastcgi_param REQUEST_METHOD $request_method;
                fastcgi_param QUERY_STRING $query_string;
                fastcgi_param CONTENT_TYPE $content_type;
                fastcgi_param CONTENT_LENGTH $content_length;
                fastcgi_param REMOTE_ADDR $remote_addr;
                fastcgi_pass_header Authorization;
                fastcgi_intercept_errors on;
        }

        location /images {
            # For my own images..
            alias /home/bsmith/public_html/images;
            expires 10d;
        }

        # Alias for static content like themes
        location /static {
            alias /home/bsmith/Dev/byteflow/static;
            expires 10d;
        }

        # Alias for python contrib.admin stuff, needed for admin interface
        location /admin-media {
            # Point to my most recent install of django.
            alias /home/bsmith/Dev/django_trunk/django/contrib/admin/media;
            expires 10d;
        }
        # Point to media
        location /media {
            alias /home/bsmith/public_html/media;
            expires 10d;
        }
        # Use feedburner for my feeds.
        rewrite ^/rpc(.*) http://feeds.feedburner.com/Just-anothernetBlogPosts$1 permanent;
}

Pretty simple, eh? I have the same server setup for my ‘dev’ site, only with a volatile code base that I can hack on.

Now when I make code changes all I need to do is reload the supervisor process like so:

$ sudo supervisorctl restart django_fcgi:django_fcgi_0

for the production site, or replace the process name with django_dev_fcgi:django_dev_fcgi_0 to restart the dev site. I could also restart everything by passing all in place of a process name.

The possibilities here for process management are endless and I couldn’t be happier with how simple it has become to run what I have.

Basically, this setup is easy to get running and easy to keep running. Highly recommended for any lazy django site maintainer like myself!

A cleaner way of extracting a block of text from a file?

I’m writing some code to parse an nginx config file in Python. The goal is to extract all the upstream ‘pools’ and put them into a nice data structure for later use.

I’ve come up with the below solution but am unsure about my approach. This seems like something that should have already been done for some other application, I just couldn’t construct the right search terms to find what I need. I’m also wondering if there is some python trick that I am unaware of that could achieve what I want with less (perceived?) bloat.

Need:

  • Extract every instance of ‘upstream’ in nginx config, make it useful.

Example data:

worker_processes 2;
pid  /var/run/nginx.pid;
error_log /var/log/nginx/error_log debug;
debug_points abort;

events {
  worker_connections  1024;
  use epoll;
  debug_connection 10.10.231.159;
}
http {
  upstream pool1 {
      server 10.10.240.48:8888;
      server 10.10.231.159:8888;
  }
  upstream pool2 {
      server 10.10.240.48:8889;
      server 10.10.231.159:8889;
  }
  server {
      listen 0.0.0.0:80;
      access_log /var/log/nginx/access_log_80;
      location /nginx_status {
          stub_status on;
          access_log off;
          allow all;
      }
      location / {
          proxy_pass http://pool1;
      }
      location /blah {
          proxy_pass http://pool2;
      }
  }
}

Example code (comments explain the logic):

import pprint
import re
import sys

class NgxConfig(object):
    def __init__(self,logging,config_file):
        # Set us up the bomb
        self.logging = logging
        self.upstreams = {}
        try:
            f = open(config_file)
            self.config_file = [line.strip() for line in f.readlines()]
        except IOError:
            self.logging.error("Cannot process nginx config file!")
            sys.exit("Cannot process nginx config file!")
        f.close()

        self.parse_upstreams()

    def parse_upstreams(self): 
        """ Parse upstreams from config
        """
        # Setup markers
        us_start_matched = 0
        us_end_matched = 0 
        # Enumerate over config to keep track of position
        for pos,line in enumerate(self.config_file):
            # See if our line matches "upstream" at all.
            usm = re.search('^upstream([^"]+){',line)
            if usm:
                # Matched upstream, set our position and move on
                us_start_matched = pos
                continue
            # We have a position set for upstream, look for the end of its block
            if us_start_matched != 0 and line == '}':
                # Got the end of the block
                us_end_matched = pos
                # Extract the name of the upstream
                usm = re.search('^upstream\s+([^"]+)\s+{',self.config_file[us_start_matched])
                # Setup list of upstreams
                self.upstreams[usm.group(1)] = []
                # Get the servers in the upstream between the start and end of the block
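                # Drop the leading "server" keyword and trailing ";", only need the server info
                # (str.strip('server; ') strips a set of characters and could eat into hostnames)
                srvs = [re.sub(r'^server\s+|\s*;$', '', s) for s in self.config_file[us_start_matched+1:us_end_matched]]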
                # Set them in the list
                self.upstreams[usm.group(1)] = srvs
                # Reset position markers and move on
                us_start_matched = 0
                us_end_matched = 0
                continue

        pprint.pprint(self.upstreams)

Result:

{'pool1': ['10.10.240.48:8888', '10.10.231.159:8888'],
 'pool2': ['10.10.240.48:8889', '10.10.231.159:8889']}
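For comparison, one possibly leaner direction is to run a regular expression over the whole file instead of tracking line positions. The sketch below is only that, a sketch: it assumes upstream blocks never contain nested braces, it keeps only the address of each server line (dropping any parameters after it), and the function name parse_upstreams is reused here purely for illustration.

```python
import re

# An upstream block: its name, then everything up to the first closing brace.
UPSTREAM_RE = re.compile(r"upstream\s+(\S+)\s*\{(.*?)\}", re.DOTALL)
# A server line inside a block: capture the address only.
SERVER_RE = re.compile(r"^\s*server\s+([^\s;]+)", re.MULTILINE)


def parse_upstreams(text):
    """Return {pool_name: [server, ...]} for every upstream block in text."""
    return {
        name: SERVER_RE.findall(body)
        for name, body in UPSTREAM_RE.findall(text)
    }
```

Called as parse_upstreams(open(config_file).read()), this should produce the same kind of dictionary as the result shown above, though a commented-out upstream block would still be picked up.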

Thoughts, concerns, criticisms?

Thanks!

EDIT: May have found a bug in my blog software, seems that even though I had marked this as a ‘draft’ it was still publicly viewable via tag feed!