Saturday, January 21, 2012

MySQL replication monitoring on Ubuntu 10.04 with Nagios and NRPE

If you're using MySQL replication, then you're probably counting on it for some fairly important need. Monitoring via something like Nagios is generally considered a best practice. This article assumes you've already got your Nagios server set up and your intention is to add an Ubuntu 10.04 NRPE client. This article also assumes the Ubuntu 10.04 NRPE client is your MySQL replication master, not the slave. The OS of the slave does not matter.

Getting the Nagios NRPE client set up on Ubuntu 10.04

At first it wasn't clear which packages to install. I was initially misled by the naming of the nrpe package, but I found the correct packages to be:

sudo apt-get install nagios-nrpe-server nagios-plugins

The NRPE configuration is stored in /etc/nagios/nrpe.cfg, while the plugins are installed in /usr/lib/nagios/plugins/ (or lib64). The installation of this package will also create a user nagios which does not have login permissions. After the packages are installed the first step is to make sure that /etc/nagios/nrpe.cfg has some basic configuration.

Make sure you note the server port (defaults to 5666) and open it on any firewalls you have running. (I got hung up because I forgot I have both a software and hardware firewall running!) Also make sure the server_address directive is commented out; you wouldn't want to only listen locally in this situation. I recommend limiting incoming hosts by using your firewall of choice.
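As a quick sanity check on those two directives, a small shell sketch like the following can report what a given nrpe.cfg will do. The function name check_nrpe_listen is invented for illustration, and the default path is simply the package location mentioned above.

```shell
# Hypothetical helper: summarize the listen settings in an NRPE config file.
# The path argument defaults to the location used by the Ubuntu package.
check_nrpe_listen() {
    cfg=${1:-/etc/nagios/nrpe.cfg}

    # Print the active (uncommented) server_port line, if any.
    grep -E '^[[:space:]]*server_port[[:space:]]*=' "$cfg"

    # An active server_address line restricts which address NRPE binds to.
    if grep -Eq '^[[:space:]]*server_address[[:space:]]*=' "$cfg"; then
        echo "WARNING: server_address is set; NRPE will only listen on that address"
    else
        echo "OK: server_address is commented out"
    fi
}
```

Calling check_nrpe_listen with no arguments inspects /etc/nagios/nrpe.cfg; pass another path to check a copy of the file.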

Choosing what NRPE commands you want to support

Further down in the configuration, you'll see lines like command[check_users]=/usr/lib/nagios/plugins/check_users -w 5 -c 10. These are the commands you plan to offer the Nagios server to monitor. Review the contents of /usr/lib/nagios/plugins/ to see what's available and feel free to add what you feel is appropriate. Well-designed plugins should print usage information if you execute them from the command line. Otherwise, you may need to open your favorite editor and dig in!

After verifying you've got your NRPE configuration completed and made sure to open the appropriate ports on your firewall(s), let's restart the NRPE service:

service nagios-nrpe-server restart

This would also be an appropriate time to confirm that the nagios-nrpe-server service is configured to start on boot. I prefer the chkconfig package to help with this task, so if you don't already have it installed:

sudo apt-get install chkconfig
chkconfig | grep nrpe
 
# You should see...
nagios-nrpe-server     on
 
# If you don't...
chkconfig nagios-nrpe-server on

Pre flight check - running check_nrpe

Before going any further, log into your Nagios server and run check_nrpe and make sure you can execute at least one of the commands you chose to support in nrpe.cfg. This way, if there are any issues, it is obvious now, while we've not started modifying your Nagios server configuration. The location of your check_nrpe binary may vary, but the syntax is the same:

check_nrpe -H host_of_new_nrpe_client -c command_name

If your command outputs something useful and expected, you're on the right track. A common error you might see: Connection refused by host. Here's a quick checklist:

  • Did you start the nagios-nrpe-server service?
  • Run netstat -lunt on the NRPE client to make sure the service is listening on the right address and ports.
  • Did you open the appropriate ports on all your firewall(s)?
  • Is there NAT in the path which needs configuration?
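
The listening check in that list can be scripted. This is a rough sketch, assuming the default port 5666 and the usual netstat column layout in which the local address is the fourth field; the function name is hypothetical.

```shell
# Hypothetical helper: read `netstat -lnt` output on stdin and succeed if
# something is listening on the given TCP port (default 5666).
nrpe_is_listening() {
    port=${1:-5666}
    # Field 4 is the local address, e.g. 0.0.0.0:5666 or :::5666.
    awk -v port="$port" '
        $1 ~ /^tcp/ && $4 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}

# Usage on the NRPE client:
#   netstat -lnt | nrpe_is_listening 5666 && echo "NRPE port is open locally"
```

If this succeeds on the client but check_nrpe still reports a refused connection, the firewall and NAT items are the more likely culprits.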

Adding the check_mysql_replication plugin

There is a lot of noise out there on Google for Nagios plugins which offer MySQL replication monitoring. I wrote the following one using ideas pulled from several existing plugins. It is designed to run on the MySQL master server, check the master's log position, and then compare it to the slave's log position. If there is a difference in position, the alert is considered critical. Additionally, it checks the slave's reported status, and if it is not "Waiting for master to send event", the alert is also considered critical. You can find the source for the plugin at my GitHub account under the project check_mysql_replication. Pull that source down into your plugins directory (/usr/lib/nagios/plugins/, or the lib64 equivalent) and make sure the permissions match the other plugins.
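
To make the decision logic concrete, here is a simplified sketch of the comparison described above. It is not the plugin's actual source: the function name and arguments are invented for illustration, and a real check would obtain the positions and state from MySQL rather than take them as arguments. The return values follow the Nagios plugin convention of 0 for OK and 2 for CRITICAL.

```shell
# Hypothetical sketch of the alert logic, not the real plugin.
# $1: master log position, $2: slave log position, $3: slave reported state
replication_status() {
    if [ "$3" != "Waiting for master to send event" ]; then
        echo "CRITICAL: slave state is '$3'"
        return 2
    fi
    if [ "$1" != "$2" ]; then
        echo "CRITICAL: master at $1, slave at $2"
        return 2
    fi
    echo "OK: master and slave both at $1"
    return 0
}
```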

With the plugin now in place, add a command to your nrpe.cfg.

command[check_mysql_replication]=sudo /usr/lib/nagios/plugins/check_mysql_replication.sh -H <slave_host_address>

At this point you may be saying, WAIT! How will the user running this command (nagios) have login credentials to the MySQL server? Thankfully we can create a home directory for that nagios user, and add a .my.cnf configuration with the appropriate credentials.

usermod -d /home/nagios nagios #set home directory
mkdir /home/nagios
chmod 755 /home/nagios
chown nagios:nagios /home/nagios
 
# create /home/nagios/.my.cnf with your preferred editor with the following:
[client]
user=example_replication_username
password=replication_password
 
chmod 600 /home/nagios/.my.cnf
chown nagios:nagios /home/nagios/.my.cnf

This would again be an appropriate place to run a pre flight check and run check_nrpe from your Nagios server to make sure this configuration works as expected. But first, because the nrpe.cfg entry invokes the plugin through sudo, we need to add this command to the sudoers file.

nagios ALL= NOPASSWD: /usr/lib/nagios/plugins/check_mysql_replication.sh

Wrapping Up

At this point, you should run another check_nrpe command from your server and see the replication monitoring report. If not, go back and check these steps carefully. There are lots of gotchas, and permissions and file ownership are easily overlooked. With this in place, just add the NRPE client using the existing templates you have for your Nagios servers and make sure the monitoring is reporting as expected.

Saturday, January 14, 2012

Using Disqus and Ruby on Rails

Recently, I posted about how to import comments from a Ruby on Rails app to Disqus. This is a follow-up to that post where I outline the implementation of Disqus in a Ruby on Rails site. Disqus provides what it calls Universal Code which can be added to any site. This universal code is just JavaScript, which asynchronously loads the Disqus thread based on one of two unique identifiers Disqus uses.

Disqus in a development environment

Before we get started, I'd recommend that you have two Disqus "sites": one for development and one for production. This will allow you to see real content and experiment with how things will really behave once you're in production. Ideally, your development server would be publicly accessible to allow you to fully use the Disqus moderation interface, but it isn't required. Simply register another Disqus site, and make sure that you have your shortname configured by environment. Feel free to use whatever method you prefer for defining these kinds of application preferences. If you're looking for an easy way, consider checking out my article on Working with Constants in Ruby. It might look something like this:

# app/models/article.rb
 
DISQUS_SHORTNAME = Rails.env == "development" ? "dev_shortname".freeze : "production_shortname".freeze

Disqus Identifiers

Each time you load the universal code, you need to specify a few configuration variables so that the correct thread is loaded:

  • disqus_shortname: tells Disqus which website account (called a forum on Disqus) this system belongs to.
  • disqus_identifier: tells Disqus how to uniquely identify the current page.
  • disqus_url: tells Disqus the location of the page for permalinking purposes.

Let's create a Rails partial to set up these variables for us, so we can easily call up the appropriate comment thread.

<%# app/views/disqus/_thread.html.erb %>
<%# assumes you've passed the local variable 'article' into this partial %>
 
<div id="disqus_thread"></div>
<script type="text/javascript">
 
    var disqus_shortname = '<%= Article::DISQUS_SHORTNAME %>';
    var disqus_identifier = '<%= article.id %>';
    var disqus_url = '<%= url_for(article, :only_path => false) %>';
 
    /* * * DON'T EDIT BELOW THIS LINE * * */
    (function() {
        var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true;
        dsq.src = 'http://' + disqus_shortname + '.disqus.com/embed.js';
        (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq);
    })();
</script>

The above code will populate the div#disqus_thread with the correct content based on your disqus_identifier. By setting up a single partial that will always render your threads, it becomes very easy to adjust this code if needed.

Disqus Identifier Gotcha

During our testing, we found a surprising behavior in how Disqus associates a thread with a URL. In our application, the landing page was designed to show the newest article as well as the Disqus comments thread. We found that once a new article was posted, the comments from the previous article were still shown! It seems Disqus ignored the unique disqus_identifier we had specified and instead associated the thread with the landing page URL. In our case, a simple routing change allowed us to forward the user to the unique URL for that content and thread. In your case, there may not be such an easy workaround, so be certain you include both the disqus_identifier and disqus_url JavaScript configuration variables above to minimize the assumptions Disqus will make. When at all possible, always use unique URLs for displaying Disqus comments.

Comment Counters

Often an index page will want to display a count of how many comments are in a particular thread. Disqus uses the same asynchronous approach to loading comment counts. Comment counts are shown by adding code such as the following where you want to display your count:

# HTML
<a href="http://example.com/article1.html#disqus_thread"
   data-disqus-identifier="<%=@article.id%>">
This will be replaced by the comment count
</a>
 
# Rails helper
<%= link_to "This will be replaced by the comment count",
    article_path(@article, :anchor => "disqus_thread"),
    :"data-disqus-identifier" => @article.id %>

At first this seemed strange, but it is the exact same pattern used to display the thread. It would likely be best to remove the link text so nothing is shown until the comment count is loaded, but I felt that, for my example, giving the text some meaning would help understanding. Additionally, you'll need to add the following JavaScript to your page.

<%# app/views/disqus/_comment_count_javascript.html.erb %>
<%# add once per page, just above </body> %>
 
<script type="text/javascript">
    
    var disqus_shortname = '<%= Article::DISQUS_SHORTNAME %>';
 
    /* * * DON'T EDIT BELOW THIS LINE * * */
    (function () {
        var s = document.createElement('script'); s.async = true;
        s.type = 'text/javascript';
        s.src = 'http://' + disqus_shortname + '.disqus.com/count.js';
        (document.getElementsByTagName('HEAD')[0] || document.getElementsByTagName('BODY')[0]).appendChild(s);
    }());
</script>

Disqus recommends adding it just before the closing </body> tag. You only need to add this code ONCE per page, even if you're planning on showing multiple comment counts on a page. You will need this code on any page with a comment count, so I do recommend putting it in a partial. If you wanted, you could even include it in a layout.

Styling Comment Counts

Disqus provides extensive CSS documentation for its threads, but NONE for its comment counters. In our application, we had some very particular style requirements for these comment counts. I found that in Settings > Appearance, I could add HTML tags around the output of the comments.

This allowed me to style my comment counts as needed, although these fields are pretty small, so make sure to compress your HTML as much as possible.

Tuesday, January 3, 2012

Automating removal of SSH key patterns

Every now and again, it becomes necessary to remove a user's SSH key from a system. At End Point, we'll often allow multiple developers into multiple user accounts, so cleaning up these keys can be cumbersome. I decided to write a shell script to brush up on those skills, make sure I completed my task comprehensively, and automate future work.

Initial Design and Dependencies

My plan for this script was to accept a single argument which would be used to search the system's authorized_keys files. If the pattern was found, the script would offer you the opportunity to delete the line of the file on which the pattern was found.

I've always found mlocate to be very helpful; it makes finding files extremely fast and its usage is trivial. For this script, we'll use the output from locate to find all authorized_keys files in the system. Of course, we'll want to make sure that the mlocate.db has recently been updated. So let's show the user when the database was last updated and offer them a chance to update it.

mlocate_path="/var/lib/mlocate/mlocate.db"
if [ -r $mlocate_path ]
then
    echo -n "mlocate database last updated: "
    stat -c %y $mlocate_path
    echo -n "Do you want to update the locate database this script depends on? [y/n]: "
    read update_locate
    if [ "$update_locate" = "y" ]
    then
        echo "Updating locate database.  This may take a few minutes..."
        updatedb
        echo "Update complete."
    fi 
else
    echo "Cannot read the mlocate db path: $mlocate_path"
    exit 2
fi

First we define the path where we can find the mlocate database. Then we check to see if we can read that file. If we can't read the file, we let the user know and exit. If we can read the file, we print the date and time it was last modified and offer the user a chance to update the database. While this is functional, it's pretty brittle. Let's make things a bit more flexible by letting locate tell us where its database is.

if
    mlocate_path=`locate -S`
then
    # locate -S command will output database path in following format:
    # Database /full/path/to/db: (more output)...
    mlocate_path=${mlocate_path%:*} #remove content after colon
    mlocate_path=${mlocate_path#'Database '*} #remove 'Database '
else
    echo "Couldn't run locate command.  Is mlocate installed?"
    exit 5
fi

Instead of hard-coding the path to the database, we collect the locate database details using the -S parameter. By using the shell's parameter expansion operators (% to strip a suffix, # to strip a prefix), we can tease out the file path from the output.
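
To see those two expansions in isolation, here is a small sketch run against a made-up sample of locate -S output. The path and the directory count are illustrative values only, not taken from a real system.

```shell
# Made-up sample in the format described above (values are illustrative).
stats='Database /var/lib/mlocate/mlocate.db:
	1,234 directories'

path=${stats%:*}            # strip the last colon and everything after it
path=${path#'Database '}    # strip the leading "Database " label
echo "$path"
```

The % operator removes the shortest matching suffix and # removes the shortest matching prefix, so only the path itself remains.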

Because we are going to offer to update the locate database (as well as eventually manipulate authorized_keys files), it makes sense to check that we are root before proceeding. Additionally, let's check to see that we get a pattern from our user, and provide some usage guidance.

if [ ! `whoami` = "root" ]
then
    echo "Please run as root."
    exit 4
fi
 
if [ -z "$1" ]
then
    echo "Usage: check_authorized_keys PATTERN"
    exit 3
fi

Checking and modifying authorized_keys for a pattern

With some prerequisites in place, we're finally ready to scan the system's authorized_keys files. Let's just start with the syntax for that loop.

for key_file in `locate authorized_keys`; do
    echo "Searching $key_file..."
done

We do not specify a dollar sign ($) in front of key_file when defining the loop, but once inside our loop we use the regular syntax. We use command substitution by placing back quotes (`) around the command whose output we want to use. We're now scanning each file, but how do we find matching entries?

IFS=$'\n'
for matching_entry in `grep "$1" $key_file`; do
    IFS=' '
    echo "Found an entry in $key_file:"
    echo $matching_entry
done

For each $key_file, we now grep for our user's pattern ($1) and store each matching line in $matching_entry. We have to change the Input Field Separator (IFS) to a newline, instead of the default of space, tab, and newline, in order to capture each grepped line in its entirety. (Thanks to Brian Miller for that one!)
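
The effect of IFS on the loop is easy to demonstrate on its own. The sketch below uses two made-up key entries and a hypothetical count_pieces function; it is written with a literal newline rather than $'\n' so that it also runs in shells other than bash.

```shell
# Two made-up entries standing in for the output of grep.
sample='ssh-rsa AAAA1 alice@example
ssh-rsa AAAA2 bob@example'

# Count the pieces an unquoted expansion of $sample splits into.
# The body runs in a subshell, so the IFS change does not leak out.
count_pieces() (
    if [ "$1" = "lines" ]; then
        IFS='
'       # a literal newline, equivalent to $'\n' in bash
    fi
    set -f              # no filename globbing while splitting
    n=0
    for piece in $sample; do
        n=$((n + 1))
    done
    echo "$n"
)

count_pieces words   # default IFS: prints 6
count_pieces lines   # newline-only IFS: prints 2
```

With the default IFS each key line falls apart into three words; with a newline-only IFS the loop sees one piece per line, which is what the script needs.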

With a matching entry found in a key file, it's time to finally offer the user a chance to remove the entry.

echo "Found an entry in $key_file:"
echo $matching_entry
echo -n "Remove entry? [y/n]: "
read remove_entry
if [ "$remove_entry" = "y" ]
then
    if [ ! -w $key_file ]
    then
        echo "Cannot write to $key_file."
        exit 1
    else
        sed -i "/$matching_entry/d" $key_file
        echo "Deleted."
    fi
else
    echo "Not deleted."
fi

We ask the user whether they want to delete the shown entry, verify we can write to the $key_file, and then delete the $matching_entry. By using the -i option to the sed command, we are able to make modifications in place.
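
One caveat with the sed approach: the matched line is interpolated into a regular expression delimited by slashes, and the base64 data in a key line can contain a / character, which sed would read as the end of the pattern. A possible alternative, sketched here with a hypothetical helper name, is to drop the line with a fixed-string, whole-line grep instead.

```shell
# Hypothetical helper: remove every line exactly equal to $1 from file $2.
remove_exact_line() {
    tmp=$(mktemp) || return 1
    status=0
    # -F: fixed string, -x: match the whole line, -v: keep non-matching lines
    grep -vxF -- "$1" "$2" > "$tmp" || status=$?
    # grep exits 1 when no lines remain; only 2 or higher is a real error.
    if [ "$status" -gt 1 ]; then
        rm -f "$tmp"
        return 1
    fi
    cat "$tmp" > "$2"   # rewrite the original file, keeping its owner and mode
    rm -f "$tmp"
}
```

It would be called as remove_exact_line "$matching_entry" "$key_file" in place of the sed line. As with the original script, try it on a copy of a key file first.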

The Final Product

I'm sure there is a lot of room for improvement on this script and I'd welcome pull requests on the GitHub repo I set up for this little block of code. As always, be very careful when running automated scripts as root. Please test this script out on a non-production system before use.

#!/bin/bash
 
if [ ! `whoami` = "root" ]
then
    echo "Please run as root."
    exit 4
fi
 
 
if [ -z "$1" ]
then
    echo "Usage: check_authorized_keys PATTERN"
    exit 3
fi
 
if
    mlocate_path=`locate -S`
then
    # locate -S command will output database path in following format:
    # Database /full/path/to/db: (more output)...
    mlocate_path=${mlocate_path%:*} #remove content after colon
    mlocate_path=${mlocate_path#'Database '*} #remove 'Database '
else
    echo "Couldn't run locate command.  Is mlocate installed?"
    exit 5
fi
 
if [ -r $mlocate_path ]
then
    echo -n "mlocate database last updated: "
    stat -c %y $mlocate_path
    echo -n "Do you want to update the locate database this script depends on? [y/n]: "
    read update_locate
    if [ "$update_locate" = "y" ]
    then
        echo "Updating locate database.  This may take a few minutes..."
        updatedb
        echo "Update complete."
        echo ""
    fi
else
    echo "Cannot read from $mlocate_path"
    exit 2
fi
 
for key_file in `locate authorized_keys`; do
    echo "Searching $key_file..."
    IFS=$'\n'
    for matching_entry in `grep "$1" $key_file`; do
        IFS=' '
        echo "Found an entry in $key_file:"
        echo $matching_entry
        echo -n "Remove entry? [y/n]: "
        read remove_entry
        if [ "$remove_entry" = "y" ]
        then
            if [ ! -w $key_file ]
            then
                echo "Cannot write to $key_file."
                exit 1
            else
                sed -i "/$matching_entry/d" $key_file
                echo "Deleted."
            fi
        else
            echo "Not deleted."
        fi
    done
done
 
echo "Search complete."