Monitoring stack using CollectD and Graphite

Install Graphite

On the Graphite server, install the Graphite packages:

sudo apt-get update

sudo apt-get install graphite-web graphite-carbon

Install and configure PostgreSQL

sudo apt-get install postgresql libpq-dev python-psycopg2

Switch to the postgres user and create the database user graphite_user:

su - postgres

Note: the postgres user has no password by default. Either set one with sudo passwd postgres, or switch users with sudo su - postgres instead.

createuser graphite_user --pwprompt

Create the graphite_db database, owned by graphite_user:

createdb -O graphite_user graphite_db

Once this is done, you can switch back to previous user:

logout

Configure Graphite

Edit the file /etc/graphite/local_settings.py and set the DATABASES values to match the database and user we created in PostgreSQL:

DATABASES = {
    'default': {
        'NAME': 'graphite_db',
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'USER': 'graphite_user',
        'PASSWORD': 'graphite_user_password',
        'HOST': '127.0.0.1',
        'PORT': ''
    }
}

Uncomment the SECRET_KEY line and set the key used for hashing:

SECRET_KEY = 'a_salty_string'

Note: pick any long, random string of your own; do not reuse the example value above.
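
One way to produce such a string is Python's standard secrets module. The following is a minimal sketch, assuming Python 3.6 or later is available on some machine (it does not have to be the Python that runs Graphite); the function name make_secret_key and the 50-byte length are illustrative choices, not something Graphite requires.

```python
import secrets


def make_secret_key(num_bytes: int = 50) -> str:
    """Return a URL-safe random string built from num_bytes random bytes."""
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Paste the printed line into /etc/graphite/local_settings.py
    print("SECRET_KEY = %r" % make_secret_key())
```

Each run prints a different value, so generate the key once and keep it in local_settings.py.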

Uncomment and set the time zone that will be displayed on graphs:

TIME_ZONE = 'Asia/Hong_Kong'

Enable authentication to save graph data:

USE_REMOTE_USER_AUTHENTICATION = True

After saving and closing the file, we need to sync the database:

sudo graphite-manage syncdb

If the installed Django is newer than 1.8, the syncdb command is no longer available; run the migrations instead:

cd /usr/lib/python2.7/dist-packages/graphite/

sudo python manage.py migrate auth

sudo python manage.py migrate

Configure Carbon

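Replace the default 0.0.0.0 listen address in carbon.conf with the server's IP (sed keeps a backup as carbon.conf.bak):

sudo sed -i.bak 's/0\.0\.0\.0/YOUR_SERVER_IP/g' /etc/carbon/carbon.conf

Copy the example storage-aggregation configuration into place:

sudo cp /usr/share/doc/graphite-carbon/examples/storage-aggregation.conf.example /etc/carbon/storage-aggregation.conf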

Enable Carbon to start on boot by editing file /etc/default/graphite-carbon and changing CARBON_CACHE_ENABLED to true:

CARBON_CACHE_ENABLED=true

Save the file and start the carbon-cache daemon:

sudo service carbon-cache start
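
To confirm that carbon-cache is actually accepting connections, a quick TCP check can help. This is a minimal sketch, not part of Graphite: port_open is a hypothetical helper, and port 2003 is assumed to be the plaintext listener port (the same port used in the write_graphite configuration later in this guide). Pass YOUR_SERVER_IP as the first argument.

```python
import socket
import sys


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumption: the host defaults to localhost when no argument is given.
    host = sys.argv[1] if len(sys.argv) > 1 else "127.0.0.1"
    print("carbon port 2003 open on %s: %s" % (host, port_open(host, 2003)))
```

Because the sed command above binds Carbon to YOUR_SERVER_IP, checking 127.0.0.1 may report False even when the daemon is running.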

Install and configure Apache + wsgi

Graphite-web is a Django application, and Django recommends serving it through WSGI, so we can run it on Apache with mod_wsgi, nginx with Gunicorn, or nginx with uWSGI. We will use Apache because it has good logging support and authentication modules.

Install the Apache packages:

sudo apt-get install apache2 libapache2-mod-wsgi

Disable default Apache site:

sudo a2dissite 000-default

Copy Graphite’s virtual host template to Apache’s available sites directory:

sudo cp /usr/share/graphite-web/apache2-graphite.conf /etc/apache2/sites-available

Enable the Graphite virtual host and reload Apache to apply the changes:

sudo a2ensite apache2-graphite

sudo service apache2 reload

We can now access Graphite interface by browsing to http://YOUR_SERVER_IP.

Install and configure CollectD

Collectd is a daemon that collects system performance statistics and sends them to Graphite. It is easy to configure and has a large number of plugins. It uses Carbon's plaintext protocol to send data series to Graphite/Carbon.
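
For reference, each datapoint in the plaintext protocol is a single line of the form "metric.path value timestamp" terminated by a newline. The sketch below shows that format in Python; format_metric and send_metric are hypothetical helper names, and the metric path test.manual.value is made up for illustration.

```python
import socket
import time


def format_metric(path, value, timestamp):
    # One plaintext datapoint: "<path> <value> <unix timestamp>" plus newline.
    return ("%s %s %d\n" % (path, value, timestamp)).encode("ascii")


def send_metric(host, port, path, value, timestamp=None):
    # Open a TCP connection to Carbon and send a single datapoint.
    if timestamp is None:
        timestamp = int(time.time())
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(format_metric(path, value, timestamp))


if __name__ == "__main__":
    # Only prints the line; nothing is sent over the network here.
    print(format_metric("test.manual.value", 42, int(time.time())).decode(), end="")
```

Calling send_metric("YOUR_SERVER_IP", 2003, "test.manual.value", 42) should make the metric appear in the Graphite tree, assuming Carbon listens on port 2003 as configured in this guide.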

Install collectd packages:

sudo apt-get install collectd collectd-utils

Edit the /etc/collectd/collectd.conf file and enable these plugins to collect various system data and push it to Graphite:

LoadPlugin cpu
LoadPlugin df
LoadPlugin interface
LoadPlugin load
LoadPlugin memory
LoadPlugin ping
LoadPlugin write_graphite
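
Plugins that are still commented out in collectd.conf are easy to miss. The following is a rough sketch of a checker, assuming the usual LoadPlugin syntax (a plain line or a block opener, with an optional quoted name); enabled_plugins and missing_plugins are hypothetical helpers, and REQUIRED simply mirrors the list above.

```python
import os
import re

# Matches "LoadPlugin cpu", 'LoadPlugin "cpu"' and "<LoadPlugin cpu>",
# but not lines commented out with a leading "#".
LOADPLUGIN_RE = re.compile(r'^\s*<?LoadPlugin\s+"?([A-Za-z0-9_]+)"?')

REQUIRED = {"cpu", "df", "interface", "load", "memory", "ping", "write_graphite"}


def enabled_plugins(conf_text):
    """Return the set of plugin names loaded by uncommented lines."""
    plugins = set()
    for line in conf_text.splitlines():
        match = LOADPLUGIN_RE.match(line)
        if match:
            plugins.add(match.group(1))
    return plugins


def missing_plugins(conf_text):
    """Return the required plugins that are not enabled in conf_text."""
    return REQUIRED - enabled_plugins(conf_text)


if __name__ == "__main__":
    path = "/etc/collectd/collectd.conf"
    if os.path.isfile(path):
        with open(path) as handle:
            print("missing plugins:", sorted(missing_plugins(handle.read())))
```

An empty list means every plugin named in this guide is loaded.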

The next plugin to configure is write_graphite, which pushes data to the Graphite server. In this example the server listens on TCP port 2003 at 192.168.3.12. Multiple Node stanzas can be defined to push to several Graphite servers.

<Plugin write_graphite>
    <Node "graphite">
        Host "192.168.3.12"
        Port "2003"
        Protocol "tcp"
        LogSendErrors true
        Prefix "collectd."
        StoreRates true
        AlwaysAppendDS false
        EscapeCharacter "_"
    </Node>
</Plugin>

On the Graphite server, edit /etc/carbon/storage-schemas.conf to configure the storage parameters. Add a [collectd] stanza below the [carbon] stanza but before the [default_1min_for_1day] stanza:

[collectd]
pattern = ^collectd.*
retentions = 10s:1h,1m:1d,10m:1y
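
Each precision:duration pair in the retentions line defines one archive: how often a point is stored and for how long. As a worked example, this small sketch converts the line above into (seconds per point, number of points) pairs. parse_retentions and to_seconds are hypothetical helpers written for illustration, and a year is assumed to be 365 days.

```python
import re

# Seconds per unit; assumption: a year counts as 365 days.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800, "y": 31536000}


def to_seconds(spec):
    """Convert a value such as '10s' or '1d' into seconds."""
    match = re.fullmatch(r"(\d+)([smhdwy])", spec.strip())
    if match is None:
        raise ValueError("unsupported time spec: %r" % spec)
    return int(match.group(1)) * UNIT_SECONDS[match.group(2)]


def parse_retentions(text):
    """Return a list of (seconds_per_point, number_of_points) per archive."""
    archives = []
    for part in text.split(","):
        precision, duration = part.split(":")
        step = to_seconds(precision)
        archives.append((step, to_seconds(duration) // step))
    return archives


if __name__ == "__main__":
    for step, points in parse_retentions("10s:1h,1m:1d,10m:1y"):
        print("one point every %ds, %d points kept" % (step, points))
```

For the stanza above this gives 360, 1440 and 52560 points, so every metric keeps 54360 slots in total. The finest precision (10s) is chosen to match collectd's default collection interval of 10 seconds; if collectd is configured with a different Interval, the first retention should be adjusted to match.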

For the changes to take effect, restart the carbon-cache service on the Graphite server and the collectd service on the monitored host:

sudo service carbon-cache stop
sudo service carbon-cache start
sudo service collectd restart
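
Once both services are back up, one way to check that data is arriving is to query Graphite-web's render endpoint with format=json. The sketch below builds such a URL and optionally fetches it. render_url and fetch_series are hypothetical helpers, and the target collectd.*.load.load.shortterm is an assumption about how the load plugin's metrics are named under the "collectd." prefix; browse the metric tree in the web interface to find the real paths.

```python
import json
import sys
from urllib.parse import urlencode
from urllib.request import urlopen


def render_url(base_url, target, since="-10min"):
    """Build a Graphite render URL that returns JSON for the given target."""
    query = urlencode({"target": target, "from": since, "format": "json"})
    return "%s/render?%s" % (base_url.rstrip("/"), query)


def fetch_series(base_url, target, since="-10min"):
    """Fetch and decode the JSON series list from the render endpoint."""
    with urlopen(render_url(base_url, target, since), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    target = "collectd.*.load.load.shortterm"  # assumed metric path
    if len(sys.argv) > 1:
        # Example: python check_graphite.py http://YOUR_SERVER_IP
        for series in fetch_series(sys.argv[1], target):
            print(series["target"], len(series["datapoints"]), "datapoints")
    else:
        print(render_url("http://YOUR_SERVER_IP", target))
```

An empty result usually means either that no data has arrived yet or that the target pattern does not match the metric names collectd is sending.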