Channel: Nickebo.net

Setting up an HA Graylog cluster

Graylog is one of the best OSS projects I've ever come across. For a simple setup (all components on the same machine) there's a pretty good guide in the Graylog documentation, but I've been looking for a guide on how to set it up as an HA cluster where every component is highly available. That is what this blog post is about. I will be setting up the following:

  • 3x ElasticSearch in a sharding cluster
  • 3x MongoDB in a replicating cluster
  • 2x Graylog server nodes
  • 2x HAProxy and keepalived nodes for TCP (HAProxy) and UDP (keepalived) load balancing
  • 1x Graylog Web interface

This guide focuses on CentOS 6.x and earlier, but that only matters for the firewall configuration. Everything else should be more or less universal.

ElasticSearch

To start with, install ElasticSearch of the same version on all three nodes. In my cluster I'm using version 1.4.4.
Now to the configuration. Change the following lines on all nodes in elasticsearch.yml:

cluster.name: graylog2  
node.name: "[name of node, example elasticsearch01]"  

If needed change the number of shards and replicas. I'm using 5 and 1 (standard):

index.number_of_shards: 5  
index.number_of_replicas: 1  

I use unicast ping. To do this, disable multicast ping and then specify the cluster nodes:

discovery.zen.ping.multicast.enabled: false  
discovery.zen.ping.unicast.hosts: ["IP_of_node1:9300", "IP_of_node2:9300", "IP_of_node3:9300"]

ElasticSearch firewall configuration

For the ElasticSearch nodes to be able to communicate, the firewall has to allow ports 9200 (HTTP API) and 9300 (node-to-node transport):

-A INPUT -m state --state NEW -m tcp -p tcp --dport 9200 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 9300 -j ACCEPT

If you want, you can restrict these rules to the IPs of the other nodes in the cluster. When this is done, start ElasticSearch on all nodes. To check the cluster status, use this command:

[root@es02 ~]# curl -XGET 'http://localhost:9200/_cluster/health?pretty=true'
{
  "cluster_name" : "graylog2",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 5,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 5,
  "active_shards" : 10,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0
}

Here I have 5 nodes (two of which are Graylog nodes, which will be covered later) and 3 data nodes, the ones we just set up. The number of shards corresponds to what was configured earlier: 5 primary shards and 5 replica shards, 10 in total.
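If you'd rather not eyeball the JSON every time, the same checks can be scripted. The following is only a rough sketch: the function names are my own, and it assumes a single index with the 5 shards / 1 replica settings from above (pass a different indices value if you have more than one index).

```python
def expected_active_shards(primaries, replicas, indices=1):
    """Every primary shard has `replicas` copies: primaries * (1 + replicas) per index."""
    return indices * primaries * (1 + replicas)


def health_problems(health, data_nodes=3, primaries=5, replicas=1, indices=1):
    """Return a list of problems found in a parsed _cluster/health response.

    An empty list means the cluster looks the way this guide expects.
    """
    problems = []
    if health.get("status") != "green":
        problems.append("status is %r, not green" % health.get("status"))
    if health.get("number_of_data_nodes") != data_nodes:
        problems.append("expected %d data nodes, got %r"
                        % (data_nodes, health.get("number_of_data_nodes")))
    if health.get("unassigned_shards", 0) > 0:
        problems.append("%d unassigned shards" % health["unassigned_shards"])
    expected = expected_active_shards(primaries, replicas, indices)
    if health.get("active_shards") != expected:
        problems.append("expected %d active shards, got %r"
                        % (expected, health.get("active_shards")))
    return problems


# The health output shown above, as a Python dict (e.g. from json.loads).
sample = {
    "cluster_name": "graylog2",
    "status": "green",
    "timed_out": False,
    "number_of_nodes": 5,
    "number_of_data_nodes": 3,
    "active_primary_shards": 5,
    "active_shards": 10,
    "relocating_shards": 0,
    "initializing_shards": 0,
    "unassigned_shards": 0,
}

print(health_problems(sample))  # prints []
```

Feed it the output of the curl command above (parsed with json.loads) and alert on anything that isn't an empty list.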

MongoDB

Start by installing MongoDB. MongoDB will be configured as a three-node replica set. Set the following in mongod.conf on every node:

replSet = graylog

MongoDB firewall config

The firewall has to allow port 27017.

-A INPUT -m state --state NEW -m tcp -p tcp --dport 27017 -j ACCEPT

Initiating the cluster and adding members

Issue the following command on the first node:

mongo  

Now initiate the replica set:

rs.initiate()  

Add the new members:

rs.add("mongodb02.example.com")  
rs.add("mongodb03.example.com")  

Now, use the rs.status() command to make sure that everything looks alright.

graylog:PRIMARY> rs.status()  
{
    "set" : "graylog",
    "date" : ISODate("2015-06-01T11:07:08Z"),
    "myState" : 1,
    "members" : [
        {
            "_id" : 0,
            "name" : "mongodb01.example.com:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 165272,
            "optime" : Timestamp(1433156828, 1),
            "optimeDate" : ISODate("2015-06-01T11:07:08Z"),
            "electionTime" : Timestamp(1432991818, 1),
            "electionDate" : ISODate("2015-05-30T13:16:58Z"),
            "self" : true
        },
        {
            "_id" : 1,
            "name" : "mongodb02.example.com:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 165018,
            "optime" : Timestamp(1433156826, 2),
            "optimeDate" : ISODate("2015-06-01T11:07:06Z"),
            "lastHeartbeat" : ISODate("2015-06-01T11:07:06Z"),
            "lastHeartbeatRecv" : ISODate("2015-06-01T11:07:07Z"),
            "pingMs" : 0,
            "syncingTo" : "mongodb01.example.com:27017"
        },
        {
            "_id" : 2,
            "name" : "mongodb03.example.com:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 165172,
            "optime" : Timestamp(1433156826, 2),
            "optimeDate" : ISODate("2015-06-01T11:07:06Z"),
            "lastHeartbeat" : ISODate("2015-06-01T11:07:06Z"),
            "lastHeartbeatRecv" : ISODate("2015-06-01T11:07:06Z"),
            "pingMs" : 0,
            "syncingTo" : "mongodb01.example.com:27017"
        }
    ],
    "ok" : 1
}

MongoDB is now set up with three-way replication: one PRIMARY and two SECONDARY members, all with health 1.
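What "looks alright" means in the rs.status() output can be written down as a few simple rules. This is a minimal sketch with a function name I made up, and it assumes you have the status document available as a Python dict (for example through a MongoDB driver, which is not shown here). It only looks at the members, health and stateStr fields seen above.

```python
def replica_set_problems(status, expected_members=3):
    """Check an rs.status()-style document; an empty list means it looks healthy."""
    problems = []
    members = status.get("members", [])
    if len(members) != expected_members:
        problems.append("expected %d members, found %d"
                        % (expected_members, len(members)))
    primaries = [m.get("name") for m in members if m.get("stateStr") == "PRIMARY"]
    if len(primaries) != 1:
        problems.append("expected exactly one PRIMARY, found %d" % len(primaries))
    for m in members:
        if m.get("health") != 1:
            problems.append("%s is unhealthy" % m.get("name"))
        if m.get("stateStr") not in ("PRIMARY", "SECONDARY"):
            problems.append("%s is in state %s" % (m.get("name"), m.get("stateStr")))
    return problems


# Trimmed-down version of the output above.
sample_status = {
    "set": "graylog",
    "members": [
        {"name": "mongodb01.example.com:27017", "health": 1, "stateStr": "PRIMARY"},
        {"name": "mongodb02.example.com:27017", "health": 1, "stateStr": "SECONDARY"},
        {"name": "mongodb03.example.com:27017", "health": 1, "stateStr": "SECONDARY"},
    ],
    "ok": 1,
}

print(replica_set_problems(sample_status))  # prints []
```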

Graylog

Install graylog-server on your nodes; I'll be using two nodes as graylog-server nodes.
Configure the nodes like this:

password_secret = [something very secret]
root_password_sha2 = [password hash]  
# Or whatever your time zone is
root_timezone = CET
rest_listen_uri = http://[node IP]:12900/
elasticsearch_shards = 5  
elasticsearch_replicas = 1  
elasticsearch_discovery_zen_ping_multicast_enabled = false  
elasticsearch_discovery_zen_ping_unicast_hosts = [elasticsearch node 01 IP]:9300,[elasticsearch node 02 IP]:9300,[elasticsearch node 03 IP]:9300  
mongodb_uri = mongodb://[MongoDB IP 01]:27017,[MongoDB IP 02]:27017,[MongoDB IP 03]:27017/graylog?replicaSet=graylog  

Also check the config file for any other options you might need. In particular, only one of the graylog-server nodes should have is_master = true.
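The two bracketed values at the top of the config have to be generated. root_password_sha2 is the SHA-256 hash of the admin password as a hex string, and password_secret is just a long random string. Here is a small sketch of how both could be produced; the function names and the 48-byte length are my own choices, not anything Graylog prescribes.

```python
import hashlib
import secrets


def root_password_sha2(password):
    """Hex-encoded SHA-256 digest of the password, as used for root_password_sha2."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def new_password_secret(nbytes=48):
    """Random hex string for password_secret (two hex characters per byte)."""
    return secrets.token_hex(nbytes)


if __name__ == "__main__":
    import getpass

    print("password_secret = " + new_password_secret())
    print("root_password_sha2 = " + root_password_sha2(getpass.getpass("Admin password: ")))
```

Generate password_secret once and use the same value on both graylog-server nodes.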

Graylog firewall config

Graylog will need some firewall configuration as well.

-A INPUT -m state --state NEW -m tcp -p tcp --dport 12900 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 9350 -j ACCEPT

Don't forget to add firewall configuration for your inputs, like 514 TCP/UDP for syslog.

HAProxy

HAProxy will take care of the TCP load balancing. I've configured it to listen on port 514 with my two graylog nodes added as backend servers.

global  
log 127.0.0.1 local0  
pidfile /var/run/haproxy.pid  
debug

defaults  
option dontlognull  
option redispatch  
option contstats  
retries 3  
timeout connect 5s  
timeout queue 30s  
timeout tarpit 1m  
backlog 10000  
option tcplog  
option redispatch  
log global  
timeout client 300s  
timeout server 300s  
default-server inter 3s rise 2 fall 3

# HTTP server for admin status check
listen stats 0.0.0.0:8000    #Listen on all IP's on port 8000  
mode http  
balance  
timeout client 5000  
timeout connect 4000  
timeout server 30000

#This is the virtual URL to access the stats page
stats uri /haproxy_stats

#Authentication realm. This can be set to anything. Escape space characters with a backslash.
stats realm HAProxy\ Statistics  
#The user/pass you want to use. Change this password!
stats auth admin:secret123

#This allows you to take down and bring up back end servers.
#This will produce an error on older versions of HAProxy.
stats admin if TRUE

# Graylog input
listen graylogsys_1 172.16.0.26:514  
    mode tcp
    balance roundrobin
    option tcplog
    option tcpka
    option httpchk GET /system/lbstatus
    http-check expect string ALIVE
    server graylog_1 [graylog-server ip 1]:514 check port 12900
    server graylog_2 [graylog-server ip 2]:514 check port 12900
    maxconn 10000

To check whether a backend server is up or down, HAProxy requests [graylog IP]:12900/system/lbstatus and checks whether the response says "ALIVE". If it does, the node is OK; otherwise it is removed and does not get any traffic. I've also changed the name of the input to graylogsys_2 for the second HAProxy instance, so that I know which one is active when I connect to 172.16.0.26:8000/haproxy_stats with my browser.
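When a node gets kicked out of the backend unexpectedly, it helps to run the same check by hand. This sketch mimics what the httpchk / http-check expect pair above does: fetch /system/lbstatus on port 12900 and look for the string ALIVE in the body. The function names are mine, and treating any HTTP or connection error as "down" is an assumption on my part.

```python
import urllib.error
import urllib.request


def is_alive(body):
    """Same rule as 'http-check expect string ALIVE': the body must contain ALIVE."""
    return "ALIVE" in body


def probe(node_ip, port=12900, timeout=3.0):
    """Fetch /system/lbstatus from a graylog-server node and apply is_alive()."""
    url = "http://%s:%d/system/lbstatus" % (node_ip, port)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return is_alive(resp.read().decode("utf-8", "replace"))
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout or an HTTP error status: treat as down.
        return False
```

Calling probe() with the IP of each graylog-server node from one of the load balancers should return True for every node HAProxy shows as up.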

HAProxy firewall config

HAProxy needs port 8000 (stats page) and port 514 (syslog input) opened:

-A INPUT -m state --state NEW -m tcp -p tcp --dport 8000 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 514 -j ACCEPT

HAProxy SELinux config

This SELinux boolean needs to be enabled so that HAProxy is allowed to connect to any port:

setsebool -P haproxy_connect_any 1

HAProxy status

Visit http://[HA_IP]:8000/haproxy_stats and log in with admin and your password.

Keepalived

Keepalived will be used for moving the cluster IP between the two load-balancing servers (172.16.0.26 in my case, as you've seen in the HAProxy config). It will also be responsible for load balancing UDP with a Linux Virtual Server.

global_defs {  
        lvs_id LoadBalancer01 # Load balancer name
}

vrrp_script check_haproxy {  
        script "/usr/bin/killall -0 haproxy" # make sure haproxy is running
        interval 2                  # check every 2 seconds
        weight 2                    # add weight if OK
}

vrrp_instance FloatIP01 {  
        state MASTER
        interface eth0 # Change this if needed
        virtual_router_id 10
        priority 101 #Change this to 100 on node 2
        advert_int 2
        virtual_ipaddress {
                172.16.0.26 # Floating IP
        }
        track_script {
                check_haproxy
        }
}

virtual_server 172.16.0.26 514 {
        delay_loop 6
        lb_algo rr
        protocol UDP

        real_server [Graylog node 1] 514 {
                weight 100
        }
        real_server [Graylog node 2] 514 {
                weight 100
        }
}

Keepalived firewall config

Keepalived uses VRRP, which needs to be able to talk to the other node in the HA cluster via a multicast address (224.0.0.18) using the VRRP protocol (IP protocol 112).

-A INPUT -d 224.0.0.0/8 -j ACCEPT
-A INPUT -p vrrp -j ACCEPT

Keepalived status

Use ipvsadm to check the status of the keepalived UDP load balancer.

[root@lb01 keepalived]# ipvsadm
IP Virtual Server version 1.2.1 (size=4096)  
Prot LocalAddress:Port Scheduler Flags  
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
UDP  172.16.0.26:syslog rr  
  -> graylog01.nickebo.net:514    Local   100    0          1
  -> graylog02.nickebo.net:514    Masq    100    0          2

If you shut down keepalived on the active node, the IP should move to the other node and a virtual server for UDP should be started there. The same thing will happen if you shut down HAProxy, since we don't want the node to keep the cluster IP if HAProxy crashes.
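The reason a dead HAProxy moves the IP is the weight 2 in the check_haproxy script: as long as the check succeeds, 2 is added to the node's priority. With base priorities 101 and 100, a healthy node 2 (102) outranks a node 1 whose HAProxy has died (101). The arithmetic is sketched below; this is only my simplified model of the election with invented function names. It ignores timing and assumes keepalived itself is running on both nodes.

```python
def effective_priority(base, haproxy_running, weight=2):
    """Base VRRP priority, plus the track_script weight while the check passes."""
    return base + weight if haproxy_running else base


def master(node1_haproxy_ok, node2_haproxy_ok, node1_base=101, node2_base=100):
    """Return which node should hold the floating IP: the higher priority wins."""
    p1 = effective_priority(node1_base, node1_haproxy_ok)
    p2 = effective_priority(node2_base, node2_haproxy_ok)
    return "node1" if p1 > p2 else "node2"


print(master(True, True))    # prints node1 (103 vs 102)
print(master(False, True))   # prints node2 (101 vs 102)
print(master(False, False))  # prints node1 (101 vs 100)
```

This is also why the weight has to be larger than the difference between the two base priorities. With a weight of 1, the priorities would tie at 101 when HAProxy fails on node 1.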

Graylog web interface

The Graylog web interface doesn't need very much configuration. I've changed these lines:

graylog2-server.uris="http://[graylog node 01]:12900/,http://[graylog node 02]:12900/"  
application.secret="your_secret"
timezone="Europe/Stockholm" # Change to where you are!

Graylog web interface firewall config

The web interface listens on port 9000, so the firewall needs one more rule:

-A INPUT -m state --state NEW -m tcp -p tcp --dport 9000 -j ACCEPT

All done

That should be it. I'm sure I've missed something here, so please give me a shout if you feel there's something I should add.
