These days I have been working on scaling solutions for a PHP framework. Previously I used Nginx as a load balancer; however, with the requirement for health checks and failover, I am switching to HAProxy this time. So I am writing this entry as a note on installing HAProxy with SSL termination. Most of my machines run the stock Ubuntu 16.04 OS. My testing cluster comprises the following 4 machines:
- SERVER_1 plays the HAProxy role (I will reuse it for ProxySQL later).
- SERVER_2 and SERVER_3 act as 2 web servers: webserver-01 and webserver-02.
- SERVER_4 is dedicated to the MySQL 5.7 database. No replication, group replication, or multiple DB servers in this article.
I am just quickly noting the steps, since this blog entry is for me, and why should I care if others are annoyed by it :D?
Install MySQL server on SERVER_4
- Just a few commands:
apt-get install -y mysql-server
# Modify configuration as needed
systemctl start mysql
systemctl enable mysql
- Do remember to create a user that can connect from your web application servers and allow remote connections to MySQL from those servers. For testing, you can simply create a user with ‘%’ as the host and tell MySQL to listen on every network interface (add bind-address = 0.0.0.0 to the [mysqld] config section), as sketched below.
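A minimal sketch for testing (the user, password, and database names below are placeholders I made up):

# In /etc/mysql/mysql.conf.d/mysqld.cnf, under [mysqld]:
#   bind-address = 0.0.0.0
systemctl restart mysql
# Create a user that the web servers can connect with (placeholder names)
mysql -u root -p -e "CREATE USER 'webapp'@'%' IDENTIFIED BY 'S3cretPass'; GRANT ALL PRIVILEGES ON webapp_db.* TO 'webapp'@'%'; FLUSH PRIVILEGES;"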
Install Nginx with PHP-FPM on SERVER_2 and SERVER_3
- Just starting with Nginx:
apt-get install -y nginx
# Modify configuration as needed
systemctl enable nginx
systemctl start nginx
- Next we will install PHP-FPM:
apt-get install -y php-fpm php-mysql php-mysqli php-mcrypt php-opcache php-mbstring php-gd php-xml php-curl php-zip php-cli
# Modify configuration as needed
systemctl enable php7.0-fpm
systemctl start php7.0-fpm
- Edit /etc/nginx/sites-available/default to enable index.php and PHP for the default site (a minimal example is sketched below). Remember, I am testing, so why would I need to care much about the configuration? Remember to restart Nginx after this step.
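A minimal sketch of what the default site can look like; the PHP-FPM socket path below is the stock Ubuntu 16.04 location, adjust it if yours differs:

server {
    listen 80 default_server;
    root /var/www/html;
    index index.php index.html;
    server_name _;

    location / {
        try_files $uri $uri/ =404;
    }

    # Pass PHP scripts to the php7.0-fpm socket
    location ~ \.php$ {
        include snippets/fastcgi-php.conf;
        fastcgi_pass unix:/run/php/php7.0-fpm.sock;
    }
}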
- Create a /var/www/html/info.php file with a simple phpinfo thing:
<?php
echo "<p>I am server {$_SERVER['SERVER_ADDR']}</p>";
phpinfo();
- Access SERVER_2/info.php and SERVER_3/info.php to be sure things are working well.
Install Let’s Encrypt and HAProxy on SERVER_1
- Now it’s time to install Let’s Encrypt:
add-apt-repository ppa:certbot/certbot
apt-get update -y
apt-get install certbot -y
- Then, install, start, and enable HAProxy:
apt-get install -y haproxy
systemctl start haproxy
systemctl enable haproxy
- Edit /etc/haproxy/haproxy.cfg and add the following lines so that HAProxy listens on port 80, forwards Let’s Encrypt requests to a backend on port 8888, and forwards normal requests to the 2 web servers (SERVER_2 and SERVER_3):
frontend fe-scaling
    bind *:80
    # Test URI to see if it's a letsencrypt request
    acl letsencrypt-acl path_beg /.well-known/acme-challenge/
    use_backend letsencrypt-backend if letsencrypt-acl
    # Default: use normal backend for web apps
    default_backend be-scaling

# LE Backend -> Go to LE on current server
backend letsencrypt-backend
    server letsencrypt 127.0.0.1:8888

# Normal (default) Backend for web app servers
backend be-scaling
    # redirect scheme https if !{ ssl_fc }
    server webserver-1 SERVER_2_IP:80 check
    server webserver-2 SERVER_3_IP:80 check
Then run service haproxy reload and access SERVER_1_IP/info.php to see different results (round-robin loading from the 2 different servers).
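A quick way to watch the round-robin from any machine (just a sketch; replace SERVER_1_IP with the real address):

for i in $(seq 1 4); do curl -s http://SERVER_1_IP/info.php | grep "I am server"; done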
- One note on this: if we want to access the backend web servers with session persistence (so all requests from 1 user stick to 1 server), we need to define a cookie to be used by HAProxy as follows:
backend be-scaling
    cookie HAWEBSERVER insert
    server webserver-1 SERVER_2_IP:80 cookie 1 check
    server webserver-2 SERVER_3_IP:80 cookie 2 check
- If we do not want to serve all traffic hitting the HA server, we can use the be-scaling backend only for a specific domain, as in the following configuration:
#### Inside frontend fe-scaling
acl app-ha-acl hdr(Host) -i YOUR_HA_DOMAIN.com
use_backend be-scaling if app-ha-acl
- Create a location for the HAProxy SSL certificate and get the cert issued. I do not use port 80 this time, on the assumption that HAProxy is already running on it (so this also works when we install on an existing HA-based system):
mkdir -p /etc/ssl/haproxy
certbot certonly --standalone -d MY_HA_DOMAIN.com --non-interactive --agree-tos --email [email protected] --http-01-port=8888
DOMAIN='MY_HA_DOMAIN.com' bash -c 'cat /etc/letsencrypt/live/$DOMAIN/fullchain.pem /etc/letsencrypt/live/$DOMAIN/privkey.pem > /etc/ssl/haproxy/$DOMAIN.pem'
- Now we need to edit /etc/haproxy/haproxy.cfg and add the SSL listening directive. We will use 2 different frontends for HTTP and HTTPS so that we can pass different headers in each case (to avoid infinite Nginx redirects in some cases):
frontend fe-https-scaling
    bind *:443 ssl crt /etc/ssl/haproxy/MY_HA_DOMAIN.com.pem
    reqadd X-Forwarded-Proto:\ https
    acl letsencrypt-acl path_beg /.well-known/acme-challenge/
    use_backend letsencrypt-backend if letsencrypt-acl
    default_backend be-scaling
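To match, the plain HTTP frontend (fe-scaling above) can pass the http value so the backends can tell the two cases apart. This is only a sketch, needed if your app or Nginx config inspects X-Forwarded-Proto; the commented redirect line is an optional way to force HTTPS for everything except the ACME challenge:

#### Inside frontend fe-scaling
reqadd X-Forwarded-Proto:\ http
# Optional: force HTTPS for everything except the ACME challenge
# redirect scheme https code 301 if !letsencrypt-acl !{ ssl_fc }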
Then run service haproxy reload and access https://MY_HA_DOMAIN.com/info.php to see the result.
- Finally, we need to set up renewal and schedule it to run monthly. We can create a new /root/haproxy-certbot-renewal.sh as follows (a sample cron entry is sketched after the script):
#!/bin/bash
certbot renew --tls-sni-01-port=8888 --pre-hook "service haproxy stop" --post-hook "service haproxy start"
DOMAIN='MY_HA_DOMAIN.com' bash -c 'cat /etc/letsencrypt/live/$DOMAIN/fullchain.pem /etc/letsencrypt/live/$DOMAIN/privkey.pem > /etc/ssl/haproxy/$DOMAIN.pem'
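Then make the script executable (chmod +x /root/haproxy-certbot-renewal.sh) and schedule it from the root crontab; the time and log path below are just my own picks:

# crontab -e (as root): run at 02:30 on the 1st of every month
30 2 1 * * /root/haproxy-certbot-renewal.sh >> /var/log/haproxy-certbot-renewal.log 2>&1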
Enable HAProxy Stats / Monitoring
- We can simply edit /etc/haproxy/haproxy.cfg and add the stats section as follows:
listen stats
    bind *:8080
    mode http
    log global
    maxconn 10
    clitimeout 100s
    srvtimeout 100s
    contimeout 100s
    timeout queue 100s
    stats enable
    stats hide-version
    stats refresh 30s
    stats show-node
    stats auth haadmin:1234567890
    stats uri /ha-monitor?stats
With the above configuration, the stats server listens on port 8080 (so do remember to open that port on the HA server), the access path is /ha-monitor?stats, and the user is haadmin with password 1234567890.
- Reload HAProxy and access the port and path that we defined in the above step.
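If the HA server happens to run ufw, opening the stats port is just (a sketch):

ufw allow 8080/tcp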
GlusterFS on each web server for file replication
Jan 2019 Update: Consider using Unison instead of GlusterFS (apt-get install -y unison).
- Add all web servers’ IPs to /etc/hosts on each web server machine:
SERVER_2_IP webserver-1
SERVER_3_IP webserver-2
- Open the necessary ports on each web server for GlusterFS (if you use ufw, see the sketch after this list):
- TCP and UDP ports 24007 and 24008 on all GlusterFS servers, plus TCP-only port 2049 (from GlusterFS 3.4 and later) for portmapper.
- One port for each brick, starting from port 49152. A brick is a filesystem that is mounted. In my case I only need one mount point for one web application, so I only need to open port 49152.
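If the web servers run ufw, a sketch of opening those ports (extend the last rule if you add more bricks):

ufw allow 24007:24008/tcp
ufw allow 24007:24008/udp
ufw allow 2049/tcp
ufw allow 49152/tcp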
- Install GlusterFS on each web server machine:
add-apt-repository ppa:gluster/glusterfs-4.0
apt update -y
apt install glusterfs-server -y
systemctl start glusterfs-server
systemctl enable glusterfs-server
- Configure the trusted pool for GlusterFS:
gluster peer probe webserver-1
gluster peer probe webserver-2
gluster peer status
- Set up a GlusterFS volume on each web server:
- On each web server, create the folder that will hold the brick for the app:
mkdir -p /data/bricks/gvapp0
- On any web server, create a GlusterFS volume (I use the force param here since I am creating the GlusterFS volume inside the system root partition) and start it:
gluster volume create gvapp0 transport tcp replica 2 webserver-1:/data/bricks/gvapp0 webserver-2:/data/bricks/gvapp0 force
# Set meta cache
gluster volume set gvapp0 group metadata-cache
gluster volume set gvapp0 network.inode-lru-limit 500000
gluster volume set gvapp0 features.cache-invalidation on
gluster volume set gvapp0 features.cache-invalidation-timeout 600
gluster volume set gvapp0 performance.stat-prefetch on
gluster volume set gvapp0 performance.cache-invalidation on
gluster volume set gvapp0 performance.cache-samba-metadata on
gluster volume set gvapp0 performance.md-cache-timeout 600
# Start the volume
gluster volume start gvapp0
- Check with gluster volume info to see if the gvapp0 volume is properly started.
- On each web server, we need to mount the volume to the web application location:
- Install the attr package:
apt-get install -y attr
- Mount the volume to the web app home:
mount -t glusterfs webserver-1:gvapp0 /home/hawebapp/
Remember that you need to specify the VOLUME_NAME (e.g. gvapp0), not the full path (/data/bricks/gvapp0) when mounting, otherwise you will get the error “failed to fetch volume file (key:/data/bricks/gvapp0)”.
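A quick sanity check of the replication (the file name is arbitrary): create a file through the mount on one web server and confirm it appears on the other:

# On webserver-1
touch /home/hawebapp/replication-test.txt
# On webserver-2
ls -l /home/hawebapp/replication-test.txt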
- When you find that mounting works properly, simply edit /etc/fstab to include the mount point so it is mounted when the server starts:
webserver-1:/gvapp0 /home/hawebapp/ glusterfs defaults,_netdev,noauto,x-systemd.automount 0 0
- On each web server, configure Nginx to listen on port 80 with the root of the HA domain’s web app pointing to /home/hawebapp/, and start serving your users.
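A minimal sketch of that server block (the server_name and the PHP-FPM socket path are assumptions based on the earlier steps):

server {
    listen 80;
    server_name MY_HA_DOMAIN.com;
    root /home/hawebapp;
    index index.php index.html;

    location / {
        try_files $uri $uri/ /index.php?$args;
    }

    # Pass PHP scripts to the php7.0-fpm socket
    location ~ \.php$ {
        include snippets/fastcgi-php.conf;
        fastcgi_pass unix:/run/php/php7.0-fpm.sock;
    }
}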
Enable NFS and mount the volume on clients as NFS instead of Gluster
Even after enabling the metadata cache for Gluster volumes, the performance of mounting the volume as the gluster type is still bad for my web app (a legacy PHP app with more than 15k small files). I often see the glusterd and glusterfs services consume lots of CPU, so I ended up enabling NFS for the Gluster volume. Even though NFS is marked as deprecated, it is still OK to use compared with the original gluster mount type. Of course, if we start with Red Hat / CentOS, we should use NFS-Ganesha as per the guide. I am on Ubuntu, so I just use the deprecated built-in NFS with NFS v3.
- In any Gluster server, enable NFS for the volume:
gluster volume set gvapp0 nfs.disable off
gluster volume stop gvapp0
gluster volume start gvapp0
- On all Gluster servers, unmount the current volume and re-mount it as NFS:
umount /home/hawebapp
mount -t nfs -o vers=3 webserver-1:/gvapp0 /home/hawebapp/
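Note that mounting with -t nfs needs the NFS client utilities; on Ubuntu they come from the nfs-common package, so install it first if the mount command complains about a missing helper:

apt-get install -y nfs-common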
- On all Gluster servers, edit the mount point in /etc/fstab:
webserver-1:/gvapp0 /home/hawebapp/ nfs defaults,_netdev,vers=3 0 0
- Put enough load on your web app to see the difference in terms of resource consumption, for example with ApacheBench as sketched below.
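A sketch using ApacheBench (from the apache2-utils package); watch the CPU of the glusterd/glusterfs processes on the web servers while it runs:

ab -n 2000 -c 50 https://MY_HA_DOMAIN.com/info.php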