Nginx, Varnish, HAProxy, and Thin/Lighttpd
Over the last few days, I have been playing with Ruby on Rails again and came across Thin, a small yet stable web server for applications written in Ruby.
This is a small tutorial on how to get Nginx, Varnish, and HAProxy working together with Thin (for dynamic pages) and Lighttpd (for static pages).
I decided to take this route because I have read in many places that separating static and dynamic content improves performance significantly.
Nginx
Nginx is a lightweight, high performance web server and reverse proxy. It can also be used as an email proxy, although this is not an area I have explored. I will be using Nginx as the front-end server for my Rails applications.
I installed Nginx using the RHEL binary package available from EPEL.
Configuring Nginx is very simple. I have kept my setup minimal and made Nginx proxy every request to Varnish. My current configuration file consists of the following:
user nginx;
worker_processes 1;
error_log /var/log/nginx/error.log;
pid /var/run/nginx.pid;
events {
worker_connections 1024;
}
http {
include /etc/nginx/mime.types;
default_type application/octet-stream;
log_format main '$remote_addr - $remote_user [$time_local] $request "$status" $body_bytes_sent "$http_referer" "$http_user_agent" "$http_x_forwarded_for"';
sendfile on;
tcp_nopush on;
tcp_nodelay off;
keepalive_timeout 5;
# This section enables gzip compression.
gzip on;
gzip_comp_level 2;
gzip_proxied any;
gzip_types text/plain text/html text/css application/x-javascript text/xml application/xml application/xml+rss text/javascript;
# Here you can define the addresses on which varnish will listen. You can place multiple servers here, and nginx will load balance between them.
upstream cache_servers {
server localhost:6081 max_fails=3 fail_timeout=30s;
}
# This is the default virtual host.
server {
listen 80 default;
access_log /var/log/nginx/access.log main;
error_log /var/log/nginx/error.log;
charset utf-8;
# This is optional. It serves up a 1x1 blank gif image from RAM.
location = /1x1.gif {
empty_gif;
}
# This is the actual part which will proxy all connections to varnish.
location / {
proxy_pass http://cache_servers/;
proxy_redirect http://cache_servers/ http://$host:$server_port/;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}
}
}
Varnish
Varnish is a high performance caching server. We can use Varnish to cache content which will not be changed often.
I installed Varnish using the RHEL binary package available from EPEL as well. Initially, I only needed to edit /etc/sysconfig/varnish and configure the address on which Varnish will listen.
DAEMON_OPTS="-a localhost:6081 \
-T localhost:6082 \
-f /etc/varnish/default.vcl \
-u varnish -g varnish \
-s file,/var/lib/varnish/varnish_storage.bin,10G"
This will make Varnish listen on port 6081 for normal HTTP traffic, and port 6082 for administration.
Next, you must edit /etc/varnish/default.vcl to actually cache data. My current configuration is as follows:
backend thin {
.host = "127.0.0.1";
.port = "8080";
}
backend lighttpd {
.host = "127.0.0.1";
.port = "8081";
}
sub vcl_recv {
if (req.url ~ "^/static/") {
set req.backend = lighttpd;
} else {
set req.backend = thin;
}
# Allow purging of cache using shift + reload
if (req.http.Cache-Control ~ "no-cache") {
purge_url(req.url);
}
# Unset any cookies and authorization data for static links and icons, and fetch from cache
if (req.request == "GET" && req.url ~ "^/static/" || req.request == "GET" && req.url ~ "^/icons/") {
unset req.http.cookie;
unset req.http.Authorization;
lookup;
}
# Look for images in the cache
if (req.url ~ "\.(png|gif|jpg|ico|jpeg|swf|css|js)$") {
unset req.http.cookie;
lookup;
}
# Do not cache any POST'ed data
if (req.request == "POST") {
pass;
}
# Do not cache any non-standard requests
if (req.request != "GET" && req.request != "HEAD" &&
req.request != "PUT" && req.request != "POST" &&
req.request != "TRACE" && req.request != "OPTIONS" &&
req.request != "DELETE") {
pass;
}
# Do not cache data which has an authorization header
if (req.http.Authorization) {
pass;
}
lookup;
}
sub vcl_fetch {
# Remove cookies and cache static content for 12 hours
if (req.request == "GET" && req.url ~ "^/static/" || req.request == "GET" && req.url ~ "^/icons/") {
unset obj.http.Set-Cookie;
set obj.ttl = 12h;
deliver;
}
# Remove cookies and cache images for 12 hours
if (req.url ~ "\.(png|gif|jpg|ico|jpeg|swf|css|js)$") {
unset obj.http.set-cookie;
set obj.ttl = 12h;
deliver;
}
# Do not cache anything that does not return a value in the 200's
if (obj.status >= 300) {
pass;
}
# Do not cache content which varnish has marked uncachable
if (!obj.cacheable) {
pass;
}
# Do not cache content which has a cookie set
if (obj.http.Set-Cookie) {
pass;
}
# Do not cache content with cache control headers set
if (obj.http.Pragma ~ "no-cache" || obj.http.Cache-Control ~ "no-cache" || obj.http.Cache-Control ~ "private") {
pass;
}
if (obj.http.Cache-Control ~ "max-age") {
unset obj.http.Set-Cookie;
deliver;
}
pass;
}
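To sanity-check the rules above without reloading Varnish, the two main decisions in vcl_recv can be mirrored in a few lines of shell. This is only a sketch: backend_for_url and cookie_policy are hypothetical helper names of mine, not part of Varnish, and they cover just the backend choice and the cookie stripping, nothing else from the VCL.

```shell
#!/bin/sh
# Hypothetical helpers that mirror two decisions from vcl_recv above.

# Print the backend that would receive the given URL path.
backend_for_url() {
    case "$1" in
        /static/*) echo "lighttpd" ;;   # req.url ~ "^/static/"
        *)         echo "thin" ;;       # everything else
    esac
}

# Print "strip" if cookies would be removed before the cache lookup, else "keep".
# Arguments: HTTP method, URL path.
cookie_policy() {
    # GET requests under /static/ or /icons/ lose their cookies.
    if [ "$1" = "GET" ]; then
        case "$2" in
            /static/*|/icons/*) echo "strip"; return 0 ;;
        esac
    fi
    # So does anything ending in one of the listed extensions, whatever the method.
    case "$2" in
        *.png|*.gif|*.jpg|*.ico|*.jpeg|*.swf|*.css|*.js) echo "strip" ;;
        *) echo "keep" ;;
    esac
}

backend_for_url /static/images/bg.png   # prints "lighttpd"
cookie_policy GET /posts/1              # prints "keep"
```

Note that the extension rule is not limited to GET, so a POST to a URL ending in .png would also have its cookies removed.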
HAProxy
HAProxy is a high performance TCP/HTTP load balancer. It can be used to load balance almost any type of TCP connection, although I have only used it with HTTP connections.
We will be using HAProxy to balance connections over multiple thin instances.
HAProxy is also available in EPEL. My HAProxy configuration is as follows:
global
daemon
log 127.0.0.1 local0
maxconn 4096
nbproc 1
chroot /var/lib/haproxy
user haproxy
group haproxy
defaults
mode http
clitimeout 60000
srvtimeout 30000
timeout connect 4000
option httpclose
option abortonclose
option httpchk
option forwardfor
balance roundrobin
stats enable
stats refresh 5s
stats auth admin:123abc789xyz
listen thin 127.0.0.1:8080
server thin0 10.10.10.2:2010 weight 1 minconn 3 maxconn 6 check inter 20000
server thin1 10.10.10.2:2011 weight 1 minconn 3 maxconn 6 check inter 20000
server thin2 10.10.10.2:2012 weight 1 minconn 3 maxconn 6 check inter 20000
server thin3 10.10.10.2:2013 weight 1 minconn 3 maxconn 6 check inter 20000
server thin4 10.10.10.2:2014 weight 1 minconn 3 maxconn 6 check inter 20000
server thin5 10.10.10.2:2015 weight 1 minconn 3 maxconn 6 check inter 20000
server thin6 10.10.10.2:2016 weight 1 minconn 3 maxconn 6 check inter 20000
server thin7 10.10.10.2:2017 weight 1 minconn 3 maxconn 6 check inter 20000
server thin8 10.10.10.2:2018 weight 1 minconn 3 maxconn 6 check inter 20000
server thin9 10.10.10.2:2019 weight 1 minconn 3 maxconn 6 check inter 20000
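Ten near-identical server lines are easy to get wrong by hand, so they can also be generated. A minimal sketch, assuming the same host, port range and per-server options as above; thin_server_lines is a hypothetical helper, and giving each server its own name (thin0 to thin9) is my assumption so the entries can be told apart in logs and on the stats page.

```shell
#!/bin/sh
# Print one HAProxy "server" line per Thin instance.
# Arguments: host, first port, number of instances.
thin_server_lines() {
    host=$1 first=$2 count=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        port=$((first + i))
        # The options after host:port are copied from the configuration above.
        echo "server thin$i $host:$port weight 1 minconn 3 maxconn 6 check inter 20000"
        i=$((i + 1))
    done
}

# Ten instances on ports 2010 to 2019.
thin_server_lines 10.10.10.2 2010 10
```

The output can be pasted under the listen section, which keeps the list in step with the number of servers given to Thin.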
Thin
My Thin server is running on a separate Gentoo box. I installed Thin using the package in Portage.
To configure Thin, I used the following command:
thin config -C /etc/thin/config-name.yml -c /srv/myapp --servers 10 -e production -p 2010
This configures Thin to start 10 servers, listening on ports 2010 to 2019. If you want an init script for Thin, so you can start it at boot, run
thin init
This will create the init script, and you can set it to start up at boot using the normal method (rc-update add thin default or chkconfig thin on).
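To check that each Thin instance answers on its own, independently of HAProxy, every port can be requested directly. A rough sketch, assuming curl is installed on the machine running the check and that the instances live on 10.10.10.2 as in the HAProxy configuration; thin_urls and check_thin are hypothetical helpers of mine.

```shell
#!/bin/sh
# Print one URL per Thin instance. Arguments: host, first port, instance count.
thin_urls() {
    host=$1 first=$2 count=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "http://$host:$((first + i))/"
        i=$((i + 1))
    done
}

# Request each URL and print it next to the HTTP status code.
# curl reports 000 when no HTTP response was received.
check_thin() {
    thin_urls "$@" | while read -r url; do
        code=$(curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$url") || true
        echo "$url $code"
    done
}

# Example (not run here): check_thin 10.10.10.2 2010 10
thin_urls 10.10.10.2 2010 2
```

Any instance that shows 000 is one HAProxy would mark as down at its next health check.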
You should now be able to reach your Rails app through http://nginx.servers.ip.address.
Next, we must configure the static webserver.
Lighttpd
I decided to go with Lighttpd as it is a fast, stable and lightweight webserver which will do the job perfectly with little configuration.
You could also use nginx as the static server instead of using lighttpd, but I decided to separate it.
I decided to use the package from EPEL for Lighttpd, and found that most of the default configuration was as I wanted it to be. The only thing I needed to change was the port and address the server was listening on:
server.port = 8081
server.bind = "127.0.0.1"
And that’s pretty much it!
Now you just have to dump any static content into /var/www/lighttpd/ (the default location that the Lighttpd package in EPEL is configured to use) and reference it using “/static/document_path_of_file”. For example, if I put an image called “bg.png” into /var/www/lighttpd/images/, I can reach it using http://servers_hostname/static/images/bg.png.
I have not really done any performance tests on how well this works, and there are probably many things which I could have done better. This was mainly an experiment, and I am always looking for feedback or tips on how to make it better, so please do contact me if you have any suggestions! 🙂