Nginx, Varnish, HAProxy, and Thin/Lighttpd

Over the last few days, I have been playing with Ruby on Rails again and came across Thin, a small yet stable web server for applications written in Ruby.

This is a small tutorial on how to get Nginx, Varnish, and HAProxy working together with Thin (for dynamic pages) and Lighttpd (for static pages).

I decided to take this route because, from reading in many places, I found that separating static and dynamic content can improve performance significantly.

Nginx

Nginx is a lightweight, high performance web server and reverse proxy. It can also be used as an email proxy, although this is not an area I have explored. I will be using Nginx as the front-end server for serving my rails applications.

I installed Nginx using the RHEL binary package available from EPEL.
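For anyone following along, the install itself is a one-liner once EPEL is enabled. A sketch for a RHEL-style system (the package name is assumed from EPEL; adjust for your distribution):

```shell
# Install Nginx from the EPEL repository and enable it at boot.
yum install nginx
chkconfig nginx on
```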

Configuration of Nginx is very simple, and I have kept mine minimal. My current configuration file consists of the following:

user nginx;
worker_processes 1;

error_log /var/log/nginx/error.log;
pid /var/run/nginx.pid;

events {
    worker_connections 1024;
}

http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    log_format main '$remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent" "$http_x_forwarded_for"';

    sendfile on;
    tcp_nopush on;
    tcp_nodelay off;

    keepalive_timeout 5;

    # This section enables gzip compression.
    gzip on;
    gzip_comp_level 2;
    gzip_proxied any;
    gzip_types text/plain text/html text/css application/x-javascript text/xml application/xml application/xml+rss text/javascript;

    # Here you can define the addresses on which Varnish is listening. You can place multiple servers here, and Nginx will load balance between them.
    upstream cache_servers {
        server localhost:6081 max_fails=3 fail_timeout=30s;
    }

    # This is the default virtual host.
    server {
        listen 80 default;
        access_log /var/log/nginx/access.log main;
        error_log /var/log/nginx/error.log;
        charset utf-8;

        # This is optional. It serves up a 1x1 blank gif image from RAM.
        location = /1x1.gif {
            empty_gif;
        }

        # This is the actual part which proxies all connections to Varnish.
        location / {
            proxy_pass http://cache_servers/;
            proxy_redirect http://cache_servers/ http://$host:$server_port/;

            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        }
    }
}
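After changing the configuration, Nginx can check the file for syntax errors before you reload, which saves you from taking the front end down with a typo. A quick sketch:

```shell
# Validate the configuration, then reload the running server.
nginx -t
service nginx reload
```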

Varnish

Varnish is a high performance caching server. We can use Varnish to cache content which does not change often.

I installed Varnish using the RHEL binary package available from EPEL as well. Initially, I only needed to edit /etc/sysconfig/varnish and configure the address on which Varnish will listen.

DAEMON_OPTS="-a localhost:6081 \
-T localhost:6082 \
-f /etc/varnish/default.vcl \
-u varnish -g varnish \
-s file,/var/lib/varnish/varnish_storage.bin,10G"

This will make Varnish listen on port 6081 for normal HTTP traffic, and port 6082 for administration.
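After editing the sysconfig file, restart Varnish and confirm it is bound to the right port. A sketch (the netstat flags may vary slightly by distribution):

```shell
# Restart Varnish with the new daemon options and verify the listening socket.
service varnish restart
netstat -ln | grep 6081
```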

Next, you must edit /etc/varnish/default.vcl to actually cache data. My current configuration is as follows:

backend thin {
    .host = "127.0.0.1";
    .port = "8080";
}

backend lighttpd {
    .host = "127.0.0.1";
    .port = "8081";
}

sub vcl_recv {
    if (req.url ~ "^/static/") {
        set req.backend = lighttpd;
    } else {
        set req.backend = thin;
    }

    # Allow purging of the cache using shift + reload
    if (req.http.Cache-Control ~ "no-cache") {
        purge_url(req.url);
    }

    # Unset any cookies and authorization data for static links and icons, and fetch from the cache
    if ((req.request == "GET" && req.url ~ "^/static/") || (req.request == "GET" && req.url ~ "^/icons/")) {
        unset req.http.Cookie;
        unset req.http.Authorization;
        lookup;
    }

    # Look for images in the cache
    if (req.url ~ "\.(png|gif|jpg|ico|jpeg|swf|css|js)$") {
        unset req.http.Cookie;
        lookup;
    }

    # Do not cache any POSTed data
    if (req.request == "POST") {
        pass;
    }

    # Do not cache any non-standard requests
    if (req.request != "GET" && req.request != "HEAD" &&
        req.request != "PUT" && req.request != "POST" &&
        req.request != "TRACE" && req.request != "OPTIONS" &&
        req.request != "DELETE") {
        pass;
    }

    # Do not cache data which has an Authorization header
    if (req.http.Authorization) {
        pass;
    }

    lookup;
}

sub vcl_fetch {
    # Remove cookies and cache static content for 12 hours
    if ((req.request == "GET" && req.url ~ "^/static/") || (req.request == "GET" && req.url ~ "^/icons/")) {
        unset obj.http.Set-Cookie;
        set obj.ttl = 12h;
        deliver;
    }

    # Remove cookies and cache images for 12 hours
    if (req.url ~ "\.(png|gif|jpg|ico|jpeg|swf|css|js)$") {
        unset obj.http.Set-Cookie;
        set obj.ttl = 12h;
        deliver;
    }

    # Do not cache anything that does not return a status in the 200s
    if (obj.status >= 300) {
        pass;
    }

    # Do not cache content which Varnish has marked uncacheable
    if (!obj.cacheable) {
        pass;
    }

    # Do not cache content which has a cookie set
    if (obj.http.Set-Cookie) {
        pass;
    }

    # Do not cache content with cache control headers set
    if (obj.http.Pragma ~ "no-cache" || obj.http.Cache-Control ~ "no-cache" || obj.http.Cache-Control ~ "private") {
        pass;
    }

    if (obj.http.Cache-Control ~ "max-age") {
        unset obj.http.Set-Cookie;
        deliver;
    }

    pass;
}
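A new VCL file can also be compiled and activated without restarting Varnish, through the admin port configured above (6082). A sketch, where "myconf" is just an arbitrary name for the loaded configuration:

```shell
# Compile the edited VCL under a new name, then switch traffic over to it.
varnishadm -T localhost:6082 vcl.load myconf /etc/varnish/default.vcl
varnishadm -T localhost:6082 vcl.use myconf
```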

HAProxy

HAProxy is a high performance TCP/HTTP load balancer. It can be used to load balance almost any type of TCP connection, although I have only used it with HTTP connections.

We will be using HAProxy to balance connections over multiple thin instances.

HAProxy is also available in EPEL. My HAProxy configuration is as follows:

global
    daemon
    log 127.0.0.1 local0
    maxconn 4096
    nbproc 1
    chroot /var/lib/haproxy
    user haproxy
    group haproxy

defaults
    mode http
    clitimeout 60000
    srvtimeout 30000
    timeout connect 4000

    option httpclose
    option abortonclose
    option httpchk
    option forwardfor

    balance roundrobin

    stats enable
    stats refresh 5s
    stats auth admin:123abc789xyz

listen thin 127.0.0.1:8080
    server thin1 10.10.10.2:2010 weight 1 minconn 3 maxconn 6 check inter 20000
    server thin2 10.10.10.2:2011 weight 1 minconn 3 maxconn 6 check inter 20000
    server thin3 10.10.10.2:2012 weight 1 minconn 3 maxconn 6 check inter 20000
    server thin4 10.10.10.2:2013 weight 1 minconn 3 maxconn 6 check inter 20000
    server thin5 10.10.10.2:2014 weight 1 minconn 3 maxconn 6 check inter 20000
    server thin6 10.10.10.2:2015 weight 1 minconn 3 maxconn 6 check inter 20000
    server thin7 10.10.10.2:2016 weight 1 minconn 3 maxconn 6 check inter 20000
    server thin8 10.10.10.2:2017 weight 1 minconn 3 maxconn 6 check inter 20000
    server thin9 10.10.10.2:2018 weight 1 minconn 3 maxconn 6 check inter 20000
    server thin10 10.10.10.2:2019 weight 1 minconn 3 maxconn 6 check inter 20000
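HAProxy can validate a configuration file before you start it, which is worth doing after any edit. A sketch for a RHEL-style system:

```shell
# Check the configuration for errors, then start the daemon.
haproxy -c -f /etc/haproxy/haproxy.cfg
service haproxy start
```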

Thin

My Thin server is actually run on a separate Gentoo box. I installed Thin using the package in Portage.

To configure Thin, I used the following command:

thin config -C /etc/thin/config-name.yml -c /srv/myapp --servers 10 -e production -p 2010

This configures Thin to start 10 servers, listening on ports 2010 to 2019. If you want an init script for Thin, so you can start it at boot, run

thin init

This will create the init script, and you can set it to start up at boot using the normal method (rc-update add thin default or chkconfig thin on).
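With the configuration file in place, the instances can also be started directly from the command line. A sketch (the file name matches the one created with the `thin config` command above):

```shell
# Start all ten Thin instances described in the config file.
thin -C /etc/thin/config-name.yml start
```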

You should now be able to reach your Rails app through http://nginx.servers.ip.address

Next, we must configure the static webserver.

Lighttpd

I decided to go with Lighttpd as it is a fast, stable and lightweight webserver which will do the job perfectly with little configuration.

You could also use Nginx as the static server instead of Lighttpd, but I decided to separate them.

I decided to use the package from EPEL for Lighttpd, and found that most of the default configuration was as I wanted it to be. The only thing I needed to change was the port and address the server was listening on:

server.port = 8081
server.bind = "127.0.0.1"
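Lighttpd can also test a configuration file before restarting. A sketch (the config path is assumed from the EPEL package defaults):

```shell
# Check the edited configuration, then restart Lighttpd.
lighttpd -t -f /etc/lighttpd/lighttpd.conf
service lighttpd restart
```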

And that’s pretty much it! Now you just have to dump any static content into /var/www/lighttpd/ (the default location that the Lighttpd package in EPEL is configured to use) and reference any static links using “/static/document_path_of_file”. For example, if I put an image called “bg.png” into /var/www/lighttpd/images/, I can reach it at http://servers_hostname/static/images/bg.png.
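A quick way to check that the whole Nginx -> Varnish -> Lighttpd chain is serving that file is to request it with curl (the hostname here is a placeholder):

```shell
# Fetch only the headers; a 200 response means the static path works end to end.
curl -I http://servers_hostname/static/images/bg.png
```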

I have not really done any performance tests on how well this works, and there are probably many things which I could have done better. This is the first time I have made any attempt at HTTP performance tuning, so I am always looking for feedback or tips on how to make this better. Please do contact me if you have any suggestions! 🙂

11 thoughts on “Nginx, Varnish, HAProxy, and Thin/Lighttpd”

  1. Thanks, the configs are helpful. What was the reasoning behind using both lighttpd and nginx? Seems like the setup is needlessly complex, can you explain your reasoning behind it? Thanks.

  2. Well, there isn’t a real reason behind it.

    I was originally going to just use Nginx, but I had most of my static content on a different server and I didn’t want to move it over to the new one at that time, so I just stuck with using Nginx to proxy connections to lighttpd.

    I’ll probably move the data over some time soon though; I don’t think there is any real need to have both running.

  3. Hey,

    That was very useful. But in your default.vcl varnish cfg file, you wrote this:
    backend thin {
    .host = "127.0.0.1";
    .port = "8080";
    }

    Which server is thin? Is it your varnish server?
    On my configuration, I actually have one squid set in reverse proxy mode, one nginx for the static files (css, jpg, text…), and one apache for dynamic content.
    So my question is: thin = my nginx server?

    Sorry for my english i’m a french guy :p
    thanks

  4. Thin is actually the name of the Ruby HTTP server I use to serve my Rails applications.

    In my varnish configuration, the backend definition is actually pointing to HAProxy, which in turn load-balances connections between my Thin instances (Nginx -> Varnish -> HAProxy -> multiple Thin instances).

    If you only have one nginx server which is serving static content, you could just point varnish at that server directly rather than at HAProxy.

    For example, if you have your nginx server running on 10.2.1.2:80, you could do:
    backend static {
    .host = "10.2.1.2";
    .port = "80";
    }

    and use “static” instead of “thin” everywhere else in your varnish configuration.

    I hope that helps.

  5. First, thanks for your quick answer.
    And in my configuration I had already done that, and it works fine.
    Thanks for the great tutorial, and for your time; I appreciate it.

    1. Hello DHK,

      It’s strange that your varnish server got such a low score compared to the others. I used your ab.c script to benchmark my server (this is over LAN, not over the internet), and I’m getting 10018 requests per second. I’m sure this value could be increased too, since I have some things which could affect the benchmark.

      Of course I can’t compare my result against yours since we are using different hardware configurations, but I feel as though there must be some bottleneck somewhere with your Varnish configuration considering it got such a low score compared to the others.

      (Fyi, I’m not using the varnish VCL I have shown in this post anymore).

      G-WAN does sound very interesting though, I will take a look at it. 🙂

  6. @M. Hamzah Khan

    > strange that varnish server got such a low score

    If you read the comments of this article, Poul-Henning Kamp (the guy who wrote Varnish) has double-checked these options, and despite his suggested modifications (and others made by his team), results remained the same:

    http://nbonvin.wordpress.com/2011/03/24/serving-small-static-files-which-server-to-use/

    Others, like Nginx, also suggested modifications, and the new test made them faster (from 57,000 to 80,000 RPS – better than Lighttpd but only half as fast as G-WAN).

    By the way, in your test Varnish was 3 times slower than on the published benchmark (made on a small laptop).

    Varnish seems to be much less efficient than Web servers, making its value as a “Web accelerator” questionable.

  7. Hey guys, it has been a few years. What are your impressions/conclusions now? I have Lighttpd running a WordPress site on a Raspberry Pi, just installed Varnish but I am having problems having normal HTTP traffic connecting. I assume I need to have Lighttpd listening on both 80 and 8080 but not sure how to do that. Or whether CloudFlare is affected somehow. After reading your comments about Lighttpd working faster than Varnish I am wondering whether to bother with it at all. My photography-related server mostly handles largish jpeg files (200KB). Any comments or suggestions? Thanks.

    1. Hi,

      Well this article is pretty old. I switched to Apache Traffic Server, purely because it was easier to manage. I haven’t tested performance of Varnish vs Traffic Server, but it’s working well enough for me.

      I assume I need to have Lighttpd listening on both 80 and 8080 but not sure how to do that.

      If you are using just Varnish and Lighttpd for your setup (I assume this is the case since you are running on a Raspberry Pi), then you probably want Varnish listening on port 80, and Lighttpd on 8080 or any other port, but this will be listening on localhost only as only Varnish will be talking to it directly.
