Apache Traffic Server Basic Configuration on RHEL6/CentOS 6

In this guide, I will explain how to set up Apache Traffic Server with a very basic configuration.

I will be using RHEL6/CentOS 6, but the Traffic Server configuration files themselves are the same on every distribution.

As a prerequisite for setting up Traffic Server, you should know a little about the HTTP protocol and what a reverse proxy actually does.

What is Apache Traffic Server?

I don’t really want to go into too much detail here, as there are many sites which explain this far better than I ever could, but in short, Traffic Server is a caching proxy created by Yahoo! and donated to the Apache Software Foundation.

Installation

Apache Traffic Server is available from the EPEL repository, and this is the version I will be using.

Firstly, you must add the EPEL repositories if you haven’t already:
rpm -ivh http://mirror.us.leaseweb.net/epel/6/i386/epel-release-6-7.noarch.rpm
Next, we can just use yum to install Traffic Server:
yum install trafficserver
While we are at it, we might as well set Traffic Server to start at boot:
chkconfig trafficserver on
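As a quick sanity check, you can confirm that the package installed cleanly and that the init script is in place before going any further:
rpm -q trafficserver
service trafficserver status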

Configuration

In this tutorial, I will only configure Apache Traffic Server to forward all requests to a single webserver.

For this, we really only need to edit two files:

  • /etc/trafficserver/records.config
    This is the main configuration file which stores all the “global” configuration options.
  • /etc/trafficserver/remap.config
    This contains the mapping rules that tell ATS which real web server to forward requests to.

Firstly, edit records.config.

I didn’t really have to change much initially for a basic configuration.

The lines I changed were these:
CONFIG proxy.config.proxy_name STRING xantara.web.g3nius.net
CONFIG proxy.config.url_remap.pristine_host_hdr INT 1
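
For reference, proxy_name is just the name this Traffic Server instance uses to identify itself, and setting pristine_host_hdr to 1 tells ATS to pass the client’s original Host header through to the origin server unchanged, which is usually what the backend application expects. After editing records.config, you can either restart Traffic Server or ask the running instance to re-read its configuration with traffic_line (shipped with the trafficserver package):
traffic_line -x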

Next we can edit remap.config.

Add the following line to the bottom:
regex_map http://(.*)/ http://webservers.hostname:80/
This should match everything and forward it to your web server.
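If you only want to proxy a single site rather than catching everything, a plain map rule works just as well; for example (with a placeholder hostname):
map http://www.example.com/ http://webservers.hostname:80/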

Start traffic server:
service trafficserver start
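To check that requests are actually being proxied, you can send a test request straight to ATS; the EPEL package listens on port 8080 by default, so something like this should come back with headers from your web server:
curl -I -H "Host: xantara.web.g3nius.net" http://localhost:8080/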
And that’s it! It should now just work! 🙂

Nginx, Varnish, HAProxy, and Thin/Lighttpd

Over the last few days, I have been playing with Ruby on Rails again and came across Thin, a small, yet stable web server which will serve applications written in Ruby.

This is a small tutorial on how to get Nginx, Varnish, HAProxy working together with Thin (for dynamic pages) and Lighttpd (for static pages).

I decided to take this route because, from reading around, I found that separating static and dynamic content can improve performance significantly.

Nginx

Nginx is a lightweight, high performance web server and reverse proxy. It can also be used as an email proxy, although this is not an area I have explored. I will be using Nginx as the front-end server for serving my rails applications.

I installed Nginx using the RHEL binary package available from EPEL.

Configuration of Nginx is very simple, and I have kept mine minimal. My current configuration file consists of the following:

user nginx;
worker_processes 1;

error_log /var/log/nginx/error.log;
pid /var/run/nginx.pid;

events {
    worker_connections 1024;
}

http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';

    sendfile on;
    tcp_nopush on;
    tcp_nodelay off;

    keepalive_timeout 5;

    # This section enables gzip compression.
    gzip on;
    gzip_comp_level 2;
    gzip_proxied any;
    gzip_types text/plain text/html text/css application/x-javascript text/xml application/xml application/xml+rss text/javascript;

    # These are the addresses on which Varnish is listening. You can list multiple servers here, and nginx will load balance between them.
    upstream cache_servers {
        server localhost:6081 max_fails=3 fail_timeout=30s;
    }

    # This is the default virtual host.
    server {
        listen 80 default;
        access_log /var/log/nginx/access.log main;
        error_log /var/log/nginx/error.log;
        charset utf-8;

        # This is optional. It serves up a 1x1 blank gif image from RAM.
        location = /1x1.gif {
            empty_gif;
        }

        # This is the actual part which will proxy all connections to varnish.
        location / {
            proxy_pass http://cache_servers/;
            proxy_redirect http://cache_servers/ http://$host:$server_port/;

            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        }
    }
}
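
Whenever you change this file, it is worth letting nginx check the syntax before reloading it:
nginx -t
service nginx reload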

Varnish

Varnish is a high performance caching server. We can use Varnish to cache content which does not change often.

I installed Varnish using the RHEL binary package from EPEL as well. Initially, I only needed to edit /etc/sysconfig/varnish and configure the address on which varnish will listen.

DAEMON_OPTS="-a localhost:6081 \
-T localhost:6082 \
-f /etc/varnish/default.vcl \
-u varnish -g varnish \
-s file,/var/lib/varnish/varnish_storage.bin,10G"

This will make varnish listen on port 6081 for normal HTTP traffic, and port 6082 for administration.
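
Varnish only reads /etc/sysconfig/varnish at startup, so restart it after changing these options; varnishstat (included with the varnish package) is a quick way to confirm that it is up and accepting connections:
service varnish restart
varnishstat -1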

Next, you must edit /etc/varnish/default.vcl to actually cache data. My current configuration is as follows:

backend thin {
    .host = "127.0.0.1";
    .port = "8080";
}

backend lighttpd {
    .host = "127.0.0.1";
    .port = "8081";
}

sub vcl_recv {
    if (req.url ~ "^/static/") {
        set req.backend = lighttpd;
    } else {
        set req.backend = thin;
    }

    # Allow purging of cache using shift + reload
    if (req.http.Cache-Control ~ "no-cache") {
        purge_url(req.url);
    }

    # Unset any cookies and authorization data for static links and icons, and fetch from cache
    if (req.request == "GET" && req.url ~ "^/static/" || req.request == "GET" && req.url ~ "^/icons/") {
        unset req.http.cookie;
        unset req.http.Authorization;
        lookup;
    }

    # Look for images, stylesheets and scripts in the cache
    if (req.url ~ "\.(png|gif|jpg|ico|jpeg|swf|css|js)$") {
        unset req.http.cookie;
        lookup;
    }

    # Do not cache any POST'ed data
    if (req.request == "POST") {
        pass;
    }

    # Do not cache any non-standard requests
    if (req.request != "GET" && req.request != "HEAD" &&
        req.request != "PUT" && req.request != "POST" &&
        req.request != "TRACE" && req.request != "OPTIONS" &&
        req.request != "DELETE") {
        pass;
    }

    # Do not cache data which has an authorization header
    if (req.http.Authorization) {
        pass;
    }

    lookup;
}

sub vcl_fetch {
    # Remove cookies and cache static content for 12 hours
    if (req.request == "GET" && req.url ~ "^/static/" || req.request == "GET" && req.url ~ "^/icons/") {
        unset obj.http.Set-Cookie;
        set obj.ttl = 12h;
        deliver;
    }

    # Remove cookies and cache images for 12 hours
    if (req.url ~ "\.(png|gif|jpg|ico|jpeg|swf|css|js)$") {
        unset obj.http.Set-Cookie;
        set obj.ttl = 12h;
        deliver;
    }

    # Do not cache anything that does not return a value in the 200's
    if (obj.status >= 300) {
        pass;
    }

    # Do not cache content which varnish has marked uncachable
    if (!obj.cacheable) {
        pass;
    }

    # Do not cache content which has a cookie set
    if (obj.http.Set-Cookie) {
        pass;
    }

    # Do not cache content with cache control headers set
    if (obj.http.Pragma ~ "no-cache" || obj.http.Cache-Control ~ "no-cache" || obj.http.Cache-Control ~ "private") {
        pass;
    }

    if (obj.http.Cache-Control ~ "max-age") {
        unset obj.http.Set-Cookie;
        deliver;
    }

    pass;
}
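
A quick way to see whether a given URL is actually being served from the cache is to request it twice through Varnish and compare the headers; on a hit the Age header will be greater than zero and X-Varnish will contain two transaction IDs (the path here is just an example):
curl -I http://localhost:6081/static/images/bg.png
curl -I http://localhost:6081/static/images/bg.png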

HAProxy

HAProxy is a high performance TCP/HTTP load balancer. It can be used to load balance almost any type of TCP connection, although I have only used it with HTTP connections.

We will be using HAProxy to balance connections over multiple thin instances.

HAProxy is also available in EPEL. My HAProxy configuration is as follows:

global
    daemon
    log 127.0.0.1 local0
    maxconn 4096
    nbproc 1
    chroot /var/lib/haproxy
    user haproxy
    group haproxy

defaults
    mode http
    clitimeout 60000
    srvtimeout 30000
    timeout connect 4000

    option httpclose
    option abortonclose
    option httpchk
    option forwardfor

    balance roundrobin

    stats enable
    stats refresh 5s
    stats auth admin:123abc789xyz

listen thin 127.0.0.1:8080
    server thin1 10.10.10.2:2010 weight 1 minconn 3 maxconn 6 check inter 20000
    server thin2 10.10.10.2:2011 weight 1 minconn 3 maxconn 6 check inter 20000
    server thin3 10.10.10.2:2012 weight 1 minconn 3 maxconn 6 check inter 20000
    server thin4 10.10.10.2:2013 weight 1 minconn 3 maxconn 6 check inter 20000
    server thin5 10.10.10.2:2014 weight 1 minconn 3 maxconn 6 check inter 20000
    server thin6 10.10.10.2:2015 weight 1 minconn 3 maxconn 6 check inter 20000
    server thin7 10.10.10.2:2016 weight 1 minconn 3 maxconn 6 check inter 20000
    server thin8 10.10.10.2:2017 weight 1 minconn 3 maxconn 6 check inter 20000
    server thin9 10.10.10.2:2018 weight 1 minconn 3 maxconn 6 check inter 20000
    server thin10 10.10.10.2:2019 weight 1 minconn 3 maxconn 6 check inter 20000
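
HAProxy can validate the configuration file before you (re)start it, which saves a lot of head scratching:
haproxy -c -f /etc/haproxy/haproxy.cfg
service haproxy restart
With stats enable set and no explicit stats uri, the built-in statistics page should be available at /haproxy?stats (protected by the stats auth credentials above), which is handy for watching the health checks on the thin instances.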

Thin

My Thin server is actually run on a separate Gentoo box. I installed Thin using the package in Portage.

To configure Thin, I used the following command:

thin config -C /etc/thin/config-name.yml -c /srv/myapp --servers 10 -e production -p 2010

This configures Thin to start 10 servers, listening on ports 2010 to 2019. If you want an init script for Thin, so you can start it at boot, run

thin install

This will create the init script, and you can set it to start up at boot using the normal method (rc-update add thin default or chkconfig thin on).
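
You can also drive the instances directly from the generated config file, which is handy for a quick test before relying on the init script:
thin start -C /etc/thin/config-name.yml
thin stop -C /etc/thin/config-name.yml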

You should now be able to reach your rails app through http://nginx.servers.ip.address

Next, we must configure the static webserver.

Lighttpd

I decided to go with Lighttpd as it is a fast, stable and lightweight webserver which will do the job perfectly with little configuration.

You could also use nginx as the static server instead of using lighttpd, but I decided to separate it.

I decided to use the package from EPEL for Lighttpd, and found that most of the default configuration was as I wanted it to be. The only thing I needed to change was the port and address the server was listening on:

server.port = 8081
server.bind = "127.0.0.1"
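
One thing to keep in mind is that Varnish passes the /static/... URL through to Lighttpd unchanged, so Lighttpd has to be able to resolve that path. If you don’t want a literal static/ directory inside the document root, mod_alias can map the prefix onto it; a small sketch, assuming the EPEL default document root:

server.modules += ( "mod_alias" )
alias.url = ( "/static/" => "/var/www/lighttpd/" )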

And that’s pretty much it! Now you just have to drop any static content into /var/www/lighttpd/ (the default document root that the Lighttpd package in EPEL is configured to use) and reference it using “/static/document_path_of_file”. For example, if I put an image called “bg.png” into /var/www/lighttpd/images/, I can reach it at http://servers_hostname/static/images/bg.png.

I have not really done any performance tests on how well this works, and there are probably many things which I could have done better. This is the first time I have made any attempt at HTTP performance tuning, so I am always looking for feedback or tips on how to make this better. Please do contact me if you have any suggestions! 🙂