SAVING THE WORLD
From guaranteed APOCALYPSE*	

using varnish, memcached, and some other stuff
* apocalypse not really guaranteed
WHAT IS CACHING?
WHY NOT CACHING IS BAD?
• Keep executing the same code with the same data
• Waste computing power getting the same result
• That power is probably generated by burning coal*
• Burning stuff produces tons of CO2**

* it most likely is not
** probably a smaller unit of mass
Too much CO2 will make

THE EARTH EXPLODE*

* based on pure speculation
WHY SHOULD YOU CARE?
• Your web apps will become WAY faster
• Users and search engines will like you MORE
• You will use A LOT less hardware resources
• You will generate LESS CO2 and/or save $$$
• The Earth will NOT explode and/or you’ll have more $$$
• Women like people who save the world and/or have $$$
• And lots of other stuff*

* 0 or greater amount of other stuff
ABOUT TTL
WHY YOU SHOULD AVOID USING TTL
• You might use obsolete data
• Your server might get a cache stampede and go down
• You should PUSH the fresh data into your cache as soon as you have it, BEFORE the old copy has expired from the cache (see the sketch below)
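A minimal PHP sketch of that push approach, assuming a hypothetical saveProductToDatabase() write path and the $memcachedServers list used later in this deck:

// Push fresh data into the cache at write time, so readers never
// wait for an expired entry to be recomputed.
$memcached = new Memcached();
$memcached->addServers($memcachedServers);

function updateProduct(Memcached $memcached, array $product)
{
    saveProductToDatabase($product);                       // hypothetical real write path
    // Overwrite the cached copy immediately; 0 = no TTL-based expiry.
    $memcached->set('product_' . $product['id'], $product, 0);
}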
WAIT, WHAT IS A CACHE STAMPEDE?

[Animated chart: requests piling up over seconds]

1. A critical piece of your cached data expires through TTL (or is evicted)
2. A client requests a service which relies on that data
3. That data takes a relatively long time to compute
4. Other requests come in that need the same data
5. A lot of them stack up on the server before the first one is even finished
503 SERVICE UNAVAILABLE
I DON'T WANT THAT!
No you don’t!
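Besides pushing fresh data proactively, a common mitigation is to let only one request recompute while the others wait briefly. A hedged PHP sketch using Memcached::add() as a short-lived lock ($recompute stands in for your slow code path; the Memcached client itself is introduced in the next section):

// Stampede protection: only the request that wins the lock recomputes.
function getWithLock(Memcached $memcached, $key, callable $recompute)
{
    $value = $memcached->get($key);
    if ($value !== false) {
        return $value;                                 // cached copy still there
    }
    if ($memcached->add($key . '_lock', 1, 30)) {      // add() succeeds for one request only
        $value = $recompute();                         // hypothetical slow computation
        $memcached->set($key, $value, 0);
        $memcached->delete($key . '_lock');
        return $value;
    }
    usleep(200000);                                    // everyone else waits 200 ms and retries once
    return $memcached->get($key);
}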
MEMCACHED
HOW DO I CACHE THINGS?

1. Create a Memcached instance

$memcached = new Memcached;
$memcached->addServers( $memcachedServers );

2. Put data in

$memcached->set( $key, $value, $expireAt );

3. Get data out

$memcached->get( $key );
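A hedged sketch tying the three steps together with a fallback on a cache miss (loadProductFromDatabase() is a hypothetical helper):

function getProduct(Memcached $memcached, $id)
{
    $key = 'product_' . $id;
    $product = $memcached->get($key);

    if ($product === false && $memcached->getResultCode() === Memcached::RES_NOTFOUND) {
        // Cache miss: fall back to the slow path, then repopulate the cache.
        $product = loadProductFromDatabase($id);   // hypothetical slow path
        $memcached->set($key, $product, 0);        // 0 = no expiry; push updates instead
    }

    return $product;
}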
A SIMPLE BENCHMARK
HELPFUL TIPS
• It’s best if you cache the final result of an operation rather than the input data
• You should always have a fallback if you get a cache miss
• Try to avoid flushing the entire cache; use clever key names instead
• Use Memcached::getAllKeys() to help you manage/release/update data
• Use Memcached::getStats() to help you improve efficiency
• Have a warmup script! (see the sketch below)
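A minimal warmup-script sketch, assuming a hypothetical list of hot keys and a hypothetical computeValueFor() producer:

// Warm the cache before real traffic hits it, so the first visitors
// don't all fall through to the slow path at once.
$memcached = new Memcached();
$memcached->addServers($memcachedServers);

$hotKeys = array('homepage_blocks', 'recent_products', 'category_tree');  // illustrative
foreach ($hotKeys as $key) {
    $memcached->set($key, computeValueFor($key), 0);   // hypothetical producer; 0 = no expiry
}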
WHAT TO CHECK IN STATS()

…
["get_hits"]   => int(110825125)
["get_misses"] => int(17396765)
["evictions"]  => int(0)
…
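A quick sketch of turning those counters into a hit ratio (the server address is illustrative):

$memcached = new Memcached();
$memcached->addServer('127.0.0.1', 11211);   // illustrative server

foreach ($memcached->getStats() as $server => $stats) {
    $total = $stats['get_hits'] + $stats['get_misses'];
    $ratio = $total > 0 ? $stats['get_hits'] / $total : 0;
    printf("%s: %.1f%% hit ratio, %d evictions\n", $server, $ratio * 100, $stats['evictions']);
}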
VARNISH
VARNISH IS:
• A caching HTTP reverse proxy
• Really, really, really FAST
• Usually limited by the speed of the network
• Decently flexible thanks to the VCL configuration language
A SIMPLE BENCHMARK
NICE SPEED
Now let’s see how to use Varnish effectively on my very dynamic site
COMMON PROBLEMS TO OVERCOME
• My pages are a mix of highly dynamic sections and mostly static stuff, and Varnish supposedly only caches whole pages
• I need to control/flush/refresh the cache without stopping/starting/killing/rebooting/pulling the cord/assaulting the datacenter, and I prefer to do it from within my app
• My visitors have unique stuff
	• Sessions
	• Cookies
	• Statistics and tracking visitors
ABOUT ESI
• Edge Side Includes, or ESI, is a small markup language for edge-level dynamic web content assembly. The purpose of ESI is to tackle the problem of web infrastructure scaling.

<HTML>
<BODY>
…
<esi:include src="/esi/private/recentproducts"/>
…
</BODY>
</HTML>
[Example page split into ESI fragments, each annotated with its cache lifetime: parts that don't change at all, session-specific fragments cached for 1 minute and 24 hours, and shared fragments cached for 2-4 minutes and 1 hour]
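On the backend, each <esi:include> URL is just a normal endpoint that renders one fragment. A hedged PHP sketch for the /esi/private/recentproducts URL used above (renderRecentProducts() and getRecentProductsForSession() are hypothetical):

// Varnish fetches this URL separately and caches it with its own TTL,
// while the surrounding page is assembled at the edge.
header('Content-Type: text/html; charset=utf-8');
header('Cache-Control: s-maxage=60');   // illustrative per-fragment lifetime

$sessionId = isset($_COOKIE['PHPSESSID']) ? $_COOKIE['PHPSESSID'] : null;
echo renderRecentProducts(getRecentProductsForSession($sessionId));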
SETTING UP BACKENDS

backend www {
    .host = "192.168.0.2";
    .port = "81";
    .connect_timeout = 1s;
    .first_byte_timeout = 5s;
    .between_bytes_timeout = 2s;
}
HOW DOES IT WORK?

[VCL request-flow diagram: a client request enters vcl_recv, which returns pipe (vcl_pipe), pass (vcl_pass) or lookup (vcl_hash); the hash lookup ends in vcl_hit or vcl_miss; misses and passes fetch from a backend (Backend1/Backend2) and go through vcl_fetch; everything is delivered through vcl_deliver, with vcl_error handling failures]
vcl_recv
• First checkpoint when a request arrives and is parsed
• We must decide whether to lookup, pass or pipe the request
• We can choose a backend to use
• We have the req object
• Definition of PURGE, BAN or REFRESH-like requests goes here
• We can set a header on the req object to tell our backend the request is from Varnish
set req.backend = default;
set req.http.X-Varnish-Handshake = "1";
set req.http.X-Forwarded-For = client.ip;

if (req.url ~ "/esi/") {
    # Remember which kind of ESI fragment this is (e.g. "private")
    set req.http.X-Varnish-Esi = regsub(req.url, "^/esi/(\w+)/.*", "\1");
    remove req.http.Accept-Encoding;
}
if (req.request != "GET" && req.request != "HEAD") {
    # We only deal with GET and HEAD by default
    return (pass);
}
if (req.http.Cookie !~ "PHPSESSID=") {
    call generate_session;
}
return (lookup);
WAIT, WHAT?

sub generate_session {
    C{
        char uuid_buf[50];
        generate_uuid(uuid_buf);
        VRT_SetHdr(sp, HDR_REQ,
            "\030X-Varnish-Fake-Session:",
            uuid_buf,
            vrt_magic_string_end
        );
    }C

    if (req.http.Cookie) {
        set req.http.Cookie = req.http.X-Varnish-Fake-Session + "; " + req.http.Cookie;
    } else {
        set req.http.Cookie = req.http.X-Varnish-Fake-Session;
    }
}
C{
    #include <stdlib.h>
    #include <stdio.h>
    #include <time.h>
    #include <pthread.h>

    static pthread_mutex_t lrand_mutex = PTHREAD_MUTEX_INITIALIZER;

    /* Build a fake PHPSESSID cookie value shaped like a UUID. */
    void generate_uuid(char* buf) {
        pthread_mutex_lock(&lrand_mutex);   /* lrand48() is not thread safe */
        long a = lrand48();
        long b = lrand48();
        long c = lrand48();
        long d = lrand48();
        pthread_mutex_unlock(&lrand_mutex);
        sprintf(buf, "PHPSESSID=%08lx%04lx%04lx%04lx%04lx%08lx",
            a,
            b & 0xffff,
            ((b & 0x0fff0000) >> 16) | 0x4000,
            (c & 0x0fff) | 0x8000,
            (c & 0xffff0000) >> 16,
            d
        );
        return;
    }
}C
HOW DOES IT WORK?

[The same VCL request-flow diagram, repeated as a divider before the vcl_hash step]
vcl_hash
• Generates the hash through which Varnish looks up an object
• We have the req object
• We can make certain objects unique in the cache based on something more than just the URL - like a session cookie
hash_data(req.url);
if (req.http.host) {
    hash_data(req.http.host);
} else {
    hash_data(server.ip);
}

if (req.http.Accept-Encoding) {
    hash_data(req.http.Accept-Encoding);
}

# Private ESI fragments get a per-session copy in the cache
if (req.http.X-Varnish-Esi == "private" && req.http.Cookie ~ "PHPSESSID=") {
    hash_data(regsub(req.http.Cookie, "^.*?PHPSESSID=([^;]*);*.*$", "\1"));
}

return (hash);
HOW DOES IT WORK?

[The same VCL request-flow diagram, repeated as a divider before the vcl_fetch step]
vcl_fetch
• Takes control when a response from the backend is fetched and parsed
• We have the req and beresp objects
• A good place to sanitise the backend response and control TTL
• Removing the Set-Cookie header is a good practice here
• Add helper headers to the cached object for the ban lurker
• We can choose to deliver or hit_for_pass here
beresp.ttl

Before Varnish runs vcl_fetch, the beresp.ttl variable has already been set to a value. It will use the first value it finds among:

• The s-maxage variable in the Cache-Control response header
• The max-age variable in the Cache-Control response header
• The Expires response header
• The default_ttl parameter
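So the backend can steer beresp.ttl per response through ordinary headers. A hedged PHP sketch (values are illustrative; renderPage() is hypothetical):

// s-maxage is honoured by shared caches like Varnish, max-age by browsers.
header('Cache-Control: s-maxage=120, max-age=0');   // Varnish keeps it for 2 minutes, browsers don't
echo renderPage();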
set beresp.http.X-Url = req.url;
set beresp.http.X-Host = req.http.host;
set beresp.http.X-Varnish-Session = regsub(req.http.Cookie, "^.*?PHPSESSID=([^;]*);*.*$", "\1");

if (beresp.status != 200 && beresp.status != 404) {
    set beresp.ttl = 15s;
    return (hit_for_pass);
}
if (beresp.http.Set-Cookie) {
    remove beresp.http.Set-Cookie;
}
if (beresp.http.X-Varnish-Esi == "1") {
    set beresp.do_esi = true;
}
if (req.url ~ "\.(jpg|jpeg|gif|otf|png|ico|css|zip|tgz|gz|rar|bz2|pdf|txt|tar|wav|bmp|rtf|js|flv|swf|scripts)$") {
    set beresp.ttl = 180m;
}
return (deliver);
HOW DOES IT WORK?

[The same VCL request-flow diagram, repeated as a divider before the vcl_deliver step]
vcl_deliver
• Takes control just before a response is sent to the client
• We have the req and resp objects
• Executes after hit, miss and fetch, hit_for_pass or pass (but not pipe)
• Removing all the headers we set during the VCL flow is a good idea here
• We can also add headers here that should go to the client, but shouldn’t be in the cache
if (req.http.X-Varnish-Fake-Session) {
    call generate_session_expires;
    set resp.http.Set-Cookie = req.http.X-Varnish-Fake-Session
        + "; expires=" + resp.http.X-Varnish-Cookie-Expires + "; path=/";
    if (req.http.Host) {
        set resp.http.Set-Cookie = resp.http.Set-Cookie + "; domain=" + regsub(req.http.Host, ":\d+$", "");
    }
    set resp.http.Set-Cookie = resp.http.Set-Cookie + "; httponly";
    unset resp.http.X-Varnish-Cookie-Expires;
}
if (!client.ip ~ debug) {
    unset resp.http.X-Host;
    unset resp.http.X-Url;
    unset resp.http.X-Varnish-Session;
} else {
    if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT";
    } else {
        set resp.http.X-Cache = "MISS";
    }
}

return (deliver);
ACLs

acl purge {
    "localhost";
    "127.0.0.1";
}

acl debug {
    "192.168.0.128";
}
INVALIDATING CACHED OBJECTS
• We can control cached objects through HTTP requests to Varnish with some clever VCL-ing
• PURGE - we can purge a single object from the cache
• BAN - we can ban a selection of matching objects from the cache
• REFRESH - we can fetch a new copy of an object while the old one is still served in the meantime
sub vcl_recv {
    if (req.request == "PURGE") {
        if (!client.ip ~ purge) {
            error 405 "Not allowed.";
        }
        return (lookup);
    }
}

sub vcl_hit {
    if (req.request == "PURGE") {
        purge;
        error 200 "Purged";
    }
}

sub vcl_miss {
    if (req.request == "PURGE") {
        error 404 "Not in cache";
    }
}
$cacheServerSocket = fsockopen($varnishHostname, 80, $errno, $errstr, 2);

$request  = "PURGE /something.htm HTTP/1.0\r\n";
$request .= "Host: www.varnished-site.com\r\n";
$request .= "Connection: Close\r\n\r\n";

fwrite($cacheServerSocket, $request);
$response = fgets($cacheServerSocket);
fclose($cacheServerSocket);
sub vcl_recv {
    if (req.request == "BAN") {
        if (!client.ip ~ purge) {
            error 405 "Not allowed.";
        }
        ban("obj.http.X-Host ~ " + req.http.host + " && obj.http.X-Url ~ " + req.url);
        error 200 "Banned";
    }
}
sub vcl_recv {
    if (req.request == "REFRESH") {
        if (!client.ip ~ purge) {
            error 405 "Not allowed.";
        }
        set req.request = "GET";
        set req.hash_always_miss = true;
    }
}
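A hedged PHP helper in the same spirit as the PURGE example earlier, for sending any of these custom methods from the app (host names and paths are illustrative):

function invalidate($varnishHostname, $method, $path, $host)
{
    // Issue PURGE / BAN / REFRESH requests against Varnish over a raw socket.
    $socket = fsockopen($varnishHostname, 80, $errno, $errstr, 2);

    $request  = "$method $path HTTP/1.0\r\n";
    $request .= "Host: $host\r\n";
    $request .= "Connection: Close\r\n\r\n";

    fwrite($socket, $request);
    $statusLine = fgets($socket);   // e.g. "HTTP/1.1 200 Purged"
    fclose($socket);

    return trim($statusLine);
}

// Illustrative usage:
invalidate('varnish.internal', 'REFRESH', '/something.htm', 'www.varnished-site.com');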
VMODs
COMMON PROBLEMS TO OVERCOME
• My pages are a mix of highly dynamic sections and mostly static stuff, and Varnish supposedly only caches whole pages => Use ESI
• I need to control/flush/refresh the cache without stopping/starting/killing/rebooting/pulling the cord/assaulting the datacenter, and I prefer to do it from within my app => Set up PURGE/BAN/REFRESH in the VCL
• My visitors have unique stuff => Use the session cookie in vcl_hash to keep a unique copy
	• Sessions => Use the generate-session-in-Varnish trick
	• Cookies => Uhhh, don't use 'em?
	• Statistics and tracking visitors => Use the memcached VMOD and process stuff asynchronously on the backend (see the sketch below)
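The VMOD side depends on which memcached vmod build you use, so here is only the backend half as a hedged PHP sketch: counters accumulated in Memcached and flushed asynchronously by a cron job (key names and persistStats() are hypothetical):

// Cheap page-view counting that doesn't touch the database on every hit.
$memcached = new Memcached();
$memcached->addServers($memcachedServers);

// On each tracked request: bump a counter.
$key = 'pageviews_' . md5($_SERVER['REQUEST_URI']);
if ($memcached->increment($key) === false) {
    $memcached->add($key, 1, 0);   // first hit for this key
}

// In a cron job: read the counters and persist them in one batch.
// foreach ($memcached->getAllKeys() as $key) {
//     if (strpos($key, 'pageviews_') === 0) {
//         persistStats($key, (int) $memcached->get($key));
//         $memcached->delete($key);
//     }
// }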
OTHER STUFF
QUESTIONS?*

* answers not guaranteed to be available and/or true
