Step by step: how to set up a simple service using Tiktalik Computing and the Tiktalik Load Balancer.
An application server
First we need a simple application server. Following the tutorial above, all I need is a local console with a configured environment.
So, let's create a new instance:
(lb)[adam@toddler lb]$ tiktalik create-instance -n pub2 4a2b3e72-47f1-4e88-b482-1834478ade28 0.5 app-backend-1
and look up its IP address:
(lb)[adam@toddler lb]$ tiktalik info -n app-backend-1
app-backend-1 (8a73f0cb-b78c-4d5c-a223-a344c226d6ba) - Running
network interfaces:
  eth0 addr: 37.233.98.111 mac: e6:63:35:f2:fe:b8 network: pub2
default password: a410b4ff14
running image Ubuntu 12.04.3 LTS 64-bit (4a2b3e72-47f1-4e88-b482-1834478ade28)
recent operations:
  Create_Install: app-backend-1
    started at 2013-07-26 22:28:28 +0200 (UTC)
    ended at 2013-07-26 22:29:48 +0200 (UTC)
cost per hour: 0.04920 PLN/h
Next we'll install a standard Apache service there and set up an example page. Without the full console output, the calls look like this:
(lb)[adam@toddler lb]$ ssh root@37.233.98.111
root@37.233.98.111's password:
root@app-backend-1:~# apt-get upgrade
root@app-backend-1:~# apt-get install apache2 libapache2-mod-php5
root@app-backend-1:~# echo '<? echo "Hello Load Balancer!\n"; for ($i=0; $i<1000; $i++) { echo rand(1000, 9999)."\n"; } ?>' >/var/www/test.php
This short PHP script is there to create some load on the server. In our example it's a bit silly, but in the real world the load tends to generate itself in better ways.
Check if it's working:
(lb)[adam@toddler lb]$ wget http://37.233.98.111/test.php -q -O - | head -n 4
Hello Load Balancer!
5066
8235
8372
and our simple application server is ready.
Be my mirror
It's good to make a backup of our application server, no matter whether it'll be load balanced or not, especially when we've put a lot of work into its creation. It's not hard, it doesn't hurt. Still using the command line tool:
(lb)[adam@toddler lb]$ tiktalik stop -n app-backend-1
Instance app-backend-1 (8a73f0cb-b78c-4d5c-a223-a344c226d6ba) is now being stopped
(lb)[adam@toddler lb]$ tiktalik backup -n app-backend-1
Instance app-backend-1 (8a73f0cb-b78c-4d5c-a223-a344c226d6ba) is now being backed up
(lb)[adam@toddler lb]$ tiktalik start -n app-backend-1
Error: TiktalikAPIError: 409 Another operation is currently being performed on this instance
Yes, it does hurt. A little. Unfortunately, the backup operation takes a few minutes. We are working on doing it on the fly, without even rebooting, so look forward to our news. Until then, just wait a while and try again:
(lb)[adam@toddler lb]$ tiktalik start -n app-backend-1
Instance app-backend-1 (8a73f0cb-b78c-4d5c-a223-a344c226d6ba) is now being started
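Waiting and retrying can of course be scripted. A minimal sketch in Python; the helper name and timings are my own, and the only assumption is that the CLI reports the conflict with "409" somewhere in its output, as in the error above:

```python
import subprocess
import time

def run_with_retry(cmd, attempts=10, delay=30):
    """Run a shell command, retrying while it reports a 409 conflict."""
    for _ in range(attempts):
        result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
        if "409" not in result.stdout + result.stderr:
            return result  # no conflict reported; we're done
        time.sleep(delay)  # another operation still in progress; wait and retry
    raise RuntimeError("instance still busy after %d attempts" % attempts)

# run_with_retry("tiktalik start -n app-backend-1")
```

Adjust `delay` to roughly how long a backup usually takes on your instances.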
We've got a backup.
For load balancing to make sense we need at least two identical Instances. A quick solution:
(lb)[adam@toddler lb]$ tiktalik list-images | grep app-backend
06ff60fe-6808-4774-8f83-93472da6d00e "backup of app-backend-1", type=backup (private)
(lb)[adam@toddler lb]$ tiktalik create-instance -b -n pub2 06ff60fe-6808-4774-8f83-93472da6d00e 0.5 app-backend-2
[...]
(lb)[adam@toddler lb]$ tiktalik info -n app-backend-2 | grep addr
eth0 addr: 37.233.98.222 mac: e6:65:f6:8e:1b:f6 network: pub2
Set up the Load Balancer
One short example to see how easy it is:
(lb)[adam@toddler lb]$ tiktalik create-load-balancer -b 37.233.98.111:80:10 -b 37.233.98.222:80:10 -d lb.foo.com simple-lb HTTP
simple-lb (50725e05-a2f6-4531-8cb5-b403032fbbe0) enabled
input: HTTP on a52b2e26-6d35-49d6-82b6-7866d091bdc2.lb.tiktalik.com:80
domains: lb.foo.com
health monitor: tcp-connection, interval 5.000 sec, timeout 1.000 sec
backends:
  37.233.98.111:80, weight=10 (50bc15cd-4b79-42f1-be0c-ed1318667085)
  37.233.98.222:80, weight=10 (856756fc-9647-4d01-9236-1462767a21c6)
To explain what's happening:
- 37.233.98.111:80:10 and 37.233.98.222:80:10 in the parameters define two backend servers, both on port 80 and both with the same weight. In the future you'll be able to add another server, or remove one. So when someone gives you the gift of a popular post about your service on reddit.com, it'll take no more than a few minutes to create and add a new big Instance to your setup, so reddit's readers won't kill it.
- lb.foo.com is a sample domain that our Load Balancer needs to know in order to route requests in the right direction. If you'd like, each Load Balancer can handle more than one domain, e.g. lb.foo.com, www.lb.foo.com and static.foo.com.
- a52b2e26-6d35-49d6-82b6-7866d091bdc2.lb.tiktalik.com is the entry point your Load Balancer has been assigned to. Now you must point your domain at this address with a CNAME DNS record. After that it should look like this:
$ host lb.foo.com
lb.foo.com is an alias for a52b2e26-6d35-49d6-82b6-7866d091bdc2.lb.tiktalik.com.
a52b2e26-6d35-49d6-82b6-7866d091bdc2.lb.tiktalik.com has address 37.233.96.249
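As an aside, equal weights mean the backends are picked alternately, while unequal weights skew the choice proportionally. A sketch of the "smooth" weighted round-robin idea in Python - my own illustration, not necessarily Tiktalik's actual algorithm:

```python
class WeightedRoundRobin:
    """'Smooth' weighted round-robin: each pick goes to the backend whose
    accumulated weight is highest; that backend then pays back the total."""

    def __init__(self, backends):
        # backends: list of (address, weight) pairs, as in -b host:port:weight
        self.backends = [{"addr": a, "weight": w, "current": 0} for a, w in backends]
        self.total = sum(w for _, w in backends)

    def pick(self):
        for b in self.backends:
            b["current"] += b["weight"]
        best = max(self.backends, key=lambda b: b["current"])
        best["current"] -= self.total
        return best["addr"]

lb = WeightedRoundRobin([("37.233.98.111:80", 10), ("37.233.98.222:80", 10)])
```

With equal weights (10 and 10, as above) this alternates strictly between the two backends; with weights 2 and 1 the first backend would get two of every three requests.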
For a simple check that it's working, try:
(lb)[adam@toddler lb]$ wget http://lb.foo.com/test.php -q -O - | head -n 4
Hello Load Balancer!
The response is the same, but it comes via the Load Balancer from one of the backend servers.
Let's balance the load!
Now the main course - what can be done.
Be fast
Take the reddit.com case mentioned above: run the benchmark "ab -k -c 200 -n 10000 <url>" against two URLs. First, directly against one of the application servers:
$ ab -k -c 200 -n 10000 http://37.233.98.111/test.php
[...]
Time taken for tests:   6.337 seconds
Requests per second:    1577.91 [#/sec] (mean)
Time per request:       126.750 [ms] (mean)
Time per request:       0.634 [ms] (mean, across all concurrent requests)
Transfer rate:          8122.26 [Kbytes/sec] received
and second, via the Load Balancer to both application servers:
$ ab -k -c 200 -n 10000 http://lb.foo.com/test.php
[...]
Time taken for tests:   3.679 seconds
Requests per second:    2718.45 [#/sec] (mean)
Time per request:       73.571 [ms] (mean)
Time per request:       0.368 [ms] (mean, across all concurrent requests)
Transfer rate:          13886.93 [Kbytes/sec] received
That's almost twice as fast, in both requests per second and mean time per request.
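The speedup is easy to verify from the ab numbers above; with two equally weighted backends the theoretical ceiling is 2x, and we get about 1.72x on both metrics:

```python
# Numbers taken from the two ab runs above
single_rps, balanced_rps = 1577.91, 2718.45   # requests per second
single_ms, balanced_ms = 126.750, 73.571      # mean time per request [ms]

print(round(balanced_rps / single_rps, 2))  # throughput speedup -> 1.72
print(round(single_ms / balanced_ms, 2))    # latency improvement -> 1.72
```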
Another advantage: Apache by default uses a process per connection. At first I tried to run the benchmarks above with 500 concurrent connections; benchmarking via the Load Balancer was no problem, but when benchmarking the single Instance, ab reported connection errors almost every time. Of course, process-based Apache with PHP is a slightly old-fashioned way to build a fast web application, but still, this example is meant to show how load balancing works.
Be safe
To show how the Load Balancer handles application server inaccessibility, I took a little shortcut: digging into the system, I changed the timeouts for the backend health checks from the defaults of a 5-second interval and 1-second connection timeout (as seen above in the output from Load Balancer creation) to a 0.5-second interval and a 100 ms connection timeout. For now it's not possible to change these timeouts, even via our REST API, but it will certainly become available before we start charging a fee for this service.
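The tcp-connection health monitor from the creation output boils down to one question, asked once per interval: does a TCP connect to the backend succeed within the timeout? A minimal sketch in Python - my own illustration, not the balancer's actual code:

```python
import socket

def backend_is_alive(host, port, timeout=0.1):
    """Return True if a TCP connection to host:port succeeds within
    `timeout` seconds (here 100 ms, matching the shortened setting)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A monitor loop would call this for every backend once per interval (0.5 s after my change) and route new requests only to backends that answer.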
Another trick was to simulate the death of our service - e.g. the OOM killer, a socket limit exceeded, etc. So on one of the application servers I ran this from bash:
while true; do service apache2 stop; sleep 5s; service apache2 start; sleep 5s; done
A standard benchmark against such a server gives results like these:
$ ab -r -k -c 20 -n 100000 http://37.233.98.222/test.php
[...]
Time taken for tests:   87.383 seconds
Complete requests:      100000
Failed requests:        166078
   (Connect: 0, Receive: 55474, Length: 55130, Exceptions: 55474)
Write errors:           0
Keep-Alive requests:    44132
Requests per second:    1144.39 [#/sec] (mean)
Time per request:       17.477 [ms] (mean)
Time per request:       0.874 [ms] (mean, across all concurrent requests)
Transfer rate:          2622.64 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    5 159.8      0    7020
Processing:     0   12 212.3      4   10684
Waiting:        0    5  43.7      0    1207
Total:          0   17 273.5      4   10684
Notice I've added the -r parameter, so that ab won't panic when a socket error occurs.
The same test, but run via the Load Balancer:
Concurrency Level:      20
Time taken for tests:   77.872 seconds
Complete requests:      100000
Failed requests:        0
Write errors:           0
Keep-Alive requests:    99008
Requests per second:    1284.16 [#/sec] (mean)
Time per request:       15.574 [ms] (mean)
Time per request:       0.779 [ms] (mean, across all concurrent requests)
Transfer rate:          6559.93 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   1.3      0     187
Processing:     2   16 160.3     11   10764
Waiting:        2   15 160.3     11   10764
Total:          2   16 160.6     11   10764
No errors at all!
What's next?
Our Load Balancer is still being developed. Before we make it fully usable (and paid), we'd like to add at least:
- backend health monitor tuning - as in the example above, with changeable timeouts, and checks that test not only connectivity but also specific requests and responses;
- current status of the application servers, extended with some counters, possibly with graph visualisation of the number of redirected connections/requests;
- TCP balancing - the ability to balance almost any protocol based on TCP connections, e.g. POP3, SMTP, HTTPS;
- HTTPS balancing - so that the application servers can handle plain HTTP requests, while our balancer takes care of the encrypted layer.