
Load Balancing Techniques

Load balancing is a method of distributing incoming socket connections across different servers. It is not distributed computing, where a job is broken into a series of sub-jobs so that each server does a fraction of the overall work. Rather, each incoming connection is delegated to one node, the entire interaction occurs on that node, and no node is aware of the others' existence.
Why do you need load balancing?
Simple answer: Scalability and Redundancy.
Scalability
If your application becomes busy, resources such as bandwidth, CPU, memory, disk space, and disk I/O may reach their limits. To remedy such a problem, you have two options: scale up, or scale out. Load balancing is a scale-out technique. Rather than increasing server resources, you add cost-effective commodity servers, creating a "cluster" of servers that perform the same task. Scaling out is more cost effective, because commodity-level hardware provides the most bang for the buck. High-end hardware comes at a premium and can be avoided in many cases.
Redundancy
Servers crash; this is the rule, not the exception. Your architecture should be devised in a way that reduces or eliminates single points of failure (SPOF). Load balancing a cluster of servers that perform the same role makes it possible to take a server out manually for maintenance tasks without taking down the system. You can also withstand a server crashing. This is called High Availability, or HA for short. Load balancing is a tactic that assists with High Availability, but it is not High Availability by itself. To achieve high availability, you need automated monitoring that checks the status of the applications in your cluster and takes servers out of rotation automatically when a failure is detected. These tools are often bundled into load balancing software and appliances, but sometimes need to be programmed independently.
How to perform load balancing?
There are three well-known ways:
  1. DNS based
  2. Hardware based
  3. Software based

DNS based
This is also known as round-robin DNS. You publish multiple A records for the same hostname, and the name server rotates the order of the records in its responses, so requests are spread across the servers. If you wish to weight it (say serverA can take 2x the number of requests that serverB can), you can simply add more A records pointing at that server.
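
As a minimal sketch, a BIND-style zone file for this might look like the following (names and addresses are hypothetical; note that most DNS servers collapse exact duplicate records, so weighting serverA in practice means giving it an additional IP address):

www    IN  A   192.0.2.10   ; serverA
www    IN  A   192.0.2.11   ; serverA's second address, doubling its share
www    IN  A   192.0.2.20   ; serverB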
Hardware based
There are many commercial vendors out there selling appliances to perform load balancing.
Hardware based load balancing is the best way to go, if you have budget for it. These appliances provide the latest features, with little fuss.
Software based
This is where it gets fun, if you're a technology enthusiast. If your budget doesn't allow for a load balancing appliance, or if you just like doing things yourself, software-based load balancing is for you. You can turn a Linux server into your own load balancing appliance. Presumably you could also use a Windows server, maybe even a Mac, but this article doesn't cover those. On RHEL-based distributions, the "piranha" package provides Linux Virtual Server (LVS) and piranha (an LVS management tool with a web-based GUI). Just "yum install piranha" and you'll have everything you need to get started. Other software includes BalanceNG (commercial) and its basic freeware counterpart, balance.
balance is super simple to use. Just download it and run the program. There are a few basic input parameters, and you can be load balancing in no time. It is a no-frills binary: there are no configuration files, no startup/shutdown scripts, no logging or reporting. But it does have a nifty console that you can get runtime statistics from. You could build your own tools around balance to monitor and gather statistics.
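
As an illustrative invocation (the backend hostnames are hypothetical), balance can start forwarding connections with a single command, listening on port 80 and round-robining connections across two backends:

balance 80 serverA serverB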
LVS and piranha on RHEL or CentOS

piranha is a GUI that makes configuring Linux Virtual Server (LVS) easy. LVS supports the following virtual server scheduling algorithms:
  • Round robin
  • Weighted least-connections
  • Weighted round robin
  • Least-connection
  • Locality-Based Least-Connection Scheduling
  • Locality-Based Least-Connection Scheduling (R)
  • Destination Hash Scheduling
  • Source Hash Scheduling
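
Under the hood, these schedulers live in the kernel's IP Virtual Server table, which piranha manages for you. As a rough sketch of doing the same by hand with ipvsadm (the virtual IP and real-server addresses are hypothetical), here is weighted round robin in NAT mode:

ipvsadm -A -t 192.0.2.100:80 -s wrr                    # virtual service, weighted round robin
ipvsadm -a -t 192.0.2.100:80 -r 192.0.2.11:80 -m -w 2  # real serverA, weight 2
ipvsadm -a -t 192.0.2.100:80 -r 192.0.2.12:80 -m -w 1  # real serverB, weight 1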
Courtesy: http://www.koopman.me

Directory Tree in Linux


If you want to see the hierarchy of a directory in a tree format, there is the tree command for that. Here the command is wrapped in a small loop, with tree's -h flag added, so that the output is broken down per top-level directory and includes file sizes as well, making it more informative.

Try this command out in the bash shell and see the output:

#     for i in */; do tree -h "$i"; done

Note: If the directory structure is too long to fit on one screen, you can use the following to save the output to a file:

#     for i in */; do tree -h "$i"; done > result.txt

Find Total Number of Sub-directories and Files in a Directory


We have often wondered how many sub-directories and files there are, recursively, under a directory in Linux. Sometimes we also need the number to make sure we don't exceed the inode quota on shared hosting servers.

The following code snippet will give you a count of the files and directories inside a folder, broken down by file type. Run this command after doing a 'cd' into the folder.

find . -exec stat -c '%F' {} \; | sort | uniq -c | sort -rn
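
Running stat once per file is slow on large trees. With GNU find, a lighter sketch of the same idea prints each file's type code directly (%y gives f for regular files, d for directories, and so on):

find . -printf '%y\n' | sort | uniq -c | sort -rn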

If you want a tree view of the directories, with tree's file and directory counts at the end of each listing, you can use this:

for i in */; do tree "$i"; done > result.txt

or try this one, which strips the individual entries and leaves just the per-directory summaries:

for i in */; do tree "$i" | grep -v -- '-- '; done

Quite simple but effective one-liners, aren't they?  :-)

Configure NTP to Synchronize Server Time


The following steps are sufficient to set up NTP synchronization.

Log in as the root user.

Type the following command to install NTP:
# yum install ntp

Turn on the service:
# chkconfig ntpd on

Synchronize the system clock with the pool.ntp.org servers:
# ntpdate pool.ntp.org

Start the NTP daemon:
# /etc/init.d/ntpd start
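
To verify that ntpd is synchronizing, list its peers; the output shows each server being polled along with the delay and offset values:

# ntpq -p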

Passwordless SSH Login

You want to access computer B securely from computer A without having to enter a password. This technique is very useful for automation, where you don't want to put passwords into scripts.

On computer A, log in as the user you want to use to access computer B, and generate a public/private key pair by issuing the following command. Simply press return at the prompts to use an empty passphrase.

ssh-keygen -t rsa

This command creates a public/private key pair under the ~/.ssh directory in your home directory. Your public key is stored in the file

~/.ssh/id_rsa.pub

Now we need to create a ~/.ssh directory on the remote computer B, if it doesn't already exist, under the username we're going to access it with. Substitute the username you're going to use in the following; at this stage you'll need to enter the password for username@B.

> ssh username@B mkdir -p .ssh
username@B's password:

Now use ssh to push your public key to computer B with the following command; you'll again need to enter username@B's password.

> cat .ssh/id_rsa.pub | ssh username@B 'cat >> .ssh/authorized_keys'
username@B's password:
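
On systems where OpenSSH provides the ssh-copy-id helper, the directory creation and key-copy steps above collapse into a single command:

> ssh-copy-id username@B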

That's it!

From now on, entering

ssh username@B

will get you logged in as username on computer B without the need to enter a password.

Analyse LAMP Server Slowness


Analysing LAMP server slowness basically comes down to the following four steps:
  1. Track MySQL Queries Taking over a Second

    By default, mysql will log queries which take 10 seconds or longer. Depending on your installation, these may or may not be logged to a file. Certainly, if you have queries showing up here at 10+ seconds, you should investigate them. In most cases, you'll be investigating performance well before the point at which you have numerous queries taking over ten seconds. On a production site, the vast majority of your queries should return in significantly under a second.

    To lower your slow query threshold and enable logging, you'll want to modify your my.cnf (often /etc/mysql/my.cnf), and ensure that you have settings like this:
    set-variable=long_query_time=1
    log-slow-queries=/var/log/mysql/slow_query.log


    You may need to touch that file and ensure that it is owned and writable by your mysql process. When you restart mysql, look in the mysql error log (often /var/log/mysql/mysql.err) for any errors along the lines of "Could not use /var/log/mysql/slow_query.log for logging (error 13)". If you see these, create slow_query.log and set its ownership to that of your mysql user.

    Now, depending on the state of your system, slow_query.log will begin to accumulate queries. The actual format of the slow log is a bit verbose, but mysqldumpslow, a Perl script included with most mysql installations, can parse it and produce more meaningful output. It takes the various integers in your queries (a user_id, thread_id, etc.) and generalizes them, so you can locate types of queries instead of specific ones.
    > mysqldumpslow -t=10 -s=t /var/log/mysql/slow_query.log
    Reading mysql slow query log from /var/log/mysql/slow_query.log
    Count: 46  Time=80.46s (3701s)  Lock=0.00s (0s)  Rows=512311 (117447821), bob[bob]@[10.0.0.32]
      SELECT * FROM forum_posts
    Count: 26  Time=68.26s (1775s)  Lock=0.00s (0s)  Rows=426 (117447821), bob[bob]@[10.0.0.32]
      SELECT * FROM forum_posts WHERE thread=N
    Count: 120  Time=3.52s (422s)  Lock=0.63s (76s)  Rows=58.0 (6960), bob[bob]@[10.0.0.32]
      SELECT authors FROM forum_posts WHERE lastpost > N
    ...

    The next step is analyzing this, likely throwing each of these into an EXPLAIN query (or asking yourself why you are selecting every row in the forum_posts table), adding some indexes, and rewriting some code. The scope of this article is finding the bottlenecks… fixing them is left as an exercise for the reader.
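
    For example, taking the second query from the sample output above (the index name here is hypothetical), you might examine and then fix it like so:
    > mysql -e 'EXPLAIN SELECT * FROM forum_posts WHERE thread = 42'
    > mysql -e 'ALTER TABLE forum_posts ADD INDEX idx_thread (thread)'

    If EXPLAIN shows a full table scan (type ALL, no usable key), adding the index should bring the query time down dramatically.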
  2. Monitor PHP Memory Usage & Log Apache Delivery Times

    Out of the box, your apache install is likely using the NCSA extended/combined log format. You're going to take this format and add two pieces of data to it. The first is the memory used by PHP during the rendering of each page. The second is the time apache spends delivering the page. Both values will be tacked onto the end of the log format; many log processing scripts ignore extra fields at the end of a line, so adding them there is least likely to break things.

    Unless you've mucked with it, your httpd.conf likely has lines like this:

    LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
      LogFormat "%h %l %u %t \"%r\" %>s %b" common
    CustomLog /usr/local/apache/logs/access_log common
      </pre>
    
    Whichever log format is being used (its name appears at the end of the CustomLog directive), you're going to make a copy of that LogFormat, give it a name like "commondebug", and switch the CustomLog directive to use the new format.
      
    The fields you will be adding are:
    * %T – The time taken to serve the request, in seconds
    * %{mod_php_memory_usage}n – Memory used by PHP, in bytes
    
      
    LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %T %{mod_php_memory_usage}n" combineddebug
      LogFormat "%h %l %u %t \"%r\" %>s %b %T %{mod_php_memory_usage}n" commondebug
    CustomLog /usr/local/apache/logs/access_log commondebug
      </pre>
    
    At this point, you'll be collecting some great data in your apache logs. You can get some good information with a bit of quick shell magic, like so:
    
    
    
    > awk '{printf("%07d\t%d\t%s\n", $(NF), $(NF-1), $7)}' access_log | sed 's/\?.*//' | sort -g -k1
    0001232 0       /baz.php
    0001232 0       /bar.php
    ...
    1712160 0       /foo.php
    1717640 0       /foo.php
    1907800 0       /foo.php
    2010840 0       /foo.php
    
    Replace -k1 with -k2 to sort by the delivery times. Keep in mind that delivery times include the time taken to send the bytes over the network; http clients can do screwy things, and you'll occasionally see anomalous data, including 120+ second requests where the client simply stopped accepting packets but didn't close the socket.
      
    From here, you'll want to examine each of the memory-hogging scripts, and anything which is consistently long-running.
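
    For instance, the same one-liner sorted by delivery time rather than memory usage:
    > awk '{printf("%07d\t%d\t%s\n", $(NF), $(NF-1), $7)}' access_log | sed 's/\?.*//' | sort -g -k2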
  3. Log PHP Errors

    This one is obvious, but easy to miss. Many sites disable the display of errors via php.ini on their production servers. (This is a good idea, as it prevents the revelation of inappropriate information to end users.)

    You'll be modifying your php.ini to include lines such as these:
    error_reporting = E_ALL & ~E_NOTICE        ; Show all errors except for notices
    display_errors  = Off                      ; Do not print out errors (as part of the HTML output)
    log_errors      = On                       ; Log errors into a log file
    error_log       = "/var/log/php_error.log" ; Log errors to the specified file


    By itself, this isn't likely to pinpoint any performance issues, but it may help you locate other problems. If things are really bad, you may see a slew of "maximum execution time of XX seconds exceeded" errors on various pages. More likely, you will see errors which correlate to the long-running or memory-consuming scripts and queries identified earlier.
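
    As a quick sketch for summarizing that log (assuming PHP's default "[timestamp] message" line format), the following counts the most frequent errors:
    > awk -F'] ' '{print $2}' /var/log/php_error.log | sort | uniq -c | sort -rn | head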
  4. Take Snapshots at an OS Level

    This is the area that makes the average developer wish they had a sysadmin on call. Unfortunately, for many small sites and projects, the developer is forced to wear that hat as well. At some point, if usage gets high enough, no amount of redesign or optimization will stretch your hardware further. The question then becomes: what is the bottleneck? RAM, I/O, CPU?

    On this topic, your options are nearly endless. Linux provides countless tools for monitoring resources, and different people will recommend different ones. Something incredibly simple, like the script below, can be pretty informative.
    #!/usr/bin/perl
    # Poll a URL periodically; when it responds slowly, snapshot system state.
    my $URL_TO_TEST = 'http://test.test/test.test';
    my $THRESHOLD   = 2;    # seconds

    open(my $log, '>>', 'status_log.txt') or die "cannot open log: $!";
    while (1) {
        my $start = time();
        `curl -s "$URL_TO_TEST" > /dev/null`;   # -s keeps curl's progress meter quiet
        my $took = time() - $start;
        if ($took > $THRESHOLD) {
            print $log "$took seconds load ::";
            print $log `date`;
            print $log "\n\n";
            print $log `vmstat -a`;   # memory and swap activity
            print $log "\n\n";
            print $log `uptime`;      # load averages
            print $log "\n\n";
            print $log `ps awux`;     # full process list
            print $log "\n\n";
            print $log `mysqladmin -hHOST -uUSER -pPASSWORD processlist`;
            print $log "----------------------------------------------\n\n";
        }
        sleep(30);
    }
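
    Save it (say, as a hypothetical watcher.pl) and leave it running in the background; it will quietly collect evidence each time the site slows down:
    > nohup perl watcher.pl &
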
After these tests you will have a good idea of where the bottleneck lies, and can focus your efforts on resolving it.