<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule"
	>
<channel>
	<title>Comments on: Best open-source software for a firewall/load balancer?</title>
	<atom:link href="http://blogs.law.harvard.edu/philg/2006/11/06/best-open-source-software-for-a-firewallload-balancer/feed/" rel="self" type="application/rss+xml" />
	<link>http://blogs.law.harvard.edu/philg/2006/11/06/best-open-source-software-for-a-firewallload-balancer/</link>
	<description>A posting every day; an interesting idea every three months...</description>
	<lastBuildDate>Tue, 24 Nov 2009 02:24:17 -0500</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Jesper Mortensen</title>
		<link>http://blogs.law.harvard.edu/philg/2006/11/06/best-open-source-software-for-a-firewallload-balancer/comment-page-1/#comment-19557</link>
		<dc:creator>Jesper Mortensen</dc:creator>
		<pubDate>Thu, 21 Dec 2006 11:23:35 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.law.harvard.edu/philg/2006/11/06/best-open-source-software-for-a-firewal#comment-19557</guid>
		<description>I didn&#039;t see this post before now, but I hope it&#039;s not too late to offer some good, free advice here. :-)

LOAD BALANCERS:
My understanding is that Willy Tarreau&#039;s HAProxy is the premier open-source HTTP load balancer today. I have not used it myself, but it has a good reputation and seems well designed (http://haproxy.1wt.eu/). Its author has written a great article that gives a solid overview of current best practices for load balancers, at http://1wt.eu/articles/2006_lb/ .

I have used Pound myself, and it&#039;s really nice (www.apsis.ch/pound/). Strong points are SSL support (not a requirement here) and ease of setup. I&#039;m not sure if Pound is the optimal choice for a high-performance scenario, but then again I really don&#039;t know how fast Pound can go. Consider staying a release or two behind with Pound; the newest versions sometimes have less well-tested functionality.

I have never used Danga&#039;s Perlbal, but from what else I have seen from those guys I would have no reservations. Their site has a very good presentation about high-performance, dynamic-content web architectures (www.danga.com/words/).

The new kid on the block will probably be Varnish Cache. It&#039;s brand new and not quite polished, but its architecture will allow it to scale beyond all the other contenders, and I know its main programmer to be really sharp. It&#039;s probably not ready for your use now, but it should mature within the next year (www.varnish-cache.org; www.nuug.no/aktiviteter/20060919-varnish/).


Looking towards the future, it seems to me that the trend is to move away from the classic HTTP load balancer. Today HTTP servers have load balancing built in, and can serve static content directly from the front-end server (the “load-balancer”), while balancing retrieval of dynamic content from back-end app servers via HTTP. Apache 2.2 can do this out of the box, I believe Lighttpd can too (www.lighttpd.net), and I know that Litespeed Web Server 2.2 can (www.litespeedtech.com). The Litespeed server has a really good rep in the Ruby on Rails community; Lighttpd has a strong rep for high speed but has been a bit brittle in use. The website of Mongrel HTTPd has good docs on configuring these servers (http://mongrel.rubyforge.org &gt; Documentation).

Performance is really a non-issue of sorts today. If your preferred load balancer / front-end HTTPd isn&#039;t fast enough, then you just add more copies of it. Typically you&#039;d go with two cloned LBs, DNS round-robin between the two, and IP-level fail-over. This setup is cheap, and gives you good protection in the event of traffic spikes OR hardware failure (but not both a traffic spike AND an LB failure). There are many IP-level fail-over systems (Linux HA, Linux Virtual Servers, VRRPd, etc.), but Whack-a-Mole seems to be a favorite despite its age (www.backhand.org).
Another thing is that your usage is atypical. Most LBs hit their bottleneck in the session creation phase, i.e. how many new TCP/HTTP sessions can be established per second. Photo.net would create most of its load by sending large files in long-running TCP sessions. The internal design and buffering strategy of the balancer would matter much more than usual. I can&#039;t say how this will work out, but my guess is that 150 Mbps in long-running TCP sessions is a piece of cake for a single contemporary server. Benchmarking your specific setup is really the only way to know.
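
To make the spike-OR-failure point concrete, here is a rough Python sketch of the headroom arithmetic for two cloned balancers. The load and capacity figures are made up for illustration, and it assumes DNS round-robin splits traffic evenly across the healthy balancers, which real resolvers only approximate.

```python
def survives(load_mbps, capacity_per_lb_mbps, healthy_lbs):
    """Return True if the healthy balancers can carry the load between them."""
    return healthy_lbs * capacity_per_lb_mbps >= load_mbps


# Hypothetical figures: 150 Mbps normal load, a 350 Mbps spike,
# and 200 Mbps of capacity per balancer.
NORMAL, SPIKE, CAPACITY = 150, 350, 200

print(survives(SPIKE, CAPACITY, healthy_lbs=2))   # spike, both balancers up
print(survives(NORMAL, CAPACITY, healthy_lbs=1))  # one balancer failed, normal load
print(survives(SPIKE, CAPACITY, healthy_lbs=1))   # spike and failure together
```

With these assumed numbers the first two checks pass and the third does not, which is the gap in protection described above. Benchmarking would replace the guessed capacity figure with a measured one.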

APPLIANCES:
Loads of companies make load-balancing switches, but the prices are all over the map. Radware, Coyote Point, Zeus, Cisco and Juniper are some of the names. Perhaps some of these can be found at reasonable prices in your area, perhaps not. Firewalls are more commonly purchased as appliances, partly for ease of setup and partly to layer security responsibilities across multiple systems for resilience.

HTTP PERFORMANCE:
There is a small end-user performance benefit to spreading the HTML/CSS/image content over multiple host names, such as www.photo.net and static.photo.net, because the end user&#039;s browser then opens more parallel TCP connections. See http://yuiblog.com/blog/2006/11/28/performance-research-part-1/ for an introduction. I do not think this is worth optimizing for today, and would personally ignore it for now, but it is real.

MY CONCLUSION:
Load balancing is cheap and attainable today, even at 150 Mbps. The preferred setup really comes down to individual taste and past experience. Personally, for a new site with those requirements, I&#039;d first try Litespeed HTTPd, and if benchmarks show insufficient performance, then I&#039;d add more HTTP front ends with Whack-a-Mole IP fail-over. HAProxy on a dedicated load balancer PC also seems like a good, problem-free approach. Firewalling I would try to keep on a dedicated appliance, mostly so that I don&#039;t accidentally create security holes while mucking around with IP settings on the load balancers.</description>
		<content:encoded><![CDATA[<p>I didn&#8217;t see this post before now, but I hope it&#8217;s not too late to offer some good, free advice here. <img src='http://blogs.law.harvard.edu/philg/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>LOAD BALANCERS:<br />
My understanding is that Willy Tarreau&#8217;s HAProxy is the premier open-source HTTP load balancer today. I have not used it myself, but it has a good reputation and seems well designed (<a href="http://haproxy.1wt.eu/" rel="nofollow">http://haproxy.1wt.eu/</a>). Its author has written a great article that gives a solid overview of current best practices for load balancers, at <a href="http://1wt.eu/articles/2006_lb/" rel="nofollow">http://1wt.eu/articles/2006_lb/</a> .</p>
<p>I have used Pound myself, and it&#8217;s really nice (www.apsis.ch/pound/). Strong points are SSL support (not a requirement here) and ease of setup. I&#8217;m not sure if Pound is the optimal choice for a high-performance scenario, but then again I really don&#8217;t know how fast Pound can go. Consider staying a release or two behind with Pound; the newest versions sometimes have less well-tested functionality.</p>
<p>I have never used Danga&#8217;s Perlbal, but from what else I have seen from those guys I would have no reservations. Their site has a very good presentation about high-performance, dynamic-content web architectures (<a href="http://www.danga.com/words/" title="http://www.danga.com/words/" target="_blank">www.danga.com/words/</a>).</p>
<p>The new kid on the block will probably be Varnish Cache. It&#8217;s brand new and not quite polished, but its architecture will allow it to scale beyond all the other contenders, and I know its main programmer to be really sharp. It&#8217;s probably not ready for your use now, but it should mature within the next year (<a href="http://www.varnish-cache.org" title="http://www.varnish-cache.org" target="_blank">www.varnish-cache.org</a>; <a href="http://www.nuug.no/aktiviteter/20060919-varnish/" rel="nofollow">http://www.nuug.no/aktiviteter/20060919-varnish/</a>).</p>
<p>Looking towards the future, it seems to me that the trend is to move away from the classic HTTP load balancer. Today HTTP servers have load balancing built in, and can serve static content directly from the front-end server (the “load-balancer”), while balancing retrieval of dynamic content from back-end app servers via HTTP. Apache 2.2 can do this out of the box, I believe Lighttpd can too (<a href="http://www.lighttpd.net" title="http://www.lighttpd.net" target="_blank">www.lighttpd.net</a>), and I know that Litespeed Web Server 2.2 can (<a href="http://www.litespeedtech.com" title="http://www.litespeedtech.com" target="_blank">www.litespeedtech.com</a>). The Litespeed server has a really good rep in the Ruby on Rails community; Lighttpd has a strong rep for high speed but has been a bit brittle in use. The website of Mongrel HTTPd has good docs on configuring these servers (<a href="http://mongrel.rubyforge.org" rel="nofollow">http://mongrel.rubyforge.org</a> &gt; Documentation).</p>
<p>Performance is really a non-issue of sorts today. If your preferred load balancer / front-end HTTPd isn&#8217;t fast enough, then you just add more copies of it. Typically you&#8217;d go with two cloned LBs, DNS round-robin between the two, and IP-level fail-over. This setup is cheap, and gives you good protection in the event of traffic spikes OR hardware failure (but not both a traffic spike AND an LB failure). There are many IP-level fail-over systems (Linux HA, Linux Virtual Servers, VRRPd, etc.), but Whack-a-Mole seems to be a favorite despite its age (<a href="http://www.backhand.org" title="http://www.backhand.org" target="_blank">www.backhand.org</a>).<br />
Another thing is that your usage is atypical. Most LBs hit their bottleneck in the session creation phase, i.e. how many new TCP/HTTP sessions can be established per second. <a href="http://Photo.net" title="http://Photo.net" target="_blank">Photo.net</a> would create most of its load by sending large files in long-running TCP sessions. The internal design and buffering strategy of the balancer would matter much more than usual. I can&#8217;t say how this will work out, but my guess is that 150 Mbps in long-running TCP sessions is a piece of cake for a single contemporary server. Benchmarking your specific setup is really the only way to know.</p>
<p>APPLIANCES:<br />
Loads of companies make load-balancing switches, but the prices are all over the map. Radware, Coyote Point, Zeus, Cisco and Juniper are some of the names. Perhaps some of these can be found at reasonable prices in your area, perhaps not. Firewalls are more commonly purchased as appliances, partly for ease of setup and partly to layer security responsibilities across multiple systems for resilience.</p>
<p>HTTP PERFORMANCE:<br />
There is a small end-user performance benefit to spreading the HTML/CSS/image content over multiple host names, such as <a href="http://www.photo.net" rel="nofollow">http://www.photo.net</a> and <a href="http://static.photo.net" title="http://static.photo.net" target="_blank">static.photo.net</a>, because the end user&#8217;s browser then opens more parallel TCP connections. See <a href="http://yuiblog.com/blog/2006/11/28/performance-research-part-1/" rel="nofollow">http://yuiblog.com/blog/2006/11/28/performance-research-part-1/</a> for an introduction. I do not think this is worth optimizing for today, and would personally ignore it for now, but it is real.</p>
<p>MY CONCLUSION:<br />
Load balancing is cheap and attainable today, even at 150 Mbps. The preferred setup really comes down to individual taste and past experience. Personally, for a new site with those requirements, I&#8217;d first try Litespeed HTTPd, and if benchmarks show insufficient performance, then I&#8217;d add more HTTP front ends with Whack-a-Mole IP fail-over. HAProxy on a dedicated load balancer PC also seems like a good, problem-free approach. Firewalling I would try to keep on a dedicated appliance, mostly so that I don&#8217;t accidentally create security holes while mucking around with IP settings on the load balancers.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: matt</title>
		<link>http://blogs.law.harvard.edu/philg/2006/11/06/best-open-source-software-for-a-firewallload-balancer/comment-page-1/#comment-17793</link>
		<dc:creator>matt</dc:creator>
		<pubDate>Thu, 16 Nov 2006 13:09:43 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.law.harvard.edu/philg/2006/11/06/best-open-source-software-for-a-firewal#comment-17793</guid>
		<description>patrickg, I think the fact that LiveJournal is using it is pretty good evidence that Perl can handle a pretty large load.  The idea that
Perl/PHP/Ruby/&quot;insert dynamic/scripting language of choice&quot; can&#039;t
handle big loads and scale well is pretty well disproved at this point
(see LiveJournal, Digg, and the 37signals projects for examples of each
language&#039;s ability to scale).  It&#039;s not about the language you use, it&#039;s
about having the time, money, and knowledge to work with what you use
enough to understand how to make it scale well.</description>
		<content:encoded><![CDATA[<p>patrickg, I think the fact that LiveJournal is using it is pretty good evidence that Perl can handle a pretty large load.  The idea that<br />
Perl/PHP/Ruby/&#8220;insert dynamic/scripting language of choice&#8221; can&#8217;t<br />
handle big loads and scale well is pretty well disproved at this point<br />
(see LiveJournal, Digg, and the 37signals projects for examples of each<br />
language&#8217;s ability to scale).  It&#8217;s not about the language you use, it&#8217;s<br />
about having the time, money, and knowledge to work with what you use<br />
enough to understand how to make it scale well.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: patrickg</title>
		<link>http://blogs.law.harvard.edu/philg/2006/11/06/best-open-source-software-for-a-firewallload-balancer/comment-page-1/#comment-17133</link>
		<dc:creator>patrickg</dc:creator>
		<pubDate>Wed, 08 Nov 2006 23:13:39 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.law.harvard.edu/philg/2006/11/06/best-open-source-software-for-a-firewal#comment-17133</guid>
		<description>For #1, I strongly recommend OpenBSD plus its built-in packet filter (pf) for the firewalling.  It is much easier to understand and configure than iptables.

Anything with 2x gigabit NICs (they have better chipsets than 100 Mbit NICs) and a P3 processor will be able to handle your load quite easily.

For #2:

Get a P4 machine for better SSL encrypt/decrypt, and make the proxy machine handle SSL connections as well.  Then you can set a header value to let AOLserver know that the connection was SSL-secured.

For reverse proxying, you can use either Pound (www.apsis.ch/pound) or Apache with mod_proxy and possibly the mod_rewrite module.

Given the massive traffic photo.net sees, I doubt that a Perl-based program would be the best solution.  Everything I have mentioned is open source.

Some of the stuff could be done inside a switch, but it would be more expensive and not open source.</description>
		<content:encoded><![CDATA[<p>For #1, I strongly recommend OpenBSD plus its built-in packet filter (pf) for the firewalling.  It is much easier to understand and configure than iptables.</p>
<p>Anything with 2x gigabit NICs (they have better chipsets than 100 Mbit NICs) and a P3 processor will be able to handle your load quite easily.</p>
<p>For #2:</p>
<p>Get a P4 machine for better SSL encrypt/decrypt, and make the proxy machine handle SSL connections as well.  Then you can set a header value to let AOLserver know that the connection was SSL-secured.</p>
<p>For reverse proxying, you can use either Pound (www.apsis.ch/pound) or Apache with mod_proxy and possibly the mod_rewrite module.</p>
<p>Given the massive traffic <a href="http://photo.net" title="http://photo.net" target="_blank">photo.net</a> sees, I doubt that a Perl-based program would be the best solution.  Everything I have mentioned is open source.</p>
<p>Some of the stuff could be done inside a switch, but it would be more expensive and not open source.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Alex Campbell</title>
		<link>http://blogs.law.harvard.edu/philg/2006/11/06/best-open-source-software-for-a-firewallload-balancer/comment-page-1/#comment-17048</link>
		<dc:creator>Alex Campbell</dc:creator>
		<pubDate>Wed, 08 Nov 2006 07:50:58 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.law.harvard.edu/philg/2006/11/06/best-open-source-software-for-a-firewal#comment-17048</guid>
		<description>Just saw the question at the end of your requirements doc about a smart switch.

It is possible to engineer it so you have two switches and each server has an uplink to each switch.  This protects you against a switch failure but introduces additional complexity and things that can go wrong.  Unless you have a network geek around, this could end up causing more downtime than would have resulted from a single switch failure anyway.</description>
		<content:encoded><![CDATA[<p>Just saw the question at the end of your requirements doc about a smart switch.</p>
<p>It is possible to engineer it so you have two switches and each server has an uplink to each switch.  This protects you against a switch failure but introduces additional complexity and things that can go wrong.  Unless you have a network geek around, this could end up causing more downtime than would have resulted from a single switch failure anyway.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Alex Campbell</title>
		<link>http://blogs.law.harvard.edu/philg/2006/11/06/best-open-source-software-for-a-firewallload-balancer/comment-page-1/#comment-17047</link>
		<dc:creator>Alex Campbell</dc:creator>
		<pubDate>Wed, 08 Nov 2006 07:41:34 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.law.harvard.edu/philg/2006/11/06/best-open-source-software-for-a-firewal#comment-17047</guid>
		<description>A decent PC running Linux will be able to handle 150 Mbps fine.  It will probably be able to handle 3x that pretty comfortably, but in a commercial environment I wouldn&#039;t be prepared to try it. I understand photo.net&#039;s situation, though, and you should be able to find plenty of willing Linux geeks to help tune it.

I have used LEAF (Linux Embedded Application Firewall - http://leaf.sourceforge.net) for border routers and firewalls with excellent results.  It comes with Shorewall which makes complex iptables configurations easy and Quagga in case you need to do BGP.  There is a module for keepalived which I&#039;m pretty sure would allow you to load-balance your webservers.</description>
		<content:encoded><![CDATA[<p>A decent PC running Linux will be able to handle 150 Mbps fine.  It will probably be able to handle 3x that pretty comfortably, but in a commercial environment I wouldn&#8217;t be prepared to try it. I understand <a href="http://photo.net" title="http://photo.net" target="_blank">photo.net</a>&#8217;s situation, though, and you should be able to find plenty of willing Linux geeks to help tune it.</p>
<p>I have used LEAF (Linux Embedded Application Firewall &#8211; <a href="http://leaf.sourceforge.net" rel="nofollow">http://leaf.sourceforge.net</a>) for border routers and firewalls with excellent results.  It comes with Shorewall which makes complex iptables configurations easy and Quagga in case you need to do BGP.  There is a module for keepalived which I&#8217;m pretty sure would allow you to load-balance your webservers.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Frank Wiles</title>
		<link>http://blogs.law.harvard.edu/philg/2006/11/06/best-open-source-software-for-a-firewallload-balancer/comment-page-1/#comment-17004</link>
		<dc:creator>Frank Wiles</dc:creator>
		<pubDate>Tue, 07 Nov 2006 17:51:19 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.law.harvard.edu/philg/2006/11/06/best-open-source-software-for-a-firewal#comment-17004</guid>
		<description>Perlbal and iptables on one or more Linux servers is definitely the way to go here. Both are OSS, Perlbal scales very well, and it&#039;s easy to write plugins to handle any site-specific situations you might have.  I really can&#039;t recommend it enough.</description>
		<content:encoded><![CDATA[<p>Perlbal and iptables on one or more Linux servers is definitely the way to go here. Both are OSS, Perlbal scales very well, and it&#8217;s easy to write plugins to handle any site-specific situations you might have.  I really can&#8217;t recommend it enough.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: philg</title>
		<link>http://blogs.law.harvard.edu/philg/2006/11/06/best-open-source-software-for-a-firewallload-balancer/comment-page-1/#comment-16995</link>
		<dc:creator>philg</dc:creator>
		<pubDate>Tue, 07 Nov 2006 14:04:52 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.law.harvard.edu/philg/2006/11/06/best-open-source-software-for-a-firewal#comment-16995</guid>
		<description>Mike:  A lot of the photo.net services require a reader to be logged in, which means that they already have a cookie.

Fazal: I looked at that Cisco load balancer. It isn&#039;t clear that patterns of the kind that I&#039;m talking about can be implemented.  What&#039;s worse, it doesn&#039;t look as though these things provide any firewall action.  So we would be introducing an extra point of failure.</description>
		<content:encoded><![CDATA[<p>Mike:  A lot of the <a href="http://photo.net" title="http://photo.net" target="_blank">photo.net</a> services require a reader to be logged in, which means that they already have a cookie.</p>
<p>Fazal: I looked at that Cisco load balancer. It isn&#8217;t clear that patterns of the kind that I&#8217;m talking about can be implemented.  What&#8217;s worse, it doesn&#8217;t look as though these things provide any firewall action.  So we would be introducing an extra point of failure.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Mike Scott</title>
		<link>http://blogs.law.harvard.edu/philg/2006/11/06/best-open-source-software-for-a-firewallload-balancer/comment-page-1/#comment-16976</link>
		<dc:creator>Mike Scott</dc:creator>
		<pubDate>Tue, 07 Nov 2006 08:18:03 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.law.harvard.edu/philg/2006/11/06/best-open-source-software-for-a-firewal#comment-16976</guid>
		<description>There&#039;s no way to guarantee session persistence without making some application changes. Your problem is users with cookies disabled coming via a megaproxy so that their IP address changes for each request -- you simply have nothing to track them by unless you embed some session information in the URL for every internal link on the site.</description>
		<content:encoded><![CDATA[<p>There&#8217;s no way to guarantee session persistence without making some application changes. Your problem is users with cookies disabled coming via a megaproxy so that their IP address changes for each request &#8212; you simply have nothing to track them by unless you embed some session information in the URL for every internal link on the site.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Fazal Majid</title>
		<link>http://blogs.law.harvard.edu/philg/2006/11/06/best-open-source-software-for-a-firewallload-balancer/comment-page-1/#comment-16969</link>
		<dc:creator>Fazal Majid</dc:creator>
		<pubDate>Tue, 07 Nov 2006 04:23:05 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.law.harvard.edu/philg/2006/11/06/best-open-source-software-for-a-firewal#comment-16969</guid>
		<description>A redundant pair of Cisco/Arrowpoint CSS11501 load balancers costs $15K (with SSL acceleration, it&#039;s double, but you probably don&#039;t need hardware accelerated SSL). At that price (keep in mind Cisco isn&#039;t known to be the low-price leader), it doesn&#039;t make much sense to putz around with a software solution when hardware is more scalable and reliable.</description>
		<content:encoded><![CDATA[<p>A redundant pair of Cisco/Arrowpoint CSS11501 load balancers costs $15K (with SSL acceleration, it&#8217;s double, but you probably don&#8217;t need hardware accelerated SSL). At that price (keep in mind Cisco isn&#8217;t known to be the low-price leader), it doesn&#8217;t make much sense to putz around with a software solution when hardware is more scalable and reliable.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Demitrious S. Kelly</title>
		<link>http://blogs.law.harvard.edu/philg/2006/11/06/best-open-source-software-for-a-firewallload-balancer/comment-page-1/#comment-16960</link>
		<dc:creator>Demitrious S. Kelly</dc:creator>
		<pubDate>Tue, 07 Nov 2006 02:25:14 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.law.harvard.edu/philg/2006/11/06/best-open-source-software-for-a-firewal#comment-16960</guid>
		<description>Let&#039;s break the problem into its distinct parts and see if we can solve them in a very *nix-style fashion, by chaining solutions together (hey... you want low-cost OSS... gotta work the system!).

The first problem is an IP firewall.  There&#039;s really very little to say here: iptables comes with Linux and works remarkably well.  Mix this with the right network architecture, and you&#039;re in business.

The second problem is persistent load balancing.  You want to be able to define services based on protocol (port 80/443 to the web servers, 25 to the mail servers, etc.).  And, preferably, if a user connects to web server #3 and then makes another request within X seconds, the second request is also routed to web server #3.  The LVS project (see http://www.linuxvirtualserver.org) actually manages this quite well, and cheaply, including persistence (or affinity in Cisco-ese).  LVS also supports a couple of different connect and passthrough models for different requirements.

The third problem is that of load balancing your content (thumbnails on web cluster 2 versus original photos on web cluster 1, for example).  Your request was to be able to do that via the URL.  This is where the answer gets a bit more ambiguous.  You can use a reverse-proxying setup (Perlbal or Pound, for example) if you&#039;re still interested in a URL-based forwarding pattern.  But if you were willing to dedicate an IP address you could use DNS-based load balancing.  Take the following setup, for example.

A request for server.com comes into the load balancer; server.com is a.b.c.d.  The load balancer knows that the web1, web2, web3 and web4 servers on the LAN are able to handle web requests for the IP address a.b.c.d.  The load balancer hands the request off to web2.  Further requests from the same client IP address in the next 180 seconds also get directed to web2 (because of persistence).  A thumbnail appears on the page at thumbs.server.com, which has a different IP address, a.b.c.e.  The load balancer knows that web5 and web6 are set to be the real servers for a.b.c.e, and the request is sent off to web6.  Because persistence was not configured for a.b.c.e like it was for a.b.c.d, the next request might be sent to web5, since all requests are round-robin load balanced for that IP address.  Any time a web server goes down, mon (or another monitoring daemon) running on the load balancer notices and takes it out of rotation, and future requests are routed around the down web server until it is brought back up.  Heartbeat detects if one of the LVS load balancers goes down and has the other assume its position.
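
The scheduling part of that walkthrough can be sketched in a few lines of Python. This is only an illustration of round-robin with optional source-IP persistence, not how LVS is implemented: the class name and client address are hypothetical, the persistence window is simplified to a fixed expiry from the first request, and the health checking (mon) and balancer fail-over (Heartbeat) are left out.

```python
import itertools


class VirtualService:
    """One virtual IP with a pool of real servers, served round-robin."""

    def __init__(self, real_servers, persistence_seconds=0):
        self._cycle = itertools.cycle(real_servers)
        self.persistence_seconds = persistence_seconds
        self._sticky = {}  # client IP -> (real server, expiry time)

    def pick(self, client_ip, now):
        """Return the real server for a request arriving at time `now` (seconds)."""
        entry = self._sticky.get(client_ip)
        if entry is not None and entry[1] > now:
            return entry[0]  # persistence entry still valid: same server
        server = next(self._cycle)
        if self.persistence_seconds > 0:
            self._sticky[client_ip] = (server, now + self.persistence_seconds)
        return server


# a.b.c.d: four web servers with 180 seconds of persistence.
web = VirtualService(["web1", "web2", "web3", "web4"], persistence_seconds=180)
# a.b.c.e: two thumbnail servers, plain round-robin.
thumbs = VirtualService(["web5", "web6"])

print(web.pick("10.0.0.7", now=0), web.pick("10.0.0.7", now=60))
print(thumbs.pick("10.0.0.7", now=0), thumbs.pick("10.0.0.7", now=1))
```

The same client stays on one web server inside the persistence window, while its thumbnail requests alternate between the two thumbnail servers.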

In the above scenario you don&#039;t really get to use the URL as part of the balancing mechanism, but you do get to direct based on content type. 

You could also insert a reverse-proxying solution (or cluster) into the chain and have the best of all available worlds by load balancing, in a highly available fashion, to the proxy servers:

(Request) -&gt; [ 2 LVS machines (public IPs) ] -&gt; [ 4 Reverse Proxies (priv) ] -&gt; [ X &quot;real&quot; web servers (priv) ]
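
The reverse-proxy tier in that chain is where URL-based forwarding would live. As a rough Python sketch of the idea, assuming made-up URL prefixes and reusing the server names from the example above (this is not Perlbal or Pound configuration):

```python
# Hypothetical URL prefixes and pools, checked in order.
ROUTES = [
    ("/thumbs/", ["web5", "web6"]),
    ("/", ["web1", "web2", "web3", "web4"]),
]


def pool_for(path):
    """Return the backend pool for the first prefix that matches the path."""
    for prefix, pool in ROUTES:
        if path.startswith(prefix):
            return pool
    raise LookupError("no route for " + path)


print(pool_for("/thumbs/123.jpg"))
print(pool_for("/photo/123"))
```

The more specific prefix has to come first, because the catch-all "/" entry matches every path.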

It&#039;s not as elegant as a set of $50k hardware load balancers, I&#039;ll grant you, but it&#039;s workable and a LOT cheaper.  And I would definitely recommend that you stop looking for an all-in-one solution to the problem in OSS (if you are) and start thinking in terms of chaining the proper functionality together.</description>
		<content:encoded><![CDATA[<p>Let&#8217;s break the problem into its distinct parts and see if we can solve them in a very *nix-style fashion, by chaining solutions together (hey&#8230; you want low-cost OSS&#8230; gotta work the system!).</p>
<p>The first problem is an IP firewall.  There&#8217;s really very little to say here: iptables comes with Linux and works remarkably well.  Mix this with the right network architecture, and you&#8217;re in business.</p>
<p>The second problem is persistent load balancing.  You want to be able to define services based on protocol (port 80/443 to the web servers, 25 to the mail servers, etc.).  And, preferably, if a user connects to web server #3 and then makes another request within X seconds, the second request is also routed to web server #3.  The LVS project (see <a href="http://www.linuxvirtualserver.org" rel="nofollow">http://www.linuxvirtualserver.org</a>) actually manages this quite well, and cheaply, including persistence (or affinity in Cisco-ese).  LVS also supports a couple of different connect and passthrough models for different requirements.</p>
<p>The third problem is that of load balancing your content (thumbnails on web cluster 2 versus original photos on web cluster 1, for example).  Your request was to be able to do that via the URL.  This is where the answer gets a bit more ambiguous.  You can use a reverse-proxying setup (Perlbal or Pound, for example) if you&#8217;re still interested in a URL-based forwarding pattern.  But if you were willing to dedicate an IP address you could use DNS-based load balancing.  Take the following setup, for example.</p>
<p>A request for <a href="http://server.com" title="http://server.com" target="_blank">server.com</a> comes into the load balancer; <a href="http://server.com" title="http://server.com" target="_blank">server.com</a> is a.b.c.d.  The load balancer knows that the web1, web2, web3 and web4 servers on the LAN are able to handle web requests for the IP address a.b.c.d.  The load balancer hands the request off to web2.  Further requests from the same client IP address in the next 180 seconds also get directed to web2 (because of persistence).  A thumbnail appears on the page at <a href="http://thumbs.server.com" title="http://thumbs.server.com" target="_blank">thumbs.server.com</a>, which has a different IP address, a.b.c.e.  The load balancer knows that web5 and web6 are set to be the real servers for a.b.c.e, and the request is sent off to web6.  Because persistence was not configured for a.b.c.e like it was for a.b.c.d, the next request might be sent to web5, since all requests are round-robin load balanced for that IP address.  Any time a web server goes down, mon (or another monitoring daemon) running on the load balancer notices and takes it out of rotation, and future requests are routed around the down web server until it is brought back up.  Heartbeat detects if one of the LVS load balancers goes down and has the other assume its position.</p>
<p>In the above scenario you don&#8217;t really get to use the URL as part of the balancing mechanism, but you do get to direct based on content type. </p>
<p>You could also insert a reverse-proxying solution (or cluster) into the chain and have the best of all available worlds by load balancing, in a highly available fashion, to the proxy servers:</p>
<p>(Request) -&gt; [ 2 LVS machines (public IPs) ] -&gt; [ 4 Reverse Proxies (priv) ] -&gt; [ X "real" web servers (priv) ]</p>
<p>It&#8217;s not as elegant as a set of $50k hardware load balancers, I&#8217;ll grant you, but it&#8217;s workable and a LOT cheaper.  And I would definitely recommend that you stop looking for an all-in-one solution to the problem in OSS (if you are) and start thinking in terms of chaining the proper functionality together.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
