Short Description

If you're interested in building one of these systems, there are 5 basic steps we'll need to perform:

1. Get a single server system working perfectly for the production environment and pre-configure this system for cloning.

As we configure the ETS, we'll also optionally put a Disaster Recovery Procedure in place that will allow you to restore a primary or secondary server that fails. The extent of what you do here will depend on how much downtime you can tolerate versus how much money you want to spend. At minimum, you can use a floppy to save the unique settings that distinguish the primary and secondary. Better yet is to clone another hard drive from the primary, or to clone both the primary and secondary. If downtime needs to be kept to an absolute minimum, you can also create a complete backup server that can be swapped in place of a failed machine. Putting together a recovery system is easiest while we're configuring the servers. The points at which to do this are marked BRD, for Backup and Recovery Data.

This procedure may look elaborate but it's really quite simple and goes pretty quickly... after the first time you do it. :o) The most time-consuming parts are getting the primary server to perform the way you want and duplicating the primary hard drive. If you've got an existing LTSP installation that's working to your satisfaction, most of the real work is already done! Starting with a working primary and cloned secondary drive, the cluster integration process has never taken me more than half an hour to complete.
Optional Backup and Recovery Preparation

If you're in a hurry and don't want to create a backup system, please skip down to Step 1. Otherwise, let's prepare for saving Backup and Recovery Data, or "BRD". It will be much easier to do this as we go rather than after we've finished the install. Here are the steps:

1. Prepare a floppy (or suitable alternative) for backing up your configuration data as you go.
2. Create a "primary" and "secondary" subdirectory on this backup.
3. Before you edit a file for the secondary configuration, save the original to the primary backup subdirectory.
4. After you make and verify your changes on the secondary, save the modified file to the secondary backup subdirectory.

When you're done, this backup will contain the configuration differences between the primary and secondary drives. If you have a copy of the primary, you can easily create the secondary from it and vice versa. Once the backup media has been initialized, you're ready to begin configuring the servers!
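The four BRD steps above could be scripted roughly as follows. This is only a sketch under stated assumptions: the mount point `/mnt/floppy` and the helper names `brd_init` and `brd_save` are hypothetical, not part of any package mentioned here.

```shell
#!/bin/sh
# Sketch of the BRD routine. BRD_ROOT is a hypothetical mount point for the
# floppy (or alternative media); point it at wherever your backup is mounted.
BRD_ROOT="${BRD_ROOT:-/mnt/floppy}"

# Step 2: create the "primary" and "secondary" subdirectories on the backup.
brd_init() {
    mkdir -p "$BRD_ROOT/primary" "$BRD_ROOT/secondary"
}

# Steps 3 and 4: brd_save ROLE FILE copies FILE into the ROLE subdirectory.
# The file's directory path is recreated under ROLE so that files with the
# same name from different directories do not overwrite each other.
brd_save() {
    role="$1"
    file="$2"
    case "$role" in
        primary|secondary) ;;
        *) echo "usage: brd_save primary|secondary FILE" >&2; return 1 ;;
    esac
    dest="$BRD_ROOT/$role/$(dirname "$file")"
    mkdir -p "$dest" && cp -p "$file" "$dest/"
}
```

A typical round trip would be `brd_init`, then `brd_save primary FILE` before editing a file on the secondary, and `brd_save secondary FILE` once the change has been verified.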
Step 1 - Pre-configuring the Primary (base) System

To save work and guarantee system symmetry, we're going to pre-configure the primary with all the files that we'll need to have on both servers. The only package we're going to install (if you're not running MDK9.0, RH8.0 or Debian 3.0) is the latest version of dhcpd, which must be version 3.0.1 rc9. Other runtime options should also be installed and configured on the primary before starting the build of the secondary system. These could include installing an openMosix kernel and userland tools, mapping Samba shares, configuring Winbind, printers, etc.

Both servers will also need to have synchronized clocks, so it is highly recommended to install ntpd and sync to a reliable time server. If the servers don't have access to an external time server, sync the secondary clock to the primary.

Be critical and demanding when it comes to the configuration and performance of the primary. Ideally, run the box in a simulated production environment for a while and make sure it's rock solid. When you're confident you've got a complete and fully operational primary, continue with the following steps:

1. If needed, install/update the primary DHCP package to 3.0.1 rc9. Make sure this version is running after a reboot.
6. Test the dhcpd.conf configuration... make sure it's working. Don't forget to restart dhcpd after making dhcpd.conf changes.
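Before restarting dhcpd, a quick syntax check can catch typos in dhcpd.conf. The sketch below relies on the `-t` (test configuration) and `-cf` (configuration file) options of ISC dhcpd. The path `/etc/dhcpd.conf`, the `DHCPD_BIN` variable and the `check_dhcpd_conf` helper are assumptions for illustration. A clean syntax check only shows that the file parses; you still need to boot a workstation to confirm the configuration really works.

```shell
#!/bin/sh
# DHCPD_BIN is a hypothetical override for the dhcpd binary's name or path.
DHCPD_BIN="${DHCPD_BIN:-dhcpd}"

# check_dhcpd_conf [FILE]: parse FILE without serving any leases.
# The default path /etc/dhcpd.conf is an assumption; distributions differ.
check_dhcpd_conf() {
    conf="${1:-/etc/dhcpd.conf}"
    if "$DHCPD_BIN" -t -cf "$conf"; then
        echo "syntax OK: $conf"
    else
        echo "syntax check FAILED: $conf (do not restart dhcpd yet)" >&2
        return 1
    fi
}
```

If the check passes, restart dhcpd using your distribution's own init script or service command.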
Step 2 - Duplicating the Primary and Building the Secondary System

For the secondary server, try to select hardware that is roughly equivalent to the primary in terms of memory and CPU performance. If you're stuck with something that isn't that close, we'll discuss load balancing techniques later that will help you to compensate for the differences. Size the secondary as if it were going to handle the entire load... because if the primary fails, that's exactly what it will attempt to do. Here's the process for building the secondary:

1. Use your favorite method to copy the primary drive's system image to the secondary hard drive.

At this point, we should have two operational and functionally identical servers. Please don't proceed until you've verified that the secondary can drive the workstations and support all the options that you've configured on the primary. Once you're sure that you've got a genuine clone, you can start the process of configuring them for their individual roles in the fail-over and load balancing pair.
Step 3 - Configuring the Primary

The primary server is already set up and is the easiest to configure, so we'll do it first. I usually set my LTSP workstations up on a 172.16.0.0/16 network. Make changes to the following files to accommodate your network and IP address ranges.

1. Copy dhcpd.conf to dhcpd.conf.singleserver. You'll need this file if you want to quickly return to a single server configuration.
3. Set "address" to the IP of your Primary and "peer address" to the IP of your Secondary.
5. If you've got a firewall set up on your server, make sure to open the LTSP interface up to TCP traffic on "port" and "peer port".
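To make the "address", "peer address", "port" and "peer port" settings concrete, here is a small sketch that prints a failover declaration for dhcpd.conf. Everything specific in it is an assumption: the peer name `ltsp-failover`, the host addresses on the 172.16.0.0/16 network, the ports 519 and 520, and the timing values, which are borrowed from the example in the ISC dhcpd.conf manual page rather than from this procedure. In ISC dhcpd, `address` is the machine the file lives on, and `mclt` and `split` belong on the primary only.

```shell
#!/bin/sh
# failover_stanza ROLE LOCAL_IP PEER_IP [LOCAL_PORT] [PEER_PORT]
# Prints a failover declaration for the machine whose own IP is LOCAL_IP.
# The peer name and the timing values are illustrative assumptions.
failover_stanza() {
    role="$1"; local_ip="$2"; peer_ip="$3"
    local_port="${4:-519}"; peer_port="${5:-520}"
    case "$role" in
        primary|secondary) ;;
        *) echo "role must be primary or secondary" >&2; return 1 ;;
    esac
    printf 'failover peer "%s" {\n' "ltsp-failover"
    printf '  %s;\n' "$role"
    printf '  address %s;\n' "$local_ip"       # this server
    printf '  port %s;\n' "$local_port"        # TCP port this server listens on
    printf '  peer address %s;\n' "$peer_ip"   # the other server
    printf '  peer port %s;\n' "$peer_port"    # TCP port the other server listens on
    printf '  max-response-delay 60;\n'
    printf '  max-unacked-updates 10;\n'
    if [ "$role" = "primary" ]; then
        printf '  mclt 3600;\n'                # primary only
        printf '  split 128;\n'                # primary only; load balancing ratio
    fi
    printf '  load balance max seconds 3;\n'
    printf '}\n'
}
```

For example, `failover_stanza primary 172.16.0.1 172.16.0.2` prints a declaration you could compare against your own dhcpd.conf. Whatever ports you settle on are the ones the firewall must pass as TCP traffic on the LTSP interface.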
Step 4 - Configuring the Secondary

The secondary is just as simple to configure for fail-over and load balancing, but it's more work overall because you have to provide it with unique IP addresses that differ from the primary... then reconfigure LTSP and any other network dependent services. This involves changing quite a few files and plenty of testing.

1. We're about to fork the configuration of the two boxes. The primary should be powered down for this step.
/etc/ltsp/i386/etc/lts.conf
5. Thoroughly test to make sure that the new secondary works with its new network settings. Keep the primary offline.
8. Set "address" to the IP of your Secondary and "peer address" to the IP of your Primary.
10. Save changes for dhcpd.conf to the BRD media.
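Because the secondary starts life as a clone, it is easy to miss a file that still carries the primary's address. The sketch below lists files that mention a given IP so you can review them one by one. The helper name `stale_refs` and the example address are hypothetical, and `grep -r` assumes GNU grep or a compatible version. Not every hit is a mistake: the failover section of dhcpd.conf legitimately names the other server.

```shell
#!/bin/sh
# stale_refs OLD_IP DIR...: list files under DIR that still mention OLD_IP.
#   -r  recurse into directories
#   -l  print only the names of matching files
#   -F  treat the address as a literal string (dots are not wildcards)
#   -w  match whole words, so 172.16.0.1 does not match 172.16.0.10
stale_refs() {
    old_ip="$1"; shift
    grep -rlFw -- "$old_ip" "$@" 2>/dev/null
}
```

For instance, `stale_refs 172.16.0.1 /etc` (assuming 172.16.0.1 was the primary's address) prints each file under /etc worth a second look. The function returns non-zero when nothing matches.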
Step 5 - System Integration and Test

At this point, we've got two boxes with different unique IP addresses that function identically. If you save a copy of dhcpd.conf and copy dhcpd.conf.singleserver to dhcpd.conf, each box will function as a standalone server... but at a different IP address. We've configured the two DHCP servers to listen to one another and take over in the event the other one doesn't talk or respond. We've also set up DHCP load balancing and a pool of IP addresses for the two servers to share. Now it's time to put it all together!

1. Hook everything up... network cables, switches, hubs, keyboards etc.

Aside from some fine tuning, you're done... enjoy your ETS cluster! The ratio of workstation load balancing is adjustable if the performance of your two servers isn't symmetrical. See Appendix B for details.
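Dropping a box back to standalone operation, as described at the start of this step, amounts to swapping two files. Here is a minimal sketch, assuming the files live in `/etc` and using a hypothetical `dhcpd.conf.failover` name for the saved cluster configuration. Only `dhcpd.conf` and `dhcpd.conf.singleserver` come from the procedure itself.

```shell
#!/bin/sh
# CONF_DIR is an assumption; use the directory that holds your dhcpd.conf.
CONF_DIR="${CONF_DIR:-/etc}"

# Save the cluster configuration, then activate the single-server one.
to_single_server() {
    live="$CONF_DIR/dhcpd.conf"
    single="$CONF_DIR/dhcpd.conf.singleserver"
    # Already running the single-server file: do nothing, so the saved
    # failover copy is not overwritten by a second call.
    if cmp -s "$live" "$single"; then
        return 0
    fi
    cp -p "$live" "$CONF_DIR/dhcpd.conf.failover" && cp -p "$single" "$live"
}

# Put the saved cluster configuration back in place.
to_failover() {
    cp -p "$CONF_DIR/dhcpd.conf.failover" "$CONF_DIR/dhcpd.conf"
}
```

Either way, dhcpd has to be restarted afterwards for the change to take effect.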