Embedded Terminal Server HOWTO

Copyright © 2002 by Tom Lisjac.
E-Mail: vlx at users.sourceforge.net

Configuration Instructions

Short Description

If you're interested in building one of these systems, there are 5 basic steps we'll need to perform: 
1. Get a single server system working perfectly for the production environment and pre-configure this system for cloning.
2. Duplicate the image of the tested and pre-configured system to another machine of equivalent performance.
3. Configure the Primary system for DHCP peering.
4. Configure the Secondary for DHCP peering, assign it unique IPs and make LTSP and all services work with the new settings.
5. Integrate the two systems and test the ETS cluster for load balancing, failover, openMosix and installed option support.
As we configure the ETS, we'll also optionally put a Disaster Recovery Procedure in place that will allow you to restore a primary or secondary server that fails. The extent of what you do here will depend on how much downtime you can tolerate versus how much money you want to spend. At minimum, you can use a floppy to save the unique settings that distinguish the primary and secondary. Better yet, clone another hard drive from the primary... or clone both the primary and secondary. If downtime needs to be kept to an absolute minimum, you can also create a complete backup server that can be swapped in place of a failed machine. Putting together a recovery system is easiest while we're configuring the servers. The points at which to do this are marked BRD, for Backup and Recovery Data.

This procedure may look elaborate but it's really quite simple and goes pretty quickly... after the first time you do it. :o) The most time consuming parts are getting the primary server to perform the way you want and duplicating the primary hard drive. If you've got an existing LTSP installation that's working to your satisfaction, most of the real work is already done! Starting with a working primary and cloned secondary drive, the cluster integration process has never taken me more than half an hour to complete.

Optional Backup and Recovery Preparation


If you're in a hurry and don't want to create a backup system, please skip down to Step 1. Otherwise, let's prepare for saving Backup and Recovery Data or "BRD". It will be much easier to do this as we go rather than after we've finished the install. Here are the steps:

        1. Prepare a floppy (or suitable alternative) for backing up your configuration data as you go.
        2. Create a "primary" and "secondary" subdirectory on this backup.
        3. Before you edit a file for the secondary configuration, save the original to the primary backup subdirectory.
        4. After you make and verify your changes on the secondary, save the modified file to the secondary backup subdirectory.
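
The four steps above can be sketched as a small shell helper. This is only an illustration: the function name brd_save and the BRD_ROOT mount point (/mnt/floppy here) are assumptions, not part of the original procedure, so adjust them to wherever your backup media is mounted.

```shell
# Assumed mount point of the backup media; override before calling if needed.
BRD_ROOT=${BRD_ROOT:-/mnt/floppy}

# brd_save ROLE FILE  (hypothetical helper)
# ROLE is "primary" or "secondary"; FILE is the configuration file to save.
brd_save() {
    role=$1
    file=$2
    case $role in
        primary|secondary) ;;
        *) echo "usage: brd_save primary|secondary FILE" >&2; return 1 ;;
    esac
    # Keep the file's directory path under the role directory so that
    # same-named files from different directories do not collide.
    dest=$BRD_ROOT/$role/$(dirname "$file")
    mkdir -p "$dest" && cp "$file" "$dest/"
}
```

Before editing a file on the secondary you would run something like "brd_save primary /etc/dhcpd.conf", and after verifying your changes, "brd_save secondary /etc/dhcpd.conf". Running "diff -r" on the two subdirectories then lists the configuration differences.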


When you're done, this backup will contain the configuration differences between the primary and secondary drives. If you have a copy of the primary, you can easily create the secondary from it and vice versa. Once the backup media has been initialized, you're ready to begin configuring the servers!


Creating the Dual Servers

Step 1 - Pre-configuring the Primary (base) System

To save work and guarantee system symmetry, we're going to pre-configure the primary with all the files that we'll need to have on both servers. The only package we're going to install (if you're not running MDK9.0, RH8.0 or Debian 3.0) is the latest version of dhcpd which must be version 3.0.1 rc9. Other runtime options should also be installed and configured on the primary before starting the build of the secondary system. These could include installing an openMosix kernel and userland tools, mapping Samba shares, configuring Winbind, printers, etc. Both servers will also need to have synchronized clocks so it is highly recommended to install ntpd and sync to a reliable time server. If the servers don't have access to an external time server, sync the secondary clock to the primary.
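
If you'd like a quick sanity check on clock agreement, here is a minimal sketch. The function name clock_skew_ok and the tolerance you pass in are assumptions for illustration; the original procedure does not state how much skew is acceptable.

```shell
# clock_skew_ok LOCAL_EPOCH REMOTE_EPOCH MAX_SECONDS  (hypothetical helper)
# Succeeds when the two timestamps differ by no more than MAX_SECONDS.
clock_skew_ok() {
    diff=$(( $1 - $2 ))
    # Take the absolute value of the difference.
    [ "$diff" -lt 0 ] && diff=$(( 0 - diff ))
    [ "$diff" -le "$3" ]
}
```

One way to feed it, assuming ssh access to a host named "secondary" and a date command that supports +%s, would be: clock_skew_ok "$(date +%s)" "$(ssh secondary date +%s)" 2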

Be critical and demanding when it comes to the configuration and performance of the primary. Ideally, run the box in a simulated production environment for a while and make sure it's rock solid. When you're confident you've got a complete and fully operational primary, continue with the following steps:

1. If needed, install/update the primary DHCP package to 3.0.1 rc9. Make sure this version is running after a reboot.
2. Back up your existing /etc/dhcpd.conf to dhcpd.conf.orig and copy the sample peering dhcpd.conf to /etc.
3. Using dhcpd.conf.orig for a reference, begin customizing the new dhcpd.conf for your existing system.
4. Leave the commented out failover declarations in place... we'll use these later (i.e., #failover peer "ltsp" {... )
5. Set up a lease pool and include a commented out reference to the previously declared "ltsp" peer:
pool {
# failover peer "ltsp";
max-lease-time 7200;
deny dynamic bootp clients;
range 172.16.0.25 172.16.0.200;
}
6. Test the dhcpd.conf configuration... make sure it's working. Don't forget to restart dhcpd after making dhcpd.conf changes.
7. Complete testing and configuration of the single server primary system. Verify that the server is still fully functional and running the terminals.
8. Verify that your ntp time sync mechanism is working. Both servers need to be time sync'ed for dhcpd fail-over to work.
9. Halt the system and power down the primary server.
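
For step 6, one way to avoid restarting dhcpd with a broken configuration is to let the daemon parse the file first. The sketch below assumes your dhcpd build accepts the -t (test configuration) and -cf (configuration file) options and that your distribution's init script lives at /etc/init.d/dhcpd; both the script path and the function names are assumptions, so check them on your system.

```shell
# Binary and init script are assumptions; override these for your distribution.
DHCPD=${DHCPD:-dhcpd}
DHCPD_INIT=${DHCPD_INIT:-/etc/init.d/dhcpd}

# check_dhcpd_conf [FILE]
# Parses FILE without starting the server; a non-zero exit means it was rejected.
check_dhcpd_conf() {
    "$DHCPD" -t -cf "${1:-/etc/dhcpd.conf}"
}

# reload_dhcpd [FILE]
# Restarts dhcpd only when the configuration parses cleanly.
reload_dhcpd() {
    if check_dhcpd_conf "$1"; then
        "$DHCPD_INIT" restart
    else
        echo "dhcpd.conf has errors; not restarting" >&2
        return 1
    fi
}
```

For the time sync check in step 8, "ntpq -p" lists the peers that ntpd is currently using.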

Step 2 - Duplicating the Primary and Building the Secondary System

For the secondary server, try to select hardware that is roughly equivalent to the primary in terms of memory and CPU performance. If you're stuck with something that isn't that close, we'll discuss load balancing techniques later that will help you to compensate for the differences. Size the secondary as if it were going to handle the entire load... because if the primary fails, that's exactly what it will attempt to do. Here's the process for building the secondary:

1. Use your favorite method to copy the primary drive's system image to the secondary hard drive.
    (as in "dd if=/dev/hda of=/dev/hdb bs=8192k" where hda=primary and hdb=secondary ide devices)

2. Install the cloned drive in the secondary server. Due to potential IP conflicts, make sure the primary is powered down.
3. Plug in the network connections and bring up the new secondary system.
4. If the secondary hardware is different than the primary, the hardware configuration may fork at this point. Prepare to back up!
5. Adjust drivers and other hardware configuration issues. Save any changes before and after to the BRD media.
6. Verify that the secondary server is functionally identical to the primary and working perfectly!
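
As a sketch of step 1, the dd copy can be followed by a byte-for-byte comparison so you know the clone is good before you move the drive. The function name clone_and_verify is hypothetical. It assumes the source and destination are the same size (cmp will report a mismatch if the destination is larger) and that the source is not mounted read-write while it runs. Be careful: dd overwrites the destination without asking.

```shell
# clone_and_verify SRC DST  (hypothetical helper)
# Copies SRC to DST with dd, then compares the two byte for byte.
clone_and_verify() {
    src=$1
    dst=$2
    dd if="$src" of="$dst" bs=8192k || return 1
    # Flush buffered writes before reading the copy back.
    sync
    cmp "$src" "$dst"
}
```

With the device names from the example above, this would be called as "clone_and_verify /dev/hda /dev/hdb".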

At this point, we should have two operational and functionally identical servers. Please don't proceed until you've verified that the secondary can drive the workstations and support all the options that you've configured on the primary. Once you're sure that you've got a genuine clone, you can start the process of configuring them for their individual roles in the fail-over and load balancing pair.

Step 3 - Configuring the Primary

The primary server is already set up and is the easiest to configure so we'll do it first. I usually set my LTSP workstations up on a 172.16.0.0/16 network. Make changes to the following files to accommodate your network and IP address ranges.

1. Copy dhcpd.conf to dhcpd.conf.singleserver. You'll need this file if you want to quickly return to a single server configuration.
2. In /etc/dhcpd.conf, remove the comment (#) from the following lines: (leave split 128; commented out)
#failover peer "ltsp" {
# primary;
# address 172.16.0.1;
# port 519;
# peer address 172.16.0.2;
# peer port 520;
# mclt 3600;
# max-response-delay 30;
# max-unacked-updates 10;
# load balance max seconds 3;
## split 128;
# hba ff:ff:ff:ff:ff:ff:ff:ff:ff:ff:ff:ff:ff:ff:ff:ff:
# 00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00;
#}
3. Set "address" to the IP of your Primary and "peer address" to the IP of your Secondary.
4. Further down in /etc/dhcpd.conf, remove the comment (#) from the failover peer line in the pool declaration:
pool {
# failover peer "ltsp";
max-lease-time 7200;
deny dynamic bootp clients;
range 172.16.0.25 172.16.0.200;
}
5. If you've got a firewall set up on your server, make sure to open the LTSP interface up to TCP traffic on "port" and "peer port".
6. If you've installed openMosix, configure /etc/mosix.map. If there's a firewall, open the appropriate ports for openMosix.
7. Don't test yet.... you're done with the Primary for the moment. Halt the system and power the machine down.
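
If you'd rather not remove the comment markers by hand, steps 2 and 4 can be sketched with sed. This assumes the file is laid out exactly as shown above: the declaration starts with "#failover peer" in the first column, ends with "#}", and the pool reference is a commented "failover peer" line ending in a semicolon. The function name enable_failover is hypothetical. It strips one leading "#" per line, so "## split 128;" comes out as "# split 128;" and stays commented.

```shell
# enable_failover FILE  (hypothetical helper)
# Prints FILE with the failover declaration and the pool reference uncommented.
enable_failover() {
    # First expression: strip one "#" (and one following space) inside the declaration.
    # Second expression: uncomment the 'failover peer "name";' reference in the pool.
    sed -e '/^#failover peer/,/^#}/ s/^# \{0,1\}//' \
        -e 's/^# *\(failover peer "[^"]*";\)/\1/' "$1"
}
```

The result goes to standard output, so you might run "enable_failover /etc/dhcpd.conf.singleserver > /etc/dhcpd.conf" and review the file before restarting dhcpd.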

Step 4 - Configuring the Secondary

The secondary is just as simple to configure for fail-over and load balancing, but it's more overall work because you have to provide it with unique IP addresses that differ from the primary... then reconfigure LTSP and any other network dependent services. This involves changing quite a few files and plenty of testing.

1. We're about to fork the configuration of the two boxes. The primary should be powered down for this step.
2. Power up the secondary and verify that it is working on the network and operationally identical to the primary.
3. Configure all interfaces of the secondary with their new IP addresses and network configurations. Save BRD changes.
4. Make all required changes for LTSP and services to work with the new network configuration. Save BRD changes.
    At minimum, LTSP will require changes to the following files:
/etc/ltsp/i386/etc/lts.conf
/etc/dhcpd.conf
5. Thoroughly test to make sure that the new secondary works with its new network settings. Keep the primary offline.
6. Copy dhcpd.conf to dhcpd.conf.singleserver.  Save BRD changes. You'll need this file to return to a single server configuration.
7. In /etc/dhcpd.conf, remove the comment (#) from the following lines:
#failover peer "ltsp" {
# secondary;
# address 172.16.0.2;
# port 520;
# peer address 172.16.0.1;
# peer port 519;
# max-response-delay 30;
# max-unacked-updates 10;
# load balance max seconds 3;
#}
8. Set "address" to the IP of your Secondary and "peer address" to the IP of your Primary.
9. Further down in /etc/dhcpd.conf, remove the comment (#) from the failover peer line in the pool declaration:
pool {
# failover peer "ltsp";
max-lease-time 7200;
deny dynamic bootp clients;
range 172.16.0.25 172.16.0.200;
}
10. Save changes for dhcpd.conf to the BRD media.
11. If you've got a firewall set up on your server, make sure to open the LTSP interface up to TCP traffic on "port" and "peer port".
12. If you've installed openMosix, configure /etc/mosix.map. If there's a firewall, open the appropriate ports for openMosix.
13. Save changes you've made to the secondary configuration to the BRD media.
14. Verify that your ntp time sync mechanism is working. Both servers need to be time sync'ed for dhcpd fail-over to work.
15. The secondary configuration is complete. Halt the system and power the machine down.
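
For the firewall step, here is a rough sketch that prints candidate rules for you to review instead of applying them. It assumes an iptables based firewall whose existing rules already accept established connections, and the interface name eth1 in the usage note is only a placeholder for your LTSP interface. The function name failover_fw_rules is hypothetical. Systems using ipchains will need different commands.

```shell
# failover_fw_rules IFACE PEER_IP LOCAL_PORT PEER_PORT  (hypothetical helper)
# Prints, but does not run, iptables commands for the dhcpd failover link.
failover_fw_rules() {
    iface=$1 peer=$2 lport=$3 pport=$4
    # Inbound: the peer connects to the port this server listens on ("port").
    echo "iptables -A INPUT -i $iface -p tcp -s $peer --dport $lport -j ACCEPT"
    # Outbound: this server connects to the peer's listening port ("peer port").
    echo "iptables -A OUTPUT -o $iface -p tcp -d $peer --dport $pport -j ACCEPT"
}
```

With the sample addresses above, the secondary would use "failover_fw_rules eth1 172.16.0.1 520 519" and the primary "failover_fw_rules eth1 172.16.0.2 519 520".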

Step 5 - System Integration and Test

At this point, we've got two boxes with different unique IP addresses that function identically. If you save dhcpd.conf and copy dhcpd.conf.singleserver to dhcpd.conf, each box will function as a stand alone server... but at a different IP address. We've configured the two DHCP servers to listen to one another and take over in the event the other one doesn't talk or respond. We've also set up DHCP load balancing and a pool of IP addresses for the two servers to share. Now it's time to put it all together!
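
Since the two failover declarations have to mirror each other (each server's "address" and "port" must be the other's "peer address" and "peer port"), a quick cross-check before powering both boxes up can save some head scratching. The sketch below is only an illustration: the function names are hypothetical, and it assumes each file holds a single uncommented failover declaration with one statement per line, as in the samples above.

```shell
# failover_endpoints FILE  (hypothetical helper)
# Prints "address port peer-address peer-port" from the failover declaration.
failover_endpoints() {
    awk '
        /^[ \t]*#/ { next }                        # skip commented-out lines
        /failover peer .*[{]/ { inblk = 1; next }  # start of the declaration
        inblk && /[}]/ { inblk = 0 }               # end of the declaration
        inblk {
            gsub(/;/, "")
            if ($1 == "address") addr = $2
            else if ($1 == "port") port = $2
            else if ($1 == "peer" && $2 == "address") paddr = $3
            else if ($1 == "peer" && $2 == "port") pport = $3
        }
        END { print addr, port, paddr, pport }
    ' "$1"
}

# peers_mirror PRIMARY_CONF SECONDARY_CONF  (hypothetical helper)
# Succeeds when each side's address/port equals the other's peer address/port.
peers_mirror() {
    set -- $(failover_endpoints "$1") $(failover_endpoints "$2")
    [ $# -eq 8 ] && [ "$1 $2" = "$7 $8" ] && [ "$3 $4" = "$5 $6" ]
}
```

To use it, copy the secondary's dhcpd.conf somewhere the primary can read it (your BRD media already has one) and run peers_mirror on the two files.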

1. Hook everything up... network cables, switches, hubs, keyboards etc.
2. Simultaneously power up both servers and allow them to boot. Watch for errors during the boot process.
3. If it was a clean boot, wait a minute or two and try starting one of the LTSP workstations.
4. If you got a boot error or the workstation didn't come up, proceed to Appendix A for troubleshooting hints.
5. Unplug the LTSP network cable from one server at a time and make sure the workstations can boot from the remaining box.
6. Simultaneously boot the workstations. They should come up in roughly a 50/50 proportion. If they don't, see Appendix B.
7. If openMosix is installed, you can run Matt Rechenburg's stress test to tune and optimize the computational cluster.
8. If you made any changes during integration, save them to your BRD media.
9. Make a backup of your BRD media! :o) Backing up of one or both hard drives is also recommended.
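
To put a number on the 50/50 split in step 6, you can count how many distinct workstations each server has acknowledged. This sketch assumes dhcpd logs through syslog with lines of the form "DHCPACK on <ip> to <mac> via <interface>"; the log file location (often /var/log/messages) and the function name acked_clients are assumptions.

```shell
# acked_clients LOGFILE  (hypothetical helper)
# Prints the number of distinct client MAC addresses that received a DHCPACK.
acked_clients() {
    awk '/DHCPACK on/ {
             # The MAC address is the word that follows "to".
             for (i = 1; i < NF; i++)
                 if ($i == "to") { print $(i + 1); break }
         }' "$1" | sort -u | wc -l | tr -d ' '
}
```

Run it against the log on each server after booting the workstations and compare the two counts.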

Aside from some fine tuning, you're done... enjoy your ETS cluster! The ratio of workstation load balancing is adjustable if the performance of your two servers isn't symmetrical. See Appendix B for details.

