Xen at ServerBeach
From OptionC
This is how we got (and will continue to get) our Xen servers up at ServerBeach (http://www.serverbeach.com). I've kept them in mind ever since I encountered this post in the Serverbeach forums (http://forums.serverbeach.com/showthread.php?t=5042) - from somebody who has done this already.
This howto is not for the faint of heart, nor is it recommended for those who are drunk or very tired. The point of much of this is to make it so that you can never lose access to your remote server, but that, unfortunately, can be a risky process. Six-and-a-half of one, not-quite-half-a-dozen of the other...
Hardware at ServerBeach
First I ordered what is called the "Linux Power Server" from ServerBeach (http://www.serverbeach.com). The particular model I got was a closeout special, but there should be something similar. The basic specs are...
- eth 8139too
- AMD Athlon 2100 Processor
- 1GB Memory
- 60 GB Hard Drive (Maxtor 6Y060L0)
Oh, and of course, I ordered it with Debian installed, although you could just as well order any of the other flavours of Linux that they offer (currently Red Hat and CentOS 4, although the option of CentOS isn't obvious until you start the order process) and debootstrap, since this process does involve destroying the initial install. Still, the way I did it involves less hand tweaking...
If you're not familiar with ServerBeach (http://www.serverbeach.com), they do dedicated, unmanaged servers and they host sites such as OLS (http://www.linuxsymposium.org/). Although they are not the only provider we use, so far they've seemed responsive and I haven't had any problems that I didn't cause myself. You can hard reboot your servers even after you've lost control, and they also have a "RapidRescue" environment for when you've really hosed things. They will do a phone verification of your first order (at least they did when I ordered).
Add grub from "etch/testing"
(When I first did this documentation, the version of grub I wanted was only in unstable; it's long since become available in etch, and I've been using that version, so I've tried to clean up this section to reflect that)
I did this so that I could take advantage of grub's fallback and boot-only-once options, which have been disabled in the version of grub in Sarge. This part of the howto may seem a bit long, because it involves apt-pinning, but the basic overview is...
- add etch/testing to /etc/apt/sources.list
- pin repositories
- install grub from etch
- install the new grub to the MBR
Make sure everything is up-to-date. If you requested Debian on your server, /etc/apt/sources.list should be for sarge/stable.
# apt-get update
# apt-get dist-upgrade
Modify /etc/apt/sources.list and pin the system
Add the following to "/etc/apt/sources.list"
deb ftp://ftp.us.debian.org/debian/ etch main
deb http://security.debian.org/ etch/updates main
Add the following to /etc/apt/apt.conf.d/70debconf
APT::Default-Release "stable";
Create the following /etc/apt/preferences file so we only get the packages we need from etch, not everything.
/etc/apt/preferences
Package: *
Pin: release a=stable
Pin-Priority: 700

Package: *
Pin: release a=testing
Pin-Priority: 650
Now test to make sure you're getting what you want (the "-s" switch just shows you what will happen, it doesn't actually do it)
# apt-get update
# apt-get -s -t testing install grub
Reading Package Lists... Done
Building Dependency Tree... Done
The following extra packages will be installed:
  libc6 libncurses5
Suggested packages:
  grub-doc grubconf locales glibc-doc
The following packages will be upgraded:
  grub libc6 libncurses5
3 upgraded, 0 newly installed, 0 to remove and 129 not upgraded.
Inst libc6 [2.3.2.ds1-22] (2.3.5-11 Debian:unstable)
Conf libc6 (2.3.5-11 Debian:unstable)
Inst libncurses5 [5.4-4] (5.5-1 Debian:testing)
Conf libncurses5 (5.5-1 Debian:testing)
Inst grub [0.95+cvs20040624-17] (0.97-3 Debian:testing)
Conf grub (0.97-3 Debian:testing)
As long as you saw something along those lines, you are okay (if you saw something like "about to install a zillion packages, most of them from something other than stable" you are not okay; check your /etc/apt/preferences file).
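If you'd rather not eyeball the simulated run, the "Inst" lines can be summarized mechanically. This is only a sketch: the function name summarize_sim is made up for illustration, and it assumes the "Inst" line format shown in the output above (later apt versions may print extra fields).

```shell
#!/bin/sh
# summarize_sim: read "apt-get -s" output on stdin and print, for each
# package that would be installed, its name and the archive it comes from.
# Assumes Inst lines of the form:
#   Inst <pkg> [<old>] (<new> <Origin>:<archive>)
summarize_sim() {
    awk '$1 == "Inst" {
        src = $NF              # last field, e.g. "Debian:testing)"
        sub(/\)$/, "", src)    # drop the closing parenthesis
        print $2, src
    }'
}

# Example fed with lines copied from the simulated run (no apt needed):
printf '%s\n' \
    'Inst libncurses5 [5.4-4] (5.5-1 Debian:testing)' \
    'Conf libncurses5 (5.5-1 Debian:testing)' \
    'Inst grub [0.95+cvs20040624-17] (0.97-3 Debian:testing)' |
    summarize_sim
```

On the real system you would pipe the simulation into it, for example "apt-get -s -t testing install grub | summarize_sim"; a short list is what you want to see.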
Add grub 0.97-x
Add the grub package. The install process should look like this (although by the time I confirmed this, the current version was 0.97-4)
# apt-get -t testing install grub
If you want to be really careful (or carefree, depending on your point of view) reboot here, just to make sure the changes to libc6 didn't mess anything up (that's what I did). At this point you are still using the original version of grub as a boot loader, as we haven't installed the new one to the MBR. After the reboot (mine takes about 60 seconds from when I type "shutdown -r now" to when the system starts to respond to pings again), you'll actually do the install of grub to the Master Boot Record.
Install the new version of grub to the MBR
# grub-install /dev/hda
Installation finished. No error reported.
This is the contents of the device map /boot/grub/device.map.
Check if this is correct or not. If any of the lines is incorrect,
fix it and re-run the script `grub-install'.

(fd0) /dev/fd0
(hd0) /dev/hda
NOTE: The grub documentation says that if you have a separate boot partition mounted at /boot (which serverbeach does), you should use this form of grub-install: "grub-install --root-directory=/boot /dev/hda" (http://www.gnu.org/software/grub/manual/html_node/Invoking-grub_002dinstall.html#Invoking-grub_002dinstall). However, that creates the wrong structure. I also found a post that said "don't use --root-directory=/boot, it is not needed anymore!" (http://lists.gnu.org/archive/html/bug-grub/2005-07/msg00008.html), so we don't.
If you're paranoid about rebooting, don't do it. I'm paranoid about it, but I still do it now, so that if I have problems later I know how far I got in the process before it all started to go horribly awry.
A comment about splash images and update-grub
When making these sorts of serious modifications to the kernel and boot options, I take over modifying grub by hand, as opposed to using Debian's handy "update-grub" script. However, there are things that you might do in the future (installing a standard kernel, for example), that might cause update-grub to be run. As such, there are a few things to keep in mind.
1) Check the file /etc/kernel-img.conf for the following lines. I remove them to keep update-grub from running when adding/removing kernels.
postinst_hook = /sbin/update-grub
postrm_hook = /sbin/update-grub
2) Make a backup copy of /boot/grub/menu.lst every time you are about to do something that you think might even possibly have a chance of triggering update-grub to run
3) The version of update-grub from unstable looks in /boot/grub for splash image files, and adds them to /boot/grub/menu.lst. I remove these files (such as "splash.xpm.gz") and also make sure there is no line in /boot/grub/menu.lst that refers to them. Otherwise your boot will hang no matter how you've set up your failsafes (You have been warned! Please learn from my mistakes...)
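The checks from points 1 and 3 can be scripted so they are easy to repeat before each reboot. This is a rough sketch, not part of the original procedure: grub_hazards is a hypothetical helper, and it only looks for the two hook lines and an uncommented splashimage line.

```shell
#!/bin/sh
# grub_hazards: report settings that could let update-grub or a splash
# image interfere with a hand-edited menu.lst.
# $1 = kernel-img.conf path, $2 = menu.lst path (hypothetical helper).
# Returns 0 when nothing was found, 1 otherwise.
grub_hazards() {
    conf=$1
    menu=$2
    found=0
    # Hooks that run update-grub when kernels are added or removed
    if grep -Eq '^[[:space:]]*(postinst|postrm)_hook[[:space:]]*=.*update-grub' "$conf" 2>/dev/null; then
        echo "update-grub hook present in $conf"
        found=1
    fi
    # Any uncommented splashimage line
    if grep -Eq '^[[:space:]]*splashimage' "$menu" 2>/dev/null; then
        echo "splashimage line present in $menu"
        found=1
    fi
    return $found
}
```

Typical use would be "grub_hazards /etc/kernel-img.conf /boot/grub/menu.lst && echo clean".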
Copy / to current swap (create safe boot partition)
Turn off swap, copy root
I started from the very helpful instructions "How to resize the root partition of a remote server" (http://www.xmlvalidation.com/repartion_server.0.html), but had to change them slightly.
Turn off swap and change the partition type.
# swapoff -a
# fdisk /dev/hda
Change partition 2 to type 83 and save your changes. Your system will complain that it is using the old partition table until the next reboot; don't worry about it.
Format /dev/hda2
# mkfs.ext3 /dev/hda2
Copy the root filesystem to the former swap partition.
# mkdir /mnt/hda2
# mount /dev/hda2 /mnt/hda2
# tar clf - -C / .| tar xf - -C /mnt/hda2
tar: ./dev/log: socket ignored
Copy the boot files to the former swap partition.
# cd /mnt/hda2/boot
# cp -dpR /boot/* .
And remove the grub directory, just to avoid confusion later on.
# rm -rf grub
Modify fstab
You want to remove and/or comment out the swap line from the current fstab (on /dev/hda3).
/etc/fstab
# /etc/fstab for "original" partition
/dev/hda3 / ext3 errors=remount-ro 0 1
proc /proc proc defaults 0 0
#/dev/hda2 none swap sw 0 0
/dev/hda1 /boot ext3 defaults 0 2
Do the same for fstab in /mnt/hda2/etc/fstab, and also change it so that "/dev/hda2" is mounted at "/" (If you are prone to typos, like I am, you may want to chroot into this partition first).
/mnt/hda2/etc/fstab
# /etc/fstab for "safe" partition
/dev/hda2 / ext3 errors=remount-ro 0 1
proc /proc proc defaults 0 0
#/dev/hda2 none swap sw 0 0
/dev/hda1 /boot ext3 defaults 0 2
Unmount /dev/hda2
# umount /mnt/hda2
Change /boot/grub/menu.lst to allow for fallback recovery
A standard Debian /boot/grub/menu.lst (as created by update-grub), has no fallback, and the default boot entry is "0". update-grub will look through /boot to create stanzas, and puts them in between the "AUTOMAGIC KERNEL LIST" demarcators. Backup original /boot/grub/menu.lst before making changes. All the defaults before the line "# Put static boot stanzas before and/or after AUTOMAGIC KERNEL LIST" should be removed or commented out and replaced with the ones in the example.
# cp /boot/grub/menu.lst /boot/grub/menu.bkup
# /boot/grub/menu.lst
default saved
timeout 5
color cyan/blue white/blue
fallback 0

title Debian GNU/Linux, kernel 2.6.8-2-k7
root (hd0,0)
kernel /vmlinuz-2.6.8-2-k7 root=/dev/hda3 ro
initrd /initrd.img-2.6.8-2-k7
savedefault
boot

title Debian GNU/Linux, kernel 2.6.8-2-k7, safe
root (hd0,1)
kernel /boot/vmlinuz-2.6.8-2-k7 root=/dev/hda2 ro
initrd /boot/initrd.img-2.6.8-2-k7
savedefault fallback
boot

#
# Put static boot stanzas before and/or after AUTOMAGIC KERNEL LIST
[...and so on...]
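Since a typo in the fallback line defeats the whole point, it may be worth checking that every fallback number actually refers to a "title" entry (entries are counted from 0). A minimal sketch, assuming a hand-edited GRUB legacy menu.lst like the one above; check_fallback is a hypothetical name.

```shell
#!/bin/sh
# check_fallback: verify every index on the "fallback" line of a
# GRUB legacy menu.lst refers to an existing "title" entry.
# Entries are numbered from 0. Hypothetical helper.
check_fallback() {
    menu=$1
    # grep -c exits non-zero when the count is 0, so guard it
    titles=$(grep -c '^[[:space:]]*title' "$menu") || true
    for idx in $(sed -n 's/^[[:space:]]*fallback[[:space:]=]*//p' "$menu"); do
        if [ "$idx" -ge "$titles" ]; then
            echo "fallback $idx has no matching entry ($titles titles)"
            return 1
        fi
    done
    echo "ok: $titles entries"
}
```

Run it as "check_fallback /boot/grub/menu.lst" after each edit. It only checks the numbering, not whether the kernels and partitions named in the stanzas exist.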
Set the system to boot off the "new" partition
# grub-set-default 1
# cat /boot/grub/default
1
#
#
#
#
#
#
#
#
#
#
# WARNING: If you want to edit this file directly, do not remove any line
# from this file, including this warning. Using `grub-set-default\' is
# strongly recommended.
Reboot. The point is to reboot into the new partition, but if something goes wrong, we will fall back to the original configuration.
# shutdown -r now
Check to make sure we've booted into the correct partition
# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/hda2            1004M  366M  587M  39% /
tmpfs                 499M  4.0K  499M   1% /dev/shm
/dev/hda1              76M   11M   66M  14% /boot
Repartition Server
Delete /dev/hda3. The swap space is for the dom0 only, not for all the machines, so although I create a swap, it's not as big as the original.
This is the layout with which you start.
# fdisk -l /dev/hda

Disk /dev/hda: 61.4 GB, 61492838400 bytes
255 heads, 63 sectors/track, 7476 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/hda1   *           1          10       80293+  83  Linux
/dev/hda2              11         140     1044225   83  Linux
/dev/hda3             141        7476    58926420   83  Linux
And this is where you finish (personal preferences for swap and need for additional partitions notwithstanding).
# fdisk -l /dev/hda

   Device Boot      Start         End      Blocks   Id  System
/dev/hda1   *           1          10       80293+  83  Linux
/dev/hda2              11         140     1044225   83  Linux
/dev/hda3             141         203      506047+  82  Linux swap / Solaris
/dev/hda4             204        7476    58420372+   5  Extended
/dev/hda5             204         453     2008093+  83  Linux
/dev/hda6             454        7476    56412216   8e  Linux LVM
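If you want different sizes, it helps to know how cylinders translate into the "Blocks" column. With the geometry fdisk reports for this disk, one cylinder is 16065 sectors of 512 bytes, and a block is 1 KiB. The helper below is just a sketch for doing that arithmetic (cyl_blocks is a made-up name); it applies to primary partitions, since logical partitions lose a little space at the start.

```shell
#!/bin/sh
# cyl_blocks: size in 1 KiB blocks of a partition spanning cylinders
# START..END inclusive, for the geometry fdisk reports for this disk
# (255 heads * 63 sectors = 16065 sectors of 512 bytes per cylinder).
cyl_blocks() {
    start=$1
    end=$2
    sectors=$(( (end - start + 1) * 16065 ))
    echo $(( sectors / 2 ))    # two 512-byte sectors per 1 KiB block
}

cyl_blocks 141 203   # the new swap partition, /dev/hda3: prints 506047
```

That matches the 506047+ in the listing (the "+" marks the leftover odd sector), or roughly 494 MB of swap.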
You'll get the warning about the new table not being used until the next reboot. This time, they mean it. Also, you need to reset the default to the "safe" partition (because once you are using grub "fallback" it sets the default to the next fallback at every boot - verify this if you want by looking at /boot/grub/default).
# grub-set-default 1
# shutdown -r now
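Because the saved default changes behind your back once fallback is in use, I find it reassuring to read it back before every reboot. A small sketch (saved_default is a hypothetical helper; it assumes the entry number is the first line of the default file, as shown earlier):

```shell
#!/bin/sh
# saved_default: print the entry number GRUB legacy will boot next,
# i.e. the first line of the "default" file written by grub-set-default.
# The path argument is optional; /boot/grub/default is the location
# used in this howto.
saved_default() {
    head -n 1 "${1:-/boot/grub/default}"
}
```

For example, "[ "$(saved_default)" = 1 ] && shutdown -r now" only reboots if the safe entry really is the default.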
Check (again!) to make sure we've booted into the correct partition. (You are probably going to get very tired of hearing me say that, and you might begin to suspect that I have not been as careful in the past as I could be about this. You would be correct.)
# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/hda2            1004M  366M  587M  39% /
tmpfs                 499M  4.0K  499M   1% /dev/shm
/dev/hda1              76M   11M   66M  14% /boot
Recreate the original partition and swap
# mkswap /dev/hda3
Make final modifications to fstab on the safe partition
Add line for swap, and change to mount /dev/hda1 at /mnt/hda1, not /boot
# mkdir /mnt/hda1
/etc/fstab
# /etc/fstab for safe partition
/dev/hda2 / ext3 errors=remount-ro 0 1
proc /proc proc defaults 0 0
/dev/hda3 none swap sw 0 0
/dev/hda1 /mnt/hda1 ext3 defaults 0 2
Turn on swap
# swapon -a
Do that thing to 'recreate' the main partition
Sorry, language seems to be escaping me at this point. What we are doing here is reversing the process from earlier, and getting the main partition into its permanent home. (If you feel like you are being forced, entirely against your will, to play a very slow game of hokey-pokey, I'm sorry. Feel free to try this using RapidRescue(tm) (http://www.serverbeach.com/catalog/rapidrescue_public.php), and please let me know how it works for you.)
Copy files
# mkfs.ext3 /dev/hda5
# mkdir /mnt/hda5
# mount /dev/hda5 /mnt/hda5
# tar clf - -C / .| tar xf - -C /mnt/hda5
# rm -rf /mnt/hda5/boot/*
Modify fstab (hopefully for the last time)
# vi /mnt/hda5/etc/fstab
/mnt/hda5/etc/fstab
# /etc/fstab for main partition
/dev/hda5 / ext3 errors=remount-ro 0 1
proc /proc proc defaults 0 0
/dev/hda3 none swap sw 0 0
/dev/hda1 /boot ext3 defaults 0 2
# umount /mnt/hda5
Switch grub again
We are now approaching the "final" /boot/grub/menu.lst, with /dev/hda2 as the complete and safe fallback, and /dev/hda5 as the main / partition.
# vi /boot/grub/menu.lst
/boot/grub/menu.lst
#/boot/grub/menu.lst
#
default saved
timeout 5
color cyan/blue white/blue
fallback 1

title HDA5: Debian GNU/Linux, kernel 2.6.8-2-k7
root (hd0,0)
kernel /vmlinuz-2.6.8-2-k7 root=/dev/hda5 ro
initrd /initrd.img-2.6.8-2-k7
savedefault fallback
boot

title HDA2: Debian GNU/Linux, kernel 2.6.8-2-k7
root (hd0,1)
kernel /boot/vmlinuz-2.6.8-2-k7 root=/dev/hda2 ro
initrd /boot/initrd.img-2.6.8-2-k7
savedefault
boot

#
# Put static boot stanzas before and/or after AUTOMAGIC KERNEL LIST
[...et cetera, et cetera, et cetera...]
Okay, let's test the newly created partition (or perhaps run off to the little girls' or boys' room first, depending on your own personal state).
# grub-set-default 0
# shutdown -r now
(If you are pinging, as I was - and always am - at this point, it takes about 60 seconds to come back, which is about 5 seconds after you've already thought to yourself "it's not coming back")
Did I boot into the correct partition?
# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/hda5             1.9G  376M  1.5G  21% /
tmpfs                 503M  4.0K  503M   1% /dev/shm
/dev/hda1              76M   11M   66M  14% /boot
Set up watchdog, deadman and automatic grub-set-default
WARNING: All of this should be done on your 'main' system; in our example it is with /dev/hda5 mounted as /. If you do not, then when you are booted into safe mode, you will be under tremendous time constraints. However, this means that it is up to you to make sure your safe install really is safe (that is, it will boot with networking, not firewall you out, etc).
Now we've got a pretty good failsafe for if we load a bad kernel, or even if a partition goes bad. As long as we always tell grub to fallback to the "safe" partition last, we should be able to get through kernel panics, missing files, typos and so forth. We still need to reset the default when we boot, because one of the features of the fallback mechanism is that it sets the default to the next fallback (after the booted entry).
However, what if the system boots, and we messed up the networking? As far as grub is concerned, it was a successful boot, yet we can't get to the box. A hard reboot _might_ bring us to the next fallback entry, but it depends on how things are configured. What we'd really like is a way for the system to reboot itself after a certain period of time and in certain conditions. The following set-up is very crude, but it works for me. That's because I'm almost guaranteed to have a constant power supply, so the only reboot condition is a controlled reboot (by me). The next version of these scripts will probably remove even _that_ assumption.
Enough chatter, back to configuration...
Make sure we are in the main system:
# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/hda5             1.9G  376M  1.5G  21% /
tmpfs                 503M  4.0K  503M   1% /dev/shm
/dev/hda1              76M   11M   66M  14% /boot
Create and configure /etc/init.d/grub-set-default
Create the following "/etc/init.d/grub-set-default" file:
#!/bin/bash
#
# grub-set-default
#
# Script to reset default to GRUBDEFAULT in scenario where
# /boot/grub/menu.lst has fallback set
#
#
GRUBDEFAULT=0

case "$1" in
    start)
        if ! [ -e /etc/deadman.d/imok ]; then
            cp /boot/grub/default /boot/grub/old.default
        fi
        echo "setting grub default to " $GRUBDEFAULT
        grub-set-default $GRUBDEFAULT
        ;;
    stop)
        if [ -e /etc/deadman.d/imok ]; then
            echo "resetting deadman state..."
            rm /etc/deadman.d/imok
        fi
        ;;
    status)
        if [ -e /boot/grub/default ]; then
            cat /boot/grub/default
        fi
        ;;
    *)
        # do not advertise unreasonable commands
        echo $"Usage: $0 {start|stop|status}"
        exit 1
esac
exit $?
Make it executable
# chmod 0755 /etc/init.d/grub-set-default
Configure /etc/init.d/grub-set-default to run after watchdog starts and before it stops
# update-rc.d grub-set-default defaults 90 80
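To see what that command should have done: with "defaults" and two sequence numbers, update-rc.d creates start links using the first number in runlevels 2 through 5, and kill links using the second in runlevels 0, 1 and 6. The sketch below just prints the link names to look for (rc_links is a hypothetical helper, not part of Debian).

```shell
#!/bin/sh
# rc_links: list the symlinks "update-rc.d NAME defaults START STOP"
# is expected to create: start links in runlevels 2-5, kill links in
# runlevels 0, 1 and 6.
rc_links() {
    name=$1
    start=$2
    stop=$3
    for rl in 2 3 4 5; do
        echo "/etc/rc$rl.d/S$start$name"
    done
    for rl in 0 1 6; do
        echo "/etc/rc$rl.d/K$stop$name"
    done
}

rc_links grub-set-default 90 80
```

Something like "rc_links grub-set-default 90 80 | xargs ls -l" then confirms each link exists and points at /etc/init.d/grub-set-default.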
And run it now, to make sure the default is set correctly
# /etc/init.d/grub-set-default start
setting grub default to 0
Install watchdog
At the prompts, say "yes" to configure it to start automatically at boot time, "no" when it asks if you want to start it now (both are the defaults, the install should look like this).
# apt-get install watchdog
Configure watchdog
Create /etc/deadman.d/scripts/deadman.sh - this is a very crude script that checks for the /etc/deadman.d/imok file and if it doesn't exist, eventually reboots after setting the grub default to the safe entry.
# mkdir -p /etc/deadman.d/scripts
# vi /etc/deadman.d/scripts/deadman.sh
/etc/deadman.d/scripts/deadman.sh
#!/bin/sh
#Debug mode logs more information to /etc/deadman.d/log
#Use sparingly
#1 is on, anything else is off
DEBUG=1
#which grub menu item are we SURE of?
SAFEBOOT=1
#time without contact from mother ship before hard reboot?
WAIT_MIN=15

#date | cat >> /etc/deadman.d/log
if [ -e /etc/deadman.d/imok ]; then
    if [ $DEBUG -eq 1 ]; then
        date | cat >> /etc/deadman.d/log
        echo 'OK' >> /etc/deadman.d/log
    fi
    exit 0
fi

upt=`cat /proc/uptime | tr -d . | cut -d' ' -f1`
mins=`expr $upt / 6000`

if [ $DEBUG -eq 1 ]; then
    date | cat >> /etc/deadman.d/log
    echo 'NOT OK' >> /etc/deadman.d/log
fi

if [ $mins -ge $WAIT_MIN ]; then
    if [ $DEBUG -eq 1 ]; then
        date | cat >> /etc/deadman.d/log
        echo 'uptime more than '$WAIT_MIN' min ' >> /etc/deadman.d/log
    fi
    grub-set-default $SAFEBOOT
    shutdown -r now
    # and if that doesn't work, let watchdog try
    # return code -1 means reboot
    exit -1
fi
exit 0
Make it executable
# chmod 0755 /etc/deadman.d/scripts/deadman.sh
Make sure you tell the system you're okay, or you might reboot before you finish the configuration.
# touch /etc/deadman.d/imok
Tell watchdog to use the script as the test, as opposed to any of the built-in tests, by changing this line in /etc/watchdog.conf...
#test-binary =
...to this...
test-binary = /etc/deadman.d/scripts/deadman.sh
As you can see by looking at /etc/watchdog.conf, there are much more sophisticated ways to go about implementing this. Go ahead, experiment, and if you are feeling generous, report what you've done back here.
We've started in debug mode, so restart watchdog with the new configuration, and watch to make sure it's working.
# /etc/init.d/watchdog restart
# tail -f /etc/deadman.d/log
Tue Jan 24 21:01:54 UTC 2006
OK
Tue Jan 24 21:02:09 UTC 2006
OK
Tue Jan 24 21:02:24 UTC 2006
OK
Tue Jan 24 21:02:39 UTC 2006
OK
Test watchdog/deadman configuration
If your system has been up longer than "WAIT_MIN" in /etc/deadman.d/scripts/deadman.sh, then the minute you remove /etc/deadman.d/imok your system will reboot. So if you feel like watching the process in slow motion, leave debugging on and set the wait time to something higher than your current uptime.
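To pick a wait time that is safely above the current uptime, you can compute the uptime in minutes the same way deadman.sh does. This is a sketch only: the function names are hypothetical, and the ten-minute margin is an arbitrary choice for illustration.

```shell
#!/bin/sh
# uptime_minutes: whole minutes of uptime, computed the same way as
# deadman.sh: drop the decimal point from the first field of
# /proc/uptime (giving hundredths of a second) and divide by 6000.
# An alternate file can be passed as $1 for testing.
uptime_minutes() {
    upt=$(tr -d . < "${1:-/proc/uptime}" | cut -d' ' -f1)
    # expr exits non-zero when the result is 0, so guard it
    expr "$upt" / 6000 || true
}

# suggest_wait_min: a WAIT_MIN that leaves about ten minutes before
# the deadman reboot would trigger (the margin is an assumption).
suggest_wait_min() {
    echo $(( $(uptime_minutes "$@") + 10 ))
}
```

Run "suggest_wait_min" on the server and put the result in WAIT_MIN before removing the imok file.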
Remove /etc/deadman.d/imok. When your system comes back, you should be in the safe partition.
# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/hda2            1004M  367M  587M  39% /
tmpfs                 503M  4.0K  503M   1% /dev/shm
/dev/hda1              76M   11M   66M  14% /mnt/hda1
You will need to manually reset the default and reboot. If this wasn't a test, manually resetting everything would be a small portion of what you would do now, since you'd be trying to figure out what was wrong. Anyhow...
Mount the main partition and change the "WAIT_MIN" variable back to something more reasonable.
# mount /dev/hda5 /mnt/hda5
# vi /mnt/hda5/etc/deadman.d/scripts/deadman.sh
[...]
WAIT_MIN=15
[...]
# umount /mnt/hda5
Reset the grub default and reboot
# grub-set-default --root-directory=/mnt/hda1 0
# shutdown -r now
After you reboot...
# touch /etc/deadman.d/imok
And turn off debug logging (change DEBUG to 0 in /etc/deadman.d/scripts/deadman.sh).
Near the top of my list of "things to do when I have time" is to automate this slightly, including using real logging, but it works for now, and is faster and cheaper than getting the remote support staff to intervene when I've done something wrong.
REPEAT WARNING: All of this configuration should be done when you've booted into your 'main' system; in the example it is with /dev/hda5 mounted as /.
Add Xen kernel and test
At this point, if you haven't died of boredom, you're thinking something like "didn't she mention Xen?" Well, yes, I did. And since Xen requires a recompiled, custom, non-standard kernel and network modifications, I'm not terribly comfortable throwing them on a remote system, especially when my local sandbox is only a vague approximation of that system.
Add Xen kernel
If you feel like compiling your own kernel, by all means do so. If you want to use one of ours, add the following to /etc/apt/sources.list. This example uses our "big" kernel, with the initrd and the modules.
deb http://www.option-c.com/debian/ unstable main
Normally pinning the Xen packages to the Option-C repository and version 2.0.7 of xen would keep you from having issues; however the current (2006-01-24) packages don't have a versioned dependency for xen-hypervisor, so for now just remove the line for debian official testing and unstable in /etc/apt/sources.list
# apt-get update
Make a backup of your current /boot/grub/menu.lst
# cp /boot/grub/menu.lst /boot/grub/menu.bkup
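Note that copying to menu.bkup again replaces the backup made earlier. If you'd like to keep every generation, a timestamped copy is one option. A minimal sketch (backup_menu is a hypothetical helper, not something the packages provide):

```shell
#!/bin/sh
# backup_menu: copy a file to <file>.<timestamp> and print the new name,
# so that an earlier backup is never overwritten. Hypothetical helper;
# defaults to the menu.lst path used in this howto.
backup_menu() {
    src=${1:-/boot/grub/menu.lst}
    dst="$src.$(date +%Y%m%d-%H%M%S)"
    cp -p "$src" "$dst" && echo "$dst"
}
```

Calling "backup_menu" with no argument backs up /boot/grub/menu.lst and tells you where the copy went.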
Install the kernel and modules (which should play out like this)
# apt-get install kernel-image-ksxen0
Make the initrd (sorry, the package doesn't do this automatically yet; to check if this has been fixed, just look for initrd.img-2.6.11-ksxen0 in /boot)
# mkinitrd -o /boot/initrd.img-2.6.11-ksxen0 2.6.11-ksxen0
If you've read other parts of this wiki (or are familiar with Xen) you might wonder why we aren't installing any of the other tools. It's because at this stage we simply want to make sure the kernel boots and can use the network card, so we don't want to mess with anything else. There's no problem booting a Xen kernel without the control tools, as long as you have the Xen hypervisor installed (you just won't be able to do much).
Add Xen entry to /boot/grub/menu.lst
Add a stanza to /boot/grub/menu.lst so that it has the following entries (also change "fallback 1" to "fallback 1 2")
/boot/grub/menu.lst
default saved
timeout 5
color cyan/blue white/blue
fallback 1 2

title HDA5: Debian 3.1/Xen 2.0.7, kernel 2.6.11-ksxen0
root (hd0,0)
kernel /xen-2.0.7.gz dom0_mem=262144
module /xen-linux-2.6.11-ksxen0 root=/dev/hda5 ro console=tty0
module /initrd.img-2.6.11-ksxen0
savedefault fallback
boot

title HDA5: Debian GNU/Linux, kernel 2.6.8-2-k7
root (hd0,0)
kernel /vmlinuz-2.6.8-2-k7 root=/dev/hda5 ro
initrd /initrd.img-2.6.8-2-k7
savedefault fallback
boot

title HDA2: Safe Debian GNU/Linux, kernel 2.6.8-2-k7
root (hd0,1)
kernel /boot/vmlinuz-2.6.8-2-k7 root=/dev/hda2 ro
initrd /boot/initrd.img-2.6.8-2-k7
savedefault
boot

#
# Put static boot stanzas before and/or after AUTOMAGIC KERNEL LIST
[... la la la la la...]
NOTE: If you are following along trying to adapt this to a CentOS or Fedora box, be careful of this syntax. Your grub is likely a little older and the syntax of your menu.lst file is going to be subtly different. Be careful not to screw up your menu.lst file because nothing makes a box more useless than a remote machine with an invalid menu.lst file.
Edit /etc/deadman.d/scripts/deadman.sh so that the safe entry is "SAFEBOOT=2"
Check that the default is 0 (the new Xen kernel entry).
# cat /boot/grub/default
Reboot and do whatever it is you do when you feel that your efforts alone might not result in the outcome you desire... (cross your fingers, take a shot of whiskey, that sort of thing).
# shutdown -r now
If (when) your system comes back (usually right on schedule, five seconds after panic has set in), check to see if you booted the Xen kernel
# cat /proc/version
Linux version 2.6.11-ksxen0 (root@debian_build) (gcc version 3.3.5 (Debian 1:3.3.5-13)) #1 Fri Nov 4 01:09:42 UTC 2005
And tell the system you're in and okay
# touch /etc/deadman.d/imok
Congratulations. Take a break, I know I'm going to.
Add and configure the rest of Xen
Set up LVM
We need to actually install LVM and create a place for our domUs to live. There are more details about LVM on this site, but the basics are...
Install LVM (if you want to know what that would look like, we provide...)
# apt-get install lvm2
Create an LVM physical volume from the large partition that we went to such pains to create a few steps back
# pvcreate /dev/hda6
  Physical volume "/dev/hda6" successfully created
Create a volume group named "vg" (or whatever you choose) with that PV in it
# vgcreate vg /dev/hda6
  Volume group "vg" successfully created
Look at your new volume group
# vgdisplay
  --- Volume group ---
  VG Name               vg
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  1
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                0
  Open LV               0
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               53.80 GB
  PE Size               4.00 MB
  Total PE              13772
  Alloc PE / Size       0 / 0
  Free  PE / Size       13772 / 53.80 GB
  VG UUID               B5Bfqu-YjiI-gz2h-XoPC-54qT-e9E6-mvqHCe
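The numbers in that display hang together: 13772 extents of 4 MB each is 55088 MB, which is the 53.80 GB shown. Space is handed out in whole extents, so it can be handy to work out how many a planned volume will take. A small sketch (extents_for is a hypothetical helper; the 4 MB default is the PE Size from the display, and sizes are rounded up to whole extents, as lvcreate does):

```shell
#!/bin/sh
# extents_for: number of physical extents needed for a logical volume
# of SIZE_MB megabytes with PE_MB-megabyte extents, rounded up
# to a whole number of extents.
extents_for() {
    size_mb=$1
    pe_mb=${2:-4}
    echo $(( (size_mb + pe_mb - 1) / pe_mb ))
}

extents_for 600   # a 600 MB root volume: prints 150
extents_for 64    # a 64 MB swap volume: prints 16
```

So a small domU like that costs 166 of the 13772 free extents.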
Get a domU kernel
This should also give you the rest of what you need (xen-tools and dependencies), and the "monolithic" domU kernel
# apt-get install kernel-image-xenU kernel-image-2.6.11-bixenu
Create a domU
It's really likely that you've set up Xen before and have some domUs ready to deploy. If so, your eyes probably glazed over at these detailed instructions long ago. For those who haven't, we want at least one domU set up (but not started) before we get to the network configuration (for testing), so here are the basics. Details about getting domUs are either on the domU metapage, or on the page Create_a_Debian_VM_with_debootstrap. As such, we're just stepping through the basics of creating a baby domU.
Add debootstrap
# apt-get install debootstrap
Create logical volumes for the domU and swap
# lvcreate -L 600M -n domu1 vg
  Logical volume "domu1" created
# lvcreate -L 64M -n domu1_swap vg
  Logical volume "domu1_swap" created
Format
# mkfs.ext2 /dev/vg/domu1
[blah blah blah]
# mkswap /dev/vg/domu1_swap
Setting up swapspace version 1, size = 67104 kB
no label, UUID=dd5b4071-f19c-4f6e-b0ea-0bab188b1ffb
Mount LV and add OS
# mkdir /mnt/lvm1
# mount /dev/vg/domu1 /mnt/lvm1
# debootstrap --arch i386 sarge /mnt/lvm1 http://ftp.us.debian.org/debian
(As I sit in a Starbuck's somewhere in the midwestern United States and test these commands at my remote Serverbeach server, which I believe is somewhere in California, I had to wait 11 minutes for this process to be completed. The second time, testing from across the street in Borders - with the server still in California - it took just over 8 minutes, which just goes to show that a watched install may complete, but not as quickly as an unmonitored one. Your mileage may vary.)
Change domU settings
# mv /mnt/lvm1/lib/tls /mnt/lvm1/lib/tls.disabled
# chroot /mnt/lvm1
# echo "domu1" > /etc/hostname
# cat > /etc/apt/sources.list << "EOF"
# sarge/stable
deb ftp://ftp.us.debian.org/debian/ sarge main
deb-src ftp://ftp.us.debian.org/debian/ sarge main
deb http://security.debian.org/ sarge/updates main
EOF
# cat > /etc/fstab << "EOF"
# Begin /etc/fstab
# <file system> <mount-point> <type> <options> <dump> <pass>
/dev/sda1 / ext3 defaults,errors=remount-ro 0 0
/dev/sda2 swap swap sw 0 0
proc /proc proc defaults 0 0
# End /etc/fstab
EOF
# cat > /etc/network/interfaces << "EOF"
# Begin /etc/network/interfaces
# The loopback network interface
auto lo
iface lo inet loopback
# The primary network interface
auto eth0
iface eth0 inet static
address 10.66.66.67
netmask 255.255.255.0
gateway 10.66.66.66
broadcast 10.66.66.255
# End /etc/network/interfaces
EOF
Check /etc/resolv.conf; if there are no entries, add the nameservers. Then leave the chroot with [ctrl]+D before unmounting.
# cat > /etc/resolv.conf << "EOF"
# Serverbeach nameservers
nameserver 64.34.160.76
nameserver 64.34.160.92
EOF
[ctrl]+D
# umount /mnt/lvm1
Create a configuration file to boot the domU
# mkdir /etc/xen/conf
# touch /etc/xen/conf/domu1.cfg
# cat > /etc/xen/conf/domu1.cfg << "EOF"
kernel = "/boot/xen-linux-2.6.11-bixenu"
memory = 32
name = "domu1"
nics = 1
disk = ['phy:vg/domu1,sda1,w','phy:vg/domu1_swap,sda2,w' ]
root = "/dev/sda1 ro"
restart = 'onreboot'
EOF
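Once you have more than one domU, typing these files by hand gets old. The sketch below prints the same configuration for any name, assuming you keep the naming convention used here (logical volumes called NAME and NAME_swap in the volume group "vg"). domu_cfg is a hypothetical helper, and the kernel path and options are simply copied from domu1.

```shell
#!/bin/sh
# domu_cfg: print a Xen 2.x domain config for a domU whose root and
# swap live on LVs named <name> and <name>_swap in volume group "vg".
# Hypothetical helper; kernel path and layout are copied from domu1.
# $1 = domU name, $2 = memory in MB (defaults to 32).
domu_cfg() {
    name=$1
    mem=${2:-32}
    cat <<EOF
kernel = "/boot/xen-linux-2.6.11-bixenu"
memory = $mem
name = "$name"
nics = 1
disk = ['phy:vg/$name,sda1,w','phy:vg/${name}_swap,sda2,w']
root = "/dev/sda1 ro"
restart = 'onreboot'
EOF
}
```

For example, "domu_cfg domu2 64 > /etc/xen/conf/domu2.cfg" would write a config for a second domU with 64 MB of memory; the logical volumes still have to be created and populated separately.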
Create the link so it will automatically start on boot
# ln -s /etc/xen/conf/domu1.cfg /etc/xen/auto/domu1
Please, no matter how tempted you may be, do not start the domU right now!!!!!
Set up Networking
This is a subsection of the Xen Networking Page which is under construction on this site. We are using the directions for a dom0 with a single IP address. It is more or less an exact copy of that, and may be folded in at some point.
/etc/network/interfaces
Add the bridge for the internal network.
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
# (Will be a valid external IP address)
auto eth0
iface eth0 inet static
    address 64.34.x.x
    netmask 255.255.255.192
    gateway 64.34.x.x
    pre-up modprobe -v 8139too || true

# The Xen bridged network
# (An rfc1918 IP address)
auto xen-br0
iface xen-br0 inet static
    pre-up brctl addbr xen-br0
    address 10.66.66.66
    netmask 255.255.255.0
    network 10.66.66.0
    broadcast 10.66.66.255
    bridge_fd 0
    bridge_hello 0
    bridge_stp off
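The bridge address doubles as the gateway for the domUs, so each domU's static address has to sit in the same network as 10.66.66.66. With a netmask of 255.255.255.0 that just means the first three octets match. A tiny sketch of that check (same_net24 is a hypothetical helper and only handles this /24 case):

```shell
#!/bin/sh
# same_net24: succeed if two dotted-quad addresses share their first
# three octets, i.e. sit in the same network for netmask 255.255.255.0.
same_net24() {
    [ "${1%.*}" = "${2%.*}" ]
}

# Bridge address in dom0 and static address of the domU
if same_net24 10.66.66.66 10.66.66.67; then
    echo "domU can reach its gateway on xen-br0"
fi
```

If you pick a different private range for the bridge, change the domU's address, gateway and broadcast to match.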
Bring up the bridge (this will be done automatically at boot time).
# ifup xen-br0
Check the bridge (the bridge won't have any interfaces on it until we bring up some virtual machines later)
# ifconfig xen-br0
xen-br0 Link encap:Ethernet HWaddr 00:00:00:00:00:00
inet addr:10.66.66.66 Bcast:10.66.66.255 Mask:255.255.255.0
inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:5 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 b) TX bytes:378 (378.0 b)
# brctl show
bridge name bridge id STP enabled interfaces
xen-br0 8000.000000000000 no can't get port info: Function not implemented
/etc/xen/xend-config.sxp
The critical change from the default is that "network-script" will be "network-route", not "network"
# /etc/xen/xend-config.sxp
# Xend configuration file.

# Port xend should use for the HTTP interface.
(xend-port 8000)

# Port xend should use for the event interface.
(xend-event-port 8001)

# Address xend should listen on for HTTP connections.
# Specifying 'localhost' prevents remote connections.
# Specifying the empty string '' allows all connections.
(xend-address 'localhost')

# The port xend should start from when allocating a port
# for a domain console.
(console-port-base 9600)

# Address xend should listen on for console connections.
# Specifying 'localhost' prevents remote connections.
# Specifying the empty string '' allows all connections.
(console-address 'localhost')

## Use the following if VIF traffic is routed.
# The script used to start/stop networking for xend.
(network-script network-route)
# The default script used to control virtual interfaces.
#(vif-script vif-route)

## Use the following if VIF traffic is bridged.
# The script used to start/stop networking for xend.
#(network-script network)
# The default bridge that virtual interfaces should be connected to.
(vif-bridge xen-br0)
# The default script used to control virtual interfaces.
(vif-script vif-bridge)

# Whether iptables should be set up to prevent IP spoofing for
# virtual interfaces. Specify 'yes' or 'no'.
(vif-antispoof no)

# Setup script for file-backed block devices
(block-file block-file)

# Setup script for enbd-backed block devices
(block-enbd block-enbd)
Start xend.
# xend start
If you create your domain now, it should come up but won't be able to access the internet until you configure Shorewall in the next step.
# xm create -c /etc/xen/conf/domu1.cfg
(The "-c" option attaches you to the domU's console so that you can watch it boot. To get back to the dom0, type [ctrl]+].)
There is no password on a default debootstrap install; just enter root as the login. If you want to check things after the boot (which will take a while because of the lack of internet connectivity)...
Look at the network
# ifconfig eth0
eth0 Link encap:Ethernet HWaddr AA:00:00:0C:A3:7E
inet addr:10.66.66.67 Bcast:10.66.66.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:10 errors:0 dropped:0 overruns:0 frame:0
TX packets:80 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:476 (476.0 b) TX bytes:5140 (5.0 KiB)
Ping yourself
# ping 10.66.66.67
PING 10.66.66.67 (10.66.66.67) 56(84) bytes of data.
64 bytes from 10.66.66.67: icmp_seq=1 ttl=64 time=0.054 ms
64 bytes from 10.66.66.67: icmp_seq=2 ttl=64 time=0.051 ms
--- 10.66.66.67 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.051/0.052/0.054/0.007 ms
Ping the bridge...
# ping 10.66.66.66
PING 10.66.66.66 (10.66.66.66) 56(84) bytes of data.
64 bytes from 10.66.66.66: icmp_seq=1 ttl=64 time=1.45 ms
64 bytes from 10.66.66.66: icmp_seq=2 ttl=64 time=0.164 ms
--- 10.66.66.66 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1010ms
rtt min/avg/max/mdev = 0.164/0.810/1.457/0.647 ms
Unsuccessfully try to ping past the bridge...
# ping 66.94.234.13
PING 66.94.234.13 (66.94.234.13) 56(84) bytes of data.
--- 66.94.234.13 ping statistics ---
5 packets transmitted, 0 received, 100% packet loss, time 4017ms
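The three pings above can be rolled into one small function, which is handy to rerun after Shorewall is configured. This is only a sketch: the name check_reach is invented here, and it assumes the domU's ping accepts "-c" (packet count) and "-w" (overall deadline in seconds), as the iputils version does.

```shell
#!/bin/sh
# Hypothetical helper: report whether each address answers a ping.
# Assumes an iputils-style ping with -c (count) and -w (deadline).
check_reach() {
    for addr in "$@"; do
        if ping -c 2 -w 5 "$addr" >/dev/null 2>&1; then
            echo "$addr reachable"
        else
            echo "$addr unreachable"
        fi
    done
}

# Example, using the addresses from this howto (self, bridge, outside):
# check_reach 10.66.66.67 10.66.66.66 66.94.234.13
```

Before the masquerading is set up, the first two addresses should come back reachable and the last one unreachable.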
Get out of the domU
# [ctrl]+]
Shorewall
If we want our domUs to have access to the internet, we need to treat them like a regular LAN. I use Shorewall to set up the masquerading (as I also use it for port forwarding), but if you want to set this up manually, by all means do. These directions are for Shorewall version 3.x.x, which is currently in etch, not sarge. (The Debian package does not start Shorewall until it is configured, so this step is safe.)
If this version of the howto still involves commenting out the etch/testing sources earlier, add them back in now and run "apt-get update" again.
# apt-get install -t testing shorewall
We start with the basic shorewall two-interface example (http://www.shorewall.net/two-interface.htm), and simply need to modify it so that all references to eth1 become xen-br0. We also need to add a rule to allow SSH traffic to $FW from net BEFORE starting shorewall.
# gunzip /usr/share/doc/shorewall/examples/two-interfaces/*.gz
# cd /etc/shorewall
# cp /usr/share/doc/shorewall/examples/two-interfaces/* .
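If you would rather not hand-edit every eth1 reference, the substitution can be scripted. The function below is a sketch with a made-up name (replace_iface). It does a plain substring replacement, so check which files mention eth1 first (for instance with "grep -l eth1 *") and review the result afterwards.

```shell
#!/bin/sh
# Hypothetical helper: replace every occurrence of OLD with NEW in the
# given files, in place. Plain substring match: "eth1" would also match
# "eth10", so review the files afterwards.
# Usage: replace_iface OLD NEW FILE...
replace_iface() {
    old=$1
    new=$2
    shift 2
    for f in "$@"; do
        tmp=$(mktemp) || return 1
        # Write through a temp file, then copy back so the original
        # file keeps its owner and permissions.
        sed "s/$old/$new/g" "$f" > "$tmp" && cat "$tmp" > "$f"
        rm -f "$tmp"
    done
}

# Example, run from /etc/shorewall after the copy above:
# replace_iface eth1 xen-br0 interfaces masq
```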
Edit the config files.
/etc/shorewall/interfaces
#
###############################################################################
#ZONE   INTERFACE       BROADCAST       OPTIONS
net     eth0            detect          dhcp,tcpflags,norfc1918,routefilter,nosmurfs,logmartians
loc     xen-br0         detect          tcpflags,detectnets,nosmurfs
#LAST LINE -- ADD YOUR ENTRIES BEFORE THIS ONE -- DO NOT REMOVE
/etc/shorewall/masq
###############################################################################
#INTERFACE      SUBNET          ADDRESS         PROTO   PORT(S) IPSEC
eth0            xen-br0
#LAST LINE -- ADD YOUR ENTRIES ABOVE THIS LINE -- DO NOT REMOVE
/etc/shorewall/policy
(We want our firewall to have full access to the internet.)
###############################################################################
#SOURCE         DEST            POLICY          LOG LEVEL       LIMIT:BURST
loc             net             ACCEPT
$FW             net             ACCEPT
net             all             DROP            info
loc             loc             ACCEPT
# THE FOLLOWING POLICY MUST BE LAST
all             all             REJECT          info
#LAST LINE -- ADD YOUR ENTRIES ABOVE THIS LINE -- DO NOT REMOVE
/etc/shorewall/routestopped
(We want to be able to administer the firewall during start/stop/reload, and for the domUs to still talk to each other)
##############################################################################
#INTERFACE      HOST(S)                 OPTIONS
xen-br0         -                       routeback
eth0            sys.admin.ip.address    source
#LAST LINE -- ADD YOUR ENTRIES BEFORE THIS ONE -- DO NOT REMOVE
/etc/shorewall/rules
(We want ssh access to our box; you may lock this down further. The line that accepts pings from the net zone, "Ping/ACCEPT net $FW", is for testing only and should quite likely be removed once the box is stable. Rules are matched in order, so once it is gone the "Ping/REJECT net $FW" line below it takes effect.)
#####################################################################
SSH/ACCEPT      net             $FW
Ping/ACCEPT     loc             $FW
Ping/ACCEPT     net             $FW
Ping/REJECT     net             $FW
ACCEPT          $FW             loc             icmp
ACCEPT          $FW             net             icmp
#
#LAST LINE -- ADD YOUR ENTRIES BEFORE THIS ONE -- DO NOT REMOVE
/etc/shorewall/zones
###############################################################################
#ZONE   TYPE            OPTIONS         IN                      OUT
#                                       OPTIONS                 OPTIONS
fw      firewall
net     ipv4
loc     ipv4
#LAST LINE - ADD YOUR ENTRIES ABOVE THIS ONE - DO NOT REMOVE
/etc/shorewall/shorewall.conf
(Make sure the following are set. Configure the rest as desired)
STARTUP_ENABLED=Yes
IP_FORWARDING=On
ADMINISABSENTMINDED=Yes
/etc/default/shorewall
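Before starting the firewall on a remote box, it can be worth confirming that these three settings really are in place. The sketch below uses an invented function name (check_conf) and assumes each setting is on its own line as KEY=VALUE with no trailing comment; the comparison ignores case.

```shell
#!/bin/sh
# Hypothetical helper: confirm that each KEY=VALUE argument appears as an
# active line in a config file. Prints one status line per setting and
# returns non-zero if any is missing.
# Usage: check_conf FILE KEY=VALUE...
check_conf() {
    file=$1
    shift
    rc=0
    for want in "$@"; do
        if grep -qi "^[[:space:]]*$want[[:space:]]*\$" "$file"; then
            echo "ok: $want"
        else
            echo "MISSING: $want"
            rc=1
        fi
    done
    return $rc
}

# Example:
# check_conf /etc/shorewall/shorewall.conf \
#     STARTUP_ENABLED=Yes IP_FORWARDING=On ADMINISABSENTMINDED=Yes
```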
(We want to be able to start shorewall)
startup=1
After all that
# shorewall safe-start
With safe-start, shorewall prompts you once the new configuration is in place, like this:
# Do you want to accept the new firewall configuration? [y/n]
If you answer "n" or fail to respond within 60 seconds, a "shorewall clear" is performed. Use this time to test: try to ssh into the box again from a second terminal. If you can't, you have configured something incorrectly and you should answer "n" if possible, or just wait for the timeout.
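With only 60 seconds to decide, a prepared one-line check run from another machine can help. The following is a sketch, not a substitute for actually logging in: the function name ssh_port_open is made up, it assumes a netcat that supports "-z" (scan only) and "-w" (timeout), and it only shows that the port accepts connections.

```shell
#!/bin/sh
# Hypothetical helper: succeed if HOST accepts TCP connections on the SSH
# port (default 22). Assumes a netcat with -z and -w support.
# Usage: ssh_port_open HOST [PORT]
ssh_port_open() {
    host=$1
    port=${2:-22}
    nc -z -w 5 "$host" "$port" >/dev/null 2>&1
}

# Example, run from a machine other than the server (placeholder hostname):
# ssh_port_open your.server.example && echo "ssh port still reachable"
```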
That's about it
You'll want to reboot to make sure everything still works. Make sure to...
Check you have booted into the Xen kernel...
# cat /proc/version
Linux version 2.6.11-ksxen0 (root@debian_build) (gcc version 3.3.5 (Debian 1:3.3.5-13)) #1 Fri Nov 4 01:09:42 UTC 2005
Let the system know you are back in...
# touch /etc/deadman.d/imok
Make sure your domU is running...
# xm list
Name              Id  Mem(MB)  CPU  State  Time(s)  Console
Domain-0           0      251    0  r----     17.2
domu1              1       31    0  -b---      1.2     9601
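The kernel and domU checks can be scripted so they are quick to repeat after every reboot. This is a sketch under two assumptions: the Xen kernel's version string contains "xen" (as the 2.6.11-ksxen0 build here does), and "xm list" prints a header line followed by one line per domain with the name in the first column. Both function names are invented for the example.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the kernel version string read from
# stdin mentions "xen" (case-insensitive).
is_xen_kernel() {
    grep -qi 'xen'
}

# Hypothetical helper: succeed if domain NAME appears in "xm list" output
# read from stdin. The header line is skipped.
# Usage: xm list | domu_listed NAME
domu_listed() {
    awk -v name="$1" 'NR > 1 && $1 == name { found = 1 } END { exit !found }'
}

# Example:
# is_xen_kernel < /proc/version && echo "Xen kernel booted"
# xm list | domu_listed domu1 && echo "domu1 is listed"
```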
If something doesn't seem right, hopefully you only have to go one or two steps back to check the configurations. If you run up against a brick wall, feel free to post to the talk section. Otherwise, if you haven't visited the Xen meta-page on this site, you might want to head there now to look for other resources.
Recommended Reading
- Grub Fallback:
- http://www.gnu.org/software/grub/manual/html_node/Making-your-system-robust.html#Making-your-system-robust
- http://www.gnu.org/software/grub/manual/html_node/Booting-once_002donly.html#Booting-once_002donly
- http://www.gnu.org/software/grub/manual/html_node/Invoking-grub_002dset_002ddefault.html#Invoking-grub_002dset_002ddefault
- Debian's Implementation of Grub Fallback (pre 0.97): this is just for those who are curious about what the other options might be; it is not used in this howto
- Apt-pinning
- Apt-Pinning for Beginners (http://jaqque.sbih.org/kplug/apt-pinning.html)
- Using APT with more than 2 sources (http://www.argon.org/~roderick/apt-pinning.html)
- Managing packages (http://www.debian.org/doc/manuals/apt-howto/ch-apt-get.en.html)

