VME-CPU

From DaqWiki

VME CPU information

Cloning USB and CF Flash boot cards

The V7805 VME CPU can run Linux from USB flash memory; the V7865 VME CPU can run Linux from CompactFlash or USB flash memory. A disk size of 8 GB or larger is recommended for running SL5.5 Linux. Use devices of the highest available speed grade: 266X or better for CompactFlash, 200X or "30MB read/15MB write" for USB flash. Be aware that some CompactFlash and USB flash devices have been observed to corrupt Linux filesystems within a few days of use. The specific flash memory brands and models we presently use do not seem to have this problem.

When working with CompactFlash memory, attach it to a USB-CF adapter and treat it as USB flash memory in the following instructions.

Clone disk using the script clone.perl

  • attach the USB flash disk to some computer connected to the LADD cluster
  • become root
  • check that correct device appears in the device list: fdisk -l
  • assume new device is /dev/sdc
  • select a Linux image to clone:
    • 64-bit SL6 image for V7865 VME processors: use /ladd/data1/root/lxiris01
    • 32-bit SL6 image for dual-Athlon machines: use /ladd/data1/root/ladd13
  • cd /home/olchansk/sysadm/clone
  • ./clone.perl /ladd/data1/root/ladd13 /dev/sdc
  • df -kl ### check that /dev/sdc is not mounted
  • disconnect the USB flash disk, try to boot from it.

The clone script has been tested in these configurations:

  • clone 64-bit SL6 VME CPU image to 8GB and 16GB USB flash, GRUB bootloader
  • clone 32-bit SL6 image to 500GB IDE-USB disk, EXTLINUX bootloader.

Note that the clone script has to be run from the correct directory, per the instructions above, because it has to find and run the uuidfix script to make the destination disk bootable.
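Since cloning overwrites the target device, it may be worth checking that neither the disk nor any of its partitions is mounted before running the script. The following is a minimal sketch of such a check, not part of clone.perl; the function name is_mounted and its optional second argument are made up for this illustration.

```shell
#!/bin/sh
# is_mounted DEVICE [MOUNTS_TABLE]
# Hypothetical helper: succeeds (exit 0) if DEVICE or one of its numbered
# partitions is listed as a mounted device. The table defaults to /proc/mounts.
is_mounted() {
    dev="$1"
    table="${2:-/proc/mounts}"
    # Field 1 of each line is the mounted device.
    # Match /dev/sdc itself, or /dev/sdc followed only by digits (/dev/sdc1).
    awk -v d="$dev" '
        $1 == d { found = 1 }
        index($1, d) == 1 && substr($1, length(d) + 1) ~ /^[0-9]+$/ { found = 1 }
        END { exit !found }
    ' "$table"
}
```

For example, "is_mounted /dev/sdc && echo 'unmount /dev/sdc first'" could be run just before clone.perl.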

Clone disk manually

  • attach USB flash to the computer to be cloned (or any computer - we will use rsync to copy the data)
  • become root
  • check that correct device appears in the device list: fdisk -l
  • assume new device is /dev/sdX, original boot disk is /dev/sda.
  • repartition the device:
    • fdisk -H 224 -S 56 /dev/sdX
    • create one partition covering the whole device
    • set partition type 83 (Linux)
    • set bootable flag (command "a")
    • result should look like this:
[root@lxdaq09 ~]# fdisk -l

Disk /dev/sda: 8011 MB, 8011120640 bytes
224 heads, 56 sectors/track, 1247 cylinders
Units = cylinders of 12544 * 512 = 6422528 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1        1247     7821156   83  Linux
  • /usr/bin/time mke2fs -j /dev/sdX1 (should take about 30 seconds)
  • tune2fs -i0 -c0 /dev/sdX1
  • mkdir -p /mnt/tmp
  • mount /dev/sdX1 /mnt/tmp
  • CLONE CURRENT BOOT DISK: cd /; rsync -ax . /mnt/tmp; cd /dev; rsync -a . /mnt/tmp/dev
  • CLONE lxdaq08 32-bit boot image (V7805, V7865): cd /ladd/data1/root/lxdaq08; rsync -ax --delete-after . /mnt/tmp
  • CLONE ANOTHER FLASH DISK: cd /another/flash/disk; /usr/bin/time rsync -ax . /mnt/tmp
  • check result: run "df", new filesystem should be about as full as the original one
  • sync; cd /; umount /dev/sdX1; mount /dev/sdX1 /mnt/tmp
  • install SYSLINUX/EXTLINUX boot loader (SL5)
    • install master boot loader: cd /mnt/tmp/boot; dd if=mbr.bin of=/dev/sdX (NOTE: write to the whole disk /dev/sdX, NOT the partition /dev/sdX1)
    • install extlinux boot loader: cd /mnt/tmp/boot; ./extlinux -i . (NOTE: notice the "./" - make sure to run the extlinux executable from .../boot, NOT the one installed in the system)
  • install GRUB boot loader (SL6) --- (NOTE: in the line below, remember to replace "/dev/sdX" with the disk name)
    • echo -e "device (hd0) /dev/sdX\nroot (hd0,0)\nsetup (hd0)\n" | grub --batch --no-floppy
  • update boot disk UUID (SL6) --- (NOTE: dumpe2fs prints the UUID of the new disk, cut-and-paste this UUID into the sed commands below)
    • dumpe2fs /dev/sdX1 | grep UUID
    • edit grub.conf: sed 's/UUID=\S*/UUID=ddc00d49-1c17-4803-ac0b-d6eb89d9e729/' -i /mnt/tmp/boot/grub/grub.conf
    • edit fstab: sed 's/UUID=\S*/UUID=ddc00d49-1c17-4803-ac0b-d6eb89d9e729/' -i /mnt/tmp/etc/fstab
  • cd /; umount /dev/sdX1
  • disconnect the new boot disk, try to boot from it.
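The cut-and-paste UUID step for SL6 can be scripted. The sketch below wraps the same substitution as the sed commands in the list; the function name set_root_uuid is hypothetical, and the sanity check on the UUID is an addition so that a failed lookup cannot write an empty value into the config files.

```shell
#!/bin/sh
# set_root_uuid UUID FILE...
# Hypothetical helper: replace the first UUID=... token on each line of the
# given files with UUID=<UUID>, like the sed commands in the list above.
set_root_uuid() {
    uuid="$1"; shift
    # Accept only the usual 36-character hex-and-dash form.
    case "$uuid" in
        ""|*[!0-9a-fA-F-]*) echo "bad UUID: '$uuid'" >&2; return 1 ;;
    esac
    [ "${#uuid}" -eq 36 ] || { echo "bad UUID length: '$uuid'" >&2; return 1; }
    for f in "$@"; do
        tmp=$(mktemp) || return 1
        # Write via a temporary file, then copy back so the original
        # file keeps its owner and permissions.
        if sed "s/UUID=[^[:space:]]*/UUID=$uuid/" "$f" > "$tmp"; then
            cat "$tmp" > "$f"
        else
            rm -f "$tmp"; return 1
        fi
        rm -f "$tmp"
    done
}
```

A possible invocation, assuming blkid is available: set_root_uuid "$(blkid -s UUID -o value /dev/sdX1)" /mnt/tmp/boot/grub/grub.conf /mnt/tmp/etc/fstab. As with the sed commands, every UUID= entry in the file is rewritten, so an fstab that mounts other filesystems by UUID needs to be checked by hand afterwards.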

Cloning NFS-Root

We will clone "lxsrc" into "lxdst":

cd lxsrc
mkdir ../lxdst
rsync -av . ../lxdst
cd ../lxdst ### make the edits below in the new copy, not in the original
vi etc/sysconfig/network ### change HOSTNAME and NIS domain
vi etc/nodeinfo ### change description
vi etc/yp.conf ### add "domain xxx-NIS broadcast" for the new NIS domain
vi etc/fstab ### change the "/" NFS mount-point if needed
vi etc/cron.d/triumf_nodeinfo ### change nodeinfo server name
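The HOSTNAME edit is mechanical and could be scripted instead of done in vi. The sketch below takes the root directory of the clone as an explicit argument, which avoids editing the wrong tree by accident; the function name set_hostname is made up for this example, and the other files still need manual edits.

```shell
#!/bin/sh
# set_hostname ROOT NAME
# Hypothetical helper: set HOSTNAME=NAME in ROOT/etc/sysconfig/network.
# NAME must not contain "/" because it is used inside a sed expression.
set_hostname() {
    root="$1"; name="$2"
    cfg="$root/etc/sysconfig/network"
    [ -f "$cfg" ] || { echo "missing $cfg" >&2; return 1; }
    tmp=$(mktemp) || return 1
    # Rewrite only the HOSTNAME line; all other lines pass through unchanged.
    sed "s/^HOSTNAME=.*/HOSTNAME=$name/" "$cfg" > "$tmp" && cat "$tmp" > "$cfg"
    rc=$?
    rm -f "$tmp"
    return $rc
}
```

For the clone above this might be called as: set_hostname ../lxdst lxdst.triumf.ca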

Extlinux boot file

DEFAULT menu.c32
PROMPT 0
TIMEOUT 50

MENU TITLE TRIUMF DAQ USB BOOT32 ver K.O. 2011feb03

LABEL automatic
  MENU DEFAULT
  com32 ifcpu.c32
  append debug multicore -- linux-V7865-32 -- linux-V7805

LABEL linux-V7865-32
  kernel vmlinuz-2.6.18-194.32.1.el5
  append initrd=initrd-2.6.18-194.32.1.el5.img panic=60 ro rootdelay=5 rootwait ro root=/dev/sda1

LABEL linux-V7865-32-old
  kernel vmlinuz-2.6.18-194.8.1.el5
  append initrd=initrd-2.6.18-194.8.1.el5-32-usbboot.img panic=60 ro rootdelay=5 rootwait ro root=/dev/sda1

LABEL linux-V7865-64
  kernel vmlinuz-2.6.18-194.11.1.el5
  append initrd=initrd-2.6.18-194.11.1.el5.V7865.img panic=60 ro rootdelay=5 rootwait ro root=/dev/sda1

LABEL linux-V7805
  kernel vmlinuz-2.6.34.1-32-v7805
  append panic=60 rootdelay=5 rootwait ro root=/dev/sda1

LABEL memtest
  kernel memtest86+-1.65

#label linux
#  kernel vmlinuz-2.6.34.1-32-v7805
#  append panic=60 rootdelay=5 rootwait ro root=/dev/sda1
#label linux
#  kernel vmlinuz-2.6.34.1-32-v7805
#  append panic=60 root=/dev/nfs nfsroot=142.90.111.60:/home1/laddvme05.triumf.ca,nfsvers=3,tcp,rsize=32768,wsize=32768 ip=::::::dhcp console=ttyS0,115200n8

Updating Linux kernel

Updating Linux kernel on USB/CF flash boot disks

  • install latest kernel: yum update
  • identify latest kernel files:
    • ls -ltr /boot | grep vmlinuz | tail -1
    • ls -ltr /boot | grep initrd | tail -1
  • edit /boot/extlinux.conf
    • duplicate the entry marked "MENU DEFAULT"
    • change file names for the first entry according to the newly installed kernel
    • remove "MENU DEFAULT" from all entries except the new one
  • reboot into the new kernel
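The "identify latest kernel files" and "duplicate the entry" steps can be sketched as two small shell functions. Both names (latest_file, extlinux_entry) are hypothetical, and the append options are copied from the SL5 entries in the extlinux boot file above; adjust them if your entries differ.

```shell
#!/bin/sh
# latest_file DIR PREFIX
# Print the most recently modified file in DIR whose name starts with PREFIX
# (same idea as "ls -ltr /boot | grep vmlinuz | tail -1").
latest_file() {
    ls -tr "$1" | grep "^$2" | tail -1
}

# extlinux_entry LABEL KERNEL INITRD
# Print a boot stanza marked MENU DEFAULT for the given kernel and initrd.
extlinux_entry() {
    printf 'LABEL %s\n' "$1"
    printf '  MENU DEFAULT\n'
    printf '  kernel %s\n' "$2"
    printf '  append initrd=%s panic=60 ro rootdelay=5 rootwait root=/dev/sda1\n' "$3"
}
```

A possible use: extlinux_entry linux-new "$(latest_file /boot vmlinuz)" "$(latest_file /boot initrd)". The printed stanza still has to be pasted into /boot/extlinux.conf by hand, and "MENU DEFAULT" removed from the older entries.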

V7865 BIOS Settings

  • enter BIOS by pressing "DEL" during power up
  • Boot->Boot setting "Wait for F1" set to "Disabled"
  • Chipset->South Bridge "USB 2.0 Controller" set to "Enabled"
  • Advanced->IDE configuration set to "Disabled" (unless using CompactFlash boot disk)
  • Advanced->Remote access set to "Disabled"
  • Advanced->USB->"USB 2.0 Controller Mode" set to "HiSpeed"
  • Exit-> "Save changes and Exit"

Network boot

Explanation

Network booting of Linux computers proceeds through the following steps, in order:

  • PXE (in-BIOS, in-EFI-BOOT or GPXE booted from disk) issues DHCP request to learn the IP address and further boot instructions
  • DHCP server (/etc/dhcp/dhcpd.conf) responds with the IP address and instructions to boot pxelinux.0
  • TFTP server (with the xinetd server) provides access to files in /var/lib/tftpboot on the boot host
  • pxe and pxelinux use tftp to load pxelinux.0, the pxelinux config file (pxelinux.cfg/default or as specified in dhcpd.conf), the linux kernel and the linux initramfs files
  • the linux kernel uses DHCP (again) to configure the network and mount the root file system (as specified in dhcpd.conf)
  • NFS server on the boot host provides access to the root filesystem.

After booting is complete, only the NFS server is required for running Linux; DHCP, TFTP and PXE are not used again until the next reboot.
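Because the stages happen in a fixed order, the boot host logs show how far a failing client got. The sketch below scans log files for the dhcpd, in.tftpd and rpc.mountd messages of the kind shown in the log sections at the end of this page and prints the furthest stage seen. The function name boot_stage and the stage names are made up here, and the patterns assume those log formats.

```shell
#!/bin/sh
# boot_stage LOGFILE...
# Hypothetical helper: print the furthest network-boot stage found in the
# given logs: none, dhcp, pxelinux, kernel, initramfs or nfs-root.
boot_stage() {
    awk '
        /dhcpd: DHCPACK/                           { if (s < 1) s = 1 }
        /in\.tftpd.*RRQ.*pxelinux\.0/              { if (s < 2) s = 2 }
        /in\.tftpd.*RRQ.*vmlinuz/                  { if (s < 3) s = 3 }
        /in\.tftpd.*RRQ.*initr/                    { if (s < 4) s = 4 }
        /rpc\.mountd.*authenticated mount request/ { if (s < 5) s = 5 }
        END {
            split("none dhcp pxelinux kernel initramfs nfs-root", name, " ")
            print name[s + 1]
        }
    ' "$@"
}
```

For example: boot_stage /var/log/dhcp.log /var/log/messages. The sketch does not filter by client, so on a boot host serving several machines the logs should first be narrowed down with grep for the MAC or IP address of interest.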

Setup the boot host computer

install packages

yum install dhcp tftp-server

configure dhcpd

  • create /etc/logrotate.d/dhcp (emacs -nw /etc/logrotate.d/dhcp) with these contents:
/var/log/dhcp.log {
    weekly
    notifempty
    missingok
    postrotate
	/bin/kill -HUP `cat /var/run/syslogd.pid 2> /dev/null` 2> /dev/null || true
	/bin/kill -HUP `cat /var/run/rsyslogd.pid 2> /dev/null` 2> /dev/null || true
    endscript
}
  • edit /etc/rsyslog.conf (emacs -nw /etc/rsyslog.conf): put this right after the line "#### RULES ####"
#### RULES ####

# Annoying dhcp spam ... (need to add to rotation ...)
local3.*						-/var/log/dhcp.log
local3.*						~
  • chmod a+rx /etc/dhcp
  • edit /etc/dhcp/dhcpd.conf - add contents from the next section
  • chkconfig dhcpd on
  • service rsyslog restart # to enable dhcp.log
  • service dhcpd restart # watch /var/log/messages for errors, watch /var/log/dhcp.log for dhcp activity
  • try to boot the VME CPU and watch dhcp.log: the messages should be similar to those shown in the section "Boot host dhcp messages", and the VME CPU should get an IP address and stop at the "TFTP load" stage

setup tftp

  • edit /etc/xinetd.d/tftp (emacs -nw /etc/xinetd.d/tftp): change the line "server_args" to read "server_args = -vvvv -s /var/lib/tftpboot"
  • chkconfig tftp on
  • service xinetd restart
  • try to boot the VME CPU and watch the messages in the syslog: there should be a message about "pxelinux.0 not found", and the VME CPU should get an IP address and stop with the error "TFTP - File not found"

setup pxelinux

cd ~
wget https://www.kernel.org/pub/linux/utils/boot/syslinux/4.xx/syslinux-4.03.tar.bz2
tar xjvf syslinux-4.03.tar.bz2
cd syslinux-4.03
cp -pv ./core/pxelinux.0 ./com32/hdt/hdt.c32 ./memdisk/memdisk ./com32/menu/menu.c32 /var/lib/tftpboot/
  • mkdir /var/lib/tftpboot/pxelinux.cfg
  • create /var/lib/tftpboot/pxelinux.cfg/default with contents from the next section
  • get additional boot files
cd /var/lib/tftpboot
wget http://ladd00.triumf.ca/tftpboot/memtest86+-4.20.iso.zip
wget http://ladd00.triumf.ca/tftpboot/memtest86+-5.01.iso.gz
wget http://ladd00.triumf.ca/tftpboot/modules.alias
wget http://ladd00.triumf.ca/tftpboot/modules.pcimap
wget http://ladd00.triumf.ca/tftpboot/pci.ids

setup linux kernel files

  • get vmlinuz and initramfs files from the VME processor boot image
  • note that the initramfs file needs to be built with the dracut-network function, it should be about 2.5Mbytes in size.
  • note the file names must match the file names in the pxelinux config file
wget http://ladd00.triumf.ca/tftpboot/vmlinuz-2.6.32-431.11.2.el6.i686
wget http://ladd00.triumf.ca/tftpboot/initramfs-2.6.32-431.11.2.el6.i686.img
  • try to boot the VME CPU, watch messages in the syslog, the VME CPU should get an IP address, load the PXELINUX menu, load the linux kernel and stop with an error mounting NFS filesystems.
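Since the file names in the TFTP directory must match those in the pxelinux config file, a quick cross-check can catch typos before a boot attempt. The following is a sketch only; the function name missing_boot_files is hypothetical, and it looks only at "kernel" lines and "initrd=" options, not at other directives such as "default menu.c32".

```shell
#!/bin/sh
# missing_boot_files CONFIG DIR
# Hypothetical helper: print each kernel or initrd file named in a PXELINUX
# config file that does not exist in DIR. Comment lines are ignored.
missing_boot_files() {
    cfg="$1"; dir="$2"
    grep -v '^[[:space:]]*#' "$cfg" | awk '
        # "kernel NAME ...": the second field is the file to load
        tolower($1) == "kernel" { print $2 }
        # "initrd=NAME" may appear anywhere on a kernel or append line
        { for (i = 1; i <= NF; i++) if ($i ~ /^initrd=/) print substr($i, 8) }
    ' | sort -u | while read -r f; do
        [ -e "$dir/$f" ] || echo "$f"
    done
}
```

For example: missing_boot_files /var/lib/tftpboot/pxelinux.cfg/default /var/lib/tftpboot. No output means every referenced file was found.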

setup NFS export of VME processor boot image

  • copy the VME processor boot image to /nfsroot/alphavme03 (or /home1/alphavme03)
  • customise the boot image per instructions in previous section (edit sysconfig/network, etc)
  • edit /etc/exports:
/nfsroot/alphavme03 daqtmp4(rw,no_root_squash,async)
  • chkconfig nfs on; service nfs restart
  • if needed, adjust the boot image path name in dhcpd.conf to match the export name in /etc/exports, then restart dhcpd
  • try to boot the VME CPU and watch the messages in the syslog: the VME CPU should get an IP address, load the PXELINUX menu and load the Linux kernel; the kernel should then start, mount the VME processor boot image ("NFS Root") and boot the userland. At the end you should be able to ping and ssh into the VME processor.

fixup file permissions

cd /var/lib/tftpboot
chown -R root:root .
chmod a+r *
chmod a-w *
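The chmod commands above only cover the top level of the directory (the * does not descend into pxelinux.cfg). A small check such as the following sketch lists any regular files that are still not world-readable; this matters on the assumption, not stated on this page, that the TFTP server only serves world-readable files. The function name unreadable_files is made up for this example.

```shell
#!/bin/sh
# unreadable_files DIR
# Hypothetical helper: list regular files under DIR (recursively) that do
# not have the world-read permission bit set.
unreadable_files() {
    find "$1" -type f ! -perm -004
}
```

For example: unreadable_files /var/lib/tftpboot. No output means all files carry the world-read bit.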

DHCP configuration

#
# /etc/dhcp/dhcpd.conf
#
#
# general setup
#

log-facility local3;

# make network booting the SystemImager autoinstallclient possible
allow booting;
allow bootp;
ignore unknown-clients;
ddns-update-style ad-hoc;

# set lease time to 3 days
default-lease-time 259200;
max-lease-time 259200;

# define network addresses

subnet 142.90.96.0 netmask 255.255.224.0 {
  not authoritative;
  ignore unknown-clients;
  option domain-name "triumf.ca"; 
  option domain-name-servers 142.90.100.19;
  option routers 142.90.100.18; 
} 

# special PXELINUX options
 
option space pxelinux; 
option pxelinux.magic      code 208 = string; 
option pxelinux.configfile code 209 = text; 
option pxelinux.pathprefix code 210 = text; 
option pxelinux.reboottime code 211 = unsigned integer 32; 

#
# setup for TIGRESS VME processors (boot from midtig06)
#
 
group { 
        filename "pxelinux.0"; 
        next-server ladd00; 
        option routers 142.90.100.18; 
        option subnet-mask 255.255.224.0; 
        option domain-name "triumf.ca"; 
        option domain-name-servers 142.90.100.19, 142.90.100.68; 
        #use-host-decl-names on; 

        site-option-space "pxelinux"; 
        if exists dhcp-parameter-request-list { 
                # Always send the PXELINUX options (specified in hexadecimal) 
                option dhcp-parameter-request-list = concat(option dhcp-parameter-request-list,d0,d1,d2,d3); 
        } 
 
        option pxelinux.reboottime 10; 
        option pxelinux.pathprefix "./"; 
        #option pxelinux.configfile "pxelinux.cfg/default";
        option pxelinux.configfile "pxelinux.cfg/V7750-SL6a";
 
        #host lxdaq17-eth0 { option pxelinux.configfile "pxelinux.cfg/V7750-SL6d"; option host-name "lxdaq17.triumf.ca"; option root-path "nfs:ladd00:/data0/root/lxdaq17:rw"; fixed-address lxdaq17; hardware ethernet 00:20:38:00:DA:1D; } 
        #host lxdaq17-eth1 { option pxelinux.configfile "pxelinux.cfg/V7750-SL6d"; option host-name "lxdaq17.triumf.ca"; option root-path "nfs:ladd00:/data0/root/lxdaq17:rw"; fixed-address lxdaq17; hardware ethernet 00:20:38:00:DA:1F; } 

}
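Each VME processor needs its own host entry inside the group, in the style of the commented-out lxdaq17 lines above. As a sketch, such a line could be generated rather than typed; the function name dhcp_host_entry and its argument order are hypothetical, and the "-eth0" suffix and "triumf.ca" domain are simply copied from the example entries.

```shell
#!/bin/sh
# dhcp_host_entry NAME MAC NFS_SERVER ROOT_DIR PXE_CONFIG
# Hypothetical helper: print one dhcpd.conf host entry in the same layout
# as the lxdaq17 example lines.
dhcp_host_entry() {
    name="$1"; mac="$2"; server="$3"; root="$4"; cfg="$5"
    # Reject anything that is not six colon-separated hex byte pairs.
    printf '%s\n' "$mac" | grep -Eq '^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$' \
        || { echo "bad MAC address: $mac" >&2; return 1; }
    printf 'host %s-eth0 { ' "$name"
    printf 'option pxelinux.configfile "pxelinux.cfg/%s"; ' "$cfg"
    printf 'option host-name "%s.triumf.ca"; ' "$name"
    printf 'option root-path "nfs:%s:%s:rw"; ' "$server" "$root"
    printf 'fixed-address %s; hardware ethernet %s; }\n' "$name" "$mac"
}
```

For example: dhcp_host_entry lxdaq17 00:20:38:00:DA:1D ladd00 /data0/root/lxdaq17 V7750-SL6d. The output is meant to be pasted into the group block, followed by a restart of dhcpd.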

PXELINUX configuration

[root@ladd00 pxelinux.cfg]# more /var/lib/tftpboot/pxelinux.cfg/V7750-SL6d
default menu.c32
prompt 0

menu title Welcome to the LADD00 VME V7750 PXE boot menu

timeout 50

label hdt
  kernel hdt.c32

label memtest86+-5.01 
  kernel memdisk iso initrd=memtest86+-5.01.iso.gz 

label memtest86+-4.20
  kernel memdisk iso initrd=memtest86+-4.20.iso.zip

label SL6-431.11.2
  menu default
  kernel vmlinuz-2.6.32-431.11.2.el6.i686
  append default_hugepagesz=0 hugepages=0 hugepagesz=0 highmem=0 userpte=nohigh apm=off acpi=off initrd=initramfs-2.6.32-431.11.2.el6.i686.img root=dhcp panic=60

label SL6-358.18
  kernel vmlinuz-2.6.32-358.18.1.el6.i686
  append default_hugepagesz=0 hugepages=0 hugepagesz=0 highmem=0 userpte=nohigh apm=off acpi=off initrd=initramfs-2.6.32-358.18.1.el6.i686.img root=dhcp panic=60

label SL6-220.4
  kernel vmlinuz-2.6.32-220.4.1.el6.i686
  append default_hugepagesz=0 hugepages=0 hugepagesz=0 highmem=0 userpte=nohigh apm=off acpi=off initrd=initramfs-2.6.32-220.4.1.el6.i686.img root=dhcp panic=60

#end

Boot host dhcp messages

grep -i b4 /var/log/dhcp.log
[root@daqtmp5 Desktop]# grep -i 0e:b4 /var/log/dhcp.log 
Jun 10 12:50:26 daqtmp5 dhcpd: DHCPDISCOVER from 00:20:38:03:0e:b4 via eth0
Jun 10 12:50:26 daqtmp5 dhcpd: DHCPOFFER on 142.90.96.104 to 00:20:38:03:0e:b4 via eth0
Jun 10 12:50:28 daqtmp5 dhcpd: DHCPREQUEST for 142.90.96.104 (142.90.96.105) from 00:20:38:03:0e:b4 via eth0
Jun 10 12:50:28 daqtmp5 dhcpd: DHCPACK on 142.90.96.104 to 00:20:38:03:0e:b4 via eth0

Boot host syslog messages

tail -100f /var/log/messages
May  5 15:54:08 ladd00 xinetd[1860]: START: tftp pid=3396 from=142.90.111.107
May  5 15:54:08 ladd00 in.tftpd[3398]: RRQ from 142.90.111.107 filename pxelinux.0
May  5 15:54:08 ladd00 in.tftpd[3398]: tftp: client does not accept options
May  5 15:54:08 ladd00 in.tftpd[3400]: RRQ from 142.90.111.107 filename pxelinux.0
May  5 15:54:08 ladd00 in.tftpd[3402]: RRQ from 142.90.111.107 filename ./pxelinux.cfg/V7750-SL6d
May  5 15:54:08 ladd00 in.tftpd[3404]: RRQ from 142.90.111.107 filename ./menu.c32
May  5 15:54:08 ladd00 in.tftpd[3406]: RRQ from 142.90.111.107 filename ./pxelinux.cfg/V7750-SL6d
May  5 15:54:13 ladd00 in.tftpd[3418]: RRQ from 142.90.111.107 filename ./vmlinuz-2.6.32-431.11.2.el6.i686
May  5 15:54:14 ladd00 in.tftpd[3431]: RRQ from 142.90.111.107 filename ./initramfs-2.6.32-431.11.2.el6.i686.img
May  5 15:54:29 ladd00 rpc.mountd[1796]: authenticated mount request from lxdaq17.triumf.ca:677 for /data0/root/lxdaq17 (/data0)
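To follow one client through a busy log, the TFTP requests can be filtered by client address. This is a minimal sketch that assumes the in.tftpd message format shown above ("RRQ from ADDRESS filename NAME"); the function name tftp_requests is hypothetical.

```shell
#!/bin/sh
# tftp_requests CLIENT_IP LOGFILE...
# Hypothetical helper: print, in log order, the file names requested over
# TFTP by the given client address.
tftp_requests() {
    client="$1"; shift
    awk -v c="$client" '
        /in\.tftpd/ && / RRQ from / {
            for (i = 1; i <= NF; i++)
                if ($i == "from" && $(i + 1) == c && $(i + 2) == "filename")
                    print $(i + 3)
        }
    ' "$@"
}
```

For example: tftp_requests 142.90.111.107 /var/log/messages. A healthy boot should list pxelinux.0, the config file, menu.c32, the kernel and the initramfs, in that order.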