SLinstall: Difference between revisions

From DaqWiki
Revision as of 13:01, 16 September 2010

Notes

  • these instructions are periodically updated to include items needed for older/newer versions of Linux. They are marked like this: (SL4.2+) means Scientific Linux 4.2 and newer; (SL4 is equivalent to FC3). (FC5 only) means Fedora Core 5; etc.
  • obsolete items are marked by the "#" sign at the beginning of the line and sometimes have a comment about the reason for removal.
  • typically, we do not "upgrade" machines using the Red Hat "upgrade" function. Instead, we save critical files from the old installation and do a "fresh install" from scratch


Preparation

  • save /etc, /var, /root, /usr/local, /opt and /tftpboot (tar and scp to another machine or use either of: /triumfcs/trshare/midas/Disks/rsync_all_NIS.csh or /triumfcs/trshare/midas/Disks/rsync_all_noNIS.csh)
  • NIS only: ascertain NIS domain name (use authconfig) e.g. "DAQ-NIS","MUSR-NIS" etc.
  • check existing partition sizes on machine: df -hl, fdisk -l
  • note which are the /home1 and /data partitions
  • if /home1 is inside the "/" partition, you must save it also
  • shutdown
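
The save step above can also be scripted. The following is only a minimal sketch; the rsync_all_*.csh scripts named above remain the reference procedure. The helper name save_dirs and the archive path are assumptions made for illustration. It archives whichever of the listed directories exist and warns about missing ones.

```shell
#!/bin/sh
# save_dirs ARCHIVE DIR...   (hypothetical helper, not part of SL)
# Archive the directories that exist; skip missing ones with a warning.
save_dirs() {
    archive=$1; shift
    existing=""
    for d in "$@"; do
        if [ -d "$d" ]; then
            existing="$existing $d"
        else
            echo "skipping missing directory: $d" >&2
        fi
    done
    [ -n "$existing" ] || { echo "nothing to save" >&2; return 1; }
    # $existing is intentionally unquoted: one word per directory
    # (paths containing spaces are not supported in this sketch)
    tar -czf "$archive" $existing
}

# Example (run as root on the old system, then scp the archive elsewhere):
# save_dirs /tmp/old-system-save.tar.gz /etc /var /root /usr/local /opt /tftpboot
```

Copy the archive to another machine before shutting down; an archive left on a disk that is about to be repartitioned is lost.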

Running SL installer

  • Start installation of the new system:
  • IMPORTANT: if you have WDC "Advanced Format" disks (4 kB sectors), the disks have to be repartitioned before use; see special instructions (TBW) (note: use fdisk -H 224 -S 56 /dev/sdx)
  • boot from latest "SL5 kickstart" CD from Kelvin Raywood or PXE boot the latest SL installation image
  • after the system enters graphical mode, one can remove the CD; the installation runs over the network
  • two questions will be asked: how to partition the disks and the root password. The rest of the installation is automatic.
  • select "Custom partitioning". You can create either simple normal partitions if you have only one disk, or RAID 1 (mirror) for / and /home1 if you have two disks.
       For the case of one disk only:
       - no "/boot" filesystem
       - allocate at least 20 GBytes for the "/" filesystem (primary partition, hda1)
       - allocate at least 8 GBytes for swap (primary partition, hda2)
       - allocate the rest as /home1 and /data (primary partitions hda3 and hda4)
       For the case of two or more disks:
       - If there are already some partitions on the disks, consider DELETING them all
       - Click New, select software RAID for /dev/sda, 20000 MB (20 GB) and Force primary partition
       - Same as above for /dev/sdb
       - Click RAID, select create a RAID device /dev/md0, mount point /, RAID Level 1 (mirror)
       - Repeat for the swap partition (/dev/md1, make it at least 8 GBytes)
       - Leave the rest of the disks free. The other RAID partitions will be created AFTER installation.
  • enter the root password for the new system
  • continue with the installation; at the very end, the system will ask you to reboot
  • boot newly installed system and answer the few questions for SL5
  • Firewall: Leave it disabled; SELinux: choose Disabled; KDump: Leave it disabled; Date and Time: Leave kickstart defaults (should be NTP using TRIUMF time servers)
  • Create user: not necessary if you are using NIS; Sound card: can be ignored.
  • The system will reboot again.

Configure SSH

  • Login from the console
  • restore the SSH keys from backup (/etc/ssh/*key*)
  • service sshd restart
  • ssh into the new machine as root
  • ssh root@localhost, ctrl-C
  • scp root@ladd00:/root/authorized_keys ~root/.ssh/
  • (not needed for SL5.5 kickstart) check that /etc/ssh/ssh_config contains "ForwardX11 yes" and "ForwardX11Trusted yes".

Configure RAID arrays

If only some of the mirror partitions were created during installation, it is time to create the others: create one 60 GB primary partition on each disk for /home1 and another one using the remainder of the space.

fdisk /dev/sda
...
Command (m for help): n
Command action ...   p
Partition number ... 2, 3 or 4 according to what has been defined before
First cylinder ... default
Last cylinder ...  +60000M  or default
Command action ...   t
Partition number ... : 2, 3 or 4 according to what has been defined before
Hex code ... : fd if you intend to include this into a RAID array
Command action ...   p to check all is correct
Command (m for help): w
.... The new table will be used at the next reboot.
fdisk /dev/sdb  and repeat as above
--- Reboot the machine ---

  • Check the newly created partitions: fdisk -l /dev/sda; fdisk -l /dev/sdb
  • Create the array: mdadm --create /dev/md2 -a yes -l 1 -n 2 /dev/sda3 /dev/sdb3 (-n takes the number of devices; use the partition numbers you created above)
  • Check the progress of building the RAID with: more /proc/mdstat
  • When finished: mkfs -j -L /home1 /dev/md2; tune2fs -i 0 -c 0 /dev/md2; mkdir /home1
  • Add a line to /etc/fstab: "/dev/md2 /home1 ext3 defaults 1 2"
  • Finally mount this new partition: mount -a
  • Repeat from mkfs for each of the data partitions
  • At this point you should have these disk partitions (single-disk equivalents in parentheses)
    • /dev/md0 (/dev/hda1) is the system partition, 20 GBytes or more
    • /dev/md1 (/dev/hda2) is the swap partition, 8 GBytes or more
    • /dev/md2 (/dev/hda3) is the /home1 partition, 60 GBytes or more
    • /dev/md3 or /dev/sda4 (/dev/hda4), etc are the data partitions
  • (SL5.5 or newer) enable RAID1 bitmaps: for each /dev/mdX device, run "mdadm --grow --bitmap=internal /dev/mdX". In /proc/mdstat, verify that there is a "bitmap" entry. The command will fail if reconstruction of the md device is in progress.
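
Because the bitmap step fails while an array is still rebuilding, it helps to summarize /proc/mdstat before running it. The sketch below is an illustration only: the function name md_report is an assumption, and the parsing relies on the usual /proc/mdstat layout (an "mdN :" line per array, a progress line containing "resync" or "recovery" while rebuilding, and a "bitmap:" line once a bitmap exists).

```shell
#!/bin/sh
# md_report [FILE]   (hypothetical helper)
# Print one line per md device: name, syncing/idle, bitmap/nobitmap.
# FILE defaults to /proc/mdstat; a saved copy can be passed instead.
md_report() {
    awk '
        function emit() {
            if (dev != "")
                print dev, (busy ? "syncing" : "idle"), (bm ? "bitmap" : "nobitmap")
        }
        /^md[0-9]+ :/     { emit(); dev = $1; busy = 0; bm = 0; next }
        /resync|recovery/ { busy = 1 }   # rebuild running, delayed or pending
        /bitmap:/         { bm = 1 }
        END               { emit() }
    ' "${1:-/proc/mdstat}"
}

# Example: md_report
# Devices reported as "idle nobitmap" are the ones ready for
# mdadm --grow --bitmap=internal /dev/mdX
```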

Post installation

  • (on midm15/midm9b/midm20 only) install correct ethernet driver eepro100 not e100
  • restore /home (non-NIS) or /home1 (NIS) and other required user directories from backup.
    (Can use /triumfcs/trshare/midas/Disks/rsync_back.csh ).
  • if needed, for non-NIS only, make a softlink for /home1 (NIS automounts /home).
    ln -s /home /home1
  • Restore user accounts (non-NIS and NIS master only):
    - edit /etc/passwd and /etc/shadow
    Append users' login info to the end of these files from the backup versions.
  • edit ~root/.forward
    (create the file if not present, or restore it from backup). In the file, add the email address of the person(s) to receive root's email (always include Konstantin, olchansk@triumf.ca)
  • clean up the network setup (otherwise postfix will not work):
    •  check that /etc/sysconfig/network has the entry "HOSTNAME=xxx.triumf.ca" (fully qualified domain name)
    • check that /etc/hosts has only the entry for "localhost".
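
The HOSTNAME check above can be done mechanically. This is a minimal sketch; the function name fqdn_hostname is an assumption, and "fully qualified" is approximated here as "contains at least one dot".

```shell
#!/bin/sh
# fqdn_hostname [FILE]   (hypothetical helper)
# Print the HOSTNAME value from FILE (default /etc/sysconfig/network)
# and fail if it does not look fully qualified.
fqdn_hostname() {
    name=$(sed -n 's/^HOSTNAME=//p' "${1:-/etc/sysconfig/network}" | tail -n 1 | tr -d '"')
    case $name in
        *.*) echo "$name" ;;
        *)   echo "HOSTNAME '$name' is not fully qualified" >&2
             return 1 ;;
    esac
}

# Example: fqdn_hostname || echo "fix /etc/sysconfig/network before configuring postfix"
```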

    Configure NIS and AUTOFS

  • ###run "system-config-authentication", check "select NIS support", click on "configure NIS", set NIS domain, leave NIS server field blank.
  • run "authconfig --enablenis --nisdomain LADD-NIS --update"
  • run "sed 's/^hosts:.*/hosts: files dns/' -i /etc/nsswitch.conf" (to undo a mistake from authconfig)
  • On the master NIS node (ladd00), add this new node to /etc/netgroup, and update NIS maps (cd /var/yp; make)
  • (if needed) To setup NIS slave server:
    •  check if /usr/lib64/yp exists. If not, run "yum install ypserv"
    • /usr/lib64/yp/ypinit -s ladd00 (/usr/lib/yp/ypinit on 32-bit machines)

    • chkconfig ypserv on

    • service ypserv start

    • service ypbind restart

    • add the new machine to /var/yp/ypservers on the NIS master, run "make -C /var/yp" and also "cd /var/yp; yppush -h newmachine ypservers"

    • If using securenets on the NIS master, remember to copy this file to the new slave server in /var/yp

  • (if NIS master or standalone) check /etc/auto.* against backups, particularly auto.master if NIS master
  • (if needed) add "+auto.master" at the end of /etc/auto.master
  • restart autofs to use the newly configured NIS maps: "service autofs restart"
  • Use "system-config-users" to add local user accounts
  • NIS: check user accounts: run "ypcat -k passwd"
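
Before running the in-place sed on /etc/nsswitch.conf given above, the same expression can be previewed without modifying the file. A small sketch; the function name preview_hosts_fix is an assumption.

```shell
#!/bin/sh
# preview_hosts_fix [FILE]   (hypothetical helper)
# Show what the "hosts:" line would become, leaving FILE untouched.
preview_hosts_fix() {
    sed 's/^hosts:.*/hosts: files dns/' "${1:-/etc/nsswitch.conf}" | grep '^hosts:'
}

# Example: preview_hosts_fix
```

If the printed line is what you expect, run the sed command with -i as given in the list above.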
  • Configure time

    Verify the time and date configuration. Run "ntpstat"; it should say "synchronised to NTP server (142.90.x.y)". If not, run "system-config-time", check "use network time", and enter NTP servers: time1, time2, time3

    Configure system updates

  • (If needed) enable automatic system updates: run "/triumfcs/trshare/olchansk/linux/triumf-update/yum-setup.perl -o"
  • (If needed) enable automatic kernel updates: run "/triumfcs/trshare/olchansk/linux/triumf-update/yum-setup.perl -k -o"

    Configure system services

    • chkconfig --list | grep on | sort (to see enabled services)
    • disable unwanted services:
    (only if amanda is not used) -> chkconfig --level 12345 xinetd off
    chkconfig --level 12345 canna off
    chkconfig --level 12345 FreeWnn off
    chkconfig --level 12345 hpoj off
    chkconfig --level 12345 ip6tables off
    chkconfig --level 12345 iptables off
    chkconfig --level 12345 isdn off
    chkconfig --level 12345 pcmcia off
    chkconfig --level 12345 rhnsd off
    chkconfig --level 12345 spamassassin off
    chkconfig --level 12345 bluetooth off
    chkconfig --level 12345 apmd off
    chkconfig --level 12345 iiim off
    chkconfig --level 12345 fenced off
    chkconfig --level 12345 ccsd off
    chkconfig --level 12345 cpuspeed off
    chkconfig --level 12345 pcp off
    chkconfig --level 12345 pmie off
    chkconfig --level 12345 yum-updatesd off
    chkconfig --level 12345 clvmd off
    chkconfig --level 12345 cman off
    chkconfig --level 12345 lvm2-monitor off
    chkconfig --level 12345 modclusterd off
    chkconfig --level 12345 yum-updateonboot off
    chkconfig --level 12345 cmirror off
    chkconfig --level 12345 lock_gulmd off
    chkconfig --level 12345 firstboot off
    chkconfig --level 12345 ricci off
    chkconfig --level 12345 gfs off
    chkconfig --level 12345 scsi_reserve off
    chkconfig --level 12345 openibd off
    chkconfig --level 12345 arptables_jf off
    chkconfig --level 12345 auditd off
    chkconfig --level 12345 avahi-daemon off
    chkconfig --level 12345 hplip off
    chkconfig --level 12345 iscsi off
    chkconfig --level 12345 iscsid off
    chkconfig --level 12345 mcstrans off
    chkconfig --level 12345 pcscd off
    chkconfig --level 12345 restorecond off
    chkconfig --level 12345 setroubleshoot off
    chkconfig --level 12345 xend off
    chkconfig --level 12345 xendomains off
    chkconfig --level 12345 kudzu off
    #chkconfig --level 12345 yum-cron off
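
The long list above can be driven by a loop. The following is a sketch under assumptions: the function name disable_services and the DRY_RUN variable are made up for illustration. By default it only prints the chkconfig commands, so the list can be reviewed first.

```shell
#!/bin/sh
# disable_services NAME...   (hypothetical helper)
# With DRY_RUN=1 (the default) print the commands; with DRY_RUN=0 run them.
disable_services() {
    for s in "$@"; do
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "chkconfig --level 12345 $s off"
        else
            chkconfig --level 12345 "$s" off
        fi
    done
}

# Example: review first, then apply
# disable_services canna FreeWnn hpoj ip6tables iptables
# DRY_RUN=0 disable_services canna FreeWnn hpoj ip6tables iptables
```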

    Configure hardware sensors

    • yum install lm_sensors
    • sensors-detect (accept default answer to all questions - press ENTER)
    • service lm_sensors restart (to reload the kernel modules)
    • sensors (to see available sensors)
    • (not needed) if sensors reports "General parse error", do "cp /triumfcs/trshare/olchansk/linux/lm_sensors/lm_sensors-3.0.1/prog/sensors/sensors /usr/bin/sensors" and try again

    Enable NFS

    • edit /etc/hosts.allow, add or uncomment "mountd: 142.90.0.0/255.255.0.0"
    • create /etc/exports, e.g. "/home1 @daqmachines(rw,no_root_squash,async)"
    • chkconfig nfs on
    • chkconfig nfslock on
    • service nfs restart
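
Several steps on this page add a single line to a configuration file (/etc/exports here; /etc/fstab and /etc/rc.local elsewhere). When the instructions are re-run, it is easy to add the same line twice. A minimal sketch of an idempotent append follows; the function name append_once is an assumption.

```shell
#!/bin/sh
# append_once FILE LINE   (hypothetical helper)
# Append LINE to FILE unless an identical line is already present.
append_once() {
    file=$1
    line=$2
    # -x: match the whole line, -F: fixed string, -q: quiet
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example:
# append_once /etc/exports "/home1 @daqmachines(rw,no_root_squash,async)"
```

The comparison is exact, so a line that differs only in whitespace counts as a different line.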

    Enable AMANDA backups

    AMANDA backups are already enabled by TRIUMF kickstart installs. For non-kickstart installations, follow the instructions at [http://amanda/~amanda], or look at "/triumfcs/trshare/olchansk/linux/amanda/amanda-enable.perl". As a final step, use [https://helpdesk.triumf.ca] to contact TRIUMF CS to add this new machine to the amanda backup list.

    Enable DCACHE

    • mkdir -p /pnfs/triumf.ca
    • edit /etc/rc.local, add to the end of file: "mount -o intr,rw,noac,hard,nfsvers=2 trdata00:/pnfsdoor /pnfs/triumf.ca"
    • . /etc/rc.local

    section ddd

  • (not needed for SL5.5 kickstart) erase unwanted packages: yum erase logwatch mailman mrtg inn inn-devel cyrus-imapd cyrus-imapd-devel cyrus-imapd-murder cyrus-imapd-nntp webalizer squirrelmail rhn-applet yumex-applet apt-autoupdate SL_enable_serialconsole tog-pegasus kernel-largesmp kernel-hugemem kernel-largesmp-devel spamassassin slrn-pull openafs kernel-module-openafs openafs-debug openafs-devel openafs-kernel-source kernel-largesmp kernel-hugemem kernel-hugemem-devel xen kernel-xen bash-completion
  • install xemacs (from EPEL): yum install "xemacs*"
  • install ganglia:
    scp ladd00:/etc/gmond.conf /etc/gmond.conf
    yum install ganglia ganglia-gmond
    chkconfig gmond on
    service gmond restart
  • install misc stuff
    cd /triumfcs/trshare/olchansk/linux/misc-rpms
    rpm -vh --install triumf-ko-ganglia-*
    rpm -vh --install emailonreboot-*
    rpm -vh --install monitor_nfs-*
    rpm -vh --install /triumfcs/trshare/olchansk/public_html/diskscrub/download/diskscrub*
  • install memtest86: "yum install memtest86+", "ls -1 /boot/* | grep memtest" to find out the name of the memtest boot file to use on the "kernel" line of the incantation to be added at the end of /boot/grub/grub.conf:
  • cat >> /boot/grub/grub.conf
    
    title memtest
          root (hd0,0)
          kernel /boot/memtest86+-1.65
    • yum install tkcvs
    • install node monitoring
      cd /triumfcs/trshare/olchansk/linux/misc-rpms
      rpm -vh --install triumf_nodeinfo.noarch.rpm
      /usr/sbin/sendnodeinfo.perl --config ladd00:8600
      edit /etc/nodeinfo
      /usr/sbin/sendnodeinfo.perl ladd00:8600
    • (before FC5) remove the fancy screen savers
      rm -f /usr/X11R6/lib/xscreensaver/*
      rm -f /usr/bin/*.kss
    • chmod a+r /var/log/messages
    • (before FC5) edit /etc/updatedb.conf, set "DAILY_UPDATE=yes"
    • cd ~; yum update
    • Install packages needed for ROOT and MIDAS DAQ

      yum install giflib.i386 giflib.x86_64 compat-libf2c-34.i386 mysql-devel sysstat "libusb-devel*" unixODBC-devel postgresql-devel libxml2-devel libXpm-devel libgfortran libstdc++-devel.i386 git

      Section eee

      • yum install xpdf (from EPEL)
      • (do not do this) install all missing packages: /triumfcs/trshare/olchansk/linux/triumf-update/yum-everything.perl
      • (SL5) If yum complains about unsigned java packages: cd ~; rpm -vh --upgrade /triumfcs/mirror/scientificlinux.org/5x/i386/updates/security/jdk*rpm /triumfcs/mirror/scientificlinux.org/5x/i386/updates/security/java*rpm

      Section fff

    • (SL5.5 or newer) enable RAID1 bitmaps: for each /dev/mdX device, run "mdadm --grow --bitmap=internal /dev/mdX". In /proc/mdstat, verify that there is a "bitmap" entry. The command will fail if reconstruction of the md device is in progress.
    • Configure GRUB boot loader

      • edit /boot/grub/grub.conf, remove the "quiet" and "rhgb" options
      • edit /boot/grub/grub.conf, comment out (with "#") the "splashimage=" line
      • check that GRUB boot loader is installed on all system disks: dd if=/dev/sda bs=1 count=1024 2>&1 | strings | grep GRUB
      • if GRUB is not installed (e.g. on the 2nd disk of machines with a mirrored system disk), install it as follows, after checking that /dev/sdb is the right disk:
      # grub
      grub> device (hd0) /dev/sdb
      grub> root (hd0,0)
      grub> setup (hd0)
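
      The boot-sector check above can be repeated over all system disks with a small loop. This is a sketch only: the function name has_grub is an assumption, and it uses "grep -a" (a GNU/BSD grep option for treating binary input as text) in place of the "strings" pipeline shown above. It reads the same first 1024 bytes.

```shell
#!/bin/sh
# has_grub DEVICE_OR_FILE   (hypothetical helper)
# Succeed if the first 1024 bytes contain the string "GRUB".
has_grub() {
    dd if="$1" bs=512 count=2 2>/dev/null | LC_ALL=C grep -a -q GRUB
}

# Example (run as root; adjust the disk list to the machine):
# for d in /dev/sda /dev/sdb; do
#     if has_grub "$d"; then echo "$d: GRUB found"; else echo "$d: GRUB missing"; fi
# done
```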

      Configure X11 graphics

      Special settings for DAQ

      • add the following at the end of /etc/X11/xorg.conf. This enables Ctrl-Alt-KP-/ and Ctrl-Alt-KP-* to unlock the keyboard after an Altera Quartus crash:
      Section "ServerFlags"
              Option "AllowDeactivateGrabs" "true"
              Option "AllowClosedownGrabs" "true"
      EndSection

      Install NVIDIA drivers

      • yum install nvidia-x11-drv kmod-nvidia (if it fails due to conflict with module-init-tools, run "yum --disablerepo \* --enablerepo elrepo update module-init-tools")
      • mv /etc/X11/xorg.conf /etc/X11/xorg.conf-xxx
      • nvidia-xconfig
      • /dev/MAKEDEV nvidia
      • restart the X11 server (Ctrl-Alt-Backspace or killall gdm-binary)
      • observe that X11 server restarts using the NVIDIA driver (big NVIDIA logo on startup)
      • if needed, login as root and run "nvidia-settings" to setup dual-screen configuration, etc

      Manual selection of monitor, video mode and resolution

      Automatic selection of monitor and video mode usually works. When it does not, configure it manually:

      • physically go to the computer
      • login as root
      • run "system-config-display".
      • In the "hardware" tab, select monitor type: "generic LCD 1280x1024" or "generic LCD 1600x1200".
      • In the "settings" tab, select "1280x1024" or "1600x1200" and "Thousands of colors".
      • Press "ok", the display settings application should close.
      • Logout, the new login window should use the new settings.

      Finish installation

      • log out and reboot the computer so that all the changes take effect