SLinstall: Difference between revisions

From DaqWiki

Revision as of 21:52, 8 May 2013

Notes

  • these instructions are periodically updated to include items needed for older/newer versions of Linux. They are marked like this: (SL4.2+) means Scientific Linux 4.2 and newer; (SL4 is equivalent to FC3). (FC5 only) means Fedora Core 5; etc.
  • obsolete items are marked by the "#" sign at the beginning of the line and sometimes have a comment about the reason for removal.
  • typically, we do not "upgrade" machines using the Red Hat "upgrade" function. Instead, we save critical files from the old installation and do a "fresh install" from scratch


Preparation

  • save /etc, /var, /root, /usr/local, /opt and /tftpboot (tar and scp to another machine or use either of: /triumfcs/trshare/midas/Disks/rsync_all_NIS.csh or /triumfcs/trshare/midas/Disks/rsync_all_noNIS.csh)
  • NIS only: ascertain NIS domain name (use authconfig) e.g. "DAQ-NIS","MUSR-NIS" etc.
  • check existing partition sizes on machine: df -hl, fdisk -l
  • note which are the /home1 and /data partitions
  • if /home1 is inside the "/" partition, you must save it as well
  • shutdown

Running SL installer

  • Start installation of the new system:
  • IMPORTANT: if you have WDC "Advanced Format" disks (4kB sectors), the disks have to be repartitioned before use, see special instructions (TBW) (note: use fdisk -H 224 -S 56 /dev/sdx)
  • boot from latest "SL5 kickstart" CD from Kelvin Raywood or PXE boot the latest SL installation image
  • after the system enters graphical mode, one can remove the CD: the installation is running over the network
  • two questions will be asked: how to partition the disks and the root password. The rest of the installation is automatic.
  • select "Custom partitioning". You can create either simple normal partitions if you have only one disk, or RAID 1 (mirror) for / and /home1 if you have two disks.
    • For the case of one disk only:
      - no "/boot" filesystem
      - allocate at least 20 GBytes for the "/" filesystem (primary partition, hda1)
      - allocate at least 8 GBytes for swap (primary partition, hda2)
      - allocate the rest as /home1 and /data (primary partitions hda3 and hda4)
    • For the case of two or more disks:
      - If there are already some partitions on the disks, consider DELETING them all
      - Click New, select software RAID for /dev/sda, 20000MB (20 GB) and Force primary partition
      - Same as above for /dev/sdb
      - Click RAID, select create a RAID device /dev/md0, mount point /, RAID Level 1 (mirror)
      - Repeat for the swap partition (/dev/md1, make it at least 8 GBytes)
      - Leave the rest of the disks free. The other RAID partitions will be created AFTER installation.
  • enter the root password for the new system
  • continue with installation, at the very end, the system will ask you to reboot
  • boot newly installed system and answer the few questions for SL5
  • Firewall: Leave it disabled; SELinux: choose Disabled; KDump: Leave it disabled; Date and Time: Leave kickstart defaults (should be NTP using TRIUMF time servers)
  • Create user: not necessary if you are using NIS; Sound card: can be ignored.
  • The system will reboot again.

Configure SSH

  • Login from the console
  • restore the SSH keys from backup (/etc/ssh/*key*)
  • service sshd restart
  • ssh into the new machine as root
  • ssh root@localhost, ctrl-C
  • scp root@ladd00:/root/authorized_keys ~root/.ssh/
  • (not needed for SL5.5 kickstart) check that /etc/ssh/ssh_config contains "ForwardX11 yes" and "ForwardX11Trusted yes":
echo "  ForwardX11 yes" >> /etc/ssh/ssh_config
echo "  ForwardX11Trusted yes" >> /etc/ssh/ssh_config
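
Running the two echo commands above a second time appends the lines again. A minimal sketch of an append that can be repeated safely is shown below; the helper name ensure_line is an assumption, and it only detects a line that is already present with exactly the same spelling and indentation.

```shell
#!/bin/sh
# ensure_line FILE LINE: append LINE to FILE only if no identical line exists.
# (hypothetical helper, not part of any SL package)
ensure_line() {
    file=$1
    line=$2
    # -x: match the whole line, -F: fixed string (no regex), -q: quiet
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Intended use on the new machine (as root):
#   ensure_line /etc/ssh/ssh_config "  ForwardX11 yes"
#   ensure_line /etc/ssh/ssh_config "  ForwardX11Trusted yes"
```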

Configure RAID arrays

NOTE1: For compatibility with the SL6 installer, use "fdisk -u" when creating new partitions.

NOTE2a: For 3TB disks, GPT partitions are required - use "gparted" (yum install epel-release; yum install gparted), create GPT partition in Device->Create partition table->Advanced->Select->gpt->Apply. Do NOT use "parted".

NOTE2b: (NOT TESTED YET) For 3TB disks, GPT partitions are required - use "gdisk" (yum install epel-release;yum install gdisk).

Typical disk configuration for DAQ use has 2 large disks with system ("/"), swap, home and data partitions, fully mirrored across the 2 disks using RAID1 software raid (MD).

In this fully mirrored configuration, a DAQ system will continue to operate without interruption and without performance degradation when there is a full or partial failure of either of the two disks.

If disks are hot-swappable, the failed or defective disk can be physically replaced by a spare, which is then partitioned and added to the RAID1 array. This restores full normal operation without shutting down or rebooting the system or interrupting data taking. (Since SATA, eSATA and USB are always electrically hot-swappable, disk hot-replacement is more of a mechanical issue.)

A typical disk layout looks like this:

[root@ladd06 ~]# fdisk -l  ### use "fdisk -lu" instead!!!

Disk /dev/sdb: 750.2 GB, 750156374016 bytes
...
   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1        5100    40960000   fd  Linux raid autodetect
/dev/sdb2            5100        9179    32768000   fd  Linux raid autodetect
/dev/sdb3            9179       21927   102399603+  fd  Linux raid autodetect
/dev/sdb4           21928       91201   556443405   fd  Linux raid autodetect

Disk /dev/sda: 750.2 GB, 750156374016 bytes
...
   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1        5100    40960000   fd  Linux raid autodetect
/dev/sda2            5100        9179    32768000   fd  Linux raid autodetect
/dev/sda3            9179       21927   102399603+  fd  Linux raid autodetect
/dev/sda4           21928       91201   556443405   fd  Linux raid autodetect
...
[root@ladd06 ~]# cat /proc/mdstat
Personalities : [raid1] 
md3 : active raid1 sdb4[1] sda4[0]
      556442245 blocks super 1.2 [2/2] [UU]
      bitmap: 0/5 pages [0KB], 65536KB chunk

md2 : active raid1 sdb3[1] sda3[0]
      102398507 blocks super 1.2 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

md1 : active raid1 sda2[0] sdb2[1]
      32766908 blocks super 1.1 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

md0 : active raid1 sda1[0] sdb1[1]
      40959928 blocks super 1.0 [2/2] [UU]
      bitmap: 1/1 pages [4KB], 65536KB chunk
...
[root@ladd06 ~]# df -kl
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/md0              40316208   6222676  32045536  17% /
/dev/md2             100790232    192116  95478192   1% /home1
/dev/md3             547709948    202404 519685432   1% /data6
...
[root@ladd06 ~]# swapon -s
Filename                                Type            Size    Used    Priority
/dev/md1                                partition       32766900        0       -1

Typical size of partitions:

  • /dev/md0 : "/" : 40 Gbytes should be sufficient. SL5 fits into an 8GB "/" and SL6 fits into a 16GB "/".
  • /dev/md1 : swap : double the amount of physical RAM or bigger. If it turns out to be too small, one can add swap space by adding a swap file located on the data partition.
  • /dev/md2 : "/home" : user home directories space is limited by the capacity and capability of the backup and archiving system used to protect user data against accidental file deletion, filesystem corruption and disastrous system failures. 100GB seems to be reasonable.
  • /dev/md3 : "/data" : data partition uses the remaining space on the disks.
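
The "double the amount of physical RAM" rule for swap can be computed from /proc/meminfo. The sketch below assumes the usual "MemTotal: N kB" line; the function name and the swap file path in the comments are placeholders.

```shell
#!/bin/sh
# suggested_swap_kb [FILE]: print twice the MemTotal value (in kB) found in
# FILE, which defaults to /proc/meminfo.
suggested_swap_kb() {
    awk '/^MemTotal:/ { print $2 * 2 }' "${1:-/proc/meminfo}"
}

# If the swap partition turns out to be too small, a swap file on the data
# partition can be added roughly like this (as root; path and size are
# placeholders, here 8 GB):
#   dd if=/dev/zero of=/data/swapfile bs=1M count=8192
#   chmod 600 /data/swapfile
#   mkswap /data/swapfile
#   swapon /data/swapfile
```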

Usually, the "/" and swap partitions are created through the SL installer program. The /home and /data partitions can be created at the same time.

Otherwise, follow these instructions:

  • create the partitions using fdisk or similar (this example creates a 60 GB partition):
    • fdisk -u /dev/sda
    • Command (m for help): n
    • Command action ... p
    • Partition number ... 2, 3 or 4 according to what has been defined before
    • First cylinder ... default
    • Last cylinder ... +60000M or default
    • Command action ... t
    • Partition number ... : 2, 3 or 4 according to what has been defined before
    • Hex code ... : fd
    • Command action ... p to check all is correct
    • Command (m for help): w
    • fdisk -u /dev/sdb and repeat as above
    • Reboot the machine
  • Check the newly created partitions: fdisk -lu /dev/sda; fdisk -lu /dev/sdb
  • mdadm --create /dev/md2 --bitmap=internal -l 1 -n 2 /dev/sda3 /dev/sdb3
  • Check the progress of building the RAID with: more /proc/mdstat
  • When finished: mkfs -j /dev/md2; tune2fs -i 0 -c 0 /dev/md2
  • mkdir /home1
  • Add to /etc/fstab: "/dev/md2 /home1 ext3 defaults 1 2" ### could be ext4 on SL6
  • Finally mount this new partition: mount -a
  • Repeat from "mkfs" for each of the data partitions
  • At this point you should have these disk partitions (underlying disk partitions in parentheses)
    • /dev/md0 (/dev/sda1, sdb1) is the system partition, 40 GBytes or more
    • /dev/md1 (/dev/sda2, sdb2) is the swap partition, 32 GBytes or more
    • /dev/md2 (/dev/sda3, sdb3) is the /home1 partition, 100 GBytes or more
    • /dev/md3 (/dev/sda4, sdb4) is the data partition
  • (SL5.5 or newer) enable raid1 bitmap files, for each /dev/mdX device, run "mdadm --grow --bitmap=internal /dev/mdX". In /proc/mdstat, verify that there is a "bitmap" entry. This will fail if the md device resync is in progress, so you have to wait for raid1 resync to finish.
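
The check of /proc/mdstat described above (all mirrors complete, each array has a "bitmap" entry) could be scripted as in this sketch. It assumes the mdstat layout shown earlier on this page; the function name check_mdstat is made up for illustration.

```shell
#!/bin/sh
# check_mdstat [FILE]: report md arrays that are degraded or lack a bitmap.
# FILE defaults to /proc/mdstat; the exit status is non-zero if any problem
# is found. (hypothetical helper)
check_mdstat() {
    awk '
        function report() {
            if (name == "") return
            if (!healthy) { print name ": degraded"; bad = 1 }
            if (!bitmap)  { print name ": no bitmap"; bad = 1 }
        }
        BEGIN { bad = 0; name = "" }
        # start of a new array, e.g. "md0 : active raid1 sda1[0] sdb1[1]"
        /^md[0-9]+ :/ { report(); name = $1; healthy = 0; bitmap = 0; next }
        # status line ends in e.g. "[UU]" (all members up) or "[U_]" (one missing)
        name != "" && /\[[U_]+\]/ { if ($NF ~ /^\[U+\]$/) healthy = 1 }
        name != "" && /bitmap:/ { bitmap = 1 }
        END { report(); exit bad }
    ' "${1:-/proc/mdstat}"
}

# Intended use:
#   check_mdstat && echo "all md arrays are complete and have bitmaps"
```

An array that is still resyncing with both members present is not flagged by this sketch; look for a "resync" line in /proc/mdstat for that.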

Restore data from backups

  • (on midm15/midm9b/midm20 only) install correct ethernet driver eepro100 not e100
  • restore /home (non-NIS) or /home1 (NIS) and other required user directories from backup. (Can use /triumfcs/trshare/midas/Disks/rsync_back.csh ).
  • if needed, for non-NIS only, make a softlink for /home1: ln -s /home /home1
  • restore users accounts (non-NIS and NIS master only): edit /etc/passwd and /etc/shadow, append users' login info to the end of these files from the backup versions.

Post installation

  • emacs -nw ~root/.forward (create the file if not present, or restore it from backup). In the file, add the email address of the person(s) to receive root's email: echo "olchansk@triumf.ca" >> ~root/.forward
  • emacs -nw /etc/sysconfig/network
    • set "HOSTNAME=" (set it to blank to use hostname from DHCP)
    • set "NETWORKWAIT=yes"
  • (not needed for SL6.1, NEEDED for SL6->6.1 update) in /etc/hosts, remove extraneous entries - only entries for localhost and localhost6 should remain
  • disable selinux: edit /etc/sysconfig/selinux, change line to read: SELINUX=disabled, reboot later for change to take effect
  • chmod a+r /var/log/messages

Configure network

(DO NOT DO THIS)

disable "persistent network names"

touch /etc/udev/rules.d/75-persistent-net-generator.rules
rm /etc/udev/rules.d/70-persistent-net.rules
#shutdown -r now

Configure NIS master

(do not use SL6.2 for NIS master)

  • yum install ypserv
  • domainname DEAP-NIS
  • cd /var/yp
  • edit Makefile
    • change NOPUSH=false
    • change the "all:" entry to read: all: passwd group netgrp shadow auto.master auto.home auto.local ypservers
  • touch /etc/auto.home /etc/auto.local ./ypservers
  • make
  • inspect created NIS maps: ls -l DEAP-NIS
  • chkconfig ypserv on
  • chkconfig ypxfrd on
  • chkconfig yppasswdd on
  • service ypserv start

Configure NIS client

  • run "authconfig --enablenis --enablepreferdns --nisdomain LADD-NIS --update"
  • if NIS server is SL6.2, add "--nisserver=ladd00" to above command
  • (not needed with --enablepreferdns above) run "sed 's/^hosts:.*/hosts: files dns/' -i /etc/nsswitch.conf" (to undo a mistake from authconfig)
  • On the master NIS node (ladd00), add this new node to /etc/netgroup, and update NIS maps (cd /var/yp; make)
  • Use "system-config-users" to add local user accounts
  • NIS: check user accounts: run "ypcat -k passwd"
  • echo "NISTIMEOUT=5" >> /etc/sysconfig/network
  • echo "NETWORKWAIT=yes" >> /etc/sysconfig/network

Configure NIS secondary server

(do this only if needed)

(do not use SL6.2 for NIS secondary server!)

  • yum install ypserv
  • /usr/lib64/yp/ypinit -s ladd00 (/usr/lib/yp/ypinit on 32-bit machines)
  • chkconfig ypserv on
  • service ypserv start
  • service ypbind restart
  • on the NIS master:
    • add the new machine to /var/yp/ypservers, run "make -C /var/yp" and also "cd /var/yp; yppush -h newmachine ypservers"
    • if using /var/yp/securenets, copy it from NIS master to new NIS secondary server

Configure AUTOFS

  • (if NIS master or standalone) check /etc/auto.* against backups, particularly auto.master if NIS master
  • (if needed) add "+auto.master" at the end of /etc/auto.master
  • restart autofs to use the newly configured NIS maps: "service autofs stop; service autofs start"

Configure time

Verify time and date configuration. Run "ntpstat"; it should say "synchronised to NTP server (142.90.x.y)". If not:

Configure system updates

  • (do not do this) enable automatic system updates: run "/triumfcs/trshare/olchansk/linux/triumf-update/yum-setup.perl -o"
  • (do not do this) enable automatic kernel updates: run "/triumfcs/trshare/olchansk/linux/triumf-update/yum-setup.perl -k -o"
  • (experimental, do not do this) /triumfcs/trshare/olchansk/linux/triumf-update/yum-autoupdate.sh
  • enable kernel updates: sed 's/^EXCLUDE=/#EXCLUDE=/' -i /etc/sysconfig/yum-autoupdate

Configure system services

  • chkconfig --list | grep on | sort (to see enabled services)
  • disable unwanted services:
(only if amanda is not used) -> chkconfig --level 12345 xinetd off
chkconfig --level 12345 canna off
chkconfig --level 12345 FreeWnn off
chkconfig --level 12345 hpoj off
chkconfig --level 12345 ip6tables off
chkconfig --level 12345 iptables off
chkconfig --level 12345 isdn off
chkconfig --level 12345 pcmcia off
chkconfig --level 12345 rhnsd off
chkconfig --level 12345 spamassassin off
chkconfig --level 12345 bluetooth off
chkconfig --level 12345 apmd off
chkconfig --level 12345 iiim off
chkconfig --level 12345 fenced off
chkconfig --level 12345 ccsd off
chkconfig --level 12345 cpuspeed off
chkconfig --level 12345 pcp off
chkconfig --level 12345 pmie off
chkconfig --level 12345 yum-updatesd off
chkconfig --level 12345 clvmd off
chkconfig --level 12345 cman off
chkconfig --level 12345 lvm2-monitor off
chkconfig --level 12345 modclusterd off
chkconfig --level 12345 yum-updateonboot off
chkconfig --level 12345 cmirror off
chkconfig --level 12345 lock_gulmd off
chkconfig --level 12345 firstboot off
chkconfig --level 12345 ricci off
chkconfig --level 12345 gfs off
chkconfig --level 12345 scsi_reserve off
chkconfig --level 12345 openibd off
chkconfig --level 12345 arptables_jf off
chkconfig --level 12345 auditd off
chkconfig --level 12345 avahi-daemon off
chkconfig --level 12345 hplip off
chkconfig --level 12345 iscsi off
chkconfig --level 12345 iscsid off
chkconfig --level 12345 mcstrans off
chkconfig --level 12345 pcscd off
chkconfig --level 12345 restorecond off
chkconfig --level 12345 setroubleshoot off
chkconfig --level 12345 xend off
chkconfig --level 12345 xendomains off
chkconfig --level 12345 kudzu off
#chkconfig --level 12345 yum-cron off
chkconfig --level 12345 kdump off
chkconfig --level 12345 libvirt-guests off
chkconfig --level 12345 libvirtd off
chkconfig --level 12345 spice-vdagentd off
chkconfig --level 12345 ksm off
chkconfig --level 12345 ksmtuned off
chkconfig --level 12345 iscsi off
chkconfig --level 12345 iscsid off
chkconfig --level 12345 openct off
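
Many of the services above are not installed on any given SL version, so chkconfig prints an error for each of them. The following sketch loops over a list of names and only touches services that have an init script. INITDIR and RUN are knobs invented for this sketch: with RUN=echo the commands are printed instead of executed.

```shell
#!/bin/sh
# disable_services SERVICE...: turn off each listed service in runlevels 1-5,
# skipping names that have no init script on this machine.
# INITDIR (default /etc/init.d) and RUN (default empty) are hypothetical
# settings of this sketch, not chkconfig features.
disable_services() {
    for svc in "$@"; do
        if [ -e "${INITDIR:-/etc/init.d}/$svc" ]; then
            ${RUN:-} chkconfig --level 12345 "$svc" off
        else
            echo "skip $svc (not installed)" >&2
        fi
    done
}

# Dry run with a few names from the list above:
#   RUN=echo disable_services iptables ip6tables bluetooth kudzu
```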

Configure external package repositories

  • yum install elrepo-release epel-release

Configure TRIUMF packages

(TRIUMF kickstart usually installs this automatically)

Configure TRIUMF mirror of yum repositories

scp root@ladd00:triumfcs-mirror-SL6.repo /etc/yum.repos.d/
yum clean all

Configure hardware sensors

  • yum install lm_sensors kmod-k10temp kmod-coretemp
  • sensors-detect (accept default answer to all questions - press ENTER)
  • service lm_sensors restart (to reload the kernel modules)
  • sensors (to see available sensors)

If no sensors are detected by standard drivers, follow motherboard-specific instructions at the bottom of this page.

Configure coretemp CPU sensors

On some machines, the coretemp driver for Intel CPU temperature sensors is not loaded after the above steps.

  • sensors | grep coretemp ### number of sensors reported should be the same as the number of CPU cores
  • if output is blank, add this to /etc/rc.local
emacs -nw /etc/rc.local
modprobe coretemp

Configure IPMI sensors

Some machines support the IPMI interface for monitoring the hardware: fan speeds, temperatures, voltages.

  • find out if IPMI is supported. Try this:
dmidecode | grep -i ipmi

if output is not blank, IPMI may be supported.

  • install and enable IPMI software:
yum install "OpenIPMI*" ipmitool
service ipmi start
ipmitool sensor ### to confirm IPMI is present. If output is blank, do not go further.
chkconfig ipmi on
chkconfig ipmievd on
service ipmi restart
service ipmievd restart
tail -100 /var/log/messages ### look at messages logged by ipmievd
  • if ipmievd complains about SEL buffer overflow, clear it manually:
ipmitool sel list ### show ipmi messages in raw format
ipmitool sel elist ### show ipmi messages in useful format
ipmitool sel elist > file ### save ipmi messages into a file
ipmitool sel clear  ### clear all accumulated ipmi messages
  • useful ipmi commands:
    • ipmitool sensor -- read hardware sensors
    • ipmitool sel elist -- report all accumulated messages

Enable User Disk Quotas

[root@isdaq00 home1]# grep quota /etc/fstab
UUID=5a2aefbd-45db-475e-841e-12ec89220fbd /home1 ext4 defaults,grpquota,usrquota 1 2
  • cd /; umount /home1; mount /home1
  • quotacheck -cug /home1
  • quotacheck -avug
  • quotaon -av
  • quota system is now active
  • increase the soft quota time limit from default 7days to 30 or 60 days: edquota -t
  • set quotas for all users (see below)
  • setup warnquota:
    • create warnquota config file: emacs -nw /etc/warnquota.conf
# values can be quoted:
MAIL_CMD        = "/usr/sbin/sendmail -t"
FROM            = root
SUBJECT         = User %i@%h exceeded allocated disk quota
CC_TO           = "root"
# If you set this variable CC will be used only when user has less than
# specified grace time left (examples of possible times: 5 seconds, 1 minute,
# 12 hours, 5 days)
# CC_BEFORE = 2 days
SUPPORT         = "root"
# Text in the beginning of the mail (if not specified, default text is used)
# This way text can be split to more lines
# Line breaks are done by '|' character
# The expressions %i, %h, %d, and %% are substituted for user/group name,
# host name, domain name, and '%' respectively. For backward compatibility
# %s behaves as %i but is deprecated.
MESSAGE         = User "%i" on "%h" has exceeded the allocated disk quota.||Please delete any unnecessary files on following filesystems or|contact the system administrator to increase your quota allocation:|
SIGNATURE       = --|automated email from warnquota
    • note that %i@%h in the SUBJECT line does not seem to work
    • create cron job: emacs -nw /etc/cron.daily/warnquota
#!/bin/sh
warnquota
#end
    • chmod a+x /etc/cron.daily/warnquota
    • touch /etc/crontab

Useful commands for managing quotas:

  • repquota -a | sort -n -k3 ### show quota of all users sorted by disk usage
  • edquota -u username ### open "vi" editor to change user quotas
  • repquota -a | grep username ### report quota for given user
  • setquota -u username 0 0 0 0 /home1 ### disable quotas for given user
  • setquota -u username 20000000 40000000 0 0 /home1 ### set quotas for 20GB soft and 40GB hard (the soft limit comes first and should not exceed the hard limit)
  • edquota -t ### change soft quota time limits
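
Setting quotas for all users can be done by generating one setquota command per account. The sketch below only prints the commands so they can be reviewed first. It assumes that regular accounts have UID 500 or higher (the SL5/SL6 default, adjust for your site) and that 1 GB is counted as 1000000 blocks of 1 KB, as in the examples above; the function names are made up.

```shell
#!/bin/sh
# gb_to_blocks GB: convert gigabytes to the 1 KB blocks used by setquota,
# with the round numbers used on this page (1 GB = 1000000 blocks).
gb_to_blocks() {
    echo $(( $1 * 1000000 ))
}

# print_setquota_cmds SOFT_GB HARD_GB FILESYSTEM [PASSWD_FILE]:
# print one setquota command per regular user found in PASSWD_FILE
# (default /etc/passwd). Regular user = UID >= 500, an assumption.
print_setquota_cmds() {
    soft=$(gb_to_blocks "$1")
    hard=$(gb_to_blocks "$2")
    awk -F: -v s="$soft" -v h="$hard" -v fs="$3" '
        $3 >= 500 && $1 != "nfsnobody" {
            printf "setquota -u %s %d %d 0 0 %s\n", $1, s, h, fs
        }' "${4:-/etc/passwd}"
}

# Intended use (review the output, then pipe it to sh as root):
#   print_setquota_cmds 20 40 /home1
# With NIS, dump the accounts to a file first:
#   ypcat passwd > /tmp/passwd.nis; print_setquota_cmds 20 40 /home1 /tmp/passwd.nis
```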

Enable NFS V3 server

  • edit /etc/hosts.allow, add or uncomment "mountd: 142.90.0.0/255.255.0.0"
  • create /etc/exports, e.g. "/home1 @daqmachines(rw,no_root_squash,async)"
  • check the netgroup file
    • if using NIS: check NIS netgroup: ypcat -k netgroup
    • if no NIS, create /etc/netgroup: @daqmachines (deap00,,) (deap01,,) (deap02,,)
    • if no NIS, edit /etc/nsswitch.conf, make the netgroup line read: "netgroup: files"
  • chkconfig nfs on
  • chkconfig nfslock on
  • service nfs restart

Enable NFS V4 SERVER (SL6)

  • if used with NIS, same as NFSv3
  • if used as standalone, edit idmapd.conf: set the "Domain" name to the same value on the NFS server and the NFS clients (the default, automatically determined value does not always work). More TBW.

Enable AMANDA backups

AMANDA backups are already enabled by TRIUMF kickstart installs. For non-kickstart installation, follow instructions at [http://amanda/~amanda], or look at "/triumfcs/trshare/olchansk/linux/amanda/amanda-enable.perl". As final step, use [https://helpdesk.triumf.ca] to contact TRIUMF CS to add this new machine to the amanda backup list.

  • yum install triumf-amanda

Enable DCACHE

  • mkdir -p /pnfs
  • edit /etc/rc.local, add to the end of file: "mount -o intr,rw,noac,hard,nfsvers=3 trdata00:/pnfs /pnfs"
  • . /etc/rc.local

For more information, see dcache page.

Configure Ganglia

Ganglia-3.0 instructions for SL4 and SL5:

scp ladd00:/etc/gmond.conf /etc/gmond.conf
yum install ganglia ganglia-gmond
chkconfig gmond on
service gmond restart

Ganglia-3.1 instructions for SL6:

/bin/rm /etc/gmond.conf
yum install "*gmond*"
scp ladd00:ganglia-triumf-daq.conf /etc/ganglia/conf.d
chkconfig gmond on
service gmond restart

Ganglia-3.2 instructions for 32-bit SL4: (for 64-bit, go to .../SL4-64)

cd /triumfcs/trshare/olchansk/linux/ganglia-3.2/SL4-32
rpm -vh --install apr*.rpm
rpm -vh --upgrade lib*.rpm ganglia*.rpm
/bin/rm /etc/gmond.conf
install new gmond.conf in /etc/ganglia
chkconfig gmond on
service gmond restart

Configure TRIUMF DAQ packages

Install Konstantin's packages

  • yum --disablerepo=\* --enablerepo=triumf-daq --skip-broken install diskscrub emailonreboot monitor_nfs "ganglia-*" triumf_nodeinfo

(THIS DOES NOT WORK - TRSHARE is DEAD)

cd /triumfcs/trshare/olchansk/linux/misc-rpms
rpm -vh --install *ganglia-*rpm
rpm -vh --install emailonreboot-*rpm
rpm -vh --install monitor_nfs-*rpm
rpm -vh --install /triumfcs/trshare/olchansk/public_html/diskscrub/download/diskscrub*noarch*

Install memtest and PXE boot

cd /boot
wget http://ladd00.triumf.ca/tftpboot/memtest86+-4.10
wget http://ladd00.triumf.ca/tftpboot/gpxe-1.0.1+-gpxe.lkrn

emacs -nw /boot/grub/grub.conf
title memtest
      root (hd0,0)
      kernel /boot/memtest86+-4.10
title pxeboot
      root (hd0,0)
      kernel /boot/gpxe-1.0.1+-gpxe.lkrn

(OBSOLETE) Install memtest (OBSOLETE)

  • yum install memtest86+
  • ls -1 /boot/* | grep memtest ### to find out the name of the memtest boot file to use on the "kernel" line of the incantation to be added at the end of /boot/grub/grub.conf:

For SL4, SL5:

emacs -nw /boot/grub/grub.conf
title memtest
      root (hd0,0)
      kernel /boot/memtest86+-1.65

For SL6:

emacs -nw /boot/grub/grub.conf
title memtest
      root (hd0,0)
      kernel /boot/memtest86+-4.10

Install node monitoring

rpm -vh --upgrade http://trshare.triumf.ca/~olchansk/triumf_nodeinfo/download/triumf_nodeinfo.noarch.rpm
/usr/sbin/sendnodeinfo.perl --config ladd00.triumf.ca:8600
emacs -nw /etc/nodeinfo
/usr/sbin/sendnodeinfo.perl ladd00.triumf.ca:8600

Install packages needed for QUARTUS, ROOT, EPICS and MIDAS DAQ

yum install --skip-broken giflib.i386 giflib.i686 giflib.x86_64 compat-libf2c-34.i386 compat-libf2c-34.i686 mysql-devel mysql-devel.i686 openssl-devel.i686 sysstat "libusb-devel*" unixODBC-devel unixODBC-devel.i686 postgresql-devel libxml2-devel libXpm-devel libgfortran libstdc++-devel.i386 libstdc++-devel.i686 git compat-readline43 "graphviz*" dcap "zlib-*.i686" "libXext-*.i686" "libXtst-*.i686" "tigervnc*" telnet glibc"*" glibc-static.i686 freetype.i686 fontconfig.i686 libpng.i686 libXrender.i686 strace "fftw*" libpng "freetype*" xpdf "xemacs*" tkcvs xterm mutt "*g77*" joe "libXmu*" glibc-devel.i686 libX11-devel.i686 libXpm-devel.i686 libXft-devel.i686 mysql-devel.i686 dcap-devel dcap-devel.i686 gsl-devel gsl-devel.i686 pcre-devel pcre-devel.i686 fontconfig-devel.i686 freetype-devel.i686 libpng-devel.i686 libjpeg-devel.i686 libgfortran.i686 libxml2-devel.i686 h5py gd-devel gd-devel.i686 readline-devel.i686 ncurses-devel.i686 libXdmcp.i686 xorg-x11-fonts"*" rdesktop minicom xfig"*" perl-BSD-Resource "net-snmp-*" readline-static readline-static.i686 git-all "boost-*" "boost-devel.i686" nasm imake tcl-devel gv xorg-x11-twm expat-devel screen compat-readline5 compat-readline5.i686 ImageMagick ImageMagick-devel wget

yum reinstall urw-fonts

Install NTFS drivers

yum install ntfs-3g ntfsprogs (from EPEL)

Install KDE Konqueror web browser

This is the only web browser that can open multiple copies of itself on multiple computers for the same user (it does not bomb out with errors about "this profile is already in use by you on some other computer").

  • yum install kdebase
  • konqueror
  • web away!

Install Google Chrome web browser

  • this works only for 64-bit SL6
  • create yum repo with following contents: emacs -nw /etc/yum.repos.d/google-chrome-64.repo
[google-chrome-64]
name=google-chrome - 64-bit
baseurl=http://dl.google.com/linux/chrome/rpm/stable/x86_64
enabled=1
gpgcheck=1
gpgkey=https://dl-ssl.google.com/linux/linux_signing_key.pub
  • yum install google-chrome-stable

Disable gdm and X11

initctl stop prefdm
mv /etc/init/prefdm.conf /etc/init/prefdm.conf-disabled
mv /etc/init/splash-manager.conf /etc/init/splash-manager.conf-disabled
initctl reload-configuration

Install JAVAWS

  • to run Java "web start" jnlp files (EVO, SEEVOGH, etc): javaws Downloads/spider.jnlp
  • install javaws:
  • yum install icedtea-web icedtea-web-javadoc

Install firefox java plugin

This installs the Oracle Java plugin:

  • rpm -vh --install ~deap/jdk-7u15-linux-x64.rpm
  • ls -l /usr/lib64/mozilla/plugins/
  • ln -s /usr/java/jdk1.7.0_15/jre/lib/amd64/libnpjp2.so /usr/lib64/mozilla/plugins/
  • start firefox, go edit->preferences->general->manage add-ons->plugins
  • "java plugin 1.7.0_15" should be listed

Install SKYPE

  • on SL6
  • yum install alsa-lib.i686 libXv.i686 libXScrnSaver.i686 glib2.i686 libtiff.i686
  • ln -s /usr/lib/libtiff.so.3 /usr/lib/libtiff.so.4
  • download skype_staticQT-4.0.0.8 ("linux static Qt" choice), untar, cd into it
  • "ldd ./skype" to confirm that all required shared libraries are installed
  • ./skype

Configure USB device permissions

Configure USB device permissions for user access to USB-serial devices, Altera USB Blaster, etc.

  • create file /etc/udev/rules.d/99-usb-chmod.rules with these contents:
emacs -nw /etc/udev/rules.d/99-usb-chmod.rules
ACTION=="add", SUBSYSTEM=="usb_device", RUN+="/bin/chmod a+wr /dev/%c"
ACTION=="add", SUBSYSTEM=="usb_device", RUN+="/bin/chmod a+wr /proc/%c"
ACTION=="add", ENV{DEVTYPE}=="usb_device", RUN+="/bin/chmod a+wr $env{DEVNAME}"
ACTION=="add", ENV{DEVTYPE}=="usb_device", RUN+="/bin/chmod a+wr $env{DEVICE}"
ACTION=="add", ENV{PHYSDEVBUS}=="usb-serial", RUN+="/bin/chmod a+wr $env{DEVNAME}"
ACTION=="add", ENV{DEVPATH}=="/class/tty/ttyS*", RUN+="/bin/chmod a+wr $env{DEVNAME}"
ACTION=="add", SUBSYSTEM=="tty", DEVPATH=="*ttyUSB*", RUN+="/bin/chmod a+rw $env{DEVNAME}"
ACTION=="add", SUBSYSTEM=="tty", DEVPATH=="*ttyS*", RUN+="/bin/chmod a+rw $env{DEVNAME}"
  • apply new permissions: udevadm trigger --action=add

Configure Altera jtagd

(if needed)

mkdir /etc/jtagd
echo 'Password = "123";' > /etc/jtagd/jtagd.conf
cp -pv /triumfcs/trshare/olchansk/altera/11.0/quartus/linux/pgm_parts.txt /etc/jtagd/jtagd.pgm_parts
  • start local jtagd: /triumfcs/trshare/olchansk/altera/11.0/quartus/bin/jtagd
  • test local connection: /triumfcs/trshare/olchansk/altera/11.0/quartus/bin/jtagconfig
  • test remote connection (add this machine to your .jtag.conf, run jtagconfig)

For more information, go to Quartus

(OBSOLETE) Configure packages

  • yum install xpdf "xemacs*" tkcvs xterm mutt "*g77*" joe "libXmu*"
  • (not needed for SL5.5 kickstart) erase unwanted packages: yum erase logwatch mailman mrtg inn inn-devel cyrus-imapd cyrus-imapd-devel cyrus-imapd-murder cyrus-imapd-nntp webalizer squirrelmail rhn-applet yumex-applet apt-autoupdate SL_enable_serialconsole tog-pegasus kernel-largesmp kernel-hugemem kernel-largesmp-devel spamassassin slrn-pull openafs kernel-module-openafs openafs-debug openafs-devel openafs-kernel-source kernel-largesmp kernel-hugemem kernel-hugemem-devel xen kernel-xen bash-completion
  • yum update

Configure GRUB boot loader

  • edit /boot/grub/grub.conf, remove the "quiet" and "rhgb" options
  • edit /boot/grub/grub.conf, comment out (with "#") the "splashimage=" line
  • check that GRUB boot loader is installed on all system disks:
    • dd if=/dev/sda bs=1 count=1024 2>&1 | strings | grep GRUB
    • dd if=/dev/sdb bs=1 count=1024 2>&1 | strings | grep GRUB
  • if GRUB is not installed (typically on the 2nd disk of machines with a mirrored system disk), install it as follows, but first check that /dev/sdb is the right disk:
# grub
grub> device (hd0) /dev/sdb
grub> root (hd0,0)
grub> setup (hd0)
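
The GRUB check above can be wrapped into a loop over all system disks. This sketch looks for the string "GRUB" in the first 1024 bytes of each device, as the dd commands above do, but uses GNU grep -a instead of strings; the function names are made up.

```shell
#!/bin/sh
# has_grub DEVICE: succeed if the string "GRUB" occurs in the first 1024
# bytes of DEVICE. (hypothetical helper; needs GNU grep for -a)
has_grub() {
    dd if="$1" bs=512 count=2 2>/dev/null | grep -a -q GRUB
}

# check_boot_disks DEVICE...: report each device, return non-zero if GRUB is
# missing on any of them.
check_boot_disks() {
    rc=0
    for d in "$@"; do
        if has_grub "$d"; then
            echo "$d: GRUB found"
        else
            echo "$d: GRUB NOT found"
            rc=1
        fi
    done
    return $rc
}

# Intended use (as root):
#   check_boot_disks /dev/sda /dev/sdb
```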

Special hardware settings

ASUS Crosshair mobo

  • use BIOS version 1207 or newer
  • sensors need these drivers from ELREPO: yum install --noplugins kmod-it87 kmod-k10temp; sensors-detect; service lm_sensors restart; sensors

ASUS Crosshair-II mobo

  • use BIOS version 2607 or newer
  • for the onboard IDE to work, add "all-generic-ide" to kernel boot options in grub.conf
  • sensors need these drivers from ELREPO: yum install --noplugins kmod-it87 kmod-k10temp; sensors-detect; service lm_sensors restart; sensors

ASUS P7P55D EVO mobo

  • use BIOS version 1808 (or 2004) or newer
  • (before SL6.2) sensors do not work (no driver in SL5 kernel)
  • (before SL6.2) install special driver for the r8168 GigE network interface (SL5.5 stock driver sometimes freezes):
    • yum install kmod-r8168 (from elrepo)
    • edit /etc/modprobe.conf, change the eth0 entry to read: "alias eth0 r8168"
  • (SL6) install ELREPO network drivers:
    • yum install kmod-r8168 kmod-r8169
    • sed 's/^blacklist/#blacklist/' -i /etc/modprobe.d/blacklist-r8169.conf
    • reboot
    • verify that correct drivers are loaded: ethtool -i eth0; ethtool -i eth1

ASUS P6X58-E-WS mobo

  • BIOS settings
    • F1 or DEL to enter BIOS setup, F8 boot menu
    • go to POWER->HW mon, confirm CPU temperature is around 30C (this shows the heatsink is installed correctly; with a badly installed heatsink the temperature quickly goes up to 50-70C).
    • Main menu: Storage config - SATA change IDE->AHCI
    • System information: confirm BIOS version 301, CPU type, memory size
    • AI Tweak: set DRAM frequency - AUTO->DDR3-1333
    • Advanced->Onboard devices: LAN BOOT: enabled
    • Power->HW monitor: CPU Q-FAN: enabled
    • Boot->Settings: Quick boot: enabled; Full screen logo: disabled; Wait for F1: disabled
    • Save and exit

ASUS E35M1-M PRO mobo

  • use BIOS version 1002 or newer
  • for CPU temperature: install kmod-k10temp from ELREPO (kmod-k10temp-0.0-4.el6.elrepo.x86_64.rpm)
  • for Sensors, install driver for NCT6776F chip from https://github.com/groeck/w83627ehf/archives/master (in the Makefile, change the line "KERNEL_BUILD=" to read: "KERNEL_BUILD:=/usr/src/kernels/$(TARGET)"):
scp ladd00:/home/olchansk/daq/linux/groeck-w83627ehf-dd3e543/w83627ehf.ko /root/w83627ehf.ko
echo "modprobe hwmon; modprobe hwmon-vid; modprobe k10temp; rmmod w83627ehf; insmod /root/w83627ehf.ko" >> /etc/rc.local
  • to enable booting from USB3, edit /etc/dracut.conf, change line "add_drivers" to read: add_drivers+="xhci-hcd"
  • to use multiple monitors, install ATI proprietary drivers version 11.11 or newer (Radeon 6300) and run "aticonfig --initial --heads=2 --adapter=1 --xinerama=on". To change screen layout, edit /etc/X11/xorg.conf. Only dual monitors (DVI+HDMI) seem to work; triple monitors do not seem to work.
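
The dracut.conf change for USB3 booting can be scripted as follows. This is a sketch, assuming GNU sed and a stock dracut.conf in which the add_drivers line is present but possibly commented out; the function name set_dracut_add_drivers is hypothetical.

```shell
# Hypothetical helper: set_dracut_add_drivers FILE DRIVER
# Rewrites any existing (possibly commented-out) add_drivers line in FILE
# to read add_drivers+="DRIVER"; appends the line if there is none.
# DRIVER must not contain "/" or other characters special to sed.
set_dracut_add_drivers() {
    conf=$1
    driver=$2
    if grep -q '^[#[:space:]]*add_drivers' "$conf"; then
        sed -i "s/^[#[:space:]]*add_drivers.*/add_drivers+=\"$driver\"/" "$conf"
    else
        echo "add_drivers+=\"$driver\"" >> "$conf"
    fi
}
```

For the case above this would be called as "set_dracut_add_drivers /etc/dracut.conf xhci-hcd". The setting is only read when an initramfs is built, so presumably the initramfs has to be regenerated afterwards (for example with "dracut --force") before it has any effect.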

ASUS P9X79 WS

  • http://www.asus.com/Motherboard/P9X79_WS/
  • use BIOS version 3101, 3401 or newer. If BIOS is 1305 or older, install P9X79-WS-CAP-Converter.ROM (BIOS 2902/3101), then the new BIOS.
  • for CPU temperature, install coretemp
  • for sensors, install driver for NCT6776F chip same as E35M1-M above.
  • BIOS Settings:
    • enter "Advanced mode"
    • Ai Tweaker -> Ai Overclock Tuner -> Set to "XMP" - this enables DDR3-1600 RAM speed vs DDR3-1333 by default
    • Monitor -> CPU fan speed low limit -> Set to "200 RPM" - we are using high efficiency slow turning CPU coolers and the default 600 RPM is right on the edge of firing false warnings
    • Boot -> Full screen logo -> Set to "disabled"
    • Wait for F1 -> Set to "disabled"

ASUS P8B-M

  • use BIOS version 6103 or newer
  • for CPU temperature, install coretemp
  • for sensors, install driver for NCT6776F chip same as E35M1-M above.

SUPERMICRO

(use "dmidecode | more" to read mobo model number)

X9SCL

  • yum install kmod-w83627ehf.x86_64 coretemp
  • xemacs -nw /etc/rc.local, add:
modprobe coretemp
modprobe w83627ehf
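
Instead of editing /etc/rc.local by hand, the two modprobe lines can be appended from a script. The sketch below adds a line only if it is not already there, so it is safe to run more than once; the function name append_once is hypothetical.

```shell
# Hypothetical helper: append_once FILE LINE
# Appends LINE to FILE unless an identical line is already present.
append_once() {
    file=$1
    line=$2
    # -x: match whole lines only, -F: treat LINE as a fixed string
    grep -qxF -e "$line" "$file" 2>/dev/null || echo "$line" >> "$file"
}
```

Usage for this board would be: append_once /etc/rc.local "modprobe coretemp"; append_once /etc/rc.local "modprobe w83627ehf".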

Configure X11 graphics

Special settings for DAQ

  • add the following at the end of /etc/X11/xorg.conf. This enables Ctrl-Alt-KP-/ and Ctrl-Alt-KP-* to unlock the keyboard after an Altera Quartus crash:
Section "ServerFlags"
        Option "AllowDeactivateGrabs" "true"
        Option "AllowClosedownGrabs" "true"
EndSection
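
The same section can be appended from a script. This is a minimal sketch with a hypothetical function name, add_grab_flags; it does nothing if the AllowDeactivateGrabs option is already mentioned anywhere in the file.

```shell
# Hypothetical helper: add_grab_flags FILE
# Appends the ServerFlags section shown above to FILE, unless FILE
# already contains the AllowDeactivateGrabs option.
add_grab_flags() {
    conf=$1
    grep -q 'AllowDeactivateGrabs' "$conf" 2>/dev/null && return 0
    cat >> "$conf" <<'EOF'

Section "ServerFlags"
        Option "AllowDeactivateGrabs" "true"
        Option "AllowClosedownGrabs" "true"
EndSection
EOF
}
```

Usage: add_grab_flags /etc/X11/xorg.conf. If the file already has a "ServerFlags" section, it is probably cleaner to add the two Option lines to that section by hand instead.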

Install NVIDIA drivers

  • yum --enablerepo elrepo install nvidia-x11-drv kmod-nvidia (if it fails due to conflict with module-init-tools, run "yum --disablerepo \* --enablerepo elrepo update module-init-tools")
  • mv /etc/X11/xorg.conf /etc/X11/xorg.conf-xxx
  • nvidia-xconfig
  • (SL6) reboot
  • (SL5) /dev/MAKEDEV nvidia
  • (SL5) restart the X11 server (Ctrl-Alt-Backspace or "killall Xorg gdm-binary")
  • observe that X11 server restarts using the NVIDIA driver (big NVIDIA logo on startup)
  • if needed, login as root and run "nvidia-settings" to setup dual-screen configuration, etc

Install legacy NVIDIA drivers

For old NVIDIA cards:

  • GeForce FX 5500
wget http://us.download.nvidia.com/XFree86/Linux-x86/173.14.31/NVIDIA-Linux-x86-173.14.31-pkg1.run
sh ./NVIDIA-Linux-x86-173.14.31-pkg1.run

Install ATI/AMD drivers

  • yum --enablerepo elrepo install kmod-fglrx fglrx-x11-drv
  • check that /etc/X11/xorg.conf section "Device" entry "Driver" says "fglrx"
  • killall Xorg
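
The check of the "Driver" entry can be done without opening an editor. The sketch below prints the driver named in each "Device" section of an xorg.conf file; the function name xorg_device_drivers is hypothetical, and the parsing assumes the usual layout with one keyword per line.

```shell
# Hypothetical helper: xorg_device_drivers FILE
# Prints the Driver value of every Section "Device" in FILE, one per line.
# "InputDevice" sections are ignored.
xorg_device_drivers() {
    awk '
        tolower($1) == "section"    { in_dev = (tolower($2) == "\"device\"") }
        tolower($1) == "endsection" { in_dev = 0 }
        in_dev && tolower($1) == "driver" { gsub(/"/, "", $2); print $2 }
    ' "$1"
}
```

On a machine set up as above, "xorg_device_drivers /etc/X11/xorg.conf" is expected to print fglrx.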

Manual selection of monitor, video mode and resolution

Automatic selection of monitor and video mode usually works. When it does not, configure it manually:

  • physically go to the computer
  • login as root
  • run "nvidia-settings" on machines using the NVIDIA driver
  • run "aticonfig" on machines with the ATI/AMD driver (use "aticonfig --initial" for initial setup, and good luck with anything more complicated)
  • run "system-config-display".
    • In the "hardware" tab, select monitor type: "generic LCD 1280x1024" or "generic LCD 1600x1200".
    • In the "settings" tab, select "1280x1024" or "1600x1200" and "Thousands of colors".
    • Press "ok", the display settings application should close.
  • Logout, the new login window should use the new settings.

Disable screen saver

If a machine is booted without any monitor connected, current video cards do not enable any video outputs. If a monitor is connected later, there is no video image and there is no easy way to get one.

This can be solved by configuring X11 to always enable some video output. Because the monitor type is not known when X11 starts, one has to select some standard video mode (e.g. VESA 1280x1024) on some video output (VGA, DVI or HDMI).

Only NVIDIA cards with the NVIDIA driver (from ELREPO) are supported by these instructions.

  • create default xorg.conf: nvidia-xconfig
  • edit /etc/X11/xorg.conf
  • add monitor section for the fake monitor:
Section "Monitor"
    Identifier     "Monitor0"
    VendorName     "Unknown"
    ModelName      "Unknown"
    HorizSync       31.0 - 83.0
    VertRefresh     59.0 - 61.0
    Option         "DPMS" "off"
    ModeLine "1280x1024"   108.00   1280 1328 1440 1688   1024 1025 1028 1066 +hsync +vsync
EndSection
  • add output selection in the "Device" section:
Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BoardName      "GeForce 210"
    #Option "ConnectedMonitor" "DFP"
    #Option "ConnectedMonitor" "CRT"
    Option "ConnectedMonitor" "CRT-1"
    Option "UseEDID" "no"
EndSection
  • add fake video mode to the "Screen" section:
Section "Screen"
    Identifier     "Screen0"
    Device         "Device0"
    Monitor        "Monitor0"
    DefaultDepth    24
    SubSection     "Display"
        Depth       24
        Modes       "1280x1024"
    EndSubSection
EndSection
  • disable screen saver and DPMS power off in the "ServerLayout" or "ServerFlags" section:
Section "ServerLayout"
    Identifier     "Layout0"
    Screen      0  "Screen0" 0 0
    InputDevice    "Keyboard0" "CoreKeyboard"
    InputDevice    "Mouse0" "CorePointer"
    Option         "Xinerama" "0"
    Option         "BlankTime" "0"
    Option         "StandbyTime" "0"
    Option         "SuspendTime" "0"
    Option         "OffTime" "0"
EndSection

Section "ServerFlags" 
    Option         "BlankTime" "0" 
    Option         "StandbyTime" "0" 
    Option         "SuspendTime" "0" 
    Option         "OffTime" "0" 
EndSection 
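
As a cross-check of the fake monitor definition above: in a ModeLine the first number is the pixel clock in MHz, the 4th horizontal number is the total line length in pixels and the 4th vertical number is the total number of lines per frame. The horizontal and vertical rates follow from these three values and should fall inside the HorizSync and VertRefresh ranges of the "Monitor" section. The sketch below does the arithmetic with awk; the function name modeline_rates is hypothetical.

```shell
# Hypothetical helper: modeline_rates CLOCK_MHZ HTOTAL VTOTAL
# Prints the horizontal sync rate (kHz) and vertical refresh rate (Hz)
# implied by a ModeLine.
modeline_rates() {
    awk -v clk="$1" -v htotal="$2" -v vtotal="$3" 'BEGIN {
        hsync = clk * 1000 / htotal        # MHz -> kHz, per scan line
        vrefresh = hsync * 1000 / vtotal   # kHz -> Hz, per frame
        printf "%.2f kHz %.2f Hz\n", hsync, vrefresh
    }'
}

# ModeLine "1280x1024" 108.00  1280 1328 1440 1688  1024 1025 1028 1066
modeline_rates 108.00 1688 1066    # prints: 63.98 kHz 60.02 Hz
```

Both values are inside the ranges given above (HorizSync 31.0 - 83.0, VertRefresh 59.0 - 61.0), so the X server should accept the mode.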

Finish installation

  • logout and reboot the computer for all the changes to take effect

Obsolete items

  • (do not do this) install all missing packages: /triumfcs/trshare/olchansk/linux/triumf-update/yum-everything.perl
  • (SL5) If yum complains about unsigned java packages: cd ~; rpm -vh --upgrade /triumfcs/mirror/scientificlinux.org/5x/i386/updates/security/jdk*rpm /triumfcs/mirror/scientificlinux.org/5x/i386/updates/security/java*rpm
  • (before FC5) remove the fancy screen savers
rm -f /usr/X11R6/lib/xscreensaver/*
rm -f /usr/bin/*.kss
  • (before FC5) edit /etc/updatedb.conf, set "DAILY_UPDATE=yes"