SLinstall: Difference between revisions
Revision as of 11:02, 13 November 2011
Notes
- these instructions are periodically updated to include items needed for older/newer versions of Linux. They are marked like this: (SL4.2+) means Scientific Linux 4.2 and newer; (SL4 is equivalent to FC3). (FC5 only) means Fedora Core 5; etc.
- obsolete items are marked by the "#" sign at the beginning of the line and sometimes have a comment about the reason for removal.
- typically, we do not "upgrade" machines using the Red Hat "upgrade" function. Instead, we save critical files from the old installation and do a "fresh install" from scratch
Preparation
- save /etc, /var, /root, /usr/local, /opt and /tftpboot (tar and scp to another machine or use either of: /triumfcs/trshare/midas/Disks/rsync_all_NIS.csh or /triumfcs/trshare/midas/Disks/rsync_all_noNIS.csh)
- NIS only: ascertain NIS domain name (use authconfig) e.g. "DAQ-NIS","MUSR-NIS" etc.
- check existing partition sizes on machine: df -hl, fdisk -l
- note which are the /home1 and /data partitions
- if /home1 is inside the "/" partition, you must save it as well
- shutdown
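The "tar and scp" option for saving directories can be sketched as a small shell function. This is only an illustration: the function name backup_tree and the destination /mnt/backup are made up, the directories must be given as absolute paths, and the rsync_all_*.csh scripts mentioned above remain the usual route.

```shell
# Hypothetical helper (not part of the original procedure): archive each
# listed directory into its own tar file under a destination directory.
# Directories must be absolute paths; missing ones are skipped with a warning.
backup_tree() {
    dest=$1
    shift
    for d in "$@"; do
        if [ ! -d "$d" ]; then
            echo "skipping $d (not present)" >&2
            continue
        fi
        # Archive name is the path with slashes replaced: /usr/local -> usr_local.tar
        name=$(printf '%s' "${d#/}" | tr '/' '_')
        tar -cf "$dest/$name.tar" -C / "${d#/}" || return 1
    done
}

# Example (commented out; /mnt/backup is an assumed destination):
# backup_tree /mnt/backup /etc /var /root /usr/local /opt /tftpboot
```

The resulting tar files can then be copied to another machine with scp before the disk is repartitioned.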
Running SL installer
- Start installation of the new system:
- IMPORTANT: if you have WDC "Advanced Format" disks (4 kB sectors), the disks have to be repartitioned before use; see special instructions (TBW) (note: use fdisk -H 224 -S 56 /dev/sdx)
- boot from latest "SL5 kickstart" CD from Kelvin Raywood or PXE boot the latest SL installation image
- after the system enters graphical mode, one can remove the CD; the installation is running over the network
- two questions will be asked: how to partition the disks and the root password. The rest of the installation is automatic.
- select "Custom partitioning". You can create either simple normal partitions if you have only one disk, or RAID 1 (mirror) for / and /home1 if you have two disks.
<strong>For the case of one disk only</strong>:
- no "/boot" filesystem
- allocate at least 20 GBytes for "/" filesystem (primary partition, hda1)
- allocate at least 8 GBytes for swap (primary partition, hda2)
- allocate the rest as /home1 and /data (primary partitions hda3 and hda4)
<strong>For the case of two or more disks:</strong>
- If there are already some partitions on the disks, consider DELETING them all
- Click New, select software RAID for /dev/sda, 20000MB (20 GB) and Force primary partition
- Same as above for /dev/sdb
- Click RAID, select create a RAID device /dev/md0, mount point /, RAID Level1 (mirror)
- Repeat for the swap partition (/dev/md1, make it at least 8 Gbytes)
- Leave the rest of the disks free. The other RAID partitions will be created AFTER installation.
- enter the root password for the new system
- continue with installation, at the very end, the system will ask you to reboot
- boot newly installed system and answer the few questions for SL5
- Firewall: Leave it disabled; SELinux: choose Disabled; KDump: Leave it disabled; Date and Time: Leave kickstart defaults (should be NTP using TRIUMF time servers)
- Create user: not necessary if you are using NIS; Sound card: can usually be ignored.
- The system will reboot again.
Configure SSH
- Login from the console
- restore the SSH keys from backup (/etc/ssh/*key*)
- service sshd restart
- ssh into the new machine as root
- ssh root@localhost, ctrl-C
- scp root@ladd00:/root/authorized_keys ~root/.ssh/
- (not needed for SL5.5 kickstart) check that /etc/ssh/ssh_config contains "ForwardX11 yes" and "ForwardX11Trusted yes".
Configure RAID arrays
If you created only one mirror partition during installation, it is time to create the others: create one 60 GB primary partition on each disk for /home1 and another one using the remainder of the space.
<strong>fdisk /dev/sda</strong>
...
Command (m for help): <strong>n</strong>
Command action ... <strong>p</strong>
Partition number ... <strong>2, 3 or 4</strong> according to what has been defined before
First cylinder ... default
Last cylinder ... <strong>+60000M</strong> or default
Command action ... t
Partition number ... : <strong>2, 3 or 4</strong> according to what has been defined before
Hex code ... : fd if you intend to include this into a RAID array
Command action ... <strong>p to check all is correct</strong>
Command (m for help): <strong>w</strong> .... The new table will be used at the next reboot.
<strong>fdisk /dev/sdb</strong> and repeat as above
<strong><em>- Reboot the machine ---</em></strong>
- Check the newly created partitions: fdisk -l /dev/sda; fdisk -l /dev/sdb
- mdadm --create /dev/md2 --bitmap=internal -l 1 -n 2 /dev/sda3 /dev/sdb3
- Check the progress of building the RAID with: more /proc/mdstat
- When finished: mkfs -j -L /home1 /dev/md2; tune2fs -i 0 -c 0 /dev/md2; mkdir /home1
- Add a line to /etc/fstab: "/dev/md2 /home1 ext3 defaults 1 2"
- Finally mount this new partition: mount -a
- Repeat from "mkfs" for each of the data partitions
- At this point you should have these disk partitions (single-disk in parentheses)
- /dev/md0 (/dev/sda1, sdb1) is the system partition, 40 GBytes or more
- /dev/md1 (/dev/sda2, sdb2) is the swap partition, 16 GBytes or more
- /dev/md2 (/dev/sda3, sdb3) is the /home1 partition, 100 GBytes or more
- /dev/md3 (/dev/sda4, sdb4) is the data partition
- (SL5.5 or newer) enable raid1 bitmap files, for each /dev/mdX device, run "mdadm --grow --bitmap=internal /dev/mdX". In /proc/mdstat, verify that there is a "bitmap" entry. This will fail if the md device reconstruction is in progress.
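The "verify that there is a bitmap entry" check can be sketched as a small awk filter over /proc/mdstat. The function name md_without_bitmap is made up for this example, and the parsing assumes the usual mdstat layout in which each array starts with a line such as "md0 : active raid1 ..." followed by an indented "bitmap:" line when a bitmap is present.

```shell
# Hypothetical helper: list md devices that have no "bitmap:" line.
# Reads the file given as $1, or /proc/mdstat when no argument is given.
md_without_bitmap() {
    awk '
        # A line like "md0 : active raid1 sdb1[1] sda1[0]" starts a new array.
        /^md[0-9]+ :/ { dev = $1; seen[dev] = 0; order[++n] = dev; next }
        # A "bitmap:" line belongs to the most recently seen array.
        /bitmap:/ && dev != "" { seen[dev] = 1 }
        END { for (i = 1; i <= n; i++) if (!seen[order[i]]) print order[i] }
    ' "${1:-/proc/mdstat}"
}

# Example (commented out): print the grow command for each array lacking a bitmap.
# for md in $(md_without_bitmap); do
#     echo "mdadm --grow --bitmap=internal /dev/$md"
# done
```

An empty result means every array already shows a bitmap entry. The example only prints the mdadm commands so they can be reviewed first, since the grow step fails while a reconstruction is in progress.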
Post installation
- (on midm15/midm9b/midm20 only) install correct ethernet driver eepro100 not e100
- restore /home (non-NIS) or /home1 (NIS) and other required user directories from backup. (Can use /triumfcs/trshare/midas/Disks/rsync_back.csh ).
- if needed, for non-NIS only, make a softlink for /home1: ln -s /home /home1
- restore users accounts (non-NIS and NIS master only): edit /etc/passwd and /etc/shadow, append users' login info to the end of these files from the backup versions.
- edit ~root/.forward (create the file if not present, or restore it from backup). In the file, add the email address of the person(s) to receive root's email (always include Konstantin, olchansk@triumf.ca)
- in /etc/sysconfig/network, set "HOSTNAME=" (blank, to use hostname from DHCP)
- in /etc/sysconfig/network, set "NETWORKWAIT=yes"
- in /etc/hosts, remove extraneous entries; only entries for localhost and localhost6 should remain
- chmod a+r /var/log/messages
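To see which /etc/hosts entries would count as extraneous, a filter along these lines may help. It is a sketch only: the function name extra_hosts_entries is made up, and it treats any line that names localhost or localhost6 (with or without a domain suffix) as one to keep.

```shell
# Hypothetical helper: print the entries of a hosts file that do not name
# localhost or localhost6. Reads $1, or /etc/hosts when no argument is given.
extra_hosts_entries() {
    awk '
        # Skip blank lines and comment lines.
        /^[ \t]*(#|$)/ { next }
        {
            keep = 0
            # Field 1 is the address; the remaining fields are host names.
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break                 # trailing comment
                if ($i ~ /^localhost6?(\.|$)/) keep = 1
            }
            if (!keep) print
        }
    ' "${1:-/etc/hosts}"
}

# Example (commented out): list candidates for removal.
# extra_hosts_entries /etc/hosts
```

The function only reports lines; the actual removal is still done by hand in an editor.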
Configure NIS and AUTOFS
- ###run "system-config-authentication", check "select NIS support", click on "configure NIS", set NIS domain, leave NIS server field blank.
- run "authconfig --enablenis --nisdomain LADD-NIS --update"
- run "sed 's/^hosts:.*/hosts: files dns/' -i /etc/nsswitch.conf" (to undo a mistake from authconfig)
- On the master NIS node (ladd00), add this new node to /etc/netgroup, and update NIS maps (cd /var/yp; make)
- (if needed) To setup NIS slave server:
- check if /usr/lib64/yp exists. If not, run "yum install ypserv"
- /usr/lib64/yp/ypinit -s ladd00 (/usr/lib/yp/ypinit on 32-bit machines)
- chkconfig ypserv on
- service ypserv start
- service ypbind restart
- add the new machine to /var/yp/ypservers on the NIS master, run "make -C /var/yp" and also "cd /var/yp; yppush -h newmachine ypservers"
- If using securenets on the NIS master, remember to copy this file to the new slave server in /var/yp
- (if NIS master or standalone) check /etc/auto.* against backups, particularly auto.master if NIS master
- (if needed) add "+auto.master" at the end of /etc/auto.master
- restart autofs to use the newly configured NIS maps: "service autofs stop; service autofs start"
- Use "system-config-users" to add local user accounts
- NIS: check user accounts: run "ypcat -k passwd"
- add "NISTIMEOUT=5" to /etc/sysconfig/network
- add "NETWORKWAIT=yes" to /etc/sysconfig/network
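The KEY=VALUE edits to /etc/sysconfig/network (HOSTNAME, NETWORKWAIT, NISTIMEOUT) can be made repeatable with a small helper that replaces an existing setting or appends a missing one. The name set_kv is made up for this sketch, and it assumes simple values without backslashes.

```shell
# Hypothetical helper: set KEY=VALUE in a shell-style config file.
# An existing KEY= line is replaced; otherwise the setting is appended.
set_kv() {
    file=$1
    key=$2
    value=$3
    tmp=$(mktemp) || return 1
    awk -v k="$key" -v v="$value" '
        BEGIN { FS = "=" }
        $1 == k { print k "=" v; found = 1; next }
        { print }
        END { if (!found) print k "=" v }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}

# Example (commented out):
# set_kv /etc/sysconfig/network HOSTNAME ""
# set_kv /etc/sysconfig/network NETWORKWAIT yes
# set_kv /etc/sysconfig/network NISTIMEOUT 5
```

Writing the result back with cat rather than mv keeps the ownership and permissions of the original file, and running the helper twice leaves a single line per key.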
Configure time
Verify time and date configuration. Run "ntpstat"; it should say "synchronised to NTP server (142.90.x.y)". If not:
- (SL6.0) rpm -vh --upgrade http://mirror.triumf.ca/rpmindex/fedora/archive/updates/12/i386/python-slip-0.2.13-1.fc12.noarch.rpm (see https://bugzilla.redhat.com/show_bug.cgi?id=720848)
- (SL6.1) rpm -vh --upgrade http://mirror.triumf.ca/rpmindex/fedora/archive/updates/12/i386/python-slip-0.2.13-1.fc12.noarch.rpm http://mirror.triumf.ca/rpmindex/fedora/archive/updates/12/i386/python-slip-dbus-0.2.13-1.fc12.noarch.rpm http://mirror.triumf.ca/rpmindex/fedora/archive/updates/12/i386/python-slip-gtk-0.2.13-1.fc12.noarch.rpm (see above)
- run "system-config-date"
- check "use network time"
- enter NTP servers: time1, time2, time3
- say "OK" (if there is an error about writing config files, say "cancel")
- chkconfig ntpdate on
- chkconfig ntpd on
- service ntpdate start
- service ntpd start
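After the services are started, the ntpstat check can be scripted. The sketch below only inspects the text that ntpstat prints, matching the "synchronised to NTP server" message quoted above; the function name ntp_is_synced is made up.

```shell
# Hypothetical helper: read ntpstat output on stdin and succeed only if the
# clock is reported as synchronised to an NTP server.
ntp_is_synced() {
    grep -q '^synchronised to NTP server'
}

# Example (commented out): ntpd may need a few minutes after starting
# before it reports synchronisation.
# if ntpstat | ntp_is_synced; then
#     echo "time is synchronised"
# else
#     echo "not synchronised yet, check the NTP configuration" >&2
# fi
```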
Configure system updates
- (do not do this) enable automatic system updates: run "/triumfcs/trshare/olchansk/linux/triumf-update/yum-setup.perl -o"
- (do not do this) enable automatic kernel updates: run "/triumfcs/trshare/olchansk/linux/triumf-update/yum-setup.perl -k -o"
- (experimental, do not do this) /triumfcs/trshare/olchansk/linux/triumf-update/yum-autoupdate.sh
Configure system services
- chkconfig --list | grep on | sort (to see enabled services)
- disable unwanted services:
(only if amanda is not used) -> chkconfig --level 12345 xinetd off
chkconfig --level 12345 canna off
chkconfig --level 12345 FreeWnn off
chkconfig --level 12345 hpoj off
chkconfig --level 12345 ip6tables off
chkconfig --level 12345 iptables off
chkconfig --level 12345 isdn off
chkconfig --level 12345 pcmcia off
chkconfig --level 12345 rhnsd off
chkconfig --level 12345 spamassassin off
chkconfig --level 12345 bluetooth off
chkconfig --level 12345 apmd off
chkconfig --level 12345 iiim off
chkconfig --level 12345 fenced off
chkconfig --level 12345 ccsd off
chkconfig --level 12345 cpuspeed off
chkconfig --level 12345 pcp off
chkconfig --level 12345 pmie off
chkconfig --level 12345 yum-updatesd off
chkconfig --level 12345 clvmd off
chkconfig --level 12345 cman off
chkconfig --level 12345 lvm2-monitor off
chkconfig --level 12345 modclusterd off
chkconfig --level 12345 yum-updateonboot off
chkconfig --level 12345 cmirror off
chkconfig --level 12345 lock_gulmd off
chkconfig --level 12345 firstboot off
chkconfig --level 12345 ricci off
chkconfig --level 12345 gfs off
chkconfig --level 12345 scsi_reserve off
chkconfig --level 12345 openibd off
chkconfig --level 12345 arptables_jf off
chkconfig --level 12345 auditd off
chkconfig --level 12345 avahi-daemon off
chkconfig --level 12345 hplip off
chkconfig --level 12345 iscsi off
chkconfig --level 12345 iscsid off
chkconfig --level 12345 mcstrans off
chkconfig --level 12345 pcscd off
chkconfig --level 12345 restorecond off
chkconfig --level 12345 setroubleshoot off
chkconfig --level 12345 xend off
chkconfig --level 12345 xendomains off
chkconfig --level 12345 kudzu off
#chkconfig --level 12345 yum-cron off
chkconfig --level 12345 kdump off
chkconfig --level 12345 libvirt-guests off
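The chkconfig commands above all follow one pattern, so they can be generated from a plain list of service names. This is a sketch under assumptions: the function name gen_disable_cmds and the list file /root/unwanted-services.txt are made up, and the function only prints the commands (a dry run) so the list can be reviewed before anything is disabled.

```shell
# Hypothetical helper: read service names on stdin (one per line) and print
# the matching chkconfig command for each. Blank lines and names starting
# with "#" (the obsolete-item marker used on this page) are skipped.
gen_disable_cmds() {
    while read -r svc rest; do
        case $svc in
            ''|'#'*) continue ;;
        esac
        printf 'chkconfig --level 12345 %s off\n' "$svc"
    done
}

# Example (commented out; the list file name is an assumption):
# gen_disable_cmds < /root/unwanted-services.txt          # review the output
# gen_disable_cmds < /root/unwanted-services.txt | sh     # then apply it
```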
Configure external package repositories
(if not done by TRIUMF kickstart)
- rpm -vh --install /triumfcs/mirror/epel/6/x86_64/epel-release-6-5.noarch.rpm
- rpm -vh --install /triumfcs/mirror/elrepo/el6/x86_64/RPMS/elrepo-release-6-3.el6.elrepo.noarch.rpm
Configure hardware sensors
- yum install lm_sensors
- sensors-detect (accept default answer to all questions - press ENTER)
- service lm_sensors restart (to reload the kernel modules)
- sensors (to see available sensors)
- (not needed) if sensors reports "General parse error", do "cp /triumfcs/trshare/olchansk/linux/lm_sensors/lm_sensors-3.0.1/prog/sensors/sensors /usr/bin/sensors" and try again
Enable NFS
- edit /etc/hosts.allow, add or uncomment "mountd: 142.90.0.0/255.255.0.0"
- create /etc/exports, e.g. "/home1 @daqmachines(rw,no_root_squash,async)"
- chkconfig nfs on
- chkconfig nfslock on
- service nfs restart
Enable AMANDA backups
AMANDA backups are already enabled by TRIUMF kickstart installs. For non-kickstart installation, follow instructions at [http://amanda/~amanda], or look at "/triumfcs/trshare/olchansk/linux/amanda/amanda-enable.perl". As final step, use [https://helpdesk.triumf.ca] to contact TRIUMF CS to add this new machine to the amanda backup list.
Enable DCACHE
- mkdir -p /pnfs/triumf.ca
- edit /etc/rc.local, add to the end of file: "mount -o intr,rw,noac,hard,nfsvers=2 trdata00:/pnfsdoors /pnfs/triumf.ca"
- . /etc/rc.local
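Adding a line to the end of /etc/rc.local is safer when it is idempotent, so that repeating the step on a reinstall does not produce duplicate mount commands. A minimal sketch, with the made-up function name append_once:

```shell
# Hypothetical helper: append a line to a file unless an identical line is
# already present (exact, whole-line, fixed-string match).
append_once() {
    file=$1
    line=$2
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example (commented out), using the mount line given above:
# append_once /etc/rc.local "mount -o intr,rw,noac,hard,nfsvers=2 trdata00:/pnfsdoors /pnfs/triumf.ca"
```

The same helper would also fit the /etc/fstab and other /etc/rc.local additions described elsewhere on this page.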
Configure Ganglia
scp ladd00:/etc/gmond.conf /etc/gmond.conf
yum install ganglia ganglia-gmond
chkconfig gmond on
service gmond restart
Install Konstantin's packages
cd /triumfcs/trshare/olchansk/linux/misc-rpms
rpm -vh --install triumf-ko-ganglia-*
rpm -vh --install emailonreboot-*
rpm -vh --install monitor_nfs-*
rpm -vh --install /triumfcs/trshare/olchansk/public_html/diskscrub/download/diskscrub*
Install memtest
yum install memtest86+
ls -1 /boot/* | grep memtest ### to find out the name of the memtest boot file to use on the "kernel" line of the incantation to be added at the end of /boot/grub/grub.conf:
cat >> /boot/grub/grub.conf
title memtest
root (hd0,0)
kernel /boot/memtest86+-1.65
Install node monitoring
cd /triumfcs/trshare/olchansk/linux/misc-rpms
rpm -vh --install triumf_nodeinfo.noarch.rpm
/usr/sbin/sendnodeinfo.perl --config ladd00:8600
emacs -nw /etc/nodeinfo
/usr/sbin/sendnodeinfo.perl ladd00:8600
Install packages needed for ROOT, EPICS and MIDAS DAQ
yum install giflib.i386 giflib.i686 giflib.x86_64 compat-libf2c-34.i386 compat-libf2c-34.i686 mysql-devel sysstat "libusb-devel*" unixODBC-devel postgresql-devel libxml2-devel libXpm-devel libgfortran libstdc++-devel.i386 libstdc++-devel.i686 git compat-readline43 "graphviz*" dcap "zlib-*.i686" "libXext-*.i686" "libXtst-*.i686" "tigervnc*" telnet
Install NTFS drivers
yum install ntfs-3g ntfsprogs (from EPEL)
Install TRIUMF packages
(TRIUMF kickstart usually installs this automatically)
- rpm -vh --install http://mirror.triumf.ca/triumf/6/x86_64/RPMS/triumf-release-1.4-1.noarch.rpm
- yum install triumf-automount
Configure USB device permissions
Configure USB device permissions for user access to USB-serial devices, Altera USB Blaster, etc.
- create file /etc/udev/rules.d/99-usb-chmod.rules with these contents:
emacs -nw /etc/udev/rules.d/99-usb-chmod.rules
ACTION=="add", SUBSYSTEM=="usb_device", RUN+="/bin/chmod a+wr /dev/%c"
ACTION=="add", SUBSYSTEM=="usb_device", RUN+="/bin/chmod a+wr /proc/%c"
ACTION=="add", ENV{DEVTYPE}=="usb_device", RUN+="/bin/chmod a+wr $env{DEVNAME}"
ACTION=="add", ENV{DEVTYPE}=="usb_device", RUN+="/bin/chmod a+wr $env{DEVICE}"
ACTION=="add", ENV{PHYSDEVBUS}=="usb-serial", RUN+="/bin/chmod a+wr $env{DEVNAME}"
ACTION=="add", ENV{DEVPATH}=="/class/tty/ttyS*", RUN+="/bin/chmod a+wr $env{DEVNAME}"
- apply new permissions: udevadm trigger --action=add
Configure Altera jtagd
(if needed)
mkdir /etc/jtagd
echo 'Password = "123";' > /etc/jtagd/jtagd.conf
cp -pv /triumfcs/trshare/olchansk/altera/11.0/quartus/linux/pgm_parts.txt /etc/jtagd/jtagd.pgm_parts
- start local jtagd: /triumfcs/trshare/olchansk/altera/11.0/quartus/bin/jtagd
- test local connection: /triumfcs/trshare/olchansk/altera/11.0/quartus/bin/jtagconfig
- test remote connection (add this machine to your .jtag.conf, run jtagconfig)
Configure packages
- yum install xpdf "xemacs*" tkcvs xterm mutt "*g77*" joe "libXmu*"
- (not needed for SL5.5 kickstart) erase unwanted packages: yum erase logwatch mailman mrtg inn inn-devel cyrus-imapd cyrus-imapd-devel cyrus-imapd-murder cyrus-imapd-nntp webalizer squirrelmail rhn-applet yumex-applet apt-autoupdate SL_enable_serialconsole tog-pegasus kernel-largesmp kernel-hugemem kernel-largesmp-devel spamassassin slrn-pull openafs kernel-module-openafs openafs-debug openafs-devel openafs-kernel-source kernel-largesmp kernel-hugemem kernel-hugemem-devel xen kernel-xen bash-completion
- yum update
Configure GRUB boot loader
- edit /boot/grub/grub.conf, remove the "quiet" and "rhgb" options
- edit /boot/grub/grub.conf, comment out (with "#") the "splashimage=" line
- check that GRUB boot loader is installed on all system disks:
- dd if=/dev/sda bs=1 count=1024 2>&1 | strings | grep GRUB
- dd if=/dev/sdb bs=1 count=1024 2>&1 | strings | grep GRUB
- if GRUB is not installed (e.g. on the 2nd disk of machines with a mirrored system disk), install it as follows (but first check that /dev/sdb is the right disk):
# grub
grub> device (hd0) /dev/sdb
grub> root (hd0,0)
grub> setup (hd0)
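The check for GRUB in the first 1024 bytes of each system disk can be wrapped in a function and looped over the disks. This is a sketch: the function name has_grub_signature is made up, it reads two 512-byte blocks instead of 1024 single bytes (the same data), and it searches the raw bytes with grep rather than going through strings.

```shell
# Hypothetical helper: succeed if the string "GRUB" appears in the first
# 1024 bytes of the given disk device (or image file).
has_grub_signature() {
    dd if="$1" bs=512 count=2 2>/dev/null | LC_ALL=C grep -q GRUB
}

# Example (commented out; adjust the disk list to the machine):
# for disk in /dev/sda /dev/sdb; do
#     if has_grub_signature "$disk"; then
#         echo "$disk: GRUB found"
#     else
#         echo "$disk: GRUB NOT found"
#     fi
# done
```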
Special hardware settings
ASUS Crosshair mobo
- use BIOS version 1207 or newer
- sensors need these drivers from ELREPO: yum install --noplugins kmod-it87 kmod-k10temp; sensors-detect; service lm_sensors restart; sensors
ASUS Crosshair-II mobo
- use BIOS version 2607 or newer
- for the onboard IDE to work, add "all-generic-ide" to kernel boot options in grub.conf
- sensors need these drivers from ELREPO: yum install --noplugins kmod-it87 kmod-k10temp; sensors-detect; service lm_sensors restart; sensors
ASUS P7P55D EVO mobo
- use BIOS version 1808 (or 2004) or newer
- sensors do not work (no driver in SL5 kernel)
- install special driver for the r8168 GigE network interface (SL5.5 stock driver sometimes freezes):
- yum install kmod-r8168 (from elrepo)
- edit /etc/modprobe.conf, change the eth0 entry to read: "alias eth0 r8168"
ASUS E35M1-M PRO mobo
- use BIOS version 1002 or newer
- for CPU temperature: install kmod-k10temp from ELREPO (kmod-k10temp-0.0-4.el6.elrepo.x86_64.rpm)
- for Sensors, install driver for NCT6776F chip from https://github.com/groeck/w83627ehf/archives/master (in the Makefile, change the line "KERNEL_BUILD=" to read: "KERNEL_BUILD:=/usr/src/kernels/$(TARGET)"):
ssh ladd00 cp ~olchansk/daq/linux/groeck-w83627ehf-dd3e543/w83627ehf.ko /root/w83627ehf.ko
echo "modprobe hwmon; modprobe hwmon-vid; rmmod w83627ehf; insmod /root/w83627ehf.ko" >> /etc/rc.local
Configure X11 graphics
Special settings for DAQ
- add the following at the end of /etc/X11/xorg.conf. This enables Ctrl-Alt-KP-/ and Ctrl-Alt-KP-* to unlock the keyboard after an Altera Quartus crash:
Section "ServerFlags"
    Option "AllowDeactivateGrabs" "true"
    Option "AllowClosedownGrabs" "true"
EndSection
Install NVIDIA drivers
- yum --enablerepo elrepo install nvidia-x11-drv kmod-nvidia (if it fails due to conflict with module-init-tools, run "yum --disablerepo \* --enablerepo elrepo update module-init-tools")
- mv /etc/X11/xorg.conf /etc/X11/xorg.conf-xxx
- nvidia-xconfig
- (SL6) reboot
- (SL5) /dev/MAKEDEV nvidia
- (SL5) restart the X11 server (Ctrl-Alt-Backspace or "killall Xorg gdm-binary")
- observe that X11 server restarts using the NVIDIA driver (big NVIDIA logo on startup)
- if needed, login as root and run "nvidia-settings" to setup dual-screen configuration, etc
Manual selection of monitor, video mode and resolution
Automatic selection of monitor and video mode usually works. When it does not, configure it manually:
- physically go to the computer
- login as root
- run "system-config-display".
- In the "hardware" tab, select monitor type: "generic LCD 1280x1024" or "generic LCD 1600x1200".
- In the "settings" tab, select "1280x1024" or "1600x1200" and "Thousands of colors".
- Press "ok", the display settings application should close.
- Logout, the new login window should use the new settings.
Finish installation
- log out and reboot the computer so that all the changes take effect
Obsolete items
- (do not do this) install all missing packages: /triumfcs/trshare/olchansk/linux/triumf-update/yum-everything.perl
- (SL5) If yum complains about unsigned java packages: cd ~; rpm -vh --upgrade /triumfcs/mirror/scientificlinux.org/5x/i386/updates/security/jdk*rpm /triumfcs/mirror/scientificlinux.org/5x/i386/updates/security/java*rpm
- (before FC5) remove the fancy screen savers
rm -f /usr/X11R6/lib/xscreensaver/*
rm -f /usr/bin/*.kss
- (before FC5) edit /etc/updatedb.conf, set "DAILY_UPDATE=yes"