DEAP: Difference between revisions

From DaqWiki
daqwiki>Olchansk

Revision as of 09:41, 18 February 2013

Links

DAQ machines

  • deapdaqgw: gateway machine (DHCP for deap00, UPS, CDU, NAT)
  • deap00: main daq machine (storage, home directories, central services, etc)
  • deap01..05: A3818 daq machines
  • deap06.triumf.ca: temporary network gateway
  • deap07: spare A3818 daq machine (old deap00) (used for PCIe ADC DAQ)
  • deap08: spare deap00 machine
  • lxdeap01: VME daq machine
  • deapvme01..03: VME crate power supplies
  • deapups: DAQ UPS unit
  • deapcdu: DAQ power distribution unit
  • deapkvm8: 8-port IP KVM
  • mscb520: MSCB-ETH bridge

deapups connections

  • C13(f) : Switched Load 1 - N/C
  • C13(f) : Switched Load 2 - N/C
  • C13(f) : Switched Load 3 - N/C
  • C13(f) : Unswitched Load 4 - Rack Fan Left
  • C13(f) : Unswitched Load 4 - Rack Fan Centre
  • C13(f) : Unswitched Load 4 - Rack Fan Right
  • C13(f) : Unswitched Load 4 -
  • C13(f) : Unswitched Load 4 -
  • C19(f) : Unswitched Load 4 - CDU

UPS configuration

Tripp-lite management software

ssh deap00 /var/tripplite/poweralert/console/pal_console.sh

USB connections

  • lsusb -v | grep -i product
  • lsusb -v | grep -i serial

NUT UPS configuration

[ups1]
        driver = usbhid-ups
        port = auto
        desc = "ups1"
        serial = "2231ELCPS720300082"
[ups2]
        driver = usbhid-ups
        port = auto
        desc = "ups2"
        serial = "2211KW0PS733900093"
[ups3]
        driver = usbhid-ups
        port = auto
        desc = "ups3"
        serial = "2231ELCPS720300090"
  • restart drivers: /opt/nut/bin/upsdrvctl start
  • reload upsd: /opt/nut/sbin/upsd -c reload
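After restarting the drivers it can be useful to confirm that all three units answer. A minimal sketch, assuming the NUT client "upsc" is installed next to upsdrvctl in /opt/nut/bin and that upsd listens on localhost (neither is stated above; the function name is hypothetical):

```shell
# Assumption: upsc lives in /opt/nut/bin like upsdrvctl; override with UPSC=...
UPSC="${UPSC:-/opt/nut/bin/upsc}"

# Print one "name: status" line per UPS defined in the NUT configuration above.
ups_status_all() {
    for ups in ups1 ups2 ups3; do
        # ups.status is the standard NUT status variable (OL, OB, LB, ...)
        status=$("$UPSC" "$ups@localhost" ups.status 2>/dev/null) || status="unreachable"
        printf '%s: %s\n' "$ups" "$status"
    done
}
```

Run "ups_status_all" on the machine where upsd is running; a unit that does not answer is reported as "unreachable".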

deapcdu connections

 1 : DEAP00
 2 : DEAP01
 3 : DEAP02
 4 : DEAP03
 5 : DEAP04
 6 : DEAP05
 7 : DEAPVME02
 8 : DEAP07
------------
 9 : DEAP08
10 : SCB1
11 : SCB2
12 : n/c
13 : n/c
14 : DEAPDAQGW (temp)
15 : DEAPMPOD
16 : DEAPKVM8

deapcdu snmp

  • snmpwalk -v 2c -M +/home/deap/online/slow/fesnmp -m +Sentry3-MIB -c public deapcdu sentry3
  • snmpset -v 2c -M +/home/deap/online/slow/fesnmp -m +Sentry3-MIB -c write deapcdu outletControlAction.1.1.1 i 1 ### turn on outlet 1
  • snmpset -v 2c -M +/home/deap/online/slow/fesnmp -m +Sentry3-MIB -c write deapcdu outletControlAction.1.1.3 i 2 ### turn off outlet 3
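The two snmpset examples differ only in the outlet index and the action code (1 = on, 2 = off), so they can be wrapped in a small helper. This is a sketch: the function name is hypothetical, it only prints the command instead of running it, and it covers only the outletControlAction.1.1.N indices shown above (how outlets 9..16 are indexed is not stated here).

```shell
# Print the snmpset command that switches one deapcdu outlet.
# $1: outlet index N as used in outletControlAction.1.1.N
# $2: "on" or "off"
cdu_outlet_cmd() {
    case "$2" in
        on)  action=1 ;;   # same code as the "turn on outlet 1" example
        off) action=2 ;;   # same code as the "turn off outlet 3" example
        *)   echo "usage: cdu_outlet_cmd OUTLET on|off" >&2; return 1 ;;
    esac
    echo "snmpset -v 2c -M +/home/deap/online/slow/fesnmp -m +Sentry3-MIB -c write deapcdu outletControlAction.1.1.$1 i $action"
}
```

For example, "cdu_outlet_cmd 3 off" prints the command from the second snmpset bullet; review the printed command before running it by hand.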

Network configuration (TRIUMF)

DEAP DAQ machines are on the private network (see below).

The gateway to the TRIUMF network is the 1U machine deap06.triumf.ca, connected to the LADD-NIS cluster (deap account on ladd00).

Gateway services running on the gateway:

  • DHCP server for the 192.168.1.x network (/etc/hosts, /etc/dhcp/dhcpd.conf)
  • apache SSL/https proxy for MIDAS status page, ELOG, ganglia and nodeinfo (/etc/httpd/conf.d/ssl.conf, /etc/httpd/htpasswd)
  • NAT proxy from private network to the TRIUMF network (/etc/rc.local). Makes the internet accessible from deapNN machines.

Network configuration (DEAP)

The DEAP DAQ cluster is configured for standalone running with or without an internet connection.

(NB: Some internet functions are required: access to NTP for time synchronization and access to Linux package repositories to install packages, etc)

Network numbers

Network numbers are assigned by deapdaqgw and deap00 DHCP servers:

192.168.1.x (netmask 255.255.255.0): main private network
192.168.2.x: deap00a-deap01a connection
192.168.3.x: deap00b-deap02b connection
192.168.4.x: deap00c-deap03c connection
192.168.5.x: deap00d-deap04d connection

DHCP servers

The main DHCP server is deapdaqgw. ... A DHCP server is also running on deap00.

DEAP network nodes with statically configured IP addresses:

deapmpod : Wiener MPOD firmware does not support DHCP

Gateway machine

deapdaqgw is the gateway machine that provides internet access to the DEAP DAQ cluster.

  • NAT ("network address translation", see /etc/rc.local)
  • IP address assignment via /etc/hosts
  • DNS via dnsmasq serving contents of /etc/hosts and bridge to upstream DNS (configured in /etc/resolv.conf by upstream DHCP)
  • DHCP for all machines except deap01..deap05 via /etc/dhcpd/dhcpd.conf
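For readers unfamiliar with NAT on a Linux gateway, the rules typically look like the sketch below. This is not a copy of deapdaqgw:/etc/rc.local; the function name and the interface names passed to it are assumptions, and the function only prints the commands so they can be reviewed first.

```shell
# Print (do not execute) typical masquerading rules for a NAT gateway.
# $1: upstream (internet-facing) interface, $2: private-network interface
print_nat_rules() {
    wan="$1"; lan="$2"
    # let the kernel forward packets between the two interfaces
    echo "echo 1 > /proc/sys/net/ipv4/ip_forward"
    # rewrite the source address of outgoing private-network traffic
    echo "iptables -t nat -A POSTROUTING -o $wan -j MASQUERADE"
    # allow outgoing connections, and only replies coming back in
    echo "iptables -A FORWARD -i $lan -o $wan -j ACCEPT"
    echo "iptables -A FORWARD -i $wan -o $lan -m state --state RELATED,ESTABLISHED -j ACCEPT"
}
```

For example, "print_nat_rules eth0 eth1" prints the rules for a gateway whose upstream link is eth0 and whose private 192.168.1.x network is on eth1 (both names are placeholders).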

NIS configuration

Usernames, passwords and hostnames are distributed using NIS:

  • domain name: DEAP-NIS
  • deap00 is the master server
  • there are no secondary servers
  • hostnames are distributed using NIS (from deap00:/etc/hosts, MUST MATCH deap06:/etc/hosts!)
  • to solve the chicken-and-egg problem, the deap00 IP address has to be listed in each machine's /etc/hosts (MUST MATCH deap06 and deap00 /etc/hosts!). On SL6.2+ NIS broadcast does not work, so deap00 also has to be listed in each machine's /etc/yp.conf. Note also that NFS filesystems are mounted before NIS is started.
  • also NIS has to be listed in front of DNS in the "hosts:" entry of /etc/nsswitch.conf
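The ordering requirement in /etc/nsswitch.conf is easy to get wrong after a reinstall, so it can be checked with a few lines of shell. A minimal sketch (the function name is hypothetical; it only inspects the "hosts:" line and changes nothing):

```shell
# Succeed if "nis" appears before "dns" on the "hosts:" line of the given file
# (default: /etc/nsswitch.conf). Fail if dns comes first or nis is missing.
nis_before_dns() {
    awk '$1 == "hosts:" {
        for (i = 2; i <= NF; i++) {
            if ($i == "nis") { found = 1; exit }   # nis seen first: good
            if ($i == "dns") { exit }              # dns seen first: bad
        }
    }
    END { exit !found }' "${1:-/etc/nsswitch.conf}"
}
```

Usage: "nis_before_dns || echo 'fix hosts: entry in /etc/nsswitch.conf'" on each deapNN machine.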

DNS kludge:

  • normally DNS would be used to distribute IP addresses and hostnames to the DHCP server, to deap00 and to other deap machines. But we do not have a private DNS server and the TRIUMF DNS server has the wrong IP addresses for deap machines (142.90.x.x).
  • deap06 DHCP is telling all machines to use the TRIUMF DNS server (to resolve internet addresses - google, etc). To avoid confusion between local deap00, etc hostnames and deap00, etc hostnames from TRIUMF, /etc/nsswitch.conf "hosts:" entry has to list "nis" before "dns".
  • hopefully the deap00, etc hostnames will be resolved correctly by the SNOlab DNS servers and all this kludging can go away.

System monitoring tools:

  • ganglia
  • triumf_nodeinfo
  • konstantin's ganglia packages (monitor_nfs, ganglia sensors, top, etc) - To install/update: yum --disablerepo="*" --enablerepo=konstantin update
  • diskscrub

Backups

  • backups of Linux images:
    • backups of Linux images are written to deap00:/data/root/backups by a cron job on deap00 (deap00:/etc/cron.d/backup.lxdaq.cron and deap00:~root/backup.lxdaq)
  • backups of home directories: NONE
  • backups of data disks: NONE

Creating boot disks for deap01..deap05

mirrored 16GB USB Flash disks

See Cloning_raid1_boot_disks

V7865 single 8GB/16GB USB Flash disks

The V7865 VME processors use single USB flash disks. To create the boot disks, follow instructions for #64GB_SSD_boot_disks, but clone "lxdeap01" instead of "deap01".

Single 8/16GB USB and 64GB SSD boot disks

  • attach SSD disk to any of the deap01..deap05 machines (SATA+power)
  • login as root to that machine
  • "fdisk -l" to identify which /dev/sdX disk it is
  • cd /data/root/backups
  • ./clone.perl ./deap01 /dev/sdX
  • check that the script completes successfully and prints "Done. You can remove /dev/sdX and try to boot from it."
  • disconnect the disk
  • connect to new machine, try to boot from it
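Because clone.perl writes to whatever /dev/sdX it is given, picking the wrong device can overwrite a disk that is in use. A small guard can be run first. This is a sketch, not part of clone.perl: the function name is hypothetical and it only checks whether any partition of the target appears in the mount table.

```shell
# Succeed only if no partition of the target device is currently mounted.
# $1: target device (e.g. /dev/sdX)
# $2: mount table to inspect (default: /proc/mounts)
target_is_unmounted() {
    dev="$1"; mounts="${2:-/proc/mounts}"
    if grep -q -- "^${dev}" "$mounts"; then
        echo "refusing: $dev has mounted partitions" >&2
        return 1
    fi
    return 0
}
```

Usage, from /data/root/backups: "target_is_unmounted /dev/sdX && ./clone.perl ./deap01 /dev/sdX".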