Amanda: Difference between revisions

From DaqWiki

Latest revision as of 18:19, 27 January 2021

Welcome

The AMANDA disk backup server is operated by the TRIUMF DAQ group as a backup and archiving service for data acquisition, detector facility and other computer systems in the ST department and Science division, and for other TRIUMF users.

Backups are stored on a 20TB disk array (RAID6/XFS). Backups run on a 10 day cycle: full backup, level 1 incremental, level 2 incremental, and so on, then the next full backup starts the cycle again. Generally the system stores 2 full backups (the last full backup and the one before it) plus the incremental backups between them. Older full backups and older incrementals are kept as space permits.

Guaranteed data retention time is 10 days; typical data retention time is 1 month.

Periodically, amanda backups are archived to offline media (SDLT tapes, USB hard drives), which are stored in a secure location and kept permanently.

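AMANDA information

  • http://www.amanda.org - Amanda project home page
  • http://sourceforge.net/projects/amanda - Amanda on sourceforge
  • http://www.zmanda.com - Amanda on zmanda.com
  • http://wiki.zmanda.com - Amanda wiki
  • https://github.com/zmanda/amanda - Amanda git repository on github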

AMANDA status (automatically updated)

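Use username: amanda, password: amanda

  • summary of virtual tape usage: https://daq.triumf.ca/~daqweb/amanda/recycle.txt
  • latest backup reports: summary (https://daq.triumf.ca/~daqweb/amanda/summary.txt), last run (https://daq.triumf.ca/~daqweb/amanda/reports/activity_last.txt), last amreport (https://daq.triumf.ca/~daqweb/amanda/reports/log.txt), errors (https://daq.triumf.ca/~daqweb/amanda/errors.txt), summary of errors (https://daq.triumf.ca/~daqweb/amanda/errorssummary.txt)
  • reports: per host (https://daq.triumf.ca/~daqweb/amanda/index_perhost.html), per day (https://daq.triumf.ca/~daqweb/amanda/index_perdate.html), amreport per day (https://daq.triumf.ca/~daqweb/amanda/index_peramreport.html)
  • all backup files: https://daq.triumf.ca/~daqweb/amanda/all_files.txt
  • top users: files (https://daq.triumf.ca/~daqweb/amanda/top_files.txt), file systems (https://daq.triumf.ca/~daqweb/amanda/top_fs.txt), hosts (https://daq.triumf.ca/~daqweb/amanda/top_hosts.txt)
  • erased virtual tapes: https://daq.triumf.ca/~daqweb/amanda/erase.txt
  • amcheck: https://daq.triumf.ca/~daqweb/amanda/amcheck.txt
  • amstatus: https://daq.triumf.ca/~daqweb/amanda/amstatus.txt
  • amoverview: https://daq.triumf.ca/~daqweb/amanda/amoverview.txt
  • last amdump: https://daq.triumf.ca/~daqweb/amanda/amdump.txt
  • last amflush: https://daq.triumf.ca/~daqweb/amanda/amflush.txt
  • virtual tape slots: https://daq.triumf.ca/~daqweb/amanda/vtapes.txt
  • amadmin balance: https://daq.triumf.ca/~daqweb/amanda/amadmin_balance.txt
  • amadmin disklist: https://daq.triumf.ca/~daqweb/amanda/amadmin_disklist.txt
  • amadmin find: https://daq.triumf.ca/~daqweb/amanda/amadmin_find.txt
  • amadmin info: https://daq.triumf.ca/~daqweb/amanda/amadmin_info.txt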

Adding backup clients

Note: currently only Linux backup clients are documented. macOS is known to work, but an amanda client package does not seem to be generally available.

  • prepare the backup client machine:
  • Ubuntu LTS 20.04 - see wiki page Ubuntu, section "Install amanda client"
  • RHEL7/SL7/CentOS7 - see wiki page SLinstall, section "Enable AMANDA backups (CentOS7)"
  • RHEL/SL/4/5/6:
      ssh root@client
      edit ~amanda/.amandahosts, add line "amanda.triumf.ca amanda"
      edit /etc/hosts.allow, add line "amandad: amanda.triumf.ca"
      chkconfig xinetd on
      chkconfig amanda on
      service xinetd reload
   # if a local firewall is running (before RHEL/SL/CentOS7)
   # Allow input from the backup server over UDP to the amanda port and on any unprivileged TCP port
   iptables -I INPUT 1 -m state  --state NEW   -s 142.90.100.196  -p udp    --dport amanda        -j ACCEPT
   iptables -I INPUT 1 -m state  --state NEW   -s 142.90.100.196  -p tcp    --dport 1024:65535    -j ACCEPT
   # Make the firewall changes permanent
   service iptables save
  • send a request to the DAQ group with the following information: the full hostname of the machine (e.g. titan00.triumf.ca), the list of filesystems to back up (e.g. /home1, /etc) and a contact email address that will receive any error messages from failed backups.
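
The two file edits in the client preparation can be made repeatable with a small helper that appends a line only when it is not already present. This is a minimal sketch, not part of the TRIUMF setup or of amanda itself: the function name ensure_line is hypothetical, and it assumes the files use one entry per line.

```shell
#!/bin/sh
# ensure_line FILE LINE
# Append LINE to FILE unless an identical line is already there.
# Hypothetical helper for illustration; not shipped with amanda.
ensure_line() {
    file=$1
    line=$2
    # -q: quiet, -x: match the whole line, -F: treat LINE as a fixed string
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

Run as root on the client, the calls would be ensure_line ~amanda/.amandahosts "amanda.triumf.ca amanda" and ensure_line /etc/hosts.allow "amandad: amanda.triumf.ca". Running them a second time leaves both files unchanged.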

Removing backup clients

Send a request to the DAQ group with the name of the machine or filesystem to be removed, then stop the amanda services on the client machine.

NOTE: once a client is removed from the amanda backup list, normal data recycling will eventually permanently delete the backup data from amanda. Backup data saved on amanda archive tapes is preserved forever.

Troubleshooting failing clients

If an amanda client is failing backups, log in to the client as root and try this:

  • Run "df", it should not hang
  • The system disk should have some free space
  • There should be no stuck amanda processes: "ps -efw | grep amanda" should return nothing. If you see any running amanda processes, kill them.
  • xinetd should be running, do "service xinetd restart" if unsure
  • (on Solaris) inetd should be running, do "inetd -s" if unsure
  • Examine the log files in /var/log/amanda. Do "ls -ltr" to see the latest files. If any log files have size zero, the system disk may have been 100% full at that time, preventing amanda from working.
  • On the amanda server, as amanda, run "amcheck -c daily host"
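
Some of the checks above can be wrapped in small shell functions. The following is a sketch under assumptions: the function names are hypothetical, the log directory defaults to /var/log/amanda as named above, and disk_used_percent assumes a filesystem name without spaces in the "df -P" output.

```shell
#!/bin/sh
# Hypothetical helpers for the client-side checks; not part of amanda.

# List zero-size files under the amanda log directory.
# An empty log often means the disk was full when amanda ran.
empty_logs() {
    find "${1:-/var/log/amanda}" -type f -size 0
}

# Print the used-space percentage (without the % sign) of the
# filesystem that holds the given path (default: /).
disk_used_percent() {
    df -P "${1:-/}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# List running amanda processes. The [a] in the pattern keeps grep
# from matching its own command line.
amanda_procs() {
    ps -efw | grep '[a]manda'
}
```

If empty_logs prints anything, or disk_used_percent is close to 100, free up space on the system disk before retrying the backup.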

Restoring data from amanda

To recover files from amanda:

  • examine the list of all available backup files: https://daq.triumf.ca/~daqweb/amanda/all_files.txt
  • identify the backup files to be restored. For restoration of deleted files, this would be the latest full backup (level 0), the latest level 1 backup, the latest level 2 backup, etc.
  • if this is an emergency (you need the files right away, or you need files from a backup older than the latest full backup), contact Konstantin Olchanski (http://www.triumf.ca/profiles/3103) by email or by SMS to the cell phone number listed there.
  • send a request to the DAQ group asking to retrieve the amanda backup files. Typically these files are copied with rsync or scp from amanda to disk storage on a client machine.
  • unpack the amanda backup files (for a full restoration, untar the full backup first, then the incremental backup files consecutively, in order of increasing level):
dd if=00004.titan00._home1.0 bs=32k skip=1 | tar xzvf -
dd if=00055.titan00._home1.1 bs=32k skip=1 | tar xzvf -
dd if=00114.titan00._home1.2 bs=32k skip=1 | tar xzvf -
etc...
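
The order of unpacking matters, so it can help to sort the retrieved files by backup level before running the dd and tar pipeline. The sketch below assumes that the backup level is the text after the last dot in the file name, as in the examples above; the function names sort_by_level and unpack_dumps are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers; they assume names like 00004.titan00._home1.0,
# where the final ".0" is the backup level.

# Print the given dump file names, one per line, sorted by level.
sort_by_level() {
    for f in "$@"; do
        # ${f##*.} is everything after the last dot: the backup level
        printf '%s %s\n' "${f##*.}" "$f"
    done | sort -n -k1,1 | cut -d' ' -f2-
}

# Unpack the dumps into the current directory, lowest level first.
# dd skips the 32k amanda header; tar unpacks the gzip-compressed data.
unpack_dumps() {
    sort_by_level "$@" | while IFS= read -r f; do
        dd if="$f" bs=32k skip=1 | tar xzvf - || return 1
    done
}
```

For example, running unpack_dumps 00114.titan00._home1.2 00004.titan00._home1.0 00055.titan00._home1.1 in an empty restore directory unpacks the level 0 file first, then level 1, then level 2.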

Adding a new client on the amanda server

  • ssh root@amanda
  • edit ~amanda/daily/disklist,
  • add entry "clienthostname /home1 comp-user-tar # clientusername@triumf.ca"
  • su - amanda
  • run "amcheck -c daily"
  • if amcheck reports errors, follow the instructions for troubleshooting failing clients
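
When a client has several filesystems, the disklist lines can be generated rather than typed. This is a sketch only: the function name disklist_entries is hypothetical, and it assumes the "comp-user-tar" dump type and the "# contact" comment convention shown in the entry above.

```shell
#!/bin/sh
# disklist_entries HOST CONTACT FILESYSTEM...
# Print one disklist line per filesystem in the format used above:
#   host filesystem comp-user-tar # contact
# Hypothetical helper; not part of amanda.
disklist_entries() {
    host=$1
    contact=$2
    shift 2
    for fs in "$@"; do
        printf '%s %s comp-user-tar # %s\n' "$host" "$fs" "$contact"
    done
}
```

The output can be reviewed first and then appended to ~amanda/daily/disklist, followed by "amcheck -c daily" as described above.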

Instructions for using bsdtcp authentication (Obsolete)

NOTE: Information in this section is obsolete

   The default "bsd" authentication uses the UDP transport and sometimes has problems
   in the presence of packet loss on the network. In this case, try the "bsdtcp"
   authentication that uses the TCP transport.
  • On the client:
  • create file /etc/xinetd.d/amanda-bsdtcp:
    # default: off 
    # description:  The client for the Amanda backup system.\ 
    #               This must be on for systems being backed up\ 
    #               by Amanda. 
    service amanda 
    { 
            disable = no 
            socket_type             = stream 
            protocol                = tcp 
            wait                    = no 
            user                    = amandabackup 
            group                   = disk 
            server                  = /usr/sbin/amandad 
    # Configure server_args for the authentication type you will be using, 
    # and the services you wish to allow the amanda server and/or recovery 
    # clients to use. 
    # 
    # Change the -auth= entry to reflect the authentication type you use. 
    # Add amindexd to allow recovery clients to access the index database. 
    # Add amidxtaped to allow recovery clients to access the tape device. 
            server_args             = -auth=bsdtcp amdump 
    } 
        
  • chkconfig amanda off
  • chkconfig amanda-bsdtcp on
  • service xinetd reload
  • On the server:
  • in the disklist, replace dump type "comp-user-tar" with "bsdtcp-comp-user-tar"
  • su - amanda
  • amcheck -c daily ladd03

Restoring data from archive USB HDDs

Archive USB HDDs contain an rsync copy of amanda system files and of the latest full backups. Incremental backups are not archived.

To restore the data, mount the USB disk *read only* and identify the backup file (use find . | grep clienthostname). Copy the backup file to local storage, then unmount and disconnect the archive disk. Recover individual files per the instructions above (dd if=00004.titan00._home1.0 bs=32k skip=1 | tar xzvf -).
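
The search step can be written as a small function. This is a sketch: the function name find_client_backups is hypothetical, and the device name and mount point in the comments are placeholders that depend on the machine the archive disk is attached to.

```shell
#!/bin/sh
# find_client_backups MOUNTPOINT HOSTNAME
# List files under MOUNTPOINT whose path contains HOSTNAME
# (matched as a fixed string, not as a regular expression).
find_client_backups() {
    find "$1" -type f | grep -F -- "$2"
}

# Typical sequence around it (device and paths are placeholders):
#   mount -o ro /dev/sdX1 /mnt/archive
#   find_client_backups /mnt/archive titan00
#   cp /mnt/archive/<path to backup file> /home1/restore/
#   umount /mnt/archive
```

Mounting with "-o ro" ensures that nothing on the archive disk is modified while the backup file is located and copied.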

Restoring data from archive SDLT tapes

  • AMANDA backup tapes are made using "tar -M" from amanda:~root/backup.sh.
  • The catalog of these tapes, the output of "tar tvf", is created by running "verify.sh" and manually renaming it to amanda-backup-yyyy-mm-dd.txt.
  • To restore any file, look in the catalog to find out which tape(s) contains it.
  • a) If the file is contained on a single tape, restore it using "tar -b 2048 -xvf /dev/nst0 file1..." as usual.
  • b) If the file is split between several tapes, load the tapes into the tape robot into consecutive slots, then follow the example in ~root/example-restore.sh (run "tar -C /home1/restore -M -b 2048 --new-volume-script="/root/loadNextTape.perl" --checkpoint -xvf /dev/nst0 amanda/daily/vtapes/79/data/00257.dork._home1.0")
  • This restores the amanda compressed backup file. Unpack it by running "dd if=00257.dork._home1.0 bs=32k skip=1 | tar xzvf -"