Trac+SVN backup/transfer

Backups/transfers and recovery of Trac/SVN instances are implemented in the bash functions scm-backup-* (env:source:trunk/scm/scm-backup.bash). To examine these functions interactively, use the normal env discovery approach (Hierarchy of Bash Functions), for example:

env-
scm-backup-
scm-backup-<TAB>
type scm-backup-all
t scm-backup-trac      ## env- defines alias "t" for type
scm-backup-vi

Backups with scm-backup-all

Performs scm-backup-trac and scm-backup-repo for all instances/repositories under $SCM_FOLD/{repos,svn,tracs}. The SCM_FOLD is node-dependent: /home/scm on the dybsvn server, /var/scm on the env server. Such node-dependent details are defined in the local-* bash functions.

The backups are performed using the hotcopy techniques/scripts provided by the Trac and Subversion projects, and result in tarballs in dated folders beneath $SCM_FOLD/backup/$LOCAL_NODE, where LOCAL_NODE is e.g. dayabay or cms01: the node on which the instances reside.

Additional tasks are performed by scm-backup-all (a simplified sketch of the overall flow follows the list):

  1. $(svn-setupdir), which contains config details such as user lists, is backed up by scm-backup-folder into a separate tarball
  2. a digest of each tarball is made and the resulting 32 char hex code is stored in .dna sidecar files
  3. tarballs are purged by scm-backup-purge to retain a configured number
  4. locks are planted and cleared during backups
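
A simplified sketch of the flow for a single instance is shown below. The real scm-backup-* functions additionally handle locking, purging and looping over all instances; the temporary paths, tarball names and the use of md5sum for the .dna digest are assumptions for illustration.

# Illustrative only: minimal hotcopy-and-tarball sequence for one instance named "env".
name=env
stamp=$(date +%Y%m%d_%H%M%S)
dest=$SCM_FOLD/backup/$LOCAL_NODE/$stamp
tmp=/tmp/scm-backup/$stamp
mkdir -p "$dest" "$tmp/repos" "$tmp/tracs"

# hotcopy the live repository/instance with the tools from the Subversion and Trac projects
svnadmin hotcopy $SCM_FOLD/repos/$name $tmp/repos/$name
trac-admin $SCM_FOLD/tracs/$name hotcopy $tmp/tracs/$name

# tar the hotcopies into the dated backup folder
tar -C $tmp/repos -czf "$dest/$name.repo.tar.gz" $name
tar -C $tmp/tracs -czf "$dest/$name.trac.tar.gz" $name

# record a 32 char hex digest in a .dna sidecar file beside each tarball
for tgz in "$dest"/*.tar.gz ; do
    md5sum "$tgz" | cut -d' ' -f1 > "$tgz.dna"
done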

Offbox Transfers with scm-backup-rsync

First the SSH agent on the source node is checked with ssh--agent-check, then for each target node tag listed in $BACKUP_TAG an rsync command is composed and run. The target directory for each node is provided by an echoing bash function, scm-backup-dir:

[blyth@cms02 ~]$ scm-backup-dir       ## defaults to current node
/var/scm/backup
[blyth@cms02 ~]$ scm-backup-dir C     ## knows about other nodes
/data/var/scm/backup
[blyth@cms02 ~]$ scm-backup-dir N
/var/scm/backup
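
As a rough sketch, the transfer loop can be composed from these pieces along the lines shown below, under the assumption that $BACKUP_TAG holds a space-separated list of tags and that the rsync options shown are acceptable; the real scm-backup-rsync also plants locks and triggers the DNA checks described below.

# Illustrative only: push the local dated tarballs to each target tag in $BACKUP_TAG.
scm-backup-rsync-sketch(){
    ssh--agent-check || return 1                       # refuse to run without a usable agent
    local tag
    for tag in $BACKUP_TAG ; do
        local src=$(scm-backup-dir)/$LOCAL_NODE/       # local backup folder for this node
        local tgt=$(scm-backup-dir $tag)/$LOCAL_NODE/  # same layout on the target node
        rsync -e ssh -avz "$src" "$tag:$tgt"           # tag resolves via ~/.ssh/config
    done
}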

Locks are planted and cleared during transfers to avoid use of incomplete tarballs.

When the target account has the env functions installed, additional DNA checks are performed following the transfer. These recalculate the tarball digests on the target machine and compare the values with those written in the sidecar .dna files.
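
In essence such a check amounts to the following sketch; the single level of dated folders and the use of md5sum are assumptions matching the backup sketch above.

# Illustrative only: recompute each tarball digest and compare with its .dna sidecar.
scm-backup-dna-check-sketch(){
    local tgz
    for tgz in $(scm-backup-dir)/$LOCAL_NODE/*/*.tar.gz ; do
        local fresh=$(md5sum "$tgz" | cut -d' ' -f1)
        local stored=$(cat "$tgz.dna")
        [ "$fresh" = "$stored" ] || echo "DNA mismatch for $tgz"
    done
}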

The most problematic part of adding new nodes as backup targets is usually configuring the SSH connections that allow passwordless rsync transfers to be performed using SSH keys; see SSH Setup For Automated transfers.

Recovery using scm-recover-all

Requires a fromnode argument; recovers all Trac/SVN tarballs with scm-recover-repo and users with scm-recover-users, performs the ownership changes required by Apache, and synchronises the Trac instances with the corresponding SVN repositories using scm-backup-synctrac.
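
For a single instance the recovery amounts to something like the sketch below, assuming the tarball layout from the backup sketch above; the apache user/group and the exact trac-admin resync command are assumptions and vary with the installation.

# Illustrative only: restore one repository/instance pair from a dated tarball set.
name=env
fromnode=cms01
stamp=20130101_000000                                  # example dated folder, not a real backup
src=$SCM_FOLD/backup/$fromnode/$stamp

tar -C $SCM_FOLD/repos -xzf "$src/$name.repo.tar.gz"   # recreates $SCM_FOLD/repos/$name
tar -C $SCM_FOLD/tracs -xzf "$src/$name.trac.tar.gz"   # recreates $SCM_FOLD/tracs/$name

chown -R apache:apache $SCM_FOLD/repos/$name $SCM_FOLD/tracs/$name   # web server ownership
trac-admin $SCM_FOLD/tracs/$name resync                # command is Trac version dependent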

Distributed backup monitoring

Repeated incidents of failure to perform backups and tarball transfers for the Trac/SVN dybsvn, dybaux, env and heprez repositories for extended periods motivated development of a more robust distributed monitoring approach. The pre-existing self-monitoring approach was ineffective for many causes of failure, including the common one of failure to properly restart SSH agents after server reboots.

A distributed monitoring approach was implemented whereby the central server collects tarball information from all remote backup nodes into a central SQLite database and publishes the data as a web accessible JSON data file. Subsequently cron jobs on any node are able to access the JSON data file and check the state of the backup tarballs on all the backup nodes, for example checking the size and age of the last backup tarballs and sending email if notification is required. In this way the monitoring is made robust to the failure of the central server and the backup nodes. The only way for the distributed monitoring to fail to provide notification of problems is for all nodes to fail simultaneously.

The same JSON data files are used by monitoring web pages such as http://dayabay.ihep.ac.cn/e/scm/monitor/ihep/, where users' web browsers access the JSON data files and present them as time-series charts showing the backup history, using the HighCharts Javascript framework.

env:source:trunk/scm/monitor.py
server collection of tarball data and creation of JSON data file, invoked by scm-backup-monitor
env:source:trunk/scm/tgzmon.py
standalone monitoring of remote JSON data, invoked by scm-backup-tgzmon
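
For example, a cron entry on any node with the env functions installed can invoke the standalone check along the following lines; the schedule, the path used to source the env functions and the log location are assumptions.

# illustrative crontab entry: check the remote JSON tarball data each morning
42 6 * * * bash -c '. $HOME/env/env.bash ; env- ; scm-backup- ; scm-backup-tgzmon' > $HOME/cron-tgzmon.log 2>&1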

Adding a new target node for backups

The administrator of the source node will need to:

  1. create a new node tag in ~/.ssh/config with the nodename and user identity of the new target; an unused tag must be chosen, so check with local-vi to see which tags are already in use (an example stanza follows)
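
Such a stanza might look like the following; the tag X, hostname, user and key file are hypothetical placeholders.

# hypothetical ~/.ssh/config stanza for a new tag "X"
Host X
    HostName newbackup.example.org
    User backupuser
    IdentityFile ~/.ssh/id_rsa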

Node characterization

The target node administrator will need to update the env node characterization of the new node, using the local-vi function, and commit the changes into the env repository. The changes required are mostly just additional lines in case statements, providing for example the functions below (an illustrative fragment follows the list):

  1. local-scm-fold
  2. local-var-base used by local-scm-fold
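
The fragment below illustrates the kind of addition involved, under the assumption that these functions dispatch on a node tag via a case statement; the tag X and the /data/var path are hypothetical.

# hypothetical additions for a new node tagged "X"; existing entries elided
local-var-base(){
   case ${1:-$NODE_TAG} in
        X) echo /data/var ;;
        *) echo /var ;;
   esac
}
local-scm-fold(){
   echo $(local-var-base $*)/scm
}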

Placement of SSH keys

Source account public keys ~/.ssh/id_dsa.pub or ~/.ssh/id_rsa.pub need to be appended to the target account's ~/.ssh/authorized_keys2 on the target node. This affords access from the source account to the target account, allowing scm-backup-rsync to perform its transfers automatically.
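
One way to do this, assuming the new target tag is X and an RSA key is in use, is:

# append the source public key to the target account via the new tag, then verify
cat ~/.ssh/id_rsa.pub | ssh X 'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys2'
ssh X true     # should now succeed without a password prompt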

Add target tag to BACKUP_TAG of source node

Once the env working copy on the source node is updated to pick up the new target node characterization, the new backup node is configured by modifying the case statement in the local-backup-tag function, for example:
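
The addition amounts to an extra case entry along the lines of the hypothetical fragment below, assuming BACKUP_TAG holds a space-separated list of target tags; the tags shown are examples only.

# hypothetical fragment: a source node tagged C pushes backups to targets N and X
local-backup-tag(){
   case ${1:-$NODE_TAG} in
        C) echo "N X" ;;
        *) echo ""    ;;
   esac
}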