
DBSRV

DybPython.dbsrv

dbsrv : MySQL Server Utilities

A more admin-centric version of the sibling db.py with advanced features, including:

  • on-server optimizations such as select ... into outfile, taking advantage of the situation where the mysql client and server are on the same node (see the sketch after this list)
  • partitioned dump/load for dealing with very large tables and incremental backups
  • implicit DB addressing without a ~/.my.cnf section, allowing multiple databases on the same server to be handled via comma-delimited names or regular expressions
  • despite coming from NuWa it does not need the NuWa environment; a system python with MySQLdb is sufficient
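
The on-server optimization in the first bullet can be pictured with a minimal sketch (Python with MySQLdb; the table and output path are illustrative, not the exact SQL dbsrv.py emits). The MySQL server process writes the CSV file itself, which is why this only works when client and server share a node and why the target directory must be writable by the mysql user:

import os, MySQLdb

conn = MySQLdb.connect(read_default_file=os.path.expanduser("~/.my.cnf"),
                       read_default_group="tmp_ligs_offline_db_0")
cur = conn.cursor()
# the server, not this client, writes /tmp/DqChannel.csv
cur.execute("SELECT * FROM DqChannel WHERE SEQNO <= 100 "
            "INTO OUTFILE '/tmp/DqChannel.csv' "
            "FIELDS TERMINATED BY ',' LINES TERMINATED BY '\\n'")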

TODO

  1. checking the digests on the target and sending notification emails
  2. test dumplocal when partitionsize is an exact factor of table size
  3. warnings or asserts when using partitioned dumplocal with disparate table sizes

Usage

./dbsrv.py tmp_ligs_offline_db_0 databases
./dbsrv.py tmp_ligs_offline_db_0 tables
./dbsrv.py tmp_ligs_offline_db_0 dumplocal  --where "SEQNO < 100" 

Similar to db.py, the first argument can be a ~/.my.cnf section name. Unlike db.py, it can also simply be a database name which does not have a corresponding config section.

In this implicit case the other connection parameters are obtained from the so-called home section. Normally the home section is “loopback”, indicating an on-server connection. The home section must point to the information_schema database.

When the --home option is used, databases on remote servers can be accessed without having config sections for them all.
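
A minimal sketch of how the implicit addressing can be thought of (the helper connection_params is hypothetical; the actual dbsrv.py logic may differ): when no config section exists for the database name, the home section supplies the connection parameters and only the database name is swapped.

import os
from ConfigParser import ConfigParser   # Python 2, matching the era of this tool

def connection_params(name, home="loopback", path="~/.my.cnf"):
    """Return connect parameters for name, falling back to the home section."""
    cfg = ConfigParser()
    cfg.read(os.path.expanduser(path))
    if cfg.has_section(name):              # explicit config section wins
        return dict(cfg.items(name))
    params = dict(cfg.items(home))         # implicit: borrow host/user/password from home
    params["database"] = name              # only the database name differs
    return params

print(connection_params("tmp_ligs_offline_db_0"))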

Comparing DB via partitioned dump

Three table dumps skipping the crashed table in order to compare:

  • dybdb1_ligs.tmp_ligs_offline_db_dybdb1 original on dybdb1
  • dybdb2_ligs.channelquality_db_dybdb2 recovered on dybdb2
  • loopback.channelquality_db_belle7 recovered onto belle7 from hotcopy created on belle1

Invoked from cron for definiteness, and for the ability to leave them running for a long time:

07 17 * * * ( $DYBPYTHON_DIR/dbsrv.py -t DqChannel,DqChannelVld,DqChannelStatusVld --home dybdb1_ligs tmp_ligs_offline_db_dybdb1 dumplocal /tmp/cq/tmp_ligs_offline_db_dybdb1  --partition --partitioncfg 10000,0,33 ) >  $CRONLOG_DIR/dbsrv_dump_tmp_ligs_offline_db_dybdb1.log 2>&1 
52 18 * * * ( $DYBPYTHON_DIR/dbsrv.py -t DqChannel,DqChannelVld,DqChannelStatusVld --home dybdb2_ligs channelquality_db_dybdb2 dumplocal /tmp/cq/channelquality_db_dybdb2 --partition --partitioncfg 10000,0,33 ) > $CRONLOG_DIR/dbsrv_dump_channelquality_db_dybdb2.log 2>&1 
28 20 * * * ( $DYBPYTHON_DIR/dbsrv.py -t DqChannel,DqChannelVld,DqChannelStatusVld --home loopback channelquality_db_belle7 dumplocal /tmp/cq/channelquality_db_belle7 --partition --partitioncfg 10000,0,33 ) > $CRONLOG_DIR/dbsrv_dump_channelquality_db_belle7.log 2>&1 

Warning

--partitioncfg has now been split into --partitionsize and --partitionrange

Dump speed:

  1. remote dumps from dybdb1/dybdb2 to belle7 take approx 165s for each chunk. Thus ~90min for all.
  2. local dumps on belle7 take approx 20s for each chunk. Thus ~11min for all.

diffing the dumped partitions

For the first two, all but the partial chunk match.

The range of partition dirs to diff is controlled by an envvar:

[blyth@belle7 DybPython]$ RANGE=0,10 ./diff.py  /tmp/cq/tmp_ligs_offline_db_dybdb1/10000 /tmp/cq/channelquality_db_dybdb2/10000   
[blyth@belle7 DybPython]$ RANGE=10,20 ./diff.py  /tmp/cq/tmp_ligs_offline_db_dybdb1/10000 /tmp/cq/channelquality_db_dybdb2/10000   
[blyth@belle7 DybPython]$ RANGE=20,30 ./diff.py  /tmp/cq/tmp_ligs_offline_db_dybdb1/10000 /tmp/cq/channelquality_db_dybdb2/10000   
[blyth@belle7 DybPython]$ RANGE=30,33 ./diff.py  /tmp/cq/tmp_ligs_offline_db_dybdb1/10000 /tmp/cq/channelquality_db_dybdb2/10000      ## see diff.py for the output from these

[blyth@belle7 DybPython]$ RANGE=0,10 ./diff.py  /tmp/cq/channelquality_db_belle7/10000 /tmp/cq/channelquality_db_dybdb2/10000   
[blyth@belle7 DybPython]$ RANGE=10,20 ./diff.py  /tmp/cq/channelquality_db_belle7/10000 /tmp/cq/channelquality_db_dybdb2/10000   
[blyth@belle7 DybPython]$ RANGE=20,30 ./diff.py  /tmp/cq/channelquality_db_belle7/10000 /tmp/cq/channelquality_db_dybdb2/10000   
[blyth@belle7 DybPython]$ RANGE=30,33 ./diff.py  /tmp/cq/channelquality_db_belle7/10000 /tmp/cq/channelquality_db_dybdb2/10000   

Oops, a difference, but it is just different formatting of 0.0001 vs 1e-04:
[blyth@belle7 DybPython]$  RANGE=10,20 ./diff.py  /tmp/cq/channelquality_db_belle7/10000 /tmp/cq/channelquality_db_dybdb2/10000   
2013-06-07 17:58:06,933 __main__ INFO     rng ['10', '11', '12', '13', '14', '15', '16', '17', '18', '19'] 
2013-06-07 17:58:26,526 __main__ INFO     diff -r --brief /tmp/cq/channelquality_db_belle7/10000/10 /tmp/cq/channelquality_db_dybdb2/10000/10 => 0 
2013-06-07 17:58:44,896 __main__ INFO     diff -r --brief /tmp/cq/channelquality_db_belle7/10000/11 /tmp/cq/channelquality_db_dybdb2/10000/11 => 0 
2013-06-07 17:59:04,360 __main__ INFO     diff -r --brief /tmp/cq/channelquality_db_belle7/10000/12 /tmp/cq/channelquality_db_dybdb2/10000/12 => 0 
2013-06-07 17:59:22,531 __main__ INFO     diff -r --brief /tmp/cq/channelquality_db_belle7/10000/13 /tmp/cq/channelquality_db_dybdb2/10000/13 => 0 
2013-06-07 17:59:42,205 __main__ INFO     diff -r --brief /tmp/cq/channelquality_db_belle7/10000/14 /tmp/cq/channelquality_db_dybdb2/10000/14 => 0 
2013-06-07 18:00:00,385 __main__ INFO     diff -r --brief /tmp/cq/channelquality_db_belle7/10000/15 /tmp/cq/channelquality_db_dybdb2/10000/15 => 0 
2013-06-07 18:00:20,000 __main__ INFO     diff -r --brief /tmp/cq/channelquality_db_belle7/10000/16 /tmp/cq/channelquality_db_dybdb2/10000/16 => 0 
2013-06-07 18:00:38,198 __main__ INFO     diff -r --brief /tmp/cq/channelquality_db_belle7/10000/17 /tmp/cq/channelquality_db_dybdb2/10000/17 => 0 
2013-06-07 18:00:38,704 __main__ INFO     diff -r --brief /tmp/cq/channelquality_db_belle7/10000/18 /tmp/cq/channelquality_db_dybdb2/10000/18 => 1
Files /tmp/cq/channelquality_db_belle7/10000/18/DqChannel.csv and /tmp/cq/channelquality_db_dybdb2/10000/18/DqChannel.csv differ
 
2013-06-07 18:00:56,602 __main__ INFO     diff -r --brief /tmp/cq/channelquality_db_belle7/10000/19 /tmp/cq/channelquality_db_dybdb2/10000/19 => 0 
[blyth@belle7 DybPython]$ 
[blyth@belle7 DybPython]$ 
[blyth@belle7 DybPython]$ 
[blyth@belle7 DybPython]$ diff /tmp/cq/channelquality_db_belle7/10000/18/DqChannel.csv  /tmp/cq/channelquality_db_dybdb2/10000/18/DqChannel.csv 
1196930c1196930
< 186235,2,28473,7,67175938,0.0001,7.35714,3.39868,-1,-1
---
> 186235,2,28473,7,67175938,1e-04,7.35714,3.39868,-1,-1
...

Commands

summary

Provides a summary of table counts and update times in all selected databases. The DB names are selected by string arguments that are either comma-delimited lists of names or regular expressions.

./dbsrv.py               tmp_ligs_offline_db_\d summary           

           # local home, requires "loopback" config section pointing to information_schema DB

./dbsrv.py --home dybdb1 tmp_\S* summary                          

           # remote home,  requires "dybdb1" config section pointing to information_schema DB
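
A sketch of how the comma-delimited or regexp selection could resolve against information_schema (the helper select_databases is hypothetical, not part of dbsrv.py):

import os, re, MySQLdb

def select_databases(pattern, home="loopback"):
    conn = MySQLdb.connect(read_default_file=os.path.expanduser("~/.my.cnf"),
                           read_default_group=home)
    cur = conn.cursor()
    cur.execute("SELECT schema_name FROM information_schema.schemata")
    names = [row[0] for row in cur.fetchall()]
    if "," in pattern:                         # comma-delimited list of DB names
        return [n for n in names if n in pattern.split(",")]
    rx = re.compile(pattern + "$")             # otherwise treat the argument as a regexp
    return [n for n in names if rx.match(n)]

print(select_databases(r"tmp_ligs_offline_db_\d"))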

TODO:

Check the handling of section names that are the same as DB names on different nodes, as the section config will trump the dbname? BUT home config host matching should trip asserts?

dumplocal

The DB tables are dumped as .csv files together with separate .schema files containing the table creation SQL. Without a directory argument the dumps are written beneath the directory controlled by --backupfold, such as /var/dbbackup/dbsrv.

[blyth@belle7 DybPython]$ ./dbsrv.py  tmp_ligs_offline_db_0 dumplocal --where 'SEQNO <= 100'    
2013-06-13 16:49:38,152 __main__ INFO     partition_dumplocal___ SEQNO <= 100 writing /var/dbbackup/dbsrv/belle7.nuu.edu.tw/tmp_ligs_offline_db_0/DqChannel.csv 
2013-06-13 16:49:38,578 __main__ INFO     partition_dumplocal___ SEQNO <= 100 writing /var/dbbackup/dbsrv/belle7.nuu.edu.tw/tmp_ligs_offline_db_0/DqChannel.csv took  0.39 seconds 
...

[blyth@belle7 DybPython]$ ./dbsrv.py  tmp_ligs_offline_db_0 dumplocal /tmp/check/tmp_ligs_offline_db_0 --where 'SEQNO <= 100' 
2013-06-13 16:50:49,003 __main__ WARNING  using basedir /tmp/check/tmp_ligs_offline_db_0 different from standard /var/dbbackup/dbsrv/belle7.nuu.edu.tw/tmp_ligs_offline_db_0 
2013-06-13 16:50:49,031 __main__ INFO     partition_dumplocal___ SEQNO <= 100 writing /tmp/check/tmp_ligs_offline_db_0/DqChannel.csv 
2013-06-13 16:50:49,203 __main__ INFO     partition_dumplocal___ SEQNO <= 100 writing /tmp/check/tmp_ligs_offline_db_0/DqChannel.csv took  0.17 seconds 
...  

Warning

When there are databases of the same name on multiple nodes it is useful to include the name of the node in the section name

loadlocal

When loading into a database that is yet to be created, use the --DB_DROP_CREATE option:

[blyth@belle7 DybPython]$ ./dbsrv.py tmp_ligs_offline_db_5 loadlocal ~/tmp_ligs_offline_db_0  -l debug --DB_DROP_CREATE 

Typically a database name change is needed when loading; in this case the source directory and new section name must be given:

[blyth@belle7 DybPython]$ ./dbsrv.py tmp_ligs_offline_db_50 loadlocal /var/dbbackup/dbsrv/belle7.nuu.edu.tw/tmp_ligs_offline_db_0 --DB_DROP_CREATE  
DROP and reCREATE database tmp_ligs_offline_db_50 loosing all tables contained ? Enter "YES" to proceed : YES
2013-06-13 16:58:41,491 __main__ WARNING  using basedir /var/dbbackup/dbsrv/belle7.nuu.edu.tw/tmp_ligs_offline_db_0 different from standard /var/dbbackup/dbsrv/belle7.nuu.edu.tw/tmp_ligs_offline_db_50 
2013-06-13 16:58:41,499 __main__ WARNING  creating table DqChannel from schema file /var/dbbackup/dbsrv/belle7.nuu.edu.tw/tmp_ligs_offline_db_0/DqChannel.schema 
...

partitioned loadlocal

NB when restoring a name change is needed, so it is necessary to specify the source directory as an argument:

[root@cms01 DybPython]# dbsrv channelquality_db_restored loadlocal /data/var/dbbackup/dbsrv/dybdb2.ihep.ac.cn/channelquality_db_dybdb2 --partition --extract -l debug --DB_DROP_CREATE  -C
                        ##  initial run, creating the DB  from 32 partitions took ~100 min

[root@cms01 DybPython]# dbsrv channelquality_db_restored loadlocal /data/var/dbbackup/dbsrv/dybdb2.ihep.ac.cn/channelquality_db_dybdb2 --partition --extract  -K
                        ##  quick re-run, notices nothing to do and completes in a few seconds

[blyth@cms01 ~]$ type dbsrv    # function to nab the NuWa python MySQLdb, as yum is being uncooperative on cms01
dbsrv is a function
dbsrv () 
{ 
   local python=/data/env/local/dyb/trunk/external/Python/2.7/i686-slc4-gcc34-dbg/bin/python;
   export PYTHONPATH=/data/env/local/dyb/trunk/NuWa-trunk/../external/mysql_python/1.2.3_mysql5.0.67_python2.7/i686-slc4-gcc34-dbg/lib/python2.7/site-packages;
   LD_LIBRARY_PATH=/data/env/local/dyb/trunk/NuWa-trunk/../external/mysql/5.0.67/i686-slc4-gcc34-dbg/lib/mysql:$LD_LIBRARY_PATH;
   LD_LIBRARY_PATH=/data/env/local/dyb/trunk/NuWa-trunk/../external/mysql_python/1.2.3_mysql5.0.67_python2.7/i686-slc4-gcc34-dbg/lib:$LD_LIBRARY_PATH;
   export LD_LIBRARY_PATH;
   $python -c "import MySQLdb";
   $python ~blyth/DybPython/dbsrv.py $*
}

A test run on cms01 chugged along at ~3 min per 10k partition; at 32 partitions the estimate was ~100 min to complete:

[blyth@belle7 DybPython]$ ./dbsrv.py  channelquality_db_restored loadlocal  /var/dbbackup/dbsrv/dybdb2.ihep.ac.cn/channelquality_db_dybdb2 --partition --extract -l debug --DB_DROP_CREATE  -C

Partitioned Commands

The partitioning relies on these options:

--partition
switches on partitioning, default False
--partitionkey
default “SEQNO,0”, corresponding to the key name and its position in CSV dumps
--partitioncfg
NOW REPLACED WITH THE BELOW TWO OPTIONS. Default “10000,0,33”; the three integers specify the number of keys in each chunk (10000) and the range of chunks range(0,33), ie 0 to 32
--partitionsize
default “10000”, specifies the number of keys in each chunk
--partitionrange
default None, meaning all partitions. If specified as eg “0,33” it restricts to a range of partition indices range(0,33)
--partitionlast
NOW DEPRECATED. The last partition is now auto-determined, to allow daily cron running. Default None; when set to an integer string eg “32” this is used to identify the index of the last incomplete partition

For dump and load to refer to the same partition set, the same chunk size (and partition key, although this is not checked) must be used.
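
A minimal sketch of how the partition options translate into key ranges; the WHERE clause form matches the log output shown elsewhere on this page, eg “SEQNO >= 1 and SEQNO <= 10” for a partition size of 10 and index 0:

def partition_where(index, size=10000, key="SEQNO"):
    """WHERE clause selecting the keys belonging to one partition index."""
    first = index * size + 1
    last = (index + 1) * size
    return "%s >= %d and %s <= %d" % (key, first, key, last)

# --partitionsize 10 --partitionrange 0,3 corresponds to indices 0, 1, 2
for i in range(0, 3):
    print(partition_where(i, size=10))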

partitioned loadlocal

From cron:

DYBPYTHON_DIR=/data1/env/local/dyb/NuWa-trunk/dybgaudi/DybPython/python/DybPython
03 20 * * * ( $DYBPYTHON_DIR/dbsrv.py channelquality_db_0 loadlocal  /tmp/cq/channelquality_db --partition --DB_DROP_CREATE -C ) > $CRONLOG_DIR/dbsrv_load_.log 2>&1

quick partitioning test

For fast dump/load testing use small chunks and range of partitions:

./dbsrv.py tmp_ligs_offline_db_0 dumplocal /tmp/pp/tmp_ligs_offline_db_0 --partition --partitionsize 10 --partitionrange 0,5
./dbsrv.py tmp_ligs_offline_db_5 loadlocal /tmp/pp/tmp_ligs_offline_db_0 --partition --partitionsize 10 --partitionrange 0,5 --DB_DROP_CREATE -C 

Archiving and transfers to remote node

Controlled via options:

-a/--archive
switch on archive creation
-x/--extract
switch on archive extraction
--backupfold
default /var/dbbackup/dbsrv, the location of backup dumps and tarballs
-T/--transfer
switch on remote transfer of archives, must be used together with -a/--archive and the dumplocal command to be effective
--transfercfg
configures the remote node and possibly a directory prefix that is prepended in front of the backupfold

For example, the command below dumps partitions 0, 1 and 2, creates archive tarballs and transfers them to the configured remote node:

./dbsrv.py -t DqChannel,DqChannelVld,DqChannelStatusVld --home loopback channelquality_db_belle7 dumplocal /tmp/cq/channelquality_db_belle7 --partition --partitionrange 0,3 -aT

The local and remote tarball paths are the same, with no transfercfg prefix specified, namely:

/var/dbbackup/dbsrv/belle7.nuu.edu.tw/channelquality_db_belle7/10000_0.tar.gz 

Transfer Optimization

A small .dna sidecar to the tarballs is used for tarball content identification. When a rerun of the transfer is made, the sidecar DNA is first checked to see if the remote node already holds the tarball.

This means that only newly reached partitions are archived and transferred. The last incomplete partition will typically be transferred every time, as its content differs on each run and the resulting DNA mismatch triggers a re-transfer.
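
A minimal sketch of the idea behind the sidecar check, assuming a simple content digest such as md5 (the actual digest scheme used by dbsrv.py is not spelled out here):

import hashlib

def dna(path, blocksize=1024*1024):
    """Hex digest of a file, streamed in blocks to cope with large tarballs."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(blocksize), b""):
            h.update(block)
    return h.hexdigest()

def needs_transfer(tarball, remote_dna):
    """True when the remote sidecar digest is absent or differs from the local one."""
    local = dna(tarball)
    open(tarball + ".dna", "w").write(local)   # keep the sidecar alongside the tarball
    return remote_dna is None or local != remote_dna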

Full archive/transfer cron test from belle7 to belle1

To prepare the remote node it is only necessary to create and set ownership of the backupfold, eg /var/dbbackup/dbsrv, and to ensure that keyed ssh access is working:

DYBPYTHON_DIR=/data1/env/local/dyb/NuWa-trunk/dybgaudi/DybPython/python/DybPython
DBSRV_REMOTE_NODE=N1
35 18 * * * ( $DYBPYTHON_DIR/dbsrv.py -t DqChannel,DqChannelVld,DqChannelStatusVld --home loopback channelquality_db_belle7 dumplocal --partition --archive --transfer ) > $CRONLOG_DIR/dbsrv_pat_channelquality_db_belle7.log 2>&1  

Installation on dybdb2

Prepare target node

The administrator of the target node needs to prepare a folder for the archives:

[blyth@cms01 ~]$ sudo mkdir /data/var/dbbackup/dbsrv
[blyth@cms01 ~]$ sudo chown -R dayabayscp.dayabayscp /data/var/dbbackup/dbsrv

Setup mysql config at source

The config file ~/.my.cnf needs two sections “loopback” and “channelquality_db_dybdb2”:

[loopback]
host      = 127.0.0.1
database  = information_schema
user      = root
password  = ***

[channelquality_db_dybdb2]
host      = 127.0.0.1
database  = channelquality_db
user      = root
password  = ***

SSH environment configuration

The script runs scp commands internally that require:

  • ssh-agent process to be running and authenticated
  • public keys of source node to be appended to .ssh/authorized_keys2 of target
  • SSH_AUTH_SOCK to be defined.

When run from cron the envvar is typically not present. In order to define it, the ~/.ssh-agent-info-$NODE_TAG file is parsed by sshenv() from common.py.

This file is created by the env function ssh--agent-start, which is used following reboots to start and authenticate the ssh-agent process.
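
A minimal sketch of the kind of parsing sshenv() needs to do, assuming the file holds the usual ssh-agent output lines such as SSH_AUTH_SOCK=/tmp/ssh-XXXX/agent.1234; export SSH_AUTH_SOCK;

import os, re

def sshenv(node_tag=os.environ.get("NODE_TAG", "")):
    """Export SSH_AUTH_SOCK/SSH_AGENT_PID parsed from the agent info file."""
    path = os.path.expanduser("~/.ssh-agent-info-%s" % node_tag)
    for line in open(path):
        m = re.match(r"(SSH_AUTH_SOCK|SSH_AGENT_PID)=([^;]+);", line)
        if m:
            os.environ[m.group(1)] = m.group(2)   # makes scp runs from cron possible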

Get DybPython from dybsvn

cd 
svn co http://dayabay.ihep.ac.cn/svn/dybsvn/dybgaudi/trunk/DybPython/python/DybPython

Despite coming from dybsvn, the dbsrv.py script does not need the NuWa environment; the MySQLdb extension in the system python should be adequate.

Quick Interactive Test

Configuring 5 small 100 SEQNO partitions allows the machinery to be quickly tested:

cd DybPython
./dbsrv.py channelquality_db_dybdb2 dumplocal --partition --partitionsize 100 --partitionrange 0,5 --archive --transfer  

CRON commandline

## NB no DBSRV_REMOTE_NODE is needed, the default of S:/data is appropriate
DYBPYTHON_DIR=/root/DybPython
CRONLOG_DIR=/root/cronlog
NODE_TAG=D2
#
42 13 * * * ( $DYBPYTHON_DIR/dbsrv.py channelquality_db_dybdb2 dumplocal --partition --archive --transfer ) > $CRONLOG_DIR/dbsrv_channelquality_db_dybdb2.log 2>&1 

A do-nothing run, when there are no new partitions to dump/archive/transfer, takes about 4 mins and uses few resources. When there are new completed partitions to archive and transfer, the default chunk size of 10000 SEQNO leads to tarballs of only 35M (maybe 70M when moving all 4 tables), resulting in rapid transfers.

Although new completed partitions might be reached only every ~10 days with the 10k chunks, a daily transfer is still recommended, both to back up the last incomplete partition and so that issues with the transfer are rapidly identified and resolved.

Transfer Monitoring

Implemented using valmon.py with digestpath.py. Valmon needs to run as a daily cronjob on the remote node. Configure with a dbsrvmon section:

% ~/.env.cnf blyth@belle1.nuu.edu.tw 
[dbsrvmon]
tn = channelquality_db 
chdir = /var/dbbackup/dbsrv/belle7.nuu.edu.tw/channelquality_db_belle7/archive/10000 
return = dict 
dbpath = ~/.env/dbsrvmon.sqlite 
cmd = digestpath.py 
note = stores the dict returned by the command as a string in the DB without interpretation 
valmon_version = 0.2 
constraints = ( tarball_count >= 34, dna_mismatch == 0, age < 86400 , age < 1000, ) 
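
A minimal sketch of how such constraints can be read against the dict returned by the command (dict values taken from the valmon ls output below; the actual valmon.py evaluation may differ):

d = {'dna_match': 34, 'dna_mismatch': 0, 'tarball_count': 34,
     'age': 9030.78, 'lastchange': 1371114931, 'dna_missing': 0}

constraints = ("tarball_count >= 34", "dna_mismatch == 0", "age < 86400")
violations = [c for c in constraints if not eval(c, {}, d)]
print(violations)     # an empty list means all constraints are satisfied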

Tested on belle1:

[blyth@belle1 e]$ valmon.py -s dbsrvmon ls
2013-06-17 11:48:01,515 env.db.valmon INFO     /home/blyth/env/bin/valmon.py -s dbsrvmon ls
2013-06-17 11:48:01,520 env.db.valmon WARNING  no email section configures and no MAILTO envvar, NOTIFICATION WILL FAIL
2013-06-17 11:48:01,521 env.db.valmon INFO     arg ls
('2013-06-13T19:46:01', 5.5278148651123047, "{'dna_match': 34, 'lookstamp': 1371123961.7826331, 'dna_mismatch': 0, 'tarball_count': 34, 'age': 9030.7826330661774, 'lastchange': 1371114931, 'dna_missing': 0}", 0.0, 0)
('2013-06-13T19:54:06', 5.8677470684051514, "{'dna_match': 34, 'lookstamp': 1371124446.7869501, 'dna_mismatch': 0, 'tarball_count': 34, 'age': 9515.7869501113892, 'lastchange': 1371114931, 'dna_missing': 0}", 0.0, 0)

Obtain the backup tarballs

As of Dec 24 2013 there are 54 tarballs of 43M each, corresponding to 2322M total. scp them using the scponly account on cms01. Qiumei/Simon can provide the password:

dir=/data/var/dbbackup/dbsrv/dybdb2.ihep.ac.cn/channelquality_db_dybdb2/archive/
mkdir -p $dir && cd $dir         
scp -r dayabayscp@cms01.phys.ntu.edu.tw:/data/var/dbbackup/dbsrv/dybdb2.ihep.ac.cn/channelquality_db_dybdb2/archive/10000 .

Partitioned dump usage

Full backups are impractical for 10G tables.

Partitioned dumping is attractive for backups of such large tables, as just the new partitions need to be dumped on each invocation.

For scp transfers it would be necessary to create tarfiles for each partition with dna sidecars, and to add a transfer subcommand with an option-controlled remote node. DNA checking would then allow only new partitions to be transferred.

System python API warning

Be careful regarding PYTHONPATH: when mixing a NuWa PYTHONPATH with a system python, an API RuntimeWarning results:

[blyth@belle7 DybPython]$ /usr/bin/python dbsrv.py -t DqChannel,DqChannelVld,DqChannelStatusVld --home dybdb2_ligs   channelquality_db_dybdb2 dumplocal /tmp/cq/channelquality_db_dybdb2  --partition --partitioncfg 10,0,1 
/data1/env/local/dyb/external/mysql_python/1.2.3_mysql5.0.67_python2.7/i686-slc5-gcc41-dbg/lib/python2.7/site-packages/MySQLdb/__init__.py:19: RuntimeWarning: Python C API version mismatch for module _mysql: This Python has API version 1012, module _mysql has version 1013.
  import _mysql
2013-06-06 18:03:08,963 __main__ INFO     schema dir /tmp/cq/channelquality_db_dybdb2/10/_ exists already
2013-06-06 18:03:08,963 __main__ INFO     /* 10-partition 1  /1  */ SEQNO >= 1 and SEQNO <= 10 
2013-06-06 18:03:09,165 __main__ INFO     checking prior csv dump /tmp/cq/channelquality_db_dybdb2/10/0  --partitioncfg 10,0,1 
[blyth@belle7 DybPython]$ 

Avoiding the NuWa PYTHONPATH means everything is from the system installation, and the RuntimeWarning is avoided:

[blyth@belle7 DybPython]$ PYTHONPATH=  /usr/bin/python dbsrv.py -t DqChannel,DqChannelVld,DqChannelStatusVld --home dybdb2_ligs   channelquality_db_dybdb2 dumplocal /tmp/cq/channelquality_db_dybdb2  --partition --partitioncfg 10,0,1 
2013-06-06 18:04:58,078 __main__ INFO     schema dir /tmp/cq/channelquality_db_dybdb2/10/_ exists already
2013-06-06 18:04:58,078 __main__ INFO     /* 10-partition 1  /1  */ SEQNO >= 1 and SEQNO <= 10 
2013-06-06 18:04:58,282 __main__ INFO     checking prior csv dump /tmp/cq/channelquality_db_dybdb2/10/0  --partitioncfg 10,0,1 
[blyth@belle7 DybPython]$ 

Import Notes

  1. keep imports to a minimum for portability to the server situation, ie do not rely on the NuWa environment
  2. MySQLdb is imported optionally to allow nodes without MySQL-python to build the autodocs

DybPython.dbsrv.DB

class DybPython.dbsrv.DB(sect, opts=None, home=None)[source]

Bases: object

Parameters:
  • sect – name of section in config file
  • opts – options
  • home – DB instance

Safety constraints on config to minimize accidents from config confusion.

Initially required non-loopback section names and database names to be the same

Loosening this to allow remote commands, by designating a “home” instance and requiring all other instances to match that one in everything but the database name

archive(dir, force=False)[source]
Parameters:dir – directory the contents of which should be archived

As a partition corresponds to a certain SEQNO range, it never changes so there is no need for a datestring in the path.

The configured backupfold needs to be created before using the archive -a option with:

[blyth@belle7 DybPython]$ sudo mkdir /var/dbbackup/dbsrv
[blyth@belle7 DybPython]$ sudo chown -R blyth.blyth /var/dbbackup/dbsrv/

archivepath(dir, base=None)[source]
Parameters:dir – directory to be archived or extracted into
Returns:path to archive tarball, dir path relative to base
database_drop_create(dbname)[source]
Parameters:dbname – name of the database to be dropped and recreated
databases

List of database names obtained from information_schema.tables

datadir

Query DB server to find the datadir, eg /var/lib/mysql/ OR /data/mysql/
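
Presumably obtained with a standard query along these lines (sketch, not necessarily the exact query issued):

import os, MySQLdb
conn = MySQLdb.connect(read_default_file=os.path.expanduser("~/.my.cnf"),
                       read_default_group="loopback")
cur = conn.cursor()
cur.execute("SHOW VARIABLES LIKE 'datadir'")
print(cur.fetchone())      # eg ('datadir', '/var/lib/mysql/')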

determine_basedir(*args)[source]
classmethod docs()[source]

collect the docstrings on command methods identified by naming convention of ending with ___

dumplocal___(*args, **kwa)[source]
Parameters:outdir – specifies output directory which must be writable by mysql user, it will be created if not existing

Rerunning this will do quick checks of the CSV files, looking at line counts and the first and last line, and comparing with expectations from DB queries. The quick checks are done via commands:

  • wc
  • head -1
  • tail -1

This is not called in the partitioned case.
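
A minimal sketch of the flavour of these quick checks, comparing the dumped line count with a COUNT(*) over the same selection (the helper quick_check is hypothetical):

import os, commands, MySQLdb    # commands is the Python 2 stdlib module of this era

def quick_check(csvpath, table, where, section="tmp_ligs_offline_db_0"):
    """Compare dumped CSV line count against a COUNT(*) over the same selection."""
    nline = int(commands.getoutput("wc -l < %s" % csvpath))
    conn = MySQLdb.connect(read_default_file=os.path.expanduser("~/.my.cnf"),
                           read_default_group=section)
    cur = conn.cursor()
    cur.execute("SELECT COUNT(*) FROM %s WHERE %s" % (table, where))
    nrow = cur.fetchone()[0]
    assert nline == nrow, ("line count mismatch", csvpath, nline, nrow)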

extract(dir, base)[source]
Parameters:
  • dir – directory to be created by extraction
  • base
loadlocal___(*args, **kwa)[source]
Parameters:outdir – specifies directory containing normal or partitioned dump of CSV files
loadlocal_dir(dir)[source]
lsdatabases___(*args, **kwa)[source]

list databases

lstables___(*args, **kwa)[source]

list tables

partition_dumpcheck(pdir, pwhere, is_last, keycount=False)[source]
Checks a partition dump, returning a flag signalling whether a dump is needed.
Parameters:
  • pdir
  • pwhere
  • is_last
  • keycount – doing distinct keycount is quite slow, so can skip for pre-existing
Returns:

pdump, chk

partition_dumplocal___(*args, **kwa)[source]
partition_loadlocal___(*args, **kwa)[source]
  1. look into putting the partitions back together again, in partitioned loadlocal
  2. read file system tealeaves wrt the partitioning
  3. factor off the checking
  4. need to work out which partitions are new and just load those
ptables()[source]
Returns:list of tables with the key field
size

Size estimate of the DB in MB
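
Such an estimate is typically available from information_schema; a sketch of the kind of query this property presumably wraps:

import os, MySQLdb
conn = MySQLdb.connect(read_default_file=os.path.expanduser("~/.my.cnf"),
                       read_default_group="loopback")
cur = conn.cursor()
cur.execute("SELECT ROUND(SUM(data_length+index_length)/1024/1024,1) "
            "FROM information_schema.tables WHERE table_schema = %s",
            ("channelquality_db",))
print(cur.fetchone()[0])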

summary___(*args, **kwa)[source]

Present summary of tables in rst table format:

TABLE_NAME          TABLE_ROWS  CREATE_TIME          CHECK_TIME
DqChannel           62126016    2013-05-30 18:52:51  2013-05-30 18:52:51
DqChannelStatus     62126016    2013-05-30 18:17:42  2013-05-30 18:17:42
DqChannelStatusVld  323573      2013-05-30 18:52:44  None
DqChannelVld        323573      2013-05-30 19:34:55  None
LOCALSEQNO          3           2013-05-30 19:35:02  None

tables

List of table names obtained from show tables

timestamped_dir(*args)[source]

Timestamping is needed for non-partitioned case

utables

List of tables to use in operations; when the --tables option is used this can be a subset of all tables.