Other _modules/DybPython/db

Daya Bay Links

Content Skeleton

Source code for DybPython.db

#!/usr/bin/env python
$Id: db.py 22557 2014-02-20 07:08:30Z blyth $

DB operations performed via MySQLdb::

   ./db.py [options] <dbconf> <cmd>  

Each invokation of this script talks to a single database only.
A successful connection to "sectname" requires the config file 
(default :file:`~/.my.cnf`) named section to provide the below keys, eg:: 

   host = dybdb1.ihep.ac.cn
   user = dayabay
   password = youknowit
   database = offline_db


For a wider view of how ``db.py`` is used see :ref:`sop-dbops`


#. dry run option to report commands that would have been used without doing them
#. better logging and streamlined output 

Required Arguments

   the name of the section in ~/.my.cnf
   that specifies the host/database/user/password
   to use in making connection to the mysql server 

   perform command on the database specified in 
   the prior argument. NB some commands can only be performed locally, 
   that is on the same node that the MySQL server is running on.

command summary

  =========   =======================================  =============================== 
   Command      Action                                   Note
  =========   =======================================  ===============================
  dump          performs mysqldump, works remotely      special LOCALSEQNO handling
  load          loads mysqldump, works remotely         very slow when done remotely, 
                                                        insert statement for every row
  ---------   ---------------------------------------  ------------------------------- 
  rdumpcat      dumps ascii catalog, works remotely     duplicates dumpcat output
                                                        using low level _mysql 
                                                        uses LOCALSEQNO merging  
  rloadcat      loads ascii catalog, works remotely     mysqlimport implementation,
  rcmpcat       compare ascii catalog with DB           readonly command   
  ls            lists tables in various sets 
  cli           emit mysql client connection cmdline    Does not actually connect
  =========   =======================================  ===============================

former commands

  =========   =======================================  =============================== 
   Command      Action                                   Note
  =========   =======================================  ===============================
  dumpcat       dumps ascii catalog, LOCAL ONLY         SELECT ... INTO OUTFILE
  loadcat       loads ascii catalog, LOCAL ONLY         LOAD DATA LOCAL INFILE ... 
                                                        INTO TABLE    
  =========   =======================================  ===============================

Former **loadcat** and **dumpcat** can be mimicked with ``--local`` option of **rdumpcat** and **rloadcat**.
These are for expert usage only into self administered database servers.

using db.py in standalone manner (ie without NuWa) 

This script is usuable with any recent python which 
has the mysql-python (1.2.2 or 1.2.3) package installed.
Check your python and mysql-python with::

       which python
       python -V
       python -c "import MySQLdb as _ ; print _.__version__ "

Checkout :file:`DybPython/python/DybPython` in order to access  *db.py*, *dbcmd.py* and *dbconf.py*, for example with ::

     svn co http://dayabay.ihep.ac.cn/svn/dybsvn/dybgaudi/trunk/DybPython/python/DybPython
     chmod u+x DybPython/db.py

Use as normal::

     ~/DybPython/db.py --help
     ~/DybPython/db.py offline_db count 

checkout offline_db catalog from dybaux 

Example, checkout OR update the catalog::

     mkdir ~/dybaux 
     cd ~/dybaux 
     svn co http://dayabay.ihep.ac.cn/svn/dybaux/catalog    

OR ::
     cd ~/dybaux/catalog
     svn up 

rdumpcat ``tmp_offline_db`` into dybaux working copy::

     db.py tmp_offline_db rdumpcat ~/dybaux/catalog/tmp_offline_db

Test usage of serialized ascii DB

Get into environment and directory of pkg :dybgaudi:`Database/DybDbi`
Modify the config to use ascii DB, for an example see :dybgaudi:`Database/DybDbi/tests/test_calibpmtspec.py`

rloadcat testing, DB time machine

.. warning:: **forced_rloadcat** is for testing only, it skips checks and ploughs ahead with the load, also ``--DROP`` option drops and recreates tables
Fabricate a former state of the DB using **forced_rloadcat** and an earlier revision 
from dybaux, with::

     ## get to a clean revision of catalog (blowing away prior avoids conflicts when doing that)
     rm -rf ~/dybaux/catalog/tmp_offline_db  ; svn up -r 4963 ~/dybaux/catalog/tmp_offline_db

     ## forcefully propagate that state into the tmp_offline_db 
     ./db.py tmp_offline_db forced_rloadcat ~/dybaux/catalog/tmp_offline_db --DROP

     ## compare DB and catalog .. no updates should be found
     ./db.py tmp_offline_db rcmpcat ~/dybaux/catalog/tmp_offline_db 
     ## wind up the revision 
     rm -rf ~/dybaux/catalog/tmp_offline_db  ; svn up -r 4964 ~/dybaux/catalog/tmp_offline_db

     ## compare DB and catalog again  ... updates expected, check timeline diffs
     ./db.py tmp_offline_db rcmpcat ~/dybaux/catalog/tmp_offline_db 

     ## test rloadcat operation and check diffs afterwards
     ./db.py tmp_offline_db rloadcat ~/dybaux/catalog/tmp_offline_db 
     ./db.py tmp_offline_db rcmpcat ~/dybaux/catalog/tmp_offline_db 

import os, inspect, logging, stat, sys
from tempfile import mkdtemp
from string import strip
import MySQLdb
from MySQLdb.constants import FIELD_TYPE as FT
import _mysql
from dbconf import DBConf
from dbcmd import MySQLDump, MySQLDumpDBI, MySQLLoad, MySQLImport, MySQLDescribe, MySQLCommand, MySQLCmd
from dbselect import Select
from asciicat import AsciiCat, AsciiCSV, DD

log = logging.getLogger(__name__)

class Stat(str):
    mtime = property(lambda self:self._stat[stat.ST_MTIME])
    atime = property(lambda self:self._stat[stat.ST_ATIME])
    ctime = property(lambda self:self._stat[stat.ST_CTIME])
    size = property(lambda self:self._stat[stat.ST_SIZE])
    mode = property(lambda self:self._stat[stat.ST_MODE])
    isdir = property(lambda self:stat.S_ISDIR(self.mode))
    mdate = property(lambda self:datetime.datetime.fromtimestamp(self.mtime))

    def _age(self):
        td = datetime.datetime.now() - self.mdate
        return float( td.days * 3600 * 24 + td.seconds ) / ( 3600 * 24 )
    age  = property(_age )

    def __init__(self, arg):
        self._stat = os.stat(self)

class CSVFormat(list):
    Provides the format string to create the CSV line
    appropriate for the field types of the low level query result description 
    It looks something like::


    Usage example::

        llc = _mysql.connection(...) 
        llc.query("select * from ...")
        result = llc.store_result()
        csvf = CSVFormat( result.describe() ) 
        for row in result.fetch_row(0):
            print str(csvf) % tuple(row)


    def __str__(self):
        def field_fmt(fdesc):
            if fdesc[1] in string_types:
                return "\"%s\""
            return "%s"
        return ",".join( map(field_fmt, self) )    

def attrs_(mod, names):
    """Access module constants is a version independant way """  
    return map( lambda _:getattr(mod, _), filter( lambda _:hasattr(mod,_), names.split() ) ) 
string_types = attrs_( FT , 'VARCHAR DATETIME STRING VAR_STRING' ) 

def dbi_pairs(tables):
    From a list of tables (can be payload only) return a list of payload and validity tables 
    return sorted(list(set(tables + map(lambda t:"%sVld" % t, filter(lambda _:_[-3:] != "Vld",tables))))) 

[docs]class DB(object): def __init__(self, sect=None , opts={}, **kwa ): """ Initialize config dict corresponding to section of config file :param sect: section in config file """ self.opts = opts self.skip_pay_check = opts.get('SKIP_PAY_CHECK',False) self.allow_partial = opts.get('ALLOW_PARTIAL',False) dbc = DBConf(sect=sect, **kwa) pars = dbc.mysqldb_parameters(nodb=kwa.get('nodb',False)) if pars.get('db',"").find("#") > -1: raise Exception("DB handling of ascii DBI catalog not yet implemented, see #864 ") try: conn = MySQLdb.connect( **pars ) except MySQLdb.Error, e: raise Exception("Error %d: %s " % ( e.args[0], e.args[1] ) ) self.conn = conn self.llconn = _mysql.connect( **pars ) self.dbc = dbc self.sect = dbc.sect ## as DBConf promotes None to the DBCONF value from environment is_lowlevel = property(lambda self:self.opts.get('lowlevel', False)) def close(self): self.conn.close() def execute_(self, cmd): cursor = self.conn.cursor(MySQLdb.cursors.DictCursor) cursor.execute( cmd ) return cursor def fetchone(self, cmd ): cursor = self.execute_(cmd) row = cursor.fetchone() cursor.close() return row def fetchcount(self, cmd ): row = self.fetchone(cmd) assert len(row) == 1 return row.values()[0] def fetchall(self, cmd ): cursor = self.execute_(cmd) rows = cursor.fetchall() self.count = cursor.rowcount cursor.close() return rows def _get_tmpfold(self): """ Path to temporary folder, named after the DBCONF section. The base directory can be controlled by tmpbase (-b) option """ return os.path.join( self.opts.get('tmpbase','/tmp') , self.sect ) tmpfold = property( _get_tmpfold , doc=_get_tmpfold.__doc__ ) def _get_tmpdir(self): """ Create new temporary directory for each instance, writable by ugo """ if not hasattr(self,'_tmpdir'): if not os.path.exists(self.tmpfold): os.makedirs(self.tmpfold) os.chmod(self.tmpfold, 0777) self._tmpdir = mkdtemp(dir=self.tmpfold) os.chmod( self._tmpdir, 0777 ) return self._tmpdir tmpdir = property( _get_tmpdir, doc=_get_tmpdir.__doc__ ) def __call__(self, cmd): if self.opts.get('verbose',False): print cmd return self.fetchall(cmd)
[docs] def check_(self, *args, **kwa): """ check connection to DB by issuing a SELECT of info functions such as DATABASE() and CURRENT_USER() command """ rec = self.fetchone("SELECT DATABASE(),CURRENT_USER(),VERSION(),CONNECTION_ID() ") return rec
[docs] def noop_(self, *args, **kwa): """ Do nothing command, allowing to just instanciate the DB object and provide it for interactive prodding, eg:: ~/v/db/bin/ipython -- ~/DybPython/db.py tmp_offline_db noop In [1]: db("show tables") ## high level In [2]: db.llconn.query("select * from CalibPmtSpecVld") ## lowlevel _mysql In [3]: r = db.conn.store_result() This also demonstrates standalone :file:`db.py` usage, assuming svn checkout:: svn co http://dayabay.ihep.ac.cn/svn/dybsvn/dybgaudi/trunk/DybPython/python/DybPython """ pass
[docs] def ls_(self, *args, **kwa ): """ Listing of tables in various sets ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Usage:: ./db.py tmp_offline_db ls Annotation '-' indicates tables not in the table selection, typically only the below types of tables should appear with '-' annotation. #. non-SOP tables such as scraped tables #. temporary working tables not intended for offline_db If a table appears with annotation '-' that is not one of the above cases then either ``db.py`` **tselect** needs to be updated to accomodate a new table (ask Liang to do this) OR you need to update your version of ``db.py``. The first few lines of ``db.py --help`` lists the revision in use. See :dybsvn:`ticket:1269` for issue with adding new table McsPos that this command would have helped to diagnose rapidly. """ kvs = ( ('showtables', "all tables in DB, the result of SHOW TABLES"), ('showpaytables', "all DBI payload tables, tables ending Vld with Vld removed, from SHOW TABLES LIKE '%Vld' "), ('tables', "DBI paired up table selection + LOCALSEQNO, controlled via tselect default option"), ('paytables', "derived from `tables`, selected DBI payload tables"), ('optables', "tables to be operated upon, in default decoupled case same as `tables` for offline_db or `showtables` otherwise "), ) tt = self.tables def present_(t): anno = " " if t in tt else "-" return " %s %s" % (anno,t) print self.ls_.__doc__ for k,v in kvs: vv = getattr(self, k ) print "\n **%s** : %s \n" % ( k, v ) print "\n".join(map(present_,vv))
def _get_showtables( self , nocache=False ): """ list names of all tables in DB as reported by SHOW TABLES, NB the result is cached so will become stale after deletions or creations unless `nocache=True` option is used """ if not hasattr(self, '_showtables') or nocache == True: self._showtables = [rec.values()[0] for rec in self("SHOW TABLES")] return self._showtables showtables = property( _get_showtables, doc=_get_showtables.__doc__ ) def _get_showpaytables( self , nocache=False ): """ list names of all DBI payload tables in DB as reported by SHOW TABLES LIKE '%Vld' with the 'Vld' chopped off NB the result is cached so will become stale after deletions or creations unless `nocache=True` option is used """ if not hasattr(self, '_showpaytables') or nocache == True: self._showpaytables = [rec.values()[0][:-3] for rec in self("SHOW TABLES LIKE '%Vld'")] return self._showpaytables showpaytables = property( _get_showpaytables, doc=_get_showpaytables.__doc__ ) def _get_tables( self ): """ list of selected table names to operate on plus the mandatory LOCALSEQNO Poorly named should be ``table_selection`` """ tabs = self.opts['tselect'].split(",") if self.opts['NOPAIRING']: log.warn("option NOPAIRING causes tables selection to be used verbatim, with no DBI pairing up") else: tabs = dbi_pairs(tabs) if self.opts['NOLOCALSEQNO']: log.warn("option NOLOCALSEQNO inhibits operations on LOCALSEQNO table") else: tabs += ["LOCALSEQNO"] return tabs tables = property( _get_tables, doc=_get_tables.__doc__ ) def _get_paytables(self): """ list of selected DBI payload tables """ tabs = self.tables return filter(lambda t:t not in ('LOCALSEQNO','GLOCALSEQNO',) and t[-3:] != 'Vld', tabs ) paytables = property( _get_paytables, doc=_get_paytables.__doc__ ) def _get_optables(self): """ List of tables that commands such as **rdumpcat** perform operations on, outcome depends on: #. table selection from the `-t/--tselect` option #. decoupled option setting #. DBCONF section name, where name **offline_db** is regarded as special The default value of the table selection option constitutes the current standard set of DBI tables that should be reflected in the dybaux catalog. When following the SOP in the now default "decoupled" mode the **offline_db** rdumpcat needs to abide by the table selection in force, whereas when dumping from **tmp_offline_db** onto a dybaux checkout need to dump all of the subset. Rather than the default table selection. This special casing avoids the need for the ``-t`` selection when rdumpcating **tmp_offline_db** """ if self.opts['decoupled']: if self.sect == 'offline_db': ## NASTY SPECIAL CASING return self.tables ## tselected tables, dbi paired with LOCALSEQNO added else: return self.showtables ## all tables as listed by "SHOW TABLES" else: return self.tables optables = property( _get_optables, doc=_get_optables.__doc__ ) def info_(self, *args, **kwa): log.info("info t %s %s %s " % (self.sect, repr(args), repr(kwa))) for att in ('showtables','tables','optables'): tabs = getattr(self, att ) log.info( " ******** %s ( %s ) ************* " % ( att,len(tabs)) )
[docs] def count_(self, *args, **kwa): """ List table counts of all tables in database, usage example:: db.py offline_db count *offline_db* is :file:`~/.my.cnf` section name specifying host/database/user/password """ log.debug("count %s %s %s " % (self.sect, repr(args), repr(kwa))) counts = dict(TOTAL=0) for tab in self.showtables: cnt = self.fetchone("SELECT COUNT(*) FROM %s" % tab ) n = float(cnt.values()[0]) counts[tab] = n counts['TOTAL'] += n log.info( counts ) for tab in self.showtables + ['TOTAL']: perc = 100.*counts[tab]/counts['TOTAL'] log.info( "%-30s : %-10s : %10s " % ( tab, counts[tab] , "%.3f" % perc ) )
[docs] def vsssta(self, tab): """ Look at VERSIONDATE/TIMESTART/... within SSSTA groups """ sssta = 'concat(SITEMASK,":",SIMMASK,":",SUBSITE,":",TASK,":",AGGREGATENO)' having = " having ( count(distinct(VERSIONDATE)) < count(*) )" q = "select min(SEQNO) as miSEQNO,max(SEQNO) as mxSEQNO,count(distinct(VERSIONDATE)) as nVERSIONDATE,count(distinct(TIMESTART)) as nTIMESTART,count(*) as n,%(sssta)s as sssta from %(tab)sVld group by %(sssta)s %(having)s " for i,e in enumerate(self(q % locals())): e.update(tab=tab) log.info(e)
def vsssta_(self, *args, **kwa): for tab in self.showpaytables: self.vsssta( tab )
[docs] def vdupe(self, tab): """ Currently is overreporting as needs to be balkanized by context """ q = "SELECT SEQNO,VERSIONDATE,COUNT(VERSIONDATE) AS dupe FROM %(tab)sVld GROUP BY VERSIONDATE HAVING ( COUNT(VERSIONDATE) > 1 ) " n = 0 log.info("vdupe_ tab %(tab)s " % locals() ) for i,e in enumerate(self(q % locals())): n += 1 vdate = e['VERSIONDATE'].strftime("%Y-%m-%d %H:%M:%S") vq = "select * from %(tab)sVld where VERSIONDATE = '%(vdate)s' ;" % locals() log.warn("vdupe_ %s seqno %s q %s " % (e['dupe'], e['SEQNO'], vq ) ) return n
[docs] def vdupe_(self, *args, **kwa ): """ Report the first Vlds which feature duplicated VERSIONDATEs:: mysql> SELECT SEQNO,VERSIONDATE,COUNT(VERSIONDATE) AS dupe FROM DemoVld GROUP BY VERSIONDATE HAVING ( COUNT(VERSIONDATE) > 1 ) ; +-------+---------------------+------+ | SEQNO | VERSIONDATE | dupe | +-------+---------------------+------+ | 71 | 2011-08-04 05:55:47 | 2 | | 72 | 2011-08-04 05:56:47 | 3 | +-------+---------------------+------+ 2 rows in set (0.00 sec) mysql> select * from DemoVld ; +-------+---------------------+---------------------+----------+---------+---------+------+-------------+---------------------+---------------------+ | SEQNO | TIMESTART | TIMEEND | SITEMASK | SIMMASK | SUBSITE | TASK | AGGREGATENO | VERSIONDATE | INSERTDATE | +-------+---------------------+---------------------+----------+---------+---------+------+-------------+---------------------+---------------------+ | 70 | 2011-08-04 05:54:47 | 2038-01-19 03:14:07 | 127 | 1 | 0 | 0 | -1 | 2011-08-04 05:54:47 | 2011-08-11 10:12:32 | | 71 | 2011-08-04 06:15:46 | 2038-01-19 03:14:07 | 127 | 1 | 0 | 0 | -1 | 2011-08-04 05:55:47 | 2011-08-11 10:12:32 | | 72 | 2011-08-04 07:02:51 | 2038-01-19 03:14:07 | 127 | 1 | 0 | 0 | -1 | 2011-08-04 05:56:47 | 2011-08-11 10:12:32 | | 73 | 2011-08-04 05:54:47 | 2038-01-19 03:14:07 | 127 | 1 | 0 | 0 | -1 | 2011-08-04 05:55:47 | 2011-08-11 10:12:32 | | 74 | 2011-08-04 06:15:46 | 2038-01-19 03:14:07 | 127 | 1 | 0 | 0 | -1 | 2011-08-04 05:56:47 | 2011-08-11 10:12:32 | | 75 | 2011-08-04 05:54:47 | 2038-01-19 03:14:07 | 127 | 1 | 0 | 0 | -1 | 2011-08-04 05:56:47 | 2011-08-11 10:12:32 | | 76 | 2011-08-04 06:15:46 | 2038-01-19 03:14:07 | 127 | 1 | 0 | 0 | -1 | 2011-08-04 05:57:47 | 2011-08-11 10:12:32 | +-------+---------------------+---------------------+----------+---------+---------+------+-------------+---------------------+---------------------+ 7 rows in set (0.00 sec) """ for tab in self.showtables: if tab[-3:] != 'Vld':continue self.vdupe( tab[:-3] )
[docs] def mysql(self, *args, **kwa ): """ """ cmd = MySQLCommand( self.dbc ) return cmd( *args, **kwa )
[docs] def describe(self, tab ): """ """ dsc = MySQLDescribe( self.dbc ) return dsc(table=tab)
[docs] def desc(self, tab ): """ Header line with table definition in .csv files shift the pk definition to the end """ pks = [] def _desc( f ): if f['Key'] == "PRI": pks.append(f['Field']) return "%(Field)s %(Type)s" % f cols = ",".join( [ _desc(f) for f in self("describe %s" % tab) ] ) if pks: cols += ",PRIMARY KEY (" + ",".join( pks ) + ")" return cols + "\n"
def fields(self, tab): """ :param tab: :return: list of field names """ return map(lambda _:_['Field'], self("describe `%s`" % tab ))
[docs] def read_desc(self, tabfile ): """ Read first line of csv file containing the description """ tf = open(tabfile, "r") hdr = tf.readline().strip() tf.close() return hdr
[docs] def outfile(self, tab): """Path of raw outfile as dumped by SELECT ... INTO OUTFILE """ return os.path.join( self.tmpdir , "%s.csv" % tab )
def reldir(self, tab ): return tab[-3:].upper() == 'VLD' and tab[:-3] or tab def relname(self, tab): return os.path.join( self.reldir(tab) , "%s.csv" % tab )
[docs] def tabfile(self, tab, catfold ): """ path of table obtained from """ dir = os.path.join( catfold , self.reldir(tab) ) if not os.path.isdir(dir): os.makedirs(dir) return os.path.join( catfold, self.relname(tab) )
[docs] def predump(self): """ Checks performed before : **dump**, **dumpcat**, **rdumpcat** """ if self.opts['decoupled']: pass else: show = set(self.showtables) tsel = set(self.tables) missing = list(tsel.difference(show)) log.info("predump show %s " % show ) log.info("predump tsel %s " % tsel ) assert len(missing) == 0, "selected tables are missing from DB : %r \n use -t option to restrict the table selection" % missing
[docs] def rdumpcat_(self, *args, **kwa ): """ Dumps DBI tables and merges LOCALSEQNO from ``tmp_offline_db`` into a pre-existing ascii catalog. Usage:: db.py -d tmp_offline_db rdumpcat ~/dybaux/catalog/tmp_offline_db ## -d/--decoupled is now the default db.py tmp_offline_db rdumpcat ~/dybaux/catalog/tmp_offline_db svn status ~/dybaux/catalog/tmp_offline_db ## see whats changed Features of the default ``-d/--decoupled`` option: #. requires dumping into a pre-existing catalog #. subset of tables present in the DB are dumped #. partial LOCALSEQNO.csv is merged into the pre-existing catalog LOCALSEQNO.csv #. performs safe writes, if the merge fails detritus files with names ending ``.csv._safe`` and ``.csv._merged`` will be left in the working copy With alternate ``-D/--nodecoupled`` option must ensure that the table selection is appropriate to the content of the DB:: db.py -D -t CableMap,HardwareID offline_db rdumpcat ~/offline_db To obtain the dybaux SVN catalog:: mkdir ~/dybaux cd ~/dybaux ; svn co http://dayabay.ihep.ac.cn/svn/dybaux/catalog The ascii catalog is structured :: ~/dybaux/catalog/tmp_offline_db tmp_offline_db.cat CalibFeeSpec/ CalibFeeSpec.csv CalibFeeSpecVld.csv CalibPmtSpec/ CalibPmtSpec.csv CalibPmtSpecVld.csv ... LOCALSEQNO/ LOCALSEQNO.csv The .csv files comprise a single header line with the table definition and remainder containing the row data. ADVANCED USAGE OF ASCII CATALOGS IN CASCADES The resulting catalog can be used in a DBI cascade by setting DBCONF to:: tmp_offline_db_ascii:offline_db Assuming a section:: [tmp_offline_db_ascii] host = localhost user = whatever password = whatever db = tmp_offline_db#/path/to/catname/catname.cat NB from :dybsvn:`r9869` /path/to/catname/catname.cat can also be a remote URL such as :: http://dayabay:youknowit\@dayabay.ihep.ac.cn/svn/dybaux/trunk/db/cat/zhe/trial/trial.cat http://dayabay:youknowit\@dayabay.ihep.ac.cn/svn/dybaux/!svn/bc/8000/trunk/db/cat/zhe/trial/trial.cat When stuffing basic authentication credentials into the URL it is necessary to backslash escape the "@" to avoid confusing DBI(TUrl) Note the use of "!svn/bc/NNNN" that requests apache mod_dav_svn to provide a specific revision of the catalog. rather than the default latest. ADVANTAGES OF CATALOG FORMAT OVER MYSQLDUMP SERIALIZATIONS * effectively native DBI format that can be used in ascii cascades allowing previewing of future database after updates are made * very simple/easily parsable .csv that can be read by multiple tools * very simple diffs (DBI updates should be contiguous additional lines), unlike mysqldump, this means efficient storage in SVN * no-variants/options that change the format (unlike mysqldump) * no changes between versions of mysql * much faster to load than mysqldumps IMPLEMENTATION NOTES #. `mysql` does not support remote `SELECT ... INTO OUTFILE` even with `OUTFILE=/dev/stdout` #. `mysqldump -Tpath/to/dumpdir` has the same limitation To workaround these limitations a `csvdirect` approach is taken where low level mysql-python is used to perform a ``select *`` on selected tables and the strings obtained are written directly to the csv files of the catalog. Low-level mysql-python is used to avoid pointless conversion of strings from the underlying mysql C-api into python types and then back into strings. """ log.debug("rdumpcat %s %s %s " % ( self.sect, repr(args), repr(kwa))) assert len(args) > 0, "argument specifying the path of the catalog folder to be created is required " self.predump() catfold = args[0] catname = os.path.basename(catfold) catfile = os.path.join(catfold, "%s.cat" % catname) if os.path.exists(catfold): assert os.path.isdir(catfold),"argument must specify directory, not a file %s " % catfold if self.opts['decoupled']: log.info("performing decoupled rdumpcat into existing directory %s " % catfold ) else: log.info("CAUTION : performing non-decoupled rdumpcat into existing directory ") else: assert not self.opts['decoupled'], "decoupled rdumpcat must be done into a preexisting catalog " cat = ['name'] omrg = {} ## results of decoupled LOCALSEQNO merge for tab in sorted(self.optables): select = Select(tab=tab) select.db = self tabfile = self.tabfile(tab, catfold) preexists = os.path.exists(tabfile) safe_write = preexists and self.opts.get('safe') log.debug("rdumpcat_ writing %s pre-exists %s safe_write %s " % (tabfile,preexists,safe_write) ) ## when safe writing, write first to "<tablename>.csv._safe" ## and do some sanity checks before renaming to "<tablename>.csv" if safe_write: prior = Stat(tabfile) safefile = tabfile + "._safe" else: safefile = tabfile tf = open(safefile,"w") tf.write( self.desc(tab) ) if not self.opts.get('local'): self._write_csvdirect( select , tf ) ## result of select is returned to python and thence formatted directly into csv, works remotely else: outfile = self.outfile(tab) ## temporary written by mysql select['outfile'] = outfile ## presence of this key changes the form of query self._write_outfile( select ) ## mysql does the writing into outfile , local only tf.write( open(outfile,"r").read() ) ## shove the mysql created file on the end of the description tf.close() if safe_write: if tab == 'LOCALSEQNO' and self.opts['decoupled']: resfile, workings, omrg = self._merge_localseqno( prior, safefile ) else: resfile, workings = Stat(safefile), [] ## safety checks before overwriting csv if resfile.size < prior.size: log.fatal("size reduction prior %s %s whereas new %s %s " % ( prior , prior.size, resfile, resfile.size )) log.fatal("dumpcat is only allowed to add, size reductions are prohibited unless using OVERRIDE") for path in workings: log.info("workings %s " % path ) if self.opts.get('OVERRIDE',False): log.warn("proceeding by virtue of OVERRIDE option") else: sys.exit(1) pass log.debug("renaming %s into %s " % ( resfile, prior )) for path in workings: log.debug("remove workings %s " % path ) assert len(path) > 10 os.remove( path ) os.rename( resfile, prior ) else: pass ## non-safe simply writes directly to target cat.append( self.relname(tab) ) pass if self.opts['decoupled']: log.info("completed decoupled merge into catalog, catfile %s " % ( catfile ) ) for k in sorted(omrg): log.info( "%-15s %s " % ( k, omrg[k] )) else: open( catfile , "w" ).write( "\n".join(cat) + "\n" ) log.info("completed writing catalog, catfile %s " % catfile )
def _merge_localseqno( self, prior, safefile ): """ decoupled handling of LOCALSEQNO requires merging of small LOCALSEQNO into preexisting bigger one Merge checks: #. changed LASTUSEDSEQNO restricted to declared subset of tables #. changes increment LASTUSEDSEQNO """ ori = AsciiCSV( None, prior )() ## .csv full old one upd = AsciiCSV( None, safefile )() ## .csv._safe slim decoupled subset mrg = ori.merged( upd , postpend="._merged" ) ## .csv._merged merge slim decoupled into full old one mrg.write() log.debug( "ori %r " % ori.lastusedseqno ) log.debug( "upd %r " % upd.lastusedseqno ) log.debug( "mrg %r " % mrg.lastusedseqno ) ## compared the merged with the original om = DD( ori.lastusedseqno , mrg.lastusedseqno , increments=True ) log.debug( "om (ori cf mrg) %s " % om ) assert len(om['removed']) == 0, om ts = self.tables assert list(om['changed']) == om['increments'].keys() for t, i in om['increments'].items(): assert t in ts, "changed table %s must be in selection %s " % ( t, ts ) if i < 0: msg = "LASTUSEDSEQNO must increase %s %s %s unless using OVERRIDE " % ( i, t , om ) log.fatal(msg) if not self.opts.get('OVERRIDE',False): raise Exception(msg) else: log.warn("proceed due to OVERRIDE option") log.warn("writing merged %r into %s " % ( mrg.lastusedseqno, mrg.path )) workings = (upd.path,) resfile = Stat( mrg.path ) return resfile, workings , om def _write_csvdirect(self, select , tf ): """ Adopt low level approach to avoid unnecessary conversions into python types then back to string and the associated difficulties of then getting precisely the same as SELECT * INTO OUTFILE Note that use of `store_result` rather than `use_result` means that all rows are in memory at once. NB for consistency the CSV ouput by this command MUST MATCH that by _write_outfile `_write_csvdirect` is used by **rdumpcat** , this mimics the output from `_write_outfile` (used by **dumpcat**) with the big advantage that it works remotely, with no strong permission requirements TODO: #. when there is a pre-existing LOCALSEQNO redirect LOCALSEQNO to a temporay file and do a merge... easiest to instanciate them as AsciiCSV and then merge at that level """ q = str(select) log.debug("_write_csvdirect %s " % q) llconn = self.llconn llconn.query( q ) lessmemory = self.opts.get('LESSMEMORY', False) if lessmemory: log.info("using `--LESSMEMORY` option : less memory expensive but more network expensive 'use_result' ") result = llconn.use_result() else: log.info("using more memory expensive but less network expensive 'store_result' ") result = llconn.store_result() csvf = CSVFormat( result.describe() ) for row in result.fetch_row(maxrows=0, how=0): ## all rows as tuples tf.write( str(csvf) % tuple(row) +"\n" ) def _write_outfile(self, select ): """ Use of "INTO OUTFILE" forces client and server to be on the same machine """ q = str(select) log.debug("_write_outfile %s " % q) self(q)
[docs] def rloadcat_(self, *args, **kwa): """ Loads an ascii catalog into a possibly remote database. This is used by DB managers in the final step of the update SOP to propagate ``dybaux`` updates into ``offline_db``. Usage:: ./db.py tmp_offline_db rloadcat ~/dybaux/catalog/tmp_offline_db Steps taken by **rloadcat**: #. compares tables and `SEQNO` present in the ascii catalog with those in the DB and reports diffences found. The comparison looks both at the ``LOCALSEQNO`` tables that DBI uses to hold the `LASTUSEDSEQNO` for each table and also by looking directly at all `SEQNO` present in the validity tables. The **rcmpcat** command does only these comparisons. #. if updates are found the user is asked for consent to continue with updating #. for the rows (`SEQNO`) that are added by the update the catalog validity tables ``INSERTDATE`` timestamps are *fastforwarded* inplace to the current UTC time #. catalog tables are imported into the DB with the `mysqlimport` tool. For payload and validity tables the `mysqlimport` option ``--ignore`` is used meaning that only new rows (as determined by their primary keys) are imported, other rows are ignored. For the ``LOCALSEQNO`` table the option ``--replace`` is used in order to replace the ``(TABLENAME,LASTUSEDSEQNO)`` entry. :return: dictionary keyed by payload table names with values containing lists of SEQNO values :rtype: dict You might be tempted to use **rloadcat** as a faster alternative to **load** however this is not advised due to the extra things that **rloadcat** does such as update comparisons and fastforwarding and potentially merging in (when the decouped option is used). In comparison the **load** command blasts what comes before it, this can be done using **forced_rloadcat** with the ``--DROP`` option:: ./db.py --DROP tmp_offline_db forced_rloadcat ~/dybaux/catalog/tmp_offline_db After which you can check operation via an **rdumpcat** back onto the working copy, before doing any updates:: ./db.py tmp_offline_db rdumpcat ~/dybaux/catalog/tmp_offline_db svn st ~/dybaux/catalog/tmp_offline_db ## should show no changes Reading full catalog into memory is expensive. #. can I omit the payload tables from the read ? """ log.debug("rloadcat %s %s %s " % ( self.sect, repr(args), repr(kwa))) assert len(args) > 0, "argument specifying the path of an existing catalog directory is required %s " % args[0] tselect = self.paytables log.debug("rloadcat_ tselect %r " % tselect ) cat = AsciiCat(args[0], skip_pay_check=self.skip_pay_check, allow_partial=self.allow_partial ) updates = cat.updates( self , tselect=tselect , fastforward=False ) if len(updates) == 0: log.warn("no updates (new tables or new SEQNO) are detected, nothing to do ") return {} for tn,seq in updates.items(): log.info(" %-20s has %d new SEQNO : %r " % ( tn, len(seq), seq ) ) uptabs = updates.keys() ## payload names only log.info("changed tables %r " % uptabs ) for tn in uptabs: if tn not in self.tables: raise Exception("updated table %s is not in selected table list %r " % (tn,self.tables)) if self.opts['noconfirm']: log.info("proceed without confirmation due to noconfirm option " ) else: var = raw_input("Enter YES to proceed with rloadcat for : %r " % updates.keys() ) if var == "YES": log.info("user consents to update tables %r " % updates.keys() ) else: log.info("user declined to update " ) return updates_ = cat.updates( self , tselect=tselect, fastforward=True ) ## run again with fastforward enabled assert updates == updates_ alltabs = sorted( map(lambda _:"%sVld"%_, uptabs) + uptabs + ["LOCALSEQNO"] ) ## double up for pairs and machinery table LOCALSEQNO showtables = self._get_showtables(nocache=True) for tab in alltabs: if tab not in showtables: raise Exception("table %(tab)s does not exist " % locals() ) for tn in sorted(uptabs) + ["LOCALSEQNO"]: self.loadcsv( cat, tn ) ## force DB access following a load in order to notice changes self.wipe_cache() return updates
[docs] def loadcsv( self, cat, tn ): """ :param cat: AsciiCat instance :param tn: string payload table name or LOCALSEQNO """ tabs = ( tn, ) if tn == "LOCALSEQNO" else ( tn,tn+'Vld' ) csvs = map(lambda _:cat[_], tabs ) paths = map(lambda _:_.abspath, csvs ) replace_ignore = "REPLACE" if tn == "LOCALSEQNO" else "IGNORE" ## ignore is effective default for remote imports anyhow log.info("loadcsv_ %(tn)s loading paths %(paths)s into tabs %(tabs)s replace_ignore %(replace_ignore)s " % locals() ) if self.opts.get('DROP'): log.warn("dropping and recreating table ") assert self.sect == 'tmp_offline_db' and os.getlogin() == 'blyth' for tab,csv in zip(tabs,csvs): ctx = dict(tab=tab,hdr=csv.hdr) self("DROP TABLE IF EXISTS %(tab)s " % ctx ) self("CREATE TABLE %(tab)s ( %(hdr)s )" % ctx) else: pass if not self.opts.get('local'): impr = MySQLImport(self.dbc) xopts = "" if self.opts['nolock'] else "--lock-tables" out = impr(csvpaths=paths, verbose=True, replace_ignore="--"+replace_ignore.lower() , xopts=xopts ) log.info(out) else: for path,tab in zip(paths,tabs): ctx = dict(replace_ignore=replace_ignore,path=path, tab=tab) self("LOAD DATA LOCAL INFILE '%(path)s' %(replace_ignore)s INTO TABLE %(tab)s FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' IGNORE 1 LINES " % ctx)
[docs] def forced_rloadcat_(self, *args, **kwa ): """ Forcible loading of a catalog ... FOR TESTING ONLY """ assert self.sect == 'tmp_offline_db' restrict = self.opts['RESTRICT'].split(",") log.info("forced_rloadcat_ restrict %s " % restrict ) cat = AsciiCat(args[0]) for tn in map(lambda _:_[:-3],filter(lambda _:_[-3:] == 'Vld', cat )) + ["LOCALSEQNO"]: if len(restrict) == 0 or tn in restrict: self.loadcsv( cat , tn ) else: log.warn("forced_rloadcat_ RESTRICT option excludes tn %s " % tn ) return cat
[docs] def rcmpcat_(self, *args, **kwa ): """ Just dumps a comparison between target DB and ascii catalog, allowing the actions an **rloadcat** will do to be previewed. Compares DBI vitals such as LASTUSEDSEQNO between a DBI database and a DBI ascii catalog, usage:: ./db.py tmp_offline_db rcmpcat ~/dybaux/catalog/tmp_offline_db """ log.debug("rcmpcat %s %s %s " % ( self.sect, repr(args), repr(kwa))) assert len(args) > 0, "argument specifying the path of the catalog folder to be compared with is required " tselect = self.paytables log.debug("rcmpcat_ tselect %r " % tselect ) cat = AsciiCat(args[0], skip_pay_check=self.skip_pay_check ) updates = cat.updates( self , tselect=tselect , fastforward=False ) if len(updates) == 0: log.warn("no updates (new tables or new SEQNO) are detected ") uptabs = updates.keys() uptabs = sorted( map(lambda _:"%sVld"%_, uptabs) + uptabs + ["LOCALSEQNO"] ) ## double up for pairs and machinery table LOCALSEQNO for tn in uptabs: if tn not in self.tables: raise Exception("updated table %s is not in selected table list %r " % (tn,self.tables)) for tn,seq in updates.items(): log.info(" %-20s has %d new SEQNO : %r " % ( tn, len(seq), seq ) ) return updates
[docs] def dump_(self, *args, **kwa): """ Dumps tables from any accessible database into a mysqldump file. Usage:: db.py offline_db dump /tmp/offline_db.sql ## without -t a default list of tables is dumped db.py -t CableMap,HardwareID offline_db dump /tmp/offline_db.sql tail -25 /tmp/offline_db.sql ## checking tail, look for the LASTUSEDSEQNO entries Use the ``-t/--tselect`` option with a comma delimited list of to select payload tables. Corresponding validity tables and the `LOCALSEQNO` table are included automatically. The now default ``-d/--decoupled`` option means that the ``LOCALSEQNO`` table is dumped separately and only contains entries corresponding to the selected tables. The decoupled dump can be loaded into ``tmp_offline_db`` without any special options, as the table selection is reflected within the dump:: db.py tmp_offline_db load /tmp/offline_db.sql Partial dumping is implemented using:: mysqldump ... --where="TABLENAME IN ('*','CableMap','HardwareID')" LOCALSEQNO """ self.predump() msg = r""" performing mysqldump DO NOT INTERRUPT FOR A VALID DUMP ... MAY TAKE ~30s OR MORE DEPENDING ON NETWORK """ assert len(args) == 1, "dump_ : ERROR an argument specifying the path of the dumpfile is required" if self.opts['decoupled']: dmpr = MySQLDumpDBI(self.dbc) log.warn( "using decoupled dump : only entries corresponding to the table selection are included in the LOCALSEQNO table ") else: dmpr = MySQLDump(self.dbc) log.info( msg ) if self.opts['all']: tables = [] else: tables = self.tables log.info("dumping tables %r to %s " % (tables, args[0]) ) ret = dmpr( tables , args[0] ) log.info( ret )
[docs] def load_(self, *args, **kwa): """ Loads tables from a mysqldump file into a target db, the target db is configured by the parameters in the for example `tmp_offline_db` section of the config file. For safety the name of the configured target database must begin with `tmp_` .. note:: CAUTION IF THE TARGET DATABASE EXISTS ALREADY IT WILL BE DROPPED AND RECREATED BY THIS COMMAND Usage example:: db.py tmp_offline_db load /tmp/offline_db.sql """ dbn = self.dbc['database'] assert dbn.startswith('tmp_'), "load_ ERROR : configured database name must start with tmp_ : %s " % dbn path = args[0] assert os.path.exists(path) , "load_ ERROR : need an existing path to a mysqldump file : %s " % path if self.opts['APPENDLOAD']: log.warn("APPENDLOAD option prevents database dropping before load, resulting in commingled tables : NOT SOP KOSHER ") else: self("DROP DATABASE IF EXISTS %(database)s" % self.dbc ) self("CREATE DATABASE %(database)s" % self.dbc ) lodr = MySQLLoad(self.dbc) ret = lodr(path) log.info(ret)
[docs] def read_seqno(self, tab="LOCALSEQNO"): """ Read LASTUSEDSEQNO entries from table ``LOCALSEQNO`` """ self._seqno = {} for d in self("select * from %s " % tab ): n,t = int(d['LASTUSEDSEQNO']), d['TABLENAME'] if t == "*":continue self._seqno[t] = n
[docs] def check_seqno(self): """ Compares the LASTUSEDSEQNO entries read into ``self._seqno`` with the ``max(SEQNO)`` results of selects on the DB payload and validity tables. """ for t,n in self._seqno.items(): v = int(self("select max(SEQNO) as n from %s " % "%sVld" % t )[0]["n"]) p = int(self("select max(SEQNO) as n from %s " % t )[0]["n"]) sseq = set([v,p,n]) #1392 nseq = len(sseq) if nseq == 1: log.debug("check_seqno succeeds for table %s n/v/p: %s %s %s " % ( t,n,v,p) ) pass elif nseq == 2: dseq = max(sseq) - min(sseq) if dseq == 1: log.info("check_seqno permitting single SEQNO offset, presumably update in progress t:%s n:%s v:%s p:%s " ( t,n,v,p )) else: assert 0, ( "seqno mismatch dseq %s for %s " % (dseq,t) , n,v,p ) else: log.fatal("check_seqno nseq %s v/p/n all different this should not happen %s : %s %s %s " % (nseq, t, v,p,n )) assert 0, dseq
[docs] def get_seqno(self): """ SEQNO accessor, reading and checking is done on first access to ``self.seqno`` with :: db = DB() print db.seqno ## checks DB print db.seqno ## uses cached del db._seqno ## force a re-read and check print db.seqno """ if hasattr(self, '_seqno' ): return self._seqno self.read_seqno() self.check_seqno() return self._seqno
seqno = property( get_seqno ) def vseqnos(self, tn ): return map(lambda _:int(_["n"]), self("select SEQNO as n from %sVld" % tn )) def read_allseqno(self): self._allseqno = {} for tn,lus in self.seqno.items(): self._allseqno[tn] = self.vseqnos(tn)
[docs] def check_allseqno(self): """ """ assert sorted(self._allseqno.keys()) == sorted(self._seqno.keys()), "seqno keys mismatch " fabseqno = self.fabseqno for tn in self._allseqno.keys(): if fabseqno[tn] != self._seqno[tn]: msg = "check_allseqno mismatch for %s %s %s " % ( tn, fabseqno[tn], self._seqno[tn]) if tn in self.miscreants: log.warn( msg + " (KNOWN MISCREANT) " ) else: log.fatal(msg) assert 0 else: pass if len(self.miscreants) == 0: assert fabseqno == self._seqno , ("fabseqno mismatch ", fabseqno, self._seqno ) else: log.debug( "fabseqno from allseqno %r seqno from LOCALSEQNO table %r " % (fabseqno, self._seqno )) pass
[docs] def get_allseqno(self): """ Provides a table name keyed dict containing lists of all SEQNO in each Vld table The tables included correspond to the read DBI tables (namely those in LOCALSEQNO) """ if hasattr(self, '_allseqno' ): return self._allseqno self.read_allseqno() self.check_allseqno() return self._allseqno
allseqno = property( get_allseqno , doc=get_allseqno.__doc__ )
[docs] def wipe_cache(self): """ Wipe the cache forcing DB access to retrieve the info afresh This is needed when wish to check status after a DB load from the same process that performed the load. """ log.debug("wipe_cache") if hasattr(self, '_seqno' ): del self._seqno if hasattr(self, '_allseqno' ): del self._allseqno
miscreants = ('CalibPmtSpec','CalibFeeSpec',) # expect 'FeeCableMap' not here by luck
[docs] def get_fabseqno(self): """ Summarizes ``db.allseqno``, by fabricating a dict keyed by table name contaoning the number of Vld SEQNO (from length of values in ``db.allseqno``) This dict can be compared with ``db.seqno``, which is obtained from the LASTUSEDSEQNO entries in the ``LOCALSEQNO`` table:: Assuming kosher DBI handling of tables this fabricated dict ``db.fabseqno`` should match ``db.seqno``, meaning that SEQNO start from 1 and have no gaps. .. code-block:: ipython In [1]: from DybPython import DB In [2]: db = DB("tmp_fake_offline_db") In [3]: db.seqno ## queries the LOCALSEQNO table in DB Out[3]: {'CableMap': 213, 'CalibFeeSpec': 113, 'CalibPmtSpec': 29, 'FeeCableMap': 3, 'HardwareID': 172} In [4]: db.fabseqno ## a summarization of db.allseqno Out[4]: {'CableMap': 213, 'CalibFeeSpec': 111, 'CalibPmtSpec': 8, 'FeeCableMap': 3, 'HardwareID': 172} In [5]: db.miscreants ## assertions avoided by miscreant status Out[5]: ('CalibPmtSpec', 'CalibFeeSpec') """ if hasattr(self, '_fabseqno' ): return self._fabseqno self._fabseqno = dict(map(lambda(k,v):(k,len(v)),self.allseqno.items())) return self._fabseqno
fabseqno = property( get_fabseqno, doc=get_fabseqno.__doc__ )
[docs] def docs( cls ): """ collect the docstrings on command methods identified by naming convention of ending with _ (and not starting with _) """ mdoc = lambda m:getattr(m,'__doc__',None) mdocs = [ dict(meth=k[:-1],doc=mdoc(v)) for k,v in [(k,v) for k,v in inspect.getmembers(cls) if k[-1]=='_' and k[0] != '_' and mdoc(v)]] return "\n".join([ """ %(meth)s : %(doc)s """ % d for d in mdocs ])
docs = classmethod(docs)
[docs] def has_table( self, tn ): """ :param tn: table name :return exists: if table exists in the DB """ return len(self("show tables like '%s'" % tn )) == 1
[docs] def tab( self, name ): """ :param name: DBI payload table name """ from tab import Tab return Tab( name, self)
[docs] def cli_(self, *args, **kwa ): """ Emit to stdout the shell commandline for connecting to a mysql DB via the client, without actually doing so. The section names depends on content of :file:`~/.my.cnf` Usage:: eval $(db.py tmp_offline_db cli) Bash function examples to define in :file:`~/.bash_profile` using this command:: idb(){ local cnf=$1 ; shift ; eval $(db.py $cnf cli) $* ; } offline_db(){ idb $FUNCNAME $* ; } tmp_offline_db(){ idb $FUNCNAME $* ; } tmp_etw_offline_db(){ idb $FUNCNAME $* ; } tmp_jpochoa_offline_db(){ idb $FUNCNAME $* ; } ihep_dcs(){ idb $FUNCNAME $* ; } Invoke the shortcut with fast start extra argument for the client:: ihep_dcs -A Note a lower level *almost* equivalent command to this sub-command for standalone usage without `db.py` is provided by `my.py` which can probably run with the older system python alone. Install into your PATH with:: svn export http://dayabay.ihep.ac.cn/svn/dybsvn/dybgaudi/trunk/DybPython/scripts/my.py """ cmd = MySQLCmd(self.dbc)() if not kwa.get('silent'): print cmd return cmd
def main(): """ """ #problem with exclusion, is that added scraped tables would be a nasty surprise tselect = "CalibFeeSpec,CalibPmtSpec,FeeCableMap,SimPmtSpec,HardwareID,CableMap,CoordinateAd,CoordinateReactor,PhysAd,Reactor,CalibPmtHighGain,CalibPmtPedBias,CalibSrcEnergy,EnergyRecon,CalibPmtFineGain,CalibPmtTiming,AdWpHvMap,AdWpHvToFee,AdWpHvSetting,GoodRunList,McsPos,DqChannelStatus" from optparse import OptionParser op = OptionParser(usage=__doc__ + "\n" + DB.docs() ) op.add_option("-v", "--verbose", action="store_true" ) op.add_option("-a", "--all", action="store_true" , help="Do not apply table exclusions for the command. Default %default " ) op.add_option("-t", "--tselect", help="Comma delimited list of DBI payload table names to be included in operations. Default %default " ) op.add_option("-d", "--decoupled", dest="decoupled", action="store_true", help="Decoupled LOCALSEQNO handling. Default %default " ) op.add_option("-D", "--nodecoupled",dest="decoupled", action="store_false", help="Non-decoupled operation. Default %default " ) op.add_option("-b", "--tmpbase", help="Path of existing temporary base directory. Default %default " ) op.add_option("-s", "--safe", dest="safe", action="store_true", help="Switch on safe csv writing. Default %default " ) op.add_option("-S", "--nosafe", dest="safe", action="store_false", help="Switch off safe csv writing. Default %default " ) op.add_option("-N", "--noconfirm",action="store_true", help="Skip action confirmation questions. Default %default " ) op.add_option( "--nolock", action="store_true", help="Do not use the mysqlimport lock-tables option when importing in rloadcat. Default %default " ) op.add_option( "--local", action="store_true", help="Use local only variant loading, sever and client must be on same machine.. EXPERT USE ONLY. Default %default " ) op.add_option( "--DROP", action="store_true", help="DROP and CREATE TABLES BEFORE LOAD.. EXPERT USE ONLY. Default %default " ) op.add_option( "--APPENDLOAD", action="store_true", help="DO NOT DROP DB PRIOR TO LOAD.. EXPERT USE ONLY. Default %default " ) op.add_option( "--NOLOCALSEQNO", action="store_true", help="DO NOT INCLUDE LOCALSEQNO IN TABLE OPERATIONS. EXPERT USE ONLY. Default %default " ) op.add_option( "--NOPAIRING", action="store_true", help="DO NOT ASSUME DBI PAIRING IN TABLE OPERATIONS. JUST USE TSELECT VERBATIM. EXPERT USE ONLY. Default %default " ) op.add_option("-P", "--ALLOW_PARTIAL", action="store_true", help="ALLOW PARTIAL ASCII CATALOGS. EXPERT USE ONLY. Default %default " ) op.add_option( "--EXTRA_PAYLOAD_CHECKS", action="store_true", help="Load payload tables into memory in AsciiCat, allowing extra payload checking. Default %default " ) op.add_option( "--RESTRICT", help="EXPERT USE ONLY. Comma delimited list of table names that restricts csv tables loaded from an ascii catalog by forced_rloadcat. Default %default " ) op.add_option( "--LESSMEMORY", action="store_true", help="Attempt to operate with less memory expense, but possible more network activity resulting in slower operations. Default %default " ) op.add_option( "--OVERRIDE",action="store_true", help="EXPERT USE ONLY. Allow size reductions of csv by rdumpcat. Default %default " ) op.add_option("-C", "--nocheck", action="store_false", help="Skip connection check at startup, for debugging usage only. Default %default " ) op.add_option("-l", "--loglevel", help="Choose logging level case insensitively eg INFO,DEBUG,WARN. Default %default " ) op.add_option("-L", "--nolog", action="store_true", help="Skip logging config, eg when this is done elsewhere such as in dybaux. Default %default ") op.set_defaults(all=False, verbose=False, nocheck=False, noconfirm=False, safe=True, decoupled=True, tselect=tselect, tmpbase="/tmp", loglevel="INFO", local=False, nolog=False, nolock=False, DROP=False, APPENDLOAD=False, NOLOCALSEQNO=False, NOPAIRING=False, ALLOW_PARTIAL=False, RESTRICT=None, LESSMEMORY=False, OVERRIDE=False, EXTRA_PAYLOAD_CHECKS=False, ) (opts_ , args) = op.parse_args() opts = vars(opts_) opts['SKIP_PAY_CHECK'] = not opts['EXTRA_PAYLOAD_CHECKS'] if not opts['nolog']: logging.basicConfig(level = getattr( logging, opts['loglevel'].upper() )) if opts['decoupled'] and not opts['safe']: log.fatal("option --decoupled is incompatible with --nosafe ") sys.exit(1) sect = len(args)>0 and args[0] or "offline_db" cmd = len(args)>1 and "%s_" % args[1] or "count_" ## "load" is a special case as need to connect without database specified ## allowing dropping of any preexisting DB and creation of a new one nodb = cmd == 'load_' db = DB(sect, verbose=opts['verbose'], opts=opts, nodb=nodb ) if not opts.get('nocheck'): log.info("%r", db.check_(**opts)) if opts['verbose'] and nodb == False: log.info("showtables : %s " % repr(db.showtables) ) log.info(" tables : %s " % repr(db.tables)) if hasattr(db, cmd): getattr( db , cmd)( *args[2:], **opts ) else: raise Exception("cmd %s not implemented " % cmd) return db #db.close() if __name__=='__main__': db = main()