/search.css" rel="stylesheet" type="text/css"/> /search.js">
| Classes | Job Modules | Data Objects | Services | Algorithms | Tools | Packages | Directories | Tracs |

Scraper::dq::db::DB Class Reference


Public Member Functions

def __init__
def execute_
def fetchall
def fields
def group_by_digest
def compare_by_digest
def group_by_range
def digest_table_scan
def __call__

Public Attributes

 sect
 dbc
 database
 conn
 group_by
 count

Properties

 size = property(_get_size, doc="Size estimate of the DB in MB ")
 databases = property(_get_databases, doc="List of database names obtained from information_schema.tables")
 datadir = property(_get_datadir, doc="Query DB server to find the datadir, eg /var/lib/mysql/ OR /data/mysql/ ")

Private Member Functions

def _query_size
def _get_size
def _get_databases
def _get_datadir

Private Attributes

 _size

Detailed Description

Definition at line 17 of file db.py.


Constructor & Destructor Documentation

def Scraper::dq::db::DB::__init__ (   self,
  sect,
  database = None,
  group_concat_max_len = 8192,
  group_by = "SEQNO" 
)
:param sect: used as the `read_default_group` in the MySQLdb connection
:param database: name of the DB that overrides any setting within the config section

Definition at line 18 of file db.py.

00019                                                                                          :
00020         """
00021         :param sect: used as the `read_default_group` in MySQLdb connection 
00022         :param database: name of DB that overrides any setting within the 
00023         """
00024         self.sect = sect
00025         dbc = MyCnf("~/.my.cnf").mysqldb_pars(sect, database=database)
00026         self.dbc = dbc
00027         self.database = dbc.get('db', None)
00028         log.debug("connecting to %s " % dict(dbc, passwd="***"))
00029         try:  
00030             conn = MySQLdb.connect( **dbc )   # huh, version variation in accepted params
00031         except MySQLdb.Error, e: 
00032             raise Exception("Error %d: %s " % ( e.args[0], e.args[1] ) )
00033         self.conn = conn
00034         self._size = None
00035         self("set @@group_concat_max_len = %(group_concat_max_len)s" % locals())
00036         self.group_by = group_by
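
A minimal construction sketch (the import path and the `dq` section name are assumptions; any ~/.my.cnf section holding connection parameters would do):

    from Scraper.dq.db import DB                    # assumed import path for this module

    db  = DB("dq")                                  # "dq" is a hypothetical ~/.my.cnf section
    db2 = DB("dq", database="channelquality_db")    # override the db named in that section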


Member Function Documentation

def Scraper::dq::db::DB::execute_ (   self,
  cmd 
)

Definition at line 37 of file db.py.

00038                            :
00039         cursor = self.conn.cursor(MySQLdb.cursors.DictCursor)
00040         cursor.execute( cmd )
00041         return cursor

def Scraper::dq::db::DB::fetchall (   self,
  cmd 
)

Definition at line 42 of file db.py.

00043                             :
00044         cursor = self.execute_(cmd)
00045         rows = cursor.fetchall()
00046         self.count = cursor.rowcount
00047         cursor.close()
00048         return rows
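
Because `execute_` opens a `MySQLdb.cursors.DictCursor`, `fetchall` returns each row as a dict keyed by column name; a small usage sketch (table name illustrative):

    rows = db.fetchall("select SEQNO, RUNNO from channelquality_db.DqChannel limit 3")
    for row in rows:
        print row['SEQNO'], row['RUNNO']   # dict access thanks to DictCursor
    print db.count                         # rowcount recorded by fetchall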

def Scraper::dq::db::DB::fields (   self,
  dbtab,
  skipfield 
)
:param dbtab: dbname and table specification string, e.g. `channelquality_db.DqChannel`, on the same server as self
:return: list of field names

::

    mysql> select column_name from information_schema.columns where concat(table_schema,'.',table_name) = 'channelquality_db.DqChannel' ;
    +-------------+
    | column_name |
    +-------------+
    | SEQNO       | 
    | ROW_COUNTER | 
    | RUNNO       | 
    | FILENO      | 
    | CHANNELID   | 
    | OCCUPANCY   | 
    | DADCMEAN    | 
    | DADCRMS     | 
    | HVMEAN      | 
    | HVRMS       | 
    +-------------+
    10 rows in set (0.01 sec)

Definition at line 49 of file db.py.

00050                                        :
00051         """
00052         :param dbtab: dbname and table specifcation string eg `channelquality_db.DqChannel` on same server as self
00053         :return: list of fields names
00054 
00055         ::
00056 
00057             mysql> select column_name from information_schema.columns where concat(table_schema,'.',table_name) = 'channelquality_db.DqChannel' ;
00058             +-------------+
00059             | column_name |
00060             +-------------+
00061             | SEQNO       | 
00062             | ROW_COUNTER | 
00063             | RUNNO       | 
00064             | FILENO      | 
00065             | CHANNELID   | 
00066             | OCCUPANCY   | 
00067             | DADCMEAN    | 
00068             | DADCRMS     | 
00069             | HVMEAN      | 
00070             | HVRMS       | 
00071             +-------------+
00072             10 rows in set (0.01 sec)
00073 
00074         """
00075         sql = "select column_name from information_schema.columns where concat(table_schema,'.',table_name) = '%(dbtab)s' " % locals()
00076         return filter(lambda _:not _ in skipfield, map(lambda _:_['column_name'], self(sql) ))
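
For example, to list the columns of the table shown above while skipping the grouping column (a sketch, assuming that table exists on the connected server):

    cols = db.fields("channelquality_db.DqChannel", skipfield=["SEQNO"])
    print cols     # ['ROW_COUNTER', 'RUNNO', 'FILENO', ... 'HVRMS']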

def Scraper::dq::db::DB::group_by_digest (   self,
  dbtab,
  limit = "0 
)

Definition at line 77 of file db.py.

00078                                                    :
00079         """
00080         """
00081         group_by = self.group_by 
00082         fields = ",".join(self.fields(dbtab, skipfield=group_by.split()))
00083         sql = "select %(group_by)s,md5(group_concat(md5(concat_ws(',',%(fields)s)) separator ',')) as digest from %(dbtab)s group by %(group_by)s limit %(limit)s " % locals()
00084         return dict((d[group_by],d['digest']) for d in self(sql))
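
The returned dict maps each value of the `group_by` column (SEQNO by default) to an MD5 digest built from all the other fields of that group, so two tables can be compared group by group; a sketch with an illustrative LIMIT clause:

    digests = db.group_by_digest("channelquality_db.DqChannel", limit="0, 10")
    # e.g. {1L: '0cc175b9c0f1...', 2L: '92eb5ffee6ae...'}    (digest values illustrative)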

def Scraper::dq::db::DB::compare_by_digest (   self,
  a,
  b,
  limit = "0 
)

Definition at line 85 of file db.py.

00086                                                      :
00087         a = self.group_by_digest( a, limit ) 
00088         b = self.group_by_digest( b, limit ) 
00089         same = a == b 
00090         if not same:
00091             log.warn("compare_by_digest difference %s %s %s %s  " % (a,b,limit,self.group_by) )
00092             assert len(a) == len(b) , "length mismatch "
00093             assert a.keys() == b.keys() , "keys mismatch "
00094             for k in sorted(a.keys()):
00095                 if a[k] != b[k]:
00096                     log.warn(" %s %s %s " % ( k, a[k], b[k] ))
00097             pass
00098         return same
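
A sketch comparing the same table in two databases on this server (the second database name is hypothetical); differing keys are logged by the method itself:

    same = db.compare_by_digest("channelquality_db.DqChannel",
                                "channelquality_db_copy.DqChannel",
                                limit="0, 1000")
    print "match" if same else "differ"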

def Scraper::dq::db::DB::group_by_range (   self,
  dbtab 
)

Definition at line 99 of file db.py.

00100                                    :
00101         group_by = self.group_by
00102         return self("select min(%(group_by)s) as min, max(%(group_by)s) as max from %(dbtab)s " % locals())[0]
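
The single returned row is a dict with `min` and `max` keys, e.g.:

    rng = db.group_by_range("channelquality_db.DqChannel")
    print rng['min'], rng['max']    # extremes of the SEQNO (group_by) column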

def Scraper::dq::db::DB::digest_table_scan (   self,
  a,
  b,
  chunk = 1000 
)
This approach is too slow to be useful with big tables.
Too much of the work is done in Python.

Definition at line 103 of file db.py.

00104                                                   :
00105         """
00106         This approach is too slow ot be useful with big tables.  
00107         Too much done in python.  
00108         """ 
00109         ar = self.group_by_range(a)
00110         br = self.group_by_range(b)
00111 
00112         log.info("a %-30s %s " % (a, ar )) 
00113         log.info("b %-30s %s " % (b, br )) 
00114 
00115         if ar == br:
00116             max = ar['max'] ; num = max - ar['min'] + 1
00117         else:
00118             max = min(ar['max'], br['max'])
00119             assert ar['min'] == br['min']
00120             num = max - ar['min'] + 1 
00121            
00122         log.info("common max %s num %s " % (max, num ))
00123         offset = 0
00124         while offset < num:
00125             limit = "%(offset)s, %(chunk)s " % locals()
00126             same = self.compare_by_digest(a, b, limit )
00127             log.info( " %s : %s " % ( limit, same ) )
00128             offset += chunk
00129 
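
Despite the noted slowness, a chunked scan is invoked like this (database names are illustrative); each chunk of `chunk` groups is compared via compare_by_digest and the result logged:

    db.digest_table_scan("channelquality_db.DqChannel",
                         "channelquality_db_copy.DqChannel",
                         chunk=1000)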

def Scraper::dq::db::DB::_query_size (   self) [private]

Definition at line 130 of file db.py.

00131                          :
00132         sql = "select round(sum((data_length+index_length-data_free)/1024/1024),2) as TOT_MB from information_schema.tables where table_schema = '%(db)s' " % self.dbc
        return float(self(sql)[0]['TOT_MB'])

def Scraper::dq::db::DB::_get_size (   self) [private]

Definition at line 133 of file db.py.

00134                        :
00135         if self._size is None:
00136              self._size = self._query_size()
        return self._size

def Scraper::dq::db::DB::_get_databases (   self) [private]
This query gives fewer results than `show databases`, which would demand skipping some databases to avoid errors when obtaining their sizes.
#skip = "hello hello2 other test_noname tmp_cascade_2 tmp_dbitest tmp_tmp_offline_db_2".split()  

Definition at line 139 of file db.py.

00140                             :
00141         """
00142         This query gives fewer results than `show databases`, which demands skips to avoid errors in getting sizes 
00143         #skip = "hello hello2 other test_noname tmp_cascade_2 tmp_dbitest tmp_tmp_offline_db_2".split()  
00144         """
00145         sql = "select distinct(table_schema) from information_schema.tables"
        return map(lambda _:_['table_schema'],self(sql))

def Scraper::dq::db::DB::_get_datadir (   self) [private]

Definition at line 148 of file db.py.

00149                           :
        return self("select @@datadir as datadir")[0]['datadir']
def Scraper::dq::db::DB::__call__ (   self,
  cmd 
)

Definition at line 152 of file db.py.

00153                            :
00154         log.debug(cmd)
00155         return self.fetchall(cmd)
00156 
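
Since `__call__` simply delegates to `fetchall`, the instance itself can be used as a query function, as the constructor does for `set @@group_concat_max_len`; for example:

    for row in db("show tables from channelquality_db"):    # illustrative database name
        print row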


Member Data Documentation

Scraper::dq::db::DB::sect, dbc, database, conn, group_by, and _size

Definition at line 21 of file db.py.

Scraper::dq::db::DB::count

Definition at line 42 of file db.py.


Property Documentation

Scraper::dq::db::DB::size = property(_get_size, doc="Size estimate of the DB in MB ") [static]

Definition at line 137 of file db.py.

Scraper::dq::db::DB::databases = property(_get_databases, doc="List of database names obtained from information_schema.tables") [static]

Definition at line 146 of file db.py.

Scraper::dq::db::DB::datadir = property(_get_datadir, doc="Query DB server to find the datadir, eg /var/lib/mysql/ OR /data/mysql/ ") [static]

Definition at line 150 of file db.py.
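
The properties go through the private getters above (size is cached in _size after the first access); a usage sketch:

    print db.size         # size estimate of the DB in MB
    print db.databases    # schema names from information_schema.tables
    print db.datadir      # e.g. /var/lib/mysql/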


The documentation for this class was generated from the following file: db.py

Generated on Fri May 16 2014 09:50:03 for Scraper by doxygen 1.7.4