Scraper::base::propagator::Propagator Class Reference


Public Member Functions

def __init__
def cooldownsec_sleep
def lastresult
def maxlag
def toggle_loglevel
def handle_signal

Public Attributes

 cfg
 interval
 maxage
 sleep
 cooldownsec
 offset
 tunesleepmod
 heartbeat
 maxiter
 threshold
 aggregate
 Used config is fed forward into attributes rather than accessed directly from cfg, for clarity/control and easy mocking
 aggregate_filter
 aggregate_skips
 aggregate_count
 aggregate_group_by
 e.g. "date,hour" to trigger hourly group_by aggregation
 aggregate_group_by_att
 aggregate_group_by_territory
 seed_target_tables
 target
 srcs

Private Member Functions

def _init_sourcevector

Private Attributes

 _lastsample

Detailed Description

Base class holding features common to `Scraper` and `Averager`, namely:

#. config handling
#. signal handling

Definition at line 9 of file propagator.py.


Constructor & Destructor Documentation

def Scraper::base::propagator::Propagator::__init__ (   self,
  srcs,
  target,
  cfg 
)
:param srcs: list of source SA classes 
:param target: `Target` instance that encapsulates the DybDbi class
:param cfg: instance of relevant `Regime` subclass (which is a dict holding config)
 
Config options:

:param maxiter: maximum iterations or 0 for no limit 
:param interval: timedelta cursor step size
:param maxage: timedelta maximum age, beyond which even an unchanged row gets written
:param sleep: timedelta sleep between scrape update sampling 

Definition at line 17 of file propagator.py.

00018                                           :
00019         """
00020         :param srcs: list of source SA classes 
00021         :param target: `Target` instance that encapsulates the DybDbi class
00022         :param cfg: instance of relevant `Regime` subclass (which is a dict holding config)
00023  
00024         Config options:
00025 
00026         :param maxiter: maximum iterations or 0 for no limit 
00027         :param interval: timedelta cursor step size
00028         :param maxage: timedelta maximum age, beyond which even an unchanged row gets written
00029         :param sleep: timedelta sleep between scrape update sampling 
00030         
00031         """
00032 
00033         self.cfg = cfg
00034         self.cfg.setsignals()
00035         log.debug( "propagator __init__ cfg %r " % cfg )
00036 
00037         self.interval      = cfg.pop('interval') 
00038         self.maxage        = cfg.pop('maxage')
00039         self.sleep         = cfg.pop('sleep')
00040         self.cooldownsec   = cfg.get('cooldownsec', 0.1)
00041 
00042         self.offset        = cfg.pop('offset')
00043         self.tunesleepmod  = cfg.pop('tunesleepmod')
00044         self.heartbeat     = cfg.pop('heartbeat')
00045 
00046         self.maxiter       = cfg.pop('maxiter')
00047         self.threshold     = cfg.pop('threshold')
00048 
00049         ## feed forward used config rather than directly accessing cfg for clarity/control and easy mocking
00050         self.aggregate         = cfg.pop('aggregate', None)
00051         self.aggregate_filter  = cfg.pop('aggregate_filter', None)
00052         self.aggregate_skips   = cfg.pop('aggregate_skips', None)
00053         self.aggregate_count   = cfg.pop('aggregate_count', 'count')
00054 
00055         ## eg "date,hour" to trigger hourly group_by aggregation 
00056         self.aggregate_group_by      = cfg.pop('aggregate_group_by',  None)       
00057         self.aggregate_group_by_att  = cfg.pop('aggregate_group_by_att', None)       
00058         self.aggregate_group_by_territory  = cfg.pop('aggregate_group_by_territory', None)       
00059 
00060         self.seed_target_tables = cfg.get('seed_target_tables',False)
00061 
00062         self.target        = target
00063         self.srcs          = srcs
00064 
00065         self._lastsample  = time.time()   # starting time 
00066       
00067         if self.seed_target_tables:
00068             log.warn("seeding target tables")
00069             self.target.seed( srcs , self )
00070         else:
00071             log.warn("not seeding target tables")
00072 
00073         if len(cfg) > 0:
00074             for k in sorted(cfg):
00075                 v = cfg[k]
00076                 log.info( "%-30s : %s " % ( k , v.ctime() if type(v) == datetime else v ))
00077 
00078 
00079         log.info("target : %r srcs %s from %s  " % ( target , srcs, srcs[0].db ) )
00080 
00081         for source in srcs:
00082             self._init_sourcevector(source)
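
For orientation, here is a minimal sketch of constructing a Propagator. The
FakeRegime stand-in and the stub values are illustrative assumptions, not the
real Regime/Target classes; only the config keys mirror the options consumed
above::

    from datetime import timedelta

    class FakeRegime(dict):
        """Illustrative stand-in for a Regime subclass: a dict with signal plumbing."""
        signal = None
        def setsignals(self):
            pass    # the real Regime presumably installs OS signal handlers here

    cfg = FakeRegime(
        maxiter=0,                         # 0 means no iteration limit
        interval=timedelta(seconds=10),    # cursor step size
        maxage=timedelta(hours=1),         # write even unchanged rows beyond this age
        sleep=timedelta(seconds=3),        # sleep between scrape update samplings
        offset=0, tunesleepmod=None, heartbeat=None, threshold=None,
    )
    # with real source SA classes and a Target instance:
    # propagator = Propagator(srcs=[SomeSourceSA], target=the_target, cfg=cfg)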


Member Function Documentation

def Scraper::base::propagator::Propagator::cooldownsec_sleep (   self)
Calling this just prior to making a query sleeps for the
requisite floating point seconds to meet the
configured `cooldownsec`

Definition at line 83 of file propagator.py.

00084                                :
00085         """
00086         Calling this just prior to making a query sleeps for the
00087         requisite floating point seconds to meet the
00088         configured `cooldownsec`
00089         """
00090         cooldownsec = self._lastsample + self.cooldownsec - time.time()
00091         if cooldownsec > 0:
00092             log.debug("cooldownsec_sleep %s %s %s " % ( cooldownsec , self._lastsample, self.cooldownsec )) 
00093             time.sleep( cooldownsec )
00094             pass 
00095         self._lastsample = time.time()  
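
The same throttling pattern, extracted as a standalone sketch (the Throttle
name is illustrative)::

    import time

    class Throttle(object):
        """Ensure at least cooldownsec seconds elapse between successive samples."""
        def __init__(self, cooldownsec=0.1):
            self.cooldownsec = cooldownsec
            self._lastsample = time.time()
        def wait(self):
            remain = self._lastsample + self.cooldownsec - time.time()
            if remain > 0:
                time.sleep(remain)          # sleep only the remaining fraction
            self._lastsample = time.time()

    throttle = Throttle(0.5)
    for _ in range(3):
        throttle.wait()                     # at most one pass per 0.5 seconds
        # ... issue the query here ...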
 
def Scraper::base::propagator::Propagator::lastresult (   self,
  r 
)
:param r: lastresult enum 
:return:  True if all sources have the result state passed, otherwise False

Definition at line 96 of file propagator.py.

00097                             :
00098          """
00099          :param r: lastresult enum 
00100          :return:  True if all sources have the result state passed, otherwise False
00101          """
00102          for sv in self:
00103              if sv.lastresult != r:
00104                  return False
00105          return True
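
Since the Propagator is list-like (iterating yields the SourceVectors appended
by _init_sourcevector), the loop is equivalent to the builtin reduction::

    return all(sv.lastresult == r for sv in self)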

def Scraper::base::propagator::Propagator::_init_sourcevector (   self,
  source 
) [private]
:param source: 

Logical coordinates from each source are used to access the (usually source specific)
target last validity, i.e. different contexts within the same target table are examined
in order to set the time cursors corresponding to each of the sources, and hence the SourceVectors.

When writing into empty tables there is no target last validity; in this case ``tcursor=None``,
which is only permissible when averaging. External binning based on source times is used.

Definition at line 106 of file propagator.py.

00107                                           :
00108         """ 
00109         :param source: 
00110 
00111         Logical coordinates from each source are used to access the (usually source specific) 
00112         target last validity, i.e. different contexts within the same target table are examined
00113         in order to set the time cursors corresponding to each of the sources, and hence the SourceVectors.
00114 
00115         When writing into empty tables there is no target last validity; in this case ``tcursor=None``,
00116         which is only permissible when averaging. External binning based on source times is used. 
00117         """
00118         tlv = self.target.lastvld( source.xtn )  
00119         if tlv:
00120             timeend = tlv.contextrange.timeend            ## UTC TimeStamp
00121             tcursor = timeend.UTCtoNaiveLocalDatetime     
00122             log.debug( "tlv %r %r " % ( source.xtn , tlv ) )
00123             log.info(  "timecursor(local) %r %s " % ( source.xtn, tcursor.ctime() ) )
00124         else: 
00125             tcursor = None
00126             log.warn("no target last validity for source context  %r : writing into empty tables " % ( source )  ) 
00127         pass
00128         sv = SourceVector(self, source) 
00129         sv.tcursor = tcursor
00130         self.append( sv  ) 
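
The branch above reduces to this fallback pattern, sketched here with the
attribute names taken from the listing::

    def initial_timecursor(target, source):
        """Derive a per-source time cursor from target last validity.
        Returns None for an empty target table, the averaging-only case."""
        tlv = target.lastvld(source.xtn)
        if not tlv:
            return None
        timeend = tlv.contextrange.timeend      # UTC TimeStamp
        return timeend.UTCtoNaiveLocalDatetime  # naive local datetime cursor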

def Scraper::base::propagator::Propagator::maxlag (   self)
Returns the maximum lag over all sources as a timedelta,
or None when the scraper is not behind.

Definition at line 131 of file propagator.py.

00132                     :
00133         """
00134         Returns the maximum lag over all sources as a timedelta,
00135         or None when the scraper is not behind.
00136         """
00137         lags = filter( lambda _:_, map( lambda sv:sv.lag(), self ))
00138         return max(lags) if len(lags) >  0 else None 
00139 
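
The listing is Python 2 style; an equivalent modern spelling of the same
reduction would be (a sketch, assuming sv.lag() returns a timedelta or None)::

    lags = [lag for lag in (sv.lag() for sv in self) if lag]
    return max(lags) if lags else None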
 
def Scraper::base::propagator::Propagator::toggle_loglevel (   self)
Toggle loglevel up/down from configured level in response to HUP signals

Definition at line 140 of file propagator.py.

00141                              :
00142         """
00143         Toggle loglevel up/down from configured level in response to HUP signals
00144         """ 
00145         c = getattr( logging, self.cfg['loglevel'] )
00146         e = log.getEffectiveLevel()
00147         l = ( logging.NOTSET, logging.DEBUG, logging.INFO, logging.WARN, logging.ERROR, logging.FATAL )
00148         i = l.index(e)
00149         v = l[(i-1)%6] if c == e else c   
00150         log.info("loglevel configured/effective/adjusted %s %s => %s  " % ( c, e, v) )
00151         log.setLevel(v)    
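
A worked trace of the toggle, assuming the configured loglevel is INFO::

    import logging
    l = (logging.NOTSET, logging.DEBUG, logging.INFO,
         logging.WARN, logging.ERROR, logging.FATAL)
    c = logging.INFO                  # configured level
    e = logging.INFO                  # effective level before the first HUP
    v = l[(l.index(e) - 1) % 6] if c == e else c
    assert v == logging.DEBUG         # first HUP steps to the more verbose DEBUG
    e = v
    v = l[(l.index(e) - 1) % 6] if c == e else c
    assert v == logging.INFO          # second HUP restores the configured level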

def Scraper::base::propagator::Propagator::handle_signal (   self)
Use `kill` to send the signal, eg::

    kill -HUP $(pgrep -f pmthv_scraper)
    

Definition at line 152 of file propagator.py.

00153                            :
00154         """
00155         Use `kill` to send the signal, eg::
00156 
00157             kill -HUP $(pgrep -f pmthv_scraper)
00158     
00159         """
00160         if self.cfg.signal:
00161             sig, self.cfg.signal = self.cfg.signal, None         
00162             if sig == signal.SIGHUP:
00163                 log.warn("received SIGHUP ")   
00164                 self.toggle_loglevel()   
00165             elif sig == signal.SIGTERM:
00166                 log.warn("received SIGTERM ")   
00167                 sys.exit(0)
00168             else:
00169                 log.warn("received unhandled signal %s " % sig )       
00170 
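
For context, a sketch of the latching that cfg.setsignals() presumably
performs: the OS handler only records the signal, deferring the real work to
handle_signal() at a safe point in the main loop::

    import signal

    class Cfg(object):
        signal = None                             # latched, consumed by handle_signal
        def setsignals(self):
            for sig in (signal.SIGHUP, signal.SIGTERM):
                signal.signal(sig, self._latch)
        def _latch(self, signum, frame):
            self.signal = signum                  # record only; act later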


Member Data Documentation

cfg, interval, maxage, sleep, cooldownsec, offset, tunesleepmod, heartbeat, maxiter, threshold

Definition at line 29 of file propagator.py.

aggregate, aggregate_filter, aggregate_skips, aggregate_count

Used config is fed forward into attributes rather than accessed directly from cfg, for clarity/control and easy mocking.

Definition at line 30 of file propagator.py.

aggregate_group_by, aggregate_group_by_att, aggregate_group_by_territory, seed_target_tables, target, srcs, _lastsample

e.g. "date,hour" to trigger hourly group_by aggregation (the aggregate_group_by option).

Definition at line 31 of file propagator.py.

The documentation for this class was generated from the following file: propagator.py

Generated on Fri May 16 2014 09:50:03 for Scraper by doxygen 1.7.4