Offline Framework

Introduction

When writing software it is important to manage complexity. One way to do that is to organize the software around functionality that is generic to many specific, although possibly similar, applications. The goal of such a framework is to provide software which “does everything” except those specific things that make each application unique. If done well, this allows unique applications to be implemented quickly, and in a way that is robust against future development yet flexible enough to allow the application to be taken in novel directions.

This can be contrasted with the inverted design of a toolkit. Here one focuses on units of functionality with no initial regard for integration. One builds libraries of functions or objects that solve small parts of the whole design and, after they are developed, finds ways to glue them all together. This is a useful design, particularly when there are ways to glue disparate toolkits together, but it can lead to redundant development and interoperability problems.

Finally there is the middle ground where a single, monolithic application is built from the ground up. When unforeseen requirements are found, their solution is bolted on in whatever way is most expedient. This can be useful for quick initial results but eventually becomes unmaintainable without ever-growing levels of effort.

Framework Components and Interfaces

Gaudi components are special classes that can be used by other code without explicitly compiling against them. They can do this because they inherit from and implement one or more special classes called “interface classes” or just interfaces. These are lightweight and your code compiles against them. Which actual implementation is used is determined at run time by looking it up by name. Gaudi interfaces are special for a few reasons:

Pure-virtual:
all methods are declared =0 so that implementations are required to provide them. This is the definition of an “interface class”. Being pure-virtual also allows for an implementation class to inherit from multiple interfaces without problem.
Reference counted:
all interfaces must implement reference counting memory management.
ID number:
all interface implementations must have a unique identifying number.
Fast casting:
all interfaces must implement the fast queryInterface() dynamic cast mechanism.
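
The interface requirements above are C++ concepts, but the core idea can be sketched in Python with abstract base classes. This is only an analogy and not Gaudi code: the names IGreeter, ICounter, INTERFACE_ID and query_interface are hypothetical, and reference counting is left out because Python manages memory itself.

```python
from abc import ABC, abstractmethod


class IGreeter(ABC):
    """Hypothetical interface: every method is abstract ("pure virtual")."""
    INTERFACE_ID = 1001  # made-up unique identifying number

    @abstractmethod
    def greet(self, who):
        ...


class ICounter(ABC):
    """Second hypothetical interface, to show multiple inheritance."""
    INTERFACE_ID = 1002  # made-up unique identifying number

    @abstractmethod
    def count(self):
        ...


class GreeterCounter(IGreeter, ICounter):
    """A component that implements both interfaces."""

    def __init__(self):
        self._calls = 0

    def greet(self, who):
        self._calls += 1
        return "hello " + who

    def count(self):
        return self._calls


def query_interface(component, interface_id):
    """Return the component if it implements the interface with this ID, else None."""
    for cls in type(component).__mro__:
        if cls.__dict__.get("INTERFACE_ID") == interface_id:
            return component
    return None
```

Client code written against IGreeter only needs the interface definition. Trying to instantiate IGreeter directly raises a TypeError, which is the Python counterpart of a class whose methods are all declared =0.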

Part of a component’s implementation involves registering a “factory” class with Gaudi that knows how to produce instances of the component given the name of the class. This registration happens when the component library is linked, and this linking can be done dynamically given the class name and the magic of generated rootmap files.

As a result, C++ (or Python) code can request a component (or Python shadow class) given its class name. At the same time as the request, the resulting instance is registered with Gaudi using a nick-name [1]. This nick-name lets you configure multiple instances of one component class in different ways. For example one might want to have a job with two competing instances of the same algorithm class run on the same data but configured with two different sets of properties.
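
The factory and nick-name mechanism can be illustrated with a toy registry. This is a minimal sketch of the idea, not the Gaudi implementation: ComponentRegistry, register_factory, get and CutAlg are all hypothetical names.

```python
class ComponentRegistry:
    """Toy stand-in for the framework's factory catalogue (not Gaudi API)."""

    def __init__(self):
        self._factories = {}   # class name -> callable producing an instance
        self._instances = {}   # nick-name  -> created instance

    def register_factory(self, class_name, factory):
        self._factories[class_name] = factory

    def get(self, class_name, nick_name=None):
        """Create (or return) the instance known under nick_name."""
        nick_name = nick_name or class_name  # nick-names default to the class name
        if nick_name not in self._instances:
            self._instances[nick_name] = self._factories[class_name]()
        return self._instances[nick_name]


class CutAlg:
    """Hypothetical algorithm with a single property."""

    def __init__(self):
        self.cut = 0


registry = ComponentRegistry()
registry.register_factory("CutAlg", CutAlg)

# Two competing instances of the same class, configured differently.
loose = registry.get("CutAlg", "looseCuts")
tight = registry.get("CutAlg", "tightCuts")
loose.cut, tight.cut = 1, 10
```

Requesting "looseCuts" again returns the same object, while a request without a nick-name yields a third instance registered under the class name.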

Common types of Components

The main three types of Gaudi components are Algorithms, Tools and Services.

Algorithms

  • Inherit from GaudiAlgorithm or, if you will produce data, from DybAlgorithm.
  • execute(), initialize(), finalize() and associated requirements (e.g. calling GaudiAlgorithm::initialize()).
  • TES access with get() and put(), or getTES() and putTES() if implementing DybAlgorithm. There is also getAES() to access the archive event store.
  • Logging with info(), etc.
  • required boilerplate (_entries & _load files, cpp macros)
  • some special ones: sequencer (others?)

Algorithms contain code that should be run once per execution cycle. They may take input from the TES and may produce output. They are meant to encapsulate complexity in a way that allows them to be combined in a high-level manner. They can be combined in a serial chain to run one-by-one or they can run other algorithms as sub-algorithms. It is also possible to set up high-level branch decisions that govern whether or not sub-chains run.
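
The way algorithms combine into chains and branches can be sketched in plain Python. This is an illustrative model only: Alg and Sequence are hypothetical classes, and it assumes a sequence stops as soon as one member rejects the event.

```python
class Alg:
    """Hypothetical minimal algorithm: execute() returns True if the event passes."""

    def __init__(self, name, passes=lambda event: True):
        self.name = name
        self._passes = passes

    def execute(self, event, trace):
        trace.append(self.name)
        return self._passes(event)


class Sequence(Alg):
    """Runs its members in order and stops at the first one that rejects the event."""

    def __init__(self, name, members):
        super().__init__(name)
        self.members = members

    def execute(self, event, trace):
        trace.append(self.name)
        for member in self.members:
            if not member.execute(event, trace):
                return False
        return True


# A serial chain with a sub-chain that only continues for high-energy events.
chain = Sequence("top", [
    Alg("reader"),
    Sequence("highEnergyBranch", [
        Alg("energyFilter", passes=lambda event: event["energy"] > 5.0),
        Alg("expensiveReco"),
    ]),
])
```

Calling chain.execute({"energy": 1.0}, trace) with an empty list as trace records every algorithm up to energyFilter and never reaches expensiveReco.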

Tools

Tools contain utility code or parts of algorithm code that can be shared. Tool instances can be public, in which case any other code may use them, or they may be private. Multiple instances of a private tool may be created. A tool may be created at any time during a job and will be deleted once no other code references it.

Services

A service is very much like a public tool of which only a single instance is created. Services are meant to be created at the beginning of the job and live for its entire life. They typically manage major parts of the framework or some external service (such as a database).

Writing your own component

Algorithms

One of the primary goals of Gaudi is to provide the concept of an Algorithm which is the main entry point for user code. All other parts of the framework exist to allow users to focus on writing algorithms.

An algorithm provides three places for users to add their own code:

initialize()
This method is called once, at the beginning of the job. It is optional but can be used to apply any properties that the algorithm supports or to look up and cache pointers to services, tools or other components or any other initializations that require the Gaudi framework.
execute()
This method is called once every execution cycle (“event”). This is where user code implements whatever algorithm the user creates.
finalize()
This method is called once, at the end of the job. It is optional but can be used to release() any cached pointers to services or tools, or do any other cleaning up that requires the Gaudi framework.
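
The calling order of the three methods can be shown with a toy event loop. This is a sketch under stated assumptions, not the framework's real loop: CountingAlg and run_job are hypothetical names, and the real methods are C++ members.

```python
class CountingAlg:
    """Hypothetical algorithm; method names mirror the three hooks above."""

    def __init__(self):
        self.calls = []
        self.n_events = 0

    def initialize(self):
        self.calls.append("initialize")   # e.g. cache services, apply properties
        return True

    def execute(self):
        self.calls.append("execute")      # per-event user code goes here
        self.n_events += 1
        return True

    def finalize(self):
        self.calls.append("finalize")     # e.g. release cached pointers
        return True


def run_job(algorithms, executions):
    """Toy event loop: initialize once, execute every cycle, finalize once."""
    for alg in algorithms:
        alg.initialize()
    for _ in range(executions):
        for alg in algorithms:
            alg.execute()
    for alg in algorithms:
        alg.finalize()
```

Running run_job([alg], 3) on a fresh CountingAlg leaves alg.calls holding one "initialize", three "execute" and one "finalize" entry, in that order.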

When writing an algorithm class the user has three possible classes to use as a basis:

Algorithm
is a low level class that does not provide many useful features and is probably best to ignore.
GaudiAlgorithm
inherits from Algorithm and provides many useful general features such as access to the message service via info() and related methods, as well as methods providing easy access to the TES and TDS (e.g. get() and getDet()). This is a good choice for many types of algorithms.
DybAlgorithm
inherits from GaudiAlgorithm and adds Daya Bay specific features related to producing objects from the DataModel. It should only be considered for algorithms that need to add new data to the TES. An algorithm may be based on GaudiAlgorithm and still add data to the TES but some object bookkeeping will need to be done manually.

Subclasses of DybAlgorithm should provide initialize, execute and finalize methods just as they would with the other two algorithm base classes. DybAlgorithm is templated on the DataModel data type that it will produce, and this type is specified when a subclass inherits from it. Instances of the object should be created using the MakeHeaderObject() method. Any input objects that are needed should be retrieved from the data store using getTES() or getAES(). Finally, the resulting data object is automatically put into the TES at the location specified by the “Location” property, which defaults to that specified by the DataModel class being used. This ensures that bookkeeping such as the list of input headers, the random state and other things is properly set.

Tools

  • examples
  • Implementing existing tool interface,
  • writing new interface.
  • required boilerplate (_entries & _load files, cpp macros)

Services

  • common ones provided, how to access in C++
  • Implementing existing service interface,
  • writing new interface.
  • Include difference between tools and service.
  • required boilerplate (_entries & _load files, cpp macros)

Generalized Components

Properties and Configuration

Just about every component that Gaudi provides, and every one that Daya Bay programmers will write, has one or more properties. A property has a name and a value and is associated with a component. Users can set properties that will then get applied by the framework to the component.

Gaudi has two main ways of setting such configuration. Initially a text based C++-like language was used. Daya Bay does not use this but instead uses the more modern Python based configuration. With this, it is possible to write a main Python program to configure everything and start the Gaudi main loop to run some number of executions of the top-level algorithm chain.

The configuration mechanism described below was introduced after release 0.5.0.

Overview of configuration mechanism

[Figure fig:config-layers: Cartoon of the layers of configuration code.]

The configuration mechanism is a set of layers of Python code. As one goes up the layers one goes from basic Gaudi configuration up to user interaction. The layers are pictured in Fig. fig:config-layers. The four layers are described from lowest to highest in the next sections.

Configurables

All higher layers may make use of Configurables. They are Python classes that are automatically generated for all components (Algorithms, Tools, Services, etc). They hold all the properties that the component defines and include their default values and any documentation strings. They are named the same as the component that they represent and are available in Python using this pattern:

from PackageName.PackageNameConf import MyComponent
mc = MyComponent()
mc.SomeProperty = 42

You can find out what properties any component has using the properties.py script which should be installed in your PATH.

shell> properties.py
GtGenerator :
     GenName:  Name of this generator for book keeping purposes.
     GenTools:  Tools to generate HepMC::GenEvents
     GlobalTimeOffset: None
     Location:  TES path location for the HeaderObject this algorithm produces.
        ...

A special configurable is the ApplicationMgr. Most users will need to use this to include their algorithms into the “TopAlg” list. Here is an example:

from Gaudi.Configuration import ApplicationMgr
theApp = ApplicationMgr()

from MyPackage.MyPackageConf import MyAlgorithm
ma = MyAlgorithm()
ma.SomeProperty = "harder, faster, stronger"
theApp.TopAlg.append(ma)

Configurables and Their Names

It is important to understand how configurables eventually pass properties to instantiated C++ objects. Behind the scenes, Gaudi maintains a catalog that maps a key name to a set of properties. Normally, no special attention need be given to the name. If none is given, the configurable will take a name based on its class:

# gets name 'MyAlgorithm'
generic = MyAlgorithm()
# gets name 'alg1'
specific = MyAlgorithm('alg1')

theApp.TopAlg.append(generic)
theApp.TopAlg.append(specific)
# TopAlg now holds ['MyAlgorithm/MyAlgorithm', 'MyAlgorithm/alg1']
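
The naming rule and the name-keyed property catalog can be modelled in a few lines of plain Python. This is a toy model for illustration, not the real Gaudi Configurable: ToyConfigurable, catalog and full_name are hypothetical names.

```python
class ToyConfigurable:
    """Toy model of a configurable; not the real Gaudi class."""

    catalog = {}  # instance name -> dict of property values

    def __init__(self, name=None):
        # With no explicit name, fall back to the class name.
        self.__dict__["_name"] = name or type(self).__name__
        ToyConfigurable.catalog.setdefault(self._name, {})

    def __setattr__(self, prop, value):
        # Property assignments land in the catalog under this instance's name.
        ToyConfigurable.catalog[self._name][prop] = value

    def full_name(self):
        # "Type/name" form, as in the TopAlg listing above.
        return "%s/%s" % (type(self).__name__, self._name)


class MyAlgorithm(ToyConfigurable):
    pass


generic = MyAlgorithm()
specific = MyAlgorithm("alg1")
generic.SomeProperty = 1
specific.SomeProperty = 2
top_alg = [generic.full_name(), specific.full_name()]
```

Because the catalog is keyed by name, two configurables of the same class keep separate property sets as long as their names differ.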

Naming Gaudi Tool Configurables

In the case of Gaudi Tools, things become more complex. Tools themselves can (and should) be configured through configurables. But there are a few things to be aware of, or else one can easily be tricked:

  • Tool configurables can be public or private. A public tool configurable is “owned” by ToolSvc and shared by all parents, a private one is “owned” by a single parent and not shared.
  • By default, a tool configurable is public.
  • “Ownership” is indicated by prepending the parent’s name, plus a dot (“.”), to a simple name.
  • Ownership is set, either when creating the tool configurable by prepending the parent’s name, or during assignment of it to the parent configurable.
  • During assignment to the parent a copy will be made if the tool configurable name is not consistent with the parent name plus a dot prepended to a simple name.

What this means is that you may end up with different final configurations depending on:

  • the initial name you give the tool configurable
  • when you assign it to the parent
  • if the parent uses the tool as a private or a public one
  • when you assign the tool’s properties

To best understand how things work some examples are given. An example of how public tools work:

mt = MyTool("foo")
mt.getName()            # -> "ToolSvc.foo"

mt.Cut = 1
alg1.pubtool = mt
mt.Cut = 2
alg2.pubtool = mt
mt.Cut = 3
# alg1 and alg2 will have same tool, both with cut == 3

Here a single “MyTool” configurable is created with a simple name. In the constructor a “ToolSvc.” is prepended (since there was no “.” in the name). Since the tool is public the final value (3) will be used by both alg1 and alg2.

An example of how private tools work:

mt = MyTool("foo")
mt.getName()            # -> "ToolSvc.foo"

mt.Cut = 1
alg1.privtool = mt
# alg1 gets "alg1.foo" configured with Cut==1
mt.Cut = 2
alg2.privtool = mt
# (for now) alg2 gets "alg2.foo" configured with Cut==2

# after assignment, can get renamed copy
from Gaudi.Configuration import Configurable
mt2 = Configurable.allConfigurables["alg2.foo"]
mt2.Cut = 3
# (now, really) alg2 gets "alg2.foo" configured with Cut==3

Again, the same tool configurable is created and implicitly renamed. An initial cut of 1 is set and the tool configurable is given to alg1. Gaudi makes a copy, and the “ToolSvc.foo” name of the original is changed to “alg1.foo” in the copy. The original then has its cut changed to 2 and is given to alg2. Alg1’s tool’s cut is still 1. Finally, the copied MyTool configurable is looked up using the name “alg2.foo”. This can be used if you need to configure the tool after it has been assigned to alg2.
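
The naming and copying rules behind both examples can be condensed into a small pure-Python model. It only mimics the behaviour described above and is not how Gaudi implements it: ToyTool, ToyAlg, set_public and set_private are hypothetical names.

```python
import copy


class ToyTool:
    """Toy tool configurable following the naming rules listed above."""

    def __init__(self, name):
        # A simple name (no dot) is treated as public and owned by ToolSvc.
        self.name = name if "." in name else "ToolSvc." + name
        self.Cut = None

    def simple_name(self):
        return self.name.split(".", 1)[1]


class ToyAlg:
    """Toy parent with one public and one private tool slot."""

    def __init__(self, name):
        self.name = name
        self.pubtool = None
        self.privtool = None

    def set_public(self, tool):
        # Public tools are shared: keep a reference to the same object.
        self.pubtool = tool

    def set_private(self, tool):
        # Private tools are copied unless already named "<parent>.<simple>".
        wanted = self.name + "." + tool.simple_name()
        if tool.name != wanted:
            tool = copy.copy(tool)
            tool.name = wanted
        self.privtool = tool


alg1, alg2 = ToyAlg("alg1"), ToyAlg("alg2")
mt = ToyTool("foo")
mt.Cut = 1
alg1.set_private(mt)   # copy named "alg1.foo", frozen at Cut == 1
mt.Cut = 2
alg2.set_private(mt)   # copy named "alg2.foo", frozen at Cut == 2
alg1.set_public(mt)
alg2.set_public(mt)
mt.Cut = 3             # seen by both public users
```

The private copies keep the value the tool had at assignment time, while the shared public tool always reflects the latest assignment.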

The Package Configure Class and Optional Helper Classes

Every package that needs any but the most trivial configuration should provide a Configure class. By convention this class should be available from the module named after the package. When it is instantiated it should:

  • Upon construction (in __init__()), provide a sensible, if maybe incomplete, default configuration for the general features the package provides.
  • Store any and all configurables it creates in the instance (Python’s self variable) for the user to later access.

In addition, the package author is encouraged to provide one or more “helper” classes that can be used to simplify non-default configuration. Helper objects can either operate on the Configure object or can be passed in to Configure or both.

To see an example of how helpers are written, look at:

$SITEROOT/dybgaudi/InstallArea/python/GenTools/Helpers.py

Package authors should write these classes and all higher layers may make use of these classes.
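
The Configure and helper conventions can be sketched as follows. This is a hypothetical package layout for illustration, not the GenTools code: ToyHelper and its settings are invented, plain dictionaries stand in for real configurables, and only the tools() method name follows the convention documented for GenTools helpers.

```python
class ToyHelper:
    """Hypothetical helper that prepares tool settings with defaults."""

    def __init__(self, particle="e+", momentum_mev=1.0):
        self.gun = {"Particle": particle, "Momentum": momentum_mev}

    def tools(self):
        # Ordered list of the tools this helper created.
        return [self.gun]


class Configure:
    """Hypothetical package Configure class following the two rules above."""

    def __init__(self, name="gen", helper=None):
        # Provide a sensible default when the user passes no helper.
        self.helper = helper or ToyHelper()
        # Keep everything created on self so users can adjust it later.
        self.generator = {"Name": name, "GenTools": self.helper.tools()}


cfg = Configure()
cfg.helper.gun["Momentum"] = 2.5   # later, direct customization
```

Storing the helper and generator on the instance is what lets a job script reach in and adjust individual settings after construction.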

User Job Option Scripts

The next layer consists of job option scripts. These are short Python scripts that use the lower layers to provide non-default configuration that makes the user’s job unique. However, these are not “main program” files and do not execute on their own (see next section).

Users can configure an entire job in one file or spread parts of the configuration among multiple files. The former is useful for bookkeeping, and the latter is useful if the user wants to run multiple jobs that differ in only a small part of their configuration. In this second case, they can separate invariant configuration from that which changes from run to run.

An example of a job script using the GenTools helpers described above is:

from GenTools.Helpers import Gun
gunner = Gun()

import GaudiKernel.SystemOfUnits as units
gunner.timerator.LifeTime = int(60*units.second)
# ...
import GenTools
gt = GenTools.Configure("gun","Particle Gun",helper=gunner)
gt.helper.positioner.Position = [0,0,0]

In the first two lines a “Gun” helper class is imported and constructed with defaults. This helper will set up the tools needed to implement a particle gun based generator. It chooses a bunch of defaults such as particle type, momentum, etc, which you probably don’t want so you can change them later. For example the mean life time is set in line 5. Finally, the package is configured and this helper is passed in. The configuration creates a GtGenerator algorithm that will drive the GenTools implementing the gun based kinematics generation. After the Configure object is made, it can be used to make more configuration changes.

This specific example was for GenTools. Other packages will do different things that make sense for them. To learn what each package does you can read the Configure and/or helper code, or you can read its inlined documentation via the pydoc program. Some related examples of this latter method:

shell> pydoc GenTools.Helpers
Help on module GenTools.Helpers in GenTools:

NAME
    GenTools.Helpers

FILE
    /path/to/NuWa-trunk/dybgaudi/InstallArea/python/GenTools/Helpers.py

DESCRIPTION
    Several helper classes to assist in configuring GenTools.  They
    assume geometry has already been setup.  The helper classes that
    produce tools need to define a "tools()" method that returns an
    ordered list of what tools it created.  Users of these helper classes
    should use them like:

CLASSES
    Gun
    HepEVT
...

shell> pydoc GenTools.Helpers.Gun
Help on class Gun in GenTools.Helpers:

GenTools.Helpers.Gun = class Gun
 |  Configure a particle gun based kinematics
 |
 |  Methods defined here:
 |
 |  __init__(self, ...)
 |      Construct the configuration.  Coustom configured tools can
 |      be passed in or customization can be done after construction
 |      using the data members:
 |
 |      .gun
 |      .positioner
 |      .timerator
 |      .transformer
 |
 |      The GtGenerator alg is available from the .generatorAlg member.
 |
 |      They can be accessed for additional, direct configuration.
...

User Job Option Modules

A second, complementary high-level configuration method is to collect lower level code into a user job module. These are normal Python modules and as such are defined in a file that exists in the user’s current working directory, in the package’s python/ subdirectory, or otherwise in a location in the user’s PYTHONPATH.

Any top level code will be evaluated as the module is imported in the context of configuration (same as job option scripts). But, these modules can supply some methods, named by convention, that can allow additional functionality.

configure(argv=[])
This method can hold all the same types of configuration code that the job option scripts do. This method will be called just after the module is imported. Any command line options given to the module will be available in the argv list.
run(appMgr)
This method can hold code that is to be executed after the configuration stage has finished and all configuration has been applied to the actual underlying C++ objects. In particular, you can define pure-Python algorithms and add them to the TopAlg list.
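
A skeleton of such a module might look like the following. It is a minimal sketch under stated assumptions: the file name MyJob.py, the --cut option and the settings dictionary are hypothetical, and run() only records the object it is handed instead of touching a real application manager.

```python
# Contents of a hypothetical module file, e.g. MyJob.py
from optparse import OptionParser

settings = {}  # hypothetical place to keep what this module was told


def configure(argv=[]):
    """Called at configuration time with the module's command line arguments."""
    parser = OptionParser(usage="MyJob [options]")
    parser.add_option("-c", "--cut", type="float", default=1.0,
                      help="hypothetical threshold used by this job")
    options, args = parser.parse_args(args=list(argv))
    settings["cut"] = options.cut
    settings["extra"] = args
    # Real modules would create and adjust configurables here.


def run(appMgr):
    """Called at run time, after configuration has been applied."""
    # appMgr is whatever application manager object the caller passes in;
    # a real module could add pure-Python algorithms to it here.
    settings["appMgr"] = appMgr
```

Keeping option parsing inside configure() means each module owns its command line arguments and can provide its own --help text.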

There are many example Job Option Modules in the code. Here are some specific ones.

GenTools.Test
this module [2] gives an example of a configure(argv=[]) function that parses command line options. Following it will allow users to access the command line usage by simply running: nuwa.py -m 'GenTools.Test --help'.
DivingIn.Example
this module [3] gives an example of a Job Option Module that takes no command line arguments and configures a Python Algorithm class into the job.

The nuwa.py main script

Finally, there is the layer on top of it all. This is a main Python script called nuwa.py which collects all the layers below. This script provides the following features:

  • A single, main script everyone uses.
  • Configures framework level things
  • Python, interactive vs. batch
  • Logging level and color
  • File I/O, specify input or output files on the command line
  • Geometry
  • Use or not of the archive event store
  • Access to visualization
  • Running of user job option scripts and/or loading of modules

After setting up your environment in the usual way the nuwa.py script should be in your execution PATH. You can get a short help screen by just typing [4]:

shell> nuwa.py --help
Usage:
    This is the main program to run NuWa offline jobs.

    It provides a job with a minimal, standard setup.  Non standard
    behavior can made using command line options or providing additional
    configuration in the form of python files or modules to load.

    Usage:

      nuwa.py [options] [-m|--module "mod.ule --mod-arg ..."] \
              [config1.py config2.py ...] \
              [mod.ule1 mod.ule2 ...] \
              [input1.root input2.root ...]

    Python modules can be specified with -m|--module options and may
    include any per-module arguments by enclosing them in shell quotes
    as in the above usage.  Modules that do not take arguments may
    also be listed as non-option arguments.  Modules may supply the
    following functions:

    configure(argv=[]) - if exists, executed at configuration time

    run(theApp) - if exists, executed at run time with theApp set to
    the AppMgr.

    Additionally, python job scripts may be specified.

    Modules and scripts are loaded in the order they are specified on
    the command line.

    Finally, input ROOT files may be specified.  These will be read in
    the order they are specified and will be assigned to supplying
    streams not specificially specified in any input-stream map.

    The listing of modules, job scripts and/or ROOT files may be
    interspersed but must follow all options.



Options:
  -h, --help            show this help message and exit
  -A, --no-aes          Do not use the Archive Event Store.
  -l LOG_LEVEL, --log-level=LOG_LEVEL
                        Set output log level.
  -C COLOR, --color=COLOR
                        Use colored logs assuming given background ('light' or
                        'dark')
  -i, --interactive     Enter interactive ipython shell after the run
                        completes (def is batch).
  -s, --show-includes   Show printout of included files.
  -m MODULE, --module=MODULE
                        Load given module and pass optional argument list
  -n EXECUTIONS, --executions=EXECUTIONS
                        Number of times to execute list of top level
                        algorithms.
  -o OUTPUT, --output=OUTPUT
                        Output filename
  -O OUTPUT_STREAMS, --output-streams=OUTPUT_STREAMS
                        Output file map
  -I INPUT_STREAMS, --input-streams=INPUT_STREAMS
                        Input file map
  -H HOSTID, --hostid=HOSTID
                        Force given hostid
  -R RUN, --run=RUN     Set run number
  -N EXECUTION, --execution=EXECUTION
                        Set the starting execution number
  -V, --visualize       Run in visualize mode
  -G DETECTOR, --detector=DETECTOR
                        Specify a non-default, top-level geometry file

Each job option .py file that you pass on the command line will be evaluated in turn and the list of .root files will be appended to the “default” input stream. Any non-option argument that does not end in .py or .root is assumed to be a Python module which will be loaded as described in the previous section.
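
The sorting rule for non-option arguments can be expressed in a few lines. This is a sketch of the rule as stated above, not the actual nuwa.py code, and classify_arguments is a hypothetical name.

```python
def classify_arguments(args):
    """Split non-option arguments into job scripts, ROOT files and modules."""
    scripts, root_files, modules = [], [], []
    for arg in args:
        if arg.endswith(".py"):
            scripts.append(arg)        # job option script, evaluated in turn
        elif arg.endswith(".root"):
            root_files.append(arg)     # appended to the "default" input stream
        else:
            modules.append(arg)        # anything else is taken as a module name
    return scripts, root_files, modules
```

For example, classify_arguments(["config1.py", "mod.ule1", "input1.root"]) places each of the three arguments in a different list.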

If you would like to pass command line arguments to your module, instead of simply listing it on the command line you must use -m or --module. The module name and arguments must be surrounded by shell quotes. For example:

shell> nuwa.py -n1 -m "DybPython.TestMod1 -a foo bar" \
                   -m DybPython.TestMod2 \
                   DybPython.TestMod3

In this example, only DybPython.TestMod1 takes arguments. TestMod2 does not but can still be specified with “-m”. As the help output states, modules and job script files are all loaded in the order in which they are listed on the command line. All non-option arguments must follow options.
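
One plausible way to separate a quoted module specification into a module name and its argument list is shown below. How nuwa.py actually does this is not shown here, so the use of shlex and the function name split_module_spec are assumptions.

```python
import shlex


def split_module_spec(spec):
    """Split '<module> [args...]' into the module name and its argv list."""
    words = shlex.split(spec)
    return words[0], words[1:]


name, argv = split_module_spec("DybPython.TestMod1 -a foo bar")
```

The argv list obtained this way has the shape that a module's configure(argv=[]) function expects. A specification with no arguments, such as "DybPython.TestMod2", simply yields an empty list.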

Example: Configuring DetSimValidation

During the move from the legacy G4dyb simulation to the Gaudi based one, an extensive validation process was done. The code to do this is in the package DetSimValidation in the Validation area. It provides a full-featured configuration example. Like GenTools, the configuration is split up into modules providing helper classes. In this case, there is a module for each detector and a class for each type of validation run. For example, a test of uniformly distributed positrons can be configured like:

from DetSimValidation.AD import UniformPositron
up = UniformPositron()

Footnotes

[1] Nick-names default to the class name.
[2] Code is at dybgaudi/Simulation/GenTools/python/GenTools/Test.py.
[3] Code is at tutorial/DivingIn/python/DivingIn/Example.py.
[4] Actual output may differ slightly.