Data Model

  • Overall structure of data
  • One package per processing stage
  • Single “header object” as direct TES DataObject
  • Provenance
  • Tour of DataModel packages

Overview

The “data model” is the suite of classes used to describe almost all of the information used in our analysis of the experimental results. This includes simulated truth, real and simulated DAQ data, calibrated data, reconstructed events or other quantities. Just about anything that an algorithm might produce is a candidate for using existing or requiring new classes in the data model. It does not include some information that will be stored in a database (reactor power, calibration constants) nor any analysis ntuples. In this last case, it is important to strive to keep results in the form of data model classes as this will allow interoperability between different algorithms and a common language that we can use to discuss our analysis.

The classes making up the data model are found in the DataModel area of a release. There is one package for each related collection of classes that a particular analysis stage produces.

HeaderObject

There is one special class in each package which inherits from HeaderObject. All other objects that a processing stage produces will be held, directly or indirectly, by the HeaderObject for the stage. HeaderObjects also hold some book-keeping items such as:

TimeStamp
giving a single reference time for this object and any subobjects it may hold. See below for details on what kind of times the data model makes use of.
Execution Number
counts the number of times the algorithm’s execution method has been called, starting at 1. This can be thought of as an “event” number in more traditional experiments.
Random State
holds the state of the random number generator engine just before the algorithm that produced the HeaderObject was run. It can be used to re-run the algorithm in order to reproduce an arbitrary output.
Input HeaderObjects
that were used to produce this one are referenced in order to determine provenance.
Time Extent
records the time this data spans. It is actually stored in the TemporalDataObject base class.
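
The book-keeping items above can be pictured with a small sketch. This is not the actual HeaderObject class: `TimeStampSketch`, `HeaderObjectSketch`, the field names and the `provenance` helper are all hypothetical stand-ins, chosen only to show how references to input headers let one walk back through the processing chain.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class TimeStampSketch:
    """Hypothetical stand-in for TimeStamp: whole seconds plus nanoseconds."""
    sec: int
    nanosec: int = 0


@dataclass
class HeaderObjectSketch:
    """Hypothetical stand-in for a HeaderObject and its book-keeping items."""
    name: str
    timestamp: TimeStampSketch                 # single reference time
    exec_number: int                           # execution count, starting at 1
    random_state: Tuple[int, ...] = ()         # engine state before the algorithm ran
    inputs: List["HeaderObjectSketch"] = field(default_factory=list)
    earliest: Optional[TimeStampSketch] = None # time extent, lower edge
    latest: Optional[TimeStampSketch] = None   # time extent, upper edge


def provenance(header):
    """Return the names of all headers this one was derived from, nearest first."""
    names = []
    for parent in header.inputs:
        names.append(parent.name)
        names.extend(provenance(parent))
    return names


t0 = TimeStampSketch(1300000000, 250)
gen = HeaderObjectSketch("GenHeader", t0, exec_number=1)
sim = HeaderObjectSketch("SimHeader", t0, exec_number=1, inputs=[gen])
elec = HeaderObjectSketch("ElecHeader", t0, exec_number=1, inputs=[sim])
print(provenance(elec))
```

Here the ElecHeader stand-in reports SimHeader and then GenHeader as its ancestors, which is the kind of question the input references are meant to answer.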

Times

There are various times recorded in the data. Some are absolute but imprecise (integral number of ns) and others are relative but precise (sub ns).

Absolute Time

Absolute time is stored in TimeStamp objects from the Conventions package under DataModel. They store time as seconds from the Unix Epoch (Jan 1, 1970, UTC) and nanoseconds within a second. A 32-bit integer is currently used to store each of these two fields [1]. While providing absolute time, they are not suitable for recording times to a precision finer than 1 ns. TimeStamp objects can be implicitly converted to a double but will suffer a loss of precision of 100s of μs when holding modern times.
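
As a rough illustration of the seconds-plus-nanoseconds layout, the sketch below models a time as a plain `(sec, nanosec)` tuple. It assumes the seconds field is a signed 32-bit integer, which is what the 2038 remark in footnote [1] suggests; `make_timestamp` and `to_utc` are hypothetical helpers, not part of the Conventions package.

```python
from datetime import datetime, timezone

INT32_MAX = 2**31 - 1          # largest value of an assumed signed 32-bit field
NS_PER_S = 1_000_000_000


def make_timestamp(sec, nanosec):
    """Return a (sec, nanosec) pair after checking both fit the assumed fields."""
    if not 0 <= nanosec < NS_PER_S:
        raise ValueError("nanosec must lie within one second")
    if not 0 <= sec <= INT32_MAX:
        raise OverflowError("sec does not fit in a signed 32-bit integer")
    return (sec, nanosec)


def to_utc(ts):
    """Convert the seconds field to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ts[0], tz=timezone.utc)


# The last instant representable under the signed 32-bit assumption.
last = make_timestamp(INT32_MAX, NS_PER_S - 1)
print(to_utc(last).isoformat())

# Tuples compare field by field, so ordering follows chronological order.
print(make_timestamp(1, 5) < make_timestamp(2, 0))
```

Keeping the two fields as integers is what preserves the full 1 ns resolution; nothing is rounded until the pair is converted to a floating-point value.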

Relative Time

Relative times simply count seconds from some absolute time and are stored as a double.

Reference times

Each HeaderObject holds an absolute reference time as a TimeStamp. How each is defined depends on the algorithms that produced the HeaderObject.

Sub-object precision times

Some HeaderObjects, such as SimHeader, hold sub-objects that need precision times (e.g. SimHits). These are stored as doubles and are measured from the reference time of the HeaderObject holding the sub-objects.
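
The reason for measuring precision times from a nearby reference can be checked numerically. The sketch below is only an illustration: the reference time and hit offsets are made-up values, and `double_spacing` is a hypothetical helper that returns the gap between a double and the next larger one.

```python
import math


def double_spacing(x):
    """Gap between x and the next larger double (for positive, normal x)."""
    _, exponent = math.frexp(x)   # x = m * 2**exponent with 0.5 <= m < 1
    return 2.0 ** (exponent - 53)


reference_sec = 1_300_000_000               # made-up reference time, seconds since epoch
hit_offsets = [12.5e-9, 48.25e-9, 3.2e-6]   # made-up SimHit times, seconds from reference

absolute_spacing = double_spacing(float(reference_sec))
offset_spacing = max(double_spacing(t) for t in hit_offsets)

# A double holding a present-day absolute time cannot resolve 1 ns ...
print(absolute_spacing > 1e-9)
# ... while small offsets from the reference are resolved far below 1 ns.
print(offset_spacing < 1e-9)
```

Both checks print True: the absolute part stays in the integer TimeStamp fields, and only the small relative offset is carried as a double.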

Time Extents

Each TemporalDataObject (and thus each HeaderObject) has a time extent represented by an earliest TimeStamp followed by a latest one. These are used by the window-based analysis implemented by the Archive Event Store (AES) to determine when objects fall outside the window and can be purged. How each earliest/latest pair is defined depends on the algorithm that produced the object, but the pair is typically chosen to just contain the times of all sub-objects held by the HeaderObject.
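
A minimal sketch of how a time extent might be built and then used for purging follows. It works in integer nanoseconds for simplicity; `time_extent` and `purge` are hypothetical helpers, and the purge rule shown (drop an object once its latest time is older than the window) is an assumption, not a description of the actual Archive Event Store logic.

```python
NS_PER_S = 1_000_000_000


def time_extent(reference_ns, offsets_ns):
    """Return (earliest, latest) in ns that just contains all sub-object times.

    With no sub-objects the extent collapses onto the reference time.
    """
    times = [reference_ns + dt for dt in offsets_ns] or [reference_ns]
    return (min(times), max(times))


def purge(extents, now_ns, window_ns):
    """Keep only objects whose latest time is still inside the look-back window."""
    cutoff = now_ns - window_ns
    return {name: ext for name, ext in extents.items() if ext[1] >= cutoff}


t0 = 1_300_000_000 * NS_PER_S   # made-up reference time
extents = {
    "old": time_extent(t0, [0, 40, 95]),
    "recent": time_extent(t0 + 5 * NS_PER_S, [10, 20]),
}

# With a 2 s window ending 6 s after t0, only the recent object survives.
kept = purge(extents, now_ns=t0 + 6 * NS_PER_S, window_ns=2 * NS_PER_S)
print(sorted(kept))
```

Only the latest edge of each extent is needed for this decision, which is why a tight earliest/latest pair is enough and the individual sub-object times need not be inspected.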

How Some Times are Defined

This section lists how some commonly used times are defined. The list is organized by the top-level DataObject where you may find the times.

GenHeader

Generator level information.

Reference Time
Defined by the generator output. It is the first or primary signal event interaction time.
Time Extent
Defined to encompass all primary vertices. Will typically be infinitesimally small.
Precision Times
Currently, there are no precision times in the conventional sense. Each primary vertex in an event may have a unique time which is absolute and stored as a double.

SimHeader

Detector Simulation output.

Reference Time
This is identical to the reference time for the GenHeader that was used as input to the simulation.
Time Extent
Defined to contain the times of all SimHits from all detectors.
Precision Times
Each RPC/PMT SimHit has a time measured from the reference time.

FIXME Need to check on times used in the Historian.

ElecHeader TrigHeader Readout ...

Examples of using the Data Model objects

Please write more about me!

Tutorial examples

Good examples are provided by the tutorial project which is located under NuWa-RELEASE/tutorial/. Each package should provide a simple, self-contained example, but note that these sometimes get out of step with the rest of the code or may show less than ideal (older) ways of doing things.

Some good examples to look at are available in the DivingIn tutorial package. They show how to do almost everything one will want to do when writing an analysis, including accessing the data, making histograms, and reading and writing files. Look at the Python modules under python/DivingIn/. Most provide instructions on how to run them in comments at the top of the file. There is a companion presentation available as DocDB #3131 [2].

Footnotes

[1] Before 2038 someone had better increase the size of the field that stores the seconds!
[2] http://dayabay.ihep.ac.cn/cgi-bin/DocDB/ShowDocument?docid=3131