.. _tutorial-generics:

Event structure and generic handlers
====================================

This chapter introduces the basics of event data handling needed to work with
*generic handlers*. Generic handlers are powerful and versatile, but they
require the user to know the structure (data model) of the event. As in the
previous chapter, we will operate at the very basic level of DAQ digits to
avoid complex pipelines and hidden parameters.

Event and hits collections
--------------------------

An *event* in NA64sw is defined internally as a data structure. It is not
*flat*, meaning that it cannot be represented entirely as a single table or
ntuple, because it has a rather complex topology.

The event object consists of collections of smaller, simpler objects. Those
simpler objects *may*, in turn, be represented as tables. It is much like
a bunch of interconnected C/C++ structures, a set of ``TTree`` instances,
a filesystem subtree (files and directories), or a `normal form`_ in
a relational database: a common pattern for structuring hierarchical data in
programming.

.. _`normal form`: https://en.wikipedia.org/wiki/Database_normalization#Normal_forms

Examples of collections within an event:

* (M)SADC-based detector hits (ECAL, HCAL, SRD, beam counters, etc.)
* APV-based detector hits (MicroMegas, GEMs)
* Clusters on APV detectors
* Track points, built of APV clusters or MSADC hits
* Tracks

For details on how the event data is structured, consider reading the
:ref:`event structure` paragraph of our concepts documentation. For instance,
basic raw hit data (like :cpp:class:`~na64dp::event::RawDataSADC` and
:cpp:class:`~na64dp::event::RawDataAPV`) are just plain C structures made of
scalar values.

.. note::

   NA64sw benefits from C++ strong typing, and the event's data topology is
   defined at compile time. So, there are no handlers that can create new
   types of collections within an event.
   Only creation/deletion of elements within the defined collections can be
   performed by handlers.

From a user's point of view, the most interesting parts are the scalar values
defined in these complex structures. They refer to particular physical
quantities.

.. figure:: ../images/manual/uml-struct-example-03.png
   :align: center

   Illustrative example of an ``Event`` class diagram with aggregations. For
   details on the graphic notation, see :ref:`event structure`.

At a very basic level, every handler deals with one event at a time. Of
course, within an event a handler may iterate over a collection of hits to
apply a certain operation or complex algorithm to every object in this
collection (hit, cluster, track point, track, etc.).

Generic handlers: a foreword
----------------------------

Besides specific handlers designed for a specific purpose (like subtracting
pedestals), NA64sw provides some *generic* tools to access this data: to plot
or modify it, to remove hits from an event, or to exclude an event from
further processing (i.e. apply a cut).

From the C/C++ point of view, *generic* handlers remove the need to implement
a new handler each time one needs to plot (or, say, cut on) a new quantity
within an event or a hit. It may not seem crucial for this basic tutorial,
but if you are familiar with C++ programming you will probably appreciate the
effort saved.

One of the simplest and most common of these operations is to accumulate
a certain value as a 1D histogram. To be more specific, consider the example
from the previous chapter: we calculated the sum of a waveform and used
a handler parameterised with the name of the *value*:

.. code-block:: yaml

    - _type: Histogram1D
      value: sadcHits.rawData.sum
      histName: "sum"
      histDescr: "Amplitude sum, {TBName} ; time, ns ; Events"
      nBins: 100
      range: [0, 5000]

We created a generic handler instance with the specific purpose of plotting
a 1D histogram for every SADC hit. The handler type (class), specified as the
``_type`` parameter, was ``Histogram1D``.

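
Read segment by segment, the ``value`` path in the block above can be
annotated as in the sketch below. The comments are an interpretation based on
the class names used in this chapter (``sadcHits`` is assumed to be the
event's collection of ``SADCHit`` objects), not an authoritative description
of the handler.

.. code-block:: yaml

    - _type: Histogram1D    # handler type (class) to instantiate
      # Path to the plotted scalar, resolved for every hit:
      #   sadcHits : collection of SADC hits within the event (assumed)
      #   rawData  : raw-data member of a hit (RawDataSADC)
      #   sum      : scalar attribute filled in the previous chapter
      value: sadcHits.rawData.sum
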
Parameters ``histName``, ``histDescr``, ``nBins``, and ``range`` are naturally
expected for a 1D histogram. But ``value`` refers to a certain value within
a hit, and the question one immediately asks is how to know which data can be
retrieved as ``value``: what else can we plot? There are a number of ways to
find out:

* One way is the handlers list provided here, in the documentation: see the
  :doc:`/handlers-lists/all` page. Handlers are divided by their subject:
  :doc:`MSADC hits `, :doc:`APV hits `, :doc:`clusters `, :doc:`tracking `,
  etc. At the top of every page, all the attributes of the corresponding data
  type are listed (in the "Structure" paragraph).
* Another way is to refer to the :ref:`event structure diagram `. This
  diagram is rather bulky but comprehensive.
* Lastly, once one gets familiar with the C/C++ API of NA64sw, one can look
  directly into the headers of the ``na64event`` library. The event and all
  its collections are merely C++ structures with accessor functions defined
  in static arrays.

Let's consider a practical example: building a 2D correlation plot between
the maximum sample amplitude of an (M)SADC waveform and its sum. These values
have a practical meaning for physical data reconstruction, as they are both
proportional to the energy deposition in calorimeter cells. If you have
followed the previous chapter, you may try to do this task on your own; the
section below describes the expected steps.

Exercise: building a pipeline
-----------------------------

1. The ``sum`` value is already calculated by our ``custom.yaml`` config from
   the previous chapter. If you've lost this file, you may copy a pre-defined
   one:

   .. code-block:: shell

       $ cp $NA64SW_PREFIX/share/na64sw/run/tutorial/03-generics.yaml custom.yaml

2. To get the maximum value defined:

   * Find the maxima-finding handlers for SADC by ``grep``'ing the
     ``na64sw-pipe -l`` output:

     .. code-block:: shell

         $ na64sw-pipe -l | grep max

   * We see a number of matching results.
     Finding the maximum sample in an SADC waveform is a somewhat delicate
     task, so there are several algorithms, but for the purposes of this
     tutorial let's limit ourselves to the simplest variant:
     :cpp:class:`~na64dp::handlers::SADCFindMaxSimple`. According to its
     description, this handler needs no settings except for the optional
     ``applyTo``, which we will discuss a bit later. Since it is optional, it
     is safe to omit it for now.

   * Append to ``custom.yaml``:

     .. code-block:: yaml

         - _type: SADCFindMaxSimple

     to get ``maxSample`` and ``maxValue`` of
     :cpp:class:`~na64dp::event::RawDataSADC` set (as promised in the
     handler's documentation).

3. To build the plot itself:

   * For this task we need a 2D histogram plotter that operates within
     a single instance of ``SADCHit``. A generic handler for that is
     ``Histogram2D``, and we can parameterize it with the max/sum values.
   * On the :doc:`generic handlers page ` we see the usage example for the
     handler :cpp:class:`~na64dp::Histogram2D`. It is very similar to its 1D
     counterpart; the main difference is that parameter names now have ``X``
     and ``Y`` suffixes.
   * Let's say we want to get the sum value calculated in a previous step.
     The descriptions of
     :cpp:class:`~na64dp::handlers::SADCDirectSum`/:cpp:class:`~na64dp::handlers::SADCLinearSum`
     refer to the ``sum`` attribute of the
     :cpp:class:`~na64dp::event::RawDataSADC` member of
     :cpp:class:`~na64dp::event::SADCHit`. The documentation says it can be
     accessed via the ``rawData.sum`` path.
   * Limits for the histogram can only be tuned empirically, since for raw
     (uncalibrated) values there are very few prior expectations.
   * This way, the block we need to visualize the max/sum correlation is:

     .. code-block:: yaml

         - _type: Histogram2D
           valueX: sadcHits.rawData.sum
           valueY: sadcHits.rawData.maxAmp
           histName: "sumVsMax"
           histDescr: "MSADC sum vs max correlation, {TBName} ; sum ; max"
           nBinsX: 100
           rangeX: [0, 25000]
           nBinsY: 100
           rangeY: [0, 5000]

Running the pipeline app with this configuration should produce this kind of
plot for every (M)SADC detector in the ``.root`` output file.

.. image:: ../images/manual/tutorial-3-sum-vs-amp.jpg

The strong correlation observed for most detectors of this type nicely
demonstrates that the integrated charge and the maximum signal value are
expected to agree closely for a given PMT pulse shape.
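
Put together, the items appended to ``custom.yaml`` in this exercise could
look like the sketch below. It assumes that the handlers from the previous
chapter that fill ``sum`` are already listed above these items and that
handlers run in the order they are listed. The trailing ``Histogram1D`` item
is an optional cross-check that is not part of the exercise; its ``histName``,
title, and range are illustrative guesses to be tuned empirically.

.. code-block:: yaml

    # ... handlers from the previous chapter that fill rawData.sum ...

    # Fill the maximum-related fields of every SADC hit
    - _type: SADCFindMaxSimple

    # 2D correlation between the waveform sum and its maximum
    - _type: Histogram2D
      valueX: sadcHits.rawData.sum
      valueY: sadcHits.rawData.maxAmp
      histName: "sumVsMax"
      histDescr: "MSADC sum vs max correlation, {TBName} ; sum ; max"
      nBinsX: 100
      rangeX: [0, 25000]
      nBinsY: 100
      rangeY: [0, 5000]

    # Optional (hypothetical) 1D histogram of the maximum alone
    - _type: Histogram1D
      value: sadcHits.rawData.maxAmp
      histName: "maxAmp"
      histDescr: "Max amplitude, {TBName} ; max ; Hits"
      nBins: 100
      range: [0, 5000]

Keeping the maximum-finding handler ahead of both histograms reflects the
assumption that a value must be filled before a generic handler can read it.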