.. _writing own handler:

Writing your own handler
========================

.. warning::

    This page is currently being written.

Tweaking or combining existing pipelines is enough for relatively simple
tasks such as adjusting parameters of an existing analysis, optimizing cuts
or studying correlations. With this approach, however, the user is restricted
to the set of extensions (handlers and data sources) provided by the basic
distribution.

Users are encouraged to write their own extensions. This chapter of the
Tutorial covers some basics of the corresponding C/C++ API.

A handler as a C/C++ class
--------------------------

In a nutshell, a handler is an instance of a C/C++ class that inherits the
``na64dp::AbstractHandler`` abstract base and implements its abstract method
``na64dp::AbstractHandler::process_event()``.

.. code-block:: cpp

    #include "na64dp/abstractHandler.hh"  // AbstractHandler base is defined here
    #include "na64event/data/event.hh"  // for Event structure details

    // declare a handler class
    class MyHandler : public na64dp::AbstractHandler {
    public:
        // a function that will be invoked for each processed event. Based on
        // its return code, a pipeline may continue to propagate the event,
        // stop, or abort processing
        ProcRes process_event( na64dp::event::Event & event ) {
            return kOk;
        }
    };

    // this block is needed to tell NA64SW how to construct the handler
    // using the provided YAML configuration
    REGISTER_HANDLER(MyHandler, cdsp, cfg, "My handler.") {
        return new MyHandler();
    }

This is roughly the minimal amount of code needed to declare the simplest
possible handler. It defines a class ``MyHandler`` that inherits the
``AbstractHandler`` base and implements ``process_event()`` so that nothing is
done with the event passed to it. ``ProcRes`` is a C/C++ ``enum`` that defines
how the pipeline shall proceed with the event this handler has just considered.
Possible return codes:

* ``kOk`` -- propagate the event further
* ``kDiscriminateEvent`` -- makes the pipeline give up the current event and
  take the next one from the source, starting again from the first handler
* ``kStopProcessing`` -- makes the pipeline propagate the current event to the
  end of the pipeline and then stop (so no more events will be taken from the
  source)
* ``kAbortProcessing`` -- a combination of the latter two: the current event
  won't be propagated past the current handler and no further events will be
  considered (instant stop of the pipeline).
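
To make the semantics of these return codes concrete, here is a toy event
loop. It is only an illustrative model written for this tutorial, not the
actual NA64SW pipeline implementation: ``run_pipeline()``, the integer
``Event`` and the ``Handler`` alias are made-up stand-ins, and the ``enum`` is
redeclared locally so that the sketch compiles on its own.

.. code-block:: cpp

    #include <cstddef>
    #include <functional>
    #include <vector>

    // Hypothetical stand-ins; in NA64SW the codes come from
    // na64dp::AbstractHandler.
    enum ProcRes { kOk, kDiscriminateEvent, kStopProcessing, kAbortProcessing };

    using Event = int;  // toy event: just a number
    using Handler = std::function<ProcRes(Event &)>;

    // Toy event loop illustrating the four return codes described above.
    // Returns the number of events taken from the source.
    std::size_t run_pipeline( const std::vector<Event> & source
                            , const std::vector<Handler> & handlers ) {
        std::size_t nRead = 0;
        bool stopRequested = false;
        for( Event event : source ) {
            ++nRead;
            for( const Handler & handler : handlers ) {
                const ProcRes rc = handler(event);
                // drop this event only, go on with the next one
                if( rc == kDiscriminateEvent ) break;
                // let this event reach the end of the pipeline, then stop
                if( rc == kStopProcessing ) stopRequested = true;
                // stop immediately, remaining handlers never see the event
                if( rc == kAbortProcessing ) return nRead;
            }
            if( stopRequested ) break;
        }
        return nRead;
    }

With a source of five events and a first handler that returns
``kStopProcessing`` on the fourth one, this loop still passes the fourth event
to the remaining handlers and never reads the fifth.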

The only thing remaining is to tell NA64SW how to construct a new handler
instance from a YAML node. For convenience, a macro ``REGISTER_HANDLER()`` is
defined in the ``na64dp/abstractHandler.hh`` header. It takes four parameters
and is defined in such a way that the C/C++ code block following it becomes a
function body.

Parameters are:

1. The handler name that will be used in the ``_type: ...`` attribute of the
   YAML object (no quotes needed).
2. The name of the calibration dispatcher variable
3. The name of the ``YAML::Node`` instance used to configure the handler
4. A text description of the handler

We will discuss the second and third parameters later; for now, let's use the
code snippet above to actually build something.

Compiling extensions
--------------------

Extensions for NA64SW are just Linux *shared object* files (``.so``, i.e.
"shared libraries"). You can put the code above in a file (say,
``myHandler.cc``) and compile it with the command:

.. code-block:: sh

    $ g++ myHandler.cc \
        $(pkg-config na64sw --cflags --libs) \
        -shared -fPIC \
        -o libMyHandler.so

.. todo::

    Currently the genfit include path is not included in cflags, so add the
    following to the command above: ``-I/afs/cern.ch/work/r/rdusaev/public/na64/sw/LCG_96b/x86_64-centos7-gcc62-opt/include/genfit/``

    This is planned to be fixed in the next public build.

Here we used ``pkg-config`` to retrieve the proper compilation options for the
active NA64SW installation (still assuming you are on LXPLUS) and provided
``-shared`` and ``-fPIC`` to make a shared object file ``libMyHandler.so``,
i.e. we built a shared library from C/C++ code.

The handler definition can be loaded into the ``na64sw-pipe`` application
right away:

.. code-block:: sh

    $ na64sw-pipe -m MyHandler -l | grep MyHandler

We have started the pipeline application in extension-listing mode
(option ``-l,--list``) and provided our brand new extension to it.
The filtered lines should show that the handler is loaded and available for
use.

One can, of course, instantiate the handler with some pipe config, but it
does not produce any side effects yet.

Example: dump event ID
----------------------

Let's slightly modify the example code above to produce some side effect. For
instance, let's make it print the event ID for every event that passes through
the handler. Modify ``process_event()`` as follows (``std::cout`` requires
``#include <iostream>`` at the top of the file):

.. code-block:: cpp

    ProcRes process_event( na64dp::event::Event & event ) {
        std::cout << " event " << event.id << std::endl;
        return kOk;
    }

The simplest possible pipeline config that uses this handler would be (say,
``myHandler-test.yaml``):

.. code-block:: yaml

    pipeline:
        - _type: MyHandler

You can run it as usual, with our new extension:

.. code-block:: sh

    $ na64sw-pipe -m MyHandler -r myHandler-test.yaml \
        -N 100 /eos/experiment/na64/data/cdr/cdr01002-003292.dat

The event ID printed to ``stdout`` contains the run number, spill count and
event-in-spill ID (the standard event identifier).

.. tip::

    One can find the event structure definitions in the ``na64event/data/``
    directory (`link on gitlab`_)
    to learn more about the C/C++ definitions of the event and hits.

.. _`link on gitlab`: https://gitlab.cern.ch/P348/na64sw/-/tree/dev/include/na64event/data

Iterating hit collections
-------------------------

Several collections are defined within an event: detector hits of various
types, hit clusters, track points, tracks, etc. Technically, these collections
are various STL maps (``std::map<>``, ``std::multimap<>``) indexed by a
numerical ID denoting a certain detector entity (like a calorimeter cell,
detector plane or station).

Those numerical IDs have an elaborate meaning and can be translated into a
human-readable form, the so-called *TBname* ("ECAL", "MM02X", etc.), by means
of a special object that we will cover in the next section. Let's see how
these collections can be accessed in the ``process_event()`` body:

.. code-block:: cpp

    for( auto & sadcHitEntry : event.sadcHits ) {
        na64dp::DetID detID = sadcHitEntry.first;  // numerical identifier for a hit
        na64dp::event::SADCHit & hit = *sadcHitEntry.second;  // dereference ptr to a hit

        std::cout << "detID=" << detID
                  << ", SADC hit energy deposition: " << hit.eDep
                  << std::endl;
    }

You can rebuild your extension and run this pipeline to see the result. It
won't be very informative yet, though:

.. code-block:: shell

    detID=1107363909, SADC hit energy deposition: 0.618324
    detID=1107363011, SADC hit energy deposition: 0.00808365

.. tip::

    One can *include* another pipeline into yours with a special handler
    called ``Subpipe``. It effectively embeds a pipeline from another pipe
    config. For instance, adding this

    .. code-block:: yaml

        - _type: Subpipe
          pipeline: default

    to the beginning of your ``myHandler-test.yaml`` will provide you with
    some meaningful data for hits (``default`` is the standard pipe config
    performing basic common reconstruction).

Detector ID
-----------

Numerical identifiers within collections are the fastest possible way
to enumerate objects. They also provide some useful semantics for
efficient selection of certain hits (e.g. by cell number or projection plane).

A prominent drawback, however, is that one needs a special object (called
``DetectorNaming``) to convert these numbers to strings wherever they must be
shown to a human. For reasons that are out of the scope of this tutorial, we
defined this conversion (numerical identifier to string and vice versa) as a
piece of information that depends on the *run number* (like calibration data,
for instance). Within NA64SW this kind of information is provided by a
subscription mechanism expressed with the template class ``calib::Handle``.
To utilize it, the following minor additions have to be made:

* Include the ``na64calib/manager.hh`` header for the ``Handle`` class and
  ``na64detID/TBName.hh`` for the ``DetectorNaming`` class
* Add the ``Handle<DetectorNaming>`` class to your handler's parents to provide
  the handler with subscription functionality
* Forward the calibration data dispatcher instance to ``Handle``'s constructor

Then one can get the current instance of ``nameutils::DetectorNaming``
by invoking ``get()`` within the ``process_event()`` method. Its overloaded
``[]`` operator provides the ID-to-string conversion. So, to dump the
human-readable ID of a hit one can invoke ``get()[detID]``.

As a result, your code should look like this:

.. code-block:: cpp

    #include <iostream>  // for std::cout
    #include "na64dp/abstractHandler.hh"  // AbstractHandler base is defined here
    #include "na64event/data/event.hh"  // for event struct details
    #include "na64calib/manager.hh"  // for calib::Dispatcher and calib::Handle<>
    #include "na64detID/TBName.hh"  // for nameutils::DetectorNaming

    class MyHandler : public na64dp::AbstractHandler
                    , public na64dp::calib::Handle<na64dp::nameutils::DetectorNaming> {
    public:
        MyHandler( na64dp::calib::Dispatcher & cdsp )
                : na64dp::calib::Handle<na64dp::nameutils::DetectorNaming>("default", cdsp)
                {}

        ProcRes process_event( na64dp::event::Event & event ) {
            std::cout << event.id << std::endl;

            for( auto & sadcHitEntry : event.sadcHits ) {
                na64dp::DetID detID = sadcHitEntry.first;
                na64dp::event::SADCHit & hit = *sadcHitEntry.second;

                std::cout << "detID=" << get()[detID]
                          << ", SADC hit energy deposition: " << hit.eDep
                          << std::endl;
            }

            return kOk;
        }
    };

    REGISTER_HANDLER(MyHandler, cdsp, cfg, "My handler.") {
        return new MyHandler(cdsp);
    }

This handler will produce output similar to:

.. code-block:: shell

    detID=ECAL0:3-1-1, SADC hit energy deposition: 0.618324
    detID=ECAL0:1-5-0, SADC hit energy deposition: 0.00808365

Parameterising a handler
------------------------

NA64SW uses the yaml-cpp_ library to parse and pass around various parameters.
It defines a clean and handy syntax for data retrieval. One can use the
function body defined just after the ``REGISTER_HANDLER`` macro to retrieve
parameters from the corresponding YAML node and pass them to the handler's
constructor.
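
The pattern can be sketched without any NA64SW or yaml-cpp dependency. In the
example below everything is a stand-in invented for illustration: ``Config``
plays the role of ``YAML::Node``, ``make_handler()`` plays the role of the
function body following ``REGISTER_HANDLER``, and ``ThresholdHandler`` with
its ``threshold`` parameter is a hypothetical handler. In a real extension the
lookup would presumably read ``cfg["threshold"].as<double>()`` and the result
would be passed to ``new MyHandler(...)``.

.. code-block:: cpp

    #include <map>
    #include <memory>
    #include <stdexcept>
    #include <string>

    // Stand-in for YAML::Node (an assumption made to keep the sketch
    // self-contained): parameter name -> numeric value.
    using Config = std::map<std::string, double>;

    // Hypothetical handler that keeps a parameter given at construction time.
    class ThresholdHandler {
    public:
        explicit ThresholdHandler( double threshold ) : _threshold(threshold) {}
        // true if the energy deposition passes the configured threshold
        bool accepts( double eDep ) const { return eDep >= _threshold; }
    private:
        double _threshold;
    };

    // Plays the role of the function body that follows REGISTER_HANDLER:
    // read the parameters from the config, then forward them to the
    // handler's constructor.
    std::unique_ptr<ThresholdHandler> make_handler( const Config & cfg ) {
        auto it = cfg.find("threshold");
        if( it == cfg.end() )
            throw std::runtime_error("missing required parameter \"threshold\"");
        return std::unique_ptr<ThresholdHandler>(
                new ThresholdHandler(it->second));
    }

Keeping the parameter lookup in the construction function and out of the
class itself leaves the handler free of configuration details, and a missing
parameter is reported once, at construction, instead of on every event.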

.. _yaml-cpp: https://github.com/jbeder/yaml-cpp/wiki/Tutorial

Discussion and further reading
------------------------------

The above example demonstrates the basics of handler usage.

* Each handler is just a C/C++ class instance (object)
* It is created by a user-controlled function
* Its lifetime is nearly the same as that of the pipeline itself
* Within this object one may store persistent data -- statistical
  variables (sums, counters), file descriptors, plots, etc.
* An event is processed by a handler within a special method (function)
  ``process_event()``
* Within this method a handler has full access to the whole event data for
  reading and modification
* A special method (function) called ``finalize()`` is invoked just before
  the handler object's destruction.
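
The points about persistent state and ``finalize()`` can be illustrated with
a small self-contained sketch. The ``Event`` and ``AbstractHandler`` types
below are simplified stand-ins declared locally so that the example compiles
on its own; in particular the ``eDep`` field and the exact signature of
``finalize()`` are assumptions, so check the NA64SW headers before copying
them into a real handler.

.. code-block:: cpp

    #include <cstddef>
    #include <iostream>

    // Stand-ins for NA64SW types, made up to keep the sketch self-contained.
    // A real extension would use na64dp::event::Event and
    // na64dp::AbstractHandler instead.
    struct Event { double eDep; };  // hypothetical single-value "event"

    class AbstractHandler {
    public:
        enum ProcRes { kOk, kDiscriminateEvent, kStopProcessing, kAbortProcessing };
        virtual ~AbstractHandler() {}
        virtual ProcRes process_event( Event & event ) = 0;
        virtual void finalize() {}  // assumed signature
    };

    // Keeps a counter and a sum between process_event() calls and reports
    // the mean once, in finalize().
    class MeanEDep : public AbstractHandler {
    public:
        ProcRes process_event( Event & event ) override {
            ++_nEvents;
            _sum += event.eDep;
            return kOk;
        }
        void finalize() override {
            if( _nEvents )
                std::cout << "mean eDep: " << mean()
                          << " (" << _nEvents << " events)" << std::endl;
        }
        std::size_t n_events() const { return _nEvents; }
        double mean() const { return _nEvents ? _sum/_nEvents : 0.; }
    private:
        std::size_t _nEvents = 0;
        double _sum = 0.;
    };

Since the handler object lives about as long as the pipeline, the counters
survive between events and the summary is printed only once, at the end.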

That's it for the basic tutorial!

The NA64SW API provides various utility C/C++ tools for writing handlers.
Common needs such as iterating over a set of hits of a certain type, putting
``TObject`` instances (histograms, trees) in a ``TDirectory``, sliding window
statistics, subscribing a handler to certain calibration data, etc. are
covered by its libraries.

These tools are partially covered by the Advanced topics section of the public
docs. For the rest, the best way to learn these features is to study the
C/C++ source code of the existing "standard" handlers.