.. _detectors naming:

Detectors naming
================

.. warning::

    This page is combined from a few different parts originally belonging to
    different documents, and probably has structural issues. It has to be
    revised.

For a number of reasons, ``na64sw`` uses numerical identifiers to refer to
particular sensitive entities (detectors) in the experiment.

Although detectors are usually denoted with a string (e.g. "ECAL", "HCAL",
"MM3", etc.), it is fairly common to have some sort of detector enumeration.

.. note::

   You can find examples in NA64 software: the Genova group's TTree introduces a
   scheme to enumerate calorimeters, Donskov's applications represent all the
   (M)SADC detectors as a linear index, with the index being a major identifier
   across all their code, etc.

NA64sw identifiers are *semantic*, meaning that each ID encodes some information
about the detector's DAQ chip, its kin, its station number and the elementary
sensitive entity number within a station or plane (e.g. wire or cell). This way
each DAQ entity has its own detector ID, naturally corresponding to a channel,
so every sensitive entity in the experiment has its own unique numerical ID
that at the same time encodes certain features of this entity.

This page covers various aspects of this numerical detector ID composition and
usage.

Sections
--------

The numerical detector ID itself is just an integer value (of the standard
4-byte size, ``uint32_t``) that may be decomposed (decoded) into the following
semantic units (*sections*):

- detector DAQ chip
- detector type ("kin" in the internal terminology)
- detector station number
- payload section, whose interpretation usually depends on the chip type

Each section may be *set* to a certain value or left *unset*. This way one can
perform operations over hits of a certain detector class (say, *all* Micromegas,
by setting only chip and kin) without specifying particular station numbers.
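
The idea of independently set and unset sections can be illustrated with a
minimal, self-contained sketch. The field widths, the ``sketch`` namespace, the
function names and the convention that an all-zero field means *unset* are
assumptions made for illustration only; the actual ``DetID`` layout is defined
in the ``na64sw`` sources and may differ.

.. code-block:: cpp

    #include <cstdint>
    #include <initializer_list>

    namespace sketch {

    // Hypothetical layout, NOT the real DetID one:
    // [chip: 4 bits][kin: 5 bits][number: 7 bits][payload: 16 bits].
    // Each field stores value+1, so an all-zero field means "unset".
    struct Section { std::uint32_t shift, mask; };
    constexpr Section kChip    {28, 0xFu};
    constexpr Section kKin     {23, 0x1Fu};
    constexpr Section kNumber  {16, 0x7Fu};
    constexpr Section kPayload { 0, 0xFFFFu};

    // Returns a copy of `id` with section `s` set to `value`
    inline std::uint32_t set_section(std::uint32_t id, Section s, std::uint32_t value) {
        return (id & ~(s.mask << s.shift)) | (((value + 1) & s.mask) << s.shift);
    }
    // Returns a copy of `id` with section `s` cleared (unset)
    inline std::uint32_t unset_section(std::uint32_t id, Section s) {
        return id & ~(s.mask << s.shift);
    }
    inline bool is_set(std::uint32_t id, Section s) {
        return ((id >> s.shift) & s.mask) != 0;
    }
    // Meaningful only if is_set(id, s) is true
    inline std::uint32_t get_section(std::uint32_t id, Section s) {
        return ((id >> s.shift) & s.mask) - 1;
    }

    // A pattern matches an ID if every section set in the pattern
    // has the same value in the ID; unset sections act as wildcards.
    inline bool matches(std::uint32_t pattern, std::uint32_t id) {
        for (Section s : {kChip, kKin, kNumber, kPayload}) {
            if (is_set(pattern, s) && get_section(pattern, s) != get_section(id, s))
                return false;
        }
        return true;
    }

    }  // namespace sketch

With such a layout, a pattern that has only chip and kin set matches every
station and every payload of that detector class, which is the kind of
class-wide selection described above.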

Instantiation of detector ID object
-----------------------------------

Though a detector ID is always a number, a helper class (``DetID``) is wrapped
around it to provide a convenient way to set, unset, and retrieve the value of
each *section*.

One can instantiate the detector ID helper struct with

.. code-block:: cpp

    DetID did;

(the type is defined in the ``na64detID/detectorID.hh`` header file, which has
to be included). With this default constructor all sections are *unset*. One
can then set sections with the eponymous methods:

.. code-block:: cpp

    did.chip(1);
    did.kin(23);
    did.number(3);
    did.payload(1234);

or, alternatively, one can set these values with the constructor:

.. code-block:: cpp

    DetID did(1, 23, 3, 1234);

To retrieve the numeric value of a particular section, call the eponymous
getter methods:

.. code-block:: cpp

    int detectorChip = did.chip();
    int detectorKin = did.kin();
    int stationNumber = did.number();
    int payload = did.payload();

To *unset* the section value, use the eponymous method with ``unset_`` prefix:

.. code-block:: cpp

    did.unset_chip();
    did.unset_kin();
    did.unset_number();
    did.unset_payload();

At any time one can access the entire detector ID *numeric value* via the ``id``
public attribute:

.. code-block:: cpp

    unsigned int numericValue = did.id;

All set/unset operations affect this value immediately (they actually just
modify bits of this number).

Semantics and names
-------------------

Though one can directly use plain integer literals (``1``, ``42``, etc.) to
specify sections as shown above, **it is not recommended**, since the meaning
(semantics) of a particular number is not clear to the people who maintain
the code.

Instead, one should rely on the naming information defined in an instance of
the ``DetectorNaming`` class. This instance is typically provided in various
contexts of the ``na64sw`` framework. For most of the handlers (every subclass
of the most frequently used ``SelectiveHitHandler``) it may be retrieved with
the ``naming()`` method:

.. code-block:: cpp

    DetID did;
    did.chip( naming().chip_id("APV") );

Note that ``kin`` is meaningless without its ``chip``. Since it is practically
impossible to have, say, a hadronic calorimeter managed by an APV chip, only
the chip+kin code pair makes sense. That is why ``DetectorNaming::kin_id()``
returns a pair of numbers: a chip code as ``first`` and a kin code as
``second``. So, to obtain an identifier matching all Micromega stations,
consider this example:

.. code-block:: cpp

    DetID did;
    did.chip( naming().kin_id("MM").first );
    did.kin(  naming().kin_id("MM").second );

A shortcut exists for such use cases -- the ``[]`` operator of ``DetectorNaming``
instances:

.. code-block:: cpp

    DetID did = naming()["MM"];

It also supports names with a station number (this way the chip, kin and
station number values will all be set):

.. code-block:: cpp

    DetID did = naming()["MM02"];

Although retrieving values this way is convenient, doing so each time may cause
minor performance issues, as the string-to-int conversion still costs some CPU.
Let us consider a few practical scenarios of how one can easily mitigate this
effect.
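
One common way to avoid the repeated conversion is to resolve the name once
and keep the numeric result, so that per-hit code only compares integers. The
sketch below is self-contained and does not use the framework: ``MockNaming``
and ``CachedSelector`` are hypothetical stand-ins, and the mock returns a plain
integer rather than a ``DetID``.

.. code-block:: cpp

    #include <cstdint>
    #include <map>
    #include <string>

    // Hypothetical stand-in for a name-to-ID lookup, used only to keep this
    // sketch self-contained; it is NOT the DetectorNaming class.
    struct MockNaming {
        std::map<std::string, std::uint32_t> ids;
        mutable unsigned lookups = 0;  // counts string-based lookups

        std::uint32_t operator[](const std::string & name) const {
            ++lookups;
            return ids.at(name);  // throws std::out_of_range for unknown names
        }
    };

    // Resolves the name once at construction; accepts() compares integers only
    class CachedSelector {
        std::uint32_t target_;
    public:
        CachedSelector(const MockNaming & nm, const std::string & name)
            : target_(nm[name]) {}
        bool accepts(std::uint32_t hitID) const { return hitID == target_; }
    };

Since the name mapping may change between setups, a cached value of this kind
would have to be refreshed whenever the naming instance is updated.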

Comparison
----------

Values are not unique within a section, i.e. two different detector types
(say, a tracker and a calorimeter) can have the same kin numeric value, but
together with their chip values the IDs will be different.

.. code-block:: cpp

    DetID trackerID(1, 2);
    DetID caloID(2, 2);
    caloID.kin() == trackerID.kin();  // yields true
    caloID.chip() == trackerID.chip();  // yields false

To compare the full kin identifier one can make a ``std::pair<>`` object from
the chip and kin identifiers and compare it with another pair. The naming
instance provides a convenient method for such fast comparisons with respect to
a particular detector kin:

.. code-block:: cpp

   std::make_pair(did.chip(), did.kin()) == naming().kin_id("ECAL");
   // ^^^ evaluates to `true` for all ECAL cells and ECAL sub-entities

Payload Section
---------------

The meaning of the payload section varies between chips. It usually defines a
finer segmentation of the detector, where one exists: for SADC detectors, which
are typically segmented calorimeters, the payload describes the x/y/z segment
number, while for APV-based detectors, which are typically tracking planes, the
payload defines the plane projection (one of X/Y/U/V) and the wire number.

Usage example of constructing a full detector ID that refers to the 3x3 cell of
the ECAL preshower:

.. code-block:: cpp

    DetID did = naming()["ECAL"];
    CellID cid( 3, 3, 0 );
    did.payload(cid.cellID);

or, the same thing in a shorter form:

.. code-block:: cpp

    DetID ecalDid = naming()["ECAL"];
    ecalDid.payload( CellID( 3, 3, 0 ).cellID );

Micromega 3, wire 33 on plane U:

.. code-block:: cpp

    DetID mmDid = naming()["MM03"];
    mmDid.payload( WireID( WireID::kU, 33 ).id );

Decoding is done by explicitly casting the payload section to a certain type.
For instance, retrieving the x/y/z indexes from a full ECAL cell ID:

.. code-block:: cpp

    CellID cid(ecalDid.payload());
    int cellX = cid.get_x()
      , cellY = cid.get_y()
      , cellZ = cid.get_z()
      ;

And the projection ID and wire number from a Micromega's full wire ID:

.. code-block:: cpp

    WireID wid(mmDid.payload());
    WireID::Projection projID = wid.proj();
    int wireNo = wid.wire_no();

See the reference for the ``CellID`` and ``WireID`` types for a more detailed
explanation of what these types are capable of.
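
How a payload can carry several indexes at once may be easier to see in a
self-contained sketch. The ``SketchCellID`` type below only mimics the
interface used above; its 5-bits-per-index layout is an assumption made for
illustration and is not taken from the real ``CellID`` implementation.

.. code-block:: cpp

    #include <cstdint>

    // Hypothetical packing (NOT the real CellID layout):
    // bits 0-4 hold x, bits 5-9 hold y, bits 10-14 hold z.
    struct SketchCellID {
        std::uint16_t cellID;

        // Encode three indexes (each in the range 0..31) into one number
        SketchCellID(unsigned x, unsigned y, unsigned z)
            : cellID(static_cast<std::uint16_t>(
                  (x & 0x1Fu) | ((y & 0x1Fu) << 5) | ((z & 0x1Fu) << 10))) {}

        // Re-interpret an already encoded payload value
        explicit SketchCellID(std::uint16_t payload) : cellID(payload) {}

        unsigned get_x() const { return cellID & 0x1Fu; }
        unsigned get_y() const { return (cellID >> 5) & 0x1Fu; }
        unsigned get_z() const { return (cellID >> 10) & 0x1Fu; }
    };

Encoding with the three-argument constructor and decoding with the
single-argument one form a round trip, which is the property the payload
casts above rely on.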

Practical Scenario #1: Handler
------------------------------

.. todo::

    Provide an example of using ``DetID`` within a C/C++ handler API (use
    ``naming()["MM03X"]`` etc).

Practical Scenario #2: Data Source
----------------------------------

.. todo::

    Provide an example of using ``DetID`` within a C/C++ data source API (use
    availability of ``DetectorsNaming`` instance, etc).

General Practical Scenario: Subscription
----------------------------------------

.. todo::

    Provide an example of using ``DetID`` within generic C/C++ routines (use
    availability of ``calib::Handle<DetectorsNaming>`` instance, etc).

.. _TBName mapping:

Notes on TBName Mapping
-----------------------

Raw data in NA64 is provided in a format handled by ``DaqDataDecoding`` (DDD).
This library came from the COMPASS (NA58) experiment and provides versatile
integration with the hardware DAQ (data acquisition) system inherited by NA64
from COMPASS.

This document gives only a brief description of the hierarchy. For detailed
information, see the original code of the NA58 DAQ -- the ``DaqDataDecoding``
library is a part of the CORAL software distribution.

Hardware Electronics Chips
--------------------------

The key element of any DAQ system is the particular chip that provides
measurements in certain digital form for dispatching and subsequent storage.
One type of chip may serve a number of detectors.

E.g. the SADC is a chip widely used by both experiments to acquire data from
numerous types of detectors: calorimeters, drift chambers, veto detectors and so
on. It mostly consists of two samplers (sampling rate is about 40ns) running in
parallel with a shift of 25ns. Thus, a single SADC chip provides an effective
sampling interval of 12.5ns. The time window of the SADC chip is limited to 32
samples.
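
Taking the 12.5ns sampling interval and the 32-sample limit quoted above at
face value, the time window covered by one SADC readout is 32 x 12.5ns = 400ns.
The constant and function names in this small sketch are hypothetical and only
restate that arithmetic.

.. code-block:: cpp

    // Values quoted in the text above; names are illustrative only
    constexpr double kSampleIntervalNs = 12.5;
    constexpr int kNSamples = 32;

    // Time offset of the i-th sample relative to the first one, in ns
    constexpr double sample_time_ns(int i) { return i * kSampleIntervalNs; }

    // Total time window covered by all samples, in ns
    constexpr double kWindowNs = kNSamples * kSampleIntervalNs;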

Another important chip is the APV. Originally developed for CMS detectors, it
serves many experiments at CERN as an analog multiplexer, gathering signals
from multiple inputs and transmitting their read-out over a single output
line so that multiple inputs may be measured by one sampling ADC.

An assumption we made when designing our software system is that each
detector is managed by a chip of a certain type.

.. _TBname:

TBname
------

DDD has a concept of a "*TBName*". The *TBName* is a short string usually
corresponding to a single detector or an assembly part of a detector. The
particular naming scheme is usually chosen for DAQ convenience. E.g. the
electromagnetic calorimeter of NA64 consists of two parts (preshower and main
part) with corresponding distinct TBNames: ``ECAL0`` for the preshower part and
``ECAL1`` for the main part.

Generally, a TBName refers to a detector of a certain type: ``ECAL0``, ``ECAL1``
are for electromagnetic calorimeters, ``HCAL0``, ``HCAL1``, ``HCAL2``, ``HCAL3``
are for the hadronic calorimeter, ``GM01X__``, ``GM02Y__`` are for GEMs (gas
electron multiplier tracking detectors) and so on.

So far, one may assume that a TBName consists of at least two semantic units:
the detector *kin* (like ``ECAL``, ``GM``, ``HCAL``, ``DC``, etc.) and the
station number. A third, optional part is an arbitrary naming *postfix* (``X``,
``Y__``, ``Ybl``, etc.) bearing various assembly-specific information.

For example:

* ``ECAL0`` consists of the detector *kin* name -- ``ECAL``
  and the *station number* -- ``0``
* ``MM03U`` consists of three parts: *kin* -- ``MM`` (Micromega detector),
  *station number* -- ``03``, and postfix -- ``U`` denoting that this entry
  identifies the U-projection plane.
* ``ST03Ybl`` (from NA58) stands for the straw detector kin (``ST``), station
  number ``03`` and the assembly part for the Y-plane at its "bottom left"
  (``bl``).
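
The decomposition used in these examples can be sketched as a small
stand-alone routine. This is only an illustration of the kin/number/postfix
split under the assumption that the kin consists of letters and is directly
followed by the station digits; the ``TBNameParts`` and ``split_tbname`` names
are hypothetical, and the sketch does not reproduce the actual conversion done
by the framework.

.. code-block:: cpp

    #include <cctype>
    #include <cstddef>
    #include <string>

    struct TBNameParts {
        std::string kin;      // leading letters, e.g. "MM"
        int number;           // station number, -1 if the name has no digits
        std::string postfix;  // everything after the digits, e.g. "U" or "Ybl"
    };

    // Splits a name into leading letters, following digits and the remainder
    inline TBNameParts split_tbname(const std::string & name) {
        TBNameParts p{"", -1, ""};
        std::size_t i = 0;
        while (i < name.size() && std::isalpha(static_cast<unsigned char>(name[i])))
            ++i;
        p.kin = name.substr(0, i);
        std::size_t j = i;
        while (j < name.size() && std::isdigit(static_cast<unsigned char>(name[j])))
            ++j;
        if (j > i)
            p.number = std::stoi(name.substr(i, j - i));
        p.postfix = name.substr(j);
        return p;
    }

A kin name that itself contains digits would defeat this simple split, which is
one reason to keep the real mapping in a configurable object rather than in
hard-coded parsing rules.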

Such a partitioning allows trigger hardware to refer to a certain part
important for triggering, while at the same time it leads to some ambiguity in
analysis tools, forcing data treatment code to handle multiple parts as a
special case although they logically form a single detector assembly.
Moreover, the ``TBName`` rarely refers to finer partitioning. E.g.
calorimeters usually consist of cells and tracking detectors are arrays of
wires. In native DDD this is a subject of *digits* -- information about the
triggered part of the event. Logically, however, it is more about the
*location* of the hit and thus should be affiliated with the detector ID.

Detectors identifiers in NA64SW and DAQ TBname
----------------------------------------------

In this software we propose another approach to identifying detectors. Instead
of TBNames, numerical identification is used. By means of bit partitioning,
the full detector ID is packed into a single integer identifier bearing
information about *chip*, *kin*, *station number* and a detector-specific
entity number that we refer to below as the *payload*. Note that the *payload*
here does not refer to a detector digit (the payload of hit information), but
has to be interpreted as the *payload* of the detector ID.

The mapping of TBNames to numerical IDs is controlled at runtime by a special
object providing coding shortcuts for conversion from names to numbers and
vice versa (we foresee setup changes involving changes in the TBName
nomenclature).
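
A two-way mapping of this kind can be sketched with two ordinary maps kept in
sync. The ``NameMapSketch`` class below is a hypothetical illustration of the
concept only; it is not the framework's mapping object, and its method names
are made up for this example.

.. code-block:: cpp

    #include <cstdint>
    #include <map>
    #include <stdexcept>
    #include <string>

    // Hypothetical bidirectional name <-> numeric ID table filled at runtime
    class NameMapSketch {
        std::map<std::string, std::uint32_t> byName_;
        std::map<std::uint32_t, std::string> byID_;
    public:
        // Registers a pair; refuses to overwrite an existing name or ID
        void define(const std::string & name, std::uint32_t id) {
            if (byName_.count(name) || byID_.count(id))
                throw std::runtime_error("duplicate name or ID: " + name);
            byName_[name] = id;
            byID_[id] = name;
        }
        // Both lookups throw std::out_of_range for unknown keys
        std::uint32_t id(const std::string & name) const { return byName_.at(name); }
        const std::string & name(std::uint32_t id) const { return byID_.at(id); }
    };

Because the table is filled at runtime, a change in the naming nomenclature
only requires different ``define()`` calls, not a change in the code that uses
the numeric IDs.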

TBNames and Numerical ID Correspondence
---------------------------------------

.. todo::

    Describe here the format of the ``nameMaps`` object in
    ``presets/calibration.yaml`` once it is settled

Two-Staged Conversion Procedure
-------------------------------

.. todo::

    Describe here the role of the ``postfix`` argument in
    ``TBNameMappings::detector_id_by_ddd_name()`` and details on what is called
    the "full name" as opposed to the TBName, once the per-detector selection
    procedure is finalized within handlers

.. _TDirAdapter usage:

String templates and TDirAdapter class
--------------------------------------

.. todo::

    Write this up when the path naming templates are clearly understood
    and implemented not only for particular detector planes, but also for
    detector assemblies and so on

.. _DSuL:

Detector Selection DSL
----------------------

To apply handlers to certain detectors, a small domain-specific language
is used. To reflect the fact that this language does not offer the
sophisticated functionality one would expect from a full artificial grammar,
we sometimes call it the "detector selection domain-specific language", or
DSuL, where the "u" stands for "micro-" (language).

The basic grammar somewhat resembles C++ logic expressions where a number of
variables are "defined" externally. The exact description of the grammar
may be found in the ``presets/ds-dsl.y`` file, which contains the assets needed
to automatically generate parsing and evaluation routines for this language.

Examples:

* ``kin==MM && (station==7 || station==2)`` -- selects only the Micromega hits
  from stations #7 and #2.
* ``kin==HCAL && number==2 && (xIdx==2 && yIdx==2)`` -- selects only hits
  from the second module of the hadronic calorimeter (HCAL) in cell 2x2.
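
For readers more at home in C++, the logic of the first expression corresponds
roughly to the predicate below. The ``HitContext`` struct, the function name
and the numeric value chosen for ``MM`` are hypothetical; in the DSuL these
names are resolved from the detector naming rather than hard-coded.

.. code-block:: cpp

    // Hypothetical per-hit context; in DSuL these values are provided externally
    struct HitContext {
        int kin;
        int station;
    };

    // Assumed numeric code for the Micromega kin, for illustration only
    constexpr int MM = 23;

    // C++ equivalent of: kin==MM && (station==7 || station==2)
    inline bool select_mm_station_2_or_7(const HitContext & h) {
        return h.kin == MM && (h.station == 7 || h.station == 2);
    }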

Note that besides the common C-like logical expression syntax with integer
literals, composing such expressions requires one more piece of knowledge:
the list of definitions.

DSuL: comparison expressions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The following comparison operators are supported:

* EQUALS ``==``
* DOES NOT EQUAL ``!=``
* GREATER ``>``
* GREATER OR EQUAL ``>=``
* LESS ``<``
* LESS OR EQUAL ``<=``

These operators cannot be applied to the result of an expression, i.e. to
another comparison or logic expression. For instance, the following statements
are invalid:

* ``(kin == GM) != 0`` (may be reduced to ``kin == GM``)
* ``!(kin == GM || kin == MM) == 0 && wireNo == 12`` (may be reduced
  to ``!(kin == GM || kin == MM) && wireNo == 12``).

This is a deliberate choice, since most of the selector expressions used in
practice may be efficiently expressed with pure boolean algebra over value
comparison results.

DSuL: operators
~~~~~~~~~~~~~~~

The following logic operators are supported (sorted by precedence):

* Unary NOT (``!``) operator
* Binary AND (``&&``) operator
* Binary OR (``||``) operator
* Binary XOR (``^^``) operator

Note about XOR: C does not provide a logical XOR operator. Where it would be
handy, common C practice suggests using ``!=``. In our DSuL we cannot use this
idiom, as DSuL does not support comparison operators over logic results.
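
For reference, the C/C++ idiom mentioned here looks as follows; the
``logical_xor`` helper name is made up for this illustration. In C++ the
operands are converted to ``bool`` by the function signature, whereas in plain
C with integer operands one would typically normalize them first (e.g.
``!a != !b``).

.. code-block:: cpp

    // Logical XOR spelled with `!=` on boolean operands:
    // true exactly when one operand is true and the other is false
    inline bool logical_xor(bool a, bool b) {
        return a != b;
    }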

List of Definitions
~~~~~~~~~~~~~~~~~~~

All the chip (``SADC``, ``APV``) and kin (``GM``, ``MM``, ``HCAL``, etc.) names
defined in the detector naming mapping are available in the expression context
as variables that resolve to their numeric identifiers. Additionally, the
projection identifier letters (``U``, ``V``, ``X``, ``Y``) are also
resolved to their numeric equivalents.

Besides these identifiers, the following properties related to the detector
identifier are available:

* ``chip`` -- will be resolved to current detector chip ID
* ``kin`` -- resolved to current detector kin ID
* ``number`` -- resolved to current detector station number
* (for SADC detectors only) ``xIdx``, ``yIdx`` -- resolved to current
  hit's X/Y indexes (typically, calorimeter cell)
* (for APV detectors only) ``projection`` -- projection identifier
* (for APV detectors only) ``wireNo`` -- APV detector wire number