.. index:: Core Operations; Preferential Attachment

.. _core-operation-preferential-attachment:

PreferentialAttachment
======================

Applies an attachment to the sum of a group of records that share equal values in the columns specified by ``erode_by_columns``. The attachment is applied incrementally, in two steps, to groups of records that share the same values in the columns specified by ``span_columns``. In the first step, the attachment is applied to the preferred records, that is, those whose value in ``preferred_column`` matches one of the ``preferred_values``. In the second step, the remaining attachment is applied to the non-matching records. In both steps, once the attachment is applied, the result is proportionally allocated to all records handled in that step (preferred or non-preferred).

Structure
---------

.. code-block:: json

    {
        "_schema": "PreferentialAttachment_1.0",
        "attachment": 1000,
        "span_columns": ["Time", "OccurrenceKey"],
        "erode_by_columns": ["Time", "OccurrenceKey"],
        "currency": "GBP",
        "preferred_column": ["Region"],
        "preferred_values": ["NAM"],
        "invert": false
    }

Parameters
----------

+----------------------+----------+-----------------+------------------------------------------------------------------------+
| Parameter Name       | Required | Type            | Description                                                            |
+======================+==========+=================+========================================================================+
| ``attachment``       | Yes      | ``double``      | The attachment value in currency units. Cannot be negative.            |
+----------------------+----------+-----------------+------------------------------------------------------------------------+
| ``span_columns``     | No       | ``string list`` | List of column names. Adjacent records with the same values in the     |
|                      |          |                 | provided columns share the same eroding attachment. Supported columns: |
|                      |          |                 | ``OccurrenceKey``, ``Time``. If left empty, all records share the same |
|                      |          |                 | attachment.                                                            |
+----------------------+----------+-----------------+------------------------------------------------------------------------+
| ``erode_by_columns`` | No       | ``string list`` | List of column names. Specifies the grouping of records to which the   |
|                      |          |                 | attachment applies. Supported columns: ``OccurrenceKey``, ``Time``.    |
|                      |          |                 | Always implicitly contains the columns listed in ``span_columns``.     |
+----------------------+----------+-----------------+------------------------------------------------------------------------+
| ``currency``         | No       | ``string``      | The currency in which ``attachment`` is defined.                       |
+----------------------+----------+-----------------+------------------------------------------------------------------------+
| ``preferred_column`` | No       | ``string``      | Name of the metadata field that stores preferred values. The           |
|                      |          |                 | ``OccurrenceKey``, ``Time`` and ``Trial`` columns are not allowed.     |
+----------------------+----------+-----------------+------------------------------------------------------------------------+
| ``preferred_values`` | No       | ``string list`` | List of values. A record whose ``preferred_column`` value matches any  |
|                      |          |                 | of the listed values is treated as preferred.                          |
+----------------------+----------+-----------------+------------------------------------------------------------------------+
| ``invert``           | No       | ``bool``        | Specifies whether the preferential order should be inverted.           |
+----------------------+----------+-----------------+------------------------------------------------------------------------+

Behaviour
---------

The attachment is applied to the input records according to how they are grouped by ``span_columns`` and ``erode_by_columns``, and according to the value each record has in ``preferred_column``.

All adjacent records with matching values in the columns specified by ``span_columns`` are placed within a *span group*. Within a span group, all adjacent records with matching values in the columns specified by ``erode_by_columns`` are further placed in an *erode group*.

The records in an *erode group* are also separated into preferred and non-preferred subgroups. Inside an *erode group*, the attachment is applied to the preferred records first and to the non-preferred records afterwards. The order of the preferred and non-preferred records within an erode group is not important, and they may or may not be adjacent. The preferred column must be one of the metadata columns.

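
The grouping described above can be sketched in Python. This is only an illustration, not the engine's implementation: records are assumed to be plain dictionaries keyed by column name, and ``column_key`` and ``split_into_groups`` are hypothetical helper names. ``itertools.groupby`` is used because it merges only *adjacent* items with equal keys, which matches the adjacency rule for span and erode groups.

.. code-block:: python

    from itertools import groupby


    def column_key(columns):
        """Build a key function returning a record's values in ``columns``."""
        return lambda record: tuple(record[name] for name in columns)


    def split_into_groups(records, span_columns, erode_by_columns):
        """Return a list of span groups, each a list of erode groups."""
        # erode_by_columns always implicitly contains span_columns.
        erode_columns = list(span_columns) + [
            name for name in erode_by_columns if name not in span_columns
        ]
        span_groups = []
        # groupby only merges adjacent records with equal keys.
        for _, span in groupby(records, key=column_key(span_columns)):
            erode_groups = [
                list(group)
                for _, group in groupby(span, key=column_key(erode_columns))
            ]
            span_groups.append(erode_groups)
        return span_groups


    # Hypothetical sample data for illustration only.
    records = [
        {"Time": 1, "OccurrenceKey": "A", "Region": "NAM", "Value": -500.0},
        {"Time": 1, "OccurrenceKey": "A", "Region": "EUR", "Value": -300.0},
        {"Time": 1, "OccurrenceKey": "B", "Region": "NAM", "Value": -200.0},
        {"Time": 2, "OccurrenceKey": "B", "Region": "NAM", "Value": -100.0},
    ]
    groups = split_into_groups(records, ["Time"], ["OccurrenceKey"])

With ``span_columns`` set to ``["Time"]``, the sample yields two span groups: the first holds two erode groups (occurrences ``A`` and ``B``), the second holds one. With empty column lists, every record falls into a single span group containing a single erode group.
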

Within each erode group, the attachment is first applied to the sum of the preferred records' ``Value`` fields (``sum``) with the function ``new_sum = min(sum + attachment, 0)``. This value is then proportionally allocated to the ``Value`` field of each preferred record within the group with the function ``record.value = record.value * new_sum / sum``. If ``sum`` is 0, each record's ``Value`` field is instead set to 0. In either case, the attachment is eroded with the function ``attachment = max(attachment + sum, 0)``. The same calculations are then applied to the non-preferred records in the erode group, using the eroded attachment. At the start of each span group, the attachment is reset to the original value specified in the arguments.
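
The calculation for a single erode group can be sketched as follows. This is a minimal illustration under stated assumptions rather than the actual implementation: records are plain dictionaries, the helper names ``apply_attachment`` and ``apply_to_erode_group`` are hypothetical, ``invert`` and currency handling are ignored, and proportional allocation is read as scaling each record's ``Value`` by ``new_sum / sum``. The formulas appear to assume that losses are stored as non-positive values, and the sample data follows that reading.

.. code-block:: python

    def apply_attachment(values, attachment):
        """Apply the remaining attachment to one subgroup of Value fields.

        Returns the updated values and the eroded attachment.
        """
        total = sum(values)
        new_total = min(total + attachment, 0.0)
        if total == 0:
            new_values = [0.0 for _ in values]
        else:
            # Proportional allocation: every record keeps its share of the sum.
            scale = new_total / total
            new_values = [value * scale for value in values]
        remaining = max(attachment + total, 0.0)
        return new_values, remaining


    def apply_to_erode_group(records, attachment, preferred_column, preferred_values):
        """Process preferred records first, then the rest, eroding one attachment."""
        preferred = [r for r in records if r.get(preferred_column) in preferred_values]
        others = [r for r in records if r.get(preferred_column) not in preferred_values]
        for subgroup in (preferred, others):
            new_values, attachment = apply_attachment(
                [r["Value"] for r in subgroup], attachment
            )
            for record, value in zip(subgroup, new_values):
                record["Value"] = value
        # The caller passes this on to the next erode group in the same span group.
        return attachment


    # Hypothetical sample data: preferred and non-preferred records interleaved.
    records = [
        {"Region": "NAM", "Value": -500.0},
        {"Region": "EUR", "Value": -300.0},
        {"Region": "NAM", "Value": -1500.0},
    ]
    remaining = apply_to_erode_group(records, 1000.0, "Region", ["NAM"])

In this sample the two ``NAM`` records sum to -2000, so the attachment of 1000 halves them to -250 and -750 and is fully eroded. The ``EUR`` record then sees an attachment of 0 and keeps its value of -300. Resetting the attachment to its original value at the start of each span group is left to the caller.
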