Inner Join¶

The general structure of the Inner Join core operation in Graphene is:

{
    "_schema": "InnerJoin_1.0",
    "join_record_type": "JoinType",
    "join_on": ["OccurrenceKey"],
    "op": "MULTIPLY"
}

Parameters¶

The parameters are defined as follows:

Parameter Name	Required	Type	Description
`join_record_type`	Yes	`string`	Record type that will be used to identify joining records. All records that don’t have join_record_type will be considered financial records that might be propagated to the final output depending on the join results.
`join_on`	Yes	`string list`	Ledger attribute name(s) that will be used to join financial records and joining records.
`op`	No	`InnerJoin.Operation`	Optional operator to be used with the join record values. Allowed values are `MULTIPLY` and `NO_OPERATION`.

The first element in join_on must be Trial, Time, or the OccurrenceKey attribute name. The remaining elements can be zero or more metadata attributes.

Behavior¶

Inner Join, like all Graphene nodes, receives one merged input stream, and so uses join_record_type to distinguish between what might be called “left” and “right” in other system’s joins. Here, all records with join_record_type are considered “joining records”, and all other records are “financial records”.

All financial records that have corresponding joining records are propagated to the output ledger. Joining records are used to determine which financial records will be propagated, but are not themselves propagated to the output ledger. Financial records that don’t have corresponding joining record(s) are not propagated to the output ledger. If there are N financial record and M corresponding records with join_record_type, then NxM financial records will be propagated.

Corresponding records are determined by matching values for the attributes specified in join_on.

When op is set to MULTIPLY the financial records values are multiplied by the values of the corresponding joining records. If op is not defined, then the propagated financial records values are not modified, and values of join_record_type records are ignored.

Example¶

Assuming we have two input ledger with Losses and factors:

Trial	Time	Type	Value	EventId	LOB	GID
1	1	Loss	100	1	A	1
1	2	Loss	300	2	A	2
1	2	Loss	400	2	B	3

Input ledger with join_record_type records

Trial	Time	Type	Value	EventId	LOB	GID
1	1	Factor	1.1	1	A	1
1	2	Factor	1.2	2	A	2
1	2	Factor	1.3	2	A	3

If we use the InnerJoin template with the following attributes:

{
    "_schema": "InnerJoin_1.0",
    "join_record_type": "Factor",
    "join_on": ["OccurrenceKey", "LOB"],
    "op": "MULTIPLY"
}

We will get the following output:

Trial	Time	Type	Value	EventId	LOB	GID
1	1	Loss	110	1	A	1
1	2	Loss	360	2	A	2
1	2	Loss	390	2	A	2

Example notes:

Financial record with EventId 1 and LOB A has one joining record with value 1.1, therefore output value is 100 * 1.1.
Financial record with EventId 2 and LOB A has 2 joining records with values 1.2 and 1.3, therefore there are two records in the output with values 300*1.2 and 300*1.3.
Financial record with EventId 2 and LOB B doesn’t have any joining records with type Factor, therefore it is not propagated to the output.

If instead we use the InnerJoin template with no operation:

{
    "_schema": "InnerJoin_1.0",
    "join_record_type": "Factor",
    "join_on": ["OccurrenceKey", "LOB"],
}

We will get the following output:

Trial	Time	Type	Value	EventId	LOB	GID
1	1	Loss	100	1	A	1
1	2	Loss	300	2	A	2
1	2	Loss	300	2	A	2

Inner Join¶

Parameters¶

Behavior¶

Example¶

Table of Contents

Previous topic

Next topic

This Page