Inner Join

The general structure of the Inner Join core operation in Graphene is:

{
    "_schema": "InnerJoin_1.0",
    "join_record_type": "JoinType",
    "join_on": ["OccurrenceKey"],
    "op": "MULTIPLY"
}

Parameters

The parameters are defined as follows:

Parameter Name

Required

Type

Description

join_record_type

Yes

string

Record type that will be used to identify joining records. All records that don’t have join_record_type will be considered financial records that might be propagated to the final output depending on the join results.

join_on

Yes

string list

Ledger attribute name(s) that will be used to join financial records and joining records.

op

No

InnerJoin.Operation

Optional operator to be used with the join record values. Allowed values are MULTIPLY and NO_OPERATION.

The first element in join_on must be Trial, Time, or the OccurrenceKey attribute name. The remaining elements can be zero or more metadata attributes.

Behavior

Inner Join, like all Graphene nodes, receives one merged input stream, and so uses join_record_type to distinguish between what might be called “left” and “right” in other system’s joins. Here, all records with join_record_type are considered “joining records”, and all other records are “financial records”.

All financial records that have corresponding joining records are propagated to the output ledger. Joining records are used to determine which financial records will be propagated, but are not themselves propagated to the output ledger. Financial records that don’t have corresponding joining record(s) are not propagated to the output ledger. If there are N financial record and M corresponding records with join_record_type, then NxM financial records will be propagated.

Corresponding records are determined by matching values for the attributes specified in join_on.

When op is set to MULTIPLY the financial records values are multiplied by the values of the corresponding joining records. If op is not defined, then the propagated financial records values are not modified, and values of join_record_type records are ignored.

Example

Assuming we have two input ledger with Losses and factors:

Trial

Time

Type

Value

EventId

LOB

GID

1

1

Loss

100

1

A

1

1

2

Loss

300

2

A

2

1

2

Loss

400

2

B

3

Input ledger with join_record_type records

Trial

Time

Type

Value

EventId

LOB

GID

1

1

Factor

1.1

1

A

1

1

2

Factor

1.2

2

A

2

1

2

Factor

1.3

2

A

3

If we use the InnerJoin template with the following attributes:

{
    "_schema": "InnerJoin_1.0",
    "join_record_type": "Factor",
    "join_on": ["OccurrenceKey", "LOB"],
    "op": "MULTIPLY"
}

We will get the following output:

Trial

Time

Type

Value

EventId

LOB

GID

1

1

Loss

110

1

A

1

1

2

Loss

360

2

A

2

1

2

Loss

390

2

A

2

Example notes:

  • Financial record with EventId 1 and LOB A has one joining record with value 1.1, therefore output value is 100 * 1.1.

  • Financial record with EventId 2 and LOB A has 2 joining records with values 1.2 and 1.3, therefore there are two records in the output with values 300*1.2 and 300*1.3.

  • Financial record with EventId 2 and LOB B doesn’t have any joining records with type Factor, therefore it is not propagated to the output.

If instead we use the InnerJoin template with no operation:

{
    "_schema": "InnerJoin_1.0",
    "join_record_type": "Factor",
    "join_on": ["OccurrenceKey", "LOB"],
}

We will get the following output:

Trial

Time

Type

Value

EventId

LOB

GID

1

1

Loss

100

1

A

1

1

2

Loss

300

2

A

2

1

2

Loss

300

2

A

2