Loss Set / Ledger

A Loss Set is a prerequisite for using other financial structures. This template provides a reference to the input data that will be processed in Graphene. Defining multiple Loss Sets for analysis is supported (and common).

Once a Ledger has been defined according to Ledger Format or Advanced Ledger Format, the Loss Set template references the S3 location where it is stored so that the data can be included in a larger financial model.

Suppose the following ledger has been created in S3:

s3://example_bucket/uploads/ledgers/example_ledger_upload/
  |-- ground_up_loss.parquet.1of4
  |-- ground_up_loss.parquet.2of4
  |-- ground_up_loss.parquet.3of4
  |-- ground_up_loss.parquet.4of4

The following template would be used to reference this ledger in a financial model:

{
    "_schema": "LossSet_1.0",
    "path": "s3://example_bucket/uploads/ledgers/example_ledger_upload/"
}

Structure

The general structure of a Loss Set in Graphene is:

{
    "_schema": "LossSet_1.0",
    "path": "s3://example_bucket/uploads/ledgers/example_ledger_upload/",
    "occurrence_key_column": "EventId",
    "currency": "USD"
}

Parameters

The parameters are defined as follows:

| Parameter Name | Required | Type | Description |
|---|---|---|---|
| path | Yes | string | The unescaped S3 key prefix or full S3 key that represents a complete ledger. This path may be relative to the bucket root or may be absolute. |
| occurrence_key_column | No | string | Required when distinct events can occur with the same time value. Refer to Occurrence Key for more information. |
| currency | No | string | The currency in which the input currency values are defined. Defaults to the base currency if not set. |
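
As a rough illustration of how the required and optional parameters fit together, the sketch below assembles a Loss Set template as a Python dictionary and serializes it to JSON. The helper name make_loss_set is hypothetical and is not part of Graphene; it only arranges the fields described above and leaves optional fields out when they are not supplied.

```python
import json
from typing import Optional


def make_loss_set(path: str,
                  occurrence_key_column: Optional[str] = None,
                  currency: Optional[str] = None) -> dict:
    """Assemble a Loss Set template (hypothetical helper, not a Graphene API)."""
    if not path:
        raise ValueError("path is required")
    template = {"_schema": "LossSet_1.0", "path": path}
    # Optional fields are omitted when unset so the documented defaults apply.
    if occurrence_key_column is not None:
        template["occurrence_key_column"] = occurrence_key_column
    if currency is not None:
        template["currency"] = currency
    return template


loss_set = make_loss_set(
    "s3://example_bucket/uploads/ledgers/example_ledger_upload/",
    occurrence_key_column="EventId",
    currency="USD",
)
print(json.dumps(loss_set, indent=4))
```

Calling make_loss_set with only a path yields the minimal two-field template shown at the top of this page.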

Note

Avoid S3 keys containing special characters as described in the S3 User Guide with the exception of delimiting / characters.

Partitioned ledgers

The examples above assume the ledger is partitioned into multiple files, in which case the common key prefix of those files is used to reference the ledger. Partitioning is often recommended for performance reasons.

However, when it is convenient, a single Parquet file may be used to represent a ledger. In this case, the Loss Set structure would be defined with the full file path as shown:

{
    "_schema": "LossSet_1.0",
    "path": "s3://example_bucket/uploads/ledgers/misc/a_parquet_file.parquet"
}
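
The choice between a key prefix and a full key can be sketched as a small helper. The function name ledger_path_for is hypothetical and not part of Graphene. It assumes that a partitioned ledger should be referenced by the prefix up to and including the last / delimiter, as in the example at the top of this page; whether a prefix that ends mid-filename is also accepted is not stated here.

```python
import os
from typing import List


def ledger_path_for(keys: List[str]) -> str:
    """Pick the Loss Set path for a ledger stored under the given S3 keys."""
    if not keys:
        raise ValueError("at least one key is required")
    if len(keys) == 1:
        # A single Parquet file is referenced by its full key.
        return keys[0]
    # commonprefix compares character by character, so trim the result
    # back to the last "/" to land on a delimiter boundary (an assumption).
    prefix = os.path.commonprefix(keys)
    trimmed = prefix[: prefix.rfind("/") + 1]
    if not trimmed:
        raise ValueError("keys do not share a common prefix ending in '/'")
    return trimmed


base = "s3://example_bucket/uploads/ledgers/example_ledger_upload/"
parts = [f"{base}ground_up_loss.parquet.{i}of4" for i in range(1, 5)]
print(ledger_path_for(parts))
single = "s3://example_bucket/uploads/ledgers/misc/a_parquet_file.parquet"
print(ledger_path_for([single]))
```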

Relative S3 Paths

Both relative and absolute S3 paths are supported for the path parameter. A relative path is any path that does not begin with s3://. A relative path is mapped to an absolute path by appending the provided S3 key to the bucket URL. The bucket name for the authenticated user is defined by the custom:bucket JWT claim. Refer to JWT Claims for details of this and other claims available to authenticated users.

Example:

{
    "_schema": "LossSet_1.0",
    "path": "uploads/ledgers/example_ledger_upload/"
}

Assuming a bucket name of example_bucket, the above template is equivalent to:

{
    "_schema": "LossSet_1.0",
    "path": "s3://example_bucket/uploads/ledgers/example_ledger_upload/"
}

Note

A path beginning with / is unlikely to be interpreted as intended, since / is a valid character in any S3 key. Graphene interprets a leading / as part of the key.

For example, /uploads/ledgers/example_ledger_upload/ refers to s3://example_bucket//uploads/ledgers/example_ledger_upload/, so the URL contains two slashes after the bucket name.
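
The mapping described in this section can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not Graphene's implementation: the function name resolve_loss_set_path is hypothetical, and the bucket name is passed in directly rather than read from the custom:bucket JWT claim.

```python
def resolve_loss_set_path(path: str, bucket: str) -> str:
    """Map a Loss Set path to an absolute S3 URL (illustrative sketch only)."""
    if path.startswith("s3://"):
        # Absolute paths are used unchanged.
        return path
    # Relative paths are appended to the bucket URL as-is, so a leading "/"
    # becomes part of the key and produces a double slash.
    return f"s3://{bucket}/{path}"


print(resolve_loss_set_path("uploads/ledgers/example_ledger_upload/", "example_bucket"))
print(resolve_loss_set_path("/uploads/ledgers/example_ledger_upload/", "example_bucket"))
```

The second call shows the leading-slash pitfall: the result contains two slashes after the bucket name.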

Special characters

Paths referenced by the Loss Set should avoid using characters like &, +, ?, %, etc. These characters normally need to be URL-encoded, although S3 and supporting utilities treat them as-is, without encoding or decoding, for the purpose of defining an object key. Refer to the S3 object key naming guidelines.

If special characters must be used as part of S3 object keys and prefixes, for legacy reasons, leave them unencoded when defining a Loss Set or any entity in Graphene that accepts S3 URLs containing them. Graphene and S3 do not give special characters like ? and # any special meaning, so using the query or fragment for something like ephemeral metadata is only possible if the full URL is used in both systems.

For example:

S3 Object

s3://example_bucket/example_path/special_characters_number_sign_%23?special#123.parquet

Graphene path

s3://example_bucket/example_path/special_characters_number_sign_%23?special#123.parquet
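
To see why such keys are easy to mishandle, the sketch below contrasts a generic URL parser with a literal split of the same string. It uses Python's standard urllib.parse.urlsplit; the helper split_s3_url is hypothetical and simply treats everything after the bucket name as the key, which mirrors the as-is behavior described above rather than any Graphene internals.

```python
from typing import Tuple
from urllib.parse import urlsplit

url = ("s3://example_bucket/example_path/"
       "special_characters_number_sign_%23?special#123.parquet")

# A generic URL parser cuts the string at "?" and "#", losing part of the key.
parts = urlsplit(url)
print(parts.path)      # key truncated before "?"
print(parts.query)     # "special"
print(parts.fragment)  # "123.parquet"


def split_s3_url(s3_url: str) -> Tuple[str, str]:
    """Split an s3:// URL into (bucket, key) without decoding or reinterpreting."""
    if not s3_url.startswith("s3://"):
        raise ValueError("expected an s3:// URL")
    bucket, _, key = s3_url[len("s3://"):].partition("/")
    return bucket, key


bucket, key = split_s3_url(url)
print(bucket)
print(key)  # the full key, including "%23", "?" and "#"
```

Tools that route S3 paths through a generic URL parser may therefore see a different key than the one stored, which is one more reason to avoid these characters where possible.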