data_tools.schema.CanonicalPath

class data_tools.schema.CanonicalPath(origin: str, source: str, event: str, name: str)

Bases: object

__init__(origin: str, source: str, event: str, name: str)

Construct a canonical path representing a path to a file in any abstract data source.

Parameters:
  • origin (str) – Identifies the origin (code) of this data, usually the data pipeline version.

  • source (str) – The producer of the data pointed to by this canonical path, usually a pipeline stage.

  • event (str) – The event that this data belongs to.

  • name (str) – The name of this data.

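The four constructor arguments map onto the four elements of the path. The block below is a minimal sketch of that shape, not the data_tools implementation: `CanonicalPathSketch` is a hypothetical stand-in class, the "/" separator is an assumption inferred from the example string further down this page, and `event_001` is an illustrative event name.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import List


@dataclass(frozen=True)
class CanonicalPathSketch:
    """Hypothetical stand-in for CanonicalPath; not the library's code."""

    origin: str  # usually the data pipeline version
    source: str  # usually the pipeline stage that produced the data
    event: str   # the event the data belongs to
    name: str    # the name of the data

    def unwrap(self) -> List[str]:
        # Return the elements in the order they were passed to the constructor.
        return [self.origin, self.source, self.event, self.name]

    def to_string(self) -> str:
        # Assumption: elements are joined with "/".
        return "/".join(self.unwrap())

    def to_path(self) -> PurePosixPath:
        # Assumption: the path form uses the same element order.
        return PurePosixPath(*self.unwrap())


cp = CanonicalPathSketch(
    origin="pipeline_2024_11_01",
    source="ingest",
    event="event_001",  # hypothetical event name
    name="TotalPackVoltage",
)
text = cp.to_string()
```

A frozen dataclass is used here only because the documented class exposes its fields as read-only properties; the real class may be built differently.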
Methods

__init__(origin, source, event, name)

Construct a canonical path representing a path to a file in any abstract data source.

to_path()

to_string()

Obtain the string representation of this canonical path.

unwrap()

Decompose this CanonicalPath into its constituent elements.

unwrap_canonical_path(canonical_path)

Unwrap a canonical path into its elements.

Attributes

property event: str
property name: str
property origin: str
property source: str
to_path() → Path
to_string() → str

Obtain the string representation of this canonical path.

unwrap() → List[str]

Decompose this CanonicalPath into its constituent elements. Comparable to os.path.split, except that the result is a list of path elements rather than a (head, tail) pair.
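For reference, the standard-library sketch below compares os.path.split with a full split on the separator. The "/" separator and the four-element example string (including the hypothetical event name `event_001`) are assumptions for illustration, not taken from the data_tools source.

```python
import os.path

# Hypothetical canonical path string; "event_001" is an illustrative event name.
path = "pipeline_2024_11_01/ingest/event_001/TotalPackVoltage"

# os.path.split separates only the final component from everything before it.
head, tail = os.path.split(path)

# Splitting on the separator yields every element as a list of strings.
elements = path.split("/")
```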

static unwrap_canonical_path(canonical_path: str) → List[str]

Unwrap a canonical path into its elements.

For example, "pipeline_2024_11_01/ingest/TotalPackVoltage" would be unwrapped to ["pipeline_2024_11_01", "ingest", "TotalPackVoltage"].

The first element should always be a reference to the origin (code) that produced this data. The second element should always refer to the stage (processing step) that produced this data. The last element should always be the name of this data.

Parameters:

canonical_path (str) – The path to be decomposed.

Returns:

A List[str] of path elements
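The unwrapping described above can be approximated with a plain string split. This is a sketch under stated assumptions, not the library's code: `unwrap_canonical_path_sketch` is a hypothetical helper, "/" is assumed to be the separator, and dropping empty segments is an illustrative choice.

```python
from typing import List


def unwrap_canonical_path_sketch(canonical_path: str) -> List[str]:
    """Split a '/'-separated canonical path into its non-empty elements."""
    # Assumption: "/" is the separator; empty segments (for example from a
    # leading or trailing slash) are dropped.
    return [part for part in canonical_path.split("/") if part]


elements = unwrap_canonical_path_sketch("pipeline_2024_11_01/ingest/TotalPackVoltage")

# Per the description: first element is the origin, second is the stage,
# and the last element is the name of the data.
origin, stage, name = elements[0], elements[1], elements[-1]
```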