CTS URN subreferences

Overview

CTS URNs identify citable nodes in the OHCO2 model of text. Each node contains text that, in a particular version, may be represented by a rich content model: that is, the text of a node may be represented in a markup or markdown system, or it may be pure textual content.

Within a leaf-level citation node, a subreference points to a string of characters. CTS URN passage references are abstract and can apply to any version of a text, but subreferences expressed in terms of strings of characters are inherently tied to a specific language. They are therefore valid only on URNs that include work references at the version or exemplar level.

Syntax

Syntactically, a subreference is set off from the passage reference it qualifies by the at sign @. A subreference may contain two parts: a literal string and an index value. If an index value is included, it is enclosed in square brackets [] and follows the substring, if one is present. The index value must evaluate to a positive integer.
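The syntax above can be sketched as a small parser. This is a minimal illustration, not a reference implementation: the names Subreference and parse_subreference are hypothetical, and the sketch assumes that the literal string contains no square brackets and that the index is written as plain decimal digits.

```python
import re
from typing import NamedTuple, Optional, Tuple


class Subreference(NamedTuple):
    substring: Optional[str]  # literal string, or None if absent
    index: Optional[int]      # positive integer, or None if absent


# Assumption: the literal string has no '[' or ']', and the index is decimal digits.
_SUBREF = re.compile(r"(?P<substring>[^\[\]]*)(?:\[(?P<index>[0-9]+)\])?")


def parse_subreference(reference: str) -> Tuple[str, Optional[Subreference]]:
    """Split 'passage@subreference' into the passage reference and its parsed parts."""
    passage, at_sign, raw = reference.partition("@")
    if not at_sign:
        return passage, None  # no subreference present

    match = _SUBREF.fullmatch(raw)
    if match is None:
        raise ValueError(f"malformed subreference: {raw!r}")

    substring = match.group("substring") or None
    index = int(match.group("index")) if match.group("index") is not None else None

    if substring is None and index is None:
        raise ValueError("a subreference needs a substring, an index, or both")
    if index is not None and index < 1:
        raise ValueError("the index value must be a positive integer")
    return passage, Subreference(substring, index)


print(parse_subreference("1.1@Achilles[1]"))
print(parse_subreference("1.1@[4]"))
```

The parser only checks form. Whether the substring actually occurs in the cited node can be decided only against the text of a specific version.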

Semantics

At least one of the two parts of the subreference must be present. If both a substring and an index, n, are included, the reference points to the nth occurrence of the substring in the cited node. If a substring is given, but no index value, then it is taken to mean the first occurrence of the substring in the cited node. If an index is given, but no substring, it is taken to mean the nth code point in the cited node. Index values are 1-origin: the first occurrence, or the first code point, has index 1.
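These rules can be sketched as a function that resolves a subreference against the text of one citable node. The function name resolve_subreference and the sample string are made up for illustration; the sample is not the text of any particular edition. The sketch also assumes that overlapping occurrences are each counted, a point the rules above leave open. Python strings are indexed by code point, so offsets here are code point offsets.

```python
from typing import Optional, Tuple


def resolve_subreference(
    node_text: str, substring: Optional[str], index: Optional[int]
) -> Tuple[int, int]:
    """Return (start, end) offsets into node_text: 0-based, end-exclusive."""
    if substring is None and index is None:
        raise ValueError("a subreference needs a substring, an index, or both")

    n = 1 if index is None else index  # a missing index means the first occurrence
    if n < 1:
        raise ValueError("index values are 1-origin positive integers")

    if substring is None:
        # Index without substring: the nth code point of the node.
        if n > len(node_text):
            raise ValueError("the node has fewer code points than the index")
        return n - 1, n

    # Substring with (possibly defaulted) index: the nth occurrence.
    start = -1
    for _ in range(n):
        start = node_text.find(substring, start + 1)
        if start == -1:
            raise ValueError("the node has fewer occurrences than the index")
    return start, start + len(substring)


# Made-up sample text for one node (hypothetical, for illustration only).
sample = "Sing, goddess, the wrath of Achilles, the son of Peleus"

start, end = resolve_subreference(sample, "the", 2)
print(start, end, sample[start:end])
```

With this sketch, passing "Achilles" with index 1 and passing "Achilles" with no index return the same offsets, which mirrors the defaulting rule above.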

Examples

Subreference with and without index

The following two URNs are equivalent:

urn:cts:greekLit:tlg0012.tlg001.mth-01:1.1@Achilles[1]

urn:cts:greekLit:tlg0012.tlg001.mth-01:1.1@Achilles

In both cases, the reference is to the first occurrence of the string “Achilles” in line 1 of book 1 of an English translation of the Iliad.

A subreference spanning leaf citation nodes

urn:cts:greekLit:tlg0012.tlg001.mth-01:1.1@Achilles-1.10@Atreus

This identifies a span of text running from the first occurrence of the string “Achilles” in book 1, line 1 of an English translation of the Iliad, to the first occurrence of the string “Atreus” in book 1, line 10 of the same translation.
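A range like this one can be taken apart in two steps: isolate the passage component of the URN, then split it at the range hyphen. The sketch below is only an illustration. The function name range_endpoints is hypothetical, and it assumes that the literal strings contain neither a hyphen nor an at sign. The work component may itself contain a hyphen (as in mth-01), so the split must be applied to the passage component only.

```python
from typing import Optional, Tuple

Endpoint = Tuple[str, Optional[str]]  # (passage reference, raw subreference or None)


def _endpoint(reference: str) -> Endpoint:
    """Split one end of a range into its passage reference and raw subreference."""
    passage, at_sign, subref = reference.partition("@")
    return passage, (subref if at_sign else None)


def range_endpoints(urn: str) -> Tuple[Endpoint, Optional[Endpoint]]:
    """Return the start and (optional) end of the passage component of a CTS URN."""
    fields = urn.split(":", 4)  # urn, cts, namespace, work, passage
    if len(fields) != 5 or fields[:2] != ["urn", "cts"]:
        raise ValueError(f"not a CTS URN with a passage component: {urn!r}")

    # Assumption: the first hyphen in the passage component is the range separator.
    first, hyphen, last = fields[4].partition("-")
    return _endpoint(first), (_endpoint(last) if hyphen else None)


print(range_endpoints(
    "urn:cts:greekLit:tlg0012.tlg001.mth-01:1.1@Achilles-1.10@Atreus"
))
```

Each end of the range carries its own subreference, so each is resolved independently against its own leaf node.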

Indexed substrings

urn:cts:greekLit:tlg0012.tlg001.mth-01:1.1@Achilles-1.10@the[2]

This identifies a span of text running from the first occurrence of the string “Achilles” in book 1, line 1, to the second occurrence of the string “the” in book 1, line 10 of the specified translation of the Iliad.

Indexed code points

urn:cts:greekLit:tlg0012.tlg001.mth-01:1.1@[4]-1.1@[6]

This URN refers to the fourth through sixth code points (inclusive) of book 1, line 1 of the Iliad in the specified version. Note that its meaning depends both on the reading of the specific version and on that version's digital character encoding. In particular, for non-ASCII characters encoded in UTF-8, the character data values that a programming language exposes may not correspond to Unicode code points in the text.
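The dependence on encoding can be made concrete with a short sketch. The sample is the Greek word μῆνιν, written with escapes so that its code points are unambiguous. It is chosen only as a non-ASCII illustration and is not the text of the translation cited above. The helper name code_point_range is hypothetical. Python strings are indexed by code point, so a 1-origin inclusive range maps directly onto a slice; languages whose strings are indexed by bytes or by UTF-16 code units need an explicit conversion.

```python
import unicodedata


def code_point_range(text: str, first: int, last: int) -> str:
    """Return code points first..last (1-origin, inclusive) of text."""
    if not 1 <= first <= last <= len(text):
        raise ValueError("code point range is outside the node")
    return text[first - 1:last]


# mu, eta with perispomeni (precomposed), nu, iota, nu
precomposed = "\u03bc\u1fc6\u03bd\u03b9\u03bd"
# Same reading, canonically decomposed: eta followed by a combining perispomeni
decomposed = unicodedata.normalize("NFD", precomposed)

print(len(precomposed))                  # code points in the precomposed form
print(len(decomposed))                   # code points in the decomposed form
print(len(precomposed.encode("utf-8")))  # bytes in UTF-8, not code points

# The same index range selects different content in the two forms.
print([hex(ord(c)) for c in code_point_range(precomposed, 2, 3)])
print([hex(ord(c)) for c in code_point_range(decomposed, 2, 3)])
```

The two strings display identically, yet code points 2 through 3 cover eta-with-accent plus nu in one form and only the accented eta in the other. This is why an indexed code point reference is meaningful only against a specific version with a known encoding and normalization.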