Scripting with the HMT text model

Include the library:

import org.homermultitext.edmodel._

The TeiReader object can read a citable text in HMT project XML and produce a vector of analyses, where each analysis corresponds to a single token in the text.

TeiReader has functions to read a file or a Corpus from the ohco2 library, but it can also read a simple string of delimited text, where each line pairs a CTS URN with an XML fragment, separated by a # character:


val iliadOpening = """urn:cts:greekLit:tlg0012.tlg001.va_xml:1.1#<l n="1" xmlns="http://www.tei-c.org/ns/1.0" xmlns:tei="http://www.tei-c.org/ns/1.0">Μῆνιν ἄειδε θεὰ <persName n="urn:cite2:hmt:pers.r1:pers1">Πηληϊάδεω  Ἀχιλῆος</persName> </l>
urn:cts:greekLit:tlg0012.tlg001.va_xml:1.2#<l n="2" xmlns="http://www.tei-c.org/ns/1.0" xmlns:tei="http://www.tei-c.org/ns/1.0">οὐλομένην· ἡ μυρί' <rs type="ethnic" n="urn:cite2:hmt:place.r1:place96">Ἀχαιοῖς</rs> ἄλγε' ἔθηκεν· </l>
urn:cts:greekLit:tlg0012.tlg001.va_xml:1.3#<l n="3" xmlns="http://www.tei-c.org/ns/1.0" xmlns:tei="http://www.tei-c.org/ns/1.0">πολλὰς δ' ἰφθίμους ψυχὰς <placeName n="urn:cite2:hmt:place.r1:place67">Ἄϊδι</placeName> προΐαψεν </l>
urn:cts:greekLit:tlg0012.tlg001.va_xml:1.4#<l n="4" xmlns="http://www.tei-c.org/ns/1.0" xmlns:tei="http://www.tei-c.org/ns/1.0">ἡρώων· αὐτοὺς δὲ ἑλώρια τεῦχε κύνεσσιν </l>
urn:cts:greekLit:tlg0012.tlg001.va_xml:1.5#<l n="5" xmlns="http://www.tei-c.org/ns/1.0" xmlns:tei="http://www.tei-c.org/ns/1.0">οἰωνοῖσί τε πᾶσι· <persName n="urn:cite2:hmt:pers.r1:pers8">Διὸς</persName> δ'  ἐτελείετο βουλή· </l>
"""

val tokens = TeiReader.fromString(iliadOpening)
assert(tokens.size == 30)
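
To make the shape of the delimited input concrete, here is a minimal sketch that splits each line into its URN and XML columns using only the Scala standard library. It does not use or reproduce TeiReader: the helper name splitDelimited, its delimiter parameter, and the shortened sample lines (namespace declarations dropped) are assumptions made for illustration.

```scala
// Hypothetical helper, NOT part of the edmodel library: split each
// non-empty line of two-column delimited text into a (URN, XML) pair.
def splitDelimited(text: String, delimiter: String = "#"): Vector[(String, String)] =
  text.split("\n").toVector
    .map(_.trim)
    .filter(_.nonEmpty)
    .map { line =>
      // Split on the first delimiter only, so a later '#' stays in the XML column.
      val idx = line.indexOf(delimiter)
      require(idx >= 0, s"No delimiter '$delimiter' found in line: $line")
      (line.substring(0, idx), line.substring(idx + delimiter.length))
    }

// Shortened sample in the same layout as iliadOpening (illustrative only).
val sample = """urn:cts:greekLit:tlg0012.tlg001.va_xml:1.1#<l n="1">Μῆνιν ἄειδε θεὰ</l>
urn:cts:greekLit:tlg0012.tlg001.va_xml:1.2#<l n="2">οὐλομένην</l>
"""

val pairs = splitDelimited(sample)
pairs.foreach { case (urn, xml) => println(s"$urn -> $xml") }
```

Each pair holds the passage URN and the XML for that passage. Splitting on the first delimiter only is a deliberate choice in this sketch, since the XML column could itself contain the delimiter character.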

Website © 2019-2020, the Homer Multitext project.