Import a Rez file. This returns an object containing, among other things, a nodeMap object that holds the raw information, and data frames for tokens, units, chunks, track chain entries, and track chains, which contain only the key information likely to be useful to the user.

Usage

importRez(
  paths,
  docnames = "",
  concatFields,
  layerRegex = list(),
  separator = " "
)

Arguments

paths

A character vector of paths to the files to be imported. For Windows users, please use / instead of \.
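
As a minimal sketch of the advice for Windows users, a path copied with backslashes can be converted with base R before it is passed to importRez. The path below is a hypothetical example, not a file shipped with the package.

```r
# Hypothetical Windows-style path (backslashes must be doubled inside an R string)
winPath <- "C:\\corpus\\sbc007.rez"

# Replace every backslash with a forward slash; fixed = TRUE treats the
# pattern as a literal string rather than a regular expression
fixedPath <- gsub("\\", "/", winPath, fixed = TRUE)

print(fixedPath)
```

The resulting string can then be supplied as an element of paths.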

docnames

A character vector of the document names. If left blank, a docname is generated from the filename of each file you import. For example, the document foo/bar.rez will be named 'bar'.
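
The naming rule described above can be imitated with base R, which may be handy for predicting the generated docnames before importing. This is only an illustration of the documented behaviour, not necessarily how importRez computes the names internally; the second path is a hypothetical example.

```r
# Example paths; "foo/bar.rez" is the case mentioned in the documentation
paths <- c("foo/bar.rez", "data/sbc007.rez")

# Drop the directory part, then drop the file extension
docnames <- tools::file_path_sans_ext(basename(paths))

print(docnames)
```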

concatFields

A character vector of names of token-level fields, for example word or transcription, that should be concatenated to form chunk- or entry-level fields. For example, if your word field is called 'word' and you have an IPA transcription field called 'ipa', then concatFields should be c("word", "ipa").
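
To make the idea of concatenation concrete, the sketch below builds a toy token table and joins each field listed in concatFields within each chunk. The data frame, its column names and the ASCII stand-in transcriptions are all hypothetical, and the use of aggregate is an assumption made for illustration; it is not a description of the package's internals.

```r
# Toy token-level table (hypothetical data; 'ipa' holds ASCII stand-ins)
tokens <- data.frame(
  chunk = c("c1", "c1", "c2"),
  word  = c("the", "dog", "barked"),
  ipa   = c("D@", "dQg", "bA:kt"),
  stringsAsFactors = FALSE
)

concatFields <- c("word", "ipa")

# Join the values of every field in concatFields within each chunk,
# using a single space as the separator
chunks <- aggregate(
  tokens[concatFields],
  by = list(chunk = tokens$chunk),
  FUN = paste,
  collapse = " "
)

print(chunks)
```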

layerRegex

A list with one entry per component to be split into layers, each entry named after its component (just tree, track, rez, or chunk for now; stack to be added later). Each entry is itself a list with three components: field is the field on which the splitting is based; regex is a vector of regular expressions; names is a vector of layer names. regex should have one fewer entry than names, as the last of the names is the default case.
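
The structure described above, and the role of the default name, can be sketched as follows. The helper assignLayer is hypothetical and written only for this illustration; in particular, letting the first matching pattern win is an assumption, and the chunkType values are made up.

```r
# A layerRegex specification for chunks, as in the Examples section
layerRegex <- list(
  chunk = list(
    field = "chunkType",
    regex = c("verb"),
    names = c("verb", "refexpr")
  )
)

# Hypothetical helper: map field values to layer names.
# Values matching regex[i] get layerNames[i]; everything else gets
# the last name, which acts as the default case.
assignLayer <- function(values, regex, layerNames) {
  stopifnot(length(regex) == length(layerNames) - 1)
  layer <- rep(layerNames[length(layerNames)], length(values))
  # Loop in reverse so that earlier patterns overwrite later ones
  for (i in rev(seq_along(regex))) {
    layer[grepl(regex[i], values)] <- layerNames[i]
  }
  layer
}

spec <- layerRegex$chunk
chunkTypes <- c("verb", "", "pronoun", "verb")  # made-up chunkType values
layers <- assignLayer(chunkTypes, spec$regex, spec$names)

print(layers)
```

Here the two chunks whose chunkType does not match "verb" fall into the default layer 'refexpr'.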

separator

The character you wish to use to separate words in concatenated columns, generally the empty string in languages like Tibetan and Chinese, and a single space in languages like Spanish and English.
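
The effect of the two typical separator choices can be previewed with base R's paste. The token vectors are made-up examples, and paste is used here only to illustrate the outcome, not as a claim about how importRez performs the concatenation.

```r
# Made-up token sequences
tokensEn <- c("I", "went", "home")
tokensZh <- c("我", "回", "家")

# A single space suits languages written with spaces between words
chunkEn <- paste(tokensEn, collapse = " ")

# The empty string suits languages written without spaces between words
chunkZh <- paste(tokensZh, collapse = "")

print(chunkEn)
print(chunkZh)
```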

Value

A rezrObj object. See new_rezrObj for details.

Note

After import, you may consider calling functions such as addUnitSeq, addIsWordField or getAllTreeCorrespondences, which are left out of the import itself for performance reasons.

Examples

path = system.file("extdata", "sbc007.rez", package = "rezonateR", mustWork = TRUE)
layerRegex = list(chunk = list(field = "chunkType", regex = c("verb"), names = c("verb", "refexpr")))
concatFields = c("text", "transcript")
rez007 = importRez(path, layerRegex = layerRegex, concatFields = concatFields)
#> Import starting - please be patient ...
#> Creating node maps ...
#> Creating rezrDFs ...
#> Adding foreign fields to rezrDFs and sorting (this is the slowest step) ...
#> >Adding to unit entry DF ...
#> >Adding to unit DF ...
#> >Adding to chunk DF ...
#> >Adding to stack DFs ...
#> >Adding to rez DFs ...
#> >Adding to track DFs ...
#> >Adding to tree DFs ...
#> Splitting rezrDFs into layers ...
#> A few finishing touches ...
#> Done!