Import a Rez file
importRez.Rd
Import a Rez file. This returns an object that contains, among other things, a nodeMap
object holding the raw information, and data frames for tokens, units, chunks, track chain entries and track chains, which contain only the key information likely to be useful to the user.
Usage
importRez(
  paths,
  docnames = "",
  concatFields,
  layerRegex = list(),
  separator = " "
)
Arguments
- paths
A character vector of paths to the files to be imported. For Windows users, please use / instead of \.
- docnames
A character vector of the document names. If left blank, a docname will be generated from the filenames of the files you import. For example, the document foo/bar.rez will be named 'bar'.
- concatFields
A character vector of names of token-level fields, for example word or transcription, that should be concatenated to form chunk- or entry-level fields. For example, if your word field is called 'word' and you have an IPA transcription field called 'ipa', then concatFields should be c("word", "ipa").
- layerRegex
A named list with one entry per component (just tree, track, rez, or chunk for now; stack to be added later). Each entry is itself a list with three components: field is the field on which the splitting is based; regex is a vector of regular expressions; names is a vector of layer names. regex should have one fewer entry than names, as the last of the names is the default case.
- separator
The character you wish to use to separate words in concatenated columns, generally the empty string in languages like Tibetan and Chinese, and a single space in languages like Spanish and English.
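The argument descriptions above can be illustrated with a small base R sketch. It does not call rezonateR and is not the package's implementation. The helper assignLayer, the sample path and the sample values are hypothetical. The rule that an earlier regex takes priority when several match is an assumption, as the documentation only states that the last name is the default case.

```r
# paths: replace Windows-style backslashes with forward slashes.
winPath = "C:\\corpus\\foo\\bar.rez"  # hypothetical path
path = gsub("\\", "/", winPath, fixed = TRUE)

# docnames: the default name described above is the filename without its extension.
docname = tools::file_path_sans_ext(basename(path))

# layerRegex: hypothetical helper mimicking the described splitting logic.
# Values matching regex[i] go to names[i]; everything else goes to the last name.
assignLayer = function(values, regex, names) {
  stopifnot(length(regex) == length(names) - 1)
  layers = rep(names[length(names)], length(values))  # default case
  # Loop in reverse so that earlier regexes overwrite later ones (assumed priority).
  for (i in rev(seq_along(regex))) {
    layers[grepl(regex[i], values)] = names[i]
  }
  layers
}
chunkTypes = c("verb", "", "noun", "verb")  # hypothetical chunkType values
layers = assignLayer(chunkTypes, regex = c("verb"), names = c("verb", "refexpr"))

# separator: how token-level values are joined into a concatenated field.
tokens = c("the", "cat", "sat")
spaced = paste(tokens, collapse = " ")   # e.g. Spanish, English
unspaced = paste(tokens, collapse = "")  # e.g. Tibetan, Chinese
```

With one regex and two names, every chunk whose chunkType matches "verb" lands in the verb layer and all remaining chunks fall into refexpr, which mirrors the layerRegex used in the example on this page.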
Value
A rezrObj object. See new_rezrObj for details.
Note
After import, consider calling functions such as addUnitSeq, addIsWordField or getAllTreeCorrespondences. These are not run during import for performance reasons.
Examples
path = system.file("extdata", "sbc007.rez", package = "rezonateR", mustWork = TRUE)
layerRegex = list(chunk = list(field = "chunkType", regex = c("verb"), names = c("verb", "refexpr")))
concatFields = c("text", "transcript")
rez007 = importRez(path, layerRegex = layerRegex, concatFields = concatFields)
#> Import starting - please be patient ...
#> Creating node maps ...
#> Creating rezrDFs ...
#> Adding foreign fields to rezrDFs and sorting (this is the slowest step) ...
#> >Adding to unit entry DF ...
#> >Adding to unit DF ...
#> >Adding to chunk DF ...
#> >Adding to stack DFs ...
#> >Adding to rez DFs ...
#> >Adding to track DFs ...
#> >Adding to tree DFs ...
#> Splitting rezrDFs into layers ...
#> A few finishing touches ...
#> Done!
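Once the import finishes, the returned object can be explored with base R functions alone. The following is a minimal sketch that assumes rezonateR is installed and that sbc007.rez ships with it, as in the example above. It only lists the class and the top-level components, without assuming any particular component names.

```r
# Run the import only if rezonateR is available on this machine.
installed = requireNamespace("rezonateR", quietly = TRUE)
if (installed) {
  path = system.file("extdata", "sbc007.rez", package = "rezonateR", mustWork = TRUE)
  layerRegex = list(chunk = list(field = "chunkType", regex = c("verb"), names = c("verb", "refexpr")))
  concatFields = c("text", "transcript")
  rez007 = rezonateR::importRez(path, layerRegex = layerRegex, concatFields = concatFields)

  # Base R inspection: class of the result and the names of its top-level components.
  print(class(rez007))
  print(names(rez007))
}
```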