These functions may be used for the complexAction field in addFieldForeign / changeFieldForeign, or within expressions in functions like addField.

Usage

concatenateAll(x)

longestLength(x, isWord = T)

longest(x, isWord = T)

shortestLength(x, isWord = T)

shortest(x, isWord = T)

inLength(x, isWord = T)

Arguments

x

The information from the source rezrDF.

isWord

Name of the column that determines whether a token is a word or not.

Note

concatenateAll concatenates all the values in x together. It is not to be confused with concatStringFields, which is applied to data frames. longest and shortest return the longest and shortest strings in x, and may return multiple entries if there are ties. longestLength and shortestLength return the lengths of the longest and shortest strings in x. Base R summary functions such as max, min, mean and range may also be used in the same places.
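
The behaviour described in this note can be approximated in base R. The following is a minimal sketch, not the package's own implementation: the vector x is made-up data, the variable names are hypothetical, and the absence of a separator in the concatenation is an assumption, since the note does not say whether concatenateAll inserts one.

```r
# Made-up token strings standing in for one group (e.g. one unit).
x <- c("the", "speaker", "paused", "briefly", "um")

# Number of characters in each string.
lens <- nchar(x)

# Rough analogue of concatenateAll(x); no separator is assumed here.
allTogether <- paste(x, collapse = "")

# Rough analogues of longestLength(x) and shortestLength(x).
maxLen <- max(lens)
minLen <- min(lens)

# Rough analogues of longest(x) and shortest(x): every tied string is kept,
# so the result may contain more than one entry.
longestStrings <- x[lens == maxLen]
shortestStrings <- x[lens == minLen]

print(longestStrings)
print(shortestStrings)
```

Because "speaker" and "briefly" have the same number of characters, the longest-string result here has two entries, which is the tie behaviour the note mentions.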

Remember to pass only the function name (e.g. longest) in complexAction fields, and to include the argument x (normally the name of a column in your rezrDF, e.g. longest(text)) in expression fields.

Examples

sbc007 = addField(sbc007, entity = "token", layer = "",
                 fieldName = "longestWordInUnit",
                 expression = longestLength(text),
                 type = "complex",
                 groupField = "unit",
                 fieldaccess = "auto")
sbc007UnitLengths = sbc007$tokenDF %>%
  rez_group_by(unit) %>%
  summarise(lenWords = inLength(text, isWord = (kind == "Word")))