Add a field indicating whether each token is a word or not.

Usage

addIsWordField(x, ...)

# S3 method for rezrDF
addIsWordField(x, cond, addWordSeq = T)

# S3 method for rezrObj
addIsWordField(x, cond, addWordSeq = T)

Arguments

x

The rezrDF or rezrObj to be edited.

cond

The wordhood condition: an expression that evaluates to TRUE for rows that count as words. For example, if your word column is called 'word', and you wish to exclude zeroes, you may write 'word != "<0>"'.

addWordSeq

If TRUE, the columns wordOrder and docWordSeq will be added.
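
To make the role of cond more concrete, here is a minimal base R sketch of how a wordhood condition can be read as a logical expression over the columns of a token table. The data frame, its contents and the result column name isWord are all hypothetical and chosen for illustration; this is not the package's internal implementation.

```r
# Toy token table (hypothetical data, not taken from sbc007).
tokens <- data.frame(
  word = c("I", "<0>", "stayed", "up", "(...)", "late"),
  kind = c("Word", "Word", "Word", "Word", "Pause", "Word"),
  stringsAsFactors = FALSE
)

# A wordhood condition is TRUE for rows that count as words.
# Here: the token has kind "Word" and is not a zero ("<0>").
tokens$isWord <- with(tokens, kind == "Word" & word != "<0>")

print(tokens)
```

In this sketch the zero and the pause are the only rows for which the condition is FALSE. A condition passed as cond would be written in the same way, using whatever column names your own table has.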

Value

The modified rezrDF or rezrObj. If addWordSeq is set to TRUE, the columns wordOrder and docWordSeq will be added to tokenDF and entryDF, and the columns wordOrderFirst, wordOrderLast, docWordSeqFirst and docWordSeqLast will be added to unitDF, chunkDF, rezDF and trackDF.

Note

If used on a rezrObj with addWordSeq = TRUE, the word sequence columns are added automatically: wordOrder and docWordSeq to the token and entry tables, and their First and Last counterparts to the unit, chunk, rez and track tables.
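
The following base R sketch illustrates the idea behind the word sequence columns: they parallel the token-based positions but count only rows flagged as words. The toy table, the columns unit, text, tokenOrder and docTokenSeq, and the choice to give non-words the value 0 are assumptions made for this illustration; they are not confirmed by this page and may differ from what the package does.

```r
# Toy token table spanning two units (hypothetical data).
tokens <- data.frame(
  unit   = c(1, 1, 1, 1, 2, 2, 2),
  text   = c("(H)", "stay", "up", "late", "...", "all", "this"),
  isWord = c(FALSE, TRUE, TRUE, TRUE, FALSE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# Token-based positions: within each unit, and across the whole document.
tokens$tokenOrder  <- ave(seq_along(tokens$unit), tokens$unit, FUN = seq_along)
tokens$docTokenSeq <- seq_len(nrow(tokens))

# Word-based positions: count only words.
# Assumption for this sketch: non-words receive 0.
tokens$wordOrder  <- ave(as.integer(tokens$isWord), tokens$unit, FUN = cumsum) * tokens$isWord
tokens$docWordSeq <- cumsum(tokens$isWord) * tokens$isWord

# First/Last values for a chunk covering rows 2 to 4 ("stay up late").
chunk <- tokens[2:4, ]
words <- chunk[chunk$isWord, ]
chunkSummary <- c(
  tokenOrderFirst = min(chunk$tokenOrder),
  wordOrderFirst  = min(words$wordOrder),
  docTokenSeqLast = max(chunk$docTokenSeq),
  docWordSeqLast  = max(words$docWordSeq)
)

print(tokens)
print(chunkSummary)
```

Because the first token of the unit is not a word, the chunk starts at token position 2 but word position 1, which is the same kind of gap visible between tokenOrderFirst and wordOrderFirst in the example output.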

Examples

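library(dplyr)  # for %>% and select()

# Treat tokens whose kind is "Word" as words; addWordSeq defaults to TRUE.
sbc007_withword = addIsWordField(sbc007, kind == "Word")

# Compare token-based and word-based positions for the first few chunks.
sbc007_withword$chunkDF$refexpr %>%
  select(id, text, tokenOrderFirst, wordOrderFirst, docTokenSeqLast, docWordSeqLast) %>%
  head()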
#> # A tibble: 6 × 6
#>   id            text                             token…¹ wordO…² docTo…³ docWo…⁴
#>   <chr>         <chr>                              <dbl>   <dbl>   <dbl>   <dbl>
#> 1 35E3E0AB6803A Stay up late                           3       1      19      14
#> 2 1F6B5F0B3FF59 the purpose of getting up in th…       5       3      32      24
#> 3 24FE2B219BD21 getting up in the morning              8       6      32      24
#> 4 158B579C1BA49 the morning                           11       9      32      24
#> 5 2B6521E881365 all this other shit                    2       2     147      99
#> 6 5B854594DD34  the way (...) they were feeling        5       2     161     107
#> # … with abbreviated variable names ¹​tokenOrderFirst, ²​wordOrderFirst,
#> #   ³​docTokenSeqLast, ⁴​docWordSeqLast