Counts the competitors of each mention. The count may be restricted to a window of tokens from the current mention, may cover all referents competing with the current one, or may combine both conditions. By default, we count referents intervening between the current and previous mention. Despite its name, tokenOrder does not have to be a token-level sequence: it can be set to unitSeqLast or a similar column.

Usage

countCompetitors(
  cond = NULL,
  windowSize = Inf,
  tokenSeq = NULL,
  unitSeq = NULL,
  chain = NULL,
  between = T,
  exclFrag = F,
  combinedChunk = NULL,
  nonFragmentMember = F,
  windowType = "unit"
)

countCompetitorsMatch(
  matchCol,
  windowSize = Inf,
  tokenOrder = NULL,
  chain = NULL,
  between = T,
  exclFrag = F,
  combinedChunk = NULL,
  nonFragmentMember = F
)

countMatchingCompetitors(
  matchCol,
  windowSize = Inf,
  tokenOrder = NULL,
  chain = NULL,
  between = T,
  exclFrag = F,
  combinedChunk = NULL,
  nonFragmentMember = F
)

Arguments

cond

The condition under which a mention counts as a competitor. Leave as NULL (the default) if anything goes.

windowSize

The size of the window in which you will be counting.

unitSeq

The vector of tokenOrder values where the mentions appeared. You can choose tokenOrderFirst, tokenOrderLast, or an average of the two. By default it's tokenOrderFirst.

chain

The chain that each mention belongs to.

between

Whether to count only competitors between the current mention and the previous mention. (If T, the value is NA for first mentions.)

exclFrag

Whether to exclude 'fragments' (i.e. members of a combined chunk which do not serve as meaningful chunks in their own right).

combinedChunk

The combinedChunk column of the rezrDF. By default, named combinedChunk.

nonFragmentMember

Vector indicating whether each entry is a non-fragment member, i.e. a member of a combined chunk that also serves as a meaningful chunk in its own right.

matchCol

The column for which a value is to be matched.

tokenOrder

The vector of sequence values where the mentions appeared. Common choices are docTokenSeqFirst, docTokenSeqLast, wordTokenSeqFirst and wordTokenSeqLast (the last two are available after running addIsWordField on a rezrObj). By default it's docTokenSeqLast.

Value

A vector giving the number of competitors for each mention.
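To make the interaction of windowSize and between more concrete, here is a minimal base-R sketch of the counting logic. It is not the package's implementation: count_competitors_sketch, pos and the toy data are hypothetical, and the sketch assumes that a competitor is any earlier mention belonging to a different chain. Whether the package counts mentions or distinct referents is not stated on this page.

```r
# Hypothetical helper, NOT part of the package: an illustrative sketch only.
# pos:   position (e.g. a token or unit sequence value) of each mention
# chain: identifier of the chain each mention belongs to
count_competitors_sketch <- function(pos, chain, windowSize = Inf, between = TRUE) {
  result <- rep(NA_integer_, length(pos))
  for (i in seq_along(pos)) {
    # Positions of earlier mentions in the same chain
    prevSame <- pos[chain == chain[i] & pos < pos[i]]
    # With between = TRUE, first mentions have no previous mention: stay NA
    if (between && length(prevSame) == 0) next
    # Assumption: competitors are earlier mentions of other chains in the window
    keep <- chain != chain[i] & pos < pos[i] & pos >= pos[i] - windowSize
    # Optionally keep only those after the previous same-chain mention
    if (between) keep <- keep & pos > max(prevSame)
    result[i] <- sum(keep)
  }
  result
}

pos   <- c(1, 2, 3, 5, 6, 8)
chain <- c("A", "B", "A", "C", "B", "A")

count_competitors_sketch(pos, chain)
# NA NA 1 NA 2 2
count_competitors_sketch(pos, chain, windowSize = 3, between = FALSE)
# 0 1 1 2 2 2
```

With between = TRUE, first mentions yield NA, as described for the between argument. With between = FALSE, every mention gets a count based on the window alone.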

Examples

sbc007$trackDF$default %>%
  rez_mutate(isZero = (text == "<0>")) %>%
  rez_mutate(noCompetitors = countCompetitors(windowSize = 40, between = F),
             noMatchingCompetitors = countMatchingCompetitors(isZero, windowSize = 40, between = F))
#> # A tibble: 236 × 34
#>    id      doc   chain sourc…¹ token gapWo…² charC…³ token…⁴ gapUn…⁵ kind  place
#>    <chr>   <chr> <chr> <chr>   <chr> <chr>     <dbl>   <dbl> <chr>   <chr> <chr>
#>  1 1096E4… sbc0… 278D… ""      37EF… N/A           1       1 N/A     "Wor… "1"  
#>  2 92F20A… sbc0… 278D… "174E6… 9363… 2             1       1 0       "Wor… "3"  
#>  3 7E5BB6… sbc0… 2B67… ""      744A… N/A          17       5 N/A     ""    ""   
#>  4 1F74D2… sbc0… 2A01… "52452… 1265… N/A           4       1 N/A     "Wor… "9"  
#>  5 2485C4… sbc0… 278D… "CB1D9… 2113… 10            3       1 1       ""    ""   
#>  6 1BF226… sbc0… 2A01… ""      35E3… 5            12       3 1       ""    ""   
#>  7 6B37B5… sbc0… 2A01… "ED8C9… 233E… 5             3       1 1       ""    ""   
#>  8 259C2C… sbc0… 251A… ""      1F6B… N/A          40       8 N/A     ""    ""   
#>  9 1D1F2B… sbc0… 10FA… ""      24FE… N/A          25       5 N/A     ""    ""   
#> 10 1FA380… sbc0… 3067… ""      158B… N/A          11       2 N/A     ""    ""   
#> # … with 226 more rows, 23 more variables: text <chr>, transcript <chr>,
#> #   endNote <chr>, order <chr>, negPlace <chr>, corpusSeq <chr>,
#> #   pSentOrder <chr>, POS_dft <chr>, tokenSeq <chr>, chunkType <chr>,
#> #   turnOrder <chr>, largerChunk <chr>, tokenOrderFirst <dbl>,
#> #   tokenOrderLast <dbl>, docTokenSeqFirst <dbl>, docTokenSeqLast <dbl>,
#> #   chainCreateSeq <dbl>, name <chr>, chainSize <dbl>, layer <chr>,
#> #   isZero <lgl>, noCompetitors <int>, noMatchingCompetitors <int>, and …
#> # ℹ Use `print(n = ...)` to see more rows, and `colnames()` to see all variable names
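The example above passes isZero as matchCol. One plausible reading, sketched below in base R on toy data, is that a competitor is counted only if its matchCol value equals that of the current mention. This is an assumption rather than a description of the package's code, and count_matching_sketch, pos, matchVal and the data are hypothetical names introduced for illustration.

```r
# Hypothetical helper, NOT part of the package: an illustrative sketch only.
# Assumption: a "matching competitor" is an earlier mention of another chain,
# inside the window, whose matchVal equals that of the current mention.
count_matching_sketch <- function(pos, chain, matchVal, windowSize = Inf) {
  vapply(seq_along(pos), function(i) {
    inWindow <- pos < pos[i] & pos >= pos[i] - windowSize
    sum(inWindow & chain != chain[i] & matchVal == matchVal[i])
  }, integer(1))
}

pos    <- c(1, 2, 3, 5, 6, 8)
chain  <- c("A", "B", "A", "C", "B", "A")
isZero <- c(FALSE, TRUE, FALSE, TRUE, TRUE, FALSE)

count_matching_sketch(pos, chain, isZero, windowSize = 3)
# 0 0 0 1 1 0
```

Under this reading, the fourth and fifth mentions each have one matching competitor: another zero mention from a different chain within the preceding three positions.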