Supplies a pre-computed term frequency (TF) lookup table for a comparison column instead of having it computed automatically from the data. This is useful when you have TF tables built from a larger production dataset or want to reuse the same TF values across multiple linkage runs.
Details
The supplied data must have exactly two columns: the value column
(named the same as the comparison column) and the frequency column
(named tf_<col>, for example tf_first_name for a first_name column).
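As a rough illustration of the required two-column shape, the sketch below builds a TF table in base R from a small made-up reference data frame. The `reference` data and the choice to express frequencies as proportions that sum to 1 are assumptions for illustration (the latter inferred from the example values on this page), not documented behaviour of the package.

```r
# Hypothetical reference data; in practice this would be a larger dataset.
reference <- data.frame(
  first_name = c("John", "John", "Jane", "Bob", "John", "Jane"),
  stringsAsFactors = FALSE
)

# Count each distinct value, then convert counts to relative frequencies.
counts <- table(reference$first_name)

# Two columns only: the value column and the tf_<col> frequency column.
tf <- data.frame(
  first_name = names(counts),
  tf_first_name = as.numeric(counts) / sum(counts),
  stringsAsFactors = FALSE
)
```

A table built this way could then be passed as the third argument to il_register_tf(), in the same way as the hand-written table in the examples.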
Examples
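# Open an in-memory DuckDB connection
con <- DBI::dbConnect(duckdb::duckdb())

# Compare on first_name, block on surname
spec <- il_spec() |>
  il_compare(first_name, cl_exact()) |>
  il_block_on(surname)
model <- il_model(fake_20, spec = spec, con = con)

# TF table: value column plus tf_<col> frequency column
tf <- data.frame(
  first_name = c('John', 'Jane', 'Bob', 'Alice', 'Tom'),
  tf_first_name = rep(0.2, 5)
)

# Register the pre-computed table for the first_name comparison
model <- il_register_tf(model, 'first_name', tf)

# Clean up the model, then close the connection
il_cleanup(model)
DBI::dbDisconnect(con, shutdown = TRUE)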
