Takes a loaded (or existing) il_model and binds it to new data and a
fresh database connection, producing a model ready for predict() or
further training. Accepts in-memory data frames, dbplyr::tbl_lazy table
references, or character table names.
Arguments
- model
An il_model object, typically from il_load().
- .data
A data frame, tibble::tibble(), dbplyr::tbl_lazy, or character table name. The first (or only) input dataset.
- ...
Additional datasets for multi-table linkage.
- con
A DBI connection object from DBI::dbConnect(). Optional when .data is a dbplyr::tbl_lazy; in that case the connection is extracted from the table reference.
- link_type
Optionally override the model's link type. If NULL (default), uses the link type stored in the model.
Value
The model, now connected to con with data uploaded, ready
for predict(), il_find_matches(), or further training.
Details
This is the key function for the production workflow:
train once with il_model() -> save with il_save() -> later, load
with il_load() and attach to new data with il_attach().
The loaded model's trained parameters (m, u, prior) are preserved.
You can immediately call predict() on the attached model, or
continue training with il_estimate_em() using the existing
parameters as a warm start.
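As a rough illustration of the two follow-up paths described above, the sketch below attaches a loaded model through a lazy table reference and then either scores or continues training. It assumes `path` points to a model previously written with il_save(); the table name "people" is hypothetical, and the il_estimate_em() call form is copied from the Examples on this page rather than from a verified signature.

```r
library(DBI)

# Assumption: `path` refers to a model saved earlier with il_save().
# "people" is a hypothetical table name used only for illustration.
con <- dbConnect(duckdb::duckdb())
dbWriteTable(con, "people", fake_1000)

loaded <- il_load(path)

# With a lazy table reference, `con` can be omitted: the connection is
# taken from the table reference itself.
people_tbl <- dplyr::tbl(con, "people")
model <- il_attach(loaded, people_tbl)

# Option 1: score immediately with the preserved m, u and prior parameters.
scored <- predict(model)

# Option 2: continue EM training, using the preserved parameters as a warm start.
model <- il_estimate_em(model, block_on(surname))

dbDisconnect(con, shutdown = TRUE)
```

Passing a lazy reference avoids copying a table that already lives in the database; an in-memory data frame would need `con` supplied explicitly.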
Examples
# Train a model on an in-memory DuckDB connection
con <- DBI::dbConnect(duckdb::duckdb())
spec <- il_spec() |>
  il_compare(first_name, cl_jaro_winkler(0.9, 0.7)) |>
  il_block_on(surname)
model <- il_model(fake_1000, spec = spec, con = con)
model <- il_estimate_u(model)
model <- il_estimate_em(model, block_on(surname))
#> EM trained: first_name

# Save the trained model and close the original connection
path <- tempfile(fileext = '.rds')
il_save(model, path)
DBI::dbDisconnect(con, shutdown = TRUE)

# Later: load the model and attach it to data on a fresh connection
con2 <- DBI::dbConnect(duckdb::duckdb())
loaded <- il_load(path)
model2 <- il_attach(loaded, fake_1000, con = con2)
DBI::dbDisconnect(con2, shutdown = TRUE)
