Restricted multiply-imputed data subsets

A function to perform multiple imputation of missing dating data in the EDH dataset.

rmids(x, vars, collapse, pool, type = c("1", "2"))

Arguments

x: dataframe or list of dataframes with a data set to impute
vars: vector of attribute variables in x, typically dating data (optional)
collapse: collapse list of dataframes? (optional and logical, default FALSE)
pool: pool the results? (optional and logical)
type: type of pooling: "1" for min TAQ and max TPQ, "2" for conditional pooling
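
As a rough illustration of pooling type "1", the sketch below takes several imputed timespans for one record and keeps the minimum TAQ and the maximum TPQ. The data frame imps and its values are invented for illustration, and this is only one reading of the option, not the code used inside rmids.

```r
# Three hypothetical imputations of the same record (made-up years).
imps <- data.frame(taq = c(100, 90, 110), tpq = c(180, 200, 170))

# Assumed reading of type "1": keep the earliest TAQ and the latest TPQ.
pooled <- c(taq = min(imps$taq), tpq = max(imps$tpq))
pooled
```

Under this reading the pooled timespan is the widest one covered by the imputations. Conditional pooling (type "2") is not sketched here because its rule is not described above.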

Details

Imputation is the process of replacing missing data, which occur in entries of the Epigraphic Database Heidelberg (EDH) and related datasets. In this context, the missing data to impute are the endpoints of the timespan of existence of epigraphs or inscriptions. These endpoints are represented by the variables TAQ and TPQ (cf. prex), recorded as "not_before" and "not_after" in the EDH dataset. A record is censored when only one limit of its timespan is known.

To perform imputation on subsets of missing dating data in the EDH dataset, the function edhwpd organizes records per Roman province and dates by simple match similarity of the different attribute variables specified in vars. The outcome is a dataframe or a list of dataframes, depending on the province characteristics, and the restricted multiple imputation is performed on this outcome. The collapse argument controls whether a list of dataframes is collapsed into a single dataframe.

When dating data are completely missing, rpd provides the average date, min TAQ, max TPQ, and average timespan length for each Roman province, which are used for the multiple imputation.
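
To make the role of the province-level summaries more concrete, here is a minimal sketch of how such values could fill in a record with both endpoints missing. The list prov, its numbers, the helper impute_all_missing, and the centring-and-clipping rule are all assumptions made for illustration. They are not taken from rpd and are not confirmed to be what rmids does.

```r
# Hypothetical province-level summary; the numbers are invented, not taken from rpd.
prov <- list(avg_date = 150, min_taq = -50, max_tpq = 400, avg_span = 60)

# Assumed rule (not confirmed by the documentation): centre a timespan of
# average length on the average date, then clip it to the province limits.
impute_all_missing <- function(p) {
  taq <- max(p$min_taq, p$avg_date - p$avg_span / 2)
  tpq <- min(p$max_tpq, p$avg_date + p$avg_span / 2)
  c(taq = taq, tpq = tpq)
}

imputed <- impute_all_missing(prov)
imputed
```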

Value

A list of dataframes with imputed data. Unlike the recorded values, imputed dates are not preceded by a zero. Component cases and names are:

NA-NA: all missing
taq-NA: censored
NA-tpq: censored
complete: complete data
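
The four cases above can be illustrated with a small, self-contained sketch that labels toy records by which endpoints are recorded and splits them into one dataframe per case. The data frame dates and its values are made up, and the code only mimics the naming of the components. It is not the internal logic of rmids.

```r
# Toy records; values are made up for illustration and are not EDH data.
dates <- data.frame(
  id         = c("a", "b", "c", "d"),
  not_before = c("0071", NA, "0101", NA),
  not_after  = c("0130", "0200", NA, NA),
  stringsAsFactors = FALSE
)

# Label each record by which endpoints are recorded. Following the Details
# section, TAQ is read from "not_before" and TPQ from "not_after".
has_taq <- !is.na(dates$not_before)
has_tpq <- !is.na(dates$not_after)
case <- ifelse(has_taq & has_tpq, "complete",
        ifelse(has_taq, "taq-NA",
        ifelse(has_tpq, "NA-tpq", "NA-NA")))

# One dataframe per case, named like the components listed above.
components <- split(dates, case)
names(components)
```

Each element of components can then be inspected by name, for example components[["taq-NA"]] for records with only the TAQ known.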

References

Ostoic, A. and Letina, S. "Network imputation for missing dating data in archaeological artefacts," The Connected Past: Artefactual Intelligence conference, Aarhus (2021).

Author

Antonio Rivero Ostoic

Note

Row names of complete dating data that belong to a component with imputed data are replaced in the collapsed dataframe produced from a list of dataframes.

Examples

if (FALSE) {
## extract province, dates, and a single attribute variable from the EDH dataset
rom <- edhwpd(vars = "type_of_inscription", province = "Rom", dates = c("not_after", "not_before"))

## perform restricted imputation
rmids(rom, vars = c("not_after", "not_before"))
}