rmids.Rd
A function to perform multiple imputation of missing dating data in the EDH dataset.
rmids(x, vars, collapse, pool, type = c("1", "2"))
x
dataframe or list of dataframes with a data set to impute
vars
vector of attribute variables in x, typically dating data (optional)
collapse
collapse list of dataframes? (optional and logical, default FALSE)
pool
pool the results? (optional and logical)
type
type of pooling: "1" for min TAQ and max TPQ; "2" for conditional pooling
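As an illustration of pooling type "1", the following base R sketch pools several imputation draws for a single record by keeping the minimum TAQ and the maximum TPQ. It is one possible reading of "min TAQ and max TPQ" and not necessarily how rmids pools internally; the draws, the column names taq and tpq, and the helper pool_minmax are hypothetical.

```r
# Five hypothetical imputation draws for one record (made-up values).
draws <- data.frame(
  taq = c(120, 95, 110, 101, 130),
  tpq = c(180, 210, 175, 190, 205)
)

# Assumed reading of type "1": keep the widest interval seen across draws,
# that is, the smallest TAQ and the largest TPQ.
pool_minmax <- function(d) {
  c(taq = min(d$taq), tpq = max(d$tpq))
}

pooled <- pool_minmax(draws)
print(pooled)
```

Under this reading, the pooled timespan covers every imputed timespan for the record, which is a conservative choice.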
Imputation is the process of replacing missing data, which applies to entries in the
Epigraphic Database Heidelberg (EDH) and related datasets.
In this context, the missing data to impute are the endpoints of the timespan of existence of epigraphs or inscriptions,
represented by the variables TAQ and TPQ (cf. prex) and recorded as "not_before" and "not_after" in the EDH dataset.
Records with only one limit of the timespan known are cases of censoring.
To perform imputation on subsets of missing dating data in the EDH dataset,
the function edhwpd organizes records per Roman province and dates
by simple match similarity of the attribute variables specified in vars.
The result is a dataframe or a list of dataframes, depending on the province characteristics,
and a restricted multiple imputation is then performed on these data subsets.
Argument collapse serves to collapse a list of dataframes into a single dataframe.
When dating data are completely missing, rpd
provides the average date, min TAQ, max TPQ, and the average timespan length
for each Roman province, for use in the multiple imputation.
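The per-province quantities mentioned above can be sketched in base R as follows. This is not the rpd implementation: the toy data, the province labels, the helper summarise_province, and the choice of the interval midpoint as the "average date" are assumptions made for illustration only.

```r
# Toy records with made-up values; one record has no dating data at all.
toy <- data.frame(
  province   = c("Rom", "Rom", "Bri", "Bri", "Bri"),
  not_before = c(1, 101, 51, 151, NA),
  not_after  = c(100, 200, 150, 250, NA)
)

# Only records with both endpoints contribute to the summaries.
complete <- toy[!is.na(toy$not_before) & !is.na(toy$not_after), ]

# Hypothetical helper: summary statistics for the records of one province.
summarise_province <- function(d) {
  data.frame(
    avg_date = mean((d$not_before + d$not_after) / 2),  # assumed: mean midpoint
    min_taq  = min(d$not_before),
    max_tpq  = max(d$not_after),
    avg_span = mean(d$not_after - d$not_before)
  )
}

# One summary row per province; row names are the province labels.
by_province <- do.call(
  rbind,
  lapply(split(complete, complete$province), summarise_province)
)
print(by_province)
```

A record with both dates missing, such as the last row of the toy data, could then be imputed from the summary row of its own province.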
A list of dataframes with imputed data, where imputed dating data are not preceded by a zero as the recorded values are. The component names and their cases are:
NA-NA
all missing
taq-NA
censored
NA-tpq
censored
complete
complete data
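The four component cases can be illustrated with a small base R sketch that splits toy records by their pattern of missing dates. The records, the helper dating_case, and the mapping of "taq" to not_before and "tpq" to not_after (following the description above) are assumptions for illustration; the sketch does not call rmids.

```r
# Toy records with made-up ids and dates in the zero-padded recorded style.
toy <- data.frame(
  id         = c("a", "b", "c", "d", "e"),
  not_before = c("0071", NA, "0101", NA, "0151"),
  not_after  = c("0130", NA, NA, "0200", NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: label each record by which endpoints are missing.
dating_case <- function(lower, upper) {
  ifelse(is.na(lower) & is.na(upper), "NA-NA",
    ifelse(!is.na(lower) & is.na(upper), "taq-NA",
      ifelse(is.na(lower) & !is.na(upper), "NA-tpq", "complete")))
}

cases <- dating_case(toy$not_before, toy$not_after)

# Split into a named list of dataframes, one per case, in a fixed order.
components <- split(
  toy,
  factor(cases, levels = c("NA-NA", "taq-NA", "NA-tpq", "complete"))
)
print(names(components))
```

Here records "c" and "e" fall under taq-NA, "d" under NA-tpq, "b" under NA-NA, and "a" under complete.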
Ostoic, A. and Letina, S. "Network imputation for missing dating data in archaeological artefacts," The Connected Past: Artefactual Intelligence conference, Aarhus (2021).
Row names of complete dating data that belong to a component with imputed data get replaced in the collapsed dataframe produced from a list of dataframes.