Similarity between vectors in columns

A function to compute the similarity between vectors taken from columns of a data frame, based on the attribute values they have in common.

simil(x, vars, uniq, diag.incl, dichot, rm.isol, k)

Arguments

x: a list or a data frame object
vars: vector with column(s) in x representing variable attributes
uniq: unique entries? (optional and logical)
diag.incl: also include entries in the matrix diagonal? (optional and logical)
dichot: dichotomize output? (optional and logical)
rm.isol: remove isolates in output? (optional and logical)
k: cut-off for dichotomization (if not specified, the maximum value of the output)

Details

This function computes the similarity between two or more vectors, which can come from columns in a data frame or from list entries. The similarity of artefacts or other units is computed by simple matching of the variable attributes specified in vars, and it represents a measure of proximity among the items being compared. Items are identified by an id column in x; if there is none, the first column is used, provided that it has no duplicated entry names.
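The simple matching described above can be sketched in base R. This is only an illustration, assuming that simple matching means counting, for each pair of items, how many of the chosen attributes have the same value; it is not the package's implementation. The data frame items and the helper simple_match() are hypothetical.

```r
# Hypothetical toy data: three items described by two attributes.
# The column names mirror the example on this page; the values are made up.
items <- data.frame(
  id = c("A", "B", "C"),
  findspot_ancient = c("site1", "site1", "site2"),
  language = c("lang1", "lang1", "lang1"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: count, for each pair of items, how many of the
# attributes in 'vars' have the same value.
simple_match <- function(df, vars, id = "id") {
  n <- nrow(df)
  m <- matrix(0L, n, n, dimnames = list(df[[id]], df[[id]]))
  for (v in vars) {
    # TRUE where items i and j share the value of attribute v
    same <- outer(df[[v]], df[[v]], "==")
    same[is.na(same)] <- FALSE  # assumption: missing values never match
    m <- m + same
  }
  diag(m) <- 0L  # leave self-comparisons out of the result
  m
}

sim <- simple_match(items, vars = c("findspot_ancient", "language"))
```

Here items A and B share both attributes, while C shares only the language with each of them, so the A-B entry is the largest in the matrix.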

Both the dichotomization of the output and the removal of isolated items from the system of co-occurrence relations depend on functions from the package "multiplex".
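To show what these two post-processing steps do to a valued matrix, here is a minimal base R sketch. It is a stand-in for illustration and does not call "multiplex"; the helpers dichotomize() and drop_isolates() are hypothetical, and the rule that values at or above the cut-off become 1 is an assumption.

```r
# Hypothetical valued similarity matrix (values made up for illustration).
sim <- matrix(c(0, 2, 1, 0,
                2, 0, 1, 0,
                1, 1, 0, 0,
                0, 0, 0, 0),
              nrow = 4, byrow = TRUE,
              dimnames = list(LETTERS[1:4], LETTERS[1:4]))

# Dichotomize: entries at or above the cut-off k become 1, all others 0.
# By default the cut-off is the maximum value in the matrix.
dichotomize <- function(m, k = max(m)) {
  (m >= k) * 1L
}

# Remove isolates: drop items that have no ties in their row or column.
drop_isolates <- function(m) {
  keep <- rowSums(m != 0) > 0 | colSums(m != 0) > 0
  m[keep, keep, drop = FALSE]
}

bin <- dichotomize(sim, k = 1)  # binary matrix of ties with similarity >= 1
net <- drop_isolates(bin)       # item D has no ties and is dropped
```

With the default cut-off (the maximum of the matrix) only the tie between A and B survives dichotomization, so removing isolates afterwards would also drop C.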

Value

A valued matrix of similarities among items in x.

Author

Antonio Rivero Ostoic

Examples

## Not run:
# get inscriptions from a Roman province
arm <- edhw(province="Armenia")
#> Warning: "x" is for dataset "EDH".
#> Warning: "province" with no "vars" returns lists.

# select variables and return them as a data frame
armv <- edhw(x=arm, as="df", 
        vars=c("findspot_ancient", "type_of_inscription", "type_of_monument", "language"))

# matrix of similarities of two variables
simil(armv, vars=c("findspot_ancient", "language"))
#>          HD015521 HD015524 HD029916
#> HD015521        0        2        1
#> HD015524        2        0        1
#> HD029916        1        1        0
#> attr(,"class")
#> [1] SimMatrix
## End(Not run)