A function to compute the similarity between items, taken from columns in a data frame or from list entries, based on the attribute values they have in common.

simil(x, vars, uniq, diag.incl, dichot, rm.isol, k)

Arguments

x

a list or a data frame object

vars

vector with column(s) in x representing variable attributes

uniq

unique entries? (optional and logical)

diag.incl

also include entries in the matrix diagonal? (optional and logical)

dichot

dichotomize output? (optional and logical)

rm.isol

remove isolates in output? (optional and logical)

k

cut-off value for dichotomization (if not specified, the maximum value in the output)

Details

This function computes the similarity between two or more vectors, which can come from columns in a data frame or from list entries. The similarity of artefacts or other units is computed by simple matching on the variable attributes specified in vars, that is, by counting the attributes on which two items share the same value, and it serves as a measure of proximity among the items being compared. The comparison uses an id column from x when there is one; otherwise, the first column is taken, provided that it has no duplicated entry names.
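
The simple matching idea can be illustrated with a minimal base R sketch. This is not the implementation used by simil(); the helper simple_match(), the toy data frame, and the choice to treat missing values as non-matching are all assumptions made for illustration.

```r
# Hypothetical toy data; ids and attribute values are invented for illustration.
items <- data.frame(
  id       = c("A", "B", "C"),
  findspot = c("Artaxata", "Artaxata", "Garni"),
  language = c("Latin", "Latin", "Latin"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of any package): for every pair of items,
# count the attributes in `vars` on which both items have the same value.
simple_match <- function(df, vars, id = 1, diag.incl = FALSE) {
  ids <- as.character(df[[id]])
  stopifnot(!anyDuplicated(ids))          # item names must be unique
  n <- length(ids)
  m <- matrix(0, n, n, dimnames = list(ids, ids))
  for (v in vars) {
    vals <- as.character(df[[v]])
    eq <- outer(vals, vals, "==")         # pairwise equality for this attribute
    eq[is.na(eq)] <- FALSE                # assumption: missing values never match
    m <- m + eq
  }
  if (!diag.incl) diag(m) <- 0            # self-similarity is dropped by default
  m
}

sim <- simple_match(items, vars = c("findspot", "language"))
print(sim)
```

In this sketch, items A and B agree on both attributes, whereas C agrees with each of them on language only, so the A-B entry is larger than the entries involving C.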

Both the dichotomization of the output and the removal of isolated items from the system of co-occurrence relations depend on functions from package "multiplex".
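
What these two post-processing steps do to a valued matrix can be sketched in base R. The helpers dichotomize() and drop_isolates() below are hypothetical stand-ins, not the functions from "multiplex", and the rule that values at or above the cut-off become 1 is an assumption.

```r
# Hypothetical valued similarity matrix; values are invented for illustration.
ids <- c("A", "B", "C", "D")
sim <- matrix(c(0, 2, 1, 0,
                2, 0, 1, 0,
                1, 1, 0, 0,
                0, 0, 0, 0),
              nrow = 4, byrow = TRUE, dimnames = list(ids, ids))

# Dichotomize: entries at or above the cut-off k become 1, all others 0.
# As with the k argument, the cut-off defaults to the maximum of the matrix.
dichotomize <- function(m, k = max(m)) {
  (m >= k) * 1
}

# Remove isolates: drop items that have no tie to any other item.
drop_isolates <- function(m) {
  off <- m
  diag(off) <- 0                          # ignore self-ties when looking for isolates
  keep <- rowSums(off != 0) > 0 | colSums(off != 0) > 0
  m[keep, keep, drop = FALSE]
}

bin <- dichotomize(sim, k = 1)
print(drop_isolates(bin))                 # D has no ties and is dropped
print(drop_isolates(dichotomize(sim)))    # default cut-off keeps only the strongest tie
```

A lower cut-off keeps more ties and hence fewer isolates; with the default cut-off only the pairs with the maximum similarity remain connected.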

Value

A valued matrix of similarities among items in x.

Author

Antonio Rivero Ostoic

Examples

## Not run:
# get inscriptions from a Roman province
arm <- edhw(province="Armenia")
#> Warning: "x" is for dataset "EDH".
#> Warning: "province" with no "vars" returns lists.

# select variables and return them as a data frame
armv <- edhw(x=arm, as="df",
        vars=c("findspot_ancient", "type_of_inscription", "type_of_monument", "language"))

# matrix of similarities of two variables
simil(armv, vars=c("findspot_ancient", "language"))
#>          HD015521 HD015524 HD029916
#> HD015521        0        2        1
#> HD015524        2        0        1
#> HD029916        1        1        0
#> attr(,"class")
#> [1] SimMatrix
## End(Not run)