edhw.Rd
A function to obtain variable data and perform transformations on the EDH dataset.
x: a list object name with fragments of the EDH dataset (optional)
vars: vector of variables of interest from x; if x=NULL, the entire EDH dataset is taken (optional)
as: format to return the output; either a "list" or a data frame "df" object
type: type format of the data frame; either "long" or "wide" ("narrow" not yet implemented)
split: divide the data into groups by id? (optional and logical)
select: vector with "people" variables (optional)
addID: add identification to the output? (optional and logical)
limit: integer or vector to limit the returned output. Ignored if id is specified (optional)
id: select only hd_nr records (optional, integer or character)
na.rm: remove entries with NA data? (optional and logical)
ldf: is x a list of data frames? (optional and logical)
province: name or abbreviation of a Roman province in EDH, as in the rp dataset
gender: gender of people in EDH: male or female
rp: customized list of Roman provinces, as in the rp dataset
...: optional arguments if needed.
This is an interface to extract the attribute variables given in vars from the EDH dataset attached to this package. However, the input in x can also be fragments of the EDH dataset, data from the Epigraphic Database Heidelberg API obtained with functions get.edh or get.edhw in the "rjson" format, or transformed data organized, for example, by provinces.
When x is explicit, it must be at least a list object with a structure comparable to the EDH dataset.
Variables are chosen through the vars argument, and the output is returned either as a list with "list" or as a data frame with "df". If argument vars is missing, then all entries in x are taken.
By default, a list object is returned, with or without an `ID' identification as set by the addID argument.
When the input list is converted into a data frame, the variables are ordered alphabetically.
If desired, it is also possible to remove missing data from the output by activating na.rm and work with complete cases.
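As a minimal sketch of the options above, the call below asks for a data frame and keeps only complete cases. It assumes that edhw and the EDH dataset are provided by the sdam package and that this package is installed; the variable names are taken from the examples on this page, and the object name df_out is only illustrative.

```r
# Assumption: 'edhw' and the 'EDH' dataset come from the 'sdam' package
library(sdam)
data(EDH)

# two date variables from the first 4 entries, returned as a data frame;
# na.rm = TRUE asks for complete cases only
df_out <- edhw(vars = c("not_after", "not_before"), as = "df",
               limit = 4, na.rm = TRUE)

# inspect the column names of the result
colnames(df_out)
```

Replacing as = "df" with as = "list" in the same call should give the list form of the same selection.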
Arguments id and limit serve to reduce the returned output, either to one or more Epigraphic Database numbers, which are specified by hd_nr, or else by limiting the amount of the returned output.
Here limit is like the limit argument of function get.edh, but in this case the offset can be specified as a sequence.
While limit is a faster way to get to entries in the EDH dataset, argument id is for referring to precisely one or more hd_nr values in the Epigraphic Database Heidelberg API.
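The difference between the two arguments can be sketched as follows. This is an illustration under assumptions rather than a reference usage: it assumes the sdam package supplies edhw and EDH, that a sequence such as 10:12 is accepted by limit, and that an integer passed to id is matched against the hd_nr of the records. The object names by_limit and by_id are hypothetical.

```r
# Assumption: 'edhw' and the 'EDH' dataset come from the 'sdam' package
library(sdam)
data(EDH)

# limit given as a sequence (here positions 10 to 12 in the dataset)
by_limit <- edhw(vars = "type_of_inscription", as = "list", limit = 10:12)

# id given as an integer, intended to pick one specific hd_nr record;
# limit is ignored whenever id is specified
by_id <- edhw(vars = "type_of_inscription", as = "list", id = 15)
```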
Component "people" is a separate list in the EDH dataset, and it should be treated as a separate case from the rest of the variables.
When the output is a data frame, the default is a `long' type table; that is, records can appear in different rows and each variable is assigned to a single column. With this option it is possible to select "people" variables like gender and origin.
When choosing people variables with select and a data frame output, the "people" attribute must be in vars.
By setting "wide" in type, it is possible to place the different people from a single entry column by column in the data frame, so that each record has a single row. Finally, argument split allows for dividing the data in the data frame into groups by `id', which corresponds to the HD number of the inscription in the EDH dataset.
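To make the two table layouts concrete, the sketch below requests the "people" component of the same four entries in both formats. It assumes the sdam package provides edhw and EDH, and the object names long_df and wide_df are illustrative only.

```r
# Assumption: 'edhw' and the 'EDH' dataset come from the 'sdam' package
library(sdam)
data(EDH)

# long format: a record with several people can span several rows
long_df <- edhw(vars = "people", as = "df", type = "long", limit = 4)

# wide format: people of one entry are placed column by column,
# so each record takes a single row
wide_df <- edhw(vars = "people", as = "df", type = "wide", limit = 4)

# compare the shapes of the two tables
dim(long_df)
dim(wide_df)
```

Adding split = TRUE to the long call is the documented way to divide the resulting table into groups by id.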
Arguments province and gender are ad hoc for EDH entries; they serve to enter a Roman province and people's gender in x as a data frame. Otherwise, these arguments are ignored.
When province is used, it is possible to refer to a customized list of provinces with argument rp; otherwise, dataset rp is the default, where names and abbreviations are accepted.
Argument ldf is a flag for when the input in x is a created list of data frames that is organized by variables rather than by records as in the EDH dataset.
A list or a data frame with a long or wide format, depending on the input arguments.
Argument province with no vars returns a list of lists.
https://edh.ub.uni-heidelberg.de/data/api
Warning messages are given for the EDH dataset as the input, and when choosing the province argument alone.
if (FALSE) {
## load dataset
data(EDH)
## make a list for three variables in 'EDH' for first 4 entries
edhw(vars = c("type_of_inscription", "not_after", "not_before"), limit = 4)
## select 'gender' from 'people' together with the two date variables
edhw(vars = c("people", "not_after", "not_before"), select = "gender", limit = 4)
}