cln.Rd
A function to re-encode Greek (and other) characters and to remove symbols.
cln(x, level, what, na.rm, case, repl, unlist)
x: a vector, list or dataframe
level: optional clean level, from 0 for no-clean, through the default 1, to the most strict 9 (see details)
what: additional characters to clean (optional)
na.rm: remove entries with NA data? (optional and logical)
case: case for text, 1 for 1st uppercase, 2 for lowercase, 3 for uppercase (optional)
repl: data frame with text to replace (optional)
unlist: return a vector? (optional and logical, for vector input)
This function is meant to re-encode Greek (and other) characters in the EDH dataset, given either as a list, a vector, or a dataframe produced, for example, with function edhw.
By default, the symbols "?", "*" and "+" placed at the end of each record are removed after the re-encoding.
However, when level is 0, only re-encoding is performed. Level 2 either forces an extra iteration of the re-encoding, removes extra spaces, or, when what is given, removes the characters in what at the end of a record.
With level 9, all content after an opening parenthesis is removed, with all the consequences for the input text.
With repl, it is possible to replace text using a data frame with two columns, one for the 'text to replace' and one for the 'text that replaces'.
Disabling option unlist returns a vector when x is also a vector; otherwise, it returns a list with the two versions of the input.
Depending on the input, a vector, list or dataframe.
Encoding the same input more than once requires restarting the console; otherwise, the re-encoding is not complete.
# clean Greek characters
cln("Caesar?*+")
#> [1] "Caesar*+"
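
# The calls below are a sketch of the remaining arguments, written only from
# the descriptions in this help page. Outputs are omitted because they have
# not been verified. The input vector, the value passed to 'what' and the
# column names of the replacement data frame are hypothetical, and the
# package that provides cln() is assumed to be attached already.

x <- c("Caesar?*+", "Augustus (?)")   # hypothetical input vector

# level 0: re-encode only, without removing symbols
cln(x, level = 0)

# level 9: content after an opening parenthesis is removed
cln(x, level = 9)

# level 2 with 'what': also clean an extra character at the end of a record
# (passing a single character string is an assumption)
cln(x, level = 2, what = "!")

# case: 1 for 1st uppercase, 2 for lowercase, 3 for uppercase
cln(x, case = 3)

# repl: two-column data frame, first the text to replace, then the text
# that replaces it (column names are hypothetical)
repl <- data.frame(from = "Caesar", to = "Caes.", stringsAsFactors = FALSE)
cln(x, repl = repl)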