| Title: | Convenient Functions for Exploratory Data Analysis |
|---|---|
| Description: | A collection of convenient functions to facilitate common tasks in exploratory data analysis. Some common tasks include generating summary tables of variables, displaying tables as a 'flextable' or a 'kable' and visualising variables using 'ggplot2'. Labels stating the source file with run time can be easily generated for annotation in tables and plots. |
| Authors: | Tomas Sou [aut, cre] (ORCID: <https://orcid.org/0000-0002-7570-5545>) |
| Maintainer: | Tomas Sou <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.0.7.9000 |
| Built: | 2026-05-25 14:49:48 UTC |
| Source: | https://github.com/soutomas/edar |
Copy files to destination and rename with date and a tag as desired.
fc(..., des = "", tag = "", td = TRUE)
... |
|
des |
|
tag |
|
td |
|
A logical vector indicating if the operation succeeded for each of the files.
## Not run: 
# Copy files to a temporary directory
tmp = tempdir()
fc("f1.R","f2.R",des=tmp)
## End(Not run)
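The copy-and-rename behaviour described above can be sketched in base R. This is only an illustration, not the package's implementation: the helper name `copy_dated` and the `name_YYYYMMDD_tag.ext` naming pattern are assumptions.

```r
# Hypothetical helper: copy files to `des`, appending the date and a tag
# to each file name. The naming pattern is an assumption for illustration.
copy_dated <- function(files, des = ".", tag = "", td = TRUE) {
  ext  <- tools::file_ext(files)
  stem <- tools::file_path_sans_ext(basename(files))
  date <- if (td) format(Sys.Date(), "_%Y%m%d") else ""
  tag  <- if (nzchar(tag)) paste0("_", tag) else ""
  # Re-attach the extension only when the file has one
  new  <- paste0(stem, date, tag, ifelse(nzchar(ext), paste0(".", ext), ""))
  # file.copy() returns one logical value per file
  file.copy(files, file.path(des, new))
}

# Usage: copy a scratch file into a fresh sub-directory of tempdir()
src <- tempfile(fileext = ".R")
writeLines("x <- 1", src)
des <- file.path(tempdir(), "fc_demo")
dir.create(des, showWarnings = FALSE)
ok <- copy_dated(src, des = des, tag = "v1")
```

Like the documented return value, `ok` is a logical vector with one element per input file.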
Sugar function for default flextable output.
ft(d, fnote = NULL, ttl = NULL, sig = 8, dig = 2, src = 0, omit = "")
d |
|
fnote |
|
ttl |
|
sig |
|
dig |
|
src |
|
omit |
|
A flextable object.
mtcars |> head() |> ft()
mtcars |> head() |> ft(src=1)
mtcars |> head() |> ft("Footnote")
mtcars |> head() |> ft("Footnote",src=1)
mtcars |> head() |> ft(sig=2,dig=1)
Sugar function to set flextable defaults.
The arguments are passed to flextable::set_flextable_defaults().
ft_def(
  show = FALSE,
  font = "Calibri Light",
  fsize = 10,
  pad = 3,
  na = "",
  nan = "",
  ...
)
show |
|
font |
|
fsize |
|
pad |
|
na |
|
nan |
|
... |
Additional arguments to pass to |
A list containing previous default values.
flextable::set_flextable_defaults().
## Not run: 
ft_def()
## End(Not run)
Compute geometric coefficient of variation (GCV).
geo_cv(x)
x |
|
Geometric coefficient of variation.
geo_cv(rlnorm(10))
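For reference, a common definition of the GCV for log-normally distributed data is sqrt(exp(s^2) - 1), where s is the standard deviation of the natural logs. The sketch below uses that definition in base R; the helper name `gcv` is hypothetical, and the package may differ in details such as NA handling or scaling to a percentage.

```r
# Hypothetical base-R sketch of the geometric coefficient of variation.
# Assumes strictly positive values and the log-normal definition
# sqrt(exp(sd(log(x))^2) - 1); not taken from the package source.
gcv <- function(x) sqrt(exp(sd(log(x))^2) - 1)

set.seed(1)
gcv(rlnorm(10))

# Constant data has no spread on the log scale, so the GCV is zero
gcv(c(2, 2, 2))
```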
Compute geometric mean.
geo_mean(x)
x |
|
Geometric mean.
geo_mean(rlnorm(10))
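The geometric mean is conventionally the exponential of the mean of the logs. A minimal base-R sketch follows; the helper name `gm` is hypothetical, and the package function may treat missing or non-positive values differently.

```r
# Hypothetical base-R equivalent: exponentiate the mean of the logs.
# Assumes all values are strictly positive.
gm <- function(x) exp(mean(log(x)))

x <- c(1, 10, 100)
gm(x)  # the geometric mean of 1, 10 and 100 is 10
```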
Compute geometric standard deviation (GSD).
geo_sd(x)
x |
|
Geometric standard deviation.
geo_sd(rlnorm(10))
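The GSD is usually defined as the exponential of the standard deviation of the logs. The base-R sketch below follows that convention; the helper name `gsd` is hypothetical and the code is not taken from the package.

```r
# Hypothetical base-R sketch of the geometric standard deviation.
# Assumes strictly positive values.
gsd <- function(x) exp(sd(log(x)))

# The logs of 1, 10, 100 are evenly spaced by log(10),
# so their standard deviation is log(10) and the GSD is 10
gsd(c(1, 10, 100))
```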
Create box plots for a chosen variable by all discrete covariates in a dataset. Numeric variables will be dropped, except the chosen variable to plot.
ggbox(d, var, cats, alpha = 0.1, show = TRUE, nsub = TRUE, ...)
d |
|
var |
|
cats |
|
alpha |
|
show |
|
nsub |
|
... |
Additional arguments for ggplot2::geom_boxplot. |
A ggplot object.
d = mtcars |> dplyr::mutate(dplyr::across(c(am,carb,cyl,gear,vs),factor))
d |> ggbox(mpg)
d |> ggbox(mpg,alpha=0.5)
d |> ggbox(mpg,show=FALSE)
d |> ggbox(mpg,nsub=FALSE)
d |> ggbox(mpg,c(cyl,vs))
Create histograms for all numeric variables in a dataset. Non-numeric variables will be dropped.
gghist(d, cols, bins = 30, nsub = TRUE, ...)
d |
|
cols |
|
bins |
|
nsub |
|
... |
Additional arguments for ggplot2::geom_histogram. |
A ggplot object.
iris |> gghist()
iris |> gghist(c(Sepal.Width,Sepal.Length))
Save ggplot with output path.
ggout(plt, fpath, lab = "", omit = "", ...)
plt |
A ggplot object. |
fpath |
|
lab |
|
omit |
|
... |
Other arguments to pass to |
The file path of the output.
## Not run: 
fpath = "../output.png"
iris |> gghist() |> ggout(fpath)
## End(Not run)
Add a label with the current source file path and run time to a ggplot object.
ggsrc(plt, span = 1, size = 8, col = "grey55", lab = NULL, omit = "")
plt |
A ggplot object. |
span |
|
size |
|
col |
|
lab |
|
omit |
|
A ggplot object with the added label.
p = mtcars |> ggxy(mpg,hp)
p |> ggsrc()
p |> ggsrc(lab="My label")
p |> ggsrc(lab="My label",omit="My ")
Create plots for time profile data such as PK and PD plots.
ggtpp(
  d,
  x,
  y,
  id,
  ...,
  nsub = TRUE,
  logx = FALSE,
  logy = FALSE,
  alpha_point = 0.2,
  alpha_line = 0.1,
  xlab = NULL,
  ylab = NULL,
  ttl = NULL,
  sttl = NULL,
  cap = NULL
)
d |
|
x, y
|
|
id |
|
... |
Arguments to pass to ggplot2::aes for additional mapping. |
nsub |
|
logx, logy
|
|
alpha_point |
|
alpha_line |
|
xlab, ylab
|
|
ttl, sttl, cap
|
|
A ggplot object.
Theoph |> ggtpp(x=Time,y=conc,id=Subject)
Create violin plots for a chosen variable by all discrete covariates in a dataset. Numeric variables will be dropped, except the chosen variable to plot.
ggvio(d, var, cats, alpha = 0.1, show = TRUE, nsub = TRUE, ...)
d |
|
var |
|
cats |
|
alpha |
|
show |
|
nsub |
|
... |
Additional arguments for ggplot2::geom_violin. |
A ggplot object.
d = mtcars |> dplyr::mutate(dplyr::across(c(am,carb,cyl,gear,vs),factor))
d |> ggvio(mpg)
d |> ggvio(mpg,alpha=0.5)
d |> ggvio(mpg,show=FALSE)
d |> ggvio(mpg,nsub=FALSE)
d |> ggvio(mpg,c(cyl,vs))
Create basic XY scatter plot for quick data exploration.
By default, the Pearson correlation coefficient and its p-value are shown using ggpubr::stat_cor().
For more complex plots, it is recommended to use ggplot2::ggplot2 directly.
ggxy(
  d,
  x,
  y,
  ...,
  lm = TRUE,
  se = TRUE,
  cor = TRUE,
  pv = 0.001,
  nsub = TRUE,
  legend = TRUE,
  asp = 1
)
d |
|
x, y
|
|
... |
Arguments to pass to |
lm |
|
se |
|
cor |
|
pv |
|
nsub |
|
legend |
|
asp |
|
A ggplot object.
mtcars |> ggxy(wt,hp)
mtcars |> ggxy(wt,hp,col=factor(gear))
mtcars |> ggxy(wt,hp,col=factor(gear),legend=FALSE)
mtcars |> ggxy(wt,hp,col=factor(gear),pch=factor(am))
mtcars |> ggxy(wt,hp,nsub=FALSE)
mtcars |> ggxy(wt,hp,pv=NULL)
mtcars |> ggxy(wt,hp,lm=FALSE)
mtcars |> ggxy(wt,hp,se=FALSE)
mtcars |> ggxy(wt,hp,cor=FALSE)
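To see the statistics that such a correlation annotation reports, the Pearson coefficient and p-value can be computed directly with `stats::cor.test()`. This sketch does not use the package or 'ggpubr'; the label format, and the reading of `pv` as a threshold below which the p-value is printed as "p < threshold", are assumptions.

```r
# Pearson correlation between weight and horsepower in mtcars
ct <- cor.test(mtcars$wt, mtcars$hp, method = "pearson")
r  <- unname(ct$estimate)
p  <- ct$p.value

# Hypothetical label format; `pv` is assumed to act as a display threshold
pv  <- 0.001
lab <- sprintf(
  "R = %.2f, %s", r,
  if (p < pv) sprintf("p < %s", format(pv)) else sprintf("p = %.3f", p)
)
lab
```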
Generate a vector of hex colour codes for the desired number of colours.
Colours are generated by evenly splitting hue in the range [0,360]
in the HCL colour space using grDevices::hcl.
The output is meant to follow the default colours used in ggplot2::ggplot2.
hexn(n, show = FALSE)
n |
|
show |
|
A vector of hex colour codes that can be used for plotting.
hexn(6,FALSE)
hexn(4,TRUE)
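The hue-splitting idea can be sketched with `grDevices::hcl()` alone. The helper name `hex_hues` is hypothetical, and the hue offsets (15 to 375), chroma (100) and luminance (65) are assumptions: they are the values commonly used to mimic the default discrete palette in 'ggplot2' and may not match what the package uses.

```r
# Hypothetical re-creation of evenly spaced HCL hues.
# n + 1 points are generated and the last one dropped, because the
# first and last hues coincide on the colour wheel.
hex_hues <- function(n) {
  hues <- seq(15, 375, length.out = n + 1)
  grDevices::hcl(h = hues, c = 100, l = 65)[seq_len(n)]
}

cols <- hex_hues(4)
cols
```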
Sugar function for default kable output.
kb(d, fnote = NULL, cap = NULL, sig = 8, dig = 2, src = 0, omit = "")
d |
|
fnote |
|
cap |
|
sig |
|
dig |
|
src |
|
omit |
|
A kable object.
mtcars |> head() |> kb()
mtcars |> head() |> kb(src=1)
mtcars |> head() |> kb("Footnote")
mtcars |> head() |> kb("Footnote",src=1)
mtcars |> head() |> kb(sig=2,dig=1)
Generate a label with the current source file path and run time,
assuming that the source file is in the current working directory.
In interactive sessions, the function is designed to work in a script file in RStudio
and uses rstudioapi to get the file path.
It returns an empty label when run directly in the console.
label_src(span = 1, omit = "", tz = TRUE, fname = FALSE)
span |
|
omit |
|
tz |
|
fname |
|
A label showing the source file path with a time stamp.
label_src()
label_src(tz=FALSE)
label_src(fname=TRUE)
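The RStudio lookup mentioned in the description can be illustrated with 'rstudioapi'. This is a rough sketch, not the package's code: the helper name `current_script` is hypothetical, and falling back to an empty string outside RStudio is an assumption based on the note about console use.

```r
# Hypothetical helper: path of the script open in the RStudio editor,
# or "" when RStudio (or the rstudioapi package) is not available.
current_script <- function() {
  if (requireNamespace("rstudioapi", quietly = TRUE) &&
      rstudioapi::isAvailable()) {
    ctx <- rstudioapi::getSourceEditorContext()
    if (!is.null(ctx)) return(ctx$path)
  }
  ""
}

path <- current_script()
```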
Generate a label with a time stamp indicating the run time.
label_tz(omit = "")
omit |
|
A label with time stamp.
label_tz()
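A run-time label of this kind can be built with `format(Sys.time(), ...)`. In the sketch below the helper name `stamp_label`, the time format, and the interpretation of `omit` as text to strip from the label are all assumptions.

```r
# Hypothetical sketch: build a time-stamp label and optionally
# remove a fixed piece of text from it.
stamp_label <- function(omit = "") {
  lab <- format(Sys.time(), "%Y-%m-%d %H:%M:%S %Z")
  if (nzchar(omit)) lab <- sub(omit, "", lab, fixed = TRUE)
  lab
}

lab <- stamp_label()
lab
```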
Summarise all continuous variables by group. Non-numeric variables will be dropped.
summ_by(
  d,
  cols,
  ...,
  pct = c(0.25, 0.75),
  geo = FALSE,
  xname = "",
  view = FALSE
)
d |
|
cols |
|
... |
|
pct |
|
geo |
|
xname |
|
view |
|
A data frame of summarised variables.
d = mtcars |> dplyr::mutate(vs=factor(vs), am=factor(am))
d |> summ_by()
d |> summ_by(geo=TRUE)
d |> summ_by(pct=c(0.1,0.9))
d |> summ_by(mpg)
d |> summ_by(mpg,vs)
d |> summ_by(mpg,vs,am)
d |> summ_by(c(mpg,disp))
d |> summ_by(c(mpg,disp),vs)
d |> summ_by(c(mpg,disp),vs,xname="mpg_")
# Grouping without column selection is possible but rarely useful in large datasets
d |> summ_by(,vs)
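For comparison, a grouped summary of one variable can be produced in base R with `aggregate()`. The statistics chosen here (n, mean, SD, quartiles, median) are an assumption about what a typical summary contains; the columns returned by the package function may differ.

```r
# Base-R sketch: summarise mpg by vs
d <- transform(mtcars, vs = factor(vs))
res <- aggregate(mpg ~ vs, data = d, FUN = function(x) {
  c(n      = length(x),
    mean   = mean(x),
    sd     = sd(x),
    q25    = unname(quantile(x, 0.25)),
    median = median(x),
    q75    = unname(quantile(x, 0.75)))
})
# aggregate() stores the statistics in a matrix column;
# flatten it into ordinary columns (mpg.n, mpg.mean, ...)
res <- do.call(data.frame, res)
res
```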
Summarise all categorical variables. Numeric variables will be dropped.
summ_cat(d, ..., var, view = FALSE)
d |
|
... |
|
var |
|
view |
|
A list containing summaries for all categorical variables or a data frame showing the summary of a selected variable.
d = mtcars |> dplyr::mutate(dplyr::across(c(cyl,vs,am,gear,carb),factor))
d |> summ_cat()
d |> summ_cat(cyl,vs)
d |> summ_cat(var=cyl)
d |> summ_cat(var=1)
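A single categorical summary of this kind can be approximated in base R with `table()` and `prop.table()`. The output layout (count plus percentage) is an assumption for illustration and is not taken from the package.

```r
# Base-R sketch: counts and percentages for one categorical variable
counts <- table(cyl = mtcars$cyl)
res <- as.data.frame(counts, responseName = "n")
res$percent <- as.vector(100 * prop.table(counts))
res
```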
Wrapper function to produce summary tables for two variables.
tab2v(d, x, y)
d |
|
x, y
|
|
A tabyl object.
# example code
mtcars |> tab2v(vs,cyl)
mtcars |> tab2v(vs,am)
mtcars |> tab2v(vs,gear)
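The tabyl return type suggests the function builds on the 'janitor' package, though that is an inference. A comparable two-way table with row and column totals can be sketched in base R as follows; it returns a plain `table` object, not a tabyl.

```r
# Base-R sketch: cross-tabulate vs by cyl and add marginal totals
tab <- addmargins(table(vs = mtcars$vs, cyl = mtcars$cyl))
tab
```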