R/smooth.R
smooth_density.Rd
Smooths x values using a density estimator, returning new x of the same length. Can be used with a dotplot (e.g. geom_dots(smooth = ...)) to create "density dotplots".
Supports automatic partial function application.
smooth_bounded(x, density = "bounded", bounds = c(NA, NA), ...)
smooth_unbounded(x, density = "unbounded", ...)
x: A numeric vector.
density: Density estimator to use for smoothing. One of:
- A function which takes a numeric vector and returns a list with elements x (giving grid points for the density estimator) and y (the corresponding densities). ggdist provides a family of functions following this format, including density_unbounded() and density_bounded().
- A string giving the suffix of a function name that starts with "density_"; e.g. "bounded" for density_bounded().
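To illustrate the two forms above, here is a minimal sketch of an estimator with the described return shape, built on base R's stats::density(). The name density_sketch is hypothetical and not part of ggdist, and the lookup at the end only mimics the "density_" suffix convention; it is not ggdist's own resolution code.

```r
# Hypothetical estimator (not part of ggdist): wraps stats::density() and
# returns the list(x = grid points, y = densities) shape described above.
density_sketch = function(x, ...) {
  d = stats::density(x, ...)
  list(x = d$x, y = d$y)
}

set.seed(1234)
d = density_sketch(rnorm(100), n = 256)
str(d)

# Illustration of the string form: the suffix "sketch" is resolved to the
# function named "density_sketch".
f = get(paste0("density_", "sketch"), mode = "function")
identical(f, density_sketch)
```

Assuming the function form works as described, such a function could then be supplied as density = density_sketch.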
bounds: Length-2 vector of min and max bounds. If a bound is NA, then that bound is replaced with min(x) or max(x). Thus, the default, c(NA, NA), means that the bounds used are range(x).
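The NA replacement rule can be sketched in base R as follows. The helper resolve_bounds is hypothetical, written only to show the rule, and is not a ggdist function.

```r
# Hypothetical helper (not part of ggdist): fill NA bounds from the data.
resolve_bounds = function(x, bounds = c(NA, NA)) {
  if (is.na(bounds[1])) bounds[1] = min(x)  # missing lower bound -> min(x)
  if (is.na(bounds[2])) bounds[2] = max(x)  # missing upper bound -> max(x)
  bounds
}

x = c(0.2, 0.5, 0.9)
b_default = resolve_bounds(x)            # both bounds taken from the data
b_lower = resolve_bounds(x, c(0, NA))    # lower bound fixed, upper from the data
```

With the default, the result matches range(x); fixing one bound leaves only the other to be estimated from the data.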
...: Arguments passed to the density estimator specified by density.
A numeric vector of length(x), where each entry is a smoothed version of the corresponding entry in x.
If x is missing, returns a partial application of itself. See automatic-partial-functions.
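The calling convention for a missing x can be pictured with a small base R sketch. The function smooth_sketch and its shift argument are hypothetical placeholders; ggdist's actual mechanism is the one documented under automatic-partial-functions and may differ in detail.

```r
# Hypothetical smoother (not part of ggdist) showing the calling convention:
# when x is missing, return a function that remembers the other arguments.
smooth_sketch = function(x, shift = 0) {
  if (missing(x)) {
    force(shift)  # evaluate now so the returned function captures the value
    return(function(x) smooth_sketch(x, shift = shift))
  }
  x + shift  # placeholder standing in for a real smoothing step
}

partial = smooth_sketch(shift = 1)   # x missing: returns a function
result = partial(c(1, 2, 3))         # same as smooth_sketch(c(1, 2, 3), shift = 1)
```

This is the pattern that lets a call such as smooth_bounded(bounds = c(0, 1)) be passed to geom_dots() before any data is available.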
Applies a kernel density estimator (KDE) to x, then uses weighted quantiles of the KDE to generate a new, smoothed set of x values. Plotted using a dotplot (e.g. geom_dots(smooth = "bounded") or geom_dots(smooth = smooth_bounded(...))), these values create a variation on a "density dotplot" (Zvinca 2018).
Such plots are recommended only for very large samples, where the precise positions of individual values are not particularly meaningful. In small samples, ordinary (unsmoothed) dotplots should generally be used.
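The smoothing step described above (a KDE followed by quantiles of that KDE) can be sketched in base R. This is an illustration under stated assumptions, not ggdist's implementation: smooth_kde_sketch is a hypothetical name, stats::density() stands in for the ggdist estimators, and the weighted quantiles are approximated by inverting a cumulative sum of the gridded densities.

```r
# Hypothetical sketch (not ggdist's implementation): replace the sorted
# values of x with evenly spaced quantiles of a kernel density estimate.
smooth_kde_sketch = function(x, ...) {
  n = length(x)
  if (n < 2) return(x)

  d = stats::density(x, ...)      # KDE on a grid: d$x (points), d$y (densities)
  cdf = cumsum(d$y) / sum(d$y)    # approximate CDF at the grid points
  p = stats::ppoints(n, a = 0.5)  # n evenly spaced probabilities, (i - 0.5) / n

  # Invert the CDF by linear interpolation; rule = 2 clamps to the grid ends.
  q = stats::approx(cdf, d$x, xout = p, rule = 2, ties = mean)$y

  # Give the i-th smallest quantile to the i-th smallest input value, so each
  # output entry corresponds to the input entry at the same position.
  out = numeric(n)
  out[order(x)] = q
  out
}

set.seed(1234)
x = rnorm(1000)
x_smooth = smooth_kde_sketch(x)
```

Because the output preserves the ordering of the input, points move most where the KDE and the raw sample disagree, which tends to be in regions of low density.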
Two variants are supplied by default:
- smooth_bounded(), which uses density_bounded(). Passes the bounds arguments to the estimator.
- smooth_unbounded(), which uses density_unbounded().
It is generally recommended to pick the smooth based on the known bounds of your data, e.g. by using smooth_bounded() with the bounds parameter if there are finite bounds, or smooth_unbounded() if both bounds are infinite.
Zvinca, Daniel. (2018). "In the pursuit of diversity in data visualization. Jittering data to access details." https://www.linkedin.com/pulse/pursuit-diversity-data-visualization-jittering-access-daniel-zvinca/.
Other dotplot smooths: smooth_discrete(), smooth_none()
library(ggplot2)
library(ggdist)  # provides geom_dots() and the smooth_*() functions

set.seed(1234)
x = rnorm(1000)

# basic dotplot is noisy
ggplot(data.frame(x), aes(x)) +
  geom_dots()

# density dotplot is smoother, but does move points (most noticeable
# in areas of low density)
ggplot(data.frame(x), aes(x)) +
  geom_dots(smooth = "unbounded")

# you can adjust the kernel and bandwidth...
ggplot(data.frame(x), aes(x)) +
  geom_dots(smooth = smooth_unbounded(kernel = "triangular", adjust = 0.5))

# for bounded data, you should use the bounded smoother
x_beta = rbeta(1000, 0.5, 0.5)
ggplot(data.frame(x_beta), aes(x_beta)) +
  geom_dots(smooth = smooth_bounded(bounds = c(0, 1)))