| Title: | Add Uncertainty to Data Visualisations |
|---|---|
| Description: | A 'ggplot2' extension for visualising uncertainty with the goal of signal suppression. Usually, uncertainty visualisation focuses on expressing uncertainty as a distribution or probability, whereas 'ggdibbler' differentiates itself by viewing an uncertainty visualisation as an adjustment to an existing graphic that incorporates the inherent uncertainty in the estimates. You provide the code for an existing plot, but replace any of the variables with a vector of distributions, and it will convert the visualisation into its signal suppression counterpart. |
| Authors: | Harriet Mason [aut, cre] (ORCID: <https://orcid.org/0009-0007-4568-8215>), Dianne Cook [aut, ths] (ORCID: <https://orcid.org/0000-0002-3813-7155>), Sarah Goodwin [aut, ths] (ORCID: <https://orcid.org/0000-0001-8894-8282>), Susan VanderPlas [aut, ths] (ORCID: <https://orcid.org/0000-0002-3803-0972>) |
| Maintainer: | Harriet Mason <[email protected]> |
| License: | GPL-3 |
| Version: | 0.6.5.9000 |
| Built: | 2026-05-13 15:49:07 UTC |
| Source: | https://github.com/harriet-mason/ggdibbler |
Identical to geom_vline, geom_hline and geom_abline, except that it will accept a distribution in place of any of the usual aesthetics.
geom_abline_sample(
  mapping = NULL, data = NULL, stat = "identity_sample", times = 10,
  seed = NULL, ..., slope, intercept, na.rm = FALSE, show.legend = NA,
  inherit.aes = FALSE
)

geom_hline_sample(
  mapping = NULL, data = NULL, stat = "identity_sample",
  position = "identity", ..., seed = NULL, times = 10, yintercept,
  na.rm = FALSE, show.legend = NA, inherit.aes = FALSE
)

geom_vline_sample(
  mapping = NULL, data = NULL, stat = "identity_sample",
  position = "identity", ..., times = 10, seed = NULL, xintercept,
  na.rm = FALSE, show.legend = NA, inherit.aes = FALSE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
times |
A parameter used to control the number of values sampled from each distribution. |
seed |
Sets the seed for the layer's random draw, which allows you to plot the same draw across multiple layers. |
... |
Other arguments passed on to
|
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
xintercept, yintercept, slope, intercept
|
Parameters that control the
position of the line. If these are set, |
A ggplot2 layer
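As a rough mental model of the `times` and `seed` arguments, the sketch below uses only base R. It is an illustration under stated assumptions, not ggdibbler's actual implementation: `expand_draws()` is a hypothetical helper, and it assumes every distribution is normal. Each uncertain value is replaced by `times` random draws, and fixing the seed makes two separate calls return the same draws.

```r
# Hypothetical helper (not part of ggdibbler): expand each normal
# distribution, given by its mean `mu` and standard deviation `sigma`,
# into `times` random draws.
expand_draws <- function(mu, sigma, times = 10, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)  # fix the RNG state for reproducibility
  n <- length(mu)
  data.frame(
    row   = rep(seq_len(n), each = times),   # which original observation
    draw  = rep(seq_len(times), times = n),  # which of the `times` draws
    value = rnorm(n * times,
                  mean = rep(mu, each = times),
                  sd   = rep(sigma, each = times))
  )
}

a <- expand_draws(c(2.3, 1.9, 3.2), c(0.5, 0.8, 0.7), times = 10, seed = 77)
b <- expand_draws(c(2.3, 1.9, 3.2), c(0.5, 0.8, 0.7), times = 10, seed = 77)

nrow(a)          # one row per observation per draw
identical(a, b)  # the same seed reproduces the same draws
```

In this sketch a larger `times` gives more rows per observation, which is why the examples in this manual often pair the sampled layers with a low `alpha`.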
# load libraries
library(ggplot2)
library(distributional)
library(ggdibbler)

# ggplot
p <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
# ggdibbler
q <- ggplot(uncertain_mtcars, aes(wt, mpg)) + geom_point_sample(alpha=0.3)

p + geom_abline(intercept = 20) # ggplot
q + geom_abline_sample(intercept = dist_normal(20, 1), alpha=0.3) # ggdibbler
p + geom_vline(xintercept = 5) # ggplot
q + geom_vline_sample(xintercept = dist_normal(5, 0.1), alpha=0.3) # ggdibbler
p + geom_hline(yintercept = 20) # ggplot
q + geom_hline_sample(yintercept = dist_normal(20, 1), alpha=0.3) # ggdibbler

# Calculate slope and intercept of line of best fit
# get coef and standard error
summary(lm(mpg ~ wt, data = mtcars))
# ggplot for coef
p + geom_abline(intercept = 37, slope = -5) # ggplot
# ggdibbler for coef AND standard error
p + geom_abline_sample(intercept = dist_normal(37, 1.8),
                       slope = dist_normal(-5, 0.56),
                       times=30, alpha=0.3) # ggdibbler
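The last example above types the coefficients and standard errors in by hand after reading them off `summary()`. A minimal sketch of extracting them programmatically, using only base R, might look like the following. The variable names are illustrative, and the commented-out plotting call assumes the `p` object and packages from the examples above.

```r
# Fit the same model as in the examples above
fit <- lm(mpg ~ wt, data = mtcars)

# Coefficient table: one row per term, with "Estimate" and "Std. Error" columns
est <- coef(summary(fit))

intercept_mean <- est["(Intercept)", "Estimate"]
intercept_se   <- est["(Intercept)", "Std. Error"]
slope_mean     <- est["wt", "Estimate"]
slope_se       <- est["wt", "Std. Error"]

round(c(intercept_mean, intercept_se, slope_mean, slope_se), 2)

# These values could then replace the hand-typed ones, e.g. (not run here):
# p + geom_abline_sample(
#   intercept = distributional::dist_normal(intercept_mean, intercept_se),
#   slope     = distributional::dist_normal(slope_mean, slope_se),
#   times = 30, alpha = 0.3
# )
```

Note that describing the intercept and slope as two separate normal distributions treats them as independent, which ignores the correlation between the two estimates.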
Identical to geom_bar, except that it will accept a distribution in place of any of the usual aesthetics.
geom_bar_sample(
  mapping = NULL, data = NULL, stat = "count_sample",
  position = "stack_dodge", ..., just = 0.5, times = 10, seed = NULL,
  lineend = "butt", linejoin = "mitre", na.rm = FALSE, show.legend = NA,
  inherit.aes = TRUE
)

geom_col_sample(
  mapping = NULL, data = NULL, stat = "identity_sample",
  position = "stack_dodge", ..., just = 0.5, times = 10, seed = NULL,
  lineend = "butt", linejoin = "mitre", na.rm = FALSE, show.legend = NA,
  inherit.aes = TRUE
)

stat_count_sample(
  mapping = NULL, data = NULL, geom = "bar", position = "stack_identity",
  ..., orientation = NA, times = 10, seed = NULL, na.rm = FALSE,
  show.legend = NA, inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
just |
Adjustment for column placement. Set to |
times |
A parameter used to control the number of values sampled from each distribution. |
seed |
Sets the seed for the layer's random draw, which allows you to plot the same draw across multiple layers. |
lineend |
Line end style (round, butt, square). |
linejoin |
Line join style (round, mitre, bevel). |
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
geom, stat
|
Override the default connection between |
orientation |
The orientation of the layer. The default ( |
A ggplot2 layer
library(distributional)
library(ggplot2)
library(ggdibbler)

# Set up data
g <- ggplot(mpg, aes(class)) # ggplot
q <- ggplot(uncertain_mpg, aes(class)) # ggdibbler

# Number of cars in each class:
g + geom_bar() # ggplot
q + geom_bar_sample() # ggdibbler - a
q + geom_bar_sample(position = "identity_identity", alpha=0.1) # ggdibbler - b

# make dataframe
df <- data.frame(trt = c("a", "b", "c"), outcome = c(2.3, 1.9, 3.2))
uncertain_df <- data.frame(trt = c("a", "b", "c"),
                           outcome = dist_normal(mean = c(2.3, 1.9, 3.2),
                                                 sd = c(0.5, 0.8, 0.7)))

# geom_col also has a sample counterpart
# ggplot
ggplot(df, aes(trt, outcome)) + geom_col()
# ggdibbler
ggplot(uncertain_df, aes(x=trt, y=outcome)) + geom_col_sample()

# ggplot
ggplot(mpg, aes(y = class)) +
  geom_bar(aes(fill = drv), position = position_stack(reverse = TRUE)) +
  theme(legend.position = "top")
# ggdibbler
ggplot(uncertain_mpg, aes(y = class)) +
  geom_bar_sample(aes(fill = drv), alpha=1,
                  position = position_stack_dodge(reverse = TRUE)) +
  theme(legend.position = "top")
Identical to geom_bin_2d, except that it will accept a distribution in place of any of the usual aesthetics.
geom_bin_2d_sample(
  mapping = NULL, data = NULL, stat = "bin2d_sample",
  position = "identity_dodge", ..., times = 10, seed = NULL,
  lineend = "butt", linejoin = "mitre", na.rm = FALSE, show.legend = NA,
  inherit.aes = TRUE
)

stat_bin_2d_sample(
  mapping = NULL, data = NULL, geom = "tile", position = "identity_dodge",
  ..., times = 10, seed = NULL, binwidth = NULL, bins = 30, breaks = NULL,
  drop = TRUE, boundary = NULL, closed = NULL, center = NULL,
  na.rm = FALSE, show.legend = NA, inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
times |
A parameter used to control the number of values sampled from each distribution. |
seed |
Sets the seed for the layer's random draw, which allows you to plot the same draw across multiple layers. |
lineend |
Line end style (round, butt, square). |
linejoin |
Line join style (round, mitre, bevel). |
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
geom, stat
|
Use to override the default connection between
|
binwidth |
The width of the bins. Can be specified as a numeric value
or as a function that takes x after scale transformation as input and
returns a single numeric value. When specifying a function along with a
grouping structure, the function will be called once per group.
The default is to use the number of bins in The bin width of a date variable is the number of days in each time; the bin width of a time variable is the number of seconds. |
bins |
Number of bins. Overridden by |
breaks |
Alternatively, you can supply a numeric vector giving
the bin boundaries. Overrides |
drop |
if |
closed |
One of |
center, boundary
|
bin position specifiers. Only one, |
A ggplot2 layer
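To make the `bins` and `breaks` arguments more concrete, here is a small base R sketch of two-dimensional binning: each axis is cut into `bins` equal-width intervals, and the observations falling in each cell are counted. This is a conceptual illustration only. The simulated data and variable names are assumptions, and ggplot2 and ggdibbler apply their own rules for bin boundaries (`boundary`, `center`, `closed`), so the counts need not match a real plot exactly.

```r
set.seed(42)        # simulated data, for illustration only
x <- runif(200)
y <- runif(200)
bins <- 10          # plays the role of the `bins` argument

# Equal-width breaks spanning the observed range of each variable
x_breaks <- seq(min(x), max(x), length.out = bins + 1)
y_breaks <- seq(min(y), max(y), length.out = bins + 1)

# Count the observations in every (x bin, y bin) cell
counts <- table(
  cut(x, x_breaks, include.lowest = TRUE),
  cut(y, y_breaks, include.lowest = TRUE)
)

dim(counts)  # bins x bins grid of cells
sum(counts)  # every observation lands in exactly one cell
```

Supplying explicit breaks, as `x_breaks` and `y_breaks` do here, corresponds to the idea behind the `breaks` argument, which takes precedence over a bin count.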
# ggplot
library(ggplot2)
library(ggdibbler)
d <- ggplot(smaller_diamonds, aes(x, y))
d + geom_bin_2d()

# ggdibbler
b <- ggplot(smaller_uncertain_diamonds, aes(x, y))
# the ggdibbler default position adjustment is dodging
b + geom_bin_2d_sample(times=100)
# but it can be changed to use transparency instead
b + geom_bin_2d_sample(position="identity", alpha=0.1)

# Still have the same options
d + geom_bin_2d(bins = 10) # ggplot
b + geom_bin_2d_sample(bins = 10) # ggdibbler
Identical to geom_boxplot, except that it will accept a distribution in place of any of the usual aesthetics.
geom_boxplot_sample(
  mapping = NULL, data = NULL, times = 10, seed = NULL,
  stat = "boxplot_sample", position = "identity", ..., outliers = TRUE,
  outlier.colour = NULL, outlier.color = NULL, outlier.fill = NULL,
  outlier.shape = NULL, outlier.size = NULL, outlier.stroke = 0.5,
  outlier.alpha = NULL, whisker.colour = NULL, whisker.color = NULL,
  whisker.linetype = NULL, whisker.linewidth = NULL, staple.colour = NULL,
  staple.color = NULL, staple.linetype = NULL, staple.linewidth = NULL,
  median.colour = NULL, median.color = NULL, median.linetype = NULL,
  median.linewidth = NULL, box.colour = NULL, box.color = NULL,
  box.linetype = NULL, box.linewidth = NULL, notch = FALSE,
  notchwidth = 0.5, staplewidth = 0, varwidth = FALSE, na.rm = FALSE,
  orientation = NA, show.legend = NA, inherit.aes = TRUE
)

stat_boxplot_sample(
  mapping = NULL, data = NULL, geom = "boxplot", position = "identity",
  ..., times = 10, orientation = NA, seed = NULL, coef = 1.5,
  quantile.type = 7, na.rm = FALSE, show.legend = NA, inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
times |
A parameter used to control the number of values sampled from each distribution. |
seed |
Sets the seed for the layer's random draw, which allows you to plot the same draw across multiple layers. |
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
outliers |
Whether to display ( |
outlier.colour, outlier.color, outlier.fill, outlier.shape, outlier.size, outlier.stroke, outlier.alpha
|
Default aesthetics for outliers. Set to |
whisker.colour, whisker.color, whisker.linetype, whisker.linewidth
|
Default aesthetics for the whiskers. Set to |
staple.colour, staple.color, staple.linetype, staple.linewidth
|
Default aesthetics for the staples. Set to |
median.colour, median.color, median.linetype, median.linewidth
|
Default aesthetics for the median line. Set to |
box.colour, box.color, box.linetype, box.linewidth
|
Default aesthetics for the boxes. Set to |
notch |
If |
notchwidth |
For a notched box plot, width of the notch relative to
the body (defaults to |
staplewidth |
The relative width of staples to the width of the box. Staples mark the ends of the whiskers with a line. |
varwidth |
If |
na.rm |
If |
orientation |
The orientation of the layer. The default ( |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
geom, stat
|
Use to override the default connection between
|
coef |
Length of the whiskers as multiple of IQR. Defaults to 1.5. |
quantile.type |
An integer between 1 and 9 setting the quantile algorithm
per |
A ggplot2 layer
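The `coef` and `quantile.type` arguments interact in a way that is easier to see with numbers. The base R sketch below follows the usual boxplot convention, as commonly described for ggplot2: whiskers reach the most extreme observations within `coef` times the IQR of the box edges, and anything beyond that is an outlier. The data vector is invented for illustration, and this is not a copy of the package's internal code.

```r
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 30)  # made-up data with one extreme value
coef <- 1.5                            # whisker length as a multiple of the IQR
quantile_type <- 7                     # quantile algorithm, see ?quantile

# Box edges and median
q <- quantile(x, probs = c(0.25, 0.5, 0.75), type = quantile_type, names = FALSE)
iqr <- q[3] - q[1]

# Fences beyond which observations count as outliers
lower_fence <- q[1] - coef * iqr
upper_fence <- q[3] + coef * iqr

# Whiskers end at the most extreme observations inside the fences
inside   <- x[x >= lower_fence & x <= upper_fence]
whiskers <- range(inside)
outliers <- x[x < lower_fence | x > upper_fence]

whiskers
outliers
```

With the sampled layers, a calculation of this kind would be repeated for each of the `times` draws, so each draw can produce slightly different boxes, whiskers and outliers.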
library(ggplot2)
library(ggdibbler)

# ggplot
p <- ggplot(mpg, aes(class, hwy))
p + geom_boxplot(alpha=0.5)
# ggdibbler, using alpha to manage overplotting
q <- ggplot(uncertain_mpg, aes(class, hwy))
q + geom_boxplot_sample(alpha=0.1)

# ggplot
p + geom_boxplot(varwidth = TRUE)
# ggdibbler
q + geom_boxplot_sample(alpha=0.1, varwidth = TRUE)

# ggplot
p + geom_boxplot(aes(colour = drv),
                 position = position_dodge(preserve = "single"))
# ggdibbler
q + geom_boxplot_sample(aes(colour = drv), alpha=0.05,
                        position = "dodge_identity")
Identical to geom_contour and geom_contour_filled, except that it will accept a distribution in place of any of the usual aesthetics.
geom_contour_sample(
  mapping = NULL, data = NULL, stat = "contour_sample",
  position = "identity", ..., times = 10, seed = NULL, bins = NULL,
  binwidth = NULL, breaks = NULL, arrow = NULL, arrow.fill = NULL,
  lineend = "butt", linejoin = "round", linemitre = 10, na.rm = FALSE,
  show.legend = NA, inherit.aes = TRUE
)

geom_contour_filled_sample(
  mapping = NULL, data = NULL, stat = "contour_filled_sample",
  position = "identity", ..., times = 10, seed = NULL, bins = NULL,
  binwidth = NULL, breaks = NULL, rule = "evenodd", lineend = "butt",
  linejoin = "round", linemitre = 10, na.rm = FALSE, show.legend = NA,
  inherit.aes = TRUE
)

stat_contour_sample(
  mapping = NULL, data = NULL, geom = "contour", position = "identity",
  ..., times = 10, seed = NULL, bins = NULL, binwidth = NULL,
  breaks = NULL, na.rm = FALSE, show.legend = NA, inherit.aes = TRUE
)

stat_contour_filled_sample(
  mapping = NULL, data = NULL, geom = "contour_filled",
  position = "identity", ..., times = 10, seed = NULL, bins = NULL,
  binwidth = NULL, breaks = NULL, na.rm = FALSE, show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
times |
A parameter used to control the number of values sampled from each distribution. |
seed |
Sets the seed for the layer's random draw, which allows you to plot the same draw across multiple layers. |
bins |
Number of contour bins. Overridden by |
binwidth |
The width of the contour bins. Overridden by |
breaks |
One of:
Overrides |
arrow |
Arrow specification, as created by |
arrow.fill |
fill colour to use for the arrow head (if closed). |
lineend |
Line end style (round, butt, square). |
linejoin |
Line join style (round, mitre, bevel). |
linemitre |
Line mitre limit (number greater than 1). |
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
rule |
Either |
geom |
The geometric object to use to display the data for this layer.
When using a
|
A ggplot2 layer
library(ggplot2)
library(dplyr)
library(ggdibbler)

faithfuld

# ggplot2
v <- ggplot(faithfuld |> filter(waiting>80) |> filter(eruptions >3),
            aes(waiting, eruptions, z = density))
v + geom_contour()
# ggdibbler
u <- ggplot(uncertain_faithfuld |> filter(waiting>80) |> filter(eruptions >3),
            aes(waiting, eruptions, z = density))
u + geom_contour_sample()

# use geom_contour_filled() for filled contours
# ggplot2
v + geom_contour_filled() # no error (point prediction)
# ggdibbler
u + geom_contour_filled_sample()
Identical to geom_count, except that it will accept a distribution in place of any of the usual aesthetics.
geom_count_sample(
  mapping = NULL, data = NULL, stat = "sum_sample", position = "identity",
  ..., times = 10, seed = NULL, na.rm = FALSE, show.legend = NA,
  inherit.aes = TRUE
)

stat_sum_sample(
  mapping = NULL, data = NULL, geom = "point", position = "identity",
  ..., times = 10, seed = NULL, na.rm = FALSE, show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
times |
A parameter used to control the number of values sampled from each distribution. |
seed |
Sets the seed for the layer's random draw, which allows you to plot the same draw across multiple layers. |
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
geom, stat
|
Use to override the default connection between
|
A ggplot2 layer
library(ggplot2)
library(ggdibbler)

# Discrete values have overplotting
# ggplot
ggplot(mpg, aes(cty, hwy)) + geom_point()
# ggdibbler
ggplot(uncertain_mpg, aes(cty, hwy)) + geom_point_sample()

# Can use geom_count to fix it
# ggplot
ggplot(mpg, aes(cty, hwy)) + geom_count()
# ggdibbler (alpha for resample overlap)
ggplot(uncertain_mpg, aes(cty, hwy)) + geom_count_sample(alpha=0.15)

# Best used in conjunction with scale_size_area
# ggplot
ggplot(mpg, aes(cty, hwy)) + geom_count() + scale_size_area()
# ggdibbler
ggplot(uncertain_mpg, aes(cty, hwy)) +
  geom_count_sample(alpha=0.15) +
  scale_size_area()
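The counting behind a count plot can be sketched in base R: observations that share the same pair of values are tallied, and the tally is what gets mapped to point size. The snippet uses the built-in `mtcars` data rather than `mpg` so that it runs without any packages. It illustrates the idea only and is not how ggdibbler computes its statistic.

```r
# Tally how many cars share each (cyl, gear) combination
counts <- as.data.frame(
  table(cyl = mtcars$cyl, gear = mtcars$gear),
  responseName = "n"
)

# Keep only combinations that actually occur; each would become one point
counts <- counts[counts$n > 0, ]

counts
sum(counts$n)  # every car is counted exactly once
```

In the sampled version each draw can land on a different location, which is why the ggdibbler examples above add `alpha` to handle overlap between resamples.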
Identical to geom_linerange, geom_errorbar, geom_crossbar, and geom_pointrange except that they will accept a distribution in place of any of the usual aesthetics.
geom_crossbar_sample(
  mapping = NULL, data = NULL, times = 10, seed = NULL,
  stat = "identity_sample", position = "identity", ...,
  middle.colour = NULL, middle.color = NULL, middle.linetype = NULL,
  middle.linewidth = NULL, box.colour = NULL, box.color = NULL,
  box.linetype = NULL, box.linewidth = NULL, fatten = deprecated(),
  na.rm = FALSE, orientation = NA, show.legend = NA, inherit.aes = TRUE
)

geom_errorbar_sample(
  mapping = NULL, data = NULL, stat = "identity_sample",
  position = "identity", ..., times = 10, orientation = NA, seed = NULL,
  lineend = "butt", na.rm = FALSE, show.legend = NA, inherit.aes = TRUE
)

geom_linerange_sample(
  mapping = NULL, data = NULL, stat = "identity_sample",
  position = "identity", ..., times = 10, orientation = NA, seed = NULL,
  lineend = "butt", na.rm = FALSE, show.legend = NA, inherit.aes = TRUE
)

geom_pointrange_sample(
  mapping = NULL, data = NULL, stat = "identity_sample",
  position = "identity", ..., times = 10, orientation = NA, seed = NULL,
  lineend = "butt", fatten = 4, na.rm = FALSE, show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
times |
A parameter used to control the number of values sampled from each distribution. |
seed |
Sets the seed for the layer's random draw, which allows you to plot the same draw across multiple layers. |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
middle.colour, middle.color, middle.linetype, middle.linewidth
|
Default aesthetics for the middle line. Set to |
box.colour, box.color, box.linetype, box.linewidth
|
Default aesthetics for the boxes. Set to |
fatten |
|
na.rm |
If |
orientation |
The orientation of the layer. The default ( |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
lineend |
Line end style (round, butt, square). |
A ggplot2 layer
library(ggplot2)
library(dplyr)
library(distributional)
library(ggdibbler)

# Create a simple example dataset
df <- data.frame(
  trt = factor(c(1, 1, 2, 2)),
  resp = c(1, 5, 3, 4),
  group = factor(c(1, 2, 1, 2)),
  upper = c(1.1, 5.3, 3.3, 4.2),
  lower = c(0.8, 4.6, 2.4, 3.6)
)
uncertain_df <- df |>
  group_by(trt, group) |>
  mutate(resp = dist_normal(resp, runif(1,0,0.2)),
         upper = dist_normal(upper, runif(1,0,0.2)),
         lower = dist_normal(lower, runif(1,0,0.2))
  )

p <- ggplot(df, aes(trt, resp, colour = group))
q <- ggplot(uncertain_df, aes(trt, resp, colour = group))

# ggplot
p + geom_linerange(aes(ymin = lower, ymax = upper), linewidth=4)
# ggdibbler
q + geom_linerange_sample(aes(ymin = lower, ymax = upper), linewidth=4)

# ggplot
p + geom_pointrange(aes(ymin = lower, ymax = upper))
# ggdibbler
q + geom_pointrange_sample(aes(ymin = lower, ymax = upper))

# ggplot
p + geom_crossbar(aes(ymin = lower, ymax = upper), width = 0.2)
# ggdibbler
q + geom_crossbar_sample(aes(ymin = lower, ymax = upper), width = 0.2)

# ggplot
p + geom_errorbar(aes(ymin = lower, ymax = upper), width = 0.2)
# ggdibbler
q + geom_errorbar_sample(aes(ymin = lower, ymax = upper), width = 0.2)
Identical to geom_segment and geom_curve, except that they will accept a distribution in place of any of the usual aesthetics.
geom_curve_sample(
  mapping = NULL, data = NULL, stat = "identity_sample",
  position = "identity", ..., times = 10, seed = NULL, curvature = 0.5,
  angle = 90, ncp = 5, arrow = NULL, arrow.fill = NULL, lineend = "butt",
  na.rm = FALSE, show.legend = NA, inherit.aes = TRUE
)

geom_segment_sample(
  mapping = NULL, data = NULL, stat = "identity_sample",
  position = "identity", ..., times = 10, seed = NULL, arrow = NULL,
  arrow.fill = NULL, lineend = "butt", linejoin = "round", na.rm = FALSE,
  show.legend = NA, inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
times |
A parameter used to control the number of values sampled from each distribution. |
seed |
Sets the seed for the layer's random draw, which lets you plot the same draw across multiple layers. |
curvature |
A numeric value giving the amount of curvature. Negative values produce left-hand curves, positive values produce right-hand curves, and zero produces a straight line. |
angle |
A numeric value between 0 and 180, giving an amount to skew the control points of the curve. Values less than 90 skew the curve towards the start point and values greater than 90 skew the curve towards the end point. |
ncp |
The number of control points used to draw the curve. More control points create a smoother curve. |
arrow |
specification for arrow heads, as created by |
arrow.fill |
fill colour to use for the arrow head (if closed). |
lineend |
Line end style (round, butt, square). |
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
linejoin |
Line join style (round, mitre, bevel). |
A ggplot2 layer
library(ggplot2)
library(distributional)
library(ggdibbler)

# ggplot
b <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point()
# ggdibbler
a <- ggplot(uncertain_mtcars, aes(wt, mpg)) +
  geom_point_sample(seed = 77, alpha = 0.5)

df <- data.frame(x1 = 2.62, x2 = 3.57, y1 = 21.0, y2 = 15.0)
uncertain_df <- data.frame(
  x1 = dist_normal(2.62, 0.1),
  x2 = dist_normal(3.57, 0.1),
  y1 = dist_normal(21.0, 0.1),
  y2 = dist_normal(15.0, 0.1)
)

# ggplot
b +
  geom_curve(aes(x = x1, y = y1, xend = x2, yend = y2, colour = "curve"), data = df) +
  geom_segment(aes(x = x1, y = y1, xend = x2, yend = y2, colour = "segment"), data = df)

# ggdibbler (the shared seed keeps the draws aligned across layers)
a +
  geom_curve_sample(aes(x = x1, y = y1, xend = x2, yend = y2, colour = "curve"),
                    data = uncertain_df, seed = 77, alpha = 0.5) +
  geom_segment_sample(aes(x = x1, y = y1, xend = x2, yend = y2, colour = "segment"),
                      data = uncertain_df, seed = 77, alpha = 0.5)
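The `times` and `seed` arguments are described only briefly, so a small sketch of what they mean may help. The helper `draw_samples()` below is hypothetical and is not part of ggdibbler. It mimics the documented behaviour (draw `times` values from each distribution, optionally after fixing the seed) with `generate()` from the distributional package, and makes no claim about how ggdibbler implements sampling internally.

```r
library(distributional)

# Hypothetical helper (not a ggdibbler function): draw `times` values from
# each distribution in a vector, optionally after fixing the RNG seed.
draw_samples <- function(dist, times = 10, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  generate(dist, times)
}

# Two uncertain x positions, as in the segment example
x <- dist_normal(mu = c(2.62, 3.57), sigma = 0.1)

# The same seed gives the same draws, which is what lets two layers line up
draws_a <- draw_samples(x, times = 10, seed = 77)
draws_b <- draw_samples(x, times = 10, seed = 77)

lengths(draws_a)              # one set of `times` draws per distribution
identical(draws_a, draws_b)   # same seed, same draws
```

Under this reading, giving several layers the same `seed` is what keeps their sampled positions consistent with one another.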
Identical to geom_density_2d() and geom_density_2d_filled, except that it will accept a distribution in place of any of the usual aesthetics.
geom_density_2d_sample(
  mapping = NULL,
  data = NULL,
  stat = "density_2d_sample",
  position = "identity",
  ...,
  times = 10,
  seed = NULL,
  arrow = NULL,
  arrow.fill = NULL,
  lineend = "butt",
  linejoin = "round",
  linemitre = 10,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)

geom_density_2d_filled_sample(
  mapping = NULL,
  data = NULL,
  stat = "density_2d_filled_sample",
  position = "identity",
  ...,
  times = 10,
  seed = NULL,
  rule = "evenodd",
  lineend = "butt",
  linejoin = "round",
  linemitre = 10,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)

stat_density_2d_sample(
  mapping = NULL,
  data = NULL,
  geom = "density_2d",
  position = "identity",
  ...,
  contour = TRUE,
  contour_var = "density",
  times = 10,
  seed = NULL,
  h = NULL,
  adjust = c(1, 1),
  n = 100,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)

stat_density_2d_filled_sample(
  mapping = NULL,
  data = NULL,
  geom = "density_2d_filled",
  position = "identity",
  ...,
  contour = TRUE,
  contour_var = "density",
  times = 10,
  seed = NULL,
  h = NULL,
  adjust = c(1, 1),
  n = 100,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Arguments passed on to
|
times |
A parameter used to control the number of values sampled from each distribution. |
seed |
Sets the seed for the layer's random draw, which lets you plot the same draw across multiple layers. |
arrow |
specification for arrow heads, as created by |
arrow.fill |
fill colour to use for the arrow head (if closed). |
lineend |
Line end style (round, butt, square). |
linejoin |
Line join style (round, mitre, bevel). |
linemitre |
Line mitre limit (number greater than 1). |
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
rule |
Either |
geom, stat
|
Use to override the default connection between
|
contour |
If |
contour_var |
Character string identifying the variable to contour
by. Can be one of |
h |
Bandwidth (vector of length two). If |
adjust |
A multiplicative bandwidth adjustment to be used if 'h' is
'NULL'. This makes it possible to adjust the bandwidth while still
using a bandwidth estimator. For example, |
n |
Number of grid points in each direction. |
A ggplot2 layer
library(ggplot2)
library(ggdibbler)

# ggplot
m <- ggplot(faithful, aes(x = eruptions, y = waiting)) +
  geom_point() +
  xlim(0.5, 6) +
  ylim(40, 110)
# contour lines
m + geom_density_2d()

# ggdibbler
n <- ggplot(uncertain_faithful, aes(x = eruptions, y = waiting)) +
  geom_point_sample(size = 2/10) +
  scale_x_continuous_distribution(limits = c(0.5, 6)) +
  scale_y_continuous_distribution(limits = c(40, 110))
n + geom_density_2d_sample(linewidth = 2/10, alpha = 0.5)

# contour bands
# ggplot
m + geom_density_2d_filled(alpha = 0.5)
# ggdibbler
n + geom_density_2d_filled_sample(alpha = 0.1)
Identical to geom_density, except that the fill for each density will be represented by a sample from each distribution.
geom_density_sample(
  mapping = NULL,
  data = NULL,
  stat = "density_sample",
  position = "identity",
  ...,
  outline.type = "upper",
  seed = NULL,
  times = 10,
  lineend = "butt",
  linejoin = "round",
  linemitre = 10,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)

stat_density_sample(
  mapping = NULL,
  data = NULL,
  geom = "area",
  position = "stack_identity",
  ...,
  orientation = NA,
  seed = NULL,
  times = 10,
  bw = "nrd0",
  adjust = 1,
  kernel = "gaussian",
  n = 512,
  trim = FALSE,
  bounds = c(-Inf, Inf),
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
outline.type |
Type of the outline of the area; |
seed |
Sets the seed for the layer's random draw, which lets you plot the same draw across multiple layers. |
times |
A parameter used to control the number of values sampled from each distribution. |
lineend |
Line end style (round, butt, square). |
linejoin |
Line join style (round, mitre, bevel). |
linemitre |
Line mitre limit (number greater than 1). |
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
geom, stat
|
Use to override the default connection between
|
orientation |
The orientation of the layer. The default ( |
bw |
The smoothing bandwidth to be used.
If numeric, the standard deviation of the smoothing kernel.
If character, a rule to choose the bandwidth, as listed in
|
adjust |
A multiplicative bandwidth adjustment. This makes it possible
to adjust the bandwidth while still using a bandwidth estimator.
For example, |
kernel |
Kernel. See list of available kernels in |
n |
number of equally spaced points at which the density is to be
estimated, should be a power of two, see |
trim |
If |
bounds |
Known lower and upper bounds for estimated data. Default
|
A ggplot2 layer
library(ggplot2)
library(ggdibbler)

# Basic density plot
# ggplot
ggplot(smaller_diamonds, aes(carat)) +
  geom_density()
# ggdibbler
ggplot(smaller_uncertain_diamonds, aes(carat)) +
  geom_density_sample(alpha = 0.5)

# ggplot
ggplot(smaller_diamonds, aes(depth, fill = cut, colour = cut)) +
  geom_density(alpha = 0.7) +
  xlim(55, 70)
# ggdibbler
ggplot(smaller_uncertain_diamonds, aes(depth, fill = cut)) +
  geom_density_sample(aes(colour = after_stat(fill)), alpha = 0.1) +
  # ggdibbler does not have an xlim (yet)
  scale_x_continuous_distribution(limits = c(55, 70)) +
  # bug: random variables have different colour
  theme(palette.colour.discrete = "viridis",
        palette.fill.discrete = "viridis")
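The `bw`, `adjust` and `n` arguments share their names and defaults with base R's `density()`. As a sketch of the arithmetic only, and assuming `stat_density_sample()` treats these arguments the way `density()` does (the documentation above does not confirm this), the base R call below shows that `adjust` multiplies the bandwidth chosen by the `bw` rule and that `n` sets the number of evaluation points.

```r
# Old Faithful eruption times ship with base R (datasets package)
x <- faithful$eruptions

# Same defaults as in the usage above: bw = "nrd0", kernel = "gaussian", n = 512
d_default <- density(x, bw = "nrd0", adjust = 1,   kernel = "gaussian", n = 512)
d_half    <- density(x, bw = "nrd0", adjust = 1/2, kernel = "gaussian", n = 512)

d_default$bw         # bandwidth picked by the "nrd0" rule
d_half$bw            # half of that, because adjust multiplies the bandwidth
length(d_default$x)  # number of points at which the density is estimated
```

A smaller `adjust` therefore gives a wigglier curve for every sampled draw, and a larger one gives a smoother curve.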
Identical to geom_dotplot, except that it will accept a distribution in place of any of the usual aesthetics.
geom_dotplot_sample(
  mapping = NULL,
  data = NULL,
  position = "identity",
  seed = NULL,
  ...,
  times = 10,
  binwidth = NULL,
  binaxis = "x",
  method = "dotdensity",
  binpositions = "bygroup",
  stackdir = "up",
  stackratio = 1,
  dotsize = 1,
  stackgroups = FALSE,
  origin = NULL,
  right = TRUE,
  width = 0.9,
  drop = FALSE,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
seed |
Sets the seed for the layer's random draw, which lets you plot the same draw across multiple layers. |
... |
Other arguments passed on to
|
times |
A parameter used to control the number of values sampled from each distribution. |
binwidth |
When |
binaxis |
The axis to bin along, "x" (default) or "y" |
method |
"dotdensity" (default) for dot-density binning, or "histodot" for fixed bin widths (like stat_bin) |
binpositions |
When |
stackdir |
which direction to stack the dots. "up" (default), "down", "center", "centerwhole" (centered, but with dots aligned) |
stackratio |
how close to stack the dots. Default is 1, where dots just touch. Use smaller values for closer, overlapping dots. |
dotsize |
The diameter of the dots relative to |
stackgroups |
should dots be stacked across groups? This has the effect
that |
origin |
When |
right |
When |
width |
When |
drop |
If TRUE, remove all bins with zero counts |
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
A ggplot2 layer
library(ggplot2)
library(ggdibbler)

# ggplot
ggplot(mtcars, aes(x = mpg)) +
  geom_dotplot()
# ggdibbler
ggplot(uncertain_mtcars, aes(x = mpg)) +
  geom_dotplot_sample(alpha = 0.2)

# ggplot
ggplot(mtcars, aes(x = mpg)) +
  geom_dotplot(binwidth = 1.5)
# ggdibbler
ggplot(uncertain_mtcars, aes(x = mpg)) +
  geom_dotplot_sample(binwidth = 1.5, alpha = 0.2)

# Use fixed-width bins
# ggplot
ggplot(mtcars, aes(x = mpg)) +
  geom_dotplot(method = "histodot", binwidth = 1.5)
# ggdibbler
ggplot(uncertain_mtcars, aes(x = mpg)) +
  geom_dotplot_sample(method = "histodot", binwidth = 1.5, alpha = 0.2)
Identical to geom_histogram, geom_freqpoly, and stat_bin, except that it will accept a distribution in place of any of the usual aesthetics.
geom_freqpoly_sample(
  mapping = NULL,
  data = NULL,
  stat = "bin_sample",
  position = "identity",
  ...,
  na.rm = FALSE,
  times = 10,
  seed = NULL,
  show.legend = NA,
  inherit.aes = TRUE
)

geom_histogram_sample(
  mapping = NULL,
  data = NULL,
  stat = "bin_sample",
  position = "stack_dodge",
  ...,
  times = 10,
  seed = NULL,
  binwidth = NULL,
  bins = NULL,
  orientation = NA,
  lineend = "butt",
  linejoin = "mitre",
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)

stat_bin_sample(
  mapping = NULL,
  data = NULL,
  geom = "bar",
  position = "stack_dodge",
  ...,
  times = 10,
  orientation = NA,
  seed = NULL,
  binwidth = NULL,
  bins = NULL,
  center = NULL,
  boundary = NULL,
  closed = c("right", "left"),
  pad = FALSE,
  breaks = NULL,
  drop = "none",
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
na.rm |
If |
times |
A parameter used to control the number of values sampled from each distribution. |
seed |
Sets the seed for the layer's random draw, which lets you plot the same draw across multiple layers. |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
binwidth |
The width of the bins. Can be specified as a numeric value
or as a function that takes x after scale transformation as input and
returns a single numeric value. When specifying a function along with a
grouping structure, the function will be called once per group.
The default is to use the number of bins in The bin width of a date variable is the number of days in each time; the bin width of a time variable is the number of seconds. |
bins |
Number of bins. Overridden by |
orientation |
The orientation of the layer. The default ( |
lineend |
Line end style (round, butt, square). |
linejoin |
Line join style (round, mitre, bevel). |
geom, stat
|
Use to override the default connection between
|
center, boundary
|
bin position specifiers. Only one, |
closed |
One of |
pad |
If |
breaks |
Alternatively, you can supply a numeric vector giving
the bin boundaries. Overrides |
drop |
Treatment of zero count bins. If |
A ggplot2 layer
# load ggplot and ggdibbler
library(ggplot2)
library(ggdibbler)

# ggplot
ggplot(smaller_diamonds, aes(carat)) +
  geom_histogram()
# ggdibbler
ggplot(smaller_uncertain_diamonds, aes(carat)) +
  geom_histogram_sample()
# ggdibbler, overlaying the draws with alpha
ggplot(smaller_uncertain_diamonds, aes(carat)) +
  geom_histogram_sample(position = "identity_identity", alpha = 0.15)

# ggplot
ggplot(smaller_diamonds, aes(price, colour = cut)) +
  geom_freqpoly(binwidth = 500)
# ggdibbler
ggplot(smaller_uncertain_diamonds, aes(price, colour = cut)) +
  geom_freqpoly_sample(binwidth = 500)

# ggplot
ggplot(smaller_diamonds, aes(price, fill = cut)) +
  geom_histogram(binwidth = 500)
# ggdibbler
ggplot(smaller_uncertain_diamonds, aes(price, fill = cut)) +
  geom_histogram_sample(binwidth = 500)
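One way to picture why a sampled histogram looks different from an ordinary one is to bin each draw separately. The sketch below does this by hand with the distributional package and base R. It is an illustration of the idea under that assumption, not ggdibbler's implementation, and the `carat` values and bin edges are made up for the example.

```r
library(distributional)

set.seed(1)

# Five uncertain observations; two sit exactly on a bin edge (0.5 and 1.0)
carat <- dist_normal(mu = c(0.3, 0.5, 0.7, 1.0, 1.2), sigma = 0.05)
times <- 10

# One numeric vector of `times` draws per distribution,
# stacked so that rows are observations and columns are draws
draws <- generate(carat, times)
draw_matrix <- do.call(rbind, draws)

# Bin every draw (column) on the same edges
breaks <- c(-Inf, 0.5, 1, Inf)
counts <- apply(draw_matrix, 2, function(d) table(cut(d, breaks = breaks)))

counts           # bins in rows, draws in columns
colSums(counts)  # every draw still contains all five observations
```

Observations near a bin edge move between bins from draw to draw, so bar heights vary across draws. That variation is the uncertainty the sampled layers are meant to show.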
Identical to geom_hex, except that it will accept a distribution in place of any of the usual aesthetics.
geom_hex_sample(
  mapping = NULL,
  data = NULL,
  stat = "bin_hex_sample",
  position = "identity",
  ...,
  times = 10,
  seed = NULL,
  lineend = "butt",
  linejoin = "mitre",
  linemitre = 10,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)

stat_bin_hex_sample(
  mapping = NULL,
  data = NULL,
  geom = "hex",
  position = "identity",
  ...,
  times = 10,
  seed = NULL,
  binwidth = NULL,
  bins = 30,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
times |
A parameter used to control the number of values sampled from each distribution. |
seed |
Sets the seed for the layer's random draw, which lets you plot the same draw across multiple layers. |
lineend |
Line end style (round, butt, square). |
linejoin |
Line join style (round, mitre, bevel). |
linemitre |
Line mitre limit (number greater than 1). |
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
geom, stat
|
Override the default connection between |
binwidth |
The width of the bins. Can be specified as a numeric value
or as a function that takes x after scale transformation as input and
returns a single numeric value. When specifying a function along with a
grouping structure, the function will be called once per group.
The default is to use the number of bins in The bin width of a date variable is the number of days in each time; the bin width of a time variable is the number of seconds. |
bins |
Number of bins. Overridden by |
A ggplot2 layer
library(ggplot2)
library(ggdibbler)

# ggplot
d <- ggplot(smaller_diamonds, aes(carat, price))
d + geom_hex()
# ggdibbler
b <- ggplot(smaller_uncertain_diamonds, aes(carat, price))
b + geom_hex_sample(alpha = 0.15)

# You still have access to all the same parameters
d + geom_hex(bins = 10)
b + geom_hex_sample(bins = 10, alpha = 0.15)
Identical to geom_jitter, except that it will accept a distribution in place of any of the usual aesthetics.
geom_jitter_sample(
  mapping = NULL,
  data = NULL,
  stat = "identity_sample",
  position = "jitter",
  ...,
  width = NULL,
  height = NULL,
  na.rm = FALSE,
  times = 10,
  seed = NULL,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
width, height
|
Amount of vertical and horizontal jitter. The jitter is added in both positive and negative directions, so the total spread is twice the value specified here. If omitted, defaults to 40% of the resolution of the data: this means the jitter values will occupy 80% of the implied bins. Categorical data is aligned on the integers, so a width or height of 0.5 will spread the data so it's not possible to see the distinction between the categories. |
na.rm |
If |
times |
A parameter used to control the number of values sampled from each distribution. |
seed |
Sets the seed for the layer's random draw, which lets you plot the same draw across multiple layers. |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
A ggplot2 layer
library(ggplot2)
library(ggdibbler)

p <- ggplot(mpg, aes(cyl, hwy))           # ggplot
q <- ggplot(uncertain_mpg, aes(cyl, hwy)) # ggdibbler

# ggplot
p + geom_point()
# ggdibbler
q + geom_point_sample(times = 10)

# ggplot
p + geom_jitter()
# ggdibbler
q + geom_jitter_sample(times = 10)

# Add aesthetic mappings
p + geom_jitter(aes(colour = class))        # ggplot
q + geom_jitter_sample(aes(colour = class)) # ggdibbler
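The description of `width` and `height` packs in some arithmetic: the default is 40% of the data resolution, and the jitter goes in both directions, so the total spread is twice the value given. The sketch below works through that for the `cyl` values used in the example, in base R only. It assumes uniform jitter on `[-width, +width]`, and computes the resolution by hand as the smallest gap between distinct values.

```r
# Distinct cylinder counts in the mpg data: 4, 5, 6, 8
cyl <- c(4, 5, 6, 8)

# Resolution: the smallest gap between distinct values (here 1)
res <- min(diff(sort(unique(cyl))))

# Default jitter amount: 40% of the resolution
default_width <- 0.4 * res

# Jitter is added in both directions, so the total spread is 2 * width,
# that is 80% of the gap between neighbouring values
total_spread <- 2 * default_width

set.seed(1)
jittered <- cyl + runif(length(cyl), -default_width, default_width)

default_width
total_spread
all(abs(jittered - cyl) <= default_width)
```

With a width of 0.5 the total spread would equal the full gap between neighbouring categories, which is why the description warns that categories become indistinguishable at that setting.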
Identical to geom_text and geom_label except that it will accept a distribution in place of any of the usual aesthetics.
geom_label_sample(
  mapping = NULL,
  data = NULL,
  times = 10,
  seed = NULL,
  stat = "identity_sample",
  position = "nudge",
  ...,
  parse = FALSE,
  label.padding = unit(0.25, "lines"),
  label.r = unit(0.15, "lines"),
  label.size = deprecated(),
  border.colour = NULL,
  border.color = NULL,
  text.colour = NULL,
  text.color = NULL,
  size.unit = "mm",
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)

geom_text_sample(
  mapping = NULL,
  data = NULL,
  stat = "identity_sample",
  position = "nudge",
  ...,
  times = 10,
  seed = NULL,
  parse = FALSE,
  check_overlap = FALSE,
  size.unit = "mm",
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
times |
A parameter used to control the number of values sampled from each distribution. |
seed |
Sets the seed for the layer's random draw, which lets you plot the same draw across multiple layers. |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
parse |
If |
label.padding |
Amount of padding around label. Defaults to 0.25 lines. |
label.r |
Radius of rounded corners. Defaults to 0.15 lines. |
label.size |
|
border.colour, border.color
|
Colour of label border. When |
text.colour, text.color
|
Colour of the text. When |
size.unit |
How the |
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
check_overlap |
If |
A ggplot2 geom representing a point_sample which can be added to a ggplot object
A ggplot2 layer
library(ggplot2)
library(ggdibbler)

p <- ggplot(mtcars, aes(wt, mpg, label = rownames(mtcars)))
q <- ggplot(uncertain_mtcars, aes(wt, mpg, label = rownames(uncertain_mtcars)))

# Text example
p + geom_text()                                # ggplot
q + geom_text_sample(times = 3, alpha = 0.5)   # ggdibbler

# Labels with background
p + geom_label()                               # ggplot
q + geom_label_sample(times = 3, alpha = 0.5)  # ggdibbler

# Random text with constant position (harder to read signal suppression)
# ggplot
ggplot(mtcars, aes(wt, mpg, label = cyl)) +
  geom_text(size = 6)
# ggdibbler
ggplot(uncertain_mtcars, aes(mean(wt), mean(mpg), lab = cyl)) +
  geom_text_sample(aes(label = after_stat(lab)), size = 6, alpha = 0.3)
Identical to geom_path, geom_line, and geom_step, except that it will accept a distribution in place of any of the usual aesthetics.
geom_path_sample(
  mapping = NULL,
  data = NULL,
  stat = "identity_sample",
  position = "identity",
  ...,
  times = 10,
  seed = NULL,
  arrow = NULL,
  arrow.fill = NULL,
  lineend = "butt",
  linejoin = "round",
  linemitre = 10,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)

geom_line_sample(
  mapping = NULL,
  data = NULL,
  stat = "identity_sample",
  position = "identity",
  ...,
  times = 10,
  seed = NULL,
  orientation = NA,
  arrow = NULL,
  arrow.fill = NULL,
  lineend = "butt",
  linejoin = "round",
  linemitre = 10,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)

geom_step_sample(
  mapping = NULL,
  data = NULL,
  stat = "identity_sample",
  position = "identity",
  ...,
  times = 10,
  seed = NULL,
  orientation = NA,
  lineend = "butt",
  linejoin = "round",
  linemitre = 10,
  arrow = NULL,
  arrow.fill = NULL,
  direction = "hv",
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
times |
A parameter used to control the number of values sampled from each distribution. |
seed |
Sets the seed for the layer's random draw, which lets you plot the same draw across multiple layers. |
arrow |
Arrow specification, as created by |
arrow.fill |
fill colour to use for the arrow head (if closed). |
lineend |
Line end style (round, butt, square). |
linejoin |
Line join style (round, mitre, bevel). |
linemitre |
Line mitre limit (number greater than 1). |
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
orientation |
The orientation of the layer. The default ( |
direction |
direction of stairs: 'vh' for vertical then horizontal, 'hv' for horizontal then vertical, or 'mid' for step half-way between adjacent x-values. |
A ggplot2 layer
library(ggplot2)
library(dplyr)
library(distributional)
library(ggdibbler)

# ggplot
ggplot(economics, aes(date, unemploy)) + geom_line()
# ggdibbler
ggplot(uncertain_economics, aes(date, unemploy)) +
  geom_line_sample(alpha = 0.1)

# geom_step() is useful when you want to highlight exactly when
# the y value changes
recent <- economics[economics$date > as.Date("2013-01-01"), ]
uncertain_recent <- uncertain_economics[uncertain_economics$date > as.Date("2013-01-01"), ]
# ggplot
ggplot(recent, aes(date, unemploy)) + geom_step()
# ggdibbler
ggplot(uncertain_recent, aes(date, unemploy)) +
  geom_step_sample(alpha = 0.5)

# geom_path() lets you explore how two variables are related over time
# ggplot
m <- ggplot(economics, aes(unemploy, psavert))
m + geom_path(aes(colour = as.numeric(date)))
# ggdibbler
n <- ggplot(uncertain_economics, aes(unemploy, psavert))
n + geom_path_sample(aes(colour = as.numeric(date)), alpha = 0.15)

# You can use NAs to break the line.
df <- data.frame(x = 1:5, y = c(1, 2, NA, 4, 5))
uncertain_df <- df |> mutate(y = dist_normal(y, 0.3))
# ggplot
ggplot(df, aes(x, y)) + geom_point() + geom_line()
# ggdibbler: the shared seed makes both layers use the same draws
ggplot(uncertain_df, aes(x, y)) +
  geom_point_sample(seed = 33) +
  geom_line_sample(seed = 33)
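The roles of times and seed can be pictured outside of ggplot2. The following is only a minimal sketch: it assumes the sampling step resembles distributional::generate(), and draw_long() is a hypothetical helper written for this illustration, not a ggdibbler function or a description of the package's internals.

```r
library(distributional)

# A vector of three distributions standing in for an uncertain variable.
y <- dist_normal(mu = c(1, 2, 4), sigma = 0.3)

# Hypothetical helper (not part of ggdibbler): draw `times` values from each
# distribution under an optional seed and return them in long format.
draw_long <- function(dist, times = 10, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  # generate() returns a list with one numeric vector per distribution
  draws <- generate(dist, times = times)
  data.frame(
    row   = rep(seq_along(draws), each = times),
    draw  = rep(seq_len(times), times = length(draws)),
    value = unlist(draws)
  )
}

a <- draw_long(y, times = 10, seed = 33)
b <- draw_long(y, times = 10, seed = 33)

nrow(a)          # three distributions with ten draws each
identical(a, b)  # the same seed reproduces the same draws
```

Under this reading, giving two layers the same seed, as in the last example above, is what keeps the sampled points and the sampled lines aligned.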
Identical to geom_point, except that it will accept a distribution in place of any of the usual aesthetics.
geom_point_sample(
  mapping = NULL, data = NULL, stat = "identity_sample", position = "identity",
  ..., times = 10, seed = NULL,
  na.rm = FALSE, show.legend = NA, inherit.aes = TRUE
)
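| mapping | Set of aesthetic mappings created by aes(). |
|---|---|
| data | The data to be displayed in this layer. There are three options: if NULL, the default, the data is inherited from the plot data as specified in the call to ggplot(); a data.frame, or other object, will override the plot data; a function will be called with a single argument, the plot data, and its return value is used as the layer data. |
| stat | The statistical transformation to use on the data for this layer. When using a geom_*() function to construct a layer, the stat argument can be used to override the default coupling between geoms and stats. |
| position | A position adjustment to use on the data for this layer. This can be used in various ways, including to prevent overplotting and improving the display. The position argument accepts either the result of calling a position function or a string naming the position adjustment. |
| ... | Other arguments passed on to layer()'s params argument. |
| times | The number of values sampled from each distribution. |
| seed | The seed for the layer's random draw. Setting the same seed in several layers plots the same draw across those layers. |
| na.rm | If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed. |
| show.legend | logical. Should this layer be included in the legends? NA, the default, includes if any aesthetics are mapped. FALSE never includes, and TRUE always includes. |
| inherit.aes | If FALSE, overrides the default aesthetics, rather than combining with them. |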
A ggplot2 layer
library(ggplot2)
library(distributional)
library(ggdibbler)

# ggplot
p <- ggplot(mtcars, aes(wt, mpg))
p + geom_point()
# ggdibbler - set the sample size with times
q <- ggplot(uncertain_mtcars, aes(wt, mpg))
q + geom_point_sample(times = 50, alpha = 0.5)

# Add aesthetic mappings
# ggplot
p + geom_point(aes(colour = factor(cyl)))
# ggdibbler
q + geom_point_sample(aes(colour = dist_transformed(cyl, factor, as.numeric))) +
  labs(colour = "factor(cyl)")

# ggplot
p + geom_point(aes(shape = factor(cyl)))
# ggdibbler
q + geom_point_sample(aes(shape = dist_transformed(cyl, factor, as.numeric))) +
  labs(shape = "factor(cyl)")

# A "bubblechart":
# ggplot
p + geom_point(aes(size = qsec))
# ggdibbler
q + geom_point_sample(aes(size = qsec), alpha = 0.15)
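The examples above rely on the packaged uncertain_mtcars data. For your own data, a comparable table can be built by replacing columns with vectors of distributions, as the other examples in this manual do with mutate(). The sketch below assumes a normal error model with made-up standard deviations; my_uncertain_mtcars is a hypothetical name, and nothing here describes how uncertain_mtcars itself was constructed.

```r
library(distributional)
library(dplyr)

# Assumed error model: normal noise with fixed standard deviations.
# These values are illustrative and are not taken from uncertain_mtcars.
my_uncertain_mtcars <- mtcars |>
  mutate(
    wt  = dist_normal(wt, 0.1),
    mpg = dist_normal(mpg, 1)
  )

# The replaced columns are now distribution vectors, one distribution per row.
is_distribution(my_uncertain_mtcars$wt)

# The mean of each distribution is the original point estimate.
head(mean(my_uncertain_mtcars$mpg), 3)
```

A table built this way could then be passed to geom_point_sample() in place of uncertain_mtcars, with wt and mpg mapped exactly as in the deterministic plot.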
Identical to geom_polygon, except that it will accept a distribution in place of any of the usual aesthetics.
geom_polygon_sample(
  mapping = NULL, data = NULL, stat = "identity_sample", position = "identity",
  ..., times = 10, seed = NULL, rule = "evenodd",
  lineend = "butt", linejoin = "round", linemitre = 10,
  na.rm = FALSE, show.legend = NA, inherit.aes = TRUE
)
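| mapping | Set of aesthetic mappings created by aes(). |
|---|---|
| data | The data to be displayed in this layer. There are three options: if NULL, the default, the data is inherited from the plot data as specified in the call to ggplot(); a data.frame, or other object, will override the plot data; a function will be called with a single argument, the plot data, and its return value is used as the layer data. |
| stat | The statistical transformation to use on the data for this layer. When using a geom_*() function to construct a layer, the stat argument can be used to override the default coupling between geoms and stats. |
| position | A position adjustment to use on the data for this layer. This can be used in various ways, including to prevent overplotting and improving the display. The position argument accepts either the result of calling a position function or a string naming the position adjustment. |
| ... | Other arguments passed on to layer()'s params argument. |
| times | The number of values sampled from each distribution. |
| seed | The seed for the layer's random draw. Setting the same seed in several layers plots the same draw across those layers. |
| rule | Either "evenodd" or "winding". If polygons with holes are being drawn (using the subgroup aesthetic), this argument defines how the hole coordinates are interpreted. |
| lineend | Line end style (round, butt, square). |
| linejoin | Line join style (round, mitre, bevel). |
| linemitre | Line mitre limit (number greater than 1). |
| na.rm | If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed. |
| show.legend | logical. Should this layer be included in the legends? NA, the default, includes if any aesthetics are mapped. FALSE never includes, and TRUE always includes. |
| inherit.aes | If FALSE, overrides the default aesthetics, rather than combining with them. |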
A ggplot2 layer
library(ggplot2)
library(distributional)
library(dplyr)
library(ggdibbler)

ids <- factor(c("1.1", "2.1", "1.2", "2.2", "1.3", "2.3"))

values <- data.frame(
  id = ids,
  value = c(3, 3.1, 3.1, 3.2, 3.15, 3.5)
)

positions <- data.frame(
  id = rep(ids, each = 4),
  x = c(2, 1, 1.1, 2.2, 1, 0, 0.3, 1.1, 2.2, 1.1, 1.2, 2.5, 1.1, 0.3,
        0.5, 1.2, 2.5, 1.2, 1.3, 2.7, 1.2, 0.5, 0.6, 1.3),
  y = c(-0.5, 0, 1, 0.5, 0, 0.5, 1.5, 1, 0.5, 1, 2.1, 1.7, 1, 1.5,
        2.2, 2.1, 1.7, 2.1, 3.2, 2.8, 2.1, 2.2, 3.3, 3.2)
)

# Currently we need to manually merge the two together
datapoly <- merge(values, positions, by = c("id"))

# Make an uncertain version of datapoly
uncertain_datapoly <- datapoly |>
  mutate(x = dist_uniform(x - 0.1, x + 0.1),
         y = dist_uniform(y - 0.1, y + 0.1),
         value = dist_uniform(value - 0.5, value + 0.5))

# ggplot
p <- ggplot(datapoly, aes(x = x, y = y)) +
  geom_polygon(aes(fill = value, group = id))
p
# ggdibbler
q <- ggplot(uncertain_datapoly, aes(x = x, y = y)) +
  geom_polygon_sample(aes(fill = value, group = id), alpha = 0.15)
q
Identical to geom_quantile, except that it will accept a distribution in place of any of the usual aesthetics.
geom_quantile_sample(
  mapping = NULL, data = NULL, stat = "quantile_sample", position = "identity",
  ..., times = 10, seed = NULL, arrow = NULL, arrow.fill = NULL,
  lineend = "butt", linejoin = "round", linemitre = 10,
  na.rm = FALSE, show.legend = NA, inherit.aes = TRUE
)

stat_quantile_sample(
  mapping = NULL, data = NULL, geom = "quantile", position = "identity",
  ..., seed = NULL, times = 10, quantiles = c(0.25, 0.5, 0.75),
  formula = NULL, method = "rq", method.args = list(),
  na.rm = FALSE, show.legend = NA, inherit.aes = TRUE
)
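| mapping | Set of aesthetic mappings created by aes(). |
|---|---|
| data | The data to be displayed in this layer. There are three options: if NULL, the default, the data is inherited from the plot data as specified in the call to ggplot(); a data.frame, or other object, will override the plot data; a function will be called with a single argument, the plot data, and its return value is used as the layer data. |
| position | A position adjustment to use on the data for this layer. This can be used in various ways, including to prevent overplotting and improving the display. The position argument accepts either the result of calling a position function or a string naming the position adjustment. |
| ... | Other arguments passed on to layer()'s params argument. |
| times | The number of values sampled from each distribution. |
| seed | The seed for the layer's random draw. Setting the same seed in several layers plots the same draw across those layers. |
| arrow | Arrow specification, as created by grid::arrow(). |
| arrow.fill | Fill colour to use for the arrow head (if closed). |
| lineend | Line end style (round, butt, square). |
| linejoin | Line join style (round, mitre, bevel). |
| linemitre | Line mitre limit (number greater than 1). |
| na.rm | If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed. |
| show.legend | logical. Should this layer be included in the legends? NA, the default, includes if any aesthetics are mapped. FALSE never includes, and TRUE always includes. |
| inherit.aes | If FALSE, overrides the default aesthetics, rather than combining with them. |
| geom, stat | Use to override the default connection between geom_quantile() and stat_quantile(). |
| quantiles | Conditional quantiles of y to calculate and display. |
| formula | Formula relating y variables to x variables. |
| method | Quantile regression method to use. Available options are "rq" (for quantreg::rq()) and "rqss" (for quantreg::rqss()). |
| method.args | List of additional arguments passed on to the modelling function defined by method. |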
A ggplot2 layer
library(ggplot2)
library(ggdibbler)

# ggplot
m <- ggplot(mpg, aes(displ, hwy)) +
  geom_point()
# ggdibbler
n <- ggplot(uncertain_mpg, aes(displ, hwy)) +
  geom_point_sample(alpha = 0.3)

# ggplot
m + geom_quantile()
# ggdibbler
n + geom_quantile_sample(alpha = 0.3)

# ggplot
m + geom_quantile(quantiles = 0.5)
# ggdibbler
n + geom_quantile_sample(quantiles = 0.5, alpha = 0.3)
Identical to geom_raster, geom_rect and geom_tile, except that they will accept a distribution in place of any of the usual aesthetics.
geom_raster_sample(
  mapping = NULL, data = NULL, stat = "identity_sample", position = "identity_dodge",
  ..., times = 10, seed = NULL, interpolate = FALSE, hjust = 0.5, vjust = 0.5,
  na.rm = FALSE, show.legend = NA, inherit.aes = TRUE
)

geom_rect_sample(
  mapping = NULL, data = NULL, stat = "identity_sample", position = "identity",
  ..., times = 10, seed = NULL, lineend = "butt", linejoin = "mitre",
  na.rm = FALSE, show.legend = NA, inherit.aes = TRUE
)

geom_tile_sample(
  mapping = NULL, data = NULL, stat = "identity_sample", position = "identity_dodge",
  ..., times = 10, seed = NULL, lineend = "butt", linejoin = "mitre",
  na.rm = FALSE, show.legend = NA, inherit.aes = TRUE
)
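| mapping | Set of aesthetic mappings created by aes(). |
|---|---|
| data | The data to be displayed in this layer. There are three options: if NULL, the default, the data is inherited from the plot data as specified in the call to ggplot(); a data.frame, or other object, will override the plot data; a function will be called with a single argument, the plot data, and its return value is used as the layer data. |
| stat | The statistical transformation to use on the data for this layer. When using a geom_*() function to construct a layer, the stat argument can be used to override the default coupling between geoms and stats. |
| position | A position adjustment to use on the data for this layer. This can be used in various ways, including to prevent overplotting and improving the display. The position argument accepts either the result of calling a position function or a string naming the position adjustment. |
| ... | Other arguments passed on to layer()'s params argument. |
| times | The number of values sampled from each distribution. |
| seed | The seed for the layer's random draw. Setting the same seed in several layers plots the same draw across those layers. |
| interpolate | If TRUE interpolate linearly, if FALSE (the default) don't interpolate. |
| hjust, vjust | Horizontal and vertical justification of the grob. Each justification value should be a number between 0 and 1. Defaults to 0.5 for both, centering each pixel over its data location. |
| na.rm | If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed. |
| show.legend | logical. Should this layer be included in the legends? NA, the default, includes if any aesthetics are mapped. FALSE never includes, and TRUE always includes. |
| inherit.aes | If FALSE, overrides the default aesthetics, rather than combining with them. |
| lineend | Line end style (round, butt, square). |
| linejoin | Line join style (round, mitre, bevel). |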
A ggplot2 layer
library(ggplot2)
library(distributional)
library(dplyr)
library(ggdibbler)

# Rasters
# ggplot
ggplot(faithfuld, aes(waiting, eruptions)) +
  geom_raster(aes(fill = density))
# ggdibbler
ggplot(uncertain_faithfuld, aes(waiting, eruptions)) +
  geom_raster_sample(aes(fill = density))

# Justification controls where the cells are anchored
df <- expand.grid(x = 0:5, y = 0:5)
set.seed(1)
df$z <- runif(nrow(df))
uncertain_df <- df |>
  group_by(x, y) |>
  mutate(z = dist_normal(z, runif(1, 0, 0.1))) |>
  ungroup()

# default is compatible with geom_tile()
# ggplot
ggplot(df, aes(x, y, fill = z)) +
  geom_raster()
# ggdibbler
ggplot(uncertain_df, aes(x, y, fill = z)) +
  geom_raster_sample()

# If you want to draw arbitrary rectangles,
# use geom_tile_sample() or geom_rect_sample()
tile_df <- data.frame(
  x = rep(c(2, 5, 7, 9, 12), 2),
  y = rep(c(1, 2), each = 5),
  z = factor(rep(1:5, each = 2)),
  w = rep(diff(c(0, 4, 6, 8, 10, 14)), 2)
)

# most likely case that only colour is random
uncertain_tile_df <- tile_df
uncertain_tile_df$z <- dist_transformed(
  (1 + dist_binomial(rep(1:5, each = 2), 0.5)), factor, as.numeric
)

# ggplot
ggplot(tile_df, aes(x, y)) +
  geom_tile(aes(fill = z), colour = "grey50")
# ggdibbler
ggplot(uncertain_tile_df, aes(x, y)) +
  geom_tile_sample(aes(fill = z), position = "identity_dodge") +
  geom_tile(fill = NA, colour = "grey50", linewidth = 1) +
  labs(fill = "z")

# Rectangles
rect_df <- tile_df |>
  mutate(xmin = x - w / 2, xmax = x + w / 2,
         ymin = y, ymax = y + 1)
uncertain_rect <- rect_df |>
  mutate(xmin = dist_normal(xmin, 0.5), xmax = dist_normal(xmax, 0.5),
         ymin = dist_normal(ymin, 0.5), ymax = dist_normal(ymax, 0.5))

# ggplot
ggplot(data = rect_df,
       aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax)) +
  geom_rect(aes(fill = z), colour = "grey50")
# ggdibbler
ggplot(data = uncertain_rect,
       aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax, f = z)) +
  geom_rect_sample(aes(fill = as.factor(after_stat(f))),
                   colour = "grey50", alpha = 0.2) +
  labs(fill = "z")
Identical to geom_ribbon and geom_area, except that it will accept a distribution in place of any of the usual aesthetics.
geom_ribbon_sample(
  mapping = NULL, data = NULL, stat = "identity_sample", position = "identity",
  ..., seed = NULL, times = 10,
  lineend = "butt", linejoin = "round", linemitre = 10, outline.type = "both",
  na.rm = FALSE, show.legend = NA, inherit.aes = TRUE
)

geom_area_sample(
  mapping = NULL, data = NULL, stat = "align_sample", position = "stack_identity",
  ..., times = 10, seed = NULL, orientation = NA, outline.type = "upper",
  lineend = "butt", linejoin = "round", linemitre = 10,
  na.rm = FALSE, show.legend = NA, inherit.aes = TRUE
)

stat_align_sample(
  mapping = NULL, data = NULL, geom = "area", position = "identity",
  ..., times = 10, seed = NULL,
  na.rm = FALSE, show.legend = NA, inherit.aes = TRUE
)
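| mapping | Set of aesthetic mappings created by aes(). |
|---|---|
| data | The data to be displayed in this layer. There are three options: if NULL, the default, the data is inherited from the plot data as specified in the call to ggplot(); a data.frame, or other object, will override the plot data; a function will be called with a single argument, the plot data, and its return value is used as the layer data. |
| stat | The statistical transformation to use on the data for this layer. When using a geom_*() function to construct a layer, the stat argument can be used to override the default coupling between geoms and stats. |
| position | A position adjustment to use on the data for this layer. This can be used in various ways, including to prevent overplotting and improving the display. The position argument accepts either the result of calling a position function or a string naming the position adjustment. |
| ... | Other arguments passed on to layer()'s params argument. |
| seed | The seed for the layer's random draw. Setting the same seed in several layers plots the same draw across those layers. |
| times | The number of values sampled from each distribution. |
| lineend | Line end style (round, butt, square). |
| linejoin | Line join style (round, mitre, bevel). |
| linemitre | Line mitre limit (number greater than 1). |
| outline.type | Type of the outline of the area; "both" draws both the upper and lower lines, "upper"/"lower" draws the respective lines only and "full" draws a closed polygon around the area. |
| na.rm | If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed. |
| show.legend | logical. Should this layer be included in the legends? NA, the default, includes if any aesthetics are mapped. FALSE never includes, and TRUE always includes. |
| inherit.aes | If FALSE, overrides the default aesthetics, rather than combining with them. |
| orientation | The orientation of the layer. The default (NA) automatically determines the orientation from the aesthetic mapping. |
| geom | The geometric object to use to display the data for this layer. When using a stat_*() function to construct a layer, the geom argument can be used to override the default coupling between stats and geoms. |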
A ggplot2 layer
library(distributional)
library(dplyr)
library(ggplot2)
library(ggdibbler)

# Generate data
huron <- data.frame(year = 1875:1972, level = as.vector(LakeHuron))
uncertain_huron <- huron |>
  group_by(year) |>
  mutate(level = dist_normal(level, runif(1, 0, 2)))

# ggplot
h <- ggplot(huron, aes(year))
# ggdibbler
q <- ggplot(uncertain_huron, aes(year))

# ggplot
h + geom_ribbon(aes(ymin = 0, ymax = level))
# ggdibbler
q + geom_ribbon_sample(aes(ymin = 0, ymax = level), alpha = 0.15)

# Add aesthetic mappings
# ggplot
h +
  geom_ribbon(aes(ymin = level - 1, ymax = level + 1), fill = "grey70") +
  geom_line(aes(y = level))
# ggdibbler
q +
  geom_ribbon_sample(aes(ymin = level - 1, ymax = level + 1),
                     fill = "grey70", seed = 4, alpha = 0.15) +
  geom_line_sample(aes(y = level), seed = 4, alpha = 0.15)

df <- data.frame(
  g = c("a", "a", "a", "b", "b", "b"),
  x = c(1, 3, 5, 2, 4, 6),
  y = c(2, 5, 1, 3, 6, 7)
)
uncertain_df <- df |>
  mutate(x = dist_normal(x, 0.8),
         y = dist_normal(y, 0.8))

# ggplot
ggplot(df, aes(x, y, fill = g)) +
  geom_area() +
  facet_grid(g ~ .)
# ggdibbler
ggplot(uncertain_df, aes(x, y, fill = g)) +
  geom_area_sample(seed = 100, alpha = 0.15) +
  geom_point_sample(seed = 100) +
  facet_grid(g ~ .)
Identical to geom_rug, except that it will accept a distribution in place of any of the usual aesthetics.
geom_rug_sample(
  mapping = NULL, data = NULL, stat = "identity_sample", position = "identity",
  ..., times = 10, seed = NULL, lineend = "butt", sides = "bl",
  outside = FALSE, length = unit(0.03, "npc"),
  na.rm = FALSE, show.legend = NA, inherit.aes = TRUE
)
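| mapping | Set of aesthetic mappings created by aes(). |
|---|---|
| data | The data to be displayed in this layer. There are three options: if NULL, the default, the data is inherited from the plot data as specified in the call to ggplot(); a data.frame, or other object, will override the plot data; a function will be called with a single argument, the plot data, and its return value is used as the layer data. |
| stat | The statistical transformation to use on the data for this layer. When using a geom_*() function to construct a layer, the stat argument can be used to override the default coupling between geoms and stats. |
| position | A position adjustment to use on the data for this layer. This can be used in various ways, including to prevent overplotting and improving the display. The position argument accepts either the result of calling a position function or a string naming the position adjustment. |
| ... | Other arguments passed on to layer()'s params argument. |
| times | The number of values sampled from each distribution. |
| seed | The seed for the layer's random draw. Setting the same seed in several layers plots the same draw across those layers. |
| lineend | Line end style (round, butt, square). |
| sides | A string that controls which sides of the plot the rugs appear on. It can be set to a string containing any of "trbl", for top, right, bottom, and left. |
| outside | logical that controls whether to move the rug tassels outside of the plot area. Default is off (FALSE). You will also need to use coord_cartesian(clip = "off"). |
| length | A grid::unit() object that sets the length of the rug lines. |
| na.rm | If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed. |
| show.legend | logical. Should this layer be included in the legends? NA, the default, includes if any aesthetics are mapped. FALSE never includes, and TRUE always includes. |
| inherit.aes | If FALSE, overrides the default aesthetics, rather than combining with them. |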
A ggplot2 layer
library(ggplot2)
library(ggdibbler)

# ggplot
p <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point()
# ggdibbler
q <- ggplot(uncertain_mtcars, aes(wt, mpg)) +
  geom_point_sample(seed = 4)

# ggplot
p + geom_rug()
# ggdibbler
q + geom_rug_sample(seed = 4, alpha = 0.5)
Identical to geom_sf, except that the fill for each area will be a distribution. This function will replace the fill area with a grid, where each cell is filled with an outcome from the fill distribution.
geom_sf_sample(
  mapping = aes(), data = NULL, position = "subdivide",
  na.rm = FALSE, show.legend = NA, inherit.aes = TRUE,
  times = 10, seed = NULL, n = deprecated(), ...
)
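| mapping | Set of aesthetic mappings created by aes(). |
|---|---|
| data | The data to be displayed in this layer. There are three options: if NULL, the default, the data is inherited from the plot data as specified in the call to ggplot(); a data.frame, or other object, will override the plot data; a function will be called with a single argument, the plot data, and its return value is used as the layer data. |
| position | A position adjustment to use on the data for this layer. This can be used in various ways, including to prevent overplotting and improving the display. The position argument accepts either the result of calling a position function or a string naming the position adjustment. |
| na.rm | If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed. |
| show.legend | logical. Should this layer be included in the legends? NA, the default, includes if any aesthetics are mapped. FALSE never includes, and TRUE always includes. You can also set this to one of "polygon", "line", and "point" to override the default legend. |
| inherit.aes | If FALSE, overrides the default aesthetics, rather than combining with them. |
| times | The number of values sampled from each distribution. |
| seed | The seed for the layer's random draw. Setting the same seed in several layers plots the same draw across those layers. |
| n | Deprecated in favour of times. |
| ... | Other arguments passed on to layer()'s params argument. |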
A ggplot2 geom representing a sf_sample which can be added to a ggplot object
# In its most basic form, the geom will make a subdivision
library(ggplot2)
library(dplyr)
library(sf)
library(ggdibbler)

basic_data <- toy_temp_dist |>
  filter(county_name %in% c("Pottawattamie County", "Mills County", "Cass County"))

basic_data |>
  ggplot() +
  geom_sf_sample(aes(geometry = county_geometry, fill = temp_dist),
                 linewidth = 0, times = 100)

# The original borders of the sf object can be hard to see,
# so layering the original geometry on top can help to show them
basic_data |>
  ggplot() +
  geom_sf_sample(aes(geometry = county_geometry, fill = temp_dist),
                 linewidth = 0, times = 100) +
  geom_sf(aes(geometry = county_geometry), fill = NA, linewidth = 1)
Identical to geom_smooth, except that it will accept a distribution in place of any of the usual aesthetics.
geom_smooth_sample(
  mapping = NULL, data = NULL, times = 10, seed = NULL,
  stat = "smooth_sample", position = "identity", ...,
  method = NULL, formula = NULL, se = TRUE,
  na.rm = FALSE, orientation = NA, show.legend = NA, inherit.aes = TRUE
)

stat_smooth_sample(
  mapping = NULL, data = NULL, geom = "smooth", position = "identity",
  ..., times = 10, seed = NULL, method = NULL, formula = NULL, se = TRUE,
  n = 80, span = 0.75, fullrange = FALSE, xseq = NULL, level = 0.95,
  method.args = list(), na.rm = FALSE, show.legend = NA, inherit.aes = TRUE
)
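| mapping | Set of aesthetic mappings created by aes(). |
|---|---|
| data | The data to be displayed in this layer. There are three options: if NULL, the default, the data is inherited from the plot data as specified in the call to ggplot(); a data.frame, or other object, will override the plot data; a function will be called with a single argument, the plot data, and its return value is used as the layer data. |
| times | The number of values sampled from each distribution. |
| seed | The seed for the layer's random draw. Setting the same seed in several layers plots the same draw across those layers. |
| position | A position adjustment to use on the data for this layer. This can be used in various ways, including to prevent overplotting and improving the display. The position argument accepts either the result of calling a position function or a string naming the position adjustment. |
| ... | Other arguments passed on to layer()'s params argument. |
| method | Smoothing method (function) to use, accepts either NULL or a character vector, e.g. "lm", "glm", "gam", "loess", or a function, e.g. MASS::rlm, mgcv::gam, stats::lm, or stats::loess. For method = NULL the smoothing method is chosen based on the size of the largest group (across all panels): stats::loess() is used for fewer than 1,000 observations; otherwise mgcv::gam() is used with formula = y ~ s(x, bs = "cs") and method = "REML". If you have fewer than 1,000 observations but want to use the same gam() model that method = NULL would use, then set method = "gam", formula = y ~ s(x, bs = "cs"). |
| formula | Formula to use in smoothing function, e.g. y ~ x, y ~ poly(x, 2), y ~ log(x). NULL by default. |
| se | Display confidence band around smooth? (TRUE by default, see level to control.) |
| na.rm | If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed. |
| orientation | The orientation of the layer. The default (NA) automatically determines the orientation from the aesthetic mapping. |
| show.legend | logical. Should this layer be included in the legends? NA, the default, includes if any aesthetics are mapped. FALSE never includes, and TRUE always includes. |
| inherit.aes | If FALSE, overrides the default aesthetics, rather than combining with them. |
| geom, stat | Use to override the default connection between geom_smooth() and stat_smooth(). |
| n | Number of points at which to evaluate smoother. |
| span | Controls the amount of smoothing for the default loess smoother. Smaller numbers produce wigglier lines, larger numbers produce smoother lines. Only used with loess, i.e. when method = "loess", or when method = NULL (the default) and there are fewer than 1,000 observations. |
| fullrange | If TRUE, the smoothing line gets expanded to the range of the plot, potentially beyond the data. |
| xseq | A numeric vector of values at which the smoother is evaluated. When NULL (default), xseq is internally evaluated as a sequence of n equally spaced points for continuous data. |
| level | Level of confidence band to use (0.95 by default). |
| method.args | List of additional arguments passed on to the modelling function defined by method. |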
A ggplot2 layer
library(ggplot2)
library(ggdibbler)

# ggplot
ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  geom_smooth()
# ggdibbler
ggplot(uncertain_mpg, aes(displ, hwy)) +
  geom_point_sample(alpha = 0.5, size = 0.2, seed = 22) +
  geom_smooth_sample(linewidth = 0.2, alpha = 0.1, seed = 22)

# Smooths are automatically fit to each group (defined by categorical
# aesthetics or the group aesthetic) and for each facet.
# ggplot
ggplot(mpg, aes(displ, hwy, colour = class)) +
  geom_point() +
  geom_smooth(se = FALSE, method = lm)
# ggdibbler
ggplot(uncertain_mpg, aes(displ, hwy, colour = class)) +
  geom_point_sample(alpha = 0.5, size = 0.2, seed = 22) +
  geom_smooth_sample(linewidth = 0.2, alpha = 0.1, se = FALSE,
                     method = lm, seed = 22)
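One way to picture a sampled smooth is as one fitted curve per draw. The sketch below works under that assumption only; the text above does not confirm that stat_smooth_sample works this way, and the code is not taken from ggdibbler. The data, the linear model, and all variable names are made up for illustration.

```r
library(distributional)

set.seed(22)

# Illustrative uncertain response: one normal distribution per x value.
x <- 1:20
y_dist <- dist_normal(mu = 2 + 0.5 * x, sigma = 1)

# Draw `times` values from each distribution, as the times argument describes.
times <- 10
draws <- generate(y_dist, times = times)   # list of 20 vectors of length 10
draw_matrix <- do.call(rbind, draws)       # 20 rows (x values) by 10 columns (draws)

# Assumed behaviour: fit one straight line per draw (per column).
slopes <- apply(draw_matrix, 2, function(y) coef(lm(y ~ x))[["x"]])

length(slopes)   # one slope per draw
range(slopes)    # spread of the fitted slopes across draws
```

Under this reading, the spread of the fitted lines reflects the uncertainty in the data, and a low alpha keeps the overlapping lines legible.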
Identical to geom_spoke except that it will accept a distribution in place of any of the usual aesthetics.
    geom_spoke_sample(
      mapping = NULL,
      data = NULL,
      stat = "identity_sample",
      position = "identity",
      ...,
      times = 10,
      seed = NULL,
      arrow = NULL,
      arrow.fill = NULL,
      lineend = "butt",
      linejoin = "round",
      na.rm = FALSE,
      show.legend = NA,
      inherit.aes = TRUE
    )
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
times |
A parameter used to control the number of values sampled from each distribution. |
seed |
Set the seed for the layer's random draw; this lets you plot the same draw across multiple layers. |
arrow |
specification for arrow heads, as created by |
arrow.fill |
fill colour to use for the arrow head (if closed). |
lineend |
Line end style (round, butt, square). |
linejoin |
Line join style (round, mitre, bevel). |
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
A ggplot2 layer
    library(ggplot2)
    library(dplyr)
    library(distributional)
    # deterministic data
    set.seed(1)
    df <- expand.grid(x = 1:10, y=1:10)
    df$angle <- runif(100, 0, 2*pi)
    df$speed <- runif(100, 0, sqrt(0.1 * df$x))
    # uncertain data
    uncertain_df <- df |>
      group_by(x,y) |>
      mutate(angle = dist_normal(angle, runif(1,0, 0.5)),
             speed = dist_normal(speed, runif(1,0, 0.1))) |>
      ungroup()
    # ggplot
    ggplot(df, aes(x, y)) +
      geom_point() +
      geom_spoke(aes(angle = angle, radius = speed))
    # ggdibbler
    ggplot(uncertain_df, aes(x, y)) +
      geom_point_sample() + # and here we use geom_point_sample
      geom_spoke_sample(aes(angle = angle, radius = speed), alpha=0.3)
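The times and seed arguments can be pictured with plain distributional calls. The sketch below is only a conceptual illustration, not ggdibbler's internal code: it assumes each distribution is replaced by times random draws and that fixing the seed before drawing reproduces the same draws. draw_values() is a hypothetical helper introduced for this example.

```r
library(distributional)

# Hypothetical helper (not part of ggdibbler): draw `times` values from
# every distribution in a vector, optionally fixing the RNG seed first.
draw_values <- function(dist, times = 10, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  generate(dist, times)
}

# Two uncertain angles, in the spirit of the spoke example above
angle <- dist_normal(mu = c(0, pi / 2), sigma = c(0.1, 0.3))

first  <- draw_values(angle, times = 10, seed = 22)
second <- draw_values(angle, times = 10, seed = 22)

lengths(first)            # number of draws held for each distribution
identical(first, second)  # the same seed reproduces the same draws
```

Passing the same seed to several *_sample layers, as the examples in this manual do, is what keeps points and lines aligned on a common set of draws.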
Identical to geom_violin, except that it will accept a distribution in place of any of the usual aesthetics.
    geom_violin_sample(
      mapping = NULL,
      data = NULL,
      times = 10,
      seed = NULL,
      stat = "ydensity_sample",
      position = "dodge_identity",
      ...,
      trim = TRUE,
      bounds = c(-Inf, Inf),
      quantile.colour = NULL,
      quantile.color = NULL,
      quantile.linetype = 0L,
      quantile.linewidth = NULL,
      draw_quantiles = deprecated(),
      scale = "area",
      na.rm = FALSE,
      orientation = NA,
      show.legend = NA,
      inherit.aes = TRUE
    )

    stat_ydensity_sample(
      mapping = NULL,
      data = NULL,
      geom = "violin",
      position = "identity",
      ...,
      times = 10,
      seed = NULL,
      orientation = NA,
      bw = "nrd0",
      adjust = 1,
      kernel = "gaussian",
      trim = TRUE,
      scale = "area",
      drop = TRUE,
      bounds = c(-Inf, Inf),
      quantiles = c(0.25, 0.5, 0.75),
      na.rm = FALSE,
      show.legend = NA,
      inherit.aes = TRUE
    )
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
times |
A parameter used to control the number of values sampled from each distribution. |
seed |
Set the seed for the layer's random draw; this lets you plot the same draw across multiple layers. |
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
trim |
If |
bounds |
Known lower and upper bounds for estimated data. Default
|
quantile.colour, quantile.color, quantile.linewidth, quantile.linetype
|
Default aesthetics for the quantile lines. Set to |
draw_quantiles |
|
scale |
if "area" (default), all violins have the same area (before trimming the tails). If "count", areas are scaled proportionally to the number of observations. If "width", all violins have the same maximum width. |
na.rm |
If |
orientation |
The orientation of the layer. The default ( |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
geom, stat
|
Use to override the default connection between
|
bw |
The smoothing bandwidth to be used.
If numeric, the standard deviation of the smoothing kernel.
If character, a rule to choose the bandwidth, as listed in
|
adjust |
A multiplicative bandwidth adjustment. This makes it possible
to adjust the bandwidth while still using a bandwidth estimator.
For example, |
kernel |
Kernel. See list of available kernels in |
drop |
Whether to discard groups with less than 2 observations
( |
quantiles |
A numeric vector with numbers between 0 and 1 to indicate
quantiles marked by the |
A ggplot2 layer
    library(ggplot2)
    library(dplyr)
    library(distributional)
    # plot set up
    p <- ggplot(mtcars, aes(factor(cyl), mpg))
    q <- ggplot(uncertain_mtcars, aes(dist_transformed(cyl, factor, as.numeric), mpg))
    # ggplot
    p + geom_violin()
    # ggdibbler
    q + geom_violin_sample(alpha=0.1)
    # Default is to trim violins to the range of the data. To disable:
    # ggplot
    p + geom_violin(trim = FALSE)
    # ggdibbler
    q + geom_violin_sample(trim = FALSE, alpha=0.1)
These functions use nested positioning for distributional data, where one of the positions is dodged. This allows you to set different position adjustments for the "main" and "distribution" parts of your plot.
    position_dodge_dodge(
      width = NULL,
      preserve = "single",
      orientation = "x",
      reverse = FALSE
    )

    position_dodge_identity(
      width = NULL,
      preserve = "single",
      orientation = "x",
      reverse = FALSE
    )

    position_identity_dodge(
      width = NULL,
      preserve = "single",
      orientation = "x",
      reverse = FALSE
    )
width |
Dodging width, when different to the width of the individual elements. This is useful when you want to align narrow geoms with wider geoms. See the examples. |
preserve |
Should dodging preserve the |
orientation |
Fallback orientation when the layer or the data does not
indicate an explicit orientation, like |
reverse |
If |
A ggplot2 position
position_dodge() understands the following aesthetics. Required aesthetics are displayed in bold and defaults are displayed for optional aesthetics:
• order → NULL
Learn more about setting these aesthetics in vignette("ggplot2-specs").
    library(ggplot2)
    # ggplot dodge
    ggplot(mpg, aes(class)) +
      geom_bar(aes(fill = drv),
               position = position_dodge(preserve = "single"))
    # normal dodge without nesting
    ggplot(uncertain_mpg, aes(class)) +
      geom_bar_sample(aes(fill = drv), position = "dodge")
    # dodge_identity
    ggplot(uncertain_mpg, aes(class)) +
      geom_bar_sample(aes(fill = drv), position = "dodge_identity", alpha=0.2)
    # dodge_dodge
    ggplot(uncertain_mpg, aes(class)) +
      geom_bar_sample(aes(fill = drv), position = "dodge_dodge")
    # identity_dodge
    ggplot(mpg, aes(class)) +
      geom_bar(aes(fill = drv), alpha=0.5, position = "identity")
    ggplot(uncertain_mpg, aes(class)) +
      geom_bar_sample(aes(fill = drv), position = "identity_dodge", alpha=0.7)
This function uses nested positioning for distributional data, where both of the positions are the identity. This allows you to set different position adjustments for the "main" and "distribution" parts of your plot.
    position_identity_identity()
A ggplot2 position
    # Standard ggplots often have a position adjustment to fix overplotting
    # plot with overplotting
    library(ggplot2)
    ggplot(mpg, aes(class)) +
      geom_bar(aes(fill = drv), alpha=0.5, position = "identity")
    # sometimes ggdibbler functions call for more control over these
    # overplotting adjustments
    ggplot(uncertain_mpg, aes(class)) +
      geom_bar_sample(aes(fill = drv), position = "identity", alpha=0.1)
    # is the same as...
    ggplot(uncertain_mpg, aes(class)) +
      geom_bar_sample(aes(fill = drv), position = "identity_identity", alpha=0.1)
    # nested positions allow us to differentiate which position adjustments
    # are used for the plot groups vs the distribution samples
This function lets you nest any two positions available in ggplot2 (your results may vary). This allows you to set different position adjustments for the "main" and "distribution" parts of your plot.
    position_nest(position = "identity_identity")
position |
A character string naming the nested position you want to use. |
A ggplot2 position
    # nested positions allow us to differentiate which position adjustments
    # are used for the plot groups vs the distribution samples
    library(ggplot2)
    ggplot(uncertain_mpg, aes(class)) +
      geom_bar_sample(aes(fill = drv), alpha=0.9,
                      position = position_nest("stack_dodge"))
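As a rough sketch of how the nested position names fit together, the code below loops over the nested names that appear elsewhere in this manual and passes each one to position_nest(). Reading each name as "main" position followed by "distribution" position is an interpretation of the examples, and whether every combination gives a useful picture is not guaranteed (as the description says, results may vary).

```r
library(ggplot2)
library(ggdibbler)

# Nested position names used in this manual, read here as
# "<main position>_<distribution position>" (an interpretation, not a spec).
nested <- c("identity_identity", "dodge_identity", "dodge_dodge",
            "identity_dodge", "stack_identity", "stack_dodge")

# One plot per nested position, titled with the name that produced it
plots <- lapply(nested, function(pos) {
  ggplot(uncertain_mpg, aes(class)) +
    geom_bar_sample(aes(fill = drv), alpha = 0.5,
                    position = position_nest(pos)) +
    ggtitle(pos)
})

plots[[6]]  # the "stack_dodge" version
```

Comparing the plots side by side is a quick way to choose between overlaying the samples (identity) and separating them (dodge).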
These functions use nested positioning for distributional data, where one of the positions is stacked. This allows you to set different position adjustments for the "main" and "distribution" parts of your plot.
    position_stack_identity(vjust = 1, reverse = FALSE)

    position_stack_dodge(
      vjust = 1,
      reverse = FALSE,
      width = NULL,
      preserve = "single",
      orientation = "x"
    )
vjust |
Vertical adjustment for geoms that have a position
(like points or lines), not a dimension (like bars or areas). Set to
|
reverse |
If |
width |
Dodging width, when different to the width of the individual elements. This is useful when you want to align narrow geoms with wider geoms. See the examples. |
preserve |
Should dodging preserve the |
orientation |
Fallback orientation when the layer or the data does not
indicate an explicit orientation, like |
A ggplot2 position
    # Standard ggplots often have a position adjustment to fix overplotting
    # plot with overplotting
    library(ggplot2)
    ggplot(mpg, aes(class)) +
      geom_bar(aes(fill = drv), position = "stack")
    # normal stack warps the scale and doesn't communicate useful info
    ggplot(uncertain_mpg, aes(class)) +
      geom_bar_sample(aes(fill = drv), position = "stack")
    # stack_identity
    ggplot(uncertain_mpg, aes(class)) +
      geom_bar_sample(aes(fill = drv), position = "stack_identity", alpha=0.2)
    # stack_dodge
    ggplot(uncertain_mpg, aes(class)) +
      geom_bar_sample(aes(fill = drv), position = "stack_dodge")
If the outline of a polygon is deterministic but the fill is random, you should use position subdivide rather than varying the alpha value. This subdivide position can be used with geom_polygon_sample (soon to be extended to others such as geom_sf, geom_map, etc).
    position_subdivide()
A ggplot2 position
    library(ggplot2)
    library(distributional)
    library(dplyr)
    # make data polygon with uncertain fill values
    ids <- factor(c("1.1", "2.1", "1.2", "2.2", "1.3", "2.3"))
    values <- data.frame(
      id = ids,
      value = c(1, 2, 3, 4, 5, 6)
    )
    positions <- data.frame(
      id = rep(ids, each = 4),
      x = c(2, 1, 1.1, 2.2, 1, 0, 0.3, 1.1, 2.2, 1.1, 1.2, 2.5,
            1.1, 0.3, 0.5, 1.2, 2.5, 1.2, 1.3, 2.7, 1.2, 0.5, 0.6, 1.3),
      y = c(-0.5, 0, 1, 0.5, 0, 0.5, 1.5, 1, 0.5, 1, 2.1, 1.7,
            1, 1.5, 2.2, 2.1, 1.7, 2.1, 3.2, 2.8, 2.1, 2.2, 3.3, 3.2)
    )
    datapoly <- merge(values, positions, by = c("id"))
    uncertain_datapoly <- datapoly |>
      mutate(value = dist_uniform(value, value + 0.8))
    # ggplot
    ggplot(datapoly, aes(x = x, y = y)) +
      geom_polygon(aes(fill = value, group = id))
    # ggdibbler
    ggplot(uncertain_datapoly, aes(x = x, y = y)) +
      geom_polygon_sample(aes(fill = value, group = id), times=50,
                          position = "subdivide")
Simulates outcomes from all distributions in the dataset to make an "expanded" data set that can be interpreted by ggplot2. This can be used to debug ggdibbler plots, or to make an uncertainty visualisation for a geom that doesn't exist. If (and only if) you are implementing a ggdibbler version of a ggplot stat extension, you should use dibble_to_tibble instead.
    sample_expand(data, times = 10, seed = NULL)

    dibble_to_tibble(data, params)
data |
Distribution dataset to expand into samples |
times |
A parameter used to control the number of values sampled from each distribution. |
seed |
Set the seed for the layer's random draw; this lets you get the same draw from repeated sample_expand calls. |
params |
The params argument for the stat function. Example call for the sampling function: sample_expand(uncertain_mpg, times = 10) |
A data frame of resampled values from the input distributions
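A minimal sketch of the debugging workflow described above: expand the distributions once, inspect the result, then hand it to an ordinary ggplot2 geom. It assumes the expanded columns keep their original names (displ and hwy here); the exact layout of the returned data frame is not documented in this section, so inspect it with str() before relying on it.

```r
library(ggplot2)
library(ggdibbler)

# Replace every distribution in the data with simulated outcomes
expanded <- sample_expand(uncertain_mpg, times = 5, seed = 1)

# Inspect the result before plotting and compare its size with the input
str(expanded)
nrow(uncertain_mpg)
nrow(expanded)

# Assumption: displ and hwy are now ordinary numeric columns, so a plain
# ggplot2 geom can draw the simulated outcomes directly.
ggplot(expanded, aes(displ, hwy)) +
  geom_point(alpha = 0.3, size = 0.5)
```

Calling sample_expand() again with the same seed should return the same draws, which makes it easier to compare a hand-built plot against the matching *_sample layer.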
These scales allow distributions to be passed to the x and y positions by mapping distribution objects to continuous aesthetics. They can be used similarly to the scale_*_continuous functions, but they do not accept transformations. To transform a scale, apply the transformation through the coord_* functions instead: they are applied after the stat, so the existing ggplot infrastructure can be used. For example, for a log transformation of the x axis, plot + coord_transform(x = "log") would work fine.
    scale_x_continuous_distribution(
      name = waiver(),
      breaks = waiver(),
      labels = waiver(),
      limits = NULL,
      expand = waiver(),
      oob = oob_keep,
      guide = waiver(),
      position = "bottom",
      sec.axis = waiver(),
      ...
    )

    scale_y_continuous_distribution(
      name = waiver(),
      breaks = waiver(),
      labels = waiver(),
      limits = NULL,
      expand = waiver(),
      oob = scales::oob_keep,
      guide = waiver(),
      position = "left",
      sec.axis = waiver(),
      ...
    )
name |
The name of the scale. Used as the axis or legend title. If
|
breaks |
One of:
|
labels |
One of the options below. Please note that when
|
limits |
One of:
|
expand |
For position scales, a vector of range expansion constants used to add some
padding around the data to ensure that they are placed some distance
away from the axes. Use the convenience function |
oob |
One of:
|
guide |
A function used to create a guide or its name. See
|
position |
For position scales, the position of the axis.
|
sec.axis |
|
... |
Other arguments passed on to |
A ggplot2 scale
    library(ggplot2)
    library(distributional)
    set.seed(1997)
    point_data <- data.frame(
      xvar = c(dist_uniform(2,3), dist_normal(3,2), dist_exponential(3)),
      yvar = c(dist_gamma(2,1), dist_sample(x = list(rnorm(100, 5, 1))), dist_exponential(1))
    )
    ggplot(data = point_data) +
      geom_point_sample(aes(x=xvar, y=yvar)) +
      scale_x_continuous_distribution(name="Hello, I am a random variable", limits = c(-5, 10)) +
      scale_y_continuous_distribution(name="I am also a random variable")
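The description above recommends transforming through the coord rather than the scale. The sketch below illustrates that advice under two assumptions: the installed ggplot2 is recent enough to provide coord_transform(), and every x distribution has positive support so that the logarithm of each draw is defined. The data set positive_data is made up for this example.

```r
library(ggplot2)
library(distributional)
library(ggdibbler)

# All x distributions have positive support, so log(x) exists for every draw
positive_data <- data.frame(
  xvar = c(dist_uniform(2, 3), dist_gamma(2, 1), dist_exponential(3)),
  yvar = c(dist_normal(0, 1), dist_normal(2, 1), dist_normal(4, 1))
)

p <- ggplot(positive_data) +
  geom_point_sample(aes(x = xvar, y = yvar), seed = 1) +
  scale_x_continuous_distribution(name = "x (log coordinate)") +
  coord_transform(x = "log")  # transformation applied by the coord, not the scale

p
```

Because the coord acts on values that have already been sampled, the distribution scale itself stays untransformed.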
These scales allow discrete distributions to be passed to the x and y positions by mapping distribution objects to discrete aesthetics. They can be used similarly to the scale_*_discrete functions. To transform a scale, apply the transformation through the coord_* functions: they are applied after the stat, so the existing ggplot infrastructure can be used.
    scale_x_discrete_distribution(
      name = waiver(),
      palette = seq_len,
      expand = waiver(),
      guide = waiver(),
      position = "bottom",
      sec.axis = waiver(),
      continuous.limits = NULL,
      drop = TRUE,
      ...
    )

    scale_y_discrete_distribution(
      name = waiver(),
      palette = seq_len,
      expand = waiver(),
      guide = waiver(),
      position = "left",
      sec.axis = waiver(),
      continuous.limits = NULL,
      drop = TRUE,
      ...
    )
name |
The name of the scale. Used as the axis or legend title. If
|
palette |
A palette function that when called with a single integer argument (the number of levels in the scale) returns the numerical values that they should take. |
expand |
For position scales, a vector of range expansion constants used to add some
padding around the data to ensure that they are placed some distance
away from the axes. Use the convenience function |
guide |
A function used to create a guide or its name. See
|
position |
For position scales, the position of the axis.
|
sec.axis |
|
continuous.limits |
One of:
|
drop |
|
... |
Arguments passed on to
|
A ggplot2 scale
    library(ggplot2)
    # ggplot
    ggplot(smaller_diamonds, aes(x = cut, y = clarity)) +
      geom_count(aes(size = after_stat(prop)))
    # ggdibbler
    ggplot(smaller_uncertain_diamonds, aes(x = cut, y = clarity)) +
      geom_count_sample(aes(size = after_stat(prop)), times=10, alpha=0.1)
Generates a single value from the distribution and uses it to set the default ggplot scale. The scale can be changed later in the ggplot by using any scale_* function.
    ## S3 method for class 'distribution'
    scale_type(x)
x |
value being scaled |
A character vector of scale types. The scale type is the ggplot scale type of the outcome of the distribution.
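As a small illustration, the method can be called directly on distribution vectors once ggdibbler is loaded. The two distributions below are made up for this example, and no output is shown because the result depends on the type of the value drawn from each distribution.

```r
library(ggplot2)
library(distributional)
library(ggdibbler)  # provides the scale_type() method for distributions

# A distribution with numeric outcomes and one with character outcomes
continuous_var <- dist_normal(mu = 0, sigma = 1)
discrete_var <- dist_categorical(
  prob = list(c(0.2, 0.8)),
  outcomes = list(c("f", "r"))
)

# Each call returns a character vector naming the default scale type
scale_type(continuous_var)
scale_type(discrete_var)
```

If the default picked this way is not the one you want, adding any scale_* function to the plot overrides it, as noted above.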
This dataset is a subset of the diamonds data. There is a deterministic version that is only a subset (smaller_diamonds) and a version that has random variables (smaller_uncertain_diamonds). The data is only a subset because the ggdibbler approach can take quite a long time when applied to the full-sized diamonds data set. An uncertain version of the original diamonds data is also available as uncertain_diamonds, although it isn't used in any examples.
    smaller_diamonds
    uncertain_diamonds
A data frame with almost 54000 observations and 10 variables:
| Variable | Description |
|---|---|
| price | Binomial random variable - price in US dollars ($326–$18,823) |
| carat | Normal random variable - weight of the diamond (0.2–5.01) |
| cut | Categorical random variable - quality of the cut (Fair, Good, Very Good, Premium, Ideal) |
| color | Categorical random variable - diamond colour, from D (best) to J (worst) |
| clarity | Categorical random variable - a measurement of how clear the diamond is (I1 (worst), SI2, SI1, VS2, VS1, VVS2, VVS1, IF (best)) |
| x | Normal random variable - length in mm (0–10.74) |
| y | Normal random variable - width in mm (0–58.9) |
| z | Normal random variable - depth in mm (0–31.8) |
| depth | Normal random variable - total depth percentage = z / mean(x, y) = 2 * z / (x + y) (43–79) |
| table | Normal random variable - width of top of diamond relative to widest point (43–95) |
An object of class tbl_df (inherits from tbl, data.frame) with 1000 rows and 10 columns.
An object of class tbl_df (inherits from tbl, data.frame) with 5000 rows and 20 columns.
Identical to stat_connect, except that it will accept a distribution in place of any of the usual aesthetics.
    stat_connect_sample(
      mapping = NULL,
      data = NULL,
      geom = "path",
      position = "identity",
      ...,
      times = 10,
      seed = NULL,
      connection = "hv",
      na.rm = FALSE,
      show.legend = NA,
      inherit.aes = TRUE
    )
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
The geometric object to use to display the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
times |
A parameter used to control the number of values sampled from each distribution. |
seed |
Set the seed for the layer's random draw; this lets you plot the same draw across multiple layers. |
connection |
A specification of how two points are connected. Can be one of the following:
|
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
A ggplot2 layer
    # set up data
    library(ggplot2)
    x <- seq(0, 1, length.out = 20)[-1]
    smooth <- cbind(x, scales::rescale(1 / (1 + exp(-(x * 10 - 5)))))
    zigzag <- cbind(c(0.4, 0.6, 1), c(0.75, 0.25, 1))
    # ggplot
    ggplot(head(economics, 10), aes(date, unemploy)) +
      stat_connect(aes(colour = "zigzag"), connection = zigzag) +
      stat_connect(aes(colour = "smooth"), connection = smooth) +
      geom_point()
    # ggdibbler
    ggplot(head(uncertain_economics, 10), aes(date, unemploy)) +
      stat_connect_sample(aes(colour = "zigzag"), connection = zigzag, seed=64) +
      stat_connect_sample(aes(colour = "smooth"), connection = smooth, seed=64) +
      geom_point_sample(seed=64)
Identical to stat_ecdf, except that it will accept a distribution in place of any of the usual aesthetics.
    stat_ecdf_sample(
      mapping = NULL,
      data = NULL,
      geom = "step",
      position = "identity",
      ...,
      times = 10,
      seed = NULL,
      n = NULL,
      pad = TRUE,
      na.rm = FALSE,
      show.legend = NA,
      inherit.aes = TRUE
    )
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
The geometric object to use to display the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
times |
A parameter used to control the number of values sampled from each distribution. |
seed |
Set the seed for the layer's random draw; this lets you plot the same draw across multiple layers. |
n |
if NULL, do not interpolate. If not NULL, this is the number of points to interpolate with. |
pad |
If |
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
    library(ggplot2)
    library(dplyr)
    library(distributional)
    set.seed(44)
    # df
    df <- data.frame(
      x = c(rnorm(100, 0, 3), rnorm(100, 0, 10)),
      g = gl(2, 100)
    )
    uncertain_df <- df |>
      group_by(x) |>
      mutate(x = dist_normal(x, runif(1,0,5)),
             g_pred = dist_bernoulli(0.9-0.8*(2-as.numeric(g)))
      )
    # ggplot
    ggplot(df, aes(x)) +
      stat_ecdf(geom = "step")
    # ggdibbler
    ggplot(uncertain_df, aes(x)) +
      stat_ecdf_sample(geom = "step", alpha=0.3)
    # Multiple ECDFs
    # ggplot
    ggplot(df, aes(x, colour = g)) +
      stat_ecdf()
    # ggdibbler 1
    ggplot(uncertain_df, aes(x, colour = g)) +
      stat_ecdf_sample(alpha=0.3)
Identical to stat_ellipse, except that it will accept a distribution in place of any of the usual aesthetics.
    stat_ellipse_sample(
      mapping = NULL,
      data = NULL,
      geom = "path",
      position = "identity",
      ...,
      times = 10,
      seed = NULL,
      type = "t",
      level = 0.95,
      segments = 51,
      na.rm = FALSE,
      show.legend = NA,
      inherit.aes = TRUE
    )
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
The geometric object to use to display the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
times |
A parameter used to control the number of values sampled from each distribution. |
seed |
Set the seed for the layer's random draw; this lets you plot the same draw across multiple layers. |
type |
The type of ellipse.
The default |
level |
The level at which to draw an ellipse,
or, if |
segments |
The number of segments to be used in drawing the ellipse. |
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
A ggplot2 layer
    library(ggplot2)
    library(distributional)
    # ggplot
    ggplot(faithful, aes(waiting, eruptions)) +
      geom_point() +
      stat_ellipse()
    # ggdibbler
    ggplot(uncertain_faithful, aes(waiting, eruptions)) +
      geom_point_sample() +
      stat_ellipse_sample()
    # ggplot
    ggplot(faithful, aes(waiting, eruptions, color = eruptions > 3)) +
      geom_point() +
      stat_ellipse(type = "t")
    # ggdibbler
    ggplot(uncertain_faithful,
           aes(waiting, eruptions,
               color = dist_transformed(eruptions, function(x) x > 3, identity))) +
      geom_point_sample() +
      stat_ellipse_sample(type = "t") +
      labs(colour = "eruptions > 3")
    # ggplot
    ggplot(faithful, aes(waiting, eruptions, fill = eruptions > 3)) +
      stat_ellipse(geom = "polygon")
    # ggdibbler
    ggplot(uncertain_faithful,
           aes(waiting, eruptions,
               fill = dist_transformed(eruptions, function(x) x > 3, identity))) +
      stat_ellipse_sample(geom = "polygon", alpha=0.1) +
      labs(fill = "eruptions > 3")
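The examples above map colour and fill to dist_transformed(eruptions, function(x) x > 3, identity). The sketch below uses only distributional to show what such a transformed distribution produces when sampled; the eruptions vector here is a made-up stand-in, not the uncertain_faithful column.

```r
library(distributional)

# Made-up stand-in for an uncertain eruptions column
eruptions <- dist_normal(mu = c(2, 3, 4.5), sigma = c(0.5, 0.5, 0.5))

# The transform maps each draw to TRUE/FALSE. The third argument is the
# inverse slot, filled with identity as in the examples above.
over_three <- dist_transformed(eruptions, function(x) x > 3, identity)

set.seed(1)
draws <- generate(over_three, times = 10)

draws                # a list with one logical vector per distribution
sapply(draws, mean)  # share of draws above 3 for each distribution
```

A distribution centred well below or well above 3 gives draws that mostly agree, while one centred near 3 gives a mix, which is what produces the mixed colours in the plots.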
This can be thought of as the ggdibbler equivalent of stat_identity. It is the default stat used for most geoms.
    stat_identity_sample(
      mapping = NULL, data = NULL, geom = "point", position = "identity",
      ..., times = 10, seed = NULL,
      na.rm = FALSE, show.legend = NA, inherit.aes = TRUE
    )
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
The geometric object to use to display the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
times |
A parameter used to control the number of values sampled from each distribution. |
seed |
Sets the seed for the layer's random draw, so you can plot the same draw across multiple layers. |
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
A ggplot2 geom representing a point_sample which can be added to a ggplot object
A ggplot2 layer
    library(ggplot2)
    p <- ggplot(mtcars, aes(wt, mpg))
    p + stat_identity()
    q <- ggplot(uncertain_mtcars, aes(wt, mpg))
    q + stat_identity_sample(aes(colour = after_stat(drawID)))
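The roles of `times` and `seed` may be easier to picture with a small base R sketch. This is not ggdibbler's internal code. It assumes every uncertain value is a normal distribution described by a hypothetical mean and standard deviation, and the helper name `sample_draws` is made up for illustration (the `drawID` column name is borrowed from the example above).

```r
# Hypothetical helper (not part of ggdibbler): expand each value into
# `times` random draws, labelled by the draw they belong to.
sample_draws <- function(mu, sigma, times = 10, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)  # fixing the seed fixes the draws
  n <- length(mu)
  data.frame(
    row    = rep(seq_len(n), times = times),
    drawID = rep(seq_len(times), each = n),
    value  = rnorm(n * times, mean = rep(mu, times), sd = rep(sigma, times))
  )
}

mu    <- c(21.0, 22.8, 18.7)  # assumed point estimates
sigma <- c(0.5, 1.0, 2.0)     # assumed standard deviations

a <- sample_draws(mu, sigma, times = 10, seed = 4)
b <- sample_draws(mu, sigma, times = 10, seed = 4)

nrow(a)          # one row per original value per draw
identical(a, b)  # the same seed reproduces the same draws
```

Under these assumptions, two layers that share a seed would be working from the same sampled values, which is the behaviour the `seed` argument describes.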
Identical to stat_manual, except that it will accept a distribution in place of any of the usual aesthetics.
    stat_manual_sample(
      mapping = NULL, data = NULL, geom = "point", position = "identity",
      ..., times = 10, seed = NULL, fun = identity, args = list(),
      na.rm = FALSE, show.legend = NA, inherit.aes = TRUE
    )
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
The geometric object to use to display the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
times |
A parameter used to control the number of values sampled from each distribution. |
seed |
Sets the seed for the layer's random draw, so you can plot the same draw across multiple layers. |
fun |
Function that takes a data frame as input and returns a data
frame or data frame-like list as output. The default ( |
args |
A list of arguments to pass to the function given in |
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
A ggplot2 layer
    library(ggplot2)
    library(distributional)
    # A standard scatterplot
    p <- ggplot(mtcars, aes(disp, mpg, colour = factor(cyl))) +
      geom_point()
    q <- ggplot(uncertain_mtcars,
                aes(disp, mpg, colour = dist_transformed(cyl, factor, as.numeric))) +
      labs(colour = "factor(cyl)") +
      geom_point_sample()
    # Using a custom function
    make_hull <- function(data) {
      hull <- chull(x = data$x, y = data$y)
      data.frame(x = data$x[hull], y = data$y[hull])
    }
    p + stat_manual(geom = "polygon", fun = make_hull, fill = NA)
    q + stat_manual_sample(geom = "polygon", fun = make_hull, fill = NA)
    # Using the `transform` function with quoting
    p + stat_manual(
      geom = "segment", fun = transform,
      args = list(xend = quote(mean(x)), yend = quote(mean(y)))
    )
    q + stat_manual_sample(
      geom = "segment", fun = transform,
      args = list(xend = quote(mean(x)), yend = quote(mean(y)))
    )
    # Using dplyr verbs with `vars()`
    if (requireNamespace("dplyr", quietly = TRUE)) {
      # Get centroids with `summarise()`
      p + stat_manual(
        size = 10, shape = 21, fun = dplyr::summarise,
        args = vars(x = mean(x), y = mean(y))
      )
      q + stat_manual_sample(
        size = 10, shape = 21, fun = dplyr::summarise,
        args = vars(x = mean(x), y = mean(y))
      )
    }
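One way to read the convex hull example is that the custom function runs once for each draw, so the plot shows one hull per draw rather than a single hull. The base R sketch below illustrates that reading. It is an assumption about the behaviour, not ggdibbler's implementation, and the sampled data are simulated from hypothetical normal distributions.

```r
# Same hull function as in the example above.
make_hull <- function(data) {
  hull <- chull(x = data$x, y = data$y)
  data.frame(x = data$x[hull], y = data$y[hull])
}

set.seed(1)
times <- 5   # number of draws
n <- 20      # number of uncertain points

# Hypothetical sampled data: each of the n points is drawn `times` times.
draws <- data.frame(
  drawID = rep(seq_len(times), each = n),
  x = rnorm(n * times, mean = rep(seq_len(n), times), sd = 0.5),
  y = rnorm(n * times, mean = rep(seq_len(n) %% 4, times), sd = 0.5)
)

# Apply the function separately to each draw and record the draw it came from.
hulls <- do.call(rbind, lapply(split(draws, draws$drawID), function(d) {
  cbind(drawID = d$drawID[1], make_hull(d))
}))

table(hulls$drawID)  # number of hull vertices found in each draw
```

The hulls differ between draws because the sampled points differ, and that variation is what the overlaid polygons convey.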
Identical to geom_qq, stat_qq, geom_qq_line, and stat_qq_line, except that they accept a distribution in place of any of the usual aesthetics.
    geom_qq_line_sample(
      mapping = NULL, data = NULL, geom = "abline", position = "identity",
      ..., times = 10, seed = NULL,
      distribution = stats::qnorm, dparams = list(),
      line.p = c(0.25, 0.75), fullrange = FALSE,
      na.rm = FALSE, show.legend = NA, inherit.aes = TRUE
    )
    stat_qq_line_sample(
      mapping = NULL, data = NULL, geom = "abline", position = "identity",
      ..., times = 10, seed = NULL,
      distribution = stats::qnorm, dparams = list(),
      line.p = c(0.25, 0.75), fullrange = FALSE,
      na.rm = FALSE, show.legend = NA, inherit.aes = TRUE
    )
    geom_qq_sample(
      mapping = NULL, data = NULL, geom = "point", position = "identity",
      ..., times = 10, seed = NULL,
      distribution = stats::qnorm, dparams = list(),
      na.rm = FALSE, show.legend = NA, inherit.aes = TRUE
    )
    stat_qq_sample(
      mapping = NULL, data = NULL, geom = "point", position = "identity",
      ..., times = 10, seed = NULL,
      distribution = stats::qnorm, dparams = list(),
      na.rm = FALSE, show.legend = NA, inherit.aes = TRUE
    )
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
The geometric object to use to display the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
times |
A parameter used to control the number of values sampled from each distribution. |
seed |
Sets the seed for the layer's random draw, so you can plot the same draw across multiple layers. |
distribution |
Distribution function to use, if x not specified |
dparams |
Additional parameters passed on to |
line.p |
Vector of quantiles to use when fitting the Q-Q line; defaults to c(0.25, 0.75). |
fullrange |
Should the q-q line span the full range of the plot, or just the data |
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
A ggplot2 layer
    library(ggplot2)
    library(distributional)
    df <- data.frame(y = rt(200, df = 5))
    uncertain_df <- data.frame(y = dist_normal(rt(200, df = 5), runif(200)))
    # ggplot
    p <- ggplot(df, aes(sample = y))
    p + stat_qq() + stat_qq_line()
    # ggdibbler
    q <- ggplot(uncertain_df, aes(sample = y))
    q + stat_qq_sample() + stat_qq_line_sample()
    # Using to explore the distribution of a variable
    # ggplot
    ggplot(mtcars, aes(sample = mpg)) +
      stat_qq() +
      stat_qq_line()
    # ggdibbler
    ggplot(uncertain_mtcars, aes(sample = mpg)) +
      stat_qq_sample() +
      stat_qq_line_sample()
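To see why the sampled version produces a bundle of Q-Q lines, it may help to compute a line per draw by hand. The sketch below is an illustration in base R. It assumes a normal reference distribution, assumes the line is fitted through the `line.p` quantiles in the usual way, and uses a hypothetical helper name `qq_line`; it is not taken from the ggdibbler source.

```r
# Hypothetical helper: line through the line.p quantiles of a sample,
# plotted against the matching theoretical normal quantiles.
qq_line <- function(sample, line.p = c(0.25, 0.75)) {
  y <- quantile(sample, line.p, names = FALSE)  # sample quantiles
  x <- qnorm(line.p)                            # theoretical quantiles
  slope <- diff(y) / diff(x)
  c(intercept = y[1] - slope * x[1], slope = slope)
}

set.seed(2)
n <- 200
times <- 10
mu <- rt(n, df = 5)   # point estimates, as in the example above
sigma <- runif(n)     # assumed standard deviations

# One draw from each of the n normal distributions gives one Q-Q line;
# repeating this `times` times gives one line per draw.
lines <- t(replicate(times, qq_line(rnorm(n, mean = mu, sd = sigma))))
dim(lines)
head(lines, 3)
```

The spread of intercepts and slopes across the rows of `lines` reflects how much the fitted line depends on the uncertainty in the data.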
Identical to stat_summary_2d and stat_summary_hex, except that they will accept a distribution in place of any of the usual aesthetics.
    stat_summary_2d_sample(
      mapping = NULL, data = NULL, geom = "tile", position = "identity_dodge",
      ..., times = 10, seed = NULL,
      binwidth = NULL, bins = 30, breaks = NULL, drop = TRUE,
      fun = "mean", fun.args = list(),
      boundary = 0, closed = NULL, center = NULL,
      na.rm = FALSE, show.legend = NA, inherit.aes = TRUE
    )
    stat_summary_hex_sample(
      mapping = NULL, data = NULL, geom = "hex", position = "identity",
      ..., times = 10, seed = NULL,
      binwidth = NULL, bins = 30, drop = TRUE,
      fun = "mean", fun.args = list(),
      na.rm = FALSE, show.legend = NA, inherit.aes = TRUE
    )
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
The geometric object to use to display the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
times |
A parameter used to control the number of values sampled from each distribution. |
seed |
Sets the seed for the layer's random draw, so you can plot the same draw across multiple layers. |
binwidth |
The width of the bins. Can be specified as a numeric value
or as a function that takes x after scale transformation as input and
returns a single numeric value. When specifying a function along with a
grouping structure, the function will be called once per group.
The default is to use the number of bins in The bin width of a date variable is the number of days in each time; the bin width of a time variable is the number of seconds. |
bins |
Number of bins. Overridden by |
breaks |
Alternatively, you can supply a numeric vector giving
the bin boundaries. Overrides |
drop |
drop if the output of |
fun |
function for summary. |
fun.args |
A list of extra arguments to pass to |
closed |
One of |
center, boundary
|
bin position specifiers. Only one, |
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
    library(ggplot2)
    d <- ggplot(smaller_diamonds, aes(carat, depth, z = price))
    d + stat_summary_2d()
    b <- ggplot(smaller_uncertain_diamonds, aes(carat, depth, z = price))
    b + stat_summary_2d_sample()
    # summary_hex
    d + stat_summary_hex(fun = ~ sum(.x^2))
    b + stat_summary_hex_sample(fun = ~ sum(.x^2), alpha = 0.3)
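As a rough mental model, a binned summary over uncertain data can be thought of as the same binned summary repeated once per draw. The base R sketch below assumes, for simplicity, that only the summarised variable (`z`, here a simulated price) is uncertain, so the bins stay fixed across draws. All values and the 3 x 3 grid are hypothetical, and none of this is taken from the ggdibbler source.

```r
set.seed(5)
times <- 3
n <- 100
carat <- runif(n, 0.2, 2)     # fixed x values
depth <- runif(n, 58, 64)     # fixed y values
price_mu <- 4000 * carat      # assumed point estimates for z
price_sd <- 300               # assumed uncertainty in z

# Fixed 3 x 3 rectangular bins over x and y.
x_bin <- cut(carat, breaks = seq(0.2, 2, length.out = 4), include.lowest = TRUE)
y_bin <- cut(depth, breaks = seq(58, 64, length.out = 4), include.lowest = TRUE)

# One grid of bin means per draw of the uncertain z variable.
grids <- lapply(seq_len(times), function(i) {
  price <- rnorm(n, mean = price_mu, sd = price_sd)
  tapply(price, list(x_bin, y_bin), mean)
})

length(grids)     # one grid per draw
dim(grids[[1]])   # bins in x by bins in y
```

Comparing the same cell across the grids shows how much the summary in that bin varies between draws.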
Identical to stat_summary and stat_summary_bin, except that they will accept a distribution in place of any of the usual aesthetics.
    stat_summary_bin_sample(
      mapping = NULL, data = NULL, times = 10, seed = NULL,
      geom = "pointrange", position = "identity", ...,
      fun.data = NULL, fun = NULL, fun.max = NULL, fun.min = NULL,
      fun.args = list(), bins = 30, binwidth = NULL, breaks = NULL,
      na.rm = FALSE, orientation = NA, show.legend = NA, inherit.aes = TRUE,
      fun.y = deprecated(), fun.ymin = deprecated(), fun.ymax = deprecated()
    )
    stat_summary_sample(
      mapping = NULL, data = NULL, times = 10, seed = NULL,
      geom = "pointrange", position = "identity", ...,
      fun.data = NULL, fun = NULL, fun.max = NULL, fun.min = NULL,
      fun.args = list(),
      na.rm = FALSE, orientation = NA, show.legend = NA, inherit.aes = TRUE,
      fun.y = deprecated(), fun.ymin = deprecated(), fun.ymax = deprecated()
    )
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
times |
A parameter used to control the number of values sampled from each distribution. |
seed |
Sets the seed for the layer's random draw, so you can plot the same draw across multiple layers. |
geom |
The geometric object to use to display the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
fun.data |
A function that is given the complete data and should
return a data frame with variables |
fun.min, fun, fun.max
|
Alternatively, supply three individual functions that are each passed a vector of values and should return a single number. |
fun.args |
Optional additional arguments passed on to the functions. |
bins |
Number of bins. Overridden by |
binwidth |
The width of the bins. Can be specified as a numeric value
or as a function that takes x after scale transformation as input and
returns a single numeric value. When specifying a function along with a
grouping structure, the function will be called once per group.
The default is to use the number of bins in The bin width of a date variable is the number of days in each time; the bin width of a time variable is the number of seconds. |
breaks |
Alternatively, you can supply a numeric vector giving the bin
boundaries. Overrides |
na.rm |
If |
orientation |
The orientation of the layer. The default ( |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
fun.ymin, fun.y, fun.ymax
|
    library(ggplot2)
    library(distributional)
    d <- ggplot(mtcars, aes(cyl, mpg)) +
      geom_point()
    b <- ggplot(uncertain_mtcars, aes(cyl, mpg)) +
      geom_point_sample(seed = 4)
    d + stat_summary(fun = "median", colour = "red", geom = "point")
    b + stat_summary_sample(fun = "median", colour = "red", geom = "point")
    d + aes(colour = factor(vs)) +
      stat_summary(fun = mean, geom = "line")
    b + aes(colour = dist_transformed(vs, factor, as.numeric)) +
      stat_summary_sample(fun = mean, geom = "line", seed = 4) +
      labs(colour = "factor(vs)")
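The median example above can be imitated in base R by computing one summary per group within each draw. The sketch below assumes the grouping variable is known exactly and that each `mpg` value is normally distributed around a made-up point estimate. It is meant only to illustrate the idea of a per-draw summary, not to reproduce ggdibbler's internals.

```r
set.seed(4)
times <- 10
cyl <- c(4, 4, 6, 6, 8, 8)              # grouping variable, taken as known
mpg_mu <- c(30, 26, 21, 19, 15, 14)     # assumed point estimates
mpg_sd <- 1                             # assumed uncertainty

# Long table of sampled values, labelled by draw.
draws <- data.frame(
  drawID = rep(seq_len(times), each = length(cyl)),
  cyl    = rep(cyl, times),
  mpg    = rnorm(length(cyl) * times, mean = rep(mpg_mu, times), sd = mpg_sd)
)

# One median per group per draw, instead of one median per group.
per_draw <- aggregate(mpg ~ cyl + drawID, data = draws, FUN = median)

nrow(per_draw)                           # groups times draws
tapply(per_draw$mpg, per_draw$cyl, range)  # spread of the medians per group
```

Plotting every row of `per_draw` would give a cloud of red points per group, similar in spirit to the sampled summary layer.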
Identical to stat_unique, except that it will accept a distribution in place of any of the usual aesthetics. Note that values are only unique within each draw; at the final plot level there may still be duplicates.
    stat_unique_sample(
      mapping = NULL, data = NULL, geom = "point", position = "identity",
      ..., times = 10, seed = NULL,
      na.rm = FALSE, show.legend = NA, inherit.aes = TRUE
    )
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
The geometric object to use to display the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
times |
A parameter used to control the number of values sampled from each distribution. |
seed |
Sets the seed for the layer's random draw, so you can plot the same draw across multiple layers. |
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
    library(ggplot2)
    # ggplot
    ggplot(mtcars, aes(vs, am)) +
      geom_point(alpha = 0.1)
    # ggdibbler
    ggplot(uncertain_mtcars, aes(vs, am)) +
      geom_point_sample(alpha = 0.01)
    # ggplot
    ggplot(mtcars, aes(vs, am)) +
      geom_point(alpha = 0.1, stat = "unique")
    # ggdibbler
    ggplot(uncertain_mtcars, aes(vs, am)) +
      geom_point_sample(alpha = 0.01, stat = "unique_sample")
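The note that values are only unique within each draw can be illustrated with a short base R sketch. It simulates binary `vs` and `am` values from hypothetical Bernoulli probabilities and deduplicates with the draw identifier included, which is an assumption about how "unique within a draw" should be read rather than a copy of the package's code.

```r
set.seed(3)
times <- 10
n <- 32
p_vs <- 0.4   # assumed probability that vs is 1
p_am <- 0.4   # assumed probability that am is 1

# Sampled binary values for every row in every draw.
draws <- data.frame(
  drawID = rep(seq_len(times), each = n),
  vs = rbinom(n * times, size = 1, prob = p_vs),
  am = rbinom(n * times, size = 1, prob = p_am)
)

# Deduplicate with drawID included, so rows are unique within a draw only.
unique_within <- unique(draws)

# At most four (vs, am) combinations can survive in any one draw.
max(table(unique_within$drawID))

# The same combination still shows up in more than one draw.
anyDuplicated(unique_within[, c("vs", "am")]) > 0
```

This is why the sampled plot can still show overplotting at each position even with the unique stat: each draw contributes its own copy of a combination.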
There are several measurements for each county, with no location marker for individual scientists, to preserve anonymity. Counties can have different numbers of observations as well as different levels of variance between the observations in the county.
A tibble with 99 observations and 4 variables
the name of each Iowa county
the ambient temperature recorded by the citizen scientist
the ID number for the scientist who made the recording
the shape file for each county of Iowa
the centroid longitude for each county of Iowa
the centroid latitude for each county of Iowa
The map shows a wave pattern in temperature across the state of Iowa. Each estimate also has an uncertainty component added and is represented as a distribution.
A tibble with 99 observations and 4 variables
the name of each Iowa county
the temperature of each county as a distribution
the shape file for each county of Iowa
This dataset is identical to the economics data, except that every variable in the data set is represented by a normal random variable. The original 'economics' dataset is based on real US economic time series data, but the uncertainty we added is hypothetical and included for illustrative purposes.
uncertain_economics_long
A data frame with 574 observations and 6 variables:
A deterministic variable - Month of data collection
Normal random variable - personal consumption expenditures, in billions of dollars
Normal random variable - total population, in thousands
Normal random variable - personal savings rate
Normal random variable - median duration of unemployment, in weeks
Normal random variable - number of unemployed in thousands
An object of class tbl_df (inherits from tbl, data.frame) with 2870 rows and 4 columns.
The Old Faithful data from the datasets package, with added uncertainty.
A data frame:
Eruption time in mins
Waiting time to next eruption in mins
A 2D density estimate of the waiting and eruptions variables in the faithful data. Unlike the other uncertain datasets, the only uncertain variable is density. Since this is based on a model, it wouldn't make sense for eruptions or waiting to be represented as random variables.
A data frame with 5,625 observations and 3 variables:
Eruption time in mins
Waiting time to next eruption in mins
A 2d density estimate that is normally distributed with a low variance
A 2d density estimate that is normally distributed with a medium variance
A 2d density estimate that is normally distributed with a high variance
This dataset is based on the fuel economy data from 1999 to 2008 from 'ggplot2', but every value is represented by a distribution. Every variable in the data set is represented by a categorical, discrete, or continuous random variable. The original mpg dataset in 'ggplot2' is a real subset of the fuel economy data from the EPA, but the uncertainty for each data type is hypothetical and was added by us for illustrative purposes.
A data frame with 234 rows and 11 variables:
manufacturer, as a categorical random variable
model name as a categorical random variable
engine displacement, as a uniform random variable to represent bounded data
year of manufacture, as a sample of possible years
number of cylinders, as a categorical random variable
type of transmission, as a categorical random variable
the type of drive train, as a categorical random variable
city miles per gallon, as a discrete random variable
highway miles per gallon, as a discrete random variable
fuel type, as a categorical random variable
"type" of car, as a categorical random variable
This dataset is identical to the mtcars data, except that every variable in the data set is represented by a categorical, discrete, or continuous random variable. The original 'mtcars' dataset in datasets is based on real data extracted from the 1974 Motor Trend US magazine, but the uncertainty we added is hypothetical and included for illustrative purposes.
A data frame with 32 observations and 11 variables:
Uniform random variable - Miles/(US) gallon
Categorical random variable - Number of cylinders
Uniform random variable - Displacement (cu.in.)
Normal random variable - Gross horsepower
Uniform random variable - Rear axle ratio
Uniform random variable - Weight (1000 lbs)
Uniform random variable - 1/4 mile time
Bernoulli random variable - Engine (0 = V-shaped, 1 = straight)
Bernoulli random variable - Transmission (0 = automatic, 1 = manual)
Categorical random variable - Number of forward gears
Categorical random variable - Number of carburetors
Daily step counts during October 2025 for five teams of four people competing in the Walktober 2025 Challenge.
A data frame with 744 observations and 4 variables:
Team name
Name of team member
Date steps were recorded
Number of steps recorded on 'date'