
Given two numeric vectors defining a two-dimensional space and a specified number of bins, this function computes the count for each bin in that space and returns the result as either a data frame or a matrix. A grouping vector can also be supplied to obtain one count matrix per group.
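The binning described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: `count_bins_2d` is a hypothetical helper, and equal-width bins spanning the observed range of each vector are an assumption.

```r
# Hypothetical helper, NOT countMat itself: count points per 2D bin
count_bins_2d <- function(x, y, nbins = 10) {
  # equal-width breaks spanning the observed range of each variable (assumption)
  breaks_x <- seq(min(x), max(x), length.out = nbins + 1)
  breaks_y <- seq(min(y), max(y), length.out = nbins + 1)
  # bin index of each value; include.lowest keeps the minimum in the first bin
  ix <- cut(x, breaks = breaks_x, include.lowest = TRUE, labels = FALSE)
  iy <- cut(y, breaks = breaks_y, include.lowest = TRUE, labels = FALSE)
  # cross-tabulate; fixed factor levels keep empty bins in the table
  table(y = factor(iy, levels = seq_len(nbins)),
        x = factor(ix, levels = seq_len(nbins)))
}

set.seed(1)
x <- rnorm(5000, mean = 50, sd = 10)
y <- rnorm(5000, mean = 50, sd = 10)
counts <- count_bins_2d(x, y, nbins = 10)
dim(counts)  # y bins in rows, x bins in columns
sum(counts)  # each point is counted in exactly one bin
```

Rows index the y bins and columns index the x bins, which mirrors the layout of the matrix printed in the Examples section.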

Usage

countMat(
  x = NULL,
  y = NULL,
  nbins = 100,
  groups = NULL,
  output = c("matrix", "data.frame"),
  na.rm = TRUE
)

Arguments

x

A numeric vector used as the first dimension of the count matrix (the columns in the matrix output shown in Examples).

y

A numeric vector used as the second dimension of the count matrix (the rows in the matrix output shown in Examples).

nbins

A numeric value indicating the number of bins in both vertical and horizontal directions (default = 100).

groups

A character or numeric vector used as a grouping variable to split the count matrix.

output

A character string, either "matrix" or "data.frame", indicating whether the result should be returned as a data frame or as a list of matrices, one per group (if groups is NULL, the function returns a single matrix).

na.rm

A logical value (i.e., TRUE or FALSE) indicating whether NA values should be stripped before the computation proceeds (default = TRUE).
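As a rough illustration of how the groups and na.rm arguments could interact, the base R sketch below drops incomplete (x, y) pairs and then builds one count matrix per group. Every name in it is hypothetical, and sharing the bin breaks across groups is an assumption made here for comparability, not documented behaviour of countMat.

```r
# Illustrative data (hypothetical)
x <- c(1, 2, NA, 4, 5, 6)
y <- c(6, 5, 4, NA, 2, 1)
groups <- c("a", "a", "a", "b", "b", "b")

# keep only observations where both coordinates are present
keep <- complete.cases(x, y)
x <- x[keep]
y <- y[keep]
groups <- groups[keep]

# breaks shared by all groups so the matrices are comparable (assumption)
nbins <- 2
breaks_x <- seq(min(x), max(x), length.out = nbins + 1)
breaks_y <- seq(min(y), max(y), length.out = nbins + 1)

# count matrix for the observations selected by idx
count_one <- function(idx) {
  table(y = cut(y[idx], breaks_y, include.lowest = TRUE),
        x = cut(x[idx], breaks_x, include.lowest = TRUE))
}

# one count matrix per group, returned as a named list
per_group <- lapply(split(seq_along(x), groups), count_one)
names(per_group)
```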

Value

This function returns the counts over the two specified dimensions (x and y), binned according to the nbins argument. Depending on the output argument and on the number of groups given by the groups argument, the result is a data frame, a single matrix, or a list of matrices.
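To make the difference between the two output shapes concrete, the sketch below converts a small count matrix into long format with base R. The column names (`y`, `x`, `count`) are assumptions chosen for illustration; the data frame returned by countMat may be laid out differently.

```r
# a toy 2 x 2 count matrix with named dimensions (hypothetical values)
m <- matrix(c(1, 0, 3, 2), nrow = 2,
            dimnames = list(y = c("10", "20"), x = c("1", "2")))

# long format: one row per bin, holding its y label, x label and count
long <- as.data.frame(as.table(m), responseName = "count",
                      stringsAsFactors = FALSE)
long
```

The matrix form suits surface or image plots, while the long form is usually easier to filter or to pass to plotting functions that expect one row per observation.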

Author

Quentin PETITJEAN

Examples


test <-
countMat(x = round(rnorm(5000, mean = 50 , sd = 10)),
         y = round(rnorm(5000, mean = 50 , sd = 10)),
         nbins = 10,
         output = "matrix")
test
#>       x
#> y      18.45 25.35 32.25 39.15 46.05 52.95 59.85 66.75 73.65 80.55
#>   17.6     1     0     0     4     0     4     3     0     0     0
#>   24.8     0     0     6     8    13     9     7     5     0     0
#>   32       1     5    15    42    64    83    61    17     3     2
#>   39.2     1    12    38   136   171   198   128    52    17     2
#>   46.4     2    17    94   242   392   361   244   107    23     1
#>   53.6     4    14    59   208   313   376   218    77    15     4
#>   60.8     4    17    46   101   204   186   120    50     8     4
#>   68       0     6    22    51    76    77    55    19     8     0
#>   75.2     0     3     3     7    10    15    10     4     4     0
#>   82.4     0     0     4     1     4     2     0     0     0     0
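
The column labels printed above are consistent with the midpoints of ten equal-width bins. The sketch below reproduces them under the assumption that x ranged from 15 to 84 in that run; this range is inferred from the labels, not stated in the output.

```r
# assumed observed range of x in the run above (inferred, not documented)
x_min <- 15
x_max <- 84
nbins <- 10

# equal-width bin edges and the midpoint of each bin
breaks <- seq(x_min, x_max, length.out = nbins + 1)
mids <- (head(breaks, -1) + tail(breaks, -1)) / 2
round(mids, 2)
```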

# it is then possible to draw the count distribution in the 2d space using plotly
if (FALSE) {
# draw the 3d plot using plotly 
library(plotly)

# initialize the plot
fig <- plotly::plot_ly(x=~colnames(test), y=~rownames(test), contours = list(
  z = list(
    show = TRUE,
    # round to 2 decimals; digits = -2 rounds to the nearest hundred (0 for these counts)
    start = round(min(sqrt(test)), 2),
    project = list(z = TRUE),
    end = round(max(sqrt(test)), 2),
    size = max(sqrt(test)) / 10,
    color = "white"
  )
))
# add the layer
fig <- plotly::add_surface(
  p = fig,
  z = sqrt(test),
  opacity = 0.8,
  colorscale = "Hot",
  cmin = min(sqrt(test)),
  cmax = max(sqrt(test)),
  colorbar = list(title = "counts (sqrt)")
)
# add the title and axis labels (the first 3D scene is named "scene")
fig <- plotly::layout(
  fig,
  title = '3D density plot',
  scene = list(
    xaxis = list(title = "x"),
    yaxis = list(title = "y"),
    zaxis = list(title = "counts (sqrt)")
  )
)
fig
}