Bins

Syntax:

    plot 'DATA' using <XCOL> {:<YCOL>} bins{=<NBINS>}
         {binrange [<LOW>:<HIGH>]} {binwidth=<width>}
         {binvalue={sum|avg}}

The bins option to a plot command first assigns the original data to equal-width bins on x and then plots a single value per bin. The default number of bins is controlled by set samples, but this can be overridden by giving an explicit number of bins in the command.

If no binrange is given, the range is taken from the extremes of the x values found in 'DATA'.

Given the range and the number of bins, the bin width is calculated automatically and points are assigned to bins 0 to NBINS-1:

    BINWIDTH = (HIGH - LOW) / (NBINS-1)
    xmin = LOW - BINWIDTH/2
    xmax = HIGH + BINWIDTH/2
    first bin holds points with (xmin <= x < xmin + BINWIDTH)
    last bin holds points with (xmax-BINWIDTH <= x < xmax)
    each point is assigned to bin i = floor(NBINS * (x-xmin)/(xmax-xmin))
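
The arithmetic above can be checked outside gnuplot with a short Python sketch. This is only an illustration of the formulas as documented, not gnuplot's own code. The function names bin_layout and bin_index are hypothetical, and the handling of points outside [xmin, xmax) (returning None) is an assumption, since the text does not say what happens to them. It also assumes NBINS >= 2 and HIGH > LOW.

```python
import math

def bin_layout(low, high, nbins):
    """Return (binwidth, xmin, xmax) for nbins equal-width bins centred on low..high."""
    binwidth = (high - low) / (nbins - 1)   # requires nbins >= 2
    xmin = low - binwidth / 2               # left edge of the first bin
    xmax = high + binwidth / 2              # right edge of the last bin
    return binwidth, xmin, xmax

def bin_index(x, low, high, nbins):
    """Return the bin index (0 .. nbins-1) for x, or None if x is out of range."""
    binwidth, xmin, xmax = bin_layout(low, high, nbins)
    if not (xmin <= x < xmax):
        return None                         # assumption: out-of-range points are ignored
    i = math.floor(nbins * (x - xmin) / (xmax - xmin))
    return min(i, nbins - 1)                # defensive guard against rounding at the top edge
```

With LOW=0, HIGH=10 and NBINS=11, the bin width is 1, the bins cover -0.5 to 10.5, and the points x=0 and x=10 fall in the first and last bins respectively.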

Alternatively, you can provide a fixed bin width, in which case nbins is calculated as the smallest number of bins that will span the range.

On output, bins are plotted or tabulated by midpoint. For example, if the program calculates bin width as shown above, the x coordinate output for the first bin is x=LOW (not x=xmin).

If only a single column is given in the using clause, then each data point contributes a count of 1 to the accumulation of total counts in the bin for that x coordinate value. If a second column is given, then the value in that column is added to the accumulation for the bin. Thus the following two plot commands are equivalent:

    plot 'DATA' using N bins=20
    set samples 20
    plot 'DATA' using (column(N)):(1) bins

The y value plotted for each bin is the sum of the y values over all points in that bin. This corresponds to binvalue=sum. EXPERIMENTAL: binvalue=avg instead plots the mean y value for that bin.
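
The accumulation described above, including the difference between binvalue=sum and binvalue=avg, can be sketched in Python as follows. This is an illustrative model of the documented behaviour rather than gnuplot's implementation. The function name bin_data and its parameters are hypothetical. Two details are assumptions not stated in the text: points outside the binned range are skipped, and an empty bin reports 0 under avg. The sketch also assumes NBINS >= 2 and a non-empty x range.

```python
import math

def bin_data(xs, ys=None, nbins=20, binrange=None, binvalue="sum"):
    """Return a list of (bin midpoint, value) pairs, one per bin."""
    if ys is None:
        ys = [1.0] * len(xs)                # single column: each point counts as 1
    # No binrange given: take the range from the extremes of the x values.
    low, high = binrange if binrange is not None else (min(xs), max(xs))
    binwidth = (high - low) / (nbins - 1)
    xmin = low - binwidth / 2
    xmax = high + binwidth / 2

    totals = [0.0] * nbins
    counts = [0] * nbins
    for x, y in zip(xs, ys):
        if not (xmin <= x < xmax):
            continue                        # assumption: out-of-range points are skipped
        i = min(nbins - 1, math.floor(nbins * (x - xmin) / (xmax - xmin)))
        totals[i] += y
        counts[i] += 1

    if binvalue == "avg":
        # assumption: an empty bin reports 0 rather than an undefined mean
        values = [t / c if c else 0.0 for t, c in zip(totals, counts)]
    else:
        values = totals

    # Bins are reported by midpoint, so the first bin is at x = low, not x = xmin.
    midpoints = [low + i * binwidth for i in range(nbins)]
    return list(zip(midpoints, values))
```

For example, with xs = [0.0, 0.5, 1.0, 2.5, 4.0], ys = [10, 20, 30, 50, 70] and nbins=3, the bins are centred on 0, 2 and 4. The sums per bin are 30, 80 and 70, and the averages are 15, 40 and 70. Omitting ys gives the counts 2, 2 and 1.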

For related plotting styles see smooth frequency and smooth kdensity.