Create box chart (box plot)

`boxchart(ydata)` creates a box chart, or box plot, for each column of the matrix `ydata`. If `ydata` is a vector, then `boxchart` creates a single box chart.

Each box chart displays the following information: the median, the lower and upper quartiles, any outliers (computed using the interquartile range), and the minimum and maximum values that are not outliers. For more information, see Box Chart (Box Plot).

`boxchart(xgroupdata,ydata)` groups the data in the vector `ydata` according to the unique values in `xgroupdata` and plots each group of data as a separate box chart. `xgroupdata` determines the position of each box chart along the *x*-axis. `ydata` must be a vector, and `xgroupdata` must have the same length as `ydata`.

`boxchart(___,Name,Value)` specifies additional chart options using one or more name-value pair arguments. For example, you can compare sample medians using notches by specifying `'Notch','on'`. Specify the name-value pair arguments after all other input arguments. For a list of properties, see BoxChart Properties.

`b = boxchart(___)` returns a `BoxChart` object. Use `b` to set properties of the box charts after creating them. For a list of properties, see BoxChart Properties.

Create a single box chart from a vector of ages. Use the box chart to understand the distribution of ages.

Load the `patients` data set. The `Age` variable contains the ages of 100 patients. Create a box chart to visualize the distribution of ages.

```
load patients
boxchart(Age)
ylabel('Age (years)')
```

The median patient age of 39 years is shown as the line inside the box. The lower and upper quartiles of 32 and 44 years are shown as the bottom and top edges of the box, respectively. The whisker endpoints correspond to the youngest and oldest patients. The youngest patient is 25 years old, and the oldest is 50 years old. The data set contains no outliers, which would be represented by small circles.
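The statistics described above can also be recomputed numerically. The following is a minimal sketch, assuming the `patients` data set is available and that the `'quartiles'` method of `isoutlier` applies the same interquartile-range rule that `boxchart` uses for outliers; the variable names are illustrative only.

```matlab
% Sketch: recompute the statistics that the box chart displays.
load patients                          % provides the Age vector

ageMedian = median(Age);               % line inside the box
isOut = isoutlier(Age,'quartiles');    % flags values beyond 1.5*IQR from the box
whiskerLow = min(Age(~isOut));         % nonoutlier minimum (lower whisker end)
whiskerHigh = max(Age(~isOut));        % nonoutlier maximum (upper whisker end)
numOutliers = nnz(isOut);              % number of points drawn as outliers

fprintf('Median: %g, whiskers: [%g, %g], outliers: %d\n', ...
    ageMedian,whiskerLow,whiskerHigh,numOutliers)
```

If the chart and the computation agree, the printed values match the median, whisker endpoints, and outlier count read from the figure.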

You can use data tips to get a summary of the data statistics. Hover over the box chart to see the data tip.

Use box charts to compare the distribution of values along the columns and the rows of a magic square.

Create a magic square, with 10 rows and 10 columns.

```
Y = magic(10)
```

```
Y = 10×10

    92    99     1     8    15    67    74    51    58    40
    98    80     7    14    16    73    55    57    64    41
     4    81    88    20    22    54    56    63    70    47
    85    87    19    21     3    60    62    69    71    28
    86    93    25     2     9    61    68    75    52    34
    17    24    76    83    90    42    49    26    33    65
    23     5    82    89    91    48    30    32    39    66
    79     6    13    95    97    29    31    38    45    72
    10    12    94    96    78    35    37    44    46    53
    11    18   100    77    84    36    43    50    27    59
```

Create a box chart for each column of the magic square. Each column has a similar median value (around `50`). However, the first five columns of `Y` have greater interquartile ranges than the last five columns of `Y`. The interquartile range is the distance between the upper quartile (top edge of the box) and the lower quartile (bottom edge of the box).

```
boxchart(Y)
xlabel('Column')
ylabel('Value')
```

Create a box chart for each row of the magic square. Each row has a similar interquartile range, but the median values differ across the rows.

```
boxchart(Y')
xlabel('Row')
ylabel('Value')
```

Plot the magnitudes of earthquakes according to the month in which they occurred. Use a vector of earthquake magnitudes and a grouping variable indicating the month of each earthquake. For each group of data, create a box chart and place it in the specified position along the *x*-axis.

Read a set of tsunami data into the workspace as a table. The data set includes information on earthquakes as well as other causes of tsunamis. Display the first eight rows, showing the month, cause, and earthquake magnitude columns of the table.

```
tsunamis = readtable('tsunamis.xlsx');
tsunamis(1:8,["Month","Cause","EarthquakeMagnitude"])
```

```
ans=8×3 table
    Month          Cause           EarthquakeMagnitude
    _____    __________________    ___________________

     10      {'Earthquake'    }            7.6
      8      {'Earthquake'    }            6.9
     12      {'Volcano'       }            NaN
      3      {'Earthquake'    }            8.1
      3      {'Earthquake'    }            4.5
      5      {'Meteorological'}            NaN
     11      {'Earthquake'    }              9
      3      {'Earthquake'    }            5.8
```

Create the table `earthquakes`, which contains data for the tsunamis caused by earthquakes.

```
unique(tsunamis.Cause)
```

```
ans = 8×1 cell
    {0×0 char                  }
    {'Earthquake'              }
    {'Earthquake and Landslide'}
    {'Landslide'               }
    {'Meteorological'          }
    {'Unknown Cause'           }
    {'Volcano'                 }
    {'Volcano and Landslide'   }
```

```
idx = contains(tsunamis.Cause,'Earthquake');
earthquakes = tsunamis(idx,:);
```

Group the earthquake magnitudes based on the month in which the corresponding tsunamis occurred. For each month, display a separate box chart. For example, `boxchart` uses the fourth, fifth, and eighth earthquake magnitudes, as well as others, to create the third box chart, which corresponds to the third month.

```
boxchart(earthquakes.Month,earthquakes.EarthquakeMagnitude)
xlabel('Month')
ylabel('Earthquake Magnitude')
```

Notice that because the month values are numeric, the *x*-axis ruler is also numeric.

For more descriptive month names, convert the `earthquakes.Month` column to a `categorical` variable.

```
monthOrder = ["Jan","Feb","Mar","Apr","May","Jun","Jul", ...
    "Aug","Sep","Oct","Nov","Dec"];
namedMonths = categorical(earthquakes.Month,1:12,monthOrder);
```

Create the same box charts as before, but use the `categorical` variable `namedMonths` instead of the numeric month values. The *x*-axis ruler is now categorical, and the order of the categories in `namedMonths` determines the order of the box charts.

```
boxchart(namedMonths,earthquakes.EarthquakeMagnitude)
xlabel('Month')
ylabel('Earthquake Magnitude')
```

Group medical patients based on their age, and for each age group, create a box chart of diastolic blood pressure values.

Load the `patients` data set. The `Age` and `Diastolic` variables contain the ages and diastolic blood pressure levels of 100 patients.

`load patients`

Group the patients into five age bins. Find the minimum and maximum ages, and then divide the range between them into five-year bins. Bin the values in the `Age` variable by using the `discretize` function. Use the bin names in `bins`. The resulting `groupAge` variable is a `categorical` variable.

min(Age)

ans = 25

max(Age)

ans = 50

```
binEdges = 25:5:50;
bins = {'late 20s','early 30s','late 30s','early 40s','late 40s+'};
groupAge = discretize(Age,binEdges,'categorical',bins);
```

Create a box chart for each age group. Each box chart shows the diastolic blood pressure values of the patients in that group.

```
boxchart(groupAge,Diastolic)
xlabel('Age Group')
ylabel('Diastolic Blood Pressure')
```

Combine two grouping variables into one, and use the variable to group and position the resulting box charts.

Load the sample file `TemperatureData.csv`, which contains average daily temperatures from January 2015 through July 2016. Read the file into a table.

`tbl = readtable('TemperatureData.csv');`

Convert the `tbl.Month` and `tbl.Year` variables to `categorical` variables. Specify the order of the categories in each variable.

```
monthOrder = {'January','February','March','April','May','June','July', ...
    'August','September','October','November','December'};
yearOrder = [2015 2016];
tbl.Month = categorical(tbl.Month,monthOrder);
tbl.Year = categorical(tbl.Year,yearOrder);
```

Combine the `tbl.Month` and `tbl.Year` variables into one grouping variable `xgroupdata`. Create box charts showing the distribution of temperatures during each month-year pairing. Notice that `tbl` does not contain data for some month-year pairings, such as August 2016.

```
xgroupdata = tbl.Month.*tbl.Year;
boxchart(xgroupdata,tbl.TemperatureF)
ylabel('Temperature (F)')
```

In this figure, you can easily compare the distribution of temperatures for one particular month across multiple years. For example, you can see that February temperatures varied much more in 2016 than in 2015. If you prefer a chronological view of temperatures, you can set `xgroupdata` to `tbl.Year.*tbl.Month`.
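The effect of operand order in `.*` can be seen on a small example. This is a sketch with hypothetical toy data (the variables `mo`, `yr`, and `temperature` are made up for illustration and are not the values in `TemperatureData.csv`); it relies on element-wise multiplication of two `categorical` arrays producing one combined category per pairing.

```matlab
% Hypothetical toy data: two months observed in two years.
mo = categorical(["Jan";"Feb";"Jan";"Feb"],["Jan","Feb"]);
yr = categorical([2015;2015;2016;2016],[2015 2016]);
temperature = [30;35;28;40];

byMonth = mo.*yr;   % operand order as in the example above (month first)
byYear  = yr.*mo;   % reversed operand order (year first)

% Inspect the category order that each operand order produces.
disp(categories(byMonth)')
disp(categories(byYear)')

% Either combined variable can be used to group and position the boxes.
boxchart(byYear,temperature)
ylabel('Temperature (F)')
```

Comparing the two displayed category lists shows which ordering places the boxes chronologically.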

Create box charts, and plot the data values over the box charts by using `hold on`.

Load the `patients` data set. Group the patients into smokers and nonsmokers. Compare the diastolic blood pressure values for each group of patients by using box charts. Plot the individual diastolic blood pressure values over the box charts.

```
load patients
boxchart(categorical(Smoker),Diastolic)
hold on
plot(categorical(Smoker),Diastolic,'x')
xlabel('Smoker')
ylabel('Diastolic Blood Pressure')
hold off
```

Use notches to determine whether median values are significantly different from each other.

Load the `patients` data set. Split the patients according to their location. For each group of patients, create a box chart of their weights. Specify `'Notch','on'` so that each box includes a tapered, shaded region called a notch. Box charts with overlapping notches do not have significantly different medians.

```
load patients
boxchart(categorical(Location),Weight,'Notch','on')
ylabel('Weight (lbs)')
```

In this example, the three notches overlap, showing that the three weight medians are not significantly different.

Display a side-by-side pair of box charts using the `tiledlayout` and `nexttile` functions.

Load the `patients` data set. Convert `Smoker` to a `categorical` variable with more descriptive category names (`Smoker` and `Nonsmoker` rather than `1` and `0`). Convert `SelfAssessedHealthStatus` to an ordinal `categorical` variable because the categories `Poor`, `Fair`, `Good`, and `Excellent` have a natural order.

```
load patients
Smoker = categorical(Smoker,logical([1 0]),{'Smoker','Nonsmoker'});
healthOrder = {'Poor','Fair','Good','Excellent'};
SelfAssessedHealthStatus = categorical(SelfAssessedHealthStatus, ...
    healthOrder,'Ordinal',true);
```

Create a 1-by-2 tiled chart layout using the `tiledlayout` function. Create the first set of axes `ax1` within it by calling the `nexttile` function. In the axes, display two box charts of diastolic blood pressure values, one for smokers and the other for nonsmokers. Create the second set of axes `ax2` within the tiled chart layout by calling the `nexttile` function. In the axes, display four box charts of diastolic blood pressure values, grouping patients by self-assessed health status. Specify the color of the boxes by using the `'BoxFaceColor'` name-value pair argument.

```
tiledlayout(1,2)

% Left axes
ax1 = nexttile;
boxchart(ax1,Smoker,Diastolic)
ylabel(ax1,'Diastolic Blood Pressure')

% Right axes
ax2 = nexttile;
boxchart(ax2,SelfAssessedHealthStatus,Diastolic, ...
    'BoxFaceColor',[0 0.5 0.5])
xlabel(ax2,'Self-Assessed Health Status')
ylabel(ax2,'Diastolic Blood Pressure')
```

Create a box chart from power outage data with many outliers, and make it easier to distinguish them visually by changing the properties of the `BoxChart` object. Find the indices for the outlier entries.

Read power outage data into the workspace as a table. Display the first few rows of the table.

```
outages = readtable('outages.csv');
head(outages)
```

```
ans=8×6 table
       Region           OutageTime        Loss     Customers     RestorationTime            Cause
    _____________    ________________    ______    __________    ________________    ___________________

    {'SouthWest'}    2002-02-01 12:18    458.98    1.8202e+06    2002-02-07 16:50    {'winter storm'   }
    {'SouthEast'}    2003-01-23 00:49    530.14    2.1204e+05    NaT                 {'winter storm'   }
    {'SouthEast'}    2003-02-07 21:15     289.4    1.4294e+05    2003-02-17 08:14    {'winter storm'   }
    {'West'     }    2004-04-06 05:44    434.81    3.4037e+05    2004-04-06 06:10    {'equipment fault'}
    {'MidWest'  }    2002-03-16 06:18    186.44    2.1275e+05    2002-03-18 23:23    {'severe storm'   }
    {'West'     }    2003-06-18 02:49         0             0    2003-06-18 10:54    {'attack'         }
    {'West'     }    2004-06-20 14:39    231.29           NaN    2004-06-20 19:16    {'equipment fault'}
    {'West'     }    2002-06-06 19:28    311.86           NaN    2002-06-07 00:51    {'equipment fault'}
```

Create a `BoxChart` object `b` from the `outages.Customers` values, which indicate how many customers were affected by each power outage. `boxchart` discards entries with `NaN` values.

```
b = boxchart(outages.Customers);
ylabel('Number of Customers')
```

The plot contains many outliers. To better see them, jitter the outliers and change the outlier marker style. When you set the `JitterOutliers` property of the `BoxChart` object to `'on'`, the software randomly displaces the outlier markers horizontally so that they are unlikely to overlap perfectly. The values and vertical positions of the outliers are unchanged.

```
b.JitterOutliers = 'on';
b.MarkerStyle = '.';
```

You can now more easily see the distribution of outliers.

To find the outlier indices, use the `isoutlier` function. Specify the `'quartiles'` method of computing outliers to match the `boxchart` outlier definition. Use the indices to create the `outliers` table, which contains a subset of the `outages` data. Notice that `isoutlier` identifies 96 outliers.

```
idx = isoutlier(outages.Customers,'quartiles');
outliers = outages(idx,:);
size(outliers,1)
```

ans = 96

Because of all the outliers, the quartiles of the box chart are hard to see. To inspect them, change the *y*-axis limits.

ylim([0 4e5])

`ydata` - Sample data

numeric vector | numeric matrix

Sample data, specified as a numeric vector or matrix.

- If `ydata` is a matrix, then `boxchart` creates a box chart for each column of `ydata`.
- If `ydata` is a vector and you do not specify `xgroupdata`, then `boxchart` creates a single box chart.
- If `ydata` is a vector and you do specify `xgroupdata`, then `boxchart` creates a box chart for each unique value in `xgroupdata`.

**Data Types:** `single` | `double` | `int8` | `int16` | `int32` | `int64` | `uint8` | `uint16` | `uint32` | `uint64`

`xgroupdata` - Grouping and positioning variable

numeric vector | categorical vector

Grouping and positioning variable, specified as a numeric or categorical vector. `xgroupdata` must have the same length as the vector `ydata`; you cannot specify `xgroupdata` when `ydata` is a matrix.

`boxchart` groups the data in `ydata` according to the unique values in `xgroupdata`. The function creates a box chart for each group of data and positions each box chart at the corresponding `xgroupdata` value. By default, `boxchart` vertically orients the box charts and displays the `xgroupdata` values along the *x*-axis. You can change the box chart orientation by using the `Orientation` property.

**Data Types:** `single` | `double` | `int8` | `int16` | `int32` | `int64` | `uint8` | `uint16` | `uint32` | `uint64` | `categorical`

`ax` - Target axes

`Axes` object

Target axes, specified as an `Axes` object. If you do not specify the axes, then `boxchart` uses the current axes (`gca`).

Specify optional comma-separated pairs of `Name,Value` arguments. `Name` is the argument name and `Value` is the corresponding value. `Name` must appear inside quotes. You can specify several name and value pair arguments in any order as `Name1,Value1,...,NameN,ValueN`.

For example, the following call creates box charts with green boxes and green outliers, if applicable.

```
boxchart([rand(10,4); 4*rand(1,4)],'BoxFaceColor',[0 0.5 0],'MarkerColor',[0 0.5 0])
```

The `BoxChart` properties listed here are only a subset. For a complete list, see BoxChart Properties.

`'BoxFaceColor'` - Box color

RGB triplet | hexadecimal color code | color name | short name

Box color, specified as an RGB triplet, hexadecimal color code, color name, or short name.

For a custom color, specify an RGB triplet or a hexadecimal color code.

- An RGB triplet is a three-element row vector whose elements specify the intensities of the red, green, and blue components of the color. The intensities must be in the range `[0,1]`; for example, `[0.4 0.6 0.7]`.
- A hexadecimal color code is a character vector or a string scalar that starts with a hash symbol (`#`) followed by three or six hexadecimal digits, which can range from `0` to `F`. The values are not case sensitive. Thus, the color codes `'#FF8800'`, `'#ff8800'`, `'#F80'`, and `'#f80'` are equivalent.

Alternatively, you can specify some common colors by name. This table lists the named color options, the equivalent RGB triplets, and hexadecimal color codes.

| Color Name | Short Name | RGB Triplet | Hexadecimal Color Code | Appearance |
|---|---|---|---|---|
| `'red'` | `'r'` | `[1 0 0]` | `'#FF0000'` | |
| `'green'` | `'g'` | `[0 1 0]` | `'#00FF00'` | |
| `'blue'` | `'b'` | `[0 0 1]` | `'#0000FF'` | |
| `'cyan'` | `'c'` | `[0 1 1]` | `'#00FFFF'` | |
| `'magenta'` | `'m'` | `[1 0 1]` | `'#FF00FF'` | |
| `'yellow'` | `'y'` | `[1 1 0]` | `'#FFFF00'` | |
| `'black'` | `'k'` | `[0 0 0]` | `'#000000'` | |
| `'white'` | `'w'` | `[1 1 1]` | `'#FFFFFF'` | |
| `'none'` | Not applicable | Not applicable | Not applicable | No color |

Here are the RGB triplets and hexadecimal color codes for the default colors MATLAB® uses in many types of plots.

| RGB Triplet | Hexadecimal Color Code |
|---|---|
| `[0 0.4470 0.7410]` | `'#0072BD'` |
| `[0.8500 0.3250 0.0980]` | `'#D95319'` |
| `[0.9290 0.6940 0.1250]` | `'#EDB120'` |
| `[0.4940 0.1840 0.5560]` | `'#7E2F8E'` |
| `[0.4660 0.6740 0.1880]` | `'#77AC30'` |
| `[0.3010 0.7450 0.9330]` | `'#4DBEEE'` |
| `[0.6350 0.0780 0.1840]` | `'#A2142F'` |

**Example:** `b = boxchart(rand(10,1),'BoxFaceColor','red')`

**Example:** `b.BoxFaceColor = [0 0.5 0.5];`

**Example:** `b.BoxFaceColor = '#EDB120';`

`'MarkerStyle'` - Outlier style

`'o'` (default) | `'+'` | `'*'` | `'.'` | `'x'` | ...

Outlier style, specified as one of the options listed in this table.

| Value | Description |
|---|---|
| `'o'` | Circle |
| `'+'` | Plus sign |
| `'*'` | Asterisk |
| `'.'` | Point |
| `'x'` | Cross |
| `'square'` or `'s'` | Square |
| `'diamond'` or `'d'` | Diamond |
| `'^'` | Upward-pointing triangle |
| `'v'` | Downward-pointing triangle |
| `'>'` | Right-pointing triangle |
| `'<'` | Left-pointing triangle |
| `'pentagram'` or `'p'` | Five-pointed star (pentagram) |
| `'hexagram'` or `'h'` | Six-pointed star (hexagram) |
| `'none'` | No markers |

**Example:** `b = boxchart([rand(10,1);2],'MarkerStyle','x')`

**Example:** `b.MarkerStyle = 'x';`

`'JitterOutliers'` - Outlier marker displacement

`'off'` (default) | on/off logical value

Outlier marker displacement, specified as `'on'` or `'off'`, or as numeric or logical `1` (`true`) or `0` (`false`). A value of `'on'` is equivalent to `true`, and `'off'` is equivalent to `false`. Thus, you can use the value of this property as a logical value. The value is stored as an on/off logical value of type `matlab.lang.OnOffSwitchState`.

If you set the `JitterOutliers` property to `'on'`, then `boxchart` randomly displaces the outlier markers along the `XData` direction to help you distinguish between outliers that have similar `ydata` values. For an example, see Visualize and Find Outliers.

**Example:** `b = boxchart([rand(20,1);2;2;2],'JitterOutliers','on')`

**Example:** `b.JitterOutliers = 'on';`
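Because the property is stored as an on/off logical value, it can be set with a logical and tested directly in a condition. The following is a small sketch of that behavior, reusing the sample data from the example above; it is meant only as an illustration.

```matlab
% Create a chart with a few repeated outlier values.
b = boxchart([rand(20,1); 2; 2; 2]);

% Logical true is accepted in place of 'on'.
b.JitterOutliers = true;

% The stored value can be used directly as a condition.
if b.JitterOutliers
    disp('Outlier jitter is enabled.')
end

% Show the class of the stored value.
disp(class(b.JitterOutliers))
```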

`'Notch'` - Median comparison display

`'off'` (default) | on/off logical value

Median comparison display, specified as `'on'` or `'off'`, or as numeric or logical `1` (`true`) or `0` (`false`). A value of `'on'` is equivalent to `true`, and `'off'` is equivalent to `false`. Thus, you can use the value of this property as a logical value. The value is stored as an on/off logical value of type `matlab.lang.OnOffSwitchState`.

If you set the `Notch` property to `'on'`, then `boxchart` creates a tapered, shaded region around each median. Box charts whose notches do not overlap have different medians at the 5% significance level. For more information, see Box Chart (Box Plot).

Notches can extend beyond the lower and upper quartiles.

**Example:** `b = boxchart(rand(10,2),'Notch','on')`

**Example:** `b.Notch = 'on';`

`'Orientation'` - Orientation of box charts

`'vertical'` (default) | `'horizontal'`

Orientation of box charts, specified as `'vertical'` or `'horizontal'`. By default, the box charts are vertically oriented, so that the `ydata` statistics are aligned with the *y*-axis. Regardless of the orientation, `boxchart` stores the `ydata` values in the `YData` property of the `BoxChart` object.

**Example:** `b = boxchart(rand(10,1),'Orientation','horizontal')`

**Example:** `b.Orientation = 'horizontal';`

`b` - Box charts

`BoxChart` object

Box charts, returned as a `BoxChart` object. For more information, see BoxChart Properties.

A box chart, or box plot, provides a visual representation of summary statistics for a data sample. Given numeric data, the corresponding box chart displays the following information: the median, the lower and upper quartiles, any outliers (computed using the interquartile range), and the minimum and maximum values that are not outliers.

- The line inside of each box is the sample median. You can compute the value of the median using the `median` function.
- The top and bottom edges of each box are the upper and lower quartiles, respectively. The distance between the top and bottom edges is the interquartile range (IQR).

  For more information on how the quartiles are computed, see `quantile` Algorithms (Statistics and Machine Learning Toolbox), where the upper quartile corresponds to the 0.75 quantile and the lower quartile corresponds to the 0.25 quantile. To use the `quantile` function, you must have a Statistics and Machine Learning Toolbox™ license.
- Outliers are values that are more than 1.5 · *IQR* away from the top or bottom of the box. By default, `boxchart` displays each outlier using an `'o'` symbol. The outlier computation is comparable to that of the `isoutlier` function with the `'quartiles'` method.
- The whiskers are lines that extend above and below each box. One whisker connects the upper quartile to the *nonoutlier maximum* (the maximum value that is not an outlier), and the other connects the lower quartile to the *nonoutlier minimum* (the minimum value that is not an outlier).
- Notches help you compare sample medians across multiple box charts. When you specify `'Notch','on'`, the `boxchart` function creates a tapered, shaded region around each median. Box charts whose notches do not overlap have different medians at the 5% significance level. The significance level is based on a normal distribution assumption, but the median comparison is reasonably robust for other distributions. The top and bottom edges of the notch region correspond to $$m+\left(1.57\cdot IQR\right)/\sqrt{n}$$ and $$m-\left(1.57\cdot IQR\right)/\sqrt{n}$$, respectively, where *m* is the median, *IQR* is the interquartile range, and *n* is the number of data points, excluding `NaN` values.
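The notch and whisker definitions can be worked through numerically. The sketch below uses a hypothetical sample vector and avoids the `quantile` function by recovering the IQR from the thresholds that `isoutlier` returns. It assumes those thresholds are the lower quartile minus 1.5 · IQR and the upper quartile plus 1.5 · IQR, and that `boxchart` uses the same quartiles (the text above describes the two outlier computations only as comparable), so treat the results as an approximation of what the chart draws.

```matlab
% Hypothetical sample with one extreme value.
y = [1 2 3 4 5 6 7 8 9 100]';

m = median(y,'omitnan');      % median, ignoring NaN values
n = nnz(~isnan(y));           % number of data points, excluding NaN values

% Assumption: the thresholds are Q1 - 1.5*IQR and Q3 + 1.5*IQR,
% so they lie 4*IQR apart.
[isOut,lowThresh,highThresh] = isoutlier(y,'quartiles');
iqrValue = (highThresh - lowThresh)/4;

% Notch edges from the formula m +/- 1.57*IQR/sqrt(n).
notchTop = m + 1.57*iqrValue/sqrt(n);
notchBottom = m - 1.57*iqrValue/sqrt(n);

% Whisker ends: nonoutlier maximum and nonoutlier minimum.
whiskerTop = max(y(~isOut));
whiskerBottom = min(y(~isOut));

fprintf('Notch: [%g, %g], whiskers: [%g, %g], outliers: %d\n', ...
    notchBottom,notchTop,whiskerBottom,whiskerTop,nnz(isOut))
```

Plotting `boxchart(y,'Notch','on')` alongside this computation is one way to check the assumptions against the drawn chart.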

- You can add two types of data tips to a `BoxChart` object: one for each box chart and one for each outlier. A general box chart data tip appears at the nonoutlier maximum value, regardless of where you click on the box chart. The displayed `Num Points` value includes `NaN` values in the corresponding `ydata`, but `boxchart` discards the `NaN` values before computing the box chart statistics.
- You can use the `datatip` function to add more data tips to a `BoxChart` object, but the indexing of data tips differs from other charts. `boxchart` first assigns indices to the box charts and then assigns indices to the outliers. For example, if a `BoxChart` object `b` displays two box charts and one outlier, `datatip(b,'DataIndex',3)` creates a data tip at the outlier point.
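The indexing rule can be tried on a small, deterministic data set. This sketch uses hypothetical data chosen so that the chart has exactly two boxes and a single outlier; under the indexing described above, index 3 should then refer to the outlier.

```matlab
% Hypothetical data: two columns, so boxchart draws two boxes.
y = [(1:10)' (1:10)'];
y(10,1) = 100;                   % make one extreme value in the first column

b = boxchart(y);

% Indices 1 and 2 refer to the boxes; index 3 refers to the outlier.
dt = datatip(b,'DataIndex',3);
```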
