3. Box plot

3.1. Description

Box plots, or box-and-whisker plots, summarize a distribution of values using Tukey’s five-number summary (Hoaglin et al., 1983). The dark line in the middle of each box (sometimes called the ‘waist’) is the median of the data: half of the data values are greater than the median and half are lower. The box itself (i.e. the central rectangle) spans the first quartile to the third quartile, so its height is the interquartile range (IQR). Each whisker extends from the edge of the box to the most extreme data value that lies within 1.5 times the IQR of that edge; if the minimum or maximum of the data is closer than that, the whisker stops there. The points represent outliers: extreme values that do not fall inside the whiskers, i.e. values more than 1.5 times the IQR beyond the first or third quartile. This plot type is used by METviewer for generating box plots. Refer to the METviewer documentation for details on how this plot is utilized.
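As a rough illustration of the quantities described above, the following sketch computes the median, quartiles, whisker ends and outliers for a small list of numbers. It is not the code METplotpy uses to draw the plot; the function name box_plot_summary, the sample values and the choice of the "inclusive" quantile method are assumptions made for the example, and other quantile conventions give slightly different box edges.

```python
import statistics


def box_plot_summary(values, whisker_factor=1.5):
    """Return the quantities a Tukey-style box plot draws for `values`."""
    data = sorted(values)
    # Quartiles by linear interpolation between data points ("inclusive" method).
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Fences sit 1.5 * IQR beyond the box edges (not beyond the median).
    low_fence = q1 - whisker_factor * iqr
    high_fence = q3 + whisker_factor * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "iqr": iqr,
        # Whiskers stop at the most extreme data values inside the fences.
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


# Hypothetical sample: eight ordinary values and one extreme value.
summary = box_plot_summary([2, 3, 4, 5, 6, 7, 8, 9, 30])
```

In this sample the value 30 lies beyond the upper fence, so it would be drawn as a point, while the upper whisker stops at 9.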

3.2. Example

3.2.1. Sample Data

The data is text output from MET in columnar format. The sample data used to create an example box plot is available in the METplotpy repository, where the box plot tests are located:

$METPLOTPY_BASE/test/box/box.data

$METPLOTPY_BASE is the directory where the METplotpy code is saved,

e.g.

/usr/path/to/METplotpy if the source code was cloned or forked from the GitHub repository

or

/usr/path/to/METplotpy-x.y.z if the source code was downloaded as a zip or gzip’d tar file from the Release link of the GitHub repository. The x.y.z is the release number.

3.2.2. Configuration Files

The box plot utilizes YAML configuration files to indicate where input data is located and to set plot attributes. These plot attributes correspond to values that can be set via the METviewer tool. YAML is a recursive acronym for “YAML Ain’t Markup Language” and, according to yaml.org, it is a “human-friendly data serialization language”. It is commonly used for configuration files and in applications where data is being stored or transmitted.

Two configuration files are required. The first is a default configuration file, box_defaults.yaml, which is found in the $METPLOTPY_BASE/metplotpy/plots/config directory along with all the other default configuration files. $METPLOTPY_BASE is the user-specified directory where the METplotpy source code has been saved. Default configuration files are automatically loaded by the plotting code and do not need to be explicitly specified when generating a plot.

The second required configuration file is a user-supplied “custom” configuration file. This file is used to customize/override the default settings in the box_defaults.yaml file. The custom configuration file can be an empty file if all default settings are to be applied.
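The relationship between the two files can be pictured as a simple override: any setting present in the custom file replaces the default, and every other setting falls through to the default. The sketch below is illustrative only and is not METplotpy's own loading code; the function name effective_settings and the use of plain dictionaries in place of parsed YAML are assumptions.

```python
def effective_settings(defaults, custom):
    """Return the default settings overridden by the custom ones (top-level keys only)."""
    merged = dict(defaults)
    # An empty YAML file typically parses to None, so treat None as "no overrides".
    merged.update(custom or {})
    return merged


# A few values taken from the listings in this section, as plain dictionaries.
defaults = {"plot_stat": "median", "plot_width": 11.0, "title": "test title"}
custom = {"title": "Box"}

settings = effective_settings(defaults, custom)      # title comes from the custom file
all_defaults = effective_settings(defaults, None)    # empty custom file: defaults only
```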

3.3. METplus Configuration

3.3.1. Default Configuration File

The following is the mandatory box_defaults.yaml configuration file, which serves as a good starting point for creating a box plot, as it represents the default values set in METviewer.

colors: []
create_html: 'False'
derived_series_1: []
derived_series_2: []
dump_points_1: 'True'
dump_points_2: 'True'


grid_col: '#cccccc'
grid_lty: 3
grid_lwd: 1
grid_on: 'True'
grid_x: listX

indy_vals: []
indy_var: fcst_lead
legend_box: o
legend_inset:
  x: 0.0
  y: -0.25
legend_ncol: 3
legend_size: 0.8
line_type: None

list_stat_1: []
list_stat_2: []

mar:
- 8
- 4
- 5
- 4
mgp:
- 1
- 1
- 0
num_iterations: 1
num_threads: -1
plot_caption: Caption
plot_disp: []
plot_filename: ./box_expected.png
plot_height: 8.5
plot_res: 72
plot_stat: median
plot_type: png16m
plot_units: in
plot_width: 11.0
random_seed: null
series_order: []

series_val_1: {}
series_val_2: {}

show_nstats: 'False'

stat_input: ../../../test/box/box.data
sync_yaxes: 'False'
title: test title
title_align: 0.5
title_offset: -2
title_size: 1.4
title_weight: 2.0
user_legend: ['None']
x2lab_align: 0.5
x2lab_offset: -0.5
x2lab_size: 0.8
x2lab_weight: 1
x2tlab_horiz: 0.5
x2tlab_orient: 1
x2tlab_perp: 1
x2tlab_size: 0.8
xaxis: test x_label
xaxis_reverse: 'False'
xlab_align: 0.5
xlab_offset: 2
xlab_size: 1
xlab_weight: 1
xlim: []
xtlab_decim: 0
xtlab_horiz: 0.5
xtlab_orient: 1
xtlab_perp: -0.75
xtlab_size: 1
y2lab_align: 0.5
y2lab_offset: 1
y2lab_size: 1
y2lab_weight: 1
y2lim: []
y2tlab_horiz: 0.5
y2tlab_orient: 1
y2tlab_perp: 1
y2tlab_size: 1.0
yaxis_1: test y_label
yaxis_2: ''
ylab_align: 0.5
ylab_offset: -2
ylab_size: 1
ylab_weight: 1
ylim: []
ytlab_horiz: 0.5
ytlab_orient: 1
ytlab_perp: 0.5
ytlab_size: 1

box_pts: False
box_notch: False
box_outline: True
box_avg: False

caption_size: 0.8
caption_offset: 3
caption_weight: 1
caption_align: 0
caption_col: '#333333'

3.3.2. Custom Configuration File

A second, mandatory configuration file is used to customize the settings of the box plot. The custom_box.yaml file is included with the source code. If the user wishes to use all the default settings defined in the box_defaults.yaml file, an empty custom configuration file can be specified instead.

alpha: 0.05
box_avg: 'True'
box_boxwex: 0.2
box_notch: 'False'
box_outline: 'True'
box_pts: 'False'
caption_align: 0.0
caption_col: '#333333'
caption_offset: 3.0
caption_size: 0.8
caption_weight: 1
cex: 1
colors:
- '#00ff7f'
- '#ff0000'
- '#8000ff'
- '#00aaff'
con_series:
- 1
- 1
- 1
- 1
create_html: 'False'
derived_series_1:
- - AFWAOCv3.5.1_d01 SPFH FBAR
  - NoahMPv3.5.1_d01 SPFH FBAR
  - DIFF
derived_series_2: []
dump_points_1: 'True'
dump_points_2: 'False'
eqbound_high: 0.001
eqbound_low: -0.001
event_equal: 'True'
fcst_var_val_1:
  SPFH:
  - FBAR
fcst_var_val_2:
  SPFH:
  - MAE
fixed_vars_vals_input:
  obtype:
    obtype_1:
    - ADPUPA
  fcst_lev:
    fcst_lev_2:
    - P300
  vx_mask:
    vx_mask_0:
    - CONUS
grid_col: '#cccccc'
grid_lty: 3
grid_lwd: 1
grid_on: 'True'
grid_x: listX
indy_label:
- '12'
- '24'
indy_stagger_1: 'False'
indy_stagger_2: 'False'
indy_vals:
- '120000'
- '240000'
indy_var: fcst_lead
legend_box: o
legend_inset:
  x: 0.0
  y: -0.25
legend_ncol: 3
legend_size: 0.8
line_type: None
list_stat_1:
- FBAR
list_stat_2:
- MAE
mar:
- 8
- 4
- 5
- 4
method: bca
mgp:
- 1
- 1
- 0
num_iterations: 1
num_threads: -1
plot_caption: Caption text
plot_ci:
- none
- none
- none
- none
plot_disp:
- 'True'
- 'True'
- 'True'
- 'True'
plot_height: 8.5
plot_res: 72
plot_stat: median
plot_type: png16m
plot_units: in
plot_width: 11.0


# Optional, uncomment and set to the directory in which to store the .points1 file
# that is used by METviewer (created when dump_points_1 is set to True).
# If dump_points_1 is True and this is left commented out, the points1 file
# will be saved in the default location (i.e. where the input data file is stored).
#points_path: /dir_to_save_points1_file

random_seed: null


series_order:
- 1
- 2
- 3
- 4


series_val_1:
  model:
  - AFWAOCv3.5.1_d01
  - NoahMPv3.5.1_d01
series_val_2:
  model:
  - AFWAOCv3.5.1_d01
show_nstats: 'True'
show_signif:
- 'False'
- 'False'
- 'False'
- 'False'
sync_yaxes: 'False'
title: Box
title_align: 0.5
title_offset: -2
title_size: 1.4
title_weight: 2.0
user_legend: []
x2lab_align: 0.5
x2lab_offset: -0.5
x2lab_size: 0.8
x2lab_weight: 1
x2tlab_horiz: 0.5
x2tlab_orient: 1
x2tlab_perp: 1
x2tlab_size: 0.8
xaxis: FCST_LEAD
xaxis_reverse: 'False'
xlab_align: 0.5
xlab_offset: 2
xlab_size: 1
xlab_weight: 1
xlim: []
xtlab_decim: 0
xtlab_horiz: 0.5
xtlab_orient: 1
xtlab_perp: -0.75
xtlab_size: 1
y2lab_align: 0.5
y2lab_offset: 1
y2lab_size: 1
y2lab_weight: 1
y2lim: []
y2tlab_horiz: 0.5
y2tlab_orient: 1
y2tlab_perp: 1
y2tlab_size: 1.0
yaxis_1: FBAR
yaxis_2: MAE
ylab_align: 0.5
ylab_offset: -2
ylab_size: 1
ylab_weight: 1
ylim: []
ytlab_horiz: 0.5
ytlab_orient: 1
ytlab_perp: 0.5
ytlab_size: 1

stat_input: ./box.data
plot_filename: ./box.png
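In the custom_box.yaml listing above, the per-series lists (colors, con_series, plot_ci, plot_disp, series_order and show_signif) each have four entries. The sketch below shows one way to check such lists for consistency before running the plot. The counting rule (series value combinations times statistics, plus derived series, summed over both y-axes) is an assumption inferred from this example rather than a documented METplotpy rule, and the function names are hypothetical.

```python
def expected_series_count(cfg):
    """Count series as (series_val combinations x statistics) + derived series, per y-axis."""
    total = 0
    for axis in ("1", "2"):
        combos = 1
        for values in cfg.get(f"series_val_{axis}", {}).values():
            combos *= len(values)
        total += combos * len(cfg.get(f"list_stat_{axis}", []))
        total += len(cfg.get(f"derived_series_{axis}", []))
    return total


def mismatched_series_lists(cfg, keys=("colors", "con_series", "plot_ci",
                                       "plot_disp", "series_order", "show_signif")):
    """Return {setting: length} for every per-series list whose length looks wrong."""
    expected = expected_series_count(cfg)
    return {k: len(cfg[k]) for k in keys if k in cfg and len(cfg[k]) != expected}


# The relevant settings from custom_box.yaml, written out as a dictionary.
cfg = {
    "series_val_1": {"model": ["AFWAOCv3.5.1_d01", "NoahMPv3.5.1_d01"]},
    "series_val_2": {"model": ["AFWAOCv3.5.1_d01"]},
    "list_stat_1": ["FBAR"],
    "list_stat_2": ["MAE"],
    "derived_series_1": [["AFWAOCv3.5.1_d01 SPFH FBAR", "NoahMPv3.5.1_d01 SPFH FBAR", "DIFF"]],
    "derived_series_2": [],
    "colors": ["#00ff7f", "#ff0000", "#8000ff", "#00aaff"],
    "series_order": [1, 2, 3, 4],
}

count = expected_series_count(cfg)
problems = mismatched_series_lists(cfg)
```

Under this assumption the example yields four series, which matches the four entries in each per-series list of custom_box.yaml.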

Copy this custom config file from the directory where the source code was saved to the working directory:

cp $METPLOTPY_BASE/test/box/custom_box.yaml $WORKING_DIR/custom_box.yaml

Modify the stat_input setting in the copied $WORKING_DIR/custom_box.yaml file to explicitly point to the $METPLOTPY_BASE/test/box directory (where the custom config files and sample data reside). Replace the relative path ./box.data with the full path $METPLOTPY_BASE/test/box/box.data (replacing $METPLOTPY_BASE with the full path to the METplotpy installation on the system). Modify the plot_filename setting to point to the output path where the plot will be saved, including the name of the plot.

For example:

stat_input: /username/myworkspace/METplotpy/test/box/box.data

plot_filename: /username/working_dir/output_plots/box.png

Here, /username/myworkspace/METplotpy is $METPLOTPY_BASE and /username/working_dir is $WORKING_DIR. Make sure that the specified $WORKING_DIR directory exists and has the appropriate read and write permissions.

The path listed for plot_filename may be changed to the output directory of one’s choosing. If this is not set, then the plot_filename setting specified in the $METPLOTPY_BASE/metplotpy/plots/config/box_defaults.yaml configuration file will be used.
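The two edits described above can also be scripted. The following is a minimal sketch that rewrites top-level `key: value` lines in the text of a YAML file without needing a YAML library; the function name override_top_level_settings is hypothetical, the paths are the example paths used above, and the approach assumes the settings being changed are simple, unindented, single-line entries.

```python
def override_top_level_settings(yaml_text, updates):
    """Replace (or append) top-level `key: value` lines in YAML text."""
    remaining = dict(updates)
    out = []
    for line in yaml_text.splitlines():
        # Indented, commented or list lines never match a plain setting name.
        key = line.split(":", 1)[0]
        if ":" in line and key in remaining:
            out.append(f"{key}: {remaining.pop(key)}")
        else:
            out.append(line)
    # Settings that were not present in the file are appended at the end.
    out.extend(f"{key}: {value}" for key, value in remaining.items())
    return "\n".join(out) + "\n"


# Last lines of custom_box.yaml, shortened for the example.
sample = "title: Box\nstat_input: ./box.data\nplot_filename: ./box.png\n"

edited = override_top_level_settings(sample, {
    "stat_input": "/username/myworkspace/METplotpy/test/box/box.data",
    "plot_filename": "/username/working_dir/output_plots/box.png",
})
```

To apply it to a real file, read $WORKING_DIR/custom_box.yaml into a string, pass it through the function and write the result back.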

To save the intermediate .points1 file (used by METviewer and useful for debugging but not required), set the dump_points_1 setting to True. Uncomment or add (if it doesn’t exist) the points_path setting.

dump_points_1: 'True'

points_path: '/dir_to_save_points1_file'

Replace /dir_to_save_points1_file with the directory where the .points1 file is to be saved. If points_path is commented out (indicated by a ‘#’ symbol in front of it), remove the ‘#’ symbol so that the setting will be used by the code. Make sure that this directory exists and has the appropriate read and write permissions. NOTE: the points_path setting is optional and does not need to be defined in the configuration file unless saving the intermediate .points1 file is desired.
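Because several settings (plot_filename, points_path) depend on a directory that exists and is readable and writable, it can be convenient to check this before running the plot. The helper below is a small sketch using only the Python standard library; the name is_usable_output_dir is hypothetical and the check is not part of METplotpy.

```python
import os
import tempfile


def is_usable_output_dir(path):
    """True if `path` is an existing directory the current user can read and write."""
    return os.path.isdir(path) and os.access(path, os.R_OK | os.W_OK)


# Demonstration with a temporary directory standing in for the points_path directory.
with tempfile.TemporaryDirectory() as scratch:
    scratch_ok = is_usable_output_dir(scratch)

# The temporary directory has been removed by now, so nothing below it exists.
missing_ok = is_usable_output_dir(os.path.join(scratch, "does_not_exist"))
```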

3.3.3. Using Defaults

To use the default settings defined in the box_defaults.yaml file, specify a minimal custom configuration file (minimal_box.yaml). This file consists of only a comment block, but any empty file will do, provided the user has write permissions for the output path given by the plot_filename setting in the default configuration file. Otherwise, the user will need to specify a plot_filename in the minimal_box.yaml file:

# minimal config file that uses all the settings in the
# default configuration file, box_defaults.yaml
# leave "empty" to use all the settings in the box_defaults.yaml
# as long as you have write permissions for the directory specified in the `plot_filename`
# setting.
# Otherwise, specify an appropriate plot_filename setting here (below this comment block)

Copy this file to the working directory:

cp $METPLOTPY_BASE/test/box/minimal_box.yaml $WORKING_DIR/minimal_box.yaml

Add the stat_input (input data) and plot_filename (output file/plot path) settings to the $WORKING_DIR/minimal_box.yaml file (anywhere below the comment block). The stat_input setting explicitly indicates where the sample data is located. Set the stat_input to $METPLOTPY_BASE/test/box/box.data and set the plot_filename to $WORKING_DIR/output_plots/box_default.png:

stat_input: $METPLOTPY_BASE/test/box/box.data

plot_filename: $WORKING_DIR/output_plots/box_default.png

Here, $WORKING_DIR is the working directory where all the custom configuration files are being saved. NOTE: If the plot_filename setting points to a directory other than $WORKING_DIR/output_plots, the user must have read and write permissions to that directory.

NOTE: This file does not plot any data; its purpose is to provide a template for setting the margins, plot size, labels, etc.
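The two settings added to minimal_box.yaml can be generated with the placeholders already replaced by full paths, as recommended earlier for custom_box.yaml. This is a minimal sketch using string.Template from the Python standard library; the function name minimal_box_settings is hypothetical and the paths are the example paths used in this section.

```python
from string import Template


def minimal_box_settings(metplotpy_base, working_dir):
    """Return the two settings for minimal_box.yaml with full paths filled in."""
    lines = Template(
        "stat_input: $METPLOTPY_BASE/test/box/box.data\n"
        "plot_filename: $WORKING_DIR/output_plots/box_default.png\n"
    )
    return lines.substitute(METPLOTPY_BASE=metplotpy_base, WORKING_DIR=working_dir)


settings_text = minimal_box_settings(
    "/username/myworkspace/METplotpy", "/username/working_dir"
)
```

The resulting text can be appended below the comment block in $WORKING_DIR/minimal_box.yaml.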

3.4. Run from the Command Line

Perform the following to generate the plots:

  • If using the conda environment, verify the conda environment is running and has the required Python packages outlined in the requirements section.

  • Set the METPLOTPY_BASE environment variable to the directory where the METplotpy source code was saved (e.g. /usr/path/to/METplotpy).

    For the ksh environment:

    export METPLOTPY_BASE=/usr/path/to/METplotpy
    

    For the csh environment:

    setenv METPLOTPY_BASE /usr/path/to/METplotpy
    

    Replace /usr/path/to/METplotpy with the actual directory path on the system.

    The custom_box.yaml configuration file, in combination with the box_defaults.yaml configuration file, generates a customized box plot:

    ../_images/box.png

    To generate the above plot using the box_defaults.yaml and custom_box.yaml config files, perform the following:

    python $METPLOTPY_BASE/metplotpy/plots/box/box.py $WORKING_DIR/custom_box.yaml
    

    The minimal_box.yaml configuration file, in combination with the box_defaults.yaml configuration file, generates a “default” box plot. The purpose of this is to provide a template/starting point for setting up the margins, plot size, labels, etc. and does not plot any data:

    ../_images/box_default.png

    To generate the above “defaults” plot (i.e. using default configuration settings), use the “minimal” custom configuration file, minimal_box.yaml.

  • Enter the following command.

    python $METPLOTPY_BASE/metplotpy/plots/box/box.py $WORKING_DIR/minimal_box.yaml
    
  • A box_default.png output file will be created in the directory specified in the plot_filename configuration setting in the minimal_box.yaml config file.
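The same commands can be launched from a Python script, for example when generating several plots in a row. The sketch below only builds the argument list shown in the steps above; the function name box_plot_command is hypothetical, the paths are the example paths from this section, and sys.executable is used so that the plot runs with the interpreter of the currently active environment.

```python
import os
import sys


def box_plot_command(metplotpy_base, custom_config):
    """Build the command line that runs box.py with a custom configuration file."""
    script = os.path.join(metplotpy_base, "metplotpy", "plots", "box", "box.py")
    return [sys.executable, script, custom_config]


cmd = box_plot_command(
    "/usr/path/to/METplotpy", "/username/working_dir/custom_box.yaml"
)
# To actually generate the plot, pass the list to subprocess.run(cmd, check=True).
```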