TimeSeriesVisualization

class src.lookoutequipment.plot.TimeSeriesVisualization(timeseries_df, data_format, timestamp_col=None, tag_col=None, resample=None, verbose=False)

A class to manage time series visualization along with labels and detected events

Attributes

DEFAULT_COLORS

data

A pandas.DataFrame containing time series data to plot

format

Either timeseries or tabular depending on the format of your time series.

legend_format

kwargs dict to configure the legend to be displayed when this class renders the requested plot

signal_data

list of pandas.DataFrame containing the time series data to plot

tag_col

If data_format is timeseries, the name of the column that contains the tag names

tags_list

List of strings naming all the tags associated with the current dataset

timestamp_col

String specifying the name of the column that contains the timestamps

Methods

__init__(timeseries_df, data_format[, …])

Create a new instance to plot time series with different data structures

add_labels(labels_df[, labels_title])

Add a label component to the plot to visualize the known anomaly periods as a secondary plot under the time series visualization panel.

add_predictions(predictions_list[, …])

Add a prediction component to the plot to visualize detected events as a secondary plot under the time series visualization panel.

add_rolling_average(window_size)

Adds a rolling average over a time series plot

add_signal(signals_list)

This method will let you select which signals you want to plot.

add_train_test_split(split_timestamp[, …])

Add a way to visualize the split between training and testing periods.

plot([fig_width, colors, labels_bottom, …])

Renders the plot as configured by the previous method calls

plot_histograms([freq, prediction_index, …])

Plot values distribution as histograms for the top contributing sensors.

__init__(timeseries_df, data_format, timestamp_col=None, tag_col=None, resample=None, verbose=False)

Create a new instance to plot time series with different data structures

Parameters
  • timeseries_df (pandas.DataFrame) – A dataframe containing time series data that you want to plot

  • data_format (string) – Use “timeseries” if your dataframe has three columns: timestamp, values and tagname. Use “tabular” if the timestamp is your first column and each tag has its own column after it: timestamp, tag1, tag2

  • timestamp_col (string) – Name of the column that contains the timestamps. If set to None, the timestamp is assumed to already be the index (defaults to None)

  • tag_col (string) – If data_format is “timeseries”, the name of the column that contains the tag names

  • resample (string) – If specified, this class will resample the data before plotting it. Use the same rule string format as the pandas.DataFrame.resample() method (defaults to None)

  • verbose (boolean) – If True, this class will print some messages along the way (defaults to False)
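The difference between the two data_format layouts can be easier to see on a tiny dataset. The sketch below is an illustration only: it uses plain Python rows rather than a pandas.DataFrame, and the helper name tabular_to_timeseries, the key names and the sample values are hypothetical, not part of this library.

```python
def tabular_to_timeseries(rows, timestamp_col="timestamp"):
    """Turn "tabular" rows (one column per tag) into "timeseries" rows
    (one row per reading, with the tag name stored in its own column)."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == timestamp_col:
                continue  # the timestamp is repeated on every output row
            long_rows.append(
                {"timestamp": row[timestamp_col], "value": value, "tagname": column}
            )
    return long_rows


# "tabular" layout: timestamp, tag1, tag2
tabular_rows = [
    {"timestamp": "2021-01-01 00:00", "tag1": 1.0, "tag2": 10.0},
    {"timestamp": "2021-01-01 00:05", "tag1": 1.5, "tag2": 11.0},
]

# "timeseries" layout: timestamp, value, tagname
timeseries_rows = tabular_to_timeseries(tabular_rows)
```

With the "timeseries" layout, tag_col would name the column holding the tag names (tagname in this sketch).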

add_labels(labels_df, labels_title='Known anomalies')

Add a label component to the plot to visualize the known anomaly periods as a secondary plot under the time series visualization panel.

Parameters
  • labels_df (pandas.DataFrame) – You can add one label ribbon, defined with a dataframe that gives the start and end date of every known anomaly

  • labels_title (string) – Title to be used for the known anomalies label ribbon
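To picture what a label ribbon encodes, the sketch below turns a list of start and end dates into a 0/1 flag per timestamp. It is a plain-Python illustration under stated assumptions: the names known_anomalies and is_labeled are hypothetical, and treating both bounds as inclusive is a choice made here, not documented behavior of this class.

```python
from datetime import datetime

# Hypothetical known-anomaly ranges: one (start, end) pair per anomaly.
known_anomalies = [
    (datetime(2021, 1, 2), datetime(2021, 1, 3)),
    (datetime(2021, 1, 10), datetime(2021, 1, 12)),
]


def is_labeled(timestamp, ranges):
    """Return True when timestamp falls inside any range (bounds included)."""
    return any(start <= timestamp <= end for start, end in ranges)


timestamps = [datetime(2021, 1, day) for day in (1, 2, 11, 20)]

# One flag per timestamp: 1 inside a known anomaly, 0 elsewhere.
ribbon = [int(is_labeled(ts, known_anomalies)) for ts in timestamps]
```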

add_predictions(predictions_list, prediction_titles=['Detected events'])

Add a prediction component to the plot to visualize detected events as a secondary plot under the time series visualization panel.

Parameters
  • predictions_list (list of pandas.DataFrame) – You can add several prediction ribbons. Each ribbon is defined with a dataframe that gives the start and end date of every detected event. Several ribbons can be grouped inside a list

  • prediction_titles (list of strings) – This list contains the titles to be used for each prediction ribbon

add_rolling_average(window_size)

Adds a rolling average over a time series plot

Parameters

window_size (integer) – Size of the window in time steps to average over
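As a reminder of what a window measured in time steps means, here is a minimal trailing rolling mean in plain Python. It is a sketch, not this method's implementation: the function name rolling_average and the choice to emit None until the window is full are assumptions made for the example.

```python
from collections import deque


def rolling_average(values, window_size):
    """Trailing mean over the last window_size samples.

    Positions where fewer than window_size samples are available yield None.
    """
    if window_size < 1:
        raise ValueError("window_size must be a positive integer")

    window = deque(maxlen=window_size)
    running_sum = 0.0
    averages = []
    for value in values:
        if len(window) == window_size:
            running_sum -= window[0]  # the oldest sample is about to be evicted
        window.append(value)
        running_sum += value
        is_full = len(window) == window_size
        averages.append(running_sum / window_size if is_full else None)
    return averages


smoothed = rolling_average([1.0, 2.0, 3.0, 4.0, 5.0], window_size=3)
```

A larger window_size smooths more aggressively but leaves more undefined points at the start of the series.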

add_signal(signals_list)

This method lets you select which signals you want to plot. It checks that the signals are actually available in the tags list, then populates the signal_data property with a list of dataframes, one for each signal to plot.

Parameters

signals_list (list of strings) – A list of tag names to be rendered when you call plot()

Raises

Exception – if some of the signals are not found in the tags list
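The availability check described above can be sketched as follows. This is an illustration, not the library's code: the helper name check_signals and the error message are hypothetical.

```python
def check_signals(signals_list, tags_list):
    """Return the requested signals, or raise if any is absent from tags_list."""
    missing = [signal for signal in signals_list if signal not in tags_list]
    if missing:
        # The documentation only states that an Exception is raised;
        # the message format used here is an assumption.
        raise Exception(f"Signals not found in the tags list: {missing}")
    return list(signals_list)


available_tags = ["tag1", "tag2", "tag3"]
selected = check_signals(["tag1", "tag3"], available_tags)
```

Comparing your signal names against the tags_list attribute before calling add_signal() is one way to avoid the exception.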

add_train_test_split(split_timestamp, train_label='Train', test_label='Evaluation')

Add a way to visualize the split between training and testing periods. The training period will stay colorful on the timeseries area of the plot while the testing period will be greyed out.

Parameters
  • split_timestamp (string or datetime) – The split date. If a string is passed, it will be converted into a datetime

  • train_label (string) – Name of the training period (will be visible in the legend)

  • test_label (string) – Name of the testing period (will be visible in the legend)
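The effect of split_timestamp can be sketched with the standard library. Two assumptions are made here and are not confirmed by this documentation: the string is parsed as ISO 8601, and a sample falling exactly on the split date is assigned to the testing period. The helper name split_train_test is hypothetical.

```python
from datetime import datetime


def split_train_test(timestamps, split_timestamp):
    """Partition timestamps into (train, test) around split_timestamp."""
    if isinstance(split_timestamp, str):
        # Assumption: ISO 8601 strings; the library may accept other formats.
        split_timestamp = datetime.fromisoformat(split_timestamp)
    train = [ts for ts in timestamps if ts < split_timestamp]
    test = [ts for ts in timestamps if ts >= split_timestamp]
    return train, test


timestamps = [datetime(2021, month, 1) for month in (4, 5, 6, 7)]
train, test = split_train_test(timestamps, "2021-06-01")
```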

plot(fig_width=18, colors={'labels': 'tab:green', 'predictions': 'tab:red'}, labels_bottom=False, no_legend=False)

Renders the plot as configured by the previous method calls

Parameters

fig_width (integer) – The width of the figure to generate (defaults to 18)

Returns

tuple containing:
  • A matplotlib.pyplot.figure where the plots are drawn

  • A list of the matplotlib axes on which each plot is drawn

Return type

tuple
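Putting the methods together, a typical call sequence might look like the sketch below. It only uses the signatures documented on this page, but several things are assumptions: the import path lookoutequipment.plot (this page shows src.lookoutequipment.plot), the tag names tag1 and tag2, the split date, and the three dataframes, which you would have to build yourself. The import sits inside the function so that the snippet can be loaded without the package installed.

```python
def render_example(timeseries_df, labels_df, predictions_df):
    """Configure and render a plot; returns the (figure, axes) tuple from plot()."""
    # Assumption: the package is importable under this name in your environment.
    from lookoutequipment.plot import TimeSeriesVisualization

    # timestamp_col is left at None, so the dataframe index must hold the timestamps.
    viz = TimeSeriesVisualization(timeseries_df=timeseries_df, data_format="tabular")

    viz.add_signal(["tag1", "tag2"])        # hypothetical tag names
    viz.add_labels(labels_df)               # known anomalies ribbon
    viz.add_predictions([predictions_df])   # one detected events ribbon
    viz.add_train_test_split("2021-06-01")  # hypothetical split date

    fig, axes = viz.plot(fig_width=18)
    return fig, axes
```

All configuration happens before plot() is called, since plot() is the step that renders what was configured.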

plot_histograms(freq='1min', prediction_index=0, top_n=8, fig_width=18, start=None, end=None)

Plot values distribution as histograms for the top contributing sensors.

Parameters
  • freq (string) – The datetime index frequency (defaults to ‘1min’). This must be a string following this format: XXmin where XX is a number of minutes.

  • prediction_index (integer) – You can add several predicted ranges in your plot. Use this argument to specify which one you wish to plot a histogram for (defaults to 0)

  • top_n (integer) – Number of top signals to plot (default: 8)

  • fig_width (float) – Width of the figure generated (default: 18)

  • start (pandas.Timestamp) – Start date of the range to build the values distribution for (default: None, use the evaluation period start)

  • end (pandas.Timestamp) – End date of the range to build the values distribution for (default: None, use the evaluation period end)

Returns

a figure where the histograms are drawn

Return type

matplotlib.pyplot.figure
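Because freq must follow the XXmin format, it can be worth validating the string before calling plot_histograms(). The helper below is a hypothetical sketch, not part of this library, and how the method itself reacts to a malformed value is not documented here.

```python
import re


def minutes_from_freq(freq):
    """Extract the number of minutes from a string such as '5min'."""
    match = re.fullmatch(r"(\d+)min", freq)
    if match is None:
        raise ValueError(f"Expected a string such as '5min', got {freq!r}")
    return int(match.group(1))


default_minutes = minutes_from_freq("1min")
```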