API Reference

General Interfaces

Base classes

class datastories.api.IAnalysisResult

Interface implemented by all analysis results

plot(*args, **kwargs)

Plots a graphical representation of the results in Jupyter Notebook.

to_csv(file_name, delimiter=', ', decimal='.')

Exports the result to a CSV file.

Args:
file_name (str):
 name of the file to export to.
delimiter (str=’,’):
 CSV delimiter
decimal (str=’.’):
 CSV decimal point
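
The effect of the delimiter and decimal parameters can be illustrated with a small standard-library sketch. It writes a table the way a European-locale spreadsheet would expect it (';' as delimiter, ',' as decimal point). The write_csv helper is hypothetical and not part of datastories; the library's actual export code may differ.

```python
import csv
import io


def write_csv(rows, stream, delimiter=",", decimal="."):
    """Write rows to a text stream, formatting floats with the given decimal mark.

    Hypothetical helper for illustration only; not part of datastories.
    """
    writer = csv.writer(stream, delimiter=delimiter, lineterminator="\n")
    for row in rows:
        writer.writerow(
            [str(v).replace(".", decimal) if isinstance(v, float) else v for v in row]
        )


buffer = io.StringIO()
write_csv([["name", "value"], ["a", 1.5], ["b", 2.25]], buffer, delimiter=";", decimal=",")
csv_text = buffer.getvalue()
```

Choosing a delimiter that differs from the decimal mark keeps the numeric fields unambiguous without quoting.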
to_excel(file_name)

Exports the result to an Excel file.

Args:
file_name (str):
 name of the file to export to.
to_pandas()

Exports the result to a Pandas DataFrame.

Returns:
  • The constructed Pandas DataFrame.
class datastories.api.IPredictiveModel(prediction_type, *args, **kwargs)

Interface implemented by all prediction models

metrics

A dictionary containing model prediction performance metrics.

The type of metrics depends on the model type (i.e., regression or classification).

predict(data_frame)

Predict the model KPI on a new data frame.

Args:
data_frame (obj):
 the data frame on which the model associated KPI is to be predicted.
Returns:
  • An object of type datastories.regression.PredictionResult encapsulating the prediction results.
Raises:
  • ValueError: if not all required columns are provided.

Note: All columns present in the training data frame are required for making predictions even if they are not significant for the prediction.
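
The column requirement in the note above can be sketched as a simple pre-flight check. The check_required_columns helper and the column names are hypothetical; this is only a guess at the kind of validation that leads to the documented ValueError, not the library's implementation.

```python
def check_required_columns(training_columns, provided_columns):
    """Raise ValueError if any training column is absent from the new data."""
    provided = set(provided_columns)
    missing = [c for c in training_columns if c not in provided]
    if missing:
        raise ValueError("missing required columns: " + ", ".join(missing))


training_columns = ["temperature", "pressure", "batch_id"]

# All training columns are present; extra columns are ignored in this sketch.
check_required_columns(training_columns, ["batch_id", "pressure", "temperature", "extra"])

# 'batch_id' is absent, so the check fails even if it is not a significant driver.
try:
    check_required_columns(training_columns, ["temperature", "pressure"])
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

Running such a check before calling predict gives a clear list of the columns that still have to be supplied.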

to_cpp(file_name)

Export the model to a C++ file.

Args:
file_name (str):
 name of the file to export to.
Raises:
  • DatastoriesError: if there is a problem saving the file.
to_excel(file_name)

Export the model to an Excel file.

Args:
file_name (str):
 name of the file to export to.
Raises:
  • DatastoriesError: if there is a problem saving the file.
to_matlab(file_name)

Export the model to a MATLAB file.

Args:
file_name (str):
 name of the file to export to.
Raises:
  • DatastoriesError: if there is a problem saving the file.
to_py(file_name)

Export the model to a Python file.

Args:
file_name (str):
 name of the file to export to.
Raises:
  • DatastoriesError: if there is a problem saving the file.
to_r(file_name)

Export the model to an R file.

Args:
file_name (str):
 name of the file to export to.
Raises:
  • DatastoriesError: if there is a problem saving the file.
class datastories.api.IPrediction(prediction_type, *args, **kwargs)

Bases: datastories.api.interface.IAnalysisResult

Interface implemented by all prediction results

metrics

A dictionary containing prediction performance metrics.

These metrics are computed when the data frame used for prediction includes KPI values, for the purpose of evaluating the model prediction performance.

class datastories.api.IStory(notes=[], *args, **kwargs)

Bases: datastories.api.interface.IAnalysisResult

Interface implemented by all story analyses

add_note(note)

Add an annotation to the story results.

Existing annotations can be retrieved using the datastories.api.IStory.notes property.

Args:
note (str):the annotation to be added.
clear_note(note_id)

Remove a specific annotation associated with the story analysis.

Args:
note_id (int):the index of the note to be removed.
Raises:
  • ValueError: if the note index is unknown.
clear_notes()

Clear the annotations associated with the story analysis.

static load(file_name)

Loads a previously saved story.

metrics

Returns a set of metrics computed during analysis.

notes

A text representation of all annotations currently associated with the story analysis.

save(file_name)

Saves the story analysis results.
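
The annotation methods listed above can be sketched with a small stand-in class. NotesDemo is hypothetical and only mimics the documented behavior (index-based removal, ValueError for an unknown index); the text format returned by notes is an assumption.

```python
class NotesDemo:
    """Stand-in that mimics the note-handling methods of a story object."""

    def __init__(self):
        self._notes = []

    def add_note(self, note):
        self._notes.append(note)

    def clear_note(self, note_id):
        if not 0 <= note_id < len(self._notes):
            raise ValueError("unknown note index: {}".format(note_id))
        del self._notes[note_id]

    def clear_notes(self):
        self._notes = []

    @property
    def notes(self):
        # Text representation: one numbered line per annotation (format assumed).
        return "\n".join("{}: {}".format(i, n) for i, n in enumerate(self._notes))


story = NotesDemo()
story.add_note("KPI column renamed before analysis")
story.add_note("Rows with sensor faults removed")
story.clear_note(0)  # removes the first annotation; later notes shift down
remaining = story.notes
```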

class datastories.api.IPredictiveStory(notes=[], *args, **kwargs)

Bases: datastories.api.interface.IStory

Interface implemented by all story analyses that generate a predictive model

model

Returns an object of type datastories.api.IPredictiveModel that can be used for making predictions on new data

class datastories.api.IProgressObserver

Interface implemented by all progress report observers

on_progress(progress)

Callback triggered upon progress update.

Args:
progress (float):
 the amount of progress. Possible values: [0-1]
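
A minimal observer might look like the sketch below. PercentLogger is a hypothetical, duck-typed stand-in; in real use it would presumably derive from datastories.api.IProgressObserver, which is an assumption here. The range check reflects the documented [0-1] interval.

```python
class PercentLogger:
    """Collects progress updates as whole percentages."""

    def __init__(self):
        self.seen = []

    def on_progress(self, progress):
        # progress is documented as a float in the interval [0, 1]
        if not 0.0 <= progress <= 1.0:
            raise ValueError("progress must be within [0, 1]")
        self.seen.append(round(progress * 100))


observer = PercentLogger()
for step in (0.0, 0.25, 0.5, 1.0):
    observer.on_progress(step)  # an analysis would issue these calls
```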

Constants and Enumerations

class datastories.api.OutlierType

Enumeration of possible outlier types.

FAR_OUTLIER_HIGH = 2
FAR_OUTLIER_LOW = -2
NO_OUTLIER = 0
OUTLIER_HIGH = 1
OUTLIER_LOW = -1

Data

The datastories.data package contains a collection of classes and functions for handling data and converting it to and from the internal format used by DataStories.

Data Frame Construction

class datastories.data.DataFrame

Encapsulates a data frame in the DataStories format.

Args:
rows (int):number of rows in the data frame.
cols (int):number of columns in the data frame.
types (list):list of value types for the data frame columns.
cols(self)

Get the number of columns in the data frame.

Returns:
  • (int) : number of columns in the data frame.
static from_pandas(df)

Construct a new datastories.data.DataFrame from a Pandas DataFrame object.

Args:
df (obj):the source Pandas DataFrame object.
Returns:
  • The constructed datastories.data.DataFrame object.
get(self, size_t row, size_t col)

Get the value of a cell in the data frame.

Args:
row (int):the index of the cell row.
col (int):the index of the cell column.
Returns:
  • (float|string) : the value in the data frame cell at position (row, col).
get_type(self, size_t col)

Get the type of values in a given column.

Args:
col (int):the index of the column.
Returns:
name(self, size_t col)

Get the name of a specific column.

Args:
col (int):the index of the column.
Returns:
  • (str) : the name of the column with the given index.
names(self)

Get the data frame column names.

Returns:
  • (list) : a list of strings.
static read_csv(filename, delimiter=u', ', decimal=u'.', quote=u'"', int header_rows=1, missing_values=None)
rows(self)

Get the number of rows in the data frame.

Returns:
  • (int) : number of rows in the data frame.
set_float(self, size_t row, size_t col, double val)

Sets the value of a given cell to a new float value.

Args:
row (int):the row index of the cell.
col (int):the column index of the cell.
val (float):the new float value.
set_int(self, size_t row, size_t col, int64_t val)

Sets the value of a given cell to a new int value.

Args:
row (int):the row index of the cell.
col (int):the column index of the cell.
val (int):the new int value.
set_name(self, size_t col, string name)

Set the name of a column in the data frame.

Args:
col (int):the index of the column.
name (str):the new name.
set_string(self, size_t row, size_t col, string val)

Sets the value of a given cell to a new string value.

Args:
row (int):the row index of the cell.
col (int):the column index of the cell.
val (str):the new string value.
to_pandas(self)

Exports the DataFrame to a Pandas DataFrame object.

Returns:
  • The constructed Pandas DataFrame object.

class datastories.data.ColumnType

Possible column types for datastories.data.DataFrame.

DATE = 3
INTEGER = 2
MIXED = 10
NUMERIC = 1
STRING = 4
UNKNOWN = 0

class datastories.data.DataType

Possible cell value types for datastories.data.DataFrame.

DATE = 3
INTEGER = 2
NUMERIC = 1
STRING = 4
UNKOWN = 0

class datastories.data.RangeType

Possible value range types for datastories.data.DataFrame.

CATEGORICAL = 3
INTERVAL = 1
ORDINAL = 2
UNSPECIFIED = 0

datastories.data.prepare_data_frame(data_frame, progress_bar=False)

Prepares a pandas.core.frame.DataFrame object compatible with the DataStories clean-up and type conversion rules.

Pandas DataFrames obtained from external sources are often inconsistent and need to be cleaned up before they can be used for analysis. The clean-up process transforms the data frame, for example by enforcing type conversions and discarding unusable values. DataStories analyses perform the clean-up operation automatically. However, there may be scenarios in which the data has to be cleaned up before it is passed to a DataStories analysis (e.g., a custom feature-engineering stage).

This function can be used to obtain a Pandas DataFrame object that is cleaned up according to the DataStories rules and conventions.

Args:
data_frame (obj):
 the data frame object to convert (either a pandas.core.frame.DataFrame or a datastories.data.DataFrame object);
progress_bar (obj|bool=False):
 An object of type datastories.display.ProgressReporter, or a boolean to get a default implementation (i.e., True to display progress, False to show nothing).
Returns:
  • The constructed Pandas DataFrame object.

Summary Calculation

class datastories.data.DataSummaryResult(stats)

Encapsulates the result of the datastories.data.compute_summary() analysis.

Note: Objects of this class should not be manually constructed.

Attributes:
stats (obj):an object of type datastories.data.TableStatistics wrapping up summary statistics.
vis_settings (obj):
 an object of type datastories.visualization.DataSummaryTableSettings containing visualization settings. Set this object before displaying the visualization or exporting to HTML.
plot(*args, **kwargs)

Displays a graphical representation of the data summary analysis results.

Accepts the same parameters as the constructor for datastories.visualization.DataSummaryTableSettings

select(cols)

Selects a set of columns for further reference.

selected

Retrieves the list of selected columns.

to_csv(file_name)

Exports the data summary to a CSV file.

Args:
file_name (str):
 name of the file to export to.
to_excel(file_name)

Exports the data summary to an Excel file.

Args:
file_name (str):
 name of the file to export to.
to_html(file_name, title='Data Summary', subtitle='')

Exports the data summary visualization to a standalone HTML document.

Args:
file_name (str):
 name of the file to export to;
title (str=’Data Summary’):
 HTML document title;
subtitle (str=’’):
 HTML document subtitle.
to_pandas()

Exports the detailed (column-level) data summary to a Pandas DataFrame.

Returns:
  • The constructed Pandas DataFrame object.

class datastories.data.TableStatistics(name=None, rows=None, columns=None, n=None, n_missing=None, p_missing=None, health=None, health_score=0, df=None, converters=None, n_rows=None, n_columns=None, version={'core': '1.4.0'})

Statistics and data health reports for a given data frame.

Note: Objects of this class should not be manually constructed.
Attributes:
n_rows (int):Number of rows
n_columns (int):
 Number of columns
n (int):Number of values
n_missing (int):
 Number of missing values
p_missing (float):
 Percentage of missing values
health_score (float):
 Health score: 0 (good) - 100 (bad)
health (float):General health value for the data frame (unusable:0, fixable:0.5, great:1).
columns (list):
 List of objects of type datastories.data.ColumnStatistics wrapping up detailed column level statistics

class datastories.data.ColumnStatistics(col=None, id=None, converter=None, label=None, column_type=None, element_type=None, n=None, n_valid=None, n_missing=None, p_missing=None, n_unique=None, min=None, max=None, mean=None, median=None, most_freq=None, first_quartile=None, third_quartile=None, histo_labels=None, histo_counts=None, balance_score=None, balance_health=None, missing_health=None, left_outlier_score=None, right_outlier_score=None, outlier_score=None, left_outlier_health=None, right_outlier_health=None, outlier_health=None, health=None, missing_thr=None, balance_thr=None, outlier_thr=None, bincount=10)

Statistics and data health reports for a given column in a data frame.

Note: Objects of this class should not be manually constructed.
Attributes:
n_rows (int):Number of rows
id (int):The index of the column.
label (str):The label (header values) of the column.
n (int):The length of the column.
n_valid (int):The number of correctly parsed data items.
n_missing (int):
 The number of unreadable data items.
p_missing (float):
 Percent of unreadable data items.
column_type (str):
 Type of the column (ordinal, interval, binary, …)
element_type (str):
 Type of individual data items (float, string, …)
n_unique (int):Number of unique values.
min (float):Minimum value.
max (float):Maximum value.
mean (float):Mean value.
median (float):Median value.
first_quartile (float):
 First quartile (data point under which 25% of data is situated).
third_quartile (float):
 Third quartile (data point under which 75% of data is situated).
histo_labels (list):
 Labels for the histogram bins.
histo_counts (list):
 Counts for the histogram bins.
balance_score (float):
 Score for how balanced the data is, 0 (good) - 100 (bad).
balance_health (float):
 Health value in terms of balance (unusable:0, fixable:0.5, great:1).
missing_health (float):
 Health value in terms of number of missing items (unusable, …).
left_outlier_score (float):
 Metric for outlier impact on the left (i.e., small) side of the data range. Scale: 0 (no outliers detected) - 100 (bad).
right_outlier_score (float):
 Metric for outlier impact on the right (i.e., big) side of the data range. Scale: 0 (no outliers detected) - 100 (bad).
outlier_score (float):
 Metric for the general outlier impact of the data. Scale: 0 (no outlier impact whatsoever) - 100 (bad).
left_outlier_health (float):
 Health value for left outlier impact (unusable:0, fixable:0.5, great:1).
right_outlier_health (float):
 Health value for right outlier impact (unusable:0, fixable:0.5, great:1).
outlier_health (float):
 Health value for outlier impact (unusable:0, fixable:0.5, great:1).
health (float):General health value for this column (unusable:0, fixable:0.5, great:1).
calc_basic_stats()

Generates the basic statistics for the column and sets the corresponding attributes.


datastories.data.compute_summary(data_frame, col_types=None, sample_size=None, progress_bar=False)

Computes a data summary on a provided data frame.

Args:
data_frame (obj):
 the input data frame.
col_types (list=None):
 list of column types to use for extracting statistics. If not provided, the types are inferred from a sample of the data, based on the most frequent value type in each column.
sample_size (int=100|str=’10%’):
 the sample size to use for inferring data types (either absolute integer value or a percentage)
progress_bar (obj|bool=False):
 An object of type datastories.display.ProgressReporter, or a boolean to get a default implementation (i.e., True to display progress, False to show nothing).
Returns:
  • An object of type datastories.data.DataSummaryResult wrapping up the computed summary.

Example:

from datastories.data import compute_summary
import pandas as pd
df = pd.read_csv('example.csv')
summary = compute_summary(df)
print(summary)
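
The two accepted forms of sample_size (an absolute row count or a percentage string) can be sketched as follows. resolve_sample_size is a hypothetical helper; how the library rounds percentages and treats samples larger than the data frame is not documented, so both behaviors are assumptions.

```python
def resolve_sample_size(sample_size, n_rows):
    """Translate an absolute or percentage sample size into a row count.

    Hypothetical helper; rounding and capping behavior are assumptions.
    """
    if isinstance(sample_size, str):
        if not sample_size.endswith("%"):
            raise ValueError("percentage sample sizes must end with '%'")
        count = round(n_rows * float(sample_size[:-1]) / 100.0)
    else:
        count = int(sample_size)
    # Never sample more rows than the data frame contains.
    return max(0, min(count, n_rows))


ten_percent = resolve_sample_size("10%", 2500)
absolute = resolve_sample_size(100, 2500)
capped = resolve_sample_size(100, 40)
```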

Outlier Detection

class datastories.data.OutlierResult(input, outliers)

Encapsulates the result of the datastories.data.compute_outliers() analysis.

Attributes:
valid (bool):a flag indicating whether the result is valid.
Raises:
AssertionError: when calling methods of an invalid result.

Note: Objects of this class should not be manually constructed.

as_index(self, outlier_types=[OutlierType.FAR_OUTLIER_HIGH, OutlierType.FAR_OUTLIER_LOW, OutlierType.OUTLIER_HIGH, OutlierType.OUTLIER_LOW])

A numpy index vector that can be used to select and retrieve outlier values.

The index can be applied on numpy arrays or pandas.core.series.Series objects.

Args:
outlier_types (list):
 list of datastories.data.OutlierType values to specify which outliers to retrieve. By default, all outliers are included (i.e., outlier_types = [OutlierType.FAR_OUTLIER_HIGH, OutlierType.FAR_OUTLIER_LOW, OutlierType.OUTLIER_HIGH, OutlierType.OUTLIER_LOW])
as_itemgetter(self, outlier_types=[OutlierType.FAR_OUTLIER_HIGH, OutlierType.FAR_OUTLIER_LOW, OutlierType.OUTLIER_HIGH, OutlierType.OUTLIER_LOW])

An operator.itemgetter object that can be used to select and retrieve outlier values from a list.

Args:
outlier_types (list):
 list of datastories.data.OutlierType values to specify which outliers to retrieve. By default, all outliers are included (i.e., outlier_types = [OutlierType.FAR_OUTLIER_HIGH, OutlierType.FAR_OUTLIER_LOW, OutlierType.OUTLIER_HIGH, OutlierType.OUTLIER_LOW])
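
The idea behind as_itemgetter can be sketched with the standard library alone. The OutlierType class below is a local mirror of the documented enumeration values, and make_itemgetter, values and labels are hypothetical; the sketch shows how a point-wise classification can be turned into an operator.itemgetter, not how the library builds it.

```python
from enum import IntEnum
from operator import itemgetter


class OutlierType(IntEnum):
    """Local mirror of the documented enumeration values."""
    FAR_OUTLIER_LOW = -2
    OUTLIER_LOW = -1
    NO_OUTLIER = 0
    OUTLIER_HIGH = 1
    FAR_OUTLIER_HIGH = 2


def make_itemgetter(labels, outlier_types):
    """Build a getter for the positions whose label is in outlier_types."""
    positions = [i for i, label in enumerate(labels) if label in outlier_types]
    if not positions:
        return lambda seq: ()
    getter = itemgetter(*positions)
    if len(positions) == 1:
        # itemgetter with a single index returns a bare item; wrap it in a tuple
        return lambda seq: (getter(seq),)
    return getter


values = [10.2, 9.8, 55.0, 10.1, -40.0]
labels = [0, 0, 2, 0, -2]  # hypothetical point-wise classification

high_only = make_itemgetter(
    labels, [OutlierType.FAR_OUTLIER_HIGH, OutlierType.OUTLIER_HIGH]
)
all_outliers = make_itemgetter(
    labels,
    [
        OutlierType.FAR_OUTLIER_HIGH,
        OutlierType.FAR_OUTLIER_LOW,
        OutlierType.OUTLIER_HIGH,
        OutlierType.OUTLIER_LOW,
    ],
)
```

Applying high_only or all_outliers to any list aligned with the input returns the selected items as a tuple.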
clip_to_iqr(self, low_threshold=0.05, high_threshold=0.95)

Marks as outliers values that are outside a specific inter-quartile range.

This operation can be undone via the reset method.

Args:
low_threshold (float=0.05):
 the lower bound of the inter-quartile range. Should be in the interval [0,1].
high_threshold (float=0.95):
 the upper bound of the inter-quartile range. Should be in the interval [0,1].
Raises:
ValueError: when the input arguments are not valid.
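
One way to picture the clipping is the sketch below, which flags values that fall outside the range spanned by two quantiles. The function names are hypothetical, and the linear-interpolation quantile is an assumption; the library may compute its bounds differently.

```python
def quantile(sorted_values, q):
    """Linearly interpolated quantile of an already sorted list."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1 - fraction) + sorted_values[upper] * fraction


def outside_quantile_range(values, low_threshold=0.05, high_threshold=0.95):
    """Return a boolean mask marking values outside the given quantile range."""
    if not 0 <= low_threshold <= high_threshold <= 1:
        raise ValueError("thresholds must satisfy 0 <= low <= high <= 1")
    if not values:
        return []
    ordered = sorted(values)
    low = quantile(ordered, low_threshold)
    high = quantile(ordered, high_threshold)
    return [v < low or v > high for v in values]


data = list(range(1, 101))
mask = outside_quantile_range(data)
flagged = [v for v, is_out in zip(data, mask) if is_out]
```

With the default thresholds and the values 1 to 100, the five smallest and the five largest values are flagged.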
metrics

A dictionary containing outlier detection metrics.

The following metrics are retrieved:
Outliers:total number of outliers
Outliers Low:number of lower outliers
Outliers High:number of higher outliers
Close Outliers:number of close outliers
Close Outliers Low:
 number of lower close outliers
Close Outliers High:
 number of higher close outliers
Far Outliers:number of far outliers
Far Outliers Low:
 number of lower far outliers
Far Outliers High:
 number of higher far outliers
NaN:number of NaN values
Normal:number of values that are neither outliers nor NaN
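
The relationship between these counters can be sketched by tallying a point-wise classification. The constants mirror the documented OutlierType values; outlier_metrics is a hypothetical helper, and it assumes that the total is the sum of close and far outliers and that NaN values are counted separately from all other categories.

```python
import math
from collections import Counter

# Local mirror of the documented OutlierType values.
FAR_OUTLIER_LOW, OUTLIER_LOW, NO_OUTLIER, OUTLIER_HIGH, FAR_OUTLIER_HIGH = -2, -1, 0, 1, 2


def outlier_metrics(values, labels):
    """Tally a point-wise classification into the metric names listed above."""
    nan_count = sum(1 for v in values if math.isnan(v))
    counts = Counter(label for v, label in zip(values, labels) if not math.isnan(v))
    close_low, close_high = counts[OUTLIER_LOW], counts[OUTLIER_HIGH]
    far_low, far_high = counts[FAR_OUTLIER_LOW], counts[FAR_OUTLIER_HIGH]
    return {
        "Outliers": close_low + close_high + far_low + far_high,
        "Outliers Low": close_low + far_low,
        "Outliers High": close_high + far_high,
        "Close Outliers": close_low + close_high,
        "Close Outliers Low": close_low,
        "Close Outliers High": close_high,
        "Far Outliers": far_low + far_high,
        "Far Outliers Low": far_low,
        "Far Outliers High": far_high,
        "NaN": nan_count,
        "Normal": counts[NO_OUTLIER],
    }


values = [1.0, 2.0, float("nan"), 50.0, -30.0, 3.0]
labels = [0, 0, 0, 2, -1, 0]  # hypothetical classification of the values
summary = outlier_metrics(values, labels)
```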
plot(self, *args, **kwargs)

Displays a graphical representation of the outlier analysis results.

Accepts the same parameters as the constructor for datastories.visualization.OutlierPlotSettings objects.

reset(self)

Resets outliers to original values, as computed by the datastories.data.compute_outliers() analysis.

to_csv(self, file_name, content=u'metrics')

Exports a list of detected outliers or metrics to a CSV file.

Args:
file_name (str):
 name of the file to export to.
content (str=metrics):
 the type of content to export. Possible values: metrics (exports outlier detection metrics) or outliers (exports the point-wise outlier classification).
Raises:
ValueError: when an invalid value is provided for the content parameter.
to_excel(self, file_name)

Exports the list of detected outliers and metrics to an Excel file.

Args:
file_name (str):
 name of the file to export to.
to_html(self, file_name, title=u'Outliers', subtitle=u'')

Exports the outliers visualization to a standalone HTML document.

Args:
file_name (str):
 name of the file to export to;
title (str=’Outliers’):
 HTML document title;
subtitle (str=’’):
 HTML document subtitle.
to_pandas(self, content=u'metrics')

Exports a list of detected outliers or metrics to a pandas.core.series.Series object.

Args:
content (str=metrics):
 the type of content to export. Possible values: metrics (exports outlier detection metrics) or outliers (exports the point-wise outlier classification).
Returns:
The constructed pandas.core.series.Series object.
Raises:
ValueError: when an invalid value is provided for the content parameter.
update(self, updates)

Updates the list of detected outliers with manual corrections.

updated

A list of manual corrections applied to the detected outliers.

vis_settings

An object of type datastories.visualization.OutlierPlotSettings encapsulating the outlier visualization settings


datastories.data.compute_outliers(input, ref=None, double strictness=0.25, outlier_vote_threshold=None, far_outlier_vote_threshold=None)

Identifies numeric outliers in a 1D or 2D space.

This function can be used in two ways: with the strictness parameter only (the last two parameters are left at their defaults and are computed as a function of the strictness), or by setting the last two parameters manually, in which case the strictness is ignored.

Args:
input (list|obj|ndarray):
 numeric input vector; can be either a list, a pandas.core.series.Series object or a numpy numeric array;
ref (list|obj|ndarray=None):
 abscissa vector for the 2D case; can be either a list, a pandas.core.series.Series object or a numpy numeric array;
strictness (double=0.25):
 determines how strictly the algorithm selects outliers; higher values yield fewer outliers. Value range: [0-1];
outlier_vote_threshold (double=None):
 determines when a point is considered an outlier; higher values yield fewer outliers. Value range: [0-100]. When left unspecified, it is set to 100 * strictness.
far_outlier_vote_threshold (double=None):
 determines when a point is considered a far outlier; higher values yield fewer outliers. This must be larger than outlier_vote_threshold. Default is outlier_vote_threshold + 50. Value range: [0-100].
Returns:
An object of type datastories.data.OutlierResult wrapping-up the computed outliers.
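
The documented defaults for the two vote thresholds can be written out as a short sketch. resolve_vote_thresholds is a hypothetical helper that only reproduces the arithmetic stated in the parameter descriptions; the validation and the absence of clamping to 100 are assumptions.

```python
def resolve_vote_thresholds(strictness=0.25, outlier_vote_threshold=None,
                            far_outlier_vote_threshold=None):
    """Derive the vote thresholds from strictness unless they are given."""
    if not 0 <= strictness <= 1:
        raise ValueError("strictness must be within [0, 1]")
    if outlier_vote_threshold is None:
        # Documented default: 100 * strictness
        outlier_vote_threshold = 100 * strictness
    if far_outlier_vote_threshold is None:
        # Documented default: outlier_vote_threshold + 50
        far_outlier_vote_threshold = outlier_vote_threshold + 50
    if far_outlier_vote_threshold <= outlier_vote_threshold:
        raise ValueError("far_outlier_vote_threshold must be larger than "
                         "outlier_vote_threshold")
    return outlier_vote_threshold, far_outlier_vote_threshold


defaults = resolve_vote_thresholds()
manual = resolve_vote_thresholds(outlier_vote_threshold=30, far_outlier_vote_threshold=90)
```

With the default strictness of 0.25, the derived thresholds are 25 and 75.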

Example:

from datastories.data import compute_outliers
import pandas as pd
df = pd.read_csv('example.csv')
outliers = compute_outliers(df['my_column'])
print(outliers)

Classification

The datastories.classification package contains a collection of classes and functions to facilitate classification analysis.

Feature Ranking

datastories.classification.rank_features(data_set, kpi, metric=FeatureRankingMetric.Accuracy) → FeatureRankResult

Computes the relative importance of columns in a dataframe for predicting a binary KPI.

The scoring is based on maximizing the prediction accuracy with respect to the KPI while iteratively splitting the dataframe rows.

Args:
data_set (obj):a DataStories or a Pandas DataFrame object.
kpi (int|str):the index or the name of the KPI column.
metric (enum = FeatureRankingMetric.Accuracy):
 a FeatureRankingMetric value specifying the metric used to rank the features. Possible values: FeatureRankingMetric.Accuracy
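Returns:
  • An object of type datastories.classification.FeatureRankResult wrapping up the computed feature ranks.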
Raises:
  • TypeError: if data_set is not a DataFrame or a Pandas DataFrame object.
  • ValueError: if kpi is not a valid column name or index value (e.g., out-of-range index).

Example:

from datastories.classification import rank_features
import pandas as pd
df = pd.read_csv('example.csv')
kpi_column_index = 1
ranks = rank_features(df, kpi_column_index)
print(ranks)

class datastories.classification.FeatureRankResult(c_split_list, data_frame, kpi, metric)

Bases: datastories.api.interface.IAnalysisResult

Encapsulates the result of the datastories.classification.rank_features() analysis.

Note: Objects of this class should not be manually constructed.

feature_ranks

Retrieves the feature ranks computed by the datastories.classification.rank_features() analysis.

Returns:
  • (list): a list of datastories.classification.RankingSplit objects.
get_feature_ranks(self)

Retrieves the feature ranks computed by the datastories.classification.rank_features() analysis.

Returns:
  • (list): a list of datastories.classification.RankingSplit objects.
metric_map = {<FeatureRankingMetric.Accuracy: 0>: 'Accuracy'}
plot(self, *args, **kwargs)

Displays a graphical representation of the rank features analysis results.

Accepts the same parameters as the constructor for datastories.visualization.FeatureRanksTableSettings objects.

select(self, cols)

Selects a number of column names as features.

selected

The list of column names currently selected as features.

to_csv(self, file_name, delimiter=u', ', decimal=u'.')

Exports the list of ranking scores to a CSV file.

Args:
file_name (str):
 name of the file to export to.
delimiter (str=’,’):
 CSV delimiter
decimal (str=’.’):
 CSV decimal point
to_excel(self, file_name)

Exports the list of ranking scores to an Excel file.

Args:
file_name (str):
 name of the file to export to.
to_html(self, file_name, title=u'Feature Ranks', subtitle=u'')

Exports the feature ranks visualization to a standalone HTML document.

Args:
file_name (str):
 name of the file to export to;
title (str=’Feature Ranks’):
 HTML document title;
subtitle (str=’’):
 HTML document subtitle.
to_pandas(self, ranking_column=u'Score', min_threshold=0.0)

Exports the list of ranking scores to a Pandas DataFrame object.

Args:
ranking_column (str=Score):
 Column used to compute the rank and order the data frame. This can be useful to discover interesting variables that are penalised because they have a lot of missing values.
min_threshold (float):
 A cutoff threshold for the minimum score that a variable should have in order to be exported.
Returns:
  • The constructed Pandas DataFrame object.
vis_settings

An object of type datastories.visualization.FeatureRanksTableSettings encapsulating the feature ranks visualization settings.


Correlation

The datastories.correlation package contains a collection of classes and functions to facilitate correlation analysis.

Prototype Detection

datastories.correlation.compute_prototypes(data_set, kpi, double prototype_threshold: float = 0.85, fast_approximation: bool = True, double missing_value_threshold: float = 0.5) → PrototypeResult

Identifies a set of mutually uncorrelated variables from a data frame.

Correlation estimation is based on the Mutual Information Content measure.

Each variable in the set has the following properties:

  • it is not significantly correlated to any other variable in the set;
  • it can be highly correlated to other variables that are not included in the set;
  • it has a higher KPI correlation score than all the other variables that are highly correlated to it.

Each variable that is not included in the set is highly correlated to a variable in the set.
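
A simplified way to picture prototype selection is the greedy sketch below. It is not the library's algorithm: it works on precomputed correlation scores instead of Mutual Information Content, and select_prototypes and its inputs are hypothetical. It guarantees that no two selected variables are correlated at or above the threshold, and that every rejected variable is correlated at or above the threshold with a selected variable whose KPI score is at least as high.

```python
def select_prototypes(kpi_scores, pair_correlation, prototype_threshold=0.85):
    """Greedy prototype selection over precomputed correlation scores.

    kpi_scores: dict mapping variable name -> correlation with the KPI.
    pair_correlation: dict mapping frozenset({a, b}) -> correlation of a and b;
        missing pairs are treated as uncorrelated (0.0).
    Returns a dict mapping each prototype to the list of its proxies.
    """
    prototypes = {}
    # Visit variables from the strongest KPI correlation to the weakest.
    for name in sorted(kpi_scores, key=kpi_scores.get, reverse=True):
        for chosen in prototypes:
            score = pair_correlation.get(frozenset((name, chosen)), 0.0)
            if score >= prototype_threshold:
                prototypes[chosen].append(name)  # name becomes a proxy of chosen
                break
        else:
            prototypes[name] = []  # not highly correlated with any prototype so far
    return prototypes


kpi_scores = {"a": 0.9, "b": 0.8, "c": 0.5, "d": 0.4}
pair_correlation = {
    frozenset(("a", "b")): 0.95,  # b is nearly redundant with a
    frozenset(("c", "d")): 0.20,
}
result = select_prototypes(kpi_scores, pair_correlation)
```

In this toy input, a, c and d end up as prototypes and b becomes a proxy of a.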

Args:
data_set (obj):a DataStories or a Pandas dataframe.
kpi (int|str):the index or the name of the KPI column.
prototype_threshold (float = 0.85):
 correlation threshold for features to be considered proxies.
fast_approximation (bool = True):
 approximate the mutual information, this provides a significant speedup with little precision loss.
missing_value_threshold (float = 0.5):
 missing values threshold for excluding features from prototypes.
Returns:
  • An object of type datastories.correlation.PrototypeResult wrapping up the computed prototypes.
Raises:
  • TypeError: if data_set is not a DataFrame or a Pandas DataFrame object.
  • ValueError: if kpi is not a valid column name or index value (e.g., out-of-range index).

Example:

from datastories.correlation import compute_prototypes
import pandas as pd
df = pd.read_csv('example.csv')
kpi_column_index = 1
prototypes = compute_prototypes(df, kpi_column_index)
print(prototypes)

class datastories.correlation.PrototypeResult(c_prototype_list)

Bases: datastories.api.interface.IAnalysisResult

Encapsulates the result of the datastories.correlation.compute_prototypes() analysis.

Note: Objects of this class should not be manually constructed.

get_prototypes(self)

Retrieves the list of prototypes computed by the datastories.correlation.compute_prototypes() analysis.

Returns:
  • (list): a list of datastories.correlation.Prototype objects.
plot(self, *args, **kwargs)

Displays a graphical representation of the prototype analysis results.

Accepts the same parameters as the constructor for datastories.visualization.PrototypeTableSettings objects.

prototypes

Retrieves the list of column names currently selected as prototypes.

Returns:
  • (list): a list of column names.
select(self, cols)

Selects a number of column names as prototypes.

selected

Retrieves the list of column names currently selected as prototypes.

Returns:
  • (list): a list of column names.
to_csv(self, file_name, delimiter=u', ', decimal=u'.')

Exports the list of prototypes to a CSV file.

Args:
file_name (str):
 name of the file to export to.
delimiter (str=’,’):
 CSV delimiter
decimal (str=’.’):
 CSV decimal point
to_excel(self, file_name)

Exports the list of prototypes to an Excel file.

Args:
file_name (str):
 name of the file to export to.
to_html(self, file_name, title=u'Prototypes', subtitle=u'')

Exports the prototypes visualization to a standalone HTML document.

Args:
file_name (str):
 name of the file to export to;
title (str=’Prototypes’):
 HTML document title;
subtitle (str=’’):
 HTML document subtitle.
to_pandas(self)

Exports the list of prototypes to a Pandas DataFrame object.

Returns:
  • The constructed Pandas DataFrame object.
vis_settings

An object of type datastories.visualization.PrototypeTableSettings encapsulating the prototype visualization settings.


class datastories.correlation.Prototype(c_prototype)

Encapsulates prototype information data.

Attributes:
info (obj):an object of type datastories.correlation.CorrelationInfo describing the correlation of the prototype with respect to the KPI.
proxy_list (list):
 a list of datastories.correlation.CorrelationInfo objects corresponding to highly correlated variables with respect to the prototype.

class datastories.correlation.CorrelationInfo(c_correlation_info)

Encapsulates correlation information for a variable with respect to a reference.

Attributes:
col_index (int):
 the index of the variable in the input data frame.
col_name (str):the name of the variable.
correlation (float):
 the correlation score with respect to the reference.

Model

The datastories.model package contains a collection of classes that encapsulate data models (e.g., prediction models computed by regression or classification analysis).


class datastories.model.Model

A DataStories model.

evaluate(self, data_frame)

Evaluate the model on an input data frame.

Args:

  • data_frame (obj):
    the input data frame (either a datastories.data.DataFrame or a Pandas DataFrame object). This has to include the input variables for the model.

Returns:

A data frame including the evaluated output variables of the model. Can be either a datastories.data.DataFrame or a Pandas DataFrame object, depending on the provided input.
inputs

A list of input model variable names

outputs

A list of output model variable names

plot(self, *args, **kwargs)

Displays a graphical representation of the prediction model.

Accepts the same parameters as the constructor for datastories.visualization.WhatIfsSettings

save(self, file_name=None)

Serialize the model to a file or a bytes object.

Args:

  • file_name (str = None):
    Name of the output file. If omitted, the model is serialized to a bytes object, which is returned by the function.

Returns:

A bytes object containing the model when the file_name argument is omitted or set to None.
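
The dual behavior of save (write to a file, or return bytes when no file name is given) follows a common pattern that can be sketched as below. ToyModel is hypothetical and uses pickle purely for illustration; the serialization format used by datastories is not documented here.

```python
import os
import pickle
import tempfile


class ToyModel:
    """Stand-in object with a save method that mirrors the documented contract."""

    def __init__(self, coefficients):
        self.coefficients = coefficients

    def save(self, file_name=None):
        payload = pickle.dumps(self.coefficients)
        if file_name is None:
            return payload  # no file name: hand the serialized bytes back
        with open(file_name, "wb") as handle:
            handle.write(payload)
        return None


model = ToyModel({"x1": 0.5, "x2": -1.25})
as_bytes = model.save()

path = os.path.join(tempfile.mkdtemp(), "toy_model.bin")
model.save(path)
with open(path, "rb") as handle:
    on_disk = handle.read()
```

Returning bytes is convenient when the model has to be stored somewhere other than the local file system, for example in a database field.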
to_cpp(self, file_name)

Export the model to a C++ file.

Args:

  • file_name (str): name of the file to export to.

Raises:

  • datastories.api.DatastoriesError: if there is a problem saving the file.
to_excel(self, file_name)

Export the model to an Excel file.

Args:

  • file_name (str): name of the file to export to.

Raises:

  • datastories.api.DatastoriesError: if there is a problem saving the file.
to_matlab(self, file_name)

Export the model to a MATLAB file.

Args:

  • file_name (str): name of the file to export to.

Raises:

  • datastories.api.DatastoriesError: if there is a problem saving the file.
to_py(self, file_name)

Export the model to a Python file.

Args:

  • file_name (str): name of the file to export to.

Raises:

  • datastories.api.DatastoriesError: if there is a problem saving the file.
to_r(self, file_name)

Export the model to an R file.

Args:

  • file_name (str): name of the file to export to.

Raises:

  • datastories.api.DatastoriesError: if there is a problem saving the file.
variables

A dictionary mapping model variables to corresponding information such as variable type and range.

Returns:

Dictionary string -> datastories.model.VariableInfo.

class datastories.model.VariableInfo

Holds information about a model variable, such as ranges and types.

categories

Get the registered categories of the associated variable (i.e., if the variable is categorical).

index

Get the index of the associated variable.

is_input

Check if the associated variable is an input for the model.

max

Get the maximum value of the associated variable.

min

Get the minimum value of the associated variable.

range_type

Get the range type of the associated variable.

type

Get the type of the associated variable.


class datastories.model.SingleKpiPredictor(kpi_name, column_names, prediction_type, prediction_performance, *args, **kwargs)

Bases: datastories.api.interface.IPredictiveModel, datastories.core.utils.object_.StorageBackedObject

Encapsulates prediction models (e.g., computed using datastories.story.predict_single_kpi()).

Note: Objects of this class should not be manually constructed.
Attributes:
vis_settings (obj):
 an object of type datastories.visualization.PredictedVsActualSettings containing visualization settings. Set this object before displaying the visualization or exporting to HTML.
error_plot

A visualization for assessing model prediction errors, as discovered while training the model.

Returns:
maximize(progress_bar=True)

Compute the input combination that maximizes the predictive model output.

Args:

  • progress_bar (obj|bool=True):
    An object of type datastories.display.ProgressReporter, or a boolean to get a default implementation (i.e., True to display progress, False to show nothing).

Returns:

A datastories.optimization.OptimizationResult object encapsulating the model variable values that maximize the model outputs.
metrics

A dictionary containing model prediction performance metrics.

The types of metrics depend on the model type (i.e., regression or classification).

For regression models the metrics include:
Correlation: actual vs predicted correlation
Estimated Correlation: estimated correlation for future (unseen) values
R-squared: the coefficient of determination
MSE: mean squared error
RMSE: root mean squared error
For binary classification models the metrics include:
Positive Label: the label used to identify positive cases
Negative Label: the label used to identify negative cases
True Positives: number of correctly identified positive cases (TP)
False Positives: number of incorrectly identified positive cases (FP)
True Negatives: number of correctly identified negative cases (TN)
False Negatives: number of incorrectly identified negative cases (FN)
Not Classified: number of records that could not be classified (i.e., KPI is NaN)
True Positive Rate: TP / (TP + FN) * 100 (a.k.a. sensitivity, recall)
False Positive Rate: FP / (FP + TN) * 100 (a.k.a. fall-out)
True Negative Rate: TN / (FP + TN) * 100 (a.k.a. specificity)
False Negative Rate: FN / (TP + FN) * 100 (a.k.a. miss rate)
Precision: percentage of correctly identified cases out of all reported positive cases, TP / (TP + FP) * 100
Recall: percentage of correctly identified cases out of all existing positive cases, TP / (TP + FN) * 100
Accuracy: percentage of correctly identified cases, (TP + TN) / (TP + FP + TN + FN) * 100
F1 Score: the F1 score (the harmonic mean of precision and recall)
AUC: area under the (ROC) curve
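
As an illustration of the formulas listed above, the following sketch derives the rate metrics from the four confusion-matrix counts. The function name `classification_rates` is hypothetical and not part of the datastories API. Returning 0.0 for a zero denominator and expressing the F1 score on the same 0-100 scale as the other rates are assumptions made for this example only.

```python
def classification_rates(tp, fp, tn, fn):
    """Derive percentage metrics from confusion-matrix counts (illustrative only)."""

    def pct(numerator, denominator):
        # Assumption: report 0.0 when the denominator is zero.
        return 100.0 * numerator / denominator if denominator else 0.0

    precision = pct(tp, tp + fp)
    recall = pct(tp, tp + fn)
    # F1 is the harmonic mean of precision and recall (kept on the 0-100 scale here).
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "True Positive Rate": recall,
        "False Positive Rate": pct(fp, fp + tn),
        "True Negative Rate": pct(tn, fp + tn),
        "False Negative Rate": pct(fn, tp + fn),
        "Precision": precision,
        "Recall": recall,
        "Accuracy": pct(tp + tn, tp + fp + tn + fn),
        "F1 Score": f1,
    }


# Hypothetical counts, chosen only to exercise the formulas.
rates = classification_rates(tp=40, fp=10, tn=45, fn=5)
print(rates["Precision"], rates["Accuracy"])
```

Note that the true positive rate and the false negative rate share a denominator, so they always add up to 100; the same holds for the true negative and false positive rates.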
minimize(progress_bar=True)

Compute the input combination that minimizes the predictive model output.

Args:

  • progress_bar (obj|bool=True):
    An object of type datastories.display.ProgressReporter, or a boolean to get a default implementation (i.e., True to display progress, False to show nothing).

Returns:

A datastories.optimization.OptimizationResult object encapsulating the model variable values that minimize the model outputs.
optimize(optimization_spec=<datastories.optimization.specification.OptimizationSpecification object>, progress_bar=True)

Compute an optimum input/output combination according to an (optional) optimization specification.

Args:

Returns:

A datastories.optimization.OptimizationResult object encapsulating the model variable values that satisfy the optimization specification.
plot(*args, **kwargs)

Displays a graphical representation of the prediction model.

Accepts the same parameters as the constructor for datastories.visualization.ConfusionMatrixSettings (for classification models) or datastories.visualization.PredictedVsActualSettings (for regression models).

predict(data_frame)

Predict the model KPI on a new data frame.

Args:
data_frame (obj):
 the data frame on which the model associated KPI is to be predicted.
Returns:
  • An object of type datastories.regression.PredictionResult encapsulating the prediction results.
Raises:
  • ValueError: if not all required columns are provided.

Note: All columns present in the training data frame are required for making predictions even if they are not significant for the prediction.
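
The note above implies that a data frame can be checked against the training columns before calling predict(). The sketch below shows such a check in plain Python. The function `check_required_columns` and the column names are hypothetical, and the actual error message raised by the library may differ.

```python
def check_required_columns(required, provided):
    """Raise ValueError when any required column is absent (illustrative only)."""
    provided_set = set(provided)
    missing = [name for name in required if name not in provided_set]
    if missing:
        raise ValueError("Missing required columns: " + ", ".join(missing))


training_columns = ["Input_1", "Input_2", "Input_3"]  # hypothetical column names
new_data_columns = ["Input_1", "Input_3"]             # Input_2 was left out

try:
    check_required_columns(training_columns, new_data_columns)
except ValueError as error:
    message = str(error)
    print(message)
```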

rebuild(score_threshold=None)

Rebuilds the prediction model using custom settings.

Args:
score_threshold (float=None):
 the decision threshold for binary KPI models. If missing, the optimal decision threshold will be determined automatically.

Note: In order to make changes permanent (i.e., survive story reloads) the associated story has to be saved after executing this method. To save a story use datastories.story.PredictSingleKpiStory.save().
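
To make the role of a decision threshold concrete, the following sketch maps numeric scores to labels. It assumes scores at or above the threshold are treated as positive; the function `classify` and the label values are hypothetical, and the way the library computes scores or picks an optimal threshold is not shown here.

```python
def classify(scores, score_threshold=0.5, positive_label="yes", negative_label="no"):
    """Map numeric scores to labels using a decision threshold (illustrative only)."""
    # Assumption: a score equal to the threshold counts as positive.
    return [positive_label if s >= score_threshold else negative_label for s in scores]


scores = [0.10, 0.45, 0.55, 0.90]  # hypothetical model scores
default_labels = classify(scores, score_threshold=0.5)
lowered_labels = classify(scores, score_threshold=0.4)
print(default_labels)
print(lowered_labels)
```

Lowering the threshold turns more records into positive predictions, which typically raises recall at the expense of precision.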

to_cpp(file_name)

Export the model to a C++ file.

Args:

  • file_name (str): name of the file to export to.

Raises:

  • datastories.api.DatastoriesError: if there is a problem saving the file.
to_excel(file_name)

Export the model to an Excel file.

Args:

  • file_name (str): name of the file to export to.

Raises:

  • datastories.api.DatastoriesError: if there is a problem saving the file.
to_html(file_name, title='Predictive Model', subtitle='Predicted vs Actual')

Exports a visual representation of the prediction model to a standalone HTML document.

Args:
file_name (str):
 name of the file to export to;
title (str=’Predictive Model’):
 HTML document title;
subtitle (str=’Predicted vs Actual’):
 HTML document subtitle.
to_matlab(file_name)

Export the model to a MATLAB file.

Args:

  • file_name (str): name of the file to export to.

Raises:

  • datastories.api.DatastoriesError: if there is a problem saving the file.
to_py(file_name)

Export the model to a Python file.

Args:

  • file_name (str): name of the file to export to.

Raises:

  • datastories.api.DatastoriesError: if there is a problem saving the file.
to_r(file_name)

Export the model to an R file.

Args:

  • file_name (str): name of the file to export to.

Raises:

  • datastories.api.DatastoriesError: if there is a problem saving the file.

class datastories.model.SingleKpiPrediction(prediction_type, prediction, data, kpi_name, is_test=False)

Bases: datastories.api.interface.IPrediction

Encapsulates the results of a prediction done using a datastories.model.SingleKpiPredictor object.

Note: Objects of this class should not be manually constructed.

Attributes:
vis_settings (obj):
 an object of type datastories.visualization.PredictedVsActualSettings containing visualization settings. Set this object before displaying the visualization.
error_plot

A visualization for assessing model prediction errors.

Returns:
metrics

A dictionary containing prediction performance metrics.

These metrics are computed when the data frame used for prediction includes KPI values, for the purpose of evaluating the model prediction performance.

The following metrics are retrieved:
Number of Records: number of records submitted for prediction
Correlation: actual vs predicted correlation
R-squared: the coefficient of determination
MSE: mean squared error
RMSE: root mean squared error
In case the KPI is a binary variable, the following additional metrics are included:
Positive Label: the label used to identify positive cases
Negative Label: the label used to identify negative cases
True Positives: number of correctly identified positive cases (TP)
False Positives: number of incorrectly identified positive cases (FP)
True Negatives: number of correctly identified negative cases (TN)
False Negatives: number of incorrectly identified negative cases (FN)
Not Classified: number of records that could not be classified (i.e., KPI is NaN)
True Positive Rate: TP / (TP + FN) * 100 (a.k.a. sensitivity, recall)
False Positive Rate: FP / (FP + TN) * 100 (a.k.a. fall-out)
True Negative Rate: TN / (FP + TN) * 100 (a.k.a. specificity)
False Negative Rate: FN / (TP + FN) * 100 (a.k.a. miss rate)
Precision: percentage of correctly identified cases out of all reported positive cases, TP / (TP + FP) * 100
Recall: percentage of correctly identified cases out of all existing positive cases, TP / (TP + FN) * 100
Accuracy: percentage of correctly identified cases, (TP + TN) / (TP + FP + TN + FN) * 100
F1 Score: the F1 score (the harmonic mean of precision and recall)
AUC: area under the (ROC) curve
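
The regression metrics listed above can be reproduced from paired actual and predicted values. The sketch below uses the textbook definitions (Pearson correlation, and R-squared as 1 - SS_res / SS_tot). Whether the library uses exactly these definitions is an assumption, and the function `regression_metrics` is hypothetical.

```python
import math


def regression_metrics(actual, predicted):
    """Compute basic regression metrics from paired values (illustrative only)."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    # Pearson correlation between actual and predicted values.
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    correlation = cov / math.sqrt(var_a * var_p)
    # Sum of squared errors, mean squared error and its square root.
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    mse = sse / n
    # Coefficient of determination: 1 - SS_res / SS_tot.
    r_squared = 1.0 - sse / var_a
    return {
        "Number of Records": n,
        "Correlation": correlation,
        "R-squared": r_squared,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
    }


# Hypothetical KPI values and predictions.
metrics = regression_metrics(actual=[1.0, 2.0, 3.0, 4.0], predicted=[1.5, 2.0, 2.5, 4.0])
print(metrics)
```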
plot(*args, **kwargs)

Displays a graphical representation of the prediction performance.

Accepts the same parameters as the constructor for datastories.visualization.ConfusionMatrixSettings (for classification-based predictions) or datastories.visualization.PredictedVsActualSettings (for regression-based predictions).

to_csv(file_name, keep_metrics=True, delimiter=', ', decimal='.')

Exports the list of predictions to a CSV file.

Args:
file_name (str):
 name of the file to export to.
keep_metrics (bool=True):
 True if prediction metrics should be included as additional columns.
delimiter (str=’,’):
 CSV delimiter
decimal (str=’.’):
 CSV decimal point
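
The effect of the delimiter and decimal arguments can be illustrated with the standard library alone. The helper `rows_to_csv` below is hypothetical and only mimics the two options; it does not reproduce how the library formats its output.

```python
import csv
import io


def rows_to_csv(header, rows, delimiter=",", decimal="."):
    """Serialize rows to CSV text with a custom delimiter and decimal mark."""

    def fmt(value):
        # Only floats carry a decimal point that needs replacing.
        if isinstance(value, float):
            return str(value).replace(".", decimal)
        return value

    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerow(header)
    for row in rows:
        writer.writerow([fmt(v) for v in row])
    return buffer.getvalue()


# Semicolon-separated output with a decimal comma, as used in many European locales.
text = rows_to_csv(["id", "prediction"], [[1, 0.25], [2, 1.5]], delimiter=";", decimal=",")
print(text)
```

Choosing a delimiter that differs from the decimal mark avoids the need to quote numeric fields.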
to_excel(file_name, keep_metrics=True)

Exports the list of predictions to an Excel file.

Args:
file_name (str):
 name of the file to export to.
keep_metrics (bool=True):
 True if prediction metrics should be included as additional columns.
to_html(file_name, title='Prediction Performance', subtitle='Predicted vs Actual')

Exports a visual representation of the prediction performance to a standalone HTML document.

Args:
file_name (str):
 name of the file to export to;
title (str=’Prediction Performance’):
 HTML document title;
subtitle (str=’Predicted vs Actual’):
 HTML document subtitle.
to_pandas(keep_metrics=True)

Exports the list of predictions to a pandas.core.frame.DataFrame object.

Args:
keep_metrics (bool=True):
 True if prediction metrics should be included as additional columns.
Returns:
  • The constructed pandas.core.frame.DataFrame object.

Optimization

The datastories.optimization package contains a collection of classes and functions for optimizing models.


datastories.optimization.create_optimizer(*args, **kwargs)

Factory method for creating optimizers.

Returns:

An object of type datastories.optimization.pso.Optimizer that can be used to perform optimization analyses on a datastories.model.Model object.

Example:

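from datastories.model import Model
from datastories.optimization import (
    AtMost, Maximize, Minimize, OptimizationSpecification, create_optimizer
)

# Load a previously saved model
model = Model("my_model.rsx")

# Objectives: minimize one KPI while maximizing another
spec = OptimizationSpecification()
spec.objectives = [
    Minimize('KPI_1'),
    Maximize('KPI_2')
]
# Constraint: keep Input_1 at or below 10
spec.constraints = [
    AtMost('Input_1', 10),
]
optimizer = create_optimizer()
optimization_result = optimizer.optimize(model, optimization_spec=spec)
print(optimization_result.optimum)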

class datastories.optimization.pso.Optimizer(size_t population_size=500, size_t iterations=250)

A model optimizer using the particle swarm strategy for identifying an optimum solution.

Args:

  • population_size (int = 500):
    the initial size of the swarm population.
  • iterations (int = 250):
    number of swarm computation iterations before stopping.
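
For readers unfamiliar with the particle swarm strategy, the following is a minimal textbook-style sketch in plain Python. It is not the datastories implementation: the function `particle_swarm_minimize`, the coefficient values and the test objective are assumptions chosen for illustration, although `population_size` and `iterations` play the same role as the arguments described above.

```python
import random


def particle_swarm_minimize(objective, bounds, population_size=30, iterations=100, seed=0):
    """Minimize `objective` over box `bounds` with a basic particle swarm."""
    rng = random.Random(seed)
    inertia, cognitive, social = 0.7, 1.5, 1.5  # common textbook coefficients
    dim = len(bounds)

    # Start each particle at a random position with zero velocity.
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(population_size)]
    velocities = [[0.0] * dim for _ in range(population_size)]
    best_positions = [p[:] for p in positions]
    best_values = [objective(p) for p in positions]
    g = min(range(population_size), key=best_values.__getitem__)
    swarm_best, swarm_best_value = best_positions[g][:], best_values[g]

    for _ in range(iterations):
        for i in range(population_size):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Blend momentum, pull to the particle's best and pull to the swarm's best.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + cognitive * r1 * (best_positions[i][d] - positions[i][d])
                    + social * r2 * (swarm_best[d] - positions[i][d])
                )
                # Move the particle and keep it inside the allowed range.
                positions[i][d] = min(hi, max(lo, positions[i][d] + velocities[i][d]))
            value = objective(positions[i])
            if value < best_values[i]:
                best_positions[i], best_values[i] = positions[i][:], value
                if value < swarm_best_value:
                    swarm_best, swarm_best_value = positions[i][:], value
    return swarm_best, swarm_best_value


def sphere(x):
    """Simple test objective with its minimum (0) at the origin."""
    return sum(v * v for v in x)


best, best_value = particle_swarm_minimize(sphere, bounds=[(-5.0, 5.0), (-5.0, 5.0)])
print(best, best_value)
```

Larger populations and more iterations generally explore the search space more thoroughly at the cost of more model evaluations.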
maximize(self, model, variable_ranges={}, progress_bar=True)

Run the optimizer with the goal of maximizing the outputs (i.e., KPIs) of a given model.

Args:
  • model (datastories.model.Model):
    The input model whose KPIs are to be maximized.
  • variable_ranges (dict [str, datastories.optimization.VariableRange] = {}):
    An optional dictionary mapping variable names to ranges that are to be used to limit the searching for the optimum solution to a given domain.
  • progress_bar (obj|bool=True):
    An object of type datastories.display.ProgressReporter, or a boolean to get a default implementation (i.e., True to display progress, False to show nothing).

Returns:

A datastories.optimization.OptimizationResult object encapsulating the model variable values that maximize the model outputs.
minimize(self, model, variable_ranges={}, progress_bar=True)

Run the optimizer with the goal of minimizing the outputs (i.e., KPIs) of a given model.

Args:
  • model (datastories.model.Model):
    The input model whose KPIs are to be minimized.
  • variable_ranges (dict [str, datastories.optimization.VariableRange] = {}):
    An optional dictionary mapping variable names to ranges that are to be used to limit the searching for the optimum solution to a given domain.
  • progress_bar (obj|bool=True):
    An object of type datastories.display.ProgressReporter, or a boolean to get a default implementation (i.e., True to display progress, False to show nothing).

Returns:

A datastories.optimization.OptimizationResult object encapsulating the model variable values that minimize the model outputs.
optimize(self, model, optimization_spec=OptimizationSpecification(), variable_ranges={}, direction=None, progress_bar=True)

Optimize an input model according to a given optimization specification.

Args:

  • model (datastories.model.Model):

    The input model to be optimized

  • optimization_spec (datastories.optimization.OptimizationSpecification):

    An optional specification for the optimization objectives and constraints. The default value is an empty specification (i.e., OptimizationSpecification())

  • variable_ranges (dict [str, datastories.optimization.VariableRange] = {}):

    An optional dictionary mapping variable names to ranges that are to be used to limit the searching for the optimum solution to a given domain.

  • direction (datastories.optimization.OptimizationDirection)

    The direction of optimization when no specification is provided. Can be one of:

    • OptimizationDirection.MAXIMIZE
    • OptimizationDirection.MINIMIZE
  • progress_bar (obj|bool=True):

    An object of type datastories.display.ProgressReporter, or a boolean to get a default implementation (i.e., True to display progress, False to show nothing).

Returns:

A datastories.optimization.OptimizationResult object encapsulating the model variable values that satisfy the optimization specification.

class datastories.optimization.OptimizerType

Enumeration for DataStories supported optimizer types.

Possible value:
  • OptimizerType.PARTICLE_SWARM

class datastories.optimization.OptimizationDirection

Enumeration for possible optimization goals when no other optimization specification is provided.

Possible values:
  • OptimizationDirection.MAXIMIZE
  • OptimizationDirection.MINIMIZE

class datastories.optimization.OptimizationSpecification(objectives=None, constraints=None)

Encapsulates a set of optimization objectives and constraints that can be used to configure an optimization analysis.

Both objectives and constraints are defined using datastories.optimization.VariableSpec and (potentially) datastories.optimization.VariableMapper objects.

Example:

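from datastories.optimization import (
    AtMost, InInterval, Minimize, OptimizationSpecification, Sum
)

spec = OptimizationSpecification()
spec.objectives = [
    Minimize('KPI_1', 2),            # objective with relative weight 2
    InInterval('KPI_2', 1, 100)
]
# Sum takes a list of operands; the second positional argument is the weights
spec.add_constraint(AtMost(Sum(['Input_1', 'Input_2']), 100))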
add_constraint(self, constraint)

Add an optimization constraint to the specification.

add_objective(self, objective)

Add an optimization objective to the specification.

constraints

Get/set the optimization specification constraints.

objectives

Get/set the optimization specification objectives.


class datastories.optimization.OptimizationResult

Encapsulates the result of a datastories.optimization.pso.Optimizer.optimize() analysis.

Note: Objects of this class should not be manually constructed.

is_complete

Check whether the search for the optimum has been interrupted before completion.

is_feasible

Check whether the identified optimum position respects the imposed constraints (if any).

optimum

Get the model variable values for the identified optimum position.

to_pandas(self)

Export the optimum position to a Pandas DataFrame object.

Returns:

The constructed Pandas DataFrame object.

class datastories.optimization.VariableRange(min=None, max=None, value=None)
Encapsulates a numeric or categorical value range.
  • Numeric ranges are defined by an upper and a lower bound.
  • Categorical ranges are currently limited to a single value.
Args:
  • min (double = None):
    a numeric range lower bound
  • max (double = None):
    a numeric range upper bound
  • value (str = None):
    a categorical range value
is_categorical

Check if the variable range is categorical.

is_numeric

Check if the variable range is numeric.

max

Get the upper bound of a numeric range.

min

Get the lower bound of a numeric range.

value

Get the value of a categorical range.


class datastories.optimization.VariableMapper

Base class for all variable mappers.

Variable mappers are the first parameter to be passed when defining optimization objectives and constraints. They indicate to what variable or group of variables the objective/constraint applies.

For simple cases (i.e., one variable), variable mappers can be replaced with the name of the variable itself. However, in more complex scenarios (e.g., a constraint that applies to the aggregated value of a number of variables), mappers have to be explicitly constructed.


class datastories.optimization.Sum(operands, weights=None)

Bases: datastories.optimization.specification.VariableMapper

Aggregates a number of variables using a weighted sum. This can be then used to define optimization objectives or constraints.

Args:

  • operands (list):
    a list of variable names to sum up.
  • weights (list = None):
    a list of relative weights for aggregating the given variables.
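
The aggregation performed by a weighted sum can be sketched in plain Python as follows. The function `weighted_sum`, the input values and the choice to apply weights as given (without normalization, and defaulting to 1 when omitted) are assumptions for illustration; the library's internal handling of relative weights may differ.

```python
def weighted_sum(values, operands, weights=None):
    """Aggregate the named variables with optional relative weights."""
    if weights is None:
        # Assumption: omitted weights mean every operand counts equally.
        weights = [1.0] * len(operands)
    if len(weights) != len(operands):
        raise ValueError("operands and weights must have the same length")
    return sum(w * values[name] for name, w in zip(operands, weights))


candidate = {"Input_1": 30.0, "Input_2": 50.0}  # hypothetical input values

total = weighted_sum(candidate, ["Input_1", "Input_2"])
weighted = weighted_sum(candidate, ["Input_1", "Input_2"], weights=[2.0, 0.5])

# The kind of check an "at most 100" constraint on the aggregated value would make.
print(total, total <= 100)
print(weighted)
```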

class datastories.optimization.VariableSpec

Base class for all optimization objectives and constraints.


class datastories.optimization.AtMost(operand, double limit, double weight=1.0)

Bases: datastories.optimization.specification.VariableSpec

Specifies an optimization objective or constraint by which a variable (or aggregation of variables) should be lower than a given reference value.

Args:

  • operand (obj):
    a variable mapper (datastories.optimization.specification.VariableMapper) indicating which variable(s) the objective/constraint applies to.
  • limit (double):
    the reference value to compare against.
  • weight (double = 1):
    the relative weight of this objective/constraint among all the specified objectives or constraints.

class datastories.optimization.AtLeast(operand, double limit, double weight=1.0)

Bases: datastories.optimization.specification.VariableSpec

Specifies an optimization objective or constraint by which a variable (or aggregation of variables) should be greater than a given reference value.

Args:

  • operand (obj):
    a variable mapper (datastories.optimization.specification.VariableMapper) indicating which variable(s) the objective/constraint applies to.
  • limit (double):
    the reference value to compare against.
  • weight (double = 1):
    the relative weight of this objective/constraint among all the specified objectives or constraints.

class datastories.optimization.InInterval(operand, double lower_limit, double upper_limit, double weight=1.0)

Bases: datastories.optimization.specification.VariableSpec

Specifies an optimization objective or constraint by which a variable (or aggregation of variables) should be in a given reference interval.

Args:

  • operand (obj):
    a variable mapper (datastories.optimization.specification.VariableMapper) indicating which variable(s) the objective/constraint applies to.
  • lower_limit (double):
    the lower bound of the reference interval.
  • upper_limit (double):
    the upper bound of the reference interval.
  • weight (double = 1):
    the relative weight of this objective/constraint among all the specified objectives or constraints.

class datastories.optimization.IsEqual(operand, double value, double weight=1.0)

Bases: datastories.optimization.specification.VariableSpec

Specifies an optimization objective by which a variable (or aggregation of variables) should be equal to a given reference value.

Note: This cannot be used to define optimization constraints. To achieve a similar effect when defining a constraint, one can use a combination of datastories.optimization.specification.AtMost and datastories.optimization.specification.AtLeast instead.

Args:

  • operand (obj):
    a variable mapper (datastories.optimization.specification.VariableMapper) indicating which variable(s) the objective applies to.
  • value (double):
    the reference value to compare against.
  • weight (double = 1):
    the relative weight of this objective/constraint among all the specified objectives or constraints.

class datastories.optimization.Minimize(operand, double weight=1.0)

Bases: datastories.optimization.specification.VariableSpec

Specifies an optimization objective by which a variable (or aggregation of variables) should have the smallest possible value.

Note: This cannot be used to define optimization constraints.

Args:

  • operand (obj):
    a variable mapper (datastories.optimization.specification.VariableMapper) indicating which variable(s) the objective applies to.
  • weight (double = 1):
    the relative weight of this objective/constraint among all the specified objectives or constraints.

class datastories.optimization.Maximize(operand, double weight=1.0)

Bases: datastories.optimization.specification.VariableSpec

Specifies an optimization objective by which a variable (or aggregation of variables) should have the largest possible value.

Note: This cannot be used to define optimization constraints.

Args:

  • operand (obj):
    a variable mapper (datastories.optimization.specification.VariableMapper) indicating which variable(s) the objective applies to.
  • weight (double = 1):
    the relative weight of this objective/constraint among all the specified objectives or constraints.

Visualization

Plots

The datastories.visualization package contains a collection of visualizations that facilitates the assessment of selected DataStories analysis results.


class datastories.visualization.ClassificationPlot(data, predicted_name, actual_name, prediction_performance=None, vis_settings=<datastories.visualization.classification_plot.ClassificationPlotSettings object>, *args, **kwargs)

Visual representation of binary classification (performance).

Note: Objects of this class should not be manually constructed.

One can display this visualization in an IPython Notebook by simply giving the name of an object of this class.

Attributes:
vis_settings (obj):
 an object of type datastories.visualization.ClassificationPlotSettings containing visualization settings. Set this object before displaying the visualization or exporting to HTML.
plot(*args, **kwargs)

Convenience function to set-up and display the visualization.

Accepts the same parameters as the constructor for datastories.visualization.ClassificationPlotSettings objects.


class datastories.visualization.ClassificationPlotSettings(x_axis=None, jitter=0.2, *args, **kwargs)

Encapsulates visualization settings for datastories.visualization.ClassificationPlot visualizations.

Args:
x_axis (str=None):
 Column to display on the X axis;
jitter (float=0.2):
 Amount of ‘jitter’ to add on the Y axis in order to minimize overlapping.
Attributes:
Same as the Args section above.
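
The purpose of the jitter setting can be shown with a small sketch: points that share the same class value are offset by a small random amount so that they no longer overlap. The function `add_jitter` is hypothetical, and interpreting the setting as the total width of a uniform offset band is an assumption; the visualization may scale jitter differently.

```python
import random


def add_jitter(values, jitter=0.2, seed=0):
    """Offset each value by a random amount in [-jitter / 2, +jitter / 2]."""
    rng = random.Random(seed)
    return [v + rng.uniform(-jitter / 2, jitter / 2) for v in values]


labels = [0, 0, 1, 1, 1]  # binary classes plotted on the Y axis
jittered = add_jitter(labels, jitter=0.2)
print(jittered)
```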

class datastories.visualization.ConfusionMatrix(prediction_performance, vis_settings=<datastories.visualization.confusion_matrix.ConfusionMatrixSettings object>, *args, **kwargs)

Encapsulates a visual representation of model accuracy for binary classification models.

Note: Objects of this class should not be manually constructed.

One can display this visualization in an IPython Notebook by simply giving the name of an object of this class.

Attributes:
vis_settings (obj):
 an object of type datastories.visualization.ConfusionMatrixSettings containing visualization settings. Set this object before displaying the visualization or exporting to HTML.
plot(*args, **kwargs)

Convenience function to set-up and display the visualization.

Accepts the same parameters as the constructor for datastories.visualization.ConfusionMatrixSettings objects.

to_html(file_name, title='Confusion Matrix', subtitle='')

Exports the visualization to a standalone HTML document.

Args:
file_name (str):
 name of the file to export to;
title (str=’Confusion Matrix’):
 HTML document title;
subtitle (str=’’):
 HTML document subtitle.

class datastories.visualization.ConfusionMatrixSettings(width=480, height=320)

Encapsulates visualization settings for datastories.visualization.ConfusionMatrix visualizations.

Args:
width (int=480):
 Graph width in pixels;
height (int=320):
 Graph height in pixels;
Attributes:
Same as the Args section above.

class datastories.visualization.CorrelationBrowser(vis_settings=<datastories.visualization.correlation_browser.CorrelationBrowserSettings object>, *args, **kwargs)

Encapsulates a visual representation of correlation between features.

Note: Objects of this class should not be manually constructed.

One can display this visualization in an IPython Notebook by simply giving the name of an object of this class.

Attributes:
vis_settings (obj):
 an object of type datastories.visualization.CorrelationBrowserSettings containing visualization settings. Set this object before displaying the visualization or exporting to HTML.
plot(*args, **kwargs)

Convenience function to set-up and display the visualization.

Accepts the same parameters as the constructor for datastories.visualization.CorrelationBrowserSettings objects.

to_html(file_name, title='Feature correlation browser', subtitle='')

Exports the visualization to a standalone HTML document.

Args:
file_name (str):
 name of the file to export to;
title (str=’Feature correlation browser’):
 HTML document title;
subtitle (str=’’):
 HTML document subtitle.

class datastories.visualization.CorrelationBrowserSettings(scale=1, node_opacity=0.9, edge_opacity=0.3, tension=0.65, font_size=15, filter_unconnected=False, min_weight=50, max_weight=100, weight_key='weightMI', show_controls=True)

Encapsulates visualization settings for datastories.visualization.CorrelationBrowser visualizations.

Args:
scale (float=1):
 Scale factor of the radius [0-1];
node_opacity (float=0.9):
 Opacity of the nodes that aren’t hovered or connected to hovered or selected nodes [0-1];
edge_opacity (float=0.3):
 Opacity of the edges that aren’t hovered or connected to hovered or selected nodes [0-1];
tension (float=0.65):
 The tension of the links. A tension of 0 means straight lines [0-1];
font_size (int=15):
 Font size used for the nodes of the plot [10-32];
filter_unconnected (boolean=False):
 Whether nodes that aren’t connected to any other node are filtered from the view;
min_weight (int=50):
 Minimum weight of the links that will be shown [0-100];
max_weight (int=100):
 Maximum weight of the links that will be shown [0-100];
weight_key (str=’weightMI’):
 Type of relations to display [‘weightMI’ for Mutual Information, ‘weightL’ for Linear Correlation];
Attributes:
Same as the Args section above.

class datastories.visualization.DataSummaryTable(summary, vis_settings=<datastories.visualization.data_summary_table.DataSummaryTableSettings object>, *args, **kwargs)

Encapsulates a visual representation of data frame summary.

Note: Objects of this class should not be manually constructed.

One can display this visualization in an IPython Notebook by simply giving the name of an object of this class.

Attributes:
vis_settings (obj):
 an object of type datastories.visualization.DataSummaryTableSettings containing visualization settings. Set this object before displaying the visualization or exporting to HTML.
plot(*args, **kwargs)

Convenience function to set-up and display the visualization.

Accepts the same parameters as the constructor for datastories.visualization.DataSummaryTableSettings objects.

to_html(file_name, title='Data Summary', subtitle='')

Exports the visualization to a standalone HTML document.

Args:
file_name (str):
 name of the file to export to;
title (str=’Data Summary’):
 HTML document title;
subtitle (str=’’):
 HTML document subtitle.

class datastories.visualization.DataSummaryTableSettings(page_size=25, show_console=True)

Encapsulates visualization settings for datastories.visualization.DataSummaryTable visualizations.

Args:
page_size (int=25):
 Maximum number of columns to display on one summary page;
Attributes:
Same as the Args section above.

class datastories.visualization.ErrorPlot(prediction_performance, vis_settings=<datastories.visualization.error_plot.ErrorPlotSettings object>, *args, **kwargs)

Encapsulates a visual representation of model accuracy for regression models.

Note: Objects of this class should not be manually constructed.

One can display this visualization in an IPython Notebook by simply giving the name of an object of this class.

Attributes:
vis_settings (obj):
 an object of type datastories.visualization.ErrorPlotSettings containing visualization settings. Set this object before displaying the visualization or exporting to HTML.
plot(*args, **kwargs)

Convenience function to set-up and display the visualization.

Accepts the same parameters as the constructor for datastories.visualization.ErrorPlotSettings objects.

to_html(file_name, title='Error Plot', subtitle='')

Exports the visualization to a standalone HTML document.

Args:
file_name (str):
 name of the file to export to;
title (str=’Error Plot’):
 HTML document title;
subtitle (str=’’):
 HTML document subtitle.

class datastories.visualization.ErrorPlotSettings(sort_key='id', lines=False, highlight_outliers=False, threshold=None, confidence=False, x_padding=0, y_padding=1, marker_size=32, hover_marker_size_delta=32, animations=500, margin_top=10, margin_right=20, margin_bottom=40, margin_left=60)

Encapsulates visualization settings for datastories.visualization.ErrorPlot visualizations.

Args:
sort_key (str=’id’):
 The sorting criterion for the X axis. Possible values: id - sort on record id; act - sort on record actual KPI value; pred - sort on record predicted value;
lines (bool=False):
 True if points should be connected by lines;
highlight_outliers (bool=False):
 True if outliers should be highlighted;
threshold (float=None):
 Threshold;
confidence (bool=False):
 True if confidence limits should be displayed;
x_padding (int=0):
 X padding;
y_padding (int=1):
 Y padding;
marker_size (int=32):
 Size of the point marker;
hover_marker_size_delta (int=32):
 Size of the point hover marker;
animations (int=500):
 Animation duration in milliseconds;
margin_top (int=10):
 Top margin;
margin_right (int=20):
 Right margin;
margin_bottom (int=40):
 Bottom margin;
margin_left (int=60):
 Left margin;
Attributes:
Same as the Args section above.

class datastories.visualization.OutlierXPlot(outliers_result, vis_settings=<datastories.visualization.outlier_plot.OutlierPlotSettings object>, *args, **kwargs)

Encapsulates a visual representation of outliers resulting from a one dimensional analysis.

Note: Objects of this class should not be manually constructed.

One can display this visualization in an IPython Notebook by simply giving the name of an object of this class.

Attributes:
vis_settings (obj):
 an object of type datastories.visualization.OutlierPlotSettings containing visualization settings. Set this object before displaying the visualization or exporting to HTML.
plot(*args, **kwargs)

Convenience function to set-up and display the visualization.

Accepts the same parameters as the constructor for datastories.visualization.OutlierPlotSettings objects.

to_html(file_name, title='Confusion Matrix', subtitle='')

Exports the visualization to a standalone HTML document.

Args:
file_name (str):
 name of the file to export to;
title (str=’Confusion Matrix’):
 HTML document title;
subtitle (str=’’):
 HTML document subtitle.

class datastories.visualization.OutlierPlotSettings(width=800, height=200, x_padding=0.2, y_padding=0.2, marker_size=32, hover_marker_size_delta=32, animations=500, show_jitter=True, show_cdf=True, show_iqr=True, show_summary=True, show_console=True, show_legend=True, low_threshold=0.05, high_threshold=0.95)

Encapsulates visualization settings for datastories.visualization.OutlierXPlot visualizations.

Args:
width (int=400):
 Graph width in pixels;
height (int=300):
 Graph height in pixels;
x_padding (float=0.2):
 X padding;
y_padding (float=0.2):
 ; Y padding;
marker_size (int=32):
 ; Size of the point marker
hover_marker_size_delta (int=32):
 Size of the point hover marker;
animations (int=500):
 Animation duration in milliseconds;
show_jitter (bool=True):
 adds some jitter to the vertical dimension, to better distinguish points;
show_cdf (bool=True):
 shows the cumulative distribution function;
show_iqr (bool=True):
 displays the inter-quartile range, as specified in the lower and higher threshold arguments;
show_summary (bool=True):
 displays the summary table
show_console (bool=True):
 displays the visualization console where update operations are logged
low_threshold (float=0.05):
 the lower threshold for the inter-quartile range;
high_threshold (float=0.95):
 the upper threshold for the inter-quartile range;
Attributes:
Same as the Args section above.
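
As an illustration of what the low_threshold and high_threshold settings describe, the following is a minimal sketch that treats them as quantile fractions and flags the points falling outside the resulting band. This interpretation is an assumption, and the helper names quantile and outside_band are hypothetical; the way the library itself computes the range is not documented here.

```python
import math


def quantile(values, q):
    """Return the q-th quantile (0 <= q <= 1) using linear interpolation."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be in the interval [0, 1]")
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lower = math.floor(pos)
    upper = math.ceil(pos)
    # Interpolate between the two neighbouring order statistics.
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def outside_band(values, low_threshold=0.05, high_threshold=0.95):
    """Return (low, high, flagged) where flagged lies outside [low, high]."""
    low = quantile(values, low_threshold)
    high = quantile(values, high_threshold)
    flagged = [v for v in values if v < low or v > high]
    return low, high, flagged


# Assumed sample data: the integers 0..100.
sample = list(range(101))
low, high, flagged = outside_band(sample)
```

With the default thresholds, only the smallest and largest values of the sample are flagged; widening the band (e.g., 0.01 and 0.99) flags fewer points.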

class datastories.visualization.PredictedVsActual(prediction_performance, vis_settings=<datastories.visualization.predicted_vs_actual.PredictedVsActualSettings object>, *args, **kwargs)

Encapsulates a visual representation of model accuracy for regression models.

Note: Objects of this class should not be manually constructed.

One can display this visualization in an IPython Notebook by simply giving the name of an object of this class.

Attributes:
vis_settings (obj):
 an object of type datastories.visualization.PredictedVsActualSettings containing visualization settings. Set this object before displaying the visualization or exporting to HTML.
plot(*args, **kwargs)

Convenience function to set-up and display the visualization.

Accepts the same parameters as the constructor for datastories.visualization.PredictedVsActualSettings objects.

to_html(file_name, title='Predicted vs Actual', subtitle='')

Exports the visualization to a standalone HTML document.

Args:
file_name (str):
 name of the file to export to;
title (str=’Predicted vs Actual’):
 HTML document title;
subtitle (str=’’):
 HTML document subtitle.

class datastories.visualization.PredictedVsActualSettings(width=400, highlight_outliers=True, threshold=0.5, x_padding=0.2, y_padding=0.2, marker_size=32, hover_marker_size_delta=32, animations=500)

Encapsulates visualization settings for datastories.visualization.PredictedVsActual visualizations.

Args:
width (int=400):
 Graph width in pixels;
highlight_outliers (bool=True):
 True if outliers should be highlighted;
threshold (float=0.5):
 Threshold;
x_padding (float=0.2):
 X padding;
y_padding (float=0.2):
 Y padding;
marker_size (int=32):
 Size of the point marker;
hover_marker_size_delta (int=32):
 Size of the point hover marker;
animations (int=500):
 Animation duration in milliseconds;
Attributes:
Same as the Args section above.

class datastories.visualization.WhatIfs(current_values=[], minimize_values=None, maximize_values=None, raw_model=None, vis_settings=<datastories.visualization.whatifs.WhatIfsSettings object>, *args, **kwargs)

Encapsulates a visual representation for exploring the influence of driver variables on target KPIs.

One can display this visualization in an IPython Notebook by simply giving the name of an object of this class.

Note: Objects of this class should not be manually constructed.

drivers

Retrieves the driver values

maximize()

Identify a set of driver values that maximize the KPI

minimize()

Identify a set of driver values that minimize the KPI

plot(*args, **kwargs)

Convenience function to set-up and display the visualization.

Accepts the same parameters as the constructor for datastories.visualization.WhatIfsSettings objects.

to_html(file_name, title='What-Ifs', subtitle='')

Exports the visualization to a standalone HTML document.

Args:
file_name (str):
 name of the file to export to;
title (str=’What-Ifs’):
 HTML document title;
subtitle (str=’’):
 HTML document subtitle.

datastories.visualization.what_ifs(model_path=None, current_values=[], minimize_values=None, maximize_values=None, raw_model=None)

Displays a WhatIf visualization in a Jupyter notebook based on an input RSX model file.

Args:
model_path (str=None):
 path to the input RSX model file; if None, the raw_model argument has to be provided.
current_values (list=[]):
 list of initial driver values;
minimize_values (list=None):
 driver values that minimize the KPI;
maximize_values (list=None):
 driver values that maximize the KPI;
raw_model (bytes=None):
 an optional bytes object, containing the source of the backing RSX model
Returns:

Example:

from datastories.visualization import what_ifs
what_ifs('my_model.rsx')

Utils

The datastories.display package contains a collection of display helpers.


class datastories.display.ProgressCounter

Base class implemented by all progress counters.

Attributes:
total (int): the number of steps required for completion
step (int): the current step
start_time (int):
 the start time in ns
stop_time (int):
 the stop time in ns
increment(steps=1)

Advances the progress with a number of steps.

Args:
steps (int): the number of steps to advance
start(total=1)

Initializes the progress range.

Args:
total (int): the number of steps required for completion
stop()

Stops progress monitoring.
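
To make the start / increment / stop life cycle concrete, here is a minimal sketch of a counter exposing the attributes and methods listed above. SimpleProgressCounter is a hypothetical stand-in written for illustration, not the library class; the use of time.perf_counter_ns as the clock and the clamping of step to total are assumptions.

```python
import time


class SimpleProgressCounter:
    """Hypothetical stand-in mirroring the documented counter interface."""

    def __init__(self):
        self.total = 0
        self.step = 0
        self.start_time = None  # set by start(), in ns
        self.stop_time = None   # set by stop(), in ns

    def start(self, total=1):
        """Initialize the progress range and record the start time."""
        self.total = total
        self.step = 0
        self.start_time = time.perf_counter_ns()
        self.stop_time = None

    def increment(self, steps=1):
        """Advance the progress, never going past total (assumed behaviour)."""
        self.step = min(self.step + steps, self.total)

    def stop(self):
        """Stop monitoring and record the stop time."""
        self.stop_time = time.perf_counter_ns()


counter = SimpleProgressCounter()
counter.start(total=4)
counter.increment()
counter.increment(2)
counter.stop()
elapsed_ns = counter.stop_time - counter.start_time
```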


class datastories.display.ProgressReporter

Abstract base class implemented by all progress reporters.

log(message)

Logs a progress message.

Args:
message (str): Progress message to log.
on_progress(progress)

Logs the completion percentage.

Args:
progress (float=None): Completion percentage to be logged.

datastories.display.get_progress_bar(progress_bar)

Retrieves a default implementation for a progress bar.

Args:
progress_bar (obj|bool=False):
 An object of type datastories.display.ProgressReporter, or a boolean to get a default implementation (i.e., True to display progress, False to show nothing). When a datastories.display.ProgressReporter object is provided, it is returned as is.
Returns:
An object of type datastories.display.ProgressReporter.
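
The object-or-boolean convention described above can be sketched as follows. This is an illustration of the dispatch logic only: resolve_reporter, ListReporter and SilentReporter are hypothetical names, and the actual default reporters returned by the library are not documented here.

```python
class ListReporter:
    """Hypothetical reporter that collects messages instead of displaying them."""

    def __init__(self):
        self.messages = []

    def log(self, message):
        self.messages.append(message)

    def on_progress(self, progress=None):
        if progress is not None:
            self.messages.append(f"{progress:.0f}%")


class SilentReporter:
    """Hypothetical reporter that discards everything."""

    def log(self, message):
        pass

    def on_progress(self, progress=None):
        pass


def resolve_reporter(progress_bar=False):
    """Map a reporter object or a boolean to a reporter object."""
    if isinstance(progress_bar, bool):
        # True selects a default reporter, False selects a silent one.
        return ListReporter() if progress_bar else SilentReporter()
    # Any non-boolean value is assumed to be a reporter and is returned as is.
    return progress_bar


custom = ListReporter()
same = resolve_reporter(custom)
same.log("loading data")
same.on_progress(50.0)
```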

datastories.display.wide_screen(width=0.95)

Make the notebook screen wider when running under Jupyter Notebook.

Args:
width (float=0.95):
 width of notebook as a fraction of the screen width. Should be in the interval [0,1].

datastories.display.init_graphics()

Initializes the DataStories graphics engine.

Use this function at the top of your notebooks when planning to save HTML copies of your work.


Regression

The datastories.regression package contains a collection of classes and functions to facilitate regression analysis.


class datastories.regression.RegressionError(value)

Exception generated when failing to execute regression analysis methods.


Story

The datastories.story package contains a collection of workflows to automate specific analysis tasks (e.g., building a predictive model).

Predict Single KPI

datastories.story.predict_single_kpi(data_frame, column_list, kpi, runs=3, outlier_elimination=True, prototypes='auto', progress_bar=True)

Fits a non-linear regression model on a data frame in order to predict one column.

The column to be predicted (i.e., the KPI) is to be identified either by name or by column index in the data frame.

Args:
data_frame (obj):
 the input data frame (either a pandas.core.frame.DataFrame or a datastories.data.DataFrame object);
column_list (list):
 the list of variables (i.e., columns) to consider for regression;
kpi (int|str):
 the index or the name of the target (i.e., KPI) column;
runs (int=3):
 the number of training rounds;
outlier_elimination (bool=True):
 set to True in order to exclude far outliers from modeling;
prototypes (str=’auto’):
 indicates whether analysis should be performed on prototypes. Possible values:
 'yes': use only prototypes as inputs;
 'no': use all original inputs;
 'auto': use prototypes if there are more than 200 input variables.
progress_bar (obj|bool=True):
 an object of type datastories.display.ProgressReporter, or a boolean to get a default implementation (i.e., True to display progress, False to show nothing).

Returns:
  • An object of type datastories.story.predict_single_kpi.Story encapsulating the analysis results.
Raises:
  • ValueError: when an invalid value is provided for one of the input parameters.
  • datastories.story.StoryError: if there is a problem fitting the model.

Example:

from datastories.story import predict_single_kpi
import pandas as pd
df = pd.read_csv('example.csv')
kpi_column_index = 1
story = predict_single_kpi(df, df.columns, kpi_column_index, progress_bar=True)
print(story)
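
The argument conventions above (a KPI given by index or by name, and the 'yes' / 'no' / 'auto' values for prototypes) can be checked before launching an analysis. The following is a minimal sketch under stated assumptions: resolve_kpi and use_prototypes are hypothetical helpers, not part of the library, and they only restate the rules documented in the Args section.

```python
def resolve_kpi(columns, kpi):
    """Return the KPI column name given either its index or its name."""
    columns = list(columns)
    if isinstance(kpi, int):
        if not 0 <= kpi < len(columns):
            raise ValueError(f"KPI index out of range: {kpi}")
        return columns[kpi]
    if kpi not in columns:
        raise ValueError(f"unknown KPI column: {kpi!r}")
    return kpi


def use_prototypes(mode, n_inputs, auto_limit=200):
    """Decide whether prototypes are used, following the documented rule."""
    if mode == 'yes':
        return True
    if mode == 'no':
        return False
    if mode == 'auto':
        # 'auto' switches to prototypes above 200 input variables.
        return n_inputs > auto_limit
    raise ValueError(f"invalid prototypes value: {mode!r}")


columns = ['temperature', 'yield', 'pressure']  # assumed example columns
kpi_by_index = resolve_kpi(columns, 1)
kpi_by_name = resolve_kpi(columns, 'yield')
```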

class datastories.story.predict_single_kpi.Story(platform, kpi_name, user_columns, nrows, folder='', *args, **kwargs)

Bases: datastories.api.interface.IPredictiveStory, datastories.story.predict_single_kpi.predict_single_kpi.StoryRun

Encapsulates the result of the datastories.story.predict_single_kpi() story.

Note: Objects of this class should not be manually constructed.

add_note(note)

Add an annotation to the story results.

The annotations already present can be retrieved using the datastories.api.IStory.notes property.

Args:
note (str): the annotation to be added.
assert_alive()

Triggers an exception if the object has been manually released.

clear_note(note_id)

Remove a specific annotation associated with the story analysis.

Args:
note_id (int): the index of the note to be removed.
Raises:
  • ValueError: if the note index is unknown.
clear_notes()

Clear the annotations associated with the story analysis.

correlation_browser

A visualization for assessing feature correlation.

An object of type datastories.visualization.CorrelationBrowser that can be used for assessing feature correlation, as discovered while training the model.

static load(file_name)

Loads a previously saved story.

Args:
file_name (str):
 the name of the source file.
Returns:
Raises:
  • datastories.story.StoryError: if there is a problem loading the story file (e.g., story version not compatible).
make_independent(base_folder='')

Makes the object independent by copying the required resources to its own folder.

Args:
base_folder (str=’’):
 the base folder for the unique object folder that will hold the required resources.
metrics

A dictionary containing the model performance metrics and the list of main drivers.

These metrics are computed on the training data for the purpose of evaluating the model prediction performance.

The following metrics are retrieved:
Training Set Size: size of the actual data frame used for training (rows x columns)
Correlation: actual vs predicted correlation
Estimated Correlation: estimated correlation for future (unseen) values
R-squared: the coefficient of determination
MSE: mean squared error
RMSE: root mean squared error
Main Drivers: list of main features with associated relative importance and energy
Features: list of all features with associated relative importance and energy
Computation Effort: a measure of model complexity
Number of Runs: number of training rounds
Best Run: best performing training round
Run Overview: overview of individual runs including Performance and Feature Importance
In case the KPI is a binary variable, the following additional metrics are included:
Positive Label: the label used to identify positive cases
Negative Label: the label used to identify negative cases
True Positives: number of correctly identified positive cases (TP)
False Positives: number of incorrectly identified positive cases (FP)
True Negatives: number of correctly identified negative cases (TN)
False Negatives: number of incorrectly identified negative cases (FN)
Not Classified: number of records that could not be classified (i.e., KPI is NaN)
True Positive Rate: TP / (TP + FN) * 100 (a.k.a. sensitivity, recall)
False Positive Rate: FP / (FP + TN) * 100 (a.k.a. fall-out)
True Negative Rate: TN / (FP + TN) * 100 (a.k.a. specificity)
False Negative Rate: FN / (TP + FN) * 100 (a.k.a. miss rate)
Precision: percentage of correctly identified cases from the total reported positive cases, TP / (TP + FP) * 100
Recall: percentage of correctly identified cases from the total existing positive cases, TP / (TP + FN) * 100
Accuracy: percentage of correctly identified cases, (TP + TN) / (TP + FP + TN + FN) * 100
F1 Score: the F1 score (the harmonic mean of precision and recall)
AUC: area under (ROC) curve
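
The binary classification formulas above can be reproduced from the four confusion counts. The sketch below is an independent illustration, not the library's code: classification_metrics is a hypothetical helper, the counts are made-up example values, and expressing the F1 score on the same 0 to 100 scale as precision and recall is an assumption.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute the percentage metrics listed above from confusion counts."""

    def pct(numerator, denominator):
        # Undefined ratios (empty denominator) are reported as NaN.
        return 100.0 * numerator / denominator if denominator else float('nan')

    precision = pct(tp, tp + fp)
    recall = pct(tp, tp + fn)
    if precision + recall == 0:
        f1 = 0.0
    else:
        # Harmonic mean of precision and recall.
        f1 = 2 * precision * recall / (precision + recall)
    return {
        'True Positive Rate': recall,
        'False Positive Rate': pct(fp, fp + tn),
        'True Negative Rate': pct(tn, fp + tn),
        'False Negative Rate': pct(fn, tp + fn),
        'Precision': precision,
        'Recall': recall,
        'Accuracy': pct(tp + tn, tp + fp + tn + fn),
        'F1 Score': f1,
    }


# Assumed example counts: 40 TP, 10 FP, 45 TN, 5 FN.
example = classification_metrics(tp=40, fp=10, tn=45, fn=5)
```

Note that the True Positive Rate and the False Negative Rate sum to 100, as do the False Positive Rate and the True Negative Rate, which is a quick sanity check on any reported values.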
model

Returns an object of type datastories.model.SingleKpiPredictor that can be used for making predictions on new data.

notes

A text representation of all annotations currently associated with the story analysis.

plot(*args, **kwargs)

Plots a graphical representation of the results in Jupyter Notebook.

release()

Releases the storage associated with the object.

Note: This function should only be used in order to force releasing allocated resources. Using the object after this point would lead to an exception.

run_overview

An overview of feature importance metrics across all runs.

runs

A list containing the results of individual analysis rounds.

Each entry in the list is an object of type datastories.story.predict_single_kpi.StoryRun encapsulating the results associated with a given analysis round.

save(file_name)

Saves the story analysis results.

Use this function to persist the results of the datastories.story.predict_single_kpi() analysis. One can reload them and continue investigations at a later moment using the datastories.story.predict_single_kpi.Story.load() method.

Args:
file_name (str):
 the name of the destination file.
to_csv(file_name, content='metrics', delimiter=', ', decimal='.')

Exports a list of model metrics to a CSV file.

Args:
file_name (str):
 name of the file to export to.
content (str=’metrics’):
 the type of metrics to export. Possible values:
 'metrics': exports estimated model performance metrics;
 'drivers': exports driver importance metrics;
 'run_overview': exports an overview of feature importance metrics across all runs.
delimiter (str=’,’):
 CSV delimiter
decimal (str=’.’):
 CSV decimal point
Raises:
  • ValueError: when an invalid value is provided for the content parameter.
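
Since each call exports a single content type, writing all three requires one call per value. The sketch below shows one way to do this with only the documented to_csv parameters. export_all is a hypothetical helper, and RecordingResult is a stand-in that records calls so the example runs without the library; with a real story object one would pass that object instead.

```python
def export_all(result, prefix, contents=('metrics', 'drivers', 'run_overview'),
               delimiter=',', decimal='.'):
    """Call result.to_csv once per content type and return the file names."""
    written = []
    for content in contents:
        file_name = f"{prefix}_{content}.csv"
        result.to_csv(file_name, content=content,
                      delimiter=delimiter, decimal=decimal)
        written.append(file_name)
    return written


class RecordingResult:
    """Stand-in for a story object: records to_csv calls, writes nothing."""

    def __init__(self):
        self.calls = []

    def to_csv(self, file_name, content='metrics', delimiter=', ', decimal='.'):
        if content not in ('metrics', 'drivers', 'run_overview'):
            raise ValueError(f"invalid content: {content!r}")
        self.calls.append((file_name, content, delimiter, decimal))


demo = RecordingResult()
# Semicolon delimiter with a decimal comma, as used in some locales.
files = export_all(demo, 'story', delimiter=';', decimal=',')
```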
to_excel(file_name)

Exports the list of model metrics to an Excel file.

Args:
file_name (str):
 name of the file to export to.
to_pandas(content='metrics')

Exports a list of model metrics to a pandas.core.frame.DataFrame object.

Args:
content (str=’metrics’):
 the type of metrics to export. Possible values:
 'metrics': exports estimated model performance metrics;
 'drivers': exports feature importance metrics for the model;
 'run_overview': exports an overview of feature importance metrics across all runs.
Returns:
  • The constructed pandas.core.frame.DataFrame object.
Raises:
  • ValueError: when an invalid value is provided for the content parameter.
what_ifs

A visualization for interactive exploration of the models.

An object of type datastories.visualization.WhatIfs that can be used for interactive exploration of the models. The visualization helps gain insight into how driver variables influence the target KPIs.


class datastories.story.predict_single_kpi.StoryRun(platform, kpi_name, nrows, *args, **kwargs)

Bases: datastories.api.interface.IAnalysisResult, datastories.core.utils.object_.StorageBackedObject

Encapsulates the result of one analysis round from the datastories.story.predict_single_kpi() story.

Note: Objects of this class should not be manually constructed.

correlation_browser

A visualization for assessing feature correlation.

An object of type datastories.visualization.CorrelationBrowser that can be used for assessing feature correlation, as discovered while training the model.

metrics

A dictionary containing the model performance metrics and the list of main drivers.

These metrics are computed on the training data for the purpose of evaluating the model prediction performance.

The following metrics are retrieved:
Training Set Size: size of the actual data frame used for training (rows x columns)
Correlation: actual vs predicted correlation
Estimated Correlation: estimated correlation for future (unseen) values
R-squared: the coefficient of determination
MSE: mean squared error
RMSE: root mean squared error
Main Drivers: list of main features with associated relative importance and energy
Features: list of all features with associated relative importance and energy
In case the KPI is a binary variable, the following additional metrics are included:
Positive Label: the label used to identify positive cases
Negative Label: the label used to identify negative cases
True Positives: number of correctly identified positive cases (TP)
False Positives: number of incorrectly identified positive cases (FP)
True Negatives: number of correctly identified negative cases (TN)
False Negatives: number of incorrectly identified negative cases (FN)
Not Classified: number of records that could not be classified (i.e., KPI is NaN)
True Positive Rate: TP / (TP + FN) * 100 (a.k.a. sensitivity, recall)
False Positive Rate: FP / (FP + TN) * 100 (a.k.a. fall-out)
True Negative Rate: TN / (FP + TN) * 100 (a.k.a. specificity)
False Negative Rate: FN / (TP + FN) * 100 (a.k.a. miss rate)
Precision: percentage of correctly identified cases from the total reported positive cases, TP / (TP + FP) * 100
Recall: percentage of correctly identified cases from the total existing positive cases, TP / (TP + FN) * 100
Accuracy: percentage of correctly identified cases, (TP + TN) / (TP + FP + TN + FN) * 100
F1 Score: the F1 score (the harmonic mean of precision and recall)
AUC: area under (ROC) curve
model

An object of type datastories.model.SingleKpiPredictor that can be used for making predictions on new data.

to_csv(file_name, content='metrics', delimiter=', ', decimal='.')

Exports a list of model drivers or metrics to a CSV file.

Args:
file_name (str):
 name of the file to export to.
content (str=’metrics’):
 the type of metrics to export. Possible values:
 'metrics': exports estimated model performance metrics;
 'drivers': exports driver importance metrics.
delimiter (str=’,’):
 CSV delimiter
decimal (str=’.’):
 CSV decimal point
Raises:
  • ValueError: when an invalid value is provided for the content parameter.
to_excel(file_name)

Exports the list of model drivers and metrics to an Excel file.

Args:
file_name (str):
 name of the file to export to.
to_pandas(content='metrics')

Exports a list of model drivers or metrics to a pandas.core.frame.DataFrame object.

Args:
content (str=’metrics’):
 the type of metrics to export. Possible values:
 'metrics': exports estimated model performance metrics;
 'drivers': exports driver importance metrics.
Returns:
  • The constructed pandas.core.frame.DataFrame object.
Raises:
  • ValueError: when an invalid value is provided for the content parameter.
what_ifs

A visualization for interactive exploration of the models.

An object of type datastories.visualization.WhatIfs that can be used for interactive exploration of the models. The visualization helps gain insight into how driver variables influence the target KPIs.


License

datastories.api.get_activation_info()

Get information required to create and activate a DataStories license.

Returns:
dict: a dictionary containing data to be submitted to the DataStories representative in charge of issuing the license.

The datastories.license package contains a collection of utility functions to facilitate license management.

These functions are available as methods of a predefined object of class datastories.license.LicenseManager called manager.

Example:

from datastories.license import manager
manager.initialize('my_license.lic')
manager

class datastories.license.LicenseManager(license_file_path=None)

Encapsulates the DataStories license manager.

The license manager enables users to inspect the details of their installed DataStories API license, and to use license keys that are not available in the standard installation locations (see Installation).

This class should not be instantiated directly. Instead one should use the already available object instance datastories.license.manager.

Args:
license_file_path (str = None):
 The path to a license key file or folder if other than the standard locations for the platform.
Attributes:
status (str): the status of the license manager initialization.
license (obj): the managed license as indicated in the license key file.

Example:

from datastories.license import manager
manager.initialize('my_license.lic')
manager
default_license_path

Default path used for license initialization if none provided

initialize(license_file_path=None)

Initialize the license manager with a license key at a specific location.

Args:
license_file_path (string):
 The path to a license key file or a folder containing the license key file.
Raises:
  • ValueError: when the provided license_file_path is not accessible.
is_granted(opt)

Checks if execution rights are granted for license protected functionality.

Args:
opt (str): the license option required by the protected functionality.
Returns:
  • (bool): True if execution rights are granted by the installed license.
is_ok()

Check the initialization status of the license manager.

The license manager initialization fails when no valid license file is found in the standard or user indicated locations.

Returns:
  • (bool): True if the license manager was successfully initialized.

Note: A successful license manager initialization does not guarantee a grant for using license protected functionality. For example, when an expired license is used, the initialization is still successful. To check whether execution rights are granted one should use the datastories.license.LicenseManager.is_granted() method.
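
The distinction drawn in the note, between a successful initialization and an actual grant, suggests checking both before calling protected functionality. The following is a minimal sketch of such a guard. ensure_granted is a hypothetical helper that relies only on the documented is_ok() and is_granted() methods; FakeManager is a stand-in so the example runs without a license, and the option name 'regression' is an assumed placeholder, not a documented license option.

```python
def ensure_granted(manager, opt):
    """Raise RuntimeError unless the manager is initialized and grants opt."""
    if not manager.is_ok():
        raise RuntimeError("license manager initialization failed")
    if not manager.is_granted(opt):
        # Initialization can succeed while rights are missing (e.g., expired license).
        raise RuntimeError(f"license option not granted: {opt}")


class FakeManager:
    """Stand-in exposing the two documented checks, for demonstration only."""

    def __init__(self, ok, granted_options):
        self._ok = ok
        self._granted = set(granted_options)

    def is_ok(self):
        return self._ok

    def is_granted(self, opt):
        return opt in self._granted


valid = FakeManager(ok=True, granted_options={'regression'})
expired = FakeManager(ok=True, granted_options=set())

ensure_granted(valid, 'regression')  # passes silently

try:
    ensure_granted(expired, 'regression')
    expired_rejected = False
except RuntimeError:
    expired_rejected = True
```

With the real library, the same guard would be called with the predefined datastories.license.manager object.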

reinitialize()

Re-initializes the license manager.

This is done using the same license file path as in the previous call to datastories.license.LicenseManager.initialize().

release()

Releases the currently held licenses.

This can be useful e.g., when using floating or counted licenses, as it makes the released licenses available for other clients or processes.

Note: once a license is released, the associated execution rights are retracted. In order to use the license protected functionality, users need to acquire the license, by initializing the license manager again (i.e., datastories.license.LicenseManager.initialize()).


class datastories.license.rlm.LicenseError(value)

Exception generated when accessing license protected functionality using an invalid license.