shared.common

common.py

This module contains shared variables and utility functions used across the Flask application for processing uploaded files.

Usage:

Import this module in both app.py and processing.py to access and modify the shared progress_data.

shared.common.addFieldNode(sf, l, cat, shapes, colors, calc)

Creates graph node objects for an input source field.

Parameters:

sf (str) – Input source field replacement ID.
l (str) – Input source field label.
cat (str) – Input source field category.
shapes (dict) – Dictionary of shapes keyed by source field category.
colors (dict) – Dictionary of colors keyed by source field category.
calc (str) – Input source field calculation expression.

Returns:

List of 2 node objects with:

name equal to the replacement ID,
label equal to the source field label,
shape/color/tooltip based on field type.

Return type:

list

shared.common.appendFieldsToDicts(l, k, v)

Append a fixed list of (key, value) to a list of dictionaries.

Parameters:

l (list) – Input list of dictionaries to update.
k (list) – List of fixed key names to append.
v (list) – List of fixed values to append corresponding to keys.

Returns:

Updated version of input list with new (key, value) pairs appended to each dictionary.

Return type:

list
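
A minimal sketch of the behavior described above, not the module's actual code. The name append_fields_to_dicts and the sample keys are hypothetical, and the sketch assumes the input dictionaries are updated in place.

```python
def append_fields_to_dicts(dicts, keys, values):
    """Add the same fixed (key, value) pairs to every dictionary in a list."""
    fixed = dict(zip(keys, values))  # pair each key with its value
    for d in dicts:
        d.update(fixed)  # assumption: dictionaries are mutated in place
    return dicts


rows = [{"field": "a"}, {"field": "b"}]
result = append_fields_to_dicts(rows, ["source", "level"], ["s1", 0])
```

After the call, every dictionary in rows carries the extra source and level entries.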

shared.common.backwardDependencies(df, f, level=0, c=None)

Recursively get all backward dependencies of a field.

Parameters:

df (pandas.DataFrame) – Input data frame.
f (str) – Source field replacement ID.
level (int, optional) – Dependency level (0 = root, -1 = level 1 backwards, etc.). Defaults to 0.
c (str, optional) – Originating child source field replacement ID. Defaults to None.

Returns:

List of all backward dependencies of the input source field.

Return type:

list

shared.common.fieldCalculationDependencies(l, x)

List direct dependencies in a calculation based on a list of possible values.

Parameters:

l (list) – A list of all possible values that can be matched.
x (str) – An input calculation string.

Returns:

A list of values from the list l that were matched in the string x.

Return type:

list
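
One plausible reading of "matched" is plain substring containment. The sketch below works under that assumption; the name calculation_dependencies and the sample IDs are hypothetical, and the real function may match differently.

```python
def calculation_dependencies(candidates, calculation):
    """Return the candidates that occur as substrings of the calculation."""
    return [c for c in candidates if c in calculation]


ids = ["[s1].[f1]", "[s1].[f2]", "[s2].[f1]"]
calc = "SUM([s1].[f1]) / COUNT([s2].[f1])"
deps = calculation_dependencies(ids, calc)
```

Here deps keeps the order of the candidate list and omits [s1].[f2], which does not appear in the calculation.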

shared.common.fieldCalculationMapping(c, s, d, l)

Replace all external and internal field references by unique source/field IDs.

Parameters:

c (str) – The source field calculation string.
s (str) – The source field name.
d (dict) – A dictionary mapping source fields to their replacement IDs.
l (list) – A list of unique field names.

Returns:

The calculation string without comments and with all field ID references replaced by their corresponding unique replacement ID references.

Return type:

str

Notes

This function assumes that external fields are referenced as [source ID].[field ID] and internal fields as [field ID]. If this is not the case, the function may return incorrect results.
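
The bracket convention in the note can be illustrated with a regular expression. This is a sketch under stated assumptions, not the module's implementation: the name map_calculation is hypothetical, comments are assumed to start with // and run to the end of the line, the mapping is assumed to be keyed by "[source ID].[field ID]", and doubled brackets inside names are not handled.

```python
import re

# External references look like [source].[field]; internal ones like [field].
REFERENCE = re.compile(r"(\[[^\[\]]+\]\.\[[^\[\]]+\])|(\[[^\[\]]+\])")


def map_calculation(calc, source, id_map, field_names):
    """Strip comments and swap field references for replacement IDs."""
    calc = re.sub(r"//[^\n]*", "", calc)  # assumption: // line comments

    def replace(match):
        external, internal = match.groups()
        if external:
            key = external
        elif internal in field_names:
            key = f"{source}.{internal}"  # qualify an internal reference
        else:
            return match.group(0)  # unknown name: leave untouched
        return id_map.get(key, match.group(0))

    return REFERENCE.sub(replace, calc).strip()


id_map = {"[s1].[f1]": "[id_a]", "[s2].[f9]": "[id_b]"}
mapped = map_calculation("[f1] + [s2].[f9] // ratio\n", "[s1]", id_map, ["[f1]"])
```

The alternation tries the two-part external form first, so an external reference is never mistaken for two internal ones.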

shared.common.fieldCategory(s, c)

Returns the category of a source field.

Parameters:

s (str) – Source field label.
c (str) – Source field cleaned calculation.

Returns:

Category of the source field, which can be:

"Parameter"
"Calculated Field (LOD)"
"Calculated Field"
"Field"

Return type:

str

shared.common.fieldIDMapping(x, s, d)

Replace IDs by labels for an input string or dict list.

Parameters:

x (str or list) – Input string or dict list.
s (str) – Source name.
d (dict) – Dictionary of (field/sheet) label -> ID mappings.

Returns:

String or dict list with all field IDs replaced by labels and references to internal source fields removed.

Return type:

str or list

shared.common.fieldMappingTable(df, colFrom, colTo)

Create a dictionary mapping the original values in one column to the mapped values in another.

Parameters:

df (pandas.DataFrame) – The input dataframe containing the field references.
colFrom (str) – The name of the column containing the original values.
colTo (str) – The name of the column containing the mapped values.

Returns:

A dictionary containing mappings from original values to mapped values, where each key is an original value and each value is its corresponding mapped value.

Return type:

dict

shared.common.fieldsFromCategory(l, c, f)

Return a list of fields of a given category from a list of dependency dictionaries.

Parameters:

l (list) – Input list of dependency dictionaries.
c (str) – Category type to filter fields.
f (bool) – Flag indicating backward (True) or forward (False) dependencies.

Returns:

List of unique field names corresponding to the specified category.

Return type:

list

shared.common.forwardDependencies(df, f, w, level=0, p=None)

Recursively get all forward dependencies of a field.

Parameters:

df (pandas.DataFrame) – Input data frame.
f (str) – Source field replacement ID.
w (list) – List of root source field worksheet ID dependencies.
level (int, optional) – Dependency level (0 = root, -1 = level 1 backwards, etc.). Defaults to 0.
p (str, optional) – Originating parent source field replacement ID. Defaults to None.

Returns:

List of all forward dependencies of the input source field.

Return type:

list

shared.common.getRandomReplacementBaseID(df, c, suffix='')

Generate a random ID of 10 lowercase letters combined with the dataframe index.

The generated ID is used as a base for a field identifier. It ensures that the new ID does not match any values already present in the specified column.

Parameters:

df (pandas.DataFrame) – The input dataframe from which unique values are checked.
c (str) – The column name in the dataframe to ensure none of the generated IDs conflict with existing values.
suffix (str, optional) – An optional fixed suffix to append to the randomly generated ID. Defaults to an empty string.

Returns:

A random ID consisting of 10 lowercase letters combined with the optional suffix. The ID is guaranteed not to match any existing values in the specified column.

Return type:

str
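
The retry-until-unique idea can be sketched with the standard library alone. The name random_base_id is hypothetical, and a plain list stands in for the values of column c in the dataframe; how the real function compares candidates against the column is not shown in this documentation.

```python
import random
import string


def random_base_id(existing, suffix="", length=10):
    """Draw random lowercase IDs until one is not already taken."""
    taken = set(existing)  # stand-in for the values of df[c]
    while True:
        letters = random.choices(string.ascii_lowercase, k=length)
        candidate = "".join(letters) + suffix
        if candidate not in taken:
            return candidate


existing = ["abcdefghij__0"]
new_id = random_base_id(existing, suffix="__0")
```

With 26 to the power of 10 possible bases, a collision is rare, so the loop almost always returns on its first pass.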

shared.common.isParamDuplicate(p, s, x)

Checks if a field is a parameter duplicate.

Parameters:

p (list) – List of parameter fields.
s (str) – Source name.
x (str) – Field name.

Returns:

True if the field is a parameter duplicate and should be removed; False otherwise.

Return type:

bool

shared.common.maxDependencyLevel(l)

Return maximum forward or backward dependency level of a given input list.

Parameters:

l (list) – Input list of dependency dictionaries.

Returns:

Maximum dependency level for the given dictionary list.

Return type:

int
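
A sketch of the lookup, assuming each dependency dictionary stores its level under a key named "level" (a hypothetical name) and that an empty list yields 0. The example uses non-negative forward levels only; how the real function treats the negative levels of backward dependencies is not specified here.

```python
def max_dependency_level(deps, key="level"):
    """Return the largest level found in a list of dependency dictionaries."""
    return max((d[key] for d in deps), default=0)


forward = [{"field": "a", "level": 1}, {"field": "b", "level": 3}]
deepest = max_dependency_level(forward)
```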

shared.common.processCaptions(i, c)

Process captions into a format suitable for calculations, removing invalid characters for JSON parsing.

Parameters:

i (str) – The source or field ID value.
c (str) – The source or field caption value.

Returns:

A tuple containing:

str: The original field name enclosed in brackets.
str: The processed caption enclosed in square brackets, with any additional right square brackets doubled. Single and double quotes are replaced by their HTML codes, and each backslash (\) is replaced by two backslashes (\\).

Return type:

tuple
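
The escaping rules can be sketched as a chain of string replacements. The name process_caption is hypothetical, the ID is assumed to arrive without brackets, and &apos; and &quot; are assumed stand-ins for the HTML codes, since the exact codes are not legible in this documentation.

```python
def process_caption(field_id, caption):
    """Return the bracketed ID and an escaped, bracketed caption."""
    cleaned = (
        caption.replace("\\", "\\\\")  # one backslash becomes two
        .replace("'", "&apos;")        # assumed HTML code for a single quote
        .replace('"', "&quot;")        # assumed HTML code for a double quote
        .replace("]", "]]")            # double any right square bracket
    )
    return f"[{field_id}]", f"[{cleaned}]"


name, label = process_caption("Calculation_1", 'Sales [net] "2024"')
```

Doubling the inner right bracket keeps the closing bracket of the caption unambiguous when the caption is later parsed as a field reference.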

shared.common.processSheetNames(s)

Remove invalid characters from sheet names for JSON parsing.

Parameters:

s (list) – A list of input sheet names.

Returns:

A processed list of sheet names with single and double quotes replaced by their HTML codes and each backslash (\) replaced by two backslashes (\\).

Return type:

list

shared.common.removeDuplicatesByRowLength(df, x)

Remove duplicates from a DataFrame by retaining the row with the largest concatenated string length per grouping.

Parameters:

df (pandas.DataFrame) – The input DataFrame.
x (str) – The name of the column to group by.

Returns:

A tuple containing:

pandas.DataFrame: A copy of the input DataFrame with, for each unique grouping value, the row with the largest concatenated string length.
int: The number of duplicates removed.

Return type:

tuple

shared.common.sheetMapping(s, d)

Replace all sheet names with sequential sheet IDs.

Parameters:

s (list) – A list of sheet names.
d (dict) – A dictionary mapping sheet names to their corresponding sheet IDs.

Returns:

A list of mapped sheet IDs corresponding to the input sheet names.

Return type:

list

shared.common.sheetMappingTable(df, colFrom)

Create a dictionary mapping sheet names to sheet IDs.

Parameters:

df (pandas.DataFrame) – The input dataframe containing the sheet lists.
colFrom (str) – The name of the column containing the sheet names.

Returns:

A dictionary mapping each unique sheet name to its corresponding sheet ID, where each key is a sheet name and each value is a generated sheet ID.

Return type:

dict
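
To show how the mapping table and sheetMapping could fit together, here is a small sketch. The names build_sheet_ids and map_sheets and the "sheet_<n>" ID format are hypothetical, and a list of lists stands in for the dataframe column of sheet lists.

```python
def build_sheet_ids(sheet_lists):
    """Assign a sequential ID to each unique sheet name, in first-seen order."""
    table = {}
    for sheets in sheet_lists:
        for name in sheets:
            if name not in table:
                table[name] = f"sheet_{len(table)}"  # hypothetical ID format
    return table


def map_sheets(names, table):
    """Translate a list of sheet names into their IDs."""
    return [table[name] for name in names]


table = build_sheet_ids([["Overview", "Detail"], ["Detail", "Map"]])
mapped_ids = map_sheets(["Map", "Overview"], table)
```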

shared.common.show_exception_and_exit(exc_type, exc_value, tb)

Keeps the application alive when an unhandled exception occurs.

Source: https://stackoverflow.com/questions/779675/stop-python-from-closing-on-error
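
A handler with this signature is normally installed through sys.excepthook. The sketch below is an assumption about how that wiring might look, not the module's code: the name show_exception_and_pause is hypothetical, and the pause is only attempted when standard input is an interactive terminal.

```python
import sys
import traceback


def show_exception_and_pause(exc_type, exc_value, tb):
    """Print the traceback, then wait so a console window stays open."""
    traceback.print_exception(exc_type, exc_value, tb)
    if sys.stdin is not None and sys.stdin.isatty():
        input("Press Enter to exit.")  # skipped when no terminal is attached


# Route every unhandled exception through the handler.
sys.excepthook = show_exception_and_pause
```

The hook only runs for exceptions that reach the top level of the interpreter; exceptions caught by a try block never get there.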

shared.common.uniqueDependencies(d, g, f)

Keep unique dependencies from a list of dependencies with their minimum dependency level.

Parameters:

d (list) – Input list of dependency dictionaries.
g (list) – Grouping list used to determine unique dependencies.
f (str) – Field name representing the dependency level.

Returns:

Dependency dictionaries that only contain unique dependencies with their minimum dependency level.

Return type:

list
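
The grouping step can be sketched with a dictionary keyed by the grouping values. The name unique_dependencies and the sample keys are hypothetical, and the example uses non-negative levels only, since the treatment of negative backward levels is not documented here.

```python
def unique_dependencies(deps, group_keys, level_key):
    """Keep, for each group, the dictionary with the smallest level."""
    best = {}
    for dep in deps:
        group = tuple(dep[k] for k in group_keys)
        if group not in best or dep[level_key] < best[group][level_key]:
            best[group] = dep
    return list(best.values())


deps = [
    {"field": "a", "sheet": "s1", "level": 2},
    {"field": "a", "sheet": "s1", "level": 1},
    {"field": "b", "sheet": "s1", "level": 3},
]
unique = unique_dependencies(deps, ["field", "sheet"], "level")
```

Groups come back in the order they were first seen, because Python dictionaries preserve insertion order.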

shared.common.visualizeFieldDependencies(df, sf, l, g, din, svg=False)

Creates output PNG/SVG files containing all dependencies for a given source field.

Parameters:

df (pandas.DataFrame) – Input data frame containing backward and forward dependencies.
sf (str) – Input source field replacement ID.
l (str) – Input source field label.
g (Graph) – Master graph containing all source field and field node objects.
din (str) – Full path to root directory where graphs will be saved.
svg (bool, optional) – Indicator (True/False) whether or not to generate SVG as well. Defaults to False.

Returns:

PNG file is saved in “<workbook path> Files\Graphs\<source field name>.png”, plus an additional SVG file (with extra attributes) if svg is True.

Return type:

None

shared.common.visualizeSheetDependencies(df, sh, g, din, svg=False)

Create output PNG/SVG files containing all dependencies for a given sheet.

Parameters:

df (pandas.DataFrame) – Input data frame containing backward and forward dependencies.
sh (str) – Input sheet ID for which dependencies are visualized.
g (Graph) – Master graph containing all source field and field node objects.
din (str) – Full path to the root directory where graphs will be saved.
svg (bool, optional) – Indicator (True/False) to generate SVG as well. Defaults to False.

Returns:

PNG file is saved in “<workbook path> Files\Graphs\Sheets\<sheet name>.png”, plus an additional SVG file (with extra attributes) if svg is True.

Return type:

None

shared.common.zip_folder(folder_path, output_zip_path)

Zip the contents of a folder, preserving the folder structure.

Parameters:

folder_path (str) – The path to the folder to zip.
output_zip_path (str) – The path where the output zip file will be created.

Returns:

The function creates a zip file at the specified output path.

Return type:

None
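
Preserving the folder structure usually comes down to storing each file under its path relative to the folder root. The sketch below shows that approach with the standard library; the name zip_folder_sketch is hypothetical, and the real function may differ, for example in how it treats empty directories, which this sketch skips.

```python
import os
import zipfile


def zip_folder_sketch(folder_path, output_zip_path):
    """Write every file under folder_path into a zip, keeping relative paths."""
    with zipfile.ZipFile(output_zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(folder_path):
            for name in files:
                full_path = os.path.join(root, name)
                # Archive name is relative to the folder being zipped.
                zf.write(full_path, os.path.relpath(full_path, folder_path))
```

The output path should lie outside folder_path; otherwise the walk would pick up the partially written archive itself.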