Function parseNormalizationStats(const std::string&, int, const std::vector<std::string>&)
Defined in File checkFun.h
Function Documentation
std::tuple<std::vector<int>, bool, std::vector<double>, std::vector<double>> parseNormalizationStats(const std::string &normalizationFile, int nbAttributes, const std::vector<std::string> &attributes = std::vector<std::string>())
Parses a normalization file to extract statistical data.
This function reads a normalization file and extracts the attribute indices, whether the mean or the median was used for normalization, the mean or median values, and the standard deviation values. It handles files that identify attributes either by numeric index or by attribute name. It also checks that the file uses mean or median consistently and detects duplicate indices.
- Parameters:
normalizationFile – The path to the normalization file to be parsed.
nbAttributes – The number of attributes expected in the file.
attributes – Optional list of attribute names. If provided, the function will parse the file based on attribute names instead of numeric indices.
- Throws:
FileContentError – If there is a mismatch in the number of attributes, or if the file format is incorrect.
FileNotFoundError – If the normalization file cannot be opened or found.
- Returns:
A tuple containing four elements in the following order:
A vector of attribute indices (int).
A boolean flag indicating whether the file uses ‘median’ (true) or ‘mean’ (false).
A vector of mean or median values (double) extracted from the file.
A vector of standard deviation values (double) extracted from the file.
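The returned tuple can be unpacked with std::get, as in the minimal sketch below. The sketch makes several assumptions: the parseNormalizationStats defined in it is a hypothetical stand-in that only mirrors the documented signature and returns fixed values without reading any file (in real code, include checkFun.h instead), the file name normalization.txt is a placeholder, the helper describeStats is not part of the library, and the three returned vectors are assumed to have the same length.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>
#include <vector>

using NormalizationStats =
    std::tuple<std::vector<int>, bool, std::vector<double>, std::vector<double>>;

// Hypothetical stand-in with the documented signature so that this sketch
// compiles on its own. It ignores its arguments and returns fixed values;
// the real function is declared in checkFun.h and parses the file.
NormalizationStats parseNormalizationStats(
    const std::string &normalizationFile, int nbAttributes,
    const std::vector<std::string> &attributes = std::vector<std::string>()) {
  (void)normalizationFile;
  (void)nbAttributes;
  (void)attributes;
  return std::make_tuple(std::vector<int>{0, 2}, false,
                         std::vector<double>{1.5, 10.0},
                         std::vector<double>{0.5, 2.0});
}

// Hypothetical helper (not part of the library): builds one text line per
// normalized attribute from the four tuple elements.
std::string describeStats(const NormalizationStats &stats) {
  const std::vector<int> &indices = std::get<0>(stats);
  const bool usesMedian = std::get<1>(stats);  // true: median, false: mean
  const std::vector<double> &centers = std::get<2>(stats);
  const std::vector<double> &stdDevs = std::get<3>(stats);

  // Assumption: one center value and one standard deviation per index.
  assert(centers.size() == indices.size());
  assert(stdDevs.size() == indices.size());

  const std::string label = usesMedian ? "median" : "mean";
  std::string out;
  for (std::size_t i = 0; i < indices.size(); ++i) {
    out += "attribute " + std::to_string(indices[i]) + ": " + label + "=" +
           std::to_string(centers[i]) + ", std=" + std::to_string(stdDevs[i]) +
           "\n";
  }
  return out;
}
```

A call such as describeStats(parseNormalizationStats("normalization.txt", 3)) then yields one line per attribute index. With the real function, the call may throw FileContentError or FileNotFoundError as documented above, so callers would typically wrap it in a try block.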