FidexGlo¶
Description¶
The goal of FidexGlo is to explain the model's decision for each sample by generating one or more explanation rules. It searches through the global ruleset generated by FidexGloRules. If no suitable rule is found and Fidex is enabled (see Use FidexGlo with Fidex), the algorithm calls Fidex to generate a local rule for the sample. The explanations provided by FidexGlo are in the form of activated rules, highlighting both the correct decision class (matching the model's decision) and incorrect decisions (where the attributes match, but the class differs from the model's decision).
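The lookup described above can be pictured with the following minimal sketch. It is not the FidexGlo implementation: the rule representation, the `<` operator, the function names and the `local_fallback` callback are all assumptions made for illustration, and the fallback is triggered only when no global rule is activated at all.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# Hypothetical rule model: (antecedents, predicted_class), where each
# antecedent is (attribute_index, operator, threshold) and operator is
# ">=" or "<" (the "<" form is an assumption).
Antecedent = Tuple[int, str, float]
Rule = Tuple[List[Antecedent], int]
LocalRuleGenerator = Callable[[Sequence[float], int], List[Rule]]


def is_activated(rule: Rule, sample: Sequence[float]) -> bool:
    """A rule is activated when the sample satisfies every antecedent."""
    antecedents, _ = rule
    return all(
        (sample[index] >= threshold) if operator == ">=" else (sample[index] < threshold)
        for index, operator, threshold in antecedents
    )


def explain_sample(
    sample: Sequence[float],
    model_class: int,
    global_rules: List[Rule],
    local_fallback: Optional[LocalRuleGenerator] = None,
) -> Dict[str, List[Rule]]:
    """Split activated global rules into correct and incorrect ones.

    When no global rule is activated and a fallback is supplied, the
    fallback stands in for a call to Fidex that produces local rules.
    """
    activated = [rule for rule in global_rules if is_activated(rule, sample)]
    correct = [rule for rule in activated if rule[1] == model_class]
    incorrect = [rule for rule in activated if rule[1] != model_class]
    local: List[Rule] = []
    if not activated and local_fallback is not None:
        local = local_fallback(sample, model_class)
    return {"correct": correct, "incorrect": incorrect, "local": local}
```

A rule whose class matches the model's decision lands in `correct`; a rule that is activated but predicts another class lands in `incorrect`.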
Arguments list¶
The FidexGlo algorithm works with both required and optional arguments. Each argument has specific properties:
- Is required indicates whether an argument must be specified when calling the program.
- Type specifies the argument datatype.
- CLI argument syntax is the exact name to use if you are writing the argument along with the program call.
- JSON identifier is the exact name to use if you are writing the argument inside a JSON configuration file.
- Default value is the value that will be used by the program if the argument is not specified. If it is None, either the argument is not used at all during the algorithm execution or you have to specify it yourself.
Show help¶
Displays the parameters and other helpful information about the program usage, then terminates the program.
Property | Value |
---|---|
Is required | No |
Type | None |
CLI argument syntax | -h , --help or None |
JSON identifier | N/A |
Default value | None |
Warning
If you use this argument, it must be the only one specified. No other argument can be specified with it.
JSON configuration file¶
File containing the configuration for the algorithm in JSON format (see more about JSON configuration files).
Property | Value |
---|---|
Is required | No |
Type | String |
CLI argument syntax | --json-configuration-file |
JSON identifier | N/A |
Default value | None |
Warning
If you use this argument, it must be the only one specified. No other argument can be specified with it.
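As an illustration, the sketch below writes such a configuration file from Python. It assumes the file is a flat JSON object keyed by the JSON identifiers listed on this page; the file name `fidexglo_config.json` and the temporary directory are hypothetical, and the values are taken from the usage example of this page.

```python
import json
import os
import tempfile

# Keys are the JSON identifiers documented on this page (assumed to be
# the top-level keys of a flat JSON object).
config = {
    "root_folder": "dimlp/datafiles",
    "test_data_file": "test_data.txt",
    "test_pred_file": "predTest.out",
    "global_rules_file": "globalRules.rls",
    "nb_attributes": 16,
    "nb_classes": 2,
    "explanation_file": "explanation.txt",
}

# Hypothetical location for the configuration file.
config_path = os.path.join(tempfile.mkdtemp(), "fidexglo_config.json")
with open(config_path, "w", encoding="utf-8") as handle:
    json.dump(config, handle, indent=4)

# Read the file back to confirm it is valid JSON.
with open(config_path, encoding="utf-8") as handle:
    reloaded = json.load(handle)
```

The resulting path would then be passed as the single argument of the program, using the CLI argument syntax given in the table above.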
Root folder path¶
Base path from which all other file path arguments are resolved. Using it allows you to work with paths relative to this location and avoid writing absolute paths or lengthy relative paths.
Property | Value |
---|---|
Is required | No |
Type | String |
CLI argument syntax | --root_folder |
JSON identifier | root_folder |
Default value | . |
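The effect of the root folder can be pictured as a simple path join. The helper below is hypothetical and only illustrates the idea; the program's actual resolution logic may differ.

```python
import os


def resolve_path(root_folder: str, file_argument: str) -> str:
    """Join a file argument onto the root folder (hypothetical helper)."""
    return os.path.normpath(os.path.join(root_folder, file_argument))


# With the default root folder ".", the file argument is used as given.
default_case = resolve_path(".", "test_data.txt")

# With an explicit root folder, the file is looked up inside it.
explicit_case = resolve_path("dimlp/datafiles", "test_data.txt")
```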
Test data file¶
Path to the file containing the test sample(s) data. It can also contain predictions, as well as true classes if Fidex is used.
Property | Value |
---|---|
Is required | Yes |
Type | String |
CLI argument syntax | --test_data_file |
JSON identifier | test_data_file |
Default value | None |
Global Rules file¶
Path to the file containing the global rules obtained with the fidexGloRules algorithm.
Property | Value |
---|---|
Is required | Yes |
Type | String |
CLI argument syntax | --global_rules_file |
JSON identifier | global_rules_file |
Default value | None |
Number of attributes¶
Number of attributes in the dataset (should be equal to the number of inputs of the model). Takes values in the range [1,∞[.
Property | Value |
---|---|
Is required | Yes |
Type | Integer |
CLI argument syntax | --nb_attributes |
JSON identifier | nb_attributes |
Default value | None |
Number of classes¶
Number of classes in the dataset (should be equal to the number of outputs of the model). Takes values in the range [2,∞[.
Property | Value |
---|---|
Is required | Yes |
Type | Integer |
CLI argument syntax | --nb_classes |
JSON identifier | nb_classes |
Default value | None |
Test prediction file¶
Path to the file containing predictions on the test portion of the dataset. If it is used, the test data file must only contain the test data.
Property | Value |
---|---|
Is required | No |
Type | String |
CLI argument syntax | --test_pred_file |
JSON identifier | test_pred_file |
Default value | None |
Note
The test data file can hold the predictions too. This means that it is possible to merge the content of the test prediction file into the test data file instead of using this parameter.
Explanation file¶
Path to the file where explanation(s), consisting of one or more explaining rules, will be stored for each test sample.
Property | Value |
---|---|
Is required | No |
Type | String |
CLI argument syntax | --explanation_file |
JSON identifier | explanation_file |
Default value | None |
Attributes file¶
File containing the names of the attributes (inputs) and classes (outputs).
Property | Value |
---|---|
Is required | No |
Type | String |
CLI argument syntax | --attributes_file |
JSON identifier | attributes_file |
Default value | None |
Logs output file¶
Name of the file where all the feedback produced by the algorithm during its execution is written. If not specified, the feedback is displayed in the terminal.
Property | Value |
---|---|
Is required | No |
Type | String |
CLI argument syntax | --console_file |
JSON identifier | console_file |
Default value | None |
Using the minimal version¶
Whether to use the minimal version, which only gets correct activated rules. If Fidex is used, it launches Fidex when no such rule is found.
Property | Value |
---|---|
Is required | No |
Type | Boolean |
CLI argument syntax | --with_minimal_version |
JSON identifier | with_minimal_version |
Default value | False |
Use FidexGlo with Fidex¶
Whether to call Fidex while executing the FidexGlo algorithm when no rule in the global ruleset can explain a sample.
Property | Value |
---|---|
Is required | No |
Type | Boolean |
CLI argument syntax | --with_fidex |
JSON identifier | with_fidex |
Default value | False |
Note
If this parameter is set to True, another set of parameters must be specified too.
If Fidex is used¶
Note
This section only applies if the parameter "Use FidexGlo with Fidex" (with_fidex) is set to True.
Train data file¶
File containing the training portion of the dataset used to train the model from which the ruleset/weights originate. It can also contain the training true classes (see Train true classes file).
Property | Value |
---|---|
Is required | Yes |
Type | String |
CLI argument syntax | --train_data_file |
JSON identifier | train_data_file |
Default value | None |
Train predictions file¶
File containing the predictions from the training portion of the dataset used to train the model.
Property | Value |
---|---|
Is required | No |
Type | String |
CLI argument syntax | --train_pred_file |
JSON identifier | train_pred_file |
Default value | None |
Train true classes file¶
File containing "true classes" (expected predictions), from the training portion of the dataset used to train the model.
Property | Value |
---|---|
Is required | No** |
Type | String |
CLI argument syntax | --train_class_file |
JSON identifier | train_class_file |
Default value | None |
Note
This argument is not required if, and only if, the true classes are already specified inside the train data file.
Weights file¶
File containing the trained weights of the model.
Property | Value |
---|---|
Is required | No** |
Type | String |
CLI argument syntax | --weights_file |
JSON identifier | weights_file |
Default value | None |
Note
This argument is not required if, and only if, the rules file is specified instead.
Rules file¶
File containing the trained rules of the model.
Property | Value |
---|---|
Is required | No** |
Type | String |
CLI argument syntax | --rules_file |
JSON identifier | rules_file |
Default value | None |
Note
This argument is not required if, and only if, the weights file is specified instead.
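The conditional requirements stated in the notes above can be summarized with a small consistency check. This is only a sketch: `check_fidex_arguments` and its `train_data_has_classes` flag are hypothetical names, the argument dictionary is assumed to be keyed by the JSON identifiers of this page, and the case where both a weights file and a rules file are given is left unflagged because this page does not describe it.

```python
from typing import Dict, List


def check_fidex_arguments(
    args: Dict[str, object], train_data_has_classes: bool = False
) -> List[str]:
    """Return the list of missing arguments when Fidex is enabled."""
    problems: List[str] = []
    if not args.get("with_fidex", False):
        # The Fidex-specific arguments only matter when with_fidex is true.
        return problems
    if not args.get("train_data_file"):
        problems.append("train_data_file is required")
    if not args.get("train_class_file") and not train_data_has_classes:
        problems.append(
            "train_class_file is required unless the train data file holds the true classes"
        )
    if not args.get("weights_file") and not args.get("rules_file"):
        problems.append("either weights_file or rules_file is required")
    return problems


# Arguments taken from the usage example of this page.
example_problems = check_fidex_arguments(
    {
        "with_fidex": True,
        "train_data_file": "train_data.txt",
        "train_class_file": "train_class.txt",
        "weights_file": "weights.wts",
    }
)
```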
Test true classes file¶
File containing "true classes" (expected predictions) from the test portion of the dataset.
Property | Value |
---|---|
Is required | No |
Type | String |
CLI argument syntax | --test_class_file |
JSON identifier | test_class_file |
Default value | None |
Note
The true classes can also be specified inside the test data file. This means it is possible to merge classes into the test data file instead of using this parameter.
Maximum number of iterations¶
Maximum number of Fidex iterations allowed. Also the maximum possible number of antecedents in a rule. Takes values in the range [1,∞[.
Property | Value |
---|---|
Is required | No |
Type | Integer |
CLI argument syntax | --max_iterations |
JSON identifier | max_iterations |
Default value | 10 |
Tip
If you're working with images, we recommend setting this argument to 25.
Minimum covering¶
Minimum number of samples covered by every generated rule. Takes values in the range [1,∞[.
Property | Value |
---|---|
Is required | No |
Type | Integer |
CLI argument syntax | --min_covering |
JSON identifier | min_covering |
Default value | 2 |
Use dichotomic search¶
Whether or not the algorithm uses a dichotomic strategy to compute a rule. This occurs when the algorithm fails to find a rule with the minimum covering value used.
Property | Value |
---|---|
Is required | No |
Type | Boolean |
CLI argument syntax | --covering_strategy |
JSON identifier | covering_strategy |
Default value | True |
Maximum number of failed attempts¶
Number of attempts allowed to compute a rule that could not be found by the algorithm. Takes values in the range [0,∞[.
Property | Value |
---|---|
Is required | No |
Type | Integer |
CLI argument syntax | --max_failed_attempts |
JSON identifier | max_failed_attempts |
Default value | 30 |
Minimum fidelity¶
Lowest fidelity score allowed for a rule to be selected. Takes values in the range [0,1].
Property | Value |
---|---|
Is required | No |
Type | Float |
CLI argument syntax | --min_fidelity |
JSON identifier | min_fidelity |
Default value | 1.0 |
Minimum generated fidelity¶
Lowest fidelity score the algorithm is allowed to go down to when a rule must be generated. Takes values in the range [0,1].
Property | Value |
---|---|
Is required | No |
Type | Float |
CLI argument syntax | --lowest_min_fidelity |
JSON identifier | lowest_min_fidelity |
Default value | 0.75 |
Number of rules¶
Number of Fidex rules to compute per sample when launching the Fidex algorithm. Takes values in the range [1,∞[.
Property | Value |
---|---|
Is required | No |
Type | Integer |
CLI argument syntax | --nb_fidex_rules |
JSON identifier | nb_fidex_rules |
Default value | 1 |
Dimension dropout¶
Proportion of dimensions that are ignored during an iteration. Takes values in the range [0,1].
Property | Value |
---|---|
Is required | No |
Type | Float |
CLI argument syntax | --dropout_dim |
JSON identifier | dropout_dim |
Default value | 0.0 |
Hyperplane dropout¶
Proportion of hyperplanes that are ignored during an iteration. Takes values in the range [0,1].
Property | Value |
---|---|
Is required | No |
Type | Float |
CLI argument syntax | --dropout_hyp |
JSON identifier | dropout_hyp |
Default value | 0.0 |
Number of stairs¶
Number of stairs in the staircase activation function used in the Dimlp layer during training. Takes values in the range [3,∞[.
Property | Value |
---|---|
Is required | No |
Type | Integer |
CLI argument syntax | --nb_quant_levels |
JSON identifier | nb_quant_levels |
Default value | 50 |
Normalization file¶
File containing the mean and standard deviation for specified attributes that have been normalized. If specified, it is used to denormalize the rules.
Property | Value |
---|---|
Is required | No |
Type | String |
CLI argument syntax | --normalization_file |
JSON identifier | normalization_file |
Default value | None |
Mus¶
Mean or median of each normalized attribute whose index is specified in normalization indices. This argument is used alongside sigmas and normalization indices. If specified, it is used to denormalize the rules. Takes values in the range ]-∞,∞[.
Property | Value |
---|---|
Is required | No** |
Type | Float list |
CLI argument syntax | --mus |
JSON identifier | mus |
Default value | None |
Warning
If sigmas or normalization indices are used, then this argument is required. Not used if a normalization file is given.
Sigmas¶
Standard deviation of each normalized attribute whose index is specified in normalization indices. This argument is used alongside mus and normalization indices. If specified, it is used to denormalize the rules. Takes values in the range ]-∞,∞[.
Property | Value |
---|---|
Is required | No** |
Type | Float list |
CLI argument syntax | --sigmas |
JSON identifier | sigmas |
Default value | None |
Warning
If mus or normalization indices are used, then this argument is required. Not used if a normalization file is given.
Normalization indices¶
Indices of attributes that have been normalized. If specified, it is used to denormalize the rules. Index starts at 0. Each index takes values in the range [0,nb_attributes-1].
Property | Value |
---|---|
Is required | No** |
Type | List of integers |
CLI argument syntax | --normalization_indices |
JSON identifier | normalization_indices |
Default value | [0,...,nb_attributes-1] |
Warning
If mus or sigmas are used, then this argument is required. Not used if a normalization file is given.
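To show how mus, sigmas and normalization indices fit together, here is a minimal sketch of denormalizing a rule threshold. It assumes the attributes were normalized as (x - mu) / sigma, so that the original value is recovered as x * sigma + mu; the helper name and this formula are assumptions, not a description of the program's internals.

```python
from typing import Sequence


def denormalize_threshold(
    attribute_index: int,
    threshold: float,
    normalization_indices: Sequence[int],
    mus: Sequence[float],
    sigmas: Sequence[float],
) -> float:
    """Map a threshold on a normalized attribute back to its original scale.

    mus[k] and sigmas[k] are assumed to describe the attribute whose index
    is normalization_indices[k]. Attributes that are not listed were never
    normalized, so their thresholds are returned unchanged.
    """
    if attribute_index not in normalization_indices:
        return threshold
    position = list(normalization_indices).index(attribute_index)
    return threshold * sigmas[position] + mus[position]


# Attribute 2 was normalized with mu=100 and sigma=20; attribute 1 was not.
restored = denormalize_threshold(2, 0.5, [0, 2], [10.0, 100.0], [2.0, 20.0])
untouched = denormalize_threshold(1, 0.5, [0, 2], [10.0, 100.0], [2.0, 20.0])
```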
Seed¶
Number to feed the random generator. If 0, the randomness cannot be reproduced. If any other number x is used, you can reproduce the same output if x is re-used. Takes values in the range [0,∞[.
Property | Value |
---|---|
Is required | No |
Type | Integer |
CLI argument syntax | --seed |
JSON identifier | seed |
Default value | 0 |
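The seed convention described above can be mimicked in Python as follows. This sketch only illustrates the behaviour with the standard `random` module; `make_generator` is a hypothetical helper and says nothing about the generator the program actually uses.

```python
import random


def make_generator(seed: int) -> random.Random:
    """Build a random generator following the seed convention above."""
    if seed < 0:
        raise ValueError("seed must be in the range [0, inf[")
    if seed == 0:
        # No fixed seed: the generator is seeded from a system source,
        # so successive runs are not reproducible.
        return random.Random()
    return random.Random(seed)


# Two generators built with the same non-zero seed yield the same draws.
first_run = [make_generator(42).random() for _ in range(3)]
second_run = [make_generator(42).random() for _ in range(3)]
```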
Usage example¶
Example
from dimlpfidex.fidex import fidexGlo
fidexGlo("""
--root_folder dimlp/datafiles
--test_data_file test_data.txt
--test_pred_file predTest.out
--global_rules_file globalRules.rls
--nb_attributes 16
--nb_classes 2
--explanation_file explanation.txt
--with_fidex true
--train_data_file train_data.txt
--train_pred_file predTrain.out
--train_class_file train_class.txt
--test_class_file test_class.txt
--weights_file weights.wts"""
)
./fidexGlo --root_folder ../dimlp/datafiles --test_data_file test_data.txt --test_pred_file predTest.out --global_rules_file globalRules.rls --nb_attributes 16 --nb_classes 2 --explanation_file explanation.txt --with_fidex true --train_data_file train_data.txt --train_pred_file predTrain.out --train_class_file train_class.txt --test_class_file test_class.txt --weights_file weights.wts
Output interpretation¶
Explanation file¶
This file contains the explanations computed for each test data sample. An explanation consists of the correctly activated global rules computed beforehand by FidexGloRules, as well as any incorrectly activated rules, if present. If no rule is activated for a sample, one or more local rules are computed by Fidex. The file begins with global statistics about the ruleset, followed by the explanation for each test sample. Each explanation includes the model's predicted class and its probability score (confidence). For both the correct and incorrect activated rules, the number of rules is provided, with rules ordered by their covering size and accompanied by their performance metrics. At the end of the file, a statistic shows the percentage of times that Fidex was called to generate a local rule.
Global Statistics:
Number of rules
- Indicates the total number of rules in the ruleset.
Mean sample covering number per rule
- The average number of training samples covered by each rule.
Mean number of antecedents per rule
- Represents the average number of conditions (antecedents) in each rule.
Decision threshold
- If present, indicates the decision threshold used for prediction.
Explanation of Each Rule:
Each rule consists of conditions on various attributes, followed by the predicted class, and is accompanied by several performance metrics. Let's break down this rule as an example:
Rule 1: X0>=0.65839 X1>=0.423139 X8>=0.105399 -> class 0
Train Covering size : 121
Train Fidelity : 1
Train Accuracy : 0.950413
Train Confidence : 0.97161
X0, X1, X8
- These represent the variables from the dataset.
>=0.65839, >=0.423139, >=0.105399
- The thresholds that the variable values must meet for the rule to be activated.
-> class 0
- The class predicted by the rule when the conditions are met. Here, the rule predicts class 0.
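For readers who want to process such rules programmatically, the sketch below parses the first line of the example rule and checks whether a sample activates it. The regular expressions are assumptions based solely on the example shown here (the `<` operator in particular does not appear in it), and the function names are hypothetical.

```python
import re
from typing import Dict, List, Tuple

# Patterns inferred from the example rule line; the "<" operator is assumed.
RULE_PATTERN = re.compile(r"^Rule\s+(\d+):\s*(.*?)\s*->\s*class\s+(\S+)$")
ANTECEDENT_PATTERN = re.compile(r"^(\S+?)(>=|<)(-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)$")

Antecedent = Tuple[str, str, float]


def parse_rule(line: str) -> Tuple[int, List[Antecedent], str]:
    """Split a rule line into its number, antecedents and predicted class."""
    match = RULE_PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"not a rule line: {line!r}")
    rule_number, body, predicted_class = match.groups()
    antecedents: List[Antecedent] = []
    for token in body.split():
        parsed = ANTECEDENT_PATTERN.match(token)
        if parsed is None:
            raise ValueError(f"unrecognized antecedent: {token!r}")
        name, operator, value = parsed.groups()
        antecedents.append((name, operator, float(value)))
    return int(rule_number), antecedents, predicted_class


def is_activated(antecedents: List[Antecedent], sample: Dict[str, float]) -> bool:
    """A sample activates the rule when it satisfies every antecedent."""
    return all(
        (sample[name] >= value) if operator == ">=" else (sample[name] < value)
        for name, operator, value in antecedents
    )


number, conditions, rule_class = parse_rule(
    "Rule 1: X0>=0.65839 X1>=0.423139 X8>=0.105399 -> class 0"
)
activated = is_activated(conditions, {"X0": 0.7, "X1": 0.5, "X8": 0.2})
```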
Performance Metrics Associated with the Rule:
Train Covering size
- Indicates the number of training samples that are covered by the rule. For Rule 1, it covers 121 samples.
Train Fidelity
- Measures how well the rule aligns with the model’s predictions. A fidelity of 1 means that the rule exactly matches the model’s predictions for all the samples it covers.
Train Accuracy
- The accuracy of the rule in correctly classifying the samples it covers. In the case of Rule 1, 95.04% of the covered samples are correctly classified.
Train Confidence
- This is the average confidence score of the model’s predictions for the samples covered by the rule. It is computed from the prediction scores of the covered samples, indicating the model’s confidence in its classifications. For Rule 1, the confidence is 97.16%.
Each subsequent rule follows the same structure.
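The four metrics can be reproduced on a toy dataset with the sketch below. It follows the definitions given above under a few assumptions: the model's predicted class is taken as the index of its highest score, accuracy compares the rule's class with the true class, and confidence averages the model's score for the rule's class. The function name and the data are made up for illustration.

```python
from typing import Callable, Dict, List, Sequence


def rule_metrics(
    covers: Callable[[Sequence[float]], bool],
    rule_class: int,
    samples: List[Sequence[float]],
    model_scores: List[Sequence[float]],
    true_classes: List[int],
) -> Dict[str, float]:
    """Compute covering size, fidelity, accuracy and confidence of a rule."""
    covered = [i for i, sample in enumerate(samples) if covers(sample)]
    size = len(covered)
    if size == 0:
        return {"covering_size": 0, "fidelity": 0.0, "accuracy": 0.0, "confidence": 0.0}
    # Model decision = class with the highest prediction score (assumption).
    decisions = [
        max(range(len(model_scores[i])), key=lambda c, i=i: model_scores[i][c])
        for i in covered
    ]
    fidelity = sum(d == rule_class for d in decisions) / size
    accuracy = sum(true_classes[i] == rule_class for i in covered) / size
    confidence = sum(model_scores[i][rule_class] for i in covered) / size
    return {
        "covering_size": size,
        "fidelity": fidelity,
        "accuracy": accuracy,
        "confidence": confidence,
    }


# Toy data: one attribute, two classes, hypothetical rule "X0>=0.65 -> class 0".
toy_metrics = rule_metrics(
    covers=lambda sample: sample[0] >= 0.65,
    rule_class=0,
    samples=[[0.7], [0.8], [0.9], [0.1]],
    model_scores=[[0.9, 0.1], [0.8, 0.2], [1.0, 0.0], [0.2, 0.8]],
    true_classes=[0, 0, 1, 1],
)
```

In this toy case the rule covers three samples, the model predicts class 0 for all of them (fidelity of 1), and two of the three truly belong to class 0.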