Benchmarks

The benchmark was run on tabular datasets from the Tabular data learning benchmark, which is also hosted on Hugging Face.

For the binary classification task, sklearn.metrics.f1_score is used for evaluation (higher is better). For the regression task, sklearn.metrics.mean_squared_error is used for evaluation (lower is better).
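As a reminder of what the two scores measure, here is a plain-Python sketch of the definitions behind sklearn.metrics.f1_score (binary case, positive label 1) and sklearn.metrics.mean_squared_error. The helper names `f1_binary` and `mse` are illustrative and are not part of the benchmark code.

```python
def f1_binary(y_true, y_pred):
    """F1 for binary labels in {0, 1}, treating 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denominator = 2 * tp + fp + fn
    # F1 is undefined when there are no positives at all; this sketch returns 0.0.
    return 2 * tp / denominator if denominator else 0.0


def mse(y_true, y_pred):
    """Mean of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Classification: a higher F1 is better.
print(f1_binary([1, 0, 1, 1], [1, 0, 0, 1]))
# Regression: a lower MSE is better.
print(mse([3.0, 5.0], [2.0, 7.0]))
```

This matters when reading the result tables: a larger test_score is an improvement for classification, while a smaller test_score is an improvement for regression.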

Each downloaded dataset is split into three parts: train (50%), val (10%), and test (40%). Feature importances are calculated on the train set, feature selection is done on the val set, and the final benchmark score is evaluated on the test set. The test set is therefore unseen by both the feature importance calculation and the feature selection process.
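The 50/10/40 split can be sketched as follows. This is a minimal standard-library illustration that assumes a simple shuffled row split; the helper `split_indices` is hypothetical, and the benchmark script may split the data with a different utility.

```python
import random


def split_indices(n_rows, seed=2023, train_frac=0.5, val_frac=0.1):
    """Shuffle row indices and cut them into train / val / test parts."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_train = int(n_rows * train_frac)
    n_val = int(n_rows * val_frac)
    train = indices[:n_train]                  # used to compute feature importances
    val = indices[n_train:n_train + n_val]     # used to choose how many features to keep
    test = indices[n_train + n_val:]           # used only for the final reported score
    return train, val, test


train, val, test = split_indices(1000)
print(len(train), len(val), len(test))  # 500 100 400
```

Because the three index lists are disjoint, nothing learned from the train or val rows can leak into the test score.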

The exact model parameters are listed below; they are also shown in run_tabular_benchmark.py.

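from catboost import CatBoostClassifier, CatBoostRegressor
from lightgbm import LGBMClassifier, LGBMRegressor
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from xgboost import XGBClassifier, XGBRegressor

# Benchmark settings
seed = 2023
num_actual_runs = 10
num_random_runs = 50
shuffle_feature_order = True

# Per task: model name -> (model class, constructor keyword arguments)
model_cls_dicts = {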
    "binary_classification": {
        "RandomForest": (RandomForestClassifier, {"n_jobs": -1}),
        "XGBoost": (
            XGBClassifier,
            {"n_jobs": -1, "importance_type": "gain"},
        ),
        "LGBM": (
            LGBMClassifier,
            {"n_jobs": -1, "importance_type": "gain"},
        ),
        "CatBoost": (
            CatBoostClassifier,
            {"verbose": False},
        ),
    },
    "regression": {
        "RandomForest": (RandomForestRegressor, {"n_jobs": -1}),
        "XGBoost": (
            XGBRegressor,
            {"n_jobs": -1, "importance_type": "gain"},
        ),
        "LGBM": (
            LGBMRegressor,
            {"n_jobs": -1, "importance_type": "gain"},
        ),
        "CatBoost": (
            CatBoostRegressor,
            {"verbose": False},
        ),
    },
}

The table below summarizes the results of feature selection with null importances across multiple models and datasets. "better" means the result is better than feature selection with the model's built-in feature importances. Even with the models' default parameters, null importances give a better result on the majority of datasets for every model.

model                   n_dataset  n_better  better %
RandomForestClassifier  10         10        100.0
RandomForestRegressor   12         8         66.67
XGBClassifier           10         7         70.0
XGBRegressor            12         7         58.33
LGBMClassifier          10         8         80.0
LGBMRegressor           12         8         66.67
CatBoostClassifier      10         6         60.0
CatBoostRegressor       12         8         66.67
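One reading of "better" that matches the XGBoost counts in the raw result tables on this page (7 of 10 for classification, 7 of 12 for regression) is that a dataset counts as better when at least one null-importance method scores strictly better than built-in on the test set, so ties do not count. The sketch below illustrates that reading with a hypothetical helper `is_better`; it is not the benchmark's own tallying code. The scores are copied from the XGBoost result tables.

```python
def is_better(scores, higher_is_better):
    """scores maps an importance method name to its test score for one dataset."""
    baseline = scores["built-in"]
    others = [score for name, score in scores.items() if name != "built-in"]
    if higher_is_better:
        return max(others) > baseline
    return min(others) < baseline


# XGBoost, binary classification (F1, higher is better)
electricity = {"built-in": 0.910815, "A-R": 0.910815,
               "A/(R+1)": 0.910815, "Wasserstein": 0.901493}
eye_movements = {"built-in": 0.647699, "A-R": 0.647699,
                 "A/(R+1)": 0.685209, "Wasserstein": 0.647699}
# XGBoost, regression (MSE, lower is better)
cpu_act = {"built-in": 5.946372, "A-R": 5.654754,
           "A/(R+1)": 5.679894, "Wasserstein": 5.752541}

print(is_better(electricity, higher_is_better=True))    # False: only a tie with built-in
print(is_better(eye_movements, higher_is_better=True))  # True: A/(R+1) is higher
print(is_better(cpu_act, higher_is_better=False))       # True: all three have lower MSE
```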

The table below shows how often each importance calculation method gives the best test score for each task, over all datasets and models:

task                   n_dataset  importances  best %
binary_classification  10         A-R          32.69
binary_classification  10         A/(R+1)      26.92
binary_classification  10         Wasserstein  23.08
binary_classification  10         built-in     17.31
regression             12         built-in     29.82
regression             12         A/(R+1)      28.07
regression             12         A-R          26.32
regression             12         Wasserstein  15.79

built-in: the baseline, i.e. the model's own built-in feature importances.

A-R: compute_permutation_importance_by_subtraction

A/(R+1): compute_permutation_importance_by_division

Wasserstein: compute_permutation_importance_by_wasserstein_distance
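The three labels can be read as follows, assuming that A stands for a feature's importances from fits on the actual target and R for its importances from fits on shuffled targets, each averaged over runs. That reading of the notation is an assumption, not something this page confirms. The helpers below are illustrative stand-ins written with the standard library only; they are not the library functions named above, whose signatures and exact aggregation are not shown here.

```python
from bisect import bisect_right
from statistics import mean


def importance_by_subtraction(actual, null):
    """A-R: mean actual importance minus mean null importance."""
    return mean(actual) - mean(null)


def importance_by_division(actual, null):
    """A/(R+1): mean actual importance divided by (mean null importance + 1)."""
    return mean(actual) / (mean(null) + 1)


def wasserstein_distance_1d(u, v):
    """First Wasserstein distance between two 1-D empirical distributions."""
    u_sorted, v_sorted = sorted(u), sorted(v)
    points = sorted(u_sorted + v_sorted)
    distance = 0.0
    # Integrate |CDF_u - CDF_v| over each interval between consecutive sample points.
    for left, right in zip(points, points[1:]):
        cdf_u = bisect_right(u_sorted, left) / len(u_sorted)
        cdf_v = bisect_right(v_sorted, left) / len(v_sorted)
        distance += abs(cdf_u - cdf_v) * (right - left)
    return distance


# Made-up importances of one feature: 3 actual runs and 4 null runs.
actual = [0.30, 0.32, 0.28]
null = [0.05, 0.06, 0.04, 0.05]
print(importance_by_subtraction(actual, null))
print(importance_by_division(actual, null))
print(wasserstein_distance_1d(actual, null))
```

In all three variants, a feature whose actual importances are clearly separated from its null importances gets a high score, while a feature whose importance can be reproduced with shuffled targets scores near zero.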

The tables below show the raw results. The raw CSV data are in benchmarks/results.

Binary Classification Results with RandomForest

dataset importances feature_reduction test_score
clf_cat/electricity.csv built-in 8->2 0.894008
clf_cat/electricity.csv A-R 8->4 0.903448
clf_cat/electricity.csv A/(R+1) 8->2 0.894008
clf_cat/electricity.csv Wasserstein 8->8 0.886164
clf_cat/eye_movements.csv built-in 23->22 0.616902
clf_cat/eye_movements.csv A-R 23->11 0.663573
clf_cat/eye_movements.csv A/(R+1) 23->23 0.613378
clf_cat/eye_movements.csv Wasserstein 23->22 0.616098
clf_cat/covertype.csv built-in 54->26 0.955826
clf_cat/covertype.csv A-R 54->52 0.958779
clf_cat/covertype.csv A/(R+1) 54->25 0.956147
clf_cat/covertype.csv Wasserstein 54->28 0.954931
clf_cat/albert.csv built-in 31->22 0.65181
clf_cat/albert.csv A-R 31->22 0.651038
clf_cat/albert.csv A/(R+1) 31->28 0.656057
clf_cat/albert.csv Wasserstein 31->28 0.656968
clf_cat/compas-two-years.csv built-in 11->10 0.631631
clf_cat/compas-two-years.csv A-R 11->2 0.658924
clf_cat/compas-two-years.csv A/(R+1) 11->8 0.63761
clf_cat/compas-two-years.csv Wasserstein 11->8 0.630445
clf_cat/default-of-credit-card-clients.csv built-in 21->18 0.670973
clf_cat/default-of-credit-card-clients.csv A-R 21->17 0.682581
clf_cat/default-of-credit-card-clients.csv A/(R+1) 21->21 0.676039
clf_cat/default-of-credit-card-clients.csv Wasserstein 21->21 0.676729
clf_cat/road-safety.csv built-in 32->31 0.789492
clf_cat/road-safety.csv A-R 32->30 0.788183
clf_cat/road-safety.csv A/(R+1) 32->32 0.790758
clf_cat/road-safety.csv Wasserstein 32->32 0.790862
clf_num/Bioresponse.csv built-in 419->295 0.768577
clf_num/Bioresponse.csv A-R 419->80 0.764124
clf_num/Bioresponse.csv A/(R+1) 419->300 0.767705
clf_num/Bioresponse.csv Wasserstein 419->60 0.769556
clf_num/jannis.csv built-in 54->22 0.795777
clf_num/jannis.csv A-R 54->28 0.798353
clf_num/jannis.csv A/(R+1) 54->27 0.797567
clf_num/jannis.csv Wasserstein 54->51 0.78659
clf_num/MiniBooNE.csv built-in 50->33 0.930573
clf_num/MiniBooNE.csv A-R 50->42 0.930107
clf_num/MiniBooNE.csv A/(R+1) 50->45 0.930059
clf_num/MiniBooNE.csv Wasserstein 50->50 0.931065

Regression Results with RandomForest

dataset importances feature_reduction test_score
reg_num/cpu_act.csv built-in 21->20 6.005464
reg_num/cpu_act.csv A-R 21->20 6.009862
reg_num/cpu_act.csv A/(R+1) 21->19 5.976787
reg_num/cpu_act.csv Wasserstein 21->18 6.044395
reg_num/pol.csv built-in 26->16 0.273401
reg_num/pol.csv A-R 26->26 0.277991
reg_num/pol.csv A/(R+1) 26->12 0.278584
reg_num/pol.csv Wasserstein 26->14 0.271883
reg_num/elevators.csv built-in 16->7 8.044735
reg_num/elevators.csv A-R 16->15 8.34646
reg_num/elevators.csv A/(R+1) 16->6 7.884783
reg_num/elevators.csv Wasserstein 16->14 8.37675
reg_num/wine_quality.csv built-in 11->11 0.410926
reg_num/wine_quality.csv A-R 11->10 0.40895
reg_num/wine_quality.csv A/(R+1) 11->11 0.411228
reg_num/wine_quality.csv Wasserstein 11->11 0.41206
reg_num/Ailerons.csv built-in 33->12 2.827377
reg_num/Ailerons.csv A-R 33->29 2.810824
reg_num/Ailerons.csv A/(R+1) 33->12 2.821326
reg_num/Ailerons.csv Wasserstein 33->32 2.841721
reg_num/yprop_4_1.csv built-in 42->26 75403.649647
reg_num/yprop_4_1.csv A-R 42->41 74837.199659
reg_num/yprop_4_1.csv A/(R+1) 42->30 75175.03063
reg_num/yprop_4_1.csv Wasserstein 42->29 75796.896781
reg_num/superconduct.csv built-in 79->53 54470.492359
reg_num/superconduct.csv A-R 79->68 54068.62626
reg_num/superconduct.csv A/(R+1) 79->56 54584.570517
reg_num/superconduct.csv Wasserstein 79->69 54009.203279
reg_cat/topo_2_1.csv built-in 255->217 76175.864005
reg_cat/topo_2_1.csv A-R 255->254 76311.964795
reg_cat/topo_2_1.csv A/(R+1) 255->79 76059.172223
reg_cat/topo_2_1.csv Wasserstein 255->165 75797.079183
reg_cat/Mercedes_Benz_Greener_Manufacturing.csv built-in 359->6 177937.918395
reg_cat/Mercedes_Benz_Greener_Manufacturing.csv A-R 359->195 183867.243148
reg_cat/Mercedes_Benz_Greener_Manufacturing.csv A/(R+1) 359->6 177937.918395
reg_cat/Mercedes_Benz_Greener_Manufacturing.csv Wasserstein 359->96 195247.864875
reg_cat/house_sales.csv built-in 17->16 110072.875485
reg_cat/house_sales.csv A-R 17->17 110141.291334
reg_cat/house_sales.csv A/(R+1) 17->17 110404.08618
reg_cat/house_sales.csv Wasserstein 17->17 110078.623402
reg_cat/nyc-taxi-green-dec-2016.csv built-in 16->15 10585.637729
reg_cat/nyc-taxi-green-dec-2016.csv A-R 16->4 10758.481089
reg_cat/nyc-taxi-green-dec-2016.csv A/(R+1) 16->16 10590.967684
reg_cat/nyc-taxi-green-dec-2016.csv Wasserstein 16->16 10600.876808
reg_cat/Allstate_Claims_Severity.csv built-in 124->113 1002055785.041467
reg_cat/Allstate_Claims_Severity.csv A-R 124->124 1003062488.299886
reg_cat/Allstate_Claims_Severity.csv A/(R+1) 124->93 1003238483.814182
reg_cat/Allstate_Claims_Severity.csv Wasserstein 124->84 1002670828.551967

Binary Classification Results with XGBoost

dataset importances feature_reduction test_score
clf_cat/electricity.csv built-in 8->4 0.910815
clf_cat/electricity.csv A-R 8->4 0.910815
clf_cat/electricity.csv A/(R+1) 8->4 0.910815
clf_cat/electricity.csv Wasserstein 8->8 0.901493
clf_cat/eye_movements.csv built-in 23->2 0.647699
clf_cat/eye_movements.csv A-R 23->1 0.647699
clf_cat/eye_movements.csv A/(R+1) 23->10 0.685209
clf_cat/eye_movements.csv Wasserstein 23->1 0.647699
clf_cat/covertype.csv built-in 54->40 0.891799
clf_cat/covertype.csv A-R 54->53 0.890788
clf_cat/covertype.csv A/(R+1) 54->40 0.891799
clf_cat/covertype.csv Wasserstein 54->54 0.887196
clf_cat/albert.csv built-in 31->11 0.651735
clf_cat/albert.csv A-R 31->20 0.653106
clf_cat/albert.csv A/(R+1) 31->11 0.651735
clf_cat/albert.csv Wasserstein 31->25 0.643385
clf_cat/compas-two-years.csv built-in 11->4 0.66599
clf_cat/compas-two-years.csv A-R 11->2 0.690587
clf_cat/compas-two-years.csv A/(R+1) 11->5 0.66599
clf_cat/compas-two-years.csv Wasserstein 11->6 0.66599
clf_cat/default-of-credit-card-clients.csv built-in 21->17 0.677356
clf_cat/default-of-credit-card-clients.csv A-R 21->15 0.676268
clf_cat/default-of-credit-card-clients.csv A/(R+1) 21->16 0.680071
clf_cat/default-of-credit-card-clients.csv Wasserstein 21->19 0.685076
clf_cat/road-safety.csv built-in 32->20 0.792338
clf_cat/road-safety.csv A-R 32->30 0.789321
clf_cat/road-safety.csv A/(R+1) 32->27 0.790174
clf_cat/road-safety.csv Wasserstein 32->31 0.78271
clf_num/Bioresponse.csv built-in 419->82 0.740794
clf_num/Bioresponse.csv A-R 419->67 0.763121
clf_num/Bioresponse.csv A/(R+1) 419->68 0.745468
clf_num/Bioresponse.csv Wasserstein 419->130 0.754958
clf_num/jannis.csv built-in 54->17 0.793181
clf_num/jannis.csv A-R 54->26 0.796462
clf_num/jannis.csv A/(R+1) 54->26 0.796462
clf_num/jannis.csv Wasserstein 54->52 0.784297
clf_num/MiniBooNE.csv built-in 50->45 0.937386
clf_num/MiniBooNE.csv A-R 50->38 0.937817
clf_num/MiniBooNE.csv A/(R+1) 50->47 0.93797
clf_num/MiniBooNE.csv Wasserstein 50->48 0.936908

Regression Results with XGBoost

dataset importances feature_reduction test_score
reg_num/cpu_act.csv built-in 21->17 5.946372
reg_num/cpu_act.csv A-R 21->18 5.654754
reg_num/cpu_act.csv A/(R+1) 21->18 5.679894
reg_num/cpu_act.csv Wasserstein 21->21 5.752541
reg_num/pol.csv built-in 26->15 0.305836
reg_num/pol.csv A-R 26->25 0.300197
reg_num/pol.csv A/(R+1) 26->15 0.305665
reg_num/pol.csv Wasserstein 26->26 0.296534
reg_num/elevators.csv built-in 16->15 5.715875
reg_num/elevators.csv A-R 16->16 5.799774
reg_num/elevators.csv A/(R+1) 16->13 5.598146
reg_num/elevators.csv Wasserstein 16->16 5.76182
reg_num/wine_quality.csv built-in 11->10 0.463778
reg_num/wine_quality.csv A-R 11->11 0.45338
reg_num/wine_quality.csv A/(R+1) 11->11 0.453376
reg_num/wine_quality.csv Wasserstein 11->11 0.449631
reg_num/Ailerons.csv built-in 33->21 2.771169
reg_num/Ailerons.csv A-R 33->24 2.779444
reg_num/Ailerons.csv A/(R+1) 33->22 2.807877
reg_num/Ailerons.csv Wasserstein 33->26 2.880596
reg_num/yprop_4_1.csv built-in 42->2 78997.96633
reg_num/yprop_4_1.csv A-R 42->2 78997.96633
reg_num/yprop_4_1.csv A/(R+1) 42->2 78997.96633
reg_num/yprop_4_1.csv Wasserstein 42->1 80189.002714
reg_num/superconduct.csv built-in 79->40 58619.061023
reg_num/superconduct.csv A-R 79->36 58742.531663
reg_num/superconduct.csv A/(R+1) 79->39 58312.179595
reg_num/superconduct.csv Wasserstein 79->73 58888.372081
reg_cat/topo_2_1.csv built-in 255->169 85043.995838
reg_cat/topo_2_1.csv A-R 255->115 84843.36509
reg_cat/topo_2_1.csv A/(R+1) 255->180 85421.986381
reg_cat/topo_2_1.csv Wasserstein 255->176 87021.271777
reg_cat/Mercedes_Benz_Greener_Manufacturing.csv built-in 359->7 175536.434647
reg_cat/Mercedes_Benz_Greener_Manufacturing.csv A-R 359->82 179004.576184
reg_cat/Mercedes_Benz_Greener_Manufacturing.csv A/(R+1) 359->13 175536.434647
reg_cat/Mercedes_Benz_Greener_Manufacturing.csv Wasserstein 359->14 175717.216092
reg_cat/house_sales.csv built-in 17->17 105445.807214
reg_cat/house_sales.csv A-R 17->16 107369.173579
reg_cat/house_sales.csv A/(R+1) 17->17 105474.801274
reg_cat/house_sales.csv Wasserstein 17->17 105647.831015
reg_cat/nyc-taxi-green-dec-2016.csv built-in 16->7 11361.646796
reg_cat/nyc-taxi-green-dec-2016.csv A-R 16->7 11361.5967
reg_cat/nyc-taxi-green-dec-2016.csv A/(R+1) 16->7 11361.944294
reg_cat/nyc-taxi-green-dec-2016.csv Wasserstein 16->4 11586.532828
reg_cat/Allstate_Claims_Severity.csv built-in 124->69 928207411.901454
reg_cat/Allstate_Claims_Severity.csv A-R 124->112 928486398.650217
reg_cat/Allstate_Claims_Severity.csv A/(R+1) 124->73 930825046.195666
reg_cat/Allstate_Claims_Severity.csv Wasserstein 124->124 929893696.608902

Binary Classification Results with LightGBM

dataset importances feature_reduction test_score
clf_cat/electricity.csv built-in 8->5 0.877175
clf_cat/electricity.csv A-R 8->5 0.877175
clf_cat/electricity.csv A/(R+1) 8->5 0.877175
clf_cat/electricity.csv Wasserstein 8->5 0.877175
clf_cat/eye_movements.csv built-in 23->23 0.632579
clf_cat/eye_movements.csv A-R 23->12 0.668197
clf_cat/eye_movements.csv A/(R+1) 23->13 0.666667
clf_cat/eye_movements.csv Wasserstein 23->6 0.668401
clf_cat/covertype.csv built-in 54->19 0.851571
clf_cat/covertype.csv A-R 54->25 0.85109
clf_cat/covertype.csv A/(R+1) 54->43 0.847426
clf_cat/covertype.csv Wasserstein 54->25 0.85109
clf_cat/albert.csv built-in 31->14 0.665288
clf_cat/albert.csv A-R 31->19 0.666948
clf_cat/albert.csv A/(R+1) 31->17 0.667958
clf_cat/albert.csv Wasserstein 31->22 0.663828
clf_cat/compas-two-years.csv built-in 11->3 0.655332
clf_cat/compas-two-years.csv A-R 11->4 0.66972
clf_cat/compas-two-years.csv A/(R+1) 11->4 0.66972
clf_cat/compas-two-years.csv Wasserstein 11->4 0.66972
clf_cat/default-of-credit-card-clients.csv built-in 21->20 0.689504
clf_cat/default-of-credit-card-clients.csv A-R 21->14 0.689751
clf_cat/default-of-credit-card-clients.csv A/(R+1) 21->14 0.689751
clf_cat/default-of-credit-card-clients.csv Wasserstein 21->21 0.684882
clf_cat/road-safety.csv built-in 32->23 0.792133
clf_cat/road-safety.csv A-R 32->21 0.791898
clf_cat/road-safety.csv A/(R+1) 32->27 0.792048
clf_cat/road-safety.csv Wasserstein 32->19 0.792245
clf_num/Bioresponse.csv built-in 419->22 0.752857
clf_num/Bioresponse.csv A-R 419->416 0.762108
clf_num/Bioresponse.csv A/(R+1) 419->260 0.757295
clf_num/Bioresponse.csv Wasserstein 419->60 0.753395
clf_num/jannis.csv built-in 54->24 0.797305
clf_num/jannis.csv A-R 54->25 0.798759
clf_num/jannis.csv A/(R+1) 54->24 0.800101
clf_num/jannis.csv Wasserstein 54->32 0.797517
clf_num/MiniBooNE.csv built-in 50->40 0.936987
clf_num/MiniBooNE.csv A-R 50->37 0.937139
clf_num/MiniBooNE.csv A/(R+1) 50->37 0.937139
clf_num/MiniBooNE.csv Wasserstein 50->45 0.93793

Regression Results with LightGBM

dataset importances feature_reduction test_score
reg_num/cpu_act.csv built-in 21->15 5.125908
reg_num/cpu_act.csv A-R 21->21 5.188476
reg_num/cpu_act.csv A/(R+1) 21->21 5.188049
reg_num/cpu_act.csv Wasserstein 21->18 5.158656
reg_num/pol.csv built-in 26->13 0.274706
reg_num/pol.csv A-R 26->25 0.278663
reg_num/pol.csv A/(R+1) 26->14 0.278948
reg_num/pol.csv Wasserstein 26->14 0.26098
reg_num/elevators.csv built-in 16->16 5.510428
reg_num/elevators.csv A-R 16->16 5.510406
reg_num/elevators.csv A/(R+1) 16->16 5.510406
reg_num/elevators.csv Wasserstein 16->16 5.510406
reg_num/wine_quality.csv built-in 11->11 0.437608
reg_num/wine_quality.csv A-R 11->11 0.437608
reg_num/wine_quality.csv A/(R+1) 11->11 0.437608
reg_num/wine_quality.csv Wasserstein 11->11 0.437608
reg_num/Ailerons.csv built-in 33->22 2.600292
reg_num/Ailerons.csv A-R 33->29 2.57817
reg_num/Ailerons.csv A/(R+1) 33->27 2.560278
reg_num/Ailerons.csv Wasserstein 33->28 2.595686
reg_num/yprop_4_1.csv built-in 42->29 75930.237995
reg_num/yprop_4_1.csv A-R 42->34 76201.666859
reg_num/yprop_4_1.csv A/(R+1) 42->31 76186.560793
reg_num/yprop_4_1.csv Wasserstein 42->19 76494.373796
reg_num/superconduct.csv built-in 79->55 63228.223091
reg_num/superconduct.csv A-R 79->76 62647.41587
reg_num/superconduct.csv A/(R+1) 79->70 62806.379162
reg_num/superconduct.csv Wasserstein 79->70 63367.083612
reg_cat/topo_2_1.csv built-in 255->84 77167.225165
reg_cat/topo_2_1.csv A-R 255->32 78130.633542
reg_cat/topo_2_1.csv A/(R+1) 255->153 77161.555463
reg_cat/topo_2_1.csv Wasserstein 255->232 77395.701091
reg_cat/Mercedes_Benz_Greener_Manufacturing.csv built-in 359->87 189609.18945
reg_cat/Mercedes_Benz_Greener_Manufacturing.csv A-R 359->14 174764.764814
reg_cat/Mercedes_Benz_Greener_Manufacturing.csv A/(R+1) 359->69 191855.328702
reg_cat/Mercedes_Benz_Greener_Manufacturing.csv Wasserstein 359->183 191374.726273
reg_cat/house_sales.csv built-in 17->14 97213.085621
reg_cat/house_sales.csv A-R 17->14 96600.951963
reg_cat/house_sales.csv A/(R+1) 17->11 96625.662916
reg_cat/house_sales.csv Wasserstein 17->12 100009.945837
reg_cat/nyc-taxi-green-dec-2016.csv built-in 16->5 13655.984718
reg_cat/nyc-taxi-green-dec-2016.csv A-R 16->6 13632.35428
reg_cat/nyc-taxi-green-dec-2016.csv A/(R+1) 16->8 13623.652139
reg_cat/nyc-taxi-green-dec-2016.csv Wasserstein 16->6 13632.35428
reg_cat/Allstate_Claims_Severity.csv built-in 124->82 918539118.574563
reg_cat/Allstate_Claims_Severity.csv A-R 124->78 919651444.584081
reg_cat/Allstate_Claims_Severity.csv A/(R+1) 124->102 919303774.163119
reg_cat/Allstate_Claims_Severity.csv Wasserstein 124->95 918926264.810142

Binary Classification Results with CatBoost

dataset importances feature_reduction test_score
clf_cat/electricity.csv built-in 8->3 0.883178
clf_cat/electricity.csv A-R 8->3 0.883178
clf_cat/electricity.csv A/(R+1) 8->3 0.883178
clf_cat/electricity.csv Wasserstein 8->8 0.883155
clf_cat/eye_movements.csv built-in 23->23 0.628159
clf_cat/eye_movements.csv A-R 23->14 0.64964
clf_cat/eye_movements.csv A/(R+1) 23->5 0.656049
clf_cat/eye_movements.csv Wasserstein 23->22 0.625205
clf_cat/covertype.csv built-in 54->24 0.913111
clf_cat/covertype.csv A-R 54->54 0.910959
clf_cat/covertype.csv A/(R+1) 54->35 0.912809
clf_cat/covertype.csv Wasserstein 54->30 0.913034
clf_cat/albert.csv built-in 31->15 0.665236
clf_cat/albert.csv A-R 31->24 0.66661
clf_cat/albert.csv A/(R+1) 31->18 0.6679
clf_cat/albert.csv Wasserstein 31->31 0.666891
clf_cat/compas-two-years.csv built-in 11->7 0.675926
clf_cat/compas-two-years.csv A-R 11->10 0.674274
clf_cat/compas-two-years.csv A/(R+1) 11->9 0.675244
clf_cat/compas-two-years.csv Wasserstein 11->7 0.674286
clf_cat/default-of-credit-card-clients.csv built-in 21->21 0.690341
clf_cat/default-of-credit-card-clients.csv A-R 21->21 0.689137
clf_cat/default-of-credit-card-clients.csv A/(R+1) 21->21 0.689978
clf_cat/default-of-credit-card-clients.csv Wasserstein 21->20 0.69052
clf_cat/road-safety.csv built-in 32->27 0.794
clf_cat/road-safety.csv A-R 32->28 0.793317
clf_cat/road-safety.csv A/(R+1) 32->22 0.791271
clf_cat/road-safety.csv Wasserstein 32->30 0.79264
clf_num/Bioresponse.csv built-in 419->290 0.781006
clf_num/Bioresponse.csv A-R 419->419 0.786373
clf_num/Bioresponse.csv A/(R+1) 419->206 0.784507
clf_num/Bioresponse.csv Wasserstein 419->386 0.787368
clf_num/jannis.csv built-in 54->18 0.806811
clf_num/jannis.csv A-R 54->26 0.80771
clf_num/jannis.csv A/(R+1) 54->22 0.810063
clf_num/jannis.csv Wasserstein 54->49 0.801057
clf_num/MiniBooNE.csv built-in 50->48 0.942751
clf_num/MiniBooNE.csv A-R 50->46 0.943224
clf_num/MiniBooNE.csv A/(R+1) 50->45 0.942883
clf_num/MiniBooNE.csv Wasserstein 50->50 0.943038

Regression Results with CatBoost

dataset importances feature_reduction test_score
reg_num/cpu_act.csv built-in 21->16 5.104869
reg_num/cpu_act.csv A-R 21->21 5.13585
reg_num/cpu_act.csv A/(R+1) 21->20 5.1915
reg_num/cpu_act.csv Wasserstein 21->21 5.127099
reg_num/pol.csv built-in 26->15 0.271039
reg_num/pol.csv A-R 26->25 0.271706
reg_num/pol.csv A/(R+1) 26->20 0.26218
reg_num/pol.csv Wasserstein 26->20 0.260988
reg_num/elevators.csv built-in 16->15 4.324409
reg_num/elevators.csv A-R 16->15 4.34998
reg_num/elevators.csv A/(R+1) 16->16 4.321757
reg_num/elevators.csv Wasserstein 16->16 4.35382
reg_num/wine_quality.csv built-in 11->11 0.433982
reg_num/wine_quality.csv A-R 11->11 0.428414
reg_num/wine_quality.csv A/(R+1) 11->10 0.433384
reg_num/wine_quality.csv Wasserstein 11->11 0.433185
reg_num/Ailerons.csv built-in 33->28 2.42977
reg_num/Ailerons.csv A-R 33->28 2.41639
reg_num/Ailerons.csv A/(R+1) 33->24 2.440884
reg_num/Ailerons.csv Wasserstein 33->19 2.45879
reg_num/yprop_4_1.csv built-in 42->29 75892.248828
reg_num/yprop_4_1.csv A-R 42->41 75664.082657
reg_num/yprop_4_1.csv A/(R+1) 42->28 75348.084082
reg_num/yprop_4_1.csv Wasserstein 42->32 75485.168192
reg_num/superconduct.csv built-in 79->67 57399.093773
reg_num/superconduct.csv A-R 79->73 57487.193807
reg_num/superconduct.csv A/(R+1) 79->63 57478.518356
reg_num/superconduct.csv Wasserstein 79->75 57710.629917
reg_cat/topo_2_1.csv built-in 255->176 75530.315952
reg_cat/topo_2_1.csv A-R 255->240 76035.257684
reg_cat/topo_2_1.csv A/(R+1) 255->226 76258.199946
reg_cat/topo_2_1.csv Wasserstein 255->210 76661.935034
reg_cat/Mercedes_Benz_Greener_Manufacturing.csv built-in 359->338 189052.696626
reg_cat/Mercedes_Benz_Greener_Manufacturing.csv A-R 359->10 174728.12932
reg_cat/Mercedes_Benz_Greener_Manufacturing.csv A/(R+1) 359->9 174846.635809
reg_cat/Mercedes_Benz_Greener_Manufacturing.csv Wasserstein 359->318 189125.782494
reg_cat/house_sales.csv built-in 17->16 91482.325955
reg_cat/house_sales.csv A-R 17->16 91295.138498
reg_cat/house_sales.csv A/(R+1) 17->16 91173.085911
reg_cat/house_sales.csv Wasserstein 17->17 91505.900609
reg_cat/nyc-taxi-green-dec-2016.csv built-in 16->12 12548.48941
reg_cat/nyc-taxi-green-dec-2016.csv A-R 16->16 12596.921199
reg_cat/nyc-taxi-green-dec-2016.csv A/(R+1) 16->16 12577.487796
reg_cat/nyc-taxi-green-dec-2016.csv Wasserstein 16->16 12633.541154
reg_cat/Allstate_Claims_Severity.csv built-in 124->106 905710139.82245
reg_cat/Allstate_Claims_Severity.csv A-R 124->124 906265053.899284
reg_cat/Allstate_Claims_Severity.csv A/(R+1) 124->106 905170209.556065
reg_cat/Allstate_Claims_Severity.csv Wasserstein 124->120 905989468.489141