...
- `numpy>=2.0.0` compatibility: replace instances of `x.ptp()` with `np.ptp(x)` and `np.Inf` with `np.inf` (#571)
- added a way to pass `sample_weight` to loss functions in `model_parts()` (variable importance) using `weights` from `dx.Explainer` (#563)
- fixed the visualization of `shap_wrapper` for `shap==0.45.0`
- increase the dependencies to `python>=3.8`, `pandas>=1.5.0`, `numpy>=1.23.3` and add `python==3.11` to CI
- added `keras.src.models.sequential.Sequential` to classes with a known `predict_function`; it should fix changes in `keras==3.0.0` and `tensorflow==2.16.0`
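
The NumPy 2.0 compatibility change listed above can be illustrated with a minimal sketch. The array `x` and the variable names are made up for the example; only `np.ptp` and `np.inf` come from the entry itself. Both spellings below work on NumPy 1.x and 2.x, whereas the `x.ptp()` method and the `np.Inf` alias were removed in NumPy 2.0.

```python
import numpy as np

# Illustrative data, not taken from dalex.
x = np.array([3.0, 7.5, 1.5, 9.0])

# Function form of peak-to-peak (max minus min); replaces the removed x.ptp() method.
value_range = np.ptp(x)

# Lowercase spelling; replaces the removed np.Inf alias.
upper_bound = np.inf

print(value_range)
print(value_range < upper_bound)
```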
- turn off `verbose` in the predict method of tensorflow/keras models that changed in `tensorflow>=2.9.0`
- update the warning occurring when specifying `variable_splits` (#558)
- fix an error occurring in `predict_profile()` when a DataFrame has MultiIndex in `pandas>=1.3.0` (#550)
- fix gaussian `norm()` calculation in `model_profile()` from `pi*sqrt(2)` to `sqrt(2*pi)`
- fix a warning (future error) between `prepare_numerical_categorical()` and `prepare_x()` with `pandas==2.1.0`
- fix a warning (future error) concerning the default value of `numeric_only` in `pandas.DataFrame.corr()` in `dalex.aspect.calculate_assoc_matrix()`
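
To see why the normalizing constant in the gaussian `norm()` entry above matters, here is a small check. It assumes `norm()` denotes a standard normal density, which the entry does not state explicitly; the helper names are hypothetical and this is not the dalex implementation. A density divided by `sqrt(2*pi)` integrates to 1, while dividing by `pi*sqrt(2)` gives an area of `1/sqrt(pi)`.

```python
import math

def gaussian_density(x):
    # Standard normal density with the corrected constant sqrt(2*pi).
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

def gaussian_density_old(x):
    # Same kernel with the earlier constant pi*sqrt(2), kept for comparison.
    return math.exp(-0.5 * x * x) / (math.pi * math.sqrt(2))

def integrate(f, lo=-10.0, hi=10.0, steps=20000):
    # Composite trapezoid rule on [lo, hi].
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * h)
    return total * h

area_correct = integrate(gaussian_density)
area_old = integrate(gaussian_density_old)
print(round(area_correct, 6), round(area_old, 6))
```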
- add `ZeroDivisionError` to precision and recall functions (#532)
- add a warning to `calculate_depend_matrix()` when there is a variable with only one value (#537)
- fix missing EDA plots in (Python) Arena (#544)
- fix baseline positions in the subplots of the predict parts explanations: BreakDown, Shap (#545)
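
One way the precision and recall entry above could look in practice is sketched below. The function signatures and the choice to return `NaN` when the denominator is zero are assumptions made for illustration; the entry only says that `ZeroDivisionError` is now covered.

```python
import math

def precision(tp, fp):
    # Share of predicted positives that are true positives.
    try:
        return tp / (tp + fp)
    except ZeroDivisionError:
        # Assumption: no predicted positives, so the value is reported as NaN.
        return math.nan

def recall(tp, fn):
    # Share of actual positives that were predicted as positive.
    try:
        return tp / (tp + fn)
    except ZeroDivisionError:
        # Assumption: no actual positives, so the value is reported as NaN.
        return math.nan

print(precision(3, 1), recall(2, 2), precision(0, 0))
```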
This release consists of mostly maintenance updates and, after a year, marks the Beta -> Stable release.
- increase the dependency from `python>=3.6` to `python>=3.7` (at this moment, both `numpy` and `pandas` depend on `python>=3.8`), and add `python==3.10` to CI
- increase the dependencies to `pandas>=1.2.5`, `numpy>=1.20.3` (#526), `scipy>=1.6.3`, `plotly>=5.1.0`, and `tqdm>=4.61.2` due to errors with `pandas` (see tqdm/#1199)
- remove the use of `pd.Series.append()` (#489)
- remove the use of `np.isnan` causing error in `dalex.fairness` (#491)
- fix iBreakDown plot y-axis labels (#493)
- stop the Arena's `werkzeug` server using a cleaner and still supported API (#518)
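
For readers making the same change as the `pd.Series.append()` entry above, a common replacement is `pd.concat`. The sketch below uses made-up series and is not taken from the dalex code base; `Series.append` was deprecated in pandas 1.4 and removed in pandas 2.0.

```python
import pandas as pd

# Illustrative data, not taken from dalex.
s = pd.Series([1, 2])
extra = pd.Series([3])

# Previously: s.append(extra, ignore_index=True)
combined = pd.concat([s, extra], ignore_index=True)

print(combined.tolist())
```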
- added fairness plot for regression models to `Arena` (dalex/#408)
- added new `facet_scales` parameter to `AP.plot` and `CP.plot`, which allows freeing the y-axis with `facet_scales="free"` (dalex/#469); consistent with R (DALEX/#468, ingredients/#140)
- fixed `AP` and `CP` progress bars
- added new `aspect` module, which will focus on groups of dependent variables @krzyzinskim & @arturzolkowski
- added new `scipy>=1.5.4` dependency
- improved the calculation of AUC, ROC plot (#459)
- wrong yaxis labels in `VariableImportance.plot(split="variable")` (#451)
- `_repr_html_()` didn't work for explanation objects before using the `fit` method (#449)
- added new `Aspect` object with the `predict_triplot`, `model_triplot`, `predict_parts`, `model_parts`, `get_aspects` methods
- added new `PredictTriplot`, `ModelTriplot`, `PredictAspectImportance`, `ModelAspectImportance` objects with the `plot` method
- added bias mitigation techniques (`resample`, `reweight`, `roc_pivot`) into the `fairness` module (#432)
- method `set_options` in Arena now takes `option_category` instead of `plot_type` (`SHAPValues` => `ShapleyValues`, `FeatureImportance` => `VariableImportance`) (#420)
- methods using the `N` parameter now properly sample rows from `data`
- fixed wrong error value when no `predict_function` is found in `Explainer` (77ca90d)
- set multiprocessing context to `'spawn'` (#412)
- fixed bug in `metric_scores` plot that made only one subgroup appear on y-axis (#416)
- added support for older keras models (#415)
- added a resource mechanism to Arena (#419)
- added `ShapleyValuesImportance` and `ShapleyValuesDependence` plots to Arena (#420)
- return `error` instead of `NaN` when AUC is calculated on observations from one class only (#415)
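
The single-class AUC entry above can be sketched as follows. This is a plain pairwise (Mann-Whitney style) AUC written for illustration, not the dalex implementation; the function name `auc_score` and the use of `ValueError` are assumptions.

```python
def auc_score(y_true, y_score):
    # Split scores by class label (1 = positive, 0 = negative).
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        # With one class only, AUC is undefined: raise instead of returning NaN.
        raise ValueError("AUC is undefined when only one class is present")
    # Count positive-negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

print(auc_score([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```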
- fixed concurrent random seeds when `processes > 1` (#392), which means that the results of parallel computation will vary between `v1.1.0` and previous versions
- `GroupFairnessX.plot(type='fairness_check')` generates ticks according to the x-axis range (#409)
- `GroupFairnessRegression.plot(type='density')` has a more readable hover - only for outliers (#409)
- `BreakDown.plot()` wrongly displayed the "+all factors" bar when `max_vars < p` (#401)
- `GroupFairnessClassification.plot(type='metric_scores')` did not handle `NaN`'s (#399)
- Experimental support for regression models in the `fairness` module. Added `GroupFairnessRegression` object, with the `plot` method having two types: `fairness_check` and `density`. `Explainer.model_fairness` method now depends on the `model_type` attribute. (#391)
- added `N` parameter to the `predict_parts` method which is `None` by default (#402)
- `epsilon` is now an argument of the `GroupFairnessClassification` object (#397)
- fixed broken range on `yaxis` in `fairness_check` plot (#376)
- warnings because `np.float` is deprecated since `numpy` v1.20 (#384)
- added `ipython` to test dependencies
These are summed up in (#368):
- rename modules: `dataset_level` into `model_explanations`, `instance_level` into `predict_explanations`, `_arena` module into `arena`
- use `__dir__` method to define autocompletion in IPython environment - show only `['Explainer', 'Arena', 'fairness', 'datasets']`
- add `plot` method and `result` attribute to `LimeExplanation` (use `lime.explanation.Explanation.as_pyplot_figure()` and `lime.explanation.Explanation.as_list()`)
- `CeterisParibus.plot(variable_type='categorical')` now has horizontal barplots - `horizontal_spacing=None` by default (varies on `variable_type`). Also, once again added the "dot" for observation value.
- `predict_fn` in `predict_surrogate` now uses `predict_function` (trying to make it work for more frameworks)
- fixed wrong verbose output when any value in `y_hat/residuals` was an `int` not `float`
- added proper `"-"` sign to negative dropout losses in `VariableImportance.plot`
- added `geom='bars'` to `AggregatedProfiles.plot` to force the categorical plot
- added `geom='roc'` and `geom='lift'` to `ModelPerformance.plot`
- added Fairness plot to Arena
- remove `colorize` from `Explainer`
- updated the documentation, refactored code (import modules not functions, unify variable names in `object.py`, move utils functions from `checks.py` to `utils.py`, etc.)
- added license notice next to data
- added support for `h2o.estimators.*` (#332)
- added `tensorflow.python.keras.engine.functional.Functional` to the `tensorflow` list
- updated the `plotly` dependency to `>=4.12.0`
- code maintenance: `yhat`, `check_data`
- fixed `check_if_empty_fields()` used in loading the `Explainer` from a pickle file, since several checks were changed
- fixed `plot()` method in `GroupFairnessClassification` as it omitted plotting a metric when `NaN` was present in metric ratios (result)
- fixed `dragons` and `HR` datasets having `,` delimiter instead of `.`, which transformed numerical columns into categorical.
- fixed representation of the `ShapWrapper` class (removed `_repr_html_` method)
- allow for `y` to be a `pandas.DataFrame` (converted)
- allow for `data`, `y` to be a `H2OFrame` (converted)
- added `label` parameter to all the relevant `dx.Explainer` methods, which overrides the default label in explanation's `result`
- now using `GradientExplainer` for `tf.keras.engine.sequential.Sequential`, added proper warning when `shap_explainer_type` is `None` (#366)
- unify verbose output of `Explainer`
- added new `arena` module, which adds the backend for Arena dashboard @piotrpiatyszek
- added new aliases to `dx.Explainer` methods (#350): in `model_parts` it is `{'permutational': 'variable_importance', 'feature_importance': 'variable_importance'}`, in `model_profile` it is `{'pdp': 'partial', 'ale': 'accumulated'}`
- added `Arena` object for dashboard backend. See https://github.com/ModelOriented/Arena
- new `fairness` plot types: `stacked`, `radar`, `performance_and_fairness`, `heatmap`, `ceteris_paribus_cutoff`
- upgraded `fairness_check()`
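
The alias dictionaries listed above for `model_parts` and `model_profile` suggest a simple lookup step. The sketch below shows one way such a mapping could be applied; the constant names and the `resolve_type` helper are hypothetical, and only the dictionary contents come from the entry.

```python
# Alias tables copied from the changelog entry.
MODEL_PARTS_ALIASES = {
    'permutational': 'variable_importance',
    'feature_importance': 'variable_importance',
}
MODEL_PROFILE_ALIASES = {'pdp': 'partial', 'ale': 'accumulated'}

def resolve_type(type_name, aliases):
    # Map an alias to its canonical name; pass other names through unchanged.
    return aliases.get(type_name, type_name)

print(resolve_type('pdp', MODEL_PROFILE_ALIASES))
print(resolve_type('feature_importance', MODEL_PARTS_ALIASES))
print(resolve_type('partial', MODEL_PROFILE_ALIASES))
```

Under this reading, passing `type='pdp'` to `model_profile` would be treated the same as `type='partial'`.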
- added new `fairness` module, which will focus on bias detection, visualization and mitigation @jakwisn
- removed unnecessary warning when `precalculate=False and verbose=False` (#340)
- added `model_fairness` method to the `Explainer`, which performs fairness explanation
- added `GroupFairnessClassification` object, with the `plot` method having two types: `fairness_check` and `metric_scores`
- added the `N=50000` argument to `ResidualDiagnostics.plot`, which samples observations from the `result` parameter to omit performance issues when `smooth=True` (#341)
- added support for `tensorflow.python.keras.engine.sequential.Sequential` and `tensorflow.python.keras.engine.training.Model` (#326)
- updated the `tqdm` dependency to `>=4.48.2`, `pandas` dependency to `>=1.1.2` and `numpy` dependency to `>=1.18.4`
- fixed the wrong order of `Explainer` verbose messages
- fixed a bug that caused `model_info` parameter to be overwritten by the default values
- fixed a bug occurring when the variable from `groups` was not of `str` type (#327)
- fixed `model_profile`: `variable_type='categorical'` not working when user passed `variables` parameter (#329) + the reverse order of bars in `'categorical'` plots + (again) added `variable_splits_type` parameter to `model_profile` to specify how grid points shall be calculated (#266) + allow for both `'quantile'` and `'quantiles'` types (alias)
- added informative error messages when importing optional dependencies (#316)
- allow for `data` and `y` to be `None` - added checks in `Explainer` methods
- wrong parameter name `title_x` changed to `y_title` in `CeterisParibus.plot` and `AggregatedProfiles.plot` (#317)
- now warning the user in `Explainer` when `predict_function` returns an error or doesn't return `numpy.ndarray (1d)` (#325)
- updated the `pandas` dependency to `>=1.1.0`
- `ModelPerformance.plot` now uses a drwhy color palette
- use `unique` method instead of `np.unique` in `variable_splits` (#293)
- `v0.2.0` didn't export new datasets
- fixed a bug where `predict_parts(type='shap')` calculated wrong `contributions` (#300)
- `model_profile` uses observation mean instead of profile mean in `_yhat_` centering
- fixed barplot baseline in categorical `model_profile` and `predict_profile` plots (#297)
- fixed `model_profile(type='accumulated')` giving wrong results (#302)
- vertical/horizontal lines in plots now end on the plot edges
- added new `type='shap_wrapper'` to `predict_parts` and `model_parts` methods, which returns a new `ShapWrapper` object. It contains the main result attribute (`shapley_values`) and the plot method (`force_plot` and `summary_plot` respectively). These come from the shap package
- `Explainer.predict` method now accepts `numpy.ndarray`
- added the `ResidualDiagnostics` object with a `plot` method
- added `model_diagnostics` method to the `Explainer`, which performs residual diagnostics
- added `predict_surrogate` method to the `Explainer`, which is a wrapper for the `lime` tabular explanation from the lime package
- added `model_surrogate` method to the `Explainer`, which creates a basic surrogate decision tree or linear model from the black-box model using the scikit-learn package
- added a `_repr_html_` method to all of the explanation objects (it prints the `result` attribute)
- added `dalex.__version__`
- added informative error messages in `Explainer` methods when `y` is of wrong type (#294)
- `CeterisParibus.plot(variable_type='categorical')` now allows for multiple observations
- new verbose checks for `model_type`
- add `type` to `model_info` in `dump` and `dumps` for R compatibility (#303)
- `ModelPerformance.result` now has `label` as index
- removed `_grid_` column in `AggregatedProfiles.result` and `center` only works with `type=accumulated`
- use `Pipeline._final_estimator` to extract `model_class` of the actual model
- use `model._estimator_type` to extract `model_type` if possible
- major documentation update (#270)
- unified the order of function parameters
- `v0.1.9` had wrong `_original_` column in `predict_profile`
- `vertical_spacing` acts as intended in `VariableImportance.plot` when `split='variable'`
- `loss_function='auc'` now uses `loss_one_minus_auc` as this should be a descending measure
- plots are now saved with the original height and width
- `model_profile` now properly passes the `variables` parameter to `CeterisParibus`
- `variables` parameter in `predict_profile` now can also be a string
- use `plotly.express` instead of core `plotly` to make `model_profile` and `predict_profile` plots; thus, enhance performance and scalability
- added `verbose` parameter where `tqdm` is used to display the progress bar
- added `loss_one_minus_auc` function that can be used with `loss_function='1-auc'` in `model_parts`
- added new example data sets: `apartments`, `dragons` and `hr`
- added `color`, `opacity`, `title_x` parameters to `model_profile` and `predict_profile` plots (#236), changed tooltips and legends (#262)
- added `geom='profiles'` parameter to `model_profile` plot and `raw_profiles` attribute to `AggregatedProfiles`
- added `variable_splits_type` parameter to `predict_profile` to specify how grid points shall be calculated (#266)
- added `variable_splits_with_obs` parameter to `predict_profile` function to extend split points with observation variable values (#269)
- added `variable_splits` parameter to `model_profile`
- use different `loss_function` for classification and regression (#248)
- models that use `proba` yhats now get `model_type='classification'` if it's not specified
- use uniform way of grid points calculation in `predict_profile` and `model_profile` (see `variable_splits_type` parameter)
- add the variable values of `new_observation` to `variable_splits` in `predict_profile` (see `variable_splits_with_obs` parameter)
- use `N=1000` in `model_parts` and `N=300` in `model_profile` to comply with the R version
- `keep_raw_permutation` is now set to `False` instead of `None` in `model_parts`
- `intercept` parameter in `model_profile` is now named `center`
- feature: added `random_state` parameter for `predict_parts(type='shap')` and `model_profile` for reproducible calculations
- fix: fixed `random_state` parameter in `model_parts`
- feature: multiprocessing added for: `model_profile`, `model_parts`, `predict_profile` and `predict_parts(type='shap')`, through the `processes` parameter
- fix: significantly improved the speed of `accumulated` and `conditional` types in `model_profile`
- bugfix: use `pd.api.types.is_numeric_dtype()` instead of `np.issubdtype()` to cover more types; e.g. it caused errors with `string` type
- defaults: use `pd.convert_dtypes()` on the result of `CeterisParibus` to fix variable dtypes and later allow for a concatenation without the dtype conversion
- fix: `variables` parameter now can be a single `str` value
- fix: number rounding in `predict_parts`, `model_parts` (#245)
- fix: CP calculations for models that take only variables as an input
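
The dtype check mentioned in the `is_numeric_dtype()` entry above can be sketched as follows. The DataFrame and its column names are invented for the example; the point is that the pandas helper also accepts pandas extension dtypes such as `string`.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Illustrative frame mixing numeric columns with a pandas "string" column.
df = pd.DataFrame({
    "age": [22, 35, 58],
    "fare": [7.25, 71.3, 8.05],
    "name": pd.Series(["a", "b", "c"], dtype="string"),
})

# Select numeric columns without special-casing extension dtypes.
numeric_columns = [col for col in df.columns if is_numeric_dtype(df[col])]

print(numeric_columns)
```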
- bugfix: `variable_splits` parameter now works correctly in `predict_profile`
- bugfix: fix baseline for 3+ models in `AggregatedProfiles.plot` (#234)
- printing: now rounding numbers in `Explainer` messages
- fix: minor checks fixes in `instance_level`
- bugfix: `AggregatedProfiles.plot` now works with `groups`
- feature: parameter `N` in `model_profile` can be set to `None`, to select all observations
- input: `groups` and `variable` parameters in `model_profile` can be: `str`, `list`, `numpy.ndarray`, `pandas.Series`
- fix: `check_label` returned only a first letter
- bugfix: removed the conversion of `all_variables` to `str` in `prepare_all_variables`, which caused an error in `model_profile` (#214)
- defaults: change numpy data variable names from numbers to strings
- fix: change `short_name` encoding in `fifa` dataset (utf8->ascii)
- fix: remove `scipy` dependency
- defaults: default `loss_root_mean_square` in model parts changed to `rmse`
- bugfix: checks related to `new_observation` in `BreakDown, Shap, CeterisParibus` now work for multiple inputs (#207)
- bugfix: `CeterisParibus.fit` and `CeterisParibus.plot` now work for more types of `new_observation.index`, but won't work for a `boolean` type (#211)
- feature: add `xgboost` package compatibility (#188)
- feature: added `model_class` parameter to `Explainer` to handle wrapped models
- feature: `Explainer` attribute `model_info` remembers if parameters are default
- bugfix: `variable_groups` parameter now works correctly in `model_parts`
- fix: changed parameter order in `Explainer`: `model_type`, `model_info`, `colorize`
- documentation: `model_parts` documentation is updated
- feature: new `show` parameter in `plot` methods that (if `False`) returns `plotly Figure` (#190)
- feature: `load_fifa()` function which loads the preprocessed players_20 dataset
- fix: `CeterisParibus.plot` tooltip
- feature: new `Explainer.residual` method which uses `residual_function` to calculate `residuals`
- feature: new `dump` and `dumps` methods for saving `Explainer` in a binary form; `load` and `loads` methods for loading `Explainer` from binary form
- fix: `Explainer` constructor verbose text
- bugfix: `B:=B+1` - `Shap` now stores average results as `B=0` and path results as `B=1,2,...`
- bugfix: `Explainer.model_performance` method uses `self.model_type` when `model_type` is `None`
- bugfix: values in `BreakDown` and `Shap` are now rounded to 4 significant places (#180)
- bugfix: `Shap` by default uses `path='average'`, `sign` column is properly updated and bars in `plot` are sorted by `abs(contribution)`
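
Rounding to 4 significant places, as mentioned in the `BreakDown` and `Shap` entry above, differs from rounding to 4 decimal places. A minimal sketch, assuming a plain Python float input and a hypothetical helper name `round_sig` (this is not the dalex implementation):

```python
def round_sig(value, digits=4):
    # Format with the requested number of significant digits, then parse back.
    return float(f"{value:.{digits}g}")

print(round_sig(123456.789))
print(round_sig(0.000123456))
print(round_sig(-2.71828))
```

Unlike `round(value, 4)`, this keeps the same relative precision for very small and very large contributions.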
- release of the `dalex` package
- `Explainer` object with `predict`, `predict_parts`, `predict_profile`, `model_performance`, `model_parts` and `model_profile` methods
- `BreakDown`, `Shap`, `CeterisParibus`, `ModelPerformance`, `VariableImportance` and `AggregatedProfiles` objects with a `plot` method
- `load_titanic()` function which loads the `titanic_imputed` dataset