Explain the Machine Learning Model in SQLFlow
This design doc describes how to support model explanation in SQLFlow, with SHAP as the backend, and how to display the visualization image to the user.
Train SQL:
SELECT * FROM train_table
WITH
plots = force
where:
- train_table is the table of training data.
- my_model is the trained model.
- force is the explain type.
- Enhance the SQLFlow parser to support the Explain keyword.
- Implement codegen_shap.go to generate a SHAP Python program. The Python program is executed by the SQLFlowExecutor module and prints the visualization image in HTML format to stdout. The Go program captures the stdout.
- For each Explain SQL request from the SQLFlow magic command, the SQLFlow server responds with the HTML text as a single message, and the visualization image is then displayed in Jupyter Notebook.
- For the current milestone, SQLFlow supports only DeepExplainer for Keras models and TreeExplainer for XGBoost. More explainers and model types will be supported in the future.
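
The flow described above (pick an explainer for the model type, generate a Python program, execute it, and capture the HTML it prints to stdout) can be sketched as follows. This is a minimal illustration in Python, not the SQLFlow implementation, which does the generation and capture in Go. The names `SUPPORTED_EXPLAINERS`, `select_explainer`, `generate_program`, and `run_program` are hypothetical, the model-type labels `keras` and `xgboost` are assumed keys, and the generated program only prints placeholder HTML instead of calling SHAP.

```python
import subprocess
import sys
from string import Template

# Explainers named for the current milestone. The dictionary keys are
# hypothetical labels for the model type, not SQLFlow identifiers.
SUPPORTED_EXPLAINERS = {"keras": "DeepExplainer", "xgboost": "TreeExplainer"}

# Hypothetical stand-in for the generated program. A real generated program
# would load the model and call SHAP; this one only prints placeholder HTML.
PROGRAM_TEMPLATE = Template(
    "html = '<div data-explainer=\"$explainer\" data-plot=\"$plot\">"
    "$model on $table</div>'\n"
    "print(html)\n"
)


def select_explainer(model_kind):
    """Return the explainer name for a model type, or raise if unsupported."""
    try:
        return SUPPORTED_EXPLAINERS[model_kind.lower()]
    except KeyError:
        raise ValueError(f"unsupported model type: {model_kind!r}") from None


def generate_program(table, model, model_kind, plot):
    """Fill the template with the statement's table, model, and plot type."""
    return PROGRAM_TEMPLATE.substitute(
        table=table, model=model, plot=plot,
        explainer=select_explainer(model_kind),
    )


def run_program(source):
    """Run the generated program in a child interpreter and return its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    program = generate_program("train_table", "my_model", "xgboost", "force")
    print(run_program(program))
```

Rejecting unsupported model types before generating any code keeps the milestone restriction in one place. The template substitution here does no escaping, so the sketch assumes plain identifiers as inputs.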