Table of Contents
- What is ML Modeling?
- What is Explainable AI?
- So why is it important to interpret the model result reports? How do we explain the models you train with B2ML Studio?
- What is the Regression Model?
- What is the Classification Model?
- What is the B2Metric ML Studio Explainable AI Module?
- B2Metric ML Studio Interpretable Model Results
- Decision Tree
- Sunburst
- Feature Importance
- Feature Dependency & Shapley Importance
- Regression Coefficient
- Lift Analysis
- Decision Plot
- B2Metric ML Studio Model Performance Results
- Compare Models
- Leaderboard
- Detailed Metrics
- Confusion Matrix
AutoML is a quick, easy, and understandable way to model your dataset and generate predictions. It has become a popular and intriguing concept for corporations in recent years.
So how do we present our AutoML solution to you and how do we do things differently at B2Metric?
B2ML Studio is an automated machine learning platform that takes your data through a simple "load dataset – select models and algorithms – get interpretable AI results" workflow.
So what is ML modeling…
A machine learning model learns from a dataset using various algorithms and then makes predictions on data it has never seen before. For example, in image processing we can teach a model to recognize objects such as cars, bicycles, and cats. Or you can send smart notifications based on user habits with a classification model: by looking at how long users spend in each news category, you can send them current news that is more likely to interest them.
What is Explainable AI?
Explainable Artificial Intelligence (XAI) is a set of processes and methods that enable human users to understand and trust the results and outputs of machine learning algorithms. Explainable AI is used to describe the AI model, its expected effects, and possible biases. This helps characterize model accuracy, fairness, transparency, and results in AI-powered decision-making. Explainable AI is important for companies to build trust when AI models go into production. AI accountability also helps organizations take a responsible approach to AI development.
As AI continues to evolve, humans face the challenge of understanding and retracing how algorithms reach their results. The entire computational process becomes a so-called "black box" that cannot be interpreted. These black-box models are built directly from the data, and even the engineers and data scientists who create the algorithms cannot explain what is happening internally or how the AI algorithm reached a particular result.
With B2ML explainable AI, you can debug and improve model performance, and help others understand your models' behavior.
So far we have covered what explainability and interpretability mean for machine learning models. With the B2ML Supervised Learning module, regression and classification problems are modeled with AutoML-based algorithms, and the models are made transparent with B2ML explainable AI.
So why is it important to interpret the model result reports? How do we explain the models you train with B2ML Studio?
Connect your databases and get BI reporting, AutoML modeling, and explainable AI results from your data in a single platform. Interpreting model results is the most important step toward gaining accurate insights. It gives us an idea of how variables affect our model and guides us to take the necessary actions. By interpreting these results, we can improve and develop the model. Publishing the AutoML model we built to the real world also depends on the correct interpretation of the model results.
What is the Regression Model?
Regression is the process of estimating the relationship between a dependent variable and independent variables. In simpler words, it means fitting a function from a selected family of functions to the sampled data under some error function. Regression analysis is one of the most well-known types of modeling used in machine learning. With the B2ML regression module, you can predict outcomes for future or unseen data points.
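To make this concrete, here is a minimal regression sketch in Python with scikit-learn. The synthetic data and coefficients are purely illustrative; B2ML Studio performs the equivalent steps for you without any code.

```python
# A minimal regression sketch; the data and feature count are illustrative.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(200, 2))                        # independent variables
y = 3.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(0, 1, 200)    # dependent variable

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
model = LinearRegression().fit(X_train, y_train)

# Predict outcomes for unseen ("future or pending") data points
print("R^2 on held-out data:", r2_score(y_test, model.predict(X_test)))
```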
What is the Classification Model?
The classification model reads inputs and produces outputs that categorize those inputs. For example, classification models can predict customer churn in B2ML Studio: you can categorize each customer as "churned" or "not churned". With a binary classification model, you generate predictions with yes–no or 0–1 logic. With multi-class classification, you can predict more than two categories; for example, the type of car (Sedan, Hatchback, SUV, Coupe) can be predicted by a multi-class model.
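Here is a hedged binary-classification sketch in the same churn spirit; the column names and tiny dataset are hypothetical, chosen only to show the yes–no prediction logic.

```python
# A minimal churn-style binary classification sketch (illustrative data).
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

df = pd.DataFrame({
    "monthly_visits":  [2, 15, 1, 22, 0, 18, 3, 25],
    "support_tickets": [5, 0, 4, 1, 6, 0, 3, 1],
    "churned":         [1, 0, 1, 0, 1, 0, 1, 0],   # 1 = churned, 0 = retained
})

X, y = df[["monthly_visits", "support_tickets"]], df["churned"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = RandomForestClassifier(random_state=0).fit(X_train, y_train)
print(clf.predict(X_test))  # 0/1 (yes–no) churn predictions
```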
B2ML Studio allows you to use the following algorithms for regression and classification modeling (a short comparison sketch follows the list):
- Support Vector Machine(SVM)
- LightGBM
- CatBoost
- XGBoost
- Extra Trees
- Random Forest
- Linear
- Decision Tree
- Baseline
- Linear Discriminant Analysis
- Neural Network
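As an illustrative sketch (not B2ML Studio's actual pipeline), here is how a few of these algorithm families could be compared on one dataset with scikit-learn. LightGBM, CatBoost, and XGBoost live in their own packages and are omitted here for brevity.

```python
# Trying several algorithm families on one dataset and comparing scores;
# B2ML Studio automates this search and scoring for you.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier, ExtraTreesClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_breast_cancer(return_X_y=True)
candidates = {
    "SVM": SVC(),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Extra Trees": ExtraTreesClassifier(random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "LDA": LinearDiscriminantAnalysis(),
}
for name, model in candidates.items():
    score = cross_val_score(model, X, y, cv=5).mean()  # 5-fold accuracy
    print(f"{name:15s} accuracy: {score:.3f}")
```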
What is the B2Metric ML Studio Explainable AI Module?
In addition to no-code AutoML modeling, the B2Metric ML Studio platform also interprets the descriptive model results of the developed models for you.
B2Metric ML Studio Interpretable Model Results
Decision Tree
Decision trees explain the model's decision logic by converting the data distribution learned during modeling into simple rule sets. By following these rule sets, you can find out which situations work for or against your business model.
So, what do B2Metric decision trees tell you?
A decision tree is interpreted by following the paths that lead to nodes which separate the target classes with high purity. To find pure nodes, look for the nodes whose bars show the largest difference between class rates.
- Colors are used to indicate the density of classes.
- B2M Main Node tells you what percentage of your target audience belongs to which class in the classification problem and how many instances your data set consists of.
- B2M Child Node displays the density of target classes at a particular node in the decision tree by providing a horizontal bar colored by target classes.
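For a feel of what "rule sets" means, here is a minimal sketch that extracts human-readable rules from a fitted tree using scikit-learn's export_text on the bundled iris data; B2ML Studio renders these as an interactive tree report instead.

```python
# Turning a fitted decision tree into readable rule sets.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# Each path from the root to a leaf is one human-readable rule
print(export_text(tree, feature_names=load_iris().feature_names))
```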
Sunburst
Sunburst shows a hierarchy through a series of rings, each sliced for every category node. Each ring corresponds to a level in the hierarchy, with the central circle representing the root node and the hierarchy moving outwards from it. Rings are sliced up and divided based on their hierarchical relationship to the parent slice. The angle of each slice is either divided equally under its parent node or made proportional to a value. Color can be used to highlight hierarchical groupings or specific categories.
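For illustration, here is a minimal sunburst sketch with Plotly Express on its bundled tips dataset; the hierarchy columns are arbitrary choices for the demo, not B2ML Studio's output.

```python
# A sunburst chart: rings move outward from the root through the hierarchy.
import plotly.express as px

df = px.data.tips()  # sample dataset bundled with Plotly
# Inner ring: day; outer rings: time of day, then smoker status.
# Slice angles are proportional to the summed "total_bill" values.
fig = px.sunburst(df, path=["day", "time", "smoker"], values="total_bill")
fig.show()
```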
Feature Importance
Choosing the appropriate features is one of the most important steps toward a correct model. Feature importance is a visualization that guides suitable feature selection: we can read the relationship between the selected target and the features. By choosing the right features, we can get much higher scores and make better predictions. Seeing the values on the Feature Importance screen also suggests changes to make to our data, and by interpreting these representations we can remove irrelevant features from our model.
The feature importance list ranks the variables that affect the model's results and best describe the underlying meaning of the data. By looking at the variables in this list, we can classify, prioritize, and interpret the factors that affect our business cycle.
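As a minimal sketch of the same idea, here is how a feature-importance ranking can be read from a tree ensemble with scikit-learn; the dataset is a bundled example, not B2ML Studio's internals.

```python
# Ranking features by their contribution to a tree-ensemble model.
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()
model = RandomForestClassifier(random_state=0).fit(data.data, data.target)

importances = pd.Series(model.feature_importances_, index=data.feature_names)
print(importances.sort_values(ascending=False).head(5))  # top features
# Features near the bottom of this ranking are candidates for removal.
```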
Feature Dependency & Shapley Importance
Shapley Importance describes how much each feature contributes to the predicted value.
For example, suppose a municipal institution wants to build a metro line. The parameters that matter for establishing this line are "whether there are other subways nearby", "the human population in that area", "the cost of building a subway line there", and "the traffic density in that area". Suppose the model predicts whether the subway will be built. In such predictive modeling, the Shapley Importance graph lets us read which feature contributes positively or negatively to the model.
Feature dependency shows the list of variables that affect model decisions. SHAP values are a convenient, (mostly) model-agnostic method of explaining a model’s output, or a feature’s impact on a model’s output. They provide a means to estimate and demonstrate how each feature’s contribution influences the model.
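One common open-source way to compute such Shapley-style contributions is the `shap` package; the sketch below is illustrative, and B2ML Studio's internal implementation may differ.

```python
# Shapley-style per-feature contributions with the open-source `shap` package.
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

X, y = load_diabetes(return_X_y=True, as_frame=True)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)   # fast explainer for tree models
shap_values = explainer.shap_values(X)  # one contribution per feature per row
shap.summary_plot(shap_values, X)       # shows direction and size of each impact
```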
Regression Coefficient
A regression coefficient is a statistical measure of the average functional relationship between variables; it measures the degree to which one variable depends on the other(s).
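As a short sketch, here is how regression coefficients can be read from a fitted linear model; each coefficient is the average change in the target per one-unit change in that variable, holding the others fixed.

```python
# Reading the fitted coefficients of a linear regression model.
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression

X, y = load_diabetes(return_X_y=True, as_frame=True)
model = LinearRegression().fit(X, y)

for name, coef in zip(X.columns, model.coef_):
    print(f"{name:10s} {coef:+.2f}")  # sign shows direction of the relationship
```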
Lift Analysis
Lift analysis allows you to read situations such as the acceptance rate of a campaign, the likelihood that a customer will churn, or whether a credit customer should receive a loan in the future based on their past credit status.
The lift value of an association rule is the ratio of the rule's confidence to its expected confidence. The expected confidence of a rule is defined as the product of the support values of the rule body and the rule head, divided by the support of the rule body. The confidence value is defined as the support of the combined rule body and rule head, divided by the support of the rule body.
The lift curve shows how much better the model performs than a random model. Lift is defined as the ratio of the model's cumulative gain to the cumulative gain of a random model, so a random model's lift is always 1.
This relative performance accounts for the fact that classification gets harder as the number of classes grows (a random model correctly predicts a smaller share of samples in a 10-class dataset than in a 2-class dataset). In general, a good model's lift curve sits high above the x-axis: where the model is most confident in its predictions, it performs many times better than random guessing.
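Here is a hedged sketch of that computation: sort the test set by predicted probability, then compare the positive rate in the top fraction against the overall rate (the random baseline, lift = 1). The dataset and model are illustrative.

```python
# Computing cumulative lift at a few depths of the ranked predictions.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
proba = model.predict_proba(X_te)[:, 1]

order = np.argsort(proba)[::-1]   # most confident predictions first
y_sorted = y_te[order]
for frac in (0.1, 0.2, 0.5, 1.0):
    k = int(len(y_sorted) * frac)
    lift = y_sorted[:k].mean() / y_te.mean()  # positive rate vs. random baseline
    print(f"top {frac:.0%}: lift = {lift:.2f}")
```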
Decision Plot
A decision surface plot is a powerful tool for understanding how a given model "sees" the prediction task. It also explains how the model decided on a specific data segment. For example, the given plot shows the model's decision for the Passenger Id <= 23 data segment.
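For illustration, here is a minimal decision-surface sketch over two features using scikit-learn's DecisionBoundaryDisplay (available in scikit-learn 1.1+); the dataset and model are stand-ins, not B2ML Studio's plot.

```python
# Plotting a model's decision surface over two features.
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.inspection import DecisionBoundaryDisplay
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X2 = X[:, :2]  # keep two features so the surface can be drawn in 2D
model = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X2, y)

# Shaded regions show which class the model predicts at each point
disp = DecisionBoundaryDisplay.from_estimator(model, X2, alpha=0.4)
disp.ax_.scatter(X2[:, 0], X2[:, 1], c=y, edgecolor="k")
plt.show()
```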
B2Metric ML Studio Model Performance Results
Compare Models
The Compare Models screen shows model scores and the Champion and Challenger models. For each algorithm and each score type, it lists the scores our model generated. You can filter by model type and metric type to see the results you need.
Leaderboard
On the Leaderboard screen, you can read the values each algorithm achieved on the performance metric you selected when training your model, along with the training times.
Detailed Metrics
On the Detailed Metrics screen, you can see metric scores per model. You can also read information about the target column, such as its type, number of rows, minimum and maximum values, standard deviation, and number of unique values.
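For reference, here is a small sketch of the kind of column summary that screen presents, computed with pandas on a made-up frame.

```python
# Column summaries like those on the Detailed Metrics screen (illustrative data).
import pandas as pd

df = pd.DataFrame({"age": [23, 35, 41, 35, 52], "city": list("ABAAC")})
print(df["age"].agg(["min", "max", "std"]))   # numeric column summary
print(df["city"].nunique(), "unique values")  # cardinality of a category column
print(len(df), "rows")
```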
Confusion Matrix
A confusion matrix is a table that is often used to describe the performance of a classification model (or "classifier") on a set of test data for which the true values are known. The confusion matrix itself is relatively simple to understand, but the related terminology can be confusing. Let's understand TP, FP, FN, and TN in terms of a pregnancy analogy, where the positive class is "pregnant".
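The sketch below spells out that analogy with scikit-learn on hand-written labels: a true positive (TP) is a pregnant person predicted pregnant, a false positive (FP) is a not-pregnant person wrongly predicted pregnant, and so on.

```python
# Confusion-matrix cells under the pregnancy analogy (1 = pregnant, 0 = not).
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 0, 1, 0]   # actual condition
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]   # model prediction

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp} (pregnant, predicted pregnant)")
print(f"TN={tn} (not pregnant, predicted not pregnant)")
print(f"FP={fp} (not pregnant, but predicted pregnant)")
print(f"FN={fn} (pregnant, but predicted not pregnant)")
```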
If you are interested, you can find much more information on our blog.
B2Metric has been recognized by Gartner, the world’s leading research and advisory company. You can read B2Metric’s reviews on Gartner Peer Insights here.
#automl #machinelearning #b2metric #artificialintelligence #B2MetricML