This week our topic shifts to the classification concepts in chapter four. Therefore, answer the following questions:
- What are the various types of classifiers?
- What is a rule-based classifier?
- What is the difference between nearest neighbor and naïve Bayes classifiers?
- What is logistic regression?
Reply to at least two classmates’ responses by the date indicated. You should be actively engaging with weekly discussions by providing peer-to-peer feedback.
POST 1
Various Types of Classifiers
According to Tan et al. (2018), there are ten types of classifiers, grouped into five contrasting pairs: binary and multiclass, deterministic and probabilistic, linear and nonlinear, global and local, and generative and discriminative.
Rule-based Classifier
A rule-based classifier assigns a class label based on whether a record satisfies a certain condition. In programming, such rules are often written as if-then statements (Tan et al., 2018). A couple of examples include:
If “Warm-blooded” = yes and “flies” = no -> mammal
If “Warm-blooded” = yes and “flies” = yes -> bird
As the examples show, one can make a determination based upon certain criteria or rules.
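As a minimal sketch of how the two rules above could be written in code, the following Python keeps the rules in an ordered list and returns the label of the first rule that matches. The names `RULES`, `classify`, `warm_blooded`, and `flies`, as well as the "unknown" fallback, are assumptions made for illustration and do not come from Tan et al. (2018).

```python
# Hypothetical rule list mirroring the two example rules above.
# Each rule is (conditions, label); rules are tried in order and the first match wins.
RULES = [
    ({"warm_blooded": True, "flies": False}, "mammal"),
    ({"warm_blooded": True, "flies": True}, "bird"),
]


def classify(record, rules=RULES, default="unknown"):
    """Return the label of the first rule whose conditions all hold for the record."""
    for conditions, label in rules:
        if all(record.get(attr) == value for attr, value in conditions.items()):
            return label
    return default  # no rule fired (the fallback label is an assumption of this sketch)


print(classify({"warm_blooded": True, "flies": False}))   # mammal
print(classify({"warm_blooded": True, "flies": True}))    # bird
print(classify({"warm_blooded": False, "flies": False}))  # unknown
```

A record that matches no rule falls through to the default label, which is one common way to make a rule set cover every case.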
Difference Between Nearest Neighbor and Naïve Bayes Classifiers
Nearest neighbor and naïve Bayes are two very important classifiers, and both draw on relationships in the data. Still, there are very important differences between the two. The major difference is that a nearest neighbor classifier comes to a conclusion based on how closely an instance's attributes resemble those of known instances (Puchkin & Spokoiny, 2020). For example, if it oinks like a pig and looks like a pig, it is probably a pig: the conclusion follows from similarity on the observed attributes. Naïve Bayes, on the other hand, uses basic probability theory to compute the conditional probability of an event occurring (Tan et al., 2018). For example, if there is an 80 percent chance of a first event occurring and a 75 percent chance of a second event occurring given that the first has occurred, then the chance of both occurring is 0.80 × 0.75 = 0.60, or 60 percent.
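To make the nearest neighbor side of the comparison concrete, here is a rough sketch of a k-nearest-neighbor vote in plain Python. The toy training set, the two features (weight in kilograms and a 0/1 snout flag), and the names `TRAIN` and `knn_predict` are all hypothetical and chosen only to echo the pig example.

```python
import math
from collections import Counter

# Hypothetical toy data: ((weight_kg, has_snout), label).
TRAIN = [
    ((90.0, 1.0), "pig"),
    ((110.0, 1.0), "pig"),
    ((100.0, 1.0), "pig"),
    ((4.0, 0.0), "cat"),
    ((5.0, 0.0), "cat"),
]


def knn_predict(x, train=TRAIN, k=3):
    """Label x with the majority class among its k closest training points."""
    # Sort training points by Euclidean distance to x and keep the k nearest.
    neighbors = sorted(train, key=lambda item: math.dist(x, item[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


print(knn_predict((95.0, 1.0)))  # pig
print(knn_predict((6.0, 0.0)))   # cat
```

Note that no probabilities are estimated anywhere in this sketch; the decision comes entirely from which stored examples are closest.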
Logistic Regression
Logistic regression is used to estimate the odds of a data instance belonging to a class based upon the attributes that it contains (Song et al., 2021). Logistic regression has several clear characteristics: it is a discriminative model, it learns a different weight for each attribute, it does not involve computing densities or distances, it can handle irrelevant attributes, and it cannot handle data with missing values (Tan et al., 2018).
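A small sketch may help show how the per-attribute weights turn into odds and a probability. The weights, bias, and input values below are made-up numbers, not a fitted model, and the function names `sigmoid` and `predict_proba` are assumptions of this example.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x, weights, bias):
    """Probability of the positive class for attribute vector x."""
    # Linear score: each attribute contributes through its own weight.
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return sigmoid(z)


# Hypothetical parameters for two attributes (not learned from any data).
weights = [1.5, -0.5]
bias = -1.0

p = predict_proba([2.0, 1.0], weights, bias)  # score z = -1.0 + 3.0 - 0.5 = 1.5
odds = p / (1.0 - p)                          # equals exp(z) for this model
print(round(p, 3), round(odds, 3))
```

The odds equal the exponential of the weighted sum, which is why the model never needs to compute densities or distances.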
POST 2
1. The different types of classifiers are:
- Perceptron
- Naive Bayes
- Decision Tree
- Logistic Regression
- K-Nearest Neighbor
- Artificial Neural Networks/Deep Learning
- Support Vector Machine
2. Rule-based classifiers are static and don’t change based on new inputs or conditions. A frequent way of building a rule-based classifier is to first construct a decision tree and then post-process it into rules. A rule-based classifier is a sequence of logical predicates that are evaluated in order (e.g., if X is true and Y or Z is false, it’s a rabbit).
3. Naive Bayes assumes that each class follows a simple distribution in which the features are independent of one another. In the continuous case, it fits a Normal distribution with no correlation between features to each class and then makes a decision based on which class makes the instance most probable. Nearest neighbor, on the other hand, is not a probabilistic model. It is simply a “smoothed” version of the single-nearest-neighbor idea, where you return the ratio of each class among the k nearest neighbors. This assumes nothing about the data distribution (besides it being locally smooth).
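The naive Bayes half of this comparison can be sketched as follows, assuming continuous features and one Normal distribution per feature per class. The function names (`fit_gaussian_nb`, `log_gaussian`, `predict`), the toy data, and the small variance floor are assumptions made for illustration only.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate a class prior plus a per-feature mean and variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # The 1e-9 floor (an assumption of this sketch) keeps variances above zero.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(zip(*rows), means)]
        model[label] = (n / len(X), means, variances)
    return model


def log_gaussian(x, mean, var):
    """Log density of a one-dimensional Normal distribution at x."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)


def predict(model, x):
    """Pick the class with the highest log prior plus summed per-feature log density."""
    scores = {
        label: math.log(prior)
        + sum(log_gaussian(xi, m, v) for xi, m, v in zip(x, means, variances))
        for label, (prior, means, variances) in model.items()
    }
    return max(scores, key=scores.get)


# Hypothetical toy data with two well-separated classes.
X = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [5.0, 6.0], [5.2, 5.8], [4.8, 6.2]]
y = ["a", "a", "a", "b", "b", "b"]
model = fit_gaussian_nb(X, y)
print(predict(model, [1.1, 2.1]))  # a
print(predict(model, [5.1, 5.9]))  # b
```

Summing the per-feature log densities is where the independence assumption shows up: each feature contributes on its own, with no term for how features vary together.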
4. Logistic regression is a statistical technique used to predict the probability of a binary response based on one or more independent variables. That is, given certain factors, logistic regression is used to predict an outcome that has two values, such as 0 or 1, pass or fail, or yes or no.