Text classification is one of the fundamental tasks in Natural Language Processing (NLP), the field concerned with processing and analyzing large amounts of natural language data. Text is an abundant data type, but its raw, unstructured form generally makes it difficult to extract meaningful insights. When the text comes from customers, it can give companies direct feedback that affects business decisions. As a result, businesses devote considerable resources to structuring, processing, and analyzing this type of data.
It is typical for a course on probability to mention at some point that statisticians divide themselves into two camps: frequentists and Bayesians. On hearing this for the first time, I recall thinking how odd it was that something could divide an entire field of study. Of course, there are reasons this difference exists. This post looks at the reasons behind the split and its practical effect on an example chosen to contrast the two approaches.