Victor Solis Victor Solis

Understanding Clustering Models: A Simple Guide with Examples

It is useful for a data science practitioner to understand how to apply clustering algorithms due to their ability to discern and group similar data points without prior labeling knowledge. Clustering is instrumental in revealing hidden patterns within data, aiding in customer segmentation and anomaly detection applications.

Read More
Statistics Victor Solis Statistics Victor Solis

Understanding Key Statistical Tests for Data Scientists

In statistics, understanding how to choose and apply the right test is crucial for analyzing data effectively. That is also true for data science use cases. A proper statistical test can help you determine if there is a significant difference between numerical or categorical variables. This blog post will delve into five fundamental statistical tests I have used throughout my data science and analytics career.

Read More
Explainable AI, Feature Selection Victor Solis Explainable AI, Feature Selection Victor Solis

Interpreting Machine Learning Models in Python with SHAP

In machine learning, understanding how models arrive at their predictions is crucial. A common way to determine feature contribution is by looking at feature importance. This measure is based on the decrease in model performance when removing a feature. It is a useful measure but contains no information beyond that importance.

Read More
NLP Victor Solis NLP Victor Solis

Topic Modeling Using LDA and Topic Coherence

Topic modeling is a powerful technique used in natural language processing to identify topics in a text corpus automatically. Latent Dirichlet Allocation (LDA) is one of the most popular topic modeling techniques, and in this tutorial, we'll explore how to implement it using the Gensim library in Python.

Read More
NLP Victor Solis NLP Victor Solis

Bert Transformer Text Similarity in Python

Sentence similarity involves determining how closely related two sentences are in meaning. It measures the similarity between two or more sentences using a similarity score. We often want to measure text similarity in various NLP tasks. Some of those tasks include information retrieval, text summarization, and question answering. One way of achieving that is by using a BERT Transformer model.

Read More
Machine Learning Victor Solis Machine Learning Victor Solis

Common Supervised Machine Learning Algorithms

Supervised machine learning is a branch of artificial intelligence that involves training algorithms to make predictions or classifications based on input data. We have a labeled dataset in supervised learning, meaning we have input features and corresponding output values. The goal is to train a model to predict output values for new, unseen input data accurately

Read More

Stay up to date