Classifying Instagram profiles by gender

The purpose of this project is to build a model that, given an Instagram user’s profile, predicts their gender as accurately as possible. The motivation is to target Instagram users in specific demographics for marketing. The model is a tuned logistic regression classifier trained on labeled, text-based profile data. Its parameters are optimized against the AUROC metric to reduce variability in the precision and recall of predictions for each gender. The resulting model achieves 90% overall accuracy on a dataset of 20,000, though recall still differs substantially between the genders.
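
To make the metrics concrete, here is a minimal sketch of how overall accuracy, per-gender recall, and AUROC relate to one another. It is not the project's code: the labels, scores, 0.5 threshold, and function names below are all made up for illustration, and the real model would supply the scores. The toy numbers are chosen only to show that a high overall accuracy can sit alongside noticeably different recall for each class.

```python
# Hypothetical toy data: true labels and the model's predicted probability of "f".
y_true = ["f", "f", "f", "f", "f", "f", "m", "m", "m", "m"]
scores = [0.90, 0.80, 0.70, 0.60, 0.55, 0.52, 0.65, 0.30, 0.20, 0.10]

# Assumed decision threshold of 0.5 on the "f" probability.
y_pred = ["f" if s >= 0.5 else "m" for s in scores]


def accuracy(y_true, y_pred):
    """Fraction of all examples that were labeled correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, label):
    """Fraction of the examples that truly belong to `label` that were predicted as `label`."""
    preds_for_label = [p for t, p in zip(y_true, y_pred) if t == label]
    return sum(p == label for p in preds_for_label) / len(preds_for_label)


def auroc(y_true, scores, positive):
    """Probability that a random positive example scores above a random negative one.

    Ties count as half a win (the pairwise, Mann-Whitney form of AUROC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


overall_accuracy = accuracy(y_true, y_pred)   # 0.9
recall_f = recall(y_true, y_pred, "f")        # 1.0
recall_m = recall(y_true, y_pred, "m")        # 0.75
auc = auroc(y_true, scores, positive="f")     # 0.875
```

Unlike accuracy and recall, AUROC does not depend on the threshold, since it only looks at how the scores rank the two classes. That is one plausible reason to tune against it when the goal is balanced behavior across both genders.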
