As part of the INSOFE analytics super-specialization program at IFIM Business School, I was introduced to new concepts such as deep learning models and had the opportunity to work on this project.
In today’s marketing campaigns, understanding consumer expectations is of utmost importance. Companies are beginning to turn to social media and e-commerce sites as a medium for understanding their clients. As part of this shift, text analysis has become an important area of study in natural language processing. One of the most common problems in this field is text classification, that is, predicting an outcome from textual data. The problem I focused on in this project is predicting women’s dress ratings from the reviews customers leave on e-commerce sites.
We built all the models in Google Colaboratory using Keras.
The first and most difficult step in text analysis is pre-processing the text, which includes removing punctuation and special characters, converting text to lower case, and handling outliers and missing values. I then used tokenization and GloVe embeddings to transform the text into numerical form and fed this input to an LSTM model with 128 units, which reached 75 percent accuracy on the training data and 60 percent on the test data. To tackle this overfitting, I used a random forest algorithm for feature selection and added the important variables to the LSTM model, tweaked the hyperparameters, and added a dropout layer and regularisation, which gave 70 percent accuracy on the training data and 66 percent on the test data. Apart from the LSTM, I built a bidirectional LSTM, a simple RNN and a basic Naive Bayes model.
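The pre-processing and tokenization steps described above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the project's actual code: the function names, the sample reviews, the vocabulary size and the sequence length are all assumptions made for the example, and in practice a Keras tokenizer and padding utility would likely take the place of the hand-written helpers.

```python
import re
from collections import Counter


def clean_text(text):
    """Lower-case a review and replace punctuation/special characters with spaces."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def build_vocab(texts, max_words=10000):
    """Map the most frequent words to integer ids.

    Id 0 is reserved for padding and id 1 for words outside the vocabulary.
    """
    counts = Counter(word for t in texts for word in t.split())
    return {word: i + 2 for i, (word, _) in enumerate(counts.most_common(max_words))}


def to_padded_sequence(text, vocab, max_len=8):
    """Turn a cleaned review into a fixed-length list of word ids."""
    ids = [vocab.get(word, 1) for word in text.split()][:max_len]
    return ids + [0] * (max_len - len(ids))  # pad at the end with zeros


# Hypothetical reviews, invented for illustration only.
reviews = ["Love this dress!!! Fits perfectly.", None, "Too small, returned it :("]

cleaned = [clean_text(r) for r in reviews if r]  # drop missing reviews
vocab = build_vocab(cleaned)
sequences = [to_padded_sequence(t, vocab) for t in cleaned]
```

Each review ends up as a fixed-length list of integers. An embedding layer (for example one initialised with GloVe vectors) can then map those integers to dense vectors before they reach the LSTM.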
It turned out to be a very challenging project, as well as a great learning experience. This project has given me a lot of practical exposure to NLP. It has taught me how to break down complicated tasks into parts and steps, how to plan and manage my time, and how to refine my understanding through discussion and explanation within a team.
Student at IFIM Business School