From Stars to Strategy: Leveraging 23,000 Reviews to Reshape Women's E-commerce

In the bustling realm of e-commerce, customer reviews serve as both a compass and a mirror.

They guide businesses towards areas of improvement and reflect the quality of products and services offered. Our recent deep dive into women's clothing e-commerce reviews unveiled a tapestry of sentiments, preferences, and invaluable feedback.

This analysis wasn't just about numbers; it was about understanding the voice of the customer.

Through meticulous data cleansing, strategic feature engineering, and insightful visualizations, we unearthed patterns that can shape future business strategies. We also identified key performance indicators that act as the pulse of the platform, revealing its health and areas for growth.

In this article, we walk through that analytical process, from raw data to actionable insights.

GOALS

Understand customer sentiment and feedback regarding women's clothing products sold on the e-commerce platform.
Identify patterns in ratings and recommendations to inform business strategies.
Examine the distribution of reviewers' ages to target marketing and product strategies.
Understand the average rating across different departments to pinpoint strong and weak product lines.
Extract key performance indicators to track business health and performance.

PROJECT OBJECTIVES

Analyze the distribution of ratings.
Examine the distribution of recommendations.
Break down reviews by age groups to understand demographic trends.
Analyze average ratings across different product departments.
Compute key performance indicators.
Understand the overall sentiment of reviews.
Create an interactive dashboard that visualizes the results of the analysis.
Use the results of the analysis to identify opportunities for improvement in the e-commerce store.

RESULTS & INSIGHTS

1. Distribution of Rating:

Most reviews tend to be on the positive side, with ratings of 4 and 5 being the most frequent. This indicates a general satisfaction among the customers.
Lower ratings, such as 1 and 2, were less frequent, suggesting fewer negative experiences.

2. Distribution of Recommendations:

A significant majority of reviewers recommended the products, signaling positive feedback.
However, there's still a noteworthy portion that did not recommend the products. This segment requires attention to understand the reasons for dissatisfaction.

3. Age Distribution of Reviewers:

Most reviews come from customers in their 30s and 40s. This demographic appears to be the most active on the platform.
Reviewers in their 50s are also active, followed by those in their 20s and 60s.
Fewer reviews come from users aged 70 and above.
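
The decade grouping described above can be reproduced in a few lines of Python. This is a minimal sketch rather than the script used for the analysis: the helper name `age_group_counts` is hypothetical, the sample ages are invented for illustration, and pooling everyone aged 70 and above into one bucket is an assumption based on how the groups are described here.

```python
from collections import Counter


def age_group_counts(ages):
    """Count reviewers per decade bucket, e.g. 34 -> '30s'.

    Ages of 70 and above are pooled into a single '70+' bucket
    (an assumption made for this sketch).
    """
    buckets = Counter()
    for age in ages:
        if age >= 70:
            label = "70+"
        else:
            label = f"{(age // 10) * 10}s"
        buckets[label] += 1
    return dict(buckets)


# Illustrative ages only; these are not values from the dataset.
sample_ages = [34, 38, 41, 47, 52, 29, 63, 71, 85]
counts = age_group_counts(sample_ages)
print(counts)
```

Sorting the resulting buckets by count gives the same kind of ranking reported above, from the most active age group to the least.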

4. Average Rating by Department:

Some departments may have higher average ratings, indicating better product satisfaction in those categories.
A department with a lower average rating could be an area of concern and might need product quality or variety improvements.

5. KPIs:

Total Records: This represents the total number of reviews analyzed, giving a sense of the dataset's volume.
Average Rating: A critical metric to understand overall customer satisfaction. A rating closer to 5 indicates high satisfaction, while a rating closer to 1 suggests dissatisfaction.
Average Age: This gives an idea about the average age of the platform's user demographic.

6. Sentiment Distribution:

The majority of reviews are positive, reinforcing the findings from the ratings distribution.
Neutral and negative sentiments are present but in a smaller proportion. These reviews should be analyzed in-depth to understand customer concerns and areas of improvement.

Dashboard Features & Interactivity:

Filters: Users can filter reviews based on ratings. This allows them to specifically view reviews with certain ratings, helping in targeted analysis.
Interactive Elements: Hovering over data points provides additional details, enhancing the user experience and clarity.
Visual Aesthetics: Color-coded visuals make it easy to distinguish between different data points and categories. For example, positive, neutral, and negative sentiments can be color-coded as green, yellow, and red, respectively.

Incorporating this analysis into an interactive dashboard in tools like Tableau enhances decision-making by providing a visual and intuitive interface. The ability to drill down into specific areas and use filters allows for a more tailored and in-depth exploration of data.

Because the filters are based on the rating distribution, stakeholders can drill down into specific data points for a granular view of the reviews. Whether you want to understand a particular age group's preferences or see how a specific rating correlates with recommendations, the dashboard provides the flexibility to do so.

Data Cleansing: Laying a Strong Foundation

Every insightful analysis begins with clean data. Our dataset, sourced from a prominent women's clothing e-commerce platform, was rich but required some tidying up.

Tokenization: We started by breaking down the review text into individual 'tokens' or words. This step transformed our text data into manageable units, ready for further processing.

Removing Stop Words: Common words such as 'and', 'the', and 'is', known as 'stop words', often add noise to text analysis. By removing them, we ensured that our dataset contained only words that added meaningful context.

Lemmatization: The English language can be tricky! Words like 'running', 'ran', and 'runs' are all inflected forms of 'run'. Lemmatization reduced such words to their base form (the lemma), improving data consistency.
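
The three steps above can be sketched end to end as follows. This is a dependency-free illustration, not the script used for the analysis: the stop-word set and the lemma lookup table are tiny hand-written stand-ins (assumptions) for what a library such as NLTK or spaCy would provide, and the function names are hypothetical.

```python
import re

# Tiny illustrative stop-word list; real lists are much longer.
STOP_WORDS = {"a", "an", "and", "the", "is", "it", "this", "was", "i", "in", "of", "to"}

# Toy lemma lookup standing in for a real lemmatizer (assumption for this sketch).
LEMMAS = {
    "running": "run",
    "ran": "run",
    "runs": "run",
    "dresses": "dress",
    "loved": "love",
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def lemmatize(tokens):
    """Map each token to its base form when the lookup table knows it."""
    return [LEMMAS.get(t, t) for t in tokens]


def preprocess(text):
    """Run tokenization, stop-word removal, and lemmatization in order."""
    return lemmatize(remove_stop_words(tokenize(text)))


print(preprocess("I loved the dresses and this one runs small"))
```

The order matters: stop words are matched against lowercase tokens, so tokenization and lowercasing have to happen first.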

Feature Engineering: Crafting New Insights

With a clean dataset in hand, we aimed to derive new features from existing data, a process known as feature engineering.

Sentiment Scores: Using the TextBlob library, a popular Python tool, we calculated sentiment polarity scores for each review. This score, ranging between -1 and 1, helped us gauge the overall sentiment of the review. A score closer to 1 indicated a positive sentiment, whereas a score near -1 implied a negative sentiment.

Sentiment Categories: To make our analysis more intuitive, we categorized each review based on its sentiment score:

Positive (score > 0)
Neutral (score = 0)
Negative (score < 0)
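
As a rough illustration of the scoring and bucketing described above, the sketch below pairs a TextBlob polarity lookup with the three-way threshold rule. The function names `polarity_of` and `sentiment_category` are hypothetical, and this is not necessarily how the original script is organized. TextBlob is a third-party package, so it is imported inside the function that needs it.

```python
def polarity_of(text):
    """Return TextBlob's polarity score for a review, a float in [-1.0, 1.0]."""
    # Imported lazily so the categorization helper below works without TextBlob.
    from textblob import TextBlob  # third-party: pip install textblob

    return TextBlob(text).sentiment.polarity


def sentiment_category(score):
    """Map a polarity score to the Positive / Neutral / Negative buckets."""
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"


# Hand-picked scores for illustration; real scores would come from polarity_of().
for score in (0.6, 0.0, -0.25):
    print(score, sentiment_category(score))
```

Note that with these thresholds only a score of exactly 0 counts as neutral, so even faintly positive or negative wording lands in one of the other two buckets.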

Here's a consolidated version of the Python code we used for the analysis:

Click here for the script

Utilizing visualization tools like Tableau, we plotted the distribution of sentiments across all reviews. This not only provided a bird's-eye view of customer sentiments but also highlighted areas requiring attention.

KPIs in Our Analysis

In our exploration of women's clothing e-commerce reviews, we chose specific KPIs to give us a bird's-eye view of the dataset before diving deeper:

Total Records: Representing the total number of reviews, this gave us a sense of the dataset's volume and the scale of customer interactions.
Average Rating: An essential metric, it gave us an immediate understanding of overall customer satisfaction. A high average rating signified broad customer satisfaction, whereas a lower average pointed towards areas of potential concern.
Average Age: This KPI provided a glimpse into the demographic of the platform's user base. Knowing the average age helped in tailoring marketing strategies and product offerings to the most active age groups.
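
A minimal sketch of how these three KPIs could be computed is shown below. It uses only the standard library; the field names `rating` and `age`, the helper name `compute_kpis`, and the sample rows are assumptions made for illustration and are not taken from the dataset.

```python
from statistics import mean


def compute_kpis(reviews):
    """Return total records, average rating, and average age for review dicts."""
    if not reviews:
        return {"total_records": 0, "average_rating": None, "average_age": None}
    return {
        "total_records": len(reviews),
        "average_rating": round(mean(r["rating"] for r in reviews), 2),
        "average_age": round(mean(r["age"] for r in reviews), 1),
    }


# Illustrative rows only; these are not values from the dataset.
sample_reviews = [
    {"rating": 5, "age": 34},
    {"rating": 4, "age": 41},
    {"rating": 2, "age": 58},
    {"rating": 5, "age": 27},
]
kpis = compute_kpis(sample_reviews)
print(kpis)
```

Rows with missing ratings or ages would need to be filtered out before calling a helper like this; the sketch assumes complete records.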

These KPIs served three purposes:

Prioritization: With KPIs, we knew where to focus. If the average rating was low, our subsequent analysis would delve into why customers were dissatisfied.
Benchmarks: They provided a standard for comparison. Analyzing trends over time or comparing with industry benchmarks could reveal if the business is on the right track.
Decision Making: KPIs informed decisions. Knowing the average age of reviewers, for instance, could shape the marketing strategy or influence inventory choices.

Departmental Ratings: Spotlight on Product Categories

Analyzing average ratings across different product departments spotlighted the stars and the underperformers.

Star Performers: Departments with high average ratings indicate customer favorites. They could be promoted further, serving as brand flag bearers.
Areas of Improvement: Lower-rated departments point to potential quality or variety issues. A deeper dive into reviews can reveal specific problems, allowing for targeted improvements.
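
The per-department comparison can be sketched as a simple group-and-average step. This is an illustrative, standard-library version: the field names `department` and `rating`, the helper name `average_rating_by_department`, and the sample departments and ratings are all assumptions, not results from the analysis.

```python
from collections import defaultdict


def average_rating_by_department(reviews):
    """Return (department, average rating) pairs, lowest average first."""
    totals = defaultdict(lambda: [0, 0])  # department -> [sum of ratings, count]
    for review in reviews:
        entry = totals[review["department"]]
        entry[0] += review["rating"]
        entry[1] += 1
    averages = {dept: total / count for dept, (total, count) in totals.items()}
    # Sort ascending so the weakest departments surface first.
    return sorted(averages.items(), key=lambda item: item[1])


# Illustrative rows only; these are not values from the dataset.
sample_reviews = [
    {"department": "Dresses", "rating": 5},
    {"department": "Dresses", "rating": 4},
    {"department": "Tops", "rating": 3},
    {"department": "Tops", "rating": 2},
    {"department": "Bottoms", "rating": 4},
]
ranked = average_rating_by_department(sample_reviews)
for department, average in ranked:
    print(department, average)
```

Averages based on very few reviews can be misleading, so it is worth reporting the review count alongside each department's average before drawing conclusions.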

Each review, each rating, and each piece of feedback is a piece of the puzzle, helping shape a clearer picture of the market landscape. As we stitched these pieces together, the image of a dynamic, ever-evolving e-commerce platform emerged, teeming with opportunities, challenges, and endless possibilities.

Data Gaps and Future Improvements:

Data Gaps:

Missing Data: Certain records might have had missing values for essential fields. While we worked with what we had, missing data can sometimes skew analysis or limit its depth.
Review Authenticity: Not all reviews might be genuine. There's always a possibility of fake reviews, both positive (to boost product ratings) and negative (competitors trying to malign a product).
Limited Context: While we analyzed sentiment, the reasons behind those sentiments remain a mystery without deeper textual analysis. Why was a product rated low? Was it size, color, material, or something else?
Time Frame: Our dataset covered a specific period. Seasonal variations, promotional campaigns, or market events outside this window might have impacted reviews and ratings.

Future Improvements:

Incorporate More Data: Expanding the dataset to include more recent reviews or reviews from other platforms can offer a more comprehensive picture.
Deeper Text Analysis: Techniques like topic modeling can be employed to understand the common themes in reviews. This could uncover specific product issues or areas of improvement.
Review Verification: Implementing algorithms or techniques to filter out potential fake reviews can improve the authenticity of the analysis.
Trend Analysis: By tracking reviews and ratings over time, we can identify trends. Are certain products gaining popularity? Is there a recurrent issue every winter? Such insights can be invaluable.
Integration with Other Data: Merging this dataset with other business data, like sales, returns, or customer support interactions, can provide a 360-degree view of the customer experience.
Feedback Loop: Creating a mechanism to act on the insights from the analysis and then measure the impact of those actions can be a game-changer. For example, if reviews highlight sizing issues and the business rectifies it, is there a positive change in reviews and sales afterward?

Conclusion

Data analysis is an ongoing journey, where each step – from data cleaning to deriving insights – brings us closer to understanding the nuances of customer behavior and preferences. The world of e-commerce is dynamic, and feedback from customers, as seen in reviews, offers a goldmine of information to stay ahead and adapt in this ever-evolving landscape.

For those interested in exploring the data themselves or verifying our analytical steps, I have attached the dataset used in this analysis.

Download the dataset