Exploring Customer Demographics with Data Analysis: Insights from Istanbul's Shopping Malls

Introduction

This project is a deep dive into customer demographic analysis, exploring shopping trends across 10 malls in Istanbul from 2021 to 2023. Through this dataset, I aimed to uncover insights about customer behavior, from spending habits across age groups to payment preferences. This project marks the beginning of my journey as a data analyst, with the goal of building a strong foundation in real-world data analysis. Whether I pursue this path as a freelancer or in a full-time role, I’m committed to learning and sharing insights from real-world projects like this. Stay tuned for more as I continue to expand my skills and understanding of how data shapes decision-making in various industries.

Project Setup

To get started with this Customer Demographic Analysis project, I followed these steps:

  • Data Source: I obtained the dataset from Kaggle. It includes shopping records from 10 different malls in Istanbul over three years (2021–2023) and is available at https://www.kaggle.com/datasets/mehmettahiraslan/customer-shopping-dataset

  • Environment: The project was conducted in a Jupyter Notebook using Python for data analysis and visualization.

  • Libraries Used:

    • Pandas for data manipulation and cleaning.

    • NumPy for numerical operations.

    • Matplotlib and Seaborn for data visualization.

  • Analysis Focus: I focused on insights around demographics, spending patterns, product categories, and payment methods.
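
As a rough sketch of this setup, the snippet below loads a few records with Pandas. The column names and the three sample rows are assumptions made for illustration, not values taken from the dataset. In the notebook, the in-memory text would be replaced by the path to the CSV downloaded from Kaggle.

```python
import io

import pandas as pd

# Hypothetical sample rows standing in for the Kaggle CSV. The column names
# are assumptions and the values are made up for illustration.
SAMPLE_CSV = """invoice_no,gender,age,category,quantity,price,payment_method,invoice_date,shopping_mall
I001,Female,28,Clothing,2,300.08,Credit Card,05/08/2022,Kanyon
I002,Male,45,Shoes,1,600.17,Cash,12/12/2021,Metrocity
I003,Female,33,Cosmetics,3,40.66,Debit Card,09/01/2023,Kanyon
"""

# With the real file, pass the CSV path instead of io.StringIO(SAMPLE_CSV).
df = pd.read_csv(io.StringIO(SAMPLE_CSV))

print(df.shape)   # number of rows and columns
print(df.dtypes)  # inferred type of each column
```

Printing the shape and dtypes right after loading is a quick way to confirm that numeric columns were not read as text.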

Data Analysis Process

To uncover insights from the dataset, I followed a systematic data analysis process, utilizing Python in Jupyter Notebook. Below are the key steps I followed:

Step 1: Data Cleaning and Preprocessing

  • I checked the dataset for missing values and duplicate rows, then prepared it for analysis.
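
A minimal sketch of what such a cleaning pass could look like is shown below. The sample records, the column names, the choice to drop rows with a missing age, and the day/month/year date format are all assumptions for illustration. The notebook may handle these cases differently.

```python
import pandas as pd

# Hypothetical records with one missing age and one fully duplicated row.
raw = pd.DataFrame({
    "invoice_no": ["I001", "I002", "I002", "I003"],
    "age": [28, 45, 45, None],
    "price": [600.16, 600.17, 600.17, 121.98],
    "invoice_date": ["05/08/2022", "12/12/2021", "12/12/2021", "09/01/2023"],
})

# Count missing values per column and fully duplicated rows.
missing_per_column = raw.isna().sum()
duplicate_rows = int(raw.duplicated().sum())

# Assumed policy: drop exact duplicates and rows without an age.
clean = (
    raw.drop_duplicates()
       .dropna(subset=["age"])
       .reset_index(drop=True)
)

# Convert text dates to datetimes (assumed day/month/year format)
# and restore age to an integer column.
clean["invoice_date"] = pd.to_datetime(clean["invoice_date"], format="%d/%m/%Y")
clean["age"] = clean["age"].astype(int)

print(missing_per_column)
print("duplicate rows:", duplicate_rows)
print(clean)
```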

Step 2: Descriptive Statistics and Exploratory Data Analysis (EDA)

In this step, I started by analyzing basic statistics to get an overview of the data. This included examining the distribution of customer ages, understanding the gender split, and identifying general spending patterns.

To visualize these insights, I used charts like histograms for age distribution and pie charts for gender breakdown. These initial visualizations helped me summarize the demographics of the customers, giving a clearer picture of the audience and their spending habits.
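
The numbers behind those charts can be computed with Pandas alone. The sketch below uses a handful of made-up customers, and the column names and age bin edges are assumptions. It produces the summary statistics, the gender shares a pie chart would show, and the per-bin counts a histogram would show.

```python
import pandas as pd

# Hypothetical customers; column names and values are illustrative assumptions.
df = pd.DataFrame({
    "gender": ["Female", "Male", "Female", "Female", "Male"],
    "age": [22, 35, 41, 58, 67],
    "price": [100.0, 250.0, 80.0, 300.0, 120.0],
})

# Basic descriptive statistics for customer age.
age_summary = df["age"].describe()

# Share of each gender, the numbers a pie chart would display.
gender_share = df["gender"].value_counts(normalize=True)

# Customers per age bin, the numbers a histogram would display.
# The bin edges are an arbitrary choice for this example.
age_bins = pd.cut(df["age"], bins=[18, 30, 45, 60, 75])
age_counts = age_bins.value_counts(sort=False)

print(age_summary)
print(gender_share)
print(age_counts)
```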

Step 3: Key Analyses

Every good data analysis project begins with questions to answer. Here are some of the questions that came to mind when I looked at the data:

  • Age Group Analysis: Which age group spends the most?

  • Gender vs. Spending: Do certain genders spend more in specific categories?

  • Mall Location Analysis: Identify which mall has the highest Average Transaction Value (ATV).

  • Monthly and Seasonal Trends: Analyze spending by month or season to see if certain times of year have higher sales.

  • Customer Demographics by Mall: Analyze if specific malls attract specific demographics.

    Many other smaller insights are in the IPython notebook I uploaded to my GitHub account (https://github.com/RohitKumar-09/Customer-Demographic-Analysis/blob/main/Customer%20Demographic%20Analysis.ipynb).
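
Several of these questions come down to a group-by over the transaction table. The sketch below shows one possible way to answer three of them on made-up data. The column names, the age brackets, the mall labels, and the definition of spend as quantity times price are assumptions, not details confirmed by the notebook.

```python
import pandas as pd

# Hypothetical transactions; all names and values are illustrative.
df = pd.DataFrame({
    "gender": ["Female", "Male", "Female", "Male", "Female", "Male"],
    "age": [22, 35, 41, 58, 29, 47],
    "category": ["Clothing", "Shoes", "Cosmetics",
                 "Technology", "Clothing", "Clothing"],
    "quantity": [2, 1, 3, 1, 1, 2],
    "price": [300.0, 600.0, 40.0, 1050.0, 300.0, 300.0],
    "shopping_mall": ["Kanyon", "Metrocity", "Kanyon",
                      "Metrocity", "Kanyon", "Metrocity"],
})

# Assumption: spend per transaction is quantity times unit price.
df["spend"] = df["quantity"] * df["price"]

# Age Group Analysis: total spend per (assumed) age bracket.
df["age_group"] = pd.cut(
    df["age"], bins=[17, 30, 45, 60], labels=["18-30", "31-45", "46-60"]
)
spend_by_age_group = df.groupby("age_group", observed=True)["spend"].sum()

# Gender vs. Spending: spend per category, split by gender.
spend_by_gender_category = df.pivot_table(
    index="category", columns="gender", values="spend",
    aggfunc="sum", fill_value=0,
)

# Mall Location Analysis: Average Transaction Value (mean spend) per mall.
atv_by_mall = (
    df.groupby("shopping_mall")["spend"].mean().sort_values(ascending=False)
)

print(spend_by_age_group)
print(spend_by_gender_category)
print(atv_by_mall)
```

Sorting the ATV series in descending order puts the mall with the highest average transaction first, so `atv_by_mall.index[0]` answers the mall question directly.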

Key Insights and Results

  • Women buy more than men in almost every category, and Clothing, Cosmetics, and Food are the three categories they favor most.

  • Clothing, Shoes, and Technology are the three most popular categories in Istanbul’s shopping malls, across all age groups.

  • There is a major difference in 2023, when people bought less than in the previous two years. I attribute this to the Covid-19 pandemic, which spanned 2021 to 2023.

  • Seasonal Trends: People tend to buy more in spring and winter, likely because these seasons include festivals such as Ramadan and Christmas.

  • Top-Selling Categories: Product categories like "Clothing", "Cosmetics" and "Food & Beverages" are the top sellers by quantity. This indicates high customer interest in these categories.

  • Top-Grossing Categories: Categories like "Clothing", "Shoes" and "Technology" might have lower quantities sold but higher revenue, indicating that these are high-value items even if they are purchased less frequently.
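
The difference between top-selling and top-grossing categories, and the seasonal comparison, can be illustrated with a small sketch. The data is made up and the output does not reproduce the findings above. The column names, the revenue definition (quantity times price), and the month-to-season mapping are assumptions.

```python
import pandas as pd

# Hypothetical transactions; values are invented for illustration.
df = pd.DataFrame({
    "category": ["Clothing", "Cosmetics", "Food & Beverages",
                 "Shoes", "Technology", "Clothing"],
    "quantity": [5, 4, 5, 2, 1, 3],
    "price": [300.0, 40.0, 5.0, 600.0, 1050.0, 300.0],
    "invoice_date": pd.to_datetime([
        "2021-01-15", "2021-04-10", "2022-07-20",
        "2022-10-05", "2022-12-24", "2023-03-01",
    ]),
})

# Assumption: revenue per transaction is quantity times unit price.
df["revenue"] = df["quantity"] * df["price"]

# Rank categories two ways: by units sold and by revenue.
by_category = df.groupby("category")[["quantity", "revenue"]].sum()
top_by_quantity = by_category["quantity"].sort_values(ascending=False)
top_by_revenue = by_category["revenue"].sort_values(ascending=False)

# Assumed mapping from calendar month to season.
SEASON_BY_MONTH = {
    12: "Winter", 1: "Winter", 2: "Winter",
    3: "Spring", 4: "Spring", 5: "Spring",
    6: "Summer", 7: "Summer", 8: "Summer",
    9: "Autumn", 10: "Autumn", 11: "Autumn",
}
df["season"] = df["invoice_date"].dt.month.map(SEASON_BY_MONTH)
revenue_by_season = df.groupby("season")["revenue"].sum()

print(top_by_quantity)
print(top_by_revenue)
print(revenue_by_season)
```

Comparing the two rankings side by side shows how a category can sell many units yet contribute little revenue, or the reverse.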

Conclusion

This Customer Demographic Analysis provided valuable insights into the shopping patterns of various demographics across Istanbul’s malls. Such insights could help mall management and retailers make informed decisions about inventory and marketing strategies. Future work could involve incorporating predictive analytics to forecast sales trends based on seasonality and customer demographics.