Marketing is essential for the growth and sustainability of any business.
Marketers help develop the company’s brand, attract customers, and increase sales and revenue. One of the critical tasks for marketers is knowing their customers and identifying their needs.
By understanding the customer, marketers can launch specialized and targeted campaigns to suit each customer’s specifics and needs.
When data on customer behavior is available, data science tools can be used to address the specific needs of customers.
This case study simulates a bank in New York City that wants to launch a marketing campaign, but not just any campaign: it wants one that is specifically targeted to each type of customer. The main objective is to segment the customers into at least 3 groups.
Over the last 6 months the bank has collected data on its customers, and this data will be used to train a clustering model that produces the segmentation.
The data set contains the following information:
- CUSTID: Identification of the holder of the Credit Card.
- BALANCE: Amount of balance left in the account to make purchases.
- BALANCEFREQUENCY: How often the Balance is updated, score between 0 and 1 (1 = frequently updated, 0 = not frequently updated).
- PURCHASES: Amount of purchases made from the account.
- ONEOFFPURCHASES: Maximum purchase amount made at one time.
- INSTALLMENT PURCHASES: Amount of the purchase made in installments.
- CASHADVANCE: Cash in advance given by the user.
- FREQUENCY OF PURCHASES: How often Purchases are made, score between 0 and 1 (1 = frequent purchase, 0 = infrequent purchase).
- ONEOFFPURCHASESFREQUENCY: How often one-time purchases are made (1 = frequent purchase, 0 = infrequent purchase).
- FREQUENCY OF INSTALLMENT PURCHASES: Frequency with which installment purchases are made (1 = frequent, 0 = infrequent).
- CASHADVANCEFREQUENCY: How often the cash advance is paid.
- CASHADVANCETRX: Number of transactions carried out with “Cash in Advance”.
- PURCHASESTRX: Number of purchase transactions made.
- CREDITLIMIT: Credit card limit for the user.
- PAYMENTS: Amount of the payment made by the user.
- MINIMUM_PAYMENTS: Minimum amount of payments made by the user.
- PRCFULLPAYMENT: Percentage of the total payment paid by the user.
- TENURE: Tenure of the credit card service for the user.
The data used for this study is open and in the public domain, and is available at the following link.
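As a minimal sketch of a first look at a table like this, the snippet below counts missing and duplicated rows and then prepares the features with pandas. The four rows are made-up illustration values, not the bank's records, only a few of the columns are shown, and the helper name `clean` and the median fill strategy are assumptions rather than the project's actual choices.

```python
import numpy as np
import pandas as pd

# Made-up rows for illustration only; the real data set has one row per card holder.
df = pd.DataFrame({
    "CUSTID": ["C1", "C2", "C3", "C3"],
    "BALANCE": [40.9, 3202.5, 2495.1, 2495.1],
    "PURCHASES": [95.4, 0.0, 773.2, 773.2],
    "MINIMUM_PAYMENTS": [139.5, np.nan, 627.3, 627.3],
    "TENURE": [12, 12, 12, 12],
})

def clean(frame: pd.DataFrame) -> pd.DataFrame:
    """Drop duplicate rows, fill missing numeric values, drop the identifier."""
    out = frame.drop_duplicates().copy()
    numeric = out.select_dtypes(include="number").columns
    # Assumption: the column median is an acceptable fill value.
    out[numeric] = out[numeric].fillna(out[numeric].median())
    # The identifier carries no behavioral information for clustering.
    return out.drop(columns=["CUSTID"]).reset_index(drop=True)

print(df.isnull().sum())       # missing values per column
print(df.duplicated().sum())   # number of fully duplicated rows
features = clean(df)
```

Filling with the median rather than the mean is one common choice for skewed monetary columns; dropping the affected rows would be an equally valid alternative when few values are missing.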
Content
The project is developed as follows:
- We start the case study by importing the libraries and the data we are going to work with. A quick analysis of the variables helps us begin to understand the data.
- We continue with an exploratory data analysis. In this section we prepare the data for training by resolving null and duplicate values. We also apply some visualizations that help us understand the data.
- We start training. We use the K-Means algorithm to segment the data into clusters, find the optimal number of clusters with the elbow method, and analyze the results.
- Visualizing results across many variables is complicated. For this, we apply PCA to reduce the data to two variables, so that the clusters can be visualized in a scatter plot.
- When working with many variables, some of them may not be relevant to the model. We use an autoencoder to reduce the 17 variables to 10 and then apply K-Means again to obtain the clusters.
- PCA is applied to these 10 new variables and, to finish, the resulting clusters are displayed again.
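The modeling steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's code: it uses scikit-learn, runs on synthetic data standing in for the 17 cleaned variables, fixes 3 clusters to match the stated minimum, and uses `MLPRegressor` trained to reconstruct its own input as a simple single-hidden-layer stand-in for the autoencoder (the source does not say which framework or architecture was used). The function names `elbow_inertias`, `cluster_and_project`, and `encode` are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the 17 cleaned customer variables (illustration only).
rng = np.random.default_rng(0)
centers = rng.normal(scale=5.0, size=(3, 17))
X = np.vstack([c + rng.normal(size=(100, 17)) for c in centers])

# K-Means is distance based, so put every variable on the same scale first.
X_scaled = StandardScaler().fit_transform(X)

def elbow_inertias(data, k_values):
    """Within-cluster sum of squares for each candidate number of clusters."""
    return [KMeans(n_clusters=k, n_init=10, random_state=0).fit(data).inertia_
            for k in k_values]

def cluster_and_project(data, n_clusters):
    """Fit K-Means, then project the data onto two principal components."""
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(data)
    points_2d = PCA(n_components=2).fit_transform(data)
    return labels, points_2d

def encode(data, n_latent=10):
    """Compress the data with a single-hidden-layer autoencoder stand-in."""
    ae = MLPRegressor(hidden_layer_sizes=(n_latent,), activation="relu",
                      max_iter=500, random_state=0)
    ae.fit(data, data)  # the network learns to reconstruct its own input
    # The hidden-layer activations are the compressed representation.
    return np.maximum(0.0, data @ ae.coefs_[0] + ae.intercepts_[0])

inertias = elbow_inertias(X_scaled, range(1, 8))  # plot these to find the elbow
labels, points_2d = cluster_and_project(X_scaled, n_clusters=3)

X_encoded = encode(X_scaled)  # 17 variables reduced to 10
labels_enc, points_enc_2d = cluster_and_project(X_encoded, n_clusters=3)
```

Plotting `inertias` against the number of clusters shows where the curve flattens, and `points_2d` or `points_enc_2d` colored by the corresponding labels gives the scatter plot of the clusters.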
The detailed project can be found by clicking the buttons below; there is one version in Spanish and another in English.

