Classification Competition: Customer Default Prediction

Overview

This repository contains the data and R scripts for a classification competition on predicting customer defaults. Participants estimate the probability of default (PD) for each customer and submit their predictions.

This classification competition is based on the following Kaggle competition: Addison Howard, AritraAmex, Di Xu, Hossein Vashani, inversion, Negin, and Sohier Dane (2022). American Express - Default Prediction. Kaggle. https://kaggle.com/competitions/amex-default-prediction

Repository Structure

Data

  • 01_data_raw/dat_fa.rds: Raw data file containing initial customer data.
  • amex_train.rds: Training dataset for model building.
  • amex_validation.rds: Validation dataset for model evaluation (does not include the target variable).
  • amex_submission.rds: Template for competition submissions, containing customer IDs.
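
The files listed above could be used roughly as in the following sketch: read the training data, fit a model, predict the PD for the validation customers, and fill in the submission template. The column names (customer_ID, target, x1, x2, prediction), the logistic regression, and the output file name my_submission.rds are assumptions made for illustration; the real files may be structured differently. Synthetic data written to a temporary directory stands in for the real .rds files so that the example runs on its own.

```r
# Sketch only: column names, model choice, and the output file name are
# assumptions. Synthetic data stands in for the real .rds files.
set.seed(1)
data_dir <- tempdir()

# --- Create stand-in files (skip this part when using the real data) ---
n_train <- 200
n_val <- 50
train_demo <- data.frame(
  customer_ID = sprintf("T%03d", seq_len(n_train)),
  x1 = rnorm(n_train),
  x2 = rnorm(n_train)
)
train_demo$target <- rbinom(n_train, 1, plogis(-1 + 1.5 * train_demo$x1))
validation_demo <- data.frame(
  customer_ID = sprintf("V%03d", seq_len(n_val)),
  x1 = rnorm(n_val),
  x2 = rnorm(n_val)
)
submission_demo <- data.frame(customer_ID = validation_demo$customer_ID)

saveRDS(train_demo, file.path(data_dir, "amex_train.rds"))
saveRDS(validation_demo, file.path(data_dir, "amex_validation.rds"))
saveRDS(submission_demo, file.path(data_dir, "amex_submission.rds"))

# --- Workflow: read the files, fit a model, predict, fill the template ---
train <- readRDS(file.path(data_dir, "amex_train.rds"))
validation <- readRDS(file.path(data_dir, "amex_validation.rds"))
submission <- readRDS(file.path(data_dir, "amex_submission.rds"))

# Logistic regression as a simple baseline for the probability of default
fit <- glm(target ~ x1 + x2, data = train, family = binomial())

# type = "response" returns probabilities rather than log-odds
pd <- predict(fit, newdata = validation, type = "response")

# Align predictions with the customer IDs in the template
idx <- match(submission$customer_ID, validation$customer_ID)
submission$prediction <- unname(pd[idx])

saveRDS(submission, file.path(data_dir, "my_submission.rds"))
```

Matching on the customer ID rather than relying on row order keeps the predictions aligned even if the template and the validation data are sorted differently.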

Scripts

  • data_prep.R: Main script to preprocess data, create training and validation datasets, and prepare a submission template.
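
The steps that data_prep.R is described as performing might look like the following minimal sketch. It is not the actual script: the column names (customer_ID, target, x1), the 80/20 split, and the random sampling by customer are all assumptions, and a small synthetic data frame stands in for 01_data_raw/dat_fa.rds.

```r
# Sketch only, not the actual data_prep.R. Column names and the 80/20
# split are assumptions; synthetic data stands in for the raw file.
set.seed(42)

raw <- data.frame(
  customer_ID = sprintf("C%04d", 1:100),
  x1 = rnorm(100),
  target = rbinom(100, 1, 0.25)
)

# Hold out 20% of the customers for validation
ids <- unique(raw$customer_ID)
val_ids <- sample(ids, size = round(0.2 * length(ids)))

# Training data keeps the target
train <- raw[!raw$customer_ID %in% val_ids, ]

# Validation data has the target column removed
validation <- raw[raw$customer_ID %in% val_ids, setdiff(names(raw), "target")]

# Submission template: customer IDs only
submission <- data.frame(customer_ID = unique(validation$customer_ID))

# The results could then be written out with saveRDS(), for example:
# saveRDS(train, "amex_train.rds")
```

Splitting by customer ID rather than by row keeps each customer entirely in one of the two datasets, which matters if the raw data contains several rows per customer.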

Additional Notes

For any issues or questions regarding the data or the competition setup, please see the ILIAS learning module.