# Classification Competition: Customer Default Prediction
Marius Puke committed

## Overview

This repository contains the data and R scripts for a classification competition aimed at predicting customer defaults. Participants are required to estimate the probability of default (PD) for various clients and submit their predictions.

This competition is based on:
Addison Howard, AritraAmex, Di Xu, Hossein Vashani, inversion, Negin, Sohier Dane. (2022). *American Express - Default Prediction*. Kaggle. https://kaggle.com/competitions/amex-default-prediction

## Repository Structure

### Data
- `01_data_raw/dat_fa.rds`: Raw data file containing initial customer data.
- `amex_train.rds`: Training dataset for model building.
- `amex_validation.rds`: Validation dataset for model evaluation (does not include the target variable).
- `amex_submission.rds`: Template for competition submissions, containing customer IDs.
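
As a rough illustration of how these files might be used together, the sketch below fits a baseline logistic regression and writes predicted PDs to an `.rds` file. It runs on simulated stand-in data so that it is self-contained. The column names `customer_ID`, `target`, `x1`, `x2`, and `pd`, as well as the output file name, are assumptions and may not match the actual datasets or the required submission layout.

```r
# Sketch only: the column names `customer_ID`, `target`, `x1`, `x2`, the
# output column `pd`, and the simulated data are assumptions for illustration.
set.seed(1)

# Simulate small stand-ins for amex_train.rds and amex_validation.rds.
n <- 200
train <- data.frame(
  customer_ID = sprintf("c%03d", seq_len(n)),
  x1 = rnorm(n),
  x2 = rnorm(n)
)
train$target <- rbinom(n, size = 1, prob = plogis(-1 + 1.5 * train$x1))
validation <- data.frame(
  customer_ID = sprintf("v%03d", seq_len(50)),
  x1 = rnorm(50),
  x2 = rnorm(50)
)

# With the real files one would instead call, for example:
# train      <- readRDS("amex_train.rds")
# validation <- readRDS("amex_validation.rds")

# Fit a logistic regression as a simple baseline PD model.
fit <- glm(target ~ x1 + x2, data = train, family = binomial())

# Predicted probabilities of default for the validation customers.
submission <- data.frame(
  customer_ID = validation$customer_ID,
  pd = predict(fit, newdata = validation, type = "response")
)

# Save in the same .rds format as the template (hypothetical file name).
out_file <- file.path(tempdir(), "my_submission.rds")
saveRDS(submission, out_file)
```

Any classifier that returns probabilities could replace `glm()` here; logistic regression is used only because it ships with base R.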

### Scripts
- `data_prep.R`: Main script to preprocess data, create training and validation datasets, and prepare a submission template.
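
The kind of steps described above could look roughly like the following. This is not the contents of `data_prep.R`; it is a hypothetical sketch in which the simulated raw data, the column names, and the 80/20 random split are all assumptions.

```r
# Sketch only: not the actual data_prep.R. Column names, the 80/20 split, and
# the simulated raw data are assumptions.
set.seed(42)

raw <- data.frame(
  customer_ID = sprintf("id%03d", 1:100),
  feature_1 = runif(100),
  target = rbinom(100, size = 1, prob = 0.25)
)

# Randomly assign 80% of customers to training.
train_idx <- sample(seq_len(nrow(raw)), size = floor(0.8 * nrow(raw)))
train_set <- raw[train_idx, ]

# Validation keeps the features but not the target.
validation_set <- raw[-train_idx, setdiff(names(raw), "target")]

# Submission template: customer IDs only.
submission_template <- validation_set["customer_ID"]
```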

## Additional Notes
For any issues or questions regarding the data or the competition setup, please see the ILIAS learning module.