Pebblous Project 2

Term: Spring 2024

Faculty Advisor: Jeho Park

Project Description:

This semester’s project focuses on enhancing structured datasets through two families of techniques: imputation and lightweighting. Imputation addresses missing values and outliers, with an emphasis on synthetic data generation for privacy-sensitive domains. Lightweighting reduces dataset volume while preserving the data's distributional characteristics, which lowers GPU costs in deep learning. Students will experiment with a range of techniques on the provided datasets; the target is no more than 5% performance degradation on image classification tasks when using only 10% of the original dataset. Deliverables are the enhanced datasets, a detailed report, a tech blog article summarizing the work, and the experimental Python code. Results will be evaluated across iterations to identify the most effective techniques.
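To make the two techniques and the success criterion concrete, here is a minimal Python sketch using only the standard library. It is an illustration under stated assumptions, not the project's prescribed method: median imputation stands in for whichever imputation technique is chosen, class-stratified random sampling stands in for lightweighting, and the 5% limit is read as a relative drop in a score such as accuracy (it could equally be meant as percentage points). The function names `impute_median`, `stratified_subsample`, and `within_budget` are hypothetical.

```python
import random
import statistics
from collections import defaultdict


def impute_median(values):
    """Replace None entries with the median of the observed entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = statistics.median(observed)
    return [fill if v is None else v for v in values]


def stratified_subsample(labels, fraction=0.10, seed=0):
    """Return sorted indices that keep about `fraction` of each class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    kept = []
    for indices in by_class.values():
        # Keep at least one example so that no class disappears entirely.
        k = max(1, round(len(indices) * fraction))
        kept.extend(rng.sample(indices, k))
    return sorted(kept)


def within_budget(full_score, reduced_score, max_relative_drop=0.05):
    """True if the reduced-data score lost at most `max_relative_drop`
    of the full-data score (assumption: the 5% limit is relative)."""
    return (full_score - reduced_score) / full_score <= max_relative_drop


# Toy data, invented purely for illustration.
print(impute_median([1.0, None, 3.0, 5.0]))   # [1.0, 3.0, 3.0, 5.0]
toy_labels = ["cat"] * 80 + ["dog"] * 20
print(len(stratified_subsample(toy_labels)))  # 10
print(within_budget(0.90, 0.87))              # True
```

Sampling within each class keeps the label proportions of the reduced dataset close to those of the original, which is one simple way to retain a distributional characteristic. More selective approaches would replace the random draw inside `stratified_subsample`.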