DISTRIBUTION-PRESERVING DATA AUGMENTATION FOR SHIP FUEL CONSUMPTION PREDICTION UNDER LIMITED LOGBOOK DATA: A BUNKER VESSEL CASE STUDY
Pages 48-54
Abstract
Fuel consumption prediction in shipping is often hampered by limited data. This study investigates how to predict fuel consumption from a small logbook dataset using distribution-preserving data augmentation. Two years of records from a bunker vessel yielded only 76 records before splitting. The input features were distance traveled, vessel speed, cargo carried, trim, wind speed, and wave height. Baseline ensemble models (Gradient Boosting, Random Forest, and XGBoost) trained on the original dataset achieved R² below 0.64. To address this issue, we augmented the training set from 61 to 1000 samples using distribution-preserving methods that retain the statistical characteristics of the original dataset. Models trained on the augmented data achieved an MAE as low as 0.027, an RMSE as low as 0.050, and an R² of up to 0.995. This indicates that, under limited data, carefully constrained augmentation can recover predictive structure without introducing distributional distortion, enabling reliable fuel consumption prediction.
Keywords:
Bunker ship,
Data Augmentation,
Fuel Consumption Prediction,
Logbook Data,
Machine Learning.
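The workflow summarized in the abstract (grow a small training set, check that the augmented sample still follows the original distribution, then score a model with MAE, RMSE, and R²) can be illustrated with a minimal sketch. The abstract does not state which augmentation technique was used, so the Gaussian jitter below is only an assumed stand-in. The function names (`jitter_augment`, `ks_statistic`, `regression_metrics`), the `noise_scale` parameter, and the placeholder data are all hypothetical and are not taken from the study.

```python
import bisect
import math
import random
import statistics


def jitter_augment(rows, n_target, noise_scale=0.05, seed=0):
    """Grow `rows` to `n_target` rows by resampling with replacement and
    adding Gaussian noise scaled to each column's standard deviation.
    ASSUMPTION: jitter is a stand-in, not the method used in the study."""
    rng = random.Random(seed)
    n_cols = len(rows[0])
    stds = [statistics.pstdev([r[j] for r in rows]) for j in range(n_cols)]
    augmented = [list(r) for r in rows]  # keep every original record
    while len(augmented) < n_target:
        base = rng.choice(rows)
        augmented.append(
            [base[j] + rng.gauss(0.0, noise_scale * stds[j]) for j in range(n_cols)]
        )
    return augmented


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between
    the two empirical CDFs (0 = identical samples, 1 = fully separated)."""
    a, b = sorted(a), sorted(b)
    gap = 0.0
    for x in sorted(set(a) | set(b)):
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R^2) for paired true and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return mae, rmse, 1.0 - ss_res / ss_tot


# Placeholder data only: 61 made-up (speed, fuel) rows, not the vessel logbook.
rng = random.Random(42)
original = []
for _ in range(61):
    speed = rng.uniform(8.0, 12.0)
    original.append([speed, 0.3 * speed + rng.gauss(0.0, 0.2)])

augmented = jitter_augment(original, n_target=1000)
ks_speed = ks_statistic([r[0] for r in original], [r[0] for r in augmented])
print(len(augmented), round(ks_speed, 3))
```

A small KS statistic for each feature suggests the augmented sample has not drifted far from the original marginal distribution. It does not check the joint structure between features, which a fuller validation would also need to consider.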