Predicting shared-car use and examining nonlinear effects using gradient boosting regression trees

Published in International Journal of Sustainable Transportation, 2020

Recommended citation: Wang, Tao, Songhua Hu*, and Yuan Jiang. "Predicting shared-car use and examining nonlinear effects using gradient boosting regression trees." International Journal of Sustainable Transportation (2020): 1-15. https://www.tandfonline.com/doi/abs/10.1080/15568318.2020.1827316

Flexible drop-off and pick-up (one-way) carsharing programs offer users a high level of convenience but also incur spatiotemporal imbalances in the distribution of shared cars. Predicting shared-car use helps operators recognize system imbalances in advance, while identifying the determinants of shared-car use helps them implement relocation strategies efficiently. In this study, a gradient boosting regression tree (GBRT) model is employed to predict shared-car use at the station level, and partial dependence plots (PDPs) are employed to examine nonlinear relationships between shared-car use and various predictors. Results show that: (1) GBRTs predict shared-car use with a high level of accuracy (MSE: 1.1069–1.1648); (2) PDPs present results highly consistent with the relationships derived from a traditional statistical model; (3) time-varying variables account for 86.84%–89.30% of the importance in shared-car use prediction, suggesting that these variables can greatly enhance prediction accuracy; and (4) other variables, such as built environment, station attributes, and socioeconomic features, also account for some importance and can enhance prediction accuracy. The findings help carsharing operators accurately predict station-level shared-car use and identify the best locations for stations, and thus maintain the operational efficiency of carsharing programs.
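The workflow described in the abstract (fit a GBRT, report test MSE, inspect feature importance, and compute partial dependence) can be sketched roughly as follows with scikit-learn. This is only an illustrative sketch, not the paper's code: the data are synthetic, the feature names (`hour`, `poi_density`, `parking_spaces`) and the hyperparameters are assumptions made for the example, and the numbers it produces have no relation to the results reported above.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import partial_dependence
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic station-hour records (hypothetical features, not the paper's data).
rng = np.random.default_rng(0)
n = 2000
hour = rng.integers(0, 24, size=n)            # time-varying predictor
poi_density = rng.uniform(0.0, 1.0, size=n)   # built-environment predictor
parking_spaces = rng.integers(2, 20, size=n)  # station attribute (unused by y)
X = np.column_stack([hour, poi_density, parking_spaces]).astype(float)

# Assumed nonlinear ground truth: daily cycle plus a quadratic density effect.
y = (3.0 + 2.0 * np.sin(2 * np.pi * hour / 24)
     + 1.5 * poi_density ** 2
     + rng.normal(0.0, 0.5, size=n))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Gradient boosting regression trees; hyperparameters are illustrative only.
model = GradientBoostingRegressor(
    n_estimators=300, learning_rate=0.05, max_depth=3, random_state=0
)
model.fit(X_train, y_train)

# Held-out mean squared error, the accuracy metric quoted in the abstract.
mse = mean_squared_error(y_test, model.predict(X_test))

# Relative importance of each predictor (values sum to 1).
importances = model.feature_importances_

# Partial dependence of predicted use on `hour` (column 0), averaged over
# the other predictors; plotting this curve would give the PDP.
pd_result = partial_dependence(model, X_train, features=[0], kind="average")
pd_curve = pd_result["average"][0]

print(f"test MSE: {mse:.4f}")
print("importances (hour, poi_density, parking_spaces):", importances.round(3))
```

In this toy setup the time-varying predictor dominates the importance ranking simply because the synthetic target was built that way; with real station data the ranking would have to be estimated rather than assumed.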
