Development of a Platform for Automatic Forecasting of Multiple Time Series in Demand Planning Tasks

Student: Rysin Nikita

Supervisor: Ashish Kumar Jha

Faculty: Graduate School of Business

Educational Programme: Big Data Systems (Master)

Year of Graduation: 2022

The thesis is devoted to the development of a scalable IT solution for forecasting a large number of time series in mass demand planning and forecasting problems. The result of the work was validated on an ATM cash demand forecasting case for a commercial bank. A scalable module for forecasting hierarchically organized time series was developed with the following features: algorithms for preprocessing and preparing data; automated generation of a feature space; algorithms for detecting downtimes and processing them by approximating demand using geolocation clustering, proximity and similarity between ATMs; algorithms for detecting and processing anomalies (CUSUM); labelling of mass-payment events based on identified periodic anomalies; model building and selection of the best model by cross-validation in a sliding window; and feature selection and hyperparameter selection. The module also includes an algorithm for identifying the cannibalization effect between ATMs located close to each other. Several models and architectures were analyzed and compared: SARIMA, Prophet, LSTM and GBDT. Gradient boosting over decision trees, built separately for each cluster of ATMs, was selected as the best forecasting model. The ATM cash demand forecasting model was implemented as an ETL process, and integration with the data sources (Hadoop, PostgreSQL) was set up. The services used for development and implementation (Hadoop, PostgreSQL, Apache Spark, Apache Airflow) were deployed in a Docker container. The financial effect of the implementation was estimated using an A/B test: the forecasting solution reduces ATM servicing costs by 10%. Due to its flexibility and modularity, the forecasting module can be reused in other processes where time series forecasting is needed, which saves labor costs when developing multiple time series forecasting models.
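The summary names CUSUM as the anomaly detection technique but gives no details. As a rough illustration only, and not the thesis's actual implementation, a two-sided tabular CUSUM on a single demand series might look like the sketch below. The function name `cusum_anomalies`, the parameters `baseline`, `k` and `h`, and the sample data are all assumptions made for this example.

```python
from statistics import mean, pstdev


def cusum_anomalies(series, baseline=10, k=0.5, h=4.0):
    """Flag indices where a two-sided tabular CUSUM statistic exceeds h.

    Hypothetical helper, not taken from the thesis.
    baseline: number of leading points used to estimate the normal level.
    k: slack (in standard deviations) tolerated before drift accumulates.
    h: decision threshold (in standard deviations).
    """
    reference = series[:baseline]
    mu = mean(reference)
    sigma = pstdev(reference)
    if sigma == 0:
        # A flat reference window gives no scale to standardize against.
        return []

    upper, lower = 0.0, 0.0
    flagged = []
    for i, value in enumerate(series):
        z = (value - mu) / sigma
        # Accumulate standardized deviations above and below the mean.
        upper = max(0.0, upper + z - k)
        lower = max(0.0, lower - z - k)
        if upper > h or lower > h:
            flagged.append(i)
            # Restart accumulation after each alarm.
            upper, lower = 0.0, 0.0
    return flagged


# Made-up daily withdrawals for one ATM: stable demand, then a sudden jump.
demand = [100, 102, 98, 101, 99, 100, 103, 97, 100, 100,
          100, 101, 160, 165, 170, 168]
anomalies = cusum_anomalies(demand)
```

Here the first ten points serve as the reference window, so the jump starting at index 12 is the part that gets flagged. In a real pipeline the reference level would more likely come from a seasonal or model-based forecast than from a fixed window.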
