PTQ Q4 2023 Issue

The ‘champion algorithm’ system selects the most effective predictive model based on the most recent data. This dynamism enables the model to adapt to shifts in the underlying patterns in the data, resulting in more accurate predictions over time. The Time-Series Split, particularly useful when dealing with time-series data, provides a more robust validation technique that avoids data leakage and overfitting. This is achieved by creating a ‘rolling’ train-test split that simulates real-time forecasting.²

Production and consumption amounts that were not modelled were incorporated into the calculation as rolling averages over two-hour periods. After the predictions and calculations for steam, fuel gas, vent gas, and natural gas were completed, these values were converted into a standard unit using energy balance equations. The boiler’s total steam load was first calculated from the steam production and consumption values and the energy balance equations. Subsequently, estimates were made of how much fuel gas the aromatics plant could produce and how much of the fuel gas produced by the ethylene plant would go to the boilers, based on the fuel gas production and consumption values. Finally, the amount of natural gas needed by the gas turbine was determined from its natural gas prediction. All these data were consolidated to ascertain the likely natural gas requirement of Petkim.

Model inputs can be modified by users, so changes in Petkim’s natural gas demand under various conditions can be observed. For example, by changing the raw material input of the aromatics plant, it is possible to see how much fuel gas that plant can produce. Users can then prepare for these scenarios and take appropriate action.

Results and discussion

The prediction model’s performance is evaluated by comparing it with actual consumption. Natural gas consumption is forecast for the day-ahead period.
The consumption data are converted into dimensionless values by dividing daily consumption by the highest value. Data from three months belonging to different seasons are analysed in Figures 3 to 5 to observe how the model is affected by seasonal changes. In Figure 3, actual consumption is slightly higher than the model prediction, with a MAPE of 4.5%. Owing to an unexpected situation in the complex, the data feeding the model prediction broke down, and the error reached its highest value in mid-April. Figure 4 shows projected data against actual consumption with a MAPE of 7.9%, the highest of the three months. Actual consumption fluctuates in winter because of unpredicted steam consumption, which also increases natural gas consumption. The model performs best in spring and summer because ambient temperature is slightly correlated with the natural gas consumption of the complex, so decreasing ambient temperature leads to deviation in the model estimations. With rising ambient temperature and stabilising process conditions, the variance decreases and the MAPE falls to 2.8%, as shown in Figure 5. Model predictions are very close to actual consumption values in May.
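The normalisation and error metric described above can be illustrated with a minimal Python sketch. The function names and the sample figures are hypothetical, not plant data, and the sketch assumes that "the highest value" means the peak of the actual daily consumption.

```python
def normalise_by_peak(values, peak):
    """Divide every value by a common peak so the series becomes dimensionless."""
    if peak <= 0:
        raise ValueError("peak consumption must be positive")
    return [v / peak for v in values]


def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical daily consumption in arbitrary units (illustration only).
actual_daily = [80.0, 100.0, 90.0, 95.0]
model_daily = [76.0, 96.0, 88.0, 93.0]

# Assumption: both series are scaled by the peak of the actual series.
peak = max(actual_daily)
actual_norm = normalise_by_peak(actual_daily, peak)
model_norm = normalise_by_peak(model_daily, peak)

error_pct = mape(actual_norm, model_norm)
```

Because both series are divided by the same constant, the MAPE of the normalised series equals the MAPE of the raw series; the normalisation only affects how the curves are plotted.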

[Figure: line chart with two series, Actual and Model; y-axis from 0.6 to 1.1; x-axis dates with ticks at 1/4/2023, 11/4/2023, and 21/4/2023]

Figure 3 Actual vs model prediction natural gas consumption in April

consumption/production values, integrating model predictions for fuel gas, steam, and natural gas using the energy balance equations referenced in the ‘Energy Balance’ section. The gas turbine accounts for approximately 15-45% of the total hourly natural gas requirement. A separate model predicts the amount of natural gas the gas turbine draws, and its results are also integrated into the energy balance equations.

Logic of advanced analytics models for predictions

Dynamic and retrainable models were chosen to predict these production and consumption values. The modelling approach was implemented as a function, so the same logic was applied to predict all productions and consumptions. A regression-based model was developed to enable users to simulate the product quickly. The modelling approach integrated several algorithms, including LassoRegressor and HuberRegressor.

Using the GridSearchCV methodology, the algorithm and parameter combination for predicting the next hour were identified. The model logic, leveraging time-series validation, divided all data into five parts. For each split, the model combination with the highest R² and lowest MAPE (mean absolute percentage error) was identified and stored as the champion algorithm. This process was repeated for several time splits. The champion of the final split received the highest score, while the champion of the first split received the lowest. Scores were then totalled, and the model combination with the highest total score was used for prediction. This modelling approach enables the creation of dynamic models that can be continually retrained.
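The champion scoring described in this section can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the candidate names and metric values are hypothetical, the per-split winner is taken as the highest R² with MAPE used only as a tie-breaker, and the score of a split is assumed to equal its position (1 for the oldest split, n for the newest).

```python
from collections import defaultdict


def pick_champion(split_results):
    """Return the overall champion and the score table.

    split_results is ordered from the oldest split to the newest. Each item
    maps a candidate name to an (r2, mape) tuple for that split's test window.
    """
    scores = defaultdict(int)
    for position, metrics in enumerate(split_results, start=1):
        # Assumed rule: highest R2 wins, lowest MAPE breaks ties.
        winner = max(metrics, key=lambda name: (metrics[name][0], -metrics[name][1]))
        # Assumed weighting: the first split scores 1, the last split scores n.
        scores[winner] += position
    champion = max(scores, key=scores.get)
    return champion, dict(scores)


# Hypothetical metrics for two candidates over five time splits.
split_results = [
    {"lasso_alpha_0.1": (0.80, 6.0), "huber_eps_1.35": (0.78, 6.5)},
    {"lasso_alpha_0.1": (0.82, 5.5), "huber_eps_1.35": (0.85, 5.0)},
    {"lasso_alpha_0.1": (0.84, 5.1), "huber_eps_1.35": (0.83, 5.3)},
    {"lasso_alpha_0.1": (0.86, 4.6), "huber_eps_1.35": (0.88, 4.2)},
    {"lasso_alpha_0.1": (0.87, 4.4), "huber_eps_1.35": (0.90, 3.9)},
]

champion, scores = pick_champion(split_results)
```

Weighting later splits more heavily favours candidates that perform well on the most recent data, which matches the stated goal of adapting to shifts in the underlying patterns.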
GridSearchCV is powerful because it exhaustively searches the hyper-parameter grid, cross-validating each combination to determine the one that gives the best performance.¹
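A search over both algorithm and parameters with time-ordered validation might look like the following scikit-learn sketch. The data are synthetic, the parameter grids are arbitrary, and the pipeline layout is an assumption; scikit-learn's `Lasso` stands in for the Lasso regressor named in the text. Note that `GridSearchCV` averages the score over all folds, which is simpler than the weighted per-split champion scoring described in the article.

```python
import numpy as np
from sklearn.linear_model import HuberRegressor, Lasso
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic hourly data standing in for plant measurements (not real data).
rng = np.random.default_rng(0)
n_hours = 500
X = rng.normal(size=(n_hours, 3))
y = 50.0 + X @ np.array([4.0, -2.0, 1.0]) + rng.normal(scale=0.5, size=n_hours)

# The "model" step is a placeholder that the grid below swaps out.
pipeline = Pipeline([("scale", StandardScaler()), ("model", Lasso())])

# Each dict pairs one algorithm with its own (illustrative) parameter grid.
param_grid = [
    {"model": [Lasso(max_iter=10_000)], "model__alpha": [0.01, 0.1, 1.0]},
    {"model": [HuberRegressor(max_iter=1_000)], "model__epsilon": [1.2, 1.35, 1.5]},
]

search = GridSearchCV(
    pipeline,
    param_grid,
    scoring="neg_mean_absolute_percentage_error",
    cv=TimeSeriesSplit(n_splits=5),  # each training window precedes its test window
)
search.fit(X, y)

best_params = search.best_params_
best_mape_pct = -100.0 * search.best_score_  # scorer returns negated fractional MAPE
```

Using `TimeSeriesSplit` rather than shuffled folds keeps every test window strictly later than its training window, which is what prevents future observations from leaking into training.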
