1. Machine Learning Approach
Training Process
A dataset is split into two parts:
- Training set (typically 80% of data): The model uses this data to learn patterns and relationships.
- Test set (typically 20% of data): Used to evaluate the model's performance on unseen data.
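As a minimal sketch of the split described above, the following uses only the Python standard library. The `station_id` records, the fixed seed, and the function name are illustrative assumptions, not details of the actual pipeline.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle rows reproducibly and split them into (train, test)."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)                  # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle gives a repeatable split
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical records, one per charging station
stations = [{"station_id": i} for i in range(100)]
train, test = train_test_split(stations)
print(len(train), len(test))
```

Shuffling before splitting keeps the test set from being biased by the order in which records were collected, and the test rows are never shown to the model during training.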
Data Quality
The predictive accuracy of the model is highly dependent on the data used. Key factors include:
- Quantity: Sufficient historical data to train the model effectively.
- Quality and Reliability: Clean, accurate data to avoid introducing bias or errors.
- Representativeness: Ensuring the dataset reflects a diverse range of real-world scenarios.
To ensure robustness, data was collected from multiple diverse sources, enabling the model to account for various real-world conditions.
2. Feature Selection
The goal of the model is to predict the total energy consumption per charging plug at EV charging stations during the year following their deployment. Features were selected based on their relevance to this target variable and fall into three groups:
Infrastructure Features
- Charger type categorization (fast/slow/rapid)
- Number of charging plugs per station
Traffic-Related Features
- Traffic density on highways
- Traffic on local and residential roads
- Traffic on primary roads
- Traffic on secondary and tertiary roads
Socio-economic and Demographic Indicators
- Population density
- Purchasing power per capita at neighborhood level
- Number of electric vehicles at neighborhood level
- Electric vehicle penetration rate
- Total number of vehicles at the neighborhood scale
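One way to picture these features as model input is a single record per station that is flattened into a numeric vector. The sketch below is illustrative only: every field name is hypothetical, secondary and tertiary road traffic is assumed to be one combined feature, and one-hot encoding of the charger type is an assumption about how the category could be represented.

```python
from dataclasses import dataclass

CHARGER_TYPES = ("slow", "fast", "rapid")

@dataclass
class StationFeatures:
    # Infrastructure (field names are hypothetical)
    charger_type: str
    plugs_per_station: int
    # Traffic
    traffic_highway: float
    traffic_local_residential: float
    traffic_primary: float
    traffic_secondary_tertiary: float
    # Socio-economic and demographic indicators (neighborhood level)
    population_density: float
    purchasing_power_per_capita: float
    ev_count: int
    ev_penetration_rate: float
    vehicle_count: int

    def to_vector(self):
        """One-hot encode the charger type, then append the numeric features."""
        if self.charger_type not in CHARGER_TYPES:
            raise ValueError(f"unknown charger type: {self.charger_type!r}")
        one_hot = [1.0 if self.charger_type == t else 0.0 for t in CHARGER_TYPES]
        numeric = [
            self.plugs_per_station, self.traffic_highway,
            self.traffic_local_residential, self.traffic_primary,
            self.traffic_secondary_tertiary, self.population_density,
            self.purchasing_power_per_capita, self.ev_count,
            self.ev_penetration_rate, self.vehicle_count,
        ]
        return one_hot + [float(v) for v in numeric]

# Made-up values for illustration only
example = StationFeatures("fast", 4, 1200.0, 300.0, 800.0, 450.0,
                          5200.0, 24000.0, 150, 0.03, 5000)
print(example.to_vector())
```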
3. Model Selection
After several approaches were evaluated, Random Forest was selected as the modelling technique because it can capture non-linear relationships in the data. This matters here because features interact differently in different contexts.
Random Forest combines many decision trees, each trained on a different random sample of the data and features, then aggregates their individual predictions to produce a final result. This approach is powerful because it can capture complex interactions between features (e.g., how traffic patterns might affect energy consumption differently depending on the charger type and location), and it provides feature-importance estimates that show which factors matter most for predictions.
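The "train many trees on resampled data, then average" idea can be sketched in a few lines of plain Python. This is a deliberately simplified stand-in, not the production model: each tree here is a one-split "stump" on one randomly chosen feature, whereas a real Random Forest grows deeper trees and samples features at every split. The toy data and all function names are assumptions made for illustration.

```python
import random
from statistics import mean

def fit_stump(X, y, feature):
    """Fit a one-split regression tree on one feature; return a predict function."""
    values = sorted({row[feature] for row in X})
    best = None
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2  # candidate split between two observed values
        left = [t for row, t in zip(X, y) if row[feature] <= threshold]
        right = [t for row, t in zip(X, y) if row[feature] > threshold]
        left_mean, right_mean = mean(left), mean(right)
        sse = (sum((t - left_mean) ** 2 for t in left)
               + sum((t - right_mean) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # feature is constant in this sample: predict the mean
        overall = mean(y)
        return lambda row: overall
    _, threshold, left_mean, right_mean = best
    return lambda row: left_mean if row[feature] <= threshold else right_mean

def fit_forest(X, y, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample and a randomly chosen feature."""
    rng = random.Random(seed)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]  # sample rows with replacement
        feature = rng.randrange(len(X[0]))        # random feature for this tree
        trees.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feature))
    return trees

def predict(trees, row):
    """Aggregate by averaging the individual tree predictions."""
    return mean(tree(row) for tree in trees)

# Toy data: the target grows with column 0; column 1 is uninformative
X = [[float(i), float(i % 4)] for i in range(20)]
y = [10.0 * row[0] for row in X]
forest = fit_forest(X, y)
low, high = [0.0, 0.0], [19.0, 0.0]
print(predict(forest, low), predict(forest, high))
```

Averaging over trees that each saw a slightly different view of the data is what makes the ensemble more stable than any single tree.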
This type of model is used by leading companies across industries:
- Spotify uses Random Forest for music recommendations
- Amazon uses it for product recommendations
- JP Morgan uses machine learning models such as Random Forest for financial forecasting and algorithmic trading
For example, the importance the model assigns to each feature varies with the type of location:
- In highway-adjacent locations, traffic on high-speed roads emerges as a crucial predictor
- In urban settings, the addressable car fleet (local EV population), purchasing power, and traffic on residential roads carry more weight
The model's parameters were chosen by selecting the values that minimized the percentage error.
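Selecting parameters by minimizing a percentage error can be sketched as a small grid search. The sketch below assumes the error is MAPE measured on a held-out validation subset, which the article does not state explicitly. To stay short and dependency-free it tunes `k` in a toy nearest-neighbour regressor rather than a Random Forest; the data, the `fit_knn` helper, and the grid are all hypothetical.

```python
from itertools import product

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    return 100 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

def grid_search(param_grid, fit, X_train, y_train, X_val, y_val):
    """Try every parameter combination; keep the one with the lowest validation MAPE."""
    names = list(param_grid)
    best_params, best_score = None, float("inf")
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        model = fit(X_train, y_train, **params)  # model is a callable: row -> prediction
        score = mape(y_val, [model(row) for row in X_val])
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score

def fit_knn(X, y, k):
    """Stand-in model: average the targets of the k nearest training rows (1-D)."""
    def predict(row):
        nearest = sorted(range(len(X)), key=lambda i: abs(X[i][0] - row[0]))[:k]
        return sum(y[i] for i in nearest) / k
    return predict

# Made-up data for illustration only
X_train = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
y_train = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
X_val, y_val = [[2.4], [4.6]], [24.0, 46.0]

best_params, best_score = grid_search({"k": [1, 2, 3]}, fit_knn,
                                      X_train, y_train, X_val, y_val)
print(best_params, round(best_score, 2))
```

The same loop applies to Random Forest settings such as the number of trees or the maximum tree depth; only the `fit` function and the grid change.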
4. Data Sources and Results
The training data was collected from multiple charging station operators covering a wide range of location types:
- Highways
- Residential areas
- Urban areas
- Suburban areas
These operators provided detailed data about their stations' locations and performance metrics. This diverse set of data sources ensures the model captures patterns across different operating environments and usage scenarios.
Results:
| Charger Type | MAE | MAPE |
|---|---|---|
| fast | 920.6 | 29.1% |
| rapid | 9358.4 | 22.3% |
| slow | 546.4 | 26.4% |
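For readers unfamiliar with the two metrics, the sketch below uses their standard definitions: MAE (mean absolute error) is expressed in the units of the target, while MAPE (mean absolute percentage error) is relative to the true value. The sample numbers are made up and are not taken from the model's data.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference, in the same units as the target."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_absolute_percentage_error(y_true, y_pred):
    """Average absolute difference relative to the true value, in percent."""
    return 100 * sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)

# Made-up values: the second set is the first scaled by 10
small_true, small_pred = [1000.0, 2000.0], [1200.0, 1800.0]
large_true, large_pred = [10000.0, 20000.0], [12000.0, 18000.0]

for name, t, p in [("small", small_true, small_pred), ("large", large_true, large_pred)]:
    print(name,
          f"MAE={mean_absolute_error(t, p):.1f}",
          f"MAPE={mean_absolute_percentage_error(t, p):.1f}%")
```

Scaling the targets by ten multiplies MAE by ten but leaves MAPE unchanged. This is why a charger type can show a much larger MAE than the others and still have a comparable or lower MAPE, which suggests its typical consumption per plug is simply higher.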