Charge Prediction – Help Center - Mytraffic

1. Machine Learning Approach

Training Process

A dataset is split into two parts:

Training set (typically 80% of data): The model uses this data to learn patterns and relationships.
Test set (typically 20% of data): Used to evaluate the model's performance on unseen data.
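
As an illustration of the split described above, here is a minimal sketch in plain Python. The function name, the `station_id` field, and the fixed seed are assumptions made for this example, not details of the actual pipeline.

```python
import random

def train_test_split(records, test_fraction=0.2, seed=42):
    """Shuffle a copy of the records and split it into (train, test)."""
    shuffled = list(records)               # copy so the input is not modified
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Placeholder records standing in for historical charging-station data.
stations = [{"station_id": i} for i in range(100)]
train_set, test_set = train_test_split(stations)
print(len(train_set), len(test_set))
```

Shuffling before splitting keeps both sets representative when the records are stored in a meaningful order, for example by operator or by date.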

Data Quality

The predictive accuracy of the model is highly dependent on the data used. Key factors include:

Quantity: Sufficient historical data to train the model effectively.
Quality and Reliability: Clean, accurate data to avoid introducing bias or errors.
Representativeness: Ensuring the dataset reflects a diverse range of real-world scenarios.

To ensure robustness, data was collected from multiple diverse sources, enabling the model to account for various real-world conditions.

2. Feature Selection

The model predicts the total energy consumption per charging plug at EV charging stations during the year following their installation. Features were selected for their relevance to this target variable and fall into three groups:

Infrastructure Features

Charger type (slow, fast, or rapid)
Number of charging plugs per station

Traffic-Related Features

Traffic density on highways
Traffic on local and residential roads
Traffic on primary roads
Traffic on secondary and tertiary roads

Socio-economic and Demographic Indicators

Population density
Purchasing power per capita at neighborhood level
Number of electric vehicles at neighborhood level
Electric vehicle penetration rate
Total number of vehicles at the neighborhood scale
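
To make the feature groups above concrete, the following sketch shows one way a single station could be represented before being passed to a model. All field names, the one-hot encoding of the charger type, and the sample values are assumptions for illustration; the article does not specify how the features are encoded.

```python
from dataclasses import dataclass, asdict

CHARGER_TYPES = ("slow", "fast", "rapid")

@dataclass
class StationFeatures:
    # Infrastructure
    charger_type: str
    plugs_per_station: int
    # Traffic
    traffic_highway: float
    traffic_local_residential: float
    traffic_primary: float
    traffic_secondary_tertiary: float
    # Socio-economic and demographic (neighborhood level)
    population_density: float
    purchasing_power_per_capita: float
    ev_count: int
    ev_penetration_rate: float
    vehicle_count: int

    def to_vector(self):
        """One-hot encode the charger type, then append the numeric features."""
        if self.charger_type not in CHARGER_TYPES:
            raise ValueError(f"unknown charger type: {self.charger_type}")
        one_hot = [1.0 if self.charger_type == t else 0.0 for t in CHARGER_TYPES]
        numeric = [float(v) for k, v in asdict(self).items() if k != "charger_type"]
        return one_hot + numeric

# Placeholder values, not real measurements.
station = StationFeatures("fast", 4, 12000.0, 800.0, 5400.0, 2100.0,
                          3500.0, 24000.0, 310, 0.04, 7750)
vector = station.to_vector()
print(len(vector))
```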

3. Model Selection

After evaluating multiple approaches, Random Forest was selected as the optimal modelling technique thanks to its ability to capture non-linear relationships in the data. This choice was particularly important given the complex interactions between features in different contexts.

Random Forest trains multiple decision trees, each on a different random sample of the data, then aggregates their individual predictions to produce a final result. This approach is particularly powerful because it can capture complex interactions between features (e.g., how traffic patterns might affect energy consumption differently depending on the charger type and location), and it helps us understand which factors matter most for the predictions.
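
The aggregation idea can be sketched with a toy ensemble in plain Python. This is a deliberately simplified illustration, not the production model: each "tree" is a single-split stump on one feature, the random feature selection of a real Random Forest is omitted, and the data is synthetic.

```python
import random
from statistics import mean

def fit_stump(xs, ys):
    """Fit a one-split regression tree (a 'stump') on a single feature."""
    best = None
    # Try every distinct value except the largest, so both sides are non-empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        left_mean, right_mean = mean(left), mean(right)
        error = (sum((y - left_mean) ** 2 for y in left)
                 + sum((y - right_mean) ** 2 for y in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:  # the sample held a single distinct x value
        overall = mean(ys)
        return (xs[0], overall, overall)
    return best[1:]

def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean

def fit_forest(xs, ys, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(xs)) for _ in xs]
        forest.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest

def predict_forest(forest, x):
    """Aggregate by averaging the individual tree predictions."""
    return mean(predict_stump(stump, x) for stump in forest)

# Synthetic, non-linear relationship: consumption jumps above a traffic level.
traffic = list(range(1, 21))
energy = [100 if t <= 10 else 300 for t in traffic]
forest = fit_forest(traffic, energy)
low = predict_forest(forest, 2)
high = predict_forest(forest, 19)
print(low, high)
```

Because every stump sees a slightly different sample, the trees place their split at slightly different points, and averaging them smooths out the choice made by any single tree.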

This type of model is used by leading companies across industries:

Spotify uses Random Forest for music recommendations
Amazon uses it for product recommendations
JP Morgan uses machine learning models such as Random Forest for financial forecasting and algorithmic trading

The most important predictors vary with the type of location. For example:

In highway-adjacent locations, traffic on high-speed roads emerges as a crucial predictor
In urban settings, the model gives greater weight to the addressable car fleet (local EV population), purchasing power, and traffic on residential roads

The model's parameters were chosen by selecting the values that minimized the percentage error.
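
A selection procedure of this kind can be sketched as a simple search over candidate values. The example below is an assumption-laden illustration: it uses a k-nearest-neighbour average as a stand-in model (not the Random Forest itself), a single hypothetical parameter `k`, and synthetic data, purely to show the loop that keeps the candidate with the lowest percentage error.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)

def knn_predict(train, x, k):
    """Stand-in model: average target of the k training points closest to x."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def select_best(candidates, train, validation):
    """Score every candidate on held-out data and keep the lowest MAPE."""
    actual = [y for _, y in validation]
    scores = {}
    for k in candidates:
        predicted = [knn_predict(train, x, k) for x, _ in validation]
        scores[k] = mape(actual, predicted)
    best = min(scores, key=scores.get)
    return best, scores

# Synthetic (feature, target) pairs, not real station data.
train = [(float(x), 50.0 + x * x) for x in range(0, 40, 2)]
validation = [(float(x), 50.0 + x * x) for x in range(1, 40, 6)]
candidates = (1, 3, 5, 9)
best_k, scores = select_best(candidates, train, validation)
print(best_k, scores)
```

Scoring on data that was not used for fitting matters here: a candidate chosen on the training data alone would tend to favor settings that memorize it.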

4. Data Sources and Results

The training data was collected from multiple charging station operators covering a wide range of location types:

Highways
Residential areas
Urban areas
Suburban areas

These operators provided detailed data about their stations' locations and performance metrics. This diverse set of data sources ensures the model captures patterns across different operating environments and usage scenarios.

Results:

Charger Type	MAE	MAPE
fast	920.6	29.1%
rapid	9358.4	22.3%
slow	546.4	26.4%
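
For readers unfamiliar with the two metrics in the table, MAE (mean absolute error) is the average absolute difference between predicted and actual values, in the unit of the target, and MAPE (mean absolute percentage error) is the average of those differences expressed as a percentage of the actual values. The sketch below computes both on made-up numbers, which are illustrative only and unrelated to the results above.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference, in the same unit as the target."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mean_absolute_percentage_error(actual, predicted):
    """Average absolute difference relative to the actual value, in percent."""
    return 100 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)

# Illustrative values only.
actual = [1000.0, 2000.0, 4000.0]
predicted = [1100.0, 1800.0, 4400.0]

mae = mean_absolute_error(actual, predicted)
mape = mean_absolute_percentage_error(actual, predicted)
print(round(mae, 1), round(mape, 1))
```

In this example every prediction is off by 10%, so MAPE is 10%, while MAE is dominated by the largest station. This is why a charger type with high consumption per plug can show a large MAE alongside a comparatively low MAPE.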
