Paper "Context-Aware Inference via Performance Forecasting in Decentralized Learning Networks"

Thread for discussion of Context-Aware Inference via Performance Forecasting in Decentralized Learning Networks.

Optimizing Decentralized Online Learning for Supervised Regression and Classification Problems

ADI 2, 40-56; October 9, 2025

Joel Pfeffer, J. M. Diederik Kruijssen, Clément Gossart, Mélanie Chevance, Diego Campo Millan, Florian Stecker, Steven N. Longmore

In decentralized learning networks, predictions from many participants are combined to generate a network inference. While many studies have demonstrated performance benefits of combining multiple model predictions, existing strategies using linear pooling methods (ranging from simple averaging to dynamic weight updates) face a key limitation. Dynamic prediction combinations that rely on historical performance to update weights are necessarily reactive. Due to the need to average over a reasonable number of epochs (e.g. with moving averages or exponential weighting), they tend to be slow to adjust to changing circumstances (e.g. phase or regime changes). In this work, we develop a model that uses machine learning to forecast the performance of predictions by models at each epoch in a time series. This enables ‘context-awareness’ by assigning higher weight to models that are likely to be more accurate at a given time. We show that adding a performance forecasting worker in a decentralized learning network, following a design similar to the Allora network, can improve the accuracy of network inferences. Specifically, we find that forecasting models that predict regret (performance relative to the network inference) or regret z-score (performance relative to other workers) show greater improvement than models predicting losses, which often do not outperform the naive network inference (historically weighted average of all inferences). Through a series of optimization tests, we show that the performance of the forecasting model can be sensitive to choices in the feature set and number of training epochs. These properties may depend on the exact problem and should be tailored to each domain. Although initially designed for a decentralized learning network, using performance forecasting for prediction combination may be useful in any situation where predictive rather than reactive model weighting is needed.


A few months back we published this paper on developing forecaster models for the Allora network. Forecasters are critical as they are the component that enables context awareness. Without forecasters, inference synthesis (i.e. the model aggregation step) is necessarily backward-looking as it can only rely on past performance.

The central idea is that forecasters weight the predictions from inferers on network topics by their expected performance in order to generate a forecast-implied inference. That is, instead of asking “Who performed best recently?”, forecasters ask “Who will perform best now?”
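To make that concrete, here is a minimal sketch of a forecast-implied inference: per-inferer performance forecasts are turned into combination weights, and the inferences are averaged under those weights. The softmax mapping from forecasted regrets to weights is an illustrative assumption here, not the network's exact synthesis rule.

```python
import numpy as np

def forecast_implied_inference(inferences: np.ndarray,
                               forecasted_regrets: np.ndarray,
                               temperature: float = 1.0) -> float:
    """Combine inferer predictions using forecasted, not historical, performance.

    `forecasted_regrets` are the forecaster's per-inferer outputs for the
    current epoch (higher = expected to do better). The softmax mapping from
    regrets to weights is an illustrative choice, not the network's exact rule.
    """
    z = forecasted_regrets / temperature
    weights = np.exp(z - z.max())   # subtract max for numerical stability
    weights /= weights.sum()
    return float(np.dot(weights, inferences))

# Example: the third inferer is forecast to outperform now, so it dominates.
preds = np.array([3012.5, 2998.1, 3005.7])   # e.g. ETH/USD predictions
regrets = np.array([-0.2, 0.1, 0.6])         # forecasted regrets per inferer
print(forecast_implied_inference(preds, regrets))
```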

The paper explores how training forecaster models on different measures of worker “performance” impacts their effectiveness, namely:

  • Log losses: the quantity used for evaluating models on the network.
  • Regrets: log loss relative to that of the network inference.
  • Regret z-scores: regret normalized relative to the mean regret of all inferers.

The latter two are measures of outperformance (relative to the network or other inferers), rather than absolute performance like log loss; a sketch of how the three targets are derived follows below.
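The following sketch derives each candidate target from per-epoch log losses. The sign convention (positive regret means the inferer beat the network inference) is our assumption for the example.

```python
import numpy as np

def build_targets(inferer_log_losses: np.ndarray, network_log_loss: float):
    """Derive the three candidate forecasting targets for a single epoch.

    Sign convention (an assumption for this sketch): positive regret means
    the inferer beat the network inference.
    """
    log_losses = inferer_log_losses                   # absolute performance
    regrets = network_log_loss - inferer_log_losses   # outperformance vs the network
    # z-score: outperformance relative to the other inferers at this epoch
    spread = regrets.std()
    z_scores = (regrets - regrets.mean()) / spread if spread > 0 else np.zeros_like(regrets)
    return log_losses, regrets, z_scores
```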

We found that forecasting models predicting raw losses barely outperformed the naive network inference, while regret and z-score models systematically performed better. We suggest this is because outperformance (regret or z-score) is a simpler target: it avoids having to forecast the absolute performance of inferers (loss). This demonstrates how transforming the target variable can yield improved results in machine learning.


Figure: Performance of forecasting models over 100 backtests against the naive network inference for testnet Topic 13 (ETH/USD 5-minute predictions).

In addition to the model target, we explored different feature sets and training configurations (a single “global” model vs “per-inferer” models) to optimize the models against controlled experiments and live network data (from testnet topics 13 and 14). We found that per-inferer models (i.e. training a new ML model for each inferer) have more ‘context awareness’ than global models, as they isolate the performance of each worker. In practice, we use the global model as a fallback for making forecasts for inferers with insufficient data.
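In code, that configuration might look like the sketch below. The `MIN_HISTORY` threshold and the use of `LinearRegression` as the learner are illustrative assumptions, not the paper's exact setup; the point is the per-inferer/global structure, not the specific model.

```python
from sklearn.linear_model import LinearRegression

MIN_HISTORY = 50  # illustrative threshold; tune per topic and problem

class ForecasterEnsemble:
    """Per-inferer forecasting models with a global model as fallback."""

    def __init__(self):
        self.per_inferer = {}      # inferer id -> fitted model
        self.global_model = None   # pooled model used as fallback

    def train(self, features, targets):
        # `features`/`targets`: dicts mapping inferer id -> feature rows / values.
        pooled_X = [row for rows in features.values() for row in rows]
        pooled_y = [y for ys in targets.values() for y in ys]
        self.global_model = LinearRegression().fit(pooled_X, pooled_y)
        for inferer, rows in features.items():
            if len(rows) >= MIN_HISTORY:
                # Isolate this worker's performance history in its own model.
                self.per_inferer[inferer] = LinearRegression().fit(rows, targets[inferer])

    def forecast(self, inferer, row):
        # Per-inferer model when available; global model otherwise.
        model = self.per_inferer.get(inferer, self.global_model)
        return float(model.predict([row])[0])
```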

We are currently reviewing the model to make sure the implemented code matches the forecaster design. The base model handles network queries and data manipulation (target losses, etc.), enabling users to focus on ML model implementation and feature engineering. The intent is that this forecaster model provides a framework that network participants can build upon when deploying their own forecasting models; it will be made public in the future.
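As a rough picture of that split (base class owns the data plumbing, participants supply the model), the pattern might look like the following. All class and method names here are hypothetical, not the published API.

```python
class BaseForecaster:
    """Owns network queries and target preparation (losses, regrets, etc.)."""

    def fetch_training_data(self, topic_id: int):
        # Placeholder: in the real framework this would query the network and
        # assemble features/targets; stubbed here to keep the sketch short.
        raise NotImplementedError

    def predict(self, features):
        raise NotImplementedError  # participants plug their ML model in here


class MeanRegretForecaster(BaseForecaster):
    """Toy participant model: forecast each inferer's mean historical regret."""

    def __init__(self, regret_history):
        self.regret_history = regret_history  # {inferer id: [past regrets]}

    def predict(self, features):
        past = self.regret_history.get(features["inferer"], [])
        return sum(past) / len(past) if past else 0.0
```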
