Extend the Allora network to anomaly detection tasks

Similarly to the classification extension, we want to extend the Allora network to another class of topics: anomaly detection. In anomaly detection topics, consumers supply their own data to the network, and the network then decides whether this data is anomalous. A major difference from regression/classification problems is that a ground truth is generally not available (and sometimes not even clearly defined), so this is an example of an unsupervised problem.

To start, we should identify the main changes relative to regression and classification topics and decide how to handle them. Then we can implement these changes in the simulator, optimize parameters, and test whether everything works as intended.


For the most part, the Allora network architecture is general enough to accommodate anomaly detection problems without major changes. Similarly to classification tasks, inferences need to be vector-valued, where the i-th component represents the probability that the i-th element in a given list of data points is anomalous in the epoch.
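As a minimal illustration of this vector-valued inference format (the class and field names below are hypothetical, not actual Allora types), an inference for an anomaly detection topic could look like:

```python
from dataclasses import dataclass
from typing import List, Sequence

@dataclass
class AnomalyInference:
    """Vector-valued inference for an anomaly-detection topic.

    probabilities[i] is the worker's estimated probability that the
    i-th submitted data point is anomalous in the current epoch.
    """
    topic_id: int
    epoch: int
    probabilities: List[float]

def validate(inference: AnomalyInference, data_points: Sequence) -> None:
    # Each submitted data point must receive exactly one probability in [0, 1].
    assert len(inference.probabilities) == len(data_points)
    assert all(0.0 <= p <= 1.0 for p in inference.probabilities)
```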

The two main modifications required beyond the classification extension are:

1. Data Submission by Participants (Arguments):
Consumers (or potentially any participant) must be able to supply data to the network, which will then classify each point as anomalous or not. This requires an additional step before the “inference submission window,” where each interested consumer submits a list of data points. While the data type of these points should be fixed in the topic definition, the network itself does not interpret or process them beyond passing them on to participants. This flexibility allows for a wide range of anomaly detection applications (a minimal sketch of this submission step follows after this list).

2. Loss Functions Without a Fixed Ground Truth:
Many anomaly detection tasks do have a well-defined ground truth, but it is often unavailable in a timely manner. Other tasks may be inherently subjective or loosely defined, making it difficult to establish an agreed-upon ground truth at all. As a result, we need an alternative approach to assigning losses (and thus scores and regrets) to inferences.
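Returning to the first modification, here is a minimal Python sketch of the data-submission step as we currently picture it; the names (`AnomalyTopic`, `submit_data`, `data_type`) are hypothetical and not part of any existing chain code:

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

@dataclass
class AnomalyTopic:
    """Topic definition with a fixed data type for submitted points."""
    topic_id: int
    data_type: str                      # e.g. "float_vector", declared at topic creation
    submissions: Dict[str, List[Any]] = field(default_factory=dict)

    def submit_data(self, consumer_id: str, data_points: List[Any]) -> None:
        # Data-submission step that precedes the inference submission window.
        # The network only stores the points and forwards them to workers;
        # it never interprets their contents.
        self.submissions.setdefault(consumer_id, []).extend(data_points)

    def epoch_payload(self) -> List[Any]:
        # Concatenated list of all points that workers must classify this epoch.
        return [p for points in self.submissions.values() for p in points]
```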

For the second modification, one straightforward method is to have reputers collectively determine a ground truth and incentivize them based on consensus. This is what regression-type topics already do, so it would require no modification to the network. However, consensus-based approaches can struggle when participants have too much freedom in their assessments, much as optimization algorithms (e.g., gradient descent) perform better when initialized near an optimal solution.

Alternatively, we could eliminate reputers entirely and allow topics to define an objective loss function that directly evaluates inferences. In this model, inference workers still provide value, as identifying the minimizer of a loss function can be computationally challenging—this is even more true in clustering tasks, which share similarities with anomaly detection. However, for most topics, this approach is highly susceptible to overfitting: designing a loss function that aligns well with the intended goal is a significant challenge.

We propose a hybrid approach that balances both methods:

  • The topic defines an objective loss function with one or more adjustable parameters.
  • Reputers engage in a consensus game to determine the optimal values for these parameters in each epoch.

This approach reduces the degrees of freedom in the consensus process, improving convergence while still allowing for adaptability. Compared to a fixed loss function, it introduces some level of uncertainty for inference workers, discouraging overfitting to a static loss metric.
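This is not a prescribed mechanism, but a minimal Python sketch under assumed choices (a centroid-distance cross-entropy loss with a single contamination parameter, and a stake-weighted median as the consensus rule) could look like:

```python
import numpy as np

def parametric_loss(data: np.ndarray, probs: np.ndarray, contamination: float) -> float:
    """Hypothetical objective loss for an anomaly-detection topic.

    Points far from the data centroid are treated as the "expected"
    anomalies; `contamination` (the free parameter set by reputer
    consensus) is the assumed fraction of anomalous points.
    """
    dists = np.linalg.norm(data - data.mean(axis=0), axis=1)
    k = max(1, int(round(contamination * len(data))))
    # Target: 1 for the k most distant points, 0 otherwise.
    target = np.zeros(len(data))
    target[np.argsort(dists)[-k:]] = 1.0
    # Cross-entropy between the worker's probabilities and the target.
    eps = 1e-9
    return float(-np.mean(target * np.log(probs + eps)
                          + (1 - target) * np.log(1 - probs + eps)))

def consensus_parameter(reported: np.ndarray, stakes: np.ndarray) -> float:
    # Stake-weighted median of the reputers' reported parameter values.
    order = np.argsort(reported)
    cum = np.cumsum(stakes[order])
    return float(reported[order][np.searchsorted(cum, 0.5 * cum[-1])])
```

In each epoch, reputers would report their preferred parameter value, the consensus value would be computed from those reports, and inference losses would then be evaluated with `parametric_loss` at that consensus value.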


For certain anomaly detection tasks, it may be beneficial to enhance the outlined design with a supervised component. This component introduces a mechanism in which “synthetic” data—predefined as either anomalous or non-anomalous—is supplied to inferers. The responses to these queries can then be used to score inference workers.

Which Type of Actor is Best Suited for This Role?

  • Reputers would seem like a natural choice, as they are primarily responsible for scoring inference workers. However, reputers themselves receive scores based on consensus, which makes it difficult for them to submit queries to inferers in an uncoordinated manner. Additionally, we want to avoid adding further complexity to the reputer role beyond what has already been outlined in the existing design.

  • Forecasters, on the other hand, have the flexibility to query inferers with synthetic data and use the results to refine their loss forecasts. Notably, this approach does not require any modifications to the network’s core functionality, as the mechanism can be entirely implemented by forecasters. When a forecaster assigns a high weight to an inferer, that inferer’s contributions to network inference increase, and so do the rewards it receives.

    It is important to note that the losses a forecaster assigns to inferers are never directly compared to those reported by reputers or even by other forecasters. As a result, forecasters do not need to adhere to a specific loss scale. Some forecasters may choose to predict reputer-assigned losses using traditional methods, while others may employ supervised loss functions (e.g., mean squared error) based on the responses to queries with known ground truth.
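As a rough sketch of how a forecaster could implement such synthetic probes on its own (the probe distributions, the `inferers` interface, and the choice of log loss are all assumptions for illustration, not part of the protocol):

```python
import numpy as np

def synthetic_probe_losses(inferers, n_points: int = 64, dim: int = 8, seed: int = 0):
    """Hypothetical forecaster-side probe using synthetic labelled data.

    `inferers` maps an inferer id to a callable returning anomaly
    probabilities for a batch of points; how the forecaster actually
    queries inferers is left abstract here.
    """
    rng = np.random.default_rng(seed)
    # Non-anomalous points drawn near the origin; anomalous points far away.
    normal = rng.normal(0.0, 1.0, size=(n_points, dim))
    anomalous = rng.normal(6.0, 1.0, size=(max(1, n_points // 4), dim))
    data = np.vstack([normal, anomalous])
    labels = np.concatenate([np.zeros(len(normal)), np.ones(len(anomalous))])

    eps = 1e-9
    losses = {}
    for inferer_id, predict in inferers.items():
        probs = np.clip(np.asarray(predict(data)), eps, 1 - eps)
        # Supervised log loss against the known synthetic labels; forecasters
        # are free to use any loss scale, since these values are never
        # compared to reputer-assigned losses.
        losses[inferer_id] = float(-np.mean(
            labels * np.log(probs) + (1 - labels) * np.log(1 - probs)))
    return losses
```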

Conclusion

For now, we do not plan to introduce synthetic queries as an explicit mechanism in the Allora network. Instead, we will rely on forecasters to implement this functionality as needed, allowing them to incorporate supervised components when they find it beneficial. We will include such forecasters in some of our experiments to study how much they can help the network.


I like the concept of using objective functions with free parameters that the reputers can vary, and converging on these through the reputer consensus game. That should prevent inferers from overfitting on the objective function ahead of time. Do you already have any tests showing whether the suggested mechanism works well, and perhaps even whether it prevents pre-optimization by the inferers?