Price/returns topic feature engineering

Thanks for sharing @t-hossein!

In addition to the features you already use, I’ve found gradients (the slope of a linear fit over some window), acceleration/force, and the difference from a moving average to be very useful; they are often among the most important features.
For time encoding, time within the week (e.g. in hours) could help capture weekly cycles (e.g. due to weekends).
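In case it helps, here is a rough sketch of how those features could be computed with NumPy. The function names and the window handling are my own choices for illustration, and I'm treating "acceleration" as the step-to-step change in the fitted slope, which is only one possible definition.

```python
from datetime import datetime

import numpy as np


def rolling_slope(x, window):
    """Slope of a least-squares line fitted to each trailing window of x."""
    x = np.asarray(x, dtype=float)
    # Centred time index: its mean is zero, so slope = sum(t * x) / sum(t^2).
    t = np.arange(window) - (window - 1) / 2.0
    denom = np.sum(t ** 2)
    out = np.full(x.shape, np.nan)  # NaN until a full window is available
    for i in range(window - 1, len(x)):
        out[i] = np.dot(t, x[i - window + 1 : i + 1]) / denom
    return out


def acceleration(x, window):
    """Change in the rolling slope from one step to the next (assumed definition)."""
    slope = rolling_slope(x, window)
    out = np.full(slope.shape, np.nan)
    out[1:] = np.diff(slope)
    return out


def moving_average_gap(x, window):
    """Current value minus the trailing moving average over `window` points."""
    x = np.asarray(x, dtype=float)
    out = np.full(x.shape, np.nan)
    for i in range(window - 1, len(x)):
        out[i] = x[i] - x[i - window + 1 : i + 1].mean()
    return out


def hour_of_week(ts):
    """Hours elapsed since Monday 00:00, in the range 0..167."""
    return ts.weekday() * 24 + ts.hour
```

All of these only look backwards, so they shouldn't leak future information into the features, but the window lengths would need tuning for your data.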

Do you do any feature reduction? That’s quite a lot of features for 250 independent data points, so you could drop one feature from each pair of very highly correlated features, remove features that are consistently of least importance, etc.
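A minimal sketch of the correlation-based pruning, assuming the features sit in a 2D array with one column per feature. The 0.95 threshold and the function name are placeholders, not recommendations, and it assumes no column is constant (the correlation would be undefined).

```python
import numpy as np


def drop_correlated(X, names, threshold=0.95):
    """Return the feature names kept after greedy correlation pruning.

    Columns are visited in order; a column is kept only if its absolute
    Pearson correlation with every already-kept column is below `threshold`.
    """
    X = np.asarray(X, dtype=float)
    corr = np.abs(np.corrcoef(X, rowvar=False))  # columns are the variables
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, k] < threshold for k in keep):
            keep.append(j)
    return [names[j] for j in keep]
```

Because it is greedy, the result depends on column order, so it may be worth ordering the columns by feature importance first so that the more useful feature of each pair survives.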

For the ML model, LightGBM and CatBoost may also be worth testing. I tend to find LightGBM a bit better than XGBoost most of the time (though I don’t have much experience with CatBoost).

You could also consider modifying the evaluation metric to give larger true returns more weight in the minimisation. “Z-transformed Power-Tanh Absolute Error” (ZPTAE), which has this behaviour, is used in returns topics. Let me know if you want any more info on that.
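To illustrate just the general idea of weighting by the size of the true return, here is a toy magnitude-weighted MAE. To be clear, this is not ZPTAE and does not reproduce its formula; the function name and the weighting scheme are my own for illustration.

```python
import numpy as np


def magnitude_weighted_mae(y_true, y_pred, eps=1e-12):
    """Mean absolute error with each sample weighted by |true return|.

    Weights are |y_true| scaled by the standard deviation of y_true, then
    normalised to sum to one, so errors on large moves count for more.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    weights = np.abs(y_true) / (np.std(y_true) + eps)
    abs_err = np.abs(y_true - y_pred)
    return float(np.sum(weights * abs_err) / (np.sum(weights) + eps))
```

With this, missing a 10% move by 0.01 costs more than missing a 1% move by the same amount, whereas plain MAE would score the two the same.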
