magenta.kabuto @Vyacheslav_B
@vyacheslav_b As I understand it, the problem with not using states is the following. Let's say the model estimated at t (single pass) gives NAS:AAPL a weight allocation of 0.04. That is the position assigned to the stock at t for t+1.
At t+1 the model is re-estimated with the information on NAS:AAPL from t, and it assigns weights of 0.03 for t+1 and 0.035 for t+2. Without states, after applying the get_lower_slippage function, I will have a weight allocation of 0.035 for t+2 at t+1, whereas with states I will have 0.04 for t+2 at t+1 and no impact from transaction costs.
Thank you.
Regards
machesterdragon @Vyacheslav_B
@vyacheslav_b I tried your approach, but the precheck reported an error: some dates are missing in the portfolio_history, and the Sharpe ratio was very low.
I opened the precheck result HTML file: this is the error it shows, and the Sharpe ratio is NaN.
We look forward to receiving your support. Thank you
@support @Vyacheslav_B
Vyacheslav_B @machesterdragon
@machesterdragon Hello.
If you use a state and a function that returns the prediction for one day, you will not get correct results with precheck.
Theoretically, you can set the number of partitions to the number of available days, or you can return all predictions.
I have not checked how the precheck works.
If it runs in parallel, the result will be even less reliable.
Using state in a strategy limits you. I recommend not using it.
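To illustrate why partitioned or parallel evaluation can break a stateful strategy, here is a minimal sketch in plain Python. It is not the precheck's actual logic, which I have not inspected; the `smooth` function and the numbers are hypothetical, and the averaging rule is only an assumed example of a state-dependent calculation.

```python
def smooth(raw_weights, state=None):
    """Average each raw weight with the previously emitted weight (the state)."""
    out = []
    prev = 0.0 if state is None else state
    for w in raw_weights:
        prev = (prev + w) / 2
        out.append(prev)
    return out, prev


raw = [0.04, 0.03, 0.035, 0.02]  # hypothetical daily model outputs

# Sequential run: the state flows from one day to the next.
sequential, _ = smooth(raw)

# Two partitions evaluated independently: the second one starts without state.
first, _ = smooth(raw[:2])
second, _ = smooth(raw[2:])  # the state from the first partition is lost
partitioned = first + second

print(sequential)
print(partitioned)
```

The two lists agree for the first partition and diverge afterwards, which is the kind of mismatch to expect when each partition starts from an empty state.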
Here is an example of a version for one pass; I couldn't test it because my submission did not calculate even one day.
init.ipynb

    ! pip install torch==2.2.1

strategy.ipynb

    import gzip
    import pickle

    from qnt.data import get_env
    from qnt.log import log_err, log_info


    def state_write(state, path=None):
        if path is None:
            path = get_env("OUT_STATE_PATH", "state.out.pickle.gz")
        try:
            with gzip.open(path, 'wb') as gz:
                pickle.dump(state, gz)
            log_info("State saved: " + str(state))
        except Exception as e:
            log_err(f"Error saving state: {e}")


    def state_read(path=None):
        if path is None:
            path = get_env("OUT_STATE_PATH", "state.out.pickle.gz")
        try:
            with gzip.open(path, 'rb') as gz:
                state = pickle.load(gz)
            log_info("State loaded.")
            return state
        except Exception as e:
            log_err(f"Can't load state: {e}")
            return None


    state = state_read()
    print(state)


    # separate cell
    # Assumes np, qns, qngraph, qnout, load_data, train_model, predict,
    # train_period and lookback_period are already defined by the strategy.
    def print_stats(data, weights):
        stats = qns.calc_stat(data, weights)
        display(stats.to_pandas().tail())
        performance = stats.to_pandas()["equity"]
        qngraph.make_plot_filled(performance.index, performance, name="PnL (Equity)", type="log")


    data_train = load_data(train_period)
    models = train_model(data_train)

    data_predict = load_data(lookback_period)

    last_time = data_predict.time.values[-1]
    if last_time < np.datetime64('2006-01-02'):
        print("The first state should be None")
        state_write(None)
        state = state_read()
        print(state)

    weights_predict, state_new = predict(models, data_predict, state)

    print_stats(data_predict, weights_predict)
    state_write(state_new)
    print(state_new)
    qnout.write(weights_predict)  # To participate in the competition, save this code in a separate cell.
But I hope it will work correctly.
Do not expect any responses from me during this week.
machesterdragon @Vyacheslav_B
@vyacheslav_b said in Acess previous weights:
If you use a state and a function that returns the prediction for one day, you will not get correct results with precheck.
Theoretically, you can set the number of partitions to the number of available days, or you can return all predictions.
I have not checked how the precheck works.
If it runs in parallel, the result will be even less reliable.
Using state in a strategy limits you. I recommend not using it.
Thank you so much @Vyacheslav_B.
I just tried the single-pass version you suggested, but the results were NaN. I look forward to your help when you have time. Thank you very much.
Vyacheslav_B @machesterdragon
@machesterdragon
That's how it should be. This code is needed so that submissions are processed faster when sent to the contest. The backtest system will calculate the weights for each day. The function I provided calculates weights for only one day.
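As a rough sketch of what "one day per run" means for state, the snippet below simulates three separate runs that each read the previous state from a file, compute one weight, and write the new state back. It uses only the standard library; `load_state`, `save_state`, `run_once`, the file name and the averaging rule are hypothetical stand-ins, not the Quantiacs backtester.

```python
import gzip
import os
import pickle
import tempfile


def load_state(path):
    """Return the saved state, or None on the very first run."""
    if not os.path.exists(path):
        return None
    with gzip.open(path, "rb") as gz:
        return pickle.load(gz)


def save_state(state, path):
    """Persist the state so that the next run can pick it up."""
    with gzip.open(path, "wb") as gz:
        pickle.dump(state, gz)


def run_once(raw_weight, path):
    """One run: predict a single day and persist the new state."""
    state = load_state(path)
    previous = 0.0 if state is None else state["previous_weight"]
    weight = (previous + raw_weight) / 2  # hypothetical smoothing rule
    save_state({"previous_weight": weight}, path)
    return weight


with tempfile.TemporaryDirectory() as tmp:
    state_path = os.path.join(tmp, "state.pickle.gz")
    # Each call stands for one independent run on a new day.
    daily = [run_once(w, state_path) for w in [0.04, 0.03, 0.035]]

print(daily)
```

Each run returns a single value, so a one-day run on its own cannot produce a full equity curve; the full series only appears when something calls the run for every day in order.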
illustrious.felice @Vyacheslav_B
@vyacheslav_b Hello, I tried the code you gave and noticed that using state with backtest_ml only works when the get_features function returns features such as OHLC or the log of OHLC (open, high, low, close).
I added some other features (e.g. trend = qnta.roc(qnta.lwma(data.sel(field='close'), 40), 1), ...) and noticed that after running backtest_ml, all the statistics are NaN. I look forward to your help. Thank you.
@Vyacheslav_B
Vyacheslav_B @illustrious.felice
@illustrious-felice Hello.
Show me an example of the code.
I don't quite understand what you are trying to do.
Maybe your functions simply do not receive enough data to compute a value.
Please note that in the lines below I intentionally reduce the data to 1 day, so that only the last day is predicted.
    last_time = data.time.values[-1]
    data_last = data.sel(time=slice(last_time, None))
Calculate your indicators before this code, and then slice the values.
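To make the ordering concrete, here is a minimal sketch in plain Python rather than qnt or xarray code. `rolling_mean` is a hypothetical stand-in for any indicator that needs a warm-up window (such as a 40-day moving average), and the prices are made up.

```python
import math


def rolling_mean(values, window):
    """Simple moving average; NaN until `window` observations are available."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(math.nan)
        else:
            out.append(sum(values[i + 1 - window:i + 1]) / window)
    return out


close = [100.0, 101.0, 102.0, 103.0, 104.0]  # hypothetical closing prices
window = 3

# Wrong order: slice to the last day first, then compute the indicator.
sliced_first = rolling_mean(close[-1:], window)

# Right order: compute on the full lookback, then keep only the last day.
indicator_first = rolling_mean(close, window)[-1:]

print(sliced_first)
print(indicator_first)
```

With the wrong order the indicator only sees one observation and stays NaN, which then propagates into the weights and the statistics.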
illustrious.felice @Vyacheslav_B
@vyacheslav_b Thank you for your response.
Here is the code I used, based on your example. I added some other features (e.g. trend = qnta.roc(qnta.lwma(data.sel(field='close'), 40), 1), ...) and noticed that after running backtest_ml, all the statistics are NaN and the PnL is a straight line. I have tried many other features, but the result is the same: all the statistics are NaN.
    import xarray as xr
    import qnt.data as qndata
    import qnt.backtester as qnbt
    import qnt.ta as qnta
    import qnt.stats as qns
    import qnt.graph as qngraph
    import qnt.output as qnout
    import numpy as np
    import pandas as pd
    import torch
    from torch import nn, optim
    import random

    asset_name_all = ['NAS:AAPL', 'NAS:GOOGL']
    lookback_period = 155
    train_period = 100


    class LSTM(nn.Module):
        """
        Class to define our LSTM network.
        """

        def __init__(self, input_dim=3, hidden_layers=64):
            super(LSTM, self).__init__()
            self.hidden_layers = hidden_layers
            self.lstm1 = nn.LSTMCell(input_dim, self.hidden_layers)
            self.lstm2 = nn.LSTMCell(self.hidden_layers, self.hidden_layers)
            self.linear = nn.Linear(self.hidden_layers, 1)

        def forward(self, y):
            outputs = []
            n_samples = y.size(0)
            h_t = torch.zeros(n_samples, self.hidden_layers, dtype=torch.float32)
            c_t = torch.zeros(n_samples, self.hidden_layers, dtype=torch.float32)
            h_t2 = torch.zeros(n_samples, self.hidden_layers, dtype=torch.float32)
            c_t2 = torch.zeros(n_samples, self.hidden_layers, dtype=torch.float32)

            for time_step in range(y.size(1)):
                x_t = y[:, time_step, :]  # Ensure x_t is [batch, input_dim]
                h_t, c_t = self.lstm1(x_t, (h_t, c_t))
                h_t2, c_t2 = self.lstm2(h_t, (h_t2, c_t2))
                output = self.linear(h_t2)
                outputs.append(output.unsqueeze(1))

            outputs = torch.cat(outputs, dim=1).squeeze(-1)
            return outputs


    def get_model():
        def set_seed(seed_value=42):
            """Set seed for reproducibility."""
            random.seed(seed_value)
            np.random.seed(seed_value)
            torch.manual_seed(seed_value)
            torch.cuda.manual_seed(seed_value)
            torch.cuda.manual_seed_all(seed_value)  # if you are using multi-GPU.
            torch.backends.cudnn.deterministic = True
            torch.backends.cudnn.benchmark = False

        set_seed(42)
        model = LSTM(input_dim=3)
        return model


    def get_features(data):
        close_price = data.sel(field="close").ffill('time').bfill('time').fillna(1)
        open_price = data.sel(field="open").ffill('time').bfill('time').fillna(1)
        high_price = data.sel(field="high").ffill('time').bfill('time').fillna(1)
        log_close = np.log(close_price)
        log_open = np.log(open_price)
        trend = qnta.roc(qnta.lwma(close_price, 40), 1)
        features = xr.concat([log_close, log_open, high_price, trend], "feature")
        return features


    def get_target_classes(data):
        price_current = data.sel(field='open')
        price_future = qnta.shift(price_current, -1)

        class_positive = 1  # prices goes up
        class_negative = 0  # price goes down

        target_price_up = xr.where(price_future > price_current, class_positive, class_negative)
        return target_price_up


    def load_data(period):
        return qndata.stocks.load_ndx_data(tail=period, assets=asset_name_all)


    def train_model(data):
        features_all = get_features(data)
        target_all = get_target_classes(data)
        models = dict()

        for asset_name in asset_name_all:
            model = get_model()
            target_cur = target_all.sel(asset=asset_name).dropna('time', 'any')
            features_cur = features_all.sel(asset=asset_name).dropna('time', 'any')
            target_for_learn_df, feature_for_learn_df = xr.align(target_cur, features_cur, join='inner')

            criterion = nn.MSELoss()
            optimiser = optim.LBFGS(model.parameters(), lr=0.08)
            epochs = 1
            for i in range(epochs):
                def closure():
                    optimiser.zero_grad()
                    feature_data = feature_for_learn_df.transpose('time', 'feature').values
                    in_ = torch.tensor(feature_data, dtype=torch.float32).unsqueeze(0)
                    out = model(in_)
                    target = torch.zeros(1, len(target_for_learn_df.values))
                    target[0, :] = torch.tensor(np.array(target_for_learn_df.values))
                    loss = criterion(out, target)
                    loss.backward()
                    return loss

                optimiser.step(closure)
            models[asset_name] = model
        return models


    def predict(models, data, state):
        last_time = data.time.values[-1]
        data_last = data.sel(time=slice(last_time, None))

        weights = xr.zeros_like(data_last.sel(field='close'))
        for asset_name in asset_name_all:
            features_all = get_features(data_last)
            features_cur = features_all.sel(asset=asset_name).dropna('time', 'any')
            if len(features_cur.time) < 1:
                continue
            feature_data = features_cur.transpose('time', 'feature').values
            in_ = torch.tensor(feature_data, dtype=torch.float32).unsqueeze(0)
            out = models[asset_name](in_)
            prediction = out.detach()[0]
            weights.loc[dict(asset=asset_name, time=features_cur.time.values)] = prediction

        weights = weights * data_last.sel(field="is_liquid")

        # state may be null, so define a default value
        if state is None:
            default = xr.zeros_like(data_last.sel(field='close')).isel(time=-1)
            state = {
                "previus_weights": default,
            }

        previus_weights = state['previus_weights']

        # align the arrays to prevent problems in case the asset list changes
        previus_weights, weights = xr.align(previus_weights, weights, join='right')

        weights_avg = (previus_weights + weights) / 2

        next_state = {
            "previus_weights": weights_avg.isel(time=-1),
        }

        # print(last_time)
        # print("previus_weights")
        # print(previus_weights)
        # print(weights)
        # print("weights_avg")
        # print(weights_avg.isel(time=-1))

        return weights_avg, next_state


    weights = qnbt.backtest_ml(
        load_data=load_data,
        train=train_model,
        predict=predict,
        train_period=train_period,
        retrain_interval=360,
        retrain_interval_after_submit=1,
        predict_each_day=True,
        competition_type='stocks_nasdaq100',
        lookback_period=lookback_period,
        start_date='2006-01-01',
        build_plots=True
    )
magenta.kabuto @illustrious.felice
Hello again to all,
I hope everyone is fine.
I have come across another question, which should have occurred to me earlier: when we submit a stateful machine learning strategy, how can we pass the state on without using backtest_ml, assuming the notebook is rerun at each point in time?
Thank you.
Regards
Vyacheslav_B
@illustrious-felice Hi,
https://github.com/quantiacs/strategy-ml_lstm_state/blob/master/strategy.ipynb
This repository provides an example of using state, calculating complex indicators, dynamically selecting stocks for trading, and implementing basic risk management measures, such as normalizing and reducing large positions. It also includes recommendations for submitting strategies to the competition.
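For readers who only want the gist of "normalizing and reducing large positions", here is a minimal sketch in plain Python. It is not the repository's code: `limit_and_normalize`, the `max_weight` parameter and the sample weights are hypothetical, and the rule shown (clip each weight, then scale so that total absolute exposure does not exceed 1) is just one common way to do it.

```python
def limit_and_normalize(weights, max_weight=0.1):
    """Clip each weight to [-max_weight, max_weight], then scale so sum(|w|) <= 1."""
    clipped = {a: max(-max_weight, min(max_weight, w)) for a, w in weights.items()}
    exposure = sum(abs(w) for w in clipped.values())
    if exposure > 1.0:
        clipped = {a: w / exposure for a, w in clipped.items()}
    return clipped


# A large position is reduced to the cap; the small one is left alone.
capped = limit_and_normalize({"NAS:AAPL": 0.9, "NAS:GOOGL": -0.05}, max_weight=0.1)

# With a loose cap, the weights are scaled down so total exposure is 1.
scaled = limit_and_normalize({"NAS:AAPL": 0.9, "NAS:GOOGL": -0.6}, max_weight=1.0)

print(capped)
print(scaled)
```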