Title: Compare agents1.py · Issue #7 · six519/PastebinPython · GitHub
Description: """ Compare PPO, A2C, and DQN on MT5 trading environment. Each agent is trained separately Results (mean reward) are logged """ import time import numpy as np import pandas as pd import gym from gym import spaces import MetaTrader5 as mt...
Opengraph URL: https://github.com/six519/PastebinPython/issues/7
{"@context":"https://schema.org","@type":"DiscussionForumPosting","headline":"Compare agents1.py","articleBody":"\"\"\"\nCompare PPO, A2C, and DQN on MT5 trading environment.\n- Each agent is trained separately\n- Results (mean reward) are logged\n\"\"\"\n\nimport time\nimport numpy as np\nimport pandas as pd\nimport gym\nfrom gym import spaces\nimport MetaTrader5 as mt5\n\nfrom stable_baselines3 import PPO, A2C, DQN\nfrom stable_baselines3.common.vec_env import DummyVecEnv\nfrom stable_baselines3.common.evaluation import evaluate_policy\n\n# -------------------------\n# CONFIG\n# -------------------------\nSYMBOL = \"EURUSD\"\nTIMEFRAME = mt5.TIMEFRAME_M5\nLOOKBACK = 50\nTRAIN_TIMESTEPS = 20000\nN_BARS = 5000\nSEED = 42\n# -------------------------\n\n# -------------------------\n# MT5 Connection\n# -------------------------\ndef mt5_connect():\n if not mt5.initialize():\n raise RuntimeError(f\"MT5 init failed: {mt5.last_error()}\")\n if not mt5.symbol_select(SYMBOL, True):\n raise RuntimeError(f\"Could not select {SYMBOL}\")\n\ndef mt5_shutdown():\n mt5.shutdown()\n\ndef fetch_bars(symbol, timeframe, n_bars):\n rates = mt5.copy_rates_from_pos(symbol, timeframe, 0, n_bars)\n if rates is None:\n raise RuntimeError(f\"Failed to fetch data: {mt5.last_error()}\")\n df = pd.DataFrame(rates)\n df['time'] = pd.to_datetime(df['time'], unit='s')\n return df\n\n# -------------------------\n# Custom Gym Env\n# -------------------------\nclass MT5TradingEnv(gym.Env):\n def __init__(self, df, lookback=LOOKBACK):\n super().__init__()\n self.df = df.reset_index(drop=True)\n self.lookback = lookback\n self.ptr = lookback\n self.position = 0\n self.entry_price = 0\n self.observation_space = spaces.Box(low=-np.inf, high=np.inf,\n shape=(lookback+1,), dtype=np.float32)\n self.action_space = spaces.Discrete(3) # 0=hold, 1=buy, 2=sell\n\n def _get_obs(self):\n closes = self.df.loc[self.ptr-self.lookback:self.ptr-1, \"close\"].values.astype(np.float32)\n norm = closes / (closes[-1] + 
1e-9) - 1.0\n return np.concatenate([norm, [float(self.position)]], axis=0)\n\n def reset(self):\n self.ptr = self.lookback\n self.position = 0\n self.entry_price = 0\n return self._get_obs()\n\n def step(self, action):\n done, reward = False, 0\n price = float(self.df.loc[self.ptr, \"close\"])\n if action == 1: # buy\n if self.position == 0:\n self.position, self.entry_price = 1, price\n elif self.position == -1:\n reward += (self.entry_price - price)\n self.position, self.entry_price = 1, price\n elif action == 2: # sell\n if self.position == 0:\n self.position, self.entry_price = -1, price\n elif self.position == 1:\n reward += (price - self.entry_price)\n self.position, self.entry_price = -1, price\n\n self.ptr += 1\n if self.ptr \u003e= len(self.df):\n done = True\n else:\n next_price = float(self.df.loc[self.ptr, \"close\"])\n if self.position == 1:\n reward += (next_price - self.entry_price) * 0.1\n elif self.position == -1:\n reward += (self.entry_price - next_price) * 0.1\n\n obs = self._get_obs() if not done else np.zeros(self.observation_space.shape, dtype=np.float32)\n return obs, float(reward), done, {}\n\n# -------------------------\n# Training \u0026 Evaluation\n# -------------------------\ndef run_comparison():\n mt5_connect()\n df = fetch_bars(SYMBOL, TIMEFRAME, N_BARS)\n mt5_shutdown()\n\n results = {}\n agents = {\n \"PPO\": PPO,\n \"A2C\": A2C,\n \"DQN\": DQN\n }\n\n for name, algo in agents.items():\n print(f\"\\n=== Training {name} ===\")\n env = DummyVecEnv([lambda: MT5TradingEnv(df, lookback=LOOKBACK)])\n model = algo(\"MlpPolicy\", env, verbose=0, seed=SEED)\n model.learn(total_timesteps=TRAIN_TIMESTEPS)\n mean_reward, std_reward = evaluate_policy(model, env, n_eval_episodes=5)\n results[name] = (mean_reward, std_reward)\n print(f\"{name} → mean reward: {mean_reward:.2f}, std: {std_reward:.2f}\")\n\n print(\"\\n=== Summary ===\")\n for k, v in results.items():\n print(f\"{k}: mean {v[0]:.2f}, std {v[1]:.2f}\")\n\nif __name__ == 
\"__main__\":\n run_comparison()","author":{"url":"https://github.com/vicks4u","@type":"Person","name":"vicks4u"},"datePublished":"2025-09-16T01:34:59.000Z","interactionStatistic":{"@type":"InteractionCounter","interactionType":"https://schema.org/CommentAction","userInteractionCount":0},"url":"https://github.com/six519/PastebinPython/issues/7"}
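
The reward rule in the posted `step` method can be traced by hand. The sketch below restates that rule as a standalone function with no `gym`, `stable_baselines3`, or MetaTrader5 dependency. The function name `step_reward`, its argument names, and the `SHAPING_WEIGHT` constant are hypothetical and introduced only for illustration; the 0.1 factor is the unrealized-PnL weight taken from the posted code. It is a reading aid under those assumptions, not a replacement for the environment.

```python
# Hypothetical helper: mirrors the reward bookkeeping of MT5TradingEnv.step
# from the issue body, using plain floats instead of a DataFrame.

SHAPING_WEIGHT = 0.1  # weight on unrealized PnL, as in the posted code


def step_reward(position, entry_price, action, price, next_price=None):
    """Return (reward, position, entry_price) after one action.

    position: -1 (short), 0 (flat), 1 (long)
    action:   0 = hold, 1 = buy, 2 = sell
    price:    close of the current bar
    next_price: close of the next bar, or None if the episode ends here
    """
    reward = 0.0

    if action == 1:  # buy
        if position == 0:
            position, entry_price = 1, price
        elif position == -1:
            # Closing a short realizes entry - exit, then flips to long.
            reward += entry_price - price
            position, entry_price = 1, price
    elif action == 2:  # sell
        if position == 0:
            position, entry_price = -1, price
        elif position == 1:
            # Closing a long realizes exit - entry, then flips to short.
            reward += price - entry_price
            position, entry_price = -1, price

    # Shaping term: a fraction of the open position's unrealized PnL,
    # measured against the next bar's close.
    if next_price is not None:
        if position == 1:
            reward += (next_price - entry_price) * SHAPING_WEIGHT
        elif position == -1:
            reward += (entry_price - next_price) * SHAPING_WEIGHT

    return reward, position, entry_price


if __name__ == "__main__":
    # Open a long from flat, then flip it to a short one bar later.
    r1, pos, entry = step_reward(0, 0.0, action=1, price=1.10, next_price=1.12)
    r2, pos, entry = step_reward(pos, entry, action=2, price=1.15, next_price=1.14)
    print(r1, r2, pos, entry)
```

Two properties of the posted rule show up in this form: a position is closed only by taking the opposite action (action 0 never realizes PnL), and rewards are expressed in raw price units, so for EURUSD they are on the order of a few pips per step.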