René's URL Explorer Experiment


Title: GitHub - modelscope/Trinity-RFT: Trinity-RFT is a general-purpose, flexible and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (LLM).

Open Graph Title: GitHub - modelscope/Trinity-RFT: Trinity-RFT is a general-purpose, flexible and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (LLM).

X Title: GitHub - modelscope/Trinity-RFT: Trinity-RFT is a general-purpose, flexible and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (LLM).

Description: Trinity-RFT is a general-purpose, flexible and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (LLM). - modelscope/Trinity-RFT

Open Graph Description: Trinity-RFT is a general-purpose, flexible and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (LLM). - modelscope/Trinity-RFT

X Description: Trinity-RFT is a general-purpose, flexible and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (LLM). - modelscope/Trinity-RFT

Opengraph URL: https://github.com/modelscope/Trinity-RFT

X: @github

Domain: github.com

twitter:image: https://opengraph.githubassets.com/a3cf703d609acff434f8a3eb4ccfc156ef3dba78f36001410e9cfa860dbadc22/modelscope/Trinity-RFT
twitter:card: summary_large_image
og:image: https://opengraph.githubassets.com/a3cf703d609acff434f8a3eb4ccfc156ef3dba78f36001410e9cfa860dbadc22/modelscope/Trinity-RFT
og:image:alt: Trinity-RFT is a general-purpose, flexible and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (LLM). - modelscope/Trinity-RFT
og:image:width: 1200
og:image:height: 600
og:site_name: GitHub
og:type: object
go-import: github.com/modelscope/Trinity-RFT git https://github.com/modelscope/Trinity-RFT.git
octolytics-dimension-user_id: 109945100
octolytics-dimension-user_login: modelscope
octolytics-dimension-repository_id: 963030058
octolytics-dimension-repository_nwo: modelscope/Trinity-RFT
octolytics-dimension-repository_public: true
octolytics-dimension-repository_is_fork: false
octolytics-dimension-repository_network_root_id: 963030058
octolytics-dimension-repository_network_root_nwo: modelscope/Trinity-RFT

Links:

modelscope https://github.com/modelscope
Trinity-RFThttps://github.com/modelscope/Trinity-RFT
modelscope.github.io/Trinity-RFT/https://modelscope.github.io/Trinity-RFT/
Apache-2.0 license https://github.com/modelscope/Trinity-RFT/blob/main/LICENSE
474 stars https://github.com/modelscope/Trinity-RFT/stargazers
47 forks https://github.com/modelscope/Trinity-RFT/forks
Branches https://github.com/modelscope/Trinity-RFT/branches
Tags https://github.com/modelscope/Trinity-RFT/tags
Activity https://github.com/modelscope/Trinity-RFT/activity
Code https://github.com/modelscope/Trinity-RFT
Issues 29 https://github.com/modelscope/Trinity-RFT/issues
Pull requests 4 https://github.com/modelscope/Trinity-RFT/pulls
Discussions https://github.com/modelscope/Trinity-RFT/discussions
Actions https://github.com/modelscope/Trinity-RFT/actions
Projects 0 https://github.com/modelscope/Trinity-RFT/projects
Security https://github.com/modelscope/Trinity-RFT/security
Insights https://github.com/modelscope/Trinity-RFT/pulse
380 Commits https://github.com/modelscope/Trinity-RFT/commits/main/
.github https://github.com/modelscope/Trinity-RFT/tree/main/.github
benchmark https://github.com/modelscope/Trinity-RFT/tree/main/benchmark
docs https://github.com/modelscope/Trinity-RFT/tree/main/docs
environments https://github.com/modelscope/Trinity-RFT/tree/main/environments
examples https://github.com/modelscope/Trinity-RFT/tree/main/examples
scripts https://github.com/modelscope/Trinity-RFT/tree/main/scripts
tests https://github.com/modelscope/Trinity-RFT/tree/main/tests
trinity https://github.com/modelscope/Trinity-RFT/tree/main/trinity
.flake8 https://github.com/modelscope/Trinity-RFT/blob/main/.flake8
.gitignore https://github.com/modelscope/Trinity-RFT/blob/main/.gitignore
.pre-commit-config.yaml https://github.com/modelscope/Trinity-RFT/blob/main/.pre-commit-config.yaml
CONTRIBUTING.md https://github.com/modelscope/Trinity-RFT/blob/main/CONTRIBUTING.md
LICENSE https://github.com/modelscope/Trinity-RFT/blob/main/LICENSE
README.md https://github.com/modelscope/Trinity-RFT/blob/main/README.md
README_zh.md https://github.com/modelscope/Trinity-RFT/blob/main/README_zh.md
pyproject.toml https://github.com/modelscope/Trinity-RFT/blob/main/pyproject.toml
setup.py https://github.com/modelscope/Trinity-RFT/blob/main/setup.py
中文主页https://github.com/modelscope/Trinity-RFT/blob/main/README_zh.md
Tutorialhttps://modelscope.github.io/Trinity-RFT/
FAQhttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/faq.html
https://camo.githubusercontent.com/3332441bbfcdb42e2b842c96da1c0771a54511b7a52a946ca5a21e4be74bd4b7/68747470733a2f2f696d672e616c6963646e2e636f6d2f696d6765787472612f69312f4f31434e30316c764c7066773235506c346f68475a6e555f2121363030303030303030373531392d322d7470732d313632382d3439302e706e67
https://github.com/modelscope/Trinity-RFT#trinity-rft-a-general-purpose-and-unified-framework-forreinforcement-fine-tuning-of-large-language-models
https://arxiv.org/abs/2505.17826
https://modelscope.github.io/Trinity-RFT/
https://pypi.org/project/trinity-rft/
https://camo.githubusercontent.com/cbbee0a45a727ab338a068488b57af8dca9ab5be7046469577902c5257107fc3/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f6c6963656e73652d4170616368652d2d322e302d3030303030302e737667
https://github.com/modelscope/Trinity-RFT#-what-is-trinity-rft
[tutorial]https://modelscope.github.io/Trinity-RFT/en/main/tutorial/develop_workflow.html
[tutorial]https://modelscope.github.io/Trinity-RFT/en/main/tutorial/develop_algorithm.html
[tutorial]https://modelscope.github.io/Trinity-RFT/en/main/tutorial/develop_operator.html
https://github.com/modelscope/Trinity-RFT#-news
[Release Notes]https://github.com/modelscope/Trinity-RFT/releases/tag/v0.4.0
Tinkerhttps://thinkingmachines.ai/tinker/
Newshttps://tech.china.com.cn/sx/20251201/411376.shtml
Release Noteshttps://github.com/modelscope/Trinity-RFT/releases/tag/v0.3.3
Learn-to-Askhttps://github.com/modelscope/Trinity-RFT/tree/main/examples/learn_to_ask
paperhttps://arxiv.org/pdf/2510.25441
BOTShttps://github.com/modelscope/Trinity-RFT/tree/main/examples/bots
paperhttps://arxiv.org/pdf/2510.26374
Our paperhttps://arxiv.org/pdf/2509.24203
implementationhttps://github.com/modelscope/Trinity-RFT/tree/main/examples/rec_gsm8k
CHORDhttps://github.com/modelscope/Trinity-RFT/tree/main/examples/mix_chord
paperhttps://arxiv.org/pdf/2508.11408
https://arxiv.org/abs/2505.17826
https://github.com/modelscope/Trinity-RFT#-tutorials-and-guidelines
Quick start: GRPO on GSM8khttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_reasoning_basic.html
Off-policy RFThttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_reasoning_advanced.html
Fully asynchronous RFThttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_async_mode.html
Offline learning by DPO or SFThttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_dpo.html
RFT without local GPU (Tinker Backend)https://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_tinker_backend.html
Concatenated multi-turn workflowhttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_multi_turn.html
General multi-step workflowhttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_step_wise.html
ReAct workflow with an agent frameworkhttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_react.html
Example: train a web-search agenthttps://github.com/modelscope/Trinity-RFT/tree/main/examples/agentscope_websearch
Rollout task mixing and selectionhttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/develop_selector.html
Online task curriculumhttps://github.com/modelscope/Trinity-RFT/tree/main/examples/bots
paperhttps://arxiv.org/pdf/2510.26374
Research project: learn-to-askhttps://github.com/modelscope/Trinity-RFT/tree/main/examples/learn_to_ask
paperhttps://arxiv.org/pdf/2510.25441
Experience replay with prioritizationhttps://github.com/modelscope/Trinity-RFT/tree/main/examples/ppo_countdown_exp_replay
Advanced data processing & human-in-the-loophttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_data_functionalities.html
RL algorithm development with Trinity-RFThttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_mix_algo.html
paperhttps://arxiv.org/pdf/2508.11408
Research project: group-relative REINFORCEhttps://github.com/modelscope/Trinity-RFT/tree/main/examples/rec_gsm8k
paperhttps://arxiv.org/abs/2509.24203
RULERhttps://github.com/modelscope/Trinity-RFT/tree/main/examples/grpo_gsm8k_ruler
trainable RULERhttps://github.com/modelscope/Trinity-RFT/tree/main/examples/grpo_gsm8k_trainable_ruler
rubric-as-rewardhttps://github.com/modelscope/Trinity-RFT/tree/main/examples/grpo_rubric_as_reward
Benchmark toolkit (quick verification & experimentation)https://github.com/modelscope/Trinity-RFT/tree/main/benchmark/README.md
Guru-Math benchmark & comparison with veRLhttps://github.com/modelscope/Trinity-RFT/tree/main/benchmark/reports/guru_math.md
FrozenLake benchmark & comparison with rLLMhttps://github.com/modelscope/Trinity-RFT/tree/main/benchmark/reports/frozenlake.md
Alfworld benchmark & comparison with rLLMhttps://github.com/modelscope/Trinity-RFT/tree/main/benchmark/reports/alfworld.md
Full configurationshttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/trinity_configs.html
GPU resource and training configuration guidehttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/trinity_gpu_configs.html
Understand the coordination between explorer and trainerhttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/synchronizer.html
How to align configuration with veRLhttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/align_with_verl.html
Trinity-RFT documentationhttps://modelscope.github.io/Trinity-RFT/
https://github.com/modelscope/Trinity-RFT#-key-features
https://camo.githubusercontent.com/5df82684977ebca5e2f724e009ac6821caf617fc5a8b946013b8e5fca79bfa37/68747470733a2f2f696d672e616c6963646e2e636f6d2f696d6765787472612f69332f4f31434e303145374e736b533146466f5449396a6c61515f2121363030303030303030303435382d322d7470732d313435382d3638322e706e67
AgentScopehttps://github.com/agentscope-ai/agentscope
https://camo.githubusercontent.com/efa9a0947bda535179acb6e6cbb4c353cba445b4079459fcc486746e2896bf9d/68747470733a2f2f696d672e616c6963646e2e636f6d2f696d6765787472612f69312f4f31434e30317a3169376b6b316a6c4d455661385a48565f2121363030303030303030343538382d322d7470732d313236322d3639352e706e67
https://camo.githubusercontent.com/12dc5a3efd173a1a5bce3131faca075720e708766f31d8a913dcb2e62e17556e/68747470733a2f2f696d672e616c6963646e2e636f6d2f696d6765787472612f69322f4f31434e3031476b3943527732384e734c30396e624f6a5f2121363030303030303030373932312d322d7470732d323533302d3636302e706e67
https://camo.githubusercontent.com/1f2782a7a9c95fbdec4da39720c12ca02a0a6ab258c58ca8fb56aa7f533b0e08/68747470733a2f2f696d672e616c6963646e2e636f6d2f696d6765787472612f69312f4f31434e30315469306f343332305279776f417579684e5f2121363030303030303030363834372d322d7470732d333834302d323133342e706e67
https://github.com/modelscope/Trinity-RFT#-supported-algorithms
Algorithm modulehttps://github.com/modelscope/Trinity-RFT/blob/main/trinity/algorithm/algorithm.py
tutorialhttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/develop_algorithm.html
Paperhttps://arxiv.org/pdf/1707.06347
Dochttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_reasoning_basic.html
Countdown Examplehttps://github.com/modelscope/Trinity-RFT/tree/main/examples/ppo_countdown
Codehttps://github.com/modelscope/Trinity-RFT/tree/main/trinity/algorithm/policy_loss_fn/ppo_policy_loss.py
Paperhttps://arxiv.org/pdf/2402.03300
Dochttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_reasoning_basic.html
GSM8K Examplehttps://github.com/modelscope/Trinity-RFT/tree/main/examples/grpo_gsm8k
Codehttps://github.com/modelscope/Trinity-RFT/tree/main/trinity/algorithm/advantage_fn/grpo_advantage.py
Paperhttps://arxiv.org/pdf/2508.11408
Dochttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_mix_algo.html
ToolACE Examplehttps://github.com/modelscope/Trinity-RFT/blob/main/examples/mix_chord/mix_chord_toolace.yaml
Codehttps://github.com/modelscope/Trinity-RFT/tree/main/trinity/algorithm/policy_loss_fn/chord_policy_loss.py
Paperhttps://arxiv.org/pdf/2509.24203
GSM8K Examplehttps://github.com/modelscope/Trinity-RFT/tree/main/examples/rec_gsm8k
Codehttps://github.com/modelscope/Trinity-RFT/tree/main/trinity/algorithm/policy_loss_fn/rec_policy_loss.py
Paperhttps://arxiv.org/pdf/2402.14740
Codehttps://github.com/modelscope/Trinity-RFT/tree/main/trinity/algorithm/advantage_fn/rloo_advantage.py
Paperhttps://arxiv.org/pdf/2501.03262
Codehttps://github.com/modelscope/Trinity-RFT/tree/main/trinity/algorithm/advantage_fn/reinforce_advantage.py
Paperhttps://arxiv.org/pdf/2507.18071
Codehttps://github.com/modelscope/Trinity-RFT/tree/main/trinity/algorithm/policy_loss_fn/gspo_policy_loss.py
Paperhttps://arxiv.org/pdf/2503.14286
GSM8K Examplehttps://github.com/modelscope/Trinity-RFT/tree/main/examples/topr_gsm8k
Codehttps://github.com/modelscope/Trinity-RFT/tree/main/trinity/algorithm/policy_loss_fn/topr_policy_loss.py
Paperhttps://arxiv.org/pdf/2108.05828
GSM8K Examplehttps://github.com/modelscope/Trinity-RFT/tree/main/examples/sppo_gsm8k
Codehttps://github.com/modelscope/Trinity-RFT/tree/main/trinity/algorithm/policy_loss_fn/sppo_loss_fn.py
Paperhttps://arxiv.org/pdf/2506.20520
GSM8K Examplehttps://github.com/modelscope/Trinity-RFT/tree/main/examples/asymre_gsm8k
Codehttps://github.com/modelscope/Trinity-RFT/tree/main/trinity/algorithm/advantage_fn/asymre_advantage.py
Paperhttps://arxiv.org/pdf/2506.13585
Codehttps://github.com/modelscope/Trinity-RFT/tree/main/trinity/algorithm/policy_loss_fn/cispo_policy_loss.py
Paperhttps://arxiv.org/pdf/2511.20347
Codehttps://github.com/modelscope/Trinity-RFT/tree/main/trinity/algorithm/policy_loss_fn/sapo_policy_loss.py
Bloghttps://thinkingmachines.ai/blog/on-policy-distillation/
Paperhttps://arxiv.org/pdf/2306.13649
GSM8K Examplehttps://github.com/modelscope/Trinity-RFT/tree/main/examples/on_policy_distill
Codehttps://github.com/modelscope/Trinity-RFT/tree/main/trinity/common/workflows/on_policy_distill_workflow.py
https://github.com/modelscope/Trinity-RFT#table-of-contents
Quick Starthttps://github.com/modelscope/Trinity-RFT#quick-start
Step 1: installationhttps://github.com/modelscope/Trinity-RFT#step-1-installation
Step 2: prepare dataset and modelhttps://github.com/modelscope/Trinity-RFT#step-2-prepare-dataset-and-model
Step 3: configurationshttps://github.com/modelscope/Trinity-RFT#step-3-configurations
Step 4: run the RFT processhttps://github.com/modelscope/Trinity-RFT#step-4-run-the-rft-process
Contribution Guidehttps://github.com/modelscope/Trinity-RFT#contribution-guide
Acknowledgementshttps://github.com/modelscope/Trinity-RFT#acknowledgements
Citationhttps://github.com/modelscope/Trinity-RFT#citation
https://github.com/modelscope/Trinity-RFT#quick-start
Tinker training examplehttps://github.com/modelscope/Trinity-RFT/tree/main/examples/tinker
https://github.com/modelscope/Trinity-RFT#step-1-installation
https://github.com/modelscope/Trinity-RFT#from-source-recommended
https://github.com/modelscope/Trinity-RFT#1-clone-the-repository
https://github.com/modelscope/Trinity-RFT#2-set-up-environment
https://github.com/modelscope/Trinity-RFT#using-pre-built-docker-image-recommended-for-beginners
https://github.com/modelscope/Trinity-RFT#using-conda
https://github.com/modelscope/Trinity-RFT#using-venv
https://github.com/modelscope/Trinity-RFT#using-uv
uvhttps://github.com/astral-sh/uv
https://github.com/modelscope/Trinity-RFT#via-pypi
Megatron-LM Backendhttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_megatron.html
https://github.com/modelscope/Trinity-RFT#step-2-prepare-dataset-and-model
Huggingfacehttps://huggingface.co/docs/huggingface_hub/main/en/guides/cli
ModelScopehttps://modelscope.cn/docs/models/download
Huggingfacehttps://huggingface.co/docs/huggingface_hub/main/en/guides/cli#download-a-dataset-or-a-space
ModelScopehttps://modelscope.cn/docs/datasets/download
https://github.com/modelscope/Trinity-RFT#step-3-configurations
exampleshttps://github.com/modelscope/Trinity-RFT/blob/main/examples
Trinity-Studiohttps://github.com/modelscope/Trinity-Studio
https://camo.githubusercontent.com/292a5bcfdc3f75de3cadfd84549a010b912aed5cd6df1d3134d46153f0f102b4/68747470733a2f2f696d672e616c6963646e2e636f6d2f696d6765787472612f69312f4f31434e3031796859725630316c474b636874797753485f2121363030303030303030343739312d322d7470732d313438302d3834342e706e67
https://github.com/modelscope/Trinity-RFT#step-4-run-the-rft-process
Wandbhttps://docs.wandb.ai/quickstart/
TensorBoardhttps://www.tensorflow.org/tensorboard
MLFlowhttps://mlflow.org
this documentationhttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/trinity_configs.html#monitor-configuration
https://github.com/modelscope/Trinity-RFT#contribution-guide
CONTRIBUTING.mdhttps://github.com/modelscope/Trinity-RFT/blob/main/CONTRIBUTING.md
https://github.com/modelscope/Trinity-RFT#acknowledgements
verlhttps://github.com/volcengine/verl
FSDPhttps://pytorch.org/docs/stable/fsdp.html
Megatron-LMhttps://github.com/NVIDIA/Megatron-LM
vLLMhttps://github.com/vllm-project/vllm
Data-Juicerhttps://github.com/modelscope/data-juicer?tab=readme-ov-file
AgentScopehttps://github.com/agentscope-ai/agentscope
Rayhttps://github.com/ray-project/ray
OpenRLHFhttps://github.com/OpenRLHF/OpenRLHF
TRLhttps://github.com/huggingface/trl
ChatLearnhttps://github.com/alibaba/ChatLearn
rLLMhttps://github.com/rllm-org/rllm
https://github.com/modelscope/Trinity-RFT#citation
agent https://github.com/topics/agent
llm https://github.com/topics/llm
rlhf https://github.com/topics/rlhf
Readme https://github.com/modelscope/Trinity-RFT#readme-ov-file
Apache-2.0 license https://github.com/modelscope/Trinity-RFT#Apache-2.0-1-ov-file
Contributing https://github.com/modelscope/Trinity-RFT#contributing-ov-file
Activityhttps://github.com/modelscope/Trinity-RFT/activity
Custom propertieshttps://github.com/modelscope/Trinity-RFT/custom-properties
474 starshttps://github.com/modelscope/Trinity-RFT/stargazers
5 watchinghttps://github.com/modelscope/Trinity-RFT/watchers
47 forkshttps://github.com/modelscope/Trinity-RFT/forks
Releases 9https://github.com/modelscope/Trinity-RFT/releases
v0.4.0 Latest Dec 30, 2025 https://github.com/modelscope/Trinity-RFT/releases/tag/v0.4.0
+ 8 releaseshttps://github.com/modelscope/Trinity-RFT/releases
Packages 0https://github.com/orgs/modelscope/packages?repo_name=Trinity-RFT
Contributors 20https://github.com/modelscope/Trinity-RFT/graphs/contributors
https://github.com/pan-x-c
https://github.com/hiyuchang
https://github.com/chenyushuo
https://github.com/yanxi-chen
https://github.com/garyzhang99
https://github.com/HYLcool
https://github.com/shiweijiezero
https://github.com/yaochaorui
https://github.com/yxdyc
https://github.com/binary-husky
https://github.com/apps/gemini-code-assist
https://github.com/vadimkantorov
https://github.com/nkkarpov
https://github.com/0x404
+ 6 contributorshttps://github.com/modelscope/Trinity-RFT/graphs/contributors
Python 99.6% https://github.com/modelscope/Trinity-RFT/search?l=python

Viewport: width=device-width


URLs of crawlers that visited me.