René's URL Explorer Experiment


Title: Quick Start — Trinity-RFT 0.4.0 documentation

Domain: modelscope.github.io

Links:

https://modelscope.github.io/Trinity-RFT/en/main/index.html
latesthttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_reasoning_basic.html
v0.4.0https://modelscope.github.io/Trinity-RFT/en/v0.4.0/tutorial/example_reasoning_basic.html
v0.3.3https://modelscope.github.io/Trinity-RFT/en/v0.3.3/tutorial/example_reasoning_basic.html
Installationhttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/trinity_installation.html
Developer Guidehttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/develop_overview.html
Workflow Development Guidehttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/develop_workflow.html
Algorithms Development Guidehttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/develop_algorithm.html
Advanced Algorithm Developmenthttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_mix_algo.html
Operator Development Guidehttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/develop_operator.html
🧪 Experimental: Task Selectionhttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/develop_selector.html
Configuration Guidehttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/trinity_configs.html
GPU Configuration Guidehttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/trinity_gpu_configs.html
Synchronizer in Trinity-RFThttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/synchronizer.html
Align configuration with veRLhttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/align_with_verl.html
Quick Starthttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_reasoning_basic.html
Off-Policy RFThttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_reasoning_advanced.html
Asynchronous RFThttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_async_mode.html
Concatenated Multi-Turn RFThttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_multi_turn.html
General Multi-Step RFThttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_step_wise.html
ReAct Agent Traininghttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_react.html
Email Search Workflowhttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_search_email.html
Offline DPO and SFThttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_dpo.html
Tinker Backendhttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_tinker_backend.html
Megatron-LM Backendhttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_megatron.html
Data Processinghttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_data_functionalities.html
Example Summaryhttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_dataset_perspective.html
FAQhttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/faq.html
API Referencehttps://modelscope.github.io/Trinity-RFT/en/main/api_reference.html
trinity.buffer packagehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.buffer.html
trinity.buffer.operators packagehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.buffer.operators.html
trinity.buffer.operators.filters packagehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.buffer.operators.filters.html
trinity.buffer.operators.mappers packagehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.buffer.operators.mappers.html
trinity.buffer.operators.data_juicer_operator modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.buffer.operators.data_juicer_operator.html
trinity.buffer.operators.experience_operator modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.buffer.operators.experience_operator.html
trinity.buffer.pipelines packagehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.buffer.pipelines.html
trinity.buffer.pipelines.experience_pipeline modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.buffer.pipelines.experience_pipeline.html
trinity.buffer.pipelines.task_pipeline modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.buffer.pipelines.task_pipeline.html
trinity.buffer.reader packagehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.buffer.reader.html
trinity.buffer.reader.file_reader modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.buffer.reader.file_reader.html
trinity.buffer.reader.queue_reader modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.buffer.reader.queue_reader.html
trinity.buffer.reader.sql_reader modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.buffer.reader.sql_reader.html
trinity.buffer.schema packagehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.buffer.schema.html
trinity.buffer.schema.formatter modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.buffer.schema.formatter.html
trinity.buffer.schema.sql_schema modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.buffer.schema.sql_schema.html
trinity.buffer.selector packagehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.buffer.selector.html
trinity.buffer.selector.difficulty_estimator modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.buffer.selector.difficulty_estimator.html
trinity.buffer.selector.selector modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.buffer.selector.selector.html
trinity.buffer.storage packagehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.buffer.storage.html
trinity.buffer.storage.file modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.buffer.storage.file.html
trinity.buffer.storage.queue modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.buffer.storage.queue.html
trinity.buffer.storage.sql modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.buffer.storage.sql.html
trinity.buffer.writer packagehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.buffer.writer.html
trinity.buffer.writer.file_writer modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.buffer.writer.file_writer.html
trinity.buffer.writer.queue_writer modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.buffer.writer.queue_writer.html
trinity.buffer.writer.sql_writer modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.buffer.writer.sql_writer.html
trinity.buffer.buffer modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.buffer.buffer.html
trinity.buffer.buffer_reader modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.buffer.buffer_reader.html
trinity.buffer.buffer_writer modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.buffer.buffer_writer.html
trinity.buffer.task_scheduler modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.buffer.task_scheduler.html
trinity.buffer.utils modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.buffer.utils.html
trinity.buffer.viewer modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.buffer.viewer.html
trinity.explorer packagehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.explorer.html
trinity.explorer.proxy packagehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.explorer.proxy.html
trinity.explorer.proxy.app modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.explorer.proxy.app.html
trinity.explorer.proxy.client modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.explorer.proxy.client.html
trinity.explorer.proxy.recorder modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.explorer.proxy.recorder.html
trinity.explorer.proxy.service modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.explorer.proxy.service.html
trinity.explorer.explorer modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.explorer.explorer.html
trinity.explorer.scheduler modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.explorer.scheduler.html
trinity.explorer.workflow_runner modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.explorer.workflow_runner.html
trinity.trainer packagehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.trainer.html
trinity.trainer.tinker packagehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.trainer.tinker.html
trinity.trainer.tinker.utils modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.trainer.tinker.utils.html
trinity.trainer.verl packagehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.trainer.verl.html
trinity.trainer.verl.dp_actor modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.trainer.verl.dp_actor.html
trinity.trainer.verl.fsdp_checkpoint_manager modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.trainer.verl.fsdp_checkpoint_manager.html
trinity.trainer.verl.fsdp_workers modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.trainer.verl.fsdp_workers.html
trinity.trainer.verl.megatron_actor modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.trainer.verl.megatron_actor.html
trinity.trainer.verl.megatron_checkpoint_manager modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.trainer.verl.megatron_checkpoint_manager.html
trinity.trainer.verl.megatron_workers modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.trainer.verl.megatron_workers.html
trinity.trainer.verl.utils modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.trainer.verl.utils.html
trinity.trainer.tinker_trainer modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.trainer.tinker_trainer.html
trinity.trainer.trainer modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.trainer.trainer.html
trinity.trainer.verl_trainer modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.trainer.verl_trainer.html
trinity.algorithm packagehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.algorithm.html
trinity.algorithm.advantage_fn packagehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.algorithm.advantage_fn.html
trinity.algorithm.advantage_fn.advantage_fn modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.algorithm.advantage_fn.advantage_fn.html
trinity.algorithm.advantage_fn.asymre_advantage modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.algorithm.advantage_fn.asymre_advantage.html
trinity.algorithm.advantage_fn.grpo_advantage modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.algorithm.advantage_fn.grpo_advantage.html
trinity.algorithm.advantage_fn.multi_step_grpo_advantage modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.algorithm.advantage_fn.multi_step_grpo_advantage.html
trinity.algorithm.advantage_fn.on_policy_distill_advantage modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.algorithm.advantage_fn.on_policy_distill_advantage.html
trinity.algorithm.advantage_fn.opmd_advantage modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.algorithm.advantage_fn.opmd_advantage.html
trinity.algorithm.advantage_fn.ppo_advantage modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.algorithm.advantage_fn.ppo_advantage.html
trinity.algorithm.advantage_fn.rec_advantage modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.algorithm.advantage_fn.rec_advantage.html
trinity.algorithm.advantage_fn.reinforce_advantage modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.algorithm.advantage_fn.reinforce_advantage.html
trinity.algorithm.advantage_fn.reinforce_plus_plus_advantage modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.algorithm.advantage_fn.reinforce_plus_plus_advantage.html
trinity.algorithm.advantage_fn.remax_advantage modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.algorithm.advantage_fn.remax_advantage.html
trinity.algorithm.advantage_fn.rloo_advantage modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.algorithm.advantage_fn.rloo_advantage.html
trinity.algorithm.entropy_loss_fn packagehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.algorithm.entropy_loss_fn.html
trinity.algorithm.entropy_loss_fn.entropy_loss_fn modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.algorithm.entropy_loss_fn.entropy_loss_fn.html
trinity.algorithm.kl_fn packagehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.algorithm.kl_fn.html
trinity.algorithm.kl_fn.kl_fn modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.algorithm.kl_fn.kl_fn.html
trinity.algorithm.policy_loss_fn packagehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.algorithm.policy_loss_fn.html
trinity.algorithm.policy_loss_fn.chord_policy_loss modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.algorithm.policy_loss_fn.chord_policy_loss.html
trinity.algorithm.policy_loss_fn.cispo_policy_loss modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.algorithm.policy_loss_fn.cispo_policy_loss.html
trinity.algorithm.policy_loss_fn.dpo_loss modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.algorithm.policy_loss_fn.dpo_loss.html
trinity.algorithm.policy_loss_fn.gspo_policy_loss modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.algorithm.policy_loss_fn.gspo_policy_loss.html
trinity.algorithm.policy_loss_fn.importance_sampling_policy_loss modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.algorithm.policy_loss_fn.importance_sampling_policy_loss.html
trinity.algorithm.policy_loss_fn.mix_policy_loss modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.algorithm.policy_loss_fn.mix_policy_loss.html
trinity.algorithm.policy_loss_fn.opmd_policy_loss modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.algorithm.policy_loss_fn.opmd_policy_loss.html
trinity.algorithm.policy_loss_fn.policy_loss_fn modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.algorithm.policy_loss_fn.policy_loss_fn.html
trinity.algorithm.policy_loss_fn.ppo_policy_loss modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.algorithm.policy_loss_fn.ppo_policy_loss.html
trinity.algorithm.policy_loss_fn.rec_policy_loss modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.algorithm.policy_loss_fn.rec_policy_loss.html
trinity.algorithm.policy_loss_fn.sapo_policy_loss modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.algorithm.policy_loss_fn.sapo_policy_loss.html
trinity.algorithm.policy_loss_fn.sft_loss modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.algorithm.policy_loss_fn.sft_loss.html
trinity.algorithm.policy_loss_fn.sppo_loss_fn modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.algorithm.policy_loss_fn.sppo_loss_fn.html
trinity.algorithm.policy_loss_fn.topr_policy_loss modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.algorithm.policy_loss_fn.topr_policy_loss.html
trinity.algorithm.sample_strategy packagehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.algorithm.sample_strategy.html
trinity.algorithm.sample_strategy.mix_sample_strategy modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.algorithm.sample_strategy.mix_sample_strategy.html
trinity.algorithm.sample_strategy.sample_strategy modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.algorithm.sample_strategy.sample_strategy.html
trinity.algorithm.sample_strategy.utils modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.algorithm.sample_strategy.utils.html
trinity.algorithm.algorithm modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.algorithm.algorithm.html
trinity.algorithm.key_mapper modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.algorithm.key_mapper.html
trinity.algorithm.utils modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.algorithm.utils.html
trinity.manager packagehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.manager.html
trinity.manager.config_registry packagehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.manager.config_registry.html
trinity.manager.config_registry.algorithm_config_manager modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.manager.config_registry.algorithm_config_manager.html
trinity.manager.config_registry.buffer_config_manager modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.manager.config_registry.buffer_config_manager.html
trinity.manager.config_registry.config_registry modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.manager.config_registry.config_registry.html
trinity.manager.config_registry.explorer_config_manager modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.manager.config_registry.explorer_config_manager.html
trinity.manager.config_registry.model_config_manager modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.manager.config_registry.model_config_manager.html
trinity.manager.config_registry.trainer_config_manager modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.manager.config_registry.trainer_config_manager.html
trinity.manager.config_manager modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.manager.config_manager.html
trinity.manager.state_manager modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.manager.state_manager.html
trinity.manager.synchronizer modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.manager.synchronizer.html
trinity.common packagehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.html
trinity.common.models packagehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.models.html
trinity.common.models.vllm_patch packagehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.models.vllm_patch.html
trinity.common.models.mm_utils modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.models.mm_utils.html
trinity.common.models.model modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.models.model.html
trinity.common.models.tinker_model modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.models.tinker_model.html
trinity.common.models.utils modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.models.utils.html
trinity.common.models.vllm_model modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.models.vllm_model.html
trinity.common.models.vllm_worker modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.models.vllm_worker.html
trinity.common.rewards packagehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.rewards.html
trinity.common.rewards.accuracy_reward modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.rewards.accuracy_reward.html
trinity.common.rewards.agents_reward modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.rewards.agents_reward.html
trinity.common.rewards.countdown_reward modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.rewards.countdown_reward.html
trinity.common.rewards.dapo_reward modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.rewards.dapo_reward.html
trinity.common.rewards.eval_utils modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.rewards.eval_utils.html
trinity.common.rewards.format_reward modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.rewards.format_reward.html
trinity.common.rewards.human_reward modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.rewards.human_reward.html
trinity.common.rewards.math_reward modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.rewards.math_reward.html
trinity.common.rewards.naive_dapo_score modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.rewards.naive_dapo_score.html
trinity.common.rewards.qwen25_eval modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.rewards.qwen25_eval.html
trinity.common.rewards.reward_fn modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.rewards.reward_fn.html
trinity.common.rewards.tool_reward modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.rewards.tool_reward.html
trinity.common.rewards.utils modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.rewards.utils.html
trinity.common.workflows packagehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.workflows.html
trinity.common.workflows.agentscope packagehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.workflows.agentscope.html
trinity.common.workflows.agentscope_workflow modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.workflows.agentscope_workflow.html
trinity.common.workflows.customized_math_workflows modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.workflows.customized_math_workflows.html
trinity.common.workflows.customized_toolcall_workflows modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.workflows.customized_toolcall_workflows.html
trinity.common.workflows.eval_workflow modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.workflows.eval_workflow.html
trinity.common.workflows.math_rm_workflow modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.workflows.math_rm_workflow.html
trinity.common.workflows.math_ruler_workflow modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.workflows.math_ruler_workflow.html
trinity.common.workflows.math_trainable_ruler_workflow modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.workflows.math_trainable_ruler_workflow.html
trinity.common.workflows.on_policy_distill_workflow modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.workflows.on_policy_distill_workflow.html
trinity.common.workflows.rubric_judge_workflow modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.workflows.rubric_judge_workflow.html
trinity.common.workflows.simple_mm_workflow modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.workflows.simple_mm_workflow.html
trinity.common.workflows.step_wise_workflow modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.workflows.step_wise_workflow.html
trinity.common.workflows.workflow modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.workflows.workflow.html
trinity.common.config modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.config.html
trinity.common.constants modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.constants.html
trinity.common.experience modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.experience.html
trinity.common.verl_config modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.common.verl_config.html
trinity.utils packagehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.utils.html
trinity.utils.annotations modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.utils.annotations.html
trinity.utils.distributed modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.utils.distributed.html
trinity.utils.dlc_utils modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.utils.dlc_utils.html
trinity.utils.log modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.utils.log.html
trinity.utils.lora_utils modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.utils.lora_utils.html
trinity.utils.monitor modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.utils.monitor.html
trinity.utils.plugin_loader modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.utils.plugin_loader.html
trinity.utils.registry modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.utils.registry.html
trinity.utils.timer modulehttps://modelscope.github.io/Trinity-RFT/en/main/build_api/trinity.utils.timer.html
https://github.com/modelscope/Trinity-RFT
.md https://modelscope.github.io/Trinity-RFT/en/main/_sources/tutorial/example_reasoning_basic.md
Step 0: Environment Preparationhttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_reasoning_basic.html#step-0-environment-preparation
Step 1: Model and Data Preparationhttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_reasoning_basic.html#step-1-model-and-data-preparation
Step 2: Set up Configuration and Run Experimenthttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_reasoning_basic.html#step-2-set-up-configuration-and-run-experiment
Synchronous Mode of Trinity-RFThttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_reasoning_basic.html#synchronous-mode-of-trinity-rft
Use GRPO Algorithmhttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_reasoning_basic.html#use-grpo-algorithm
Run the Experimenthttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_reasoning_basic.html#run-the-experiment
Optional: RFT with SFT Warmuphttps://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_reasoning_basic.html#optional-rft-with-sft-warmup
ModelScope (model download) https://modelscope.cn/docs/models/download
Huggingface (CLI guide) https://huggingface.co/docs/huggingface_hub/main/en/guides/cli
ModelScope (dataset download) https://modelscope.cn/docs/datasets/download
Huggingface (CLI guide, dataset download) https://huggingface.co/docs/huggingface_hub/main/en/guides/cli#download-a-dataset-or-a-space
gsm8k.yamlhttps://github.com/modelscope/Trinity-RFT/tree/main/examples/grpo_gsm8k/gsm8k.yaml
previous Align configuration with veRL https://modelscope.github.io/Trinity-RFT/en/main/tutorial/align_with_verl.html
next Off-Policy RFT https://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_reasoning_advanced.html

