René's URL Explorer Experiment


Title: reward-learning · GitHub Topics · GitHub

Open Graph Title: Build software better, together

X Title: GitHub

Description: GitHub is where people build software. More than 150 million people use GitHub to discover, fork, and contribute to over 420 million projects.

Opengraph URL: https://github.com

X: github

Domain: patch-diff.githubusercontent.com

Links:

All 10 https://github.com/topics/reward-learning
Python 6 https://github.com/topics/reward-learning?l=python
Jupyter Notebook 1 https://github.com/topics/reward-learning?l=jupyter+notebook
NetLogo 1 https://github.com/topics/reward-learning?l=netlogo
XSLT 1 https://github.com/topics/reward-learning?l=xslt

HumanCompatibleAI https://patch-diff.githubusercontent.com/HumanCompatibleAI
imitation https://patch-diff.githubusercontent.com/HumanCompatibleAI/imitation
Stars: 1.7k
Code https://patch-diff.githubusercontent.com/HumanCompatibleAI/imitation
Issues https://patch-diff.githubusercontent.com/HumanCompatibleAI/imitation/issues
Pull requests https://patch-diff.githubusercontent.com/HumanCompatibleAI/imitation/pulls
imitation-learning https://patch-diff.githubusercontent.com/topics/imitation-learning
gymnasium https://patch-diff.githubusercontent.com/topics/gymnasium
inverse-reinforcement-learning https://patch-diff.githubusercontent.com/topics/inverse-reinforcement-learning
reward-learning https://patch-diff.githubusercontent.com/topics/reward-learning

snap-stanford https://patch-diff.githubusercontent.com/snap-stanford
optimas https://patch-diff.githubusercontent.com/snap-stanford/optimas
Stars: 73
Code https://patch-diff.githubusercontent.com/snap-stanford/optimas
Issues https://patch-diff.githubusercontent.com/snap-stanford/optimas/issues
Pull requests https://patch-diff.githubusercontent.com/snap-stanford/optimas/pulls
optimization https://patch-diff.githubusercontent.com/topics/optimization
multiagent-systems https://patch-diff.githubusercontent.com/topics/multiagent-systems
reward-learning https://patch-diff.githubusercontent.com/topics/reward-learning
compound-ai-systems https://patch-diff.githubusercontent.com/topics/compound-ai-systems

bobxwu https://patch-diff.githubusercontent.com/bobxwu
learning-from-rewards-llm-papers https://patch-diff.githubusercontent.com/bobxwu/learning-from-rewards-llm-papers
Stars: 63
Code https://patch-diff.githubusercontent.com/bobxwu/learning-from-rewards-llm-papers
Issues https://patch-diff.githubusercontent.com/bobxwu/learning-from-rewards-llm-papers/issues
Pull requests https://patch-diff.githubusercontent.com/bobxwu/learning-from-rewards-llm-papers/pulls
reinforcement-learning https://patch-diff.githubusercontent.com/topics/reinforcement-learning
post-training https://patch-diff.githubusercontent.com/topics/post-training
self-correction https://patch-diff.githubusercontent.com/topics/self-correction
reward-learning https://patch-diff.githubusercontent.com/topics/reward-learning
large-language-models https://patch-diff.githubusercontent.com/topics/large-language-models
llm https://patch-diff.githubusercontent.com/topics/llm
llms https://patch-diff.githubusercontent.com/topics/llms
reward-models https://patch-diff.githubusercontent.com/topics/reward-models
reward-model https://patch-diff.githubusercontent.com/topics/reward-model
reward-modeling https://patch-diff.githubusercontent.com/topics/reward-modeling
guided-decoding https://patch-diff.githubusercontent.com/topics/guided-decoding
test-time-scaling https://patch-diff.githubusercontent.com/topics/test-time-scaling

csmile-1006 https://patch-diff.githubusercontent.com/csmile-1006
REDS_agent https://patch-diff.githubusercontent.com/csmile-1006/REDS_agent
Stars: 18
Code https://patch-diff.githubusercontent.com/csmile-1006/REDS_agent
Issues https://patch-diff.githubusercontent.com/csmile-1006/REDS_agent/issues
Pull requests https://patch-diff.githubusercontent.com/csmile-1006/REDS_agent/pulls
reinforcement-learning https://patch-diff.githubusercontent.com/topics/reinforcement-learning
visual-reinforcement-learning https://patch-diff.githubusercontent.com/topics/visual-reinforcement-learning
reward-shaping https://patch-diff.githubusercontent.com/topics/reward-shaping
reward-learning https://patch-diff.githubusercontent.com/topics/reward-learning
reward-models https://patch-diff.githubusercontent.com/topics/reward-models

HumanCompatibleAI https://patch-diff.githubusercontent.com/HumanCompatibleAI
interpreting-rewards https://patch-diff.githubusercontent.com/HumanCompatibleAI/interpreting-rewards
Stars: 10
Code https://patch-diff.githubusercontent.com/HumanCompatibleAI/interpreting-rewards
Issues https://patch-diff.githubusercontent.com/HumanCompatibleAI/interpreting-rewards/issues
Pull requests https://patch-diff.githubusercontent.com/HumanCompatibleAI/interpreting-rewards/pulls
deep-reinforcement-learning https://patch-diff.githubusercontent.com/topics/deep-reinforcement-learning
interpretability https://patch-diff.githubusercontent.com/topics/interpretability
reward-learning https://patch-diff.githubusercontent.com/topics/reward-learning

Masoudjafaripour https://patch-diff.githubusercontent.com/Masoudjafaripour
OnlineRLHF https://patch-diff.githubusercontent.com/Masoudjafaripour/OnlineRLHF
Stars: 6
Code https://patch-diff.githubusercontent.com/Masoudjafaripour/OnlineRLHF
Issues https://patch-diff.githubusercontent.com/Masoudjafaripour/OnlineRLHF/issues
Pull requests https://patch-diff.githubusercontent.com/Masoudjafaripour/OnlineRLHF/pulls
reward-learning https://patch-diff.githubusercontent.com/topics/reward-learning
pbrl https://patch-diff.githubusercontent.com/topics/pbrl
rlhf https://patch-diff.githubusercontent.com/topics/rlhf
preference-based-reinforcement-learning https://patch-diff.githubusercontent.com/topics/preference-based-reinforcement-learning

Entience https://patch-diff.githubusercontent.com/Entience
ASIMOV https://patch-diff.githubusercontent.com/Entience/ASIMOV
Stars: 5
Code https://patch-diff.githubusercontent.com/Entience/ASIMOV
Issues https://patch-diff.githubusercontent.com/Entience/ASIMOV/issues
Pull requests https://patch-diff.githubusercontent.com/Entience/ASIMOV/pulls
addiction https://patch-diff.githubusercontent.com/topics/addiction
foraging https://patch-diff.githubusercontent.com/topics/foraging
agent-based-simulation https://patch-diff.githubusercontent.com/topics/agent-based-simulation
reward-learning https://patch-diff.githubusercontent.com/topics/reward-learning
homeostatic-plasticity https://patch-diff.githubusercontent.com/topics/homeostatic-plasticity

ethanvillalovoz https://patch-diff.githubusercontent.com/ethanvillalovoz
clarification-guided-reward-learning https://patch-diff.githubusercontent.com/ethanvillalovoz/clarification-guided-reward-learning
Stars: 0
Code https://patch-diff.githubusercontent.com/ethanvillalovoz/clarification-guided-reward-learning
Issues https://patch-diff.githubusercontent.com/ethanvillalovoz/clarification-guided-reward-learning/issues
Pull requests https://patch-diff.githubusercontent.com/ethanvillalovoz/clarification-guided-reward-learning/pulls
robotics https://patch-diff.githubusercontent.com/topics/robotics
bayesian-inference https://patch-diff.githubusercontent.com/topics/bayesian-inference
human-robot-interaction https://patch-diff.githubusercontent.com/topics/human-robot-interaction
reward-learning https://patch-diff.githubusercontent.com/topics/reward-learning
clarification-questions https://patch-diff.githubusercontent.com/topics/clarification-questions

caitlin-leonard https://patch-diff.githubusercontent.com/caitlin-leonard
pacman-rl-agent https://patch-diff.githubusercontent.com/caitlin-leonard/pacman-rl-agent
Stars: 0
Code https://patch-diff.githubusercontent.com/caitlin-leonard/pacman-rl-agent
Issues https://patch-diff.githubusercontent.com/caitlin-leonard/pacman-rl-agent/issues
Pull requests https://patch-diff.githubusercontent.com/caitlin-leonard/pacman-rl-agent/pulls
python https://patch-diff.githubusercontent.com/topics/python
machine-learning https://patch-diff.githubusercontent.com/topics/machine-learning
reinforcement-learning https://patch-diff.githubusercontent.com/topics/reinforcement-learning
pacman https://patch-diff.githubusercontent.com/topics/pacman
tkinter-gui https://patch-diff.githubusercontent.com/topics/tkinter-gui
reward-learning https://patch-diff.githubusercontent.com/topics/reward-learning
rl-agent https://patch-diff.githubusercontent.com/topics/rl-agent

NBCLab https://patch-diff.githubusercontent.com/NBCLab
probabilistic-selection-task https://patch-diff.githubusercontent.com/NBCLab/probabilistic-selection-task
Stars: 0
Code https://patch-diff.githubusercontent.com/NBCLab/probabilistic-selection-task
Issues https://patch-diff.githubusercontent.com/NBCLab/probabilistic-selection-task/issues
Pull requests https://patch-diff.githubusercontent.com/NBCLab/probabilistic-selection-task/pulls
neuroimaging https://patch-diff.githubusercontent.com/topics/neuroimaging
fmri https://patch-diff.githubusercontent.com/topics/fmri
eprime https://patch-diff.githubusercontent.com/topics/eprime
reward-learning https://patch-diff.githubusercontent.com/topics/reward-learning
behavioral-task https://patch-diff.githubusercontent.com/topics/behavioral-task
