René's URL Explorer Experiment


Title: GitHub - S4Plus/transformers: 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

Open Graph Title: GitHub - S4Plus/transformers: 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

X Title: GitHub - S4Plus/transformers: 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

Description: 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX. - S4Plus/transformers

Open Graph Description: 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX. - S4Plus/transformers

X Description: 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX. - S4Plus/transformers

Open Graph URL: https://github.com/S4Plus/transformers

X: @github
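The title, description, and card fields above all come from `<meta>` tags in the page head. A minimal sketch of how such tags can be collected with Python's standard library (the class name `MetaTagParser` and the inline snippet are illustrative, not part of the original experiment):

```python
from html.parser import HTMLParser

class MetaTagParser(HTMLParser):
    """Collect <meta> name/property -> content pairs from an HTML page."""
    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        a = dict(attrs)
        key = a.get("property") or a.get("name")
        if key and "content" in a:
            self.meta[key] = a["content"]

# Demo on an inline snippet shaped like the tags recorded above.
html = """
<head>
  <meta property="og:site_name" content="GitHub">
  <meta property="og:url" content="https://github.com/S4Plus/transformers">
  <meta name="twitter:site" content="@github">
</head>
"""
parser = MetaTagParser()
parser.feed(html)
print(parser.meta["og:url"])  # https://github.com/S4Plus/transformers
```

In practice the same parser would be fed the body of an HTTP response for the page being explored.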


Domain: patch-diff.githubusercontent.com

route-pattern: /:user_id/:repository
route-controller: files
route-action: disambiguate
fetch-nonce: v2:cbe82a59-a6cf-293b-dd0e-4a792239c6a5
current-catalog-service-hash: f3abb0cc802f3d7b95fc8762b94bdcb13bf39634c40c357301c4aa1d67a256fb
request-id: EC7E:1BBE0B:B313E73:E84B53A:697692FF
html-safe-nonce: fbad7331bc22d6f0a7c52f6b484ad4f5010f41ed3fb8c4492fba9d3e1fe8e6a3
visitor-payload: eyJyZWZlcnJlciI6IiIsInJlcXVlc3RfaWQiOiJFQzdFOjFCQkUwQjpCMzEzRTczOkU4NEI1M0E6Njk3NjkyRkYiLCJ2aXNpdG9yX2lkIjoiODEzMjY1MzEyODAyNjA2NzcxMSIsInJlZ2lvbl9lZGdlIjoiaWFkIiwicmVnaW9uX3JlbmRlciI6ImlhZCJ9
visitor-hmac: f3cce061f740554f796d766f5f63d1fedf2a19e5d1b2d7187b01a8cc8d9d41b9
hovercard-subject-tag: repository:768066078
github-keyboard-shortcuts: repository,copilot
google-site-verification: Apib7-x98H0j5cPqHWwSMm6dNU4GmODRoqxLiDzdx9I
octolytics-url: https://collector.github.com/github/collect
analytics-location: //
fb:app_id: 1401488693436528
apple-itunes-app: app-id=1477376905, app-argument=https://github.com/S4Plus/transformers
twitter:image: https://opengraph.githubassets.com/c30831f360117ad1fb4b94031d13673a9536091bf507a339d68937e167356a23/S4Plus/transformers
twitter:card: summary_large_image
og:image: https://opengraph.githubassets.com/c30831f360117ad1fb4b94031d13673a9536091bf507a339d68937e167356a23/S4Plus/transformers
og:image:alt: 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX. - S4Plus/transformers
og:image:width: 1200
og:image:height: 600
og:site_name: GitHub
og:type: object
hostname: github.com
expected-hostname: github.com
None: 032152924a283b83384255d9489e7b93b54ba01da8d380b05ecd3953b3212411
turbo-cache-control: no-preview
go-import: github.com/S4Plus/transformers git https://github.com/S4Plus/transformers.git
octolytics-dimension-user_id: 61465266
octolytics-dimension-user_login: S4Plus
octolytics-dimension-repository_id: 768066078
octolytics-dimension-repository_nwo: S4Plus/transformers
octolytics-dimension-repository_public: true
octolytics-dimension-repository_is_fork: true
octolytics-dimension-repository_parent_id: 155220641
octolytics-dimension-repository_parent_nwo: huggingface/transformers
octolytics-dimension-repository_network_root_id: 155220641
octolytics-dimension-repository_network_root_nwo: huggingface/transformers
turbo-body-classes: logged-out env-production page-responsive
disable-turbo: false
browser-stats-url: https://api.github.com/_private/browser/stats
browser-errors-url: https://api.github.com/_private/browser/errors
release: 5b577f6be6482e336e3c30e8daefa30144947b17
ui-target: canary-2
theme-color: #1e2327
color-scheme: light dark
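The visitor-payload value recorded above is base64-encoded JSON, so it can be inspected directly. A quick sketch with Python's standard library (the variable names are illustrative; the padding fix is a general precaution, since base64 scraped from attributes may arrive without trailing `=`):

```python
import base64
import json

# The visitor-payload meta value captured from the page above.
payload = ("eyJyZWZlcnJlciI6IiIsInJlcXVlc3RfaWQiOiJFQzdFOjFCQkUwQjpCMzEzRTczOkU4NEI1M0E6"
           "Njk3NjkyRkYiLCJ2aXNpdG9yX2lkIjoiODEzMjY1MzEyODAyNjA2NzcxMSIsInJlZ2lvbl9lZGdl"
           "IjoiaWFkIiwicmVnaW9uX3JlbmRlciI6ImlhZCJ9")

# Restore any missing '=' padding before decoding.
padded = payload + "=" * (-len(payload) % 4)
visitor = json.loads(base64.b64decode(padded))
print(visitor["request_id"])  # EC7E:1BBE0B:B313E73:E84B53A:697692FF
```

Decoding confirms the payload carries the same request-id as the request-id meta tag, plus a visitor_id and the edge/render region ("iad").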

Links:

S4Plus https://patch-diff.githubusercontent.com/S4Plus
transformershttps://patch-diff.githubusercontent.com/S4Plus/transformers
huggingface/transformershttps://patch-diff.githubusercontent.com/huggingface/transformers
Notifications https://patch-diff.githubusercontent.com/login?return_to=%2FS4Plus%2Ftransformers
Fork 0 https://patch-diff.githubusercontent.com/login?return_to=%2FS4Plus%2Ftransformers
Star 0 https://patch-diff.githubusercontent.com/login?return_to=%2FS4Plus%2Ftransformers
huggingface.co/transformershttps://huggingface.co/transformers
Apache-2.0 license https://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/LICENSE
0 stars https://patch-diff.githubusercontent.com/S4Plus/transformers/stargazers
31.9k forks https://patch-diff.githubusercontent.com/S4Plus/transformers/forks
Branches https://patch-diff.githubusercontent.com/S4Plus/transformers/branches
Tags https://patch-diff.githubusercontent.com/S4Plus/transformers/tags
Activity https://patch-diff.githubusercontent.com/S4Plus/transformers/activity
Code https://patch-diff.githubusercontent.com/S4Plus/transformers
Pull requests 0 https://patch-diff.githubusercontent.com/S4Plus/transformers/pulls
Actions https://patch-diff.githubusercontent.com/S4Plus/transformers/actions
Projects 0 https://patch-diff.githubusercontent.com/S4Plus/transformers/projects
Security 0 https://patch-diff.githubusercontent.com/S4Plus/transformers/security
Insights https://patch-diff.githubusercontent.com/S4Plus/transformers/pulse
15,280 Commitshttps://patch-diff.githubusercontent.com/S4Plus/transformers/commits/main/
.circlecihttps://patch-diff.githubusercontent.com/S4Plus/transformers/tree/main/.circleci
.githubhttps://patch-diff.githubusercontent.com/S4Plus/transformers/tree/main/.github
dockerhttps://patch-diff.githubusercontent.com/S4Plus/transformers/tree/main/docker
docshttps://patch-diff.githubusercontent.com/S4Plus/transformers/tree/main/docs
exampleshttps://patch-diff.githubusercontent.com/S4Plus/transformers/tree/main/examples
model_cardshttps://patch-diff.githubusercontent.com/S4Plus/transformers/tree/main/model_cards
notebookshttps://patch-diff.githubusercontent.com/S4Plus/transformers/tree/main/notebooks
scriptshttps://patch-diff.githubusercontent.com/S4Plus/transformers/tree/main/scripts
src/transformershttps://patch-diff.githubusercontent.com/S4Plus/transformers/tree/main/src/transformers
templateshttps://patch-diff.githubusercontent.com/S4Plus/transformers/tree/main/templates
testshttps://patch-diff.githubusercontent.com/S4Plus/transformers/tree/main/tests
utilshttps://patch-diff.githubusercontent.com/S4Plus/transformers/tree/main/utils
.coveragerchttps://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/.coveragerc
.gitattributeshttps://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/.gitattributes
.gitignorehttps://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/.gitignore
CITATION.cffhttps://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/CITATION.cff
CODE_OF_CONDUCT.mdhttps://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/CODE_OF_CONDUCT.md
CONTRIBUTING.mdhttps://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/CONTRIBUTING.md
ISSUES.mdhttps://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/ISSUES.md
LICENSEhttps://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/LICENSE
Makefilehttps://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/Makefile
README.mdhttps://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/README.md
README_de.mdhttps://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/README_de.md
README_es.mdhttps://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/README_es.md
README_fr.mdhttps://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/README_fr.md
README_hd.mdhttps://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/README_hd.md
README_ja.mdhttps://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/README_ja.md
README_ko.mdhttps://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/README_ko.md
README_pt-br.mdhttps://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/README_pt-br.md
README_ru.mdhttps://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/README_ru.md
README_te.mdhttps://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/README_te.md
README_vi.mdhttps://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/README_vi.md
README_zh-hans.mdhttps://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/README_zh-hans.md
README_zh-hant.mdhttps://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/README_zh-hant.md
SECURITY.mdhttps://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/SECURITY.md
awesome-transformers.mdhttps://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/awesome-transformers.md
conftest.pyhttps://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/conftest.py
hubconf.pyhttps://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/hubconf.py
pyproject.tomlhttps://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/pyproject.toml
setup.pyhttps://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/setup.py
READMEhttps://patch-diff.githubusercontent.com/S4Plus/transformers
Code of conducthttps://patch-diff.githubusercontent.com/S4Plus/transformers
Contributinghttps://patch-diff.githubusercontent.com/S4Plus/transformers
Licensehttps://patch-diff.githubusercontent.com/S4Plus/transformers
Securityhttps://patch-diff.githubusercontent.com/S4Plus/transformers
https://circleci.com/gh/huggingface/transformers
https://github.com/huggingface/transformers/blob/main/LICENSE
https://huggingface.co/docs/transformers/index
https://github.com/huggingface/transformers/releases
https://github.com/huggingface/transformers/blob/main/CODE_OF_CONDUCT.md
https://zenodo.org/badge/latestdoi/155220641
简体中文https://github.com/huggingface/transformers/blob/main/README_zh-hans.md
繁體中文https://github.com/huggingface/transformers/blob/main/README_zh-hant.md
한국어https://github.com/huggingface/transformers/blob/main/README_ko.md
Españolhttps://github.com/huggingface/transformers/blob/main/README_es.md
日本語https://github.com/huggingface/transformers/blob/main/README_ja.md
हिन्दीhttps://github.com/huggingface/transformers/blob/main/README_hd.md
Русскийhttps://github.com/huggingface/transformers/blob/main/README_ru.md
Portuguêshttps://github.com/huggingface/transformers/blob/main/README_pt-br.md
తెలుగుhttps://github.com/huggingface/transformers/blob/main/README_te.md
Françaishttps://github.com/huggingface/transformers/blob/main/README_fr.md
Deutschhttps://github.com/huggingface/transformers/blob/main/README_de.md
Tiếng Việthttps://github.com/huggingface/transformers/blob/main/README_vi.md
https://patch-diff.githubusercontent.com/S4Plus/transformers#----state-of-the-art-machine-learning-for-jax-pytorch-and-tensorflow
https://hf.co/course
model hubhttps://huggingface.co/models
Jaxhttps://jax.readthedocs.io/en/latest/
PyTorchhttps://pytorch.org/
TensorFlowhttps://www.tensorflow.org/
https://patch-diff.githubusercontent.com/S4Plus/transformers#online-demos
model hubhttps://huggingface.co/models
private model hosting, versioning, & an inference APIhttps://huggingface.co/pricing
Masked word completion with BERThttps://huggingface.co/google-bert/bert-base-uncased?text=Paris+is+the+%5BMASK%5D+of+France
Named Entity Recognition with Electrahttps://huggingface.co/dbmdz/electra-large-discriminator-finetuned-conll03-english?text=My+name+is+Sarah+and+I+live+in+London+city
Text generation with Mistralhttps://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2
Natural Language Inference with RoBERTahttps://huggingface.co/FacebookAI/roberta-large-mnli?text=The+dog+was+lost.+Nobody+lost+any+animal
Summarization with BARThttps://huggingface.co/facebook/bart-large-cnn?text=The+tower+is+324+metres+%281%2C063+ft%29+tall%2C+about+the+same+height+as+an+81-storey+building%2C+and+the+tallest+structure+in+Paris.+Its+base+is+square%2C+measuring+125+metres+%28410+ft%29+on+each+side.+During+its+construction%2C+the+Eiffel+Tower+surpassed+the+Washington+Monument+to+become+the+tallest+man-made+structure+in+the+world%2C+a+title+it+held+for+41+years+until+the+Chrysler+Building+in+New+York+City+was+finished+in+1930.+It+was+the+first+structure+to+reach+a+height+of+300+metres.+Due+to+the+addition+of+a+broadcasting+aerial+at+the+top+of+the+tower+in+1957%2C+it+is+now+taller+than+the+Chrysler+Building+by+5.2+metres+%2817+ft%29.+Excluding+transmitters%2C+the+Eiffel+Tower+is+the+second+tallest+free-standing+structure+in+France+after+the+Millau+Viaduct
Question answering with DistilBERThttps://huggingface.co/distilbert/distilbert-base-uncased-distilled-squad?text=Which+name+is+also+used+to+describe+the+Amazon+rainforest+in+English%3F&context=The+Amazon+rainforest+%28Portuguese%3A+Floresta+Amaz%C3%B4nica+or+Amaz%C3%B4nia%3B+Spanish%3A+Selva+Amaz%C3%B3nica%2C+Amazon%C3%ADa+or+usually+Amazonia%3B+French%3A+For%C3%AAt+amazonienne%3B+Dutch%3A+Amazoneregenwoud%29%2C+also+known+in+English+as+Amazonia+or+the+Amazon+Jungle%2C+is+a+moist+broadleaf+forest+that+covers+most+of+the+Amazon+basin+of+South+America.+This+basin+encompasses+7%2C000%2C000+square+kilometres+%282%2C700%2C000+sq+mi%29%2C+of+which+5%2C500%2C000+square+kilometres+%282%2C100%2C000+sq+mi%29+are+covered+by+the+rainforest.+This+region+includes+territory+belonging+to+nine+nations.+The+majority+of+the+forest+is+contained+within+Brazil%2C+with+60%25+of+the+rainforest%2C+followed+by+Peru+with+13%25%2C+Colombia+with+10%25%2C+and+with+minor+amounts+in+Venezuela%2C+Ecuador%2C+Bolivia%2C+Guyana%2C+Suriname+and+French+Guiana.+States+or+departments+in+four+nations+contain+%22Amazonas%22+in+their+names.+The+Amazon+represents+over+half+of+the+planet%27s+remaining+rainforests%2C+and+comprises+the+largest+and+most+biodiverse+tract+of+tropical+rainforest+in+the+world%2C+with+an+estimated+390+billion+individual+trees+divided+into+16%2C000+species
Translation with T5https://huggingface.co/google-t5/t5-base?text=My+name+is+Wolfgang+and+I+live+in+Berlin
Image classification with ViThttps://huggingface.co/google/vit-base-patch16-224
Object Detection with DETRhttps://huggingface.co/facebook/detr-resnet-50
Semantic Segmentation with SegFormerhttps://huggingface.co/nvidia/segformer-b0-finetuned-ade-512-512
Panoptic Segmentation with Mask2Formerhttps://huggingface.co/facebook/mask2former-swin-large-coco-panoptic
Depth Estimation with Depth Anythinghttps://huggingface.co/docs/transformers/main/model_doc/depth_anything
Video Classification with VideoMAEhttps://huggingface.co/docs/transformers/model_doc/videomae
Universal Segmentation with OneFormerhttps://huggingface.co/shi-labs/oneformer_ade20k_dinat_large
Automatic Speech Recognition with Whisperhttps://huggingface.co/openai/whisper-large-v3
Keyword Spotting with Wav2Vec2https://huggingface.co/superb/wav2vec2-base-superb-ks
Audio Classification with Audio Spectrogram Transformerhttps://huggingface.co/MIT/ast-finetuned-audioset-10-10-0.4593
Table Question Answering with TAPAShttps://huggingface.co/google/tapas-base-finetuned-wtq
Visual Question Answering with ViLThttps://huggingface.co/dandelin/vilt-b32-finetuned-vqa
Image captioning with LLaVahttps://huggingface.co/llava-hf/llava-1.5-7b-hf
Zero-shot Image Classification with SigLIPhttps://huggingface.co/google/siglip-so400m-patch14-384
Document Question Answering with LayoutLMhttps://huggingface.co/impira/layoutlm-document-qa
Zero-shot Video Classification with X-CLIPhttps://huggingface.co/docs/transformers/model_doc/xclip
Zero-shot Object Detection with OWLv2https://huggingface.co/docs/transformers/en/model_doc/owlv2
Zero-shot Image Segmentation with CLIPSeghttps://huggingface.co/docs/transformers/model_doc/clipseg
Automatic Mask Generation with SAMhttps://huggingface.co/docs/transformers/model_doc/sam
https://patch-diff.githubusercontent.com/S4Plus/transformers#100-projects-using-transformers
awesome-transformershttps://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/awesome-transformers.md
https://patch-diff.githubusercontent.com/S4Plus/transformers#if-you-are-looking-for-custom-support-from-the-hugging-face-team
https://huggingface.co/support
https://patch-diff.githubusercontent.com/S4Plus/transformers#quick-tour
https://camo.githubusercontent.com/4153c3f6ae91d9b2d21065c7ff7596b0e24b0c5c23febf3e66eb503324eed1d3/68747470733a2f2f68756767696e67666163652e636f2f64617461736574732f68756767696e67666163652f646f63756d656e746174696f6e2d696d616765732f7265736f6c76652f6d61696e2f636f636f5f73616d706c652e706e67
https://camo.githubusercontent.com/c8821fb97a1b525d5ea9b5f67057b37392c430ee7b5915b4d6ad481202f410a8/68747470733a2f2f68756767696e67666163652e636f2f64617461736574732f68756767696e67666163652f646f63756d656e746174696f6e2d696d616765732f7265736f6c76652f6d61696e2f636f636f5f73616d706c655f706f73745f70726f6365737365642e706e67
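The two camo.githubusercontent.com links above are GitHub's image-proxy URLs: the first path segment is a signature and the second is a hex-encoded copy of the original image URL, so the target can be recovered offline. A sketch, using the hex segment from the first camo link above:

```python
# Hex-encoded target URL taken from the first camo link recorded above.
hex_segment = ("68747470733a2f2f68756767696e67666163652e636f2f64617461736574732f68756767696e67"
               "666163652f646f63756d656e746174696f6e2d696d616765732f7265736f6c76652f6d61696e2f"
               "636f636f5f73616d706c652e706e67")

# Decode the hex back into the original URL string.
target = bytes.fromhex(hex_segment).decode("utf-8")
print(target)  # https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/coco_sample.png
```

Decoding shows the proxied images are the COCO sample figures from the Hugging Face documentation-images dataset.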
this tutorialhttps://huggingface.co/docs/transformers/task_summary
Pytorch nn.Modulehttps://pytorch.org/docs/stable/nn.html#torch.nn.Module
TensorFlow tf.keras.Modelhttps://www.tensorflow.org/api_docs/python/tf/keras/Model
This tutorialhttps://huggingface.co/docs/transformers/training
https://patch-diff.githubusercontent.com/S4Plus/transformers#why-should-i-use-transformers
https://patch-diff.githubusercontent.com/S4Plus/transformers#why-shouldnt-i-use-transformers
Acceleratehttps://huggingface.co/docs/accelerate
examples folderhttps://github.com/huggingface/transformers/tree/main/examples
https://patch-diff.githubusercontent.com/S4Plus/transformers#installation
https://patch-diff.githubusercontent.com/S4Plus/transformers#with-pip
virtual environmenthttps://docs.python.org/3/library/venv.html
user guidehttps://packaging.python.org/guides/installing-using-pip-and-virtual-environments/
TensorFlow installation pagehttps://www.tensorflow.org/install/
PyTorch installation pagehttps://pytorch.org/get-started/locally/#start-locally
Flaxhttps://github.com/google/flax#quick-install
Jaxhttps://github.com/google/jax#installation
install the library from sourcehttps://huggingface.co/docs/transformers/installation#installing-from-source
https://patch-diff.githubusercontent.com/S4Plus/transformers#with-conda
this issuehttps://github.com/huggingface/huggingface_hub/issues/1062
https://patch-diff.githubusercontent.com/S4Plus/transformers#model-architectures
All the model checkpointshttps://huggingface.co/models
model hubhttps://huggingface.co/models
usershttps://huggingface.co/users
organizationshttps://huggingface.co/organizations
https://camo.githubusercontent.com/f36a36c84f2ff8605938db0f71595cdfebb5ebc941833aeb2591205f220bc9d2/68747470733a2f2f696d672e736869656c64732e696f2f656e64706f696e743f75726c3d68747470733a2f2f68756767696e67666163652e636f2f6170692f736869656c64732f6d6f64656c7326636f6c6f723d627269676874677265656e
herehttps://huggingface.co/docs/transformers/model_summary
ALBERThttps://huggingface.co/docs/transformers/model_doc/albert
ALBERT: A Lite BERT for Self-supervised Learning of Language Representationshttps://arxiv.org/abs/1909.11942
ALIGNhttps://huggingface.co/docs/transformers/model_doc/align
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervisionhttps://arxiv.org/abs/2102.05918
AltCLIPhttps://huggingface.co/docs/transformers/model_doc/altclip
AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilitieshttps://arxiv.org/abs/2211.06679
Audio Spectrogram Transformerhttps://huggingface.co/docs/transformers/model_doc/audio-spectrogram-transformer
AST: Audio Spectrogram Transformerhttps://arxiv.org/abs/2104.01778
Autoformerhttps://huggingface.co/docs/transformers/model_doc/autoformer
Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecastinghttps://arxiv.org/abs/2106.13008
Barkhttps://huggingface.co/docs/transformers/model_doc/bark
suno-ai/barkhttps://github.com/suno-ai/bark
BARThttps://huggingface.co/docs/transformers/model_doc/bart
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehensionhttps://arxiv.org/abs/1910.13461
BARThezhttps://huggingface.co/docs/transformers/model_doc/barthez
BARThez: a Skilled Pretrained French Sequence-to-Sequence Modelhttps://arxiv.org/abs/2010.12321
BARTphohttps://huggingface.co/docs/transformers/model_doc/bartpho
BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamesehttps://arxiv.org/abs/2109.09701
BEiThttps://huggingface.co/docs/transformers/model_doc/beit
BEiT: BERT Pre-Training of Image Transformershttps://arxiv.org/abs/2106.08254
BERThttps://huggingface.co/docs/transformers/model_doc/bert
BERT: Pre-training of Deep Bidirectional Transformers for Language Understandinghttps://arxiv.org/abs/1810.04805
- **[BERT For Sequence Generation](https://huggingface.co/docs/transformers/model_doc/bert-generation)**
  - [Leveraging Pre-trained Checkpoints for Sequence Generation Tasks](https://arxiv.org/abs/1907.12461)
- **[BERTweet](https://huggingface.co/docs/transformers/model_doc/bertweet)**
  - [BERTweet: A pre-trained language model for English Tweets](https://aclanthology.org/2020.emnlp-demos.2/)
- **[BigBird-Pegasus](https://huggingface.co/docs/transformers/model_doc/bigbird_pegasus)**
  - [Big Bird: Transformers for Longer Sequences](https://arxiv.org/abs/2007.14062)
- **[BigBird-RoBERTa](https://huggingface.co/docs/transformers/model_doc/big_bird)**
  - [Big Bird: Transformers for Longer Sequences](https://arxiv.org/abs/2007.14062)
- **[BioGpt](https://huggingface.co/docs/transformers/model_doc/biogpt)**
  - [BioGPT: generative pre-trained transformer for biomedical text generation and mining](https://academic.oup.com/bib/advance-article/doi/10.1093/bib/bbac409/6713511?guestAccessKey=a66d9b5d-4f83-4017-bb52-405815c907b9)
- **[BiT](https://huggingface.co/docs/transformers/model_doc/bit)**
  - [Big Transfer (BiT): General Visual Representation Learning](https://arxiv.org/abs/1912.11370)
- **[Blenderbot](https://huggingface.co/docs/transformers/model_doc/blenderbot)**
  - [Recipes for building an open-domain chatbot](https://arxiv.org/abs/2004.13637)
- **[BlenderbotSmall](https://huggingface.co/docs/transformers/model_doc/blenderbot-small)**
  - [Recipes for building an open-domain chatbot](https://arxiv.org/abs/2004.13637)
- **[BLIP](https://huggingface.co/docs/transformers/model_doc/blip)**
  - [BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation](https://arxiv.org/abs/2201.12086)
- **[BLIP-2](https://huggingface.co/docs/transformers/model_doc/blip-2)**
  - [BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models](https://arxiv.org/abs/2301.12597)
- **[BLOOM](https://huggingface.co/docs/transformers/model_doc/bloom)**
  - [BigScience Workshop](https://bigscience.huggingface.co/)
- **[BORT](https://huggingface.co/docs/transformers/model_doc/bort)**
  - [Optimal Subarchitecture Extraction For BERT](https://arxiv.org/abs/2010.10499)
- **[BridgeTower](https://huggingface.co/docs/transformers/model_doc/bridgetower)**
  - [BridgeTower: Building Bridges Between Encoders in Vision-Language Representation Learning](https://arxiv.org/abs/2206.08657)
- **[BROS](https://huggingface.co/docs/transformers/model_doc/bros)**
  - [BROS: A Pre-trained Language Model Focusing on Text and Layout for Better Key Information Extraction from Documents](https://arxiv.org/abs/2108.04539)
- **[ByT5](https://huggingface.co/docs/transformers/model_doc/byt5)**
  - [ByT5: Towards a token-free future with pre-trained byte-to-byte models](https://arxiv.org/abs/2105.13626)
- **[CamemBERT](https://huggingface.co/docs/transformers/model_doc/camembert)**
  - [CamemBERT: a Tasty French Language Model](https://arxiv.org/abs/1911.03894)
- **[CANINE](https://huggingface.co/docs/transformers/model_doc/canine)**
  - [CANINE: Pre-training an Efficient Tokenization-Free Encoder for Language Representation](https://arxiv.org/abs/2103.06874)
- **[Chinese-CLIP](https://huggingface.co/docs/transformers/model_doc/chinese_clip)**
  - [Chinese CLIP: Contrastive Vision-Language Pretraining in Chinese](https://arxiv.org/abs/2211.01335)
- **[CLAP](https://huggingface.co/docs/transformers/model_doc/clap)**
  - [Large-scale Contrastive Language-Audio Pretraining with Feature Fusion and Keyword-to-Caption Augmentation](https://arxiv.org/abs/2211.06687)
- **[CLIP](https://huggingface.co/docs/transformers/model_doc/clip)**
  - [Learning Transferable Visual Models From Natural Language Supervision](https://arxiv.org/abs/2103.00020)
- **[CLIPSeg](https://huggingface.co/docs/transformers/model_doc/clipseg)**
  - [Image Segmentation Using Text and Image Prompts](https://arxiv.org/abs/2112.10003)
- **[CLVP](https://huggingface.co/docs/transformers/model_doc/clvp)**
  - [Better speech synthesis through scaling](https://arxiv.org/abs/2305.07243)
- **[CodeGen](https://huggingface.co/docs/transformers/model_doc/codegen)**
  - [A Conversational Paradigm for Program Synthesis](https://arxiv.org/abs/2203.13474)
- **[CodeLlama](https://huggingface.co/docs/transformers/model_doc/llama_code)**
  - [Code Llama: Open Foundation Models for Code](https://ai.meta.com/research/publications/code-llama-open-foundation-models-for-code/)
- **[Conditional DETR](https://huggingface.co/docs/transformers/model_doc/conditional_detr)**
  - [Conditional DETR for Fast Training Convergence](https://arxiv.org/abs/2108.06152)
- **[ConvBERT](https://huggingface.co/docs/transformers/model_doc/convbert)**
  - [ConvBERT: Improving BERT with Span-based Dynamic Convolution](https://arxiv.org/abs/2008.02496)
- **[ConvNeXT](https://huggingface.co/docs/transformers/model_doc/convnext)**
  - [A ConvNet for the 2020s](https://arxiv.org/abs/2201.03545)
- **[ConvNeXTV2](https://huggingface.co/docs/transformers/model_doc/convnextv2)**
  - [ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders](https://arxiv.org/abs/2301.00808)
- **[CPM](https://huggingface.co/docs/transformers/model_doc/cpm)**
  - [CPM: A Large-scale Generative Chinese Pre-trained Language Model](https://arxiv.org/abs/2012.00413)
- **[CPM-Ant](https://huggingface.co/docs/transformers/model_doc/cpmant)**
  - [OpenBMB](https://www.openbmb.org/)
- **[CTRL](https://huggingface.co/docs/transformers/model_doc/ctrl)**
  - [CTRL: A Conditional Transformer Language Model for Controllable Generation](https://arxiv.org/abs/1909.05858)
- **[CvT](https://huggingface.co/docs/transformers/model_doc/cvt)**
  - [CvT: Introducing Convolutions to Vision Transformers](https://arxiv.org/abs/2103.15808)
- **[Data2Vec](https://huggingface.co/docs/transformers/model_doc/data2vec)**
  - [Data2Vec: A General Framework for Self-supervised Learning in Speech, Vision and Language](https://arxiv.org/abs/2202.03555)
- **[DeBERTa](https://huggingface.co/docs/transformers/model_doc/deberta)**
  - [DeBERTa: Decoding-enhanced BERT with Disentangled Attention](https://arxiv.org/abs/2006.03654)
- **[DeBERTa-v2](https://huggingface.co/docs/transformers/model_doc/deberta-v2)**
  - [DeBERTa: Decoding-enhanced BERT with Disentangled Attention](https://arxiv.org/abs/2006.03654)
- **[Decision Transformer](https://huggingface.co/docs/transformers/model_doc/decision_transformer)**
  - [Decision Transformer: Reinforcement Learning via Sequence Modeling](https://arxiv.org/abs/2106.01345)
- **[Deformable DETR](https://huggingface.co/docs/transformers/model_doc/deformable_detr)**
  - [Deformable DETR: Deformable Transformers for End-to-End Object Detection](https://arxiv.org/abs/2010.04159)
- **[DeiT](https://huggingface.co/docs/transformers/model_doc/deit)**
  - [Training data-efficient image transformers & distillation through attention](https://arxiv.org/abs/2012.12877)
- **[DePlot](https://huggingface.co/docs/transformers/model_doc/deplot)**
  - [DePlot: One-shot visual language reasoning by plot-to-table translation](https://arxiv.org/abs/2212.10505)
- **[Depth Anything](https://huggingface.co/docs/transformers/model_doc/depth_anything)**
  - [Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data](https://arxiv.org/abs/2401.10891)
- **[DETA](https://huggingface.co/docs/transformers/model_doc/deta)**
  - [NMS Strikes Back](https://arxiv.org/abs/2212.06137)
- **[DETR](https://huggingface.co/docs/transformers/model_doc/detr)**
  - [End-to-End Object Detection with Transformers](https://arxiv.org/abs/2005.12872)
- **[DialoGPT](https://huggingface.co/docs/transformers/model_doc/dialogpt)**
  - [DialoGPT: Large-Scale Generative Pre-training for Conversational Response Generation](https://arxiv.org/abs/1911.00536)
- **[DiNAT](https://huggingface.co/docs/transformers/model_doc/dinat)**
  - [Dilated Neighborhood Attention Transformer](https://arxiv.org/abs/2209.15001)
- **[DINOv2](https://huggingface.co/docs/transformers/model_doc/dinov2)**
  - [DINOv2: Learning Robust Visual Features without Supervision](https://arxiv.org/abs/2304.07193)
- **[DistilBERT](https://huggingface.co/docs/transformers/model_doc/distilbert)**
  - [DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter](https://arxiv.org/abs/1910.01108)
- **[DistilGPT2](https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation)**
- **[DistilRoBERTa](https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation)**
- **[DistilmBERT](https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation)**
- **[DiT](https://huggingface.co/docs/transformers/model_doc/dit)**
  - [DiT: Self-supervised Pre-training for Document Image Transformer](https://arxiv.org/abs/2203.02378)
- **[Donut](https://huggingface.co/docs/transformers/model_doc/donut)**
  - [OCR-free Document Understanding Transformer](https://arxiv.org/abs/2111.15664)
- **[DPR](https://huggingface.co/docs/transformers/model_doc/dpr)**
  - [Dense Passage Retrieval for Open-Domain Question Answering](https://arxiv.org/abs/2004.04906)
- **[DPT](https://huggingface.co/docs/transformers/master/model_doc/dpt)**
  - [Vision Transformers for Dense Prediction](https://arxiv.org/abs/2103.13413)
- **[EfficientFormer](https://huggingface.co/docs/transformers/model_doc/efficientformer)**
  - [EfficientFormer: Vision Transformers at MobileNet Speed](https://arxiv.org/abs/2206.01191)
- **[EfficientNet](https://huggingface.co/docs/transformers/model_doc/efficientnet)**
  - [EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks](https://arxiv.org/abs/1905.11946)
- **[ELECTRA](https://huggingface.co/docs/transformers/model_doc/electra)**
  - [ELECTRA: Pre-training text encoders as discriminators rather than generators](https://arxiv.org/abs/2003.10555)
- **[EnCodec](https://huggingface.co/docs/transformers/model_doc/encodec)**
  - [High Fidelity Neural Audio Compression](https://arxiv.org/abs/2210.13438)
- **[EncoderDecoder](https://huggingface.co/docs/transformers/model_doc/encoder-decoder)**
  - [Leveraging Pre-trained Checkpoints for Sequence Generation Tasks](https://arxiv.org/abs/1907.12461)
- **[ERNIE](https://huggingface.co/docs/transformers/model_doc/ernie)**
  - [ERNIE: Enhanced Representation through Knowledge Integration](https://arxiv.org/abs/1904.09223)
- **[ErnieM](https://huggingface.co/docs/transformers/model_doc/ernie_m)**
  - [ERNIE-M: Enhanced Multilingual Representation by Aligning Cross-lingual Semantics with Monolingual Corpora](https://arxiv.org/abs/2012.15674)
- **[ESM](https://huggingface.co/docs/transformers/model_doc/esm)**
  - [Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences](https://www.pnas.org/content/118/15/e2016239118)
  - [Language models enable zero-shot prediction of the effects of mutations on protein function](https://doi.org/10.1101/2021.07.09.450648)
  - [Language models of protein sequences at the scale of evolution enable accurate structure prediction](https://doi.org/10.1101/2022.07.20.500902)
- **[Falcon](https://huggingface.co/docs/transformers/model_doc/falcon)**
- **[FastSpeech2Conformer](https://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/model_doc/fastspeech2_conformer)**
  - [Recent Developments On Espnet Toolkit Boosted By Conformer](https://arxiv.org/abs/2010.13956)
- **[FLAN-T5](https://huggingface.co/docs/transformers/model_doc/flan-t5)**
  - [google-research/t5x](https://github.com/google-research/t5x/blob/main/docs/models.md#flan-t5-checkpoints)
- **[FLAN-UL2](https://huggingface.co/docs/transformers/model_doc/flan-ul2)**
  - [google-research/t5x](https://github.com/google-research/t5x/blob/main/docs/models.md#flan-ul2-checkpoints)
- **[FlauBERT](https://huggingface.co/docs/transformers/model_doc/flaubert)**
  - [FlauBERT: Unsupervised Language Model Pre-training for French](https://arxiv.org/abs/1912.05372)
- **[FLAVA](https://huggingface.co/docs/transformers/model_doc/flava)**
  - [FLAVA: A Foundational Language And Vision Alignment Model](https://arxiv.org/abs/2112.04482)
- **[FNet](https://huggingface.co/docs/transformers/model_doc/fnet)**
  - [FNet: Mixing Tokens with Fourier Transforms](https://arxiv.org/abs/2105.03824)
- **[FocalNet](https://huggingface.co/docs/transformers/model_doc/focalnet)**
  - [Focal Modulation Networks](https://arxiv.org/abs/2203.11926)
- **[Funnel Transformer](https://huggingface.co/docs/transformers/model_doc/funnel)**
  - [Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing](https://arxiv.org/abs/2006.03236)
- **[Fuyu](https://huggingface.co/docs/transformers/model_doc/fuyu)**
  - [blog post](https://www.adept.ai/blog/fuyu-8b)
- **[Gemma](https://huggingface.co/docs/transformers/main/model_doc/gemma)**
  - [Gemma: Open Models Based on Gemini Technology and Research](https://blog.google/technology/developers/gemma-open-models/)
- **[GIT](https://huggingface.co/docs/transformers/model_doc/git)**
  - [GIT: A Generative Image-to-text Transformer for Vision and Language](https://arxiv.org/abs/2205.14100)
- **[GLPN](https://huggingface.co/docs/transformers/model_doc/glpn)**
  - [Global-Local Path Networks for Monocular Depth Estimation with Vertical CutDepth](https://arxiv.org/abs/2201.07436)
- **[GPT](https://huggingface.co/docs/transformers/model_doc/openai-gpt)**
  - [Improving Language Understanding by Generative Pre-Training](https://openai.com/research/language-unsupervised/)
- **[GPT Neo](https://huggingface.co/docs/transformers/model_doc/gpt_neo)**
  - [EleutherAI/gpt-neo](https://github.com/EleutherAI/gpt-neo)
- **[GPT NeoX](https://huggingface.co/docs/transformers/model_doc/gpt_neox)**
  - [GPT-NeoX-20B: An Open-Source Autoregressive Language Model](https://arxiv.org/abs/2204.06745)
- **[GPT NeoX Japanese](https://huggingface.co/docs/transformers/model_doc/gpt_neox_japanese)**
- **[GPT-2](https://huggingface.co/docs/transformers/model_doc/gpt2)**
  - [Language Models are Unsupervised Multitask Learners](https://openai.com/research/better-language-models/)
- **[GPT-J](https://huggingface.co/docs/transformers/model_doc/gptj)**
  - [kingoflolz/mesh-transformer-jax](https://github.com/kingoflolz/mesh-transformer-jax/)
- **[GPT-Sw3](https://huggingface.co/docs/transformers/model_doc/gpt-sw3)**
  - [Lessons Learned from GPT-SW3: Building the First Large-Scale Generative Language Model for Swedish](http://www.lrec-conf.org/proceedings/lrec2022/pdf/2022.lrec-1.376.pdf)
- **[GPTBigCode](https://huggingface.co/docs/transformers/model_doc/gpt_bigcode)**
  - [SantaCoder: don't reach for the stars!](https://arxiv.org/abs/2301.03988)
- **[GPTSAN-japanese](https://huggingface.co/docs/transformers/model_doc/gptsan-japanese)**
  - [tanreinama/GPTSAN](https://github.com/tanreinama/GPTSAN/blob/main/report/model.md)
- **[Graphormer](https://huggingface.co/docs/transformers/model_doc/graphormer)**
  - [Do Transformers Really Perform Bad for Graph Representation?](https://arxiv.org/abs/2106.05234)
- **[GroupViT](https://huggingface.co/docs/transformers/model_doc/groupvit)**
  - [GroupViT: Semantic Segmentation Emerges from Text Supervision](https://arxiv.org/abs/2202.11094)
- **[HerBERT](https://huggingface.co/docs/transformers/model_doc/herbert)**
  - [KLEJ: Comprehensive Benchmark for Polish Language Understanding](https://www.aclweb.org/anthology/2020.acl-main.111.pdf)
- **[Hubert](https://huggingface.co/docs/transformers/model_doc/hubert)**
  - [HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units](https://arxiv.org/abs/2106.07447)
- **[I-BERT](https://huggingface.co/docs/transformers/model_doc/ibert)**
  - [I-BERT: Integer-only BERT Quantization](https://arxiv.org/abs/2101.01321)
- **[IDEFICS](https://huggingface.co/docs/transformers/model_doc/idefics)**
  - [OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents](https://huggingface.co/papers/2306.16527)
- **[ImageGPT](https://huggingface.co/docs/transformers/model_doc/imagegpt)**
  - [Generative Pretraining from Pixels](https://openai.com/blog/image-gpt/)
- **[Informer](https://huggingface.co/docs/transformers/model_doc/informer)**
  - [Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting](https://arxiv.org/abs/2012.07436)
- **[InstructBLIP](https://huggingface.co/docs/transformers/model_doc/instructblip)**
  - [InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning](https://arxiv.org/abs/2305.06500)
- **[Jukebox](https://huggingface.co/docs/transformers/model_doc/jukebox)**
  - [Jukebox: A Generative Model for Music](https://arxiv.org/pdf/2005.00341.pdf)
- **[KOSMOS-2](https://huggingface.co/docs/transformers/model_doc/kosmos-2)**
  - [Kosmos-2: Grounding Multimodal Large Language Models to the World](https://arxiv.org/abs/2306.14824)
- **[LayoutLM](https://huggingface.co/docs/transformers/model_doc/layoutlm)**
  - [LayoutLM: Pre-training of Text and Layout for Document Image Understanding](https://arxiv.org/abs/1912.13318)
- **[LayoutLMv2](https://huggingface.co/docs/transformers/model_doc/layoutlmv2)**
  - [LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding](https://arxiv.org/abs/2012.14740)
- **[LayoutLMv3](https://huggingface.co/docs/transformers/model_doc/layoutlmv3)**
  - [LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking](https://arxiv.org/abs/2204.08387)
- **[LayoutXLM](https://huggingface.co/docs/transformers/model_doc/layoutxlm)**
  - [LayoutXLM: Multimodal Pre-training for Multilingual Visually-rich Document Understanding](https://arxiv.org/abs/2104.08836)
- **[LED](https://huggingface.co/docs/transformers/model_doc/led)**
  - [Longformer: The Long-Document Transformer](https://arxiv.org/abs/2004.05150)
- **[LeViT](https://huggingface.co/docs/transformers/model_doc/levit)**
  - [LeViT: A Vision Transformer in ConvNet's Clothing for Faster Inference](https://arxiv.org/abs/2104.01136)
- **[LiLT](https://huggingface.co/docs/transformers/model_doc/lilt)**
  - [LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding](https://arxiv.org/abs/2202.13669)
- **[LLaMA](https://huggingface.co/docs/transformers/model_doc/llama)**
  - [LLaMA: Open and Efficient Foundation Language Models](https://arxiv.org/abs/2302.13971)
- **[Llama2](https://huggingface.co/docs/transformers/model_doc/llama2)**
  - [Llama2: Open Foundation and Fine-Tuned Chat Models](https://ai.meta.com/research/publications/llama-2-open-foundation-and-fine-tuned-chat-models/)
- **[LLaVa](https://huggingface.co/docs/transformers/model_doc/llava)**
  - [Visual Instruction Tuning](https://arxiv.org/abs/2304.08485)
- **[Longformer](https://huggingface.co/docs/transformers/model_doc/longformer)**
  - [Longformer: The Long-Document Transformer](https://arxiv.org/abs/2004.05150)
- **[LongT5](https://huggingface.co/docs/transformers/model_doc/longt5)**
  - [LongT5: Efficient Text-To-Text Transformer for Long Sequences](https://arxiv.org/abs/2112.07916)
- **[LUKE](https://huggingface.co/docs/transformers/model_doc/luke)**
  - [LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention](https://arxiv.org/abs/2010.01057)
- **[LXMERT](https://huggingface.co/docs/transformers/model_doc/lxmert)**
  - [LXMERT: Learning Cross-Modality Encoder Representations from Transformers for Open-Domain Question Answering](https://arxiv.org/abs/1908.07490)
- **[M-CTC-T](https://huggingface.co/docs/transformers/model_doc/mctct)**
  - [Pseudo-Labeling For Massively Multilingual Speech Recognition](https://arxiv.org/abs/2111.00161)
- **[M2M100](https://huggingface.co/docs/transformers/model_doc/m2m_100)**
  - [Beyond English-Centric Multilingual Machine Translation](https://arxiv.org/abs/2010.11125)
- **[MADLAD-400](https://huggingface.co/docs/transformers/model_doc/madlad-400)**
  - [MADLAD-400: A Multilingual And Document-Level Large Audited Dataset](https://arxiv.org/abs/2309.04662)
- **[Mamba](https://huggingface.co/docs/transformers/main/model_doc/mamba)**
  - [Mamba: Linear-Time Sequence Modeling with Selective State Spaces](https://arxiv.org/abs/2312.00752)
- **[MarianMT](https://huggingface.co/docs/transformers/model_doc/marian)**
  - [OPUS](http://opus.nlpl.eu/)
  - [Marian Framework](https://marian-nmt.github.io/)
- **[MarkupLM](https://huggingface.co/docs/transformers/model_doc/markuplm)**
  - [MarkupLM: Pre-training of Text and Markup Language for Visually-rich Document Understanding](https://arxiv.org/abs/2110.08518)
- **[Mask2Former](https://huggingface.co/docs/transformers/model_doc/mask2former)**
  - [Masked-attention Mask Transformer for Universal Image Segmentation](https://arxiv.org/abs/2112.01527)
- **[MaskFormer](https://huggingface.co/docs/transformers/model_doc/maskformer)**
  - [Per-Pixel Classification is Not All You Need for Semantic Segmentation](https://arxiv.org/abs/2107.06278)
- **[MatCha](https://huggingface.co/docs/transformers/model_doc/matcha)**
  - [MatCha: Enhancing Visual Language Pretraining with Math Reasoning and Chart Derendering](https://arxiv.org/abs/2212.09662)
- **[mBART](https://huggingface.co/docs/transformers/model_doc/mbart)**
  - [Multilingual Denoising Pre-training for Neural Machine Translation](https://arxiv.org/abs/2001.08210)
- **[mBART-50](https://huggingface.co/docs/transformers/model_doc/mbart)**
  - [Multilingual Translation with Extensible Multilingual Pretraining and Finetuning](https://arxiv.org/abs/2008.00401)
- **[MEGA](https://huggingface.co/docs/transformers/model_doc/mega)**
  - [Mega: Moving Average Equipped Gated Attention](https://arxiv.org/abs/2209.10655)
- **[Megatron-BERT](https://huggingface.co/docs/transformers/model_doc/megatron-bert)**
  - [Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism](https://arxiv.org/abs/1909.08053)
- **[Megatron-GPT2](https://huggingface.co/docs/transformers/model_doc/megatron_gpt2)**
  - [Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism](https://arxiv.org/abs/1909.08053)
- **[MGP-STR](https://huggingface.co/docs/transformers/model_doc/mgp-str)**
  - [Multi-Granularity Prediction for Scene Text Recognition](https://arxiv.org/abs/2209.03592)
- **[Mistral](https://huggingface.co/docs/transformers/model_doc/mistral)**
  - [Mistral AI](https://mistral.ai)
- **[Mixtral](https://huggingface.co/docs/transformers/model_doc/mixtral)**
  - [Mistral AI](https://mistral.ai)
- **[mLUKE](https://huggingface.co/docs/transformers/model_doc/mluke)**
  - [mLUKE: The Power of Entity Representations in Multilingual Pretrained Language Models](https://arxiv.org/abs/2110.08151)
- **[MMS](https://huggingface.co/docs/transformers/model_doc/mms)**
  - [Scaling Speech Technology to 1,000+ Languages](https://arxiv.org/abs/2305.13516)
- **[MobileBERT](https://huggingface.co/docs/transformers/model_doc/mobilebert)**
  - [MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices](https://arxiv.org/abs/2004.02984)
- **[MobileNetV1](https://huggingface.co/docs/transformers/model_doc/mobilenet_v1)**
  - [MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications](https://arxiv.org/abs/1704.04861)
- **[MobileNetV2](https://huggingface.co/docs/transformers/model_doc/mobilenet_v2)**
  - [MobileNetV2: Inverted Residuals and Linear Bottlenecks](https://arxiv.org/abs/1801.04381)
- **[MobileViT](https://huggingface.co/docs/transformers/model_doc/mobilevit)**
  - [MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer](https://arxiv.org/abs/2110.02178)
- **[MobileViTV2](https://huggingface.co/docs/transformers/model_doc/mobilevitv2)**
  - [Separable Self-attention for Mobile Vision Transformers](https://arxiv.org/abs/2206.02680)
- **[MPNet](https://huggingface.co/docs/transformers/model_doc/mpnet)**
  - [MPNet: Masked and Permuted Pre-training for Language Understanding](https://arxiv.org/abs/2004.09297)
- **[MPT](https://huggingface.co/docs/transformers/model_doc/mpt)**
  - [llm-foundry](https://github.com/mosaicml/llm-foundry/)
- **[MRA](https://huggingface.co/docs/transformers/model_doc/mra)**
  - [Multi Resolution Analysis (MRA) for Approximate Self-Attention](https://arxiv.org/abs/2207.10284)
- **[MT5](https://huggingface.co/docs/transformers/model_doc/mt5)**
  - [mT5: A massively multilingual pre-trained text-to-text transformer](https://arxiv.org/abs/2010.11934)
- **[MusicGen](https://huggingface.co/docs/transformers/model_doc/musicgen)**
  - [Simple and Controllable Music Generation](https://arxiv.org/abs/2306.05284)
- **[MVP](https://huggingface.co/docs/transformers/model_doc/mvp)**
  - [MVP: Multi-task Supervised Pre-training for Natural Language Generation](https://arxiv.org/abs/2206.12131)
- **[NAT](https://huggingface.co/docs/transformers/model_doc/nat)**
  - [Neighborhood Attention Transformer](https://arxiv.org/abs/2204.07143)
- **[Nezha](https://huggingface.co/docs/transformers/model_doc/nezha)**
  - [NEZHA: Neural Contextualized Representation for Chinese Language Understanding](https://arxiv.org/abs/1909.00204)
- **[NLLB](https://huggingface.co/docs/transformers/model_doc/nllb)**
  - [No Language Left Behind: Scaling Human-Centered Machine Translation](https://arxiv.org/abs/2207.04672)
- **[NLLB-MOE](https://huggingface.co/docs/transformers/model_doc/nllb-moe)**
  - [No Language Left Behind: Scaling Human-Centered Machine Translation](https://arxiv.org/abs/2207.04672)
- **[Nougat](https://huggingface.co/docs/transformers/model_doc/nougat)**
  - [Nougat: Neural Optical Understanding for Academic Documents](https://arxiv.org/abs/2308.13418)
- **[Nyströmformer](https://huggingface.co/docs/transformers/model_doc/nystromformer)**
  - [Nyströmformer: A Nyström-Based Algorithm for Approximating Self-Attention](https://arxiv.org/abs/2102.03902)
- **[OneFormer](https://huggingface.co/docs/transformers/model_doc/oneformer)**
  - [OneFormer: One Transformer to Rule Universal Image Segmentation](https://arxiv.org/abs/2211.06220)
- **[OpenLlama](https://huggingface.co/docs/transformers/model_doc/open-llama)**
  - [s-JoL](https://huggingface.co/s-JoL)
- **[OPT](https://huggingface.co/docs/transformers/master/model_doc/opt)**
  - [OPT: Open Pre-trained Transformer Language Models](https://arxiv.org/abs/2205.01068)
- **[OWL-ViT](https://huggingface.co/docs/transformers/model_doc/owlvit)**
  - [Simple Open-Vocabulary Object Detection with Vision Transformers](https://arxiv.org/abs/2205.06230)
- **[OWLv2](https://huggingface.co/docs/transformers/model_doc/owlv2)**
  - [Scaling Open-Vocabulary Object Detection](https://arxiv.org/abs/2306.09683)
- **[PatchTSMixer](https://huggingface.co/docs/transformers/model_doc/patchtsmixer)**
  - [TSMixer: Lightweight MLP-Mixer Model for Multivariate Time Series Forecasting](https://arxiv.org/pdf/2306.09364.pdf)
- **[PatchTST](https://huggingface.co/docs/transformers/model_doc/patchtst)**
  - [A Time Series is Worth 64 Words: Long-term Forecasting with Transformers](https://arxiv.org/abs/2211.14730)
- **[Pegasus](https://huggingface.co/docs/transformers/model_doc/pegasus)**
  - [PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization](https://arxiv.org/abs/1912.08777)
- **[PEGASUS-X](https://huggingface.co/docs/transformers/model_doc/pegasus_x)**
  - [Investigating Efficiently Extending Transformers for Long Input Summarization](https://arxiv.org/abs/2208.04347)
- **[Perceiver IO](https://huggingface.co/docs/transformers/model_doc/perceiver)**
  - [Perceiver IO: A General Architecture for Structured Inputs & Outputs](https://arxiv.org/abs/2107.14795)
- **[Persimmon](https://huggingface.co/docs/transformers/model_doc/persimmon)**
  - [blog post](https://www.adept.ai/blog/persimmon-8b)
- **[Phi](https://huggingface.co/docs/transformers/model_doc/phi)**
  - [Textbooks Are All You Need](https://arxiv.org/abs/2306.11644)
  - [Textbooks Are All You Need II: phi-1.5 technical report](https://arxiv.org/abs/2309.05463)
- **[PhoBERT](https://huggingface.co/docs/transformers/model_doc/phobert)**
  - [PhoBERT: Pre-trained language models for Vietnamese](https://www.aclweb.org/anthology/2020.findings-emnlp.92/)
- **[Pix2Struct](https://huggingface.co/docs/transformers/model_doc/pix2struct)**
  - [Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding](https://arxiv.org/abs/2210.03347)
- **[PLBart](https://huggingface.co/docs/transformers/model_doc/plbart)**
  - [Unified Pre-training for Program Understanding and Generation](https://arxiv.org/abs/2103.06333)
- **[PoolFormer](https://huggingface.co/docs/transformers/model_doc/poolformer)**
  - [MetaFormer is Actually What You Need for Vision](https://arxiv.org/abs/2111.11418)
- **[Pop2Piano](https://huggingface.co/docs/transformers/model_doc/pop2piano)**
  - [Pop2Piano: Pop Audio-based Piano Cover Generation](https://arxiv.org/abs/2211.00895)
- **[ProphetNet](https://huggingface.co/docs/transformers/model_doc/prophetnet)**
  - [ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training](https://arxiv.org/abs/2001.04063)
- **[PVT](https://huggingface.co/docs/transformers/model_doc/pvt)**
  - [Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions](https://arxiv.org/pdf/2102.12122.pdf)
- **[QDQBert](https://huggingface.co/docs/transformers/model_doc/qdqbert)**
  - [Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation](https://arxiv.org/abs/2004.09602)
- **[Qwen2](https://huggingface.co/docs/transformers/model_doc/qwen2)**
  - [Qwen Technical Report](https://arxiv.org/abs/2309.16609)
- **[RAG](https://huggingface.co/docs/transformers/model_doc/rag)**
  - [Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks](https://arxiv.org/abs/2005.11401)
- **[REALM](https://huggingface.co/docs/transformers/model_doc/realm.html)**
  - [REALM: Retrieval-Augmented Language Model Pre-Training](https://arxiv.org/abs/2002.08909)
- **[Reformer](https://huggingface.co/docs/transformers/model_doc/reformer)**
  - [Reformer: The Efficient Transformer](https://arxiv.org/abs/2001.04451)
- **[RegNet](https://huggingface.co/docs/transformers/model_doc/regnet)**
  - [Designing Network Design Spaces](https://arxiv.org/abs/2003.13678)
- **[RemBERT](https://huggingface.co/docs/transformers/model_doc/rembert)**
  - [Rethinking embedding coupling in pre-trained language models](https://arxiv.org/abs/2010.12821)
- **[ResNet](https://huggingface.co/docs/transformers/model_doc/resnet)**
  - [Deep Residual Learning for Image Recognition](https://arxiv.org/abs/1512.03385)
- **[RoBERTa](https://huggingface.co/docs/transformers/model_doc/roberta)**
  - [RoBERTa: A Robustly Optimized BERT Pretraining Approach](https://arxiv.org/abs/1907.11692)
- **[RoBERTa-PreLayerNorm](https://huggingface.co/docs/transformers/model_doc/roberta-prelayernorm)**
  - [fairseq: A Fast, Extensible Toolkit for Sequence Modeling](https://arxiv.org/abs/1904.01038)
- **[RoCBert](https://huggingface.co/docs/transformers/model_doc/roc_bert)**
  - [RoCBert: Robust Chinese Bert with Multimodal Contrastive Pretraining](https://aclanthology.org/2022.acl-long.65.pdf)
- **[RoFormer](https://huggingface.co/docs/transformers/model_doc/roformer)**
  - [RoFormer: Enhanced Transformer with Rotary Position Embedding](https://arxiv.org/abs/2104.09864)
- **[RWKV](https://huggingface.co/docs/transformers/model_doc/rwkv)**
  - [this repo](https://github.com/BlinkDL/RWKV-LM)
- **[SeamlessM4T](https://huggingface.co/docs/transformers/model_doc/seamless_m4t)**
  - [SeamlessM4T — Massively Multilingual & Multimodal Machine Translation](https://dl.fbaipublicfiles.com/seamless/seamless_m4t_paper.pdf)
- **[SeamlessM4Tv2](https://huggingface.co/docs/transformers/model_doc/seamless_m4t_v2)**
  - [Seamless: Multilingual Expressive and Streaming Speech Translation](https://ai.meta.com/research/publications/seamless-multilingual-expressive-and-streaming-speech-translation/)
- **[SegFormer](https://huggingface.co/docs/transformers/model_doc/segformer)**
  - [SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers](https://arxiv.org/abs/2105.15203)
- **[SegGPT](https://huggingface.co/docs/transformers/main/model_doc/seggpt)**
  - [SegGPT: Segmenting Everything In Context](https://arxiv.org/abs/2304.03284)
- **[Segment Anything](https://huggingface.co/docs/transformers/model_doc/sam)**
  - [Segment Anything](https://arxiv.org/pdf/2304.02643v1.pdf)
- **[SEW](https://huggingface.co/docs/transformers/model_doc/sew)**
  - [Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speech Recognition](https://arxiv.org/abs/2109.06870)
- **[SEW-D](https://huggingface.co/docs/transformers/model_doc/sew_d)**
  - [Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speech Recognition](https://arxiv.org/abs/2109.06870)
- **[SigLIP](https://huggingface.co/docs/transformers/model_doc/siglip)**
  - [Sigmoid Loss for Language Image Pre-Training](https://arxiv.org/abs/2303.15343)
- **[SpeechT5](https://huggingface.co/docs/transformers/model_doc/speecht5)**
  - [SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing](https://arxiv.org/abs/2110.07205)
- **[SpeechToTextTransformer](https://huggingface.co/docs/transformers/model_doc/speech_to_text)**
  - [fairseq S2T: Fast Speech-to-Text Modeling with fairseq](https://arxiv.org/abs/2010.05171)
- **[SpeechToTextTransformer2](https://huggingface.co/docs/transformers/model_doc/speech_to_text_2)**
  - [Large-Scale Self- and Semi-Supervised Learning for Speech Translation](https://arxiv.org/abs/2104.06678)
- **[Splinter](https://huggingface.co/docs/transformers/model_doc/splinter)**
  - [Few-Shot Question Answering by Pretraining Span Selection](https://arxiv.org/abs/2101.00438)
- **[SqueezeBERT](https://huggingface.co/docs/transformers/model_doc/squeezebert)**
  - [SqueezeBERT: What can computer vision teach NLP about efficient neural networks?](https://arxiv.org/abs/2006.11316)
- **[StableLm](https://huggingface.co/docs/transformers/model_doc/stablelm)**
  - [StableLM 3B 4E1T (Technical Report)](https://stability.wandb.io/stability-llm/stable-lm/reports/StableLM-3B-4E1T--VmlldzoyMjU4?accessToken=u3zujipenkx5g7rtcj9qojjgxpconyjktjkli2po09nffrffdhhchq045vp0wyfo)
- **[Starcoder2](https://huggingface.co/docs/transformers/main/model_doc/starcoder2)**
  - [StarCoder 2 and The Stack v2: The Next Generation](https://arxiv.org/abs/2402.19173)
- **[SwiftFormer](https://huggingface.co/docs/transformers/model_doc/swiftformer)**
  - [SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications](https://arxiv.org/abs/2303.15446)
- **[Swin Transformer](https://huggingface.co/docs/transformers/model_doc/swin)**
  - [Swin Transformer: Hierarchical Vision Transformer using Shifted Windows](https://arxiv.org/abs/2103.14030)
- **[Swin Transformer V2](https://huggingface.co/docs/transformers/model_doc/swinv2)**
  - [Swin Transformer V2: Scaling Up Capacity and Resolution](https://arxiv.org/abs/2111.09883)
- **[Swin2SR](https://huggingface.co/docs/transformers/model_doc/swin2sr)**
  - [Swin2SR: SwinV2 Transformer for Compressed Image Super-Resolution and Restoration](https://arxiv.org/abs/2209.11345)
- **[SwitchTransformers](https://huggingface.co/docs/transformers/model_doc/switch_transformers)**
  - [Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity](https://arxiv.org/abs/2101.03961)
- **[T5](https://huggingface.co/docs/transformers/model_doc/t5)**
  - [Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/abs/1910.10683)
- **[T5v1.1](https://huggingface.co/docs/transformers/model_doc/t5v1.1)**
  - [google-research/text-to-text-transfer-transformer](https://github.com/google-research/text-to-text-transfer-transformer/blob/main/released_checkpoints.md#t511)
- **[Table Transformer](https://huggingface.co/docs/transformers/model_doc/table-transformer)**
  - [PubTables-1M: Towards Comprehensive Table Extraction From Unstructured Documents](https://arxiv.org/abs/2110.00061)
- **[TAPAS](https://huggingface.co/docs/transformers/model_doc/tapas)**
  - [TAPAS: Weakly Supervised Table Parsing via Pre-training](https://arxiv.org/abs/2004.02349)
- **[TAPEX](https://huggingface.co/docs/transformers/model_doc/tapex)**
  - [TAPEX: Table Pre-training via Learning a Neural SQL Executor](https://arxiv.org/abs/2107.07653)
- **[Time Series Transformer](https://huggingface.co/docs/transformers/model_doc/time_series_transformer)**
- **[TimeSformer](https://huggingface.co/docs/transformers/model_doc/timesformer)**
  - [Is Space-Time Attention All You Need for Video Understanding?](https://arxiv.org/abs/2102.05095)
- **[Trajectory Transformer](https://huggingface.co/docs/transformers/model_doc/trajectory_transformers)**
  - [Offline Reinforcement Learning as One Big Sequence Modeling Problem](https://arxiv.org/abs/2106.02039)
- **[Transformer-XL](https://huggingface.co/docs/transformers/model_doc/transfo-xl)**
  - [Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context](https://arxiv.org/abs/1901.02860)
- **[TrOCR](https://huggingface.co/docs/transformers/model_doc/trocr)**
  - [TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models](https://arxiv.org/abs/2109.10282)
- **[TVLT](https://huggingface.co/docs/transformers/model_doc/tvlt)**
  - [TVLT: Textless Vision-Language Transformer](https://arxiv.org/abs/2209.14156)
- **[TVP](https://huggingface.co/docs/transformers/model_doc/tvp)**
  - [Text-Visual Prompting for Efficient 2D Temporal Video Grounding](https://arxiv.org/abs/2303.04995)
- **[UDOP](https://huggingface.co/docs/transformers/main/model_doc/udop)**
  - [Unifying Vision, Text, and Layout for Universal Document Processing](https://arxiv.org/abs/2212.02623)
- **[UL2](https://huggingface.co/docs/transformers/model_doc/ul2)**
  - [Unifying Language Learning Paradigms](https://arxiv.org/abs/2205.05131v1)
- **[UMT5](https://huggingface.co/docs/transformers/model_doc/umt5)**
  - [UniMax: Fairer and More Effective Language Sampling for Large-Scale Multilingual Pretraining](https://openreview.net/forum?id=kXwdL1cWOAi)
- **[UniSpeech](https://huggingface.co/docs/transformers/model_doc/unispeech)**
  - [UniSpeech: Unified Speech Representation Learning with Labeled and Unlabeled Data](https://arxiv.org/abs/2101.07597)
- **[UniSpeechSat](https://huggingface.co/docs/transformers/model_doc/unispeech-sat)**
  - [UNISPEECH-SAT: UNIVERSAL SPEECH REPRESENTATION LEARNING WITH SPEAKER AWARE PRE-TRAINING](https://arxiv.org/abs/2110.05752)
- **[UnivNet](https://huggingface.co/docs/transformers/model_doc/univnet)**
  - [UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation](https://arxiv.org/abs/2106.07889)
- **[UPerNet](https://huggingface.co/docs/transformers/model_doc/upernet)**
  - [Unified Perceptual Parsing for Scene Understanding](https://arxiv.org/abs/1807.10221)
- **[VAN](https://huggingface.co/docs/transformers/model_doc/van)**
  - [Visual Attention Network](https://arxiv.org/abs/2202.09741)
- **[VideoMAE](https://huggingface.co/docs/transformers/model_doc/videomae)**
  - [VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training](https://arxiv.org/abs/2203.12602)
- **[ViLT](https://huggingface.co/docs/transformers/model_doc/vilt)**
  - [ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision](https://arxiv.org/abs/2102.03334)
- **[VipLlava](https://huggingface.co/docs/transformers/model_doc/vipllava)**
Making Large Multimodal Models Understand Arbitrary Visual Promptshttps://arxiv.org/abs/2312.00784
Vision Transformer (ViT)https://huggingface.co/docs/transformers/model_doc/vit
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scalehttps://arxiv.org/abs/2010.11929
VisualBERThttps://huggingface.co/docs/transformers/model_doc/visual_bert
VisualBERT: A Simple and Performant Baseline for Vision and Languagehttps://arxiv.org/pdf/1908.03557
ViT Hybridhttps://huggingface.co/docs/transformers/model_doc/vit_hybrid
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scalehttps://arxiv.org/abs/2010.11929
VitDethttps://huggingface.co/docs/transformers/model_doc/vitdet
Exploring Plain Vision Transformer Backbones for Object Detectionhttps://arxiv.org/abs/2203.16527
ViTMAEhttps://huggingface.co/docs/transformers/model_doc/vit_mae
Masked Autoencoders Are Scalable Vision Learnershttps://arxiv.org/abs/2111.06377
ViTMattehttps://huggingface.co/docs/transformers/model_doc/vitmatte
ViTMatte: Boosting Image Matting with Pretrained Plain Vision Transformershttps://arxiv.org/abs/2305.15272
ViTMSNhttps://huggingface.co/docs/transformers/model_doc/vit_msn
Masked Siamese Networks for Label-Efficient Learninghttps://arxiv.org/abs/2204.07141
VITShttps://huggingface.co/docs/transformers/model_doc/vits
Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speechhttps://arxiv.org/abs/2106.06103
ViViThttps://huggingface.co/docs/transformers/model_doc/vivit
ViViT: A Video Vision Transformerhttps://arxiv.org/abs/2103.15691
Wav2Vec2https://huggingface.co/docs/transformers/model_doc/wav2vec2
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representationshttps://arxiv.org/abs/2006.11477
Wav2Vec2-BERThttps://huggingface.co/docs/transformers/model_doc/wav2vec2-bert
Seamless: Multilingual Expressive and Streaming Speech Translationhttps://ai.meta.com/research/publications/seamless-multilingual-expressive-and-streaming-speech-translation/
Wav2Vec2-Conformerhttps://huggingface.co/docs/transformers/model_doc/wav2vec2-conformer
FAIRSEQ S2T: Fast Speech-to-Text Modeling with FAIRSEQhttps://arxiv.org/abs/2010.05171
Wav2Vec2Phonemehttps://huggingface.co/docs/transformers/model_doc/wav2vec2_phoneme
Simple and Effective Zero-shot Cross-lingual Phoneme Recognitionhttps://arxiv.org/abs/2109.11680
WavLMhttps://huggingface.co/docs/transformers/model_doc/wavlm
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processinghttps://arxiv.org/abs/2110.13900
Whisperhttps://huggingface.co/docs/transformers/model_doc/whisper
Robust Speech Recognition via Large-Scale Weak Supervisionhttps://cdn.openai.com/papers/whisper.pdf
X-CLIPhttps://huggingface.co/docs/transformers/model_doc/xclip
Expanding Language-Image Pretrained Models for General Video Recognitionhttps://arxiv.org/abs/2208.02816
X-MODhttps://huggingface.co/docs/transformers/model_doc/xmod
Lifting the Curse of Multilinguality by Pre-training Modular Transformershttp://dx.doi.org/10.18653/v1/2022.naacl-main.255
XGLMhttps://huggingface.co/docs/transformers/model_doc/xglm
Few-shot Learning with Multilingual Language Modelshttps://arxiv.org/abs/2112.10668
XLMhttps://huggingface.co/docs/transformers/model_doc/xlm
Cross-lingual Language Model Pretraininghttps://arxiv.org/abs/1901.07291
XLM-ProphetNethttps://huggingface.co/docs/transformers/model_doc/xlm-prophetnet
ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-traininghttps://arxiv.org/abs/2001.04063
XLM-RoBERTahttps://huggingface.co/docs/transformers/model_doc/xlm-roberta
Unsupervised Cross-lingual Representation Learning at Scalehttps://arxiv.org/abs/1911.02116
XLM-RoBERTa-XLhttps://huggingface.co/docs/transformers/model_doc/xlm-roberta-xl
Larger-Scale Transformers for Multilingual Masked Language Modelinghttps://arxiv.org/abs/2105.00572
XLM-Vhttps://huggingface.co/docs/transformers/model_doc/xlm-v
XLM-V: Overcoming the Vocabulary Bottleneck in Multilingual Masked Language Modelshttps://arxiv.org/abs/2301.10472
XLNethttps://huggingface.co/docs/transformers/model_doc/xlnet
XLNet: Generalized Autoregressive Pretraining for Language Understandinghttps://arxiv.org/abs/1906.08237
XLS-Rhttps://huggingface.co/docs/transformers/model_doc/xls_r
XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scalehttps://arxiv.org/abs/2111.09296
XLSR-Wav2Vec2https://huggingface.co/docs/transformers/model_doc/xlsr_wav2vec2
Unsupervised Cross-Lingual Representation Learning For Speech Recognitionhttps://arxiv.org/abs/2006.13979
YOLOShttps://huggingface.co/docs/transformers/model_doc/yolos
You Only Look at One Sequence: Rethinking Transformer in Vision through Object Detectionhttps://arxiv.org/abs/2106.00666
YOSOhttps://huggingface.co/docs/transformers/model_doc/yoso
You Only Sample (Almost) Once: Linear Cost Self-Attention Via Bernoulli Samplinghttps://arxiv.org/abs/2111.09714
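Each entry above pairs a docs page under a common URL scheme with its paper link. As a minimal sketch of that scheme (the slugs and arXiv identifiers come from the links above; any other slug is only an assumption about how a model's docs page is named):

```python
# Build documentation and arXiv links following the URL patterns
# used in the model list above.
DOCS_BASE = "https://huggingface.co/docs/transformers/model_doc"


def model_doc_url(slug: str) -> str:
    """Return the docs URL for a model page slug, e.g. 'xlnet' or 'yoso'."""
    return f"{DOCS_BASE}/{slug}"


def arxiv_abs_url(arxiv_id: str) -> str:
    """Return the arXiv abstract URL for an identifier, e.g. '1906.08237'."""
    return f"https://arxiv.org/abs/{arxiv_id}"


# Reproduces the XLNet links from the list above.
print(model_doc_url("xlnet"))      # https://huggingface.co/docs/transformers/model_doc/xlnet
print(arxiv_abs_url("1906.08237"))  # https://arxiv.org/abs/1906.08237
```

A few entries deviate from the pattern (UDOP lives under `main/model_doc/`, and some papers link to OpenReview, a DOI, or a PDF instead of an arXiv abstract), so the helpers cover the common case only.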
- [templates](https://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/templates)
- [contributing guidelines](https://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/CONTRIBUTING.md)
- [supported frameworks table](https://huggingface.co/docs/transformers/index#supported-frameworks)
- [examples documentation](https://github.com/huggingface/transformers/tree/main/examples)
## Learn more

- [Documentation](https://huggingface.co/docs/transformers/)
- [Task summary](https://huggingface.co/docs/transformers/task_summary)
- [Preprocessing tutorial](https://huggingface.co/docs/transformers/preprocessing)
- [Training and fine-tuning](https://huggingface.co/docs/transformers/training)
- [Quick tour: Fine-tuning/usage scripts](https://github.com/huggingface/transformers/tree/main/examples)
- [Model sharing and uploading](https://huggingface.co/docs/transformers/model_sharing)
## Citation

- [paper](https://www.aclweb.org/anthology/2020.emnlp-demos.6/)
- [huggingface.co/transformers](https://huggingface.co/transformers)
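For convenience, a BibTeX entry for the paper linked above (the author list is abridged with "and others"; see the linked Anthology page for the full list):

```bibtex
@inproceedings{wolf-etal-2020-transformers,
    title     = "Transformers: State-of-the-Art Natural Language Processing",
    author    = "Wolf, Thomas and Debut, Lysandre and Sanh, Victor and
                 Chaumond, Julien and Delangue, Clement and others",
    booktitle = "Proceedings of the 2020 Conference on Empirical Methods in
                 Natural Language Processing: System Demonstrations",
    year      = "2020",
    publisher = "Association for Computational Linguistics",
    url       = "https://www.aclweb.org/anthology/2020.emnlp-demos.6",
    pages     = "38--45"
}
```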
[Apache-2.0 license](https://patch-diff.githubusercontent.com/S4Plus/transformers#Apache-2.0-1-ov-file)
