|
S4Plus
| https://patch-diff.githubusercontent.com/S4Plus |
| transformers | https://patch-diff.githubusercontent.com/S4Plus/transformers |
| huggingface/transformers | https://patch-diff.githubusercontent.com/huggingface/transformers |
|
Notifications
| https://patch-diff.githubusercontent.com/login?return_to=%2FS4Plus%2Ftransformers |
|
Fork
0
| https://patch-diff.githubusercontent.com/login?return_to=%2FS4Plus%2Ftransformers |
|
Star
0
| https://patch-diff.githubusercontent.com/login?return_to=%2FS4Plus%2Ftransformers |
| huggingface.co/transformers | https://huggingface.co/transformers |
|
Apache-2.0 license
| https://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/LICENSE |
|
0
stars
| https://patch-diff.githubusercontent.com/S4Plus/transformers/stargazers |
|
31.9k
forks
| https://patch-diff.githubusercontent.com/S4Plus/transformers/forks |
|
Branches
| https://patch-diff.githubusercontent.com/S4Plus/transformers/branches |
|
Tags
| https://patch-diff.githubusercontent.com/S4Plus/transformers/tags |
|
Activity
| https://patch-diff.githubusercontent.com/S4Plus/transformers/activity |
|
Star
| https://patch-diff.githubusercontent.com/login?return_to=%2FS4Plus%2Ftransformers |
|
Notifications
| https://patch-diff.githubusercontent.com/login?return_to=%2FS4Plus%2Ftransformers |
|
Code
| https://patch-diff.githubusercontent.com/S4Plus/transformers |
|
Pull requests
0
| https://patch-diff.githubusercontent.com/S4Plus/transformers/pulls |
|
Actions
| https://patch-diff.githubusercontent.com/S4Plus/transformers/actions |
|
Projects
0
| https://patch-diff.githubusercontent.com/S4Plus/transformers/projects |
|
Security
0
| https://patch-diff.githubusercontent.com/S4Plus/transformers/security |
|
Insights
| https://patch-diff.githubusercontent.com/S4Plus/transformers/pulse |
|
Code
| https://patch-diff.githubusercontent.com/S4Plus/transformers |
|
Pull requests
| https://patch-diff.githubusercontent.com/S4Plus/transformers/pulls |
|
Actions
| https://patch-diff.githubusercontent.com/S4Plus/transformers/actions |
|
Projects
| https://patch-diff.githubusercontent.com/S4Plus/transformers/projects |
|
Security
| https://patch-diff.githubusercontent.com/S4Plus/transformers/security |
|
Insights
| https://patch-diff.githubusercontent.com/S4Plus/transformers/pulse |
| Branches | https://patch-diff.githubusercontent.com/S4Plus/transformers/branches |
| Tags | https://patch-diff.githubusercontent.com/S4Plus/transformers/tags |
| https://patch-diff.githubusercontent.com/S4Plus/transformers/branches |
| https://patch-diff.githubusercontent.com/S4Plus/transformers/tags |
[15,280 Commits](https://patch-diff.githubusercontent.com/S4Plus/transformers/commits/main/)

Repository contents:

- [.circleci](https://patch-diff.githubusercontent.com/S4Plus/transformers/tree/main/.circleci)
- [.github](https://patch-diff.githubusercontent.com/S4Plus/transformers/tree/main/.github)
- [docker](https://patch-diff.githubusercontent.com/S4Plus/transformers/tree/main/docker)
- [docs](https://patch-diff.githubusercontent.com/S4Plus/transformers/tree/main/docs)
- [examples](https://patch-diff.githubusercontent.com/S4Plus/transformers/tree/main/examples)
- [model_cards](https://patch-diff.githubusercontent.com/S4Plus/transformers/tree/main/model_cards)
- [notebooks](https://patch-diff.githubusercontent.com/S4Plus/transformers/tree/main/notebooks)
- [scripts](https://patch-diff.githubusercontent.com/S4Plus/transformers/tree/main/scripts)
- [src/transformers](https://patch-diff.githubusercontent.com/S4Plus/transformers/tree/main/src/transformers)
- [templates](https://patch-diff.githubusercontent.com/S4Plus/transformers/tree/main/templates)
- [tests](https://patch-diff.githubusercontent.com/S4Plus/transformers/tree/main/tests)
- [utils](https://patch-diff.githubusercontent.com/S4Plus/transformers/tree/main/utils)
- [.coveragerc](https://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/.coveragerc)
- [.gitattributes](https://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/.gitattributes)
- [.gitignore](https://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/.gitignore)
- [CITATION.cff](https://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/CITATION.cff)
- [CODE_OF_CONDUCT.md](https://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/CODE_OF_CONDUCT.md)
- [CONTRIBUTING.md](https://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/CONTRIBUTING.md)
- [ISSUES.md](https://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/ISSUES.md)
- [LICENSE](https://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/LICENSE)
- [Makefile](https://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/Makefile)
- [README.md](https://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/README.md)
- [README_de.md](https://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/README_de.md)
- [README_es.md](https://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/README_es.md)
- [README_fr.md](https://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/README_fr.md)
- [README_hd.md](https://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/README_hd.md)
- [README_ja.md](https://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/README_ja.md)
- [README_ko.md](https://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/README_ko.md)
- [README_pt-br.md](https://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/README_pt-br.md)
- [README_ru.md](https://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/README_ru.md)
- [README_te.md](https://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/README_te.md)
- [README_vi.md](https://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/README_vi.md)
- [README_zh-hans.md](https://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/README_zh-hans.md)
- [README_zh-hant.md](https://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/README_zh-hant.md)
- [SECURITY.md](https://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/SECURITY.md)
- [awesome-transformers.md](https://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/awesome-transformers.md)
- [conftest.py](https://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/conftest.py)
- [hubconf.py](https://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/hubconf.py)
- [pyproject.toml](https://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/pyproject.toml)
- [setup.py](https://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/setup.py)
Badges: [Build](https://circleci.com/gh/huggingface/transformers) · [License](https://github.com/huggingface/transformers/blob/main/LICENSE) · [Documentation](https://huggingface.co/docs/transformers/index) · [Release](https://github.com/huggingface/transformers/releases) · [Contributor Covenant](https://github.com/huggingface/transformers/blob/main/CODE_OF_CONDUCT.md) · [DOI](https://zenodo.org/badge/latestdoi/155220641)
This README is also available in: [简体中文](https://github.com/huggingface/transformers/blob/main/README_zh-hans.md) · [繁體中文](https://github.com/huggingface/transformers/blob/main/README_zh-hant.md) · [한국어](https://github.com/huggingface/transformers/blob/main/README_ko.md) · [Español](https://github.com/huggingface/transformers/blob/main/README_es.md) · [日本語](https://github.com/huggingface/transformers/blob/main/README_ja.md) · [हिन्दी](https://github.com/huggingface/transformers/blob/main/README_hd.md) · [Русский](https://github.com/huggingface/transformers/blob/main/README_ru.md) · [Português](https://github.com/huggingface/transformers/blob/main/README_pt-br.md) · [తెలుగు](https://github.com/huggingface/transformers/blob/main/README_te.md) · [Français](https://github.com/huggingface/transformers/blob/main/README_fr.md) · [Deutsch](https://github.com/huggingface/transformers/blob/main/README_de.md) · [Tiếng Việt](https://github.com/huggingface/transformers/blob/main/README_vi.md)
## State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow

Transformers provides pretrained models that you can download from the [model hub](https://huggingface.co/models) and use with [Jax](https://jax.readthedocs.io/en/latest/), [PyTorch](https://pytorch.org/) and [TensorFlow](https://www.tensorflow.org/). To learn more, see the free [Hugging Face course](https://hf.co/course).
## Online demos

You can test most of our models directly on their pages from the [model hub](https://huggingface.co/models). We also offer [private model hosting, versioning, & an inference API](https://huggingface.co/pricing).

In Natural Language Processing:

- [Masked word completion with BERT](https://huggingface.co/google-bert/bert-base-uncased?text=Paris+is+the+%5BMASK%5D+of+France)
- [Named Entity Recognition with Electra](https://huggingface.co/dbmdz/electra-large-discriminator-finetuned-conll03-english?text=My+name+is+Sarah+and+I+live+in+London+city)
- [Text generation with Mistral](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2)
- [Natural Language Inference with RoBERTa](https://huggingface.co/FacebookAI/roberta-large-mnli?text=The+dog+was+lost.+Nobody+lost+any+animal)
- [Summarization with BART](https://huggingface.co/facebook/bart-large-cnn?text=The+tower+is+324+metres+%281%2C063+ft%29+tall%2C+about+the+same+height+as+an+81-storey+building%2C+and+the+tallest+structure+in+Paris.+Its+base+is+square%2C+measuring+125+metres+%28410+ft%29+on+each+side.+During+its+construction%2C+the+Eiffel+Tower+surpassed+the+Washington+Monument+to+become+the+tallest+man-made+structure+in+the+world%2C+a+title+it+held+for+41+years+until+the+Chrysler+Building+in+New+York+City+was+finished+in+1930.+It+was+the+first+structure+to+reach+a+height+of+300+metres.+Due+to+the+addition+of+a+broadcasting+aerial+at+the+top+of+the+tower+in+1957%2C+it+is+now+taller+than+the+Chrysler+Building+by+5.2+metres+%2817+ft%29.+Excluding+transmitters%2C+the+Eiffel+Tower+is+the+second+tallest+free-standing+structure+in+France+after+the+Millau+Viaduct)
- [Question answering with DistilBERT](https://huggingface.co/distilbert/distilbert-base-uncased-distilled-squad?text=Which+name+is+also+used+to+describe+the+Amazon+rainforest+in+English%3F&context=The+Amazon+rainforest+%28Portuguese%3A+Floresta+Amaz%C3%B4nica+or+Amaz%C3%B4nia%3B+Spanish%3A+Selva+Amaz%C3%B3nica%2C+Amazon%C3%ADa+or+usually+Amazonia%3B+French%3A+For%C3%AAt+amazonienne%3B+Dutch%3A+Amazoneregenwoud%29%2C+also+known+in+English+as+Amazonia+or+the+Amazon+Jungle%2C+is+a+moist+broadleaf+forest+that+covers+most+of+the+Amazon+basin+of+South+America.+This+basin+encompasses+7%2C000%2C000+square+kilometres+%282%2C700%2C000+sq+mi%29%2C+of+which+5%2C500%2C000+square+kilometres+%282%2C100%2C000+sq+mi%29+are+covered+by+the+rainforest.+This+region+includes+territory+belonging+to+nine+nations.+The+majority+of+the+forest+is+contained+within+Brazil%2C+with+60%25+of+the+rainforest%2C+followed+by+Peru+with+13%25%2C+Colombia+with+10%25%2C+and+with+minor+amounts+in+Venezuela%2C+Ecuador%2C+Bolivia%2C+Guyana%2C+Suriname+and+French+Guiana.+States+or+departments+in+four+nations+contain+%22Amazonas%22+in+their+names.+The+Amazon+represents+over+half+of+the+planet%27s+remaining+rainforests%2C+and+comprises+the+largest+and+most+biodiverse+tract+of+tropical+rainforest+in+the+world%2C+with+an+estimated+390+billion+individual+trees+divided+into+16%2C000+species)
- [Translation with T5](https://huggingface.co/google-t5/t5-base?text=My+name+is+Wolfgang+and+I+live+in+Berlin)

In Computer Vision:

- [Image classification with ViT](https://huggingface.co/google/vit-base-patch16-224)
- [Object Detection with DETR](https://huggingface.co/facebook/detr-resnet-50)
- [Semantic Segmentation with SegFormer](https://huggingface.co/nvidia/segformer-b0-finetuned-ade-512-512)
- [Panoptic Segmentation with Mask2Former](https://huggingface.co/facebook/mask2former-swin-large-coco-panoptic)
- [Depth Estimation with Depth Anything](https://huggingface.co/docs/transformers/main/model_doc/depth_anything)
- [Video Classification with VideoMAE](https://huggingface.co/docs/transformers/model_doc/videomae)
- [Universal Segmentation with OneFormer](https://huggingface.co/shi-labs/oneformer_ade20k_dinat_large)

In Audio:

- [Automatic Speech Recognition with Whisper](https://huggingface.co/openai/whisper-large-v3)
- [Keyword Spotting with Wav2Vec2](https://huggingface.co/superb/wav2vec2-base-superb-ks)
- [Audio Classification with Audio Spectrogram Transformer](https://huggingface.co/MIT/ast-finetuned-audioset-10-10-0.4593)

In Multimodal tasks:

- [Table Question Answering with TAPAS](https://huggingface.co/google/tapas-base-finetuned-wtq)
- [Visual Question Answering with ViLT](https://huggingface.co/dandelin/vilt-b32-finetuned-vqa)
- [Image captioning with LLaVa](https://huggingface.co/llava-hf/llava-1.5-7b-hf)
- [Zero-shot Image Classification with SigLIP](https://huggingface.co/google/siglip-so400m-patch14-384)
- [Document Question Answering with LayoutLM](https://huggingface.co/impira/layoutlm-document-qa)
- [Zero-shot Video Classification with X-CLIP](https://huggingface.co/docs/transformers/model_doc/xclip)
- [Zero-shot Object Detection with OWLv2](https://huggingface.co/docs/transformers/en/model_doc/owlv2)
- [Zero-shot Image Segmentation with CLIPSeg](https://huggingface.co/docs/transformers/model_doc/clipseg)
- [Automatic Mask Generation with SAM](https://huggingface.co/docs/transformers/model_doc/sam)
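Each hosted demo above corresponds to a pipeline task that can also be run locally. A minimal sketch of the masked-word demo, assuming `transformers` and a backend such as PyTorch are installed; the checkpoint name comes from the BERT demo link above, and the model is downloaded on first use:

```python
from transformers import pipeline

# Masked word completion, mirroring the hosted BERT demo.
unmasker = pipeline("fill-mask", model="google-bert/bert-base-uncased")
predictions = unmasker("Paris is the [MASK] of France.")

# Each prediction carries the filled-in token and a confidence score.
for p in predictions[:3]:
    print(p["token_str"], round(p["score"], 3))
```

The same `pipeline(task, model=...)` pattern applies to the other demos, with the task string and checkpoint taken from the corresponding link.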
## 100 projects using Transformers

See the [awesome-transformers](https://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/awesome-transformers.md) page for a list of incredible projects built with the library.
## If you are looking for custom support from the Hugging Face team

Visit [huggingface.co/support](https://huggingface.co/support).
## Quick tour

To immediately use a model on a given input (text, image, audio, ...), we provide the pipeline API. (Figure: an object-detection example, showing the input image coco_sample.png and the annotated output coco_sample_post_processed.png.) You can learn more about the tasks supported by the pipeline API in [this tutorial](https://huggingface.co/docs/transformers/task_summary).

The models themselves are regular [Pytorch nn.Module](https://pytorch.org/docs/stable/nn.html#torch.nn.Module) or [TensorFlow tf.keras.Model](https://www.tensorflow.org/api_docs/python/tf/keras/Model) instances, so you can use them in your usual training loops. [This tutorial](https://huggingface.co/docs/transformers/training) explains how to integrate such a model into a classic training loop, or how to fine-tune it with the Trainer API.
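The two levels of the API referenced above can be sketched as follows. This is standard Transformers usage rather than code from this repository; the sentiment-analysis pipeline and the `distilbert-base-uncased-finetuned-sst-2-english` checkpoint are its documented defaults for that task:

```python
from transformers import pipeline, AutoTokenizer, AutoModelForSequenceClassification

# High level: a pipeline bundles tokenizer, model, and post-processing.
classifier = pipeline("sentiment-analysis")
result = classifier("We are very happy to show you the Transformers library.")
print(result)  # a list of {'label': ..., 'score': ...} dicts

# Lower level: the same kind of checkpoint as a regular PyTorch nn.Module.
name = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)
inputs = tokenizer("Hello world!", return_tensors="pt")
outputs = model(**inputs)  # outputs.logits has shape (batch, num_labels)
```

Because `model` is an ordinary `nn.Module`, it can be dropped into any PyTorch training loop, which is what the fine-tuning tutorial linked above builds on.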
## Why should I use transformers?

Easy-to-use state-of-the-art models with a unified API across frameworks, and lower compute costs through shared pretrained checkpoints.

## Why shouldn't I use transformers?

The library is not a modular toolbox of building blocks for neural nets, and the training API is optimized for the models the library provides; for generic machine learning loops, you should use another library such as [Accelerate](https://huggingface.co/docs/accelerate). The scripts in the [examples folder](https://github.com/huggingface/transformers/tree/main/examples) are just that: examples, which you will likely need to adapt to your specific problem.
## Installation

### With pip

You should install Transformers in a [virtual environment](https://docs.python.org/3/library/venv.html); if you're unfamiliar with Python virtual environments, check out the [user guide](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/). First, install at least one backend: see the [TensorFlow installation page](https://www.tensorflow.org/install/), the [PyTorch installation page](https://pytorch.org/get-started/locally/#start-locally), and the [Flax](https://github.com/google/flax#quick-install) and [Jax](https://github.com/google/jax#installation) installation pages for the command specific to your platform. Once one of those backends is installed, Transformers can be installed with `pip install transformers`. If you'd like to play with the examples or need the bleeding edge of the code, you must [install the library from source](https://huggingface.co/docs/transformers/installation#installing-from-source).
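The pip steps above amount to the following commands; the `.env` directory name and the choice of PyTorch as the backend are illustrative:

```shell
# Create and activate a virtual environment (see the venv guide linked above)
python -m venv .env
source .env/bin/activate

# Install a backend first (PyTorch shown; use the linked installation pages
# for TensorFlow or Flax/Jax instead), then install Transformers itself
pip install torch
pip install transformers
```

On Windows, replace the `source` line with `.env\Scripts\activate`.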
### With conda

Transformers can also be installed via conda from the conda-forge channel; installing from the huggingface channel is deprecated. Note: on Windows, you may be prompted to activate Developer Mode in order to benefit from caching. If this is not an option for you, please let us know in [this issue](https://github.com/huggingface/huggingface_hub/issues/1062).
## Model architectures

[All the model checkpoints](https://huggingface.co/models) provided by Transformers are seamlessly integrated from the huggingface.co [model hub](https://huggingface.co/models), where they are uploaded directly by [users](https://huggingface.co/users) and [organizations](https://huggingface.co/organizations).

Transformers currently provides the following architectures (see [here](https://huggingface.co/docs/transformers/model_summary) for a high-level summary of each of them):
- **[ALBERT](https://huggingface.co/docs/transformers/model_doc/albert)**, released with the paper [ALBERT: A Lite BERT for Self-supervised Learning of Language Representations](https://arxiv.org/abs/1909.11942)
- **[ALIGN](https://huggingface.co/docs/transformers/model_doc/align)**, released with the paper [Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision](https://arxiv.org/abs/2102.05918)
- **[AltCLIP](https://huggingface.co/docs/transformers/model_doc/altclip)**, released with the paper [AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities](https://arxiv.org/abs/2211.06679)
- **[Audio Spectrogram Transformer](https://huggingface.co/docs/transformers/model_doc/audio-spectrogram-transformer)**, released with the paper [AST: Audio Spectrogram Transformer](https://arxiv.org/abs/2104.01778)
- **[Autoformer](https://huggingface.co/docs/transformers/model_doc/autoformer)**, released with the paper [Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting](https://arxiv.org/abs/2106.13008)
- **[Bark](https://huggingface.co/docs/transformers/model_doc/bark)**, released in the repository [suno-ai/bark](https://github.com/suno-ai/bark)
- **[BART](https://huggingface.co/docs/transformers/model_doc/bart)**, released with the paper [BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension](https://arxiv.org/abs/1910.13461)
- **[BARThez](https://huggingface.co/docs/transformers/model_doc/barthez)**, released with the paper [BARThez: a Skilled Pretrained French Sequence-to-Sequence Model](https://arxiv.org/abs/2010.12321)
- **[BARTpho](https://huggingface.co/docs/transformers/model_doc/bartpho)**, released with the paper [BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese](https://arxiv.org/abs/2109.09701)
- **[BEiT](https://huggingface.co/docs/transformers/model_doc/beit)**, released with the paper [BEiT: BERT Pre-Training of Image Transformers](https://arxiv.org/abs/2106.08254)
- **[BERT](https://huggingface.co/docs/transformers/model_doc/bert)**, released with the paper [BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding](https://arxiv.org/abs/1810.04805)
- **[BERT For Sequence Generation](https://huggingface.co/docs/transformers/model_doc/bert-generation)**, released with the paper [Leveraging Pre-trained Checkpoints for Sequence Generation Tasks](https://arxiv.org/abs/1907.12461)
- **[BERTweet](https://huggingface.co/docs/transformers/model_doc/bertweet)**, released with the paper [BERTweet: A pre-trained language model for English Tweets](https://aclanthology.org/2020.emnlp-demos.2/)
- **[BigBird-Pegasus](https://huggingface.co/docs/transformers/model_doc/bigbird_pegasus)**, released with the paper [Big Bird: Transformers for Longer Sequences](https://arxiv.org/abs/2007.14062)
- **[BigBird-RoBERTa](https://huggingface.co/docs/transformers/model_doc/big_bird)**, released with the paper [Big Bird: Transformers for Longer Sequences](https://arxiv.org/abs/2007.14062)
- **[BioGpt](https://huggingface.co/docs/transformers/model_doc/biogpt)**, released with the paper [BioGPT: generative pre-trained transformer for biomedical text generation and mining](https://academic.oup.com/bib/advance-article/doi/10.1093/bib/bbac409/6713511?guestAccessKey=a66d9b5d-4f83-4017-bb52-405815c907b9)
- **[BiT](https://huggingface.co/docs/transformers/model_doc/bit)**, released with the paper [Big Transfer (BiT): General Visual Representation Learning](https://arxiv.org/abs/1912.11370)
- **[Blenderbot](https://huggingface.co/docs/transformers/model_doc/blenderbot)**, released with the paper [Recipes for building an open-domain chatbot](https://arxiv.org/abs/2004.13637)
- **[BlenderbotSmall](https://huggingface.co/docs/transformers/model_doc/blenderbot-small)**, released with the paper [Recipes for building an open-domain chatbot](https://arxiv.org/abs/2004.13637)
- **[BLIP](https://huggingface.co/docs/transformers/model_doc/blip)**, released with the paper [BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation](https://arxiv.org/abs/2201.12086)
- **[BLIP-2](https://huggingface.co/docs/transformers/model_doc/blip-2)**, released with the paper [BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models](https://arxiv.org/abs/2301.12597)
- **[BLOOM](https://huggingface.co/docs/transformers/model_doc/bloom)**, released by the [BigScience Workshop](https://bigscience.huggingface.co/)
- **[BORT](https://huggingface.co/docs/transformers/model_doc/bort)**, released with the paper [Optimal Subarchitecture Extraction For BERT](https://arxiv.org/abs/2010.10499)
- **[BridgeTower](https://huggingface.co/docs/transformers/model_doc/bridgetower)**, released with the paper [BridgeTower: Building Bridges Between Encoders in Vision-Language Representation Learning](https://arxiv.org/abs/2206.08657)
- **[BROS](https://huggingface.co/docs/transformers/model_doc/bros)**, released with the paper [BROS: A Pre-trained Language Model Focusing on Text and Layout for Better Key Information Extraction from Documents](https://arxiv.org/abs/2108.04539)
- **[ByT5](https://huggingface.co/docs/transformers/model_doc/byt5)**, released with the paper [ByT5: Towards a token-free future with pre-trained byte-to-byte models](https://arxiv.org/abs/2105.13626)
- **[CamemBERT](https://huggingface.co/docs/transformers/model_doc/camembert)**, released with the paper [CamemBERT: a Tasty French Language Model](https://arxiv.org/abs/1911.03894)
- **[CANINE](https://huggingface.co/docs/transformers/model_doc/canine)**, released with the paper [CANINE: Pre-training an Efficient Tokenization-Free Encoder for Language Representation](https://arxiv.org/abs/2103.06874)
- **[Chinese-CLIP](https://huggingface.co/docs/transformers/model_doc/chinese_clip)**, released with the paper [Chinese CLIP: Contrastive Vision-Language Pretraining in Chinese](https://arxiv.org/abs/2211.01335)
- **[CLAP](https://huggingface.co/docs/transformers/model_doc/clap)**, released with the paper [Large-scale Contrastive Language-Audio Pretraining with Feature Fusion and Keyword-to-Caption Augmentation](https://arxiv.org/abs/2211.06687)
- **[CLIP](https://huggingface.co/docs/transformers/model_doc/clip)**, released with the paper [Learning Transferable Visual Models From Natural Language Supervision](https://arxiv.org/abs/2103.00020)
- **[CLIPSeg](https://huggingface.co/docs/transformers/model_doc/clipseg)**, released with the paper [Image Segmentation Using Text and Image Prompts](https://arxiv.org/abs/2112.10003)
- **[CLVP](https://huggingface.co/docs/transformers/model_doc/clvp)**, released with the paper [Better speech synthesis through scaling](https://arxiv.org/abs/2305.07243)
- **[CodeGen](https://huggingface.co/docs/transformers/model_doc/codegen)**, released with the paper [A Conversational Paradigm for Program Synthesis](https://arxiv.org/abs/2203.13474)
- **[CodeLlama](https://huggingface.co/docs/transformers/model_doc/llama_code)**, released with the paper [Code Llama: Open Foundation Models for Code](https://ai.meta.com/research/publications/code-llama-open-foundation-models-for-code/)
- **[Conditional DETR](https://huggingface.co/docs/transformers/model_doc/conditional_detr)**, released with the paper [Conditional DETR for Fast Training Convergence](https://arxiv.org/abs/2108.06152)
- **[ConvBERT](https://huggingface.co/docs/transformers/model_doc/convbert)**, released with the paper [ConvBERT: Improving BERT with Span-based Dynamic Convolution](https://arxiv.org/abs/2008.02496)
- **[ConvNeXT](https://huggingface.co/docs/transformers/model_doc/convnext)**, released with the paper [A ConvNet for the 2020s](https://arxiv.org/abs/2201.03545)
- **[ConvNeXTV2](https://huggingface.co/docs/transformers/model_doc/convnextv2)**, released with the paper [ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders](https://arxiv.org/abs/2301.00808)
- **[CPM](https://huggingface.co/docs/transformers/model_doc/cpm)**, released with the paper [CPM: A Large-scale Generative Chinese Pre-trained Language Model](https://arxiv.org/abs/2012.00413)
- **[CPM-Ant](https://huggingface.co/docs/transformers/model_doc/cpmant)**, released by [OpenBMB](https://www.openbmb.org/)
- **[CTRL](https://huggingface.co/docs/transformers/model_doc/ctrl)**
| CTRL: A Conditional Transformer Language Model for Controllable Generation | https://arxiv.org/abs/1909.05858 |
| CvT | https://huggingface.co/docs/transformers/model_doc/cvt |
| CvT: Introducing Convolutions to Vision Transformers | https://arxiv.org/abs/2103.15808 |
| Data2Vec | https://huggingface.co/docs/transformers/model_doc/data2vec |
| Data2Vec: A General Framework for Self-supervised Learning in Speech, Vision and Language | https://arxiv.org/abs/2202.03555 |
| DeBERTa | https://huggingface.co/docs/transformers/model_doc/deberta |
| DeBERTa: Decoding-enhanced BERT with Disentangled Attention | https://arxiv.org/abs/2006.03654 |
| DeBERTa-v2 | https://huggingface.co/docs/transformers/model_doc/deberta-v2 |
| DeBERTa: Decoding-enhanced BERT with Disentangled Attention | https://arxiv.org/abs/2006.03654 |
| Decision Transformer | https://huggingface.co/docs/transformers/model_doc/decision_transformer |
| Decision Transformer: Reinforcement Learning via Sequence Modeling | https://arxiv.org/abs/2106.01345 |
| Deformable DETR | https://huggingface.co/docs/transformers/model_doc/deformable_detr |
| Deformable DETR: Deformable Transformers for End-to-End Object Detection | https://arxiv.org/abs/2010.04159 |
| DeiT | https://huggingface.co/docs/transformers/model_doc/deit |
| Training data-efficient image transformers & distillation through attention | https://arxiv.org/abs/2012.12877 |
| DePlot | https://huggingface.co/docs/transformers/model_doc/deplot |
| DePlot: One-shot visual language reasoning by plot-to-table translation | https://arxiv.org/abs/2212.10505 |
| Depth Anything | https://huggingface.co/docs/transformers/model_doc/depth_anything |
| Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data | https://arxiv.org/abs/2401.10891 |
| DETA | https://huggingface.co/docs/transformers/model_doc/deta |
| NMS Strikes Back | https://arxiv.org/abs/2212.06137 |
| DETR | https://huggingface.co/docs/transformers/model_doc/detr |
| End-to-End Object Detection with Transformers | https://arxiv.org/abs/2005.12872 |
| DialoGPT | https://huggingface.co/docs/transformers/model_doc/dialogpt |
| DialoGPT: Large-Scale Generative Pre-training for Conversational Response Generation | https://arxiv.org/abs/1911.00536 |
| DiNAT | https://huggingface.co/docs/transformers/model_doc/dinat |
| Dilated Neighborhood Attention Transformer | https://arxiv.org/abs/2209.15001 |
| DINOv2 | https://huggingface.co/docs/transformers/model_doc/dinov2 |
| DINOv2: Learning Robust Visual Features without Supervision | https://arxiv.org/abs/2304.07193 |
| DistilBERT | https://huggingface.co/docs/transformers/model_doc/distilbert |
| DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter | https://arxiv.org/abs/1910.01108 |
| DistilGPT2 | https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation |
| DistilRoBERTa | https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation |
| DistilmBERT | https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation |
| DiT | https://huggingface.co/docs/transformers/model_doc/dit |
| DiT: Self-supervised Pre-training for Document Image Transformer | https://arxiv.org/abs/2203.02378 |
| Donut | https://huggingface.co/docs/transformers/model_doc/donut |
| OCR-free Document Understanding Transformer | https://arxiv.org/abs/2111.15664 |
| DPR | https://huggingface.co/docs/transformers/model_doc/dpr |
| Dense Passage Retrieval for Open-Domain Question Answering | https://arxiv.org/abs/2004.04906 |
| DPT | https://huggingface.co/docs/transformers/master/model_doc/dpt |
| Vision Transformers for Dense Prediction | https://arxiv.org/abs/2103.13413 |
| EfficientFormer | https://huggingface.co/docs/transformers/model_doc/efficientformer |
| EfficientFormer: Vision Transformers at MobileNet Speed | https://arxiv.org/abs/2206.01191 |
| EfficientNet | https://huggingface.co/docs/transformers/model_doc/efficientnet |
| EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks | https://arxiv.org/abs/1905.11946 |
| ELECTRA | https://huggingface.co/docs/transformers/model_doc/electra |
| ELECTRA: Pre-training text encoders as discriminators rather than generators | https://arxiv.org/abs/2003.10555 |
| EnCodec | https://huggingface.co/docs/transformers/model_doc/encodec |
| High Fidelity Neural Audio Compression | https://arxiv.org/abs/2210.13438 |
| EncoderDecoder | https://huggingface.co/docs/transformers/model_doc/encoder-decoder |
| Leveraging Pre-trained Checkpoints for Sequence Generation Tasks | https://arxiv.org/abs/1907.12461 |
| ERNIE | https://huggingface.co/docs/transformers/model_doc/ernie |
| ERNIE: Enhanced Representation through Knowledge Integration | https://arxiv.org/abs/1904.09223 |
| ErnieM | https://huggingface.co/docs/transformers/model_doc/ernie_m |
| ERNIE-M: Enhanced Multilingual Representation by Aligning Cross-lingual Semantics with Monolingual Corpora | https://arxiv.org/abs/2012.15674 |
| ESM | https://huggingface.co/docs/transformers/model_doc/esm |
| Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences | https://www.pnas.org/content/118/15/e2016239118 |
| Language models enable zero-shot prediction of the effects of mutations on protein function | https://doi.org/10.1101/2021.07.09.450648 |
| Language models of protein sequences at the scale of evolution enable accurate structure prediction | https://doi.org/10.1101/2022.07.20.500902 |
| Falcon | https://huggingface.co/docs/transformers/model_doc/falcon |
| FastSpeech2Conformer | https://huggingface.co/docs/transformers/model_doc/fastspeech2_conformer |
| Recent Developments On Espnet Toolkit Boosted By Conformer | https://arxiv.org/abs/2010.13956 |
| FLAN-T5 | https://huggingface.co/docs/transformers/model_doc/flan-t5 |
| google-research/t5x | https://github.com/google-research/t5x/blob/main/docs/models.md#flan-t5-checkpoints |
| FLAN-UL2 | https://huggingface.co/docs/transformers/model_doc/flan-ul2 |
| google-research/t5x | https://github.com/google-research/t5x/blob/main/docs/models.md#flan-ul2-checkpoints |
| FlauBERT | https://huggingface.co/docs/transformers/model_doc/flaubert |
| FlauBERT: Unsupervised Language Model Pre-training for French | https://arxiv.org/abs/1912.05372 |
| FLAVA | https://huggingface.co/docs/transformers/model_doc/flava |
| FLAVA: A Foundational Language And Vision Alignment Model | https://arxiv.org/abs/2112.04482 |
| FNet | https://huggingface.co/docs/transformers/model_doc/fnet |
| FNet: Mixing Tokens with Fourier Transforms | https://arxiv.org/abs/2105.03824 |
| FocalNet | https://huggingface.co/docs/transformers/model_doc/focalnet |
| Focal Modulation Networks | https://arxiv.org/abs/2203.11926 |
| Funnel Transformer | https://huggingface.co/docs/transformers/model_doc/funnel |
| Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing | https://arxiv.org/abs/2006.03236 |
| Fuyu | https://huggingface.co/docs/transformers/model_doc/fuyu |
| blog post | https://www.adept.ai/blog/fuyu-8b |
| Gemma | https://huggingface.co/docs/transformers/main/model_doc/gemma |
| Gemma: Open Models Based on Gemini Technology and Research | https://blog.google/technology/developers/gemma-open-models/ |
| GIT | https://huggingface.co/docs/transformers/model_doc/git |
| GIT: A Generative Image-to-text Transformer for Vision and Language | https://arxiv.org/abs/2205.14100 |
| GLPN | https://huggingface.co/docs/transformers/model_doc/glpn |
| Global-Local Path Networks for Monocular Depth Estimation with Vertical CutDepth | https://arxiv.org/abs/2201.07436 |
| GPT | https://huggingface.co/docs/transformers/model_doc/openai-gpt |
| Improving Language Understanding by Generative Pre-Training | https://openai.com/research/language-unsupervised/ |
| GPT Neo | https://huggingface.co/docs/transformers/model_doc/gpt_neo |
| EleutherAI/gpt-neo | https://github.com/EleutherAI/gpt-neo |
| GPT NeoX | https://huggingface.co/docs/transformers/model_doc/gpt_neox |
| GPT-NeoX-20B: An Open-Source Autoregressive Language Model | https://arxiv.org/abs/2204.06745 |
| GPT NeoX Japanese | https://huggingface.co/docs/transformers/model_doc/gpt_neox_japanese |
| GPT-2 | https://huggingface.co/docs/transformers/model_doc/gpt2 |
| Language Models are Unsupervised Multitask Learners | https://openai.com/research/better-language-models/ |
| GPT-J | https://huggingface.co/docs/transformers/model_doc/gptj |
| kingoflolz/mesh-transformer-jax | https://github.com/kingoflolz/mesh-transformer-jax/ |
| GPT-Sw3 | https://huggingface.co/docs/transformers/model_doc/gpt-sw3 |
| Lessons Learned from GPT-SW3: Building the First Large-Scale Generative Language Model for Swedish | http://www.lrec-conf.org/proceedings/lrec2022/pdf/2022.lrec-1.376.pdf |
| GPTBigCode | https://huggingface.co/docs/transformers/model_doc/gpt_bigcode |
| SantaCoder: don't reach for the stars! | https://arxiv.org/abs/2301.03988 |
| GPTSAN-japanese | https://huggingface.co/docs/transformers/model_doc/gptsan-japanese |
| tanreinama/GPTSAN | https://github.com/tanreinama/GPTSAN/blob/main/report/model.md |
| Graphormer | https://huggingface.co/docs/transformers/model_doc/graphormer |
| Do Transformers Really Perform Bad for Graph Representation? | https://arxiv.org/abs/2106.05234 |
| GroupViT | https://huggingface.co/docs/transformers/model_doc/groupvit |
| GroupViT: Semantic Segmentation Emerges from Text Supervision | https://arxiv.org/abs/2202.11094 |
| HerBERT | https://huggingface.co/docs/transformers/model_doc/herbert |
| KLEJ: Comprehensive Benchmark for Polish Language Understanding | https://www.aclweb.org/anthology/2020.acl-main.111.pdf |
| Hubert | https://huggingface.co/docs/transformers/model_doc/hubert |
| HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units | https://arxiv.org/abs/2106.07447 |
| I-BERT | https://huggingface.co/docs/transformers/model_doc/ibert |
| I-BERT: Integer-only BERT Quantization | https://arxiv.org/abs/2101.01321 |
| IDEFICS | https://huggingface.co/docs/transformers/model_doc/idefics |
| OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents | https://huggingface.co/papers/2306.16527 |
| ImageGPT | https://huggingface.co/docs/transformers/model_doc/imagegpt |
| Generative Pretraining from Pixels | https://openai.com/blog/image-gpt/ |
| Informer | https://huggingface.co/docs/transformers/model_doc/informer |
| Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting | https://arxiv.org/abs/2012.07436 |
| InstructBLIP | https://huggingface.co/docs/transformers/model_doc/instructblip |
| InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning | https://arxiv.org/abs/2305.06500 |
| Jukebox | https://huggingface.co/docs/transformers/model_doc/jukebox |
| Jukebox: A Generative Model for Music | https://arxiv.org/pdf/2005.00341.pdf |
| KOSMOS-2 | https://huggingface.co/docs/transformers/model_doc/kosmos-2 |
| Kosmos-2: Grounding Multimodal Large Language Models to the World | https://arxiv.org/abs/2306.14824 |
| LayoutLM | https://huggingface.co/docs/transformers/model_doc/layoutlm |
| LayoutLM: Pre-training of Text and Layout for Document Image Understanding | https://arxiv.org/abs/1912.13318 |
| LayoutLMv2 | https://huggingface.co/docs/transformers/model_doc/layoutlmv2 |
| LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding | https://arxiv.org/abs/2012.14740 |
| LayoutLMv3 | https://huggingface.co/docs/transformers/model_doc/layoutlmv3 |
| LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking | https://arxiv.org/abs/2204.08387 |
| LayoutXLM | https://huggingface.co/docs/transformers/model_doc/layoutxlm |
| LayoutXLM: Multimodal Pre-training for Multilingual Visually-rich Document Understanding | https://arxiv.org/abs/2104.08836 |
| LED | https://huggingface.co/docs/transformers/model_doc/led |
| Longformer: The Long-Document Transformer | https://arxiv.org/abs/2004.05150 |
| LeViT | https://huggingface.co/docs/transformers/model_doc/levit |
| LeViT: A Vision Transformer in ConvNet's Clothing for Faster Inference | https://arxiv.org/abs/2104.01136 |
| LiLT | https://huggingface.co/docs/transformers/model_doc/lilt |
| LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding | https://arxiv.org/abs/2202.13669 |
| LLaMA | https://huggingface.co/docs/transformers/model_doc/llama |
| LLaMA: Open and Efficient Foundation Language Models | https://arxiv.org/abs/2302.13971 |
| Llama2 | https://huggingface.co/docs/transformers/model_doc/llama2 |
| Llama2: Open Foundation and Fine-Tuned Chat Models | https://ai.meta.com/research/publications/llama-2-open-foundation-and-fine-tuned-chat-models/ |
| LLaVa | https://huggingface.co/docs/transformers/model_doc/llava |
| Visual Instruction Tuning | https://arxiv.org/abs/2304.08485 |
| Longformer | https://huggingface.co/docs/transformers/model_doc/longformer |
| Longformer: The Long-Document Transformer | https://arxiv.org/abs/2004.05150 |
| LongT5 | https://huggingface.co/docs/transformers/model_doc/longt5 |
| LongT5: Efficient Text-To-Text Transformer for Long Sequences | https://arxiv.org/abs/2112.07916 |
| LUKE | https://huggingface.co/docs/transformers/model_doc/luke |
| LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention | https://arxiv.org/abs/2010.01057 |
| LXMERT | https://huggingface.co/docs/transformers/model_doc/lxmert |
| LXMERT: Learning Cross-Modality Encoder Representations from Transformers for Open-Domain Question Answering | https://arxiv.org/abs/1908.07490 |
| M-CTC-T | https://huggingface.co/docs/transformers/model_doc/mctct |
| Pseudo-Labeling For Massively Multilingual Speech Recognition | https://arxiv.org/abs/2111.00161 |
| M2M100 | https://huggingface.co/docs/transformers/model_doc/m2m_100 |
| Beyond English-Centric Multilingual Machine Translation | https://arxiv.org/abs/2010.11125 |
| MADLAD-400 | https://huggingface.co/docs/transformers/model_doc/madlad-400 |
| MADLAD-400: A Multilingual And Document-Level Large Audited Dataset | https://arxiv.org/abs/2309.04662 |
| Mamba | https://huggingface.co/docs/transformers/main/model_doc/mamba |
| Mamba: Linear-Time Sequence Modeling with Selective State Spaces | https://arxiv.org/abs/2312.00752 |
| MarianMT | https://huggingface.co/docs/transformers/model_doc/marian |
| OPUS | http://opus.nlpl.eu/ |
| Marian Framework | https://marian-nmt.github.io/ |
| MarkupLM | https://huggingface.co/docs/transformers/model_doc/markuplm |
| MarkupLM: Pre-training of Text and Markup Language for Visually-rich Document Understanding | https://arxiv.org/abs/2110.08518 |
| Mask2Former | https://huggingface.co/docs/transformers/model_doc/mask2former |
| Masked-attention Mask Transformer for Universal Image Segmentation | https://arxiv.org/abs/2112.01527 |
| MaskFormer | https://huggingface.co/docs/transformers/model_doc/maskformer |
| Per-Pixel Classification is Not All You Need for Semantic Segmentation | https://arxiv.org/abs/2107.06278 |
| MatCha | https://huggingface.co/docs/transformers/model_doc/matcha |
| MatCha: Enhancing Visual Language Pretraining with Math Reasoning and Chart Derendering | https://arxiv.org/abs/2212.09662 |
| mBART | https://huggingface.co/docs/transformers/model_doc/mbart |
| Multilingual Denoising Pre-training for Neural Machine Translation | https://arxiv.org/abs/2001.08210 |
| mBART-50 | https://huggingface.co/docs/transformers/model_doc/mbart |
| Multilingual Translation with Extensible Multilingual Pretraining and Finetuning | https://arxiv.org/abs/2008.00401 |
| MEGA | https://huggingface.co/docs/transformers/model_doc/mega |
| Mega: Moving Average Equipped Gated Attention | https://arxiv.org/abs/2209.10655 |
| Megatron-BERT | https://huggingface.co/docs/transformers/model_doc/megatron-bert |
| Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism | https://arxiv.org/abs/1909.08053 |
| Megatron-GPT2 | https://huggingface.co/docs/transformers/model_doc/megatron_gpt2 |
| Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism | https://arxiv.org/abs/1909.08053 |
| MGP-STR | https://huggingface.co/docs/transformers/model_doc/mgp-str |
| Multi-Granularity Prediction for Scene Text Recognition | https://arxiv.org/abs/2209.03592 |
| Mistral | https://huggingface.co/docs/transformers/model_doc/mistral |
| Mistral AI | https://mistral.ai |
| Mixtral | https://huggingface.co/docs/transformers/model_doc/mixtral |
| Mistral AI | https://mistral.ai |
| mLUKE | https://huggingface.co/docs/transformers/model_doc/mluke |
| mLUKE: The Power of Entity Representations in Multilingual Pretrained Language Models | https://arxiv.org/abs/2110.08151 |
| MMS | https://huggingface.co/docs/transformers/model_doc/mms |
| Scaling Speech Technology to 1,000+ Languages | https://arxiv.org/abs/2305.13516 |
| MobileBERT | https://huggingface.co/docs/transformers/model_doc/mobilebert |
| MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices | https://arxiv.org/abs/2004.02984 |
| MobileNetV1 | https://huggingface.co/docs/transformers/model_doc/mobilenet_v1 |
| MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications | https://arxiv.org/abs/1704.04861 |
| MobileNetV2 | https://huggingface.co/docs/transformers/model_doc/mobilenet_v2 |
| MobileNetV2: Inverted Residuals and Linear Bottlenecks | https://arxiv.org/abs/1801.04381 |
| MobileViT | https://huggingface.co/docs/transformers/model_doc/mobilevit |
| MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer | https://arxiv.org/abs/2110.02178 |
| MobileViTV2 | https://huggingface.co/docs/transformers/model_doc/mobilevitv2 |
| Separable Self-attention for Mobile Vision Transformers | https://arxiv.org/abs/2206.02680 |
| MPNet | https://huggingface.co/docs/transformers/model_doc/mpnet |
| MPNet: Masked and Permuted Pre-training for Language Understanding | https://arxiv.org/abs/2004.09297 |
| MPT | https://huggingface.co/docs/transformers/model_doc/mpt |
| llm-foundry | https://github.com/mosaicml/llm-foundry/ |
| MRA | https://huggingface.co/docs/transformers/model_doc/mra |
| Multi Resolution Analysis (MRA) for Approximate Self-Attention | https://arxiv.org/abs/2207.10284 |
| MT5 | https://huggingface.co/docs/transformers/model_doc/mt5 |
| mT5: A massively multilingual pre-trained text-to-text transformer | https://arxiv.org/abs/2010.11934 |
| MusicGen | https://huggingface.co/docs/transformers/model_doc/musicgen |
| Simple and Controllable Music Generation | https://arxiv.org/abs/2306.05284 |
| MVP | https://huggingface.co/docs/transformers/model_doc/mvp |
| MVP: Multi-task Supervised Pre-training for Natural Language Generation | https://arxiv.org/abs/2206.12131 |
| NAT | https://huggingface.co/docs/transformers/model_doc/nat |
| Neighborhood Attention Transformer | https://arxiv.org/abs/2204.07143 |
| Nezha | https://huggingface.co/docs/transformers/model_doc/nezha |
| NEZHA: Neural Contextualized Representation for Chinese Language Understanding | https://arxiv.org/abs/1909.00204 |
| NLLB | https://huggingface.co/docs/transformers/model_doc/nllb |
| No Language Left Behind: Scaling Human-Centered Machine Translation | https://arxiv.org/abs/2207.04672 |
| NLLB-MOE | https://huggingface.co/docs/transformers/model_doc/nllb-moe |
| No Language Left Behind: Scaling Human-Centered Machine Translation | https://arxiv.org/abs/2207.04672 |
| Nougat | https://huggingface.co/docs/transformers/model_doc/nougat |
| Nougat: Neural Optical Understanding for Academic Documents | https://arxiv.org/abs/2308.13418 |
| Nyströmformer | https://huggingface.co/docs/transformers/model_doc/nystromformer |
| Nyströmformer: A Nyström-Based Algorithm for Approximating Self-Attention | https://arxiv.org/abs/2102.03902 |
| OneFormer | https://huggingface.co/docs/transformers/model_doc/oneformer |
| OneFormer: One Transformer to Rule Universal Image Segmentation | https://arxiv.org/abs/2211.06220 |
| OpenLlama | https://huggingface.co/docs/transformers/model_doc/open-llama |
| s-JoL | https://huggingface.co/s-JoL |
| OPT | https://huggingface.co/docs/transformers/master/model_doc/opt |
| OPT: Open Pre-trained Transformer Language Models | https://arxiv.org/abs/2205.01068 |
| OWL-ViT | https://huggingface.co/docs/transformers/model_doc/owlvit |
| Simple Open-Vocabulary Object Detection with Vision Transformers | https://arxiv.org/abs/2205.06230 |
| OWLv2 | https://huggingface.co/docs/transformers/model_doc/owlv2 |
| Scaling Open-Vocabulary Object Detection | https://arxiv.org/abs/2306.09683 |
| PatchTSMixer | https://huggingface.co/docs/transformers/model_doc/patchtsmixer |
| TSMixer: Lightweight MLP-Mixer Model for Multivariate Time Series Forecasting | https://arxiv.org/pdf/2306.09364.pdf |
| PatchTST | https://huggingface.co/docs/transformers/model_doc/patchtst |
| A Time Series is Worth 64 Words: Long-term Forecasting with Transformers | https://arxiv.org/abs/2211.14730 |
| Pegasus | https://huggingface.co/docs/transformers/model_doc/pegasus |
| PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization | https://arxiv.org/abs/1912.08777 |
| PEGASUS-X | https://huggingface.co/docs/transformers/model_doc/pegasus_x |
| Investigating Efficiently Extending Transformers for Long Input Summarization | https://arxiv.org/abs/2208.04347 |
| Perceiver IO | https://huggingface.co/docs/transformers/model_doc/perceiver |
| Perceiver IO: A General Architecture for Structured Inputs & Outputs | https://arxiv.org/abs/2107.14795 |
| Persimmon | https://huggingface.co/docs/transformers/model_doc/persimmon |
| blog post | https://www.adept.ai/blog/persimmon-8b |
| Phi | https://huggingface.co/docs/transformers/model_doc/phi |
| Textbooks Are All You Need | https://arxiv.org/abs/2306.11644 |
| Textbooks Are All You Need II: phi-1.5 technical report | https://arxiv.org/abs/2309.05463 |
| PhoBERT | https://huggingface.co/docs/transformers/model_doc/phobert |
| PhoBERT: Pre-trained language models for Vietnamese | https://www.aclweb.org/anthology/2020.findings-emnlp.92/ |
| Pix2Struct | https://huggingface.co/docs/transformers/model_doc/pix2struct |
| Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding | https://arxiv.org/abs/2210.03347 |
| PLBart | https://huggingface.co/docs/transformers/model_doc/plbart |
| Unified Pre-training for Program Understanding and Generation | https://arxiv.org/abs/2103.06333 |
| PoolFormer | https://huggingface.co/docs/transformers/model_doc/poolformer |
| MetaFormer is Actually What You Need for Vision | https://arxiv.org/abs/2111.11418 |
| Pop2Piano | https://huggingface.co/docs/transformers/model_doc/pop2piano |
| Pop2Piano : Pop Audio-based Piano Cover Generation | https://arxiv.org/abs/2211.00895 |
| ProphetNet | https://huggingface.co/docs/transformers/model_doc/prophetnet |
| ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training | https://arxiv.org/abs/2001.04063 |
| PVT | https://huggingface.co/docs/transformers/model_doc/pvt |
| Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions | https://arxiv.org/pdf/2102.12122.pdf |
| QDQBert | https://huggingface.co/docs/transformers/model_doc/qdqbert |
| Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation | https://arxiv.org/abs/2004.09602 |
| Qwen2 | https://huggingface.co/docs/transformers/model_doc/qwen2 |
| Qwen Technical Report | https://arxiv.org/abs/2309.16609 |
| RAG | https://huggingface.co/docs/transformers/model_doc/rag |
| Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks | https://arxiv.org/abs/2005.11401 |
| REALM | https://huggingface.co/docs/transformers/model_doc/realm.html |
| REALM: Retrieval-Augmented Language Model Pre-Training | https://arxiv.org/abs/2002.08909 |
| Reformer | https://huggingface.co/docs/transformers/model_doc/reformer |
| Reformer: The Efficient Transformer | https://arxiv.org/abs/2001.04451 |
| RegNet | https://huggingface.co/docs/transformers/model_doc/regnet |
| Designing Network Design Spaces | https://arxiv.org/abs/2003.13678 |
| RemBERT | https://huggingface.co/docs/transformers/model_doc/rembert |
| Rethinking embedding coupling in pre-trained language models | https://arxiv.org/abs/2010.12821 |
| ResNet | https://huggingface.co/docs/transformers/model_doc/resnet |
| Deep Residual Learning for Image Recognition | https://arxiv.org/abs/1512.03385 |
| RoBERTa | https://huggingface.co/docs/transformers/model_doc/roberta |
| RoBERTa: A Robustly Optimized BERT Pretraining Approach | https://arxiv.org/abs/1907.11692 |
| RoBERTa-PreLayerNorm | https://huggingface.co/docs/transformers/model_doc/roberta-prelayernorm |
| fairseq: A Fast, Extensible Toolkit for Sequence Modeling | https://arxiv.org/abs/1904.01038 |
| RoCBert | https://huggingface.co/docs/transformers/model_doc/roc_bert |
| RoCBert: Robust Chinese Bert with Multimodal Contrastive Pretraining | https://aclanthology.org/2022.acl-long.65.pdf |
| RoFormer | https://huggingface.co/docs/transformers/model_doc/roformer |
| RoFormer: Enhanced Transformer with Rotary Position Embedding | https://arxiv.org/abs/2104.09864 |
| RWKV | https://huggingface.co/docs/transformers/model_doc/rwkv |
| this repo | https://github.com/BlinkDL/RWKV-LM |
| SeamlessM4T | https://huggingface.co/docs/transformers/model_doc/seamless_m4t |
| SeamlessM4T — Massively Multilingual & Multimodal Machine Translation | https://dl.fbaipublicfiles.com/seamless/seamless_m4t_paper.pdf |
| SeamlessM4Tv2 | https://huggingface.co/docs/transformers/model_doc/seamless_m4t_v2 |
| Seamless: Multilingual Expressive and Streaming Speech Translation | https://ai.meta.com/research/publications/seamless-multilingual-expressive-and-streaming-speech-translation/ |
| SegFormer | https://huggingface.co/docs/transformers/model_doc/segformer |
| SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers | https://arxiv.org/abs/2105.15203 |
| SegGPT | https://huggingface.co/docs/transformers/main/model_doc/seggpt |
| SegGPT: Segmenting Everything In Context | https://arxiv.org/abs/2304.03284 |
| Segment Anything | https://huggingface.co/docs/transformers/model_doc/sam |
| Segment Anything | https://arxiv.org/pdf/2304.02643v1.pdf |
| SEW | https://huggingface.co/docs/transformers/model_doc/sew |
| Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speech Recognition | https://arxiv.org/abs/2109.06870 |
| SEW-D | https://huggingface.co/docs/transformers/model_doc/sew_d |
| Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speech Recognition | https://arxiv.org/abs/2109.06870 |
| SigLIP | https://huggingface.co/docs/transformers/model_doc/siglip |
| Sigmoid Loss for Language Image Pre-Training | https://arxiv.org/abs/2303.15343 |
| SpeechT5 | https://huggingface.co/docs/transformers/model_doc/speecht5 |
| SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing | https://arxiv.org/abs/2110.07205 |
| SpeechToTextTransformer | https://huggingface.co/docs/transformers/model_doc/speech_to_text |
| fairseq S2T: Fast Speech-to-Text Modeling with fairseq | https://arxiv.org/abs/2010.05171 |
| SpeechToTextTransformer2 | https://huggingface.co/docs/transformers/model_doc/speech_to_text_2 |
| Large-Scale Self- and Semi-Supervised Learning for Speech Translation | https://arxiv.org/abs/2104.06678 |
| Splinter | https://huggingface.co/docs/transformers/model_doc/splinter |
| Few-Shot Question Answering by Pretraining Span Selection | https://arxiv.org/abs/2101.00438 |
| SqueezeBERT | https://huggingface.co/docs/transformers/model_doc/squeezebert |
| SqueezeBERT: What can computer vision teach NLP about efficient neural networks? | https://arxiv.org/abs/2006.11316 |
| StableLm | https://huggingface.co/docs/transformers/model_doc/stablelm |
| StableLM 3B 4E1T (Technical Report) | https://stability.wandb.io/stability-llm/stable-lm/reports/StableLM-3B-4E1T--VmlldzoyMjU4?accessToken=u3zujipenkx5g7rtcj9qojjgxpconyjktjkli2po09nffrffdhhchq045vp0wyfo |
| Starcoder2 | https://huggingface.co/docs/transformers/main/model_doc/starcoder2 |
| StarCoder 2 and The Stack v2: The Next Generation | https://arxiv.org/abs/2402.19173 |
| SwiftFormer | https://huggingface.co/docs/transformers/model_doc/swiftformer |
| SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications | https://arxiv.org/abs/2303.15446 |
| Swin Transformer | https://huggingface.co/docs/transformers/model_doc/swin |
| Swin Transformer: Hierarchical Vision Transformer using Shifted Windows | https://arxiv.org/abs/2103.14030 |
| Swin Transformer V2 | https://huggingface.co/docs/transformers/model_doc/swinv2 |
| Swin Transformer V2: Scaling Up Capacity and Resolution | https://arxiv.org/abs/2111.09883 |
| Swin2SR | https://huggingface.co/docs/transformers/model_doc/swin2sr |
| Swin2SR: SwinV2 Transformer for Compressed Image Super-Resolution and Restoration | https://arxiv.org/abs/2209.11345 |
| SwitchTransformers | https://huggingface.co/docs/transformers/model_doc/switch_transformers |
| Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity | https://arxiv.org/abs/2101.03961 |
| T5 | https://huggingface.co/docs/transformers/model_doc/t5 |
| Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | https://arxiv.org/abs/1910.10683 |
| T5v1.1 | https://huggingface.co/docs/transformers/model_doc/t5v1.1 |
| google-research/text-to-text-transfer-transformer | https://github.com/google-research/text-to-text-transfer-transformer/blob/main/released_checkpoints.md#t511 |
| Table Transformer | https://huggingface.co/docs/transformers/model_doc/table-transformer |
| PubTables-1M: Towards Comprehensive Table Extraction From Unstructured Documents | https://arxiv.org/abs/2110.00061 |
| TAPAS | https://huggingface.co/docs/transformers/model_doc/tapas |
| TAPAS: Weakly Supervised Table Parsing via Pre-training | https://arxiv.org/abs/2004.02349 |
| TAPEX | https://huggingface.co/docs/transformers/model_doc/tapex |
| TAPEX: Table Pre-training via Learning a Neural SQL Executor | https://arxiv.org/abs/2107.07653 |
| Time Series Transformer | https://huggingface.co/docs/transformers/model_doc/time_series_transformer |
| TimeSformer | https://huggingface.co/docs/transformers/model_doc/timesformer |
| Is Space-Time Attention All You Need for Video Understanding? | https://arxiv.org/abs/2102.05095 |
| Trajectory Transformer | https://huggingface.co/docs/transformers/model_doc/trajectory_transformers |
| Offline Reinforcement Learning as One Big Sequence Modeling Problem | https://arxiv.org/abs/2106.02039 |
| Transformer-XL | https://huggingface.co/docs/transformers/model_doc/transfo-xl |
| Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context | https://arxiv.org/abs/1901.02860 |
| TrOCR | https://huggingface.co/docs/transformers/model_doc/trocr |
| TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models | https://arxiv.org/abs/2109.10282 |
| TVLT | https://huggingface.co/docs/transformers/model_doc/tvlt |
| TVLT: Textless Vision-Language Transformer | https://arxiv.org/abs/2209.14156 |
| TVP | https://huggingface.co/docs/transformers/model_doc/tvp |
| Text-Visual Prompting for Efficient 2D Temporal Video Grounding | https://arxiv.org/abs/2303.04995 |
| UDOP | https://huggingface.co/docs/transformers/main/model_doc/udop |
| Unifying Vision, Text, and Layout for Universal Document Processing | https://arxiv.org/abs/2212.02623 |
| UL2 | https://huggingface.co/docs/transformers/model_doc/ul2 |
| Unifying Language Learning Paradigms | https://arxiv.org/abs/2205.05131v1 |
| UMT5 | https://huggingface.co/docs/transformers/model_doc/umt5 |
| UniMax: Fairer and More Effective Language Sampling for Large-Scale Multilingual Pretraining | https://openreview.net/forum?id=kXwdL1cWOAi |
| UniSpeech | https://huggingface.co/docs/transformers/model_doc/unispeech |
| UniSpeech: Unified Speech Representation Learning with Labeled and Unlabeled Data | https://arxiv.org/abs/2101.07597 |
| UniSpeechSat | https://huggingface.co/docs/transformers/model_doc/unispeech-sat |
| UniSpeech-SAT: Universal Speech Representation Learning with Speaker Aware Pre-Training | https://arxiv.org/abs/2110.05752 |
| UnivNet | https://huggingface.co/docs/transformers/model_doc/univnet |
| UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation | https://arxiv.org/abs/2106.07889 |
| UPerNet | https://huggingface.co/docs/transformers/model_doc/upernet |
| Unified Perceptual Parsing for Scene Understanding | https://arxiv.org/abs/1807.10221 |
| VAN | https://huggingface.co/docs/transformers/model_doc/van |
| Visual Attention Network | https://arxiv.org/abs/2202.09741 |
| VideoMAE | https://huggingface.co/docs/transformers/model_doc/videomae |
| VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training | https://arxiv.org/abs/2203.12602 |
| ViLT | https://huggingface.co/docs/transformers/model_doc/vilt |
| ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision | https://arxiv.org/abs/2102.03334 |
| VipLlava | https://huggingface.co/docs/transformers/model_doc/vipllava |
| Making Large Multimodal Models Understand Arbitrary Visual Prompts | https://arxiv.org/abs/2312.00784 |
| Vision Transformer (ViT) | https://huggingface.co/docs/transformers/model_doc/vit |
| An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale | https://arxiv.org/abs/2010.11929 |
| VisualBERT | https://huggingface.co/docs/transformers/model_doc/visual_bert |
| VisualBERT: A Simple and Performant Baseline for Vision and Language | https://arxiv.org/abs/1908.03557 |
| ViT Hybrid | https://huggingface.co/docs/transformers/model_doc/vit_hybrid |
| An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale | https://arxiv.org/abs/2010.11929 |
| VitDet | https://huggingface.co/docs/transformers/model_doc/vitdet |
| Exploring Plain Vision Transformer Backbones for Object Detection | https://arxiv.org/abs/2203.16527 |
| ViTMAE | https://huggingface.co/docs/transformers/model_doc/vit_mae |
| Masked Autoencoders Are Scalable Vision Learners | https://arxiv.org/abs/2111.06377 |
| ViTMatte | https://huggingface.co/docs/transformers/model_doc/vitmatte |
| ViTMatte: Boosting Image Matting with Pretrained Plain Vision Transformers | https://arxiv.org/abs/2305.15272 |
| ViTMSN | https://huggingface.co/docs/transformers/model_doc/vit_msn |
| Masked Siamese Networks for Label-Efficient Learning | https://arxiv.org/abs/2204.07141 |
| VITS | https://huggingface.co/docs/transformers/model_doc/vits |
| Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech | https://arxiv.org/abs/2106.06103 |
| ViViT | https://huggingface.co/docs/transformers/model_doc/vivit |
| ViViT: A Video Vision Transformer | https://arxiv.org/abs/2103.15691 |
| Wav2Vec2 | https://huggingface.co/docs/transformers/model_doc/wav2vec2 |
| wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations | https://arxiv.org/abs/2006.11477 |
| Wav2Vec2-BERT | https://huggingface.co/docs/transformers/model_doc/wav2vec2-bert |
| Seamless: Multilingual Expressive and Streaming Speech Translation | https://ai.meta.com/research/publications/seamless-multilingual-expressive-and-streaming-speech-translation/ |
| Wav2Vec2-Conformer | https://huggingface.co/docs/transformers/model_doc/wav2vec2-conformer |
| FAIRSEQ S2T: Fast Speech-to-Text Modeling with FAIRSEQ | https://arxiv.org/abs/2010.05171 |
| Wav2Vec2Phoneme | https://huggingface.co/docs/transformers/model_doc/wav2vec2_phoneme |
| Simple and Effective Zero-shot Cross-lingual Phoneme Recognition | https://arxiv.org/abs/2109.11680 |
| WavLM | https://huggingface.co/docs/transformers/model_doc/wavlm |
| WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing | https://arxiv.org/abs/2110.13900 |
| Whisper | https://huggingface.co/docs/transformers/model_doc/whisper |
| Robust Speech Recognition via Large-Scale Weak Supervision | https://cdn.openai.com/papers/whisper.pdf |
| X-CLIP | https://huggingface.co/docs/transformers/model_doc/xclip |
| Expanding Language-Image Pretrained Models for General Video Recognition | https://arxiv.org/abs/2208.02816 |
| X-MOD | https://huggingface.co/docs/transformers/model_doc/xmod |
| Lifting the Curse of Multilinguality by Pre-training Modular Transformers | http://dx.doi.org/10.18653/v1/2022.naacl-main.255 |
| XGLM | https://huggingface.co/docs/transformers/model_doc/xglm |
| Few-shot Learning with Multilingual Language Models | https://arxiv.org/abs/2112.10668 |
| XLM | https://huggingface.co/docs/transformers/model_doc/xlm |
| Cross-lingual Language Model Pretraining | https://arxiv.org/abs/1901.07291 |
| XLM-ProphetNet | https://huggingface.co/docs/transformers/model_doc/xlm-prophetnet |
| ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training | https://arxiv.org/abs/2001.04063 |
| XLM-RoBERTa | https://huggingface.co/docs/transformers/model_doc/xlm-roberta |
| Unsupervised Cross-lingual Representation Learning at Scale | https://arxiv.org/abs/1911.02116 |
| XLM-RoBERTa-XL | https://huggingface.co/docs/transformers/model_doc/xlm-roberta-xl |
| Larger-Scale Transformers for Multilingual Masked Language Modeling | https://arxiv.org/abs/2105.00572 |
| XLM-V | https://huggingface.co/docs/transformers/model_doc/xlm-v |
| XLM-V: Overcoming the Vocabulary Bottleneck in Multilingual Masked Language Models | https://arxiv.org/abs/2301.10472 |
| XLNet | https://huggingface.co/docs/transformers/model_doc/xlnet |
| XLNet: Generalized Autoregressive Pretraining for Language Understanding | https://arxiv.org/abs/1906.08237 |
| XLS-R | https://huggingface.co/docs/transformers/model_doc/xls_r |
| XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale | https://arxiv.org/abs/2111.09296 |
| XLSR-Wav2Vec2 | https://huggingface.co/docs/transformers/model_doc/xlsr_wav2vec2 |
| Unsupervised Cross-Lingual Representation Learning For Speech Recognition | https://arxiv.org/abs/2006.13979 |
| YOLOS | https://huggingface.co/docs/transformers/model_doc/yolos |
| You Only Look at One Sequence: Rethinking Transformer in Vision through Object Detection | https://arxiv.org/abs/2106.00666 |
| YOSO | https://huggingface.co/docs/transformers/model_doc/yoso |
| You Only Sample (Almost) Once: Linear Cost Self-Attention Via Bernoulli Sampling | https://arxiv.org/abs/2111.09714 |
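Every architecture in the index above is exposed through the same `transformers` API, so the doc pages linked here share a common loading pattern. As a minimal sketch (assuming the `transformers` and `torch` packages are installed; `ViTConfig`/`ViTModel` correspond to the Vision Transformer (ViT) entry above), a model can be instantiated offline from its configuration class, with no checkpoint download:

```python
# Sketch: instantiating one listed model (ViT) from its config class.
# Assumes `transformers` and `torch` are installed; weights are randomly
# initialized, so no network access or checkpoint download is needed.
import torch
from transformers import ViTConfig, ViTModel

# A deliberately tiny configuration so the example runs quickly.
config = ViTConfig(
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
    image_size=32,
    patch_size=8,
)
model = ViTModel(config)

# One fake 3x32x32 image; (32/8)^2 = 16 patches plus a [CLS] token
# gives a sequence length of 17.
pixel_values = torch.randn(1, 3, 32, 32)
with torch.no_grad():
    outputs = model(pixel_values=pixel_values)

print(outputs.last_hidden_state.shape)  # torch.Size([1, 17, 32])
```

The same pattern applies to the other entries: swap `ViTConfig`/`ViTModel` for the model's own config and model classes, or use `from_pretrained` with a Hub checkpoint name to load trained weights.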
| templates | https://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/templates |
| contributing guidelines | https://patch-diff.githubusercontent.com/S4Plus/transformers/blob/main/CONTRIBUTING.md |
| this table | https://huggingface.co/docs/transformers/index#supported-frameworks |
| documentation | https://github.com/huggingface/transformers/tree/main/examples |
| Learn more | https://patch-diff.githubusercontent.com/S4Plus/transformers#learn-more |
| Documentation | https://huggingface.co/docs/transformers/ |
| Task summary | https://huggingface.co/docs/transformers/task_summary |
| Preprocessing tutorial | https://huggingface.co/docs/transformers/preprocessing |
| Training and fine-tuning | https://huggingface.co/docs/transformers/training |
| Quick tour: Fine-tuning/usage scripts | https://github.com/huggingface/transformers/tree/main/examples |
| Model sharing and uploading | https://huggingface.co/docs/transformers/model_sharing |
| Citation | https://patch-diff.githubusercontent.com/S4Plus/transformers#citation |
| paper | https://www.aclweb.org/anthology/2020.emnlp-demos.6/ |
| huggingface.co/transformers | https://huggingface.co/transformers |
| Apache-2.0 license | https://patch-diff.githubusercontent.com/S4Plus/transformers#Apache-2.0-1-ov-file |