René's URL Explorer Experiment


Title: textract — textract 1.6.1 documentation


Domain: textract.readthedocs.io

readthedocs-project-slug: textract
readthedocs-version-slug: stable
readthedocs-resolver-filename: /
readthedocs-http-status: 200

Links:

textract http://textract.readthedocs.io
Command line interface http://textract.readthedocs.io/command_line_interface.html
Python package http://textract.readthedocs.io/python_package.html
Installation http://textract.readthedocs.io/installation.html
Contributing http://textract.readthedocs.io/contributing.html
Change Log http://textract.readthedocs.io/changelog.html
http://textract.readthedocs.io#textract
several packages http://textract.readthedocs.io#supporting
command line interface http://textract.readthedocs.io/command_line_interface.html#command-line-interface
python package http://textract.readthedocs.io/python_package.html#python-package
http://textract.readthedocs.io#currently-supporting
issue tracker https://github.com/deanmalmgren/textract/issues
contributing a pull request http://textract.readthedocs.io/contributing.html#contributing
antiword http://www.winfield.demon.nl/
python-docx2txt https://github.com/ankushshah89/python-docx2txt
ebooklib https://github.com/aerkalov/ebooklib
tesseract-ocr https://code.google.com/p/tesseract-ocr/
tesseract-ocr https://code.google.com/p/tesseract-ocr/
beautifulsoup4 http://beautiful-soup-4.readthedocs.org/en/latest/
sox http://sox.sourceforge.net/
SpeechRecognition https://pypi.python.org/pypi/SpeechRecognition/
pocketsphinx https://github.com/cmusphinx/pocketsphinx/
msg-extractor https://github.com/mattgwwalker/msg-extractor
sox http://sox.sourceforge.net/
SpeechRecognition https://pypi.python.org/pypi/SpeechRecognition/
pocketsphinx https://github.com/cmusphinx/pocketsphinx/
pdftotext http://poppler.freedesktop.org/
pdfminer.six https://github.com/goulu/pdfminer
tesseract-ocr https://code.google.com/p/tesseract-ocr/
python-pptx https://python-pptx.readthedocs.org/en/latest/
ps2text http://pages.cs.wisc.edu/~ghost/doc/pstotext.htm
unrtf http://www.gnu.org/software/unrtf/
tesseract-ocr https://code.google.com/p/tesseract-ocr/
SpeechRecognition https://pypi.python.org/pypi/SpeechRecognition/
pocketsphinx https://github.com/cmusphinx/pocketsphinx/
xlrd https://pypi.python.org/pypi/xlrd
xlrd https://pypi.python.org/pypi/xlrd
http://textract.readthedocs.io#related-projects
method agnostic about how content is extracted http://textract.readthedocs.io/contributing.html#contributing
Apache Tika http://tika.apache.org/
very similar, if not identical, aims as textract https://github.com/deanmalmgren/textract/issues/12
textract (node.js) https://github.com/dbashford/textract
pandoc http://johnmacfarlane.net/pandoc/
the ability to convert to plain text http://johnmacfarlane.net/pandoc/demos.html
Command line interface http://textract.readthedocs.io/command_line_interface.html
textract http://textract.readthedocs.io/command_line_interface.html#textract
Python package http://textract.readthedocs.io/python_package.html
Additional options http://textract.readthedocs.io/python_package.html#additional-options
A look under the hood http://textract.readthedocs.io/python_package.html#a-look-under-the-hood
A few specific examples http://textract.readthedocs.io/python_package.html#a-few-specific-examples
Installation http://textract.readthedocs.io/installation.html
Ubuntu / Debian http://textract.readthedocs.io/installation.html#ubuntu-debian
OSX http://textract.readthedocs.io/installation.html#osx
Don’t see your operating system installation instructions here? http://textract.readthedocs.io/installation.html#don-t-see-your-operating-system-installation-instructions-here
Contributing http://textract.readthedocs.io/contributing.html
Quick start http://textract.readthedocs.io/contributing.html#quick-start
Change Log http://textract.readthedocs.io/changelog.html
latest changes in development for next release http://textract.readthedocs.io/changelog.html#latest-changes-in-development-for-next-release
1.6.1 http://textract.readthedocs.io/changelog.html#id1
1.6.0 http://textract.readthedocs.io/changelog.html#id2
1.5.0 http://textract.readthedocs.io/changelog.html#id3
1.4.0 http://textract.readthedocs.io/changelog.html#id4
1.3.0 http://textract.readthedocs.io/changelog.html#id5
1.2.0 http://textract.readthedocs.io/changelog.html#id6
1.1.0 http://textract.readthedocs.io/changelog.html#id7
1.0.0 http://textract.readthedocs.io/changelog.html#id8
0.5.1 http://textract.readthedocs.io/changelog.html#id9
0.5.0 http://textract.readthedocs.io/changelog.html#id10
0.4.0 http://textract.readthedocs.io/changelog.html#id11
0.3.0 http://textract.readthedocs.io/changelog.html#id12
0.2.0 http://textract.readthedocs.io/changelog.html#id13
0.1.0 http://textract.readthedocs.io/changelog.html#id14
http://textract.readthedocs.io#indices-and-tables
Index http://textract.readthedocs.io/genindex.html
Module Index http://textract.readthedocs.io/py-modindex.html
Search Page http://textract.readthedocs.io/search.html
Next http://textract.readthedocs.io/command_line_interface.html
Sphinx http://sphinx-doc.org/
theme https://github.com/snide/sphinx_rtd_theme
Read the Docs https://readthedocs.org

Viewport: width=device-width, initial-scale=1.0

