html2text 2025.4.15

Turn HTML into equivalent Markdown-structured text.

pip install html2text

Requires Python: >=3.9

    html2text

    html2text is a Python script that converts a page of HTML into clean, easy-to-read plain ASCII text. Better yet, that ASCII also happens to be valid Markdown (a text-to-HTML format).

    Usage: html2text [filename [encoding]]

    Option             Description
    --version          Show program's version number and exit
    -h, --help         Show this help message and exit
    --ignore-links     Don't include any formatting for links
    --escape-all       Escape all special characters. Output is less readable, but avoids corner case formatting issues.
    --reference-links  Use reference links instead of inline links to create Markdown
    --mark-code        Mark preformatted and code blocks with [code]...[/code]

    For a complete list of options, see the docs.

    Or you can use it from within Python:

    >>> import html2text
    >>>
    >>> print(html2text.html2text("<p><strong>Zed's</strong> dead baby, <em>Zed's</em> dead.</p>"))
    **Zed's** dead baby, _Zed's_ dead.
    
    

    Or with some configuration options:

    >>> import html2text
    >>>
    >>> h = html2text.HTML2Text()
    >>> # Ignore converting links from HTML
    >>> h.ignore_links = True
    >>> print(h.handle("<p>Hello, <a href='https://www.google.com/earth/'>world</a>!"))
    Hello, world!
    
    >>> # Don't ignore links anymore, I like links
    >>> h.ignore_links = False
    >>> print(h.handle("<p>Hello, <a href='https://www.google.com/earth/'>world</a>!"))
    Hello, [world](https://www.google.com/earth/)!
    
    

    Originally written by Aaron Swartz. This code is distributed under the GPLv3.

    How to install

    html2text is available on PyPI: https://pypi.org/project/html2text/

    $ pip install html2text
    

    Development

    How to run unit tests

    $ tox
    

    To see the coverage results:

    $ coverage html
    

    then open the ./htmlcov/index.html file in your browser.

    Code Quality & Pre Commit

    The CI runs several linting steps, including:

    • mypy
    • Flake8
    • Black

    To make sure the code passes the CI linting steps, run:

    $ tox -e pre-commit
    

    Documentation

    Documentation lives here