# Parsing
The `HTML` class parses a string of HTML and provides methods to [query](querying.md) the DOM for specific elements.
## \_\_init\_\_
Creates an `HTML` object from a `str` or `bytes`.
```python
from minestrone import HTML
html = HTML("""
The Dormouse's Story
The Dormouse's Story
""")
```
If closing tags are missing, then they will be added as needed to make the HTML valid.
```python
from minestrone import HTML
assert str(HTML("dormouse")) == "dormouse"
```
### parser
Three parsers are available in `minestrone` and they all have different trade-offs. By default, the built-in, pure Python `html.parser` is used. `lxml` can be used for faster parsing speed. `html5lib` is another option to ensure a valid HTML5 document.
```{note}
`lxml` and `html5lib` are not installed with `minestrone` by default and must be specifically installed.
- `poetry add minestrone[lxml]` or `pip install minestrone[lxml]`
- `poetry add minestrone[html5]` or `pip install minestrone[html5]`
```
```{note}
BeautifulSoup has a [summary table](https://www.crummy.com/software/BeautifulSoup/bs4/doc/#installing-a-parser) of the three parsers. There is also a more detailed [breakdown of the differences](https://www.crummy.com/software/BeautifulSoup/bs4/doc/#differences-between-parsers) between the parsers.
```
### Parser.HTML
```python
from minestrone import HTML, Parser
assert str(HTML("dormouse"), parser=Parser.HTML) == "dormouse"
```
### Parser.LXML
```python
from minestrone import HTML, Parser
assert str(HTML("dormouse"), parser=Parser.LXML) == "dormouse"
```
### Parser.HTML5
```python
from minestrone import HTML, Parser
assert str(HTML("dormouse"), parser=Parser.HTML5) == "dormouse"
```
## encoding
`Beautiful Soup` [attempts to decipher the encoding](https://www.crummy.com/software/BeautifulSoup/bs4/doc/#encodings) of the HTML string, however it isn't always correct. An encoding can be passed along if necessary.
```python
from minestrone import HTML, Parser
html_bytes = b"\xed\xe5\xec\xf9
"
assert str(HTML(html_bytes)) == "翴檛
"
assert HTML(html_bytes).encoding == "big5"
assert str(HTML(html_bytes), encoding="iso-8859-8") == "םולש
"
assert HTML(html_bytes).encoding == "iso-8859-8"
```
## prettify
Returns a prettified version of the HTML.
```python
html = HTML("""
The Dormouse's Story
The Dormouse's Story
""")
assert html.prettify() == """
The Dormouse's Story
The Dormouse's Story
"""
```
## \_\_str\_\_
Returns the `HTML` object as a string.
```python
from minestrone import HTML
html = HTML("""
The Dormouse's Story
The Dormouse's Story
""")
assert str(html) == """
The Dormouse's Story
The Dormouse's Story
"""
```
```{note}
Rendering the `HTML` into a string _will_ remove preceding spaces.
```