When displaying large html content, we need a way to generate a summary and a “read more” or “continue” link to the full article.
This is a simple matter of truncating the html content at a defined maximum word count and adding the convenient ellipsis (…).
Full html:
<p>paragraph 1</p><p>paragraph 2</p><p>paragraph 3</p><p>paragraph 4</p>
Truncated html:
<p>paragraph 1</p><p>paragraph 2...</p>
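The truncation step itself can be sketched with nothing but the standard library. This is only a rough illustration, not the truncation code any particular project uses: the names `WordTruncator` and `truncate_html_words` are made up for this example, and it assumes simple, well-formed article markup (no scripts, no unbalanced tags).

```python
from html import escape
from html.parser import HTMLParser

# Elements that never take a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class WordTruncator(HTMLParser):
    """Copy HTML into `self.out` until `max_words` words of text were seen."""

    def __init__(self, max_words):
        super().__init__()  # default convert_charrefs=True: text arrives unescaped
        self.words_left = max_words
        self.out = []
        self.open_tags = []
        self.done = False

    def _finish(self):
        # Close every element that is still open, innermost first.
        while self.open_tags:
            self.out.append("</%s>" % self.open_tags.pop())
        self.done = True

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if self.words_left == 0:
            # Limit reached exactly at an element boundary: stop without ellipsis.
            self._finish()
            return
        self.out.append(self.get_starttag_text())
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied through unchanged.
        if not self.done:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if not self.done and self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        if self.done:
            return
        words = data.split()
        if len(words) <= self.words_left:
            self.words_left -= len(words)
            self.out.append(escape(data, quote=False))
            return
        # Keep leading whitespace so inline neighbours stay separated.
        lead = data[:len(data) - len(data.lstrip())]
        kept = " ".join(words[:self.words_left])
        self.out.append(escape(lead + kept, quote=False) + "...")
        self._finish()


def truncate_html_words(html, max_words):
    parser = WordTruncator(max_words)
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.out)


full = '<p>one two three</p><p>four five six seven</p><p>eight</p>'
print(truncate_html_words(full, 5))
```

Counting words only in text nodes keeps the tags intact, and closing the still-open elements at the cut point keeps the summary well-formed.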
The problem is that, most of the time, adding a “read more” link to <p>paragraph 1</p><p>paragraph 2...</p> puts the link outside of the html content. This results in a really frustrating line break, since <p> is a block element. In my opinion, the line break is disruptive and not aesthetically pleasing.
What we really want:
Instead of:
<p>paragraph 1</p><p>paragraph 2...</p><a href="/read-more/">read more</a>
We want:
<p>paragraph 1</p><p>paragraph 2...<a href="/read-more/">read more</a></p>
I ran into this pet peeve a while back when I was trying to add an inline “answer” link to my question & answer joke for officecheese.com. Here is the code I wrote to address this:
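def insert_into_last_element(html, element):
    """Append `element` inside the last top-level element of `html`."""
    try:
        from lxml.html import fragment_fromstring, fragments_fromstring, tostring
        from lxml.etree import ParserError
    except ImportError:
        raise Exception("Unable to find lxml")

    # Parse the snippet to insert; fall back to an empty span if it is invalid.
    try:
        item = fragment_fromstring(element)
    except (ParserError, TypeError):
        item = fragment_fromstring('<span></span>')

    # Append the snippet to the last top-level element and serialize the result.
    try:
        doc = fragments_fromstring(html)
        doc[-1].append(item)
        # encoding='unicode' makes tostring return text rather than bytes.
        return ''.join(tostring(e, encoding='unicode') for e in doc)
    except (ParserError, TypeError):
        return ''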
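To try the idea without installing lxml, the same effect can be approximated with plain string handling. This is a swapped-in technique rather than what the function above does: it assumes the summary ends with a closing tag, and the name `insert_before_final_close` is hypothetical, chosen only for this sketch.

```python
import re

# A closing tag (optionally followed by whitespace) at the very end of the input.
FINAL_CLOSING_TAG = re.compile(r"</[A-Za-z][A-Za-z0-9]*\s*>\s*$")


def insert_before_final_close(html, snippet):
    """Place `snippet` just inside the last element of `html`."""
    match = FINAL_CLOSING_TAG.search(html)
    if match is None:
        # No trailing closing tag (plain text, void element, empty input):
        # fall back to appending the snippet after the content.
        return html + snippet
    return html[:match.start()] + snippet + html[match.start():]


summary = '<p>paragraph 1</p><p>paragraph 2...</p>'
link = '<a href="/read-more/">read more</a>'
result = insert_before_final_close(summary, link)
print(result)
```

A real parser such as lxml remains the safer choice for arbitrary markup; the regular expression only looks at the end of the string and knows nothing about nesting.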
Seeing that the same need exists in Pelican, I added the functionality to my fork and submitted a pull request.
I hope this is useful to you and if anyone has suggestions for improvement, please don’t hesitate to let me know.
If you read this far, please consider connecting with me to find out when my next article is published.