Designing styles
Rich text
Pybtex has a set of classes for working with formatted text
and producing formatted output.
A piece of formatted text in Pybtex is represented by a Text object.
A Text is basically a container that holds a list of

- plain text parts, represented by String objects,
- formatted parts, represented by Tag and HRef objects.

The basic workflow is:

1. Construct a Text object.
2. Render it as LaTeX, HTML or other markup.
>>> from pybtex.richtext import Text, Tag
>>> text = Text('How to be ', Tag('em', 'a cat'), '.')
>>> print(text.render_as('html'))
How to be <em>a cat</em>.
>>> print(text.render_as('latex'))
How to be \emph{a cat}.
Rich text classes
There are several rich text classes in Pybtex:
Text is the top level container that may contain String, Tag, and HRef objects.
When a Text object is rendered into markup,
it renders all of its child objects, then concatenates the result.
String is just a wrapper for a single Python string.
Tag and HRef are also containers that may contain other String, Tag, and HRef objects. This
makes nested formatting possible. For example, this stupidly formatted text:
Comprehensive TeX Archive Network is comprehensive.
is represented by this object tree:
>>> from pybtex.richtext import Text, Tag, HRef
>>> text = Text(
...     HRef('https://ctan.org/', Tag('em', 'Comprehensive'), ' TeX Archive Network'),
...     ' is ',
...     Tag('em', 'comprehensive'),
...     '.',
... )
>>> print(text.render_as('html'))
<a href="https://ctan.org/"><em>Comprehensive</em> TeX Archive Network</a> is <em>comprehensive</em>.
Protected represents a “protected” piece of text, something like
{braced text} in BibTeX. It is not affected by case-changing operations, like
Text.upper() or Text.lower(), and is not splittable by Text.split().
All rich text classes share the same API, which is more or less similar to that of plain Python strings.
Like Python strings, rich text objects are supposed to be immutable. Methods like
Text.append() or Text.upper() return a new Text object instead of modifying the data in place.
Attempting to modify the contents of an existing Text object is
not supported and may lead to unexpected results.
Here we document the methods of the Text class.
The other classes have the same methods.
- class pybtex.richtext.Text(*parts)
The Text class is the top level container that may contain String, Tag or HRef objects.
- __init__(*parts)
Create a text object consisting of one or more parts.
Empty parts are ignored:
>>> Text() == Text('') == Text('', '', '')
True
>>> Text('Word', '') == Text('Word')
True
Text() objects are unpacked and their children are included directly:
>>> Text(Text('Multi', ' '), Tag('em', 'part'), Text(' ', Text('text!')))
Text('Multi ', Tag('em', 'part'), ' text!')
>>> Tag('strong', Text('Multi', ' '), Tag('em', 'part'), Text(' ', 'text!'))
Tag('strong', 'Multi ', Tag('em', 'part'), ' text!')
Similar objects are merged together:
>>> Text('Multi', Tag('em', 'part'), Text(Tag('em', ' ', 'text!')))
Text('Multi', Tag('em', 'part text!'))
>>> Text('Please ', HRef('/', 'click'), HRef('/', ' here'), '.')
Text('Please ', HRef('/', 'click here'), '.')
- __eq__(other)
Rich text objects support equality comparison:
>>> Text('Cat') == Text('cat')
False
>>> Text('Cat') == Text('Cat')
True
- __len__()
len(text) returns the number of characters in the text, ignoring the markup:
>>> len(Text('Long cat'))
8
>>> len(Text(Tag('em', 'Long'), ' cat'))
8
>>> len(Text(HRef('http://example.com/', 'Long'), ' cat'))
8
- __contains__(item)
value in text returns True if any part of the text contains the substring value:
>>> 'Long cat' in Text('Long cat!')
True
Substrings split across multiple text parts are not matched:
>>> 'Long cat' in Text(Tag('em', 'Long'), 'cat!')
False
- __getitem__(key)
Slicing and extracting characters works as with regular strings; formatting is preserved.
>>> Text('Longcat is ', Tag('em', 'looooooong!'))[:15]
Text('Longcat is ', Tag('em', 'looo'))
>>> Text('Longcat is ', Tag('em', 'looooooong!'))[-1]
Text(Tag('em', '!'))
- __add__(other)
Concatenate this Text with another Text or string.
>>> Text('Longcat is ') + Tag('em', 'long')
Text('Longcat is ', Tag('em', 'long'))
- add_period(period='.')
Add a period to the end of text, if the last character is not “.”, “!” or “?”.
>>> text = Text("That's all, folks")
>>> print(str(text.add_period()))
That's all, folks.
>>> text = Text("That's all, folks!")
>>> print(str(text.add_period()))
That's all, folks!
- append(text)
Append text to the end of this text.
For Tags, HRefs, etc. the appended text is placed inside the tag.
>>> text = Tag('strong', 'Chuck Norris')
>>> print((text + ' wins!').render_as('html'))
<strong>Chuck Norris</strong> wins!
>>> print(text.append(' wins!').render_as('html'))
<strong>Chuck Norris wins!</strong>
- capfirst()
Capitalize the first letter of the text.
>>> Text(Tag('em', 'long Cat')).capfirst()
Text(Tag('em', 'Long Cat'))
- capitalize()
Capitalize the first letter of the text and lowercase the rest.
>>> Text(Tag('em', 'LONG CAT')).capitalize()
Text(Tag('em', 'Long cat'))
- endswith(suffix)
Return True if the text ends with the given suffix.
>>> Text('Longcat!').endswith('cat!')
True
Suffixes split across multiple parts are not matched:
>>> Text('Long', Tag('em', 'cat'), '!').endswith('cat!')
False
- isalpha()
Return True if all characters in the text are alphabetic and there is at least one character, False otherwise.
- join(parts)
Join a list using this text (like str.join()).
>>> from pybtex.richtext import String
>>> letters = ['a', 'b', 'c']
>>> print(str(String('-').join(letters)))
a-b-c
>>> print(str(String('-').join(iter(letters))))
a-b-c
- lower()
Convert rich text to lowercase.
>>> Text(Tag('em', 'Long cat')).lower()
Text(Tag('em', 'long cat'))
- render(backend)
Render this Text into markup.
- Parameters
backend – The formatting backend (an instance of pybtex.backends.BaseBackend).
- render_as(backend_name)
Render this Text into markup. This is a wrapper method that loads a formatting backend plugin and calls Text.render().
>>> text = Text('Longcat is ', Tag('em', 'looooooong'), '!')
>>> print(text.render_as('html'))
Longcat is <em>looooooong</em>!
>>> print(text.render_as('latex'))
Longcat is \emph{looooooong}!
>>> print(text.render_as('text'))
Longcat is looooooong!
- Parameters
backend_name – The name of the output backend (like "latex" or "html").
- split(sep=None, keep_empty_parts=None)
Split the text into a list of rich text objects, like str.split().
>>> Text('a + b').split()
[Text('a'), Text('+'), Text('b')]
>>> Text('a, b').split(', ')
[Text('a'), Text('b')]
- startswith(prefix)
Return True if the text starts with the given prefix.
>>> Text('Longcat!').startswith('Longcat')
True
Prefixes split across multiple parts are not matched:
>>> Text(Tag('em', 'Long'), 'cat!').startswith('Longcat')
False
- upper()
Convert rich text to uppercase.
>>> Text(Tag('em', 'Long cat')).upper()
Text(Tag('em', 'LONG CAT'))
- class pybtex.richtext.String(*parts)
A String is a wrapper for a plain Python string.
>>> from pybtex.richtext import String
>>> print(String('Crime & Punishment').render_as('text'))
Crime & Punishment
>>> print(String('Crime & Punishment').render_as('html'))
Crime &amp; Punishment
- class pybtex.richtext.Tag(name, *args)
A Tag represents something like an HTML tag or a LaTeX formatting command:
>>> from pybtex.richtext import Tag
>>> tag = Tag('em', 'The TeXbook')
>>> print(tag.render_as('html'))
<em>The TeXbook</em>
>>> print(tag.render_as('latex'))
\emph{The TeXbook}
- class pybtex.richtext.HRef(url, *args, external=False)
An HRef represents a hyperlink:
>>> from pybtex.richtext import HRef, String
>>> href = HRef('http://ctan.org/', 'CTAN')
>>> print(href.render_as('html'))
<a href="http://ctan.org/">CTAN</a>
>>> print(href.render_as('latex'))
\href{http://ctan.org/}{CTAN}
>>> href = HRef(String('http://ctan.org/'), String('http://ctan.org/'))
>>> print(href.render_as('latex'))
\url{http://ctan.org/}
- class pybtex.richtext.Protected(*args)
A Protected represents a “protected” piece of text.
Protected.lower(), Protected.upper(), Protected.capitalize(), and Protected.capfirst() are no-ops and just return the Protected object itself.
Protected.split() never splits the text. It always returns a one-element list containing the Protected object itself.
In LaTeX output, Protected is {surrounded by braces}. The HTML backend wraps the text in a <span class="bibtex-protected"> element, and the plain text backend outputs the text as-is.
>>> from pybtex.richtext import Protected
>>> text = Protected('The CTAN archive')
>>> text.lower()
Protected('The CTAN archive')
>>> text.split()
[Protected('The CTAN archive')]
>>> print(text.render_as('latex'))
{The CTAN archive}
>>> print(text.render_as('html'))
<span class="bibtex-protected">The CTAN archive</span>
New in version 0.20.
Style API
A formatting style in Pybtex is a class inherited from pybtex.style.formatting.BaseStyle.
- class pybtex.style.formatting.BaseStyle(label_style=None, name_style=None, sorting_style=None, abbreviate_names=False, min_crossrefs=2, **kwargs)
The base class for pythonic formatting styles.
- format_bibliography(bib_data, citations=None)
Format bibliography entries with the given keys and return a FormattedBibliography object.
- Parameters
bib_data – A pybtex.database.BibliographyData object.
citations – A list of citation keys.
Pybtex loads the style class as a plugin,
instantiates it with proper parameters and
calls the format_bibliography() method that does
the actual formatting job.
The default implementation of format_bibliography()
calls a format_<type>() method for each bibliography entry, where <type>
is the entry type, in lowercase. For example, to format
an entry of type book, the format_book() method is called.
The method must return a Text object.
Style classes are supposed to implement format_<type>() methods
for all entry types they support. If a formatting method
is not found for some entry, Pybtex complains about an unsupported entry type.
A minimal example style:
from pybtex.style.formatting import BaseStyle
from pybtex.richtext import Text, Tag

class MyStyle(BaseStyle):
    def format_article(self, entry):
        return Text('Article ', Tag('em', entry.fields['title']))
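The lookup of format_<type>() methods described above can be pictured with a small stand-alone sketch. ToyStyle and its format_entry() method are hypothetical names invented for this illustration, and the code uses plain strings instead of Text objects; it is not Pybtex's actual implementation, only the name-based dispatch idea.

```python
class ToyStyle:
    """Hypothetical stand-in for a style class; not part of Pybtex."""

    def format_entry(self, entry_type, fields):
        # Look up format_<type>() by name, using the lowercased entry type.
        method = getattr(self, 'format_' + entry_type.lower(), None)
        if method is None:
            raise ValueError('unsupported entry type: ' + entry_type)
        return method(fields)

    def format_book(self, fields):
        return 'Book: ' + fields['title']


style = ToyStyle()

# 'Book' is lowercased, so format_book() is found and called.
book_line = style.format_entry('Book', {'title': 'The TeXbook'})
print(book_line)

# No format_article() is defined, so the lookup fails.
try:
    style.format_entry('article', {'title': 'Untitled'})
except ValueError as error:
    unsupported_message = str(error)
    print(unsupported_message)
```

In this model, supporting a new entry type means adding one more format_<type>() method; nothing else has to be registered.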
Template language
Manually creating Text objects may be tedious.
Pybtex has a small template language to simplify common formatting tasks,
like joining words with spaces, adding commas and periods, or handling missing fields.
The template language is not well documented yet, so you should look at the code in the pybtex.style.template module and the existing styles.
An example formatting style using the template language:
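from pybtex.style.formatting import BaseStyle, toplevel
from pybtex.style.template import (
    field, join, names, optional, optional_field, sentence, tag, words,
)

# Shared template fragments used by format_article() below.
pages = field('pages')
date = words [optional_field('month'), field('year')]

class MyStyle(BaseStyle):
    def format_article(self, entry):
        # Test for the key; indexing a missing field would raise an error.
        if 'volume' in entry.fields:
            volume_and_pages = join [field('volume'), optional [':', pages]]
        else:
            volume_and_pages = words ['pages', optional [pages]]
        template = toplevel [
            # Author list built with the names() template node.
            sentence [names('author', sep=', ', sep2=' and ', last_sep=', and ')],
            sentence [field('title')],
            sentence [
                tag('em') [field('journal')], volume_and_pages, date],
        ]
        return template.format_data(entry)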