Formatting bibliographies
The main purpose of Pybtex is turning machine-readable bibliography data into human-readable bibliographies formatted in a specific style. Pybtex reads bibliography data that looks like this:
@book{graham1989concrete,
title = "Concrete mathematics: a foundation for computer science",
author = "Graham, Ronald Lewis and Knuth, Donald Ervin and Patashnik, Oren",
year = "1989",
publisher = "Addison-Wesley"
}
and formats it like this:
R. L. Graham, D. E. Knuth, and O. Patashnik. Concrete mathematics: a foundation for computer science. Addison-Wesley, 1989.
Pybtex contains two different formatting engines:
- The BibTeX engine uses BibTeX .bst styles.
- The Python engine uses styles written in Python.
BibTeX engine
The BibTeX engine is fully compatible with BibTeX style files and is used by default.
How it works
When you type pybtex mydocument, the following things happen:
Pybtex reads the file mydocument.aux in the current directory. This file is normally created by LaTeX and contains auxiliary information collected while processing the LaTeX document.
Pybtex is interested in three pieces of information:
- Bibliography style:
First, Pybtex searches the .aux file for a \bibstyle command that specifies which formatting style to use.
For example, \bibstyle{unsrt} instructs Pybtex to use the formatting style defined in the file unsrt.bst.
- Bibliography data:
Next, Pybtex expects to find at least one \bibdata command in the .aux file that tells it where to look for the bibliography data.
For example, \bibdata{mydocument} means “use the bibliography data from mydocument.bib”.
- Citations:
Finally, Pybtex needs to know which entries to put into the resulting bibliography. Pybtex gets the list of citation keys from \citation commands in the .aux file.
For example, \citation{graham1989concrete} means “include the entry with the key graham1989concrete in the resulting bibliography”.
A wildcard citation \citation{*} tells Pybtex to format the bibliography for all entries from all data files specified by all \bibdata commands.
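The three lookups above can be sketched with a small regular-expression scan. This is only an illustration of the idea, not Pybtex's actual .aux parser: the function name parse_aux and the sample text are assumptions made for this example, and the sketch additionally assumes that \bibdata and \citation may carry comma-separated lists.

```python
import re

# Matches the three commands Pybtex cares about and captures their argument.
AUX_COMMAND = re.compile(r"\\(bibstyle|bibdata|citation)\{([^}]*)\}")


def parse_aux(text):
    """Return (style, data_files, citations) found in .aux file contents.

    Hypothetical helper for illustration only.
    """
    style = None
    data_files = []
    citations = []
    for command, argument in AUX_COMMAND.findall(text):
        # Arguments are assumed to be comma-separated lists.
        values = [v.strip() for v in argument.split(",") if v.strip()]
        if command == "bibstyle":
            style = argument.strip()
        elif command == "bibdata":
            data_files.extend(values)
        else:  # citation
            for key in values:
                if key not in citations:  # keep first occurrence only
                    citations.append(key)
    return style, data_files, citations


sample = r"""
\relax
\citation{graham1989concrete}
\bibstyle{unsrt}
\bibdata{mydocument}
"""

style, data_files, citations = parse_aux(sample)
print(style, data_files, citations)
```

A real .aux file contains many other commands; the sketch simply ignores everything that is not one of the three commands described above.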
Pybtex executes the style program in the .bst file specified by the \bibstyle command in the .aux file. As a result, a .bbl file containing the resulting formatted bibliography is created.
A .bst style file is a program in a domain-specific stack-based language. A typical piece of .bst code looks like this:
FUNCTION {format.bvolume}
{ volume empty$
    { "" }
    { "volume" volume tie.or.space.connect
      series empty$
        'skip$
        { " of " * series emphasize * }
      if$
      "volume and number" number either.or.check
    }
  if$
}
The code in a .bst file contains the complete step-by-step instructions on how to create the formatted bibliography from the given bibliography data and citation keys. For example, a READ command tells Pybtex to read the bibliography data from all files specified by \bibdata commands in the .aux file, an ITERATE command tells Pybtex to execute a piece of code for each citation key specified by \citation commands, and so on. The built-in write$ function tells Pybtex to write the given string into the resulting .bbl file. Pybtex implements all these commands and built-in functions and simply executes the .bst program step by step.
A complete reference of the .bst language can be found in the BibTeX hacking guide by Oren Patashnik. It is available by running texdoc btxhak in most TeX distributions.
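To give a feel for what "stack-based" means here, the following toy evaluator handles just two built-ins, * (string concatenation) and write$. It is a deliberately simplified sketch and not how Pybtex's interpreter is implemented: the function name run_toy_bst and the token-list input are assumptions for this example, and every token other than the two built-ins is treated as a string literal.

```python
def run_toy_bst(tokens, output):
    """Execute a list of tokens on a stack (toy model, illustration only)."""
    stack = []
    for token in tokens:
        if token == "*":
            # Pop two strings and push their concatenation.
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif token == "write$":
            # Pop the top string and append it to the output.
            output.append(stack.pop())
        else:
            # Anything else is treated as a string literal.
            stack.append(token)
    return stack


out = []
remaining = run_toy_bst(["volume 1", " of ", "*", "Some Series", "*", "write$"], out)
print(out)
```

Operands are pushed first and the operator comes last, which is the same postfix order visible in the `" of " * series emphasize *` fragment of the .bst code above.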
Python engine
The Python engine is enabled by running pybtex with the -l python option.
Differences from the BibTeX engine
- Formatting styles are written in Python instead of the .bst language.
- Formatting styles are not tied to LaTeX and do not use hardcoded LaTeX markup. Instead, they produce format-agnostic pybtex.richtext.Text objects that can be converted to any markup format (LaTeX, Markdown, HTML, etc.).
- Name formatting, label formatting, and sorting styles are defined separately from the main style.
How it works
When you type pybtex -l python mydocument, the following things happen:
Pybtex reads the file mydocument.aux in the current directory and extracts the name of the bibliography style, the list of bibliography data files, and the list of citation keys. This step is exactly the same as with the BibTeX engine.
Pybtex reads the bibliography data from all data files specified in the .aux file into a single BibliographyData object.
Then the formatting style is loaded. The formatting style is a Python class with a format_bibliography() method. Pybtex passes the bibliography data (a BibliographyData object) and the list of citation keys to format_bibliography().
The formatting style formats each of the requested bibliography entries in a style-specific way.
When it comes to formatting names, a name formatting style is loaded and used. A name formatting style is also a Python class with a specific interface. Similarly, a label formatting style is used to format entry labels, and a sorting style is used to sort the resulting bibliography. Each formatting style has a default name style, a default label style, and a default sorting style. The defaults can be overridden with options passed to the main style class.
Each formatted entry is put into a FormattedEntry object, which is just a container for the formatted label, the formatted entry text (a pybtex.richtext.Text object), and the entry key. The label, the key, and the main text are stored separately to give the output backend more flexibility when converting the FormattedEntry object to the actual markup. For example, the HTML backend may want to format the bibliography as a definition list, while the LaTeX backend would use \bibitem[label]{key} text constructs.
Formatted entries are put into a FormattedBibliography object, which simply contains a list of FormattedEntry objects and some additional metadata.
The resulting FormattedBibliography is passed to the output backend. The default backend is LaTeX. It can be changed with the pybtex --output-backend option. The output backend converts the formatted bibliography to the specific markup format and writes it to the output file.
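The benefit of keeping the label, key, and text separate can be illustrated with a small stand-alone sketch. Nothing below is Pybtex code: SketchEntry, to_latex, and to_html are hypothetical names invented for this example, and the entry text is a plain string rather than a pybtex.richtext.Text object.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class SketchEntry:
    """Hypothetical stand-in for a formatted entry: key, label, and text."""
    key: str
    label: str
    text: str


def to_latex(entries):
    # A LaTeX-like backend uses all three parts in a \bibitem construct.
    return "\n".join(
        f"\\bibitem[{e.label}]{{{e.key}}} {e.text}" for e in entries
    )


def to_html(entries):
    # An HTML-like backend renders label and text as a definition list.
    items = "\n".join(
        f"<dt>{escape(e.label)}</dt>\n<dd>{escape(e.text)}</dd>" for e in entries
    )
    return f"<dl>\n{items}\n</dl>"


entries = [
    SketchEntry(
        key="graham1989concrete",
        label="1",
        text="R. L. Graham, D. E. Knuth, and O. Patashnik. Concrete mathematics.",
    )
]
print(to_latex(entries))
print(to_html(entries))
```

Because each backend receives the parts separately, it can decide for itself where the label goes and whether the key appears in the output at all.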
Python API
The base interface
Both the Python engine and the BibTeX engine use the same interface defined in pybtex.Engine.
pybtex.Engine has a handful of methods, but most of them are just convenience wrappers for Engine.format_from_files(), which does the actual job.
- class pybtex.Engine
- make_bibliography(aux_filename, style=None, output_encoding=None, bib_format=None, **kwargs)
Read the given .aux file and produce a formatted bibliography using format_from_files().
- Parameters
style – If not None, use this style instead of the one specified in the .aux file.
- format_from_string(bib_string, *args, **kwargs)
Parse the bibliography data from the given string and produce a formatted bibliography using format_from_files().
This is a convenience method that calls format_from_strings() with a single string.
- format_from_strings(bib_strings, *args, **kwargs)
Parse the bibliography data from the given strings and produce a formatted bibliography.
This is a convenience method that wraps each string into a StringIO, then calls format_from_files().
- format_from_file(filename, *args, **kwargs)
Read the bibliography data from the given file and produce a formatted bibliography.
This is a convenience method that calls format_from_files() with a single file. All extra arguments are passed to format_from_files().
- format_from_files(**kwargs)
Read the bibliography data from the given files and produce a formatted bibliography.
This is an abstract method overridden by both pybtex.PybtexEngine and pybtex.bibtex.BibTeXEngine.
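The delegation chain described above (string, then list of strings, then StringIO objects, then format_from_files()) can be sketched as follows. This is a simplified model written for illustration, not the pybtex.Engine source: SketchEngine and UpperCaseEngine are hypothetical classes, and the "formatting" done by the subclass is a placeholder.

```python
import io


class SketchEngine:
    """Hypothetical base class mirroring the wrapper structure described above."""

    def format_from_string(self, bib_string, *args, **kwargs):
        # A single string is just a one-element list of strings.
        return self.format_from_strings([bib_string], *args, **kwargs)

    def format_from_strings(self, bib_strings, *args, **kwargs):
        # Wrap each string in a file-like object.
        streams = [io.StringIO(s) for s in bib_strings]
        return self.format_from_files(streams, *args, **kwargs)

    def format_from_file(self, filename, *args, **kwargs):
        # A single file is just a one-element list of files.
        return self.format_from_files([filename], *args, **kwargs)

    def format_from_files(self, *args, **kwargs):
        # Concrete engines are expected to override this method.
        raise NotImplementedError


class UpperCaseEngine(SketchEngine):
    """Toy engine: 'formats' the data by upper-casing it."""

    def format_from_files(self, bib_files_or_filenames, style, **kwargs):
        parts = []
        for source in bib_files_or_filenames:
            if isinstance(source, str):
                # Treat plain strings as file names.
                with open(source, encoding="utf-8") as handle:
                    parts.append(handle.read())
            else:
                parts.append(source.read())
        return f"[{style}] " + "".join(parts).upper()


result = UpperCaseEngine().format_from_string("@book{key}", "plain")
print(result)
```

With this layout, a new engine only needs to implement format_from_files(); the string-based and single-file entry points come for free.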
The BibTeXEngine class
The BibTeX engine lives in the pybtex.bibtex module.
The public interface consists of the BibTeXEngine class and a couple of convenience functions.
- class pybtex.bibtex.BibTeXEngine
The BibTeX formatting engine.
See pybtex.Engine for inherited methods.
- format_from_files(bib_files_or_filenames, style, citations=['*'], bib_format=None, bib_encoding=None, output_encoding=None, bst_encoding=None, min_crossrefs=2, output_filename=None, add_output_suffix=False, **kwargs)
Read the bibliography data from the given files and produce a formatted bibliography.
- Parameters
bib_files_or_filenames – A list of file names or file objects.
style – The name of the formatting style.
citations – A list of citation keys.
bib_format – The name of the bibliography format. The default format is bibtex.
bib_encoding – Encoding of bibliography files.
output_encoding – Encoding that will be used by the output backend.
bst_encoding – Encoding of the .bst file.
min_crossrefs – Include cross-referenced entries after this many crossrefs. See the BibTeX manual for details.
output_filename – If None, the result will be returned as a string. Otherwise, the result will be written to the specified file.
add_output_suffix – Append a .bbl suffix to the output file name.
- pybtex.bibtex.make_bibliography(*args, **kwargs)
A convenience function that calls BibTeXEngine.make_bibliography().
- pybtex.bibtex.format_from_string(*args, **kwargs)
A convenience function that calls BibTeXEngine.format_from_string().
- pybtex.bibtex.format_from_strings(*args, **kwargs)
A convenience function that calls BibTeXEngine.format_from_strings().
- pybtex.bibtex.format_from_file(*args, **kwargs)
A convenience function that calls BibTeXEngine.format_from_file().
- pybtex.bibtex.format_from_files(*args, **kwargs)
A convenience function that calls BibTeXEngine.format_from_files().
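As a rough usage sketch, the convenience functions above might be called like this. The wrapper name format_with_bibtex_engine and the BIB_DATA constant are assumptions for this example; the keyword arguments are taken from the format_from_files() signature documented above. The sketch also assumes that Pybtex is installed and that a style file named unsrt.bst can be found on the system, which normally requires a TeX distribution.

```python
# Sample bibliography data (same entry as at the top of this page).
BIB_DATA = """
@book{graham1989concrete,
    title = "Concrete mathematics: a foundation for computer science",
    author = "Graham, Ronald Lewis and Knuth, Donald Ervin and Patashnik, Oren",
    year = "1989",
    publisher = "Addison-Wesley"
}
"""


def format_with_bibtex_engine(bib_string, style="unsrt"):
    """Hypothetical wrapper around pybtex.bibtex.format_from_string()."""
    # Imported lazily so this module loads even if Pybtex is not installed.
    import pybtex.bibtex

    # output_filename is left at its default (None), so according to the
    # parameter description above the result is returned as a string.
    return pybtex.bibtex.format_from_string(
        bib_string,
        style=style,
        citations=["graham1989concrete"],
    )
```

Calling format_with_bibtex_engine(BIB_DATA) should then return the contents that would otherwise be written to a .bbl file.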
The PybtexEngine class
The Python engine resides in the pybtex module and uses an interface similar to the BibTeX engine.
There is the PybtexEngine class and some convenience functions.
- class pybtex.PybtexEngine
The Python formatting engine.
See pybtex.Engine for inherited methods.
- format_from_files(bib_files_or_filenames, style, citations=['*'], bib_format=None, bib_encoding=None, output_backend=None, output_encoding=None, min_crossrefs=2, output_filename=None, add_output_suffix=False, **kwargs)
Read the bibliography data from the given files and produce a formatted bibliography.
- Parameters
bib_files_or_filenames – A list of file names or file objects.
style – The name of the formatting style.
citations – A list of citation keys.
bib_format – The name of the bibliography format. The default format is bibtex.
bib_encoding – Encoding of bibliography files.
output_backend – Which output backend to use. The default is latex.
output_encoding – Encoding that will be used by the output backend.
bst_encoding – Encoding of the .bst file.
min_crossrefs – Include cross-referenced entries after this many crossrefs. See the BibTeX manual for details.
output_filename – If None, the result will be returned as a string. Otherwise, the result will be written to the specified file.
add_output_suffix – Append the default suffix to the output file name (.bbl for LaTeX, .html for HTML, etc.).
- pybtex.make_bibliography(*args, **kwargs)
A convenience function that calls PybtexEngine.make_bibliography().
- pybtex.format_from_string(*args, **kwargs)
A convenience function that calls PybtexEngine.format_from_string().
- pybtex.format_from_strings(*args, **kwargs)
A convenience function that calls PybtexEngine.format_from_strings().
- pybtex.format_from_file(*args, **kwargs)
A convenience function that calls PybtexEngine.format_from_file().
- pybtex.format_from_files(*args, **kwargs)
A convenience function that calls PybtexEngine.format_from_files().
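A minimal usage sketch for the Python engine, based only on the signatures listed above, could look like the following. The wrapper name render_bibliography and the SAMPLE_BIB constant are assumptions for this example, and it presumes Pybtex is installed with the unsrt style and the html backend available.

```python
# Sample bibliography data (same entry as at the top of this page).
SAMPLE_BIB = """
@book{graham1989concrete,
    title = "Concrete mathematics: a foundation for computer science",
    author = "Graham, Ronald Lewis and Knuth, Donald Ervin and Patashnik, Oren",
    year = "1989",
    publisher = "Addison-Wesley"
}
"""


def render_bibliography(bib_string, style="unsrt", output_backend="html"):
    """Hypothetical wrapper around pybtex.format_from_string()."""
    # Imported lazily so this module loads even if Pybtex is not installed.
    import pybtex

    # The keyword arguments are forwarded to PybtexEngine.format_from_files().
    # With output_filename left as None, the result is returned as a string.
    return pybtex.format_from_string(
        bib_string,
        style=style,
        citations=["graham1989concrete"],
        output_backend=output_backend,
    )
```

Passing output_backend="latex" instead (or omitting it, since latex is the documented default) should yield LaTeX markup rather than HTML.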