Reading and writing bibliography data¶

Reading bibliography data ¶

One of the most common tasks with the Pybtex API is parsing BibTeX files. The pybtex.database module provides several high-level functions for reading bibliography databases.

pybtex.database.parse_string(value, bib_format, **kwargs)¶

Parse a Unicode string containing bibliography data and return a BibliographyData object.

Parameters:

value – Unicode string.
bib_format – Data format (“bibtex”, “yaml”, etc.).

Added in version 0.19.

pybtex.database.parse_bytes(value, bib_format, **kwargs)¶

Parse a byte string containing bibliography data and return a BibliographyData object.

Parameters:

value – Byte string.
bib_format – Data format (for example, “bibtexml”).

Added in version 0.19.

pybtex.database.parse_file(file, bib_format=None, **kwargs)¶

Read bibliography data from file and return a BibliographyData object.

Parameters:

file – A file name or a file-like object.
bib_format – Data format (“bibtex”, “yaml”, etc.). If not specified, Pybtex will try to guess by the file name.

Added in version 0.19.

All three functions do essentially the same thing: each reads bibliography data from a string, a byte string, or a file and returns a BibliographyData object containing the parsed data.

Here is a quick example:

>>> from pybtex.database import parse_file
>>> bib_data = parse_file('../examples/tugboat/tugboat.bib')
>>> print(bib_data.entries['Knuth:TB8-1-14'].fields['title'])
Mixing right-to-left texts with left-to-right texts
>>> for author in bib_data.entries['Knuth:TB8-1-14'].persons['author']:
...     print(str(author))
Knuth, Donald
MacKay, Pierre

Writing bibliography data ¶

The BibliographyData class has several methods that are symmetrical to the functions described above:

BibliographyData.to_string() formats the bibliography data into a string,
BibliographyData.to_bytes() formats the bibliography data into a byte string,
BibliographyData.to_file() writes the bibliography data to a file.

>>> from pybtex.database import BibliographyData, Entry
>>> bib_data = BibliographyData({
...     'article-minimal': Entry('article', [
...         ('author', 'L[eslie] B. Lamport'),
...         ('title', 'The Gnats and Gnus Document Preparation System'),
...         ('journal', "G-Animal's Journal"),
...         ('year', '1986'),
...     ]),
... })
>>> print(bib_data.to_string('bibtex'))
@article{article-minimal,
    author = "L[eslie] B. Lamport",
    title = "The Gnats and Gnus Document Preparation System",
    journal = "G-Animal's Journal",
    year = "1986"
}

Bibliography data classes ¶

Pybtex uses several classes to represent bibliography databases:

BibliographyData is a collection of individual bibliography entries and some additional metadata.
Entry is a single bibliography entry (a book, an article, etc.).

An entry has a key (like "knuth74"), a type ("book", "article", etc.), and a number of key-value fields ("author", "title", etc.).
Person is a person related to a bibliography entry (usually as an author or an editor).

class pybtex.database.BibliographyData(entries=None, preamble=None, wanted_entries=None, min_crossrefs=2)¶

entries¶

A dictionary of bibliography entries referenced by their keys.

The dictionary is case insensitive:

>>> bib_data = parse_string("""
...     @ARTICLE{gnats,
...         author = {L[eslie] A. Aamport},
...         title = {The Gnats and Gnus Document Preparation System},
...     }
... """, 'bibtex')
>>> bib_data.entries['gnats'] == bib_data.entries['GNATS']
True

property preamble_list¶

LaTeX preamble as a list of strings.

>>> bib_data = parse_string(
...     r"""
...     @PREAMBLE{"\newcommand{\noopsort}[1]{}"}
...     @PREAMBLE{"\newcommand{\nooptilde}[1]{}"}
... """,
...     "bibtex",
... )
>>> print(bib_data.preamble_list)
['\\newcommand{\\noopsort}[1]{}', '\\newcommand{\\nooptilde}[1]{}']

Added in version 0.19: Earlier versions used get_preamble(), which is now deprecated.

property preamble¶

LaTeX preamble.

>>> bib_data = parse_string(
...     r"""
...     @PREAMBLE{"\newcommand{\noopsort}[1]{}"}
... """,
...     "bibtex",
... )
>>> print(bib_data.preamble)
\newcommand{\noopsort}[1]{}

Added in version 0.19: Earlier versions used get_preamble(), which is now deprecated.

get_preamble()¶: Deprecated since version 0.19: Use preamble instead.

to_string(bib_format, **kwargs)¶

Return the data as a Unicode string in the given format.

Parameters: bib_format – Data format (“bibtex”, “yaml”, etc.).

Added in version 0.19.

classmethod from_string(value, bib_format, **kwargs)¶

Parse a Unicode string in the given format and return the resulting BibliographyData object.

Parameters: bib_format – Data format (“bibtex”, “yaml”, etc.).

Added in version 0.22.2.

to_bytes(bib_format, **kwargs)¶

Return the data as a byte string in the given format.

Parameters: bib_format – Data format (“bibtex”, “yaml”, etc.).

Added in version 0.19.

to_file(file, bib_format=None, **kwargs)¶

Save the data to a file.

Parameters:

file – A file name or a file-like object.
bib_format – Data format (“bibtex”, “yaml”, etc.). If not specified, Pybtex will try to guess by the file name.

Added in version 0.19.

lower()¶

Return another BibliographyData with all identifiers converted to lowercase.

>>> data = parse_string(
...     """
...     @BOOK{Obrazy,
...         title = "Obrazy z Rus",
...         author = "Karel Havlíček Borovský",
...     }
...     @BOOK{Elegie,
...         title = "Tirolské elegie",
...         author = "Karel Havlíček Borovský",
...     }
... """,
...     "bibtex",
... )
>>> data_lower = data.lower()
>>> list(data_lower.entries.keys())
['obrazy', 'elegie']
>>> for entry in data_lower.entries.values():
...     entry.key
...     list(entry.persons.keys())
...     list(entry.fields.keys())
'obrazy'
['author']
['title']
'elegie'
['author']
['title']

class pybtex.database.Entry(type_, fields=None, persons=None)¶

A bibliography entry.

key = None¶: Entry key (for example, 'fukushima1980neocognitron').

type = None¶: Entry type ('book', 'article', etc.).

fields = None¶: A dictionary of entry fields. The dictionary is ordered and case-insensitive.

persons = None¶

A dictionary of entry persons, by their roles.

The most often used roles are 'author' and 'editor'.

to_string(bib_format, **kwargs)¶

Return the data as a Unicode string in the given format.

Parameters: bib_format – Data format (“bibtex”, “yaml”, etc.).

classmethod from_string(value, bib_format, entry_number=0, **kwargs)¶

Parse a Unicode string in the given format and return a single entry.

Parameters:

bib_format – Data format (“bibtex”, “yaml”, etc.).
entry_number – Index of the entry to return if the string contains more than one entry.

Added in version 0.22.2.

class pybtex.database.Person(string='', first='', middle='', prelast='', last='', lineage='')¶

A person or some other person-like entity.

>>> knuth = Person("Donald E. Knuth")
>>> knuth.first_names
['Donald']
>>> knuth.middle_names
['E.']
>>> knuth.last_names
['Knuth']

first_names = None¶: A list of first names.

Added in version 0.19: Earlier versions used first(), which is now deprecated.

middle_names = None¶: A list of middle names.

Added in version 0.19: Earlier versions used middle(), which is now deprecated.

prelast_names = None¶: A list of pre-last (aka von) name parts.

Added in version 0.19: Earlier versions used prelast(), which is now deprecated.

last_names = None¶: A list of last names.

Added in version 0.19: Earlier versions used last(), which is now deprecated.

lineage_names = None¶: A list of lineage (aka Jr) name parts.

Added in version 0.19: Earlier versions used lineage(), which is now deprecated.

property bibtex_first_names¶

A list of first and middle names together. (BibTeX treats all middle names as first.)

Added in version 0.19: Earlier versions used Person.bibtex_first(), which is now deprecated.

>>> knuth = Person("Donald E. Knuth")
>>> knuth.bibtex_first_names
['Donald', 'E.']

get_part(type, abbr=False)¶

Get a list of name parts by type.

>>> knuth = Person("Donald E. Knuth")
>>> knuth.get_part("first")
['Donald']
>>> knuth.get_part("last")
['Knuth']

property rich_first_names¶: A list of first names converted to rich text.

Added in version 0.20.

property rich_middle_names¶: A list of middle names converted to rich text.

Added in version 0.20.

property rich_prelast_names¶: A list of pre-last (aka von) name parts converted to rich text.

Added in version 0.20.

property rich_last_names¶: A list of last names converted to rich text.

Added in version 0.20.

property rich_lineage_names¶: A list of lineage (aka Jr) name parts converted to rich text.

Added in version 0.20.

first(abbr=False)¶: Deprecated since version 0.19: Use first_names instead.

middle(abbr=False)¶: Deprecated since version 0.19: Use middle_names instead.

prelast(abbr=False)¶: Deprecated since version 0.19: Use prelast_names instead.

last(abbr=False)¶: Deprecated since version 0.19: Use last_names instead.

lineage(abbr=False)¶: Deprecated since version 0.19: Use lineage_names instead.

bibtex_first()¶: Deprecated since version 0.19: Use bibtex_first_names instead.
