Contents

bibtexparser: API
- bibtexparser — Parsing and writing BibTeX files
- bibtexparser.bibdatabase — The bibliographic database object
- bibtexparser.bparser — Tune the default parser
- bibtexparser.customization — Functions to customize records
  - Exception classes
- bibtexparser.bwriter — Tune the default writer
- bibtexparser.bibtexexpression — Parser’s core relying on pyparsing

bibtexparser: API

`bibtexparser` — Parsing and writing BibTeX files

BibTeX is a bibliographic data file format.

The bibtexparser module can parse BibTeX files and write them. The API is similar to the json module. The parsed data is returned as a simple BibDatabase object with the main attribute being entries representing bibliographic sources such as books and journal articles.

The following functions provide a quick and basic way to manipulate a BibTeX file. More advanced features are also available in this module.

Parsing a file is as simple as:

import bibtexparser
with open('bibtex.bib') as bibtex_file:
    bibtex_database = bibtexparser.load(bibtex_file)

And writing:

import bibtexparser
with open('bibtex.bib', 'w') as bibtex_file:
    bibtexparser.dump(bibtex_database, bibtex_file)

bibtexparser.dump(bib_database, bibtex_file, writer=None)

Dump BibDatabase object as a BibTeX text file

Parameters

bib_database (BibDatabase) – bibliographic database object
bibtex_file (file) – file to write to
writer (BibTexWriter) – custom writer to use (optional)

Example:

import bibtexparser
with open('bibtex.bib', 'w') as bibtex_file:
    bibtexparser.dump(bibtex_database, bibtex_file)

bibtexparser.dumps(bib_database, writer=None)

Dump BibDatabase object to a BibTeX string

Parameters

bib_database (BibDatabase) – bibliographic database object
writer (BibTexWriter) – custom writer to use (optional)

Returns

BibTeX string

Return type

unicode

bibtexparser.load(bibtex_file, parser=None)

Load BibDatabase object from a file

Parameters

bibtex_file (file) – input file to be parsed
parser (BibTexParser) – custom parser to use (optional)

Returns

bibliographic database object

Return type

BibDatabase

Example:

import bibtexparser
with open('bibtex.bib') as bibtex_file:
    bibtex_database = bibtexparser.load(bibtex_file)

bibtexparser.loads(bibtex_str, parser=None)

Load BibDatabase object from a string

Parameters

bibtex_str (str or unicode) – input BibTeX string to be parsed
parser (BibTexParser) – custom parser to use (optional)

Returns

bibliographic database object

Return type

BibDatabase

`bibtexparser.bibdatabase` — The bibliographic database object

class bibtexparser.bibdatabase.BibDatabase

Bibliographic database object that follows the data structure of a BibTeX file.

comments: List of BibTeX comment (@comment{…}) blocks.

entries

List of BibTeX entries, for example @book{…}, @article{…}, etc. Each entry is a simple dict with BibTeX field-value pairs, for example ‘author’: ‘Bird, R.B. and Armstrong, R.C. and Hassager, O.’ Each entry will always have the following dict keys (in addition to other BibTeX fields):

ID (BibTeX key)
ENTRYTYPE (entry type in lowercase, e.g. book, article etc.)

property entries_dict: Return a dictionary of BibTeX entries. Each dict key is the BibTeX key (ID) of the entry.

preambles: List of BibTeX preamble (@preamble{…}) blocks.

strings: OrderedDict of BibTeX string definitions (@string{…}), in order of definition.

`bibtexparser.bparser` — Tune the default parser

class bibtexparser.bparser.BibTexParser(data=None, **args)

A parser for reading BibTeX bibliographic data files.

Example:

import bibtexparser
from bibtexparser.bparser import BibTexParser

bibtex_str = ...

parser = BibTexParser()
parser.ignore_nonstandard_types = False
parser.homogenize_fields = False
parser.common_strings = False
bib_database = bibtexparser.loads(bibtex_str, parser)

Parameters

customization – function or None (default) Customization to apply to parsed entries.
ignore_nonstandard_types – bool (default True) If True ignores non-standard bibtex entry types.
homogenize_fields – bool (default False) Common field name replacements (as set in alt_dict attribute).
interpolate_strings – bool (default True) If True, replace BibTeX strings with their values; otherwise use BibDataString objects.
common_strings – bool (default False) Include common string definitions (e.g. month abbreviations) in the BibTeX file.
add_missing_from_crossref – bool (default False) Resolve BibTeX references set in the crossref field for BibTeX entries and add the fields from the referenced entry to the referencing entry.

common_strings: Load common string definitions such as month abbreviations. Default: False.

customization: Callback function to process BibTeX entries after parsing, for example to create a list from a string with multiple values. By default all BibTeX values are treated as simple strings. Default: None.

homogenize_fields: Sanitize BibTeX field names, for example change url to link. Field names are always converted to lowercase. Default: False.

ignore_nonstandard_types: Ignore non-standard BibTeX entry types, that is, types other than the standard ones (book, article, etc.). Default: True.

interpolate_strings: Replace BibTeX strings with their values (True) or keep them as BibDataString objects (False). Default: True.

parse(bibtex_str, partial=False)

Parse a BibTeX string into an object

Parameters

bibtex_str (str or unicode) – BibTeX string
partial (boolean) – If True, print errors only on parsing failures. If False, an exception is raised.

Returns

bibliographic database

Return type

BibDatabase

parse_file(file, partial=False)

Parse a BibTeX file into an object

Parameters

file (file) – BibTeX file or file-like object
partial (boolean) – If True, print errors only on parsing failures. If False, an exception is raised.

Returns

bibliographic database

Return type

BibDatabase

`bibtexparser.customization` — Functions to customize records

A set of functions useful for customizing BibTeX fields. You can use these functions as inspiration for designing your own. Each of them takes a record and returns the modified record.
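A user-defined customization follows the same contract: it receives a record dict and returns it. The sketch below is plain Python and does not use the library; normalize_title and compose are hypothetical helper names chosen for this illustration.

```python
def normalize_title(record):
    """Hypothetical customization: collapse whitespace runs in the title field."""
    if "title" in record:
        record["title"] = " ".join(record["title"].split())
    return record


def compose(*functions):
    """Hypothetical helper: chain several record -> record functions into one."""
    def customization(record):
        for function in functions:
            record = function(record)
        return record
    return customization


# Illustrative record, shaped like an entry of BibDatabase.entries.
record = {
    "ID": "bird1987",
    "ENTRYTYPE": "book",
    "title": "Dynamics  of\n polymeric liquids",
}

callback = compose(normalize_title)
print(callback(record)["title"])  # Dynamics of polymeric liquids
```

A function of this shape can be assigned to the customization attribute of a BibTexParser, which calls it on each parsed entry.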

bibtexparser.customization.add_plaintext_fields(record)

For each field in the record, add a plain_ field containing the plaintext, stripped of braces and similar markup. See https://github.com/sciunto-org/python-bibtexparser/issues/116.

Parameters: record (dict) – the record.
Returns: dict – the modified record.

bibtexparser.customization.author(record)

Split author field into a list of “Name, Surname”.

Parameters: record (dict) – the record.
Returns: dict – the modified record.

bibtexparser.customization.convert_to_unicode(record)

Convert accents from LaTeX to Unicode style.

Parameters: record (dict) – the record.
Returns: dict – the modified record.

bibtexparser.customization.doi(record)

Parameters: record (dict) – the record.
Returns: dict – the modified record.

bibtexparser.customization.editor(record)

Turn the editor field into a dict composed of the original editor name and an editor id (without comma or blank).

Parameters: record (dict) – the record.
Returns: dict – the modified record.

bibtexparser.customization.getnames(names)

Convert people's names to the form surname, firstnames or surname, initials.

Parameters: names (list) – a list of names
Returns: list – Correctly formatted names

Note

This function is known to be too simple to handle the complex naming rules properly. We would like to enhance this in forthcoming releases.

bibtexparser.customization.homogenize_latex_encoding(record)

Homogenize the LaTeX encoding style for BibTeX.

This function is experimental.

Parameters: record (dict) – the record.
Returns: dict – the modified record.

bibtexparser.customization.journal(record)

Turn the journal field into a dict composed of the original journal name and a journal id (without comma or blank).

Parameters: record (dict) – the record.
Returns: dict – the modified record.

bibtexparser.customization.keyword(record, sep=',|;')

Split keyword field into a list.

Parameters

record (dict) – the record.
sep (string, optional) – pattern used for the splitting regexp.

Returns

dict – the modified record.

bibtexparser.customization.link(record)

Parameters: record (dict) – the record.
Returns: dict – the modified record.

bibtexparser.customization.page_double_hyphen(record)

Separate pages by a double hyphen (--).

Parameters: record (dict) – the record.
Returns: dict – the modified record.

bibtexparser.customization.splitname(name, strict_mode=True)

Break a name into its constituent parts: First, von, Last, and Jr.

Parameters

name (string) – a string containing a single name
strict_mode (Boolean) – whether to use strict mode

Returns

dictionary of constituent parts

Raises

customization.InvalidName – If an invalid name is given and strict_mode = True.

In BibTeX, a name can be represented in any of three forms:

First von Last
von Last, First
von Last, Jr, First

This function attempts to split a given name into its four parts. The returned dictionary has keys of first, last, von and jr. Each value is a list of the words making up that part; this may be an empty list. If the input has no non-whitespace characters, a blank dictionary is returned.

It is capable of detecting some errors with the input name. If the strict_mode parameter is True, which is the default, this results in a customization.InvalidName exception being raised. If it is False, the function continues, working around the error as best it can. The errors that can be detected are listed below along with the handling for non-strict mode:

Name finishes with a trailing comma: delete the comma

Too many parts (e.g., von Last, Jr, First, Error): merge extra parts into First

Unterminated opening brace: add closing brace to end of input

Unmatched closing brace: add opening brace at start of word

bibtexparser.customization.type(record)

Put the type into lower case.

Parameters: record (dict) – the record.
Returns: dict – the modified record.

Exception classes

class bibtexparser.customization.InvalidName: Exception raised by customization.splitname() when an invalid name is input.

`bibtexparser.bwriter` — Tune the default writer

class bibtexparser.bwriter.BibTexWriter(write_common_strings=False)

Writer to convert a BibDatabase object to a string or file formatted as a BibTeX file.

Example:

from bibtexparser.bwriter import BibTexWriter

bib_database = ...

writer = BibTexWriter()
writer.contents = ['comments', 'entries']
writer.indent = '  '
writer.order_entries_by = ('ENTRYTYPE', 'author', 'year')
bibtex_str = bibtexparser.dumps(bib_database, writer)

add_trailing_comma: BibTeX syntax allows an optional comma after the last field of an entry. Set this to True to write that trailing comma in the bwriter output. Default: False.

comma_first: BibTeX syntax allows comma-first formatting (common in functional languages). Set this to True to write the bwriter output in comma-first style.

common_strings: Whether common strings are written.

contents: List of BibTeX elements to write. Valid values are entries, comments, preambles, strings.

display_order: Tuple of fields for display order in a single BibTeX entry. Fields not listed here will be displayed alphabetically at the end. Set to ‘[]’ for alphabetical order. Default: ‘[]’

entry_separator: Character(s) for separating BibTeX entries. Default: new line.

indent: Character(s) for indenting BibTeX field-value pairs. Default: single space.

order_entries_by: Tuple of fields for ordering BibTeX entries. Set to None to disable sorting. Default: BibTeX key (‘ID’, ).

write(bib_database)

Converts a bibliographic database to a BibTeX-formatted string.

Parameters: bib_database (BibDatabase) – bibliographic database to be converted to a BibTeX string
Returns: BibTeX-formatted string
Return type: str or unicode

`bibtexparser.bibtexexpression` — Parser’s core relying on pyparsing

class bibtexparser.bibtexexpression.BibtexExpression

Gives access to pyparsing expressions.

Attributes are pyparsing expressions for the following elements:

main_expression: the bibtex file
string_def: a string definition
preamble_decl: a preamble declaration
explicit_comment: an explicit comment
entry: an entry definition
implicit_comment: an implicit comment

exception ParseException(pstr: str, loc: int = 0, msg: Optional[str] = None, elem=None)

Exception thrown when a parse expression doesn’t match the input string

Example:

from pyparsing import ParseException, Word, nums

try:
    Word(nums).set_name("integer").parse_string("ABC")
except ParseException as pe:
    print(pe)
    print("column: {}".format(pe.column))

prints:

Expected integer (at char 0), (line:1, col:1)
column: 1

add_log_function(log_fun)

Add notice to logger on entry, comment, preamble, string definitions.

Parameters: log_fun – logger function

set_string_expression_parse_action(fun): Set the parseAction for string_expression expression.

Note

See set_string_name_parse_action.

set_string_name_parse_action(fun): Set the parseAction for string name expression.

Note

For some reason pyparsing duplicates the string_name expression so setting its parseAction a posteriori has no effect in the context of a string expression. This is why this function should be used instead.

bibtexparser.bibtexexpression.add_logger_parse_action(expr, log_func): Register a callback on expression parsing with the adequate message.

bibtexparser.bibtexexpression.field_to_pair(string_, location, token)

Looks for parsed element named ‘Field’.

Returns: (name, value).

bibtexparser.bibtexexpression.in_braces_or_pars(exp): exp -> (exp)|{exp}

bibtexparser.bibtexexpression.strip_after_new_lines(s)

Removes leading and trailing whitespaces in all but first line.

Parameters: s – string or BibDataStringExpression
