Contents
- bibtexparser: Parsing and writing BibTeX files
- bibtexparser.bibdatabase: The bibliographic database object
- bibtexparser.bparser: Tune the default parser
- bibtexparser.customization: Functions to customize records
- bibtexparser.bibtexexpression: Parser’s core relying on pyparsing
bibtexparser: API
bibtexparser: Parsing and writing BibTeX files
BibTeX is a bibliographic data file format. The bibtexparser module can parse BibTeX files and write them. The API is similar to that of the json module. The parsed data is returned as a simple BibDatabase object whose main attribute, entries, represents bibliographic sources such as books and journal articles.
The following functions provide a quick and basic way to manipulate a BibTeX file. More advanced features are also available in this module.
Parsing a file is as simple as:
import bibtexparser
with open('bibtex.bib') as bibtex_file:
    bibtex_database = bibtexparser.load(bibtex_file)
And writing:
import bibtexparser
with open('bibtex.bib', 'w') as bibtex_file:
    bibtexparser.dump(bibtex_database, bibtex_file)
- bibtexparser.dump(bib_database, bibtex_file, writer=None)
Dump BibDatabase object as a BibTeX text file
- Parameters
bib_database (BibDatabase) – bibliographic database object
bibtex_file (file) – file to write to
writer (BibTexWriter) – custom writer to use (optional) (not yet implemented)
Example:
import bibtexparser
with open('bibtex.bib', 'w') as bibtex_file:
    bibtexparser.dump(bibtex_database, bibtex_file)
- bibtexparser.dumps(bib_database, writer=None)
Dump BibDatabase object to a BibTeX string
- Parameters
bib_database (BibDatabase) – bibliographic database object
writer (BibTexWriter) – custom writer to use (optional) (not yet implemented)
- Returns
BibTeX string
- Return type
unicode
- bibtexparser.load(bibtex_file, parser=None)
Load BibDatabase object from a file
- Parameters
bibtex_file (file) – input file to be parsed
parser (BibTexParser) – custom parser to use (optional)
- Returns
bibliographic database object
- Return type
BibDatabase
Example:
import bibtexparser
with open('bibtex.bib') as bibtex_file:
    bibtex_database = bibtexparser.load(bibtex_file)
- bibtexparser.loads(bibtex_str, parser=None)
Load BibDatabase object from a string
- Parameters
bibtex_str (str or unicode) – input BibTeX string to be parsed
parser (BibTexParser) – custom parser to use (optional)
- Returns
bibliographic database object
- Return type
BibDatabase
bibtexparser.bibdatabase: The bibliographic database object
- class bibtexparser.bibdatabase.BibDatabase
Bibliographic database object that follows the data structure of a BibTeX file.
- comments
List of BibTeX comment (@comment{…}) blocks.
- entries
List of BibTeX entries, for example @book{…}, @article{…}, etc. Each entry is a simple dict of BibTeX field-value pairs, for example 'author': 'Bird, R.B. and Armstrong, R.C. and Hassager, O.'. Every entry always has the following dict keys, in addition to its other BibTeX fields:
ID (BibTeX key)
ENTRYTYPE (entry type in lowercase, e.g. book, article, etc.)
- property entries_dict
Return a dictionary of BibTeX entries, keyed by the BibTeX entry key (ID).
- preambles
List of BibTeX preamble (@preamble{…}) blocks.
- strings
OrderedDict of BibTeX string definitions (@string{…}), in order of definition.
bibtexparser.bparser: Tune the default parser
- class bibtexparser.bparser.BibTexParser(data=None, **args)
A parser for reading BibTeX bibliographic data files.
Example:
import bibtexparser
from bibtexparser.bparser import BibTexParser

bibtex_str = ...
parser = BibTexParser()
parser.ignore_nonstandard_types = False
parser.homogenize_fields = False
parser.common_strings = False
bib_database = bibtexparser.loads(bibtex_str, parser)
- Parameters
customization – function or None (default None). Customization to apply to parsed entries.
ignore_nonstandard_types – bool (default True). If True, ignore non-standard BibTeX entry types.
homogenize_fields – bool (default False). Apply common field name replacements (as set in the alt_dict attribute).
interpolate_strings – bool (default True). If True, replace BibTeX strings with their values; otherwise keep them as BibDataString objects.
common_strings – bool (default False). Add common string definitions (e.g. month abbreviations) to the BibTeX file.
add_missing_from_crossref – bool (default False). Resolve the BibTeX references set in the crossref field of entries and add the fields of the referenced entry to the referencing entry.
- common_strings
Load common strings such as month abbreviations. Default: False.
- customization
Callback function to process BibTeX entries after parsing, for example to create a list from a string with multiple values. By default, all BibTeX values are treated as simple strings. Default: None.
- homogenize_fields
Sanitize BibTeX field names, for example change url to link. Field names are always converted to lowercase. Default: False.
- ignore_nonstandard_types
Ignore non-standard BibTeX entry types, i.e. anything other than the standard types (book, article, etc.). Default: True.
- interpolate_strings
Interpolate BibTeX strings or keep the structure. Default: True.
- parse(bibtex_str, partial=False)
Parse a BibTeX string into an object
- Parameters
bibtex_str (str or unicode) – BibTeX string
partial (boolean) – If True, only print errors on parsing failures. If False, raise an exception instead.
- Returns
bibliographic database
- Return type
BibDatabase
bibtexparser.customization: Functions to customize records
A set of functions useful for customizing BibTeX fields. You can take inspiration from these functions to design your own. Each of them takes a record and returns the modified record.
- bibtexparser.customization.add_plaintext_fields(record)
For each field in the record, add a plain_ field containing the plaintext, stripped of braces and similar. See https://github.com/sciunto-org/python-bibtexparser/issues/116.
- Parameters
record (dict) – the record.
- Returns
dict – the modified record.
- bibtexparser.customization.author(record)
Split the author field into a list of names, each formatted as “surname, firstnames” (see getnames).
- Parameters
record (dict) – the record.
- Returns
dict – the modified record.
- bibtexparser.customization.convert_to_unicode(record)
Convert accents from LaTeX to Unicode style.
- Parameters
record (dict) – the record.
- Returns
dict – the modified record.
- bibtexparser.customization.doi(record)
- Parameters
record (dict) – the record.
- Returns
dict – the modified record.
- bibtexparser.customization.editor(record)
Turn the editor field into a dict composed of the original editor name and an editor id (without commas or blanks).
- Parameters
record (dict) – the record.
- Returns
dict – the modified record.
- bibtexparser.customization.getnames(names)
Convert people’s names to the form surname, firstnames or surname, initials.
- Parameters
names (list) – a list of names
- Returns
list – correctly formatted names
Note
This function is known to be too simple to handle the complex rules properly. We would like to enhance this in forthcoming releases.
- bibtexparser.customization.homogenize_latex_encoding(record)
Homogenize the LaTeX encoding style for BibTeX.
This function is experimental.
- Parameters
record (dict) – the record.
- Returns
dict – the modified record.
- bibtexparser.customization.journal(record)
Turn the journal field into a dict composed of the original journal name and a journal id (without commas or blanks).
- Parameters
record (dict) – the record.
- Returns
dict – the modified record.
- bibtexparser.customization.keyword(record, sep=',|;')
Split the keyword field into a list.
- Parameters
record (dict) – the record.
sep (string, optional) – pattern used for the splitting regexp.
- Returns
dict – the modified record.
- bibtexparser.customization.link(record)
- Parameters
record (dict) – the record.
- Returns
dict – the modified record.
- bibtexparser.customization.page_double_hyphen(record)
Separate pages by a double hyphen (--).
- Parameters
record (dict) – the record.
- Returns
dict – the modified record.
- bibtexparser.customization.splitname(name, strict_mode=True)
Break a name into its constituent parts: First, von, Last, and Jr.
- Parameters
name (string) – a string containing a single name
strict_mode (Boolean) – whether to use strict mode
- Returns
dictionary of constituent parts
- Raises
customization.InvalidName – If an invalid name is given and strict_mode = True.
In BibTeX, a name can be represented in any of three forms:
First von Last
von Last, First
von Last, Jr, First
This function attempts to split a given name into its four parts. The returned dictionary has keys of first, last, von and jr. Each value is a list of the words making up that part; this may be an empty list. If the input has no non-whitespace characters, a blank dictionary is returned.
It is capable of detecting some errors with the input name. If the strict_mode parameter is True, which is the default, this results in a customization.InvalidName exception being raised. If it is False, the function continues, working around the error as best it can. The errors that can be detected are listed below along with the handling for non-strict mode:
Name finishes with a trailing comma: delete the comma
Too many parts (e.g., von Last, Jr, First, Error): merge extra parts into First
Unterminated opening brace: add closing brace to end of input
Unmatched closing brace: add opening brace at start of word
- bibtexparser.customization.type(record)
Convert the type to lowercase.
- Parameters
record (dict) – the record.
- Returns
dict – the modified record.
Exception classes
bibtexparser.bwriter: Tune the default writer
- class bibtexparser.bwriter.BibTexWriter(write_common_strings=False)
Writer to convert a BibDatabase object to a string or file formatted as a BibTeX file.
Example:
import bibtexparser
from bibtexparser.bwriter import BibTexWriter

bib_database = ...
writer = BibTexWriter()
writer.contents = ['comments', 'entries']
writer.indent = ' '
writer.order_entries_by = ('ENTRYTYPE', 'author', 'year')
bibtex_str = bibtexparser.dumps(bib_database, writer)
- add_trailing_comma
BibTeX syntax allows an optional comma after the last field of an entry. Set this to write that trailing comma in the writer output. Default: False.
- comma_first
BibTeX syntax allows comma-first syntax (common in functional languages). Set this to write the output in comma-first style.
- common_strings
Whether common strings are written.
- contents
List of BibTeX elements to write. Valid values are entries, comments, preambles, strings.
- display_order
Tuple of fields giving the display order within a single BibTeX entry. Fields not listed here are displayed alphabetically at the end. Set to '[]' for alphabetical order. Default: '[]'
- entry_separator
Character(s) for separating BibTeX entries. Default: new line.
- indent
Character(s) for indenting BibTeX field-value pairs. Default: single space.
- order_entries_by
Tuple of fields for ordering BibTeX entries. Set to None to disable sorting. Default: BibTeX key ('ID', ).
- write(bib_database)
Converts a bibliographic database to a BibTeX-formatted string.
- Parameters
bib_database (BibDatabase) – bibliographic database to be converted to a BibTeX string
- Returns
BibTeX-formatted string
- Return type
str or unicode
bibtexparser.bibtexexpression: Parser’s core relying on pyparsing
- class bibtexparser.bibtexexpression.BibtexExpression
Gives access to pyparsing expressions.
Attributes are pyparsing expressions for the following elements:
main_expression: the bibtex file
string_def: a string definition
preamble_decl: a preamble declaration
explicit_comment: an explicit comment
entry: an entry definition
implicit_comment: an implicit comment
- exception ParseException(pstr: str, loc: int = 0, msg: Optional[str] = None, elem=None)
Exception thrown when a parse expression doesn’t match the input string
Example:
from pyparsing import ParseException, Word, nums

try:
    Word(nums).set_name("integer").parse_string("ABC")
except ParseException as pe:
    print(pe)
    print("column: {}".format(pe.column))
prints:
Expected integer (at char 0), (line:1, col:1)
column: 1
- add_log_function(log_fun)
Add notice to logger on entry, comment, preamble, string definitions.
- Parameters
log_fun – logger function
- set_string_expression_parse_action(fun)
Set the parseAction for the string_expression expression.
Note
See set_string_name_parse_action.
- set_string_name_parse_action(fun)
Set the parseAction for the string name expression.
Note
For some reason pyparsing duplicates the string_name expression, so setting its parseAction a posteriori has no effect in the context of a string expression. This is why this function should be used instead.
- bibtexparser.bibtexexpression.add_logger_parse_action(expr, log_func)
Register a callback on expression parsing with the adequate message.