prov package

Subpackages

Submodules

prov.constants module

prov.dot module

Graphical visualisation support for prov.model.

This module produces graphical visualisations of provenance graphs. It requires the pydot module and Graphviz.

prov.dot.prov_to_dot(bundle, show_nary=True, use_labels=False, direction='BT', show_element_attributes=True, show_relation_attributes=True)[source]

Convert a provenance bundle/document into a DOT graphical representation.

Parameters
  • bundle (ProvBundle) – The provenance bundle/document to be converted.

  • show_nary (bool) – shows all elements in n-ary relations.

  • use_labels (bool) – uses the prov:label property of an element as its name (instead of its identifier).

  • direction – specifies the direction of the graph. Valid values are “BT” (default), “TB”, “LR”, “RL”.

  • show_element_attributes (bool) – shows attributes of elements.

  • show_relation_attributes (bool) – shows attributes of relations.

Returns

pydot.Dot – the Dot object.

prov.graph module

prov.graph.graph_to_prov(g)[source]

Convert a MultiDiGraph that was previously produced by prov_to_graph() back to a ProvDocument.

Parameters

g – The graph instance to convert.

prov.graph.prov_to_graph(prov_document)[source]

Convert a ProvDocument to a MultiDiGraph instance of the NetworkX library.

Parameters

prov_document – The ProvDocument instance to convert.

prov.identifier module

class prov.identifier.Identifier(uri)[source]

Bases: object

Base class for all identifiers and also represents xsd:anyURI.

provn_representation()[source]

PROV-N representation of the identifier as a string.

property uri

Identifier’s URI.

class prov.identifier.Namespace(prefix: str, uri: str)[source]

Bases: object

PROV Namespace.

contains(identifier)[source]

Indicates whether the identifier provided is contained in this namespace.

Parameters

identifier – Identifier to check.

Returns

bool

property prefix

Namespace prefix.

qname(identifier)[source]

Returns the qualified name of the identifier given using the namespace prefix.

Parameters

identifier – Identifier to resolve to a qualified name.

Returns

QualifiedName

property uri

Namespace URI.

class prov.identifier.QualifiedName(namespace, localpart)[source]

Bases: prov.identifier.Identifier

Qualified name of an identifier in a particular namespace.

property localpart

Local part of qualified name.

property namespace

Namespace of qualified name.

provn_representation()[source]

PROV-N representation of the qualified name as a string.

prov.model module

Python implementation of the W3C Provenance Data Model (PROV-DM), including support for PROV-JSON import/export

References:

PROV-DM: http://www.w3.org/TR/prov-dm/

PROV-JSON: https://openprovenance.org/prov-json/

class prov.model.Literal(value, datatype=None, langtag=None)[source]

Bases: object

property datatype

has_no_langtag()[source]

property langtag

provn_representation()[source]

property value

class prov.model.NamespaceManager(namespaces=None, default=None, parent=None)[source]

Bases: dict

Manages namespaces for PROV documents and bundles.

add_namespace(namespace)[source]

Adds a namespace (if it is not already registered).

Parameters

namespace – Namespace to add.

add_namespaces(namespaces)[source]

Add multiple namespaces into this manager.

Parameters

namespaces (list of Namespace or dict of {prefix: uri}) – A collection of namespaces to add.

Returns

None

get_anonymous_identifier(local_prefix='id')[source]

Returns an anonymous identifier (without a namespace prefix).

Parameters

local_prefix – Optional local namespace prefix as a string (default: ‘id’).

Returns

Identifier

get_default_namespace()[source]

Returns the default namespace.

Returns

Namespace

get_namespace(uri)[source]

Returns the namespace registered for the given URI.

Parameters

uri – Namespace URI.

Returns

Namespace.

get_registered_namespaces()[source]

Returns all registered namespaces.

Returns

Iterable of Namespace.

parent = None

Parent NamespaceManager that this manager is a child of.

set_default_namespace(uri)[source]

Sets the default namespace to the one of a given URI.

Parameters

uri – Namespace URI.

valid_qualified_name(qname)[source]

Resolves an identifier to a valid qualified name.

Parameters

qname – Qualified name as QualifiedName or a tuple (namespace, identifier).

Returns

QualifiedName or None in case of failure.

class prov.model.ProvActivity(bundle, identifier, attributes=None)[source]

Bases: prov.model.ProvElement

Provenance Activity element.

FORMAL_ATTRIBUTES = (<QualifiedName: prov:startTime>, <QualifiedName: prov:endTime>)

get_endTime()[source]

Returns the time the activity ended.

Returns

datetime.datetime

get_startTime()[source]

Returns the time the activity started.

Returns

datetime.datetime

set_time(startTime=None, endTime=None)[source]

Sets the time this activity took place.

Parameters
  • startTime – Start time for the activity. Either a datetime.datetime object or a string that can be parsed by dateutil.parser.parse().

  • endTime – End time for the activity. Either a datetime.datetime object or a string that can be parsed by dateutil.parser.parse().

used(entity, time=None, attributes=None)[source]

Creates a new usage record for this activity.

Parameters
  • entity – Entity or string identifier of the entity involved in the usage relationship.

  • time – Optional time for the usage (default: None). Either a datetime.datetime object or a string that can be parsed by dateutil.parser.parse().

  • attributes – Optional additional attributes to add to the record, as a dictionary or a list of tuples (default: None).

wasAssociatedWith(agent, plan=None, attributes=None)[source]

Creates a new association record for this activity.

Parameters
  • agent – Agent or string identifier of the agent involved in the association.

  • plan – Optional entity acting as the plan, to state a qualified association (default: None).

  • attributes – Optional additional attributes to add to the record, as a dictionary or a list of tuples (default: None).

wasEndedBy(trigger, ender=None, time=None, attributes=None)[source]

Creates a new end record for this activity.

Parameters
  • trigger – Entity triggering the end of this activity.

  • ender – Optional activity that generated the trigger entity, to state a qualified end (default: None).

  • time – Optional time for the end (default: None). Either a datetime.datetime object or a string that can be parsed by dateutil.parser.parse().

  • attributes – Optional additional attributes to add to the record, as a dictionary or a list of tuples (default: None).

wasInformedBy(informant, attributes=None)[source]

Creates a new communication record for this activity.

Parameters
  • informant – The informing activity (relationship source).

  • attributes – Optional additional attributes to add to the record, as a dictionary or a list of tuples (default: None).

wasStartedBy(trigger, starter=None, time=None, attributes=None)[source]

Creates a new start record for this activity. The activity did not exist before the start by the trigger.

Parameters
  • trigger – Entity triggering the start of this activity.

  • starter – Optional activity that generated the trigger entity, to state a qualified start (default: None).

  • time – Optional time for the start (default: None). Either a datetime.datetime object or a string that can be parsed by dateutil.parser.parse().

  • attributes – Optional additional attributes to add to the record, as a dictionary or a list of tuples (default: None).

class prov.model.ProvAgent(bundle, identifier, attributes=None)[source]

Bases: prov.model.ProvElement

Provenance Agent element.

actedOnBehalfOf(responsible, activity=None, attributes=None)[source]

Creates a new delegation record on behalf of this agent.

Parameters
  • responsible – Agent the responsibility is delegated to.

  • activity – Optional activity for which the delegation holds, to state a qualified delegation (default: None).

  • attributes – Optional additional attributes to add to the record, as a dictionary or a list of tuples (default: None).

class prov.model.ProvAlternate(bundle, identifier, attributes=None)[source]

Bases: prov.model.ProvRelation

Provenance Alternate relationship.

FORMAL_ATTRIBUTES = (<QualifiedName: prov:alternate1>, <QualifiedName: prov:alternate2>)

class prov.model.ProvAssociation(bundle, identifier, attributes=None)[source]

Bases: prov.model.ProvRelation

Provenance Association relationship.

FORMAL_ATTRIBUTES = (<QualifiedName: prov:activity>, <QualifiedName: prov:agent>, <QualifiedName: prov:plan>)

class prov.model.ProvAttribution(bundle, identifier, attributes=None)[source]

Bases: prov.model.ProvRelation

Provenance Attribution relationship.

FORMAL_ATTRIBUTES = (<QualifiedName: prov:entity>, <QualifiedName: prov:agent>)

class prov.model.ProvBundle(records=None, identifier=None, namespaces=None, document=None)[source]

Bases: object

PROV Bundle

actedOnBehalfOf(delegate, responsible, activity=None, identifier=None, other_attributes=None)

Creates a new delegation record on behalf of an agent.

Parameters
  • delegate – Agent delegating the responsibility (relationship source).

  • responsible – Agent the responsibility is delegated to (relationship destination).

  • activity – Optional activity for which the delegation holds, to state a qualified delegation (default: None).

  • identifier – Identifier for new delegation record.

  • other_attributes – Optional additional attributes to add to the record, as a dictionary or a list of tuples (default: None).

activity(identifier, startTime=None, endTime=None, other_attributes=None)[source]

Creates a new activity.

Parameters
  • identifier – Identifier for new activity.

  • startTime – Optional start time for the activity (default: None). Either a datetime.datetime object or a string that can be parsed by dateutil.parser.parse().

  • endTime – Optional end time for the activity (default: None). Either a datetime.datetime object or a string that can be parsed by dateutil.parser.parse().

  • other_attributes – Optional additional attributes to add to the record, as a dictionary or a list of tuples (default: None).

add_namespace(namespace_or_prefix, uri=None)[source]

Adds a namespace (if it is not already registered).

Parameters
  • namespace_or_prefix – Namespace or its prefix as a string to add.

  • uri – Namespace URI (default: None). Must be present if only a prefix is given in the previous parameter.

add_record(record)[source]

Adds a new record to the bundle.

Parameters

record – ProvRecord to be added.

agent(identifier, other_attributes=None)[source]

Creates a new agent.

Parameters
  • identifier – Identifier for new agent.

  • other_attributes – Optional additional attributes to add to the record, as a dictionary or a list of tuples (default: None).

alternate(alternate1, alternate2)[source]

Creates a new alternate record between two entities.

Parameters
  • alternate1 – Entity or a string identifier for the first entity (relationship source).

  • alternate2 – Entity or a string identifier for the second entity (relationship destination).

alternateOf(alternate1, alternate2)

Creates a new alternate record between two entities.

Parameters
  • alternate1 – Entity or a string identifier for the first entity (relationship source).

  • alternate2 – Entity or a string identifier for the second entity (relationship destination).

association(activity, agent=None, plan=None, identifier=None, other_attributes=None)[source]

Creates a new association record for an activity.

Parameters
  • activity – Activity or a string identifier for the activity.

  • agent – Agent or string identifier of the agent involved in the association (default: None).

  • plan – Optional entity acting as the plan, to state a qualified association (default: None).

  • identifier – Identifier for new association record.

  • other_attributes – Optional additional attributes to add to the record, as a dictionary or a list of tuples (default: None).

attribution(entity, agent, identifier=None, other_attributes=None)[source]

Creates a new attribution record between an entity and an agent.

Parameters
  • entity – Entity or a string identifier for the entity (relationship source).

  • agent – Agent or string identifier of the agent involved in the attribution (relationship destination).

  • identifier – Identifier for new attribution record.

  • other_attributes – Optional additional attributes to add to the record, as a dictionary or a list of tuples (default: None).

property bundles

Returns bundles contained in the document

Returns

Iterable of ProvBundle.

collection(identifier, other_attributes=None)[source]

Creates a new collection record for a particular record.

Parameters
  • identifier – Identifier for new collection record.

  • other_attributes – Optional additional attributes to add to the record, as a dictionary or a list of tuples (default: None).

communication(informed, informant, identifier=None, other_attributes=None)[source]

Creates a new communication record between two activities.

Parameters
  • informed – The informed activity (relationship destination).

  • informant – The informing activity (relationship source).

  • identifier – Identifier for new communication record.

  • other_attributes – Optional additional attributes to add to the record, as a dictionary or a list of tuples (default: None).

property default_ns_uri

Returns the default namespace’s URI, if any.

Returns

URI as string.

delegation(delegate, responsible, activity=None, identifier=None, other_attributes=None)[source]

Creates a new delegation record on behalf of an agent.

Parameters
  • delegate – Agent delegating the responsibility (relationship source).

  • responsible – Agent the responsibility is delegated to (relationship destination).

  • activity – Optional activity for which the delegation holds, to state a qualified delegation (default: None).

  • identifier – Identifier for new delegation record.

  • other_attributes – Optional additional attributes to add to the record, as a dictionary or a list of tuples (default: None).

derivation(generatedEntity, usedEntity, activity=None, generation=None, usage=None, identifier=None, other_attributes=None)[source]

Creates a new derivation record for a generated entity from a used entity.

Parameters
  • generatedEntity – Entity or a string identifier for the generated entity (relationship source).

  • usedEntity – Entity or a string identifier for the used entity (relationship destination).

  • activity – Activity or string identifier of the activity involved in the derivation (default: None).

  • generation – Optional generation record, or its identifier, linking the generated entity to the activity, to state a qualified derivation (default: None).

  • usage – Optional usage record, or its identifier, linking the activity to the used entity, to state a qualified derivation (default: None).

  • identifier – Identifier for new derivation record.

  • other_attributes – Optional additional attributes to add to the record, as a dictionary or a list of tuples (default: None).

property document

Returns the parent document, if any.

Returns

ProvDocument.

end(activity, trigger=None, ender=None, time=None, identifier=None, other_attributes=None)[source]

Creates a new end record for an activity.

Parameters
  • activity – Activity or a string identifier for the activity.

  • trigger – Entity triggering the end of this activity (default: None).

  • ender – Optional activity that generated the trigger entity, to state a qualified end (default: None).

  • time – Optional time for the end (default: None). Either a datetime.datetime object or a string that can be parsed by dateutil.parser.parse().

  • identifier – Identifier for new end record.

  • other_attributes – Optional additional attributes to add to the record, as a dictionary or a list of tuples (default: None).

entity(identifier, other_attributes=None)[source]

Creates a new entity.

Parameters
  • identifier – Identifier for new entity.

  • other_attributes – Optional additional attributes to add to the record, as a dictionary or a list of tuples (default: None).

generation(entity, activity=None, time=None, identifier=None, other_attributes=None)[source]

Creates a new generation record for an entity.

Parameters
  • entity – Entity or a string identifier for the entity.

  • activity – Activity or string identifier of the activity involved in the generation (default: None).

  • time – Optional time for the generation (default: None). Either a datetime.datetime object or a string that can be parsed by dateutil.parser.parse().

  • identifier – Identifier for new generation record.

  • other_attributes – Optional additional attributes to add to the record, as a dictionary or a list of tuples (default: None).

get_default_namespace()[source]

Returns the default namespace.

Returns

Namespace

get_provn(_indent_level=0)[source]

Returns the PROV-N representation of the bundle.

Returns

String

get_record(identifier)[source]

Returns a specific record matching a given identifier.

Parameters

identifier – Record identifier.

Returns

ProvRecord

get_records(class_or_type_or_tuple=None)[source]

Returns all records. Returned records may be filtered by the optional argument.

Parameters

class_or_type_or_tuple – A filter on the type for which records are to be returned (default: None). The filter checks by the type of the record using the isinstance check on the record.

Returns

List of ProvRecord objects.

get_registered_namespaces()[source]

Returns all registered namespaces.

Returns

Iterable of Namespace.

hadMember(collection, entity)

Creates a new membership record for an entity to a collection.

Parameters
  • collection – Collection the entity is to be added to.

  • entity – Entity to be added to the collection.

hadPrimarySource(generatedEntity, usedEntity, activity=None, generation=None, usage=None, identifier=None, other_attributes=None)

Creates a new primary source record for a generated entity from a used entity.

Parameters
  • generatedEntity – Entity or a string identifier for the generated entity (relationship source).

  • usedEntity – Entity or a string identifier for the used entity (relationship destination).

  • activity – Activity or string identifier of the activity involved in the primary source (default: None).

  • generation – Optional generation record, or its identifier, linking the generated entity to the activity, to state a qualified primary source (default: None).

  • usage – Optional usage record, or its identifier, linking the activity to the used entity, to state a qualified primary source (default: None).

  • identifier – Identifier for new primary source record.

  • other_attributes – Optional additional attributes to add to the record, as a dictionary or a list of tuples (default: None).

has_bundles()[source]

True if the object has at least one bundle, False otherwise.

Returns

bool

property identifier

Returns the bundle’s identifier

influence(influencee, influencer, identifier=None, other_attributes=None)[source]

Creates a new influence record between two entities, activities or agents.

Parameters
  • influencee – Influenced entity, activity or agent (relationship source).

  • influencer – Influencing entity, activity or agent (relationship destination).

  • identifier – Identifier for new influence record.

  • other_attributes – Optional additional attributes to add to the record, as a dictionary or a list of tuples (default: None).

invalidation(entity, activity=None, time=None, identifier=None, other_attributes=None)[source]

Creates a new invalidation record for an entity.

Parameters
  • entity – Entity or a string identifier for the entity.

  • activity – Activity or string identifier of the activity involved in the invalidation (default: None).

  • time – Optional time for the invalidation (default: None). Either a datetime.datetime object or a string that can be parsed by dateutil.parser.parse().

  • identifier – Identifier for new invalidation record.

  • other_attributes – Optional additional attributes to add to the record, as a dictionary or a list of tuples (default: None).

is_bundle()[source]

True if the object is a bundle, False otherwise.

Returns

bool

is_document()[source]

True if the object is a document, False otherwise.

Returns

bool

membership(collection, entity)[source]

Creates a new membership record for an entity to a collection.

Parameters
  • collection – Collection the entity is to be added to.

  • entity – Entity to be added to the collection.

mention(specificEntity, generalEntity, bundle)[source]

Creates a new mention record for a specific entity from a general entity.

Parameters
  • specificEntity – Entity or a string identifier for the specific entity (relationship source).

  • generalEntity – Entity or a string identifier for the general entity (relationship destination).

  • bundle – Bundle, or its identifier, in which the general entity is described.

mentionOf(specificEntity, generalEntity, bundle)

Creates a new mention record for a specific entity from a general entity.

Parameters
  • specificEntity – Entity or a string identifier for the specific entity (relationship source).

  • generalEntity – Entity or a string identifier for the general entity (relationship destination).

  • bundle – Bundle, or its identifier, in which the general entity is described.

property namespaces

Returns the set of registered namespaces.

Returns

Set of Namespace.

new_record(record_type, identifier, attributes=None, other_attributes=None)[source]

Creates a new record.

Parameters
  • record_type – Type of record (one of PROV_REC_CLS).

  • identifier – Identifier for new record.

  • attributes – Optional attributes to add to the record, as a dictionary or a list of tuples (default: None).

  • other_attributes – Optional additional attributes to add to the record, as a dictionary or a list of tuples (default: None).

plot(filename=None, show_nary=True, use_labels=False, show_element_attributes=True, show_relation_attributes=True)[source]

Convenience function to plot a PROV document.

Parameters
  • filename (String) – The filename to save to. If not given, it will open an interactive matplotlib plot. The filetype is determined from the filename ending.

  • show_nary (bool) – Shows all elements in n-ary relations.

  • use_labels (bool) – Uses the prov:label property of an element as its name (instead of its identifier).

  • show_element_attributes (bool) – Shows attributes of elements.

  • show_relation_attributes (bool) – Shows attributes of relations.

primary_source(generatedEntity, usedEntity, activity=None, generation=None, usage=None, identifier=None, other_attributes=None)[source]

Creates a new primary source record for a generated entity from a used entity.

Parameters
  • generatedEntity – Entity or a string identifier for the generated entity (relationship source).

  • usedEntity – Entity or a string identifier for the used entity (relationship destination).

  • activity – Activity or string identifier of the activity involved in the primary source (default: None).

  • generation – Optional generation record, or its identifier, linking the generated entity to the activity, to state a qualified primary source (default: None).

  • usage – Optional usage record, or its identifier, linking the activity to the used entity, to state a qualified primary source (default: None).

  • identifier – Identifier for new primary source record.

  • other_attributes – Optional additional attributes to add to the record, as a dictionary or a list of tuples (default: None).

quotation(generatedEntity, usedEntity, activity=None, generation=None, usage=None, identifier=None, other_attributes=None)[source]

Creates a new quotation record for a generated entity from a used entity.

Parameters
  • generatedEntity – Entity or a string identifier for the generated entity (relationship source).

  • usedEntity – Entity or a string identifier for the used entity (relationship destination).

  • activity – Activity or string identifier of the activity involved in the quotation (default: None).

  • generation – Optional generation record, or its identifier, linking the generated entity to the activity, to state a qualified quotation (default: None).

  • usage – Optional usage record, or its identifier, linking the activity to the used entity, to state a qualified quotation (default: None).

  • identifier – Identifier for new quotation record.

  • other_attributes – Optional additional attributes to add to the record, as a dictionary or a list of tuples (default: None).

property records

Returns the list of all records in the current bundle

revision(generatedEntity, usedEntity, activity=None, generation=None, usage=None, identifier=None, other_attributes=None)[source]

Creates a new revision record for a generated entity from a used entity.

Parameters
  • generatedEntity – Entity or a string identifier for the generated entity (relationship source).

  • usedEntity – Entity or a string identifier for the used entity (relationship destination).

  • activity – Activity or string identifier of the activity involved in the revision (default: None).

  • generation – Optional generation record, or its identifier, linking the generated entity to the activity, to state a qualified revision (default: None).

  • usage – Optional usage record, or its identifier, linking the activity to the used entity, to state a qualified revision (default: None).

  • identifier – Identifier for new revision record.

  • other_attributes – Optional additional attributes to add to the record, as a dictionary or a list of tuples (default: None).

set_default_namespace(uri)[source]

Sets the default namespace through a given URI.

Parameters

uri – Namespace URI.

specialization(specificEntity, generalEntity)[source]

Creates a new specialisation record for a specific entity from a general entity.

Parameters
  • specificEntity – Entity or a string identifier for the specific entity (relationship source).

  • generalEntity – Entity or a string identifier for the general entity (relationship destination).

specializationOf(specificEntity, generalEntity)

Creates a new specialisation record for a specific entity from a general entity.

Parameters
  • specificEntity – Entity or a string identifier for the specific entity (relationship source).

  • generalEntity – Entity or a string identifier for the general entity (relationship destination).

start(activity, trigger=None, starter=None, time=None, identifier=None, other_attributes=None)[source]

Creates a new start record for an activity.

Parameters
  • activity – Activity or a string identifier for the activity.

  • trigger – Entity triggering the start of this activity (default: None).

  • starter – Optional activity that generated the trigger entity, to state a qualified start (default: None).

  • time – Optional time for the start (default: None). Either a datetime.datetime object or a string that can be parsed by dateutil.parser.parse().

  • identifier – Identifier for new start record.

  • other_attributes – Optional additional attributes to add to the record, as a dictionary or a list of tuples (default: None).

unified()[source]

Unifies all records in the bundle that have the same identifier.

Returns

ProvBundle – the new unified bundle.

update(other)[source]

Append all the records of the other ProvBundle into this bundle.

Parameters

other (ProvBundle) – the other bundle whose records are to be appended.

Returns

None.

usage(activity, entity=None, time=None, identifier=None, other_attributes=None)[source]

Creates a new usage record for an activity.

Parameters
  • activity – Activity or a string identifier for the activity.

  • entity – Entity or string identifier of the entity involved in the usage relationship (default: None).

  • time – Optional time for the usage (default: None). Either a datetime.datetime object or a string that can be parsed by dateutil.parser.parse().

  • identifier – Identifier for new usage record.

  • other_attributes – Optional additional attributes to add to the record, as a dictionary or a list of tuples (default: None).

used(activity, entity=None, time=None, identifier=None, other_attributes=None)

Creates a new usage record for an activity.

Parameters
  • activity – Activity or a string identifier for the activity.

  • entity – Entity or string identifier of the entity involved in the usage relationship (default: None).

  • time – Optional time for the usage (default: None). Either a datetime.datetime object or a string that can be parsed by dateutil.parser.parse().

  • identifier – Identifier for new usage record.

  • other_attributes – Optional additional attributes to add to the record, as a dictionary or a list of tuples (default: None).

valid_qualified_name(identifier)[source]

wasAssociatedWith(activity, agent=None, plan=None, identifier=None, other_attributes=None)

Creates a new association record for an activity.

Parameters
  • activity – Activity or a string identifier for the activity.

  • agent – Agent or string identifier of the agent involved in the association (default: None).

  • plan – Optional entity acting as the plan, to state a qualified association (default: None).

  • identifier – Identifier for new association record.

  • other_attributes – Optional additional attributes to add to the record, as a dictionary or a list of tuples (default: None).

wasAttributedTo(entity, agent, identifier=None, other_attributes=None)

Creates a new attribution record between an entity and an agent.

Parameters
  • entity – Entity or a string identifier for the entity (relationship source).

  • agent – Agent or string identifier of the agent involved in the attribution (relationship destination).

  • identifier – Identifier for new attribution record.

  • other_attributes – Optional additional attributes to add to the record, as a dictionary or a list of tuples (default: None).

wasDerivedFrom(generatedEntity, usedEntity, activity=None, generation=None, usage=None, identifier=None, other_attributes=None)

Creates a new derivation record for a generated entity from a used entity.

Parameters
  • generatedEntity – Entity or a string identifier for the generated entity (relationship source).

  • usedEntity – Entity or a string identifier for the used entity (relationship destination).

  • activity – Activity or string identifier of the activity involved in the derivation (default: None).

  • generation – Optional generation record, or its identifier, linking the generated entity to the activity, to state a qualified derivation (default: None).

  • usage – Optional usage record, or its identifier, linking the activity to the used entity, to state a qualified derivation (default: None).

  • identifier – Identifier for new derivation record.

  • other_attributes – Optional additional attributes to add to the record, as a dictionary or a list of tuples (default: None).

wasEndedBy(activity, trigger=None, ender=None, time=None, identifier=None, other_attributes=None)

Creates a new end record for an activity.

Parameters
  • activity – Activity or a string identifier for the activity.

  • trigger – Entity triggering the end of this activity (default: None).

  • ender – Optional activity that generated the trigger entity, to state a qualified end (default: None).

  • time – Optional time for the end (default: None). Either a datetime.datetime object or a string that can be parsed by dateutil.parser.parse().

  • identifier – Identifier for new end record.

  • other_attributes – Optional additional attributes to add to the record, as a dictionary or a list of tuples (default: None).

wasGeneratedBy(entity, activity=None, time=None, identifier=None, other_attributes=None)

Creates a new generation record for an entity.

Parameters
  • entity – Entity or a string identifier for the entity.

  • activity – Activity or string identifier of the activity involved in the generation (default: None).

  • time – Optional time for the generation (default: None). Either a datetime.datetime object or a string that can be parsed by dateutil.parser.parse().

  • identifier – Identifier for new generation record.

  • other_attributes – Optional additional attributes to add to the record, as a dictionary or a list of tuples (default: None).

wasInfluencedBy(influencee, influencer, identifier=None, other_attributes=None)

Creates a new influence record between two entities, activities or agents.

Parameters
  • influencee – Influenced entity, activity or agent (relationship source).

  • influencer – Influencing entity, activity or agent (relationship destination).

  • identifier – Identifier for new influence record.

  • other_attributes – Optional additional attributes, as a dictionary or a list of tuples, to be added to the record (default: None).

wasInformedBy(informed, informant, identifier=None, other_attributes=None)

Creates a new communication record between two activities.

Parameters
  • informed – The informed activity (relationship destination).

  • informant – The informing activity (relationship source).

  • identifier – Identifier for new communication record.

  • other_attributes – Optional additional attributes, as a dictionary or a list of tuples, to be added to the record (default: None).

wasInvalidatedBy(entity, activity=None, time=None, identifier=None, other_attributes=None)

Creates a new invalidation record for an entity.

Parameters
  • entity – Entity or a string identifier for the entity.

  • activity – Activity or string identifier of the activity involved in the invalidation (default: None).

  • time – Optional time for the invalidation (default: None). Either a datetime.datetime object or a string that can be parsed by dateutil.parser.parse().

  • identifier – Identifier for new invalidation record.

  • other_attributes – Optional additional attributes, as a dictionary or a list of tuples, to be added to the record (default: None).

wasQuotedFrom(generatedEntity, usedEntity, activity=None, generation=None, usage=None, identifier=None, other_attributes=None)

Creates a new quotation record for a generated entity from a used entity.

Parameters
  • generatedEntity – Entity or a string identifier for the generated entity (relationship source).

  • usedEntity – Entity or a string identifier for the used entity (relationship destination).

  • activity – Activity or string identifier of the activity involved in the quotation (default: None).

  • generation – Optional generation stating a qualified quotation through an internal generation (default: None).

  • usage – Optional usage stating a qualified quotation through an internal usage (default: None).

  • identifier – Identifier for new quotation record.

  • other_attributes – Optional additional attributes, as a dictionary or a list of tuples, to be added to the record (default: None).

wasRevisionOf(generatedEntity, usedEntity, activity=None, generation=None, usage=None, identifier=None, other_attributes=None)

Creates a new revision record for a generated entity from a used entity.

Parameters
  • generatedEntity – Entity or a string identifier for the generated entity (relationship source).

  • usedEntity – Entity or a string identifier for the used entity (relationship destination).

  • activity – Activity or string identifier of the activity involved in the revision (default: None).

  • generation – Optional generation stating a qualified revision through an internal generation (default: None).

  • usage – Optional usage stating a qualified revision through an internal usage (default: None).

  • identifier – Identifier for new revision record.

  • other_attributes – Optional additional attributes, as a dictionary or a list of tuples, to be added to the record (default: None).

wasStartedBy(activity, trigger=None, starter=None, time=None, identifier=None, other_attributes=None)

Creates a new start record for an activity.

Parameters
  • activity – Activity or a string identifier for the activity.

  • trigger – Entity triggering the start of this activity (default: None).

  • starter – Optional extra activity to state a qualified start, through which the trigger entity for the start is generated (default: None).

  • time – Optional time for the start (default: None). Either a datetime.datetime object or a string that can be parsed by dateutil.parser.parse().

  • identifier – Identifier for new start record.

  • other_attributes – Optional additional attributes, as a dictionary or a list of tuples, to be added to the record (default: None).

class prov.model.ProvCommunication(bundle, identifier, attributes=None)[source]

Bases: prov.model.ProvRelation

Provenance Communication relationship.

FORMAL_ATTRIBUTES = (<QualifiedName: prov:informed>, <QualifiedName: prov:informant>)

class prov.model.ProvDelegation(bundle, identifier, attributes=None)[source]

Bases: prov.model.ProvRelation

Provenance Delegation relationship.

FORMAL_ATTRIBUTES = (<QualifiedName: prov:delegate>, <QualifiedName: prov:responsible>, <QualifiedName: prov:activity>)

class prov.model.ProvDerivation(bundle, identifier, attributes=None)[source]

Bases: prov.model.ProvRelation

Provenance Derivation relationship.

FORMAL_ATTRIBUTES = (<QualifiedName: prov:generatedEntity>, <QualifiedName: prov:usedEntity>, <QualifiedName: prov:activity>, <QualifiedName: prov:generation>, <QualifiedName: prov:usage>)

class prov.model.ProvDocument(records=None, namespaces=None)[source]

Bases: prov.model.ProvBundle

Provenance Document.

add_bundle(bundle, identifier=None)[source]

Add a bundle to the current document.

Parameters
  • bundle (ProvBundle) – The bundle to add to the document.

  • identifier – The (optional) identifier to use for the bundle (default: None). If none given, use the identifier from the bundle itself.

bundle(identifier)[source]

Returns a new bundle from the current document.

Parameters

identifier – The identifier to use for the bundle.

Returns

ProvBundle

property bundles

Returns the bundles contained in the document.

Returns

Iterable of ProvBundle.

static deserialize(source=None, content=None, format='json', **args)[source]

Deserialize the ProvDocument from source (a stream or a file path) or directly from a string content.

Available serializers can be queried by the value of prov.serializers.Registry.serializers after loading them via prov.serializers.Registry.load_serializers().

Note: Not all serializers support deserialization.

Parameters
  • source – Stream object to deserialize the PROV document from (default: None).

  • content – String to deserialize the PROV document from (default: None).

  • format – Serialization format (default: 'json', i.e. PROV-JSON).

Returns

ProvDocument

flattened()[source]

Flattens the document by moving all the records in its bundles up to the document level.

Returns

ProvDocument – the (new) flattened document.

has_bundles()[source]

True if the object has at least one bundle, False otherwise.

Returns

bool

is_bundle()[source]

True if the object is a bundle, False otherwise.

Returns

bool

is_document()[source]

True if the object is a document, False otherwise.

Returns

bool

serialize(destination=None, format='json', **args)[source]

Serialize the ProvDocument to the destination.

Available serializers can be queried by the value of prov.serializers.Registry.serializers after loading them via prov.serializers.Registry.load_serializers().

Parameters
  • destination – Stream object to serialize the output to. Default is None, which serializes as a string.

  • format – Serialization format (default: 'json', i.e. PROV-JSON).

Returns

Serialization in a string if no destination was given, None otherwise.

unified()[source]

Returns a new document containing all records having same identifiers unified (including those inside bundles).

Returns

ProvDocument

update(other)[source]

Append all the records of the other document/bundle into this document. Bundles having same identifiers will be merged.

Parameters

other (ProvDocument or ProvBundle) – The other document/bundle whose records are to be appended.

Returns

None.

class prov.model.ProvElement(bundle, identifier, attributes=None)[source]

Bases: prov.model.ProvRecord

Provenance Element (nodes in the provenance graph).

is_element()[source]

True, if the record is an element, False otherwise.

Returns

bool

exception prov.model.ProvElementIdentifierRequired[source]

Bases: prov.model.ProvException

Exception for a missing element identifier.

class prov.model.ProvEnd(bundle, identifier, attributes=None)[source]

Bases: prov.model.ProvRelation

Provenance End relationship.

FORMAL_ATTRIBUTES = (<QualifiedName: prov:activity>, <QualifiedName: prov:trigger>, <QualifiedName: prov:ender>, <QualifiedName: prov:time>)

class prov.model.ProvEntity(bundle, identifier, attributes=None)[source]

Bases: prov.model.ProvElement

Provenance Entity element.

alternateOf(alternate2)[source]

Creates a new alternate record between this and another entity.

Parameters

alternate2 – Entity or a string identifier for the second entity.

hadMember(entity)[source]

Creates a new membership record adding an entity to this collection.

Parameters

entity – Entity to be added to the collection.

specializationOf(generalEntity)[source]

Creates a new specialisation record for this from a general entity.

Parameters

generalEntity – Entity or a string identifier for the general entity.

wasAttributedTo(agent, attributes=None)[source]

Creates a new attribution record between this entity and an agent.

Parameters
  • agent – Agent or string identifier of the agent involved in the attribution.

  • attributes – Optional additional attributes, as a dictionary or a list of tuples, to be added to the record (default: None).

wasDerivedFrom(usedEntity, activity=None, generation=None, usage=None, attributes=None)[source]

Creates a new derivation record for this entity from a used entity.

Parameters
  • usedEntity – Entity or a string identifier for the used entity.

  • activity – Activity or string identifier of the activity involved in the derivation (default: None).

  • generation – Optional generation stating a qualified derivation through an internal generation (default: None).

  • usage – Optional usage stating a qualified derivation through an internal usage (default: None).

  • attributes – Optional additional attributes, as a dictionary or a list of tuples, to be added to the record (default: None).

wasGeneratedBy(activity, time=None, attributes=None)[source]

Creates a new generation record for this entity.

Parameters
  • activity – Activity or string identifier of the activity involved in the generation.

  • time – Optional time for the generation (default: None). Either a datetime.datetime object or a string that can be parsed by dateutil.parser.parse().

  • attributes – Optional additional attributes, as a dictionary or a list of tuples, to be added to the record (default: None).

wasInvalidatedBy(activity, time=None, attributes=None)[source]

Creates a new invalidation record for this entity.

Parameters
  • activity – Activity or string identifier of the activity involved in the invalidation.

  • time – Optional time for the invalidation (default: None). Either a datetime.datetime object or a string that can be parsed by dateutil.parser.parse().

  • attributes – Optional additional attributes, as a dictionary or a list of tuples, to be added to the record (default: None).

exception prov.model.ProvException[source]

Bases: prov.Error

Base class for PROV model exceptions.

exception prov.model.ProvExceptionInvalidQualifiedName(qname)[source]

Bases: prov.model.ProvException

Exception for an invalid qualified identifier name.

qname = None

Intended qualified name.

class prov.model.ProvGeneration(bundle, identifier, attributes=None)[source]

Bases: prov.model.ProvRelation

Provenance Generation relationship.

FORMAL_ATTRIBUTES = (<QualifiedName: prov:entity>, <QualifiedName: prov:activity>, <QualifiedName: prov:time>)

class prov.model.ProvInfluence(bundle, identifier, attributes=None)[source]

Bases: prov.model.ProvRelation

Provenance Influence relationship.

FORMAL_ATTRIBUTES = (<QualifiedName: prov:influencee>, <QualifiedName: prov:influencer>)

class prov.model.ProvInvalidation(bundle, identifier, attributes=None)[source]

Bases: prov.model.ProvRelation

Provenance Invalidation relationship.

FORMAL_ATTRIBUTES = (<QualifiedName: prov:entity>, <QualifiedName: prov:activity>, <QualifiedName: prov:time>)

class prov.model.ProvMembership(bundle, identifier, attributes=None)[source]

Bases: prov.model.ProvRelation

Provenance Membership relationship.

FORMAL_ATTRIBUTES = (<QualifiedName: prov:collection>, <QualifiedName: prov:entity>)

class prov.model.ProvMention(bundle, identifier, attributes=None)[source]

Bases: prov.model.ProvSpecialization

Provenance Mention relationship (specific Specialization).

FORMAL_ATTRIBUTES = (<QualifiedName: prov:specificEntity>, <QualifiedName: prov:generalEntity>, <QualifiedName: prov:bundle>)

class prov.model.ProvRecord(bundle, identifier, attributes=None)[source]

Bases: object

Base class for PROV records.

FORMAL_ATTRIBUTES = ()

add_asserted_type(type_identifier)[source]

Adds a PROV type assertion to the record.

Parameters

type_identifier – PROV namespace identifier to add.

add_attributes(attributes)[source]

Add attributes to the record.

Parameters

attributes – Dictionary of attributes, with keys being qualified identifiers. Alternatively an iterable of tuples (key, value) with the keys satisfying the same condition.

property args

All values of the record’s formal attributes.

Returns

Tuple

property attributes

All record attributes.

Returns

List of tuples (name, value)

property bundle

Bundle of the record.

Returns

ProvBundle

copy()[source]

Return an exact copy of this record.

property extra_attributes

All names and values of the record’s attributes that are not formal attributes.

Returns

Tuple of tuples (name, value)

property formal_attributes

All names and values of the record’s formal attributes.

Returns

Tuple of tuples (name, value)

get_asserted_types()[source]

Returns the set of all asserted PROV types of this record.

get_attribute(attr_name)[source]

Returns the attribute of the given name.

Parameters

attr_name – Name of the attribute.

Returns

Tuple (name, value)

get_provn()[source]

Returns the PROV-N representation of the record.

Returns

String

get_type()[source]

Returns the PROV type of the record.

property identifier

Record’s identifier.

is_element()[source]

True, if the record is an element, False otherwise.

Returns

bool

is_relation()[source]

True, if the record is a relation, False otherwise.

Returns

bool

property label

Identifying label of the record.

property value

Value of the record.

class prov.model.ProvRelation(bundle, identifier, attributes=None)[source]

Bases: prov.model.ProvRecord

Provenance Relationship (edge between nodes).

is_relation()[source]

True, if the record is a relation, False otherwise.

Returns

bool

class prov.model.ProvSpecialization(bundle, identifier, attributes=None)[source]

Bases: prov.model.ProvRelation

Provenance Specialization relationship.

FORMAL_ATTRIBUTES = (<QualifiedName: prov:specificEntity>, <QualifiedName: prov:generalEntity>)

class prov.model.ProvStart(bundle, identifier, attributes=None)[source]

Bases: prov.model.ProvRelation

Provenance Start relationship.

FORMAL_ATTRIBUTES = (<QualifiedName: prov:activity>, <QualifiedName: prov:trigger>, <QualifiedName: prov:starter>, <QualifiedName: prov:time>)

class prov.model.ProvUsage(bundle, identifier, attributes=None)[source]

Bases: prov.model.ProvRelation

Provenance Usage relationship.

FORMAL_ATTRIBUTES = (<QualifiedName: prov:activity>, <QualifiedName: prov:entity>, <QualifiedName: prov:time>)

exception prov.model.ProvWarning[source]

Bases: Warning

Base class for PROV model warnings.

prov.model.encoding_provn_value(value)[source]

prov.model.first(a_set)[source]

prov.model.parse_boolean(value)[source]

prov.model.parse_xsd_datetime(value)[source]

prov.model.parse_xsd_types(value, datatype)[source]

prov.model.sorted_attributes(element, attributes)[source]

Helper function sorting attributes into the order required by PROV-XML.

Parameters
  • element – The prov element used to derive the type and the attribute order for the type.

  • attributes – The attributes to sort.

Module contents

exception prov.Error[source]

Bases: Exception

Base class for all errors in this package.

prov.read(source, format=None)[source]

Convenience function returning a ProvDocument instance.

Format detection is lazy: the function simply tries every known format in turn inside a try/except. The deserializers should fail fairly early when they are passed data of the wrong type, so the try/except is likely cheap. More advanced format auto-detection would be possible but is probably not necessary.

The downside is that no proper error messages are produced when detection fails; pass the format parameter to get the actual traceback.