Specifying data for analysis

We introduce the concept of a “data store”. This represents the data record(s) that you want to analyse. It can be a single file, a directory of files, a zipped directory of files or a single sqlitedb file containing multiple data records.

We represent this concept by a DataStore class. There are different flavours of these:

  • directory based

  • SQLite based

All of these types support being indexed, iterated over, etc.

A read only data store

To create one of these, you provide a path AND a suffix of the files within the directory / zip that you will be analysing. (If the path ends with .sqlitedb, no file suffix is required.)

Data store “members”

Each member of a data store is able to read its own raw data.

Looping over a data store
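
To make the ideas of a path plus suffix, self-reading members and iteration concrete, here is a minimal stand-alone sketch using only the Python standard library. It is not cogent3's implementation: the names ToyMember, ToyReadOnlyStore and demo are hypothetical, and only the directory case is modelled.

```python
import tempfile
from pathlib import Path


class ToyMember:
    """Hypothetical stand-in for a data store member."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self.unique_id = path.name

    def read(self) -> str:
        # a member knows where its record lives, so it can read itself
        return self.path.read_text()


class ToyReadOnlyStore:
    """Hypothetical read only, directory based store (an assumption,
    not the cogent3 DataStore class)."""

    def __init__(self, source: str, suffix: str) -> None:
        # the path says where to look, the suffix says which files count
        paths = sorted(Path(source).glob(f"*.{suffix}"))
        self.members = [ToyMember(p) for p in paths]

    def __len__(self) -> int:
        return len(self.members)

    def __getitem__(self, index: int) -> ToyMember:
        return self.members[index]

    def __iter__(self):
        return iter(self.members)


def demo() -> list:
    """Build a throwaway directory and loop over the matching members."""
    with tempfile.TemporaryDirectory() as tmp:
        files = {
            "a.fasta": ">s1\nACGT\n",
            "b.fasta": ">s2\nGGCC\n",
            "notes.txt": "not selected by the suffix",
        }
        for name, text in files.items():
            (Path(tmp) / name).write_text(text)
        store = ToyReadOnlyStore(tmp, suffix="fasta")
        # looping yields members; each member reads its own data
        return [(m.unique_id, m.read()) for m in store]
```

Calling demo() returns one (identifier, contents) pair per file with the requested suffix, and the notes.txt file is left out.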

Making a writeable data store

The creation of a writeable data store is specified with mode="w", or (to append) mode="a". In the former case, any existing records are overwritten. In the latter case, existing records are ignored.
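
The difference between the two modes can be pictured with a small stand-alone sketch. It uses plain files and the standard library only, and it takes "ignored" to mean that an existing record is left untouched and skipped. The function names write_records and demo are hypothetical and this is not how cogent3 implements its data stores.

```python
import tempfile
from pathlib import Path


def write_records(directory: str, records: dict, mode: str) -> list:
    """Write records as files and return the names actually written.

    mode="w" overwrites an existing record of the same name;
    mode="a" keeps existing records and only adds missing ones.
    (Illustrative assumption about the semantics, not cogent3 code.)
    """
    if mode not in ("w", "a"):
        raise ValueError(f"unsupported mode {mode!r}")
    out = Path(directory)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for name, text in records.items():
        target = out / name
        if mode == "a" and target.exists():
            continue  # append: leave the existing record untouched
        target.write_text(text)
        written.append(name)
    return written


def demo() -> tuple:
    """Show that append skips an existing record and write replaces it."""
    with tempfile.TemporaryDirectory() as tmp:
        write_records(tmp, {"a.txt": "old"}, mode="w")
        appended = write_records(tmp, {"a.txt": "new", "b.txt": "new"}, mode="a")
        after_append = (Path(tmp) / "a.txt").read_text()
        write_records(tmp, {"a.txt": "new"}, mode="w")
        after_write = (Path(tmp) / "a.txt").read_text()
        return appended, after_append, after_write
```

In this sketch the append pass writes only b.txt and leaves a.txt with its old contents, while the final write pass replaces a.txt.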

Sqlitedb data stores for serialised data

When you specify a Sqlitedb data store as your output (by using open_data_store()), multiple records are written into a single file, which makes distribution easier.
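
The benefit of a single-file store can be illustrated with Python's built-in sqlite3 module. The sketch below is an assumption-laden toy: the table name results, its columns, and the functions save_records, load_record and demo are all hypothetical and do not reflect the schema cogent3 uses.

```python
import sqlite3
import tempfile
from pathlib import Path


def save_records(db_path, records: dict) -> None:
    """Store every record as one row in a single SQLite file."""
    con = sqlite3.connect(db_path)
    with con:  # commits on success, rolls back on error
        con.execute(
            "CREATE TABLE IF NOT EXISTS results "
            "(record_id TEXT PRIMARY KEY, data TEXT)"
        )
        con.executemany(
            "INSERT OR REPLACE INTO results VALUES (?, ?)", records.items()
        )
    con.close()


def load_record(db_path, record_id: str):
    """Return the data for one record, or None if it is absent."""
    con = sqlite3.connect(db_path)
    row = con.execute(
        "SELECT data FROM results WHERE record_id = ?", (record_id,)
    ).fetchone()
    con.close()
    return None if row is None else row[0]


def demo() -> tuple:
    """Write three records and confirm only one file was produced."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = Path(tmp) / "results.sqlitedb"
        save_records(db_path, {"gene1": "ACGT", "gene2": "GGCC", "gene3": "TTAA"})
        files = [p.name for p in Path(tmp).iterdir()]
        return files, load_record(db_path, "gene2"), load_record(db_path, "missing")
```

However many records are saved, the output directory contains one file, which is the property that simplifies sharing results.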

One important issue to note is that the process which creates a Sqlitedb “locks” the file. If that process exits abnormally (e.g. the run that was producing it was interrupted), the file may remain in a locked state. If the db is in this state, cogent3 will not modify it unless you explicitly unlock it.

The locked state is reported when you display the data store's summary, for example through its describe attribute.

To unlock the data store, call its unlock() method with force=True.

Interrogating run logs

If you use the apply_to() method, a scitrack logfile will be included in the data store. This includes useful information regarding the run conditions that produced the contents of the data store.

Log files can be accessed via a special attribute, logs, which holds a list.

Each element in that list is a DataMember, which you can use to read the contents of the log.