Advanced use of Eet Data Descriptors

A real-life example is usually the best way to see how things are used, but such examples also involve far more code than needs to be shown. Instead of going that way, the following example borrows some pieces from one.

It's been slightly modified from the original source to show more of the varied ways in which Eet can handle our data.

This example shows a cache of user accounts and messages received, and it's a bit more interactive than previous examples.

Let's begin by looking at the structures we'll be using. First we have one to define the messages the user receives and another for the posts the user sends. Both are straightforward and introduce nothing new.

typedef struct
{
const char *screen_name;
const char *name;
const char *message;
unsigned int id;
unsigned int status_id;
unsigned int date;
unsigned int timeline;
} My_Message;
typedef struct
{
const char *dm_to;
const char *message;
} My_Post;

One more to declare the account itself. This one will contain a list of all messages received, and the posts we make ourselves will be kept in an array. No special reason other than to show how to use arrays with Eet.

typedef struct
{
unsigned int id;
const char *name;
Eina_List *messages;
My_Post *posts;
int posts_count;
} My_Account;

Finally, the main structure to hold our cache of accounts. We'll be looking for these accounts by their names, so let's keep them in a hash, using that name as the key.

typedef struct
{
unsigned int version; // it is recommended to use versioned configuration!
Eina_Hash *accounts;
} My_Cache;

As explained before, we need one descriptor for each struct we want Eet to handle, but this time we also want to keep around our Eet file and its string dictionary. You will see why in a moment.

static Eet_Data_Descriptor *_my_cache_descriptor;
static Eet_Data_Descriptor *_my_account_descriptor;
static Eet_Data_Descriptor *_my_message_descriptor;
static Eet_Data_Descriptor *_my_post_descriptor;
static Eet_File *_my_cache_file = NULL;
static Eet_Dictionary *_my_cache_dict = NULL;

The differences begin now. There aren't many, but we'll be creating our descriptors differently. Things can be added to our cache, but we won't be modifying the current contents, so we can consider the data read from it to be read-only, and thus allow Eet to save time and memory by not duplicating things unnecessarily.

static void
_my_cache_descriptor_init(void)
{
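Eet_Data_Descriptor_Class eddc;

// The FILE variant is good for caches and things that are just
// appended, but care is needed when changing strings, and files must
// be kept open so mmap()ed strings will be kept alive.
EET_EINA_FILE_DATA_DESCRIPTOR_CLASS_SET(&eddc, My_Cache);
_my_cache_descriptor = eet_data_descriptor_file_new(&eddc);
EET_EINA_FILE_DATA_DESCRIPTOR_CLASS_SET(&eddc, My_Account);
_my_account_descriptor = eet_data_descriptor_file_new(&eddc);
EET_EINA_FILE_DATA_DESCRIPTOR_CLASS_SET(&eddc, My_Message);
_my_message_descriptor = eet_data_descriptor_file_new(&eddc);
EET_EINA_FILE_DATA_DESCRIPTOR_CLASS_SET(&eddc, My_Post);
_my_post_descriptor = eet_data_descriptor_file_new(&eddc);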

As the comment in the code explains, we are asking Eet to give us strings directly from the mapped file, which avoids both loading the file into memory and duplicating its data. Of course, there are things to take into account when working this way, and they will be mentioned as we encounter those special cases.

Next comes the actual description of our data, just like we did in the previous examples.

#define ADD_BASIC(member, eet_type) \
EET_DATA_DESCRIPTOR_ADD_BASIC \
(_my_message_descriptor, My_Message, # member, member, eet_type)
ADD_BASIC(screen_name, EET_T_STRING);
ADD_BASIC(name, EET_T_STRING);
ADD_BASIC(message, EET_T_STRING);
ADD_BASIC(id, EET_T_UINT);
ADD_BASIC(status_id, EET_T_UINT);
ADD_BASIC(date, EET_T_UINT);
ADD_BASIC(timeline, EET_T_UINT);
#undef ADD_BASIC
#define ADD_BASIC(member, eet_type) \
EET_DATA_DESCRIPTOR_ADD_BASIC \
(_my_post_descriptor, My_Post, # member, member, eet_type)
ADD_BASIC(dm_to, EET_T_STRING);
ADD_BASIC(message, EET_T_STRING);
#undef ADD_BASIC

And the account struct's description doesn't add much new, but it's worth commenting on it.

#define ADD_BASIC(member, eet_type) \
EET_DATA_DESCRIPTOR_ADD_BASIC \
(_my_account_descriptor, My_Account, # member, member, eet_type)
ADD_BASIC(name, EET_T_STRING);
ADD_BASIC(id, EET_T_UINT);
#undef ADD_BASIC
EET_DATA_DESCRIPTOR_ADD_LIST
(_my_account_descriptor, My_Account, "messages", messages,
_my_message_descriptor);
EET_DATA_DESCRIPTOR_ADD_VAR_ARRAY
(_my_account_descriptor, My_Account, "posts", posts,
_my_post_descriptor);

We've seen how to add a list before, but now we are also adding an array. There's nothing really special about it, but it's important to note that EET_DATA_DESCRIPTOR_ADD_VAR_ARRAY is used to add arrays of variable length to a descriptor, that is, arrays just like the one we defined. Since there's no way in C to know how long such an array is, we need to keep track of the count ourselves, and Eet needs to know where that count lives. That's what the posts_count member of our struct is for. When adding our array member, this macro looks for another member in the struct named just like the array, but with _count appended. When saving our data, Eet learns how many elements the array contains by reading this count member. When loading back from a file, the member is set to the right number of elements.
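The naming convention can be illustrated with a small sketch in plain C. The macro below is hypothetical and is not Eet's actual implementation. It only shows how token pasting can locate a posts_count member from the array member's name and record both offsets. Post, Account, Var_Array_Info, VAR_ARRAY_INFO and var_array_count are names made up for this illustration.

```c
#include <stddef.h>

typedef struct
{
   const char *dm_to;
   const char *message;
} Post;

typedef struct
{
   unsigned int id;
   Post        *posts;
   int          posts_count; /* must be named <array member>_count */
} Account;

/* What a descriptor would need to remember about a variable-length
 * array: where the pointer lives and where its element count lives. */
typedef struct
{
   const char *name;
   size_t      array_offset;
   size_t      count_offset;
} Var_Array_Info;

/* Hypothetical macro: pastes "_count" onto the member name, so the
 * build fails if the struct has no matching count member. */
#define VAR_ARRAY_INFO(struct_type, name, member) \
  { (name), offsetof(struct_type, member), offsetof(struct_type, member ## _count) }

/* Reads the element count through the recorded offset only, without
 * knowing the concrete struct type. */
static int
var_array_count(const Var_Array_Info *info, const void *data)
{
   return *(const int *)((const char *)data + info->count_offset);
}
```

With this sketch, VAR_ARRAY_INFO(Account, "posts", posts) records the offsets of both posts and posts_count, which is all generic code needs to walk the array when saving.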

Another option for arrays is to use EET_DATA_DESCRIPTOR_ADD_ARRAY, which takes care of fixed-size arrays. For example, let's suppose that we want to keep track of only the last ten posts the user sent, and we declare our account struct as follows:

typedef struct
{
unsigned int id;
const char *name;
Eina_List *messages;
My_Post posts[10];
} My_Account;

Then we would add the array to our descriptor with:

EET_DATA_DESCRIPTOR_ADD_ARRAY(_my_account_descriptor, My_Account, "posts",
posts, _my_post_descriptor);

Notice how this time we don't have a posts_count variable in our struct. We could have it for the program to keep track of how many posts the array actually contains, but Eet no longer needs it. Defined that way, the array already takes up all the memory needed for the ten elements, and in C its size can be determined in code. When saving our data, Eet will just dump the entire memory blob into the file, regardless of how much of it is really used. So it's important to take this kind of thing into consideration when defining your data types. Each option has its uses, advantages and disadvantages, and it's up to you to decide which to use.
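As a rough illustration of why no count member is needed, the following plain C sketch computes the capacity of a fixed-size array member with sizeof. It does not use Eet at all. Account_Fixed, ARRAY_LENGTH and the two helper functions are hypothetical names, and treating a NULL message as an empty slot is an assumption made only for this example.

```c
#include <stddef.h>

typedef struct
{
   const char *dm_to;
   const char *message;
} Post;

typedef struct
{
   unsigned int id;
   const char  *name;
   Post         posts[10];
} Account_Fixed;

/* Element count of a true array (not a pointer), known at compile time. */
#define ARRAY_LENGTH(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Capacity of the posts array: always 10, however many slots are in use. */
static size_t
account_posts_capacity(const Account_Fixed *acc)
{
   return ARRAY_LENGTH(acc->posts);
}

/* Number of slots that actually hold a post. Assumption for this sketch:
 * an unused slot has a NULL message. */
static size_t
account_posts_used(const Account_Fixed *acc)
{
   size_t i, used = 0;

   for (i = 0; i < ARRAY_LENGTH(acc->posts); i++)
     if (acc->posts[i].message) used++;
   return used;
}
```

The capacity is fixed by the type, while the number of slots in use is something only the program knows, which is why a separate counter can still be useful to the application.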

Now, going back to our example, we have to finish adding our data to the descriptors. We are only missing the main one for the cache, which contains our hash of accounts. Unless you are using your own hash functions when setting the descriptor class, always use hashes with string type keys.

#define ADD_BASIC(member, eet_type) \
EET_DATA_DESCRIPTOR_ADD_BASIC \
(_my_cache_descriptor, My_Cache, # member, member, eet_type)
ADD_BASIC(version, EET_T_UINT);
#undef ADD_BASIC
EET_DATA_DESCRIPTOR_ADD_HASH
(_my_cache_descriptor, My_Cache, "accounts", accounts,
_my_account_descriptor);
} /* _my_cache_descriptor_init */

If you remember, we told Eet not to duplicate memory when possible at the time of loading back our data. But this doesn't mean everything will be loaded straight from disk and we don't have to worry about freeing it. Data in the Eet file is compressed and encoded, so it still needs to be decoded and memory will be allocated to convert it back into something we can use. We also need to take care of anything we add in the current instance of the program. To summarize, any string we get from Eet is likely to be a pointer to the internal dictionary, and trying to free it will, in the best case, crash our application right away.

So how do we know if we have to free a string? We check if it's part of the dictionary, and if it's not there we can be sure it's safe to get rid of it.

static void
_eet_string_free(const char *str)
{
if (!str)
return;
if ((_my_cache_dict) && (eet_dictionary_string_check(_my_cache_dict, str)))
return;
eina_stringshare_del(str);
} /* _eet_string_free */
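The idea behind this check can be sketched without Eet. In the plain C example below, a contiguous block of strings stands in for the file-backed dictionary, and a string is released only when it does not point into that block. String_Region, region_owns and string_release are hypothetical names, the address-range test is an assumption about the general idea rather than a description of how eet_dictionary_string_check is implemented, and free() stands in for eina_stringshare_del().

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a file-backed dictionary: one contiguous
 * block of NUL-terminated strings. */
typedef struct
{
   const char *base;
   size_t      size;
} String_Region;

/* Returns 1 when str points inside the region, 0 otherwise. */
static int
region_owns(const String_Region *region, const char *str)
{
   uintptr_t start, addr;

   if (!region || !str) return 0;
   start = (uintptr_t)region->base;
   addr = (uintptr_t)str;
   return (addr >= start) && (addr - start < region->size);
}

/* Frees str only when it was allocated by the program itself.
 * Returns 1 if the string was freed, 0 if it was left alone. */
static int
string_release(const String_Region *region, const char *str)
{
   if (!str) return 0;
   if (region_owns(region, str)) return 0; /* file-backed: not ours to free */
   free((void *)str);
   return 1;
}
```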

See how this is used when adding a new message to our cache.

static My_Message *
_my_message_new(const char *message)
{
My_Message *msg = calloc(1, sizeof(My_Message));
if (!msg)
{
fprintf(stderr, "ERROR: could not calloc My_Message\n");
return NULL;
}
msg->message = eina_stringshare_add(message);
return msg;
} /* _my_message_new */
static void
_my_message_free(My_Message *msg)
{
_eet_string_free(msg->screen_name);
_eet_string_free(msg->name);
_eet_string_free(msg->message);
free(msg);
} /* _my_message_free */

Skipping all the utility functions used by our program (they are in the full example source), we get to our cache loading code. Nothing out of the ordinary at first: open the file, read the data using our main descriptor to decode it into something we can use, then check the version of the loaded data and, if it doesn't match, act accordingly.

static My_Cache *
_my_cache_new(void)
{
My_Cache *my_cache = calloc(1, sizeof(My_Cache));
if (!my_cache)
{
fprintf(stderr, "ERROR: could not calloc My_Cache\n");
return NULL;
}
my_cache->accounts = eina_hash_string_small_new(NULL);
my_cache->version = 1;
return my_cache;
} /* _my_cache_new */
static Eina_Bool
_my_cache_account_free_cb(const Eina_Hash *hash EINA_UNUSED,
const void *key EINA_UNUSED,
void *data,
void *fdata EINA_UNUSED)
{
_my_account_free(data);
return EINA_TRUE;
}
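The version check mentioned for the loading code could look roughly like the plain C sketch below. It is not taken from the example's source. Cache_Header, CACHE_VERSION and cache_version_ensure are hypothetical names, and simply overwriting the version number is a placeholder for whatever migration a real program would perform.

```c
#include <stdio.h>

#define CACHE_VERSION 1u

/* Hypothetical stand-in for the versioned part of My_Cache. */
typedef struct
{
   unsigned int version;
} Cache_Header;

/* Returns 1 if the loaded data already has the expected version,
 * 0 if it had to be upgraded. */
static int
cache_version_ensure(Cache_Header *hdr)
{
   if (hdr->version == CACHE_VERSION) return 1;

   fprintf(stderr, "WARNING: cache version %u differs from %u, upgrading\n",
           hdr->version, CACHE_VERSION);
   /* A real program would migrate or discard the old data here. */
   hdr->version = CACHE_VERSION;
   return 0;
}
```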

Then comes the interesting part. Remember how we kept two more global variables alongside our descriptors? One of them we already used to check whether it was safe to free a string, but we didn't see where it came from. Loading our data straight from the mmapped file means that we can't close it until we are done using it, so we need to keep its handle around until then. It also means that any changes made to the file can, and will, invalidate all our pointers to the file-backed data, so if we add something and save the file, we need to reload our cache.

Thus our load function checks whether we already had an open file; if so, it gets closed and our variable is updated to the new handle. Then we get the string dictionary, which we use to check whether a string is part of it. Updating any references to the cache data is up to you as a programmer to handle properly; there's nothing Eet can do in this situation.

static void
_my_cache_free(My_Cache *my_cache)
{
eina_hash_foreach(my_cache->accounts, _my_cache_account_free_cb, NULL);
eina_hash_free(my_cache->accounts);
free(my_cache);
} /* _my_cache_free */
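The handle swap described for the load function can be sketched in plain C, with a heap buffer standing in for the mapped file. This is only an illustration of the ordering (load the new data first, then retire the old backing store) and does not use the Eet API. Backing_File, Cache_View, current_file and the three functions are hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an open file: owns the memory that
 * file-backed strings point into. */
typedef struct
{
   char *mapping;
} Backing_File;

/* Hypothetical stand-in for the loaded cache: borrows strings from
 * the backing file and must never free them directly. */
typedef struct
{
   const char *name;
} Cache_View;

static Backing_File *current_file = NULL;

static Backing_File *
backing_open(const char *contents)
{
   size_t len = strlen(contents) + 1;
   Backing_File *bf = malloc(sizeof(Backing_File));

   if (!bf) return NULL;
   bf->mapping = malloc(len);
   if (!bf->mapping)
     {
        free(bf);
        return NULL;
     }
   memcpy(bf->mapping, contents, len);
   return bf;
}

static void
backing_close(Backing_File *bf)
{
   free(bf->mapping);
   free(bf);
}

/* Loads a fresh view, then closes the previous backing file. Once this
 * returns 1, every pointer taken from the old view is dangling. */
static int
cache_reload(Cache_View *view, const char *contents)
{
   Backing_File *bf = backing_open(contents);

   if (!bf) return 0;
   view->name = bf->mapping;
   if (current_file) backing_close(current_file);
   current_file = bf;
   return 1;
}
```

Closing the old backing store only after the new one has loaded means a failed reload leaves the previous data usable, at the cost of briefly holding both.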

The save function doesn't have anything new, and all that's left after it is the main program, which doesn't really have anything of interest within the scope of what we are learning.