My little Android warehouse

Random thoughts about my experience as moonlight android developer.

Generating Preloaded Sqlite Data

Preloading data in Sqlite is a common problem during android development.
The most common solutions are:
  • hardcode the insert statements and execute them (possibly inside a transaction) the first time the db is opened
  • place the data inside a csv file, load it the first time the db gets opened, parse the rows and populate the db
  • prepare a pre-built sqlite file, ship it along with your app and load it the first time the db is opened


    I'll focus on the third approach, because it allows us to check our db outside the app without extracting it from our device, and because it allows us to prepare the data in a more "human friendly way".

    It's worth to mention the excellent sqlite asset helper library by Jeff Gifelt, that allows us to embed a prebuilt (compressed) sqlite file without having to reinvent the wheel. For this reason, this post will be about how to handle data entry process (and something more), and you won't find a (yet another) way to preload a sqlite file.

    Let's start with a problem: you want to build a pretty simple Seasonal Fruit and Vegetables app, and the amount of data you have to deal with is big enough to discourage you to prepare the inserts manually.
    You'll have:
    • the list of products
    • the description of each product (possibly localized in different languages)
    • the data related to the seasonality of each product
    So, here's the approach:

    Step 1: data entry

    Use a spreadsheet to fill the data in a human friendly way, export it as csv (or tsv, so you won't have to deal with any random comma inside the descriptions), parse it and produce the sqlite file together with localized string res files to be embedded inside the application.
    In my case I had a file  having the genre (fruit or veg), the calories count, the name and the description in italian and in english:

     asparagus     0     25     Asparagi     Descrizione asparagi in italiano.      Asparagus     English asparagus description  
    chard 0 12 Bietola Descrizione bietola in italiano. Chard English chard description


    I also have files that describe the seasonality of each products in different areas, but I will not describe them since they are not relevant to the approach I am describing here.

    Step 2: Parse!

    Choose your favorite scripting language. I went for python since it's super slick and it comes with batteries included, such as xml writing and sqllite support.

    Creating the db is straightforward:
     CREATE_ITEMS = 'create table Item (_id integer primary key, Name text, kind integer, calories integer);'  
    CREATE_ITEM_PER_MONTHS = 'create table ItemPerMonth (ItemId integer, Month integer, Freshness integer, Area integer);'
    CREATE_ITEM_PREFERRED = 'create table PreferredItem (PreferredItemId integer);'
    conn = sqlite3.connect('fruit')
    c = conn.cursor()
    c.execute(CREATE_ITEM_PER_MONTHS)
    c.execute(CREATE_ITEM_PREFERRED)


    Then we need to use the data we have to fullfill two purpouses:
    • fill the database
    • fill the localized string resources 
    Again, filling the database consists of parsing each line of the csv file and calling the appropriate insert.

    In python is as easy as calling

       cursor.execute(statement)  

    on a cursor object, where statement is the insert statement.

    Together with that, we do have a lot of data that needs to be rendered in the proper language.
    In order to take advantage of the resource framework, I chose to generate the resource files instead of relying on a "locale" field in the db to filter in my queries.
    Again, filling an xml node is pretty straightforward in python:

       name = ET.SubElement(root, "string")  
    name.set("name", item_name)
    if lang_name.find('%') != -1:
    name.set("formatted", "false")
    name.text = lang_name


    At this point, we can generate an it_descriptions.xml file and an eng_descriptions.xml file (and even let our script place them in the proper folders).

    When we want to use our descriptions in Android (i.e. in  our cursor adapter's bindView), we  need to bind the product name (i.e. "asparagus") with the corresponding resource. That's easy and it can be done in this way:

      int nameResID = context.getResources().getIdentifier(name,   
    "string", context.getPackageName());

    By doing that, we can retrieve the resource id of the string related to the name that was generated by our script. If we are worried about the cost of the lookup, we can even add a lru cache to absorb that cost.


    TL;DR:
    • Write your data in a human readable format
    • Parse that data with your favourite language
    • Write the data that does not depends on the language in sqllite
    • Write the language-dependent data in resource files
    • Use getIdentifier for retrieving the resource id of the strings automatically generated
    Bonus point: again, you can use this convention for pictures as well, so you can throw an "asparagus.png" image in your assets to be loaded afterwards.

    The biggest advantage of this approach is that you have only one master source of your data (the spreadsheet) that you can modify and rerun the generator over every time you modify it.

    If you are interested in the whole loader python script you can check it here