The "Found Some" Data Access Problem

| Comments

manpurse.jpg

I’ve written my fair share of data access code. Recently I’ve come up with a nice solution for the case when only some of the requested keys are in the database.

My initial reaction to the “some found” scenario was implement one of two options:

  1. Raise an Exception
  2. Return only the found items

The first option is attractive because asking for an ID carries with it the assumption that the item exists. If it doesn’t, it’s an error.

The second option becomes more desirable if you don’t want to crash the whole operation if a tiny fraction of items might be hosed. Imagine manpurses.com, where a page normally shows 100 purses, but only one of them is missing from the database. Should we kill all hope of showing the 99 good records?

Recently I came up with what I think is a Pretty Good solution to the whole scenario. Indulge me as I walk through my thinking.

A Data Adapter for Man Purses

class UnforgivingManPurseDataAdapter(object):

    def __init__(self, db):
        self.db = db

    def get_bags(self, bag_ids):
        """Get bags for a list of ids.

        Raise an exception if any bags aren't found.

        """
        bags = self.db.query("select * from BAGS where id in (" + bag_ids + ")")
        if len(bags) < len(bag_ids):
            raise BagException("problem getting bags for ids " + str(bag_ids))
        return bags

The UnforgivingManPurseAdapter implements the first option from above: it raises an error if any of the bags aren’t found in the database. Deeming that too harsh for a production situation, I came up with a solution like this:

Squishy Man Purse Adapter

class SquishyManPurseDataAdapter(object):

    def __init__(self, db):
        self.db = db

    def get_bags(self, bag_ids, is_missing_ok=True):
        """Get bags for a list of ids

        Raise an exception if is_missing_ok is False and not
        all bags are found in the database.

        """
        bags = self.db.query("select * from BAGS where id in (" + bag_ids + ")")
        if len(bags) < len(bag_ids) and not is_missing_ok:
            raise BagException("problem getting bags for ids " + str(bag_ids))
        return bags

The SquishyManPurseAdapter lets the caller decide if missing bags should be an error. This is fine, but the method signature is kind of polluted. Plus in the default case the operation fails silently. That’s not awesome.

Another implementation of option two is to return a tuple of found and missing products:

Tuple Man Purse Data Adapter

class TuplelManPurseDataAdapter(object):

    def __init__(self, db):
        self.db = db

    def get_bags(self, bag_ids):
        """Get bags for a list of ids

        Raise an exception if is_missing_ok is False and not
        all bags are found in the database.

        """
        bags = self.db.query("select * from BAGS where id in (" + bag_ids + ")")
        missing_ids = []
        if len(bags) < len(bag_ids):
            missing_ids = self._get_missing_ids(bag_ids, bags)    
        return (bags, missing_ids)

This is bad because now the caller needs to check if there are any missing ids. Plus the method’s name implies it gets bags, but it really gets bags AND missing bag ids. Grossout.

My Solution: A Smart and Friendly Data Error Class


class PursesNotFoundError(Exception):
    """Some purses are missing.

    Check the missing_ids attribute for the missing purse ids.
    The found_purses attribute holds the purses that were found.

    """
    missing_ids = None
    found_purses = None

    def __init__(self, missing_ids=None, found_purses=None):
        self.missing_ids = missing_ids or []
        self.found_purses = found_purses or []


class HumaneManPurseDataAdapter(object):

    def __init__(self, db):
        self.db = db

    def get_bags(self, bag_ids):
        """Get bags for a list of ids.

        Raise an exception if any bags aren't found.

        """
        bags = self.db.query("select * from BAGS where id in (" + bag_ids + ")")
        if len(bags) < len(bag_ids):
            missing_ids = self._get_missing_ids(bag_ids, bags)
            raise PursesNotFoundError(missing_bag_ids, bags)
        return bags

I really like this solution because it:

  1. Makes no bones about the fact that something went wrong. An exception will always be raised on missing purses.
  2. Allows the caller to recover from the error with minimal effort (if it chooses to do so).

In retrospect, I really need to modify my thinking around Exceptions. They’re not just vessels for an error string that rocket their way up through the stack when Things Go Wrong. They can and should be packed with data and functionality that make it easier for the caller to recover from the error.

In fact, Python 2.6 does away with the whole “message” attribute of the Exception, which further reinforces the fact that it should probably be a fairly rich object.

blog comments powered by Disqus