Tuesday, September 6, 2011

Tables Are No Domain Objects: Table Relation Transformations Part 1

This is the second part of the blog series 'Tables Are No Domain Objects'. In this post we will discuss which database relations are good candidates for being accessed in a different manner when they are represented by domain objects in our business layer.

The most obvious kind of database table relation is a one (A) to many (B) relation, where rows in table B hold a foreign key column that points to a unique key (usually the primary key) of table A. Most data access layers, whether based on an O/R-Mapper or on custom mappings, do a good job of mapping this kind of database relation into objects. There are some cases, however, where it can make sense to transform those relations into a different structure or to provide a different kind of access than the one given by our database.

Foreign Key Fields And Transparent Database Relations

Before we step into more specific types of relations, there is one very basic thing each of us should think about when starting to design a new business layer. In a database, relations are always represented by foreign keys, but when data is loaded into an object structure we can use object references, so we don't really need those foreign key fields as part of our objects. For instance, a sales order line object does not need to hold the ID of its parent sales order; it can hold an object reference to the sales order. One good reason to keep foreign key fields present in our domain objects is to have some additional logging and debugging information. However, we should never implement any business logic on those fields; instead, all business logic should always be implemented on the corresponding object references. (Very rare exceptions prove the rule though.)
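To illustrate the idea, here is a minimal sketch of a sales order line that navigates to its parent through an object reference and exposes the parent's ID only as read-only diagnostic information. The classes are hand-written and hypothetical (they are not the generated Entity Framework classes used in this series), and the AddLine method is an assumption made for the example.

```csharp
using System;
using System.Collections.Generic;

public class SalesOrder {
   private readonly List<SalesOrderLine> _lines = new List<SalesOrderLine>();

   public int Id { get; set; }
   public IEnumerable<SalesOrderLine> SalesOrderLines { get { return _lines; } }

   // Adding a line wires up the object reference; callers never handle the key.
   public SalesOrderLine AddLine(decimal articlePrice, int itemCount) {
      var line = new SalesOrderLine(this, articlePrice, itemCount);
      _lines.Add(line);
      return line;
   }
}

public class SalesOrderLine {
   public SalesOrderLine(SalesOrder order, decimal articlePrice, int itemCount) {
      if (order == null) throw new ArgumentNullException("order");
      SalesOrder = order;
      ArticlePrice = articlePrice;
      ItemCount = itemCount;
   }

   // Business logic navigates through this reference.
   public SalesOrder SalesOrder { get; private set; }
   public decimal ArticlePrice { get; private set; }
   public int ItemCount { get; private set; }

   // Derived from the reference; intended for logging and debugging only.
   public int SalesOrderId { get { return SalesOrder.Id; } }
}
```

Because SalesOrderId is derived from the reference and has no setter, nothing in the upper layers can accidentally build logic on a key that is out of sync with the object graph.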

This was already discussed in the previous blog post but should be recalled for the sake of completeness. O/R-Mappers like Entity Framework or NHibernate provide a powerful query interface to access data, but using those queries in our business or presentation layers causes a tight coupling between our source code and the database structure. Apart from the other issues discussed in the previous post, queries like the following can become an issue if we ever need to refactor our database structure or domain objects.

var orders = from o in efContext.Orders
             where o.CustomerId == currentCustomer.Id
             select o;

foreach (var order in orders) {
   // process order
}
Instead, it is usually much safer to provide strongly typed access methods from our data access layer.
var orders = myDataContext.Orders.GetForCustomer(currentCustomer);

foreach (var order in orders) {
   // process order
}
Please read the previous post (Tables Are No Domain Objects Part 1) to see further issues, especially when using Entity Framework.
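What such a strongly typed access method could look like inside the data access layer is sketched below. OrderRepository, its constructor, and the simplified Customer and Order classes are hypothetical names chosen for this illustration; the point is only that the foreign key comparison lives in exactly one place.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Customer {
   public int Id { get; set; }
}

public class Order {
   public int Id { get; set; }
   public int CustomerId { get; set; }
}

// Hypothetical data access class: the only code that knows about CustomerId.
public class OrderRepository {
   private readonly IQueryable<Order> _orders;

   // Any IQueryable source will do, for example an O/R-Mapper's entity set.
   public OrderRepository(IQueryable<Order> orders) {
      _orders = orders;
   }

   public IList<Order> GetForCustomer(Customer customer) {
      // Copy the key to a local so the query only captures a primitive value.
      int customerId = customer.Id;
      return _orders
         .Where(o => o.CustomerId == customerId)
         .ToList();
   }
}
```

If the relation between customers and orders is ever refactored, only the body of GetForCustomer has to change, while callers keep passing a Customer object.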

Aggregations

In general I'm not as restrictive as other architects, who say it is always a bad idea to access objects related to the current object reference. When it comes to aggregations, however, it can be dangerous to compute them outside of the class that holds the objects to be aggregated.

One of the most common examples of an aggregation that we should consider encapsulating is the calculation of a sales order's price, which is based on the prices of its line items.

SalesOrder order = GetOrder();
decimal orderPrice = 
   order.SalesOrderLines.Sum(line => line.ArticlePrice * line.ItemCount);
At the very beginning of a new system this can work quite nicely. The problem is: what if the calculation of the sales order's price ever changes? Salespeople are creative in finding new ways to sell the company's products, and usually it is only a matter of time until discount features are required. Discounts can be a special offer for specific articles or article categories, a graduated discount depending on the order's overall price, or many other types. Now we can run into trouble if we calculate a sales order's price from outside. A better solution is to put the aggregation into the sales order class.

public partial class SalesOrder {
   public decimal GetPrice() {
      return SalesOrderLines.Sum(line => line.ArticlePrice * line.ItemCount);
   }
}
For now we have only encapsulated the calculation that we previously did from outside (which already avoids duplicating the logic that multiplies by the line's item count), but when it comes to discounting we don't need to scan our whole source code to find all the places where an order's price is calculated. We only have to adapt the body of our SalesOrder.GetPrice method, and the rest of the system doesn't even notice the new calculation.
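As a sketch of how such a change could stay local, the following version of GetPrice applies a graduated discount based on the order's gross price. The discount thresholds and rates are invented purely for illustration, and the classes are simplified stand-ins for the generated entities.

```csharp
using System.Collections.Generic;
using System.Linq;

public class SalesOrderLine {
   public decimal ArticlePrice { get; set; }
   public int ItemCount { get; set; }
}

public class SalesOrder {
   public SalesOrder() {
      SalesOrderLines = new List<SalesOrderLine>();
   }

   public List<SalesOrderLine> SalesOrderLines { get; private set; }

   // Callers still just ask for the price; the discount is an internal detail.
   public decimal GetPrice() {
      decimal gross = SalesOrderLines.Sum(line => line.ArticlePrice * line.ItemCount);
      return gross * (1m - GetDiscountRate(gross));
   }

   // Hypothetical graduated discount: 5% from 1000, 10% from 5000.
   private static decimal GetDiscountRate(decimal gross) {
      if (gross >= 5000m) return 0.10m;
      if (gross >= 1000m) return 0.05m;
      return 0m;
   }
}
```

Client code that already calls order.GetPrice() needs no modification when the discount rules are introduced or changed later.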

(Doing money calculations with decimal will be covered in a subsequent post of this series.)

Status Objects

Status objects are special kinds of domain objects that describe the current status of their parent objects. They usually collaborate with their parent domain object and with a description object that describes the status.


In addition to providing the current status of their parent object, status objects are often used to log the operational history of an object, since they are usually neither deleted nor updated after their creation. Since each new status object can change the state of its parent, they often play a very important role in the processing of an object.

As an example, say we have a parent SalesOrder domain object that holds a list of SalesOrderStatus objects, where each status is described by a referenced SalesOrderStatusDescription object. Now what if we want to know whether an order's current status is "Closed"? Without some design effort we would have to do something like this.
public partial class SalesOrderStatusDescription {
   // status description code constants
   public const string ClosedCode = "Closed";
   // ...
}
// =========================================
// sample usage
SalesOrder order = GetOrder();

bool isClosed = (from status in order.SalesOrderStatus
                 where status.OrderId == order.Id
                 orderby status.CreationDate descending
                 select status.SalesOrderStatusDescription.Code)
                 .First()
                 == SalesOrderStatusDescription.ClosedCode;
Apart from the fact that this causes a tight coupling between three different domain objects and their base tables, it would be awful if we always had to do this in the upper layers just to get an object's current state.

The first thing we can do to tidy this up is to introduce an enum that represents either the possible codes of our SalesOrderStatusDescription objects or the possible foreign key values pointing to the SalesOrderStatusDescription primary keys. Since the first option would require us to always load the descriptions in order to parse their code fields, we will go with the foreign key solution, which causes lower database utilization. Yes, I know we should try never to base any functionality on foreign key values, but I tend to see this as one of the valid exceptions. Our description IDs are usually immutable, and it does not make a big difference whether our source code is coupled to the Code column of the status description or to its primary key.

public enum SalesOrderStatusCode : int {
   Created = 1,
   Approved = 2,
   Delivered = 3,
   Payed = 4,
   Closed = 5,
}
As the next step, we can add a new property to our sales order status that represents the value of the enum. Unfortunately, at the time of writing Entity Framework does not provide native support for enums, so we need a workaround that casts the foreign key's value.
public partial class SalesOrderStatus {
   public SalesOrderStatusCode Code {
      get { return (SalesOrderStatusCode)SalesOrderStatusDescriptionId; }
      set { SalesOrderStatusDescriptionId = (int)value; }
   }
}
Okay, now we are able to shorten the previous snippet a little, but without one more method we would still need to traverse the list of all existing status objects whenever we want to know the current one. Since the current status of an object is usually a widely needed piece of information, we should add a method to our sales order that encapsulates the traversal and returns the code of the current status.

public partial class SalesOrder {
   public SalesOrderStatusCode GetCurrentStatusCode() {
      return (from status in SalesOrderStatus
              orderby status.CreationDate descending
              select status.Code)
              .First();
   }
}
// =========================================
// sample usage
SalesOrder order = GetOrder();
bool isClosed = order.GetCurrentStatusCode() == SalesOrderStatusCode.Closed;
This interface is much niftier and will make life much easier in client code.

As an optional last step, we could add an IsClosed method to our order. I usually do this only for the most important states of an object though.
public partial class SalesOrder {
   public bool IsClosed() {
      return GetCurrentStatusCode() == SalesOrderStatusCode.Closed;
   }
}
// =========================================
// sample usage
SalesOrder order = GetOrder();
bool isClosed = order.IsClosed();
Now our sales order provides a really handy interface that helps us to concentrate on other things when implementing features that need to work with the sales order status.

Last but not least, we should add a corresponding method to set the new status of an order.
public partial class SalesOrder {
   public void SetStatus(SalesOrderStatusCode code) {
      SalesOrderStatus status = new SalesOrderStatus();
      status.CreationDate = DateTime.Now;
      status.Code = code;
      status.SalesOrder = this;
      SalesOrderStatus.Add(status);
   }
}
There is one line in this method that could cause problems. Setting the CreationDate from the local host's time is only safe if we are sure that all client PCs are synchronized with the same time server; otherwise the creation dates of new status objects can deviate from each other. Since the creation date is essential for these objects, this could cause issues in production. One thing we can do is use the time from a central server, like the database server, instead of trusting the clients.
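One possible way to keep the time source exchangeable is to hide it behind a small abstraction and hand it to the order, as in the sketch below. IClock, FixedClock, and the StatusHistory property are hypothetical names introduced for this example; a production implementation of IClock might, for instance, ask the database server for its current time, which is not shown here.

```csharp
using System;
using System.Collections.Generic;

public enum SalesOrderStatusCode { Created = 1, Approved = 2, Delivered = 3, Payed = 4, Closed = 5 }

public class SalesOrderStatus {
   public DateTime CreationDate { get; set; }
   public SalesOrderStatusCode Code { get; set; }
}

// Hypothetical abstraction: the single source of "now" for the business layer.
public interface IClock {
   DateTime Now { get; }
}

// Simple stand-in that always returns the same instant, e.g. for tests.
public class FixedClock : IClock {
   private readonly DateTime _now;
   public FixedClock(DateTime now) { _now = now; }
   public DateTime Now { get { return _now; } }
}

public class SalesOrder {
   private readonly IClock _clock;
   private readonly List<SalesOrderStatus> _statusHistory = new List<SalesOrderStatus>();

   public SalesOrder(IClock clock) {
      if (clock == null) throw new ArgumentNullException("clock");
      _clock = clock;
   }

   public IList<SalesOrderStatus> StatusHistory { get { return _statusHistory; } }

   public void SetStatus(SalesOrderStatusCode code) {
      var status = new SalesOrderStatus();
      status.CreationDate = _clock.Now;   // no direct call to DateTime.Now
      status.Code = code;
      _statusHistory.Add(status);
   }
}
```

With entities that are materialized by an O/R-Mapper, constructor injection may not be practical; passing the clock (or the timestamp itself) as a method parameter is an alternative with the same effect.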

Since there are usually many more places where we need to know the current status of an object than places where the status is changed, I tend to add fewer strongly typed set methods like SetPayed().

As we have seen, due to their importance and their complicated access, status objects are usually good candidates for being handled very differently from the way they are stored in our database, and some architectural effort to get them into a more convenient, object-oriented structure can be a good investment.

Performance Tuning. Since this series concentrates on designing our domain objects I have kept this until now, but our current solution always has to retrieve all existing status objects from the database server, which causes unneeded network traffic and database utilization. We should consider adding a method to our data access layer that loads only the current status, description ID, or description code, instead of loading all status objects if they are not loaded yet anyway. This is another important reason why we should encapsulate the get method, since we then need to change only one place.
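A data access method of this kind could look roughly like the following sketch. SalesOrderStatusGateway and SalesOrderStatusRow are hypothetical names, and the example runs against an in-memory IQueryable; whether a real O/R-Mapper turns the query into a single-row database request depends on the provider and should be verified.

```csharp
using System;
using System.Linq;

public enum SalesOrderStatusCode { Created = 1, Approved = 2, Delivered = 3, Payed = 4, Closed = 5 }

// Simplified stand-in for a mapped status row.
public class SalesOrderStatusRow {
   public int SalesOrderId { get; set; }
   public int SalesOrderStatusDescriptionId { get; set; }
   public DateTime CreationDate { get; set; }
}

// Hypothetical data access class that fetches only the latest description ID.
public class SalesOrderStatusGateway {
   private readonly IQueryable<SalesOrderStatusRow> _status;

   public SalesOrderStatusGateway(IQueryable<SalesOrderStatusRow> status) {
      _status = status;
   }

   public SalesOrderStatusCode GetCurrentStatusCode(int salesOrderId) {
      // Project to the foreign key before First(), so only one value is requested.
      int descriptionId = _status
         .Where(s => s.SalesOrderId == salesOrderId)
         .OrderByDescending(s => s.CreationDate)
         .Select(s => s.SalesOrderStatusDescriptionId)
         .First();

      // Cast after the query, since the enum is not part of the mapping.
      return (SalesOrderStatusCode)descriptionId;
   }
}
```

SalesOrder.GetCurrentStatusCode could delegate to such a method when the status list has not been loaded, without any change to its callers.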

Outlook

In the next post we will continue the discussion of table relation transformations.

We will have a look at many-to-many relations, where we might need to work around the weaknesses of O/R-Mappers.

As the last part of the discussion about table relation transformations, we will have a look at versioned data.
