What is data lineage and why do you need it?

Hugh asked me to talk about data lineage for construction companies, but the conversation was more wide-ranging than that. First, we took a step back to describe the context in which data lineage makes sense.

When you serve your customers, you receive certain inputs, like construction plans, and you deliver certain outputs, like a building. To get from the inputs to the outputs, you go through a set of steps. These are your business processes.

To deliver these outputs efficiently, you use software systems to support your business processes. These can be ERP systems to keep track of orders and stock, financial systems to do your accounting, or CRM systems to keep track of customer contacts and sales opportunities. These systems contain what I call small data. The datasets can be large, but they are stored in a very structured way in the underlying databases of the transactional systems. There are very valuable use cases to be developed with big data as well, but let’s put these aside for now.

Xudo’s motto is: first put your small data in order, before starting with big data analytics.