It is important to have a broad vision of where you want to go, rather than simply starting with the data you have on hand. In the previous section we discussed KPIs, basket analysis and customer segmentation as ways to steer your organization towards customer intimacy. These use cases are distinctly different, but they have one thing in common: they use more or less the same source data. At a minimum, the dataset should contain past sales per product per customer to support the analyses we have in mind. Of course, it would be even better to have a few extra attributes for each record, such as sales rep, region or demographics.
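To make the shared-source-data point concrete, here is a minimal sketch in Python. The records, field names (`order_id`, `customer`, `product`, `amount`) and values are invented for illustration; your own dataset will look different. It only shows that one table of sales per product per customer can feed a KPI, a segmentation input and a basket analysis input.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical sales records: one row per product per customer per order.
sales = [
    {"order_id": 1, "customer": "C1", "product": "bread", "amount": 3.0},
    {"order_id": 1, "customer": "C1", "product": "butter", "amount": 2.5},
    {"order_id": 2, "customer": "C2", "product": "bread", "amount": 3.0},
    {"order_id": 2, "customer": "C2", "product": "butter", "amount": 2.5},
    {"order_id": 2, "customer": "C2", "product": "jam", "amount": 4.0},
    {"order_id": 3, "customer": "C3", "product": "jam", "amount": 4.0},
]

# KPI: total revenue over the whole dataset.
total_revenue = sum(row["amount"] for row in sales)

# Segmentation input: revenue per customer.
revenue_per_customer = defaultdict(float)
for row in sales:
    revenue_per_customer[row["customer"]] += row["amount"]

# Basket analysis input: how often two products appear in the same order.
baskets = defaultdict(set)
for row in sales:
    baskets[row["order_id"]].add(row["product"])

pair_counts = Counter()
for products in baskets.values():
    # Sort so that each pair is always counted under the same key.
    pair_counts.update(combinations(sorted(products), 2))
```

Extra attributes such as region or sales rep would simply become additional keys on each record, which is why it pays to think about them before collecting the data.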
A data analytics project can seldom be run without taking a closer look at the business processes and the way they are represented in the business applications. Most of the time, a high-level overview of the sources you would like to use is not enough. You need to analyze them in more detail to understand what the different fields in your applications actually mean. To link these systems together, you need to find unique identifiers at the right level of detail.
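One simple way to test whether a field can serve as a unique identifier is to count how often each value occurs. The sketch below assumes a hypothetical CRM extract with `customer_id` and `email` fields; the helper `duplicate_keys` is not part of any library, it is just an illustration of the check.

```python
from collections import Counter

def duplicate_keys(records, key_fields):
    """Return the key values that occur more than once in the records."""
    counts = Counter(tuple(r[f] for f in key_fields) for r in records)
    return sorted(key for key, n in counts.items() if n > 1)

# Hypothetical CRM extract; two customers share the same e-mail address.
crm_customers = [
    {"customer_id": "C1", "email": "ann@example.com"},
    {"customer_id": "C2", "email": "bob@example.com"},
    {"customer_id": "C3", "email": "ann@example.com"},
]

dups_by_email = duplicate_keys(crm_customers, ["email"])
dups_by_id = duplicate_keys(crm_customers, ["customer_id"])
```

In this invented data the e-mail address is not unique, so it would be a risky field to link systems on, while `customer_id` passes the check.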
The next step is to identify the source systems that hold the data we need. It can be just one system, for example the database of the cash register. Or it can be several systems, for example the database of your e-shop for the sales info and the CRM for more details on the customer. If you want to include gross margins per sale, you may also need to link the product database or the ERP. And if you want to go even further and look at the net margin per period, you will probably have to include the data from the financial system as well.
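The gross margin case can be sketched as follows. This is only an illustration under stated assumptions: the e-shop extract, the ERP cost lookup, the field names and the function `gross_margin` are all hypothetical, and gross margin is taken here as revenue minus purchase cost of the goods sold.

```python
# Hypothetical extract from the e-shop: one row per sale line.
eshop_sales = [
    {"sale_id": "S1", "product_id": "P1", "quantity": 2, "unit_price": 10.0},
    {"sale_id": "S2", "product_id": "P2", "quantity": 1, "unit_price": 25.0},
]

# Hypothetical extract from the ERP: purchase cost per product.
erp_unit_cost = {"P1": 6.0, "P2": 20.0}

def gross_margin(sale, unit_cost):
    """Return the gross margin of one sale line (revenue minus cost of goods)."""
    revenue = sale["quantity"] * sale["unit_price"]
    cost = sale["quantity"] * unit_cost[sale["product_id"]]
    return revenue - cost

# Gross margin per sale, keyed by the sale identifier.
margins = {s["sale_id"]: gross_margin(s, erp_unit_cost) for s in eshop_sales}
```

The sales system alone cannot produce this number; the calculation only works because both extracts share `product_id`.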
The process of combining data from different sources into a single, unified view is called data integration. The unique identifier you need to integrate the data may not exist yet. In that case, talk with the people who run these applications to explore how one could be added. Pragmatically, this could mean adding a mandatory custom field to your invoicing system where you fill in the project number that is used in your CRM. This allows you to tie the data of the two systems together. Check how this custom field shows up in the database or API of your invoicing system to make sure you can use it later in your data integration steps. In your accounting system you can use analytical accounting to add the unique identifiers you need as a cost center or cost bearer. Depending on the accounting software you use, this will be straightforward or more cumbersome, but it is seldom completely impossible.
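The project number example could look like the sketch below, which uses Python's built-in `sqlite3` module with an in-memory database. The table names, column names and rows are invented; in practice the two tables would be extracts from your CRM and your invoicing system. The first query integrates the data on the shared project number, the second lists invoices where the custom field was left empty or does not match any CRM project.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE crm_projects (project_number TEXT PRIMARY KEY, customer TEXT);
    CREATE TABLE invoices (invoice_id TEXT PRIMARY KEY,
                           project_number TEXT,  -- the custom field
                           amount REAL);
    INSERT INTO crm_projects VALUES ('PRJ-001', 'Acme'), ('PRJ-002', 'Globex');
    INSERT INTO invoices VALUES
        ('INV-1', 'PRJ-001', 1000.0),
        ('INV-2', 'PRJ-001', 500.0),
        ('INV-3', 'PRJ-002', 750.0),
        ('INV-4', NULL, 200.0);
""")

# Integrated view: invoiced amount per CRM customer.
invoiced_per_customer = dict(conn.execute("""
    SELECT p.customer, SUM(i.amount)
    FROM invoices AS i
    JOIN crm_projects AS p ON p.project_number = i.project_number
    GROUP BY p.customer
""").fetchall())

# Data quality check: invoices that cannot be tied to a CRM project.
unlinked_invoices = [row[0] for row in conn.execute("""
    SELECT i.invoice_id
    FROM invoices AS i
    LEFT JOIN crm_projects AS p ON p.project_number = i.project_number
    WHERE p.project_number IS NULL
    ORDER BY i.invoice_id
""")]

conn.close()
```

Invoices that end up in `unlinked_invoices` silently drop out of the integrated view, which is one reason to make the custom field mandatory.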
Once you have identified the systems that are in scope, you can start putting together a plan for collecting this data and transforming it into a format that is suitable for analysis. It is understandable if you don't want to build the whole environment before getting some results. You can start small with the systems that are easy to access, so that you can show the value of your work to the business stakeholders early. However, keep the future use cases in mind so that you can extend and scale the initial setup accordingly.