The state of #Agile according to Data Modellers

DataVersity released their snapshot survey on Data Modelling and the results aren’t good for Data Modellers who want more adoption of Agile by the Data Modelling community.

Before we get into the details, DataVersity is a great source of references and webinars for all things data. You can find them at DataVersity.

The Results

1) How important is it to know how to work with Agile Teams? (This was graded on a scale of 1-10; I’ve grouped the answers into categories.)

  • 27% thought it was important (1-3 on a scale of 10)
  • 27% were neutral (4-6 on a scale of 10)
  • 46% thought it was not important (7-10 on a scale of 10)

2) How much experience do you have with Agile or Scrum projects?

  • 12% – 10+ projects
  • 9% – 5-10 projects
  • 20% – 3-4 projects
  • 16% – 2 projects
  • 8% – 1 project
  • 33% – no experience

3) If you have been the data architect or data modeller on an Agile project, how satisfied were you with the results?

  • 22% – very satisfied
  • 46% – fairly satisfied
  • 32% – not satisfied

Analysis

What I took from these responses is that the largest group of respondents (46%) feels that Agile isn’t that important to them. But I hope there is some light at the end of the tunnel, as a good percentage of Data Modellers (41%) have either not been on an Agile project or have been on only one. In addition, of those who were on Agile projects, 68% reported that they were either very satisfied or fairly satisfied. Those are encouraging numbers. Although some Data Modellers don’t think Agile is important to them, it appears that once they are exposed to the methods, they view them positively.
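For anyone who wants to check the two combined figures, here is a minimal sketch in Python that adds up the percentages reported above. The dictionary and variable names are my own and purely illustrative; only the numbers come from the survey results.

```python
# Percentages as reported in the survey results above.
experience = {
    "10+ projects": 12,
    "5-10 projects": 9,
    "3-4 projects": 20,
    "2 projects": 16,
    "1 project": 8,
    "no experience": 33,
}
satisfaction = {
    "very satisfied": 22,
    "fairly satisfied": 46,
    "not satisfied": 32,
}

# Little or no Agile exposure: no experience plus a single project.
little_or_no_exposure = experience["no experience"] + experience["1 project"]

# Positive outcome: very satisfied plus fairly satisfied.
satisfied = satisfaction["very satisfied"] + satisfaction["fairly satisfied"]

print(little_or_no_exposure)  # 41
print(satisfied)              # 68
```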

Dimensional Modelling

Those of you who have read my blogs have seen my positive comments on Dimensional Modelling. Another concern I had with the results of the survey is the percentage of Data Modellers who feel that understanding Dimensional Modelling concepts is not important to them.

  • 28% thought it was important (1-3 on a scale of 10)
  • 34% were neutral (4-6 on a scale of 10)
  • 40% thought it was not important (7-10 on a scale of 10)

Given that all Data Modellers have had to model the data to support reporting, this is a somewhat concerning statistic. Dimensional Modelling is not new either, unlike XML and NoSQL data stores. Shockingly, 62% felt that knowledge of XML and NoSQL data stores is not important to them.

Summary

I think we have a ways to go to encourage the adoption of new processes, methods, and technologies in the Data Modelling and Database professions. Even within our own profession, items like Dimensional Modelling still lack full acceptance.

I would encourage every Data Modeller out there to learn about Agile and Agile methods. As we move into the second decade of Agile, its adoption is increasing and it will affect every type of project in the future. Understanding the methods will help the Data Modeller determine how best to integrate with Agile methods and practices.

I feel that Data Modellers must remember that we are providing a service to projects and the business. If the projects and the business are becoming more Agile, we also must become Agile. If not, the projects and businesses will be drawn towards other solutions and services that do align with their methods. This could drive more projects toward NoSQL and XML even when those projects are not good candidates for them.

 

Why #Dimensional Analysis should be done on every #datamodel

Those of you who have worked with me know of my fondness for Operational Data Stores. I have always believed in the importance of having an enterprise or holistic view of the data requirements for every application. An Operational Data Store seemed to be the perfect vehicle to ensure this happened. Perhaps my fondness was related to not wanting to stray too far from the normalization rules that I knew quite well. In this way, it was a new-ish discipline or context that really wasn’t new.

I always looked kinda sideways at those weird Dimensional modelers with their Star Schemas and Snowflakes. I mean, if they really put their minds to it, they would be able to figure out how to solve their data needs with a nice relational normalized Operational Data Store. Only exceptional and massive amounts of data require the Dimensional modeling constructs that these models typically use, right? I mean, what is so complicated about a model with only 100 main tables? Shouldn’t everyone know how to write SQL by hand?

On my latest project, I have had the opportunity to become re-introduced to Dimensional Analysis and modeling, and I have found the process fascinating and very valuable. Besides the obvious benefit of modeling the data in a way that allows the clients to efficiently write and execute queries, there was an unexpected benefit.

Taking a normalized Data Model and attempting to translate it into a Dimensional Model really challenges and validates your data model. It is easier to create a model with a multitude of complex relationships than it is to distill it down to a handful of FACTs and Dimensions. With so many relationships, it is possible for inconsistencies to exist and hide in the data model. I found multiple modeling errors in the process of trying to create a Dimensional model from my relational model. When you distill a relational model down to a Dimensional model, inconsistencies and errors become very apparent in the creation of the FACTs and Dimensions.
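To give a flavour of the kind of inconsistency that surfaces, here is a minimal sketch in Python. Everything in it is hypothetical: the rows, the column names, and the helper functions `build_dimension` and `find_conflicts` are made up for illustration and are not from my actual project. The idea is simply that when you collapse rows into a single dimension, any key that ends up with more than one set of attribute values stands out immediately.

```python
from collections import defaultdict

# Hypothetical rows flattened from a normalized model (all names are illustrative).
rows = [
    {"order_id": 1, "customer_id": "C1", "customer_region": "West", "amount": 100},
    {"order_id": 2, "customer_id": "C1", "customer_region": "East", "amount": 50},
    {"order_id": 3, "customer_id": "C2", "customer_region": "East", "amount": 75},
]

def build_dimension(rows, key, attributes):
    """Collect every distinct attribute combination seen for each dimension key."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[key]].add(tuple(row[a] for a in attributes))
    return seen

def find_conflicts(dimension):
    """Return the keys that map to more than one attribute combination."""
    return {k: sorted(v) for k, v in dimension.items() if len(v) > 1}

customer_dim = build_dimension(rows, "customer_id", ["customer_region"])
conflicts = find_conflicts(customer_dim)
print(conflicts)  # {'C1': [('East',), ('West',)]}
```

In this made-up data, customer C1 shows up with two regions. That is either a modeling error or a sign that the dimension needs to account for change over time, and either way it is something the relational model was quietly hiding.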

Dimensional Analysis also forces you to look at the data in a different way. Instead of a relational/hierarchical way, I find it forces me to look at the data in a chronological way and forces me to consider data changes, data history, and data latency in ways I may not have considered before. Not having to account for data across time and verify consistency at every point is quite a bit simpler.

Summary

I am a convert to using Dimensional Analysis on all my data models for validation of the data model and additional analysis of the data. I’ve discovered that I need to understand the data better to create a Dimensional Model than a normalized model. More factors need to be considered, and creating the Dimensional model with fewer objects requires that the data model has greater consistency, integrity, and cohesion.

Simple is hard.