The Agile Data (AD) Method

Thin Slicing: Enabling Continuous Data Warehousing

A fundamental concept of an agile approach to data warehousing is to slice value into thin, consumable pieces that may be deployed into production quickly. These thin slices of value are completely implemented (the analysis, design, programming, and testing are complete) and offer real business value. It is fairly clear how to thinly slice when you’re building a website or a mobile application. It isn’t clear how to do so if you’re building a data warehouse (DW), a business intelligence (BI) solution, or another type of data product. The focus of this article is to describe what thin slicing, sometimes called vertical slicing, means for a DW/BI solution. This article is organized into the following topics:
  1. What is thin slicing in a DW/BI solution?
  2. Why thin slicing?
  3. Thin slicing strategies for a DW/BI solution
  4. You can do this too
  5. What new skills will you need?
  6. What happens when you don’t thinly slice?
  7. Thin slicing in context
  8. Parting thoughts
  9. Recommended resources

1. What is Thin Slicing in a DW/BI Solution?

A thin slice is a top-to-bottom, fully implemented and tested piece of functionality that provides business value to someone. Most importantly, it should be possible to easily deploy a thin slice into production upon request. A slice can be very small, such as a single edit field on a screen, the implementation of a business rule or calculation, or the updated layout of a screen. Better yet, a thin slice addresses a question story, which is a small usage-oriented requirement for a DW/BI solution. An agile team accomplishes all of this implementation work during a single iteration/sprint. For teams following a lean delivery lifecycle, this timeframe typically shrinks to days and even hours in some cases.

A slice is fully implemented from beginning (the data sources) to end (accessibility by end users). This means that you have fully implemented (within a matter of days or even hours):

  • Extraction from the data source(s). This is required only for the data elements that you need for the given thin slice. For a mature DW you are likely to have most of the data elements already, and maybe even all of them. Worst case is when you need elements from one or more “new” data sources that you have never accessed before. This will require the initial work to gain access to the data source and to analyze the data source (ideally read its supporting documentation) to identify the data elements that you require.
  • Staging of the raw source data. I recommend that you stage the entire table the first time you access a source table, a strategy I’ve adopted from DataVault2. The implication is that you may already be staging the required data elements even if you’ve never needed them before. When that’s not the case you’ll need to do the work to stage the required tables. Of course, if your DW architecture doesn’t include staging incoming raw data first then this step should be skipped.
  • Transformation/cleansing of the source data. You need to do the work, if any, to transform the incoming source data for just the new data elements that you require for this slice.
  • Loading the data into the DW. Once again, you need to do this for just the new data elements required for this slice.
  • Loading your information marts. Do this when the data elements are needed in your information marts.
  • Updating the appropriate BI views/reports where needed. As you’ll soon see your slice may simply make some data available in the DW or marts for ad-hoc reporting.

A common theme running through all of those steps is that you only do the work for the vertical slice that you’re currently working on. You get the work done in a matter of days (and even hours once you get good at it) instead of weeks or months.
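To make the steps above concrete, here is a minimal sketch of one thin slice ("As a Professor I would like to know the names of my students") implemented end to end. It is an illustration under stated assumptions, not a reference implementation: the in-memory SQLite databases, the table names (student, stg_student, dw_student), the view rpt_student_names, and the cleansing rule are all hypothetical stand-ins for your real sources, staging area, DW, and BI layer.

```python
import sqlite3

# Hypothetical source system: a registration database with a student table.
source = sqlite3.connect(":memory:")
source.executescript("""
    CREATE TABLE student (id INTEGER PRIMARY KEY, full_name TEXT, program TEXT);
    INSERT INTO student VALUES (1, '  ada lovelace ', 'Math'),
                               (2, 'GRACE HOPPER', 'CS');
""")

# Hypothetical warehouse holding a staging table, a DW table, and a reporting view.
dw = sqlite3.connect(":memory:")
dw.executescript("""
    CREATE TABLE stg_student (id INTEGER, full_name TEXT, program TEXT);
    CREATE TABLE dw_student (student_id INTEGER PRIMARY KEY, student_name TEXT);
    CREATE VIEW rpt_student_names AS
        SELECT student_name FROM dw_student ORDER BY student_name;
""")


def extract_and_stage(src, dst):
    """Stage the entire source table as-is, even though this slice needs one column."""
    rows = src.execute("SELECT id, full_name, program FROM student").fetchall()
    dst.execute("DELETE FROM stg_student")
    dst.executemany("INSERT INTO stg_student VALUES (?, ?, ?)", rows)


def transform_and_load(dst):
    """Cleanse and load only the element this slice requires: the student name."""
    staged = dst.execute("SELECT id, full_name FROM stg_student").fetchall()
    # Assumed cleansing rule: collapse stray whitespace and normalize capitalization.
    cleaned = [(sid, " ".join(name.split()).title()) for sid, name in staged]
    dst.executemany(
        "INSERT OR REPLACE INTO dw_student (student_id, student_name) VALUES (?, ?)",
        cleaned,
    )


def student_names(dst):
    """Read the slice through the reporting view, as an end user would."""
    return [row[0] for row in dst.execute("SELECT student_name FROM rpt_student_names")]


extract_and_stage(source, dw)
transform_and_load(dw)
dw.commit()
names = student_names(dw)
print(names)
```

Note that staging copies every column of the source table, while the transform and load steps touch only the one element the slice needs. A later slice that needs the program column would find it already staged.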

2. Why Thin Slicing?

Thin slicing is a fundamental agile technique for breaking up a large piece of functionality into something that is easier, safer, and quicker to work with. There are several critical benefits of thin slicing the implementation of a DW/BI solution:

  1. Reduce the feedback cycle. By focusing on delivering small, thin slices of value you have more opportunities to show working functionality to stakeholders and thereby receive concrete feedback that you can act on. This enables your stakeholders to steer your work more effectively. It also motivates your team to test throughout the lifecycle, thereby reducing your overall cost of fixing any found defects dramatically.
  2. Increased ability to meet actual stakeholder needs. By taking a flexible, evolutionary approach to developing your DW/BI solution where you regularly seek feedback, you end up discovering what your stakeholders actually need in practice. With a traditional approach where you attempt to think everything through up front, the best you can possibly do is to build something to specification. This is unfortunately ineffective because people are not good at defining their needs up front, and even if they were they would change their minds anyway due to changes in the marketplace.
  3. More competitive. Delivering in small, incremental slices enables your team to react to changing requirements quickly. The ability to deploy these thin slices easily enables your organization to react quickly to marketplace dynamics, thereby increasing your competitiveness.
  4. Increased quality. Thin slicing forces data professionals to adopt modern, agile database techniques that have a significantly greater focus on quality than do traditional techniques. Agile DB techniques such as database refactoring and database regression testing are clearly focused on data quality.
  5. Lower implementation risk. Working in small thin slices forces the team to fully integrate and test their solution very early in the lifecycle. If there are integration issues they will be found much earlier in the lifecycle when they are easier and less expensive to address.
  6. Reduced cost of delay. Delivering in thin slices enables teams to get working functionality into the hands of their stakeholders quickly, reducing overall cost of delay (opportunity cost from a management accounting point of view).
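The database regression testing mentioned in the list above can be as lightweight as a handful of queries that you rerun after every slice. The following is a minimal sketch, assuming a hypothetical dw_student table in an in-memory SQLite database; the two checks are examples of the kind of rule you might protect, not a prescribed set.

```python
import sqlite3


def build_warehouse():
    """Create a tiny, hypothetical DW table with sample rows for the test."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dw_student (student_id INTEGER, student_name TEXT);
        INSERT INTO dw_student VALUES (1, 'Ada Lovelace'), (2, 'Grace Hopper');
    """)
    return conn


def regression_failures(conn):
    """Run each check; a check passes when its query counts zero offending rows."""
    checks = {
        "student_name is never null or blank":
            "SELECT COUNT(*) FROM dw_student "
            "WHERE student_name IS NULL OR TRIM(student_name) = ''",
        "student_id is unique":
            "SELECT COUNT(*) FROM (SELECT student_id FROM dw_student "
            "GROUP BY student_id HAVING COUNT(*) > 1)",
    }
    return [name for name, sql in checks.items()
            if conn.execute(sql).fetchone()[0] != 0]


conn = build_warehouse()
failures = regression_failures(conn)
print(failures or "all checks passed")

# Simulate a defect introduced by a later slice: a duplicate id with a missing name.
conn.execute("INSERT INTO dw_student VALUES (2, NULL)")
failures_after_defect = regression_failures(conn)
print(failures_after_defect)
```

Because each slice is small, a suite like this stays cheap to run after every change, which is what lets integration problems surface early rather than at the end of the project.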

There are several common complaints about working this way, but they rarely seem to hold water in practice. These complaints are:

  1. It takes longer to deliver the overall solution. No, the traditional/serial approach tends to take longer in practice due to less sense of urgency and the likelihood that the team will spend time building functionality that stakeholders don’t actually want (because they built to the specification). By building incrementally you deliver smaller, valuable functionality into production sooner thereby reducing cost of delay.
  2. We need to think everything through at the beginning. This is actually a good idea, as long as we do so in an agile manner. Agile modeling techniques such as agile requirements envisioning and agile architectural envisioning exist to do exactly this. Think through the big issues up front, but explore the details at the last-most responsible moment.
  3. It’s more expensive in the long run. This is also very rare in practice. Furthermore, the real issue is producing value, not what the expense of doing so is. Teams that deliver continuous value enjoy higher levels of ROI on average than traditional teams because they work in priority order and deliver incrementally (once again, reducing cost of delay).

3. Thin Slicing Strategies for a DW/BI Solution

There are several strategies that you can employ when thin slicing the requirements for a DW/BI solution. The following table describes these strategies, with example question stories for each strategy as well as some advice for when to apply it.

Table 1. Vertical slicing strategies for a DW/BI solution.

Slicing strategy: One new data element from a single data source
Example question stories:
  • As a Professor I would like to know the names of my students so that I know who should be there
  • As a Student I would like to know what courses are taught at the university
When to do this: Very early days when you are still building out fundamental infrastructure components. Very common for the first iteration or two of Construction. These slices still add real business value, albeit minimal.

Slicing strategy: One new data element from several sources
Example question stories:
  • As a Professor I would like the student list for a seminar that I teach
  • As a Student I would like to know what seminars are being taught this semester
When to do this: Early days during Construction when you are still building out the infrastructure. These slices add some business value, often fleshing out a DW data element to include the full range of data values for it.

Slicing strategy: A change to an existing report
Example question stories:
  • As a Professor I would like to know the standard deviation of marks within a seminar that I teach
  • As a Student I would like to know how many spots are still available in a seminar
When to do this: Evolution of existing functionality to support new decision making.

Slicing strategy: A new report
Example question stories:
  • As a Professor I would like to know the distribution curve of student marks in a seminar that I teach so I may adjust accordingly
  • As a Registrar I would like to know what Seminars are close to being full
When to do this: Several iterations into Construction when the DW/BI solution has been built up sufficiently.

Slicing strategy: A new reporting view
Example question stories:
  • As a Registrar I would like to know what the prerequisites are for a seminar so that I can advise students
  • As a Professor I would like to know the current course load of each student within a seminar that I teach
When to do this: Several iterations into Construction when the DW/BI solution has been built up sufficiently.

Slicing strategy: A new DW/DM table
Example question stories:
  • As a Chancellor I would like to track the revenues generated from parking pay meters to identify potential profits to divert to supporting students
  • As a Professor I would like to recommend suggested readings to help people prepare before taking a seminar
When to do this: Several iterations into Construction when the DW/BI solution has been built up sufficiently.
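As an illustration of how small a "change to an existing report" slice can be, here is a sketch of the two example stories for that strategy: the standard deviation of marks within a seminar, and the spots still available. The schema (seminar, enrollment), the column names, and the sample rows are hypothetical. The choice of population standard deviation is also an assumption, made on the basis that the enrolled students are the entire group of interest; your stakeholders may want the sample version instead.

```python
import sqlite3
import statistics

# Hypothetical DW tables with a few sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE seminar (seminar_id INTEGER PRIMARY KEY, capacity INTEGER);
    CREATE TABLE enrollment (seminar_id INTEGER, student_id INTEGER, mark REAL);
    INSERT INTO seminar VALUES (101, 5);
    INSERT INTO enrollment VALUES (101, 1, 60), (101, 2, 70), (101, 3, 80);
""")


def mark_std_dev(conn, seminar_id):
    """Population standard deviation of the recorded marks in one seminar."""
    rows = conn.execute(
        "SELECT mark FROM enrollment WHERE seminar_id = ? AND mark IS NOT NULL",
        (seminar_id,),
    )
    return statistics.pstdev(row[0] for row in rows)


def spots_available(conn, seminar_id):
    """Capacity minus current enrollment, or None for an unknown seminar."""
    row = conn.execute(
        """
        SELECT s.capacity - COUNT(e.student_id)
        FROM seminar s
        LEFT JOIN enrollment e ON e.seminar_id = s.seminar_id
        WHERE s.seminar_id = ?
        GROUP BY s.seminar_id, s.capacity
        """,
        (seminar_id,),
    ).fetchone()
    return None if row is None else row[0]


print(round(mark_std_dev(conn, 101), 2))
print(spots_available(conn, 101))
```

Neither function requires new extraction, staging, or loading work. Both build on data that earlier slices already delivered, which is why this kind of slice can often be completed in hours.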

There are several interesting things about the question stories in the table:

  1. They are written from the point of view of your stakeholders. They aren’t a technical specification. For example, the first story describes how professors want a list of student names, but it doesn’t say which data source(s) the names come from or what the element names are. These are design issues, not requirement issues.
  2. They always provide business value. The first story appears to be the beginnings of an attendee list for a seminar. Having something as simple as a list of names does in fact provide a bit of value to professors.
  3. Sometimes that business value isn’t (yet) sufficient. It may take several iterations to implement something that your stakeholders want delivered into production, particularly at first. For example, although a list of student names is the beginnings of a class list it might not be enough functionality to justify putting it into production. Professors also need to know the program that the student is enrolled in, their current year of study, and basic information about the seminar. The decision as to whether the functionality is sufficient to ship is in the hands of your stakeholders (this is one of the reasons why you want to demo your work on a regular basis).

4. You Can Do This Too

This can be hard to hear sometimes, but you’re not special. Others are in fact doing this successfully, and many have been doing so for years. Yes, just like you, they had to deal with:

  • Solving hard problems
  • Legacy data sources that were rarely perfect
  • Legacy data sources that were not under their control, sometimes owned by people difficult to work with, and sometimes not even owned by their organization
  • Stakeholders who change their minds, or ask for fixed budgets, or ask for exact delivery dates, and of course ask for any combinations thereof
  • Teams made up of experienced data professionals whose culture tells them that they need to do detailed modeling up front

5. What New Skills Will You Need?

Vertical slicing is an important agile skill in general. Other agile data skills enable vertical slicing of a DW/BI solution, including the database refactoring and database regression testing techniques mentioned earlier.

 

6. What Happens When You Don’t Thinly Slice?

Teams that don’t know how to thin slice their work often fall into one of the Mini-Waterfall or Staggered Mini-Waterfall process anti-patterns. Neither of these strategies is agile. Let’s explore each of them and see why.

Figure 1 below depicts a mini-waterfall approach where a team works through the traditional phases, mostly in order, throughout an iteration/sprint. These iterations are typically longer than usual, often four or more weeks in length, whereas 80% of agile teams have iterations of two weeks or less. Mini-waterfalls are common with teams that are very new to agile, and in this case should be seen as a step in the right direction away from the traditional/serial approach towards an agile approach. However, if you’re taking a mini-waterfall approach because of one or more of the challenges discussed earlier (see You Can Do This Too), then what’s really happened is that the team is using one of those flimsy excuses for not making the behavioural changes required to be truly effective.

Figure 1. A mini-waterfall.

 

Figure 2 depicts the Staggered Mini-Waterfall anti-pattern. The team is organized into functional silos such as data analysts, data architects/designers, developers, and testers. The analysts do their “sprint” where they complete the data analysis work for one or more stories. They then hand this off to the designers who do their “design sprint”, who hand off to the developers to do their “implementation sprint”, and finally to the testers who do their “testing sprint.” Once the analysts hand off their work to the designers they move on to analyze the next batch of requirements (often user stories). Once again, at best this might be a step towards becoming agile but it certainly isn’t agile. Many times the team is composed of people who are overly specialized and still need to learn the modern agile database skills of the agile database techniques stack. Remember, agilists strive to become cross-functional generalizing specialists. Being overly specialized is OK if you’re just starting out with agile. As we like to say, you go to war with the army that you’ve got, so if everyone is a specialist then that’s how you start out. But when you invest in your people, and when team members recognize the importance of learning new skills, they can quickly learn those skills from one another.

Figure 2. Staggered mini-waterfalls.

As we show in the article Continuous Data Warehousing, it is in fact possible for DW/BI teams to work in an agile or even continuous manner. There is absolutely no reason, except as a step in your team’s overall learning effort, to follow either a Mini-Waterfall or Staggered Mini-Waterfall approach. You can and should do better.

7. Thin Slicing in Context

The following table summarizes the trade-offs associated with thin slicing and provides advice for when (not) to adopt it.

Advantages
  • Deliver high-value functionality sooner
  • Reduces risk
  • Enables opportunity for feedback
Disadvantages
  • Requires clean architecture, design, and implementation
  • Requires “full stack” data capability within a team to develop and evolve the solution
When to Adopt This Practice
  • Whenever your development team takes an agile approach, you will need to vertically slice your work.

8. Parting Thoughts

Thin slicing is an important skill for any agile team, regardless of what they are building. In this article you learned that it is highly desirable to do so for a DW/BI solution and more importantly that the techniques exist to do so. For most people the hardest thing about thin slicing is to adopt the agile mindset behind working this way, something that can be very tough for experienced data professionals given the cultural impedance mismatch between traditional data professionals and modern agile practitioners.

9. Recommended Resources


Recommended Reading

Choose Your WoW! 2nd Edition
This book, Choose Your WoW! A Disciplined Agile Approach to Optimizing Your Way of Working (WoW) – Second Edition, is an indispensable guide for agile coaches and practitioners. It overviews key aspects of the Disciplined Agile® (DA™) tool kit. Hundreds of organizations around the world have already benefited from DA, which is the only comprehensive tool kit available for guidance on building high-performance agile teams and optimizing your WoW. As a hybrid of the leading agile, lean, and traditional approaches, DA provides hundreds of strategies to help you make better decisions within your agile teams, balancing self-organization with the realities and constraints of your unique enterprise context.

I also maintain an agile database books page which overviews many books you will find interesting.