Thin Slicing: Enabling Continuous Data Warehousing
- What is thin slicing in a DW/BI solution?
- Why thin slicing?
- Thin slicing strategies for a data warehouse
- You can do this too
- What new skills will you need?
- What happens when you don’t thinly slice?
- Thin slicing in context
- Parting thoughts
- Recommended resources
1. What is Thin Slicing in a DW/BI Solution?
A thin slice is a top-to-bottom, fully implemented and tested piece of functionality that provides business value to someone. Most importantly, it should be possible to easily deploy a thin slice into production upon request. A slice can be very small, such as a single edit field on a screen, the implementation of a business rule or calculation, or the updated layout of a screen. Better yet, a thin slice addresses a question story, which is a small usage-oriented requirement for a DW/BI solution. An agile team accomplishes all of this implementation work during a single iteration/sprint. For teams following a lean delivery lifecycle, this timeframe typically shrinks to days, and even hours in some cases.
A slice is fully implemented from beginning (the data sources) to end (accessibility by end users). This means that you have fully implemented (within a matter of days or even hours):
- Extraction from the data source(s). This is required only for the data elements that you need for the given thin slice. For a mature DW you are likely to have most of the data elements already, and maybe even all of them. Worst case is when you need elements from one or more “new” data sources that you have never accessed before. This will require the initial work to gain access to the data source and to analyze the data source (ideally read its supporting documentation) to identify the data elements that you require.
- Staging of the raw source data. I recommend that whenever you access a source table for the first time you stage the entire table at that point, a strategy I’ve adopted from DataVault2. The implication is that you may already be staging the required data elements even if you’ve never needed them before. When that’s not the case you’ll need to do the work to stage the required tables. Of course, if your DW architecture doesn’t include staging incoming raw data first then this step should be skipped.
- Transformation/cleansing of the source data. You need to do the work, if any, to transform the incoming source data for just the new data elements that you require for this slice.
- Loading the data into the DW. Once again, you need to do this for just the new data elements required for this slice.
- Loading your information marts. Do this when the data elements are needed in your information marts.
- Updating the appropriate BI views/reports where needed. As you’ll soon see your slice may simply make some data available in the DW or marts for ad-hoc reporting.
A common theme running through all of those steps is that you only do the work for the vertical slice that you’re currently working on. You get the work done in a matter of days (and even hours once you get good at it) instead of weeks or months.
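The steps above can be sketched in a few lines of code. The following is a minimal illustration, not a reference implementation: it uses Python's built-in sqlite3 module with a single in-memory database standing in for the source system, staging area, warehouse, and reporting layer. The table and view names (src_student, stg_student, dw_student, rpt_student_names) and the sample rows are hypothetical, chosen to match the "list of student names" story discussed later in this article.

```python
import sqlite3


def run_thin_slice(conn):
    """Implement one slice end to end: a list of student names."""
    cur = conn.cursor()

    # Extraction and staging: the first time the source table is touched,
    # stage the whole table, even though this slice needs only one column.
    cur.execute("DROP TABLE IF EXISTS stg_student")
    cur.execute("CREATE TABLE stg_student AS SELECT * FROM src_student")

    # Transformation and load: cleanse and load only the data element
    # that this slice requires (the student name).
    cur.execute(
        "CREATE TABLE IF NOT EXISTS dw_student ("
        "student_id INTEGER PRIMARY KEY, student_name TEXT)"
    )
    cur.execute(
        """
        INSERT OR REPLACE INTO dw_student (student_id, student_name)
        SELECT id, TRIM(name)
        FROM stg_student
        WHERE name IS NOT NULL AND TRIM(name) <> ''
        """
    )

    # Reporting: expose the new element through a view so that it is
    # available for ad-hoc queries.
    cur.execute(
        "CREATE VIEW IF NOT EXISTS rpt_student_names AS "
        "SELECT student_name FROM dw_student"
    )
    conn.commit()

    rows = cur.execute(
        "SELECT student_name FROM rpt_student_names ORDER BY student_name"
    )
    return [row[0] for row in rows]


def demo():
    """Build a tiny hypothetical source table and run the slice against it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE src_student (id INTEGER, name TEXT, program TEXT)")
    conn.executemany(
        "INSERT INTO src_student VALUES (?, ?, ?)",
        [(1, "  Ada Lovelace ", "Math"), (2, "Alan Turing", "CS"), (3, None, "CS")],
    )
    return run_thin_slice(conn)


if __name__ == "__main__":
    print(demo())
```

Note that the program column is staged but not loaded into the warehouse. A later slice that needs it would only have to extend the transformation, load, and view steps, because the raw data is already in staging.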
2. Why Thin Slicing?
Thin slicing is a fundamental agile technique for breaking up a large piece of functionality into something that is easier, safer, and quicker to work with. There are several critical benefits of thin slicing the implementation of a DW/BI solution:
- Reduce the feedback cycle. By focusing on delivering small, thin slices of value you have more opportunities to show working functionality to stakeholders and thereby receive concrete feedback that you can act on. This enables your stakeholders to steer your work more effectively. It also motivates your team to test throughout the lifecycle, thereby reducing your overall cost of fixing any found defects dramatically.
- Increased ability to meet actual stakeholder needs. By taking a flexible, evolutionary approach to developing your DW/BI solution where you regularly seek feedback you end up discovering what your stakeholders actually need in practice. With a traditional approach where you attempt to think everything through up front the best you can possibly do is to build something to specification – this is unfortunately ineffective because people are not good at defining their needs up front and even if they were they would change their minds anyway due to changes in the marketplace.
- More competitive. Delivering in small, incremental slices enables your team to react to changing requirements quickly. The ability to deploy these thin slices easily enables your organization to react quickly to marketplace dynamics, thereby increasing your competitiveness.
- Increased quality. Thin slicing forces data professionals to adopt modern, agile database techniques that have a significantly greater focus on quality than do traditional techniques. Agile DB techniques such as database refactoring and database regression testing are clearly focused on data quality.
- Lower implementation risk. Working in small thin slices forces the team to fully integrate and test their solution very early in the lifecycle. If there are integration issues they will be found much earlier in the lifecycle when they are easier and less expensive to address.
- Reduced cost of delay. Delivering in thin slices enables teams to get working functionality into the hands of their stakeholders quickly, reducing overall cost of delay (opportunity cost from a management accounting point of view).
There are several common complaints about working this way, but they rarely seem to hold water in practice. These complaints are:
- It takes longer to deliver the overall solution. No, the traditional/serial approach tends to take longer in practice due to less sense of urgency and the likelihood that the team will spend time building functionality that stakeholders don’t actually want (because they built to the specification). By building incrementally you deliver smaller, valuable functionality into production sooner thereby reducing cost of delay.
- We need to think everything through at the beginning. This is actually a good idea, as long as we do so in an agile manner. Agile modeling techniques such as agile requirements envisioning and agile architectural envisioning exist to do exactly this. Think through the big issues up front, but explore the details at the last-most responsible moment.
- It’s more expensive in the long run. This is also very rare in practice. Furthermore, the real issue is producing value, not what the expense of doing so is. Teams that deliver continuous value enjoy higher levels of ROI on average than traditional teams because they work in priority order and deliver incrementally (once again, reducing cost of delay).
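The cost-of-delay argument that recurs in this section can be made concrete with some simple arithmetic. The sketch below compares delivering six slices one at a time against releasing all six together at the end. Every number in it is an assumption made up for illustration: six slices, each worth 1,000 per week once live, a 26-week horizon, and both approaches finishing their last piece of work in week 12.

```python
def cumulative_value(release_weeks, weekly_value_per_slice, horizon_weeks):
    """Total value earned by the end of the horizon.

    release_weeks holds the week in which each slice goes live. A live
    slice earns weekly_value_per_slice for every remaining week.
    """
    return sum(
        max(0, horizon_weeks - week) * weekly_value_per_slice
        for week in release_weeks
    )


def compare():
    """Return (incremental value, big-bang value, cost of delay)."""
    horizon_weeks = 26
    weekly_value = 1000

    # Incremental delivery: one slice goes live every two weeks.
    incremental = cumulative_value([2, 4, 6, 8, 10, 12], weekly_value, horizon_weeks)

    # Serial delivery: all six slices go live together in week 12.
    big_bang = cumulative_value([12] * 6, weekly_value, horizon_weeks)

    return incremental, big_bang, incremental - big_bang


if __name__ == "__main__":
    print(compare())  # (114000, 84000, 30000)
```

Under these assumptions the incremental team earns 30,000 more over the horizon for the same amount of work, purely because the earlier slices start producing value sooner. The gap widens if the serial approach also finishes later, as argued above.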
3. Thin Slicing Strategies for a DW/BI Solution
There are several strategies that you can choose to employ with thin slicing the requirements for a DW/BI solution. The following table describes these strategies. There are example question stories for each strategy as well as some advice for when to apply each strategy.
Table 1. Vertical slicing strategies for a DW/BI solution.
Slicing Strategy | Example Question Stories | When to Do This |
One new data element from a single data source | | Very early days when you are still building out fundamental infrastructure components. Very common for the first iteration or two of Construction. These slices still add real business value, albeit minimal. |
One new data element from several sources | | Early days during Construction when you are still building out the infrastructure. These slices add some business value, often fleshing out a DW data element to include the full range of data values for it. |
A change to an existing report | | Evolution of existing functionality to support new decision making. |
A new report | | Several iterations into Construction when the DW/BI solution has been built up sufficiently. |
A new reporting view | | Several iterations into Construction when the DW/BI solution has been built up sufficiently. |
A new DW/DM table | | Several iterations into Construction when the DW/BI solution has been built up sufficiently. |
There are several interesting things about the question stories in the table:
- They are written from the point of view of your stakeholders. They aren’t a technical specification. For example, the first story describes how professors want a list of student names, but it doesn’t say which data source(s) to use or what the element names are. These are design issues, not requirement issues.
- They always provide business value. The first story appears to be the beginnings of an attendee list for a seminar. Having something as simple as a list of names does in fact provide a bit of value to professors.
- Sometimes that business value isn’t (yet) sufficient. It may take several iterations to implement something that your stakeholders want delivered into production, particularly at first. For example, although a list of student names is the beginnings of a class list it might not be enough functionality to justify putting it into production. Professors also need to know the program that the student is enrolled in, their current year of study, and basic information about the seminar. The decision as to whether the functionality is sufficient to ship is in the hands of your stakeholder (this is one of the reasons why you want to demo your work on a regular basis).
4. You Can Do This Too
This can be hard to hear sometimes, but you’re not special. Others have in fact been doing this successfully, often for years. Yes, just like you, they had to deal with:
- Solving hard problems
- Legacy data sources that were rarely perfect
- Legacy data sources that were not under their control, sometimes owned by people difficult to work with, and sometimes not even owned by their organization
- Stakeholders who change their minds, or ask for fixed budgets, or ask for exact delivery dates, and of course ask for any combinations thereof
- Teams made up of experienced data professionals whose culture tells them that they need to do detailed modeling up front
5. What New Skills Will You Need?
Vertical slicing is an important agile skill in general. Other agile data skills enable vertical slicing of a DW/BI solution. These skills include:
- Agile analytics – To explore legacy data sources.
- Agile data modeling – For streamlined initial modeling and evolutionary detailed modeling of your data structures.
- Database refactoring – To safely and easily evolve existing databases, including your data warehouse and data marts.
- Database regression testing – To validate your work in an automated manner.
- Continuous database integration – To ensure changes are automatically regression tested.
- Continuous database deployment – To ensure working updates to your database are shared appropriately.
- Agile modeling in general – There’s more to modeling than data.
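To give a feel for how database refactoring and database regression testing work together, here is a minimal sketch using Python's built-in sqlite3 module. It is one possible illustration rather than a prescribed approach: the dw_student table, its columns, and the specific checks are all hypothetical. The refactoring renames a column in a cautious way by adding the new column next to the old one and keeping both during a transition period, and the regression checks are run before and after the change, which is the kind of step a continuous database integration job might automate.

```python
import sqlite3


def build_warehouse():
    """Create a tiny, hypothetical warehouse table with two rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE dw_student (student_id INTEGER PRIMARY KEY, student_name TEXT)"
    )
    conn.executemany(
        "INSERT INTO dw_student VALUES (?, ?)",
        [(1, "Ada Lovelace"), (2, "Alan Turing")],
    )
    return conn


def refactor_rename_column(conn):
    """Introduce full_name alongside student_name.

    The old column is kept so that existing reports keep working
    during the transition period.
    """
    conn.execute("ALTER TABLE dw_student ADD COLUMN full_name TEXT")
    conn.execute("UPDATE dw_student SET full_name = student_name")
    conn.commit()


def regression_checks(conn, expected_rows):
    """Return a dict of check name -> True (passed) or False (failed)."""
    def count(sql):
        return conn.execute(sql).fetchone()[0]

    checks = {
        "row_count": count("SELECT COUNT(*) FROM dw_student") == expected_rows,
        "no_null_names": count(
            "SELECT COUNT(*) FROM dw_student WHERE student_name IS NULL"
        ) == 0,
    }

    # Only check the new column once the refactoring has been applied.
    columns = [row[1] for row in conn.execute("PRAGMA table_info(dw_student)")]
    if "full_name" in columns:
        checks["columns_in_sync"] = count(
            "SELECT COUNT(*) FROM dw_student WHERE full_name IS NOT student_name"
        ) == 0
    return checks


def integrate():
    """Run checks, apply the refactoring, then run the checks again."""
    conn = build_warehouse()
    before = regression_checks(conn, expected_rows=2)
    if not all(before.values()):
        raise RuntimeError(f"Baseline checks failed: {before}")
    refactor_rename_column(conn)
    return regression_checks(conn, expected_rows=2)


if __name__ == "__main__":
    print(integrate())
```

Because the checks are cheap to run, they can be executed on every change, which is what makes it safe to evolve the schema one thin slice at a time.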
6. What Happens When You Don’t Thinly Slice?
Teams that don’t know how to thin slice their work often fall into either the Mini-Waterfall or the Staggered Mini-Waterfall process anti-pattern. Neither of these strategies is agile – let’s explore each of them and see why.
Figure 1 below depicts a mini-waterfall approach where a team works through the traditional phases, mostly in order, throughout an iteration/sprint. These iterations are typically longer than usual, often four or more weeks in length, whereas 80% of agile teams have iterations of two weeks or less. Mini-waterfalls are common with teams that are very new to agile and in this case should be seen as a step in the right direction away from the traditional/serial approach towards an agile approach. However, if you’re taking a mini-waterfall approach because of one or more of the reasons discussed earlier (see You Can Do This Too), then what’s really happened is that the team is using one of those flimsy excuses for not making the behavioural changes required to be truly effective.
Figure 1. A mini-waterfall.
As we show in the article Continuous Data Warehousing it is in fact possible for DW/BI teams to work in an agile or even continuous manner. There is absolutely no reason, except as a step in your team’s overall learning effort, to follow either a Mini-Waterfall or Staggered Mini-Waterfall approach. You can and should do better.
7. Thin Slicing in Context
The following table summarizes the trade-offs associated with thin slicing and provides advice for when (not) to adopt it.
Advantages | |
Disadvantages | |
When to Adopt This Practice | Whenever a development team takes an agile approach, it will need to vertically slice its work. |
8. Parting Thoughts
Thin slicing is an important skill for any agile team, regardless of what they are building. In this article you learned that it is highly desirable to do so for a DW/BI solution and more importantly that the techniques exist to do so. For most people the hardest thing about thin slicing is to adopt the agile mindset behind working this way, something that can be very tough for experienced data professionals given the cultural impedance mismatch between traditional data professionals and modern agile practitioners.
9. Recommended Resources
- Agile Analytics
- Agile Core Practices for DW/BI Teams
- The Agile Database Techniques Stack
- The Agile Modeling Web Site
- Continuous Data Warehousing
- Disciplined Agile Delivery (DAD)
- Look-Ahead Data Analysis
- Question Stories
- Workshop: Continuous Data Warehousing (DW): A Disciplined Hybrid Method for Practitioners – Two days
- Workshop: Continuous Data Warehousing for Leaders – One day