Recently, I Googled test data management (TDM). I’m not sure why I was surprised by the results, but I was. Most of the results on the first page focused on subsetting and masking production data. That is not the whole picture.

Why are subsetting and masking not the only answer?

  • Production copies will not deliver all the data required. Production data typically contains only 10-20% of the possible data combinations the code is written to handle.
  • Someone has to search through the data to see if what they need is available.
  • What happens if you don’t find the data you need? Someone has to create the missing data one way or another.

If you’re using production data to test with, you will have gaps in the available data.

Problems Ahead

If you’re using masked data from production and you don’t have to create any additional data, then the assumption is that you have 100% coverage in production. But is that really the case?

Were the test cases written to align to the available data? Or vice versa?

How do you find out what your production data coverage is? That is the job of a data coverage analysis tool.
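To make the idea of data coverage concrete, here is a minimal sketch in Python. It is not how any particular product works. The field names, allowed values and sample rows are all hypothetical, and it assumes each field has a small, known set of values.

```python
from itertools import product


def data_coverage(rows, domains):
    """Return (fraction of combinations present, sorted list of missing ones)."""
    fields = list(domains)
    # Every combination the code could be asked to handle.
    possible = set(product(*(domains[f] for f in fields)))
    # Every combination that actually appears in the data.
    present = {tuple(row[f] for f in fields) for row in rows}
    covered = present & possible
    missing = sorted(possible - present)
    return len(covered) / len(possible), missing


# Hypothetical masked production extract.
rows = [
    {"account_type": "checking", "status": "active"},
    {"account_type": "checking", "status": "active"},
    {"account_type": "savings", "status": "closed"},
]
# Hypothetical list of values each field is allowed to take.
domains = {
    "account_type": ["checking", "savings"],
    "status": ["active", "closed", "frozen"],
}

coverage, missing = data_coverage(rows, domains)
print(f"{coverage:.0%} of combinations covered")
for combo in missing:
    print("missing:", combo)
```

The missing combinations are the gaps someone would otherwise have to find by searching through the data by hand.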

How do you know if your test cases are hitting all the permutations? That is the job of a test case optimization tool.
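As a rough illustration of that question, the sketch below compares a set of test cases against every permutation of a few input fields. It reports the permutations no test hits and the tests that repeat one already covered. The test names, fields and values are made up for the example, and a real optimization tool does far more than this.

```python
from itertools import product


def analyze_test_cases(test_cases, domains):
    """Return (permutations no test hits, names of tests that repeat a permutation)."""
    fields = list(domains)
    all_perms = set(product(*(domains[f] for f in fields)))
    seen = set()
    redundant = []
    for name, criteria in test_cases:
        perm = tuple(criteria[f] for f in fields)
        if perm in seen:
            # Another test already exercises this exact permutation.
            redundant.append(name)
        seen.add(perm)
    return sorted(all_perms - seen), redundant


# Hypothetical input fields and test cases.
domains = {
    "customer": ["new", "existing"],
    "payment": ["card", "invoice"],
}
test_cases = [
    ("TC1", {"customer": "new", "payment": "card"}),
    ("TC2", {"customer": "existing", "payment": "card"}),
    ("TC3", {"customer": "new", "payment": "card"}),
]

untested, redundant = analyze_test_cases(test_cases, domains)
print("untested permutations:", untested)
print("redundant test cases:", redundant)
```

Three test cases sound like decent coverage of four permutations, but one of them is a repeat, so half of the permutations are never exercised.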

What if you find some of the data you need using masked production data, but you need more of it? You can ‘create’ more after the fact, but wouldn’t it be easier to create it while you’re subsetting? That’s where data cloning comes in.
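Here is a minimal sketch of the cloning idea, assuming a single table with a numeric key. The function name, column names and rows are hypothetical. A real cloning tool also has to keep related tables consistent, which this sketch ignores.

```python
from itertools import count


def subset_and_clone(rows, predicate, copies, key="id"):
    """Keep rows matching predicate and add `copies` clones of each with fresh keys."""
    # Start new keys above the highest existing key so clones never collide.
    next_key = count(max(row[key] for row in rows) + 1)
    result = []
    for row in rows:
        if not predicate(row):
            continue
        result.append(dict(row))
        for _ in range(copies):
            clone = dict(row)
            clone[key] = next(next_key)
            result.append(clone)
    return result


# Hypothetical masked production rows.
rows = [
    {"id": 1, "status": "frozen", "balance": 120},
    {"id": 2, "status": "active", "balance": 75},
    {"id": 3, "status": "frozen", "balance": 0},
]

# Subset to frozen accounts and make two extra copies of each.
subset = subset_and_clone(rows, lambda r: r["status"] == "frozen", copies=2)
for row in subset:
    print(row)
```

Two matching rows become six, each with its own key, in the same pass that produces the subset.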


How is it that all of these options are available but not every company is using them? If you look at what each TDM solution offers, you’ll find that the most focused solution is from Grid Tools. This is their area of expertise.

When you’re putting together an RFP, why not look at the company that is focused on TDM?

I had an RFP come in this week that focused only on masking and subsetting. It compares what my team offers against two other companies, so of course only masking and subsetting are on the list, because that’s all those companies can do. The proposal that Orasi delivered covers all areas of TDM.

We always start with a TDM assessment. Even if the client has done their own, it’s a best practice for us to evaluate the situation. Why? We’re an outside party with no attachment to any of the projects, teams or processes being evaluated, so we’re unbiased.

The proposal included best-practice methods such as using Grid Tools Datamaker for synthetic data generation, which is meant to provide full data coverage for all test cases. It also included data profiling, data coverage analysis, test case optimization and test data on demand, to strengthen their TDM practice and increase the quality of their QA testing.

Getting back to the reason I blogged all this… It’s tough to learn about TDM from Google. You’ve probably heard of most of the other companies on the first page. Do your project a favor and dig deeper. Look at the company that can do the full breadth of TDM.

For reference, the Grid Tools TDM application suite can:

  • Create synthetic data (Datamaker)
  • Match test case criteria to existing data (Test Matching)
  • Analyze the coverage of the existing data (Data Coverage)
  • Optimize test cases (Agile Designer)
  • Enable testers to create data on demand (Test Data On Demand)
  • Create additional data during subsetting (Data Cloning)

AND profile, subset and mask data. Yeah, they have applications that can do those things too!

