Automated data quality control
This talk is episode #2 of the ‘Data Quality’ series, which started at the previous conference, SQA Days 27. Last time we looked at some historical cases and studied the basic theory. We also discussed some data quality issues and even caught a few of them "on paper".
Now it is time for practice. The map is not the territory it represents, is it? Let’s see.
At our company, we have developed a framework for regular data quality monitoring. Today I’ll show how this framework is organized and what is needed to integrate it into a real environment. I’ll also demonstrate how it works for the main data-handling scenarios: historical data analysis, data migrations, and batch and stream data processing.