Finding a needle in a haystack. Testing Big Data
  • 40 min

What if you had to deal with a data warehouse containing trillions of records? When a full scan of a single table takes days, the number of data sources keeps growing, and every day they produce over 3 billion new records? How do you assure the sanity of your data while shipping several releases a day? How do you tell whether the data is still correct? The same way spacecraft explore entire planets with limited resources! We found a set of approaches that have already proven their efficiency. In this talk I will describe how we assure data consistency and how we learned to understand our data.
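The abstract doesn't spell out the specific approaches, but the spacecraft analogy (covering a huge territory with limited resources) suggests sampling-based checks instead of full table scans. As a hedged illustration only, here is a minimal sketch of one such technique, reservoir sampling: a single pass over a stream of records keeps a uniform random sample in fixed memory, and data-quality metrics (here, a hypothetical null-rate check on a made-up `price` field) are then estimated from that sample.

```python
import random

def reservoir_sample(stream, k, seed=42):
    """Keep a uniform random sample of k items from a stream of unknown
    length, using O(k) memory -- one pass, no full table scan."""
    rng = random.Random(seed)
    sample = []
    for i, item in enumerate(stream):
        if i < k:
            sample.append(item)
        else:
            # Item i replaces a sampled item with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                sample[j] = item
    return sample

def null_rate(sample, field):
    """Estimate the fraction of records missing a field from the sample."""
    return sum(1 for rec in sample if rec.get(field) is None) / len(sample)

# Simulated source: 100k records where exactly 10% lack a "price" value.
stream = ({"id": i, "price": None if i % 10 == 0 else i * 1.5}
          for i in range(100_000))
sample = reservoir_sample(stream, 1_000)
rate = null_rate(sample, "price")
# rate should land close to the true 10% without ever scanning the
# whole table into memory.
```

The names and the null-rate metric here are assumptions for illustration; the talk itself covers the team's actual consistency checks.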

Finding a needle in a haystack. Testing Big Data (video) from Vlad Orlikov on Vimeo.
