Avro and Kafka: A Tester’s Challenges in the World of Big Data

  • 20 min

Avro is a convenient format for working with data streams in Kafka and for storage in Big Data systems, but in practice it often becomes a source of problems for QA engineers and analysts.
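
Part of what makes Avro both compact and opaque is its binary encoding: integers, for example, are zigzag-encoded and then written as variable-length 7-bit groups, so the raw bytes mean nothing without the schema. A minimal illustrative sketch in Python (helper names are ours, not from any Avro library):

```python
def encode_long(n: int) -> bytes:
    """Avro binary encoding of a long: zigzag, then variable-length 7-bit groups."""
    z = (n << 1) ^ (n >> 63)           # zigzag: small magnitudes -> small codes
    out = bytearray()
    while z > 0x7F:
        out.append((z & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        z >>= 7
    out.append(z)
    return bytes(out)

def decode_long(data: bytes) -> int:
    """Inverse of encode_long (reads one value from the start of data)."""
    z, shift = 0, 0
    for b in data:
        z |= (b & 0x7F) << shift
        shift += 7
        if not b & 0x80:
            break
    return (z >> 1) ^ -(z & 1)         # undo zigzag

print(encode_long(-1).hex())   # 01
print(encode_long(64).hex())   # 8001
```

The same value occupies one byte or several depending on its magnitude, which is exactly why you cannot eyeball field boundaries in an Avro file the way you can in JSON or CSV.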

In this talk, we will look at the key problems we encountered when testing reporting with binary source data: 

  • the inability to quickly customise data for verifying different scenarios; 
  • the long and complex process of finding specific transactions across large sets of files in storage;
  • the limitations of working with Kafka without a schema registry. 
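
The schema-registry limitation is concrete: producers that use Confluent's wire format prepend a magic byte and a 4-byte schema ID to every Avro payload, so without access to the registry the message body stays undecodable. A sketch of splitting that envelope (assuming the Confluent wire format; function and variable names are ours):

```python
import struct

def split_confluent_envelope(msg: bytes):
    """Split a Confluent wire-format Kafka message into (magic, schema_id, avro_body).

    Layout: 1 magic byte (0x00), a 4-byte big-endian schema ID, then raw Avro binary.
    The body is only decodable after fetching the schema with that ID from the registry.
    """
    if len(msg) < 5:
        raise ValueError("message too short for Confluent wire format")
    magic, schema_id = struct.unpack(">bI", msg[:5])
    if magic != 0:
        raise ValueError(f"unexpected magic byte: {magic}")
    return magic, schema_id, msg[5:]

# Fabricated message for illustration: schema ID 42, opaque Avro bytes.
msg = b"\x00" + (42).to_bytes(4, "big") + b"\x02\x06foo"
print(split_confluent_envelope(msg))
```

Without a registry, the best a tester can do by hand is recover the schema ID; everything after byte 5 remains an opaque blob.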

These constraints significantly slow down and complicate testing. To overcome them, we moved from "manual hacks" and Jupyter notebooks to more universal solutions: developing our own converters and adopting Trino as a platform-level tool for analysis.

The audience will take away not just an overview of a popular binary format, but also a practical set of ideas and tools for working with Avro that help reduce routine work and keep the focus on validating business logic.
