Get the batch out of here!!

Posted on 16 September 2015 in Big Data
by Emile Bakker
Waiting for a file or document is something of the past century. In a business process especially, the transfer of information is one of the major obstacles. Ever since Google Docs (and Office 365) came around, we have all learned that collaborative (parallel) access to a document works far better than single (serial) access.


Serial batch pain

Yet this serial access pain is still a daily ingredient of a market researcher's life. Pretty much all quantitative projects within market research revolve around collecting data, and pretty much all of these collection processes make data available through an export process. Working with an export process means that at defined times a snapshot of the data is taken and passed on as a static bucket. Because it is static, this bucket becomes outdated as soon as the next export is made, so any work done on it needs to be merged with the next static batch. Due to this pain, most researchers work with only two buckets during the course of a project: a 10% check, where they move a bucket of data as soon as 10% of the completes are in, and the end-of-project data file. But that is where it stops.
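To make the merge pain concrete, here is a minimal Python sketch of what carrying work over from one static bucket to the next can look like. The field names (`respondent_id`, `q1`, `quality_flag`) and the helper `carry_over_work` are hypothetical and only serve as illustration; a real export will have its own layout and keys.

```python
def carry_over_work(previous_work, new_export, key="respondent_id"):
    """Re-apply manual work from an older snapshot onto a newer export.

    Both arguments are lists of dicts; `key` names the (assumed) field
    that identifies a respondent in every export.
    """
    work_by_id = {row[key]: row for row in previous_work}
    merged = []
    for record in new_export:
        combined = dict(record)
        earlier = work_by_id.get(record[key])
        if earlier is not None:
            # Only add fields the new export lacks; fresh data wins.
            for field, value in earlier.items():
                combined.setdefault(field, value)
        merged.append(combined)
    return merged


# Snapshot taken at the 10% check, with a manually added quality flag.
checked_snapshot = [
    {"respondent_id": 1, "q1": 4, "quality_flag": "ok"},
    {"respondent_id": 2, "q1": 5, "quality_flag": "speeder"},
]

# A later export: the same respondents plus a new complete.
final_export = [
    {"respondent_id": 1, "q1": 4},
    {"respondent_id": 2, "q1": 5},
    {"respondent_id": 3, "q1": 2},
]

merged = carry_over_work(checked_snapshot, final_export)
```

Every new export means running a step like this again, and respondent 3 arrives without any of the checks done so far. That repeated merge is exactly the work that disappears when everyone reads from one living dataset.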

Get the flow

The first time you think about this, it does not sound shocking. But look at its impact on the organization. In the workflow of your projects, there are people waiting to start doing their thing. Someone is desperate to see the first 10% of the completes to check whether the project runs fine (and to use that to make a somewhat safe assumption that the remaining 90% will run fine too), and someone else is desperate to start the analysis of the collected data. So when you have collected your data, your project is only half done. And in the meantime, your customer would love to see a dashboard that actually shows live ticks of the incoming completes.

"If a batch kills the flow, what will a batch do for your workflow?"

Taking away the batches allows you to start building your analysis right when the project starts, and it allows you to continuously check the correctness and progress of your collected data. It also allows you to show your customer live fieldwork! And best of all: when your collection process is done, you can deliver.


Our Nebu Data Hub allows you to create this smooth organization. By connecting your downstream tools (like SPSS, RStudio and Excel) directly to the living dataset, everybody is always working on the latest collected data. Collect and Deliver!


