When someone writes in a Tableau Forum that he is writing a book about "democratizing data" meaning
- make it automatically available to those who need it
- when and where they need it
- based on their roles and responsibilities,
- in forms they can use, and
- with the freedom to use it as they choose
- while simultaneously protecting security and privacy
well, we just can't help but sit up and take notice. After all, democratizing data is what Tableau is all about. And when he says he's looking for Tableau customers to tell their stories, of course we want to help.
So I was delighted when I read more about the person behind the forum post - W. David Stephenson. David is a well-known author and e-government expert. He and Vivek Kundra are co-authoring a book called "Democratizing Data." (Vivek is the CTO of the District of Columbia, a big Tableau customer in the e-government space, and is up for the top IT post in the Obama administration.) David and I exchanged some email about the upcoming book, and I thought you would be interested.
Elissa Fink (EF): David, what is motivating you and Vivek Kundra to write a book about democratizing data?
W. David Stephenson (WDS): We share a deep passion for what Vivek terms the "digital public square": using Web 2.0 tools to recreate on a virtual basis the Athenian agora, where the people would come together to debate policy and do business. We also believe that an integrated data-centric strategy that makes valuable organizational data available to all workers, not just to elites (with the exact amount and combination determined on the basis of an individual worker's role), can improve their work by giving them actionable information when and where they need it to make decisions, can cut operating costs, break down isolation between departments and functions, and encourage collaboration.
EF: Speaking for Tableau, I couldn't agree more. And why do you think democratizing data is so important right now?
WDS: There's never been a more important time for democratizing data, due to the global economic collapse and the accompanying loss of faith in government and corporations. If they follow the lead of the District of Columbia, the states of Rhode Island and Utah, and the U.K., they will start issuing real-time structured data feeds and invite watchdog groups, politicians, the media -- and us! -- to hold them accountable by analyzing this data themselves, rather than simply relying on official reports and interpretations. While governments have taken the lead in this sort of transparency, banks and other corporations have also lost their credibility, and would be well advised to consider their own transparency initiatives.
Equally important, if government agencies start creating Web 2.0 tools such as tags, topic hubs and threaded discussions, they may find that visualizations and other interpretations of the data will result in crowdsourcing, in which innovative new approaches will result from the discussion and interplay of perspectives.
Finally, democratizing data can be crucial as companies have had to lay off workers. The remaining workers have had to shoulder new responsibilities, so they need more information, more tools to help them do their jobs and collaborate with other workers to increase overall efficiency.
EF: There's definitely pressure everywhere to do more and make better decisions with less - and if democratizing data can help, enlightened companies will be signing up. So what is democratizing data? Is it more than just sharing static views of data with more people?
WDS: It's much more than just sharing static data with more people! It begins with an attitude: that organizations need to be "data centric," with structured data distributed automatically at the core of their operations, accessible to a wide range of users, from employees to regulators to suppliers -- all from that central data hub (rather than being captured by proprietary applications -- a great argument for Tableau!). The key elements are:
1. structure the data, so that metadata will stay with the data as it is used and it can be accessed and shared equally by any number of applications and devices
2. syndicate the data on a real-time basis to maximize its utility: in many cases this will be the first time workers have had access to real-time data, when and where they need it
3. release it externally as feeds that can be analyzed by the public
4. if you really want to benefit, "crowdsource" with the data as the District of Columbia did with its Apps for Democracy contest: developers were invited to use one or more of the data streams to create open-source applications that would serve the public. For a total cost of $50,000 ($20,000 of which was for prizes) the District got 47 apps: an ROI of 4,000%!
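As a rough check on that last figure: the interview does not say how the 4,000% was calculated, but assuming the usual definition, ROI = (value - cost) / cost, a 4,000% return on $50,000 implies the 47 apps were valued at about $2.05 million in total. A minimal Python sketch of that arithmetic (the function name is hypothetical, and the valuation is only what the quoted numbers imply, not a figure from the District):

```python
def implied_gross_value(cost: float, roi_percent: float) -> float:
    """Gross value implied by ROI = (value - cost) / cost.

    Assumption: the quoted ROI uses this standard definition; the
    interview does not state how the 4,000% figure was derived.
    """
    return cost * (1 + roi_percent / 100)


cost = 50_000        # total contest cost quoted in the interview
roi_percent = 4_000  # ROI quoted in the interview
apps = 47            # number of apps quoted in the interview

value = implied_gross_value(cost, roi_percent)
print(f"Implied total value of the apps: ${value:,.0f}")  # $2,050,000
print(f"Implied average value per app: ${value / apps:,.0f}")
```

Under that assumption, each app would have to be worth a little over $43,000 on average for the quoted return to hold.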
EF: I'm glad to hear it's so much more than just sharing static views - being able to interact with data is key to what people learn from it. So what do you see as the major challenges for organizations that want to democratize their data?
WDS: The major obstacle is inertia: it's been so difficult to share data for so long that, now that we have the technology to do so, perhaps management can't even visualize the potential. One striking example, IMHO, is that the Netherlands has launched an incredible experiment, the Dutch Taxonomy Project, which allows companies that otherwise would have to file annual reports with 30-40 different agencies and companies to instead file a single XBRL data file (which all of the agencies then access automatically!), saving both them and the government large amounts of time and money. However, only a relatively small percentage of companies take advantage of the program (and, by the way, once they've gone to the time and effort to tag the data for these official filings, they could amortize the cost and increase the benefits by also distributing the data internally).
EF: Given that we recently passed “data privacy day”, how “democratic” do you think data can get? Do companies have to worry about data falling into the wrong hands?
WDS: Data security is a tremendous problem, but that can't be used as the justification for not democratizing data. A great example of how to do it right is what Dr. John Halamka and the IT staff at Beth Israel Deaconess Medical Center (BIDMC) in Boston have done with the Online Medical Records (OMR): nothing is as fraught with privacy issues as medical records, and, at the same time, nothing is as critical as giving those with a need to know (especially ER docs) real-time access to all the information. Halamka has done that, and you can't tell me that corporations can't do the same, if they begin with that data-centric perspective and then determine various levels of access depending on the person's role.