Connect to PDF Data Directly in Tableau 10.3
Rahim Bhojani is a Senior Engineering Manager for Tableau. This post first appeared on Tableau's blog on 28 March 2017.
We are making it even easier to connect Tableau to your data and jumpstart your analysis. With Tableau 10.3, we’re adding support for PDF files!
PDF files are everywhere, from business reports to scientific publications (and everything in between). Many of the world’s great open data portals (like data.gov) catalog and distribute data in PDF format.
Unfortunately, getting data out of PDF files can be a huge headache. There are some tools (like pdftotext) that let you do that, but building that capability directly into Tableau will save you the frustration involved and let you focus your energy on analyzing and exploring the data itself.
The PDF connector lets you connect to PDF files, identify the tables they contain, and treat those tables like any other data source within Tableau.
With this connector, pre-processing data from PDF documents by brute force or copy-pasting is a thing of the past. Now you can connect to PDF documents just as you would to a text file, leverage all of Tableau’s awesome capabilities (cross data-source joins, parameters, and more), and build impactful visualizations with ease.
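For a sense of the manual pre-processing this replaces, here is a minimal Python sketch of the brute-force route: take text that an extraction tool has already pulled out of a PDF and split it into rows you could save as a text file. It is only an illustration, not how the connector works internally. The function names and the sample text are made up, and it assumes columns are separated by two or more spaces, which real PDF text often violates.

```python
import csv
import io
import re


def rows_from_layout_text(text):
    """Split already-extracted PDF text into rows of cells.

    Assumption: columns are separated by two or more spaces.
    Real PDF output frequently breaks this rule.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines around the table
        rows.append(re.split(r"\s{2,}", line))
    return rows


def to_csv(rows):
    """Serialize rows as CSV text, the kind of file you would then connect to."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Hypothetical sample text, invented for illustration only.
sample = """
Region    Year   Volume
Region A  1995   120
Region B  1995   340
"""

rows = rows_from_layout_text(sample)
csv_text = to_csv(rows)
print(csv_text)
```

Even this toy version needs a guess about column spacing, and it would fall over on wrapped cells or merged headers, which is exactly the kind of fiddling the connector is meant to spare you.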
If you have PDF documents containing complex (hierarchical) tables, be assured that we’re continuing feature development with this in mind. We know there are some limitations, and we would love to hear from you about your experience and needs. I encourage you to vote up your requests on the Ideas Forum or reach out to us via the beta forum! (If you haven't yet, be sure to sign up for the beta program to get access to new features and product discussions.)
To see how to use the PDF connector, check out the training video 'Connecting to PDFs' on Tableau's corporate site (you'll need to enter your email address to watch). Try it out for yourself with this PDF about New Zealand: 'Water Physical Stock Account 1995–201'.