Tableau and MarkLogic Visualize Unstructured Data
Everyone talks about extracting value from big data, and specifically about tapping into the value of unstructured data. MarkLogic, a Tableau partner, is actually helping customers do it! MarkLogic’s goal is to help customers rapidly make sense of large amounts of unstructured data. MarkLogic is the leading Enterprise NoSQL database on the market because of how quickly it can process, catalog, store, and search large volumes of unstructured data. This data can then be exposed to Tableau users through the “Other ODBC” connection in Tableau.
One of the most important features for managing unstructured data is the ability to quickly search the data and get back relevant results. MarkLogic supports full-text searches as well as complex searches on unstructured data. Tableau customers can initiate these searches using simple parameters in Custom SQL. MarkLogic then searches the entire indexed document, not just the document titles or the metadata.
After searching the unstructured data, you may want to enrich it with relevant outside information. MarkLogic uses a triple store and the semantic web to provide customized data that enriches the data discovery process.
To demonstrate Tableau and MarkLogic searching and visualizing unstructured data, we created a sample workbook of indexed public mailing list messages. The workbook uses a subset of emails from Markmail.org. You can see the dashboard here on Tableau Public.
You can start with the search bar, which is built from a Tableau parameter in Custom SQL. That parameter is just the right-hand side of a SQL MATCH query, and it can be anything you’d expect from a powerful search engine like MarkLogic: Booleans, proximity search, fielded search, and so on. The entire email is searched, not just a column like you’d get with a relational database! Try it! Search for any term and click on an email snippet. A new window will appear showing the entire document (in this case, an email).
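To make the search-bar mechanism concrete, here is a minimal Python sketch of how a search term ends up as the right-hand side of a MATCH clause. The view name `emails` and the columns `subject`, `sender`, and `snippet` are hypothetical and are not taken from the actual workbook. In Tableau itself, the quoted term would be supplied by a Custom SQL parameter reference such as `<Parameters.Search Term>` rather than by Python string handling.

```python
def build_match_query(search_term: str) -> str:
    """Build a Custom SQL-style statement with search_term as the
    right-hand side of MATCH. Table and column names are hypothetical."""
    # Double any single quotes so the term stays inside the SQL string literal.
    escaped = search_term.replace("'", "''")
    return (
        "SELECT subject, sender, snippet "
        "FROM emails "
        f"WHERE emails MATCH '{escaped}'"
    )


# A Boolean search is passed through unchanged as the MATCH operand.
print(build_match_query("Hadoop AND Hive"))
```

The point of the sketch is that the search expression travels as a single string, so anything the search engine understands can be typed into the search bar without changing the surrounding SQL.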
If you type in “Hadoop” with a capital H (MarkLogic search uses SPARQL, which is case sensitive), you will see the right-hand side of the dashboard fill with new data. This isn’t just any data. It’s not in a database per se; it’s coming from the Web. MarkLogic’s triple store indexes are queried to see “what else do we know?”, much like the infobox you sometimes see with Google searches.
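The case-sensitivity point can be illustrated with a small, self-contained Python sketch. This is not MarkLogic’s API or a SPARQL engine; it is a plain in-memory list of made-up triples with an exact-match lookup on the label, which is enough to show why “Hadoop” returns facts and “hadoop” does not. All identifiers and sample triples are hypothetical.

```python
# Hypothetical sample triples: (subject, predicate, object).
TRIPLES = [
    ("ex:Apache_Hadoop", "rdfs:label", "Hadoop"),
    ("ex:Apache_Hadoop", "ex:type", "Software framework"),
    ("ex:Apache_Hadoop", "ex:license", "Apache License 2.0"),
]


def lookup_facts(term):
    """Return (predicate, object) pairs for subjects whose label equals term.

    The comparison is an exact, case-sensitive string match, mimicking how
    a literal in a triple pattern must match the stored label exactly.
    """
    subjects = {s for s, p, o in TRIPLES if p == "rdfs:label" and o == term}
    return [(p, o) for s, p, o in TRIPLES if s in subjects and p != "rdfs:label"]


print(lookup_facts("Hadoop"))  # two facts about the matching subject
print(lookup_facts("hadoop"))  # empty list: the label does not match exactly
```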
(A note about this dashboard: to provide this demo on Tableau Public, we extracted the data from MarkLogic into a TDE file, but you could easily run live data from MarkLogic into Tableau for real-time analysis on your own server.)
For more information on Tableau and MarkLogic, you can view a recent webinar on Analytics, NoSQL, and Visualization hosted by MarkLogic and Tableau.