Modern Data Architecture
Digital transformation has created a massive influx of data, and it’s not slowing down. Modern BI has lowered the barrier to entry, giving business users widespread, secure, and governed access to data for faster time to insight—but it relies on a complete, modern data architecture stack to manage the overall flow of data.
Organizations are collecting, processing, and analyzing more diverse data than ever before. Data formats are multiplying: structured data, semi-structured data such as JSON, and raw, unstructured data in object storage, a data lake, or the cloud.
The Tableau Platform fits wherever you are on your digital transformation journey because it's built for flexibility—the ability to move data across platforms, adjust infrastructure on-demand, take advantage of new data types and sources, and enable new users and use cases. We believe your analytics platform should not dictate your data pipeline infrastructure or strategy, but should help you to leverage the investments you’ve already made, including those with partner technologies within the modern data architecture stack.
In this resource, we'll break down the components of modern data architecture, and how Tableau's Technology Partner Ecosystem can support you in building out your analytics stack.
Create & Connect
Connect Tableau to your data, wherever it lives.
Connecting directly to spreadsheets and business applications makes it easy for analysts to interact directly with the data needed to make better decisions.
Traditional Data Connectivity
In Traditional BI environments, IT prepares and houses data into a centralized repository, which analysts query for analysis. If there’s a need to ask questions about the data outside of this centralized repository, users rely on spreadsheets, data extracts, and other “shadow IT” workarounds.
Modern Data Connectivity
Tableau can connect natively to and live query against your commonly used business applications, allowing analysts and business users to ask questions directly against the data with which they are most familiar. This agile approach accelerates insight delivery, freeing up expert resources for effective data enrichment and advanced analytics modeling.
Get started with Google Ads Connector to improve campaign performance.
We chose Tableau to understand our SAP data because of its ease of use and intuitiveness. It will help employees across our company to discover, understand, see trends and outliers in the numbers—so they can take quick action.
Store & Process
Drive optimal performance—regardless of data size or type.
A modern data architecture should handle any data source: cloud applications, big data databases, as well as structured and unstructured repositories.
Traditional Data Storage
Acting as a repository for query-ready data from disparate data sources, data warehouses provide the computing capability and architecture that allow massive amounts of data or summaries of data to be delivered to business users. There will always be a place for traditional databases and data warehouses in a modern analytics infrastructure, and they continue to play a crucial role in delivering governed, accurate, and conformed dimensional data across the enterprise for self-service reporting.
There are, however, limitations to traditional data warehousing. Firstly, IT must know exactly what kinds of questions analysts will ask in order to build prepared views of the data that analysts can access quickly. Secondly, with traditional, on-premises data warehouse deployments, it is a challenge to scale analytics across an increasing number of users.
Modern Data Storage
A modern analytics strategy accepts that not all data questions within an organization can be answered from only one data source. An effective data strategy should enable flexible storage and processing for all types of data. The guiding principle is that analysts and business users more familiar with the data shouldn't need to rely on IT to curate prepared views prior to analysis. Instead, the process becomes iterative—IT grants direct access to the data lake for quick queries when appropriate, and operationalizes large data sets in a data warehouse for repeated analysis.
As more workloads and data sources move to the cloud, organizations are also increasingly shifting towards cloud-based data warehouses such as Amazon Redshift, Google BigQuery, Snowflake, or Microsoft Azure SQL Data Warehouse. Connecting directly to these data sources opens up the potential for organizations to quickly scale underlying cloud infrastructure as demand for data access surges. Another benefit of modern data warehouses is additional functionality to accelerate data processing. For example, Snowflake and Cloudera can handle analytics on structured and semi-structured data without complex transformation.
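To illustrate what querying semi-structured data without up-front transformation looks like, here is a minimal Python sketch. The event records and field names are hypothetical, and plain `json` parsing stands in for warehouse features like Snowflake's VARIANT type, which lets SQL reach into nested JSON fields directly:

```python
import json

# Hypothetical semi-structured event records, as they might land in a data lake.
events = [
    '{"user": {"id": 1, "plan": "pro"},  "action": "login"}',
    '{"user": {"id": 2, "plan": "free"}, "action": "login"}',
    '{"user": {"id": 1, "plan": "pro"},  "action": "export"}',
]

# Query the nested structure directly—no flattening into fixed columns first,
# analogous to how a modern warehouse queries JSON fields in place.
actions_by_plan = {}
for raw in events:
    record = json.loads(raw)
    plan = record["user"]["plan"]
    actions_by_plan[plan] = actions_by_plan.get(plan, 0) + 1

print(actions_by_plan)  # event counts per subscription plan
```

The point is that the schema lives in the data itself; the same records could answer a different question tomorrow without a new ingestion pipeline.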
Analysis often requires direct connections to source data before it's staged in a data warehouse. This is supported in Tableau with native connections to popular data lakes like Amazon S3 via Redshift Spectrum or Amazon Athena connectors, or the Databricks connector, which allows you to connect directly to Delta Lake for fast, fine-grained data exploration.
With Tableau, you just hook it up to the Redshift server, connect, run a query, and publish it to the Server and you're literally done in an hour. It’s great—it feels like one product.
Transport & Prepare
Access, combine, and clean data to serve as a single source of truth.
Data often needs massaging before we can drag and drop in Tableau, whether you're taking a traditional or modern approach.
Traditional Data Pipeline
In the traditional data pipeline, all data must be processed, prepared, and centralized prior to analysis by business users. This process is referred to as “extract-transform-load,” or ETL.
Raw, unstructured data can be extracted, but it often needs massaging and reshaping before it can be loaded into a data warehouse. The processes devoted to data prep and structuring require users to wait to access information until the ETL process is complete.
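The ETL pattern can be sketched in miniature with Python's standard library, using an in-memory SQLite database as a stand-in for the warehouse. The source fields, date formats, and table names here are all hypothetical:

```python
import sqlite3
from datetime import datetime

# Hypothetical raw export from a source system: inconsistent dates, text amounts.
raw_rows = [
    {"date": "01/05/2024", "amount": " 19.99 "},
    {"date": "01/06/2024", "amount": "24.50"},
]

# Transform happens *before* load: every row must be cleaned up front,
# and analysts wait until the whole pipeline has finished.
def transform(row):
    iso_date = datetime.strptime(row["date"], "%m/%d/%Y").date().isoformat()
    return (iso_date, float(row["amount"].strip()))

cleaned = [transform(r) for r in raw_rows]

# Load: only conformed, query-ready rows ever reach the warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_date TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", cleaned)
print(cleaned)
```

Note the ordering: if the transformation logic later turns out to be wrong or incomplete, the extract and load steps must be re-run as well.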
Modern Data Pipeline
Unlike traditional ETL pipelines, in modern analytics scenarios, data can be loaded into a centralized repository prior to being processed. This pattern is called “extract-load-transform,” or ELT.
ELT allows for more flexibility and faster iteration, as transformations happen directly on the raw data, allowing IT to continually test the best possible data transformations prior to loading into a data warehouse.
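By contrast, the ELT pattern can be sketched the same way, again with an in-memory SQLite database standing in for a cloud warehouse and with hypothetical table and column names:

```python
import sqlite3

# Hypothetical raw records, exactly as they arrive from a source system.
raw_orders = [
    ("2024-01-05", "widget", "19.99"),
    ("2024-01-06", "gadget", "24.50"),
    ("2024-01-06", "widget", "19.99"),
]

conn = sqlite3.connect(":memory:")

# Extract & Load: land the raw data as-is, with no up-front transformation.
conn.execute("CREATE TABLE raw_orders (order_date TEXT, product TEXT, amount TEXT)")
conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", raw_orders)

# Transform: shape the data inside the warehouse, after loading.
# This step can be rewritten and re-run without touching the ingest step.
conn.execute("""
    CREATE TABLE daily_revenue AS
    SELECT order_date, ROUND(SUM(CAST(amount AS REAL)), 2) AS revenue
    FROM raw_orders
    GROUP BY order_date
""")

for row in conn.execute("SELECT * FROM daily_revenue ORDER BY order_date"):
    print(row)
```

Because the raw table is preserved, IT can iterate on the transformation SQL—the flexibility the paragraph above describes—while the original data stays available for other questions.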
Tableau Prep is a great choice for those organizations looking to easily prepare data prior to analysis. For additional ELT capabilities, Tableau has partners like Informatica, Alteryx, Fivetran, Trifacta, Talend, and Datameer that transport and prepare data in a way that works fluidly with Tableau.
Spend less time preparing data and invest more time analyzing data in Tableau.
Synchronize and integrate your on-premises and/or cloud data with Informatica.
We use Alteryx to make sense of massive amounts of data, and we use Tableau to present that. So it's become an incredible tool. It's revolutionizing the way things are being done.
Catalog & Govern
Make self-service analytics possible with trusted data.
Data catalogs help users discover certified data assets, so they can spend time asking questions of their data—not asking questions about the quality of that data.
Traditional Data Governance
In a traditional BI environment, governance is often seen as a way for IT to restrict access or lock down data or content. As a consequence, silos are created, resulting in enterprise-wide data that lacks business context and is difficult to discover. When users finally do identify applicable data assets, the validity of that data is often unclear.
Modern Data Governance
In a modern, self-service analytics environment, governance is employed quite differently—as a way to enable and empower your people, rather than restrict them. Data catalogs serve as a shared business glossary of data sources and common data definitions, allowing users to more easily find the right data for decision-making from curated, trusted, and certified data sources.
With Tableau Catalog, users can now quickly discover relevant data assets from Tableau Server and Tableau Online. By seamlessly connecting the metadata in enterprise data catalogs from leading vendors such as Informatica, Collibra, and Alation, Tableau users can extend the power of Tableau Catalog to their enterprise data sources.
Learn more about governed, self-service analytics at scale.
Enable IT and Line-of-business collaboration through trusted data with agility and scale.
The Collibra and Tableau partnership empowers organizations to make better data-driven business decisions.
Extend governance capabilities to match the speed of self-service analytics with trusted data.
With the combination of Alation and Tableau, GoDaddy’s Enterprise Data team was able to examine the lineage of a table, search multiple sources for a field, and increase visibility and control.
Analyze & Augment
Analytics get smarter with automated machine learning and natural language capabilities.
Machine Learning and AI improve data access and data quality, uncover previously hidden insights, suggest analyses, deliver predictive analytics, and suggest actions. Natural language interfaces will make it easier for business users to gain insights and make better decisions.
Traditional Data Science
The traditional data science method relies exclusively on data scientists for model development, deployment, and interpretation. This manual approach—separate from a standard analytics practice—is time-consuming and leads to bottlenecks that can make it difficult to deliver insights to the business.
Modern Data Science
A modern approach to data science, centered around automated machine learning, enables business users to ask questions of their data to reveal predictive and prescriptive insights which are seamlessly integrated into their analytics environment. This frees up data scientists to focus their time on higher-value data aggregation and model creation.
Emerging smart capabilities harness machine learning to assist people with tasks including data preparation, data discovery and understanding of user intent based on historical data-access patterns. To support advanced, predictive analytics, Tableau lets you connect to R and Python libraries and packages, import saved models and write and embed new models into calculations. Tableau also partners with vendors, including DataRobot and RapidMiner, to integrate their advanced analytics platforms designed to support sophisticated predictive modeling.
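As a concrete sketch of the kind of function an external Python service might expose to a Tableau calculation, here is a hand-rolled one-variable least-squares fit. The data and field meanings (ad spend vs. revenue) are hypothetical, and a real deployment would typically wrap a library model from scikit-learn or statsmodels rather than this toy implementation:

```python
def fit_and_predict(x, y, x_new):
    """Fit a one-variable least-squares line to (x, y), predict at x_new.

    A stand-in for the kind of function a Python analytics service could
    expose to a Tableau calculation; real models would come from R or
    Python libraries as described above.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Ordinary least squares: slope = cov(x, y) / var(x)
    slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
             / sum((xi - mean_x) ** 2 for xi in x))
    intercept = mean_y - slope * mean_x
    return [intercept + slope * xn for xn in x_new]

# Hypothetical data: ad spend vs. revenue, predicting revenue at a new spend level.
spend = [1.0, 2.0, 3.0, 4.0]
revenue = [2.1, 3.9, 6.1, 7.9]
prediction = fit_and_predict(spend, revenue, [5.0])
print(prediction)  # approximately [9.9]
```

The division of labor mirrors the paragraph above: a data scientist builds and validates the model once, and business users consume its predictions inside their ordinary analytics workflow.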
Natural language processing (NLP) technologies like Ask Data enable users to ask questions of their data using simple text, without understanding the underlying data model. Natural language generation (NLG) technologies can automatically generate stories from data, explaining insights locked within structured data. Tableau integrates with partner NLG technologies such as Narrative Science, Automated Insights, and ARRIA via dashboard extensions to enrich the analytics experience in Tableau.
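To make the NLG idea concrete, here is a deliberately tiny Python illustration that turns a metric series into a sentence. The sales figures are hypothetical, and commercial NLG platforms do this with far more linguistic sophistication:

```python
# Hypothetical monthly sales figures for a single product line.
sales = {"Jan": 120, "Feb": 135, "Mar": 128, "Apr": 160}

def narrate(series):
    """Generate a one-sentence narrative from a metric series—a toy
    illustration of the pattern NLG tools automate at scale."""
    months = list(series)
    best = max(series, key=series.get)
    change = series[months[-1]] - series[months[0]]
    direction = "rose" if change > 0 else "fell"
    return (f"Sales {direction} by {abs(change)} units from {months[0]} "
            f"to {months[-1]}, peaking in {best} at {series[best]}.")

story = narrate(sales)
print(story)
```

Even this trivial version shows the value proposition: the insight "locked within structured data" is surfaced as plain language a business reader can act on without opening the underlying table.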
Learn how to communicate the value of automated machine learning results through visualizations.
Leveraging Google BigQuery's machine learning capabilities for analysis in Tableau.
Tableau makes it faster and easier to identify patterns and build practical models using R.
We used the Tableau extension for 'what-if' analysis and have implemented that alongside several of our different predictive models. All of this is powered by the Tableau extension with DataRobot in the back end to produce these reports on an ongoing, real-time basis.
Host & Scale
Quickly deploy Tableau and scale it across your organization.
Organizations are increasingly moving business processes and infrastructure to the cloud. Full-cloud and hybrid services have removed some of the major hurdles encountered with on-premises systems, making solutions faster and easier to deploy and manage.
Traditional Analytics Deployment
In a traditional environment, databases and analytic applications are hosted and managed by the organization with technology infrastructure on its own premises. This means the organization is responsible for provisioning sufficient hardware and providing resources to ensure performance scales with future demand. The organization is also actively managing, monitoring, and maintaining the software.
Modern Analytics Deployment
Tableau can be deployed on public clouds from Amazon Web Services, Google Cloud Platform, Microsoft Azure, or Alibaba Cloud. There are often benefits in cost, scalability, and flexibility to using infrastructure or platform as a service (IaaS and PaaS).
Web-based analytics can also be delivered as software-as-a-service (SaaS). Tableau Online is your analytics platform fully hosted in the cloud. This means you don’t have to worry about hardware or software maintenance. A hybrid model for analytics allows you to connect to data regardless of the database in which it’s stored or the infrastructure upon which it’s hosted.
One of the greatest benefits of analytics in the cloud is flexibility. There's not a lot of setup required as there has been in traditional models, nor are there the same concerns around storage limits, cluster overhead, or performance. This gives users the freedom to try things, fail quickly, and move on to something else. You don't have to know where you're going—you have the freedom to explore, discover, and modernize your analytics strategy.
Tableau’s approach to cloud is simple: It’s all about choice. The choice of how and where you deploy your analytics; the choice to analyze any data, regardless of where it resides. From a fully-hosted SaaS solution, to a hybrid approach of your own software deployed on a cloud platform or on-premises, Tableau lets you deploy and manage your analytics on your own terms.
Tableau on AWS provides a next-generation architecture that fosters innovation and reduces costs. The solution has changed our BI consumption patterns, moving from hindsight to insight-driven reporting.