We’re happy to announce the beta release of TabPy, a new API that enables evaluation of Python code from within a Tableau workbook.

When you use TabPy with Tableau, you can define calculated fields in Python, thereby leveraging the power of a large number of machine-learning libraries right from your visualizations.

This new Python integration in Tableau enables powerful scenarios. For example, it takes only a few lines of Python code to get the sentiment scores for reviews of products sold at an online retailer. Then you can explore the results in many ways in Tableau.

You might filter to see just the negative reviews and read their content to understand the reasons behind them. You might get a list of customers to reach out to. Or you might visualize how overall sentiment changes over time.
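As a sketch of what such a script does under the hood, here is a self-contained toy scorer in plain Python. The lexicon and scoring rule are invented for illustration; a real workbook would call a library such as vaderSentiment instead:

```python
# Toy sentiment scorer standing in for a real library like vaderSentiment.
# The lexicon below is invented; scores range from -1 (negative) to +1 (positive).
LEXICON = {"great": 1.0, "love": 0.8, "good": 0.5,
           "bad": -0.5, "broken": -0.8, "terrible": -1.0}

def sentiment_score(review):
    # Average the scores of the words we recognize; 0.0 if none match.
    words = [w.strip(".,!?") for w in review.lower().split()]
    hits = [LEXICON[w] for w in words if w in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0

reviews = ["Great product, love it!", "Terrible, arrived broken."]
scores = [sentiment_score(r) for r in reviews]  # positive, then negative
```

Filtering to scores below zero is what surfaces the negative reviews described above.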

Other common business scenarios include:

  • Lead scoring: Create a more efficient conversion funnel by scoring your users' behavior with a predictive model.
  • Churn prediction: Learn when and why users leave, and predict and prevent it from happening.

You can easily install the TabPy server on your computer or on a remote server. Configure Tableau to connect to this service by entering the service URL and port number under Help > Settings and Performance > Manage External Service Connection in Tableau Desktop. Then you can use Python scripts as part of your calculated fields in Tableau, just as you’ve been able to do with R since Tableau 8.1.
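Once connected, a calculated field can embed a Python script directly. This sketch is illustrative (the doubling is arbitrary); Tableau passes each aggregated argument into the script as `_arg1`, `_arg2`, and so on:

```
SCRIPT_REAL("return [x * 2 for x in _arg1]", SUM([Sales]))
```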

TabPy uses the popular Anaconda environment, which comes preinstalled and ready to use with many common Python packages including scipy, numpy, and scikit-learn. But you can install and use any Python library in your scripts.
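If a script needs a package that is not preinstalled, you can add it to TabPy's environment with pip. The commands below assume a Windows machine and the environment name the default install creates (`Tableau-Python-Server`, as seen in the install paths later in this thread); adjust for your setup:

```
:: From a command prompt on the machine running TabPy
activate Tableau-Python-Server
pip install vaderSentiment
```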

If you have a team of data scientists developing custom models in your company, TabPy can also help them share those models with others who want to leverage them inside Tableau via published models.

Once published, running a machine-learning model takes just a single line of Python code in Tableau, regardless of model type or complexity. You can estimate the probability of customer churn using logistic regression, a multi-layer perceptron neural network, or gradient-boosted trees just as easily, simply by passing new data to the model.

Using published models has several benefits. Complex functions become easier to maintain, share, and reuse as deployed methods in the predictive-service environment. You can improve and update the model and code behind the endpoint while the calculated field keeps working without any change. And a dashboard author does not need to know or worry about the complexities of the model behind this endpoint.
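The publishing workflow might look like the following sketch. This is not the exact TabPy API: the endpoint name, feature columns, and toy scoring function are invented, and the deploy call is commented out because it needs a running TabPy server:

```python
# A stand-in "model": any Python callable that takes column lists and returns
# one value per row can be published as an endpoint. The weights here are
# made up; a real deployment would wrap a trained model instead.
def churn_probability(logins_per_week, support_tickets):
    w_logins, w_tickets = -0.05, 0.2  # illustrative coefficients
    return [min(1.0, max(0.0, 0.5 + w_logins * l + w_tickets * t))
            for l, t in zip(logins_per_week, support_tickets)]

# from tabpy_client import Client                    # client shipped with TabPy
# client = Client('http://localhost:9004/')          # default TabPy port
# client.deploy('ChurnPredictor', churn_probability,
#               'Returns churn probability per customer')

# Low-activity, high-ticket customers score closer to 1.
probabilities = churn_probability([10, 0], [0, 3])
```

After deployment, the dashboard author's calculated field reduces to the single query line the post describes, without needing to know whether the endpoint wraps a logistic regression or a neural network.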

Together, Tableau and Python enable many more advanced-analytics scenarios, making your dashboards even more impactful. To learn more about TabPy and download a copy, please visit our GitHub page. Give TabPy a try, and let us know what you think at beta@tableau.com.

If you’re attending the Tableau Conference, stop by our session on TabPy to learn more and see some cool demos.



This is really amazing. This opens up potential opportunities for Tableau to be used in semi-real-time applications. This can be used to call microservices or to trigger actions.

Gospel for us Pythonic lovers! Thank you Tableau! This is amazing. I cannot wait to explore this feature.

Well... Just put my hair in a bun, started a fresh pot of coffee, am rolling up my sleeves, and checking out of reality and into TabPy.

Not crazy about having to install a server to use this feature... How is the performance of Python calculations on large datasets compared to R, which seemed very slow when I tried it?

Chris S.: not sure what you mean. Perhaps you can provide a specific example? Pure Python is generally comparable to or faster than pure R on most benchmarks. Of course, in real-life use cases, it comes down to the specific libraries you use. Like R (with Rcpp), most of the data libraries in Python (Pandas, NumPy, etc.) call underlying code written in a lower-level language like C or Fortran (e.g. BLAS/LAPACK), so the performance difference should be a non-issue.
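That claim is easy to sanity-check: the dot product below runs once as an interpreted Python loop and once through NumPy's compiled path, and both yield the same result (timing code omitted, but on arrays this size the vectorized call is typically orders of magnitude faster):

```python
import numpy as np

x = np.arange(100_000, dtype=np.float64)
y = np.arange(100_000, dtype=np.float64)

# Pure-Python loop: every multiply and add goes through the interpreter.
loop_dot = sum(a * b for a, b in zip(x, y))

# NumPy: a single call that dispatches to compiled BLAS code.
vec_dot = np.dot(x, y)

assert np.isclose(loop_dot, vec_dot)  # same math, very different speed
```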

Awesome news.

Question though -- How do the SCRIPT_XXX() functions determine whether it is an R script or a Python script? Does Tableau infer it from the script somehow? Do you have to configure a workbook to use either R or Python, but not both? For example, if the script is just the number 1, which is valid in both Python and R, where does Tableau send that script to execute?

Hi Alex,
Currently you can have only one active external service at a time (one per running Tableau.exe) on Desktop, and one setting per Server.

~ Bora

Getting error: package not found via pip and conda

Hi Ashish,
Did you download the zip file from GitHub and follow the install instructions, or did you try 'pip install tabpy'? There is no such package on PyPI at this point, so the latter will fail. If you don't want to go through the full install, there are also install instructions on GitHub for users who already have Anaconda configured. The steps provided use pip with the local package contained in the download from the GitHub repository.

~ Bora

This is amazing.

Hi Bora,

I was earlier using pip install TabPy / conda install TabPy.
Having gone through the GitHub instructions, it seems I'll be able to install it and have it running once I'm home this evening. Thanks a ton for clarifying.

~ Ashish

Great to know about the new TabPy feature for Python code; it makes my work much easier. Thank you.

Thanks for the new Python code, Extended Team Model.

Great !!!

It was more than three years ago when I first wrote about my desire to have a feature like this in Tableau. At the time, R integration was just a proposed, upcoming idea. With each passing year, I have to keep updating this article because Tableau keeps delivering new integrated technologies. It's becoming a full-time job trying to keep up with Tableau.


Hi Bora,

I am getting this error while trying to run setup.bat

Could not open requirements file: [Errno 2] No such file or directory: './tabpy-server/requirements.txt'
Invalid requirement: './tabpy-client'
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\pip\req\req_install.py", line 82, in __init__
    req = Requirement(req)
  File "c:\python27\lib\site-packages\pip\_vendor\packaging\requirements.py", line 96, in __init__
    requirement_string[e.loc:e.loc + 8]))
InvalidRequirement: Invalid requirement, parse error at "'./tabpy-'"

Invalid requirement: './tabpy-server'
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\pip\req\req_install.py", line 82, in __init__
    req = Requirement(req)
  File "c:\python27\lib\site-packages\pip\_vendor\packaging\requirements.py", line 96, in __init__
    requirement_string[e.loc:e.loc + 8]))
InvalidRequirement: Invalid requirement, parse error at "'./tabpy-'"

Any guess why? Will appreciate any help.


Hi Ashish,
Did you download the zip using the green link in the upper left corner, unzip it, and run setup from inside the tabpy folder that contains the tabpy-server folder?



I'm receiving this error in Tableau when trying to use the sentiment script shown above. Anyone have an idea what's causing this? Thanks

Error in base::parse(text = .cmd) : :1:6: unexpected symbol 1: from vaderSentiment.vaderSentiment^

Looking at the error, it seems the script is being sent to R instead of Python. Can you check that the Manage External Service Connection dialog is pointing to your TabPy server? Also note that this will work with Tableau 10.1 or higher.


I was able to connect to the TabPy server, but now I'm receiving this error when running the sentiment-score script shown above.

Error processing script
Error when POST /evaluate: Traceback
Traceback (most recent call last):
  File "C:\Users\ngrah008\Anaconda\envs\Tableau-Python-Server\Lib\site-packages\tabpy_server\tabpy.py", line 467, in post
    result = yield self.call_subprocess(function_to_evaluate, arguments)
  File "C:\Users\ngrah008\Anaconda\envs\Tableau-Python-Server\lib\site-packages\tornado\gen.py", line 1008, in run
    value = future.result()
  File "C:\Users\ngrah008\Anaconda\envs\Tableau-Python-Server\lib\site-packages\tornado\concurrent.py", line 232, in result
  File "C:\Users\ngrah008\Anaconda\envs\Tableau-Python-Server\lib\site-packages\tornado\gen.py", line 282, in wrapper
    yielded = next(result)
  File "C:\Users\ngrah008\Anaconda\envs\Tableau-Python-Server\Lib\site-packages\tabpy_server\tabpy.py", line 482, in call_subprocess
  File "", line 2
    from vaderSentiment.vaderSentiment import sentiment as vs=[]
SyntaxError: invalid syntax
Error type : SyntaxError
Error message : invalid syntax (, line 2)

I get the same error as Nathan G when trying to emulate the first sample shown (vaderSentiment). Any ideas - did you solve your issue Nathan?

Looks like the Python library used in the example got an update with some breaking changes. I reinstalled from scratch and observed the same error, so please try the following instead. We will update the image in the blog post with the same script.

SCRIPT_REAL("from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
analyzer = SentimentIntensityAnalyzer()
scores = []
for i in range(0, len(_arg1)):
    scores.append(analyzer.polarity_scores(_arg1[i])['compound'])
return scores", ATTR([Comment Text]))

You superstar Bora.
Thanks for going above and beyond over a weekend to find a solution to this. I was struggling with very limited knowledge to get this working.


Python is a widely used high-level, general-purpose, interpreted, dynamic programming language that supports multiple programming paradigms…


My company network is not allowing me to install and run the TabPy server. I was advised to configure the proxy, but I get a ProtocolError('Connection aborted') error message.
Could you please suggest some steps to resolve this?


Hi, I'm getting this problem while using TabPy. Using the Sample - Superstore dataset, I want to cluster the sub-categories using SUM([Profit]) and SUM([Sales]), but it returns the error: ValueError: n_samples=1 should be >= n_clusters=2. Here's my script:
from sklearn.cluster import KMeans
kmeans = KMeans(n_clusters=2)
for i in range(len(_arg1)):
    KMmodel = kmeans.fit(tmp)
labels = KMmodel.labels_
return labels",

Tableau calls the Python program through TabPy, and the Python code calls Java. The first calculation succeeds, but when the table is recalculated, Python hangs.

SCRIPT_STR("import jpype
import os.path
jarpath = os.path.join(os.path.abspath('.'), 'E:/resource/test.jar')
if not jpype.isJVMStarted():
    jpype.startJVM(jpype.getDefaultJVMPath(), '-ea', '-Djava.class.path=%s' % jarpath)

jprint = jpype.java.lang.System.out.println
JDClass = jpype.JClass('com.Utils')
a = JDClass()

text = a.encryptData_ECB(_arg[0])

return text", ATTR([细分]))
