7 data competitions for data scientists and analysts
Data competitions serve many functions. They are a great way to learn best practices, gather feedback on your work, and advance your skills. They can also serve as a method of brainstorming by crowdsourcing solutions to problems.
Whatever their function, data competitions are an opportunity to push boundaries and encourage creativity among the best and the brightest in a variety of data-related fields.
Here is a range of data competitions to try your hand at, from data visualization to data science and everything in between.
Hosted by: Tableau
Every year, Tableau Public hosts several data viz contests that lead up to the ultimate viz challenge: Iron Viz. Virtual “qualifier” contests are held throughout the year, each highlighting data viz skills through a specific theme. The winner of each qualifier competes in a live Iron Viz showdown in front of an audience of 17,000 at Tableau Conference or Tableau Conference Europe, where the three contest winners go up against each other to create a data visualization using the same data set.
Viz entries are judged on several criteria: the overall design of the viz, how well the viz and the data tell the story, and the depth of analysis of the data and the results. Even if you don’t win, the Iron Viz contests are a great way to dip your toes into the world of analytics and get feedback from peers and experts in the field.
Hosted by: Kaggle
We would be remiss if we didn’t include Kaggle in a list of data competitions. Kaggle is one of the most well-known platforms for hosting competitions in data science. The site started out running machine learning competitions, which earned it the fame it has now. Kaggle competitions regularly draw thousands of entries, from both teams and individuals, competing for lucrative prizes.
The competitions feature complex problems where the entrants strive to create the best algorithm to solve the issue. Rewards range from the satisfaction of knowledge, to receiving “swag,” to monetary compensation. Most monetary prizes fall in the tens of thousands of dollars, though some have surpassed $100,000. The highest single prize so far was a $1 million first-place prize for any team able to beat real estate company Zillow’s “Zestimate” benchmark model in predicting home values. In addition to hosting competitions for various organizations, Kaggle also runs an annual Data Science Bowl competition that seeks to better society through science.
Kantar Information is Beautiful Awards
Hosted by: Information is Beautiful and Kantar
If you’re involved in data viz at all, you likely know about Information is Beautiful, David McCandless’ site showcasing how visual design augments the transfer of information. The site also runs the Information is Beautiful Awards to bring attention to the very best and most creative data visualizations.
Each year, awards are given out to those who excel in turning data into art — judging how well the data is arrayed, as well as how well it highlights the information and brings insight to the topic. The contest culminates in the Information is Beautiful awards ceremony. Longlisted entrants are featured on the site, while shortlisted entrants attend the awards ceremony at the end of the year. At the 2018 ceremony, shortlisted entries found their vizzes immortalized on the party’s cookies.
From cookie fame to thousands of dollars, data competitions highlight the best and brightest. Even if you start out small, data competitions are a good way to hone your craft at every level of expertise.
Driven Data Competitions
Hosted by: Driven Data
Much like Kaggle, Driven Data hosts data science competitions to crowdsource solutions to difficult predictive problems. Its competitions aim to address pressing social challenges by building statistical models to aid in prediction. While the competitions are not as large as Kaggle’s, Driven Data focuses on issues that have a tangible impact on our world, benefitting both humans and nature. Higher-profile examples include a competition sponsored by the AARP Foundation to predict seniors’ physical safety, and a competition through The Nature Conservancy and Gulf of Maine Research Institute to help sustainable fisheries by measuring and counting fish from video footage.
Entries are judged on how well they predict the outcome in question, with predictions compared against actual values in current data as a benchmark. Winning models are then integrated into the hosting organization’s work to help advance its goals. Some competitions offer bragging rights and others offer monetary prizes, but all aim to have an effect on the world.
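To make the judging idea concrete, here is a minimal sketch of how predictions might be scored against actual values and ranked. It assumes root mean squared error (RMSE) as the metric, which is only an illustration: each competition defines its own metric, and the team names, values, and function names below are made up.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("need at least one value to score")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def rank_entries(entries, actual):
    """Return (name, score) pairs sorted best first (lowest RMSE)."""
    scored = [(name, rmse(preds, actual)) for name, preds in entries.items()]
    return sorted(scored, key=lambda pair: pair[1])

# Hypothetical actual values and two made-up submissions.
actual_values = [3.0, 5.0, 2.0, 8.0]
submissions = {
    "team_a": [2.0, 5.0, 2.0, 8.0],
    "team_b": [3.0, 7.0, 2.0, 6.0],
}

leaderboard = rank_entries(submissions, actual_values)
for name, score in leaderboard:
    print(f"{name}: RMSE = {score:.3f}")
```

A lower RMSE means the predictions sit closer to the actual values, so the first entry in the leaderboard is the best-scoring one under this assumed metric.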
Hosted by: CrowdANALYTIX
CrowdANALYTIX also features data modeling competitions, diving into machine learning, artificial intelligence, deep learning, and natural language processing. These challenges are more informal, though no less rewarding. Like many of the other competitions listed here, some competitions are for the sake of learning, and others have a prize pool.
The platform consists of two “layers,” the machine layer of bots and the human layer of data scientists building those bots and algorithms. Here, the data competitions take a slight turn and are viewed as more of a work in progress for consistent iteration. Winning algorithms are moved to CrowdANALYTIX’s database and then monitored for fine-tuning. If the algorithm starts to degrade, it is moved back to the community to be adjusted or rebuilt.
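The monitor-and-return loop described above could look something like the sketch below. This is not CrowdANALYTIX’s actual system: the accuracy metric, the tolerance threshold, and the function and status names are all assumptions chosen for illustration.

```python
def is_degraded(baseline_accuracy, recent_accuracies, tolerance=0.05):
    """Return True if the recent average accuracy has dropped more than
    `tolerance` below the accuracy measured when the model was accepted."""
    if not recent_accuracies:
        raise ValueError("need at least one recent accuracy measurement")
    recent_mean = sum(recent_accuracies) / len(recent_accuracies)
    return recent_mean < baseline_accuracy - tolerance

def next_step(baseline_accuracy, recent_accuracies, tolerance=0.05):
    """Decide whether a model stays in production or goes back for rework.
    The two status strings are hypothetical labels, not a real API."""
    if is_degraded(baseline_accuracy, recent_accuracies, tolerance):
        return "return_to_community"
    return "keep_in_production"

# Made-up monitoring data for two models accepted at 90% accuracy.
stable_status = next_step(0.90, [0.88, 0.87, 0.89])
drifting_status = next_step(0.90, [0.80, 0.82, 0.78])
print(stable_status, drifting_status)
```

Comparing an average over several recent measurements, rather than a single reading, keeps one noisy result from sending a healthy model back for rebuilding.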
Coda Lab Competitions
Hosted by: Coda Lab
Coda Lab is an open source platform for computational research. The competitions are held for the sake of collaborative research and code testing. While they don’t offer prestigious prizes, participants work together to create more efficient and reproducible code. Coda Lab focuses heavily on the programming and code-building side of data and can be a good way to dip your toes into collaborative projects and challenges.
Hosted by: Topcoder
Topcoder is similar to Coda Lab in that it is also a collaborative effort centered on code testing and research. It has a wide array of challenges and competitions on its main site, ranging from data science to coding to web design. Many of these offer decent prizes, though some are simply for the sake of a challenge.
The main draw is the annual Topcoder Open, the “Ultimate Programming and Design Tournament.” It features a range of competitions such as algorithms, development, UI design, and quality assurance. The initial competitions are online, with the winners earning points that net them additional prizes and a trip to the TCO finals hosted in the US. The TCO also has smaller regional events to bring the competition to even more people. These events last only a day or two, but offer more international opportunities to get involved.
Hot Tip: Need practice? Try out Makeover Monday! Each Sunday, a new data set is posted and people from around the world create data visualizations that are then discussed via webinar on the Wednesday of that week.