The Future of Kaggle: Where We Came From and Where We’re Going
Kaggle started off running supervised machine learning competitions. This attracted a talented and diverse community that now has nearly one million members. It’s exposed us to hundreds of machine learning use cases, introduced hundreds of thousands of people to machine learning, and helped push the state of the art forward. We’ve expanded by launching an open data platform, Kaggle Datasets, along with a reproducible and collaborative machine learning platform, Kaggle Kernels. Both have already achieved strong adoption in our community by making it simpler to get started with, share, and collaborate on data and code.
We’ve achieved less than 1% of what we’re capable of. Several weeks ago we announced our acquisition by Google. This enables us to move forward more rapidly and ambitiously. Working with analytics and machine learning is fraught with pain right now. It’s the software engineering equivalent of programming in assembly. It’s tough to access data. It’s tough to collaborate. It’s tough to reproduce results. We’ve seen these pain points over, and over, and over again. We’ve seen them in how our customers’ internal teams function. We’ve experienced them collaborating with our customers. We’ve seen them as people approach our competitions individually, and they become even more pronounced when our users team up. We want to solve these problems and foster an era of intelligent services that improve your lives every single day.
In this talk, I’ll go into depth on the lessons we’ve learned from running Kaggle and the most frustrating pain points we’ve seen. I’ll discuss how you can ameliorate these by leveraging current open source tools and technologies, and wrap up by painting a picture of the future we’re building towards.