Solve These Tough Data Problems and Watch Job Offers Roll In
Late in 2015, Gilberto Titericz, an electrical engineer at Brazil’s state oil company Petrobras, told his boss he planned to resign, after seven years maintaining sensors and other hardware in oil plants. By devoting hundreds of hours of leisure time to the obscure world of competitive data analysis, Titericz had recently become the world’s top-ranked data scientist, by one reckoning. Silicon Valley was calling. “Only when I wanted to quit did they realize they had the number-one data scientist,” he says.
Petrobras held on to its champ for a time by moving Titericz into a position that used his data skills. But since topping the rankings that October he’d received a stream of emails from recruiters around the globe, including representatives of Tesla and Google. This past February, another well-known tech company hired him, and his family moved to the Bay Area this summer. Titericz described his unlikely journey recently over colorful plates of Nigerian food at the headquarters of his new employer, Airbnb.
Titericz earned, and holds, his number-one rank on a website called Kaggle that has turned data analysis into a kind of sport, and transformed the lives of some competitors. Companies, government agencies, and researchers post datasets on the platform and invite Kaggle’s more than one million members to discern patterns and solve problems. Winners get glory, points toward Kaggle’s rankings of its top 66,000 data scientists, and sometimes cash prizes.
Alone and in small teams with fellow Kagglers, Titericz estimates he has won around $100,000 in contests that included predicting seizures from brainwaves for the National Institutes of Health, the price of metal tubes for Caterpillar, and rental property values for Deloitte. The TSA and real-estate site Zillow are each running competitions offering prize money in excess of $1 million.
Veteran Kagglers say the opportunities that flow from a good ranking are generally more bankable than the prizes. Participants say they learn new data-analysis and machine-learning skills. Plus, the best performers, like the 95 “grandmasters” who top Kaggle’s rankings, are highly sought-after talents in an occupation crucial to today’s data-centric economy. Glassdoor has declared data scientist the best job in America for the past two years, based on the thousands of vacancies, good salaries, and high job satisfaction. Companies large and small recruit from Kaggle’s fertile field of problem solvers.
In March, Google came calling and acquired Kaggle itself. The site has been integrated into the company’s cloud-computing division, and has begun to emphasize features that let people and companies share and test data and code outside of competitions, too. Google hopes other companies will come to Kaggle for the people, code, and data they need for new projects involving machine learning, and run them in Google’s cloud.
Kaggle grandmasters say they’re driven as much by a compulsion to learn as to win. The best go to extreme lengths to do both. Marios Michailidis, a previous number one now ranked third, got the data-science bug after hearing a talk on entrepreneurship from a man who got rich analyzing trends in horseraces. To Michailidis, the money was not the most interesting part. “This ability to explore and predict the future seemed like a superpower to me,” he says. Michailidis taught himself to code, joined Kaggle, and before long was spending what he estimates was 60 hours a week on contests, in addition to a day job. “It was very enjoyable because I was learning a lot,” he says.
Michailidis has since cut back to roughly 30 hours a week, in part due to the toll on his body. Titericz says his own push to top the Kaggle rankings, made not long after the birth of his second daughter, caused some friction with his wife. “She’d get mad with me every time I touched the computer,” he says.
Entrepreneur SriSatish Ambati has made hiring Kagglers a core strategy of his startup, H2O, which makes data-science tools for customers including eBay and Capital One. Ambati hired Michailidis and three other grandmasters after he noticed a surge in downloads when H2O’s software was used to win a Kaggle contest. Victors typically share their methods in the site’s busy forums to help others improve their technique.
H2O’s data celebrities work on the company’s products, providing both expertise and a marketing boost akin to a sports star endorsing a sneaker. “When we send a grandmaster to a customer call their entire data-science team wants to be there,” Ambati says. “Steve Jobs had a gut feel for products; grandmasters have that for data.” Jeremy Achin, cofounder of startup DataRobot, which competes with H2O and also has hired grandmasters, says high Kaggle rankings also help weed out poseurs trying to exploit the data-skills shortage. “There are many people calling themselves data scientists who are not capable of delivering actual work,” he says.
Competition between people like Ambati and Achin helps make it lucrative to earn the rank of grandmaster. Michailidis, who works for Mountain View, California-based H2O from his home in London, says his salary has tripled in three years. Before joining H2O, he worked for customer analytics company Dunnhumby, a subsidiary of supermarket Tesco.
Large companies like Kaggle champs, too. An Intel job ad posted this month seeking a machine-learning researcher lists experience winning Kaggle contests as a requirement. Yelp and Facebook have run Kaggle contests that dangle a chance to interview for a job as a prize for a good finish. The winner of Facebook’s most recent contest last summer was Tom Van de Wiele, an engineer for Eastman Chemical in Ghent, Belgium, who was seeking a career change. Six months later, he started a job at Alphabet’s artificial-intelligence research group DeepMind.
H2O is trying to bottle some of the lightning that sparks from Kaggle grandmasters. Select customers are testing a service called Driverless AI that automates some of a data scientist’s work, probing a dataset and developing models to predict trends. More than 6,000 companies and people are on the waitlist to try Driverless. Ambati says that reflects the demand for data-science skills, as information piles up faster than companies can analyze it. But no one at H2O expects Driverless to challenge Titericz or other Kaggle leaders anytime soon. For all the data-crunching power of computers, they lack the creative spark that makes a true grandmaster.
“If you work on a data problem in a company you need to talk with managers, and clients,” says Stanislav Semenov, a Moscow-based grandmaster and former number one who is now ranked second. He likes to celebrate Kaggle wins with a good steak. “Competitions are only about building the best models, it’s pure and I love it.” On Kaggle, data analysis is not just a sport, but an art.