Earlier this year, Google undercut its Silicon Valley rivals by acquiring what it described as “the world’s largest community of data scientists and machine learning enthusiasts” in the online platform Kaggle. The acquisition gave Google more direct access to Kaggle’s one million members, who compete and earn prize money for developing artificial intelligence solutions to all manner of data analysis problems — from improving the algorithms used by the online real estate giant Zillow, to helping a satellite company use data to “track the human footprint in the rainforest.”
Response to the Google merger within the tightly knit Kaggle community was somewhat mixed, but the platform’s international membership, which includes researchers in at least 194 countries, is now in revolt over Kaggle’s recent decision to host a U.S. government competition, one worth $1.5 million in prize money, that is legally off limits to foreigners.
The current controversy involves the Passenger Screening Algorithm Challenge sponsored by the U.S. Department of Homeland Security. Federal legislation known as the America Competes Act, enacted in 2007 and reauthorized under Barack Obama in 2011, bars the government, among other things, from awarding federal prize money, such as the sum DHS is offering, to anyone who is not an American citizen or permanent resident.
Such prize restrictions have rankled the Kaggle community before. The $1.2-million Zillow Prize launched in May, for example, initially prohibited Chinese Kagglers from participating in the second round of the competition because of Zillow’s concerns about acquiring intellectual property rights in China.
But the online real estate database company eventually lifted that restriction after a chorus of Kagglers voiced their disapproval. The new, exclusive DHS competition has generated a particularly pitched backlash among platform users, who argue that it runs counter to the Kaggle culture of open data, international cooperation, and meritocratic competition. Some international Kaggle members have raised concerns about U.S. participants rejecting potential foreign teammates because of the prize restrictions, while others have expressed outrage at being invited to make contributions to American national security without having a fair shot at the prize money.
Several Kagglers have effectively called for a boycott of the competition.
“Hosting open competitions in which people are differentiated by anything except their skill is unacceptable and it is definitely against the spirit of the community,” said Vladimir Iglovikov, a senior data scientist at TrueAccord, a debt-reconciliation startup based in San Francisco, in an email message.
Iglovikov is currently a U.S. permanent resident who is eligible to win prize money in the DHS competition. But earlier this year, he was blocked from receiving prize money for winning second place in the Safe Passage competition sponsored by the U.K. defense company BAE Systems and hosted by the U.K.’s Ministry of Defense. At the time, competition organizers told Iglovikov that his Russian citizenship disqualified him because of Russia’s score on Transparency International’s Corruption Perceptions Index for 2014.
“I believe that community is the biggest asset of Kaggle,” Iglovikov said, “and it was the main reason why Google acquired it.”
Qingchen Wang, a Kaggle member and data scientist at ORTEC Consulting, a business analytics firm in the Netherlands, expressed similar concerns. “There are over one million users, almost 60,000 active competitors, and countless other people who know of Kaggle and follow competition outcomes, so a competition that is held on Kaggle is sure to garner a lot of positive attention,” Wang said in an email. He proudly lists his Kaggle competitions grandmaster rank in his LinkedIn profile. “However, to leverage the one million users and 60,000 active competitors for positive attention while not allowing most of them to win prizes is unfair, to say the least.”
The new Passenger Screening Algorithm Challenge invites Kagglers to analyze a large dataset of body-scan images collected from volunteers passing through airport security scanners. In return, the DHS hopes that Kagglers will submit machine-learning algorithms that could improve the accuracy of U.S. airport scanners in detecting possible threats. Such machine learning algorithms represent the most popular artificial intelligence techniques used by data scientists to find patterns in huge amounts of data.
The main prize announcement on the official Kaggle forum has received more than 150 downvotes, even as a forum posting titled “This is insane discrimination and [an] insult to our international community” has received more than 400 upvotes. Groups of Kagglers can work in teams, and a few of the team names taking shape on the competition’s leaderboard reflect the current level of discord. One team name, used by an individual Kaggler, is simply “Kohei (non-U.S. citizen).” Another is “I’m not eligible, your loss Kaggle.”
Wang suggested that it didn’t have to be this way. Kaggle could have minimized controversy around the public challenge by making it a “master-only” private competition, similar to ones that have been held in the past, where participants are restricted based on Kaggle rank, he said. Alternatively, a different competition platform, such as InnoCentive, could have hosted the competition with a focus on soliciting contributions from U.S. participants.
The outcry has not been lost on Anthony Goldbloom, the chief executive of Kaggle. Goldbloom was born, raised, and educated in Australia before cofounding Kaggle in 2010 and moving the startup to San Francisco in 2011. In a phone call, he acknowledged the unfairness of what he described as a “two-speed competition,” where international Kagglers can only compete for ranking points on the platform, while U.S. Kagglers can compete for both points and prize money.
“We’re an international community with 40 percent in the U.S. and 60 percent overseas,” Goldbloom said. “We went into this knowing it was a less-than-ideal situation, because it cuts off a huge percentage of our user base.”
DHS’s Science and Technology Directorate first discussed the idea for the Passenger Screening Algorithm Challenge back in December 2015. The original vision for the challenge would have barred anyone who is not a U.S. citizen or permanent resident from even competing, let alone being eligible for prize money. But Kaggle put in “quite a lot of work” to find an interpretation of the America Competes Act that would allow international Kagglers to compete in the challenge, Goldbloom said. (Google was not involved beyond its March 2017 acquisition of Kaggle, Goldbloom noted, and a spokesman for the search giant, William Fitzgerald, said the company had no comment on the issue.)
Goldbloom and his team also see the DHS collection of body scan images as “a really interesting dataset” that could help move machine-learning and artificial intelligence research forward if it is made public. In their view, it’s better to seize such research opportunities, even if they are limited to some degree, than to pass them up entirely.
“We obviously had a choice. Do we host a competition where non-U.S. citizens are not open to winning prize money, or do we not?” Goldbloom said. “Our view is that the machine learning world is better off with this being represented.”
Despite their frustrations, Kagglers such as Iglovikov and Wang said that they generally don’t blame defense-related government agencies such as the DHS for having their hands tied in doling out prize money. Some U.S. federal agencies have even found creative workarounds by providing honorable mentions, prize celebration events, or “introductions to the broader business community,” said John Verrico, a spokesman for the DHS’s Science and Technology Directorate.
But Kagglers seem to expect the platform’s broader mission of crowdsourcing important data-science solutions to transcend national borders and parochial interests. Indeed, many Kaggler comments reflect a fear that the meritocratic online community they have built together may now be under threat, and what Kaggle — or now, parent company Google — does next may go a long way toward dispelling or confirming such fears for the larger data science community.
“What is not OK,” Iglovikov said, is that Kaggle has agreed to host a competition in which people are differentiated primarily by where they live, rather than by characteristics “like intelligence, creativity, skill, the ability to learn fast, [and] the ability to work effectively as a part of the team.”
Jeremy Hsu is a freelance journalist based in Brooklyn. His work has appeared in publications such as Scientific American, Wired, Backchannel, IEEE Spectrum and Mosaic. He maintains the Lovesick Cyborg blog for Discover Magazine.