Open data for all?



“I like writing code and it magically coming up with numbers that are useful or meaningful to people,” said Nathan Johnson, an independent developer measuring bus performance in the city. “I think there is a huge opportunity for increasing equity, transparency and accountability.”

It’s been six years since New York City first rallied its technically inclined citizens to build an app for that.

At its inception in 2009, New York City’s BigApps contest invited entrants to “develop applications that could benefit New Yorkers,” arming them with over 170 data sets and promising $20,000 in prize money. Today, that $20,000 has ballooned to $125,000, and some 1,350 data sets populate an Open Data Portal available not just to competition participants, but to the Internet at large. But as tech heavyweights throw their capital behind the competition and the City Council keeps pushing open data legislation, the city’s 311 hotline receives thousands of service requests a day – all of them, of course, catalogued online.

“It’s easier than ever for New Yorkers to engage with their communities at the touch of a button,” said Mayor Bill de Blasio, in an Oct. 14 statement announcing the creation of informational neighborhood websites. A beta version of the site, which will be released in 2016, confirms there was a rat sighting and excessive noise from jackhammering in Midtown West in the last two weeks. But the “number one complaint” Community Board 4’s transportation committee hears about the 311 system is that it is unresponsive, said committee co-chair Ernest Modarelli, adding that his constituents sometimes feel their complaints are going into an “endless log.”

John Krauss, a technical research fellow at New York University’s GovLab, echoed the sentiment at an Oct. 1 hearing held by the City Council’s committee on technology. Of the 106 user questions that have been posted to the Open Data Portal since 2011, he said, fewer than half have been answered, with an average of 180 days elapsing between query and response.

The Open Data Portal, the online 311 system, and BigApps are three of a battery of open data initiatives that were launched under former Mayor Michael Bloomberg’s administration. Some of the initiatives predate 2012 legislation that requires all agency data to be housed on the Open Data Portal by 2018 – but almost all demand funding, technical ability and cooperation across governmental offices, an effort Councilman James Vacca likened to “herding cats,” at the Oct. 1 hearing.

At the outset, many such initiatives were geared towards the technically literate. At a hearing about the 2012 legislation, then-Mayor Bloomberg said the release of data would catalyze “the creativity, intellect, and enterprising spirit of computer programmers to build tools that help us all improve our lives.” By contrast, the city’s most recent update to the plan carries the egalitarian title “Open Data for All,” and zeroes in on public outreach – promising a “five-borough tour” and unveiling a new “Data Lens” feature that turns spreadsheets on the portal into charts or maps. The plan’s success will be marked by how many New Yorkers – and “not just the tech-savvy New Yorkers” – use open data in their day-to-day lives, said New York City’s chief analytics officer Amen Ra Mashariki at the Oct. 1 hearing, shortly after Councilman Vacca cited user-friendliness as a persistent problem with open data efforts.

“Imagine your typical person who isn’t typically working with numbers,” said Lauren Rennee, an organizer at civic technology group BetaNYC. “No one wants to breeze through a 60-page document of tables,” she said, adding that BetaNYC is working with the Manhattan borough president’s office to provide technical assistance to community boards.

Some believe developers will continue to be indispensable mediators, translating the government’s raw data into something useful for citizens. “It’s not like my mother would download some spreadsheets” off a portal, said Max Heimstädt, a doctoral student at Freie Universität in Berlin, who is studying the introduction of open data in large cities, including New York City. Others think it can be “open data for all,” but that this would require knowing open data platforms exist, knowing how to handle the data, and “knowing that this is a form of political participation,” said Rennee. It also means understanding, she added, that “every time I report a pothole, I am data-entering.”

Either way, the Oct. 1 hearing acknowledged that problems with open data extend beyond just technically illiterate users.

For one, some data sets lack contextual information, or are so riddled with jargon that users can’t tease fact from flotsam. “DVR_DELETE_TIMESTMP” reads a column label on a data set about removing derelict vehicles; “Column M: FY15 Per Capita (no ATS growth)” defines a column relating to school budget allocation.

“Scientific and technical terminology is presented to the public with no explanation,” said Councilman Vincent Gentile at the hearing. He proposed a law that would require definitions for terms “the general public cannot reasonably be expected to understand,” and his proposal was discussed at the hearing alongside several others that seek to make open data more intelligible to casual users, and to increase the number and quality of data sets on the portal.

Another 90 data sets are slated to be released this year, but while legislation requires agencies to supply data, “the laws don’t have any teeth,” said Ben Wellington, the creator of I Quant NY, a blog that has used open data to determine, in one recent example, that September is “quantifiably the best month to go to the farmer’s market.” “If the agencies don’t release the data, there are no consequences,” he said, citing NYCHA as a “black hole” of data and the New York Police Department as an agency that has been slow to adopt open data but is moving in the right direction.

Bureaucratic red tape, contracts with outside vendors, or a lack of funding or technical ability might impede an agency’s compliance, as might a certain anxiety about having to “hand over data that used to be their own,” added Heimstädt.

Still, the “attitude of agency staff has changed,” said Steven Romalewski, director of the CUNY Mapping Service at CUNY Graduate Center’s Center for Urban Research. “They realize this is the direction that the government is going in.” Many city agencies, like the city’s Department of Finance, are already collecting large amounts of data to fulfill a specific function – like collecting property taxes. All that data has ancillary benefits that are “incredibly powerful and useful beyond that very specific purpose it’s collected for,” said Romalewski.

Civic technologists have already used open data to visualize the trees that line city streets, and to find two fire hydrants in the city that were racking up more than $55,000 in parking tickets per year. Still, their suggestions for open data include:

[Image: list of suggestions for open data]