Governments around the world have released more than a million open data sets over the last decade. Analyzing that data -- when it's accurate -- can help policymakers make better decisions, but they're only beginning to tap into this potential. A project from the Massachusetts Institute of Technology Media Lab shows what's possible.
The open data governments have released so far has helped fuel job creation and some societal changes, including increased government accountability and consumer protection, more transparent health care costs and more resilience against climate change.
In 2013, the Foundation for Research of the State of Minas Gerais, Brazil's version of the National Science Foundation, hired César Hidalgo, the head of the MacroConnections group at the MIT Media Lab, to produce a report on where industrial developments would flourish. Hidalgo decided to go beyond a stale, static document and create something he believed would be more useful and dynamic.
His team at the Media Lab released its DataViva engine, which allows users to visualize more than 500 gigabytes of Brazilian government data in 1 billion different ways. The public can use the software platform to quickly and easily mash up multiple sets of economic, demographic, trade and educational data. The idea behind DataViva, Hidalgo explained, "is to make reports obsolete."
Now the team is relaunching DataViva with information from all across Brazil -- making it the largest data visualization platform online.
In the two years since DataViva's release, the MIT team has taken an amazing amount of information -- international trade data from more than 5,000 municipalities, employment data from 50 million workers in Brazil's formal economy, enrollment and graduation data from Brazil's university and basic education systems, and five years' worth of tax data -- and has standardized it, structured it and added it to its system.
The code behind DataViva is open source and available on the code-sharing site GitHub, and all of the government data it uses is accessible as downloadable files. Hidalgo and his business partners are now exploring whether other countries, states and cities may find the data visualization engine useful.
The long-term vision of the site may not be apparent on a first visit. DataViva, like The Observatory of Economic Complexity and Pantheon, resembles an encyclopedia: no one would read it front to back, yet its usefulness as a resource reveals itself over time. Journalists looking to cite raw numbers about the economy, decision-makers looking to validate policy with data trends, and curious citizens wondering about the composition of their municipality or the distribution of salaries paid to people in the same occupation can all consult the tool in a structured, logical way. That is the result of many iterations on both the user interface and the user experience of the site.