Open Data for Development in Latin America and the Caribbean » democratic control


democratic control

Open Data is Hot Topic at the W3C Brazil Conference

On October 18-20, the city of São Paulo hosted the 4th Web.br Conference, an event promoted by the W3C Brazil office to debate the future of the Web, and Open Data was one of the hot topics debated.

According to the manager of W3C Brazil, Vagner Diniz, debating the opening of data is paramount for the Web’s progress. Hence, the topic featured in several activities on the program, such as panels, lectures, coffee break chats and the hackathon:

“There is an ever-increasing number of devices capable of connecting to the Internet. Connecting so many different types of devices to the Internet only makes sense if they can communicate with each other, i.e. if they can exchange information, so that this data sharing enables better use of each connected device. When we talk about an open Web, which was the theme of this Conference, we are talking about a Web that comprises these connected devices. And by talking about a Web that comprises these connected devices, we are referring to Open Data. For it is paramount to have data that can travel seamlessly from one place to another, or data that lets me use my device to access data on a different device, thus enriching my Web experience.”

With a full room at 9 a.m., the talk by Jeanne Holm (Data.gov evangelist and Chief Systems Architect at NASA’s Jet Propulsion Laboratory) presented the U.S. experience of opening its data and the impact on citizens’ lives. In an exclusive interview, Holm said that the U.S. Government’s focus on this topic is on how to provide more data, information and services to citizens, so that they can make better decisions in their daily lives.

According to her, the government’s Open Data initiative involves 180 agencies, which have already provided access to 400 thousand databases.

“What is interesting about this is that when developers come together, such as in an event like this Conference today, they get their hands on these data and create applications or websites, or data journalists analyze them and help explain what those data mean.”


Another highlight of the Conference was an announcement by the Ministry of Justice confirming its first publication of data on the website dados.gov.br. According to Francisco Carvalheira, Coordinator of the Ministry of Justice’s Transparency and Access to Information Program, the institution decided to open its database of consumer complaints received through Procons (Consumer Protection Agencies) across the country. A “substantiated complaint” is an administrative procedure provided for in the Consumer Protection Code and accounts for 15% of the complaints registered by Procons.

“We believe that society will be able to come up with potential uses for this database. We believe that by publishing this database in open format we’ll be contributing to the actual consumer protection public policy.”

The announcement was made by the Ministry of Justice during the panel “How to make the most of the Access to Information Act”. During this presentation, Francisco Carvalheira said that the institution has so far received 2,047 requests for access to information.

Also, to ensure the debates had a practical side, Web.br created a space for journalists, programmers and web designers to work together on existing databases to produce information. During the Decoders hackathon, Open Data cases were presented and application templates were created using public databases.

Developer Zeno Rocha says that he and a friend created a game especially to be presented at Decoders and to motivate participating developers. According to him, it is an application aimed at giving young Facebook users information on politicians in a fun way.

The developers Kako and Rafael, on the other hand, saw their “Transpolitica” project win the hackathon. This was the first time either of them had worked with Open Data.

Open 311

3-1-1 is a well-known number in some cities of the United States and Canada, which citizens can call to notify the authorities about non-urgent situations such as non-working traffic lights, illegal burning, roadway problems, etc. The goal is to reserve 9-1-1 for emergencies that really need immediate attention.

Open 311 “provides open channels of communication for issues that concern public space and public services. Using a mobile device or a computer, someone can enter information (ideally with a photo) about a problem at a given location. This report is then routed to the relevant authority to address the problem. What’s different from a traditional 311 report is that this information is available for anyone to see and it allows anyone to contribute more information. By enabling collaboration on these issues, the open model makes it easier to collect and organize more information about important problems. By making the information public, it provides transparency and accountability for those responsible for the problem. Transparency also ensures that everyone’s voice is heard and in-turn encourages more participation”.
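
The flow described in this quote can be sketched as a small data model. This is only an illustration under stated assumptions, not the actual Open311 API: the class name `ServiceRequest`, the `ROUTING` table and the agency names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical routing table: issue category -> responsible authority.
ROUTING = {
    "traffic_light": "Department of Transportation",
    "pothole": "Public Works",
}


@dataclass
class ServiceRequest:
    """A non-emergency report that anyone can read and add to."""
    category: str
    description: str
    lat: float
    lon: float
    photo_url: Optional[str] = None
    comments: List[str] = field(default_factory=list)

    def responsible_agency(self) -> str:
        # Route the report to the relevant authority, with a fallback.
        return ROUTING.get(self.category, "General Services")

    def add_comment(self, text: str) -> None:
        # Open model: other citizens can contribute more information.
        self.comments.append(text)


report = ServiceRequest("traffic_light", "Signal stuck on red", 38.9072, -77.0369)
report.add_comment("Still broken this morning.")
print(report.responsible_agency(), len(report.comments))
```

Keeping the comments on the report itself, rather than in a private ticket system, is what mirrors the "open" part of the model: the same record serves the authority and the public.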


Developing Latin America

Desarrollando América Latina (Developing Latin America) is an event aimed at fostering application development, which takes place simultaneously in eight Latin American countries: Argentina, Brazil, Bolivia, Chile, Costa Rica, Mexico, Peru and Uruguay. Its goal is to bring together web developers, webmasters, web designers, journalists and other professionals in a contest for new applications based on reusing open data.

It is renowned as the biggest collaborative hackathon in the region and is currently in its second edition. In 2011, the applications created using open data addressed issues such as health, education and security. Among about 50 applications created, Onde Acontece? won first prize. Its idea was to cross-reference various datasets and provide information on public safety.

Brazilian Open Data Portal

A data repository, the dados.gov.br portal aggregates 82 public datasets formerly scattered across the Internet. Launched by the Ministry of Planning, the project was designed with extensive contributions from society. The website also enables people to suggest new data for opening, to participate in Open Data events and to keep up to date with the portal’s development initiatives.

Users can also check out a few applications developed by communities using data available through the portal. One of the applications is the so-called “Basômetro”, a tool that measures parliamentary support for the government and monitors the stances of members of parliament in votes on legislation.
Another application available on the website maps workplace accidents that occurred in Brazil between 2002 and 2009. Users are able to view accidents by municipality and by type.

The dados.gov.br portal is part of the National Infrastructure of Open Data (INDA), a project aimed at setting forth technical standards for Open Data, promoting qualification and sharing public information using open formats and free software.

“Most of the data stored by governments is not translated into information or services to the population”

Interview originally published in Blog Públicos – Estado de São Paulo

“Governments are not really aware of the amount and nature of the data they have stored. When they do have a rough idea, they lack the time to consider how that data can be applied and converted into services for the population.”

Vagner Diniz is general manager in Brazil of the W3C consortium, an international community of 300 private and state enterprises and universities that work together to develop Web standards. In his interview with Públicos, he maintains that governments must allow civil society to decide which public data are of interest to the population. He also believes that both parties must join forces so that the supply of data meets the demand for information.

“We cannot just sit around waiting for the government to publish information, wasting money on data that might not even be of interest to the population. We will try to identify which data can be actually useful, create a demand for it and reach an agreement with government bodies to come up with a framework of priorities,” he says.
According to Diniz, civil society can spot possibilities in the data that are overlooked by governments. “Two hundred million people will see much more than 4 or 5 million civil servants.”

Why is it important for governments to publish their data in open formats?
The amount of data gathered and not used by governments ends up creating a useless mass of information. Governments use only the portion of the data that they need for administrative purposes. Most of it is not translated into information or services for the population. Governments are not really aware of the amount and nature of the data they have stored. When they do have a rough idea, they lack the time to consider how that data can be applied and converted into services for the population.

How important is this information to civil society?

What’s most important in making this information available is allowing the population itself to say: “This set of data might interest me, it is useful to me. Let me use it because I’ll be able to come up with scenarios in which it is relevant, while you as government have too many other concerns that prevent you from seeing what I can see.” In other words, it’s the idea that two hundred million people will see much more than 4 or 5 million civil servants. With governments worldwide starting to open their data, organizations, communities, interested individuals, Web programmers and volunteers have created interesting application software to make use of the data available.

What about for governments?
Curiously, this has generated an exchange of data within governments themselves. Different government bodies now have access to information from other bodies, which was previously very difficult to obtain due to endless bureaucratic processes.

This will undoubtedly contribute to greater government efficiency. But how can we guarantee that the immense supply of data stored by governments will meet society’s demand for information?
That is a tough task which I do not expect to see easily accomplished. Reaching an ideal stage of free-flowing information from government to society will be a hard process. It will involve raising awareness. There is a lot of resistance to publishing public data because the government sees itself much more as a proprietor than a custodian of that data. Public data are public, they belong to the population, and governments are custodians of data, but they act like proprietors. They fear what will be done to “their” data. A second effort involves qualification, as publishing these data in open formats demands a certain degree of technical expertise. We have to study the technologies that allow data to be openly published on the Internet. We must train people to do this.

Now…
…lastly, there must be an open and frank dialogue between the custodians of the data, the government bodies, and those interested in having access to the data, civil society organizations and many private citizens. We will try to address priorities. We cannot just sit around waiting for the government to publish information, wasting money on data that might not even be of interest to the population. We will try to identify which data can be actually useful, create a demand for it and reach an agreement with government bodies to come up with a framework of priorities.

You once mentioned that developing application software is much easier than gathering consistent data. Could you explain this?
Developing an application based on available data merely involves writing code that any slightly experienced web developer can read and freely apply to their own application. It is quite simple, much like creating a Web page. You don’t even have to be a Web developer to create a Web page nowadays, thanks to the tools available. Publishing data in an open format is more complicated, given that you, as the custodian of that data, have many other concerns besides the technical aspect of making the data available. It’s about more than that…

Yes…
…you have to make sure that the data is consistent. There cannot be another dataset with information that clashes with the data being published. You will publish three, four, ten databases, and any similar information they contain cannot be inconsistent. Secondly, there are security issues you need to worry about. You cannot allow the person who will use the data to alter them in any way. Thirdly, the data being published must be certified. Because if someone happens to misuse these data and alter them in any way, and then claim to have obtained the information from a government website, you, as the publisher, can prove that the original data were altered by that person. So there are many aspects to be considered when making information available.
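
One common way to make alteration detectable, offered here as a sketch rather than as the method Diniz has in mind, is to publish a cryptographic digest next to each dataset. The dataset contents and variable names below are hypothetical.

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    """Return the hex SHA-256 digest of a dataset's raw bytes."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical dataset contents as published by the custodian.
published = b"street,district\nRua A,Centro\n"
published_digest = sha256_digest(published)  # publish this next to the file

# A copy that someone altered after downloading it.
altered = b"street,district\nRua A,Centre\n"

print(sha256_digest(published) == published_digest)  # unchanged copy matches
print(sha256_digest(altered) == published_digest)    # altered copy does not
```

A digest only shows that a copy differs from the published original. Proving who published the original would additionally require a digital signature, which is outside the scope of this sketch.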

Can you give an interesting example of data inconsistency?
I had this experience as IT director of a city in the state of São Paulo. A typical case was the city’s street register. Each city hall department had its own register, with data fields tailored to the needs of that department. The finance department’s register was geared towards collecting property tax, while the register of the public roads department focused on road works. The legal department was more focused on executing outstanding debts, and so forth. I counted six or seven registers. All of them had different information about the same streets. Even worse, the street names also differed among the registers, with different abbreviations. You never knew if a street in one register was the same as in another. It was also impossible to unify these registers, as they had different formats. This poses a serious problem when the information is made available, as different registers show the same information in different ways.
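
The street-name problem described here can be illustrated with a small normalization step that maps differently abbreviated names to one canonical key. This is a minimal sketch: the abbreviation table and the sample street names are hypothetical, and a real register would need far more rules.

```python
import re
import unicodedata

# Hypothetical abbreviation table; a real register would need a longer one.
ABBREVIATIONS = {"r": "rua", "av": "avenida", "pca": "praca", "dr": "doutor"}


def normalize_street(name: str) -> str:
    """Reduce a street name to a canonical key for matching."""
    # Strip accents so "Praça" and "Praca" compare equal.
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Lowercase, drop punctuation, expand known abbreviations.
    words = re.findall(r"[a-z0-9]+", text.lower())
    return " ".join(ABBREVIATIONS.get(w, w) for w in words)


finance = "R. Dr. Almeida Lima"       # as the finance register might store it
roads = "Rua Doutor Almeida Lima"     # as the roads register might store it
print(normalize_street(finance) == normalize_street(roads))
```

Matching on a normalized key does not resolve conflicting attributes between registers; it only answers the first question raised above, whether two entries refer to the same street.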

This reveals not only the size of the problem, but also the growing need to standardize government information.
Absolutely. This has been critical since the adoption of information technology in the organization of corporations. The need for standardization goes way back. Professionals in the area joke that the purpose of information technology is not to help you get better organized, but to help you make the same blunders you used to make without it (laughs). When you computerize an environment without altering processes and standardizing information, you will just do the same things you did before, but more quickly.


Can the private sector benefit from open data? If so, how?

I believe so, although the private sector has not yet realized this. It can benefit greatly in many areas of the open data value chain, especially technology businesses. One example is publishing open data on the Web. Moreover, creative and innovative businesses will scrutinize the open data carefully and be able to find ways to reuse and transform these data into commercially valuable services.

Can you give an example?
Nowadays, the IBGE Census is a rich source of information. It contains a lot of data on the country, the citizens, their distribution and characteristics. If these data are made available they can be extremely useful, provided the right to confidentiality of personal data is ensured. Based on them you could, for example, offer consultancy services for new businesses grounded in socioeconomic profiles; you could also give advice on which businesses are in demand based on household profiles. There is another example in operation in Brazil called Gas Finder, an application for mobile phones which allows users to locate nearby gas stations. It is extremely useful and was developed using data available on the website of the National Oil Agency. You don’t necessarily have to generate income by charging the customer directly; income may be generated from ads displayed with the information. All it takes is entrepreneurship and creativity.

Cases: Apps for Democracy

The idea was born in 2008, when the DC government wanted to ensure that society, governments and businesses alike could make good use of DC.gov’s Data Catalog (which provides, for example, public information on poverty and crime indicators in an open format).

Therefore, a competition was created to award the best applications developed with data from the Catalog. The first contest cost Washington DC US$50,000 and produced 47 iPhone, Facebook and web applications with an estimated value to the city in excess of US$2,600,000.

The application iLive.at won a gold medal for providing crime, safety and demographic information for those looking for a place to live in DC.

Another award-winning project was Park It DC, which allows users to check a specific area in the district for parking information.


The 5 stars of Open Data

When we talk about Open Data strategies that reach further than publishing information, we may introduce the concept of Linked Data into the debate or go even further: Linked Open Data (LOD).

In the words of Tim Berners-Lee, the inventor of the World Wide Web, “Linked Open Data is Linked Data which is released under an open license”. Linked Data does not have to be open, but Linked Open Data, by definition, does. To promote this type of data, Tim Berners-Lee suggests a 5-star rating system.

This rating system awards a star to initiatives that make information publicly available in open format. More stars are awarded progressively based on how open and accessible the data analyzed is:

★ Available on the Internet (in any format, e.g. PDF) under an open license, so that it qualifies as Open Data

★★ Available on the Internet as machine-readable structured data (e.g. an Excel file with an XLS extension)

★★★ Available on the Internet as machine-readable structured data and in a non-proprietary format (CSV instead of Excel)

★★★★ All of the above, and it must use W3C open standards (RDF and SPARQL): use URIs to identify things, so that people can point at your data.

★★★★★ All of the above plus: link your data to other people’s data to provide context.
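
The difference between the upper levels can be made concrete with a short sketch. It publishes the same record first as CSV (three stars), then as RDF triples in N-Triples syntax with its own URI (four stars) and a link into another dataset (five stars). The record, the vocabulary and every example.org URI are hypothetical placeholders.

```python
import csv
import io

# Hypothetical record; every example.org URI is a placeholder.
school = {"id": "42", "name": "Escola Central", "city": "Sao Paulo"}

# Three stars: structured, non-proprietary CSV.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["id", "name", "city"])
writer.writeheader()
writer.writerow(school)
csv_text = buffer.getvalue()

# Four stars: the record gets its own URI and is expressed as RDF triples
# (N-Triples syntax). Five stars: the second triple links to someone
# else's (here imaginary) dataset instead of repeating a plain string.
subject = f"<http://example.org/school/{school['id']}>"
triples = [
    f'{subject} <http://example.org/vocab/name> "{school["name"]}" .',
    f"{subject} <http://example.org/vocab/city> "
    "<http://example.org/other-dataset/city/sao-paulo> .",
]
ntriples_text = "\n".join(triples)

print(csv_text)
print(ntriples_text)
```

In practice a publisher would more likely use an RDF library and an established vocabulary than build the strings by hand; the hand-built version is used here only to keep the sketch self-contained.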

 

We have reproduced below a list of the benefits of publishing data according to the 5-star rating system, both for publishers and consumers:

 

Benefits of the 5-star rating

★

Consumer:

  • you can see the data
  • you can print it
  • you can store it (e.g. on your hard drive or on a memory stick)
  • you can change the data as you wish
  • you can access the data from any system
  • you can share the data with anyone

Publisher:

  • publishing is simple
  • you don’t need to keep repeating that people are allowed to use the data

★★

Consumer:

  • same benefits as for the one-star rating
  • proprietary software can be used to process, aggregate, calculate and view the data, and the data may be exported in any structured format

Publisher:

  • publishing is easy

★★★

Consumer:

  • same benefits as for the two-star rating
  • you are able to handle the data as you wish, without having to use particular software

Publisher:

  • publishing is even easier

★★★★

Consumer:

  • same benefits as for the three-star rating
  • you can bookmark the data
  • you are able to reuse part of the data
  • you are able to reuse existing tools and data libraries, even if these are only partially compliant with the standards used by the publisher
  • you can combine the data with other data

Publisher:

  • you have control over data items and you can optimize access to them
  • other publishers may link to your data, promoting it to 5 stars

★★★★★

Consumer:

  • you can uncover more linked data whilst consuming data
  • you can learn about the 5-star rating

Publisher:

  • you make your data easier to find
  • you add value to your data
  • your organization enjoys the same benefits of linking data as consumers

 

 

How to Open?

In order to be regarded as open, public data must be comprehensive, accessible, primary (no statistical treatment), current, machine readable, non-discriminatory (e.g. not requiring registration) and non-proprietary, and its licenses must uphold these principles without limiting freedom of use.

Much publicly available data is not really open. It may have been published in proprietary formats or in formats that software cannot readily process, and under restrictive licenses; it may be available only as HTML tables, plain text files or PDF. Developers must, therefore, translate these data, cross-reference them and publish them according to the rules and principles set forth.
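
As an illustration of the "translation" step, the sketch below turns a simple HTML table into CSV using only the Python standard library. The table contents are invented for the example, and real pages (nested tables, merged cells, broken markup) would need more care.

```python
import csv
import io
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the text of every <td>/<th> cell, row by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


# Hypothetical published table (invented values).
html = """<table>
<tr><th>Municipality</th><th>Accidents</th></tr>
<tr><td>Campinas</td><td>120</td></tr>
</table>"""

parser = TableExtractor()
parser.feed(html)
parser.close()

out = io.StringIO()
csv.writer(out).writerows(parser.rows)
print(out.getvalue())
```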

Institutions that wish to open their data must prepare an activity plan. This plan ranges from determining which data will be published, to how it will be published and viewed, to strategies for promoting the use of such data by communities and activists.

The international movement for opening government data is based on 3 laws proposed by David Eaves:

  • If data can’t be spidered or indexed, it doesn’t exist.
  • If it isn’t available in open and machine readable format, it can’t engage.
  • If a legal framework doesn’t allow it to be repurposed, it doesn’t empower.

 

In other words, the first step towards opening data is identifying the information controlled by governments, companies, etc. Then it must be converted into a machine readable format and, finally, made accessible to all.


 

Open Government Data and Laws on Access to Information

The initial steps towards sharing public information were taken through “laws on access to information”, first heard of in Sweden in 1766. Much later, in 1951, Finland proposed regulations on the topic, followed by the U.S. in 1966. Currently, about 80 countries have adopted some kind of legislation on access to information, based on two operating principles: reactive and proactive transparency.

The principle of reactive transparency sets forth that governmental bodies are legally obliged to reply to requests for public information made by citizens, usually within five to thirty working days. The principle of proactive transparency, on the other hand, requires agencies to share and publish information which has not necessarily been requested by citizens.

The nature and format of the data published vary depending on the country’s regulations, although most countries require proactive sharing of information related to institutional data, the role of public agencies, services offered, procedural rules and lists of employees and authorities. Several laws on access to information also require public sharing of budgets and public agreements, for example.

With regard to format requirements for public data sharing, it is important to stress that the earliest of these laws were created before the digital age and therefore aimed at sharing public documents, not the information used to prepare them.

In contrast, more recent regulations implemented in the digital age require governments to publish information online. This is the case of the Chilean law on access to information, approved in 2009, which requires authorities to proactively publish up-to-date public information on their websites. This includes: organic structure, roles, regulatory procedures, lists of public services provided, means of access to such public services, lists of public employees and their respective salaries, among others.

Older laws on access to information, such as the Canadian and the American legislation, have been revised to adapt to the digital age. These revisions require information to be published online and set forth that both requests and responses to requests must be submitted electronically.

Hence, Open Government Data (OGD) may be viewed as the natural next step and legacy of the principle of proactive transparency in public information sharing, regulated by laws on access to information. That said, laws on access to information and OGD may also be viewed independently, and are not necessarily linked to each other. We may also infer that a broader understanding of information access rights, particularly when legally enforced, and of the State’s duty to enforce such rights provides a more substantial and effective basis for the design of Open Government Data initiatives.

In other words, if backed by the aforementioned laws on access to information, open government data policies no longer have to settle whether citizens have a right to access information; instead, they can focus on the amount and format of the data made available.

What is it?

 

 

What is Open Data?
According to the Open Knowledge Foundation, a non-profit organization, “open data is data that can be freely used, reused and redistributed by anyone.” It involves the publication and sharing of information online in open formats, readable by machines, which may be freely and automatically reused by society.

 

When is data regarded as open?
Data is regarded as open when there is:

  • Availability and access: data must be fully available at no more than a reasonable reproduction cost, preferably through downloading; it must also be available in a convenient and modifiable format.
  • Reuse and redistribution: data must be provided under terms that enable reuse and redistribution, including cross-referencing with other datasets.
  • Universal participation: anyone can use, reuse and redistribute it, without discrimination against fields of activity, people or groups (restrictions such as “non-commercial”, which prevent commercial use, are not allowed, nor are restrictions to certain purposes, such as “education only”).
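
These three criteria can be turned into a rough automated check over a dataset's metadata. The sketch below is only an approximation under stated assumptions: the metadata field names, the list of machine-readable formats and the restrictive license terms are all hypothetical, and a real assessment would need to read the actual license.

```python
# Hypothetical metadata fields and lists, for illustration only.
MACHINE_READABLE = {"csv", "json", "xml", "rdf"}
RESTRICTIVE_TERMS = ("non-commercial", "education only", "no derivatives")


def openness_problems(meta: dict) -> list:
    """Return the reasons a dataset fails the criteria (empty if none)."""
    problems = []
    # Availability and access.
    if not meta.get("download_url"):
        problems.append("not available for download")
    if meta.get("format", "").lower() not in MACHINE_READABLE:
        problems.append("format is not machine readable")
    # Reuse, redistribution and universal participation.
    license_text = meta.get("license", "").lower()
    for term in RESTRICTIVE_TERMS:
        if term in license_text:
            problems.append(f"license restriction: {term}")
    return problems


dataset = {
    "download_url": "http://example.org/data.csv",
    "format": "CSV",
    "license": "Free for non-commercial use",
}
print(openness_problems(dataset))
```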

 

What types of data can be open?
All data can be open!
Those usually interested in opening data include governments, companies, activists, and teaching and research institutions.

 

Why open data?

Opening data enables:

  • Transparency and democratic control;
  • Population engagement;
  • Citizen empowerment;
  • Better or new private services;
  • Innovation;
  • Improved efficiency and effectiveness of governmental services;
  • Assessment of the impact of policies;
  • Uncovering new things by combining data sources and patterns.

 

What about open government data?
This is information produced by governments that must be made available to all citizens for any purpose. Government data are regarded as open when they comply with the following laws and principles.

 

What are they for?
For reuse by citizens and organizations in a society to verify, clarify, inspect and monitor them, according to their interests. Opening public data strengthens institutions, enables citizenship and social control, fights corruption, promotes transparency, enables inspections and fosters new ideas for public policies from within society itself.
Citizen engagement enables the government to improve its processes and increase the transparency of public administration. This happens because the open government data made available clarifies how the sectors that are still not aligned with social control and service goals work.

 

How does it work in practice?
Opening data makes it possible, for example, to create a mobile phone application showing where the public schools in an area are located, how vacancies are distributed and where demand for places is highest; or one showing how public money is being spent, or public safety levels in a given municipality or neighborhood.
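
The core of such a school-finder application could look like the sketch below, which filters an open dataset by distance and available places. The school list is invented for the example, and the function names are hypothetical; the distance uses the standard haversine formula with a mean Earth radius of 6371 km.

```python
from math import asin, cos, radians, sin, sqrt

# Hypothetical open dataset: school name, latitude, longitude, open places.
SCHOOLS = [
    ("School A", -23.5505, -46.6333, 12),
    ("School B", -23.5629, -46.6544, 0),
    ("School C", -22.9068, -43.1729, 30),
]


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def schools_nearby(lat, lon, radius_km):
    """Names of schools within radius_km that still have open places."""
    return [
        name
        for name, s_lat, s_lon, places in SCHOOLS
        if places > 0 and distance_km(lat, lon, s_lat, s_lon) <= radius_km
    ]


print(schools_nearby(-23.55, -46.63, 5))
```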