===This is the feedback given to NIC and DST in response to the NDSAP policy in Jan 2013===
Problems with NDSAP for Open Data
The policy that governs the sharing of government data in India is the National Data Sharing and Accessibility Policy (NDSAP). This policy is the first step toward opening up government data. However, now that it has been implemented through the creation of data.gov.in, it is clear that there are many flaws in the policy that prevent it from being the cornerstone for truly opening up government data.
The following are changes that the open data community would like to see, so that the NDSAP can truly become an open data policy and so that data.gov.in and other initiatives can fulfill their promise of openness and transparency.
Establishing Open Data as the goal of the NDSAP
The NDSAP is clearly aimed at the internal audience of government ministries. However, its scope goes beyond internal issues and moves into the public sphere. Its overall goal should be to make government more transparent to its citizens, and it should become an open data policy, not just an internal sharing policy.
RTI Integration
India's premier transparency legislation, the Right to Information (RTI) Act, already contains requirements for proactive sharing of information, including data. There is no reason that the NDSAP should stray far from these requirements. Aligning the NDSAP with the RTI Act will strengthen the policy and give it popular support.
1. Use the RTI Act's privacy stipulations, and transparently and clearly define what is classified and what isn't.
2. Nothing prevents everything from being put in the Restricted Use category. Make clear what is considered priority data and what datasets will be published. Offer guidelines on what should be registered and what should be restricted.
3. Pricing policy. The policy is not linked to the RTI Act or to any other clear pricing guidelines that establish what data has to be paid for and why. In addition, data that has been collected entirely using public funds should be noted separately from data collected with some private partnership.
4. Restricted lists should be published for each department, with reasons as to why the data is not open.
5. It is unclear from the policy which departments it covers, and whether it is enforceable. This policy should cover the same scope as the RTI Act: all 'public authorities' as defined under the RTI Act should be covered by this policy.
Transparency in decision making of how data is assigned to a category
The rationale for the three-fold categorization is unclear. In particular, it is unclear why the category of 'registered access' exists, and on what basis the categorization into 'open access' and 'registered access' is to be done. If the purpose of registration is to track usage, there are many better ways of doing so without requiring registration.
A. Have three categories: open data, partially restricted data, and restricted data.
B. Data that is classified as non-shareable (as per a reading of s.8 and s.9 of the RTI Act as informed by the decisions of the Central Information Commission) should be classified as 'restricted'.
C. The rationale for classifying data as 'open' or 'partially restricted' should be how the data collection body is funded. If it depends primarily on public funds, then the data it outputs should necessarily be made fully open. If it is funded primarily through private fees, then the data may be classified as 'partially restricted'. 'Partially restricted' data may be restricted to non-commercial usage, with registration and/or a licence being required for commercial usage.
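The categorization rule proposed in points A to C can be written down as a short decision procedure. The following Python sketch is only an illustration of that rule, not part of the policy. The field names `rti_exempt` and `public_funding_share` are hypothetical, and reading "primarily" as "more than half" is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Access(Enum):
    OPEN = "open"
    PARTIALLY_RESTRICTED = "partially restricted"
    RESTRICTED = "restricted"


@dataclass
class Dataset:
    name: str
    # Hypothetical flag: True if the data is non-shareable under
    # s.8 or s.9 of the RTI Act.
    rti_exempt: bool
    # Hypothetical field: fraction (0.0 to 1.0) of the collecting body's
    # funding that comes from public funds.
    public_funding_share: float


def classify(dataset: Dataset, threshold: float = 0.5) -> Access:
    """Apply points B and C in order.

    The 0.5 threshold for "primarily publicly funded" is an assumption;
    the feedback does not define the term numerically.
    """
    if dataset.rti_exempt:
        return Access.RESTRICTED
    if dataset.public_funding_share > threshold:
        return Access.OPEN
    return Access.PARTIALLY_RESTRICTED
```

Writing the rule this way makes the basis for each assignment explicit: every dataset lands in exactly one category, and the reason can be published alongside it.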
Standardized and open formats for data download
As seen on the current data.gov.in site, there is a lack of consistent standards in how the data is made available. The policy should require open standards, and require that published data be compliant with the Interoperability Framework for e-Governance (IFEG) that the government is currently in the process of drafting and finalizing.
A. The policy should reference the National Open Standards Policy that was finalised by the Department of Information Technology in November 2010, as well as the IFEG.
B. The data should be made available, insofar as possible, in structured documents with semantic markup, which allows for intelligent querying of the content of the document itself. Before settling upon a usage-specific semantic markup schema, well-established XML schemas should be examined for their suitability and used wherever appropriate. It must be ensured that the metadata are also in a standardized and documented format.
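To illustrate what structured data with semantic markup makes possible, here is a minimal Python sketch that queries a small XML document using the standard library. The element and attribute names (`dataset`, `metadata`, `record`, `district`, `state`) are hypothetical and do not come from the IFEG or any existing schema, and the values are invented placeholders.

```python
import xml.etree.ElementTree as ET

# Invented placeholder document; the schema is hypothetical.
SAMPLE = """\
<dataset id="example">
  <metadata>
    <title>Example measurements</title>
    <period>2011-12</period>
    <unit>mm</unit>
  </metadata>
  <record district="District A" state="State X">100</record>
  <record district="District B" state="State X">250</record>
  <record district="District C" state="State Y">75</record>
</dataset>
"""


def values_for_state(xml_text: str, state: str) -> dict:
    """Return {district: value} for all records belonging to one state."""
    root = ET.fromstring(xml_text)
    # ElementTree supports a limited XPath subset, including
    # attribute-equality predicates such as [@state='...'].
    records = root.findall(f"record[@state='{state}']")
    return {r.get("district"): int(r.text) for r in records}


def unit_of(xml_text: str) -> str:
    """Read the unit from the metadata block."""
    return ET.fromstring(xml_text).findtext("metadata/unit")
```

Because both the records and the metadata are marked up, a user can select rows by state and read the unit directly from the document, which is not possible with a scanned table or an unstructured PDF.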
Copyright
No license has been prescribed in the policy for the data. Although India does not recognize database rights, it does allow copyright over original literary works, which include original databases. All governmental works are copyrighted by default in India, just as they are in the UK. For this policy to go beyond merely providing access to data and ensure that people are able to use that data, it must provide for a copyright license that is conducive to reuse.
A. The license that has been created by the UK government (another country in which all governmental works are copyrighted by default) may be referred to: http://www.nationalarchives.gov.uk/doc/open-government-licence/
B. However, the UK needed to draft its own license because the concept of database rights is recognized in the EU, which is not an issue here in India. Thus, it would be preferable to use the Open Data Commons - Attribution license: http://www.opendatacommons.org/licenses/by/
The UK license is compatible with both the above-mentioned license and the Creative Commons - Attribution license, and includes many aspects that are common with Indian law, e.g., provisions on the usage of governmental emblems.
Archival and longevity
The policy is silent on how long data must be made available.
The policy must prescribe a system of archival that enables citizens to access older data. Further, a versioning and nomenclature system is required alongside the metadata to ensure that citizens know the period that the data pertains to, and have access to the latest data by default.
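One possible shape for such a versioning and nomenclature system is sketched below in Python. This is an assumption-laden illustration, not a proposed standard: the `Release` type, its fields, and the file-naming pattern are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Release:
    dataset_id: str      # hypothetical stable identifier for the dataset
    period_start: date   # first day the data pertains to
    period_end: date     # last day the data pertains to
    version: int         # revision number within the same period

    def filename(self, ext: str = "csv") -> str:
        # Hypothetical naming pattern: id, period covered, version.
        return (
            f"{self.dataset_id}_"
            f"{self.period_start:%Y%m%d}-{self.period_end:%Y%m%d}_"
            f"v{self.version}.{ext}"
        )


def latest(releases: list) -> Release:
    """Pick the default release: most recent period, then highest version.

    Raises ValueError if the list is empty.
    """
    return max(releases, key=lambda r: (r.period_end, r.version))
```

With a scheme like this, older releases stay in the archive under predictable names, the period covered is visible in both the metadata and the file name, and the portal can serve `latest(...)` by default.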