
Linked Open Data in the Czech Republic
Martin Nečaský
Project: Implementace strategií v oblasti otevřených dat II (Implementation of Open Data Strategies II)
Reg. number: CZ.03.4.74/0.0/0.0/15_025/0004172

Existing LOD

Publishers
● Czech Social Security Administration
  ○ Statistical data about pensions
  ○ RDF dumps, SPARQL endpoint, CSV dumps
  ○ https://data.cssz.cz/web/otevrena-data/katalog-otevrenych-dat
  ○ https://data.cssz.cz/web/otevrena-data/-/duchodci-v-cr-krajich-okresech
● General Financial Directorate
  ○ Central Register of Subsidies
  ○ RDF dumps, SPARQL endpoint, CSV dumps

Problems of LOD
● Consumers and publishers do not understand LOD benefits

Benefits of LOD
● Code lists
● Statistical data
● Base registries

Code lists

Problems with codelists
● Different data models, structures and formats
● Duplicated by publishers
● Not shared by publishers
● No notifications about changes
● No propagation of changes

LOD solution for codelists
Unified data model, structure and formats: RDF, SKOS, IRI patterns
● Social Security Administration
  ○ https://data.cssz.cz/resource/pension-kind/PK_D
  ○ skos:notation "PK_D"
  ○ skos:prefLabel "Orphan pension"
  ○ skos:inScheme https://data.cssz.cz/ontology/pension-kinds/PensionKindScheme

LOD solution for codelists
Minimizing duplication with linking
● Social Security Administration
  ○ https://data.cssz.cz/resource/pension-kind/PK_D
  ○ skos:notation "PK_D"
  ○ skos:prefLabel "Orphan pension"
  ○ skos:inScheme https://data.cssz.cz/ontology/pension-kinds/PensionKindScheme
● Office of the Government
  ○ https://data.vlada.cz/resource/population-prediction/5684
  ○ gov:population-prediction 150
  ○ gov:refPeriod 2022
  ○ dct:type https://data.cssz.cz/resource/pension-kind/PK_D
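The codelist item and the link from the Office of the Government resource can be written down as plain RDF triples. The following is a minimal sketch using only the Python standard library. The IRIs come from the slides, while the `gov:` namespace IRI, the `en` language tag and the untyped literals are assumptions made for illustration.

```python
# Minimal sketch: a SKOS codelist item and a resource that links to it,
# kept as (subject, predicate, object) triples and serialized as N-Triples.
SKOS = "http://www.w3.org/2004/02/skos/core#"
DCT = "http://purl.org/dc/terms/"
GOV = "https://example.org/gov#"  # hypothetical namespace for the gov: prefix

PK_D = "https://data.cssz.cz/resource/pension-kind/PK_D"
SCHEME = "https://data.cssz.cz/ontology/pension-kinds/PensionKindScheme"
PREDICTION = "https://data.vlada.cz/resource/population-prediction/5684"


def iri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value, lang=None):
    """Build a plain literal; datatypes are omitted for brevity (assumption)."""
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    suffix = f"@{lang}" if lang else ""
    return f'"{escaped}"{suffix}'


TRIPLES = [
    # Codelist item published by the Social Security Administration.
    (iri(PK_D), iri(SKOS + "notation"), literal("PK_D")),
    (iri(PK_D), iri(SKOS + "prefLabel"), literal("Orphan pension", "en")),
    (iri(PK_D), iri(SKOS + "inScheme"), iri(SCHEME)),
    # The Office of the Government links to the item instead of copying it.
    (iri(PREDICTION), iri(DCT + "type"), iri(PK_D)),
    (iri(PREDICTION), iri(GOV + "refPeriod"), literal(2022)),
    (iri(PREDICTION), iri(GOV + "population-prediction"), literal(150)),
]


def to_ntriples(triples):
    """Serialize triples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


def objects(subject, predicate):
    """Return all objects for a given subject and predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


if __name__ == "__main__":
    print(to_ntriples(TRIPLES))
    # Follow the link from the prediction to the label of the codelist item.
    kind = objects(iri(PREDICTION), iri(DCT + "type"))[0]
    print(objects(kind, iri(SKOS + "prefLabel")))
```

Because the second publisher stores only the IRI of the item, the label lives in one place and does not have to be duplicated.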

LOD solution for codelists
Sharing codelists and their items: HTTPS, IRI dereferencing
● Social Security Administration
  ○ .../pension-kind/PK_D
  ○ skos:notation "PK_D"
  ○ skos:prefLabel "Orphan pension"
  ○ skos:inScheme .../PensionKindScheme
● Office of the Government
  ○ .../population-prediction/5684
  ○ gov:population-prediction 12345
  ○ gov:refPeriod 2022
  ○ dct:type .../pension-kind/PK_D
● Web Application
  ○ GET /resource/population-prediction/5684
  ○ GET /resource/pension-kind/PK_D
  ○ Displays: Population projection, Year: 2022, ..., Orphan pensions: 12345, ...

LOD solution for codelists
Change notification and propagation based on LD Notifications
● Social Security Administration: .../PensionKindScheme points to its inbox via ldp:inbox
● A change notification is POSTed to the inbox, which lists it via ldp:contains
● Office of the Government reads the changes with GET requests

Our next steps towards LOD codelists
● National standard for publishing codelists as LOD
● Mandatory publishing of important codelists as OD
● Publishing codelists managed by Czech Statistical Office
  ○ 100s of codelists used by many public bodies
● Alignment of codelists
  ○ Czech Statistical Office
  ○ Czech Social Security Administration
  ○ EU Vocabularies (Publications Office of the EU)
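The change notification step described for codelists follows the Linked Data Notifications pattern: a JSON-LD document is POSTed to the inbox advertised via ldp:inbox. Below is a minimal sketch that only builds the request and does not send it. The inbox URL and the ActivityStreams-style payload are assumptions, since the slides do not specify either.

```python
import json
import urllib.request

# Scheme IRI from the slides; the inbox URL below is hypothetical.
SCHEME = "https://data.cssz.cz/ontology/pension-kinds/PensionKindScheme"
INBOX = "https://example.org/inbox/pension-kinds"


def build_change_notification(inbox_url, changed_iri, summary):
    """Prepare (but do not send) a notification POST for an LDN inbox.

    The payload vocabulary is an assumption: LDN only requires JSON-LD,
    and ActivityStreams 2.0 is used here as one possible choice.
    """
    payload = {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Update",
        "object": changed_iri,
        "summary": summary,
    }
    return urllib.request.Request(
        inbox_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/ld+json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_change_notification(
        INBOX, SCHEME, "Codelist item PK_D was changed"
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
    # A real sender would now call urllib.request.urlopen(request).
```

A consumer would then GET the inbox, follow each ldp:contains link and update its local copy of the affected items.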

Statistical data

Problems with statistical data
● Different data models, structures and formats
● Hard to explore
● Hard to integrate

LOD solution for statistical data
Unified data model, structure and formats: RDF Data Cube Vocabulary, SDMX, IRI patterns
● Social Security Administration
  ○ https://data.cssz.cz/resource/observation/duchodci-v-cr-krajich-okresech/2017-12-31/pocet-duchodcu/pk_d/ok.3201/f
  ○ qb:dataSet https://data.cssz.cz/resource/dataset/duchodci-v-cr-krajich-okresech
  ○ cssa:pension-kind https://data.cssz.cz/resource/pension-kind/PK_D
  ○ cssa:number-of-pensions 193
  ○ cssa:refArea ...
  ○ cssa:refPeriod ...

LOD solution for statistical data
Easier exploration
● Which statistical measures do we measure in Czech regions?
● Who publishes these measures in which datasets?
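The two exploration questions can be phrased as a single SPARQL query over RDF Data Cube metadata. The sketch below builds such a query and the corresponding SPARQL protocol GET URL without contacting any server. It assumes that datasets describe their structure with `qb:structure` and name their publisher with `dct:publisher`, and it omits the restriction to regions. The endpoint URL is a placeholder.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; replace with a real SPARQL endpoint URL.
ENDPOINT = "https://example.org/sparql"

# Lists every measure, the dataset that uses it and, if stated, the publisher.
EXPLORATION_QUERY = """
PREFIX qb:  <http://purl.org/linked-data/cube#>
PREFIX dct: <http://purl.org/dc/terms/>

SELECT DISTINCT ?measure ?dataset ?publisher
WHERE {
  ?dataset a qb:DataSet ;
           qb:structure ?dsd .
  ?dsd qb:component/qb:measure ?measure .
  OPTIONAL { ?dataset dct:publisher ?publisher }
}
ORDER BY ?measure
"""


def build_query_url(endpoint, query):
    """Build a SPARQL protocol GET URL with the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    print(build_query_url(ENDPOINT, EXPLORATION_QUERY))
```

The same query works across publishers only because they all describe their data with the same vocabulary.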

LOD solution for statistical data
Easier exploration based on RDF Data Cube Vocabulary, codelists reuse, SDMX and SPARQL
● Social Security Administration
  ○ qb:dataSet ../duchodci-v-cr-krajich-okresech
  ○ dim:pension-kind https://data.cssz.cz/resource/pension-kind/PK_D
  ○ cssa:number-of-pensions 193
  ○ cssa:refArea https://ruian.linked.opendata.cz/zdroj/vúsc/108
● Office of the Government
  ○ qb:dataSet ../population-prediction
  ○ dct:type https://data.cssz.cz/resource/pension-kind/PK_D
  ○ gov:population-prediction 175
  ○ gov:refArea https://ruian.linked.opendata.cz/zdroj/vúsc/108
● Czech Statistical Office
  ○ qb:dataSet ../population
  ○ czso:population 428475
  ○ czso:refArea https://ruian.linked.opendata.cz/zdroj/vúsc/108
● State Administration of Land Surveying and Cadastre
  ○ https://ruian.linked.opendata.cz/zdroj/vúsc/108

LOD solution for statistical data
Easier exploration based on RDF Data Cube Vocabulary, codelists reuse, SDMX and SPARQL
● Social Security Administration: ../duchodci-v-cr-krajich-okresech, cssa:number-of-pensions
● Office of the Government: ../population-prediction, gov:population-prediction
● Czech Statistical Office: ../population, czso:population

LOD solution for statistical data
Straightforward integration thanks to RDF Data Cube Vocabulary, codelists reuse, SDMX and SPARQL
● The three observations share the area ../vúsc/108; two of them share the pension kind PK_D
● Integrated record for https://ruian.linked.opendata.cz/zdroj/vúsc/108 (sdmx:refArea)
  ○ cssa:number-of-pensions 193
  ○ gov:population-prediction 175
  ○ czso:population 428475
● Pipeline: https://etl.opendata.cz/#/pipelines/edit/canvas?pipeline=https:%2F%2Fetl.opendata.cz%2Fresources%2Fpipelines%2F1544018816319

Our next steps towards LOD statistical data
● National standard for publishing statistical data as LOD
● Publishing statistical data
  ○ Czech Statistical Office
  ○ Czech Social Security Administration
  ○ (Technology Agency of the Czech Republic, Institute of Health Information and Statistics of the Czech Republic, …)
● Alignment with codelists
● Alignment with Register of territorial identification, addresses and real estates
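The integration shown on the slides amounts to a join on the shared area IRI. The following minimal sketch reproduces it in plain Python with the three values from the slides. The dictionary layout and the function name `integrate` are illustrative assumptions, not the pipeline the author used.

```python
# Shared territorial IRI reused by all three publishers.
AREA = "https://ruian.linked.opendata.cz/zdroj/vúsc/108"

# One observation per publisher, values taken from the slides.
SOCIAL_SECURITY = [
    {"refArea": AREA, "measure": "cssa:number-of-pensions", "value": 193},
]
GOVERNMENT = [
    {"refArea": AREA, "measure": "gov:population-prediction", "value": 175},
]
STATISTICAL_OFFICE = [
    {"refArea": AREA, "measure": "czso:population", "value": 428475},
]


def integrate(*sources):
    """Group measure values from all sources by their reference area IRI."""
    merged = {}
    for source in sources:
        for observation in source:
            row = merged.setdefault(observation["refArea"], {})
            row[observation["measure"]] = observation["value"]
    return merged


if __name__ == "__main__":
    table = integrate(SOCIAL_SECURITY, GOVERNMENT, STATISTICAL_OFFICE)
    for area, measures in table.items():
        print(area, measures)
```

No identifier mapping step is needed because every source already uses the same IRI for the area.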

Base registries

Czech base registries
● Register of inhabitants
● Register of persons (companies)
● Register of territorial identification, addresses and real estates
● Register of rights and responsibilities of public authorities
● Collection of Laws

“Base registries, important components of European public services, are a trusted and authoritative source of information. They provide basic information on data items such as people, companies, vehicles, licences, buildings, locations and roads.” -- ISA2

Register of territorial identification
● Country (NUTS I)
● NUTS II regions
● NUTS III regions
● City regions level I
● City regions level II
● City regions level III
● City districts
● Streets
● Address points
● Buildings

Register of territorial identification
● The primary and only source of territorial identities
● Territorial knowledge graph
  ○ names, shapes, relationships, ...
● Problems
  ○ Identities are technically expressed as integers, which are hard to link and reuse on the Web
  ○ Knowledge base is published in XML, which is hard to integrate with other data sources

LOD solution for Register of territorial identification
● Address points: https://linked.cuzk.cz/resource/ruian/adresni-misto/21954381
● Buildings: https://linked.cuzk.cz/resource/ruian/stavebni-objekt/21788421
● Streets: https://linked.cuzk.cz/resource/ruian/ulice/480703
● Cities: https://linked.cuzk.cz/resource/ruian/obec/554782
● NUTS3: https://linked.cuzk.cz/resource/ruian/vusc/19
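The IRIs above follow a simple pattern: a base, a path segment for the kind of entity, and the integer identifier from the registry. A minimal sketch of minting and parsing such IRIs is shown below. The base and path segments are taken from the slides, while the English keys and the function names are illustrative assumptions.

```python
RUIAN_BASE = "https://linked.cuzk.cz/resource/ruian"

# English label -> path segment used in the published IRIs.
KINDS = {
    "address point": "adresni-misto",
    "building": "stavebni-objekt",
    "street": "ulice",
    "city": "obec",
    "nuts3": "vusc",
}


def ruian_iri(kind, identifier):
    """Turn a registry integer identifier into a Web-linkable IRI."""
    if kind not in KINDS:
        raise ValueError(f"unknown kind: {kind!r}")
    if isinstance(identifier, bool) or not isinstance(identifier, int):
        raise ValueError("identifier must be an integer")
    if identifier < 0:
        raise ValueError("identifier must not be negative")
    return f"{RUIAN_BASE}/{KINDS[kind]}/{identifier}"


def parse_ruian_iri(iri):
    """Recover (kind, identifier) from an IRI minted by ruian_iri."""
    prefix = RUIAN_BASE + "/"
    if not iri.startswith(prefix):
        raise ValueError("not an IRI of this register")
    segment, _, number = iri[len(prefix):].partition("/")
    for kind, known_segment in KINDS.items():
        if known_segment == segment:
            return kind, int(number)
    raise ValueError(f"unknown path segment: {segment!r}")


if __name__ == "__main__":
    print(ruian_iri("city", 554782))
    print(parse_ruian_iri("https://linked.cuzk.cz/resource/ruian/vusc/19"))
```

Unlike a bare integer, the resulting IRI is globally unambiguous and can be used directly as a link target by other datasets.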

LOD solution for Register of territorial identification
● State Administration of Land Surveying and Cadastre
  ○ https://linked.cuzk.cz/resource/ruian/adresni-misto/21954381
  ○ https://linked.cuzk.cz/resource/ruian/stavebni-objekt/21788421
  ○ https://linked.cuzk.cz/resource/ruian/ulice/480703
  ○ https://linked.cuzk.cz/resource/ruian/obec/554782
  ○ https://linked.cuzk.cz/resource/ruian/vusc/19
● Links to external resources
  ○ http://nuts.geovocab.org/id/CZ010
  ○ https://www.wikidata.org/wiki/Q1085

LOD solution for Register of territorial identification
● Experimental representation
  ○ https://linked.cuzk.cz.opendata.cz/sparql
● IRIs alive, but currently only HTML representation available
● Discussions with State Administration of Land Surveying and Cadastre

Register of rights and responsibilities
● Public bodies
● Responsibilities
● Information systems
● Access rights

Register of rights and responsibilities
● The primary and only source of identities of public bodies, their responsibilities, etc.
● Governmental knowledge graph
  ○ Relationships between public bodies, responsibilities, access rights, information systems, etc.
  ○ Links to related legislation
● Problems
  ○ Identities are technically expressed as integers, which are hard to link and reuse on the Web
  ○ Knowledge base is published in XML, which is hard to integrate with other data sources

LOD solution for Register of rights and responsibilities
● Will be published on 1 Jan 2019
● SPARQL endpoint
● 25 datasets
  ○ JSON-LD distributions
● Documentation
  ○ official standard for each dataset
  ○ auto-generated from RDFS/OWL vocabularies, JSON @context and JSON Schema
  ○ https://data.gov.cz/otevřené-formální-normy/registr-práv-a-povinností/orgány-veřejné-moci

Auto-generating documentation of LOD datasets
● Inputs
  ○ Vocabularies
  ○ Dataset - JSON-LD distribution: @context, JSON Schema
● Processing
  ○ XSLT
  ○ SPARQL
● Output: Dataset - documentation
  ○ JSON structures documentation with semantics from Vocabularies
  ○ RDF structures documentation with semantics from Vocabularies
  ○ Working SPARQL queries
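The idea of deriving JSON structure documentation from a JSON-LD @context and a JSON Schema can be illustrated in a few lines. The slides name XSLT and SPARQL as the processing technologies; the sketch below uses plain Python instead, purely for illustration. The sample `CONTEXT`, `SCHEMA` and the function names are hypothetical and do not come from the published datasets.

```python
# Hypothetical JSON-LD @context: maps JSON keys to vocabulary IRIs.
CONTEXT = {
    "identifier": "http://purl.org/dc/terms/identifier",
    "name": {
        "@id": "http://www.w3.org/2004/02/skos/core#prefLabel",
        "@container": "@language",
    },
}

# Hypothetical JSON Schema describing the same JSON structure.
SCHEMA = {
    "type": "object",
    "required": ["identifier"],
    "properties": {
        "identifier": {"type": "string"},
        "name": {"type": "object"},
    },
}


def term_iri(context, term):
    """Resolve a JSON key to its IRI; handles string and expanded definitions."""
    mapping = context.get(term)
    if isinstance(mapping, dict):
        return mapping.get("@id")
    return mapping


def generate_docs(context, schema):
    """Produce one documentation line per JSON property."""
    required = set(schema.get("required", []))
    lines = []
    for name, spec in schema.get("properties", {}).items():
        json_type = spec.get("type", "any")
        need = "required" if name in required else "optional"
        iri = term_iri(context, name) or "no mapping in @context"
        lines.append(f"{name} ({json_type}, {need}) -> {iri}")
    return lines


if __name__ == "__main__":
    for line in generate_docs(CONTEXT, SCHEMA):
        print(line)
```

Each line pairs the structural information from the schema with the semantics from the vocabulary, which is the combination the generated documentation is meant to present.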

Collection of Laws
● Structured and machine-readable representation of the Collection of Laws
● Each legal document and each of its parts has a dereferenceable persistent IRI
● 10/2018 - 12/2020

Thank You!
Martin Nečaský