API wrapper for data.gouv.fr

pygouv

pygouv is an unofficial Python API wrapper for data.gouv.fr, the official French open data portal.

Installation

You can install pygouv with:

pip install pygouv

Usage

from pygouv import *

home( )

Returns the datasets currently featured on the home page of the data.gouv.fr portal:

home()
acronym archived badges created_at deleted description frequency frequency_date id last_modified ... spatial.granularity spatial.zones spatial extras.recommendations extras.recommendations:sources extras.apigouvfr:apis extras.datafairDatasetId extras.datafairOrigin temporal_coverage.end temporal_coverage.start
0 SI-DEP None [] 2020-10-21T09:20:09.639000 None **Point d'attention : les difficultés de remon... daily 2020-10-22T09:19:41 5f8fe1290de5138270132602 2020-12-20T19:27:08.779000 ... fr:iris [country:fr] NaN NaN NaN NaN NaN NaN NaN NaN
1 SI-DEP None [] 2020-09-29T15:32:39.020000 None **Point d'attention: Les difficultés de remont... daily 2020-10-15T14:19:14 5f733777722fc12a413290eb 2020-12-20T19:27:11.047000 ... fr:epci [country:fr] NaN NaN NaN NaN NaN NaN NaN NaN
2 None None [] 2020-06-17T11:16:00.320000 None ### Présentation des indicateurs de suivi\n\nL... daily 2020-10-29T15:46:30 5ee9df5003284f565d561278 2020-12-20T16:24:05.918000 ... NaN NaN NaN [{'id': '5eaaf07f5abc47e306c5c258', 'score': 1... [matomo] NaN NaN NaN NaN NaN
3 None None [{'kind': 'covid-19'}] 2020-03-27T15:40:10.048000 None **Point d'information :** Un établissement hos... daily 2020-03-28T15:39:51 5e7e104ace2080d9162b61d8 2020-12-20T19:06:25.164000 ... NaN NaN NaN [{'id': '5e74ecf52eb7514f2d3b8845', 'score': 4... [matomo] NaN NaN NaN NaN NaN
4 None None [] 2020-06-11T16:00:55.511000 None Le diagnostic de performance énergétique (DPE)... unknown None 5ee2391763c79811ddfbc86a 2020-10-28T09:53:44.306000 ... NaN NaN NaN NaN NaN [{'logo': '/images/api-logo/ademe.png', 'openn... dpe-france https://koumoul.com/s/data-fair NaN NaN
5 None None [{'kind': 'covid-19'}] 2020-04-20T17:20:30.636000 None ### Le fonds de solidarité \n\nDans le context... daily 2020-04-21T17:49:37 5e9dbdbe71589194c8f7b42f 2020-12-07T17:59:34.038000 ... fr:departement [country:fr] NaN NaN NaN NaN NaN NaN 2020-12-31 2020-03-01
6 SI-DEP None [] 2020-05-29T16:10:35.407000 None **Point d'attention : Les difficultés de remon... daily 2020-10-22T09:53:26 5ed117db6c161bd5baf070be 2020-12-20T19:20:29.771000 ... other [country:fr] NaN [{'id': '5e7de8cf4663c08d4f74ba01', 'score': 8... [matomo] NaN NaN NaN NaN NaN
7 None None [] 2020-09-28T16:44:09.669000 None Version : AGRIBALYSE® v3.0.1\n\nAGRIBALYSE® es... unknown None 5f71f6b9f23df7fcd508af57 2020-11-24T17:19:10.789000 ... NaN NaN NaN NaN NaN [{'logo': '/images/api-logo/ademe.png', 'openn... agribalyse-synthese https://koumoul.com/s/data-fair NaN NaN
8 None None [] 2020-09-22T14:23:13.380000 None **Point d'attention : Depuis le 17/10/2020, le... daily 2020-09-23T14:20:21 5f69ecb155c43420918410b8 2020-12-21T00:30:59.566000 ... fr:departement [country:fr] NaN NaN NaN NaN NaN NaN NaN NaN

9 rows × 46 columns

home() returns a pandas DataFrame with the following columns:

for col in home().columns:
    print(col)
acronym
archived
badges
created_at
deleted
description
frequency
frequency_date
id
last_modified
last_update
license
owner
page
private
resources
slug
tags
temporal_coverage
title
uri
metrics.discussions
metrics.followers
metrics.issues
metrics.reuses
metrics.views
organization.acronym
organization.class
organization.id
organization.logo
organization.logo_thumbnail
organization.name
organization.page
organization.slug
organization.uri
spatial.geom
spatial.granularity
spatial.zones
spatial
extras.recommendations
extras.recommendations:sources
extras.apigouvfr:apis
extras.datafairDatasetId
extras.datafairOrigin
temporal_coverage.end
temporal_coverage.start
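
The dotted names in this list (metrics.views, organization.name and so on) look like nested JSON fields that have been flattened into columns. The sketch below assumes, without confirmation from the pygouv source, that something like pandas.json_normalize produces them. It rebuilds the same naming on a hand-written stand-in record and then keeps a few columns, as one might do with the output of home(). The nesting of the record and the placeholder values are assumptions.

```python
import pandas as pd

# Stand-in record shaped like one dataset from the API.
# id, acronym and frequency are copied from the sample output above;
# the nested structure and the placeholder values are assumptions.
records = [
    {
        "id": "5f8fe1290de5138270132602",
        "acronym": "SI-DEP",
        "frequency": "daily",
        "metrics": {"views": 0, "reuses": 0},  # placeholder numbers
        "organization": {"name": "placeholder organization"},
    }
]

# json_normalize flattens nested dicts into dotted column names.
flat = pd.json_normalize(records)

# Keep only the columns of interest; dotted names are ordinary column labels.
subset = flat[["id", "acronym", "metrics.views", "organization.name"]]
print(subset.columns.tolist())
```

Because the dotted labels contain a period, they have to be selected with bracket syntax (df["metrics.views"]) rather than attribute access.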

site_metrics( )

Provides global metrics related to the data.gouv.fr portal.

site_metrics()
datasets discussions followers max_dataset_followers max_dataset_reuses max_org_datasets max_org_followers max_org_reuses max_reuse_datasets max_reuse_followers organizations public-service resources reuses users
0 36046 7628 24129 130 133 1017 572 114 74 324 2659 0 195826 2453 65663
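
Since the result is a single-row DataFrame, individual metrics are easiest to read from its first row. The sketch below uses a stand-in DataFrame holding three of the values shown above rather than a live call to site_metrics(); the derived ratio is only an illustration.

```python
import pandas as pd

# Stand-in for the one-row DataFrame shown above (a subset of its columns).
metrics = pd.DataFrame([{"datasets": 36046, "resources": 195826, "reuses": 2453}])

# iloc[0] turns the single row into a Series indexed by metric name.
row = metrics.iloc[0]

# Example of a derived figure: average number of resources per dataset.
resources_per_dataset = row["resources"] / row["datasets"]
print(f"{resources_per_dataset:.2f} resources per dataset on average")
```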

search( )

Searches the data.gouv.fr API for datasets matching the pattern given in the query parameter. It takes three arguments (query, page and page_size):

srch = search(
    query='cafés à paris',
    page=0,        # look at page 0 (the default)
    page_size=20,  # pull 20 results
)

srch
acronym archived badges created_at deleted description frequency frequency_date id last_modified ... organization.logo_thumbnail organization.name organization.page organization.slug organization.uri extras.datagouv_ckan_id extras.datagouv_ckan_last_sync spatial.geom spatial.granularity spatial.zones
0 None None [] 2020-10-30T05:01:49.236000 None **Ce jeu de données recense les déclarations d... unknown None 5f9b902d3784843c84d5f959 2020-10-29T18:36:54 ... https://static.data.gouv.fr/avatars/b2/0e83ec3... Mairie de Paris https://www.data.gouv.fr/fr/organizations/mair... mairie-de-paris https://www.data.gouv.fr/api/1/organizations/m... NaN NaN NaN NaN NaN
1 None None [] 2014-04-23T10:10:02.419000 None La liste des cafés à un euro de Paris.\n\nCe j... unknown None 536998e0a3a729239d2050a4 2016-03-16T08:35:42.023000 ... https://static.data.gouv.fr/avatars/b2/0e83ec3... Mairie de Paris https://www.data.gouv.fr/fr/organizations/mair... mairie-de-paris https://www.data.gouv.fr/api/1/organizations/m... 29853df3-3493-43d9-82a3-4b6b49c44977 2014-09-16T09:31:21.096000 NaN other []

2 rows × 49 columns

The search() function returns a pandas DataFrame with the following columns:

for col in srch.columns:
    print(col)
acronym
archived
badges
created_at
deleted
description
frequency
frequency_date
id
last_modified
last_update
license
owner
page
private
resources
slug
spatial
tags
temporal_coverage
title
uri
extras.harvest:domain
extras.harvest:last_update
extras.harvest:remote_id
extras.harvest:source_id
extras.ods:geo
extras.ods:has_records
extras.ods:url
extras.remote_url
metrics.discussions
metrics.followers
metrics.issues
metrics.reuses
metrics.views
organization.acronym
organization.class
organization.id
organization.logo
organization.logo_thumbnail
organization.name
organization.page
organization.slug
organization.uri
extras.datagouv_ckan_id
extras.datagouv_ckan_last_sync
spatial.geom
spatial.granularity
spatial.zones
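
Because search() returns one page at a time, collecting every match means calling it repeatedly with an increasing page number. The following is a minimal sketch of that loop. iter_pages and fake_search are hypothetical helpers, not part of pygouv; the zero-based page numbering is taken from the comment in the call above, and the stopping rule (an empty or short page marks the end) is an assumption about how the API paginates.

```python
from typing import Any, Callable, Iterator


def iter_pages(
    search_fn: Callable[..., Any],
    query: str,
    page_size: int = 20,
    first_page: int = 0,
    max_pages: int = 50,
) -> Iterator[Any]:
    """Yield successive result pages until an empty or short page is seen."""
    for page in range(first_page, first_page + max_pages):
        results = search_fn(query=query, page=page, page_size=page_size)
        if len(results) == 0:
            return
        yield results
        if len(results) < page_size:
            return  # a short page means there is nothing left to fetch


# Offline stand-in for search(): 45 fake results served page by page.
def fake_search(query: str, page: int, page_size: int) -> list:
    data = [f"{query}-{i}" for i in range(45)]
    return data[page * page_size:(page + 1) * page_size]


pages = list(iter_pages(fake_search, "cafés", page_size=20))
print([len(p) for p in pages])  # [20, 20, 5]
```

The helper only relies on len(), so it should work the same way when search_fn returns DataFrames; the pages could then be combined with pandas.concat.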

explain( )

Returns a detailed description of a dataset, in French. It takes one mandatory argument, dataset_id, which you can get from the output of the search() function:

srch[['id', 'title']]
   id                        title
0  5f9b902d3784843c84d5f959  Terrasses éphémères
1  536998e0a3a729239d2050a4  Liste des cafés à un euro

To get well-formatted text, wrap the call in print():

print(explain('5f9b902d3784843c84d5f959'))
**Ce jeu de données recense les déclarations de terrasse éphémère.**

La Ville de Paris a pris la décision d'autoriser l'extension gratuite des terrasses pour les cafés, bars et restaurants.

Habituellement soumises à une autorisation, les extensions provisoires sont exceptionnellement enregistrées à titre gratuit, et sont valables jusqu'en juin 2021.

La déclaration est effectuée via un téléservice sur Paris.fr. Chaque commerçant doit signer et afficher une charte d'engagement.

[Plus d'informations: sur Paris.fr](https://www.paris.fr/pages/reouverture-des-bars-et-restaurants-agrandir-ou-creer-sa-terrasse-7847)

Description des données :

- Reference Déclaration (Numéro de la déclaration)

- Nom Commerce

- Adresse Commerce et coordonnées géographiques (X ; Y)
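
The reason print() matters is that the description is a single string containing newline characters. Evaluating it directly in a REPL or notebook shows its repr, with the newlines as literal \n escapes, whereas print() writes the actual line breaks. The sketch below illustrates this on a stand-in string abridged from the output above, not on a live call to explain().

```python
# Stand-in for the string returned by explain(), abridged from the output above.
description = (
    "**Ce jeu de données recense les déclarations de terrasse éphémère.**\n\n"
    "Description des données :\n\n"
    "- Nom Commerce"
)

# What a REPL or notebook cell shows: one line with literal \n escapes.
print(repr(description))

# What print() shows: each \n becomes a real line break.
print(description)

# The text can also be processed line by line.
lines = description.splitlines()
```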

resources( )

resources() lists all the resources available in a given dataset:

res = resources('5f9b902d3784843c84d5f959')
res
checksum created_at description filesize filetype format id last_modified latest mime ... url extras.check:available extras.check:count-availability extras.check:date extras.check:headers:charset extras.check:headers:content-disposition extras.check:headers:content-type extras.check:status extras.check:url extras.ods:type
0 None 2020-10-30T05:01:49.244000 - *Référence déclaration*: reference_declarati... None remote csv 59952231-e65f-4e52-b99a-f18ab317d4cc 2020-10-29T18:36:54 https://www.data.gouv.fr/fr/datasets/r/5995223... text/csv ... https://opendata.paris.fr/explore/dataset/terr... True 27 2020-12-16T10:06:11.943000 utf-8 attachment; filename="terrasses-ephemeres.csv" application/csv 200 https://opendata.paris.fr/explore/dataset/terr... api
1 None 2020-10-30T05:01:49.244000 - *Référence déclaration*: reference_declarati... None remote json f094f3da-9b57-4fc3-ac1c-3538ace77d44 2020-10-29T18:36:54 https://www.data.gouv.fr/fr/datasets/r/f094f3d... application/json ... https://opendata.paris.fr/explore/dataset/terr... True 27 2020-12-16T10:06:11.963000 utf-8 attachment; filename="terrasses-ephemeres.json" application/json 200 https://opendata.paris.fr/explore/dataset/terr... api

2 rows × 25 columns

resources() returns a pandas DataFrame with the following columns:

for col in res.columns:
    print(col)
checksum
created_at
description
filesize
filetype
format
id
last_modified
latest
mime
preview_url
published
schema
title
type
url
extras.check:available
extras.check:count-availability
extras.check:date
extras.check:headers:charset
extras.check:headers:content-disposition
extras.check:headers:content-type
extras.check:status
extras.check:url
extras.ods:type

We can grab the URL, the format and the description of the resources:

res[['description', 'format', 'url']]
   description                                        format  url
0  - *Référence déclaration*: reference_declarati...  csv     https://opendata.paris.fr/explore/dataset/terr...
1  - *Référence déclaration*: reference_declarati...  json    https://opendata.paris.fr/explore/dataset/terr...

Now we can load the required resource with pandas. In this example, we work with the CSV file:

import pandas as pd
df = pd.read_csv(res['url'][0], sep=";")  # the file is semicolon-separated
df
      reference_declaration  nom_commerce               adresse_commerce                           coord_x      coord_y
0     169                    Cafe au pere tranquille    16 rue Pierre Lescot, 75001 PARIS          652202.4778  6.862661e+06
1     278                    Cafe le marignan           18 rue de Marignan, 75008 PARIS            649134.9044  6.863454e+06
2     916                    CAFE PETITE                52 rue René Boulanger, 75010 PARIS         653031.4339  6.863377e+06
3     177                    Chez Julien                1 rue du Pont Louis-Philippe, 75004 PARIS  652687.2630  6.861829e+06
4     669                    Demain c'est loin          9 rue Julien Lacroix, 75020 PARIS          654960.0896  6.863351e+06
...   ...                    ...                        ...                                        ...          ...
9576  22470                  Le 7 EME B’ART             19 rue de Choiseul, 75002 PARIS            651285.2949  6.863569e+06
9577  22501                  BIBIMBAP                   32 Boulevard de l'Hôpital                  NaN          NaN
9578  22507                  Holiday Inn Paris Elysees  24 rue de Miromesnil                       NaN          NaN
9579  22532                  Le mandarin de rambuteau   11 rue Rambuteau, 75004 PARIS              652741.2095  6.862449e+06
9580  22522                  Secret de Paris            61 rue de Clichy                           NaN          NaN

9581 rows × 5 columns
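
Indexing the url column by position works here because the CSV happens to be the first row. A slightly more robust variant, sketched below, selects the resource by its format column and then reads the semicolon-separated file. The sketch runs offline: the res DataFrame and its URLs are hypothetical placeholders, and an in-memory sample with three of the real columns stands in for the downloaded file.

```python
import io

import pandas as pd

# Stand-in for the DataFrame returned by resources(); URLs are hypothetical placeholders.
res = pd.DataFrame(
    {
        "format": ["csv", "json"],
        "url": [
            "https://example.org/terrasses.csv",
            "https://example.org/terrasses.json",
        ],
    }
)

# Pick the CSV resource by its format instead of relying on row position 0.
csv_url = res.loc[res["format"] == "csv", "url"].iloc[0]

# Offline stand-in for the file behind csv_url: semicolon-separated, like the real export.
sample = io.StringIO(
    "reference_declaration;nom_commerce;adresse_commerce\n"
    "169;Cafe au pere tranquille;16 rue Pierre Lescot, 75001 PARIS\n"
    "278;Cafe le marignan;18 rue de Marignan, 75008 PARIS\n"
)

# sep=";" matters: the default separator is a comma, which here only
# appears inside the addresses.
df = pd.read_csv(sample, sep=";")
print(df.shape)
```

With a live result, pd.read_csv(csv_url, sep=";") would take the place of the in-memory sample.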

suggest_territory( )

Returns suggested territory pages according to the query provided by the user:

terr = suggest_territory(query='paris', result_size=10)
terr
id image_url page parent title
0 fr:commune:75056@1943-01-01 https://www.data.gouv.fr/fr/territories/commun... Paris Paris
1 fr:commune:75115@1943-01-01 https://www.data.gouv.fr/fr/territories/commun... Paris Paris 15e Arrondissement
2 fr:commune:75120@1943-01-01 https://www.data.gouv.fr/fr/territories/commun... Paris Paris 20e Arrondissement
3 fr:commune:75118@1943-01-01 https://www.data.gouv.fr/fr/territories/commun... Paris Paris 18e Arrondissement
4 fr:commune:75119@1943-01-01 https://www.data.gouv.fr/fr/territories/commun... Paris Paris 19e Arrondissement
5 fr:commune:75113@1943-01-01 https://www.data.gouv.fr/fr/territories/commun... Paris Paris 13e Arrondissement
6 fr:commune:75117@1943-01-01 https://www.data.gouv.fr/fr/territories/commun... Paris Paris 17e Arrondissement
7 fr:commune:75116@1943-01-01 https://www.data.gouv.fr/fr/territories/commun... Paris Paris 16e Arrondissement
8 fr:commune:75111@1943-01-01 https://www.data.gouv.fr/fr/territories/commun... Paris Paris 11e Arrondissement
9 fr:commune:75112@1943-01-01 https://www.data.gouv.fr/fr/territories/commun... Paris Paris 12e Arrondissement

suggest_territory() returns a pandas DataFrame with the following columns:

for col in terr.columns:
    print(col)
id
image_url
page
parent
title
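
The id values above appear to pack several fields into one string, for example fr:commune:75056@1943-01-01. The sketch below splits such an id into its parts. The interpretation of the fields (country, administrative level, code, start of validity) is an assumption inferred from the sample values, and parse_territory_id and TerritoryId are hypothetical helpers, not part of pygouv.

```python
from typing import NamedTuple


class TerritoryId(NamedTuple):
    country: str
    level: str
    code: str
    valid_from: str


def parse_territory_id(raw: str) -> TerritoryId:
    """Split an id such as 'fr:commune:75056@1943-01-01' into its parts."""
    # Everything after "@" is treated as a date (assumed meaning: validity start).
    zone, _, valid_from = raw.partition("@")
    country, level, code = zone.split(":")
    return TerritoryId(country, level, code, valid_from)


parsed = parse_territory_id("fr:commune:75056@1943-01-01")
print(parsed.level, parsed.code)  # commune 75056
```

Applied to a whole result, terr['id'].map(parse_territory_id) would give one parsed tuple per row.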

Code of Conduct

Please note that the pygouv project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
