Faostat Python Package
Tools to read data from Faostat API.
Features
- Read Faostat data and metadata as list of tuples or as pandas dataframe.
- MIT license.
Warning: Versions 1.x.x still provide the functions get_areas, get_years, get_items and get_elements for backward compatibility, but they are deprecated and will be removed in version 2.x.x. The input parameter https is also deprecated and will be removed in version 2.x.x. Please use the new utility set_requests_args instead.
Documentation
Getting started:
Requires Python 3.6+
pip install faostat
It is also available from Anaconda.org.
Read the list of available datasets:
As a list of tuples:
faostat.list_datasets()
It reads the available datasets and returns a list of tuples. The first element of the list contains the header line. More information on the available datasets can be found on the official Faostat website.
Example:
>>> ld = faostat.list_datasets()
>>> ld[0]
('code', 'label', 'date_update', 'note_update', 'release_current', 'state_current', 'year_current', 'release_next', 'state_next', 'year_next')
>>> ld[1:4]
[('QCL', 'Crops and livestock products', '2022-02-17', 'minor revision', '2021-12-21 / 2022-02-17', 'final', '2020', '2022-12', 'final', '2020'),
('QI', 'Production Indices', '2021-03-18', '', '2021-03-18', 'final', '2019', '2022-04', 'final', '2020'),
('QV', 'Value of Agricultural Production', '2021-03-18', 'minor revision', '2021-03-18', 'final', '2020', '2022-04', 'final', '2019')]
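Because the first tuple is the header line, each data row can be paired with it to get dictionaries keyed by column name. The following is a minimal sketch using only the standard library. The sample rows are copied from the output above so that no network access is needed, and the helper name rows_to_dicts is hypothetical, not part of the faostat package.

```python
# Sample shaped like the output of faostat.list_datasets() shown above:
# the header tuple comes first, then one tuple per dataset.
ld = [
    ('code', 'label', 'date_update', 'note_update', 'release_current',
     'state_current', 'year_current', 'release_next', 'state_next', 'year_next'),
    ('QCL', 'Crops and livestock products', '2022-02-17', 'minor revision',
     '2021-12-21 / 2022-02-17', 'final', '2020', '2022-12', 'final', '2020'),
    ('QI', 'Production Indices', '2021-03-18', '', '2021-03-18', 'final',
     '2019', '2022-04', 'final', '2020'),
]


def rows_to_dicts(rows):
    """Hypothetical helper: pair every data row with the header row."""
    header, *data = rows
    return [dict(zip(header, row)) for row in data]


datasets = rows_to_dicts(ld)

# Quick lookup of the dataset label by its code.
labels_by_code = {d['code']: d['label'] for d in datasets}
print(labels_by_code['QI'])  # Production Indices
```

With real data, ld would simply be the return value of faostat.list_datasets().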
As a pandas dataframe:
faostat.list_datasets_df()
It reads the available datasets and returns a pandas dataframe. The column names are the same as the header line of the list returned by list_datasets().
More information on the available datasets can be found on the official Faostat website.
Example:
>>> df = faostat.list_datasets_df()
>>> df
code label ... state_next year_next
0 QCL Crops and livestock products ... final 2020
1 QI Production Indices ... final 2020
2 QV Value of Agricultural Production ... final 2019
3 FS Suite of Food Security Indicators ... final 2021
4 SCL Supply Utilization Accounts ... final 2020
.. ... ... ... ... ...
70 FA Food Aid Shipments (WFP) ...
71 RM Machinery ...
72 RY Machinery Archive ...
73 RA Fertilizers archive ...
74 PA Producer Prices (old series) ...
Check parameters for a given dataset:
Frequently you will need just a subset of a dataset, for instance only one year or country. You will therefore use the following functions.
To retrieve the available parameters for a given dataset:
As a list of tuples:
faostat.list_pars(code)
Given the code of a dataset, it reads the parameters and returns them as a list of tuples. The first tuple ("row") contains the header, in order: the parameter code, the available coding systems (when applicable) and the subdimensions. Subdimensions are reported as a dictionary whose keys are the subdimension codes and whose values are the descriptions of the subdimensions (definitions). They can be used to find subsets of codes with get_par or get_par_df.
Example:
>>> a = faostat.list_pars('QCL')
>>> a
[('parameter code', 'coding_systems', 'subdimensions {code: meaning}'),
('area', ['M49', 'FAO', 'ISO2', 'ISO3'], {'countries': 'Countries', 'regions': 'Regions', 'specialgroups': 'Special Groups'}),
('element', [], {'elements': 'Elements'}),
('item', ['CPC', 'FAO'], {'items': 'Items', 'itemsagg': 'Items Aggregated'}),
('year', [], {'years': 'Years'})]
In the example, you can see that for the parameter 'area' there are four possible coding systems (default is 'FAO'). Moreover, there are subdimensions with code name different from the parameter code.
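The list returned by list_pars can be indexed to look up the coding systems of a parameter, or to find which parameter a subdimension belongs to. This is a minimal sketch on a sample copied from the 'QCL' output above; the variable names coding_systems and parent_of are illustrative only.

```python
# Sample copied from the list_pars('QCL') output shown above.
pars = [
    ('parameter code', 'coding_systems', 'subdimensions {code: meaning}'),
    ('area', ['M49', 'FAO', 'ISO2', 'ISO3'],
     {'countries': 'Countries', 'regions': 'Regions',
      'specialgroups': 'Special Groups'}),
    ('element', [], {'elements': 'Elements'}),
    ('item', ['CPC', 'FAO'], {'items': 'Items', 'itemsagg': 'Items Aggregated'}),
    ('year', [], {'years': 'Years'}),
]

# Skip the header row and index the coding systems by parameter code.
coding_systems = {code: systems for code, systems, _ in pars[1:]}

# Map every subdimension code back to the parameter it belongs to.
parent_of = {sub: code for code, _, subs in pars[1:] for sub in subs}

print(coding_systems['item'])      # ['CPC', 'FAO']
print(parent_of['specialgroups'])  # area
```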
As a pandas dataframe:
faostat.list_pars_df(code)
Given the code of a dataset, it reads the parameters and returns them as a dataframe. The columns are, in order: the parameter code, the coding systems (when applicable) and the subdimensions, represented as a dictionary whose keys are the subdimension codes and whose values are the descriptions of the subdimensions (definitions). They can be used to find subsets of codes with get_par or get_par_df.
Example:
>>> a = faostat.list_pars_df('QCL')
>>> a
parameter code coding_systems subdimensions {code: meaning}
0 area ['M49', 'FAO', 'ISO2', 'ISO3'] {'countries': 'Countries', 'regions': 'Regions', 'specialgroups': 'Special Groups'}
1 element [] {'elements': 'Elements'}
2 item ['CPC', 'FAO'] {'items': 'Items', 'itemsagg': 'Items Aggregated'}
3 year [] {'years': 'Years'}
In the example, you can see that for the parameter 'area' there are four possible coding systems (default is 'FAO'). Moreover, there are subdimensions with code name different from the parameter code.
To retrieve the available values of a parameter for a given dataset:
As a dictionary:
faostat.get_par(code, par)
Given the code of a dataset and a parameter (or a subdimension), it reads the values and returns a dictionary whose keys are the labels and whose values are the codes.
As a pandas dataframe:
faostat.get_par_df(code, par)
Given the code of a dataset and a parameter (or a subdimension), it reads the values and returns a dataframe.
Example, retrieve the available areas and their codes as a dictionary:
>>> import faostat
>>> y = faostat.get_par('QCL', 'area')
>>> y
{'Afghanistan': '2',
'Albania': '3',
'Algeria': '4',
'Angola': '7',
...}
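Since the dictionary maps labels to codes, it can be used to translate area names into the codes expected when selecting a subset of a dataset. The sketch below works on a small sample copied from the output above; the names wanted, area_codes and names_by_code are illustrative.

```python
# Sample copied from the get_par('QCL', 'area') output shown above.
areas = {'Afghanistan': '2', 'Albania': '3', 'Algeria': '4', 'Angola': '7'}

# Translate the names of interest into their codes.
wanted = ['Albania', 'Angola']
area_codes = [areas[name] for name in wanted]

# Reverse lookup, useful to label results that only carry the code.
names_by_code = {code: name for name, code in areas.items()}

print(area_codes)          # ['3', '7']
print(names_by_code['4'])  # Algeria
```

A list such as area_codes could then serve as the value for the 'area' key of a pars dictionary.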
Example, retrieve the available special groups of areas, as a dataframe:
>>> import faostat
>>> y = faostat.get_par_df('QCL', 'specialgroups')
>>> y
label code aggregate_type
0 European Union (27) + (Total) 5707 +
1 European Union (27) > (List) 5707> >
2 Least Developed Countries + (Total) 5801 +
3 Least Developed Countries > (List) 5801> >
.. ... ... ...
8 Low Income Food Deficit Countries + (Total) 5815 +
9 Low Income Food Deficit Countries > (List) 5815> >
10 Net Food Importing Developing Countries + (Total) 5817 +
11 Net Food Importing Developing Countries > (List) 5817> >
Read data from a dataset:
As a list of tuples:
faostat.get_data(code, pars={}, coding={}, show_flags=False, null_values=False, show_notes=False, strval=True)
Given the code of a Faostat dataset, it returns the data as a list of tuples.
As a pandas dataframe:
faostat.get_data_df(code, pars={}, coding={}, show_flags=False, null_values=False, show_notes=False, strval=True)
To download only a subset of the dataset, you need to pass pars={key: value, ...}:
- key: parameter code obtained with list_pars();
- value: can be a number, a string or a list, from the codes obtained with get_par().
pars is optional, but recommended: without it the query may be too large and end in a timeout error.
If you want to download the data in a specific coding system, different from the 'FAO' default, you need to pass coding={key: value, ...}.
- key: parameter code obtained with list_pars();
- value: a coding system taken from the coding_systems listed by list_pars() for the given parameter (for instance 'ISO3' for 'area').
Set show_flags=True if you want to download also the data flags.
Set null_values=True to download also the null data.
Set show_notes=True to download the notes.
By default, the results are kept as provided by the Faostat API, so they are all strings (also the numbers). Set strval=False if you want the code to provide the results as numbers.
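If the data were downloaded with the default strval=True, a similar conversion can be applied afterwards by hand. The following is only a sketch of that idea, not the package's own conversion: the helper to_number is hypothetical and converts every field that looks like a number, which may differ from what strval=False does for individual columns.

```python
def to_number(text):
    """Hypothetical helper: int if possible, else float, else unchanged."""
    try:
        return int(text)
    except ValueError:
        try:
            return float(text)
        except ValueError:
            return text


# One row of string values, as returned by default by get_data().
row = ('QCL', 'Crops and livestock products', '2', 'Afghanistan', '5312',
       'Area harvested', '221', 'Almonds, with shell', '2014', '2014',
       'ha', '13703')

converted = tuple(to_number(value) for value in row)
print(converted[-1] + 1)  # 13704, the value is now an int
```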
Example: Download a subset of data, based on parameters, as a list of tuples, with default coding_system:
>>> mypars = {'element':[2312, 2313],'item':'221'}
>>> data = faostat.get_data('QCL', pars=mypars)
>>> data[40:44]
[('QCL', 'Crops and livestock products', '2', 'Afghanistan', '5312', 'Area harvested', '221', 'Almonds, with shell', '2014', '2014', 'ha', '13703'),
('QCL', 'Crops and livestock products', '2', 'Afghanistan', '5312', 'Area harvested', '221', 'Almonds, with shell', '2015', '2015', 'ha', '14676'),
('QCL', 'Crops and livestock products', '2', 'Afghanistan', '5312', 'Area harvested', '221', 'Almonds, with shell', '2016', '2016', 'ha', '19481'),
('QCL', 'Crops and livestock products', '2', 'Afghanistan', '5312', 'Area harvested', '221', 'Almonds, with shell', '2017', '2017', 'ha', '19793')]
Example: Download a subset of data, based on parameters, as a list of tuples, with a chosen coding_system:
>>> mypars = {'element':[2312, 2313],'item':'221'}
>>> mycoding = {'area': 'ISO3'}
>>> data = faostat.get_data('QCL', pars=mypars, coding=mycoding)
>>> data[40:44]
[('QCL', 'Crops and livestock products', 'AFG', 'Afghanistan', '5312', 'Area harvested', '221', 'Almonds, with shell', '2014', '2014', 'ha', '13703'),
('QCL', 'Crops and livestock products', 'AFG', 'Afghanistan', '5312', 'Area harvested', '221', 'Almonds, with shell', '2015', '2015', 'ha', '14676'),
('QCL', 'Crops and livestock products', 'AFG', 'Afghanistan', '5312', 'Area harvested', '221', 'Almonds, with shell', '2016', '2016', 'ha', '19481'),
('QCL', 'Crops and livestock products', 'AFG', 'Afghanistan', '5312', 'Area harvested', '221', 'Almonds, with shell', '2017', '2017', 'ha', '19793')]
Example: Download a subset of data as numbers, as a list of tuples:
>>> mypars = {'area': '5815',
...           'element': [2312, 2313],
...           'item': '221',
...           'year': [2020, 2021]}
>>> data = faostat.get_data('QCL', pars=mypars, strval=False)
>>> data
[('Domain Code', 'Domain', 'Area Code', 'Area', 'Element Code', 'Element', 'Item Code', 'Item', 'Year Code', 'Year', 'Unit', 'Value'),
('QCL', 'Crops and livestock products', 5815, 'Low Income Food Deficit Countries', 5312, 'Area harvested', 221, 'Almonds, in shell', 2020, '2020', 'ha', 112434),
('QCL', 'Crops and livestock products', 5815, 'Low Income Food Deficit Countries', 5312, 'Area harvested', 221, 'Almonds, in shell', 2021, '2021', 'ha', 129916)]
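With numeric values, the rows can be used directly in calculations. This is a minimal sketch on the two rows shown above; it locates the columns by header name rather than by position, and the names col, value_by_year and change are illustrative.

```python
# Sample copied from the get_data(..., strval=False) output shown above.
data = [
    ('Domain Code', 'Domain', 'Area Code', 'Area', 'Element Code', 'Element',
     'Item Code', 'Item', 'Year Code', 'Year', 'Unit', 'Value'),
    ('QCL', 'Crops and livestock products', 5815,
     'Low Income Food Deficit Countries', 5312, 'Area harvested', 221,
     'Almonds, in shell', 2020, '2020', 'ha', 112434),
    ('QCL', 'Crops and livestock products', 5815,
     'Low Income Food Deficit Countries', 5312, 'Area harvested', 221,
     'Almonds, in shell', 2021, '2021', 'ha', 129916),
]

header, *rows = data

# Column name -> position, so rows can be read by name.
col = {name: i for i, name in enumerate(header)}

# Harvested area per year, keyed by the numeric 'Year Code' column.
value_by_year = {row[col['Year Code']]: row[col['Value']] for row in rows}

change = value_by_year[2021] - value_by_year[2020]
print(change)  # 17482 (ha)
```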
Example: Download a subset of data as numbers, as a dataframe:
>>> mypars = {'area': '5815',
...           'element': [2312, 2313],
...           'item': '221',
...           'year': [2020, 2021]}
>>> df = faostat.get_data_df('QCL', pars=mypars, strval=False)
>>> df
Domain Code Domain Area Code ... Year Unit Value
0 QCL Crops and livestock products 5815 ... 2020 ha 112434
1 QCL Crops and livestock products 5815 ... 2021 ha 129916
To set up a proxy and other request arguments:
You may need to modify the default download settings. This package uses the requests package and lets you set some of its arguments:
- timeout: how long to wait for the server before raising an error, in seconds. Default is 120 seconds.
- proxies: sets the proxies. It overwrites the proxy setting of any previous call. For the Faostat API, only the https proxy is used. Default is None (the optional argument is not passed to the request).
- verify: whether to verify the server’s TLS certificate, or to use a CA bundle. Default is None (the optional argument is not passed to the request).
- cert: whether to use an SSL client cert file. Default is None (the optional argument is not passed to the request).
faostat.set_requests_args([timeout=120.], [proxies=None], [verify=None], [cert=None])
It returns None.
For detailed information, please refer to the documentation of the requests package.
Example:
>>> import faostat
>>> mytimeout = 240.
>>> myproxy = {'https': 'http://myuser:mypass@123.45.67.89:1234'}
>>> faostat.set_requests_args(timeout=mytimeout, proxies=myproxy)
To check the settings:
faostat.get_requests_args()
It returns a dictionary with the argument names and their respective values, exactly as they are passed to the request.
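To illustrate what "the optional argument is not passed to the request" means, the sketch below builds a dictionary of keyword arguments and leaves out every entry that is None. It is an assumption about the general idea, not the package's actual code: the helper build_request_kwargs is hypothetical.

```python
def build_request_kwargs(timeout=120., proxies=None, verify=None, cert=None):
    """Hypothetical helper: keep only the arguments that are actually set."""
    candidates = {
        'timeout': timeout,
        'proxies': proxies,
        'verify': verify,
        'cert': cert,
    }
    # Arguments left at None are dropped, so the HTTP library would fall
    # back to its own defaults for them.
    return {name: value for name, value in candidates.items()
            if value is not None}


kwargs = build_request_kwargs(
    timeout=240.,
    proxies={'https': 'http://myuser:mypass@123.45.67.89:1234'},
)
print(sorted(kwargs))  # ['proxies', 'timeout']
```

A dictionary of this shape could be unpacked into a call such as requests.get(url, **kwargs), since timeout, proxies, verify and cert are all keyword arguments accepted by requests.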
Bug reports and feature requests:
Please open an issue or send a message to noemi.cazzaniga [at] polimi.it. Before opening a new issue, please have a look at the existing issues.
Disclaimer:
Download and usage of Faostat data is subject to FAO's general terms and conditions.
Data sources:
- Faostat database: online catalog.
References:
- Python package pandas: Python Data Analysis Library.
- Python package eurostat: Tools to read data from Eurostat.
History:
version 1.1.2 (June 2024):
- Internal bug fix
version 1.1.1 (May 2024):
- Removed the functions get_areas, get_years, get_items and get_elements.
- Implemented all codings.
- set_requests_args and get_requests_args replace https_proxy args.
- Changed the base url.
- https input parameter is deprecated.
version 1.0.2 (Oct 2023):
- Bug fix: build.
version 1.0.1 (Oct 2023):
- Implemented all the parameters.
- Prevented list_datasets from showing the datasets that are not accessible (update_date=None).
version 0.1.1 (2022):
- First official release.