
A batch to analyse AI jobs in Vietnam

Project description

:poop: AIJobs collector :poop:

Batch app

This repo contains batch code that collects data from several top job-posting sites in Vietnam, such as Indeed VN, VietnamWorks, and TopCV.

We use GitHub Actions to collect the data automatically. Note that some websites in Vietnam have mechanisms to block scrapers and other bots, so the batch must keep retrying every 5 minutes.
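The retry behaviour described above can be sketched as a fixed-interval loop. This is a minimal illustration, not the project's actual code: the function name `retry_every`, the `max_attempts` cap, and the injectable `sleep` parameter are assumptions made for the example. Only the 5-minute interval comes from the text.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def retry_every(
    fetch: Callable[[], T],
    interval_seconds: float = 300.0,  # 5 minutes, as stated in the text
    max_attempts: int = 12,           # hypothetical cap, not from the project
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch() until it succeeds, waiting a fixed interval between tries."""
    last_error: Optional[Exception] = None
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except Exception as error:  # a real batch would catch narrower errors
            last_error = error
            if attempt < max_attempts:
                # Wait before the next try; no wait after the final failure.
                sleep(interval_seconds)
    raise RuntimeError(f"gave up after {max_attempts} attempts") from last_error
```

Passing `sleep` as a parameter keeps the loop testable without real waiting. A capped attempt count is one possible design choice to stop a blocked job from running indefinitely.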

Currently, we collect data from the following websites.

| Website | URL | Batch from | Batch cron | Queries |
| --- | --- | --- | --- | --- |
| TopCV | https://www.topcv.vn | 2023-08-19 | 59 12 * * * or manual | ai engineer, computer vision, machine learning |
| VietnamWorks | https://vietnamworks.com | 2023-08-19 | 59 12 * * * or manual | ai engineer, computer vision, machine learning |
| Indeed Vietnam | https://vn.indeed.com | 2023-08-19 | 59 12 * * * or manual | ai engineer, computer vision, machine learning |
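The table can be read as configuration: each site is crawled once per query on the same schedule. The cron expression `59 12 * * *` means minute 59 of hour 12 every day. GitHub Actions evaluates schedules in UTC, so this would correspond to 19:59 in Vietnam (UTC+7). The sketch below encodes the table as data and enumerates the resulting (site, query) pairs. The `Source` class and `crawl_jobs` function are hypothetical names, not taken from the project.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple

QUERIES = ("ai engineer", "computer vision", "machine learning")
DAILY_CRON = "59 12 * * *"  # minute 59, hour 12, every day


@dataclass(frozen=True)
class Source:
    """One row of the table above (hypothetical structure)."""
    name: str
    url: str
    batch_from: str
    cron: str = DAILY_CRON
    queries: Tuple[str, ...] = QUERIES


SOURCES: List[Source] = [
    Source("TopCV", "https://www.topcv.vn", "2023-08-19"),
    Source("VietnamWorks", "https://vietnamworks.com", "2023-08-19"),
    Source("Indeed Vietnam", "https://vn.indeed.com", "2023-08-19"),
]


def crawl_jobs(sources: List[Source]) -> Iterator[Tuple[str, str]]:
    """Yield one (site name, query) pair per search the batch would run."""
    for source in sources:
        for query in source.queries:
            yield source.name, query
```

With three sites and three queries, one scheduled run would issue nine searches under these assumptions.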

Online app

Besides the batch app, which is set up in GitHub Actions to crawl data daily, we provide an online app for testing scenarios on the collected data. The collections are stored in MongoDB. To set up an environment for analysing the data, see mongodb environment setup.
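Because the batch runs daily, the same posting is likely to be seen more than once. One way to keep a MongoDB collection free of duplicates is to upsert each posting under a stable key. The sketch below is an assumption about how this could be done, not the project's actual schema: the `site` and `url` field names and the functions `job_key` and `save_job` are hypothetical. The `collection` argument is expected to behave like a PyMongo `Collection`, whose `update_one(filter, update, upsert=True)` method is used here.

```python
import hashlib
from typing import Any, Dict


def job_key(site: str, url: str) -> str:
    """Stable identifier for a posting, derived from its site and URL."""
    return hashlib.sha256(f"{site}|{url}".encode("utf-8")).hexdigest()


def save_job(collection: Any, job: Dict[str, Any]) -> None:
    """Upsert one posting so re-running the batch does not duplicate it."""
    key = job_key(job["site"], job["url"])
    # Insert when the key is new; otherwise overwrite the stored fields.
    collection.update_one({"_id": key}, {"$set": job}, upsert=True)
```

With PyMongo installed, `collection` could be obtained with something like `MongoClient("mongodb://localhost:27017")["aijobs"]["jobs"]`, where the connection string, database name, and collection name are placeholders.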

To run the online app:

$ pip uninstall aijobs_batch
$ python setup.py install
$ aijobs_online --reload --workers 1 --host 0.0.0.0 --port 9000 --log_level info


Built Distributions

AIJobs_Batch-1.0.0a1-py3.11-py3-any.whl (18.1 kB, Python 3)
AIJobs_Batch-1.0.0a1-py3.10-py3-any.whl (18.1 kB, Python 3)
AIJobs_Batch-1.0.0a1-py3.9-py3-any.whl (18.1 kB, Python 3)
AIJobs_Batch-1.0.0a1-py3.8-py3-any.whl (18.1 kB, Python 3)
AIJobs_Batch-1.0.0a1-py3.7-py3-any.whl (18.1 kB, Python 3)
