Scraping API for LinkedIn, built on the back of linkedin_api by Tom Quirk

These details have not been verified by PyPI

Project description

LI scrAPI for Python

An API that uses LinkedIn's current endpoints solely to scrape data, without the official API. Based entirely on linkedin-api by Tom Quirk.

Why use this library rather than the other one?

Async support
Types powered by Pydantic
HTTPX, so HTTP/2 is supported as well
You only want data and don't need to interact with LinkedIn

Caution: This library is not officially supported by LinkedIn. Using it might violate LinkedIn's Terms of Service. Use it at your own risk.

Installation

Python >= 3.10 required

Quick Start

Script Client

# credentials is assumed to be a dict holding your LinkedIn username and password
linkedin = LinkedInScriptApi(credentials["username"], credentials["password"])
jobs = linkedin.search_jobs("Software", total_jobs=10_000)

Async Version

session = AsyncClient()
client = AsyncLinkedInClient(session=session)
linkedin = AsyncLinkedIn(client)
await linkedin.authenticate(credentials["username"], credentials["password"])
await linkedin.get_profile_privacy_settings("khalid-a-53a190142")
profile = await linkedin.search_people(current_company=[CompanyID.GOOGLE], past_companies=[CompanyID.APPLE], include_private_profiles=True)
company = await linkedin.get_company_updates(public_id="google")
await linkedin.get_organization("google")
jobs = await linkedin.search_jobs(
    "Software Engineer",
    sort_by=SortBy.DATE,
    location=GeoID.USA,
    remote=[LocationType.ONSITE],
    limit=10,
)
if jobs:
    for job in jobs.elements:
        # The job ID is the last colon-separated segment of the tracking URN
        job_id = job.tracking_urn.split(":")[-1]
        job_complete = await linkedin.get_job(job_id)
        job_skills = await linkedin.get_job_skills(job_id)
        print(job_complete)
await linkedin.search({"keywords": "software"})
res = await linkedin.search_people(keywords="software", include_private_profiles=True)
await linkedin._close()
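
The loop above takes the last colon-separated segment of `job.tracking_urn` as the job ID. A minimal sketch of that step as a standalone helper follows; the name `job_id_from_urn` and the sample URN are hypothetical and not part of this library.

```python
def job_id_from_urn(tracking_urn: str) -> str:
    """Return the trailing segment of a colon-separated URN."""
    job_id = tracking_urn.rsplit(":", 1)[-1]
    if not job_id:
        # A URN ending in ":" carries no usable ID
        raise ValueError(f"URN has no trailing ID: {tracking_urn!r}")
    return job_id


# Hypothetical URN, made up for illustration only
print(job_id_from_urn("urn:li:jobPosting:1234567890"))
```

Failing loudly on an empty trailing segment avoids passing an empty ID on to `get_job`.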

Sync Version

session = Client()
client = LinkedInClient(session=session)
linkedin = LinkedIn(client)
linkedin.authenticate(credentials["username"], credentials["password"])

linkedin.get_profile_privacy_settings("khalid-a-53a190142")
profile = linkedin.search_people(current_company=[CompanyID.GOOGLE], past_companies=[CompanyID.APPLE], include_private_profiles=True)
company = linkedin.get_company_updates(public_id="google")
linkedin.get_organization("google")
jobs = linkedin.search_jobs(
    "Software Engineer",
    sort_by=SortBy.DATE,
    location=GeoID.USA,
    remote=[LocationType.ONSITE],
    limit=10,
)
if jobs:
    for job in jobs.elements:
        # The job ID is the last colon-separated segment of the tracking URN
        job_id = job.tracking_urn.split(":")[-1]
        job_complete = linkedin.get_job(job_id)
        job_skills = linkedin.get_job_skills(job_id)
        print(job_complete)
linkedin.search({"keywords": "software"})
res = linkedin.search_people(keywords="software", include_private_profiles=True)
linkedin._close()

Documentation

The examples above give a quick rundown of the API. If this project takes off or gets some traction, I'll write dedicated docs. The code also has enough docstrings and types to show how to interact with it.

Disclaimer

This library is not endorsed or supported by LinkedIn. It is an unofficial library intended for educational purposes and personal use only. By using this library, you agree to not hold the author or contributors responsible for any consequences resulting from its usage.

Contributing

Any and all contributions are helpful. If you have discovered any of the IDs LinkedIn uses for anything of interest, make a PR and add them to the query options.

If you feel we need a new method or something else to pull from LinkedIn, the following would be very helpful:

Add the method to the LinkedIn Interface
Supply the logic to both the sync and async classes
Add mock tests for the assumed LinkedIn response
Add the method to the script if necessary
Make a PR and let's merge it in!

Development

Development installation

TODO

Troubleshooting

I keep getting a `CHALLENGE`

LinkedIn will throw you a curveball in the form of a challenge URL, which requires JavaScript to solve. Your best chance at resolution is to log in on your browser, then use a separate library like browser-cookie3 to get the cookie from your browser and pass it to the API.
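
The advice above comes down to reading cookies your browser already holds. A rough sketch of that step follows, assuming browser-cookie3 is installed and you are logged in to LinkedIn in a local browser. The helper names are hypothetical, and how this library accepts cookies is not shown in this README, so the sketch stops at producing a plain name-to-value dict.

```python
from http.cookiejar import CookieJar


def linkedin_cookies(jar: CookieJar) -> dict[str, str]:
    """Return name -> value for cookies scoped to linkedin.com."""
    result = {}
    for cookie in jar:
        domain = cookie.domain.lstrip(".")
        # Accept linkedin.com itself and any of its subdomains
        if domain == "linkedin.com" or domain.endswith(".linkedin.com"):
            result[cookie.name] = cookie.value
    return result


def load_browser_cookies() -> dict[str, str]:
    """Read LinkedIn cookies from a locally installed browser."""
    # Third-party dependency, imported lazily: pip install browser-cookie3
    import browser_cookie3

    return linkedin_cookies(browser_cookie3.load(domain_name="linkedin.com"))
```

Calling `load_browser_cookies()` only works on a machine with a browser profile that has an active LinkedIn session.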

Search problems

Mileage may vary when searching general keywords like "software" using the standard search method. LinkedIn has recently added some smarts around search whereby results are grouped by people, company, jobs etc. if the query is general enough. Try to use an entity-specific search method (e.g. search_people) where possible. Likewise, if there is something you feel should be supported, please request it and include a curl statement so the request can be built.

How it works

This project attempts to provide a simple Python interface for the LinkedIn API.

Do you mean the legit LinkedIn API?

NO! To retrieve structured data, the LinkedIn website uses a service they call Voyager. Voyager endpoints give us access to pretty much everything we could want from LinkedIn: profiles, companies, connections, messages, etc. Anything that you can see on linkedin.com, we can get from Voyager.

How does it work?

Deep dive

Voyager endpoints look like this:

https://www.linkedin.com/voyager/api/identity/profileView/tom-quirk

Or, more clearly:

 ___________________________________ _______________________________
|             base path             |            resource           |
https://www.linkedin.com/voyager/api /identity/profileView/tom-quirk
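
The base path and resource split shown above can be sketched in a few lines. This is an illustration only, not this library's code; the names `BASE_PATH`, `voyager_url` and `split_voyager_url` are hypothetical.

```python
from urllib.parse import quote

BASE_PATH = "https://www.linkedin.com/voyager/api"


def voyager_url(resource: str) -> str:
    """Join the Voyager base path and a resource path into a full URL."""
    # Quote each path segment so IDs with unusual characters stay valid
    segments = [quote(part, safe="") for part in resource.strip("/").split("/")]
    return f"{BASE_PATH}/{'/'.join(segments)}"


def split_voyager_url(url: str) -> tuple[str, str]:
    """Split a Voyager URL back into (base path, resource)."""
    if not url.startswith(BASE_PATH + "/"):
        raise ValueError(f"not a Voyager URL: {url}")
    return BASE_PATH, url[len(BASE_PATH):]


url = voyager_url("/identity/profileView/tom-quirk")
print(url)
print(split_voyager_url(url))
```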

Requests are authenticated with a simple cookie, which we send with every request, along with a bunch of headers.

To get a cookie, we POST a given username and password (of a valid LinkedIn user account) to https://www.linkedin.com/uas/authenticate.
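
As a rough sketch of that login step, the snippet below builds the POST request without sending it. The form field names `session_key` and `session_password` are assumptions made for illustration and are not confirmed by this README; the helper names are hypothetical as well.

```python
from urllib.parse import urlencode
from urllib.request import Request

AUTH_URL = "https://www.linkedin.com/uas/authenticate"


def build_auth_request(username: str, password: str) -> Request:
    """Build (but do not send) a form-encoded login POST."""
    # Field names are assumed for illustration, not taken from this README
    form = urlencode({"session_key": username, "session_password": password})
    return Request(
        AUTH_URL,
        data=form.encode("utf-8"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def cookie_header(cookies: dict[str, str]) -> str:
    """Render cookies as the value of a Cookie request header."""
    return "; ".join(f"{name}={value}" for name, value in cookies.items())


req = build_auth_request("user@example.com", "hunter2")
print(req.get_method(), req.full_url)
```

Once a login succeeds, the cookies from the response would be sent back on each later request, for example via `cookie_header`.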

Project details

Release history

This version

1.0.0

Jul 13, 2024

Download files

Source Distribution

li_scrapi-1.0.0.tar.gz (33.4 kB)

Uploaded Jul 13, 2024 Source

Built Distribution

li_scrapi-1.0.0-py3-none-any.whl (41.0 kB)

Uploaded Jul 13, 2024 Python 3

File details

Details for the file li_scrapi-1.0.0.tar.gz.

File metadata

Download URL: li_scrapi-1.0.0.tar.gz
Upload date: Jul 13, 2024
Size: 33.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: poetry/1.8.3 CPython/3.12.3 Darwin/21.6.0

File hashes

Hashes for li_scrapi-1.0.0.tar.gz
Algorithm	Hash digest
SHA256	`13ff91df1fc00475e3d5d1818a4ffe56102eebd005988f0405b2f4ac28928613`
MD5	`40adb158b512cb90248da461a7fda04c`
BLAKE2b-256	`f67f1207f8c5bc9b7ef3314e16ae6eac618c10dc483074657cea1f1cebfd2add`

File details

Details for the file li_scrapi-1.0.0-py3-none-any.whl.

File metadata

Download URL: li_scrapi-1.0.0-py3-none-any.whl
Upload date: Jul 13, 2024
Size: 41.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: poetry/1.8.3 CPython/3.12.3 Darwin/21.6.0

File hashes

Hashes for li_scrapi-1.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`fd7fcd15c6f80572dbca62695b0fa05404298e3085d88b02ed9ae5b6670f6608`
MD5	`61bc9fac69683213aa4e9cce2a66b861`
BLAKE2b-256	`01abc8bdee63cf67935c53b14d18754d823de51d4fc55bc59b8b5fe982f91b96`
