Introducing LeetScrape, a Python package designed to scrape problem statements and their topic and company tags, difficulty, test cases, hints, and code stubs from LeetCode.com. Easily download and save LeetCode problems to your local machine, making it convenient for offline practice and studying. It is perfect for anyone preparing for coding interviews. With LeetScrape, you can boost your coding skills and improve your chances of landing your dream job.
Project description
Leetcode Questions Scraper
Introducing LeetScrape, a Python package designed to scrape problem statements and basic test cases from LeetCode.com. With this package, you can easily download and save LeetCode problems to your local machine, making it convenient for offline practice and studying. It is perfect for software engineers and students preparing for coding interviews. The package is lightweight, easy to use, and can be integrated with other tools and IDEs. With LeetScrape, you can boost your coding skills and improve your chances of landing your dream job.
Use this package to get the list of LeetCode questions, their topic and company tags, difficulty, question body (including test cases, constraints, hints), and code stubs in any of the available programming languages.
Usage
Import the relevant classes from the leetscrape package:
from leetscrape.GetQuestionsList import GetQuestionsList
from leetscrape.GetQuestionInfo import GetQuestionInfo
from leetscrape.utils import combine_list_and_info, get_all_questions_body
Get the list of questions, companies, topic tags, and categories using the GetQuestionsList class:
ls = GetQuestionsList()
ls.scrape() # Scrape the list of questions
ls.to_csv(directory_path="../data/") # Save the scraped tables to a directory
Warning: The default ALL_JSON_URL in the GetQuestionsList class might be out of date. To update it, open https://leetcode.com/problemset/all/ and look in the Network tab of your browser's developer tools for a query returning all.json.
Query an individual question's information, such as the body, test cases, constraints, hints, code stubs, and company tags, using the GetQuestionInfo class:
import pandas as pd

# This table can be generated using the previous command
questions_info = pd.read_csv("../data/questions.csv")

# Scrape question body
questions_body_list = get_all_questions_body(
    questions_info["titleSlug"].tolist(),
    questions_info["paidOnly"].tolist(),
    save_to="../data/questionBody.pickle",
)

# Save to a pandas dataframe
questions_body = pd.DataFrame(questions_body_list).drop(columns=["titleSlug"])
questions_body["QID"] = questions_body["QID"].astype(int)
Note: The above snippet is time-consuming (10+ minutes) since there are 2500+ questions.
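Since the full scrape takes a while, it can help to checkpoint progress so that an interrupted run does not start from scratch. The following is a minimal sketch and not part of LeetScrape: `scrape_with_checkpoint`, its `fetch` argument, and the checkpoint path are hypothetical names, and `fetch` stands in for whatever per-question call you use.

```python
import os
import pickle
import time


def scrape_with_checkpoint(slugs, fetch, checkpoint_path, delay=0.0, save_every=50):
    """Call fetch(slug) for each slug, periodically saving results to disk.

    Hypothetical helper: `fetch` is any callable returning one record per slug.
    """
    results = {}
    # Resume from an earlier partial run if a checkpoint file exists.
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path, "rb") as f:
            results = pickle.load(f)

    def save():
        with open(checkpoint_path, "wb") as f:
            pickle.dump(results, f)

    fetched = 0
    for slug in slugs:
        if slug in results:
            continue  # already scraped in a previous run
        results[slug] = fetch(slug)
        fetched += 1
        if fetched % save_every == 0:
            save()
        if delay:
            time.sleep(delay)  # pause between requests

    save()
    # Return records in the order the slugs were given.
    return [results[s] for s in slugs]
```

A second call with the same checkpoint path only fetches slugs that are not yet stored, so rerunning after a failure continues where the previous run stopped.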
Create a new dataframe with all the questions and their metadata and body information.
questions = combine_list_and_info(
    info_df=questions_body, list_df=ls.questions, save_to="../data/all.json"
)
Create a PostgreSQL database using the SQL dump and insert data using sqlalchemy.
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
engine = create_engine("<database_connection_string>", echo=True)
questions.to_sql(con=engine, name="questions", if_exists="append", index=False)
# Repeat the same for tables ls.topicTags, ls.categories,
# ls.companies, ls.questionTopics, and ls.questionCategory
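The repeated `to_sql` calls for the remaining tables can be written as a loop. This is a minimal sketch: `upload_tables` is a hypothetical helper, and the table names used as dictionary keys are an assumption that should be checked against the table names defined in the SQL dump.

```python
def upload_tables(tables, engine, if_exists="append"):
    """Write each DataFrame in `tables` to the SQL table named by its key.

    Hypothetical helper: `tables` maps a table name to a pandas DataFrame.
    """
    uploaded = []
    for name, df in tables.items():
        # Same call as for the questions table, once per table.
        df.to_sql(con=engine, name=name, if_exists=if_exists, index=False)
        uploaded.append(name)
    return uploaded


# Example usage (assumes `ls` and `engine` exist as in the steps above, and
# that the database table names match these keys, which is not confirmed):
# upload_tables(
#     {
#         "topicTags": ls.topicTags,
#         "categories": ls.categories,
#         "companies": ls.companies,
#         "questionTopics": ls.questionTopics,
#         "questionCategory": ls.questionCategory,
#     },
#     engine,
# )
```

If the tables reference each other through foreign keys, the insertion order matters; a plain `dict` preserves the order in which the entries are listed.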
Use the queried_questions_list PostgreSQL function (defined in the SQL dump) to query for questions containing the query terms:
select * from queried_questions_list('<query term>');
Use the all_questions_list PostgreSQL function (defined in the SQL dump) to query for all the questions in the database:
select * from all_questions_list();
Use the get_similar_questions PostgreSQL function (defined in the SQL dump) to query for all questions similar to a given question:
select * from get_similar_questions(<QuestionID>);
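These SQL functions can also be called from Python. The sketch below is one possible way to do so, assuming a DB-API connection whose driver uses `%s` placeholders (as psycopg2 does), for example one obtained from `engine.raw_connection()`. `call_sql_function` is a hypothetical helper, not part of LeetScrape.

```python
import re


def call_sql_function(conn, function_name, *args):
    """Run `select * from function_name(args...)` and return all rows.

    Hypothetical helper; `conn` is a DB-API connection whose driver uses
    `%s` placeholders.
    """
    # The function name cannot be a bound parameter, so validate it strictly.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", function_name):
        raise ValueError(f"invalid function name: {function_name!r}")
    placeholders = ", ".join(["%s"] * len(args))
    sql = f"select * from {function_name}({placeholders})"
    cur = conn.cursor()
    try:
        # Arguments are passed separately so the driver handles quoting.
        cur.execute(sql, args or None)
        return cur.fetchall()
    finally:
        cur.close()


# Example usage (the query term and question ID are illustrative values):
# rows = call_sql_function(conn, "queried_questions_list", "binary search")
# rows = call_sql_function(conn, "all_questions_list")
# rows = call_sql_function(conn, "get_similar_questions", 1)
```

Passing the arguments as bound parameters rather than formatting them into the SQL string avoids quoting problems when the query term contains apostrophes.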
Use the extract_solutions method to extract solution code stubs from your Python script. Note that the solution method should be part of a class named Solution (see here for an example):
# Returns a dict of the form {QuestionID: solutions}
solutions = extract_solutions(filename=<path_to_python_script>)
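To illustrate the layout this step expects, the sketch below uses the standard `ast` module to list the methods of a class named `Solution` in a script. It is not the package's `extract_solutions` implementation and does not reproduce its keying by QuestionID; `find_solution_methods` and the sample script are hypothetical.

```python
import ast


def find_solution_methods(source):
    """Return {method_name: source_code} for methods of a class `Solution`."""
    tree = ast.parse(source)
    methods = {}
    for node in tree.body:
        # Only top-level classes named exactly `Solution` are considered.
        if isinstance(node, ast.ClassDef) and node.name == "Solution":
            for item in node.body:
                if isinstance(item, ast.FunctionDef):
                    methods[item.name] = ast.get_source_segment(source, item)
    return methods


sample = '''
class Solution:
    def twoSum(self, nums, target):
        seen = {}
        for i, n in enumerate(nums):
            if target - n in seen:
                return [seen[target - n], i]
            seen[n] = i
'''

print(list(find_solution_methods(sample)))  # prints ['twoSum']
```

A script whose solution lives in a free function or in a differently named class yields an empty result here, which mirrors the requirement that the solution method belong to a class named `Solution`.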
Use the upload_solutions method to upload the extracted solution code stubs from your Python script to the PostgreSQL database.
upload_solutions(engine=<sqlalchemy_engine>, row_id=<row_id_in_table>, solutions=<solutions_dict>)