Parser for easy data retrieval from cian.ru
Cian Parser
What is this?
This is a parser that makes it easy to get apartment listing data from the cian.ru website.
Quick Guide
This module is built on Selenium-Stealth and also uses BeautifulSoup and asyncio.
Data you can get:
- Name of apartment
- The city district in which the housing estate is located
- Price of the apartment
- Time to the subway
- How to get to the subway
- Nearest subway
- Price per square meter
- Total area
- Living Space
- Floor
- Number of floors in the building
- Year the building was completed
- Whether the building has been handed over or not
- Finishing
- Parking
- Ceiling Heights
- Builder Rating
Usage
Using the library is simple.
First, import everything from the library (use the from ... import * construct).
Examples of all operations:
Create an instance of the Cian_Parser class
(PATH: path where the output file is saved; URL: site URL; BOOST (True or False): controls whether the seller rating is also collected, which makes parsing several times slower; COUNT_PAGE: number of pages of apartment listings to fetch):
parser = Cian_Parser(PATH, URL, BOOST, COUNT_PAGE)
Get the data for all apartments in CSV format with the start_parsing() method:
parser.start_parsing()
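Once parsing has finished, the saved file can be inspected with the standard library. The following is a minimal sketch, not part of this library: load_rows is a hypothetical helper, and it assumes the file at PATH is UTF-8 encoded, has a header row, and is comma-delimited. Change the delimiter if the real file differs.

```python
import csv
from pathlib import Path


def load_rows(path, delimiter=","):
    """Read a CSV file into a list of dicts keyed by the header row."""
    # Assumptions: UTF-8 encoding, a header row, and a comma delimiter.
    # None of these are confirmed by the library; adjust as needed.
    with Path(path).open(newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f, delimiter=delimiter))
```

For example, rows = load_rows(PATH) returns one dict per apartment, and rows[0].keys() shows which columns the file actually contains.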
If you want to build your own parsing logic, use the other modules described below:
Create an instance of the Pagination class
(parser: a parser instance of the Flats_Url class; next_button_selector: XPath of the pagination button):
pagination = Pagination(parser, next_button_selector)
Check whether there is a next page with the HasNextPage() method:
await pagination.HasNextPage()
Go to the next page with the GoToTheNextPage() method:
await pagination.GoToTheNextPage()
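The two calls above are coroutines, so they are normally combined in a loop inside an async function. The sketch below shows one way to do that. FakePagination is a hypothetical stand-in, used only so the example runs without a browser, and walk_pages is an illustrative helper, not part of the library. It assumes HasNextPage() returns a truthy value while more pages exist.

```python
import asyncio


class FakePagination:
    """Hypothetical stand-in for Pagination (no browser needed)."""

    def __init__(self, total_pages):
        self.page = 1
        self.total_pages = total_pages

    async def HasNextPage(self):
        # Assumption: the real method returns a truthy value
        # while a next page is available.
        return self.page < self.total_pages

    async def GoToTheNextPage(self):
        self.page += 1


async def walk_pages(pagination, max_pages):
    """Visit up to max_pages pages and return how many were visited."""
    visited = 1  # the page that is already open counts as the first one
    while visited < max_pages and await pagination.HasNextPage():
        await pagination.GoToTheNextPage()
        visited += 1
    return visited


visited = asyncio.run(walk_pages(FakePagination(total_pages=3), max_pages=10))
print(visited)  # prints 3
```

With the real library, a Pagination instance would take the place of FakePagination, and the per-page scraping would go inside the loop before the call to GoToTheNextPage().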
Developer
My site: link