expliot_finder.scraper.core package¶

Submodules¶

expliot_finder.scraper.core.cve_scrapper module¶

CVE (Common Vulnerabilities and Exposures) scraper.

Based on the service version captured when the target was scanned by the ‘vulnerability_scanner’ module, this module tries to find the most relevant CVEs on the web by scraping. If this module finds CVEs, the chance of finding a matching exploit increases. Information detected by this module is saved and returned in the following form:

# Returns the URL of the CVE that best matches the captured version
# of the service
'https://www.cvedetails.com/cve/CVE-2002-1646/'

class expliot_finder.scraper.core.cve_scrapper.SuitableCVEFinder(cve_table_url: str, service_version: str)[source]¶

Bases: object

Class holding a CVE scraper that scrapes the page ‘https://www.cvedetails.com’.

The scraper in this class finds the most relevant CVE for the captured service. When the methods of this class run, the ‘sites_finder’ module has already found a page with several CVEs stored in an HTML table. The methods extract the best-matching CVE for the service that was captured when the target was scanned by the ‘vulnerability_scanner’ module.

service_version¶: Detected version of the service for which a CVE is searched.

cve_table_url¶: URL of the page with an HTML table containing partially matching CVEs for the detected service. The scraper pulls out only the most suitable CVE.

cve_table_url: str¶

extracted_service_ver_in_nums() → list[str][source]¶

Extract only the numbers from the service version.

With the service version reduced to its numeric part, the scraper can find the most suitable CVE for the captured ‘service’ in the provided HTML table of CVEs.

Returns: List with a string of numbers from the ‘service version’, without any letters or words.
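The extraction described above can be sketched with a regular expression. This is a minimal illustration, not the package's implementation: the helper name `extract_version_numbers` and the choice to keep dotted runs such as "2.3.4" together are assumptions.

```python
import re


def extract_version_numbers(service_version: str) -> list[str]:
    """Return the numeric runs found in a service version string.

    Hypothetical helper: the real method may split the version differently.
    """
    # Digits, optionally followed by more ".digits" groups, e.g. "2.3.4".
    return re.findall(r"\d+(?:\.\d+)*", service_version)


print(extract_version_numbers("vsftpd 2.3.4"))
```

Letters between digit groups split the result, so a version such as "OpenSSH 3.4p1" yields two entries under this pattern.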

async find_suitable_cve() → Optional[list[str]][source]¶

Run the sequence of functions that scrapes the provided HTML page with CVEs.

This method executes the functions in the following order:

1. Extract the numbers from the service version; they are used to find the most suitable CVE.
2. Asynchronously get the whole content of the HTML table with links to CVEs.
3. Using the numbers extracted from the captured service version, asynchronously scrape the downloaded HTML table page to find the most suitable CVE for the captured service.

Returns: A single URL of the most suitable CVE for the captured ‘service’.

async get_page_content() → bytes[source]¶

Create an async client session and perform a GET request.

The GET request goes to the page (‘self.cve_table_url’) with CVEs stored in an HTML table.

Returns: Content of the page with several CVEs stored in an HTML table.
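As a rough sketch of an asynchronous GET that returns raw bytes: the documented method uses an async client session, whose library is not named here, so this example swaps in the standard library (`urllib.request` run in a worker thread). The names `get_page_bytes` and `_blocking_get` and the timeout value are assumptions.

```python
import asyncio
import urllib.request


def _blocking_get(url: str, timeout: float) -> bytes:
    """Plain blocking GET; returns the undecoded response body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


async def get_page_bytes(url: str, timeout: float = 10.0) -> bytes:
    """Hypothetical helper, not the package's get_page_content()."""
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(_blocking_get, url, timeout)
```

A caller would await it from a coroutine, for example `content = await get_page_bytes(cve_table_url)`. `asyncio.to_thread` requires Python 3.9 or newer.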

async static scrape_cve_table_page(page_content: bytes, parsed_service_ver: list[str]) → Optional[str][source]¶

Scrape the provided HTML table to find the most suitable CVE for the detected service.

‘page_content’ holds a page with an HTML table filled with all CVEs that partially match the service. ‘Partially’ means that the table was found by the ‘sites_finder’ module, which searched for CVEs by ‘service name’ rather than by exact ‘service version’. The table therefore stores several CVEs for different versions of the captured service, and this scraper extracts the best-matching CVE by searching for the exact version of the captured service. The page passed in the ‘page_content’ parameter must come from the domain https://www.cvedetails.com.

Parameters

parsed_service_ver – List with a string of numbers from the ‘service version’, without any letters or words. With the service version prepared this way, the scraper finds the most suitable CVE for the captured service.
page_content – Content of a page from the domain ‘https://www.cvedetails.com’ with several CVEs stored in an HTML table that partially match the service.

Returns

URL of the CVE most suitable for the captured version of the service.
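A minimal sketch of the table scan, using only the standard library. The real layout of the cvedetails.com table is not described here, so the sample markup, the placeholder CVE IDs, the helper name `pick_cve_url`, and the plain substring match are all assumptions.

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urljoin

BASE_URL = "https://www.cvedetails.com"


class _RowCollector(HTMLParser):
    """Collect (first CVE link, row text) for every <tr> in the page."""

    def __init__(self) -> None:
        super().__init__()
        self.rows: list[tuple[Optional[str], str]] = []
        self._href: Optional[str] = None
        self._text: list[str] = []
        self._in_row = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._in_row, self._href, self._text = True, None, []
        elif tag == "a" and self._in_row and self._href is None:
            href = dict(attrs).get("href") or ""
            if "/cve/" in href:
                self._href = href

    def handle_data(self, data):
        if self._in_row:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "tr" and self._in_row:
            self.rows.append((self._href, " ".join(self._text)))
            self._in_row = False


def pick_cve_url(page_content: bytes, parsed_service_ver: list[str]) -> Optional[str]:
    """Return the first CVE link whose row mentions every version number."""
    parser = _RowCollector()
    parser.feed(page_content.decode("utf-8", errors="replace"))
    for href, text in parser.rows:
        # Naive substring match; "3.4" would also match "13.45".
        if href and all(num in text for num in parsed_service_ver):
            return urljoin(BASE_URL, href)
    return None


# Made-up markup and placeholder IDs, for illustration only.
SAMPLE_PAGE = b"""<table>
<tr><td><a href="/cve/CVE-0000-0001/">CVE-0000-0001</a></td><td>Affects ExampleD 1.2</td></tr>
<tr><td><a href="/cve/CVE-0000-0002/">CVE-0000-0002</a></td><td>Affects ExampleD 3.4</td></tr>
</table>"""

print(pick_cve_url(SAMPLE_PAGE, ["3.4"]))
```

When no row mentions the version the function returns `None`, which mirrors the `Optional[str]` return type in the signature.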

service_version: str¶

expliot_finder.scraper.core.sites_finder module¶

Search for pages with CVEs or ready exploits for the captured ‘service_version’.

Information detected by this module is saved and returned in the following form:

# Returns a list of URLs that lead to CVEs or exploits that match or
# partially match the captured service version.
[
    'https://www.exploit-db.com/exploits/21314',
    'https://www.cvedetails.com/vulnerability-list/vendor_id-120/product_id-317/SSH-Ssh2.html',
    ...
]

class expliot_finder.scraper.core.sites_finder.GoogleSitesFinder(service_version: str)[source]¶

Bases: object

Finder of ready exploits and CVEs on the web for the captured ‘service’.

Using the Google search engine, the methods run queries to find sites with matching exploits and CVEs for the service version that was captured when the target was scanned and that is currently being iterated over in ‘__main__.py’. If multiple service versions have been captured, an instance of the class (‘GoogleSitesFinder’) is created once per version, each time with the next captured service version. Found pages are filtered by domain (the selected domains contain appropriate content).

service_version¶: Version of the service that was captured when the target was scanned by the ‘vulnerability_scanner’ module.

search_query¶: String made of (‘base_query’ + service_version); combining these two strings gives a query that can be used to search Google for ready exploits and CVEs.
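The way ‘search_query’ is assembled can be sketched as follows. The actual value of ‘base_query’ is not given in this documentation, so `BASE_QUERY` below is a made-up placeholder, and the helper names and the search URL form are assumptions.

```python
from urllib.parse import urlencode

# Placeholder only; the package's real base query is not documented here.
BASE_QUERY = "exploit OR CVE"


def build_search_query(service_version: str, base_query: str = BASE_QUERY) -> str:
    """Combine the base query with the captured service version."""
    return f"{base_query} {service_version.strip()}"


def build_search_url(search_query: str) -> str:
    """URL-encode the query into a Google search URL (hypothetical helper)."""
    return "https://www.google.com/search?" + urlencode({"q": search_query})


print(build_search_url(build_search_query("OpenSSH 3.4p1")))
```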

async __send_search_query() → HTMLResponse¶

Send the ‘search_query’ with a GET request through an asynchronous session.

Parameters: search_query – Search query used to find ready exploits or CVEs for the captured ‘service_version’.
Returns: HTML response object. Its content is the same as if the query had been made directly in the Google search engine.

__extract_urls() → list[str]¶

Extract all URLs from the HTML response.

The HTML response contains URLs to different sites along with other content. This method extracts only the site URLs from the whole HTMLResponse content.

Parameters: response – HTML response that stores the content returned after executing a query in the Google search engine.
Returns: Links to different pages extracted from the HTML response content.
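A rough sketch of link extraction from raw HTML, using the standard-library parser. The response object the real method works with is not modeled here; `extract_links` is a hypothetical helper that takes the HTML as a string and keeps only absolute links.

```python
from html.parser import HTMLParser


class _LinkCollector(HTMLParser):
    """Collect absolute http(s) links from anchor tags."""

    def __init__(self) -> None:
        super().__init__()
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Skip anchors without href and relative links.
        if href and href.startswith(("http://", "https://")):
            self.links.append(href)


def extract_links(html: str) -> list[str]:
    """Return absolute links in document order (hypothetical helper)."""
    collector = _LinkCollector()
    collector.feed(html)
    return collector.links


print(extract_links('<a href="https://www.exploit-db.com/exploits/21314">x</a>'))
```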

static filter_extracted_urls(site: str, urls: list[str]) → list[str][source]¶

Filter the extracted URLs to find pages with CVEs or ready exploits.

Parameters

site –
The value depends on the parameter provided in ‘exploit_finder.executor’ and can be one of:
- ‘https://www.exploit-db.com’
- ‘https://www.cvedetails.com’
Only pages from those domains are returned.
urls – Links to different pages extracted from the HTML response content.

Returns

URLs that lead to CVEs or exploits that match or partially match the captured service version.
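Filtering by domain can be sketched by comparing host names. This is an illustration under assumptions: `filter_urls_by_site` is a hypothetical helper, and the real method may match URLs differently (for example by string prefix).

```python
from urllib.parse import urlsplit


def filter_urls_by_site(site: str, urls: list[str]) -> list[str]:
    """Keep only the URLs whose host equals the host of `site`."""
    wanted_host = urlsplit(site).netloc.lower()
    return [url for url in urls if urlsplit(url).netloc.lower() == wanted_host]


found = [
    "https://www.exploit-db.com/exploits/21314",
    "https://www.cvedetails.com/vulnerability-list/vendor_id-120/product_id-317/SSH-Ssh2.html",
    "https://example.org/unrelated",
]
print(filter_urls_by_site("https://www.exploit-db.com", found))
```

Comparing parsed host names avoids false matches on URLs that merely contain the domain somewhere in their path or query string.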

async search_for_pages(site: str) → list[str][source]¶

Run the functions that find pages with ready exploits or CVEs.

Ready exploits and CVEs are searched for the service version that was captured when the target was scanned by the ‘vulnerability_scanner’ module.

Parameters

site –

The site the scraper should search. Can be one of:

‘https://www.exploit-db.com’
‘https://www.cvedetails.com’

Returns

List of pages containing ready-made exploits for the detected ‘service_version’ or pages that contain information about the detected ‘service_version’.

property search_query: str¶

Get google_query.

Returns

‘search_query’ that will be used in the Google search engine to find ready exploits or CVEs.

Module contents¶

Aliases for modules: (‘cve_scrapper’, ‘sites_finder’).

class expliot_finder.scraper.core.GoogleSitesFinder(service_version: str)[source]¶

Bases: object

Finder of ready exploits and CVEs on the web for the captured ‘service’.

service_version¶: Version of the service that was captured when the target was scanned by the ‘vulnerability_scanner’ module.

search_query¶: String made of (‘base_query’ + service_version); combining these two strings gives a query that can be used to search Google for ready exploits and CVEs.

async __send_search_query() → HTMLResponse¶

Send the ‘search_query’ with a GET request through an asynchronous session.

Parameters: search_query – Search query used to find ready exploits or CVEs for the captured ‘service_version’.
Returns: HTML response object. Its content is the same as if the query had been made directly in the Google search engine.

__extract_urls() → list[str]¶

Extract all URLs from the HTML response.

The HTML response contains URLs to different sites along with other content. This method extracts only the site URLs from the whole HTMLResponse content.

Parameters: response – HTML response that stores the content returned after executing a query in the Google search engine.
Returns: Links to different pages extracted from the HTML response content.

static filter_extracted_urls(site: str, urls: list[str]) → list[str][source]¶

Filter the extracted URLs to find pages with CVEs or ready exploits.

Parameters

site –
The value depends on the parameter provided in ‘exploit_finder.executor’ and can be one of:
- ‘https://www.exploit-db.com’
- ‘https://www.cvedetails.com’
Only pages from those domains are returned.
urls – Links to different pages extracted from the HTML response content.

Returns

URLs that lead to CVEs or exploits that match or partially match the captured service version.

async search_for_pages(site: str) → list[str][source]¶

Run the functions that find pages with ready exploits or CVEs.

Ready exploits and CVEs are searched for the service version that was captured when the target was scanned by the ‘vulnerability_scanner’ module.

Parameters

site –

The site the scraper should search. Can be one of:

‘https://www.exploit-db.com’
‘https://www.cvedetails.com’

Returns

List of pages containing ready-made exploits for the detected ‘service_version’ or pages that contain information about the detected ‘service_version’.

property search_query: str¶

Get google_query.

Returns

‘search_query’ that will be used in the Google search engine to find ready exploits or CVEs.

class expliot_finder.scraper.core.SuitableCVEFinder(cve_table_url: str, service_version: str)[source]¶

Bases: object

Class holding a CVE scraper that scrapes the page ‘https://www.cvedetails.com’.

service_version¶

Detected version of the service for which a CVE is searched.

Type: str

cve_table_url¶

URL of the page with an HTML table containing partially matching CVEs for the detected service. The scraper pulls out only the most suitable CVE.

Type: str

cve_table_url: str¶

extracted_service_ver_in_nums() → list[str][source]¶

Extract only the numbers from the service version.

With the service version reduced to its numeric part, the scraper can find the most suitable CVE for the captured ‘service’ in the provided HTML table of CVEs.

Returns: List with a string of numbers from the ‘service version’, without any letters or words.

async find_suitable_cve() → Optional[list[str]][source]¶

Run the sequence of functions that scrapes the provided HTML page with CVEs.

This method executes the functions in the following order:

1. Extract the numbers from the service version; they are used to find the most suitable CVE.
2. Asynchronously get the whole content of the HTML table with links to CVEs.
3. Using the numbers extracted from the captured service version, asynchronously scrape the downloaded HTML table page to find the most suitable CVE for the captured service.

Returns: A single URL of the most suitable CVE for the captured ‘service’.

async get_page_content() → bytes[source]¶

Create an async client session and perform a GET request.

The GET request goes to the page (‘self.cve_table_url’) with CVEs stored in an HTML table.

Returns: Content of the page with several CVEs stored in an HTML table.

async static scrape_cve_table_page(page_content: bytes, parsed_service_ver: list[str]) → Optional[str][source]¶

Scrape the provided HTML table to find the most suitable CVE for the detected service.

Parameters

parsed_service_ver – List with a string of numbers from the ‘service version’, without any letters or words. With the service version prepared this way, the scraper finds the most suitable CVE for the captured service.
page_content – Content of a page from the domain ‘https://www.cvedetails.com’ with several CVEs stored in an HTML table that partially match the service.

Returns

URL of the CVE most suitable for the captured version of the service.

service_version: str¶