0x00 Preface

---

The previous article 'Penetration Basics - Exchange Version Detection and Vulnerability Scanning' introduced two methods for version detection using Python. For version identification, known version information is first obtained from the official website, stored in a list, and then detailed Exchange version information is obtained through string matching. The open-source code Exchange_GetVersion_MatchVul.py received positive feedback. However, this method has a drawback: it requires regularly accessing the official website to manually update the version information list in the scanning script.

To further improve efficiency, this article introduces another implementation: the script accesses the official website and extracts detailed version information directly from the returned data. The advantage is that the script no longer needs regular updates.

0x01 Introduction

---

This article will cover the following:

  • Parsing webpage data with BeautifulSoup
  • Implementation details
  • Open-source code

0x02 Parsing Webpage Data with BeautifulSoup

---

BeautifulSoup is a Python library for extracting data from HTML or XML files, which can improve development efficiency.

Installation:

pip install bs4

1. Basic Usage

In Python, first fetch the webpage content with the requests library, then pass it to BeautifulSoup for parsing.

Test code:

from bs4 import BeautifulSoup
import requests
import urllib3
urllib3.disable_warnings()
url = "https://docs.microsoft.com/en-us/exchange/new-features/build-numbers-and-release-dates?view=exchserver-2019"

headers = {
"User-Agent": "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.129 Safari/537.36",
}
response = requests.get(url, verify=False, headers=headers)
soup = BeautifulSoup(response.text, features="html.parser")
print(soup.prettify())

The above code requests https://docs.microsoft.com/en-us/exchange/new-features/build-numbers-and-release-dates?view=exchserver-2019 and passes the webpage data to BeautifulSoup, whose prettify() method re-indents the markup for display.

Example of partial output from executing the code:

  Â

Exchange Server 2019 CU12 May22SU

May 10, 2022

15.2.1118.9

15.02.1118.009

Exchange Server 2019 CU12 (2022H1)

April 20, 2022

15.2.1118.7

15.02.1118.007

In the results above, each "tr" node corresponds to one version entry, and its "td" child nodes hold the individual version details.
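As a minimal offline sketch of that "tr"/"td" structure, the snippet below parses a small hand-written table instead of the live page. The sample HTML (including the header texts) and the helper name rows_from_html are assumptions made for illustration; the real page's markup may carry extra attributes or links.

```python
from bs4 import BeautifulSoup

# Hand-written sample that mimics the table layout described above (assumption).
SAMPLE_HTML = """
<table>
<tr><th>Product name</th><th>Release date</th><th>Build (short)</th><th>Build (long)</th></tr>
<tr><td>Exchange Server 2019 CU12 May22SU</td><td>May 10, 2022</td><td>15.2.1118.9</td><td>15.02.1118.009</td></tr>
<tr><td>Exchange Server 2019 CU12 (2022H1)</td><td>April 20, 2022</td><td>15.2.1118.7</td><td>15.02.1118.007</td></tr>
</table>
"""

def rows_from_html(html):
    """Return one list of cell texts per <tr> that contains <td> cells."""
    soup = BeautifulSoup(html, features="html.parser")
    rows = []
    for tr in soup.find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if cells:  # header rows only have <th> cells and are skipped
            rows.append(cells)
    return rows

rows = rows_from_html(SAMPLE_HTML)
for cells in rows:
    print(cells)
```

Collecting the "td" texts per row keeps each version entry together, which makes the later matching steps easier to reason about.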

2. Filter Only the "tr" Nodes

Test code:

from bs4 import BeautifulSoup
import requests
import urllib3
urllib3.disable_warnings()
url = "https://docs.microsoft.com/en-us/exchange/new-features/build-numbers-and-release-dates?view=exchserver-2019"

headers = {
"User-Agent": "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.129 Safari/537.36",
}
response = requests.get(url, verify=False, headers=headers)
soup = BeautifulSoup(response.text, features="html.parser")
for tag in soup.find_all('tr'):
    print(tag)
    print("---")

Example of partial output from executing the code (markup not shown):

Exchange Server 2019 CU12 May22SU  May 10, 2022  15.2.1118.9  15.02.1118.009
---
Exchange Server 2019 CU12 (2022H1)  April 20, 2022  15.2.1118.7  15.02.1118.007
---
Exchange Server 2019 CU11 May22SU  May 10, 2022  15.2.986.26  15.02.0986.026
---

Next, strip the markup and keep only the text of each row.

3. Extract Version Information

Test code:

from bs4 import BeautifulSoup
import requests
import urllib3
urllib3.disable_warnings()
url = "https://docs.microsoft.com/en-us/exchange/new-features/build-numbers-and-release-dates?view=exchserver-2019"

headers = {
"User-Agent": "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.129 Safari/537.36",
}
response = requests.get(url, verify=False, headers=headers)
soup = BeautifulSoup(response.text, features="html.parser")
for tag in soup.find_all('tr'):
    for string in tag.stripped_strings:
        print(string)
    print("---")

Example of partial output from executing the code:

---

Exchange Server 2019 CU12 May22SU
May 10, 2022
15.2.1118.9
15.02.1118.009
---
Exchange Server 2019 CU12 (2022H1)
April 20, 2022
15.2.1118.7
15.02.1118.007
---

Exchange Server 2019 CU11 May22SU
May 10, 2022
15.2.986.26
15.02.0986.026
---
  Â
Exchange Server 2019 CU11 Mar22SU
March 8, 2022
15.2.986.22
15.02.0986.022
---

Next, you can attempt to match the exact version

4. Exact Version Matching

Test code:

from bs4 import BeautifulSoup
import requests
import urllib3
urllib3.disable_warnings()
url = "https://docs.microsoft.com/en-us/exchange/new-features/build-numbers-and-release-dates?view=exchserver-2019"

headers = {
"User-Agent": "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.129 Safari/537.36",
}
response = requests.get(url, verify=False, headers=headers)
soup = BeautifulSoup(response.text, features="html.parser")

version = "15.2.986.26"
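for tag in soup.find_all('tr'):
    # Membership test compares whole strings, so only an exact build number matches
    if version in tag.stripped_strings:
        print("[+] Exchange Information")
        for versiondata in tag.stripped_strings:
            # Skip 5-character strings; this appears to drop the stray filler
            # entries (such as "Â") seen in the earlier output
            if len(versiondata) == 5:
                continue
            print(" " + versiondata)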

Example output from executing the code:

[+] Exchange Information
Exchange Server 2019 CU11 May22SU
May 10, 2022
15.2.986.26
15.02.0986.026
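The exact match can also be wrapped in a small function that returns the row instead of printing it. The following is only a sketch run against hand-written sample rows; the function name find_exact and the sample HTML are assumptions for illustration and are not taken from the published script.

```python
from bs4 import BeautifulSoup

# Hand-written sample rows (assumption), not the live page.
SAMPLE_HTML = """
<table>
<tr><td>Exchange Server 2019 CU11 May22SU</td><td>May 10, 2022</td><td>15.2.986.26</td><td>15.02.0986.026</td></tr>
<tr><td>Exchange Server 2019 CU11 Mar22SU</td><td>March 8, 2022</td><td>15.2.986.22</td><td>15.02.0986.022</td></tr>
</table>
"""

def find_exact(html, version):
    """Return the cell strings of the first row holding exactly `version`, else None."""
    soup = BeautifulSoup(html, features="html.parser")
    for tr in soup.find_all("tr"):
        cells = list(tr.stripped_strings)
        if version in cells:  # whole-string comparison, not a substring test
            return cells
    return None

hit = find_exact(SAMPLE_HTML, "15.2.986.26")
miss = find_exact(SAMPLE_HTML, "15.2.986.2")  # shorter string, so no exact match
print(hit)
print(miss)
```

Because list membership compares complete strings, a truncated build number such as 15.2.986.2 does not match 15.2.986.26 here.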

For older versions of Exchange, the accurate (full) version number cannot be obtained, so a rough version matching function that works on a partial version number also needs to be implemented.

5. Rough Version Matching

Test code:

from bs4 import BeautifulSoup
import requests
import urllib3
urllib3.disable_warnings()
url = "https://docs.microsoft.com/en-us/exchange/new-features/build-numbers-and-release-dates?view=exchserver-2019"

headers = {
"User-Agent": "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.129 Safari/537.36",
}
response = requests.get(url, verify=False, headers=headers)
soup = BeautifulSoup(response.text, features="html.parser")

version = "15.2.986"
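for tag in soup.find_all('tr'):
    # Substring test against the row text, so every build starting with the partial version matches
    if version in tag.text:
        print("[+] Exchange Information")
        for versiondata in tag.stripped_strings:
            # Skip 5-character strings; this appears to drop the stray filler
            # entries (such as "Â") seen in the earlier output
            if len(versiondata) == 5:
                continue
            print(" " + versiondata)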

Example output from executing the code:

[+] Exchange Information
Exchange Server 2019 CU11 May22SU
May 10, 2022
15.2.986.26
15.02.0986.026
[+] Exchange Information
Exchange Server 2019 CU11 Mar22SU
March 8, 2022
15.2.986.22
15.02.0986.022
[+] Exchange Information
Exchange Server 2019 CU11 Jan22SU
January 11, 2022
15.2.986.15
15.02.0986.015
[+] Exchange Information
Exchange Server 2019 CU11 Nov21SU
November 9, 2021
15.2.986.14
15.02.0986.014
[+] Exchange Information
Exchange Server 2019 CU11 Oct21SU
October 12, 2021
15.2.986.9
15.02.0986.009
[+] Exchange Information
Exchange Server 2019 CU11
September 28, 2021
15.2.986.5
15.02.0986.005
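One caveat of a plain substring test is that a partial version such as 15.2.98 would also match 15.2.986.26. A possible refinement, shown here only as a sketch, compares the dotted components instead. The function name matches_build_prefix is hypothetical, and the last build in the sample list is made up to demonstrate the difference.

```python
def matches_build_prefix(build, prefix):
    """True if the dotted components of `prefix` are the leading components of `build`."""
    build_parts = build.split(".")
    prefix_parts = prefix.split(".")
    return build_parts[:len(prefix_parts)] == prefix_parts

# The last entry is a made-up build number used only for the demonstration.
builds = ["15.2.986.26", "15.2.986.22", "15.2.1118.9", "15.2.98.1"]

substring_hits = [b for b in builds if "15.2.98" in b]
component_hits = [b for b in builds if matches_build_prefix(b, "15.2.98")]

print(substring_hits)
print(component_hits)
```

The substring test returns three builds for 15.2.98, while the component comparison returns only the one whose third component is exactly 98.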

6. Extract Webpage Data Timestamp

To accurately obtain version information, it is also necessary to extract the time at which the webpage data was last updated.

Location of the timestamp in the webpage data (a "time" element containing the date):

06/29/2022

Code to locate this timestamp:

print(soup.find_all('time'))

Example output from executing the code (attributes of the element abbreviated as ...):

[<time ...>06/29/2022</time>]

Code to extract the timestamp:

print(soup.find('time').text)

Example output from executing the code:

06/29/2022
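Once the timestamp text is extracted, it can be turned into a date object, for example to report how old the version data is. This is a minimal sketch that assumes the page keeps the MM/DD/YYYY format shown above; the function name parse_page_date and the fixed reference date are illustrative only.

```python
from datetime import date, datetime

def parse_page_date(text):
    """Parse an MM/DD/YYYY string such as '06/29/2022' into a date object."""
    return datetime.strptime(text.strip(), "%m/%d/%Y").date()

page_date = parse_page_date("06/29/2022")

# A fixed reference date keeps the result repeatable; a real script
# would more likely compare against date.today().
age_days = (date(2022, 7, 10) - page_date).days

print(page_date.isoformat())
print(age_days)
```

If the page ever switches to another date format, strptime raises ValueError, which is a useful signal that the parsing logic needs another look.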

Based on the above, we can write new code that identifies Exchange versions by reading the version data from the official website. To support automated identification of multiple targets without requesting the website repeatedly, the code structure has been optimized: it accesses https://docs.microsoft.com/en-us/exchange/new-features/build-numbers-and-release-dates?view=exchserver-2019 only once and saves the webpage result in a variable. The code has been uploaded to GitHub at the following address:

An open-source project
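The fetch-once idea can be illustrated without network access by parsing the page text a single time into a lookup table and then checking several targets against it. This is a sketch under stated assumptions, not the published script: SAMPLE_HTML stands in for the downloaded page, build_index is a hypothetical helper, and the third "td" cell is assumed to hold the short build number.

```python
from bs4 import BeautifulSoup

# Stand-in for the page content that would be downloaded once (assumption).
SAMPLE_HTML = """
<table>
<tr><td>Exchange Server 2019 CU12 May22SU</td><td>May 10, 2022</td><td>15.2.1118.9</td><td>15.02.1118.009</td></tr>
<tr><td>Exchange Server 2019 CU11 May22SU</td><td>May 10, 2022</td><td>15.2.986.26</td><td>15.02.0986.026</td></tr>
</table>
"""

def build_index(html):
    """Parse the page once and map each short build number to its row cells."""
    soup = BeautifulSoup(html, features="html.parser")
    index = {}
    for tr in soup.find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if len(cells) >= 3:
            index[cells[2]] = cells  # assumes the third cell is the short build number
    return index

index = build_index(SAMPLE_HTML)  # parsed once, reused for every target

targets = ["15.2.986.26", "15.2.1118.9", "15.2.0.0"]  # last one is deliberately unknown
results = {version: index.get(version) for version in targets}

for version, row in results.items():
    print(version, "->", row[0] if row else "unknown")
```

Each additional target then costs only a dictionary lookup rather than another request and parse.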

For situations where the internal network cannot access the official website, we also implemented a method that parses a locally saved webpage file to obtain accurate versions. The code has been uploaded to GitHub at the following address:

An open-source project

First visit the official website and save the webpage content as exchange.data, then execute the script Exchange_GetVersion_ParseFromFile.py.
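The local-file approach can be sketched as follows. This is not the content of Exchange_GetVersion_ParseFromFile.py; the helper name load_rows is hypothetical, and the demo writes a tiny stand-in exchange.data into a temporary directory so that it runs without the real saved page.

```python
import os
import tempfile

from bs4 import BeautifulSoup

def load_rows(path):
    """Read a saved copy of the page and return the text cells of each table row."""
    with open(path, "r", encoding="utf-8") as f:
        soup = BeautifulSoup(f.read(), features="html.parser")
    rows = []
    for tr in soup.find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if cells:
            rows.append(cells)
    return rows

# Stand-in file content (assumption); a real exchange.data would be the saved webpage.
sample = (
    "<table><tr><td>Exchange Server 2019 CU11</td><td>September 28, 2021</td>"
    "<td>15.2.986.5</td><td>15.02.0986.005</td></tr></table>"
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "exchange.data")
    with open(path, "w", encoding="utf-8") as f:
        f.write(sample)
    rows = load_rows(path)

print(rows)
```

Since the parsing step only needs the page text, the same matching logic can be reused for both the online and the offline variant.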

0x03 Summary

---

This article introduces an optimized method for Exchange version identification that eliminates the need to manually update the version information list in scanning scripts. The open-source code includes Exchange_GetVersion_ParseFromWebsite.py and Exchange_GetVersion_ParseFromFile.py.