Understanding Microsoft Defender Threat Intelligence (Defender TI)

February 15, 2023

Threat intelligence is evidence-based information about malicious activity that targets a company’s networks, devices, applications, and data. The concept is easy to grasp, but gathering and analyzing the necessary data is much more difficult. It can be overwhelming to consider all the risks that can damage or destroy a company’s information technology. Threat intelligence gathered in the context of an attack may reveal your vulnerabilities, the identity of your attackers, their motives, their capabilities, the potential damage they could do to your information assets, and potential indicators of compromise (IOCs).

Previously, to collect information about a recent attack, we had to consult many websites, because no single OSINT source provides every type of IOC. Cyber threat intelligence analysts struggle to strike a balance between consuming a wide range of threat information and determining which of it poses the greatest threat to their business or industry. Instead of concentrating on what helps their company protect itself (gaining insights about the actors through analysis and correlation), analysts spend a lot of time on data discovery, collection, and parsing. Microsoft aims to solve this with a threat intelligence platform where all of these details are available from a single site.

Microsoft Defender Threat Intelligence (Defender TI):

Defender TI is a platform that streamlines triage, incident response, threat hunting, vulnerability management, and cyber threat intelligence analyst workflows when conducting threat infrastructure analysis and gathering threat intelligence.

Advantage:

With Defender TI, Microsoft hopes to completely reimagine the analyst workflow. The platform gathers and enriches crucial data sources, displays the data in an innovative user interface, correlates indicators with related articles and vulnerabilities, chains together the infrastructure behind indicators of compromise (IOCs), and allows users to collaborate on investigations with other Defender TI licensed users within their tenant.
Because security organizations act on an ever-increasing volume of intelligence and alerts within their environments, a threat analysis and intelligence platform that enables precise and rapid assessment of alerts is important.

How does infrastructure chaining work?

To expand an investigation, infrastructure chaining makes use of the connections between highly connected datasets. This method, which forms the basis of threat infrastructure analysis, enables companies to discover new connections, classify related attack activities, and substantiate assumptions during incident response.

Attack campaigns use a variety of obfuscation techniques, from straightforward geo-filtering to complex strategies like passive OS fingerprinting.
These techniques can bring a point-in-time investigation to a halt. The concept of infrastructure chaining is illustrated in the screenshot above.
Using our data enrichment capabilities, we may start with a piece of malware that tries to connect to an IP address (possibly a C2 server).
Perhaps that IP address hosted an SSL certificate with a common name, such as a domain name.
That domain might be linked to a website with a unique tracker embedded in its code, such as a NewRelicID or another analytics ID we might have seen previously.
Or the domain may have historically been connected to other infrastructure that sheds light on our investigation.
The key insight is that, while one data item taken out of context may not be particularly helpful, we may begin to piece together a narrative when we notice the natural connection to all this other technical data.
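
The chaining idea described above can be sketched as a breadth-first walk over linked artifacts. This is only an illustration, not how Defender TI is implemented: the LINKS data, the artifact labels, and the chain function are all hypothetical.

```python
from collections import deque

# Hypothetical enrichment data: each artifact maps to artifacts observed with it.
LINKS = {
    "malware:sample.exe": ["ip:203.0.113.10"],
    "ip:203.0.113.10": ["cert:common-name=shop.example"],
    "cert:common-name=shop.example": ["domain:shop.example"],
    "domain:shop.example": ["tracker:NewRelicID=12345", "ip:198.51.100.7"],
    "tracker:NewRelicID=12345": ["domain:login-shop.example"],
}


def chain(start, links, max_hops=5):
    """Breadth-first walk from start; returns {artifact: path from start}."""
    paths = {start: [start]}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        # A path of n artifacts represents n - 1 hops.
        if len(paths[node]) - 1 >= max_hops:
            continue
        for nxt in links.get(node, []):
            if nxt not in paths:
                paths[nxt] = paths[node] + [nxt]
                queue.append(nxt)
    return paths


if __name__ == "__main__":
    for artifact, path in chain("malware:sample.exe", LINKS).items():
        print(" -> ".join(path))
```

Each printed path is the narrative that connects the starting malware sample to a newly discovered artifact, which is what makes an isolated data point useful.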

How does Microsoft collect all the data in one portal?

Microsoft makes it simpler for its community and customers to do infrastructure research by consolidating many data sets into a single platform, Microsoft Defender Threat Intelligence (Defender TI).
Microsoft gathers, processes, and indexes internet data to help users prioritize incidents, identify adversary infrastructure linked to actor groups targeting their organization, and detect and respond to threats.
Microsoft gathers information from the internet via its PDNS sensor network, a global proxy network of virtual users, and port scanning, and it relies on other sources for malware and additional Domain Name System (DNS) data.
This internet data is categorized into two distinct groups: traditional and advanced.

Data sets:

Let's see what data can be fetched from the Threat Intelligence console:

1-Resolutions:

Passive DNS (PDNS) is a system of record that stores DNS resolution data for a given location, record, and timeframe. Users are able to see which domains resolved to an IP address and vice versa using this historical resolution data set. With this data collection, time-based correlation based on IP or domain overlap is possible. The identification of previously undiscovered or recently established threat actor infrastructure may be made possible using PDNS.

The PDNS resolution data includes the following:

Resolve: the name of the resolving entity (either an IP Address or Domain)
Location: the location the IP address is hosted in.
Network: the netblock or subnet associated with the IP address.
ASN: the autonomous system number and organization name
First Seen: a timestamp that displays the date that we first observed this resolution.
Last Seen: a timestamp that displays the date that we last observed this resolution.
Source: the source that enabled the detection of the relationship.
Tags: any tags applied to this artifact in the Defender TI system.

The picture above helps us answer the following questions:

For Domains & IP:

When was the domain first observed resolving to an IP address by Defender TI?
When was the last time it was seen actively resolving to an IP address by Defender TI?
What IP address(es) does it currently resolve to?
Is the IP address routable?
What subnet is it part of?
Is there an owner associated with the subnet?
What AS is it part of?
What geolocation is there?
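
The time-based correlation mentioned above can be sketched as an overlap check on First Seen and Last Seen windows. The rows and the co_resident helper below are hypothetical sample data and names, not Defender TI output or API calls.

```python
from datetime import date

# Hypothetical PDNS rows: (domain, ip, first_seen, last_seen)
RESOLUTIONS = [
    ("bad.example", "203.0.113.10", date(2023, 1, 1), date(2023, 1, 20)),
    ("also-bad.example", "203.0.113.10", date(2023, 1, 15), date(2023, 2, 1)),
    ("old-tenant.example", "203.0.113.10", date(2021, 5, 1), date(2021, 6, 1)),
    ("elsewhere.example", "198.51.100.7", date(2023, 1, 10), date(2023, 1, 12)),
]


def co_resident(seed, rows):
    """Domains that resolved to the same IP as seed during an overlapping window."""
    hits = set()
    for seed_domain, ip, s_first, s_last in rows:
        if seed_domain != seed:
            continue
        for domain, other_ip, first, last in rows:
            # Two date ranges overlap when each one starts before the other ends.
            if domain != seed and other_ip == ip and first <= s_last and s_first <= last:
                hits.add(domain)
    return sorted(hits)


if __name__ == "__main__":
    print(co_resident("bad.example", RESOLUTIONS))
```

Filtering on the time window matters: old-tenant.example used the same IP address, but two years earlier, so it is probably a previous tenant rather than related infrastructure.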

2-Whois:

Whois is a protocol that lets anyone query information about a domain, IP address, or subnet. One of the most common functions for Whois in threat infrastructure research is to identify or connect disparate entities based on unique data shared within Whois records.

Our Whois data includes the following:

Record Updated: a timestamp that indicates the day a Whois record was last updated.
Last Scanned: the date that the Defender TI system last scanned the record.
Expiration: the expiration date of the registration, if available.
Created: the age of the current Whois record.
Whois Server: a server set up by an ICANN-accredited registrar to provide up-to-date information about the domains registered with it.
Registrar: the registrar service used to register the artifact.
Domain Status: the current status of the domain. An “active” domain is live on the internet.
Email: any email addresses found in the Whois record, and the type of contact each one is associated with (e.g. admin, tech).
Name: the name of any contacts within the record, and the type of contact each is associated with.
Organization: the name of any organizations within the record, and the type of contact each is associated with.
Street: any street addresses associated with the record, and the type of contact each is associated with.
City: any city listed in an address associated with the record, and the type of contact it is associated with.
State: any states listed in an address associated with the record, and the type of contact each is associated with.
Postal Code: any postal codes listed in an address associated with the record, and the type of contact each is associated with.
Country: any countries listed in an address associated with the record, and the type of contact each is associated with.
Phone: any phone numbers listed in the record, and the type of contact each is associated with.
Name Servers: any name servers associated with the registered entity.
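
Connecting disparate entities through shared Whois values can be sketched as a simple grouping. The records below are invented and heavily simplified, and group_by_field is a hypothetical helper rather than part of any Defender TI interface.

```python
from collections import defaultdict

# Hypothetical, simplified Whois records.
WHOIS = [
    {"domain": "shop-login.example", "email": "Reg1@mail.example", "registrar": "Registrar A"},
    {"domain": "shop-verify.example", "email": "reg1@mail.example", "registrar": "Registrar B"},
    {"domain": "unrelated.example", "email": "admin@other.example", "registrar": "Registrar A"},
]


def group_by_field(records, field):
    """Return values of field that are shared by more than one domain."""
    groups = defaultdict(set)
    for rec in records:
        value = rec.get(field)
        if value:
            # Normalize case and whitespace so trivially different values match.
            groups[value.strip().lower()].add(rec["domain"])
    return {value: sorted(domains) for value, domains in groups.items() if len(domains) > 1}


if __name__ == "__main__":
    print(group_by_field(WHOIS, "email"))
    print(group_by_field(WHOIS, "registrar"))
```

A shared contact email is usually a stronger link than a shared registrar, since a large registrar serves many unrelated customers.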

3-Certificates:

SSL certificates are files that digitally bind a cryptographic key to a set of user-provided details. Defender TI gathers SSL certificate associations from IP addresses on various ports by using internet-scanning methods. These certificates are stored in a local database, which lets us trace the history of an individual SSL certificate on the internet.

Certificate data includes the following:

Sha1: The SHA1 algorithm hash for an SSL Cert asset.
First Seen: a timestamp that displays the date that we first observed this certificate on an artifact.
Last Seen: a timestamp that displays the date that we last observed this certificate on an artifact.
Infrastructure: any related infrastructure associated with the certificate.

When a user expands a SHA1 hash, the following details are shown: Serial Number, Issued, Expires, Subject Common Name, Issuer Common Name, Subject Alternative Name(s), Issuer Alternative Name(s), Subject Organization Name, Issuer Organization Name, SSL Version, Subject Organization Unit, Issuer Organization Unit, Subject Street Address, Issuer Street Address, Subject Locality, Issuer Locality, Subject State/Province, Issuer State/Province, Subject Country, Issuer Country, Related Infrastructure.
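
A certificate's SHA1 value is conventionally the SHA-1 digest of its DER-encoded bytes. The sketch below shows that computation with the Python standard library; we assume, without confirmation from Microsoft, that the Sha1 field follows the same convention. The function names are our own.

```python
import hashlib
import ssl


def sha1_fingerprint(der_bytes: bytes) -> str:
    """SHA-1 digest (hex) of a DER-encoded certificate."""
    return hashlib.sha1(der_bytes).hexdigest()


def sha1_from_pem(pem_text: str) -> str:
    """Same fingerprint, starting from a PEM-encoded certificate string."""
    return sha1_fingerprint(ssl.PEM_cert_to_DER_cert(pem_text))


def sha1_from_host(host: str, port: int = 443) -> str:
    """Fetch a server's certificate over the network and fingerprint it."""
    return sha1_from_pem(ssl.get_server_certificate((host, port)))
```

Because the fingerprint depends only on the certificate bytes, the same value appears wherever the certificate is served, which is what makes it a useful pivot between IP addresses.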

4-Trackers:

Trackers are unique codes or values found within web pages and often used to track user interaction. These codes can be used to correlate a disparate group of websites to a central entity. Often, actors will copy the source code of a victim’s website they are looking to impersonate for a phishing campaign. Actors seldom take the time to remove these IDs, which allows users to identify the fraudulent sites using Microsoft’s Trackers data set.

The tracker data includes the following:

Hostname: the hostname that hosts the infrastructure where the tracker was detected.
First Seen: a timestamp that displays the date that we first observed this tracker on the artifact.
Last Seen: a timestamp that displays the date that we last observed this tracker on the artifact.
Type: the type of tracker that was detected (e.g. GoogleAnalyticsID, JarmHash).
Value: the identification value for the tracker.
Tags: any tags applied to this artifact in the Defender TI system.
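
The sketch below shows how a copied tracker ID can tie a phishing page back to the site it imitates. The page snippets are invented, the regular expression covers only the classic "UA-" Google Analytics ID format, and hosts_by_tracker is a hypothetical helper.

```python
import re
from collections import defaultdict

# Classic Google Analytics property IDs look like UA-1234567-1.
GA_ID = re.compile(r"\bUA-\d{4,10}-\d{1,4}\b")

# Hypothetical crawled pages: hostname -> HTML source.
PAGES = {
    "bank.example": "<script>ga('create', 'UA-1234567-1', 'auto');</script>",
    "bank-secure-login.example": "<script>ga('create', 'UA-1234567-1', 'auto');</script>",
    "unrelated.example": "<script>ga('create', 'UA-7654321-2', 'auto');</script>",
}


def hosts_by_tracker(pages):
    """Index hostnames by every tracker ID found in their page source."""
    index = defaultdict(set)
    for host, html in pages.items():
        for tracker_id in GA_ID.findall(html):
            index[tracker_id].add(host)
    return {tracker_id: sorted(hosts) for tracker_id, hosts in index.items()}


if __name__ == "__main__":
    for tracker_id, hosts in hosts_by_tracker(PAGES).items():
        print(tracker_id, hosts)
```

A tracker ID that appears on both the legitimate site and an unfamiliar host is a strong hint that the second page was cloned from the first.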

5-Components:

Web components are details describing a web page or server infrastructure gleaned from Microsoft performing a web crawl or scan. These components allow a user to understand the makeup of a webpage or the technology and services driving a specific piece of infrastructure.

The component data includes the following:

Hostname: the hostname that hosts the infrastructure where the component was detected.
First Seen: a timestamp of the date that we first observed this component on the artifact.
Last Seen: a timestamp of the date that we last observed this component on the artifact.
Category: the type of component that was detected (e.g. Operating System, Framework, Remote Access, Server).
Name + Version: the component name and the version running on the artifact (e.g. Microsoft IIS (v8.5)).
Tags: any tags applied to this artifact in the Defender TI system.

6-Host pairs:

Host pairs are two pieces of infrastructure (a parent and a child) that share a connection observed from a virtual user’s web crawl. The connection could range from a top-level redirect (HTTP 302) to something more complex like an iframe or script source reference.

The host pair data includes the following:

Parent Hostname: the host that is referencing an asset or “reaching out” to the child host.
Child Hostname: the host that is being called on by the parent host.
First Seen: a timestamp of the date that we first observed a relationship with the host.
Last Seen: a timestamp of the date that we last observed a relationship with the host.
Cause: the type of connection between the parent and child hostname. Potential causes include script.src, link.href, redirect, img.src, unknown, xmlhttprequest, a.href, finalRedirect, css.import, or parentPage connections.
Tags: any tags applied to this artifact in the Defender TI system.
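
To make the parent/child relationship concrete, the sketch below derives host pairs from a page's HTML for four of the causes listed above. It is an illustration only: a real crawler such as the virtual users described here would also follow redirects and script-initiated requests, and the class and function names are our own.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Maps (tag, attribute) to the matching cause label.
CAUSES = {
    ("script", "src"): "script.src",
    ("link", "href"): "link.href",
    ("img", "src"): "img.src",
    ("a", "href"): "a.href",
}


class HostPairParser(HTMLParser):
    """Collects (parent, child, cause) tuples for references to other hosts."""

    def __init__(self, parent_host):
        super().__init__()
        self.parent_host = parent_host
        self.pairs = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            cause = CAUSES.get((tag, name))
            if cause is None or not value:
                continue
            child = urlparse(value).hostname  # None for relative URLs
            if child and child != self.parent_host:
                self.pairs.add((self.parent_host, child, cause))


def host_pairs(parent_host, html):
    parser = HostPairParser(parent_host)
    parser.feed(html)
    return sorted(parser.pairs)


if __name__ == "__main__":
    page = '<script src="https://cdn.example/lib.js"></script><a href="/local">x</a>'
    print(host_pairs("site.example", page))
```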

7-Cookies:

Cookies are small pieces of data sent from a server to a client as the user browses the internet. These values sometimes contain a state for the application or little bits of tracking data. Defender TI highlights and indexes cookie names observed when crawling a website and allows users to dig into everywhere we have observed specific cookie names across its crawling and data collection. Cookies are also used by malicious actors to keep track of infected victims or store data to be used later.

The cookie data includes the following:

Hostname: the host infrastructure that is associated with the cookie.
First Seen: a timestamp of the date that we first observed this cookie on the artifact.
Last Seen: a timestamp of the date that we last observed this cookie on the artifact.
Name: the name of the cookie (e.g. JSESSIONID, SEARCH_NAMESITE).
Domain: the domain associated with the cookie.
Tags: any tags applied to this artifact in the Defender TI system.
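
Indexing cookie names across hosts can be sketched with the standard library's cookie parser. The Set-Cookie header values below are invented, and hosts_by_cookie_name is a hypothetical helper that only illustrates the idea of pivoting on a cookie name.

```python
from collections import defaultdict
from http.cookies import SimpleCookie

# Hypothetical crawl results: hostname -> Set-Cookie header values.
OBSERVED = {
    "shop.example": ["JSESSIONID=abc123; Path=/", "SEARCH_NAMESITE=1; Path=/"],
    "shop-checkout.example": ["SEARCH_NAMESITE=7; Path=/"],
}


def hosts_by_cookie_name(observed):
    """Index hostnames by the names of the cookies they set."""
    index = defaultdict(set)
    for host, headers in observed.items():
        for header in headers:
            jar = SimpleCookie()
            jar.load(header)  # parses name=value plus attributes such as Path
            for name in jar:
                index[name].add(host)
    return {name: sorted(hosts) for name, hosts in index.items()}


if __name__ == "__main__":
    print(hosts_by_cookie_name(OBSERVED))
```

Generic names such as JSESSIONID appear on countless sites, so an unusual, application-specific cookie name is the more useful pivot.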

8-DNS:

Microsoft has been collecting DNS records over the years, providing users insight into mail exchange (MX) records, nameserver (NS) records, text (TXT) records, start of authority (SOA) records, canonical name (CNAME) records, and pointer (PTR) records.

Our DNS data includes the following:

Value: the DNS record associated with the host.
First Seen: a timestamp that displays the date that we first observed this record on the artifact.
Last Seen: a timestamp that displays the date that we last observed this record on the artifact.
Type: the type of infrastructure associated with the record. Potential options include mail servers (MX), text records (TXT), name servers (NS), CNAMEs, and Start of Authority (SOA) records.
Tags: any tags applied to this artifact in the Defender TI system.

9-Reverse DNS:

While a forward DNS lookup queries the IP address of a given hostname, a reverse DNS lookup queries the hostname of a given IP address. This data set shows results similar to those of the DNS data set.

The Reverse DNS data includes the following:

Value: the value of the Reverse DNS record.
First Seen: a timestamp of the date that we first observed this record on the artifact.
Last Seen: a timestamp of the date that we last observed this record on the artifact.
Type: the type of infrastructure associated with the record. Potential options include mail servers (MX), text records (TXT), name servers (NS), CNAMEs, and Start of Authority (SOA) records.
Tags: any tags applied to this artifact in the Defender TI system.
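
A reverse lookup works by querying a PTR record under a special name derived from the IP address. The sketch below builds that name with Python's ipaddress module and shows an optional live lookup; ptr_name and reverse_lookup are our own names, and the live lookup needs network access.

```python
import ipaddress
import socket


def ptr_name(ip: str) -> str:
    """DNS name that holds the PTR record for an IPv4 or IPv6 address."""
    return ipaddress.ip_address(ip).reverse_pointer


def reverse_lookup(ip: str) -> str:
    """Ask the system resolver for the hostname of an IP (requires network)."""
    hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
    return hostname


if __name__ == "__main__":
    print(ptr_name("192.0.2.10"))
```

For IPv4 the octets are reversed and placed under in-addr.arpa; IPv6 addresses are expanded to individual hex digits under ip6.arpa.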

10-Intelligence:

The Intelligence section lists all the articles and open projects related to the IOC, collected from various sources.

How does Defender TI pick the suspicious listings?

Reputation scores are determined by a series of factors, including:

Known associations to blacklisted entities
A series of machine learning rules used to assess risk.

Why do we need a reputation score?

For any Host, Domain, or IP Address, Microsoft Defender Threat Intelligence offers unique reputation rankings.
This score helps users quickly understand any observed connections to malicious or suspicious infrastructure, whether they are validating the reputation of a known or an unknown entity. The platform offers easy access to information on these entities’ behavior (such as First and Last Seen timestamps, ASNs, and associated infrastructure) as well as a list of rules that, when relevant, have an impact on reputation scores.
Reputation info is essential when determining how reliable your own attack surface is. It is also helpful for evaluating unknown servers, domains, or IP addresses that pop up during investigations.
Any previous malicious or suspicious activity that had an impact on the entity will be disclosed by these scores, along with any other known indicators of compromise that need to be considered.

Reputation score:

Score	Category	Description
75+	Malicious	The entity has confirmed associations to known malicious infrastructure that appears on our blocklist and matches machine learning rules that detect suspicious activity.
50 – 74	Suspicious	The entity is likely associated to suspicious infrastructure based on matches to three or more machine learning rules.
25 – 49	Neutral	The entity matches at least two machine learning rules.
0 – 24	Unknown (Green)	If the score is “Unknown” and green, the entity has returned at least one matched rule.
0 – 24	Unknown (Grey)	If the score is “Unknown” and grey, the entity has not returned any rule matches.
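
The table above can be expressed as a small lookup. This is a sketch of the published score bands only, not Microsoft's scoring logic; the function name and the matched_rules parameter are our own.

```python
def reputation_category(score: int, matched_rules: int = 0) -> str:
    """Map a reputation score to the category bands in the table above."""
    if score < 0:
        raise ValueError("score must not be negative")
    if score >= 75:
        return "Malicious"
    if score >= 50:
        return "Suspicious"
    if score >= 25:
        return "Neutral"
    # Scores of 0 to 24 are "Unknown"; the color depends on rule matches.
    return "Unknown (Green)" if matched_rules > 0 else "Unknown (Grey)"


if __name__ == "__main__":
    for score, rules in [(90, 4), (60, 3), (30, 2), (10, 1), (0, 0)]:
        print(score, reputation_category(score, rules))
```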

Searching for and pivoting on threat intelligence:

The search box accepts a wide range of inputs; users can look up specific artifacts as well as article or project names.

Artifacts:

IP address: Search “47.243.233[.]244” in the Threat Intelligence Search bar. This action results in an IP Address search.

Domain: Search “dohtest.site” in the Threat Intelligence Search bar. This action results in a Domain search.

Keyword: Search “Medusa Malware” in the Threat Intelligence Search bar. This action results in a Keyword search. Keyword searches cover any type of keyword, which may include a term, email address, etc. Keyword searches result in associations with articles, projects, as well as data sets.
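
The three search types above can be told apart with a few simple checks. The sketch below is our own guess at such a classification, including the handling of defanged input like “[.]”; it does not describe how the Defender TI search bar actually works.

```python
import ipaddress
import re

# Simplified hostname pattern: dot-separated labels ending in an alphabetic TLD.
DOMAIN = re.compile(
    r"^(?=.{1,253}$)([a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,63}$",
    re.IGNORECASE,
)


def refang(text: str) -> str:
    """Turn a defanged indicator such as 47.243.233[.]244 back into plain form."""
    return text.strip().replace("[.]", ".")


def classify(query: str) -> str:
    value = refang(query)
    try:
        ipaddress.ip_address(value)
        return "IP Address"
    except ValueError:
        pass
    if DOMAIN.match(value):
        return "Domain"
    return "Keyword"


if __name__ == "__main__":
    for query in ["47.243.233[.]244", "dohtest.site", "Medusa Malware"]:
        print(query, "->", classify(query))
```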

We can also perform other types of searches using the options provided in the drop-down box.

Conclusion:

Microsoft’s main goal is to make as much information about internet infrastructure available as possible to support a variety of security use cases. As threat data becomes more widely available, analysts need additional tools, training, and effort to understand the data sets and the associated threats. By offering a single view into many data sources, Microsoft Defender Threat Intelligence (Defender TI) unifies these efforts.