The impact of encryption and AI on open source in DPI: challenges and opportunities

By Roy Chua, AvidThink
Published on: 03.12.2024

Deep packet inspection (DPI) technology has played a crucial role in network visibility, security, and performance management for decades. Evolving from five-tuple and signature matching in early open-source software firewalls (FW) and intrusion detection and prevention systems (IDS, IPS), DPI technology has grown increasingly sophisticated. Many open-source security packages can detect common network protocols accurately, though some previously popular projects are no longer actively maintained.
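To make the starting point concrete, here is a minimal sketch of five-tuple and signature matching, assuming a classifier that sees only a flow's addressing and the first bytes of its payload. The `FiveTuple` type, the signature table, the port table and the `classify` function are all hypothetical names chosen for illustration; they are not taken from any particular open-source project.

```python
from typing import NamedTuple, Optional


class FiveTuple(NamedTuple):
    """The classic flow key: who is talking to whom, over which transport."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str  # "tcp" or "udp"


# Hypothetical signature table: label -> byte prefixes seen at the start of a flow.
SIGNATURES = {
    "http": (b"GET ", b"POST ", b"HEAD "),
    "ssh": (b"SSH-",),
    "tls": (b"\x16\x03",),  # TLS handshake record, major version 3
}

# Hypothetical fallback when no signature matches: well-known destination ports.
WELL_KNOWN_PORTS = {
    ("tcp", 80): "http",
    ("tcp", 22): "ssh",
    ("tcp", 443): "tls",
    ("udp", 53): "dns",
}


def classify(flow: FiveTuple, first_payload: bytes) -> Optional[str]:
    """Prefer a payload signature; fall back to the destination port."""
    for label, prefixes in SIGNATURES.items():
        if first_payload.startswith(prefixes):
            return label
    return WELL_KNOWN_PORTS.get((flow.protocol, flow.dst_port))
```

A cleartext HTTP request on a non-standard port is still recognized by its payload prefix, which is exactly the signal that encryption takes away.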

However, many of today’s applications rely on secure data transmission technologies, such as TLS/SSL, QUIC, DNS-over-HTTPS/TLS and various VPNs. Application developers want to enable end-to-end confidentiality and prevent snooping, injection of malicious payloads and deliberate tampering with messages.

At the same time, network and service providers seek to categorize traffic and application streams to ensure network security and help improve performance and customer experiences. The security and network ecosystem has developed encrypted traffic intelligence (ETI) to augment traditional DPI and aid in the accurate detection and categorization of application traffic – an ongoing and evolving challenge.

As an analyst supportive of the open-source community for many years now (see my perspectives in a previous Linux Foundation-sponsored paper on open source in networking), I read with interest a recent report on the state of open-source DPI from ipoque, a Rohde & Schwarz company, and media site The Fast Mode. The survey was conducted from March to April 2024, and data was provided by 48 vendors in the networking, analytics and cybersecurity space. Here are my observations and thoughts on the results of that survey.

Observation #1 – Encryption will be pervasive

The widespread adoption of encryption protocols such as TLS/SSL and QUIC is transforming the way we approach network security. In fact, most popular websites today no longer serve content over unencrypted HTTP and instead redirect visitors to HTTPS. With the increased use of Encrypted Server Name Indication (ESNI) and Encrypted Client Hello (ECH), applications push the boundaries of privacy and security even further. VPN clients and home security gateways that turn on DNS-over-HTTPS/TLS to protect against third parties snooping on DNS queries help improve end users’ privacy (after all, should your carrier know which sites you’re visiting?). As encryption becomes ubiquitous, DPI solutions must adapt to maintain effective network visibility and security.
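To illustrate what is at stake, the sketch below reads the cleartext Server Name Indication (SNI) from a TLS ClientHello and checks whether an ECH extension is present. It is a simplified illustration, not a production parser: it assumes the whole ClientHello arrives in a single TLS record, the function name `parse_client_hello` is hypothetical, and `0xFE0D` is the extension code point used by the ECH specification drafts. When ECH is in use, the real hostname travels inside the encrypted extension, so any cleartext SNI that remains typically names only the client-facing server.

```python
import struct
from typing import Optional, Tuple

SNI_EXTENSION = 0x0000
ECH_EXTENSION = 0xFE0D  # encrypted_client_hello code point (assumed from the ECH drafts)


def parse_client_hello(record: bytes) -> Tuple[Optional[str], bool]:
    """Return (cleartext SNI or None, whether an ECH extension is present)."""
    try:
        # Record header: type(1) version(2) length(2); 0x16 means handshake.
        # Handshake header: msg_type(1) length(3); 0x01 means ClientHello.
        if record[0] != 0x16 or record[5] != 0x01:
            return None, False
        pos = 9                    # skip record header (5) and handshake header (4)
        pos += 2 + 32              # client_version and random
        pos += 1 + record[pos]     # session_id
        (suites_len,) = struct.unpack_from("!H", record, pos)
        pos += 2 + suites_len      # cipher_suites
        pos += 1 + record[pos]     # compression_methods
        (ext_total,) = struct.unpack_from("!H", record, pos)
        pos += 2
        end = min(pos + ext_total, len(record))

        sni, ech = None, False
        while pos + 4 <= end:
            ext_type, ext_len = struct.unpack_from("!HH", record, pos)
            body = record[pos + 4 : pos + 4 + ext_len]
            pos += 4 + ext_len
            # SNI body: list_len(2) name_type(1) name_len(2) name; type 0 = host_name.
            if ext_type == SNI_EXTENSION and len(body) >= 5 and body[2] == 0:
                (name_len,) = struct.unpack_from("!H", body, 3)
                sni = body[5 : 5 + name_len].decode("ascii", errors="replace")
            elif ext_type == ECH_EXTENSION:
                ech = True
        return sni, ech
    except (IndexError, struct.error):
        # Truncated or malformed input: report nothing rather than guess.
        return None, False
```

A classifier built on this signal works only while the hostname is visible; once a flow reports ECH, the hostname it sees no longer identifies the application.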

Observation #2 – Open-source DPI is good but has limits

While open-source DPI solutions have been effective in detecting common encryption protocols, they struggle with newer, more sophisticated methods of traffic protection and concealment, such as ESNI and ECH. The survey by Rohde & Schwarz and The Fast Mode found that for common encryption methods, confidence in open-source DPI remains relatively high:

• 73.3% of respondents believe open-source DPI can effectively handle TLS/SSL traffic
• 72.7% express confidence in its ability to manage QUIC and other encrypted Layer 7 protocols such as SRTP, SSH and DTLS
• 71.1% think it can handle VPN protocols adequately

However, when it comes to newer, more challenging encryption protocols, confidence levels drop:

• 56.8% of respondents believe open-source DPI can manage ESNI and ECH
• 55.8% express confidence in its ability to handle DNS over HTTPS/TLS

These findings highlight a growing gap between the capabilities of current open-source DPI solutions and the evolving encryption landscape.

Observation #3 – Obfuscation and anonymization bring added challenges

While encryption is a primary concern, the increasing use of obfuscation and anonymization techniques presents additional challenges for DPI solutions. The survey results show:

• 50.0% of respondents express confidence in open-source DPI's ability to handle anonymized traffic
• 40.5% believe it can effectively manage obfuscated traffic

Given that the respondents are arguably experts on DPI tasked with building network traffic and application visibility solutions at their companies, one has to conclude that open-source DPI has significant blind spots when it comes to these traffic concealment techniques. This limitation could leave networks exposed to security threats that deliberately evade detection, and to performance issues that stem from traffic the network cannot classify.

Observation #4 – Increasing reliance on AI changes ecosystem dynamics

The integration of artificial intelligence (AI) and machine learning (ML) techniques is revolutionizing the field of DPI. As traditional DPI techniques struggle with encrypted, obfuscated and anonymized traffic, AI has emerged as a solution. We know of many vendors – both DPI/ETI technology providers and those further up the stack, like network observability/analytics and cybersecurity vendors – who are training their own ML and deep learning (DL) models. By leveraging AI-driven approaches, DPI solutions can improve their accuracy in application and traffic classification, thus elevating their effectiveness in detecting and preventing security threats.

However, an open-source ecosystem looking to build openly shareable ML models faces more challenges than just recruiting open-source developers to its cause. Difficulties include:

• Lack of well-established workflows for contributing training data or models to the DPI ecosystem: Well-established code contribution workflows exist, whereas training open-weights (or fully open-source) models and making them available to networking developers isn’t common yet (though sites such as Hugging Face, which features prominently in the generative AI movement, could be used).
• Privacy and compliance issues around data gathering: Securing training data (using real-world traffic or potentially synthetic data) for application categorization, and keeping it up to date, may be easier for commercial entities than for a collection of open-source developers. A commercial enterprise can easily fund efforts, schedule ongoing model retraining or fine-tuning, and have the appropriate legal relationships and governance frameworks to collect and use such data for training. Obtaining end-user permissions, ensuring data privacy and staying in compliance with regulations may be hard to wrangle for a collection of open-source developers – especially if there isn’t a single legal entity that holds responsibility.

Unsurprisingly, the survey by Rohde & Schwarz/The Fast Mode shows that only 31.0% of respondents report significant or moderate use of AI/ML/DL techniques in their open-source DPI implementations. This low adoption rate of AI in open-source solutions contrasts with the trend in commercial DPI products and the products built on top of these DPI/ETI engines. Many commercial vendors are leveraging advanced ML and DL techniques to restore visibility into encrypted traffic flows. AI-driven approaches, coupled with regularly updated DNS-to-IP maps, aim to identify patterns and characteristics in encrypted traffic without compromising the encryption itself, offering a path forward for maintaining network visibility in an increasingly encrypted world.
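As a rough illustration of what "patterns and characteristics" can mean in practice, the sketch below derives a few statistical features from packet sizes and timing alone, and looks up a destination address in a DNS-to-IP map. It never touches the payload. This is an assumption-laden example rather than any vendor's method: the `Packet` shape, the feature names, and the `flow_features` and `label_from_dns` functions are hypothetical, and a real system would feed such features into a trained ML/DL model.

```python
from statistics import mean, pstdev
from typing import Dict, List, Optional, Tuple

# Hypothetical packet record: (timestamp in seconds, payload length in bytes).
Packet = Tuple[float, int]


def flow_features(packets: List[Packet]) -> Dict[str, float]:
    """Summarize a flow using only metadata that stays visible under encryption."""
    if not packets:
        raise ValueError("a flow needs at least one packet")
    sizes = [size for _, size in packets]
    times = [ts for ts, _ in packets]
    # Inter-arrival gaps between consecutive packets; a single packet has none.
    gaps = [later - earlier for earlier, later in zip(times, times[1:])] or [0.0]
    return {
        "packet_count": float(len(packets)),
        "mean_size": float(mean(sizes)),
        "size_stdev": float(pstdev(sizes)),
        "mean_gap": float(mean(gaps)),
        "duration": float(times[-1] - times[0]),
    }


def label_from_dns(dst_ip: str, dns_map: Dict[str, str]) -> Optional[str]:
    """Return the hostname last resolved to this IP, if the map has one.

    dns_map is assumed to be refreshed regularly from observed or curated
    DNS answers, since IP-to-service mappings go stale quickly.
    """
    return dns_map.get(dst_ip)
```

The two signals are complementary: the DNS-derived label gives a candidate service name, while the flow features help distinguish, for example, a video stream from a bulk download on the same service.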

Observation #5 – Open-source DPI needs augmentation

Even though open source has powered a lot of network security and visibility in the past decades, the rise of encryption and deliberate obfuscation, coupled with the use of AI/ML approaches, may push networking and cybersecurity vendors to augment or possibly replace their DPI/ETI libraries with commercial solutions.

The survey by Rohde & Schwarz/The Fast Mode shows that 36.2% of respondents are in the process of migrating from open-source to commercial DPI solutions. This shift suggests that many vendors are realizing limitations in open-source options. In addition to the difficulties in handling encrypted data, DPI/ETI performance is also an issue with open-source libraries. The factors named by vendors who are considering commercial solutions include:

• 62.2% cite higher traffic volumes
• 57.8% need a comprehensive and frequently updated signature library
• 57.8% point to increasingly complex traffic types
• 53.3% note the challenges from traffic encryption and obfuscation

Some DPI vendors, like Rohde & Schwarz (the sponsor of the survey and of this blog post), have taken steps to lower the barrier to adopting their solutions by providing migration tools to help with this process.

Conclusion

The evolving encryption landscape presents significant challenges for network visibility tools, particularly for open-source DPI solutions. While these solutions continue to handle common encryption protocols well, they struggle with newer, more sophisticated methods of traffic protection and concealment.

While open-source DPI solutions offer benefits in terms of cost and customization, the rapidly changing encryption landscape may require networking vendors to evaluate commercial DPI solutions to maintain effective network visibility, security and performance management. Even open-source proponents, like us at AvidThink, recognize this new reality.

This post is sponsored by ipoque, a Rohde & Schwarz company. If you would like to learn more about the state of open-source DPI and its challenges, opportunities and alternatives, check out this report.

Roy, an entrepreneurial executive with 20+ years of IT experience, is the founder of AvidThink, an independent analyst firm covering infrastructure technologies at both carriers and enterprises. AvidThink's clients include Fortune 500 technology firms, early-stage startups, and upstart unicorns. Roy has been quoted and featured in major publications including WSJ, FierceTelecom/Wireless, The New Stack and Light Reading. Roy is a graduate of MIT Sloan (MBA) and UC Berkeley (BS, MS EECS).
