Exploring Middleware superpowers with Surfly’s Proxy Technology

Posted by: Maxim Tsoy
30 June 2021
Update: To see Surfly middleware in action, check out our new service – WebToppings. You can read the announcement blog here.

Intro

Over the last few years, Surfly has become the leader of the co-browsing market. Among other things, we attribute our success to unique features such as element masking, audit logs, and granular HTTP filtering (“blacklist”). All these capabilities require deep introspection of the application being co-browsed, and that is what truly makes Surfly stand out from the crowd.

From the very first prototype in 2009, we’ve known this core technology has great potential, far beyond co-browsing.

Humble superpower

At the core of our co-browsing solution lies the content-rewriting proxy. It is a full-featured HTTP and WebSocket proxy that is able to handle traffic from any modern web application. It parses and modifies the traffic on the fly in such a way that all subsequent traffic is routed through the proxy. This means, among other things, that all the links on the served pages are rewritten, so they point to proxied versions of the target URLs.
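As a rough illustration of the link rewriting described above, here is a minimal sketch. The proxy origin `https://proxy.example`, the URL scheme, and the helper names `toProxiedUrl` and `rewriteLinks` are assumptions made for this example; they do not reflect Surfly’s actual implementation.

```javascript
// Hypothetical proxy origin and URL scheme, chosen only for illustration.
const PROXY_ORIGIN = 'https://proxy.example';

// Map an absolute or relative URL to its proxied counterpart.
function toProxiedUrl(rawUrl, pageUrl) {
  let target;
  try {
    // Resolve relative links against the page they appeared on.
    target = new URL(rawUrl, pageUrl);
  } catch (err) {
    return rawUrl; // not a parseable URL, leave it alone
  }
  if (target.protocol !== 'http:' && target.protocol !== 'https:') {
    return rawUrl; // leave mailto:, data: and similar schemes untouched
  }
  return PROXY_ORIGIN + '/' + encodeURIComponent(target.href);
}

// Naive rewrite of double-quoted href attributes in an HTML string.
// A real proxy would use a proper HTML parser rather than a regex.
function rewriteLinks(html, pageUrl) {
  return html.replace(/href="([^"]*)"/g, (match, url) =>
    'href="' + toProxiedUrl(url, pageUrl) + '"');
}

const page = '<a href="/about">About</a> <a href="mailto:hi@example.com">Mail</a>';
console.log(rewriteLinks(page, 'https://example.com/index.html'));
```

This only covers links that are present in the HTML when it passes through the proxy.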

But that’s just part of the story. Anyone familiar with web server rewriting modules (such as ARR in IIS or mod_rewrite in Apache) knows that having rewrite rules inside the HTTP proxy is not enough. Modern web apps rely on JavaScript, so most of the content is actually generated on the client side. For example, the code below creates a link that is virtually impossible for the proxy to capture:

const currentDate = new Date();
const link = document.createElement('a');
// At the time of writing, this would generate a link to https://the-year-2021.test
link.href = 'https://the-year-' + currentDate.getFullYear() + '.test';
document.body.appendChild(link);

Moreover, the JavaScript code itself could be dynamically generated:

let foo = {};
const code = 'foo.href = "//example.com"';
eval(code); // no navigation: foo is a plain object, so this only sets a property on it
foo = window.location;
eval(code); // this time the page will navigate to example.com

And if that is not enough, the code could be obfuscated. The following snippet does exactly the same thing as the one above:

const _0x5c2c=['foo.href\x20=\x20\x22//example.com\x22','703271gmlQyD','381497emYCfi',
'1019923hJJvMQ','603QLrHWo','383617OfWLUT','location','1177365yPaujx',
'1559GDizPw','648076JfmmxG'];const _0x34f2=function(_0x11440c,_0x3244b0){_0x11440c=_0x11440c-0x17d;let _0x5c2cf2=_0x5c2c[_0x11440c];return _0x5c2cf2;};const _0x452fd3=_0x34f2;(function(_0x4f836c,_0xafd59a){const _0x5094f6=_0x34f2;while(!![]){try{const _0x516045=parseInt(_0x5094f6(0x185))+-parseInt(_0x5094f6(0x183))+-parseInt(_0x5094f6(0x17f))+parseInt(_0x5094f6(0x181))+-parseInt(_0x5094f6(0x186))*-parseInt(_0x5094f6(0x180))+-parseInt(_0x5094f6(0x184))+parseInt(_0x5094f6(0x17d));if(_0x516045===_0xafd59a)break;else _0x4f836c['push'](_0x4f836c['shift']());}catch(_0x8adaf){_0x4f836c['push'](_0x4f836c['shift']());}}}(_0x5c2c,0xb21d8));let foo={};const code=_0x452fd3(0x182);eval(code),foo=window[_0x452fd3(0x17e)],eval(code);

To handle these and other tricky cases, we’ve built a full JavaScript sandbox. It lets us apply custom wrapper functions to all execution points, such as function calls and object property access. These wrappers are executed at runtime inside the browser, which allows us to intercept all relevant Web API requests and change the return value before giving it back to the application code.
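The idea of runtime wrappers can be illustrated with the standard `Proxy` object. This is only a sketch under stated assumptions: the proxy prefix `https://proxy.example/`, the helpers `toProxied`, `fromProxied`, and `wrapNavigable`, and the plain object standing in for an `<a>` element or `window.location` are all hypothetical, and the real sandbox is certainly more involved.

```javascript
// Hypothetical proxy prefix and URL scheme, chosen only for illustration.
const PROXY_PREFIX = 'https://proxy.example/';
const toProxied = (url) => PROXY_PREFIX + encodeURIComponent(url);
const fromProxied = (url) =>
  url.startsWith(PROXY_PREFIX)
    ? decodeURIComponent(url.slice(PROXY_PREFIX.length))
    : url;

// Wrap an object so that reads and writes of `href` are intercepted at
// runtime, no matter how the application computed the value.
function wrapNavigable(target) {
  return new Proxy(target, {
    set(obj, prop, value) {
      // Writes are rewritten so the underlying object points at the proxy.
      const stored = prop === 'href' ? toProxied(String(value)) : value;
      return Reflect.set(obj, prop, stored);
    },
    get(obj, prop) {
      // Reads are translated back so the application sees its own URL.
      const value = Reflect.get(obj, prop);
      return prop === 'href' && typeof value === 'string'
        ? fromProxied(value)
        : value;
    },
  });
}

// Stand-in for a DOM element; no real DOM is needed to run this.
const realLink = { href: '' };
const link = wrapNavigable(realLink);

// A dynamically computed URL that static rewriting could not see:
link.href = 'https://the-year-' + new Date().getFullYear() + '.test';
console.log(realLink.href); // proxied URL held by the underlying object
console.log(link.href);     // original URL, as seen by application code
```

Because the interception happens when the property is accessed, it does not matter whether the assignment came from plain, generated, or obfuscated code.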

[Figure: Surfly can wrap all inputs and outputs of a web application, creating a virtualisation layer around it]

Together, the JavaScript sandbox and the network proxy can capture practically all inputs and outputs of any web application, forming a sort of virtualization layer around it. We call it interaction middleware. Such a versatile system unlocks a plethora of use cases. Think of a magic tool that allows you to augment or completely change any web application. Need to add new functionality and UI elements? Easy. Implement a security layer for complex access control? That’s possible too.

And all that without any changes to the source code of the original application. Here’s just a short list of possible applications using this technology:

  • Applying browser extensions without actually installing one
  • User-specific customisations to the UI
  • Privacy content filters
  • Automation on top of 3rd-party web applications
  • Custom integrations of existing web apps, without any special APIs from their side
  • Security policy and authentication layers
  • Interactive annotation tools working on top of 3rd-party web apps
  • Transparent monitoring and analytics tools
  • [insert your idea here]

The scope of possibilities is truly limitless.

Huge piece to bite off

It is also a challenge. While the basic idea is simple and elegant, in practice it can be really hard to do.

First of all, the problem scope is enormous. The Web platform is growing rapidly: new Web APIs, protocols, and security mechanisms are released every month. Implementations (such as web browsers and web servers) do not always follow the standards, and the standards themselves are constantly evolving.

It is relatively simple to build an interaction middleware that supports most of the web platform features. But handling the remaining cases takes much more time, sometimes requiring a complete re-implementation of basic browser behavior. A few other companies have attempted this, but we believe we have by far the best coverage of the modern web platform. This includes the latest JS language features, Web APIs, and HTTP-based protocols.

Out of the shade

Despite the technical challenges, we have been building and polishing our middleware framework for the last decade. It is an ambitious task, but keeping focus on the co-browsing application helped us to perfect the technology in a pragmatic way, while delivering a great usable product along the way.

Now we feel that it’s ready to be shared with the world. This is why we are introducing the Interaction Middleware as a standalone product. It is in early access for now, which means we will be working closely with interested developers. Interaction Middleware is the result of thousands of hours of research, dozens of architectural iterations, and countless hours of debugging and bug fixing. But most importantly, it is battle-proven by the co-browsing solution that is used in millions of sessions worldwide.

Speaking of battle-proven: since 2017 we have partnered with the Startpage search engine to create a privacy-oriented proxy solution that powers their “Anonymous View” feature. The privacy layer protects users from various tracking techniques, including advanced browser fingerprinting as well as DNS and cookie analysis. This is by far our biggest venture into non-co-browsing applications of our proxy system, and quite a successful one 😉

We also tested the waters back in 2017 with the Surfly Labs experiments, which organically led to several exciting spin-off projects, so we know that there is demand for such a product.

What next?

With the introduction of the Interaction Middleware, we will start offering our core technology directly to developers. We hope to see lots of creative ideas, and we’re fully committed to making you successful with our technology.

We are also going to create more Middleware-related content for web developers and hackers on this tech blog. We’re excited to share our experiences and insights, so stay tuned and please get in touch if you are interested in this technology!
