Exploring Web Rendering: Streaming HTML

Make server rendering feel like client rendering but without most of the JavaScript
Rapidly-flowing stream of water interspersed with large rocks

Photo by Kylir Horton on Flickr

5-part Series: Exploring Web Rendering

  1. Isomorphic JavaScript & Hydration
  2. Partial Hydration (a.k.a. “Islands”)
  3. Progressive Hydration
  4. Streaming HTML ⟵ you’re reading this
  5. Server Components Architecture

Welcome back to the 4th article in this performance-focussed series about web rendering; we’ve come a long way in a short time, so thank you to those following along. This discussion began with client-side rendering (CSR) and has since moved toward server-side rendering (SSR), which the series encourages for performance reasons. Remember that the server is the key to this strategy because rendering there eliminates the need to use JavaScript in the browser to do so. As discussed in part 2 of this series about islands rendering, the rapidly-growing performance divide between midrange and premium phones over the past 10 years means that faster hardware can no longer be relied upon for better performance. Rather, execution efficiency must be improved, and JavaScript is the most expensive asset per byte. Thus, minimizing JavaScript’s use is key to good performance, as the previous articles in this series have explained.

To prevent any terminology-related misunderstandings: “streaming” is a common term most often associated with audio and video. Streaming HTML, however, is an unrelated, text-based technique focussed on delivering content to the user as quickly as possible. While both can be accomplished in a browser, the latter is the focus of this article.

Streaming HTML is a very different concept from the hydration-based topics explored in this series so far. Let’s begin our investigation with a short history, a general definition, and an abstracted example before diving into the details. Back in 1994, Netscape Navigator 1.0 beta was introduced with a feature called progressive HTML rendering that allowed response HTML to be periodically flushed to the browser in “chunks” until the response was complete. HTTP 1.1 later added official support for this under the name chunked transfer encoding, wherein a single response is transferred as a series of chunks; hence the name “streaming”.

Effectively a modern rename of the same concept, streaming rendering uses a single HTTP response to incrementally server render a page’s HTML while minimizing the amount of JavaScript used. More specifically, the initial part of the streaming response contains static content for data-independent components and placeholder content for components still waiting on data. In the meantime, the HTTP connection is held open while the server waits for its data requests to complete. Each piece of the response is streamed to the user’s browser as soon as it’s ready and replaces its respective placeholder HTML, and the connection finally closes when all data requests are complete. Conversely, traditional server rendering waits for a page’s full output to complete, including all data requests, before any response begins; most server-rendered web pages load this way.
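To make the "flush early, finish later" idea concrete, here is a minimal sketch in plain JavaScript. The names `streamPage`, `fetchProfile`, and `escapeHtml` are hypothetical and not from any framework; `res` is assumed to be anything with `write()` and `end()` methods, such as the response object of Node's built-in `http` module. For simplicity it streams in document order, so no placeholder is needed.

```javascript
// Sketch only: streamPage, fetchProfile and escapeHtml are hypothetical names.
// `res` is any object with write()/end(), e.g. Node's http.ServerResponse.

// Escape user data before placing it into HTML.
const escapeHtml = (text) =>
  String(text).replace(/[&<>"']/g, (ch) => `&#${ch.charCodeAt(0)};`);

async function streamPage(res, fetchProfile) {
  // 1. Flush the static, data-independent HTML immediately.
  res.write("<main><h1>Home</h1>");

  // 2. The connection stays open while the server waits for its data.
  const profile = await fetchProfile();

  // 3. Stream the data-dependent HTML once it is ready, then close.
  res.write(`<p>Welcome, ${escapeHtml(profile.name)}</p></main>`);
  res.end();
}
```

When `res` is a real Node.js HTTP/1.1 response without a `Content-Length` header, Node sends each `write()` using chunked transfer encoding, so the browser can start rendering the first chunk before the second one exists.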

To provide some additional clarity, consider a dinner analogy with a customer, dinner table, waiter, kitchen, and plates of food. The customer orders 3 plates of food: A (red), B (green), and C (blue), from least to most time consuming to prepare, respectively. With traditional server rendering, the waiter will wait for the kitchen to complete all 3 plates before bringing them all to the table at once. Specifically, using the image below, step 1 shows that plate A is complete but must wait in the kitchen. Step 2 shows plate B being completed next, followed by step 3, which shows plate C being finished last. Not until step 4 do all the plates leave the kitchen at once to be brought to the dinner table. Notice that no plates can leave the kitchen individually; lifting that restriction is the main enhancement that streaming rendering enables.

Diagram showing how traditional server rendering requires all plates from the kitchen to be complete before being delivered to the customer's table
Traditional server rendering requires all plates to be complete before being served

Conversely, streaming allows the waiter to bring each plate to the table as the kitchen completes it. Specifically, step 1 shows plate A as ready to be served. By step 2, the customer receives plate A at their table at the same time the kitchen completes plate B. Similarly, for step 3, the waiter brings plate B to the dinner table as plate C is completed. By step 4, the waiter has brought all 3 plates to the customer’s table. Clearly the customer can start eating more quickly in the streaming example; I know which waiter I would rather have!

Diagram showing how streaming rendering allows any plate to be delivered to the customer's table as soon as the kitchen completes it
Streaming rendering allows each plate to be brought to the table as it is ready

Streaming responses can occur in one of two ways: (1) in-order and (2) out-of-order. As the name suggests, in-order streaming delivers HTML to the browser in the same order as authored, which means that only HTTP is required and no JavaScript! The drawback is that a slow section blocks everything after it. For example, if 3 data requests are needed to build the output for a page and the first request is the slowest, that first request must complete and have its HTML returned before the content for the remaining data requests can be delivered to the page. This is similar to the problem in the traditional server rendering plates example above: completed data requests are forced to wait before being served. Out-of-order streaming does require a tiny bit of JavaScript in the browser to place each chunk of HTML, but responding immediately after each data request completes is a huge advantage; thus most modern frameworks, including React, use this method. More specifically, each placeholder is rendered with an identifier, and each content chunk is followed by a tiny bit of JavaScript that finds the respective placeholder by ID and then replaces its HTML with the new content; the React example later in this article shows this in detail.
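The difference between the strategies can be sketched as a small timing model. This is an illustration under simplifying assumptions, not a measurement: the function name `deliveryTimes` and the latency numbers are made up, every section is assumed to need exactly one data request, and all requests are assumed to start at time 0 and run in parallel on the server.

```javascript
// Hypothetical model: returns the time (ms) at which each section's HTML
// can be sent, given the latency of the data request behind each section.
function deliveryTimes(latencies, strategy) {
  if (strategy === "traditional") {
    // Nothing is sent until the slowest request has finished.
    const slowest = Math.max(...latencies);
    return latencies.map(() => slowest);
  }
  if (strategy === "in-order") {
    // A section cannot be sent before every section authored above it.
    let latestSoFar = 0;
    return latencies.map((ms) => {
      latestSoFar = Math.max(latestSoFar, ms);
      return latestSoFar;
    });
  }
  // Out-of-order: each section is sent the moment its own request completes.
  return [...latencies];
}

const exampleLatencies = [300, 100, 200]; // the first section is the slowest
console.log(deliveryTimes(exampleLatencies, "traditional"));  // [ 300, 300, 300 ]
console.log(deliveryTimes(exampleLatencies, "in-order"));     // [ 300, 300, 300 ]
console.log(deliveryTimes(exampleLatencies, "out-of-order")); // [ 300, 100, 200 ]
```

With the slowest request first, in-order streaming is no better than traditional rendering in this model, while out-of-order streaming delivers the two faster sections early.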

Now that the fundamentals are understood, a real-world approximation of React behavior follows in order to better illuminate the details. Using the built-in <Suspense> component in combination with rendering via renderToPipeableStream (Node.js) or renderToReadableStream (other runtimes) enables full out-of-order streaming. Loading user-specific data from a database can often be a source of latency, so assume a demo homepage application with static content and a dynamic user profile dropdown that appears if a user is logged in. Specifically, the <HomepageLayout />, <Carousel />, <Offers />, and <LearnMore /> components all output static HTML, whereas our <UserProfile /> component returns dynamically-built HTML based upon data fetched from a database. The application could be structured like this:

<HomepageLayout>
  <Suspense fallback={<LoadingSpinner />}>
    <UserProfile />
  </Suspense>
  <Carousel />
  <Offers />
  <LearnMore />
</HomepageLayout>

React will render this top-down, so first it will render the <HomepageLayout> layout component and its children – all of which are the remaining components to render. First of these children is the Suspense boundary around <UserProfile />. Fallback component <LoadingSpinner /> is shown because <UserProfile /> data is being fetched on the server. Next, the <Carousel />, <Offers />, and <LearnMore /> components render their static content in order. Finally, because all components have rendered once, the component tree in the browser looks something like this (changes from one code block to the next are surrounded by “START CHANGE” and “END CHANGE” HTML comments):

<HomepageLayout>

  <!-- START CHANGE -->
  <div id="suspense-placeholder-1">
    <LoadingSpinner />
  </div>
  <!-- END CHANGE -->

  <Carousel />
  <Offers />
  <LearnMore />
</HomepageLayout>

Once the data is fully fetched on the server, the remaining HTML for the <UserProfile /> component streams to the browser at the end of the DOM:

<HomepageLayout>
  <div id="suspense-placeholder-1">
    <LoadingSpinner />
  </div>
  <Carousel />
  <Offers />
  <LearnMore />
</HomepageLayout>

<!-- START CHANGE -->
<!-- begin UserProfile HTML -->
<ul>
  <li>menu item 1</li>
  <li>menu item 2</li>
  <li><button type="button">Logout</button></li>
</ul>
<script type="text/javascript" id="suspense-render-1">
  const suspenseRender = document
    .getElementById('suspense-render-1') // Assume this works
    .previousElementSibling;
  document
    .getElementById('suspense-placeholder-1')
    .replaceWith(suspenseRender);

  // Hydration code here...
</script>
<!-- end UserProfile HTML -->
<!-- END CHANGE -->

When the synchronous suspense-render-1 <script> gets executed, the <UserProfile /> HTML gets moved into its correct placement and is then hydrated. Because replaceWith moves the existing node rather than copying it, the <ul> no longer appears at the end of the document:

<HomepageLayout>

  <!-- START CHANGE -->
  <ul>
    <li>menu item 1</li>
    <li>menu item 2</li>
    <li><button type="button">Logout</button></li>
  </ul>
  <!-- END CHANGE -->

  <Carousel />
  <Offers />
  <LearnMore />
</HomepageLayout>
<!-- begin UserProfile HTML -->
<script type="text/javascript" id="suspense-render-1">
  const suspenseRender = document
    .getElementById('suspense-render-1')
    .previousElementSibling;
  document
    .getElementById('suspense-placeholder-1')
    .replaceWith(suspenseRender);

  // Hydration code here...
</script>
<!-- end UserProfile HTML -->

The final HTML in its desired location remains with full interactivity after hydration. The <script> tag can be effectively ignored because it is done executing.
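The placeholder-and-swap pattern from this walkthrough can be generalized into a small server-side helper. The sketch below is an approximation in the spirit of the article's example and is not React's actual implementation; the function name `renderSuspenseChunk` is hypothetical, and it assumes the streamed `html` has a single root element so that `previousElementSibling` picks up all of it.

```javascript
// Hypothetical helper: builds one out-of-order chunk, i.e. the late HTML
// followed by a tiny inline script that swaps it into its placeholder.
// Assumes `html` has exactly one root element.
function renderSuspenseChunk(id, html) {
  const placeholderId = `suspense-placeholder-${id}`;
  const scriptId = `suspense-render-${id}`;
  return [
    html,
    `<script id="${scriptId}">`,
    // The braces keep `content` block-scoped, so several chunks on the
    // same page do not collide on a top-level const declaration.
    `{ const content = document.getElementById("${scriptId}").previousElementSibling;`,
    `  document.getElementById("${placeholderId}").replaceWith(content); }`,
    `</script>`,
  ].join("\n");
}

// Example: the chunk for the user profile menu.
console.log(renderSuspenseChunk(1, "<ul><li>menu item 1</li></ul>"));
```

Each chunk carries its own identifier, so the server can write chunks in whatever order the data arrives and every script still finds the right placeholder.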

With the fundamental and practical aspects reviewed, streaming rendering can be understood even better by evaluating its pros and cons.

Streaming rendering provides numerous benefits over traditional server rendering. Moving data fetching and rendering to the server typically involves a latency penalty because both must fully complete before a response can be sent. Streaming eliminates this restriction by allowing fallback UI elements to be rendered in place of data-dependent components until the latter can be delivered to the browser. Furthermore, as in the React example above, HTML chunks can be delivered in any order with out-of-order streaming and a tiny trailing JavaScript payload to swap the HTML into its final location. In addition to the numerous performance benefits outlined by this series, applications that were previously client-rendered gain security benefits by doing data fetching and rendering on the server because vendors, technologies, and API tokens are hidden away from prying eyes. And streaming applications can often feel as fast as client-rendered ones if delivered using edge rendering, which uses servers around the world that are closest to your users to deliver dynamic content, similar to a Content Delivery Network (CDN).

Streaming has a few downsides. Once a response begins and a status code is chosen (e.g. 200), there is no way to change it, so an error that occurs during a later data fetch must be communicated to the user another way, such as within the streamed content itself. Also, not all frameworks and runtimes support streaming, so choose carefully. Marko was one of the first JavaScript frameworks to support streaming (2014!), and React and SolidJS both have great support for streaming as well.
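One possible way to deal with the status-code limitation is sketched below. It assumes `res` behaves like the response object of Node's built-in `http` module, whose `headersSent` property reports whether the status line has already gone out; the function name `reportFetchError`, the placeholder ID, and the error messages are hypothetical.

```javascript
// Hypothetical handler for a data fetch that fails on the server.
// `res` is assumed to be a Node http.ServerResponse or a compatible object.
function reportFetchError(res, placeholderId) {
  if (!res.headersSent) {
    // Nothing has been flushed yet, so a real error status is still possible.
    res.statusCode = 500;
    res.end("<h1>Something went wrong</h1>");
    return "status";
  }
  // The status (e.g. 200) is already on the wire, so the failure is reported
  // in-band by replacing the placeholder's content with a message.
  res.write(
    `<script>document.getElementById("${placeholderId}").textContent = ` +
      `"Could not load this section.";</script>`
  );
  // The response is left open because other sections may still be streaming.
  return "in-band";
}
```

The in-band branch trades a correct HTTP status for a visible message, which is why failures that should produce a real error page are best detected before the first chunk is flushed.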

Our streaming journey has come to a close, so I hope it was enjoyable and enlightening! If you have any questions or feedback about this topic or any others, please share your thoughts with me on Twitter. I would certainly enjoy hearing from you! And be sure to check out Babbel’s engineering team on Twitter to learn more about what’s going on across the department. The next and final article in this series will truly be an epic endeavor when we explore server components! They’re not just for React…
