It's Time to Rethink the Default Cache Size of Web Browsers

Case in Point

Caching is a very hot topic these days: aggregation of scripts and style sheets, compression, setting Expires and Last-Modified headers. All of that is about improving loading times and reducing traffic. It's pretty smart stuff and it works pretty well.

There are always two cases to consider: a user with a non-primed cache and a user with a primed cache. If the cache is primed, the user needs to download far fewer files, because most of them are already cached. Scripts are the same, style sheets are the same, lots of (or even all) images are the same, and maybe even the document itself is the same. That's the deal, basically.

Yesterday I noticed something odd, however. I revisited a page I had been to just the day before, and all of its images were reloaded. I also observed this on a lot of other websites I visit regularly. Of course I checked the headers, and everything looked fine. Pretty odd, especially if you consider that I already ramped up the cache size to 150 MB ages ago.

Average Web Page Size Triples Since 2003

I remembered the article about page size growth that was posted over at Slashdot a few months ago. Connections are getting faster and websites are getting bigger and bigger. There is more markup, far more scripts, more styling, bigger and better images (and also more of them), and all that other stuff, just more of it. And of course there are also all those funny videos, screencasts, interviews, and the like. A single one of those videos can easily be up to 400 MB in size.

For example, the 1up: RSVP video I watched yesterday was already 384 MB in size. And it wasn't even remotely HD with its puny 500x319 resolution. Even with more advanced video codecs (like H.264) and audio codecs (like HE-AAC), the size of those videos is going to increase for the foreseeable future.

Today's Default Cache Sizes

Browser              Default Cache Size
Opera                20 MB
Firefox              50 MB (fixed at that value since 2004)
Internet Explorer    10% of your drive space (IE7 caps at 1 GB)

Those are the cache sizes we can expect. Opera's 20 MB are quickly overwritten, and so are Firefox's 50 MB. IE's route is interesting, but the scale is somewhat off. At least they were smart enough to cap it at 1 GB; otherwise we would have ridiculous cache sizes nowadays. The problem with IE's approach is basically that the size of HDDs increases far quicker than that of websites.

Figure 1: HDD capacity over time

As you can see the capacity of HDDs increases at mind-boggling rates. Note that the y-axis is logarithmic; with a linear scale we would see a typical exponential curve. Of course the size of websites is primarily linked to bandwidth and not the size of HDDs. However, the size of the partition is a good way to figure out the upper limit.

A simple solution

If we take that factor of 3 every 5 years and use 50 MB as the starting point for 2004, we get a simple formula like this one:

(3^((year-2004)/5))*50

Year     Cache Size (MB)
2004     50
2005     62
2006     78
2007     97
2008     120
2009     150
2010     187
2011     233
2012     290
2013     361
2014     450
[...]
2019     1350
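
For anyone who wants to play with the numbers, the table can be reproduced with a few lines of Python. This is only a sketch of the formula above; the function name `default_cache_mb` and its parameter names are made up for illustration and don't come from any browser.

```python
def default_cache_mb(year, base_year=2004, base_mb=50, factor=3, period_years=5):
    """Suggested default cache size in MB.

    Starts at base_mb in base_year and grows by `factor` every `period_years`.
    All names and defaults here are illustrative assumptions.
    """
    return base_mb * factor ** ((year - base_year) / period_years)


if __name__ == "__main__":
    # Print the same rows as the table, rounded to whole megabytes.
    for year in range(2004, 2015):
        print(year, round(default_cache_mb(year)))
```

Changing `factor` or `base_mb` shows how sensitive the result is to the starting assumptions.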

Looks pretty sensible, doesn't it? Well, the starting point was chosen a bit conservatively, but you should get the idea. Keep in mind that those would only be the default sizes, and users can override them whenever they want. Additionally, a cap at x% of the partition size could be added to keep the cache from getting overly greedy.
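
The capping idea could look roughly like this. Again, just a sketch: `capped_cache_mb` is a hypothetical name, and the 10% default for `cap_percent` is an arbitrary placeholder for the "x%" above, not a recommendation.

```python
def capped_cache_mb(year, partition_mb, cap_percent=10):
    """Default cache size in MB, limited to a share of the partition.

    cap_percent stands in for the "x%" cap; 10 is an assumed placeholder.
    """
    # Growth formula: 50 MB in 2004, tripling every 5 years.
    uncapped = 50 * 3 ** ((year - 2004) / 5)
    # Upper limit derived from the partition size.
    limit = partition_mb * cap_percent / 100
    return min(uncapped, limit)
```

With these assumptions, a 4000 MB partition would hold the 2019 default down to 400 MB instead of 1350 MB, while large partitions simply get the uncapped value.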

Comments

adaptive cache should be better

It's also time to rethink the cache algorithm. An adaptive cache, which has a better chance of keeping frequently reused data, as used in kernel memory caches and swap management, would be better.

Sites you visit every day should be cached more than others, especially CSS or JS files, which are lighter and change less often than pictures or XML (depending on the article content).

This kind of algorithm is used at most system levels (software and hardware); it should be used in web browsers too.

Agree

Yep, a smarter cache would be a welcome thing.

Also, a small correction. IE7's cache size is different than previous versions. It defaults to 50 megs, not a percentage of drive space.

re: Agree

Thanks for the clarification. :)
