Caching is a very hot topic these days: aggregating scripts and style sheets, compression, setting Expires and Last-Modified headers. All of that is about improving loading times and reducing the amount of traffic. It's pretty smart stuff and it works pretty well.
There are always two cases to consider: a user with a non-primed cache and a user with a primed cache. If the cache is primed, the user will need to download far fewer files, because most of them are already cached. Scripts are the same, style sheets are the same, lots of (or even all) images are the same, and maybe even the document itself is the same. Well, that's the deal basically.
Yesterday I noticed something odd, however. I visited a page I had visited just the day before and all of its images were reloaded. I also observed this on a lot of other websites I visit regularly. Of course I checked the headers and everything looked fine. Pretty odd, especially if you consider that I already ramped up the cache size to 150MB ages ago.
I remembered that article about the size growth which was posted over at Slashdot a few months ago. Connections are getting faster and websites are getting bigger and bigger. There is more markup, far more scripts, more styling, bigger and better images (and also more of them), and all that other stuff - just more of it. And there are of course also all those funny videos, screencasts, interviews, and the like. Just one of those videos can easily be up to 400MB in size.
For example, the 1up: RSVP video I watched yesterday was already 384MB in size. And it wasn't even remotely HD with its puny 500x319 resolution. Even with more advanced video codecs (like H.264) and audio codecs (like HE-AAC), the size of those videos is going to increase for the foreseeable future.
| Browser | Default Cache Size |
|---|---|
| Opera | 20MB |
| Firefox | 50MB (fixed at that value since 2004) |
| Internet Explorer | 10% of your drive space (IE7 caps at 1GB) |
Those are the cache sizes we can expect. Opera's 20MB are quickly overwritten and so are Firefox's 50MB. IE's route is interesting, but the scale is somewhat off. At least they were smart enough to cap it at 1GB, otherwise we would have ridiculous cache sizes nowadays. The problem with IE's approach is basically that the size of HDDs increases far quicker than that of websites.

As you can see the capacity of HDDs increases at mind-boggling rates. Note that the y-axis is logarithmic; with a linear scale we would see a typical exponential curve. Of course the size of websites is primarily linked to bandwidth and not the size of HDDs. However, the size of the partition is a good way to figure out the upper limit.
If we take that factor of 3 for every 5 years, and 50MB as the starting point for 2004, we can create a simple formula like this one:
(3^((year-2004)/5))*50
| Year | Cache Size (MB) |
|---|---|
| 2004 | 50 |
| 2005 | 62 |
| 2006 | 78 |
| 2007 | 97 |
| 2008 | 120 |
| 2009 | 150 |
| 2010 | 187 |
| 2011 | 233 |
| 2012 | 290 |
| 2013 | 361 |
| 2014 | 450 |
| [...] | |
| 2019 | 1350 |
Looks pretty sensible, doesn't it? Well, the starting point was chosen a bit conservatively, but you should get the idea. Keep in mind that those would only be the default sizes and a user can override them whenever he/she wants. Additionally, a cap at x% of the partition size could be added to prevent the cache from getting overly greedy.
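For anyone who wants to play with the numbers, here is a minimal Python sketch of the formula. The function names, the parameter names, and the 10% share used for the partition cap are my own assumptions for illustration; only the factor of 3 per 5 years and the 50MB starting point for 2004 come from the formula itself.

```python
def default_cache_mb(year, base_year=2004, base_mb=50.0, factor=3.0, period_years=5.0):
    """Default cache size in MB: base_mb * factor ** ((year - base_year) / period_years)."""
    return base_mb * factor ** ((year - base_year) / period_years)


def capped_cache_mb(year, partition_mb, max_share=0.10):
    """Same default, but never more than max_share of the partition.

    max_share stands in for the "x%" cap; 10% is only an assumed example value.
    """
    return min(default_cache_mb(year), partition_mb * max_share)


if __name__ == "__main__":
    # Print the same years as the table, rounded to whole megabytes.
    for year in list(range(2004, 2015)) + [2019]:
        print(year, round(default_cache_mb(year)))
```

With the cap in place, a small partition wins over the formula: on a hypothetical 4000MB partition, `capped_cache_mb(2019, 4000)` is limited by the 10% share instead of the 1350MB the formula suggests.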
Comments
adaptive cache should be better
It's also time to think about the cache algorithm. An adaptive cache, which has a greater chance of keeping frequently reused data, as used in kernel memory caches and swap management, should be better.
Sites you visit every day should be cached more than others, especially CSS or JS files, which are lighter and change less often than pictures or XML (depending on the article content).
This kind of algorithm is used at most system levels (software/hardware); it should be used in web browsers too.
Agree
Yep, a smarter cache would be a welcome thing.
Also, a small correction. IE7's cache size is different than previous versions. It defaults to 50 megs, not a percentage of drive space.
re: Agree
Thanks for the clarification. :)
Always used increased cache ;)
Good point! That's why I have kept my cache size increased for many years. Right now, it's even 250MB. But I deal a lot with Flash and multimedia content, so this is really needed, and I have lots of free space left for that.
How about privacy issues?
If you have your browser delete the cache at the end of every session, a huge cache can become quite annoying, right?