Filtered by Web development

Page 3

Reset

Find out all localStorage keys and their value sizes

July 13, 2019
0 comments Web development, JavaScript

I use localhost:3000 for a lot of different projects. It's the default port on create-react-app's dev server. The browser profile remains but projects come and go. There's a lot of old stuff in there that I have no longer any memory of adding.

My Storage tab in Firefox

Working in a recent single page app, I tried to use localStorage as a cache for some XHR requests and got: DOMException: "The quota has been exceeded.".
Wat?! I'm only trying to store a ~250KB JSON string. Surely that's far away from the mythical 5MB limit. Do I really have to lzw compress the string in and out to save room and pay for it in CPU cycles?

Better yet, find out what junk I still have in there.

Paste this into your Web Console (it's safe as milk):


Object.entries(localStorage).forEach(([k,v]) => console.log(k.padEnd(50), v.length, (v.length / 1024).toFixed(1) + 'KB'))

The output looks something like this:

Web Console output

Or, sorted and filtered a bit:


Object.entries(localStorage).sort((a, b) => b[1].length -a[1].length).slice(0,5).forEach(
([k,v]) => console.log(k.padEnd(50), v.length, (v.length / 1024).toFixed(1) + 'KB'));

Looks like this:

Sorted and sliced

And for the record, summed total in kilobytes:


(Object.values(localStorage).map(x => x.length).reduce((a, b) => a + b) / 1024).toFixed(1) + 'KB';

Summed in KB

Wrapping up

Seems my Firefox browser's localStorage limit is still 5MB.

Also, you can do the loop using localStorage.length and localStorage.key(n) and localStorage.getItem(localStorage.key(n)).length but using Object.entries(localStorage) seems neater.

I guess this means I can still use localStorage in my app. It seems I just need to localStorage.removeItem('massive-list:items') which sounds like an experiment, from eons ago, for seeing how much I can stuff in there.

From jQuery to Cash

June 18, 2019
4 comments Web development, JavaScript

tl;dr; The main JavaScript bundle goes from 29KB to 6KB by switching from JQuery to Cash. Both with Brotli compression.

In Web Performance, every byte counts. Downloading less stuff means faster network operations but for JavaScript it also means less to parse and execute. This site used use JQuery 3.4.1 but now uses Cash 4.1.2. It requires some changes to how you use $ and most noticeable is the lack of animations and $.ajax.

I still stand by the $ function. It's great when you have a regular (static) website that isn't a single page app but still needs a little bit of interactive JavaScript functionality. On this site, I use it for making the commenting work and some various navigation/header stuff.

Switching to Cash means you have to stop doing things like $.getJSON() and $('.classname').fadeIn(400) which, in a sense, gives Cash an unfair advantage because those bits take up a large portion of the bundle size. Yes, there is a custom build of jQuery without those but check out this size comparison:

Bundle Uncompressed (bytes) Gzipped (bytes)
jQuery 3.4.1 88,145 30,739
jQuery 3.4.1 Slim 71,037 24,403
Cash 4.1.2 14,818 5,167

I still needed a fadeIn function, which I was relying on from jQuery, but to remedy that I just copied one of these from youmightnotneedjquery.com. It would be better to not do that an use a CSS transform instead but, well, I'm only human.

Before: with jQuery
Before: with jQuery

Another thing you'll need to replace is to switch from $.ajax to fetch but there are good polyfills but I haven't bothered with polyfills because the tiny percentage of visitors I have, without fetch support still get a working site but can't post comments.

I was contemplating doing what GitHub did in 2018 which was to replace jQuery with real vanilla JavaScript code but it didn't seem worth it now that Cash is only 5KB (gzipped) and it's an actively maintained project too.

Before: with jQuery
Before: with jQuery

After: with Cash
After: with Cash

WebSockets vs. XHR 2019

May 5, 2019
0 comments Web development, Web Performance, JavaScript

Back in 2012, I did an experiment to compare if and/or how much faster WebSockets are compared to AJAX (aka. XHR). It would be a "protocol benchmark" to see which way was faster to schlep data back and forth between a server and a browser in total. The conclusion of that experiment was that WebSockets were faster but when you take latency into account, the difference was minimal. Considering the added "complexities" of WebSockets (keeping connections, results don't come where the request was made, etc.) it's not worth it.

But, 7 years later browsers are very different. Almost all browsers that support JavaScript also support WebSockets. HTTP/2 might make things better too. And perhaps the WebSocket protocol is just better implemented in the browsers. Who knows? An experiment knows.

So I made a new experiment with similar tech. The gist of the code is best explained with some code:


// Inside App.js

loopXHR = async count => {
  const res = await fetch(`/xhr?count=${count}`);
  const data = await res.json();
  const nextCount = data.count;
  if (nextCount) {
    this.loopXHR(nextCount);
  } else {
    this.endXHR();
  }
};

Basically, pick a big number (e.g. 100) and send that integer to the server which does this:


# Inside app.py 

# from the the GET querystring "?count=123"
count = self.get_argument("count")   
data = {"count": int(count) - 1}
self.write(json.dumps(data))

So the browser keeps sending the number back to the server that decrements it and when the server returns 0 the loop ends and you look how long the whole thing took.

Try It

The code is here: https://github.com/peterbe/sockshootout2019

And the demo app is here: https://sockshootout.app (Just press "Start!", wait and press it 2 or 3 more times)

Location, location, location

What matters is the geographical distance between you and the server. The server used in this experiment is in New York, USA.

What you'll find is that the closer you are to the server (lower latency) the better WebSocket performs. Here's what mine looks like:

My result between South Carolina, USA and New York, USA
My result between South Carolina, USA and New York, USA

Now, when I run the whole experiment all on my laptop the results look very different:

Running all locally
Running all locally

I don't have a screenshot for it but a friend of mine ran this from his location in Perth, Australia. There was no difference. If any difference it was "noise".

Same Conclusion?

Yes, latency matters most. The technique, for the benefit of performance, doesn't matter much.

No matter how fancy you're trying to be, what matters is the path the bytes have to travel. Or rather, the distance the bytes have to travel. If you're far away a large majority of the total time is sending and receiving the data. Not the time it takes the browser (or the server) to process it.

However, suppose you do have all your potential clients physically near the server, it might be beneficial to use WebSockets.

Thoughts and Conclusions

My original thought was to use WebSockets instead of XHR for an autocomplete widget. At almost every keystroke, you send it to the server and as search results come in, you update the search result display. Things like that need to be fast and "snappy". But that's not where WebSockets shine. They shine in their ability to actively await results without having a loop that periodically pulls. There's nothing wrong with WebSocket and it has its brilliant use cases.

In summary, don't bother just to get a single-digit percentage performance increase if the complexity of the code and infrastructure is non-trivial. Keep building cool stuff with WebSockets but if you expect one result per action, XHR is good enough.

Bonus

The experiment app does collect everyone's results (just the timings and IP) and I hope to find the time to process this and build graph a correlating the geographical distance compared to the difference between the two techniques. Watch this space!

By the way, if you do plan on writing some WebSocket implementation code I highly recommend Sockette. It's solid and easy to use.

KeyCDN vs AWS CloudFront

April 29, 2019
3 comments Web development, Web Performance

Before I commit to KeyCDN for my little blog I wanted to check if CloudFront is better. Why? Because I already have an AWS account set up, familiar with boto3, it's what we use for work, and it's AWS so it's usually pretty good stuff. As an attractive bonus, CloudFront has 44 edge locations (KeyCDN 34).

Price-wise it's hard to compare because the AWS CloudFront pricing page is hard to read because the costs are broken up by regions. KeyCDN themselves claim KeyCDN is about 2x cheaper than CloudFront. This seems to be true if you look at cdnoverview.com's comparison too. CloudFront seems to have more extra specific costs. For example, with AWS CloudFront you have to pay to invalidate the cache whereas that's free for KeyCDN.

I also ran a little global latency test comparing the two using Hyperping using 7 global regions. The results are as follows:

KeyCDN on Hyperping.io
KeyCDN on Hyperping.io

CloudFront on Hyperping.io
CloudFront on Hyperping.io

Region KeyCDN CloudFront Winner
London 27 ms 36 ms KeyCDN
San Francisco 29 ms 46 ms KeyCDN
Frankfurt 47 ms 1001 ms KeyCDN
New York City 52 ms 68 ms KeyCDN
São Paulo 105 ms 162 ms KeyCDN
Sydney 162 ms 131 ms CloudFront
Mumbai 254 ms 76 ms CloudFront

Take these with a pinch of salt because it's only an average for the last 1 hour. Let's agree that they both faster than your regular Nginx server in a single location.

By the way, both KeyCDN and CloudFront support Brotli compression. For CloudFront, this was added in July 2018 and if your origin can serve according to Content-Encoding you simply tell CloudFront to cache based on that header.

Although I've never tried it CloudFront does have an API for doing cache invalidation (aka. purging) and you can use boto3 to do it but I've never tried it. For KeyCDN here's how you do cache invalidation with the python-keycdn-api:


api = keycdn.Api(settings.KEYCDN_API_KEY)
call = "zones/purgeurl/{}.json".format(settings.KEYCDN_ZONE_ID)
all_urls = [
    'origin.example.com/static/foo.css',
    'origin.example.com/static/foo.cssbr',
    'origin.example.com/images/foo.jpg',
]
params = {"urls": all_urls}
response = api.delete(call, params)
print(response)

I'm not in love with that API but I know it issues the invalidation fast whereas with CloudFront I heard it takes a while to take effect.

I think I might put my whole site behind a CDN

April 23, 2019
0 comments Web development, Nginx, Web Performance

tl;dr; I'm going to put this blog behind KeyCDN and I expect a 2-4x performance boost (on Time To First Byte).

Right now, requests to my blog go straight to an Nginx server in DigitalOcean in NYC, USA. The Nginx server, 99% of the time, serves the blog posts (and static assets) as index.html files straight from disk. If the request is GET /plog/some-slug it will search for a file called /path/to/cached/files/plog/some-slug/index.html (or index.html.br or index.html.gz depending on the user agent's Accept-Encoding header). Only if the file doesn't exist on disk, it goes through to Django (via uWSGI built into Nginx). All of it is done with HTTP/2 and uses LetsEncrypt for SSL.

This has been working great but it's time to step it up. It's time to put the whole site behind a CDN. And I think I'm going to use KeyCDN for it.

In the past, it used to be best-practice that you serve your HTML document from your smart server (e.g. Django) and then, for the static assets, you put in a CDN. Like this:


<html>
  <link rel="stylesheet" href="https://myaccount123.cloudakamaifastlyflare.com/static/main.d910ef9a33.css">

...
<body>
  <img src="https://myaccount123.cloudakamaifastlyflare.com/images/hero.jpg">

...

But with HTTP/2, this becomes an anti-pattern for web performance because your client has already made an expensive HTTP/2 connection (and SSL negotiation) to https://yourcooldomain.com and now it's cheap to just download the rest. I used to do it like that too and I don't regret it. As a matter of fact, on https://songsear.ch is straight to Nginx but all its images are (lazy) loaded via songsearch-2916.kxcdn.com. But I think, when time allows, I'll put all of Song Search behind a CDN too.

Basically, it's time to put the whole site behind a CDN. With smart purging techniques and smarter CDNs respecting your dynamic content cache control headers, it's time to share the load. ...all over the world.

CDN Choices

There are many sites that want to compare CDNs. But many are affiliated or even made by one of them. So it's hard to get comparisons. For example, KeyCDN demonstrates they're the cheapest by comparing themselves with 5 others that they picked. (But mind you, that seems reasonably backed up by this comparison on cdn.reviews).

CDNPerf does a decent job with cool graphs and stuff. Incidentally, they rank my current favorite (KeyCDN) as the slowest compared to the well known giants that I compared it to.

CDNPerf graph

But mind you, the perf difference between KeyCDN and the winner (topmost in the graph as of today) is 36ms vs 47ms which are both fantastic numbers.

CDNPerf list

It's hard to compare CDNs because they're all pretty fast, and actually, they're all reasonably cheap. What really matters is the features and that's a lot harder to compare. CloudFlare often comes up as a CDN provider with stellar features that impress me. I've never actually used them but at least they mention "Fast cache purge" and "API programmability" are their key features. But they also don't mention Brotli caching which I know is a feature KeyCDN supports.

KeyCDN has been great to me in the past when I've used it to CDN host static assets. I'm familiar with their interface and they recently launched an API to do things like purge-by-tag and purge-by-URL. They're cheap, which matters because in this context it's all side-project stuff I want to put behind a CDN. They have a Python library which, although very rough around the edges, it works. And also very important; I've communicated very successfully with them through their support and they've been responsive and helpful. So I'll go with KeyCDN.

The Opportunity

Before I move my domain www.peterbe.com to become a CNAME for one of their CDN domains, I wanted to experiment a little and see how it works and what performance numbers I get for comparison. So I set up beta.peterbe.com and did some Django and Nginx wiring so it would work the same but with the difference that it goes through a CDN for everything.

Then I picked a random page and set up a Hyperping monitor from all of its available regions and let it brew for a while. Unfortunately, Hyperping doesn't let you compare two monitors side-by-side so you're going to have to use your own eyes to compare the graphs:

www means no CDN, just the origin Nginx
NOT behind a CDN (server is New York, USA)

beta means with a CDN in front
Behind a CDN

The "total Response Time" in Hyperping doesn't really make sense. They're an average across all regions it pings from. If you live in, for example, Germany; the only response time that matters to you is 1,215 ms versus 40 ms. Equally, if you live somewhere in New York, the only response time that matters to you is 20 ms versus 64 ms.

I actually ran another benchmark. I used Python like this:


t0 = time.time()
r = requests.get('https://www.peterbe.com/plog/some-slug')
t1 = time.time()
print("Took", t1 - t0)

I did this from South Carolina which means my nearest KeyCDN edge location could be Atlanta, Miami, or New York. Either way, I'm reasonably near New York (compared to the rest of the world) so it'd be a fair performance comparison for all US east coast traffic. (Insert disclaimer here). It downloads the most recent blog posts, in repeated cycles, which gives the CDN a solid chance to warm up and then it compares the median of the last 100 downloads. The output of this is as follows:

beta
    COUNT               1854 (but only using the last 100)
    HIT RATIO           100.0%
    AVERAGE (all)       63.12ms
    MEDIAN (all)        61.89ms

www
    COUNT               1856 (but only using the last 100)
    HIT RATIO           100.0%
    AVERAGE             136.22ms
    MEDIAN              135.61ms

("HIT RATIO" for the non-CDN URL means it was served entirely without Djando server rendering)

What it means is that the median, with a CDN is: 62ms and 135.6ms without. That's a 2x boost.

The crawler stats script is available here: github.com/peterbe/peterbecom-cdn-crawl and I would be thrilled if you can clone it and run it and report what numbers you get and where you're running it from.

Notes and Conclusion

Mind you, 62ms vs. 136ms might sound like a silly difference if Webpagetest says it takes 700ms until the page is interactive (on an LTE connection). And this is a tiny super-optimized page. But never forget A) we can't all live in the US east-coast area and B) if the HTML can download marginally faster it allows the browser to parse it sooner and start downloading all the other stuff much sooner. It'll make a big difference! I'm sure you've all seen graphs like this:

Cold-cache MDN page on 4G
Imagine if all those static asset downloads could have started a whole second "to the left"

Of course a CDN is faster. It's no news. But it's also a hassle and it costs money. It's 2019 and most good CDNs now support Brotli, fast purge-by-url, and HTTP/2. It's time to make the switch! It's not like cache-invalidation is hard.

UPDATE April 23 2019 (same day)

KeyCDN has a neat looking tool that is similar to Hyperping but more of a one-off kinda deal. It's called Performance Test and I wouldn't be surprised it's biased as heck because they probably run these pings from the same location'ish as where they have the edge locations. Anyway, the results are nevertheless juicy. Note the last, TTFB column numbers.

Performance Test without CDN
Performance Test without CDN

Performance Test with CDN
Performance Test with CDN

Whatsdeployed rewritten in React

April 15, 2019
0 comments Web development, Python, React, JavaScript

A couple of months ago my colleague Michael @mythmon Cooper wanted to add a feature to the front-end code of Whatsdeployed and learned that the whole front-end is spaghetti jQuery code. So, instead, he re-wrote it in React. My only requirements were "Use create-react-app and no redux", i.e. keep it simple.

We also took the opportunity to rewrite some of the ways that URLs are handled. It used to be that a "short link" would redirect. For example GET /s-5HY would return 302 to Location: ?org=mozilla&repo=tecken&name[]=Dev&url[]=https://symbols.dev.mozaws.net/__version__&name[]=Stage... Basically, the short link was just an alias for a redirect. Just like those services like bit.ly or g.co. Now, the short link is a permanent fixture. The short link is included in the XHR calls to the server for getting the relevant data.

All old URLs will continue to work but now the canonical URL becomes /s/5HY/mozilla-services/tecken, for example. The :org/:repo isn't really necessary because the server knows exactly what 5HY (in this example means), but it's nice for the URL bar's memory.

Another thing that changed was how it can recognize "bors commits". When you use bors, you put a bunch of commits into a GitHub Pull Request and then ask the bors bot to merge them into master. Using "bors mode" in Whatsdeployed is optional but we believe it looks a lot more user-friendly. Here is an example of mozilla/normandy with and without bors toggled on and off.

Without "bors mode"
Without "bors mode"

With "bors mode"
With "bors mode"

Thank you mythmon!

Lastly, hopefully this will make it a lot easier to contribute. Check out https://github.com/peterbe/whatsdeployed. All you need is Python 3, a PostgreSQL, and almost any version of Node that can run create-react-apps. Ping me if you find it hard to get up and running.

KeyCDN vs. DigitalOcean Nginx

April 12, 2019
0 comments Web development, Nginx, Web Performance

tl;dr; The global average response time of serving an image from my NYC DigitalOcean server compared to a CDN is almost 10x.

KeyCDN is a CDN service that I use for side-projects. It's great. It has about ~35 edge locations. I don't know much about how their web servers work but I can't imagine it's much different from the origin server. In principle.

The origin server is my DigitalOcean (6 vCPU, 16 GB RAM, Ubuntu 14) droplet. It's running an up-to-date CloudFlare build of Nginx and the static images are served straight from (SSD) disk with a 4 weeks TTL (max-age=2419200,public,immutable). The SSL is done with LetsEncrypt and I'm somewhat confident the Nginx is decently configured and uses HTTP/2.

So the CDN, on songsearch-2916.kxcdn.com, is basically configured to front any requests to songsear.ch. If the origin has cache-control headers, KeyCDN knows it can hold on to it for a while, but it's not a guarantee that it will for the full time specified in the cache-control. Either way; how does it compare?

The Experiment

I picked a random static asset URL. It's a 32 KB JPEG file. Its origin URL and its CDN URL are:

  1. https://songsear.ch/static/albums/2017/05/24/08/170630_300x300.jpg
  2. https://songsearch-2916.kxcdn.com/static/albums/2017/05/24/08/170630_300x300.jpg

Next, I set up a Hyperping monitor on both URLs as GET requests. For the regions (regions from where Hyperping will do pings from), I picked the following:

  1. San Francisco, USA
  2. New York, USA
  3. London, United Kindom
  4. Frankfurt, Germany
  5. Mumbai, India
  6. São Paulo, Brazil
  7. Sydney, Australia

(I wish I had selected all 12 possible regions when I started but now it's too late for lazy me)

Then, I let Hyperping GET these URLs for a while and behold, here are the numbers:

The Results

Average response time:

  • The CDN: 79 ms
  • The origin: 714 ms

That's a 10x difference!

Mind you, the "average response time" is across all regions. It doesn't reflect what people get. If 90% of your visitors are from Australia, the average response times would, of course, be very different. But as an example, the origin server is in New York and there, the average response time is 26 ms vs. 105 ms which is a 5x difference.

Here are some screenshots from Hyperping:

KeyCDN results
KeyCDN

Origin server
Origin server

Conclusion

KeyCDN's server is clearly fast and worth doing. It's unsurprising that it performs better far away from New York but it's surprising how much faster it is at serving than the origin when pinged from New York (5x difference).

The site is still NOT fronted by a CDN because, apart from the images, almost all content is un-cacheable. However, I need to do more research and experimentation with putting everything behind a CDN and being meticulous with setting no-cache headers on dynamic stuff and using async tools to invalidate CDN caches when appropriate.

Optimize inlined SVG on developer.mozilla.org

April 4, 2019
0 comments Web development, Web Performance

tl;dr We could make the initial HTML document 40% smaller if moved from inline SVG to external, optimized, .svg static assets. But there are lots of caveats unless the SVG can be used as an image.

One of the many goals of MDN Web Docs this year is to make it faster. That makes users happier and as a side-effect, it makes Google happier. And hopefully, being faster will mean Google ranks us higher.

I'm still new to the MDN code base and there are many things we can do. One thing I noticed is that the site uses inline SVG. E.g.


<a href="/en-US/docs/Learn">References &amp; Guides
    <svg class="icon icon-caret-down" xmlns="http://www.w3.org/2000/svg" width="16" height="28" viewBox="0 0 16 28">
      <path d="M16 11a.99.99 0 0 1-.297.703l-7 7C8.516 18.89 8.265 19 8 19s-.516-.109-.703-.297l-7-7A.996.996 0 0 1 0 11c0-.547.453-1 1-1h14c.547 0 1 .453 1 1z"/>
    </svg>
</a>

The site uses HTTP/2 so the argument of reducing the number of requests is not valid. Well, with caveats. Browser support for HTTP/2 is getting really good. Definitely good enough to make it worth betting on.
It used to be that there's a trade-off for making static assets external: you can potentially avoid downloads, at all, by browser caching and the initial HTML document becomes smaller. But all, at the cost of more requests.

There are other, more subtle, differences with SVG. For example, the content of the SVG might depend on dynamic data. There might be others that I'm not aware and I'm quick to admit that I don't know much about use and stuff but this article might be full of those details.

The Experiment

I wrote a script that opens https://developer.mozilla.org/en-US/ and extracts every <svg> tag and puts them on disk. E.g. svg.icon.icon-smile_fbf6292.svg. They have a hash checksum on the content in case two <svg>s are different (but with the same classList). Then it does the following:

  1. Run svgo on each .svg to create a .min.svg.
  2. Run zopfli on each .min.svg to create a .min.svg.gz
  3. Run brotli on each .min.svg to create a .min.svg.br
  4. Sum all inline ones total size, sum the size of all .min.svg, sum the size of all .min.svg.gz, sum the size of all .min.svg.br.

Results

Technique Number Total Bytes
Inline 27 22,142 (21.6KB)
Optimized with svgo 15 14,566 (14.2KB)
Zopfli compressed 15 6,236 (6.1KB)
Brotli compressed 15 5,789 (5.7KB)

Conclusions and Caveats

For every single MDN page, we stand to make the initial HTML document 22KB smaller. Every time. Most web developers I know often Google for something and end up on MDN and do so frequently enough that there's a good chance for a warm browser cache.

But! This 22KB is uncompressed. Since the HTML documents are served gzipped, at a ratio of about 1:4, the total inline SVGs is roughly 5.6KB. At the time of writing the MDN home page is 58,496 bytes decompressed and 14,570 bytes gzipped. So that means that we stand to potentially strip away 40% of the document size!

Second but! There are some non-trivial differences in usage of SVG. You can't simply replace...


<a href="/en-US/docs/Learn">References &amp; Guides
  <svg class="icon icon-caret-down" xmlns="http://www.w3.org/2000/svg" width="16" height="28" viewBox="0 0 16 28">
    <path d="M16 11a.99.99 0 0 1-.297.703l-7 7C8.516 18.89 8.265 19 8 19s-.516-.109-.703-.297l-7-7A.996.996 0 0 1 0 11c0-.547.453-1 1-1h14c.547 0 1 .453 1 1z"/>
  </svg>
</a>

...with...


<a href="/en-US/docs/Learn">References &amp; Guides
  <img src="/static/icon-caret.914d0e4.min.svg">
</a>

(Compare this and this)

You can, instead, use <svg><use xlink:href="/static/icon-caret.914d0e4.min.svg".></svg> but it comes with its own host of challenges and problems (styling and IE support) and you still need this <svg> tag to do the wrapping in the first place which adds bytes.

It's not always worth compressing tiny static assets. And it might be worth experimenting with what the CPU cost is for a low-performance mobile device to decompress the asset versus just eating the extra network download cost of leaving it uncompressed.

HTTP/2 is great in that it allows the browser to download external assets earlier, on the same open connection, as the initial HTML document. But it's not without risks and costs that need to be carefully considered.

Format numbers with numberWithCommas() or Number.toLocaleString()

March 5, 2019
1 comment Web development, JavaScript

In a highly unscientific survey of exactly 2 French native friends, I asked them what they think about formatting large numbers the "French way" versus doing it the "English way". In particular, if the rest of content/app is English, would it be jarring if the formatting of numbers was French. Both Adrian and Mathieu said they prefer displaying the number the French way even if the app/content is French.

If you have an English browser opening https://codepen.io/peterbe/pen/xBRGoN means it's going to display the two numbers the same way. But if you have a French locale in your browser it'll look like this:

French

Number.toLocaleString() is now universally supported so that's no longer a worry.

For years I was using a function like this:


/* http://stackoverflow.com/a/2901298 */
function numberWithCommas(x) {
  var parts = x.toString().split('.');
  parts[0] = parts[0].replace(/\B(?=(\d{3})+(?!\d))/g, ',');
  return parts.join('.');
}

numberWithCommas(100000000);
// "100,000,000"

jsperf
and it works and is reasonably fast too but it's so tempting to not use and instead stand on the shoulder-of-browsers to supply this functionality instead. But consider the alternative:


(100000000).toLocaleString();
// "100,000,000"

The thing that's always worried me is; What will someone's reaction be if the texts are in one locale and the formatting of numbers (and dates, etc!) are in another locale?

All 2 of the people I asked say they don't mind the mixing but admit that it's weird but that ultimately they prefer "their" format.

If you're non-English browser, what do you prefer? If you're a web usability expert, please, too, drop a comment to share what you think.

Django ORM optimization story on selecting the least possible

February 22, 2019
16 comments Web development, Django, Python, PostgreSQL

This an optimization story that should not surprise anyone using the Django ORM. But I thought I'd share because I have numbers now! The origin of this came from a real requirement. For a given parent model, I'd like to extract the value of the name column of all its child models, and the turn all these name strings into 1 MD5 checksum string.

Variants

The first attempted looked like this:


artist = Artist.objects.get(name="Bad Religion")
names = []
for song in Song.objects.filter(artist=artist):
    names.append(song.name)
return hashlib.md5("".join(names).encode("utf-8")).hexdigest()

The SQL used to generate this is as follows:


SELECT "main_song"."id", "main_song"."artist_id", "main_song"."name", 
"main_song"."text", "main_song"."language", "main_song"."key_phrases", 
"main_song"."popularity", "main_song"."text_length", "main_song"."metadata", 
"main_song"."created", "main_song"."modified", 
"main_song"."has_lastfm_listeners", "main_song"."has_spotify_popularity" 
FROM "main_song" WHERE "main_song"."artist_id" = 22729;

Clearly, I don't need anything but just the name column, version 2:


artist = Artist.objects.get(name="Bad Religion")
names = []
for song in Song.objects.filter(artist=artist).only("name"):
    names.append(song.name)
return hashlib.md5("".join(names).encode("utf-8")).hexdigest()

Now, the SQL used is:


SELECT "main_song"."id", "main_song"."name" 
FROM "main_song" WHERE "main_song"."artist_id" = 22729;

But still, since I don't really need instances of model class Song I can use the .values() method which gives back a list of dictionaries. This is version 3:


names = []
for song in Song.objects.filter(artist=a).values("name"):
    names.append(song["name"])
return hashlib.md5("".join(names).encode("utf-8")).hexdigest()

This time Django figures it doesn't even need the primary key value so it looks like this:


SELECT "main_song"."name" FROM "main_song" WHERE "main_song"."artist_id" = 22729;

Last but not least; there is an even faster one. values_list(). This time it doesn't even bother to map the column name to the value in a dictionary. And since I only need 1 column's value, I can set flat=True. Version 4 looks like this:


names = []
for name in Song.objects.filter(artist=a).values_list("name", flat=True):
    names.append(name)
return hashlib.md5("".join(names).encode("utf-8")).hexdigest()

Same SQL gets used this time as in version 3.

The benchmark

Hopefully this little benchmark script speaks for itself:


from songsearch.main.models import *

import hashlib


def f1(a):
    names = []
    for song in Song.objects.filter(artist=a):
        names.append(song.name)
    return hashlib.md5("".join(names).encode("utf-8")).hexdigest()


def f2(a):
    names = []
    for song in Song.objects.filter(artist=a).only("name"):
        names.append(song.name)
    return hashlib.md5("".join(names).encode("utf-8")).hexdigest()


def f3(a):
    names = []
    for song in Song.objects.filter(artist=a).values("name"):
        names.append(song["name"])
    return hashlib.md5("".join(names).encode("utf-8")).hexdigest()


def f4(a):
    names = []
    for name in Song.objects.filter(artist=a).values_list("name", flat=True):
        names.append(name)
    return hashlib.md5("".join(names).encode("utf-8")).hexdigest()


artist = Artist.objects.get(name="Bad Religion")
print(Song.objects.filter(artist=artist).count())

print(f1(artist) == f2(artist))
print(f2(artist) == f3(artist))
print(f3(artist) == f4(artist))

# Reporting
import time
import random
import statistics

functions = f1, f2, f3, f4
times = {f.__name__: [] for f in functions}

for i in range(500):
    func = random.choice(functions)
    t0 = time.time()
    func(artist)
    t1 = time.time()
    times[func.__name__].append((t1 - t0) * 1000)

for name in sorted(times):
    numbers = times[name]
    print("FUNCTION:", name, "Used", len(numbers), "times")
    print("\tBEST", min(numbers))
    print("\tMEDIAN", statistics.median(numbers))
    print("\tMEAN  ", statistics.mean(numbers))
    print("\tSTDEV ", statistics.stdev(numbers))

I ran this on my PostgreSQL 11.1 on my MacBook Pro with Django 2.1.7. So the database is on localhost.

The results

276
True
True
True
FUNCTION: f1 Used 135 times
    BEST 6.309986114501953
    MEDIAN 7.531881332397461
    MEAN   7.834429211086697
    STDEV  2.03779968066591
FUNCTION: f2 Used 135 times
    BEST 3.039121627807617
    MEDIAN 3.7298202514648438
    MEAN   4.012803678159361
    STDEV  1.8498943539073027
FUNCTION: f3 Used 110 times
    BEST 0.9920597076416016
    MEDIAN 1.4405250549316406
    MEAN   1.5053835782137783
    STDEV  0.3523240470133114
FUNCTION: f4 Used 120 times
    BEST 0.9369850158691406
    MEDIAN 1.3251304626464844
    MEAN   1.4017681280771892
    STDEV  0.3391019435930447

Bar chart

Discussion

I guess the hashlib.md5("".join(names).encode("utf-8")).hexdigest() stuff is a bit "off-topic" but I checked and it's roughly 300 times faster than building up the names list.

It's clearly better to ask less of Python and PostgreSQL to get a better total time. No surprise there. What was interesting was the proportion of these differences. Memorize that and you'll be better equipped if it's worth the hassle of not using the Django ORM in the most basic form.

Also, do take note that this is only relevant in when dealing with many records. The slowest variant (f1) takes, on average, 7 milliseconds.

Summarizing the difference with percentages compared to the fastest variant:

  • f1 - 573% slower
  • f2 - 225% slower
  • f3 - 6% slower
  • f4 - 0% slower

UPDATE Feb 25 2019

James suggested, although a bit "missing the point", that it could be even faster if all the aggregation is pushed into the PostgreSQL server and then the only thing that needs to transfer from PostgreSQL to Python is the final result.

By the way, name column in this particular benchmark, when concatenated into one big string, is ~4KB. So, with variant f5 it only needs to transfer 32 bytes which will/would make a bigger difference if the network latency is higher.

Here's the whole script: https://gist.github.com/peterbe/b2b7ed95d422ab25a65639cb8412e75e

And the results:

276
True
True
True
False
False
FUNCTION: f1 Used 92 times
    BEST 5.928993225097656
    MEDIAN 7.311463356018066
    MEAN   7.594626882801885
    STDEV  2.2027017044658423
FUNCTION: f2 Used 75 times
    BEST 2.878904342651367
    MEDIAN 3.3979415893554688
    MEAN   3.4774907430013022
    STDEV  0.5120246550765524
FUNCTION: f3 Used 88 times
    BEST 0.9310245513916016
    MEDIAN 1.1944770812988281
    MEAN   1.3105544176968662
    STDEV  0.35922655625999383
FUNCTION: f4 Used 71 times
    BEST 0.7879734039306641
    MEDIAN 1.1661052703857422
    MEAN   1.2262606284987758
    STDEV  0.3561764250427344
FUNCTION: f5 Used 90 times
    BEST 0.7929801940917969
    MEDIAN 1.0334253311157227
    MEAN   1.1836051940917969
    STDEV  0.4001442703048186
FUNCTION: f6 Used 84 times
    BEST 0.80108642578125
    MEDIAN 1.1119842529296875
    MEAN   1.2281338373819988
    STDEV  0.37146893005516973

Result: f5 is takes 0.793ms and (the previous "winner") f4 takes 0.788ms.

I'm not entirely sure why f5 isn't faster but I suspect it's because the dataset is too small for it all to matter.

Compare:

songsearch=# explain analyze SELECT "main_song"."name" FROM "main_song" WHERE "main_song"."artist_id" = 22729;
                                                             QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using main_song_ca949605 on main_song  (cost=0.43..229.33 rows=56 width=16) (actual time=0.014..0.208 rows=276 loops=1)
   Index Cond: (artist_id = 22729)
 Planning Time: 0.113 ms
 Execution Time: 0.242 ms
(4 rows)

with...

songsearch=# explain analyze SELECT md5(STRING_AGG("main_song"."name", '')) AS "names_hash" FROM "main_song" WHERE "main_song"."artist_id" = 22729;
                                                                QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=229.47..229.48 rows=1 width=32) (actual time=0.278..0.278 rows=1 loops=1)
   ->  Index Scan using main_song_ca949605 on main_song  (cost=0.43..229.33 rows=56 width=16) (actual time=0.019..0.204 rows=276 loops=1)
         Index Cond: (artist_id = 22729)
 Planning Time: 0.115 ms
 Execution Time: 0.315 ms
(5 rows)

I ran these two SQL statements about 100 times each and recorded their best possible execution times:

1) The plain SELECT - 0.99ms
2) The STRING_AGG - 1.06ms

So that accounts from ~0.1ms difference only! Which kinda matches the results seen above. All in all, I think the dataset is too small to demonstrate this technique. But, considering the chance that the complexity might not be linear with the performance benefit, it's still interesting.

Even though this tangent is a big off-topic, it is often a great idea to push as much work into the database as you can if applicable. Especially if it means you can transfer a lot less data eventually.