January 29, 2011

Simple Cross-Domain Javascript HTTP with call.jsonlib.com

In Python, fetching any webpage in the world is a one-liner:

print urllib2.urlopen('http://davidbau.com/data/animals').read()

But Javascript's same-origin policy (SOP) prevents you from doing the same thing in Javascript unless your script happens to be running on a page from the same domain. The SOP is intended to protect the user's security by limiting access to private resources such as:

  1. Private browser state like cookies from other domains.
  2. Private web pages that may be only accessible behind firewalls.

While this is a nice thing for protecting logged-in banking sessions and secrets on corporate LANs, it also means that plenty of perfectly safe and useful network code cannot be written with Javascript. You need to do your cross-domain networking server-side, and then bounce your requests off a server in your own domain.

JSONP Lets Servers Expose Services Cross-Domain

Interestingly, the "src" attribute of the <script> tag is not subject to the same origin restriction, which means that servers can intentionally expose services cross-domain by encoding their data as a javascript function call.
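As a rough illustration of that convention, here is a minimal sketch of both halves. The function names, the example.com endpoint, and the "callback" query parameter are assumptions chosen for illustration, not any particular server's API.

```javascript
// Server side (sketch): wrap a JSON payload in a call to the function
// named by the client. The name is validated so that it cannot smuggle
// arbitrary script into the response.
function wrapJsonp(callbackName, data) {
  if (!/^[A-Za-z_$][\w$]*$/.test(callbackName)) {
    throw new Error('invalid callback name: ' + callbackName);
  }
  return callbackName + '(' + JSON.stringify(data) + ');';
}

// Client side (sketch): build the URL that a <script src="..."> tag loads.
// The "callback" parameter name is a common convention, assumed here.
function jsonpUrl(endpoint, params, callbackName) {
  const query = Object.keys(params)
    .map((k) => encodeURIComponent(k) + '=' + encodeURIComponent(params[k]))
    .concat('callback=' + encodeURIComponent(callbackName))
    .join('&');
  return endpoint + '?' + query;
}

// In a browser, the request itself is just a script tag:
//   var s = document.createElement('script');
//   s.src = jsonpUrl('http://api.example.com/animals', {q: 'cat'}, 'handleAnimals');
//   document.head.appendChild(s);
// When the response arrives, the browser executes it, which calls
// the page's own handleAnimals(...) function with the data.
```

Because the response is executed as script, the page has to trust the server it loads from; JSONP only works with servers that deliberately opt in.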

In 2005 Bob Ippolito proposed standardizing this convention as a format called JSONP. Today, JSONP can be used to make cross-domain calls to various useful APIs including Flickr, GData, Twitter, and YQL. JSONP is incredibly useful and is supported in all the major AJAX libraries such as jQuery.

YQL in particular is very versatile, and it actually allows you to make your own further HTTP requests to other domains. However, it does a lot more and is a bit complicated and slow as a result. So for the specific purpose of making secure cross-domain HTTP calls on the public internet, I have been using a service I've posted on Google App Engine, called call.jsonlib.com.

Cross-Domain Javascript HTTP Requests on the Public Web

The library at call.jsonlib.com lets you get very close to the Python one-liner:

jsonlib.fetch('http://davidbau.com/data/animals', function(m) { alert(m.content); });

It is just a simple packaging of the Python urllib2 library as a JSONP library. But it is tremendously useful. It allows you to:

  • Make GET or POST requests to arbitrary public servers
  • Set arbitrary HTTP headers
  • Read all the HTTP headers of the response

Because all the requests are made from the context of a server on the public internet, it does not expose your private LAN or private cookies to security holes.

A few corners are smoothed out to make use from javascript easier. For example:

  • jsonlib.fetch is intelligent about encodings and converts the content to unicode based on HTTP headers, content sniffing, and <meta> tags in the HTML content, if any.
  • jsonlib.fetch can also do basic scraping tasks server-side such as extracting elements that match a particular css selector, or stripping HTML markup and returning just text.
  • jsonlib.fetch is careful to access the proxy using https if the underlying URL being scraped is https, so that the whole path is encrypted.
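To give a feel for the encoding step in the first bullet, here is a minimal sketch of one plausible order of precedence: the Content-Type header first, then a <meta> declaration, then a default. The detectCharset helper and the utf-8 fallback are assumptions for illustration; this is not the service's actual code, and byte-level content sniffing is left out.

```javascript
// Hypothetical helper: pick a charset name for a fetched page.
// Order: Content-Type header, then a <meta> declaration, then a default.
function detectCharset(contentTypeHeader, htmlPrefix) {
  const fromHeader = /charset\s*=\s*"?([\w.:-]+)/i.exec(contentTypeHeader || '');
  if (fromHeader) return fromHeader[1].toLowerCase();

  // Matches both <meta charset="..."> and
  // <meta http-equiv="Content-Type" content="text/html; charset=...">.
  const fromMeta = /<meta[^>]*charset\s*=\s*["']?([\w.:-]+)/i.exec(htmlPrefix || '');
  if (fromMeta) return fromMeta[1].toLowerCase();

  return 'utf-8'; // assumed fallback when nothing is declared
}
```

A caller would pass the response's Content-Type header and the first chunk of the body, then decode the raw bytes with the charset that comes back.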

More documentation here.

Some examples here.

Posted by David at January 29, 2011 06:47 AM
Comments

Nice! Are you planning to run the proxy for anyone? It seems ripe for abuse, either intentional or accidental. Maybe publish the server code too so folks can run their own proxy wherever they want? It's easy enough to hack up a proxy, of course, but a generic one with a nice API like yours would be welcome.

Posted by: Nelson Minar at January 29, 2011 11:11 AM

Hi Nelson. You're right. Here is the server code, packaged to run on GAE (along with its antiquated version of python):

http://www.assembla.com/code/jsonlib/subversion/nodes/trunk

If anybody runs another proxy, let me know.

In the meantime, the service at call.jsonlib.com is far from capacity limits. Feel free to use it for small-volume apps.

Posted by: David at January 29, 2011 01:47 PM

I have put together a little JSON sample that iterates over a JavaScript object and posts the property values to a cross-domain server hosted by a DotNet.aspx page. The page converts a C# object to a JSON string, which is posted back to the browser and converted back into a JavaScript object without having to use window.eval().
The resulting JavaScript object is finally passed to a callback function, ready to use. The code does not need third-party libraries, works in .NET Framework 2.0 and upwards, has been tested with IE6-IE9 and Firefox, and is lightweight.

Click my name for full details

Posted by: flash at August 5, 2011 08:58 AM

Turns out something has started to hit call.jsonlib.com too hard in the last day, and so I've implemented a QPS limit. A single referer host is limited to 1 qps now, and a client ip is limited to 0.5 qps.

This should allow small apps that are driven by user action but it should discourage apps that poll automatically. (If you want to poll, you should run your own app server.)
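For anyone running their own proxy, limits like these can be sketched as a minimum-spacing check per key. This is only an illustration under assumed names (MinIntervalLimiter, makeRequestGate); it is not the code running on call.jsonlib.com, and a real deployment would also need to expire old keys and share state across server instances.

```javascript
// Hypothetical limiter: allow at most `qps` requests per second per key
// by enforcing a minimum spacing of 1/qps seconds between accepted requests.
class MinIntervalLimiter {
  constructor(qps) {
    this.minIntervalMs = 1000 / qps;
    this.lastAccepted = new Map(); // key -> time of last accepted request
  }

  wouldAllow(key, t) {
    const last = this.lastAccepted.get(key);
    return last === undefined || t - last >= this.minIntervalMs;
  }

  record(key, t) {
    this.lastAccepted.set(key, t);
  }
}

// Limits from the comment above: 1 qps per referer host, 0.5 qps per client IP.
// `now` is an injectable clock (milliseconds) so the gate can be tested.
function makeRequestGate(now = () => Date.now()) {
  const byReferer = new MinIntervalLimiter(1);
  const byClientIp = new MinIntervalLimiter(0.5);
  return function allowRequest(refererHost, clientIp) {
    const t = now();
    // Check both limits before recording, so a rejected request
    // does not count against either key.
    if (!byReferer.wouldAllow(refererHost, t) || !byClientIp.wouldAllow(clientIp, t)) {
      return false;
    }
    byReferer.record(refererHost, t);
    byClientIp.record(clientIp, t);
    return true;
  };
}
```

With this shape, a page that fires a request on each user click stays under the limit, while a tight polling loop is mostly rejected.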

Posted by: David at September 28, 2011 08:43 AM