January 12, 2011

Snowpocalypse Hits Boston



We'll need to dig out in Lincoln MA. All this snow arrived last night. It's still coming down!

Posted by David at 10:14 AM | Comments (0)

January 22, 2011

Dabbler Under Version Control

I notice that my son's friends have been very active on dabbler.org recently, so I've placed it under version control so that they don't lose more than a day of work if there is an accident.

The dabbler svn repository is here, with automatic check-ins every night.

Posted by David at 07:52 AM | Comments (0)

January 29, 2011

Simple Cross-Domain Javascript HTTP with call.jsonlib.com

In Python, fetching any webpage in the world is a one-liner:

print urllib2.urlopen('http://davidbau.com/data/animals').read()

But the browser's same-origin policy (SOP) prevents you from doing the same thing in Javascript unless your script happens to be running on a page from the same domain. The SOP is intended to protect the security of the user by limiting access to private resources such as:

  1. Private browser state like cookies from other domains.
  2. Private web pages that may be only accessible behind firewalls.

While this is a nice thing for protecting logged-in banking sessions and secrets on corporate LANs, it also means that plenty of perfectly safe and useful network code cannot be written with Javascript. You need to do your cross-domain networking server-side, and then bounce your requests off a server in your own domain.
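As a rough illustration of what "same domain" means here, browsers compare the scheme, host, and port of the two URLs. The sketch below assumes an environment that provides the standard URL class (current browsers and Node.js); the helper name sameOrigin is hypothetical.

```javascript
// Hypothetical helper: two URLs share an origin when their scheme,
// host, and port all match. Relies on the standard URL class.
function sameOrigin(a, b) {
  return new URL(a).origin === new URL(b).origin;
}
```

Under this check, a script on a page at http://davidbau.com/ could read http://davidbau.com/data/animals directly, while a script on a page from any other host, or from the https version of the same host, could not.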

JSONP Lets Servers Expose Services Cross-Domain

Interestingly, the "src" attribute of the <script> tag is not subject to the same origin restriction, which means that servers can intentionally expose services cross-domain by encoding their data as a javascript function call.
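A minimal sketch of the convention, with hypothetical names throughout: the server wraps its JSON in a call to a function the client has named, and the client requests that text through a script URL. The helpers wrapAsJsonp and jsonpUrl and the "callback" query parameter are assumptions for illustration; each real service documents its own parameter name.

```javascript
// Server side (hypothetical): wrap data in a call to the client's function.
// The name is checked so that arbitrary script cannot be injected.
function wrapAsJsonp(callbackName, data) {
  if (!/^[A-Za-z_$][\w$]*$/.test(callbackName)) {
    throw new Error('unsafe callback name');
  }
  return callbackName + '(' + JSON.stringify(data) + ');';
}

// Client side (hypothetical): build the URL to use as a <script> src.
function jsonpUrl(base, params, callbackName) {
  var parts = [];
  for (var key in params) {
    if (Object.prototype.hasOwnProperty.call(params, key)) {
      parts.push(encodeURIComponent(key) + '=' +
                 encodeURIComponent(params[key]));
    }
  }
  parts.push('callback=' + encodeURIComponent(callbackName));
  return base + (base.indexOf('?') < 0 ? '?' : '&') + parts.join('&');
}
```

In a browser, the client would define a global function with the chosen name, then append a script element whose src is the value returned by jsonpUrl. When the response loads, it runs as ordinary script and calls that function with the data.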

In 2005 Bob Ippolito proposed standardizing this convention as a format called JSONP. Today, JSONP can be used to make cross-domain calls to various useful APIs including Flickr, GData, Twitter, and YQL. JSONP is incredibly useful and is supported in all the major AJAX libraries such as jQuery.

YQL in particular is very versatile, and it actually allows you to make your own further HTTP requests to other domains. However, it does a lot more and is a bit complicated and slow as a result. So for the specific purpose of making secure cross-domain HTTP calls on the public internet, I have been using a service I've posted on Google App Engine, called call.jsonlib.com.

Cross-Domain Javascript HTTP Requests on the Public Web

The library at call.jsonlib.com lets you get very close to the python one-liner:

jsonlib.fetch('http://davidbau.com/data/animals', function(m) { alert(m.content); });

It is just a simple packaging of the python urllib2 library as a JSONP library. But it is tremendously useful. It allows you to:

  • Make GET or POST requests to arbitrary public servers
  • Set arbitrary HTTP headers
  • Read all the HTTP headers of the response

Because all the requests are made from the context of a server on the public internet, it does not expose your private LAN or private cookies to security holes.

A few corners are smoothed out to make use from javascript easier. For example:

  • jsonlib.fetch is intelligent about encodings and converts the content to unicode based on HTTP headers, content sniffing, and <meta> tags in the HTML content, if any.
  • jsonlib.fetch can also do basic scraping tasks server-side such as extracting elements that match a particular css selector, or stripping HTML markup and returning just text.
  • jsonlib.fetch is careful to access the proxy using https if the underlying URL being scraped is https, so that the whole path is encrypted.
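The encoding step in the first bullet might look roughly like the following. This is an illustrative sketch only, not the service's actual code (which runs server-side in Python). The function name pickCharset and the utf-8 fallback are assumptions, and byte-level content sniffing is left out.

```javascript
// Hypothetical sketch: choose a charset label for an HTML response.
// Preference order: Content-Type header, then a <meta> tag, then a default.
function pickCharset(contentType, html) {
  var fromHeader = /charset\s*=\s*["']?([\w.:-]+)/i.exec(contentType || '');
  if (fromHeader) return fromHeader[1].toLowerCase();
  // Matches both <meta charset="x"> and
  // <meta http-equiv="Content-Type" content="text/html; charset=x">.
  var fromMeta = /<meta[^>]*charset\s*=\s*["']?([\w.:-]+)/i.exec(html || '');
  if (fromMeta) return fromMeta[1].toLowerCase();
  return 'utf-8'; // assumed default when nothing is declared
}
```

The chosen label would then be used to decode the response bytes into unicode before the content is handed back to the caller.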

More documentation here.

Some examples here.

Posted by David at 06:47 AM | Comments (4)

January 30, 2011

Using goo.gl with jsonlib

An example using jsonlib to invoke an inconvenient HTTP API from Javascript.

When Google released the goo.gl URL shortener API recently, they did not provide a JSONP API, so it is not convenient to make goo.gl URLs directly from Javascript. But then I wanted to implement a goo.gl-based "Save As Short Url" feature in the save box of Heidi's Sudoku Chrome App. The app is 100% Javascript and isn't even hosted on its own web server. What to do?

It is easy to do cross-domain HTTP by bouncing JSONP requests off call.jsonlib.com. With jsonlib, invoking the goo.gl API is just a few lines of code.

How To Do It

Here is how the goo.gl API works:

  1. You assemble a JSON string containing your long URL in the 'longUrl' field.
  2. This string must be POSTed to the googleapis server, with Content-Type "application/json".
  3. The response is JSON that will have your short URL in the 'id' field.

In Javascript, it is inconvenient to POST data that is not formatted as "application/x-www-form-urlencoded", and it is hard to make cross-domain HTTP requests to services that do not return script-formatted JSONP responses. However, call.jsonlib.com solves both of these problems easily.

Here is the code for a goo.gl shortener wrapper that shows how it can be done with a single call to jsonlib.fetch:

<script src="http://call.jsonlib.com/jsonlib.js"></script>
<script>
function googlurl(url, cb) {
  jsonlib.fetch({
    url: 'https://www.googleapis.com/urlshortener/v1/url',
    header: 'Content-Type: application/json',
    data: JSON.stringify({longUrl: url})
  }, function (m) {
    var result = null;
    try {
      result = JSON.parse(m.content).id;
      if (typeof result != 'string') result = null;
    } catch (e) {
      result = null;
    }
    cb(result);
  });
}
// Make a short URL for a nicely written book.
googlurl('http://www.amazon.com/Numerical-Linear-Algebra-Lloyd-'
       + 'Trefethen/dp/0898713617', function(s) { alert(s); });
</script>

Voila! Short URLs in a single function call.

Play with the code here if you like.

Posted by David at 11:53 AM | Comments (1)