Automatic embedding of ‘stuff’ with oembed

One fun thing about reimplementing your blogging workflow is that you learn all kinds of new stuff. While creating my last list of favourite tweets, it quickly became clear that pasting the ‘embed this tweet’ code from the Twitter website for more than a dozen tweets is nothing I want to do on a regular basis. So I searched for a way to generate the embed code automatically from the URLs of the tweets and found out there already is a standard for this: oembed. It’s an API definition that’s implemented by various sites like YouTube, Flickr or Twitter and returns metadata as well as ready-to-use code for embedding videos, photos, or tweets. An example exchange looks like this (result indented for better readability):

kheymann@corax:~$ curl -k "https://api.twitter.com/1/statuses/oembed.xml?url=https%3A//twitter.com/WilliamShatner/status/286910551899127808"

<?xml version="1.0" encoding="UTF-8"?>
<oembed>
  <type>rich</type>
  <author_name>William Shatner</author_name>
  <cache_age>31536000000</cache_age>
  <version>1.0</version>
  <width>550</width>
  <height/>
  <html>&lt;blockquote class="twitter-tweet" lang="de"&gt;&lt;p&gt;@&lt;a href="https://twitter.com/cmdr_hadfield"&gt;cmdr_hadfield&lt;/a&gt; Are you tweeting from space? MBB&lt;/p&gt;&amp;mdash; William Shatner (@WilliamShatner) &lt;a href="https://twitter.com/WilliamShatner/status/286910551899127808"&gt;3. Januar 2013&lt;/a&gt;&lt;/blockquote&gt;&lt;script async src="//platform.twitter.com/widgets.js" charset="utf-8"&gt;&lt;/script&gt;</html>
  <provider_name>Twitter</provider_name>
  <url>https://twitter.com/WilliamShatner/status/286910551899127808</url>
  <provider_url>http://twitter.com</provider_url>
  <author_url>https://twitter.com/WilliamShatner</author_url>
</oembed>
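
For illustration, here is a minimal Python sketch of the same exchange, using only the standard library. It builds the request URL and turns an XML response into a dict, but makes no network request. The function names `build_oembed_url` and `parse_oembed_xml` are my own, and `SAMPLE` is a trimmed copy of the response shown above.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def build_oembed_url(endpoint, content_url, **extra):
    """Return the request URL for an oembed endpoint and a content URL.

    Extra keyword arguments (for example maxwidth=400) are passed along
    as additional query parameters.
    """
    params = {"url": content_url}
    params.update(extra)
    return endpoint + "?" + urlencode(params)


def parse_oembed_xml(xml_text):
    """Turn an oembed XML response into a flat dict of tag -> text.

    Empty elements such as <height/> become empty strings.
    """
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "") for child in root}


# Trimmed copy of the response above (assumption: the html field is left out).
SAMPLE = """<oembed>
  <type>rich</type>
  <author_name>William Shatner</author_name>
  <width>550</width>
  <height/>
  <provider_name>Twitter</provider_name>
</oembed>"""

if __name__ == "__main__":
    print(build_oembed_url(
        "https://api.twitter.com/1/statuses/oembed.xml",
        "https://twitter.com/WilliamShatner/status/286910551899127808",
    ))
    print(parse_oembed_xml(SAMPLE)["author_name"])
```

In a real script, the `html` entry of the resulting dict would be the piece to paste into the blog post.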

The request above feeds the tweet’s URL to Twitter’s oembed API, which returns an oembed XML structure of type ‘rich’ with lots of metadata and the HTML code for embedding the tweet. This behaviour is the same for all oembed providers, so the only thing needed to embed a given URL is the oembed API endpoint of the service it belongs to. To find that endpoint, three alternatives currently exist:

  1. Use an embedding aggregation service. The main one seems to be embed.ly, which knows about a huge amount of embeddable content and is free for up to 10,000 requests per month.
  2. Use the oembed autodiscovery method. According to the oembed specification, services that support it can add special link elements to the head of their pages, pointing to the oembed endpoint for that content. For the tweet mentioned above, these elements look like this:
<link rel="alternate" type="application/json+oembed" href="https://api.twitter.com/1/statuses/oembed.json?id=286910551899127808" title="Twitter / WilliamShatner: @Cmdr_Hadfield Are you tweeting ...">
<link rel="alternate" type="text/xml+oembed" href="https://api.twitter.com/1/statuses/oembed.xml?id=286910551899127808" title="Twitter / WilliamShatner: @Cmdr_Hadfield Are you tweeting ...">
  3. Keep your own list. If you know which services you will embed, you can simply hard-code the mapping of URL patterns to oembed endpoints.
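
Alternatives 2 and 3 can be sketched in a few lines of Python with the standard library. This is only a rough illustration, not a finished implementation: the names `ENDPOINTS`, `endpoint_for`, `OEmbedLinkFinder` and `discover` are made up for this post, the URL pattern is my own guess at what tweet URLs look like, and only the Twitter endpoint is taken from the example above.

```python
import re
from html.parser import HTMLParser

# Alternative 3: a hand-maintained list of (URL pattern, oembed endpoint).
# The pattern is an assumption; the endpoint is the one used above.
ENDPOINTS = [
    (
        re.compile(r"https?://twitter\.com/[^/]+/status/\d+"),
        "https://api.twitter.com/1/statuses/oembed.xml",
    ),
]


def endpoint_for(url):
    """Return the endpoint of the first matching pattern, or None."""
    for pattern, endpoint in ENDPOINTS:
        if pattern.match(url):
            return endpoint
    return None


# Alternative 2: autodiscovery via <link rel="alternate"> elements.
class OEmbedLinkFinder(HTMLParser):
    """Collect oembed links from a page as a dict of MIME type -> href."""

    def __init__(self):
        super().__init__()
        self.links = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attrs = dict(attrs)
        rel = (attrs.get("rel") or "").lower()
        mime = (attrs.get("type") or "").lower()
        # Matches both application/json+oembed and text/xml+oembed.
        if rel == "alternate" and mime.endswith("+oembed") and attrs.get("href"):
            self.links[mime] = attrs["href"]


def discover(html_text):
    """Return all oembed links found in the given HTML source."""
    finder = OEmbedLinkFinder()
    finder.feed(html_text)
    return finder.links
```

Calling `discover()` on the page source of the tweet above should yield one entry for the JSON variant and one for the XML variant, while `endpoint_for()` needs no page download at all. A plugin could try the local list first and fall back to autodiscovery for unknown services.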

As I’d like to use oembed in this blog, I’m thinking about writing a pelican plugin for oembed. Currently I’m investigating the micawber Python library, and the pelican-latex plugin as an example of how to write plugins for pelican. If I get around to actually implementing it, I will cover it in a later blog post.
