Friday, November 26, 2010

Building an open, accessible and event driven Internet

First off, I just want to give my view of the Internet. I believe there are two reasons why the Internet is successful, or rather why it hasn't yet become just another failed or obsolete technology:

  1. HTTP is a brilliantly engineered protocol that has proven successful throughout the history of the Internet (albeit a short history).
  2. The enormous amount of careful and hard work that went into developing HTML and controlling 'web standards'.

However, thanks to Microsoft's continual attempts to derail and dissolve web standards, I believe we are 10 years behind where we could have been today. Fortunately, the appeal of an 'open' and 'standardised' Internet was so overwhelming that even the mighty Microsoft now finds itself at a disadvantage, having chosen a negative and destructive policy rather than assuming a supportive, innovation-led role.

Rather than an attack on Microsoft, I would cite this as an example of how little tolerance the Internet has toward anyone attempting to control or bend it for their own profit. In comparison, Google took the opposite view: by helping the Internet become a 'better place', they have become an extremely successful and profitable company and continue to live up to their "Don't be evil" motto.

With regard to Facebook and the controversy that surrounds privacy policies, the 'evil empire' and so on: I wouldn't worry too much. It would take 30 seconds to register with an alternative social network if people weren't happy with Facebook. We should also remember that over ten years ago we were all signing up to 'friendsreunited' and 'myspace', so gated online communities are not new; after all, a social network is just an elaborate forum. I say this because I do not like the idea of building an Internet on top of the Internet. I believe it defeats the purpose of the Internet and is largely pointless, as HTTP is a well-known, well-understood, powerful and incredibly prolific protocol. I had always thought Google Wave was doomed to failure, not because it wasn't an impressive technology, but because it simply didn't seem to offer anything new, except a big learning curve.

One of the main motivations behind Femtoo was to make a positive move toward a more open, accessible, event-driven Internet. I'll define these terms:
  • Open - Allowing access to lots of data
  • Accessible - Providing easy, convenient, standard and well-documented access to structured content
  • Event-driven - The ability for web applications to communicate asynchronously with each other via a system of events and callbacks

Why not add 'real-time' to the list? Because the Internet has always been 'real-time'. If I were to publish changes to this blog post, those changes would be reflected on the Internet instantly. The parts that are generally not real-time are things like indexed search results, but the 'Internet' cannot be blamed for this!

Some core concepts behind Femtoo could be summarized with these sentences:

  • Treat the Internet as a rich, accessible database
  • Treat a change in Internet content as a single event
  • Reference individual parts of a page with a unique URL (not just the entire page)
  • Treat the Internet as a single enormous 'workflow' engine.

OK, but Femtoo is just a webpage tracking and notification system, right? Yes. But I believe it is also an interesting step toward facilitating all of the objectives listed above.

Femtoo allows you to pull out a single piece of data from within a webpage. This single piece of data can then be directly referenced via a single URL; for example, this URL always references the latest jQuery version (parsed as a number).

When you create a tracker in Femtoo, you have the option of specifying an HTTP Callback URL. When the tracker detects a change in content (for example a new jQuery version), Femtoo effectively creates an event and fires a pre-configured 'callback' along with the details of the content change.
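To make the callback idea concrete, here is a minimal sketch of a 'listener' that could sit behind a Callback URL. It is only an illustration: the JSON body and its field names (tracker, old_value, new_value) are assumptions made for this example, not Femtoo's documented payload, and the port is arbitrary.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_change_event(body: bytes) -> dict:
    """Decode a callback body into a simple change event.

    The JSON layout and field names are assumptions for this sketch.
    """
    event = json.loads(body.decode("utf-8"))
    return {
        "tracker": event.get("tracker"),
        "old": event.get("old_value"),
        "new": event.get("new_value"),
    }


class CallbackHandler(BaseHTTPRequestHandler):
    """Accept POSTed change events and acknowledge them."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        event = parse_change_event(self.rfile.read(length))
        print("content changed:", event)
        # 204 No Content: the event was received, nothing to send back.
        self.send_response(204)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Run the listener until interrupted (hypothetical port)."""
    HTTPServer(("localhost", port), CallbackHandler).serve_forever()
```

Calling serve() starts the listener; any service that can accept an HTTP request can play this role, which is what makes a plain callback URL such a convenient way to pass events around.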

Because you can track pretty much any textual content with Femtoo, you can point Femtoo at another web service, monitor the results and fire an event to a 'listener' web service. In this case, Femtoo would almost be acting like the 'controller' part of an MVC architecture, but on the scale of the Internet. Furthermore, this process could be one part of a series of 'inter-service' events that constitute an 'Internet Workflow Engine'.

At the moment, most websites are just a collection of static webpages that contain unstructured content. For this reason, Femtoo has to 'poll' webpages to 'artificially' fire events, which is far from ideal. However, if websites were to become 'service friendly' and actively fire events upon content change, then the Internet could start to get really interesting...
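The polling approach can be sketched in a few lines. This is a simplified illustration under my own assumptions, not Femtoo's actual implementation: fetch is any function that returns the tracked content as text, on_change stands in for firing the callback, and the five-minute interval is arbitrary.

```python
import hashlib
import time
from typing import Callable, Optional


def fingerprint(content: str) -> str:
    """Reduce the content to a short hash so only the hash needs storing."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def poll_once(
    fetch: Callable[[], str],
    last: Optional[str],
    on_change: Callable[[str], None],
) -> str:
    """Fetch the content once and fire on_change if it differs from last time.

    Returns the new fingerprint, to be passed in as `last` on the next poll.
    The very first poll (last is None) only records a baseline.
    """
    content = fetch()
    current = fingerprint(content)
    if last is not None and current != last:
        on_change(content)
    return current


def poll_forever(
    fetch: Callable[[], str],
    on_change: Callable[[str], None],
    interval_seconds: float = 300.0,
) -> None:
    """Keep polling at a fixed interval, 'artificially' producing events."""
    last = None
    while True:
        last = poll_once(fetch, last, on_change)
        time.sleep(interval_seconds)
```

The weakness is visible in poll_forever: a change is only noticed at the next poll, and every poll costs a request whether or not anything changed. A site that fired its own events would remove both problems.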

Thursday, September 23, 2010

Thanks Google - Google Reader to stop change tracking - I said it was harder than it looks!

Today on the official Google Reader blog it was announced that Google will withdraw their change tracking feature.

This is good news for Femtoo.com - the service that provides possibly the most sophisticated webpage change tracking and notification system on the web.

As I say, tracking webpage changes is more difficult than it may first appear - and although Google has given it a good shot, it would seem that some things are best left to the experts :-)

If you want to be notified of specific changes to webpages, try Femtoo.com - it's free, powerful and very cool!

Tuesday, March 2, 2010

CSS Selectors: Real world examples

1. BBC News Headline

Website: http://news.bbc.co.uk/
CSS Selector: A.tsh,A.tshsplash

2. Price of an Amazon.com item

Example 1: http://www.amazon.com/Toshiba-Satellite-L505-GS5037-TruBrite-15-6-Inch/dp/B0030INLSW/ref=dp_ob_title_ce?ie=UTF8&qid=1267628403&sr=1-2
Example 2: http://www.amazon.com/Kindle-Wireless-Reading-Display-Generation/dp/B0015T963C/ref=dp_ob_title_def
CSS Selector: B.priceLarge

3. Item count of an eBay auction search for "ferrari"

Website: http://motors.shop.ebay.co.uk/Cars-/9801/i.html?_nkw=ferrari&_catref=1&_fln=1&_trksid=p3286.c0.m282
CSS Selector: SPAN.countClass

CSS Selectors: Cheat Sheet

'Class' selectors

.headline - select every element that has the 'headline' class.

'Tag' selectors

p - select every paragraph on the page.
a - select every link/anchor on the page.
h1 - select every Heading 1 tag on the page.

'Id' selectors

#item-price - select the element that has the 'id' "item-price".

'Nth element' selectors

p:eq(0) - select only the first paragraph.
a:eq(4) - select the fifth 'anchor' tag on the page.
.menu-item:eq(3) - select only the fourth element that has the 'menu-item' class.

Combined selectors

h1, p - select every h1 tag AND every paragraph tag.
h3.subtitle - select every h3 tag that contains a 'subtitle' class.
h3:eq(0), a.news-link - select the first h3 tag AND every anchor tag that has the 'news-link' class.
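To show how the selectors in this cheat sheet behave, here is a small sketch that evaluates them against a made-up page. It is an illustration only: the PAGE data and the select function are hypothetical, elements are modelled as plain dictionaries rather than parsed HTML, and only the forms listed above are supported. Note that :eq(n) counts from zero and is a jQuery-style extension rather than standard CSS.

```python
import re

# A made-up page: each element is a dict instead of real parsed HTML.
PAGE = [
    {"tag": "h1", "id": "", "classes": [], "text": "Site title"},
    {"tag": "h3", "id": "", "classes": ["subtitle"], "text": "Latest news"},
    {"tag": "a", "id": "", "classes": ["news-link"], "text": "Story one"},
    {"tag": "p", "id": "item-price", "classes": [], "text": "9.99"},
    {"tag": "a", "id": "", "classes": ["news-link"], "text": "Story two"},
]

# Optional tag, then any number of .class / #id parts, then optional :eq(n).
SIMPLE = re.compile(
    r"^(?P<tag>[a-zA-Z][a-zA-Z0-9]*)?"
    r"(?P<rest>(?:[.#][\w-]+)*)"
    r"(?::eq\((?P<eq>\d+)\))?$"
)


def select_one(elements, selector):
    """Match one selector such as 'p', '.headline', 'h3.subtitle' or 'a:eq(4)'."""
    m = SIMPLE.match(selector.strip())
    if m is None:
        raise ValueError(f"unsupported selector: {selector!r}")
    tag = m.group("tag")
    parts = re.findall(r"([.#])([\w-]+)", m.group("rest"))
    matched = []
    for el in elements:
        if tag and el["tag"] != tag.lower():
            continue
        if all(
            el["id"] == name if kind == "#" else name in el["classes"]
            for kind, name in parts
        ):
            matched.append(el)
    if m.group("eq") is not None:
        index = int(m.group("eq"))
        matched = matched[index:index + 1]  # zero-based; empty if out of range
    return matched


def select(elements, selector):
    """Match a comma-separated group and return hits in page order."""
    hits = set()
    for part in selector.split(","):
        hits.update(id(el) for el in select_one(elements, part))
    return [el for el in elements if id(el) in hits]
```

For example, select(PAGE, "h3:eq(0), a.news-link") returns the first h3 followed by both news links, in the order they appear on the page.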

CSS Selectors: The Basics

This post will demonstrate ways to use 'CSS Selectors' (http://www.w3.org/TR/CSS2/selector.html) to target the parts of a webpage that you wish to track.

There are three basic ways of identifying parts of a webpage using CSS selectors:

1. Element 'Tag'
2. Element 'ID'
3. Element 'Class'

Let's give a quick practical example of each of the three ways of identifying content:

1. 'Tag' - a tag is the name given to the type of structural element. For example, in a typical webpage 'normal' text is often placed within 'paragraph' tags like this:

<p>This is the first paragraph</p>
<p>This is the second paragraph</p>

If we wanted to select all of the paragraphs in a webpage, our 'CSS Selector' would simply be p.

Other tags work the same way. For example, to select the following heading, our selector would be h1:
<h1>foo</h1>

2. 'ID' - Every structural tag in a webpage can have an optional 'id' attribute. Within a single webpage no two elements should have the same 'id'. Therefore an 'id' should point directly to a single element somewhere in the page. For example:

<p>here is a paragraph of text</p>
<p id="error-message">This is an error message!</p>

If we just want to extract the error message our selector would simply be #error-message. The '#' symbol is important and tells Femtoo that we are looking for an id.

3. 'Class' - Every webpage tag can contain a 'class'. Actually, they can contain as many classes as you want. Here is an example of how classes might be used in a navigation menu in a webpage:

<ul id="menu">
<li class="menu-item"><a href="#">Products</a></li>
<li class="menu-item selected"><a href="#">Services</a></li>
<li class="menu-item"><a href="#">Contact Us</a></li>
</ul>

If we wanted to extract the text of the currently selected menu item, our selector could be .selected... well, sort of. Using .selected on its own might not be specific enough: it would select the content of every single element on the page that has the 'selected' class. In our example, a better selector might be .menu-item.selected, which means retrieve all elements that have both the 'selected' class and the 'menu-item' class. We can be even more specific with #menu .menu-item.selected. By using the #menu selector we ensure that we will not accidentally select other items elsewhere in the page that also have the classes 'menu-item' and 'selected'.
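The difference in specificity can be checked with a short sketch using only Python's standard library. This is not how Femtoo evaluates selectors; the ClassTextCollector class and select_text function are hypothetical helpers that cover just the class and '#id descendant' cases discussed here, and the extra paragraph in MENU_HTML is added purely to show an accidental match.

```python
from html.parser import HTMLParser

MENU_HTML = """
<ul id="menu">
<li class="menu-item"><a href="#">Products</a></li>
<li class="menu-item selected"><a href="#">Services</a></li>
<li class="menu-item"><a href="#">Contact Us</a></li>
</ul>
<p class="selected">Unrelated element that also uses the 'selected' class</p>
"""

# Elements that never have a closing tag, so they must not go on the stack.
VOID = {"br", "hr", "img", "input", "link", "meta"}


class ClassTextCollector(HTMLParser):
    """Collect the text of elements that carry all the required classes,
    optionally only when they sit inside an ancestor with the given id."""

    def __init__(self, required_classes, ancestor_id=None):
        super().__init__()
        self.required = set(required_classes)
        self.ancestor_id = ancestor_id
        self.stack = []    # one (id, result index or None) per open element
        self.results = []  # collected text, one entry per matching element

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        attrs = dict(attrs)
        classes = set((attrs.get("class") or "").split())
        inside = self.ancestor_id is None or any(
            elem_id == self.ancestor_id for elem_id, _ in self.stack
        )
        index = None
        if inside and self.required <= classes:
            index = len(self.results)
            self.results.append("")
        self.stack.append((attrs.get("id"), index))

    def handle_endtag(self, tag):
        if tag not in VOID and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        # Text belongs to every matching element that is still open.
        for _, index in self.stack:
            if index is not None:
                self.results[index] += data


def select_text(html, required_classes, ancestor_id=None):
    """Return the stripped text of each matching element, in page order."""
    parser = ClassTextCollector(required_classes, ancestor_id)
    parser.feed(html)
    parser.close()
    return [text.strip() for text in parser.results]
```

Here select_text(MENU_HTML, ["selected"]) plays the role of .selected and also picks up the unrelated paragraph, while select_text(MENU_HTML, ["menu-item", "selected"], ancestor_id="menu") plays the role of #menu .menu-item.selected and returns only the menu entry. The sketch assumes reasonably well-formed markup in which every non-void tag is closed.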

Friday, February 5, 2010

Femtoo: New Feature: Manage your tracker subscriptions...

Hi all,

This is a 'public' feature, so people who have subscribed to trackers via the 'Track this page' button can now view their subscriptions and unsubscribe from whichever ones they wish.

The new page is located here: http://femtoo.com/subscriptions/.

Keep on tracking!

Tuesday, January 26, 2010

The race for content tracking - Femtoo vs Google

This recent ReadWriteWeb article highlights the growing importance of content extraction, tracking and notification:

http://www.readwriteweb.com/archives/google_reader_can_now_track_changes_on_any_web_pag.php#comment-183234

The official Google Blog post is here:

Follow changes to any website

Whilst the recently added site tracking feature of Google Reader is a step towards Femtoo's goal of 'Internet as a database' - it is clear that Google Reader is a far less sophisticated and generally less useful solution when compared with Femtoo.