Random Etc. Notes to self. Work, play, and the rest.

Posts Tagged ‘Visualisation’

Plimpplampplettere

Because Mike is right, radii are beautiful, I cooked up a little sketch I'd been thinking about to show connections between the tags in my del.icio.us links.

Posts are spread along the time axis, left to right, and a circle is drawn connecting posts which share tags. Each circle is drawn with a low alpha value, so darker circles mean that more than one tag is shared. It's not that meaningful, save for showing that I use more tags per post than when I started using del.icio.us (nothing a quick graph wouldn't show too), but it's a pleasing pattern nevertheless.

Plimpplampplettere, by the way, is almost certainly made up, and not really the Dutch word for "skimming stones" (Thanks to Ron for clarification).

Cheese, Stinky… Challenging, Emotion

My friend Ben launched his project Holy Shallot at Dorkbot London last night. The product of a year's cookery school, a life-time's obsessing and a thoroughly interconnected brain, Holy Shallot graphs food relationships across ingredients, flavours, emotions, cooking methods and seasonality.

It's a great piece of work, and - whilst it's one that I suspect will never truly be finished - it works, it's here now, and it's definitely worth a look.

Flickr Favourites Tree

favourites tree?

Work in progress.

averyveryverylongcadillac

Information spaces meet physical spaces in Searchscapes Manhattan. A quick browse through the inspiration behind the project is well worth the effort, but I'm surprised the video for Alex Gopher's The Child isn't there.

If anyone has a copy of the Vodafone advert done in the same style, I'd be interested in getting hold of it.

Also: Behind you! Bees! Or a pantomime dame! is back. Hooray for that.

Half-baked thoughts on Social Network Visualisation for Flickr

Sites like Flickr (Friendster, Orkut, Tribe, MySpace, etc.) are collecting masses of data about people. Person A knows Person B, B knows C and D, D knows E, who knows A, and so on. If you're into your discrete maths (or your systems thinking), you'll know that the formalisation of these kinds of connections is called Graph Theory. If not, then bear in mind that in graph theory a Graph is composed of Nodes (or points) and Edges (or connections). If A,B,C,D,E are nodes, then "A knows B" is an edge between A and B. (Or if Flickr is a pair of shoes, A,B,C,D,E are the eyelets, and "A knows B" is a shoelace.)
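For the programmers, the same example can be written down as data. This is only a minimal Python sketch: the names `knows`, `nodes` and `edges` are made up for illustration, and treating "knows" as a one-way relationship is an assumption (it matches how contacts are listed, but you could just as easily store each edge in both directions).

```python
# Adjacency sets for the example in the text:
# A knows B, B knows C and D, D knows E, and E knows A.
# Assumption: "knows" is stored as a directed relationship.
knows = {
    "A": {"B"},
    "B": {"C", "D"},
    "C": set(),
    "D": {"E"},
    "E": {"A"},
}

# Nodes are the people; edges are the (who, knows whom) pairs.
nodes = sorted(knows)
edges = sorted((a, b) for a, targets in knows.items() for b in targets)

print(nodes)
print(edges)
```

Five nodes and five edges come out, which is all a graph really is: a list of eyelets and a list of shoelaces.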

My first example can be drawn in 2D like so:

Plotting these relationships in 2D or 3D space is known as graph embedding, and finding useful ways to interrogate them is an intriguing problem to think about. Indeed, it's a research field in its own right. Graphs are well understood data structures and many tools are available to manipulate and analyse them (e.g. GraphViz). This is partly because they are the cornerstone of many computer science problems (and solutions), and partly because they can be used across many different disciplines. The ubiquitous application of graph theoretical methods across many scientific disciplines is the main subject of recent popular science books such as Linked and Ubiquity.

Social Network Visualisation tends to borrow heavily, if not totally, from graph theory. Having looked at several examples of this over the past year, I've spotted some common pitfalls which I'll try and articulate here.

Graph Visualisation Pitfalls

On Flickr, we can easily find out if A knows B by looking to see if A has listed B as a contact. But the listing of contacts is tied up with all sorts of other practical considerations. Unfortunately, the main reason A adds B as a contact isn't so that we can use that data for social network analysis. On Flickr, a contact can mean that A knows B, or A likes B's photos, or A chatted to B in the forums and might want to find them again, or B added A (for any of these reasons) so A was being polite and reciprocated, and so on. Not all connections are made equal.

In reality A might have anywhere from 0 to 500 contacts. As the average number of contacts creeps upwards, the naive attempt to draw the graph falls over. My ASCII art attempt would be screwed as soon as there was a group of five mutual acquaintances. Even with the assistance of mature software like GraphViz, the network of X knows Y is too dense to draw clearly. The not-a-tree problem is faced by many network visualisations. There are probably too many connections to graph.

For many of us, it's intuitive to visualise these relationships as a tree. This cluster is connected to that cluster, and people are either in one cluster or another. Clusters probably have sub-clusters. We can handle these kinds of relationships easily, but unfortunately they fall down straight away when presented with real data. For example, grouping people by handedness is a trivial example of something which generates overlapping sets. Imagine A and B are left-handed, D and E are right-handed, but C is ambidextrous. We aren't dealing with hierarchical clusters, we're dealing with overlapping sets.
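The handedness example is small enough to check by hand, but here it is as a minimal Python sketch. The variable names are made up for illustration, and putting the ambidextrous C into both sets is an assumption about how you'd choose to model it.

```python
# Handedness groups from the example above.
# Assumption: ambidextrous C is counted in both groups.
left_handed = {"A", "B", "C"}
right_handed = {"C", "D", "E"}

# A tree of clusters needs sibling groups to be disjoint.
overlap = left_handed & right_handed
fits_in_a_tree = not overlap

print(overlap)
print(fits_in_a_tree)
```

As soon as `overlap` is non-empty, C has two parents, and the structure is no longer a tree.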

A knows B, B knows C, C knows D. What's the connection between A and D? If we're analysing a terrorist network, then we might have found a potential link which is worth investigating further. But if we're trying to recommend photography or music or web-links to A, should we include D's tastes? If D is C's drug dealer, and B is C's little sister, and A is C's elderly next door neighbour? Probably not. Connections aren't transitive.

Show Me The Eye Candy Anyway

So assume we're aware of the above caveats, and we have a densely but ambiguously connected graph of contacts. The main task with this kind of visualisation, once you have the data, is how to display it in a readable format. It's almost certainly too much data to just throw out there (but it's always worth a try), so how do we prune it down to show only the meaningful stuff?

On Flickr, GustavoG has been busy producing graphs of the mutual contacts and testimonials networks. You can see them all in his FlickrLand set. These are interesting to examine for many reasons, not least in how he has pruned the network in order to get manageable visualisations from it. The whole social network on Flickr would be too big to show in detail, so Gustavo doesn't show it all. He rightly spots that testimonials should indicate fairly strong ties, and the network is much sparser than the contacts network. He has also attempted to trim the contacts network down by requiring that contacts be mutual for a graph edge to exist, and he's tried different thresholds for how many mutual contacts a person must have before they are added to the graph.
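The two pruning steps described above (mutual contacts only, then a minimum number of mutual contacts per person) can be sketched in a few lines of Python. This is not Gustavo's code: the function names `mutual_edges` and `prune_by_degree`, the sample data, and the choice to apply the threshold in a single pass over the mutual graph are all assumptions made for illustration.

```python
def mutual_edges(contacts):
    """Keep an undirected edge only where each person lists the other."""
    edges = set()
    for person, listed in contacts.items():
        for other in listed:
            if person != other and person in contacts.get(other, set()):
                edges.add(frozenset((person, other)))
    return edges


def prune_by_degree(edges, min_mutual):
    """Drop people with fewer than min_mutual mutual contacts.

    Assumption: the threshold is applied once, using counts from the
    unpruned mutual graph, rather than repeated until nothing changes.
    """
    degree = {}
    for edge in edges:
        for person in edge:
            degree[person] = degree.get(person, 0) + 1
    keep = {p for p, d in degree.items() if d >= min_mutual}
    return {e for e in edges if e <= keep}


# Made-up contact lists, purely for illustration.
contacts = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"E"},  # D does not list A back, so A-D is not a mutual edge
    "E": {"D"},
}

edges = mutual_edges(contacts)
core = prune_by_degree(edges, min_mutual=2)
print(sorted(sorted(e) for e in core))
```

With this toy data, the mutual graph has four edges (A-B, A-C, B-C and D-E), and a threshold of two mutual contacts leaves only the A, B, C triangle. Raising `min_mutual` plays the same role as moving from the 10 to the 50, 100 or 200 mutual-contact graphs.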

Gustavo has used yEd, a Java graph layout program, to produce graphs using the "organic layout", and it works pretty well. In particular, in certain graphs there are undeniably meaningful clusters to be found, ones that expert users can spot straight away. At 50 mutual contacts and 100 mutual contacts in particular, the clusters are pronounced. At 10 mutual contacts the network is too dense to be very meaningful - certainly as it's presented at the moment it doesn't say much at all. At 200 mutual contacts, it tells us what many regulars to Flickr already know - there are only a few very well connected folks, and they mostly know each other. Because testimonials indicate stronger ties, the overall network is very fragmented, leaving many loose mini-clusters. Nevertheless, the overall picture is surprisingly well knitted together.

FlickrGraph

Gustavo's method for making the data manageable is to remove nodes which aren't significant in the overall picture. For instance, in mapping out clusters of users on Flickr, it's probably not a big deal to lose people with fewer than 10 mutual contacts. Marcos Weskamp recently used a different tactic to cut down the same data - remove edges, and only show a subset (window) of nodes at any one time.

Marcos's FlickrGraph is fantastic to play with - the interactivity and clean design help there - but unfortunately it is less meaningful than Gustavo's static graphs.

The FlickrGraph suffers from its data source, because the current Flickr API will always return contacts sorted by user-id (an essentially arbitrary number). Because of this, and the limit of ten contacts shown, a lot of popular people won't appear to be connected to anyone despite the fact that they are.
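To see why an ID-sorted list plus a cut-off of ten hides connections, here is a minimal Python sketch. It doesn't call the Flickr API: the function `first_n_contacts`, the integer user IDs and the contact list are all hypothetical, chosen only to show the effect.

```python
def first_n_contacts(contact_ids, limit=10):
    """Return the contacts that survive if only the first `limit`
    entries of an ID-sorted list are shown."""
    return sorted(contact_ids)[:limit]


# Hypothetical numeric user IDs: twelve early sign-ups plus one
# popular user who happens to have a high ID.
popular_user = 5000
my_contacts = list(range(100, 112)) + [popular_user]

shown = first_n_contacts(my_contacts)
print(shown)
print(popular_user in shown)
```

The popular user is a real contact, but never makes it into the window of ten, so no edge to them is drawn. Sorting by something meaningful before truncating would at least make the omissions less arbitrary.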

Ways To Improve These Visualisations

Wanted: Richer Data.

We already know how to get meaningful visualisations from our data. We have to start with meaningful data. Marcos Weskamp has demonstrated a neat way of graphing mailing list interactions with his Social Circles project. In my opinion, the FlickrGraph lacks some of the insight that Social Circles provides. This is partly due to technical implementation, but mainly because Social Circles is sourced from real interactions and implicit connections, not watered-down explicit declarations of interest. Flickr users like HyperBob are very active in the forums and comments, but don't keep contacts on Flickr. Social Circles-style data would capture that.

On Flickr, there are several implicit contacts networks we could use:

The idea here is that we should be visualising the actual social things people are doing, rather than the social acts they say they are doing. I also touched upon these ideas in a post about Architects, Social Networks and Hypertext. But enough of my ranting, I'm off to try one of these methods.

Extispicious… and Delicious

Originally uploaded by Just_Tom.

Weighted Links as Positive Feedback

There are a couple of posts over at Signal vs Noise about weighted lists using font size and background colour (another one for Widgetopia, I think), as recently seen on Flickr, we make money not art, craigslist, 43 things and del.icio.us, amongst others.

I like this technique wherever there isn't an obvious feedback loop involved, but as soon as the size of the link drives its popularity which then drives its size, I think the utility falls down. This is illustrated best on 43 things, where you have to try really hard to see past the 2 or 3 most popular links.

On Flickr, the technique is more effective for two reasons - the font size change is more subtle, and the popularity of a tag doesn't usually mean that it will attract more photos.
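For anyone who hasn't built one of these weighted lists, the basic mapping from popularity to font size might look like the Python sketch below. This is not how Flickr or any of the sites above actually do it: the function `font_size`, the log scale, the pixel range and the tag counts are all assumptions made for illustration. Counts are assumed to be at least 1.

```python
import math


def font_size(count, min_count, max_count, smallest=12, largest=20):
    """Map a usage count to a font size in pixels using a log scale."""
    if max_count <= min_count:
        return smallest
    span = math.log(max_count) - math.log(min_count)
    weight = (math.log(count) - math.log(min_count)) / span
    return round(smallest + weight * (largest - smallest))


# Made-up tag counts, purely for illustration.
tag_counts = {"london": 400, "cat": 40, "processing": 4}
low, high = min(tag_counts.values()), max(tag_counts.values())
sizes = {tag: font_size(n, low, high) for tag, n in tag_counts.items()}
print(sizes)
```

A narrow range like 12 to 20 pixels keeps the size change subtle. Widening the range, or dropping the log scale, makes the most popular links shout louder, which is exactly where the feedback loop starts to bite.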

Everything2 has had weighted links for a while in the form of the softlinks which appear at the bottom of each article. (Softlinks are generated when someone follows a link from within an article, or if someone searches for something from the article page).

There are two variations: the regular view shows related articles - the lighter the link the more well-travelled the connection. There is also a 'chaos' view which sizes links according to how well-travelled they are, and shows everything associated with the article in question.


Softlinks are great because they allow anyone to link to relevant articles within Everything2, and illustrate what people are actually thinking about after reading an article. They're bad because they can easily be abused to cause offence, and mean ones tend to hang around once they're there - a sure sign that first arrivals get an unfair advantage and that positive feedback is tilting the odds in favour of the already-popular.

Note I'm not saying that rich-get-richer systems are always undesirable. They certainly promote stability, but positive feedback on lists isn't a great way to promote new-ness and I think that's a trap that some site designers fall into when using charts as an entry point for discovering content.

Plasticbag Visualisation

Last week Tom Coates asked for people to have a go at visualising his 5 years of weblog posts. Looking for any excuse to help with the beta-testing of the Processing environment, I gladly obliged.

Today it turns out that apart from Cal Henderson (a close friend of Tom's, who used the opportunity to take the piss a bit) nobody else contributed. How embarrassing! Nevertheless, it was an interesting diversion for a couple of hours. For an explanation of what I came up with, see Tom's analysis.

Anyway, I'll post the code here once there's a public release of Processing with which it will work. Until then, if you want to know more - or if you think I got it wrong - you can email me tom(at)tom(dash)carden(dot)co(dot)uk.

Flickr Rainbow

The most recent images from Flickr.com tagged with the colours of the rainbow.

A Brief History Of Me

For any readers who don't know me, here is a brief overview of some of the things I have been involved with recently.

Non-realistic Rendering In Immersive Environments

A non-photorealistic rendering of an Indian temple
Is there an 'uncanny valley' for rendering quality? For our EngD group project, Sheep Dalton (Ovinity), Monica Martini (Martini Architects), Sean Varney (Soho Cyberscan / Framestore CFC) and I built a model of an Indian Temple and a non-photorealistic OpenGL rendering engine for use in desktop and immersive VR systems.

Social Network Visualisation

A visualisation of an alumni network
A social network visualisation for the students, alumni and staff who have been involved with the MSc Virtual Environments in the Bartlett (UCL's architecture school).

Algorithmic Art

Algorithmic Art, using the Processing environment
A series of pixel-exposure techniques using the Processing environment.

Virtual Trees

Whipping Trees, a VRML Project
Whipping trees, a VRML world and a study in dynamic growth, responsive form and emergent spaces. Completed as part of Methods of Synthetic Construction 1. You can view all our VRML coursework on the course website. This work was significant for me because it involved taking what were effectively several small sketches (pieces of code) and combining them into a single piece of work, with a narrative and a sense of cohesion.

2D Robot Simulation

Bio-Inspired Evolutionary Agent Simulation Toolkit
At Leeds University, I started the development of a Bio-inspired Evolutionary Agent Simulation Toolkit, a project initiated by Seth Bullock and ably continued by David Gordon (now of Framestore CFC). Whilst I was there, I also took part in a Bio-inspired Computing reading group, worked on an interface for playing poker against evolved neural-network players, and investigated the possibility of doing image processing with artificial life.
