Saturday, February 19, 2005

Stephen Downes's Talk at Northern Voice (now including links)

(Updated Feb 21 for formatting/links)

Stephen Downes, Northern Voice – Community Blogging

(Stephen's Powerpoint Here - 9 megs)

(This is raw, uncleaned up and unlinked... getting it out there as fast as I can.)

I come from the other coast of Canada, Moncton, New Brunswick. I work for the National Research Council, which means I work for the government and they own what I say.

I’m not talking about people coming together and blogging on the same site, but how blogging becomes a community and how a community becomes a group of bloggers.

Four sections: what constitutes a community, a rant and rave against the long tail, meaning, and distributed network semantics. Reframe your thoughts about what community is, what community on the web is, and what a community of bloggers is.

What constitutes community?
We look around in the real world. What creates a community in the real world is proximity. We are part of a community because we live in the same place that others live. Even if you have nothing whatsoever to do with your neighbors. For a long time the concept of community online was based around the same concept. Hagel and Armstrong, Figallo. A community was a place. A website, a portal. Today a social networking site. The model from Hagel was set up a site, bring in the people, and retire to the Caymans. Didn’t work out that way.

My field of study is online learning. I don’t know much about social networks or blogs. In online learning, schools and universities are almost the prototypical communities. Gather into one place, get subjects together, slice and dice information for people to become obedient members of society. Online we have the learning management system or the learning content management system. Again, a site where you log in, you go to the room where you are allowed to talk to each other. Social networking is much the same thing: you log on to Friendster, LinkedIn, Flickr. Going to that place. To a large degree a matter of proximity or perhaps also persistence. I challenge the perception that these are communities. They are simply people in the same place. I don’t think that this defines community. Not going to rehash what others have said. Figallo talked about the web of relationship, Bock – common interest, Paccagnella – articulated patterns of relationships. These accounts are typical, widespread, and in most of the online community work.

I want to draw out two major elements that are definitive of community. First of all, there is the idea that there is a network. Interact. Communicate. A place for discussion. In some sense a relationship, not mere proximity. They are connected in some way. The second important thing is semantic. That these relations are about something. A common interest, values, a set of beliefs, an affinity for cats or beekeeping.

We have a pretty good understanding of networks. A much less refined sense of meaning. Fortunately one of my other jobs is as a philosopher studying meaning. I never thought that would be useful. I used to have a sign that said “you are not going to get a job.” Give up now. This will not prepare me for future employment. I’ll come back to meaning.

I want to rant and rave. The long tail is a property of scale free networks. You get a lot of people linking to each other. If you map this you create a set of links. You get the phenomenon of six degrees. You can get anywhere in the network in a small number of hops. What happens in a network of this type is that some people get lots of links, others get just a few. If you look in the world of blogging for instance, Boing Boing and Instapundit get thousands and thousands of links. News Trolls gets one. You get a power law. You have Instapundit at one end with thousands of links, then the long tail of thousands with few links.

What creates the power law phenomenon? One thing is growth. The network grows over time. The other thing is preferential attachment. You are looking for something to link to. You are a blogger, you just listened to Tim Bray and you need something to look for. If you just go looking you will find Scripting News, Scoble’s site. Better than the newspaper so you link to them. Two things happening. Some people get linked to because they were first. As time went by they had the most links so they were most likely to be found by new people. The way a tree grows. You have a trunk of the tree. That is where the action is because it was first. The rest of the tree attaches itself to the trunk.
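(Editor's sketch, not from the talk: a minimal simulation of the two ingredients named above, growth and preferential attachment. The function name, the network size and the "links + 1" weighting are all assumptions made for illustration. Each new blog links to one existing blog, chosen in proportion to how many links that blog already has.)

```python
import random
from collections import Counter

def grow_network(n_blogs, seed=0):
    """Grow a link network one blog at a time (growth). Each new blog links
    to one existing blog, picked in proportion to the inbound links that blog
    already has, plus one (preferential attachment)."""
    rng = random.Random(seed)
    inbound = Counter({0: 0, 1: 0})
    # Each blog sits in `pool` once for existing plus once per inbound link,
    # so a uniform draw from `pool` is a draw weighted by (links + 1).
    pool = [0, 1]
    for new_blog in range(2, n_blogs):
        target = rng.choice(pool)
        inbound[target] += 1
        inbound[new_blog] = 0
        pool.append(target)
        pool.append(new_blog)
    return inbound

links = grow_network(10_000)
ranked = sorted(links.values(), reverse=True)
print("most-linked blog:", ranked[0], "inbound links")
print("median blog:", ranked[len(ranked) // 2], "inbound links")
print("blogs with no inbound links:", sum(1 for v in ranked if v == 0))
```

Running it with different seeds should show the same shape each time: a few early blogs collect a large share of the links and most blogs collect almost none, which is the spike and the tail.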

People have talked a lot about the long tail. Worship the long tail. Mind the long tail. All these people talking about the long tail have the unique quality of not being part of it. I live in the long tail. I can say from my perspective that people in the long tail would rather not be part of it, they simply want to be read. In Canada we have socialists and they say they represent the working class. Like saying they represent the long tail. Policies that identify the working people. Ask the working people. They don’t want to be the working class. So they support the policies of the rich. They don’t support the long tail.

(Branching cluster pictures)
What you should notice about a network like this is that it is hierarchical. The really important things are in the center.

Now, thinking about how this comes to be. If everyone linked to everyone there would be no long tail and we’d all be Instapundits. Preferential attachment occurs only because there is a shortage, so that is why the power law exists in so many places. Like economic distribution in society. There is a shortage of money. If you want money you are attracted to people with money. Online it is a shortage of time. You can’t look at millions of blogs. Even Scoble only reads 1000. He must sit there at night and think, “I missed most of it.” The other thing that creates scale free networks is that the links are random. So you reach out for what is available, rather than what is good. Instapundit is available. Easy to find.

My approach to this, and the reason for ranting and raving against the long tail, is that networks are not a set of random connections but a set of semantically organized connections. Community as proximity – random connections. That’s how you find yourself in a meeting with a person with a different point of view who lives on your street. Community as networks of semantic relations. Connections between members of the community are based on the meaning of the members or entities in the network. In order to create community, we pick the most salient connection. What does that mean? How does something become the most salient connection?

What does a post mean? What does anything mean? What does a resource mean? What does this person say about the world? One way of trying to fix meaning to a blog post is through tagging. Tagging has been the rage. I am also anti tagging. Take any post. What would a graph of all the possible tags of this post look like? You are going to get a power law. You have a post about the Prime Minister. Martin. Tax Break. My goldfish tag. You are going to get a power law curve of tags. If you do it that way, then the meaning of the post becomes the meaning of the tag in the big spike. But that tag contains only a part of the meaning of the post. A very narrow, one dimensional look at something that might be more complex. The meaning of a post is not simply contained in the post. This is where we have trouble with meaning. We think we have a pretty good handle on how to say something about something else. What does the word Paris mean? It might also be where I went last summer. Where they speak French. When we push, the understanding of the meaning falls apart quickly. The meaning of the post is not inherent in the word, the post. It is distributed. It is in the set of relationships and connections that it has in use. Wittgenstein: Meaning is use. The meaning of the word becomes something very different. Family resemblance. When I was looking for definitions of community I found this: “community is like pornography. I recognize it when I see it.” There are two ways of looking at the world.

One way is to look at the world from the point of view of words. The other is through patterns. (Diagram picture). That mess of lines and dots is a concept. Your blog post. If you use words, you cut through that with a knife that gives you an abstraction. If you look at it from a pattern perspective the meaning emerges. Emergence is a tough concept… fudging the example. Emergence is like when you recognize Richard Nixon on your TV. It is actually a whole bunch of dots organized in a way so that when you look at the TV you recognize that organization of dots as being similar in form to Richard Nixon. I have never met him so that represents my understanding of him. Richard Nixon is not in the pixels, but in the organization of the pixels. The image of RN is emergent from the pixels. This doesn’t happen without a perceiver, without the capacity to recognize the pattern as RN. Take someone not around in the 70s and show them a picture of RN == some guy. You have to have the context in which to recognize a pattern in a network. When we use words, we warp this. Going after the big spike. Words distort, pull the pattern into themselves. It isn’t. But if we focus on the spike, the meaning of the concept is derived from the word and the meaning becomes the word.

If we think of meaning as use, then what does a blog post talk about? What it means is contained in the network of relations it finds itself in. Comments, links, evaluations. Relations which will be used to characterize an individual post.

If we look at meaning as inherent in the post, describable in the word, we get an organization that looks like the network formed of random connections. I was looking on the NorthernVoice website, “when you are tagging please use…” They could have used anything. You end up with clustering that looks like one of these scale free networks.

If meaning is thought of as distributed, derived from the relations and not just the content of the word you get a very different pattern.

Why does this matter? If we’re deriving meaning and connection and community in a random fashion, everything flows from the big spike. Scoble talked about someone asking him to link. He said no, create something of value and I will decide if it is worth linking to. That is the big spike telling the long tail what to do. That is what happens when meaning is derived in the center. That sort of arrangement requires control. Look at Technorati tags. We have already got tag spam, some organized tag. “Everyone should do it this way.” Everyone who doesn’t is being chaotic and distributed. If the meaning emerges from the pattern, there is no one in control. Scoble can’t tell me what to write and it doesn’t matter if he does or doesn’t link. This creates meaning through diversity, not conformity. Two very different pictures of community.

So how do we pull this off? How do we kill the big spike? How do we transform tagging from something for spamming into something that gets us to meaningful communities? We come back to online learning. In online learning, what’s happening is happening slowly, with resistance. The big spike people don’t want to let go. University publishers, researchers, publishers. In a classroom the teacher is the big spike and the students are the long tail. What is happening slowly is a shift from centralized, place-based networks into something more distributed. Where learning resources are available not from a given place or authority but out there on the network. What some of us are after is a way of being able to recognize, in a way that doesn’t require tagging 6 million items, the resources that are salient to us as individuals.

Now people don’t get that in the online world. In social networking. We gotta standardize, standardize. I tell them the most popular form of XML is RSS, there is no standard and that’s the thing that is working. Educational communities the old way. Neat topics. Classes. This kind of structure, both in schools and the blogosphere, where you have the flow from the top, is ripe for abuse. JD Lasica: influence peddling in the blogosphere. Companies are going to get good at this. 43 Things had the entire blogosphere fooled for a couple of weeks and it fell apart. The Wall Street Opinion columns define from the top with others echoing the words.

Future learning environments place the individuals at the center, with a range of resources that they bring in from a wide variety of sources. He actually has 43 Things in this diagram. You are able to draw out a theory of community:

1. As a means of organizing input and experience
2. As a means of putting that into context
3. As a means of taking what you have done, remixed, repurposed, so it can become part of someone else’s meaning

Community is antithetical to copyright. The community is defined as the relationship between the members with semantical … (missed)

People exist in relationship to things, people, resources.

We can’t just blast 8 million blog posts and expect Technorati to take care of it. What has to happen is this massive set of posts has to self organize. There has to be a filter that is not random, that is not spam blocking. A mechanism of determining what we want, easier than what we don’t want.

How do we do this? We create a representation of the relationship between people and resources. The semantic social network. We attach author information to RSS about blog posts. It kills me that this hasn’t happened. It is a huge source of information. In the item, not just a dc:creator tag, but a link to a FOAF file. That connects people with people, people with resources, and resources with resources. This gives me a mechanism for finding resources based on my placement within a community of like minded individuals. Really cool stuff will filter through. That semantic social network is just a first pass.
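(Editor's sketch, not from the talk: one way the "people connected to people connected to resources" idea could be modelled. All names and data here are made up, and the in-memory dictionaries stand in for what would really be FOAF files and RSS items fetched from the network. The point is only that once each item carries a pointer to its author's identity, my own contacts become a filter.)

```python
from collections import deque

# Hypothetical data: who each person lists as a contact in their FOAF file.
foaf_knows = {
    "me": ["alice", "bob"],
    "alice": ["carol"],
    "bob": [],
    "carol": ["dave"],
    "dave": [],
}

# Hypothetical feed items: each one names its author via a FOAF identity.
items = [
    {"title": "Learning objects revisited", "author": "carol"},
    {"title": "Top ten gadgets", "author": "zed"},
    {"title": "RSS in the classroom", "author": "alice"},
    {"title": "Notes on meaning", "author": "dave"},
]

def people_within(start, max_hops):
    """Breadth-first walk of FOAF links, returning {person: hops from start}."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if hops[person] == max_hops:
            continue  # do not look past the hop limit
        for friend in foaf_knows.get(person, []):
            if friend not in hops:
                hops[friend] = hops[person] + 1
                queue.append(friend)
    return hops

def items_from_my_network(start, max_hops=2):
    """Keep only items written by people reachable through FOAF links,
    nearest authors first."""
    hops = people_within(start, max_hops)
    found = [item for item in items if item["author"] in hops]
    return sorted(found, key=lambda item: hops[item["author"]])

for item in items_from_my_network("me"):
    print(item["author"], "-", item["title"])
```

Nothing in this sketch depends on how popular an author is. An item gets through because of where its author sits relative to me, not because of how many links the author has collected.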

We want to create these connections on many levels. We want metadata not just by the creator of the post but readers. Third party metadata. We are starting to see that. Links, references, annotations. But it can’t be site based. That doesn’t create a network.

Create a tag, identify it and you add your third party metadata. SSN-commentary, type of third party. Who wrote it and what they had to say (I made up the terms). The way this should work in the educational community: as much of this metadata as possible is created through automatic means. If I look at a resource while taking a physics class, the context is physics. Even though it might be a picture of a rabbit. The system would log it. What is relevant is that I looked at it and my data is attached to that. Rich…
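(Editor's sketch, not from the talk: what an automatically logged third party metadata record might look like. The class, the field names and the example URL are all invented for illustration. The talk only says that the record should capture who looked at a resource and in what context.)

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ThirdPartyMetadata:
    resource: str   # URL of the thing being described
    author: str     # who is describing it (not the resource's creator)
    kind: str       # e.g. "commentary" or "view"
    context: str    # what the author was doing at the time
    note: str = ""  # what they had to say, if anything

log = []

def record_view(resource, reader, context):
    """Log, without the reader doing anything, that `reader` looked at
    `resource` while working in `context`."""
    entry = ThirdPartyMetadata(resource, reader, "view", context)
    log.append(entry)
    return entry

def contexts_for(resource):
    """Collect every context in which readers have used a resource."""
    return sorted({m.context for m in log if m.resource == resource})

# The same picture of a rabbit, used in two different classes.
record_view("http://example.org/rabbit.jpg", "reader_a", "physics")
record_view("http://example.org/rabbit.jpg", "reader_b", "biology")
print(contexts_for("http://example.org/rabbit.jpg"))
```

The description of the picture comes from how it was used, and it grows as more readers use it. Nobody had to pick a tag.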

My contention is that instead of the spike-based, power-law-based organization, when we get something like the semantic social network, patterns of organization will be created. We are not creating communities around a word, but the community itself emerges as being created by or defined by that dense set of connections. I set up a primitive first pass, EDURSS, an aggregator. There should be many instances, everyone would have this on their desktop. It pulls in data from my community, my network of friends. If you set up the network this way, you can stop worrying about searching. The network becomes the search. The stuff that comes out the other end is stuff that is of interest. These inputs come from the entire blogosphere rather than the top 100.
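(Editor's sketch, not from the talk and not how EDURSS works: a toy version of "the network becomes the search". The function name, the data and the one-vote-per-person rule are assumptions. Resources are ranked by how many people in my own network linked to them, so what surfaces depends on who I am connected to rather than on a global top 100.)

```python
from collections import Counter

def aggregate(my_network, links_by_person, min_votes=1):
    """Rank resources by how many people in my network linked to them."""
    votes = Counter()
    for person in my_network:
        # A set, so one person linking twice still counts once.
        for url in set(links_by_person.get(person, [])):
            votes[url] += 1
    ranked = [(url, n) for url, n in votes.items() if n >= min_votes]
    # Most votes first; ties broken alphabetically so the order is stable.
    return sorted(ranked, key=lambda pair: (-pair[1], pair[0]))

# Hypothetical link data harvested from feeds.
links_by_person = {
    "alice": ["http://example.org/a", "http://example.org/b"],
    "bob": ["http://example.org/b"],
    "carol": ["http://example.org/b", "http://example.org/c"],
    "stranger": ["http://example.org/popular"],
}
my_network = ["alice", "bob", "carol"]

for url, n in aggregate(my_network, links_by_person):
    print(n, url)
```

Two people running this with different networks get different results from the same pool of links, which is the difference from a single ranked list.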

The community is the network. No centralized place. Only people, resources, distributed, acting on their own behalf and interest. Marvin Minsky, “The Society of Mind.” Self selected relationships, contextual information to establish meaning. It not only defines the community but emerges FROM the community.

If you looked at the connections from those looking at Wikipedia…your post in context of other Wikipedia posts. I bet that’s what Google is up to. (I did not get all of that).

Audience comment: can’t hear what he is asking. Compare entire contexts to entire contexts.

Comment: The reason top-down KM, ontology and taxonomy systems don’t tend to work is that the goal of those systems is to decontextualize information. Individuals have no interest in decontextualizing. If it works for you, it might work for someone else with a similar context.

Stephen: great summary. We live in a context laden environment.

Another quiet comment from the audience (hey, remember the trick of repeating the question?)

There are things with tagging. People lie. Tagging is an explicit attempt to attach words to a post. Not inherently wrong. But the word always under-describes the resource. It often mis-describes the resource. As soon as the spammers get into the tagging system, everything on the Viagra sites will be about community. There is a big spike thing about tagging. Technorati. The big words are frequently used tags. They are useless to me. If I don’t know what I’m looking for, I’ll never find it. If I don’t know what the tag is I’m not going to get it from the top 100 tags. I have to search for it randomly. One step further. If I want to find a resource I have to conform to the tagging regimen that has been previously established. Like on Yahoo, you had to think like a Yahoo person. I don’t want to buy into that one. With tagging, there is only one community.

The following is conversation with the audience and I can’t hear most of what the audience members are saying:
Cantor: You are wondering why academics get a bad reputation. The Flickr tagging experience is a fantastic folksonomy experience. If spammers get in, Stuart will kick them out. If the folksonomy folks can enjoy themselves that is good.

It is fun, but is it useful? Who are you to say it is not good? It is not good to me.

Cantor: It’s the academics telling us what is good.

I find it is ironic who you pulled in. Like you were long tailing us… (Can’t hear it all)
The second thing, I could say whatever I want, you would hear whatever you want, even if it is something you want. … can’t hear him. Example of the word Paris – we all don’t share the same meaning. Asking for a system that captures and recognizes the context. Tagging is context free. There is no relation…

It is the context of Flickr, of Technorati – of closed centralized systems.


Anonymous Suzanna said...

Wow! I'm always envious of people who can type so fast. Stephen's talk went so fast and I wasn't able to digest all the ideas so I am grateful for the "re-run" on your blog.

8:35 AM  

