Sunday, August 29, 2010

Thoughts on the Summer of Android

For me, the summer of Android, when I really discovered Google's mobile operating system, started at Google I/O:

  • Froyo, the new Android version, was launched.
  • We were given the additional conference gift of the Sprint EVO 4G, then arguably the most advanced Android phone available.

Like Saul of Tarsus on the road to Damascus, my eyes were opened within a day of possessing the EVO. Since then, I've purchased the Droid X for myself (an upgrade from my original Droid) and have been following all telephony, not just mobile, closely. The summer has seen a tidal wave of telephony developments from Google and its partners.

Here are my summary observations about where we stand today, the weekend before summer informally ends:

  • The ability to make calls from Gmail when combined with Google Voice is simply game changing. Suddenly, I have the best, free telephone routing system (Google Voice) liberated from the need for a physical device. At this juncture, I don't need separate POTS (plain old telephone service), and I only really need mobile for when I'm out of contact with a land-based network.
  • It's easy to see the day when everyone in the phone business will just be a data network provider. Coverage and speed for price will be the defining characteristics.
  • Since I now view mobile connectivity as a supplement to and not a substitute for land connectivity, I'm most concerned with net neutrality for land-based networks and trunk lines. That's where most connectivity is playing out for me now. It turns out mobile is a specific use case where I want to access specific things. In other words, I can see how prioritization might make sense in the mobile scenario.
  • The biggest value in my mobile phone is my ability to connect back to "web-based" services that add context to my current situation, for instance, all Google location-aware services.
  • The next biggest value in my mobile phone is the ability to connect to my "social graph" (i.e., the network of people I'm connected to via some electronic service) in all of its forms. Android-Facebook integration is useful in this regard, but the real ace in the hole is the contacts manager, and I think eventually, profiles. By the way, since I've had to resort to various Google Reader hacks to follow aspects of my social graph, I'll throw that in there too.
  • The third biggest value in my phone is my ability to use it to access reference information in all forms, be it electronic books or the web.

Note that making voice phone calls is not listed as a separate item. I do this infrequently and would tend to include it as part of connecting with my social graph. I should point out that this is a giant change from where I was at the time of the AT&T breakup in 1984. Then, I was excited by the new plethora of voice-related communication options. Simply put, times have changed. Text has replaced voice in most applications.

Tuesday, August 24, 2010

Google Buzz for Engaged Academic Publishing

Even if you're not in academia, if you read this article from the New York Times, you'll realize that academic publishing is undergoing upheaval. Essentially, it's the same upheaval that's upending publishing in general: The quality of what's disseminated would improve dramatically if it were subject to continuous revision based on feedback from as large a group of people as possible who have information to bring to bear. It's just that the established players are having a hard time understanding, let alone adopting, the new quality model.

The longstanding quality model to be replaced in academic publishing is prior peer review. In prior peer review, articles are not published until they pass muster with a small group of experts, usually two to three people. The notion is that these experts will carefully consider the evidence and conclusions presented in any given work, only publishing those works which pass the quality test.

In the remainder of this essay, I'll first outline why prior peer review is the problem, not the solution, for the dissemination of quality academic research. Then, I'll present one approach I've cobbled together with Google Buzz.

Prior peer review is the problem, not the solution

The problem with prior peer review is that it effectively limits the scope of review to just a few people, some of whom may not be particularly qualified or may be operating on an agenda that precludes publishing certain otherwise meritorious pieces. Even in its most benign form, prior peer review implies that work must fit within the current consensus, which tends to change infrequently at best.

I once had an article go through peer review at four separate outlets over the course of a decade before getting accepted for publication.

When I first started trying to get the article published, it was a bit out there. It applied cutting edge statistical techniques to understand the effectiveness of emotion-laden persuasion tactics used by credit collectors. The article is now available for purchase online from a prestigious journal at the single article price of $30. If you're really interested, I'll be happy to send it to you gratis. I didn't do it for the money.

For all ten years this article was under review, its fundamental conclusions did not change, though there were additional analyses performed and re-packagings undertaken. The end result was a piece of writing hidden behind a paywall. In other words, ten years of effort for non-dissemination beyond the journal's subscribers.

Enter Google Buzz

Given this tale, one can reasonably wonder what Google Buzz has to offer. After all, it doesn't seem in any way to match the process I've just described.

When trying to publish your work, there are three things that matter:

  • Reach
  • Targeting
  • Feedback

In the prior peer review model, the feedback that matters comes from the reviewers prior to any dissemination. Targeting is based on what your colleagues read. Reach is typically limited to those colleagues and their cohort.

Now, consider Google Buzz. For significant efforts, I typically get feedback from 10 or more people. Often a number of people will reshare the post, sending it on to their own audiences, which often number in the thousands. Some of these individuals alone have greater reach than any academic outlet other than the best known.

But, do any of these Google Buzz people know anything?

To use Buzz effectively, you need to target people who know something. I've basically cultivated Buzz associations based on my intellectual interests. It turns out that these intellectual interests all relate to the extremely applied research agenda I am pursuing in Internet APIs, online social networking, education, and (oddly enough) web marketing. I'm getting feedback from people highly knowledgeable in all of these areas. I suspect this feedback will particularly help me on the software end of the endeavor as well as in understanding some of the implications for online social networking.

But is it academic publication?

One thing my Buzz postings clearly are not is dissemination to a small group of experts after approval by an even smaller group of experts. In that sense, they are not academic publication.

However, it's important to remember that academic publication started out as the exchange of letters between people who thought they could profit from talking to each other about the work they were engaged in. I'm essentially using Buzz as the envelope for a similar endeavor.

I would describe the Buzz posts I've worked on and crafted for feedback as having achieved the kind of engagement I wanted: specifically, feedback from highly knowledgeable people who, in some cases, have built significant businesses using the tools and social networking phenomena I'm investigating. Two recent examples of posts that illustrate this dynamic are the distribution of friend connections among Buzz users relative to Facebook users, and how to infer a person's online social connections even if they keep them private.

So, my personal answer is that Google Buzz is providing me a platform for more engaged academic publishing than what I achieved through the more traditional route.

What's the long term prognosis?

To a large extent, the importance of what you do depends on how it impacts others. I'm doing everything I can to have impact, and enlisting knowledgeable input from others strikes me as important in achieving that.

I'm somewhat less concerned by the sociology of academia. Demonstrating impact tends to trump all.

Monday, August 23, 2010

Laporte's lemma: A workaround for Buzz lists using Google Reader

I need to thank Gord Wait for putting this idea back in my head in an actionable way. If you're a Buzz user, you may feel beset by noise. You're following a lot of interesting people, but some of them post so infrequently that their posts are hard to discern in the flow.

There have been many calls for Twitter-like Buzz lists to solve this problem. Essentially, you would create lists of people whose updates you wanted to track, and then, by clicking on a list, you would see only the updates from those people.

Such lists have not been forthcoming from the Buzz team. However, as Gord reminded me, you can simulate such a list by subscribing to the Atom representation of each person's public activity stream and then grouping all of the people you want to follow into a folder in Google Reader.

The question is: What is the URL of that Atom stream representation? Well, it turns out it has a canonical form that looks like so:

https://www.googleapis.com/buzz/v1/activities/UserNameOrNumericID/@public 

For instance, if you wanted to follow me this way, the following URL with my user name would suffice:

https://www.googleapis.com/buzz/v1/activities/fpgibson/@public

Some people don't want to reveal their user name, so you're only able to get their numeric ID. It's the exact same format. Here's me via numeric ID:

https://www.googleapis.com/buzz/v1/activities/114242352345417873286/@public

In each of these URLs, the leading part (https://www.googleapis.com/buzz/v1/activities/) represents the service you are calling, and it never changes. The trailing part (@public) is a modifier indicating that you only want what the poster chooses to share with everybody. That also never changes. (You can't track private posts in Google Reader because it does not support the required authentication.)
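As a minimal sketch, the URL pattern above can be assembled in Python. The function and constant names here are my own, chosen for illustration; only the base address and the @public modifier come from the URLs shown in this post. The code builds strings only and makes no network calls.

```python
from urllib.parse import quote

# Base endpoint and scope copied from the URLs in this post.
BUZZ_ACTIVITIES_BASE = "https://www.googleapis.com/buzz/v1/activities/"
PUBLIC_SCOPE = "@public"


def public_feed_url(user):
    """Return the public activity feed URL for a user name or numeric ID."""
    # quote() escapes any character that is not safe inside a URL path segment.
    return BUZZ_ACTIVITIES_BASE + quote(str(user), safe="") + "/" + PUBLIC_SCOPE


# The user name and numeric ID below are the two examples given in the post.
for user in ["fpgibson", 114242352345417873286]:
    print(public_feed_url(user))
```

Generating one URL per person and subscribing to each in Reader is all the "list" amounts to.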

This is how I'll track the 70 students I'll have using buzz in two weeks.

So, why is this called Laporte's lemma?

The title is a humorous take on Leo Laporte's abandonment of Buzz this past weekend, because he felt people didn't notice his posts anyhow, and on Louis Gray's response that this might be because he, Louis, had so much other material in his stream. As a result, Leo's absence went unnoticed by Louis.

In mathematics, a lemma is a small side result one proves on the way to the main event. The word is also used loosely for an easy-to-derive consequence of a proof that has practical import (strictly speaking, a corollary). It's in this latter sense that I mean the word in the title.

 

Sunday, August 22, 2010

Inferring Your Ties on Google Buzz Using What Your Friends Say

scatter-total-v-in-network-ties.png

A key component of privacy in social networks is the extent to which your connections and associations are public. For instance, Facebook makes your friends' names and IDs available to any application you use, though it allows you some control over how this information is revealed on your profile page. Google Buzz allows you to remove the lists of people you are following and who follow you from your public profile. An interesting question for a Buzz user is how effectively this feature allows you to hide your ties to others from public view.

The graph above shows that in a sample of over seven thousand Google Buzz users, I was able to infer approximately 40% of their ties without needing to refer to their reported ties. I just used the ties their friends reported. With a sufficiently comprehensive crawl, this percentage would approach 70%, which is the share of people I estimate make their following and follower lists public.

In other words, in social networks, what your friends say about your ties reveals a lot, even if you yourself keep the information hidden.

How I performed this analysis

Using the student-derived data set I reported on previously, I did the following:

  • I collected the follower and following information of 7,225 network participants reporting their following and follower lists publicly.
  • I counted a tie when one participant appeared in both the following and follower lists of another participant. This method allowed me to infer when a person who kept their lists hidden was tied to another person.
  • For users reporting their ties publicly, I plotted the relationship between inferred and reported ties.
  • Regressing reported ties on inferred ties for these public reporters revealed that for every inferred tie, the person reported approximately 2.5 ties. Stated otherwise, inferred ties represent 40% of reported ties (n.b., 0.4 or 40% is the reciprocal of 2.5).
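The counting steps above can be sketched in Python. This is an illustration under my own assumptions, not the code used for the analysis: the data layout, the function names, and the toy users are all hypothetical, and the regression step is left out.

```python
from collections import defaultdict


def mutual_ties(lists):
    """Ties a user reports: people in both their following and follower lists."""
    return lists["following"] & lists["followers"]


def infer_ties(public_lists):
    """Infer each person's ties using only what other people report."""
    inferred = defaultdict(set)
    for reporter, lists in public_lists.items():
        for other in mutual_ties(lists):
            # The reporter's lists reveal a tie for `other`, whether or not
            # `other` publishes any lists of their own.
            inferred[other].add(reporter)
    return inferred


# Hypothetical toy data: "dana" keeps her lists hidden, so she has no entry.
public_lists = {
    "ann": {"following": {"bob", "cat", "dana"}, "followers": {"bob", "dana"}},
    "bob": {"following": {"ann", "dana"}, "followers": {"ann", "cat", "dana"}},
    "cat": {"following": {"bob"}, "followers": {"ann"}},
}

inferred = infer_ties(public_lists)

# Dana's ties surface even though she reports nothing herself.
print("dana:", sorted(inferred["dana"]))

# For public reporters, compare reported ties with ties inferred from others.
for user, lists in public_lists.items():
    print(user, len(mutual_ties(lists)), len(inferred.get(user, set())))
```

Regressing the reported counts on the inferred counts from the last loop, over all public reporters, is what yields a slope like the 2.5 quoted above.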

That's not the end of it

One might assume that if everyone kept their following and follower lists hidden, that would be the end of it. Well, not really. Ties can simply be inferred from public communication patterns. The lesson here is that, to the extent that any of your interactions take place in a public space, inhabitants of that space will be able to infer things about you and the people you are connected to.

Areas that require further work

As in my prior post, my sampling approach here is not random. In particular, my students were following people who they could find publicly, so the estimate of the percentage of people hiding their following and follower lists is likely low. Further, I'm assuming that people who hide their following and follower lists are similar to those who report them publicly.

The solution to both these issues is better study design with random sampling. Further, the issue of hidden follower and following lists can be addressed by getting those users' permission to access their lists.

 

Thursday, August 12, 2010

Do buzz users have more ties than Facebook users? 10,000 answer

two-way-tie-frequency.png

The graph above is based on an analysis of 10,113 Buzz users, of whom 7,225 chose to share their "following" and "followers" lists publicly. In the graph, I'm counting Buzz user ties as occurring when two users each follow the other. In other networks, like Facebook, ties are equivalent to friend relationships between users.

The distribution of ties is quite clearly skewed to the right, with most users bunched at low tie counts and a long tail of highly connected users:

  • The maximum number of ties recorded for any one user is 4,754.
  • However, the vast bulk (90%) of users have 394 ties or fewer.
  • 76% of users have the mean number of ties (159) or fewer.
  • The median user (the user at the halfway point in the list) has 50 ties.
  • In other words, half the users on Buzz have 50 ties or fewer.
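For readers who want to compute the same kind of summary on their own data, here is a small sketch using Python's standard statistics module. The tie counts are made up for illustration and are not the data behind the graph. They only mimic its shape, where a long tail of highly connected users pulls the mean well above the median.

```python
import statistics

# Hypothetical tie counts, NOT the data from this post.
ties = [3, 8, 12, 20, 35, 50, 65, 90, 150, 400, 2000]

mean = statistics.mean(ties)
median = statistics.median(ties)

# Last of the nine decile cut points, i.e. the 90th percentile.
p90 = statistics.quantiles(ties, n=10)[-1]

# Share of users at or below the mean; well above 50% when a few
# very large values inflate the mean.
share_at_or_below_mean = sum(t <= mean for t in ties) / len(ties)

print("max:", max(ties))
print("mean:", round(mean, 1))
print("median:", median)
print("90th percentile:", p90)
print("share at or below mean:", round(share_at_or_below_mean, 2))
```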

So, do Buzz users have more ties than Facebook users?

As of today (8/12/2010), the average Facebook user has 130 friends. If we look at the mean number of ties for a buzz user in this sample, 159, then buzz has a slight edge but is clearly in the same ballpark.

In the case of either buzz or Facebook, we have to recognize that the ties are nominal. The level of ties actually maintained on either network is likely just a fraction of the nominal number as suggested by an in-house analysis done by Facebook.

How I collected the buzz data (why you should take this with a grain of salt)

The data is based on a snowball sample originating with myself and my 10 web marketing practicum students. I then went to the friends and finally to the friends-of-friends levels. The data were collected from August 4 to August 5, 2010. Here is a summary of the number of users analyzed at each level:

  • Core group (myself + students): 11
  • Friends: 161
  • Friends of friends: 9,941

A few points about this sample:

  • The students all have fewer than the median number of ties, while I have over 100, or double the median, publicly reported.
  • In other words, all of the outliers with many ties come from the Friends and Friends of friends levels.

This latter point illustrates the problem with snowball sampling: it's hard to know which aspects of the sample are representative of the population as a whole. A random sample would alleviate this problem.
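The snowball procedure itself is easy to sketch in Python. This is an illustration of the general technique, not the crawler used here: the function name, the get_friends callback, and the toy graph are all assumptions of mine.

```python
def snowball_sample(seeds, get_friends, depth=2):
    """Return a list of sets: level 0 holds the seeds, level k the users
    first reached k hops out from the seeds."""
    seen = set(seeds)
    levels = [set(seeds)]
    for _ in range(depth):
        next_level = set()
        for user in levels[-1]:
            for friend in get_friends(user):
                # Assign each user to the first level at which they appear.
                if friend not in seen:
                    seen.add(friend)
                    next_level.add(friend)
        levels.append(next_level)
    return levels


# Hypothetical toy graph standing in for calls to a real service.
graph = {
    "prof": ["s1", "s2", "x"],
    "s1": ["prof", "s2", "y"],
    "s2": ["prof"],
    "x": ["prof", "z"],
    "y": ["s1", "z"],
    "z": [],
}

levels = snowball_sample(["prof", "s1", "s2"], lambda u: graph.get(u, []))
for name, members in zip(["Core group", "Friends", "Friends of friends"], levels):
    print(name, len(members), sorted(members))
```

Because every user is reached through the seeds, the sample inherits whatever is peculiar about the seeds' neighborhoods, which is the representativeness problem described above.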

Going forward

I began down this path in an attempt to keep track of the networks my students were forming as we proceeded through the semester. The area that interests me most is the quality of ties they are forming. Going forward, I'm most likely to push on content analysis and network structure.

Future Posts

The mathematically inclined will have already noticed that 29% of my sample did not want to share their following and followers lists publicly. I'll be providing more analysis on this group in a future post.
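The 29% figure follows directly from the counts reported above. A quick check in Python, using only numbers from this post (the variable names are mine):

```python
total_users = 10_113   # users analyzed
public_users = 7_225   # users sharing their lists publicly

# The three sampling levels should add up to the total.
core, friends, friends_of_friends = 11, 161, 9_941
assert core + friends + friends_of_friends == total_users

hidden_users = total_users - public_users
hidden_share = hidden_users / total_users

print(hidden_users)                  # 2888
print(round(hidden_share * 100, 1))  # 28.6, which rounds to 29%
```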

Monday, July 5, 2010

Tracking the evolution of symmetric ties on Google Buzz

symmetricTiesShort.jpg

At the end of May, I introduced a tool for monitoring the up-to-the-minute status of my students' bootstrapping efforts on Buzz (the code is available as open source software here). Today, I'm introducing another tool for tracking the evolution of symmetric ties in Buzz networks. Symmetric ties, where two people mutually follow each other's updates on the network, are important because the more symmetric ties a participant has, the more possibilities they will have for conversation and hence the more useful they will find the network.

You can see an excerpt of the tool's output in the image above. It's basically a web page where participants in the Web Marketing Practicum see three columns beside their mini profiles:

  • The number of symmetric (mutual) ties they had as of the analysis date, along with links to all of those people's profiles.
  • The number of symmetric ties that they added since a reference date, again with links to the actual profiles.
  • The number of symmetric ties that they subtracted since a reference date, again with profile links.

Using this information, participants can see who is entering and leaving each other's mutual ties networks. Perhaps more importantly, discussion can ensue about the type of network each participant is trying to grow and whether the additions and subtractions that have occurred make sense in that context.
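The three columns boil down to set arithmetic. Here is a minimal Python sketch of that idea. It is not the tool's actual code, and the function names and sample data are hypothetical.

```python
def symmetric_ties(following, followers):
    """People a participant follows who also follow the participant back."""
    return set(following) & set(followers)


def tie_changes(reference, current):
    """Return (added, subtracted) symmetric ties between two snapshots."""
    added = current - reference
    subtracted = reference - current
    return added, subtracted


# Hypothetical snapshots for one participant on two dates.
reference = symmetric_ties(
    following=["ann", "bob", "cat"],
    followers=["ann", "bob", "eve"],
)
current = symmetric_ties(
    following=["ann", "cat", "dan", "eve"],
    followers=["ann", "dan", "eve"],
)

added, subtracted = tie_changes(reference, current)
print("current:", sorted(current))
print("added:", sorted(added))
print("subtracted:", sorted(subtracted))
```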

A few quick remarks:

  • With one exception, every student has at least doubled the size of their symmetric ties network since May 31, 2010 when I did my first analysis.
  • In the two weeks covered by the pictured analysis, students typically increased the size of their symmetric ties network by over 20%.

How the tool might be useful

I was interested to read Adewale Oshineye's Buzz post where he, in effect, wondered why people would be concerned with symmetric ties on Buzz. So, it seems useful to list a few reasons why a symmetric ties tracking tool might be useful:

  • It's hard to imagine people exchanging information on a sustained basis unless they are mutually following each other (i.e., have symmetric ties). Knowing that your symmetric ties network is growing is an indicator that the network is a useful source of exchange.
  • People, being people, tend to seek reciprocation in their social relationships. They find non-reciprocating (asymmetric) ties less appealing and will deemphasize networks where such ties abound.  Knowing who is in your symmetric ties network and how it is changing over time is a useful indicator of the kind of value you're getting out of it. Is the network you're growing on the platform a colleagues network, a family network, a friends network, or what?
  • In general, it pays to have visual indicators of effectiveness. Buzz currently lets you know how many people you are following and how many are following you. However, there is strong evidence that high follower counts do not translate into reach. On the face of it, symmetric tie counts might be a better indicator (but see limitations below).

Tool limitations

  • The analysis is purely descriptive. It only shows how one aspect of the networks has evolved. It doesn't shed any light on why or how that evolution occurred.
  • The way the tool is set up, it suggests that gaining more and more symmetric ties is a good thing. I believe that bias to be useful when you're starting out. However, at some point, it doesn't make sense to try to increase your symmetric ties beyond what you can effectively track.
  • The tool places no weight on the quality of the tie. Is this someone it makes sense for you to be tied to?

Future directions

If you step back, you realize that Buzz is one instantiation of the next phase of personal publishing. A number of thoughts arise:

  • I'm interested in knowing more about the connection structure in these networks. How are people down through the friends of friends structures connected?
  • I'm interested in expanding the analysis beyond the current core group of people. If I go to friends of friends, I start to have enough data to make computing individual page rank interesting.
  • I wonder to what extent it would pay to make friend recommendations from this data.

Getting the code

Once I have a chance to clean the code up, it will be available on my buzz list tracker project website.

Monday, May 31, 2010

A Buzz API Python Application for Social Networks Applied to My Students' Buzz Networks

BuzzNetworks.jpg

Click the image to see the results of a project I've been working on for the last week. My goal was to start to get a handle on how well students in my Web Marketing Practicum class were getting on in Buzz. What I produced was a basic tabular report designed to show how well the students were connecting with each other and with external parties.

Here's how to read the columns:

  • Participant: Just the class participant whose network is being examined. I'm included in this as I'm part of the network.
  • Mutually Following: The people the participant is both following and being followed by. For many social network analysts, this column is what constitutes the social network. Other participants in the class are color coded orange and in a bold weight font. Those outside the class are blue and in a regular weight font. A quick perusal of this column reveals that, with the exception of myself, the majority of each of the other participants' networks is composed of other participants.
  • Not Following Participant Back: Often, this column is not important. It may represent people who the participant is following purely for information. However, it can start to be an issue of poor perception management if no one the participant follows is following them back.
  • Participant's Other Followers: These are people who follow the participant but are not followed by the participant. They may represent an opportunity for the participant.
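Each of the three network columns is a simple set operation on a participant's following and follower lists. A minimal Python sketch, with hypothetical names and data rather than the code behind the report:

```python
def classify_network(following, followers):
    """Split a participant's connections into the report's three columns."""
    following, followers = set(following), set(followers)
    return {
        # Followed by the participant and following the participant back.
        "mutually_following": following & followers,
        # Followed by the participant but not following back.
        "not_following_back": following - followers,
        # Following the participant but not followed in return.
        "other_followers": followers - following,
    }


# Hypothetical lists for one participant.
columns = classify_network(
    following=["ann", "bob", "news_feed"],
    followers=["ann", "bob", "fan1", "fan2"],
)
for name, people in columns.items():
    print(name, sorted(people))
```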

I'm sure this analysis seems super simple. What's the upshot?

  • One issue with Buzz as it stands in May 2010 is that it does not have an easy way to perceive your network. Who's in? Who's out? Who are potential people to connect with? This analysis begins to provide an answer to all of those questions.
  • Right now, I'm clearly the most connected node on this network by any measure. Students may be able to feed off of me. Also, some of them have started to grow their networks, and as they do so, they can feed off of each other.

My plan is to do a separate post on the Python code itself sometime in the next 10 days, and I'll include a cleaned up version of the code with that. Suffice it to say that I did not use the Python client libraries for the Buzz API. Rather, I just used the RESTful API. The main reason was that it was chock full of examples for how to get the data I wanted. I did wind up writing a simple Python abstraction layer for it.

Next Steps

To be honest, the next step may be seeing whether I can duplicate this effort with the Twitter API. All reports indicate that my students may be having an easier time there. Tracking is certainly easier. I just created a list for my students.

However, even with twitter, figuring out who is in your active network is hard. As simple as this exercise is, it begins to accomplish that task.