Response to Matt MacArthur

In a comment on my most recent blog post, NMAH’s Matt MacArthur brought up a major and valid criticism of the enthusiasm that I and others like me have for open data initiatives:

Mike touches on an important point about what people actually *want* from the Smithsonian (and museums in general). I heard a very interesting presentation from the Powerhouse Museum in Australia recently. They have had their collection database available via download/API for a while now – they are leaders on this “openness” front. What they have found is that while this was a radical/exciting development among proponents who care about such things, in reality hardly anyone has made use of it. This is particularly true for the education audience, who they thought would be eager to use raw data in the ways that you mention. Instead, teachers and students continue to gravitate toward specific bits of content that support their curriculum, and the more traditional, mediated “online exhibit” type of material. Maybe this will change and it still may be an important avenue for the Smithsonian to pursue. But for now, the evidence available to me shows that the public demand to see a lot more of the Smithsonian’s “stuff” online along with reliable interpretation, and have some social functionality around that content, is much greater than the demand to “walk away with our stuff and do whatever they want with it.”

On the one hand, I can’t argue with this line of reasoning. First off, the number of people who will want to access raw data online is always going to be smaller than the number of people who will want to just look at it and consume it passively. Much like the number of people who use their computers to program, do complex modeling, or calculate is always going to be smaller than the number who use them for entertainment and communications.

Of course, some people say that the fact that most of us just play games, surf the net, and write email means that the personal computer is dead in the water and devices like the iPad are the future. What these people overlook is that the iPad is not a very good device for doing innovative programming or developing next-level software. If you want computers to keep developing on the software level, you need keyboards and processing power. Just because the majority could get by with an iPad-like device doesn’t mean we should stop producing PCs, or that we should stop producing them at a price point that keeps a low barrier to entry so that merit is more important than deep pockets in the long march to innovation.

The situation when it comes to digital archives and exhibitions is not that dissimilar. You want to give the majority of people what they want– if grandma is scared of computers, but comfortable with the iPad, by all means, get her an iPad!– but as long as doing so doesn’t interfere with the enjoyment of the more passive-consuming majority(1), you also need to design with the innovators, the hackers, and the bleeding-edge early adopters in mind… in other words, you need to design for the developers as well as the average consumer.

The audience may not be there, at least not at first. But these considerations have to be made from the beginning, to be incorporated into the heart of the code from the get-go, or else it’s going to be nearly impossible to add them later if the demand picks up. This is part of what makes the Smithsonian Commons such an awesome and ambitious project– it’s going to have to cobble things together from the many different, often privately-contracted and sometimes proprietary CMSs and databases that the various museums of the Smithsonian system use, and bring it all together into one place. This is no easy task because when the different projects were begun, they were not designed to interoperate.

It’s important, when beginning a project like the Smithsonian Commons, to design the project so that it is capable of maximum openness. It’s easier to nail some doors shut than it is to tear down walls.

Similarly, while the case of the Powerhouse Museum might be somewhat discouraging– all this great openness and nobody using it– the Powerhouse Museum is not the Smithsonian. No other museum is the Smithsonian. The SI is “the world’s largest museum complex and research organization,” according to the home page. The Smithsonian is large enough that, if the Commons is implemented well, it could counter this trend of disuse. The Smithsonian Commons could be a tipping point.

People won’t develop the tools if there’s not a potential audience for them. If the SI works with other museums like the Powerhouse to ensure interoperability and good data standards, it’s enough of a behemoth that the SI’s work on opening up might actually encourage development and use for the Powerhouse Museum. If people can design tools that allow you to digitally manipulate, analyze, and play with the combined collections of an entire international network of museums, suddenly you’re looking at something with enough potential use, and enough potential audience, that it might be worth doing.

I’m not saying this will happen necessarily, of course. I can’t predict the future. But I can see that the Smithsonian is uniquely positioned to help push this sort of thing into reality. I’d hate to see that opportunity squandered because of a lack of perceived interest.

It may well be that the average user will always be the casual browser, the person who wants to see the stuff, along with a little social functionality. But arguments of “demand” shouldn’t be applied to openness and APIs. There’s a moral argument to this, for institutions with a public service mission, but let’s look beyond that to a completely pragmatic view. With computers, “demand” isn’t a fixed quantity. The world only needs so many eggs. With computers, however, demand is a constantly shifting value, because demand is created by tools. And tools that can be developed by outsiders with little to no cost other than time can suddenly prove quite important.

Look at Twitter clients– when the website first launched, I doubt there was much perceivable demand for standalone programs that simply talked to a website that let you post SMS-sized messages on the web. But Twitter was created with an open and robust API, and clients emerged and multiplied. They’re key to the site’s success– I doubt I would keep using Twitter as much as I do if I always had to navigate back to and refresh the website. Making it an always-on part of my desktop makes it invaluable by comparison.

Fostering a dev community is a way to ensure a small but powerful group of passionate early adopters. It can bring new and unexpected functionalities to the project. And if people start building tools that take advantage of the Commons’s wide-open API and data standards, they may just come up with a cool tool that brings even more casual users even deeper into the project. Why bet that they won’t, and close the project off? Isn’t it better to hope they do and leave the possibility open?

Finally, I’d like to suggest that while I said it’s likely that casual users will always be the core of the user base, the numbers may be shifting. Google’s recent unveiling of their Android App Inventor points toward some of the folks with the deep pockets and the big brains actually investing time, money, and energy into lowering the barriers to application development in some interesting ways. If the Smithsonian Commons were interoperable with App Inventor, wouldn’t that be an amazing project for beginning students interested in software development, or the use of new media in traditional disciplines?

(1) The notion of the average visitor’s experience of museums– or experience of any form of media or spectacle for that matter– as being “passive” is one that I find deeply problematic, but that’s a matter for a different post.


Thoughts on the Smithsonian Commons

Reading the Smithsonian’s recent announcement of the debut of the Smithsonian Commons Prototype and playing around on that page has left me feeling rather ambivalent, with more questions than answers.

I like the impetus behind the project– it’s ambitious and well-intentioned. Integrating the Institution’s many web presences, putting them in an environment where the user has more control of how they use and experience it, allowing guests to collect and curate, themselves, rather than maintaining the position that curation is a rarefied activity best left to experts– these are commendable goals, and the Commons, if it lives up to the promises of the Prototype page, will deliver on these things. But part of me feels like it’s just… insufficient.

“Vast, findable, shareable, and free” is a great start. But it’s not enough. What is lacking is any definition of openness, or any commitment to a specific vision of what openness means.

The goal of the project seems to be an opening up of the Smithsonian to a wider public– and I think that’s a great goal. But I worry that where the prototype has currently settled may be giving more lip service to the principle of openness than it is embracing what that principle entails. This is where I start to have a lot of questions.

The prototype page promises that the Commons represents a “dedicat[ion] to stimulating learning, creation, and innovation through open access to Smithsonian research, collections and communities.” And yet how open will that access truly be? In the four video use-cases presented by the prototype page, I see very little openness with data. I primarily see a more social approach to playing in the Smithsonian’s sandbox. Letting others play in your sandbox is definitely a step toward openness, but true openness is letting others walk away with your sand and do whatever they want with it.

To put it another way: “Screws better than glues.” Ownership is about the ability to alter, remake, use, remix, or hack. And you need to give your visitors data, not just let them see it. Being open with information in the digital age means not just allowing people to look at your books, but letting them walk away with a copy and seeing what they can do with it. Until that point, you’re not really being open. You’re just being transparent.

Openness is a moving target, of course. There’s “open” and then there’s open. And there are some indications that the project has the potential to be truly open. But they are somewhat ambiguous. In the use-case videos, two things are mentioned that give me hope that this could be a truly open project.

The first thing was that, in the video of the teacher, she is able to download her collection from the Smithsonian Commons and use it– in this case, by making a PowerPoint for her fourth-graders out of images of Teddy Roosevelt she has gathered. This is hardly exciting– she could have done the same thing by mastering the elusive “left mouse click” technique. But is this all the download function will allow you to do, or is it just a failing of imagination on the part of this hypothetical teacher? I want to know– how much metadata will be downloaded when you use that download tool? What format will your data come in? Will it be a rich enough data set to let you really do something with it?

But second, and perhaps more excitingly, the Smithsonian Commons will have an API. Of course, that can mean a lot of things. Will this API be available to any developer who wants to incorporate Smithsonian resources into his or her own site, or is it an internal API that allows all the various SI museum sites and digital archives (which run on a variety of different CMSs) to interoperate and participate in the Commons? And if it is public– how expansive will it be? Some APIs are limited to highly specific functionalities, where others really let you get into the guts of the thing and really do something innovative. Which will this be?

People trust Google. Not everyone, of course, and as Jeff Jarvis has been pointing out a lot lately, a lot more Americans trust it than Europeans do. But ultimately, it’s a trusted company. They have access to everything on my phone, my email, they have access to 98% of my search activity… Normally, I’d say that anyone who trusted a profit-driven company that much was either crazy or stupid. And yet I do it. Why?

There are a couple of things. One is openness. Even before the Data Liberation Front initiative, Google was fairly good about letting me export my data. I can take my ball and go home, because they let me own my data, even if they also own my data.

But the other one– the really big one– is their commitment to not being evil. The adoption of the motto “Don’t be evil” was a step toward the creation of a certain type of culture– one that was constantly asking certain fundamental questions when coming into new projects– What does it mean to be evil? Is this new project evil? Can it be used for evil? Do its implications for malfeasant use overwhelm its potential for good or convenience?

Openness, like I said, is a moving target. What the Smithsonian needs to do, in approaching this project which has the potential to be really revolutionary, is to work on creating a similar culture, one that is always questioning openness. What does it mean to be open? How open can we be, here? Is this project being executed in the most open way possible?

For a publicly supported institution, openness should be seen as a moral obligation, a key element of the SI’s mission. Public institutions need to see “open” as the default, not the exception. And yet, looking through the SI’s web and new media strategy wiki, I don’t see that sort of discussion going on. The adjective “open” is used a lot, but there’s not as much grappling with what it means, or what it implies.

I hope none of this comes off as negative toward the Smithsonian or toward the Smithsonian Commons project. I think it’s a great idea. As the Jefferson Library’s Eric Johnson has pointed out, in some ways, Smithsonian 2.0 is really getting back to the organizational structures of Smithsonian 0.2. Under Spencer Baird’s tenure, the Smithsonian’s collections grew exponentially because of the crowdsourcing of knowledge in the form of specimens sent in by amateurs and hobbyists. Moreover, many of those doing the curation and gatekeeping during this period were, likewise, not exactly formally trained. They learned by doing– on-the-job training that taught how museums work by forcing you to make a museum work.

It’s natural that some museum workers– like many in academia– will be resistant to openness. After all, museums and universities are the great organizers of Knowledge. Their identity is often contingent upon their reputation for being able to separate wisdom from hokum, to selectively place that seal of approval on the true and disavow the false. And years and years of schooling and job experience are invested in credentialing, in the creation of the trust necessary to make such pronouncements authoritative and accurate. Openness can be seen as threatening to this, with its non-hierarchical structures, armchair experts, and “wisdom of crowds.” Working toward a truly open model for a project like the Smithsonian Commons is, in some ways, going to be an uphill battle. But the first step of that battle has to be changing the discourse, actually forcing people to discuss, tease out, and interrogate the principle of openness.

For the Smithsonian to move forward and remain relevant, not to mention remain true to its mission as a public institution, it needs to take a hard look at these questions when beginning a project with as much potential as the Smithsonian Commons.

The Smithsonian Folklife Festival began today. There is a quote from Secretary S. Dillon Ripley, under whose tenure at the SI the Folklife Festival began, that pertains just as much to the advent of the Smithsonian Commons as it does to the founding of the Festival: “Take the objects out of their cases and make them sing.”

The Smithsonian Commons is a project that could well have just that ability, to unbind the vast collective knowledge of the Smithsonian Institution and put it out there for the whole world to experience.

The question of openness can be reduced to this: you can take the objects out of their cases. But do you just want to put them in front of a worldwide public, or to put them in their hands?


Another call for Open Access

Just a couple days after I talked about Open Access book publishing, the incomparable danah boyd, in her blog, calls for the elimination of locked-down academic journals and databases.

I’m all for her idea– after all, we’re all accessing journals electronically at this point anyway, and server space is far cheaper and more flexible than the current publication/database model. I know that some academic libraries– including one of my former institutions– are so cash-strapped that all they can afford to do is maintain their database subscriptions, and have had to put book purchases on hold.

One question I would pose, however: the open-access model works much better going forward, as we look to our next publication. What kind of model can be made to replace or lower barriers to the “backwards-facing” (for lack of a better term) aspect of academic journal databases? Is there any way we can open up access to the vast array of old journal articles that are, to most of us, only accessible via databases like JSTOR, EBSCO, or Project MUSE?