Examining libraries, records management and emerging media trends
  • The wonderful world of datasets; 10 places to get datasets from all over the world

    Posted on January 4th, 2010 Bruce No comments

    Some researchers, analysts and writers prefer to collect their own data, designing surveys and creating brilliant experiments to do so. There is much to be learned in taking that road, to be sure. However, sometimes you just need some datasets to get the ball rolling and try out some research ideas. After discussing this issue of finding datasets with a friend who does economic research, I thought it might be useful to share some of the resources I looked up. Before I get into the list, you may be wondering why datasets matter or why the broader movement that has made this possible – open data – matters. Partly, I view it as an extension of that famous open source programming adage attributed to Eric Raymond, “Many eyeballs make all bugs shallow.” Many perspectives on a dataset can yield different kinds of value, and that is much better than the alternative: a researcher creates a dataset for one purpose and then puts it away where nobody else can ever make use of it. Further, in the case of public authorities, the quality and openness of data can indicate the attitude of that organization toward using data. Without further ado, here is the list of ten places to get datasets:

    1. Google’s H1N1 Flu Trends: This tool tracks how people searched for the illness and breaks it down by geography and other factors. This would be of interest to those in the public health field, for example.
    2. City of San Francisco Data: This is a treasure trove of data for advocates, researchers, bureaucrats and others. There is data on public transit, housing, crime and more.
    3. City of Toronto Data: I was excited to blog about the city’s efforts to build this last year and now it is finally up. The breadth of coverage looks broader than San Francisco; you can get data on licensed child care centres, parks and other areas. While transit data is here, there does not appear to be crime or policing data, alas.
    4. Google Transit Feed: Google has created a standard for the world’s public transit authorities to share their data and the results can be found here. Data is available from Washington DC, Vancouver, Cleveland, Perth (Australia) and other places besides.
    5. Comprehensive Knowledge Archive Network (CKAN): This is a general purpose archive with data on many different topics, including Afghanistan election data and data from the British Antarctic Survey.
    6. Infochimps: Find Any Dataset in the World: (Note: not all data here is free.) All kinds of data here, including data on Twitter, population statistics for US states, and 65 datasets about income.
    7. DBpedia: This project seeks to process the content of Wikipedia into a database format that people can query.
    8. Freebase: Offers datasets on a variety of topics, including recreational / pop culture subjects.
    9. National Longitudinal Surveys (NLS): Created by the US Bureau of Labor Statistics, this is the place to go to learn about the US workforce.
    10. European Social Survey: Created by several universities across Europe, the ESS has data on income, population and other typical qualities of interest to researchers.
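
    To give a flavour of how easy some of these sources are to work with, here is a minimal Python sketch for the transit feeds in item 4. It assumes the feed follows the published transit feed layout, in which stops are listed in a CSV file called stops.txt with the columns stop_id, stop_name, stop_lat and stop_lon. The sample rows, the load_stops and bounding_box helper names, and the coordinates are all invented for illustration; a real feed would be read from the file the transit authority publishes.

```python
import csv
import io

# Hypothetical sample in the shape of a transit feed's stops.txt file.
# These rows are invented for illustration, not taken from any real feed.
SAMPLE_STOPS = """stop_id,stop_name,stop_lat,stop_lon
1001,Main St at 1st Ave,43.6500,-79.3800
1002,Main St at 2nd Ave,43.6510,-79.3790
1003,Union Station,43.6453,-79.3806
"""


def load_stops(text):
    """Parse stops.txt-style CSV text into a list of dicts with numeric coordinates."""
    stops = []
    for row in csv.DictReader(io.StringIO(text)):
        stops.append({
            "id": row["stop_id"],
            "name": row["stop_name"],
            "lat": float(row["stop_lat"]),
            "lon": float(row["stop_lon"]),
        })
    return stops


def bounding_box(stops):
    """Return (min_lat, min_lon, max_lat, max_lon) covering every stop."""
    lats = [s["lat"] for s in stops]
    lons = [s["lon"] for s in stops]
    return (min(lats), min(lons), max(lats), max(lons))


stops = load_stops(SAMPLE_STOPS)
print(len(stops), "stops loaded")
print("Bounding box:", bounding_box(stops))
```

    With a real feed, replacing SAMPLE_STOPS with the contents of the downloaded stops.txt should be the only change needed, though individual authorities may include extra optional columns.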

    Did I miss some important datasets? If so, please feel free to share your ideas in a comment. I am particularly curious to know what experience people have had with these tools and how easy or difficult they are to work with.

  • New blog directions

    Posted on December 29th, 2009 Bruce No comments

    This blog is over six months old and I want to take it in some new directions. Specifically, I’m thinking of starting some new features on different topics. There are a few ideas I have in mind. One is to do some profiles of people in the profession who are doing interesting things that deserve to be heard more widely. Another idea is to chronicle my learning strategies now that I’m out of university. For example, I’ve been reading books on economics since mid-2008 with great interest and I’d like to share a little bit of what I’ve learned here. My plans for the blog continue to evolve, but I think I am still driven by two connected ideas: sharing what I learn and digging deeper into the world of information and the people that make it up.

  • Becoming an Internet Librarian: an article in OLA’s magazine

    Posted on November 6th, 2009 Bruce No comments

    Several months ago, I submitted an article to Access, the magazine of the Ontario Library Association. My article – Becoming an Internet Librarian – is a blend of autobiography and my reflections on different aspects of librarianship. As time goes on, I find more and more ways that librarians contribute to their organizations and to society generally. Enjoy the article; I’d love to hear any comments you have.

  • Research on Library Customer Service: we know how to satisfy your info needs

    Posted on October 13th, 2009 Bruce 2 comments

    When I approach a cashier to make a purchase, I hope to conclude the transaction smoothly and quickly. Few things irritate me more in that context than trying to get the cashier’s attention during the process. I have had several experiences where I am poised to pay for an item, only to find the cashier engaged in an apparently irrelevant conversation with a co-worker. I feel like saying, “I’m a paying customer, so can you please focus on me so I can pay?” I find that a frustrating experience. Maybe retail staff are simply not motivated enough to deliver quality service? I think part of the problem is that proper customer service skills are simply handed down as orders, rather than explained. I, for one, think it is better to explain rules and show how they make a positive contribution. [Conversely, if a rule cannot be explained or does not make a positive contribution, then maybe it should be reconsidered!]

    I am happy to report that librarians know both the how and why of customer service. In preparation for my training at AskON tomorrow, I was asked to read an article called, “The effects of librarians’ behavioral performance on user satisfaction in chat reference services,” by Nahyun Kwon and Vicki L. Gregory published in Reference & User Services Quarterly (RUSQ). In brief, the authors analyzed transcripts from Internet chat reference sessions to determine whether or not compliance with reference librarian guidelines increased user satisfaction. The answer is yes. What makes users satisfied? The authors found six behaviours to be particularly important:

    • used the patron’s name during the reference interview;
    • communicated more receptively and listened more carefully;
    • searched with or for the patron;
    • provided pointers;
    • asked the patron whether the question was completely answered; and
    • asked the patron to come back if they needed further assistance.

    Those all sound like good practices. The third item reminds me of research on approachability I heard about in grad school, which claimed that library patrons think librarians are more approachable when they are up and about helping somebody, rather than seated at a desk. I have encountered some of these behaviours when on the phone with various companies and generally find they strike a good note.

  • The Non-Durability of the Web: Yahoo! discontinues GeoCities

    Posted on October 5th, 2009 Bruce No comments

    At the end of October 2009, GeoCities will cease to exist when Yahoo! pulls the plug on the service that helped host many personal websites back in the 1990s. All is not lost, however: the Internet Archive is undertaking an effort to store as much of this material as it can, but it is unclear how it is doing that or if it should. In many ways, GeoCities is past its prime, but there is still plenty of interesting content there and it shows how the early days of the Web operated. Comparing Web usage from 1997 to 2007 could make for an interesting research project, I imagine.

    The Internet Archive’s efforts to ‘archive’ GeoCities, however impressive, cannot be considered “archiving” in the professional or classic sense. The relatively low cost of data storage (low is not the same as zero or free though) seduces some to think that simply everything produced should be stored – it would appear that is how the IA is ‘archiving’ GeoCities content. Simply making copies of as much content as possible is not what I would call an archive. Imagine an organization is moving out of a building and setting up elsewhere. Would it make sense to “archive” every single piece of paper and data in the office? Such an archive would be both large and difficult to use. What criteria are being used to determine what should be kept? What about the people who created the GeoCities content?

    The article linked to above raises some interesting questions about the durability of data on the Internet. Institutions such as universities and libraries have, in many cases, existed for decades or centuries. It is unlikely that a dissertation or other valuable item at Harvard or Oxford will be in danger of loss (though fire, flooding and other disasters are always a possibility), but is that the case for free Web services? If the example of the sudden death of GeoCities is any indication, then one has some reason to be skeptical about the longevity of data in cloud applications such as Gmail. Maybe there will be a move, at some point, to introduce “data longevity” standards into user agreements? Maybe such guarantees will serve to differentiate free services from paid ones? In the final analysis, it may only be traditional archives that can be counted on to archive content professionally and retain it for the long term.

  • Banned Books Week in the US

    Posted on October 2nd, 2009 Bruce 2 comments

    It is Banned Books Week in the United States (September 26 – October 3), always a good opportunity to recall the importance of the profession’s commitment to freedom of expression and intellectual freedom more generally.

    According to the Banned Books Week website, the tradition started in the early 1980s and has only grown since. The American Library Association has also put together a good Banned Books website. I have often thought that a course dedicated to reading banned books would be a great educational experience. In fact, some books that have been challenged frequently in the USA (e.g. To Kill A Mockingbird and The Lord of the Flies) were required reading in my English classes, while I read others (e.g. 1984 by George Orwell and Brave New World by Aldous Huxley) for book reports with the enthusiastic support of teachers. We like dystopian fiction here in Canada: “The Handmaid’s Tale” by Margaret Atwood was assigned reading and also good. I am also delighted to report that Google Books has put together a Banned Books website too. In reading through the list of books mentioned by ALA, I was surprised to see that the education authorities in Toronto sought to ban “The Lord of the Flies” in 1988.

  • Princeton students & Cory Doctorow disapprove of ebooks and e-readers

    Posted on September 30th, 2009 Bruce No comments

    There has been some interesting activity in the ebook sector lately that I have been following closely. For the time being, I think that e-readers are best suited for reading journalism (e.g. I often read the New York Times and the Globe & Mail on my iPhone), but less suited for longer form works. Perhaps my views would change if I used the Kindle, though I think it unlikely; in any case, Amazon has yet to make it available outside the United States.

    Cory Doctorow, noted for providing his books for free through his website (as well as writing great columns on technology and blogging at BoingBoing), observed in a recent interview that he doesn’t think novels work in e-readers. He argues that the sort of sustained reading necessary in the case of novels doesn’t work well with ebook readers. I would tend to agree, but somehow, I still want e-readers (with print on demand publishing existing to give them competition) to succeed for pragmatic reasons. Packing 5-10 books into luggage for a trip is difficult and heavy.

    Likewise, students at Princeton University have not embraced the Kindle enthusiastically. As the Daily Princetonian reports in “Kindles yet to woo University users,” students are not pleased with the device. One student summarized his experience by saying: “It’s clunky, slow and a real pain to operate.” One particularly interesting objection to use of the Kindle in the academic or research context is the lack of page numbers, which makes it difficult to cite passages. I don’t know exactly which audience Amazon had in mind when developing the device, but the needs of research and academic users do not appear to be well served thus far. Despite the problems, I think Princeton is to be commended for its efforts to run a pilot test of the technology.

  • “A tale of two countries’ libraries”: Canada’s libraries doing much better than U.S. counterparts

    Posted on September 22nd, 2009 Bruce No comments

    This September 20, 2009 article in the Toronto Star – A tale of two countries’ libraries – is a great read. It shows how successful Canada’s public libraries are and the quotes from Faculty of Information Senior Fellow Wendy Newman are not to be missed. I count it a blessing that no Canadian library I know of is in danger of being shut down. Here’s a quote to get you started on the article:

    Contrary to what you might have heard, libraries are not in a terminal state of decline, “they’re not even sick,” says Wendy Newman, a senior fellow at the University of Toronto’s faculty of information, formerly library sciences, now known as the “I School.”

    “Libraries are back big-time, they’re having a renaissance.”

    Circulation was up 27 per cent this summer across Ontario’s 330 systems and 1,000 branches. Toronto, already the largest system in the world with 99 branches, is expanding with two more.

    “We’re not intimidated by the future at all,” laughs Shelagh Paterson, executive director of the Ontario Library Association.

    This is good news by any measure and it is great to see these facts acknowledged in the Toronto Star.

  • The Compact for Open Access Publishing Equity

    Posted on September 18th, 2009 Bruce No comments

    This week witnessed a major development in open access as a number of major American universities agreed to support open access through the Compact for Open-Access Publishing Equity. As I understand it, this move would mean more support for projects such as the Public Library of Science journals, where funding comes from author fees (according to an article referred to below, such fees are quite rare) or other types of support, rather than subscription fees to a conventional publisher.

    Here is a quote from the Compact’s website:

    Scholarly publishing is going through a transformation as a result of digital means of communication, coupled with the financial predicament of libraries. With the most recent economic downturn, access to scholarly articles, so important to research progress and public advancement, will no doubt suffer.

    Open-access scholarly journals have arisen as an alternative to traditional subscription scholarly journals. Open-access journals make their articles available freely to anyone, while providing the same services common to all scholarly journals, such as management of the peer-review process, filtering, production, and distribution. Since open-access journals do not charge subscription or other access fees, they must cover their operating expenses through other sources, including subventions, in-kind support, or, in a sizable minority of cases, processing fees paid by or on behalf of authors for submission to or publication in the journal.

    You can also read an open access article that explains this development in further detail: Equity for Open-Access Journal Publishing. I love the idea of open access and the potential it offers to support greater learning and scholarship. The article linked to above delves into the economics of open access, including the problem of moral hazard. Currently, five universities are signatories to the Compact: Harvard, Dartmouth, Cornell, MIT and UC Berkeley.

  • 2009 Library and Archives Canada Consultation ends tomorrow

    Posted on September 17th, 2009 Bruce No comments

    Library and Archives Canada is consulting with the library community. To view the survey and participate, consult the website that the Ontario Library Association (OLA) has created for the consultation. The consultation document does not appear to require membership in the OLA specifically. It is good to see the organization consult as it charts new directions. Daniel J. Caron, Librarian and Archivist of Canada, has also written a letter describing the challenges the organization faces and some of its plans for the future.

    I would like to see LAC (or some other organization in Canada) consider copying some of the great ideas implemented in the United States and UK. Let’s look at a few quick examples:

    Canada need not copy those projects specifically, but they are a good starting point if one is looking for inspiration.