Library Hat Rotating Header Image


Blockchain: Merits, Issues, and Suggestions for Compelling Use Cases

* This post was also published in ACRL TechConnect.***

Blockchain holds a great potential for both innovation and disruption. The adoption of blockchain also poses certain risks, and those risks will need to be addressed and mitigated before blockchain becomes mainstream. A lot of people have heard of blockchain at this point. But many are unfamiliar with how this new technology exactly works and unsure about under which circumstances or on what conditions it may be useful to libraries.

In this post, I will provide a brief overview of the merits and the issues of blockchain. I will also make some suggestions for compelling use cases of blockchain at the end of this post.

What Blockchain Accomplishes

Blockchain is the technology that underpins a well-known decentralized cryptocurrency, Bitcoin. To simply put, blockchain is a kind of distributed digital ledger on a peer-to-peer (P2P) network, in which records are confirmed and encrypted. Blockchain records and keeps data in the original state in a secure and tamper-proof manner[1] by its technical implementation alone, thereby obviating the need for a third-party authority to guarantee the authenticity of the data. Records in blockchain are stored in multiple ledgers in a distributed network instead of one central location. This prevents a single point of failure and secures records by protecting them from potential damage or loss. Blocks in each blockchain ledger are chained to one another by the mechanism called ‘proof of work.’ (For those familiar with a version control system such as Git, a blockchain ledger can be thought of as something similar to a P2P hosted git repository that allows sequential commits only.[2]) This makes records in a block immutable and irreversible, that is, tamper-proof.

In areas where the authenticity and security of records is of paramount importance, such as electronic health records, digital identity authentication/authorization, digital rights management, historic materials that may be contested or challenged due to the vested interests of certain groups, and digital provenance to name a few, blockchain can lead to efficiency, convenience, and cost savings.

For example, with blockchain implemented in banking, one will be able to transfer funds across different countries without going through banks.[3] This can drastically lower the fees involved, and the transaction will take effect much more quickly, if not immediately. Similarly, adopted in real estate transactions, blockchain can make the process of buying and selling a property more straightforward and efficient, saving time and money.[4]

Disruptive Potential of Blockchain

The disruptive potential of blockchain lies in its aforementioned ability to render the role of a third-party authority obsolete, which records and validates transactions and guarantees their authenticity, should a dispute arise. In this respect, blockchain can serve as an alternative trust protocol that decentralizes traditional authorities. Since blockchain achieves this by public key cryptography, however, if one loses one’s own personal key to the blockchain ledger holding one’s financial or real estate asset, for example, then that will result in the permanent loss of such asset. With the third-party authority gone, there will be no institution to step in and remedy the situation.


This is only some of the issues with blockchain. Other issues include (a) interoperability between different blockchain systems, (b) scalability of blockchain at a global scale with large amount of data, (c) potential security issues such as the 51% attack [5], and (d) huge energy consumption [6] that a blockchain requires to add a block to a ledger. Note that the last issue of energy consumption has both environmental and economic ramifications because it can cancel out the cost savings gained from eliminating a third-party authority and related processes and fees.

Challenges for Wider Adoption

There are growing interests in blockchain among information professionals, but there are also some obstacles to those interests gaining momentum and moving further towards wider trial and adoption. One obstacle is the lack of general understanding about blockchain in a larger audience of information professionals. Due to its original association with bitcoin, many mistake blockchain for cryptocurrency. Another obstacle is technical. The use of blockchain requires setting up and running a node in a blockchain network, such as Ethereum[7], which may be daunting to those who are not tech-savvy. This makes a barrier to entry high to those who are not familiar with command line scripting and yet still want to try out and test how a blockchain functions.

The last and most important obstacle is the lack of compelling use cases for libraries, archives, and museums. To many, blockchain is an interesting new technology. But even many blockchain enthusiasts are skeptical of its practical benefits at this point when all associated costs are considered. Of course, this is not an insurmountable obstacle. The more people get familiar with blockchain, the more ways people will discover to use blockchain in the information profession that are uniquely beneficial for specific purposes.

Suggestions for Compelling Use Cases of Blockchain

In order to determine what may make a compelling use case of blockchain, the information profession would benefit from considering the following.

  1. What kind of data/records (or the series thereof) must be stored and preserved exactly the way they were created.
  2. What kind of information is at great risk to be altered and compromised by changing circumstances.
  3. What type of interactions may need to take place between such data/records and their users.[8]
  4. How much would be a reasonable cost for implementation.

These will help connecting the potential benefits of blockchain with real-world use cases and take the information profession one step closer to its wider testing and adoption. To those further interested in blockchain and libraries, I recommend the recordings from the Library 2.018 online mini-conference, “Blockchain Applied: Impact on the Information Profession,” held back in June. The Blockchain National Forum, which is funded by IMLS and is to take place in San Jose, CA on August 6th, will also be livestreamed.


[1] For an excellent introduction to blockchain, see “The Great Chain of Being Sure about Things,” The Economist, October 31, 2015,

[2] Justin Ramos, “Blockchain: Under the Hood,” ThoughtWorks (blog), August 12, 2016,

[3] The World Food Programme, the food-assistance branch of the United Nations, is using blockchain to increase their humanitarian aid to refugees. Blockchain may possibly be used for not only financial transactions but also the identity verification for refugees. Russ Juskalian, “Inside the Jordan Refugee Camp That Runs on Blockchain,” MIT Technology Review, April 12, 2018,

[4] Joanne Cleaver, “Could Blockchain Technology Transform Homebuying in Cook County — and Beyond?,” Chicago Tribune, July 9, 2018,

[5] “51% Attack,” Investopedia, September 7, 2016,

[6] Sherman Lee, “Bitcoin’s Energy Consumption Can Power An Entire Country — But EOS Is Trying To Fix That,” Forbes, April 19, 2018,

[7] Osita Chibuike, “How to Setup an Ethereum Node,” The Practical Dev, May 23, 2018,

[8] The interaction can also be a self-executing program when certain conditions are met in a blockchain ledger. This is called a “smart contract.” See Mike Orcutt, “States That Are Passing Laws to Govern ‘Smart Contracts’ Have No Idea What They’re Doing,” MIT Technology Review, March 29, 2018,

Using the Stripe API to Collect Library Fines by Accepting Online Payments

*** This post was originally published in ACRL TechConnect on Sep. 10, 2014.***

Recently, my library has been considering accepting library fines via online. Currently, many library fines of a small amount that many people owe are hard to collect. As a sum, the amount is significant enough. But each individual fines often do not warrant even the cost for the postage and the staff work that goes into creating and sending out the fine notice letter. Libraries that are able to collect fines through the bursar’s office of their parent institutions may have a better chance at collecting those fines. However, others can only expect patrons to show up with or to mail a check to clear their fines. Offering an online payment option for library fines is one way to make the library service more user-friendly to those patrons who are too busy to visit the library in person or to mail a check but are willing to pay online with their credit cards.

If you are new to the world of online payment, there are several terms you need to become familiar with. The following information from the article in SixRevisions is very useful to understand those terms.1

  • ACH (Automated Clearing House) payments: Electronic credit and debit transfers. Most payment solutions use ACH to send money (minus fees) to their customers.
  • Merchant Account: A bank account that allows a customer to receive payments through credit or debit cards. Merchant providers are required to obey regulations established by card associations. Many processors act as both the merchant account as well as the payment gateway.
  • Payment Gateway: The middleman between the merchant and their sponsoring bank. It allows merchants to securely pass credit card information between the customer and the merchant and also between merchant and the payment processor.
  • Payment Processor: A company that a merchant uses to handle credit card transactions. Payment processors implement anti-fraud measures to ensure that both the front-facing customer and the merchant are protected.
  • PCI (the Payment Card Industry) Compliance: A merchant or payment gateway must set up their payment environment in a way that meets the Payment Card Industry Data Security Standard (PCI DSS).

Often, the same company functions as both payment gateway and payment processor, thereby processing the credit card payment securely. Such a product is called ‘Online payment system.’ Meyer’s article I have cited above also lists 10 popular online payment systems: Stripe, Authorize.Net, PayPal, Google Checkout, Amazon Payments, Dwolla, Braintree, Samurai by FeeFighters, WePay, and 2Checkout. Bear in mind that different payment gateways, merchant accounts, and bank accounts may or may not work together, your bank may or may not work as a merchant account, and your library may or may not have a merchant account. 2

Also note that there are fees in using online payment systems like these and that different systems have different pay structures. For example, has the $99 setup fee and then charges $20 per month plus a $0.10 per-transaction fee. Stripe charges 2.9% + $0.30 per transaction with no setup or monthly fees. Fees for mobile payment solutions with a physical card reader such as Square may go up much higher.

Among various online payment systems, I picked Stripe because it was recommended on the Code4Lib listserv. One of the advantages for using Stripe is that it acts as both the payment gateway and the merchant account. What this means is that your library does not have to have a merchant account to accept payment online. Another big advantage of using Stripe is that you do not have to worry about the PCI compliance part of your website because the Stripe API uses a clever way to send the sensitive credit card information over to the Stripe server while keeping your local server, on which your payment form sits, completely blind to such sensitive data. I will explain this in more detail later in this post.

Below I will share some of the code that I have used to set up Stripe as my library’s online payment option for testing. This may be of interest to you if you are thinking about offering online payment as an option for your patrons or if you are simply interested in how an online payment API works. Even if your library doesn’t need to collect library fines via online, an online payment option can be a handy tool for a small-scale fund-raising drive or donation.

The first step to take to make Stripe work is getting an API keys. You do not have to create an account to get API keys for testing. But if you are going to work on your code more than one day, it’s probably worth getting an account. Stripe API has excellent documentation. I have read ‘Getting Started’ section and then jumped over to the ‘Examples’ section, which can quickly get you off the ground. ( I found an example by Daniel Schröter in GitHub from the list of examples in the Stripe’s Examples section and decided to test out. ( Most of the time, getting an example code requires some probing and tweaking such as getting all the required library downloaded and sorting out the paths in the code and adding API keys. This one required relatively little work.

Now, let’s take a look at the form that this code creates.


In order to create a form of my own for testing, I decided to change a few things in the code.

  1. Add Patron & Payment Details.
  2. Allow custom amount for payment.
  3. Change the currency from Euro to US dollars.
  4. Configure the validation for new fields.
  5. Hide the payment form once the charge goes through instead of showing the payment form below the payment success message.


4. can be done as follows. The client-side validation is performed by Bootstrapvalidator jQuery Plugin. So you need to get the syntax correct to get the code, which now has new fields, to work properly.

This is the Javascript that allows you to send the data submitted to your payment form to the Stripe server. First, include the Stripe JS library (line 24). Include JQuery, Bootstrap, Bootstrap Form Helpers plugin, and Bootstrap Validator plugin (line 25-28). The next block of code includes an event handler for the form, which send the payment information to the Stripe via AJAX when the form is submitted. Stripe will validate the payment information and then return a token that identifies this particular transaction.


When the token is received, this code calls for the function, stripeResponseHandler(). This function, stripeResponseHandler() checks if the Stripe server did not return any error upon receiving the payment information and, if no error has been returned, attaches the token information to the form and submits the form.


The server-side PHP script then checks if the Stripe token has been received and, if so, creates a charge to send it to Stripe as shown below. I am using PHP here, but Stripe API supports many other languages than PHP such as Ruby and Python. So you have many options. The real payment amount appears here as part of the charge array in line 326. If the charge succeeds, the payment success message is stored in a div to be displayed.


The reason why you do not have to worry about the PCI compliance with Stripe is that Stripe API asks to receive the payment information via AJAX and the input fields of sensitive information does not have the name attribute and value. (See below for the Card Holder Name and Card Number information as an example; Click to bring up the clear version of the image.)  By omitting the name attribute and value, the local server where the online form sits is deprived of any means to retrieve the information in those input fields submitted through the form. Since sensitive information does not touch the local server at all, PCI compliance for the local server becomes no concern. To clarify, not all fields in the payment form need to be deprived of the name attribute. Only the sensitive fields that you do not want your web server to have access to need to be protected this way. Here, for example, I am assigning the name attribute and value to fields such as name and e-mail in order to use them later to send a e-mail receipt.

(NB. Please click images to see the enlarged version.)

Screen Shot 2014-08-17 at 8.01.08 PM

Now, the modified form has ‘Fee Category’, custom ‘Payment Amount,’ and some other information relevant to the billing purpose of my library.


When the payment succeeds, the page changes to display the following message.


Stripe provides a number of fake card numbers for testing. So you can test various cases of failures. The Stripe website also displays all payments and related tokens and charges that are associated with those payments. This greatly helps troubleshooting. One thing that I noticed while troubleshooting is that Stripe logs sometimes do lag behind. That is, when a payment would succeed, associated token and charge may not appear under the “Logs” section immediately. But you will see the payment shows up in the log. So you will know that associated token and charge will eventually appear in the log later.


Once you are ready to test real payment transactions, you need to flip the switch from TEST to LIVE located on the top left corner. You will also need to replace your API keys for ‘TESTING’ (both secret and public) with those for ‘LIVE’ transaction. One more thing that is needed before making your library getting paid with real money online is setting up SSL (Secure Sockets Layer) for your live online payment page. This is not required for testing but necessary for processing live payment transactions. It is not a very complicated work. So don’t be discouraged at this point. You just have to buy a security certificate and put it in your Web server. Speak to your system administrator for how to get the SSL set up for your payment page. More information about setting up SSL can be found in the Stripe documentation I just linked above.

My library has not yet gone live with this online payment option. Before we do, I may make some more modifications to the code to fit the staff workflow better, which is still being mapped out. I am also planning to place the online payment page behind the university’s Shibboleth authentication in order to cut down spam and save some tedious data entry by library patrons by getting their information such as name, university email, student/faculty/staff ID number directly from the campus directory exposed through Shibboleth and automatically inserting it into the payment form fields.

In this post, I have described my experience of testing out the Stripe API as an online payment solution. As I have mentioned above, however, there are many other online payment systems out there. Depending your library’s environment and financial setup, different solutions may work better than others. To me, not having to worry about the PCI compliance by using Stripe was a big plus. If your library accepts online payment, please share what solution you chose and what factors led you to the particular online payment system in the comments.

* This post has been based upon my recent presentation, “Accepting Online Payment for Your Library and ‘Stripe’ as an Example”, given at the Code4Lib DC Unconference. See the slides  below..

  1. Meyer, Rosston. “10 Excellent Online Payment Systems.” Six Revisions, May 15, 2012.
  2. Ullman, Larry. “Introduction to Stripe.” Larry Ullman, October 10, 2012.

Query a Google Spreadsheet like a Database with Google Visualization API Query Language

***  This post was originally published in ACRL TechConnect on Dec. 4, 2013. ***

Libraries make much use of spreadsheets. Spreadsheets are easy to create, and most library staff are familiar with how to use them. But they can quickly get unwieldy as more and more data are entered. The more rows and columns a spreadsheet has, the more difficult it is to browse and quickly identify specific information. Creating a searchable web application with a database at the back-end is a good solution since it will let users to quickly perform a custom search and filter out unnecessary information. But due to the staff time and expertise it requires, creating a full-fledged searchable web database application is not always a feasible option at many libraries.

Creating a MS Access custom database or using a free service such as Zoho can be an alternative to creating a searchable web database application. But providing a read-only view for MS Access database can be tricky although possible. MS Access is also software locally installed in each PC and therefore not necessarily available for the library staff when they are not with their work PCs on which MS Access is installed. Zoho Creator offers a way to easily convert a spreadsheet into a database, but its free version service has very limited features such as maximum 3 users, 1,000 records, and 200 MB storage.

Google Visualization API Query Language provides a quick and easy way to query a Google spreadsheet and return and display a selective set of data without actually converting a spreadsheet into a database. You can display the query result in the form of a HTML table, which can be served as a stand-alone webpage. All you have to do is to construct a custom URL.

A free version of Google spreadsheet has a limit in size and complexity. For example, one free Google spreadsheet can have no more than 400, 000 total cells. But you can purchase more Google Drive storage and also query multiple Google spreadsheets (or even your own custom databases) by using Google Visualization API Query Language and Google Chart Libraries together. (This will be the topic of my next post. You can also see the examples of using Google Chart Libraries and Google Visualization API Query Language together in my presentation slides at the end of this post.)

In this post, I will explain the parameters of Google Visualization API Query Language and how to construct a custom URL that will query, return, and display a selective set of data in the form of an HTML page.

A. Display a Google Spreadsheet as an HTML page

The first step is to identify the URL of the Google spreadsheet of your choice.

The URL below opens up the third sheet (Sheet 3) of a specific Google spreadsheet. There are two parameters you need to pay attention inside the URL: key and gid.

This breaks down the parameters in a way that is easier to view:



Key is a unique identifier to each Google spreadsheet. So you need to use that to cretee a custom URL later that will query and display the data in this spreadsheet. Gid specifies which sheet in the spreadsheet you are opening up. The gid for the first sheet is 0; the gid for the third sheet is 2.

Screen Shot 2013-11-27 at 9.44.29 AM

Let’s first see how Google Visualization API returns the spreadsheet data as a DataTable object. This is only for those who are curious about what goes on behind the scenes. You can see that for this view, the URL is slightly different but the values of the key and the gid parameter stay the same.

Screen Shot 2013-11-27 at 9.56.03 AM

In order to display the same result as an independent HTML page, all you need to do is to take the key and the gid parameter values of your own Google spreadsheet and construct the custom URL following the same pattern shown below.


Screen Shot 2013-11-27 at 9.59.11 AM

By the way, if the URL you created doesn’t work, it is probably because you have not encoded it properly. Try this handy URL encoder/decoder page to encode it by hand or you can use JavaScript encodeURIComponent() function.
Also if you want the URL to display the query result without people logging into Google Drive first, make sure to set the permission setting of the spreadsheet to be public. On the other hand, if you need to control access to the spreadsheet only to a number of users, you have to remind your users to first go to Google Drive webpage and log in with their Google account before clicking your URLs. Only when the users are logged into Google Drive, they will be able see the query result.

B. How to Query a Google Spreadsheet

We have seen how to create a URL to show an entire sheet of a Google spreadsheet as an HTML page above. Now let’s do some querying, so that we can pick and choose what data the table is going to display instead of the whole sheet. That’s where the Query Language comes in handy.

Here is an example spreadsheet with over 50 columns and 500 rows.


Screen Shot 2013-11-27 at 10.15.41 AM

What I want to do is to show only column B, C, D, F where C contains ‘Florida.’ How do I do this? Remember the URL we created to show the entire sheet above?


There we had no value for the tq parameter. This is where we insert our query.

Google Visualization API Query Language is pretty much the same as SQL. So if you are familiar with SQL, forming a query is dead simple. If you aren’t SQL is also easy to learn.

  • The query should be written like this:
  • After encoding it properly, you get something like this:
  • Add it to the tq parameter and don’t forget to also specify the key:

I am omitting the gid parameter here because there is only one sheet in this spreadsheet but you can add it if you would like. You can also omit it if the sheet you want is the first sheet. Ta-da!

Screen Shot 2013-11-27 at 10.26.13 AM

Compare this with the original spreadsheet view. I am sure you can appreciate how the small effort put into creating a URL pays back in terms of viewing an unwieldy large spreadsheet manageable.

You can also easily incorporate functions such as count() or sum() into your query to get an overview of the data you have in the spreadsheet.

  • select D,F count(C) where (B contains ‘author name’) group by D, F

For example, this query above shows how many articles a specific author published per year in each journal. The screenshot of the result is below and you can see it for yourself here:,F,count(C)+where+%28B+contains+%27Agoulnik%27%29+group+by+D,F&key=0AqAPbBT_k2VUdEtXYXdLdjM0TXY1YUVhMk9jeUQ0NkE

Screen Shot 2013-11-27 at 11.34.25 AM

Take this spread sheet as another example.


This simple query below displays the library budget by year. For those who are unfamiliar with ‘pivot‘, pivot table is a data summarization tool. The query below asks the spreadsheet to calculate the total of all the values in the B column (Budget amount for each category) by the values found in the C column (Years).

Screen Shot 2013-11-27 at 11.46.49 AM

This is another example of querying the spreadsheet connected to my library’s Literature Search request form. The following query asks the spreadsheet to count the number of literature search requests by Research Topic (=column I) that were received in 2011 (=column G) grouped by the values in the column C, i.e. College of Medicine Faculty or College of Medicine Staff.

  • select C, count(I) where (G contains ‘2011’) group by C


C. More Querying Options

There are many more things you can do with a custom query. Google has an extensive documentation that is easy to follow:

These are just a few examples.

    : Order the results in the descending order of the column of your choice. Without ‘DESC,’ the result will be listed in the ascending order.
  • LIMIT 5
    : Limit the number of results. Combined with ‘Order by’ you can quickly filter the results by the most recent or the oldest items.

My presentation slides given at the 2013 LITA Forum below includes more detailed information about Google Visualization API Query Language, parameters, and other options as well as how to use Google Chart Libraries in combination with Google Visualization API Query Language for data visualization, which is the topic of my next post.

Happy querying Google Spreadsheet!


Do You Feel Inadequate? For Hard-Working Overachievers

I have not been a very diligent blog writer in the year of 2013. So on the last day of 2013, I decided to write a post. I thought about writing a post with the title of “Adieu 2013!” but then I changed my mind. So here is a more sensational title, “Do you feel inadequate?”

Inadequacy is an interesting feeling. While I often felt inadequate as a college and grad student in academia and sometimes as a librarian in the libraryland, I was also pretty convinced that I was quite smart and bright. (Don’t throw stones at me just yet; Look what I said is ‘was’!) So how do you feel smart and inadequate at the same time? The answer is actually quite simple. You feel inadequate because you are smart enough to know that there are smarter people than you. So feeling inadequate itself is not a bad thing. But being unable to set the goal for you to overcome that feeling is a problem. Being unable to be at peace with the fact that there will always be people who are more bright and talented than you is a problem.

I wish I realized this, much earlier in my life. But it should still count just the same. I do no longer believe that I am particularly bright nor that I am seriously inadequate. I am just OK. That’s really not bad at all. I think that actually, this is a great place to be – knowing that I am OK to be who I am. This may sound mundane to many. But I guarantee you that if you are one of those exceptionally bright and successful over-achievers with little experience in failure then this would be a particularly hard belief for you to subscribe to.

[ADDED: I also want to emphasize that smart or brilliant is not an innate quality. You have to work on it to become smart or brilliant. So if you do work on things you want to become smart about, you will become smart. Believe me and go for it. On the other hand, you also need to ‘actively decide’ on what boundary you will set up in pushing yourself towards smartness or brilliance because there will be certain things you want to preserve such as sanity and work-life balance. For example, are you willing to sacrifice your 2 hours at a gym everyday and spend that time instead for getting smarter? How about sleeping only 4 hours a day and use the rest of it on some project you love? It will work, but maybe that is or is not what you really want. (Also consider if it will be a long-term or a short-term thing.) And so, you will be less smart than those who take those measures. So you are just OK. But it is you who decide to be so! No hard feelings, right?]

(On the other hand, if you have had this belief that you are just OK in your entire career and have never been discontent with yourself, maybe that is a clear warning sign. Don’t be a seat-warmer and go find a challenge that excites your librarian heart!)

Back in January, 2013. I read this blog post by Miss July, “ego, thy name is librarianship”. Many librarians shared the angst of wanting to be recognized widely and quickly for their hard work and intelligence. But the thing is, a lot of times, what makes someone recognized is chance and luck more than anything. Sometimes, a project you put a lot of work into and is completely worthy of others’ acclaim will go unnoticed. At other times, something you haphazardly put together to meet a deadline may make you famous! When I was a babybrarian, one of my friends, Will, who was a few years senior to me in being a librarian, asked if I recognized someone’s name. I had zero idea who that person was. But during his LIS days, that person was beyond famous in the libraryland, I was told.

I am not saying that luck and chance are important to fame than hard work and intelligence to dismiss the famous ‘and’ brilliant ‘and’ hard-working people in the libraryland, whom I  admire. I just want to point out that hard work and brilliance is only a sufficient condition for being a good librarian and professional, not for gaining fame. If you are aiming at the former, your hard work and brilliance will be more than rewarded. If you are aiming at the latter, on the other hand, achieving that will be way trickier since it is mostly up to others. I am simply sharing this to help other bright librarians who are struggling with the work-life balance and the feeling of inadequacy in spite of hard work and many achievements.

I also want all LIS students and grads in the library job market to know that something similar applies to the hiring process. I used to believe that only the best candidate with the most achievements gets hired. (Just like the grades given on an absolute scale!) But this naive belief is simply not true. From serving on multiple search committees at academic libraries, I have learned that candidates’ applications, resume/CVs, and cover letters are evaluated on a relative, not an absolute scale. And each organization has different priorities, specific needs, and most of all, unique individuals on their search committees or as hiring officials at any given time. You need to be the right fit for a given position at a given time and at a specific place. Being that right fit requires a lot of luck and chance beyond your many achievements. I also saw many cases in which some candidates whom the search committee I belonged to rejected but who found jobs that are just as good as or even better than what we had to offer. So no need to fall into despair by a rejection letter. You just have to try a little longer until you get selected.

From Flickr Creative Commons by Mari Z

If I were wiser, perhaps I would have written “How to Overcome Your Feeling of Inadequacy in Five Easy Steps(!)” instead of “Do You Feel Inadequate?” Unfortunately, I don’t have such five easy steps. I can only say that it took very many years for me to understand that working on the stuff that seems to you most difficult and unenjoyable doesn’t make you the most brilliant and hard-working person (It is probably a rather poor investment of your time and brain and so please don’t do that.) and that being a good supporter and follower can make just as great a contribution to a project as being its leader. A good thing about becoming a mid-career professional and getting old is that you get to care less about stupid stuff like what others think about you and more about important stuff like what you can do to make yourself better at things you want to do because you think they matter. How smart I look to others or whether I can be famous becomes rather trivial compared to whether I can get this thing done or to work, I understand something correctly, and what I do makes me happy and proud. I also recommend great blog posts by Andromeda and Coral about how to overcome the feeling of inadequacy, particularly in library technology and coding. Like they recommend, go sit at the table and develop your own swagger!

The New Year’s Eve is a great moment for reflection. In Müdigkeitsgesellschaft (Fatigue Society) (Berlin: Matthes & Seitz, 2010), which I’ve read recently (not in German but in Korean translation), a Korean-German philosopher Byung-Chul Han writes that driven by the goal of maximizing production, the contemporary society no longer disciplines people and instead makes all of us into individual entrepreneurs. He calls this new society ‘achievement society’ over-saturated with positivity and affirmation. All of us are now free to exploit ourselves, and there is no limit to how far we can go in our own free self-exploitation.  Doesn’t this sound familiar and similar to the mantra of managing oneself and the celebration of creativity?

Happy New Year to you all!

LITA Forum 2013 – My Talks on Google Visualization API and Faculty Bibliography

I did two presentations at LITA Forum last week. One was a concurrent session and the other was a lightning talk. This was the first time I attended a LITA Forum and it was a really great conference.  All of the LITA Forum presentations are available to everyone here:

Below are the slides of my two talks.

1. Take Better Care of Library Data and Spreadsheets with Google Visualization API Query Language

2. Building a Faculty Publications Database

Fear No Longer Regular Expressions

*** This post has been originally published on ACRL TechConnect Blog on July 31, 2013. ***

Regex, it’s your friend

You may have heard the term, “regular expressions” before. If you have, you will know that it usually comes in a notation that is quite hard to make out like this:

(?=^[0-5\- ]+$)(?!.*0123)\d{3}-\d{3,4}-\d{4}

Despite its appearance, regular expressions (regex) is an extremely useful tool to clean up and/or manipulate textual data. I will show you an example that is easy to understand. Don’t worry if you can’t make sense of the regex above. We can talk about the crazy metacharacters and other confusing regex notations later. But hopefully, this example will help you appreciate the power of regex and give you some ideas of how to make use of regex to make your everyday library work easier.

What regex can do for you – an example

I looked for the zipcodes for all cities in Miami-Dade County and found a very nice PDF online ( But when I copy and paste the text from the PDF file to my text editor (Sublime), the format immediately goes haywire. The county name, ‘Miami-Dade,’ which should be in one line becomes three lines, each of which lists Miami, -, Dade.

Ugh, right? I do not like this one bit. So let’s fix this using regular expressions.

(**Click the images to bring up the larger version.)

Screen Shot 2013-07-24 at 1.43.19 PM

Screen Shot 2013-07-24 at 1.43.39 PM

Like many text editors, Sublime offers the find/replace with regex feature. This is a much powerful tool than the usual find/replace in MS Word, for example, because it allows you to match many complex cases that fit a pattern. Regular expressions lets you specify that pattern.

In this example, I capture the three lines each saying Miami,-,Dade with this regex:


When I enter this into the ‘Find What’ field, Sublime starts highlighting all the matches. I am sure you already guessed that \n means a new line. Now let me enter Miami-Dade in the ‘Replace With’ field and hit ‘Replace All.’

Screen Shot 2013-07-24 at 2.11.43 PM

As you can see below, things are much better now. But I want each set of three lines – Miami-Dade, zipcode, and city – to be one line and each element to be separated by comma and a space such as ‘Miami-Dade, 33010, Hialeah’. So let’s do some more magic with regex.

Screen Shot 2013-07-24 at 2.18.17 PM

How do I describe the pattern of three lines – Miami-Dade, zipcode, and city? When I look at the PDF, I notice that the zipcode is a 5 digit number and the city name consists of alphabet characters and space. I don’t see any hypen or comma in the city name in this particular case. And the first line is always ‘Miami-Dade.” So the following regular expression captures this pattern.

Miami-Dade\n\d{5}\n[A-Za-z ]+

Can you guess what this means? You already know that \n means a new line. \d{5} means a 5 digit number. So it will match 33013, 33149, 98765, or any number that consists of five digits. [A-Za-z ] means any alphabet character either in upper or lower case or space (N.B. the space at the end right before ‘]’).

Anything that goes inside [ ] is one character. Just like \d is one digit. So I need to specify how many of the characters are to be matched. if I put {5}, as I did in \d{5}, it will only match a city name that has five characters like ‘Miami,’ The pattern should match any length of city name as long as it is not zero. The + sign does that. [A-Za-z ]+ means that any alphabet character either in upper or lower case or space should appear at least or more than once. (N.B. * and ? are also quantifiers like +. See the metacharacter table below to find out what they do.)

Now I hit the “Find” button, and we can see the pattern worked as I intended. Hurrah!

Screen Shot 2013-07-24 at 2.24.47 PM

Now, let’s make these three lines one line each. One great thing about regex is that you can refer back to matched items. This is really useful for text manipulation. But in order to use the backreference feature in regex, we have to group the items with parentheses. So let’s change our regex to something like this:

(Miami-Dade)\n\(d{5})\n([A-Za-z ]+)

This regex shows three groups separated by a new line (\n). You will see that Sublime still matches the same three line sets in the text file. But now that we have grouped the units we want – county name, zipcode, and city name – we can refer back to them in the ‘Replace With’ field. There were three units, and each unit can be referred by backslash and the order of appearance. So the county name is \1, zipcode is \2, and the city name is \3. Since we want them to be all in one line and separated by a comma and a space, the following expression will work for our purpose. (N.B. Usually you can have up to nine backreferences in total from \1 to\9. So if you want to backreference the later group, you can opt not to create a backreference from a group by using (?: ) instead of (). )

\1, \2, \3

Do a few Replaces and then if satisfied, hit ‘Replace All’.

Ta-da! It’s magic.

Screen Shot 2013-07-24 at 2.54.13 PM

Regex Metacharacters

Regex notations look a bit funky. But it’s worth learning them since they enable you to specify a general pattern that can match many different cases that you cannot catch without the regular expression.

We have already learned the four regex metacharacters: \n, \d, { }, (). Not surprisingly, there are many more beyond these. Below is a pretty extensive list of regex metacharacters, which I borrowed from the regex tutorial here: . I also highly recommend this one-page Regex cheat sheet from MIT (

Note that \w will match not only a alphabetical character but also an underscore and a number. For example, \w+ matches Little999Prince_1892. Also remember that a small number of regular expression notations can vary depending on what programming language you use such as Perl, JavaScript, PHP, Ruby, or Python.

Metacharacter Description
\ Specifies the next character as either a special character, a literal, a back reference, or an octal escape.
^ Matches the position at the beginning of the input string.
$ Matches the position at the end of the input string.
* Matches the preceding subexpression zero or more times.
+ Matches the preceding subexpression one or more times.
? Matches the preceding subexpression zero or one time.
{n} Matches exactly n times, where n is a non-negative integer.
{n,} Matches at least n times, n is a non-negative integer.
{n,m} Matches at least n and at most m times, where m and n are non-negative integers and n <= m.
. Matches any single character except “\n”.
[xyz] A character set. Matches any one of the enclosed characters.
x|y Matches either x or y.
[^xyz] A negative character set. Matches any character not enclosed.
[a-z] A range of characters. Matches any character in the specified range.
[^a-z] A negative range characters. Matches any character not in the specified range.
\b Matches a word boundary, that is, the position between a word and a space.
\B Matches a nonword boundary. ‘er\B’ matches the ‘er’ in “verb” but not the ‘er’ in “never”.
\d Matches a digit character.
\D Matches a non-digit character.
\f Matches a form-feed character.
\n Matches a newline character.
\r Matches a carriage return character.
\s Matches any whitespace character including space, tab, form-feed, etc.
\S Matches any non-whitespace character.
\t Matches a tab character.
\v Matches a vertical tab character.
\w Matches any word character including underscore.
\W Matches any non-word character.
\un Matches n, where n is a Unicode character expressed as four hexadecimal digits. For example, \u00A9 matches the copyright symbol

Matching modes

You also need to know about the Regex matching modes. In order to use these modes, you write your regex as shown above, and then at the end you add one or more of these modes. Note that in text editors, these options often appear as checkboxes and may apply without you doing anything by default.

For example, [d]\w+[g] will match only the three lower case words in ding DONG dang DING dong DANG. On the other hand, [d]\w+[g]/i will match all six words whether they are in the upper or the lower case.

Look-ahead and Look-behind

There are also the ‘look-ahead’ and the ‘look-behind’ pattern in regular expressions. These often cause confusion and are considered to be a tricky part of regex. So, let me show you a simple example of how it can be used.

Below are several lines of a person’s last name, first name, middle name, separated by his or her department name. You can see that this is a snippet from a csv file. The problem is that a value in one field – the department name- also includes a comma, which is supposed to appear only between different fields not inside a field. So the comma becomes an unreliable separator. One way to solve this issue is to convert this csv file into a tab limited file, that is, using a tab instead of a comma as a field separater. That means that I need to replace all commas with tabs ‘except those commas that appear inside a department field.’

How do I achieve that? Luckily, the commas inside the department field value are all followed by a space character whereas the separator commas in between different fields are not so. Using the negative look-ahead regex, I can successfully specify the pattern of a comma that is not followed by (?!) a space \s.


Below, you can see that this regex matches all commas except those that are followed by a space.


For another example, the positive look-ahead regex, Ham(?=burg) , on the other hand, will match ‘Ham‘ in Hamburg when it is applied to the text: Hamilton, Hamburg, Hamlet, Hammock.

Below are the complete look-ahead and look-behind notations both positive and negative.

  • (?=pattern)is a positive look-ahead assertion
  • (?!pattern)is a negative look-ahead assertion
  • (?<=pattern)is a positive look-behind assertion
  • (?<!pattern)is a negative look-behind assertion

Can you think of any example where you can successfully apply a look-behind regular expression? (No? Then check out this page for more examples:

Now that we have covered even the look-ahead and the look-behind, you should be ready to tackle the very first crazy-looking regex that I introduced in the beginning of this post.

(?=^[0-5\- ]+$)(?!.*0123)\d{3}-\d{3,4}-\d{4}

Tell me what this will match! Post in the comment below and be proud of yourself.

More tools and resources for practicing regular expressions

There are many tools and resources out there that can help you practice regular expressions. Text editors such as EditPad Pro (Windows), Sublime, TextWrangler (Mac OS), Vi, EMacs all provide regex support. Wikipedia ( offers a useful comparison chart of many text editors you can refer to. is a convenient online Javascript Regex tester. FireFox also has Regular Expressions add-on (

For more tools and resources, check out “Regular Expressions: 30 Useful Tools and Resources”

Library problems you can solve with regex

The best way to learn regex is to start using it right away every time you run into a problem that can be solved faster with regex. What library problem can you solve with regular expressions? What problem did you solve with regular expressions? I use regex often to clean up or manipulate large data. Suppose you have 500 links and you need to add either EZproxy suffix or prefix to each. With regex, you can get this done in a matter of a minute.

To give you an idea, I will wrap up this post with some regex use cases several librarians generously shared with me. (Big thanks to the librarians who shared their regex use cases through Twitter! )

  • Some ebook vendors don’t alert you to new (or removed!) books in their collections but do have on their website a big A-Z list of all of their titles. For each such vendor, each month, I run a script that downloads that page’s HTML, and uses a regex to identify the lines that have ebook links in them. It uses another regex to extract the useful data from those lines, like URL and Title. I compare the resulting spreadsheet against last month’s (using a tool like diff or vimdiff) to discover what has changed, and modify the catalog accordingly. (@zemkat)
  • Sometimes when I am cross-walking data into a MARC record, I find fields that includes irregular spacing that may have been used for alignment in the old setting but just looks weird in the new one. I use a regex to convert instances of more than two spaces into just two spaces. (@zemkat)
  • Recently while loading e-resource records for government documents, we noted that many of them were items that we had duplicated in fiche: the ones with a call number of the pattern “H, S, or J, followed directly by four digits”. We are identifying those duplicates using a regex for that pattern, and making holdings for those at the same time. (@zemkat)
  • I used regex as part of crosswalking metadata schemas in XML. I changed scientific OME-XML into MODS + METS to include research images into the library catalog. (@KristinBriney)
  • Parsing MARC just has to be one. Simple things like getting the GMD out of a 245 field require an expression like this: |h\[(.*)\] MARCedit supports regex search and replace, which catalogers can use. (@phette23)
  • I used regex to adjust the printed label in Millennium depending on several factors, including the material type in the MARC record. (@brianslone )
  • I strip out non-Unicode characters from the transformed finding aids daily with regex and replace them with Unicode equivalents. (@bryjbrown)


Playing with JavaScript and JQuery – the Ebook link HTML string generator and the EZproxy bookmarklet generator

***This post has been published in ACRL TechConnect on April 8, 2013. ***

In this post, I will describe two cases in which I solved a practical problem with a little bit of JavaScript and JQuery. Check them out first here before reading the rest of the post which will explain how the code works.

  1. Library ebook link HTML string generator
  2. EZproxy bookmarklet generator – Longer version (with EZproxy Suffix)
  3. EZproxy bookmarklet generator – Shorter version (with EZproxy Prefix)


1. Library ebook link HTML string generator

If you are managing a library website, you will be using a lot of hyperlinks for library resources. Sometimes you can distribute some of the repetitive tasks to student workers, but they are usually not familiar with HTML. I had a situation in which I needed to add a number of links for library e-books for my library’s Course E-book LibGuide page, when I was swamped with many other projects at the same time. So I wondered if I could somehow still use the library’s student assistant’s help by creating an providing a simple tool, so that the student only needs to input the link title and url and get me the result as HTML. This way, I can still delegate some work when I am swamped with other projects that require my attention. (N.B. Needless to say, this doesn’t mean that what I did was the best way to use the student assistance for this type of work. I didn’t want them to edit the page directly because this page had tabs tabs and the student using the WYSWYG editor might inadvertently remove part of the tabbed box code.)

The following code exactly does that.

This HTML form takes an e-book title and the link to the book as input and spits out a hyperlink as a list item as a result. For example, if you fill in the title field with ‘Bradley’s Neurology in Clinical Practice’ and the link field with its url:’s+Neurology+in+Clinical+Practice, then the result would be shown in the text area : <li><a href=”’s+Neurology+in+Clinical+Practice”> Bradley’s Neurology in Clinical Practice</a></li>  I also wanted the library student assistant to be able to do this for many e-books at once and just send me the whole set of HTML snippets that cover all ebooks. So after running the form once, if one fills out the title and the link field with another e-book information, the result would be added to the end of the previous HTML string and be displayed in the text area. The result would be like this: <li><a href=”’s+Neurology+in+Clinical+Practice”> Bradley’s Neurology in Clinical Practice</a></li><li><a href=”″>Cardiovascular Physiology</a></li>. Since the code is in the text area, the student can also edit if there was any error when s/he was filling out the form after clicking the button ‘Send to Text Area’.

Now, let’s take a look at what is going on behind the scene. This is the entire html file. The Javascript/JQuery code that is generating the html string in the text area is from line 22-32.

<script type="text/javascript" src=""></script>
		This page creates the html-friendly string for a link with a title and a url. 
		<li>Fill out the form below and click the button. </li>
		<li>If the link is messy, clean up first by using the <a href="">URL decoder</a>.</li>
		<li>Copy and paste the result inside the text area after you are done.</li>
		Title: <input type="text" id="title1" size="100"/><br/>
		Link: <input type="text" id="link1" size="100"/><br/>
		<button id="b1">Send to Text Area</button>
	<p id="result1"></p>
	<textarea rows="20" cols="80"></textarea>

	<script type="text/javascript">
	  //  alert('<a href="'+$("#link").val()+'">' + $("#title").val()+"</a>");
	    $('#result1').text('<li><a href="'+$("#link1").val()+'">' + $("#title1").val()+"</a></li>");
	    var res1='<li><a href="'+$("#link1").val()+'">' + $("#title1").val()+"</a></li>";
	 //   $('textarea').text(res1);
	}); // doc ready


Since I am using JQuery I am starting with the obligatory $(document).ready in line 2. In line 3,  I am giving a callback function that will be executed when the #b1 – the button with the id of b1 in line 17 above- is clicked.  Line 4 is commented out. I used this initially to test if I am getting the right string out of the input from the title and the link field using the JS alert. Line 5 is filling the p tag with the id of result 1 in line 19 above with the thus-created string. The string is also saved in variable res1 in line 7. Then it is attached to the content of the textarea field in line 8. Line 9 is commented out. If you use line 9 instead of line 8, any existing content in the textarea will be removed and only the new string created from the title and the link field will show up in the text area.

<script type="text/javascript">
	  //  alert('<a href="'+$("#link").val()+'">' + $("#title").val()+"</a>");
	    $('#result1').text('<li><a href="'+$("#link1").val()+'">' + $("#title1").val()+"</a></li>");
	    var res1='<li><a href="'+$("#link1").val()+'">' + $("#title1").val()+"</a></li>";
	 //   $('textarea').text(res1);
	}); // doc ready

You do not have to know a lot about scripting to solve a simple task like this!

2. EZproxy bookmarklet generator – Longer version

My second example is a bookmarklet that reloads the current page through a specific EZproxy system.

If you think that this bookmarklet reinvents the wheel since the LibX toolbar already does this, you are correct. And also if you are a librarian working with e-resources, you already know to add the EZproxy suffix at the end of the domain name of the url when a patron asks if a certain article on a web page is available through the library or not. But I found that no matter how many times I explain this trick of adding the EZproxy suffix to patrons, the trick doesn’t seem to stick in their busy minds. Also, many doctors and medical students, who are the primary patrons of my library, work at the computers in hospitals and they do not have the necessary privilege to install a toolbar on those computers. But they can create a bookmark.

Similarly, many students asked me why there is no LibX toolbar for their mobile devices unlike in their school laptops. (In the medical school where I work, all students are required to purchase a school-designated laptop; this laptop is pre-configured with all the necessary programs including the library LibX toolbar.) Well, mobile devices are not exactly computers and so the browser toolbar doesn’t work. But students want an alternative and they can create a bookmark on their tablets and smartphones. So the proxy bookmarklet is still a worthwhile tool for the mobile device users.

This is where the bookmarklet is: To test, drag the link on the top that says Full-Text from FIU Library to your bookmark toolbar. Then go to Click the new bookmarklet you got on your toolbar. The page will reload and you will be asked to log in through the Florida International University EZproxy system. When you are authenticated, you will be seeing the page proxied:

You will be surprised to see how simple the bookmarklet is (and there is even a shorter version than this which I will show in the next section). It is a JavaScript function wrapped inside a hyperlink. Lines 2-5 each takes the domain name, the path name, and any search string after the url path from the current window location object. So in the case of, is and location.pathname is Issue.aspx . The rest of the url ?journalid=67&issueID=4452&direction=P — is In line 4, I am putting my institution’s ezproxy suffix between these two, and in line 5, I am asking the browser load this new link.

<a href="javascript:(function(){
	var path=location.pathname;
	var newlink='http://'+host+''+path+srch;;
})();">Full-Text from FIU Library</a>

Now let’s take a look at the whole form. I created this form for those who want to create a ready-made bookmarklet recipe. All they need is their institution’s EZproxy suffix and whatever name they want to give to the bookmarklet. Once one fills out those two fields and click ‘Customize’ button, one will get the full HTML page code with the bookmarklet as a link in it.

<script type="text/javascript" src=""></script>
<h1>EZproxy Bookmarklet</h1>	
	<a href="javascript:(function(){
		var path=location.pathname;
                var newlink='http://'+host+''+path+srch;;
	})();">Full-Text from FIU Library</a> 
	(Drag the link to the bookmark toolbar!)
<p>This is a bookmarket that will reroute a current page through FIU EZproxy.
If you have the LibX toolbark installed, you do not need this bookmarklet. Simply select 'Reload Page via XYZ EZproxy' on the menu that appears when you right click.
Created by Bohyun Kim on March, 2013. 	
<h2>How to Test</h2>
	<li>Drag the link above to the bookmark toolbar of your web browser.</li>
	<li>Click the bookmark when you are on a webpage that has an academic article.</li>
	<li>It will ask you to log in through the FIU EZproxy.</li>
	<li>Once you are authenticated and the library has a subscription to the journal, you will will able to get the full-text article on the page.</li>
	<li>Look at the url and see if it contains If it does, the bookmarklet is working.</li>

<h2>Make One for Your Library</h2>
		Bookmark title: <input type="text" id="title1" size="40" placeholder="e.g. Full-Text ABC Library"/>
		<em>e.g. Full-text ABC Library</em>
		Library EZproxy Suffix: <input type="text" id="link1" size="31" placeholder="e.g."/>
		<button id="b1">Customize</button></em>
	<p><strong>Copy the following into a text editor and save it as an html file.</strong></p>
		<li>Open the file in a web browser and drag your link to the bookmark toolbar.</li>
	<p id="result1" style="color:#F7704F;">**Customized code will appear here.**</p>
	<p><strong>If you want to make any changes to the code:</strong></p>
	<textarea rows="10" cols="60"></textarea>

	<script type="text/javascript">
		var pre="&lt;html&gt; &lt;head&gt; &lt;script type=&quot;text/javascript&quot; src=&quot;;&gt;&lt;/script&gt; &lt;/head&gt; &lt;body&gt; &lt;a href=&quot;javascript:(function(){ var; var path=location.pathname; var; var newlink='http://'+hst";
		var link1=$('#link1').val();
		var post="'+path+srch;; })();">";
	  	var title=$("#title1").val();
	  	var end="&lt;/a&gt; &lt;/body&gt; &lt;/html&gt;";
	  	var final=$('<div />').html(pre+"+'."+link1+post+title+end).text()
	}); // doc ready


That ‘customize’ part is done here with several lines of JQuery. You will notice that the process is quite similar to what I did in my first example. In lines 4-8, I am just stitching together a bunch of text strings to spit out the whole code in the text area eventually when the ‘Customize’ button is clicked. All special characters used in HTML tags such as ‘<‘ and ‘>’ have been changed to html enities. In line 9, I am taking the entire string saved in the variable end —I hope you name your variables a little more carefully than I do!—  and adding it to an empty div so that the string would be set as the inner HTML property of that div. And then I retrieve it using the .text() method. The result is the HTML code with the html entities decoded.

<script type="text/javascript">
		var pre="&lt;html&gt; &lt;head&gt; &lt;script type=&quot;text/javascript&quot; src=&quot;;&gt;&lt;/script&gt; &lt;/head&gt; &lt;body&gt; &lt;a href=&quot;javascript:(function(){ var; alert(hst); var path=location.pathname; var; var newlink='http://'+hst";
		var link1=$('#link1').val();
		var post="'+path+srch;; })();">";
	  	var title=$("#title1").val();
	  	var end="&lt;/a&gt; &lt;/body&gt; &lt;/html&gt;";
	  	var final=$('<div />').html(pre+"+'."+link1+post+title+end).text()
	}); // doc ready

Not too bad, right? I hope these examples show how just a few or several lines of code can be used to solve your practical problems at work. Coding is there for you to automate time-consuming and/or repetitive tasks.

3. EZproxy bookmarklet generator – Shorter version

There is a simpler way to create a EZproxy bookmarklet than the one sketched above. If you simply add EZproxy prefix in front of the entire url of the page where a user is is, you achieve the same result. In this case, you do not have to break the url with host, pathname, search string, etc.

<a href="javascript:void(location.href=%22">Full-Text from FIU Library</a>

Here are the code for this much simpler EZproxy bookmarklet and the bookmarklet generator. If you know the prefix of you library’s EZproxy prefix, you can make one for your library.

So there are many ways to get the same thing done. Some are more elegant and some are less so. Usually a shorter one is a more elegant solution. The lesson is that you usually get to the most elegant solution after coming up with many less elegant solutions first.

Do you have a problem that can be solved by creating several lines of code? Have you ever solved a practical problem using a bit of code? Share your experience in the comments!


Using Git with BitBucket: Basic commands – pull, add, commit, push

If you do coding, you will be making lots of tiny little changes to a file. You are also likely to be working on multiple computers at different locations. For documents and other types of files, we often use Dropbox or Google Drive/Docs. Tools like Dropbox and Google Drive/Docs are central repositories, and they allow you to always keep the most up-to-date version of the file, make changes to it, save it, and then access it at another computer. Dropbox and Google Drive/Docs also supports version-control. With those tools, you can review changes and revert to the previous state of file if you would like.

BitBucket is a tool similar to Dropbox. And it works with Git. Git is is a distributed revision control and source code management (SCM) system.  GitHub, which is another version-controlled repository web site is popular among coders and also uses Git. (If you want to try this with GitHub, the process is pretty much the same, but you may want to start by cloning somebody’s existing repo. Check out this ACRL TechConnect post by Eric.) BitBucket is similar to GitHub but it allows a completely private repository which is not supported in the free GitHub plan. So if you are working on stuff and don’t want them to be seen publicly until you are ready to release, BitBucket is a good choice.

In this post, I will show you the very basic commands of Git and  how I make changes to files, sync my local repository with the BitBucket repository, add/commit them to my local repo and push them to the remote repo in BitBucket for those who are not familiar with Git. Here, I assume that you already have git installed and set up a repository in BitBucket. If not, there are many tutorials online and also the instruction you get after signing up for BitBucket is quite clear and straightforward.

Once you are all set, the first step I run in my repo is always status command. When I type in git status in the command line (I am using Terminal in Mac), git shows in red that in my repo there are two files that I modified but have not committed. This means that the files were added, meaning they were explicitly added by the git add command but changed without being committed. You can make all sorts of changes to files but until you run the git commit command, it is sort of not official and you cannot push your changes and sync your updated files in the local repo with the remote repo.

**If the image is blurry, just double-click it and you will be able to see the clear version.**

git status


But I also vaguely remember that I worked on some files in this remote repo from home several days ago and pushed the changes. Those changes have not yet been reflected in my local repo at my work computer. So I want to sync my local repo with the remote repo first, which has more up-to-date files. To do this, I run the following command. This literally pulls in all the changes from the remote repo to my local repo. And after a few seconds, you get the message that everything went well and you are good to go.

git pull origin master


Now at this point, I can start work on all the up-to-date files. So I go ahead and make some more changes to the two files – bkmklt.html and link.html. After that, I am ready to make the changes official, I run the commit command. You can type in a specific file name you want to commit  or you can just use a dot to make Git to commit all modified files. The commit command requires a commit message. So you have to add some message at the end. Here I write that I removed some comment lines. The message is more important if the change is for other people but try to make it as specific as possible.

If I have created a completely new file in my local repo (that is a designated folder where I set up a Git repo), I will have to add the file first in order to be able to commit it. You can do so by git add and you should be ready to commit it. If you want to add all new files you can also use a dot (.)  instead of a file name.

Below, I ran the status command again before running the commit command.

git commit . -m “removing some comment lines”


So the modified files have been committed now. But they are still in my local repo. If I want to get to them later from another computer however, I need to sync these files with the central repo in BitBucket. The push command does that. You are pushing out your local changes into the remote repo here. If all goes well, you will get the success message as  shown below. (Type in your BitBucket password when prompted.)

git push origin master


The change is also recorded in my BitBucket account page.



I can also view the changes I made in BitBucket. The red part means what I deleted and the green part means what I added.


Now when I go home and want to work on these files, I will start with “git pull origin master” to sync my home computer with the remote that is up-to-date.

It is really very much like Dropbox or Google Docs except that you get to type in lots of commands. I hope this will help those who are curious about Git but somewhat intimidated by the command lines! So try it out if you are curious.

Here is a great diagram that shows the command sequence of Git. And the entire Git cheatsheet is at:

Screen Shot 2013-03-25 at 4.23.29 PM

If you want to go to the next level and try working on someone else’s project in GitHub by forking and branching, here is a good tutorial “How to GitHub: Fork, Branch, Track, Squash and Pull Request” by Rich Jones. There are also many other resources for learning and using Git and GitHub at the ALA ALCTS/LITA CodeYear IG github repository.








Effectively Learning How To Code: Tips and Resources

*** This post has been originally published in ACRL TechConnect on Dec. 10, 2012. ***

Librarians’ strong interest in programming is not surprising considering that programming skills are crucial and often essential to making today’s library systems and services more user-friendly and efficient for use. Not only for system-customization, computer-programming skills can also make it possible to create and provide a completely new type of service that didn’t exist before. However, programming skills are not part of most LIS curricula, and librarians often experience difficulty in picking up programming skills.

In this post, I would like to share some effective strategies to obtain coding skills and cover common mistakes and obstacles that librarians make and encounter while trying to learn how to code in the library environment based upon the presentation that I gave at Charleston Conference last month, “Geek out: Adding Coding Skills to Your Professional Repertoire.” (slides: At the end of this post, you will also find a selection of learning and community resources.

How To Obtain Coding Skills, Effectively

1. Pick a language and concentrate on it.

There are a huge number of resources available on the Web for those who want to learn how to program. Often librarians start with some knowledge in markup languages such as HTML and CSS. These markup languages determine how a block of text are marked up and presented on the computer screen. On the other hand, programming languages involve programming logic and functions. An understanding of the basic programming concepts and logic can be obtained by learning any programming language. There are many options, and some popular choices are JavaScript, PHP, Python, Ruby, Perl, etc. But there are many more.  For example, if you are interested in automating tasks in Microsoft applications such as Excel, you may want to work with Visual Basic. If you are unsure about which language to pick, search for a few online tutorials for a few languages to see what their different syntaxes and examples are like. Even if you do not understand the content completely, this will help you to pick the language to learn first.

2. Write and run the code.

Once you choose a language to learn, there are many paths that you can follow. Taking classes at a local community college or through an online school may speed up the initial process of learning, but it could be time-consuming and costly. Following online tutorials and trying each example is a good alternative that many people take. You may also pick up a few books along the way to supplement the tutorials and use them for reference purposes.

If you decide on self-study, make sure that you actually write and run the code in the examples as you follow along the books and the tutorials. Most of the examples will appear simple and straightforward. But there is a big difference between reading through a code example and being actually able to write the code on your own and to run it successfully. If you read through programming tutorials and books without actually doing the hands-on examples on your own, you won’t get much benefit out of your investment. Programming is a hands-on skill as much as an intellectual understanding.

3. Continue to think about how coding can be applied to your library.

Also important is to continue to think about how your knowledge can be applied to your library systems and environment, which is often the source of the initial motivation for many librarians who decide to learn how to program. The best way to learn how to program is to program, and the more you program the better you will become at programming. So at every chance of building something with the new programming language that you are learning, no matter how small it is, build it and test out the code to see if it works the way you intended.

4. Get used to debugging.

While many who struggle with learning how to code cite lack of time as a reason, the real cause is likely to be failing to keep up the initial interest and persist in what you decided to learn. Learning how to code can be exciting, but it can also be a huge time-sink and the biggest source of frustration from time to time. Since the computer code is written for a machine to read, not for a human being, one typo or a missing semicolon can make the program non-functional. Finding out and correcting this type of error can be time-consuming and demoralizing. But learning how to debug is half of programming. So don’t be discouraged.

5. Find a community for social learning and support.

Having someone to talk to about coding problems while you are learning can be a great help. Sign up for listservs where coding librarians or library coders frequent, such as code4lib and web4lib to get feedback when you need. Research the cause of the problem that you encounter as much as possible on your own. When you still are unsure about how to go about tackling it, post your question to the sites such as Stack Overflow for suggestions and answers from more experienced programmers. It is also a good idea to organize a study group with like-minded people and get support for both coding-related and learning-related problems. You may also find local meet-ups available in your area using sites like

Don’t be intimidated by those who seem to know much more than you in those groups (as you know much more about libraries than they do and you have things to contribute as well), but be aware of the cultural differences between the developer community and the librarian community. Unlike the librarian community that is highly accommodating for new librarians and sometimes not-well-thought-out questions, the developer community that you get to interact with may appear much less accommodating, less friendly, and less patient. However, remember that reading many lines of code, understanding what they are supposed to do, and helping someone to solve a problem occurring in those lines can be time-consuming and difficult even to a professional programmer. So it is polite to do a thorough research on the Web and with some reference resources first before asking for others’ help. Also, always post back a working solution when your problem is solved and make sure to say thank you to people who helped you. This way, you are contributing back to the community.

6. Start working on a real-life problem ‘now.’ Don’t wait!

Librarians are often motivated to learn how to code in order to solve real-life problems they encounter at their workplace. Solving a real-life problem with programming is therefore the most effective way to learn and to keep up the interest in programming. One of the greatest mistake in learning programming is putting off writing one’s own code and waiting to work on a real-life problem for the reason that one doesn’t know yet enough to do so. While it is easy to think that once you learn a bit more, it would be easier to approach a problem, this is actually a counter-productive learning strategy as far as programming is concerned because often the only way to find out what to learn is by trying to solve a problem.

7. Build on what you learned.

Another mistake to avoid in learning how to program is failing to build on what one has learned. Having solved one set of problem doesn’t mean that you will remember that programming solution you created next time when you have to solve a similar problem. Repeating what one has succeeded at and expanding on that knowledge will lead to a stronger foundation for more advanced programming knowledge. Also instead of trying to learn more than one programming language (e.g. Python, PHP, Ruby, etc.) and/or a web framework (e.g. Django, cakePHP, Ruby On Rails, etc.) at the same time, first try to become reasonably good at one. This will make it much easier to pick up another language later in the future.

8. Code regularly and be persistent.

It is important to understand that learning how to program and becoming good at it will take time. Regular coding practice is the only way to get there. Solving a problem is a good way to learn, but doing so on a regular basis as often as possible is the best way to make what you learned stick and stay in your head.

While is it easy to say practice coding regularly and try to apply it as much as possible to the library environment, actually doing so is quite difficult. There are not many well-established communities for fledgling coders in libraries that provide needed guidance and support. And while you may want to work with library systems at your workplace right away, your lack of experience may prove problematic in gaining a necessary permission to tinker with them. Also as a full-time librarian, programming is likely to be thrown to the bottom of your to-do list.

Be aware of these obstacles and try to find a way to overcome them as you go. Set small goals and use them as milestones. Be persistent and don’t be discouraged by poor documentation, syntax errors, and failures. With consistent practice and continuous learning, programming can surely be learned.


A. Resources for learning

B. Communities


More APIs: writing your own code (2)

*** This blog post has been originally published in ACRL TechConnect 0n October 9, 2012. ***

My previous post “The simpest AJAX: writing your own code (1)” discussed a few Javascript and JQuery examples that make the use of the Flickr API. In this post, I try out APIs from providers other than Flickr. The examples will look plain to you since I didn’t add any CSS to dress them up. But remember that the focus here is not in presentation but in getting the data out and re-displaying it on your own. Once you get comfortable with this process, you can start thinking about a creative and useful way in which you can present and mash up the same data. We will go through 5 examples I created with three different APIs.  Before taking a look at the codes, check out the results below first.

I. Pinboard API

The first example is Pinboard. Many libraries moved their bookmarks in to a different site when there was a rumor that may be shut down by Yahoo. One of those sites were Pinboard.  By getting your bookmark feeds from Pinboard and manipulating them, you can easily present a subset of your bookmark as part of your website.

(a) Display bookmarks in Pinboard using its API

The following page uses JQuery to access the JSONP feed of my public bookmarks in Pinboard. $.ajax() method is invoked on line 13. Line 15, jsonp:”cb”, gives the name to a callback function that will wrap the JSON feed data in it. Note line 18 where I print out data received into the console. This way, you can check if you are receiving JSONP feed in the console of Firebug. Line 19-22 uses $.each() function to access each element in the JSONP feed and the .append() method to add each bookmark’s title and url to the “pinboard” div. JQuery documentation has detailed explanation and examples for its functions and methods. So make sure to check it out if you have any questions about a JQuery function or method.

Pinboard API – example 1

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "">
<html xmlns="">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Pinboard: JSONP-AJAX Example</title>
<script type="text/javascript" src=""></script>
<p> This page takes the JSON feed from <a href="">my public links in Pinboard</a> and re-presents them here. See <a href="">the Pinboard's JSON feed of my links</a>.</p>
<div id="pinboard"><h1>All my links in Pinboard</h1></div>
  success: function(data) {
  	console.log(data); //dumps the data to the console to check if the callback is made successfully
    $.each(data, function(index, item){
      $('#pinboard').append('<div><a href="' + item.u + '">' + item.d
+ '</a></div>');
      }); //each
    } //success
  }); //ajax


Here is the screenshot of the page. I opened up the Console window of the Firebug (since I dumped the received in line 18) and you can see the incoming data here. (Note. Click the images to see the large version.)

But it is much more convenient to see the organization and hierarchy of the JSONP feed in the Net panel of Firebug.

And each element of the JSONP feed can be drilled down for further details by clicking the object in each row.

(b) Display only a certain number of bookmarks

Now, let’s display only five bookmarks. In order to do this, only one more line is needed. Line 9 checks the position of each element and breaks the loop when the 5th element is processed.

Pinboard API – example 2

  success: function(data) {
    $.each(data, function(index, item){
      	$('#pinboard').append('<div><a href="' + item.u + '">' + item.d
+ '</a></div>');
    	if (index == 4) return false; //only display 5 items
      }); //each
    } //success
  }); //ajax

(c) Display bookmarks with a certain tag

Often libraries want to display bookmarks with a particular tag. Here I add a line using JQuery method $.inArray() to display only bookmarks tagged with ‘fiu.’ $.inArray()  method takes value and array as parameters and returns 0 if the value is found in the array otherwise -1. Line 7 checks if the tag array of a bookmark (item.t) does include ‘fiu,’ and only in such case displays the bookmark. As a result, only the bookmarks with the tag ‘fiu’ are shown in the page.

Pinboard API – example 3

  success: function(data) {
    $.each(data, function(index, item){
    	if ($.inArray("fiu", item.t)!==-1) // if the tag is 'fiu'
      		$('#pinboard').append('<div><a href="' + item.u + '">' + item.d
+ '</a></div>');
	}); //each
    } //success
  }); //ajax

II. Reddit API

My second example uses Reddit API. Reddit is a site where people comment on news items of interest. Here I used $.getJSON() instead of $.ajax() in order to process the JSONP feed from the Science section of Reddit. In the case of Pinboard API, I could not find out a way to construct a link that includes a call back function in the url. Some of the parameters had to be specified such as jsonp:”cb”, dataType:’jsonp’. For this reason, I needed to use $.ajax() function. On the other hand, in Reddit, getting the JSONP feed url was straightforward:

Line 19 adds a title of the page. Line 20-22 extracts the title and link to the news article that is being commented and displays it. Under the news item, the link to the comments for that article in Reddit is added as a bullet item. You can see that, in Line 17 and 18, I have used the console to check if I get the right data and targeting the element I want and then commented out later.

This is just an example, and for that reason, the result is a rather simplified version of the original Reddit page with less information. But as long as you are comfortable accessing and manipulating data at different levels of the JSONP feed sent from an API, you can slice and dice the data in a way that suits your purpose best. So in order to make a clever mash-up, not only the technical coding skills but also your creative ideas of what different sets of data and information to connect and present to offer something new that has not been available or apparent before.

My second example uses Reddit API. Reddit is a site where people comment on news items of interest. Here I used $.getJSON() method instead of $.ajax() in order to process the JSONP feed from the Science section of Reddit. In the case of Pinboard API, I could not find out a way to construct a link for a call back function. Some of the parameters had to be specified such as jsonp:”cb”, dataType:’jsonp’. So I needed to use $.ajax() method.

On the other hand, in Reddit, getting the JSONP feed url was straightforward:   Line 19 adds a title of the page. Line 20-22 extracts the title and link to the news article that is being commented and displays it. Under the news item, the link to the comments for that article in Reddit is added as a bullet item.You can see that in Line 17 and 18, I have used the console to check if I get the right data and targeting the element I want.

This is just an example, and for that reason, the result is rather a simplified version of the original Reddit page with less information. But as long as you are comfortable accessing and manipulating data at different levels of the JSONP feed sent from an API, you can slice and dice the data in a way that suits your purpose best.

Reddit API – example

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "">
<html xmlns="">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Reddit-Science: JSONP-AJAX Example</title>
<script type="text/javascript" src=""></script>

<p> This page takes the JSONP feed from <a href="">Reddit's Science section</a> and presents the link to the original article and the comments in Reddit to the article as a pair. See <a href="">JSONP feed</a> from Reddit.</p>

<div id="feed"> </div>
<script type="text/javascript">
	//run function to parse json response, grab title, link, and media values, and then place in html tags
	$.getJSON('', function(rd){
		$('#feed').html('<h1>*Subredditum Scientiae*</h1>');
		$.each(, function(k, v){
      		$('#feed').append('<div><p>'+(k+1)+': <a href="' + + '">' +'</a></p><ul><li style="font-variant:small-caps;font-size:small"><a href="''">Comments from Reddit</a></li></ul></div>');
      }); //each
	}); //getJSON


The structure of a JSON feed can be confusing to make out particularly. So make sure to use the Firebug Net window to figure out the organization of the feed content and the property name for the value you want.

But what if the site from which you would like to get data doesn’t offer JSONP feed? Fortunately you can convert any RSS or XML feed into JSONP feed. Let’s take a look!

III. PubMed Feed with Yahoo Pipes API

Consider this PubMed search. This is simple search that looks for items in PubMed that has to do with Florida International University College of Medicine where I work. You may want to access the data feed of this search result, manipulate, and display in your library website. So far, we have performed a similar task with the Pinboard and the Reddit API using JQuery. But unfortunately PubMed does not offer any JSON feed. We only get RSS feed instead from PubMed.

This is OK, however. You can either manipulate the RSS feed directly or convert the RSS feed into JSON, which you are more familiar with now. Yahoo Pipes is a handy tool for just that purpose. You can do the following tasks with Yahoo Pipes:

  • combine many feeds into one, then sort, filter and translate it.
  • geocode your favorite feeds and browse the items on an interactive map.
  • power widgets/badges on your web site.
  • grab the output of any Pipes as RSS, JSON, KML, and other formats.

Furthermore, there may be a pipe that has been already created for exactly what you want to do by someone else. As PubMed is a popular resource, I found a pipe for PubMed search. I tested, copied the pipe, and changed the search term. Here is the screenshot of my Yahoo Pipe.

If you want to change the pipe, you can click “View Source” and make further changes. Here I just changed the search terms and saved the pipe.

After that, you want to get the results of the Pipe as JSON. If you hover over the “Get as JSON” link in the first screenshot above, you will get a link:”Florida International University” and “College of Medicine” But this returns JSON, not JSONP.

In order to get that JSON feed wrapped into a callback function, you need to add this bit, &_callback=myCallback, at the end of the url: Now the JSON feed appears wrapped in a callback function like this: myCallback( ). See the difference?

Line 25 enables you to bring in this JSONP feed and invokes the callback function named “myCallback.” Line 14-23 defines this callback function to process the received feed. Line 18-20 takes the JSON data received at the level of data.value. item, and prints out each item’s title (item.title) with a link ( Here I am giving a number for each item by (index+1). If you don’t put +1, the index will begin from 0 instead of 1. Line 21 stops the process when the processed item reaches 5 in number.

Yahoo Pipes API/PubMed – example

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "">
<html xmlns="">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>PubMed and Yahoo Pipes: JSONP-AJAX Example</title>
<script type="text/javascript" src=""></script>
<p> This page takes the JSONP feed from <a href=""> a Yahoo Pipe</a>, which creates JSONP feed out of a PubMed search results and re-presents them here. 
<br/>See <a href="">the Yahoo Pipe's JSON feed</a> and <a href="">the original search results in PubMed</a>.</p>
<div id="feed"></div>
<script type="text/javascript">
	//run function to parse json response, grab title, link, and media values - place in html tags
	function myCallback(data) {
		$("#feed").html('<h1>The most recent 5 publications from <br/>Florida International University College of Medicine</h1><h2>Results from PubMed</h2>');
		$.each(data.value.items, function(index, item){
      		$('#feed').append('<p>'+(index+1)+': <a href="' + + '">' + item.title
+ '</a></p>');
        if (index == 4) return false; //display the most recent five items
      }); //each
	} //function
<script type="text/javascript" src=""></script>

Do you feel more comfortable now with APIs? With a little bit of JQuery and JSON, I was able to make a good use of third-party APIs. Next time, I will try the Worldcat Search API, which is closer to the library world and see how that works.