Koha OPAC over SSL breaks GoogleIndicTransliteration

GoogleIndicTransliteration is a nifty Koha feature that lets Indian users type and search easily in several Indian languages. However, a bug prevents it from working if the OPAC is run over SSL (i.e. https). This post describes the problem and provides a fix.

Many Indian Koha users use the GoogleIndicTransliteration option to offer their users the facility to search in Indian languages on the OPAC. This nifty feature allows users to phonetically type in their search queries in Indian languages in order to search catalogs that are (a) multi-lingual or (b) in an Indian language rather than English.


However, if you are security minded (and you *should* be if your OPAC is on the Net and allows your users to log in) and you decide to serve your site over SSL (i.e. https), then guess what? The GoogleIndicTransliteration feature immediately stops working, with the browser console showing a MIXED CONTENT error. Every single Koha version from 3.18.0 (when this feature made its way back into Koha after a long hiatus) up to the latest 16.05.2 (released on August 1, 2016) is affected by this problem.


I do not have time, just show me how to fix this

If you are in a hurry, jump over to the section “Your options until the patch is officially released” at the end of this post. Remember to read the caveat and the assumption, you have been warned! 😉

Why is HTTPS so important?

Let’s take a moment to understand why HTTPS is so important. Let’s assume that your Koha server is on your institutional LAN / intranet or hosted online, either on the cloud or on your own server connected to the Internet via a leased line.

Without HTTPS, every time you log in to Koha (staff and/or OPAC) and perform *any* ILS transaction (e.g. patron contact information change, holds, fines, circulation etc.), all of that information is available in PLAIN TEXT to everyone on your network.

If you are only connected to your institutional network, then that network is the extent to which anyone can see what you are doing. If your server is accessible over the Internet, then basically the whole wide world can see what you are doing. For instance, logging in over HTTP is the equivalent of writing down your username and password on a postcard and mailing it across the globe. Anyone who handles it during transit, or wants to, can simply read it. That’s why the world is moving away from plain vanilla HTTP.


In simple terms, HTTPS on the other hand creates an end-to-end encrypted “tunnel” between your server and the browser that is requesting access (e.g. to the OPAC). Think of it as a secure, sealed box with the contents inside and only you, the user, have the “key” to unlock it. The actual process is depicted in the graphics below:

Image source : https://www.identrustssl.com/

Briefly, HTTPS has three main benefits:

(a) Authentication
(b) Data integrity
(c) Secrecy

None of these is provided by HTTP, so if your Koha server is online, SSL (HTTPS) is simply a must these days!

The Basics Explained

The GoogleIndicTransliteration feature uses a Google API designed for phonetic input of several Indian languages by transliterating text written in English on the fly to its Indian language equivalent. For example, if you type “Rabindranath” and it is set to transliterate to Bengali, the software will automatically convert it to “রবীন্দ্রনাথ”, or “Premchand” to “प्रेमचाँद” if set for Hindi.

As with every Google API (and there are many), the Transliteration API needs to be loaded by a minified JavaScript loader program, known simply as the “Google API Loader”.

How it works

Once the GoogleIndicTransliteration system preference is set to “Show” from the Koha staff client, the code inside the file opac-bottom.inc loads the API loader code available at www.google.com/jsapi, which in turn provides the framework so that the actual transliteration code in the file googleindictransliteration.js can work its magic and provide users with the transliteration feature.

The GoogleIndicTransliteration system preference is set to “Show” on the OPAC.

Why does HTTPS break it but not HTTP?

Short answer: Mixed content!

Long answer: HTTPS is important to protect both your site and your users from attacks online. As of now, the Koha code in opac-bottom.inc calls the jsapi code over HTTP, instead of letting the browser handle it correctly based on the security context (i.e. whether the page is being served over HTTP or HTTPS). So when the OPAC is on HTTP and jsapi is fetched over HTTP, everything is in the same security context. However, when the OPAC is served over HTTPS and jsapi continues to be fetched over HTTP, all modern browsers flag it as a security violation known as “MIXED CONTENT” and halt the loading of jsapi, as seen in the screenshot below:

Error shown in Chrome’s browser console

As a result, googleindictransliteration.js has nothing to work with. End result, the GoogleIndicTransliteration feature does not work anymore! Bingo! We’ve found ourselves with a Koha bug!
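If you want to check a template for this kind of problem yourself, a quick grep can help. What follows is only a minimal sketch: the function name find_insecure_scripts is our own invention, and the demo runs on a throwaway file that mimics the jsapi line in opac-bottom.inc rather than on a real Koha installation.

```shell
#!/bin/bash
# List <script> tags that load their source over plain http://.
# On a page served over HTTPS, these are the tags browsers block as mixed content.
# (find_insecure_scripts is a hypothetical helper name, not part of Koha.)
find_insecure_scripts() {
    grep -n -E '<script[^>]*src="http://' "$1"
}

# Demo on a temporary file that mimics the relevant part of opac-bottom.inc.
demo=$(mktemp)
cat > "$demo" <<'EOF'
[% IF ( GoogleIndicTransliteration ) %]
    <script type="text/javascript" src="http://www.google.com/jsapi"></script>
    <script type="text/javascript" src="[% interface %]/[% theme %]/js/googleindictransliteration.js"></script>
[% END %]
EOF
find_insecure_scripts "$demo"
```

Only the jsapi line is reported, with its line number, because the second script tag uses a site-relative path and so inherits whatever protocol the page itself was served over.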

Present status of bug

There is a patch submitted to Koha Bugzilla against Bug 17103 (“Google API Loader jsapi called over http”), waiting for sign-off and QA. Once it clears Koha’s project governance processes, it is expected to get pushed to master and then be released with a stable version of Koha. Once that happens we won’t have this issue anymore.
NOTE: Expect this fix to get backported across the currently supported older releases.

Your options until the patch is officially released

(a) Do without GoogleIndicTransliteration feature until the fix is officially released by the Koha project if you are using HTTPS

OR

(b) Edit your “opac-tmpl/bootstrap/en/includes/opac-bottom.inc” file. Find the following section:

[% IF ( GoogleIndicTransliteration ) %]
    <script type="text/javascript" src="http://www.google.com/jsapi"></script>	
    <script type="text/javascript" src="[% interface %]/[% theme %]/js/googleindictransliteration.js"></script>
[% END %]

Replace the protocol specifier “http:” in the jsapi URI with “https:” and save the file. It should look like this after the change:

[% IF ( GoogleIndicTransliteration ) %]
    <script type="text/javascript" src="https://www.google.com/jsapi"></script>	
    <script type="text/javascript" src="[% interface %]/[% theme %]/js/googleindictransliteration.js"></script>
[% END %]

CAVEAT: If you are doing this edit, it is assumed that you know what you are doing. If you make any mistake and break something during this, it’s all on you.

ASSUMPTION: This edit assumes you are on Koha 3.18.x or later and are using a .deb package based installation on Debian or Ubuntu.
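If you would rather not open the file in an editor, the same change can be made with sed. This is a minimal sketch and not the official patch: the function name fix_jsapi_url is ours, and the full path in the commented-out line is our assumption of where a .deb package installation keeps the file, so verify it on your own server first. The demo below only touches a temporary file.

```shell
#!/bin/bash
# Switch the jsapi loader URL from http:// to https:// in a template file,
# keeping a .bak copy of the original next to it.
# (fix_jsapi_url is a hypothetical helper name.)
fix_jsapi_url() {
    sed -i.bak 's|src="http://www.google.com/jsapi"|src="https://www.google.com/jsapi"|' "$1"
}

# ASSUMPTION: typical location on a .deb package install; check before running.
# fix_jsapi_url /usr/share/koha/opac/htdocs/opac-tmpl/bootstrap/en/includes/opac-bottom.inc

# Safe demo on a temporary file:
demo=$(mktemp)
printf '%s\n' '    <script type="text/javascript" src="http://www.google.com/jsapi"></script>' > "$demo"
fix_jsapi_url "$demo"
cat "$demo"
```

The .bak copy gives you a quick way back if something goes wrong. Remember that a Koha package upgrade may overwrite the edited file, so the change may need to be re-applied until the fix is released.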

Easy peasy way of automating remote backup on Google Drive for your Koha database

This post discusses how to automate your Koha ILS’s MySQL database backup on to Google Drive and send an email when it is complete. It shows how you can take advantage of Google Drive’s 15GB of free space (Dropbox only gives you 2GB on its free tier) and do it all from the command line, saving much-needed RAM for your Koha server rather than wasting it on a GUI, which is also a security risk. Further, it attempts to introduce novice readers to the details of the commands they are supposed to follow, with further reading resources should they be inclined to learn more.

Having your Koha ILS database regularly backed up on to remote, cloud storage is an excellent idea. By doing so you ensure a critical off-site, disaster recovery measure, which is very good. However, as with all things human, if we leave it to ourselves to do it, there will come a time when we will (a) forget to do it or (b) be unable to do it for some reason. As we all know, good ol’ Captain Murphy’s Law[1] will strike us whenever we are least prepared; in this case, typically that one time we forgot or were unable to take the backup, the darned thing will crash!

So backup automation is key. Not only does it ensure regularity without fail, it also removes one more essential chore from our immediate plate, leaving us free to do other things without feeling guilty over this key housekeeping chore.

Cloud backup – Google vs Dropbox

Dropbox and Google Drive come across as the immediate choices for cloud-based backup. However, their free editions differ [2]… only by about 13GB of space between them. So for long-term online backup, Google Drive is the de facto choice.

Our objective

So, here is what we set out to do:

  1. create a datetime stamped backup of the database; (so we can tell just by seeing the filename when the backup was taken)
  2. compress it with the bzip2 utility; (so all those loooooong lines of SQL text do not take up so much space, since a text file can often compress down to around 10% of its original size)
  3. upload it to a specified folder on Google Drive; (so that all our backups remain in one place, date-wise)
  4. email the user that the remote backup process is complete. (so when we are outside or on vacation and don’t have access to our workstation, we still get a notification when it was completed, and if we don’t get one, then something certainly went wrong and someone should do something about it)

And of course, since we are talking about making this happen every day at the same time, we need to create a cron job that will deliver all of 1, 2, 3 and 4 to us in a single neat little command.

As you all know, no self respecting system administrator will ever be caught running the X11 windowing system on a production server. So we are going to do these the way real system admins do: from the command line.

NOTE: X11 is the geekspeak for the Graphical User Interface (GUI) environment we see e.g. when we log into an Ubuntu Desktop (which is typically the Unity desktop)

Command line in this day and age? Are you nuts???

No! And here is the reason. X11 is not only an inherently insecure protocol that puts your production system at risk, it is also (compared with a command line only system) a tremendous resource hog! We all know that more free memory (RAM) is usually-a-good-thing ™, so instead of wasting our precious RAM on running a GUI (and all the unnecessary software along with it making it slow *and* insecure) we are going to show you how to do this all from a command line. One other thing: if you ever need the assistance of an expert, you will find that command line setups are also easier to debug (for an expert), after all, aren’t they always asking you to check your “logs”? All those are after all command line output. So like the Chloromint ad below, please don’t ask us again why we love the command line! 😉

Preparations

We want a normal user account with no admin privileges. In our case we will call it l2c2backup and create it from the terminal using the adduser l2c2backup command.

Next up, we need to switch over to the new user account (e.g. with su - l2c2backup) and create a synchronization folder for Google Drive, which in our case is /home/l2c2backup/gdrive.

At this point, we’ll press “Ctrl+D” to exit from the l2c2backup user and come back to the root user or sudo user, for we now need to install a command line Google Drive client on our system. We are going to use the (almost) official Google Drive command line client for Linux known simply as “drive” and available from https://github.com/odeke-em/drive

Since we are using Debian, we have the advantage of using the pre-built binaries, which we shall install by executing each of the following commands in turn as root:

# apt-get install software-properties-common
# apt-add-repository 'deb http://shaggytwodope.github.io/repo ./'
# apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 7086E9CC7EC3233B
# apt-key update
# apt-get update
# apt-get install drive

NOTE: If you are using Ubuntu or another mainstream Linux distribution, you can use the instructions given on the Platform Packages page of the drive project.

Once we have completed the installation of “drive”, we need to go back to our /home/l2c2backup/gdrive folder as the user l2c2backup and initialize the sync folder (i.e. /home/l2c2backup/gdrive) using the command “drive init”.

Copy the really long URL that the command tells you to visit and open it in your web browser. You will see an application authorization dialog come up; click on the “Allow” button.

NOTE: Before pasting the URL, you must make sure that you are logged in to the actual Google user account where you want to send the backups. Don’t make a mess here.

Assuming you did everything as I have mentioned so far, you will be automatically redirected to the page with the authorization key. Of course, every request will generate a separate access authorization key, so use the one generated specifically against your request.

Copy this key and paste it back at the prompt in your terminal window and press <ENTER>. DO NOT TRY TO TYPE IT OUT BY HAND, COPY-N-PASTE IS THE ONLY WAY HERE!

If you have done everything alright then you should be back at the command prompt without any error or any other message. Your sync folder should now be ready.

Putting our solution together

Now that we have the Google Drive sync ready, it is time to look at each piece of our basic requirement.

1. Creating a datetime stamped backup of our database

First we need to create the name of our output file for the MySQL backup. For this we shall use BACKFILE=<dbname>.$(date +"%Y%m%d_%H%M%S").sql; where the date format gives us a datetime string formatted as “20160723_000001” when the date and time is 12:00:01 AM on 23-July-2016. For this example, let us assume that the BACKFILE shell variable holds the value koha_ghci.20160723_000001.sql.

Note: replace <dbname> with the actual name of your Koha database, which in our case is koha_ghci. So, the syntax for us looked like: BACKFILE=koha_ghci.$(date +"%Y%m%d_%H%M%S").sql; and if you want to learn more about the format specifiers of the date command, you can read its manual page (man date).
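To see what the filename will look like before wiring it into anything else, you can try the date formatting on its own. Here is a small sketch; the function name make_backfile is only our illustration and is not used elsewhere in this post.

```shell
#!/bin/bash
# Build a datetime-stamped backup filename for the given database name.
# (make_backfile is a hypothetical helper name.)
make_backfile() {
    printf '%s.%s.sql\n' "$1" "$(date +"%Y%m%d_%H%M%S")"
}

BACKFILE=$(make_backfile koha_ghci)
echo "$BACKFILE"    # e.g. koha_ghci.20160723_000001.sql, depending on the clock
```

Because the stamp is ordered year, month, day, hour, minute, second, an alphabetical listing of the backup files is also a chronological one.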

Next we will create the actual db backup using the datetime stamped output filename we just created. For that we use: mysqldump -u<mysql_db_username> -p<mysql_db_passwd> <dbname> > /home/l2c2backup/gdrive/$BACKFILE

Note: replace the <mysql_db_username>, <mysql_db_passwd> and <dbname> placeholders with your actual values. In our example case, the actual backup command string looked like this: mysqldump -ukoha_ghci -pASx2xvercbHXzs2dP koha_ghci > /home/l2c2backup/gdrive/$BACKFILE.
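A password given with -p on the command line can show up in the process list while the dump runs. This is not what we did above, but if that worries you, one alternative is to keep the credentials in a MySQL option file that only its owner can read and point mysqldump at it with --defaults-extra-file. The sketch below is an illustration under assumptions: the function name and the /root/.koha-backup.cnf path are made up for this example.

```shell
#!/bin/bash
# Write MySQL client credentials to an option file readable only by its owner.
# (write_mysql_credentials and the file location are hypothetical.)
write_mysql_credentials() {
    local file="$1" user="$2" password="$3"
    # The subshell keeps the restrictive umask from leaking into the caller.
    ( umask 077; printf '[client]\nuser=%s\npassword=%s\n' "$user" "$password" > "$file" )
    chmod 600 "$file"
}

# Usage, with the same placeholders as in the post (not real credentials):
# write_mysql_credentials /root/.koha-backup.cnf <mysql_db_username> <mysql_db_passwd>
# mysqldump --defaults-extra-file=/root/.koha-backup.cnf <dbname> > /home/l2c2backup/gdrive/$BACKFILE
```

Note that --defaults-extra-file has to be the first option given to mysqldump.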

2. Compressing our SQL export

The previous step exported our koha_ghci database as koha_ghci.20160723_000001.sql. We shall now compress this with bzip2 /home/l2c2backup/gdrive/$BACKFILE, which replaces it with the compressed file koha_ghci.20160723_000001.sql.bz2.

3. Upload the compressed SQL backup to Google drive

Before we proceed with the actual upload, we should create a dedicated directory *on* our actual Google Drive to store our backups. Let’s call this directory DBBACKUPS and create it in our online Google Drive space. It should be mentioned here that the upload command of the drive client we are using takes the form drive push --destination <remote_folder_name> <full_path_to_compressed_file>. This command asks for confirmation and we need to answer “Y” for yes before it will proceed. So we take care of that by adding echo Y | before the drive push command.

So in our case it will be echo Y | drive push --destination DBBACKUPS /home/l2c2backup/gdrive/$BACKFILE.bz2

Note: If you wish to learn about the various other options you can use with drive push, I suggest you read the drive documentation for the details.

4. Sending an email when the upload is done.

We are not running a dedicated, full-fledged mail server like, say, Postfix on this box. Rather, we are using the lightweight msmtp-mta with our Gmail account as the mail relay. If you want to know how to configure it, I suggest that you read this tutorial, ignoring the “mutt” part which you do not require. It is very simple. We had email sending working in under a minute. That’s just how long it took us to configure it.

Note: Just remember you *must* have openssl installed, otherwise you will never be able to talk to Gmail. You will also need to go to your Google account and enable support for what Google likes to call “less secure apps” (which means any app that does not use Google’s OAuth2 protocol for authentication). You will be authenticating over TLS and it is a perfectly safe thing to do, so just ignore Google’s ominous tone and enable “less secure apps”.

Now that we have msmtp-mta up and running, we will send out that email using this: printf "To: <recipient_email_address>\nFrom: <your_gmail_address>\nSubject: <dbname> db backed up on GDrive\n\nSee filename $BACKFILE.bz2 on DBBACKUPS folder on Google Drive of <your_gmail_address>.\n\nBackup synced at $(date +"%Y-%m-%d %H:%M:%S")" | msmtp <recipient_email_address>

In our case that happened to be: printf "To: indradg@l2c2.co.in\nFrom: indradg@gmail.com\nSubject: KOHA_GHCI db backed up on GDrive\n\nSee filename $BACKFILE.bz2 on DBBACKUPS folder on Google Drive of indradg@gmail.com.\n\nBackup synced at $(date +"%Y-%m-%d %H:%M:%S")" | msmtp indradg@l2c2.co.in
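Since a mail message is just a few header lines, a blank line and then the body, you can preview it on the terminal before piping it to msmtp. The following is a minimal sketch: the function name build_backup_mail and its argument order are our own invention, and the angle-bracket values are the same placeholders as above.

```shell
#!/bin/bash
# Print the notification message (headers, blank line, body) on stdout.
# (build_backup_mail is a hypothetical helper name.)
build_backup_mail() {
    local to="$1" from="$2" dbname="$3" backfile="$4"
    printf 'To: %s\nFrom: %s\nSubject: %s db backed up on GDrive\n\n' "$to" "$from" "$dbname"
    printf 'See filename %s.bz2 on DBBACKUPS folder on Google Drive of %s.\n\n' "$backfile" "$from"
    printf 'Backup synced at %s\n' "$(date +"%Y-%m-%d %H:%M:%S")"
}

# Preview only; nothing is sent here.
build_backup_mail "<recipient_email_address>" "<your_gmail_address>" "<dbname>" "koha_ghci.20160723_000001.sql"

# To actually send it, pipe the output to msmtp:
# build_backup_mail ... | msmtp <recipient_email_address>
```

Passing the values as printf arguments, instead of placing them inside the format string, means a stray % or backslash in a filename or address cannot garble the message.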

5. Putting it all together

Now that we have all the parts of the puzzle in place, it is time to assemble them into a single piece. The way it worked for us was: BACKFILE=koha_ghci.$(date +"%Y%m%d_%H%M%S").sql; mysqldump -ukoha_ghci -pASx2xvercbHXzs2dP koha_ghci > /home/l2c2backup/gdrive/$BACKFILE && bzip2 /home/l2c2backup/gdrive/$BACKFILE && echo Y | drive push --destination DBBACKUPS /home/l2c2backup/gdrive/$BACKFILE.bz2 && printf "To: indradg@l2c2.co.in\nFrom: indradg@gmail.com\nSubject: KOHA_GHCI db backed up on GDrive\n\nSee filename $BACKFILE.bz2 on DBBACKUPS folder on Google Drive of indradg@gmail.com.\n\nBackup synced at $(date +"%Y-%m-%d %H:%M:%S")" | msmtp indradg@l2c2.co.in

Note: The reason we used “&&” is that in BASH it stands for what is called “Logical AND”. In simple English this merely means that unless the previous command has executed successfully, whatever comes next simply won’t execute.
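If you have not used “&&” before, a two-line experiment with the shell built-ins true (always succeeds) and false (always fails) shows the behaviour:

```shell
#!/bin/bash
# "A && B" runs B only when A finishes with exit status 0 (success).
true  && echo "step 2 ran because step 1 succeeded"
false && echo "this line is never printed"
echo "exit status of the failed chain: $?"
```

This prints the first message, skips the second and reports an exit status of 1. In our backup chain this is exactly what we want: if mysqldump fails, nothing gets compressed, uploaded or announced by email.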

A BASH script and a cron job

We placed the one-liner we had cobbled together into the following BASH script, which we named “backuptogoogle.sh” and placed in the folder /usr/local/bin, setting its execution bit with chmod a+x /usr/local/bin/backuptogoogle.sh

#!/bin/bash
BACKFILE=koha_ghci.$(date +"%Y%m%d_%H%M%S").sql; mysqldump -ukoha_ghci -pASx2xvercbHXzs2dP koha_ghci > /home/l2c2backup/gdrive/$BACKFILE && bzip2 /home/l2c2backup/gdrive/$BACKFILE  && echo Y | drive push --destination DBBACKUPS /home/l2c2backup/gdrive/$BACKFILE.bz2 && printf "To: indradg@l2c2.co.in\nFrom: indradg@gmail.com\nSubject: KOHA_GHCI db backed up on GDrive\n\nSee filename $BACKFILE.bz2 on DBBACKUPS folder on Google Drive of indradg@gmail.com.\n\nBackup synced at $(date +"%Y-%m-%d %H:%M:%S")" | msmtp indradg@l2c2.co.in
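If you find the one-liner hard to read or maintain, the same chain can be spread over several lines. What follows is only a sketch of that idea and not the script we actually run: the variables DB_USER, DB_PASS, MAIL_TO and MAIL_FROM, the function name run_backup and the --run switch are all made up for this example.

```shell
#!/bin/bash
# Multi-line sketch of the one-liner above.
# DB_USER, DB_PASS, MAIL_TO, MAIL_FROM, run_backup and --run are hypothetical names.
run_backup() {
    local db="$1" dir="$2" backfile
    backfile="$db.$(date +"%Y%m%d_%H%M%S").sql"

    # Each step runs only if the one before it succeeded.
    mysqldump -u"$DB_USER" -p"$DB_PASS" "$db" > "$dir/$backfile" &&
    bzip2 "$dir/$backfile" &&
    echo Y | drive push --destination DBBACKUPS "$dir/$backfile.bz2" &&
    printf 'To: %s\nFrom: %s\nSubject: %s db backed up on GDrive\n\nSee filename %s.bz2 on DBBACKUPS folder on Google Drive of %s.\n\nBackup synced at %s\n' \
        "$MAIL_TO" "$MAIL_FROM" "$db" "$backfile" "$MAIL_FROM" "$(date +"%Y-%m-%d %H:%M:%S")" |
        msmtp "$MAIL_TO"
}

# Do the real work only when called with --run, so the file can be read or
# sourced without touching the database.
if [ "${1:-}" = "--run" ]; then
    DB_USER='<mysql_db_username>'
    DB_PASS='<mysql_db_passwd>'
    MAIL_TO='<recipient_email_address>'
    MAIL_FROM='<your_gmail_address>'
    run_backup koha_ghci /home/l2c2backup/gdrive
fi
```

With this layout the cron entry would have to call the script with the --run argument. Keeping the settings in variables at the bottom also means the database name or the recipient can be changed in one place.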

We set up a root user cron job by running crontab -e, adding the following line and saving it.

@daily /usr/local/bin/backuptogoogle.sh

Note: The @daily shortcut will execute our script at midnight every day. If you want to know about the other useful cron shortcuts, I suggest you read the useful post on the subject by my Koha colleague and good friend D. Ruth Bavousett.

Backup automation from command line

If you have been able to follow the instructions by suitably modifying them to your specific settings, you have just achieved backup automation from the command line. Like I said… It’s Easy Peasy!!! 😀

References:

[1] https://en.wikipedia.org/wiki/Murphy%27s_law

[2] http://www.cloudwards.net/dropbox-vs-google-drive/#features

We are switching to Let’s Encrypt as it exits public beta

Announcement!

Back in February this year, we announced that we were switching to Let’s Encrypt SSL certificates on our servers and VMs on a trial basis. We are happy to share the news that with Let’s Encrypt finally getting out of the beta stage, we are shifting to LE certs for all our SSL support.

Why Let’s Encrypt?

From the LE website, some numbers:

Since our beta began in September 2015 we’ve issued more than 1.7 million certificates for more than 3.8 million websites. [..] We set out to encrypt 100% of the Web. We’re excited to be off to a strong start, and with so much support across the industry.

Why is SSL important to you?

By default, Koha’s OPAC and intranet sites use HTTP and *not* HTTPS. This means all data exchanged between your server and your visitor / patron / library staff accessing either of the sites over the Internet goes over as plain clear text.

To give a real world analogy, this is like writing down your credit / debit card number, CVV / CVV2, expiry date on a postcard and posting it. Anyone that comes across the postcard while it goes from the sender to the recipient can read it. You are essentially broadcasting your user credentials to everyone on the Internet.

In contrast, HTTPS encrypts all the data exchanged between the communicating parties. The only parties who can read the information thus encrypted are you and your user / visitor, and no one else.

How does shifting to LE certs help our users?

Earlier we had to charge our users extra (INR 1200 to 2500 per year) for SSL support for their hosted OPAC and Koha ILS staff client. We were using the SSL services of providers like PositiveSSL etc. However, LE’s objective is to bring affordable encryption to 100% of the world wide web, and their certificate is FREE! So as our client-partner YOU get to enjoy (a) a better value proposition and (b) a well-supported, quality SSL certificate with global recognition.

You may wonder why and how LE does this. This is what LE has to say about itself on its web site:

Let’s Encrypt is a free, automated, and open certificate authority brought to you by the Internet Security Research Group (ISRG). [..] ISRG’s mission is to reduce financial, technological, and education barriers to secure communication over the Internet.

The sponsors who have committed their support to make LE a success read like a who’s who of the modern Internet as we know it. It is also supported by the American Library Association (ALA). So if you are a librarian, this endorsement shows just how seriously LE is being taken! 😀

Even if you are not a client-partner, but happen to follow this blog, we suggest that you seriously consider giving Let’s Encrypt a shot, if for nothing else than peace of mind.

Happy SSLing!