Koha Quick tip: Removing the dot from salutation like “Ms.” after a bulk patron import

A quick-n-dirty way to remove the trailing dot from salutation field after a bulk patron import into Koha.

Earlier today the deputy librarian at our client partner Gouri Devi Institute of Medical Sciences and Hospital (GIMSH) called us wanting to figure out how to fix a small, but crucial problem. The employee list from GIMSH’s HR department was used to create the patron bulk import CSV [1] file. All 143 records got imported without a glitch and each individual records looked perfectly OK when viewing them. However, when she tried to edit these patron records, she noticed that the salutation (e.g. Mr, Mrs, Ms etc.) would not show up in the salutation drop-down during edit! ūüôĀ

missing salutation
Missing salutation even though salutation was entered into the database

Explanation

About 141 out of the total 143 records had a period.” after the salutation and two didn’t. These two records which did not have the period at the end of salutation were displaying the stored value correctly in the drop-down, rather than showing as being blank. As it happens, Koha uses the BorrowersTitles system preference to store the options to display. The default options are Mr|Mrs|Miss|Ms. As you can see, none of these have a period at their end. So, while the salutation field value was stored in GIMSH’s borrowers table, it was not being displayed during the edit as the value didn’t match with the values stored in BorrowersTitles syspref.

The “Fix”

***WARNING*** The following step requires you to directly execute a SQL on the Koha database. This is usually not recommended, but some times, it is required to quickly fix a problem.

Using the TRIM() string function of MySQL in tandem with the specifier trailing [2] we can take out the “.(period) at the end of the salutation from all the records in the borrowers table in the following manner:

UPDATE `borrowers` SET `title` = TRIM(TRAILING '.' FROM `title`);
salutation drop down is working now
salutation drop down is showing the stored value correctly now

References

[1] See the entry Comma separated values from Wikipedia.

[2] String Functions in MySQL 5.5 – http://dev.mysql.com/doc/refman/5.5/en/string-functions.html#function_trim

[3] Sample patron data file – patron_import.csv NB. Those of you who wish to study how the HR data was prepared for import into Koha, can look here for a sample CSV file with dummy data (do not worry about privacy as the data is fake).

How fast does we respond to client-partner requests?

L2C2 Technologies’ client partners are located across different timezones and even when they are in India, there are quite a few morning / evening shift institutions. Obviously a standard business hours policy as per Indian Standard Time is not the best way to provide them with support. Today, a Saturday, a request ticket came in at 7:40 PM. It was closed under 25 mins.

A custom subject-wise report of titles with author name, no. of copies, subject name in serialized listing

A custom SQL report for Koha that generates subject wise title lists with author name, no. of copies, subject name and biblionumber, written in response to a reader query over email.

Last week Mr. Gautam Mukhopadhyay, Librarian, Chandrapur College in Burdwan, West Bengal wrote in with a request:

Respected Sir,

I’m writing this seeking a solution for the problem relating to a report generation from Koha.¬†I want to get a list of titles under a particular broader subject¬†field-tag¬†(650). Quite a number of times I’ve checked from SQL Report. But all were in vain as those were not the same what I actually want to get. Following is the specimen of the opted report:
Sr. No.   Title     Author   Copy No.    Subject
¬† 1 ¬† ¬† ¬† ¬† ………. ¬† ¬† X ¬† ¬† ¬† ¬† ¬†3 ¬† ¬† ¬† ¬† ¬† ¬† Bengali
Under the subject Bengali or English or whatsoever, I want to get the titles those are belong to that particular subject. However, it won’t be a problem if there are different reports for different subjects. It’s Ok. But the SQL Query should be a general query structure that can be applicable for all such reports on the titles belong to a broader subject like Bengali, History, Geography etc.
Sir, please let me know the query structure, if possible.
Regards,
GM

¬†Here is a possible solution for his request, which pretty much does what Mr. Mukhopadhyay had specified in his request.¬†In this example we’ll use a sample MARC21 file which can be downloaded from here to try out this example. This dataset has a 14 unique bibliographic records with a total of 42 item (holdings) record, belonging to 03 specific broader subjects i.e. English, Economics and Political Science. As per Mr. Mukhopadhyay’s use-case, the MARC field 650 holds the broader subject classification. However, to match real world scenarios the 650 fields in some of the cases have other subject headings defined including narrower terms. Also additionally we are going to add an additional column to our report – the biblionumber, so that if required we can cross check a title in the report generated against the biblionumber in the database.

CAVEAT EMPTOR: If you are going to try out this example, we suggest that you define a new Koha library and import this MARC file into it. Mixing this sample data with your existing records is strongly advised against.

Step #1 – Create a new Koha instance and set it up
We are going to use the koha-create Debian command to create a new Koha instance and we shall call our instance as demo.
sudo koha-create --create-db demo

You may calls your instance by whatever name you like.¬†If you are not aware of the koha-create command, please read up “Commands provided by the Debian packages“.¬†Next we will do a default setup and proceed to define a Library that we’ll call “L2C2 Technologies Demo Library” identified by the code “MAIN”, using these instructions here.

NB.: To use the marc file used in this example you must set the library code for your demo branch as “MAIN”, the name (of the library branch) can be whatever you want it to be.

Step #2 – Define a new Authorize value category
Since our example marc file has biblios with only (a) English (b) Economics and (c) Political science, we will define a new authorized value category which we’ll call as SUBLOOKUP, under¬†Home ‚Äļ Administration ‚Äļ Authorized values. Once setup our new authorized value SUBLOOKUP will look like this:
gmreport_02
This authorized value list will provide the¬†subject selection list for¬†our¬†custom SQL report. So if you have more subjects you will need to add them here so that they look like this. The “%” in the Authorized value is *critical*, and if you want to be really strict about it, you can drop the preceding “%” and retain only the one at the end. However should you do that, your first 650 field *must* always be the broader subject heading that you wish to filter your report on.
Step #3 – Define our custom SQL report
We will go to Home ‚Äļ Reports ‚Äļ Guided reports wizard ‚Äļ Create from SQL and create a new SQL report. In this case, we’ll name the report as “List title with number of copies filtered by subject”, add a note that says – “A report written at the request of Mr. Gautam Mukhopadhyay, Chandrapur College, BWN”. The SQL will be as given below. The report once¬†saved, will allow us to run it.
SELECT 
 (@row:=@row+1) AS `S/N`, 
 gmData.Title, 
 gmData.Author, 
 gmData.Copies, 
 REPLACE (@TargetSubject:=<<Select the subject|SUBLOOKUP>>, '%', '') AS Subject, 
 gmData.biblioid AS `Biblionumber` 
FROM 
 (SELECT
 biblio.title AS Title, 
 biblio.biblionumber as biblioid, 
 ExtractValue(biblioitems.marcxml,'//datafield[@tag="245"]/subfield[@code>="c"]') AS Author, 
 count(items.itemnumber) AS Copies, 
 ExtractValue(biblioitems.marcxml,'//datafield[@tag="650"]/subfield[@code>="a"]') AS Subject 
 FROM 
 items 
 LEFT JOIN 
 biblioitems on (items.biblioitemnumber=biblioitems.biblioitemnumber) 
 LEFT JOIN 
 biblio on (biblioitems.biblionumber=biblio.biblionumber) 
 GROUP BY 
 biblio.biblionumber 
 ORDER BY 
 biblio.biblionumber) as gmData, 
 (SELECT @row := 0) r 
 WHERE Subject LIKE <<Re-select the subject tag|SUBLOOKUP>>
Let us take a moment to understand what this piece of SQL syntax really means.
(@row:=@row+1) AS `S/N`,

and

(SELECT @row := 0) r

The use of the @row variable and the counter (@row:=@row+1) gives us our “serial number” column in the report listing. ¬†We can also see the authorized value list “SUBLOOKUP” that we had defined earlier referenced here in the SQL.

NOTE: As you may note, we are asking the user to select the subject *twice*, (first time: ‘Select the subject’ and second time: ‘Re-select the subject tag’).¬†While theoretically we should not be required to do so, thank to the use of the runtime variable @TargetSubject, in reality we ran into a type casting error (see below), thus we used this less than pretty way of asking the user to select the subject twice, to get our job done.

gmreport_03

Step #4 – Running the report

After the report is saved, it is now time to run it, using the “Run report” option. What we’ll see now will be like this:

gmreport_04

We need to select the *same* subject from both the drop-down lists and click on “Run the report” button. Selecting “Economics” we shall in our case get the following report:

gmreport_05

Step #5 – Prettifying the custom report user interface

Having the user to select the subject twice is cumbersome as well prone to human error, so we decided it is time for some jquery magic to streamline this and leave the users with one only a single drop-down to choose from. For this we’ll turn turn to the IntranetUserJS system preference and add the following jquery snippet:

 $("label[for='sql_params_Reselectthesubjecttag']").hide()
 $('#sql_params_Reselectthesubjecttag').hide();
 $("#sql_params_Selectthesubject").change(function() {
   var subval = $('#sql_params_Selectthesubject').val();
   $("#sql_params_Reselectthesubjecttag").val(subval);
 });

If this is the first time you are hearing about the IntranetUserJS system preference, you should definitely read up this. Those of you who are indeed familiar with IntranetUserJS, all we are doing here is to (1) hide the second subject selection dropdown and its label and then (2) we are defining that whenever the user chooses a value from the *first* drop-down, the second (and now hidden) drop-down should also have the same value selected automatically. After saving the IntranetUserJS update, on running the report we shall see this:

gmreport_06

And bingo! We are done!
Extraa Innings: To see the actual report in action
  1. Go to the URL https://demo-staff.l2c2academy.co.in/
  2. Use User name / Password : demo / demo
  3. Go to the section¬†Home ‚Äļ Reports ‚Äļ Guided reports wizard ‚Äļ Saved reports
  4. Select “Run” from the “Actions” dropdown at the right.gmreport_07
  5. Play with the subject selection options to see the different outcome.

 

 

MarcEdit QuickTip #3 – Getting your 952 (items / holdings data) field in place for import into Koha

Shows how to de-duplicate a .mrc file, by merging duplicate bibliographic records spread all over the file and then gather up the holdings record into repeatable 952 field that Koha expects its for item records.

Last night Pawan Sharma, a fellow user on “Koha Users” reached out for some help with importing his items into Koha. Like many other, he too had moved his catalog data from Microsoft Excel spreadsheet to MarcEdit utilizing MarcEdit’s “Delimited Text Translator” feature which at the end of the process had given him a .mrc file. This he proceeded to upload into Koha by using the More > Tools > Catalog > Stage MARC records for import¬†option.

There were no surprises here, *except* that for every single books with multiple copies Koha imported each of the copies as a separate biblio record, instead of a single entry for the biblio with multiple item records attached to it via Marc21 952 repeatable field that Koha uses for managing holdings data. Simply put his data needed to be de-duplicated with the holding data merged back before import, typically using the ISBN number of the records (MARC field 020).

NOTE: If you wish to read more about Koha’s holding records schema see “Holdings data fields (9xx)” from the Koha Community wiki.

For someone who has not done this before, MarcEdit’s de-duplication and then merging it can seem like a daunting task. This post will hopefully demystify the process.

The discussions on Koha Users were based on a lot of assumptions, especially with no idea about Pawan’s data. So, I offered to take a look at it. He first sent me a .mrc file that had 12806 records, which I immediately converted into MarcEdit’s MarcBreaker mnemonic, human readable format.

marcedit_01

And proceeded to take a “Field count” report (see under “Tools” menu of MarcEditor) to check exactly how many records had ISBN (MARC21 field 020) out of the total number of records.

marcedit_01A

The result as can be seen above – NOT A SINGLE ONE of the 12806 biblio records had an ISBN number! Well, this file can be de-duplicated and merged, but *not* using MarcEdit. Only being told about this Pawan mentioned that he had other .mrc files that had ISBN and so he sent a second .mrc (LG-32016-32979.mrc) file over. Turns out of the total of 965 biblio records in this second file, 828 records had ISBN numbers defined.

marcedit_02A

The¬†next task was to extract the¬†records that *had* ISBN numbers. The remaining 137 can not be dealt with in this process and will have to be dealt with separately.¬†For now, we closed the file LG-32016-32979.mrk file with 965 records and went back to the MarcEdit main window in order to use the “Delete Selected Records” option available under Tools > Select MARC Records

marcedit_02B

The next few steps are¬†simple, if not immediately apparent to a new user of MarcEdit. We’ll use the numbered markers on the screenshot to explain it in steps. First, we selected the LG-32016-32979.mrk file with the 965 records in step #1; next we typed in 020 (since we want to match for ISBN) in the Display Field option (by default it shows 245$a); third step was to click on “Import File” button. After the file is imported (takes just a second or two depending on your file size) this the top-left data grid which was blank so far, will show up data similar to this. Finally in step #4, we will click the “Does Not Match” link. Records that do not have an ISBN number will be selected just like the big red arrow here shows.

marcedit_02CThe last step is to click on “Delete Selected”, this will open a File Save dialog with the title “Remaining Records”.¬†In the case, we provided the name LG-32016-32979_ISBN.mrk¬†and saved it and exited from this deletion utility.

This file LG-32016-32979_ISBN.mrk now has the 828 records with ISBN numbers and each of which has a holding records. This is what we will work with for the deduplication process.

marcedit_02D

Using the Tools > Record Deduplication option of MarcEditor, we will now remove the duplicate records into a separate file and save it with the name LG-32016-32979_ISBN_DEDUP.mrk. We will use ISBN as the field to use to identify duplicates. A popup showed us that 828 records processes, so we are done with deduplication. We will also need to save our original work file LG-32016-32979_ISBN.mrk. This file now contains biblio records with unique ISBN number. A quick check with the Fields Count tool showed us there were now 523 records (down from 828 records originally, the rest 305 records are the duplicates that are now saved in LG-32016-32979_ISBN_DEDUP.mrk).

marcedit_02E

¬†Now for the next step MARC Merge, which was the last step in this process. We have to go back to the main MarcEdit window and use the menu option Tools > Merge Records. The order of files we specify here is highly¬†*important*. The “Source File” in this case was¬†LG-32016-32979_ISBN.mrk¬†(the file with the 523 records with unique ISBN numbers), the “Merge File” is¬†LG-32016-32979_ISBN_DEDUP.mrk (the file where we had removed the duplicates to in the previous step) and finally, “Save File” is simply the name of the new merged file we are going to create (Hint: this is the final file that we will push to Koha). We named the final file as¬†LG-32016-32979_ISBN_MERGED.mrk. The Record Identifier is of course 020 (i.e. ISBN number) and we move on the next screen.

marcedit_02F

This is next step is basically *everything* we have been working for in this post so far, we select the field to merge in from “Merge File” into the “Source File” and click next.

marcedit_02GIn this¬†case everything went well and we were presented with the following screen that said “Merge Completed” and gave us the full path and filename to our merged file¬†LG-32016-32979_ISBN_MERGED.mrk.

marcedit_02H

Of course we opened up the LG-32016-32979_ISBN_MERGED.mrk file in MarcEditor. The first thing was to check the Field Count report, and this is what we saw 523 biblio records with a total of 828 holding records, which sounds right! Below is example of the merged holdings.

marcedit_02I

Of course there is still the task of exporting the MarcBreaker (.mrk) back to .mrc so that Koha can ingest it for its MARC21 staging workflow, but everyone knows that ūüôā

NOTE: For reference to this tutorial I’m attaching the zip file containing all the¬†LG-32016-32979 files used in this example.