Data Carpentry for Library Professionals: Practical Skills for the Digital Age

The Charutar Vidya Mandal (CVM) University

Oct 12-17, 2023

11:00 AM to 12:30 PM (IST) & 2:00 to 3:30 PM (IST)

Instructors: Prof. Parthasarathy Mukhopadhyay, Prof. Aditya Tripathi, Dr. Raina Gaharwar

General Information

Library Carpentry is made by people working in library- and information-related roles. It introduces you to the fundamentals of computing and provides you with a platform for further self-directed learning. For more information on what we teach and why, please see our paper "Library Carpentry: software skills training for library professionals".

Who: The course is for people working in library- and information-related roles. You don't need to have any previous knowledge of the tools that will be presented at the workshop.

Where: Online.

When: Oct 12-17, 2023.

Requirements: Participants must bring a laptop with a Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.) that they have administrative privileges on. They should have a few specific software packages installed (listed below).

Accessibility: We are committed to making this workshop accessible to everybody. Materials will be provided in advance of the workshop, and large-print handouts are available if you notify the organizers in advance. If we can help make learning easier for you (e.g. sign-language interpreters, lactation facilities), please get in touch (using the contact details below) and we will attempt to provide what you need.

Contact: Please email the organizers for more information.

Roles: To learn more about the roles at the workshop (who will be doing what), refer to our Workshop FAQ.

Further information: Participation fee: Rs.1200 (Participants from India) and 20 USD (Other Participants).

Organization: The workshop is organized by The Charutar Vidya Mandal (CVM) University, Vallabh Vidyanagar-388120, Gujarat, India

Code of Conduct

Everyone who participates in Carpentries activities is required to conform to the Code of Conduct. This document also outlines how to report an incident if needed.


Please be sure to complete these surveys before and after the workshop.

Pre-workshop Survey

Post-workshop Survey


Day 1 (Data Carpentry Tool - OpenRefine)

11:00 Introduction to Data Carpentry: tools & rules
11:45 Installation and configuration of OpenRefine on Linux and Windows
12:30 Lunch break
14:00 Concept of REST/API and sources for API-enabled ODbL datasets
14:45 Experiment with data fetching and data extraction in OpenRefine (Open Access Status determination)
15:30 END

Day 2 (Advanced Data Carpentry)

11:00 Library Carpentry for citation and altmetric data
11:45 Deep faceting of data and GREL in depth
12:30 Lunch break
14:00 Data reconciliation: concept and tools
14:45 MARC data import into and export from OpenRefine; data reconciliation with name authority and subject authority
15:30 Summary
15:45 END

Day 3 (Data Carpentry for Text)

11:00 Named Entity Recognition (NER)
11:45 Use of Stanford NLP
12:30 Lunch break
14:00 Automatic sentiment analysis
14:45 Automated subject indexing - the future possibility (Demonstration only)
15:30 END

Day 4 (MarcEdit)

11:00 Introduction to MarcEdit and Getting Started with MarcEdit
11:45 Working with MARC files, Layout of the MarcEditor and Profiling Your MARC data
12:30 Lunch break
14:00 Manipulating MARC data and Tasks and Automation
14:45 Integrations and Regular Expressions
15:30 END

Day 5 (Introduction to Git/GitHub and Web Scraping)

11:00 Introduction to Git/GitHub and Getting started with Git
11:45 Sharing your work, Review and GitHub Pages
12:30 Lunch break
14:00 Introduction: what is web scraping? Open-source web scraping tools
14:45 Selecting content on a web page with XPath, manually scraping data using browser extensions, and web scraping for content creation
15:30 END


To participate in a Library Carpentry workshop, you will need access to software as described below. In addition, you will need an up-to-date web browser.

We maintain a list of common installation issues on the Configuration Problems and Solutions wiki page. It is intended as a reference for instructors, but participants may find it useful too.


OpenRefine is a tool to clean up and organize messy data. Installation instructions and the data used in the lesson are provided in the lesson.
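To give a feel for what "cleaning messy data" means in practice, here is a minimal Python sketch of key-collision clustering, the general idea behind one of OpenRefine's clustering methods. It is an illustration under stated assumptions, not OpenRefine's own code: the function names and the sample publisher list are hypothetical, and the sketch skips the accent normalization that OpenRefine also applies.

```python
import string
from collections import defaultdict


def fingerprint(value):
    """Build a normalized key so that trivially different spellings collide."""
    cleaned = value.strip().lower()
    # Drop ASCII punctuation (simplified; accents are left untouched here).
    cleaned = cleaned.translate(str.maketrans("", "", string.punctuation))
    # Split on whitespace, remove duplicate tokens, and sort them.
    tokens = sorted(set(cleaned.split()))
    return " ".join(tokens)


def cluster(values):
    """Group values that share a fingerprint; keep only groups with variants."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return [group for group in groups.values() if len(set(group)) > 1]


# Hypothetical sample data: three spellings of one publisher plus two others.
PUBLISHERS = [
    "Oxford University Press",
    "oxford university press.",
    "Press, Oxford University",
    "Springer",
    "Elsevier",
]

clusters = cluster(PUBLISHERS)
for group in clusters:
    print(group)
```

In OpenRefine itself this kind of grouping is done through the graphical clustering dialog, so no programming is needed during the workshop.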


MarcEdit is a free application for editing MARC data. Installation instructions and the data used in the lesson are provided in the lesson.


Git is a version control system that lets you track who made changes to what and when, and has options for easily updating a shared or public version of your code on GitHub.

Follow the instructions in the lesson to install Git on your system.

You will need a GitHub account for parts of the Git lesson. Basic GitHub accounts are free, and we encourage you to create one if you don't have one already. Please consider what personal information you'd like to reveal. For example, you may want to review GitHub's instructions for keeping your email address private. You will also need a supported web browser.


Web scraping is the process of extracting data from websites. There are a variety of ways to scrape a website to extract information for reuse. Refer to the Setup section of the lesson to install the software required to follow along.
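As a small preview of selecting content with XPath, the following Python sketch extracts names and links from a page fragment using only the standard library. It is a simplified illustration, not the lesson's method: the markup, the `extract_members` function, and the URLs are all hypothetical, and the standard-library parser supports only a subset of XPath and expects well-formed markup, which real web pages often are not.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed fragment standing in for a downloaded page.
PAGE = """
<html>
  <body>
    <ul id="staff">
      <li class="member"><a href="/staff/1">A. Librarian</a></li>
      <li class="member"><a href="/staff/2">B. Cataloguer</a></li>
      <li class="note">Updated weekly</li>
    </ul>
  </body>
</html>
"""


def extract_members(markup):
    """Return one dict per link found inside list items of class 'member'."""
    root = ET.fromstring(markup)
    rows = []
    # XPath: any <li> with class="member", then its direct <a> child.
    for link in root.findall(".//li[@class='member']/a"):
        rows.append({"name": link.text.strip(), "url": link.get("href")})
    return rows


members = extract_members(PAGE)
for row in members:
    print(row["name"], row["url"])
```

The list item with class "note" is skipped because the XPath predicate matches only items of class "member".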