Exporting data

Our data is free to use subject to the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 license. You can export the entire database, including the relationships among entities, as CSV files. This page explains how.

Use cases

The export feature is useful if you want to do one of the following:

  • Explore and analyze the data in ways that the Pandit website interface does not support.
  • Create a visualization based on the relationships among entities.

Before you export

  1. The exported data is current as of the moment you download it. Remember that the downloaded file is only a copy of the data: any change made in the Pandit database after your export is not reflected in your copy.
  2. The export process may take a long time if you download more than one thousand entities. We therefore recommend that you export only the data you actually need.
  3. A single exported file may be as big as 5 MB if you export many thousands of entities.
  4. Do not close the browser tab while the export file is being built; otherwise you will have to start the export again.

Create an exported file

We recommend that you export only the data that is relevant to your research. Filtering out irrelevant entities will make it easier to explore and analyze the data.

Each of the search pages has a button called "Export as CSV". The search pages are Works, People, Sites, Institutions, States, Manuscripts, Extracts and Print sources. When you click the button, the system builds a CSV file for you. Note that a separate CSV file is exported for each entity type. If you need data for more than one entity type, repeat the export process on the search page of each entity type you need.

Instructions for exporting results from any of the search pages mentioned above:

  1. Limit the results by using the filters that are located above the results and in the left sidebar.
  2. Click the "Apply" button for the filtering to take effect.
  3. Browse the results to verify that the data is filtered as you expect.
  4. Click the "Export as CSV" button and wait. The page will refresh and you may see a blue progress bar (with thousands of entities this can take a few minutes).
  5. When the system has built the export file, it prompts you to save it. Save it to your computer.
  6. Verify the validity of the file (see below how to do it).

The name of the downloaded CSV file contains the type of the exported entities and the time of export.
Example: person_search-2019-12-31-23-59-panditproject.org.csv (that is, data about people, current as of the last minute of 2019).
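
If you keep several exports side by side, a short script can read the entity type and export time back from the file name. The following is a minimal Python sketch, not an official Pandit tool. The pattern is an assumption inferred from the single example above, and the function name parse_export_name is illustrative.

```python
import re
from datetime import datetime

# Assumed pattern, inferred from the one example file name:
# <type>_search-YYYY-MM-DD-HH-MM-panditproject.org.csv
EXPORT_NAME = re.compile(
    r"^(?P<entity_type>.+)_search-"
    r"(?P<stamp>\d{4}-\d{2}-\d{2}-\d{2}-\d{2})"
    r"-panditproject\.org\.csv$"
)


def parse_export_name(filename):
    """Return (entity_type, export_time), or None if the name does not match."""
    match = EXPORT_NAME.match(filename)
    if match is None:
        return None
    export_time = datetime.strptime(match.group("stamp"), "%Y-%m-%d-%H-%M")
    return match.group("entity_type"), export_time


print(parse_export_name("person_search-2019-12-31-23-59-panditproject.org.csv"))
# ('person', datetime.datetime(2019, 12, 31, 23, 59))
```

A file that your browser renamed (for example by appending a counter) will not match the pattern and yields None.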

Open a search exported file

Once the file is saved to your computer, you can view its data by opening it in your preferred program or tool, such as Google Sheets, Microsoft Excel, or OpenOffice/LibreOffice Calc. Note that if the file is very big, both loading it and editing it in your program might be slow.

We recommend using Google Sheets. The step-by-step instructions are:

  1. Add a new spreadsheet (File > New > Spreadsheet).
  2. Open the "Import file" dialog (File > Import...).
  3. Select the "Upload" tab.
  4. Either drag the CSV file to the dialog or use the select option.
  5. In the newly opened "Import file" dialog choose "Insert new sheet(s)", "Detect automatically" and "Yes" (convert text to numbers, dates, and formulas). Click "Save data".
  6. Verify the file's validity: check that the number of rows in the exported file, not counting the first row (which holds the column names), is the same as the result count stated on the search page. If it is not, the export did not complete correctly and you should create a new export file (this rarely happens).
  7. Freeze the first row, which holds the names of the fields (View > Freeze > 1 row).
  8. Depending on your needs, it may be handy to also freeze the first column or even the first two columns (the first column is always the "Entity ID").
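
The row-count check in step 6 can also be done without a spreadsheet. The following is a minimal Python sketch, not part of Pandit. The function names and the sample data are invented for illustration. Rows are counted with a CSV parser instead of by counting text lines, because a quoted field may itself contain line breaks.

```python
import csv
import io


def count_entities(csv_file):
    """Count the data rows of an open CSV file, skipping the header row."""
    reader = csv.reader(csv_file)
    next(reader, None)  # the first row holds the column names
    return sum(1 for row in reader if row)  # ignore completely empty lines


def export_is_complete(csv_file, expected_count):
    """Compare the row count with the result count shown on the search page."""
    return count_entities(csv_file) == expected_count


# Tiny in-memory stand-in for a real export (invented values).
sample = io.StringIO(
    'Entity ID,Title\n'
    '101,"A title, with a comma"\n'
    '102,"A title on\ntwo lines"\n'
)
print(export_is_complete(sample, 2))  # True
```

For a real file, pass open(path, newline="", encoding="utf-8") instead of the in-memory sample. The UTF-8 encoding is an assumption; adjust it if your file reads incorrectly.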

Google Sheets offers some basic analysis options based on free-text questions. See the "Explore" button at the bottom of the spreadsheet (four-pointed star).

Open a template-based exported file

Sometimes we prepare a Google spreadsheet as a template. Currently, this is the case only for Manuscripts grouped by work grouped by author. If you are exporting data from the regular search pages, this section isn't relevant to you.

Open the exported file as described here.

  1. Go to the template spreadsheet that is linked from the page you export data from.
  2. Make your own copy of the spreadsheet (File > Make a copy).
  3. In your copy, go to the tab "Exported data".
  4. Open the "Import file" dialog (File > Import...).
  5. Select the "Upload" tab.
  6. Either drag the CSV file to the dialog or use the select option.
  7. In the newly opened "Import file" dialog, choose "Replace current sheet", "Detect automatically", and "Yes" (convert text to numbers, dates, and formulas). Click "Save data".
  8. Verify the file's validity: check that the number of rows in the exported file, not counting the first row (which holds the column names), is the same as the result count stated on the search page. If it is not, the export did not complete correctly and you should create a new export file (this rarely happens).
  9. Switch to the "Prepared view" sheet (tab). The data should be displayed there in a predefined manner.
  10. If you wish, you can get the results as a PDF file (File > Download > PDF document).

Analyze an exported file

The CSV file contains the values of all the fields of the entity and not only the ones that are displayed in the results of the search page.

  • The first line of the CSV file contains a comma-separated list of the field names ("columns"); each of the remaining lines represents one entity.
  • Whenever the entity type has an outgoing reference field, the file has two columns for it: the first holds the referenced entity's ID and the second holds the referenced entity's title/name/identifier.
  • If multiple values were entered into a field, they are also separated by commas.
  • If you need to know the type of each field, refer to the Configuration Sheet and choose the tab named after the entity type of your file. The "Data type" column tells you whether the value is plain text, one of a list of predefined options, an integer, a reference to another entity, etc.
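
The structure described above can be read with any CSV library. The following is a minimal Python sketch under stated assumptions: the column names "Teachers ID" and "Teachers" and all sample values are hypothetical, so check the header row of your own file for the real names. The helper split_values is illustrative.

```python
import csv
import io


def split_values(cell):
    """Split a multi-valued cell on commas and trim surrounding whitespace."""
    return [part.strip() for part in cell.split(",") if part.strip()]


# Invented sample: an ID column, a name, and one reference field that is
# stored as two columns (referenced IDs, then referenced names).
sample = io.StringIO(
    'Entity ID,Name,Teachers ID,Teachers\n'
    '1,Person A,,\n'
    '2,Person B,"1","Person A"\n'
    '3,Person C,"1,2","Person A,Person B"\n'
)

rows = list(csv.DictReader(sample))  # one dict per entity, keyed by column name
for row in rows:
    print(row["Entity ID"], split_values(row["Teachers ID"]))
# 1 []
# 2 ['1']
# 3 ['1', '2']
```

Splitting on commas is safest for the ID column. A title or name may itself contain a comma, in which case splitting the name column this way would give wrong results.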

Note that a CSV file does not contain values from entities that refer to the entities in the file. That is, backward references are not included. If you need the entities that refer to a given entity, you can find them through the columns that refer to that entity.

There are two cases here:

  1. The entity refers to a second entity of the same entity type. In this case, the relationship you are looking for resides in columns of the same file.
    Example: Say you have a CSV of people and you want to know who the students of a given person are. However, there is no "Students" column. In this case, search for the given person's entity ID in the "Teachers" columns; every person whose row contains it is a student.
  2. The entity refers to a second entity of another entity type. In this case, the relationship you are looking for resides in columns of another file.
    Example: You have a CSV file of people and you want to know which works a given person wrote. However, there is no "Works" column in the file. In this case, you can find the relationship in the CSV of works, in the column named "Authors".
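
Both cases come down to the same lookup: scan a reference column for the ID of the entity you are interested in. The following is a minimal Python sketch of that lookup. The column names "Teachers ID" and "Authors ID", the function name referrers, and all sample values are hypothetical; use the column names found in the header row of your own files.

```python
import csv
import io


def referrers(rows, id_column, target_id):
    """Return the rows whose reference column `id_column` lists `target_id`."""
    target = str(target_id)
    found = []
    for row in rows:
        # Compare whole IDs, so that ID 1 does not match ID 12.
        ids = [part.strip() for part in (row.get(id_column) or "").split(",")]
        if target in ids:
            found.append(row)
    return found


# Invented stand-ins for a people export and a works export.
people = list(csv.DictReader(io.StringIO(
    'Entity ID,Name,Teachers ID,Teachers\n'
    '1,Person A,,\n'
    '2,Person B,"1","Person A"\n'
    '3,Person C,"1,2","Person A,Person B"\n'
    '11,Person D,"12","Person E"\n'
)))
works = list(csv.DictReader(io.StringIO(
    'Entity ID,Title,Authors ID,Authors\n'
    '10,Work X,"1","Person A"\n'
    '11,Work Y,"1,3","Person A,Person C"\n'
    '12,Work Z,"2","Person B"\n'
)))

# Case 1: students of person 1, found in the same file.
print([p["Name"] for p in referrers(people, "Teachers ID", 1)])
# ['Person B', 'Person C']

# Case 2: works written by person 1, found in the works file.
print([w["Title"] for w in referrers(works, "Authors ID", 1)])
# ['Work X', 'Work Y']
```

Keep in mind that a filtered export only contains the entities that matched your filters, so a lookup like this can only find referring entities that are present in the file.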

There are third-party tools that can help you make the connections and get the "backward references" of the entities.

Export contribution of a specific user or project

You can get a list of the names and IDs of all the entities that were edited or created by a specific contributor or that are attributed to a specific project.

  1. Go to the page of a specific user or a specific project. You can pick a contributor from the Community page. If you want to export the entities that you have contributed to, go to your account page.
  2. You can see the number of entities to which the user contributed. Click "See full list".
  3. Filter it according to your needs.
  4. Click "Apply".
  5. Click "Export as CSV" or "Export as TXT".
  6. Save the file the system built for you.

You can open and analyze a CSV file as explained above. A TXT file can be opened with any text editor or word processing program.