Posts tagged ‘geneagrapher’

Geneacache

First a note about the Geneagrapher: a new release is impending. The release includes many internal changes: lots of refactoring to improve the code, better conformance to Python coding conventions (remember, I wrote the first version a long time ago and my proficiency with Python has improved a lot since then), better code coverage by the tests, better design to enable more extensibility, and a local caching mechanism to eliminate multiple network requests for the same record. I will explain what I mean by extensibility in a later post.

Now, the point of this post: thanks to the changes to Geneagrapher mentioned above, I have set up a web-based “Geneacache”. It is a very early preview, so the API may change in the near future. A lot of other changes are needed, too, and it is possible I will move the page to a different address. Here’s the idea: your software can use the Geneacache to retrieve records from the Mathematics Genealogy Project (MGP), saving you the trouble of scraping the MGP pages or having to use Geneagrapher to do it for you. The response contains the record’s information in JSON. For example, for Gauß you currently get:

{
    "advisors": [
        18230
    ],
    "descendants": [
        151876,
        55175,
        29642,
        18603,
        19953,
        29458,
        62547,
        18232,
        18233
    ],
    "institution": "Universität Helmstedt",
    "name": "Carl Friedrich Gauß",
    "year": 1799
}

Behind the scenes, the Geneacache either returns what it has locally or fetches it from the MGP, stores it locally, and then returns the record to you.
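The return-or-fetch behavior described above is a read-through cache. Here is a minimal sketch of the idea, not the actual Geneacache code: the class name, the `fetch` callback, and the in-memory dictionary are all assumptions for illustration, and the stand-in fetch function returns a trimmed copy of the Gauß record instead of contacting the MGP.

```python
from typing import Callable, Dict, List

Record = dict

calls: List[int] = []  # IDs for which the stand-in fetch was actually invoked


def fake_fetch(record_id: int) -> Record:
    """Hypothetical stand-in for a real MGP lookup (no network access)."""
    calls.append(record_id)
    return {"name": "Carl Friedrich Gauß", "advisors": [18230], "year": 1799}


class ReadThroughCache:
    """Return a stored record if present; otherwise fetch, store, and return it."""

    def __init__(self, fetch: Callable[[int], Record]) -> None:
        self._fetch = fetch
        self._store: Dict[int, Record] = {}  # assumed in-memory store

    def get(self, record_id: int) -> Record:
        if record_id not in self._store:
            # Cache miss: fetch once and keep the result locally.
            self._store[record_id] = self._fetch(record_id)
        return self._store[record_id]


cache = ReadThroughCache(fake_fetch)
first = cache.get(18231)   # miss: calls fake_fetch
second = cache.get(18231)  # hit: served from the local store
```

After the two lookups, `calls` holds a single entry, which is the point: repeated requests for the same record cause only one fetch.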

This is not used for anything at the moment, but I intend to start exposing Geneagrapher through a web page again at some point in the future (history lesson: the first version of Geneagrapher, from when I was in graduate school, was a web service). The Geneagrapher client is nice and all, but I imagine most users are not interested in installing it locally to use it.

I am also planning to get in touch with the MGP folks about this and related topics.

Geneagrapher Repository Moved

I have moved the Geneagrapher repository to GitHub using svn2git. The repository’s new home: https://github.com/davidalber/Geneagrapher.

I had to reorganize my Subversion repository to make it work, and my tags were trashed in the end due to conflicts between my tags and the way Git tags things. That’s not a huge deal, though, because I can recreate everything, if needed. Anyhow, the big take-home message here is that the Geneagrapher repository is now open to the public.

One other action that I took, after moving the repository, was to nuke the trunk (er, master). I had made new feature progress in the trunk years ago, but stopped. I decided it would just be easier to continue by reverting the trunk to the latest maintenance branch.

Geneagrapher 0.2.1-r2 Released

Version 0.2.1-r2, a maintenance release, of the Geneagrapher is now available.

A test in Geneagrapher 0.2.1-r1 was broken by new information added to the Mathematics Genealogy Project. This release fixes the test, but does not change the functionality from version 0.2.1-r1. Existing installations need not install this release.

For more information, please see the Geneagrapher page and the Geneagrapher 0.2.1-r1 release announcement.

Geneagrapher 0.2.1-r1 Released

Version 0.2.1-r1, a maintenance release, of the Geneagrapher is now available.

A few tests in Geneagrapher 0.2.1 have become broken since that version was released. This release fixes those tests, but does not change the functionality from version 0.2.1. Existing installations need not install this release.

For more information, please see the Geneagrapher page and the Geneagrapher 0.2.1 release announcement.

Geneagrapher 0.2.1 Released

Version 0.2.1 of the Geneagrapher is now available for installation. This release does not add new features to the software, but it does fix a debilitating issue that caused multiple advisors to be ignored (this problem was introduced following changes to the Mathematics Genealogy Project pages). Two users brought this to my attention, and I am grateful for their help (although I initially thought the problem was isolated to the recently-deleted, original web-based version of the Geneagrapher).

All pre-0.2.1 installations of the Geneagrapher should be updated to Version 0.2.1.

Changes made for this release:

  • Multiple advisors are now captured correctly. While this problem was manifesting itself, ancestor trees were coming out as branch-free trees.
  • Added a test for the multiple advisor case, which enables quicker recognition of similar problems.
  • Updated a few tests that had become broken due to updates in the Math Genealogy Project’s database.

Since the features remain unchanged, please see the Geneagrapher 0.2 release announcement for more information, including how to find and install the package.

Geneagrapher 0.2-r1 Released

Python 2.6 was released less than a week ago. This Geneagrapher release slightly changes an installation-related file to enable installation on machines running Python 2.6 that have not yet installed Python setuptools.

The Geneagrapher features in this release are identical to those in version 0.2, so interested parties should read the Geneagrapher 0.2 release announcement for more information on the package.

If you have successfully installed Geneagrapher 0.2, there is no need to install version 0.2-r1.

Geneagrapher 0.2 Usage Guide

The purpose of this post is to explain how to use version 0.2 of the Mathematics Genealogy Grapher (Geneagrapher). For more information about the release of version 0.2 see the release announcement. For more information about the Geneagrapher in general, see the Mathematics Genealogy Grapher Page.

Basic Concepts

The input to the Geneagrapher is a set of starting nodes. If you want to build the ancestor graph of C. Felix Klein, then C. Felix Klein is the starting node for that graph. Multiple starting nodes may be provided (to produce the combined ancestor graph for an academic department, for instance).

Each individual stored in the Mathematics Genealogy Project’s website has a unique integer as an identifier, and this identifier is what is passed to the Geneagrapher for starting nodes. The identifier is embedded in the URL for records in the Mathematics Genealogy Project website. For example, Carl Gauß has the ID 18231 (http://genealogy.math.ndsu.nodak.edu/id.php?id=18231) and Leonhard Euler has the ID 38586 (http://genealogy.math.ndsu.nodak.edu/id.php?id=38586).
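If you already have a record URL on hand, the identifier can be pulled out of the query string programmatically. The following is a small sketch using only the Python standard library; the function name `mgp_id_from_url` is made up for this example and is not part of the Geneagrapher.

```python
from urllib.parse import parse_qs, urlparse


def mgp_id_from_url(url: str) -> int:
    """Extract the integer record identifier from an MGP record URL."""
    query = parse_qs(urlparse(url).query)
    values = query.get("id")
    if not values:
        raise ValueError("no 'id' parameter found in URL: " + url)
    return int(values[0])


gauss_id = mgp_id_from_url("http://genealogy.math.ndsu.nodak.edu/id.php?id=18231")
euler_id = mgp_id_from_url("http://genealogy.math.ndsu.nodak.edu/id.php?id=38586")
```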

Before running the Geneagrapher, go to the Mathematics Genealogy Project and gather the identifiers of the starting nodes for the graph you have in mind.

Geneagrapher Usage

After installing the Geneagrapher, running

ggrapher --help

should produce

Usage: ggrapher [options] ID ...

Create a Graphviz "dot" file for a mathematics genealogy, where ID is a record
identifier from the Mathematics Genealogy Project. Multiple IDs may be passed.

Options:
  -h, --help            show this help message and exit
  -f FILE, --file=FILE  write output to FILE [default: stdout]
  -a, --with-ancestors  retrieve ancestors of IDs and include in graph
  -d, --with-descendants
                        retrieve descendants of IDs and include in graph
  -v, --verbose         list nodes being retrieved
  -V, --version         print version and exit

Explanations of some of the options are given below, followed by examples.

-f FILE, --file=FILE

By default, the Geneagrapher writes the data it generates to standard output. If you want the data written to a file, you need to redirect the output or use the -f or --file switch. When one of these switches is used, the data is saved in the file with the name provided.

-a, --with-ancestors

When one of these switches is provided to the Geneagrapher, an ancestor graph is generated. An ancestor graph starts with the starting nodes and then works up to their advisors, their advisors’ advisors, and so on.

-d, --with-descendants

These switches instruct the Geneagrapher to extract information about the descendants of the starting nodes (i.e., their advisees, their advisees’ advisees, and so on).
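Conceptually, both switches describe a graph traversal: follow advisor links upward or advisee links downward until nothing new turns up. Here is a rough sketch of that idea, not the Geneagrapher’s actual implementation. The `RECORDS` dictionary is a hand-trimmed, illustrative stand-in for data that would normally come from the Mathematics Genealogy Project, and the `collect` function and the "advisors"/"descendants" keys are assumptions for this example.

```python
from collections import deque
from typing import Dict, Iterable, List, Set

# Trimmed, illustrative records keyed by MGP identifier (not complete data).
RECORDS: Dict[int, Dict[str, List[int]]] = {
    18230: {"advisors": [], "descendants": [18231]},
    18231: {"advisors": [18230], "descendants": [18603, 29642]},  # Gauß
    18603: {"advisors": [18231], "descendants": []},              # Bessel
    29642: {"advisors": [18231], "descendants": []},              # Gerling
}


def collect(records: Dict[int, Dict[str, List[int]]],
            start_ids: Iterable[int],
            key: str) -> Set[int]:
    """Breadth-first walk from the starting nodes along the given link type."""
    start = list(start_ids)
    seen: Set[int] = set(start)
    queue = deque(start)
    while queue:
        current = queue.popleft()
        for neighbor in records.get(current, {}).get(key, []):
            if neighbor not in seen:  # visit each record only once
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


ancestors = collect(RECORDS, [18603, 29642], "advisors")
descendants = collect(RECORDS, [18231], "descendants")
```

Passing several starting IDs yields the combined graph, and the `seen` set keeps shared ancestors from being visited twice.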

Processing the Dot File

To process the generated dot file, Graphviz is needed. Graphviz installs several programs for processing dot files. For the Geneagrapher, I use the dot program. Let’s look at an example.

If the Geneagrapher has generated a file named ‘graph.dot’, we can do

dot -Tpng graph.dot > graph.png

This command produces a PNG file containing the graph. That’s really all there is to it. Almost.

A slightly more complicated process

I find that nicer looking final images are produced by following a more circuitous route. In the example above, I would run

dot -Tsvg graph.dot > graph.svg

This produces an SVG file. At this point, I use Inkscape to open the file and export a PNG file.

A number of other ways to do this are available.
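One of those other ways is to drive Graphviz from a script. Below is a minimal sketch that assumes the dot program is installed and on the PATH. The helper names are invented for this example. It uses dot’s -T (output format) and -o (output file) options instead of shell redirection.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def build_dot_command(dot_file: str, fmt: str = "svg") -> List[str]:
    """Build the argument list for dot; the output is written next to the input."""
    output = Path(dot_file).with_suffix("." + fmt)
    return ["dot", "-T" + fmt, dot_file, "-o", str(output)]


def render(dot_file: str, fmt: str = "svg") -> Path:
    """Run dot on the given file and return the path of the generated image."""
    if shutil.which("dot") is None:
        raise RuntimeError("Graphviz 'dot' was not found on PATH")
    command = build_dot_command(dot_file, fmt)
    subprocess.run(command, check=True)  # raises CalledProcessError on failure
    return Path(command[-1])


png_command = build_dot_command("graph.dot", "png")
```

Calling `render("graph.dot", "png")` would then have the same effect as the dot -Tpng command shown earlier.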

Examples

Update: the Mathematics Genealogy Project has added new data since the examples below were constructed, so if re-run, the results will look different. The commands, however, all remain correct.

Single Node Ancestry: Carl Gauß

To produce the ancestry dot file for Carl Gauß (http://genealogy.math.ndsu.nodak.edu/id.php?id=18231) and save it in the file ‘gauss.dot’, run the command

ggrapher -f gauss.dot -a 18231

Multiple Node Ancestry: Friedrich Bessel and Christian Gerling

To produce the combined ancestry dot file for Friedrich Bessel (http://genealogy.math.ndsu.nodak.edu/id.php?id=18603) and Christian Gerling (http://genealogy.math.ndsu.nodak.edu/id.php?id=29642) and save it in the file ‘bessel_gerling.dot’, run the command

ggrapher -f bessel_gerling.dot -a 18603 29642

Single Node Descendant Graph: Haskell Curry

To produce the descendant dot file for Haskell Curry (http://genealogy.math.ndsu.nodak.edu/id.php?id=7398) and save it in the file ‘curry.dot’, run the command

ggrapher -f curry.dot -d 7398

Note that descendant graphs often have a lot of “fan out”.

Geneagrapher 0.2 Released

I am pleased to announce the first release of the Mathematics Genealogy Grapher (Geneagrapher) package. The Geneagrapher has been around for a couple years, but it was previously only a web-based tool. At this time the original version is still available on my old site. This package is written in Python, so users will need to have Python installed (get it here).

Here are the most significant changes, from the perspective of the user:

  • Descendant trees. Now trees can be built placing a starting node at the top and graphing all of its descendants. A couple points on this:
    • These sorts of graphs tend to have a lot of “fan out” because some people have a lot of students.
    • Be careful. Do not inadvertently (or intentionally!) run a job that requests the data for thousands of nodes.
  • Better character handling. I believe all characters are now displayed correctly, as long as the generated dot file is processed by Graphviz a certain way (see the Geneagrapher 0.2 Usage Guide).
  • No limit on the number of starting nodes.
  • This is a client application, meaning the user installs it somewhere and runs it there. Furthermore, this package only generates the input file to Graphviz, so that also needs to be installed. This is probably more of a hassle than most Geneagrapher users want to go through (not all, though), but this is just the first step.

Additionally, behind-the-scenes changes happened:

  • Large portions of the code were rewritten.
  • Added a test suite to make it more maintainable. In particular, this should allow quicker diagnosis and modifications when the Mathematics Genealogy Project pages have changed.

Getting the Package

For downloading and installation information, see the Geneagrapher Page.

Instructions

Usage examples are in the Geneagrapher 0.2 Usage Guide.

Geneagrapher Development

Earlier this year I spent some time cleaning up (and rewriting portions of) the geneagrapher tool. I intend to release the source as soon as I get it packaged, but want to take this opportunity to list some of the properties and features of the upcoming release.

  • Command-line based. The previous geneagrapher was designed to be completely web-based. I have written this one so that other people can use the tool on their machines.
  • Produces Graphviz dot files. Processing the file is left to the user.
  • The ability to generate trees with ancestors and/or descendants. The original tool only had the ability to generate ancestor trees, primarily because a descendant tree can be very large.
  • The tool works by crawling the Math Genealogy site, and each time the site’s design changes, the original geneagrapher tool breaks. This new version has a number of tests that will help to detect this earlier and to reduce the time needed to adjust.