GEDCOM Standards: Is an Update Coming?

GEDCOM, according to Wikipedia, is an acronym for GEnealogical Data COMmunication. It is a type of file used by genealogy programs as a standard for presenting family history facts in a text file easily read by other genealogy programs, websites, and services. It is used to pass family history research to and from computer programs, to upload into family history trees, online services, and apps, and to share among fellow researchers and family members.

Developed by The Church of Jesus Christ of Latter-day Saints, it is unique in today’s computer world. It is both proprietary and not. It’s referred to as “open de facto,” which doesn’t mean it’s open source, just open as a standard.

GEDCOM was developed for the public in 1984 and the latest update was in 1996 with version 5.5, and we’ve been locked into its limits ever since.

A standard was needed as genealogy software began to develop. Databases needed to be shared through exports and imports, thus a standard was developed with the Mormon Church taking the lead as they represented the largest demographic of users and consumers at the time.

As demand grew for flexible and powerful genealogy software, people also wanted more data stored and transferred between programs: religion, jobs, multiple spouses, stepchildren, adoptions, foreign-language characters, more personal and professional events, and other historical data. Much of this is either not represented in the current GEDCOM file structure or fits it about as well as a square peg fits a round hole.

Like many people, I assumed GEDCOM would be an evolving file format, like so much of today’s technology. It isn’t. Many attempts have been made to add new features and improve it, but nothing appears to have happened with it since 2001.

So I went digging.

What’s Happened to GEDCOM Standards?

In general, people are saying that the church and GEDCOM supporters have lost interest and that the church has stepped down from GEDCOM development completely, leaving the format officially unattended and unsupported. My research bears this out. A variety of groups have stepped forward to try to revise and update GEDCOM, but while proposals continue to be announced, no one group has stepped into the empty shoes left behind by the church.

In The Problem with GEDCOM, the podcast hosts talked about how GEDCOM works, its limitations, and who’s working on improving it.

There are several groups working towards improving GEDCOM. Better GEDCOM (http://bettergedcom.wikispaces.com/) is an independent community working to build a better GEDCOM file specification that serves 21st century genealogists. Their website is actually a knowledgebase using the wiki format, along with discussions, to try and develop an agreed upon standard for the GEDCOM format. BetterGEDCOM is a great place to get the latest news on the issues involved with the GEDCOM file format and what the genealogy community is trying to do to fix the problems.

The International OpenGen Alliance (http://www.opengen.org/) is another community group made up of volunteers trying to develop a single standard for genealogy data exchange. OpenGen uses a Basecamp platform which requires a membership – the platform provides a space for discussions, file uploads and other tools to collaborate with others on the issues involved with improving the GEDCOM standard.

Genealogy programs have added their own file data formats and found ways to import and export between their various programs, and the GEDCOM file format, while still acceptable, is basically ignored and left locked in at version 5.5 as the “official” version. So genealogy developers run off on their merry way finding new ways to work around this.

I did more research and found some sites working on improving the GEDCOM data model. Unfortunately, most of these are walled-garden sites with little or no evidence of recent activity. They might be viable and highly active groups, but I found little to show for it. This could say more about their web development and design choices than their organizational structure, but the family history fan club is hunting for activity and finding little visible.

Build a Better GEDCOM is a wiki site to coordinate efforts and volunteers to improve the file standard. According to the site, the group meets weekly, but there is no blog or information on updates or news. The most recent discussion is Model in the Future Directions Document from January 2011, which shows at least some activity.

International OpenGen Alliance is a membership organization that features semi-monthly meetings on creating an OpenGen data model. The last meeting was in March 2011 with no news on the next one. There’s a small note at the top that says the last update was February of this year. It does have some interesting recommendations for their Core Data Model for the OpenGen project, giving you a good visual flow of how all the genealogy data can come together, but I’d love to see more updated information and activity to know where this is going.

In Is GEDCOM Dead?, an article originally published in Genealogical Computing and reprinted all over the web on genealogy sites, the author, Beau Sharbrough, looks towards a future that embraces XML formats, which is where GEDCOM data models are indeed headed, a form of micro-formats that create an HTML markup language feel to the data structure and model. While published in 2001, many still refer to it as a landmark article on the past and future of GEDCOM.

GEDCOM Explained, by Dick Eastman, highlights one of the major problems that led to the creation of GEDCOM and offers it as a reason why it’s so hard to create a format that can keep up with the rapid pace of software development.

You need to be aware that the creation of the GEDCOM standard was not a perfect implementation. For one thing, not all the data fields are specified precisely in the GEDCOM specifications. Next, not all the programmers of the various genealogy programs interpreted the specifications in exactly the same manner. For instance, your present genealogy program might be perfectly happy with a birth date listed as, “after 1847 but before 1852.” However, once that information is exported in a GEDCOM file and then imported into a different program, the birth date may say something else. Typically, it is simply left blank.

Another problem is that not all genealogy programs have the same ideas about databases. One program may have only one field for “occupation,” assuming that every person on the face of the earth never, ever changed careers. Another genealogy program may have the ability to record multiple occupations during the person’s lifetime. When transferring data via GEDCOM from the more powerful program to the simpler one, some of these occupations will be lost. These are a couple of simple examples; you can find numerous other inconsistencies when moving data between dissimilar programs.
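To make Eastman's two examples concrete, here is a minimal Python sketch of how that kind of loss can happen. It is an illustration only, not how any real genealogy program works. The sample record, the person "John Smith," and the function names `parse_lines`, `import_rich`, and `import_simple` are all made up for this example. The only parts taken from GEDCOM itself are the line layout (a level number, a tag, an optional value), the `BET ... AND ...` form for a date range, and the `OCCU` tag for an occupation.

```python
# Hypothetical GEDCOM 5.5-style fragment, written for this illustration only.
SAMPLE = """\
0 @I1@ INDI
1 NAME John /Smith/
1 BIRT
2 DATE BET 1847 AND 1852
1 OCCU Farmer
1 OCCU Blacksmith
"""


def parse_lines(text):
    """Split each line into (level, tag, value). Record IDs like @I1@ are skipped."""
    records = []
    for raw in text.splitlines():
        parts = raw.split(" ", 2)
        level = int(parts[0])
        if parts[1].startswith("@"):
            # Lines such as "0 @I1@ INDI" carry an ID first, then the tag.
            tag, value = parts[2], ""
        else:
            tag = parts[1]
            value = parts[2] if len(parts) > 2 else ""
        records.append((level, tag, value))
    return records


def import_rich(records):
    """Imaginary importer that keeps the date phrase and every occupation."""
    person = {"birth_date": "", "occupations": []}
    for _level, tag, value in records:
        if tag == "DATE":
            person["birth_date"] = value
        elif tag == "OCCU":
            person["occupations"].append(value)
    return person


def import_simple(records):
    """Imaginary importer with one occupation field that accepts only a bare year."""
    person = {"birth_date": "", "occupation": ""}
    for _level, tag, value in records:
        if tag == "DATE" and value.isdigit():
            person["birth_date"] = value
        elif tag == "OCCU" and not person["occupation"]:
            person["occupation"] = value
    return person


records = parse_lines(SAMPLE)
print(import_rich(records))
print(import_simple(records))
```

Both importers read the same file. The first keeps the date range and both occupations. The second leaves the birth date blank and keeps only the first occupation, which is the kind of quiet loss Eastman describes.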

At the end of the 2002 article, Eastman, a legend and genealogy master, commented on an article he wrote about the XML Version of GEDCOM, saying that he really didn’t think that the next version of GEDCOM would be approved soon, and it still lies dormant.

Sue Adams found the “Is GEDCOM Dead?” article recently and was also frustrated with the lack of development, which speaks loudly as to what most of us family history folks are thinking.

I commend the publication of the article because raising the issue with the genealogy public is long overdue. I believe that the antiquated GEDCOM is holding back the Genealogy world, as it is the underlying cause of data transfer and other problems. An analysis of the search terms recorded for this blog suggests that 30% of queries are about this and related issues, which if representative, should be a big wake-up call for commercial developers of genealogy software.

I think Sue speaks for me, as well.

While GEDCOM started out as a proprietary format, the genealogy community has taken it on. We need one of these groups, or someone new, to take the reins as the visible proponent of this project, keeping us up-to-date, informed, and possibly consulted, or at least invited to help in some way, even if it is only by spreading the word. Transparency is what I’m asking for. Change has to come and we want to know what’s coming.

GEDCOM History

My research turned up some interesting articles and commentaries on the history of the GEDCOM standard.

About Lorelle VanFossen

Lorelle VanFossen hosts Family History Blog, covering her ancestors and related family members. She is one of the top bloggers in the world and host of Lorelle on WordPress, providing WordPress and blogging tips for bloggers of all levels. A popular keynote speaker and trainer, she is also editor, producer, contributor, and official disruptive thinker for Bitwire Media, which includes WordCast, Making My Life Network, Stories of Our Journeys, Life on the Road, WordCast Conversations, and the very popular WordCast Podcast.

3 Responses to GEDCOM Standards: Is an Update Coming?

  1. Mark Sherwood says:

    Lorelle,

    If you want to learn about GEDCOM, there’s really no better source than Tamura Jones’s blog.
    She’s been writing about it for years.
    Start with her fantastic overview of GEDCOM Alternatives
    While you’re there, check out her recent articles on AncestorSync.
    Interesting stuff!

    Happy family tree climbing.
    Mark.

    • Thank you for the reference. Her work is definitely the kind of thing that helps the rest of us understand what’s going on. I searched for several days to find such a reference. Wonder why this never turned up. Thanks!
