The Periodic Table of Genealogical Variables
The Genealogical GEDCOM Problem Solution

http://jytangledweb.org/genealogy/GEDCOM/

John H. Yates

Last Update: Fri Aug 02 11:34 EDT 2013
Initial Version: August 2013


I posted this content to the rootsweb Legacy mailing list (legacy@rootsweb.com) in July-August 2013. I think it is worth placing here for more eyes and minds to discover.
July 31, 2013
The elegant, proper solution to the mess that program vendors deliver in parsing place names is not difficult.
Form a committee responsible for defining what variables are needed to describe addresses around the world (even more general than simple place names).
For example, StreetNo, Street, City, County, State, ZipCode, and ZipCode+4 would be a starting set for the US.
Add to it all the fields that are needed in GB, Europe, etc.
Vet that list with users until the list becomes stable, that is, until users can no longer think of a field that is needed but not yet represented.
Then the program vendors need to define each of these fields as unique variables in their internal database, with ways of seeing precisely which field you are entering each piece of data into.
Then provide an unlimited number of user-definable address masks. That is, for a USAddress mask, check the boxes for the fields in the US starting set above.
But allow the user to define UKAddress, UKAddress2, EuropeAddress1, PortugalAddress, etc. As many as they like.
Then everywhere an address appears in your data, the user has one choice to make: which mask that address uses.
This will allow any user to *properly* and uniquely represent any address in any style they please, though they would most likely choose the style for where the address is in the world. US ones for US addresses. UK ones for UK addresses.
When you pop up a window containing an address for the person (or whatever you are displaying) you will see the mask style used for the address, and only the fields populated in that mask.
Never again will you wonder if what you are looking at is City, or County, or whatever. Each piece will uniquely go where it belongs in the data, making searches on counties, etc., trivial. And the display may lump StreetNo Street City County State, etc. together in a line, but there must also be an option to make the display label each piece so you don't go insane, as now, trying to differentiate city and county, etc.
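To make the mask idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not how any existing genealogy program stores addresses. The names ADDRESS_VARIABLES, MASKS, Address and find_by are invented for this example, and the UK field names (PostTown, Postcode) are guesses at what a committee might settle on.

```python
from dataclasses import dataclass, field

# Hypothetical slice of the agreed variable list. The UK names
# (PostTown, Postcode) are illustrative guesses, not from the proposal.
ADDRESS_VARIABLES = {
    "StreetNo", "Street", "City", "County", "State",
    "ZipCode", "ZipCodePlus4", "PostTown", "Postcode",
}

# User-definable masks: each one is a selection of variables from the list.
MASKS = {
    "USAddress": ["StreetNo", "Street", "City", "County", "State",
                  "ZipCode", "ZipCodePlus4"],
    "UKAddress": ["StreetNo", "Street", "PostTown", "County", "Postcode"],
}

# A mask may only use variables that exist in the agreed list.
for mask_name, mask_fields in MASKS.items():
    assert set(mask_fields) <= ADDRESS_VARIABLES, mask_name


@dataclass
class Address:
    mask: str                                   # which mask this address uses
    values: dict = field(default_factory=dict)  # variable name -> entered text

    def __post_init__(self):
        # Reject data entered into a field the chosen mask does not have.
        unknown = set(self.values) - set(MASKS[self.mask])
        if unknown:
            raise ValueError(f"{sorted(unknown)} not in mask {self.mask}")

    def labeled(self):
        """Populated fields only, each paired with its variable name."""
        return [(name, self.values[name])
                for name in MASKS[self.mask] if name in self.values]

    def one_line(self):
        """The lumped-together display, in mask order."""
        return " ".join(value for _, value in self.labeled())


def find_by(addresses, variable, value):
    """Search on one variable, e.g. every address in a given county."""
    return [a for a in addresses if a.values.get(variable) == value]


home = Address("USAddress", {"StreetNo": "12", "Street": "Main St",
                             "City": "Albany", "County": "Albany",
                             "State": "NY"})
print(home.one_line())
for name, value in home.labeled():
    print(f"{name}: {value}")
```

The lumped line shows "Albany" twice with no hint of which is the city and which is the county. The labeled display removes the doubt, and find_by(addresses, "County", "Albany") searches the county field alone.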
Make the general list. Make the trivial programming changes. This is simply good programming, and I am astonished that no program vendor has taken this step!
Problem solved once and for all.
Unless address styles change, but then just tweak the code. The algorithm will already be general enough to make that modification trivial.
I have no illusions any program vendor will actually do this. It makes too much sense!
John H. Yates
P.S. This concept also will solve the GEDCOM problem (which is the loss of data when exchanged between programs). See: http://jytangledweb.org/genealogy/model/ which is an overly complicated description of this in general terms. But it boils down to a very simple and achievable concept.
John H. Yates
August 1, 2013
Here I give a simple, real-world scientific example of my solution that may be easier for people to understand.
The Periodic Table of the Elements. This is equivalent to the full set of address variables that I propose be developed (and extensible to ALL relevant genealogical variables: names, source reference variables, etc.).
Worldwide committees of chemists over centuries developed this unique list. There were only 103 elements known when I was young. About 118 are known today.
This table has been indispensable to the multinational pursuit of chemistry. It is the lexicon of reality that all chemists agree upon. From it one can write EVERY possible chemical formula for compounds and chemical reactions. If one can't, it will mean an element is missing and needs to be placed in the chart, after which the formula or reaction can be written from symbols in the chart. This is science.
What I am proposing is that a list of genealogy variables be developed that satisfies every (serious and intelligent = scientific) genealogist. They can have different names, just as chemical elements have different names in different countries. But what they represent is a unique thing. If any variable is missing, add it. Soon you will have the periodic table of genealogy variables that can build the genealogies for people in all countries. And with an agreed upon lexicon of variables, it will be trivially possible to share that information between programs.
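A small Python sketch may help show why an agreed lexicon makes exchange between programs simple. Everything in it is hypothetical: the identifiers (GV_GIVEN1, GV_SURNAME, GV_COUNTY), the two programs and their local labels are invented for illustration. The only point is that each program may name a variable as it likes, provided every local name maps to exactly one agreed variable.

```python
# Hypothetical entries from the agreed variable list, with the local
# names two imaginary programs show to their users.
PROGRAM_A_LABELS = {
    "GV_GIVEN1": "First Name",
    "GV_SURNAME": "Last Name",
    "GV_COUNTY": "County",
}
PROGRAM_B_LABELS = {
    "GV_GIVEN1": "Vorname",
    "GV_SURNAME": "Nachname",
    "GV_COUNTY": "Landkreis",
}


def export_record(record, labels):
    """Translate a record keyed by local labels into agreed variable IDs."""
    to_agreed = {label: var for var, label in labels.items()}
    return {to_agreed[label]: value for label, value in record.items()}


def import_record(agreed, labels):
    """Translate a record keyed by agreed variable IDs into local labels."""
    missing = set(agreed) - set(labels)
    if missing:
        # A missing variable is a gap in the receiving program's table,
        # so report it instead of silently dropping the data.
        raise KeyError(f"program does not define: {sorted(missing)}")
    return {labels[var]: value for var, value in agreed.items()}


record_a = {"First Name": "John", "Last Name": "Yates", "County": "Albany"}
shared = export_record(record_a, PROGRAM_A_LABELS)
record_b = import_record(shared, PROGRAM_B_LABELS)
print(shared)
print(record_b)
```

Exporting from the second program and importing back into the first returns the original record unchanged, because no piece of data ever loses track of which variable it belongs to.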
This is the way out of the genealogy alchemy world of programs that we live in today. Take my advice or leave it. But when it gets solved, it will fit this model. Guaranteed. Delay if you wish, or make progress in the direction required. That is the choice of those writing genealogy programs and the supposed gurus who try to advance the field. I am neither. I am simply a scientist analyzing the properties of the solution, which seems to escape everyone currently in either of those categories (program writers and gurus).
John H. Yates
August 2, 2013
I have now decided to call this committee-determined genealogical variable list The Periodic Table of Genealogical Variables. The variables are analogous to the elements in the periodic table of chemistry. The elements are the fundamental building blocks of chemistry notation. The genealogical variables are the fundamental building blocks of genealogical notation. Hence its addition to the title of this page.
[The rest of this message was in response to a question about whether street numbers could be formatted after the street name, as apparently in Spain].
Once this is established [The Periodic Table of Genealogical Variables], formatting the output of the marked-up variables is just window-dressing programming.
When defining an address mask by selecting (checking) variables, the order in the output can also be specified, by dragging or by selecting a variable and using up/down arrows. The same applies to Name variable output masks (e.g. Given1, Given2, Surname, ...) and to Source Reference output masks built from the Periodic Table Variables (e.g. [AUTHOR], [PUBLISHER], [TITLE], [DATE], ...). Even evidence analysis variables need to be added to the Periodic Table of Genealogical Variables, eventually if not at first. And ample notes field variables must be defined, of course. Basically, if it is information that needs to go into your genealogy research, there must be a variable defined to hold the information.
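As a rough Python sketch of the ordering idea: if an output mask is simply an ordered list of variable names, then putting the street number after the street name is a reordering of that list, and the stored data never changes. The function names render and move_up and the sample values are made up for this example. move_up stands in for one press of an "up arrow" button in a mask editor.

```python
def render(mask, values, sep=" "):
    """Join the populated variables in the order the mask lists them."""
    return sep.join(values[name] for name in mask if name in values)


def move_up(mask, name):
    """Return a copy of the mask with one variable moved one place earlier."""
    reordered = list(mask)
    i = reordered.index(name)
    if i > 0:
        reordered[i - 1], reordered[i] = reordered[i], reordered[i - 1]
    return reordered


# The data is stored once, by variable. Only the output mask differs.
address = {"StreetNo": "25", "Street": "Calle Mayor", "City": "Madrid"}

number_first = ["StreetNo", "Street", "City"]
street_first = move_up(number_first, "Street")

print(render(number_first, address))
print(render(street_first, address))

# The same mechanism serves a source reference output mask.
source = {"AUTHOR": "A. Author", "TITLE": "A Title", "DATE": "1900"}
citation_mask = ["AUTHOR", "TITLE", "PUBLISHER", "DATE"]
print(render(citation_mask, source, sep=", "))
```

The first address line comes out number first and the second comes out street first, from the same stored values. The citation omits PUBLISHER because that variable was never filled in.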

I urge readers not to underestimate the power of this simple model in advancing the field of genealogical programming. And genealogists armed with better tools will have more time to focus on more and more complicated problems. Much like mathematicians armed with Mathematica. Let the computer do the simple stuff, and keep raising the bar of what is simple stuff.
John H. Yates