Last Update: Fri Aug 02 11:34 EDT 2013
Initial Version: August 2013
I posted this content to the RootsWeb Legacy mailing list (legacy@rootsweb.com)
in July-August 2013. I think it is worth placing here for more eyes and minds
to discover.
July 31, 2013
The elegant, proper solution to the mess that program vendors deliver
in parsing place names is not difficult.
Form a committee responsible for defining what variables are needed
to describe addresses around the world (even more general than simple
place names).
e.g. StreetNo Street City County State ZipCode ZipCode+4
will be a US starting set.
Add to it all the fields that are needed in GB, Europe, etc.
Vet that list with users until it becomes stable, that is, until
users can no longer think of a field that is needed but not yet
represented.
Then the program vendors need to define each of these fields as
unique variables in their internal database, with ways of seeing
precisely which field you are entering each piece of data into.
Then provide an unlimited number of address data masks that are
user definable. That is, for a USAddress mask, check the boxes
for the fields in the example above.
But allow the user to define UKAddress, UKAddress2, EuropeAddress1,
PortugalAddress, etc. As many as you like.
Then everywhere an address appears in your data, the user has a
choice to make, the choice of mask that that address uses.
This will allow any user to *properly* and uniquely represent
any address in any style they please, though they would most likely
choose the style for where the address is in the world. US ones for US
addresses. UK ones for UK addresses.
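The field list, the user-defined masks, and the per-address choice of
mask can be sketched roughly as follows. This is only a minimal
illustration in Python, not any vendor's design: the class names, the
Postcode field, and the sample values are assumptions made up for the
example.

```python
from dataclasses import dataclass, field
from enum import Enum


class AddressField(Enum):
    """Hypothetical starting set of address variables (the US example)."""
    STREET_NO = "StreetNo"
    STREET = "Street"
    CITY = "City"
    COUNTY = "County"
    STATE = "State"
    ZIP_CODE = "ZipCode"
    ZIP_PLUS_4 = "ZipCode+4"
    POSTCODE = "Postcode"  # assumed UK addition, for illustration only


@dataclass(frozen=True)
class AddressMask:
    """A user-defined mask: a name plus the fields checked for it."""
    name: str
    fields: tuple


@dataclass
class Address:
    """An address records which mask it uses and one value per field."""
    mask: AddressMask
    values: dict = field(default_factory=dict)

    def set_value(self, variable, value):
        # Only fields checked in the chosen mask may be filled in.
        if variable not in self.mask.fields:
            raise ValueError(
                f"{variable.value} is not part of mask {self.mask.name}")
        self.values[variable] = value


US_ADDRESS = AddressMask("USAddress", (
    AddressField.STREET_NO, AddressField.STREET, AddressField.CITY,
    AddressField.COUNTY, AddressField.STATE, AddressField.ZIP_CODE,
    AddressField.ZIP_PLUS_4,
))
UK_ADDRESS = AddressMask("UKAddress", (
    AddressField.STREET_NO, AddressField.STREET, AddressField.CITY,
    AddressField.COUNTY, AddressField.POSTCODE,
))

# Sample data only: the user picks the mask, then fills named fields.
home = Address(US_ADDRESS)
home.set_value(AddressField.CITY, "Springfield")
home.set_value(AddressField.COUNTY, "Sangamon")
home.set_value(AddressField.STATE, "Illinois")
```

Because every value is stored under a named field, there is never any
doubt whether "Sangamon" is a city or a county.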
When you pop up a window containing an address for the person
(or whatever you are displaying) you will see the mask style
used for the address, and only the fields populated in that
mask.
Never again will you wonder if what you are looking at is City,
or County, or whatever. Each piece will uniquely go where it
belongs in the data, making searches on counties, etc., trivial.
And the display may lump StreetNo Street City County State, etc.
together in a line, but there must also be an option to make
the display label each piece so you don't go insane, as now,
trying to differentiate city and county, etc.
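The two display options and the county search described above might
look something like this. It is a hedged sketch in Python; the mask,
the function names, and the sample addresses are all invented for
illustration.

```python
# Each address is stored as {variable name: value}. The mask lists the
# variables to show, in display order. All names and data are made up.
US_MASK = ["StreetNo", "Street", "City", "County", "State"]

addresses = [
    {"StreetNo": "12", "Street": "Main St", "City": "Springfield",
     "County": "Sangamon", "State": "Illinois"},
    {"City": "Lincoln", "County": "Logan", "State": "Illinois"},
]


def one_line(address, mask):
    """Lump the populated fields together on a single line."""
    return " ".join(address[name] for name in mask if name in address)


def labeled(address, mask):
    """Show each populated field with its label, one per line."""
    return "\n".join(
        f"{name}: {address[name]}" for name in mask if name in address)


def in_county(records, county):
    """A county search is an exact match on one field, with no parsing."""
    return [a for a in records if a.get("County") == county]


print(one_line(addresses[0], US_MASK))
print(labeled(addresses[1], US_MASK))
print(in_county(addresses, "Logan"))
```

Only the populated fields are shown in either display, and the search
never has to guess which part of a place string is the county.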
Make the general list. Make the trivial programming changes.
This is simply good programming, and I am astonished that no
program vendor has taken this step!
Problem solved once and for all.
Unless address styles change, but then just tweak the code. The
algorithm will already be general enough to make that a trivial
modification.
I have no illusions any program vendor will actually do this.
It makes too much sense!
John H. Yates
P.S. This concept also will solve the GEDCOM problem (which is
the loss of data when exchanged between programs). See:
http://jytangledweb.org/genealogy/model/
which is an overly complicated description of this in general
terms. But it boils down to a very simple and achievable
concept.
John H. Yates
August 1, 2013
Here I give a simple real-world scientific example of my solution that
may be easier for people to understand.
The Periodic Table of the Elements. This is equivalent to the full set
of address variables that I propose be developed (and extensible to
ALL relevant genealogical variables: names, source reference variables,
etc.).
Worldwide committees of chemists developed this unique list over
centuries. Only 103 elements were known when I was young. About 118 are
known today.
This table has been indispensable to the multinational pursuit of
chemistry. It is the lexicon of reality that all chemists agree upon.
From it one can write EVERY possible chemical formula for compounds and
chemical reactions. If one can't, it means an element is missing and
needs to be placed in the chart, after which the formula or reaction can
be written from symbols in the chart. This is science.
What I am proposing is that a list of genealogy variables be developed
that satisfies every (serious and intelligent = scientific) genealogist.
They can have different names, just as chemical elements have different
names in different countries. But what they represent is a unique thing.
If any variable is missing, add it. Soon you will have the periodic
table of genealogy variables that can build the genealogies for people
in all countries. And with an agreed upon lexicon of variables, it
will be trivially possible to share that information between programs.
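As a rough illustration of sharing under an agreed lexicon, the sketch
below exports a record keyed by variable name and reads it back without
loss. It is written in Python under loud assumptions: the LEXICON set,
the JSON format, and the function names are hypothetical stand-ins, not
a proposed exchange standard.

```python
import json

# Hypothetical agreed lexicon; a real one would come from the committee.
LEXICON = {"Given1", "Given2", "Surname",
           "StreetNo", "Street", "City", "County", "State"}


def check(record):
    """Reject any variable that the lexicon does not define yet."""
    unknown = set(record) - LEXICON
    if unknown:
        raise KeyError(f"not in lexicon, add it first: {sorted(unknown)}")


def export_record(record):
    """Serialize a record keyed by agreed variable names."""
    check(record)
    return json.dumps(record, sort_keys=True)


def import_record(text):
    """Read a record written by any program using the same lexicon."""
    record = json.loads(text)
    check(record)
    return record


# Sample data only.
original = {"Given1": "Ana", "Surname": "Lopez",
            "City": "Lincoln", "County": "Logan"}
round_tripped = import_record(export_record(original))
print(round_tripped == original)
```

A variable missing from the lexicon causes a refusal, which mirrors
the rule "if any variable is missing, add it".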
This is the way out of the genealogy alchemy world of programs that
we live in today. Take my advice or leave it. But when it gets solved,
it will fit this model. Guaranteed. Delay if you wish, or make progress
in the direction required. That is the choice of those writing genealogy
programs and the supposed gurus that try to advance the field. I
am neither. I am simply a scientist analyzing the properties of the
solution, which seems to escape everyone currently in either of those
categories (program writers and gurus).
John H. Yates
August 2, 2013
I have now decided to call this committee-determined genealogical
variable list
The Periodic Table of Genealogical Variables.
The variables are analogous to the elements in the periodic table of
chemistry. The elements are the fundamental building blocks of
chemistry notation. The genealogical variables are the fundamental
building blocks of genealogical notation.
And thus its addition to the title of this page.
[The rest of this message was in response to a question about whether
street numbers could be formatted after the street name, as apparently
in Spain].
Once this is established [The Periodic Table of Genealogical Variables],
formatting the output of the marked-up variables is just
window-dressing programming.
When defining an address mask by selecting (checking) variables, the
order in the output can also be specified, by dragging or by selecting
a variable and using up/down arrows. The same goes for Name variable
output masks (e.g. Given1, Given2, Surname, ...), and Source Reference
output masks built from the Periodic Table Variables (e.g. [AUTHOR],
[PUBLISHER], [TITLE], [DATE], ...). Even evidence analysis variables
need to be added to the Periodic Table of Genealogical Variables,
eventually, if not at first. And ample notes field variables must be
defined, of course.
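Output ordering as part of the mask can be sketched in a few lines.
This is a minimal Python illustration, assuming a mask is simply an
ordered list of variable names; the render function, the mask names,
and the sample data are invented for the example.

```python
def render(values, mask, separator=" "):
    """Join the populated variables in the order the mask lists them."""
    return separator.join(values[name] for name in mask if values.get(name))


# Sample data only; the stored variables never change, only the mask does.
address = {"StreetNo": "25", "Street": "Calle Mayor", "City": "Madrid"}
US_ORDER = ["StreetNo", "Street", "City"]
SPAIN_ORDER = ["Street", "StreetNo", "City"]  # number after street name

print(render(address, US_ORDER))
print(render(address, SPAIN_ORDER))

# The same function serves Name output masks.
name = {"Given1": "Ana", "Given2": "Maria", "Surname": "Lopez"}
print(render(name, ["Given1", "Given2", "Surname"]))
print(render(name, ["Surname", "Given1", "Given2"], separator=", "))
```

Reordering a mask changes only the output. The stored data is
untouched, which is why the formatting is window dressing once the
variables are defined.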
Basically, if it is information that needs to go into your genealogy
research, there must be a variable defined to hold the information.
I urge readers not to underestimate the power of this simple model in
advancing the field of genealogical programming. And genealogists armed
with better tools will have more time to focus on more and more
complicated problems. Much like mathematicians armed with
Mathematica.
Let the computer do the simple stuff, and keep
raising the bar of what is simple stuff.
John H. Yates