I did not recognise many of the names on my DNA match list, some of whom shared quite significant amounts of DNA with me. Many matches have no family tree linked to their results, which makes it extremely difficult to work out any connection, although this has become easier with the cluster analysis available through MyHeritage. For those matches that do have a family tree attached, Ancestry attempts to predict the relationship based on shared ancestors in our trees with “ThruLines”. Whilst some of these predictions are not valid, many do turn out to be correct, and by adding the additional branches to my tree, I was able to confirm many of them.
MyHeritage’s AutoClusters tool is a great way of grouping your DNA matches – in this way, matches with no associated tree can be grouped with other relatives with whom they share DNA, and this gets us one step closer to identifying a most recent common ancestor (MRCA).
Before DNA, I often used to ask myself … ‘why am I following this side branch … I’m not even sure these people are related to me anymore …’ as I continued down yet another rabbit hole. But now, some of these additional journeys are paying off, helping to identify how my DNA matches are related to me and to establish our most recent common ancestor (MRCA).
My strategy nowadays whenever I think I may be heading down a branch of research (usually because I’ve found something interesting about them in an old newspaper report or other document) is to ask myself – are these people likely to share any DNA with me? …. If that answer is ‘yes’, then I go ahead with further research down that branch, as you never know – one day it may be the key that breaks down some of the brick walls in my tree.
Network analysis with Gephi
What’s next?
This 52 Ancestors in 52 Weeks theme of ‘Branching Out’ made me think of all the tools we currently have to explore DNA matches and identify our common ancestors. Whilst there are great tools available through Ancestry, MyHeritage and other companies, there are also lots of individuals with great ideas on how to explore our DNA match data further. I’ve seen a few posts from people using a graph tool called Gephi, which is used to analyse and visualise networks, and it has some incredibly powerful features which are worth exploring. In future, I’d love to try to analyse the DNA matches of my matches (this data can be downloaded from some testing companies and GEDmatch), but as part of this week’s challenge, I thought I would try a very simple analysis of surnames in my WatsonRoots tree. I currently have 3552 people in this tree, and 857 family groups. From this list of 857 family groups, I created a list of surname pairs (husband’s and wife’s birth surnames), and created a simple graph that shows how surnames are connected to each other. The red dots each represent a single surname, and get larger the more connections to other surnames they have. It’s a very crude analysis, but it’s cool to watch how the algorithm sorts out all the names and connections, and it shows what a powerful tool this could be to support DNA match analysis. Definitely something to explore in future.
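For anyone wanting to try something similar, here is a minimal Python sketch of the surname-pair step. It is not the exact process I followed, just one way it could be done. It assumes the family groups have already been exported as (husband surname, wife birth surname) pairs; the pairs below are made up for illustration, and the function names are my own. The output uses Source, Target and Weight column headers, which is the layout I understand Gephi accepts when importing an edge list from a spreadsheet.

```python
import csv
import io
from collections import Counter


def surname_edges(family_groups):
    """Count how many family groups link each pair of surnames."""
    edges = Counter()
    for husband, wife in family_groups:
        husband, wife = husband.strip(), wife.strip()
        if not husband or not wife:
            continue  # skip couples where one surname is unknown
        # Sort the pair so (A, B) and (B, A) count as the same connection.
        edges[tuple(sorted((husband, wife)))] += 1
    return edges


def connection_counts(edges):
    """Number of distinct other surnames each surname is linked to."""
    counts = Counter()
    for a, b in edges:
        if a != b:
            counts[a] += 1
            counts[b] += 1
    return counts


def edges_to_csv(edges):
    """Write the edge list as CSV text with Source, Target, Weight columns."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (a, b), weight in sorted(edges.items()):
        writer.writerow([a, b, weight])
    return buf.getvalue()


# Hypothetical family groups, for illustration only.
families = [
    ("Watson", "Stevens"),
    ("Stevens", "Hill"),
    ("Stevens", "Watson"),
    ("Hill", "Carter"),
    ("Watson", ""),  # wife's birth surname unknown
]

edges = surname_edges(families)
counts = connection_counts(edges)
edge_csv = edges_to_csv(edges)
```

The `counts` value plays the same role as the dot sizes in the graph: a surname linked to more other surnames gets a higher number. The CSV text in `edge_csv` can be saved to a file and loaded into a graph tool for layout.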
Surname connections explored
The video below shows how Gephi figures out the most efficient way to represent the connections between the different surnames in my tree. Clearly I have a lot of research on the Stevens family (my maternal great-grandmother’s family) in my tree!