Albemarle History Mapping

Exploring local history with GIS and mapping

Exploring the Albemarle Emigration Web App

In my last article I discussed using semi-structured textual data from a book called “Albemarle County In Virginia” by Edgar Woods to build an interactive web app showing emigration from Albemarle County.  Now I would like to spend some time exploring the app to draw conclusions and form some questions that require additional research.

Before diving in there are a few caveats and limitations that should be stated up front:

  • Woods does not provide any sources for his work, so we have to take his word about who emigrated and where they went.  We also don’t know how complete his list is.
  • The list mostly includes only individuals or couples.  In a few instances Woods identifies a “family”.  Therefore these numbers are likely only approximate and may not include children or other extended family.
  • Finally, Woods does not include any dates.  I spot checked a small number of individuals using the rest of Woods’s book and other records.  The earliest date I found was 1786 and the latest was 1887.   The lack of dates is unfortunate because it makes it impossible to visualize any trends over time, or to relate any specific movements to major events.

Despite these limitations we can still draw some interesting conclusions from the data.  The first thing that’s obvious is the concentration of emigrants in Kentucky.   Woods himself mentions this on page 55 of his book.

Using the Chart widget to look at the distribution of emigrants across all states helps further illustrate the popularity of Kentucky as a destination:

webapp_ss_3

The next two states with the largest number of Albemarle emigrants, Tennessee (103) and Missouri (70), can’t compare with the 291 emigrants who moved to Kentucky.

Overall, the data shows a clear emphasis on movement to the west.  There are some indications of movement to the South, with clusters in North Carolina and Georgia, but in nowhere near the numbers who moved west.  This article does a great job of explaining the many social and cultural factors that drew settlers west, particularly to Kentucky.  The data Woods provides clearly illustrates this trend.

Comparing the data to the 1822 basemap is informative.  If you plot the emigrant destinations on a current map of the United States it doesn’t look very impressive and raises questions.  Why didn’t anybody move to Iowa?  What about Nebraska, Kansas, and Oklahoma?

compare1

But plotting the data against the 1822 basemap makes it clear that these pioneers actually pushed to the farthest reaches of the (then) United States.  Going any further would have required venturing into the Missouri or Northwest Territories.  It’s also possible that this tells us something about how Woods put his list together.  He may have chosen to exclude emigrants to the Territories because they were not officially states at the time, or because their destinations could not be located with enough specificity.

compare2


Some of the spatial statistics tools in ArcGIS can also help us understand and summarize this data.  I used the “Linear Directional Mean” tool to calculate the average distance and direction of all the flow lines connecting Albemarle with emigrant destinations.  This is a quick and easy way to summarize the flow lines, keeping in mind that it only accounts for straight-line distances and not the actual routes the emigrants would have taken.  Represented by the black line in the screenshot below, it shows that the average straight-line journey was 413 miles in a southwest direction, ending in Kentucky.  This is also generally in the direction of the Cumberland Gap (red icon below), the main crossing point through the Appalachian Mountains for settlers heading west from Virginia.

webmap_ss_4
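If you want to script this step, the tool lives in arcpy’s Spatial Statistics toolbox.  A minimal sketch, assuming a hypothetical feature class of flow lines:

import arcpy

#Summarize all the flow lines into a single mean direction/length line.
#"DIRECTION" respects the direction each line was drawn in
#(Albemarle to destination) rather than treating lines as undirected.
arcpy.stats.DirectionalMean("flow_lines", "flow_lines_mean", "DIRECTION")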


The “Directional Distribution” tool, which calculates a standard deviational ellipse, is another useful option.  It helps summarize a dataset by showing how points are dispersed, their average spatial center, and their general orientation.  In this case, I ran an ellipse showing one standard deviation (below in orange), meaning the ellipse encompasses points within one standard deviation of the mean center of the dataset.  The center of the ellipse, over Kentucky, indicates the average spatial center of all listed emigrant locations.  The ellipse’s orientation highlights that emigrant destinations run from southeast to northwest.  This is another way of illustrating the popularity of western destinations for Albemarle emigrants, as opposed to destinations north or south of Albemarle.

webmap_ss_5
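This tool can be scripted the same way.  A minimal sketch with hypothetical layer names; weighting by the “Total” field is optional and makes larger destinations pull the ellipse toward them:

import arcpy

#One-standard-deviation ellipse around the destination points,
#weighted by the number of emigrants at each point
arcpy.stats.DirectionalDistribution("destinations", "destinations_ellipse",
                                    "1_STANDARD_DEVIATION", "Total")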

Viewing the data spatially also presents some questions.  For instance, there are no points located in South Carolina, despite the fact that emigration occurred to North Carolina and Georgia.  Is there a good historical reason for this?  More curiously, there are no data points in any northern states.  Is this true, or does it suggest an anti-northern bias in the way Woods, writing only 35 years after the Civil War, compiled his data?

Another oddity is noticeable when looking at residents who emigrated to Illinois.  Out of 26 emigrants to the Prairie State, only four of them (two couples) had a destination county listed.  The other 22 are unlocated and are all single men.  Is this coincidence?  Was there something about Illinois that encouraged single men, and not couples or families, to journey there?  Or does the fact that their destination is unknown imply that Woods had less information about these individuals, and therefore may not have known who they traveled with?


Answering these questions is beyond the scope of this article.  For now, I hope I’ve shown how our understanding of historic data can be improved by plotting it spatially and applying GIS tools.  Seeing these locations displayed interactively, as opposed to in a printed list, also makes it easier to form questions and plan further research.  Please feel free to leave a message or email with any comments or questions.


Mapping Albemarle Emigration

I’m a big fan of the book “Albemarle County In Virginia” by Edgar Woods for two reasons: 1) it’s a thorough history of the County packed full of names and places, and 2) the Internet Archive has a full copy digitized and OCR’d.  This makes it relatively easy to search through the text and plot all those place names on a map.  In Appendix 8, Woods provides a list of Albemarle County residents who emigrated to other states along with their destinations.

Movement was not uncommon in 18th- and 19th-century America.  Woods remarks on this on page 55 of his book:

“The migratory spirit which characterized the early settlers, was rapidly developed at this period. Removals to other parts of the country had begun some years before the Revolution. The direction taken at first was towards the South. A numerous body of emigrants from Albemarle settled in North Carolina. After the war many emigrated to Georgia, but a far greater number hastened to fix their abodes on the fertile lands of the West, especially the blue grass region of Kentucky. For a time the practice was prevalent on the part of those expecting to change their domicile, of applying to the County Court for a formal recommendation of character, and certificates were given, declaring them to be honest men and good citizens.”

The list Woods provides stretches for nine pages.  I thought it would be interesting to put all those names on a map and then build an Esri Web App to help explore this data.  The rest of this article will cover how I went from this:

Albemarle Emigrants listed by Woods

To this:

The app is available here.  If you’re less interested in the gritty technical details, feel free to jump to my next post about using the app to draw conclusions about Albemarle emigration.


Note: Although I’ll explain the steps I took, my focus is on the general process and not documenting every detail 100%.  Therefore I may skip over some minor sub-steps for the sake of brevity.  Feel free to email or post a comment if you have questions.

Preparing the Data

The first step was to format and prepare the list of names and places so they could be geocoded (assigned geospatial coordinates for plotting on a map).  The Internet Archive has already done the hard work of scanning and OCRing (converting the words in an image to text).  Here’s a sample of what the data looks like:

OHIO. 

James and Mary (Woods) Garth 

William and Elizabeth (Davis) Irvin, L^ancaster 

Thomas Irvin, Lancaster 

Martin and Mildred Dawson, Gallia Co. 

Andrew J. Humphreys, Logan Co. 

John Wiant, Champaign Co. 

John and Sarah Garrison, Preble Co.

The text is semi-structured, which makes it easier to work with.  The list is organized into sections by state.  Each line below the state name gives the name of an individual or couple, with their destination separated by a comma.  Most of the names are located at the county level; some, however, are at the city level, and others have no destination at all.

For this type of work I like using Excel for my data cleaning.  I copied the entire list, about 700 rows, into a spreadsheet.  Initially the spreadsheet looked like this:

excel1

Notice all the white space?  The OCR process can be messy and in this case left some sections of the list with blank lines between names.  It also captures the name of the book and page number printed at the top of every page.  On line 679 you’ll also notice an OCR artifact, where “ILLINOIS” is spelled “IIvI^INOIS”.  This type of artifact is sprinkled throughout the text.

I started by removing all the blank rows using a little Excel magic.  Then for each row I split the destination name into a new column called “Place Name”.  This was easy thanks to the fact that Woods was consistent in using a comma between the person’s name and their associated destination name.  Excel’s Text to Columns tool using a comma delimiter will automatically create a new column for all the text after the comma.

We go from this:

excel2

To this:

excel3
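If you’d rather script this step, the same cleanup (dropping the blank rows and splitting on the first comma) is a few lines of pandas.  A rough sketch; the filename is hypothetical:

import pandas as pd

#Read the raw OCR'd text, dropping the blank lines as we go
with open("emigrants_raw.txt") as f:
    lines = [line.strip() for line in f if line.strip()]

df = pd.DataFrame({"Raw": lines})

#Split each line on its first comma: name before, destination after
df[["Name", "Place Name"]] = df["Raw"].str.split(",", n=1, expand=True)
df["Place Name"] = df["Place Name"].str.strip()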

From here it was just a matter of some brute-force data cleaning.  I added a column called “State” and manually copied in the State name for each row.  Next I deleted all the extraneous rows containing the state name, page numbers, and book titles.  I also used search-and-replace to find and fix some of the OCR artifacts like “^”.

The next step was to add a column called “Place Type” to capture the provided location level (county, city, or state) for each row.  Again, the structure of the text helped by making it possible to formulate a few rules:

  1. Emigrants located at the county level will have “Co.” at the end of their place name
  2. Cities will have a value in the “Place Name” field that doesn’t end in “Co.”
  3. Emigrants who can only be located at the state level will have a blank “Place Name” field since they didn’t have an associated location.

Following these rules, I filled out the “Place Type” column with values for “County”, “City”, and “State”.
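Expressed in code, these rules boil down to a small function (a sketch of the logic, not something I actually needed in Excel):

def place_type(place_name):
    """Classify a destination using the three rules above."""
    if not place_name or not place_name.strip():
        return "State"   #Rule 3: no place name was given
    if place_name.strip().endswith("Co."):
        return "County"  #Rule 1: county names end in "Co."
    return "City"        #Rule 2: anything else is a city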

The last step was to add a column called “Total” to store the number of people described in each line.  In most cases it was either one person or a married couple, however a few rows referenced several people or a family.

Here’s what the final result looked like:

excel4

Geocoding the Data

This new spreadsheet was an improvement over the original list in terms of usability, but it still needed geospatial coordinates in order to put these locations on a map.  There are multiple options for geocoding textual data, including building a geocoder in ArcGIS or using a publicly available geocoding service from Google, Bing, or the US Census Bureau.  However, most of these services are geared towards processing standard street addresses.  Since I only had county, city, and state names to worry about, I opted for a more manual process in ArcGIS.

I located the counties and states in my list using shapefiles from the US Census Bureau.  This was done by calculating the centroid of each county and state, and then using a table join to attach those coordinates to my list.  There were only 10 unique city names, so I located those manually using Google Maps (a low-tech solution, but it works).
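For anyone scripting the centroid step, geopandas can do the same thing in a few lines.  A sketch; the Census shapefile name is hypothetical:

import geopandas as gpd

#Load the Census county boundaries
counties = gpd.read_file("tl_us_county.shp")

#Compute centroids in an equal-area projection, then convert them
#back to latitude/longitude for the table join
centroids = counties.to_crs(epsg=5070).geometry.centroid.to_crs(epsg=4326)
counties["Lon"] = centroids.x
counties["Lat"] = centroids.y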

In the end I had 413 emigrant destinations and was able to plot them on the map for a quick and dirty visualization.

Working with Python

This was definite progress, but there was a problem.  Because of how the data was structured,  if multiple people went to the same location there would be multiple overlapping points at that spot.  From looking at the map you’d have no idea if there was one point or 100 at a particular spot.  It would also make using the web map more difficult, since you would have to navigate through numerous popups to see all the names at a particular location.

The ideal solution would be one point for each location, with an attribute for the number of people at that location (allowing me to symbolize with proportional symbols).  Each  point would also store all the emigrant names at that location in a single field, allowing for easy viewing in a popup window.

To reformat the data in this way I decided to re-purpose an existing Python script I had handy.  I used a similar process for the Albemarle Historic Web Map where I geocoded place names in the text of “Albemarle County In Virginia”.  In that project I used a custom gazetteer (list of place names) to search for places in the text and associate that text with a spatial coordinate.  This case is a bit different because the geocoding was already done.  But the process could be adapted to take all those stacked points and combine them into a new, single point.  Here is the workflow I came up with (the ArcGIS steps are sketched in arcpy after the list):

  • In ArcGIS, develop a list of unique place names and give each a unique ID.
    • I did this by dissolving my original shapefile on the Latitude, Longitude, and place name fields. I also used a statistic field to sum all of the people at that location.
    • Each location was assigned a unique ID number.  The exact number didn’t matter as long as it was unique to that spatial location.
    • Example: Garrard County, Kentucky has an ID of 52
  • In ArcGIS, associate each of the original emigrant points with the unique ID number for that location
    • This was done using a spatial join with the place name shapefile to transfer the unique ID attribute to all the emigrant points that shared the same location
    • Example: The four points that fall on Garrard County each now have an ID of 52
  • Convert the attributes of each shapefile to a CSV table
    • One table called “Places” includes the unique place names, coordinates, and ID numbers
    • Another table called “People” contains emigrants and the unique ID of their location
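In arcpy terms, the two ArcGIS steps look roughly like this (a sketch; the layer and field names are hypothetical, and I treated the dissolved layer’s object ID as the unique place ID):

import arcpy

#Step 1: collapse the stacked emigrant points into one point per unique
#location, summing the "Total" field to count the people at each place
arcpy.management.Dissolve("emigrants", "places",
                          ["Lat", "Lon", "PlaceName"], [["Total", "SUM"]])

#Step 2: spatially join the places back to the original emigrant points
#so every emigrant record picks up its location's unique ID
arcpy.analysis.SpatialJoin("emigrants", "places", "emigrants_with_id")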

Finally, I could take these tables and use them as the input to my Python script:

import csv

#Open the input and output CSVs
peopleCSV = open('People.csv', 'rt')
placesCSV = open('Places.csv', 'rt')
outputCSV = open('Output.csv', 'w', newline='')

try:
    #Create objects to read the inputs and write the output
    peopleReader = csv.reader(peopleCSV)
    placeReader = csv.reader(placesCSV)
    writer = csv.writer(outputCSV)

    #Write a header row in the output CSV
    writer.writerow( ("ID", "State", "CountyCity", "Type", "Lat", "Lon", "Total", "People") )

    #Create an empty list to hold the names of people at each unique location
    peopleNames = []

    #Loop through the CSV with place names
    for placeRow in placeReader:
        #Create variables to store each of the relevant place name attributes
        placeID = placeRow[0]
        state = placeRow[1]
        placeName = placeRow[2]
        placeType = placeRow[3]
        lat = placeRow[4]
        lon = placeRow[5]
        total = placeRow[6]

        #Loop through the CSV with people names
        for personRow in peopleReader:
            #Capture the unique ID and name associated with each person
            peopleID = personRow[0]
            personName = personRow[1]

            #Check to see if the ID from the Place list matches this row of the
            #Person list. If they match, this person is located at that place
            if placeID == peopleID:

                #Add an HTML break tag to the person's name - this is necessary
                #for displaying properly in the webmap popup box
                entry = personName + "<br>"
                #Add the person's name to the peopleNames list
                peopleNames.append(entry)

        #Create an empty string variable
        printNames = ""

        #peopleNames is a list and will be formatted oddly if we write it directly
        #to a CSV. Loop through the list and add each name to a single string
        for person in peopleNames:
            printNames += person

        #Have the reader return to the beginning of the People list for the next place
        peopleCSV.seek(0)

        #Write out a new row to the output CSV.  All the emigrants at this location
        #will now be stored in a single field
        writer.writerow( (placeID, state, placeName, placeType, lat, lon, total, printNames) )

        #Reset the list to empty for the next iteration
        peopleNames = []

finally:
    #Close the input and output CSVs
    peopleCSV.close()
    placesCSV.close()
    outputCSV.close()

This output CSV could then be brought back into ArcGIS and converted to a Feature Class.
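In arcpy, the conversion is a one-tool job.  A sketch using the XY Table To Point tool (“emigrant_places” is a hypothetical output name; the CSV and field names come from the script above):

import arcpy

#Turn the combined CSV back into a point feature class using the
#stored latitude/longitude fields
arcpy.management.XYTableToPoint("Output.csv", "emigrant_places",
                                "Lon", "Lat",
                                coordinate_system=arcpy.SpatialReference(4326))

Here’s what the final results looked like using proportional symbols to show the number of emigrants at each location: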


As a final step I calculated a few additional layers for the webmap:

  • “Flow lines” connecting each destination point with its origin in Albemarle County (see the sketch after this list).
  • The Linear Directional Mean of all those lines to highlight the average direction, length, and geographic center of the flow lines
  • A Standard Deviational Ellipse of all the destination points to better understand their central tendency, dispersion, and direction.
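The flow lines in particular don’t have to be drawn by hand; the XY To Line tool can build them from a table that has a start coordinate (Albemarle, identical on every row) and an end coordinate (the destination) on each row.  A sketch with hypothetical field names:

import arcpy

#Draw a geodesic line from Albemarle (StartLon/StartLat) to each
#destination (Lon/Lat)
arcpy.management.XYToLine("Output.csv", "flow_lines",
                          "StartLon", "StartLat", "Lon", "Lat",
                          "GEODESIC")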

Getting Everything Online

Building an Esri Web App is a two-step process.  First you build a web map, then you pull that map into an app.  I uploaded all the data to ArcGIS Online (AGOL) and went to work on the map.  AGOL makes it easy to create proportional symbols.  I also created a custom popup to highlight the number of emigrants at each location and show their names.  Another cool feature of AGOL is that you can easily embed a chart or graph into your popups.  I added a pie chart showing the number of emigrants at each location compared to the overall total number of emigrants.

Adding a Basemap

At this point I was happy with how the map was shaping up, but it was missing something.  I wanted to give some historic context for these locations beyond what the standard Esri basemaps provided.  I took a visit to the David Rumsey map collection and found a nice looking map by John Melish from 1822.  The map was gorgeous but the color and shading detracted from my focus on the emigrant locations.  To fix this, I downloaded the map, pulled it into Photoshop, and converted it from color to gray scale.  Then I brought the map into ArcGIS and georeferenced it to give it spatial coordinates.  Finally, I took the new georeferenced, gray scale version of Melish’s map and used Python’s GDAL library to build raster map tiles.  Tiling is a great choice for building raster web map layers that load quickly.  I uploaded the tiles and added them as a layer to my map.
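The GDAL utility typically used for tiling is gdal2tiles.py, a Python script that ships with GDAL.  The invocation is a single command; the zoom levels and filenames here are illustrative:

#Build a tile pyramid from the georeferenced grayscale map
gdal2tiles.py --zoom=4-10 melish_1822_gray.tif melish_tiles/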

Building an App

The last step was to incorporate the map into an app.  This gives the ability to create a custom, streamlined interface and add widgets with additional functionality.  Esri’s Web AppBuilder is a handy way to quickly make good-looking apps.  I wanted to keep the final app simple, so I chose a minimal theme called “Billboard.”  I also wanted to keep the app very focused on my content, so I turned off some of the default widgets that weren’t necessary here, like the “Locate Me”, “Search”, and “Overview Map” widgets.  I added a few new widgets to help users get the most out of the app:

  • An information panel with details about the map and sources
  • A layer list widget for toggling layers on and off
  • A chart widget for building interactive bar graphs
  • An attribute table widget for exploring the emigrant attribute data
  • A sharing widget so users can easily share the app with their friends and colleagues

These widgets are what separate an “app” from just a “map”.  They allow the user to easily interact with the map by turning layers on and off, reordering layers, and adjusting the transparency.  The chart widget is a particularly handy way to explore the data.  It displays a bar chart, pulling directly from the live data, of the total number of emigrants by state.  What’s cool is that the chart is interactive and linked with the map, so selecting a state on the chart will highlight the associated points on the map.

webapp_ss_6



I hope this post has been helpful for understanding how unstructured historical text can be cleaned and plotted on an interactive web map.  Please keep reading,  visit the app and explore for yourself, or ask questions in the comments!


Introducing the Albemarle Historic Web Map

Albemarle County is fortunate to have a well documented and preserved history.  When I began researching this history I quickly discovered numerous online resources.  Being a spatially-minded person, I started plotting points on a map to make sense of the different locations and events I was reading about.  Eventually I realized that other people might find this useful so I started to work on an interactive web map (viewable here).

From the beginning I had a few different goals for this project:

  • Help people easily connect with local history
  • Make a map that was visually appealing, fun to use, and quick to load
  • Be sure it worked well on mobile devices so users could easily get outside and explore with it
  • Demonstrate how existing textual sources can be enhanced just by displaying them spatially

web map screenshot

The Albemarle Historic Web Map.

My primary goal with the web map is to make it easier for residents to tap into the history that exists all around us. Plotting things spatially allows a user to form a personal connection with this history by learning about people and events linked to personally relevant places.  This could be in your backyard (literally) or your neighborhood, along the road you drive to work,  a shopping center, or the park where you take your dog for a walk on the weekends.

My own experience with a barn that I regularly drive past illustrates how this can work.  After I built the map I was browsing around my neighborhood and noticed a farm called Woodlands (38° 06′ 21″ N, 078° 30′ 01″ W) that was listed as a historic landmark.  I recognized the property immediately because it sits on a particularly scenic section of landscape.  The farm is noteworthy both architecturally and for being associated with John Richard Wingfield, a prominent late nineteenth century state senator.   One of the associated buildings on the farm is an antebellum barn that is clearly visible from the road.  The historic register nomination form spends almost four pages discussing the barn, and includes this passage:

“One of only eight or ten antebellum barns surviving in Albemarle, the Woodlands barn is unique among them. Relatively little is known about early barns in central Virginia, and the barn at Woodlands helps fill a significant gap in our knowledge of the area’s farm buildings and its agricultural practices in the mid-nineteenth century. The barn is to date the earliest mixed-use barn recorded in the region.”

Not just any old barn.


Pretty interesting, right?  Plotting this feature on a map and making this connection isn’t a revolutionary discovery.  It may not even be of much interest to anyone unfamiliar with the particular area.  But for those who have a spatial connection to Woodlands, either because they live in the area or pass by it, I think this is an interesting tidbit and knowing about it makes driving by a richer experience.


Right now the map includes sources that are readily and freely available online (mostly).  I didn’t comb through archives or digitize old records.  Instead, I wanted to see how much value I could squeeze out of taking easily available sources and displaying them spatially.  Eventually this might change and include original research, but for now I’m happy to stick with existing sources:

  • Historic buildings with their names and dates of construction.  This layer was developed by conflating several data sets from the Albemarle County Office of Geographic Data Services.
  • Historic sites from the National Register of Historic Places and the Virginia Landmarks Register cataloged by the Virginia Department of Historic Resources (VDHR). Using a list of geolocations from Wikipedia, I plotted each historic site on the map and then linked those locations back to their nomination form and pictures.  These forms are filled out when a site is nominated for inclusion on the Register and they contain both a thorough physical description of the property as well as a historical record of previous owners and events that took place on the property.
  • I manually digitized a layer of historic areas and neighborhoods to show both VDHR historic districts and historic villages. This layer links back to the relevant VDHR forms or a community architectural survey commissioned in 1995 by the county.  Places today that seem to be just a small cluster of houses at an intersection were actually once bustling communities and this layer links back to sources that help tell their story.
  • Multiple books have been written about the history of Albemarle County. One of the best is a book from 1901 called “Albemarle County In Virginia: Giving Some Account of What It Was By Nature, of What it Was Made by Man, and of Some Of the Men Who Made It” (they don’t make titles like that anymore) by Reverend Edgar Woods.  It’s packed full of the people and events that shaped Albemarle, and many of them are tied to a specific location, whether it’s a geographic locale or the name of a house or estate.  The Internet Archive has a full digital copy of the book online.  I wrote a Python script that searched the text for place names and then associated them with geocoordinates.  I think that being able to browse this book spatially gives it new relevance in the digital era.
  • Road side historical markers offer bite-sized chunks of history. Markers in Albemarle County were photographed and added to the map.  In some cases they overlap with other data layers, and in others they offer a unique story that isn’t captured in any of the other map sources.
  • Several maps exist for Albemarle County from the Civil War and Reconstruction era. They show roads, train stations, mills, and the names of residents.  My interest in local history actually started when I found Green Peyton’s 1875 map and noticed a person’s name not far from where I lived.  By importing the maps into a GIS and georeferencing them (stretching and aligning them to give them geographic coordinates) it’s easy to make these discoveries.  It’s also interesting to see just how many of our current roads and place names were in use over 150 years ago.

Peyton map of Charlottesville

Charlottesville, circa 1875

The map is designed to function well on both desktop and mobile devices.  Both versions have the same content, but the map widgets have been tweaked to support different use cases.  On the desktop site, users can search by address, view latitude/longitude coordinates, put the map in full screen mode, and link directly to Bing Birds Eye View aerial imagery.  The mobile version removes these features, which are less necessary or ill-suited for a mobile device, and instead allows users to plot their own location so they can easily see what historical locations are nearby.