How to download and compile census data in GIS using the NHGIS website

Prepared by Mehdi Heris

Today, you will learn how to download census variables as a table, along with their associated GIS boundaries, from the NHGIS website and then join the table to the boundaries in ArcGIS.

  1. This is going to take about 30 minutes or so; why not prepare a cup of coffee first?
  2. Go to the NHGIS website. This website provides both tables and boundaries for census variables.
  3. From the left, click on “Select Data”.
  4. You are now on this page:
  5. You need to know some terms here. The census has multiple datasets. You can filter them by year or by type of dataset. Click on “Datasets”.
  6. In this window, you can select the main category of your dataset from the left menu. In this example, we are going to use the Decennial Census, the first category in the menu. This is one of the main census products, and it has numerous variables.
  7. Now you will see the subcategories of the Decennial Census. You can choose 2000 or 2010. Let’s choose the 2010_SF1a category, which includes the most commonly used population and housing variables.
  8. Submit your selection. You will see the list of variables. Change the number of items shown from 20 to 500.
  9. Now you can see almost all of the variables. Click on the “Source Tables” tab.
  10. Select your variables by clicking on the yellow circle.
  11. To get more information, you can click on the table name and see the properties of that variable.
  12. Some variables are not a single column of data; they can span multiple columns. For example, click on the table name for one of the variables (e.g., Urban and Rural) and you will see how many variables exist under that category. As the image below shows, Urban and Rural contains 6 variables.
  13. Select your variables by clicking on the yellow circle. The small window at the top right shows the number of selections you have made.
  14. Now that you have the tables, let’s go to the GIS Boundary Files tab.
  15. From this list, you can choose the geographic level of aggregation. For example, if you choose the block level, your population field will be reported per block.
  16. Choose Blocks for Colorado, for example.
  17. In your Data Cart window you should see the number of GIS files you’ve added.
  18. Obviously, you could choose multiple boundaries and files.
  19. Now you have tables and boundaries. Click “Continue”.
  20. You should be on this page:
  21. You have a red message! It says: One or more tables lack a geographic level selection (see below).
  22. Now we should define our geographic level. Click on the “Select geographic level” button.
  23. On this page, select block as your geographic level and submit.
  24. You should see a summary of your selection.
  25. Then press the green “Continue” button.
  26. You will see this page:
  27. If it says that no geographic extent is selected, go ahead and choose one (e.g., Colorado).
  28. Choose the comma-separated format with headers.
  29. You need to sign in to submit your order. If you do not have an account, create one.
  30. After signing in, you will see the status of your order. It takes a little while for the website to prepare and complete your order. See the status bar.
  31. When your order is complete, the status bar will show it.
  32. Now you can download the tables and GIS boundaries.
  33. When you download the files, your browser may flag them as a threat. Ignore the warning and choose “Keep”.
  34. You’ve downloaded two zipped folders. After copying the folders to your workspace folder, extract (unzip) them.
  35. Explore the contents of your folders. If there are zipped folders inside them, unzip those as well.
  36. All right, now let’s dive into the table folder, which contains the CSV file.
  37. You will see these two files in the table folder:
  38. The code book is very helpful, and I suggest having a look at it. It contains descriptions of all the variables.
  39. The CSV file can be read by Excel. In Excel, go to File > Open and open this file. Make sure to change the file type filter so that the file appears in the Open dialog.
  40. When you open the csv file you will see this spreadsheet:
  41. I can even hear from here that you are complaining about this table. Yes, it is busy and not clean. Wait a minute. Most of these columns are not useful for our purpose, so we can simply delete them. But which ones? Let’s see what we have. The first column is “GISJOIN”. This one is very important: we will use this field to connect the rows to the blocks, so we will keep it. The other columns describe the place. For example, one is for the state, one is for the county, one is for the metropolitan area, and so on. You probably do not need them, and I would delete all place-related columns. Before deleting anything, though, we need to find something else.
  42. Remember that we chose some variables such as population and housing units; those variables are in this table too. You usually find them at the end of the column list. They come with some weird codes. See the image below:
  43. The key is that you can look them up in the code book file. This is from the code book:
  44. Now you know what each code means. H7001 is the total population, for example.
  45. All right, we found what we need: the “GISJOIN” column and our variable columns. You can delete the other columns now.
  46. You need exactly one header row. Let’s rename the headers and keep a single row. There is a small point here. As we discussed in class, a field name should not contain spaces, dashes, or other special characters. Names must also be short: shapefile field names are limited to 10 characters. Therefore, we need short and simple names. I changed my names and cleaned my table:
  47. Your table is ready to go to ArcMap. Save it and close Excel. Open ArcMap and use the “Add Data” option to add the table to your map (data frame).
  48. Open your table; add your GIS layer (shapefile) downloaded from NHGIS.
  49. Open the table of the block layer too.
  50. The block layer also has a “GISJOIN” field. This is the same field that we have in our census table. Yes, it is the key for joining the two tables. How?
  51. Notice that the census table should be attached to the block layer. Therefore, the block layer is going to be the host and the census table is going to be the guest.
  52. Right click on the block layer which is our host:
  53. Find “Joins and Relates” > “Join”; you will see this window:
  54. Make sure the top combo box is set to “Join attributes from a table”.
  55. Choose the common field in both tables. Option 2 should be the guest table that you are attaching to the blocks (the census table).
  56. The common field for the join is the join ID, which has unique values. In our case, it is “GISJOIN”.
  57. Keep all records; press the OK button.
  58. Now the guest is in the host’s home. But remember: the join is not permanent. It is only a temporary link between the two tables. To make it permanent, we need to export our block layer to a new layer.
  59. Right-click on the block layer and choose Data > Export Data.
  60. Give an appropriate output name and location and save your layer.
  61. Believe it or not, you are DONE! How much of your coffee is left? Mine is gone!
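
If you would rather script the table cleanup than do it by hand in Excel, here is a minimal sketch of the same idea using only the Python standard library. It is not part of the NHGIS or ArcGIS workflow above, just an illustration. The function names (clean_nhgis_rows, join_keep_all), the output name TOT_POP, and the sample rows and GISJOIN values are all made up for this example. It assumes the CSV was ordered “with headers”, so a second descriptive row follows the column names, and it assumes a 10-character limit on field names (the shapefile limit); change max_len if your output format differs.

```python
import csv
import io

# Made-up sample in the shape of an NHGIS table "with headers":
# row 1 holds column codes, row 2 holds descriptions, the rest is data.
SAMPLE = (
    "GISJOIN,YEAR,STATE,COUNTY,H7001\n"
    "GIS Join Match Code,Data File Year,State Name,County Name,Total\n"
    "G0001,2010,Colorado,Adams County,25\n"
    "G0002,2010,Colorado,Adams County,0\n"
)


def clean_nhgis_rows(csv_text, rename, has_descriptive_row=True, max_len=10):
    """Keep GISJOIN plus the columns listed in `rename`, under short names.

    `rename` maps an original column code to the new field name.
    """
    out_names = ["GISJOIN"] + list(rename.values())
    for name in out_names:
        # Letters, digits and underscores only; no spaces or dashes.
        if len(name) > max_len or not name.replace("_", "").isalnum():
            raise ValueError(f"unsuitable field name: {name!r}")

    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    if has_descriptive_row:
        next(reader)  # drop the second header row so only one remains

    positions = [header.index(col) for col in ["GISJOIN"] + list(rename)]
    return [
        dict(zip(out_names, (row[i] for i in positions)))
        for row in reader
    ]


def join_keep_all(blocks, table_rows, key="GISJOIN"):
    """Attach table fields to every block (host); unmatched blocks get None."""
    lookup = {row[key]: row for row in table_rows}
    fields = [f for f in table_rows[0] if f != key] if table_rows else []
    joined = []
    for block in blocks:
        match = lookup.get(block[key])
        merged = dict(block)  # copy, so the original block is untouched
        for field in fields:
            merged[field] = match[field] if match else None
        joined.append(merged)
    return joined


if __name__ == "__main__":
    table = clean_nhgis_rows(SAMPLE, {"H7001": "TOT_POP"})
    # Stand-in for the block layer's attribute table (the host).
    blocks = [{"GISJOIN": "G0001"}, {"GISJOIN": "G0003"}]
    for record in join_keep_all(blocks, table):
        print(record)
```

The second function mirrors the “keep all records” choice in the join window: every block stays in the result, and blocks without a matching row in the census table simply get empty values. Because it returns new records instead of modifying the inputs, it also plays the role of the export step that makes the join permanent.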

