Between recent class discussions and hours spent on code-writing worksheets, I can confidently say that I learned more about coding in R this past week than I have all semester. Whether it was the pressure of the final project or just the normal end-of-module grind, I spent a lot of time working through scripts and getting comfortable with the language of R. I believe this sudden growth stems from the difficulty of the recent code-writing worksheets. Unlike the first few, some sections could not be completed without a firm understanding of the material covered in the videos. Now that the final project is approaching, we are required to apply some of the techniques we have learned this semester and report our observations on our chosen topic. For my project specifically, I will be using regression analysis, multinomial regression, and most likely a heat map to display my results visually. As part of this process, I must understand how to extract the data I need from the data set I am working with. For example, I know that to run a multinomial regression properly (meaning the categorical variable takes more than two values, such as x=0, y=1, z=2), I must first organize my data so that the regression is run across the different town locations of the donors. If I were to run a basic binary regression or chi-squared test, my results would be skewed because not all towns could be represented in the results. In economics, we must be very precise when deciding which kind of analysis to run.
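The data organization step described above can be sketched roughly as follows. This is only an illustration in Python, not the project's actual R script: the donor records, the `donations` list, and the `encode_towns` helper are all invented for the example. In R, the analogous step would typically be converting the town column to a factor before fitting the model.

```python
from collections import Counter

# Hypothetical donor records; names, towns, and amounts are made up for illustration.
donations = [
    {"donor": "Donor A", "town": "Portland", "amount": 50.0},
    {"donor": "Donor B", "town": "Bath", "amount": 20.0},
    {"donor": "Donor C", "town": "Portland", "amount": 75.0},
    {"donor": "Donor D", "town": "Brunswick", "amount": 10.0},
]


def encode_towns(records):
    """Map each distinct town to an integer code (0, 1, 2, ...) in alphabetical order."""
    towns = sorted({record["town"] for record in records})
    return {town: code for code, town in enumerate(towns)}


# One category per town, so every town is represented in the analysis.
town_codes = encode_towns(donations)

# Attach the category code to each record without modifying the originals.
coded = [dict(record, town_code=town_codes[record["town"]]) for record in donations]

# Count donations per town to check that no category is empty.
counts = Counter(record["town"] for record in donations)
```

Checking `counts` before fitting anything is a quick way to spot towns with too few donations to support their own category.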
The readings we are considering as my partner and I work through the final project are those on racial capitalism, data cleaning, and data feminism. Our project is tied to racial capitalism because we are focusing on where much of the donation money originated. In this research, I intend to focus on hotspot areas of large donations, first to correlate wealthy areas of Maine with wealthy donors, and second to research the history of each area to explain why this wealth ended up there. Also within this scope, if there are certain donors, or multiple donors from the same family, it would be important to research how the family gained its wealth in such an exploitative era of American history. For data cleaning, we are focused solely on extracting the data we are using for our empirical work. The goal is not to introduce omitted variable bias into our results; rather, some of the other recorded data is not relevant to our specific purposes and models. In terms of data feminism, a trend we notice when focusing on families of donors in our data set is that the mother (an assumption on our part) often does not have her first name listed. A portion of the data records female names only as “Mrs.”, presumably because, in this time period, a woman’s first name was not considered important so long as the name of the man of the household was attached. This major difference from modern-day business receipts is something we must consider as we work through our data and as we analyze the significance of our results.
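One way to keep the “Mrs.” entries visible during cleaning, rather than silently dropping or merging them, is to flag them. The sketch below is a hypothetical Python illustration: the sample names, the `MRS_PATTERN` expression, and the `is_recorded_as_mrs` helper are assumptions, and the real data set may format titles differently.

```python
import re

# Matches a name that begins with the title "Mrs" (with or without a period).
MRS_PATTERN = re.compile(r"^Mrs\.?\s+", re.IGNORECASE)


def is_recorded_as_mrs(name):
    """Return True when the recorded name starts with the title 'Mrs.'."""
    return bool(MRS_PATTERN.match(name.strip()))


# Hypothetical entries, invented for illustration only.
names = ["Mrs. John Smith", "Mary Jones", "Mrs. Brown"]

# Keep every row and add a flag, so these donors stay in the analysis.
flags = [is_recorded_as_mrs(name) for name in names]
share_flagged = sum(flags) / len(names)
```

The flag only records how the name was written down; it does not tell us whose first name, if any, follows the title, so that judgment still has to be made by reading the entries.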

