Posts

7/24/17

On 7/21/17 I spent a lot of time trying to fix my MacBook so that I could more easily work on this project at home.

I was also able to export my database as a .csv file using the COPY command:

    COPY precincts TO '/home/dsmall/VA_votes/CSV_converter/Untitled Folder' DELIMITER ',' CSV HEADER ;

This just copies the table into a .txt or .csv file, which can then easily be put into an Excel spreadsheet.

The last thing I did was make a separate spreadsheet with all of Elizabeth Guzman's donors and the value of their donations. I sent these files to Louis, who is working on her campaign. The next thing he wanted me to do was make a scraper that gets the phone number or home address of her donors. While looking into this I found that Whitepages and the Yellow Pages aren't just databases with everyone's phone number and address:

http://www.whitepages.com/
https://www.yellowpages.com/

To actually get the information you have to pay.
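
If I ever need to do this export from Python instead of psql, something like the following sketch should work with psycopg2 (the connection details here are placeholders for my local setup):

    import psycopg2

    # Placeholder connection details for my local database.
    conn = psycopg2.connect(dbname='va_votes', user='dsmall')
    cur = conn.cursor()

    # COPY ... TO STDOUT streams the table through the client connection,
    # so the CSV ends up wherever this script runs instead of on the server.
    with open('precincts.csv', 'w', newline='') as f:
        cur.copy_expert('COPY precincts TO STDOUT WITH CSV HEADER', f)

    cur.close()
    conn.close()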

7/20/2017

Today I will be working on some issues I found earlier this week. The first one I am going to fix is the file name issue. I plan on adding a new column with the name of the file, so that you can distinguish the data sets. The file names should also include the election name and the year of the election. While fixing that problem I also fixed the repeat problem; I just had to change the indentation of the print statement.

The next issue I need to fix is the directory. I want to make the code more general so that more people can use it. Right now it looks like this:

    indir = '/home/dsmall/VA_votes/CSV_converter/Data'

Now it looks like this:

    indir = 'Data/'

I am not 100% sure why this works, but it does. (Most likely it works because a relative path is resolved against whatever directory the script is run from, so 'Data/' is found as long as I run the script from the project folder.)

The next step would be to get the scraper to work, but I do not know what is wrong with it. While double checking everything I found a new bug: some rows of data are being output when they should not be. This is a problem.
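
A rough sketch of the file-name column idea, using the csv module (the folder layout is real, but the column handling is just a placeholder for what Test-CSV.py actually does):

    import csv
    import glob
    import os

    indir = 'Data/'

    # Loop over every CSV in the Data folder and tack the source file name
    # onto each row, so rows from different election files can be told apart.
    for path in glob.glob(os.path.join(indir, '*.csv')):
        filename = os.path.basename(path)
        with open(path, newline='') as infile:
            for row in csv.reader(infile):
                print(row + [filename])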

Fifth Day

Today I plan on further improving my script by adding the ability to work on multiple files in a folder. I also want to be able to format the information I have into "INSERT" queries for SQL. Another thing I could work on is just creating the tables for my database.

I figured out how to run my program in the Terminal:

    python3 Test-CSV.py

This is the command I used to run it. The next thing I will work on is getting the data in the right format to make the "INSERT" statements.

After lunch I decided to just make all of the tables that I would need for the database. I made the tables very quickly and easily. I am now working on making the script output "INSERT" statements, and I almost have the precinct table part done.

I had to rework the whole table, but I was finally able to make the insert statements work. I had to remove all spaces and commas so they would not interfere with the SQL syntax. Then to make sure it was correct I used a Q
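
A sketch of the INSERT-statement step, stripping spaces and commas from each value the way I did in the script (the file name and table columns here are placeholders, not the real layout):

    import csv

    def clean(value):
        # Strip spaces and commas so the values do not break the hand-built SQL.
        return value.replace(' ', '').replace(',', '')

    with open('Data/precincts.csv', newline='') as f:
        reader = csv.reader(f)
        next(reader)  # skip the header row
        for row in reader:
            values = ','.join("'" + clean(v) + "'" for v in row)
            print('INSERT INTO precincts VALUES (' + values + ');')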

Third/Fourth Day

Today I plan to design a database for all of the data that I now have from the script. I am going to make an 'Entity Relationship Diagram', or ERD, to help me plan it out before I start coding. Instead of a traditional ERD that looks like a flowchart, I just drew out each data table I would use. Here is a link to the layout that I plan to use.

When I got home I started to look into the csv module for Python. After reading a lot about it and messing around, I was able to create a very small script that just printed each line. I continued to experiment until I had a better understanding of how to manipulate the data. I then created a repository on GitHub for my little script so that I could continue to use it when I returned to the ACC. After having it set up I explored git a little more so that I could understand how to push, pull, commit, and lots of other things. Here is a link to the repo.

I wanted to make sure that this was working, so I set out to ma
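
The first version of the script was roughly this small (the file name is just an example):

    import csv

    # The tiny first version: open one results file and print every row.
    with open('results.csv', newline='') as f:
        for row in csv.reader(f):
            print(row)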

Second Day

The first thing I did was update my version of the scraper and try it again. Near the end of the process the program froze, so I added a new issue to the GitHub page. The next thing I did was delete and reinstall the entire script to ensure it was the correct branch. I then ran it again and waited. The program finished, but the first-page bug is still not fixed.

While waiting for a solution I decided to focus on the data I did have for Prince William County. I was able to create a database with two tables, one for the 2015 House of Delegates election and the other for the 2011 House of Delegates election. Both of these tables only have the data for District 31, which is the district Elizabeth Guzman is running in. I added the 2007 election to get a better idea of the voting trends.

If I wanted to see the breakdown of the 2007 House of Delegates election for only Prince William County, I would type:

    SELECT * FROM District_31_2007 WHERE District_31_2007.coun
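
The query above got cut off; a sketch of what I mean, run from Python and assuming the table has a county column with the value stored as 'PRINCE WILLIAM' (both of those are guesses about the exact layout), would be:

    import psycopg2

    # Placeholder connection details for my local database.
    conn = psycopg2.connect(dbname='va_votes', user='dsmall')
    cur = conn.cursor()

    # Pull only the Prince William County rows out of the 2007 table.
    cur.execute("SELECT * FROM District_31_2007 WHERE county = %s", ('PRINCE WILLIAM',))
    for row in cur.fetchall():
        print(row)

    cur.close()
    conn.close()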

First Day

I started to work on the va_votes project, specifically on Elizabeth Guzman's race in Prince William County. The first thing I did was look into QGIS and psql. I have set up a QGIS project with all of the precincts in Virginia. I am now working on getting all of the data from past elections. There is a bug in the scraping program that prevents me from viewing the most recent presidential election results. Once I have this data I will create a database to store it in.