Today, I worked on the tasks that were outlined in the last post. I was able to successfully use the databases to complete steps one through four! One challenge I encountered was that at first, the program kept crashing because the table I was making in SQL already existed (so I had to make a new table every time I ran it). However, by reading the documentation, I eventually figured out that there was a parameter, if_exists, that I could set so that the program would just append to the existing table rather than crashing.
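Here is a minimal sketch of the append behavior, assuming the data is written with pandas' DataFrame.to_sql (which is where the if_exists parameter lives). The table name, column names, and values are made up for illustration, and the sketch uses an in-memory SQLite connection from the standard library rather than the project's actual database setup.

```python
import sqlite3

import pandas as pd


def log_readings(readings, conn, table="readings"):
    """Append a batch of readings to `table`, creating the table if needed."""
    # The default, if_exists="fail", raises an error when the table already
    # exists; "append" adds the new rows to the existing table instead.
    readings.to_sql(table, conn, if_exists="append", index=False)


# Hypothetical sample data; the real column names may differ.
conn = sqlite3.connect(":memory:")
batch = pd.DataFrame({"room": ["101", "202"], "temperature": [68.5, 75.2]})

log_readings(batch, conn)
log_readings(batch, conn)  # a second run no longer crashes

row_count = pd.read_sql_query("SELECT COUNT(*) AS n FROM readings", conn)["n"].iloc[0]
```

The other accepted value, "replace", drops and recreates the table on every run, which would throw away the earlier data.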
Over the weekend, I worked on and completed a skeletal version of the weekly report, which produced the number of too-warm, too-cold, and too-much-carbon intervals for each room logged as having problems.
Now, I'm working on a more complex version, with a few more steps and a database. This new version will log all the data to a SQL database every 15 minutes, rather than four times per day (using SQLAlchemy and SQLite). Whenever data is logged, it will read in and analyze problem rooms, saving that somewhere else. Each day, the system will take the problem data from the 15-minute intervals and aggregate it by room number into a set of "daily" data. Finally, once a week, the system will take seven days' worth of daily data and turn it into a cohesive weekly report. As Mr. Navkal explained to me, this version allows for more flexibility, as all the raw data is being saved in addition to the summary displayed to the user.
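The daily and weekly steps described above could be sketched roughly like this with pandas. This is only an illustration under assumptions: the column names (timestamp, room, temp_diff), the function names, and the sample rows are all hypothetical, and the real system reads its data from the database rather than from a hand-built DataFrame.

```python
import pandas as pd


def daily_summary(problems):
    """Roll 15-minute problem rows up into one row per room per day."""
    by_day = problems.assign(day=problems["timestamp"].dt.date)
    return (
        by_day.groupby(["room", "day"])
        .agg(intervals=("temp_diff", "size"), mean_diff=("temp_diff", "mean"))
        .reset_index()
    )


def weekly_report(daily):
    """Roll the daily rows up into one row per room."""
    return (
        daily.groupby("room")
        .agg(days_with_problems=("day", "nunique"), intervals=("intervals", "sum"))
        .reset_index()
    )


# Made-up problem intervals for two rooms.
problems = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            ["2019-03-04 09:00", "2019-03-04 09:15", "2019-03-05 10:00", "2019-03-04 09:00"]
        ),
        "room": ["101", "101", "101", "202"],
        "temp_diff": [3.0, 5.0, 2.0, -4.0],
    }
)

daily = daily_summary(problems)
weekly = weekly_report(daily)
```

Because each stage only reads the output of the stage before it, the raw 15-minute rows stay untouched and the summaries can be rebuilt at any time.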
During class yesterday, I mainly started learning how to use a database with SQLAlchemy and SQLite. (We read from the databases using SQLAlchemy, and write to them using SQLite.) It was pretty interesting to learn about in general!
Today, I continued working on the Weekly Report aspect. Applying many of the techniques I learned in the Coursera course, I was able to create a sort of skeleton for a weekly report -- currently (as described in the repository's README), it reads the three CSV files (ahs_cold_data, ahs_carbon_data, and ahs_warm_data) and states the mean temperature difference and CO2 difference from the norm for each room with issues. There are two main issues in progress:
1. It needs to measure the average for the full week, not just the days where the room is logged as having issues.
2. In the specific case that a room is too warm at one time and too cold at another, it logs as two separate rooms rather than finding the mean temperature difference between the two.
I will try to fix these two and possibly add in more data to the report over the weekend!
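One possible approach to the second issue is to stack the warm and cold rows into a single table before grouping, so each room appears only once. This is a sketch, not the fix that ended up in the repository: it assumes both tables share hypothetical room and temp_diff columns, and that too-cold differences are stored as negative numbers.

```python
import pandas as pd


def combined_temperature_mean(warm, cold):
    """Mean temperature difference per room across warm and cold rows."""
    # Stacking first means a room that shows up in both tables is grouped once.
    both = pd.concat([warm, cold], ignore_index=True)
    return both.groupby("room")["temp_diff"].mean()


# Made-up data: room 101 is too warm twice and too cold once.
warm = pd.DataFrame({"room": ["101", "101", "202"], "temp_diff": [4.0, 2.0, 3.0]})
cold = pd.DataFrame({"room": ["101"], "temp_diff": [-3.0]})

result = combined_temperature_mean(warm, cold)
```

If the cold data stores differences as positive magnitudes instead, they would need to be negated before the concat.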
Prior to today's class, I received a couple of new tasks:
In class today, I worked on the first objective. I created a new file, in which I read in all the "warm", "carbon", and "cold" data from their respective CSV files. Next, I grouped the DataFrames by room number and calculated the week's average temperature (or CO2) differences in each room. (Right now, this doesn't include days that are not logged; it only includes the days that do have differences from the norm.) Additionally, I attempted to merge the DataFrames to convey all the information I needed, but this is still a work in progress. Overall, it's really cool to have an opportunity to apply my newly acquired skills!
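The group-then-merge step might look something like the sketch below. The column names and numbers are hypothetical, and small hand-built DataFrames stand in for the CSV files; an outer merge is just one reasonable choice, used here so that a room with only temperature problems or only CO2 problems still shows up.

```python
import pandas as pd


def room_means(df, value_col):
    """Average `value_col` for each room, keeping `room` as a regular column."""
    return df.groupby("room", as_index=False)[value_col].mean()


# Made-up stand-ins for the warm and carbon CSV data.
warm = pd.DataFrame({"room": ["101", "101", "202"], "temp_diff": [4.0, 2.0, 3.0]})
carbon = pd.DataFrame({"room": ["101", "303"], "co2_diff": [200.0, 350.0]})

# how="outer" keeps rooms that appear in only one of the two tables;
# the missing measurement for those rooms comes out as NaN.
report = room_means(warm, "temp_diff").merge(
    room_means(carbon, "co2_diff"), on="room", how="outer"
)
```

With an inner merge (the default), rooms 202 and 303 would silently drop out of the report.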
Today, I was able to test the work I had done yesterday -- figuring out how to run a cron job and actually running one successfully are two very different things! However, it is now actually up and running! The major changes mostly involved specifying the I/O file locations in more detail, both in the programs and in the terminal. Overall, I'm really happy with my progress and can't wait to add to this further!
After finishing Course 1 from Coursera, I decided to take a break from the courses and return to the project-based portion of the independent study (that is, developing software directly for Energize). Some troubleshooting was required to start up the server, after which I was able to figure out how to make the program run there (by using the crontab command in the terminal). Now, it mainly needs testing!
I haven't posted in the last couple of days because I have not actually had class -- as it is a requirement for all students taking enriched math, I competed in the annual Math Olympiad on Thursday, which meant I had to miss some class.
That said, I have been working on the Week 4 project (the last one in this course), which I just finished today! After taking the course, I feel I have a much better understanding of advanced Python, the Pandas and NumPy libraries, and Data Science as a whole.
Today, I began Week 4 of the Coursera course, and watched all the lectures for the week. Additionally, I was assigned to read an interesting article on the idea that many scientists unintentionally "p-hack" data until it displays the results they're looking for. (It's called p-hacking because of the p-value: the probability of seeing results at least as extreme as the observed ones if only random chance were at work.) Then, I wrote a response to this article. I'm excited to apply all these skills in the upcoming course project!
Today was a half day at school, so class was pretty short. I continued working on the Coursera project. After school ended and I finished hosting the robotics bonding event, I was able to pass the project with 80% correct! Of course, I'm not done -- I'm going to try to get the rest of the problems right over the weekend -- but this was still a big milestone.
I continued with the Coursera project in class, but after I got home, I put in a couple hours and got myself 67% of the way through the project. A lot of the concepts were beginning to come more easily to me, which was really exciting. I'm trying to finish the project as soon as I can, so I can go to Saturday's hackathon!