On Saturday, I spent about an hour and a half setting up the test for the new Weekly Report and testing the warm and cold spreadsheet.
I adapted the test I had used before, which included test rooms with made-up temperature and CO2 values to reflect a variety of test cases, to fit into the historical report. Since the report produced was comprehensive, I decided to focus on temperature for that day -- everything checks out with the values I determined manually with a calculator (a process which took a decent amount of time even with the few data points I had -- that's why automation is so helpful). Next, I will test the report on carbon dioxide values and then start deploying to the server. This new version will only use cron for the fifteen-minute logging and for the two programs (task_zero and generate_historical_report) that run at the end of each week. Additionally, since school is closed, the values collected will not be meaningful; they are simply a test of the capabilities of this new report.

I also have a bit of functionality to add to the final piece of the new report, based on what I was told by Facility members in January. In the automated email, I should include the top 5 or so rooms that need attention, so that the Facility members can look at them. This is an easily reachable goal, as it simply requires sorting each DataFrame and calling DataFrame.head() to return its top 5 rows.

Recently, I had been thinking about ways we can get more people into the Energize program (since I was previously the only member in 10th grade or below, as well as the only girl). Since everyone has a lot of unexpected extra time due to COVID-19, I recruited some girls I know and set up an online "class". This week, I began teaching them about Python and Data Science so that they would be prepared to start working with Energize.
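Going back to the top-5 idea for the automated email, here is a minimal sketch of how it might look. The table, the column names (`room`, `intervals_too_warm`), and the helper `top_rooms` are all made up for illustration; the real report's DataFrames may be shaped differently.

```python
import pandas as pd

# Hypothetical problem table: names and values are invented for this example.
problems = pd.DataFrame({
    "room": ["101", "102", "203", "Gym", "Library", "Cafeteria", "305"],
    "intervals_too_warm": [3, 12, 0, 7, 9, 1, 5],
})


def top_rooms(df, column, n=5):
    """Return the n rows with the largest values in `column`, worst first."""
    # head() only takes the first n rows, so the sort has to happen first.
    return df.sort_values(column, ascending=False).head(n)


top5 = top_rooms(problems, "intervals_too_warm")
print(top5.to_string(index=False))
```

The important detail is that `head()` does no ranking of its own, so each DataFrame has to be sorted before the top rows are taken.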
About a week ago, I talked to Mr. Navkal and began recruitment. I ended up with a group of 5 girls, all in eighth and ninth grade, eager to learn or review Python and join Energize! Since then, we have had three class sessions over Zoom. As a syllabus, we have been using Codecademy's Learn Python 3 course. I'm excited to continue running the class and see where this goes!

On Wednesday and Thursday, I spent a total of about 45 minutes on the historical report.
I mainly spent this time integrating Task Three (the creation of the "daily" reports) into the main program. I also separated out what I had from Task One of the old report to create the logging program, which is now a standalone program that will run every 15 minutes. In future sessions, I need to test Task Three more comprehensively to make sure the data it produces is accurate. I also need to integrate Task Four as the final step in creating a weekly report based on historical data.

Yesterday, I spent about an hour working on the historical report. After debugging an issue with Task 0, I finished integrating Task 2 into the new system.
When creating a report from historical data, you need a lot more filtering than you do when running the numbers in real time -- you have to filter first by the week itself (selecting the week you want to report on), and then by day of the week. I had used a dictionary to successfully filter out which days were school days (this is a basic implementation which assumes that every weekday is a school day -- I still have to get access to the school calendar, somehow scrape those dates and add the updated values into the dictionary) -- but after implementing most of Task 2 and testing the results, I realized I had never actually filtered for the week I needed to select. For now, I used a simple input function to determine the start date of the selected week. (Hopefully, this will evolve into an interactive front-end where users can select the day and the parameters.) Once I had the start date, I added 7 days to make the end date, looped through all the days in between, and only set those weekdays to True in the dictionary.

After this, I finished integrating Task 2 into the generate_historical_report program. Right now, it logs a TemperatureProblemDatabase and a CarbonDioxideProblemDatabase the same way the old task_two did. (Right now, it performs Task 2 once for each day -- it should run Task 3 at the end of the loop as a way of "daily" aggregation, so as to save data in between days.)

Today, I worked for about half an hour on the historical report.
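The week-selection step from the previous entry (start date, plus 7 days, weekdays marked True) can be sketched roughly like this. The function name `school_days_for_week` is hypothetical, and this simplified version builds a dictionary for just the selected week rather than updating one that covers the whole year.

```python
from datetime import date, timedelta


def school_days_for_week(start_date):
    """Map each of the 7 days starting at start_date to True if it is a weekday."""
    end_date = start_date + timedelta(days=7)  # exclusive end of the week
    is_school_day = {}
    current = start_date
    while current < end_date:
        # weekday() is 0 for Monday through 6 for Sunday.
        # Assumption: every weekday is a school day (no calendar lookup yet).
        is_school_day[current] = current.weekday() < 5
        current += timedelta(days=1)
    return is_school_day


# March 2, 2020 was a Monday.
week = school_days_for_week(date(2020, 3, 2))
for day, in_session in week.items():
    print(day.isoformat(), in_session)
```

Once the school calendar is available, the `weekday() < 5` test is the one line that would need to change.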
My main objective was to link Task 0, which I had created on March 2nd, with the generate_historical_report program. I was able to successfully link them through the use of a filtered SQL database. Next, I worked on integrating Task 2 into the generate_historical_report program. The current setup is a for loop that traverses the data by day (right now, it runs 7 times, but in the future I will change it to a while loop that stops once the end of the week has been reached), and runs what was task_two (filtering which rooms are problematic at a certain interval) on that day's data. While I managed to integrate this into the for loop, right now it isn't actually separated out by day -- it is just running 7 times on ALL of the data. I need to fix this as well as the other issues next time.

I just realized that I never posted about my work last Monday, March 2nd. I worked on the Weekly Report for about an hour and a half.
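For the per-day problem in the previous entry (the loop running 7 times on all of the data), one possible fix is to let pandas split the data by calendar date so each pass only sees one day. This is only a sketch: the column names, the 75-degree limit, and `find_warm_rooms` (a stand-in for the task_two step) are assumptions, not the actual program.

```python
import pandas as pd

# Made-up readings; the column names are assumptions for this example.
readings = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2020-03-02 08:00", "2020-03-02 08:15",
        "2020-03-03 08:00", "2020-03-03 08:15",
    ]),
    "room": ["101", "102", "101", "102"],
    "temperature": [68, 81, 79, 70],
})


def find_warm_rooms(day_data, limit=75):
    """Stand-in for the task_two step: rooms above a hypothetical limit."""
    too_warm = day_data["temperature"] > limit
    return sorted(day_data.loc[too_warm, "room"].unique().tolist())


warm_by_day = {}
# Group rows by calendar date so each iteration receives one day's rows only.
for day, day_data in readings.groupby(readings["timestamp"].dt.date):
    warm_by_day[str(day)] = find_warm_rooms(day_data)

print(warm_by_day)
```

Grouping this way also removes the need to hard-code 7 iterations, since the loop runs once per date that actually appears in the data.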
My main goal last Monday was to create a task_zero as a precursor to the rest of the tasks currently running. I am currently trying to have data logged from all possible time intervals, but filter it before entering the report process to ensure that my systems are only calculating on relevant values (that is, when school is in session and the building climate control systems are powered on). Task 0 would allow a week of raw data to be selected (either manually by the user in an interactive front-end, or in an automated fashion each Friday), and then filtered for whether or not school is in session. I plan to use a Datetime:Boolean dictionary that has each day of the year as the key, and the boolean for whether it is a school day as the value. I implemented a basic version of this where I set the boolean value to (start_date.weekday() < 5), and filtered the values based on this dictionary. While it does successfully filter, it also takes around a minute to run, because reading from the incredibly large DataFrame and applying pd.to_datetime is quite time-consuming. In the future, I hope to integrate the actual school calendar into this dictionary as well as figure out ways to increase the efficiency of the program, if necessary.
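One thing that might help with the run time, assuming the slow part is parsing timestamps, is to convert the whole column once with an explicit format and then filter on the result. This is a sketch with made-up data and column names; whether it matches the real bottleneck would need to be measured.

```python
import pandas as pd

# Made-up raw log; the column names and timestamp format are assumptions.
raw = pd.DataFrame({
    "time": ["2020-03-02 08:00:00", "2020-03-07 08:00:00", "2020-03-06 14:15:00"],
    "temperature": [70, 65, 72],
})

# Parse the whole column in one call. Passing an explicit format spares
# pandas from having to infer the format of the strings.
raw["time"] = pd.to_datetime(raw["time"], format="%Y-%m-%d %H:%M:%S")

# Same rule as the basic school-day check: Monday (0) through Friday (4).
in_session = raw[raw["time"].dt.weekday < 5]
print(in_session)
```

The `.dt.weekday` accessor applies the weekday test to the entire column at once, which replaces a per-row dictionary lookup in this simplified case.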
Author: I'm a high school senior and programming enthusiast.