For your first milestone, you harmonized two datasets to find patients in need of screening for kidney disease. That was an example of how you can use Python to make an impact - applying a formula to data to derive insight.
Using the same data, we can generate descriptive statistics and visualizations about the population to look for patterns.
Disclaimer: The data we are analyzing is randomly generated and contains no real patient data or true data patterns.
Using your Milestone 01 script as a starting point, refactor your script in two ways:
- https://ils.unc.edu/courses/2024_fall/chip490_335/patient_demographics.csv
- https://ils.unc.edu/courses/2024_fall/chip490_335/cmp.json

Create a Jupyter Notebook that provides the following analysis about the output population (with eGFR <= 65):
Upload to Canvas a .zip file named descriptive-stats-onyen.zip, where onyen is your onyen, containing your script(s), notebooks, and requirements.txt file. Make sure you include a notebook named main.ipynb, which I will run to see your results.
Alternatively, you can provide me with a link to a GitHub repository.
You should expect that I will do the following:
- Run pip install -r requirements.txt to install the dependencies you've specified
- Run the main.ipynb file

I'll be grading your assignment on output correctness and code readability, following the best practices we've discussed so far. Think about how you can best abstract your code using functions and/or classes, and how you can best organize your code using modules and packages.
As always, make use of Piazza to ask any questions and work with your fellow students. Feel free to reach out to me directly via Canvas or email to schedule office hours, or stay after class to talk through any questions you may have. I'm happy to be your sounding board.
| Category | Range |
|---|---|
| Underweight | < 18.5 |
| Normal Weight | 18.5 - 24.9 |
| Overweight | 25 - 29.9 |
| Obese | ≥ 30 |
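
The table above can be turned into a small helper function. The sketch below assumes the ranges are BMI values in kg/m² and treats the upper bounds as exclusive cutoffs at 25 and 30, so a value such as 24.95 does not fall into the gap between 24.9 and 25. The function name `bmi_category` and the sample values are hypothetical and are not taken from the course datasets.

```python
from collections import Counter


def bmi_category(bmi: float) -> str:
    """Map a BMI value (assumed kg/m^2) to a category label from the table."""
    if bmi < 18.5:
        return "Underweight"
    if bmi < 25:  # table lists 18.5 - 24.9; cutoff at 25 avoids a gap
        return "Normal Weight"
    if bmi < 30:  # table lists 25 - 29.9
        return "Overweight"
    return "Obese"  # 30 and above


# Hypothetical BMI values for illustration only.
sample_bmis = [17.9, 22.4, 24.95, 27.0, 31.2]

# Count how many sample values fall into each category.
category_counts = Counter(bmi_category(b) for b in sample_bmis)
print(category_counts)
```

Keeping the categorization in its own function means it can live in a module and be imported into main.ipynb, which keeps the notebook focused on the analysis itself.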