PyData NYC 2019 tutorial, Nov 6, 2019
Carlos Afonso ( linkedin.com/in/carlos-afonso-w )
pydata-nyc-2019-tutorial.ipynb ( .ipynb ) ( .html )
Thanks to:
Context:
Goals:
Passionate about using Data Science to solve important problems
Data Scientist with diverse industry experience and multidisciplinary STEM background
Created the open-source project "Visualizing the 2019 Measles Outbreak".
Connect with me on linkedin.com/in/carlos-afonso-w
Initial: The large majority of NYC measles cases were in my neighborhood (Williamsburg, Brooklyn).
Technical / Practical: Opportunity to learn, practice, and showcase fundamental and advanced data visualization skills.
General: Example of a small data project that can help people understand an important issue.
Measles is a very contagious disease caused by a virus. It spreads through the air when an infected person coughs or sneezes.
Measles can be prevented with the MMR (measles, mumps, rubella) vaccine. The CDC recommends that children get two doses of the MMR vaccine.
The MMR vaccine is very safe and effective at preventing measles.
Reference: Centers for Disease Control and Prevention (CDC): https://www.cdc.gov/vaccines/vpd/measles/index.html
(1963) Before the measles vaccination program started in 1963, an estimated 3 to 4 million people got measles each year in the United States.
Since then, widespread use of measles virus-containing vaccine has led to a greater than 99% reduction in measles cases compared with the pre-vaccine era.
(2000) Measles was declared eliminated from the US in 2000, thanks to an effective vaccination program.
However, measles is still common in other countries. Unvaccinated people continue to get measles while abroad, bring the disease into the United States, and spread it to others.
(2019) The US is amid its largest measles outbreak since 1992, with 1,250 (preliminarily) confirmed cases as of Oct 3, 2019.
References:
Of all the affected areas, NYC provides the best data about the 2019 measles outbreak.
The NYC Health Measles webpage provides raw data about the number of measles cases by:
Screenshots (from Nov 5, 2019), in case we can't access the website during the tutorial:
Data versions:
Notes:
Let's read and have a quick look at the data:
import os
import pandas as pd

# Folder that holds the final NYC Health CSV files (relative to this notebook)
data_dir = os.path.join('..', 'data', 'nyc-health', 'final')

# Measles cases by age
pd.read_csv(os.path.join(data_dir, 'nyc-measles-cases-by-age.csv'))

# Measles cases by vaccination status
pd.read_csv(os.path.join(data_dir, 'nyc-measles-cases-by-vaccination-status.csv'))

# New measles cases by month
pd.read_csv(os.path.join(data_dir, 'nyc-new-measles-cases-by-month.csv'))

# Confirmed measles cases by neighborhood
pd.read_csv(os.path.join(data_dir, 'nyc-measles-confirmed-cases-by-neighborhood.csv'))
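For a quick look at all four files at once, something like the following could work. It is only a sketch: `load_tables` and `summarize` are hypothetical helpers, not part of the project, and the code makes no assumption about the column names inside each CSV. It reads the same files as the cells above and reports the row count and column names for each file it finds.

```python
import os
import pandas as pd

# File names as used in the cells above
FILE_NAMES = [
    'nyc-measles-cases-by-age.csv',
    'nyc-measles-cases-by-vaccination-status.csv',
    'nyc-new-measles-cases-by-month.csv',
    'nyc-measles-confirmed-cases-by-neighborhood.csv',
]


def load_tables(data_dir, file_names=FILE_NAMES):
    """Hypothetical helper: read each CSV found in data_dir into a DataFrame."""
    tables = {}
    for name in file_names:
        path = os.path.join(data_dir, name)
        # Skip files that are missing instead of raising an error
        if os.path.exists(path):
            tables[name] = pd.read_csv(path)
    return tables


def summarize(tables):
    """Return one (file name, row count, column names) tuple per table."""
    return [(name, len(df), list(df.columns)) for name, df in tables.items()]


if __name__ == '__main__':
    # Same relative path as in the cells above
    data_dir = os.path.join('..', 'data', 'nyc-health', 'final')
    for name, n_rows, columns in summarize(load_tables(data_dir)):
        print(name, n_rows, columns)
```

Keeping the tables in a dictionary keyed by file name makes it easy to pass a single table to a plotting function later.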
nyc-new-measles-cases-by-month-final.ipynb ( .ipynb ) ( .html )
[Figures: Default and Improved versions of the visualization]
nyc-measles-cases-by-age-final.ipynb ( .ipynb ) ( .html )
[Figures: Default and Improved versions of the visualization]
nyc-measles-cases-by-vaccination-status-final.ipynb ( .ipynb ) ( .html )
[Figures: Default and Improved versions of the visualization]
nyc-measles-cases-by-neighborhood-final.ipynb ( .ipynb ) ( .html )
[Figures: SVG and PNG versions of the visualization]
All data visualizations are shown on the project homepage (hosted with GitHub Pages): https://carlos-afonso.github.io/measles
Reference: Working with GitHub Pages
# Export this notebook as a static HTML page
os.system('jupyter nbconvert --to html pydata-nyc-2019-tutorial.ipynb')
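The `os.system` call above returns an exit status that the cell does not check. A possible alternative, sketched here with the standard-library `subprocess` and `shutil` modules, passes the command as an argument list and reports whether the export succeeded. The helper names `nbconvert_command` and `export_notebook` are hypothetical; the command itself is the same `jupyter nbconvert --to html` invocation used above.

```python
import shutil
import subprocess


def nbconvert_command(notebook, to='html'):
    """Build the nbconvert command as an argument list (no shell involved)."""
    return ['jupyter', 'nbconvert', '--to', to, notebook]


def export_notebook(notebook, to='html'):
    """Run nbconvert; return True on success, False if jupyter is missing or fails."""
    # shutil.which returns None when the jupyter executable is not on PATH
    if shutil.which('jupyter') is None:
        return False
    result = subprocess.run(nbconvert_command(notebook, to))
    return result.returncode == 0
```

In the notebook this would be called as `export_notebook('pydata-nyc-2019-tutorial.ipynb')`. Save the notebook first, because nbconvert reads the file on disk.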