New York City Subway Usage

#dataset

Todd Schneider put together an insightful look at how turnstile usage in the New York City subway system has dropped drastically since the beginning of the COVID-19 pandemic (from over 5 million to under 1 million! 😳). He also built a simple but helpful ETL Rails application to pull the data and load it into a PostgreSQL database.

Dashboard: New York City Subway Usage (Todd W Schneider)
Code: NYC Subway Turnstile Data (GitHub)

DoltHub: GitHub for Data

#dataset

DoltHub is a relatively new service that brings Git-style collaboration to data. From the website, "Dolt and DoltHub provide cell-wise version control and robust collaboration tools to deliver a true git for data experience comparable to code collaboration tools."

https://www.dolthub.com/

Building a Career in Data Science

#podcast

Since I'm currently working toward shifting my career into data science, this podcast episode was particularly helpful. Emily Robinson discusses the best options for building a data science portfolio that will attract employers. The TL;DR version is that employers don't want to read code, so it's better to blog or build a simple product.

Building a Career in Data Science with Emily Robinson (Towards Data Science)

COVID-19

Since the COVID-19 pandemic is still very much in our collective attention, I'm going to keep this callout going. The more data we share about this outbreak, the better prepared future generations will be.

COVID-19 Kaggle community contributions - Risk factors, transmission, and incubation period analysis by the Kaggle community using research papers and machine learning. Overall, hypertension and heart issues seem to be the two main risk factors for complications due to infection. The incubation period seems to vary greatly and to depend on too many factors to pin down.

Dashboards (note: I've shared these in previous issues, but want to keep them readily available):
Johns Hopkins ArcGIS Dashboard
United States By County Dashboard
nCoV2019.live Dashboard