Portfolio

Someday, I’d love to run a proper blog and put as much energy into writing up my work as I did into creating it. In the interest of not letting perfect be the enemy of the good, here’s a quick and dirty collection of teaching, writing, and other projects I’ve created over the years, loosely categorized.

Teaching - Data Science and Analytics

Since Fall 2019, I have been a TA for ten graduate-level courses covering topics in data science, statistics, and data visualization at the Georgetown University McCourt School of Public Policy. In practice, this means I have spent weekly one-on-one time with students, coaching them on implementing code in R and Python to apply methods discussed in class… and coaching them on debugging that code too.

I was incredibly excited when this culminated in me being asked to teach Data Visualization for Data Science in Fall 2023 as an adjunct professor for my former graduate program, a Master’s degree in Data Science for Public Policy. I had full control over the curriculum and course design (see syllabus here), and ended up emphasizing primarily R and other open source tools, with a touch of Tableau as a point of comparison.

This led to a fantastic opportunity to teach beginning R and data visualization students through the McCourt Data Visualization elective in Fall 2024, also targeted at Master’s students and also with full control over the course design (see syllabus here). I love working with students and hope to teach more in the future!

Teaching - Data Privacy

At the Urban Institute, I taught data privacy methodology and implementation of data privacy methods to a variety of federal, state, and local government stakeholders. Here is an example of publicly available materials for a training given to the Internal Revenue Service (IRS).

I’ve also given trainings on data privacy at conferences, including the Women in Statistics and Data Science conference (through the American Statistical Association) and the American Association for Public Opinion Research (AAPOR) conference.

Writing - Data Science and Analytics

I wrote a blog post in graduate school about creating custom themes in ggplot2. I’ve now heard from at least four people in my life who needed to learn this and came across the blog post, which I’m sure speaks mostly to how fantastic it is and has nothing to do with Google’s SEO algorithms prioritizing content from people you know…

I also made key contributions to our Reproducibility at Urban webpage, with tutorials on using Git and GitHub, and to documentation for R packages written by our team for implementing privacy methodology.

Writing and Projects - Data Privacy

While at the Urban Institute, I led and contributed primarily to projects that used data science and statistical methodology to enhance privacy in publicly released data. A lot of that work was accompanied by writing! Here are some things that I’ve authored.

Introductory educational materials

These are a good place to start if you’ve been reading this whole page wondering exactly what I’m talking about when I say “data privacy”:

  • Data Privacy Educational Tool is a Tableau tool aimed at community-based organizations that introduces many key privacy concepts, basic methods, and equity tradeoffs in the data privacy space.

  • Understanding Synthetic Data is a fact sheet about synthetic data, a specific data privacy method applied frequently by my team.

Blog posts

Projects - Data Science and Analytics

Since my current role is in the federal government, I can’t post any projects online; it would be hard to do anyway, since this role involves much less open-source software than my role at Urban. I prefer to meet stakeholders where they are when implementing technical projects, so I’m okay with this, but I’m always looking for opportunities to add automation or documentation to an existing business process.

Given that, my most recent data science projects are already listed under the “Writing and Projects - Data Privacy” section (sneaky double dipping!). The methods my team and I applied are generally advanced statistical modeling and/or machine learning methods, so they definitely fall under the umbrella of data science. (When we were done applying those methods, we also had to evaluate and visualize the impact, another key part of the data science pipeline.)

A few other relevant projects…

I used to work as a data analyst at the Universal Service Administrative Company, where I created executive-level reporting for the four USAC programs using primarily Tableau. Because these were mostly internal reports, I can’t share them here, so you’ll have to take my word for it. I also led the public reporting for the Emergency Broadband Benefit Program, a $3.2 billion congressional appropriation, which has since been replaced by the Affordable Connectivity Program.

In my role as a Graduate Fellow for the Data Science for Public Good program (a program through the Biocomplexity Institute at the University of Virginia), I managed four interns across two summer data science projects to assist public sector clients with relevant analyses. We eventually published web products for both projects:

Also, when I was still in grad school, our cohort partnered with the Washington, DC Department of Consumer and Regulatory Affairs (DCRA), now the Department of Buildings. We used supervised and unsupervised machine learning methods to investigate potential improvements and risks to their housing inspection processes. The Washington Post wrote an article about it, and I was quoted in another article by a local news station.