Skip to content

Instructions for Replicating

Step 1: Clone the repository

Use the git command line to clone the physician to voter repository. This assumes that you have your ssh keys setup. If you don't, check out the instructions here.

git clone git@github.com:Yale-Medicaid/physician_to_voter.git
cd physician_to_voter

Step 2: Initialize the Python virtual environment and install DVC

This step is needed to ensure that you have access to Data Version Control, which is used to load the raw voter files from a cache on the server.

Tip

If you are on linux or MacOS, you should replace line 2 with source venv/bin/activate

python -m venv venv
source venv/Scripts/activate
pip install -r requirements.txt

Step 3: Retrieve cached versions of the voter file using DVC

This step is easy! Simply run:

dvc pull

This pulls copies of the voter files and physician files from a secure backup folder. This code will only work on the server.

Step 4: Run the pipeline

The R-dependencies are automatically managed by renv, which will install any missing packages on startup. This means that you can start the code pipeline by simply running the following lines in an R session running inside the root directory.

> targets::tar_make()

This will start the process to link the physician and voter files. The linked dataset can be accessed by running the following command in R:

> targets::tar_read(rf_match_data)