This function subsets the HIT-COVID database after it has been loaded with hit_pull. If no filtering arguments are supplied, all national and admin 1 unit data are returned. The data can be filtered by continent, country, admin 1 unit, locality, or intervention group.

hit_filter(
  hit_data,
  continent = NULL,
  country = NULL,
  admin1 = NULL,
  locality = NULL,
  intervention_group = NULL,
  include_national = TRUE,
  include_admin1 = TRUE,
  include_locality = FALSE,
  usa_county_data = c("include", "exclude", "restrict_to"),
  remove_columns = TRUE
)

Arguments

hit_data

the full HIT-COVID database pulled from GitHub using hit_pull

continent

vector of continent names to filter the data to; each element should be one of "Antarctica", "Asia", "Europe", "Africa", "Oceania", "North America", "South America"

country

vector of country ISO 3166-1 alpha-3 codes to filter the data to (see geo_lookup for concordance of country codes to names)

admin1

vector of the first administrative unit GID codes to filter the data to (see geo_lookup for concordance of admin 1 codes to names).

locality

vector of the names of localities to include (this is a free text field)

intervention_group

vector of intervention groups to filter the data to (see intervention_lookup column "intervention_group" or run get_interventions for options)

include_national

logical indicating if national-level data should be included (default is TRUE)

include_admin1

logical indicating if admin1-level data should be included (default is TRUE)

include_locality

logical indicating if locality data should be included (default is FALSE)

usa_county_data

character string indicating how to deal with USA county-level data: one of "include", "exclude" or "restrict_to" (default is "exclude").

remove_columns

a logical indicating if columns with only missing values should be removed (default is TRUE)

Value

A data frame with the columns described in the GitHub README (https://github.com/HopkinsIDD/hit-covid), excluding any columns that contain only missing values if remove_columns = TRUE.
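The effect of remove_columns can be pictured with a small base R sketch. This is not the package's implementation; the toy data frame and the helper drop_empty_columns are hypothetical and only illustrate what "columns with only missing values" means.

```r
# Toy data frame (made up for illustration); "locality" holds only missing values.
toy <- data.frame(
  country = c("KEN", "CHN"),
  locality = c(NA_character_, NA_character_),
  intervention_group = c("closed_border", NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep a column only if it has at least one non-missing value.
drop_empty_columns <- function(data) {
  has_values <- colSums(!is.na(data)) > 0
  data[, has_values, drop = FALSE]
}

trimmed <- drop_empty_columns(toy)
names(trimmed)
```

A column with some missing values, such as intervention_group here, is kept; only columns that are missing in every row are dropped.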

Details

All filtering arguments are optional. If none are provided, the entire database of national and admin 1 unit data will be returned. Any or all of the arguments can be specified, allowing filtering by location, intervention type, or both. The locality field is used infrequently in this database, so this filtering argument should only be used if the database is known to contain a certain locality (lower than the admin1 level). The dataset is filtered in the following order: continent, country, admin1, locality, intervention group.
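The idea of optional filters applied in a fixed order can be sketched in base R. This is an illustration under stated assumptions, not the hit_filter source: the toy data frame, its column names, the "other_group" value, and the helpers apply_filter and toy_filter are all hypothetical.

```r
# Toy stand-in for the database; columns and values are illustrative assumptions.
toy <- data.frame(
  continent = c("Africa", "Africa", "Asia", "Asia"),
  country = c("KEN", "NGA", "CHN", "CHN"),
  intervention_group = c("closed_border", "other_group",
                         "closed_border", "other_group"),
  stringsAsFactors = FALSE
)

# One filter step; a NULL filter leaves the data unchanged.
apply_filter <- function(data, column, values) {
  if (is.null(values)) {
    return(data)
  }
  data[data[[column]] %in% values, , drop = FALSE]
}

# Hypothetical helper that applies the steps in a fixed order
# (continent, then country, then intervention group).
toy_filter <- function(data, continent = NULL, country = NULL,
                       intervention_group = NULL) {
  data <- apply_filter(data, "continent", continent)
  data <- apply_filter(data, "country", country)
  apply_filter(data, "intervention_group", intervention_group)
}

all_rows <- toy_filter(toy)
china_borders <- toy_filter(toy, country = "CHN",
                            intervention_group = "closed_border")
```

With no arguments every row comes back; each argument that is supplied narrows the result further.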

If filtering to certain admin1 units, national data will be included by default, as these policies often carry down to the admin units. If only the admin1 data is desired, set include_national to FALSE. Conversely, when filtering to certain countries or continents, all admin1 information for those countries will be included by default. If only national data is desired, set include_admin1 to FALSE. Because the locality field is rarely used, the locality (lower than admin1) data is excluded by default unless a locality is specified for filtering or include_locality is set to TRUE. When filtering by continent, countries in "Eurasia" are included when filtering to either "Europe" or "Asia".
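The include_national behavior when filtering to admin1 units can be sketched on toy data. This is a sketch only: the data frame, the convention that national rows have a missing admin1 value, the choice to keep national rows of the parent country, the placeholder code "USA.33_1", and the helper toy_admin1_filter are all assumptions made for illustration.

```r
# Toy data (made up); national rows are assumed to have admin1 = NA.
toy <- data.frame(
  country = c("USA", "USA", "USA", "CAN"),
  admin1 = c(NA, "USA.31_1", "USA.33_1", NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the requested admin1 rows and, optionally,
# the national rows of the countries those units belong to.
toy_admin1_filter <- function(data, admin1, include_national = TRUE) {
  in_units <- !is.na(data$admin1) & data$admin1 %in% admin1
  if (!include_national) {
    return(data[in_units, , drop = FALSE])
  }
  parents <- unique(data$country[in_units])
  is_parent_national <- is.na(data$admin1) & data$country %in% parents
  data[in_units | is_parent_national, , drop = FALSE]
}

with_national <- toy_admin1_filter(toy, "USA.31_1")
only_admin1 <- toy_admin1_filter(toy, "USA.31_1", include_national = FALSE)
```

Here with_national holds the USA national row plus the "USA.31_1" row, while only_admin1 holds the "USA.31_1" row alone.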

As part of the larger effort, there was a special project to collect some USA county-level data. If you are interested in only the data from that project, set usa_county_data to "restrict_to". To exclude the USA county-level data, set usa_county_data to "exclude". To include the USA county-level data along with other records, set usa_county_data to "include". Note that continent, country, and admin1 filtering take precedence, so which USA county data is included will depend on these filtering arguments.
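The three usa_county_data modes can be pictured with a toy example. This is not how the package identifies county records; the logical usa_county column and the helper toy_county_filter are hypothetical, and the mode is passed explicitly so the sketch makes no claim about the default.

```r
# Toy data (made up); usa_county is an assumed flag marking county-project rows.
toy <- data.frame(
  country = c("USA", "USA", "FRA"),
  usa_county = c(TRUE, FALSE, FALSE),
  stringsAsFactors = FALSE
)

# Hypothetical helper showing the three modes side by side.
toy_county_filter <- function(data, mode) {
  mode <- match.arg(mode, c("include", "exclude", "restrict_to"))
  switch(mode,
    include = data,
    exclude = data[!data$usa_county, , drop = FALSE],
    restrict_to = data[data$usa_county, , drop = FALSE]
  )
}

included <- toy_county_filter(toy, "include")
excluded <- toy_county_filter(toy, "exclude")
restricted <- toy_county_filter(toy, "restrict_to")
```

In this sketch "include" keeps every row, "exclude" drops the flagged row, and "restrict_to" keeps only the flagged row.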

Examples

# Pulling HIT-COVID database
hit_data <- hit_pull()

# Filtering to Africa
africa <- hit_filter(hit_data, continent = "Africa")

# Filtering to border closures in China
china <- hit_filter(hit_data, country = "CHN", intervention_group = "closed_border")

# Filtering to New Jersey state and national data
nj <- hit_filter(hit_data, admin1 = "USA.31_1")

# Filtering to just New Jersey state-level data
nj <- hit_filter(hit_data, admin1 = "USA.31_1", include_national = FALSE)

# Filtering to all national data
national <- hit_filter(hit_data, include_admin1 = FALSE)

# Filtering to just USA county data
usa_county <- hit_filter(hit_data, usa_county_data = "restrict_to")

# Adding USA county data (default is to exclude)
with_county <- hit_filter(hit_data, usa_county_data = "include")