API Reference
EpiSpread
- class epispread.skeleton.EpiSpread
Bases: object
- classmethod _find_available_graphs(number_columns, iso_columns, time_series_columns)
find_available_graphs takes the types of columns a dataset has and, based on those, returns a list of the visualizations that can be plotted from the dataset's data.
- Args:
number_columns (list[str]): list of column names within the dataset that contain number values
iso_columns (list[str]): list of column names within the dataset that contain ISO values
time_series_columns (list[str]): list of column names within the dataset that contain date values
- Returns:
list[str]: list of names of types of graphs that can be plotted with the available data types.
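The selection logic is not spelled out above, so the following is only a minimal sketch of how such a check might look. The eligibility rules and the graph names "HeatMap" and "TimeSeries" as return values are assumptions for illustration, as are the example column names.

```python
def find_available_graphs(number_columns, iso_columns, time_series_columns):
    """Return the graph types that the given column types could support.

    The rules below are assumed for illustration; the real method may differ.
    """
    graphs = []
    # Assumed rule: a heat map needs a numeric value and a country code column.
    if number_columns and iso_columns:
        graphs.append("HeatMap")
    # Assumed rule: a time series needs a numeric value and a date column.
    if number_columns and time_series_columns:
        graphs.append("TimeSeries")
    return graphs


# Hypothetical column names, used only to exercise the sketch.
print(find_available_graphs(["New_cases"], ["Country_code"], ["Date_reported"]))
print(find_available_graphs(["New_cases"], [], []))
```

Returning an empty list when nothing can be plotted lets the caller decide how to report that to the user.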
- classmethod _find_time_series(df, str_column, date_columns)
Takes a given column of the DataFrame and checks whether it is eligible for a time series based on any of the date columns. If so, it returns the first valid data column. If not, it returns None.
- Args:
df (DataFrame): pandas DataFrame we're parsing
str_column (str): name of the qualitative column we're testing
date_columns (list[str]): list of names of date columns within the DataFrame.
- Returns:
str: returns the name of the valid DataFrame column.
- classmethod _get_files(urls, file_path='')
Downloads the four specified URLs from https://covid19.who.int/data and stores them as CSV files in data_files/[file_name]. Returns 1 on failure, 0 on success.
- Args:
urls (list[str]): list of all website links to the files we want to download.
- Returns:
list[str]: list of file names that get handed to the query.
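As a rough illustration of how file names could be derived from the URLs, here is a sketch that performs no download. The helper name `local_file_names` and the way `file_path` is joined to each name are assumptions, not the package's confirmed behavior.

```python
import posixpath
from urllib.parse import urlparse


def local_file_names(urls, file_path=""):
    """Map each URL to the name of the CSV file it would be saved as.

    Hypothetical helper: how file_path is used here is an assumption.
    """
    names = []
    for url in urls:
        # The last path segment of the URL is taken as the file name.
        base_name = posixpath.basename(urlparse(url).path)
        names.append(posixpath.join(file_path, base_name) if file_path else base_name)
    return names


urls = [
    "https://covid19.who.int/WHO-COVID-19-global-data.csv",
    "https://covid19.who.int/who-data/vaccination-data.csv",
]
print(local_file_names(urls, "data_files"))
```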
- classmethod _parse_columns(df)
This function parses the columns of a given DataFrame and separates them into those with number values, those with dates, and those that are string entries.
- Args:
df (DataFrame): pandas DataFrame to be parsed
- Returns:
list[str]: Names of columns with number values
list[str]: Names of columns with date values
list[str]: Names of columns with string values
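The three-way split can be sketched in plain Python. This is a stand-in that works on a mapping of column name to string values (as read from a CSV file) rather than on a pandas DataFrame. The function names and the date format `%Y-%m-%d` are assumptions.

```python
from datetime import datetime


def _is_number(text):
    """Return True if the text parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def _is_date(text, date_format="%Y-%m-%d"):
    """Return True if the text matches the assumed date format."""
    try:
        datetime.strptime(text, date_format)
        return True
    except ValueError:
        return False


def parse_columns(columns):
    """Split column names into (number, date, string) lists.

    `columns` maps each column name to a list of string values.
    """
    number_cols, date_cols, string_cols = [], [], []
    for name, values in columns.items():
        if values and all(_is_number(v) for v in values):
            number_cols.append(name)
        elif values and all(_is_date(v) for v in values):
            date_cols.append(name)
        else:
            # Anything that is neither all-numeric nor all-dates is a string column.
            string_cols.append(name)
    return number_cols, date_cols, string_cols


# Illustrative data only.
sample = {
    "Date_reported": ["2020-01-03", "2020-01-04"],
    "Country_code": ["AF", "AF"],
    "New_cases": ["0", "5"],
}
print(parse_columns(sample))
```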
- classmethod _read_data(file_name)
read_data takes the name of a CSV file containing a dataset and reads it into a pandas DataFrame. It also reads the built-in "world" file from the geopandas (gpd) library into a GeoDataFrame.
- Args:
file_name (str): name of file containing dataset
- Returns:
(DataFrame, GeoDataFrame): Tuple of the pandas DataFrame of the data you entered, and the world GeoDataFrame
- classmethod run_query()
This function is the only function that should be called by a user. run_query runs a user prompt in order to automate graph setup and creation.
- Returns:
HeatMap: returns instance of HeatMap class if a HeatMap was plotted
TimeSeries: returns instance of TimeSeries class if a TimeSeries was plotted.
- urls = ['https://covid19.who.int/WHO-COVID-19-global-data.csv', 'https://covid19.who.int/WHO-COVID-19-global-table-data.csv', 'https://covid19.who.int/who-data/vaccination-data.csv', 'https://covid19.who.int/who-data/vaccination-metadata.csv']
- epispread.skeleton.main()
HeatMap
- class epispread.graph_classes.heat_map.HeatMap(df, world, plot_column, start_date, iso_column, date_column, ts_flag=0)
Bases: object
- _add_iso3(df)
Takes in the given dataframe and adds a column of ISO3 codes based on the existing column of ISO2.
- Args:
df (DataFrame): The DataFrame to add to.
- Returns:
DataFrame: The inputted DataFrame with the added column.
- _filter_single_date(date)
Filters the DataFrame associated with the class instance by the specified date.
- Args:
date (str): The date to filter on
- Returns:
DataFrame: The appropriately filtered DataFrame.
- _iso2_to_iso3(iso2_codes)
Takes a list of ISO2 codes and returns a list of ISO3 codes using the coco package.
- Args:
iso2_codes (list): list of iso2 codes
- Returns:
list: list of iso3 codes
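To show the shape of the conversion without depending on the coco package, the sketch below uses a small hand-written lookup table as a stand-in. The table, the function name, and the `missing` placeholder are assumptions. The real method delegates to coco.

```python
# Tiny illustrative lookup table; the real method relies on the coco package.
ISO2_TO_ISO3 = {"AF": "AFG", "BR": "BRA", "DE": "DEU", "US": "USA"}


def iso2_to_iso3(iso2_codes, missing="not found"):
    """Convert a list of ISO2 codes to ISO3 codes, preserving order.

    Codes absent from the table map to the `missing` placeholder (assumed).
    """
    return [ISO2_TO_ISO3.get(code.upper(), missing) for code in iso2_codes]


print(iso2_to_iso3(["us", "DE", "XX"]))
```

Keeping the output the same length as the input matters here, because the result is added back to the DataFrame as a new column.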
- _merge_manager(date)
If necessary, converts the iso2 to iso3 in either DataFrame so they match. On the basis of the matching iso3 column, merges the two DataFrames and plots the resulting combination.
- Args:
date (str): The required date, in format ‘yy-mm-dd’
- Returns:
(list of Line2D) : A list of lines representing the plotted data.
- _slider_setup()
Sets up the slider on the graph.
- Returns:
matplotlib.widgets.Slider: A slider representing a floating point range.
- _update(time_offset)
This is a callback function that replots the graph whenever time_offset is updated, i.e. whenever the slider on the map is moved. It takes the time offset integer, converts it to a real datetime offset, adds it to the starting date, and converts the result back to a string to pass to the other methods.
- Args:
time_offset (int): The associated number value of the slider, created in slider_setup.
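The date arithmetic described for this callback can be sketched as follows. The sketch assumes the slider value counts whole days from the start date and that dates are formatted as `%Y-%m-%d`. Neither assumption is confirmed above, and `offset_to_date` is a hypothetical name.

```python
from datetime import datetime, timedelta


def offset_to_date(start_date, time_offset, date_format="%Y-%m-%d"):
    """Add a slider offset (assumed to be in days) to a start date string."""
    start = datetime.strptime(start_date, date_format)
    # Slider widgets typically report floats, so truncate to whole days first.
    shifted = start + timedelta(days=int(time_offset))
    return shifted.strftime(date_format)


print(offset_to_date("2020-01-03", 30))
```

The resulting string is what a callback of this kind would hand to the filtering and merging methods before redrawing.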
- plot()
This function should be the only one getting called by the user. Will plot the merged DataFrame that results from the init method, along with a slider to show the progression of time.
TimeSeries
- class epispread.graph_classes.time_series.TimeSeries(df, date_col, line_col, indep_col)
Bases: object
- plot()
plot first converts the values in the “date_col” column of the dataframe to datetime, and then calls pandas.pivot_table to plot each of the “line_col” columns by date and their “indep_col” variable.