API Reference

EpiSpread

class epispread.skeleton.EpiSpread

Bases: object

classmethod _find_available_graphs(number_columns, iso_columns, time_series_columns)

find_available_graphs takes the types of columns a dataset has and, based on those, returns a list of the visualizations that can be graphed from the dataset’s data.

Args:

number_columns (list[str]): list of column names within dataset that contain number values

iso_columns (list[str]): list of column names within dataset that contain ISO values

time_series_columns (list[str]): list of column names within dataset that contain date values

Returns:

list[str]: list of names of types of graphs that can be plotted with the available data types.
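
The rule that maps column types to graph names is not spelled out in this reference. The following is a minimal sketch, assuming a heat map needs number and ISO columns and a time series needs number and date columns. The function name, the returned graph names, and the sample column names are illustrative assumptions, not the library's actual values.

```python
def find_available_graphs_sketch(number_columns, iso_columns, time_series_columns):
    """Hypothetical stand-in for EpiSpread._find_available_graphs."""
    graphs = []
    # Assumption: a heat map needs values to colour and ISO codes to place them.
    if number_columns and iso_columns:
        graphs.append("HeatMap")
    # Assumption: a time series needs values to plot and dates to plot them against.
    if number_columns and time_series_columns:
        graphs.append("TimeSeries")
    return graphs


# Sample column names are illustrative only.
print(find_available_graphs_sketch(["New_cases"], ["Country_code"], ["Date_reported"]))
print(find_available_graphs_sketch(["New_cases"], [], ["Date_reported"]))
```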

classmethod _find_time_series(df, str_column, date_columns)

This function takes a column of the DataFrame and checks whether it is eligible for a time series based on any of the date columns. If so, it returns the first valid data column. If not, it returns None.

Args:

df (DataFrame): pandas DataFrame we’re parsing

str_column (str): name of the qualitative column we’re testing

date_columns (list[str]): list of names of date columns within the dataframe.

Returns:

str: name of the valid DataFrame column, or None if no column is valid.

classmethod _get_files(urls, file_path='')

This function downloads the 4 specified URLs from https://covid19.who.int/data and stores them as CSV files in data_files/[file_name]. Returns 1 on failure, 0 on success.

Args:

urls (list[str]): list of all website links to the files we want to download.

Returns:

list[str]: list of file names that get handed to the query.
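
A download step of this kind can be sketched with the standard library alone. This is an illustration under stated assumptions, not the library's implementation: the helper names, the default "data_files" directory, and the choice to name each file after the last segment of its URL are all assumptions.

```python
import os
from urllib.parse import urlparse
from urllib.request import urlretrieve


def local_name(url, file_path="data_files"):
    """Map a URL to a local path named after the last segment of the URL."""
    return os.path.join(file_path, os.path.basename(urlparse(url).path))


def get_files_sketch(urls, file_path="data_files"):
    """Hypothetical stand-in for EpiSpread._get_files: download each URL to disk."""
    os.makedirs(file_path, exist_ok=True)
    saved = []
    for url in urls:
        target = local_name(url, file_path)
        urlretrieve(url, target)  # raises urllib.error.URLError on network failure
        saved.append(target)
    return saved


# No download happens here; this only shows where a file would be stored.
print(local_name("https://covid19.who.int/WHO-COVID-19-global-data.csv"))
```

Calling get_files_sketch(EpiSpread.urls) would attempt the actual downloads, so it needs network access.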

classmethod _parse_columns(df)

This function parses the columns of a given DataFrame and separates them into those with number values, those with dates, and those that are string entries.

Args:

df (DataFrame): pandas DataFrame to be parsed

Returns:

list[str]: Names of columns with number values

list[str]: Names of columns with date values

list[str]: Names of columns with string values
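
One way to make this three-way split is shown below. It is a sketch only: the rule that a non-numeric column counts as a date column when every value parses as a date is an assumption, and the function and sample column names are illustrative.

```python
import pandas as pd


def parse_columns_sketch(df):
    """Hypothetical stand-in for EpiSpread._parse_columns."""
    number_cols, date_cols, string_cols = [], [], []
    for name in df.columns:
        column = df[name]
        if pd.api.types.is_numeric_dtype(column):
            number_cols.append(name)
            continue
        # Assumption: a non-numeric column is a date column when every value
        # parses as a date; anything else is treated as a string column.
        parsed = pd.to_datetime(column, errors="coerce")
        if parsed.notna().all():
            date_cols.append(name)
        else:
            string_cols.append(name)
    return number_cols, date_cols, string_cols


sample = pd.DataFrame({
    "Date_reported": ["2020-01-03", "2020-01-04"],
    "Country": ["Afghanistan", "Albania"],
    "New_cases": [0, 5],
})
print(parse_columns_sketch(sample))
```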

classmethod _read_data(file_name)

read_data takes the name of a CSV file containing a dataset and reads it into a pandas DataFrame. It also reads the built-in “world” file in the gpd library into a GeoDataFrame.

Args:

file_name (str): name of file containing dataset

Returns:

(DataFrame, GeoDataFrame): Tuple of the pandas DataFrame of the data you entered, and the world GeoDataFrame

classmethod run_query()

This is the only function that a user should call directly. run_query runs a user prompt to automate graph setup and creation.

Returns:

HeatMap: returns instance of HeatMap class if a HeatMap was plotted

TimeSeries: returns instance of TimeSeries class if a TimeSeries was plotted.

urls = ['https://covid19.who.int/WHO-COVID-19-global-data.csv', 'https://covid19.who.int/WHO-COVID-19-global-table-data.csv', 'https://covid19.who.int/who-data/vaccination-data.csv', 'https://covid19.who.int/who-data/vaccination-metadata.csv']

epispread.skeleton.main()

HeatMap

class epispread.graph_classes.heat_map.HeatMap(df, world, plot_column, start_date, iso_column, date_column, ts_flag=0)

Bases: object

_add_iso3(df)

Takes in the given dataframe and adds a column of ISO3 codes based on the existing column of ISO2.

Args:

df (DataFrame): The DataFrame to add to.

Returns:

DataFrame: The inputted DataFrame with the added column.

_filter_single_date(date)

Filters the DataFrame associated with the class instance by the specified date.

Args:

date (str): The date to filter on

Returns:

DataFrame: The appropriately filtered DataFrame.
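
Filtering by a single date can be sketched as a standalone function. The real method works on the DataFrame stored on the instance; here the DataFrame and the date column name are passed in explicitly, and both the function name and the sample data are illustrative assumptions.

```python
import pandas as pd


def filter_single_date_sketch(df, date_column, date):
    """Return only the rows whose date_column equals the given date."""
    # Parse the column so string dates and Timestamps compare consistently.
    dates = pd.to_datetime(df[date_column])
    return df[dates == pd.Timestamp(date)]


cases = pd.DataFrame({
    "Date_reported": ["2020-01-03", "2020-01-03", "2020-01-04"],
    "Country_code": ["AF", "AL", "AF"],
    "New_cases": [0, 2, 1],
})
print(filter_single_date_sketch(cases, "Date_reported", "2020-01-03"))
```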

_iso2_to_iso3(iso2_codes)

Takes a list of ISO2 codes and outputs a list of ISO3 codes using the coco package.

Args:

iso2_codes (list): list of iso2 codes

Returns:

list: list of iso3 codes
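
The shape of this conversion is list in, list out, with positions preserved. The sketch below replaces the coco package with a small hard-coded lookup table so that it runs without extra dependencies. The table, the function name, and the use of None for unknown codes are assumptions made for illustration.

```python
# Hypothetical three-entry lookup table; a real conversion covers every country.
ISO2_TO_ISO3 = {"US": "USA", "DE": "DEU", "JP": "JPN"}


def iso2_to_iso3_sketch(iso2_codes, missing=None):
    """Convert each ISO2 code to ISO3, keeping the order of the input list."""
    # Assumption: unknown codes map to a placeholder instead of raising an error.
    return [ISO2_TO_ISO3.get(code.upper(), missing) for code in iso2_codes]


print(iso2_to_iso3_sketch(["US", "de", "XX"]))
```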

_merge_manager(date)

If necessary, converts the iso2 to iso3 in either DataFrame so they match. On the basis of the matching iso3 column, merges the two DataFrames and plots the resulting combination.

Args:

date (str): The required date, in format ‘yy-mm-dd’

Returns:

list of Line2D: A list of lines representing the plotted data.
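
The merge on a shared ISO3 column can be illustrated with plain pandas DataFrames standing in for the world GeoDataFrame and the dataset. This is a sketch under assumptions: the column names "iso_a3" and "iso3", the left join, and the function name are not taken from the library.

```python
import pandas as pd


def merge_on_iso3_sketch(world, data, world_iso="iso_a3", data_iso="iso3"):
    """Join dataset rows onto the world table by matching ISO3 codes."""
    # A left merge keeps every country in the world table, even without data.
    return world.merge(data, how="left", left_on=world_iso, right_on=data_iso)


world = pd.DataFrame({
    "name": ["Germany", "Japan", "Chile"],
    "iso_a3": ["DEU", "JPN", "CHL"],
})
data = pd.DataFrame({"iso3": ["DEU", "JPN"], "New_cases": [5, 7]})
merged = merge_on_iso3_sketch(world, data)
print(merged[["name", "New_cases"]])
```

Countries with no matching row in the dataset keep their place in the result with missing values, which a map would typically draw in a neutral colour.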

_slider_setup()

Sets up the slider on the graph.

Returns:

matplotlib.widgets.Slider: A slider representing a floating point range.

_update(time_offset)

This is a callback function that replots the graph whenever time_offset is updated, that is, whenever the slider on the map is moved. It takes the time offset integer, converts it to a real datetime, adds it to the starting date, and converts the result back to a string to pass to other methods.

Args:

time_offset (int): The associated number value of the slider, created in slider_setup.
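
The date arithmetic described for this callback can be sketched with the standard library. Two assumptions are made here: the slider value counts days, and dates are written as four-digit-year strings such as 2020-01-03. The function name is illustrative.

```python
from datetime import datetime, timedelta

DATE_FORMAT = "%Y-%m-%d"  # assumption: the format actually used may differ


def offset_to_date_string(start_date, time_offset):
    """Add a slider offset (assumed to be in days) to the start date."""
    start = datetime.strptime(start_date, DATE_FORMAT)
    # int() guards against sliders that report their value as a float.
    shifted = start + timedelta(days=int(time_offset))
    return shifted.strftime(DATE_FORMAT)


print(offset_to_date_string("2020-01-03", 30))  # 2020-02-02
```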

plot()

This should be the only function called by the user. It plots the merged DataFrame that results from the init method, along with a slider to show the progression of time.

TimeSeries

class epispread.graph_classes.time_series.TimeSeries(df, date_col, line_col, indep_col)

Bases: object

plot()

plot first converts the values in the “date_col” column of the dataframe to datetime, and then calls pandas.pivot_table to plot each of the “line_col” columns by date and their “indep_col” variable.
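
The conversion and pivot described above can be sketched as follows. The roles of the arguments are an assumption: line_col is taken to hold the category that gets one line each, and indep_col the values being plotted. The function name and the sample data are illustrative.

```python
import pandas as pd


def pivot_for_time_series(df, date_col, line_col, indep_col):
    """Build a date-indexed table with one column per value of line_col."""
    frame = df.copy()
    frame[date_col] = pd.to_datetime(frame[date_col])
    # pivot_table averages duplicate (date, line) pairs by default.
    return pd.pivot_table(frame, index=date_col, columns=line_col, values=indep_col)


cases = pd.DataFrame({
    "Date_reported": ["2020-01-03", "2020-01-03", "2020-01-04", "2020-01-04"],
    "Country": ["Afghanistan", "Albania", "Afghanistan", "Albania"],
    "New_cases": [0, 2, 1, 3],
})
table = pivot_for_time_series(cases, "Date_reported", "Country", "New_cases")
print(table)
```

With matplotlib installed, table.plot() would then draw one line per country against the date index.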