Input and Output Functions (violin.in_out)

This page details the functions that handle VIOLIN's input files and output.

For more information on the types of accepted inputs, see Input and Output Files.

Functions

violin.in_out.preprocessing_model(model: str) → pandas.core.frame.DataFrame

This function checks that the model is correctly formatted and verifies that all necessary columns are present.

It accepts an executable BioRECIPE model provided in .txt, .csv, .xlsx, or .tsv format. The file’s content is converted to lower case. Additionally, a ‘Listname’ is created as a unique identifier for every element, for later indexing.

Parameters

model (str) – The name of a file containing an executable BioRECIPE model.

Returns

new_model – A formatted model dataframe.

Return type

pd.DataFrame
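
The lower-casing and column check described above can be illustrated with a minimal sketch. This is not VIOLIN's implementation: it uses only the standard library instead of pandas so that it runs without dependencies, the function name `load_lowercased` is made up for this example, and the required column names (`element name`, `element type`) are hypothetical placeholders rather than the actual BioRECIPE column set. The creation of ‘Listname’ is left out because this page does not say how it is built.

```python
import csv
import io

# Hypothetical required columns, for illustration only
REQUIRED_COLS = ("element name", "element type")

def load_lowercased(text, required=REQUIRED_COLS):
    """Parse CSV text, lower-case headers and cell values, and check columns.

    Assumes well-formed rows (no row longer than the header).
    """
    reader = csv.DictReader(io.StringIO(text))
    rows = [
        {key.strip().lower(): (value or "").strip().lower()
         for key, value in row.items()}
        for row in reader
    ]
    header = [name.strip().lower() for name in (reader.fieldnames or [])]
    missing = [col for col in required if col not in header]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    return rows

# Made-up sample content
sample = "Element Name,Element Type\nAKT1,Protein\nTP53,Gene\n"
rows = load_lowercased(sample)
```

A file that lacks one of the required columns raises a `ValueError` here; how VIOLIN itself reports such a problem is not stated on this page.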

violin.in_out.preprocessing_reading(reading: str, evidence_score_cols: Optional[dict] = None, atts: Optional[list] = None) → pandas.core.frame.DataFrame

This function imports the reading file and checks that the reading format is correct.

Parameters
  • reading (str) – The path to the machine reading spreadsheet output, or to an interaction set from a database, in BioRECIPE format. Accepted file types: .txt, .csv, .tsv, .xlsx.

  • evidence_score_cols (dict) – The column headings used to identify identical interactions.

  • atts (list) – A list of additional attributes available in the interaction set. Default is None.

Returns

new_reading – A formatted reading dataframe, including evidence count and list of PMCIDs.

Return type

pd.DataFrame
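
The idea of identifying identical interactions by a set of columns, then recording an evidence count and a list of PMCIDs, can be sketched as follows. This is an illustration under stated assumptions, not VIOLIN's code: plain Python structures stand in for the pandas DataFrame, the column names (`regulator`, `regulated`, `sign`, `pmcid`) and the output keys are hypothetical, the sample rows are made up, and counting one row as one piece of evidence is an assumption.

```python
from collections import defaultdict

def count_evidence(rows, key_cols, id_col="pmcid"):
    """Merge rows that agree on key_cols; record evidence count and source IDs."""
    grouped = defaultdict(list)
    for row in rows:
        # Rows with the same values in key_cols are treated as the same interaction
        key = tuple(row[col] for col in key_cols)
        grouped[key].append(row[id_col])

    merged = []
    for key, ids in grouped.items():
        record = dict(zip(key_cols, key))
        record["evidence count"] = len(ids)   # assumption: one row = one piece of evidence
        record["pmcids"] = sorted(set(ids))   # unique source IDs for this interaction
        merged.append(record)
    return merged

# Made-up placeholder rows
readings = [
    {"regulator": "akt1", "regulated": "tp53", "sign": "negative", "pmcid": "PMC_A"},
    {"regulator": "akt1", "regulated": "tp53", "sign": "negative", "pmcid": "PMC_B"},
    {"regulator": "egfr", "regulated": "akt1", "sign": "positive", "pmcid": "PMC_A"},
]
merged = count_evidence(readings, key_cols=["regulator", "regulated", "sign"])
```

In this sketch the first two rows collapse into a single interaction with two pieces of evidence, while the third row stays on its own.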

violin.in_out.output(reading_df: pandas.core.frame.DataFrame, file_name: str, classify_scheme: str = '1', kind_values: Optional[dict] = None) → None

This function outputs the classified interactions. Output filenames take the form {file_name_prefix}_{category}.csv.

Parameters
  • reading_df (pd.DataFrame) – A classified dataframe of an interaction set.

  • file_name (str) – The prefix for the output filenames.

  • classify_scheme (str) – The classification scheme to use. Available options are ‘1’, ‘2’, and ‘3’.

  • kind_values (dict) – A dictionary containing the numerical values for the Kind Score classifications. Default values are found in KIND_DICT.
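
The {file_name_prefix}_{category}.csv naming pattern can be illustrated with a short sketch that writes one CSV file per category. It is not VIOLIN's implementation: the helper name `write_by_category`, the `category` column, and the category labels are hypothetical, and the standard-library `csv` module stands in for pandas.

```python
import csv
import os
import tempfile

def write_by_category(rows, file_name, category_col="category"):
    """Write one CSV per category, named {file_name}_{category}.csv."""
    by_category = {}
    for row in rows:
        by_category.setdefault(row[category_col], []).append(row)

    paths = []
    for category, group in by_category.items():
        path = f"{file_name}_{category}.csv"
        with open(path, "w", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=list(group[0].keys()))
            writer.writeheader()
            writer.writerows(group)
        paths.append(path)
    return paths

# Made-up classified rows with hypothetical category labels
classified = [
    {"regulator": "akt1", "regulated": "tp53", "category": "category_a"},
    {"regulator": "egfr", "regulated": "akt1", "category": "category_b"},
    {"regulator": "tp53", "regulated": "mdm2", "category": "category_a"},
]
prefix = os.path.join(tempfile.mkdtemp(), "run1")
paths = write_by_category(classified, prefix)
```

The prefix may include a directory, as here, so the files land wherever the caller points `file_name`.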

Dependencies

Python: pandas and NumPy libraries, and os.path module

VIOLIN: formatting and network modules.

Defaults

Default Reading Columns

            "att contradiction": 12,
            "dir mismatch": 20,
            "path mismatch": 20,
            "self-regulation": 20}

Default Model Columns (from BioRECIPE format)

"""

# Default Kind Score values for subcategories