papers Flashcards

1
Q

What is the format for metadata storage in ArrayExpress?

A

MAGE-TAB format. It has 2 spreadsheets: the Investigation Description Format (IDF) file and the Sample and Data Relationship Format (SDRF) file.

2
Q

What is the Investigation Description Format (IDF) in ArrayExpress?

A

The IDF contains an overview of the whole experiment, including the title, the submitter’s contact details, publication information, protocols and the experimental variables.

3
Q

What is the SDRF format in ArrayExpress?

A

The SDRF describes all the sample characteristics (e.g. cell type) or any treatment that the sample has been subjected to (e.g. growth in low oxygen conditions), and links each sample to its corresponding data file. The structure of the SDRF, i.e. the order of the columns, reflects the experimental workflow from source material, through intermediate steps (e.g. labelling of nucleic acids, preparation of sequencing libraries, running of sequencing assays) to raw and processed data.

4
Q

What are CEL files?

A

The raw data files of (Affymetrix) microarray experiments

5
Q

How to specify an experimental factor value such as cold and the organism in the ArrayExpress search?

A

efv:cold AND organism:"Oryza sativa"

6
Q

What are the dyes for control and treatment in the microarray?

A

Cy3: control

Cy5: treatment

7
Q

DREB/CBF belongs to which family of TFs?

A

ERF (the AP2/ERF family)

8
Q

What does the green text in the Linux shell imply?

A

It is the prompt: the computer is ready to accept our commands

9
Q

What is the working directory in Linux?

A

One important concept to understand is that the shell has a notion of a default location in which any file operations will take place. This is its working directory. If you try to create new files or directories, view existing files, or even delete them, the shell will assume you’re looking for them in the current working directory unless you take steps to specify otherwise.

10
Q

What are the slashes in folder addresses (paths)?

A

Directory separators

11
Q

The function for creating a list in Python?

A

list()

If no arguments are passed, it returns an empty list

12
Q

The code for installing packages in R?

A

install.packages('package_name')

limma is a Bioconductor package, so it is installed with BiocManager::install('limma')

13
Q

What's the code for converting the data into a data frame in R?

A

x = as.data.frame(data)

14
Q

What's the main difference between lists and tuples?

A

Lists are mutable; tuples are not
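
A small sketch of the difference (the variable names are made up for illustration): changing a list element works, while the same assignment on a tuple raises a TypeError.

```python
# Lists can be changed in place.
genes_list = ["g1", "g2"]
genes_list[0] = "g9"

# Tuples do not support item assignment.
genes_tuple = ("g1", "g2")
try:
    genes_tuple[0] = "g9"
    tuple_changed = True
except TypeError:
    tuple_changed = False

print(genes_list, genes_tuple, tuple_changed)
```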

15
Q

What is a CSV file?

A

A comma-separated values file: a text file containing data in which the values are separated by commas

16
Q

what does tuple unpacking mean?

A

splitting tuple elements into individual variables

a,b=(1,2)
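
A few more unpacking patterns, as a sketch with illustrative names: unpacking an existing tuple, swapping two variables, and collecting the leftover items with a starred name.

```python
point = (1, 2)
a, b = point                  # a gets the first element, b the second

a, b = b, a                   # swap: the right side is packed, then unpacked

first, *rest = (10, 20, 30)   # the starred name collects the remaining items in a list

print(a, b, first, rest)
```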

17
Q

the code for importing matplotlib.

A

import matplotlib.pyplot as plt

18
Q

the code for getting a metabolite by id in cobrapy?

A

model.metabolites.get_by_id('metabolite id')

19
Q

What are the types of boundary reactions, and why are they called pseudo-reactions?

A

There are three types: exchange, demand and sink reactions. All of them are unbalanced pseudo-reactions, which means they fulfill a function for modeling by adding metabolites to or removing them from the model system, but are not based on real biology.

20
Q

What are the definitions of exchange, demand and sink reactions?

A

An exchange reaction is a reversible reaction that adds to or removes
an extracellular metabolite from the extracellular compartment. A demand reaction is an irreversible reaction that
consumes an intracellular metabolite. A sink is similar to an exchange but specifically for intracellular metabolites,
i.e., a reversible reaction that adds or removes an intracellular metabolite.

21
Q

Code for printing out the exchange, sink and demand reactions of a model?

A

print("exchanges", model.exchanges)
print("demands", model.demands)
print("sinks", model.sinks)

22
Q

the code for reading excel files in python?

how to read only a single column?

A

import pandas as pd

df = pd.read_excel(r'C:\Users\Ron\Desktop\name of the file.xlsx', usecols=['gene ID'])

23
Q

how to convert a dataframe to a list? how to do it with specific values?

A

genes_list = genes.values.tolist()[0:9] | the [0:9] slice keeps only the first 9 rows

24
Q

How to convert something to a list in general?

A

model_gene_list = list(rice_model.genes)

25
How to convert the list items to strings?
new_list = [str(i) for i in my_list] | for cobrapy objects (e.g. genes), use the id instead: new_list = [i.id for i in my_list]
26
What is the overall definition of CAM metabolism?
In plants performing CAM photosynthesis, the stomata open at night and CO2 is fixed and stored in the vacuole in the form of a carboxylic acid such as malate, citrate, or isocitrate (Maclennan et al., 1963; Lüttge, 1990; Gawronska and Niewiadomska, 2015; Igamberdiev and Eprintsev, 2016). During the hot, dry daytime hours, the stomata can remain closed to minimize water loss, and the stored CO2 is remobilized for fixation by Rubisco in the chloroplast, accompanied by the accumulation of storage carbohydrates. Although this cycle is energetically expensive, it conserves precious water and is an efficient alternative to direct daytime CO2 fixation by Rubisco as in C3 photosynthesis.
27
How to know where the Jupyter notebooks are being saved?
In a notebook cell we type pwd and it shows the working directory
28
What does io in Python mean?
input/output
29
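What is an iterable, and what is an iterator?
An iterable is a collection whose elements can be traversed (e.g. a list or a tuple). Iterators are objects that allow you to traverse all the elements of a collection. Another definition: an iterator doesn't give all the values at once, but one value at a time.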
30
How can an iterator be created?
You can create an iterator object by applying the iter() built-in function to an iterable. You can use an iterator to manually loop over the iterable it came from. Repeatedly passing the iterator to the built-in function next() returns successive items in the stream. Once you have consumed an item from an iterator, it's gone. When no more data are available, a StopIteration exception is raised.
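A minimal sketch of this with a plain list (the variable names are just for illustration): iter() gives the iterator, next() consumes one item at a time, and StopIteration signals the end.

```python
numbers = [10, 20, 30]
it = iter(numbers)      # iterator over the list

first = next(it)        # consumes the first item
second = next(it)
third = next(it)

# Nothing is left, so one more next() raises StopIteration.
try:
    next(it)
    exhausted = False
except StopIteration:
    exhausted = True

print(first, second, third, exhausted)
```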
31
what's the iterator protocol?
The iterator objects are required to support the following two methods, which together form the iterator protocol: iterator.__iter__() Return the iterator object itself. This is required to allow both containers (also called collections) and iterators to be used with the for and in statements. iterator.__next__() Return the next item from the container. If there are no more items, raise the StopIteration exception.
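The protocol can be sketched with a small hand-written iterator. The class name Countdown and its behaviour are made up for illustration; only __iter__ and __next__ come from the protocol itself.

```python
class Countdown:
    """Illustrative iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself, so it works in for loops.
        return self

    def __next__(self):
        # Signal the end of the stream when nothing is left.
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


values = list(Countdown(3))
print(values)
```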
32
What is iteration?
The act of going over a collection is called iteration. Collections are things like lists, tuples, etc. An iterator is an object which can be used to iterate over a collection. The iter() function gives the iterator, and the next() function gives us the next value using this iterator.
33
How to get a list of the methods belonging to an object?
dir(obj) | obj is the name of the variable the object was assigned to
34
How to tell whether something is iterable?
It should have a method called __iter__ | we can check for it with the dir() function
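A short sketch of both checks. The helper name is_iterable is hypothetical; calling iter() and catching TypeError is a common alternative to looking through dir().

```python
def is_iterable(obj):
    """Return True if iter(obj) works, False otherwise."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


has_dunder = "__iter__" in dir([1, 2])   # the dir() check from the card
list_ok = is_iterable([1, 2])
str_ok = is_iterable("abc")
int_ok = is_iterable(5)

print(has_dunder, list_ok, str_ok, int_ok)
```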
35
how to make a pandas dataframe?
df=pd.DataFrame(data, columns=['name','age'])
36
how to concatenate dataframes using pandas?
first we need to make a list of all the dataframes. frames=[df1,df2,df3] result=pd.concat(frames)
37
what is pandas.concat() function?
Concatenate pandas objects along a particular axis with optional set logic
38
How to get continuous indices while concatenating 2 dataframes?
We should add an additional argument: | df = pd.concat(data, ignore_index=True)
39
how to get a textual index when merging pandas dataframes and in what way it can be useful?
df = pd.concat(data, keys=['India','US']) | this should not be combined with the ignore_index=True argument, which would discard the keys | we can call the subset of the dataframe pertaining to each key with: df.loc['India']
40
how to stick different pandas dataframes horizontally (adding to columns instead of rows)?
df = pd.concat(data, axis=1) | the default value is axis=0, which adds to the rows
41
how to add a new column as series to our pandas dataframe?
s=pd.Series(['humid','dry'], name='event') df=pd.concat([data,s],axis=1)
42
the different types of merging?
Inner join: keeps only the keys shared by the two dataframes. | Outer join: keeps all keys from both dataframes. | Left join: keeps all keys of the 1st dataframe plus the matching values from the 2nd. | Right join: keeps all keys of the 2nd dataframe plus the matching values from the 1st.
43
the code for reading an SBML model in cobrapy?
from cobra.io import read_sbml_model | my_model = read_sbml_model('path to the file')
44
what are different rice gene IDs?
Rice (Oryza sativa) has more than one form of gene ID for its genome. The two main gene ID systems for the rice genome are RAP (The Rice Annotation Project) and MSU (The Rice Genome Annotation Project). All RAP rice gene IDs are of the form Os##g#######. All MSU rice gene IDs are of the form LOC_Os##g#####. SYMBOL rice gene IDs are the unique names used at NCBI (National Center for Biotechnology Information).
45
how to call a package in r?
library(biomaRt)
46
the code for getting the r version?
R.version | or in the linux terminal R --version
47
How to update R?
Install the new version of R from r-project.org (choosing a CRAN mirror); the next time you open RStudio, it will work with the updated version.
48
the 4 arguments in the biomart search?
Attributes: the column headers that we want in our output. | Filters: the kind of input data we are filtering on. | Values: the identifiers that are used along with the filters to limit our results. | Mart: the argument for database selection; it's the first thing we choose (we specify the database).
49
how to search a gene in a certain organism in NCBI?
name of the gene AND human [orgn]
50
how to get the annotation for a platform from GEO?
Click on the platform; in the table at the bottom you see the annotations. To download them, click on Download full table
51
how to get the code for installing certain packages in r
Go to the Bioconductor website and search for the name of the package; you will find the code for installing it
52
How to read raw CEL files in R? | How to customize the reading?
First set the working directory to the folder that the CEL files are in, then run the ReadAffy() function. If we write widget=T in the function call, a new window opens in which we can select the files.
53
What does the image() function show, and what should we infer from it?
It shows the image of the microarray chip. The white dots are spots with expressed genes and the black dots are spots with non-expressed genes; we should check the integrity of these dots across the chip.
54
How to see which range of values each sample covers?
We should draw a boxplot | boxplot()
55
the code for drawing a histogram in r
hist(data)
56
how a histogram shows the quality of the samples?
For example, if the peaks of all the curves are at or around the same value, the quality of the data obtained from the samples is good. After normalization we should check again: if the problem is resolved it is fine; if not, the sample should be removed.
57
How to check the quality of the RNA used for a microarray in R? What should the result look like?
AffyRNAdeg(data) | Probe intensity should be lower at the 5' end than at the 3' end, because RNA degradation typically starts at the 5' end. The best case is a smooth monotonic curve rather than a zigzag pattern.
58
how to normalize the data?
normalized_data=rma(data)
59
2 ways of showing the normalized data?
One: after opening the RMA-normalized data and clicking on its assayData section, a line of code is written to the console, which we can save. | The other way is to use the exprs() function.
60
What is justRMA() for?
It is for when the machine has little RAM and the computer cannot do the analysis with the regular RMA workflow. Here it only returns the normalized data; the result is not a Large ExpressionSet.
61
how to see which methods exist for background correction and normalization of transcriptome data in r?
bgcorrect.methods() (no arguments) | normalize.methods(data)
62
how to manually choose for normalization and bg correction methods in one line of code in r?
data = expresso(trans_data, widget=T)
63
how to know the range of colors being recognized by R?
colors()
64
How to write a table in R? How to specify that the values are separated by tabs?
write.table(x,file='data.txt', quote=F, sep="\t")
65
How to get rid of Browse[1] in R?
Type c in the console to leave the browser, but the function will continue running | if you type Q, you will exit both the browser and the function
66
how to paste a number in whole rows of a column in excel?
First write the number in another cell and copy that cell; then select the cells that should get the number and choose the arrow under Paste > Paste Special > Operation > Add
67
The function for filtering rows by a condition in R?
gene_up = subset(data, column_name > 2) | column_name is the column we want to apply the condition to
68
the shortcut key for renaming the files?
click on the file and press f2
69
how to check if a newer version for rstudio exist?
in rstudio open help> check for updates
70
how to change the row names of a table?
row.names(data)=x
71
How to open the help of a function in R?
Click on the function name and press F1 | or run ?function_name in the console
72
How to delete NA data from our variable in R?
data=na.omit(data) | it removes the row containing NA data
73
why do we specify upper and lower bound in FBA?
These bounds enforce thermodynamic reversibility and mechanistic (max uptake and secretion rate) constraints for the rxn.
74
How can GEMs further elucidate how changes in one component affect other pathways and cell phenotypes?
Because these models connect genes to measurable cell phenotypes (e.g., growth, cell energetics, pathway fluxes, biosynthesis of cell components, byproduct secretion, etc.)
75
what are Model extraction methods (MEMs)?
Model extraction methods (MEMs) employ diverse algorithms to extract cell-line- or tissue-specific models from a GEM
76
What does the column vector v specify?
It contains the unknown fluxes through each of the reactions of the S matrix.
77
Which system is underdetermined, and what does that mean?
A system of linear equations is established by multiplying the S matrix by a column vector, v. The product of this matrix multiplication must equal zero, S · v = 0 (Gianchandani et al., 2009). Because the resulting system is underdetermined (i.e., too few equations, too many unknowns), linear programming (LP) is used to optimize for a particular flux, Z, the objective function, subject to underlying constraints.
78
how the objective function is depicted?
The objective function typically takes the form Z = c · v, where c is a row vector of weights for each of the fluxes in column vector v, indicating how much each reaction in v contributes to the objective function, Z.
79
what is the task of FBA?
Thus, the task of FBA is to find a solution to v that lies within the bounded solution space and that optimizes the objective function at the same time.
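Putting the previous cards together, the FBA problem can be written as one linear program. This is a sketch assembled from the notes above; the bound symbols v_lb and v_ub are names assumed here for the lower and upper flux bounds.

```latex
\begin{aligned}
\max_{v} \quad & Z = c \cdot v \\
\text{subject to} \quad & S \cdot v = 0, \\
& v_{lb} \le v \le v_{ub}
\end{aligned}
```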
80
What does the GIMME algorithm guarantee?
The Gene Inactivity Moderated by Metabolism and Expression (GIMME) algorithm guarantees both to produce a functioning metabolic model based on gene expression levels and to quantify the agreement between the model and the data.
81
The application of metabolic models?
metabolic phenotype prediction, metabolic engineering, studies of network evolution, biomedical application
82
why GEMs are important in studying the metabolism of higher organisms?
It is extremely difficult, or even impossible for higher level organisms, to study the entirety of their pathways either in vitro or in vivo. However, mathematical tools such as genome-scale models can be used to gain insight into how these biological systems function
83
4 main step in GEM reconstruction?
First, a draft reconstruction of the biological network of an organism is extracted using information about reactions, enzymes, and pathways from databases such as KEGG, BRENDA, etc. The second step is the manual curation of the reconstructed draft model. This involves checking and filling the gaps and correcting misplaced reactions. Here, organism-specific databases and literature are used. Computational algorithms such as GAUGE [20], FastGapFill [21], and FBA-Gap [22] can also be applied. The third step is the conversion of the reconstructed model into a mathematical representation that can be used for subsequent simulations. The final step is the refining, validation, and application of the model to inform decisions.
84
what are solvers in FBA used for?
To solve the linear programming problem, i.e. to find the fluxes that optimize the objective function subject to the constraints.
85
What are the drawbacks of 13C-MFA?
It is challenging not only because of the extensive instrumentation required but also because of the limited number of fluxes and conditions that can be experimentally measured. Typically, 13C-MFA focuses on central carbon metabolism.
86
Why is transcriptome data a better kind of high-throughput data?
Unlike the first two omics data that cover a small share of all reactions in a genome-scale model, transcriptomics and proteomics are the platforms where a quantitative snapshot of molecular species at system-level is currently possible [23]. However, proteomics is a relatively immature technology compared to transcriptomics. The accuracy with which protein concentrations can be determined is much lower than that with which mRNA concentrations can be determined. On the other hand, RNA amount changes can be precisely measured in a highly automated process at low cost in comparison with the amount of data gathered
87
what is the logic behind E-flux?
The rationale behind E-flux is that, given a limited translational efficiency and a limited accumulation of enzyme over the time, the level of mRNA can be used as an approximate upper bound on the maximum amount of metabolic enzymes, and hence as a bound on reaction rates.
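A toy sketch of the idea, not the published implementation: the reaction IDs and expression values are invented, and one expression value per reaction is assumed (the real method derives it from the genes associated with each reaction). Expression is scaled so that the most highly expressed reaction gets the loosest upper bound.

```python
# Hypothetical expression levels per reaction (arbitrary units).
expression = {"R1": 50.0, "R2": 200.0, "R3": 100.0}

# Scale by the maximum so every bound falls between 0 and 1.
max_expr = max(expression.values())
upper_bounds = {rxn: level / max_expr for rxn, level in expression.items()}

for rxn, ub in upper_bounds.items():
    print(rxn, "upper bound:", ub)
```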
88
why in prokaryotes there is more correlation between protein level and mRNA abundance?
Because the ribosomes attach to nascent mRNA, so that translation can be synchronous with transcription; protein levels thus depend more directly on mRNA abundance
89
what is a probeset? | how the probes are numbered in a probeset?
A gene is represented by a probeset in the microarray. Each probeset consists of 11 probes. For each probeset on an array, the individual probes are numbered sequentially from the 5' end of the transcript to the 3' end.
90
how to exit from the python shell in command prompt?
exit()
91
how to make a copy of our model in cobrapy?
new = model.copy()
92
the code for printing all the reactions in the model in cobrapy?
model.reactions | without parenthesis
93
the shortcut for deleting a cell in jupyter notebook?
Select the cell in command mode (press Esc; the cell border turns blue in the classic notebook, green means edit mode) and press D twice
94
what is a frozenset() function?
It takes an iterable as an argument and returns an immutable set: you can't add, remove or reassign its elements
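A quick sketch (the values are illustrative): frozenset() builds a set that cannot be modified, so it has no add() method, but unlike a normal set it can be used as a dictionary key.

```python
frozen = frozenset(["a", "b", "a"])   # duplicates collapse, as in any set

# frozenset has no add() method, so this raises AttributeError.
try:
    frozen.add("c")
    could_add = True
except AttributeError:
    could_add = False

# Being immutable (and hashable), it can serve as a dictionary key.
lookup = {frozen: "pair of metabolites"}

print(sorted(frozen), could_add, lookup[frozenset(["b", "a"])])
```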
95
how come maximization of biomass can be the objective of microorganisms
The premise is that an organism that acquires and/or redistributes resources to outgrow its competitors will be in the best position to survive
96
why maximization of the cell's biomass is not a suitable choice for modeling multicellular organisms?
Such an assumption, however, is likely to be invalid for individual cell types in multicellular organisms, where cellular objectives may differ greatly both between and within tissues. The assumption of maximal rates of biomass production involves an objective at the cellular level, whereas in multicellular organisms a given cell's objective is likely to be realised via survival at the organism level, which may not necessarily be dependent upon the growth of the cell. Moreover, signals from the extracellular environment may trigger different cellular priorities and objectives
97
how to get the gpr of all the reactions in the model?
gpr = [rxn.gene_reaction_rule for rxn in model.reactions]
98
the code for storing the data as an excel sheet in R?
write.xlsx2(data, "data.xlsx")
99
how to read the excel file? what's the parameter sheet index in R?
read.xlsx() or read.xlsx2() | sheetIndex is a number specifying which sheet to read, e.g. 1 = the first sheet is going to be read
100
what is the class of the output of reading an excel file via read.xlsx() in r?
data frame
101
What is the keyboard shortcut for restarting the R session in RStudio?
Ctrl+Shift+F10
102
what is %in% operator?
%in% returns logical vector (TRUE or FALSE but never NA) if there is a match or not for its left operand. Output logical vector has the same length as left operand.
103
how to find the missing values in R?
is.na(data) returns TRUE for each missing value and FALSE otherwise
104
how to read an excel file in python? how to select a specific column?
file = pd.read_excel(file_path, usecols=['name of the column'])
105
how to get the type of a variable in python?
type(x)
106
any and all functions?
The any function in R will tell you if there are ANY of the given search terms in your vector; the all function tells you whether ALL values match. Each returns either TRUE or FALSE. To demonstrate this function, let's create a quick vector that goes from -3 to 5, incrementing by 1. y
107
how to install a package with pip in jupyter notebook?
We add a ! before pip | !pip install parse
108
2 ways of getting the values of special column of excel or dataframe in python?
df['name of the column'] | df.column_name (only when the column name is a valid Python identifier, e.g. without spaces)
109
how to replace something in only a specific column in excel?
select and copy that column and do the manipulation on a new spreadsheet. and then paste it on the original file.
110
how to print out both bounds of model reactions?
for rxn in model.reactions: | print(rxn.bounds)
111
what is the rationale behind the E-Flux method?
the biological rationale behind our method is that expression data provide measurements on the level of mRNA for each gene. If there were limited accumulation of enzyme over the time course considered, and given a particular level of translational efficiency, the level of mRNA can be used as an approximate upper bound on the maximum available protein and hence as an upper bound on reaction rates to some level of approximation.
112
where FBA method's prediction is poor?
Given their simplicity and minimal data requirement, FBA models often weakly predict metabolic fluxes at elaborate branched-chain reactions and in a cyclic pathway, for example, pentose-phosphate pathway and non-cyclic TCA, where fluxes change dynamically with prevailing environments.
113
what is the shell?
the shell is a program that takes commands from the keyboard and gives them to the operating system to perform.
114
the raw data of single microarray consists of what?
The raw data from a single microarray consist of a pair of images representing the fluorescent intensities detected by a photomultiplier tube when the microarray is scanned with each of two lasers.
115
what does model fitting mean?
Fitting a model to data means choosing the statistical model that predicts values as close as possible to the ones observed in your population.
116
how microarray works?
In the first step of the technique, DNA clones with known sequence content are spotted and immobilized onto a glass slide or other substrate, the microarray. Next, pools of mRNA from the cell populations under study are purified, reverse-transcribed into cDNA, and labeled with one of two fluorescent dyes, which we will refer to as "red" and "green." Two pools of differentially labeled cDNA are combined and applied to a microarray. Labeled cDNA in the pool hybridizes to complementary sequences on the array and any unhybridized cDNA is washed off. Hybridization efficiency may vary from clone to clone, confounding comparisons between genes. However, if we assume that the efficiency of an individual clone is not altered by the type of the dye label, then the relative abundance of a particular mRNA in the two samples can be measured.
117
why we can't figure out a microarray is single or two channel by searching the platform?
The one-colour vs two-colour thing is an experiment-specific choice. You could use an Agilent array for either one- or two-colour experiments, for example. So you actually have to look at the GSE, rather than the GPL information to determine whether a given expt was 2-colour.
118
what is two-channel microarray?
The earliest microarrays used two channels, with two RNA samples separately labeled and competitively hybridized to the same array
119
What is GeneChip?
It is a trademark of Affymetrix | it is a commercial microarray platform
120
What is the general concept of a microarray?
DNA microarrays are widely used to measure genome-wide changes in mRNA expression levels across conditions such as developmental stages, disease states, drug treatment and gene disruption
121
How does a 2-dye microarray measure gene expression?
Spotted microarrays are commonly hybridized with two samples labelled with two different fluorophores. For these arrays, the ratio of the signal intensities in the two channels is a relative measure of gene expression.
122
what is probe in microarray?
In standard terminology, the cDNAs spotted onto the arrays are called probes, and those in the samples are called target genes.
123
What are one-channel and two-channel microarrays?
cDNA microarrays generate one- or two-channel data. In two-channel use the arrays are hybridized to a mixture of two samples, each labelled with a different dye (Cy3 and Cy5). In one-channel use, which is the focus of this paper, each array is hybridized to a single sample, labelled with a single dye. The arrays are laser-scanned at the wavelength(s) appropriate to the dye(s) used, and the images are processed to extract data for analysis. In one-channel studies, these usually consist of a measure for the spot intensity and its local background, for each spot on the array. In two-channel studies this is available for both dyes
124
what background-corrected spot intensities reflect?
The background-corrected spot intensities reflect the abundance of the corresponding target genes in the samples.
125
what are different types of bias in microarray data? | what is correction for bias called?
The background-corrected spot intensities reflect the abundance of the corresponding target genes in the samples. However, often the relation is not that of simple proportionality: the signals may be distorted in various ways. One of these is spatial bias: the presence of regions with overall higher or lower intensity levels on the slides (Fig. 1). Another form of distortion may appear when data from replicate arrays are compared graphically (Fig. 2) and various forms of systematic departure from the identity line are observed. This phenomenon is here termed relative intensity bias. The process of correcting for bias prior to analysis is called normalization. The purpose is to promote uniformity within arrays and reproducibility between arrays. Normalization has profound effects on subsequent analysis, irrespective of the methodology used. Failure to normalize appropriately will generally lead to misleading conclusions.
126
why it's hard to find out the absolute gene expression in the 2-dye microarrays?
For two-color microarrays, however, it is more difficult to determine absolute gene expression because of effects such as spot size variation, and relative expression between two conditions is typically reported
127
The reason that GEMs should be integrated with omics data?
The advent of high-throughput techniques enables scientists to simultaneously measure large numbers of molecular components (e.g., proteins, metabolites, and nucleic acids). There is a growing concern that much of these data sit in databases without being used or fully analyzed. Statistical inference methods have been widely applied to gain insight into which genes may influence the activities of others in a given omics data set; however, they do not provide information on the underlying mechanisms or whether the interactions are direct or distal. Genome-scale models derived from knowledgebases may be used to extract additional biological understanding from omics data sets and inspire novel applications of this technology to interpretation of complex data sets.
128
two ways of interpreting omics data with GEMs?
1- comparing the omics data with GEM predictions. | 2- using omics data as a surrogate for modeling regulation
129
how omics data can be used as a surrogate for modeling regulation?
Metabolic network reconstructions aim to be comprehensive repositories of biochemical data for an organism. Thus, models derived from these knowledgebases will include all possible reactions catalyzed by an organism's gene products regardless of whether they are active in a given environment. The all-inclusive nature of these knowledgebases is partially responsible for false negatives observed in gene essentiality or genetic interaction simulations [35]. Biological networks have evolved a degree of robustness against perturbations that result in cascading failures [36]; this robustness is due, in part, to the presence of alternative compensatory pathways. However, an alternative pathway that is present in the global knowledgebase may not be accessible to the organism in the given growth medium, thus mutation of the principal pathway will result in a phenotype in vivo but not in silico. The regulatory apparatuses of successful organisms have evolved to express the network components that are suited to their current environment. If we knew the complete regulatory structure of an organism and how it worked then we could plausibly compute which cellular components may be expressed in a given condition; unfortunately, this isn't known even for the arguably best-studied bacterium [38]. Due to stochastic effects arising from low copy numbers of regulators and enzymes [39,40], and intracellular heterogeneity, integrated models of metabolism and regulation will still be an approximation of individual cells and populations. In the absence of experimentally elucidated regulatory rules, we can still use omics surveys in conjunction with functional models to serve as surrogates for a regulatory model, and create condition- and tissue-specific models
130
how to iterate through a range of numbers in python?
for i in range(1,26)===> it will iterate 25 times | for i in range(26)===> it will iterate 26 times, starts from 0, ends in 25
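A quick check of the two calls, as a sketch: turning each range into a list makes the start, end and number of iterations visible.

```python
from_one = list(range(1, 26))   # starts at 1, stops before 26
from_zero = list(range(26))     # starts at 0, stops before 26

print(len(from_one), from_one[0], from_one[-1])
print(len(from_zero), from_zero[0], from_zero[-1])
```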
131
what is split function in python? | and how to split from the white space?
The split() method in Python splits a string into a list of strings, breaking the given string at the specified separator. If we don't pass any argument to split(), it splits at whitespace.
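A small sketch with made-up strings: without arguments, runs of whitespace count as one separator, while an explicit separator keeps empty fields.

```python
line = "LOC_Os01g01010  cold   stress"
words = line.split()           # splits on any run of whitespace

record = "a,b,,c"
fields = record.split(",")     # explicit separator keeps the empty field

print(words)
print(fields)
```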
132
what is the find method in python?
The find() method returns the index of the first occurrence of the substring (if found). If not found, it returns -1. If start and end indices are given, it only searches in that range.
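A short sketch with an illustrative string, covering the found case, the not-found case and the optional start and end arguments.

```python
text = "cold stress, cold response"

first_pos = text.find("cold")          # first occurrence
missing = text.find("heat")            # not present
later_pos = text.find("cold", 1)       # search starts at index 1
in_range = text.find("cold", 1, 10)    # search limited to text[1:10]

print(first_pos, missing, later_pos, in_range)
```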
133
the shortcut for getting the file path in linux?
Ctrl+L (in the file manager; it shows the path of the current folder in the location bar)
134
how to find the index of an element in a list
The index() method returns the index of the specified element in the list.
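A minimal sketch (the list is made up): index() gives the position of the first match and raises ValueError when the element is not in the list.

```python
genes = ["g1", "g2", "g3", "g2"]

pos = genes.index("g2")   # position of the first match only

# A missing element raises ValueError instead of returning -1.
try:
    genes.index("g9")
    found = True
except ValueError:
    found = False

print(pos, found)
```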
135
when we read an excel file with x=pd.read_excel code, what's the type of the x?
pandas dataframe
136
how to make a header and normal text in jupyter notebook?
for a header: in the dropdown at the top where "Code" is selected, choose Heading, type what you want and then run the cell | for normal text: choose the Markdown format instead
137
how to divide numbers without getting decimals in the answer in python? how to get the remainder?
use // instead of / for integer (floor) division | use % to get the remainder
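A short sketch of both operators, assuming a made-up example of converting minutes to hours and minutes. The built-in divmod returns both results at once.

```python
total_minutes = 135

hours = total_minutes // 60      # floor division drops the decimal part
minutes = total_minutes % 60     # modulo gives what is left over

# divmod returns (quotient, remainder) in one call.
quotient, remainder = divmod(total_minutes, 60)

print(hours, minutes)
```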
138
how to edit a text written in markdown format in jupyter notebook?
double click on that text
139
how to make different sizes of headers in jupyter notebook?
```
# header 1
## header 2
### header 3
#### header 4
```
There should be a space between the hash signs and the text.
140
how to make a list of bullet points | how to make a list of numbers
bullet: * this will make a bullet number: 1. this will make a number
141
how to make the text italic or bold in jupyter notebook?
italic: put 1 asterisk on each side of the text, without any space (*text*) | bold: put 2 asterisks on each side (**text**)
142
how to define several variables in a single line?
var1, var2, var3 = 'red', 'blue', 'green'
143
shortcut for inserting a new cell below in jupyter notebook?
esc+B
144
when are booleans automatically converted to integers?
when they are used in arithmetic operations with integers. False is converted to 0 and True to 1, e.g. 3 + True gives 4
145
the function for converting other data types to boolean | what are the things that will be false?
bool() Only the following values evaluate to False (they are often called falsy values): ``` The value False itself The integer 0 The float 0.0 The empty value None The empty text "" The empty list [] The empty tuple () The empty dictionary {} The empty set set() The empty range range(0) ```
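A quick sketch that checks the falsy list above with bool(). The "truthy" examples are added for contrast: values that look empty but are not.

```python
falsy_values = [False, 0, 0.0, None, "", [], (), {}, set(), range(0)]

# Every value in the list converts to False.
all_false = all(bool(v) is False for v in falsy_values)

# These look "empty-ish" but are truthy: a non-empty string, a list
# holding one element, a non-zero number, a string with one space.
truthy_examples = [bool("0"), bool([0]), bool(-1), bool(" ")]

print(all_false, truthy_examples)
```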
146
what is the None type and what is it used for?
The None type includes a single value None, used to indicate the absence of a value. None has the type NoneType. It is often used to declare a variable whose value may be assigned later, or to indicate that a value is missing.
147
how to have our strings in multi line?
put it in between triple quotes: ''' ... ''' (or """ ... """)
148
what is a method in python?
Methods are functions associated with data types and are accessed using the . notation e.g. variable_name.method() or "a string".method(). Methods are a powerful technique for associating common operations with values of specific data types.
149
what does replace method do in python?
The .replace method replaces a part of the string with another string. It takes the portion to be replaced and the replacement text as inputs or arguments, e.g. another_day = today.replace("Satur", "Wednes"). It returns the modified string; it does not actually change the string stored in the original variable.
150
strip and split methods in python?
The .split method splits a string into a list of strings at every occurrence of provided character(s). The .strip method removes whitespace characters from the beginning and end of a string, but not from the middle. a_long_line = " This is a long line with some space before, after, and some space in the middle.. "
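A minimal sketch combining strip and split on a made-up comma-separated line. Note that strip on the whole line does not touch the spaces around the inner commas, so each piece is stripped again.

```python
line = "  gene_a, gene_b ,gene_c  "

stripped = line.strip()                    # outer whitespace removed only
parts = stripped.split(",")                # inner spaces are still there
clean_parts = [p.strip() for p in parts]   # strip each piece separately

# With no argument, split breaks on any run of whitespace.
words = "one  two\tthree".split()

print(clean_parts, words)
```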
151
what does format do in python?
The .format method combines values of other data types, e.g., integers, floats, booleans, lists, etc. with strings. You can use format to construct output messages for display.
```
cost_of_ice_bag = 1.25
profit_margin = .2
number_of_bags = 500

# Template for output message
output_template = """If a grocery store sells ice bags at $ {} per bag, with a profit margin of {} %, then the total profit it makes by selling {} ice bags is $ {}."""
print(output_template)

total_profit = cost_of_ice_bag * profit_margin * number_of_bags
output_message = output_template.format(cost_of_ice_bag, profit_margin*100, number_of_bags, total_profit)
print(output_message)
```
152
how to add a new value to a specific index of a list in python?
A new value can also be inserted at a specific index using the insert method. fruits.insert(1, 'banana')
153
how to remove an item from a list?
You can remove a value from a list using the remove method: fruits.remove('blueberry') | or fruits.remove(fruits[0]) to remove the first element
154
how to remove and return an element from an index in python?
To remove an element from a specific index, use the pop method. The method also returns the removed element. fruits.pop(1)
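A short sketch that runs insert, remove and pop in sequence on one list, with the list contents after each step shown in the comments. The fruit names are only illustrative.

```python
fruits = ["apple", "blueberry", "cherry"]

fruits.insert(1, "banana")    # ['apple', 'banana', 'blueberry', 'cherry']
fruits.remove("blueberry")    # ['apple', 'banana', 'cherry']
removed = fruits.pop(1)       # removes and returns the item at index 1

print(removed, fruits)
```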
155
how to get the dimension and the name of columns of a dataframe in python?
We used the shape and columns dataframe attributes to get the shape of our dataframe (number of rows, number of columns) and the column names, respectively: df.shape | df.columns (where df is the dataframe)
156
how to create an empty array using numpy? how to make it 2-D? how to make a numpy arra
np.empty returns a new array of given shape and type, without initializing entries. Parameters: shape (int or tuple of int): shape of the empty array, e.g., (2, 3) or 2. dtype (data-type, optional): desired output data-type for the array, e.g., numpy.int8; default is numpy.float64. order ({'C', 'F'}, optional, default 'C'): whether to store multi-dimensional data in row-major (C-style) or column-major (Fortran-style) order in memory. like (array_like, optional): reference object to allow the creation of arrays which are not NumPy arrays; if an array-like passed in as like supports the __array_function__ protocol, the result will be defined by it, which ensures the creation of an array object compatible with that passed in via this argument. ex: m = np.empty(2) ==> returns a 1-D array with 2 items | ex: m = np.empty([2, 3]) ==> returns a 2-D array with 2 rows and 3 cols
157
how to remove a character from a string?
name='fete' | m= name.replace('e','')
158
how to add several lists to a dataframe?
first make a list of tuples from the lists (Python 3): data_tuples = list(zip(Month, Days)) | then make a dataframe of it: pd.DataFrame(data_tuples, columns=['month', 'day']) | or do everything in the same line: pd.DataFrame(list(zip(lst1, lst2, lst3)), columns=['lst1_title', 'lst2_title', 'lst3_title'])
159
how to append two items to a list in python?
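lis = list() | lis.append([a, b]) adds the pair as one nested item, giving [[a, b]] | lis.extend([a, b]) adds a and b as two separate items, giving [a, b]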
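A small sketch contrasting append and extend, since the two are easy to mix up when adding two items. The values of a and b are placeholders.

```python
a, b = 1, 2

nested = []
nested.append([a, b])   # one item is added: the list [1, 2] itself

flat = []
flat.extend([a, b])     # two items are added: 1 and 2

print(nested, len(nested))
print(flat, len(flat))
```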
160
when we loop over a dictionary, what will we get in python? | how to access the values in a dictionary?
it only returns the keys not the values dict[key] returns the value
161
how to iterate over the values in a dictionary in python? | how to iterate over boths keys and values?
for value in dict.values(): print(value) | for key, value in dict.items(): print(key, value)
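A minimal sketch of the three ways to loop over a dictionary. The dictionary of reaction ids and flux values is made up for illustration.

```python
fluxes = {"R1": 0.5, "R2": 0.0, "R3": 2.0}

keys = [k for k in fluxes]                     # plain loop yields keys only
values = [v for v in fluxes.values()]          # values only
pairs = [(k, v) for k, v in fluxes.items()]    # key and value together

print(keys, values, pairs)
```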
162
how to define the steps in the range function in python?
range(1,11,2)
163
how to make an empty loop in python? what does the keyword do?
for i in x: pass | pass does nothing; it is a placeholder for places where Python requires a statement but no action is needed
164
one of the ways to make a dataframe which already has the name of the columns and we don't have to specify the columns in python?
to make a dataframe with a dictionary (the data which is passed is a dictionary; its keys become the column names), e.g.:
```
df = pd.DataFrame({
    'colA': [True, False, False],
    'colB': [1, 2, 3],
})
print(df)

    colA  colB
0   True     1
1  False     2
2  False     3
```
165
what is the difference between insert and assign?
pandas.DataFrame.assign() can be used when you need to insert multiple new columns in a DataFrame, when you need to ignore the index of the column to be added, or when you need to overwrite the values of existing columns. It returns a new DataFrame. Alternatively, you can use pandas.DataFrame.insert(), which modifies the DataFrame in place. This method is useful when you need to insert a new column at a specific position or index (here s is a Series holding 'a', 'b', 'c'):
```
df.insert(1, 'colC', s.values)
print(df)

    colA colC  colB
0   True    a     1
1  False    b     2
2  False    c     3
```
166
what is scope in python?
Scope refers to the region within the code where a particular variable is visible. Every function (or class definition) defines a scope within Python. Variables defined in this scope are called local variables. Variables that are available everywhere are called global variables. Scope rules allow you to use the same variable names in different functions without sharing values from one to the other.
167
how to make an optional argument while defining function in python?
We'll make this an optional argument with a default value of 0.
```
def loan_emi(amount, duration, down_payment=0):
    loan_amount = amount - down_payment
    emi = loan_amount / duration
    return emi
```
All the optional arguments should come after the required arguments.
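A runnable sketch of calling a function with and without its optional argument. The loan amounts are made-up numbers.

```python
def loan_emi(amount, duration, down_payment=0):
    """Return the monthly instalment, ignoring interest."""
    loan_amount = amount - down_payment
    return loan_amount / duration

# down_payment is left out, so the default of 0 is used.
emi_without = loan_emi(120000, 12)

# down_payment is passed, so it replaces the default.
emi_with = loan_emi(120000, 12, down_payment=30000)

print(emi_without, emi_with)
```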
168
what are named arguments?
Invoking a function with many arguments can often get confusing and is prone to human errors. Python provides the option of invoking functions with named arguments for better clarity. You can also split function invocation into multiple lines.
```
emi1 = loan_emi(
    amount=1260000,
    duration=8*12,
    rate=0.1/12,
    down_payment=3e5
)
```
169
how to call help for a function when we don't know what it does in python?
help(name of the function)
170
how to print a certain row in dataframe?
if you know the index | print(df.loc[[159220]])
171
in CAM metabolism, there is a trade off b/w what?
However, there is a tradeoff between the energetic investment in the CO2-concentrating mechanism in the form of the CAM cycle and the benefit of suppressing photorespiration.
172
how to approximate enzyme machinery costs by metabolic networks?
the total network flux can be used as a proxy for the overall enzyme machinery cost by averaging out the variation between reactions
173
why the cheung 2014 model was allowed to accumulate or store some compounds?
To maintain a metabolic output at night (export to the phloem), the model must accumulate carbon and nitrogen stores during the day. To explore the metabolic interaction between the day and the night, the model included a set of sugars and carboxylic acids to be used as carbon storage molecules.
174
iMAT and GIMME and E-Flux very basic assumptions?
For instance, iMAT and GIMME assume that mRNA levels below a certain threshold reveal that corresponding reactions are inactive [10, 11]. E-Flux and PROM assume that transcript level indicates the degree to which the reactions are active by constraining the upper bounds
175
how to select the first column of a data frame as a data frame and as a series?
```
# Select first column of the dataframe as a dataframe
first_column = df.iloc[:, :1]
```
```
# Select first column of the dataframe as a series
first_column = df.iloc[:, 0]
```
176
how to get a portion of pandas data frame?
df.iloc[row_start:row_end, col_start:col_end] Arguments: row_start: The row index/position from where it should start selection. Default is 0. row_end: The row index/position where it should end the selection, i.e. select till row_end-1. Default is till the last row of the dataframe. col_start: The column index/position from where it should start selection. Default is 0. col_end: The column index/position where it should end the selection, i.e. select till col_end-1. Default is till the last column of the dataframe. It returns a portion of the dataframe that includes rows from row_start to row_end-1 and columns from col_start to col_end-1.
177
how to compare if 2 list's elements are identical?
sorted(x) == sorted(y) | x and y are our lists
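A short sketch of the sorted comparison, with collections.Counter shown as an alternative that also respects duplicates. The set comparison is included only to show why it is not equivalent. The lists are made up.

```python
from collections import Counter

x = [3, 1, 2, 2]
y = [2, 3, 2, 1]   # same elements as x, different order
z = [1, 2, 3, 3]   # same distinct values as x, different counts

same_sorted = sorted(x) == sorted(y)
same_counter = Counter(x) == Counter(y)

# Sets ignore duplicates, so x and z wrongly look identical.
set_says_same = set(x) == set(z)
sorted_says_same = sorted(x) == sorted(z)

print(same_sorted, same_counter, set_says_same, sorted_says_same)
```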
178
how to sort the list without modifying the original list?
x = [] | y = sorted(x) | the sorted function returns a new sorted list and leaves x unchanged (x.sort() would sort x in place)
179
what is numpy? what it is used for?
The NumPy library provides specialized data structures, functions, and other tools for numerical computing in Python.
180
how to construct tissue or organ specific metabolic models?
Tissue or cell specific models can be derived from the metabolic reconstruction to represent tissue and cell specific functions by adding physical–chemical constraints and tissue biomass compositional data.
181
how to specify the type of arguments and the returned object in python function?
```
def greeting(name: str) -> str:
    return 'Hello ' + name
```
182
what is isinstance() in python?
Python's isinstance() function checks whether an object or variable (first argument) is an instance of the specified class or data type (second argument). The second argument can also be a tuple of types.
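A few illustrative isinstance checks. The bool case is worth knowing: bool is a subclass of int, so True also counts as an int.

```python
is_int = isinstance(3, int)
is_number = isinstance(3.0, (int, float))   # tuple means "any of these"
text_is_int = isinstance("3", int)
bool_is_int = isinstance(True, int)         # bool is a subclass of int

print(is_int, is_number, text_is_int, bool_is_int)
```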
183
how to make a dict with key being the reaction id and the values be zero
rxn_dict = {i.id: 0 for i in model.reactions} | (avoid naming the variable dict, because that hides the built-in dict type)
184
how to make a dict from 2 cols of the excel sheet?
exp_data_2h = dict()
for i, j in zip(exp_data['MSU7'], exp_data['k354_2h_cold']):
    exp_data_2h[i] = j
185
what are loop control statements?
break, continue and pass
186
continue statement?
The continue statement is used to skip the rest of the code inside a loop for the current iteration only. Loop does not terminate but continues on with the next iteration.
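A minimal sketch contrasting continue with break in the same kind of loop. The numbers are arbitrary.

```python
# continue: skip even numbers but keep looping.
kept = []
for i in range(6):
    if i % 2 == 0:
        continue        # jump straight to the next iteration
    kept.append(i)

# break: stop the loop entirely once 3 is reached.
seen = []
for i in range(6):
    if i == 3:
        break
    seen.append(i)

print(kept, seen)
```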
187
how to have a dataframe without index column being shown?
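df.style.hide(axis='index') | in pandas versions before 1.4 the equivalent was df.style.hide_index(); both return a Styler for display, not a dataframe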
188
how to set a certain column in pandas data frame as the index column?
df = df.set_index('reaction_id') | where df is the dataframe
189
how to get the max of a dict values in python?
max_val= max(dict.values())
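A small sketch that gets both the largest value and the key it belongs to. The flux dictionary is made up, and passing key=fluxes.get to max is one common way to find the key.

```python
fluxes = {"R1": 0.5, "R2": 3.2, "R3": 1.1}

max_val = max(fluxes.values())          # largest value
max_key = max(fluxes, key=fluxes.get)   # key whose value is largest

print(max_key, max_val)
```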
190
how to create a new dict with an already existed dict which was filtered for none values?
res = {k:v for k,v in kwargs.items() if v is not None}
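A runnable sketch of the comprehension above inside a hypothetical helper, drop_none. It shows why `is not None` is used instead of a plain truth test: values such as 0 and "" are kept.

```python
def drop_none(**kwargs):
    """Return the keyword arguments whose value is not None."""
    return {k: v for k, v in kwargs.items() if v is not None}

res = drop_none(lower=0, upper=None, name="")

# A plain truth test would also drop 0 and the empty string.
truthy_only = {k: v for k, v in {"lower": 0, "upper": None, "name": ""}.items() if v}

print(res, truthy_only)
```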
191
the function for printing out the exchange fluxes and the flux for OF in cobrapy?
model.summary( )
192
how to fix the original OF as an additional constraint in cobrapy?
cobra.util.solver.fix_objective_as_constraint(model, fraction=1, bound=None, name='fixed_objective_{}') Fix current objective as an additional constraint. When further constraints or a new objective are added to a model, the resulting solutions may sacrifice too much of the original objective. To avoid that, we can fix the current objective value as a constraint to ignore solutions that give a lower (or higher depending on the optimization direction) objective value than the original model. Parameters: model (cobra.Model): The model to operate on. fraction (float): The fraction of the optimum the objective is allowed to reach. bound (float, None): The bound to use instead of fraction of maximum optimal value. If not None, fraction is ignored. name (str): Name of the objective. May contain one {} placeholder which is filled with the name of the old objective.
193
how to store all the metabolites of the model in a single variable?
mets= [i for i in model.metabolites]
194
how to exit conda environment and go back to the main command line?
conda deactivate
195
review addition of 9
search about it
196
how to calculate the time it takes to run a command?
put %%time at the beginning of the Jupyter cell (use %time in front of a single line)
197
how to check the type of the elements in an np array
weights.dtype
198
how to download a file from internet via numpy?
import urllib.request | urllib.request.urlretrieve('url of the file', 'local_file_name') | this uses Python's standard urllib module rather than numpy; after running the code you will see the file in the main page of jupyter notebook
199
how to load data from a text file using numpy?
np.genfromtxt('file_name', delimiter=',', skip_header=1)
200
how to get the reactions associated with a special metabolite in cobrapy?
model.metabolites.get_by_id('met_id').reactions
201
how to access a group of columns and rows in a pandas dataframe?
df.loc['cobra', 'shield'] | gets the value of the cobra row and shield column