Marinka Zitnik

Fusing bits and DNA

ACM XRDS: Dynamics of News from The New York Times

A new issue of ACM XRDS is here! This issue focuses on techniques for natural language processing in the broader sense. You will find stories about how to detect influencers in social media discussions and how to transition successfully from academia to entrepreneurship, and you can read about the hurdles and opportunities in research on ancient written languages.

My department contributed a short column on exploring news from The New York Times. Techniques of information extraction and natural language processing allow us to search for news articles and to analyze the dynamics of published content. News organizations, such as The New York Times, provide programmatic access to their articles for retrieving headlines, abstracts, and links to published multimedia. In the column we use The New York Times Article Search API to demonstrate how to construct search queries that retrieve documents from various news sections and time periods. We also explore the pulse of climate change over the years using data extracted from published news articles.
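
As a rough illustration of the kind of query construction the column describes, here is a minimal Python sketch that only builds a request URL and does not contact the service. The endpoint, the parameter names (q, fq, begin_date, end_date, page, api-key) and the section_name filter field reflect my understanding of version 2 of the Article Search API and should be treated as assumptions to check against the current documentation. The function name build_search_url and the placeholder key are hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint of the Article Search API (version 2); verify before use.
BASE_URL = "https://api.nytimes.com/svc/search/v2/articlesearch.json"


def build_search_url(query, api_key, section=None,
                     begin_date=None, end_date=None, page=0):
    """Return a request URL for a keyword search.

    The search can optionally be limited to one news section and to a
    date range. Dates are strings in YYYYMMDD form, e.g. "20140101".
    """
    params = {"q": query, "page": page, "api-key": api_key}
    if section is not None:
        # Filter queries are assumed to use Lucene syntax on article fields.
        params["fq"] = 'section_name:("{}")'.format(section)
    if begin_date is not None:
        params["begin_date"] = begin_date
    if end_date is not None:
        params["end_date"] = end_date
    return BASE_URL + "?" + urlencode(params)


# "YOUR_API_KEY" is a placeholder, not a working key.
url = build_search_url("climate change", "YOUR_API_KEY", section="Science",
                       begin_date="20140101", end_date="20141231")
print(url)
```

Requesting one such URL per year and counting the returned hits would be one simple way to trace how coverage of a topic changes over time.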

Last Updated on Friday, 21 August 2015 15:00
 

@Heidelberg Laureate Forum 2014

I recently participated as a young researcher in computer science at the Heidelberg Laureate Forum. I encourage the reader to check the recordings of some of the talks, which are available on the official HLF website. If you are short on time, I recommend at least one of the talks by Michael Atiyah, Manuel Blum, Wendelin Werner, Vint Cerf, Leslie Lamport, Manjul Bhargava, Daniel Spielman, Efim Zelmanov or John Hopcroft. They are engaging, full of useful tips and strategies, and should be accessible to an interested listener.

Many CS & Math bloggers followed the event; their comments and discussions about the laureates' talks can be found at HLF Blogs. Among others, our poster was highlighted by John D. Cook. Overall, HLF was an awesome experience for me, with many opportunities to network with Turing, Fields, Abel and Nevanlinna laureates and to meet other young researchers in computer science and mathematics from around the world.

Last Updated on Sunday, 28 September 2014 20:13
 

Google Global Planning Committee for Women in Computer Science

I have been given an opportunity to join the Google Global Planning Committee for Women in Computer Science in an effort to identify ways we can have the greatest impact and reach more women in tech. As a member of this committee I will partner with Google to build the community and direct outreach activities for women in computer science. To kick things off, we will have our global meeting at the Grace Hopper Conference in Phoenix, AZ, USA. I am excited to be part of this great program, which encourages women to excel in computer science and information technology.

Stay tuned: there will be many opportunities to engage with fellow technologists!

Last Updated on Tuesday, 16 September 2014 16:25
 

@Stanford University, Department of Computer Science

I am visiting the Department of Computer Science at Stanford University, CA, USA in Summer and Fall 2014. During my stay we will study the interplay between network analysis, data integration and biology. There are many exciting challenges one can explore in these areas and I am very enthusiastic about the work.

Last Updated on Thursday, 21 August 2014 05:53
 

ISMB 2014: Epistasis-Based Gene Network Inference

I presented our recent approach for epistasis-based gene network inference at ISMB 2014. We propose a factorized model of interactions that is used for scoring different types of gene-gene relationships, such as epistasis, parallelism and partial interdependence, and for assembling gene networks that are consistent with the estimated pairwise relationships. A detailed derivation of the method and its empirical comparison with existing approaches are described in our paper published in Bioinformatics.

Last Updated on Thursday, 09 July 2015 15:08
 

CAMDA 2014: Survival Regression by Data Fusion

At CAMDA 2014 I presented an extension of our recent matrix factorization-based data fusion approach that couples data fusion with survival regression. CAMDA 2014 ran as a satellite meeting of ISMB 2014 in Boston, MA, USA. Our presentation received the CAMDA best presentation award.

Any knowledge discovery could in principle benefit from the fusion of directly or even indirectly related data sources. In this work, we explore whether a recently proposed data fusion approach based on simultaneous matrix factorization could be adapted for survival regression. We propose a new method that jointly infers latent factors by data fusion and estimates the regression coefficients of a survival model. We have applied the method to the CAMDA 2014 large-scale Cancer Genomes Challenge and modeled survival time as a function of gene, protein and miRNA expression data, and data on methylated and mutated regions. We find that both the joint inference of factors and regression coefficients on one side and the data fusion procedure on the other are crucial for performance. Our approach is substantially more accurate than the baseline, Aalen's additive model. Latent factors inferred by our approach could be mined further; we found that the most informative factors are related to known cancer processes.

Last Updated on Thursday, 09 July 2015 15:08
 

Gene network inference by probabilistic scoring of relationships from a factorized model of interactions

Bioinformatics just published a special issue devoted to ISMB 2014 proceedings papers that will be presented next month at the world's premier conference on computational biology -- ISMB 2014 in Boston, MA, USA.

Our paper, Gene network inference by probabilistic scoring of relationships from a factorized model of interactions, which you will find in this issue of Bioinformatics, describes Red, a conceptually new probabilistic approach to gene network inference from quantitative interaction data. Red is founded on epistasis analysis. Epistasis analysis is an essential tool of classical genetics for inferring the order of function of genes in a common pathway. Typically, it considers single and double mutant phenotypes and, for a pair of genes, observes whether a change in the first gene masks the effects of the mutation in the second gene. Despite the recent emergence of biotechnology techniques that can provide gene interaction data on a large, possibly genomic scale, very few methods are available for quantitative epistasis analysis and epistasis-based network reconstruction.
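
To make the classical masking test concrete, here is a minimal Python sketch. It is not Red, which scores relationships probabilistically from a factorized model; it only illustrates the textbook comparison of single and double mutant phenotypes. The function name, the single scalar phenotype and the tolerance threshold are all assumptions made for illustration.

```python
def epistatic_gene(single_a, single_b, double_ab, tolerance=0.1):
    """Guess which gene masks the other from three phenotype measurements.

    Returns "a" if the double mutant resembles the single mutant of gene a
    (the mutation in a masks the mutation in b), "b" in the mirrored case,
    and None when the double mutant resembles both or neither single mutant.
    """
    like_a = abs(double_ab - single_a) <= tolerance
    like_b = abs(double_ab - single_b) <= tolerance
    if like_a and not like_b:
        return "a"
    if like_b and not like_a:
        return "b"
    return None


# Hypothetical phenotype values, e.g. normalized colony sizes.
print(epistatic_gene(single_a=0.2, single_b=0.9, double_ab=0.25))
```

A hard threshold like this is fragile on noisy measurements, which is one reason a probabilistic scoring of relationships is attractive.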

The features of Red are joint treatment of the mutant phenotype data with a factorized model and probabilistic scoring of pairwise gene relationships that are inferred from the latent gene representation. The resulting gene network is assembled from scored pairwise relationships. In an experimental study, we show that the proposed approach can accurately reconstruct several known pathways and that it surpasses the accuracy of current approaches.

Last Updated on Wednesday, 13 August 2014 05:21
 

ACM XRDS: Exploring Data with Topological Tools

The Summer issue of ACM XRDS is here! This issue focuses on diversity in computer science. You will find columns about how to make tech more inclusive, women in computing, self-teaching and how hip-hop lyrics can be used in combination with artificial intelligence to engage more students in computer science. Also, you should not miss the Features section! There you will learn, among other things, about a research project in Germany that integrates gender and diversity in STEM fields, and read about how neuroscience has revealed that we sometimes judge others by their gender or ethnicity without even realizing it. What can be done to address these issues? Check out ACM XRDS's advice.

For the computationally inspired among you, I have contributed a column that describes one of the many possible uses of computational topology for exploratory data analysis. Tools from topology increasingly serve to inspire the development of novel computational methods for data analysis. With these methods we can study qualitative geometric information about the data to understand how the data are organized on a large scale, and focus on intrinsic shape properties rather than on characteristics that depend on a particular choice of coordinate system. The column applies a topological tool called Mapper to extract and visualize simple descriptions of data sets.
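
For readers who want a feel for how Mapper works, the following is a minimal pure-Python sketch of its usual three steps: compute a filter value for every point, cover the range of filter values with overlapping intervals, and cluster the points that fall in each interval. The clusters become nodes of a graph, and two nodes are joined when they share a point. The function names, the parameter defaults and the simple distance-threshold clustering are assumptions made for illustration and are not taken from the column.

```python
import math
from itertools import combinations


def clusters_within(points, indices, eps):
    """Group indices into connected components, linking points at most eps apart."""
    remaining = set(indices)
    clusters = []
    while remaining:
        seed = remaining.pop()
        component, frontier = {seed}, [seed]
        while frontier:
            i = frontier.pop()
            near = {j for j in remaining
                    if math.dist(points[i], points[j]) <= eps}
            remaining -= near
            component |= near
            frontier.extend(near)
        clusters.append(frozenset(component))
    return clusters


def mapper_graph(points, filter_fn, n_intervals=4, overlap=0.25, eps=0.3):
    """Return (nodes, edges); nodes are clusters of point indices."""
    values = [filter_fn(p) for p in points]
    lo, hi = min(values), max(values)
    length = (hi - lo) / n_intervals
    nodes = []
    for k in range(n_intervals):
        # Widen each interval on both sides so that neighbours overlap.
        start = lo + k * length - overlap * length
        end = lo + (k + 1) * length + overlap * length
        inside = [i for i, v in enumerate(values) if start <= v <= end]
        nodes.extend(clusters_within(points, inside, eps))
    # Two clusters are connected when they have at least one point in common.
    edges = {(a, b) for a, b in combinations(range(len(nodes)), 2)
             if nodes[a] & nodes[b]}
    return nodes, edges


# 40 points sampled from the unit circle, filtered by their height (y value).
circle = [(math.cos(2 * math.pi * i / 40), math.sin(2 * math.pi * i / 40))
          for i in range(40)]
nodes, edges = mapper_graph(circle, filter_fn=lambda p: p[1])
print(len(nodes), "nodes,", len(edges), "edges")
```

With these settings the circle should be summarized by a small graph that forms a single loop, which is the kind of coordinate-free shape description the paragraph above refers to. The result depends on the choice of filter, cover and clustering threshold, so it is worth trying several.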

Last Updated on Friday, 21 August 2015 15:01
 

Young Researcher in the Heidelberg Laureate Forum 2014

I have been selected to participate as a young researcher in the Heidelberg Laureate Forum 2014 (HLF). The Forum will take place in September and will bring together winners of the Abel Prize and Fields Medal (mathematics) as well as the Turing Award and Nevanlinna Prize (computer science) with young researchers from around the world, selected by an international committee of experts primarily from the award-granting organizations. I was fortunate to be given the opportunity to be one of the 200 young researchers (there are 100 places for each of the two disciplines, mathematics and computer science) who will be part of this Forum.

The HLF is an event inspired by the Lindau Nobel Laureate Meetings, which provide a forum where people dedicated to science, both role models and young researchers in physics, chemistry and life sciences, can interact. This event spawned the idea of creating something similar for the scientific disciplines of mathematics and computer science. The list of participating Laureates is impressive and includes, among others, Manuel Blum, Stephen Cook, Antony Hoare, John Hopcroft, Leslie Lamport, John Torrence Tate and Wendelin Werner. I am looking forward to meeting these distinguished experts from both disciplines and learning many new things.

Last Updated on Friday, 21 August 2015 16:06
 

ACM XRDS: Efficient Sensor Placement for Environmental Monitoring

The Spring 2014 issue of XRDS: Crossroads, the ACM magazine for students, is about cyber-physical systems.

My XRDS department contributed a column on efficient sensor placement for environmental monitoring. The column is about the important problem of observation selection, which has received considerable research attention in recent years. Consider, for example, air quality monitoring in a large research lab, the monitoring of algae biomass in a lake, or the placement of a network of sensors in a water distribution system for early detection of contaminants. In all these settings we have to decide where to place the sensors in order to collect information about the environment effectively. Since acquiring observations is typically expensive and we have a limited budget, we want to select a small number of the most informative locations for monitoring. Thus, we usually trade off the informativeness of sensor measurements against the cost of data acquisition. The column gives an example of a large sensor deployment in a research lab and applies tools of submodular optimization to tackle the task effectively, with theoretical performance guarantees of near-optimal observation selection.
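
The greedy strategy that typically underlies such guarantees can be sketched in a few lines of Python. The sketch below uses a simple coverage objective (the number of distinct zones observed by the chosen sensors) as a stand-in for whatever informativeness measure the column uses; this substitution, the function name and the toy lab layout are assumptions for illustration. Coverage is monotone and submodular, and for such objectives under a cardinality budget the greedy choice is known to achieve at least a (1 - 1/e) fraction of the optimal value.

```python
def greedy_placement(coverage, budget):
    """Pick up to `budget` locations, each time adding the one
    with the largest marginal gain in newly covered zones."""
    chosen, covered = [], set()
    for _ in range(budget):
        best, best_gain = None, 0
        # Sorting makes tie-breaking deterministic (alphabetical order).
        for location in sorted(coverage):
            if location in chosen:
                continue
            gain = len(coverage[location] - covered)
            if gain > best_gain:
                best, best_gain = location, gain
        if best is None:  # no remaining location adds anything new
            break
        chosen.append(best)
        covered |= coverage[best]
    return chosen, covered


# Hypothetical lab layout: candidate sensor location -> zones it can observe.
coverage = {
    "door": {"z1", "z2"},
    "kitchen": {"z2", "z3", "z4"},
    "server_room": {"z5"},
    "hallway": {"z1", "z4", "z5"},
}
chosen, covered = greedy_placement(coverage, budget=2)
print(chosen, sorted(covered))
```

In this toy layout two sensors are enough to observe all five zones. A real deployment would replace the coverage sets with a measure of information gain and could attach a different cost to each location.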

Last Updated on Friday, 21 August 2015 15:01
 

@RECOMB 2014, Pittsburgh, PA (Part II)

We are presenting a poster about our recent data fusion methodology (ArXiv preprint) at the RECOMB conference. Thanks to Prof. Blaz Zupan for the storyline and to Prof. Richard H. Kessin for valuable comments. xkcd.com served as the inspiration for the poster design (HiRes). See also the other post (Part I) about our RECOMB paper.

Best Poster Award at RECOMB 2014!

Last Updated on Sunday, 14 June 2015 10:52
 

