Marinka Zitnik

Fusing bits and DNA

BioSNAP Datasets: Stanford Biomedical Network Dataset Collection

We are announcing a repository of biomedical network datasets, BioSNAP Datasets: Stanford Biomedical Network Dataset Collection!

BioSNAP aims to bring biological and medical datasets closer to computer scientists who develop exciting new algorithms. Computer scientists typically have no background in bioinformatics or biostatistics, so they often find it very difficult to obtain and construct high-quality biomedical datasets. Because of that, biomedical datasets are rarely used in ML algorithm development and benchmarking, even though biomedicine is one of the most exciting domains for ML, with a unique set of challenges, hard and important problems, and huge potential impact. BioSNAP aims to close this gap by providing a number of ready-to-use network datasets.

BioSNAP contains many large biomedical networks that are ready to use for method development, algorithm evaluation, benchmarking, and network science analyses. In this first release, BioSNAP has a few tens of network datasets that describe a dozen different entity types (e.g., genes, proteins, cells, drugs, diseases, side effects, tissues). These datasets can be used for standard prediction tasks (node clustering, link prediction, node classification) as well as relatively new tasks (graph-level classification, multi-relational link prediction, higher-order association prediction). Many datasets contain weighted networks and can be used to define multi-layer/heterogeneous graphs with attributes.
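
To make the link prediction task mentioned above concrete, here is a minimal Python sketch. It assumes a weighted network stored as a tab-separated edge list (`node_a`, `node_b`, `weight`); that layout, the node names, and the weights are all made up for illustration and are not taken from any BioSNAP file. The common-neighbors score is just one simple baseline, not a recommended method.

```python
import io
from collections import defaultdict
from itertools import combinations

# Toy edge list: hypothetical identifiers and weights, NOT real BioSNAP data.
TOY_EDGES = """\
# node_a\tnode_b\tweight
geneA\tgeneB\t0.9
geneA\tgeneC\t0.4
geneB\tgeneC\t0.7
geneC\tgeneD\t0.8
geneB\tgeneD\t0.5
"""


def load_weighted_graph(handle):
    """Read 'u<TAB>v<TAB>weight' lines into a symmetric adjacency dict."""
    adj = defaultdict(dict)
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        u, v, w = line.split("\t")
        adj[u][v] = float(w)
        adj[v][u] = float(w)  # undirected: store both directions
    return dict(adj)


def common_neighbor_scores(adj):
    """Score each unconnected node pair by its number of shared neighbors."""
    scores = {}
    for u, v in combinations(sorted(adj), 2):
        if v not in adj[u]:
            scores[(u, v)] = len(adj[u].keys() & adj[v].keys())
    return scores


graph = load_weighted_graph(io.StringIO(TOY_EDGES))
candidates = common_neighbor_scores(graph)
```

In a real benchmark one would hide a fraction of the known edges, score all candidate pairs, and check how highly the hidden edges rank.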

I look forward to seeing more biomedical network data considered in machine learning and data science research.

 

CG: L-Systems Fractal Generation of 3D Objects

One of the courses I took this semester was Computer Graphics (CG).

I have spent some time studying algorithmic botany and especially L-systems, formal grammars for describing fractal objects. These can be used to generate objects in biology and botany, and even buildings and entire cities. Rome Reborn is an example of such a project, in which formal grammars were used to create a 3D digital model illustrating the urban development of ancient Rome.
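
As a rough illustration of how an L-system works, here is a minimal Python sketch (not the OpenGL/LWJGL code from this project). `expand` rewrites every symbol of a string in parallel, and `turtle_segments` turns the result into line segments using the usual turtle reading of the symbols (`F` draws forward, `+` and `-` turn, `[` and `]` save and restore the state). The sketch is 2D for brevity, and the function names, the branching rule, and the 25.7 degree angle are illustrative choices.

```python
import math


def expand(axiom, rules, iterations):
    """Rewrite every symbol in parallel, `iterations` times."""
    current = axiom
    for _ in range(iterations):
        # Symbols without a rule are copied unchanged.
        current = "".join(rules.get(symbol, symbol) for symbol in current)
    return current


def turtle_segments(commands, angle_deg=25.7, step=1.0):
    """Interpret an L-system string as 2D turtle moves; return line segments."""
    x, y, heading = 0.0, 0.0, 90.0  # start at the origin, pointing up
    stack, segments = [], []
    for command in commands:
        if command == "F":  # draw one step forward
            nx = x + step * math.cos(math.radians(heading))
            ny = y + step * math.sin(math.radians(heading))
            segments.append(((x, y), (nx, ny)))
            x, y = nx, ny
        elif command == "+":  # turn left
            heading += angle_deg
        elif command == "-":  # turn right
            heading -= angle_deg
        elif command == "[":  # start a branch: remember the current state
            stack.append((x, y, heading))
        elif command == "]":  # end a branch: return to the saved state
            x, y, heading = stack.pop()
    return segments


# Lindenmayer's algae system: A -> AB, B -> A
ALGAE_RULES = {"A": "AB", "B": "A"}
algae = expand("A", ALGAE_RULES, 4)

# A simple branching plant: every F sprouts a left and a right twig.
PLANT_RULES = {"F": "F[+F]F[-F]F"}
plant = expand("F", PLANT_RULES, 2)
segments = turtle_segments(plant)
```

Each segment could then be handed to any renderer; moving to 3D mainly means replacing the single heading angle with a full orientation.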

So I have decided to visualize some 3D fractal objects using OpenGL and the LWJGL library. Below are links to a short report and a presentation. Take a look :)

For those of you who are interested, there is a great book on this topic co-authored by the father of algorithmic botany, Aristid Lindenmayer:

Prusinkiewicz, Przemyslaw; Aristid Lindenmayer (1990). The Algorithmic Beauty of Plants (The Virtual Laboratory). Springer-Verlag. ISBN 0-387-97297-8

 

XML DB

I have prepared a presentation on XML DBs, which I will give tomorrow, 1 December 2010, in the Basics of Database Systems course.

XQuery Examples.

 

Nature Communications: General Method to Denoise Biological Networks

Technical noise in experiments is unavoidable, but it introduces inaccuracies into the biological networks we infer from the data.

In this Nature Communications paper, we introduce a diffusion-based method for denoising undirected, weighted networks, and show that it improves the performance of downstream analyses, including prediction of gene functions, interpretation of noisy Hi-C contact maps, and fine-grained identification of species.
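
To give a feel for what diffusion on a weighted network does, here is a minimal Python sketch. It is not the method from the paper: a plain random walk with restart is swapped in as a generic stand-in, and the function names, the `restart` and `steps` parameters, and the toy weights are all assumptions made for illustration.

```python
def row_normalize(weights):
    """Turn a symmetric weight matrix into a row-stochastic transition matrix."""
    transition = []
    for row in weights:
        total = sum(row)
        transition.append([w / total if total else 0.0 for w in row])
    return transition


def matmul(a, b):
    """Plain-Python matrix product, fine for small toy networks."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def diffuse(weights, restart=0.5, steps=50):
    """Random walk with restart from every node (a stand-in, not the paper's method).

    Row i of the iterate is the walker distribution when starting at node i.
    The result is averaged with its transpose so it stays undirected.
    """
    n = len(weights)
    transition = row_normalize(weights)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    scores = identity
    for _ in range(steps):
        walked = matmul(scores, transition)
        scores = [[(1 - restart) * walked[i][j] + restart * identity[i][j]
                   for j in range(n)] for i in range(n)]
    return [[0.5 * (scores[i][j] + scores[j][i]) for j in range(n)]
            for i in range(n)]


# Toy network (made-up weights): two strongly linked pairs, 0-1 and 2-3,
# joined by one weak edge 1-2 that plays the role of noise.
NOISY = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.1, 0.0],
    [0.0, 0.1, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]
smoothed = diffuse(NOISY)
```

After diffusion, pairs inside a tightly connected group score much higher than pairs joined only by the weak edge, which is the intuition behind using diffusion to separate signal from noise.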

 

ISMB 2018: Polypharmacy Side Effects

I presented our work on predicting polypharmacy side effects at ISMB/ECCB in Chicago, IL, USA. Here are the slides.

This work has been highlighted in Stanford News, covered by several other news outlets, and is the most read paper in Bioinformatics. Check out Stanford's news story for a layperson's description of this project.

 

