Recent Invited Talks & Events
 06/20: NSF Workshop on Future Directions in Network Biology (IN, US)
 05/20: Emerging Academic Symposium, GlaxoSmithKline (MA, US)
 04/20: Graph Exploitation Symposium, MIT Lincoln Laboratory (MA, US)
 04/20: Dagstuhl Seminar, Visualization of Biological Data - From Analysis to Communication (Germany)
 04/20: B3D Seminar, Harvard Biomedical Informatics and Biostatistics (MA, US)
 03/20: AstraZeneca (MA, US)
 03/20: MIT Bioinformatics Seminar, MIT Math & CSAIL (MA, US)
 02/20: Network Science Institute, Northeastern University (MA, US)
 01/20: Applied Machine Learning Days (Switzerland)
 01/20: Beth Israel Deaconess Medical Center, Harvard Newborn Health Services and Epidemiology (MA, US)
 12/19: NeurIPS 2019, Graph Representation Learning (GRL) (Canada)
 10/19: DS 2019, 22nd International Conference on Discovery Science (Croatia)
 09/19: 2019 FDA Science Forum, Advancing Digital Health and Artificial Intelligence (MD, US)
 08/19: Establishing a 2020 Vision for Genomics (MD, US)
 07/19: ISMB/ECCB 2019, Keynote, Machine Learning in Computational and Systems Biology (Switzerland)
 07/19: ISMB/ECCB 2019, Network Biology (Switzerland)
 05/19: Network Medicine: Getting connected: Systems medicine and networks at NetSci 2019 (VT, US)
 05/19: Data Science for Virtual Screening and Drug Discovery (Slovenia)
 02/19: Annual Symposium on Computational, Evolutionary and Human Genomics (CEHG) (CA, US)
 12/18: NeurIPS 2018, Relational Representation Learning (R2L) (Canada)
 11/18: Next Generation in Biomedicine Symposium, Broad Institute of MIT and Harvard (MA, US)
 10/18: Rising Stars in EECS, MIT Department of Electrical Engineering and Computer Science (MA, US)
 10/18: Forum on Drug Discovery, Development, and Translation, The National Academies of Sciences (DC, US)
 09/18: EMBL-EBI Workshop on Machine Learning in Drug Discovery and Precision Medicine (UK)
 09/18: National Guest Scholar at Stanford CERC (CA, US)
 08/18: AI in Medicine: Inclusion & Equity (AiMIE) Symposium 2018 (CA, US)
 08/18: Tech Summit SYNC 2018 (CA, US)
This year I have applied for the Google Summer of Code, namely for the Orange project.
We will see if I am accepted. :)
Update 25.04.2011: Google has announced the results. My proposal has been accepted and I am looking forward to starting work. :)
Some links to articles in Slovenian news:
Project title: Matrix Factorization Techniques for Data Mining
Description: Matrix factorization is a fundamental building block of many current data mining approaches, and factorization techniques are widely used in applications of data mining. Our objective is to provide the Orange community with a unified and efficient interface to matrix factorization algorithms and methods. For that purpose we will develop a scripting library which will include a number of published factorization algorithms and initialization methods and will facilitate combining them to produce new strategies. Extensive documentation with working examples that demonstrate real applications, commonly used benchmark data and visualization methods will be provided to help with the interpretation and comprehension of the results.
The main factorization techniques and their variations planned for inclusion in the library are: Bayesian decomposition (BD) together with linearly constrained and variational BD using Gibbs sampling, probabilistic matrix factorization (PMF), Bayesian factor regression modeling (BFRM), and the family of nonnegative matrix factorizations (NMF), including sparse NMF, nonsmooth NMF, local factorization with Fisher NMF and least-squares NMF. Different multiplicative and update algorithms for NMF, which minimize the LS error or the generalized KL divergence, will be analyzed. Further nonnegative matrix approximations (NNMA) with extensions will be implemented. For completeness, algorithms such as NCA, ICA and PCA could be added to the library.
Here is the proposal document.
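To make the multiplicative NMF updates mentioned in the description a bit more concrete, here is a minimal sketch in plain Python. It assumes the classic Lee-Seung multiplicative rules for the least-squares (Frobenius) error; the function names (`nmf`, `frobenius_error`, and the small matrix helpers) are made up for this illustration and are not the interface of the library proposed above.

```python
import random


def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*A)]


def frobenius_error(V, W, H):
    """Squared Frobenius norm of V - WH (the LS error)."""
    WH = matmul(W, H)
    return sum((v - wh) ** 2 for rv, rwh in zip(V, WH) for v, wh in zip(rv, rwh))


def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate a nonnegative matrix V (n x m) by W (n x rank) times H (rank x m).

    Hypothetical helper: random positive initialization followed by
    Lee-Seung multiplicative updates for the LS error.
    """
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num = matmul(Wt, V)
        den = matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(rh, ra, rb)]
             for rh, ra, rb in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num = matmul(V, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(rw, ra, rb)]
             for rw, ra, rb in zip(W, num, den)]
    return W, H
```

Calling `W, H = nmf(V, rank=2)` on a small nonnegative list-of-lists matrix `V` returns two nonnegative factors. Because the updates only multiply by nonnegative ratios, the factors stay nonnegative without any explicit projection step, which is the main appeal of this family of algorithms.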

I've been following a course on Statistical Aspects of Data Mining lately. The course itself is not what I will write about, but it inspired this article. The software environment used in the course is the R programming language, which is used for statistical computing and graphics (it is available for Windows, Linux and Mac as part of the GNU project). If you download it from R's website, you get it with the command line interpreter; of course there are some IDEs as well, such as Rcmdr or Tinn-R. The capabilities of R are extended with numerous user-submitted packages. For the animation of the Mandelbrot Set at least the following libraries are needed: spam, fields, bitops, caTools. All are freely available at R's website. R is influenced by S and Scheme, but I won't go into details, as there is plenty of information about it on the web.
I tried to draw the classic Mandelbrot Set (the basic code for it is available here), which just iterates the formula $z_{n+1} = z_n^2 + c$, where $c$ is a complex parameter, starting at $z_0 = 0$. The Mandelbrot Set is defined as the set of all points $c$ such that the sequence obtained by this iteration does not escape to infinity. Some of the set's properties are local connectivity, self-similarity, correspondence with the structure of Julia sets, etc. It is a very simple formula that gives fascinating results. In the R language animation you can observe the main cardioid, the period bulbs and the hyperbolic components.
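For readers without R at hand, the iteration can be sketched in a few lines of Python. This is not the R animation code linked above, only a rough escape-time illustration that assumes the standard recurrence z ← z² + c starting from z = 0 and the usual escape radius of 2; the names `escape_time` and `render` and the grid bounds are choices made for this example.

```python
def escape_time(c, max_iter=50, radius=2.0):
    """Count iterations of z <- z*z + c (from z = 0) until |z| exceeds radius.

    Returns max_iter if the orbit stays within the radius, in which case c is
    treated as a member of the set (an approximation for finite max_iter).
    """
    z = 0j
    for n in range(max_iter):
        if abs(z) > radius:
            return n
        z = z * z + c
    return max_iter


def render(width=60, height=24, max_iter=50):
    """Return rows of text: '#' for points judged inside the set, '.' otherwise."""
    rows = []
    for j in range(height):
        y = 1.2 - 2.4 * j / (height - 1)      # imaginary part, top to bottom
        row = ""
        for i in range(width):
            x = -2.0 + 3.0 * i / (width - 1)  # real part, left to right
            inside = escape_time(complex(x, y), max_iter) == max_iter
            row += "#" if inside else "."
        rows.append(row)
    return rows
```

Running `print("\n".join(render()))` draws a coarse text picture in which the main cardioid and the largest period bulb should already be recognizable. Colouring each point by its escape count instead of a plain inside/outside test is one common way to get the familiar banded pictures.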
It has been some time since my last post, but here is a new one. Perhaps the title sounds a bit inappropriate, but it is indeed well suited. Read till the end, where I explain it for those who have not figured it out yet (or consider it a puzzle :))
So, what have I been up to lately? Despite summer holidays I have been involved in quite a few projects.
First, the GSoC project Matrix Factorization Techniques for Data Mining for Orange has been progressing well. The code is almost finished, and no major changes in the framework, factorization/initialization methods, quality measures, etc. are expected. The project is on schedule and has not diverged from the initial plan; all intended techniques (plus a few additional ones I found interesting during my research) are implemented. I have been doing some testing, and have yet to provide more use cases/examples along with a thorough explanation and example data sets. I will not go into details here, as descriptions of the implemented methods with paper references are published at the Orange wiki project site. The project is great, a mix of linear algebra, optimization methods, statistics and probability, and numerical methods (analysis, if you want to read some convergence or derivation proofs), with intensive applications in data mining, machine learning, computer vision, bioinformatics, etc., and I have been really enjoying working on it. Here is my post at the Orange blog. Orange and its GSoCers have been spotlighted at the Google Open Source Blog.
Next, there is some image processing: segmentation, primary and secondary object detection, object tracking, morphology measures, filters, etc. (no details).
A minor one, to keep in touch with the MS world: SharePoint Server 2010 (SP 10). I have some experience with it (and its previous version, MOSS 2007), both in administration and especially in code. This time it was not about coding workflows using Windows Workflow Foundation or developing Web parts/sites/custom content types/web services (...), but about providing an in-site publishing hierarchy for data in custom lists and integration with Excel Services (not with the new 365 cloud service). Obstacles were limited server access (hosting plan), old versions of software and the usual MS stuff (:)). In SP 10 these are SPFieldLookup filters and cascading lookups, and data connections between sites/lists/other content. As always, there are some nice workarounds, which resolved all the issues.
Last (not least) I have been catching up with all the reading material I was forced to put aside during the year (well not entirely true: the more I read, the more should be read, so the pile of papers in iBooks and Mendeley app is not getting any smaller :)).
Here we are, so what about the post's title? The sunrise problem was introduced by Laplace (the French mathematician known for the Bayesian interpretation of probability, the Laplace transform, the Laplace equation, the differential operator, and work in mechanics and physics). Is the probability that the sun will rise tomorrow 1 if we can infer from the observed data that it has risen every day on record? :) So what is the answer to the question in the title? The inferred probability depends on the record: whether we take the past experience of one person, of humanity, or of the Earth's history. This is the reference class problem: with Bayes, any probability is a conditional probability given what a person knows. A simple principle emerged from this, add-one or Laplacian smoothing, which corresponds to the expected value of the posterior. (Example: in spam email classification with a bag-of-words model, or text classification with a multinomial model, it allows the assignment of positive probabilities to words which do not occur in the sample.)
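A small Python sketch may make the add-one idea tangible. It assumes a uniform prior over the unknown probability, under which the posterior mean after s successes in n trials is (s + 1) / (n + 2), Laplace's rule of succession. The function names and the toy vocabulary are invented for this illustration.

```python
from collections import Counter
from fractions import Fraction


def rule_of_succession(successes, trials):
    """Posterior mean of the success probability under a uniform prior."""
    return Fraction(successes + 1, trials + 2)


def add_one_smoothing(tokens, vocabulary):
    """Add-one (Laplace) smoothed word probabilities over a fixed vocabulary.

    Tokens outside the vocabulary are ignored, so the returned
    probabilities sum to exactly 1.
    """
    counts = Counter(t for t in tokens if t in vocabulary)
    total = sum(counts.values()) + len(vocabulary)
    return {word: Fraction(counts[word] + 1, total) for word in vocabulary}


# The sun has risen on each of 10 recorded days: the estimate is 11/12, not 1.
p_sunrise = rule_of_succession(10, 10)

# A word never seen in the sample ("prize") still gets a positive probability.
probs = add_one_smoothing(["free", "money", "free"], {"free", "money", "prize"})
```

Here `p_sunrise` equals 11/12 and `probs["prize"]` equals 1/6. The estimate for the sunrise approaches 1 as the record grows but never reaches it, which is exactly the dependence on the record discussed above.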
