About Data Science

Data science, the use of data to understand complex systems and make predictions, has been around for some time. More data makes it possible to apply more sophisticated methods of analysis, which in turn allows us to understand these systems with much higher precision and to gain completely new insights.

What is new is that, due to the exponential growth of data produced by modern technology in every aspect of life, data science methods that were formerly used only at CERN, and later at places like Google and social media companies, are becoming accessible to ordinary businesses, bringing with them a boost in efficiency and productivity.

A data scientist has to master a large set of tools from the areas of programming, statistics and machine learning. New university courses (1, 2), which aim to provide this kind of knowledge in a short time, are currently being set up.

However, the most important characteristic of a successful data scientist is the mind of a researcher: a curious, out-of-the-box thinker and problem-solving generalist. It is not uncommon for different people who are given the same data set to see different things in it. This ability to ‘see’ something where others do not comes only with talent and many years of experience.

This is exemplified in finance, the first non-scientific field to enter the area of big data. While it is easy to be fooled by the apparent randomness of the data, or by apparent patterns that do not persist into the future, there are some who are able to see systematic patterns in the same data, which today is publicly available to everyone, and build extremely profitable trading strategies out of it.

The same kind of data-driven insight will become increasingly important in other areas of the business world and a determinant of success.


About this blog

Hi everybody!

I am very excited about all the new developments in data science that are happening around us. The number of sources of openly available data is increasing fast. Businesses are collecting increasing amounts of data, enabling them to profit from big data analysis techniques in many new ways.

I have been doing large-scale, high-complexity data analysis for nearly ten years in particle physics, finance and economics. You can find out more about me here.

This blog is about the adventures I encounter in exploring the new sources of data that are being created. I will share what I learn along the way, so that more people can profit from it. Hopefully some of it can be put to good use.

Have fun!