Data Science Center Tilburg

Understanding a data-driven society

Research Data Science Center Tilburg

DSC/t will undertake research in data science, fusing scientific discourses that were until now completely isolated, viz., psychology, business, law, statistics, mathematics, computer science, information systems, cognitive science and ethics. Tilburg University (TiU) intends to create a research culture that focuses on excellence and discovery, high-impact scientific breakthroughs essential to innovation, and knowledge transfer and commercialization activities that tackle key societal issues which transcend disciplinary boundaries.

DSCT research profile

The discourse of data science is vast and enormously complex and spans an immense and diverse spectrum of literature, in origin and in character. It encompasses many concepts and technologies that find their origins in diverse disciplines that are woven together in an intricate manner. This constitutes a serious challenge for the development of new generation data science solutions and technologies. There are two main problems that require integration of research expertise and intense collaboration of researchers: research fragmentation and research gaps.

Currently, research activities are very fragmented: each research group concentrates mostly on its own specific research techniques, mechanisms and methodologies, which are not fully utilized by or aligned with the efforts of others. As a consequence, proposed solutions are not aligned with or influenced by activities in related research fields. Only through a meaningful cross-fertilization of principles and ideas, and closer collaboration between these diverse research groups, can problems such as these be avoided. The next generation of research therefore requires a holistic view of and approach to data science research.

Collaborating researchers

The aim of DSC/t is to create a uniform workplace culture by determining research opportunities and setting priorities through aligning, shaping and integrating the research agendas of key scholars at the diverse schools at TiU, working in different research areas. Drawing on bonding and bridging ties between researchers to achieve a sense of belonging, synergy and a broadening of skills is necessary for achieving excellence. To achieve long-term integration, DSC/t intends to further develop its intellectual expertise by encouraging collaborating scholars to share and integrate facilities and infrastructure across our campus. This infrastructure will be used not only for underpinning research but also for application measurements, testing, sampling, assessment, and analysis. The benefits of a shared, integrated DSC/t technical infrastructure include:

  1. Creating strong, long-lasting and evolving links among different research communities to ensure durability and sustainability of the DSC/t research objectives.

  2. Performing experimentation on integrated data solutions, which complements the traditional research approaches undertaken by DSC/t’s scholars.

This will lead to rationalization, consolidation and harmonization of research results and streamlining of research activities.

Indeed, it is the objective of the DSC/t to establish its position as a leading research-intensive institute and to attain high levels of international visibility, scientific rigour and industrial connectivity. In particular, the DSC/t will:

  • Establish a research community where the notion of bold and creative thinking, excellence and discovery pervades all research and results are innovative, of high impact, and highly inter-disciplinary in data science - aligned with business, law, social and behavioral sciences,
  • Consolidate and improve the research image in areas where TiU traditionally excels and supplement it with additional expertise required to achieve the highest standing possible in its field in Europe,
  • Reach excellence and attractiveness in research, innovation, valorization and industrial competitiveness in the Netherlands and worldwide,
  • Identify and introduce high-impact, multi-disciplinary thematic areas of research, which cut across major disciplines and promote the progress of knowledge and technology in these areas,
  • Significantly increase the number of researchers performing at, above or well above world-class levels, and progressively increase publications in high-quality, high-impact journals (in terms of impact factors),
  • Place increased emphasis on internationalization by forming long-lasting partnerships with high-ranking universities around the world,
  • Apply for joint research grants with industry in the Netherlands, and with other universities and research organizations both within Europe (Horizon 2020) and overseas,
  • Increase the number of PhD students jointly supervised with other universities (TU/e) and research organizations in the Netherlands, Europe and overseas.

DSCT research framework

The fundamental scientific knowledge discourses are:

1.  People

This discourse studies the impact of data science on humans, considering the sociological, psychological and cognitive phenomena involved with data science and addressing important topics such as neuroscience, language and culture, speech processing/recognition, Human Technology Interaction, and (interpretation of) conceptualizations/signs. In particular, this discourse contributes largely to two aspects of data science, viz., human-technology interaction and artificial intelligence.

a.  Human-Technology Interaction (HTI): the field of Human-Technology Interaction aims at improving our understanding of both the human and technological aspects involved with novel data science technologies, methods and solutions. Aspects under consideration to optimize their uptake and usage include, but are not restricted to: usability, ergonomics, perception, consciousness/awareness, and social/environmental psychology.

b. Artificial Intelligence: artificial intelligence is a field of scientific study that captures human intelligence in computing models, which can be exploited to improve our understanding of how people think and make decisions (e.g., based on data provided). Notably, computational modeling may be applied to investigate how humans structure, combine and process (Big) data collections.

2.  Rules

The fundamental discourse of rules encompasses the following three dimensions: law, ethics and regulation:

a. Legal: knowledge about IP issues and privacy laws and considerations, e.g., to exploit novel (Big data) technology to help in enhancing the protection of privacy (privacy by design).

b. Ethical: data science very much involves important ethical considerations such as confidentiality, ownership, transparency and personal identity, both from a societal perspective and from an individual perspective. For example, a data scientist/engineer/analyst may ask herself whether there are any limits, and if so which, to making inferences about people (e.g., patients), and which decisions may be taken using such inferences. Moreover, being a scientist also requires a strong ethical code whose principles build trust between the scientists, the accountable organization, and broader society.

c. Philosophy of Science (including Research Methods): educating the next generation of data scientists does not merely entail cultivating skill sets and teaching them about the latest insights in data discovery/integration/modeling and analytics for predicting a “better” future, but also, and perhaps more importantly, bringing about a better, e.g., a more social, future. The philosophy underpinning data science deals with existential issues, including the epistemological and ontological postulations underlying the selection and application of data science taxonomies, theories, models and methods. With large data sets it is becoming more important to understand how to run experiments, what the powers of data science are, and in which cases other techniques are better suited to predict and prescribe (e.g., in case of structural changes in the underlying model).

3. Methods

This discourse amalgamates research from the domains of mathematics, statistics, computer science and research methods, concentrating on the essential “tool/method box” for data science.

a. Mathematics: foundations of graphs and networks, mathematical modeling, probability theory, random matrices, linear algebra, optimization, forecasting, discrete (dynamic) systems.

b. Statistics. Statistical theory and tools exploit probability and decision theory, and use computing, analysis, and optimization for improved decision making and dealing with uncertainty. This method field addresses the following topics: Predictive Analytics (simulation), statistical machine learning methods for analyzing large-scale datasets, Multivariate Analysis, Time Series, Stochastic models, and Prescriptive Analytics.

c. Research methods. This focuses on digital data science techniques to uncover and answer real-world questions, considering a variety of quantitative, qualitative and design-science oriented research methodologies. This strand of research will also consider philosophical and ethical issues with regard to (Big) data research.

d. Web science. Web science revolves around studying the Web as a socio-technical system, focusing on human-system interactions and combining mathematics and computer science with sociology and economics.

e. Automated software services. The field of Service Oriented Computing has fundamentally changed the socio-economic fabric of software development by enabling developers to dynamically grow application portfolios (in the DSCT application domains) more quickly than ever before, creating solutions that transcend company borders by combining existing organizational assets (including business processes and data) with external components possibly residing in collaborating organisations.

f. Distributed (enterprise) computing. The field of distributed enterprise computing deals with theories, methods and tools to make autonomous (software) components collaborate for the purpose of accomplishing a specific business process or task, such as querying over distributed repository systems. This includes a variety of technologies including, but not restricted to, semantic web technologies, middleware, complex event processing, algorithms and data structures, databases, general purpose programming languages (Java, Python), and web engineering.

g. Database systems: theories, models and techniques to allow users to define, find, query and manipulate data. In particular, techniques for dealing with large data collections are considered, covering data models, data semantics, Big data (base) technologies (MapReduce & Hadoop), data integration, data querying, process mining, and data mining.

h. Data visualization. This method deals with presenting data in some graphical, pictorial or textual form to the user. Aspects that are considered include dash-boarding datasets, multi-model visualizations, visualization for various stakeholders (laypeople and professionals), pattern recognition, and trending.

4.  Value

The top layer of the competency framework that shapes our vision on data science and guides our educational programs deals with (economically) valorizing data science competences within or outside an existing organization. In particular, the Entrepreneurial and Innovation layer comprises the following fundamental building blocks:

a. Data Entrepreneurship: Revolves around the development of a set of skills that are essential for the creation of a sustainable startup company in the field of data science. This involves, for example, creativity, identification of opportunities, validation, evaluation and decision-making, establishing a network, obtaining resources, and management.

b. Corporate Entrepreneurship: this building block addresses entrepreneurship in well-established companies. Issues include, but are not restricted to, creativity, leadership, and culture to innovation through corporate venturing, strategic alliances, mergers and acquisitions.

c. Creativity and Innovation: In order to survive in today's increasingly competitive and global marketplace of data science, startups and enterprises need to be creative and innovative, creating something novel and value-adding instead of simply copying the solutions of competitors.

d. Open Innovation: The closed innovation model, according to which an innovation is developed within the boundaries of one organization, is increasingly questioned. Instead, the model of open innovation has emerged. Open innovation involves the exploration of internal and external sources of innovations, for example through co-creation with lead users and customers, interfirm networks, and business ecosystems, as well as the internal and external exploitation of innovations, for example through licensing, joint ventures, and spinoffs.

The DSCT research framework defines the following application domains:

A. Consumer behavior

This domain concerns the accurate analysis of how brands can link with their target consumers and market segments and add value for their customers. By exploiting data science, marketing is utilizing circular data analysis, which involves understanding customer buying behaviors, customer retention, the pathways customers follow in order to purchase goods/services, and preferences for a specific product or service.

B. Smart industries (“Industrie 4.0”)

Smart industries mark a decisive step towards the new manufacturing era predicted by experts in the North American and German manufacturing industries, who have forecast that in the future businesses will establish global networks that incorporate their machinery, warehousing systems, and production facilities in the shape of Cyber-Physical Systems, exploiting the Internet of Things (IoT). Key data science-related challenges include: enhancing manufacturing network visibility, information sharing, manufacturing process integration, and insights for making informed decisions, so that the production line is not hampered and overall production runs according to plan.

C. Health analytics

The smarter approach to healthcare is one that turns data and processes into clinical and business insights for better outcomes. Smartness in medical services lies in providing a more efficient way to store, distribute, share and integrate medical data, documents and digital images, and in making the information more accessible to the full range of healthcare providers for improved decision making. Smarter healthcare cloud applications can seamlessly deliver integrated care centered on the patient. Today, real-time data can be easily collected from smartphones/tablets and/or sensors attached to patients, enabling, for example, more personalized care and cure (exploiting data from smart devices to support self-management and improve the well-being of patients). In addition, health care analytics may help foster clinical and programmatic insights.

D. Human Capital & Labor Market

These days, huge heaps of data, e.g., from Web 2.0 applications and internal data resources, may be harvested for the purpose of talent analytics/retention, predictive hiring, and screening of candidates. In particular, HR Analytics aims at harnessing and analyzing (structured and unstructured) collections of organizational data considering issues such as work, employee wellbeing, impact of training on the organization, and workforce performance.

E. Smart cities

The pressure on cities today is to operate more efficiently and at the same time improve their services to citizens and deliver services in urban environments. Cities are based on a number of different systems and services central to their operation and development, e.g., city, citizen, business, transport, communication, water and energy services, which must be considered holistically as well as individually. This research field aims at developing novel methods, tools and solutions for smart(er) cities, adopting a socio-legal-technical-economic perspective to provide better and cheaper public services to citizens, allow co-creation, and enhance insight in governmental organizations by public administrations and management.

F. Finance institutions

This application domain allows business analysts/data scientists to develop and evolve complex models of the financial processes in companies to provide insight into business opportunities and associated risks, exploiting, for example, predictive and prescriptive modeling techniques.

G. Legal analytics

Data science is likely to act as both a disruptive and an enabling technology in the field of law. It will affect not only the production and delivery of legal services, but also the legal infrastructure (i.e., legal education, the regulation of legal services and the legal profession), legal business structures (new markets and actors), and legal experts within governmental services (legislatures).