Interview with Speaker Lukas Widmer

Lukas Widmer is Senior Principal Statistical Consultant at Novartis Pharma AG.

Could you tell us a little bit about your professional background?

From a young age I had an interest in computer technology. Initially that led me to work in software engineering while studying Computer Science at ETH Zurich, after which it became clear to me that some of the most challenging and impactful applications where I could contribute were in life science and healthcare – here some exposure to the field of medicine through my family definitely had an influence. This led me to pursue an MSc degree in Computational Bioinformatics at ETH Zurich and UC Santa Barbara and a PhD degree at ETH Zurich in Basel. In my PhD I focused on more interdisciplinary work, interfacing statistical and computational modelling and simulation to further the understanding of basic biology, with the long-term goal of improving treatments. The desire to have a more immediate impact in life science brought me to the Advanced Exploratory Analytics group – part of Advanced Methodology and Data Science at Novartis – which I joined in 2019. I joined Novartis with the double mission to drive the use of innovative methodology – such as data science, modelling and machine learning – across drug development, to deliver science-based progress to our patients and to re-imagine medicine.

What will you be speaking about at the SDS2021?

I will be discussing the importance of developing, propagating and applying Good Data Science Practice in the data science community in general, and in healthcare and pharma in particular. Several recent examples have highlighted the need for this, such as the introduction of unwanted (and potentially unnoticed) bias in Covid risk prediction, which could impact patients in unintended ways, or bias introduced into melanoma recognition using deep learning when surgical skin markings are not accounted for. That any holistic approach to human research must be built on a solid ethical foundation has also been a recent point of discussion in computer science, following the banning of the University of Minnesota from making any further Linux kernel contributions and the subsequent apology, which highlighted our duty to protect human subjects in research. We will discuss the need for Good Data Science Practice from multiple perspectives in the pharmaceutical industry and beyond, and we look forward to your thoughts and questions.

Why is the SDS conference important?

The Swiss Conference on Data Science is a great platform for exchange on current developments and issues in the data science space in industry and academia across Switzerland. There is a lot of excellent research and development going on both at universities and companies, and I find that having a good and critical discussion and dialogue (for example at SDS) is a good seed for collaboration and innovation. Having a direct line to people on the ground – subject matter experts – seems to be one of the key success factors, so I am looking forward to a diverse conference program.

Interview with Speaker Aleksandra Chirkina

Could you tell us a little bit about your professional background?

My professional background is in the application of data science to finance and financial data. Recently I’ve been working on an NLP project for the analysis of financial documents, a recommender system for financial advice, and the adaptation of different data science techniques for KYC compliance.

What will you be speaking about at the SDS2021?

My presentation “Data Science for Uninterrupted KYC Compliance” is going to demonstrate how data science can boost the efficiency and quality of KYC (Know Your Client) and AML (Anti-Money Laundering) processes for financial institutions. Proper implementation of KYC and AML measures is crucial not just for the smooth operations of a financial institution, e.g., a private bank, but for the economy as a whole, preventing the inflow of ‘dirty’ untaxed money and combating criminal activities.

In the talk I will present our experience with two data science models applied to transactions and KYC profiles. As a result, some non-trivial KYC violations were detected that had been missed by the traditional rule-based approach. The talk is aimed at inspiring financial institutions to explore intelligent data-driven solutions for detecting KYC violations, money laundering and fraud.

Why is the SDS conference important?

For me personally, the SDS conference is a yearly milestone for which our team always prepares thoroughly. We reflect on our achievements and discoveries over the past year and select the most interesting client projects and internal research to share with the Swiss data science community.

Another important role of SDS, particularly in the current times of self-isolation, is to serve as a meeting point for data scientists from different companies and industries, where they can share their professional and personal experiences, exchange ideas and inspire each other.

How Confidential Computing and Decentriq can facilitate greater industry collaboration

Born in Zurich, David Sturzenegger is a mechanical engineer by training and obtained a PhD degree in electrical engineering from ETH Zurich in 2015. From his time at big-data company Teralytics, he has several years of experience working with highly-sensitive data and leading teams of senior data scientists and software engineers.

Now David is Head of Product at Decentriq, where he is leveraging privacy-preserving technologies to help organizations collaborate on sensitive data. At SDS2021, David will be talking about Confidential Insights, a confidential survey platform jointly developed by Decentriq and Swisscom’s Fintech unit.

Confidential Insights was announced in November 2020. It is the world’s first platform for provably confidential surveys and peer-group analyses. Built to make collaborations around sensitive data easy and secure, Confidential Insights allows combining survey answers from multiple participants and extracting insights while keeping the answers provably confidential from anybody – including all admins. With Confidential Insights there is no trade-off between data utility and data privacy anymore.

An application of Confidential Insights

Leveraging the additional confidentiality guaranteed by Confidential Insights, Swisscom’s market research department – e.foresight – recorded an increase in participation in their annual survey on online mortgages, conducted by 30 banks in Switzerland. Banks that would not have participated previously due to confidentiality concerns now responded to the survey through Confidential Insights, providing e.foresight with greater data input and deeper insights into the online mortgage space in Switzerland.

The underlying technology platform

The confidentiality guarantee is achieved by leveraging a technology called confidential computing, which is also the underlying technology behind the Decentriq platform. The platform is a SaaS offering that allows anyone to easily collaborate on the most sensitive data without risk of exposure. Confidential computing ensures that all data passing through Decentriq is secure and encrypted end-to-end. Even Decentriq itself cannot see the raw data that organizations input into the Decentriq platform. Confidential Insights uses the Decentriq platform as a backend.

Applications in different industries

From customers’ financial information to patients’ health data, collaborating with industry partners on sensitive data can securely bring about significant benefits and value to organizations and their customers. Below are some examples of industries that can benefit from secure data collaboration:

  • Insurance: To provide better protection and service for customers by leveraging customer insights, or to enhance collaborative fraud detection by analyzing data with fellow insurers, without exposing sensitive customer and claims data.
  • Financial Services: To participate in collaborative credit risk scoring with other firms and improve credit risk modelling and scoring, without ever exposing customers’ confidential data.
  • Healthcare: To bring together patients’ highly-sensitive health data, often distributed across different hospitals and clinics, in an anonymized manner so as to allocate resources more efficiently and provide patients with more effective treatments.

With confidential computing powering Confidential Insights and Decentriq, more organizations and industries can now collaborate with each other on their most sensitive data with minimal risk, unlock new business value and deliver products that best match their customers’ needs.

Interview with Keynote Speaker Lothar Baum

Lothar Baum is Head of Engineering Cognitive Systems at Bosch. He holds a PhD in computer science. Before joining Bosch, he worked for Hewlett Packard in Germany and in a smaller company in the USA.

Could you tell us what you are doing at Bosch?

I joined Bosch in 2006, working in corporate research, where I built up a research group on connective systems. We were looking at robotics and machine learning applications. Then I moved to Data and started a project in Data Mining. I then got involved in the foundation of the Bosch centre for AI. Since 2017 – four years now – I have been with the business unit on automated driving. I am responsible for the department that develops smart algorithms. So, basically all the algorithms from perception to situation analysis, prediction and behaviour planning.

What will you be talking about at the conference?

I will be happy to give you an overview of what it takes to build autonomous driving cars; what the technical challenges and the approaches to tackle these challenges are. Ultimately, I want to give you an impression of where we stand and where we need to extend our technology.

What are the biggest challenges for Automated Driving? And what are the solutions?

There are of course a lot of challenges. Trying to summarize them in a short time is in itself a challenge.

One of the first challenges is performance. Especially the perception performance: how well does a car perceive its environment? This comes down to sensor variety and performance; the more sensors you have and the more different modalities of sensors you have, the better it is. Secondly, it comes down to computational performance. We need fast computers that require energy and space, which – in turn – means more costs.

The second challenge is what we call the “open world problem”. The boundaries of the driving task and the rules for behaviour are not, and cannot be, clearly defined. The problem is that there will always be situations out there that nobody has ever thought about. And how do we handle situations that no one has implemented a solution for? This calls for approaches that are data driven. This means that we train systems with data examples and hope that they are sufficiently able to abstract and generalize. This, in turn, means that we need a lot of training data, which is another big challenge.

Then there comes the long tail of unknown cases: for which scenarios do we capture data, and what happens with cases where we haven’t captured enough data? How do we create a sound safety argumentation around this? Ultimately this leads to ethical questions: what are the guardrails for allowing or not allowing systems on the street? And, let’s be clear about this: there will always be accidents. You will never get a 100% safe system, and the question is whether we accept this, and at which stage we accept that there are residual risks. And this is a question both for technologists and for society at large to decide.

The amount of responsibility when driving a car is big, and it can already be difficult for a human to anticipate all the possible things that can happen on the street. How does this translate to a self-driving car?

The basic idea is to have an approach that is similar to how we humans learn to drive. We know a set of rules, and we have collected experience via driving lessons. We don’t have a clear plan for every possible situation, but we have, let’s say, some kind of abstract data set in our heads where we are able to transfer situations or solutions to other scenarios. That’s the data driven approach.

Given all of these challenges, how do you see the future of Automated Driving? When can we expect to see automated cars?

This depends on what we expect. How much are we willing to pay for it? Probably, it’s technically possible already. It’s a question of cost, obviously, and it’s probably not achievable for the broad public right now. But it’s something that is interesting for everybody. And on the other hand, it’s a question of performance. Today we still see a lot of new “corner cases” where many of these fully automated cars fail. And the question is to what extent we accept this.

In general, these challenges can be approached from two different angles. There’s one approach that is building on the (rather rules-based) assistance systems we see already in many of today’s cars. We’re trying to develop these assistance systems that are not fully automated but are the first steps towards helping the driver. These systems help us collect data and experience and iteratively expand their functionality.

And then there’s the top-down approach that basically strives to directly build a fully autonomous car, neglecting economic constraints such as costs or compute power, hoping that the finished solution can later be scaled down to reasonable setups. At some point, hopefully, these two approaches will meet.

What, in your opinion, are the organizational and legal effects in bringing an automated car to the streets?

Again, it depends on the level of automation. There is a classification by the Society of Automotive Engineers (SAE) which defines six levels of driving automation, numbered zero to five. Level five corresponds to the highest degree of automation where no human driver is involved. Level zero means no automation or assistance at all. From level three upwards, the autonomous system actually takes over the responsibility for driving, at least for a limited amount of time. For the fully autonomous level five car, there is currently no consensus on how to handle this legally. It’s not allowed in most countries.

There is, for example, the Vienna Convention on Road Traffic from 1968, which many countries around the world have adopted. It basically says there has to be a driver in the vehicle in charge of driving at all times. Only recently have some countries taken measures to soften this regulation and taken steps towards making self-driving cars legally possible. In Germany, about five years ago, there were some changes to the laws that made it possible for the driver not to have direct control of the car at every point in time.

Many companies involved are lobbying for changed laws. We can already see that in certain areas in the world, primarily the U.S. and China, they are pushing for this kind of legislation. And there have already been some regional adjustments. For example, in the states of Nevada and California self-driving cars are allowed under certain conditions. I’m pretty sure that changes in laws will happen. The question is when it will be fully accepted, that is, when it will have become normalized in the society at large.

Fully autonomous cars will have other impacts as well. They may have consequences for different sectors of the economy, for example the car industry, because fewer cars would be needed. Most of our cars are just standing idle 90% of the time, and that’s because they are waiting for us. If they could drive to different locations themselves, we could probably get away with fewer cars. That’s one thing. And obviously the whole business of taxis, business shuttles and so on would be in trouble.

And lastly: why is SDS important and what do these conferences bring to the community?

From my perspective, the main benefit is that this kind of conference brings together researchers, practitioners and decision makers to exchange ideas. And I would like to say that, particularly with respect to data science, the exchange of ideas and also the exchange of data – knowing what data is available where and how, and what can be done with it – is especially important for the data science community.

And last but not least, if we look at things like autonomous driving, it is not just a technical question, it’s also a question for the society. What do we expect and what kind of risks do we accept? And this means that there has to be a discourse in the society about this. And that’s why it’s important to talk openly and come to an overall decision on how to cope with the challenges.