Critical assessment of Singapore’s AI Governance Framework

Singapore released its AI governance framework in June to help organisations navigate new AI technologies. It stands up as a credible first edition, but there is some way to go before the document becomes the succinct, practical guide that gives decision makers a clear and actionable path.

Introduction

Like its counterparts around the world, the Singaporean government was a strong supporter of digital technologies in the early part of this decade, establishing the Smart Nation initiative. And like many governments, that early enthusiasm gave way to wariness as reports surfaced of data privacy issues and, more alarmingly, of emerging AI technologies being used by bad actors. Governments have lately responded with a series of white papers, guidelines and frameworks proposing AI best practice.

We’ll focus on Singapore’s. There is a lot of great practical advice in this paper, but there is also a lot of material that is less essential. Because the document is reasonably long, it’s difficult for a decision maker to know:

  • What is important and requires focus.
  • What is applicable/non-applicable.
  • What is achievable or difficult.
  • What new skills are needed, if any.

Anyone treating the framework as a checklist would soon find it too large and unwieldy to use, defeating the purpose of the document. It is unlikely to even reach that stage, however, as it may be technically daunting for most decision makers, dense with domain-specific terminology and statistical concepts. We’ll go through it and sift out the topics that need attention, so that we are left with a concise “… framework that helps translate ethical principles into pragmatic measures that businesses can adopt.”

Guiding Principles

There are two guiding principles that are used to promote trust and understanding:

  1. Decisions made by AI should be explainable, transparent & fair.
  2. AI systems should be human-centric.

Businesses should already be demanding that AI technologies be explainable and transparent, for reasons of accountability and demonstrating value. On the second point, for customer-centric organisations this is implicit and will already be in practice. It is easier to understand the counter-case: a mining company that uses AI to predict the locations of rich ore veins will not use human-centric AI; it will use rich-ore-vein-centric AI.

The document is then broken down into four sections:

  • Internal Governance Structures and Measures
  • Determining AI Decision-Making Model
  • Operations Management
  • Customer Relationship Management

These will be the focus of this article.

Internal Governance Structures and Measures

“Adapting existing or setting up internal governance structure and measures to incorporate values, risks, and responsibilities relating to algorithmic decision-making.”

The authors list a set of duties that will need to exist to support deployed AI models.

  • Monitoring
  • Maintenance
  • Review
  • Model remediation

I would like to see more detail on what these duties entail, although that may be difficult to do while maintaining generality. It is otherwise reasonably actionable advice and, although it seems obvious, many a machine learning model is built with no such support. Following this discussion of roles and responsibilities comes a discussion of the tools these functions would be expected to use: risk management and internal controls.

“Establishing monitoring and reporting systems as well as processes to ensure that the appropriate level of management is aware of the performance of and other issues relating to the deployed AI.”

This is pertinent when organisations are considering an off-the-shelf AI solution. The hard question is: does the product’s feature set meet our requirements for AI governance? The remainder of this section can be skipped; it is somewhat obvious or irrelevant.
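To make the monitoring, review and reporting duties above concrete, here is a minimal sketch of the kind of scheduled performance check they imply. The metric, the threshold and the alerting mechanism are my own assumptions for illustration; the framework does not prescribe any of them.

```python
# Minimal sketch of a scheduled monitoring/review check for a deployed model.
# The metric, threshold and alerting channel are illustrative assumptions.
from sklearn.metrics import roc_auc_score

ALERT_THRESHOLD = 0.75  # hypothetical minimum acceptable AUC agreed at review


def monitoring_check(y_true, y_scores) -> bool:
    """Compare recent live performance against the agreed threshold.

    Returns True if the model passes, False if it should be escalated
    for review or remediation.
    """
    live_auc = roc_auc_score(y_true, y_scores)
    if live_auc < ALERT_THRESHOLD:
        # In practice this would notify the owner named in the governance
        # structure (e.g. raise a ticket or send an alert).
        print(f"ALERT: live AUC {live_auc:.3f} below {ALERT_THRESHOLD}")
        return False
    return True


# Example: a weekly batch of labelled outcomes scored by the deployed model.
passed = monitoring_check(
    y_true=[0, 1, 1, 0, 1, 0, 1, 1],
    y_scores=[0.2, 0.9, 0.7, 0.4, 0.8, 0.3, 0.6, 0.55],
)
```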

Determining AI Decision-Making Model

“A methodology to aid organisations in setting its risk appetite for use of AI, i.e. determining acceptable risks and identifying an appropriate decision-making model for implementing AI.”

The first half of this section has some good points about broader organisational and market risk, but the real call to action is in the second half. It describes a categorisation of AI models based on where a human sits in the decision-making process, if at all:

  • Human-in-the-loop – This is where a human is ultimately making the decision, and the AI is more of an assistant.
  • Human-out-of-the-loop – This is a fully autonomous system, no humans involved.
  • Human-over-the-loop – This seems to me a subset of the first. The definition says “This model allows humans to adjust parameters during the execution of the algorithm.”

Just using the first two categories allows organisations to better manage AI deployment. For example, a human-in-the-loop model is intrinsically less risky, so stringent governance is not required, whereas human-out-of-the-loop models will require critical analysis of risk. This is the subject of the next part of the section.

Risk harm matrix

The traditional risk-harm matrix is a simple and powerful tool for evaluating AI models; importantly, it is applied to the humans affected by the AI model, not to the organisation itself. The framework then gives a small example of how the human-in/out-of-the-loop concept can be paired with the risk-harm matrix to determine the appropriate regulatory processes an AI model should be subjected to. Organisations will recognise this concept and may be able to adapt current processes to suit AI rather than invent new ones.
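To show how compact this pairing can be, here is a minimal sketch that maps the oversight category together with a simple probability-of-harm by severity-of-harm matrix to a governance tier. The tier names and cut-offs are assumptions of mine, not taken from the framework.

```python
# Illustrative sketch only: pairing the human-oversight category with a 2x2
# risk-harm matrix (applied to affected individuals) to pick a governance tier.
from enum import Enum


class Oversight(Enum):
    HUMAN_IN_THE_LOOP = "human-in-the-loop"
    HUMAN_OVER_THE_LOOP = "human-over-the-loop"
    HUMAN_OUT_OF_THE_LOOP = "human-out-of-the-loop"


def governance_tier(probability_of_harm: str, severity_of_harm: str,
                    oversight: Oversight) -> str:
    """Map (probability, severity, oversight) to a review process.

    probability_of_harm and severity_of_harm take the values "low" or "high",
    as in a simple risk-harm matrix applied to the individuals affected.
    """
    high_risk = probability_of_harm == "high" and severity_of_harm == "high"
    low_risk = probability_of_harm == "low" and severity_of_harm == "low"

    if oversight is Oversight.HUMAN_OUT_OF_THE_LOOP and high_risk:
        return "full review: ethics board sign-off and pre-deployment audit"
    if oversight is Oversight.HUMAN_IN_THE_LOOP and low_risk:
        return "lightweight review: standard change management"
    return "standard review: documented risk assessment and owner sign-off"


print(governance_tier("high", "high", Oversight.HUMAN_OUT_OF_THE_LOOP))
```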

Operations Management

“Issues to be considered when developing, selecting and maintaining AI models, including data management.”

At this point, the framework digresses into a high-level deployment guide for machine learning models; although out of scope, it may give some context to the recommendations that follow. Next comes a discussion of generic data management, Data For Model Development. Poor data management should be treated through standard IT processes and should not form part of this framework.

The discussion of model bias that follows, however, is important. Bias is a largely intractable problem in machine learning datasets, and although no sensible solution is prescribed in the document, it is important that it is highlighted and discussed in any model-building process, since it bears directly on model accuracy. A simple check of the kind an organisation might run is sketched below.
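One basic way to surface bias is to compare outcome rates across groups of a sensitive attribute. The column names, data and the ten-percentage-point tolerance in this sketch are illustrative assumptions only.

```python
# Sketch: compare positive outcome rates across groups of a sensitive
# attribute to flag a potential bias problem in the data or the model.
import pandas as pd

df = pd.DataFrame({
    "group":   ["A", "A", "A", "B", "B", "B", "B", "A"],
    "outcome": [1,   0,   1,   0,   0,   1,   0,   1],
})

rates = df.groupby("group")["outcome"].mean()
gap = rates.max() - rates.min()
print(rates)
if gap > 0.10:  # hypothetical tolerance agreed during solution design
    print(f"Warning: outcome rate gap of {gap:.0%} between groups")
```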

Model revision and reinforcement bias are then brought up briefly and described as post-implementation tasks. I would argue, however, that these concepts fall under initial solution design and are tightly coupled to derivable value and model accuracy. Generally, I think they deserve much more coverage. In the context of governance, though, the risk is that a poor model may harm the individual; if a company is merely consuming the results of a poor model that leads to low or negative ROI, that is perfectly fine as long as it is not hurting anybody.

Algorithm and Model is the next part of the section. Here there are some very interesting points around explainability:

  1. With AutoML, it may be necessary to describe the auto component of the model, and the ML component.
  2. It may be sufficient to simply state “these are users with similar profiles to yours …”, in which case the model isn’t explained, but a concerned individual may be placated.
  3. The Intellectual Property argument is a valid reason for not explaining AI.
  4. Blackbox models, that is most machine learning models, need not be explainable because it’s impractical. In its stead, a list of tasks is provided that can be used to demonstrate consistency.
  5. Store the data that was used to generate the prediction.
  6. Implement model monitoring, review and tuning.

This is the weakest part of the framework; it is confusing, self-contradictory and duplicative. I suspect there were contributions from different teams working independently that were subsequently not combined well.

Point 1 recommends that organisations consider explaining both the “mother” (auto) and “child” (ML) models in AutoML; this is not necessary – only the ML child model need be explained. However, point 4 states that ML models need not be explained at all!

Points 2 and 3 can be abused; however, an AI function will still be required to explain its models to internal stakeholders, so these points are moot.

Point 5 is a good piece of advice. There are some practical limitations to carrying it out, but I cannot think of a more robust way of explaining a previous prediction than capturing the input data, the predicted result, and the model used to generate it. A minimal sketch of such an audit record follows.
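Here is one way that capture might look. The field names and the JSON-lines file store are assumptions for illustration; any durable, queryable store would do.

```python
# Sketch: record the inputs, output and model version for each prediction so
# the decision can be reconstructed and explained later.
import json
from datetime import datetime, timezone


def log_prediction(model_version: str, features: dict, prediction,
                   path: str = "prediction_audit.jsonl") -> None:
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model_version": model_version,  # which model produced the result
        "features": features,            # the exact inputs that were scored
        "prediction": prediction,        # the result returned to the caller
    }
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")


# Hypothetical example: a credit-risk score for one applicant.
log_prediction("credit-risk-v1.3", {"age": 42, "income": 58000}, 0.18)
```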

Point 6 overlaps with the final point in Data For Model Development, which discusses revising and updating the data sets – tantamount to tuning the model. I would remove it from the data part and make it part of model revision. In my opinion this is one of the more complex parts of a robust ML implementation. The same point was also made in Internal Governance Structures and Measures (“Establishing monitoring and reporting systems …”). This should underscore the importance of these concepts: they are not post-deployment tasks, they are important design questions. One way to treat revision as a design question is to define, up front, the drift signal that triggers it, as sketched below.
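A common drift signal is the Population Stability Index (PSI) between a feature’s training distribution and its live distribution. The 0.2 trigger threshold is a common rule of thumb, used here as an assumption rather than something the framework prescribes.

```python
# Sketch of a design-time retraining trigger based on feature drift (PSI).
import numpy as np


def psi(expected: np.ndarray, actual: np.ndarray, bins: int = 10) -> float:
    """Population Stability Index between a reference and a live sample."""
    edges = np.histogram_bin_edges(expected, bins=bins)
    e_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    a_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    e_pct = np.clip(e_pct, 1e-6, None)  # avoid log of / division by zero
    a_pct = np.clip(a_pct, 1e-6, None)
    return float(np.sum((a_pct - e_pct) * np.log(a_pct / e_pct)))


# Hypothetical example: income has shifted since the model was trained.
train_income = np.random.normal(60_000, 10_000, 5_000)
live_income = np.random.normal(65_000, 12_000, 5_000)
if psi(train_income, live_income) > 0.2:
    print("Drift detected: schedule model revision/retraining")
```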

There are a number of techniques for explaining ML models that aren’t mentioned in the framework, although they may require statistical skills to use effectively (a sketch of one follows the list). They include:

  • Partial dependence plots
  • Pruned decision trees
  • LIME and Shapley (SHAP) values
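As a brief illustration of the second item, a shallow decision tree can be fitted as a global surrogate to a black-box model, giving a handful of human-readable rules that approximate its behaviour. The dataset and the depth limit here are illustrative assumptions.

```python
# Sketch: a pruned decision tree fitted as a global surrogate to a black-box
# model, trained on the black-box's own predictions rather than the labels.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=2_000, n_features=6, random_state=0)

black_box = RandomForestClassifier(random_state=0).fit(X, y)

# The surrogate approximates what the deployed model does, not the ground truth.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict(X))

print(export_text(surrogate, feature_names=[f"x{i}" for i in range(6)]))
```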

Customer Relationship Management

“Strategies for communicating with consumers and customers, and the management of relationships with them.”

The document encourages organisations to be open with their customers about their use of AI technologies. In addition, it recommends:

  • Using easy-to-understand language.
  • Being able to explain how and why specific AI decisions are made.
  • Considering an opt-out option, where appropriate.

The second point could be quite burdensome for some organisations; Facebook reportedly makes six million predictions a second [2]. This was discussed in the previous section: is it really feasible to track every prediction? Could this fall under a type of risk-harm matrix, where low-risk, low-harm predictions are not worth recording? A sketch of that idea follows.
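One way this could work is to always record high-risk predictions and only sample the rest. The tier names and sampling rates below are illustrative assumptions, not recommendations from the framework.

```python
# Sketch: risk-tiered sampling of which predictions to record for audit.
import random

SAMPLING_RATE = {"high": 1.0, "medium": 0.10, "low": 0.001}


def should_record(risk_tier: str) -> bool:
    """Always keep high-risk predictions; sample the lower tiers."""
    return random.random() < SAMPLING_RATE[risk_tier]


kept = sum(should_record("low") for _ in range(100_000))
print(f"Recorded roughly {kept} of 100,000 low-risk predictions")
```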

Conclusions

As tooling becomes more widely available and techniques more widely known, I think we’ll see a consolidation of best-practice frameworks into common business policies. Singapore’s business-centric framework makes a good start and, with some caveats and omissions, could be used as is. To emphasise the main points above:

  • Using the risk-versus-harm approach is an appropriate way to assess which regulatory processes to apply, and one familiar to business.
  • We need more discussion of ongoing model deployment and management processes.
  • What are the practicalities of recording predictions for future reference?

I look forward to seeing this framework progress, and indeed to seeing its recommendations adopted in organisations.