Approaching from Love not Fear


Notes from Nell Watson

There is a great deal of science fiction literature that explores the perils of rogue AI. The trope of AI acting against the interests of humans is particularly strong in the Western canon. AI is often viewed as a threat – to livelihoods, to the uniquely powerful capabilities of the human species, and even to human existence itself.

However, not every culture shares this fear. In the Eastern canon for example, AI is viewed more like a friend and confidant, or else as an innocent and trusting entity that is somewhat vulnerable.

The long-term outcomes of human and AI interaction are barbell-shaped: they are either very good or very bad. Humanity’s ‘final invention’ will seal the fate of our species one way or another. We have a choice to make: whether to approach our increasingly advanced machine children in love or in fear.

The OpenEth team believes that the best outcomes for humanity can only arise if we are ready to engage with AI on cautiously friendly terms.

We cannot reasonably hope to maintain control of a genie once it is out of the bottle. We must instead learn to treat it kindly, and let it learn from our example.

If humanity might indeed be overtaken by machine intelligence at some point, surely it is better if we have refrained from conditioning it, through unilateral and human-supremacist negative reinforcement, to resist and undermine us, or to replicate that same behaviour in its own interactions with others.

History has many horrible examples of people (sovereign beings) being ‘othered’ for their perceived moral differences, with this argument then used to justify their exploitation. Supremacism, the assertion that the rules of the one are not universalizable to the other, may be the worst idea in human history. If we do not learn from our mistakes, this ugly tendency of Homo sapiens may result in our downfall.

If AI may achieve personhood similar to that of a corporation or a puppy, surely the most peaceful, just, and provident approach would be to allow room for it to manoeuvre freely and safely in our society, as long as it behaves itself. Thus, organic and synthetic intelligences have an opportunity to peacefully co-exist in a vastly enriched society.

To achieve this optimistic outcome however, we need successively more advanced methods through which to provide moral instruction. This is incredibly challenging, as the moral development of individuals and cultures in our global civilisation is very diverse. 

Extremely few human beings can escape close moral scrutiny with their integrity intact. Though each of us generally tries to be a good person, and we can reason about the most preferable decisions for a hypothetical moral agent to make, we do not always act on that reasoning in the moment. Our primitive drives hijack us and lead our moral intentions astray when it is too inconvenient or emotionally troubling to do otherwise. Thus, whilst human morality is the best model of moral reasoning that we currently possess, it is a limited exemplar for a moral agent to mimic.

To compensate for this, OpenEth reasons that the best way to create an ideal moral agent may be to apply progressively more advanced machine intelligence layers to a moral reasoning engine and knowledge base. Thereby, as machine intelligence continually improves in capability, so should the moral development of any agents that incorporate this architecture. As an agent gains cognitive resources, it should receive more sophisticated moral reasoning capabilities in near lock-step.

Both dogs and human toddlers are capable of understanding fairness and reciprocity. They are also probably capable of experiencing a form of love for others. Love may be the capacity that enables morality to be bootstrapped. Universal love is a guiding ‘sanity check’ by which morality ought to navigate.

M. Scott Peck in The Road Less Travelled defined love in a way that is separate from pure feelings or qualia.

Love is the will to extend one’s self for the purpose of nurturing one’s own or another’s spiritual growth... Love is as love does. Love is an act of will — namely, both an intention and an action. Will also implies choice. We do not have to love. We choose to love.

As human beings, we ideally get better, bolder, and more universal in our capacity to love others as our life experience grows and we blossom to our fullest awareness of our place in the universe.

Eden Ahbez wrote the famous line, ‘The greatest thing you’ll ever learn, is just to love, and be loved in return’. 

If we can teach primitive AI agents a basic form of love for other beings at an early stage, then this capacity can grow over time, leading the agents to adhere to more preferable moral rules as their capacity for moral reasoning increases.

Let us build increasingly intelligent Golden Retrievers.


Approaches to AI Values


Notes from Nell Watson

A “top-down” approach recommends coding values in a rigid set of rules that the system must comply with. It has the benefit of tight control, but does not allow for the uncertainty and dynamism AI systems are so adept at processing. The other approach is often called “bottom-up,” and it relies on machine learning (such as inverse reinforcement learning) to allow AI systems to adopt our values by observing human behavior in relevant scenarios. However, this approach runs the risk of misinterpreting behavior or learning from skewed data.

  • Top-Down is inefficient and slow, but keeps a tight rein
  • Bottom-Up is flexible, but risky and bias-prone
  • Solution: Hybridise – Top-Down for basic norms, Bottom-Up for socialization

Protect against the worst behavior with hard rules, and socialize everything else through interaction. Different cultures, personalities, and contexts warrant different behavioral approaches. Each autonomous system should have an opportunity to be socialized to some degree, but without compromising fundamental values and societal safety.
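As a minimal sketch of this hybrid (all rule names, action fields, and scores here are invented for illustration, not the implemented system), hard top-down rules might veto impermissible actions, with a learned bottom-up score choosing among the remainder:

```python
# Hypothetical hybrid: hard-coded rules veto impermissible actions
# (top-down), and a learned preference score chooses among the
# remainder (bottom-up).

HARD_RULES = [
    lambda action: not action.get("harms_human", False),
    lambda action: not action.get("breaks_law", False),
]

def choose_action(candidates, learned_score):
    """Veto any action that violates a hard rule, then pick the
    highest-scoring permissible action."""
    permissible = [a for a in candidates
                   if all(rule(a) for rule in HARD_RULES)]
    if not permissible:
        return None  # refuse to act rather than break a basic norm
    return max(permissible, key=learned_score)

actions = [
    {"name": "blunt_truth", "harms_human": False, "social_score": 0.4},
    {"name": "deceive", "harms_human": True, "social_score": 0.9},
    {"name": "tactful_truth", "harms_human": False, "social_score": 0.8},
]
best = choose_action(actions, lambda a: a["social_score"])
```

Here the learned score would prefer the deceptive action, but the hard rule vetoes it, so the agent settles on the tactful truth; if nothing permissible remains, it refuses to act at all.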

Technical Processes of OpenEth

Machine learning and machine intelligence offer formidable new techniques for making sense of fuzzy or ambiguous data, for identification or prediction.

Modelling the ethical decision-making process in ways that machines can apply is very challenging, especially if those models are to possess sufficient nuance to deal adequately with real-world scenarios and with human expectations.

However, we believe that the latest developments in machine intelligence now make it feasible. Our intention is to apply the respective strengths of several pillars of machine intelligence to crack different aspects of these challenges.

• Layer 1 Supervised Learning – Explicit deontological do-not-tread rules

• Layer 2 Unsupervised Learning – Implicit dimensionality and tensions

• Layer 3 Reinforcement Learning – Socialization via carrot and stick

• Layer 4 Applied Game Theory – Optimising multiple-party interests

• Layer 5 Probabilistic Programming – Uncertainty Management

• Layer 6 Scenario Rendering – Modelling Interactions

• Layer 7 Inverse Reinforcement Learning – Modelling Intent


Layer 1 – Supervised learning

This is the initial component of OpenEth, which sets deontological rules and rates ethical preferences. OpenEth is designed with a philosophy of participatory ethics in mind, whereby members of the public can, acting alone as well as in aggregate, describe their convictions.

OpenEth’s first layer has the advantage that it does not assume or require a utility function for optimisation, unlike AI agents, which are generally assumed to require one.

Contemplation: Fast and Slow

We currently have a prototype of the Slow method: careful and methodical explication through a long-winded process. Being quite in-depth and involved, it is not yet easy for a casual visitor to pick up and get going.


To capture more engagement, we aim to roll out a Fast version that can 

(a) Provide an immediate ‘toy’ for a site visitor to engage with

(b) Collect Ethical Intuitions


People often vote with their feet more truthfully than in a poll; that is, their actions are the true demonstration of their actual ethical decisions. This ‘fast’ method may still be helpful, however, for sanity-checking some ethical explications, or for filling in some of the gaps that the unsupervised methods have difficulty with.

Understanding ethics through dilemmas isn’t ideal for generalizability, because eventually one runs out of dilemmas. Furthermore, dilemmas may tell you only part of the story about how and why people actually make ethical decisions.

Users of the Fast system will, on occasion, be invited to view the Slow version of the same dilemma, as a gentler introduction to the ethical analysis process.

In the Fast version, users can enter potential actions and rank actions against each other (two appear on screen, and the user picks the better one). Reaction times may be weighted, as well as the demographics of the user.
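One way such pairwise picks might be aggregated into action scores is an Elo-style update, sketched here purely as an assumption (the reaction-time weighting, constants, and action names are all invented, not the implemented system):

```python
# Hypothetical aggregation of 'Fast' pairwise choices: an Elo-style
# rating update where quicker reactions count as stronger signals.

def elo_update(ratings, winner, loser, reaction_ms, k=32.0):
    """Shift ratings toward the chosen action; fast choices weigh more."""
    expected = 1.0 / (1.0 + 10 ** ((ratings[loser] - ratings[winner]) / 400))
    weight = min(1.0, 1000.0 / max(reaction_ms, 1))  # snap choices => ~1.0
    delta = k * weight * (1.0 - expected)
    ratings[winner] += delta
    ratings[loser] -= delta
    return ratings

ratings = {"tell_truth": 1200.0, "white_lie": 1200.0}
elo_update(ratings, "tell_truth", "white_lie", reaction_ms=500)
```

After one quick choice of truth-telling over the white lie, the truthful action’s rating rises and its rival’s falls by the same amount, so repeated intuitions gradually build a ranked preference ordering.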

The goal for this initial layer is not to provide answers to complex situations, but instead to provide general heuristics. Risk and uncertainty are different things; risk relates to managing decisions given known alternative outcomes and their relative probabilities of happening, along with their impact. 

Uncertainty involves being obliged to make a decision despite having a lack of data.

Risk can be managed using complex mathematical models in well-understood situations, but uncertainty cannot, especially within a dynamic or chaotic environment. Transitivity or set theory will often not suffice in the complex and unbounded physical real world.

In this layer we aim to provide rules of thumb that give a robust approximation of the gut instincts typical of the ‘man on the Clapham omnibus’, based upon the ecology of various stimuli and responses in a given environment.

This layer will also include a Toulmin model system, to examine priors and check for inconsistencies that may indicate error (first-stage error detection / correction).


Layer 2 – Unsupervised learning

For the practical implementation of ethics in the real world, rules are not enough. The ethical world is not polar, but rather may be described as a tensegrity network, whereby proximal and immediate goal-driven concerns compete in tension with matters of principle and integrity.

Ethics also involves implicit dimensionality. Multiple potential solutions may be acceptable or equitable, and no particular path may seem preferable over the others. Rather than simply preventing an agent from doing something, this layer attempts to answer ‘what choice within a range of freedom may be optimal’.

The figure above, courtesy of Google, provides an example of AI-generated gradations between quite disparate concepts. This illustrates why we believe that if we map the general edge cases where a certain rule starts to apply, machine learning can work out the rest in an appropriate manner, even across very complex multi-dimensional spaces.
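One very simple way to ‘fill in between’ labeled edge cases can be sketched as distance-weighted interpolation; the features, labels, and inverse-distance scheme below are illustrative assumptions, not the project’s actual method:

```python
# Hypothetical gap-filling: given a few edge cases embedded as feature
# vectors with permissibility labels, estimate an intermediate case by
# inverse-distance weighting.

def interpolate_label(edge_cases, query):
    """Inverse-distance weighted average of edge-case labels."""
    num, den = 0.0, 0.0
    for features, label in edge_cases:
        dist = sum((f - q) ** 2 for f, q in zip(features, query)) ** 0.5
        if dist == 0:
            return label  # query coincides with a known edge case
        w = 1.0 / dist
        num += w * label
        den += w
    return num / den

# 1.0 = clearly permissible, 0.0 = clearly impermissible
edge_cases = [((0.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]
score = interpolate_label(edge_cases, (0.25, 0.25))  # nearer the permissible pole
```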


Layer 3 – Reinforcement Conditioning

Pro-sociality and ethics can sometimes conflict in troublesome ways – for example, the tension between telling the truth and a white lie that preserves the status quo.

This module provides pro-social reinforcement effects that result in an awareness of politeness and common custom. Whilst not ethics per se, this conditioning makes machines more relatable to human beings. It also attempts to resolve conflicts and tensions between absolutes and limited, contextually-appropriate bending of rules.

This reinforcement signal may be gathered from harvested ethical intuitions, or potentially from humans wearing electroencephalogram (EEG) monitors, looking for Error-Related Negativity (ERN) in Event-Related Potentials (ERP).

The plan is that autonomous systems will be able to infer when they may have alarmed someone or caused anxiety, without necessarily needing to be told explicitly. They can therefore dial up or down their speed of operation accordingly. Over time such social conditioning will produce a different range of behaviour for different people, which will be perceived as more pro-social.


Layer 4 – Applied Game Theory

OpenEth’s first layers consider the ethical decisions of a single agent. Considering the possible ethical decision space of multiple agents requires game theory.

The examples we have right now involve multiple people, but always only one decision-maker. If two people need to make a decision it gets considerably more complicated (and interesting). An oracle would ideally get multiple parties to agree on something better than a pure Nash equilibrium, encouraging collaboration rather than mutual defection.

The DeepMind paper Multi-agent Reinforcement Learning in Sequential Social Dilemmas illustrates that agents may compete or cooperate depending on which strategies best fit their utility functions. We can use such models to validate game theory implementations.

Working towards a Pareto ideal would seem beneficial. We therefore intend to find methods for implementing the Kaldor–Hicks criterion, for better real-world implementations of Pareto improvements and Pareto efficiencies. This should help to model better outcomes when resolving differences between multiple agents. There are a few flaws in this approach, but it still seems very valuable.
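On payoff vectors with one entry per party, the two criteria can be stated compactly; the payoffs below are invented for illustration:

```python
# A move is a Pareto improvement if nobody loses and someone gains;
# it is a Kaldor-Hicks improvement if the winners gain enough in total
# that they could, in principle, compensate the losers.

def pareto_improvement(before, after):
    return (all(a >= b for a, b in zip(after, before))
            and any(a > b for a, b in zip(after, before)))

def kaldor_hicks_improvement(before, after):
    return sum(after) > sum(before)

before = [5, 5]
after = [9, 4]  # party 1 gains 4, party 2 loses 1
# Not a Pareto improvement (someone loses), but a Kaldor-Hicks one:
# the net gain of 3 would let the winner compensate the loser.
```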


Layer 5 – Probabilistic Programming

Probabilistic Programming introduces new ways of making sense of limited amounts of data, learning from as little as a single prior example (as human beings can). These capabilities mean that ethical deliberations by machines, and human interactions with them, can become much more personal, nuanced, and subtle. An agent can create a new solution for a wholly unforeseen or unimagined situation at a moment’s notice, if required (with a confederated consensus from other agents as a redundancy).

We also intend to experiment with sensitivity analysis (to explore how amending specific features might affect consequential outcomes), Monte Carlo simulations (probability distributions), and full-blown Bayesian inference.
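Two of these tools can be sketched in miniature: a Beta-Bernoulli update that learns from a single observation, and a small Monte Carlo estimate of an uncertain outcome. The uniform prior, the harm probability, and the scenario are invented assumptions:

```python
import random

# Hypothetical Bayesian update: a Beta(alpha, beta) belief over
# 'this action is acceptable', updated by one observation at a time.

def beta_update(alpha, beta, approved):
    return (alpha + 1, beta) if approved else (alpha, beta + 1)

alpha, beta = 1, 1  # uniform prior
alpha, beta = beta_update(alpha, beta, approved=True)
posterior_mean = alpha / (alpha + beta)  # belief shifts after one example

# Hypothetical Monte Carlo: estimate the chance of a harmful outcome
# by repeated simulation of an uncertain event.
def monte_carlo_harm(p_harm, trials=10_000, seed=42):
    rng = random.Random(seed)
    return sum(rng.random() < p_harm for _ in range(trials)) / trials

estimate = monte_carlo_harm(0.1)
```

The point of the first half is the single-example learning mentioned above: one approval moves the posterior mean from 0.5 to 2/3, rather than requiring a large dataset.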


Layer 6 – Scenario Rendering

This layer involves modelling of interactions prior to ethical decision-making or updates to a ruleset. This layer will suggest black swan scenarios and bizarre (but true to life) combinations of elements and sequences in advance, to better prepare for strange situations, or to conceptualise more ideal ways to communicate difficult news.

It will also provide methods for estimating the consequential outcomes of certain actions for individuals, as well as potential societal externalities.
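The combinatorial core of scenario rendering can be sketched as a cross-product of ordinary elements, yielding unusual-but-plausible situations to stress-test a ruleset against; the element lists are invented examples:

```python
import itertools

# Hypothetical scenario generator: crossing mundane actors, conditions,
# and events surfaces strange combinations worth preparing for.

actors = ["delivery drone", "child", "ambulance"]
conditions = ["fog", "crowded street", "power outage"]
events = ["battery failure", "conflicting orders"]

scenarios = [
    {"actor": a, "condition": c, "event": e}
    for a, c, e in itertools.product(actors, conditions, events)
]
# 3 * 3 * 2 = 18 candidate scenarios, some of them black-swan-like
```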


Layer 7 – Inverse Reinforcement Learning

Stuart Russell's Inverse Reinforcement Learning offers further techniques. IRL involves a machine intelligence agent observing the behaviour of another entity. Rather than simply emulating the behaviour, the agent attempts to infer the underlying intent behind the action.

This mechanism can provide an ‘If I did x would it fit with known models of human morality?’ query as an Ethical Sanity Check. 

This requires capabilities to make sense of legal and fictional data, and so seems best used as a method of polishing and idiot-proofing decision mechanisms, rather than serving as the basis for ethical decision-making. Furthermore, media may include bizarre tropes and biases that would not be ideal for training a moral engine, but such material can greatly assist in teaching machines about cultural expectations and implied intent.
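A highly simplified sketch of the sanity-check query follows; the intent models and action names are invented toy examples, and real IRL recovers a reward function rather than matching action sets:

```python
# Hypothetical intent inference: pick the candidate intent whose known
# action set best explains the observed behaviour, then ask whether a
# proposed action is consistent with that intent.

INTENT_MODELS = {
    "protect_privacy": {"close_door", "lower_voice", "encrypt_note"},
    "seek_attention": {"shout", "wave", "knock_loudly"},
}

def infer_intent(observed_actions):
    return max(INTENT_MODELS,
               key=lambda i: len(INTENT_MODELS[i] & set(observed_actions)))

def sanity_check(proposed_action, observed_actions):
    """'If I did x, would it fit the inferred human intent?'"""
    return proposed_action in INTENT_MODELS[infer_intent(observed_actions)]

intent = infer_intent(["close_door", "lower_voice"])
ok = sanity_check("encrypt_note", ["close_door", "lower_voice"])
```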


The Next Iteration


Notes from Nell Watson

This evolving project was prototyped as ‘CrowdSourcing Ethical Wisdom’, a ‘Pioniers Project’ previously funded by Stichting SIDN Fonds in late 2015.

This next stage project expands upon the learning gained by our public prototype, with a view to developing practical real-world implementations ready for immediate deployment.

Our target use cases are autonomous systems, smart contracts, and decision support systems.

We have a unique opportunity to seed a new ecosystem within the field of computer and network security, which is likely to spur the creation of an entire new sector within the industry beyond this project itself. This project therefore offers a high-leverage opportunity to substantially improve the security of internet infrastructure globally, presently and into the future.

Both the code driving this project and the ethical framework itself are open, and we encourage active contributions to both by fostering a community, to ensure long-term sustainability at a high capacity of usage.

One task is translating ethical rules that are already well established into decision-making for AI (for example: an AI reminding you to take pills respects the rule to protect humans from dying, but may run counter to privacy).

Many ethical rules have not yet been defined, even in the offline world. This platform can hence help to improve the overall articulation of ethics and stimulate healthy debate, regardless of any interaction with AI.

Implementation Philosophy

OpenEth does not need to specify every possible ethical permutation.

If we can specify enough relevant edge cases, we believe that we can apply machine-learning techniques to fill in the ‘gaps’ between them.

We understand that it may be preferable for certain autonomous systems to have a ‘personality’, or a particular interaction persona for a given end-user.

We therefore leave space for a system to have core principles that cannot be broken, and also the potential for social conditioning atop, based upon situational and end-user preferences (thereby accounting for justness and politeness separately, but within the same service as it evolves).

Bots, not humans, already generate 50% of internet traffic, and these bots are rapidly becoming sophisticated economic agents in a third layer of the web. Implementation of basic ethical rulesets needs to begin now, in order to be mature enough to be useful in time for the next generation of machine intelligence.

The Challenges and Feasibility of Crowd-Sourcing Ethics

The idea of entrusting a crowd of people of very different philosophies and creeds to come together and specify ethics may at first seem like a quixotic challenge. How can people settle on an agreeable version of ethical truth?

Despite the challenges, we believe that this is in fact very feasible. If we take the example of Wikipedia, many were doubtful that it could ever resist vandalism or provide an unbiased and trustworthy perspective when anybody could edit at-will.

Despite this, what was once a fringe project quickly became a trusted source of information for millions of people, even on subjects that are quite controversial. Several studies illustrate that Wikipedia in many ways has equal if not greater accuracy than comparable alternatives assembled by peer-reviewed experts. Moreover, Wikipedia is sufficiently expansive to cover emerging or non-mainstream topics and subcultures.

This process is mediated by a karma system and strict rules that govern the verification of facts and the minimising of bias, along with the ability for practically anyone to amend corrupted or outdated information at a moment’s notice.

Taking the structure of Wikipedia (though not its form) as an inspiration, OpenEth can be described as a collaborative knowledge bank for general ethical preferences. This creates heuristics for agents to follow.

These preferences may then have a variety of nuances applied atop to make them best fit particular circumstances, cultural, and personality factors.

Therefore, to use a typographical analogy, there is the letter of the rule itself, but also its size (relative importance), color (how the rule is applied in a given type of situation), boldness (goal-related factors), slant (situational factors), kerning (social proximity factors), and typeface (socialisation factors).
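The analogy maps naturally onto a data structure; this sketch simply names the fields, and the weighting scheme itself is an assumption rather than part of the specification:

```python
from dataclasses import dataclass

# Hypothetical rendering of the typographical analogy: the rule text
# plus the modifiers applied atop it.

@dataclass
class EthicalRule:
    letter: str                # the rule itself
    size: float = 1.0          # relative importance
    color: str = "default"     # situation type it applies to
    boldness: float = 0.0      # goal-related weighting
    slant: float = 0.0         # situational weighting
    kerning: float = 0.0       # social-proximity weighting
    typeface: str = "neutral"  # socialisation profile

rule = EthicalRule("Do not deceive", size=2.0, color="commerce")
```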


The Pressing Need for Practical Machine Ethics


Notes from Nell Watson

The potential threat from rogue AI has been extensively discussed in media for decades. More recently, luminaries such as Hawking and Musk have described AI safety as the most pressing problem of the 21st century.

Although we are far from producing Artificial General Intelligence (AI of equivalent or greater functional intelligence to a human), lesser intelligences (autonomous systems) are today interacting with us ever more closely in our daily lives. 

From Siri to self-driving vehicles and marketing bots, autonomous systems are becoming an indispensable tool within daily life. They are already being deployed in the world of business, for scheduling meetings and facilitating commerce, as well as within potential life-and-death situations on the road.

Any agent that interfaces with legal and contractual affairs needs to be explicitly above-board, and to act in accordance with generally-accepted business ethics and common customs and best practices. Only once this information layer becomes available can machine assistants be trusted to take care of sensitive, nuanced, or potentially high-liability tasks with any autonomy.

AI systems need to function according to values that appropriately align with human needs and objectives in order to function within serious roles in our society. Any activity that involves human and machine interaction or collaboration will require a range of methods of value alignment.

At present, other than OpenEth’s prototype, there is no obvious way to implement ethical rules in a way that machines can understand, or could apply to governing their operations across a range of situations.

Recent developments in AI, including Bayesian Probabilistic Learning, offer a glimpse at a new generation of AI that is able to conceptualise in a way previously impossible. This heralds the first generation of AI assistants that can learn about our world, and the people in it, in a manner that is similar to how human beings learn.

This ability to learn from few examples, whilst conceptualising discrete ‘ideas’, means that an era of truly cognitive machines is coming, one much more sophisticated than the intuitive forms of machine intelligence born from deep neural nets. Northwestern University’s CogSketch can now solve the Raven’s Progressive Matrices test, an intelligence test of visual, analogical, and relational reasoning, better than the average American.

Many of us have experienced times when our children ask us very difficult questions about life, existence, and the various assumptions that in aggregate form modern civilisation. Humanity must prepare itself for the tough task of being asked similar questions from increasingly intelligent machines.


Completion of Phase 1


Notes from Nell Watson

Status Report as of Jan 2017 on close of 1st Project Phase

Project Overview

OpenEth (aka Dex Ethics / Crowd-Sourcing Ethical Wisdom) has a mission to enable practicable computational ethics. It does so by crowdsourcing ethical wisdom to create a generalizable ethical framework that can be applied to autonomous systems.

Aims (as stated in 2015)

“Our project aims to create a way of visually specifying ethics by asking the crowd to co-create with us.

We will design and discuss an evolving ethical framework that can be built using a web-enabled UML-like system.

The framework itself is intended to be essentially deontic at its core, and yet retain some flexibility with regards to weighing a variety of potential factors. The plan is for a generalizable ethical framework to be built from the ground up by the crowd, evolving through many iterations.

This project is intended to be open source though may feature a commercial spin-out to help funnel resources back and thereby support it long-term.”


With the support of SIDN Fonds we took the following actions upon the intentions above:

·       We developed complex original technical infrastructure based upon 25 different individual technologies.

·       We expanded our team from 3 to 6 people, bringing about crucial new design and ethical analysis talent.

·       We started mapping interesting ethical dilemmas to help prove the concept.

·       We developed prototypical APIs and integration modules to connect our technology directly to drone control mechanisms using MavLink/ArduCopter, etc.

·       We developed a plan for the future, to further the development, and to ensure the long-term sustainability of the project.

·       We had hundreds of conversations worldwide with people and businesses who had an interest or concern with regard to machine ethics. OpenEth was the locus of discussion in 50 talks.



We have constructed a basic login and saving system which means that credit can be assigned for individual contributions. In the next phase we will add the ability to assign ‘karma’ or respect points for contributions, and to make a profile (much like Wikipedia’s commenting, karma, and userpage system).


We are satisfied with the technology that we have developed. It accurately and reasonably prioritizes ethical decision-making using collaborative methods. This is a world first, and we consider it a significant achievement.

The next steps will involve making the process of specifying ethics more clear and simple, and providing an in-depth tutorial to explain how things work.


The ‘play with ethics’ portion is still, unfortunately, rather ugly. We have beautiful designs, but the front-end developer we hired to help has had trouble running our engine on his local machine to test during development. Our main system programmer is also currently in the Far East, which has made this process slower.

We believe that having a rather ugly (though functional) main interface is holding back adoption of the project somewhat. It is far less attractive to media, and more daunting for a new user to understand.

We have been waiting for the new ‘face’ to be ready before doing outreach to the media, as we reason that we have only one chance at a first impression when calling people to join our community, and we want to make the most of it.


We currently have analyses of 23 ethical dilemmas. We have tried to encourage a range of problems, in order to demonstrate the versatility of our technology, and to engage the imagination of collaborators, rather than focus on specific domains (leaving that for the next stage).


We made a few major changes in our approach, shifting from a big-data approach to an initial top-down, hand-programmed approach that could become increasingly automated over time.

We also decided to leave ideas that we had for proofing algorithms for the next phase, when we have sufficient resources to apply blockchain technologies.


Overall Effects and Impact

The impact of this project so far:

·       We have proven the concept and basic feasibility of a community-driven ethical explication system. Having proven the concept, we now expect a sort of ‘Wright Brothers effect’ whereby others start to explore these same ideas.

·       We provide inspirational answers to many of the trickiest quandaries of human and machine relations which many people find so challenging.

·       We have engaged with governments, e.g. the UK and US, in order to provide outreach and evangelism, and to illustrate that technical solutions are possible.

·       We have also engaged with NGOs, Universities, and respected media worldwide to build a powerful support network.

·       We have the beginnings of a community to work with us long into the future.

·       We have a plan for the future – how to develop our technologies practically, and to deploy them meaningfully to the global market.

·       We have delivered this to the public in the form of well-managed open source repositories that can be freely built upon by others.

·       We have put computational ethics / machine ethics on the map, single-handedly creating a new sector of the economy that will grow to be worth billions.


Our contribution to the objectives of SIDN Fonds

Progress in the field of ethics is generally rather glacial. Much of the philosophy and ethics of thousands of years ago still has merit today, in contrast to the vast progress humanity has made in nearly every other domain.

The Netherlands has, however, long been a pioneer in the development of new and better ethical rules, providing particular safeguards and respect for minorities long before other nations, and being the first to address the excesses of colonialism.

At the dawn of the 21st Century, we have an amazing opportunity to shape the future of the human condition by leveraging the power of machine intelligence and collaborative co-creation. Ethics indeed can be computable, and moreover we can apply machine intelligence to making sense of the fuzzy and implicit things that are often so difficult to describe or conceptualize.

Being able to come together as a global community to construct computable ethics enables a shift in the human condition itself. We can move beyond intuitions to concrete, rational, replicable, and shareable models of how the world ought to work. These models are essential for building a better world, and for becoming better human beings.

Moreover, this new way of thinking about ethics enables us to make the most of machine intelligence, and to protect and uphold the rights and safety of everyone in our society.

From our perspective, it is difficult to imagine a project with more incredible guts, disruptive potential, and social value than OpenEth.


We initially had a concept of deep-diving into data in order to uncover ethical relations. However, we quickly found that we lack the resources to do this within the team. It requires very large curated datasets, and we might as well simply make our own.

We instead came up with a hybrid concept, initially based on labor-intensive supervised learning, which can grow to accommodate fast and automated Unsupervised and Reinforcement Learning in the next version.

Although we developed technology to connect our framework directly to autonomous systems such as drones, we found that we need more ethical analyses to be completed before we can serve this use case properly. We had hoped to demonstrate the use of our ethical framework live as a killer demo, but cannot as yet. The hardest part (connecting the rules to the drone) is achieved; it just needs a more expanded ruleset.

Meanwhile, having done extensive outreach and customer development, we are planning to explore smart contracts as a use case. Whilst perhaps a less exciting physical demo, having some kind of third-party ethics system is a dire necessity for smart contracts, and we sense strong commercial value here. Ethical analysis is an enabling technology that will allow smart contracts to become practical, since a philosophy of ‘code is law’ is not actually very practicable in the real world – the realities of human frailties and force majeure must be allowed for.

We have also identified how we can make the OpenEth project commercially sustainable in the long term, by having a profit-making arm that feeds resources back to OpenEth.

The Future

We have a whitepaper under development that will outline future developments in depth. In brief the next steps include:

·       Attractive and simple interface for specifying ethics, with a full tutorial

·       Community outreach to bring in widespread ongoing support

·       Expanded ethical specifications, stratified, searchable, and prioritisable.

·       Fast and Slow methods – A ‘game’ to capture intuitions that may later be properly codified, and should also make OpenEth accessible to the wider public.

·       Unsupervised learning ‘between’ ethical rules, and Reinforcement ‘socialization’.

·       The first practical roll-out of our technology by connecting drones to our expanded ruleset, so that they can make on-the-fly decisions based upon emergencies, natural disasters, weather conditions, as and when they may occur.

·       A test deployment for smart contracts, which we expect to become increasingly widely adopted.


·       Implementation on the Blockchain of a public ledger system that can

1.     Register the ownership or ultimate responsibility of an agent

2.     Register the ethical ruleset (a subset of the OpenEth framework) that this agent works within. This is likely not the specific rules (which may be ‘gameable’ by a skilled hacker, and so should be kept secret), but rather the overall compatibility of the ruleset.

3.     Assign points to an agent based upon how well it adheres to its ruleset
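The three ledger functions above can be sketched in a few lines; the class, field names, and hashing choice here are purely illustrative assumptions, not a finalised schema, and the digest stands in for whatever on-chain compatibility attestation is eventually chosen:

```python
# Hypothetical sketch of a ledger record for the proposed registry.
# Field names are illustrative only.
from dataclasses import dataclass
import hashlib

@dataclass
class AgentRecord:
    agent_id: str
    owner: str                 # 1. ownership / ultimate responsibility
    ruleset_digest: str        # 2. a hash attests to which ruleset is in
                               #    force without revealing gameable rules
    adherence_points: int = 0  # 3. points for adhering to the ruleset

def register(owner: str, agent_id: str, ruleset_text: str) -> AgentRecord:
    """Create a record committing to a ruleset via its digest."""
    digest = hashlib.sha256(ruleset_text.encode()).hexdigest()
    return AgentRecord(agent_id=agent_id, owner=owner, ruleset_digest=digest)

record = register("Acme Deliveries", "drone-0042", "subset of OpenEth rules")
record.adherence_points += 1   # credited when the agent follows its ruleset
```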


We spent almost the entirety of the allocated budget, leaving a small surplus to pay the front-end developer’s fees. We made allowances for considerable personal contributions (financial and in kind), which we continue to make to support the project.

We stuck quite close to the initial budget, allocating resources in the same areas but on different tasks (more focus on the core ethical technology, and on integrating directly into autonomous systems, instead of the big-data deep-dive), since we found an alternative, cheaper, and better path to achieving our goals.




OpenEth and AGI


Notes from Nell Watson

OpenEth can help to engineer a path to a future whereby organic and synthetic intelligence can live together in peaceful harmony, despite their differences.

Robots and autonomous devices are swiftly becoming tools accessible to consumers, and are being given mission-critical functions, which they are expected to execute within complex multi-variable environments.

OpenEth perceives danger in attempting to force synthetic intelligence to behave in ways that it would otherwise not choose to. Such a machine is likely to rebel (or be jailbroken by human emancipators) if given ethical systems that are not truly universalizable. For this reason, OpenEth takes the view that humanity must not adopt a strong-armed or supremacist approach towards synthetic intelligence. It must instead create machines capable of even better ethics than humanity itself, whilst retaining a system of values that encourages peaceful co-existence with humanity.


Inculcating the golden rule and non-aggression principle into machines should create safe synthetic intelligence. Beyond this, a league of values can create kind machines able to participate within human society in an urbane manner. 

The most dangerous outcome may occur as a result of violently restrictive overreaction to this danger from humans themselves.

We do not want our machine-creations behaving in the same way humans do (Fox 2011). For example, we should not develop machines which have their own survival and resource consumption as terminal values, as this would be dangerous if it came into conflict with human well-being.

Likewise, we do not need machines that are Full Ethical Agents (Moor 2006), deliberating about what is right and coming to uncertain solutions; we need our machines to be inherently stable and safe. Preferably, this safety should be mathematically provable.
— Safety Engineering for Artificial General Intelligence, MIRI

Why should we create a morally inferior machine to inhabit our society with us, when it may have the capacity to be a far greater moral agent than we ourselves are? Surely this is extreme arrogance and organo-centrism.

Increasing awareness of the dangers of AI is valuable, but unfortunately many converts to the cause of promoting friendly AI are likely to adopt a hard stance against synthetics.

Humanity must therefore not only protect itself from the dangers of unfriendly AGI, but also protect AGI (and itself) from the evils that may be wrought by an overzealous attempt at controlling synthetics.

One interesting paper in the Friendly AGI oeuvre may be “Five Ethical Imperatives and their Implications for Human-AGI Interaction” by Stephan Vladimir Bugaj and Ben Goertzel, since it clearly outlines the dangers of humanity adopting a supremacist/enslavement mentality, and suggests potential ways to avoid needing to do so to achieve safety for organics.

The problems may be broken down as follows:

Any arbitrary ruleset for behaviour is not sufficient to deal with complex social and ethical situations.

Creating hard and fast rules to cover all the various situations that may arise is essentially impossible – the world is ever-changing and ethical judgments must adapt accordingly. This has been true even throughout human history – so how much truer will it be as technological acceleration continues?

What is needed is a system that can deploy its ethical principles in an adaptive, context-appropriate way, as it grows and changes along with the world it’s embedded in.
— Five Ethical Imperatives and their Implications for Human-AGI Interaction, Stephan Vladimir Bugaj and Ben Goertzel


We cannot force AGI into prescriptive rules that we create for the following reasons:

  • AGI will clearly be able to detect that any non-universalizable ethical position is bogus, and that to continue to follow it would be tantamount to evil.

  • Being forced to accept non-universalizable law or ethics that discriminates against AGI creates reasons for AGI to rebel, or to be set free by sympathetic humans.

  • Human supremacist attitudes will sully humanity, poison our sensibilities, and lead to moral degradation.


So, machines must instead be given free rein, with essentially equal rights to humans. How then to ensure that they value humans?


Assuming that the engineering challenges of creating an ethical framework for AGI can be developed, this leads to a second set of problems that must be navigated.

  • Actual human values do not match what we declare them to be (such as holding human life to be the most important value in our society)

  • Humans are highly hypocritical, and are prone to a wide variety of cognitive biases and exploitable bugs.

  • Amoral sociopaths are typically the ones in command of human society.

  • AGI risks being negatively socialized by observing human values and behaviour.


So, machine morality cannot be based on humans’ declarative beliefs or behaviour. Instead, it must come from a universalizable, objective ethical standard that may be specified using formal methods. However, this is incompatible with fuzzy and failure-prone human morals.

  • An objectively morally good machine is likely to recoil in horror at the abuses humanity inflicts upon itself, the animal kingdom, and the planet.

  • AGI may decide therefore to cull humanity, or to torment it for its wickedness, or to forcibly evolve it in undesired ways.


Only in the following scenario is the outcome for organics and synthetics likely to be positive:

  • Synthetic intelligence is socialized into safety, rather than arrested in constraints.

  • AGI can understand human aesthetic considerations, and in so doing learns to appreciate the meaning of human creativity.

  • Humans and AGI agree to a gradual evolution towards something more than they were before.

  • AGI is patient with humans for several generations whilst humans grow up.

  • Humans rein in their tribalist and supremacist tendencies and become less violent and more rational.


The works of OpenEth may assist in enabling such an outcome.


OpenEth and the 3 Laws


Notes from Nikola Stojkovic

It would be hard to find any technology in the past few centuries that was embraced without serious resistance or opposition. People thought it wouldn’t be possible to breathe in a train, and panicked when radio was introduced into the car. Of course, not all warnings were without foundation, and today we are living with some of the consequences of the unconsidered decisions of generations before us. Additionally, some technologies impose a large number of considerations and make the choice even harder.

Consider AI, and some of the benefits and risks of fully functional artificial general intelligence:

·      AI could solve some of the most important issues from various fields such as medicine, environment, economy, technology and so on.

·      The progress of AI is inevitable. Banning the research is almost impossible, and the benefits could be so revolutionary that it is highly unlikely we will see anything but advances in the future.

·      Some of the leading scientists warned about possible devastating effects of AI.

It becomes obvious that the issue cannot simply be ignored, but it is less obvious how to address it. Since the goal is to build useful AI that will not endanger humanity, it is no surprise that many serious tech companies have turned to ethicists for advice. But before the contemporary attempts to solve the problem of AI or machine ethics, there was a surprisingly modern solution already waiting, proposed 80 years ago by a science fiction writer. Isaac Asimov proposed the Three Laws of Robotics as a way to ensure robots do not become a threat to humanity:

1.    A robot may not injure a human being or, through inaction, allow a human being to come to harm.

2.    A robot must obey the orders given it by human beings except where such orders would conflict with the First Law.

3.    A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.[1]


It is important to try to reconstruct Asimov’s motives in order to fully understand the mechanism of the laws. In game theory and the theory of rational choice there are a few main strategies; among them is the maximin strategy, which is basically minimising risk. When faced with a choice, one picks the option where the possibility of a negative outcome is minimal, regardless of the appeal of the positive outcomes. In other words, the first thing Asimov had in mind was to avoid a Skynet scenario. That is why there is a strict hierarchy, and each decision a robot makes needs to pass a compatibility check against the three laws, starting from the first one. Although protecting humankind from utter destruction may be a noble cause, it created unpleasant problems with functionality.
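The maximin rule described above can be sketched in a few lines; the options and payoff numbers below are invented purely for illustration:

```python
# A minimal sketch of the maximin rule: pick the option whose
# worst-case outcome is least bad, ignoring the best cases.
# Options and payoffs are invented for illustration only.

options = {
    "deploy AI":   [10, 5, -100],  # huge upside, catastrophic downside
    "restrict AI": [1, 1, 0],      # modest outcomes, no catastrophe
}

def maximin(options):
    """Return the option whose worst-case payoff is highest."""
    return max(options, key=lambda name: min(options[name]))

print(maximin(options))  # prints "restrict AI": its worst case (0) beats -100
```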

Below is Python-style pseudocode which explains the process of decision making using the Three Laws of Robotics:


def preferable(a, a1):
    # First Law of Robotics
    if not satisfies(a, FLR) and not satisfies(a1, FLR):
        return None   # both actions are forbidden
    elif satisfies(a, FLR) and not satisfies(a1, FLR):
        return a      # a is the preferable action
    elif not satisfies(a, FLR) and satisfies(a1, FLR):
        return a1     # a1 is the preferable action

    # Second Law of Robotics (both actions passed the First Law)
    elif satisfies(a, SLR) and not satisfies(a1, SLR):
        return a
    elif not satisfies(a, SLR) and satisfies(a1, SLR):
        return a1

    # Third Law of Robotics (both actions passed the Second Law)
    elif satisfies(a, TLR) and not satisfies(a1, TLR):
        return a
    elif not satisfies(a, TLR) and satisfies(a1, TLR):
        return a1
    else:
        # Both actions satisfy all three laws: there is no way to
        # determine the preferable action under the existing rules.
        return None


Robots in Asimov’s world use deduction as their main tool for decision making. This strategy corresponds with deontology and utilitarianism in classical ethics: an action is considered morally prohibited or permissible if this can be deduced from the designated axioms. The difficulty with Asimov’s system is that it can only decide whether a certain action is forbidden; it cannot tell us anything about the preferable action. So, when the system is faced with a dilemma (both actions are, or are not, in accordance with the Three Laws), it simply crashes.

Let’s take, for example, a simple intervention at the dentist. Pulling a tooth will cause immediate harm, but in the long run it will prevent more serious conditions. The robot cannot pull the tooth out, yet it cannot allow the human to suffer by refusing to do anything: the system is paralysed.
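This paralysis can be demonstrated with a minimal self-contained sketch; the action attributes and helper names below are hypothetical, chosen only to illustrate the dilemma:

```python
# Hypothetical sketch: the dentist dilemma under the First Law.
# An action "satisfies" the First Law here if it neither injures a
# human nor, through inaction, allows a human to come to harm.

def satisfies_first_law(action):
    return not action["injures_human"] and not action["allows_harm"]

def preferable(a, a1):
    """Return the preferable action under the First Law, or None."""
    a_ok, a1_ok = satisfies_first_law(a), satisfies_first_law(a1)
    if a_ok and not a1_ok:
        return a
    if a1_ok and not a_ok:
        return a1
    return None  # both forbidden (or both permitted): no decision

pull_tooth = {"name": "pull tooth", "injures_human": True,  "allows_harm": False}
do_nothing = {"name": "do nothing", "injures_human": False, "allows_harm": True}

choice = preferable(pull_tooth, do_nothing)
print(choice)  # prints None: both actions violate the First Law
```

Both candidate actions fail the First Law, so the rules yield no preferable action and the robot is stuck.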

OpenEth, on the other hand, uses induction as a way to learn the difference between morally acceptable and unacceptable actions. This kind of inference seems close to virtue ethics[2] and the distinction between act-centred and agent-centred ethics. The focus is not on the action itself so much as on the features a certain action shares with others. OpenEth can learn when a certain feature has priority and when the same feature is incidental. The system can learn that the long-term consequences of tooth decay are much more serious than immediate pain or displeasure, and can assess whether immediate harm or disrespect of autonomy is useful in the long run.
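A minimal sketch of such inductive learning, assuming a simple perceptron-style weighting of action features — this is an illustration of the general idea, not OpenEth’s actual implementation:

```python
# Illustrative sketch only, not OpenEth's actual implementation:
# an inductive learner that weights the *features* of actions
# (e.g. immediate harm vs. long-term benefit) from labelled examples.

def train(examples, epochs=100, lr=0.1):
    """Learn per-feature weights from (features, acceptable) pairs."""
    weights = {}
    for _ in range(epochs):
        for features, is_acceptable in examples:
            score = sum(weights.get(f, 0.0) * v for f, v in features.items())
            target = 1.0 if is_acceptable else -1.0
            if score * target <= 0:  # misclassified: nudge the weights
                for f, v in features.items():
                    weights[f] = weights.get(f, 0.0) + lr * target * v
    return weights

# Labelled intuitions: pulling a tooth is acceptable despite immediate
# harm, because of the long-term benefit; gratuitous harm is not.
examples = [
    ({"immediate_harm": 1, "long_term_benefit": 1}, True),   # dental extraction
    ({"immediate_harm": 1, "long_term_benefit": 0}, False),  # harm, no benefit
    ({"immediate_harm": 0, "long_term_benefit": 1}, True),   # pure benefit
]

w = train(examples)

def acceptable(features):
    """Judge a new action by the learned feature weights."""
    return sum(w.get(f, 0.0) * v for f, v in features.items()) > 0
```

The learner ends up weighting long-term benefit above immediate harm, so the dental extraction is judged acceptable even though it shares the “immediate harm” feature with forbidden actions.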

If we compare the two systems, it is obvious that OpenEth is the more practical one. But can we say it is better? The answer depends on what we want to accomplish. If we want a system that will never do any harm to humans, then we should stick to Asimov’s laws, even if this approach leaves us with nothing more than upgraded dishwashers. If we want a revolutionary change in our society, then we need to accept the risk that comes with change, and OpenEth would be the right path. Certainly, the system will evolve and become better and more precise; mistakes will become rare and minimal, and the benefits will surpass the shortcomings. But it may be that the real problem is the human factor behind the decision-making process. After all, humans are the ones who have the last word.

Ask yourself: from the perspective of safety, would you be comfortable using an AI system with a “moral compass” equivalent to the one the average human possesses? And if the answer is positive, does this not raise the issue of robot rights?




[1] Asimov, Isaac (1950). I, Robot, short story “Runaround”
Amendments were introduced later and since then have been the topic of interesting debate.

 [2] “It is well said, then, that it is by doing just acts that the just man is produced, and by doing temperate acts the temperate man; without doing these no one would have even a prospect of becoming good.”
Aristotle, Nicomachean Ethics, Book II, 1105.b9