TRANSCRIPT

Overview

Status: Delivered 2021-02-15 for Autonomous Agents on the Web
Home: HTML
Slides: PDF
Talk Outline: HTML
Code: Github
Video: NA
Audio: NA
Transcript: HTML

Transcript (Unedited)

Hello to all. I'm disappointed that we can't physically be together. This was one of the key highlights for me. But nonetheless, I'm very excited for the week, and I'm very honored to be a part of the session. The title of my talk is a bit playful: "From Steve Austin to Peter Norvig." And the subtitle is a bit more to the point: "Engineering AMEE, the Simple Autonomous Agent." You might even say simplistic. Let me just say a little bit about myself. This is how you find me on LinkedIn and GitHub and Twitter and YouTube and all those other places. I come from a very different space. I'll just say that I have advanced university degrees, but they're not in computers, they're in music. I was a composer, arranger, and studio musician for about a decade before I got involved in computers as a hobbyist. My whole life has been around this notion of creating and connecting, and connections are really, really important to me in general. Most of my talks, just like this title connecting two topics, reflect how I approach the world. Connections are what make sense to me, and connections are why I like hypermedia so much as well. So, hopefully, that will sort of make sense along the way.

Introduction

Andrei mentioned I write a lot of books. This is my most recent book, which is about web APIs in general. It's a sort of toolkit or guide to creating web APIs. I'm also working on some other projects that hopefully will be released later this year. So, what I wanted to talk about in the brief time we have today is to emphasize this notion of connectivity and connections and remembering things. And I'm going to start by talking about two individuals. First, Steve Austin, to date myself. Jim, even though he and I are about the same age, decided to show you a background from the 1950s; that was a bold move. I picked a reference to the 1970s, a television show called "The Six Million Dollar Man." And we'll talk a little bit about why I picked that and Steve Austin, because it's going to be a theme throughout.

But I also want to compare that to Peter Norvig. Peter Norvig popularized a lot of notions of what intelligence is and what agents are, has done a lot of work at Google and other places, and he and Stuart Russell have a very big book on this topic. I also want to touch on this notion of Turing, just for a minute, just to remind us what Turing sort of explained to us back in October of 1950. I'll talk a little bit about agents as pure background. I'm just going to give you sort of a look into the brain of a person who doesn't live this every day, what I glean and what I see, and the connections I make. And then I want to talk about this idea of this agent that I've created, this online agent called AMEE. And if we're lucky, we'll have one more visitor at the end, depending on time.

Steve Austin

So, what's this story about Steve Austin and Peter Norvig? There was this television show in the 1970s called "The Six Million Dollar Man." It was actually based on a real occurrence of an astronaut who was terribly injured in a test crash. But in this fictional version, the astronaut, Steve Austin, gets rebuilt as an automaton, as a robot. And I love the line that became famous from the opening. "We can rebuild him." It was a him. "We have the technology. We can make him better than he was." And a lot of what I'm going to be talking about today is based on this same idea. We have the technology to do a great deal. Jim's talk was about, you know, where are the agents? And there were lots of questions that talked about, well, maybe the agents are everywhere as [inaudible 00:04:01] pointed out.

So, I want to talk a little bit about that notion of what we have available to us, and what we can do. By the way, if the show were made today, this would be "The 35 Million Dollar Man," which maybe seems about right; I don't know, it may still be kind of cheap. But again, what I loved about the premise of the show and what was really fascinating to me is this: with whatever technology we have today, it's the power of a machine, but with the brain of a human. And I think that tells us an awful lot about where we should be aiming and where we could be gaining success, winning. And that is the notion of combining the power of machines and the brains of human beings. And I think that's really, really important as well.

Peter Norvig

Now I'm going to flip quickly, jump ahead, from 1974 to 1994/95, Peter Norvig. Peter Norvig and Stuart Russell wrote a great book, which we'll talk about in a minute, but his line is "More data beats clever algorithms." And I love this idea, "better data beats more data," and so on and so forth. But the idea is that a lot of this has to do with being driven by data, and not driven by intelligent or clever algorithms. Being just a little bit too clever can get us in an awful lot of trouble. And in fact, Peter shows us this in many ways. By the way, if you think I'm being unfair to Peter's outfits, you just need to take a quick look on the Google and you'll find that Peter enjoys these outfits, these sartorial splendors. I love his shirts. There needs to be a whole website or a wiki just about Peter's shirts.

What really struck me, what fascinated me, what got me thinking about autonomy and intelligence was when Peter just dashed off, on a plane, an explanation of how spell correcting works at Google. What you see on the screen here, maybe kind of hard to read, is the entire application of spell correcting. It's a simplified version of what runs at Google, without all of the error checking. And in fact, it's very, very simple: there are about 20 lines of code here, not counting the comments. The really killer line is the one where it's opening a big.txt file. He's actually loading a corpus of public-domain books from Project Gutenberg as his guide, and then uses a simple algorithm, not a clever algorithm at all, but a simple algorithm, to define adjacency between words. Now, without getting into the details of why this is good or why this is bad, it tells us a couple of really important things. It's clear that this algorithm, this software, knows nothing about spelling at all. Yet it achieves what we're looking for. It makes us think that it's correcting our spelling.
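For reference, here is a condensed sketch patterned after the corrector Norvig later published; it assumes a local big.txt corpus file and may differ in small ways from what was on the slide. The point stands either way: it only counts words and tries nearby candidates, and knows nothing about spelling.

```python
# Condensed sketch of a Norvig-style corrector: count words in a corpus,
# generate candidates one edit away, pick the most frequent known one.
import re
from collections import Counter

WORDS = Counter(re.findall(r"\w+", open("big.txt").read().lower()))  # load the corpus

def edits1(word):
    """All strings one simple edit away from word (the 'adjacency' idea)."""
    letters = "abcdefghijklmnopqrstuvwxyz"
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes    = [L + R[1:] for L, R in splits if R]
    transposes = [L + R[1] + R[0] + R[2:] for L, R in splits if len(R) > 1]
    replaces   = [L + c + R[1:] for L, R in splits if R for c in letters]
    inserts    = [L + c + R for L, R in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)

def known(words):
    """Keep only candidates that actually appear in the corpus."""
    return {w for w in words if w in WORDS}

def correction(word):
    """Pick the most frequent known candidate: data, not a clever algorithm."""
    candidates = known([word]) or known(edits1(word)) or [word]
    return max(candidates, key=lambda w: WORDS[w])
```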

It's also based on a super simple idea, and that is: I'm going to load up a bunch of data and then I'm going to do a simple matching or checking algorithm. And that is really the heart of the whole thing. Now, I mentioned this earlier, Russell and Norvig. This book shows up all the time when I'm looking for things. I'll be going through some chestnuts from it. I'll be talking about two particular items in Russell and Norvig's book, and that is some hierarchies and this idea of environments and percepts and a few other things, because I think, again, they remind us of several really important elements to the story.

Alan Turing

So, now, I've sort of set the stage. We have these two ideas, this idea of machines and humans, and this idea of staying super simple, keeping it very small. And I want to mention this idea of Turing's Imitation Game. We talk a lot about this notion of the Turing Challenge, or the Turing Test, and these other things. It's become this sort of meme of various types. But it's really important, I think, to remember what Turing was saying in his 1950 article in the journal Mind, and that is that it's an imitation game. We are not trying to create intelligence or sentience or any consciousness, or any of these kinds of things, we're really just talking about imitations. And in fact, most of the way we learn in real life is we actually learn to imitate others. I just got done reading a "Harvard Business Review" article about how observable behavior is the best teacher. Seeing people do things teaches our brains much better than reading about them or hearing about them. So, it's an imitation game, and that's really what Turing was putting before us. Can we imagine digital machines that can actually imitate sufficiently to fool other humans? And in fact, that's what all of us do every day. We fool other humans every single day. And that's really the game that we're playing. We should always remember we're just imitating.

Intelligent, Autonomous Agents

Now, to talk a little bit more about the background, and I won't spend a lot of time on this, but I wanted to sort of set the stage in general. And that is this idea of intelligence and autonomy, right? So, Norvig and Russell basically say it's really simple: an agent is anything that can be viewed as perceiving its environment through sensors and acting upon that environment through actuators. Perceiving the environment is a super important element that I haven't seen talked about a lot here, only a little bit. I think actually perceiving the environment is the key element to success in gaining autonomy. You need to be able to understand your environment and the things around you. Most of the work that I've experimented with in hypermedia has to do with creating an environment of hypermedia that an agent can participate in. And that's where the sensors and the actuators come into place. I can sense what's in my environment and I can act upon it. So, that's a key element in the story: perceiving the environment is key.

This is great. I don't know if anybody can actually find this for me; I've spent hours, many times, trying. There's a quote that keeps getting referred to from some IBM white paper on artificial intelligence strategy about what an autonomous agent is: it carries out operations on someone's behalf, with some degree of independence, using some knowledge of that person's goals. And that's that agent, that's that assistant that Jim was talking about earlier in the process. Again, if we think in terms of "I'm creating an agent," not "I'm creating a human," then I think we can get a lot farther, we can do a lot more things, and we can find the results a lot more satisfying in many ways.

Percepts, Actions, Goals, and Environment (PAGE)

And then finally, as a kind of background story, I wanted to mention this notion of PAGE and hierarchies. PAGE is something that usually comes up in intelligent agents and artificial intelligence classes quite a bit: this idea of percepts, actions, goals, and environment. And I will tell you that when I create hypermedia agents or bots, I'm actually thinking about these things constantly. I'll show you an example in a little bit. First, percepts: what is it that I'm supposed to perceive? Am I supposed to perceive food? Am I supposed to perceive danger? Am I supposed to perceive shelter? Am I supposed to perceive my mate or my kin or my clan? Percepts are a real key element. What is it that I'm supposed to perceive? And that's just what I'm supposed to perceive.

And then what actions am I supposed to take? If I perceive food, how do I eat it? If I perceive a friend, how do I greet them? If I perceive danger, how do I get away from it? So, what are the acts that I am able to do? One of the things that I find very frustrating in a lot of the work that I get exposed to, whether it's multi-agent systems, autonomous agents in general, or intelligence, is that there's lots and lots of stuff on data, but not a lot of stuff on actions. How do I know what to do? Perceiving what is possible is really, really important in the process.

Also, what is the goal? What is my goal? Is my goal to make it through the next day? Is it my goal to eat? Is it my goal to procreate? You know, there are several insects that don't live long enough to even eat at all. Once they hatch, their only job is to procreate and die, right? So, what are the agents we're creating, and what are the goals that they're pursuing? And are goals binary? Are they yes-and-no goals? Or are they negotiable? Like, do I get close enough? Am I within a range? So, when you're establishing agents that are going to have some kind of autonomy, you need to give them some kind of goals. And then finally, I talked about this earlier, this notion of environment. Environment is super important in the way I think about the situation. Where am I? I'm in the environment. One of the things I don't like about this classic image, which you see all over the web trying to explain this, is that the diagram shows the agent outside the environment. That's foolish. The agent is inside the environment. You are in the environment. If you're in the wrong environment, you're not going to survive, right? So, environment is super important. A lot of the work that I do is actually describing environments and then allowing agents to participate in that space. So, that's a real key, I think, to all of that.
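As a concrete illustration, here is a rough sketch of how that PAGE checklist might be written down as data for a maze bot before any code exists. The field names and example values are hypothetical, for illustration only, not taken from AMEE's actual source.

```python
# A PAGE checklist captured as plain data: what the agent can notice, what it
# can do, how it knows it is done, and where it lives.
from dataclasses import dataclass

@dataclass
class AgentDesign:
    percepts: list      # what the agent is able to notice in its environment
    actions: list       # what the agent is able to do about what it notices
    goal: str           # how the agent knows it is done (or close enough)
    environment: str    # where the agent lives; the agent is inside this

maze_bot = AgentDesign(
    percepts=["doorway links", "exit link", "cell metadata"],
    actions=["enter a maze", "move through a doorway", "stop at the exit"],
    goal="reach a cell that offers an exit link",
    environment="hypermedia maze resources (XML over HTTP)",
)
```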

And then finally, this notion of classes of agents. This becomes really important to me when I'm thinking about trying to automate agents. The kinds of agents that I automate are things like "go through all of the records of the company, find all of the accounts that are 90 days past due, write them a customized letter, put them in a list, and then notify a human about anybody that's over 120 days past due." These are bots that scurry around and collect information and do things. They're totally in virtual space.

Classes of Agents

But there are lots of classes of agents, right? There are simple reflex agents, and a good example of a reflex agent is a motion sensor. I have a motion sensor on my garage door. Or I might have a motion sensor at the entrance of the office building that automatically opens the doors for me. That's a reflex agent that's stateless. It doesn't have to remember anything. It exists for just one purpose and one purpose only, sort of like that bug that's supposed to procreate. That's different from a model-based agent. So, when I want to move up and make a more sophisticated door, often I'll give it a model. You're only supposed to open the door automatically during office hours, when the building is open. So, I'm going to give you a model, and that model is the set of times of day when you're going to interact. Otherwise, you're not going to interact, or maybe you do something else instead. So, adding a model is really key. A lot of times when we write code for an agent, what we're doing is really adding models.

What I find fascinating is when I can replace the model without replacing the code. That’s a really important idea, right? That’s what we do. We don’t replace our brains when we think of things, we put new models in our brains when we think of things. And I think that’s a great way to think about it. Goal-based, we’ve already talked about this. This idea I can wander around all I want. What is my goal? How do I know when I’ve gotten there? How do I know when I’m done? Many of the systems that I deal with in the hypermedia and business side are focused only on the service’s goal and not the agent’s goal, not the client’s goal. So, the client’s goal is often to just simply keep doing whatever the service tells them. That’s a very weak agent. If I have an agent that’s supposed to go out and search and find certain things and then notify me as a human about it, then I need to give that agent some goals. They are not just reflexes, they don’t just have a model, they also know when they’ve found what they’re looking for.
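Here is a small sketch of that "replace the model, not the code" idea using the door example. The hours shown are made-up values for illustration.

```python
# The door-opening rule stays the same; only the office-hours model (plain data)
# changes. Swapping the model changes behavior without touching the code.
OFFICE_HOURS = {"open": 8, "close": 18}   # the model: swappable data, not code

def should_open(motion_detected, hour, model=OFFICE_HOURS):
    """Reflex behavior (react to motion) gated by a model (office hours)."""
    return motion_detected and model["open"] <= hour < model["close"]

weekend_model = {"open": 10, "close": 14}
should_open(True, 9)                      # True with the default model
should_open(True, 9, weekend_model)       # False once the weekend model is swapped in
```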

Utility agents are agents that actually can get better, that can optimize, that can find the utility, that can become "happy," as it's sometimes said, in the system. "This is a good way. I've done a good job of solving this goal. I've optimized this in some kind of way." And then finally, learning agents. Learning agents, like utility agents, have some kind of memory, but then they can operate on that memory and they can modify that memory. They can actually change their plan, change their organization. This will make a little more sense when we talk about my AMEE agent in a bit.

But when you think about this, we have all sorts of robots in our lives already. Sorry about that. We have robots that allow us to open doors automatically. We have door robots. We have door robots with models. We have goal-based agents that help us find things, that drive us to certain locations, that can at least tell us how to get there, you know, adding the machine to the human. We have utility agents that help us optimize things and so on and so forth. We have them all around us every single day. And when we think of them as imitators rather than embodiments, I think it helps quite a bit. So, an intelligent autonomous agent, in the way I'm talking about them today, is something that imitates users carrying out tasks, using these various interaction elements and using a handful of different types or models.

A successful ecosystem has all sorts of autonomous agents at all sorts of levels in the system, right? So, some are reflex agents, some are model-based, some are goal-based, and so on and so forth. They don't all have to be super, super smart. As we add features, using the multi-agent model, what I really need are more agents, not a smarter agent. We used to talk about this when we were talking about scaling software. Do you need a bigger cow or more cows, to borrow the cow story that Jim was talking about earlier? So, often what we really need are more cows, not bigger cows.

Engineering AMEE: The Autonomous Maze Environment Explorer

Okay. With that in mind, I want to talk a little bit about engineering this maze explorer. I kept it super simple because I wanted to really focus on the parts rather than the actual details. And I also wanted to pick technology that you probably haven't used or haven't seen, because I want to focus on what's available around us. So, mazes, pretty simple, right, this idea of wandering through a maze. This is actually one of the clients for a maze server that I have. This is another client. If you remember Cave Adventure, or if you've ever seen online adventure games, this might look a little familiar. What I have is actually a server that serves up virtual maze information, in the form of collections of mazes, a single maze, cells or rooms in a maze, and eventually, an exit in a maze. And I purposely created this system so that it uses hypermedia links as the only way you can get anything done. There's very little else in this API except links and some metadata about those links. Sometimes there are strings, there's always a URL, there's always a rel name or a value, to tell you what this is for.
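To make that concrete, here is a minimal sketch of the perception step a client like this performs. The XML shape and the URL are assumptions for illustration, not the server's exact format; the only thing the agent relies on is that every response carries links with a rel name and an href.

```python
# Fetch a maze resource and return its links as {rel: href}. Namespaces and
# other metadata are ignored in this sketch.
import urllib.request
import xml.etree.ElementTree as ET

def perceive(url):
    """Sense the current resource: collect the links it advertises."""
    with urllib.request.urlopen(url) as resp:
        doc = ET.fromstring(resp.read())
    return {link.get("rel"): link.get("href")
            for link in doc.iter("link")
            if link.get("rel") and link.get("href")}

# Hypothetical usage: the agent only perceives rel names such as "north",
# "south", "east", "west", "exit" and the URLs they point to.
# links = perceive("http://example.org/maze/cell/0")
```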

So, what I want to do is build an agent, an autonomous agent, to navigate these mazes. And that's going to be what we call AMEE. And I'm going to talk a little bit about the design, the implementation, and eventually, if we're lucky, I'm going to have a demo for us as well. So, what is AMEE? I'm going to design a machine that can navigate arbitrarily-sized two-dimensional mazes. It's two-dimensional because that keeps it simple, and I left out a lot of other details just so we could focus on this part of it. It's going to be goal-based. What's its goal? Its goal is to get out, to find the exit. It's going to perceive various doorways or portals, it's going to have to understand some metadata and some data around those doorways, and it's going to have to find affordances. That's really what the door is. It's going to be able to use affordances to turn and face a particular direction, and then move in that direction.

Designing AMEE with Application-Level Profile Semantics (ALPS)

The environment I'm going to use is an XML message over the HTTP protocol. You could also say it's stacked on TCP/IP. But that environment could be lots of different things. I've actually seen a handful of implementations of this same maze using JSON, using MQTT, done simply as document files, text files, all sorts of things. The point here is I treat each one of these elements, the percepts, the actions, the goals, and the environments, as separate targetable elements. So, let's just jump right to it. This is a design document that I used to design my environment, or my space. This is actually a format called ALPS, or Application-Level Profile Semantics. It's a very simplistic thing. You could do this in OWL, or RDFS, or lots of other things as well. This here is actually the environment that I'm describing. You can see the highlight here. What you'll see are actually a few data points, and then a handful of action items. Notice there are more action definitions than there are data definitions. This is real life. In real life, the data is only the evidence of action. So, we're going to keep the data super simple, and we're going to focus on actions. Often, when I look at ontologies, the very first thing I'm looking for is: tell me what actions are possible in your environment. And if I don't have them, I'm in big trouble when it comes to writing an agent.

The rest of the document actually describes what you would call resources in HTTP, or parts of the environment that will contain certain things. You literally place things into the environment. If you're interested in ALPS, there's an IETF draft spec; we're on version 6 right now. This is something that I created for one of the books I did several years ago. It has some popularity in the industry space. I don't think it has much popularity in academia. So, this is the state-machine version of the environment I just explained to you. And this literally is a model that is going to be embodied inside the agent: collections, items, individual rooms, and eventually the exit. Now, this is not just AMEE's model, this is a model of all two-dimensional mazes. And when you're creating models for agents, you're going to have to create a model for a wide variety of elements. This does not tell you how to get to the exit. You don't see anything here that explains how you do this. It simply describes the environment: collections, mazes, rooms, and the fact that some rooms have an exit. They might have one or more exits.
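One way to picture that state model is as plain data the agent can carry around. The state names below are assumptions based on the description above (collections, maze items, rooms or cells, and an exit); the real ALPS document is what actually defines these descriptors.

```python
# A rough sketch of the maze state machine as a transitions table: from each
# kind of resource, which kinds of resources can come next.
MAZE_STATE_MODEL = {
    "start":      ["collection"],       # entry point: a list of available mazes
    "collection": ["item"],             # pick one maze from the collection
    "item":       ["cell"],             # enter the maze at its starting cell
    "cell":       ["cell", "exit"],     # move cell to cell until an exit appears
    "exit":       [],                   # goal state: nothing left to do
}
```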

By the way, this is some documentation from that same document, that ALPS document. And if you're interested, it'll be in the slides. This is from an application called the App State Diagramming tool for ALPS. So, then we need to talk about code. This is really the entire code. Remember I showed you Norvig's code, which is relatively short, about 20 lines. I couldn't quite get it into 20 lines; mine takes a little bit more space. Maybe if I did it in Python, it would probably be a little bit easier, but it's pretty simple. And it has the goals embodied in it, right? The first goal is an exit. And if I'm not at an exit, let's see if we have to get started, and there are a few ways that I can get started. And once I've started, I need to actually navigate from room to room.
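Here is a minimal sketch of that top-level loop, in Python rather than AMEE's own language, so the names are stand-ins rather than the names in AMEE's code. It leans on the hypothetical perceive helper sketched earlier, and the choose_move rule is sketched after the next paragraph.

```python
# Top-level loop: check the goal first, start if necessary, otherwise keep
# navigating room to room by following links.
def run_agent(start_url):
    url, facing = start_url, "north"            # assumed initial heading
    while True:
        links = perceive(url)                   # sense the current resource's links
        if "exit" in links:                     # goal reached: an exit affordance appears
            return links["exit"]
        if "start" in links:                    # not inside the maze yet: enter it
            url = links["start"]
            continue
        facing, url = choose_move(facing, links)  # otherwise pick the next doorway
```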

AMEE’s Model-Driven Algorithm

And notice, I pointed out in Norvig's app that there was one line that loaded a bunch of data. That was, like, the key to it. This is the key to my agent. This is the one line that loads all the rules for navigating that maze. And these are the rules in a simple two-dimensional maze: if I'm facing east, try to go south first; if you can't go south, go east; if you can't go east, go north. This is simply the rule; the entire intelligence, the simple algorithm, of this maze bot is captured in just these four lines. And I probably could have made them a little bit easier too, a little bit simpler, but I made them a bit verbose.
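Those rules can be written out as a small data table plus a tiny chooser. Only the east-facing row comes directly from the description above; the other three rows are an assumption that the same right-hand-wall pattern applies, and the direction names are stand-ins for whatever rel values the maze responses actually carry.

```python
# Right-hand wall following as data: for each current heading, the order in
# which to try the doorways (turn right, go straight, turn left, go back).
WALL_FOLLOW = {
    "north": ["east", "north", "west", "south"],
    "east":  ["south", "east", "north", "west"],   # the row quoted in the talk
    "south": ["west", "south", "east", "north"],
    "west":  ["north", "west", "south", "east"],
}

def choose_move(facing, links):
    """Pick the first preferred direction the current cell actually offers."""
    for direction in WALL_FOLLOW[facing]:
        if direction in links:
            return direction, links[direction]
    raise RuntimeError("no doorway perceived in this cell")
```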

So, it turns out that's all I need: that general map of the environment, that ability to perceive links and act upon them, and a set of rules are all I need in order to actually create an autonomous maze bot. So, what I'm going to do here is just fire this one off, and it's going to start walking room to room to room. Now, you'll see I actually made the agent a little bit friendly. It speaks to me, it sort of tells me where it's going, and what it's going to do next. And it will simply navigate as far as it can. Now, the algorithm is super simple. It's called a wall-following algorithm for two-dimensional mazes, so there aren't a lot of optimizations here. While this bot is model-based and has a goal, it doesn't have any utility functions or any other intelligence or learning.

Running the AMEE Demo

Now, it did get out in 34 moves. This is a five by five maze, so that's actually not great. You could probably get out in fewer than 10 moves if you were really good, but again, it doesn't know about the whole maze, it just knows about its current room. There's also another one in here. I think I can do this one here. By the way, this maze that I just showed you…oops, I just messed this up. Hold on just a second here. I apologize. What I showed you is actually artificially slowed. Here's one that actually navigates…let's see if I can do this. I'm sorry. Yes. This navigates a 100-room maze in less than a second. So, I was slowing down the other one.

So, a maze of a thousand rooms, and so on and so forth, wouldn't really be a challenge for this particular bot. Let me go back and find my slides here. Where are my slides? Here we go. Here we go. So, let me wrap this up. I'm running a little bit long. So, how could I make AMEE smarter? We would do just what we talked about: include some rewards, some dangers, and then add some utility and some learning. And we could add those incrementally, or we could involve multiple agents, and have just an agent for learning, and just an agent for utility. How does AMEE make sense in the real world? Just consider the idea of navigating for shopping: selecting something to buy, making the acquisition, making the payment, learning how to optimize that experience, and then learning how to apply that to other things. These are the worlds of multi-agent thinking. And each one of these is very simple. We have many of these around us already. It's simply a matter of focusing on their environmental elements more than anything else.

Rene Descartes

Now, just to touch on this: René Descartes had actually answered Turing's question 300 years earlier. Can we imagine such a machine? Descartes said "No." Descartes also said something fascinating to me: that no machine could "have enough different organs" to act in all the situations of life the way our reason lets us act. This is multi-agent systems. Descartes, 300 years ago, had already thought through this notion of what it really takes to embody humans. It turns out we need lots of AMEEs. We need AMEEs all over the place. And we have AMEEs all over. So, I think a lot of our talk about how we connect them, how we get them to cooperate, is a really key element to the story.

Conclusion

Just as with a team of anything, what we really need to do is find ways to collaborate, and keep everything super simple. We don't need a bigger cow, we just need lots of them. So, we have the technology today. We can imagine digital computers doing this using simple things that have been around for decades. It's a matter of just engineering enough AMEEs to really do it. So, hopefully, that makes some sense and gives us some ideas to talk about, and I'll go ahead and stop sharing my screen at this point. I'll upload the slides, including the code for AMEE…