This program is brought to you by Stanford University. Please visit us at stanford.edu.

This presentation is delivered by the Stanford Center for Professional Development.

This way, I mean, for those who don’t know, Ben was really considered one of the founders of the whole field of HCI. He was doing work in the area before it got labeled. He was involved in organizing the first HCI conference, he wrote what has widely been used as the main textbook for the field, plus he’s had a series of research programs at the University of Maryland over the years, and one of the main centers. So he’s a very central figure in HCI and well worth hearing, as is Rob Miller, who is not quite a central figure but equally well worth hearing, from MIT CSAIL, which is the Computer Science and AI Laboratory. We’ve had a lot of speakers over time from the Media Lab, which is sort of a different cut at things. Rob’s work is in that sort of borderline between HCI and software engineering, in that he’s really interested in the question of how you get people to easily produce programs without needing the kind of sophisticated engineering tools that the professionals use. And this is what we call, and Scott Klemmer here has research in a similar vein, the so-called long tail of people who would like to be able to do things more than just read what’s on the computer, but they’re not going to go to CS graduate school to do it. With that...

Okay, thank you, Terry. I’m going to talk about a couple of the systems that we’ve built over the past few years in the area of end-user programming, particularly for the web. But my work sits at the intersection of lots of fields, not just HCI and software engineering but also artificial intelligence and, in fact, computer security. So we’ll have an example today of a problem and a solution that we’ve done in the area of building more usable solutions to security problems as well. And this work couldn’t have been done without the help
of a lot of students. This was master’s work of Michael Bolin and Greg Little and Eric Lieberman and Darris Hupp, and a bunch of other students contributed to the things that I’m going to show today, so most of the credit for this hard work belongs to them.

First, to set the stage: the web is increasingly an application platform, and I’m sure that’s not news to most of the people in this room. Many of the things that we used to do on the desktop, in desktop applications, have now moved to the web. Whether it’s managing our email, managing our calendar, or using Internet applications, all these kinds of things are happening on the web. And increasingly, as we spend more time using the web as the place where we do work, we really want to customize these web applications so that we can do our work better, so that the applications we use are better suited to our own needs and the kinds of tasks that we need to do with them.

So here are some kinds of customization that you might imagine doing on a website or web application, and they sort of range from the simplest to the most complicated, most sophisticated kinds of customization. The simplest is probably a bookmark. A bookmark on the web is essentially a shortcut: you’re grabbing a state of a website or an application that you want to be able to return to later without having to do all the steps that it took to get to it. So just creating a bookmark is an example of a customization of a website. A little more complicated are customizations that change the appearance of the website, for instance simplifying a very complicated website to make it more appropriate to your needs. For example, I’ve built a version of YouTube for my three-year-old that eliminates a lot of the features of YouTube that he doesn’t need and that would only get him into trouble if he clicked on them by accident. So we want to be able to simplify, and also perhaps enhance, add features to websites in order to make them more suited to our task. The
next step might be automating: taking a sequence of steps that you do in a web application and making it into a macro or a script so that you can do that faster. And then sort of the holy grail of web customization is this idea of mashing up two or more web applications, gluing them together and

making them work together to accomplish one of our tasks. So I’m going to show examples over the course of this talk of all these kinds of customization, ranging from bookmarks to mashups that we’ve built.

Now, the good thing about the web compared to the desktop is that it’s sort of inherently open in ways that the desktop was not. So, for example, the display of a web application sitting in your browser is just HTML, and in fact you can change the way it looks and appears and behaves inside your browser without the cooperation even of the website. I can change this Google homepage, and we’ll see some examples of actually doing that. It’s open to that kind of change despite the fact that the creator of the website didn’t really intend for it to be open. You can compare that with the desktop: the user interface of a desktop application is basically just pixels, and you can’t do much with those pixels. Similarly, the back end of a web application happens over these open protocols and open formats like HTTP and HTML and XML, which you can snoop on and change and automate and parse. The back end of a desktop application didn’t have these hooks. It was invisible; it was behind the pixel appearance of the application unless the developer intentionally provided an API. So the good thing in the web is that even when developers don’t give us APIs, we can still do things to customize and automate applications.

But it’s not necessarily easy, and here are two barriers or problems in this area that we’ve sort of focused on. The first one is invisible complexity. Even though we can change the appearance of Google in our web browser, there’s a big difference between the simple user interface that Google presents to you (it’s been lauded as one of the simplest user interfaces out there) and what you see if you actually go under the covers and look at its HTML source. It’s far more complicated. And if you have to program it, if you have to
customize it down at that level: here’s a line of JavaScript, for example, that clicks on the I’m Feeling Lucky button. In order to actually figure out how to write that line of JavaScript, you have to do a lot of reverse engineering of this complex underlying HTML, and that’s a barrier even for programmers. Even for people who know HTML and know JavaScript, this is a complexity barrier that they have to overcome. So our approach is that we want to just program against the rendered user interface, the thing that we’re already familiar with and that’s already visible to us. This is an example of the command that you can write in our system. It is not only easy to create, since you’re just using keywords from the rendered view, from the user interface, but it’s also easier to understand. Looking at a script that had this line in it, I would be able to much more easily understand what it does than a script that has that line in it.

So that’s the first problem. The second problem is the problem with syntax, and this is more general than just web programming, but it bites you pretty hard in web programming, because web programming increasingly requires knowing and learning a collection of different syntaxes and different programming languages and different formalisms, and mastering all of them. For example, here’s another sort of equivalent line of code that clicks on that I’m Feeling Lucky button and does it by using an XPath. So this is an XPath pattern that matches that button. In order to actually do this, you have to learn some JavaScript and you have to learn this XPath formalism, and that’s a barrier, particularly for end users, for non-programming users, people who haven’t had any training in programming at all. But it’s also actually a barrier for programmers as well, because you have to master more than just one programming language. So our approach to this is to think about just throwing away the syntax barrier. What can we do without having any syntax
in our language at all? What if we just use keywords for the whole language instead of using a formal syntax? We’ll see some examples of how you can use that in real applications and real programming systems. And the second piece of this, in fact, will be that rather than writing the program, you’ll demonstrate it. This is often called programming by example or programming by demonstration, where you actually specify the automation that you want to happen by clicking on things and filling in forms. So we’ll see some systems that do that.

So that’s a high-level view of what I’ll be talking about. Here is sort of a visual outline of where we’re going. I’ll show you a bunch of systems that we’ve built in this area, and we’ll gradually fill in this picture here. Most of these systems are in the web domain, so they’re built on top of the Firefox web browser.
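The keyword idea described above, naming a page component by the words the user sees rather than by its HTML, can be illustrated with a small sketch. This is not the matching algorithm of the system in the talk; the `Element` record, the word-overlap score, and the sample `page` list are all assumptions made up for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class Element:
    """Hypothetical stand-in for a rendered page component."""
    kind: str   # e.g. "button", "link", "textbox"
    label: str  # the text the user actually sees


def tokens(text):
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def best_match(keywords, elements):
    """Return the element whose visible label best matches the keywords.

    The score is shared words divided by all distinct words (an assumed,
    Jaccard-style measure). Returns None when nothing overlaps.
    """
    wanted = tokens(keywords)
    best, best_score = None, 0.0
    for el in elements:
        have = tokens(el.label)
        if not wanted or not have:
            continue
        score = len(wanted & have) / len(wanted | have)
        if score > best_score:  # ties keep the earlier element
            best, best_score = el, score
    return best


# A made-up model of a search homepage as the user sees it.
page = [
    Element("textbox", "Search"),
    Element("button", "Google Search"),
    Element("button", "I'm Feeling Lucky"),
    Element("link", "Images"),
    Element("link", "Advanced Search"),
]

print(best_match("feeling lucky", page))
```

The point of the sketch is only that the lookup key is the visible caption, so a command such as `click feeling lucky` needs no knowledge of element ids or XPath. A real matcher would also have to weigh element type, position, and nearby captions.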

But I’ll also show you at the end an example where we’ve taken this idea of programming with keywords, programming without syntax, and applied it to Java programming. That will be a system that’s built inside the Eclipse integrated development environment for Java.

So, diving in, the first thing I want to show you is Chickenfoot, which is our programming system for doing these kinds of web automation and customization problems. The key idea of Chickenfoot is that users shouldn’t have to look at the HTML source code in order to automate and customize a website. The way we do that is with keyword patterns that match keywords in the text in the actual rendered view of the page in order to identify what it is that you want to interact with. For instance, the click feeling lucky command clicks on the best match to the keywords feeling lucky.

Let me actually give you a demo of Chickenfoot. This is Firefox showing the Google homepage, and the sidebar on the left is Chickenfoot. This is an editor where I can enter Chickenfoot scripts, and what I’m going to demonstrate for you now is a script that searches Google for people’s faces, so that I can type in, let’s say, Bill Gates and be able to come up with a picture of Bill Gates’s face. You’ll see later in the talk why I want to be able to do this, but for now just assume this is something that I do frequently, so I want a shortcut. I want to be able to automate this task. The first step in this task, from the Google homepage, is to get into image searching mode. So I’ll run a command here, click images, and what happened was it searched for the button or hyperlink on the page that was the best match for this keyword that I put as the argument to the click command. Similarly, I can say click advanced, and that’s going to click the Advanced Image Search link over on the right. And then on the Advanced Image Search form I can say I want you to pick the faces radio button, and the best match for
that is this down here. And now finally I want to fill in a search query, so I want to put something, Bill Gates for example, into this text box here, and so I need some keywords that will be able to identify that text box. I’ll pull some keywords out of its caption here and just say, let’s say, I want you to enter into the all words text box Bill Gates, and that will fill that in. And then finally we’ll click on the search button. So this little script simply automated that sequence of actions that did a search for Bill Gates’s face.

That bit that I just showed you is the kernel of this larger script. What I just showed you is this sequence of commands right here. This larger script is going to customize Google’s homepage. Let me get back to Google’s homepage. Oops, it already has the customization in it. Let me just turn that off so you can see what Google’s homepage looks like without it.

Yes, this is Google celebrating the invention of the laser. Oh really, is that one? Yes. Yeah, I didn’t customize that.

So what I want to do is be able to make this a command that is always available to me when I’m looking at the Google homepage. What the script does is take that kernel that I just showed you how to do and abstract it, so it’s not always searching for Bill Gates anymore but instead is determining the person I want to search for by finding this text box on the homepage and extracting whatever I’ve typed into that. It then wraps that whole behavior up into a function, which in turn is the handler of a button, and then that button is inserted on the page, again using a keyword pattern, saying I want to put that after the I’m Feeling Lucky button. So if I run this script, what I get is a new button on the page that’s inserted in there as if Google had provided it for me, and when I click on it, it will run that function and go through and do the search for faces. And I can essentially attach
that script as a trigger that fires whenever I visit google.com, so that it will essentially always be available to me. So I’ve done two things here with this example. I’ve customized Google’s appearance: I’ve added a new button to the Google homepage. So I’ve customized the way Google works, and I’ve automated it, so clicking on that button runs an automatic script that retrieves those faces. So both of these things are

important.

A question, which I imagine you’ll get to, but I have to flag it out of curiosity: there’s been a lot of hype around mashups on the web, and it’s clear that some kinds of customizations are enormously useful and other kinds are things that programmers invented because it was fun to build infrastructure for. And, I mean, from having seen Chickenfoot and your other tools before, there’s really slick stuff you can do, and I’m wondering if you’ve started to build a theory of: these are the kinds of customizations that end users find most valuable, and the community should focus on A and not B. You don’t have to answer now, just at some point in the talk.

Yes, I have some hand-waving about that. Maybe I’ll save it for the end. Right, just in case you don’t have time. But it is a very good question.

Let me show you a little larger example here. Let me see if I can actually get to it. This is our bug database for Chickenfoot, if it ever comes up. This is actually an open-source bug database that we’re using, so we actually have the source code for this. It’s written in PHP. One of the things that I frequently do with this bug list is assign bugs to people to fix, or change severities and priorities, and actually doing that in this interface is very painful, because I have to click on each bug, go into a view of the bug, then click in to edit the bug, then choose the severity that I wanted, and then back up to the whole summary list. What I really want to be able to do is just change it in place, to be able to just directly manipulate this list. Now, I could have dived into the open source, dived into the PHP, and tried to understand how to add that behavior to this bug database, but that would require learning a lot more interfaces, a lot more of the system than I wanted to learn. Whereas I already knew how to use the user interface of this thing. I already knew how to click into the bugs in order to actually
change the severity. So what you’ve just seen me do: I clicked on a link there that actually fired a Chickenfoot script that automated the interface in order to pull out the drop-down lists for each of these properties (severity, priority, status, assigned to, and so forth) so that it could embed them in this page. And when I change one, it will go in, again automating the interface, click into that bug and change its severity, and essentially automate the tasks that I would have done manually. Doing that in Chickenfoot took less than an hour, because I was using interfaces that were already familiar to me. I was programming against the user interface of the bug database rather than diving into the source code of the bug database. So even though the source code was available to me, that was still a barrier that was too much for me to want to surmount.

Okay, I hope that gives you some idea of what we can do with Chickenfoot. One of the evaluations we did on this work was to see how well keyword patterns actually work as a method for identifying page components. So we did a little evaluation in which we found text boxes on the web with various different kinds of labels, in order to sort of stress this approach of matching text boxes by their label captions. Some of them were clearly labeled with textual captions, like these top two, and some of them had no clear caption around them at all. One of the interesting results of this is, first of all, that the system is largely successful: users were giving keyword commands, keyword patterns, that were correctly identifying the text box on the web page, except down here in this case. So it’s interesting to see where the system goes wrong, and this was a case that we chose because it was sort of intentionally ambiguous. There were two text boxes on the page that had the same label right next to them, a middle initial label, and users actually used two different kinds of disambiguation strategies to
identify these things with keywords. One was counting, so this was a typical kind of expression that we’d see. Our system at that time actually didn’t support counting, but it does now. The other was to use sort of more contextual keywords, such as the title of the section or the subform that that text box was in. We don’t currently support this, and it’s a little harder to see how to support this without making the system sort of increasingly

ambiguous in other situations.

I want to show you now a slightly larger example of using Chickenfoot to do web automation and customization, and this is where we’re going to pull in the other part of my group’s work, on computer security. Let me just frame the problem for you. The problem is sending email to the wrong people, and we’ve collected a couple of anecdotes about this. The first one appeared in The Wall Street Journal a couple of years ago, where somebody was replying to an email from their health insurer and asking whether such-and-such a very personal medical problem would be covered, but they didn’t realize that they had actually pushed Reply All to a message that was sent to the whole company, so their reply went out to the whole company as well. This has happened at MIT: a teaching assistant once wrote an exam and thought that he was emailing it to the professor to be checked. Where do you think he actually sent it? To the class, exactly right. So they had to throw it away and rewrite it. And this is a case that actually appeared just a couple of months ago: a lawyer for the drug company Eli Lilly thought he was sending mail to one of his fellow lawyers, talking about the case, but in fact it went to a New York Times reporter who happened to have the same last name. So it was probably a sort of autocomplete error, and it ended up on the New York Times front page.

So this is really a security problem, and in fact I’d argue it’s an access control problem. Most of the file sharing that actually happens outside of an organization happens through attachments on email, so if you send it to the wrong people, that’s like specifying the file permissions for that file wrong. And there are many ways it can happen in our mail clients. The consequences are usually just embarrassing, sometimes just annoying to the other people who got this message, but sometimes they’re violations of privacy or security. And I would argue, even though this, you know, the state of the email infrastructure, the
security of the email infrastructure is a disaster right now, right? We have viruses and spam and phishing, and all of these things are due to the fact that we’re not encrypting and signing our mail. But I would argue that even if tomorrow we all switched to secure email, and we’re all digitally signing and encrypting our mail, we would still make these errors. That lawyer for Eli Lilly would have sent a beautifully signed and encrypted message that could only be read by the New York Times reporter that he sent it to, right? So it’s really a user interface problem. It’s not something that’s going to be solved by throwing cryptography at it.

Here’s our proposed solution to this problem, and we built this on top of Chickenfoot, so what I’m going to demo for you actually works in webmail clients. Here’s a sort of contrived example of a case where this might occur, where a faculty member at MIT is replying to an invitation to a department-wide retreat, and he said something not so complimentary about another faculty member and doesn’t realize that he’s actually pushed Reply All, so he’s cc’ing all of the other faculty members in the department. We argue that he’d be less likely to send this message and make the mistake if this is what he actually saw in his email client. These are all the recipients of the message, and what you’re doing is saying this to this whole crowd of people. Reply All is probably the worst example of this case: there’s a big difference between this big crowd of people and just a single face, a single person that you meant to be sending it to. We’ve done some lab feasibility studies where subjects had only a brief glance at an email message like this and were asked whether they thought the email message was going to the right place or not, and there was an enormous difference between their ability to tell whether this was going to the
right place and whether a mere textual message was going to the right place.

So I want to show you a demo of this in action. This is Gmail, and we have Chickenfoot code running behind here that has added to the Gmail page this section here where faces will appear. When I type an email address in the To box, what actually happens is the system takes that email address and sends it as a query to a variety of image sources, including Google Images. So this is why I showed you how to do a face search in Google Images: this is one of the sources that we use for faces. We also use Facebook. You could also imagine hooking up a corporate or university identity database to this, so that you had a very high-quality and accurate source of faces. But because it works automatically, we can actually type in a variety of

email addresses in here. So Terry, yours is winograd at CS, and it’ll come up with that. And you’re Klemmer? Okay. Okay, what it does with this... okay, so it’s searching right now. Actually, the other one was faster.

My personal feelings get the beach?

Yes, so mailing lists are trickier. For internal mailing lists, for the MIT Mailman server for example, we have a plug-in that will go to the Mailman server and get the subscriber list for mailing lists that you are on, and that’s how I got the members of this CSAIL faculty list there. For mailing lists that you do not have membership for, there is still a possibility of detecting that it’s a mailing list, because many mailing lists give you headers in messages that say, I’m a mailing list. And if you can detect that this email address is a mailing list, but you can’t find out its actual subscribers, you could still put up, say, a hundred anonymous faces like that silhouette that you saw, which would help you at least tell the difference between replying to one person and replying to a whole pile of people.

Good. So that’s sort of a bigger example of what you can do when you’re automating and customizing websites. We didn’t have to get Gmail’s permission in order to do that. It’s really a mashup between Gmail and a variety of other sources like Google Images and Facebook and Mailman.

Okay, so I’ve shown you a couple of examples of things on the web. Let me turn now to the second barrier that we were concerned with. We looked at programming against the user interface; the second barrier we’re concerned with is the problem of syntax, and Chickenfoot actually has this problem. Even though you can script the web without looking at the HTML of a web page, you still have to know JavaScript, because it’s essentially a JavaScript programming system. We’d like to eliminate that, and so the idea is to take this notion of keyword patterns, which in a sense eliminate the syntax requirements of a pattern
language, so you just have to use keywords to identify something on a page. We’re generalizing that notion to not just the pattern that identifies a page component but an entire command. So let’s just use keywords to search the space of all script commands that would make sense at this point. This is actually an example from a prototype that we did in Microsoft Word, where you give some keywords and the system generates a line of Visual Basic script that is the best match for those keywords and is syntactically and semantically correct.

So here are some examples of the kinds of tolerance that a keyword command kind of language affords. If you start with the original Chickenfoot JavaScript, which might look like this if you actually wrote it completely verbosely, you don’t have to worry about punctuation anymore, so you throw away the parentheses and the quotes and the commas. You don’t have to think hard about the order in which, say, the arguments go, so you don’t have to know whether you had to name the text box before you put the data that you wanted to put in the text box. You can remove words and the system will infer them, so we left out the find command there. You can put in extraneous words that make it sound a little more like English. You can substitute synonyms rather than having to remember the exact name of the command. All of these things are valid keyword commands, including the original JavaScript, the way our interpreter works.

So let me show you a quick demo of how this works. The previous two systems, Chickenfoot and the email system, are both available on the web as demos that you can try out. We haven’t put this prototype that I’m showing you now up yet. So, for example, if I just type click advanced search, the system is now going to take those three keywords and find the best action it could take on that page that best matches those keywords. In this case it was to click on that Advanced Search link. If I type something very concise like faces,
the best match for that is going to be this radio button here, and the most likely thing to do with that radio button is to click on it, to select it. So if I just run faces, it will simply select that. I’ll show you filling in a text box. So I can, for example, take the all words text box and say we want to put Bill Gates in there, and the

system will do that. Or I could write that a little more verbosely and say enter Bill Gates into the all words text box, and it will do the same thing. And then if I finally say click search button, that will go through and click the search button.

So, yep? Someone just enters enter enter enter?

Somebody could do that, but there are two ways to disambiguate. One is to enter into a dialogue with the user and ask which of these interpretations you actually meant. The system you’re looking at right now doesn’t do that, but a later one that we did in Java does. Another is that the user can simply be a little more verbose. In that kind of case, you know, if you said enter enter enter to me, I would say, what? Right? So there you’d probably want to actually add a little more extra verbiage to help the person you’re talking to understand what you mean, and you can do that in this system as well. You can put quotes around something that you wanted the system to treat as a parameter, and it would give a parameter interpretation to that. So if you say enter, enter in quotes, into enter, then you’d be more likely to get what you want.

Where are we? Over here. Okay. We did a little study of this as well, and it’s a little tricky to do these kinds of studies, because we wanted to really sort of test the learnability of this approach and we didn’t want to bias users to any particular syntax or any particular terms. So the instructions in fact said: use only this command box to make what’s in the red circles happen in your browser. In fact, the command box initially came up blank, so there were no examples of what they could actually type into that box, and the tasks that they had to do were simply described by screenshots with red circles on them. So you had to make this happen, and you had to make that happen. Here are the results, and the way to understand this chart is: we had nine subjects, so the columns of the chart are the subjects’ results, and the rows of the chart are each task; each
red circle corresponds to a row. The numbers in each cell refer to the number of attempts that the user made in order to try to get that task to work. If they eventually succeeded, the cell is green, and if they eventually gave up, the cell is gray.

Okay, so the first thing to observe is that most of this is green: ninety percent of these tasks eventually succeeded. There was some significant difference for programmers versus non-programmers, but non-programmers were still getting it right eighty-four percent of the time. And it would have been a drastically different number if we had actually had a JavaScript interpreter behind that thing and they would have had to essentially guess JavaScript syntax. I doubt we would have had anywhere near eighty-four percent.

One of the interesting observations in this is that users actually didn’t use verbs very much. They were just using the nouns, the parameters of the things that they wanted to act on. They weren’t explicitly saying click. There was also a wide variety of syntax. For instance, when you entered things into text boxes, sometimes the description of the text box came first and sometimes what it is that you wanted to type in came first. So these things sort of justified the decision to pay less attention to ordering and syntax.

What is... native English speakers dominating the speaker kit?

It’s also interesting just seeing where this system went wrong. Synonyms turned out to be very important. There was one task here where the system didn’t have enough synonyms for, I think it was pick or select, and many of our users used a synonym that we hadn’t put in the system. There was also a bug right here: this whole line is actually due to a bug, and once we fixed it, the first attempt for all these users actually succeeded. The last one is due to a difference in the user’s mental model from the way the system actually worked. There was an assumption by the users that the system
was modeling the keyboard focus, so once they had entered something into the first line of the street address, they could simply say go to next line and the system would know that they meant the next line. But in fact the system didn’t have any of that kind of modeling. It was taking each command as sort of a fresh command. So that is one issue with these kinds of systems: to make sure the user model and the system model

are actually synchronized.

So this idea of sloppy programming and keyword commands has been picked up. My student Greg Little, who did this work for his master’s thesis, went to IBM Almaden as a summer intern, and while he was there he developed the first version of Koala, which has since been released under the name of CoScripter. What CoScripter is, is a web macro recorder, so that you can click on buttons and fill in forms and CoScripter will record the actions that you did. The way it records them is as a script of commands which are essentially keyword commands, and they’re interpreted later, when you rerun the script, as keyword commands. That allows users to edit this script sort of freely without having to worry about preserving a certain syntax.

We’ve also done some work on macro recording on the web as well, but we’ve done it for a different kind of problem, going back in fact to the first kind of customization problem I mentioned, which was bookmarking: being able to save a state in a web application. In the old days of the web this was easy, because a URL essentially captured all of the state that you would want to save, so you could just bookmark that URL. In modern, JavaScript-heavy, dynamic websites, the URL no longer gets you back to the right place. There are some sites that are aware of this. Google Maps, for example, for a long time has had the ability for you to get a URL that encodes the state of your current map. But there are hundreds of mashups based on Google Maps that don’t provide this, and the reason is that it’s a very hard thing to provide. It’s not easy to store all this state in a URL, so developers frequently don’t do it. So our approach is to find a browser-side way that we can provide this functionality of bookmarking in sort of an arbitrary web application, and the way that we do it is to generalize the notion of a bookmark from simply a URL to a browsing script that consists of these
keyword commands that click on links and operate forms and so forth, and allows you to play back those scripts in order to get back to a certain state in the web application. But in order to do this, simple web macro recording doesn’t work anymore. The reason is that by the time you want to make a bookmark, it’s too late to push record. In the standard VCR metaphor, you have to push record first, before you do the actions that you want to save, and then stop once you get to the point where you wanted to save it. But when you want a bookmark, you’ve already gotten there. You’ve already done your actions, and you probably didn’t realize in advance that you needed to record them, so you didn’t push that record button. So our approach is to do this retroactively. The system is actually constantly recording your actions in a history as you’re using the web, and whenever you want, you can ask the system to make a bookmark, which it does retroactively by scanning backwards through the history.

Here’s how it actually works. Here’s an example of a history, which has some URLs in it (this is where the browser went to a new page), but it also has some user actions like typing and filling in forms. We finally get down to this state at the bottom, which is actually the result of a flight search, and this is the state that we want to save. We want to capture a bookmark for that. So what the system needs to do is find a browsing script that starts with some URL (it starts with a go-to command) and then has the sequence of instructions that will get you back to that page. We want to find the shortest suffix of this history that will do that, so that we have a concise and fast-to-run bookmark. The way we do it is to start with the last URL in this history and test whether it is actually a bookmarkable URL. The way we test that is to resubmit that URL, but resubmit it while simulating a fresh browsing session. We actually turn off cookies, so that the request looks like it’s coming from a different
browser, and then we compare the page that we get back from that fresh request with the page that the user just saw. If they’re the same, then we say it’s basically bookmarkable: you’re going to get the same page back if you submit that URL again later. If they’re significantly different, then we say that it’s not bookmarkable. It turns out that if you submit this URL, you get an error message back, or at least a substantially different page, so that’s not bookmarkable. So the system steps back and checks the next URL, and so forth, and keeps going back until it finally reaches the URL which turns out to be the entry point of the Travelocity site. That is bookmarkable: it recovers the same page, or roughly the same page, that you originally saw. And then the system turns that into a browsing script.

[Audience question, partly inaudible, about context: when I go to that page I may have previously logged in, for example, and that previous login could be a hundred or more actions back.]

You may have to go back as far as that [inaudible], so you may get a long browsing script as a result, with extraneous stuff in between. But then you can actually look at it, because it is readable by the user, and you can edit it, so you can remove a lot of that extraneous stuff if it turns out to be a very long browsing script [inaudible].

One question with a lot of this work is what happens when the web changes. A browsing script, a bookmark script in particular, could stop working if the website changes its form so that the particular actions that you did no longer work. So we did a little evaluation in which we took 25 websites and created bookmark scripts, like you just saw, for a task on each of these websites, and checked after one-month and four-month intervals whether the browsing script still worked. For the most part, we’re getting about eighty percent success after four months. But the interesting thing to look at here, again, is where the system fails. For Dell and Gateway, for example, these bookmarks actually customize laptops, and it turned out that even after one month the base laptop being customized was no longer available. So even if Dell and Gateway had given us a URL that would recover that customization, it would still have been broken, because they would no longer be able to give us that state again. They don’t sell that laptop anymore. So this is an example of an inevitable decay of a bookmark. These four cases here were cases where the website user interface changed in a way that broke the
browsing script. For instance, MySpace introduced an interstitial ad that the browsing script didn’t know how to deal with, although you can imagine just getting the system to automatically look for "skip this ad" links and click on them. The third case here was a case where our approach for testing bookmarkability of URLs was insufficient. Apple actually had a temporary session identifier embedded in their URLs, so when we resubmitted that immediately in our fresh browsing session, it looked like it was still alive. It looked like it was still bookmarkable. But an hour later, or a day later, that session had expired, and so in fact the bookmark was now broken. So that’s a case where our model for testing whether a URL was bookmarkable was insufficient.

Yes, excellent point, because a common case for using this kind of system might be, "I want to bookmark the confirmation page for my airline reservation." Yes, that’s a great point. One of the original visions of the web was that there should be this distinction between GET requests and POST requests: one of them would be idempotent, side-effect free, and the other one would potentially cause a side effect. Unfortunately, the actual practice of the web does not respect that, and there’s a whole ton of things out there that use POST that are side-effect free. So we might have hoped to actually get that information from HTTP, but it didn’t work. The solution we’ve actually been exploring is much more heuristic: looking for certain keywords in the buttons that you actually pressed, like "confirm", which are indicative of possible side effects, so that if you run a script that has these kinds of keywords in it, it will warn you that there might be a side effect here, and you’ll be able to inspect the script and say whether or not there’s a side effect. That’s
another benefit of having this readable representation here, because it shows you, if you actually look at it, that it is actually entering my credit card number and doing all these things to actually cause the side effect. So it’s important that this be viewable and readable by the human being, by the user. But they also have to look at it, and that’s definitely an area that we want to be able to avoid, because it’s too costly.

Okay, so I’ve shown you a bunch of things on the web. I want to close by showing an example of where we’ve taken this idea of syntax-less programming over into Java, into the world of professional programmers, so to speak. The user interface we’ve given to it here is a code completion mechanism: a way to generate code by giving a couple of keywords for the code that you want. So imagine I’m sitting here in this Java program that I’m writing, and I want to fill in the body of this loop with a line of code that will read a line from the source stream and add it to this array that I’ve created. I take some of the keywords from what I’m intending and just type those in as, essentially, a query over the space of all possible Java method calls that might make sense at this point in the code. When I press Control-Space (we’re just using the standard code completion interface in Eclipse), I get a list of possible matches to those keywords, arranged in ranked order. In fact, the first one in this case is the right answer, and I press Enter on that and add it to my code. So it’s a way to generate Java code without having to put in all the punctuation, or pay attention even to the ordering of parameters.

We did a little evaluation of this using a corpus of open-source Java projects. What we did was to sample existing method calls in these projects: we would pull out a line of code like this and turn it into a plausible keyword query for that line of code. The way we did that was to strip out punctuation, break up tokens into individual words on the capitalization boundaries, eliminate the capitalization, and then randomly reorder the resulting tokens, so just toss them in in any order. Then we’d see whether, if we fed that into our search engine, we would actually get back as the first choice (so this is the strongest measure of accuracy: it actually has to be the first choice on that n-best list) exactly the same line of
code that was originally in that program. Here are the results, grouped by the number of keywords that were in the resulting query. First of all, most of these method calls, it turned out, had a relatively small number of keywords, and furthermore the system is very effective when you give it only a very small number of keywords. Overall it achieved ninety percent accuracy, which I think is actually a very interesting thing: the amount of information in the syntax of a Java method call, in the particular ordering of its arguments and in all the parentheses and commas that you have to put in just the right places, is so low that when you just throw it away, and throw away the ordering, ninety percent of the time you can get back the original line of code. So why are we typing it? It sort of begs the question.

There’s also this interesting effect here, which is that the longer the queries were, the more keywords there were, the worse the system did. That is really counterintuitive, because you’d think longer queries would be more precise. In fact, one of the things that’s going on here is that longer queries tend to refer to longer method calls, methods with more arguments in them, and methods with more arguments are more likely to actually have two arguments of the same type. When that happens, the type constraints that are helping us figure out which keywords go where don’t help at all in figuring out what order to assign the keywords to these parameters. Remember, we randomly reordered; we’ve thrown away the order. So as a result it’s sort of a 50-50 shot whether the system is going to come up with the right ordering or not.

[Audience question, partly inaudible, comparing what a programmer would have to do to enter it your way versus entering it with something like autocomplete, where for a 45-character method call you might only type 5 characters.]

Hmm. No, um,
we didn’t actually do a keystroke-level model or something like that. There is an issue with autocomplete, which is that you have to get the order right in autocomplete, or you have to know which object you are going to call the method on. For instance, you have to know whether you’re adding foos to bars or bars to foos. If you don’t know that order, then autocomplete isn’t going to work.

[Audience comment, inaudible.]

That’s true. Well, I mean, we give you some spell correction, but you do have to nearly know the identifier as well. So there’s an issue with synonyms: there is no synonym support in this system yet. And autocomplete doesn’t have that problem, right, because you’re recognizing what you want rather than trying to generate what you want. Javadoc comments, in terms of mining the documentation for other words that might be used, or using the documentation as one of the things that you’re indexing, is definitely a good idea. We haven’t looked at that yet. We’re only using what’s available now in the API type library.

[Audience question, partly inaudible: what do you mean by "made sense"? You said you get method calls that made sense in that context.]

What I mean by "makes sense" is that it is syntactically correct and semantically correct and would compile. So in this kind of context we’re only using variables that are defined, for example, and we’re only using methods that are defined. That’s what we mean by "made sense". It’s not a more semantic meaning of "made sense", although that would be an interesting direction too, and there have been other projects that have looked at things like that, where they’re actually mining existing code to get a model of what makes sense.

[Audience question, partly inaudible:] In studies we’ve done of programming, one hunch, and this is mostly a hunch, is that what is hard about contemporary programming is less how to compose lines. There are many things that are hard, of course, about contemporary programming, but the hardest one seems to be that the programmer has in their head, "Hey, I’d like to get this stuff into the database," and the problem is, "How do I implement this whole big thing?" rather than, "How do I implement a particular line?" How might you extend this approach to a bigger than one
line chunk of program?

Mm-hmm. We’ve started to think about that. One way that you might do it is to have a big source of examples that you can mine from. In a sense that’s what we’re doing here, right? Having a big pile of classes and methods that you’re using, and picking out pieces of them and gluing them together into a method call, is sort of a small-scale version of what you might do if you had snippets of five lines or ten lines, or five classes or something like that, that could match keywords. The techniques you would use for it are probably different, and we haven’t figured out what those are yet, but I think the interface might be similar, and we’d like to see how keyword searching can help with that.

Okay, so I’ve shown you a bunch of examples that target two problems, or highlight two ideas, in our work. One is programming against the user interface rather than against an API the developer has provided, and the other is using keywords rather than formal syntax. I want to talk about some of the limitations and common questions about both of these approaches.

So do you want to program against the user interface or against an API? Well, the UI is visible. It’s something that you’re already familiar with, and you don’t have to go and find it; it’s right there in front of you. It also definitely exists: a lot of websites have not jumped on the Web 2.0 bandwagon of making all of their stuff available through web APIs, but they all have websites, and those websites can be automated and customized. But a programming interface is probably going to be faster, since it doesn’t actually have to render the user interface in order to work, and it may be less likely to change. We looked at that bookmark robustness question, which website changes break the bookmarks, and it did happen. But APIs change as well. So the Google Search API has
recently been deprecated. The original Search API that they came out with a couple of years ago has been deprecated and is likely to go away, be turned off. But their user interface has not changed in 10 years, right? The "I’m Feeling Lucky" button is still there. And I would argue, I mean I have a hypothesis, that the rate of change of an interface, whether it’s a user interface or an API, is going to be related to the number of dependencies on it. And there are dependencies on user interfaces, training and learning dependencies, that make them resistant to change in similar ways that APIs become resistant to change. Backward compatibility affects human beings the same way it does programs. I’ve also observed this from personal experience. I once had a meeting with a Yahoo executive, and five minutes before the meeting I realized all of my examples used Google. But I just changed the starting line of each of the scripts, so that instead of going to Google it went to Yahoo, and it turned out most of the scripts still worked, because there was so much consistency between the websites. Even though they differed in how their HTML was actually structured, the user level was similar enough that these keyword commands were robust.

The other question: should we be using formal syntax or informal, sloppy syntax for programming? I argue that keywords are easier to write and usually to read, and I think the history of information retrieval actually bears that out: it moved away from Boolean algebra into keyword searching for search engines, and that’s been a far more successful user interface. But formal syntax has a more precise meaning. You can give semantics to a formal syntax, and it’s likely to be more expressive as well. I mean, we’re not programming in English; we’re programming in much more precisely defined syntaxes. And that’s definitely an issue for these notions of keyword programming. What I’ve shown you in these demos works largely because they’re either in limited domains like the web, where there are only a few things that you can do on each web page, or because we have all these other constraints that help the keyword search work. In Java method calls, for example, all
these type constraints help us narrow down on the best match to your keywords that is actually syntactically and semantically correct. So it’s clear you need these kinds of additional information in order to make up for the lack of precision of the keywords themselves. Another question is whether you can really trust a keyword program if you don’t really have clear semantics for what it means. All the systems I’ve shown you have a human in the loop, monitoring and making sure the thing doesn’t go off the rails, but we don’t know whether you can get away with that. It’s actually an interesting research question whether you can get away with that. And yet another interesting research question is: should we stop at keywords, or should we try to go on to natural language? Natural language programming is a problem that’s been revisited every decade or so over the history of computer science, and this is a different take on it. We’re not using natural language processing approaches; we’re just using keyword matching approaches, so it’s a lot simpler in that respect.

So, to conclude, I hope I’ve shown you some of the things you can do with web automation and with these sloppy kinds of programming, and I’d be happy to take any questions.

[Audience question, partly inaudible, about a system that’s more about natural language, and whether that’s a direction you’d be going towards.]

Yeah, so James Allen has a very nice system that is a multimodal way to do a lot of these kinds of web data extraction and even a little bit of automation problems. You can create macros in his system by talking to the computer, and what’s nice about that is that it gives you a meta-channel in which to talk about what you’re doing and why you’re doing it, instead of just doing it. And yeah, we’re certainly interested in that. I’ve been talking to some of the speech people at MIT
about putting speech on top of these kinds of systems, in particular so that these keywords that you’re typing could perhaps be found in what you’re uttering, instead of forcing you to type them all the time. So yeah, it’s definitely a great line of research, and James has a really nice demo on the web that I encourage you to check out if you haven’t seen it. That’s James Allen at the University of Rochester.

[Audience question, partly inaudible, about the social aspects of Chickenfoot: sharing scripts, and visibility into what’s been written for the Google homepage, for example.]

We don’t have as much sharing as I would like there to be. We do have a wiki in which people post scripts. I know that we have over 10,000 users, but we certainly don’t have 10,000 scripts on the website. Other systems like Koala and CoScripter have actually been able to generate a much richer collaboration between their users, because in fact they don’t support anything other than collaboration. One of the interesting decisions in CoScripter was that whenever you saved a script, it would always be saved on the wiki. You could mark scripts as public or private, but it was so much easier to put something up on the wiki. So we don’t really have a clear idea of how people are sharing scripts in Chickenfoot. One common use for Chickenfoot, it turns out, is website functionality testing by website developers, and those people don’t really share their scripts with anybody, because they’re just internal test suites that they’re developing.

[Audience question, partly inaudible, about disambiguation: if somebody does figure out how to do the disambiguation, could that be sent back?]

Yes, that would be a great idea, to be able to record that, or even improve the keyword matching algorithm so that it does better on those kinds of sites, those kinds of page components, in the future. Yeah, collaborative web scraping is another interesting area: creating scrapers and being able to share them with other people.

[Audience question:] Why is it called Chickenfoot?

Ha, yeah. It’s unfortunately not as amusing a story as I’d like it to be. Part of it is because there’s this culture of assigning silly animal names to Firefox extensions, so Greasemonkey and Platypus and Aardvark and so forth are all in this vein. But the actual name is because an earlier version of the system was called Dominoes, because what it does is play with the DOM, the Document Object Model of HTML. And Chickenfoot is a silly animal
name which actually is a variant of dominoes. There’s a Chickenfoot variant of dominoes, so this was the second version of the Domino system. Kangaroo is actually much more interesting. If you do a search for my name in Google Images, the first thing that comes up is [inaudible].

[Audience request, partly inaudible, to turn the face detector on.]

Oh sure, let me just... yeah. This was before we turned on "show all images", so when I was typing my own email address in and using the system for a while, this is what I was getting. Any other questions? Thank you.

For information on other online Stanford seminars and courses, please visit study stanford.edu. The preceding program is copyrighted by Stanford University. Please visit us at stanford.edu.