The programmer, like the poet, works only slightly removed from pure thought-stuff. He builds his castles in the air, from air, creating by exertion of the imagination. Few media of creation are so flexible, so easy to polish and rework, so readily capable of realizing grand conceptual structures... Yet the program construct, unlike the poet's words, is real in the sense that it moves and works, producing visible outputs separate from the construct itself. […] The magic of myth and legend has come true in our time. One types the correct incantation on a keyboard, and a display screen comes to life, showing things that never were nor could be.
Software is a profound technology with enormous potential, and we are stifling this potential with a bad metaphor. That metaphor is the machine. Currently, we organize software into noncomposable, static machines called applications. These applications ("appliances" is a better word) come equipped with a fixed vocabulary of actions, speak no common language, and cannot be extended, composed, or combined with other applications except with enormous friction. By analogy, what we have is a railway system where the tracks in each region are of different widths and where trains and their cargo must be disassembled and reassembled to transport anything across the country. As ridiculous as this sounds, this is roughly what we do at application boundaries: write explicit serialization and parsing code and lots of tedious (not to mention inefficient) code to deconstruct and reconstruct application data and functions.
So we need to cast aside this broken metaphor and reconceptualize user software as what it actually is--a programming environment which is at its core creative, extensible, and composable.
It's understandable why we still think to build machines using software. Before the discovery of software, arguably in the 1930s with Alan Turing's invention of the universal Turing machine, human technology had produced only tools and machines--physical artifacts like cash registers, engines, and light bulbs, built for some particular purpose and equipped with a largely fixed vocabulary of actions. With software came the idea that behavior and functionality could be specified as pure information, independent of the machine which interprets them. This raised novel possibilities. As pure information, a program is infinitely copyable at near zero cost, and in the internet age, capable of being transported anywhere on the planet almost instantaneously. A programmer can now miraculously turn thoughts to reality and deploy them around the globe by typing on a keyboard and clicking a few buttons.
With the introduction of software, the machine, which once held primacy for the artifacts and technology produced by civilization, was relegated to an implementation detail, a substrate for the real technology--the specification of behavior in the form of a program. But despite the paradigm shift, we've held on to the notion that technology invariably takes the form of machines with a fixed vocabulary of actions, and we build our applications in this image.
Now stop and think about it for a moment--it is rather strange that we take this infinitely flexible medium, software, and then reserve a tiny, select group of people (programmers) to use its power to build applications with largely fixed sets of actions (and we now put these machines on the internet too and call them "web applications"), and whose behaviors are not composable with other programs. Software lets us escape the tyranny of the machine; let's not just use it to build more machines!
So here is what I want to argue: applications ultimately can and should be replaced by programming environments, explicitly recognized as such, in which the user interactively creates, executes, and inspects programs. Interacting with the computer then is fundamentally an act of creation, the creative act of programming, of assembling language to express ideas, access information, and automate tasks. Software presents an opportunity to help humanity harness and channel "our vast imaginations, humming away, charged with creative energy". And as programmers and designers, our goal is to let the natural light of programmability and customizability shine through, not build machines with a fixed set of actions determined ahead of time when we cannot possibly satisfy all the ideas users may wish to express. (And have you noticed how applications accrete feature after feature, but never seem quite capable of doing everything we want?)
The application-oriented viewpoint wouldn't be so bad if applications were composable and could be easily used as building blocks for more complex programs, but applications invariably attempt to assert themselves as the center of the universe and aren't optimized to work in concert or harmony with other applications as part of some larger system.
Bob: Now, wait a minute. Applications usually have an API too, you know. If you really want programmability, why not just use the API?
Alice: I wouldn't say 'usually', but okay, in theory let's suppose that's true. In practice, even though I'm a programmer and could in principle customize the applications I use, I don't because of the friction of doing so. Each application is a universe unto itself, with its own language and idiosyncratic modes of interaction. The situation hasn't improved with web applications, which have somewhat converged on ad hoc JSON+REST protocols as the lingua franca of application programmability.
Bob: What's wrong with that? There are JSON parsers for every programming language under the sun! I even wrote a really fast, push-based, nonblocking parser in 5,000 lines of Java! It's pretty awesome. Check out how I optimized the parsing by hand-coding a switch-statement-based state machine for the parse table to reduce allocation rates and improve cache loc--
Alice: You're missing my point! Compare the overhead of calling a function in the 'native' language of your program vs calling a function exposed via JSON+REST. And no I don't mean the computational overhead, though that is a problem too. Within the language of your program, if I want to call a function returning a list of (employee, date) pairs, I simply invoke the function and get back the result. With JSON+REST, I get back a blob of text, which I then have to parse into a syntax tree and then convert back to some meaningful business objects. If I had that overhead and tedium for every function call I made I'd have quit programming long ago.
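To make Alice's comparison concrete, here is a minimal Python sketch of the two paths. Everything in it is an assumption made for illustration: the `Employee` type, the `recent_hires` function, and the shape of the JSON payload are invented, since every service defines its own.

```python
import json
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Employee:
    name: str


# Native call: the function hands back typed values directly.
def recent_hires() -> list[tuple[Employee, date]]:
    return [
        (Employee("Ada"), date(2012, 3, 1)),
        (Employee("Grace"), date(2012, 6, 15)),
    ]


# JSON+REST style: the same data arrives as a blob of text.
# This payload shape is hypothetical.
RESPONSE_BODY = (
    '[{"employee": "Ada", "date": "2012-03-01"},'
    ' {"employee": "Grace", "date": "2012-06-15"}]'
)


def parse_hires(body: str) -> list[tuple[Employee, date]]:
    rows = json.loads(body)  # text -> generic syntax tree
    hires = []
    for row in rows:  # syntax tree -> meaningful objects
        if not isinstance(row, dict) or "employee" not in row or "date" not in row:
            raise ValueError(f"unexpected row: {row!r}")
        hires.append((Employee(row["employee"]), date.fromisoformat(row["date"])))
    return hires
```

Both paths end with the same list of pairs, but only the second needs parsing, validation, and conversion code, and that code has to be rewritten for every endpoint.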
Bob: Are you just saying you want more built in types for JSON, then? That's easy, I hear there's even a proposal to add a date type to JSON.
Alice: And maybe in another fifteen years JSON will grow a standard for sending algebraic data types (they've been around for like 40 years, you know) and other sorts of values, like you know, functions.
Bob: Functions?? Are you serious? You aren't talking about sending functions across the internet and just executing them, that's a huge security liability!
Alice: Never mind that for now. My point is--
Bob: --now wait a minute! You know, I was humoring you earlier by saying if you wanted programmability you could just use the application's API. Okay, for the sake of argument I'll grant that this can be rather inconvenient. But so what? You and I both know that 99.9% of users don't want to program or customize; they are perfectly happy with applications that do one thing, and do it well.
Alice: I wouldn't say 'perfectly happy', I'd say that users are resigned to the notion that applications are machines with a fixed set of actions, and any limitations of these machines must simply be worked around via whatever tedium is required. But of course they would think that--we've never shown our users software that didn't work just like a machine, so how could we expect them to know about the wonderful, customizable universe of possibilities that we programmers get to play in every day? This isn't a good state of affairs, it's sad, and we ought to start doing something about it! It isn't hopeless--in fact, I find that if you get users in the right mindset they are positively incessant about wanting to customize their user experience and the actions supported by an application. It's human nature, our inner spirit of creativity and invention that can never be truly squelched! When we are shown something of use or interest to us, some piece of functionality or data, we begin thinking up possible variations and combinations that also interest us or seem useful.
Bob: Okay, but let's be realistic. Do you really expect your users to be booting up text editors, running compilers, interpreting syntax and type errors and so forth just to get something accomplished?
Alice: Of course not--no user should have to put up with the arcane programming environments that we professional programmers have to endure on a daily basis. Then again, we shouldn't have to either! Which is why the goal of software should not be to build machines, but to build pleasing, accessible programming environments that delight and inspire our users to creation while facilitating the sharing and reuse of programming ideas! Yes, we can and should optimize these environments for programming in various domains, which could include graphical views and so forth, but we should still place these environments in a unified framework rather than in walled gardens of functionality like the current batch of (web) appliances... er, 'applications'.
Bob: So what are you saying? Get rid of Microsoft Word, Outlook, Gmail, Twitter, Facebook, and all the rest?
Alice: Yes! Or rather, I would deconstruct these applications into libraries and grant users access to the functions and data types of these libraries within a grand unified programming environment.
Bob: I want to talk more about that... but in any case, these applications you deride aren't just libraries, they are providing an intuitive interface to functionality that people find valuable, and we are going to need some sort of interface to this functionality that's better than a text editor and the command line. Providing this better interface is what applications do.
Alice: If the interfaces provided by these applications are so intuitive, why are there rows and rows of 'Missing Manual' and 'For Dummies' books covering just about every application under the sun? Applications are failing at even their stated goal, but they do worse than that. Yes, an application is an (often terrible) interface to some library of functions, but it also traps this wonderful collection of potential building blocks in a mess of bureaucratic red tape. Any creator wishing to build atop or extend the functionality of an application faces a mountain of idiosyncratic protocols and data representations and some of the most tedious sort of programming imaginable: parsing, serializing, converting between different data representations, and error handling due to the inherent problem of having to pass through a dynamically typed and insufficiently expressive communication channel! And that's if an application even exposes any significant portion of its functionality through an actual API, which they often don't. We can do so much better!
Bob: All right, I'll bite. Let's hear your story for how to organize the computing world without applications.
Alice: I'm glad you asked...
The world without applications
The 'software as machine' view is so ingrained in people's thinking that it's hard to imagine organizing computing without some notion of applications. But let's return to first principles. Why do people use computers? People use computers in order to do and express things, to communicate with each other, to create, and to experience and interact with what others have created. People write essays, create illustrations, organize and edit photographs, send messages to friends, play card games, watch movies, comment on news articles, and they do serious work too--analyze portfolios, create budgets and track expenses, find plane flights and hotels, automate tasks, and so on. But what is important, what truly matters to people is simply being able to perform these actions. That each of these actions presently takes place in the context of some 'application' is not in any way essential. In fact, I hope you can start to see how unnatural it is that such stark boundaries exist between applications, and how lovely it would be if the functionality of our current applications could be seamlessly accessed and combined with other functions in whatever ways we imagine. This sort of activity could be a part of the normal interaction that people have with computers, not something reserved only for 'programmers', and not something that requires navigating a tedious mess of ad hoc protocols, dealing with parsing and serialization, and all the other mumbo-jumbo that has nothing to do with the idea the user (programmer) is trying to express. The computing environment could be a programmable playground, a canvas in which to automate whatever tasks or activities the user wishes.
Let me give an example of the problems with the current application-oriented model, and show what possibilities are put out of reach by our current framing of software. Please don't get bogged down in the details, I'm just trying to be illustrative here.
Suppose Carol and Dave are a young, conscientious couple intent on being disciplined about saving for retirement. But, they want to enjoy their time together as well, and so as part of their budget, which they manage using Mint.com, they allocate $200 per month to a virtual 'vacation' fund which accumulates from month to month. They also keep a shared Google doc in which they both jot down ideas for places they'd like to go and things they might like to do. Periodically, they take a vacation, drawing ideas from this doc. They make sure to keep the total cost of the trip under the amount that has accumulated into their vacation fund, and then attribute the cost of the trip to their vacation budget so it is deducted by Mint.com.
So far so good, but Carol, who is the planner in the relationship, notices that whenever she plans a vacation for the two of them she's doing a similar sort of thing. First, she opens up the Mint.com application to see how much money has accumulated in their vacation fund. Next, she opens up the Google doc to remind herself of the possible locations for trips they could take. Then, she goes to Kayak.com and searches for plane flights under the budget price, taking care to reserve enough leftover money for booking a hotel (say, on Hotels.com) and whatever other expenses are to be expected on the trip. It's a complex process, with lots of information and factors to keep straight, and it must be repeated each and every time they wish to plan a trip. Carol wonders if it would be possible to automate this process somehow, at least partially. She'd like a program that extracts a list of locations from their shared Google doc, then gets a list of possible flights to these locations and a list of possible hotels, then filters out any flight+hotel combinations that exceed the budget, then gives her the opportunity to interactively filter and browse through possible results, perhaps even allowing for interactive adjustments to certain base assumptions like the daily cost of miscellaneous expenses while on vacation, the dates of the trip, etc. This would save a lot of time and make the planning process more fun and creative.
Unfortunately, this sort of thing just isn't possible today. Kayak.com and Mint.com both lack APIs! Mint lets users download their transaction history, but this history doesn't indicate how much money has accumulated in each budget category. Kayak is even worse--it lacks a search API entirely.
So it seems Carol and Dave would be reduced to screen scraping if they want to programmatically build on Kayak and Mint. Google docs at least comes equipped with an API, but it's an ad hoc XML over REST API and there's friction associated with its use due to having to parse XML and so on. Overall, the friction and overhead of implementing this automation idea are way too high to justify it, so Carol doesn't bother and just does everything manually, or worse, gives up on a dream vacation!
Now let's imagine how things could be. Kayak, Mint, and Google docs would be, first and foremost, libraries rather than applications. Each might come equipped with custom views or editing environments for writing and executing certain 'shapes' of programs, but these views would not be their primary (or only) mode of interaction, as they are now. Instead, the collection of functions and data types in these libraries would be primary, and accessible within the unified programming environment of the user's desktop. This programming environment, moreover, would allow for transparent access to remote functionality, so users could write programs that call functions exposed via cloud services as well as functions defined locally.
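As a rough sketch of what Carol's program might look like in such an environment, here is a short Python version. The four library functions are hypothetical stand-ins backed by made-up data; none of them corresponds to a real Mint, Kayak, Hotels.com, or Google API, and flight prices are assumed to cover both travelers.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Flight:
    destination: str
    price: float  # assumed to be the total fare for both travelers


@dataclass(frozen=True)
class Hotel:
    city: str
    nightly_rate: float


# Hypothetical stand-ins for functions the budgeting, documents,
# flight, and hotel libraries might expose.
def vacation_fund_balance() -> float:
    return 1600.0


def wishlist_destinations() -> list[str]:
    return ["Lisbon", "Kyoto"]


def search_flights(destination: str) -> list[Flight]:
    catalog = {
        "Lisbon": [Flight("Lisbon", 700.0), Flight("Lisbon", 950.0)],
        "Kyoto": [Flight("Kyoto", 1400.0)],
    }
    return catalog.get(destination, [])


def search_hotels(city: str) -> list[Hotel]:
    catalog = {
        "Lisbon": [Hotel("Lisbon", 90.0), Hotel("Lisbon", 160.0)],
        "Kyoto": [Hotel("Kyoto", 120.0)],
    }
    return catalog.get(city, [])


def trip_options(nights: int, daily_expenses: float) -> list[tuple[Flight, Hotel, float]]:
    """Return (flight, hotel, total cost) combinations within the fund, cheapest first."""
    budget = vacation_fund_balance()
    options = []
    for place in wishlist_destinations():
        for flight, hotel in product(search_flights(place), search_hotels(place)):
            total = flight.price + nights * (hotel.nightly_rate + daily_expenses)
            if total <= budget:
                options.append((flight, hotel, total))
    return sorted(options, key=lambda option: option[2])
```

Calling `trip_options(nights=3, daily_expenses=50.0)` and then again with a different `daily_expenses` is the kind of interactive adjustment Carol wanted. The planning logic fits in one small function because none of it deals with parsing or protocols.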
If that example seems contrived, here's a more 'serious' one: a widget-making business has a customer relationship management (CRM) application that's used by the sales team. For each potential client, they make notes about what widget features clients are most interested in. The company also uses some project management software that lets them track features, improvements, and fixes to the product, and group these into releases. Whenever the company rolls out a new version of the widget product, the sales team would like to cross reference the list of changes that can be extracted from the project management software with the list of all the clients or leads that would be interested in these changes. Moreover, it would be nice to be able to take this list of potential clients who might be interested in newly released features and perhaps even assemble a form email calling out the particular features or improvements in the new version that that particular client was interested in. The sales team can of course add any personal touches to the emails before sending them to the potential clients.
Today, this process might end up being done manually, which doesn't scale very well if a business has hundreds or thousands of 'live' sales leads and a large number of features that they roll out with each release. Even if both the CRM and the project management app come with APIs, there is quite a bit of friction involved in writing a program that 'speaks' both APIs and handles all the boring concerns like parsing, deserialization, error handling, and so on.
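If the CRM and the project management tool were libraries, the cross-referencing step could be a few lines. The Python sketch below assumes two hypothetical functions, `open_leads` and `release_features`, with invented data; it is not based on any real product's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lead:
    name: str
    email: str
    interests: frozenset[str]


# Hypothetical stand-ins for what the CRM and project management
# libraries might expose; the data is made up for illustration.
def open_leads() -> list[Lead]:
    return [
        Lead("Acme", "buyer@acme.example", frozenset({"waterproofing", "bulk pricing"})),
        Lead("Globex", "ops@globex.example", frozenset({"solar charging"})),
        Lead("Initech", "it@initech.example", frozenset({"bulk pricing"})),
    ]


def release_features(version: str) -> set[str]:
    return {"2.0": {"waterproofing", "solar charging"}}.get(version, set())


def draft_emails(version: str) -> dict[str, str]:
    """Map each interested lead's email address to a draft message."""
    shipped = release_features(version)
    drafts = {}
    for lead in open_leads():
        matched = sorted(lead.interests & shipped)  # features this lead asked about
        if matched:
            drafts[lead.email] = (
                f"Hi {lead.name}, widget {version} now includes: "
                + ", ".join(matched)
                + "."
            )
    return drafts
```

The result is a set of drafts rather than sent messages, so the sales team can still add personal touches before anything goes out.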
I just made up these use cases, and I could come up with hundreds of others. No one piece of software 'does it all', and so individuals and businesses looking to automate or partially automate various tasks are often put in the position of having to integrate functionality across multiple applications, which is often painful or flat out impossible. The amount of lost productivity (or lost leisure time) on a global scale, both for individuals and business, is absolutely staggering.
Bob: All right, I think I finally see what you're getting at. These are very old ideas, you know. Haven't you ever heard of the Unix Philosophy? In fact, I could probably implement most of your use cases with 'a very small shell script'.
Alice: You make it sound like Thompson and Ritchie invented the idea of composition. Mathematicians had been composing functions for hundreds (or even thousands) of years before that without making such a fuss about it or waving any sort of philosophical flag. But anyway, I would love to see you try to implement those tasks with a shell script, as you say. Have you ever tried reading a shell script written by someone else that's longer than 10 lines or so? I'm a professional programmer, well-trained in navigating all the arcane nonsense that's common in software, and a small part of me dies every time I have to write or read a bash script. I appreciate the spirit of the Unix Philosophy, but the implementation, writing programs in a terrible language that read and write 'vaguely parseable text', leaves a lot to be desired. And JSON and XML aren't much better, either.
Bob: So you really think that Carol and some sales guys are going to be writing programs, even if it is some theoretical future souped-up graphical programming environment?
Alice: Why does that seem so unlikely to you?
Bob: Because writing software is complicated! I know because I'm a professional programmer. We can't expect the masses to be writing the sort of complex program that we professional programmers produce.
Alice: 'Complex programs'? You mean like Instagram? A website where you can post photos of kittens and subscribe to a feed of photos produced by other people? Or Twitter? Or any one of the 95% of applications which are just a CRUD interface to some data store? The truth is, if you strip applications of all their incidental complexity (largely caused by the artificial barriers at application boundaries), they are often extremely simple. But in all seriousness, why can't more people write programs? Millions of people use spreadsheets, an even more impoverished and arcane programming environment than what we could build.
Bob: Maybe so, but I still don't think that a programming environment can ever be accessible to the majority of people. Spreadsheets are a good example--they are a rather accessible (if limited) form of programming, and not everyone uses the programmability of spreadsheets or even wants to!
Alice: And two thousand years ago, most of the population was illiterate and arithmetic was considered too difficult for the average person, yet now we teach kids these things in elementary school. The truth is, we don't really know how many people might program if given a learnable programming environment and programming were reduced to its exhilarating, creative essence. I worry we have raised generations of programmers who are simply very good at tolerating bullshit and, paraphrasing Paul Lockhart, the most talented programmer of our time may be a waitress in Tulsa, Oklahoma who considers herself bad at computers. The spreadsheet brought programming (in a limited fashion) to millions of people, and a more accessible environment could bring it to millions or billions more. Who are you, with your limited imagination, to place a ceiling on how accessible programming could be? Well, the world is what we make of it, and I want to make a world in which applications die off, programming is no longer the awkward, arcane and tedious process it often is today, and where the internet is used to transparently share, use, and compose functionality. Which brings me to my next point...
What's wrong with the internet
The internet contains vast pools of data and functionality largely trapped within noncomposable applications all competing to be the center of the universe.
The economy of the internet is deeply broken. Have you ever wondered why the internet market is dominated by a few huge businesses like Google, Facebook, Twitter, etc? High transaction costs imposed by application boundaries have distorted the software economy, making it artificially expensive to integrate functionality from third parties. This selects for larger businesses with the resources to develop and integrate functionality internally, which they do using composable libraries within their own application boundaries. From here, network effects due mostly to high switching costs (again, because of application boundary friction) sustain the positions of these larger market players. We essentially have a situation in which these larger market players own a significant portion of the network effects on the web. It would be preferable if ownership of these network effects were transferred to the public domain and businesses were forced to compete on their ideas and cleverness in describing these ideas in software, rather than competing as they do now on how well they can coax users into entering various walled gardens and keep them there with lock-in and high switching costs. With a unified programming environment spanning the web (I'll say more about this in another post), we could see these transaction costs and switching costs drop to nearly zero and a radical democratization of the internet market as ownership of these network effects is transferred to the public domain.
Unlike the production of many physical goods and services, software does not have any natural economies of scale. Arguably, there are diseconomies of scale with software--per unit of functionality, software becomes harder to write with the addition of more people, resources and code, because of the complexity of managing a large codebase and coordinating concurrent development. Large businesses with significant codebases fight a constant (losing) battle against entropy and employ armies of developers to maintain and make rudimentary additions to functionality. The 'economies of scale' with software are almost entirely due to artificially high transaction costs caused by the application-centered world view and the lack of a unified computational framework owned by the public. As a civilization, we would be better off if software could be developed by small, unrelated groups, with an open standard that allowed for these groups to trivially combine functionality produced anywhere on the network.
What I am proposing is a radical shift that could mean the end of huge internet businesses like Google and Facebook. Or rather, it means that Google and Facebook would be forced to compete on functionality with programmers all over the world, any of whom could write similar functionality that could be substituted for Google/Facebook functionality with literally zero switching costs. Oh, I might use Google as 'cloud provider', a place to stick my data and my computations, but this would be using Google as a commodity, an implementation detail, much the way I use the physical computer on which I type this right now. At any point, I could choose to transfer my data and personal functionality to another cloud provider, again with zero switching costs. And while we're at it, perhaps we could dispense with cloud providers entirely and replace them with a peer-to-peer network in which individuals share compute time and local storage!
Bob: I wouldn't knock Google, Twitter, and Instagram... they are serving literally millions of concurrent users. That's a serious technical challenge, you know.
Alice: A serious technical challenge that has been created artificially! In the world I envision, the (limited) functionality of sites like Twitter could be written as a library and then used in a decentralized way by anyone connected to the internet. Writing such a library would require no servers, no capital, and could be completed by a programmer (or user) in a weekend! Think about it--if I write quicksort as a library function, is there any 'serious technical challenge' in making it possible for my function to be used by millions of users? No, of course not, because my function is pure information and can be transported all over the world and run by a billion people simultaneously, without my having to do anything other than put the code somewhere connected to the internet. But for some strange reason, if I write a function that operates on the follows-graph maintained in an (unnecessarily) centralized way by Twitter, I need to deal with all sorts of complexity if I want this function to be used by more than a few hundred people concurrently? Twitter (and Facebook, and Instagram, and Google) are solving problems created by the 'application as center of the universe' viewpoint that is so common today.
Bob: Even so, I think you are vastly underestimating the complexity of the software that these companies produce. These companies are coordinating the activities of fleets of computers, doing error handling and recovery, and wrapping up often complex functionality in nice, usable interfaces (which by the way have seen many man-months' worth of tuning and testing) that you do nothing but complain about! We have it so easy!
Alice: And yet, I still can't get Gmail to do even simple tasks like schedule an email to be sent later or batch up all incoming emails containing a certain phrase into a weekly digest! By the way, I just thought up those use cases on the spot, I could think of dozens more that aren't supported. The problem is, I don't want a machine, I want a toolkit, and Google keeps trying to sell me machines. Perhaps these machines are exquisitely crafted, with extensive tuning and so forth, but a machine with a fixed set of actions can never do all the things that I can imagine might be useful, and I don't want to wait around for Google to implement the functionality I desire as another awkward one-off 'feature' that's poorly integrated and just adds more complexity to an already bloated application.
Bob: What about services like Yahoo! Pipes or IFTTT? Isn't that what you're looking for?
Alice: Absolutely not. For one, I don't want my data and functionality locked up with a particular provider like that. I want an open platform. Who knows when Yahoo! might kill off Pipes or start charging inordinate sums of money for it, and who knows if IFTTT is going to even be around a year from now given that they seem to have no business model. I would only use these services for throw-away code I don't care about. Have you ever noticed that all the programming languages people use voluntarily are open source? I think it's because no one wants their creations owned by anyone. But beyond that, the bigger reason I don't like these services is that I want a real programming language, with a real type system that lets me assemble complex functionality with ease and guides me through the process.
Why UX designers should care about type theory
Applications are bad enough in that they trap potentially useful building blocks for larger program ideas behind artificial barriers, but they fail at even their stated purpose of providing an 'intuitive' interface to whatever fixed set of actions and functionality their creators have imagined. Here is why: for all but the simplest applications, there are multiple contexts within the application and there needs to be a cohesive story for how to present only 'appropriate' actions to the user and prevent nonsensical combinations based on context. This becomes serious business as the number of actions and contexts offered by an application grows. As an example, if I have just selected a message in my inbox (this is a 'context'), the 'send' action should not be available, but if I am editing a draft of a message it should be. Likewise, if I have just selected some text, the 'apply Kodachrome style retro filter' action should not be available, since that only makes sense applied to a picture of some sort.
These are just silly examples, but real applications will have many more actions to organize and present to users in a context-sensitive way. Unfortunately, the way 'applications' tend to do this is with various ad hoc approaches that don't scale very well as more functionality is added--generally, they allow only a fixed set of contexts, and they hardcode what actions are allowed in each context. ('Oh, the send function isn't available from the inbox screen? Okay, I won't add that option to this static menu'; 'Oh, only an integer is allowed here? Okay, I'll add some error checking to this text input') Hence the paradox: applications never seem to do everything we want (because by design they can only support a fixed set of contexts and because how to handle each context must be explicitly hardcoded), and yet we also can't seem to easily find the functionality they do support (because the set of contexts and allowed actions is arbitrary and unguessable in a complex application).
There is already a discipline with a coherent story for how to handle concerns of what actions are appropriate in what contexts: type theory. Which is why I now (half) jokingly introduce Chiusano's 10th corollary:
Any sufficiently advanced user-facing program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of a real programming language and type system.
Programming languages and type theory have largely solved the problem of how to constrain user actions to only 'appropriate' alternatives and present these alternatives to users in an exquisitely context-sensitive way. The fundamental contribution of a type system is to provide a compositional language for describing possible forms values can take, and to provide a fully generic program (the typechecker) for determining whether an action (a function) is applicable to a particular value (an argument to the function). Around this core idea we can build UI for autocompletion, perfectly appropriate context menus, program search, and so on. Type systems provide a striking, elegant solution to a problem that UX designers now solve in more ad hoc ways. These ad hoc methods don't scale and can never match what is possible when guided by an actual type system and the programming environment to go with it.
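A toy version of this idea can be sketched in Python. The sketch below is an illustration under stated assumptions, not a real typechecker: it uses a runtime `isinstance` check against each function's parameter annotation as a stand-in for type checking, and the `Draft`, `ReceivedMessage`, and `Image` types and the three actions are invented to mirror the inbox and photo filter examples above.

```python
import inspect
from dataclasses import dataclass
from typing import Callable


@dataclass
class Draft:
    body: str


@dataclass
class ReceivedMessage:
    body: str


@dataclass
class Image:
    pixels: bytes


# Actions are ordinary functions; the parameter annotation states
# which kind of value each one accepts.
def send(item: Draft) -> str:
    return "sent"


def reply(item: ReceivedMessage) -> Draft:
    return Draft("> " + item.body)


def retro_filter(item: Image) -> Image:
    return item  # placeholder: a real filter would transform the pixels


ACTIONS: list[Callable] = [send, reply, retro_filter]


def applicable_actions(value: object) -> list[str]:
    """Names of the actions whose first parameter type accepts `value`."""
    names = []
    for action in ACTIONS:
        first_param = next(iter(inspect.signature(action).parameters.values()))
        if isinstance(value, first_param.annotation):
            names.append(action.__name__)
    return names
```

Here `applicable_actions(Draft("hi"))` offers only `send`, and a selected image offers only `retro_filter`. Nothing is hardcoded per context: adding a new action to `ACTIONS` makes it appear wherever its argument type fits, which is the scaling property the ad hoc approach lacks.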
The work that remains is more around how to build meaningful, sensitive, real-time interfaces to the typechecker and integrate it within a larger programming environment supporting a mixture of graphical and textual program elements. Note that the richer the type system, the more mileage we get out of this approach.
I'll conclude with a great quote by Rúnar Bjarnason, explaining how we got to this point, and what's wrong:
In the early days of programming, there were no computers. The first programs were written, and executed, on paper. It wasn't until later that machines were first built that could execute programs automatically.
During the ascent of computers, an industry of professional computer programmers emerged. Perhaps because early computers were awkward and difficult to use, the focus of these professionals became less thinking about programs and more manipulating the machine.
Indeed, if you read the Wikipedia entry on "Computer Program", it tells you that computer programs are "instructions for a computer", and that "a computer requires programs to function". This is a curious position, since it's completely backwards. It implies that programming is done in order to make computers do things, as a primary. I’ll warrant that the article was probably written by a professional programmer.
But why does a computer need to function? Why does a computer even exist? The reality is that computers exist solely for the purpose of executing programs. The machine is not a metaphysical primary. Reality has primacy, a program is a description, an abstraction, a proof of some hypothesis about an aspect of reality, and the computer exists to deduce the implications of that fact for the pursuit of human values.
Though the post talks specifically about not creating our programming languages in the machine's image, we should apply the same reasoning to the useful bundles of data and functionality that we now call 'applications'.
So there you have it. The machines are no longer primary. End the tyranny of applications!